homunculus: March 2024

My nephew Andrew, a chemistry postdoc at Oxford, has just published a paper in JACS on developing inhibitors of the protein HIF (hypoxia-inducible factor) 1A. Hurrah for him! And this got me curious enough to delve into what this molecule does. Andrew had told me before that it’s a transcription factor, which naturally led me to guess it has a fair degree of intrinsic disorder – as is indeed the case (see the floppy bits of polypeptide chain here):

Why? Because most eukaryotic TFs do, as they tend to operate in conjunction with a host of other molecules such as cofactors and seem to benefit from having a degree of promiscuity in their interactions.

That’s just one way in which I suspected a protein like this might exemplify the ways in which our molecular mechanisms operate. And indeed, this turns out to be the case. At face value, how HIF1A (sometimes written as HIF1[alpha]) does what it does looks ever more perplexingly, indeed impossibly, complicated the harder you look. But in every respect I found those details confirming the kind of picture I have tried to sketch in my book How Life Works – and I’d hope that the book might help a non-specialist see how there are actually some generic principles operating in a case like this that can bring some sense of order and logic to what otherwise appears utterly confusing. So if you’re ready for the ride, strap in.

HIF1A is a member of a family of HIF proteins, in mammals encoded by the genes HIF1A, HIF2A and HIF3A. The proteins enable cells to cope with oxygen-depleted circumstances, in general by activating or inhibiting the expression of certain genes. For example, HIF1A can upregulate expression of vascular endothelial growth factor (VEGF), a key gene involved in angiogenesis (the formation of new blood vessels), so as to encourage the formation of new sources of oxygenation. For this reason, HIFs are not merely activated in unusual conditions of oxygen stress but are a crucial part of normal development, and are associated with disorders of blood circulation, such as atherosclerosis, hypertension and aneurysms. The 2019 Nobel Prize in physiology or medicine was awarded to William Kaelin, Peter Ratcliffe and Gregg Semenza for their work in discovering the HIF proteins and how they regulate the cell’s response to hypoxia.

HIF1A has also become a focus of interest for cancer treatments, because if it can be inhibited specifically in cancer cells, this could enable the tumour to be slowed or even killed by oxygen depletion. That’s what Andrew and his colleagues are working on.

The basic mode of action is interesting, but also a major saga in itself. HIF1A is produced even when the cells have plenty of oxygen – but is then targeted by enzymes that stick ubiquityl groups onto it so as to label it for destruction by the proteasome. Those ubiquitylating enzymes are oxygen-sensitive, and if lack of oxygen stops them working, HIF1A is no longer degraded but is free to do its regulatory work as it accumulates in the cell nucleus. (This bit of the story, like all the others, is actually rather more complicated, as HIF1A degradation is also sensitive to factors other than oxygenation, such as nutrient levels – there is evidently a fair amount of context dependence and integration of various input signals determining HIF stability. What HIF1A does, and indeed how stable it is, is also influenced by having other chemical groups appended to it: phosphorylation, SUMOylation and acetylation.)

In the nucleus, HIF1A dimerizes with another member of the family, HIF1B [or beta] (which has two subunits, encoded by the genes ARNT1 and ARNT2) to form a complex that can bind to DNA and regulate genes. Those genes it regulates have promoter groups denoted hypoxia-response elements (HREs) that the HIF1A/1B complex recognizes. These are generally close to the target genes themselves, but not always; some are distal.

OK, so far it seems like classic switch-like regulation (albeit fiendishly complicated!). But here’s where things get murkier. For one thing, there are many more genomic loci carrying the 5-base-pair HRE recognition sequence than there are actual HRE binding sites. In fact, fewer than 1% of the potential HRE sites are bound by HIFs in response to hypoxia. How come HIF1A/1B isn’t sticking to all those others too? No one really knows. But it seems that some of the selectivity depends on sequences flanking the HREs, in a manner as yet unclear. This reminds me of the work I wrote about recently by Polly Fordyce at Stanford and colleagues, who showed that repetitive sequences flanking regulatory sites, previously dismissed as “junk”, might act as a sort of attractive well that accumulates and holds onto the regulatory molecules like TFs, via weak and fairly non-specific interactions that nevertheless somehow cumulatively impart the right selectivity. These so-called short tandem repeats act as a kind of “lobby” where the molecules can hang around so that they are ready when needed. I’ve no idea if anything like that is happening here, but it shows that we should not be too ready to dismiss parts of the genome that seem literally peripheral and “probably” useless. However, it seems likely that factors other than the DNA sequences are also influencing HIF binding to HREs.
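
To get a rough feel for why a 5-base-pair recognition sequence cannot account for the selectivity by itself, here is a minimal Python sketch that counts matches to a short motif and compares the count with what chance alone would give. Several things here are assumptions not taken from the post: the core motif is taken to be RCGTG (R = A or G), the consensus usually quoted for HREs; the "genome" is a random string with equal base frequencies, which real genomes are not; and the function names are purely illustrative.

```python
import random
import re

# Assumption: HRE core taken as 5'-RCGTG-3' (R = A or G); not stated in the post.
# Lookaheads are used so that overlapping matches would also be counted.
HRE_FORWARD = re.compile(r"(?=[AG]CGTG)")
# Reverse complement of RCGTG is CACGY (Y = C or T), for the opposite strand.
HRE_REVERSE = re.compile(r"(?=CACG[CT])")


def count_hre_cores(seq: str) -> int:
    """Count core-motif matches on both strands of a DNA string.

    A palindromic site such as CACGTG is counted once per strand.
    """
    seq = seq.upper()
    return len(HRE_FORWARD.findall(seq)) + len(HRE_REVERSE.findall(seq))


def expected_per_base(options_per_position=(2, 1, 1, 1, 1)) -> float:
    """Chance that a random position starts the motif on one strand.

    Crude model: the four bases are equally likely and independent.
    """
    p = 1.0
    for n_allowed in options_per_position:
        p *= n_allowed / 4
    return p


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the toy run is repeatable
    toy_genome = "".join(rng.choice("ACGT") for _ in range(1_000_000))
    observed = count_hre_cores(toy_genome)
    expected = 2 * expected_per_base() * len(toy_genome)  # two strands
    print(f"observed {observed}, expected about {expected:.0f}")
```

Under this crude model the motif turns up about once every 512 positions on each strand, so a genome of roughly 3 billion base pairs would carry on the order of ten million chance matches. Real sequence composition will shift that number, but the order of magnitude makes the point: the short motif alone is far too common to specify where HIF actually binds.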

What’s more, HIF1A doesn’t do its job alone. Eukaryotic TFs hardly ever do. There is a whole host of other molecules involved in regulating those genes, as evident in this diagram from one article:

When I see something like this, I now know not to take it too literally. It may well be that these molecules aren’t getting together in well defined and stoichiometric complexes, but are more probably associating in looser and fuzzier ways – perhaps involving what some call transcriptional hubs or condensates, blobs with liquid-like behaviour that constitute a distinct phase from the rest of the nucleoplasm. I haven’t been able to find any indication that this is what goes on for the HIF proteins, but it wouldn’t surprise me, given how such structures seem to be involved in other regulatory processes. One review of this topic simply says that “HIF1A may stimulate transcription either by means of cooperative DNA binding or cooperative recruitment of coactivators.” (That word “recruitment” is always a giveaway, since obviously the protein is not literally summoning its coactivators from afar – “recruitment” tends to mean “these molecules somehow gather and act together in a way we don’t understand.”)

And get this: “HIF1A has been shown to contribute to transcriptional control independently of its DNA binding activity, working instead in partnership with other DNA binding proteins to affect other cellular pathways.” In other words, there seems to be another (at least one other?) mechanism by which HIF1A does its regulatory work. How is a protein designed to do a job in two totally different ways? The answer is surely that it is modular. But how do these different channels depend on one another, if at all? When does one happen, and when the other? At what level is that decision made?

So in short: what the hell? How can we start to make any sense of this process, beyond the morass of details? Well, here’s the key thing: it seems that this fuzziness and multiplicity of actors enables the regulatory process to be sensitive to higher-level information – so that exactly which genes the HIF complex regulates is tissue- or cell-state-specific. That, after all, is what we’d expect: what’s needed to survive hypoxia will vary between tissues, so the response has to be attuned to that. This is an illustration of why we mustn’t imagine that Crick’s Central Dogma gives any kind of indication of the overall information flow in cells: it is not simply from DNA to RNA to protein (even if that applies to sequence information). What a protein does will be sensitive to higher-level information too.

And in fact, even what a protein is is sensitive to that too. We have known about alternative splicing – the creation of different mRNAs, and thus proteins, from the same primary transcript – since the 1970s, of course. But I am not convinced, despite protestations to the contrary, that the implications of that have really filtered through to the public consciousness, not least in terms of how it undermines the notion of a genetically encoded “program”. Contextual information from the surroundings literally changes the output of the Central Dogma. And the HIF family offer a great illustration of this, as you’ll see.

The other two alpha units of the HIF family, HIF2A and HIF3A, also bind to HIF1B. HIF2A has its own set of target genes. But weirdly, its DNA binding domain is very similar to that of HIF1A, and so the sequences HIF1A and HIF2A bind are essentially identical. Yet they do target different genes. How on earth does that happen? Well, it seems that for one thing they have differently spliced varieties (isoforms), meaning that their RNA transcripts are stitched together differently by the spliceosome before translation, giving rise to different versions of the protein. Still, it’s hard to figure out how, or if, this is the key factor in their target specificity. One review says:

Although several studies have attempted to define the isoform-specific transcriptional programs, few common themes have emerged from these investigations, thus highlighting the complex nature of this cellular response. Variables such as cell type, severity, duration and variety of stimulation, the presence of functional VHL, and even culture conditions reportedly influence the transcriptional output mediated by HIF1A versus HIF2A. Furthermore, many of these studies have only examined either HIF1A or HIF2A, and untangling HIF-dependent from HIF-independent hypoxia-induced responses has proved challenging.

Again, what the hell? And again: it’s clear that a whole bunch of higher-level information is involved in determining the outcomes. For example, a part of the cell-type specificity seems to relate to the state that the chromatin is in: how it is packaged.

HIF3A, meanwhile, seems a little different from HIF1A and HIF2A, both in terms of sequence and functionally. There are several – around six – alternative splicing variants with different regulatory functions. Some of these seem to have a negative regulatory action – for example, one isoform of HIF3A inhibits HIF1A. HIF3A seems to be a classic example of a protein with very tissue-specific alternative splicing: one form, called HIF3A4, for example, is expressed only in the corneal lens epithelium and controls vascularization there in response to hypoxia.

There’s more. How does HIF binding actually alter gene expression? Well, it’s sure not in the way the classic regulatory paradigm, the lac operon of E. coli, does it: by simply blocking RNA polymerase from attaching and transcribing the adjacent gene. Or perhaps we should say that yes, ultimately it’s a matter of hindering transcription, but in a manner that is far more complicated. In essence, HIF does this by initiating a change in the way the chromatin in that region is packed, for example by making the packing denser so that the DNA there is inaccessible to transcription.

And this too is subtle. One thing HIF binding does is to trigger enzymes that stick methyl groups onto the histone proteins around which DNA is wound in chromatin. Such changes are known to affect chromatin packing, but the details aren’t well understood. Certainly it’s not as simple as saying that methylation makes the histones bulkier and less well packed; sometimes that process will enhance transcription, and sometimes inhibit it. We don’t know what the “rules” are. But they aren’t, I think, going to be governed by any sort of simple, digital code – not least because they involve issues of three-dimensional molecular structure and solvation, and appear again to have a context dependence.

Nor should we imagine that the hypoxia response is merely a matter of the HIF proteins. Several others are involved too. At this point you might want to despair of making any sense of it all. But the point is that this process resembles nothing more than what goes on in the brain, where information from many sources is integrated and contextualized in the process of generating some appropriate output. That process involves several different scales – it is not simply a matter of this molecule speaking to that one in linear chains of communication. In this case, a more useful framework for thinking about the problem is one that is cognitive and analogue, not mechanical and digital.

Oh, there are yet more wrinkles, but I’m going to spare you those. The bottom line is that there are perils in taking the tempting line of explaining how HIF1A works by saying something along the lines of “It is a master regulator that switches genes on or off when the cell gets hypoxic.” That is true in a sense, but risks giving a false impression of understanding. In the end it implies that proteins (or their genes) just “do” things, as if by magic, and so suggests that they are in control. In reality, the way it works bears little resemblance to those pictures of blobby molecules sticking together and working via magical arrows. The information flow is much more omnidirectional, and the logic is fuzzy and combinatorial, and also poorly understood in many respects, and only makes sense if we take into account the system as a whole. When we do, it becomes clear that there is no basis for saying that genes like HIF1A dictate the hypoxia response; we can with more justification say that cells “decide” how to use their genetic resources to mount a response that is appropriate to their particular state and circumstances.

This is nothing that molecular biologists don’t know (and it is phenomenally impressive that they have got as far as they have). But I believe we need better ways to tell the story, which do justice to the real ingenuity, versatility, and contextuality of life.

Thursday, March 28, 2024

How our cells cope with oxygen stress: a paradigm of life's fuzzy, distributed control