My nephew
Andrew, a chemistry postdoc at Oxford, has just published a paper in JACS on developing inhibitors of the protein HIF (hypoxia-inducible factor) 1A.
Hurrah for him! And this got me curious enough to
delve into what this molecule does. Andrew had told me before that it’s a
transcription factor, which naturally led me to guess it has a fair degree of
intrinsic disorder – as is indeed the case (see the floppy bits of polypeptide chain here):
Why?
Because most eukaryotic TFs do, as they tend to operate in conjunction with a
host of other molecules such as cofactors and seem to benefit from having a
degree of promiscuity in their interactions.
That’s
just one way in which I suspected a protein like this might exemplify the ways
in which our molecular mechanisms operate. And indeed, this turns out to be the
case. At face value, how HIF1A (sometimes written as HIF1α) does what it does looks ever
more perplexingly, indeed impossibly, complicated the harder you look. But in
every respect I found those details confirming the kind of picture I have tried
to sketch in my book How Life Works – and I’d hope that the book might
help a non-specialist see how there are actually some generic principles operating
in a case like this that can bring some sense of order and logic to what
otherwise appears utterly confusing. So if you’re ready for the ride, strap in.
HIF1A is a
member of a family of HIF proteins, in mammals encoded by the genes HIF1A,
HIF2A and HIF3A. The proteins enable cells to cope with
oxygen-depleted circumstances, in general by activating or inhibiting the
expression of certain genes. For example, HIF1A can upregulate expression of the gene for vascular
endothelial growth factor (VEGF), a key player in angiogenesis (the
formation of new blood vessels), so as to create new
sources of oxygenation. For this reason, HIFs are not merely activated in unusual
conditions of oxygen stress but are a crucial part of normal development, and
are associated with disorders of blood circulation, such as atherosclerosis,
hypertension and aneurysms. The 2019 Nobel Prize in physiology or medicine
was awarded to William Kaelin, Peter Ratcliffe and Gregg Semenza for their work
in discovering the HIF proteins and how they regulate the cell’s response to hypoxia.
HIF1A has also
become a focus of interest for cancer treatments, because if it can be inhibited
specifically in cancer cells, this could enable the tumour to be slowed or even
killed by oxygen depletion. That’s what Andrew and his colleagues are working
on.
The basic mode
of action is interesting, but also a major saga in itself. HIF1A is produced
even when the cells have plenty of oxygen – but is then targeted by enzymes
that stick ubiquityl groups onto it so as to label it for destruction by
the proteasome. That labelling depends on oxygen-requiring enzymes that first hydroxylate HIF1A, and if lack of oxygen
stops them working, HIF1A is no longer degraded but is free to do its regulatory
work as it accumulates in the cell nucleus. (This bit of the story, like all
the others, is actually rather more complicated, as HIF1A degradation is also
sensitive to factors other than oxygenation, such as nutrient levels – there is
evidently a fair amount of context dependence and integration of various input
signals determining HIF stability. What HIF1A does, and indeed how stable it
is, is also influenced by having other chemical groups appended to it: phosphorylation,
SUMOylation and acetylation.)
In the
nucleus, HIF1A dimerizes with another member of the family, HIF1B [or HIF1β] (which
has two subunits, encoded by the genes ARNT1 and ARNT2) to form a
complex that can bind to DNA and regulate genes. Those genes it regulates have
promoter groups denoted hypoxia-response elements (HREs) that the HIF1A/1B
complex recognizes. These are generally close to the target genes themselves,
but not always; some are distal.
OK, so far
it seems like classic switch-like regulation (albeit fiendishly complicated!).
But here’s where things get messier still. For one thing, there are many more
genomic loci carrying the 5-base-pair HRE recognition sequence than there are
actual HRE binding sites. In fact, fewer than 1% of the potential HRE sites are
bound by HIFs in response to hypoxia. How come HIF1A/1B isn’t sticking to all
those others too? No one really knows. But it seems that some of the
selectivity depends on sequences flanking the HREs, in a manner as yet unclear.
This reminds me of the work I wrote about recently
by Polly Fordyce at Stanford and colleagues, who showed that repetitive sequences
flanking regulatory sites, previously dismissed as “junk”, might act as a sort
of attractive well that accumulates and holds onto the regulatory molecules
like TFs, via weak and fairly non-specific interactions that nevertheless
somehow cumulatively impart the right selectivity. These so-called short tandem
repeats act as a kind of “lobby” where the molecules can hang around so that
they are ready when needed. I’ve no idea if anything like that is happening
here, but it shows that we should not be too ready to dismiss parts of the
genome that seem literally peripheral and “probably” useless. However, it seems
likely that factors other than the DNA sequences are also influencing HIF
binding to HREs.
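To get a feel for why a 5-base-pair motif turns up so much more often than it is actually bound, here is a rough back-of-envelope sketch in Python. It rests on assumptions that are mine, not the papers’: I take the HRE core to be the commonly quoted consensus 5'-RCGTG-3' (R = A or G), and I use a made-up “genome” of uniformly random bases. The function names are purely illustrative.

```python
import random
import re

# Assumption: the HRE core is the commonly quoted consensus 5'-RCGTG-3'
# (R = A or G). The post itself does not spell out the sequence.
# Lookaheads are used so that any overlapping matches are all counted.
HRE_FORWARD = re.compile(r"(?=[AG]CGTG)")
# Reverse complement of RCGTG is CACGY (Y = C or T): a site on the other strand.
HRE_REVERSE = re.compile(r"(?=CACG[CT])")


def count_hre_matches(seq: str) -> int:
    """Count HRE-like 5-mers on both strands of a DNA sequence."""
    seq = seq.upper()
    return len(HRE_FORWARD.findall(seq)) + len(HRE_REVERSE.findall(seq))


def random_dna(length: int, seed: int = 0) -> str:
    """Toy 'genome' of uniformly random bases, purely for illustration."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


if __name__ == "__main__":
    n = 1_000_000
    found = count_hre_matches(random_dna(n))
    # Per strand, a given position matches with probability 2 / 4**5 = 1/512,
    # and there are n - 4 possible start positions on each of the two strands.
    expected = 2 * (n - 4) * 2 / 4**5
    print(f"found {found} motif matches, expected about {expected:.0f}")
```

On random sequence the motif crops up roughly once every 256 bases when both strands are counted, so a genome of billions of bases would hold millions of candidate sites by chance alone. Real genomes are not random (CpG dinucleotides, for one, are under-represented in vertebrates), so treat this as an order-of-magnitude illustration and nothing more. It does suggest, though, that the bare motif cannot by itself be what tells HIF where to bind.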
What’s
more, HIF1A doesn’t do its job alone. Eukaryotic TFs hardly ever do. There is a
whole host of other molecules involved in regulating those genes, as evident in
this diagram from one article:
When I see something like this, I now know not to take it too literally. It
may well be that these molecules aren’t getting together in well defined and stoichiometric
complexes, but are more probably associating in looser and fuzzier ways –
perhaps involving what some call transcriptional hubs or condensates, blobs
with liquid-like behaviour that constitute a distinct phase from the rest of
the nucleoplasm. I haven’t been able to find any indication that this is what
goes on for the HIF proteins, but it wouldn’t surprise me, given how such
structures seem to be involved in other regulatory processes. One review of
this topic simply says that “HIF1A may stimulate transcription either by means
of cooperative DNA binding or cooperative recruitment of coactivators.” (That
word “recruitment” is always a giveaway, since obviously the protein is not
literally summoning its coactivators from afar – “recruitment” tends to mean “these
molecules somehow gather and act together in a way we don’t understand.”)
And get this: “HIF1A has been shown to contribute to transcriptional control
independently of its DNA binding activity, working instead in partnership with
other DNA binding proteins to affect other cellular pathways.” In other words, there
seems to be another (at least one other?) mechanism by which HIF1A does its
regulatory work. How is a protein designed to do a job in two totally
different ways? The answer is surely that it is modular. But how do these
different channels depend on one another, if at all? When does one happen, and
when the other? At what level is that decision made?
So in
short: what the hell? How can we start to make any sense of this process,
beyond the morass of details? Well, here’s the key thing: it seems that this fuzziness
and multiplicity of actors enables the regulatory process to be sensitive to
higher-level information – so that exactly which genes the HIF complex
regulates is tissue- or cell-state-specific. That, after all, is what we’d
expect: what’s needed to survive hypoxia will vary between tissues, so the
response has to be attuned to that. This is an illustration of why we mustn’t
imagine that Crick’s Central Dogma gives any kind of indication of the overall
information flow in cells: it is not simply from DNA to RNA to protein (even if
that applies to sequence information). What a protein does will be
sensitive to higher-level information too.
And in
fact, even what a protein is is sensitive to that. We have known about
alternative splicing – the creation of different mRNAs, and thus proteins, from
the same primary transcript – since the 1970s, of course. But I am not
convinced, despite protestations to the contrary, that the implications of that
have really filtered through to the public consciousness, not least in terms of
how it undermines the notion of a genetically encoded “program”. Contextual
information from the surroundings literally changes the output of the Central
Dogma. And the HIF family offers a great illustration of this, as you’ll see.
The other
two alpha units of the HIF family, HIF2A and HIF3A, also bind to HIF1B. HIF2A
has its own set of target genes. But weirdly, its DNA binding domain is very
similar to that of HIF1A, and so the sequences HIF1A and HIF2A bind are essentially
identical. Yet they do target different genes. How on earth does that happen?
Well, it seems that for one thing they have differently spliced varieties (isoforms),
meaning that their mRNAs are stitched together differently by the spliceosome before
translation. Still, it’s hard to figure out how, or if, this is the key factor
in their target specificity. One review says:
Although several studies have attempted to define the isoform-specific
transcriptional programs, few common themes have emerged from these
investigations, thus highlighting the complex nature of this cellular response.
Variables such as cell type, severity, duration and variety of stimulation, the
presence of functional VHL, and even culture conditions reportedly influence
the transcriptional output mediated by HIF1A versus HIF2A. Furthermore, many of
these studies have only examined either HIF1A or HIF2A, and untangling
HIF-dependent from HIF-independent hypoxia-induced responses has proved
challenging.
Again,
what the hell? And again: it’s clear that a whole bunch of higher-level
information is involved in determining the outcomes. For example, a part of the
cell-type specificity seems to relate to the state that the chromatin is in:
how it is packaged.
HIF3A,
meanwhile, seems a little different from HIF1A and HIF2A, both in terms of
sequence and functionally. There are
several – around six – alternative splicing variants with different regulatory
functions. Some of these seem to have a negative regulatory action – for example,
one isoform of HIF3A inhibits HIF1A. HIF3A seems to be a classic example of a
protein with very tissue-specific alternative splicing: one form, called
HIF3A4, for example, is expressed only in the corneal epithelium and
controls vascularization there in response to hypoxia.
There’s
more. How does HIF binding actually alter gene expression? Well, it’s sure not
in the way the classic regulatory paradigm, the lac operon of E. coli,
does it: by simply blocking RNA polymerase from attaching and transcribing the
adjacent gene. Or perhaps we should say that yes, ultimately it’s a matter of
hindering transcription, but in a manner that is far more complicated. In essence,
HIF does this by initiating a change in the way the chromatin in that region is
packed, for example by making the packing denser so that the DNA there is
inaccessible to transcription.
And this too is subtle. One thing HIF binding does is
to trigger enzymes that stick methyl groups onto the histone proteins around
which DNA is wound in chromatin. Such changes are known to affect chromatin packing,
but the details aren’t well understood. Certainly it’s not as simple as saying
that methylation makes the histones bulkier and less well packed; sometimes
that process will enhance transcription, and sometimes inhibit it. We don’t
know what the “rules” are. But they aren’t, I think, going to be governed by
any sort of simple, digital code – not least because they involve issues of
three-dimensional molecular structure and solvation, and appear again to have a
context dependence.
Nor should
we imagine that the hypoxia response is merely a matter of the HIF proteins.
Several others are involved too. At this point you might want to despair of
making any sense of it all. But the point is that this process resembles
nothing so much as what goes on in the brain, where information from many
sources is integrated and contextualized in the process of generating some
appropriate output. That process involves several different scales – it is not
simply a matter of this molecule speaking to that one in linear chains of
communication. In this case, a more useful framework for thinking about the
problem is one that is cognitive and analogue, not mechanical and
digital.
Oh, there
are yet more wrinkles, but I’m going to spare you those. The bottom line is
that there are perils in taking the tempting line of explaining how HIF1A works
by saying something along the lines of “It is a master regulator that switches
genes on or off when the cell gets hypoxic.” That is true in a sense, but risks
giving a false impression of understanding. In the end it implies that proteins
(or their genes) just “do” things, as if by magic, and so suggests that they are
in control. In reality, the way it works bears little resemblance to
those pictures of blobby molecules sticking together and working via magical
arrows. The information flow is much more omnidirectional, and the logic is fuzzy
and combinatorial, and also poorly understood in many respects, and only makes
sense if we take into account the system as a whole. When we do, it becomes clear
that there is no basis for saying that genes like HIF1A dictate the hypoxia
response; we can with more justification say that cells “decide” how to use
their genetic resources to mount a response that is appropriate to their
particular state and circumstances.
This is
nothing that molecular biologists don’t know (and it is phenomenally impressive
that they have got as far as they have). But I believe we need better ways to
tell the story, which do justice to the real ingenuity, versatility, and
contextuality of life.