Main

Evolutionary reversibility represents a strong test of the importance of contingency and determinism in evolution. If selection is limited in its ability to drive the reacquisition of ancestral forms, then the future outcomes available to evolution at any point in time must depend strongly on the present state and, in turn, on the past2,4,8,18. Ready reversibility, in contrast, would indicate that natural selection can produce the same optimal form in any given environment, irrespective of history19. The evolutionary reversibility of a protein can be evaluated at three levels: molecular sequence, protein function, and the structural/mechanistic underpinnings for that function. The latter is most relevant to understanding the roles of contingency and determinism in evolution. Exact molecular reversal to the ancestral amino acid sequence is extremely unlikely and of trivial interest, because of the large number of sequences that code for the same structure and function. Selection will always produce adaptive functions or phenotypes in some form; however, if the underlying mechanism for a reversed function differs from that of the ancestor, then a new, analogous state will have been achieved by onward evolution, not reversal—a situation similar to false morphological reversal caused by convergent evolution4. True reversal, involving restoration of the ancestral phenotype by the ancestral structure–function relations, would indicate that the forms of functional proteins can evolve deterministically, irrespective of contingent historical events.

Recent developments in techniques for studying protein evolution allow the reversibility of protein structure and function to be studied directly. The intrinsic functions and atomic structures of ancestral genes can be determined by inferring their sequences using maximum-likelihood phylogenetics, then biochemically synthesizing, expressing and characterizing them using functional assays and X-ray crystallography20. The mechanisms by which new functions evolved can be identified by introducing historical substitutions into ancestral backgrounds and characterizing their effects on structure and function21,22. Using these techniques, we recently established the mechanistic basis for the evolution of a new function in the glucocorticoid receptor, a DNA-binding transcription factor activated by the steroid hormone cortisol to regulate the long-term stress response and other processes in humans and other vertebrates23,24. Specifically, we showed that the cortisol-specificity of the glucocorticoid receptor ligand-binding domain (LBD) evolved from a more promiscuous ancient receptor that was activated by the mineralocorticoids aldosterone and deoxycorticosterone (DOC) and, albeit more weakly, by cortisol. The new specificity of glucocorticoid receptors (Fig. 1a–c) evolved because of a marked change in structure–function relations during the 40-million-year period between AncGR1 (the glucocorticoid receptor in the last common ancestor of cartilaginous and bony fish, which had the ancestral function) and AncGR2 (the glucocorticoid receptor in the ancestor of tetrapods and ray-finned fish, which was cortisol-specific). Of the 37 amino acid changes that occurred during this interval, two conserved substitutions (Ser106Pro and Leu111Gln, called group X for convenience) were necessary and sufficient to switch the preference of the resurrected AncGR1 from mineralocorticoids to cortisol. 
Ser106Pro radically repositioned helix 7 along the edge of the ligand pocket, reducing activation by all hormones but moving site 111 closer to the ligand. In this new position, Leu111Gln generated a new hydrogen bond to the 17-hydroxyl group unique to cortisol, specifically restoring sensitivity to that hormone (Fig. 1d). Three more conserved substitutions (group Y) completed the loss of mineralocorticoid sensitivity to yield a cortisol-specific receptor; these changes further destabilized the receptor complex with mineralocorticoids but enhanced interaction with the 17-hydroxyl of cortisol. The protein could not tolerate group Y, however, without an additional pair of permissive substitutions (group Z), which added stability to the structural elements destabilized by group Y and conferred the full glucocorticoid-receptor-like function upon AncGR1 + XYZ (Fig. 1e and ref. 24).

Figure 1: Evolution and reversibility of glucocorticoid receptor function.

a, Reduced phylogeny of corticosteroid receptors. Blue, receptors sensitive to both cortisol and mineralocorticoids; purple, sensitive to cortisol only; black, other steroid receptors (AR, androgen receptor; PR, progestogen receptor). CR, cortisol receptor; GR, glucocorticoid receptor; MR, mineralocorticoid receptor. Ancestral proteins AncGR1 and AncGR2 are labelled. Thirty-seven amino acid changes, including groups X, Y and Z, occurred during the interval between these two proteins (black box; for complete list and alignment, see Supplementary Fig. 1). Parentheses show the number of sequences in each group. Scale bar for branch lengths is shown in substitutions per site. b, c, Ligand sensitivities of AncGR1 (b) and AncGR2 (c), shown as the fold increase in expression of a luciferase reporter in the presence of increasing doses of cortisol (purple), aldosterone (solid blue), and DOC (dashed blue). d, Conformational change causing cortisol-specificity in AncGR2 (see ref. 24). Partial structures of AncGR1 and AncGR2 are superimposed. Substitutions in group X (Ser106Pro and Leu111Gln) are large effect mutations that reposition helix 7 (H7) and form a hydrogen bond to the 17-OH that is unique to cortisol (purple). Black arrows indicate change in position of these residues. Substitutions in groups Y (Leu29Met, Phe98Ile and Ser 212Δ), and Z (Asn26Thr and Gln105Leu) optimize the derived function. e, When substitutions in sets X, Y and Z are introduced into AncGR1, they recapitulate the evolution of a cortisol-specific activator. f, When these substitutions are reversed to the ancestral state (xyz) in the AncGR2 background, activation by all ligands is lost. g, All AncGR2 combinations in which group X is reversed also yield non-functional receptors. All error bars denote s.e.m.

The most direct pathway to reverse the evolution of the structure and function of AncGR2 would be to reverse the key substitutions that generated the derived phenotype during the ‘forward’ evolution of AncGR1. We used site-directed mutagenesis on AncGR2 to reverse the amino acids in groups X, Y and Z to their ancestral states (x, y and z), and characterized their effect on receptor function in a luciferase reporter gene assay (Fig. 1f). Surprisingly, AncGR2-xyz was unable to activate transcription in response to any ligand, even at very high concentrations. Reversing only the large-effect mutations in group X also produced a non-functional AncGR2-x receptor, as did all combinations that included restoration of group x (Fig. 1g). These results indicate that the structure and function of AncGR2 are not reversible through this direct path, because the ancestral amino acids in group x, and the change in conformation they cause, are incompatible with the derived background. These states, however, were present in AncGR1 just 40 million years earlier and have been conserved in all mineralocorticoid receptors ever since, indicating that further epistatic modifiers must have evolved between AncGR1 and AncGR2.

To identify candidate historical substitutions for this restrictive effect, we combined phylogenetic and structural analysis. Thirty substitutions in addition to X, Y and Z occurred during the AncGR1–AncGR2 interval. We reasoned that amino acids required for the ancestral function should be conserved in the AncGR1-like state in extant receptors that retain the ancestral sensitivity to DOC and aldosterone; unlike X, Y and Z, however, they would not be expected to be conserved in the glucocorticoid-receptor-like receptors. Of the 30 remaining sites that changed between AncGR1 and AncGR2, only six were invariant among all or all but one of the DOC/aldosterone-sensitive receptors (Fig. 2a and Supplementary Fig. 1). To predict which of these were most likely to enable the ancestral function, we expressed the AncGR2 LBD and used X-ray crystallography (Supplementary Table 1) to determine its empirical atomic structure at 2.5 Å resolution in complex with the synthetic glucocorticoid dexamethasone (Fig. 2b and Supplementary Figs 2 and 3). The monomeric AncGR2 structure adopts the canonical active conformation for nuclear receptors25 and is nearly identical to the AncGR2 structure previously predicted by homology modelling24. Five of the six candidate substitutions identified by phylogenetic analysis (group W: His84Gln, Tyr91Cys, Ala107Tyr, Gly114Gln and Leu197Met) are on or interact directly with the repositioned helix 7; the other (Val234Phe) is far from the remodelled portion of the protein and does not seem to interact with it directly or indirectly (Fig. 2b). By comparing the structure of AncGR2 with that of AncGR1 (ref. 24), we predicted that the derived states at these five sites would be incompatible with the ancestral structure and function, because the AncGR1 states stabilize the active conformation with helix 7 in its ancestral position (Fig. 2b, c), but those in AncGR2 fail to support this conformation or actively clash with it. Specifically (Fig. 2c), two of these residues in AncGR1 (Gly 114 and Leu 197) are close together and allow tight packing of helix 7 in the ancestral position against helix 10. AncGR2, in contrast, contains Gln and Met at these positions, the side chains of which are longer and less hydrophobic; the repositioning of helix 7 allows these two residues to be tolerated, but in the ancestral conformation their side chains would clash, pushing the two helices apart and away from the ligand. A second pair, the aromatic residues His 84 and Tyr 91, form a pi-stack in AncGR1, stabilizing the β-strand that abuts the ligand and helix 7. Substitution of these residues to Gln and Cys (as in AncGR2) would destroy this interaction, increasing flexibility in the ligand pocket and destabilizing the active complex; AncGR2 can presumably tolerate this effect because of the extra stability contributed by the hydrogen bond between Gln 111 and cortisol. The fifth candidate site, Ala 107, lies near the base of helix 7, at the mouth of the ligand pocket where helices 3, 7 and 10 pack together. Replacement of Ala 107 with the bulky tyrosine of AncGR2 would clash with helices 3 and 10; however, the movement of helix 7 in AncGR2 repositions site 107, allowing a tyrosine to be tolerated.
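The phylogenetic filter described above (sites whose AncGR1 state is retained in all, or all but one, of the receptors with the ancestral function) can be sketched in a few lines of Python. This is only an illustration of the criterion: the site labels and residue columns below are hypothetical toy data, not the alignment in Supplementary Fig. 1, and the function name `is_candidate` is our own.

```python
def is_candidate(ancestral_state, residues, max_mismatches=1):
    """True if the ancestral residue is retained in all, or all but
    max_mismatches, of the receptors that keep the ancestral function."""
    mismatches = sum(1 for r in residues if r != ancestral_state)
    return mismatches <= max_mismatches

# Hypothetical alignment columns (assumption, for illustration only).
# Each entry: site label -> (AncGR1 state, residues observed in extant
# DOC/aldosterone-sensitive receptors).
toy_columns = {
    "siteA": ("H", ["H", "H", "H", "H", "H"]),  # invariant
    "siteB": ("Y", ["Y", "Y", "F", "Y", "Y"]),  # all but one
    "siteC": ("A", ["A", "S", "T", "A", "S"]),  # variable
}

candidates = [
    site for site, (anc, column) in toy_columns.items()
    if is_candidate(anc, column)
]
print(candidates)
```

In this toy input only the first two columns pass the filter; the structural analysis then has to decide which surviving sites plausibly interact with helix 7.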

Figure 2: Identification of restrictive substitutions that impede reversibility.

a, Group W residues are conserved in the AncGR1-like state in virtually all extant receptors that retain the ancestral function. Parentheses show the number of sequences in each group. b, X-ray crystal structure (PDB accession 3GN8) of AncGR2 (bronze) with dexamethasone (purple). Repositioned helix 7 is shown in grey. Residues substituted between AncGR1 and AncGR2 are marked with spheres at the α-carbon. Cyan, candidate restrictive substitutions (group W). Sites in groups X, Y and Z are shown in medium, dark and light green, respectively. Blue, Val234Phe. c, Ligand pockets of AncGR1 (green, with cortisol) and AncGR2 (bronze, with dexamethasone). When in their ancestral states, group w residues (cyan) are predicted to support the ancestral conformation of helix 7 (grey), but to destabilize that conformation when in the derived states found in AncGR2.

To test the hypothesis that group W substitutions impede direct reversal of the key function-switching mutations, we used site-directed mutagenesis to reverse group W in the AncGR2-xyz background. We then determined their functional effects using a luciferase reporter assay (Fig. 3). As predicted, reversing all five group W mutations restored the ancestral phenotype, yielding a sensitive, promiscuous receptor with a nanomolar response to both mineralocorticoids and cortisol and, like AncGR1, a preference for aldosterone (Fig. 3a). All five group W substitutions contribute to AncGR2’s intolerance of the ancestral structure/function: Tyr107Ala alone partially rescued the transcriptional function of AncGR2-xyz and shifted it substantially towards the ancestral promiscuous phenotype, as did the pairs Gln84His/Cys91Tyr and Gln114Gly/Met197Leu (Fig. 3b). Restoring the single mutations Gln84His, Cys91Tyr, Gln114Gly and Met197Leu had no or very weak effects (Supplementary Fig. 4), presumably because of the structural interactions within each pair required to improve the receptor’s function.

Figure 3: Restrictive substitutions impede evolutionary reversibility.

a, When group W substitutions are restored to their ancestral state (w), the non-functional AncGR2-xyz is rescued, and the ancestral sensitivity to all three ligands is restored. Fold increase in luciferase reporter expression is shown with cortisol (purple), aldosterone (solid blue), and DOC (dashed blue). b, Group W substitutions all impede reversibility: restoring the ancestral states singly (Tyr107Ala) or in structurally interacting pairs (Gln84His/Cys91Tyr and Gln114Gly/Met197Leu) partially rescues AncGR2-xyz. Error bars, s.e.m.

To test the hypothesis that group W substitutions specifically undermine the stability of the ancestral helix 7 conformation, we restored the ancestral state (w) in all possible combinations of x, y and z in the AncGR2 background (Fig. 4 and Supplementary Fig. 5). As predicted, reversal to x always impairs both the ancestral and derived functions unless group W has been reversed first. Taken together, our experiments indicate that these five mutations prevent direct evolutionary reversal by weakening aspects of the receptor structure that were required to support the ancestral conformation. By reversing all of these restrictive substitutions, the ancestral structure and function can be largely restored. The reversed AncGR2-xyzw remains slightly less sensitive to hormones than AncGR1, indicating that some of the other 25 substitutions during the AncGR1–AncGR2 interval make further, minor contributions to impeding direct evolutionary reversal (Fig. 4a, b). The restrictive effect of mutations in group W on the reversal of group X does not depend on whether these other 25 substitutions are in their ancestral or derived states (Fig. 4a, b and Supplementary Fig. 5). Although there are other combinations of individual substitutions that we did not test, our results indicate that neither the restrictive effect of group W mutations nor the permissive effect of reversing them depends narrowly on a specific genetic background.

Figure 4: Epistasis limits trajectories of reverse and forward evolution.

The corners of each hypercube represent states for residue sets X, Y, Z and W. Edges show pathways between the derived (XYZW, bronze) and ancestral (xyzw, green) states. Red edges show unlikely evolutionary paths through non-functional intermediates; black paths pass through functional intermediates. Filled shapes at vertices indicate sensitivity to aldosterone (blue squares), DOC (blue circles), and cortisol (purple triangles); empty shapes, no activation by these hormones. Tables below each cube show sensitivity to each hormone as the EC50 (concentration required for half-maximal activation), with 95% confidence interval shown in parentheses. Dashes, no activation; asterisks, state combinations in AncGR2 and AncGR1. a, Limited evolutionary pathways to reverse AncGR2 (bronze) to the ancestral structure and function. Mutations were introduced in the AncGR2 background. b, Functional effect of substitutions during ‘forward’ evolution when introduced into AncGR1 (green). The sets X, Y, Z and W each contain more than one site, so the complete sequence space for each cube has 12 dimensions and 4,096 vertices.
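The counts quoted in this legend follow from simple combinatorics, which the short Python sketch below reproduces. The group sizes are taken from the substitutions listed in the text (two in X, three in Y, two in Z, five in W); the variable names are our own.

```python
from itertools import permutations, product

# Number of sites in each group, as listed in the text and Fig. 1 legend.
group_sizes = {"X": 2, "Y": 3, "Z": 2, "W": 5}

# Treating each group as one unit (ancestral = 0, derived = 1) gives a
# four-dimensional hypercube.
group_vertices = list(product((0, 1), repeat=len(group_sizes)))

# A direct path between opposite corners flips each group exactly once,
# so direct paths correspond to orderings of the four groups.
direct_paths = list(permutations(group_sizes))

# Treating every site separately gives the full sequence space.
total_sites = sum(group_sizes.values())
site_vertices = 2 ** total_sites

print(len(group_vertices), len(direct_paths), total_sites, site_vertices)
# prints: 16 24 12 4096
```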

The restrictive mutations that impede direct reversal may have been adaptive or neutral when they occurred. To characterize the ‘forward’ effect of group W mutations on receptor function, we recapitulated them in the AncGR1 background with various combinations of groups X, Y and Z (Fig. 4b). In AncGR1-XYZ and other X-containing backgrounds, W mutations cause a weak or moderate improvement in receptor activation and cortisol-specificity, presumably by stabilizing the derived position of helix 7 and its interaction with cortisol. In the AncGR1 background and all other combinations that include x, however, W mutations markedly reduce sensitivity to all hormones. Because selection makes evolutionary trajectories that pass through non-functional intermediates far less likely than those involving functional intermediates at every step26, W mutations are unlikely to have been complete before the remodelling and functional shift triggered by group X. Once X was in place, however, the W mutations that prevent direct evolutionary reversal probably optimized the derived function or were neutral.

Our findings indicate that epistatic modifiers—at least some of which occurred after the new function of the glucocorticoid receptor evolved—acted as an evolutionary ratchet, making re-evolution of the ancestral structure–function relations far more difficult than it was initially. Reversal by a direct path that restores the key residues in group X became exceedingly unlikely, because features that once enabled the ancestral conformation of helix 7 had been modified. To restore the ancestral conformation by reversing group X, the restrictive effect of the substitutions in group W must first be reversed, as must group Y (Fig. 4a, b). Reversal to w and y in the absence of x, however, does nothing to enhance the ancestral function; in most contexts, reversing these mutations substantially impairs both the ancestral and derived functions (Fig. 4a, b). Furthermore, the permissive effect of reversing four of the mutations in group W requires pairs of substitutions at interacting sites. Selection for the ancestral function would therefore not be sufficient to drive AncGR2 back to the ancestral states of w and x, because passage through deleterious and/or neutral intermediates would be required; the probability of each required substitution would be low, and the probability of all in combination would be virtually zero.
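The way such a ratchet prunes direct reversal paths can be illustrated with a deliberately simplified model. The sketch below assumes a single toy rule, that the ancestral x state is tolerated only after both w and y have been restored, and counts the orderings of group reversals that never pass through a non-functional intermediate. The rule and the function names are our own assumptions, not measured data; in particular the model treats w and y reversal in the absence of x as harmless, whereas the experiments show that it usually impairs function, so the real landscape is less favourable than this count suggests.

```python
from itertools import permutations

GROUPS = ("X", "Y", "Z", "W")

def is_functional(state):
    """Toy rule (assumption): ancestral x is tolerated only when w and y
    are also ancestral. state maps group -> 'derived' or 'ancestral'."""
    if state["X"] == "ancestral":
        return state["W"] == "ancestral" and state["Y"] == "ancestral"
    return True

def accessible(order):
    """Reverse the groups one at a time in the given order, starting from
    the fully derived state; fail at the first non-functional state."""
    state = {g: "derived" for g in GROUPS}
    for group in order:
        state[group] = "ancestral"
        if not is_functional(state):
            return False
    return True

paths = list(permutations(GROUPS))
open_paths = [p for p in paths if accessible(p)]
print(len(open_paths), "of", len(paths), "direct paths remain open")
```

Under this rule only the orderings in which X is reversed after both W and Y survive, which is one third of the 24 direct paths.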

We have examined the sufficiency of selection to drive direct evolutionary reversal. There may be other potentially permissive mutations, of unknown number, that could compensate for the restrictive effect of group W and allow the ancestral conformation to be restored. Reversal by such indirect pathways could be driven by selection, however, only if these other mutations, unlike those we studied, could somehow relieve the steric clashes and restore the lost stabilizing interactions that make the ancestral position of helix 7 intolerable in AncGR2, and also independently restore the ancestral function when helix 7 is in its radically different derived conformation. Whether or not mutations that could achieve these dual ends exist, reversal to the ancestral conformation would require a considerably more complex pathway than was necessary before the ratchet effect of W evolved.

The extent to which our observations concerning the evolutionary reversibility of glucocorticoid receptors can be generalized to other proteins requires further research. We predict that future investigations, like ours, will support a molecular version of Dollo’s law4: as evolution proceeds, shifts in protein structure–function relations become increasingly difficult to reverse whenever those shifts have complex architectures, such as requiring conformational changes or epistatically interacting substitutions2,8,16. Phenotypes at higher levels of genetic organization may also display ratchet-like modes of evolution if optimization of a derived phenotype involves changes in one gene, regulatory element, morphological structure, or developmental process that epistatically undermine the conditions that enabled the ancestral state at other such ‘loci’. In contrast, phenotypic shifts caused by single or additive genetic changes are likely to be readily reversible1,27.

Our observations suggest that history and contingency during glucocorticoid receptor evolution strongly limited the pathways that could be deterministically followed under selection. The ‘adaptive peak’ represented by the promiscuous AncGR1 is a relatively close neighbour in sequence space to the more specific AncGR2. This peak was occupied in the ancestor of jawed vertebrates—indicating that no intrinsic constraints prevent its realization—but it became far more difficult to access just 40 million years later because of intervening epistatic mutations. Selection is an extraordinarily powerful evolutionary force;28 nevertheless, our observations suggest that, because of the complexity of glucocorticoid receptor architecture, low-probability permissive substitutions were required to open some mutational trajectories to exploration under selection24,29, whereas restrictive substitutions closed other potential paths. Under selection, some kind of adaptation will always occur30, but the specific adaptive forms that are realized depend on the historical trajectory that precedes them. The conditions that once facilitated evolution of the glucocorticoid receptor's ancestors were destroyed during the realization of its present form2,4,7,16,18. The past is difficult to recover because it was built on the foundation of its own history, one irrevocably different from that of the present and its many possible futures.

Methods Summary

Peptide sequences of the AncGR1 and AncGR2 LBDs were inferred using maximum-likelihood phylogenetics from an alignment of 60 peptide sequences of extant steroid and related receptors as previously described24. Complementary DNAs coding for these peptides were synthesized and subcloned and expressed as fusion constructs with Gal4-DBD (DNA-binding domain) in Chinese hamster ovary (CHO-K1) cells. Activation was measured using dual luciferase assays in the presence of increasing concentrations of various hormones. AncGR2-LBD was bacterially expressed as a maltose-binding protein/TEV fusion in the presence of dexamethasone, then purified, cleaved, dialysed, concentrated, crystallized and diffracted using X-ray crystallography at the Advanced Photon Source. The atomic structure was determined to 2.5 Å by molecular replacement based on the previously described human glucocorticoid-receptor-based homology model of AncGR2-LBD24, followed by further refinement. Details are presented in Supplementary Information.

Online Methods

Ancestral protein sequences

AncGR1 and AncGR2 sequences were inferred by maximum likelihood31 using PAML 3.15 software on the maximum-likelihood phylogeny of 60 amino acid sequences of extant steroid and related receptors (see refs 23 and 24 for details). In brief, the likelihood of each possible amino acid state was calculated given the extant sequence data, the maximum-likelihood phylogeny, the Jones–Taylor–Thornton amino acid replacement model (which had 100% posterior probability in a Bayesian evaluation of several protein models), and a gamma distribution of among-site rate variation. The maximum-likelihood amino acid sequences of the LBDs (including the carboxy-terminal extension) of AncGR1 and AncGR2 were back-translated assuming human codon bias. Coding DNAs were then synthesized (GenScript), verified by sequencing, and cloned into pSG5-Gal4-DBD with the human glucocorticoid receptor hinge domain for expression and characterization. The functions of AncGR1- and AncGR2-LBD fusion proteins, assayed as described below, were robust to statistical ambiguity in the inferred ancestral sequence24.
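The per-site likelihood calculation can be illustrated with a minimal Python sketch for one ancestral node whose descendants are all observed tips. This is not the PAML analysis: for brevity it swaps in an equal-rates (Poisson) replacement model with uniform amino acid frequencies instead of Jones–Taylor–Thornton, omits gamma-distributed rate variation, and uses hypothetical residues and branch lengths. All function names are our own.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def transition_prob(a, b, t):
    """P(a -> b) along a branch of length t (expected substitutions per
    site) under an equal-rates model with uniform frequencies."""
    n = len(AMINO_ACIDS)
    decay = math.exp(-t * n / (n - 1))
    if a == b:
        return 1 / n + (1 - 1 / n) * decay
    return (1 - decay) / n

def ancestral_posteriors(leaves):
    """leaves: (observed residue, branch length) pairs attached directly
    to the ancestral node. Returns {residue: posterior probability}."""
    likelihood = {}
    for anc in AMINO_ACIDS:
        value = 1 / len(AMINO_ACIDS)  # uniform weight on the ancestral state
        for observed, t in leaves:
            value *= transition_prob(anc, observed, t)
        likelihood[anc] = value
    total = sum(likelihood.values())
    return {aa: value / total for aa, value in likelihood.items()}

# Hypothetical site: two descendants carry Ser, one carries Pro.
post = ancestral_posteriors([("S", 0.1), ("S", 0.2), ("P", 0.3)])
best = max(post, key=post.get)
print(best, round(post[best], 3))
```

The state with the highest posterior is taken as the reconstructed residue; on a full phylogeny the same quantity is obtained by propagating conditional likelihoods up the tree rather than from tips alone.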

Molecular biology

The hormone-dependent transcriptional activity of resurrected ancestral receptors and their variants was assayed using a luciferase reporter system. CHO-K1 cells were grown in 96-well plates and transfected with 1 ng of receptor plasmid, 100 ng of a UAS-driven firefly luciferase reporter (pFRluc), and 0.1 ng of the constitutive phRLtk Renilla luciferase reporter plasmid, using Lipofectamine and Plus Reagent in OPTIMEM (Invitrogen). After 4 h, transfection medium was replaced with phenol-red-free αMEM supplemented with 10% dextran-charcoal-stripped FBS (Hyclone). After overnight recovery, cells were incubated in triplicate with aldosterone, cortisol or 11-deoxycorticosterone from 10⁻¹¹ to 10⁻⁵ M for 24 h, then assayed using Dual-Glo luciferase (Promega). Firefly luciferase activity was normalized by Renilla luciferase activity. Dose-response relationships were estimated using nonlinear regression in Prism4 software (GraphPad Software, Inc.); fold increase in activation was calculated relative to vehicle-only control. Mutagenesis to recapitulate historical substitutions was performed using QuikChange (Stratagene) and verified by sequencing.
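The normalization and fold-increase calculation, together with the kind of sigmoidal curve from which an EC50 is read, can be sketched as follows. The exact regression model used in Prism is not stated here, so the Hill-type equation is an assumption, and the luminescence readings and curve parameters are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def fold_activation(firefly, renilla, vehicle_firefly, vehicle_renilla):
    """Firefly/Renilla ratio of a treated well, expressed relative to the
    same ratio in the vehicle-only control."""
    return (firefly / renilla) / (vehicle_firefly / vehicle_renilla)

def hill_response(dose_molar, bottom, top, ec50_molar, hill_slope=1.0):
    """Sigmoidal dose-response curve (assumed form): the response rises
    from bottom to top and is halfway between them at the EC50."""
    return bottom + (top - bottom) / (1 + (ec50_molar / dose_molar) ** hill_slope)

# Hypothetical raw luminescence readings.
fold = fold_activation(firefly=5000.0, renilla=100.0,
                       vehicle_firefly=1000.0, vehicle_renilla=200.0)
print(fold)  # (5000/100) / (1000/200) = 10.0

# Doses spanning the assay range, 1e-11 to 1e-5 M in tenfold steps.
doses = [10.0 ** exponent for exponent in range(-11, -4)]
curve = [hill_response(d, bottom=1.0, top=20.0, ec50_molar=1e-8) for d in doses]
for dose, response in zip(doses, curve):
    print(f"{dose:.0e} M -> {response:.2f}-fold")
```

Fitting then amounts to choosing the bottom, top, EC50 and slope that minimize the squared difference between such a curve and the measured fold increases.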

Structural biology

The atomic structure of AncGR2-LBD was determined using X-ray crystallography. AncGR2-LBD cDNA (residues 1–248) was cloned into pMALCH10T (a gift from J. Tesmer) and expressed as a maltose-binding protein/TEV-fusion protein in BL21(DE3) pLys cells in the presence of 50 μM dexamethasone using standard methods. Expressed protein was purified using affinity chromatography. After TEV cleavage, the tagged fusion protein was removed using a nickel affinity column, polished by gel filtration, dialysed (200 mM sodium chloride, 50 μM HEPES, pH 7.8 and 50 μM CHAPS), and concentrated to 3–5 mg ml⁻¹. Crystals of AncGR2-LBD with dexamethasone were grown by hanging drop vapour diffusion at 22 °C from solutions containing 0.75 μl of protein at 3–5 mg ml⁻¹ and 0.75 μl of crystallant (0.5–0.75 M ammonium, pH 7.4), and a 21-amino-acid nuclear receptor box-3 peptide of glucocorticoid receptor coactivator human TIF2 (⁺H₃N-PVSPKKKENALLRYLLDKDDT-CO₂⁻, Synbiosci). Crystals were cryoprotected in crystallant with 25% glycerol and flash-cooled in liquid N₂. Data to 2.5 Å resolution were collected at 100 K at the South East Regional Collaborative Access Team at the Advanced Photon Source, and were processed and scaled with HKL2000 (ref. 32; Supplementary Table 1). Initial phases for the AncGR2-dexamethasone complex were determined using PHASER33 in the CCP4 software suite. The previously described homology model of AncGR2 (ref. 24) was used as a molecular replacement search model. All structures were refined using COOT34 and CNS35. The X-ray crystal structure of AncGR2 (PDB accession 3GN8) was compared to the model of AncGR1 (ref. 24), which was previously generated by homology modelling based on the X-ray crystal structure of its evolutionary precursor AncCR (PDB accession 2Q1V), with which it is identical at 90% of sites.