Much has been learnt about the evolution of proteins from primary sequences, particularly by including the transition from gene sequences to amino acid sequences. However, knowledge of proteins' tertiary structures is now essential, and our understanding has to recognize both the physical structure and the dynamics of proteins. Writing in Nature Ecology & Evolution, Levin and Mishmar1 do just this and gain an important insight into factors affecting evolution. The authors have examined over 3,400 genes from over 90 species for resolved tertiary protein structures. These are mostly nuclear-coded proteins, with a small number of mitochondrial proteins as well, although many more mitochondrial genomes have been sequenced. Don't be put off by their non-standard term ‘functional nodal mutations’ (mutations conserved over five or more branches); the main thing is that they use tertiary structures to support their conclusions.

The authors study 300 million years of amniote (mammal, bird and reptile) evolution, covering over 3,200 non-synonymous mutations. They search in particular for compensatory pairs of changes that lie within 5 Å of each other in the resolved tertiary structure. The procedure does not claim to identify all compensatory mutations, only some of the best. They show that some of these play a role in thermoregulation in birds and mammals. This is the first report of the phenomenon, and an important observation derived from an understanding of tertiary structure.
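The distance criterion can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `close_pairs`, the toy coordinates, and the choice of one representative coordinate per residue (for example a C-alpha atom) are all assumptions made here for illustration, as is treating "within 5 Å" as less than or equal to 5 Å.

```python
from itertools import combinations
from math import dist  # Euclidean distance, available from Python 3.8


def close_pairs(coords, mutated_sites, cutoff=5.0):
    """Return pairs of mutated sites whose coordinates lie within `cutoff` angstroms.

    coords: dict mapping residue number -> (x, y, z) in angstroms.
            Hypothetical input: one representative atom per residue.
    mutated_sites: iterable of residue numbers carrying a substitution.
    Sites without a resolved coordinate are skipped.
    """
    sites = sorted(s for s in mutated_sites if s in coords)
    return [
        (a, b)
        for a, b in combinations(sites, 2)
        if dist(coords[a], coords[b]) <= cutoff
    ]


# Toy coordinates, invented for illustration only.
toy_coords = {
    10: (0.0, 0.0, 0.0),
    42: (3.0, 3.0, 0.0),   # about 4.2 angstroms from residue 10
    77: (20.0, 0.0, 0.0),  # far from both
}
print(close_pairs(toy_coords, [10, 42, 77]))
```

Spatial proximity alone only nominates candidate pairs; whether a pair is genuinely compensatory depends on further evidence of the kind the authors consider.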

Eukaryotes have had a particularly high number of insertions/deletions at the protein level during evolution, and it is important to remove these insertions/deletions for accurate alignments. Reassuringly, the authors carried out both structure-based and protein-sequence-based alignments and found no major differences. This welcome news suggests that the alignment algorithms are working well.

An important point is that it is increasingly accepted, on mathematical grounds, that the Markov models we know and love lose information when resolving deeper divergences2. We have difficulty even with multicellular animals3, let alone eukaryotes as a whole or prokaryotes (akaryotes). However, tertiary structure appears to hold information for longer, and so can be very useful for these deeper divergences. Such analyses require tertiary structures, and the authors therefore limit their conclusions to those proteins whose tertiary structures have already been determined.
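One simple way to see why sequence signal fades with depth is the saturation of observed differences under the Jukes-Cantor model, the simplest Markov substitution model. The sketch below is an illustration chosen here, not the analysis in ref. 2; the function name and the distances printed are assumptions. For k equally likely states and an evolutionary distance of d substitutions per site, the expected proportion of differing sites is (k − 1)/k × (1 − exp(−k d/(k − 1))).

```python
from math import exp


def jc_expected_difference(d, n_states=4):
    """Expected proportion of differing sites between two sequences
    separated by d substitutions per site, under a Jukes-Cantor-style
    model with n_states equally likely states (4 for nucleotides)."""
    k = n_states
    return (k - 1) / k * (1.0 - exp(-k * d / (k - 1)))


# As d grows, the observed difference flattens towards (k - 1) / k,
# so very different depths become hard to tell apart from sequence alone.
for d in (0.1, 0.5, 1.0, 2.0, 4.0):
    print(f"d = {d:>3}: expected difference = {jc_expected_difference(d):.3f}")
```

Near the plateau, a small error in the observed difference corresponds to a very large uncertainty in d, which is one intuition for why deep divergences are hard to resolve with such models.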

It is time to move beyond the notion that all populations are similar in their molecular evolutionary dynamics. Lynch and Abegg4 have demonstrated a major effect of population size. For example, larger populations show a much-increased propensity for pairs of mutations that are individually neutral but have a positive effect when combined. This may be because larger populations tend to be composed of smaller organisms. Larger populations also have more neutral genetic diversity within them, and this appears to be important too. This means that smaller organisms (with larger populations) are much more likely to be innovative in evolution, something that is perhaps illustrated biochemically by bacteria.

For these reasons we have to determine the tertiary (and quaternary, dynamic) structure of proteins. Some will be determined by X-ray methods (and increasingly at very low temperature), others by calculation. As a first step, we will need to know just how the tertiary structures themselves evolve in order to get proper alignments. But ultimately, we will need to know how the molecules really function. The computer program I-TASSER5 for predicting tertiary structure deserves a mention here. The biennial CASP (Critical Assessment of Techniques for Protein Structure Prediction) competition aims to determine the accuracy of prediction: some newly determined protein structures are withheld, and computer programs compete to predict them. I-TASSER has won the competition several times, and computer prediction of tertiary structure is getting quite good (though it remains time consuming, even for computers). By drawing on both known and predicted structures, I-TASSER will be particularly useful for learning more about tertiary and quaternary structures. Chemists know a lot already and are increasingly able to predict accurately the structures of (at least smaller) molecules before those structures are determined by X-rays6, but there is much more to learn7.

Levin and Mishmar1 have used 3D protein structure to show that thermoregulation genes are more likely to carry ‘functional nodal mutations’ than many other genes, and that they are shared between birds and mammals. But understanding protein evolution can tell us much more than this, including why some animals or plants appear more successful than others. We need much more information about protein evolution to understand why some proteins can catalyse different reactions. To achieve this, we will need much more information about protein structure, and better tools to use that information.