Main

Paleovirology is an emerging field that studies ancient viruses. These viruses typically integrated into the germ line of their hosts millions of years ago and today exist as viral fossils, sometimes accounting for 5–10% of the host's genome. The most common examples of ancient viruses are endogenous retroviruses (ERVs), which have been associated with multiple diseases, such as cancer and autoimmunity. Now, three recent genome analysis studies have furthered our understanding of ERVs.

Aswad and Katzourakis¹ analysed the evolutionary history of viral superantigens (vSAGs), which are unique virulence genes that have been implicated in the pathogenesis of conditions such as sepsis and autoimmunity. vSAGs have been well described in the betaretrovirus mouse mammary tumour virus (MMTV) and are also found in three South American herpesviruses of the genus Rhadinovirus: saimirine herpesvirus 2 (SaHV2; which infects squirrel monkeys), ateline herpesvirus 3 (AtHV3; which infects spider monkeys) and rodent herpesvirus Peru (RHVP; which infects pygmy rice rats). To understand the evolutionary history of vSAGs, Aswad and Katzourakis searched the National Center for Biotechnology Information (NCBI) genome database for sequences with similarity to vSAGs and identified vSAG-containing betaretroviral ERVs in 20 mammalian genomes. A phylogenetic analysis revealed two major lineages of vSAGs: one that includes primarily viruses that infect primates, and another that includes primarily viruses that infect rodents. Notably, this analysis also suggests that horizontal gene transfer (HGT) occurred from retroviruses to herpesviruses. Reconstructing timelines of herpesviral vSAGs and the ERVs of their hosts showed that the split between SaHV2 and AtHV3 (which occurred approximately 10.6 million years ago) is much more recent than the split between their hosts. Therefore, these data suggest that the herpesviruses resulted from cross-species transmission after their hosts diverged. Furthermore, two additional New World monkeys carry multiple independent viral integrations, which is consistent with the existence of a single ancestral infectious retrovirus, and this ancestral retrovirus is located in a position basal to the clade of viruses that infect squirrel and spider monkeys. Together, these observations support a scenario in which an infectious retrovirus is circulating in South America and leaving endogenous footprints in both host and herpesvirus genomes. Moreover, the observation of HGT between unrelated viruses as a convergent evolutionary event may warrant revisiting the genetic interactions between another rhadinovirus, human herpesvirus 8 (the causative agent of Kaposi sarcoma, an illness prevalent in individuals who develop AIDS), and the HIV-1 retrovirus.

ERVs have been associated with cancer, although such links are still controversial. In humans, the lifetime risk of cancer is positively correlated with height, which suggests a link between body size and cancer risk. However, across different mammalian species, cancer risk does not increase with body size, an observation that is known as 'Peto's paradox'. Katzourakis et al.² suggest a possible explanation for this paradox that is based on ERV activity. The authors analysed 38 mammalian genomes and found that larger-bodied species have a lower frequency of ERVs than smaller-bodied mammals. Body size explained 68% of the variance in the mean age of ERVs per genome (which reflects when the ERVs integrated into the host's genome) and 37% of the variance in the number of ERVs acquired in the past 10 million years. These data suggest that larger body size limits the number of recently replicating ERVs. As retroviral integration can be tumorigenic, it is plausible that reduced ERV abundance resulted from evolutionary pressure to decrease the risk of developing cancer.

The human endogenous retrovirus-K (HERV-K) has been linked to cancer, autoimmune diseases, amyotrophic lateral sclerosis and HIV-1 infection. To determine whether HERV-K has a causative role in these diseases, it is important to know how individuals differ in the number of integrated viral copies (loci) in their genomes. Marchi et al.³ mined more than 400 human genomes and found significant variability in the number of these loci. Combining their findings with those from a previous study⁴, Marchi et al. identified 17 unfixed loci that are absent from the human reference genome, with each individual carrying six of these loci on average. The authors then compared the actual number of these loci with the number expected under a model that assumes a constant rate of replication since the human–chimpanzee split, and showed that HERV-K was replicating until at least ~250,000 years ago. However, whether germline replication continues today is still unclear.

Collectively, these studies further our understanding of the origin and evolution of different ERVs and provide new insight into the possible links between ERVs and disease, which merit further investigation.