  • Letter

Widespread adenine N6-methylation of active genes in fungi

Abstract

N6-methyldeoxyadenine (6mA) is a noncanonical DNA base modification present at low levels in plant and animal genomes (refs. 1–4), but its prevalence and association with genome function in other eukaryotic lineages remain poorly understood. Here we report that abundant 6mA is associated with transcriptionally active genes in early-diverging fungal lineages (ref. 5). Using single-molecule long-read sequencing of 16 diverse fungal genomes, we observed that up to 2.8% of all adenines were methylated in early-diverging fungi, far exceeding levels observed in other eukaryotes and more derived fungi. 6mA occurred symmetrically at ApT dinucleotides and was concentrated in dense methylated adenine clusters surrounding the transcriptional start sites of expressed genes; its distribution was inversely correlated with that of 5-methylcytosine. Our results show a striking contrast in the genomic distributions of 6mA and 5-methylcytosine and reinforce a distinct role for 6mA as a gene-expression-associated epigenomic mark in eukaryotes.

Figure 1: Phylogenetic diversity of genomes sequenced in this study and associated 6mA features.
Figure 2: Distribution of 6mA marks across early-diverging fungal genomes.
Figure 3: MAC characteristics across a subset of early-diverging fungi.
Figure 4: 6mA is associated with active genes.

Accession codes

Primary accessions

BioProject

References

  1. Fu, Y. et al. N6-methyldeoxyadenosine marks active transcription start sites in Chlamydomonas. Cell 161, 879–892 (2015).

  2. Zhang, G. et al. N6-methyladenine DNA modification in Drosophila. Cell 161, 893–906 (2015).

  3. Greer, E.L. et al. DNA methylation on N6-adenine in C. elegans. Cell 161, 868–878 (2015).

  4. Wu, T.P. et al. DNA methylation on N(6)-adenine in mammalian embryonic stem cells. Nature 532, 329–333 (2016).

  5. Lücking, R., Huhndorf, S., Pfister, D.H., Plata, E.R. & Lumbsch, H.T. Fungi evolved right on track. Mycologia 101, 810–822 (2009).

  6. Zemach, A., McDaniel, I.E., Silva, P. & Zilberman, D. Genome-wide evolutionary analysis of eukaryotic DNA methylation. Science 328, 916–919 (2010).

  7. Blow, M.J. et al. The epigenomic landscape of prokaryotes. PLoS Genet. 12, e1005854 (2016).

  8. Fu, Y., Dominissini, D., Rechavi, G. & He, C. Gene expression regulation mediated through reversible m6A RNA methylation. Nat. Rev. Genet. 15, 293–306 (2014).

  9. Taylor, J.W. & Berbee, M.L. Dating divergences in the fungal tree of life: review and new analyses. Mycologia 98, 838–849 (2006).

  10. Spatafora, J.W. et al. A phylum-level phylogenetic classification of zygomycete fungi based on genome-scale data. Mycologia 108, 1028–1046 (2016).

  11. Zhang, W., Spector, T.D., Deloukas, P., Bell, J.T. & Engelhardt, B.E. Predicting genome-wide DNA methylation using methylation marks, genomic position, and DNA regulatory elements. Genome Biol. 16, 14 (2015).

  12. Flusberg, B.A. et al. Direct detection of DNA methylation during single-molecule, real-time sequencing. Nat. Methods 7, 461–465 (2010).

  13. Breiling, A. & Lyko, F. Epigenetic regulatory functions of DNA modifications: 5-methylcytosine and beyond. Epigenetics Chromatin 8, 24 (2015).

  14. Grishkevich, V., Hashimshony, T. & Yanai, I. Core promoter T-blocks correlate with gene expression levels in C. elegans. Genome Res. 21, 707–717 (2011).

  15. Nawrocki, E.P. et al. Rfam 12.0: updates to the RNA families database. Nucleic Acids Res. 43, D130–D137 (2015).

  16. Parra, G., Bradnam, K. & Korf, I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23, 1061–1067 (2007).

  17. Finn, R.D. et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 44, D279–D285 (2016).

  18. Obraztsova, I.N., Prados, N., Holzmann, K., Avalos, J. & Cerdá-Olmedo, E. Genetic damage following introduction of DNA in Phycomyces. Fungal Genet. Biol. 41, 168–180 (2004).

  19. Solomon, K.V. et al. Early-branching gut fungi possess a large, comprehensive array of biomass-degrading enzymes. Science 351, 1192–1195 (2016).

  20. Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).

  21. Lam, K.-K., LaButti, K., Khalak, A. & Tse, D. FinisherSC: a repeat-aware tool for upgrading de novo assembly using long reads. Bioinformatics 31, 3207–3209 (2015).

  22. Chin, C.-S. et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat. Methods 10, 563–569 (2013).

  23. Grigoriev, I.V. et al. MycoCosm portal: gearing up for 1000 fungal genomes. Nucleic Acids Res. 42, D699–D704 (2014).

  24. Price, A.L., Jones, N.C. & Pevzner, P.A. De novo identification of repeat families in large genomes. Bioinformatics 21 (Suppl. 1), i351–i358 (2005).

  25. Bembom, O. Sequence logos for DNA sequence alignments http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.431.3748 (2014).

  26. Lawrence, M. et al. Software for computing and annotating genomic ranges. PLoS Comput. Biol. 9, e1003118 (2013).

  27. Yin, R. et al. Ascorbic acid enhances Tet-mediated 5-methylcytosine oxidation and promotes DNA demethylation in mammals. J. Am. Chem. Soc. 135, 10396–10403 (2013).

  28. Dominissini, D. et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature 485, 201–206 (2012).

  29. Dominissini, D., Moshitch-Moshkovitz, S., Salmon-Divon, M., Amariglio, N. & Rechavi, G. Transcriptome-wide mapping of N(6)-methyladenosine by m(6)A-seq based on immunocapturing and massively parallel sequencing. Nat. Protoc. 8, 176–189 (2013).

  30. Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595 (2010).

  31. Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).

  32. Enright, A.J., Van Dongen, S. & Ouzounis, C.A. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 30, 1575–1584 (2002).

  33. Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).

  34. Merchant, S.S. et al. The Chlamydomonas genome reveals the evolution of key animal and plant functions. Science 318, 245–250 (2007).

  35. Attrill, H. et al. FlyBase: establishing a Gene Group resource for Drosophila melanogaster. Nucleic Acids Res. 44, D786–D792 (2016).

  36. Church, D.M. et al. Modernizing reference genome assemblies. PLoS Biol. 9, e1001091 (2011).

  37. C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science 282, 2012–2018 (1998).

  38. Katoh, K. & Standley, D.M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).

  39. Castresana, J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 17, 540–552 (2000).

  40. Parkhomchuk, D. et al. Transcriptome analysis by strand-specific sequencing of complementary DNA. Nucleic Acids Res. 37, e123 (2009).

  41. Trapnell, C., Pachter, L. & Salzberg, S.L. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111 (2009).

  42. Trapnell, C. et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 7, 562–578 (2012).

  43. Martin, J. et al. Rnnotator: an automated de novo transcriptome assembly pipeline from stranded RNA-Seq reads. BMC Genomics 11, 663 (2010).

  44. Grabherr, M.G. et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).

  45. Wu, T.D. & Watanabe, C.K. GMAP: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics 21, 1859–1875 (2005).

  46. Urich, M.A., Nery, J.R., Lister, R., Schmitz, R.J. & Ecker, J.R. MethylC-seq library preparation for base-resolution whole-genome bisulfite sequencing. Nat. Protoc. 10, 475–483 (2015).

  47. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).

  48. Schultz, M.D. et al. Human body epigenome maps reveal noncanonical DNA methylation variation. Nature 523, 212–216 (2015).

  49. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10–12 (2011).

  50. Schultz, M.D., Schmitz, R.J. & Ecker, J.R. 'Leveling' the playing field for analyses of single-base resolution DNA methylomes. Trends Genet. 28, 583–585 (2012).

Acknowledgements

We thank J.K. Henske, C. Swift, S.P. Gilmore and K.V. Solomon for preparing DNA and/or RNA for P. finnis and A. robustus; T. Porter for DNA and RNA preparation for Catenaria anguillulae; P. Liu for preparation of DNA and RNA for R. globosum and L. transversale; and D. Carter-House for preparation of genomic DNA of H. vesiculosa and R. globosum for bisulfite sequencing. For bisulfite sequencing of H. vesiculosa and R. globosum, we thank N.A. Rohr for library preparation and the Georgia Advanced Computing Resource Center (GACRC) for computational resources. Work conducted by the US Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, is supported by the Office of Science of the US Department of Energy under Contract No. DE-AC02-05CH11231. This work was partially supported by funding from the National Science Foundation (DEB-1441715 to J.E.S., DEB-1441604 to J.W.S. and DEB-1354625 to T.Y.J. and I.V.G.). Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. This work was further supported by the Office of Science (BER), US Department of Energy (DE-SC0010352) and the Institute for Collaborative Biotechnologies through grant W911NF-09-0001. R.J.S. is supported by funding from the Office of the Vice President of Research at UGA as well as the Pew Charitable Trusts.

Author information

Contributions

S.J.M., R.O.D. and I.V.G. designed the study. S.J.M. and R.O.D. collected and analyzed data under the supervision of G.K.-R. and I.V.G. R.C.K. optimized the protocol for 6mA IP-sequencing and PacBio library preparation. R.C.K. and C.D. sequenced genomes, including IP-sequencing. R.C.K., C.D., A.J.B. and R.J.S. conducted bisulfite sequencing. S.J.M., R.O.D. and A.J.B. analyzed bisulfite sequencing data. K.B.L., R.L. and T.R.N. conducted LC-mass spectrometry analysis. B.P.B. analyzed mass spectrometry data. K.L., B.B.A. and A.C. assembled genomes. S.J.M., S.H., A.K., S.R.A. and A.S. annotated genomes. A.L. and E.L. assembled transcriptomes. S.J.M., W.S. and G.K.-R. analyzed transcriptomes. A.G., D.C., J.M., T.Y.J., M.A.O'M., J.E.S., J.W.S. and I.V.G. coordinated genome projects. S.J.M. wrote the manuscript with significant input from A.V. and I.V.G.; and I.V.G. coordinated the project.

Corresponding author

Correspondence to Igor V Grigoriev.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Coverage, quality and reproducibility of 6mA marks across fungi.

Line graph showing per-strand coverage of 6mA marks in genomes of a) Dikarya (shown in greyscale to distinguish between different lineages), and b) early-diverging fungi. 6mA marks with coverage below the minimum cutoff of 15x or above a maximum coverage (determined independently for each genome) were removed from downstream analyses. Coverage ranges for each lineage are shown in the figure legend. c) Modification quality value (mQV) distribution for each lineage and filtering cutoff used (black bar, 25 mQV). d) Box plots showing distribution of methylation ratios for methylated adenines in each genome analyzed (black bar within boxes shows median ratio). Methylation ratio refers to the proportion of molecules mapped to a given site that are methylated.
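The filtering described in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Call` record, its field names, the example calls and the maximum coverage of 200 are all hypothetical, and treating the cutoffs as inclusive is an assumption. Only the 15x minimum coverage, the 25 mQV cutoff and the definition of methylation ratio come from the caption.

```python
from dataclasses import dataclass

MIN_COVERAGE = 15  # minimum per-strand coverage stated in the caption
MIN_MQV = 25       # modification quality value cutoff stated in the caption


@dataclass
class Call:
    """One candidate 6mA call (hypothetical record layout)."""
    scaffold: str
    position: int
    strand: str
    coverage: int          # molecules mapped to the site on this strand
    mqv: int               # modification quality value
    methylated_reads: int  # molecules at the site called methylated


def passes_filters(call: Call, max_coverage: int) -> bool:
    """Keep calls inside the coverage window and at or above the mQV cutoff.

    Inclusive bounds are an assumption; the caption does not specify them.
    """
    return MIN_COVERAGE <= call.coverage <= max_coverage and call.mqv >= MIN_MQV


def methylation_ratio(call: Call) -> float:
    """Proportion of molecules mapped to the site that are methylated."""
    return call.methylated_reads / call.coverage


# Hypothetical example data.
calls = [
    Call("scaffold_1", 101, "+", 40, 60, 36),
    Call("scaffold_1", 102, "-", 10, 80, 9),     # coverage below 15x
    Call("scaffold_1", 250, "+", 500, 90, 450),  # above the assumed maximum
    Call("scaffold_2", 17, "-", 30, 20, 12),     # mQV below 25
]
MAX_COVERAGE = 200  # assumption: the real value was chosen per genome

kept = [c for c in calls if passes_filters(c, MAX_COVERAGE)]
ratios = [methylation_ratio(c) for c in kept]
```

Because the maximum coverage was determined independently for each genome, it is passed as a parameter here rather than fixed as a constant.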

Supplementary Figure 2 Validation of SMRT-detected 6mA using mass spectrometry and IP sequencing.

a) Percent methylated adenines as detected by SMRT analysis (SMRT-6mA) and by mass spectrometry (MS-6mA). As a measure of high-confidence SMRT-detected sites, the percent of methylated adenines at ApT sites (SMRT-6mApT) is also included. b) 6mA overlap between SMRT and IP-seq methods within a 100-kb region of H. vesiculosa scaffold 1. Red tiles: MACs detected through SMRT analysis. Black tiles: methylated regions identified through IP sequencing. Significant peaks were detected using MACS2 (ref. 31), Q-value ≤ 0.01. Read coverage tracks for both the control (inner circle) and pulldown (outer circle) are also shown. c) Comparison of 6mA-IP and SMRT analysis results across all lineages examined. Superscript a refers to the percent of 6mA bases identified by SMRT analysis prior to filtering.

Supplementary Figure 3 Surrounding nucleotide context and relative genomic occurrence of 6mA.

a) Occurrence of 6mA at 4-mers in early-diverging fungi. TAT/ATA trinucleotides are underscored in red. b) Percent of total ApT-containing trinucleotides within MACs that are methylated.

Supplementary Figure 4 Expression, thymine and TAT-trinucleotide frequencies flanking and across MACs.

Frequency of TAT trinucleotides (top), thymine bases (middle) and expression (bottom) are plotted upstream, downstream and across MACs. Frequency is calculated as: number of occurrences ÷ total number of MACs. As MACs vary in length, all MACs ≥ 100 bp were selected and divided into 100 sections from start to end, and the average frequency was then calculated within each section. MACs are oriented by gene direction.
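The length normalization in this caption can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the function names, the way positions are assigned to sections (integer scaling of the motif start position) and the example sequences are hypothetical. The ≥ 100 bp selection, the 100 sections and the "occurrences ÷ total number of MACs" definition come from the caption.

```python
N_SECTIONS = 100  # each MAC is divided into 100 sections (from the caption)
MIN_LENGTH = 100  # only MACs of at least 100 bp are used (from the caption)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def orient(seq: str, gene_strand: str) -> str:
    """Reverse-complement MACs on minus-strand genes so all read 5' to 3'."""
    return seq if gene_strand == "+" else seq.translate(_COMPLEMENT)[::-1]


def section_index(pos: int, length: int) -> int:
    """Map a 0-based position within a MAC to one of the 100 sections."""
    return pos * N_SECTIONS // length


def motif_profile(macs, motif="TAT"):
    """Per-section motif frequency: occurrences divided by number of MACs.

    `macs` is an iterable of (sequence, gene_strand) pairs (hypothetical input).
    """
    selected = [orient(s, strand) for s, strand in macs if len(s) >= MIN_LENGTH]
    counts = [0] * N_SECTIONS
    for seq in selected:
        for start in range(len(seq) - len(motif) + 1):
            if seq[start:start + len(motif)] == motif:
                counts[section_index(start, len(seq))] += 1
    if not selected:
        return [0.0] * N_SECTIONS
    return [c / len(selected) for c in counts]


# Hypothetical example: two usable MACs and one that is too short.
example_macs = [
    ("TAT" + "C" * 97, "+"),   # 100 bp, motif in the first section
    ("C" * 197 + "TAT", "+"),  # 200 bp, motif near the end
    ("TATTAT", "+"),           # shorter than 100 bp, ignored
]
profile = motif_profile(example_macs)
```

Passing `motif="T"` to the same function gives the thymine-base profile; an expression profile would need per-base expression values instead of a sequence motif.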

Supplementary Figure 5 6mA is associated with active genes.

a) Expression and methylation level of all methylated genes, sorted by expression level. Genes are sorted by FPKM value (blue → black), with 6mA levels shown immediately below (white → dark green). While methylated genes rarely lack expression (FPKM < 1.0), the level of 6mA has no influence over the magnitude of expression. If the two were related, we would expect that as expression level increased, we would see a similar pattern in amount of 6mA present, which is not the case. b) FPKM levels of unmethylated genes, sorted by expression level.

Supplementary Figure 6 MAC overlaps with various genomic features.

a) Percent of gene models containing MACs and proximity to their transcriptional start sites. While some MACs directly overlap with the TSS, many are located slightly downstream. b) Fixed window overlaps of MACs with microRNAs. c) Fixed window overlaps of MACs with tRNAs.

Supplementary Figure 7 6mA presence or absence is related to gene function.

a) Methylation presence/absence at all genes containing common Pfam (ref. 17) domains (present in at least 8 genes) and their deviation from expected. Pfams showing significant (p ≤ 0.05) departures from the expected were identified using Fisher’s exact test followed by FDR correction (significant = red, non-significant = blue). Overall percentages of significant Pfams for each genome are shown in parentheses next to lineage names. b) log2 fold change in methylation presence/absence at genes containing common Pfam (ref. 17) domains across all lineages (present in at least 8 genes across all genomes, significant in at least one lineage). Lineages showing significant departure from the expected are denoted with a * (adjusted p-value ≤ 0.05), or ** (adjusted p-value ≤ 0.01). Green = enriched in unmethylated gene set, purple = enriched in methylated gene set. Constitutively expressed housekeeping proteins, such as mitochondrial Rho proteins (blue arrow), are very frequently methylated, while some genes, such as leucine-rich-repeat-containing proteins (orange arrow), show variability across lineages.
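The statistical procedure named in this caption (Fisher's exact test per Pfam followed by FDR correction) can be sketched in pure Python. Several choices here are assumptions, not details from the source: the 2×2 table layout (genes with or without the domain versus methylated or unmethylated), the two-sided alternative, the Benjamini-Hochberg procedure as the FDR method, and all counts and p-values in the example.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed layout: rows = genes with / without the Pfam domain,
    columns = methylated / unmethylated genes.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as extreme (no more probable) as the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: one small table and a set of made-up raw p-values.
p_single = fisher_exact_two_sided(3, 1, 1, 3)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [q <= 0.05 for q in adjusted]
```

In practice a library routine such as a statistics package's Fisher test would normally replace the hand-written version; the standard-library form is used here only to keep the sketch self-contained.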

Supplementary Figure 8 6mA and 5mC enrichment by region.

Overall percent cytosines methylated per genome (a), context (b) and distribution of both epigenomic marks, 5mC and 6mA, across the genome (c and d, respectively).

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–8, Supplementary Tables 1 and 2, and Supplementary Notes 1 and 2. (PDF 2294 kb)

Supplementary Dataset 1

Pfam presence across fungi, putative methyltransferases and lineages surveyed. Worksheet 1) Spreadsheet of all pfams showing significant differences (adjusted pval ≤ 0.01) in presence/absence across early-diverging fungi vs Dikarya. EDF = early-diverging fungi. Worksheet 2) Spreadsheet of methyltransferases showing significant differences (adjusted pval ≤ 0.01) in presence/absence across early-diverging fungi vs Dikarya. EDF = early-diverging fungi. Worksheet 3) Spreadsheet of all lineages included in gene conservation and pfam analyses. (XLSX 389 kb)

Supplementary Dataset 2

Results of Fisher's exact test per lineage. Results of Fisher's exact test examining methylation presence/absence at all common pfams (present in at least 8 genes) for each genome. Results for each lineage are shown on separate worksheets. (XLSX 133 kb)

Cite this article

Mondo, S., Dannebaum, R., Kuo, R. et al. Widespread adenine N6-methylation of active genes in fungi. Nat Genet 49, 964–968 (2017). https://doi.org/10.1038/ng.3859

  • DOI: https://doi.org/10.1038/ng.3859
