Near telomere-to-telomere genome of the model plant Physcomitrium patens

Bi, Guiqi; Zhao, Shijun; Yao, Jiawei; Wang, Huan; Zhao, Mengkai; Sun, Yuanyuan; Hou, Xueren; Haas, Fabian B.; Varshney, Deepti; Prigge, Michael; Rensing, Stefan A.; Jiao, Yuling; Ma, Yingxin; Yan, Jianbin; Dai, Junbiao

doi:10.1038/s41477-023-01614-7

Resource
Published: 26 January 2024

Nature Plants volume 10, pages 327–343 (2024)

Abstract

The model plant Physcomitrium patens has played a pivotal role in advancing our understanding of plant evolution and development. However, the current genome still contains numerous unfinished and erroneous regions. To address these issues, we generated an assembly using Oxford Nanopore reads and Hi-C mapping. The assembly incorporates telomeric and centromeric regions, making it a near telomere-to-telomere genome; the exception is a region of chromosome 1 that is not fully assembled because it is highly repetitive. This near telomere-to-telomere genome resolves the chromosome number at 26 and provides a gap-free genome assembly as well as updated gene models to aid future studies using this model organism.

**Fig. 1: T2T assembly strategy for the *P. patens* V6 genome.**

**Fig. 2: Chromosomal features of the V6 genome.**

**Fig. 3: Structural conflicts between two distinct versions of particular chromosomes.**

**Fig. 4: Characterization of centromeres.**

**Fig. 5: 3D genome attributes of the protonema and gametophore.**

Data availability

The genome assembly and de novo annotations have been deposited in Figshare at https://doi.org/10.6084/m9.figshare.22975925.v2. The Illumina reads (genomic sequencing and Hi-C) and Nanopore reads generated in this study were deposited into the NCBI SRA with the BioProject ID PRJNA742485. The V5 genome assembly was submitted to NCBI with the WGS accession ABEU00000000 under BioProject ID PRJNA13064. The updated gene model lookup table and merged annotation are available for download from https://peatmoss.plantcode.cup.uni-freiburg.de/ppatens_db/downloads.php. The ChIP-seq data generated in this study are available through NGDC (https://ngdc.cncb.ac.cn) with accession PRJCA016808. The V6 genome and annotation data have been submitted to Phytozome (https://phytozome-next.jgi.doe.gov/) and will be made available in their upcoming release.

Acknowledgements

We thank C. Chen from Huazhong Agricultural University and C. Yu from the Agricultural Genomics Institute at Shenzhen for their advice on chromosome analysis. We also thank H. Chen at Tsinghua University for providing assistance with P. patens genetics. This work was supported by grants from the National Key Research and Development Program of China (no. 2019YFA0906200 to J. Yan), the National Natural Science Foundation of China (nos. 31725002, 32150025 and 32030004 to J.D.), the Bureau of International Cooperation, Chinese Academy of Sciences (no. 172644KYSB20180022 to J.D.), the Shenzhen Science and Technology Program (no. KQTD20180413181837372 to J.D.), the Science Technology and Innovation Commission of Shenzhen Municipality of China (no. ZDSYS20200811142605017 to J. Yan) and the Shenzhen Outstanding Talents Training Fund to J.D. J. Yan acknowledges funding from the Innovation Program of the Chinese Academy of Agricultural Sciences and the Elite Young Scientists Program of CAAS. The gene annotation was carried out in the framework of MAdLand (http://madland.science, DFG priority program 2237). S.A.R. is grateful for funding from the DFG (RE 1697/15–1, 20–1).

Author information

These authors contributed equally: Guiqi Bi, Shijun Zhao, Jiawei Yao, Huan Wang.

Authors and Affiliations

Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
Guiqi Bi, Jiawei Yao, Huan Wang, Jianbin Yan & Junbiao Dai
CAS Key Laboratory of Quantitative Engineering Biology, Guangdong Provincial Key Laboratory of Synthetic Genomics and Shenzhen Key Laboratory of Synthetic Genomics, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Shijun Zhao, Yingxin Ma & Junbiao Dai
University of Chinese Academy of Sciences, Beijing, China
Shijun Zhao, Yuanyuan Sun, Xueren Hou & Junbiao Dai
Guangdong Province Key Laboratory of Microbial Signals and Disease Control, Integrative Microbiology Research Centre, South China Agricultural University, Guangzhou, China
Huan Wang
College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China
Mengkai Zhao
State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Innovative Academy of Seed Design, Chinese Academy of Sciences, Beijing, China
Yuanyuan Sun, Xueren Hou & Yuling Jiao
Department of Algal Development and Evolution, Max Planck Institute for Biology Tübingen, Tübingen, Germany
Fabian B. Haas
Faculty of Chemistry and Pharmacy, University of Freiburg, Freiburg, Germany
Deepti Varshney & Stefan A. Rensing
Department of Cell and Developmental Biology, University of California, San Diego, La Jolla, CA, USA
Michael Prigge
School of Life Sciences, Center for Quantitative Biology, and Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
Yuling Jiao

Contributions

J. Yan, Y.M. and J.D. conceived the study. J. Yan and J.D. managed the major scientific objectives. G.B. designed the T2T genome assembly, evaluation and data analysis. J. Yao generated the plant materials. H.W. helped with the assembly and annotation. S.Z., M.Z., Y.S. and X.H. collected the sequenced samples. J. Yao, Y.J., Y.M. and J.D. designed and performed ChIP. F.B.H., D.V., M.P. and S.A.R. combined and quality-checked the gene annotations. J. Yan and G.B. led the article preparation, together with J. Yao, H.W. and J.D. All authors read and approved the final article.

Corresponding authors

Correspondence to Jianbin Yan or Junbiao Dai.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Plants thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 The process of graph-based gap-filling for the remaining 16 gaps.

a, Fifteen chromosome karyotype maps show the remaining 16 gaps, each marked with its number. The centromere is positioned at the chromosome constriction. b, Sequences upstream and downstream of the breakpoints (or gaps), aligned with the corresponding graphs. The Hi-C heatmaps accurately reveal the gaps (indicated by the intersection of green lines) at a resolution of 1 kb. The upstream sequence (blue band) and downstream sequence (green band) of each gap are aligned with the graphs. On the corresponding path, the precise locations of the 14 intervals where gaps occur on the graph are highlighted (blue and green bands). The gray region between the two bands represents the sequence that needs to be filled. Overlap between the two bands indicates sequence redundancy, which requires trimming and merging. Gap number 11 is caused by tandem repeats, so the corresponding copy number must be estimated before the gap is repaired. c, Two gaps occur in a complex region. The upstream and downstream intervals of the two gaps are labeled in four distinct colors. The region consists of short repetitive sequences; to determine the precise paths for gap-filling, nanopore reads were mapped onto this region, and long reads that traverse the repetitive structure were extracted to build the path.
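The read-extraction step described for panel c can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Alignment` record, the `spanning_reads` helper, the 1-kb `flank` requirement and the coordinates are all hypothetical, chosen only to show what it means for a long read to traverse a repetitive interval.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical minimal alignment record (not a real library type)."""
    read_id: str
    ref_start: int  # 0-based, inclusive
    ref_end: int    # 0-based, exclusive


def spanning_reads(alignments, region_start, region_end, flank=1_000):
    """Return IDs of reads that cover the whole region plus `flank` bp on each side.

    The flank requirement (an assumption of this sketch) anchors each selected
    read in sequence outside the repeat on both ends.
    """
    need_start = max(0, region_start - flank)
    need_end = region_end + flank
    return [
        a.read_id
        for a in alignments
        if a.ref_start <= need_start and a.ref_end >= need_end
    ]


# Toy alignments around a repetitive interval at 10,000-25,000 (made-up numbers).
alignments = [
    Alignment("read_a", 0, 50_000),       # spans the repeat and both flanks
    Alignment("read_b", 12_000, 30_000),  # starts inside the repeat
    Alignment("read_c", 8_500, 26_500),   # spans with just enough flank
    Alignment("read_d", 9_500, 40_000),   # left flank too short
]
selected = spanning_reads(alignments, region_start=10_000, region_end=25_000)
print(selected)
```

In practice the alignment intervals would come from a long-read aligner's output; only reads such as `read_a` and `read_c` here would be usable for building a path through the repeat.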

Extended Data Fig. 2 Genome assembly validation achieved by analyzing sequencing coverage and depth in relation to the 26 chromosomes in P. patens.

The coverage (0–100%) and depth of Illumina and ONT sequencing reads on the 26 chromosomes are shown in the left and right images, respectively. Statistics were calculated in nonoverlapping windows of 50 kbp. The ONT reads covered all chromosomal regions except the repetitive region adjacent to the centromere of Chr01, which was deliberately omitted from the secondary mapping results. The sequencing depth of the multicopy rRNA region was markedly higher than the typical chromosome sequencing depth of 66x, which is consistent with the presence of several rRNA copies.
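A windowed summary of this kind can be sketched as follows. The sketch assumes a per-base depth list is already available for one chromosome (for example, parsed from a depth table), defines coverage as the percentage of bases with depth above zero, and uses the hypothetical function name `window_stats`; none of these details are stated in the caption.

```python
def window_stats(depths, window=50_000):
    """Summarize per-base depths in nonoverlapping windows.

    Returns one tuple per window:
    (start, end, percent of bases with depth > 0, mean depth).
    The final window may be shorter than `window`.
    """
    stats = []
    for start in range(0, len(depths), window):
        chunk = depths[start:start + window]
        covered = sum(1 for d in chunk if d > 0)
        coverage_pct = 100.0 * covered / len(chunk)
        mean_depth = sum(chunk) / len(chunk)
        stats.append((start, start + len(chunk), coverage_pct, mean_depth))
    return stats


if __name__ == "__main__":
    # Toy example with a window of 5 bases instead of 50 kbp.
    toy = [0, 0, 2, 4, 6, 6, 0, 3, 3, 3]
    for row in window_stats(toy, window=5):
        print(row)
```

Windows whose mean depth is far above the chromosome-wide typical value would be candidates for collapsed multicopy regions such as the rRNA region described above.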

Extended Data Fig. 3 Taxonomy distribution obtained by analyzing the assembled unmapped short reads against the NR database using MEGAN6.

The tree represents the taxonomic classification of the matched sequences at the class level, with node size indicating the number of matched sequences. The word cloud displays the sequence matching results at the phylum level, with larger words indicating a greater number of matched sequences.

Extended Data Fig. 4 A fundamental overview of the main content presented in this work.

a, Distribution of k-mer frequencies (k = 17–23) in the P. patens genome. b, Radar chart comparing the quality of the V6 genome with that of its predecessor, V3, on six indicators. c, A concise diagram of the V6 assembly process. d, Results of SyRI analysis showing genome sequence collinearity and structural variants. To capture the genuine differences between the two genome versions, the V3 sequence was split into contigs at runs of N bases, and RaGOO was then used to build 26 pseudochromosomes for alignment with V6.
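Splitting a scaffold into contigs at runs of N bases, as described for panel d, might look like the sketch below. The function name `split_at_gaps`, the contig naming scheme and the `min_gap` threshold are assumptions made for illustration; the caption does not say which tool or threshold the authors used.

```python
import re


def split_at_gaps(name, sequence, min_gap=1):
    """Split a scaffold into contigs wherever a run of at least
    `min_gap` N (or n) bases occurs. Returns (contig_name, sequence)
    pairs; empty pieces at the scaffold ends are skipped."""
    gap = re.compile(r"[Nn]{%d,}" % max(1, min_gap))
    contigs = []
    start = 0
    for match in gap.finditer(sequence):
        if match.start() > start:
            contigs.append(
                (f"{name}_contig{len(contigs) + 1}", sequence[start:match.start()])
            )
        start = match.end()
    # Keep whatever follows the last gap.
    if start < len(sequence):
        contigs.append((f"{name}_contig{len(contigs) + 1}", sequence[start:]))
    return contigs


if __name__ == "__main__":
    for contig_name, contig_seq in split_at_gaps("chr", "ACGTNNNNGGCCNAT"):
        print(contig_name, contig_seq)
```

The resulting contigs could then be written to FASTA and passed to a reference-guided scaffolder.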

Extended Data Fig. 5 The neighbor-joining cladogram of five P. patens accessions built from SNPs derived from Haas et al. (ref. 31).

The material sequenced in this study is marked on the tree by an arrow. Bootstrap values from 100 replicates are shown at the nodes.

Extended Data Fig. 6 Assembly accuracy validation for Chr25 in V6 by read mapping.

The figure compares the collinearity of Chr25 between the V3 and V6 versions. Top, the two V3 pseudochromosomes joined together, with their boundary marked by a solid black line. The position of the breakpoint in V6 is indicated by a dashed black line. Middle, mapping results of nanopore reads (above 10 kbp). Bottom, enlarged view of the 5 kbp interval encompassing the breakpoint.

Extended Data Fig. 7 Whole genome-wide Hi-C heatmap.

Hi-C interactions among 26 chromosomes at a 500 kbp resolution.

Supplementary information

Supplementary Information

Supplementary Figs. 1–30 and Note.

Reporting Summary

Supplementary Data 1

Supplementary Tables 1–23.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bi, G., Zhao, S., Yao, J. et al. Near telomere-to-telomere genome of the model plant Physcomitrium patens. Nat. Plants 10, 327–343 (2024). https://doi.org/10.1038/s41477-023-01614-7

Received: 03 September 2021
Accepted: 19 December 2023
Published: 26 January 2024
Issue Date: February 2024
DOI: https://doi.org/10.1038/s41477-023-01614-7

This article is cited by

Near telomere-to-telomere genome assemblies of two Chlorella species unveil the composition and evolution of centromeres in green algae
- Bo Wang
- Yanyan Jia
- Kai Ye
BMC Genomics (2024)
