Abstract
Cotoneaster glaucophyllus is a semi-evergreen plant that blossoms in late summer, producing dense, attractive, fragrant white flowers with significant ornamental and ecological value. Here, a chromosome-scale genome assembly was obtained by integrating PacBio and Illumina sequencing data with the aid of Hi-C technology. The genome assembly was 563.3 Mb in length, with contig N50 and scaffold N50 values of ~6 Mb and ~31 Mb, respectively. Most (95.59%) of the sequences were anchored onto 17 pseudochromosomes (538.4 Mb). We predicted 35,856 protein-coding genes, 1,401 miRNAs, 655 tRNAs, 425 rRNAs, and 795 snRNAs. The functions of 34,967 genes (97.52%) were predicted. The availability of this chromosome-level genome will provide valuable resources for molecular studies of this species, facilitating future research on speciation, functional genomics, and comparative genomics within the Rosaceae family.
Background & Summary
Species of the genus Cotoneaster Medic. belong to the Malinae subtribe of the Rosaceae family1, and are primarily distributed in continental Eurasia, with a remarkable species diversity in the biodiversity hotspots of the Himalayas and the Hengduan Mountains (HDM)2. Taxonomic difficulties for this genus have been caused by various evolutionary events, including hybridization, polyploidization, and apomixis. A comprehensive phylogenetic analysis of this genus has been conducted using genome-skimming data, but with the genome of Eriobotrya japonica serving as the mapping reference3, which might introduce mapping errors, incorrect alignments, difficulties in identifying orthologous genes, and genome annotation issues.
Based on morphological characteristics and molecular evidence, two subgenera or sections have been proposed: Cotoneaster, characterized by predominantly red or pink flowers with erect petals, and Chaenopetalum, noted for its primarily white flowers with spreading petals2,3,4,5. Notably, only approximately 10% of Cotoneaster species are diploid2. Cotoneaster glaucophyllus, a diploid species and a representative member of the Chaenopetalum subgenus, has a distinct distribution in the southeastern Hengduan Mountains and on the Yunnan-Guizhou Plateau. It is a semi-evergreen shrub that blossoms in late summer, exhibiting dense, showy, fragrant white flowers, and bears long-lasting fruits in early winter, potentially making it an important ornamental plant2,6,7. With continuous advancements in sequencing technology, abundant genome resources for numerous Rosaceae species have been documented8,9,10,11,12. However, the lack of whole-genome sequencing in Cotoneaster species has been a significant obstacle to further understanding the gene functions, evolutionary history, and conservation of this complicated genus (up to 370 species).
Using the Pacific Biosciences (PacBio) platform, we generated ~117 Gb of DNA continuous long reads (CLRs) and obtained ~48 Gb of full-length transcriptome sequences. Additionally, we sequenced ~104 Gb of DNA reads and ~10 Gb of RNA reads (2 × 150 bp) as well as ~62 Gb of high-throughput chromosome conformation capture (Hi-C) reads based on the Illumina HiSeq platform. With the aid of Hi-C technologies, we finally provided a high-quality genome sequence for the diploid species (2n = 2x = 34) of C. glaucophyllus (Fig. 1).
Methods
Sample collections
Fresh leaves, fruits and roots were collected from an adult plant of C. glaucophyllus (Xiajinchang, Malipo County, Yunnan Province, China; 23°08′26.57″N, 104°48′34.54″E; 1959 m a.s.l.; Fan17545, SYS!). The samples were separately wrapped in foil paper on 28 September 2019 (Fig. 1a,d), immediately frozen in liquid nitrogen, preserved on dry ice, and sent to Novogene Bioinformatics Technology Co., Ltd (Beijing, China). On 15 June 2020, we collected flower tissue from the same plant (specimen: Fan17951, SYS!) (Fig. 1b,c).
DNA and RNA extraction and genome sequencing
Total DNA was extracted from fresh leaves using the Plant Genomic DNA Kit (DP305, Tiangen Biotech Co., Ltd., Beijing, China). The qualified DNA was used to construct libraries for single-molecule real-time (SMRT) sequencing on the Pacific Biosciences system (Menlo Park, CA, USA), Illumina sequencing, and Hi-C sequencing. The 20 kb library was prepared following the manufacturer’s protocol13. For the Illumina DNA paired-end library, the NEBNext® Ultra™ DNA Library Prep Kit was used according to the provided instructions, with an insert size of 350 bp. The Hi-C library was prepared following standard procedures14.
Samples including fresh leaves, flowers, fruits, roots, and stems were pooled for total RNA extraction using the TIANGEN RNAPrep Pure Plant kit (DP432, Tiangen Biotech Co. Ltd., Beijing, China). Subsequently, the qualified RNAs were utilized for synthesizing full-length cDNAs with the SMRTer PCR cDNA Synthesis Kit (Biomarker, Beijing). Full-length transcriptome sequencing was performed on the PacBio Sequel platform. Additionally, short RNA-Seq reads (2 × 150 bp) specifically from leaf samples were generated and processed15 to facilitate the correction of the long-read RNA sequencing data and genome annotation.
PacBio long-read sequencing was performed using the PacBio Sequel system, while high throughput sequencing (2 × 150 bp) was carried out using an Illumina HiSeq sequencer. Both sequencing processes were conducted at Novogene Bioinformatics Technology Co., Ltd. (Beijing, China).
Pre-estimation of genomic characteristics
The generated Illumina sequencing data were first processed using the NGSQC Toolkit v2.3.316. This step discarded reads with adaptor contamination, reads with more than 10% unknown nucleotides (N), and paired reads in which either read contained over 20% bases with a quality score of less than 5. Then, we performed a genome survey using Jellyfish v.2.2.717 with the default setting of k-mer = 17 (Fig. 2). Based on a k-mer-based statistical approach, GenomeScope v.2.018 was used to estimate genome heterozygosity, repeat content, and size. To initially assess the genomic complexity, we employed SOAPdenovo v.2.0.419 to generate a de novo draft assembly using a k-mer length of 41. The assembled contigs were then used to calculate the guanine-cytosine (GC) content. The estimated genome size was 625.87 Mb, with a heterozygosity rate of 0.55% and a repeat sequence proportion of 54.97%. The estimated GC content was 38.65%.
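The read-level filters described above can be illustrated with a minimal Python sketch. This is not the NGSQC Toolkit implementation: the function names, the Phred+33 quality encoding, and the rule that a pair is dropped when either mate fails either test are assumptions made for illustration, and adaptor screening is omitted.

```python
from typing import Sequence, Tuple

Read = Tuple[str, Sequence[int]]  # (bases, per-base Phred quality scores)


def phred33(qual_string: str) -> list:
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(char) - 33 for char in qual_string]


def passes_filter(seq: str, quals: Sequence[int],
                  max_n_frac: float = 0.10,
                  low_q: int = 5,
                  max_low_q_frac: float = 0.20) -> bool:
    """Return True if one read passes the N-content and low-quality tests.

    Thresholds follow the text: discard reads with more than 10% N or
    more than 20% of bases with quality below 5.
    """
    if not seq or len(seq) != len(quals):
        return False
    n_frac = seq.upper().count("N") / len(seq)
    low_frac = sum(q < low_q for q in quals) / len(quals)
    return n_frac <= max_n_frac and low_frac <= max_low_q_frac


def keep_pair(read1: Read, read2: Read) -> bool:
    """Keep a read pair only when both mates pass (assumed pairing rule)."""
    return passes_filter(*read1) and passes_filter(*read2)


if __name__ == "__main__":
    clean = ("ACGT" * 5, [30] * 20)
    noisy = ("A" * 20, [2] * 5 + [30] * 15)  # 25% of bases below Q5
    print(keep_pair(clean, clean), keep_pair(clean, noisy))
```

Treating the thresholds as keyword arguments keeps the sketch easy to adapt if a different cut-off is preferred.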
Genome assembly and quality assessment
The FALCON assembler20 was initially employed to perform self-correction of PacBio subreads. Subsequently, preassembled reads were assembled using the overlap-layout-consensus (OLC) algorithm, resulting in consensus contigs. To enhance the accuracy of the results, high-quality contigs were further corrected using Illumina short DNA reads through Pilon21. Leveraging the clean Hi-C data, the LACHESIS tool22 was utilized to scaffold the assembly, ultimately yielding a chromosome-level assembly. The de novo genome assembly was 563.3 Mb in length, with a contig N50 of ~6 Mb and a scaffold N50 of ~31 Mb (Table 1).
Among the 211 contigs, 124 were anchored to 17 pseudochromosomes (538.4 Mb, 95.59%) (Fig. 3, Table 2) and the remaining 87 were unanchored (24.9 Mb, 4.41%) (Table 2, Table S1). The GC content of these pseudochromosomes ranged from 37.90% to 39.13% (Table 2).
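For readers unfamiliar with the N50 statistic reported above, the following Python sketch shows the standard definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name and the toy lengths are illustrative and are not taken from the assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that brings the cumulative sum
        # to at least half of the total assembly length.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    toy_contigs = [80, 70, 50, 40, 30, 20, 10]  # arbitrary example lengths
    print(n50(toy_contigs))
```

The same function applies to contigs and scaffolds alike; only the input list of lengths differs.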
To comprehensively evaluate the reliability of the assembly, multiple assessments were performed in addition to considering the contig/scaffold N50 length. First, the completeness of the assembly was assessed by searching the assembled genome against the BUSCO (Benchmarking Universal Single-Copy Orthologs) database v2.023 (BUSCO, RRID:SCR_015008) and with CEGMA v2.524 (Core Eukaryotic Genes Mapping Approach, RRID:SCR_015055). The BUSCO database contains 1,440 conserved core genes in terrestrial plants, while CEGMA includes a subset of the 248 most highly conserved Core Eukaryotic Genes (CEGs). Second, the consistency between the assembly and paired-end Illumina short reads was evaluated by calculating the mapping and coverage rates. The Burrows‒Wheeler Aligner (BWA) v0.7.1525 was used to align the 150 bp short reads to the assembly. Third, assembly accuracy was assessed by SNP calling using SAMtools v1.926 and BCFtools v1.9 (https://github.com/samtools/bcftools) based on the above mapping results. The rates of homozygous and heterozygous single-nucleotide polymorphisms (SNPs) were also determined.
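As an illustration of the third assessment, the sketch below counts homozygous and heterozygous SNP genotypes from VCF-formatted lines, such as those written by BCFtools. It is a simplified stand-in, not the authors' procedure: it reads only the first sample column, ignores any quality filtering, and the function names are hypothetical. The denominator used to turn counts into rates is not stated in the text, so it is left as a parameter.

```python
import re
from typing import Iterable, Tuple

BASES = {"A", "C", "G", "T"}


def count_snp_genotypes(vcf_lines: Iterable[str]) -> Tuple[int, int]:
    """Return (homozygous_alt, heterozygous) SNP counts for the first sample."""
    hom = het = 0
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns present
        ref, alts = fields[3].upper(), fields[4].upper().split(",")
        # Keep single-base substitutions only; indels are skipped.
        if ref not in BASES or any(alt not in BASES for alt in alts):
            continue
        keys = fields[8].split(":")
        if "GT" not in keys:
            continue
        genotype = fields[9].split(":")[keys.index("GT")]
        alleles = re.split(r"[/|]", genotype)
        if len(alleles) != 2 or "." in alleles:
            continue  # missing or non-diploid call
        if alleles[0] != alleles[1]:
            het += 1
        elif alleles[0] != "0":
            hom += 1
    return hom, het


def rate_percent(count: int, denominator_bp: int) -> float:
    """Express a count as a percentage of a chosen number of bases."""
    return 100.0 * count / denominator_bp
```

In a high-quality assembly built from the same individual, homozygous SNPs are expected to be rare because they mostly indicate base errors in the consensus, whereas heterozygous SNPs reflect genuine allelic variation.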
Genome annotation
We applied a combined strategy that used both de novo search and homology alignment to identify repeats. A de novo repetitive element database was generated using LTR_FINDER v.1.0.627, RepeatScout v.1.0.528, Piler-DF v2.429, and RepeatModeler v.2.0.130 with the default parameters. The raw transposable element (TE) library included all repeat sequences that were longer than 100 bp and had less than 5% “N” gaps. To obtain a nonredundant library, Repbase31 and the raw TE library were combined and processed using uclust. Finally, RepeatMasker v.4.1.032 was employed for repeat identification using the nonredundant library. The homology-based approach used RepeatMasker v.4.1.032 and the Repbase31 library to identify known TEs. TEs were also identified at the protein level by searching the genome sequences against a TE protein database using RepeatProteinMask v.4.1.032. Tandem repeats were predicted using Tandem Repeats Finder v.4.0933. In the genome assembly, 55.60% of the sequence was identified as repeats, of which 4.19% were tandem repeats and 50.33% were long terminal repeat retrotransposons (LTR-RTs) (Table 3).
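The inclusion rule for the raw TE library (longer than 100 bp, less than 5% N) can be expressed as a short filter. The Python sketch below is illustrative only: the FASTA parser and function names are hypothetical helpers, and the redundancy-removal step performed with uclust is not reproduced.

```python
from typing import Dict, Iterator, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def keep_repeat(seq: str, min_len: int = 100, max_n_frac: float = 0.05) -> bool:
    """Apply the stated rule: length > 100 bp and N content < 5%."""
    if len(seq) <= min_len:
        return False
    return seq.upper().count("N") / len(seq) < max_n_frac


def filter_library(fasta_text: str) -> Dict[str, str]:
    """Return the repeat sequences that satisfy the inclusion rule."""
    return {name: seq for name, seq in parse_fasta(fasta_text) if keep_repeat(seq)}
```

Filtering before clustering keeps very short or gap-rich consensus sequences from seeding spurious repeat families in the nonredundant library.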
Multiple approaches, including ab initio prediction, homology-based prediction, and full-length transcript evidence, were employed to annotate gene models. For ab initio gene prediction, GeneWise v.2.4.134, Augustus v3.2.335, Geneid v1.436, Genscan v3.137, GlimmerHMM v3.0438, and SNAP39 were used. Homologous protein sequences of Malus x domestica40, Fragaria vesca41, Rosa chinensis42, Prunus persica43, Pyrus betuleafolia44, and Eriobotrya japonica12 were downloaded from NCBI (https://www.ncbi.nlm.nih.gov/genome/) and aligned to the assembly using tBLASTn v2.2.2645 (E-value ≤ 1e-5). The matching proteins were then aligned to the homologous genome sequences with GeneWise v2.4.134 to obtain accurate spliced alignments. The IsoSeq pipeline (https://github.com/PacificBiosciences/IsoSeq) was employed to process the full-length transcriptome sequencing data. The generated reads were aligned to the C. glaucophyllus assembly using HISAT v.2.0.446 with the default parameters, and the alignments were further processed by StringTie v.1.3.347. The nonredundant reference gene set was created by merging the genes predicted as described above with EVidenceModeler v1.1.148, using PASA49 (Program to Assemble Spliced Alignments) terminal exon support and including masked transposable elements as an input for gene prediction. Furthermore, gene structure and gene elements, including average transcript length, average CDS length, and average exon and intron length, were compared between Cotoneaster glaucophyllus and the above six related species.
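The gene-element comparison relies on simple summary statistics. As a minimal sketch, average exon and intron lengths can be derived from exon coordinates as follows, assuming 1-based inclusive coordinates (as in GFF files) and one list of exons per transcript. The data structure and function names are illustrative and are not taken from the annotation pipeline.

```python
from typing import Dict, List, Tuple

Exon = Tuple[int, int]  # (start, end), 1-based inclusive


def exon_intron_lengths(exons: List[Exon]) -> Tuple[List[int], List[int]]:
    """Return exon lengths and the lengths of the introns between them."""
    ordered = sorted(exons)
    exon_lens = [end - start + 1 for start, end in ordered]
    # An intron spans the bases strictly between two consecutive exons.
    intron_lens = [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]
    return exon_lens, intron_lens


def mean_lengths(transcripts: Dict[str, List[Exon]]) -> Tuple[float, float]:
    """Return (mean exon length, mean intron length) over all transcripts."""
    all_exons: List[int] = []
    all_introns: List[int] = []
    for exons in transcripts.values():
        exon_lens, intron_lens = exon_intron_lengths(exons)
        all_exons.extend(exon_lens)
        all_introns.extend(intron_lens)

    def mean(values: List[int]) -> float:
        return sum(values) / len(values) if values else 0.0

    return mean(all_exons), mean(all_introns)


if __name__ == "__main__":
    toy = {"t1": [(1, 100), (201, 300), (401, 450)], "t2": [(1000, 1199)]}
    print(mean_lengths(toy))
```

Single-exon transcripts contribute to the exon average but not to the intron average, which is worth keeping in mind when comparing species with different proportions of intronless genes.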
The tRNAs were predicted using the tRNAscan-SE50 program (http://lowelab.ucsc.edu/tRNAscan-SE/). As rRNAs are highly conserved, we selected reference rRNA sequences from closely related species and used BLAST to predict rRNA sequences. Additionally, other ncRNAs, such as miRNAs and snRNAs, were identified by searching against the Rfam51 database using Infernal v1.134 with the default parameters. We annotated 35,856 coding genes (Table 4) and 3,276 noncoding genes, including 1,401 miRNAs, 655 tRNAs, 425 rRNAs, and 795 snRNAs (Table 5).
Gene functions were assigned by aligning the protein sequences to Swiss-Prot52 using Blastp53, with a threshold of E-value ≤ 1e−5, and the best match was retained. Motifs and domains were annotated using InterProScan v5.3154, which involved searching against publicly available databases, including ProDom55, PRINTS56, Pfam57, SMART58, PANTHER59, and PROSITE60. Gene Ontology (GO) IDs were assigned to each gene based on the corresponding InterPro entry. Protein function predictions were made by transferring annotations from the closest BLAST hit (E-value ≤ 1e−5) in the Swiss-Prot database52 and the closest DIAMOND v0.8.2261 hit (E-value ≤ 1e−5) in the NR database. Additionally, we mapped the gene set to KEGG pathways and identified the best match for each gene. The functions of 34,967 genes (97.52%) were predicted (Table 6). Comparative analysis of gene elements among related Rosaceae species revealed that the genome assembly of Cotoneaster glaucophyllus exhibits a shorter average exon length (229.78 bp) and a longer average intron length (508.51 bp) than those of the other species considered (Fig. 4, Table 7).
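The "best match under an E-value threshold" rule used for functional assignment can be sketched in Python as below. The sketch assumes hits in the common 12-column tabular layout (query, subject, identity, alignment length, mismatches, gap opens, query start and end, subject start and end, E-value, bit score); whether the authors exported their results in this layout is not stated, and the tie-breaking by bit score is an assumption.

```python
from typing import Dict, Iterable


def best_hits(lines: Iterable[str], max_evalue: float = 1e-5) -> Dict[str, str]:
    """Map each query to its best subject among hits with E-value <= threshold.

    The best hit has the lowest E-value; ties are broken by the highest
    bit score (an assumed convention).
    """
    best = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 12:
            continue  # not a complete 12-column record
        query, subject = cols[0], cols[1]
        evalue, bitscore = float(cols[10]), float(cols[11])
        if evalue > max_evalue:
            continue
        key = (evalue, -bitscore)
        if query not in best or key < best[query][0]:
            best[query] = (key, subject)
    return {query: subject for query, (_, subject) in best.items()}


def annotated_percent(n_annotated: int, n_genes: int) -> float:
    """Percentage of genes that received at least one functional annotation."""
    return 100.0 * n_annotated / n_genes


if __name__ == "__main__":
    # Gene counts reported in the text: 34,967 annotated of 35,856 predicted.
    print(round(annotated_percent(34967, 35856), 2))
```

Genes with no hit passing the threshold are simply absent from the returned mapping, which is how unannotated genes would be identified in this scheme.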
Data Records
The raw data of Hi-C short reads, Illumina DNA short reads, PacBio DNA long reads, RNA short reads, and PacBio RNA long reads have been deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive database with accession numbers SRR2593387962, SRR2593387863, SRR2593387764, SRR2593387665, and SRR2593387566 under BioProject accession number PRJNA1012579. The genome assembly has been deposited at GenBank under the WGS accession JAVVNS00000000067. Additionally, the genome assembly, predicted transcripts and protein sequences, functional annotation files (gff files), and NR and KEGG annotation files have been deposited in Figshare68.
Technical Validation
Multiple parameters were employed to assess the quality of the genome assembly. The BUSCO evaluation indicated that among the 1,440 BUSCO genes, 62.9% (906) were identified as complete and single-copy, while 30.3% (436) were complete but duplicated. Additionally, 1.1% (16) were fragmented, and 5.7% (82) were missing. Analysis of the 248 most highly conserved Core Eukaryotic Genes (CEGs) revealed the presence of 238 complete genes (95.97%) and 6 incomplete genes (2.42%). The evaluation of the consistency between the assembly and paired-end DNA short reads indicated that the overall mapping and coverage rates were 94.61% and 99.99%, respectively. The rates of homozygous and heterozygous single-nucleotide polymorphisms (SNPs) were 0.001413% (798) and 0.288695% (163,081), respectively. Furthermore, we mapped the DNA continuous long reads (CLRs) to the genome using minimap269, and calculated the sequencing depth and coverage for each pseudochromosome (Table 2). These results collectively demonstrate a genome assembly of high quality, completeness, and accuracy.
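The completeness percentages above follow directly from the reported counts, as the short Python check below shows. The dictionary keys are illustrative labels; the counts are those given in the text, and the combined figure for complete BUSCO genes is derived here rather than quoted from the article.

```python
def percentages(counts: dict, ndigits: int = 1) -> dict:
    """Convert category counts to percentages of their total."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, ndigits) for name, n in counts.items()}


# BUSCO categories and counts as reported in the text (total 1,440).
busco_counts = {
    "complete_single_copy": 906,
    "complete_duplicated": 436,
    "fragmented": 16,
    "missing": 82,
}

busco_pct = percentages(busco_counts)

# Derived value: all complete BUSCO genes, single-copy plus duplicated.
complete_total_pct = round(
    100.0 * (906 + 436) / sum(busco_counts.values()), 1
)

# CEGMA: 238 complete and 6 incomplete genes out of 248 CEGs.
cegma_complete_pct = round(100.0 * 238 / 248, 2)
cegma_partial_pct = round(100.0 * 6 / 248, 2)

if __name__ == "__main__":
    print(busco_pct, complete_total_pct, cegma_complete_pct, cegma_partial_pct)
```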
Code availability
All software and pipelines were executed in strict accordance with the manuals and protocols provided by the published bioinformatic tools. No custom programming or coding was used.
References
The Angiosperm Phylogeny Group. An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG IV. Botanical Journal of the Linnean Society 181, 1–20 (2016).
Fryer, J. & Hylmö, B. Cotoneasters: A Comprehensive Guide To Shrubs for Flowers, Fruit, and Foliage. (Timber Press, Portland and London, 2009).
Meng, K. K. et al. Phylogenomic analyses based on genome-skimming data reveal cyto-nuclear discordance in the evolutionary history of Cotoneaster (Rosaceae). Mol Phylogenet Evol 158, 107083 (2021).
Robertson, K. R. et al. A synopsis of genera in Maloideae (Rosaceae). Syst Bot 16, 376–394 (1991).
Li, F. F. et al. Molecular phylogeny of Cotoneaster (Rosaceae) inferred from nuclear ITS and multiple chloroplast sequences. Plant Syst Evol 300, 1533–1546 (2014).
Lu, L. D. et al. Rosaceae. In Wu, Z.Y. and Raven, P.H. (Eds.). Flora of China. Science Press, Beijing, China and Missouri Botanical Garden Press, St. Louis. 9, 46–434 (2003).
Yü, T. T. et al. Rosaceae. In: Yü, T. T. (Ed.), Flora Reipublicae Popularis Sinicae. Science Press, Beijing 36, 107–178 (1974).
Cao, K. et al. Chromosome-level genome assemblies of four wild peach species provide insights into genome evolution and genetic basis of stress resistance. BMC Biol 20, 139 (2022).
Soyturk, A. et al. De novo assembly and characterization of the first draft genome of quince (Cydonia oblonga Mill.). Sci Rep 11, 3818 (2021).
Zhang, J. X. et al. The high-quality genome of diploid strawberry (Fragaria nilgerrensis) provides new insights into anthocyanin accumulation. Plant Biotechnol J 18, 1908–1924 (2020).
Sun, X. P. et al. Phased diploid genome assemblies and pan-genomes provide insights into the genetic history of apple domestication. Nat Genet 52, 1423–1432 (2020).
Jiang, S. et al. Chromosome-level genome assembly and annotation of the loquat (Eriobotrya japonica) genome. Gigascience 9 (2020).
Guidelines for Preparing 20 kb SMRTbell™ Templates, https://www.pacb.com/wp-content/uploads/2015/09/User-Bulletin-Guidelines-for-Preparing-20-kb-SMRTbell-Templates.pdf Accessed on 25 Nov 2020.
Belton, J. M. et al. Hi-C: a comprehensive technique to capture the conformation of genomes. Methods 58, 268–276 (2012).
Meng, K. K. et al. Isolation and identification of EST-SSR markers in Chunia bucklandioides (Hamamelidaceae). Appl Plant Sci 4 (2016).
Patel, R. K. & Jain, M. NGS QC Toolkit: A Toolkit for Quality Control of Next Generation Sequencing Data. PLOS ONE 7, e30619 (2012).
Marcais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770 (2011).
Ranallo-Benavidez, T. R. et al. GenomeScope 2.0 and Smudgeplot for reference-free profiling of polyploid genomes. Nat Commun 11, 1432 (2020).
Luo, R. B. et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. Gigascience 1, 18 (2012).
Chin, C. S. et al. Phased diploid genome assembly with single-molecule real-time sequencing. Nat Methods 13, 1050–1054 (2016).
Walker, B. J. et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS One 9, e112963 (2014).
Sedayao, J. & Akita, K. LACHESIS: A Tool for Benchmarking Internet Service Providers (1995).
Simao, F. A. et al. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
Parra, G. et al. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23, 1061–1067 (2007).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res 35, W265–W268 (2007).
Price, A. L. et al. De novo identification of repeat families in large genomes. Bioinformatics 21(Suppl 1), i351–i358 (2005).
Edgar, R. C. & Myers, E. W. PILER: identification and classification of genomic repeats. Bioinformatics 21(Suppl 1), i152–i158 (2005).
Flynn, J. M. et al. RepeatModeler2 for automated genomic discovery of transposable element families. Proc Natl Acad Sci USA 117, 9451–9457 (2020).
Bao, W. D. et al. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mob DNA 6, 11 (2015).
Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinformatics Chapter 4, 4.10 (2009).
Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res 27, 573–580 (1999).
Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).
Stanke, M. et al. AUGUSTUS: a web server for gene finding in eukaryotes. Nucleic Acids Res 32, W309–W312 (2004).
Alioto, T. et al. Using geneid to Identify Genes. Curr Protoc Bioinformatics 64, e56 (2018).
Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J Mol Biol 268, 78–94 (1997).
Majoros, W. H. et al. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics 20, 2878–2879 (2004).
Bromberg, Y. & Rost, B. SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic Acids Res 35, 3823–3835 (2007).
Zhang, L. Y. et al. A high-quality apple genome assembly reveals the association of a retrotransposon and red fruit colour. Nat Commun 10, 1494 (2019).
Shulaev, V. et al. The genome of woodland strawberry (Fragaria vesca). Nat Genet 43, 109–116 (2011).
Raymond, O. et al. The Rosa genome provides new insights into the domestication of modern roses. Nat Genet 50, 772–777 (2018).
Lian, X. D. et al. De novo chromosome-level genome of a semi-dwarf cultivar of Prunus persica identifies the aquaporin PpTIP2 as responsible for temperature-sensitive semi-dwarf trait and PpB3-1 for flower type and size. Plant Biotechnol J 20, 886–902 (2022).
Dong, X. et al. De novo assembly of a wild pear (Pyrus betuleafolia) genome. Plant Biotechnol J 18, 581–595 (2020).
NCBI. BLASTALL v2.2.26. Bethesda, MD: National Center for Biotechnology Information. (2009).
Kim, D. et al. HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12, 357–360 (2015).
Pertea, M. et al. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nat Protoc 11, 1650–1667 (2016).
Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol 9, R7 (2008).
Haas, B. J. et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res 31, 5654–5666 (2003).
Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25, 955–964 (1997).
Griffiths-Jones, S. et al. Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Res 33, D121–D124 (2005).
Bairoch, A. & Apweiler, R. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res 28, 45–48 (2000).
Gish, W. & States, D. J. Identification of protein coding regions by database similarity search. Nat Genet 3, 266–272 (1993).
Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236–1240 (2014).
Gouzy, J. et al. XDOM, a graphical tool to analyse domain arrangements in any set of protein sequences. Comput Appl Biosci 13, 601–608 (1997).
Attwood, T. K. et al. The PRINTS database: a fine-grained protein sequence annotation and analysis resource–its status in 2012. Database (Oxford) 2012, bas019 (2012).
El-Gebali, S. et al. The Pfam protein families database in 2019. Nucleic Acids Res 47, D427–D432 (2019).
Letunic, I. et al. SMART 4.0: towards genomic data integration. Nucleic Acids Res 32, D142–D144 (2004).
Mi, H. Y. et al. PANTHER in 2013: modeling the evolution of gene function, and other gene attributes, in the context of phylogenetic trees. Nucleic Acids Res 41, D377–D386 (2013).
Sigrist, C. J. A. et al. New and continuing developments at PROSITE. Nucleic Acids Res 41, D344–D347 (2012).
Buchfink, B. et al. Fast and sensitive protein alignment using DIAMOND. Nat Methods 12, 59–60 (2015).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR25933879 (2023).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR25933878 (2023).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR25933877 (2023).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR25933876 (2023).
NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR25933875 (2023).
Meng, K. K. Chromosome-scale genome assembly and annotation of Cotoneaster glaucophyllus, GenBank, https://identifiers.org/ncbi/insdc.gca:GCA_036320875.1 (2024).
Meng, K. K. Chromosome-scale genome assembly and annotation of Cotoneaster glaucophyllus, Figshare, https://doi.org/10.6084/m9.figshare.24100161.v1 (2023).
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
Acknowledgements
This work was financed by the National Natural Science Foundation of China (Grant No. 32370216), the Key Basic Research Program of Yunnan Province (Grant No. 202101BC070003), Key Technologies Research for the Germplasm of Important Woody Flowers in Yunnan Province (Grant No. 202302AE090018), and the National Natural Science Foundation of China (Grant No. 32000267 and 32370225). We thank Yu-Bing Zhou (Jierui Biotech, Guangzhou, China) for helpful discussion and assistance in responding to the reviewers’ comments.
Contributions
Q.F. and Y.M. designed the project, supervised the work, and revised the manuscript. K.M., Q.F. and Y.M. collected the samples. K.M. conducted the experiments, performed the analysis, and wrote the manuscript. S.C., W.L., and S.W. provided assistance with data analysis and manuscript revisions. All the authors have read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Meng, K., Liao, W., Wei, S. et al. Chromosome-scale genome assembly and annotation of Cotoneaster glaucophyllus. Sci Data 11, 406 (2024). https://doi.org/10.1038/s41597-024-03246-8
DOI: https://doi.org/10.1038/s41597-024-03246-8