  • Research Briefing

Machine learning improves genome quality prediction across the microbial tree of life

CheckM2 is a tool that applies machine learning to evaluate the quality of genomes from metagenomic data. CheckM2 is faster and more accurate than existing methods, and it outperforms them when applied to novel lineages and lineages with reduced genome sizes, such as Patescibacteria and the DPANN superphylum.

Fig. 1: Genome quality predictions of CheckM2 compared to CheckM1.

References

  1. Pasolli, E. et al. Extensive unexplored human microbiome diversity revealed by over 150,000 genomes from metagenomes spanning age, geography, and lifestyle. Cell 176, 649–662 (2019). This article provides an example of the scale of MAG recovery from metagenomic data.

  2. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021). This article presents the groundbreaking application of machine learning to address the protein folding problem.

  3. Tang, B. et al. Recent advances of deep learning in bioinformatics and computational biology. Front. Genet. 10, 214 (2019). This review explains the nature of machine learning and how it is relevant to diverse biological problems.

  4. Parks, D. H. et al. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 25, 1043–1055 (2015). This paper presents CheckM1 — the basis for designing CheckM2 and one of the most popular tools used to assess genome quality.

  5. Simão, F. A. et al. BUSCO: Assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015). This article describes BUSCO, a highly popular alternative tool used to assess genome quality.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This is a summary of: Chklovski, A., Parks, D. H., Woodcroft, B. J. & Tyson, G. W. CheckM2: a rapid, scalable and accurate tool for assessing microbial genome quality using machine learning. Nat. Methods https://doi.org/10.1038/s41592-023-01940-w (2023).

About this article

Cite this article

Machine learning improves genome quality prediction across the microbial tree of life. Nat Methods 20, 1137–1138 (2023). https://doi.org/10.1038/s41592-023-01941-9

  • DOI: https://doi.org/10.1038/s41592-023-01941-9
