Accelerating minimap2 for long-read sequencing applications on modern CPUs

Kalikar, Saurabh; Jain, Chirag; Vasimuddin, Md; Misra, Sanchit

doi:10.1038/s43588-022-00201-8

Brief Communication
Published: 28 February 2022

Saurabh Kalikar¹, Chirag Jain², Md Vasimuddin¹ & Sanchit Misra¹ (ORCID: orcid.org/0000-0001-7863-858X)

Nature Computational Science volume 2, pages 78–83 (2022)

A preprint version of the article is available at bioRxiv.

Abstract

Long-read sequencing is now routinely used at scale for genomics and transcriptomics applications. Mapping long reads or a draft genome assembly to a reference sequence is often one of the most time-consuming steps in these applications. Here we present techniques to accelerate minimap2, a widely used software for this task. We present multiple optimizations using single-instruction multiple-data parallelization, efficient cache utilization and a learned index data structure to accelerate the three main computational modules of minimap2: seeding, chaining and pairwise sequence alignment. These optimizations result in an up to 1.8-fold reduction of end-to-end mapping time of minimap2 while maintaining identical output.

**Fig. 1: Work distribution for three modules.**

**Fig. 2: Performance comparison of minimap2 and mm2-fast on a single socket Cascade Lake CPU (28 cores) for full datasets.**

Data availability

Datasets used for benchmarking are publicly available (Supplementary Table 2). The human reference genome is available at https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/release/references/GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta.gz. All ONT and PacBio HiFi datasets (HG002, HG003, HG004) used are available at https://precision.fda.gov/challenges/10/view. Datasets for PacBio CLR (HG002, HG003, HG004) are available at https://github.com/genome-in-a-bottle/giab_data_indexes. Genome assemblies are available as follows: CHM13 from NCBI (GCA_009914755.3); HG002 (hap1) and HG002 (hap2) from ref. ³³. The speedup shown in the paper can also be realized with a smaller subset of the above datasets. Source Data are provided with this paper.

Code availability

The mm2-fast source code is available under the open source MIT license at https://github.com/bwa-mem2/mm2-fast. The particular version of mm2-fast used in this manuscript is publicly available at ref. ³⁴. The scripts used for the experiments in the manuscript are available at ref. ³⁵.

Optimization Notice: Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance. Intel, Xeon, and Intel Xeon Phi are trademarks of Intel Corporation in the US and/or other countries.

References

Chaisson, M. J. et al. Multi-platform discovery of haplotype-resolved structural variation in human genomes. Nat. Commun. 10, 1–16 (2019).
Conesa, A. et al. A survey of best practices for RNA-seq data analysis. Genome Biol. 17, 1–19 (2016).
Beyter, D. et al. Long-read sequencing of 3,622 Icelanders provides insight into the role of structural variants in human diseases and other traits. Nat. Genet. 53, 779–786 (2021).
Rhie, A. et al. Towards complete and error-free genome assemblies of all vertebrate species. Nature 592, 737–746 (2021).
De Coster, W., Weissensteiner, M. H. & Sedlazeck, F. J. Towards population-scale long-read sequencing. Nat. Rev. Genet. 22, 572–587 (2021).
PromethION Brochure (Nanopore Technologies, 2021); https://nanoporetech.com/sites/default/files/s3/literature/PromethION-brochure.pdf
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
Guo, L., Lau, J., Ruan, Z., Wei, P. & Cong, J. Hardware acceleration of long read pairwise overlapping in genome sequencing: a race between FPGA and GPU. In 2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines 127–135 (IEEE, 2019).
Zeni, A. et al. LOGAN: high-performance GPU-based X-drop long-read alignment. In 2020 IEEE International Parallel and Distributed Processing Symposium 462–471 (IEEE, 2020).
Feng, Z., Qiu, S., Wang, L. & Luo, Q. Accelerating long read alignment on three processors. In Proc. 48th International Conference on Parallel Processing 1–10 (ACM, 2019).
Roberts, M., Hayes, W., Hunt, B. R., Mount, S. M. & Yorke, J. A. Reducing storage requirements for biological sequence comparison. Bioinformatics 20, 3363–3369 (2004).
Abouelhoda, M. I. & Ohlebusch, E. Chaining algorithms for multiple genome comparison. J. Discrete Algorithms 3, 321–341 (2005).
Jain, C., Gibney, D. & Thankachan, S. V. Co-linear chaining with overlaps and gap costs. Preprint at https://www.biorxiv.org/content/10.1101/2021.02.03.429492v2 (2021).
Ho, D. et al. LISA: learned indexes for DNA sequence analysis. Preprint at https://arxiv.org/abs/1910.04728 (2020).
Schneider, V. A. et al. Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Res. 27, 849–864 (2017).
Nurk, S., Koren, S., Rhie, A., Rautiainen, M. et al. The complete sequence of a human genome. Preprint at https://doi.org/10.1101/2021.05.26.445798 (2021).
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
Payne, A. et al. Readfish enables targeted nanopore sequencing of gigabase-sized genomes. Nat. Biotechnol. 39, 442–450 (2021).
Kovaka, S., Fan, Y., Ni, B., Timp, W. & Schatz, M. C. Targeted nanopore sequencing by real-time mapping of raw electrical signal with uncalled. Nat. Biotechnol. 39, 431–441 (2021).
Zhang, H. et al. Real-time mapping of nanopore raw signals. Bioinformatics https://doi.org/10.1093/bioinformatics/btab264 (2021).
Jain, C., Rhie, A., Hansen, N., Koren, S. & Phillippy, A.M. A long read mapping method for highly repetitive reference sequences. Preprint at https://www.biorxiv.org/content/10.1101/2020.11.01.363887v1.full (2020).
Sedlazeck, F. J. et al. Accurate detection of complex structural variations using single-molecule sequencing. Nat. Methods 15, 461–468 (2018).
Ren, J. & Chaisson, M. lra: the long read aligner for sequences and contigs. Preprint at https://doi.org/10.1371/journal.pcbi.1009078 (2020).
Kraska, T., Beutel, A., Chi, E.H., Dean, J. & Polyzotis, N. The case for learned index structures. In ACM International Conference on Management of Data 489–504 (ACM, 2018).
Galakatos, A., Markovitch, M., Binnig, C., Fonseca, R. & Kraska, T. FITing-Tree: a data-aware index structure. In SIGMOD ’19: Proceedings of the 2019 International Conference on Management of Data 1189–1206 (ACM, 2019); https://doi.org/10.1145/3299869.3319860
Ferragina, P. & Vinciguerra, G. The PGM-index: a fully-dynamic compressed learned index with provable worst-case bounds. PVLDB 13, 1162–1175 (2020).
Ding, J. et al. ALEX: an updatable adaptive learned index. In SIGMOD ’20: Proceedings of the 2020 International Conference on Management of Data 969–984 (ACM, 2020); https://doi.org/10.1145/3318464.3389711
Wu, Y., Yu, J., Tian, Y., Sidle, R. & Barber, R. Designing succinct secondary indexing mechanism by exploiting column correlations. In SIGMOD ’19: Proceedings of the 2019 International Conference on Management of Data 1223–1240 (ACM, 2019). https://doi.org/10.1145/3299869.3319861
Kirsche, M., Das, A. & Schatz, M. C. Sapling: accelerating suffix array queries with learned data models. Bioinformatics 37, 744–749 (2021).
Marcus, R. et al. Benchmarking learned indexes. In PVLDB Vol. 14, 1–13 (2021).
Marcus, R., Zhang, E. & Kraska, T. CDFShop: exploring and optimizing learned index structures. In SIGMOD ’20: Proc. 2020 ACM SIGMOD International Conference on Management of Data 2789–2792 (ACM, 2020); https://doi.org/10.1145/3318464.3384706
Suzuki, H. & Kasahara, M. Introducing difference recurrence relations for faster semi-global alignment of long sequences. BMC Bioinformatics 19, 33–47 (2018).
Cheng, H., Concepcion, G., Feng, X., Zhang, H. & Li, H. Human Assemblies Evaluated in the Hifiasm Paper (Zenodo, 2020); https://doi.org/10.5281/zenodo.4393631
Kalikar, S., Jain, C., Md, V. & Misra, S. mm2-fast Source Code Used in the Manuscript—Accelerating Minimap2 for Long-Read Sequencing Applications on Modern CPUs (Zenodo, 2022); https://doi.org/10.5281/zenodo.5888171
Kalikar, S., Jain, C., Md, V. & Misra, S. Scripts Used for the Experiments in the Manuscript—Accelerating Minimap2 for Long-Read Sequencing Applications on Modern CPUs (Zenodo, 2022); https://doi.org/10.5281/zenodo.5884451

Acknowledgements

This work is supported in part by the National Supercomputing Mission (NSM) India under DST/NSM/R&D_HPC_Applications to C.J. The authors are grateful to H. Li for guidance and technical discussions on minimap2 and for working with us to integrate our improvements into a branch of the minimap2 GitHub repository.

Author information

Authors and Affiliations

Intel Labs, Bangalore, India
Saurabh Kalikar, Md Vasimuddin & Sanchit Misra
Department of Computational and Data Sciences, Indian Institute of Science, Bangalore, India
Chirag Jain

Contributions

S.K. led the software implementation of mm2-fast. All authors contributed to algorithm design, experiments and manuscript preparation, and read and approved the final manuscript.

Corresponding authors

Correspondence to Saurabh Kalikar, Chirag Jain, Md Vasimuddin or Sanchit Misra.

Ethics declarations

Competing interests

S.K., V.M. and S.M. are employees of Intel Corporation.

Peer review

Peer review information

Nature Computational Science thanks Aydin Buluc, Zemin Ning and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Handling editor: Fernando Chirigati, in collaboration with the Nature Computational Science team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Minimap2 workflow depicting its three key modules – (i) seeding, (ii) chaining, and (iii) alignment – and mm2-fast optimizations.

The seeding stage identifies short fixed-length exact matches between a read and a reference sequence. The chaining stage selects an ordered subset of these exact matches (anchors) to form a chain. The final alignment stage computes base-level alignments to fill the gaps between adjacent anchors in these chains. Our optimizations to each of the modules are shown in the blue dotted rectangle.
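The seeding step described above can be sketched in a few lines of Python. This is a simplified illustration, not the minimap2 or mm2-fast implementation: it indexes every k-mer of the reference, whereas minimap2 indexes only a subsample of k-mers (minimizers). The function names `kmer_index` and `find_anchors` are hypothetical.

```python
from collections import defaultdict


def kmer_index(reference, k):
    """Map every k-mer of the reference to the list of its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return dict(index)


def find_anchors(read, reference, k):
    """Return sorted (reference_pos, read_pos) pairs of exact k-mer matches.

    Simplification: all k-mers are used as seeds; minimap2 uses minimizers.
    """
    index = kmer_index(reference, k)
    anchors = []
    for qpos in range(len(read) - k + 1):
        for rpos in index.get(read[qpos:qpos + k], []):
            anchors.append((rpos, qpos))
    anchors.sort()
    return anchors


anchors = find_anchors("CGTAC", "ACGTACGT", 4)
```

The resulting anchors are the input to the chaining stage, which selects an ordered, co-linear subset of them.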

Extended Data Fig. 2 Cross-platform performance of our optimizations for Rome, Skylake, Cascade Lake and Ice Lake architectures using a single socket.

The x-axis shows the query datasets and the y-axis shows the speedup achieved by mm2-fast over minimap2, with both running on the same CPU.

Extended Data Fig. 3 Data structures used for hash table.

Minimizers extracted from the reference sequence are stored in a sorted list as key-value pairs. The position list maintains a separate list of the positions of minimizers on the reference sequence.
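A minimal Python sketch of such a layout follows. It assumes, for illustration only, that each value in the sorted list is an (offset, count) pair pointing into the position list; the caption does not spell out the value encoding, and the names `build_index` and `lookup` are hypothetical. A plain binary search stands in for the lookup here.

```python
from bisect import bisect_left


def build_index(minimizer_hits):
    """Build (keys, spans, positions) from (minimizer_key, reference_pos) pairs.

    keys      : sorted list of distinct minimizer keys
    spans     : per key, [offset, count] into the position list (assumed encoding)
    positions : reference positions, grouped by key
    """
    keys, spans, positions = [], [], []
    for key, pos in sorted(minimizer_hits):
        if not keys or keys[-1] != key:
            keys.append(key)
            spans.append([len(positions), 0])
        spans[-1][1] += 1
        positions.append(pos)
    return keys, spans, positions


def lookup(keys, spans, positions, key):
    """Return all reference positions of a minimizer, or [] if it is absent."""
    i = bisect_left(keys, key)
    if i == len(keys) or keys[i] != key:
        return []
    offset, count = spans[i]
    return positions[offset:offset + count]


keys, spans, positions = build_index([(42, 100), (7, 5), (42, 300), (19, 77)])
hits = lookup(keys, spans, positions, 42)
```

Keeping the keys in one sorted array is what makes a learned model applicable: the model only has to predict an index into `keys`.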

Extended Data Fig. 4 Two-layer RMI.

An example minimizer lookup is illustrated: get_mm_hits(mm5) looks up the minimizer mm5. The RMI root predicts which leaf-layer model to use, and that model in turn predicts a location in the sorted list, here that of mm4. Finally, the last-mile search walks from mm4 to the location of mm5 and returns its value to the caller.
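The lookup path in this caption (root model, leaf model, last-mile search) can be sketched as below. This is an illustrative toy under stated assumptions, not the mm2-fast code: both layers are simple linear models, keys are distinct integers, and the class name `TwoLayerRMI` and its parameters are hypothetical.

```python
class TwoLayerRMI:
    """Toy two-layer recursive model index over a sorted list of distinct keys."""

    def __init__(self, sorted_keys, num_leaves=4):
        self.keys = sorted_keys
        self.num_leaves = num_leaves
        self.lo, self.hi = sorted_keys[0], sorted_keys[-1]
        # Partition the keys by the root's prediction, then fit one leaf each.
        buckets = [[] for _ in range(num_leaves)]
        for pos, key in enumerate(sorted_keys):
            buckets[self._root(key)].append((key, pos))
        self.leaves = [self._fit(bucket) for bucket in buckets]

    def _root(self, key):
        """Root model: linear map from key range to a leaf number."""
        if self.hi == self.lo:
            return 0
        frac = (key - self.lo) / (self.hi - self.lo)
        return min(self.num_leaves - 1, max(0, int(frac * self.num_leaves)))

    @staticmethod
    def _fit(bucket):
        """Leaf model: line through the first and last (key, position) point."""
        if not bucket:
            return (0.0, 0.0)
        (k0, p0), (k1, p1) = bucket[0], bucket[-1]
        slope = (p1 - p0) / (k1 - k0) if k1 != k0 else 0.0
        return (slope, p0 - slope * k0)

    def lookup(self, key):
        """Return the index of key in the sorted list, or -1 if absent."""
        slope, intercept = self.leaves[self._root(key)]
        guess = int(round(slope * key + intercept))
        i = min(len(self.keys) - 1, max(0, guess))
        # Last-mile search: walk from the predicted slot to the true one.
        while i > 0 and self.keys[i] > key:
            i -= 1
        while i < len(self.keys) - 1 and self.keys[i] < key:
            i += 1
        return i if self.keys[i] == key else -1


keys = [3, 8, 15, 16, 23, 42, 57, 91]
rmi = TwoLayerRMI(keys, num_leaves=4)
slot = rmi.lookup(23)
```

The better the leaf models fit the key distribution, the shorter the last-mile walk; correctness does not depend on model quality because the walk always ends at the true slot or reports a miss.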

Extended Data Fig. 5 Chaining of two co-linear anchors A and B.

Here two anchors overlap on the query sequence. The gap cost function in minimap2 is calculated using the reference gap, the query gap and the average length of all anchors, avg_qlen.
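As a rough sketch of how these three quantities combine, the Python below assumes the gap cost takes the form given in the minimap2 paper (Li, 2018): 0.01 × average anchor length × |gap difference| + 0.5 × log2|gap difference|. The anchor tuple layout and the function names are hypothetical, and the real code relies on integer approximations and further heuristics that are omitted here.

```python
import math


def gap_cost(gap_diff, avg_qlen):
    """Penalty for a difference between the query gap and the reference gap."""
    l = abs(gap_diff)
    if l == 0:
        return 0.0
    return 0.01 * avg_qlen * l + 0.5 * math.log2(l)


def chain_score_gain(a, b, avg_qlen):
    """Score gained by appending anchor b after anchor a in a chain.

    Each anchor is (ref_end, query_end, length); this layout is an assumption.
    Returns None when b does not follow a on both sequences (not co-linear).
    """
    ref_gap = b[0] - a[0]
    query_gap = b[1] - a[1]
    if ref_gap <= 0 or query_gap <= 0:
        return None
    # Count only new matching bases, so overlapping anchors are not double counted.
    matched = min(ref_gap, query_gap, b[2])
    return matched - gap_cost(query_gap - ref_gap, avg_qlen)


# Anchors A and B overlap on the query: the query gap (10) is below B's length (15).
gain = chain_score_gain((100, 50, 15), (120, 60, 15), avg_qlen=15)
```

In the overlapping case only the 10 new query bases count as matched, and the mismatch between the reference gap (20) and the query gap (10) is charged through `gap_cost`.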

Supplementary information

Supplementary Information

Supplementary Tables 1–4, Figs. 1 and 2, Algorithms 1 and 2, and Sections 1 and 2.

Peer Review Information

Supplementary Data 1

Source data showing the single-threaded and multithreaded runtime of mm2-fast.

Supplementary Data 2

Source data showing the time spent by mm2-fast and minimap2 in the chaining module.

Source data

Source Data Fig. 1

Source Data showing the time spent by mm2-fast and minimap2 in various modules.

Source Data Fig. 2

Source Data showing the end-to-end mapping time of mm2-fast and minimap2 on the full datasets.

Source Data Extended Data Fig. 2

Source data showing the speedups of mm2-fast on various architectures.

About this article

Cite this article

Kalikar, S., Jain, C., Vasimuddin, M. et al. Accelerating minimap2 for long-read sequencing applications on modern CPUs. Nat Comput Sci 2, 78–83 (2022). https://doi.org/10.1038/s43588-022-00201-8

Received: 20 July 2021
Accepted: 25 January 2022
Published: 28 February 2022
Issue Date: February 2022
DOI: https://doi.org/10.1038/s43588-022-00201-8

This article is cited by

Efficient end-to-end long-read sequence mapping using minimap2-fpga integrated with hardware accelerated chaining
- Kisaru Liyanage
- Hiruna Samarakoon
- Hasindu Gamaarachchi
Scientific Reports (2023)
MetaCC allows scalable and integrative analyses of both long-read and short-read metagenomic Hi-C data
- Yuxuan Du
- Fengzhu Sun
Nature Communications (2023)
