Mapping genotypes to chromatin accessibility profiles in single cells

Abstract

In somatic tissue differentiation, chromatin accessibility changes govern priming and precursor commitment towards cellular fates (refs. 1–3). Therefore, somatic mutations are likely to alter chromatin accessibility patterns, as they disrupt differentiation topologies leading to abnormal clonal outgrowth. However, defining the impact of somatic mutations on the epigenome in human samples is challenging due to admixed mutated and wild-type cells. Here, to chart how somatic mutations disrupt epigenetic landscapes in human clonal outgrowths, we developed genotyping of targeted loci with single-cell chromatin accessibility (GoT–ChA). This high-throughput platform links genotypes to chromatin accessibility at single-cell resolution across thousands of cells within a single assay. We applied GoT–ChA to CD34+ cells from patients with myeloproliferative neoplasms with JAK2V617F-mutated haematopoiesis. Differential accessibility analysis between wild-type and JAK2V617F-mutant progenitors revealed both cell-intrinsic and cell-state-specific shifts within mutant haematopoietic precursors, including cell-intrinsic pro-inflammatory signatures in haematopoietic stem cells, and a distinct profibrotic inflammatory chromatin landscape in megakaryocytic progenitors. Integration of mitochondrial genome profiling and cell-surface protein expression measurement allowed expansion of genotyping onto DOGMA-seq through imputation, enabling single-cell capture of genotypes, chromatin accessibility, RNA expression and cell-surface protein expression. Collectively, we show that the JAK2V617F mutation leads to epigenetic rewiring in a cell-intrinsic and cell type-specific manner, influencing inflammation states and differentiation trajectories. We envision that GoT–ChA will empower broad future investigations of the critical link between somatic mutations and epigenetic alterations across clonal populations in malignant and non-malignant contexts.

Fig. 1: GoT–ChA profiles single-cell genotypes with chromatin accessibility.
Fig. 2: GoT–ChA applied to human JAK2V617F-mutated MF samples.
Fig. 3: JAK2V617F-mutant HSPCs exhibit intrinsic pro-inflammatory and myeloid-biased epigenetic priming.
Fig. 4: JAK2V617F-driven epigenetic dysregulation of the EP haemoglobin locus.
Fig. 5: GoT–ChA integration with ASAP–seq.

Data availability

Raw and processed data files generated from cell lines, as well as processed data files generated from patient samples, are available at the Gene Expression Omnibus (GEO) as part of the superseries GSE203251. Patient raw sequencing data containing genomic sequences generated in this study have been deposited at the European Genome–Phenome Archive under accession number EGAS50000000164. The GRCh38 reference genome was used for alignment of single-cell ATAC–seq data (refdata-cellranger-atac-GRCh38-1.2.0) and DOGMA-seq data (refdata-cellranger-arc-GRCh38-2020-A-2.0.0); both reference packages are freely available from the 10x Genomics website (https://support.10xgenomics.com).

Code availability

The code used for raw data processing and for noise correction of the genotyping data obtained through GoT–ChA, as well as functions for downstream differential gene accessibility and TF motif accessibility analyses, is available at GitHub (https://github.com/landau-lab/Gotcha).

References

  1. Corces, M. R. et al. Lineage-specific and single-cell chromatin accessibility charts human hematopoiesis and leukemia evolution. Nat. Genet. 48, 1193–1203 (2016).

  2. Buenrostro, J. D. et al. Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523, 486–490 (2015).

  3. Ma, S. et al. Chromatin potential identified by shared single-cell profiling of RNA and chromatin. Cell 183, 1103–1116 (2020).

  4. Izzo, F. et al. DNA methylation disruption reshapes the hematopoietic differentiation landscape. Nat. Genet. 52, 378–387 (2020).

  5. Nam, A. S. et al. Single-cell multi-omics of human clonal hematopoiesis reveals that DNMT3A R882 mutations perturb early progenitor states through selective hypomethylation. Nat. Genet. 54, 1514–1526 (2022).

  6. Mullally, A. et al. Physiological Jak2V617F expression causes a lethal myeloproliferative neoplasm with differential effects on hematopoietic stem and progenitor cells. Cancer Cell 17, 584–596 (2010).

  7. Gerritsen, M. et al. RUNX1 mutations enhance self-renewal and block granulocytic differentiation in human in vitro models and primary AMLs. Blood Adv. 3, 320–332 (2019).

  8. Jaiswal, S. et al. Age-related clonal hematopoiesis associated with adverse outcomes. N. Engl. J. Med. 371, 2488–2498 (2014).

  9. Levine, R. L. et al. Activating mutation in the tyrosine kinase JAK2 in polycythemia vera, essential thrombocythemia, and myeloid metaplasia with myelofibrosis. Cancer Cell 7, 387–397 (2005).

  10. Kralovics, R. et al. A gain-of-function mutation of JAK2 in myeloproliferative disorders. N. Engl. J. Med. 352, 1779–1790 (2005).

  11. James, C. et al. A unique clonal JAK2 mutation leading to constitutive signalling causes polycythaemia vera. Nature 434, 1144–1148 (2005).

  12. Baxter, E. J. et al. Acquired mutation of the tyrosine kinase JAK2 in human myeloproliferative disorders. Lancet 365, 1054–1061 (2005).

  13. Panteli, K. E. et al. Serum interleukin (IL)-1, IL-2, sIL-2Ra, IL-6 and thrombopoietin levels in patients with chronic myeloproliferative diseases. Br. J. Haematol. 130, 709–715 (2005).

  14. Jamieson, C. H. M. et al. The JAK2 V617F mutation occurs in hematopoietic stem cells in polycythemia vera and predisposes toward erythroid differentiation. Proc. Natl Acad. Sci. USA 103, 6224–6229 (2006).

  15. Giustacchini, A. et al. Single-cell transcriptomics uncovers distinct molecular signatures of stem cells in chronic myeloid leukemia. Nat. Med. 23, 692–702 (2017).

  16. Rodriguez-Meira, A. et al. Unravelling intratumoral heterogeneity through high-sensitivity single-cell mutational analysis and parallel RNA sequencing. Mol. Cell 73, 1292–1305 (2019).

  17. Rodriguez-Meira, A., O’Sullivan, J., Rahman, H. & Mead, A. J. TARGET-seq: a protocol for high-sensitivity single-cell mutational analysis and parallel RNA sequencing. STAR Protoc. 1, 100125 (2020).

  18. van Galen, P. et al. Single-cell RNA-seq reveals AML hierarchies relevant to disease progression and immunity. Cell 176, 1265–1281 (2019).

  19. Nam, A. S. et al. Somatic mutations and cell identity linked by genotyping of transcriptomes. Nature 571, 355–360 (2019).

  20. Morita, K. et al. Clonal evolution of acute myeloid leukemia revealed by high-throughput single-cell genomics. Nat. Commun. 11, 5327 (2020).

  21. Miles, L. A. et al. Single-cell mutation analysis of clonal evolution in myeloid malignancies. Nature 587, 477–482 (2020).

  22. Van Egeren, D. et al. Reconstructing the lineage histories and differentiation trajectories of individual cancer cells in myeloproliferative neoplasms. Cell Stem Cell 28, 514–523 (2021).

  23. Van Egeren, D. et al. Transcriptional differences between JAK2-V617F and wild-type bone marrow cells in patients with myeloproliferative neoplasms. Exp. Hematol. 107, 14–19 (2022).

  24. Turkalj, S. et al. GTAC enables parallel genotyping of multiple genomic loci with chromatin accessibility profiling in single cells. Cell Stem Cell 30, 722–740 (2023).

  25. Mackinnon, R. N. et al. Genome organization and the role of centromeres in evolution of the erythroleukaemia cell line HEL. Evol. Med. Publ. Health 2013, 225–240 (2013).

  26. Stuart, T., Srivastava, A., Madad, S., Lareau, C. A. & Satija, R. Single-cell chromatin state analysis with Signac. Nat. Methods 18, 1333–1341 (2021).

  27. Mustjoki, S. et al. JAK2V617F mutation and spontaneous megakaryocytic or erythroid colony formation in patients with essential thrombocythaemia (ET) or polycythaemia vera (PV). Leuk. Res. 33, 54–59 (2009).

  28. Schieber, M., Crispino, J. D. & Stein, B. Myelofibrosis in 2019: moving beyond JAK2 inhibition. Blood Cancer J. 9, 74 (2019).

  29. Pardanani, A. & Tefferi, A. Definition and management of ruxolitinib treatment failure in myelofibrosis. Blood Cancer J. 4, e268 (2014).

  30. Cervantes, F. et al. Three-year efficacy, safety, and survival findings from COMFORT-II, a phase 3 study comparing ruxolitinib with best available therapy for myelofibrosis. Blood 122, 4047–4053 (2013).

  31. Mondet, J., Hussein, K. & Mossuz, P. Circulating cytokine levels as markers of inflammation in philadelphia negative myeloproliferative neoplasms: diagnostic and prognostic interest. Mediators Inflamm. 2015, 670580 (2015).

  32. Tefferi, A. et al. Circulating interleukin (IL)-8, IL-2R, IL-12, and IL-15 levels are independently prognostic in primary myelofibrosis: a comprehensive cytokine profiling study. J. Clin. Oncol. 29, 1356–1363 (2011).

  33. Verstovsek, S. et al. A double-blind, placebo-controlled trial of ruxolitinib for myelofibrosis. N. Engl. J. Med. 366, 799–807 (2012).

  34. Vukotić, M. et al. Inhibition of proinflammatory signaling impairs fibrosis of bone marrow mesenchymal stromal cells in myeloproliferative neoplasms. Exp. Mol. Med. 54, 273–284 (2022).

  35. Dunbar, A. J. et al. CXCL8/CXCR2 signaling mediates bone marrow fibrosis and is a therapeutic target in myelofibrosis. Blood 141, 2508–2519 (2023).

  36. Hu, W.-H. et al. NIBP, a novel NIK and IKKβ-binding protein that enhances NF-κB activation. J. Biol. Chem. 280, 29233–29241 (2005).

  37. Jeanpierre, S. et al. The quiescent fraction of chronic myeloid leukemic stem cells depends on BMPR1B, Stat3 and BMP4-niche signals to persist in patients in remission. Haematologica 106, 111–122 (2021).

  38. Wu, Y. et al. The prognostic value of matrix metalloproteinase-7 and matrix metalloproteinase-15 in acute myeloid leukemia. J. Cell. Biochem. 120, 10613–10624 (2019).

  39. Ikeda, M., Chiba, S., Ohashi, K. & Mizuno, K. Furry protein promotes aurora A-mediated Polo-like kinase 1 activation. J. Biol. Chem. 287, 27670–27681 (2012).

  40. Komorowska, K. et al. Hepatic leukemia factor maintains quiescence of hematopoietic stem cells and protects the stem cell pool during regeneration. Cell Rep. 21, 3514–3523 (2017).

  41. Ficara, F. et al. Pbx1 restrains myeloid maturation while preserving lymphoid potential in hematopoietic progenitors. J. Cell Sci. 126, 3181–3191 (2013).

  42. Ficara, F., Murphy, M. J., Lin, M. & Cleary, M. L. Pbx1 regulates self-renewal of long-term hematopoietic stem cells by maintaining their quiescence. Cell Stem Cell 2, 484–496 (2008).

  43. Kleppe, M. et al. JAK-STAT pathway activation in malignant and nonmalignant cells contributes to MPN pathogenesis and therapeutic response. Cancer Discov. 5, 316–331 (2015).

  44. Kleppe, M. et al. Dual targeting of oncogenic activation and inflammatory signaling increases therapeutic efficacy in myeloproliferative neoplasms. Cancer Cell 33, 785–787 (2018).

  45. Dunbar, A. J. et al. Jak2V617F reversible activation shows its essential requirement in myeloproliferative neoplasms. Cancer Discov. https://doi.org/10.1158/2159-8290.CD-22-0952 (2024).

  46. Wernig, G. et al. Unifying mechanism for different fibrotic diseases. Proc. Natl Acad. Sci. USA 114, 4757–4762 (2017).

  47. Burda, P., Laslo, P. & Stopka, T. The role of PU.1 and GATA-1 transcription factors during normal and leukemogenic hematopoiesis. Leukemia 24, 1249–1257 (2010).

  48. Zhang, P. et al. Negative cross-talk between hematopoietic regulators: GATA proteins repress PU.1. Proc. Natl Acad. Sci. USA 96, 8705–8710 (1999).

  49. Basak, A. & Sankaran, V. G. Regulation of the fetal hemoglobin silencing factor BCL11A. Ann. N. Y. Acad. Sci. 1368, 25–30 (2016).

  50. Sankaran, V. G. et al. Human fetal hemoglobin expression is regulated by the developmental stage-specific repressor BCL11A. Science 322, 1839–1842 (2008).

  51. Hoffman, R. et al. Fetal hemoglobin in polycythemia vera: cellular distribution in 50 unselected patients. Blood 53, 1148–1155 (1979).

  52. Mimitou, E. P. et al. Scalable, multimodal profiling of chromatin accessibility, gene expression and protein levels in single cells. Nat. Biotechnol. 39, 1246–1258 (2021).

  53. Baum, C. M., Weissman, I. L., Tsukamoto, A. S., Buckle, A. M. & Peault, B. Isolation of a candidate human hematopoietic stem-cell population. Proc. Natl Acad. Sci. USA 89, 2804–2808 (1992).

  54. Asch, A. S., Barnwell, J., Silverstein, R. L. & Nachman, R. L. Isolation of the thrombospondin membrane receptor. J. Clin. Invest. 79, 1054–1061 (1987).

  55. Valet, C. et al. Adipocyte fatty acid transfer supports megakaryocyte maturation. Cell Rep. 32, 107875 (2020).

  56. Mustjoki, S. & Young, N. S. Somatic mutations in “benign” disease. N. Engl. J. Med. 384, 2039–2052 (2021).

  57. Martincorena, I. et al. Somatic mutant clones colonize the human esophagus with age. Science 362, 911–917 (2018).

  58. Martincorena, I. et al. Tumor evolution. High burden and pervasive positive selection of somatic mutations in normal human skin. Science 348, 880–886 (2015).

  59. Granja, J. M. et al. ArchR is a scalable software package for integrative single-cell chromatin accessibility analysis. Nat. Genet. 53, 403–411 (2021).

  60. Mulè, M. P., Martins, A. J. & Tsang, J. S. Normalizing and denoising protein expression data from droplet-based single cell profiling. Nat. Commun. 13, 2099 (2022).

  61. Stoeckius, M. et al. Cell hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics. Genome Biol. 19, 224 (2018).

  62. Thibodeau, A. et al. AMULET: a novel read count-based method for effective multiplet detection from single nucleus ATAC-seq data. Genome Biol. 22, 252 (2021).

  63. Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587 (2021).

  64. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902 (2019).

  65. Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019).

  66. Schep, A. N., Wu, B., Buenrostro, J. D. & Greenleaf, W. J. chromVAR: inferring transcription-factor-associated accessibility from single-cell epigenomic data. Nat. Methods 14, 975–978 (2017).

  67. Pliner, H. A. et al. Cicero predicts cis-regulatory DNA interactions from single-cell chromatin accessibility data. Mol. Cell 71, 858–871 (2018).

  68. Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525–527 (2016).

  69. Melsted, P. et al. Modular, efficient and constant-memory single-cell RNA-seq preprocessing. Nat. Biotechnol. 39, 813–818 (2021).

  70. Gehring, J., Hwee Park, J., Chen, S., Thomson, M. & Pachter, L. Highly multiplexed single-cell RNA-seq by DNA oligonucleotide tagging of cellular proteins. Nat. Biotechnol. 38, 35–38 (2020).

  71. Lareau, C. A. et al. Massively parallel single-cell mitochondrial DNA genotyping and chromatin profiling. Nat. Biotechnol. 39, 451–461 (2021).

  72. Ludwig, L. S. et al. Lineage tracing in humans enabled by mitochondrial mutations and single-cell genomics. Cell 176, 1325–1339 (2019).

  73. Svetnik, V. et al. Random forest: a classification and regression tool for compound classification and QSAR modeling. J. Chem. Inf. Comput. Sci. 43, 1947–1958 (2003).

  74. Satpathy, A. T. et al. Massively parallel single-cell chromatin landscapes of human immune cell development and intratumoral T cell exhaustion. Nat. Biotechnol. 37, 925–936 (2019).

  75. Plummer, N. W. et al. Expanding the power of recombinase-based labeling to uncover cellular diversity. Development 142, 4385–4393 (2015).

  76. Ruzankina, Y. et al. Deletion of the developmentally essential gene ATR in adult mice leads to age-related phenotypes and stem cell loss. Cell Stem Cell 1, 113–126 (2007).

  77. Kozlov, A., Alves, J. M., Stamatakis, A. & Posada, D. CellPhy: accurate and fast probabilistic inference of single-cell phylogenies from scDNA-seq data. Genome Biol. 23, 37 (2022).

Acknowledgements

R.M.M. is supported by a Medical Scientist Training Program grant from the National Institute of General Medical Sciences of the National Institutes of Health under award number T32GM007739 to the Weill Cornell/Rockefeller/Sloan Kettering Tri-Institutional MD-PhD Program and by the Weill Cornell Medicine NYSTEM Training Program under award number C32558GG. F.I. is supported by the American Society of Hematology Fellow-to-Faculty Scholar Award number 204377-01. A.J.D. is a William Raveis Charitable Fund Physician-Scientist of the Damon Runyon Cancer Research Foundation (PST-24-19) and has received funding from the American Association of Cancer Research and the American Association of Clinical Oncology. E.O.E. was supported by a Medical Scientist Training Program grant from the National Institute of General Medical Sciences of the National Institutes of Health under award number T32GM007739 to the Weill Cornell/Rockefeller/Sloan Kettering Tri-Institutional MD-PhD Program. A subset of biospecimens and data for this work were provided through the Hematological Malignancies Tissue Bank, which is administered and functions under the auspices of the NCI-designated Tisch Cancer Institute at the Icahn School of Medicine at Mount Sinai. R.C. is supported by Lymphoma Research Foundation and Marie Skłodowska-Curie fellowships. R.L.L. is supported by a Leukemia & Lymphoma Society Specialized Center of Research grant and a National Cancer Institute award (P01 CA108671) and the National Institutes of Health/National Cancer Institute (P50 CA254838-01). B.M. is supported by the National Heart, Lung, and Blood Institute (K08 1K08HL163489-01A1) and the National Cancer Institute (P01 2P01CA108671-15). D.A.L. is supported by the Burroughs Wellcome Fund Career Award for Medical Scientists, Valle Scholar Award, Leukemia Lymphoma Scholar Award and the Mark Foundation Emerging Leader Award.
This work was also supported by the Tri-Institutional Stem Cell Initiative, the National Heart Lung and Blood Institute (R01HL157387-01A1), the National Cancer Institute (R33 CA267219), the National Human Genome Research Institute, Center of Excellence in Genomic Science (RM1HG011014) and the National Institutes of Health Common Fund Somatic Mosaicism Across Human Tissues (UG3NS132139). This work was enabled by the Weill Cornell Flow Cytometry Core. We thank A. Melnick for commenting on the manuscript and M. Hoare for providing the FOXO1S22W mutant cell line. This work was made possible by the MacMillan Family Foundation and the MacMillan Center for the Study of the Non-Coding Cancer Genome at the New York Genome Center.

Author information

Contributions

R.M.M., F.I. and D.A.L. conceived the project, devised the research strategy and analysed the data. R.M.M., F.I., E.P.M., R.C., P.S. and D.A.L. developed GoT–ChA. R.M.M., F.I., S.K., T.B. and D.A.L. developed the analytical pipelines for processing GoT–ChA data. M.S., S.E.G.-B., J.A., R.H., B.M., I.M.G., D.C.C. and O.A.-W. conducted a database search and retrieved patient samples for experimental use. R.M.M., S.G., L.M. and J.S. performed the experiments. F.I., R.M.M., S.K., T.P., R.R., E.O.E., T.B. and L.M. performed the computational analyses. A.J.D. and R.L.B. performed the in vivo mouse bulk RNA-seq experiments. J.M.S. and D.C.C. collected the samples and performed the CD90 flow cytometry measurements. R.M.M., F.I., C.P. and D.A.L. wrote the manuscript. R.M.M., F.I., S.K., T.P., A.J.D., R.L.B., E.P.M., M.S., R.R., S.G., L.M., R.H., R.C., O.A.-W., P.S., B.M., I.M.G., J.M.S., R.L.L. and D.A.L. helped to interpret results. R.M.M., F.I. and D.A.L. acquired funding for this work. All of the authors reviewed and approved the final manuscript.

Corresponding authors

Correspondence to Franco Izzo or Dan A. Landau.

Ethics declarations

Competing interests

M.S. served on the advisory board for Novartis, Kymera, Sierra Oncology, GSK, Rigel, BMS and Taiho; consulted for Boston Consulting and Dedham group and participated in GME activity for Novartis, Curis Oncology, Haymarket Media and Clinical care options. R.H. has served as a consultant for Protagonist Therapeutics, received research funding from Kartos Therapeutics, Novartis and AbbVie, and is on the data safety monitoring board of Novartis and AbbVie. O.A.-W. has served as a consultant for H3B Biomedicine, Foundation Medicine, Merck, Pfizer, Codify Therapeutics and Janssen, and is on the scientific advisory board of Envisagenics, AIChemy and Codify Therapeutics. O.A.-W. has received previous research funding from H3B Biomedicine, LOXO Oncology, Nurix Therapeutics, Codify Therapeutics and Minovia unrelated to the current work. O.A.-W. is a scientific co-founder of Codify Therapeutics. P.S. and E.P.M. are current employees of 10x Genomics and Immunai, respectively. R.L.L. is on the supervisory board of Qiagen and is a scientific advisor to Imago, Mission Bio, Bakx, Zentalis, Ajax, Auron, Prelude, C4 Therapeutics and Isoplexis. R.L.L. has received research support from Abbvie, Constellation, Ajax, Zentalis and Prelude. R.L.L. has received research support from and consulted for Celgene and Roche and has consulted for Syndax, Incyte, Janssen, Astellas, Morphosys and Novartis. R.L.L. has received honoraria from Astra Zeneca and Novartis for invited lectures and from Gilead and Novartis for grant reviews. D.A.L. has served as a consultant for Abbvie and Illumina and is on the scientific advisory board of Mission Bio and C2i Genomics. D.A.L. has received previous research funding from BMS, 10x Genomics and Illumina unrelated to the current work. R.M.M., F.I., E.P.M., R.C., P.S. and D.A.L. have filed a patent for GoT–ChA (63/288,874). The other authors declare no competing interests.

Peer review

Peer review information

Nature thanks Andrew Adey, Vijay Sankaran and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 GoT-ChA primers, genotyping and quality control metrics.

a, Primer design schematic for GoT-ChA. b, Primer binding sites (blue) for TP53R248 and JAK2V617 genotyping, with custom primer handles from a. c, Schematic showing GoT-ChA library construction, composed of a biotinylated hemi-nested PCR, a streptavidin-biotin pull-down, and an on-bead sample indexing PCR, resulting in genotyping libraries compatible with Illumina sequencing. d, Representative image of electrophoresis gel for GoT-ChA for two out of 21 total samples. Full length gel can be found in Supplementary Fig. 1. e, Representative bioanalyzer traces of GoT-ChA genotyping (top) and GoT-ChA scATAC (bottom) libraries for two samples. FU, fluorescent units. f, Sanger sequencing confirming known homozygosity of TP53R248 WT HEL cells and TP53R248Q mutant CA46 cells. g, Differential gene accessibility score heat map showing distinct HEL and CA46 cells in the TP53R248 mixing study (FDR < 0.05 and log2FC > 1.25; Wilcoxon rank sum test followed by Benjamini-Hochberg correction). h, Chromatin accessibility coverage of marker genes (EBF1 and GATA1; FDR < 0.05, log2FC > 1.25; Wilcoxon rank sum test followed by Benjamini-Hochberg correction), agnostic to genotyping information, used for cell line identity assignments (Methods). i, Heatmap showing heteroplasmy for mutually exclusive mitochondrial variants detected in the scATAC-seq data for HEL or CA46 cells (Methods). j, scATAC-seq library fragment size distribution for the TP53R248 mixing study, showing expected nucleosomal periodicity. k, Number of unique nuclear fragments per cell for each cell line in the TP53R248 mixing study, indicating adequate complexity of the scATAC-seq libraries (HEL n = 2,540 cells; CA46 n = 2,117 cells). l, Transcription start site (TSS) enrichment scores per cell in the TP53R248 mixing study, showing high signal-to-background ratio in the scATAC-seq data (HEL n = 2,540 cells; CA46 n = 2,117 cells). 
m, Histograms of the number of WT (left) and MUT (right) reads per cell from the TP53R248 mixing study. Kernel density estimation (KDE) lines for overall data (red), background (yellow), and signal (pink) are shown for each genotype. n, Scatter plots comparing GoT-ChA-assigned genotypes (top) with the true genotypes as determined by cell line identity (bottom). Dotted lines show the detected threshold for the distinction between background and signal before updated cluster assignments for both WT and MUT data. For all boxplots, error bars represent the range, boxes represent the interquartile range and lines represent the median.
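
The marker criteria quoted in panels g and h (Benjamini-Hochberg-corrected FDR < 0.05 and log2FC > 1.25) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy P values are hypothetical, and the Wilcoxon rank sum test that would produce the P values is assumed to have been run already.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_markers(genes, pvalues, log2fc, fdr_cutoff=0.05, lfc_cutoff=1.25):
    """Keep genes passing both cutoffs quoted in the caption."""
    fdr = benjamini_hochberg(pvalues)
    return [
        gene
        for gene, q, lfc in zip(genes, fdr, log2fc)
        if q < fdr_cutoff and lfc > lfc_cutoff
    ]


# Toy input (hypothetical values, not data from the study).
markers = select_markers(
    genes=["A", "B", "C", "D"],
    pvalues=[0.001, 0.2, 0.004, 0.03],
    log2fc=[2.0, 3.0, 0.5, 1.5],
)
print(markers)
```

In this toy input, gene B fails the FDR cutoff and gene C fails the fold-change cutoff, so only A and D are retained.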

Extended Data Fig. 2 Genotyping accuracy, quality control metrics, and GoT-ChA data processing for JAK2V617F locus.

a, Sanger sequencing confirmation of known genotypes for the JAK2V617 mixing study: CCRF-CEM WT cells, SET-2 heterozygous cells, and HEL homozygous mutant cells. SET-2 data confirm the known allelic ratio of 3:1 for mutated:WT alleles in this cell line. b, Heat map of differential gene accessibility score (FDR < 0.05, log2FC > 1.25; Wilcoxon rank sum test followed by Benjamini-Hochberg correction) distinguishing the CCRF-CEM, SET-2, and HEL cells used in the JAK2V617 mixing study. c, Chromatin accessibility coverage of marker genes (FDR < 0.05, log2FC > 1.25; Wilcoxon rank sum test followed by Benjamini-Hochberg correction), agnostic to genotyping information, used for cell line identity assignments. d, Heatmap showing heteroplasmy of mutually exclusive mitochondrial variants detected in the scATAC-seq data for HEL, CCRF-CEM and SET-2 cells (Methods). e, Fragment size distribution for the JAK2V617 mixing study scATAC-seq library, showing expected nucleosomal periodicity. f, Scatter plots showing the number of unique nuclear fragments per cell vs. the transcriptional start site (TSS) enrichment. Dotted lines indicate the selected thresholds based on the distribution. g, Histograms of WT (left) and MUT (right) read distributions from the JAK2V617 mixing study. KDE lines for overall data (red), background (yellow), and signal (pink) are shown for each genotype. h, Scatter plots comparing GoT-ChA-assigned genotypes (left) to the true genotypes (right) as determined by cell line identity. Dotted lines indicate the initial thresholds identified between background noise and signal for either WT (vertical line) or MUT (horizontal line) data before final genotype assignment after clustering (Methods). i, JAK2V617 locus coverage (Methods). j, Same as Fig. 1e for JAK2V617-mutant HEL cells (with known chromosome 9 amplification) vs. healthy control (Methods). k,l, Fraction of cells genotyped by GoT-ChA (k) or GoT-ChA genotyping accuracy (l) per targeted locus copy number.
Grey area, 95% confidence interval. m, Sanger sequencing confirmation of known genotypes for the FOXO1S22 (c.65 C > G) mixing study: SUM159 WT cells and HEPG2 homozygous mutant cells. n, UMAP coloured by GoT-ChA FOXO1S22 genotype classifications of HEPG2 (n = 8,111 cells) and SUM159 (n = 2,841 cells) assigned as wild-type (WT, blue), mutant (MUT, red), or not assignable (NA, grey) cells.
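
The role of the two dotted threshold lines in panel h can be sketched as a simple quadrant rule. This is an illustrative simplification under stated assumptions, not the authors' pipeline: the function name, the example thresholds and the "HET" label are hypothetical, and the clustering step that produces the final genotype assignment (Methods) is not reproduced here.

```python
def initial_genotype_call(wt_reads, mut_reads, wt_threshold, mut_threshold):
    """Quadrant call from per-cell read counts and per-allele thresholds.

    A read count above its threshold is treated as signal; anything at or
    below the threshold is treated as background noise (assumption).
    """
    wt_signal = wt_reads > wt_threshold
    mut_signal = mut_reads > mut_threshold
    if wt_signal and mut_signal:
        return "HET"  # both alleles detected above background
    if mut_signal:
        return "MUT"
    if wt_signal:
        return "WT"
    return "NA"  # not assignable: neither allele rises above background


# Hypothetical thresholds and read counts, for illustration only.
for wt, mut in [(40, 2), (1, 55), (30, 90), (0, 1)]:
    print(wt, mut, initial_genotype_call(wt, mut, wt_threshold=10, mut_threshold=10))

# The 3:1 mutated:WT allelic ratio reported for SET-2 in panel a
# corresponds to an expected mutant allele fraction of 3 / (3 + 1).
expected_set2_mutant_fraction = 3 / (3 + 1)
print(expected_set2_mutant_fraction)
```

A hard quadrant rule like this is sensitive to where the thresholds sit, which is one reason a subsequent clustering-based refinement, as the caption describes, can be useful.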

Extended Data Fig. 3 Multiplexed GoT-ChA protocol for simultaneous capture of multiple targeted loci.

a, Sanger sequencing traces showing the expected genotypes of OCI-AML3, CA46, HEL, and SET-2 cell lines for NRASQ61, TP53M133, TP53R248 and JAK2V617 utilized in the multiplexed-adapted GoT-ChA cell mixing experiment. Extended Data Fig. 2a has JAK2V617 sequencing traces for HEL and SET-2 cells. b, Accessibility-based UMAP for original GoT-ChA protocol for CA46 (grey), HEL (gold), OCI-AML3 (violet) and SET-2 (green) cells. c, Accessibility-based UMAP for multiplexing-adapted GoT-ChA protocol (Methods) for cell lines from b. d, Differential gene accessibility markers (FDR < 0.05, log2FC > 1.25; Wilcoxon rank sum test followed by Benjamini-Hochberg correction) used for cell line identification. e, UMAP coloured by GoT-ChA JAK2V617 genotypes of each cell as wild type (WT, blue), mutant (MUT, red), or not assignable (NA, grey) for original GoT-ChA (left) and multiplexed-adapted GoT-ChA (right). f-i, Same as panel e, but for NRASQ61, TP53M133, TP53R248_1 and TP53R248_2, respectively. j, Percentage of cells genotyped for targeted loci (JAK2V617, NRASQ61, TP53M133, TP53R248_1 and TP53R248_2) for either GoT-ChA original or GoT-ChA adapted protocols (Methods). k, Accuracy for targeted loci and protocols as in j (Methods). l, Distribution of percentage of cells for which a given number of targeted loci were captured, for either the GoT-ChA original or multiplex adapted GoT-ChA protocols (Methods). m, Fraction of cells genotyped according to targeted gene accessibility quantile across targeted loci. Accessibility was assessed as normalized scATAC fragments mapping to the gene body; cells with zero scATAC fragments mapped to the targeted gene were assigned to the first quantile.

Extended Data Fig. 4 Quality control, data integration and doublet filtering of primary samples processed with GoT-ChA.

a, scATAC-seq library fragment size distribution for primary samples, showing expected nucleosomal periodicity. b, Distribution of the number of ATAC fragments per cell for each processed primary sample. Cells with fragment counts below 1,000 or above 50,000 were filtered out. Cell numbers are in Supplementary Table 3. c, Distribution of nucleosome signal per cell for each of the processed primary samples. Cells with nucleosome signal above 4 were filtered out. Cell numbers are in Supplementary Table 3. d, Accessibility-based UMAP for primary samples. e, Accessibility-based UMAP split according to the technology used to generate the scATAC profiles (GoT-ChA [n = 72,318 cells], GoT-ChA-ASAP [n = 62,860 cells] or DOGMA-seq [n = 15,465 cells], Methods). f, UMAP coloured according to multiplet calling (Methods), showing cells (n = 163,964; grey) and multiplets (n = 9,899; red). The multiplet detection rate corresponds to 5.7% of total barcodes. g, Percentage of detected multiplets according to initial Seurat clusters. Cell clusters with multiplet detection above 25% (red) were filtered out. h, Percentage of detected multiplets per primary sample before filtering. i, Count of ATAC fragments per single cell according to multiplet calling as cell (grey) or multiplet (red) for each primary sample. j, Count of detected ATAC features according to multiplet calling as cell (grey) or multiplet (red) for each primary sample. For all boxplots, error bars represent the range, boxes represent the interquartile range and lines represent the median.
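The per-cell filters and the multiplet rate quoted in this legend can be illustrated with a minimal Python sketch. The thresholds (1,000 to 50,000 fragments, nucleosome signal of at most 4) and the barcode counts come from the legend; the `CellQC` record, its field names and the toy barcodes are hypothetical, and this is not the authors' pipeline.

```python
from dataclasses import dataclass

# Thresholds quoted in panels b and c of the legend.
MIN_FRAGMENTS = 1_000
MAX_FRAGMENTS = 50_000
MAX_NUCLEOSOME_SIGNAL = 4.0


@dataclass
class CellQC:
    # Hypothetical per-cell record; field names are illustrative only.
    barcode: str
    n_fragments: int
    nucleosome_signal: float


def passes_qc(cell: CellQC) -> bool:
    """Keep cells with 1,000-50,000 fragments and nucleosome signal <= 4."""
    return (MIN_FRAGMENTS <= cell.n_fragments <= MAX_FRAGMENTS
            and cell.nucleosome_signal <= MAX_NUCLEOSOME_SIGNAL)


def multiplet_rate(n_cells: int, n_multiplets: int) -> float:
    """Multiplets as a fraction of all barcodes (cells plus multiplets)."""
    return n_multiplets / (n_cells + n_multiplets)


# Toy cells (made-up values) covering each filter outcome.
cells = [
    CellQC("AAAC", 800, 1.2),      # too few fragments
    CellQC("AAAG", 12_000, 0.9),   # passes both filters
    CellQC("AAAT", 60_000, 1.0),   # too many fragments
    CellQC("AACA", 9_000, 4.5),    # nucleosome signal too high
]
kept = [c.barcode for c in cells if passes_qc(c)]

# Barcode counts from panel f: 163,964 cells and 9,899 multiplets.
rate = multiplet_rate(163_964, 9_899)
print(kept, f"{rate:.1%}")
```

With the counts from panel f, the computed rate rounds to the 5.7% stated in the legend.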

Extended Data Fig. 5 Marker features for cell cluster identity assignment in primary samples.

a, Differential gene accessibility score (FDR < 0.05, log2FC > 1.25; Wilcoxon rank sum test followed by Benjamini-Hochberg correction) heatmap for each identified cell cluster. Mean gene accessibility and proportion of cells with detected accessibility are shown. b, Representative TF motif accessibility across cell clusters for primary samples (n = 21 samples). c, Genomic track examples of differentially accessible peaks (FDR < 0.05, log2FC > 1; Wilcoxon rank sum test followed by Benjamini-Hochberg correction) across cell clusters. d, Differential TF motif accessibility score (FDR < 0.05, log2FC > 0; Wilcoxon rank sum test followed by Benjamini-Hochberg correction) between HSC, HSCMY and HSCLY clusters. e, Accessibility-based UMAP coloured by the predicted cell type label obtained via bridge integration mapping (Methods). f, Confusion matrix between manually annotated cluster labels and predicted labels based on scRNA-seq reference via bridge integration mapping (Methods).
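Several legends select markers by applying a Benjamini-Hochberg correction to Wilcoxon rank sum P values and then thresholding on FDR and log2FC. The sketch below shows only the correction and thresholding step, in plain Python. It assumes the per-feature P values and log2 fold changes have already been computed; the function names and the toy feature list are hypothetical, and the authors' actual implementation may differ.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_markers(features, fdr_cutoff=0.05, log2fc_cutoff=1.25):
    """Filter (name, p_value, log2fc) tuples by FDR and log2FC cutoffs.

    The defaults mirror the FDR < 0.05, log2FC > 1.25 thresholds
    quoted in the legend.
    """
    fdr = benjamini_hochberg([p for _, p, _ in features])
    return [name for (name, _, lfc), q in zip(features, fdr)
            if q < fdr_cutoff and lfc > log2fc_cutoff]


# Toy input (made-up values): name, Wilcoxon P value, log2 fold change.
features = [
    ("GENE_A", 0.001, 2.0),
    ("GENE_B", 0.01, 0.5),
    ("GENE_C", 0.03, 1.5),
    ("GENE_D", 0.5, 3.0),
]
markers = select_markers(features)
print(markers)
```

In this toy input, GENE_B is dropped for its low fold change and GENE_D for its high adjusted P value.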

Extended Data Fig. 6 Genotype assignment based on GoT-ChA read distribution for primary samples.

a, Accessibility-based UMAP coloured by GoT-ChA genotype assignment (blue = WT; red = homozygous mutant; gold = heterozygous; grey = NA) for each primary sample (n = 21 samples). b, Correlation between JAK2V617F variant allele fraction (VAF) as measured by bulk DNA sequencing (Bulk DNA VAF) and pseudobulk JAK2V617F VAF as estimated from GoT-ChA genotype calls (Spearman’s ρ = 0.64; R² = 0.51; P = 1.2 × 10⁻³; Two-sided F-test). Grey area represents the 95% confidence interval. c, Genotype frequency for Pt-10 JAK2V617 locus as measured by GoT-ChA (n = 8,682 cells) or Mission Bio Tapestri (n = 2,223 cells). HET = JAK2V617F heterozygous; MUT = homozygous JAK2V617F mutant; WT = wild-type. d, Accessibility tracks of normalized ATAC signal across all genotyped cells in the dataset (n = 45,167 cells) for the JAK2 promoter region (± 2 kb from transcriptional start site) for WT (n = 14,878 cells), homozygous MUT (n = 22,842 cells) and HET (n = 7,647 cells). e, Accessibility tracks of normalized ATAC signal across HSCs (n = 7,627 cells), EP1 (n = 11,816 cells), GMP (n = 10,310 cells) or MkP (n = 7,154 cells) clusters for the JAK2 promoter region (± 2 kb from transcriptional start site). f, Percentage of genotyped cells according to JAK2 gene accessibility quantile. Each quantile comprises 100 randomly sampled cells. Quantiles were defined by normalized accessibility score (as reads mapping the gene for every 10,000 reads per cell, Methods), with ranges corresponding to: 0.01 − 10.84 (Quantile 1), 10.85 − 21.67 (Quantile 2), 21.68 − 32.51 (Quantile 3), 32.52 − 43.33 (Quantile 4) and 43.34 − 108.33 (Quantile 5). g, Percentage of genotyped cells and mean JAK2 gene accessibility per cell cluster (Spearman’s ρ = −0.003; R² = 0.015; P = 0.55; Two-sided F-test). Grey area represents the 95% confidence interval. h, Accessibility tracks across the JAK2 gene body (± 2 kb) for each cell cluster. The genomic coordinates corresponding to the JAK2V617F (c.1849G>T) mutation are highlighted in pink.
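The normalization and quantile ranges described in panel f can be sketched as follows. The score definition (reads mapping to the gene per 10,000 reads per cell) and the range boundaries are taken from the legend; the function names are hypothetical, and treating each listed upper bound as inclusive is an assumption made to cover the small gaps left by rounding (for example, between 10.84 and 10.85).

```python
# Upper bounds of Quantiles 1-5 as listed in panel f.
QUANTILE_UPPER_BOUNDS = [10.84, 21.67, 32.51, 43.33, 108.33]
LOWEST_SCORE = 0.01  # the listed ranges start at 0.01


def normalized_accessibility(gene_reads: int, total_reads: int) -> float:
    """Reads mapping to the gene per 10,000 reads in the cell."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return gene_reads * 10_000 / total_reads


def accessibility_quantile(score: float):
    """Return the 1-based quantile for a score, or None if out of range.

    Assumption: upper bounds are inclusive, so a score that falls in a
    rounding gap is placed in the next quantile up.
    """
    if score < LOWEST_SCORE:
        return None
    for index, upper in enumerate(QUANTILE_UPPER_BOUNDS, start=1):
        if score <= upper:
            return index
    return None


# Toy cells (made-up counts): (reads in JAK2, total reads in the cell).
for gene_reads, total_reads in [(3, 12_000), (30, 10_000), (0, 8_000)]:
    score = normalized_accessibility(gene_reads, total_reads)
    print(score, accessibility_quantile(score))
```

A cell with no reads in the gene falls below the lowest listed range and is returned as `None` here; how such cells were handled for this panel is not stated in the legend.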

Extended Data Fig. 7 JAK2V617F-mutated cells are enriched in erythroid, megakaryocyte and granulocyte-monocyte progenitor cells in untreated patients or patients with no clinical response to ruxolitinib.

a, JAK2V617 genotyping efficiency across studies applying single-cell droplet-based genotyping, plotted as mean ± s.d. of biologically independent samples (points). b, Heatmap showing the normalized mutant fraction across indicated HSCs, MEPs, MkPs and erythroid progenitor (EP[1–3]) clusters for untreated (green) or ruxolitinib-treated (yellow) clonal hematopoiesis (CH), polycythemia vera (PV) and myelofibrosis (MF) patient samples with >20 cells genotyped per cluster. c, Normalized fraction of mutated cells in HSCs (n = 1,365 cells), MEP (n = 2,565 cells), erythroid progenitors (EP1, n = 2,315 cells; EP2-3, n = 3,610 cells) and MkP (n = 1,784 cells) in untreated MF patients (each dot represents a single patient; Two-sided Wilcoxon rank sum test; error bars show standard error). d, Normalized fraction of mutated cells in HSCs (n = 1,365 cells), HSCMY (n = 2,970 cells) and GMP (n = 2,209 cells) in untreated MF patients (each dot represents a single patient; Two-sided Wilcoxon rank sum test; error bars show standard error). e, Normalized fraction of mutated cells in HSC (n = 883 cells), MEP (n = 1,352 cells), erythroid progenitors (EP1, n = 1,639 cells; EP2-3, n = 2,482 cells) and MkP (n = 907 cells) in ruxolitinib-treated MF patients (each dot represents a single patient; Two-sided Wilcoxon rank sum test; error bars show standard error). f, Normalized fraction of mutated cells in HSC (n = 883 cells), HSCMY (n = 889 cells) and GMP (n = 2,562 cells) in ruxolitinib-treated MF patients (each dot represents a single patient; Two-sided Wilcoxon rank sum test; error bars show standard error). g, Accessibility-based UMAP coloured by GoT-ChA genotype assignment for the Pt-07 sample (no on-treatment response to ruxolitinib) as WT (n = 994 cells), homozygous MUT (n = 674 cells), HET (n = 193 cells) or not assignable (NA, n = 3,448 cells). h, Top: odds ratio between the fraction of mutated cells in each of the indicated clusters and the fraction of mutated cells for the remaining clusters (Two-sided Fisher Exact test; dots indicate the estimated odds ratio, error bars show the 95% confidence interval; the dotted line indicates an odds ratio of 1, signifying no change). Bottom: total number of cells in each cluster for which genotyping data are available. i, Pseudotime estimation as calculated by Monocle 3 (Methods) for the Pt-07 sample UMAP from (g), setting the HSC cluster as the starting point of the trajectories.
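The comparison in panel h, mutated versus WT cells in one cluster against all remaining clusters, amounts to a 2 × 2 table tested with a two-sided Fisher exact test. The sketch below is a plain Python illustration with made-up counts. The function names are hypothetical, and the simple sample odds ratio shown here is an assumption: statistical packages may report a conditional maximum-likelihood estimate instead, which can differ slightly.

```python
from math import comb


def sample_odds_ratio(a, b, c, d):
    """Sample odds ratio (a*d)/(b*c) for the 2x2 table [[a, b], [c, d]].

    Raises ZeroDivisionError when b or c is zero.
    """
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    observed = prob(a)
    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    # Sum over tables no more likely than the observed one; the small
    # tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= observed * (1 + 1e-7))


# Made-up counts: rows are (cluster of interest, remaining clusters),
# columns are (mutated cells, WT cells).
table = (8, 2, 1, 5)
odds = sample_odds_ratio(*table)
p_value = fisher_exact_two_sided(*table)
print(odds, round(p_value, 4))
```

An odds ratio above 1 indicates enrichment of mutated cells in the cluster of interest relative to the remaining clusters, matching the dotted reference line at 1 in the panel.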

Extended Data Fig. 8 Per sample differences in TF motif accessibility and gene pathway enrichment.

a, Normalized accessibility tracks for genes with increased accessibility in JAK2V617F-mutated HSC and HSCMY clusters (BMPR1B, MMP15) or in WT cells (HLF, BAG2). b, Heatmap for examples of differentially accessible TF motifs in early HSCs and HSCMY clusters in untreated patients. Hierarchical clustering was performed and the heatmap was split by rows defining two expected groups. Colour scale indicates the mean z-score difference between JAK2V617F-mutated and WT cells. TF motifs are defined as upregulated (red) or downregulated (blue) in JAK2V617F. Samples with > 50 cells genotyped in the analysed clusters were included. c, Heatmap showing the TF motif accessibility for those TFs found to be statistically significant between WT (n = 1,902 cells) and JAK2V617F homozygous mutant (n = 1,885 cells) HSCs and HSCMY, including JAK2V617F heterozygous cells (n = 371 cells) for visualization. Colour scale represents row scaled mean z-scores of motif accessibility for the indicated TFs. d, STAT1 TF motif accessibility in a longitudinal sample (Pt-01) that progressed from PV (n = 76 WT cells; n = 30 JAK2V617F cells) to MF (n = 192 WT cells; n = 117 JAK2V617F cells). e, Heatmap of correlation values between STAT TFs (STAT1, STAT2, STAT3, STAT4, STAT5A, STAT5B and STAT6) and TFs involved in the NF-κB pathway (NFKB1, NFKB2, REL, RELA and RELB). Colour scale represents the Spearman’s ρ value. Side barplot represents the mean correlation across columns for the indicated row. f, Jak2RL experiment schematic. Bulk RNA-seq was performed on sorted LSK cells from Jak2V617F and Jak2V617F-deleted mice (top). Pre-ranked gene set enrichment of differentially expressed genes within the erythroid (FDR = 2.5 × 10⁻⁴; normalized enrichment score (NES) = −1.87; heme metabolism Hallmark gene set) and TNF via NF-κB (FDR = 4.1 × 10⁻⁴; NES = −1.59) gene sets in Jak2V617F compared to Jak2V617F-deleted mouse LSK cells (bottom). g, Differential TF motif accessibility (FDR < 0.05, absolute Δz-score > 0.1; Two-sided Wilcoxon rank sum test followed by Benjamini-Hochberg correction) in Pt-19 CH sample within the early stem cell clusters (HSC and HSCMY). h, Heatmap comparing changes in TF motif accessibility between JAK2V617F-mutated and WT early HSC and HSCMY clusters in CH (P < 0.05, absolute Δz-score > 0.1; Two-sided Wilcoxon rank sum test) or MF (FDR < 0.05, absolute Δz-score > 0.1; LMM followed by likelihood ratio test and Benjamini-Hochberg correction). Colour scale represents the Δz-score. Concordant changes (black, same direction in both CH and MF) and significance (red, P < 0.05 in CH; FDR < 0.05 in MF) are shown. i, Heatmap for examples of differentially accessible TF motifs in the MkP cluster in untreated patients. Hierarchical clustering was performed and the heatmap was split by rows defining two expected groups. Colour scale indicates the mean z-score difference between JAK2V617F-mutated and WT cells. TF motifs are defined as upregulated (red) or downregulated (blue) in JAK2V617F. Samples with at least 50 cells genotyped in the analysed clusters were included. j, TF footprinting for JUN comparing WT (blue) and mutant (red) in untreated (n = 12) MF patient samples. Shadowed regions represent the 95% confidence interval. k, Gene set enrichment analysis illustrating an enrichment of Hallmark inflammatory signature in JAK2V617F-mutated MkPs compared to WT MkPs (FDR = 0.15; NES = 1.42). l, Schematic of mouse model experiment. m, Gene set enrichment analysis illustrating a depletion of Jun targets in Jak2V617F-deleted compared to Jak2V617F mouse MEPs (FDR = 4.9 × 10⁻⁵; NES = −1.75). n, Heatmap for examples of differentially accessible TF motifs in the erythroid progenitor (EP[1–3]) clusters in untreated patients. Hierarchical clustering was performed and the heatmap was split by rows defining two expected groups. Colour scale indicates the mean z-score difference between JAK2V617F-mutated and WT cells. TF motifs are defined as upregulated (red) or downregulated (blue) in JAK2V617F. Samples with at least 50 cells genotyped in the analysed clusters were included. o, Gene set enrichment analysis illustrating an enrichment of heme metabolism genes in JAK2V617F erythroid progenitor clusters (EP[1–3]) compared to WT (FDR = 0.05; NES = 1.52). p, Heatmap showing the TF motif accessibility for those TFs found to be statistically significant between WT (n = 1,312 cells) and JAK2V617F homozygous mutant (n = 3,745 cells) EP[1–3] cells, including JAK2V617F heterozygous cells (n = 446 cells) for visualization. Colour scale represents row scaled mean z-scores of motif accessibility for the indicated TFs. q, TF footprinting for BCL11A comparing WT (blue) and mutant (red) in EPs (EP[1–3]; n = 5,925 cells) of untreated MF patient samples. Shadowed areas represent the 95% confidence interval.

Extended Data Fig. 9 Quality control, mitochondrial-based genotype imputation and protein measurements with GoT-ChA-ASAP.

a, scATAC-seq library fragment size distribution for primary samples processed through GoT-ChA-ASAP, showing expected nucleosomal periodicity. b, Distribution of scATAC fragment counts per cell for samples processed through GoT-ChA-ASAP. Cells with fragment counts below 1,000 or above 50,000 were filtered out (Methods). c, Distribution of nucleosome signal per cell for samples processed through GoT-ChA-ASAP. Cells with nucleosome signal above 4 were filtered out (Methods). d, Lineage tree of HSPCs from a patient (ET1) with essential thrombocythemia (ET)22 built from 21,430 clonal SNVs detected within the single-cell expanded clones across the whole genome using CellPhy77. Terminal nodes are coloured based on JAK2 genotype. Cell heteroplasmies for two mitochondrial mutations are shown in the heatmap on the right. e, Heatmap of heteroplasmy of mitochondrial variants per cell per patient sample (Methods). f, Correlation of TF motif accessibility mean Δz-score between JAK2V617F-mutated and WT early HSC and HSCMY clusters between cells genotyped via GoT-ChA-ASAP or via mitochondrial-based genotype imputation for Pt-02 (Pearson’s ρ = 0.94; R² = 0.88; P < 2.2 × 10⁻¹⁶; Two-sided F-test; shadowed area represents the 95% confidence interval). g, Pt-02 UMAP coloured by genotype from GoT-ChA (n = 7,763 cells), GoT-ChA-ASAP (n = 11,602 cells), GoT-ChA-ASAP with mtDNA-based genotype imputation (n = 11,602 cells) or DOGMA-seq with mtDNA-based genotype imputation (n = 15,465 cells), showing percent of genotyped cells. h, Pearson correlation values between mutant cell fractions for each cluster for Pt-02 between methods in g or shuffled control. i, UMAP from g, coloured by cell-surface protein expression from GoT-ChA-ASAP.

Extended Data Fig. 10 Integrated mitochondrial-based genotype imputation with chromatin accessibility, gene expression and protein measurements using GoT-ChA-ASAP.

a, Differential cell surface protein expression rank between JAK2V617F-mutated and WT HSCMY cells in ruxolitinib-treated patients (LMM followed by likelihood ratio test and Bonferroni correction). b, CD90 protein expression in the HSCMY cluster for patients processed with GoT-ChA-ASAP with > 50 genotyped cells in the cluster. Patient Pt-08 was removed due to the presence of additional mutations. Two-sided Wilcoxon rank sum test; Δ represents the effect size. c, CD90 (THY1 gene) imputed gene accessibility scores in HSC and HSCMY clusters for untreated MF samples (n = 12, excluding Pt-01 [PV] and Pt-02) or ruxolitinib-treated samples (n = 6). LMM modelling patient identity as random effects, followed by likelihood ratio test. d, Flow cytometry gating for measurements of CD90 mean fluorescence intensity (MFI) in HSCs defined as Lineage-, CD45+, CD34+, CD38-, CD45RA- cells. e, Correlation between CD90 MFI and JAK2V617F variant allele fraction (VAF) in HSCs (Two-sided F-test). f, Correlation between CD90 MFI and JAK2V617F VAF in the hematopoietic progenitor cell (HPC) compartment defined as Lineage-, CD45+, CD34+, CD38+, CD45RA- cells (n = 71 patients; P > 0.05 [n.s.]; Two-sided F-test; grey area represents the 95% confidence interval). g, Comparison of correlation between JAK2V617F VAF and CD90 MFI within HPCs or HSCs (Methods). Dots represent Spearman’s ρ values, error bars represent the 95% confidence interval, the dotted line marks zero (no correlation). Two-sided F-test. h, CD90 protein expression as measured by Mission Bio Tapestri (Methods) in Pt-11 (n = 195 cells WT; n = 62 cells JAK2V617F). The trend towards increased CD90 in JAK2V617F-mutated HSCs does not reach statistical significance due to low cell number (Two-sided Wilcoxon rank sum test). i, Accessibility track for THY1 in WT (blue) or JAK2V617F-mutated (red) HSC and HSCMY cells defined by mitochondrial-based genotype imputation in the Pt-02 sample processed through DOGMA-seq. Imputed THY1 expression at the RNA level is shown in the violin plot on the right panel (P < 2.2 × 10⁻¹⁶; Two-sided Wilcoxon rank sum test). Peak to gene expression linkage is shown (FDR < 0.05; colour scale shows the correlation value). j, Correlation between WT vs JAK2V617F changes in RNA expression or gene accessibility for the same gene in HSC + HSCMY clusters for Pt-02 DOGMA-seq data. Two-sided F-test. k, RNA expression levels for Pt-02 HSC and HSCMY clusters for BMPR1B (top) and FRY (bottom) in WT (n = 467 cells) and JAK2V617F-mutated cells (n = 163 cells). Two-sided Wilcoxon rank sum test. l, Correlation between WT vs JAK2V617F changes in RNA expression or gene accessibility for the same gene in MkP cluster for Pt-02 DOGMA-seq data. Two-sided F-test. m, CD36 protein expression in the MkP cluster for either untreated (n = 225 cells WT; n = 550 cells JAK2V617F) or ruxolitinib-treated (n = 36 cells WT; n = 108 cells JAK2V617F) MF patients (Two-sided Wilcoxon rank sum test). For all boxplots, error bars represent the range, boxes represent the interquartile range and lines represent the median.

Supplementary information

Supplementary Information

Supplementary Methods and Discussion, Supplementary references and Supplementary Figs 1–6.

Reporting Summary

Supplementary Table 1

GoT–ChA primer sequences.

Supplementary Table 2

Statistics for the Spearman correlation between mean-normalized gene accessibility and the percentage of cells genotyped per accessibility quantile.

Supplementary Table 3

Patient sample cohort.

Supplementary Table 4

Differential gene accessibility score in HSC and HSCMY clusters in untreated patients with myelofibrosis.

Supplementary Table 5

Differential motif accessibility analysis of HSC and HSCMY clusters in untreated patients with myelofibrosis.

Supplementary Table 6

Differential motif accessibility analysis of HSC and HSCMY clusters in ruxolitinib-treated patients with myelofibrosis.

Supplementary Table 7

Differential motif accessibility analysis of MkP cluster in untreated patients with myelofibrosis.

Supplementary Table 8

Differential motif accessibility analysis of EP1, EP2 and EP3 clusters in untreated patients with myelofibrosis.

Supplementary Table 9

Differential motif accessibility analysis of EP1, EP2 and EP3 clusters in ruxolitinib-treated patients with myelofibrosis.

Supplementary Table 10

Mean CD90 protein expression per cell cluster.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Izzo, F., Myers, R.M., Ganesan, S. et al. Mapping genotypes to chromatin accessibility profiles in single cells. Nature (2024). https://doi.org/10.1038/s41586-024-07388-y

  • DOI: https://doi.org/10.1038/s41586-024-07388-y
