

Perspective

Advances in machine-learning-based sampling motivated by lattice quantum chromodynamics

Abstract

Sampling from known probability distributions is a ubiquitous task in computational science, underlying calculations in domains from linguistics to biology and physics. Generative machine-learning (ML) models have emerged as a promising tool in this space, building on the success of this approach in applications such as image, text and audio generation. Often, however, generative tasks in scientific domains have unique structures and features — such as complex symmetries and the requirement of exactness guarantees — that present both challenges and opportunities for ML. This Perspective outlines the advances in ML-based sampling motivated by lattice quantum field theory, in particular for the theory of quantum chromodynamics. Enabling calculations of the structure and interactions of matter from our most fundamental understanding of particle physics, lattice quantum chromodynamics is one of the main consumers of open-science supercomputing worldwide. The design of ML algorithms for this application faces profound challenges, including the necessity of scaling custom ML architectures to the largest supercomputers, but also promises immense benefits, and is spurring a wave of development in ML-based sampling more broadly. In lattice field theory, if this approach can realize its early promise it will be a transformative step towards first-principles physics calculations in particle, nuclear and condensed matter physics that are intractable with traditional approaches.
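
To make the 'exactness guarantee' mentioned above concrete, the minimal sketch below shows how draws from an approximate generative model can be corrected with a Metropolis independence step so that the resulting Markov chain targets the exact distribution. The toy double-well action, the Gaussian stand-in for a trained normalizing flow and all function names are illustrative assumptions introduced here, not the models or lattice actions discussed in this Perspective.

```python
# A minimal, self-contained sketch (not the implementation used in the works
# reviewed here) of the "exactness guarantee" mentioned above: draws from an
# approximate generative model q(x) are corrected by a Metropolis independence
# step, so the Markov chain targets the exact distribution p(x) ~ exp(-S(x)).
# The toy double-well action and the Gaussian "model" (standing in for a
# trained normalizing flow, which would likewise provide samples together with
# an exact, tractable log-density) are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
SIGMA = 1.2  # width of the stand-in proposal model


def action(x):
    # Toy one-dimensional "lattice action": a double well.
    return (x ** 2 - 1.0) ** 2


def log_p_unnormalized(x):
    # Exact (unnormalized) log target density, log p(x) = -S(x) + const.
    return -action(x)


def model_sample():
    # Draw from the stand-in model q = N(0, SIGMA^2).
    return SIGMA * rng.standard_normal()


def model_log_q(x):
    # Exact log-density of the stand-in model (a flow provides the same).
    return -0.5 * (x / SIGMA) ** 2 - np.log(SIGMA * np.sqrt(2.0 * np.pi))


def flow_mcmc(n_steps):
    # Independence Metropolis-Hastings: accept x' with probability
    # min(1, p(x') q(x) / (p(x) q(x'))), which is exact for any model q
    # covering the support of p; model quality only affects efficiency.
    x = model_sample()
    samples, n_accept = [], 0
    for _ in range(n_steps):
        x_prop = model_sample()
        log_ratio = (log_p_unnormalized(x_prop) - log_p_unnormalized(x)
                     + model_log_q(x) - model_log_q(x_prop))
        if np.log(rng.uniform()) < log_ratio:
            x, n_accept = x_prop, n_accept + 1
        samples.append(x)
    return np.array(samples), n_accept / n_steps


samples, acceptance = flow_mcmc(50_000)
print(f"acceptance rate ~ {acceptance:.2f}, <x^2> ~ {np.mean(samples ** 2):.3f}")
```

Because the accept/reject step uses the exact (unnormalized) target density together with the model's tractable density, an imperfect model lowers the acceptance rate but does not bias the sampled distribution; this is the property that makes flow-based proposals attractive for lattice field theory.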


Fig. 1: Depiction of a single cube within the spacetime lattice of a lattice QCD calculation.
Fig. 2: Comparison between the sampling tasks of quantum field generation for lattice quantum chromodynamics and image generation.
Fig. 3: Illustration of a gauge-equivariant transformation layer (refs 80,81).
Fig. 4: Demonstration of the advantages of flow-based sampling in a U(1) lattice gauge theory in two spacetime dimensions (ref. 83).
Fig. 5: Sketch of the upfront and sampling costs of hybrid Monte Carlo compared with flow-based models.
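
As a back-of-the-envelope reading of the cost comparison sketched in Fig. 5, one can write a schematic amortization model; the symbols below (training cost, per-sample costs, break-even ensemble size) are introduced here purely for illustration and are not taken from the figure.

```latex
% Schematic cost model (illustrative symbols, not from Fig. 5):
%   C_train : one-off cost of training the flow-based model
%   c_flow  : cost per independent configuration sampled with the flow
%   c_HMC   : cost per independent configuration from hybrid Monte Carlo
\[
  T_{\mathrm{flow}}(N) = C_{\mathrm{train}} + N\,c_{\mathrm{flow}},
  \qquad
  T_{\mathrm{HMC}}(N) = N\,c_{\mathrm{HMC}},
\]
\[
  T_{\mathrm{flow}}(N) < T_{\mathrm{HMC}}(N)
  \quad\Longleftrightarrow\quad
  N > \frac{C_{\mathrm{train}}}{c_{\mathrm{HMC}} - c_{\mathrm{flow}}}
  \qquad (\text{assuming } c_{\mathrm{flow}} < c_{\mathrm{HMC}}).
\]
```

In this reading the flow-based approach pays a large upfront cost that is amortized over the ensemble, so it is favoured whenever a sufficiently large number of independent configurations is required; if the per-sample cost of the flow is not below that of hybrid Monte Carlo, it never breaks even.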


References

  1. Borsanyi, S. et al. Ab initio calculation of the neutron–proton mass difference. Science 347, 1452–1455 (2015).

  2. Brown, Z. S., Detmold, W., Meinel, S. & Orginos, K. Charmed bottom baryon spectroscopy from lattice QCD. Phys. Rev. D 90, 094507 (2014).


  3. Aaij, R. et al. Observation of two new \({\Xi }_{b}^{-}\) baryon resonances. Phys. Rev. Lett. 114, 062004 (2015).

  4. Aaij, R. et al. Observation of the doubly charmed baryon \({\Xi }_{cc}^{++}\). Phys. Rev. Lett. 119, 112001 (2017).

  5. Joó, B. et al. Status and future perspectives for lattice gauge theory calculations to the exascale and beyond. Eur. Phys. J. A 55, 199 (2019).


  6. Detmold, W. et al. Hadrons and nuclei. Eur. Phys. J. A 55, 193 (2019).


  7. Calì, S., Hackett, D. C., Lin, Y., Shanahan, P. E. & Xiao, B. Neural-network preconditioners for solving the Dirac equation in lattice gauge theory. Phys. Rev. D 107, 034508 (2023).


  8. Lehner, C. & Wettig, T. Gauge-equivariant pooling layers for preconditioners in lattice QCD. Preprint at https://arxiv.org/abs/2304.10438 (2023).

  9. Lehner, C. & Wettig, T. Gauge-equivariant neural networks as preconditioners in lattice QCD. Preprint at https://arxiv.org/abs/2302.05419 (2023).

  10. Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. & Teller, E. Equation of state calculations by fast computing machines. J. Chem. Phys. 21, 1087–1092 (1953).


  11. Duane, S., Kennedy, A. D., Pendleton, B. J. & Roweth, D. Hybrid Monte Carlo. Phys. Lett. B 195, 216–222 (1987).


  12. Chen, D. et al. QCDOC: a 10-teraflops scale computer for lattice QCD. Nucl. Phys. B Proc. Suppl. 94, 825–832 (2001).


  13. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).


  14. Hoffmann, J. et al. Training compute-optimal large language models. Preprint at https://arxiv.org/abs/2203.15556 (2022).

  15. Thoppilan, R. et al. LaMDA: language models for dialog applications. Preprint at https://arxiv.org/abs/2201.08239 (2022).

  16. Peskin, M. E. & Schroeder, D. V. An Introduction to Quantum Field Theory (Addison-Wesley, 1995).

  17. Berezin, F. A. The method of second quantization. Pure Appl. Phys. 24, 1–228 (1966).


  18. Gattringer, C. & Lang, C. B. Quantum Chromodynamics on the Lattice Vol. 788 (Springer, 2010).

  19. Hastings, W. K. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 97–109 (1970).


  20. Schaefer, S., Sommer, R. & Virotta, F. Critical slowing down and error analysis in lattice QCD simulations. Nucl. Phys. B 845, 93–119 (2011).


  21. Beck, C., Hutzenthaler, M., Jentzen, A. & Kuckuck, B. An overview on deep learning-based approximation methods for partial differential equations. Discrete Contin. Dyn. Syst. B 28, 3697–3746 (2023).

  22. Oord, A. v. d. et al. WaveNet: a generative model for raw audio. Preprint at https://arxiv.org/abs/1609.03499 (2016).

  23. Dhariwal, P. & Nichol, A. Diffusion models beat GANs on image synthesis. Adv. Neural Inf. Process. Syst. 34, 8780–8794 (2021).


  24. Saharia, C. et al. Photorealistic text-to-image diffusion models with deep language understanding. Adv. Neural Inf. Process. Syst. 35, 36479–36494 (2022).


  25. Child, R. Very deep VAEs generalize autoregressive models and can outperform them on images. Preprint at https://arxiv.org/abs/2011.10650 (2020).

  26. Kaplan, J. et al. Scaling laws for neural language models. Preprint at https://arxiv.org/abs/2001.08361 (2020).

  27. Brown, T. et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020).


  28. Lieber, O., Sharir, O., Lenz, B. & Shoham, Y. Jurassic-1: Technical Details and Evaluation White Paper (AI21 Labs, 2021).

  29. Rae, J. W. et al. Scaling language models: methods, analysis & insights from training gopher. Preprint at https://arxiv.org/abs/2112.11446 (2021).

  30. Smith, S. et al. Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large-scale generative language model. Preprint at https://arxiv.org/abs/2201.11990 (2022).

  31. Kullback, S. & Leibler, R. A. On information and sufficiency. Ann. Math. Stat. 22, 79–86 (1951).


  32. Goodfellow, I. et al. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 27, 2672–2680 (2014).


  33. Rezende, D. J., Mohamed, S. & Wierstra, D. Stochastic backpropagation and approximate inference in deep generative models. Proceedings of Machine Learning Research 32(2), 1278–1286 (2014).


  34. Kingma, D. P. & Welling, M. Auto-encoding variational Bayes. Preprint at https://arxiv.org/abs/1312.6114 (2014).

  35. Van Oord, A., Kalchbrenner, N. & Kavukcuoglu, K. Pixel recurrent neural networks. Proceedings of Machine Learning Research 48, 1747–1756 (2016).


  36. Chen, C. et al. Continuous-time flows for efficient inference and density estimation. Proceedings of Machine Learning Research 80, 824–833 (2018).


  37. Chen, R. T. & Duvenaud, D. K. Neural networks with cheap differential operators. Adv. Neural Inf. Process. Syst. 32, 9961–9971 (2019).


  38. Papamakarios, G., Nalisnick, E. T., Rezende, D. J., Mohamed, S. & Lakshminarayanan, B. Normalizing flows for probabilistic modeling and inference. J. Mach. Learn. Res. 22, 1–64 (2021).


  39. Rezende, D. & Mohamed, S. Variational inference with normalizing flows. Proceedings of Machine Learning Research 37, 1530–1538 (2015).


  40. Tabak, E. G. & Turner, C. V. A family of nonparametric density estimation algorithms. Commun. Pure Appl. Math. 66, 145–164 (2013).


  41. Dinh, L., Sohl-Dickstein, J. & Bengio, S. Density estimation using real NVP. In International Conference on Learning Representations (ICLR, 2017).

  42. Kingma, D. P. & Dhariwal, P. Glow: generative flow with invertible 1x1 convolutions. Adv. Neural Inf. Process. Syst. 31, 10215–10224 (2018).


  43. Papamakarios, G., Pavlakou, T. & Murray, I. Masked autoregressive flow for density estimation. Adv. Neural Inf. Process. Syst. 30, 2338–2347 (2017).


  44. Huang, C.-W., Dinh, L. & Courville, A. Augmented normalizing flows: bridging the gap between generative flows and latent variable models. Preprint at https://arxiv.org/abs/2002.07101 (2020).

  45. Laszkiewicz, M., Lederer, J. & Fischer, A. Marginal tail-adaptive normalizing flows. Proceedings of Machine Learning Research 162, 12020–12048 (2022).


  46. Wu, H., Köhler, J. & Noé, F. Stochastic normalizing flows. Adv. Neural Inf. Process. Syst. 33, 5933–5944 (2020).


  47. Müller, T., McWilliams, B., Rousselle, F., Gross, M. & Novák, J. Neural importance sampling. ACM Trans. Graph. 38, 1–19 (2019).


  48. Robert, C. P. & Casella, G. Monte Carlo Statistical Methods Vol. 2 (Springer, 1999).

  49. Hoffman, M. et al. Neutralizing bad geometry in Hamiltonian Monte Carlo using neural transport. Preprint at https://arxiv.org/abs/1903.03704 (2019).

  50. Nijkamp, E. et al. Learning energy-based model with flow-based backbone by neural transport MCMC. Preprint at https://arxiv.org/abs/2006.06897 (2020).

  51. Wang, T., Wu, Y., Moore, D. & Russell, S. J. Meta-learning MCMC proposals. Adv. Neural Inf. Process. Syst. 31, 4146–4156 (2018).


  52. Song, J., Zhao, S. & Ermon, S. A-NICE-MC: adversarial training for MCMC. Adv. Neural Inf. Process. Syst. 30, 5140–5150 (2017).


  53. Li, Z., Chen, Y. & Sommer, F. T. A neural network MCMC sampler that maximizes proposal entropy. Entropy 23, 269 (2021).


  54. Huang, L. & Wang, L. Accelerated Monte Carlo simulations with restricted Boltzmann machines. Phys. Rev. B 95, 035105 (2017).


  55. Liu, J., Qi, Y., Meng, Z. Y. & Fu, L. Self-learning Monte Carlo method. Phys. Rev. B 95, 041101 (2017).


  56. Liu, J., Shen, H., Qi, Y., Meng, Z. Y. & Fu, L. Self-learning Monte Carlo method and cumulative update in fermion systems. Phys. Rev. B 95, 241104 (2017).


  57. Nagai, Y., Shen, H., Qi, Y., Liu, J. & Fu, L. Self-learning Monte Carlo method: continuous-time algorithm. Phys. Rev. B 96, 161102 (2017).


  58. Shen, H., Liu, J. & Fu, L. Self-learning Monte Carlo with deep neural networks. Phys. Rev. B 97, 205140 (2018).


  59. Xu, X. Y., Qi, Y., Liu, J., Fu, L. & Meng, Z. Y. Self-learning quantum Monte Carlo method in interacting fermion systems. Phys. Rev. B 96, 041119 (2017).


  60. Chen, C. et al. Symmetry-enforced self-learning Monte Carlo method applied to the Holstein model. Phys. Rev. B 98, 041102 (2018).


  61. Nagai, Y., Okumura, M. & Tanaka, A. Self-learning Monte Carlo method with Behler–Parrinello neural networks. Phys. Rev. B 101, 115111 (2020).


  62. Nagai, Y., Tanaka, A. & Tomiya, A. Self-learning Monte Carlo for non-Abelian gauge theory with dynamical fermions. Phys. Rev. D 107, 054501 (2023).


  63. Pawlowski, J. M. & Urban, J. M. Reducing autocorrelation times in lattice simulations with generative adversarial networks. Mach. Learn. Sci. Technol. 1, 045011 (2020).


  64. Foreman, S. et al. HMC with normalizing flows. PoS LATTICE2021, 073 (2022).


  65. Arbel, M., Matthews, A. & Doucet, A. Annealed flow transport Monte Carlo. Proceedings of Machine Learning Research 139, 318–330 (2021).


  66. Matthews, A. G. D. G., Arbel, M., Rezende, D. J. & Doucet, A. Continual repeated annealed flow transport Monte Carlo. Proceedings of Machine Learning Research 162, 15196–15219 (2022).


  67. Caselle, M., Cellini, E., Nada, A. & Panero, M. Stochastic normalizing flows as non-equilibrium transformations. J. High Energy Phys. 2022, 1–31 (2022).


  68. Veach, E. & Guibas, L. J. Optimally combining sampling techniques for Monte Carlo rendering. In Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques, 419–428 (1995).

  69. Müller, T., Rousselle, F., Keller, A. & Novák, J. Neural control variates. ACM Trans. Graph. 39, 1–19 (2020).


  70. Li, S.-H. & Wang, L. Neural network renormalization group. Phys. Rev. Lett. 121, 260601 (2018).


  71. Li, S.-H., Dong, C.-X., Zhang, L. & Wang, L. Neural canonical transformation with symplectic flows. Phys. Rev. X 10, 021020 (2020).


  72. Tomiya, A. & Nagai, Y. Gauge covariant neural network for 4 dimensional non-Abelian gauge theory. Preprint at https://arxiv.org/abs/2103.11965 (2021).

  73. Tanaka, A. & Tomiya, A. Towards reduction of autocorrelation in HMC by machine learning. Preprint at https://arxiv.org/abs/1712.03893 (2017).

  74. Shorten, C. & Khoshgoftaar, T. M. A survey on image data augmentation for deep learning. J. Big Data 6, 1–48 (2019).


  75. Mitrovic, J., McWilliams, B., Walker, J. C., Buesing, L. H. & Blundell, C. Representation learning via invariant causal mechanisms. In International Conference on Learning Representations (2020).

  76. Rezende, D. J., Racanière, S., Higgins, I. & Toth, P. Equivariant Hamiltonian flows. Preprint at https://arxiv.org/abs/1909.13739 (2019).

  77. Cohen, T. & Welling, M. Group equivariant convolutional networks. Proceedings of Machine Learning Research 48, 2990–2999 (2016).


  78. Fuchs, F., Worrall, D., Fischer, V. & Welling, M. SE(3)-transformers: 3D roto-translation equivariant attention networks. Adv. Neural Inf. Process. Syst. 33, 1970–1981 (2020).


  79. Du, W. et al. SE(3) equivariant graph neural networks with complete local frames. Proceedings of Machine Learning Research 162, 5583–5608 (2022).


  80. Kanwar, G. et al. Equivariant flow-based sampling for lattice gauge theory. Phys. Rev. Lett. 125, 121601 (2020).


  81. Boyda, D. et al. Sampling using SU(N) gauge equivariant flows. Phys. Rev. D 103, 074504 (2021).


  82. Jin, X.-Y. Neural network field transformation and its application in HMC. In The 38th International Symposium on Lattice Field Theory Vol. 396, 600 (PoS, 2022).

  83. Kanwar, G. et al. Equivariant flow-based sampling for lattice gauge theory. Phys. Rev. Lett. 125, 121601 (2020).


  84. Katsman, I. et al. Equivariant manifold flows. Adv. Neural Inf. Process. Syst. 34, 10600–10612 (2021).


  85. Finkenrath, J. Tackling critical slowing down using global correction steps with equivariant flows: the case of the Schwinger model. Preprint at https://arxiv.org/abs/2201.02216 (2022).

  86. de Haan, P., Rainone, C., Cheng, M. & Bondesan, R. Scaling up machine learning for quantum field theory with equivariant continuous flows. Preprint at https://arxiv.org/abs/2110.02673 (2021).

  87. Albergo, M. S. et al. Flow-based sampling for fermionic lattice field theories. Phys. Rev. D 104, 114507 (2021).


  88. Hackett, D. C. et al. Flow-based sampling for multimodal distributions in lattice field theory. Preprint at https://arxiv.org/abs/2107.00734 (2021).

  89. Albergo, M. S., Kanwar, G. & Shanahan, P. E. Flow-based generative models for Markov chain Monte Carlo in lattice field theory. Phys. Rev. D 100, 034515 (2019).


  90. Vaitl, L., Nicoli, K. A., Nakajima, S. & Kessel, P. Path-gradient estimators for continuous normalizing flows. Proceedings of Machine Learning Research, 162, 21945–21959 (2022).


  91. Köhler, J., Klein, L. & Noé, F. Equivariant flows: exact likelihood generative learning for symmetric densities. Proceedings of Machine Learning Research 119, 5361–5370 (2020).


  92. Abbott, R. et al. Gauge-equivariant flow models for sampling in lattice field theories with pseudofermions. Phys. Rev. D 106, 074506 (2022).


  93. Albergo, M. S. et al. Flow-based sampling for fermionic lattice field theories. Phys. Rev. D 104, 114507 (2021).


  94. Abbott, R. et al. Sampling QCD field configurations with gauge-equivariant flow models. In The 39th International Symposium on Lattice Field Theory Vol. 430, 036 (PoS, 2023).

  95. Lüscher, M. Trivializing maps, the Wilson flow and the HMC algorithm. Commun. Math. Phys. 293, 899–919 (2010).


  96. Lüscher, M. & Weisz, P. Perturbative analysis of the gradient flow in non-Abelian gauge theories. J. High Energy Phys. 2011, 1–23 (2011).


  97. Gerdes, M., de Haan, P., Rainone, C., Bondesan, R. & Cheng, M. C. N. Learning lattice quantum field theories with equivariant continuous flows. Preprint at https://arxiv.org/abs/2207.00283 (2022).

  98. Bacchio, S., Kessel, P., Schaefer, S. & Vaitl, L. Learning trivializing gradient flows for lattice gauge theories. Phys. Rev. D 107, L051504 (2023).


  99. Albergo, M. S. et al. Flow-based sampling in the lattice Schwinger model at criticality. Phys. Rev. D 106, 014514 (2022).


  100. Abbott, R. et al. Aspects of scaling and scalability for flow-based sampling of lattice QCD. Preprint at https://arxiv.org/abs/2211.07541 (2022).

  101. Gabbard, H., Messenger, C., Heng, I. S., Tonolini, F. & Murray-Smith, R. Bayesian parameter estimation using conditional variational autoencoders for gravitational-wave astronomy. Nat. Phys. 18, 112–117 (2022).


  102. Musaelian, A. et al. Learning local equivariant representations for large-scale atomistic dynamics. Nat. Commun. 14, 579 (2023).


  103. Singha, A., Chakrabarti, D. & Arora, V. Conditional normalizing flow for Markov chain Monte Carlo sampling in the critical region of lattice field theory. Phys. Rev. D 107, 014512 (2023).


  104. Lehner, C. & Wettig, T. Gauge-equivariant neural networks as preconditioners in lattice QCD. Preprint at https://arxiv.org/abs/2302.05419 (2023).

  105. Sutton, R. The Bitter Lesson (2019); https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson.pdf.


Acknowledgements

We thank W. Detmold and R. D. Young for comments on the manuscript. P.E.S. was supported in part by the US Department of Energy, Office of Science, Office of Nuclear Physics, under grant contract number DE-SC0011090, by Early Career Award DE-SC0021006, by a NEC research award, and by the Carl G. and Shirley Sontheimer Research Fund. G.K. was supported by funding from the Schweizerischer Nationalfonds (grant agreement no. 200020_200424).

Author information


Contributions

The authors contributed equally to all aspects of the article.

Corresponding author

Correspondence to Phiala E. Shanahan.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Reviews Physics thanks Tanmoy Bhattacharya and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Cranmer, K., Kanwar, G., Racanière, S. et al. Advances in machine-learning-based sampling motivated by lattice quantum chromodynamics. Nat Rev Phys 5, 526–535 (2023). https://doi.org/10.1038/s42254-023-00616-w

