Deep reinforced learning heuristic tested on spin-glass ground states: The larger picture

Boettcher, Stefan

doi:10.1038/s41467-023-41106-y

Matters Arising
Open access
Published: 14 September 2023

Stefan Boettcher ORCID: orcid.org/0000-0003-1273-6771¹

Nature Communications volume 14, Article number: 5658 (2023)

Matters Arising to this article was published on 14 September 2023

The Original Article was published on 09 February 2023

arising from C. Fan et al. Nature Communications https://doi.org/10.1038/s41467-023-36363-w (2023)

In ref. ¹, the authors present a deep reinforced learning approach to augment combinatorial optimization heuristics. In particular, they present results for several spin glass ground state problems², for which instances on non-planar networks are generally NP-hard, in comparison with several Monte Carlo-based methods, such as simulated annealing (SA) or parallel tempering (PT)³. Here, we examine those results in the context of the well-established literature and find that, although the method is fast and capable for small instance sizes, the presentation offers no sign of the claimed superiority for larger instances, unless one competes with Greedy Search for speed.

Indeed, the results of ref. ¹ demonstrate that the reinforced learning improves the results over those obtained with SA or PT, or at least allows for reduced runtimes for the heuristics before results of comparable quality have been obtained relative to those other methods. To facilitate the conclusion that their method is “superior”, the authors of ref. ¹ pursue two basic strategies: (1) A commercial GUROBI solver (see https://www.gurobi.com/) is called on to procure a sample of exact ground states as a testbed to compare with, and (2) a head-to-head comparison between the heuristics is given for a sample of larger instances where exact ground states are hard to ascertain. Here, we put these studies into a larger context, showing that the claimed superiority is at best marginal for smaller samples and becomes essentially irrelevant with respect to any sensible approximation of true ground states in the larger samples. For example, this method becomes irrelevant as a means to determine stiffness exponents θ in d > 2, as mentioned by the authors, where the problem is not only NP-hard but also requires the subtraction of two almost equal ground-state energies, so that systematic errors of ≈ 1% in each, as found here, are unacceptable⁴. This larger picture of the method arises from a straightforward finite-size corrections study over the spin glass ensembles the authors employ, using data that has been available for decades⁵,⁶.

In our investigation here, we focus mainly on two ensembles of NP-hard problems the authors utilize: the Edwards–Anderson spin glass on a cubic lattice (EA in d = 3) with periodic boundary conditions⁷ and the mean-field (all-to-all connected) Sherrington–Kirkpatrick spin glass (SK)⁸. The ensemble for both models consists of instances where all bonds are chosen randomly from a normal distribution of zero mean and unit variance. The ensemble is parametrized by its size, i.e., the number of variables N in a spin configuration $\overrightarrow{\sigma}$, where N = L³ in the case of EA. With those hard combinatorial problems, there are many ways to find exact solutions for instances of small N, such as a solver like GUROBI. However, for any practical application at large N, the super-polynomial rise in complexity necessitates the use of heuristic methods. Thus, the scalability of a heuristic is of particular concern. In the formal study of computational complexity, this is typically addressed by establishing bounds on an all-encompassing worst-case scenario⁹. For many complicated meta-heuristics¹⁰, such as the method presented here, insights into the capability of a heuristic can be gained only from comparative studies over widely accepted testbeds of instances or those selected from specific ensembles. The authors have clearly adopted the ensemble approach¹.
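To make the EA ensemble described above concrete, the following is a minimal Python sketch of how one instance could be drawn and how the energy density of a spin configuration could be evaluated. It assumes the common sign convention H = −Σ J_ij σ_i σ_j over nearest-neighbor bonds; the function names (`make_ea_bonds`, `neighbor`, `energy_density`) are hypothetical and are not taken from ref. ¹.

```python
import random


def make_ea_bonds(L, rng):
    """Draw Gaussian bonds (zero mean, unit variance) for an L^3 cubic lattice.

    bonds[i][k] couples site i to its neighbor in the positive direction k,
    so every bond of the periodic lattice is stored exactly once (for L >= 3).
    """
    return [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(L ** 3)]


def neighbor(i, k, L):
    """Index of the neighbor of site i in positive direction k, periodic boundaries."""
    coords = [i % L, (i // L) % L, i // (L * L)]
    coords[k] = (coords[k] + 1) % L
    return coords[0] + L * coords[1] + L * L * coords[2]


def energy_density(spins, bonds, L):
    """H / N with the assumed convention H = -sum_<ij> J_ij s_i s_j and N = L^3."""
    n = L ** 3
    h = 0.0
    for i in range(n):
        for k in range(3):
            h -= bonds[i][k] * spins[i] * spins[neighbor(i, k, L)]
    return h / n


# Usage: one random instance with a random (far from optimal) configuration.
rng = random.Random(1)
L = 4
bonds = make_ea_bonds(L, rng)
spins = [rng.choice((-1, 1)) for _ in range(L ** 3)]
e_random = energy_density(spins, bonds, L)
```

Any heuristic discussed here would then try to lower `energy_density` by changing `spins`; averaging the best value found over many independently drawn `bonds` gives the ensemble average at that size.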

Especially with regard to scalability, the ensemble picture deserves particular attention, for the following reasons. Those ensembles typically have a “thermodynamic limit”, i.e., their averages are well-defined and possess a clear meaning for N → ∞, which is a typical large-instance approach. At times, that limit may even be solvable, such as in the case of SK¹¹, but that is not essential here, as exemplified by EA. More importantly, that limit is usually attained in an equally well-defined manner through finite-size corrections (FSC). To be specific in this context, for the cost function a heuristic is trying to minimize, the authors have chosen the ground state energy density, $e_{0}=\min_{\overrightarrow{\sigma}}H\left(\overrightarrow{\sigma}\right)/N$, of the Hamiltonian H for each of their (physically motivated) spin glass ensembles. Instances are generated via random choices of bonds J_ij from a characteristic distribution P(J), see Eq. (1) in ref. ¹. If the thermodynamic limit for the ensemble-averaged ground-state energy density $\left\langle e_{0}\right\rangle_{N=\infty}$ exists, FSC assumes the asymptotic scaling form

$$\left\langle e_{0}\right\rangle_{N} \sim \left\langle e_{0}\right\rangle_{N=\infty}+\frac{A}{N^{\omega}}+\ldots,\qquad (N\to \infty), \tag{1}$$

for a constant A and a correction exponent ω (> 0). Clearly, other forms of corrections might exist and higher-order terms could well obscure the assumed behavior deep into the large-N regime. Yet, self-consistency with the form in Eq. (1) of the actual data for small N, where reliable (or exact) results can be ascertained, often provides a powerful baseline to assess the scalability of a heuristic¹²,¹³. This is certainly the case here, and it provides a larger picture of the results in ref. ¹.
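The baseline idea can be sketched in a few lines of Python. For a fixed exponent ω, Eq. (1) is linear in x = N^(−ω), so an ordinary least-squares line through reliable small-N data yields estimates of the thermodynamic limit and of A, and a heuristic's result at larger N can then be compared with the extrapolated line. The function names and every number below are made up for illustration; none of them is data from ref. ¹ or from the literature cited here.

```python
def fit_fsc(sizes, energies, omega):
    """Least-squares fit of e(N) = e_inf + A * N**(-omega) for a fixed omega.

    With omega held fixed, the form is linear in x = N**(-omega),
    so ordinary linear regression suffices. Returns (e_inf, A).
    """
    xs = [n ** (-omega) for n in sizes]
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(energies) / m
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, energies))
    a = sxy / sxx
    return y_mean - a * x_mean, a


def excess_over_fit(n, energy, e_inf, a, omega):
    """Relative gap between a heuristic's energy density and the fitted line."""
    predicted = e_inf + a * n ** (-omega)
    return (energy - predicted) / abs(predicted)


# Synthetic, hypothetical numbers purely for illustration.
E_INF, A, OMEGA = -1.70, 2.0, 0.9
small_sizes = [27, 64, 125, 216]
reference = [E_INF + A * n ** (-OMEGA) for n in small_sizes]

e_inf_fit, a_fit = fit_fsc(small_sizes, reference, OMEGA)

# A hypothetical heuristic result at N = 1000 lying above the extrapolated line.
gap = excess_over_fit(1000, -1.68, e_inf_fit, a_fit, OMEGA)
```

A positive `gap` that grows with N is the signature of a heuristic drifting away from the ground states; in practice ω is either known from theory or has to be varied until the small-N data fall on a straight line.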

Long before the PT results³ that the authors reference in their study of EA in d = 3, virtually identical results had been found by Pal⁵ using a genetic algorithm (GA). Despite the doubts the authors raise in the caption of their Fig. 5, both the PT and the GA data exhibit a consistent scaling picture, shown here in Fig. 1. (Note that several references in ref. ¹ are incorrect, e.g., in the caption to Fig. 5, “ref. 51” should be ref. 50 and the label “f” should be “d” for the 3d-EA at L = 10.) While the authors do not provide any tabulated data for their corresponding results, at least for the larger samples we can extract estimated values for their best results (for DIRAC-SA, shown as red circles in Fig. 1) from the plots provided in their Fig. S5d–f. There, the authors take the fact that the DIRAC-SA data is better than either PT or SA as evidence of the superiority of their method. However, considering how far separated from any actual ground states every one of the datasets employed in this comparison really is, this advantage, whether in speed or in accuracy, is rather inconsequential in the larger picture of Fig. 1.

**Fig. 1: Extrapolation plot according to the finite-size corrections form in Eq. (1) for the ensemble-averaged ground state energy densities obtained with various heuristics for EA in d = 3.**

Similarly, the results the authors provide for SK prove inconclusive in the larger picture of long-established results for this case⁶,¹³,¹⁴. Here, ref. ¹ merely provides results of their method for quite small instances, where GUROBI allows exact ground states to be obtained for comparison. While these results are indeed consistent with the predicted scaling, as shown in Fig. 2, the sizes bounded by N ≤ 216 considered in their study have very limited predictive power about the scalability of their method for any size that would make their method competitive, either in speed or in accuracy, with state-of-the-art heuristics at larger N. After all, with an ensemble approach, it is not necessary to rely on exactly solved instances to make impactful comparisons, as our discussion of EA demonstrates.

**Fig. 2: Extrapolation plot of ensemble-averaged ground state energy densities for SK according to Eq. (1) with ω = 2/3.**
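To illustrate what an exact testbed for very small SK instances involves, the sketch below draws Gaussian couplings and finds the ground-state energy density by exhaustive enumeration. This is not the GUROBI-based procedure of ref. ¹; enumeration is swapped in only because it needs no external solver, and it is feasible only for N of roughly 20 or less. The normalization H = −(1/√N) Σ_{i<j} J_ij σ_i σ_j and all function names are assumptions made for this example.

```python
import itertools
import math
import random


def make_sk_bonds(n, rng):
    """Gaussian couplings (zero mean, unit variance) for every pair i < j."""
    return {(i, j): rng.gauss(0.0, 1.0) for i in range(n) for j in range(i + 1, n)}


def sk_energy_density(spins, bonds):
    """H / N with the assumed H = -(1/sqrt(N)) * sum_{i<j} J_ij s_i s_j."""
    n = len(spins)
    h = -sum(J * spins[i] * spins[j] for (i, j), J in bonds.items())
    return h / (math.sqrt(n) * n)


def exact_ground_state_density(bonds, n):
    """Minimum energy density by enumerating all configurations (small n only)."""
    best = math.inf
    # Fix spin 0 to +1: the energy is invariant under a global spin flip,
    # so only 2**(n - 1) configurations need to be visited.
    for rest in itertools.product((-1, 1), repeat=n - 1):
        best = min(best, sk_energy_density((1,) + rest, bonds))
    return best


def ensemble_average(n, samples, seed=0):
    """Average exact ground-state energy density over independent instances."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        total += exact_ground_state_density(make_sk_bonds(n, rng), n)
    return total / samples


# Usage: a tiny ensemble at N = 8.
e0_avg = ensemble_average(8, samples=20)
```

Averages of this kind, computed at several small N, are the points one would place on an extrapolation plot against N^(−2/3); the exponential cost of the enumeration is precisely why heuristics are needed beyond such sizes.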

In conclusion, a comparison with existing data shows little evidence for the claimed superiority of the deep reinforcement learning strategy to enhance optimization heuristics proposed in ref. ¹. The comparison provided here for both a sparse short-range and a dense infinite-range spin glass model is quite exemplary for all the ensembles the authors discuss, so this conclusion is likely not particular to these two cases. The authors should be lauded for having demonstrated some gains relative to simple greedy algorithms for EA¹⁵, but their results remain too far from optimality, even if under the < 1% level we found in Fig. 1, to be of any use in the applications to the physics of spin glasses the authors imply. For example, in the stiffness problem, one determines the ground state of an instance in EA and again for reversed boundary conditions, which inserts a relative domain wall between the ground states with separate energies $e_{0}^{1,2}(L) \sim \left\langle e_{0}\right\rangle_{L=\infty}+A_{1,2}/L^{d\omega}+\ldots$. That domain wall has a much smaller energy, $\Delta e=\left|e_{0}^{1}-e_{0}^{2}\right|\sim \Delta A/L^{d\omega}\to 0$, which relates FSC to the stiffness exponent via dω = d − θ¹², as used in Fig. 1. These exponents were determined for EA in dimensions d = 3, …, 7 by finding ground states for millions of dilute lattices with up to N = 10⁷ using a hybrid extremal optimization (EO) algorithm⁴,¹⁶. Hence, the heuristics chosen as a base for their comparison are surprisingly narrow, considering that the authors refer to ref. ² for the use of heuristics for spin glasses, which also discusses GA and EO.
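The stiffness argument can be spelled out in a short worked derivation. It is only a sketch under the leading-order form of Eq. (1) with N = L^d; the symbols ΔE (total domain-wall energy) and ε (relative systematic error of a heuristic) are introduced here for illustration and do not appear in the text above. Multiplying the difference in energy densities by the number of spins gives

$$\Delta E = L^{d}\,\Delta e \sim \Delta A\,L^{d-d\omega} = \Delta A\,L^{\theta}, \qquad d\omega = d-\theta .$$

If a heuristic misses each of the two ground-state energy densities by a relative error ε, and those errors do not happen to cancel, the uncertainty in their difference is of order $\epsilon\left|\left\langle e_{0}\right\rangle_{L=\infty}\right|$ and does not shrink with L, whereas the signal does:

$$\frac{\Delta e}{\epsilon\left|\left\langle e_{0}\right\rangle_{L=\infty}\right|} \sim \frac{\Delta A}{\epsilon\left|\left\langle e_{0}\right\rangle_{L=\infty}\right|}\,L^{-(d-\theta)} \to 0 \qquad (L\to\infty).$$

For ε ≈ 1% the ratio falls below one once $L^{d-\theta}$ exceeds roughly $100\,\Delta A/\left|\left\langle e_{0}\right\rangle_{L=\infty}\right|$, which is why errors at that level leave the domain-wall energy unresolved.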

Data availability

Most data discussed in this comment is already available directly from the respective references cited. Beyond that, any data presented or discussed in this comment is also available on request from the author via email.

References

1. Fan, C. et al. Searching for spin glass ground states through deep reinforcement learning. Nat. Commun. 14, 725 (2023).
2. Hartmann, A. & Rieger, H. (eds.) New Optimization Algorithms in Physics (Wiley-VCH, Berlin, 2004).
3. Wang, W., Machta, J. & Katzgraber, H. G. Comparing Monte Carlo methods for finding ground states of Ising spin glasses: population annealing, simulated annealing, and parallel tempering. Phys. Rev. E 92, 013303 (2015).
4. Boettcher, S. Stiffness of the Edwards–Anderson model in all dimensions. Phys. Rev. Lett. 95, 197205 (2005).
5. Pal, K. F. The ground state of the cubic spin-glass with short-range interactions of Gaussian distribution. Physica A 233, 60–66 (1996).
6. Boettcher, S. Extremal optimization for Sherrington–Kirkpatrick spin glasses. Eur. Phys. J. B 46, 501–505 (2005).
7. Edwards, S. F. & Anderson, P. W. Theory of spin glasses. J. Phys. F 5, 965–974 (1975).
8. Sherrington, D. & Kirkpatrick, S. Solvable model of a spin-glass. Phys. Rev. Lett. 35, 1792–1796 (1975).
9. Garey, M. R. & Johnson, D. S. Computers and Intractability: A Guide to the Theory of NP-Completeness (W. H. Freeman, New York, 1979).
10. Martello, S., Osman, I., Roucairol, C. & Voss, S. (eds.) Meta-Heuristics: Advances and Trends in Local Search Paradigms for Optimization (Kluwer, Boston, 1999).
11. Mézard, M., Parisi, G. & Virasoro, M. A. Spin Glass Theory and Beyond (World Scientific, Singapore, 1987).
12. Boettcher, S. & Falkner, S. Finite-size corrections for ground states of Edwards–Anderson spin glasses. Europhys. Lett. 98, 47005 (2012).
13. Boettcher, S. Analysis of the relation between quadratic unconstrained binary optimization and the spin-glass ground-state problem. Phys. Rev. Res. 1, 033142 (2019).
14. Aspelmeier, T., Billoire, A., Marinari, E. & Moore, M. A. Finite size corrections in the Sherrington–Kirkpatrick model. J. Phys. A: Math. Theor. 41, 324008 (2008).
15. Boettcher, S. Inability of a graph neural network heuristic to outperform greedy algorithms in solving combinatorial optimization problems. Nat. Mach. Intell. 5, 24–25 (2022).
16. Boettcher, S. Low-temperature excitations of dilute lattice spin glasses. Europhys. Lett. 67, 453–459 (2004).

Author information

Authors and Affiliations

Department of Physics, Emory University, Atlanta, GA, 30322, USA
Stefan Boettcher

Contributions

The author assembled the research presented in this manuscript and wrote the paper.

Corresponding author

Correspondence to Stefan Boettcher.

Ethics declarations

Competing interests

The author declares no competing interests.

Peer review

Peer review information

Nature Communications thanks Federico Ricci-Tersenghi, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Boettcher, S. Deep reinforced learning heuristic tested on spin-glass ground states: The larger picture. Nat Commun 14, 5658 (2023). https://doi.org/10.1038/s41467-023-41106-y

Received: 19 February 2023
Accepted: 07 August 2023
Published: 14 September 2023
DOI: https://doi.org/10.1038/s41467-023-41106-y

This article is cited by

Reply to: Deep reinforced learning heuristic tested on spin-glass ground states: The larger picture
- Changjun Fan
- Mutian Shen
- Yang-Yu Liu
Nature Communications (2023)
