arising from C. Fan et al. Nature Communications https://doi.org/10.1038/s41467-023-36363-w (2023)
In ref. 1, the authors present a deep reinforced learning approach to augment combinatorial optimization heuristics. In particular, they present results for several spin glass ground state problems2, for which instances on non-planar networks are generally NP-hard, in comparison with several Monte Carlo-based methods, such as simulated annealing (SA) or parallel tempering (PT)3. Here, we examine those results in the context of well-established literature and find that, albeit fast and capable for small instance sizes, the presentation lacks signs of the claimed superiority for larger instances, unless one competes with Greedy Search for speed.
Indeed, the results of ref. 1 demonstrate that the reinforced learning improves the results over those obtained with SA or PT, or at least allows for reduced runtimes for the heuristics before results of comparable quality have been obtained relative to those other methods. To facilitate the conclusion that their method is “superior”, the authors of ref. 1 pursue two basic strategies: (1) the commercial GUROBI solver (see https://www.gurobi.com/) is called on to procure a sample of exact ground states as a testbed to compare with, and (2) a head-to-head comparison between the heuristics is given for a sample of larger instances where exact ground states are hard to ascertain. Here, we put these studies into a larger context, showing that the claimed superiority is at best marginal for smaller samples and becomes essentially irrelevant with respect to any sensible approximation of true ground states in the larger samples. For example, this method becomes irrelevant as a means to determine stiffness exponents θ in d > 2, as mentioned by the authors, where the problem is not only NP-hard but requires the subtraction of two almost equal ground-state energies, so that systematic errors of ≈ 1% in each, as found here, are unacceptable4. This larger picture of the method arises from a straightforward finite-size corrections study over the spin glass ensembles the authors employ, using data that has been available for decades5,6.
In our investigation here, we focus mainly on two ensembles of NP-hard problems the authors utilize: the Edwards–Anderson spin glass on a cubic lattice (EA in d = 3) with periodic boundary conditions7 and the mean-field (all-to-all connected) Sherrington–Kirkpatrick spin glass (SK)8. The ensemble for both models consists of instances where all bonds are chosen randomly from a normal distribution of zero mean and unit variance. The ensemble is parametrized by its size, i.e., the number of variables N in a spin configuration \(\overrightarrow{\sigma }\), where N = L³ in the case of EA. With those hard combinatorial problems, there are many ways to find exact solutions for instances of small N, such as a solver like GUROBI. However, for any practical application at large N, the super-polynomial rise in complexity necessitates the use of heuristic methods. Thus, the scalability of a heuristic is of particular concern. In the formal study of computational complexity, this is typically addressed by establishing bounds on an all-encompassing worst-case scenario9. For many complicated meta-heuristics10, such as the method presented in ref. 1, insights into the capability of a heuristic can be gained only from comparative studies over widely accepted testbeds of instances or those selected from specific ensembles. The authors have clearly adopted the ensemble approach1.
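To make the EA ensemble described above concrete, the following is a minimal sketch of how one instance could be drawn and how the energy density of a spin configuration is evaluated. The function names, the sign convention H = −Σ J_ij σ_i σ_j, and the exhaustive search are assumptions made for illustration; they are not taken from ref. 1, and the exhaustive search is only feasible for very small N.

```python
import itertools
import random

def make_ea_bonds(L, rng):
    """Gaussian bonds (zero mean, unit variance) on an L x L x L cubic lattice
    with periodic boundaries. Returns a list of (i, j, J_ij), one bond per site
    and lattice direction. For L = 2 the wrap-around links each neighbor pair twice."""
    def site(x, y, z):
        return (x % L) * L * L + (y % L) * L + (z % L)

    bonds = []
    for x, y, z in itertools.product(range(L), repeat=3):
        i = site(x, y, z)
        for dx, dy, dz in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            j = site(x + dx, y + dy, z + dz)
            bonds.append((i, j, rng.gauss(0.0, 1.0)))
    return bonds

def energy_density(bonds, spins):
    """H(sigma)/N, assuming the convention H = -sum_<ij> J_ij sigma_i sigma_j."""
    h = -sum(J * spins[i] * spins[j] for i, j, J in bonds)
    return h / len(spins)

def brute_force_ground_state(bonds, n):
    """Exhaustive minimum over all 2**n configurations (tiny n only)."""
    best = min(itertools.product((-1, 1), repeat=n),
               key=lambda s: energy_density(bonds, s))
    return energy_density(bonds, best), best

rng = random.Random(0)          # fixed seed, for reproducibility of the sketch
L = 2                           # N = L**3 = 8 spins, so 256 configurations
bonds = make_ea_bonds(L, rng)
e0, sigma = brute_force_ground_state(bonds, L ** 3)
```

Already at L = 4 the exhaustive search would have to visit 2⁶⁴ configurations, which illustrates why exact solvers are limited to small N and why the scaling of heuristics matters.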
Especially with regard to scalability, the ensemble picture deserves particular attention, for the following reasons. Those ensembles typically have a “thermodynamic limit”, i.e., their averages are well-defined and possess a clear meaning for N → ∞, which is a typical large instance approach. At times, that limit may even be solvable, such as in the case of SK11, but that is not essential here, as exemplified by EA. More importantly, that limit is usually attained in an equally well-defined manner through finite-size corrections (FSC). To be specific in this context, for the cost function a heuristic is trying to minimize, the authors have chosen the ground state energy density, \({e}_{0}=\mathop{\min }\nolimits_{\overrightarrow{\sigma }}H\left(\overrightarrow{\sigma }\right)/N\), of the Hamiltonian H for each of their (physically motivated) spin glass ensembles. Instances are generated via random choices of bonds Jij from a characteristic distribution P(J), see Eq. (1) in ref. 1. If the thermodynamic limit for the ensemble-averaged ground-state energy density \({\left\langle {e}_{0}\right\rangle }_{N=\infty }\) exists, FSC assumes the asymptotic scaling form
\[{\left\langle {e}_{0}\right\rangle }_{N} \sim {\left\langle {e}_{0}\right\rangle }_{N=\infty }+\frac{A}{{N}^{\omega }},\qquad (N\to \infty ),\qquad (1)\]
for a constant A and a correction exponent ω(>0). Clearly, other forms of corrections might exist and higher-order terms could well obscure the assumed behavior deep into the large-N regime. Yet, self-consistency with the form in Eq. (1) of the actual data for small N, where reliable (or exact) results can be ascertained, often provides a powerful baseline to assess the scalability of a heuristic12,13. This is certainly the case here, and it provides a larger picture of the results in ref. 1.
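As an illustration of how such a self-consistency check can be carried out, the sketch below fits ensemble-averaged energies to the form ⟨e0⟩_N = ⟨e0⟩_∞ + A/N^ω for a fixed, assumed exponent ω, using an ordinary least-squares line in the variable x = N^(−ω). The data are synthetic and all numerical values are arbitrary placeholders rather than results from any of the references; the helper names are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def extrapolate_fsc(sizes, energies, omega):
    """Fit <e0>_N = e_inf + A / N**omega for a fixed, assumed omega.

    Returns (e_inf, A): the intercept is the extrapolated thermodynamic limit."""
    xs = [n ** (-omega) for n in sizes]
    return linear_fit(xs, energies)

def excess_over_fsc(n, measured, e_inf, amplitude, omega):
    """How far a heuristic's average energy at size n lies above the FSC line."""
    return measured - (e_inf + amplitude / n ** omega)

# Synthetic illustration only: placeholder numbers, not data from any reference.
true_e_inf, true_A, omega = -1.0, 0.5, 0.7
sizes = [64, 125, 216, 512, 1000]
energies = [true_e_inf + true_A / n ** omega for n in sizes]

e_inf, amplitude = extrapolate_fsc(sizes, energies, omega)
gap = excess_over_fsc(1000, energies[-1] + 0.01, e_inf, amplitude, omega)
```

In this picture, reliable small-N data fix the line, and a heuristic whose averages at larger N drift systematically above it (a positive `gap`) is failing to reach ground states, even when no exact solutions are available at those sizes.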
Long before the PT results3 that the authors reference in their study of EA in d = 3, virtually identical results had been found by Pal5 using a genetic algorithm (GA). Despite the doubts the authors raise in the caption of their Fig. 5, both the PT and the GA data exhibit a consistent scaling picture, shown here in Fig. 1. (Note that several references in ref. 1 are incorrect, e.g., in the caption to Fig. 5, “ref. 51” should be ref. 50 and the label “f” should be “d” for the 3d-EA at L = 10.) While the authors do not provide any tabulated data for their corresponding results, at least for the larger samples we can extract estimated values for their best results (for DIRAC-SA, shown as red circles in Fig. 1) from the plots provided in their Fig. S5d–f. There, the authors take the fact that the DIRAC-SA data is better than either PT or SA as evidence of the superiority of their method. However, considering how far every one of the datasets employed in this comparison is from any actual ground states, this advantage, whether in speed or in accuracy, is rather inconsequential in the larger picture of Fig. 1.
Similarly, the results the authors provide for SK prove inconclusive in the larger picture of long-established results for this case6,13,14. Here, ref. 1 merely provides results of their method for quite small instances, where GUROBI allows exact ground states to be obtained for comparison. While these results are indeed consistent with the predicted scaling, as shown in Fig. 2, the sizes considered in their study, bounded by N ≤ 216, have very limited predictive power about the scalability of their method for any size that would make it competitive, either in speed or in accuracy, with state-of-the-art heuristics at larger N. After all, with an ensemble approach, it is not necessary to rely on exactly solved instances to make impactful comparisons, as our discussion of EA demonstrates.
In conclusion, a comparison with existing data shows little evidence for the claimed superiority of the deep reinforcement learning strategy to enhance optimization heuristics proposed in ref. 1. The comparison provided here for both a sparse short-range and a dense infinite-range spin glass model is quite exemplary for all the ensembles the authors discuss, so that this conclusion is likely not particular to these two cases. The authors should be lauded for having demonstrated some gains relative to simple greedy algorithms for EA15, but their results remain too far from optimality, even if the errors are below the 1% level found in Fig. 1, to be of any use in the applications to the physics of spin glasses that the authors imply. For example, in the stiffness problem, one determines the ground state of an instance in EA and again for reversed boundary conditions, which inserts a relative domain wall between the ground states with separate energies \({e}_{0}^{1,2}(L) \sim {\left\langle {e}_{0}\right\rangle }_{L=\infty }+{A}_{1,2}/{L}^{d\omega }+\ldots\). That domain wall has a much smaller energy, \(\Delta e=\left|{e}_{1}-{e}_{2}\right|\sim \Delta A/{L}^{d\omega }\to 0\), which relates FSC to the stiffness exponent via dω = d − θ (ref. 12), as used in Fig. 1. These exponents were determined for EA in dimensions d = 3, …, 7 by finding ground states for millions of dilute lattices with up to N = 10⁷ using a hybrid EO algorithm4,16. Hence, the set of heuristics chosen as a base for their comparison is surprisingly narrow, considering that the authors refer to ref. 2 for the use of heuristics for spin glasses, which also discusses GA and EO.
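The stiffness argument above can be turned into a rough back-of-the-envelope check. The sketch below compares the domain-wall signal Δe ~ ΔA/L^(d−θ) with the absolute error that a heuristic with a given relative error makes on each ground-state energy density. The values used for θ, ΔA and the limiting energy density are hypothetical placeholders chosen only to illustrate the orders of magnitude; they are not quoted from the references, and the function names are likewise assumptions.

```python
def domain_wall_energy_density(L, d, theta, delta_A=1.0):
    """Delta e ~ delta_A / L**(d * omega), using d * omega = d - theta."""
    return delta_A / L ** (d - theta)

def heuristic_error_swamps_signal(L, d, theta, rel_error, e_inf, delta_A=1.0):
    """True if the heuristic's absolute error per ground state exceeds the
    energy-density difference that the stiffness measurement must resolve."""
    signal = domain_wall_energy_density(L, d, theta, delta_A)
    noise = rel_error * abs(e_inf)
    return noise > signal

# Placeholder inputs (assumptions, not values taken from the references):
d, theta = 3, 0.25        # dimension and an illustrative stiffness exponent
e_inf = -1.7              # illustrative limiting energy density
rel_error = 0.01          # the ~1% error level discussed in the text

for L in (4, 6, 8, 10):
    swamped = heuristic_error_swamps_signal(L, d, theta, rel_error, e_inf)
    print(L, domain_wall_energy_density(L, d, theta), swamped)
```

With these placeholder inputs, the signal falls below a 1% error within the first few lattice sizes, and the gap widens rapidly with L, which is the sense in which errors at that level are unacceptable for determining θ.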
Data availability
Most data discussed in this comment is already available directly from the respective references cited. Beyond that, any data presented or discussed in this comment is also available on request from the author via email.
References
1. Fan, C. et al. Searching for spin glass ground states through deep reinforcement learning. Nat. Commun. 14, 725 (2023).
2. Hartmann, A. & Rieger, H. (eds.) New Optimization Algorithms in Physics (Wiley-VCH, Berlin, 2004).
3. Wang, W., Machta, J. & Katzgraber, H. G. Comparing Monte Carlo methods for finding ground states of Ising spin glasses: population annealing, simulated annealing, and parallel tempering. Phys. Rev. E 92, 013303 (2015).
4. Boettcher, S. Stiffness of the Edwards–Anderson model in all dimensions. Phys. Rev. Lett. 95, 197205 (2005).
5. Pal, K. F. The ground state of the cubic spin-glass with short-range interactions of Gaussian distribution. Physica A 233, 60–66 (1996).
6. Boettcher, S. Extremal optimization for Sherrington–Kirkpatrick spin glasses. Eur. Phys. J. B 46, 501–505 (2005).
7. Edwards, S. F. & Anderson, P. W. Theory of spin glasses. J. Phys. F 5, 965–974 (1975).
8. Sherrington, D. & Kirkpatrick, S. Solvable model of a spin-glass. Phys. Rev. Lett. 35, 1792–1796 (1975).
9. Garey, M. R. & Johnson, D. S. Computers and Intractability: A Guide to the Theory of NP-Completeness (W. H. Freeman, New York, 1979).
10. Martello, S., Osman, I., Roucairol, C. & Voss, S. (eds.) Meta-Heuristics: Advances and Trends in Local Search Paradigms for Optimization (Kluwer, Boston, 1999).
11. Mézard, M., Parisi, G. & Virasoro, M. A. Spin Glass Theory and Beyond (World Scientific, Singapore, 1987).
12. Boettcher, S. & Falkner, S. Finite-size corrections for ground states of Edwards–Anderson spin glasses. Europhys. Lett. 98, 47005 (2012).
13. Boettcher, S. Analysis of the relation between quadratic unconstrained binary optimization and the spin-glass ground-state problem. Phys. Rev. Res. 1, 033142 (2019).
14. Aspelmeier, T., Billoire, A., Marinari, E. & Moore, M. A. Finite size corrections in the Sherrington–Kirkpatrick model. J. Phys. A: Math. Theor. 41, 324008 (2008).
15. Boettcher, S. Inability of a graph neural network heuristic to outperform greedy algorithms in solving combinatorial optimization problems. Nat. Mach. Intell. 5, 24–25 (2022).
16. Boettcher, S. Low-temperature excitations of dilute lattice spin glasses. Europhys. Lett. 67, 453–459 (2004).
Author information
Contributions
The author assembled the research presented in this manuscript and wrote the paper.
Ethics declarations
Competing interests
The author declares no competing interests.
Peer review
Peer review information
Nature Communications thanks Federico Ricci-Tersenghi, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Boettcher, S. Deep reinforced learning heuristic tested on spin-glass ground states: The larger picture. Nat Commun 14, 5658 (2023). https://doi.org/10.1038/s41467-023-41106-y
DOI: https://doi.org/10.1038/s41467-023-41106-y