Introduction

The computational cost of approximating the ground state energy of an n-electron molecular system on classical computing architectures typically grows exponentially in n. Quantum computers allow for the encoding of the exponentially scaling underlying Hilbert space using only \(\mathcal{O}(n)\) qubits, and are therefore likely to outperform classical devices on a range of chemical simulations1,2,3. The Variational Quantum Eigensolver (VQE) is a hybrid quantum-classical algorithm that is considered a very promising candidate for chemical calculations on Noisy Intermediate Scale Quantum (NISQ) devices4,5. In this approach, a parameterized wave-function is generated and variationally tuned to minimize the expectation value of the molecular electronic Hamiltonian. A variety of different parameterized wave-functions have been proposed, including the Trotterised Unitary Coupled Cluster (tUCC) ansatz6,7, which consists of a sequence of exponential, unitary operators acting on a judiciously chosen reference state. While the tUCC approach includes electronic correlation and has, in principle, a rather simple quantum circuit structure, the excessive depth of these quantum circuits makes them ill-suited for applications in the NISQ regime. This issue has led to the proposal that ansatz wave-functions be constructed through the action of a selected subset of possible unitary operators, i.e., only those operators whose inclusion in the ansatz can potentially lead to the largest decrease in the expectation value of the molecular electronic Hamiltonian. In this context, the Adaptive Derivative-Assembled Pseudo-Trotter VQE (ADAPT-VQE)8 has emerged as the gold standard for generating highly accurate and compact ansatz wave-functions. In ADAPT-VQE, the ansatz is grown iteratively by appending a sequence of unitary operators to the reference Hartree-Fock state. At each iteration, the unitary operator to be applied is chosen according to a simple criterion based on the gradient of the expectation value of the Hamiltonian (see the Methods section for details).

Assuming that the number of spin-orbitals N being considered is proportional to the number of electrons n in the system, the pool of potential unitary operators in tUCC-based VQEs scales as \(\mathcal{O}(N^{\ell})\) for \(\ell \ge 4\). Thus, the pool of operators in tUCC singles and doubles, for instance, scales as \(\mathcal{O}(N^{4})\), that of tUCC singles, doubles, and triples scales as \(\mathcal{O}(N^{6})\), etc. As a consequence, conventional VQEs based on the tUCC ansatz require the representation of a product of \(\mathcal{O}(N^{\ell})\) unitary operators on quantum circuitry and the optimization of an \(\mathcal{O}(N^{\ell})\)-dimensional cost function, both of which are practically impossible using the current generation of NISQ devices. The ADAPT-VQE algorithm attempts to alleviate these problems by avoiding the inclusion of unitary operators in the ansatz wave-function that are not expected to lead to a lowering of the resulting energy. Numerical evidence suggests that ADAPT-VQE is indeed resource-saving and that the energy-gradient criterion employed by ADAPT-VQE leads to much more accurate wave-functions than conventional VQE algorithms while preserving moderate circuit depth8,9,10. Thus, while the state-of-the-art k-UpCCGSD algorithm11, which the review article12 considers the most promising fixed-ansatz VQE, is shown to obtain an accuracy of about \(10^{-6}\) Hartree for the BeH2 molecule at equilibrium distance at a cost of more than 7000 controlled NOT (CNOT) gates (see Table 1 in13), ADAPT-VQE achieves a higher accuracy of about \(2 \times 10^{-8}\) Hartree for the same system using only about 2400 CNOT gates9. In spite of this comparative advantage, such an energy-gradient guided procedure has a tendency to fall into local minima of the energy landscape. Exiting from such minima comes at the expense of adding and optimizing operators through multiple ADAPT iterations14 and leads to over-parameterized wave-functions. In practice, this is associated with an unnecessary increase of the quantum circuit depth required for the representation of the ansatz wave-function, coupled to an increasingly difficult classical optimization. This is dramatically revealed in the supplementary information in9, wherein the basic qubit excitation-based (QEB) variant of ADAPT-VQE is applied to the strongly correlated stretched H6 linear chain, and it is shown that more than a thousand CNOT gates are required to construct a chemically accurate ansatz. Given that the current state-of-the-art simulations on physical quantum computers typically involve a maximal circuit depth of less than 100 CNOT gates15, it seems unrealistic in the very short term to expect a chemically accurate quantum device implementation of ADAPT-VQE for strongly correlated molecules. Let us remark here that while the focus of this article is on hybrid quantum-classical adaptive algorithms in the tradition of ADAPT-VQE, quantum imaginary time evolution approaches have also been recently proposed and have shown improved optimization in the high-dimensional non-convex energy landscape16.

Our proposed approach for overcoming the challenges of energy plateaus requires modifying the manner in which the ansatz wave-function is constructed. Indeed, rather than constructing an ansatz wave-function through an energy minimization procedure and potentially encountering local minima, we grow the ansatz wave-function through a process that maximizes its overlap with a (potentially intermediate) target wave-function that already captures some electronic correlation of the system. We then use such a target wave-function as a guide to help us build our ansatz in the right direction so as to capture the bulk of electronic correlation (see the Methods section for a detailed description and workflow). The resulting overlap-guided ansatz is subsequently used as a high-accuracy initialization for an ADAPT-VQE procedure, an algorithm that we refer to as Overlap-ADAPT-VQE. We benchmark and compare the ansatz wave-functions obtained with the Overlap-ADAPT-VQE method to standard ADAPT-VQE on a range of small chemical systems with varying levels of correlation. Our results indicate that this Overlap-ADAPT-VQE strategy yields chemically accurate ansatz wave-functions that are significantly more compact than those produced by the classical ADAPT-VQE procedure, thus maintaining optimism for achieving chemically accurate molecular simulations on near-term quantum devices.
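The overlap-guided growth described above can be illustrated with a deliberately small toy model. The sketch below is not the implementation used in this work: it assumes real amplitude vectors, uses plane rotations between pairs of basis amplitudes as a stand-in for qubit excitation evolutions, and optimizes only the newly added angle (in closed form) rather than re-optimizing all parameters. All function names are hypothetical.

```python
import math


def overlap(target, state):
    """Inner product <target|state> for real amplitude lists."""
    return sum(t * x for t, x in zip(target, state))


def overlap_gradient(target, state, i, j):
    """Derivative of the overlap at theta = 0 for a rotation of components i, j."""
    return target[j] * state[i] - target[i] * state[j]


def apply_rotation(state, i, j, theta):
    """Rotate components i and j of the state by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    new_state = list(state)
    new_state[i] = c * state[i] - s * state[j]
    new_state[j] = s * state[i] + c * state[j]
    return new_state


def overlap_guided_growth(target, reference, n_steps):
    """Greedy toy loop: pick the pool element with the largest overlap gradient.

    Simplification (assumption): earlier angles stay frozen; only the new
    angle is optimized, using theta = atan2(b, a), which maximizes
    a*cos(theta) + b*sin(theta).
    """
    dim = len(target)
    pool = [(i, j) for i in range(dim) for j in range(i + 1, dim)]
    state, ansatz = list(reference), []
    history = [overlap(target, state)]
    for _ in range(n_steps):
        i, j = max(pool, key=lambda p: abs(overlap_gradient(target, state, *p)))
        a = target[i] * state[i] + target[j] * state[j]
        b = overlap_gradient(target, state, i, j)
        theta = math.atan2(b, a)
        state = apply_rotation(state, i, j, theta)
        ansatz.append((i, j, theta))
        history.append(overlap(target, state))
    return ansatz, state, history


target = [0.8, 0.4, 0.4, 0.2]     # normalized toy target wave-function
reference = [1.0, 0.0, 0.0, 0.0]  # stand-in for the Hartree-Fock state
ansatz, state, history = overlap_guided_growth(target, reference, n_steps=4)
print([round(h, 4) for h in history])
```

Because each closed-form update can only increase the overlap, the printed history is non-decreasing. In an actual Overlap-ADAPT-VQE run the grown ansatz would then be handed to an energy-minimizing ADAPT-VQE procedure, which this toy does not attempt.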

Results

Setting of numerical simulations

The classical numerical simulations reported in this section have been carried out with an in-house code, using the OpenFermion-PySCF module17 for integral computations and OpenFermion18 for the second quantization and Jordan–Wigner mappings. All calculations are performed within the minimal STO-3G basis set19 without considering frozen orbitals unless otherwise specified. Note that the number of qubits that a simulation requires is equal to the number of spin-orbitals of a system, which therefore limits the quality of the single-particle basis and the size of the system that can be simulated. All optimization routines use the Broyden-Fletcher-Goldfarb-Shanno algorithm as implemented in the SciPy Python module20. We use a pool of non-spin-complemented restricted single- and double-qubit excitation evolutions. By “restricted”, we mean that we consider only excitations from occupied orbitals to virtual orbitals with respect to the Hartree-Fock determinant. Using fewer operators in the pool makes the gradient screening process faster and easier to handle from a computational point of view9. To ensure a fair comparison, this same operator pool is used for both the overlap-guided ansatz and ADAPT-VQE.
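To give a feel for the size of such a restricted pool, the following sketch enumerates occupied-to-virtual single and double excitation index tuples. It assumes that the Hartree-Fock determinant occupies the first `n_electrons` spin-orbitals and applies no spin or symmetry filtering, so the counts are an upper bound on the pool actually used here. The function name `restricted_pool` is hypothetical.

```python
from itertools import combinations


def restricted_pool(n_spin_orbitals, n_electrons):
    """Enumerate occupied -> virtual single and double excitation indices.

    Assumption: spin-orbitals 0 .. n_electrons-1 are occupied in the
    reference determinant and the remaining ones are virtual.
    """
    occupied = range(n_electrons)
    virtual = range(n_electrons, n_spin_orbitals)
    singles = [(i, a) for i in occupied for a in virtual]
    doubles = [
        (i, j, a, b)
        for i, j in combinations(occupied, 2)
        for a, b in combinations(virtual, 2)
    ]
    return singles, doubles


# Linear H6 in STO-3G: 6 spatial orbitals, hence 12 spin-orbitals (12 qubits).
singles, doubles = restricted_pool(n_spin_orbitals=12, n_electrons=6)
print(len(singles), len(doubles))  # prints "36 225"
```

The double-excitation count is the product of two binomial coefficients, which is the origin of the quartic growth of the singles-and-doubles pool when the number of electrons is proportional to the number of spin-orbitals.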

To anticipate applications of such adaptive algorithms on noisy quantum machines, there are essentially two constraints to respect:

  • The circuit depth should be kept as shallow as possible so as to reduce the effect of decoherence in NISQ devices. In the current context, the circuit depth corresponds to the number of gates used to construct our wave-function ansatz.

  • The number of measurements a NISQ device can undertake is very limited. On the other hand, the ADAPT-VQE algorithm requires a large number of measurements both in the form of gradient evaluations at the beginning of each iteration and during the VQE optimization step of the ansatz wave-function. The optimization step in particular often requires an excessive number of measurements since the cost function is both high-dimensional and noisy. Consequently, the optimization of the ansatz wave-function is simply intractable with a limited number of evaluations, thus preventing practical application of ADAPT-VQE on current quantum devices.

In order to implement such adaptive algorithms on the current generation of NISQ devices, therefore, we must minimize both the circuit depth and the number of evaluations. Indeed, as the depth of a circuit increases, the noise level also increases, which results in a greater number of samples being required for accurate measurement of the Hamiltonian expectation values. In ADAPT-VQE, each operator added to the ansatz corresponds to an additional layer of quantum gates in the circuit and an additional parameter in the ansatz. Consequently, to address both the circuit depth and the number of evaluations constraints, we will evaluate the energy convergence as a function of the number of operators present in the ansatz. Details on the operator and gate counts for all circuits used in this study can be found in Supplementary Note 2 (see Supplementary Table 1).

Application of Overlap-ADAPT-VQE to reference full-CI wave-functions

As a first proof-of-concept, we apply the Overlap-ADAPT procedure to the reference full-CI wave-functions of some simple, yet strongly correlated molecular systems in an effort to understand the compactness of the wave function generated by the qubit excitation-based (QEB) ADAPT-VQE algorithm in the chemical accuracy regime. To do so, we will compute the energy of the Overlap-ADAPT approximation of the target full-CI wave-functions of a stretched BeH2 molecule and a stretched linear H6 chain in a minimal basis set as a function of the number of optimization parameters, and plot this energy in comparison to the energy obtained using QEB-ADAPT-VQE.

The resulting energy plots, which are displayed in Fig. 1, clearly show that the overlap-guided adaptive procedure is able to avoid the initial energy plateaus afflicting the ADAPT procedure that prevent the attainment of chemical accuracy in a small number of iterative steps. These results strongly suggest the potential for creating a more condensed ansatz wave-function than that generated by ADAPT-VQE which can sidestep the issue of early energy plateaus.

Fig. 1: Comparison of the full-CI overlap-guided ADAPT-VQE and ADAPT-VQE for the ground state energy of a stretched BeH2 molecule and a stretched linear H6 chain.

a demonstrates the numerical results for a stretched BeH2 molecule while b displays the results for a stretched linear H6 chain. Both plots represent the energy convergence as a function of the number of parameters in the ansatz. The pink area indicates chemical accuracy at \(10^{-3}\) Hartree.

Before proceeding, let us point out that a key metric for evaluating the efficiency of the overlap-ADAPT algorithm is to compute the overlap between the ansatz wave-function and the full-CI wave-function over the course of several algorithm iterations. Consequently, for the stretched BeH2 and stretched linear H6 chain considered above, we plot the overlap convergence with respect to the full-CI wave-function in Fig. 2. It is readily seen that the Overlap-ADAPT procedure targeted at the full-CI wave-function outperforms the original ADAPT-VQE, achieving a higher overlap with the full-CI wave-function for both a stretched BeH2 molecule and a stretched linear H6 chain. In particular, for the H6 system, while ADAPT-VQE reaches a plateau and stalls its progress, the Overlap-ADAPT procedure smoothly advances without interruption.

Fig. 2: Comparison of the full-CI overlap-guided ADAPT-VQE and ADAPT-VQE for maximizing the overlap with the full-CI wave-function of a stretched BeH2 molecule and a stretched linear H6 chain.

a demonstrates the numerical results for a stretched BeH2 molecule while b displays the results for a stretched linear H6 chain. Both plots represent the lack of fidelity between the ansatz and the full-CI wave-function, calculated as one minus the overlap, as a function of the number of parameters in the ansatz.
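The quantity plotted in Fig. 2 can be computed as in the short sketch below. It assumes that "overlap" means the squared modulus of the inner product between two normalized states, which is one common convention and is not spelled out in the caption; the function name `infidelity` is hypothetical.

```python
def infidelity(ansatz_state, target_state):
    """Return 1 - |<target|ansatz>|^2 for two normalized amplitude lists.

    Assumption: the overlap is the squared modulus of the inner product.
    """
    inner = sum(
        complex(t).conjugate() * complex(a)
        for t, a in zip(target_state, ansatz_state)
    )
    return 1.0 - abs(inner) ** 2


# Toy two-level example: reference state versus a correlated target.
reference = [1.0, 0.0]
target = [0.8, 0.6]
print(round(infidelity(reference, target), 6))  # prints 0.36
```

A value of zero means that the ansatz reproduces the target exactly up to a global phase, so a curve that decreases faster in Fig. 2 corresponds to a more efficient ansatz construction.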

Of course, the Overlap-ADAPT-VQE targeted at a full-CI wave-function does not define a practical VQE since the full-CI ground state energy is precisely the quantity we wish to approximate. A practical VQE based on orbital overlap optimization can, however, be developed by replacing the targeted full-CI wave-function with a tractable high accuracy approximation thereof and using the resulting overlap-guided ansatz wave-function as a high accuracy initialization for a new ADAPT-VQE procedure. The targeted “computable” wave-function in this situation can be completely general, i.e., it can be the output of any existing numerical algorithm, whether classical or quantum.

The goal of the forthcoming subsections is to showcase the efficacy of this Overlap-ADAPT algorithm in obtaining chemically accurate results using a minimal number of optimization parameters. Such findings are important for practical uses of quantum computing for quantum chemistry since, as we have already stated, real-life chemists are interested in reaching convergence in energies corresponding to the so-called chemical accuracy, i.e., \(10^{-3}\) to \(10^{-4}\) Hartree. Our results can therefore introduce a practical route for compactifying the ADAPT-VQE operator counts using the Overlap-ADAPT-VQE within this accuracy regime.

Application of Overlap-ADAPT-VQE for compactification of ADAPT-VQE ansätze

As a first practical test of its effectiveness, we apply the overlap-guided adaptive algorithm to a target wave-function provided by an existing qubit excitation-based (QEB) ADAPT-VQE procedure and then use the result as a high-accuracy initialization for a new QEB-ADAPT-VQE procedure. Essentially, this first set of numerical experiments is meant to model the situation where we have a strong constraint on the circuit depth (represented by the number of optimization parameters in the ansatz wave-function), and we wish to see if it is possible to use the Overlap-ADAPT-VQE procedure to compactify the ADAPT-VQE ansatz, thereby obtaining a higher accuracy wave-function that respects the constraint on the circuit depth.

We compute the ground state energy of the benchmark beryllium hydride (BeH2) molecule considered in the original ADAPT-VQE article8. We consider the BeH2 molecule both at its equilibrium geometry (bond length of 1.3264 Angstrom) and at a stretched geometry (bond length of 3.0 Angstrom), which is meant to model a more strongly correlated system. Our results are depicted in Fig. 3.

Fig. 3: Comparison of the Overlap-ADAPT-VQE and ADAPT-VQE for the ground state energy of a BeH2 molecule.

a demonstrates the numerical results for a BeH2 molecule at equilibrium geometry while b displays the results for a stretched BeH2 molecule. Both plots represent the energy convergence as a function of the number of parameters in the ansatz. The left-pointing triangles denote the target wave-functions used for subsequent Overlap-ADAPT procedures. For simplicity, we do not plot the entire Overlap-ADAPT curve, but rather only the portion corresponding to the energy minimization using a classical ADAPT-VQE procedure. Thus, in a, the overlap maximization portion of Overlap-Guided QEB-ADAPT-VQE lasts until parameter 40, at which point the energy minimization portion is initiated. The green dotted line corresponds to a full-configuration interaction (full-CI) Overlap-ADAPT-VQE procedure, which is plotted as a reference. Note that at equilibrium distance (a), the QEB-ADAPT-VQE curve and the full-CI Overlap-ADAPT-VQE curve nearly coincide, whereas for the stretched molecule (b) the full-CI Overlap-ADAPT-VQE curve is lower. The pink area indicates chemical accuracy at \(10^{-3}\) Hartree.

The numerical results indicate that the Overlap-ADAPT-VQE can indeed compactify the QEB-ADAPT-VQE ansatz wave-function and that using the output as an initialization for a new QEB-ADAPT-VQE yields a much more accurate wave-function. Under the constraint of a maximal operator count of 50, the overlap-guided procedure improves the final accuracy of the computed BeH2 ground state energy at equilibrium and stretched geometries by a factor of 3 and 10, respectively. Note that the improvement in accuracy is much higher in the case of the stretched BeH2 molecule, which exhibits strong correlation. This suggests that the comparative advantage of the overlap-guided adaptive algorithm over a pure ADAPT-VQE procedure will be more conspicuous for strongly correlated molecules, i.e., systems for which the ADAPT-VQE algorithm struggles to compute the ground state energy. Thus, in the case of the BeH2 molecule, for instance, we are able to achieve chemical accuracy using only a 34-operator ansatz wave-function whereas the QEB-ADAPT-VQE algorithm requires more than 50. Numerical simulations for stretched BeH2 using lower maximal operator counts of 40 and 45 are displayed in Fig. 4 and show similar improvements in the final accuracy of the ansatz wave-function, although the advantage decreases as the maximal operator count decreases.

Fig. 4: Comparison of the Overlap-ADAPT-VQE and ADAPT-VQE for the ground state energy of a stretched BeH2 molecule with different maximal operator counts.

a demonstrates the numerical results for a maximal operator count of 40 while b displays the results for a maximal operator count of 45. Both plots represent the energy convergence as a function of the number of parameters in the ansatz. The left-pointing triangles denote the target wave-functions used for a subsequent Overlap-ADAPT procedure. For simplicity, we do not plot the entire Overlap-ADAPT curve, but rather only the portion corresponding to the energy minimization using a classical ADAPT-VQE procedure. The green dotted line corresponds to a full-CI Overlap-ADAPT-VQE procedure, which is plotted as a reference. The pink area indicates chemical accuracy at \(10^{-3}\) Hartree.

A further test of the Overlap-ADAPT-VQE applied to a target QEB-ADAPT-VQE wave-function is carried out for the diatomic nitrogen (N2) molecule at equilibrium and stretched geometries. Although the minimal basis set for N2 is quite large, a tractable computation can be carried out using an active space approach where eight electrons of the N2 molecule are frozen and the ground state energy of the system is computed using the resulting frozen core effective Hamiltonian, an approach commonly referred to as CAS(6,6). As shown in Fig. 5, the Overlap-ADAPT procedure does not further compactify the QEB-ADAPT-VQE wave-function at equilibrium, the final accuracy of the Overlap-QEB-ADAPT-VQE being only slightly higher than that of the classical QEB-ADAPT-VQE procedure. Nevertheless, applying the Overlap-ADAPT-VQE procedure twice, i.e., taking a QEB-ADAPT-VQE wave-function as the first target, performing an Overlap-ADAPT-VQE procedure, and then taking the resulting wave-function as the target for an additional Overlap-ADAPT-VQE procedure, yields a significant gain in accuracy for the stretched geometry. Indeed, the Overlap-QEB-ADAPT-VQE energy is nearly an order of magnitude more accurate than the classical QEB-ADAPT-VQE energy.
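The bookkeeping behind the CAS(6,6) label can be checked with a few lines of arithmetic. The sketch below assumes that the frozen electrons doubly occupy the lowest spatial orbitals and uses the fact, stated earlier, that one qubit is needed per spin-orbital. N2 has 14 electrons, and STO-3G provides five basis functions per nitrogen atom, i.e., ten spatial orbitals in total. The function name is hypothetical.

```python
def frozen_core_active_space(n_electrons, n_spatial_orbitals, n_frozen_electrons):
    """Return (active electrons, active spatial orbitals, qubits).

    Assumption: frozen electrons doubly occupy the lowest spatial orbitals,
    and every remaining spatial orbital is kept in the active space.
    """
    if n_frozen_electrons % 2:
        raise ValueError("frozen electrons must fill whole spatial orbitals")
    n_frozen_orbitals = n_frozen_electrons // 2
    active_electrons = n_electrons - n_frozen_electrons
    active_orbitals = n_spatial_orbitals - n_frozen_orbitals
    n_qubits = 2 * active_orbitals  # one qubit per active spin-orbital
    return active_electrons, active_orbitals, n_qubits


# N2 in STO-3G with eight frozen electrons.
print(frozen_core_active_space(14, 10, 8))  # prints (6, 6, 12)
```

Freezing eight electrons therefore reduces the simulation from 20 qubits to 12, which is what makes the N2 calculations tractable on a classical simulator.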

Fig. 5: Comparison of the Overlap-ADAPT-VQE and ADAPT-VQE for the ground state energy of an N2 molecule.

a demonstrates the numerical results for an N2 molecule at equilibrium geometry while b displays the results for a stretched N2 molecule. Both plots represent the energy convergence as a function of the number of parameters in the ansatz. The left-pointing triangles denote the target wave-functions used for a subsequent Overlap-ADAPT procedure. For simplicity, we do not plot the entire Overlap-ADAPT curve, but rather only the portion corresponding to the energy minimization using a classical ADAPT-VQE procedure. The green dotted line corresponds to a full-CI Overlap-ADAPT-VQE procedure, which is plotted simply as a reference. The pink area indicates chemical accuracy at \(10^{-3}\) Hartree.

Let us remark that, as a rule of thumb, for all these simulations, the Overlap-ADAPT algorithm is used to construct an approximate wave-function using a number of operators equal to ~40–50% of the maximal operator count. If the maximal operator count is more flexible, then as a general rule we observe that the ADAPT-VQE ansatz taken immediately after the ADAPT process has exited an energy plateau serves as an effective choice of target wave-function for an overlap-guided adaptive procedure, i.e., the Overlap-ADAPT-VQE can produce a more compact wave-function with comparable energy to that of the target ADAPT wave-function. On the other hand, taking an ADAPT-VQE ansatz wave-function from the middle of an energy plateau as the overlap-guided target seems to be a less effective strategy.

Application of Overlap-ADAPT-VQE to classically computed wave-functions

The stretched linear H6 chain is a molecular system that exhibits a high degree of electronic correlation. The complex electronic structure creates a rough energy landscape with many local minima, making it difficult to find the global energy minimum. This system has already been extensively studied9 and it was shown that achieving chemical accuracy with the ADAPT-VQE method required constructing an ansatz wave-function with more than 150 operators from a pool of either generalized fermionic or generalized qubit excitations. Clearly, resources of this kind are far from being accessible on current NISQ devices, and it is therefore necessary to develop adaptive methods for simulating systems using a much smaller operator count. Until now, the most extensive VQE experiments have typically encompassed around 10 operators while accumulating an error of at least 0.1 Hartree21,22. Unfortunately, since the ADAPT-VQE ansatz wave-function presumably does not contain a satisfactory choice of qubit excitation evolution operators until an unreachable number of iterations has been performed, it cannot be used as the target of the overlap-guided adaptive algorithm as in the previous subsection. Instead, we propose the use of an intermediate, classically computed, multi-configuration wave-function as the overlap-guided target, an approach that has the consequent advantage of not costing additional quantum resources. While many different choices of classically computed wave-functions are possible, in this study, we choose to employ the so-called CI perturbatively selected iteratively (CIPSI) algorithm implemented in QP223 (see the Methods section for a brief recap of the CIPSI method).

CIPSI-Overlap-ADAPT numerical results

We performed CIPSI calculations through the open-source quantum chemistry environment Quantum Package23 for the different molecular systems. As mentioned previously, the CIPSI wave-function is used as a target for the overlap-guided adaptive algorithm and is therefore not required to be very accurate. In particular, all CIPSI wave-functions employed in this study have errors much larger than \(10^{-3}\) Hartree, i.e., they are not chemically accurate. In the remainder of this section, we compare the energy convergence of the QEB-ADAPT-VQE algorithm starting from an intermediate wave-function obtained by applying the overlap-guided algorithm to a CIPSI wave-function with the traditional QEB-ADAPT-VQE procedure that initializes from a simple Hartree-Fock ansatz. As a rule of thumb, for all these simulations, the Overlap-ADAPT-VQE is used to construct an approximate wave-function with energy comparable to that of the targeted CIPSI wave-function before initiating the subsequent QEB-ADAPT-VQE procedure.

Figure 6 shows the energy convergence plot of the two different ADAPT-VQE protocols on the stretched linear H6 system. We observe a significant difference in the results, with chemical accuracy being achieved using only 40 parameters when the QEB-ADAPT-VQE procedure is initialized with the overlap-guided-CIPSI intermediate wave-function, whereas the classical ADAPT-VQE ansatz is ~15 times less accurate despite using 50 parameters. Additional calculations revealed that the classical QEB-ADAPT-VQE protocol requires >150 parameters to achieve chemical accuracy9. This massive performance gap demonstrates that the CIPSI wave-function initialization guides the ansatz construction in a manner that avoids the large initial energy plateau which impedes the progress of classical QEB-ADAPT-VQE.

Fig. 6: Comparison of the CIPSI-Overlap-ADAPT-VQE and ADAPT-VQE for the ground state energy of a linear H6 chain with an interatomic distance of 3 Angstrom.

The plot represents the energy convergence as a function of the number of parameters in the ansatz. The CIPSI-Overlap ansatz is grown up to 20 parameters and then used as the initial state for an ADAPT-VQE process. This transition from Overlap-ADAPT-VQE to classical ADAPT-VQE is denoted by the top-pointing triangle. The horizontal black dotted line corresponds to the energy error of the initial CIPSI target wave-function. The light blue dotted line corresponds to the energy of the tUCCSD method9, which consists of an ansatz wave-function composed of 118 generalized excitation evolutions acting on a reference Hartree-Fock state. The green dotted line corresponds to a full-CI Overlap-ADAPT-VQE procedure, which is plotted simply as a reference. The pink area indicates chemical accuracy at \(10^{-3}\) Hartree.

Let us emphasize here that the initial CIPSI wave-function was composed of only 50 determinants and had an error larger than \(10^{-2}\) Hartree, which suggests that even a low accuracy classically computed target wave-function for the overlap-guided algorithm is enough to improve the convergence of the subsequent QEB-ADAPT-VQE procedure. This observation is particularly important since it highlights the potential of applying this CIPSI-Overlap-ADAPT procedure to much larger systems with strong correlation where CIPSI approaches are not effective and are simply unable to achieve chemical accuracy. For such systems, we can envision computing a CIPSI wave-function at the limit of classical computational resources, using this non-chemically accurate CIPSI wave-function as a target for the overlap-guided adaptive algorithm, and initializing a subsequent QEB-ADAPT-VQE procedure on a quantum computer in order to obtain a final result with chemical accuracy.

To further test the effectiveness of this CIPSI-Overlap-ADAPT approach, we return to the stretched BeH2 molecule considered in the previous subsection. We employ two different CIPSI wave-functions as targets for the overlap-guided adaptive algorithm and use the approximate wave-functions obtained as high accuracy initializations for QEB-ADAPT-VQE procedures. Our results are displayed in Fig. 7 and demonstrate that the CIPSI-Overlap-ADAPT produces a significantly more compact ansatz than the classical QEB-ADAPT-VQE procedure for both choices of CIPSI wave-functions. In both cases, the final accuracy of the wave-function with a maximal operator count of 50 operators is nearly an order of magnitude higher than that of QEB-ADAPT-VQE. Furthermore, as noted in the case of the H6 molecule, the choice of a low accuracy CIPSI wave-function as the initial target for the Overlap-ADAPT-VQE does not meaningfully degrade the final accuracy. Let us also remark here that the CIPSI-Overlap-ADAPT-VQE wave-function obtained at the end of the iterative process can then further be used as a target for an additional Overlap-ADAPT-VQE procedure, thereby further increasing the accuracy of the ansatz wave-function. In the case of the stretched BeH2 molecule, this results in further minor improvements to the final energy that is achievable using a maximal operator count of 50, as displayed in Fig. 7.

Fig. 7: Comparison of the CIPSI-Overlap-ADAPT-VQE and ADAPT-VQE for the ground state energy of a stretched BeH2 molecule for different CIPSI wave-functions.

The plots represent the energy convergence as a function of the number of parameters in the ansatz. In a, the CIPSI-Overlap ansatz is grown up to 12 parameters using a low accuracy initial CIPSI target, whereas in b, the CIPSI-Overlap ansatz is grown up to 25 parameters using a moderate accuracy initial CIPSI target. The resulting wave-functions in both cases are used as initial states for ADAPT-VQE procedures. This transition from Overlap-ADAPT-VQE to classical ADAPT-VQE is denoted by the top-pointing triangle. The horizontal dotted lines correspond to the energy error of the initial CIPSI target wave-functions. The green dotted lines correspond to a full-CI Overlap-ADAPT-VQE procedure, which is plotted simply as a reference. The pink area indicates chemical accuracy at \(10^{-3}\) Hartree.

Discussion

In this study, we have explored the possibility of creating ansatz wave-functions for the variational-quantum eigensolver that are more compact than the popular ADAPT-VQE at the chemical accuracy level for some small molecular systems. Since the overparametrization phenomenon observed in the ADAPT algorithm can be attributed to the algorithm’s natural propensity to encounter local energy minima, we have proposed an overlap-guided adaptive algorithm called Overlap-ADAPT-VQE, wherein the ansatz wave-function is grown by maximizing its overlap with an intermediate target wave-function that already captures some electronic correlation. We then use this overlap-guided ansatz as a high-accuracy initialization for a classical ADAPT-VQE procedure.

As a first test of our proposed approach, we used an existing ADAPT-VQE ansatz wave-function as a target for the overlap-guided adaptive algorithm. The resulting ansatz wave-function was shown to achieve chemical accuracy using significantly fewer operators than the classical ADAPT-VQE ansatz. We have also shown that this compression process can be carried out more than once and leads to an even more compact ansatz. For strongly correlated systems, the overlap-guided ansatz is steered by the target wave-function away from the majority of local traps that are typically encountered in standard ADAPT-VQE when starting from the Hartree-Fock state. While it appears that the ADAPT ansatz is already quite compact for systems with weak electronic correlation, the Overlap-ADAPT approach remains able to offer slight improvements.

Motivated next by the inability of ADAPT-VQE to process highly correlated systems such as the stretched linear H6 chain using a reasonably compact ansatz, we combined classical selected-CI approaches and quantum computing by taking a CIPSI wave-function as a target for our overlap-guided adaptive algorithm. The resulting CIPSI-Overlap-ADAPT-VQE procedure produced a massive improvement over standard ADAPT-VQE, allowing us to reach chemical accuracy using an ansatz with only 40 operators compared to more than 150 for the classical ADAPT-VQE method.

Previous studies have already investigated the use of additional classical computation to enhance the UCCSD or ADAPT-VQE methods and have demonstrated promising improvements7,24,25,26,27. Our work builds upon this line of research. It is worth noting that the overlap-guided ansatz can also be interpreted as a state preparation algorithm for Hamiltonian simulation28,29,30, as it generates a state with high overlap with the ground state (see Fig. 2).

Within our framework, the hybrid selected-CI-Overlap algorithm has the potential to bring a quantum advantage over classical quantum chemistry methods through the following procedure: push the classical computation of a complex molecular system to its limits, then generate the corresponding ansatz on a quantum computer using the Overlap adaptive algorithm, and further improve this ansatz through ADAPT-VQE and potentially additional overlap-guided compression steps. We are also testing the possibility of a final perturbative (PT2) calculation following the spirit of modern classical selected-CI approaches.

Finally, let us emphasize that Overlap-ADAPT-VQE is, by design, able to integrate seamlessly with the recent improvements made to ADAPT-VQE31,32, sharing the same structure and adaptive property while still leveraging its own unique approach to operator selection, and many combinations with ADAPT variants can now be proposed and studied. Conversely, convergence in overlaps can be achieved more quickly by incorporating a wider range of operators, such as generalized excitations or symmetry-breaking operators, into the pool of operators used. This would lead to immediate improvements in the performance of the Overlap-ADAPT-VQE algorithm. To explore further the capabilities of the various Overlap-ADAPT approaches and their potential practical advantage over classical methods, we are currently working towards larger-scale simulations on extended implementations encompassing larger qubit counts on present NISQ machines and HPC simulators.

Methods

Qubit representation of the molecular Hamiltonian

The molecular electronic Hamiltonian with one-body and two-body interactions can be expressed in second-quantization notation as

$$H:= \mathop{\sum}\limits_{p,q}{h}_{pq}{a}_{p}^{{{{\dagger}}} }{a}_{q}+\mathop{\sum}\limits_{p,q,r,s}{h}_{pqrs}{a}_{p}^{{{{\dagger}}} }{a}_{r}^{{{{\dagger}}} }{a}_{s}{a}_{q}.$$
(1)

Here, p, q, r, and s are indices that label the spin-orbitals used to discretize the system, and \({a}_{p}\) and \({a}_{p}^{{{{\dagger}}} }\) are the pth fermionic annihilation and creation operators, which satisfy the anti-commutation relations:

$$\left\{{a}_{p},{a}_{q}^{{{{\dagger}}} }\right\}:= {a}_{p}{a}_{q}^{{{{\dagger}}} }+{a}_{q}^{{{{\dagger}}} }{a}_{p}={\delta }_{pq}\quad \,{{\mbox{and}}}\,\quad \left\{{a}_{p},{a}_{q}\right\}:= {a}_{p}{a}_{q}+{a}_{q}{a}_{p}=0,$$
(2)

where \({\delta }_{pq}\) denotes the Kronecker delta (understood here as a multiple of the identity operator), and \({h}_{pq}\) and \({h}_{pqrs}\) are the one-electron and two-electron integrals, which can be computed on classical hardware through the expressions

$${h}_{pq} := {\int}_{{{\!\!\!\!\!\mathbb{R}}}^{3}}{\Psi }_{p}^{* }({{{{{{{\bf{x}}}}}}}})\left(-\frac{1}{2}\Delta -{V}_{{{{{{{{\rm{nuc}}}}}}}}}\right){\Psi }_{q}({{{{{{{\bf{x}}}}}}}})\ d{{{{{{{\bf{x}}}}}}}},\\ {h}_{pqrs} := {\int}_{{{\!\!\!\!\!\mathbb{R}}}^{3}}{\int}_{{{\!\!\!\!\!\mathbb{R}}}^{3}}{\Psi }_{p}^{* }({{{{{{{\bf{x}}}}}}}}){\Psi }_{r}^{* }({{{{{{{\bf{y}}}}}}}})\left(\frac{1}{| {{{{{{{\bf{x}}}}}}}}-{{{{{{{\bf{y}}}}}}}}| }\right){\Psi }_{q}({{{{{{{\bf{x}}}}}}}}){\Psi }_{s}({{{{{{{\bf{y}}}}}}}})\ d{{{{{{{\bf{x}}}}}}}}d{{{{{{{\bf{y}}}}}}}},$$
(3)

where Ψp, Ψq, Ψr, Ψs denote spin-orbitals labeled by the indices p, q, r, and s, respectively.

In order to represent the second-quantized Hamiltonian H on a quantum computer, we use the Jordan–Wigner transform33,34 to map the creation and annihilation operators to tensor products involving unitary matrices. To this end, we denote by \({\left\vert 0\right\rangle }_{p}\) and \({\left\vert 1\right\rangle }_{p}\) the states corresponding to an empty and an occupied spin-orbital p, respectively. Using this formalism, and labeling the spin-orbitals from 0 as in Equation (4) below, the reference Hartree-Fock state for a system having n electrons in N spin-orbitals can be expressed as \(\left\vert {\Psi }_{{{\mbox{HF}}}}\right\rangle := \left\vert {1}_{0}\ldots {1}_{n-1}{0}_{n}\ldots {0}_{N-1}\right\rangle\), and the corresponding fermionic annihilation and creation operators are given by

$${a}_{p} =\left(\mathop{\bigotimes }\limits_{i=0}^{p-1}{Z}_{i}\right)\otimes \frac{{X}_{p}+i{Y}_{p}}{2}\,=:\left(\mathop{\bigotimes }\limits_{i=0}^{p-1}{Z}_{i}\right)\otimes {Q}_{p},\\ {a}_{p}^{{{{\dagger}}} } =\left(\mathop{\bigotimes }\limits_{i=0}^{p-1}{Z}_{i}\right)\otimes \frac{{X}_{p}-i{Y}_{p}}{2}\,=:\left(\mathop{\bigotimes }\limits_{i=0}^{p-1}{Z}_{i}\right)\otimes {Q}_{p}^{{{{\dagger}}} },$$
(4)

where Xp, Yp, Zp are single-qubit Pauli gates applied to qubit p35. Note that in Equation (4), we have introduced the so-called qubit de-excitation and excitation operators \({Q}_{p}\) and \({Q}_{p}^{{{{\dagger}}} }\), respectively, which switch the occupancy of spin-orbital p. These operators will be discussed further below. Let us also remark that the Jordan–Wigner-transformed annihilation and creation operators (4) respect the anti-commutation relations (2). This is simply a consequence of including the tensor product of Z-Pauli gates in Equation (4)33.
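As a quick sanity check of Equation (4), the following sketch builds the Jordan–Wigner matrices with NumPy and verifies the anti-commutation relations (2) numerically. The helper names `annihilation` and `anticommutator`, the choice of three spin-orbitals, and the convention that qubit 0 is the leftmost tensor factor are assumptions made purely for illustration.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def annihilation(p, n_orbitals):
    """Jordan-Wigner a_p: Z on qubits 0..p-1, (X + iY)/2 on qubit p, identity elsewhere."""
    q_p = (X + 1j * Y) / 2  # equals |0><1|, i.e. it empties an occupied orbital
    factors = [Z] * p + [q_p] + [I2] * (n_orbitals - p - 1)
    return reduce(np.kron, factors)


def anticommutator(a, b):
    return a @ b + b @ a


# Illustrative register size (an assumption, not a value from the study).
n_orbitals = 3
dim = 2 ** n_orbitals
ops = [annihilation(p, n_orbitals) for p in range(n_orbitals)]

# Check {a_p, a_q^dagger} = delta_pq and {a_p, a_q} = 0 for all pairs.
relations_hold = all(
    np.allclose(anticommutator(ops[p], ops[q].conj().T), np.eye(dim) * (p == q))
    and np.allclose(anticommutator(ops[p], ops[q]), np.zeros((dim, dim)))
    for p in range(n_orbitals)
    for q in range(n_orbitals)
)
```

Dropping the Z factors from `annihilation` makes the check fail for p ≠ q, which is one way to see why the Z-strings are needed.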

The variational-quantum Eigensolver

Equipped with the single-qubit Pauli gate representation of the molecular Hamiltonian H, we are now interested in approximating its ground state eigenvalue. The Variational-Quantum-Eigensolver (VQE) is a hybrid quantum-classical algorithm that couples a classical optimization loop to a subroutine that computes, on a quantum computer, the expectation value of the Hamiltonian with respect to a proposed ansatz wave-function. This quantum subroutine involves two fundamental steps:

  1.

    The preparation of a trial quantum state (the ansatz wave-function) \(\vert \Psi (\overrightarrow{\theta })\rangle\). A variety of different functional forms for the ansatz wave-function have been proposed7,36,37,38 including the aforementioned tUCC ansatz which consists of a sequence of parameterized, exponential fermionic excitation and de-excitation operators acting on a reference state (see below for explicit expressions of these operators).

  2.

    The measurement of the expectation value \(\langle \Psi (\overrightarrow{\theta })\vert H\vert \Psi (\overrightarrow{\theta })\rangle\).

The output of the quantum subroutine is fed into a classical optimization algorithm which calculates the optimal set of parameters \({\overrightarrow{\theta }}_{{{{{{{{\rm{opt}}}}}}}}}\) that minimizes the expectation value of the Hamiltonian H. The variational principle ensures that the resulting optimized energy is always an upper bound for the exact ground state energy E0 of H, i.e.,

$$\left\langle \Psi \left({\overrightarrow{\theta }}_{{{{{{{{\rm{opt}}}}}}}}}\right)\right\vert H\left\vert \Psi \left({\overrightarrow{\theta }}_{{{{{{{{\rm{opt}}}}}}}}}\right)\right\rangle \ge {E}_{0}.$$
(5)
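The bound (5) can be illustrated on a toy problem. The sketch below is not a molecular calculation: the 2×2 matrix `H`, the one-parameter ansatz cos θ|0⟩ + sin θ|1⟩, and the grid scan that stands in for the classical optimizer are all assumptions chosen for brevity.

```python
import numpy as np

# Toy real symmetric "Hamiltonian" (illustrative values only).
H = np.array([[1.0, 0.5],
              [0.5, -1.0]])


def ansatz(theta):
    """Normalized one-parameter trial state cos(theta)|0> + sin(theta)|1>."""
    return np.array([np.cos(theta), np.sin(theta)])


def energy(theta):
    """Expectation value <Psi(theta)|H|Psi(theta)> (done by measurement on a device)."""
    psi = ansatz(theta)
    return float(psi @ H @ psi)


# A simple grid scan plays the role of the classical optimization loop.
thetas = np.linspace(0.0, np.pi, 2001)
energies = np.array([energy(t) for t in thetas])
theta_opt = float(thetas[np.argmin(energies)])
e_opt = float(energies.min())

# Exact ground state energy for comparison with the variational estimate.
e_exact = float(np.linalg.eigvalsh(H)[0])
```

Because this ansatz spans all real normalized two-component states, `e_opt` approaches `e_exact` from above as the grid is refined; a more restricted ansatz would in general leave a finite gap.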

The fundamental challenge in implementing the VQE methodology on NISQ devices is thus to construct an ansatz wave-function that can capture the most important contributions to the electronic correlation energy and, at the same time, can be represented on rather shallow quantum circuits. A necessary condition to achieve the latter is that the chosen ansatz wave-function be parameterized with a relatively small number of optimization parameters. Thus, the major computational shortcoming of the popular tUCCSD method, which otherwise possesses an attractive functional form7, is that its actual implementation on quantum computers requires extremely deep circuits which generate far too much noise on the current generation of NISQ devices22. Indeed, implementing the tUCCSD algorithm on quantum architectures through the Jordan–Wigner mapping (4) requires \({{{\mathcal{O}}}}({N}^{3}{n}^{2})\) quantum gates7 (recall that N is the number of spin-orbitals being considered and n is the number of electrons in the system, so that if N is proportional to n, then the number of quantum gates required will be of the order of \({{{\mathcal{O}}}}({N}^{5})\)). This problem is further exacerbated by the ubiquitous usage of CNOT gates in the construction of quantum circuits for fermionic excitation and de-excitation operators. tUCCSD has recently been extended to triple excitations (tUCCSDT)39 and coupled to both spin and orbital symmetries to reduce the operator count; despite a significantly increased accuracy over tUCCSD, this count remains too high for implementation on real-life QPUs.

The ADAPT-VQE ansatz

The adaptive derivative-assembled pseudo-Trotter variational-quantum eigensolver (ADAPT-VQE)8 was designed to overcome the computational shortcomings of the traditional tUCCSD method by proposing an ansatz function that is adaptively grown through an iterative process. ADAPT-VQE is based on the fact40 that the full-CI quantum state can be represented by the action of a potentially infinitely long product of only one-body and two-body operators on the reference Hartree-Fock determinant, i.e.,

$$\left\vert {\Psi }_{{{{{{{{\rm{FCI}}}}}}}}}\right\rangle =\mathop{\prod }\limits_{k}^{\infty }\left[\mathop{\prod}\limits_{pq}{\hat{A}}_{p}^{q}({\theta }_{k}^{pq})\mathop{\prod}\limits_{pqrs}{\hat{A}}_{pq}^{rs}({\theta }_{k}^{pqrs})\right]\left\vert {\Psi }_{{{{{{{{\rm{HF}}}}}}}}}\right\rangle .$$
(6)

Here, \({\hat{A}}_{p}^{q}({\theta }_{k}^{pq}):= {e}^{{\theta }_{k}^{pq}{\hat{\tau }}_{p}^{q}}\) and \({\hat{A}}_{pq}^{rs}({\theta }_{k}^{pqrs}):= {e}^{{\theta }_{k}^{pqrs}{\hat{\tau }}_{pq}^{rs}}\), where \({\hat{\tau }}_{p}^{q}\) and \({\hat{\tau }}_{pq}^{rs}\) denote the anti-symmetric operators \({\hat{a}}_{p}^{q}-{\hat{a}}_{q}^{p}\) and \({\hat{a}}_{pq}^{rs}-{\hat{a}}_{rs}^{pq}\), and \({\theta }_{k}^{pq}\) (resp. \({\theta }_{k}^{pqrs}\)) is the expansion coefficient of the kth repetition of the operator \({\hat{A}}_{p}^{q}\) (resp. \({\hat{A}}_{pq}^{rs}\)).

The general workflow of the ADAPT-VQE algorithm is as follows:

  1.

    On classical hardware, compute the one-electron and two-electron integrals, and map the molecular Hamiltonian into a qubit representation. On quantum hardware, initialize the qubits in the state \(\vert {\Psi }^{0}\rangle =\vert {\Psi }_{{{{{{{{\rm{HF}}}}}}}}}\rangle\).

  2.

    Define a pool of parameterized unitary operators that will be used to construct the ansatz.

  3.

    On quantum hardware, at the mth iteration, identify the parameterized unitary operator \({\hat{{{{{{{{\mathcal{U}}}}}}}}}}_{m}({\theta }_{m})\) whose action on the current ansatz \(\vert {\Psi }^{m-1}\rangle\) will produce a new wave-function with the largest drop in energy. This identification is done by computing suitable gradients at θm = 0, the gradients being expressed in terms of commutators involving the molecular Hamiltonian acting on the current ansatz wave-function:

    $$\frac{\partial }{\partial {\theta }_{m}}\langle {\Psi }^{m-1}| \,\hat{{{{{{{{{\mathcal{U}}}}}}}}}_{m}}({\theta }_{m})^{{{{\dagger}}} }H\hat{{{{{{{{{\mathcal{U}}}}}}}}}_{m}}({\theta }_{m})| {\Psi }^{m-1}\rangle {\left\vert \right.}_{{\theta }_{m} = 0}=\langle {\Psi }^{m-1}| [H,\hat{{{{{{{{{\mathcal{U}}}}}}}}}_{m}}(0)]| {\Psi }^{m-1}\rangle$$
    (7)
  4.

    Exit the iterative process if the gradient norm is smaller than some threshold ϵ. Otherwise, append the selected operator to the left of the current ansatz wave-function \(\vert {\Psi }^{m-1}\rangle\), i.e., define \(\vert \widetilde{{\Psi }^{m}}\rangle := \hat{{{{{{{{{\mathcal{U}}}}}}}}}_{m}}({\theta }_{m})\vert {\Psi }^{m-1}\rangle =\hat{{{{{{{{{\mathcal{U}}}}}}}}}_{m}}({\theta }_{m})\hat{{{{{{{{{\mathcal{U}}}}}}}}}_{m-1}}({\theta }_{m-1}^{{\prime} })\ldots \hat{{{{{{{{{\mathcal{U}}}}}}}}}_{1}}({\theta }_{1}^{{\prime} })\left\vert {\Psi }^{0}\right\rangle\).

  5.

    Hybrid Quantum-Classical VQE: Optimize all parameters θm, θm−1, …, θ1 in the new ansatz wave-function so as to minimize the expectation value of the molecular Hamiltonian, i.e., solve the optimization problem

    $${\overrightarrow{\theta }}^{{{{{{{{\rm{opt}}}}}}}}} := ({\theta }_{1}^{{\prime} },\ldots ,{\theta }_{m-1}^{{\prime} },{\theta }_{m}^{{\prime} })\\ := \mathop{{{{{{{{\rm{argmin}}}}}}}}}\limits_{{\theta }_{1},\ldots ,{\theta }_{m-1},{\theta }_{m}}\langle \hat{{{{{{{{{\mathcal{U}}}}}}}}}_{m}}({\theta }_{m}){\hat{{{{{{{{\mathcal{U}}}}}}}}}}_{m-1}({\theta }_{m-1})\ldots \hat{{{{{{{{{\mathcal{U}}}}}}}}}_{1}}({\theta }_{1}){\Psi }^{0}| H\hat{{{{{{{{{\mathcal{U}}}}}}}}}_{m}}({\theta }_{m}){\hat{{{{{{{{\mathcal{U}}}}}}}}}}_{m-1}({\theta }_{m-1})\ldots \hat{{{{{{{{{\mathcal{U}}}}}}}}}_{1}}({\theta }_{1}){\Psi }^{0}\rangle$$
    (8)

    and define the new ansatz wave-function \(\left\vert {\Psi }^{m}\right\rangle\) using the newly optimized parameters \({\theta }_{1}^{{\prime} },\ldots ,{\theta }_{m}^{{\prime} }\), i.e., define \(\left\vert {\Psi }^{m}\right\rangle := \hat{{{{{{{{{\mathcal{U}}}}}}}}}_{m}}({\theta }_{m}^{{\prime} })\hat{{{{{{{{{\mathcal{U}}}}}}}}}_{m-1}}({\theta }_{m-1}^{{\prime} })\ldots \hat{{{{{{{{{\mathcal{U}}}}}}}}}_{1}}({\theta }_{1}^{{\prime} })\left\vert {\Psi }^{0}\right\rangle\). Let us emphasize that although we also denote the newly optimized parameters at the current mth iteration by \({\theta }_{1}^{{\prime} },\ldots {\theta }_{m}^{{\prime} }\), these optimized values are not necessarily the same as those used to define \(\left\vert {\Psi }^{m-1}\right\rangle\) and referenced in Step 4 above.

  6.

    Return to Step 3 with the updated ansatz \(\left\vert {\Psi }^{m}\right\rangle\).
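The operator-selection step (Step 3) can be sketched classically with dense matrices. The sketch assumes that each pool element has the form exp(θτ) with a real antisymmetric generator τ, in which case the derivative of the energy at θ = 0 equals ⟨Ψ|[H, τ]|Ψ⟩. The random symmetric matrix standing in for the Hamiltonian, the three-qubit register, the pool of single-qubit excitation generators, and all function names are illustrative assumptions.

```python
import numpy as np
from itertools import combinations


def embed(op, p, n):
    """Place the 2x2 matrix `op` on qubit p of an n-qubit register (qubit 0 leftmost)."""
    out = np.array([[1.0]])
    for k in range(n):
        out = np.kron(out, op if k == p else np.eye(2))
    return out


def single_excitation_generators(n):
    """Generators Q_p^dagger Q_q - Q_q^dagger Q_p for all pairs p < q."""
    lower = [embed(np.array([[0.0, 1.0], [0.0, 0.0]]), p, n) for p in range(n)]
    return [lower[p].T @ lower[q] - lower[q].T @ lower[p]
            for p, q in combinations(range(n), 2)]


def energy(h, psi):
    return float(psi @ h @ psi)


def adapt_gradients(h, psi, pool):
    """<psi|[H, tau]|psi> for real symmetric H and real antisymmetric tau."""
    h_psi = h @ psi
    return np.array([2.0 * (h_psi @ (tau @ psi)) for tau in pool])


def rotate(tau, theta, psi):
    """Apply exp(theta * tau) to psi, using tau^3 = -tau for qubit excitations."""
    t1 = tau @ psi
    return psi + np.sin(theta) * t1 + (1.0 - np.cos(theta)) * (tau @ t1)


n = 3
rng = np.random.default_rng(7)
m = rng.standard_normal((2 ** n, 2 ** n))
h = 0.5 * (m + m.T)                 # toy Hamiltonian, not a molecular one
psi = np.zeros(2 ** n)
psi[0b100] = 1.0                    # one "electron" in orbital 0
pool = single_excitation_generators(n)

grads = adapt_gradients(h, psi, pool)
selected = int(np.argmax(np.abs(grads)))   # operator with the steepest energy slope
```

On hardware, each gradient would instead be estimated from measurements of the commutator on the current ansatz state; the selected operator is then appended and all angles are re-optimized as in Steps 4 and 5.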

There are essentially three types of operator pools that are used to construct the ADAPT-VQE ansatz.

  • Fermionic-ADAPT-VQE8 uses a pool of spin-complemented pairs of single and double fermionic excitation operators. The quantum circuits performing these unitary operations are of the staircase shape (see Fig. 8a).

  • Qubit-ADAPT-VQE10 divides the fermionic-ADAPT operators after the Jordan–Wigner mapping and takes the individual Pauli strings as operators of the pool. The quantum circuit for an operator is a single layer of fermionic excitation “CNOT-staircase” circuits, similar to the circuit displayed in Fig. 8b.

  • Qubit-Excitation-Based-ADAPT-VQE (QEB-ADAPT-VQE)9 uses a pool of qubit excitation operators. Exponential single-qubit and double-qubit excitation evolutions can be expressed using the qubit annihilation and creation operators \({Q}_{p}\) and \({Q}_{p}^{{{{\dagger}}} }\) defined through Equation (4) as

    $${U}_{pq}^{{{{{{{{\rm{(sq)}}}}}}}}}(\theta ) =\exp (\theta ({Q}_{p}^{{{{\dagger}}} }{Q}_{q}-{Q}_{q}^{{{{\dagger}}} }{Q}_{p}))\\ {U}_{pqrs}^{{{{{{{{\rm{(dq)}}}}}}}}}(\theta ) =\exp (\theta ({Q}_{p}^{{{{\dagger}}} }{Q}_{q}^{{{{\dagger}}} }{Q}_{r}{Q}_{s}-{Q}_{r}^{{{{\dagger}}} }{Q}_{s}^{{{{\dagger}}} }{Q}_{p}{Q}_{q})),$$
    (9)

    which, after the Jordan–Wigner encoding yields

    $${U}_{pq}^{{{{{{{{\rm{(sq)}}}}}}}}}(\theta ) = \exp \left(-i\frac{\theta }{2}\left({X}_{q}{Y}_{p}-{Y}_{q}{X}_{p}\right)\right)\\ {U}_{pqrs}^{{{{{{{{\rm{(dq)}}}}}}}}}(\theta ) = \exp \left(-i\frac{\theta }{8}\left({X}_{r}{Y}_{s}{X}_{p}{X}_{q}+{Y}_{r}{X}_{s}{X}_{p}{X}_{q}+{Y}_{r}{Y}_{s}{Y}_{p}{X}_{q}+{Y}_{r}{Y}_{s}{X}_{p}{Y}_{q}\right.\right.\\ -\left.\left.{X}_{r}{X}_{s}{Y}_{p}{X}_{q}-{X}_{r}{X}_{s}{X}_{p}{Y}_{q}-{Y}_{r}{X}_{s}{Y}_{p}{Y}_{q}-{X}_{r}{Y}_{s}{Y}_{p}{Y}_{q}\right)\right),$$
    (10)

    with p, q, r, and s denoting, as usual, indices for the spin-orbitals, and we have written (sq) and (dq) as abbreviations for single-qubit and double-qubit excitation evolutions respectively. The quantum circuits corresponding to the double-qubit excitation operators41 are then given in Fig. 8c.
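The single-qubit excitation in Equations (9) and (10) can be checked numerically on two qubits. The sketch below takes p = 0 and q = 1, with qubit 0 as the leftmost tensor factor; these choices, the angle 0.3, and the helper `qubit_excitation` are assumptions for illustration. The closed form used for the exponential relies on the identity τ³ = −τ, which holds for this generator.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Q = (X + 1j * Y) / 2  # |0><1|, the single-qubit factor of Equation (4)

# Operators on the two-qubit register: p = 0 (left factor), q = 1 (right factor).
Q_p, Q_q = np.kron(Q, I2), np.kron(I2, Q)

# Generator of Equation (9): Q_p^dagger Q_q - Q_q^dagger Q_p.
tau_from_q = Q_p.conj().T @ Q_q - Q_q.conj().T @ Q_p

# Generator read off from Equation (10): -(i/2) (X_q Y_p - Y_q X_p).
tau_from_paulis = -0.5j * (np.kron(Y, X) - np.kron(X, Y))


def qubit_excitation(tau, theta):
    """exp(theta * tau) in closed form, valid because tau^3 = -tau."""
    return np.eye(tau.shape[0]) + np.sin(theta) * tau + (1.0 - np.cos(theta)) * (tau @ tau)


theta = 0.3
U = qubit_excitation(tau_from_q, theta)

# Basis vectors |01> and |10> (index = 2 * occupation of qubit 0 + occupation of qubit 1).
e01, e10 = np.eye(4)[1], np.eye(4)[2]
rotated = U @ e10  # mixes |10> and |01>, leaves |00> and |11> untouched
```

The evolution acts as a plane rotation between |10⟩ and |01⟩, which is why the caption of Fig. 8 speaks of rotations on pairs of qubits.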

Fig. 8: Examples of quantum circuits for different fermionic excitation and qubit evolution operators.

a displays a quantum circuit applying the operator \({e}^{\theta ({a}_{i}^{{{{\dagger}}} }{a}_{k})}\) as part of a single fermionic excitation, b shows a quantum circuit performing a generic single-qubit evolution \({U}_{pq}^{({{{{{{{\rm{sq}}}}}}}})}(\theta )\), and c displays a quantum circuit performing a generic double-qubit evolution \({U}_{pqrs}^{({{{{{{{\rm{dq}}}}}}}})}(\theta )\). Note that the terms single-qubit and double-qubit excitations refer to the fact that these operators perform rotations on one pair and two pairs of qubits respectively, not on one and two individual qubits.

Extensive comparisons between these pools of operators have been carried out by Yordanov et al.9 and numerical evidence suggests that QEB-ADAPT-VQE generates the most computationally tractable ansatz wave-functions. This is primarily because qubit excitation circuits can be constructed using far fewer quantum gates than fermionic excitation circuits41, while qubit excitation evolutions approximate molecular electronic wave-functions with almost the same level of accuracy as fermionic excitation evolutions. For the purpose of this article, therefore, we will restrict our attention to operator pools involving qubit excitation evolutions and work in the framework of QEB-ADAPT-VQE.

The overlap-guided adaptive algorithm (Overlap-ADAPT)

The numerical evidence presented in the articles8,9,14 demonstrates that the ADAPT-VQE algorithm is capable of approximating the ground state full-CI energy with very high accuracy. Unfortunately, achieving a suitably accurate approximation to the sought-after energy may require a large number of ADAPT iterations, which results in both deep quantum circuits that cannot be implemented on the current generation of NISQ devices and an increasingly computationally expensive optimization procedure. This problem is particularly apparent in strongly correlated systems for which the ADAPT algorithm frequently encounters energy plateaus prior to achieving the usual chemical accuracy threshold of \(1{0}^{-3}\) Hartree. During such plateaus, a series of new operators is added to the ansatz without meaningfully reducing the energy. Since quantum chemists are primarily interested in numerical results in the regime \(1{0}^{-3}\) to \(1{0}^{-4}\) Hartree, i.e., slightly more accurate than the chemical accuracy threshold, it is natural to ask if the ADAPT-VQE procedure could be modified so as to avoid these initial energy plateau slowdowns and achieve the required accuracy using an ansatz compact enough to be implementable on current NISQ devices.

To make these ideas more precise, let us first introduce for any natural number p, the set of all wave-functions that can be represented by the product of exactly p exponential, one-body, and two-body qubit excitation evolution operators acting on the Hartree-Fock reference state:

$${W}_{p}:= \left\{\left(\mathop{\prod }\limits_{k=1}^{p}\exp \left({\theta }_{k}{Q}_{{p}_{k}}{Q}_{{p}_{k}}^{{{{\dagger}}} }\right)\right)\left\vert {\Psi }_{{{{{{{{\rm{HF}}}}}}}}}\right\rangle :{\theta }_{k}\in {\mathbb{R}},\,{Q}_{{p}_{k}},{Q}_{{p}_{k}}^{{{{\dagger}}} }\,\,{{\mbox{defined as in Equation}}}\,(4)\right\}.$$
(11)

 Given now an arbitrary electronic wave-function \(\left\vert {\Psi }_{{{{{{{{\rm{ref}}}}}}}}}\right\rangle\), we can define the best approximation of \(\left\vert {\Psi }_{{{{{{{{\rm{ref}}}}}}}}}\right\rangle\) in the set Wp as

$$\left\vert {\Psi }_{p}^{* }\right\rangle := \mathop{{{{{{{{\rm{argmin}}}}}}}}}\limits_{\left\vert \Psi \right\rangle \in {W}_{p}}\left\Vert \left\vert \Psi \right\rangle -\left\vert {\Psi }_{{{{{{{{\rm{ref}}}}}}}}}\right\rangle \right\Vert ,$$
(12)

where \(\Vert \cdot \Vert\) denotes a suitable norm such as the usual \({L}^{2}\) or \({H}^{1}\) norms on the space of all electronic wave-functions. The \({L}^{2}\)-norm and the \({H}^{1}\)-norm can both be computed on either classical computers or on quantum devices, depending on whether the underlying wave-functions are represented classically or on quantum circuitry. The computation of the \({L}^{2}\)-norm, however, is more direct and we have therefore adopted this choice of norm for the numerical simulations considered in this study.
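For the \({L}^{2}\)-norm, the link between the minimization in Equation (12) and an overlap can be made explicit. Assuming that the target wave-function is normalized (an assumption of this remark), and using the fact that the qubit excitation evolutions (9) are unitary so that the states they generate from the Hartree-Fock reference are normalized, one has

$$\left\Vert \left\vert \Psi \right\rangle -\left\vert {\Psi }_{{\rm{ref}}}\right\rangle \right\Vert ^{2}=\langle \Psi | \Psi \rangle +\langle {\Psi }_{{\rm{ref}}}| {\Psi }_{{\rm{ref}}}\rangle -2\,{\rm{Re}}\,\langle {\Psi }_{{\rm{ref}}}| \Psi \rangle =2-2\,{\rm{Re}}\,\langle {\Psi }_{{\rm{ref}}}| \Psi \rangle .$$

Under these assumptions, minimizing the \({L}^{2}\) distance to the target is therefore equivalent to maximizing the real part of the overlap \(\langle {\Psi }_{{\rm{ref}}}| \Psi \rangle\).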

Returning now to Equation (12), we see that \(\vert {\Psi }_{p}^{* }\rangle\) is the best approximation of an arbitrary target wave-function \(\vert {\Psi }_{{{{{{{{\rm{ref}}}}}}}}}\rangle\) using a product of exactly p exponential qubit excitation evolution operators acting on the Hartree-Fock reference state. The question we are now interested in answering is the following: If we take the full-CI wave-function \(\vert {\Psi }_{{{{{{{{\rm{FCI}}}}}}}}}\rangle\) as the target, does the corresponding best approximation \(\vert {\Psi }_{p}^{{{{{{{{\rm{FCI}}}}}}}}}\rangle\) defined according to (12) provide a chemically accurate wave-function for small choices of p? More precisely, we wish to explore if for small choices of maximal operator count p it holds that

$$\left\langle {\Psi }_{p}^{{{{{{{{\rm{FCI}}}}}}}}}\right\vert H\left\vert {\Psi }_{p}^{{{{{{{{\rm{FCI}}}}}}}}}\right\rangle -\left\langle {\Psi }_{{{{{{{{\rm{FCI}}}}}}}}}\right\vert H\left\vert {\Psi }_{{{{{{{{\rm{FCI}}}}}}}}}\right\rangle =\left\langle {\Psi }_{p}^{{{{{{{{\rm{FCI}}}}}}}}}\right\vert H\left\vert {\Psi }_{p}^{{{{{{{{\rm{FCI}}}}}}}}}\right\rangle -{E}_{0} < 1{0}^{-3}\ {{{{{{{\rm{Ha}}}}}}}}.$$
(13)

 The answer to this question will be a strong indication as to whether there exists an ansatz wave-function that is simultaneously more compact than the ADAPT-VQE ansatz and which can also capture the bulk of the electronic correlation in the system. Let us emphasize that we are specifically interested in understanding whether we can obtain a more compact ansatz wave-function than that produced by ADAPT-VQE at chemical accuracy and not at the level of full-CI accuracy.

Unfortunately, answering this question by solving the optimization problem (12) for an arbitrary target wave-function exactly is not computationally feasible since the size of the set Wp grows exponentially in p. Nevertheless, an adaptive, iterative procedure that generates an approximate solution to the optimization problem (12) can be defined as follows (see also Fig. 9). Given a target wave-function \(\left\vert {\Psi }_{{{{{{{{\rm{ref}}}}}}}}}\right\rangle\) and a maximal operator count p:

  1.

    Set the initialization to the Hartree-Fock reference state, i.e., set \(\left\vert {\Psi }^{0}\right\rangle =\left\vert {\Psi }_{{{{{{{{\rm{HF}}}}}}}}}\right\rangle\).

  2.

    At the mth iteration, m ≤ p, identify the parametrized exponential qubit excitation evolution operator \({\widehat{A}}_{m}({\theta }_{m})\) whose action on the current ansatz \(\left\vert {\Psi }^{m-1}\right\rangle\) will produce a new wave-function with the largest overlap with respect to the target wave-function. This identification is done by computing the following gradient involving the current ansatz wave-function at θm = 0:

    $$\frac{\partial }{\partial {\theta }_{m}}\langle {\Psi }_{{{{{{{{\rm{ref}}}}}}}}}| \,{\widehat{A}}_{m}({\theta }_{m}){\Psi }^{m-1}\rangle {\left\vert \right.}_{{\theta }_{m} = 0}.$$
    (14)

    A detailed description of how to compute the gradients given in Equation (14) can be found in the Supplementary Note 1.

  3.

    Append the selected operator to the left of the current ansatz wave-function \(\vert {\Psi }^{m-1}\rangle\), i.e., define \(\vert \widetilde{{\psi }^{m}}\rangle := {\widehat{A}}_{m}({\theta }_{m})\vert {\Psi }^{m-1}\rangle ={\widehat{A}}_{m}({\theta }_{m}){\widehat{A}}_{m-1}({\theta }_{m-1}^{{\prime} })\ldots {\widehat{A}}_{1}({\theta }_{1}^{{\prime} })\left\vert {\Psi }^{0}\right\rangle\).

  4.

    Optimize all parameters θm, θm−1, …,θ1 in the new ansatz wave-function \(\left\vert \widetilde{{\psi }^{m}}\right\rangle\) so as to maximize its overlap with the target wave-function, i.e., solve the optimization problem

    $${\overrightarrow{\theta }}^{{{{{{{{\rm{opt}}}}}}}}} := ({\theta }_{1}^{{\prime} },\ldots ,{\theta }_{m-1}^{{\prime} },{\theta }_{m}^{{\prime} })\\ := \mathop{{{{{{{{\rm{argmax}}}}}}}}}\limits_{{\theta }_{1},\ldots ,{\theta }_{m-1},{\theta }_{m}}\langle {\Psi }_{{{{{{{{\rm{ref}}}}}}}}}| {\widehat{A}}_{m}({\theta }_{m}){\widehat{A}}_{m-1}({\theta }_{m-1})\ldots {\widehat{A}}_{1}({\theta }_{1}){\Psi }^{0}\rangle ,$$
    (15)

    and define the new ansatz wave-function \(\left\vert {\Psi }^{m}\right\rangle\) using the newly optimized parameters \({\theta }_{1}^{{\prime} },\ldots ,{\theta }_{m}^{{\prime} }\), i.e., define \(\left\vert {\Psi }^{m}\right\rangle := {\widehat{A}}_{m}({\theta }_{m}^{{\prime} }){\widehat{A}}_{m-1}({\theta }_{m-1}^{{\prime} })\ldots {\widehat{A}}_{1}({\theta }_{1}^{{\prime} })\left\vert {\Psi }^{0}\right\rangle\). Let us emphasize that although we also denote the newly optimized parameters at the current mth iteration by \({\theta }_{1}^{{\prime} },\ldots {\theta }_{m}^{{\prime} }\), these optimized values are not necessarily the same as those used to define \(\left\vert {\Psi }^{m-1}\right\rangle\) and referenced in Step 3 above.

  5.

    If the total number of operators in the updated ansatz is equal to p, exit the iterative process. Otherwise go to Step 2 with the updated ansatz wave-function.
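The five steps above can be sketched as a small classical simulation. This is a minimal illustration under stated assumptions, not the implementation used in this study: the four-qubit register, the hand-picked real target vector, the pool construction, and all function names are hypothetical. The inner optimizer is a coordinate-wise exact maximization, which exploits the fact that the overlap is of the form A + B sin θ + C cos θ in each individual angle; it is swapped in here for brevity in place of a general-purpose optimizer.

```python
import numpy as np
from itertools import combinations


def embed(op, p, n):
    """Place the 2x2 matrix `op` on qubit p of an n-qubit register (qubit 0 leftmost)."""
    out = np.array([[1.0]])
    for k in range(n):
        out = np.kron(out, op if k == p else np.eye(2))
    return out


def qubit_excitation_pool(n):
    """Real antisymmetric generators of single- and double-qubit excitations."""
    lower = [embed(np.array([[0.0, 1.0], [0.0, 0.0]]), p, n) for p in range(n)]
    pool = []
    for p, q in combinations(range(n), 2):
        a = lower[p].T @ lower[q]
        pool.append(a - a.T)
    for p, q, r, s in combinations(range(n), 4):
        for (i, j), (k, l) in [((p, q), (r, s)), ((p, r), (q, s)), ((p, s), (q, r))]:
            a = lower[i].T @ lower[j].T @ lower[k] @ lower[l]
            pool.append(a - a.T)
    return pool


def apply_ansatz(thetas, gens, psi0):
    """Apply exp(theta_1 tau_1) first and exp(theta_m tau_m) last (uses tau^3 = -tau)."""
    psi = psi0.copy()
    for theta, tau in zip(thetas, gens):
        t1 = tau @ psi
        psi = psi + np.sin(theta) * t1 + (1.0 - np.cos(theta)) * (tau @ t1)
    return psi


def maximize_overlap(thetas, gens, psi0, target, sweeps=20):
    """Step 4: coordinate-wise exact maximization of <target|Psi(thetas)>."""
    thetas = list(thetas)

    def f(k, value):
        trial = thetas[:k] + [value] + thetas[k + 1:]
        return target @ apply_ansatz(trial, gens, psi0)

    for _ in range(sweeps):
        for k in range(len(thetas)):
            f0, fp, fm = f(k, 0.0), f(k, np.pi / 2), f(k, -np.pi / 2)
            a = 0.5 * (fp + fm)
            b, c = 0.5 * (fp - fm), f0 - a   # overlap = a + b sin(t) + c cos(t)
            thetas[k] = float(np.arctan2(b, c))
    return thetas


def overlap_adapt(target, psi0, pool, max_ops):
    thetas, gens, history = [], [], []
    for _ in range(max_ops):                                       # Step 5: stop at p operators
        psi = apply_ansatz(thetas, gens, psi0)
        grads = np.array([target @ (tau @ psi) for tau in pool])   # Step 2: Equation (14)
        gens.append(pool[int(np.argmax(np.abs(grads)))])           # Step 3: append operator
        thetas = maximize_overlap(thetas + [0.0], gens, psi0, target)
        history.append(float(target @ apply_ansatz(thetas, gens, psi0)))
    return thetas, gens, history


n = 4
hf = np.zeros(2 ** n)
hf[0b1100] = 1.0                                                   # Step 1: two occupied orbitals
target = np.zeros(2 ** n)
for bits, amp in {0b1100: 0.9, 0b1010: 0.1, 0b1001: -0.2,
                  0b0110: 0.2, 0b0101: -0.1, 0b0011: -0.3}.items():
    target[bits] = amp                                             # illustrative target only
target /= np.linalg.norm(target)

thetas, gens, history = overlap_adapt(target, hf, qubit_excitation_pool(n), max_ops=4)
```

Here `history` records the overlap after each iteration; by construction it cannot decrease, since every new operator enters with angle zero and each coordinate update is an exact maximization.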

Fig. 9: Workflow for overlap-guided adaptive algorithm (Overlap-ADAPT).

Note that the target state \(\left\vert {\Psi }_{{{{{{{{\rm{ref}}}}}}}}}\right\rangle\) and the maximal operator count p have to be provided as an input to the algorithm.

We refer to this adaptive procedure as the Overlap-ADAPT-VQE. Let us emphasize here that rather than fixing a maximal operator count, we may employ some other convergence criteria such as the magnitude of the overlap or the magnitude of the gradient vectors as in the original ADAPT-VQE. Moreover, depending on whether the target wave-function is in a quantum or a classical representation, the gradient screening and the overlap measurements can be performed using either a quantum or a classical device. In particular, if the targeted wave-function is classically computed, then no additional quantum resources or measurements are required to compute the overlaps. Classically computed wave-functions that are particularly suited to the Overlap-ADAPT-VQE framework are provided by the so-called Selected-CI (SCI) methods.

Combining classical selected-CI approaches and quantum computing

The key idea of SCI methods is to build a compact representation of the reference wave-function by selecting, on the fly, the most relevant Slater determinants according to an importance criterion based on perturbation theory (PT). Thanks to this selection of the Slater determinants, the variational energy of the reference wave-function converges rapidly towards the full-CI energy. Although the recent revival of SCI approaches23,42,43,44,45,46,47,48,49 has significantly pushed further the size limit of systems for which near full-CI quality energies can be obtained (typically a few tens of correlated electrons in about two hundred orbitals50,51), the scaling of SCI methods is intrinsically exponential in the number of correlated electrons and orbitals.

The reason for this exponential scaling is directly linked to the linear parametrization of the sought-after wave-function in terms of Slater determinants, which implies that the intrinsic exponential structure of the wave-function must be built explicitly by adding more and more determinants to the reference wave-function. This necessarily leads to size consistency errors which manifest through an underestimation of the coefficients of the reference and perturbative wave-functions and therefore of the correlation energy. Because the size consistency errors grow with the total (absolute) value of the correlation energy, SCI methods struggle more and more as the number of correlated electrons increases and/or the strength of correlation increases. Recently, attempts to cure this problem have been proposed with a selection of the individual excitation operators52,53 in a single-reference CC approach.

To overcome these limitations of SCI approaches, an alternative idea is to combine the robust and linear parametrization of SCI with the intrinsic exponential parametrization of the ansatz used in quantum computation to take advantage of both worlds:

  1.

    While reaching chemical accuracy in SCI methods is a struggle in the strong correlation regime, obtaining a compact and robust representation of the bulk of correlation effects is an easy task thanks to the smart selection of Slater determinants and the simplicity of the linear parametrization;

  2.

    Use this compact SCI wave-function as the target of the overlap-guided adaptive algorithm so as to obtain an intermediate wave-function represented in terms of qubit excitation evolution operators acting on the Hartree-Fock reference state;

  3.

    Use the intermediate wave-function as a high accuracy initialization of a new QEB-ADAPT-VQE procedure.

For the purpose of this study, we choose to employ the so-called CI perturbatively selected iteratively (CIPSI) algorithm implemented in QP223 to generate the required SCI wave-function.

The CIPSI algorithm in a nutshell

The CIPSI algorithm, which was originally introduced in the late seventies54,55, is the archetype of SCI approaches: it approximates the full-CI wave-function through an iterative selected-CI procedure, and the full-CI energy through a second-order multi-reference perturbation theory (in this case, with an Epstein–Nesbet56,57 partition).

The CIPSI energy is defined as

$${E}_{{{{{{{{\rm{CIPSI}}}}}}}}}:= {E}_{{{\mbox{v}}}}+{E}^{(2)}.$$
(16)

Here, Ev is the variational energy given by

$${E}_{{{\mbox{v}}}}:= \mathop{\min }\limits_{\{{c}_{{{{{{{{\rm{I}}}}}}}}}\}}\frac{\left\langle {\Psi }^{(0)}\right\vert H\left\vert {\Psi }^{(0)}\right\rangle }{\langle {\Psi }^{(0)}| {\Psi }^{(0)}\rangle },$$
(17)

where the reference wave-function \(\left\vert {\Psi }^{(0)}\right\rangle ={\sum }_{{{{{{{{\rm{I}}}}}}}}\in {{{{{{{\mathcal{R}}}}}}}}}\,\,{c}_{{{{{{{{\rm{I}}}}}}}}}\,\,\left\vert {{{{{{{\rm{I}}}}}}}}\right\rangle\) is expanded in Slater determinants \(\left\vert {{{{{{{\rm{I}}}}}}}}\right\rangle\) within the CI reference space \({{{{{{{\mathcal{R}}}}}}}}\), and E(2) is the second-order energy correction defined as

$${E}^{(2)}:= \mathop{\sum}\limits_{\kappa }\frac{| \left\langle {\Psi }^{(0)}\right\vert H\left\vert \kappa \right\rangle {| }^{2}}{{E}_{{{\mbox{v}}}}-\left\langle \kappa \right\vert H\left\vert \kappa \right\rangle }=\mathop{\sum}\limits_{\kappa }\,\,{e}_{\kappa }^{(2)},$$
(18)

where κ denotes a determinant outside the reference space \({{{{{{{\mathcal{R}}}}}}}}\).

The CIPSI energy is systematically refined by doubling the size of the CI reference space at each iteration, selecting the determinants κ with the largest \(| {e}_{\kappa }^{(2)}|\). The calculations are stopped when a target value of E(2) is reached.
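The selection loop described above can be sketched on a dense toy matrix. The code below is an illustration only and is unrelated to the QP2 implementation: the matrix `h` stands in for the Hamiltonian in a basis of Slater determinants, its entries are random, and the function name `cipsi`, the threshold, and the iteration cap are assumptions.

```python
import numpy as np


def cipsi(h, ref=(0,), e2_target=1e-4, max_iter=20):
    """Toy selected-CI loop; returns the reference space and (size, E_v, E2) per iteration."""
    ref = list(ref)
    dim = h.shape[0]
    history = []
    for _ in range(max_iter):
        # Variational step, Equation (17): lowest eigenpair within the reference space.
        evals, evecs = np.linalg.eigh(h[np.ix_(ref, ref)])
        e_v, c = float(evals[0]), evecs[:, 0]
        ext = [k for k in range(dim) if k not in ref]
        if not ext:
            history.append((len(ref), e_v, 0.0))
            break
        # Perturbative step, Equation (18), with an Epstein-Nesbet denominator.
        coupling = h[np.ix_(ext, ref)] @ c            # <kappa|H|Psi0>
        e_kappa = coupling ** 2 / (e_v - h[ext, ext])
        e2 = float(e_kappa.sum())
        history.append((len(ref), e_v, e2))
        if abs(e2) < e2_target:
            break
        # Selection step: double the reference space with the largest |e_kappa|.
        n_add = min(len(ref), len(ext))
        best = np.argsort(-np.abs(e_kappa))[:n_add]
        ref += [ext[i] for i in best]
    return ref, history


# Toy matrix: well-separated diagonal plus weak random couplings (illustrative values).
dim = 64
rng = np.random.default_rng(0)
off = 0.05 * rng.standard_normal((dim, dim))
h = 0.5 * (off + off.T)
np.fill_diagonal(h, np.arange(dim, dtype=float))

ref, history = cipsi(h)
e_cipsi = history[-1][1] + history[-1][2]             # Equation (16)
```

In this toy setting the variational energy can only decrease as the reference space grows, and every contribution to the second-order correction is negative because the diagonal entries of the external determinants lie above the variational energy.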