Main

Long before we possessed computers, human beings strove to find patterns in data. Ptolemy fitted observations of the motions of the stars to a geocentric model of the cosmos, with complex epicycles to explain the retrograde motions of the planets. In the sixteenth century, Kepler analysed the data of Copernicus and Brahe to reveal a previously hidden pattern: planets move in ellipses with the Sun at one focus of the ellipse. The analysis of astronomical data to reveal such patterns gave rise to mathematical techniques such as methods for solving linear equations (Newton–Gauss), learning optima via gradient descent (Newton), polynomial interpolation (Lagrange), and least-squares fitting (Laplace). The nineteenth and early twentieth centuries gave rise to a broad range of mathematical methods for analysing data to reveal the patterns that it contained.

The construction of digital computers in the mid-twentieth century allowed the automation of data analysis techniques. Over the past half-century, the rapid progression of computer power has allowed the implementation of linear algebraic data analysis techniques such as regression and principal component analysis, and has led to more complex learning methods such as support vector machines. Over the same time frame, the development and rapid advance of digital computers spawned novel machine learning methods. Artificial neural networks such as perceptrons were implemented in the 1950s (ref. 1), as soon as computers had the power to realize them. Deep learning methods built on neural networks (such as Hopfield networks and Boltzmann machines) and on training techniques (such as backpropagation) were introduced and implemented in the 1960s to 1990s (ref. 2). In the past decade, particularly in the past five years, the combination of powerful computers and special-purpose information processors capable of implementing deep networks with billions of weights3, together with their application to very large datasets, has revealed that such deep learning networks are capable of identifying complex and subtle patterns in data.

Quantum mechanics is well known to produce atypical patterns in data. Classical machine learning methods such as deep neural networks frequently have the feature that they can both recognize statistical patterns in data and produce data that possess the same statistical patterns: they recognize the patterns that they produce. This observation suggests the following hope. If small quantum information processors can produce statistical patterns that are computationally difficult for a classical computer to produce, then perhaps they can also recognize patterns that are equally difficult to recognize classically.

The realization of this hope depends on whether efficient quantum algorithms can be found for machine learning. A quantum algorithm is a set of instructions solving a problem, such as determining whether two graphs are isomorphic, that can be performed on a quantum computer. Quantum machine learning software makes use of quantum algorithms as part of a larger implementation. By analysing the steps that quantum algorithms prescribe, it becomes clear that they have the potential to outperform classical algorithms for specific problems (that is, reduce the number of steps required). This potential is known as quantum speedup.

The notion of a quantum speedup depends on whether one takes a formal computer science perspective—which demands mathematical proofs—or a perspective based on what can be done with realistic, finite-size devices—which requires solid statistical evidence of a scaling advantage over some finite range of problem sizes. For the case of quantum machine learning, the best possible performance of classical algorithms is not always known. This is similar to the case of Shor’s polynomial-time quantum algorithm for integer factorization: no sub-exponential-time classical algorithm has been found, but the possibility is not provably ruled out.

Determining a scaling advantage of quantum over classical machine learning would rely on the existence of a quantum computer; this is called a 'benchmarking' problem. Such advantages could include improved classification accuracy and sampling of classically inaccessible systems. Accordingly, quantum speedups in machine learning are currently characterized using idealized measures from complexity theory: query complexity and gate complexity (see Box 1 and Box 1 Table). Query complexity measures the number of queries to the information source for the classical or quantum algorithm. A quantum speedup results if the number of queries needed to solve a problem is lower for the quantum algorithm than for the classical algorithm. To determine the gate complexity, the number of elementary quantum operations (or gates) required to obtain the desired result is counted.

Table 1 Speedup techniques for given quantum machine learning subroutines

Query and gate complexity are idealized models that quantify the necessary resources to solve a problem class. Without knowing how to map this idealization to reality, not much can be said about the necessary resource scaling in a real-world scenario. Therefore, the required resources of classical machine learning algorithms are mostly quantified by numerical experimentation. The resource requirements of quantum machine learning algorithms are likely to be similarly difficult to quantify in practice. The analysis of their practical feasibility is a central subject of this review.

As will be seen throughout the review, there are quantum algorithms for machine learning that exhibit quantum speedups4,5,6,7. For example, the quantum basic linear algebra subroutines (BLAS)—Fourier transforms, finding eigenvectors and eigenvalues, solving linear equations—exhibit exponential quantum speedups over their best known classical counterparts8,9,10. This quantum BLAS (qBLAS) translates into quantum speedups for a variety of data analysis and machine learning algorithms including linear algebra, least-squares fitting, gradient descent, Newton’s method, principal component analysis, linear, semidefinite and quadratic programming, topological analysis and support vector machines9,11,12,13,14,15,16,17,18,19. At the same time, special-purpose quantum information processors such as quantum annealers and programmable quantum optical arrays are well matched to deep learning architectures20,21,22. Although it is not clear yet to what extent this potential can be realized, there are reasons to be optimistic that quantum computers can recognize patterns in data that classical computers cannot.

The learning machines we consider can be either classical23,24,25,26,27,28,29,30,31,32 or quantum8,9,11,13,33,34,35,36. The data they analyse can be either classical or quantum states produced by quantum sensing or measuring apparatus30,37. We briefly discuss conventional machine learning—the use of classical computers to find patterns in classical data. We then turn to quantum machine learning, where the data that the quantum computer analyses can be either classical data, which ends up encoded as quantum states, or quantum data. Finally, we discuss briefly the problem of using classical machine learning techniques to find patterns in quantum dynamics.

Classical machine learning

Classical machine learning and data analysis can be divided into several categories. First, computers can be used to perform 'classic' data analysis methods such as least-squares regression and polynomial interpolation. Machine learning protocols can be supervised or unsupervised. In supervised learning, the training data are divided into labelled categories, such as samples of handwritten digits together with the actual number the handwritten digit is supposed to represent, and the job of the machine is to learn how to assign labels to data outside the training set. In unsupervised learning, the training set is unlabelled, and the goal of the machine is to find the natural categories into which the training data fall (for example, different types of photos on the internet) and then to categorize data outside the training set. Finally, there are machine learning tasks, such as playing Go, that involve combinations of supervised and unsupervised learning, together with training sets that may be generated by the machine itself.

Linear-algebra-based quantum machine learning

A wide variety of data analysis and machine learning protocols operate by performing matrix operations on vectors in a high-dimensional vector space. But quantum mechanics is all about matrix operations on vectors in high-dimensional vector spaces.

The key ingredient behind these methods is that the quantum state of n quantum bits, or qubits, is a vector in a 2^n-dimensional complex vector space; performing a quantum logic operation or a measurement on qubits multiplies the corresponding state vector by a 2^n × 2^n matrix. By building up such matrix transformations, quantum computers have been shown to perform common linear algebraic operations such as Fourier transforms38, finding eigenvectors and eigenvalues39, and solving linear sets of equations over 2^n-dimensional vector spaces in time that is polynomial in n, exponentially faster than their best known classical counterparts8. The last of these is commonly referred to as the Harrow, Hassidim and Lloyd (HHL) algorithm8 (see Box 2). The original variant assumed a well conditioned matrix that is sparse. Sparsity is unlikely in data science, but later improvements relaxed this assumption to include low-rank matrices as well10,33,40. Going beyond HHL, we survey here several quantum algorithms that appear as subroutines when linear algebra techniques are employed in quantum machine learning software.
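
To make the dimension counting concrete, here is a minimal NumPy sketch (illustrative only; the three-qubit register and the choice of a Hadamard gate are our own) showing that n qubits occupy a 2^n-dimensional complex vector space and that a single gate acts as a 2^n × 2^n matrix assembled from Kronecker products:

```python
import numpy as np

n = 3                                  # number of qubits
dim = 2 ** n                           # state space dimension: 2^3 = 8

# |000> as a 2^n-dimensional complex vector
state = np.zeros(dim, dtype=complex)
state[0] = 1.0

# A Hadamard gate on the first qubit: the full operator on n qubits is
# the Kronecker product H (x) I (x) I, a 2^n x 2^n matrix.
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
I = np.eye(2, dtype=complex)
H0 = np.kron(H, np.kron(I, I))         # shape (8, 8)

state = H0 @ state                     # one gate = one matrix-vector multiply
print(state.round(3))                  # (|000> + |100>)/sqrt(2)
```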

Quantum principal component analysis

For example, consider principal component analysis (PCA). Suppose that the data are presented in the form of vectors vj in a d-dimensional vector space, where d = 2^n = N. For example, vj could be the vector of changes in prices of all stocks in the stock market from time tj to time tj+1. The covariance matrix of the data is C = (1/N) ∑j vj vj^T, where superscript T denotes the transpose operation: the covariance matrix summarizes the correlations between the different components of the data, for example, correlations between changes in the prices of different stocks. In its simplest form, principal component analysis operates by diagonalizing the covariance matrix: C = ∑k ek ck ck^T, where the ck are the eigenvectors of C and the ek are the corresponding eigenvalues. (Because C is symmetric, the eigenvectors ck form an orthonormal set.) If only a few of the eigenvalues ek are large, and the remainder are small or zero, then the eigenvectors corresponding to those eigenvalues are called the principal components of C. Each principal component represents an underlying common trend or form of correlation in the data, and decomposing a data vector v in terms of principal components, v = ∑k ṽk ck, allows one both to compress the representation of the data and to predict future behaviour. Classical algorithms for performing PCA scale as O(d^2) in terms of computational complexity and query complexity. (We note that we make use of 'big O' notation to keep track of the leading term that dominates scaling.)
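
The classical procedure just described fits in a few lines of NumPy; this is a hedged sketch on synthetic data (the dataset and the number of retained components are our own choices):

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 500, 8                          # N data vectors in a d-dimensional space
V = rng.standard_normal((N, d)) @ rng.standard_normal((d, d))
V -= V.mean(axis=0)                    # mean-centre so C is the covariance

C = V.T @ V / N                        # C = (1/N) sum_j v_j v_j^T, shape (d, d)
e, c = np.linalg.eigh(C)               # e: eigenvalues e_k; columns of c: eigenvectors c_k
idx = np.argsort(e)[::-1]              # sort by decreasing eigenvalue
e, c = e[idx], c[:, idx]

k = 2                                  # keep the k principal components
v_tilde = V @ c[:, :k]                 # compressed representation of each v_j
print(e[:k] / e.sum())                 # fraction of variance captured
```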

For quantum principal component analysis of classical data11, we choose a data vector vj at random, and use a quantum random access memory (qRAM)41 to map that vector into a quantum state: vj → |vj⟩. The quantum state that summarizes the vector has log d qubits, and the operation of the qRAM requires O(d) operations divided over O(log d) steps that can be performed in parallel. Because vj was chosen at random, the resulting quantum state has a density matrix ρ = (1/N) ∑j |vj⟩⟨vj|, where N is the number of data vectors. By comparison with the covariance matrix C for the classical data, we see that the density matrix for the quantum version of the data actually is the covariance matrix, up to an overall factor. By repeatedly sampling the data, and using a trick called density matrix exponentiation42 combined with the quantum phase estimation algorithm39, which finds eigenvectors and eigenvalues of matrices, we can take the quantum version of any data vector |v⟩ and decompose it into the principal components |ck⟩, revealing the eigenvalue of C at the same time: |v⟩ → ∑k ṽk |ck⟩|ẽk⟩. The properties of the principal components of C can then be probed by making measurements on the quantum representation of the eigenvectors of C. The quantum algorithm scales as O[(log N)^2] in both computational complexity and query complexity. That is, quantum PCA is exponentially more efficient than classical PCA.
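
The identity at the heart of quantum PCA, that sampling an amplitude-encoded data vector uniformly at random yields a density matrix equal to the covariance matrix of the normalized data, can be checked numerically. The sketch below diagonalizes ρ directly rather than simulating density matrix exponentiation and phase estimation, so it illustrates only the linear algebra, not the quantum speedup:

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 200, 4
V = rng.standard_normal((N, d))

# Amplitude encoding: the qRAM maps v_j -> |v_j> = v_j / |v_j|.
kets = V / np.linalg.norm(V, axis=1, keepdims=True)

# Choosing j uniformly at random yields the mixed state
# rho = (1/N) sum_j |v_j><v_j|, i.e. the covariance matrix of the
# normalized data up to an overall factor.
rho = kets.T @ kets / N
print(np.isclose(np.trace(rho), 1.0))       # a valid density matrix

# Phase estimation applied to exp(-i*rho*t) would reveal these
# eigenvalues e_k and principal components |c_k>; here we just diagonalize.
e, c = np.linalg.eigh(rho)
print(e[::-1])                              # principal eigenvalues of rho
```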

Quantum support vector machines and kernel methods

The simplest examples of supervised machine learning algorithms are linear support vector machines and perceptrons. These methods seek to find an optimal separating hyperplane between two classes of data in a dataset such that, with high probability, all training examples of one class are found only on one side of the hyperplane. The most robust classifier for the data is given when the margin between the hyperplane and the data is maximized. Here the 'weights' learned in the training are the parameters of the hyperplane. One of the greatest strengths of the support vector machine lies in its generalization to nonlinear hypersurfaces via kernel functions43. Such classifiers have found great success in image segmentation as well as in the biological sciences.
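
As a classical point of reference, the following sketch uses scikit-learn (an assumed dependency; the synthetic dataset is our own) to train a linear support vector machine, read out the hyperplane weights and support vectors, and swap in a radial basis function kernel:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
# Two classes defined by a separating hyperplane
X = rng.standard_normal((200, 2))
y = (X @ np.array([1.5, -2.0]) > 0).astype(int)

linear = SVC(kernel="linear").fit(X, y)
print(linear.coef_, linear.intercept_)     # the learned hyperplane 'weights'
print(len(linear.support_))                # number of support vectors s

# Kernel trick: the same machine with a radial basis function kernel can
# separate data that no hyperplane in the original space can.
rbf = SVC(kernel="rbf").fit(X, y)
print(rbf.score(X, y))
```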

Like its classical counterpart, the quantum support vector machine is a paradigmatic example of a quantum machine learning algorithm13. A first quantum support vector machine was discussed in the early 2000s44, using a variant of Grover's search for function minimization45. Finding s support vectors out of N vectors consequently takes O(√(Ns)) iterations. Recently, a least-squares quantum support vector machine was developed that harnesses the full power of the qBLAS subroutines. The data input can come from various sources, such as from qRAM accessing classical data or from a quantum subroutine preparing quantum states. Once the data are made available to the quantum computing device, they are processed with quantum phase estimation and matrix inversion (the HHL algorithm). All the operations required to construct the optimal separating hyperplane and to test whether a vector lies on one side or the other can in principle be performed in time that is polynomial in log N, where N is the dimension of the matrix required to prepare a quantum version of the hyperplane vector. Polynomial13 and radial basis function kernels46 have been discussed, as has another kernel-based method, Gaussian process regression47. This approach to quantum support vector machines has been experimentally demonstrated in a nuclear magnetic resonance testbed for a handwritten digit recognition task48.

qBLAS-based optimization

Many data analysis and machine learning techniques involve optimization. Of increasing interest is the use of D-Wave processors to solve combinatorial optimization problems by means of quantum annealing. Some optimization problems can also be formulated as a single-shot solution of a linear system, for example the optimization of a quadratic function subject to equality constraints, a subset of quadratic programming problems. If the matrices involved are sparse or low rank, such problems can be solved via the HHL matrix inversion algorithm in time that is polynomial in log d, where d is the system dimension, yielding an exponential speedup over classical algorithms, which run in time that is polynomial in d.
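
A minimal sketch of this reduction, with matrices of our own construction: minimizing a quadratic function subject to equality constraints amounts to solving a single linear (KKT) system, which is exactly the inversion step an HHL-type subroutine would perform when the system is sparse or low rank:

```python
import numpy as np

# Minimize (1/2) x^T A x - b^T x  subject to  Cx = d.
# Stationarity of the Lagrangian gives one linear (KKT) system:
#   [A  C^T] [x     ]   [b]
#   [C  0  ] [lambda] = [d]
rng = np.random.default_rng(3)
n, m = 6, 2
A = rng.standard_normal((n, n)); A = A @ A.T + n * np.eye(n)  # positive definite
b = rng.standard_normal(n)
C = rng.standard_normal((m, n))
d = rng.standard_normal(m)

KKT = np.block([[A, C.T], [C, np.zeros((m, m))]])
rhs = np.concatenate([b, d])
sol = np.linalg.solve(KKT, rhs)       # the inversion HHL would accelerate
x, lam = sol[:n], sol[n:]
print(np.allclose(C @ x, d))          # constraints satisfied: True
```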

Most methods in machine learning require iterative optimization of their performance. As an example, inequality constraints are often handled via penalty functions49 and variations of gradient descent or Newton's method. A modification of the quantum PCA method implements iterative gradient descent and Newton's methods for polynomial optimization, and can again provide an exponential speedup over classical methods19. Multiple copies of the present solution, encoded in a quantum state, are used to improve that solution at each step. Brandao and Svore provide a quantum version of semidefinite programming that holds out the possibility of super-polynomial speedups18. The quantum approximate optimization algorithm (QAOA)50 provides a unique approach to optimization based on alternating qubit rotations with the application of the problem's penalty function.
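
For orientation, here are the two classical iterations these quantum subroutines aim to accelerate, applied to a toy polynomial objective of our choosing, f(x) = (x^T x − 1)^2:

```python
import numpy as np

def grad(x):                # gradient of f: 4 (x^T x - 1) x
    return 4 * (x @ x - 1) * x

def hess(x):                # Hessian of f: 8 x x^T + 4 (x^T x - 1) I
    return 8 * np.outer(x, x) + 4 * (x @ x - 1) * np.eye(len(x))

x = np.array([2.0, 1.0])
for _ in range(20):         # gradient descent: many cheap steps
    x = x - 0.02 * grad(x)
print(x @ x)                # ~1.02 after 20 steps: slow, linear convergence

x = np.array([2.0, 1.0])
for _ in range(6):          # Newton's method: one linear solve per step
    x = x - np.linalg.solve(hess(x), grad(x))
print(x @ x)                # ~1.0 after 6 steps: quadratic convergence
```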

Reading classical data into quantum machines

Classical data must be input before it can be processed on a quantum computer. This 'input problem' often has little overhead but can present a serious bottleneck for certain algorithms. Likewise, the 'output problem' arises when data must be read out after being processed on a quantum device. Like the input problem, the output problem often causes a noticeable operational slowdown.

In particular, if we wish to apply HHL, least-squares fitting, quantum principal component analysis, quantum support vector machines and related approaches to classical data, the procedure begins by loading considerable amounts of data into a quantum system, which can require exponential time51. This can be addressed in principle using qRAM, but the cost of doing so may be prohibitive for big data problems52. Apart from combinatorial-optimization-based approaches, the only known linear-algebra-based quantum machine learning algorithm that does not rely on large-scale qRAM is the quantum algorithm for performing topological analysis of data (persistent homology)14. With the notable exceptions of least-squares fitting and quantum support vector machines, linear-algebra-based algorithms can also suffer from the output problem, because desirable classical quantities such as the solution vector of HHL or the principal components in PCA are exponentially hard to estimate.

Despite the potential for exponential quantum speedups, unless considerable effort is put into optimization, the circuit size and circuit depth can balloon (to around 10^25 quantum gates in one proposed realization of HHL53). Ongoing work is needed to optimize such algorithms, provide better cost estimates and, ultimately, to understand the sort of quantum computer that we would need to provide useful quantum alternatives to classical machine learning.

Deep quantum learning

Classical deep neural networks are highly effective tools for machine learning and are well suited to inspire the development of deep quantum learning methods. Special-purpose quantum information processors such as quantum annealers and programmable photonic circuits are well suited for constructing deep quantum learning networks21,54,55. The simplest deep neural network to quantize is the Boltzmann machine (see Box 3 and Box 3 Figure). The classical Boltzmann machine consists of bits with tunable interactions: the Boltzmann machine is trained by adjusting those interactions so that the thermal statistics of the bits, described by a Boltzmann–Gibbs distribution (see Fig. 1b), reproduce the statistics of the data. To quantize the Boltzmann machine, one simply expresses the neural network as a set of interacting quantum spins, corresponding to a tunable Ising model. Then, by initializing the input neurons of the Boltzmann machine in a fixed state and allowing the system to thermalize, we can read out the output qubits to obtain an answer.
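
The following sketch of a small, fully visible classical Boltzmann machine (the toy data and learning-rate choices are our own) makes the training loop concrete: Gibbs sampling draws spin configurations from the Boltzmann–Gibbs distribution, and each weight update nudges the couplings until the model's pair statistics match those of the data:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6                                   # visible spins s_i in {-1, +1}
W = np.zeros((n, n))                    # tunable couplings (the 'weights')

def gibbs_sample(W, steps):
    """Draw a configuration from p(s) ~ exp(s^T W s / 2) by
    single-spin-flip Gibbs sampling."""
    s = rng.choice([-1, 1], size=n)
    for _ in range(steps):
        i = rng.integers(n)
        field = W[i] @ s - W[i, i] * s[i]        # local field on spin i
        p_up = 1 / (1 + np.exp(-2 * field))      # p(s_i = +1 | rest)
        s[i] = 1 if rng.random() < p_up else -1
    return s

# Toy data: the first two spins are perfectly correlated
data = rng.choice([-1, 1], size=(500, n))
data[:, 1] = data[:, 0]

for epoch in range(30):                 # training: match model to data statistics
    model = np.array([gibbs_sample(W, 200) for _ in range(50)])
    grad = data.T @ data / len(data) - model.T @ model / len(model)
    np.fill_diagonal(grad, 0)
    W += 0.05 * grad                    # strengthen under-represented correlations

print(W[0, 1])                          # a clearly positive learned coupling
```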

Figure 1: Quantum tunnelling versus thermalization.

A quantum state tunnels when approaching a resonance point before decoherence induces thermalization. Shades of blue illustrate occupation of energy levels (black dashes). a, A quantum state must traverse a local minimum in thermal annealing, whereas a coherent quantum state can tunnel when brought close to resonance. b, Coherent effects decay through interaction with an environment, causing the probability distribution of the occupancy of a system’s energy levels to follow a Gibbs distribution.

An essential feature of deep quantum learning is that it does not require a large, general-purpose quantum computer. Quantum annealers are special-purpose quantum information processors that are much easier to construct and to scale up than are general-purpose quantum computers (see Fig. 1a). Quantum annealers are well suited for implementing deep quantum learners, and are commercially available. The D-Wave quantum annealer is a tunable transverse Ising model that can be programmed to yield the thermal states of classical systems, and certain quantum spin systems. The D-Wave device has been used to perform deep quantum learning protocols on more than a thousand spins56. Quantum Boltzmann machines22 with more general tunable couplings, capable of implementing universal quantum logic, are currently at the design stage57. On-chip silicon waveguides have been used to construct linear optical arrays with hundreds of tunable interferometers, and special-purpose superconducting quantum information processors could be used to implement the quantum approximate optimization algorithm.

There are several ways that quantum computers can provide advantages here. First, quantum methods can make the system thermalize quadratically faster than its classical counterpart20,58,59,60. This can make accurate training of fully connected Boltzmann machines practical. Second, quantum computers can accelerate Boltzmann training by providing improved ways of sampling. Because the neuron activation pattern in the Boltzmann machine is stochastic, many repetitions are needed to determine success probabilities, and in turn, to discover the effect that changing a weight in the neural network has on the performance of the deep network. When training a quantum Boltzmann machine, in contrast, quantum coherence can quadratically reduce the number of samples needed to learn the desired task. Furthermore, quantum access to the training data (that is, qRAM or a quantum blackbox subroutine) allows the machine to be trained using quadratically fewer access requests to the training data than are required by classical methods: a quantum algorithm can train a deep neural network on a large training dataset while reading only a minuscule number of training vectors20.

Quantum information processing provides new, fundamentally quantum, models for deep learning. For example, adding a transverse field to the Ising model quantum Boltzmann machine can induce a variety of quantum effects such as tunnelling22,61. Adding further quantum couplings transforms the quantum Boltzmann machine into a variety of quantum systems57,62. Adding a tunable transverse interaction to a tunable Ising model is known to be universal for full quantum computing57: with the proper weight assignments this model can execute any algorithm that a general-purpose quantum computer can perform. Such universal deep quantum learners may recognize and classify patterns that classical computers cannot.

Unlike classical Boltzmann machines, quantum Boltzmann machines output a quantum state. Thus deep quantum networks can learn to generate quantum states representative of a wide variety of systems, allowing the network to act as a form of quantum associative memory63. This ability to generate quantum states is absent from classical machine learning. Thus quantum Boltzmann training has applications beyond classifying quantum states and providing richer models for classical data.

Quantum machine learning for quantum data

Perhaps the most immediate application of quantum machine learning is to quantum data—the actual states generated by quantum systems and processes. As described above, many quantum machine learning algorithms find patterns in classical data by mapping the data to quantum mechanical states, and then manipulating those states using basic quantum linear algebra subroutines. These quantum machine learning algorithms can be applied directly to the quantum states of light and of matter to reveal their underlying features and patterns. The resulting quantum modes of analysis are frequently much more efficient and more illuminating than the classical analysis of data taken from quantum systems. For example, given multiple copies of a system described by an N × N density matrix, quantum principal component analysis can be used to find its eigenvalues and to reveal the corresponding eigenvectors in time O[(log N)^2], compared with the O(N^2) measurements needed for a classical device to perform tomography on a density matrix, and the O(N^2) operations needed to perform the classical PCA. Such quantum analysis of quantum data could profitably be performed on the relatively small quantum computers that are likely to be available over the next several years.

A particularly powerful quantum data analysis technique is the use of quantum simulators to probe quantum dynamics. Quantum simulators are ‘quantum analogue computers’—quantum systems whose dynamics can be programmed to match the dynamics of some desired quantum system. A quantum simulator can either be a special-purpose device constructed to simulate a particular class of quantum systems, or a general-purpose quantum computer. By connecting a trusted quantum simulator to an unknown system and tuning the model of the simulator to counteract the unknown dynamics, the dynamics of the unknown system can be efficiently learned using approximate Bayesian inference64,65,66. This exponentially reduces the number of measurements needed to perform the simulation. Similarly, the universal quantum emulator algorithm67 allows one to reconstruct quantum dynamics and the quantum Boltzmann training algorithm of ref. 61 allows states to be reconstructed, in time logarithmic in the dimension of the Hilbert space, which is exponentially faster than reconstructing the dynamics via classical tomography.
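
A toy classical version of such Bayesian dynamics-learning conveys the idea. In the sketch below we simplify drastically: a single unknown frequency ω with Hamiltonian H = (ω/2)σz, a grid posterior in place of the sequential Monte Carlo used in refs 64,65,66, and simulated single-shot measurements whose flip probability is sin^2(ωt/2):

```python
import numpy as np

rng = np.random.default_rng(5)
omega_true = 0.7                          # unknown frequency to be learned

def p_flip(omega, t):
    # Under H = (omega/2) sigma_z, a qubit prepared in |+> is found in |->
    # after time t with probability sin^2(omega * t / 2).
    return np.sin(omega * t / 2) ** 2

grid = np.linspace(0.0, 2.0, 1000)        # hypotheses for omega
posterior = np.full_like(grid, 1.0 / len(grid))   # flat prior

for _ in range(200):                      # 200 simulated single-shot experiments
    t = rng.uniform(0.0, 10.0)            # evolution time for this shot
    flipped = rng.random() < p_flip(omega_true, t)
    likelihood = p_flip(grid, t) if flipped else 1.0 - p_flip(grid, t)
    posterior *= likelihood               # Bayes rule on the grid
    posterior /= posterior.sum()

print(grid[np.argmax(posterior)])         # concentrates near omega_true = 0.7
```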

To use a quantum computer to help characterize a quantum system65,66 or to accept input states for use in a quantum PCA algorithm, we must face the substantial technical challenge of loading coherent input states. Nonetheless, because such applications do not require qRAM and offer the potential for exponential speedups for device characterization22,61,65,66 they remain among the promising possibilities for near-term application of quantum machine learning.

Designing and controlling quantum systems

A major challenge in the development of quantum computation and information science involves tuning quantum gates to match the exacting requirements needed for quantum error correction. Heuristic search methods can help to achieve this in a supervised learning scenario68,69 (for instance, in the case of nearest-neighbour-coupled superconducting artificial atoms69 with gate fidelity above 99.9% in the presence of noise) and thus to reach an accepted threshold for fault-tolerant quantum computing. A similar methodology has been successful in constructing a single-shot Toffoli gate, again reaching gate fidelity above 99.9%70. Genetic algorithms have been employed to reduce digital and experimental errors in quantum gates71. They have been used to simulate controlled-NOT gates by means of ancillary qubits and imperfect gates. Besides outperforming protocols for digital quantum simulations, it has been shown that genetic algorithms are also useful for suppressing experimental errors in gates72. Another approach used stochastic gradient descent and two-body interactions to embed a Toffoli gate into a sequence of quantum operations or gates without time-dependent control, using the natural dynamics of a quantum network73. Dynamical decoupling sequences, which help to protect quantum states from decoherence, can also be designed using recurrent neural networks74.
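
A toy version of fidelity-driven gate design illustrates the supervised-learning flavour of these methods. The control model (a single rotation angle), the stochastic hill climb and the SciPy dependency are all our own simplifications; the cited works use far richer controls and search heuristics:

```python
import numpy as np
from scipy.linalg import expm

# Target: an X gate. Control: pulse angle theta for the drive H = sigma_x.
X = np.array([[0, 1], [1, 0]], dtype=complex)
target = X

def gate(theta):
    return expm(-1j * theta * X / 2)      # rotation about x by angle theta

def fidelity(theta):
    U = gate(theta)
    d = 2
    return abs(np.trace(target.conj().T @ U)) ** 2 / d ** 2  # phase-insensitive

# Stochastic hill climbing over the single control parameter
rng = np.random.default_rng(6)
theta, best = 0.1, fidelity(0.1)
for _ in range(2000):
    cand = theta + 0.05 * rng.standard_normal()
    f = fidelity(cand)
    if f > best:                          # keep only improving controls
        theta, best = cand, f
print(theta, best)                        # theta -> pi, fidelity -> 1
```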

Controlling a quantum system is just as important and complex. Learning methods have also been very successful in developing control sequences to optimize adaptive quantum metrology, which is a key quantum building block in many quantum technologies. Genetic algorithms have been proposed for the control of quantum molecules to overcome the problem caused by changing environmental parameters during an experiment75. Reinforcement learning algorithms using heuristic global optimization, like the algorithm used for designing circuits, have been widely successful, particularly in the presence of noise and decoherence, scaling well with the system size76,77,78. One can also exploit reinforcement learning in gate-based quantum systems. For instance, adaptive controllers based on intelligent agents for quantum information demonstrate adaptive calibration and compensation strategies for an external stray field of unknown magnitude in a fixed direction.

Classical machine learning is also a powerful tool with which to extract theoretical insights about quantum states. Neural networks have recently been deployed to study two central problems in condensed matter, namely phase-of-matter detection79,80 and ground-state search81. These have achieved better performance than established numerical tools. Theoretical physicists are now studying these models to understand analytically their descriptive power compared to traditional methods such as tensor networks. Interesting applications to exotic states of matter have already appeared, and have been shown to capture highly non-trivial features of disordered or topologically ordered systems.

Perspectives on future work

As we have discussed in this review, small quantum computers and larger special-purpose quantum simulators, annealers and so on seem to have potential use in machine learning and data analysis15,21,22,36,48,82,83,84,85,86,87,88,89,90,91,92,93,94,95. However, the execution of quantum algorithms requires quantum hardware that is not yet available.

On the hardware side, there have been great strides in several enabling technologies. Small-scale quantum computers with 50–100 qubits will be made widely available via quantum cloud computing (the ‘Qloud’). Special-purpose quantum information processors such as quantum simulators, quantum annealers, integrated photonic chips, nitrogen-vacancy (NV) centre arrays in diamond, qRAM, and made-to-order superconducting circuits will continue to advance in size and complexity. Quantum machine learning offers a suite of potential applications for small quantum computers23,24,25,26,27,28,29,30,31,96,97,98, complemented and enhanced by special-purpose quantum information processors21,22, digital quantum processors70,73,78,99,100 and sensors76,77,101.

In particular, quantum annealers with around 2,000 qubits have been built and operated, using integrated superconducting circuits that are, in principle, scalable. The biggest challenges for quantum annealers to implement quantum machine learning algorithms include improving connectivity and implementing more general tunable couplings between qubits. Programmable quantum optical arrays with around 100 tunable interferometers have been constructed using integrated photonics in silicon, but loss of quantum effects increases as such circuits are scaled up. A particularly important challenge for quantum machine learning is the construction of interface devices such as qRAM that allow classical information to be encoded in quantum mechanical form52. A qRAM to access N pieces of data consists of a branching array of 2N quantum switches, which must operate coherently during a memory call. In principle, such a qRAM takes time O(log N) to perform a memory call, and can tolerate error rates of up to O(1/log N) per switching operation, where log N is the depth of the qRAM circuit. Proof-of-principle demonstrations of qRAM have been performed, but constructing large arrays of quantum switches is a difficult technological problem.
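
The qRAM scaling claims above reduce to a few lines of arithmetic; the sketch below (illustrative only, with the switch count 2N taken from the text) tabulates circuit depth, switch count and the per-switch error budget for a few memory sizes:

```python
import math

for N in [2**10, 2**20, 2**30]:          # number of stored data items
    depth = math.log2(N)                 # switches traversed per memory call: O(log N)
    switches = 2 * N                     # size of the branching switch array
    err_budget = 1 / depth               # tolerable error per switching operation: O(1/log N)
    print(f"N = 2^{int(depth):2d}: depth {int(depth):2d}, "
          f"switches {switches:.1e}, error budget/switch ~ {err_budget:.3f}")
```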

These hardware challenges are technical in nature, and clear paths exist towards overcoming them. They must be overcome, however, if quantum machine learning is to become a ‘killer app’ for quantum computers. As noted previously, most of the quantum algorithms that have been identified face a number of caveats that limit their applicability. We can distil the caveats mentioned above into four fundamental problems.

1. The input problem. Although quantum algorithms can provide dramatic speedups for processing data, they seldom provide advantages in reading data. This means that the cost of reading in the input can in some cases dominate the cost of quantum algorithms. Understanding this factor is an ongoing challenge.

2. The output problem. Obtaining the full solution from some quantum algorithms as a string of bits requires learning an exponential number of bits. This makes some applications of quantum machine learning algorithms infeasible. This problem can potentially be sidestepped by learning only summary statistics for the solution state.

3. The costing problem. Closely related to the input/output problems, at present very little is known about the true number of gates required by quantum machine learning algorithms. Bounds on the complexity suggest that for sufficiently large problems they will offer huge advantages, but it is still unclear when that crossover point occurs.

4. The benchmarking problem. It is often difficult to assert that a quantum algorithm is ever better than all known classical machine learning algorithms in practice, because this would require extensive benchmarking against modern heuristic methods. Establishing lower bounds for quantum machine learning would partially address this issue.

To avoid some of these problems, we could apply quantum computing to quantum, rather than classical, data. One aim therein is to use quantum machine learning to characterize and control quantum computers66. This would enable a virtuous cycle of innovation similar to that which occurred in classical computing, wherein each generation of processors is then leveraged to design the next-generation processors. We have already begun to see the first fruits of this cycle with classical machine learning being used to improve quantum processor designs23,24,25,26,27,28,29,30,31,102,103,104, which in turn provide powerful computational resources for quantum-enhanced machine learning applications themselves8,9,11,13,33,34,35,36.