Introduction

In the past several years, physical realizations of quantum computing have achieved remarkable progress1,2. As a result, the following four tasks are becoming increasingly important. First, to run a quantum algorithm, which is usually designed in the language of quantum circuits, on a quantum computer, we have to compile it into a series of quantum instructions that can be executed directly on the quantum hardware; as a whole, this series is essentially another quantum circuit. Second, when executing quantum instructions on a quantum computer, the hardware configuration has to be respected, which means that the available quantum instructions are restricted. If a given quantum circuit does not respect these restrictions, we have to map it into another circuit that does. Third, for now, the scale of quantum computing is still small and quantum computational resources are precious, so it is desirable to make sure that the executed quantum circuit has been optimized. Fourth, quantum computing has been physically implemented on different platforms, so if we run the same quantum algorithm on different platforms, an important problem is to make sure that the implementations are essentially the same, even though the quantum circuits may look different.

It is not hard to see that a common part of the above four fundamental problems is that we need to transform a quantum circuit into another or compare two quantum circuits. During these transformations or comparisons, a basic requirement is to find out whether the initial quantum circuit and the compiled, optimized, or compared quantum circuit have exactly the same function. As a consequence, equivalence checking of quantum circuits is a fundamental problem in quantum computing and quantum engineering. We stress that sometimes the two compared quantum circuits are located at different places.

In fact, this problem has attracted a lot of attention, and quite a few approaches have been proposed accordingly. In particular, in ref. 3 an approach based on decision diagrams was proposed for equivalence checking of quantum circuits, where the central idea is to represent quantum circuits as decision diagrams, on which the comparisons are performed. In ref. 4, a concept called the reversible miter was proposed for this problem, which is a generalization of the miter circuits utilized in digital electronic circuits and can be integrated with circuit simplifications and decision diagram techniques. Meanwhile, as mentioned above, equivalence checking of quantum circuits has been extensively studied in the optimization of quantum circuits and the verification of quantum compilers5,6,7,8,9,10. Very recently, equivalence checking has also been extended to sequential quantum circuits, where a Mealy machine-based framework was proposed11.

Despite these encouraging approaches for equivalence checking of quantum circuits, they share the common feature that the internal structures of the involved quantum circuits can be seen. In the language of software testing, this is essentially a kind of white-box testing. As in software testing, black-box testing, where the internal structures of quantum circuits cannot be seen, is also a realistic scenario that needs to be considered.

Indeed, as mentioned, in the future it will be an important problem to find out with certainty whether two separately manufactured quantum circuit chips, whose insides cannot be seen, have the same function. Solving this problem is the main target of the current paper. We stress that in our setting we do not have any prior knowledge of the quantum circuits to be compared, and this is essentially different from the topic of unitary operation discrimination12,13,14, where every unitary operation is picked from a small set known beforehand.

In this paper, based on the key role played by quantum nonlocality, we design an elegant approach that can achieve black-box equivalence checking of quantum circuits with certainty. Clearly, no similar approach exists for the classical counterpart of this problem. In particular, we provide a complete mathematical characterization of our approach. First, we prove that in our protocol, the observed quantum nonlocality is the strongest if and only if the two involved quantum circuits have exactly the same function. Second, we show that the protocol also works well in an approximate sense, i.e., for a given strength of observed quantum nonlocality, we provide analytical lower and upper bounds for the distance between the two quantum circuits, and we verify the correctness of these bounds with numerical evidence. Third, by looking into the structure of the gap between the above two bounds, we propose a modified protocol such that the gap disappears, which means that the observed nonlocality completely pins down the distance between the compared quantum circuits. Fourth, we analyze the computational cost of the modified protocol and show that it is independent of the size of the compared quantum circuits. That is, for a given precision we need only a constant cost to check the equivalence of large quantum circuits. Lastly, we discuss the possibility of generalizing our protocol to the case of multiple quantum circuits, where we want to determine whether three or more quantum circuits are equivalent to each other in one go. We argue that at least when the number of quantum circuits is odd, this is impossible. We believe that our results demonstrate a possibility of applying quantum nonlocality to important problems in future quantum engineering.

Results

The exact equivalence checking of two quantum circuits

Suppose two d-dimensional quantum circuits C1 and C2 are held by two separated players, Alice and Bob, respectively. Since the Hadamard gate and the Toffoli gate form a universal gate set for quantum computation15, and the matrix representations for both of the two quantum gates involve only real numbers, quantum circuits with real matrix representations already enjoy the full power of quantum computation. Because of this fact, in this work, we suppose that the matrix representations of C1 and C2 are real, denoted U1 and U2. Then our task is to determine whether U1 is equivalent to U2 up to a global phase (since they are real, a global phase can only be ±1). Let us first consider the smallest case where C1 and C2 are single-qubit quantum circuits.

Before introducing our main idea, let us recall some facts on quantum nonlocality and Bell experiments. Suppose Alice and Bob share many EPR pairs, i.e., \(\left\vert \,{{\mbox{EPR}}}\,\right\rangle =\frac{1}{\sqrt{2}}(\left\vert 00\right\rangle +\left\vert 11\right\rangle )\). On each EPR pair, they repeat the following procedure. Both of them perform random local measurements on their respective qubits, where Alice measures observables A0 = σX and A1 = σZ, and Bob measures observables \({B}_{0}=({\sigma }_{X}+{\sigma }_{Z})/\sqrt{2}\) and \({B}_{1}=({\sigma }_{X}-{\sigma }_{Z})/\sqrt{2}\). Here σX and σZ are Pauli matrices. They then estimate the probability distribution p(ab|xy), i.e., the probability that Alice and Bob obtain outcomes a on Ax and b on By respectively, where a, b ∈ {−1, 1} and x, y ∈ {0, 1}. Let \(\langle {A}_{x}{B}_{y}\rangle ={\sum }_{a,b}ab\,p(ab| xy)\), and

$${I}_{{{\mbox{CHSH}}}}=\langle {A}_{0}{B}_{0}\rangle +\langle {A}_{1}{B}_{0}\rangle +\langle {A}_{0}{B}_{1}\rangle -\langle {A}_{1}{B}_{1}\rangle ,$$
(1)

then it holds that \({I}_{{{\mbox{CHSH}}}}=2\sqrt{2}\). As a comparison, if p(ab|xy) is produced by a classical system, the corresponding value cannot be larger than 2, which is the famous Clauser–Horne–Shimony–Holt (CHSH) inequality16. A well-known fact is that the above violation of the CHSH inequality achieved by EPR pairs is optimal16, which is the foundation of many quantum information processing tasks17,18,19.
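As a quick numerical sanity check of Eq. (1), the following minimal NumPy sketch evaluates the CHSH expression on an EPR pair with the observables listed above. The helper names `correlator` and `chsh` are illustrative and are not taken from the paper.

```python
import numpy as np

# Pauli matrices and the EPR pair (|00> + |11>)/sqrt(2).
sx = np.array([[0.0, 1.0], [1.0, 0.0]])
sz = np.array([[1.0, 0.0], [0.0, -1.0]])
epr = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)

def correlator(A, B, state):
    """Expectation value <state| A (x) B |state> for a real state vector."""
    return float(state @ np.kron(A, B) @ state)

def chsh(state):
    """CHSH expression of Eq. (1) with A0 = X, A1 = Z, B0/1 = (X +/- Z)/sqrt(2)."""
    A0, A1 = sx, sz
    B0, B1 = (sx + sz) / np.sqrt(2), (sx - sz) / np.sqrt(2)
    return (correlator(A0, B0, state) + correlator(A1, B0, state)
            + correlator(A0, B1, state) - correlator(A1, B1, state))

print(chsh(epr), 2 * np.sqrt(2))  # the two numbers should agree
```

Evaluating `chsh` on a product state such as |00⟩ gives a value that stays within the classical bound of 2, in line with the CHSH inequality.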

We now change the above Bell experiment a little bit by adding one more step. Before measuring each EPR pair, Alice and Bob input the qubit they hold into C1 and C2 respectively, then the overall output will be \(\left\vert \psi \right\rangle =\frac{1}{\sqrt{2}}({U}_{1}\left\vert 0\right\rangle \otimes {U}_{2}\left\vert 0\right\rangle +{U}_{1}\left\vert 1\right\rangle \otimes {U}_{2}\left\vert 1\right\rangle )\), on which they perform the same sets of local measurements as above. Here we stress that it is crucial to use the same sets of local measurements. We now analyze the new value of ICHSH, denoted \({I}_{\,{{\mbox{CHSH}}}\,}^{{\prime} }\).

We first consider the case that U1 = U2. Recall that they are real unitary matrices, then it can be verified that \(\left\vert \psi \right\rangle =\left\vert \,{{\mbox{EPR}}}\,\right\rangle\), which means \({I}_{\,{{\mbox{CHSH}}}\,}^{{\prime} }=2\sqrt{2}\). That is to say, if C1 and C2 are the same, the above experiment will still result in a maximal violation. In this situation, a natural problem is to find out whether the converse is correct or not, i.e., whether or not \({I}_{\,{{\mbox{CHSH}}}\,}^{{\prime} }=2\sqrt{2}\) always implies that U1 = U2. If this is correct, then we can perfectly determine whether C1 and C2 are equivalent by performing the above modified Bell experiment.

Actually, this is indeed the case. It has been known that if \({I}_{\,{{\mbox{CHSH}}}\,}^{{\prime} }=2\sqrt{2}\), the following conditions are satisfied17.

$$\frac{{A}_{0}\pm {A}_{1}}{\sqrt{2}}\left\vert \psi \right\rangle ={B}_{0/1}\left\vert \psi \right\rangle .$$
(2)

By straightforward calculations, it can be verified that this indicates that \(\left\vert \psi \right\rangle =\left\vert \,{{\mbox{EPR}}}\,\right\rangle\) up to a global phase. On the other hand, if U1 ≠ ±U2, it can be checked that \(\left\vert \psi \right\rangle \ne \pm \!\left\vert \,{{\mbox{EPR}}}\,\right\rangle\), which means that if \({I}_{\,{{\mbox{CHSH}}}\,}^{{\prime} }=2\sqrt{2}\), we must have U1 = ±U2, i.e., U1 = U2 up to a global phase.
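The single-qubit argument can be illustrated numerically. The sketch below is a minimal example that assumes the two circuits are 2 × 2 rotation matrices (one convenient family of real unitaries); `rotation` and `modified_chsh` are hypothetical helper names introduced only for this illustration.

```python
import numpy as np

sx = np.array([[0.0, 1.0], [1.0, 0.0]])
sz = np.array([[1.0, 0.0], [0.0, -1.0]])
epr = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)

def rotation(theta):
    """A real single-qubit unitary (2x2 rotation); used only as a test case."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def modified_chsh(U1, U2):
    """CHSH value measured on (U1 (x) U2)|EPR> with the unchanged observables."""
    psi = np.kron(U1, U2) @ epr
    A = [sx, sz]
    B = [(sx + sz) / np.sqrt(2), (sx - sz) / np.sqrt(2)]

    def E(x, y):
        return float(psi @ np.kron(A[x], B[y]) @ psi)

    return E(0, 0) + E(1, 0) + E(0, 1) - E(1, 1)

U = rotation(0.3)
print(modified_chsh(U, U))              # identical circuits
print(modified_chsh(U, -U))             # identical up to the global phase -1
print(modified_chsh(U, rotation(0.8)))  # different circuits
```

For two rotations with angles θ1 and θ2, a direct calculation gives the value 2√2 cos(2(θ2 − θ1)), so the maximum 2√2 is reached only when the two rotations coincide up to a sign.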

We now move to the general case, where C1 and C2 are both d-dimensional. Let \(d={2}^{n}\). Inspired by the single-qubit case, Alice and Bob hope they can use a similar protocol to find out whether C1 and C2 are equivalent. That is, they hope that the following plan could be realized. Again, they first prepare and share many copies of the maximally entangled state

$$\left\vert {{{\Phi }}}_{d}\right\rangle =\frac{1}{\sqrt{d}}\mathop{\sum }\limits_{i=0}^{d-1}\left\vert ii\right\rangle .$$
(3)

Note that if the quantum circuits are based on qubits, \(\left\vert {{{\Phi }}}_{d}\right\rangle\) can be prepared by combining n EPR pairs together. Then they choose a certain Bell inequality such that \(\left\vert {{{\Phi }}}_{d}\right\rangle\) violates it maximally, and they record the local measurements that achieve the maximal violation. Then for each copy of \(\left\vert {{{\Phi }}}_{d}\right\rangle\), Alice and Bob input their own subsystems into the quantum circuits they respectively hold. On the output state, which is now \(({U}_{1}\otimes {U}_{2})\left\vert {{{\Phi }}}_{d}\right\rangle\), they perform the same local measurements as recorded above. By repeating the experiments, they collect the measurement outcome statistics p(ab|xy), where x, y ∈ {1, 2, . . . , m} and a, b ∈ {0, 1, . . . , d−1} are the labels for the local measurements and the corresponding outcomes. Then they examine the measurement outcome statistics with the chosen Bell inequality, and hope that \(({U}_{1}\otimes {U}_{2})\left\vert {{{\Phi }}}_{d}\right\rangle\) violates the Bell inequality maximally if and only if U1 = U2 up to a global phase.

Clearly, if such a Bell inequality exists, then as in the qubit case, Alice and Bob can perfectly determine whether C1 and C2 are equivalent according to the violation. Interestingly, it turns out that such a Bell inequality does exist.

According to our plan, such a desirable Bell inequality should be violated maximally by maximally entangled states. However, it is well known that entanglement is a different resource from quantum nonlocality, and for many Bell inequalities it is not the maximally entangled states that achieve the maximal violations, e.g., the Collins–Gisin–Linden–Massar–Popescu (CGLMP) inequalities20. Meanwhile, quantum nonlocality can be observed directly in quantum experiments while entanglement cannot, so we often characterize unknown entanglement by looking into the underlying quantum nonlocality. When doing this, we hope that the observed quantum nonlocality and the underlying entanglement are as consistent as possible, which implies that the above desirable Bell inequalities are good choices. Fortunately, in ref. 21 such a class of Bell inequalities has been proposed, which was deliberately designed to be violated maximally by \(\left\vert {{{\Phi }}}_{d}\right\rangle\).

Specifically, to perform the measurement labeled by x, Alice measures an observable with eigenvectors \({\left\vert a\right\rangle }_{x}\) (a = 0, 1, . . . , d−1, and x = 1, 2, . . . , m), and

$${\left\vert a\right\rangle }_{x}=\frac{1}{\sqrt{d}}\mathop{\sum }\limits_{k=0}^{d-1}\,{{\mbox{exp}}}\,\left[\frac{2\pi {{{\bf{i}}}}}{d}k(a-{\alpha }_{x})\right]\left\vert k\right\rangle ,$$
(4)

where \({{{\bf{i}}}}=\sqrt{-1}\) is the imaginary unit, and αx = (x−1/2)/m. Similarly, to perform the measurement labeled by y, Bob measures an observable with eigenvectors \({\left\vert b\right\rangle }_{y}\) (b = 0, 1, . . . , d−1, and y = 1, 2, . . . , m), and

$${\left\vert b\right\rangle }_{y}=\frac{1}{\sqrt{d}}\mathop{\sum }\limits_{k=0}^{d-1}\,{{\mbox{exp}}}\,\left[-\frac{2\pi {{{\bf{i}}}}}{d}k(b-{\beta }_{y})\right]\left\vert k\right\rangle ,$$
(5)

where βy = y/m. On an arbitrary quantum state \(\left\vert \phi \right\rangle\), the Bell expression is essentially equivalent to

$${I}_{d,m}(\left\vert \phi \right\rangle )=\mathop{\sum }\limits_{i=1}^{m}\mathop{\sum }\limits_{l=1}^{d-1}\left\langle \phi \right\vert ({A}_{i}^{l}\otimes {\bar{B}}_{i}^{l})\left\vert \phi \right\rangle ,$$
(6)

where \({A}_{i}^{l}=\mathop{\sum }\nolimits_{a = 0}^{d-1}{\omega }^{al}{\left\vert a\right\rangle }_{i}\,{}_{i}\!\left\langle a\right\vert\), \({\bar{B}}_{i}^{l}={({A}_{i}^{l})}^{* }\), and ω = exp(2πi/d). Note that \({A}_{i}^{l}\) and \({\bar{B}}_{i}^{l}\) are unitary matrices.

In ref. 21, it was proved that the Tsirelson bound of Id,m is m(d−1), which is achieved exactly by \(\left\vert {{{\Phi }}}_{d}\right\rangle\) and is strictly larger than the classical bound. Indeed, a property of \(\left\vert {{{\Phi }}}_{d}\right\rangle\) is that for any d × d matrices M and N, it holds that \((M\otimes N)\left\vert {{{\Phi }}}_{d}\right\rangle =(I\otimes N{M}^{T})\left\vert {{{\Phi }}}_{d}\right\rangle\). Since \({\bar{B}}_{i}^{l}={({A}_{i}^{l})}^{* }\) for any i and l, we have that \(\left\langle {{{\Phi }}}_{d}\right\vert ({A}_{i}^{l}\otimes {\bar{B}}_{i}^{l})\left\vert {{{\Phi }}}_{d}\right\rangle =\left\langle {{{\Phi }}}_{d}\right\vert (I\otimes I)\left\vert {{{\Phi }}}_{d}\right\rangle =1\), implying that Id,m = m(d−1) on this state.
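The operators in Eqs. (4) and (6) are easy to build explicitly for small d. The following minimal NumPy sketch constructs the Bell operator, evaluates it on the maximally entangled state, and checks the identity for M ⊗ N quoted above. The function names and the choice d = 4, m = 2 are assumptions made only for this illustration.

```python
import numpy as np

def measurement_basis(d, m, x):
    """Columns are the eigenvectors |a>_x of Eq. (4), with alpha_x = (x - 1/2)/m."""
    k = np.arange(d)
    alpha = (x - 0.5) / m
    return np.exp(2j * np.pi / d * np.outer(k, k - alpha)) / np.sqrt(d)

def bell_operator(d, m):
    """Sum over i = 1..m and l = 1..d-1 of A_i^l (x) conj(A_i^l), as in Eq. (6)."""
    omega = np.exp(2j * np.pi / d)
    a = np.arange(d)
    B = np.zeros((d * d, d * d), dtype=complex)
    for x in range(1, m + 1):
        basis = measurement_basis(d, m, x)
        for l in range(1, d):
            # A_x^l = sum_a omega^{a l} |a>_x <a|_x
            A = (basis * omega ** (a * l)) @ basis.conj().T
            B += np.kron(A, A.conj())
    return B

def max_entangled(d):
    """State vector of |Phi_d> = sum_i |ii> / sqrt(d)."""
    return np.eye(d).reshape(d * d) / np.sqrt(d)

def bell_value(state, d, m):
    return float(np.real(state.conj() @ bell_operator(d, m) @ state))

d, m = 4, 2
phi = max_entangled(d)
print(bell_value(phi, d, m), m * (d - 1))  # the two numbers should agree

# (M (x) N)|Phi_d> = (I (x) N M^T)|Phi_d> for arbitrary d x d matrices M, N.
rng = np.random.default_rng(0)
M, N = rng.normal(size=(d, d)), rng.normal(size=(d, d))
lhs = np.kron(M, N) @ phi
rhs = np.kron(np.eye(d), N @ M.T) @ phi
print(np.allclose(lhs, rhs))
```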

Let us go back to our task. We first notice that if C1 and C2 are the same, i.e., U1 = U2 = U, then \(({U}_{1}\otimes {U}_{2})\left\vert {{{\Phi }}}_{d}\right\rangle\) always achieves the Tsirelson bound of Id,m. In fact, for any i and l it holds that

$$\begin{array}{l}\left\langle {{{\Phi }}}_{d}\right\vert ({U}^{{\rm {T}}}\otimes {U}^{{\rm {T}}})({A}_{i}^{l}\otimes {\bar{B}}_{i}^{l})(U\otimes U)\left\vert {{{\Phi }}}_{d}\right\rangle \\ =\left\langle {{{\Phi }}}_{d}\right\vert (I\otimes {U}^{{\rm {T}}}{\bar{B}}_{i}^{l}U{U}^{{\rm {T}}}{({A}_{i}^{l})}^{{\rm {T}}}U)\left\vert {{{\Phi }}}_{d}\right\rangle \\ =\left\langle {{{\Phi }}}_{d}\right\vert (I\otimes I)\left\vert {{{\Phi }}}_{d}\right\rangle \\ =1.\end{array}$$
(7)

Hence, the new value of Id,m is still m(d−1). In this situation, similar to the case of single-qubit quantum circuits, we need to consider whether the converse is correct or not, i.e., whether we can have both U1 ≠ U2 and \({I}_{d,m}(({U}_{1}\otimes {U}_{2})\left\vert {{{\Phi }}}_{d}\right\rangle )=m(d-1)\) at the same time. We now show that this is impossible.

Theorem 1

\({I}_{d,m}(({U}_{1}\otimes {U}_{2})\left\vert {{{\Phi }}}_{d}\right\rangle )=m(d-1)\) if and only if U1 = U2 up to a global phase.

Proof

We only need to prove that \({I}_{d,m}(({U}_{1}\otimes {U}_{2})\left\vert {{{\Phi }}}_{d}\right\rangle )=m(d-1)\) implies U1 = U2 up to a global phase. Each of the m(d−1) terms in the summation of Eq. (6) is the expectation value of a unitary matrix and thus has modulus at most 1, so if \({I}_{d,m}(({U}_{1}\otimes {U}_{2})\left\vert {{{\Phi }}}_{d}\right\rangle )=m(d-1)\), each term must be 1. Therefore, for any i ∈ {1, 2, . . . , m} it holds that (let l = 1)

$$\begin{array}{l}\left\langle {{{\Phi }}}_{d}\right\vert ({U}_{1}^{{\rm {T}}}\otimes {U}_{2}^{{\rm {T}}})({A}_{i}^{1}\otimes {\bar{B}}_{i}^{1})({U}_{1}\otimes {U}_{2})\left\vert {{{\Phi }}}_{d}\right\rangle \\ =\left\langle {{{\Phi }}}_{d}\right\vert (I\otimes {U}_{2}^{{\rm {T}}}{\bar{B}}_{i}^{1}{U}_{2}{U}_{1}^{{\rm {T}}}{({A}_{i}^{1})}^{{\rm {T}}}{U}_{1})\left\vert {{{\Phi }}}_{d}\right\rangle \\ =\frac{1}{d}{{{\rm{Tr}}}}({U}_{2}^{{\rm {T}}}{\bar{B}}_{i}^{1}{U}_{2}{U}_{1}^{{\rm {T}}}{({A}_{i}^{1})}^{{\rm {T}}}{U}_{1})\\ =\frac{1}{d}{{{\rm{Tr}}}}({U}_{1}{U}_{2}^{{\rm {T}}}{\bar{B}}_{i}^{1}{U}_{2}{U}_{1}^{{\rm {T}}}{({A}_{i}^{1})}^{{\rm {T}}})\\ =1,\end{array}$$
(8)

where we have utilized the fact that for any d × d matrices M and N, it holds that \(\left\langle {{{\Phi }}}_{d}\right\vert (I\otimes M)\left\vert {{{\Phi }}}_{d}\right\rangle ={{{\rm{Tr}}}}(M)/d\) and \({{{\rm{Tr}}}}(MN)={{{\rm{Tr}}}}(NM)\). Hence, we obtain that \({{{\rm{Tr}}}}({U}_{1}{U}_{2}^{{\rm {T}}}{\bar{B}}_{i}^{1}{U}_{2}{U}_{1}^{{\rm {T}}}{({A}_{i}^{1})}^{{\rm {T}}})=d\).

Meanwhile, note that \({U}_{1}{U}_{2}^{{\rm {T}}}{\bar{B}}_{i}^{1}{U}_{2}{U}_{1}^{{\rm {T}}}{({A}_{i}^{1})}^{{\rm {T}}}\) is a d × d unitary matrix, and a d × d unitary matrix with trace d must be the identity, thus we have that \({U}_{1}{U}_{2}^{{\rm {T}}}{\bar{B}}_{i}^{1}{U}_{2}{U}_{1}^{{\rm {T}}}{({A}_{i}^{1})}^{{\rm {T}}}=I\). For simplicity, let \({S}_{1}={U}_{2}{U}_{1}^{{\rm {T}}}\) and \({S}_{2}={\bar{B}}_{i}^{1}\). Then this means \({S}_{1}^{{\dagger} }{S}_{2}{S}_{1}{S}_{2}^{{\dagger} }=I\), which is equivalent to S2S1 = S1S2, where we have utilized the fact that both S1 and S2 are unitary matrices. Since S1 and S2 are also normal matrices, this shows that they can be simultaneously diagonalized.

Similarly, let j ≠ i with j ∈ {1, 2, . . . , m} and \({S}_{3}={\bar{B}}_{j}^{1}\), then S1 and S3 can also be simultaneously diagonalized. Recall the definition of \({\bar{B}}_{j}^{1}\), whose eigenvectors are given by the complex conjugates of Eq. (4); then S1 can be diagonalized in the following two different ways:

$${S}_{1}=\mathop{\sum }\limits_{a=0}^{d-1}{g}_{a}{({\left\vert a\right\rangle }_{i}\,{}_{i}\!\left\langle a\right\vert )}^{* }=\mathop{\sum }\limits_{a=0}^{d-1}{h}_{a}{({\left\vert a\right\rangle }_{j}\,{}_{j}\!\left\langle a\right\vert )}^{* },$$
(9)

where for any a, ga and ha are unit complex numbers. Then

$${}_{i}^{*}\left\langle 0\right\vert {S}_{1}{\left\vert 0\right\rangle}_{i}^{* }={g}_{0}=\mathop{\sum}\limits_{a=0}^{d-1}{h}_{a}\cdot {|}_{i}^{*}{\langle 0| a\rangle}_{j}^{*}{|}^{2}.$$
(10)

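At the same time, for any a ∈ {0, 1, . . . , d−1} it can be verified that \(0 \,<\, {| }_{i}^{* }{\langle 0| a\rangle }_{j}^{* }{| }^{2} < 1\). Combining this with the fact that \(\mathop{\sum }\nolimits_{a = 0}^{d-1}{| }_{i}^{* }{\langle 0| a\rangle }_{j}^{* }{| }^{2}=1\), Eq. (10) expresses the unit complex number g0 as a convex combination of the unit complex numbers ha with strictly positive weights, which is possible only if all of them coincide. Hence there exists a γ ∈ [0, 2π) such that \({g}_{0}={h}_{0}=\cdots ={h}_{d-1}={e}^{{{{\bf{i}}}}\gamma }\), which implies that \({S}_{1}={e}^{{{{\bf{i}}}}\gamma }I\). According to the definition of S1, we now have that U1 = U2 up to a global phase, which completes the proof.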

The theorem shows the correctness of our plan, and we can indeed determine whether U1 and U2 have the same function by examining the underlying quantum nonlocality of \(({U}_{1}\otimes {U}_{2})\left\vert {{{\Phi }}}_{d}\right\rangle\).
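Theorem 1 can be illustrated numerically for small d. The sketch below is a minimal example under stated assumptions: real unitaries are sampled as random orthogonal matrices via a QR decomposition (a convenient choice for illustration, not a sampling scheme specified in the paper), and all function names are hypothetical.

```python
import numpy as np

def bell_operator(d, m):
    """Bell operator of Eq. (6), built from the eigenvectors of Eq. (4)."""
    omega = np.exp(2j * np.pi / d)
    k = np.arange(d)
    B = np.zeros((d * d, d * d), dtype=complex)
    for x in range(1, m + 1):
        alpha = (x - 0.5) / m
        basis = np.exp(2j * np.pi / d * np.outer(k, k - alpha)) / np.sqrt(d)
        for l in range(1, d):
            A = (basis * omega ** (k * l)) @ basis.conj().T
            B += np.kron(A, A.conj())
    return B

def random_orthogonal(d, rng):
    """A random real unitary (orthogonal) matrix from a QR decomposition."""
    q, _ = np.linalg.qr(rng.normal(size=(d, d)))
    return q

def bell_value(U1, U2, m):
    """I_{d,m} evaluated on (U1 (x) U2)|Phi_d>."""
    d = U1.shape[0]
    phi = np.eye(d).reshape(d * d) / np.sqrt(d)
    out = np.kron(U1, U2) @ phi
    return float(np.real(out.conj() @ bell_operator(d, m) @ out))

d, m = 4, 2
rng = np.random.default_rng(1)
U1, U2 = random_orthogonal(d, rng), random_orthogonal(d, rng)

print("Tsirelson bound:", m * (d - 1))
print("U1 vs U1 :", bell_value(U1, U1, m))
print("U1 vs -U1:", bell_value(U1, -U1, m))
print("U1 vs U2 :", bell_value(U1, U2, m))
```

Only the first two comparisons should reach the Tsirelson bound m(d−1); an unrelated pair should fall strictly below it.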

The approximate case

Since equivalence checking is an important issue in engineering applications, we need to address the situation in which quantum circuits are realized approximately. For example, the unitary operations U1 and U2 correspond to two different quantum circuits for the same quantum algorithm, hence they are supposed to be the same. However, due to certain mistakes, one of the quantum circuits contains some extra quantum gates, which implies that U1 ≠ U2. Here for simplicity, we suppose that the errors in realizing the quantum circuits are unitary errors. Note that this form of error covers the case in which the preparation of \(\left\vert {{{\Phi }}}_{d}\right\rangle\) is also affected by local unitary errors. Our numerical simulations show that more general forms of weak errors, expressed as quantum operations, can also be handled, though it is hard to provide analytical discussions as in the unitary case below.

Since U1 ≠ U2, if we do the Bell experiment introduced previously using U1 and U2, the Bell expression value \({I}_{d,m}(({U}_{1}\otimes {U}_{2})\left\vert {{{\Phi }}}_{d}\right\rangle )\) will not be exactly m(d−1). In this situation, an interesting problem is whether or not we can draw any nontrivial conclusions on D(U1, U2), the distance between U1 and U2, based on the value of \({I}_{d,m}(({U}_{1}\otimes {U}_{2})\left\vert {{{\Phi }}}_{d}\right\rangle )\). We now show that this is indeed the case, and furthermore, D(U1, U2) can be lower and upper bounded analytically.

In this paper, we choose the definition for D(U1, U2) given by ref. 22, which is

$$D({U}_{1},{U}_{2})=\sqrt{1-{\left\vert \frac{1}{d}{{{\rm{Tr}}}}({U}_{1}^{{\rm {T}}}{U}_{2})\right\vert }^{2}}.$$
(11)
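For concreteness, Eq. (11) translates directly into a few lines of NumPy. This is a minimal sketch for real matrices; the function name `distance` is illustrative.

```python
import numpy as np

def distance(U1, U2):
    """D(U1, U2) of Eq. (11) for real unitary (orthogonal) matrices."""
    d = U1.shape[0]
    overlap = abs(np.trace(U1.T @ U2)) / d
    # Guard against tiny negative values caused by floating-point rounding.
    return float(np.sqrt(max(0.0, 1.0 - overlap ** 2)))

identity = np.eye(2)
pauli_x = np.array([[0.0, 1.0], [1.0, 0.0]])

print(distance(identity, identity))   # equal circuits
print(distance(identity, -identity))  # equal up to a global phase
print(distance(identity, pauli_x))    # traceless product U1^T U2
```

The distance vanishes exactly when the two matrices agree up to a global phase, and it reaches its maximum value 1 when the product of the transpose of U1 with U2 is traceless.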

Meanwhile, we need to use the following key fact (see Supplementary Note 1 for its proof).

Lemma 1

Suppose \(\left\vert \psi \right\rangle\) is a d × d quantum state orthogonal to \(\left\vert {{{\Phi }}}_{d}\right\rangle\). Then

$$-m\le {I}_{d,m}(\left\vert \psi \right\rangle )\le m(d-2).$$
(12)

Having this fact, we are ready to give the second main result of the current paper.

Theorem 2

Suppose \(V={I}_{d,m}(({U}_{1}\otimes {U}_{2})\left\vert {{{\Phi }}}_{d}\right\rangle )\), then we have that

$$\sqrt{1-\frac{V+m}{md}}\le D({U}_{1},{U}_{2})\le \sqrt{1-\frac{V-m(d-2)}{m}}.$$
(13)

Proof

Let \(\left\vert \alpha \right\rangle =({U}_{1}\otimes {U}_{2})\left\vert {{{\Phi }}}_{d}\right\rangle =(I\otimes {U}_{2}{U}_{1}^{{\rm {T}}})\left\vert {{{\Phi }}}_{d}\right\rangle\). Suppose a spectral decomposition of \({U}_{2}{U}_{1}^{{\rm {T}}}\) is \({U}_{2}{U}_{1}^{{\rm {T}}}=\mathop{\sum }\nolimits_{j = 0}^{d-1}{e}^{{{{{\bf{i\theta }}}}}_{{{{\bf{j}}}}}}\left\vert {\lambda }_{j}\right\rangle \left\langle {\lambda }_{j}\right\vert\), where θj ∈ [0, 2π). Note that we also have \(\left\vert {{{\Phi }}}_{d}\right\rangle =\mathop{\sum }\nolimits_{j = 0}^{d-1}\left\vert {\lambda }_{j}\right\rangle {\left\vert {\lambda }_{j}\right\rangle }^{* }/\sqrt{d}\). Therefore, we have that

$$\left\vert \alpha \right\rangle =\mathop{\sum }\limits_{j=0}^{d-1}{e}^{{{{{\bf{i\theta }}}}}_{{{{\bf{j}}}}}}\left\vert {\lambda }_{j}\right\rangle {\left\vert {\lambda }_{j}\right\rangle }^{* }/\sqrt{d}.$$
(14)

Let \(\left\vert \alpha \right\rangle ={c}_{1}\left\vert {{{\Phi }}}_{d}\right\rangle +{c}_{2}\left\vert {{{\Phi }}}^{\perp }\right\rangle\), where c1 and c2 are complex numbers, \(| {c}_{1}{| }^{2}+| {c}_{2}{| }^{2}=1\), and \(\langle {{{\Phi }}}^{\perp }| {{{\Phi }}}_{d}\rangle =0\). Then it can be seen that

$${c}_{1}=\langle {{{\Phi }}}_{d}| \alpha \rangle =\mathop{\sum }\limits_{j=0}^{d-1}\frac{{e}^{{{{{\bf{i\theta }}}}}_{{{{\bf{j}}}}}}}{d}=\frac{{{{\rm{Tr}}}}({U}_{2}{U}_{1}^{{\rm {T}}})}{d},$$
(15)

which means that \(D{({U}_{1},{U}_{2})}^{2}=1-| {c}_{1}{| }^{2}\).

For convenience, let \(B=\mathop{\sum }\nolimits_{i = 1}^{m}\mathop{\sum }\nolimits_{l = 1}^{d-1}({A}_{i}^{l}\otimes {\bar{B}}_{i}^{l})\). Then it holds that

$$\begin{array}{ll}V&=\left\langle \alpha \right\vert B\left\vert \alpha \right\rangle =({c}_{1}^{* }\left\langle {{{\Phi }}}_{d}\right\vert +{c}_{2}^{* }\left\langle {{{\Phi }}}^{\perp }\right\vert )B({c}_{1}\left\vert {{{\Phi }}}_{d}\right\rangle +{c}_{2}\left\vert {{{\Phi }}}^{\perp }\right\rangle )\\ &=| {c}_{1}{| }^{2}\left\langle {{{\Phi }}}_{d}\right\vert B\left\vert {{{\Phi }}}_{d}\right\rangle +| {c}_{2}{| }^{2}\left\langle {{{\Phi }}}^{\perp }\right\vert B\left\vert {{{\Phi }}}^{\perp }\right\rangle \\ &=| {c}_{1}{| }^{2}\cdot m(d-1)+(1-| {c}_{1}{| }^{2})\left\langle {{{\Phi }}}^{\perp }\right\vert B\left\vert {{{\Phi }}}^{\perp }\right\rangle .\end{array}$$
(16)

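Here the cross terms vanish because B is Hermitian and \(\left\vert {{{\Phi }}}_{d}\right\rangle\) is an eigenvector of B with eigenvalue m(d−1), so that \(\left\langle {{{\Phi }}}^{\perp }\right\vert B\left\vert {{{\Phi }}}_{d}\right\rangle =0\). According to Lemma 1, we have that \(-m\le \left\langle {{{\Phi }}}^{\perp }\right\vert B\left\vert {{{\Phi }}}^{\perp }\right\rangle \le m(d-2)\), which means that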

$$\sqrt{\frac{V-m(d-2)}{m}}\le | {c}_{1}| \le \sqrt{\frac{V+m}{md}}.$$
(17)

Combining this with the fact that \(D{({U}_{1},{U}_{2})}^{2}=1-| {c}_{1}{| }^{2}\), we complete the proof.

Note that when V = m(d−1), both the lower and the upper bounds are exactly 0, implying that both of them are tight in this case. When V does not achieve m(d−1), the lower bound for D(U1, U2) reveals the minimum distance between U1 and U2, thus in some sense it is more informative than the upper bound.
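The bounds of Theorem 2 can be checked on random instances with a short script. The sketch below is a minimal example, assuming that random real unitaries are drawn as orthogonal matrices from a QR decomposition (an illustrative choice, not necessarily the sampling used for Fig. 1); all helper names are hypothetical.

```python
import numpy as np

def bell_operator(d, m):
    """Bell operator of Eq. (6), built from the eigenvectors of Eq. (4)."""
    omega = np.exp(2j * np.pi / d)
    k = np.arange(d)
    B = np.zeros((d * d, d * d), dtype=complex)
    for x in range(1, m + 1):
        alpha = (x - 0.5) / m
        basis = np.exp(2j * np.pi / d * np.outer(k, k - alpha)) / np.sqrt(d)
        for l in range(1, d):
            A = (basis * omega ** (k * l)) @ basis.conj().T
            B += np.kron(A, A.conj())
    return B

def distance(U1, U2):
    """D(U1, U2) of Eq. (11)."""
    overlap = abs(np.trace(U1.T @ U2)) / U1.shape[0]
    return float(np.sqrt(max(0.0, 1.0 - overlap ** 2)))

def theorem2_bounds(V, d, m):
    """Lower and upper bounds of Eq. (13); the upper bound may exceed 1."""
    lower = np.sqrt(max(0.0, 1.0 - (V + m) / (m * d)))
    upper = np.sqrt(max(0.0, 1.0 - (V - m * (d - 2)) / m))
    return float(lower), float(upper)

d, m = 4, 2
B = bell_operator(d, m)
phi = np.eye(d).reshape(d * d) / np.sqrt(d)
rng = np.random.default_rng(2)

results = []
for _ in range(20):
    U1, _ = np.linalg.qr(rng.normal(size=(d, d)))
    U2, _ = np.linalg.qr(rng.normal(size=(d, d)))
    out = np.kron(U1, U2) @ phi
    V = float(np.real(out.conj() @ B @ out))
    lower, upper = theorem2_bounds(V, d, m)
    results.append((lower, distance(U1, U2), upper))

for lower, dist, upper in results[:5]:
    print(f"{lower:.4f} <= {dist:.4f} <= {upper:.4f}")
```

At V = m(d−1) the function `theorem2_bounds` returns 0 for both bounds, matching the remark on tightness above.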

To examine the performance of the above analytical bounds, we test them with numerical simulations. For this, we generate many random instances for U1 and U2, then for each pair of U1 and U2 we compute the corresponding exact values of D(U1, U2), which are next compared with the lower and upper bounds for D(U1, U2) given by Theorem 2. The results are listed in Fig. 1, where it can be seen that the lower bound is quite tight in many instances.

Fig. 1: The values of D(U1, U2) and V.

Here d = 4, m = 2, and each blue point represents a pair of U1 and U2 that is randomly generated, on which the exact values of D(U1, U2) and V are given. The red and the orange solid lines are, respectively, the lower and the upper bounds provided by Theorem 2.

Direct determination of the distance D(U1, U2)

In Fig. 1, it can be observed that in most cases the upper bound for D(U1, U2) given by Theorem 2 is quite loose compared with the lower bound. From the proof of Theorem 2, it can be seen that the reason is that the bound \(\left\langle {{{\Phi }}}^{\perp }\right\vert B\left\vert {{{\Phi }}}^{\perp }\right\rangle \le m(d-2)\) we have utilized is far from tight in most cases. If we could somehow improve the upper bound for \(\left\langle {{{\Phi }}}^{\perp }\right\vert B\left\vert {{{\Phi }}}^{\perp }\right\rangle\), our estimation for D(U1, U2) would become more accurate accordingly.

To understand the behavior of \(\left\langle \phi \right\vert B\left\vert \phi \right\rangle\), we study its value for a uniformly random pure state \(\left\vert \phi \right\rangle\). It turns out that \(\left\langle \phi \right\vert B\left\vert \phi \right\rangle\) is very small with a probability close to 1. Particularly, we have the following fact, and its proof can be seen in Supplementary Note 2.

Lemma 2

Let 0 < δ < 1, and suppose \(\left\vert \psi \right\rangle\) is a d × d quantum state which, as a unit vector, is chosen uniformly at random on the \({d}^{2}\)-dimensional real unit sphere. Then with probability no less than 1−δ it holds that

$${I}_{d,m}(\left\vert \psi \right\rangle )\le m\sqrt{\frac{4}{3d\delta }}.$$
(18)

Though for a random pair U1 and U2, it is possible that the distribution of \(\left\vert {{{\Phi }}}^{\perp }\right\rangle\) is not uniformly random, the above lemma still helps us to understand why the estimation \(\left\langle {{{\Phi }}}^{\perp }\right\vert B\left\vert {{{\Phi }}}^{\perp }\right\rangle \le m(d-2)\) is quite loose overall. Inspired by this, we now adjust the structure of our protocol, and the purpose is to make sure that the new value of \(\left\langle {{{\Phi }}}^{\perp }\right\vert B\left\vert {{{\Phi }}}^{\perp }\right\rangle\) is low.

Suppose U1 and U2 are the two n-qubit circuits that we want to compare. Now we construct a 2n-qubit circuit as shown in Fig. 2, and denote it as \({U}_{1}^{{\prime} }\), where U1 is a part of \({U}_{1}^{{\prime} }\). And \({U}_{2}^{{\prime} }\) is constructed similarly. Then we apply our protocol to compare the new quantum circuits \({U}_{1}^{{\prime} }\) and \({U}_{2}^{{\prime} }\), whose size is now larger.

Fig. 2: The 2n-qubit circuit \({U}_{1}^{{\prime} }\).

This circuit contains the n-qubit circuit U1.

We now prove that this adjustment pins down the new value of \(\left\langle {{{\Phi }}}^{\perp }\right\vert B\left\vert {{{\Phi }}}^{\perp }\right\rangle\) to be −m, which is actually the smallest possible. As a result, the upper bound for \(D({U}_{1}^{{\prime} },{U}_{2}^{{\prime} })\) given by Theorem 2 now matches the lower bound completely. That is to say, from the value of the Bell expression \({I}_{d,m}(({U}_{1}^{{\prime} }\otimes {U}_{2}^{{\prime} })\left\vert {{{\Phi }}}_{d}\right\rangle )\), \(D({U}_{1},{U}_{2})=D({U}_{1}^{{\prime} },{U}_{2}^{{\prime} })\) can be determined directly, where \(d={2}^{2n}\).

Theorem 3

Suppose \(V={I}_{d,m}(({U}_{1}^{{\prime} }\otimes {U}_{2}^{{\prime} })\left\vert {{{\Phi }}}_{d}\right\rangle )\) where \(d={2}^{2n}\), then we have that

$$D({U}_{1},{U}_{2})=D({U}_{1}^{{\prime} },{U}_{2}^{{\prime} })=\sqrt{1-\frac{V+m}{md}}.$$
(19)

Proof

Denote the operation of all the controlled-Z gates combined in Fig. 2 by UZ (as a unitary matrix on 2n qubits). That is, \({U}_{1}^{{\prime} }={U}_{{{{\rm{Z}}}}}({U}_{1}\otimes I),{U}_{2}^{{\prime} }={U}_{{{{\rm{Z}}}}}({U}_{2}\otimes I)\). Then

$$\begin{array}{ll}D({U}_{1}^{{\prime} },{U}_{2}^{{\prime} })&=\sqrt{1-{\left\vert \frac{1}{{2}^{2n}}{{{\rm{Tr}}}}({U}_{1}^{{{{\prime} T}}}{U}_{2}^{{\prime} })\right\vert }^{2}}\\ &=\sqrt{1-{\left\vert \frac{1}{{2}^{2n}}{{{\rm{Tr}}}}(({U}_{1}^{{\rm{T}}}\otimes I){{U}_{{{{\rm{Z}}}}}}^{{\rm{T}}}{U}_{{{{\rm{Z}}}}}({U}_{2}\otimes I))\right\vert }^{2}}\\ &=\sqrt{1-{\left\vert \frac{1}{{2}^{2n}}{{{\rm{Tr}}}}(({U}_{1}^{{\rm{T}}}\otimes I)({U}_{2}\otimes I))\right\vert }^{2}}\\ &=\sqrt{1-{\left\vert \frac{1}{{2}^{2n}}{{{\rm{Tr}}}}({U}_{1}^{{\rm{T}}}{U}_{2}\otimes I)\right\vert }^{2}}\\ &=\sqrt{1-{\left\vert \frac{1}{{2}^{n}}{{{\rm{Tr}}}}({U}_{1}^{{\rm{T}}}{U}_{2})\right\vert }^{2}}\\ &=D({U}_{1},{U}_{2}).\end{array}$$
(20)

From the proof of Lemma 1 (see Supplementary Note 1), we already know that if we let \(({U}_{1}^{{\prime} }\otimes {U}_{2}^{{\prime} })\left\vert {{{\Phi }}}_{d}\right\rangle =\mathop{\sum }\nolimits_{k = 0}^{d-1}\mathop{\sum }\nolimits_{j = 0}^{d-1}{\gamma }_{kj}\left\vert k\right\rangle \left\vert j\right\rangle\), it holds that

$${I}_{d,m}(({U}_{1}^{{\prime} }\otimes {U}_{2}^{{\prime} })\left\vert {{{\Phi }}}_{d}\right\rangle )=m\mathop{\sum }\limits_{r=0}^{d-1}\left({\left\vert \mathop{\sum }\limits_{k = 0}^{d-r-1}{\gamma }_{k(k+r)}\right\vert }^{2}+{\left\vert \mathop{\sum }\limits_{k = d-r}^{d-1}{\gamma }_{k(k+r-d)}\right\vert }^{2}\right)-m.$$
(21)
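The closed form in Eq. (21) can be compared numerically against a direct evaluation of the Bell operator of Eq. (6) on an arbitrary state with amplitudes γkj. The following is a minimal sketch with m = 2 and d = 4 chosen for illustration; the function names are hypothetical.

```python
import numpy as np

def bell_operator(d, m):
    """Bell operator of Eq. (6), built from the eigenvectors of Eq. (4)."""
    omega = np.exp(2j * np.pi / d)
    k = np.arange(d)
    B = np.zeros((d * d, d * d), dtype=complex)
    for x in range(1, m + 1):
        alpha = (x - 0.5) / m
        basis = np.exp(2j * np.pi / d * np.outer(k, k - alpha)) / np.sqrt(d)
        for l in range(1, d):
            A = (basis * omega ** (k * l)) @ basis.conj().T
            B += np.kron(A, A.conj())
    return B

def bell_value_direct(gamma, m):
    """<phi|B|phi> for |phi> = sum_{k,j} gamma[k, j] |k>|j>."""
    d = gamma.shape[0]
    state = gamma.reshape(d * d)
    return float(np.real(state.conj() @ bell_operator(d, m) @ state))

def bell_value_eq21(gamma, m):
    """Right-hand side of Eq. (21): squared sums along shifted diagonals."""
    d = gamma.shape[0]
    total = 0.0
    for r in range(d):
        first = sum(gamma[k, k + r] for k in range(d - r))
        second = sum(gamma[k, k + r - d] for k in range(d - r, d))
        total += abs(first) ** 2 + abs(second) ** 2
    return float(m * total - m)

d, m = 4, 2
rng = np.random.default_rng(3)
gamma = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
gamma /= np.linalg.norm(gamma)  # normalize the state

print(bell_value_direct(gamma, m), bell_value_eq21(gamma, m))
```

The two printed numbers should agree up to floating-point error.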

Now let us notice the following properties of γkj. Let \(k=\overline{{a}_{1}{a}_{2}\ldots {a}_{n}{b}_{1}{b}_{2}\ldots {b}_{n}}\) and \(j=\overline{{c}_{1}{c}_{2}\ldots {c}_{n}{d}_{1}{d}_{2}\ldots {d}_{n}}\) be binary representations of k and j, where ai, bi, ci, di ∈ {0, 1} for 1 ≤ i ≤ n. Then based on the construction of \({U}_{1}^{{\prime} }\) and \({U}_{2}^{{\prime} }\) given by Fig. 2, it can be verified that

  1. If \(\overline{{a}_{1}{a}_{2}\ldots {a}_{n}}\,\ne \,\overline{{c}_{1}{c}_{2}\ldots {c}_{n}}\), then γkj = 0;

  2. If \(\overline{{a}_{1}{a}_{2}\ldots {a}_{n}}=\overline{{c}_{1}{c}_{2}\ldots {c}_{n}}\) and bi ≠ di for some i, we let \(k^{\prime} =\overline{{a}_{1}{a}_{2}\ldots {a}_{i-1}(1-{a}_{i}){a}_{i+1}\ldots {a}_{n}{b}_{1}{b}_{2}\ldots {b}_{n}}\), and \(j^{\prime} =\overline{{c}_{1}{c}_{2}\ldots {c}_{i-1}(1-{c}_{i}){c}_{i+1}\ldots {c}_{n}{d}_{1}{d}_{2}\ldots {d}_{n}}\), then \({\gamma }_{k^{\prime} j^{\prime} }=-{\gamma }_{kj}\), where we have utilized the facts that only one of ai and 1 − ai can trigger the Z operators on the positions bi and di and that bi ≠ di.

By applying these two properties repeatedly, one can prove that \(\mathop{\sum }\nolimits_{k = 0}^{d-r-1}{\gamma }_{k(k+r)}=\mathop{\sum }\nolimits_{k = d-r}^{d-1}{\gamma }_{k(k+r-d)}=0\) when r ≠ 0. Thus we have that

$$\begin{array}{l}{I}_{d,m}(({U}_{1}^{{\prime} }\otimes {U}_{2}^{{\prime} })\left\vert {{{\Phi }}}_{d}\right\rangle )=\,m{\left\vert \mathop{\sum }\limits_{k = 0}^{d-1}{\gamma }_{kk}\right\vert }^{2}-m\\\qquad\qquad\qquad\qquad\quad =\,md{\left\vert \left\langle {{{\Phi }}}_{d}\right\vert ({U}_{1}^{{\prime} }\otimes {U}_{2}^{{\prime} })\left\vert {{{\Phi }}}_{d}\right\rangle \right\vert }^{2}-m\\\qquad\qquad\qquad\qquad\quad =\,md(1-D{({U}_{1}^{{\prime} },{U}_{2}^{{\prime} })}^{2})-m.\end{array}$$
(22)

That is, \(D({U}_{1}^{{\prime} },{U}_{2}^{{\prime} })=\sqrt{1-\frac{V+m}{md}}\).
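The relation above can be evaluated directly from an observed Bell value. The sketch below is illustrative only: the function name is hypothetical, V denotes the observed value of the Bell expression, and clamping the radicand to [0, 1] is an added assumption that absorbs statistical noise in V.

```python
import math

def distance_from_bell_value(v, d, m):
    """Return sqrt(1 - (V + m) / (m d)) for an observed Bell value V."""
    radicand = 1.0 - (v + m) / (m * d)
    # Assumption: clamp to [0, 1] so that sampling noise in V cannot
    # lead to the square root of a negative number.
    return math.sqrt(min(1.0, max(0.0, radicand)))

d, m = 16, 2  # d = 2^(2n) with n = 2
print(distance_from_bell_value(m * (d - 1), d, m))  # maximal Bell value → 0.0
print(distance_from_bell_value(-m, d, m))           # minimal Bell value → 1.0
```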

Therefore, to determine the distance between two n-qubit quantum circuits, we can embed them into two larger 2n-qubit quantum circuits and then apply our original protocol to the latter. Though the cost is slightly higher, the estimation of the distance can be much more accurate. We also perform numerical simulations to verify our modified protocol, where again random U1 and U2 are sampled. The results are shown in Fig. 3.

Fig. 3: Determining the distance D(U1, U2) based on Theorem 3.

Here U1 and U2 are randomly sampled, and m = 2. Every blue point corresponds to picking a specific pair of U1 and U2 and then running the Monte Carlo process described in the section “The analysis of computational cost”, with s = 100, 1000, and 10,000, respectively. We sample U1 and U2 on 1, 2, or 3 qubits, which means the experiments are performed on two circuits of 2, 4, or 6 qubits. The red solid line is given by Theorem 3.

The analysis of computational cost

Now let us analyze the computational cost of our modified protocol, that is, the number of times that we have to run the unknown circuits in order to obtain a good estimate of the distance D(U1, U2) based on Theorem 3. For convenience, we reformulate the Bell expression as below; the corresponding details can be found in ref. 21.

$$\begin{array}{l}{I}_{d,m}=dm{I}_{d,m}^{{\prime} }-m\\ {I}_{d,m}^{{\prime} }=\frac{1}{m}\mathop{\sum }\limits_{k=0}^{d-1}\mathop{\sum }\limits_{i=1}^{m}{\alpha }_{k}[P({A}_{i}={B}_{i}+k)+P({B}_{i}={A}_{i+1}+k)]\end{array}$$
(23)

where \({\alpha }_{k}=\frac{1}{2d}\tan (\frac{\pi }{2m})\cot (\frac{\pi }{d}(k+\frac{1}{2m}))\) and \({A}_{m+1}={A}_{1}+1\). For simplicity, in this section \({I}_{d,m}\) and \({I}_{d,m}^{{\prime} }\) are short for \({I}_{d,m}(({U}_{1}^{{\prime} }\otimes {U}_{2}^{{\prime} })\left\vert {{{\Psi }}}_{d}\right\rangle )\) and \({I}_{d,m}^{{\prime} }(({U}_{1}^{{\prime} }\otimes {U}_{2}^{{\prime} })\left\vert {{{\Psi }}}_{d}\right\rangle )\), respectively. Since \(-m\le {I}_{d,m}\le m(d-1)\), we have \(0\le {I}_{d,m}^{{\prime} }\le 1\). Meanwhile, Theorem 3 implies that \(D({U}_{1},{U}_{2})=\sqrt{1-{I}_{d,m}^{{\prime} }}\).

Now we consider the estimation of \({I}_{d,m}^{{\prime} }\), where \(d={2}^{2n}\). First, Alice and Bob apply the circuits to their own subsystems of the maximally entangled state to obtain \(({U}_{1}^{{\prime} }\otimes {U}_{2}^{{\prime} })\left\vert {{{\Psi }}}_{d}\right\rangle\). Then they choose r ∈ {0, 1} and i ∈ {1, 2, …, m} uniformly at random. If r = 0, Alice and Bob perform the measurements \({A}_{i}\) and \({B}_{i}\), respectively, obtain the outcomes a and b, and return \(2{\alpha }_{(a-b)\,{\rm{mod}}\,d}\). If r = 1, they perform the measurements \({A}_{i+1}\) and \({B}_{i}\), obtain the outcomes a and b, and return \(2{\alpha }_{(b-a)\,{\rm{mod}}\,d}\). They repeat the above process s times. Denote the returned values by Xj, j = 1, 2, …, s. Then \(X\equiv \frac{1}{s}\mathop{\sum }\nolimits_{j = 1}^{s}{X}_{j}\) is an estimate of \({I}_{d,m}^{{\prime} }\).
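The sampling loop described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `sample_outcomes` is a hypothetical black box standing in for one run of the experiment, and the convention A_{m+1} = A_1 + 1 is realized by measuring A_1 and shifting Alice's outcome by one modulo d.

```python
import math
import random

def alpha(k, d, m):
    """Coefficient alpha_k of Eq. (23), with cot(x) written as 1 / tan(x)."""
    return (math.tan(math.pi / (2 * m))
            / (2 * d * math.tan(math.pi / d * (k + 1 / (2 * m)))))

def estimate_bell_value(sample_outcomes, d, m, shots, rng):
    """Monte Carlo estimate X of I'_{d,m} from `shots` repetitions.

    `sample_outcomes(x, y, rng)` is a hypothetical callback: it returns the
    outcomes (a, b) in {0, ..., d-1} when Alice measures A_x and Bob B_y.
    """
    total = 0.0
    for _ in range(shots):
        r = rng.randrange(2)           # r in {0, 1}
        i = rng.randrange(1, m + 1)    # i in {1, ..., m}
        if r == 0:
            a, b = sample_outcomes(i, i, rng)
            total += 2 * alpha((a - b) % d, d, m)
        else:
            if i < m:
                a, b = sample_outcomes(i + 1, i, rng)
            else:
                # A_{m+1} = A_1 + 1: measure A_1 and shift the outcome by one.
                a, b = sample_outcomes(1, i, rng)
                a = (a + 1) % d
            total += 2 * alpha((b - a) % d, d, m)
    return total / shots

# Toy stand-in for the experiment (not a quantum model): both parties
# always report outcome 0, whatever the measurement settings.
toy = lambda x, y, rng: (0, 0)
print(estimate_bell_value(toy, d=4, m=2, shots=20000, rng=random.Random(1)))
```

In an actual experiment the callback would be replaced by one preparation of the entangled state, one application of the two circuits, and one pair of measurements.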

Indeed, note that \({\mathbb{E}}({X}_{j})={I}_{d,m}^{{\prime} }\), which means \({\mathbb{E}}(X)={I}_{d,m}^{{\prime} }\). Furthermore, since \(| {\alpha }_{k}| \le 1\), each Xj is bounded, and by Hoeffding’s inequality, if \(s \,> \,8\log (1/\delta )/{\epsilon }^{2}\), we have that

$$P(| X-{I}_{d,m}^{{\prime} }| \ge \epsilon )\le \delta .$$
(24)

That is to say, to estimate the value of \({I}_{d,m}^{{\prime} }\) within additive error ϵ with failure probability at most δ, the cost of our protocol is \(O(\log (1/\delta )/{\epsilon }^{2})\), which is completely independent of the dimension. Then, according to Theorem 3, to estimate D = D(U1, U2) within additive error ϵ, the cost of our protocol is \(O(\log (1/\delta )/({D}^{2}{\epsilon }^{2}))\) if D > ϵ, or \(O(\log (1/\delta )/{\epsilon }^{4})\) if 0 ≤ D ≤ ϵ.
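To make these scalings concrete, the sketch below turns the bound s > 8 log(1/δ)/ϵ² into explicit shot counts. The function names are hypothetical, and the constants in the second function are an assumption: they rest on the elementary observation that an additive error η on the estimated Bell value changes the square root defining D by at most η/D and by at most √η.

```python
import math

def shots_for_bell_value(eta, delta):
    """Smallest integer s with s > 8 log(1/delta) / eta^2."""
    return math.floor(8 * math.log(1 / delta) / eta ** 2) + 1

def shots_for_distance(eps, delta, d_lower=0.0):
    """Shots sufficient to estimate D within eps, given a lower bound on D.

    Assumption: an error eta on I' shifts D = sqrt(1 - I') by at most
    eta / D and by at most sqrt(eta), so eta = max(d_lower * eps, eps^2)
    is a sufficient accuracy for I'.
    """
    eta = max(d_lower * eps, eps ** 2)
    return shots_for_bell_value(eta, delta)

# D known to be at least 0.5 versus no prior knowledge about D.
print(shots_for_distance(0.1, 0.05, d_lower=0.5))
print(shots_for_distance(0.1, 0.05))
```

Neither count depends on the dimension d, in line with the cost analysis above.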

As a comparison, we can consider an alternative approach to verifying whether U1 and U2 are the same, which performs quantum process tomography (QPT) on U1 and U2 separately and then compares the two outputs. The standard QPT technique needs to estimate roughly \(O({d}^{4})\) quantities. Recently, QPT protocols have been customized to characterize unitary operations23,24, which reduces the cost to \(O({d}^{2})\). The cost of our protocol is much lower than that of QPT and does not grow exponentially with the number of qubits, which means our protocol is practical in the era of large-scale quantum computation.

Lastly, we would like to stress that the measurements involved in our protocol can be physically implemented by a series of single-qubit measurements. In fact, it is not hard to verify that the observable eigenvectors given in Eqs. (4) and (5) can always be decomposed as tensor products of single-qubit pure states as below:

$$\begin{array}{l}{\left\vert a\right\rangle }_{x}=\frac{1}{\sqrt{d}}\mathop{\sum }\limits_{k=0}^{d-1}\exp \left[\frac{2\pi i}{d}k(a-{\alpha }_{x})\right]\left\vert k\right\rangle\\\qquad = \mathop{ \bigotimes }\limits_{j = 1}^{n}\left(\left\vert 0\right\rangle +\exp \left[\frac{2\pi i}{d}{2}^{j-1}(a-{\alpha }_{x})\right]\left\vert 1\right\rangle \right)/\sqrt{2},\\ {\left\vert b\right\rangle }_{y}\,=\frac{1}{\sqrt{d}}\mathop{\sum }\limits_{k=0}^{d-1}\exp \left[-\frac{2\pi i}{d}k(b-{\beta }_{y})\right]\left\vert k\right\rangle \\\qquad = \mathop{\bigotimes }\limits_{j = 1}^{n}\left(\left\vert 0\right\rangle +\exp \left[-\frac{2\pi i}{d}{2}^{j-1}(b-{\beta }_{y})\right]\left\vert 1\right\rangle \right)/\sqrt{2}.\end{array}$$
(25)
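The factorization in Eq. (25) can be checked numerically. The sketch below compares the two forms of the first eigenvector amplitude by amplitude; it assumes that qubit j carries bit j − 1 of k, that is, k = Σ_j k_j 2^(j−1), and it treats the offset α_x as an arbitrary real number, since its value is fixed elsewhere in the paper.

```python
import cmath
import math

def eigenvector_sum_form(a, offset, n):
    """Amplitudes (1/sqrt(d)) exp[2 pi i k (a - offset) / d] for k = 0..d-1."""
    d = 2 ** n
    return [cmath.exp(2j * math.pi * k * (a - offset) / d) / math.sqrt(d)
            for k in range(d)]

def eigenvector_product_form(a, offset, n):
    """Amplitudes of the tensor product of n single-qubit states.

    Assumption: qubit j (1 <= j <= n) carries bit j-1 of the index k.
    """
    d = 2 ** n
    amps = []
    for k in range(d):
        amp = 1.0 + 0.0j
        for j in range(1, n + 1):
            if (k >> (j - 1)) & 1:
                # Qubit j is in |1>: pick up its relative phase.
                amp *= cmath.exp(2j * math.pi * 2 ** (j - 1) * (a - offset) / d)
            amp /= math.sqrt(2)
        amps.append(amp)
    return amps

def max_deviation(a, offset, n):
    """Largest amplitude difference between the two forms."""
    lhs = eigenvector_sum_form(a, offset, n)
    rhs = eigenvector_product_form(a, offset, n)
    return max(abs(x - y) for x, y in zip(lhs, rhs))

# Should print a value at the level of floating-point rounding error.
print(max_deviation(a=5, offset=0.25, n=3))
```

The second eigenvector in Eq. (25) differs only by the sign of the phase, so the same check applies after flipping that sign.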

As a result, to measure the original observables characterized by Eqs. (4) and (5), one only needs to measure the quantum system qubit by qubit, from j = n to j = 1, which yields the original measurement outcome bit by bit. This implies that it is realistic to implement our protocol physically.
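One way to picture the bit-by-bit readout is the following sketch, which is our own illustration rather than a procedure spelled out in the text. It takes an exact eigenvector of the first form in Eq. (25), described by the relative phases of its n qubits, and recovers the label a from qubit n down to qubit 1, choosing each single-qubit basis from the bits already obtained. The offset again stands for α_x and is treated as a known real number.

```python
import cmath
import math

def qubit_phases(a, offset, n):
    """Relative phases of the n qubits of |a>_x; entry j-1 belongs to qubit j."""
    d = 2 ** n
    return [2 * math.pi * 2 ** (j - 1) * (a - offset) / d
            for j in range(1, n + 1)]

def read_out_bit_by_bit(phases, offset):
    """Recover the outcome a from qubit n down to qubit 1.

    Each step measures one qubit in a basis fixed by the bits already found.
    For an exact eigenvector the outcome is deterministic, so the more
    likely of the two results is simply selected.
    """
    n = len(phases)
    known = 0  # equals a mod 2^t after t steps
    for t in range(n):
        j = n - t
        modulus = 2 ** (t + 1)
        probs = []
        for bit in (0, 1):
            # Phase the qubit would carry if the next bit of a were `bit`.
            cand = 2 * math.pi * (known + bit * 2 ** t - offset) / modulus
            overlap = (1 + cmath.exp(1j * (phases[j - 1] - cand))) / 2
            probs.append(abs(overlap) ** 2)
        if probs[1] > probs[0]:
            known += 2 ** t
    return known

n, offset = 4, 0.25
print([read_out_bit_by_bit(qubit_phases(a, offset, n), offset)
       for a in range(2 ** n)])
```

Qubit n reveals the lowest bit of a, and each later qubit reveals the next bit once the lower ones are known, which is why the order runs from j = n to j = 1.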

The equivalence checking of multiple quantum circuits

Now let us go one step further. Suppose we have k ≥ 3 quantum circuits C1, C2, ..., Ck, and again we want to know whether they are equivalent to each other. Clearly, we can solve the problem by comparing these quantum circuits pair by pair, but in the worst case we need to run the above two-circuit protocol k − 1 times. Given the success in the two-circuit case, we may wonder whether we can design a similar protocol in which a proper k-partite Bell inequality allows us to solve the multi-circuit problem in one go. We now show that, at least when k is odd, this is impossible.

Recall that a key part of our protocol is finding a k-partite quantum state \(\left\vert {\psi }_{k}\right\rangle\) and a certain Bell inequality such that \(\left\vert {\psi }_{k}\right\rangle\) violates it maximally. Furthermore, \(\left\vert {\psi }_{k}\right\rangle\) has to satisfy the condition that for any local unitary matrix U, it holds that \((U\otimes U\otimes \cdots \otimes U)\left\vert {\psi }_{k}\right\rangle =\left\vert {\psi }_{k}\right\rangle\).

For simplicity, we now suppose that the local dimension of each party is 2; the following argument can easily be generalized to higher-dimensional cases. Then we have that

$$({\sigma }_{x}\otimes {\sigma }_{x}\otimes \cdots \otimes {\sigma }_{x})\left\vert {\psi }_{k}\right\rangle =\left\vert {\psi }_{k}\right\rangle$$

and

$$({\sigma }_{z}\otimes {\sigma }_{z}\otimes \cdots \otimes {\sigma }_{z})\left\vert {\psi }_{k}\right\rangle =\left\vert {\psi }_{k}\right\rangle ,$$

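where σx and σz are Pauli matrices. However, since k is odd, \({\sigma }_{x}\otimes {\sigma }_{x}\otimes \cdots \otimes {\sigma }_{x}\) and \({\sigma }_{z}\otimes {\sigma }_{z}\otimes \cdots \otimes {\sigma }_{z}\) anticommute. Applying both operators to \(\left\vert {\psi }_{k}\right\rangle\) in the two possible orders then gives \(\left\vert {\psi }_{k}\right\rangle =-\left\vert {\psi }_{k}\right\rangle\), which means that \(\left\vert {\psi }_{k}\right\rangle\) is the zero vector, a contradiction.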
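The parity dependence used in this argument is easy to confirm by brute force. The sketch below builds the k-fold tensor powers of σx and σz with exact integer arithmetic and reports whether they commute or anticommute; the helper names are illustrative.

```python
def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    """Plain matrix product for lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def tensor_power(op, k):
    """k-fold tensor power of a single-qubit operator."""
    result = [[1]]
    for _ in range(k):
        result = kron(result, op)
    return result

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

def commutation_sign(k):
    """Return s in {+1, -1} such that X^(k) Z^(k) = s Z^(k) X^(k), else None."""
    x, z = tensor_power(SIGMA_X, k), tensor_power(SIGMA_Z, k)
    xz, zx = matmul(x, z), matmul(z, x)
    for sign in (1, -1):
        if all(p == sign * q for rp, rq in zip(xz, zx) for p, q in zip(rp, rq)):
            return sign
    return None

print([commutation_sign(k) for k in (2, 3, 4, 5)])  # → [1, -1, 1, -1]
```

For even k the two operators commute, which is consistent with the remark below that the even case cannot be ruled out by this argument.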

Therefore, when k is odd, we cannot generalize our two-circuit protocol to solve the equivalence checking problem in one go. However, we cannot rule out this possibility for the case that k is even, where the major challenge is to find a desirable multipartite Bell inequality. We leave this for future work.

Discussion

In this paper, we have proposed a protocol for black-box equivalence checking of quantum circuits, where the key quantum property we have utilized is quantum nonlocality. We have proved the correctness of our protocol analytically and numerically. Particularly, we have shown that for any given strength of observed quantum nonlocality, the distance between two compared quantum circuits can be estimated accurately in an analytical manner. Furthermore, it turns out that the computational cost of our protocol is independent of the size of compared quantum circuits. Our work can be regarded as a nontrivial application of quantum nonlocality in the area of quantum engineering, and we hope this protocol can be applied in future quantum industries.