Introduction

Random numbers play a vital role in various tasks, such as cryptography,1 numerical simulation,2 and lottery. For example, in the well-known quantum key distribution (QKD) protocol proposed by Bennett and Brassard,3 the security is guaranteed by random choices of the encoding and measurement bases. Distinct from deterministic evolution of classical processes, quantum mechanics endows the capability of generating genuine randomness by collapsing the coherence in the measurement basis.4,5

According to the generation speed and the randomness reliability, quantum random number generators (QRNGs) can be categorized into following three types. Practical QRNGs, which assume well characterized devices, normally have a fast generation speed.6,7,8 Fully self-testing QRNGs, which adopt no assumptions on device implementations, generally have a low randomness generation speed owing to the stringent requirements.9,10,11,12,13 Semi-self-testing QRNGs lies somewhere in between, which have certain assumptions of device implementations while have high randomness generation speed in the meantime.14,15,16,17,18,19,20,21 We refer to ref. 22 for a detailed review of the developments of different types of QRNGs. For these three types, a tradeoff between the randomness generation speed and the randomness reliability exists in practice. In many tasks such as QKD, both the randomness generation speed and the reliability are required in order to ensure the key generation rate and the security. For those tasks, semi-self-testing QRNGs serve as promising candidates that fulfill both requirements.

Recently, several semi-self-testing QRNG schemes have been proposed.14,15,16,17,18,19,20,21 By assuming the underlying dimension and the independence of the source and the measurement, a QRNG scheme14 has been proposed such that the output randomness can be self-tested. While, as the randomness is certified by the input and output statistics, the random number generation rate is only about 23 bps. The generation rate was further improved to the order of MHz with input and output statistics and weaker assumptions.18,19 Later, with trusted measurement but uncharacterized randomness source, a source-independent (SI) QRNG scheme is proposed,15 where the randomness generation speed is analyzed to be comparable to practical QRNGs that has characterized devices. Conventionally, QRNGs make use of special light sources, such as lasers, and specific physical model to characterize the randomness source. With more common light sources, such as sunlight, and no assumptions of the randomness origin, the SI-QRNG scheme can still faithfully generate random numbers.

In this work, we explore randomness generation with general light sources, laser and sunlight without assuming the physical structures. By exploiting the SI-QRNG scheme15 and considering practical issues of measurement device imperfections, we experimentally demonstrate the possibility of fast and reliable randomness generation.

Results

Scheme

As shown in Fig. 1, a conventional QRNG is composed of the randomness source and the detection device.22 In the source part, it consists of a light source and a state preparation device. Generally, practical QRNGs6,7,8 make use of specific models to describe the structure of the randomness source. While the SI-QRNG scheme15 supplies the possibility of randomness generation without assuming neither the light source nor the state preparation devices.

Fig. 1
figure 1

Source-independent randomness generation with uncharacterized light source and state preparation

First, we review the concept of SI-QRNG based on the scheme of ref. 15 The SI randomness generation procedure is summarized by source, squashing, random sampling, parameter estimation, and randomness extraction, as follows.

  • The state preparation device is expected to prepare the light in the polarization state of \(\left|+\right\rangle =(\left|H\right\rangle +\left|V\right\rangle )/\sqrt{2}\), where \(Z=\{\left|H\right\rangle ,\left|V\right\rangle \}\) is the computational basis. While no assumption is made on the photon source, it is untrusted and could be controlled by Eve. Thus, the actual prepared quantum state may have an arbitrary and unknown dimension. Randomness can still be quantified and extracted with the following steps.

  • The squashing process maps arbitrary quantum states into qubits and vacuum states. The vacuum components are regarded as loss. In practice, the squashing process can be realized by adding a series of spectrum, spatial and temporal filters to post-select the expected optical modes.

  • In the measurement, we randomly choose the \(X=\{\left|\pm \right\rangle =(\left|H\right\rangle \pm \left|V\right\rangle )/\sqrt{2}\}\) and \(Z\) basis to measure according to a random seed. The random seed needed is exponentially smaller than the number of extracted random bits.15 Suppose that the number of runs in total is \(N\), including \({N}_{X}\) in the \(X\) basis and \({N}_{Z}\) in the \(Z\) basis. Due to the loss, the output of detection could be null. In this case, the number of qubits measured in total is \(n\), including \({n}_{X}\) in the \(X\) basis and \({n}_{Z}\) in the \(Z\) basis. It is worth noting that this protocol is loss tolerance. In the ideal case, the measurement device chooses the measurement basis after confirming the received state is not a vacuum state. In practice, the measurement basis is chosen before the confirmation of loss, which is usually done by observing whether detectors click or not. However, the detection device does not anticipate the position of losses. Thus the effect of loss only decreases \({n}_{X}\) and \({n}_{Z}\), but the positions of the effective X and Z measurements are still uniformly random.

  • When measuring in the \(X\) basis, the result of \(\left|-\right\rangle\) is defined to be an error, and a double click is considered as a half error. Then, we can evaluate the phase error \({e}_{pZ}\) in the \(Z\) basis according to the bit error rate \({e}_{bX}\) in the X basis and its statistical deviation \(\theta\)23 according to

    $${\varepsilon }_{\theta }\,=\,\text{Prob}\,({e}_{pZ}\,{>}\,{e}_{bX}+\theta )\,{\le}\,\frac{1}{\sqrt{{q}_{X}(1-{q}_{X}){e}_{bX}(1-{e}_{bX})n}}{2}^{-n\xi (\theta )},$$
    (1)

    where \(\xi (\theta )=H({e}_{bX}+\theta -{q}_{X}\theta )-{q}_{X}H({e}_{bX})-(1-{q}_{X})H({e}_{bX}+\theta )\). Here \({q}_{X}={n}_{X}/n\) is the ratio of the \(X\) basis measurement. \(H(x)=-x{\mathrm{log}}_{2}(x)-(1-x){\mathrm{log}}_{2}(1-x)\) is the binary Shannon entropy function. Since the dimension of the source is unlimited, it may emit multiphoton states. When using threshold detectors, multiphoton states may cause double clicks, which directly contribute to the error rate \({e}_{bX}\) and decrease the number of extracted random bits.

  • By utilizing the Toeplitz-matrix hashing method,24 the phase error can be corrected by consuming \({n}_{Z}H({e}_{bX}+\theta )\) number of bits with a failing probability of \({2}^{-{t}_{e}}\).25 Thus, we can finally extract

    $${R}_{0}={n}_{Z}-{n}_{Z}H({e}_{bX}+\theta )-{t}_{e}$$
    (2)

    number of random bits and the final failure probability (in trace-distance measure) is given by \(\varepsilon =\sqrt{({\varepsilon }_{\theta }+{2}^{-{t}_{e}})(2-{\varepsilon }_{\theta }-{2}^{-{t}_{e}})}\).

Here, \({R}_{0}\) is the number of extracted random bits without considering the imperfections of the measurement devices. In practice, the measurement bases may not be exactly complementary to each other, and the detection efficiencies of the two detectors might be different. With a slight modification to Eq. (2), the number of extracted random bits with imperfect measurement devices \({R}_{{\rm{final}}}\) can still be quantified when these imperfections are characterized. On the other hand, dark counts of the detectors may also increase the phase error rate and decrease the number of extracted random bits. Since the dark counts are independent with respect to the measurement basis, the effect of dark counts can be regarded as noise of the photon source which has already been considered in the analysis.

In the original theoretical proposal, the \(X\) and \(Z\) basis measurements are assumed perfect. In our work, we also take measurement imperfections into account. Specifically, we consider the case that the actual measurement bases \({X}^{\prime}\) and \({Z}^{\prime}\) are not complementary to each other. In this case, we can make use of the general uncertainty relation for two general bases,26

$$H(Z^{\prime} )\ge -{\mathrm{log}}_{2}{\max }_{{x}^{\prime},{z}^{\prime}}| \langle {x}^{\prime} | {z}^{\prime} \rangle {| }^{2}-H(X^{\prime} ),$$
(3)

where \(\{|{x^\prime} \rangle \}\) and \(\{|{z}^{\prime} \rangle \}\) are, respectively, the eigenstates of \({X}^{{\prime} }\) and \({Z}^{{\prime} }\), and \(H(Z^{\prime} )\) and \(H(X^{\prime} )\) are, respectively, the entropy of the measurement outcome of \({X}^{{\prime} }\) and \({Z}^{{\prime} }\). In quantum random number generation, we can regard \(-{\mathrm{log}}_{2}{\max }_{{x}^{\prime} ,{z}^{\prime}}| \langle {x}^{\prime} | {z}^{\prime} \rangle {| }^{2}\) as the randomness that we can obtained by measuring an eigenstate of the \(X^{\prime}\) basis, and regard \(H(X^{\prime} )\) as the amount of states that are required to distill the eigenstate.5,27 That is, given \(N\) copies of the quantum state \(\rho\), one can effectively first perform a dephasing operation in the \(X^{\prime}\) basis to collapse them into one of its eigenstates. Then we aim to distill the dephased state into a specific eigenstate, which costs \(NH(X^{\prime} )\) copies of states. For each eigenstate, it generates \(-{\mathrm{log}}_{2}{\max }_{{x}^{\prime} ,{z}^{\prime} }| \langle {x}^{\prime} | {z}^{\prime} \rangle {| }^{2}\) randomness. Therefore, the total randomness obtained is \(-N{\mathrm{log}}_{2}{\max }_{{x}^{\prime} ,{z}^{\prime} }| \langle {x}^{\prime} | {z}^{\prime} \rangle {| }^{2}-NH(X^{\prime} )\) and each state generates \(-{\mathrm{log}}_{2}{\max }_{{x}^{\prime} ,{z}^{\prime} }| \langle {x}^{\prime}| {z}^{\prime}\rangle {| }^{2}-H(X^{\prime} )\) randomness on average. In practice, the dephasing and distillation process can be equivalently achieved with the recently proposed coherence distillation protocols, which can be further reduced to a randomness extraction procedure. Therefore, the final randomness output for two general imperfect bases is

$${R}_{1}=-2{n}_{Z}{\mathrm{log}}_{2}{\mathop{\max}\limits_{{x}{^\prime},{z}{^\prime}} }| {\langle {x}{^\prime}}| {z}{^\prime}{\rangle} | -{n}_{Z}H({e}_{bX}+{\theta} )-{t}_{e}.$$
(4)

Note that the randomness output only depends on the term \({\max }_{{x}{^{\prime} },{z}{^{\prime} }}| \langle {x}{^{\prime} }| {z}{^{\prime} }\rangle |\) instead of a full description of the \({X}^{^{\prime} }\) and \({Z}^{^{\prime} }\) bases.

In addition, we also consider the case that the measurement efficiencies in the two eigenstates are different. Suppose the efficiencies of projecting onto \(\left|0\right\rangle\) and \(\left|1\right\rangle\) are given by \({\eta }_{0}\) and \({\eta }_{1}\), respectively. Then according to the standard analysis in QKD,28 the randomness output will be further modified to

$${R}_{{\rm{final}}}=\frac{2\min ({\eta }_{0},{\eta }_{1})}{{\eta }_{0}+{\eta }_{1}}\left[-2{n}_{Z}\,{\log}_{2}\mathop{\max}\limits_{{x}{^{\prime} },{z}{^{\prime}}}| \langle {x}{^{\prime} }| {z}{^{\prime} }\rangle | -{n}_{Z}H({e}_{bX}+\theta )-{t}_{e}\right],$$
(5)

That is, the total randomness is rescaled with a factor \(\frac{2\min ({\eta }_{0},{\eta }_{1})}{{\eta }_{0}\,+\,{\eta }_{1}}\le 1\). The maximum value of 1 can be achieved when \({\eta }_{0}={\eta }_{1}\). Here, we assume that the adversary has no information of the detection efficiency mismatch, otherwise there may exist attacks in analogy to the time-shift attack from QKD.29 With such an assumption, the factor can be understood as a simple strategy that we randomly discard the measurement outcome of the higher efficiency detector such that the effective efficiencies of the two detectors are the same.

In experiment, the term \({\max }_{{x}^{\prime} ,{z}^{\prime} }| \langle {x}^{\prime} | {z}^{\prime} \rangle |\) and the efficiency \({\eta }_{0},{\eta }_{1}\) can be first measured during a calibration procedure on the measurement device. Then, the SI-QRNG scheme can be applied to produce randomness according to the randomness rate formulae in Eq. (5).

Experimental realization

As shown in Fig. 2, the experiment setup can be accordingly divided into two parts, the randomness source part and the detection part. While the detection part should be elaborately designed and carefully calibrated, the randomness source part can be uncharacterized or even untrusted.

Fig. 2
figure 2

The setup of the experiment can be divided into the photon source and detection parts. In the detection part, a Sagnac type interferometer with a phase modulator (PM) is used to select the measurement basis by applying a controllable unitary operator. After proper attenuation, the photons are detected by two up-conversion single photon detectors. In the randomness source part, a photon source is modulated by an amplitude modulator (AM) and transmit through a polarized beam splitter (PBS), a half-wave plate (HWP) and a quarter-wave plate (QWP) to prepare the desired state. The photon source used here is a cw laser and the sunlight, and can be replaced by any other light if necessary. ISO optical isolator, CIR optical circulator, FPBS fiber polarized beam splitter, ATT attenuator, PC polarization controller

In the detection part, filters in several dimensions are employed to rule out unexpected optical modes. Spectral filters, including two 100 GHz DWDMs and several interference filters, are used to guarantee that only photons with expected wavelength can enter, and the isolation on unwanted wavelength is over 60 dB. The coupling of single mode fiber excluded unwanted spatial modes. Finally, photons arrived at wrong time will be inspected and eliminated by a time-digital converter (TDC). The selection of measurement basis is realized by a Sagnac type interferometer and a phase modulator (PM) to obtain high visibility and stability, as shown in Fig. 2. For an input pulse with arbitrary polarization state of \(\alpha \left|H\right\rangle +\beta \left|V\right\rangle\), where \({\alpha }^{2}+{\beta }^{2}=1\), it is split by a fiber polarized beam splitter (FPBS2) when entering the Sagnac interferometer. The length of fiber from PM to one port of FPBS2 is 25.2 m shorter than to the other port. Thus, the time for the clockwise (\(\left|H\right\rangle\)) and anti-clockwise (\(\left|V\right\rangle\)) parts of the split pulses reach the PM are separated by around 126 ns, and finally back to FPBS2 at the same time. By carefully control the PM, the two parts of pulse can be applied by different phase, named \({\varphi }_{c}\) and \({\varphi }_{a}\). After combined again in FPBS2, the state of output pulse is \(\alpha {e}^{i{\varphi }_{c}}\left|H\right\rangle +\beta {e}^{i{\varphi }_{a}}\left|V\right\rangle\), correspond to a unitary operation of \({U}_{F}\). A fiber polarization controller is employed to perform an additional unitary operation of \({U}_{C}\). Here,

$${U}_{F}=\left(\begin{array}{cc}{{\rm{e}}}^{{\rm{i}}{\varphi }_{c}}&0\\ 0&{{\rm{e}}}^{{\rm{i}}{\varphi }_{a}}\end{array}\right),\quad {U}_{C}=\frac{1}{2}\left(\begin{array}{cc}1+{\rm{i}}&1-{\rm{i}}\\ 1-{\rm{i}}&1+{\rm{i}}\end{array}\right).$$
(6)

Finally, after appropriate attenuation, the pulse is separated by FPBS3 and detected by two up-conversion single photon detectors with efficiency of \(10 \%\), dark count of 200 cps and dead time of 5 ns.30 In this way, we can choose to perform \(Z\) (\(X\)) basis measurement by setting \({\varphi }_{c}\) to be 0 (\(-\pi /4\)) and \({\varphi }_{a}\) to be 0 (\(\pi /4\)). The probability of measuring in the \(Z\) (\(X\)) basis is selected as \(99.6 \%\) (\(0.4 \%\)) in our experiment and the average photon number per pulse is selected around \(13.9\) before detection to maximize the generation rate of random number. Generally, higher photon number per pulse brings higher error rate in the \(X\) basis and higher double clicks probability in the \(Z\) basis that need to be discarded, while lower photon number per pulse leads to lower \({n}_{z}\). Thus, there is a tradeoff for choosing a proper average photon number. The details of optimizing the probability of measuring in the \(Z\) (\(X\)) basis and the average photon number per pulse is discussed in Methods.

As aforementioned, the detection may have imperfections. Therefore, the detection part is first calibrated by an auxiliary cw laser diode with expected wavelength of 1550.12 nm. Considering the imperfect measurement basis of \({X}^{{\prime} }\) and \({Z}^{{\prime} }\), an additional process is performed to estimate \({\max }_{{x}^{\prime},{z}^{\prime}}| \langle {x}^{\prime} | {z}^{\prime} \rangle |\) in (5). Firstly, the input state is prepared as the eigenstate of \({Z}^{^{\prime} }\) basis, that is, the ratio of photon counting between detector 1 and detector 2 is above 30 dB under \({Z}^{^{\prime} }\) basis measurement. Then, in the \({X}^{^{\prime} }\) basis measurement, the ratio of photon counting between detector 1 and detector 2 is measured and the bound of \(-2\,\mathrm{log}{\max }_{{x}{^{\prime} },{z}{^{\prime} }}| \langle {x}{^{\prime} }| {z}{^{\prime} }\rangle |\) is calculated to be 0.952.

Although the randomness source part can be untrusted, to demonstrate the high generation rate of the setup, a carefully calibrated randomness source is realized. An amplitude modulator (AM) is used to modulate the input photons to pulses with frequency of 4 MHz and the full width at half maximum (FWHM) of 100 ns. Another fiber polarized beam splitter (FPBS1), a half-wave plate (HWP) and a quarter-wave plate (QWP), is used to prepare the desired polarization state for the detection part.

As the photon source can be any light that does not need to be trustable, the choice of photon source is flexible. Here, we also demonstrate the use of the most common light in the nature—the sunlight, as the photon source to generate random numbers. A collimator mounted on an equatorial mount is placed on the rooftop to collect sunlight into a single mode fiber. The sun can be approximately considered as an area light source with divergence angle around 0.5°.31 Thus, a common collimator with focus length of 11 mm is enough to collect sufficient photon intensities. About 49 nW of light can be collected into single mode fiber under a good weather after filtered by a 1550\(\pm\)1.5 nm bandpass filter.

The optimal input state for the detection part is the eigenstate of \(X\) basis. However, input state with other polarization state does not affect the reliability of randomness. Although the error rate in the \(X\) basis will increase and the random number generation rate will reduce. By rotating the HWP in randomness source part to different angles, the relationship between the input state and the error rate of \(X\) basis measurement is shown in Fig. 3. Under the near optimal condition with the input state of \(\left|+\right\rangle\), we performed the experiment for both laser and sunlight. The error rate in the \(X\) basis measurement is \(0.33 \%\) for laser and \(0.21 \%\) for sunlight. The quantum random number generation rate is 1.81 Mbps for laser and 1.72 Mbps for sunlight. The detailed results is shown as Table 1. The extracted random bits can pass NIST randomness test as shown in Fig. 4.

Fig. 3
figure 3

For our experiment, when the input state is \(\left|+\right\rangle\), the bit error rate close to zero. Rotating the HWP change the input state while the bit error rate. This figure shows the relationship between the angle of HWP and the error rate

Table 1 Experiment result
Fig. 4
figure 4

The P value of NIST tests. Blue represents laser and red represents sunlight

Discussion

In this work, we theoretically modified the SI-QRNG scheme by considering practical issues of measurement devices and experimentally demonstrated the applicability of the scheme in generating reliable and fast random numbers. Compared to the proof-of-principle demonstration in the theoretical work15 whose randomness generation rate is about \(1{0}^{-3}\) Mbps, our implementation improved the generation speed over three orders. Therefore, our SI-QRNG scheme can be applied in many scenarios where both the randomness generation speed and reliability are required. The randomness generation rate here is mainly limited by the detection rate of the single photon detector. Improving the detection rate of the single photon detector can thus further increase the randomness generation rate.

Our result highlighted the power of the state-of-the-art quantum technology. In previous works, it was shown that randomness can be obtained by measuring the intensity of an LED light with a mobile phone32 or by measuring the arrival time of photons from cosmic sources.33,34 However, such QRNG schemes are based on physical models of the LED light or the cosmic source. In our work, we showed that such assumptions are not necessary. Even with a common light in the nature—sunlight, we can still generate randomness both reliably and fast. Since no assumption is made on the photon source, the coherence or photon number statistics of the photon source does not affect the randomness of the extracted bits. In future works, by exploiting the SI-QRNG scheme, it is also interesting to modify (in theory) and realize (in experiment) those QRNG schemes such that the assumption of the source is removed.

Methods

Optimizing the generation rate

To optimize the final quantum random number generation rate, some proper parameters should be chosen or measured in Eq. (5). The first term \(\frac{2\,\min ({\eta }_{0},{\eta }_{1})}{{\eta }_{0}\,+\,{\eta }_{1}}\) depends on the efficiencies of the two detectors, and has a maximal value of 1 when the two efficiencies are equal. Thus, \({\eta }_{0}\) and \({\eta }_{1}\) are configured to be approximatively uniform (\(\eta =10 \%\)) in our experiment. The second term \(-2{\mathrm{log}}_{2}{\max }_{{x}^{{\prime} },{z}^{{\prime} }}| \langle {x}^{{\prime} }| {z}^{{\prime} }\rangle |\) depends on the accuracy of controlling the PC and the PM. Due to the imperfection of the actual measurement basis, this term is calculated to be 0.952. \({t}_{e}\) is chosen as 100.35 Other terms are related to the average photon number \(\lambda\) per pulse before detection. A low average photon number lowers \({n}_{z}\), while a high average photon number brings higher \({e}_{pz}\) and higher double click probability in the Z basis. We can rewrite the final random number generation rate as follows:

$${r}_{{\rm{final}}}=G\cdot {p}_{z({\rm{single-click}})}(0.952-H({e}_{bX}+\theta ))-100.$$
(7)

Here, G stands for the repetition rate of squash speed and equal to 4 MHz in our experiment. \({p}_{z({\rm{single-click}})}\) is the probability that one and only one detector clicks for a pulse. As Poisson distribution for laser and multi-mode sunlight,

$${p}_{z({\rm{single-click}})}=2{{\rm{e}}}^{-\frac{{\lambda }^{\prime}}{2}}(1-{{\rm{e}}}^{-\frac{{\lambda }^{\prime}}{2}}),$$
(8)

where \({\lambda }^{^{\prime} }=\lambda \cdot \eta\). For large \({n}_{X}\), \({e}_{pZ}\) and \({e}_{bX}\) can be regarded as the same. The relationship between \(\lambda\) and the final raw rate is shown in Fig. 5. The optimal \(\lambda\) is about 14.4. For sunlight, the actual \(\lambda\) of 11.6 is slightly deviated from the optimal value, due to the intensity fluctuate of sunlight. However, for a wide range of \(\lambda\) the random number generation rate is not obviously dropped.

Fig. 5
figure 5

Relationship between the average photon number \(\lambda\) per pulse and final random number generation rate. \(r\) have the maximal value for \(\lambda =13.9\). In this experiment, \(\lambda\) is \(14.4\) for laser and is \(11.6\) for sunlight. The circle and the triangle denote the actual key generation rate for the two photon sources, respectively. These two values are slightly deviated from the theoretical curve owing to the imperfections