Introduction

Superconductor electronics (SCEs) feature almost zero static power dissipation, speed-of-light energy-efficient interconnects, and clock rates in the 100s of GHz1. In addition to these characteristics, SCEs can serve as facilitators for integrated classical-quantum computers due to their cryogenic nature2,3. Despite advances in fabrication4, tools5,6, and logic schemes7,8, however, the lack of a reliable high-speed and high-density superconducting memory continues to impede the development of practical SCE systems9. In this paper, we introduce a scalable superconducting delay line memory that takes advantage of the technology’s fast switching, zero resistance, and high-kinetic inductance properties.

Previous research has shown that directly applying single flux quantum (SFQ) principles to memory results in designs with low access latency but insufficient density10. While arrays of vortex transition (VT) cells have demonstrated the ability to store up to 1 Mbit of data per square centimeter11, they face significant limitations for further advancement due to their reliance on superconducting transformers. On the other hand, hybrid architectures that combine SFQ and complementary metal-oxide semiconductor (CMOS) technologies provide better scalability, albeit with long access latencies12. CMOS units may scale more effectively than their superconducting counterparts but are slower and usually reside outside of the 4.2 kelvin cryocooler due to their power and thermal footprints. In the search for a viable superconducting memory solution, considerable effort has also been invested in superconducting memory cells built from novel superconducting-ferromagnetic stack-ups13,14,15,16,17. While promising in many aspects, such designs suffer from complex device structures and thus have their own practical limitations. Lastly, an approach that attempts to find a compromise between the hard-to-scale VT cells and the hard-to-fabricate superconducting-ferromagnetic hybrids is that of superconducting nanowire memory cells18,19. Recent implementation results indicate a bit cell area of 26.5 \(\upmu {\text {m}}^2\) (ref. 20), which is the most compact experimentally verified superconducting storage element to date. However, despite their advantages, superconducting nanowire memory cells are addressed by hTrons, which bring about relatively slow access times, considerable power consumption, and high error rates for multi-cell arrays.

In this paper, we forgo traditional array structures and propose a novel superconducting delay-line memory based on Passive Transmission Lines (PTLs). PTLs are, in essence, superconducting wires that can transmit single flux quanta with zero resistance, making them ideal for signal routing in large-scale SFQ designs21. If used to form loops, however, they can also serve as storage media. Figure 1 illustrates one such loop, which transmits SFQ pulses introduced at its input and delivers them after a specific time to its output. The pulses are then picked up and fed back to the loop input, where they are circulated again. Reading from and writing to the memory are performed in a time-serial manner. This enables the time-sharing of control circuitry, eliminates the need for data splitting and merging, and, in doing so, circumvents the obstacles presented by previous approaches.

Figure 1

High-level view of the proposed memory design. The storage loop is implemented with a passive transmission line (PTL). SFQ pulses introduced at one end of the PTL (\(loop\_data\_in\)) travel along it at a controlled speed for a given time and get picked up at the output end of the line (\(loop\_data\_out\)). The memory control logic decides whether received pulses will be read out and whether they will be forwarded again to the input of the loop to repeat the cycle.

The idea of using delay lines in memory implementations has roots that go back several decades. One of the earliest examples was the acoustic delay line memory, invented by Eckert and Mauchly22, and published by Auerbach et al.23. In the mid-1950s, IBM produced the 650 calculator, which relied on a magnetic delay line memory known as drum memory24,25. Approximately a decade later, Oregon State University designed NEBULA, a medium-speed serial digital computer that utilized a content-addressable memory (CAM) made from 35 glass delay lines26. IBM also experimented with a similar concept in the 1970s while constructing non-volatile bubble memory. Bubble memory stores data in small magnetized “bubble” areas, and read and write operations are performed by repositioning these bubbles27. More recently, racetrack memories have been developed, which promise to deliver high performance at a low cost. In this case, data is stored along a series of magnetic domain walls in non-superconducting magnetic nanowires. Reading and writing are accomplished by passing current through the nanowires, which forces domain walls to advance28,29.

When it comes to superconductor electronics, the concept of realizing memory through data circulation is nearly unexplored. Hattori et al.30 made an early attempt to utilize a YBa2Cu3O\(_{7-\delta }\) (YBCO) coplanar delay line in the late 1990s for constructing high-speed cell buffer storage for asynchronous transfer mode switching systems. Another effort was made in 2016 by Ishida et al.31, who proposed an SFQ cache architecture that relies on circular shift registers built from synchronous destructive read-out (DRO) cells. Although these approaches provide evidence of the feasibility and speed of superconducting delay line memories, their underlying designs are limited in capacity or addressing capability. For example, in the study conducted by Hattori et al.30, a GaAs 2 \(\times\) 2 crossbar switch allowed for an interface to be established with the YBCO coplanar delay line. But this crossbar lacks support for addressing, non-destructive readouts, or speeds exceeding 10 GHz. In addition, the estimated signal travel speed in the YBCO line is approximately 0.4c, where c is the speed of light. This implies that the minimum spacing between two subsequent data signals in the YBCO line would be about 12 mm. Hence, both the controller design and the line material impose significant constraints on the memory density. Regarding the DRO-based cache architecture31, synchronous cells are used to form a circular shift register, which hampers energy and area efficiency. Furthermore, shifting is controlled by a sequence of clock pulses whose length is determined by the provided address and the register’s current position, thereby resulting in additional overhead.

The proposed PTL-based memory design, in contrast, employs primarily passive components; is fully superconducting; encompasses all typical memory functionalities, such as addressing, data overwrite, and non-destructive readout; and achieves interface speeds of up to 100 GHz in simulation with satisfactory bias margins. The high controller (pulse injection) speed, coupled with the SFQ slow-down caused by high-kinetic inductance materials, boosts memory density by reducing the minimum spacing between subsequent pulses in PTLs. For example, the minimum spacing in NbN nanowires is about 570 times shorter than that of YBCO lines30. To validate these hypotheses and quantify the projected gains, we conduct detailed analog simulations and formulate models that establish the relationship between memory density and various factors such as the operating frequency of the interface circuitry, pulse travel speed in the PTL, and line dimensions. For the latter two, we investigate PTL designs with different topologies, material compositions, and fabrication processes.
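To make this intuition concrete, the short sketch below (our own illustrative calculation, using only the travel speeds and frequencies quoted above) computes the minimum physical spacing between consecutive pulses, spacing = v/f, for the YBCO and NbN cases:

```python
# Minimum physical spacing between two consecutive SFQ pulses in a delay line:
# spacing = (pulse travel speed) / (controller operating frequency).
C_LIGHT = 3e8  # speed of light, m/s


def pulse_spacing(velocity_factor: float, f_hz: float) -> float:
    """Distance between adjacent pulses in the line, in meters."""
    return velocity_factor * C_LIGHT / f_hz


ybco = pulse_spacing(0.4, 10e9)     # YBCO coplanar line at 10 GHz   -> ~12 mm
nbn = pulse_spacing(0.007, 100e9)   # 15 nm NbN nanowire at 100 GHz  -> ~21 um
print(f"YBCO: {ybco * 1e3:.0f} mm, NbN: {nbn * 1e6:.0f} um, ratio ~{ybco / nbn:.0f}x")
```

The roughly 570\(\times\) ratio quoted above follows directly from these two spacings.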

Results

Architecture description and functional evaluation

The block diagram of the proposed memory is shown in Fig. 1. The design consists of a PTL-based delay line and a control logic block. The delay line serves as the circulating loop storage and delays any data that arrives at its input (loop_data_in). The delay introduced by the loop depends on the line’s length and the pulse travel speed in the line. At the end of each round trip, the data at the output of the line (loop_data_out) enters the controller, which serves as a memory interface. The controller is responsible for deciding whether signals from the feedback path (loop_data_out) or the input (write_data) will be forwarded to the delay line (loop_data_in) for another round, and whether that data will also be copied to the readout port (read_data).

Figure 2 illustrates the schematic of the controller. Temporally encoded SFQ signals, generated by comparing the value of an address counter with a target address, are used for addressing. The Merger cell, denoted by the letter m, stitches together and forwards all signals that appear on its two input lines to its single output line. When no pulse arrives on the write_address signal line within the designated interval, a pulse appears on its complementary signal line, \(\lnot {write\_address}\). A pulse on the loop_data_out line then flows from the DRO2R (DRO with two outputs) on the left into the delay line input (loop_data_in) on the right, without waiting for other signals. Conversely, when a pulse arrives on the write_address line, a pulse on the loop_data_out line is ignored, the content of the DRO2R is cleared, and write_data is forwarded to the delay line input and readout circuitry. The use of differential signaling for write addressing enables the correction of potential data timing distortions incurred in the control circuitry and the storage loop. The readout circuitry on the right comprises a DRO2R cell. For readout to occur, a pulse is loaded through the read_address line. As with the first DRO2R, there are two cases: either a pulse arrives through the loop_data_in line and pushes the stored value to the Q0 output port (read_data), or a pulse on the complementary \(\lnot {read\_address}\) line clears the cell, flushing the stored value.
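This decision logic can be captured in a compact behavioral model, sketched below purely for illustration (it is our abstraction of Fig. 2, not the WRSPICE netlist; the class and method names are hypothetical). Per address interval, a pulse on write_address takes priority over recirculation and consumes the held write_data pulse, while a read is a non-destructive sample of the pulse re-entering the loop:

```python
class MemoryController:
    """Behavioral sketch of the per-interval decision logic of Fig. 2."""

    def __init__(self) -> None:
        # Pulse captured by the write DRO during the header interval and
        # held until a write_address pulse arrives.
        self.pending_write = False

    def load_write_data(self, pulse: bool) -> None:
        if pulse:
            self.pending_write = True

    def interval(self, loop_data_out: bool, write_address: bool,
                 read_address: bool) -> tuple[bool, bool]:
        """Returns (loop_data_in, read_data) for one address interval."""
        if write_address:
            # Write has priority: the recirculating pulse is filtered out
            # (enabling overwrite) and the held write_data pulse, if any,
            # is injected into the loop and the readout circuitry.
            loop_data_in = self.pending_write
            self.pending_write = False
        else:
            # ~write_address: recirculate whatever emerged from the loop.
            loop_data_in = loop_data_out
        # Readout is non-destructive: it samples loop_data_in without
        # removing the pulse from the loop.
        read_data = read_address and loop_data_in
        return loop_data_in, read_data
```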

Figure 2

Block diagram of a controller enabling sequential-access addressing. A destructive readout (DRO) cell stores a \(write\_data\) pulse until \(write\_address\) arrives. A DRO cell with two readout ports (DRO2R) is used to re-synchronize data on \(loop\_data\_out\) with the arrival of a pulse on the \(\lnot {write\_address}\) line; or, conversely, \(loop\_data\_out\) is filtered out upon the arrival of \(write\_address\). For reading, another DRO2R cell is used. In this case, a pulse on the \(read\_address\) line loads this DRO2R cell, and a subsequent \(loop\_data\_in\) or \(\lnot {read\_address}\) pulse triggers it. If \(loop\_data\_in\) arrives first, the stored pulse is forwarded to the \(read\_data\) line; otherwise, it is filtered out. A Merger cell, denoted by m, joins two lines: the first from the output of the DRO for a write operation, and the second from the DRO2R for memory loop re-circulation. A fan-out of two is required after the Merger; cell \(I_C\) ranking is adopted to maintain signal fidelity around this point59. The final design uses serial resistors at the boundaries of the PTL to eliminate stray DC currents caused by trapped fluxons and to provide damping for possible reflections due to impedance mismatches.

Simulation results for this design are provided in Fig. 3. In both cases, the controller operates at 100 GHz, and three full rotations, or trips, are shown. Each trip consists of four intervals, with three of them corresponding to the number of supported memory addresses and one serving as the header, denoted by h. More specifically, in Fig. 3a, a pulse is provided on the write_data line during the header interval of the first rotation, trip 0. In the third interval of the same trip, a pulse arrives on each of the write_address and read_address lines, denoting a write to and read from address 1. Upon the arrival of the write_address pulse, a pulse appears on the loop_data_in line, which demonstrates a successful memory write operation. The subsequent appearance of a pulse on the read_data line after the arrival of the pulse on the read_address line indicates that write operations have higher priority than read. To illustrate the non-destructive nature of readout, in trip 1 of Fig. 3a, a pulse is asserted on the read_address line again, but this time, it is not paired with a pulse on the write_address line. It is again followed by the appearance of a pulse on the read_data line, which evidences the desired behavior. To demonstrate data overwrite, the same pulse ordering as before is used to set up the memory in Fig. 3b. However, in this case, a pulse is asserted on the write_address line during trip 1 without being accompanied by a second write_data pulse. As anticipated, no pulse appears on the loop_data_in line after this operation, indicating that the first pulse was successfully overwritten.
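Driving the behavioral sketch above with the same stimulus as Fig. 3a (a write_data pulse in the header of trip 0, a write to and read from address 1 in trip 0, and reads from address 1 in trips 1 and 2) reproduces the expected sequence. This is illustrative only; the actual verification is the WRSPICE simulation:

```python
ctrl = MemoryController()
loop = []  # pulses circulating in the delay line, one entry per address interval
for trip in range(3):
    ctrl.load_write_data(pulse=(trip == 0))               # write_data only in trip 0's header
    for addr in range(3):
        w = (trip == 0 and addr == 1)                      # write address 1 during trip 0
        r = (addr == 1)                                    # read address 1 every trip
        loop_out = loop[-3] if len(loop) >= 3 else False   # same address, previous trip
        loop_in, read = ctrl.interval(loop_out, w, r)
        loop.append(loop_in)
        if read:
            print(f"trip {trip}: read_data pulse for address {addr}")
# Expected output: a read_data pulse for address 1 in every trip,
# demonstrating non-destructive readout of the stored bit.
```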

Figure 3

WRSPICE43 simulation results of the presented superconducting delay-line memory using MIT Lincoln Laboratory’s 100 \(\upmu {\text {A}}/\upmu {\text {m}}^2\) SFQ5ee process parameters. The operating frequency is set to 100 GHz, and the number of supported addresses is set to three. A header interval, h, is needed at the beginning of each memory rotation, or trip, to load data for writing. (a) The loop starts empty and a pulse on the write_data signal line is provided in the header interval of trip 0. During the third interval of the same trip, a pulse is observed on both the write_address and read_address lines, indicating a write operation to and read operation from address 1. After the write_address pulse arrival, a pulse appears on the loop_data_in line, demonstrating a successful memory write operation. After the read_address pulse arrives, a pulse is observed on the read_data line, indicating a successful memory read operation. During the following two trips, the pulse stored in address 1 is successfully retrieved again, which provides evidence of the memory’s non-destructive readout capability. (b) The first rotation, trip 0, is identical to that of (a). However, in trip 1, the pulse stored in address 1 is successfully overwritten, as indicated by the absence of pulses in the loop_data_in and loop_data_out lines after the overwrite.

Note that the presented memory system allows one to search and operate on all of the memory contents while waiting for the entire circulation time to pass, thereby eliminating the need to broadcast to or continuously poll individual cells. The design’s rotating nature not only circumvents classic fan-in and fan-out limitations of superconductor electronics but also supports the addition of multiple write and read ports and the inexpensive implementation of content-addressable memories26.

Circuit descriptions and performance evaluation

To evaluate the performance and feasibility of the proposed memory design, we first provide schematics and simulation results for the memory controller’s main components; next, we analyze their latency; lastly, we perform a voltage bias margin analysis for the entire system, including all loading effects due to the control logic, PTL, and accompanying driver and receiver circuitry. The controller, shown in Fig. 2, consists of a DRO cell, two DRO2R cells, and a Merger cell (m). Figures 4, 5, and 6 provide schematics for each cell as well as corresponding simulation waveforms to demonstrate cell function.

Figure 4

Symbol (a), schematic (b), and simulation results (c) of a destructive readout (DRO) cell. An incoming \(data\_in\) SFQ pulse is stored in the superconducting quantum interference device (SQUID) formed by B2-L2-B3 until a clock pulse arrives. The arrival of a clock pulse switches B3 and releases an SFQ pulse on \(data\_out\). The corresponding circuit parameters are: L0 = 0.7 pH; L1 = 1.3 pH; L2 = 5 pH; L3 = 1.6 pH; B0, B3 = 227 \(\upmu\)A; B1, B2 = 188 \(\upmu\)A; and I0 = 125 \(\upmu\)A. Shunt resistors for B0 and B3 are 3.3 \(\Omega\) and for B1 and B2 are 3.4 \(\Omega\).

Figure 5

Symbol (a), schematic (b), and simulation results (c) of a DRO cell with two readout ports (DRO2R). The DRO2R cell performs largely the same operation as the DRO—but in this case, the storage element is shared between two parallel loops: B4-L5-B2-L2-B3 and B4-L5-B5-L8-B8 (ref. 46). A pulse appearing on either clock input will clear the stored SFQ and push it to the respective output line. The corresponding circuit parameters are: L0, L6 = 0.3 pH; L1, L3 = 2 pH; L2, L8 = 0.12 pH; L4 = 0.5 pH; L5 = 5 pH; L7 = 3 pH; L9 = 5.5 pH; L10 = 0.52 pH; B0, B7 = 325 \(\upmu\)A; B1, B6, B8 = 162 \(\upmu\)A; B2, B5 = 197 \(\upmu\)A; B3 = 152 \(\upmu\)A; B4 = 237 \(\upmu\)A; B9 = 223 \(\upmu\)A; and I0, I1, I2, I3 = 176 \(\upmu\)A. Shunt resistors for B0 and B7 are 2 \(\Omega\); for B1, B3, B6, and B8, 4.2 \(\Omega\); for B2 and B5, 3.4 \(\Omega\); for B4, 2.8 \(\Omega\); and for B9, 3 \(\Omega\).

Figure 6

Symbol (a), schematic (b), and simulation results (c) of a Merger cell. As its name implies, this design passes incoming SFQ pulses from either of its two input ports to its output. To prevent backward propagation of an input SFQ from the opposite line, two blocking Josephson junctions, B0 and B3, are used. The corresponding circuit parameters are: L0 = 1 pH; L1 = 0.46 pH; L2 = 0.2 pH; L3 = 1.3 pH; L4 = 0.8 pH; B0, B3 = 197 \(\upmu\)A; B1, B2, B4 = 220 \(\upmu\)A; and I0 = 500 \(\upmu\)A. Shunt resistors for B1, B2, and B4 are all 3 \(\Omega\).

As is the case for any system, the electrical and timing properties of these cells affect both the performance and functionality of the proposed memory. In particular, electrical issues, typically brought on by susceptibility to parametric variation, can lead to fatally under- or over-biased Josephson junctions (JJs), which in turn can cause circuit malfunction or delayed or early switching. To avoid erroneous behavior and ensure correct system timing, the effects of under- and over-biasing are first examined at the cell level. Performing this bias analysis for cells in isolation, however, is not sufficient because it excludes the loading effects that are present in a system setting. To account for loading, iterative measurements and component tuning are carried out in situ to achieve the desired timing. Accordingly, each cell is fully loaded by the remaining components in the memory controller. The results of this process are shown in Fig. 7. Nominal cell delays are indicated in red. DRO and DRO2R delays are measured as clock-to-Q delays, while the Merger delay is measured as the propagation delay from either of the inputs to the output. Delays in each cell increase as bias decreases and decrease as bias increases. To make bias margins symmetric, a set of component parameters is chosen that centers the nominal delay of each cell between the upper and lower time bounds of the controller.

Figure 7

Bias versus propagation delay for DRO (a), DRO2R (b), and Merger (c) cells. Red markings indicate nominal values. To achieve symmetric bias margins, the nominal delay of each cell is centered between its upper and lower time bounds.

Using interval analysis and the above delays, the maximum operating frequency of the controller is estimated, and the cell tuning and bias margin measurements are repeated. A second round of bias margin measurements is then conducted, in which the bias of the entire design is varied rather than that of individual cells. Figure 8 illustrates our results for frequencies ranging from 20 to 100 GHz. We notice that electrical issues—caused by, for example, Josephson junctions that are subject to the above biasing concerns and that switch too frequently or not at all—limit the bias margin width at lower frequencies, while timing issues are the limiter at higher frequencies. This happens because timing constraints tighten as the address timing interval is reduced. For example, at 100 GHz, the address timing interval is just 10 ps, which leaves little room for the variations in propagation delay observed in Fig. 7. Our SPICE simulations show bias margins ranging from ± 24% (at 20 GHz) to ± 13% (at 100 GHz), which surpass the widely accepted ± 10% threshold32.

Figure 8

Bias margins of the complete memory design for a variety of operating frequencies.

Data density estimation

The physical storage density—that is, bits per area—of the proposed memory depends on (1) the PTL’s linewidth and spacing requirements, set by the fabrication process; (2) the travel speed of SFQ pulses in the PTL, set by the material of choice and the line topology; (3) the relative timing between two adjacent SFQ pulses, set by the controller’s operating frequency; and (4) the number of PTL memory routing layers. We estimate the density of the proposed PTL-based superconducting delay line memory by choosing various settings for each of these free variables and summarize our results in Table 1.

Table 1 Memory density estimates for a variety of mature, aggressive, and academic fabrication processes.

A typical Nb stripline of 250 nm linewidth with a minimum spacing of 250 nm4 propagates SFQ pulses at a speed of 0.3c. This leads to data densities of up to 0.9 Mbit/cm2 at 100 GHz, if four metal routing layers are used. Reducing the Nb stripline linewidth and minimum spacing from 250 to 120 nm is a possible but more aggressive design choice33 and results in densities of up to 1.9 Mbit/cm2. By switching device material and topology to an MoN kinetic inductor microstrip with the same dimensions, available on just one layer within MIT Lincoln Laboratory’s SFQ5ee34 and SC233 processes, the travel speed of pulses in the line falls by about 6\(\times\). This slowdown yields densities of up to 1.4 and 4.0 Mbit/cm2 for 250 nm and 120 nm linewidths, respectively, at 100 GHz. In their typical allocation, the SFQ5ee and SC2 fabrication processes provide one MoN high-kinetic inductance layer, four Nb signal-routing layers, and three ground planes. Increasing the number of MoN high-kinetic inductance layers from one to four transforms the line topology into that of a stripline and increases the data density to 3.2 Mbit/cm2 at 20 GHz and 19 Mbit/cm2 at 100 GHz for a 120 nm linewidth and 120 nm spacing.

At this point, it is evident that the use of materials with increasingly high kinetic inductance is conducive to higher densities. To this end, we explore the potential of NbTiN striplines that exhibit approximately an order of magnitude higher inductance than their MoN counterparts and propagate SFQ pulses at a speed of 0.011c35. Our results indicate that for a NbTiN stripline with 100 nm width, 120 nm spacing, four metal routing layers, and controller frequencies between 20 and 100 GHz, the estimated data densities range from 10.7 to 53.3 Mbit/cm2.

A more forward-looking approach comes from the use of NbN kinetic inductor nanowires. In the case of an experimentally tested NbN nanowire with 40 nm linewidth, the inductance scales to 2050 pH/\(\upmu\)m36. A roughly proportional drop in capacitance keeps the pulse travel speed unchanged at 0.011c, but the reduced linewidth pushes the maximum data density to 75.4 Mbit/cm2 at 100 GHz. A reduction in linewidth to 15 nm causes the inductance to increase to 5467 pH/\(\upmu\)m36, based on a sheet inductance of 82 pH/\(\square\), which drops the travel speed to 0.007c and increases the data density to 140.3 Mbit/cm2 at 100 GHz. This is equivalent to a physical pulse spacing of 21 \(\upmu\)m, about 600\(\times\) shorter than that in the YBCO line30. Moreover, assuming that CMOS layer stacking techniques—such as those used to create 100-layer stacks in V-NANDs—could be applied to superconductor technology, this NbN nanowire technology could provide a memory density of 3507 Mbit/cm2 at 100 GHz operating speed.
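These figures can be reproduced with the density relation given in Methods (Eq. 1, scaled by the number of routing layers as in Eq. 3). The sketch below is a consistency check only; the 120 nm spacing assumed for the NbN nanowire rows is our inference from the pitches quoted in Methods rather than an independently reported value:

```python
C_LIGHT = 3e8  # m/s


def density_mbit_per_cm2(f_ghz: float, pitch_nm: float,
                         v_frac_c: float, layers: int) -> float:
    """Eq. (1) scaled by the layer count N (Eq. (3)), converted to Mbit/cm^2."""
    bits_per_m2 = (f_ghz * 1e9) / (pitch_nm * 1e-9 * v_frac_c * C_LIGHT) * layers
    return bits_per_m2 / 1e10  # 1 m^2 = 1e4 cm^2, 1 Mbit = 1e6 bits


print(density_mbit_per_cm2(100, 500, 0.3, 4))      # 250 nm Nb stripline       -> ~0.9
print(density_mbit_per_cm2(100, 160, 0.011, 4))    # 40 nm NbN nanowire        -> ~75
print(density_mbit_per_cm2(100, 135, 0.007, 4))    # 15 nm NbN nanowire        -> ~140
print(density_mbit_per_cm2(100, 135, 0.007, 100))  # 100-layer projection      -> ~3500
```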

Discussion

The microarchitectural features of our most prominent computer systems have historically co-evolved with the technology and devices that embody them. For instance, in the case of CMOS, the combination of cheap transistors and power-hungry interconnects37 has motivated the development of designs that trade off resource redundancy for reduced data movement. By contrast, data movement in superconductor electronics is relatively cheaper than switching due to the nearly lossless nature of superconducting interconnects. Therefore, the critical question—which the presented research addresses—is whether, in the case of superconductor electronics, we can capitalize on the unique properties of superconducting interconnects to improve hardware efficiency, simplicity, and density.

Previous attempts to develop cryogenic memory technology, which functions at temperatures of 4 kelvin or below, have predominantly centered on creating individual storage cells that can subsequently be arranged in a standard grid configuration38. While such configurations have been effective in CMOS, they cannot fully leverage the advantage that superconductor electronics provide in terms of low-cost signal transmission. Additionally, they often disregard the forthcoming obstacles that arise from the limitations of inductor scaling and fan-in/out11, as well as the significant power consumption and long access times12, high device complexity13,14,15,16,17, and increased bit-error rate18,19,20 that come with the suggested structures.

The presented approach departs from this tradition and exploits superconducting passive transmission lines for the construction of a delay-line memory system. From a device perspective, delay lines are easy to construct and offer high signal fidelity. For instance, experimental results indicate that SFQ pulses can travel over 7 mm-long PTLs without re-amplification39. Moreover, the footprint and kinetic inductance of PTLs can be substantially modified by their topology, material composition, and fabrication process. By harnessing these characteristics in conjunction with our high-speed SFQ control logic circuitry, we anticipate achieving data density improvements of two orders of magnitude compared to the current state of the art. This can be accomplished while retaining the same number of metal (memory) layers as existing fabrication processes. The method of vertical growth—common in technologies like CMOS V-NANDs—is also anticipated to increase data density gains by an additional order of magnitude.

Looking forward, although there are no fundamental limitations to the development of scalable superconducting delay-line memories, a direction that invites further exploration is that of impedance-matching transformers, such as tapers40. Alternatively, in order to address any potential impedance-mismatch challenges, exploring methods to boost the transmission line capacitance and utilizing higher-permittivity dielectrics, like NbOx41, between the signal and ground planes could prove to be a promising approach. Lastly, there are several interesting questions pertaining to the microarchitecture–compiler relationship. Specifically, optimizations in hierarchical design, instruction scheduling, and data placement can significantly reduce the memory readout time and improve its efficiency in cases involving both sequential access and content-based addressing42.

Methods

Memory control circuit design and analysis

Simulation setup

The simulations were conducted using WRSPICE43, a version of SPICE tailored to superconductor electronic designs. The resistively and capacitively shunted JJ model employed in the simulations is based on MIT Lincoln Laboratory’s SFQ5ee 100 \(\upmu {\text {A}}/\upmu {\text {m}}^2\) fabrication process34 with an \(I_C R_n\) value of 1.65 mV. Bias sources consisted of resistors tied to a voltage bias line, with resistor values chosen to deliver the equivalent currents listed. Bias was ramped up from zero to the nominal value. To generate SFQ input pulses from a DC current stimulus, DC-to-SFQ converters were utilized. The subsequent analysis does not consider parasitic inductances associated with shunt resistors, as their impact on circuit performance has been determined to be negligible44.

Circuit design

For the design of the memory controller’s key components, as illustrated in Figs. 4, 5, and 6, we began with publicly accessible schematics45,46. To achieve a higher operational frequency and wider bias margins, the number of JJs on the critical path of the corresponding circuit was minimized. To accomplish this, we designed a DRO cell with 4 JJs that matched the critical currents of JJs in the neighboring cells, resulting in improved signal quality. Additionally, the original DRO2R design features output paths that are symmetrical. However, as depicted in Fig. 2, only the Q0 output (data_out0) is connected to the delay line loop. To meet the timing demands of the loop, the internal path from data_in to data_out0 was shortened. The serial inductors in the Merger design were also reduced from their initial values to improve critical path latency. Finally, we eliminated active Splitter cells45, which are commonly used to provide fan-out for shared nodes but incur an overhead of 3 JJs each, by adopting a recently-proposed \(I_C\) ranking technique47. The combined modifications resulted in a memory controller circuit that comprises 29 JJs and a logic path delay of ~ 10 ps, which is determined by the sum of the DRO2R cell’s setup and propagation delays, averaged over its bias margins.

Analysis

To evaluate performance, the timing resolution was set to 0.5 ps, interpolated from the internal step size used by WRSPICE, and delays were measured as peak-to-peak values. Specifically, the DRO and DRO2R propagation delays were measured as the clock-to-Q delay, and the Merger propagation delay was measured as the delay from either input to the output. To determine the upper and lower time bounds of each cell, detailed bias margin analyses were conducted. For bias margin measurements, we started at the nominal voltage and decreased it in steps of 1% of the nominal value until the circuit stopped operating correctly, thereby locating the lower limit. We then increased the bias above the nominal voltage in the same manner to locate the upper limit. Static timing analysis using minimum and maximum intervals was performed to ensure that the timing at the limits of the bias margins meets the hold, setup, and propagation time requirements at different target frequencies.
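The margin search just described follows a simple pattern, sketched below for illustration; circuit_ok is a placeholder for a full WRSPICE run that reports whether the memory operates correctly at the given bias, and is not part of any actual tool API:

```python
def bias_margins(circuit_ok, v_nominal: float, step_frac: float = 0.01):
    """Locate the lower and upper bias limits in 1% steps around nominal.

    `circuit_ok(v_bias)` is assumed to run one simulation at the given bias
    voltage and return True if the memory still operates correctly.
    """
    step = step_frac * v_nominal
    lower = v_nominal
    while circuit_ok(lower - step):   # walk down until the circuit fails
        lower -= step
    upper = v_nominal
    while circuit_ok(upper + step):   # walk up until the circuit fails
        upper += step
    # Margins as a percentage of the nominal bias, e.g. about -13%/+13% at 100 GHz.
    return ((lower - v_nominal) / v_nominal * 100,
            (upper - v_nominal) / v_nominal * 100)
```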

Memory density analysis

The memory density of the PTL delay line was calculated using the equation:

$$\begin{aligned} Density = \frac{f}{w \times v}, \end{aligned}$$
(1)

where f represents the operating frequency of the memory controller in Hz, w is the PTL pitch in m, and v is the travel speed of single flux quanta in m/s.
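As a worked instance, using the Results figures for the SFQ5ee Nb stripline (f = 100 GHz, pitch w = 500 nm, and v = 0.3c) and scaling by four routing layers as in Eq. (3) below:

$$\begin{aligned} Density = \frac{10^{11}\,\mathrm{Hz}}{(500\times 10^{-9}\,\mathrm{m})\times (0.3\times 3\times 10^{8}\,\mathrm{m/s})} \times 4 \approx 8.9\times 10^{9}\,\mathrm{bits/m^2} \approx 0.9\,\mathrm{Mbit/cm^2}. \end{aligned}$$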

Controller operational frequency and PTL pitch

The delay results shown in Fig. 7 indicate a maximum controller operational frequency of 100 GHz. In the memory density analysis summarized in Table 1, the value of f is selected to span from this value down to 20 GHz to highlight the trade-off between density and bias margins, as suggested by the findings presented in Fig. 8. Regarding the PTL pitch—the sum of the PTL linewidth and minimum spacing requirement—this is typically determined by the fabrication process. For example, the well-established SFQ5ee34 process allows for the reliable fabrication of Nb and MoN microstrips and striplines with a 250 nm linewidth and 250 nm spacing, or a pitch of 500 nm. The more advanced SC233 process can reduce the pitch to 240 nm. Further miniaturization down to a pitch of 135 nm and smaller is possible using e-beam lithography36.

Travel speed

The velocity factor equation was used to determine the travel speed (v) of single flux quanta in PTLs constructed with different materials. The equation is given by:

$$\begin{aligned} v = \frac{1}{c\sqrt{L \times C}}. \end{aligned}$$
(2)

Here, c is the speed of light in m/s, L is the inductance per unit length in H/m, and C is the capacitance per unit length in F/m. Written this way, Eq. (2) yields v as a fraction of the speed of light (the velocity factor); multiplying by c gives the speed in m/s used in Eq. (1). The capacitance and inductance values used in the calculations were sourced from the literature and are shown in Table 1. In cases where geometric inductance is significant, such as in Nb striplines, the inductance scales non-linearly and must be evaluated at each linewidth individually. Based on available experimental measurements33, a 250 nm-wide Nb stripline has 0.5 pH/\(\upmu\)m, while a 120 nm-wide line has 0.65 pH/\(\upmu\)m. For high-kinetic-inductance lines, such as MoN and NbTiN striplines and MoN microstrips, the kinetic inductance is directly proportional to the length of the inductor and inversely proportional to its width. It can be calculated by multiplying the known inductance per square values of 8 pH/\(\square\) for MoN striplines and microstrips48 and 49 pH/\(\square\) for NbTiN striplines35 by the unit length divided by the linewidth. Regarding their capacitances, we relied on recently published fabrication results33. For instance, a stripline having a linewidth of 120 nm has a capacitance of 0.19 fF/\(\upmu\)m, which is solely determined by the dielectric material and geometry. Similarly, a microstrip having a linewidth of 120 nm has a capacitance of 0.14 fF/\(\upmu\)m. For the 40 nm NbN nanowires, experimental measurements yielded an inductance of 82 pH/\(\square\), or 2050 pH/\(\upmu\)m for this width, and a capacitance of 0.044 fF/\(\upmu\)m36. Lastly, inductance and capacitance values for 15 nm NbN nanowires were obtained from simulation results36. Effects of coherent quantum phase slips (CQPS) have been observed in NbN films with comparable cross-sectional dimensions49. Nevertheless, these effects do not present an issue here, as the kinetic inductance of the films, even with the most extreme linewidths considered, is still one order of magnitude lower than what has been used to observe CQPS. Consequently, the phase-slip amplitude in our case will be \(e^{-9}\) times lower than in systems in which CQPS has been shown to be a significant contributor to conductivity50, and for this reason, we neglect it in our analysis.
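For reference, the short sketch below evaluates Eq. (2) with the per-unit-length values quoted above (unit conversions are the only additions); the resulting velocity factors are consistent with the speeds used in the density estimates:

```python
import math

C_LIGHT = 3e8  # m/s


def velocity_factor(l_ph_per_um: float, c_ff_per_um: float) -> float:
    """Eq. (2): v/c = 1 / (c * sqrt(L * C)), with L and C per unit length."""
    l_h_per_m = l_ph_per_um * 1e-12 / 1e-6   # pH/um -> H/m
    c_f_per_m = c_ff_per_um * 1e-15 / 1e-6   # fF/um -> F/m
    return 1.0 / (C_LIGHT * math.sqrt(l_h_per_m * c_f_per_m))


print(velocity_factor(0.65, 0.19))    # 120 nm Nb stripline  -> ~0.30
print(velocity_factor(2050, 0.044))   # 40 nm NbN nanowire   -> ~0.011
```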

Layer stacking

An approach to further enhance memory density is by vertically stacking multiple layers of PTLs. In this case, the memory density increases linearly with the number of PTL layers, denoted by N in the equation that follows:

$$\begin{aligned} Density = \frac{f}{w \times v}\times N. \end{aligned}$$
(3)

In this regard, a stackup of Nb striplines with nine planarized superconducting layers, stackable stud vias, self-shunted Nb/AlOx–Al/Nb Josephson junctions, and a single layer of MoN kinetic inductors has been successfully developed and currently serves as the de facto fabrication process4,51. This stackup, MIT Lincoln Laboratory’s state-of-the-art SFQ5ee process, allows for four memory layers based on the typical allocation of ground planes and signal planes. Several advancements to this process are under development, including the addition of self-shunted junctions52, which use higher critical current density and can reduce the chip area of digital hardware by 50%, and the introduction of two additional routing layers53. Looking forward, foundries have recently announced plans to scale up to sixteen routing layers and one layer for NbTiN kinetic inductors54; to the best of our knowledge, however, there is no fundamental limitation that constrains vertical growth45,55,56,57,58.