Introduction

Superconductors exhibit zero resistivity and perfect diamagnetism. These traits make them useful for a range of important technologies, including maglev trains, MRI magnets, power transmission lines, and quantum computers. However, a major limitation is that the superconducting transition temperatures (\(T_c\)) of all known superconductors at ambient pressure are well below room temperature, restricting their broader practical application. Consequently, the search for superconductors with higher \(T_c\) is a very active field, as such materials could considerably improve the efficiency of current technologies while also enabling new ones.

Superconductivity in high \(T_c\) superconductors, however, is still not well understood. As a result, there exists no systematic method for finding new high \(T_c\) superconductors1, and the most common approach remains essentially trial-and-error. For instance, Hosono et al.2 surveyed approximately 1000 compounds over four years and found only about \(3\%\) of them to be superconducting. That study is a testament to the extreme inefficiency of finding new high \(T_c\) superconductors through purely manual search.

More recently, computational techniques have been applied to assist researchers in the search for new high \(T_c\) superconductors. In particular, a number of works have applied machine learning to this search. Although valuable tools in many respects, most of these attempts3,4,5 have been limited to classification and regression models, which can only screen existing databases and cannot generate new compounds. Only recently, with deep generative models applied to superconductor discovery, have new hypothetical superconductors not found in most popular compound datasets been generated6,7,8. In Kim and Dordevic6, a Generative Adversarial Network (GAN)9 was applied to unconditional high \(T_c\) superconductor generation. In Wines et al.7, a Crystal Diffusion Variational Autoencoder (CDVAE)10 was likewise applied to unconditional superconductor generation so that crystal structure could be accounted for; however, that work used a different dataset and focused on the distinct task of generating stoichiometric Bardeen–Cooper–Schrieffer (BCS) conventional superconductors11, and so did not generate any superconductors with \(T_c \gtrsim\) 20 K.

These generative approaches to high \(T_c\) superconductor discovery are not without limitations, however. Most notably, although past models have successfully generated new superconductors within existing superconductor families, they have not been able to generate completely new families of superconductors, which would be particularly desirable. This is because they are unconditional models: they learn only the training dataset distribution, and their generation process cannot be controlled. In other words, past models lack conditioning functionality—a method for controlling the generation process that, in this context, means supplying an example superconductor (the reference compound) and having the model generate similar superconductors, ideally by interpolating between the example and what the model has learned from the training dataset. Conditioning opens the possibility of generating new families of superconductors and gives researchers control over the generation process, which is especially useful for those looking to find specific types of superconductors or to expand on their own new discoveries. Parallel to our work, Zhong et al.8 also applied a diffusion model to high \(T_c\) superconductor discovery; however, like the previous GANs, their model lacks support for conditional generation with reference compounds—which is our main focus. Thus, their diffusion model shares with previous models the major limitation of being unable to generate any new families of superconductors; essentially, their work recreated the performance of the GAN in Kim and Dordevic6 with a diffusion model and added only \(T_c\) label control. Once again, we note that, in this work, we consider “conditioning” to mean conditioning the model on reference compounds only, as only this allows for the controlled generation of known and new families of superconductors. Moreover, the GAN in Kim and Dordevic6 also struggled to generate unique (distinct from others in a given generated set) pnictides because of the small number of pnictides in SuperCon, the training dataset.

To resolve these limitations, in this work, we implement a Denoising Diffusion Probabilistic Model (DDPM)12,13 for superconductor generation as our unconditional model and further implement conditioning with the Iterative Latent Variable Refinement (ILVR)14 extension to DDPM, which allows for one-shot generation without additional training. With conditioning, we hope to be able to generate new families of superconductors for the first time, as identified by the clustering analysis proposed in Roter et al.15, by experimenting with feeding the model different reference superconductors—this would mark a leap in the capabilities of computational searches for superconductors.

Diffusion models are a class of deep generative models inspired by nonequilibrium thermodynamics13 that have recently shown superior performance, outperforming GANs in image synthesis16 and materials discovery17. Diffusion models are also at the heart of popular new image generation software, such as DALL\(\cdot\)E 218 and Stable Diffusion19. More recently, these models have shown considerable promise in a variety of scientific applications, such as drug discovery20.

We name this first approach to conditionally generating new superconductors with reference compounds “SuperDiff”. With SuperDiff, we aim to resolve the issues found in past works as a result of the small pnictide training dataset with the unconditional DDPM and, as our main focus, explore how the conditional DDPM can adapt to new information to generate completely new families of superconductors for the first time.

Methods

As stated in the introduction, we leverage the capabilities of Denoising Diffusion Probabilistic Models and Iterative Latent Variable Refinement to propose a method for conditionally generating new hypothetical superconductors. Here, we detail the creation of SuperDiff: the sourcing and processing of superconductor data, a brief overview of the underlying DDPM and ILVR methods, and the techniques we use to evaluate the quality of SuperDiff outputs.

Data processing

All data for the model was sourced from SuperCon21, the largest database of superconducting materials. The dataset was processed following the steps in Kim and Dordevic6 and, as in previous studies4,6,15, only the chemical composition data was used. Every compound from SuperCon was represented as a column vector for input into the model. As shown in Fig. 1, each compound was encoded as a \(96 \times 1\) column vector, as 96 is the maximum atomic number present in the dataset.

Figure 1

The column vector encoding method used. The figure shows the chemical composition of \(\mathrm {HgBa_2Ca_2Cu_3O_{8.27}}\) being encoded as a vector in \(\mathbb {R}^{96}\) which is fed to the diffusion model as \(\textbf{x}_0\).
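To make the encoding concrete, below is a minimal sketch of how a chemical formula can be mapped to a vector in \(\mathbb {R}^{96}\). The simplified parser and the partial ATOMIC_NUMBER lookup are illustrative assumptions, not the exact preprocessing code used.

```python
# Minimal sketch of the composition encoding (illustrative parser only).
import re
import numpy as np

MAX_Z = 96  # highest atomic number present in SuperCon

# Partial lookup table for illustration: element symbol -> atomic number.
ATOMIC_NUMBER = {"H": 1, "O": 8, "Ca": 20, "Cu": 29, "Ba": 56, "Hg": 80}

def encode_composition(formula: str) -> np.ndarray:
    """Encode e.g. 'Hg1Ba2Ca2Cu3O8.27' as a vector in R^96, where entry
    Z-1 holds the amount of the element with atomic number Z."""
    x = np.zeros(MAX_Z)
    for symbol, amount in re.findall(r"([A-Z][a-z]?)([\d.]*)", formula):
        x[ATOMIC_NUMBER[symbol] - 1] = float(amount) if amount else 1.0
    return x

x0 = encode_composition("Hg1Ba2Ca2Cu3O8.27")  # fed to the model as x_0
```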

Denoising diffusion probabilistic model

Figure 2

Overview of the unconditional DDPM used. Compounds are encoded as vectors in \(\mathbb {R}^{96}\); however, for illustration purposes, the vectors are represented as \(16 \times 6\) pixel images in this figure, where each pixel in the image represents an element of the vector, starting from the top-left corner and proceeding horizontally row by row. Whiter pixels represent more positive values (all values are divided by the maximum element of \(\textbf{x}_0\)), and redder pixels represent more negative values (black is zero). Starting from noise \(\textbf{x}_T\), the model generates a compound \(\textbf{x}_0\) by denoising \(\textbf{x}_t\) iteratively. Note that \(\mathrm {YBa_{2}Cu_{3}O_{6.91}}\) was picked from SuperCon for illustration purposes only, and is not a compound generated by SuperDiff.

Denoising Diffusion Probabilistic Models (DDPMs)12,13 function by learning a Markov chain that progressively transforms an isotropic Gaussian into the data distribution. The general structure of the DDPM used is shown in Fig. 2. The DDPM consists of two parts: a forward “diffusion” process that adds noise to data, and a generative reverse process that learns to reverse it—“denoising” the forward process. The forward process is a fixed Markov chain that gradually adds Gaussian noise to the data. Each step in the forward process is defined as

$$\begin{aligned} q(\textbf{x}_t | \textbf{x}_{t-1}) := \mathscr {N}(\textbf{x}_t; \sqrt{1-\beta _t}\textbf{x}_{t-1}, \beta _t\textbf{I})\, , \end{aligned}$$
(1)

where \(\beta _1,\ldots , \beta _T\) is the variance schedule, \(\textbf{I}\) is the identity matrix, and \(\textbf{x}_0\) is dimensionally equivalent to latent variables \(\textbf{x}_1,\ldots , \textbf{x}_T\) (all vectors in \(\mathbb {R}^{96}\)). In this work, we adopt the cosine variance schedule proposed in Nichol and Dhariwal22.
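For reference, a minimal sketch of the cosine schedule follows; the hyperparameters (offset \(s = 0.008\) and the 0.999 clip) are those of Nichol and Dhariwal22, while the variable names are ours.

```python
import torch

def cosine_beta_schedule(T: int = 1000, s: float = 0.008) -> torch.Tensor:
    """Cosine schedule of Nichol & Dhariwal: beta_t = 1 - abar_t / abar_{t-1}."""
    t = torch.arange(T + 1, dtype=torch.float64)
    f = torch.cos(((t / T) + s) / (1 + s) * torch.pi / 2) ** 2
    alpha_bar = f / f[0]
    betas = 1 - alpha_bar[1:] / alpha_bar[:-1]
    return betas.clamp(max=0.999).float()  # clip to avoid singularities near t = T

betas = cosine_beta_schedule()             # beta_1, ..., beta_T
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)  # \bar{alpha}_t
```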

A notable property of the forward process is that, given clean data \(\textbf{x}_0\), noised data \(\textbf{x}_t\) at any time-step can be sampled in closed form:

$$\begin{aligned} q(\textbf{x}_t | \textbf{x}_0) := \mathscr {N}(\textbf{x}_t; \sqrt{\overline{\alpha }_t}\textbf{x}_{0}, (1-\overline{\alpha }_t)\textbf{I})\, , \end{aligned}$$
(2)

where \(\alpha _t:= 1-\beta _t\) and \(\overline{\alpha }_t = \prod _{s=1}^{t} \alpha _s\). This can be reparametrized23 as:

$$\begin{aligned} \textbf{x}_t = \sqrt{\overline{\alpha }_t}\textbf{x}_0 + \sqrt{1 - \overline{\alpha }_t}\varvec{\epsilon }\, , \end{aligned}$$
(3)

where \(\varvec{\epsilon } \sim \mathscr {N}(0, \textbf{I})\) and is dimensionally equivalent to \(\textbf{x}_0\).
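In code, Eq. (3) amounts to a single vectorized step. A sketch, assuming the `alpha_bars` tensor from the schedule above and a batch of compound vectors `x0` of shape `(batch, 96)`:

```python
def q_sample(x0: torch.Tensor, t: torch.Tensor, alpha_bars: torch.Tensor):
    """Sample x_t = sqrt(abar_t) x_0 + sqrt(1 - abar_t) eps, eps ~ N(0, I)."""
    eps = torch.randn_like(x0)
    abar = alpha_bars[t].unsqueeze(-1)  # shape (batch, 1), broadcasts over R^96
    xt = abar.sqrt() * x0 + (1 - abar).sqrt() * eps
    return xt, eps                      # eps is returned as the training target
```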

The reverse process is then defined to be

$$\begin{aligned} p_{\theta }(\textbf{x}_{t-1} | \textbf{x}_t) := \mathscr {N}(\textbf{x}_{t-1}; \varvec{\mu }_{\theta }(\textbf{x}_t, t), \sigma _{t}^2\textbf{I})\, . \end{aligned}$$
(4)

In this work, we fix \(\sigma _{t}^2 = \beta _t\). Then, as shown in Ho et al.12, by rewriting \(\varvec{\mu }_{\theta }\) as a linear combination of \(\textbf{x}_t\) and \(\varvec{\epsilon }_\theta\), a neural network that predicts \(\varvec{\epsilon }\) from \(\textbf{x}_t\) with input and output dimensions equal to those of the noise it predicts, the reverse process may be rewritten as:

$$\begin{aligned} \textbf{x}_{t-1} = \frac{1}{\sqrt{\alpha _t}} \left( \textbf{x}_t - \frac{1 - \alpha _t}{\sqrt{1 - \overline{\alpha }_t}}\varvec{\epsilon }_\theta (\textbf{x}_t,t)\right) + \sigma _t\textbf{z}\, , \end{aligned}$$
(5)

where \(\textbf{z} \sim \mathscr {N}(0, \textbf{I})\).
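Putting Eq. (5) into a loop gives the generation procedure below; `eps_model` stands for the trained \(\varvec{\epsilon }_\theta\) network, and its name and interface are our assumptions for illustration.

```python
@torch.no_grad()
def sample(eps_model, betas, alphas, alpha_bars, n: int = 16, dim: int = 96):
    """Ancestral sampling: start from x_T ~ N(0, I) and iteratively denoise."""
    x = torch.randn(n, dim)
    for t in reversed(range(len(betas))):
        eps = eps_model(x, torch.full((n,), t))
        coef = (1 - alphas[t]) / (1 - alpha_bars[t]).sqrt()
        mean = (x - coef * eps) / alphas[t].sqrt()
        z = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + betas[t].sqrt() * z  # sigma_t^2 = beta_t, as fixed above
    return x                            # x_0: generated compound vectors
```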

To train the DDPM, noise is added to \(\textbf{x}_0\) using the forward process \(q(\textbf{x}_t | \textbf{x}_{0})\) for a randomly sampled \(t \sim \text {Uniform}(\{1,\ldots , T\})\), which the neural network then learns to remove through the reverse process.
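A sketch of one training step with the simplified objective of Ho et al.12 (predict the injected noise and minimize the mean squared error), reusing `q_sample` and `alpha_bars` from the sketches above:

```python
def train_step(eps_model, optimizer, x0, alpha_bars, T: int = 1000):
    """One optimization step of the simplified DDPM loss from Ho et al."""
    t = torch.randint(0, T, (x0.shape[0],))  # t ~ Uniform({1, ..., T})
    xt, eps = q_sample(x0, t, alpha_bars)    # forward process, Eq. (3)
    loss = torch.nn.functional.mse_loss(eps_model(xt, t), eps)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```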

Four versions of the DDPM were trained on SuperCon: one for cuprates, one for pnictides, one for others, and one for all classes (“everything”). The dataset for each version was randomly split into training and validation sets in an approximately \(95\%\)–\(5\%\) proportion. Training curves for all versions converged and stabilized after around 50 epochs, and each version was trained for between 50 and 100 epochs, depending on the approximate lowest validation loss. For all versions, NAdam24 was chosen as the optimizer and provided satisfactory results. Moreover, as in Ho et al.12, \(T\) was set to 1000 and the U-Net25 neural network architecture was used for \(\varvec{\epsilon }_{\theta }\) (here a 1D U-Net, as opposed to the 2D U-Net used for images).

Conditioning

Iterative Latent Variable Refinement (ILVR)14 was used to condition the DDPM. Because ILVR is training-free, the same four trained unconditional DDPMs could be modified for conditioning with relative ease.

Figure 3

Overview of the Iterative Latent Variable Refinement14 method used. The vector image representation is the same as explained in Fig. 2. \(\mathrm {YBa_{1.4}Sr_{0.6}Cu_{3}O_{6}Se_{0.51}}\)31 is an example of a reference superconductor and \(\mathrm {YBa_{1.4}Sr_{0.6}Cu_{3}O_{6}Se_{0.18}As_{0.32}}\) is an example of a generated output.

ILVR is a slight modification to the reverse diffusion process, and the general structure of ILVR used is shown in Fig. 3. At each step of the reverse “denoising” process, instead of sampling \(\textbf{x}_{t-1}\) directly from \(p_{\theta }(\textbf{x}_{t-1}|\textbf{x}_t)\) like in unconditional DDPM, \(\textbf{x}_{t-1}\) instead becomes

$$\begin{aligned} \textbf{x}_{t-1} = \phi _{N}(\textbf{y}_{t-1}) + \textbf{x}_{t-1}' - \phi _{N}(\textbf{x}_{t-1}')\, , \end{aligned}$$
(6)

where \(\textbf{x}_{t-1}' \sim p_{\theta }(\textbf{x}_{t-1}' | \textbf{x}_t)\) is the original unconditional proposal, \(\textbf{y}_{t-1} \sim q(\textbf{y}_{t-1} | \textbf{y})\) is the reference compound \(\textbf{y}\) noised by the forward process in Eq. (2), and \(\phi _N\) is a linear low-pass filtering operation that maintains the dimensionality of the input.

The goal of ILVR conditioning is to have \(\phi _{N}(\textbf{x}_{0}) = \phi _{N}(\textbf{y})\), thereby allowing the generated output \(\textbf{x}_0\) to share high-level features with reference \(\textbf{y}\). In this case, the generated superconductor should have similar chemical composition as the reference superconductor.

Choi et al.14 state that the scale factor \(N\) controls the amount of information carried from the reference to the generated output: lower \(N\) results in greater similarity between generated output and reference, while higher \(N\) carries over only coarse information from the reference. In our work, we found that \(N > 4\) produced large numbers of invalid compounds with negative amounts of elements. As a result, we used \(N = 2\) up to \(N = 4\), and we found the conclusions about varying \(N\) in Choi et al.14 to remain applicable.
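A sketch of one ILVR refinement step for our 1D setting is shown below. Here we implement \(\phi _N\) as factor-\(N\) downsampling followed by linear upsampling, the 1D analogue of the operation in Choi et al.14 (an assumption for illustration); `q_sample` is the forward-process sampler sketched earlier.

```python
import torch
import torch.nn.functional as F

def phi(x: torch.Tensor, N: int) -> torch.Tensor:
    """Low-pass filter phi_N: down- then upsample a (batch, 96) vector by N."""
    x = x.unsqueeze(1)  # (batch, 1, 96), as required by 1D interpolation
    down = F.interpolate(x, scale_factor=1.0 / N, mode="linear")
    up = F.interpolate(down, size=x.shape[-1], mode="linear")
    return up.squeeze(1)

def ilvr_step(x_prime, y, t, alpha_bars, N: int = 2):
    """Eq. (6): mix the unconditional proposal x' with the noised reference y."""
    y_t, _ = q_sample(y, t, alpha_bars)  # y_{t-1} ~ q(y_{t-1} | y), via Eq. (2)
    return phi(y_t, N) + x_prime - phi(x_prime, N)
```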

Sampling

As mentioned previously, we trained four versions of the unconditional model, each of which was then copied and modified with ILVR conditioning to create four versions of the conditional model. We thus have four versions of the unconditional DDPM (without ILVR), which we call “unconditional SuperDiff”, and four versions of the conditional DDPM (with ILVR), which we call “conditional SuperDiff”. On a single consumer Nvidia RTX 3060 Ti GPU, each version of SuperDiff was trained in under 2 h, and we sampled 500,000 compounds from each of the four unconditional SuperDiff versions, which took less than 10 h per version. These relatively fast training and inference times make SuperDiff trainable and usable with the resources available at most universities and even to consumers. For conditional SuperDiff, we sampled varying numbers of compounds for different reference superconductors, and we discuss those results later.

All sampled compounds were initially screened through various quality checks to ensure that the generated compounds were reasonably realistic. First, after rounding all element amounts to two decimal places, we eliminated all generated compounds with negative amounts of elements. Next, we eliminated compounds with either too few (only 1) or too many elements—for cuprates, we limited outputs to compounds with a maximum of 7 elements, and for pnictides and others, to compounds with a maximum of 5 elements. After these basic checks, we removed duplicates and further evaluated compound validity with the charge neutrality and electronegativity checks from the SMACT package26. Finally, we ran formation energy prediction with ElemNet27,28. We discuss the performance of model generations against these checks later.
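A minimal sketch of the basic screening logic follows; the thresholds are those stated above, while the subsequent SMACT and ElemNet checks are applied afterwards and are not reproduced here.

```python
import numpy as np

MAX_ELEMENTS = {"cuprates": 7, "pnictides": 5, "others": 5}

def basic_screen(x: np.ndarray, family: str) -> bool:
    """Keep a generated vector only if it passes the basic validity checks."""
    x = np.round(x, 2)               # round element amounts to two decimals first
    if (x < 0).any():                # reject negative amounts of elements
        return False
    n_elements = int((x > 0).sum())  # reject too few (only 1) or too many elements
    return 2 <= n_elements <= MAX_ELEMENTS[family]

# Duplicates can then be removed by hashing the rounded vectors:
# unique = {tuple(np.round(x, 2)) for x in generated if basic_screen(x, "cuprates")}
```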

Clustering

To determine whether SuperDiff could generate new superconductor families, clustering analysis was performed. Clustering, an unsupervised machine-learning method for finding hidden patterns within data, was applied to the SuperCon database in Roter et al.15, which established that such methods, when applied to superconductors, can exceed human performance in identifying different “families” of superconductors, represented as clusters. In this work, we use the clustering method for superconductors from Roter et al.15 to evaluate generated outputs for new families. Roter et al.15 also found that, for superconductors, the t-SNE method worked best for visualizing clustering results. t-SNE is a non-linear dimensionality reduction technique that represents higher-dimensional data (96-dimensional superconductor data points in this case) in 2D or 3D29, where the embedding axes carry no physical meaning.
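A sketch of the visualization step using scikit-learn's t-SNE; `supercon_vectors` and `generated_vectors` are placeholder names for the \(\mathbb {R}^{96}\) data arrays, and the clustering model itself (from Roter et al.15) is not reproduced here.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Embed training and generated compounds together so they share one 2D map.
X = np.vstack([supercon_vectors, generated_vectors])  # placeholder arrays in R^96
emb = TSNE(n_components=2, perplexity=30).fit_transform(X)

n = len(supercon_vectors)
plt.scatter(emb[:n, 0], emb[:n, 1], s=5, label="SuperCon")
plt.scatter(emb[n:, 0], emb[n:, 1], s=5, facecolors="none",
            edgecolors="k", label="generated")
plt.legend()
plt.show()
```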

As discussed in the introduction, a major objective of this work was to generate new families of superconductors, as identified by the clustering model—that is, to generate new clusters of superconductors. This was not accomplished by previous works, including the GAN in Kim and Dordevic6 and the diffusion model in Zhong et al.8. To achieve this goal, we experimented with the conditional model’s ability to interpolate between the reference compound and the training dataset. This idea of probing a conditional DDPM’s ability to interpolate between the reference set and the training set was proposed in Giannone et al.30 to achieve few-shot generation on image classes never seen during training; we attempt the analogous task with superconductors. For instance, we condition the cuprate version of conditional SuperDiff on new reference cuprates outside the families of cuprate superconductors in the training dataset. With this technique, we examine the model’s ability to generate new clusters, or families, of superconductors using information from the reference compound, and we report our clustering results below.

Results

In this section, we report the performance of SuperDiff on various checks and discuss some notable new findings. We first evaluate the performance of unconditional SuperDiff on the 500,000 compounds generated for each of the four classes through various computational tests, including general compound checks as well as checks for superconductivity. We use the same computational tests for unconditional SuperDiff as were used for the GAN in Kim and Dordevic6 and are thus able to directly compare unconditional performance. Afterward, as our most notable results, we evaluate both the unconditional and conditional versions of SuperDiff on clustering and manually identify and present some promising new families of superconductors generated by conditional SuperDiff.

Duplicates and validity

For the 500,000 compounds generated by each version of unconditional SuperDiff, we first screened for duplicates between the generated set and the training set (the portion of the SuperCon database of the same class) and for duplicates within the generated set itself. After this, we ran the charge neutrality and electronegativity checks on the generated compounds with the SMACT package26. We present the results of these general tests in Table 1, and then we remove all duplicates from the generated sets.

Table 1 Summary of unconditional SuperDiff performance for the four trained versions, based on the 500,000 compounds sampled from each version.

We notice that the novelty and uniqueness percentages of the generated results are all very high, which means that unconditional SuperDiff is able to generate compounds that are both diverse and novel. Unconditional SuperDiff outperforms the GAN in Kim and Dordevic6 in all metrics of generation novelty and uniqueness, and, as speculated in that work, we attribute the high novelty percentage to the non-stoichiometric nature of the compounds we generate, which opens up a large composition space for the model. Notably, unconditional SuperDiff maintains a very high uniqueness percentage for pnictides despite the small training set, something not accomplished by the Wasserstein GAN in Kim and Dordevic6. This corroborates observations in other disciplines that DDPMs generate more diverse results than GANs16. Lastly, although the SMACT check26 results varied greatly between classes and the proportion of valid compounds for some classes was fairly low, the fast inference time means that SuperDiff can still produce valid compounds for all classes at a reasonable rate.

Overall, these results indicate that all versions of unconditional SuperDiff are able to generate novel, unique, and valid compounds—overcoming the past issues faced by Kim and Dordevic6. As conditional SuperDiff shares most of its components with the unconditional model, it was unsurprising that, in most cases, conditional SuperDiff was also able to generate novel, unique, and valid compounds; however, these qualities depended heavily on the reference compound, so we still run these checks on all compounds generated by conditional SuperDiff and filter out invalid compounds.

Formation energy

We further validated the ability of SuperDiff to generate synthesizable compounds by predicting the formation energies of the generated compounds with ElemNet27,28, a deep neural network model for predicting material properties from elemental composition alone. We chose ElemNet for our formation energy prediction because it uses only chemical composition, as we do not consider crystal structure in our generation process. Because ElemNet does not take in compounds as column vectors in \(\mathbb {R}^{96}\), as SuperDiff does, but instead as column vectors in \(\mathbb {R}^{86}\) with certain elements removed, we ran the ElemNet formation energy prediction only on the generated compounds that ElemNet directly supports—these constituted the great majority of generated compounds. We display the distributions of the predicted formation energies of the generated compounds in Fig. 4.

Figure 4

Distribution of ElemNet27,28 predicted formation energies of the generated compounds from the four versions of unconditional SuperDiff—(a) Everything, (b) Cuprates, (c) Pnictides, and (d) Others—as well as (e) the Cuprates version of conditional SuperDiff conditioned on \(\mathrm {YBa_{1.4}Sr_{0.6}Cu_{3}O_{6}Se_{0.51}}\)31 and (f) the Pnictides version of conditional SuperDiff conditioned on \(\mathrm {BaFe_{1.7}Ni_{0.3}As_{2}}\)32. Also shown is the average formation energy for each distribution.

As shown in the figure, unconditional SuperDiff generated a majority of compounds with negative formation energy for all classes of superconductors, with the mean formation energy for all classes predicted to be negative as well. In Jha et al.27, it was stated that negative formation energy values are a good indicator of a compound’s stability and synthesizability; therefore, although these predictions are not definitive proof—experimental validation would be necessary—they provide an indication that most of the compounds generated by unconditional SuperDiff are plausibly stable and synthesizable.

For conditional SuperDiff, the distribution of formation energies for generated compounds depends heavily on the reference compound. However, given a reasonable reference compound—that is, a valid reference compound belonging to the class of superconductor that the version of SuperDiff was trained on—we demonstrate that conditional SuperDiff is able to generate compounds predicted to be stable by ElemNet. Specifically, as shown in Fig. 4, for the cuprates version of conditional SuperDiff conditioned on \(\mathrm {YBa_{1.4}Sr_{0.6}Cu_{3}O_{6}Se_{0.51}}\)31 and the pnictides version conditioned on \(\mathrm {BaFe_{1.7}Ni_{0.3}As_{2}}\)32—some of the compounds we later condition on to find new families of superconductors—the predicted formation energy distributions show all generated compounds to have negative formation energy. These results indicate that, given reasonable reference compounds, conditional SuperDiff can generate plausibly stable and synthesizable compounds, which is not surprising given the fundamental architectural similarities between conditional and unconditional SuperDiff.

Superconductivity

After those general checks, we performed some computational checks for superconductivity in order to verify that unconditional SuperDiff is indeed able to generate probable superconductors. We ran the compounds generated by unconditional SuperDiff through the K-Nearest Neighbors (KNN) classification model and regression model from Roter and Dordevic4 for predicting superconductivity and critical temperature, respectively, based on elemental composition.

For the predicted proportion of generated compounds that were superconducting, we accounted for the inherent probabilistic error of the classification model by using Bayesian statistics to estimate the true proportion of superconducting generated compounds given the classification model’s predicted proportion \(p_{sc}\) and its true positive rate \(\mathit{tp}\) and false positive rate \(\mathit{fp}\). The true proportion of generated compounds that are superconductors, \(\rho _{sc}\), may be estimated as6

$$\begin{aligned} \rho _{sc} \approx \frac{p_{sc} - \mathit{fp}}{\mathit{tp} - \mathit{fp}}\, , \end{aligned}$$
(7)

where \(\mathit{tp} = 98.69\%\) and \(\mathit{fp} = 16.94\%\) are reported by Roter and Dordevic4.
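As a worked example of Eq. (7), using the reported rates and a hypothetical predicted proportion of \(70\%\):

```python
tp, fp = 0.9869, 0.1694  # rates reported by Roter and Dordevic
p_sc = 0.70              # hypothetical predicted proportion, for illustration only
rho_sc = (p_sc - fp) / (tp - fp)
print(f"{rho_sc:.2f}")   # 0.65: estimated true superconducting proportion
```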

For the generated compounds that were predicted to be superconducting, we used the regression model in Roter and Dordevic4 to predict their critical temperatures. Like all other tests done so far, this computational prediction is only an approximation. We tabulated the results of the classification and regression predictions on the compounds generated by unconditional SuperDiff in Table 1. We will discuss the predicted superconductivity of compounds generated by conditional SuperDiff later.

As seen in the table, all versions of unconditional SuperDiff were able to generate predicted superconductors at a rate comparable to the GAN in Kim and Dordevic6 and much higher than the 3% achieved by manual search in Hosono et al.2—notably, unconditional SuperDiff seems to perform much better on pnictides despite the small training set. This is further indication of the effectiveness of computational search for superconductors when compared to manual searches. Moreover, unconditional SuperDiff seems to capture the critical temperature distribution of the SuperCon training dataset much better than the GAN in Kim and Dordevic6.

Although actual synthesis and testing in a lab are required to confirm superconductivity, these checks, combined with the clustering analysis results that we will discuss later, provide a general indication that unconditional SuperDiff is able to generate highly plausible superconductors.

Clustering results

We ran the clustering analysis described previously on both unconditional and conditional SuperDiff. We display the clustering results for the cuprates version of unconditional SuperDiff in Fig. 5. Superconductors from the SuperCon database are shown with full circles of different colors, whereas our predictions are shown with open black circles. Although unconditional SuperDiff generated compounds in all known clusters, or families, of superconductors, it generated no new families—this was true for the other versions of unconditional SuperDiff as well. This was the expected result, as the underlying DDPM’s goal is simply to map Gaussian noise to the training data distribution, not to some other new distribution. However, superconductor discovery has a particular interest in the generation of new families of superconductors, so a method of controlling the generation process to change the generated data distribution is desirable. With conditional SuperDiff, we are able to control the generation process to computationally generate new families of superconductors for the first time.

Figure 5

Clustering of the (a) valid generated compounds from the Cuprates version of unconditional SuperDiff and (b) valid generated compounds from the Others version of conditional SuperDiff conditioned on various compounds. Colored full circles represent data points from SuperCon (cuprates only for (a) and others only for (b)), with each color representing a different cluster, or family, of superconductors as identified by the model from Roter et al.15; black open circles are compounds generated by SuperDiff. We notice that unconditional SuperDiff did not generate any new families of superconductors, as all generated compounds fall within the existing clusters of superconductors from SuperCon. However, for conditional SuperDiff, although some generated superconductors fall within the existing SuperCon clusters, we were able to identify two new clusters consisting of only generated superconductors (marked with arrows). These two new clusters correspond to two new families of superconductors generated by SuperDiff: \(\mathrm {Li_{1-x}Be_{x}Ga_{2}Rh}\) and \(\mathrm {Na_{1-x}Al_{1-y}Mg_{x+y}Ge_{1-z}Ga_{z}}\).

In Fig. 5, we also display a sample clustering result from the “others” version of conditional SuperDiff conditioned on various compounds. As seen in the plot, we identified two new clusters: \(\mathrm {Li_{1-x}Be_{x}Ga_{2}Rh}\), which was generated by conditioning SuperDiff on \(\mathrm {LiGa_2Rh}\)33, and \(\mathrm {Na_{1-x}Al_{1-y}Mg_{x+y}Ge_{1-z}Ga_{z}}\), which was generated by conditioning SuperDiff on \(\textrm{NaAlGe}\)34. Those and other predicted families will be discussed in more detail below.

These clustering results show that, with this ability to control generation by conditioning SuperDiff on compounds not in the SuperCon training set, SuperDiff is able to use information from various reference compounds to generate completely new families of superconductors. As expected from the nature of the conditioning method, the reference compound does belong to the new cluster generated from it; however, one of the main contributions of this work is that we are able to extrapolate a new family of superconductors from an otherwise single reference compound. We performed this clustering analysis on all versions of conditional SuperDiff conditioned on a variety of reference compounds, and below we discuss the promising new families of superconductors in more detail and verify their superconductivity.

Promising generated new families

After running clustering analysis for the different versions of conditional SuperDiff conditioned on a variety of reference compounds, we manually identified the most promising new families of superconductors generated by conditional SuperDiff. Beyond the novelty, uniqueness, and SMACT checks, we further verified the novelty of these generated families by searching the internet and other databases—these newly generated families could not be found anywhere else. We tabulate these most promising new families in Table 2, identifying for each the reference compound used, a few example outputs with their predicted \(T_c\) from the regression model in Roter and Dordevic4, and the general formula of the new family. We notice that most compounds generated with conditional SuperDiff are predicted to be superconducting, with predicted \(T_c\) values reasonable for each class. A particularly interesting result is that our model generated some new families of superconductors with double or, in one case, even triple doping. This is an interesting new avenue for superconductor discovery that has not been extensively studied, and our model suggests material scientists should explore it in more detail.

Table 2 Promising new families of superconductors generated by conditional SuperDiff.

We further demonstrate that conditional SuperDiff is able to generate realistic new families of superconductors by plotting the predicted \(T_c\) using the regression model in Roter and Dordevic4 against the Cesium doping content for the newly generated \(\mathrm {Ba_{2-x}Cs_{x}CuO_{3.3}}\) family in Fig. 6. We notice that the generated \(\mathrm {Ba_{2-x}Cs_{x}CuO_{3.3}}\) family is predicted to exhibit the expected parabolic \(T_c\) doping dependence relationship for this type of cuprate superconductor, which was observed previously in other cuprate families35.

Figure 6

Plot of \(T_c\) predicted by the regression model in Roter and Dordevic4 versus Cesium content (x) for \(\mathrm {Ba_{2-x}Cs_{x}CuO_{3.3}}\) family generated by conditional SuperDiff (see Table 2). We notice a characteristic parabolic dependence of \(T_c\) versus doping, observed previously in other cuprate families35.

These findings again show that SuperDiff is not only able to generate new superconductors within known families but is also able to overcome the limitations of previous generative models to generate completely new families of superconductors that are also realistic—although we note that, for some reference compounds, SuperDiff was unable to generate new families of superconductors.

Discussion

With the lack of a systematic approach, the discovery of new high \(T_c\) superconductors has long depended on material scientists’ serendipity. Recently, machine learning has been applied to this field to help assist scientists, but past works still lacked many key capabilities, for instance, the ability to computationally find new families of superconductors. Moreover, recent generative model approaches applied to this field also lacked methods of controlling the generation process by incorporating information from reference compounds6,7,8.

In this paper, we have introduced a novel method for superconductor discovery using diffusion models with conditioning functionality that addresses these major issues. Like previous works applying generative models to superconductor discovery, we were able to generate novel, realistic, and highly plausible superconductors that lie outside existing databases—leveraging this “inverse design” approach to significantly outperform manual search and previous classification model approaches. With our unconditional model, we also addressed the low generated-compound uniqueness issues that plagued previous works due to the small training dataset for pnictides. Most importantly, beyond the unconditional performance improvements the diffusion model brought, our contribution of implementing ILVR conditioning for superconductor discovery, which allows the generation process to be controlled, enabled the creation of a tool for computationally generating completely new families of superconductors. We verified the generation of new families of superconductors with our clustering analysis, and we presented several of these promising new families for several different classes of superconductors in Table 2. Once again, we point out that no previous computational model for superconductor discovery would have been capable of generating these new families of superconductors, as such models attempt to produce only samples that match the training data.

The application of deep generative models for superconductor discovery continues to be a very promising and exciting approach. Future studies can benefit from possible improvements that can be made to SuperDiff, including implementing a physics-informed diffusion model and creating and utilizing a better, more comprehensive training dataset of superconductors. Nevertheless, SuperDiff in its current form is still very powerful as a tool for superconductor discovery, and researchers can currently benefit from it in a myriad of ways, such as by using its novel generations as inspiration—starting with the new families introduced here, using it to expand on their own new discoveries, or by simply experimenting with many more reference compounds (such as high-pressure superconductors) to continue using it to generate completely new families of hypothetical superconductors or hypothetical superconductors with even higher \(T_c\).