Introduction

Reticular chemistry, the science and art of combining molecular building blocks to form extended periodic structures1, has endowed chemists and materials scientists with a vast chemical space, one that serves as a giant toolbox for solving a wide variety of problems. Metal-organic frameworks (MOFs), a class of nanoporous materials composed of metal ions/clusters and organic linkers2, exemplify this idea. Owing to their exceptionally high porosity and surface area3, along with their tunable nature, MOFs have burgeoned as prominent materials for gas-adsorption-related applications4,5. One such example is carbon capture and storage6, where MOF-based sorbents are considered green and efficient solutions.

The inherent combinatorial character of MOFs has given birth to large databases of materials synthesized either in vitro7,8 or in silico9,10,11,12. Although a plethora of choices is desirable, an immense materials space inevitably complicates the efficient identification of the best candidates. The large size of current and prospective MOF databases13 precludes approaches such as experimental synthesis and performance characterization, since a single laboratory study can range from weeks to months. Performance assessment based on molecular simulations14 significantly ameliorates the time penalty that accompanies the evaluation of a single structure. Nevertheless, the overwhelming number of MOFs that require filtering renders brute-force computational screening suboptimal.

The ever-increasing amount of data requires methods able to handle it efficiently and effectively. Machine learning (ML) techniques satisfy these requirements and can accelerate the identification of promising materials by means of predictive models15,16,17,18,19,20,21,22,23,24. Given a mathematical description of a structure (input) and a property of interest (output), a supervised ML algorithm seeks to build a model of the underlying structure-property relationship. In ML parlance, inputs and outputs are known as descriptors (or features) and labels, respectively. “Garbage in, garbage out” applies here, which entails that high-performing ML models are possible only if information-rich descriptors are employed.

With regard to gas adsorption in MOFs, a commonly used set of descriptors are the so-called geometric ones25,26,27, including properties such as void fraction and gravimetric surface area. Although these descriptors lead to fruitful results when used to predict gas uptake at high pressures, they fall short as we transition to the low-pressure regime, especially when modeling gases with non-negligible electrostatic interactions, e.g. \({\hbox {CO}_2}\) and \({\hbox {H}_2}\). The origin of these shortcomings is rooted in the inability of these descriptors to capture the fundamental factor that governs adsorption: host-guest interactions. Attempts have been made to address the limitations of geometric descriptors, i.e. to capture the energetics of adsorption, giving rise to the so-called energy-based descriptors28,29,30.

Figure 1

Generalized framework for predicting adsorption properties using the entire PES as descriptor. Starting from the PES as raw input, a CNN extracts its features and utilizes them to predict a property of interest (hereon gas uptake). The iRASPA31 software was used for the visualization of the IRMOF-1 PES and structure.

Figure 2

Workflow to construct the voxelized PES. The latter can then be processed by a CNN. The iRASPA31 software was used for the visualization of the IRMOF-1 structure.

One such work was performed by Bucior et al.32, where bins from the unit cell’s energy histogram were used as descriptors, leading to ML models of remarkable accuracy with regard to \({\hbox {H}_2}\) and \({\hbox {CH}_4}\) uptake. In another study33, a set of hypothetical probe atoms was used to fingerprint the energetic landscape of the unit cell by averaging the interactions between the probe and the framework atoms. Augmenting the set of geometric descriptors with these average interactions significantly increased the performance of ML models for \({\hbox {CO}_2}\), \({\hbox {H}_2\hbox {S}}\), \({\hbox {H}_2}\) and \({\hbox {CH}_4}\) uptake. Since gas adsorption essentially boils down to the potential energy surface (PES), a natural question arises: why not use the PES itself as the descriptor?
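To make the flavor of such energy-based fingerprints concrete, below is a minimal sketch of a histogram descriptor in the spirit of Bucior et al.32, assuming NumPy; the bin edges and the energy samples are illustrative placeholders, not the settings of the original study.

```python
import numpy as np

def energy_histogram(energies_kj_mol, edges=np.linspace(-40, 0, 21)):
    """Fraction of grid points falling in each energy bin (20 bins here)."""
    counts, _ = np.histogram(energies_kj_mol, bins=edges)
    return counts / energies_kj_mol.size

# Placeholder: interaction energies evaluated on a grid over the unit cell.
energies = np.random.default_rng(0).uniform(-40, 0, size=25**3)
descriptor = energy_histogram(energies)  # fixed-length feature vector
```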

In this work, we propose a generalized framework for predicting gaseous adsorption properties, free of hand-crafted features, using solely the PES as the descriptor (Fig. 1). Two steps are central to this approach:

  1. (i)

    a machine-understandable format of the PES

  2. (ii)

    an algorithm capable of handling this format effectively

For the first step, a voxelized representation of the PES is adopted. In essence, the input can be thought of as a 3D image of the material, where each 3D pixel, i.e. voxel, is colorized by the value of the potential energy. For the second step, we resort to a deep learning34,35 solution, namely a 3D convolutional neural network (CNN). Similar to reticular chemistry, in deep learning (a subfield of ML) simple computational units (neurons) are combined to form (neural) networks, which are able to extract useful information from the raw input, that is, to extract features in a completely data-driven way.

It should be added that our pipeline does not impose extra computational burden compared to the aforementioned studies32,33, since in both of them the features are also extracted from the PES or an approximation thereof. The essential difference lies in the feature extraction step: here the algorithm decides what matters for the task at hand by looking at the data, removing the need for manual feature extraction.

Figure 3

(Top) Convolution operation. (Bottom) Max pooling operation.

With respect to the application of CNNs to porous materials, Cho et al.36 trained a CNN by representing the structure as a 3D binary matrix, with each matrix value indicating an “available” or “non-available” adsorption site in the structure. Notably, the CNN was capable of accurately predicting \({\hbox {CH}_4}\) adsorption isotherms of zeolites. Following up on this work, in order to account for the chemical diversity of MOFs, Hung and co-workers37 trained a CNN on two 3D matrices that encode element and point-charge information. The CNN achieved remarkable accuracy in predicting Henry adsorption constants for \({\hbox {CH}_4}\) and \({\hbox {CO}_2}\), improving upon the binary-matrix approach. Nevertheless, the fact that an indirect representation of the PES is used as a descriptor entails that the information content the CNN receives is not maximal. Maximum information content is possible only if the PES itself is used as the descriptor. Here, the CNN is trained on a single 3D matrix instead of two but, more importantly, it looks directly at the PES. In other words, we feed the CNN with the object that completely characterizes the sorption behavior of a material: the PES.

Since the PES uniquely combines the structural properties and the electronic structure of a material in real space, the proposed scheme is applicable to any host-guest system for predicting any adsorption property of interest. As a proof of concept, the suggested approach is applied to MOFs for predicting \({\hbox {CO}_2}\) uptake. The transferability of the approach is also demonstrated by examining \({\hbox {CH}_4}\) uptake in COFs. In both cases, the proposed pipeline is compared with conventional schemes where geometric descriptors are employed.

Methods

Voxelized PES

The steps for the calculation of the voxelized PES are schematically summarized in Fig. 2. As a first step, a 3D grid of size \(n \times n \times n\) is overlaid on the unit cell of the material. Hereon, \(n = 25\) as a balance between resolution and computational cost, since voxelization scales as \({\mathcal {O}}(n^3)\). Next, each voxel centered at grid point \({\textbf{r}}_i\) is colorized with the interaction energy of a probe molecule at \({\textbf{r}}_i\) with the framework atoms.

Figure 4

RetNet architecture and forward pass of IRMOF-1. For the sake of visualization, slices (feature maps are 3D matrices) of 8 feature maps from the first 5 layers are depicted. For the Conv1 layer the 5th slice is depicted, while for the other layers the 1st slice is depicted. All slices are collected along the 1st dimension of the corresponding 3D matrix. The iRASPA31 software was used for the visualization of the IRMOF-1 structure.

In the proposed framework, the grid size and the type of the potential are treated as “hyperparameters” controlling the trade-off between information content and computational cost. The ultimate representation of the PES is achieved as \(n \rightarrow \infty\) and when the voxels are filled with energy values obtained from ab initio calculations. Because this study serves as a proof of concept, to facilitate the modeling of interactions, a spherical probe molecule is used and host-guest interactions are approximated with the Lennard-Jones (LJ) potential (for more details see the Supplementary Information). To ease the calculation of energy voxels, the Python package MOX\(\epsilon \lambda\) is introduced and used in all cases; a minimal sketch of the procedure is shown below. For the remainder of this study, we use the terms “voxelized PES” and “energy voxels” interchangeably.
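The sketch below illustrates the voxelization step under simplifying assumptions: an orthorhombic cell, a single-site spherical probe, and LJ parameters combined with Lorentz-Berthelot mixing rules. The probe parameters and cutoff are placeholders, and this is not the MOX\(\epsilon \lambda\) code itself.

```python
import numpy as np

def lj_energy_grid(frac_coords, sigmas, epsilons, cell_lengths,
                   n=25, sigma_probe=3.0, eps_probe=1.0, cutoff=12.0):
    """Voxelized LJ interaction energy of a spherical probe with the framework."""
    a = np.asarray(cell_lengths)                     # cell edge lengths (Å)
    grid = (np.stack(np.meshgrid(*[np.arange(n)] * 3, indexing="ij"), -1)
            + 0.5) / n                               # voxel centers, fractional
    voxels = np.zeros((n, n, n))
    sig = 0.5 * (np.asarray(sigmas) + sigma_probe)   # Lorentz mixing
    eps = np.sqrt(np.asarray(epsilons) * eps_probe)  # Berthelot mixing
    for s, e, atom in zip(sig, eps, frac_coords):
        d = grid - atom
        d -= np.round(d)                             # minimum-image convention
        r = np.linalg.norm(d * a, axis=-1).clip(min=1e-6)
        u = 4 * e * ((s / r) ** 12 - (s / r) ** 6)
        voxels += np.where(r < cutoff, u, 0.0)       # truncate beyond cutoff
    return voxels                                    # shape (n, n, n)
```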

Convolutional neural networks

CNNs are neural network architectures specialized for processing image-like data38,39,40,41 and are based on the convolution operation (Fig. 3). Convolving a filter with an image can be seen as template matching: when a local image patch matches the filter (the template to be matched), the output is highly positive. Sliding the filter over the image and recording the output values produces a feature map. This notion generalizes to 3D. A convolutional layer contains many learnable filters, each looking for a different pattern and producing its own feature map based on the feature maps from the previous layer or on the raw image (in the case of the first convolutional layer). By composing many such layers, a CNN extracts features hierarchically, with the level of (feature) abstraction increasing the deeper we go into the network.
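As a concrete illustration, the snippet below applies a 3D convolutional layer to a single-channel voxel grid (assuming PyTorch). The 12 filters match the number of Conv1 feature maps mentioned later; the kernel size is illustrative.

```python
import torch
import torch.nn as nn

pes = torch.randn(1, 1, 25, 25, 25)   # (batch, channels, x, y, z) voxelized PES
conv = nn.Conv3d(in_channels=1, out_channels=12, kernel_size=3, padding=1)
feature_maps = torch.relu(conv(pes))  # one 3D feature map per filter: (1, 12, 25, 25, 25)
```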

Apart from convolutional layers, another common building block of CNNs are the pooling layers. The role of these layers is to downsample (reduce the resolution of) the feature maps produced by convolutional layers in a parameter-free way. By downsampling in this manner, they reduce the memory and computational footprint of the CNN as well as the number of parameters, thereby reducing the risk of overfitting42. A pooling layer takes as input the feature maps of the preceding convolutional layer and subsamples them by substituting the outputs in a small neighborhood of the feature map with a summary statistic35. Figure 3 illustrates the pooling layer used in this work, known as max pooling, which uses the \(\max\) function to compute the summary statistic (the same idea applies in 3D). From the same figure it can also be seen that small translations of the input (input B is just a version of input A shifted by 1 pixel to the right) produce the same output when passed through the max pooling layer, meaning that the latter introduces into the network some level of invariance to small translations35,42; the snippet below demonstrates this. The architecture of the 3D CNN used in this work, called RetNet, is schematically depicted in Fig. 4. More information regarding architecture and training details can be found in the Supplementary Information.
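A small demonstration of this translation tolerance, assuming PyTorch and shown in 2D for brevity (the 3D case is identical with nn.MaxPool3d); the input values are arbitrary.

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2)
a = torch.tensor([[[[9., 0., 0., 0.],
                    [0., 0., 0., 0.],
                    [0., 0., 7., 0.],
                    [0., 0., 0., 0.]]]])
b = torch.roll(a, shifts=1, dims=3)  # input A shifted 1 pixel to the right
print(pool(a))                       # both inputs yield the same
print(pool(b))                       # 2x2 pooled output: [[9, 0], [0, 7]]
```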

Results

Visualizing RetNet

A closer look at the “internals” of RetNet (the processing that energy voxels undergo as they pass through the network), trained on the MOFs dataset9, is provided in Fig. 4. For the sake of clarity, only some feature maps from the first five layers are visualized. It should be noted that each feature map of a given layer takes into account all the feature maps from the previous layer, the only exception being the pooling layers, which just downsample. For instance, each feature map of the Conv2 layer combines all 12 feature maps from the Conv1 layer, whereas each feature map of the MaxPool1 layer is a downsampled version of the corresponding feature map in the Conv2 layer. Although feature maps are not meant to be human-interpretable (especially the ones found deeper in the network), it is worth noticing that the first two Conv layers highlight the texture of the structure. For example, it can be seen that the 3rd feature map of the Conv1 layer outlines the skeleton of the framework.

The MaxPool2 layer is followed by two consecutive Conv layers, and the Flatten layer flattens out all feature maps of Conv5 into a single vector (of size 3240), which is then processed by a fully connected neural network. Since Output is a linear layer (see Table S1 in the Supplementary Information), RetNet essentially does just the following:

$$\begin{aligned} \underbrace{\overbrace{{\textbf{x}}}^\text {PES}}_\text {input} \quad \longrightarrow \quad \underbrace{ \overbrace{\mathbf {\phi }({\textbf{x}};\mathbf {\theta })}^\text {fingerprint} }_\text {feature extraction} \quad \longrightarrow \quad \underbrace{ \overbrace{{\textbf{w}}^\top \mathbf {\phi }({\textbf{x}};\mathbf {\theta }) + w_0}^\text {gas uptake} }_\text {output} \end{aligned}$$
(1)

In other words, it extracts a fingerprint from the PES and then uses a linear model on top of this fingerprint to predict the gas uptake. All the layers between the Input and Output layers (i.e. from Conv1 to Dense2) are responsible for this feature extraction step, with the size of the fingerprint being determined by the size of the Dense2 layer (a 20-dimensional vector; see also Figs. S6–S7 in the Supplementary Information). This learnable fingerprint extraction step (the parameters \(\mathbf {\theta }\) of \(\mathbf {\phi }\) are learned during training) is what fundamentally differentiates our method from approaches that use hand-crafted fingerprints32,33. Feature extraction from the PES has been “unlocked” and is now part of the training phase.
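A schematic PyTorch decomposition of Eq. (1) may help make this split concrete. Apart from the 20-dimensional fingerprint, the layer sizes here are illustrative and do not reproduce the exact RetNet configuration of Table S1.

```python
import torch
import torch.nn as nn

phi = nn.Sequential(                           # feature extraction: phi(x; theta)
    nn.Conv3d(1, 12, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool3d(2),                           # 25^3 -> 12^3
    nn.Flatten(),
    nn.Linear(12 * 12**3, 20), nn.ReLU(),      # fingerprint: 20-dim vector
)
output = nn.Linear(20, 1)                      # w^T phi(x) + w_0 -> gas uptake

x = torch.randn(1, 1, 25, 25, 25)              # voxelized PES
uptake = output(phi(x))
```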

Figure 5

Performance on the test set measured by \(R^2\) (the higher the better) as a function of the training set size for the models with geometric descriptors (RF) and energy voxels (CNN). Shaded areas correspond to the 95% CI. The coordinates of the white star denote the training set size (x-coordinate) at which the CNN achieves the performance of the RF model (y-coordinate) trained with the maximum training set size. “Geometric” stands for geometric descriptors, while “Voxels” stands for energy voxels.

Learning curves

The learning curves for the ML models built with energy voxels and geometric descriptors are depicted in Fig. 5. As can be seen, on the MOFs dataset the CNN model (\(R^2=0.859\)) outperforms the Random Forest (RF) model (\(R^2=0.690\)), even with such a crude approximation of the PES (the LJ potential does not take electrostatic interactions into account). Notably, the CNN model requires two orders of magnitude fewer training samples, approximately 300, to reach the performance of the RF model. As stated previously, in this work we strived for minimal computational cost, which means that the information content of the voxelized PES is still not maximized. As such, higher performance can be achieved by employing more refined potentials, with the upper limit set by an ab initio constructed PES. Similar behavior is observed on the COFs dataset43. Again, the CNN model generalizes better, achieving an \(R^2\) of 0.969 compared to 0.941 for the RF model. In this case, the CNN needs around one order of magnitude fewer training samples to match the RF’s performance. The observation that in both cases the learning curve of the CNN model lies above the corresponding one of the RF model should be attributed (mainly) to the following two factors. First, the increased information content of the voxelized PES (the input of the CNN models) over the geometric descriptors (the input of the RF models). Second, the ability of CNNs to process image-like data (the voxelized PES is just a single-channel 3D image). It should be mentioned that another factor that gave a performance boost to our CNN models was the application of data augmentation during their training phase (see Figs. S2–S3 in the Supplementary Information).
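The exact augmentation scheme is detailed in the Supplementary Information; the following is merely one plausible sketch (an assumption, not necessarily the scheme used). Because the PES is periodic and the unit-cell origin is arbitrary, cyclically shifting the grid yields equally valid views of the same material, while axis permutations and flips relabel or mirror the cell without changing the uptake label.

```python
import torch

def augment(voxels: torch.Tensor) -> torch.Tensor:
    """voxels: (x, y, z) energy grid; returns a randomly transformed copy."""
    shifts = tuple(int(torch.randint(0, s, (1,))) for s in voxels.shape)
    out = torch.roll(voxels, shifts=shifts, dims=(0, 1, 2))  # origin shift
    out = out.permute(*torch.randperm(3).tolist())           # axis relabeling
    for dim in range(3):
        if torch.rand(()) < 0.5:
            out = out.flip(dim)                              # mirror image
    return out
```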

Discussion

We would like to point out that in the case of \({\hbox {CH}_4}\), which lacks a dipole and quadrupole moment, the LJ potential approximates the true potential very well, which is reflected in the increased performance of the CNN compared to the \({\hbox {CO}_2}\) case. This observation, along with the fact that the same resolution was used in both cases, motivates focusing first on refining the potential rather than increasing the resolution in order to maximize the information content of the voxelized PES and, as such, the performance of the ML models33. For adsorbates like \({\hbox {CO}_2}\) and \({\hbox {H}_2}\), where electrostatic interactions with the framework atoms are non-negligible, an accurate representation of the voxelized PES necessitates the inclusion of these interactions. However, there is no free lunch, and such refinements come at the price of assigning partial charges to each framework atom, which is a computationally intensive task. Fortunately, there are approaches44,45,46 that can assign partial charges extremely fast and with high fidelity via ML models, opening the door to the efficient construction of an accurate voxelized PES.

Additionally, the fact that the proposed framework is rooted in interactions, which are ubiquitous in nature, renders it extremely modular and applicable to fields beyond reticular chemistry. For example, if one is interested in predicting properties of organic molecules (e.g. solubility), a straightforward application of our framework is to voxelize the electrostatic potential map of the organic molecule and then use it as input to train a 3D CNN to predict the property(ies) of interest.

Moreover, the fact that under the hood the proposed framework uses a member of the deep learning family enables the incorporation of transfer learning techniques47,48, which can greatly decrease the amount of reference data required for CNN training. In transfer learning, the model leverages the knowledge it has gained by solving an original task to solve new tasks similar to the original. For example, instead of retraining the CNN from scratch for every adsorption property of interest, one can train the CNN on a specific property (the original task, e.g. gas uptake) and then fine-tune this pre-trained model for the other properties (new tasks, e.g. gas selectivity). A good pre-trained model will require less training data to perform well on the new task, because it can exploit the shareable knowledge it has acquired by solving the original task. A minimal sketch of such fine-tuning is shown below.
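The sketch assumes PyTorch and the phi/output split of Eq. (1) (with phi denoting a pre-trained feature extractor, as in the earlier sketch); the layer names are hypothetical. The feature extractor is frozen and only a fresh head is trained on the new task’s (smaller) dataset.

```python
import torch.nn as nn
import torch.optim as optim

# phi: pre-trained feature extractor from the original task (e.g. gas uptake)
for p in phi.parameters():
    p.requires_grad = False             # freeze the shared PES knowledge

new_head = nn.Linear(20, 1)             # fresh head, e.g. for gas selectivity
optimizer = optim.Adam(new_head.parameters(), lr=1e-3)
# ...train new_head on the new task's data as usual...
```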

Since the performance of an ML model depends strongly on the informativeness of the descriptor(s) and the algorithm of choice, refinements of the potential used to model host-guest interactions and architecture modifications, along with the inclusion of transfer learning techniques, can further improve the performance and data efficiency of the suggested pipeline. As a final note, it should be remembered that reticular chemistry, and chemistry in general, is three-dimensional, and if we are to “machine-learn” it properly, we ought to respect that. We envision that our study will motivate the adoption of three-dimensional inputs in future chemistry-oriented ML works.