Learning on topological surface and geometric structure for 3D molecular generation

Abstract

Highly effective de novo design is a grand challenge of computer-aided drug discovery. Practical methods for structure-specific three-dimensional molecule generation have started to emerge in recent years, but most approaches treat the target structure as a conditional input to bias the molecule generation and do not fully learn the detailed atomic interactions that govern the molecular conformation and stability of the binding complexes. Because these fine details are omitted, many models have difficulty producing reasonable molecules across a variety of therapeutic targets. Here, to address this challenge, we formulate a model, called SurfGen, that designs molecules in a fashion closely resembling the figurative key-and-lock principle. SurfGen comprises two equivariant neural networks, Geodesic-GNN and Geoatom-GNN, which capture the topological interactions on the pocket surface and the spatial interactions between ligand atoms and surface nodes, respectively. SurfGen outperforms other methods in a number of benchmarks, and its high sensitivity to pocket structure enables an effective generative-model-based solution to the thorny issue of mutation-induced drug resistance.
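To make the term "equivariant" concrete, the following is a minimal sketch of a toy coordinate update between ligand atoms and surface nodes, with a numerical check that rotating the inputs rotates the outputs in the same way. It is not the Geodesic-GNN or Geoatom-GNN architecture; the function names, the exponential weight and the sample coordinates are all assumptions made for illustration.

```python
import math

def rotate_z(p, theta):
    """Rotate a 3D point about the z axis by theta radians."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)

def toy_equivariant_update(ligand, surface):
    """Shift each ligand atom along its directions to the surface nodes.

    The weights depend only on distances, which are rotation-invariant,
    so the updated coordinates rotate together with the inputs.
    (Hypothetical toy layer, not the SurfGen update rule.)
    """
    updated = []
    for atom in ligand:
        shift = [0.0, 0.0, 0.0]
        for node in surface:
            diff = [atom[k] - node[k] for k in range(3)]
            d2 = sum(c * c for c in diff)
            w = math.exp(-d2)  # assumed invariant weight
            for k in range(3):
                shift[k] += w * diff[k]
        updated.append(
            tuple(atom[k] + shift[k] / len(surface) for k in range(3))
        )
    return updated

def max_equivariance_error(ligand, surface, theta):
    """Compare rotate-then-update with update-then-rotate."""
    def rot(points):
        return [rotate_z(p, theta) for p in points]
    a = rot(toy_equivariant_update(ligand, surface))
    b = toy_equivariant_update(rot(ligand), rot(surface))
    return max(abs(u - v) for p, q in zip(a, b) for u, v in zip(p, q))

# Made-up coordinates, for illustration only.
ligand = [(0.0, 0.0, 0.0), (1.2, 0.3, -0.4)]
surface = [(1.0, 1.0, 0.0), (-0.5, 0.8, 0.6), (0.2, -1.1, 0.9)]
err = max_equivariance_error(ligand, surface, theta=0.7)
```

Here `err` stays at the level of floating-point round-off, because the update is built only from relative position vectors scaled by distance-dependent weights. A layer that used absolute coordinates as features would fail this check.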

Fig. 1: Generated molecules and randomly sampled molecules for a COVID-19 target, 3CL protein.
Fig. 2: Illustration of SurfGen on real-world targets.
Fig. 3: Conditional generated molecules on shikimate kinase mutants.

Data availability

The data are available at Zenodo (https://doi.org/10.5281/zenodo.8307911; ref. 44). PDB IDs 1ZYU and 6LU7 are available in the PDB (https://www.rcsb.org/). Source data are available with this paper.

Code availability

The code is available at GitHub (https://github.com/HaotianZhangAI4Science/SurfGen; ref. 45).

References

  1. Ferreira, L. G., Dos Santos, R. N., Oliva, G. & Andricopulo, A. D. Molecular docking and structure-based drug design strategies. Molecules 20, 13384–13421 (2015).

  2. Anderson, A. C. The process of structure-based drug design. Chem. Biol. 10, 787–797 (2003).

  3. Shoichet, B. K. Virtual screening of chemical libraries. Nature 432, 862–865 (2004).

  4. Böhm, H.-J. The computer program LUDI: a new method for the de novo design of enzyme inhibitors. J. Comput. Aided Mol. Des. 6, 61–78 (1992).

  5. Wang, R., Gao, Y. & Lai, L. LigBuilder: a multi-purpose program for structure-based drug design. Mol. Model. Annu. 6, 498–516 (2000).

  6. David, L., Nielsen, P. A., Hedstrom, M. & Norden, B. Scope and limitation of ligand docking: methods, scoring functions and protein targets. Curr. Comput. Aided Drug Design 1, 275–306 (2005).

  7. Jorgensen, W. L. Rusting of the lock and key model for protein-ligand binding. Science 254, 954–955 (1991).

  8. Ain, Q. U., Aleksandrova, A., Roessler, F. D. & Ballester, P. J. Machine-learning scoring functions to improve structure-based binding affinity prediction and virtual screening. Wiley Interdiscip. Rev. Comput. Mol. Sci. 5, 405–424 (2015).

  9. McNutt, A. T. et al. GNINA 1.0: molecular docking with deep learning. J. Cheminformatics 13, 1–20 (2021).

  10. Shen, C. et al. Boosting protein–ligand binding pose prediction and virtual screening based on residue–atom distance likelihood potential and graph transformer. J. Med. Chem. 65, 10691–10706 (2022).

  11. Jiang, D. et al. InteractionGraphNet: a novel and efficient deep graph representation learning framework for accurate protein–ligand interaction predictions. J. Med. Chem. 64, 18209–18232 (2021).

  12. Moon, S., Zhung, W., Yang, S., Lim, J. & Kim, W. Y. PIGNet: a physics-informed deep learning model toward generalized drug–target interaction predictions. Chem. Sci. 13, 3661–3673 (2022).

  13. Gainza, P. et al. Deciphering interaction fingerprints from protein molecular surfaces using geometric deep learning. Nat. Methods 17, 184–192 (2020).

  14. Deng, C. et al. Vector neurons: a general framework for SO(3)-equivariant networks. Proc. IEEE/CVF International Conference on Computer Vision 12200–12209 (2021).

  15. Zang, C. & Wang, F. MoFlow: an invertible flow model for generating molecular graphs. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 617–626 (2020).

  16. Peng, X. et al. Pocket2Mol: efficient molecular sampling based on 3D protein pockets. International Conference on Machine Learning 17644–17655 (2022).

  17. Ragoza, M., Masuda, T. & Koes, D. R. Generating 3D molecules conditional on receptor binding sites with deep generative models. Chem. Sci. 13, 2701–2713 (2022).

  18. Liu, M., Luo, Y., Uchino, K., Maruhashi, K. & Ji, S. Generating 3D molecules for target protein binding. International Conference on Machine Learning 13912–13924 (2022).

  19. Jeon, W. & Kim, D. Autonomous molecule generation using reinforcement learning and docking to develop potential novel inhibitors. Sci. Rep. 10, 22104 (2020).

  20. Trott, O. & Olson, A. J. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem. 31, 455–461 (2010).

  21. Wang, R., Liu, L., Lai, L. & Tang, Y. SCORE: a new empirical method for estimating the binding affinity of a protein–ligand complex. Mol. Model. Annu. 4, 379–394 (1998).

  22. Francoeur, P. G. et al. Three-dimensional convolutional neural networks and a cross-docked data set for structure-based drug design. J. Chem. Inf. Model. 60, 4200–4215 (2020).

  23. Schneuing, A. et al. Structure-based drug design with equivariant diffusion models. Preprint at https://arxiv.org/abs/2210.13695 (2022).

  24. Martin, Y. C., Kofron, J. L. & Traphagen, L. M. Do structurally similar molecules have similar biological activity? J. Med. Chem. 45, 4350–4358 (2002).

  25. Yang, J., Cai, Y., Zhao, K., Xie, H. & Chen, X. Concepts and applications of chemical fingerprint for hit and lead screening. Drug Discov. Today 27, 103356 (2022).

  26. Kang, S.-G. et al. In-pocket 3D graphs enhance ligand–target compatibility in generative small-molecule creation. Preprint at https://arxiv.org/abs/2204.02513 (2022).

  27. Wang, M. et al. Relation: a deep generative model for structure-based de novo drug design. J. Med. Chem. 65, 9478–9492 (2022).

  28. Gan, J., Gu, Y., Li, Y., Yan, H. & Ji, X. Crystal structure of Mycobacterium tuberculosis shikimate kinase in complex with shikimic acid and an ATP analogue. Biochemistry 45, 8539–8545 (2006).

  29. Pereira, J. H. et al. Shikimate kinase: a potential target for development of novel antitubercular agents. Curr. Drug Targets 8, 459–468 (2007).

  30. Jing, B., Eismann, S., Suriana, P., Townshend, R. J. & Dror, R. Learning from protein structure with geometric vector perceptrons. Preprint at https://arxiv.org/abs/2009.01411 (2020).

  31. Lamm, G. The Poisson–Boltzmann equation. Rev. Comput. Chem. 19, 147–365 (2003).

  32. Kortemme, T., Morozov, A. V. & Baker, D. An orientation-dependent hydrogen bonding potential improves prediction of specificity and structure for proteins and protein–protein complexes. J. Mol. Biol. 326, 1239–1259 (2003).

  33. Hagemans, D., Van Belzen, I. A., Morán Luengo, T. & Rüdiger, S. G. A script to highlight hydrophobicity and charge on protein surfaces. Front. Mol. Biosci. 2, 56 (2015).

  34. Shi, C. et al. GraphAF: a flow-based autoregressive model for molecular graph generation. International Conference on Learning Representations (ICLR) (2020).

  35. Lin, H. et al. DiffBP: generative diffusion of 3D molecules for target protein binding. Preprint at https://arxiv.org/abs/2211.11214 (2022).

  36. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

  37. Lu, W. et al. TANKBind: trigonometry-aware neural networks for drug–protein binding structure prediction. Adv. Neural Inf. Process. Syst. 35, 7236–7249 (2022).

  38. Luo, S., Guan, J., Ma, J. & Peng, J. A 3D generative model for structure-based drug design. Adv. Neural Inf. Process. Syst. 34, 6229–6239 (2021).

  39. Burley, S. K. et al. Protein Data Bank (PDB): the single global macromolecular structure archive. Protein Crystallogr. 1607, 627–641 (2017).

  40. Liu, T., Lin, Y., Wen, X., Jorissen, R. N. & Gilson, M. K. BindingDB: a web-accessible database of experimentally determined protein–ligand binding affinities. Nucleic Acids Res. 35, D198–D201 (2007).

  41. Jin, Z. et al. Structure of Mpro from SARS-CoV-2 and discovery of its inhibitors. Nature 582, 289–293 (2020).

  42. Tanimoto, T. T. Elementary Mathematical Theory of Classification and Prediction, IBM Internal Report (1958).

  43. Landrum, G. RDKit documentation. Release 1, 4 (2013).

  44. Odin, Z. CrossDock processed data. Zenodo https://doi.org/10.5281/zenodo.7751348 (2023).

  45. Odin, Z. SurfGenV1. Zenodo https://doi.org/10.5281/zenodo.8307911 (2023).

  46. Clark, D. E. & Pickett, S. D. Computational methods for the prediction of ‘drug-likeness’. Drug Discov. Today 5, 49–58 (2000).

  47. Ertl, P. & Schuffenhauer, A. Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J. Cheminformatics 1, 1–11 (2009).

  48. Ganesan, A. The impact of natural products upon modern drug discovery. Curr. Opin. Chem. Biol. 12, 306–317 (2008).

  49. Sangster, J. Octanol–water partition coefficients of simple organic compounds. J. Phys. Chem. Ref. Data 18, 1111–1229 (1989).

Acknowledgements

This work was financially supported by the National Key Research and Development Program of China (2022YFF1203003), the National Natural Science Foundation of China (22220102001, 82373791, and 81973281) and the Natural Science Foundation of Zhejiang Province (LD22H300001).

Author information

Contributions

O.Z. contributed the main idea and code. T.W. and N.W. contributed to writing the paper and reorganizing the code. G.W. collected the dataset and ran the corresponding experiment. D.J. contributed the real-world case study on the COVID-19 target. X.W., H.Z. and J.W. contributed to data analysis and figure drawing. N.W. assessed the LigBuilder and Morld methods. E.W. provided guidance on physical concepts. G.C. and Y.D. contributed visualization and technical support. P.P. suggested the mutation experiment with the molecular generation protocol. Y.K. and C.-Y.H. contributed to paper revision and experimental design. T.H. provided essential financial support and conception, and was responsible for overall quality.

Corresponding authors

Correspondence to Yu Kang, Chang-Yu Hsieh or Tingjun Hou.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Computational Science thanks Huziel Sauceda and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Kaitlin McCardle, in collaboration with the Nature Computational Science team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Sections 1–7, Figs. 1–3 and Tables 1–6.

Reporting Summary

Source data

Source Data Fig. 1

The corresponding metrics of the visualized examples in Fig. 1.

Source Data Fig. 2

Unprocessed raw data to draw the distribution plot in Fig. 2a.

Source Data Fig. 3

The molecular and pocket volumes shown in Fig. 3c.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, O., Wang, T., Weng, G. et al. Learning on topological surface and geometric structure for 3D molecular generation. Nat Comput Sci 3, 849–859 (2023). https://doi.org/10.1038/s43588-023-00530-2

  • DOI: https://doi.org/10.1038/s43588-023-00530-2
