

Activity-difference training of deep neural networks using memristor crossbars

Abstract

Artificial neural networks have rapidly progressed in recent years, but are limited by the high energy costs required to train them on digital hardware. Emerging analogue hardware, such as memristor arrays, could offer improved energy efficiencies. However, the widely used backpropagation training algorithms are generally incompatible with such hardware because of mismatches between the analytically calculated training information and the imprecision of actual analogue devices. Here we report activity-difference-based training on co-designed tantalum oxide analogue memristor crossbars. Our approach, which we term memristor activity-difference energy minimization (MADEM), formulates the training of the network parameters as a constrained optimization problem and numerically calculates local gradients via Hopfield-like energy minimization, using behavioural differences measured on the hardware targeted by the training. We use the technique to train one-layer and multilayer neural networks that can classify Braille words with high accuracy. With modelling, we show that our approach can offer an energy advantage of over four orders of magnitude compared with digital approaches for scaled-up problem sizes.
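To make the activity-difference idea concrete, the following is a minimal software sketch of a generic two-phase (free versus nudged) contrastive update on a Hopfield-like network, in the spirit of the description above. It is not the authors' MADEM implementation or their hardware pipeline: the relaxation routine, the nudging strength beta, the learning rate eta, the step size dt and all function names are hypothetical simplifications, and the crossbar's analogue matrix-vector products are emulated with NumPy.

import numpy as np

def relax(W, x, y_clamp=None, beta=0.0, steps=50, dt=0.1):
    """Settle the free (hidden and output) units by descending a Hopfield-like energy.
    If y_clamp is given, the output units are weakly nudged towards it with strength beta."""
    n_in = x.shape[0]
    s = np.zeros(W.shape[0] - n_in)              # free units: hidden + output
    for _ in range(steps):
        u = np.concatenate([x, s])               # full state: clamped inputs + free units
        # On analogue hardware this matrix-vector product would be performed by the
        # memristor crossbar (Ohm's law and Kirchhoff's current law); here it is emulated.
        grad = (W @ u)[n_in:] - s                # -dE/ds for E = 0.5*||s||^2 - 0.5*u^T W u
        if y_clamp is not None:
            n_out = y_clamp.shape[0]
            grad[-n_out:] += beta * (y_clamp - s[-n_out:])   # nudge outputs towards target
        s = np.clip(s + dt * grad, 0.0, 1.0)     # bounded activities as a stand-in nonlinearity
    return np.concatenate([x, s])

def activity_difference_update(W, x, y, beta=0.5, eta=0.05):
    """Two relaxations (free and nudged); the local weight update is proportional to the
    difference of the outer products of the settled activities."""
    u_free = relax(W, x)                         # first phase: outputs evolve freely
    u_nudged = relax(W, x, y_clamp=y, beta=beta) # second phase: outputs nudged to the target
    dW = (np.outer(u_nudged, u_nudged) - np.outer(u_free, u_free)) / beta
    np.fill_diagonal(dW, 0.0)                    # no self-connections in the Hopfield network
    return W + eta * dW

# Toy usage: one update on a hypothetical 6-input, 4-hidden, 2-output network.
rng = np.random.default_rng(0)
n_in, n_hid, n_out = 6, 4, 2
n = n_in + n_hid + n_out
W = rng.normal(scale=0.1, size=(n, n))
W = 0.5 * (W + W.T)                              # symmetric weights, as the energy requires
np.fill_diagonal(W, 0.0)
x = rng.random(n_in)
y = np.array([1.0, 0.0])
W = activity_difference_update(W, x, y)

The point of this sketch is the design choice it illustrates: the weight update depends only on the difference between activities measured in two settled states of the same physical system, rather than on analytically backpropagated gradients, which is what makes activity-difference schemes tolerant of the analogue device imprecision discussed above.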


Fig. 1: Spectrum of training techniques and their comparisons.
Fig. 2: Experimental demonstration of MADEM to reconstruct Braille words.
Fig. 3: Scalability of MADEM and performance benchmarking.


Data availability

All the data presented in the manuscript and used to support its conclusions will be supplied by the authors upon reasonable request.

Code availability

All the simulation codes used to support the conclusions of the manuscript will be supplied by the authors upon reasonable request.


Acknowledgements

R. Pantone and X. Sheng are gratefully acknowledged for feedback on the manuscript and/or assistance with the experiments. S.Y. and R.S.W. were partly supported by the Air Force Office of Scientific Research (AFOSR) under grant no. AFOSR-FA9550-19-0213, titled ‘Brain Inspired Networks for Multifunctional Intelligent Systems in Aerial Vehicles’. R.S.W. acknowledges the X-Grants Program of the President’s Excellence Fund at Texas A&M University. We acknowledge the Laboratory Directed Research and Development program at Sandia National Laboratories, a multimission laboratory operated for the US Department of Energy (DOE)’s National Nuclear Security Administration under contract DE-NA0003525. This paper describes objective technical results and analyses. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the US Department of Energy or the United States Government. Part of this work was performed at the Stanford Nano Shared Facilities (SNSF), supported by the National Science Foundation under award ECCS-2026822. This research used resources of the Advanced Light Source, a US DOE Office of Science User Facility under contract DE-AC02-05CH11231.

Author information


Contributions

All the authors contributed to the conception of the ideas, literature review, writing of the manuscript, preparation of the figures and editing.

Corresponding author

Correspondence to Suhas Kumar.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Electronics thanks Hyungjin Kim and Huaqiang Wu for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Sections 1–16 and Figs. 1–27.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Yi, Si., Kendall, J.D., Williams, R.S. et al. Activity-difference training of deep neural networks using memristor crossbars. Nat Electron 6, 45–51 (2023). https://doi.org/10.1038/s41928-022-00869-w

