

Activity-difference training of deep neural networks using memristor crossbars

Abstract

Artificial neural networks have rapidly progressed in recent years, but are limited by the high energy costs required to train them on digital hardware. Emerging analogue hardware, such as memristor arrays, could offer improved energy efficiencies. However, the widely used backpropagation training algorithms are generally incompatible with such hardware because of mismatches between the analytically calculated training information and the imprecision of actual analogue devices. Here we report activity-difference-based training on co-designed tantalum oxide analogue memristor crossbars. Our approach, which we term memristor activity-difference energy minimization (MADEM), formulates the training of the network parameters as a constrained optimization problem and numerically calculates local gradients via Hopfield-like energy minimization, using behavioural differences measured on the hardware targeted by the training. We use the technique to train one-layer and multilayer neural networks that can classify Braille words with high accuracy. With modelling, we show that our approach can offer an energy advantage of over four orders of magnitude compared with digital approaches for scaled-up problem sizes.
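To make the activity-difference idea concrete, the following is a minimal software sketch of a generic two-phase (free versus nudged) contrastive update on a Hopfield-like network, in the spirit of the description above. It is not the authors' MADEM implementation or their hardware pipeline: the relaxation routine, the nudging strength beta, the learning rate eta, the step size dt and all function names are hypothetical simplifications, and the crossbar's analogue matrix-vector products are emulated with NumPy.

import numpy as np

def relax(W, x, y_clamp=None, beta=0.0, steps=50, dt=0.1):
    """Settle the free (hidden and output) units by descending a Hopfield-like energy.
    If y_clamp is given, the output units are weakly nudged towards it with strength beta."""
    n_in = x.shape[0]
    s = np.zeros(W.shape[0] - n_in)              # free units: hidden + output
    for _ in range(steps):
        u = np.concatenate([x, s])               # full state: clamped inputs + free units
        # On analogue hardware this matrix-vector product would be performed by the
        # memristor crossbar (Ohm's law and Kirchhoff's current law); here it is emulated.
        grad = (W @ u)[n_in:] - s                # -dE/ds for E = 0.5*||s||^2 - 0.5*u^T W u
        if y_clamp is not None:
            n_out = y_clamp.shape[0]
            grad[-n_out:] += beta * (y_clamp - s[-n_out:])   # nudge outputs towards target
        s = np.clip(s + dt * grad, 0.0, 1.0)     # bounded activities as a stand-in nonlinearity
    return np.concatenate([x, s])

def activity_difference_update(W, x, y, beta=0.5, eta=0.05):
    """Two relaxations (free and nudged); the local weight update is proportional to the
    difference of the outer products of the settled activities."""
    u_free = relax(W, x)                         # first phase: outputs evolve freely
    u_nudged = relax(W, x, y_clamp=y, beta=beta) # second phase: outputs nudged to the target
    dW = (np.outer(u_nudged, u_nudged) - np.outer(u_free, u_free)) / beta
    np.fill_diagonal(dW, 0.0)                    # no self-connections in the Hopfield network
    return W + eta * dW

# Toy usage: one update on a hypothetical 6-input, 4-hidden, 2-output network.
rng = np.random.default_rng(0)
n_in, n_hid, n_out = 6, 4, 2
n = n_in + n_hid + n_out
W = rng.normal(scale=0.1, size=(n, n))
W = 0.5 * (W + W.T)                              # symmetric weights, as the energy requires
np.fill_diagonal(W, 0.0)
x = rng.random(n_in)
y = np.array([1.0, 0.0])
W = activity_difference_update(W, x, y)

The point of this sketch is the design choice it illustrates: the weight update depends only on the difference between activities measured in two settled states of the same physical system, rather than on analytically backpropagated gradients, which is what makes activity-difference schemes tolerant of the analogue device imprecision discussed above.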


Fig. 1: Spectrum of training techniques and their comparisons.
Fig. 2: Experimental demonstration of MADEM to reconstruct Braille words.
Fig. 3: Scalability of MADEM and performance benchmarking.


Data availability

All the data presented in the manuscript and used to support its conclusions will be supplied by the authors upon reasonable request.

Code availability

All the simulation codes used to support the conclusions of the manuscript will be supplied by the authors upon reasonable request.


Acknowledgements

R. Pantone and X. Sheng are gratefully acknowledged for feedback on the manuscript and/or assistance with the experiments. S.Y. and R.S.W. were partly supported by the Air Force Office of Scientific Research (AFOSR) under grant no. AFOSR-FA9550-19-0213, titled ‘Brain Inspired Networks for Multifunctional Intelligent Systems in Aerial Vehicles’. R.S.W. acknowledges the X-Grants Program of the President’s Excellence Fund at Texas A&M University. We acknowledge the Laboratory Directed Research and Development program at Sandia National Laboratories, a multimission laboratory operated for the US Department of Energy (DOE)’s National Nuclear Security Administration under contract DE-NA0003525. This paper describes objective technical results and analyses. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the US Department of Energy or the United States Government. Part of this work was performed at the Stanford Nano Shared Facilities (SNSF), supported by the National Science Foundation under award ECCS-2026822. This research used resources of the Advanced Light Source, a US DOE Office of Science User Facility under contract DE-AC02-05CH11231.

Author information


Contributions

All the authors contributed to the conception of the ideas, literature review, writing of the manuscript, preparation of the figures and editing.

Corresponding author

Correspondence to Suhas Kumar.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Electronics thanks Hyungjin Kim and Huaqiang Wu for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Sections 1–16 and Figs. 1–27.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Yi, Si., Kendall, J.D., Williams, R.S. et al. Activity-difference training of deep neural networks using memristor crossbars. Nat Electron 6, 45–51 (2023). https://doi.org/10.1038/s41928-022-00869-w

