Training deep neural networks for binary communication with the Whetstone method

A preprint version of the article is available at arXiv.

Abstract

The computational cost of deep neural networks presents challenges to broadly deploying these algorithms. Low-power and embedded neuromorphic processors offer potentially dramatic performance-per-watt improvements over traditional processors. However, programming these brain-inspired platforms generally requires platform-specific expertise. It is therefore difficult to achieve state-of-the-art performance on these platforms, limiting their applicability. Here we present Whetstone, a method to bridge this gap by converting deep neural networks to have discrete, binary communication. During the training process, the activation function at each layer is progressively sharpened towards a threshold activation, with limited loss in performance. Whetstone-sharpened networks do not require a rate code or other spike-based coding scheme, thus producing networks comparable in timing and size to conventional artificial neural networks. We demonstrate Whetstone on a number of architectures and tasks such as image classification, autoencoders and semantic segmentation. Whetstone is currently implemented within the Keras wrapper for TensorFlow and is widely extendable.
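
As a rough illustration of the sharpening idea described in the abstract, the sketch below uses plain tf.keras: a bounded ramp activation whose slope is gradually increased by a training callback until it approximates a 0/1 threshold. The class and callback names (SharpenedActivation, GradualSharpener) and the particular schedule are hypothetical and are not the released Whetstone API.

import tensorflow as tf
from tensorflow import keras

class SharpenedActivation(keras.layers.Layer):
    """Bounded ramp on [0, 1] that approaches a 0/1 step as `sharpness` approaches 1."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.sharpness = tf.Variable(0.0, trainable=False, name='sharpness')

    def call(self, inputs):
        # The ramp narrows as sharpness rises; at full sharpness it is nearly a step at 0.5.
        width = tf.maximum(1.0 - self.sharpness, 1e-3)
        return tf.clip_by_value((inputs - 0.5) / width + 0.5, 0.0, 1.0)

class GradualSharpener(keras.callbacks.Callback):
    """After a warm-up period, raise each SharpenedActivation's sharpness a little every epoch."""

    def __init__(self, start_epoch=5, step=0.05):
        super().__init__()
        self.start_epoch = start_epoch
        self.step = step

    def on_epoch_end(self, epoch, logs=None):
        if epoch < self.start_epoch:
            return
        for layer in self.model.layers:
            if isinstance(layer, SharpenedActivation):
                layer.sharpness.assign(min(1.0, float(layer.sharpness.numpy()) + self.step))

model = keras.Sequential([
    keras.layers.Dense(256, input_shape=(784,)),
    SharpenedActivation(),
    keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# model.fit(x_train, y_train, epochs=30, callbacks=[GradualSharpener()])

The released package at https://github.com/SNL-NERL/Whetstone provides its own Keras layers and sharpening schedules; the sketch above only conveys the overall shape of the approach, not the schedule used in the paper.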

Fig. 1: Overview of the Whetstone process.
Fig. 2: Training a single network through the Whetstone process.
Fig. 3: How Whetstone training influences the performance of different network topologies and tasks.
Fig. 4: Whetstone training requires N-hot output encodings (see the sketch following this figure list).
Fig. 5: Whetstone has the ability to sharpen diverse networks.
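
Fig. 4 refers to N-hot output encodings, in which each class is represented by a small group of binary output units rather than a single unit. The snippet below is a generic, illustrative implementation of such an encoding; the contiguous block-per-class layout and the helper names (n_hot_encode, n_hot_decode) are assumptions for illustration, not the exact scheme used in the paper.

import numpy as np

def n_hot_encode(labels, num_classes, n):
    """Map integer labels to redundant N-hot targets: class c activates the n units in block [c*n, (c+1)*n)."""
    targets = np.zeros((len(labels), num_classes * n), dtype=np.float32)
    for i, c in enumerate(labels):
        targets[i, c * n:(c + 1) * n] = 1.0
    return targets

def n_hot_decode(outputs, num_classes, n):
    """Predict the class whose block of n binary outputs has the largest sum."""
    block_sums = outputs.reshape(len(outputs), num_classes, n).sum(axis=-1)
    return block_sums.argmax(axis=-1)

# Example: 10 classes with 5 binary output units per class gives a 50-unit output layer.
targets = n_hot_encode(np.array([3, 7]), num_classes=10, n=5)
print(n_hot_decode(targets, num_classes=10, n=5))  # -> [3 7]

Decoding by summing each class's block tolerates a few output units giving the wrong binary value, which is one common motivation for redundant output codes when every output is constrained to be 0 or 1.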

Data availability

All data used come from publicly available datasets: MNIST [34], Fashion-MNIST [35], CIFAR [36] and COCO [19]. Whetstone is available at https://github.com/SNL-NERL/Whetstone, licensed under the GPL.

References

  1. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (IEEE, 2016).

  2. Pinheiro, P. O., Collobert, R. & Dollár, P. Learning to segment object candidates. In Proc. 28th International Conference on Neural Information Processing Systems 2, 1990–1998 (2015).

  3. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).

  4. Yang, T.-J., Chen, Y.-H. & Sze, V. Designing energy-efficient convolutional neural networks using energy-aware pruning. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 6071–6079 (IEEE, 2017).

  5. Coppola, G. & Dey, E. Driverless cars are giving engineers a fuel economy headache. Bloomberg.com https://www.bloomberg.com/news/articles/2017-10-11/driverless-cars-are-giving-engineers-a-fuel-economy-headache (2017).

  6. Horowitz, M. 1.1 Computing’s energy problem (and what we can do about it). In 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC) 10–14 (IEEE, 2014).

  7. Jouppi, N. P. et al. In-datacenter performance analysis of a tensor processing unit. In 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) 1–12 (IEEE, 2017).

  8. Rao, N. Intel® Nervana™ neural network processors (NNP) redefine AI silicon. Intel https://ai.intel.com/intel-nervana-neural-network-processors-nnp-redefine-ai-silicon/ (2018).

  9. Hemsoth, N. Intel, Nervana shed light on deep learning chip architecture. The Next Platform https://www.nextplatform.com/2018/01/11/intel-nervana-shed-light-deep-learning-chip-architecture/ (2018).

  10. Markidis, S. et al. NVIDIA tensor core programmability, performance & precision. Preprint at https://arxiv.org/abs/1803.04014 (2018).

  11. Merolla, P. A. et al. A million spiking-neuron integrated circuit with a scalable communication network and interface. Science 345, 668–673 (2014).

  12. Khan, M. M. et al. SpiNNaker: mapping neural networks onto a massively-parallel chip multiprocessor. In IEEE International Joint Conference on Neural Networks, 2008, IJCNN 2008 (IEEE World Congress on Computational Intelligence) 2849–2856 (IEEE, 2008).

  13. Schuman, C. D. et al. A survey of neuromorphic computing and neural networks in hardware. Preprint at https://arxiv.org/abs/1705.06963 (2017).

  14. James, C. D. et al. A historical survey of algorithms and hardware architectures for neural-inspired and neuromorphic computing applications. Biol. Inspired Cogn. Archit. 19, 49–64 (2017).

  15. Knight, J. C., Tully, P. J., Kaplan, B. A., Lansner, A. & Furber, S. B. Large-scale simulations of plastic neural networks on neuromorphic hardware. Front. Neuroanat. 10, 37 (2016).

  16. Sze, V., Chen, Y.-H., Yang, T.-J. & Emer, J. S. Efficient processing of deep neural networks: a tutorial and survey. Proc. IEEE 105, 2295–2329 (2017).

  17. Bergstra, J., Yamins, D. & Cox, D. D. Hyperopt: a Python library for optimizing the hyperparameters of machine learning algorithms. In Proceedings of the 12th Python in Science Conference 13–20 (Citeseer, 2013).

  18. Li, L., Jamieson, K., DeSalvo, G., Rostamizadeh, A. & Talwalkar, A. Hyperband: a novel bandit-based approach to hyperparameter optimization. J. Mach. Learn. Res. 18, 6765–6816 (2017).

  19. Lin, T.-Y. et al. Microsoft COCO: common objects in context. In European Conference on Computer Vision 740–755 (Springer, 2014).

  20. Hunsberger, E. & Eliasmith, C. Training spiking deep networks for neuromorphic hardware. Preprint at https://arxiv.org/abs/1611.05141 (2016).

  21. Esser, S. K., Appuswamy, R., Merolla, P., Arthur, J. V. & Modha, D. S. Backpropagation for energy-efficient neuromorphic computing. In Advances in Neural Information Processing Systems 28 (eds Cortes, C., Lawrence, N. D., Lee, D. D., Sugiyama, M. & Garnett, R.) 1117–1125 (Curran Associates, Red Hook, 2015).

  22. Esser, S. et al. Convolutional networks for fast, energy-efficient neuromorphic computing. Preprint at http://arxiv.org/abs/1603.08270 (2016).

  23. Rueckauer, B., Lungu, I.-A., Hu, Y., Pfeiffer, M. & Liu, S.-C. Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Front. Neurosci. 11, 682 (2017).

  24. Bohte, S. M., Kok, J. N. & La Poutré, J. A. SpikeProp: backpropagation for networks of spiking neurons. In European Symposium on Artificial Neural Networks 419–424 (ELEN, London, 2000).

  25. Huh, D. & Sejnowski, T. J. Gradient descent for spiking neural networks. Preprint at https://arxiv.org/abs/1706.04698 (2017).

  26. Cao, Y., Chen, Y. & Khosla, D. Spiking deep convolutional neural networks for energy-efficient object recognition. Int. J. Comput. Vis. 113, 54–66 (2015).

  27. Hunsberger, E. & Eliasmith, C. Spiking deep networks with LIF neurons. Preprint at https://arxiv.org/abs/1510.08829 (2015).

  28. Liew, S. S., Khalil-Hani, M. & Bakhteri, R. Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems. Neurocomputing 216, 718–734 (2016).

  29. Nise, N. S. Control Systems Engineering, 5th edn (Wiley, New York, NY, 2008).

  30. Chollet, F. et al. Keras https://github.com/fchollet/keras (2015).

  31. Rothganger, F., Warrender, C. E., Trumbo, D. & Aimone, J. B. N2A: a computational tool for modeling from neurons to algorithms. Front. Neural Circuits 8, 1 (2014).

  32. Davison, A. P. et al. PyNN: a common interface for neuronal network simulators. Front. Neuroinform. 2, 11 (2009).

  33. Hubara, I., Courbariaux, M., Soudry, D., El-Yaniv, R. & Bengio, Y. Binarized neural networks. In Proceedings of Advances in Neural Information Processing Systems 4107–4115 (Curran Associates, Red Hook, 2016).

  34. LeCun, Y., Cortes, C. & Burges, C. MNIST handwritten digit database. AT&T Labs http://yann.lecun.com/exdb/mnist (2010).

  35. Xiao, H., Rasul, K. & Vollgraf, R. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. Preprint at https://arxiv.org/abs/1708.07747 (2017).

  36. Krizhevsky, A. & Hinton, G. Learning Multiple Layers of Features from Tiny Images. Technical Report, Univ. Toronto (2009).

Acknowledgements

This work was supported by Sandia National Laboratories’ Laboratory Directed Research and Development (LDRD) Program under the Hardware Acceleration of Adaptive Neural Algorithms Grand Challenge project and the DOE Advanced Simulation and Computing program. Sandia National Laboratories is a multi-mission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, a wholly owned subsidiary of Honeywell International, for the US Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525.

This Article describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the US Department of Energy or the US Government.

Author information

Contributions

All authors contributed to Whetstone algorithm theory and design. W.S. and R.D. implemented code and performed experiments. W.S., C.M.V., R.D. and J.B.A. analysed results. W.S., C.M.V. and J.B.A. wrote the manuscript.

Corresponding authors

Correspondence to William Severa or James B. Aimone.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary notes and figures

About this article

Cite this article

Severa, W., Vineyard, C.M., Dellana, R. et al. Training deep neural networks for binary communication with the Whetstone method. Nat Mach Intell 1, 86–94 (2019). https://doi.org/10.1038/s42256-018-0015-y

