Low Complexity Gradient Computation Techniques to Accelerate Deep Neural Network Training

Dongyeob Shin, Geonho Kim, Joongho Jo, Jongsun Park

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Deep neural network (DNN) training is an iterative process of updating network weights, called gradient computation, where the (mini-batch) stochastic gradient descent (SGD) algorithm is generally used. Since SGD inherently tolerates gradient computations with noise, properly approximating weight gradients within the SGD noise margin is a promising way to reduce the energy and time consumed during DNN training. This article proposes two novel techniques that reduce the computational complexity of gradient computation to accelerate SGD-based DNN training. First, considering that the output predictions of a network (confidence) change with the training inputs, the relation between the confidence and the magnitude of the weight gradient can be exploited to skip gradient computations without seriously sacrificing accuracy, especially for high-confidence inputs. Second, angle-diversity-based approximations of the intermediate activations used in weight gradient calculation are also presented. Based on the fact that the angle diversity of gradients is small (highly uncorrelated) in the early training epochs, the bit precision of activations can be reduced to 2-/4-/8-bit depending on the resulting angle error between the original gradient and the quantized gradient. Simulations show that the proposed approach can skip up to 75.83% of gradient computations with negligible accuracy degradation for the CIFAR-10 dataset using ResNet-20. Hardware implementation results using 65-nm CMOS technology also show that the proposed training accelerator achieves up to 1.69× higher energy efficiency than other training accelerators.
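The abstract describes two approximation strategies: skipping weight-gradient computation for high-confidence, correctly classified inputs, and selecting an activation bit width (2/4/8-bit) based on the angle error the quantization induces. Below is a minimal NumPy sketch of both ideas, assuming a softmax classifier and a uniform symmetric quantizer; the function names, thresholds, and quantizer are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def skip_mask_from_confidence(logits, labels, conf_threshold=0.9):
    """Per-sample mask: True = compute the weight gradient, False = skip.

    Samples that are already classified correctly with high softmax confidence
    are assumed to yield small-magnitude gradients, so their gradient
    computation is skipped.  `conf_threshold` is a hypothetical hyperparameter.
    """
    z = logits - logits.max(axis=1, keepdims=True)            # numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    confidence = probs[np.arange(len(labels)), labels]         # prob. of true class
    correct = probs.argmax(axis=1) == labels
    return ~(correct & (confidence > conf_threshold))          # skip confident & correct

def choose_activation_bits(activations, bit_options=(2, 4, 8), max_angle_deg=5.0):
    """Pick the lowest bit width whose quantized activations stay within
    `max_angle_deg` degrees of the full-precision activations.

    The angle between the original and quantized activation vectors is used
    here as a proxy for the angle error induced in the weight gradient; the
    threshold value is an illustrative choice.
    """
    a = activations.ravel().astype(np.float64)
    for bits in bit_options:
        scale = np.abs(a).max() / (2 ** (bits - 1) - 1) + 1e-12  # uniform symmetric quantizer
        q = np.round(a / scale) * scale
        cos = np.dot(a, q) / (np.linalg.norm(a) * np.linalg.norm(q) + 1e-12)
        angle = np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
        if angle <= max_angle_deg:
            return bits
    return 32  # fall back to full precision if no low-bit option is accurate enough
```

In a training loop, the mask would gate the backward pass per sample, and the chosen bit width would set the precision of the stored activations for that layer and epoch; both knobs trade a bounded approximation error for fewer and cheaper gradient computations.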

Original language: English
Pages (from-to): 5745-5759
Number of pages: 15
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 34
Issue number: 9
DOIs
Publication status: Published - 2023 Sept 1

Bibliographical note

Funding Information:
This work was supported in part by a National Research Foundation of Korea grant funded by the Korean Government under Grant NRF-2020R1A2C3014820; in part by the Ministry of Science and ICT (MSIT), South Korea, through the Information Technology Research Center (ITRC) Support Program under Grant IITP-2021-2018-0-01433, supervised by the Institute for Information and Communications Technology Promotion (IITP); and in part by the Industrial Strategic Technology Development Program (Development of SoC technology based on Spiking Neural Cell for smart mobile and IoT Devices) under Grant 10077445.

Publisher Copyright:
© 2021 IEEE.

Keywords

  • Confidence
  • gradient approximation
  • low-precision training
  • stochastic gradient descent (SGD)
  • training accelerator

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Science Applications
