Abstract
In deep neural network (DNN) training, network weights are iteratively updated with the weight gradients that are obtained from stochastic gradient descent (SGD). Since SGD inherently allows gradient calculations with noise, approximating weight gradient computations have a large potential of training energy/time savings without degrading accuracy. In this paper, we propose an input-dependent approximation of the weight gradient for improving energy efficiency of training process. Considering that the output predictions of network (confidence) changes with training inputs, the relation between the confidence and the magnitude of weight gradient can be efficiently exploited to skip the gradient computations without accuracy drop, especially for high confidence inputs. With a given squared error constraint, the computation skip rates can be also controlled by changing the confidence threshold. The simulation results show that our approach can skip 72.6% of gradient computations for CIFAR-100 dataset using ResNet-18 without accuracy degradation. Hardware implementation with 65nm CMOS process shows that our design achieves 88.84% and 98.16% of maximum per epoch training energy and time savings, respectively, for CIFAR-100 dataset using ResNet-18 compared to state-of-the-art training accelerator.
Original language | English |
---|---|
Title of host publication | 2020 57th ACM/IEEE Design Automation Conference, DAC 2020 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Electronic) | 9781450367257 |
DOIs | |
Publication status | Published - 2020 Jul |
Event | 57th ACM/IEEE Design Automation Conference, DAC 2020 - Virtual, San Francisco, United States Duration: 2020 Jul 20 → 2020 Jul 24 |
Publication series
Name | Proceedings - Design Automation Conference |
---|---|
Volume | 2020-July |
ISSN (Print) | 0738-100X |
Conference
Conference | 57th ACM/IEEE Design Automation Conference, DAC 2020 |
---|---|
Country/Territory | United States |
City | Virtual, San Francisco |
Period | 20/7/20 → 20/7/24 |
Bibliographical note
Funding Information:This work was supported by the National Research Foundation of Korea grant funded by the Korea government (No. NRF-2020R1A2C3014820), and the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2020-2018-0-01433) supervised by the IITP (Institute for Information & communications Technology Promotion).
Publisher Copyright:
© 2020 IEEE.
ASJC Scopus subject areas
- Computer Science Applications
- Control and Systems Engineering
- Electrical and Electronic Engineering
- Modelling and Simulation