A layer-wise frequency scaling for a neural processing unit

Jaehoon Chung, Hyun Mi Kim, Kyoungseon Shin, Chun Gi Lyuh, Yong Cheol Peter Cho, Jinho Han, Youngsu Kwon, Young Ho Gong, Sung Woo Chung

Research output: Contribution to journal › Article › peer-review

Abstract

Dynamic voltage and frequency scaling (DVFS) has been widely adopted for run-time power management of various processing units. For neural processing units (NPUs), power management of neural network applications must adjust the frequency and voltage for every layer to account for the power behavior and performance of each layer. Unfortunately, DVFS is inappropriate for layer-wise run-time power management of NPUs because the latency of voltage scaling is long compared with the execution time of each layer. Because frequency scaling is fast enough to keep up with each layer, we propose a layer-wise dynamic frequency scaling (DFS) technique for an NPU. Our proposed DFS uses, for each layer, the highest frequency allowed under the power limit of the NPU. To determine the highest allowable frequency, we build a power model that predicts the power consumption of an NPU, based on real measurements of the fabricated NPU. Our evaluation results show that the proposed DFS improves frames per second (FPS) by 33% and saves energy by 14% on average, compared with DVFS.
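
The selection step described in the abstract (for each layer, choose the highest frequency whose predicted power stays under the limit) can be illustrated with a minimal sketch. This is not the authors' power model or implementation. The `Layer` class, the linear-in-frequency dynamic power formula at a fixed voltage, the frequency table, and every numeric value are hypothetical, chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    # Hypothetical per-layer coefficient: activity factor times switched
    # capacitance, in nF. A real model would be fitted to measurements.
    switched_cap_nf: float


def predict_power_w(layer, freq_mhz, voltage_v=0.8, static_w=0.5):
    """Assumed power model: static power plus a*C*V^2*f dynamic power.

    Units: nF * V^2 * MHz = 1e-3 W, hence the 1e-3 factor.
    The voltage is held fixed because only the frequency is scaled.
    """
    dynamic_w = layer.switched_cap_nf * voltage_v ** 2 * freq_mhz * 1e-3
    return static_w + dynamic_w


def pick_frequency(layer, freqs_mhz, power_limit_w):
    """Return the highest frequency whose predicted power is within the limit.

    Falls back to the lowest available frequency if none is feasible.
    """
    feasible = [f for f in freqs_mhz
                if predict_power_w(layer, f) <= power_limit_w]
    return max(feasible) if feasible else min(freqs_mhz)


if __name__ == "__main__":
    freqs_mhz = [400, 600, 800, 1000]   # hypothetical frequency steps
    power_limit_w = 3.0                 # hypothetical power limit
    layers = [Layer("conv1", 5.0), Layer("fc1", 2.0)]
    for layer in layers:
        f = pick_frequency(layer, freqs_mhz, power_limit_w)
        print(f"{layer.name}: run at {f} MHz")
```

Under these assumed numbers, a power-hungry layer is held to a lower frequency while a lighter layer can run at the top step, which is the per-layer behavior the abstract describes.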

Original language: English
Pages (from-to): 849-858
Number of pages: 10
Journal: ETRI Journal
Volume: 44
Issue number: 5
Publication status: Published - 2022 Oct

Bibliographical note

Funding Information:
This work was supported in part by the Institute for Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2018‐0‐00195) and in part by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2020R1A2C2003500).

Publisher Copyright:
Copyright © 2022 Electronics and Telecommunications Research Institute.

Keywords

  • dynamic frequency scaling
  • neural processing unit
  • power model

ASJC Scopus subject areas

  • Electronic, Optical and Magnetic Materials
  • General Computer Science
  • Electrical and Electronic Engineering
