Abstract
Deep learning inference is increasingly run at the edge. As programming and system stack support matures, it exposes acceleration opportunities in mobile systems, where the performance envelope is scaled up with a plethora of programmable co-processors. Intelligent services designed for mobile users can therefore choose between running inference on the CPU or any of the co-processors in the mobile system, or offloading it to connected systems such as the cloud or a nearby, locally connected mobile device. By doing so, these services can scale out performance and improve the energy efficiency of edge mobile systems. This gives rise to a new challenge: deciding when inference should run where. The execution scaling decision is further complicated by the stochastic nature of the mobile-cloud execution environment, where signal strength variation in wireless networks and resource interference can affect real-time inference performance and system energy efficiency. To enable energy-efficient deep learning inference at the edge, this paper proposes AutoScale, an adaptive and lightweight execution scaling engine built on a custom-designed reinforcement learning algorithm. It continuously learns and selects the most energy-efficient inference execution target by considering the characteristics of neural networks and the available systems in the collaborative cloud-edge execution environment, while adapting to stochastic runtime variance. Real-system implementation and evaluation under realistic execution scenarios demonstrate an average of 9.8x and 1.6x energy efficiency improvement over the baseline mobile CPU and cloud offloading, respectively, while meeting real-time performance and accuracy requirements.
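The learn-and-select loop the abstract describes can be illustrated with a minimal sketch. This is not AutoScale's actual algorithm (the paper uses a custom-designed reinforcement learning formulation); it is a generic epsilon-greedy bandit over hypothetical execution targets, where the reward for each inference might be, for example, a negative energy-delay product observed at runtime. All class, target, and parameter names below are illustrative assumptions.

```python
import random

class ExecutionScaler:
    """Epsilon-greedy selection among inference execution targets.

    A hypothetical stand-in for an execution scaling engine: it keeps a
    running mean reward per target and balances exploring targets with
    exploiting the best estimate, so it can track stochastic runtime
    variance (e.g., changing signal strength or interference).
    """

    def __init__(self, targets, epsilon=0.1):
        self.targets = list(targets)
        self.epsilon = epsilon
        self.q = {t: 0.0 for t in self.targets}  # running mean reward
        self.n = {t: 0 for t in self.targets}    # times each target was chosen

    def select(self):
        # Explore a random target with probability epsilon,
        # otherwise exploit the current best estimate.
        if random.random() < self.epsilon:
            return random.choice(self.targets)
        return max(self.targets, key=lambda t: self.q[t])

    def update(self, target, reward):
        # Incremental mean update from the observed reward of the
        # last inference on `target` (higher reward = more efficient).
        self.n[target] += 1
        self.q[target] += (reward - self.q[target]) / self.n[target]
```

A caller would invoke `select()` before each inference, run the model on the chosen target, measure energy and latency, and feed the resulting reward back through `update()`, continuously adapting the choice as conditions drift.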
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2020 53rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2020 |
| Publisher | IEEE Computer Society |
| Pages | 1082-1096 |
| Number of pages | 15 |
| ISBN (Electronic) | 9781728173832 |
| DOIs | |
| Publication status | Published - 2020 Oct |
| Externally published | Yes |
| Event | 53rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2020 - Virtual, Athens, Greece Duration: 2020 Oct 17 → 2020 Oct 21 |
Publication series
| Name | Proceedings of the Annual International Symposium on Microarchitecture, MICRO |
|---|---|
| Volume | 2020-October |
| ISSN (Print) | 1072-4451 |
Conference
| Conference | 53rd Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2020 |
|---|---|
| Country/Territory | Greece |
| City | Virtual, Athens |
| Period | 20/10/17 → 20/10/21 |
Bibliographical note
Publisher Copyright: © 2020 IEEE Computer Society. All rights reserved.
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 7 Affordable and Clean Energy
ASJC Scopus subject areas
- Hardware and Architecture