Abstract
Serving deep neural network (DNN) models requires substantial memory and incurs relatively long response times. Researchers are now integrating DNN inference services with serverless architectures, whose short-lived, bursty workloads are well suited to serverless execution. In addition, DNN services are increasingly deployed on edge clouds to reduce response time. However, serving DNN models in a serverless manner suffers from excessive memory usage caused by data duplication, a problem that is even more serious on strongly resource-constrained edge clouds. To address this problem, we designed ShmFaas on an open-source serverless platform running on Kubernetes, with minimal code changes. First, we implemented a serverless system with lightweight memory isolation that shares DNN models in memory, avoiding the model duplication problem. We also designed an LRU-based model eviction algorithm for efficient memory usage on the edge cloud. In our experiments, the system's memory usage is reduced by more than 29.4% compared to the existing system, and the overhead introduced by the proposed system is negligible enough for real-world workloads.
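The abstract mentions an LRU-based model eviction algorithm for keeping shared in-memory models within an edge cloud's memory budget. The following is a minimal sketch of that idea only; it is not ShmFaas's actual implementation, and the class name, `capacity_bytes` parameter, and model sizes are assumptions for illustration.

```python
from collections import OrderedDict

class ModelCache:
    """Illustrative LRU cache for shared in-memory DNN models.

    Models are kept in recency order; when inserting a model would
    exceed the byte budget, least-recently-used models are evicted.
    (Hypothetical sketch, not the paper's implementation.)
    """

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._models = OrderedDict()  # name -> (model, size_bytes)

    def get(self, name):
        """Return a cached model and mark it most-recently used."""
        if name not in self._models:
            return None
        self._models.move_to_end(name)
        return self._models[name][0]

    def put(self, name, model, size_bytes):
        """Insert a model, evicting LRU models until it fits."""
        if name in self._models:
            self.used_bytes -= self._models.pop(name)[1]
        while self._models and self.used_bytes + size_bytes > self.capacity_bytes:
            _, (_, evicted_size) = self._models.popitem(last=False)  # LRU end
            self.used_bytes -= evicted_size
        self._models[name] = (model, size_bytes)
        self.used_bytes += size_bytes
```

In a shared-memory setting, eviction would additionally need to check that no in-flight invocation still maps the model segment before freeing it; that bookkeeping is omitted here for brevity.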
| Original language | English |
|---|---|
| Title of host publication | 2023 IEEE International Conference on Consumer Electronics, ICCE 2023 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| ISBN (Electronic) | 9781665491303 |
| DOIs | |
| Publication status | Published - 2023 |
| Event | 2023 IEEE International Conference on Consumer Electronics, ICCE 2023 - Las Vegas, United States Duration: 2023 Jan 6 → 2023 Jan 8 |
Publication series
| Name | Digest of Technical Papers - IEEE International Conference on Consumer Electronics |
|---|---|
| Volume | 2023-January |
| ISSN (Print) | 0747-668X |
| ISSN (Electronic) | 2159-1423 |
Conference
| Conference | 2023 IEEE International Conference on Consumer Electronics, ICCE 2023 |
|---|---|
| Country/Territory | United States |
| City | Las Vegas |
| Period | 23/1/6 → 23/1/8 |
Bibliographical note
Publisher Copyright: © 2023 IEEE.
Keywords
- Cloud Computing
- DNN Inference
- Serverless
ASJC Scopus subject areas
- Industrial and Manufacturing Engineering
- Electrical and Electronic Engineering
Title: Improving Memory Utilization by Sharing DNN Models for Serverless Inference