Abstract
Serving deep neural network (DNN) models requires substantial memory and incurs relatively long response times. Researchers are now integrating DNN inference services with serverless architectures, since such workloads are short-lived and bursty and therefore well suited to serverless execution. In addition, DNN services are increasingly deployed to edge clouds to reduce response time. However, serving DNN models in a serverless manner suffers from excessive memory usage caused by data duplication, a problem that is even more serious on strongly resource-constrained edge clouds. To address this problem, we designed ShmFaas on an open-source serverless platform running on Kubernetes with minimal code changes. First, we implemented a serverless system with lightweight memory isolation that shares DNN models in memory, avoiding the model duplication problem. We also designed an LRU-based model eviction algorithm for efficient memory usage on the edge cloud. In our experiments, the system's memory usage is reduced by more than 29.4% compared to the existing system, and the overhead introduced by the proposed system is small enough to be negligible for real-world workloads.
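The abstract does not detail the eviction mechanism; as a rough illustration of the LRU-based model eviction idea it describes, the following Python sketch tracks cached models by recency and evicts the least recently used ones when a memory budget is exceeded. All names (`ModelCache`, `capacity_mb`, etc.) and the structure are hypothetical, not ShmFaas's actual implementation.

```python
from collections import OrderedDict

class ModelCache:
    """Illustrative LRU cache for in-memory DNN models (hypothetical,
    not the paper's implementation). Models are keyed by name and
    ordered by recency; least-recently-used models are evicted when
    the configured memory budget would be exceeded."""

    def __init__(self, capacity_mb: float):
        self.capacity_mb = capacity_mb
        self.used_mb = 0.0
        self.models = OrderedDict()  # name -> size in MB, oldest first

    def get(self, name: str):
        """Return a cached model's size and mark it most recently used."""
        if name not in self.models:
            return None
        self.models.move_to_end(name)  # refresh recency on access
        return self.models[name]

    def put(self, name: str, size_mb: float):
        """Insert a model, evicting LRU entries until it fits the budget."""
        if name in self.models:
            self.used_mb -= self.models.pop(name)
        while self.models and self.used_mb + size_mb > self.capacity_mb:
            _, evicted_size = self.models.popitem(last=False)  # drop LRU
            self.used_mb -= evicted_size
        self.models[name] = size_mb
        self.used_mb += size_mb
```

For example, with a 100 MB budget, caching two 60 MB models causes the first to be evicted when the second is inserted; the real system would additionally map the shared model memory into each function instance rather than duplicating it per instance.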
Original language | English |
---|---|
Title of host publication | 2023 IEEE International Conference on Consumer Electronics, ICCE 2023 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Electronic) | 9781665491303 |
DOIs | |
Publication status | Published - 2023 |
Event | 2023 IEEE International Conference on Consumer Electronics, ICCE 2023 - Las Vegas, United States |
Duration | 2023 Jan 6 → 2023 Jan 8 |
Publication series
Name | Digest of Technical Papers - IEEE International Conference on Consumer Electronics |
---|---|
Volume | 2023-January |
ISSN (Print) | 0747-668X |
Conference
Conference | 2023 IEEE International Conference on Consumer Electronics, ICCE 2023 |
---|---|
Country/Territory | United States |
City | Las Vegas |
Period | 23/1/6 → 23/1/8 |
Bibliographical note
Funding Information: This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2019R1A2C1006754).
Funding Information: This work was supported by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2022-0-00983, Development of an edge cloud-based vehicle sharing platform that supports user-specific automotive healthcare services).
Publisher Copyright:
© 2023 IEEE.
Keywords
- Cloud Computing
- DNN Inference
- Serverless
ASJC Scopus subject areas
- Industrial and Manufacturing Engineering
- Electrical and Electronic Engineering