Improving Memory Utilization by Sharing DNN Models for Serverless Inference

Myung Hyun Kim, Jaehak Lee, Heonchang Yu, Eunyoung Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Serving deep neural network (DNN) models requires a large amount of memory and incurs relatively long response times. Researchers are increasingly integrating DNN inference services with serverless architectures, since the short-lived, bursty nature of inference workloads suits the serverless model. In addition, DNN services are increasingly deployed to edge clouds to reduce response time. However, serving DNN models in a serverless manner suffers from excessive memory usage caused by data duplication, a problem that is even more serious on strongly resource-constrained edge clouds. To address this problem, we designed ShmFaas on an open-source serverless platform running on Kubernetes, with minimal code changes. First, we implemented lightweight memory isolation that shares DNN models in memory, avoiding the model duplication problem. We also designed an LRU-based model eviction algorithm for efficient memory usage on the edge cloud. In our experiments, the system's memory usage was reduced by more than 29.4% compared to the existing system, and the overhead introduced by the proposed system was negligible enough for real-world workloads.
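
The paper's source code is not included here, but the two mechanisms the abstract describes can be illustrated together. The following Python sketch is an assumption-laden illustration, not the authors' ShmFaas implementation: it keeps serialized model blobs in named POSIX shared memory, so co-located inference workers can attach to a single copy instead of each loading its own, and it evicts the least recently used blob when a memory budget is exceeded. All names (SharedModelCache, publish, MODEL_BUDGET_BYTES) are hypothetical.

    from collections import OrderedDict
    from multiprocessing import shared_memory

    # Hypothetical per-node budget for resident models (not from the paper).
    MODEL_BUDGET_BYTES = 512 * 1024 * 1024


    class SharedModelCache:
        """LRU cache of serialized DNN models kept in named shared memory."""

        def __init__(self, budget: int = MODEL_BUDGET_BYTES):
            self.budget = budget
            self.used = 0
            # Insertion order doubles as recency order: the oldest entry is first.
            self.segments: "OrderedDict[str, shared_memory.SharedMemory]" = OrderedDict()

        def publish(self, name: str, blob: bytes) -> str:
            """Copy one serialized model into shared memory, evicting the
            least recently used entries until the new blob fits the budget."""
            while self.segments and self.used + len(blob) > self.budget:
                _, victim = self.segments.popitem(last=False)  # LRU entry
                self.used -= victim.size
                victim.close()
                victim.unlink()  # frees the segment (e.g., under /dev/shm)
            shm = shared_memory.SharedMemory(name=name, create=True, size=len(blob))
            shm.buf[: len(blob)] = blob
            self.segments[name] = shm
            self.used += shm.size
            return shm.name  # workers attach to the model by this name

        def touch(self, name: str) -> None:
            """Mark a model as recently used so it is evicted last."""
            self.segments.move_to_end(name)


    def attach(name: str) -> shared_memory.SharedMemory:
        """In a worker on the same node: map the already-published segment
        instead of loading a private copy; deserialize weights from .buf."""
        return shared_memory.SharedMemory(name=name)

A real deployment would also need reference counting so that a model still in use by a worker is never unlinked; the sketch omits that for brevity.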

Original language: English
Title of host publication: 2023 IEEE International Conference on Consumer Electronics, ICCE 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665491303
DOIs
Publication status: Published - 2023
Event: 2023 IEEE International Conference on Consumer Electronics, ICCE 2023 - Las Vegas, United States
Duration: 2023 Jan 6 - 2023 Jan 8

Publication series

Name: Digest of Technical Papers - IEEE International Conference on Consumer Electronics
Volume: 2023-January
ISSN (Print): 0747-668X

Conference

Conference: 2023 IEEE International Conference on Consumer Electronics, ICCE 2023
Country/Territory: United States
City: Las Vegas
Period: 23/1/6 - 23/1/8

Bibliographical note

Funding Information:
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2019R1A2C1006754).

Funding Information:
This work was supported by the Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2022-0-00983, Development of an edge cloud-based vehicle sharing platform that supports user-specific automotive healthcare services).

Publisher Copyright:
© 2023 IEEE.

Keywords

  • Cloud Computing
  • DNN Inference
  • Serverless

ASJC Scopus subject areas

  • Industrial and Manufacturing Engineering
  • Electrical and Electronic Engineering
