Two-Phase Split Computing Framework in Edge-Cloud Continuum

Haneul Ko, Bokyeong Kim, Yumi Kim, Sangheon Pack*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

Split computing is a promising approach to reduce the inference latency of deep neural network (DNN) models. In this article, we propose a two-phase split computing framework (TSCF). In TSCF, for vertical interlayer splitting between the computing nodes at different levels (e.g., central and edge clouds), a shortest path problem in a directed graph is formulated and a pruning-based low-complexity solution is devised. In addition, for horizontal intralayer splitting between the computing nodes at the same level (e.g., edge clouds), the execution units of a specific layer are further divided and distributed to the computing nodes at the same level in proportion to their available resources. The evaluation results demonstrate that, by efficiently using the resources of distributed computing nodes, TSCF can reduce inference latency by more than 38.8% compared with the traditional interlayer splitting scheme. The results also show that the pruning-based low-complexity solution achieves near-optimal inference latency.
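The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`best_placement`, `split_units`), the cost model (a per-layer compute time for each level plus a per-layer transfer time between levels), and the largest-remainder rounding are all assumptions made for illustration, and the pruning step mentioned in the abstract is not reproduced. The interlayer phase is shown as a plain Dijkstra search over (layer, level) nodes; the intralayer phase divides a layer's execution units in proportion to each node's available resources.

```python
import heapq


def best_placement(compute, transfer):
    """Interlayer phase (illustrative cost model, not taken from the paper).

    compute[i][t]     : time to run layer i on level t
    transfer[i][a][b] : time to move layer i's output from level a to level b
                        (0 when a == b)
    Returns (total latency, level chosen for each layer).
    """
    n_layers, n_levels = len(compute), len(compute[0])
    # Graph nodes are (layer, level); a virtual source reaches every (0, t).
    heap = [(compute[0][t], 0, t, (t,)) for t in range(n_levels)]
    heapq.heapify(heap)
    done = set()
    while heap:
        cost, layer, level, path = heapq.heappop(heap)
        if (layer, level) in done:
            continue
        done.add((layer, level))
        if layer == n_layers - 1:
            # First last-layer node popped is the shortest path (costs >= 0).
            return cost, list(path)
        for nxt in range(n_levels):
            step = transfer[layer][level][nxt] + compute[layer + 1][nxt]
            heapq.heappush(heap, (cost + step, layer + 1, nxt, path + (nxt,)))
    raise ValueError("model has no layers")


def split_units(num_units, resources):
    """Intralayer phase: share execution units in proportion to resources.

    Largest-remainder rounding is an assumption; it keeps the total exact.
    """
    total = sum(resources)
    exact = [num_units * r / total for r in resources]
    shares = [int(x) for x in exact]
    leftover = num_units - sum(shares)
    # Give the remaining units to the nodes with the largest fractional parts.
    order = sorted(range(len(resources)),
                   key=lambda i: exact[i] - shares[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


# Hypothetical numbers: 3 layers, 2 levels (0 = edge cloud, 1 = central cloud).
compute = [[1, 5], [4, 1], [4, 1]]
transfer = [[[0, 1], [1, 0]], [[0, 1], [1, 0]]]
latency, placement = best_placement(compute, transfer)

# Hypothetical layer with 12 execution units spread over three edge nodes.
shares = split_units(12, [1, 2, 3])
```

With these made-up inputs, the search keeps the first layer at the edge and moves the remaining layers to the central cloud, and the 12 execution units are shared 2, 4, and 6 across the three edge nodes.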

Original language: English
Pages (from-to): 21741-21749
Number of pages: 9
Journal: IEEE Internet of Things Journal
Volume: 11
Issue number: 12
Publication status: Published - 2024 Jun 15

Bibliographical note

Publisher Copyright:
© 2014 IEEE.

Keywords

  • Deep neural network (DNN)
  • inference latency
  • interlayer splitting
  • intralayer splitting
  • two-phase split computing

ASJC Scopus subject areas

  • Signal Processing
  • Information Systems
  • Hardware and Architecture
  • Computer Science Applications
  • Computer Networks and Communications
