Accelerating Convolutional Neural Network Inference in Split Computing: An In-Network Computing Approach

Hochan Lee, Haneul Ko, Chanbin Bae, Sangheon Pack

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

Since the latest deep neural network (DNN) models are complex and have many layers, processing an entire DNN model on mobile devices is challenging. To cope with this challenge, a split computing (SC) approach has been proposed, which divides a DNN model into multiple layers and distributes them to mobile devices and edge servers. Meanwhile, in-network computing (INC) is a promising technology that offloads computational tasks to network devices (e.g., programmable switches) and thus provides low latency and line-rate packet processing. Although a switch cannot directly process complex DNN models due to its limited computing and memory resources, it has the potential to process specific layers that require only simple arithmetic operations. For example, processing the max-pooling layer of convolutional neural network (CNN) models can be offloaded to the switch. In this paper, we consider a network with three types of computing nodes: mobile devices, edge servers, and switches, and formulate the problem of placing the layers of the CNN model on the computing nodes to minimize the inference latency while respecting the resource constraints of the computing nodes. Then, we derive the optimal results by solving the formulated optimization problem. Evaluation results demonstrate that the optimal placement achieves lower inference latency than a random layer placement scheme and a server-only placement scheme.
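To illustrate the kind of layer-placement problem the abstract describes, the sketch below brute-forces two split points that assign a prefix of CNN layers to the mobile device, a middle segment of switch-capable layers (e.g., max-pooling) to the programmable switch, and the remainder to the edge server, minimizing compute plus transmission latency under a switch memory budget. This is a hypothetical toy model, not the paper's actual formulation; all cost numbers and constraints are invented for illustration.

```python
# Hypothetical toy model of latency-minimizing layer placement across
# device -> switch -> server (all numbers invented for illustration).

# Per-layer compute latency (ms) on each node type; None marks layers the
# switch cannot execute (e.g., convolutions), while simple layers such as
# max-pooling (indices 2 and 4 here) are switch-capable.
compute_ms = {
    "device": [1.0, 1.2, 2.0, 6.0, 4.0],
    "switch": [None, None, 0.05, None, 0.2],
    "server": [0.2, 0.3, 0.15, 0.5, 0.4],
}
out_kb = [120, 12, 10, 15, 8]   # activation size (KB) after each layer
INPUT_KB = 150                  # size of the raw input
LINK_MS_PER_KB = {"dev_sw": 0.05, "sw_srv": 0.01}
SWITCH_MEM_KB = 64              # toy switch memory budget

n = len(out_kb)

def boundary_kb(k):
    """Size of the activation crossing a hop when the cut is before layer k."""
    return INPUT_KB if k == 0 else out_kb[k - 1]

def latency(s1, s2):
    """End-to-end latency when layers [0, s1) run on the device,
    [s1, s2) on the switch, and [s2, n) on the server; inf if infeasible.

    Simplification: the boundary activations always traverse both the
    device->switch and switch->server links, since the switch is on-path."""
    total = sum(compute_ms["device"][:s1])
    for i in range(s1, s2):
        if compute_ms["switch"][i] is None:
            return float("inf")       # layer too complex for the switch
        total += compute_ms["switch"][i]
    if sum(out_kb[s1:s2]) > SWITCH_MEM_KB:
        return float("inf")           # switch memory constraint violated
    total += sum(compute_ms["server"][s2:])
    total += boundary_kb(s1) * LINK_MS_PER_KB["dev_sw"]
    total += boundary_kb(s2) * LINK_MS_PER_KB["sw_srv"]
    return total

best = min(((s1, s2) for s1 in range(n + 1) for s2 in range(s1, n + 1)),
           key=lambda p: latency(*p))
print(best, round(latency(*best), 3))   # optimal split points and latency
print(round(latency(0, 0), 3))          # server-only baseline for comparison
```

With these toy numbers the optimal placement runs the first two layers on the device, offloads the max-pooling layer to the switch, and finishes on the server, beating the server-only baseline; in the paper this enumeration is replaced by a formal optimization over resource-constrained nodes.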

Original language: English
Title of host publication: 38th International Conference on Information Networking, ICOIN 2024
Publisher: IEEE Computer Society
Pages: 773-776
Number of pages: 4
ISBN (Electronic): 9798350330946
DOIs
Publication status: Published - 2024
Event: 38th International Conference on Information Networking, ICOIN 2024 - Hybrid, Ho Chi Minh City, Viet Nam
Duration: 2024 Jan 17 - 2024 Jan 19

Publication series

Name: International Conference on Information Networking
ISSN (Print): 1976-7684

Conference

Conference: 38th International Conference on Information Networking, ICOIN 2024
Country/Territory: Viet Nam
City: Hybrid, Ho Chi Minh City
Period: 24/1/17 - 24/1/19

Bibliographical note

Publisher Copyright:
© 2024 IEEE.

Keywords

  • In-network computing
  • Split computing

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
