A survey on parallel training algorithms for deep neural networks

Dongsuk Yook, Hyowon Lee, In Chul Yoo

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

Because training Deep Neural Networks (DNNs) typically requires a large amount of training data, a parallel training approach is needed. The Stochastic Gradient Descent (SGD) algorithm is one of the most widely used methods for training DNNs. However, because SGD is an inherently sequential process, parallelizing it requires some form of approximation. In this paper, we review various efforts to parallelize the SGD algorithm and analyze their computational overhead, communication overhead, and the effects of the approximations.
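As an illustration of why parallelizing SGD involves an approximation, the following is a minimal sketch and is not taken from the paper. It assumes a one-parameter linear model with squared error, and it simulates the workers in a single process. All function names (`grad`, `sequential_sgd`, `data_parallel_step`, `train_parallel`) and the hyperparameters are hypothetical. Sequential SGD applies each update to the weight produced by the previous update, whereas the synchronous data-parallel step evaluates every gradient at the same weight and averages the results, so the two procedures generally follow different trajectories.

```python
import random

def grad(w, x, y):
    """Gradient of the squared error 0.5 * (w*x - y)**2 with respect to w."""
    return (w * x - y) * x

def sequential_sgd(w, data, lr):
    """Plain SGD: each update uses the weight produced by the previous one."""
    for x, y in data:
        w -= lr * grad(w, x, y)
    return w

def data_parallel_step(w, data, lr, num_workers):
    """One synchronous step: every simulated worker sees the SAME weight w."""
    # Split the batch into equal-sized shards, one per worker (assumption:
    # len(data) is a multiple of num_workers).
    shards = [data[i::num_workers] for i in range(num_workers)]
    # Each worker averages the gradient over its own shard.
    local = [sum(grad(w, x, y) for x, y in s) / len(s) for s in shards]
    # Averaging the local results stands in for an all-reduce across workers.
    g = sum(local) / num_workers
    return w - lr * g

def train_parallel(w, data, lr, num_workers, batch_size, epochs):
    """Repeat synchronous data-parallel steps over consecutive mini-batches."""
    for _ in range(epochs):
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            w = data_parallel_step(w, batch, lr, num_workers)
    return w

random.seed(0)
true_w = 3.0
data = [(x, true_w * x) for x in (random.uniform(-1, 1) for _ in range(64))]

# One pass of sequential SGD versus one synchronous step over the same data.
w_seq = sequential_sgd(0.0, data, lr=0.1)
w_par = data_parallel_step(0.0, data, lr=0.1, num_workers=4)

# Many synchronous steps still recover the underlying weight on this toy task.
w_trained = train_parallel(0.0, data, lr=0.5, num_workers=4,
                           batch_size=16, epochs=50)
```

In this sketch the averaged shard gradient equals the full-batch gradient, so the number of workers does not change the result of a single step. The approximation relative to sequential SGD comes from evaluating all gradients at one shared weight. A real system would also pay a communication cost for the averaging step, which the simulation does not model.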

Original language: English
Pages (from-to): 505-514
Number of pages: 10
Journal: Journal of the Acoustical Society of Korea
Volume: 39
Issue number: 6
Publication status: Published - 2020

Bibliographical note

Publisher Copyright:
Copyright © 2020 The Acoustical Society of Korea.

Keywords

  • Deep Neural Network (DNN)
  • Deep learning
  • Parallel processing
  • Stochastic Gradient Descent (SGD)

ASJC Scopus subject areas

  • Signal Processing
  • Instrumentation
  • Acoustics and Ultrasonics
  • Applied Mathematics
  • Speech and Hearing
