A time delay convolutional neural network for acoustic scene classification

Younglo Lee, Sangwook Park, Hanseok Ko

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

In recent years, demand for more natural interaction between humans and machines through speech has been increasing. To achieve this, it is increasingly important for a machine to understand the human's contextual situation. This paper proposes a novel neural network framework, deployable on commercial smart devices equipped with microphones, for recognizing acoustic contextual information. Our approach exploits the fact that an acoustic signal exhibits more local connectivity along the time axis than along the frequency axis. Experimental results show that the proposed method outperforms two conventional approaches, Gaussian Mixture Models (GMMs) and the Multi-Layer Perceptron (MLP), by 8.6% and 7.8% respectively in overall accuracy.
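The time-axis locality mentioned in the abstract can be sketched as a convolution whose kernel spans the full frequency range but only a short temporal context, so the network is locally connected (with shared weights) along time only. All shapes, sizes, and the function name below are illustrative assumptions, not details from the paper.

```python
import numpy as np

def time_delay_conv(spectrogram, kernels):
    """Hypothetical time-delay convolution sketch.

    spectrogram: (n_freq, n_frames) time-frequency representation
    kernels:     (n_filters, n_freq, context) filters that cover the
                 full frequency axis but only `context` frames in time
    returns:     (n_filters, n_frames - context + 1) feature map
    """
    n_freq, n_frames = spectrogram.shape
    n_filters, k_freq, context = kernels.shape
    assert k_freq == n_freq, "kernel spans the whole frequency axis"
    n_out = n_frames - context + 1
    out = np.empty((n_filters, n_out))
    for t in range(n_out):
        window = spectrogram[:, t:t + context]  # local only in time
        # Contract each filter against the current time window.
        out[:, t] = np.tensordot(kernels, window, axes=([1, 2], [0, 1]))
    return out

spec = np.random.randn(40, 100)   # e.g. 40 mel bins, 100 frames (assumed sizes)
filt = np.random.randn(8, 40, 5)  # 8 filters with a 5-frame temporal context
features = time_delay_conv(spec, filt)
print(features.shape)  # (8, 96)
```

Because each kernel already covers every frequency bin, the convolution slides along the time axis only, which is the structural assumption the paper's abstract highlights.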

Original language: English
Title of host publication: 2018 IEEE International Conference on Consumer Electronics, ICCE 2018
Editors: Saraju P. Mohanty, Peter Corcoran, Hai Li, Anirban Sengupta, Jong-Hyouk Lee
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1-3
Number of pages: 3
ISBN (Electronic): 9781538630259
DOIs
Publication status: Published - 2018 Mar 26
Event: 2018 IEEE International Conference on Consumer Electronics, ICCE 2018 - Las Vegas, United States
Duration: 2018 Jan 12 - 2018 Jan 14

Publication series

Name: 2018 IEEE International Conference on Consumer Electronics, ICCE 2018
Volume: 2018-January

Other

Other: 2018 IEEE International Conference on Consumer Electronics, ICCE 2018
Country/Territory: United States
City: Las Vegas
Period: 18/1/12 - 18/1/14

Bibliographical note

Publisher Copyright:
© 2018 IEEE.

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Electrical and Electronic Engineering
  • Media Technology
