TY - GEN
T1 - Self-subtraction network for end to end noise robust classification
AU - Kim, Donghyeon
AU - Han, David K.
AU - Ko, Hanseok
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/9
Y1 - 2019/9
N2 - Acoustic event classification in surveillance applications typically employs deep learning-based end-to-end methods. In real environments, their performance degrades significantly due to noise. While various approaches have been proposed to overcome the noise problem, most of these methodologies rely on supervised learning-based feature representation. A supervised learning system, however, requires pairs of noise-free and noisy audio streams, and acquiring ground-truth and noisy acoustic event data takes significant effort to adequately capture the variety of noise types for training. This paper proposes a novel supervised learning method for noise robust acoustic event classification in an end-to-end framework named Self Subtraction Network (SSN). SSN extracts noise features from an input audio spectrogram and removes them from the input using LSTMs and an auto-encoder. Applied to the UrbanSound8K dataset with eight noise types at four different levels, our method demonstrates improved performance compared to state-of-the-art methods.
AB - Acoustic event classification in surveillance applications typically employs deep learning-based end-to-end methods. In real environments, their performance degrades significantly due to noise. While various approaches have been proposed to overcome the noise problem, most of these methodologies rely on supervised learning-based feature representation. A supervised learning system, however, requires pairs of noise-free and noisy audio streams, and acquiring ground-truth and noisy acoustic event data takes significant effort to adequately capture the variety of noise types for training. This paper proposes a novel supervised learning method for noise robust acoustic event classification in an end-to-end framework named Self Subtraction Network (SSN). SSN extracts noise features from an input audio spectrogram and removes them from the input using LSTMs and an auto-encoder. Applied to the UrbanSound8K dataset with eight noise types at four different levels, our method demonstrates improved performance compared to state-of-the-art methods.
UR - http://www.scopus.com/inward/record.url?scp=85076359160&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85076359160&partnerID=8YFLogxK
U2 - 10.1109/AVSS.2019.8909821
DO - 10.1109/AVSS.2019.8909821
M3 - Conference contribution
AN - SCOPUS:85076359160
T3 - 2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance, AVSS 2019
BT - 2019 16th IEEE International Conference on Advanced Video and Signal Based Surveillance, AVSS 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 16th IEEE International Conference on Advanced Video and Signal Based Surveillance, AVSS 2019
Y2 - 18 September 2019 through 21 September 2019
ER -