Abstract
Bottleneck features have been shown to be effective in improving the accuracy of speaker recognition, language identification and automatic speech recognition. However, few works have focused on bottleneck features for acoustic event recognition. This paper proposes a novel acoustic event recognition framework using bottleneck features derived from a Deep Neural Network (DNN). In addition to conventional features (MFCC, Mel-spectrum, etc.), this paper employs rhythm, timbre, and spectrum-statistics features for effectively extracting acoustic characteristics from audio signals. The effectiveness of the proposed method is demonstrated on a database of real life recordings via experiments, and its robust performance is verified by comparing to conventional methods.
Original language | English |
---|---|
Pages (from-to) | 2954-2957 |
Number of pages | 4 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Volume | 08-12-September-2016 |
DOIs | |
Publication status | Published - 2016 |
Event | 17th Annual Conference of the International Speech Communication Association, INTERSPEECH 2016 - San Francisco, United States Duration: 2016 Sept 8 → 2016 Sept 16 |
Bibliographical note
Publisher Copyright:Copyright © 2016 ISCA.
Keywords
- Acoustic event recognition
- Bottleneck feature
- Deep belief network
- Deep neural network
- Feature extraction
ASJC Scopus subject areas
- Language and Linguistics
- Human-Computer Interaction
- Signal Processing
- Software
- Modelling and Simulation