Abstract
Source separation aims to separate multiple sounds from mixed audio. Recent source separation studies mostly take a time-domain approach, and most address speech separation in clean environments. Few studies have advanced animal sound separation, where recordings contain various environmental noises. To address this gap, we propose a novel method for separating animal sounds from a mixture of two overlapping sources and background noise. The proposed method accounts for real-world conditions by adding background noise. The proposed model adds a classification network to the dual-path recurrent neural network (DPRNN). Specifically, the mixed audio is separated using a mask for a single source estimated within the DPRNN. The separated source is then converted into a mel-spectrogram for feature representation. The resulting feature is fed to a classification network, whose output is used to verify the separation performance. The experimental results confirm that the proposed method achieves better separation performance than DPRNN alone.
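The data flow around the separator can be illustrated with a minimal NumPy sketch. It covers only two steps described in the abstract: building a two-source mixture with added background noise, and converting a waveform into a log mel-spectrogram feature. The DPRNN mask estimator and the classification network are not implemented here. All function names, the synthetic sine-wave "sources", the 5 dB signal-to-noise ratio, and the spectrogram parameters (16 kHz sampling rate, 512-point FFT, 40 mel bands) are illustrative assumptions, not values taken from the paper.

```python
import numpy as np


def mix_with_noise(src_a, src_b, noise, snr_db):
    """Sum two sources and add noise scaled to the requested SNR in dB."""
    clean = src_a + src_b
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    # Choose the scale so that 10*log10(clean_power / scaled_noise_power) == snr_db.
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise


def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters with shape (n_mels, n_fft // 2 + 1)."""
    freqs = np.linspace(0.0, sr / 2, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2))
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def log_mel_spectrogram(wave, sr=16000, n_fft=512, hop=256, n_mels=40):
    """Log mel-spectrogram with shape (n_mels, n_frames)."""
    n_frames = 1 + (len(wave) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = wave[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T
    return np.log(mel + 1e-10).T


# Synthetic stand-ins for two animal calls and background noise (assumptions).
sr = 16000
t = np.arange(sr) / sr
src_a = np.sin(2 * np.pi * 440.0 * t)
src_b = np.sin(2 * np.pi * 1200.0 * t)
noise = np.random.default_rng(0).standard_normal(sr)

clean = src_a + src_b
mix = mix_with_noise(src_a, src_b, noise, snr_db=5.0)

# In the full pipeline a separator would estimate one source from `mix`;
# here src_a stands in for that separated output.
features = log_mel_spectrogram(src_a, sr=sr)
print(features.shape)  # (40, 61)
```

In a full system, the array passed to `log_mel_spectrogram` would be the separator's output rather than the ground-truth source, and `features` would be the input to the classifier.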
Original language | English |
---|---|
Journal | Proceedings of the International Congress on Acoustics |
Publication status | Published - 2022 |
Event | 24th International Congress on Acoustics, ICA 2022 - Gyeongju, Korea, Republic of. Duration: 2022 Oct 24 → 2022 Oct 28 |
Bibliographical note
Funding Information: This work was supported by the Korea Environment Industry & Technology Institute (KEITI) through the Exotic Invasive Species Management Program, funded by the Korea Ministry of Environment (MOE) (2021002280004).
Publisher Copyright:
© ICA 2022. All rights reserved.
Keywords
- Animal sound separation
- Deep learning
- Time domain
ASJC Scopus subject areas
- Mechanical Engineering
- Acoustics and Ultrasonics