Class-conditional WaveGAN-based data augmentation scheme for animal sound classification

Eunbeen Kim, Jaeuk Moon, Jonghwa Shim, Eenjun Hwang

Research output: Contribution to journal, Conference article, peer-review

Abstract

Although recent deep learning-based models have shown outstanding performance, they require a large amount of high-quality training data. Generative adversarial networks (GANs) have recently demonstrated the potential to generate virtual data from small amounts of real data. However, in a multi-class environment, the original GAN is expensive because it requires a separate generative model for each class. In this paper, we propose a class-conditional WaveGAN-based data augmentation scheme. In particular, we aim to generate the sounds of animals belonging to the same biological order using a single model. To do so, we first learn common features from the animal sound data of all target classes. Then, by embedding class label information into the WaveGAN, we generate virtual animal sound data that retain the unique characteristics of each target class. To learn the characteristics of real animal sounds effectively, we use a multi-resolution short-time Fourier transform (STFT) loss. To evaluate the proposed scheme, we compare it with traditional signal augmentation methods. Experimental results show that our scheme generates realistic animal sounds and achieves better animal sound classification performance than the other signal augmentation methods across various classification metrics.
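
The abstract does not spell out how the multi-resolution short-time Fourier transform loss is computed. The sketch below shows one common formulation, assuming the loss averages a spectral convergence term and a log-magnitude L1 term over several STFT resolutions. The function names, the three (FFT size, hop size, window length) settings, and the choice of NumPy are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Hypothetical (fft_size, hop_size, win_length) settings; the paper's actual
# resolutions are not given in the abstract.
RESOLUTIONS = [(512, 128, 512), (1024, 256, 1024), (2048, 512, 2048)]


def stft_magnitude(signal, fft_size, hop_size, win_length):
    """Return the magnitude spectrogram, shape (n_frames, fft_size // 2 + 1)."""
    window = np.hanning(win_length)
    n_frames = 1 + (len(signal) - win_length) // hop_size
    frames = np.stack([
        signal[i * hop_size: i * hop_size + win_length] * window
        for i in range(n_frames)
    ])
    # rfft zero-pads each frame to fft_size when win_length < fft_size.
    return np.abs(np.fft.rfft(frames, n=fft_size, axis=1))


def stft_loss(real, fake, fft_size, hop_size, win_length, eps=1e-7):
    """Spectral convergence plus log-magnitude L1 at a single resolution."""
    mag_real = stft_magnitude(real, fft_size, hop_size, win_length)
    mag_fake = stft_magnitude(fake, fft_size, hop_size, win_length)
    # Spectral convergence: Frobenius norm of the error, relative to the target.
    spectral_convergence = (
        np.linalg.norm(mag_real - mag_fake) / (np.linalg.norm(mag_real) + eps)
    )
    # Mean absolute difference between log-magnitude spectrograms.
    log_magnitude = np.mean(
        np.abs(np.log(mag_real + eps) - np.log(mag_fake + eps))
    )
    return spectral_convergence + log_magnitude


def multi_resolution_stft_loss(real, fake, resolutions=RESOLUTIONS):
    """Average the single-resolution STFT loss over all resolutions."""
    return float(np.mean([stft_loss(real, fake, *r) for r in resolutions]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real_clip = rng.standard_normal(16384)  # stand-in for a real sound clip
    fake_clip = rng.standard_normal(16384)  # stand-in for a generated clip
    print(multi_resolution_stft_loss(real_clip, fake_clip))
```

In a training setup this term would typically be written in a differentiable framework and added to the adversarial generator loss. Comparing spectrograms at several window sizes trades off time and frequency resolution, so the generator is penalized for mismatches at both fine and coarse scales.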

Original language: English
Journal: Proceedings of the International Congress on Acoustics
Publication status: Published - 2022
Event: 24th International Congress on Acoustics, ICA 2022 - Gyeongju, Korea, Republic of
Duration: 2022 Oct 24 - 2022 Oct 28

Bibliographical note

Publisher Copyright:
© ICA 2022. All rights reserved.

Keywords

  • Animal sound classification
  • Data augmentation
  • GAN
  • Multi-class environment

ASJC Scopus subject areas

  • Mechanical Engineering
  • Acoustics and Ultrasonics
