Efficient adversarial audio synthesis via progressive upsampling

Youngwoo Cho, Minwook Chang, Sanghyeon Lee, Hyoungwoo Lee, Gerard Jounghyun Kim, Jaegul Choo

Research output: Contribution to journal › Conference article › peer-review

1 Citation (Scopus)

Abstract

This paper proposes a novel generative model, the progressive upsampling GAN (PUGAN), which progressively synthesizes high-quality audio as a raw waveform. PUGAN generates higher-resolution output progressively by stacking multiple encoder-decoder architectures. Whereas WaveGAN, an existing state-of-the-art model, uses a single decoder architecture, our model generates audio signals and converts them to a higher resolution in a progressive manner while using significantly fewer parameters than WaveGAN (e.g., 3.17x fewer for 16 kHz output). Our experiments show that audio signals can be generated in real time with quality comparable to that of WaveGAN in terms of inception scores and human perception.
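The progressive scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the learned encoder-decoder stages are replaced here by plain linear interpolation, and the function names, the 4 kHz starting rate, and the factor-of-two step per stage are assumptions chosen for illustration (the abstract states only the 16 kHz output rate).

```python
def upsample_2x(signal):
    """Double the number of samples by linear interpolation.

    Hypothetical stand-in for one learned upsampling stage; the real
    model would use a trained encoder-decoder network here.
    """
    out = []
    for i, x in enumerate(signal):
        # Use the next sample as the interpolation target; repeat the last one.
        nxt = signal[i + 1] if i + 1 < len(signal) else x
        out.append(x)
        out.append((x + nxt) / 2.0)
    return out


def progressive_upsample(signal, base_rate, target_rate):
    """Return [(sample_rate, waveform), ...] from base_rate up to target_rate.

    Each stage doubles the sample rate (an assumption of this sketch).
    """
    if base_rate <= 0:
        raise ValueError("base_rate must be positive")
    stages = [(base_rate, list(signal))]
    rate = base_rate
    while rate < target_rate:
        rate *= 2
        stages.append((rate, upsample_2x(stages[-1][1])))
    if rate != target_rate:
        raise ValueError("target_rate must be base_rate times a power of two")
    return stages


# Example: a two-sample signal taken from 4 kHz to 16 kHz in two stages.
stages = progressive_upsample([0.0, 1.0], base_rate=4000, target_rate=16000)
rates = [rate for rate, _ in stages]
```

Each entry in `stages` holds the waveform at one resolution, so intermediate outputs remain available for inspection, which mirrors the idea of generating low-resolution audio first and refining it step by step.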

Original language: English
Pages (from-to): 3410-3414
Number of pages: 5
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2021-June
Publication status: Published - 2021
Event: 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2021 - Virtual, Toronto, Canada
Duration: 2021 Jun 6 → 2021 Jun 11

Bibliographical note

Funding Information:
Acknowledgements. This work was supported by an Electronics and Telecommunications Research Institute (ETRI) grant funded by the Korean government (20ZS1200, Fundamental Technology Research for Human-Centric Autonomous Intelligent Systems) and an Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2019-0-00075, Artificial Intelligence Graduate School Program (KAIST)).

Publisher Copyright:
© 2021 IEEE

Keywords

  • Generative adversarial networks (GANs)
  • Real-time sound effect synthesis

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
