Abstract
Adaptive text-to-speech (TTS) has attracted increasing interest as a way to train TTS systems without large amounts of high-quality data. Nevertheless, existing adaptive TTS systems still show low adaptation quality for novel speakers, since it is hard to learn an extensive speaking style from limited data. To address this issue, we propose the progressive variational autoencoder (PVAE), which generates data while gradually adapting to style. PVAE learns a progressively style-normalized representation, which is a key component of progressive style adaptation. We extend PVAE to PVAE-TTS, a multi-speaker adaptive TTS model that generates natural speech with high adaptation quality for novel speakers. To further improve the adaptation quality, we also propose dynamic style layer normalization (DSLN), which utilizes a convolution operation. The experimental results demonstrate the superiority of PVAE-TTS in both subjective and objective evaluations.
Original language | English |
---|---|
Title of host publication | 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 6312-6316 |
Number of pages | 5 |
ISBN (Electronic) | 9781665405409 |
DOIs | |
Publication status | Published - 2022 |
Event | 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Virtual, Online, Singapore; Duration: 2022 May 23 → 2022 May 27 |
Publication series
Name | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
---|---|
Volume | 2022-May |
ISSN (Print) | 1520-6149 |
Conference
Conference | 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 |
---|---|
Country/Territory | Singapore |
City | Virtual, Online |
Period | 2022 May 23 → 2022 May 27 |
Bibliographical note
Publisher Copyright: © 2022 IEEE
Keywords
- adaptive TTS
- speaker adaptation
- speech synthesis
- text-to-speech
ASJC Scopus subject areas
- Software
- Signal Processing
- Electrical and Electronic Engineering