Abstract
In speech enhancement, MetricGAN-based approaches reduce the discrepancy between the Lp loss and evaluation metrics by utilizing a non-differentiable evaluation metric as the objective function. However, optimizing multiple metrics simultaneously remains challenging owing to the problem of confusing gradient directions. In this paper, we propose an effective multi-metric optimization method in MetricGAN via online knowledge distillation, called MetricGAN-OKD. MetricGAN-OKD, which consists of multiple generators and target metrics related by a one-to-one correspondence, enables each generator to learn reliably with respect to a single metric while improving performance with respect to the other metrics by mimicking the other generators. Experimental results on speech enhancement and listening enhancement tasks reveal that the proposed method significantly improves performance in terms of multiple metrics compared to existing multi-metric optimization methods. Further, the good performance of MetricGAN-OKD is explained in terms of network generalizability and correlation between metrics.
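The one-to-one pairing of generators and metrics described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the squared-error form of the mimicking term, and the `distill_weight` parameter are all assumptions made for illustration. Only the general idea (each generator is driven by its own metric score and also pulled toward the outputs of the other generators) comes from the abstract.

```python
from typing import List, Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length signals."""
    if len(a) != len(b):
        raise ValueError("signals must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def okd_generator_losses(
    outputs: Sequence[Sequence[float]],
    metric_scores: Sequence[float],
    target_score: float = 1.0,
    distill_weight: float = 0.5,  # hypothetical weighting, not from the paper
) -> List[float]:
    """Toy per-generator loss: own-metric term plus a peer-mimicking term.

    outputs[i]       -- enhanced signal produced by generator i
    metric_scores[i] -- predicted score (assumed normalized to [0, 1]) of
                        metric i for outputs[i], e.g. from a learned surrogate
    """
    if len(outputs) != len(metric_scores):
        raise ValueError("need exactly one metric score per generator")

    losses = []
    for i, (out_i, score_i) in enumerate(zip(outputs, metric_scores)):
        # Own-metric term: push generator i's score toward the target score.
        metric_loss = (score_i - target_score) ** 2

        # Mimicking term (assumed form): average distance to peer outputs.
        peers = [o for j, o in enumerate(outputs) if j != i]
        mimic_loss = (
            sum(mse(out_i, p) for p in peers) / len(peers) if peers else 0.0
        )

        losses.append(metric_loss + distill_weight * mimic_loss)
    return losses
```

In this sketch, each generator receives a gradient signal from only one metric, which is how the abstract motivates avoiding conflicting gradient directions; information about the other metrics arrives only indirectly through the peer term. In a real training loop the peer outputs would typically be treated as fixed targets when computing each generator's update, which is likewise an assumption here.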
| Original language | English |
|---|---|
| Pages (from-to) | 31521-31538 |
| Number of pages | 18 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 202 |
| Publication status | Published - 2023 |
| Event | 40th International Conference on Machine Learning, ICML 2023 - Honolulu, United States. Duration: 2023 Jul 23 → 2023 Jul 29 |
Bibliographical note
Publisher Copyright: © 2023 Proceedings of Machine Learning Research. All rights reserved.
ASJC Scopus subject areas
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability
Title
MetricGAN-OKD: Multi-Metric Optimization of MetricGAN via Online Knowledge Distillation for Speech Enhancement