MetricGAN-OKD: Multi-Metric Optimization of MetricGAN via Online Knowledge Distillation for Speech Enhancement

  • Wooseok Shin
  • Byung Hoon Lee
  • Jin Sob Kim
  • Hyun Joon Park
  • Sung Won Han*
  • *Corresponding author for this work

    Research output: Contribution to journal › Conference article › peer-review

    Abstract

    In speech enhancement, MetricGAN-based approaches reduce the discrepancy between the Lp loss and evaluation metrics by utilizing a non-differentiable evaluation metric as the objective function. However, optimizing multiple metrics simultaneously remains challenging owing to the problem of confusing gradient directions. In this paper, we propose an effective multi-metric optimization method in MetricGAN via online knowledge distillation, termed MetricGAN-OKD. MetricGAN-OKD consists of multiple generators and target metrics related by a one-to-one correspondence. This design enables each generator to learn reliably with respect to a single metric while improving its performance on the other metrics by mimicking the other generators. Experimental results on speech enhancement and listening enhancement tasks reveal that the proposed method significantly improves performance in terms of multiple metrics compared to existing multi-metric optimization methods. Further, the good performance of MetricGAN-OKD is explained in terms of network generalizability and correlation between metrics.

    Original language: English
    Pages (from-to): 31521-31538
    Number of pages: 18
    Journal: Proceedings of Machine Learning Research
    Volume: 202
    Publication status: Published - 2023
    Event: 40th International Conference on Machine Learning, ICML 2023 - Honolulu, United States
    Duration: 2023 Jul 23 to 2023 Jul 29

    Bibliographical note

    Publisher Copyright:
    © 2023 Proceedings of Machine Learning Research. All rights reserved.

    ASJC Scopus subject areas

    • Artificial Intelligence
    • Software
    • Control and Systems Engineering
    • Statistics and Probability
