Linear spectral transformation for robust speech recognition using maximum mutual information

Donghyun Kim, Dongsuk Yook

    Research output: Contribution to journal, Article, peer-review

    14 Citations (Scopus)

    Abstract

    This paper presents a transformation-based rapid adaptation technique for robust speech recognition using a linear spectral transformation (LST) and a maximum mutual information (MMI) criterion. Previously, a maximum likelihood linear spectral transformation (ML-LST) algorithm was proposed for fast adaptation in unknown environments. Since the MMI estimation method does not require evenly distributed training data and increases the a posteriori probability of the word sequences of the training data, we combine the linear spectral transformation method and the MMI estimation technique in order to achieve extremely rapid adaptation using only one word of adaptation data. The proposed algorithm, called MMI-LST, was implemented using the extended Baum-Welch algorithm and phonetic lattices, and evaluated on the TIMIT and FFMTIMIT corpora. It provides a relative reduction in the speech recognition error rate of 11.1% using only 0.25 s of adaptation data.
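
    For readers unfamiliar with the criterion named in the abstract, the standard textbook form of the MMI objective can be sketched as follows. The notation is an assumption made here for illustration and is not taken from the paper: O_r is the r-th adaptation utterance, w_r its reference transcription, M_w the acoustic model for word sequence w, P(w) the language model probability, and W the parameters of the transformation being estimated. In lattice-based implementations such as the one the abstract describes, the denominator sum is typically approximated over the hypotheses in a lattice.

    ```latex
    % Standard MMI objective; symbols are illustrative assumptions, not the paper's notation.
    \[
      F_{\mathrm{MMI}}(W) \;=\; \sum_{r=1}^{R}
        \log \frac{p\!\left(O_r \mid M_{w_r}, W\right) P(w_r)}
                  {\sum_{w'} p\!\left(O_r \mid M_{w'}, W\right) P(w')}
    \]
    ```

    Maximizing the numerator alone would correspond to maximum likelihood estimation; the denominator is what makes the criterion discriminative, since it penalizes probability mass assigned to competing hypotheses.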

    Original language: English
    Pages (from-to): 496-499
    Number of pages: 4
    Journal: IEEE Signal Processing Letters
    Volume: 14
    Issue number: 7
    Publication status: Published - 2007 Jul

    Bibliographical note

    Funding Information:
    Manuscript received August 1, 2006; revised November 1, 2006. This work was supported by Grant R01-2006-000-11162-0 from the Basic Research Program of the Korea Science and Engineering Foundation. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Steve Renals.

    Keywords

    • Linear spectral transformation
    • Maximum mutual information (MMI)
    • Rapid adaptation
    • Robust speech recognition

    ASJC Scopus subject areas

    • Signal Processing
    • Electrical and Electronic Engineering
    • Applied Mathematics
