Maximum likelihood Linear Dimension Reduction of heteroscedastic feature for robust Speaker Recognition

Suwon Shon, Seongkyu Mun, David K. Han, Hanseok Ko

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    2 Citations (Scopus)

    Abstract

    This paper analyzes heteroscedasticity in i-vectors for robust speaker recognition in forensics and surveillance systems. Linear Discriminant Analysis (LDA), a widely used linear dimension reduction technique, assumes that classes are homoscedastic, that is, that they share the same covariance. In this paper, it is assumed that general speech utterances contain both homoscedastic and heteroscedastic elements. We show the validity of this assumption through several analyses and also demonstrate that dimension reduction using principal components is feasible. To handle the presence of both heteroscedastic and homoscedastic elements effectively, we propose a fusion approach that applies both LDA and Heteroscedastic-LDA (HLDA). Experiments on the National Institute of Standards and Technology (NIST) Speaker Recognition Evaluation (SRE) 2010 extended telephone database show its effectiveness and compare it with other methods.
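The homoscedasticity assumption mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it implements only plain LDA (not HLDA or the proposed fusion), runs on synthetic Gaussian data rather than i-vectors, and the helper names `lda_projection` and `covariance_spread` are hypothetical. The spread measure is an ad hoc indicator chosen for illustration, not one taken from the paper.

```python
import numpy as np


def lda_projection(X, y, n_components):
    """Return a (dim, n_components) projection matrix from classical LDA."""
    classes = np.unique(y)
    dim = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((dim, dim))  # within-class scatter (one shared covariance is assumed)
    Sb = np.zeros((dim, dim))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        d = (mc - mean_all)[:, None]
        Sb += len(Xc) * (d @ d.T)
    # Eigenvectors of inv(Sw) @ Sb, sorted by decreasing eigenvalue.
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(evals.real)[::-1][:n_components]
    return evecs[:, order].real


def covariance_spread(X, y):
    """Illustrative indicator (an assumption, not from the paper): mean relative
    Frobenius distance between each class covariance and their average."""
    covs = [np.cov(X[y == c], rowvar=False) for c in np.unique(y)]
    pooled = np.mean(covs, axis=0)
    return float(np.mean([np.linalg.norm(C - pooled) / np.linalg.norm(pooled)
                          for C in covs]))


# Synthetic stand-in for speaker classes: 3 classes, 5 dimensions, 200 samples each.
rng = np.random.default_rng(0)
means = rng.normal(scale=3.0, size=(3, 5))
y = np.repeat(np.arange(3), 200)


def make_data(scales):
    """Draw isotropic Gaussian classes with the given per-class standard deviations."""
    return np.vstack([means[k] + scales[k] * rng.normal(size=(200, 5))
                      for k in range(3)])


X_homo = make_data([1.0, 1.0, 1.0])    # every class shares one covariance
X_hetero = make_data([0.5, 1.0, 3.0])  # class covariances differ

spread_homo = covariance_spread(X_homo, y)
spread_hetero = covariance_spread(X_hetero, y)
W = lda_projection(X_hetero, y, n_components=2)

print("spread (homoscedastic):  ", spread_homo)
print("spread (heteroscedastic):", spread_hetero)
print("projected shape:", (X_hetero @ W).shape)
```

A larger spread means the class covariances differ more, which is the situation where the single shared within-class scatter used by LDA is a poorer fit.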

    Original language: English
    Title of host publication: AVSS 2015 - 12th IEEE International Conference on Advanced Video and Signal Based Surveillance
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    ISBN (Electronic): 9781467376327
    Publication status: Published - 2015 Oct 19
    Event: 12th IEEE International Conference on Advanced Video and Signal Based Surveillance, AVSS 2015 - Karlsruhe, Germany
    Duration: 2015 Aug 25 to 2015 Aug 28

    Publication series

    Name: AVSS 2015 - 12th IEEE International Conference on Advanced Video and Signal Based Surveillance

    Other

    Other: 12th IEEE International Conference on Advanced Video and Signal Based Surveillance, AVSS 2015
    Country/Territory: Germany
    City: Karlsruhe
    Period: 15/8/25 to 15/8/28

    Keywords

    • Algorithm design and analysis
    • Analytical models
    • Computational modeling
    • Speech
    • Speech processing
    • Switches
    • Transforms

    ASJC Scopus subject areas

    • Computer Networks and Communications
    • Electrical and Electronic Engineering
    • Communication
