Unsupervised adaptation without estimated transcriptions

Hyeopwoo Lee, Dongsuk Yook

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Unsupervised adaptation typically relies on estimated transcriptions to estimate the unknown distortion parameters from input test signals. In low signal-to-noise ratio (SNR) conditions, the transcription estimated by a decoding procedure can be error-prone because of the high mismatch between the acoustic models and the input signal, which can degrade the performance of the adapted system. To address this problem, we propose an unsupervised adaptation method that adapts the acoustic models without an estimated transcription. Instead, Gaussian mixture models (GMMs) and pseudo phoneme models (PPMs) are used. Using these models, the unknown distortion parameters are estimated based on the vector Taylor series (VTS) model adaptation scheme. On the Aurora2 task, we obtained a relative reduction of 5.4% in word error rate (WER).
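
As a rough illustration of the VTS model adaptation step mentioned in the abstract, the sketch below compensates a single diagonal-covariance Gaussian using the standard log-spectral mismatch function y = x + h + log(1 + exp(n - x - h)) and its first-order expansion. This is an assumption-laden toy, not the authors' formulation: the function name `vts_compensate`, the log-spectral (rather than cepstral) domain, and the given noise and channel parameters are all hypothetical, and the paper's actual contribution (estimating those parameters with GMMs and PPMs instead of a transcription) is not shown.

```python
import math

def vts_compensate(mu_x, var_x, mu_n, var_n, h):
    """Adapt one clean-speech Gaussian to a noisy condition (illustrative only).

    mu_x, var_x: clean-speech mean and diagonal variance (log-spectral domain)
    mu_n, var_n: additive-noise mean and diagonal variance (assumed known here)
    h:           channel (convolutional) distortion, additive in the log domain
    """
    mu_y, var_y = [], []
    for mx, vx, mn, vn, hc in zip(mu_x, var_x, mu_n, var_n, h):
        d = mn - mx - hc
        # Partial derivative of y with respect to x at the expansion point.
        g = 1.0 / (1.0 + math.exp(d))
        # Zeroth-order term: mismatch function evaluated at the means.
        mu_y.append(mx + hc + math.log1p(math.exp(d)))
        # First-order term: variances propagated through the linearization.
        var_y.append(g * g * vx + (1.0 - g) ** 2 * vn)
    return mu_y, var_y

# Hypothetical values: noise mean equal to the channel-shifted speech mean.
adapted_mean, adapted_var = vts_compensate([1.0], [2.0], [1.5], [4.0], [0.5])
```

When the noise mean is far below the speech mean, the adapted Gaussian stays close to the channel-shifted clean one; as the noise level rises, the adapted mean and variance move toward those of the noise.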

Original language: English
Title of host publication: 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings
Pages: 7918-7921
Number of pages: 4
Publication status: Published - 2013 Oct 18
Event: 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Vancouver, BC, Canada
Duration: 2013 May 26 to 2013 May 31

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Other

Other: 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013
Country/Territory: Canada
City: Vancouver, BC
Period: 13/5/26 to 13/5/31

Keywords

  • Unsupervised adaptation
  • robust speech recognition
  • vector Taylor series

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
