TY - GEN
T1 - Tied Mixture modeling optimization for Korean-digit in the embedded ASR system
AU - Kim, Kihyeon
AU - Ko, Hanseok
PY - 2004
Y1 - 2004
N2 - In embedded Automatic Speech Recognition (ASR) systems, the Semi-Continuous Hidden Markov Model (SCHMM), or Tied-Mixture (TM) model, is one of the most promising acoustic modeling methods: it solves the size problem of the existing Continuous Hidden Markov Model (CHMM) while minimizing degradation in recognition performance. Moreover, for a general isolated-word task, context-dependent models such as tri-phones are used to ensure high recognition performance in the embedded system. However, models constructed in this way alone are not sufficient to improve the recognition rate in the Korean-digit speech task, where the digits are highly similar to one another. Hence, we construct new dedicated HMMs for all or some of the Korean digits; these HMMs have exclusive states but use the same Gaussian pool as the existing tri-phone models. This remedy preserves the structure of the entire HMM set while minimizing the memory footprint. Representative experiments are expected to reduce the word error rate on the Korean-digit task by about 86% compared with using only general tri-phone models.
AB - In embedded Automatic Speech Recognition (ASR) systems, the Semi-Continuous Hidden Markov Model (SCHMM), or Tied-Mixture (TM) model, is one of the most promising acoustic modeling methods: it solves the size problem of the existing Continuous Hidden Markov Model (CHMM) while minimizing degradation in recognition performance. Moreover, for a general isolated-word task, context-dependent models such as tri-phones are used to ensure high recognition performance in the embedded system. However, models constructed in this way alone are not sufficient to improve the recognition rate in the Korean-digit speech task, where the digits are highly similar to one another. Hence, we construct new dedicated HMMs for all or some of the Korean digits; these HMMs have exclusive states but use the same Gaussian pool as the existing tri-phone models. This remedy preserves the structure of the entire HMM set while minimizing the memory footprint. Representative experiments are expected to reduce the word error rate on the Korean-digit task by about 86% compared with using only general tri-phone models.
KW - Embedded ASR System
KW - Exclusive HMM's
KW - Korean Digits
KW - Tied Mixture Model
UR - http://www.scopus.com/inward/record.url?scp=10444277350&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=10444277350&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:10444277350
SN - 0780385276
T3 - 2004 IEEE International Symposium on Consumer Electronics - Proceedings
SP - 595
EP - 599
BT - 2004 IEEE International Symposium on Consumer Electronics - Proceedings
T2 - 2004 IEEE International Symposium on Consumer Electronics - Proceedings
Y2 - 1 September 2004 through 3 September 2004
ER -