A voice trigger system using keyword and speaker recognition for mobile devices

Hyeopwoo Lee, Sukmoon Chang, Dongsuk Yook, Yongserk Kim

Research output: Contribution to journal › Article › peer-review

20 Citations (Scopus)

Abstract

Voice activity detection plays an important role in an efficient voice interface between humans and mobile devices, since it can be used as a trigger to activate the automatic speech recognition module of a mobile device. If the input speech signal can be recognized as a predefined magic word spoken by a legitimate user, it can be used as a trigger. In this paper, we propose a voice trigger system using a keyword-dependent speaker recognition technique. To trigger a mobile device properly with low computational power consumption, the voice trigger must perform both keyword recognition and speaker recognition without relying on computationally demanding speech recognizers. To solve this problem, we propose a template-based method and a hidden Markov model (HMM) based method for the voice trigger. Experiments using a Korean word corpus show that the template-based method ran 4.1 times faster than the HMM-based method, whereas the HMM-based method achieved a 27.8% relative reduction in recognition error compared with the template-based method. The proposed methods are complementary and can be used selectively depending on the device of interest.
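
The abstract does not describe how the template-based method works. Since "Dynamic time warping" is listed among the keywords, the following is only a minimal sketch of how a template-based trigger could compare an incoming utterance against a stored template. The function names, the length normalization, and the threshold are hypothetical and are not taken from the paper.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # frame of a aligned to a repeated frame of b
                cost[i][j - 1],      # frame of b aligned to a repeated frame of a
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m]


def should_trigger(features, template, threshold):
    """Hypothetical decision rule: fire when the normalized DTW cost is small.

    `threshold` is an assumed tuning parameter, not a value from the paper.
    """
    score = dtw_distance(features, template) / (len(features) + len(template))
    return score <= threshold
```

In this sketch the template would be recorded from the legitimate user saying the magic word, so a single distance score reflects both what was said and who said it. A real system would use acoustic features such as cepstral coefficients rather than the toy vectors a quick test might use.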

Original language: English
Article number: 5373813
Pages (from-to): 2377-2384
Number of pages: 8
Journal: IEEE Transactions on Consumer Electronics
Volume: 55
Issue number: 4
Publication status: Published - Nov 2009

Bibliographical note

Funding Information:
This work was supported by the Korea Research Foundation (KRF) grant funded by the Korea government (MEST) (No. 2009-0077392). It was also supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2009-C1090-0902-0007).

Keywords

  • Dynamic time warping
  • Gaussian mixture model
  • Hidden Markov model
  • Keyword recognition
  • Speaker recognition
  • Vector quantization
  • Voice trigger

ASJC Scopus subject areas

  • Media Technology
  • Electrical and Electronic Engineering
