Speaker localization in noisy environments using steered response voice power

Hyeontaek Lim, In Chul Yoo, Youngkyu Cho, Dongsuk Yook

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

Many devices, including smart TVs and humanoid robots, can be operated through a speech interface. Since a user can interact with such a device at a distance, speech-operated devices must be able to process speech signals from a distance. Although many methods exist to localize speakers via sound source localization, it is very difficult to reliably find the location of a speaker in a noisy environment. In particular, conventional sound source localization methods only find the loudest sound source within a given area, and such a sound source may not necessarily be related to human speech. This can be problematic in real environments where loud noises frequently occur, and the performance of speech-based interfaces for a variety of devices could be negatively impacted as a result. In this paper, a new speaker localization method is proposed. It identifies the location associated with the maximum voice power from all candidate locations. The proposed method is tested under a variety of conditions using both simulation data and real data, and the results indicate that the performance of the proposed method is superior to that of a conventional algorithm for various types of noise.
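The contrast drawn in the abstract, between picking the loudest candidate location and picking the one with the most voice power, can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not define "voice power", so the code below assumes a crude stand-in, namely plain delay-and-sum steered response power restricted to a 300-3400 Hz band. The function names, the four-microphone geometry, the candidate grid, and the synthetic signals are all hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def band_limited_noise(rng, n, fs, band):
    """White noise with all energy outside `band` (Hz) removed."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum[(freqs < band[0]) | (freqs > band[1])] = 0.0
    return np.fft.irfft(spectrum, n=n)


def simulate_mics(source, position, mic_positions, fs):
    """Delay `source` by its travel time to each microphone (no attenuation)."""
    n = len(source)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    taus = np.linalg.norm(mic_positions - position, axis=1) / SPEED_OF_SOUND
    shifts = np.exp(-2j * np.pi * freqs[None, :] * taus[:, None])
    return np.fft.irfft(np.fft.rfft(source)[None, :] * shifts, n=n, axis=1)


def steered_power(mic_signals, mic_positions, candidate, fs, band=None):
    """Delay-and-sum power when the array is steered at `candidate`.

    If `band` is given, only that frequency range is counted. This is the
    assumed stand-in for voice power, not the measure used in the paper.
    """
    n = mic_signals.shape[1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectra = np.fft.rfft(mic_signals, axis=1)
    taus = np.linalg.norm(mic_positions - candidate, axis=1) / SPEED_OF_SOUND
    # Undo the propagation delay expected for this candidate, then sum channels.
    aligned = spectra * np.exp(2j * np.pi * freqs[None, :] * taus[:, None])
    power = np.abs(aligned.sum(axis=0)) ** 2
    if band is not None:
        power = power[(freqs >= band[0]) & (freqs <= band[1])]
    return float(power.sum())


def localize(mic_signals, mic_positions, candidates, fs, band=None):
    """Index of the candidate location with the largest steered power."""
    powers = [steered_power(mic_signals, mic_positions, c, fs, band)
              for c in candidates]
    return int(np.argmax(powers))


# Hypothetical 2-D scene: four microphones and five candidate locations (m).
fs, n = 16000, 4096
rng = np.random.default_rng(0)
mics = np.array([[0.0, 0.0], [0.2, 0.0], [0.0, 0.2], [0.2, 0.2]])
candidates = np.array([[1.0, 2.0], [-1.0, 1.5], [2.0, 0.5],
                       [0.1, 3.0], [-2.0, -1.0]])
voice_band = (300.0, 3400.0)

# Synthetic "talker" at candidate 0; a 10x louder out-of-band noise at candidate 4.
talker = band_limited_noise(rng, n, fs, voice_band)
noise = 10.0 * band_limited_noise(rng, n, fs, (5000.0, 7000.0))
mic_signals = (simulate_mics(talker, candidates[0], mics, fs)
               + simulate_mics(noise, candidates[4], mics, fs))

loudest_idx = localize(mic_signals, mics, candidates, fs)
voice_idx = localize(mic_signals, mics, candidates, fs, band=voice_band)
print("full-band choice:", loudest_idx, "voice-band choice:", voice_idx)
```

In this synthetic scene the full-band search settles on the loud noise source, while the band-limited search settles on the talker. Real noise overlaps the speech band, so a fixed band mask would not be enough in practice; the sketch only illustrates why maximizing a voice-specific quantity can give a different answer than maximizing total power.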

Original language: English
Article number: 7064118
Pages (from-to): 112-118
Number of pages: 7
Journal: IEEE Transactions on Consumer Electronics
Volume: 61
Issue number: 1
Publication status: Published - 2015 Feb 1

Bibliographical note

Publisher Copyright:
© 1975-2011 IEEE.

Keywords

  • human-robot interface
  • sound source localization
  • speaker localization

ASJC Scopus subject areas

  • Media Technology
  • Electrical and Electronic Engineering
