Abstract
Many devices, including smart TVs and humanoid robots, can be operated through a speech interface. Since a user can interact with such a device at a distance, speech-operated devices must be able to process speech signals from a distance. Although many methods exist to localize speakers via sound source localization, it is very difficult to reliably find the location of a speaker in a noisy environment. In particular, conventional sound source localization methods only find the loudest sound source within a given area, and that source is not necessarily human speech. This is problematic in real environments where loud noises frequently occur, and the performance of speech-based interfaces for a variety of devices can suffer as a result. In this paper, a new speaker localization method is proposed. It identifies the location associated with the maximum voice power among all candidate locations. The proposed method is tested under a variety of conditions using both simulated and real data, and the results indicate that it outperforms a conventional algorithm for various types of noise.
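The contrast between picking the loudest source and picking the location with the maximum voice power can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not say how voice power is computed, so the sketch assumes, purely for illustration, that each candidate location comes with per-frame energies and a per-frame voice probability (for example from a voice activity detector), and that voice power is the probability-weighted energy. The names `total_power`, `voice_power`, `localize`, and the sample numbers are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical input: for each candidate location, a list of
# (frame_energy, voice_probability) pairs for the signal steered toward it.
Frames = List[Tuple[float, float]]


def total_power(frames: Frames) -> float:
    """Conventional criterion: sum of frame energies, regardless of content."""
    return sum(energy for energy, _ in frames)


def voice_power(frames: Frames) -> float:
    """Assumed criterion: frame energy weighted by how voice-like the frame is."""
    return sum(energy * p_voice for energy, p_voice in frames)


def localize(candidates: Dict[str, Frames], score: Callable[[Frames], float]) -> str:
    """Return the candidate location whose frames maximize the given score."""
    return max(candidates, key=lambda loc: score(candidates[loc]))


# Made-up example: a loud non-speech noise on the left, a quieter talker on the right.
candidates = {
    "left": [(9.0, 0.05), (8.5, 0.10), (9.2, 0.05)],
    "right": [(4.0, 0.90), (3.5, 0.95), (4.2, 0.85)],
}

loudest = localize(candidates, total_power)   # selects the noise source
speaker = localize(candidates, voice_power)   # selects the talker

print("loudest source:", loudest)
print("max voice power:", speaker)
```

In this toy setup the loudest-source criterion selects the noise on the left, while the voice-weighted criterion selects the talker on the right, which is the failure mode and the remedy the abstract describes in words.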
Original language | English |
---|---|
Article number | 7064118 |
Pages (from-to) | 112-118 |
Number of pages | 7 |
Journal | IEEE Transactions on Consumer Electronics |
Volume | 61 |
Issue number | 1 |
Publication status | Published - 2015 Feb 1 |
Bibliographical note
Publisher Copyright: © 1975-2011 IEEE.
Keywords
- human-robot interface
- sound source localization
- speaker localization
ASJC Scopus subject areas
- Media Technology
- Electrical and Electronic Engineering