A threshold adaptation based voice query transcription scheme for music retrieval

Byeong Jun Han, Seungmin Rho, Eenjun Hwang

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

This paper presents a threshold adaptation based voice query transcription scheme for music information retrieval. The proposed scheme analyzes a monophonic voice signal and generates its transcription for diverse music retrieval applications. For accurate transcription, we propose several advanced features: (i) an Energetic Feature extractor (EFX) for onset, peak, and transient area detection; (ii) Modified Windowed Average Energy (MWAE), which defines multiple small but coherent windows with local threshold values and serves as an offset detector; and (iii) the Circular Average Magnitude Difference Function (CAMDF) for accurate acquisition of the fundamental frequency (F0) of each frame. To evaluate the performance of the proposed scheme, we implemented a prototype music transcription system called AMT2 (Automatic Music Transcriber version 2) and carried out various experiments. In the experiments, we used the QBSH corpus [1], which was adopted for the MIREX 2006 contest data set. Experimental results show that the proposed scheme can improve transcription performance.
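As an illustration of the F0 step named in the abstract, the following is a minimal sketch of a circular AMDF in its commonly cited form, D(tau) = sum over n of |x[(n + tau) mod N] - x[n]|, with the pitch taken at the deepest valley inside a search band. It is not the AMT2 implementation: the function names, the search band (`f_min`, `f_max`), the frame length, and the test tone are all assumptions made for this example.

```python
import math


def camdf(frame):
    """Circular AMDF: D(tau) = sum_n |x[(n + tau) mod N] - x[n]|."""
    n = len(frame)
    return [
        sum(abs(frame[(i + tau) % n] - frame[i]) for i in range(n))
        for tau in range(n)
    ]


def estimate_f0(frame, sample_rate, f_min, f_max):
    """Return sample_rate / lag for the deepest CAMDF valley in the band.

    f_min and f_max (in Hz) are hypothetical search bounds chosen by the
    caller; they are not taken from the paper.
    """
    d = camdf(frame)
    lag_min = max(1, int(sample_rate / f_max))
    lag_max = min(len(frame) // 2, int(sample_rate / f_min))
    best_lag = min(range(lag_min, lag_max + 1), key=lambda tau: d[tau])
    return sample_rate / best_lag


# Synthetic 200 Hz tone: 400 samples at 8 kHz is exactly ten periods.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200.0 * i / sample_rate) for i in range(400)]
f0 = estimate_f0(frame, sample_rate, f_min=120.0, f_max=800.0)
```

The search band here is kept narrower than one octave below the test tone so that the valley at twice the true period falls outside it. A practical transcriber would need an explicit strategy for such octave errors, and for frames that do not contain a whole number of periods.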

Original language: English
Pages (from-to): 445-451
Number of pages: 7
Journal: Transactions of the Korean Institute of Electrical Engineers
Volume: 59
Issue number: 2
Publication status: Published - 2010 Feb

Keywords

  • Audio signal analysis
  • Music transcription
  • Note onset detection
  • Query-by-humming

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
