Abstract
This paper presents a voice query transcription scheme based on threshold adaptation for music information retrieval. The proposed scheme analyzes a monophonic voice signal and generates its transcription for diverse music retrieval applications. For accurate transcription, we propose several advanced features: (i) an Energetic Feature extractor (EFX) for onset, peak, and transient area detection; (ii) Modified Windowed Average Energy (MWAE), which defines multiple small but coherent windows with local threshold values and serves as an offset detector; and (iii) the Circular Average Magnitude Difference Function (CAMDF) for accurate acquisition of the fundamental frequency (F0) of each frame. To evaluate the performance of the proposed scheme, we implemented a prototype music transcription system called AMT2 (Automatic Music Transcriber version 2) and carried out various experiments using the QBSH corpus [1], which was used in the MIREX 2006 contest data set. The experimental results show that the proposed scheme can improve transcription performance.
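As an illustration of the F0 step only, the following is a minimal sketch of per-frame pitch estimation with a circular AMDF. It assumes the commonly cited form D(τ) = Σₙ |x((n + τ) mod N) − x(n)|, with the pitch period taken as the lag that minimizes D within a plausible range. The abstract does not give the exact formulation used in AMT2, so the function names (`camdf`, `estimate_f0`), the search bounds `f_min` and `f_max`, and the plain minimum search are all hypothetical choices made here for illustration.

```python
import math


def camdf(frame, max_lag):
    """Circular AMDF (assumed form): D(tau) = sum_n |x[(n + tau) mod N] - x[n]|.

    Returns a list of D(tau) for tau = 0 .. max_lag.
    """
    n = len(frame)
    return [
        sum(abs(frame[(i + tau) % n] - frame[i]) for i in range(n))
        for tau in range(max_lag + 1)
    ]


def estimate_f0(frame, sample_rate, f_min=80.0, f_max=800.0):
    """Estimate F0 of one frame as sample_rate / (lag of the CAMDF minimum).

    f_min and f_max are illustrative search bounds, not values from the paper.
    """
    min_lag = max(1, int(sample_rate / f_max))
    # The circular difference is symmetric about N/2, so lags beyond it add nothing.
    max_lag = min(int(sample_rate / f_min), len(frame) // 2)
    d = camdf(frame, max_lag)
    best_lag = min(range(min_lag, max_lag + 1), key=lambda tau: d[tau])
    return sample_rate / best_lag


# Usage: a synthetic 200 Hz tone sampled at 8 kHz (400 samples = 10 full periods).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(400)]
f0 = estimate_f0(frame, sample_rate, f_min=120.0)
print(f"estimated F0: {f0:.1f} Hz")
```

The lower bound `f_min=120.0` in the usage keeps the search range below two pitch periods. With a wider range, integer multiples of the true period also produce near-zero minima, so a practical detector would need some additional rule to avoid octave errors.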
Original language | English |
---|---|
Pages (from-to) | 445-451 |
Number of pages | 7 |
Journal | Transactions of the Korean Institute of Electrical Engineers |
Volume | 59 |
Issue number | 2 |
Publication status | Published - 2010 Feb |
Keywords
- Audio signal analysis
- Music transcription
- Note onset detection
- Query-by-humming
ASJC Scopus subject areas
- Electrical and Electronic Engineering