Many content-based music retrieval systems represent music in the MIDI format to improve retrieval efficiency. In such systems, voice queries such as humming must be transcribed into MIDI notes before matching music can be found in the database. In this paper, we present an ADF-based voice query processing system that transforms the original voice signal into MIDI format. To perform the transformation, a sequence of pitch and duration pairs is extracted from the original voice signal. The pitch is tracked by an autocorrelation function, which is frequently used for pitch analysis in the time domain. For accurate duration detection, we propose a novel algorithm that combines an onset detection method using ADF with a pitch tracking-based duration detection method. To estimate the accuracy of the transformation, we tested various queries on the prototype retrieval system and report some of the results.
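As an illustration of the pitch-tracking step, the following minimal sketch estimates the fundamental frequency of a single frame by picking the strongest autocorrelation peak and then maps it to the nearest MIDI note number. It is not the system described in this paper: the function names, the search range `fmin`/`fmax`, and the plain peak-picking rule are assumptions made for the example, and the ADF-based onset detection and the duration detection are not shown.

```python
import math


def estimate_pitch_hz(frame, sample_rate, fmin=80.0, fmax=800.0):
    """Estimate the pitch of one frame from its autocorrelation peak.

    fmin and fmax are assumed bounds for a hummed voice, not values
    taken from the paper. Returns None if no positive peak is found
    (for example, for a silent frame).
    """
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_value = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        value = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag if best_lag else None


def hz_to_midi(freq_hz):
    """Map a frequency to the nearest MIDI note number (A4 = 440 Hz = 69)."""
    return int(round(69 + 12 * math.log2(freq_hz / 440.0)))
```

Calling `estimate_pitch_hz` on successive frames of a query and passing each result to `hz_to_midi` yields a frame-level note sequence; the frequency resolution is limited by the integer lag, so a higher sample rate or peak interpolation would be needed for finer estimates.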