TY - JOUR
T1 - Holistic Approaches to Music Genre Classification using Efficient Transfer and Deep Learning Techniques
AU - Prabhakar, Sunil Kumar
AU - Lee, Seong Whan
N1 - Funding Information:
This work was supported by the Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korea government (No. 2017-0-00451, Development of BCI based Brain and Cognitive Computing Technology for Recognizing User’s Intentions using Deep Learning; No. 2019-0-00079, Artificial Intelligence Graduate School Program (Korea University)).
Publisher Copyright:
© 2022 Elsevier Ltd
PY - 2023/1
Y1 - 2023/1
N2 - With the rapid development of multimedia technologies, many musical resources are available online, which has triggered sustained interest in the classification of music genres. Detecting a set of music pieces belonging to a similar genre is the main purpose of a music recommendation playlist. A robust music classifier built on machine learning, transfer learning and deep learning concepts is necessary so that unlabelled music can be easily tagged and the user's experience of media players can be improved. Existing approaches from the past decade have various shortcomings: they rely on manual feature extraction followed by traditional machine learning classification, which affects classification accuracy to a great extent, they do not perform well on multiclass classification problems, and they cannot deal with large data sizes. In this work, five novel approaches are proposed for music genre classification: the proposed Weighted Visibility Graph based Elastic Net Sparse Classifier (WVG-ELNSC); the proposed classification using sequential machine learning analysis with a Stacked Denoising Autoencoder (SDA) classifier; the proposed Riemannian Alliance based Tangent Space Mapping (RA-TSM) transfer learning technique; classification using the Transfer Support Vector Machine (TSVM) algorithm; and finally the proposed deep learning classifier combining Bidirectional Long Short-Term Memory (BiLSTM) and an attention model with a Graphical Convolution Network (GCN), termed the BAG deep learning model. Experiments are conducted on three music datasets, GTZAN, ISMIR 2004 and MagnaTagATune, and a relatively high classification accuracy of 93.51% is obtained with the proposed deep learning BAG model.
AB - With the rapid development of multimedia technologies, many musical resources are available online, which has triggered sustained interest in the classification of music genres. Detecting a set of music pieces belonging to a similar genre is the main purpose of a music recommendation playlist. A robust music classifier built on machine learning, transfer learning and deep learning concepts is necessary so that unlabelled music can be easily tagged and the user's experience of media players can be improved. Existing approaches from the past decade have various shortcomings: they rely on manual feature extraction followed by traditional machine learning classification, which affects classification accuracy to a great extent, they do not perform well on multiclass classification problems, and they cannot deal with large data sizes. In this work, five novel approaches are proposed for music genre classification: the proposed Weighted Visibility Graph based Elastic Net Sparse Classifier (WVG-ELNSC); the proposed classification using sequential machine learning analysis with a Stacked Denoising Autoencoder (SDA) classifier; the proposed Riemannian Alliance based Tangent Space Mapping (RA-TSM) transfer learning technique; classification using the Transfer Support Vector Machine (TSVM) algorithm; and finally the proposed deep learning classifier combining Bidirectional Long Short-Term Memory (BiLSTM) and an attention model with a Graphical Convolution Network (GCN), termed the BAG deep learning model. Experiments are conducted on three music datasets, GTZAN, ISMIR 2004 and MagnaTagATune, and a relatively high classification accuracy of 93.51% is obtained with the proposed deep learning BAG model.
KW - Deep learning
KW - Machine learning
KW - Multiclass classification
KW - Music classification
KW - Transfer learning
UR - http://www.scopus.com/inward/record.url?scp=85136474348&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2022.118636
DO - 10.1016/j.eswa.2022.118636
M3 - Article
AN - SCOPUS:85136474348
SN - 0957-4174
VL - 211
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 118636
ER -