TY - JOUR
T1 - An efficient and effective ensemble of support vector machines for anti-diabetic drug failure prediction
AU - Kang, Seokho
AU - Kang, Pilsung
AU - Ko, Taehoon
AU - Cho, Sungzoon
AU - Rhee, Su Jin
AU - Yu, Kyung Sang
N1 - Funding Information:
This work was supported by the National Research Foundation of Korea (NRF) Grant funded by the Korea government (MSIP) (No. 2011–0030814), Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2014R1A1A1004648), and the Brain Korea 21 PLUS Project in 2014. This work was also supported by the Engineering Research Institute of SNU.
Publisher Copyright:
© 2015 Elsevier Ltd. All rights reserved.
Copyright:
Copyright 2015 Elsevier B.V., All rights reserved.
PY - 2015/6/1
Y1 - 2015/6/1
AB - The treatment of patients with type 2 diabetes is mostly based on drug therapies that aim to manage glucose levels appropriately. As the number of patients with type 2 diabetes continues to increase worldwide, predicting drug treatment failure has become an important issue. The support vector machine (SVM) can be a good method for the anti-diabetic drug failure prediction problem; however, it is difficult to train an SVM directly on large-scale medical datasets because of its high training time complexity, O(N^3). To address this limitation, we propose an efficient and effective ensemble of SVMs, called E3-SVM. The proposed method excludes superfluous data points when constructing an SVM ensemble, thereby yielding better classification performance. The proposed method consists of two phases. The first phase selects the data points that are likely to be support vectors by applying data selection methods. The second phase constructs an SVM ensemble using the selected data points. We demonstrated the efficiency and effectiveness of the proposed method using a real-world dataset for the anti-diabetic drug failure prediction problem for type 2 diabetes. Experimental results show that the proposed method requires less training time than conventional SVM ensembles to achieve comparable success. Moreover, the proposed method obtains more reliable prediction results across independent runs of constructing an ensemble. In conclusion, the proposed method provides an efficient and effective way to use SVM for large-scale datasets, and we confirmed the suitability of SVM for the anti-diabetic drug failure prediction problem with an accuracy of about 80%.
KW - Data selection
KW - Drug failure prediction
KW - Ensemble
KW - Support vector machines
KW - Type 2 diabetes
UR - http://www.scopus.com/inward/record.url?scp=84923233733&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84923233733&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2015.01.042
DO - 10.1016/j.eswa.2015.01.042
M3 - Article
AN - SCOPUS:84923233733
SN - 0957-4174
VL - 42
SP - 4265
EP - 4273
JO - Expert Systems with Applications
JF - Expert Systems with Applications
IS - 9
ER -