TY - JOUR
T1 - Active semi-supervised learning with multiple complementary information
AU - Park, Sung Ho
AU - Kim, Seoung Bum
N1 - Funding Information:
The authors would like to thank the editor and reviewers for their useful comments and suggestions, which were of great help in improving the quality of the paper. This research was supported by Brain Korea PLUS, Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Science, ICT and Future Planning (NRF-2016R1A2B1008994), the Ministry of Trade, Industry & Energy under Industrial Technology Innovation Program (R1623371), and by Institute for Information & communications Technology Promotion grant funded by the Korea government (No. 2018-0-00440, ICT-based Crime Risk Prediction and Response Platform Development for Early Awareness of Risk Situation).
Publisher Copyright:
© 2019 Elsevier Ltd
PY - 2019/7/15
Y1 - 2019/7/15
N2 - In many practical machine learning problems, the acquisition of labeled data is often expensive and time-consuming. To reduce this labeling cost, active learning has been introduced in many scientific fields. This study considers the problem of active learning of a regression model in the context of optimal experimental design. Classical optimal experimental design approaches are based on the least-squares errors of labeled samples. Recently, a couple of active learning approaches that take advantage of both labeled and unlabeled data have been developed based on Laplacian regularized regression models with a single criterion. However, these approaches are susceptible to selecting undesirable samples when the number of initially labeled samples is small. To address this susceptibility, this study proposes an active learning method that considers multiple complementary criteria. These criteria include sample representativeness, diversity information, and variance reduction of the Laplacian regularization model. Specifically, we developed novel density and diversity criteria based on a clustering algorithm to identify the samples that are representative of their distributions, while minimizing their redundancy. Experiments were conducted on synthetic and benchmark data to compare the performance of the proposed method with that of existing methods. Experimental results demonstrate that the proposed active learning algorithm outperforms its existing counterparts.
AB - In many practical machine learning problems, the acquisition of labeled data is often expensive and time-consuming. To reduce this labeling cost, active learning has been introduced in many scientific fields. This study considers the problem of active learning of a regression model in the context of optimal experimental design. Classical optimal experimental design approaches are based on the least-squares errors of labeled samples. Recently, a couple of active learning approaches that take advantage of both labeled and unlabeled data have been developed based on Laplacian regularized regression models with a single criterion. However, these approaches are susceptible to selecting undesirable samples when the number of initially labeled samples is small. To address this susceptibility, this study proposes an active learning method that considers multiple complementary criteria. These criteria include sample representativeness, diversity information, and variance reduction of the Laplacian regularization model. Specifically, we developed novel density and diversity criteria based on a clustering algorithm to identify the samples that are representative of their distributions, while minimizing their redundancy. Experiments were conducted on synthetic and benchmark data to compare the performance of the proposed method with that of existing methods. Experimental results demonstrate that the proposed active learning algorithm outperforms its existing counterparts.
KW - Active learning
KW - Diversity
KW - Optimal experimental design
KW - Representativeness
KW - Semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85061675068&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2019.02.017
DO - 10.1016/j.eswa.2019.02.017
M3 - Article
AN - SCOPUS:85061675068
SN - 0957-4174
VL - 126
SP - 30
EP - 40
JO - Expert Systems with Applications
JF - Expert Systems with Applications
ER -