TY - JOUR
T1 - Predicting future onset of depression among community dwelling adults in the Republic of Korea using a machine learning algorithm
AU - Na, Kyoung Sae
AU - Cho, Seo Eun
AU - Geem, Zong Woo
AU - Kim, Yong Ku
N1 - Funding Information:
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP; Ministry of Science, ICT & Future Planning) (No. 2017R1C1B5073684). This paper is based on the PhD dissertation of Kyoung-Sae Na.
Publisher Copyright:
© 2020 Elsevier B.V.
PY - 2020/3/16
Y1 - 2020/3/16
N2 - Because depression is highly prevalent and causes enduring disability, it is important to predict the onset of depression among community-dwelling adults. In this study, we aimed to build a machine learning-based predictive model for the future onset of depression. We used nationwide survey data to construct training and hold-out test sets. Class imbalance was addressed with the Synthetic Minority Over-sampling Technique. A tree-based ensemble method, random forest, was used to build the predictive model. Depression was defined as a score of 9 or more on the 11-item version of the Center for Epidemiologic Studies Depression Scale. Hyperparameters were tuned using 10-fold cross-validation. A total of 6,588 participants (6,067 without depression and 521 with depression) were included in the study. The area under the receiver operating characteristic curve was 0.870. The overall accuracy, sensitivity, and specificity were 0.862, 0.730, and 0.866, respectively. Satisfaction with leisure, familial relationships, social relationships, and familial income, together with general satisfaction, were important predictors in the model of future depression onset. Our study demonstrated that predicting the future onset of depression from survey data is feasible. This predictive model is expected to support early identification of individuals at risk for depression and to secure time for intervention.
AB - Because depression is highly prevalent and causes enduring disability, it is important to predict the onset of depression among community-dwelling adults. In this study, we aimed to build a machine learning-based predictive model for the future onset of depression. We used nationwide survey data to construct training and hold-out test sets. Class imbalance was addressed with the Synthetic Minority Over-sampling Technique. A tree-based ensemble method, random forest, was used to build the predictive model. Depression was defined as a score of 9 or more on the 11-item version of the Center for Epidemiologic Studies Depression Scale. Hyperparameters were tuned using 10-fold cross-validation. A total of 6,588 participants (6,067 without depression and 521 with depression) were included in the study. The area under the receiver operating characteristic curve was 0.870. The overall accuracy, sensitivity, and specificity were 0.862, 0.730, and 0.866, respectively. Satisfaction with leisure, familial relationships, social relationships, and familial income, together with general satisfaction, were important predictors in the model of future depression onset. Our study demonstrated that predicting the future onset of depression from survey data is feasible. This predictive model is expected to support early identification of individuals at risk for depression and to secure time for intervention.
KW - Artificial intelligence
KW - Depression
KW - Machine learning
KW - Mental health
KW - Prediction
UR - http://www.scopus.com/inward/record.url?scp=85078970669&partnerID=8YFLogxK
U2 - 10.1016/j.neulet.2020.134804
DO - 10.1016/j.neulet.2020.134804
M3 - Article
C2 - 32014516
AN - SCOPUS:85078970669
SN - 0304-3940
VL - 721
JO - Neuroscience Letters
JF - Neuroscience Letters
M1 - 134804
ER -