Abstract
Heart failure (HF) is the terminal stage of all heart disease and the leading cause of mortality. A reliable prognostic model for predicting mortality in patients with HF can support better decisions in clinical practice. Many attempts have been made to increase the reliability of prognostic models using electronic health records (EHRs), but it is still not known which oversampling method works best on imbalanced, small EHR datasets. This study compared three widely used oversampling methods, namely the synthetic minority oversampling technique (SMOTE), borderline-SMOTE, and adaptive synthetic (ADASYN) sampling, in constructing prognostic models for HF patients. All 299 patients had left ventricular systolic dysfunction and belonged to New York Heart Association class III or IV (survived = 203, deceased = 96). Follow-up time ranged from 4 to 285 days, with an average of 130 days. The three oversampling methods were compared using random forest prognostic models that predict mortality in patients with HF. The baseline model without oversampling showed an F-score of 0.55. Each oversampling method improved the F-score by 0.05 or more over the baseline model. SMOTE showed the highest prognostic capacity (F-score = 0.63) among the oversampling methods (borderline-SMOTE = 0.60, ADASYN = 0.62). With all three oversampling methods, ejection fraction, serum creatinine, and age consistently ranked as highly important features. Consequently, SMOTE is the most suitable of the three algorithms for oversampling EHR data to predict mortality in HF patients.
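To make the oversampling idea in the abstract concrete, the following is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the authors' implementation: the function name `smote`, the parameters `k` and `seed`, and the two-feature toy data are all assumptions for illustration. Only the class counts (203 survivors, 96 deceased) come from the abstract. The sketch picks base samples at random rather than generating a fixed number per minority sample, which is a simplification of the original algorithm.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Hypothetical helper for illustration; not the study's code.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Made-up toy data: 96 minority points with two invented features.
toy_rng = random.Random(42)
deceased = [(toy_rng.gauss(0.0, 1.0), toy_rng.gauss(0.0, 1.0)) for _ in range(96)]

# Balance the minority class against the 203 survivors reported in the abstract.
new_points = smote(deceased, n_new=203 - 96)
print(len(deceased) + len(new_points))  # prints 203
```

Each synthetic point lies on the line segment between a real minority sample and one of its nearest minority neighbours, so no values outside the observed feature ranges are created. In practice, oversampling is normally applied only to the training split so that the evaluation data stay untouched.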
Original language | English |
---|---|
Title of host publication | ICTC 2020 - 11th International Conference on ICT Convergence |
Subtitle of host publication | Data, Network, and AI in the Age of Untact |
Publisher | IEEE Computer Society |
Pages | 379-383 |
Number of pages | 5 |
ISBN (Electronic) | 9781728167589 |
Publication status | Published - 2020 Oct 21 |
Event | 11th International Conference on Information and Communication Technology Convergence, ICTC 2020 - Jeju Island, Korea, Republic of; Duration: 2020 Oct 21 → 2020 Oct 23 |
Publication series
Name | International Conference on ICT Convergence |
---|---|
Volume | 2020-October |
ISSN (Print) | 2162-1233 |
ISSN (Electronic) | 2162-1241 |
Conference
Conference | 11th International Conference on Information and Communication Technology Convergence, ICTC 2020 |
---|---|
Country/Territory | Korea, Republic of |
City | Jeju Island |
Period | 2020 Oct 21 → 2020 Oct 23 |
Bibliographical note
Publisher Copyright: © 2020 IEEE.
Keywords
- Electronic Health Record
- Heart Failure
- Oversampling
- Prognostic Model
ASJC Scopus subject areas
- Information Systems
- Computer Networks and Communications