A Comparison of the Effects of Data Imputation Methods on Model Performance

Wooyoung Kim, Wonwoong Cho, Jangho Choi, Jiyong Kim, Cheonbok Park, Jaegul Choo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)


Missing values cause critical problems when training a prediction model. Various missing-data imputation methods have been introduced to address the problem. However, the imputation accuracy obtained by these methods is not sufficient on its own to validate the performance of prediction models. Thus, in this study, we compare (1) the imputation accuracy of various imputation methods and (2) the effects of those methods on prediction accuracy, investigating the relationship between imputation accuracy and prediction accuracy. For the comparison, we use water quality data composed of the latest observational multi-sensor data from Daecheong Lake. We conduct several experiments to compare seven imputation methods, including a state-of-the-art method, and their effects on three distinct prediction models. Through quantitative comparison and analysis, we show that it is necessary to consider both imputation accuracy and model prediction accuracy when choosing an imputation method.
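The two-part comparison described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the authors' setup: the data is a synthetic noisy sine wave rather than the Daecheong Lake measurements, only two of the listed methods (mean imputation and linear interpolation) are shown, and a simple least-squares AR(1) model stands in for the three prediction models. The helper names (`mean_impute`, `linear_interpolate`, `fit_ar1`, `evaluate`) are hypothetical.

```python
import math
import random


def mean_impute(series):
    """Replace each missing value (None) with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in series]


def linear_interpolate(series):
    """Fill each gap linearly between its nearest observed neighbours.

    Gaps at either end are filled with the nearest observed value.
    """
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def fit_ar1(series):
    """Least-squares fit of x[t+1] = a * x[t] + b (assumed stand-in model)."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return a, my - a * mx


def evaluate(imputer, train_true, train_missing, test_true):
    """Return (imputation RMSE on hidden entries, one-step prediction RMSE on test data)."""
    imputed = imputer(train_missing)
    hidden = [i for i, v in enumerate(train_missing) if v is None]
    imputation_rmse = rmse([imputed[i] for i in hidden], [train_true[i] for i in hidden])
    a, b = fit_ar1(imputed)
    predictions = [a * x + b for x in test_true[:-1]]
    return imputation_rmse, rmse(predictions, test_true[1:])


rng = random.Random(0)
# Synthetic smooth signal plus noise (NOT the Daecheong Lake data).
signal = [math.sin(t / 10.0) + rng.gauss(0, 0.05) for t in range(400)]
train_true, test_true = signal[:300], signal[300:]
# Hide roughly 20% of the training values at random.
train_missing = [None if rng.random() < 0.2 else v for v in train_true]

results = {
    "mean": evaluate(mean_impute, train_true, train_missing, test_true),
    "linear": evaluate(linear_interpolate, train_true, train_missing, test_true),
}
for name, (imp, pred) in results.items():
    print(f"{name:>6}: imputation RMSE={imp:.3f}, prediction RMSE={pred:.3f}")
```

Reporting the two scores side by side for each method mirrors the abstract's point: a ranking by imputation error alone need not match the ranking by downstream prediction error, so both are worth inspecting before choosing a method.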

Original language: English
Title of host publication: 21st International Conference on Advanced Communication Technology
Subtitle of host publication: ICT for 4th Industrial Revolution!, ICACT 2019 - Proceeding
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 8
ISBN (Electronic): 9791188428021
Publication status: Published - 2019 Apr 29
Event: 21st International Conference on Advanced Communication Technology, ICACT 2019 - Pyeongchang, Korea, Republic of
Duration: 2019 Feb 17 to 2019 Feb 20

Publication series

Name: International Conference on Advanced Communication Technology, ICACT
ISSN (Print): 1738-9445


Conference: 21st International Conference on Advanced Communication Technology, ICACT 2019
Country/Territory: Korea, Republic of

Bibliographical note

Funding Information:
This work was supported by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No.2018-0-00219, Space-time complex artificial intelligence blue-green algae prediction technology based on direct-readable water quality complex sensor and hyperspectral image)

Publisher Copyright:
© 2019 Global IT Research Institute (GIRI).


Keywords

  • SVD imputation
  • amelia imputation
  • imputation methods
  • incomplete data
  • knn imputation
  • linear interpolation
  • mean imputation
  • mice imputation
  • missing data
  • missing values
  • model performance
  • randomforest imputation

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

