TY - JOUR
T1 - Multiple predicting K-fold cross-validation for model selection
AU - Jung, Yoonsuh
N1 - Funding Information:
Jung’s work was partially supported by National Research Foundation 2017R1C1B5017431 and by Korea University Grant K1705711.
Publisher Copyright:
© 2017 American Statistical Association and Taylor & Francis.
PY - 2018/1/2
Y1 - 2018/1/2
N2 - K-fold cross-validation (CV) is widely adopted as a model selection criterion. In K-fold CV, (K – 1) folds are used for model construction and the hold-out fold is allocated to model validation. This implies that model construction is more emphasised than the model validation procedure. However, some studies have revealed that more emphasis on the validation procedure may result in improved model selection. Specifically, leave-m-out CV with n samples may achieve variable-selection consistency when m/n approaches 1. In this study, a new CV method is proposed within the framework of K-fold CV. The proposed method uses (K – 1) folds of the data for model validation, while the other fold is for model construction. This provides (K – 1) predicted values for each observation. These values are averaged to produce a final predicted value. Model selection based on the averaged predicted values can then reduce variation in the assessment due to the averaging. The variable-selection consistency of the suggested method is established. Its advantage over K-fold CV with finite samples is examined under linear, non-linear, and high-dimensional models.
AB - K-fold cross-validation (CV) is widely adopted as a model selection criterion. In K-fold CV, (K – 1) folds are used for model construction and the hold-out fold is allocated to model validation. This implies that model construction is more emphasised than the model validation procedure. However, some studies have revealed that more emphasis on the validation procedure may result in improved model selection. Specifically, leave-m-out CV with n samples may achieve variable-selection consistency when m/n approaches 1. In this study, a new CV method is proposed within the framework of K-fold CV. The proposed method uses (K – 1) folds of the data for model validation, while the other fold is for model construction. This provides (K – 1) predicted values for each observation. These values are averaged to produce a final predicted value. Model selection based on the averaged predicted values can then reduce variation in the assessment due to the averaging. The variable-selection consistency of the suggested method is established. Its advantage over K-fold CV with finite samples is examined under linear, non-linear, and high-dimensional models.
KW - Cross-validation
KW - K-fold cross-validation
KW - model selection
KW - tuning parameter selection
UR - http://www.scopus.com/inward/record.url?scp=85034647662&partnerID=8YFLogxK
U2 - 10.1080/10485252.2017.1404598
DO - 10.1080/10485252.2017.1404598
M3 - Article
AN - SCOPUS:85034647662
SN - 1048-5252
VL - 30
SP - 197
EP - 215
JO - Journal of Nonparametric Statistics
JF - Journal of Nonparametric Statistics
IS - 1
ER -