Recent years have seen an explosion of large-scale data in many fields. Classical statistical methods, such as linear regression or generalized linear regression, often perform poorly on heterogeneous data because their key homogeneity assumption fails. In this paper, we present a flexible framework for heterogeneous populations that can be naturally grouped into several ordered subtypes. We propose a local model technique that uses ordinal class labels during the training stage. We define a new 'progression score' that captures the progression of the ordinal classes, and use a truncated Gaussian kernel to construct the weight function in a local regression framework. Given the weights, we then apply sparse shrinkage to the local fit to handle high dimensionality. In this way, our local model performs variable selection at each query point. Numerical studies show that the proposed method outperforms several existing ones. We also apply the method to the Alzheimer's Disease Neuroimaging Initiative data to predict longitudinal clinical scores from different modalities of baseline brain image features.
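The local fitting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the progression score, so the sketch assumes a scalar score is already available for every training sample and for the query point. The function names, the `cutoff` parameter, the absence of an intercept, and the use of a plain lasso penalty solved by coordinate descent are all assumptions made for illustration.

```python
import math


def truncated_gaussian_weights(scores, query_score, bandwidth, cutoff=2.0):
    """Gaussian kernel weights in progression-score space.

    Samples farther than cutoff * bandwidth from the query get weight 0.
    (The cutoff value is an assumption, not taken from the paper.)
    """
    weights = []
    for s in scores:
        u = (s - query_score) / bandwidth
        weights.append(math.exp(-0.5 * u * u) if abs(u) <= cutoff else 0.0)
    return weights


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X, y, weights, lam, n_iter=200):
    """Coordinate descent for
    min_b 0.5 * sum_i w_i * (y_i - x_i . b)^2 + lam * sum_j |b_j|
    (no intercept, for brevity).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # current residuals y - X @ beta
    for _ in range(n_iter):
        for j in range(p):
            # Weighted correlation of feature j with its partial residual.
            rho = sum(weights[i] * X[i][j] * (resid[i] + X[i][j] * beta[j])
                      for i in range(n))
            denom = sum(weights[i] * X[i][j] ** 2 for i in range(n))
            new = soft_threshold(rho, lam) / denom if denom > 0 else 0.0
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def local_sparse_fit(X, y, scores, query_score, bandwidth, lam):
    """Fit one sparse linear model localized around a single query point."""
    w = truncated_gaussian_weights(scores, query_score, bandwidth)
    return weighted_lasso(X, y, w, lam)
```

Calling `local_sparse_fit` once per query point yields a separate coefficient vector for each query, and the coefficients that are exactly zero indicate which variables were dropped locally. In practice the bandwidth and penalty level would need tuning, for example by cross-validation.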
Bibliographical note
Funding Information:
This work was supported in part by NIH under Grant EB008374, Grant AG041721, Grant AG049371, Grant AG042599, Grant AG053867, Grant EB022880, and Grant R01GM126550 and in part by NSF under Grant IIS1632951 and Grant DMS-1821231.
© 1982-2012 IEEE.
Keywords
- local models
- ordinal classification
- random forests
ASJC Scopus subject areas
- Radiological and Ultrasound Technology
- Computer Science Applications
- Electrical and Electronic Engineering