Abstract
Parametric model-based regression imputation is commonly applied to missing-data problems, but it is sensitive to misspecification of the imputation model. Little and An (2004) proposed a semiparametric approach called penalized spline propensity prediction (PSPP), in which the variable with missing values is modeled by a penalized spline (P-spline) of the response propensity score, defined as the logit of the estimated probability of being missing given the observed variables. Variables other than the response propensity are included parametrically in the imputation model. However, they considered only point estimation based on single imputation with PSPP. We consider here three approaches to standard error estimation that incorporate the uncertainty due to nonresponse: (a) standard errors based on the asymptotic variance of the PSPP estimator, ignoring sampling error in estimating the response propensity; (b) standard errors based on the bootstrap method; and (c) multiple imputation-based standard errors using draws from the joint posterior predictive distribution of missing values under the PSPP model. Simulation studies suggest that the bootstrap and multiple imputation approaches yield good inferences under a range of simulation conditions, with multiple imputation showing some evidence of closer-to-nominal confidence interval coverage when the sample size is small.
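To make the structure of the method concrete, here is a minimal sketch of single-imputation PSPP for a mean, together with a bootstrap standard error in the spirit of approach (b). It is an illustration under stated assumptions, not the authors' implementation: the function names (`fit_logistic`, `spline_basis`, `pspp_mean`, `bootstrap_se`), the truncated linear spline basis, the quantile-based knots, and the fixed smoothing parameter `lam` are all choices made here for brevity. One covariate is dropped from the parametric part because the estimated propensity is itself a linear combination of the covariates.

```python
import numpy as np

def fit_logistic(X, m, n_iter=25, ridge=1e-6):
    """Newton-Raphson logistic regression of indicator m on X (intercept first)."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        eta = np.clip(Xd @ beta, -30, 30)
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)
        grad = Xd.T @ (m - p) - ridge * beta
        hess = (Xd * w[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def spline_basis(z, knots):
    """Truncated linear basis: [1, z, (z - k_1)_+, ..., (z - k_K)_+]."""
    cols = [np.ones_like(z), z] + [np.maximum(z - k, 0.0) for k in knots]
    return np.column_stack(cols)

def pspp_mean(X, y, missing, n_knots=10, lam=1.0):
    """Impute missing y from a penalized spline of the propensity; return the mean."""
    # Step 1: propensity score = logit of the estimated probability of being missing.
    beta = fit_logistic(X, missing.astype(float))
    z = np.column_stack([np.ones(len(X)), X]) @ beta
    # Step 2: design matrix = spline in z plus remaining covariates entered linearly
    # (first covariate dropped to avoid collinearity with z; an assumption here).
    knots = np.quantile(z, np.linspace(0, 1, n_knots + 2)[1:-1])
    C = np.column_stack([spline_basis(z, knots), X[:, 1:]])
    # Penalize only the knot coefficients (assumed fixed smoothing parameter lam).
    d = np.zeros(C.shape[1])
    d[2:2 + n_knots] = 1.0
    C_obs, y_obs = C[~missing], y[~missing]
    A = C_obs.T @ C_obs + lam * np.diag(d)
    theta = np.linalg.lstsq(A, C_obs.T @ y_obs, rcond=None)[0]
    # Step 3: fill in missing values with predictions and average.
    y_completed = np.where(missing, C @ theta, y)
    return y_completed.mean()

def bootstrap_se(X, y, missing, n_boot=200, seed=0):
    """Bootstrap SE: resample units and repeat the whole PSPP procedure each time."""
    rng = np.random.default_rng(seed)
    n = len(y)
    est = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        est[b] = pspp_mean(X[idx], y[idx], missing[idx])
    return est.std(ddof=1)
```

Because each bootstrap replicate refits the propensity model, this standard error reflects the sampling error in the estimated propensity, which approach (a) ignores. Approach (c) would instead require posterior draws of the model parameters and missing values, which this sketch does not attempt.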
Original language | English |
---|---|
Pages (from-to) | 1718-1731 |
Number of pages | 14 |
Journal | Communications in Statistics: Simulation and Computation |
Volume | 37 |
Issue number | 9 |
Publication status | Published - Nov 2008 |
Keywords
- Asymptotic variance
- Bootstrap
- Gibbs sampler
- Missing data
- Multiple imputation
- Penalized spline
- Response propensity
ASJC Scopus subject areas
- Statistics and Probability
- Modelling and Simulation