TY - JOUR

T1 - Using Principal Components as Auxiliary Variables in Missing Data Estimation

AU - Howard, Waylon J.

AU - Rhemtulla, Mijke

AU - Little, Todd D.

N1 - Funding Information:
Funding: This work was supported by Grant 1053160 from the NSF.
Publisher Copyright:
© 2015 Taylor & Francis Group, LLC.

PY - 2015/5/4

Y1 - 2015/5/4

N2 - To deal with missing data that arise due to participant nonresponse or attrition, methodologists have recommended an “inclusive” strategy in which a large set of auxiliary variables is used to inform the missing data process. In practice, the set of possible auxiliary variables is often too large. We propose using principal components analysis (PCA) to reduce the number of possible auxiliary variables to a manageable number. A series of Monte Carlo simulations compared the performance of the inclusive strategy with eight auxiliary variables (inclusive approach) to the PCA strategy using just one principal component derived from the eight original variables (PCA approach). We examined the influence of four independent variables (magnitude of correlations, rate of missing data, missing data mechanism, and sample size) on parameter bias, root mean squared error, and confidence interval coverage. Results indicate that the PCA approach yields unbiased parameter estimates and potentially greater accuracy than the inclusive approach. We conclude that using the PCA strategy to reduce the number of auxiliary variables is an effective and practical way to reap the benefits of the inclusive strategy in the presence of many possible auxiliary variables.

UR - http://www.scopus.com/inward/record.url?scp=84931569171&partnerID=8YFLogxK

U2 - 10.1080/00273171.2014.999267

DO - 10.1080/00273171.2014.999267

M3 - Article

C2 - 26610030

AN - SCOPUS:84931569171

VL - 50

SP - 285

EP - 299

JO - Multivariate Behavioral Research

JF - Multivariate Behavioral Research

SN - 0027-3171

IS - 3

ER -