Extending the state-space model to accommodate missing values in responses and covariates

Arlene Naranjo, A. Alexandre Trindade, George Casella

Research output: Contribution to journal › Article › peer-review

4 Scopus citations


This article proposes an extended state-space model for accommodating multivariate panel data. The novel aspect of this contribution is an adjustment to the classical model for multiple subjects that allows missingness in the covariates in addition to the responses. Missing covariate data are handled by a second state-space model nested inside the first to represent unobserved exogenous information. Relevant Kalman filter equations are derived, and explicit expressions are provided for both the E- and M-steps of an expectation-maximization (EM) algorithm, to obtain maximum (Gaussian) likelihood estimates of all model parameters. In the presence of missing data, the resulting EM algorithm becomes computationally intractable, but a simplification of the M-step leads to a new procedure that is shown to be an expectation/conditional maximization (ECM) algorithm under exogeneity of the covariates. Simulation studies reveal that the approach appears to be relatively robust to moderate percentages of missing data, even with fewer subjects and time points, and that estimates are generally consistent with the asymptotics. The methodology is applied to a dataset from a published panel study of elderly patients with impaired respiratory function. Forecasted values thus obtained may serve as an "early-warning" mechanism for identifying patients whose lung function is nearing critical levels. Supplementary materials for this article are available online.
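As a rough illustration of the standard building block the abstract refers to, the sketch below runs a Kalman filter on a linear Gaussian state-space model and skips the measurement update for response entries that are missing. It is not the authors' extended model: it does not include the nested state-space model for covariates or the EM/ECM estimation steps. The function name `kalman_filter_missing`, the matrix names `F`, `H`, `Q`, `R`, and the convention of marking missing responses with NaN are all assumptions made for this example.

```python
import numpy as np

def kalman_filter_missing(y, F, H, Q, R, x0, P0):
    """Filter x_t = F x_{t-1} + w_t, y_t = H x_t + v_t with NaN-coded missing responses.

    Hypothetical helper for illustration only (not the paper's algorithm).
    y: (T, p) array of responses; NaN marks a missing entry.
    Returns filtered state means (T, k) and covariances (T, k, k).
    """
    T = y.shape[0]
    k = x0.shape[0]
    means = np.zeros((T, k))
    covs = np.zeros((T, k, k))
    x, P = x0, P0
    for t in range(T):
        # Prediction step: propagate the state mean and covariance.
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step: use only the response components observed at time t.
        obs = ~np.isnan(y[t])
        if obs.any():
            Ho = H[obs]
            Ro = R[np.ix_(obs, obs)]
            innov = y[t, obs] - Ho @ x
            S = Ho @ P @ Ho.T + Ro
            K = np.linalg.solve(S, Ho @ P).T  # Kalman gain, P Ho' S^{-1}
            x = x + K @ innov
            P = P - K @ Ho @ P
        # If nothing is observed, the prediction is carried forward unchanged.
        means[t] = x
        covs[t] = P
    return means, covs

# Toy usage: a scalar random walk observed with noise, second response missing.
F = np.array([[1.0]])
H = np.array([[1.0]])
Q = np.array([[0.1]])
R = np.array([[1.0]])
y = np.array([[1.0], [np.nan], [2.0]])
means, covs = kalman_filter_missing(y, F, H, Q, R, np.array([0.0]), np.array([[1.0]]))
```

In this toy run the filtered mean at the missing time point equals the previous filtered mean (because `F` is the identity), while the state variance grows by `Q`, which is the usual way uncertainty accumulates across gaps in the responses.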

Original language: English
Pages (from-to): 202-216
Number of pages: 15
Journal: Journal of the American Statistical Association
Issue number: 501
State: Published - 2013


  • EM algorithm
  • Kalman filter
  • Longitudinal study
  • Panel data
  • Transition model

