TY - JOUR
T1 - High-dimensional ODEs coupled with mixed-effects modeling techniques for dynamic gene regulatory network identification
AU - Lu, Tao
AU - Liang, Hua
AU - Li, Hongzhe
AU - Wu, Hulin
N1 - Funding Information:
Tao Lu is a Ph.D. candidate (E-mail: tao_lu@urmc.rochester.edu), Hua Liang (E-mail: hliang@bst.rochester.edu) and Hulin Wu (E-mail: hwu@bst.rochester.edu) are Professors, Department of Biostatistics and Computational Biology, School of Medicine and Dentistry, University of Rochester, Rochester, New York 14642. Hongzhe Li is Professor, Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, Philadelphia, PA 19104 (E-mail: hongzhe@mail.med.upenn.edu). The authors would like to thank Shenghua Li for his helpful discussions, Jeanne Holden-Wiltse for her editorial help, and the editor, an associate editor, and three referees for their constructive comments and suggestions. This research was partially supported by the NSF grants DMS-0806097 and DMS-1007167 (Liang); the NIH grants ES009911, CA127334, and AG025532 (Li); and the NIAID/NIH grants AI50020, AI078498, AI078842, and AI087135 (Wu).
PY - 2011/12
Y1 - 2011/12
N2 - Gene regulation is a complicated process. The interaction of many genes and their products forms an intricate biological network. Identification of this dynamic network will help us understand the biological processes in a systematic way. However, the construction of a dynamic network is very challenging for a high-dimensional system. In this article we propose to use a set of ordinary differential equations (ODE), coupled with dimensional reduction by clustering and mixed-effects modeling techniques, to model the dynamic gene regulatory network (GRN). The ODE models allow us to quantify both positive and negative gene regulation as well as feedback effects of genes in a functional module on the dynamic expression changes of genes in another functional module, which results in a directed graph network. A five-step procedure, comprising clustering, smoothing, regulation identification, parameter estimates refining, and function enrichment analysis (CSIEF), is developed to identify the ODE-based dynamic GRN. In the proposed CSIEF procedure, a series of cutting-edge statistical methods and techniques are employed, including nonparametric mixed-effects models with a mixture distribution for clustering, nonparametric mixed-effects smoothing-based methods for ODE models, the smoothly clipped absolute deviation (SCAD)-based variable selection, and the stochastic approximation EM (SAEM) approach for mixed-effects ODE model parameter estimation. The key step, the SCAD-based variable selection, is justified by investigating its asymptotic properties and validated by Monte Carlo simulations. We apply the proposed method to identify the dynamic GRN for yeast cell cycle progression data. We are able to annotate the identified modules through function enrichment analyses. Some interesting biological findings are discussed. The proposed procedure is a promising tool for constructing a general dynamic GRN and more complicated dynamic networks. This article has supplementary material online.
AB - Gene regulation is a complicated process. The interaction of many genes and their products forms an intricate biological network. Identification of this dynamic network will help us understand the biological processes in a systematic way. However, the construction of a dynamic network is very challenging for a high-dimensional system. In this article we propose to use a set of ordinary differential equations (ODE), coupled with dimensional reduction by clustering and mixed-effects modeling techniques, to model the dynamic gene regulatory network (GRN). The ODE models allow us to quantify both positive and negative gene regulation as well as feedback effects of genes in a functional module on the dynamic expression changes of genes in another functional module, which results in a directed graph network. A five-step procedure, comprising clustering, smoothing, regulation identification, parameter estimates refining, and function enrichment analysis (CSIEF), is developed to identify the ODE-based dynamic GRN. In the proposed CSIEF procedure, a series of cutting-edge statistical methods and techniques are employed, including nonparametric mixed-effects models with a mixture distribution for clustering, nonparametric mixed-effects smoothing-based methods for ODE models, the smoothly clipped absolute deviation (SCAD)-based variable selection, and the stochastic approximation EM (SAEM) approach for mixed-effects ODE model parameter estimation. The key step, the SCAD-based variable selection, is justified by investigating its asymptotic properties and validated by Monte Carlo simulations. We apply the proposed method to identify the dynamic GRN for yeast cell cycle progression data. We are able to annotate the identified modules through function enrichment analyses. Some interesting biological findings are discussed. The proposed procedure is a promising tool for constructing a general dynamic GRN and more complicated dynamic networks. This article has supplementary material online.
KW - Differential equations
KW - Network graph
KW - Nonparametric
KW - Nonparametric mixed effects
KW - Saccharomyces cerevisiae
KW - Smoothly clipped absolute deviation
KW - Stochastic approximation EM
KW - Time course microarray data
KW - Two-stage smoothing based method
KW - Yeast cell cycles
UR - http://www.scopus.com/inward/record.url?scp=84862970480&partnerID=8YFLogxK
U2 - 10.1198/jasa.2011.ap10194
DO - 10.1198/jasa.2011.ap10194
M3 - Article
AN - SCOPUS:84862970480
SN - 0162-1459
VL - 106
SP - 1242
EP - 1258
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 496
ER -