TY - JOUR
T1 - Pathway crosstalk effects
T2 - shrinkage and disentanglement using a Bayesian hierarchical model
AU - Tomoiaga, Alin
AU - Westfall, Peter
AU - Donato, Michele
AU - Draghici, Sorin
AU - Hassan, Sonia
AU - Romero, Roberto
AU - Tellaroli, Paola
N1 - Publisher Copyright: © 2016, International Chinese Statistical Association.
PY - 2016/10/1
Y1 - 2016/10/1
N2 - Identifying the biological pathways that are related to various clinical phenotypes is an important concern in biomedical research. Based on estimated expression levels and/or p-values, overrepresentation analysis (ORA) methods provide rankings of pathways, but these rankings are distorted because pathways overlap. This crosstalk phenomenon has not been rigorously studied, and classical ORA does not take into account that (1) crosstalk effects among overlapping pathways can cause incorrect rankings of pathways, (2) crosstalk effects can cause both excess type I errors and excess type II errors, (3) rankings of small pathways are unreliable, and (4) type I error rates can be inflated by multiple comparisons of pathways. We develop a Bayesian hierarchical model that addresses these problems, providing sensible estimates and rankings and reducing error rates. We show, on both real and simulated data, that the results of our method are more accurate than those produced by classical overrepresentation analysis, providing a better understanding of the underlying biological phenomena involved in the phenotypes under study. The R code and the binary datasets for implementing the analyses described in this article are available online at: http://www.eng.wayne.edu/page.php?id=6402.
KW - Bayes model
KW - data augmentation
KW - gene expression
KW - genomic pathway analysis
KW - hierarchical modeling
UR - http://www.scopus.com/inward/record.url?scp=84979704500&partnerID=8YFLogxK
U2 - 10.1007/s12561-016-9160-1
DO - 10.1007/s12561-016-9160-1
M3 - Article
AN - SCOPUS:84979704500
SN - 1867-1764
VL - 8
SP - 374
EP - 394
JO - Statistics in Biosciences
JF - Statistics in Biosciences
IS - 2
ER -