Design of probabilistic random forests with applications to anticancer drug sensitivity prediction

Raziur Rahman; Saad Haider; Souparno Ghosh; Ranadip Pal

doi:10.4137/CIN.S30794

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

Random forests consisting of an ensemble of regression trees with equal weights are frequently used for design of predictive models. In this article, we consider an extension of the methodology by representing the regression trees in the form of probabilistic trees and analyzing the nature of heteroscedasticity. The probabilistic tree representation allows for analytical computation of confidence intervals (CIs), and the tree weight optimization is expected to provide stricter CIs with comparable performance in mean error. We approached the ensemble of probabilistic trees’ prediction from the perspectives of a mixture distribution and as a weighted sum of correlated random variables. We applied our methodology to the drug sensitivity prediction problem on synthetic and cancer cell line encyclopedia dataset and illustrated that tree weights can be selected to reduce the average length of the CI without increase in mean error.
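The two perspectives named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each tree supplies a predictive mean and variance for a given sample, that the tree weights are nonnegative and sum to one, and that a covariance matrix between tree predictions is available. The function names and the normal-approximation interval are hypothetical choices made here for illustration only.

```python
from math import sqrt


def mixture_moments(weights, means, variances):
    """Mean and variance when the ensemble is viewed as a mixture distribution.

    The prediction is drawn from tree i with probability weights[i], so
    Var = sum_i w_i * (var_i + mean_i**2) - (sum_i w_i * mean_i)**2.
    """
    mean = sum(w * m for w, m in zip(weights, means))
    second_moment = sum(w * (v + m * m)
                        for w, m, v in zip(weights, means, variances))
    return mean, second_moment - mean * mean


def weighted_sum_moments(weights, means, cov):
    """Mean and variance when the ensemble is a weighted sum of correlated
    random variables: Var = w^T * cov * w, where cov[i][j] is the covariance
    between the predictions of tree i and tree j.
    """
    n = len(weights)
    mean = sum(w * m for w, m in zip(weights, means))
    var = sum(weights[i] * weights[j] * cov[i][j]
              for i in range(n) for j in range(n))
    return mean, var


def normal_interval(mean, var, z=1.96):
    """Approximate 95% interval under a normality assumption (an assumption
    of this sketch, not a claim about how the article derives its CIs)."""
    half_width = z * sqrt(var)
    return mean - half_width, mean + half_width


if __name__ == "__main__":
    w = [0.5, 0.5]                    # equal tree weights
    mu = [1.0, 3.0]                   # per-tree predictive means (made up)
    s2 = [1.0, 1.0]                   # per-tree predictive variances (made up)
    cov = [[1.0, 0.5], [0.5, 1.0]]    # covariance between the two trees (made up)

    print(mixture_moments(w, mu, s2))        # (2.0, 2.0)
    print(weighted_sum_moments(w, mu, cov))  # (2.0, 0.75)
```

Both views give the same mean for the same weights but different variances, so the width of any interval built from them depends on which view is taken and on the weights chosen.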

Original language	English
Pages (from-to)	57-73
Number of pages	17
Journal	Cancer Informatics
Volume	15
DOIs	https://doi.org/10.4137/CIN.S30794
State	Published - Mar 31 2016

Keywords

Drug sensitivity prediction
Heteroscedasticity
Probabilistic random forests
Variance analysis of random forests


Cite this

@article{7539199065d64367a7f411e1785ddff4,

title = "Design of probabilistic random forests with applications to anticancer drug sensitivity prediction",

abstract = "Random forests consisting of an ensemble of regression trees with equal weights are frequently used for design of predictive models. In this article, we consider an extension of the methodology by representing the regression trees in the form of probabilistic trees and analyzing the nature of heteroscedasticity. The probabilistic tree representation allows for analytical computation of confidence intervals (CIs), and the tree weight optimization is expected to provide stricter CIs with comparable performance in mean error. We approached the ensemble of probabilistic trees{\textquoteright} prediction from the perspectives of a mixture distribution and as a weighted sum of correlated random variables. We applied our methodology to the drug sensitivity prediction problem on synthetic and cancer cell line encyclopedia dataset and illustrated that tree weights can be selected to reduce the average length of the CI without increase in mean error.",

keywords = "Drug sensitivity prediction, Heteroscedasticity, Probabilistic random forests, Variance analysis of random forests",

author = "Raziur Rahman and Saad Haider and Souparno Ghosh and Ranadip Pal",

note = "Publisher Copyright: {\textcopyright} the authors, publisher and licensee Libertas Academica Limited.",

year = "2016",

month = mar,

day = "31",

doi = "10.4137/CIN.S30794",

language = "English",

volume = "15",

pages = "57--73",

journal = "Cancer Informatics",

issn = "1176-9351",

}

TY - JOUR

T1 - Design of probabilistic random forests with applications to anticancer drug sensitivity prediction

AU - Rahman, Raziur

AU - Haider, Saad

AU - Ghosh, Souparno

AU - Pal, Ranadip

N1 - Publisher Copyright: © the authors, publisher and licensee Libertas Academica Limited.

PY - 2016/3/31

Y1 - 2016/3/31

N2 - Random forests consisting of an ensemble of regression trees with equal weights are frequently used for design of predictive models. In this article, we consider an extension of the methodology by representing the regression trees in the form of probabilistic trees and analyzing the nature of heteroscedasticity. The probabilistic tree representation allows for analytical computation of confidence intervals (CIs), and the tree weight optimization is expected to provide stricter CIs with comparable performance in mean error. We approached the ensemble of probabilistic trees’ prediction from the perspectives of a mixture distribution and as a weighted sum of correlated random variables. We applied our methodology to the drug sensitivity prediction problem on synthetic and cancer cell line encyclopedia dataset and illustrated that tree weights can be selected to reduce the average length of the CI without increase in mean error.

AB - Random forests consisting of an ensemble of regression trees with equal weights are frequently used for design of predictive models. In this article, we consider an extension of the methodology by representing the regression trees in the form of probabilistic trees and analyzing the nature of heteroscedasticity. The probabilistic tree representation allows for analytical computation of confidence intervals (CIs), and the tree weight optimization is expected to provide stricter CIs with comparable performance in mean error. We approached the ensemble of probabilistic trees’ prediction from the perspectives of a mixture distribution and as a weighted sum of correlated random variables. We applied our methodology to the drug sensitivity prediction problem on synthetic and cancer cell line encyclopedia dataset and illustrated that tree weights can be selected to reduce the average length of the CI without increase in mean error.

KW - Drug sensitivity prediction

KW - Heteroscedasticity

KW - Probabilistic random forests

KW - Variance analysis of random forests

UR - http://www.scopus.com/inward/record.url?scp=84963627678&partnerID=8YFLogxK

U2 - 10.4137/CIN.S30794

DO - 10.4137/CIN.S30794

M3 - Article

AN - SCOPUS:84963627678

SN - 1176-9351

VL - 15

SP - 57

EP - 73

JO - Cancer Informatics

JF - Cancer Informatics

ER -
