TY - JOUR
T1 - Testing for treeness
T2 - Lateral gene transfer, phylogenetic inference, and model selection
AU - Velasco, Joel D.
AU - Sober, Elliott
N1 - Copyright:
Copyright 2010 Elsevier B.V., All rights reserved.
PY - 2010
Y1 - 2010
N2 - A phylogeny that allows for lateral gene transfer (LGT) can be thought of as a strictly branching tree (all of whose branches are vertical) to which lateral branches have been added. Given that the goal of phylogenetics is to depict evolutionary history, we should look for the best supported phylogenetic network and not restrict ourselves to considering trees. However, the obvious extensions of popular tree-based methods such as maximum parsimony and maximum likelihood face a serious problem-if we judge networks by fit to data alone, networks that have lateral branches will always fit the data at least as well as any network that restricts itself to vertical branches. This is analogous to the well-studied problem of overfitting data in the curve-fitting problem. Analogous problems often have analogous solutions and we propose to treat network inference as a case of model selection and use the Akaike Information Criterion (AIC). Strictly tree-like networks are more parsimonious than those that postulate lateral as well as vertical branches. This leads to the conclusion that we should not always infer LGT events whenever it would improve our fit-to-data, but should do so only when the improved fit is larger than the penalty for adding extra lateral branches.
AB - A phylogeny that allows for lateral gene transfer (LGT) can be thought of as a strictly branching tree (all of whose branches are vertical) to which lateral branches have been added. Given that the goal of phylogenetics is to depict evolutionary history, we should look for the best supported phylogenetic network and not restrict ourselves to considering trees. However, the obvious extensions of popular tree-based methods such as maximum parsimony and maximum likelihood face a serious problem-if we judge networks by fit to data alone, networks that have lateral branches will always fit the data at least as well as any network that restricts itself to vertical branches. This is analogous to the well-studied problem of overfitting data in the curve-fitting problem. Analogous problems often have analogous solutions and we propose to treat network inference as a case of model selection and use the Akaike Information Criterion (AIC). Strictly tree-like networks are more parsimonious than those that postulate lateral as well as vertical branches. This leads to the conclusion that we should not always infer LGT events whenever it would improve our fit-to-data, but should do so only when the improved fit is larger than the penalty for adding extra lateral branches.
KW - Akaike Information Criterion
KW - Lateral gene transfer
KW - Model selection
KW - Parsimony
KW - Phylogenetic networks
UR - http://www.scopus.com/inward/record.url?scp=77956010970&partnerID=8YFLogxK
U2 - 10.1007/s10539-010-9222-6
DO - 10.1007/s10539-010-9222-6
M3 - Article
AN - SCOPUS:77956010970
VL - 25
SP - 675
EP - 687
JO - Biology and Philosophy
JF - Biology and Philosophy
SN - 0169-3867
IS - 4
ER -