Abstract
In this article, we consider the problem of comparing two binary classifiers evaluated on the same sample. McNemar's test can be used to compare overall predictive accuracy. However, to evaluate the classifiers in a clinically relevant manner, expected misclassification cost should be accounted for. We show that a Wald-type test can be constructed for this purpose. We further derive a likelihood ratio test for comparison of two classifiers based on expected misclassification cost. The null distribution of the likelihood ratio test statistic is approximated by simulation from strategically chosen parameter values. The properties of the tests are examined through simulation of correlated classification indicators. The Wald-type test has approximate Type I error control while maintaining a power advantage over the likelihood ratio test and is therefore recommended for most applications. If conservative error control is desired, the likelihood ratio test calibrated from several strategically chosen parameter values is recommended. The methods are illustrated on a prospective cohort study of coronary heart disease and on a case-control study of preeclampsia. An interval of misclassification cost ratios for which the Wald-type test rejects the null hypothesis of equal expected misclassification cost is reported. Full simulation results are available as supplementary tables (S1-S6) in the online supplementary materials.
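To make the idea of a cost-based paired comparison concrete, the following Python sketch computes a simple Wald-type z statistic for the difference in mean misclassification cost between two classifiers scored on the same subjects. It is an illustration under stated assumptions, not the statistic derived in the article: it assumes cohort-style sampling (it does not handle case-control designs, where class proportions are fixed by design), uses a plug-in variance of the per-subject cost differences, and refers the statistic to a standard normal distribution. The names `misclassification_cost`, `wald_cost_test`, `cost_fp`, and `cost_fn` are hypothetical, and the labels in the usage example are made-up toy values.

```python
import math


def misclassification_cost(y, pred, cost_fp, cost_fn):
    """Cost of one prediction: cost_fn for a missed case, cost_fp for a false alarm."""
    if y == 1 and pred == 0:
        return cost_fn
    if y == 0 and pred == 1:
        return cost_fp
    return 0.0


def wald_cost_test(y_true, pred_a, pred_b, cost_fp=1.0, cost_fn=1.0):
    """Paired Wald-type z-test of equal expected misclassification cost.

    Assumption (not from the article): the statistic is the mean per-subject
    cost difference (A minus B) divided by its plug-in standard error.
    Returns (z, two-sided p-value from the standard normal).
    """
    n = len(y_true)
    diffs = [
        misclassification_cost(y, a, cost_fp, cost_fn)
        - misclassification_cost(y, b, cost_fp, cost_fn)
        for y, a, b in zip(y_true, pred_a, pred_b)
    ]
    mean_diff = sum(diffs) / n
    # Plug-in (divide-by-n) variance of the paired cost differences.
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / n
    if var_diff == 0.0:
        raise ValueError("cost differences are constant; the statistic is undefined")
    z = mean_diff / math.sqrt(var_diff / n)
    # Two-sided normal p-value: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


if __name__ == "__main__":
    # Toy labels for illustration only (1 = diseased, 0 = healthy).
    y = [1, 1, 1, 1, 0, 0, 0, 0]
    a = [0, 0, 1, 1, 0, 0, 0, 0]  # classifier A misses two cases
    b = [1, 1, 1, 1, 1, 1, 0, 0]  # classifier B raises two false alarms
    for ratio in (0.5, 1.0, 5.0):
        z, p = wald_cost_test(y, a, b, cost_fp=1.0, cost_fn=ratio)
        print(f"cost_fn/cost_fp = {ratio}: z = {z:.3f}, p = {p:.3f}")
```

Looping over several values of `cost_fn / cost_fp`, as in the usage example, mirrors the idea of reporting the range of cost ratios over which a test rejects. With equal costs the statistic depends only on the discordant pairs, as in McNemar's test, although the variance estimate used here differs.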
Original language | English |
---|---|
Pages (from-to) | 301-312 |
Number of pages | 12 |
Journal | Statistics in Biopharmaceutical Research |
Volume | 4 |
Issue number | 3 |
State | Published - Jul 2012 |
Keywords
- Likelihood ratio test
- Multinomial
- Power
- Type I error
- Wald-type test