Comparing the Expected Misclassification Cost for Two Classifiers Based on Estimates From the Same Sample

James F. Troendle, Kai F. Yu, Peter H. Westfall, Gene Pennello, Enrique F. Schisterman

Research output: Contribution to journal › Article › peer-review


In this article, we consider the problem of comparing two binary classifiers evaluated on the same sample. McNemar's test can be used to compare overall predictive accuracy. However, to evaluate the classifiers in a clinically relevant manner, expected misclassification cost should be accounted for. We show that a Wald-type test can be constructed for this purpose. We further derive a likelihood ratio test for comparison of two classifiers based on expected misclassification cost. The null distribution of the test statistic is approximated by simulation from strategically chosen parameter values. The properties of the tests are examined through simulation of correlated classification indicators. The Wald-type test has approximate Type I error control while maintaining a power advantage over the likelihood ratio test and is therefore recommended for most applications. If conservative error control is desired, the likelihood ratio test calibrated from several strategically chosen parameter values is recommended. The methods are illustrated on a prospective cohort study of coronary heart disease and also on a case-control study of preeclampsia. An interval of misclassification cost ratios for which the Wald test rejects the null hypothesis of equal expected misclassification cost is reported. Full simulation results are available as supplementary tables (S1-S6) in the online supplementary materials.
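As a rough illustration of the kind of comparison the abstract describes, the sketch below computes a Wald-type statistic for the difference in expected misclassification cost between two classifiers scored on the same subjects. It is not the authors' statistic. It assumes a single random sample (cohort-style sampling), a plug-in variance of the paired per-subject cost differences, and a standard normal reference distribution. The function names and the parameters `cost_fp` and `cost_fn` are hypothetical, chosen only for this example.

```python
import math


def misclassification_cost(truth, prediction, cost_fp, cost_fn):
    """Cost of one prediction: cost_fn for a missed case, cost_fp for a false alarm."""
    if truth == 1 and prediction == 0:
        return cost_fn
    if truth == 0 and prediction == 1:
        return cost_fp
    return 0.0


def wald_cost_test(y, pred_a, pred_b, cost_fp=1.0, cost_fn=1.0):
    """Wald-type test of equal expected misclassification cost (illustrative sketch).

    Assumes y, pred_a and pred_b are 0/1 sequences for the same subjects,
    drawn as one random sample. Returns (mean_diff, se, z, p_value), where
    mean_diff estimates cost(A) - cost(B) per subject.
    """
    if len(y) == 0 or not (len(y) == len(pred_a) == len(pred_b)):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y)

    # Paired per-subject cost differences keep the correlation between
    # the two classifiers, since both are scored on the same subjects.
    diffs = [
        misclassification_cost(t, a, cost_fp, cost_fn)
        - misclassification_cost(t, b, cost_fp, cost_fn)
        for t, a, b in zip(y, pred_a, pred_b)
    ]

    mean_diff = sum(diffs) / n
    # Plug-in variance of a single difference (divides by n, not n - 1).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / n
    se = math.sqrt(var_diff / n)
    if se == 0.0:
        raise ValueError("no variation in cost differences; statistic undefined")

    z = mean_diff / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return mean_diff, se, z, p_value
```

With equal costs the estimate depends only on the discordant classifications, in the spirit of McNemar's test. Repeating the call over a grid of `cost_fn / cost_fp` values gives a crude picture of which cost ratios lead to rejection. Data from a case-control design would need reweighting by an assumed prevalence, which this sketch does not attempt.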

Original language: English
Pages (from-to): 301-312
Number of pages: 12
Journal: Statistics in Biopharmaceutical Research
Issue number: 3
State: Published - Jul 2012


  • Likelihood ratio test
  • Multinomial
  • Power
  • Type I error
  • Wald-type test

