Persistent confusion in nutrition and obesity research about the validity of classic nonparametric tests in the presence of heteroscedasticity: Evidence of the problem and valid alternatives

Cynthia M. Kroeger; Keisuke Ejima; Bridget A. Hannon; Tanya M. Halliday; Bryan McComb; Margarita Teran-Garcia; John A. Dawson; David B. King; Andrew W. Brown; David B. Allison

doi:10.1093/ajcn/nqaa357

Nutritional Sciences

Research output: Contribution to journal › Review article › peer-review

3 Scopus citations

Abstract

The use of classic nonparametric tests (cNPTs), such as the Kruskal-Wallis and Mann-Whitney U tests, in the presence of unequal variance for between-group comparisons of means and medians may lead to marked increases in the rate of falsely rejecting null hypotheses and decreases in statistical power. Yet, this practice remains prevalent in the scientific literature, including nutrition and obesity literature. Some nutrition and obesity studies use a cNPT in the presence of unequal variance (i.e., heteroscedasticity), sometimes because of the mistaken rationale that the test corrects for heteroscedasticity. Herein, we discuss misconceptions of using cNPTs in the presence of heteroscedasticity. We then discuss assumptions, purposes, and limitations of 3 common tests used to test for mean differences between multiple groups, including 2 parametric tests: Fisher's ANOVA and Welch's ANOVA; and 1 cNPT: The Kruskal-Wallis test. To document the impact of heteroscedasticity on the validity of these tests under conditions similar to those used in nutrition and obesity research, we conducted simple simulations and assessed type I error rates (i.e., false positives, defined as incorrectly rejecting the null hypothesis). We demonstrate that type I error rates for Fisher's ANOVA, which does not account for heteroscedasticity, and Kruskal-Wallis, which tests for differences in distributions rather than means, deviated from the expected significance level. Greater deviation from the expected type I error rate was observed as the heterogeneity increased, especially in the presence of an imbalanced sample size. We provide brief tutorial guidance for authors, editors, and reviewers to identify appropriate statistical tests when test assumptions are violated, with a particular focus on cNPTs.
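The kind of simulation the abstract describes can be sketched in a few lines. This is a minimal illustration, not the authors' code: the group sizes (30, 30, 10), the standard deviations (1, 1, 1 versus 1, 1, 4), the normal distributions, the number of replicates, and all function names are assumptions chosen for the example. It uses exactly three groups because the F distribution with 2 numerator degrees of freedom and the chi-square distribution with 2 degrees of freedom both have closed-form tail probabilities, so no statistics library is needed.

```python
import math
import random
from statistics import fmean, variance

ALPHA = 0.05  # nominal significance level


def f_sf_df1_2(f, df2):
    """P(F > f) for an F(2, df2) variable; this closed form holds only for df1 = 2."""
    return (1.0 + 2.0 * f / df2) ** (-df2 / 2.0)


def fisher_p(groups):
    """Fisher's one-way ANOVA p-value (assumes equal variances, 3 groups)."""
    k, n_total = len(groups), sum(len(g) for g in groups)
    grand = fmean([x for g in groups for x in g])
    ssb = sum(len(g) * (fmean(g) - grand) ** 2 for g in groups)
    ssw = sum((len(g) - 1) * variance(g) for g in groups)
    return f_sf_df1_2((ssb / (k - 1)) / (ssw / (n_total - k)), n_total - k)


def welch_p(groups):
    """Welch's ANOVA p-value (does not assume equal variances, 3 groups)."""
    k = len(groups)
    w = [len(g) / variance(g) for g in groups]  # precision weights n_i / s_i^2
    w_sum = sum(w)
    means = [fmean(g) for g in groups]
    mean_w = sum(wi * m for wi, m in zip(w, means)) / w_sum
    lam = sum((1 - wi / w_sum) ** 2 / (len(g) - 1) for wi, g in zip(w, groups))
    num = sum(wi * (m - mean_w) ** 2 for wi, m in zip(w, means)) / (k - 1)
    den = 1 + 2 * (k - 2) / (k * k - 1) * lam
    return f_sf_df1_2(num / den, (k * k - 1) / (3 * lam))


def kruskal_p(groups):
    """Kruskal-Wallis p-value via the chi-square approximation (3 groups, no ties)."""
    pooled = sorted((x, i) for i, g in enumerate(groups) for x in g)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    for rank, (_, i) in enumerate(pooled, start=1):
        rank_sums[i] += rank
    h = 12 / (n_total * (n_total + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)
    ) - 3 * (n_total + 1)
    return math.exp(-h / 2)  # chi-square tail probability with 2 df


def type1_rates(sizes, sds, n_sim=2000, seed=1):
    """Share of replicates rejecting at ALPHA when all group means are equal (zero)."""
    rng = random.Random(seed)
    tests = {"Fisher": fisher_p, "Welch": welch_p, "Kruskal-Wallis": kruskal_p}
    rejections = dict.fromkeys(tests, 0)
    for _ in range(n_sim):
        groups = [[rng.gauss(0.0, sd) for _ in range(n)] for n, sd in zip(sizes, sds)]
        for name, test in tests.items():
            rejections[name] += test(groups) < ALPHA
    return {name: count / n_sim for name, count in rejections.items()}


print("equal variances:  ", type1_rates((30, 30, 10), (1, 1, 1)))
print("unequal variances:", type1_rates((30, 30, 10), (1, 1, 4)))
```

In the second scenario the smallest group has the largest variance, which is the combination of imbalance and heteroscedasticity that the abstract singles out. All three groups still share the same mean and median, so every rejection counts as a false positive for a comparison of means or medians.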

Original language	English
Pages (from-to)	517-524
Number of pages	8
Journal	American Journal of Clinical Nutrition
Volume	113
Issue number	3
DOIs	https://doi.org/10.1093/ajcn/nqaa357
State	Published - Mar 1 2021

Keywords

association
causation
heteroscedasticity
nonparametric tests
nutrition
obesity
research rigor
statistical methods

Cite this

Kroeger, C. M., Ejima, K., Hannon, B. A., Halliday, T. M., McComb, B., Teran-Garcia, M., Dawson, J. A., King, D. B., Brown, A. W., & Allison, D. B. (2021). Persistent confusion in nutrition and obesity research about the validity of classic nonparametric tests in the presence of heteroscedasticity: Evidence of the problem and valid alternatives. American Journal of Clinical Nutrition, 113(3), 517-524. https://doi.org/10.1093/ajcn/nqaa357

@article{1514af0914b640c4a98b193d307ad29d,

title = "Persistent confusion in nutrition and obesity research about the validity of classic nonparametric tests in the presence of heteroscedasticity: Evidence of the problem and valid alternatives",

abstract = "The use of classic nonparametric tests (cNPTs), such as the Kruskal-Wallis and Mann-Whitney U tests, in the presence of unequal variance for between-group comparisons of means and medians may lead to marked increases in the rate of falsely rejecting null hypotheses and decreases in statistical power. Yet, this practice remains prevalent in the scientific literature, including nutrition and obesity literature. Some nutrition and obesity studies use a cNPT in the presence of unequal variance (i.e., heteroscedasticity), sometimes because of the mistaken rationale that the test corrects for heteroscedasticity. Herein, we discuss misconceptions of using cNPTs in the presence of heteroscedasticity. We then discuss assumptions, purposes, and limitations of 3 common tests used to test for mean differences between multiple groups, including 2 parametric tests: Fisher's ANOVA and Welch's ANOVA; and 1 cNPT: The Kruskal-Wallis test. To document the impact of heteroscedasticity on the validity of these tests under conditions similar to those used in nutrition and obesity research, we conducted simple simulations and assessed type I error rates (i.e., false positives, defined as incorrectly rejecting the null hypothesis). We demonstrate that type I error rates for Fisher's ANOVA, which does not account for heteroscedasticity, and Kruskal-Wallis, which tests for differences in distributions rather than means, deviated from the expected significance level. Greater deviation from the expected type I error rate was observed as the heterogeneity increased, especially in the presence of an imbalanced sample size. We provide brief tutorial guidance for authors, editors, and reviewers to identify appropriate statistical tests when test assumptions are violated, with a particular focus on cNPTs.",

keywords = "association, causation, heteroscedasticity, nonparametric tests, nutrition, obesity, research rigor, statistical methods",

author = "Kroeger, {Cynthia M.} and Keisuke Ejima and Hannon, {Bridget A.} and Halliday, {Tanya M.} and Bryan McComb and Margarita Teran-Garcia and Dawson, {John A.} and King, {David B.} and Brown, {Andrew W.} and Allison, {David B.}",

note = "Publisher Copyright: {\textcopyright} 2021 The Author(s). Published by Oxford University Press on behalf of the American Society for Nutrition.",

year = "2021",

month = mar,

day = "1",

doi = "10.1093/ajcn/nqaa357",

language = "English",

volume = "113",

pages = "517--524",

journal = "American Journal of Clinical Nutrition",

issn = "0002-9165",

publisher = "Oxford University Press (OUP)",

number = "3",

}

Kroeger, CM, Ejima, K, Hannon, BA, Halliday, TM, McComb, B, Teran-Garcia, M, Dawson, JA, King, DB, Brown, AW & Allison, DB 2021, 'Persistent confusion in nutrition and obesity research about the validity of classic nonparametric tests in the presence of heteroscedasticity: Evidence of the problem and valid alternatives', American Journal of Clinical Nutrition, vol. 113, no. 3, pp. 517-524. https://doi.org/10.1093/ajcn/nqaa357

Persistent confusion in nutrition and obesity research about the validity of classic nonparametric tests in the presence of heteroscedasticity: Evidence of the problem and valid alternatives. / Kroeger, Cynthia M.; Ejima, Keisuke; Hannon, Bridget A. et al.
In: American Journal of Clinical Nutrition, Vol. 113, No. 3, 01.03.2021, p. 517-524.

TY - JOUR

T1 - Persistent confusion in nutrition and obesity research about the validity of classic nonparametric tests in the presence of heteroscedasticity

T2 - Evidence of the problem and valid alternatives

AU - Kroeger, Cynthia M.

AU - Ejima, Keisuke

AU - Hannon, Bridget A.

AU - Halliday, Tanya M.

AU - McComb, Bryan

AU - Teran-Garcia, Margarita

AU - Dawson, John A.

AU - King, David B.

AU - Brown, Andrew W.

AU - Allison, David B.

PY - 2021/3/1

Y1 - 2021/3/1

N2 - The use of classic nonparametric tests (cNPTs), such as the Kruskal-Wallis and Mann-Whitney U tests, in the presence of unequal variance for between-group comparisons of means and medians may lead to marked increases in the rate of falsely rejecting null hypotheses and decreases in statistical power. Yet, this practice remains prevalent in the scientific literature, including nutrition and obesity literature. Some nutrition and obesity studies use a cNPT in the presence of unequal variance (i.e., heteroscedasticity), sometimes because of the mistaken rationale that the test corrects for heteroscedasticity. Herein, we discuss misconceptions of using cNPTs in the presence of heteroscedasticity. We then discuss assumptions, purposes, and limitations of 3 common tests used to test for mean differences between multiple groups, including 2 parametric tests: Fisher's ANOVA and Welch's ANOVA; and 1 cNPT: The Kruskal-Wallis test. To document the impact of heteroscedasticity on the validity of these tests under conditions similar to those used in nutrition and obesity research, we conducted simple simulations and assessed type I error rates (i.e., false positives, defined as incorrectly rejecting the null hypothesis). We demonstrate that type I error rates for Fisher's ANOVA, which does not account for heteroscedasticity, and Kruskal-Wallis, which tests for differences in distributions rather than means, deviated from the expected significance level. Greater deviation from the expected type I error rate was observed as the heterogeneity increased, especially in the presence of an imbalanced sample size. We provide brief tutorial guidance for authors, editors, and reviewers to identify appropriate statistical tests when test assumptions are violated, with a particular focus on cNPTs.

KW - association

KW - causation

KW - heteroscedasticity

KW - nonparametric tests

KW - nutrition

KW - obesity

KW - research rigor

KW - statistical methods

UR - http://www.scopus.com/inward/record.url?scp=85102906879&partnerID=8YFLogxK

U2 - 10.1093/ajcn/nqaa357

DO - 10.1093/ajcn/nqaa357

M3 - Review article

C2 - 33515017

AN - SCOPUS:85102906879

SN - 0002-9165

VL - 113

SP - 517

EP - 524

JO - American Journal of Clinical Nutrition

JF - American Journal of Clinical Nutrition

IS - 3

ER -
