TY - JOUR

T1 - Teaching Principal Components Using Correlations

AU - Westfall, Peter H.

AU - Arias, Andrea L.

AU - Fulton, Lawrence V.

N1 - Publisher Copyright: © 2017 Taylor & Francis Group, LLC.

PY - 2017/9/3

Y1 - 2017/9/3

AB - Introducing principal components (PCs) to students is difficult. First, the matrix algebra and mathematical maximization lemmas are daunting, especially for students in the social and behavioral sciences. Second, the standard motivation involving variance maximization subject to unit length constraint does not directly connect to the “variance explained” interpretation. Third, the unit length and uncorrelatedness constraints of the standard motivation do not allow re-scaling or oblique rotations, which are common in practice. Instead, we propose to motivate the subject in terms of optimizing (weighted) average proportions of variance explained in the original variables; this approach may be more intuitive, and hence easier to understand because it links directly to the familiar “R-squared” statistic. It also removes the need for unit length and uncorrelatedness constraints, provides a direct interpretation of “variance explained,” and provides a direct answer to the question of whether to use covariance-based or correlation-based PCs. Furthermore, the presentation can be made without matrix algebra or optimization proofs. Modern tools from data science, including heat maps and text mining, provide further help in the interpretation and application of PCs; examples are given. Together, these techniques may be used to revise currently used methods for teaching and learning PCs in the behavioral sciences.

KW - Factor analysis

KW - heat map

KW - optimality

KW - rotation

KW - variance explained

UR - http://www.scopus.com/inward/record.url?scp=85024399150&partnerID=8YFLogxK

U2 - 10.1080/00273171.2017.1340824

DO - 10.1080/00273171.2017.1340824

M3 - Article

C2 - 28715259

AN - SCOPUS:85024399150

VL - 52

SP - 648

EP - 660

JO - Multivariate Behavioral Research

JF - Multivariate Behavioral Research

SN - 0027-3171

IS - 5

ER -