13. Factor Analysis
Published Online: 27 MAR 2003
Copyright © 2002 John Wiley & Sons, Inc.
Methods of Multivariate Analysis, Second Edition
How to Cite
Rencher, A. C. (2002) Factor Analysis, in Methods of Multivariate Analysis, Second Edition, John Wiley & Sons, Inc., New York, NY, USA. doi: 10.1002/0471271357.ch13
Published Print: 22 FEB 2002
Print ISBN: 9780471418894
Online ISBN: 9780471271352
Keywords:
- common factors;
- latent variables;
- specific variance;
- spectral decomposition;
- principal factor method;
- iterated principal factor method;
- Heywood case;
- maximum likelihood;
- scree test;
- orthogonal rotation;
- oblique rotation;
- varimax rotation;
- simple structure;
- complexity of a variable;
- pattern matrix;
- structure matrix;
- factor scores;
- measure of sampling adequacy
In factor analysis we represent each of the variables y1, y2, …, yp as a linear combination of a few random variables f1, f2, …, fm (m < p) called factors. The coefficients of the factors are called loadings. The factors are underlying constructs or latent variables that ‘generate’ the y's. Like the original variables, the factors vary from individual to individual; but unlike the variables, the factors cannot be measured or observed.
If the original variables y1, y2, …, yp are at least moderately correlated, the basic dimensionality of the system is less than p. The goal of factor analysis is to reduce the redundancy among the variables by using a smaller number of factors.
Suppose the pattern of high and low correlations in the correlation matrix is such that the variables in a particular subset have high correlations among themselves but low correlations with all the other variables. Then there may be a single underlying factor that gives rise to the variables in the subset. If the other variables can be similarly grouped into subsets with a like pattern of correlations, then a few factors can represent these groups of variables. In this case the pattern in the correlation matrix corresponds directly to the factors.
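This block structure in the correlation matrix can be seen in a small simulation of the model y = Λf + ε. The loading matrix, sample size, and noise level below are illustrative choices, not values from the chapter:

```python
import numpy as np

# Simulate y = Lambda f + eps for p = 6 variables and m = 2 factors.
# Variables 1-3 load on factor 1, variables 4-6 on factor 2, so the
# correlation matrix shows two blocks of high within-group correlations
# and near-zero correlations across the blocks.
rng = np.random.default_rng(0)
n, p, m = 500, 6, 2
Lambda = np.array([[0.9, 0.0],   # loadings (illustrative values)
                   [0.8, 0.0],
                   [0.7, 0.0],
                   [0.0, 0.8],
                   [0.0, 0.9],
                   [0.0, 0.7]])
f = rng.standard_normal((n, m))           # unobserved common factors
eps = 0.4 * rng.standard_normal((n, p))   # specific (unique) errors
Y = f @ Lambda.T + eps

R = np.corrcoef(Y, rowvar=False)
print(np.round(R, 2))  # high within each block, near zero across blocks
```

Because the factors are latent, only Y and its correlation matrix R would be observed in practice; factor analysis works backward from R to recover something like Lambda.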
Certain basic assumptions are made about the factors and the error term in the model. These assumptions lead to an expression for the covariance matrix of the y's as a function of the loadings (coefficients of the factors in the model). Various methods of estimating the loadings are discussed.
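Under those assumptions the covariance matrix decomposes as Σ = ΛΛ′ + Ψ, with Ψ diagonal (the specific variances). One simple estimator of the loadings takes them from the spectral decomposition of the correlation matrix; the 3×3 matrix below is a made-up example, and this principal-component-style extraction is a sketch of one of the estimation methods, not the chapter's full procedure:

```python
import numpy as np

# The model implies Sigma = Lambda Lambda' + Psi with Psi diagonal.
# Estimate Lambda from the largest eigenvalues/eigenvectors of R.
R = np.array([[1.00, 0.70, 0.60],    # illustrative correlation matrix
              [0.70, 1.00, 0.65],
              [0.60, 0.65, 1.00]])
m = 1                                 # number of factors retained

eigvals, eigvecs = np.linalg.eigh(R)  # returned in ascending order
order = np.argsort(eigvals)[::-1]     # sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

Lambda_hat = eigvecs[:, :m] * np.sqrt(eigvals[:m])   # p x m loadings
communality = (Lambda_hat ** 2).sum(axis=1)          # variance from the factors
Psi_hat = np.diag(R) - communality                   # specific variances
```

By construction the communalities and specific variances sum to the diagonal of R, mirroring the decomposition of each variable's variance into a common and a specific part.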
In practice, there are some data sets for which the factor analysis model does not provide a satisfactory fit. Sometimes a few easily interpretable factors emerge, but for other data sets, neither the number of factors nor the interpretation is clear. Some possible reasons for these failures are discussed in Section 13.7.
There are four common methods that can be used to decide how many factors to use in fitting the model. Three of the four methods are based on the eigenvalues of the covariance (or correlation) matrix. These three methods usually agree for a data set that is fitted well by the factor analysis model.
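Two of the eigenvalue-based criteria are easy to sketch: retain eigenvalues above their average (which is 1 for a correlation matrix), or retain enough factors to explain a chosen proportion of variance; the scree test simply plots the sorted eigenvalues against their index and looks for a bend. The toy matrix and the 80% threshold below are illustrative:

```python
import numpy as np

# Two eigenvalue-based criteria for choosing m, applied to a toy
# correlation matrix with two clear pairs of correlated variables.
R = np.array([[1.0, 0.8, 0.0, 0.0],
              [0.8, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.8],
              [0.0, 0.0, 0.8, 1.0]])
eig = np.sort(np.linalg.eigvalsh(R))[::-1]     # 1.8, 1.8, 0.2, 0.2

m_avg = int(np.sum(eig > eig.mean()))          # eigenvalues above the average
cum_prop = np.cumsum(eig) / eig.sum()          # cumulative proportion of variance
m_prop = int(np.searchsorted(cum_prop, 0.80) + 1)  # smallest m reaching 80%

print(m_avg, m_prop)  # both criteria suggest m = 2 here
```

When the criteria disagree markedly, that disagreement is itself a warning that the factor model may not fit the data well.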
The loadings can be rotated without losing any essential properties. This usually results in factors that are more interpretable.
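An orthogonal rotation multiplies the loading matrix by an orthogonal matrix T, which leaves the communalities (and the fitted covariance structure) unchanged. The sketch below uses the SVD-based formulation of Kaiser's varimax criterion, which is a standard algorithm but an assumption on my part rather than the chapter's own derivation; the unrotated loadings are made up:

```python
import numpy as np

def varimax(Lam, tol=1e-8, max_iter=100):
    """Rotate loadings Lam (p x m) by an orthogonal matrix T to maximize
    the varimax criterion; returns the rotated loadings and T."""
    p, m = Lam.shape
    T = np.eye(m)
    d = 0.0
    for _ in range(max_iter):
        L = Lam @ T
        u, s, vt = np.linalg.svd(
            Lam.T @ (L ** 3 - L @ np.diag((L ** 2).sum(axis=0)) / p))
        T = u @ vt                      # nearest orthogonal update
        if s.sum() < d * (1.0 + tol):   # criterion stopped improving
            break
        d = s.sum()
    return Lam @ T, T

# Unrotated loadings where both factors load on every variable (illustrative).
Lam = np.array([[0.7,  0.5],
                [0.7,  0.4],
                [0.6, -0.5],
                [0.6, -0.6]])
L_rot, T = varimax(Lam)
```

After rotation, each variable tends to load mainly on one factor (simple structure), which is what makes the rotated solution easier to interpret.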
In some applications, we are interested in factor scores, which are estimates of the underlying factor values for each observation. These can be obtained from the y's by a regression method.
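For standardized data Z with sample correlation matrix R and estimated loadings Λ̂, the regression-method scores can be written F = Z R⁻¹ Λ̂. The simulated data and the principal-component loading estimate below are illustrative stand-ins for a real analysis:

```python
import numpy as np

# Regression-method factor scores: F = Z R^{-1} Lambda_hat for
# standardized data Z. Everything here is simulated for illustration.
rng = np.random.default_rng(1)
n, p, m = 300, 4, 1
true_load = np.array([[0.9], [0.8], [0.8], [0.7]])
f = rng.standard_normal((n, m))                      # latent factor values
Y = f @ true_load.T + 0.4 * rng.standard_normal((n, p))

Z = (Y - Y.mean(axis=0)) / Y.std(axis=0)             # standardize
R = np.corrcoef(Z, rowvar=False)

# principal-component estimate of the loadings (one factor)
w, V = np.linalg.eigh(R)
Lambda_hat = V[:, [-1]] * np.sqrt(w[-1])

F = Z @ np.linalg.solve(R, Lambda_hat)               # n x m factor scores
```

In this simulation the true factor values are known, so the scores can be checked against them; their correlation is high in absolute value (the sign is arbitrary, since loadings are only determined up to sign).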
The techniques in this chapter are amply illustrated with real data sets, and the problems at the end of the chapter further develop and illustrate the methods.