11. Canonical Correlation
Published Online: 27 MAR 2003
Copyright © 2002 John Wiley & Sons, Inc.
Methods of Multivariate Analysis, Second Edition
How to Cite
Rencher, A. C. (2002) Canonical Correlation, in Methods of Multivariate Analysis, Second Edition, John Wiley & Sons, Inc., New York, NY, USA. doi: 10.1002/0471271357.ch11
- Published Print: 22 FEB 2002
Print ISBN: 9780471418894
Online ISBN: 9780471271352
- multiple correlation;
- canonical correlation;
- canonical variate;
- characteristic equation;
- standardized coefficient vectors;
- structure coefficients;
- redundancy analysis;
- dummy variables
Canonical correlation analysis is concerned with the amount of (linear) relationship between two sets of variables y = (y1, y2, …, yp)′ and x = (x1, x2, …, xq)′ measured on the same sampling unit. We denote the two sets of variables as y and x to conform to notation in Chapters 3, 7, and 10. In Chapter 7, we discussed the hypothesis that y and x were independent. In Chapter 10, we regressed y on x. In this chapter, we consider a measure of overall correlation between y and x.
Canonical correlation analysis is often a useful complement to a multivariate regression analysis. Canonical correlation is an extension of multiple correlation, R, which is the correlation between one y and several x's (see Section 10.2.6). The multiple correlation can be defined as the maximum correlation between y and a linear combination of the x's. Correspondingly, the (first) canonical correlation can be defined as the maximum correlation between a linear combination of the y's and a linear combination of the x's. The number of nonzero canonical correlations is equal to the smaller of p and q, where p is the number of y's and q is the number of x's. The squared canonical correlations are the eigenvalues of a matrix product built from the sample variances and covariances of the y's and x's. The coefficient vectors in the linear combinations (canonical variates) are found as the eigenvectors corresponding to these eigenvalues.
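The eigenvalue route described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the chapter's own code: it assumes the standard matrix product Syy⁻¹ Syx Sxx⁻¹ Sxy (which the abstract does not spell out), and the function name `canonical_correlations` and the data matrices `Y` (n × p) and `X` (n × q) are hypothetical.

```python
import numpy as np

def canonical_correlations(Y, X):
    """Return (r, A): the canonical correlations in descending order and
    the matching coefficient vectors for the y's, stored as columns of A."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float)
    p, q = Y.shape[1], X.shape[1]
    s = min(p, q)  # number of canonical correlations

    # Partitioned sample covariance matrix of the stacked variables (y, x).
    S = np.cov(np.hstack([Y, X]), rowvar=False)
    Syy, Syx = S[:p, :p], S[:p, p:]
    Sxy, Sxx = S[p:, :p], S[p:, p:]

    # Assumed form: eigenvalues of Syy^-1 Syx Sxx^-1 Sxy are the squared
    # canonical correlations; its eigenvectors are the y coefficient vectors.
    M = np.linalg.solve(Syy, Syx) @ np.linalg.solve(Sxx, Sxy)
    w, V = np.linalg.eig(M)
    w, V = w.real, V.real  # eigenvalues are real up to rounding error

    order = np.argsort(w)[::-1][:s]          # largest eigenvalues first
    r = np.sqrt(np.clip(w[order], 0.0, 1.0))  # guard against tiny negatives
    return r, V[:, order]
```

With a single y and a single x, the one canonical correlation returned is the absolute value of the ordinary correlation, which gives a quick sanity check.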
Canonical correlations and canonical variates are related computationally and conceptually to multivariate regression, MANOVA, and discriminant analysis.
The canonical correlations (all of them) can be tested for significance using Wilks' Λ. This test is equivalent to the test for independence of y and x in Section 7.4.1 and to the test for overall regression of y on x in Section 10.5.1. Even though these three tests are equivalent, we consider them separately because each of the three has an extension that is different from the other two (see Section 11.4.1).
If the test for all canonical correlations is significant, it is of interest to test each of them to see which are significant. This is done by partitioning Wilks' Λ, and approximate chi-square and F tests are available.
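The partitioning can be illustrated with a short sketch. It assumes the usual form Λ_k = ∏ (1 − r_i²) over i = k, …, s, together with Bartlett's chi-square approximation −[n − ½(p + q + 3)] ln Λ_k on (p − k + 1)(q − k + 1) degrees of freedom; the exact approximations used in the chapter may differ. The function name `wilks_partition` and its arguments are hypothetical.

```python
import math

def wilks_partition(r, n, p, q):
    """For k = 1..s, return (Lambda_k, chi2_k, df_k), where Lambda_k tests
    whether canonical correlations k, k+1, ..., s are all zero.

    r : canonical correlations in descending order
    n : sample size; p, q : number of y's and x's
    """
    results = []
    for k in range(1, len(r) + 1):
        # Lambda_k uses only the correlations from the k-th onward.
        lam = math.prod(1.0 - ri ** 2 for ri in r[k - 1:])
        # Assumed Bartlett-type chi-square approximation.
        chi2 = -(n - 0.5 * (p + q + 3)) * math.log(lam)
        df = (p - k + 1) * (q - k + 1)
        results.append((lam, chi2, df))
    return results
```

The first tuple (k = 1) corresponds to the overall test of all canonical correlations; later tuples test the remaining correlations after the larger ones are set aside. Each chi-square value would be compared with the upper percentage point of a chi-square distribution on the stated degrees of freedom.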
Various approaches to interpretation of the canonical variates (the linear combinations of the y's and of the x's that correspond to the canonical correlations) are discussed. The standardized coefficients are the most useful.
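As a rough illustration of the standardized coefficients, the sketch below rescales a raw coefficient vector by the sample standard deviations of the corresponding variables, so that each coefficient applies to a standardized variable. It assumes the common definition c = D a, with D the diagonal matrix of standard deviations; the function name `standardize_coefficients` is hypothetical.

```python
import numpy as np

def standardize_coefficients(a, data):
    """Convert raw canonical coefficients a (length p) into standardized
    coefficients, given the n x p data matrix they apply to."""
    a = np.asarray(a, dtype=float)
    data = np.asarray(data, dtype=float)
    # Sample standard deviation of each variable (divisor n - 1).
    sd = np.std(data, axis=0, ddof=1)
    # Elementwise product is the same as multiplying by diag(sd).
    return sd * a
```

The same function can be applied to the x coefficient vectors by passing the x data matrix instead.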
The techniques in this chapter are amply illustrated with real data sets, and the problems at the end of the chapter further develop and illustrate the methods.