The Probability of Chance Correlation Using Partial Least Squares (PLS)
Article first published online: 19 SEP 2006
DOI: 10.1002/qsar.19930120205
Copyright © 1993 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim
Additional Information
How to Cite
Clark, M. and Cramer, R. D. (1993), The Probability of Chance Correlation Using Partial Least Squares (PLS). Quantitative Structure-Activity Relationships, 12: 137–145. doi: 10.1002/qsar.19930120205
Publication History
- Issue published online: 19 SEP 2006
- Article first published online: 19 SEP 2006
- Manuscript Accepted: 5 JAN 1993
- Manuscript Received: 20 SEP 1992
Keywords:
- Partial least squares;
- chance correlation;
- stepwise regression;
- CoMFA;
- cross validation
Abstract
The frequency of chance correlation using partial least squares (PLS) has been measured experimentally for variously dimensioned data comprising completely random numbers, random numbers containing an embedded perfect correlation, or CoMFA field descriptors. This frequency, much lower than that for stepwise multiple regression, is maximal for datasets in which the number of descriptors equals the number of compounds and, surprisingly, decreases indefinitely as the number of descriptors becomes much greater than the number of compounds. However, perfect correlations involving descriptor subsets are not detected by PLS if the number of irrelevant descriptors is excessive. In CoMFA applications, the probability of chance correlation is usually negligible. For example, with 21 compounds, a cross-validated r² value greater than 0.25 will occur by chance in less than 5% of trials.
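The kind of experiment summarized in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes a one-component PLS1 model, Gaussian random descriptors and responses, leave-one-out cross-validation, and a cross-validated r² defined as 1 - PRESS/SS with SS taken about the overall mean of y. The function names (`pls1_one_component_predict`, `loo_q2`, `chance_frequency`) are hypothetical and introduced only for illustration.

```python
import random


def pls1_one_component_predict(X_train, y_train, x_new):
    """Fit a one-component PLS1 model (an assumed simplification) and predict y for x_new."""
    n, p = len(X_train), len(X_train[0])
    x_mean = [sum(row[j] for row in X_train) / n for j in range(p)]
    y_mean = sum(y_train) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X_train]
    yc = [y - y_mean for y in y_train]
    # Weight vector: direction of maximal covariance between descriptors and response.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    if norm == 0.0:
        return y_mean  # no covariance at all: fall back to the mean
    w = [v / norm for v in w]
    # Latent scores, then ordinary regression of y on the single score.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    b = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    t_new = sum((x_new[j] - x_mean[j]) * w[j] for j in range(p))
    return y_mean + b * t_new


def loo_q2(X, y):
    """Leave-one-out cross-validated r^2, computed as 1 - PRESS / SS."""
    n = len(y)
    y_mean = sum(y) / n
    press = 0.0
    for i in range(n):
        pred = pls1_one_component_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i])
        press += (y[i] - pred) ** 2
    ss = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss


def chance_frequency(n_compounds, n_descriptors, threshold, n_trials, seed=0):
    """Fraction of purely random datasets whose cross-validated r^2 exceeds threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        X = [[rng.gauss(0.0, 1.0) for _ in range(n_descriptors)]
             for _ in range(n_compounds)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n_compounds)]
        if loo_q2(X, y) > threshold:
            hits += 1
    return hits / n_trials
```

A call such as `chance_frequency(21, 50, 0.25, 200)` estimates how often random data with 21 compounds and 50 descriptors exceeds a cross-validated r² of 0.25 under these assumptions. Because the model and data here are simplified, the result need not match the figures reported in the article.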
