Data Reduction Using a Discrete Wavelet Transform in Discriminant Analysis of Very High Dimensionality Data



Summary. We present a method of data reduction using a wavelet transform in discriminant analysis when the number of variables is much greater than the number of observations. The method is illustrated with a prostate cancer study in which the sample size is 248 and the number of variables is 48,538 (generated using ProteinChip technology). Using a discrete wavelet transform, the 48,538 data points are represented by 1,271 wavelet coefficients. Information criteria identified the 11 of these 1,271 wavelet coefficients with the highest discriminatory power. A linear classifier built on these 11 wavelet coefficients detected prostate cancer in a separate test set with a sensitivity of 97% and a specificity of 100%.
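
The reduce-then-select idea described in the summary can be illustrated with a minimal sketch on synthetic data. Everything specific in it is an assumption made for illustration and is not taken from the study: the Haar wavelet, the number of decomposition levels, the use of approximation coefficients only, and the two-sample t-like score that stands in for the information criteria. The function names are hypothetical, and the sketch is not expected to reproduce the 1,271 coefficients reported above.

```python
import math
import random

def haar_step(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(x) % 2:                      # pad odd-length input by repeating the last sample
        x = x + [x[-1]]
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_approx(x, levels):
    """Keep only the approximation coefficients after `levels` Haar steps."""
    a = list(x)
    for _ in range(levels):
        a, _detail = haar_step(a)       # detail coefficients are discarded (assumption)
    return a

def separation_score(values, labels):
    """Absolute two-sample t-like statistic; a stand-in for the information criteria."""
    g0 = [v for v, y in zip(values, labels) if y == 0]
    g1 = [v for v, y in zip(values, labels) if y == 1]
    m0, m1 = sum(g0) / len(g0), sum(g1) / len(g1)
    v0 = sum((v - m0) ** 2 for v in g0) / (len(g0) - 1)
    v1 = sum((v - m1) ** 2 for v in g1) / (len(g1) - 1)
    return abs(m0 - m1) / math.sqrt(v0 / len(g0) + v1 / len(g1) + 1e-12)

def select_top(coeffs, labels, k):
    """Indices of the k coefficients with the largest separation score."""
    n_coef = len(coeffs[0])
    scores = [separation_score([row[j] for row in coeffs], labels)
              for j in range(n_coef)]
    return sorted(range(n_coef), key=lambda j: scores[j], reverse=True)[:k]

# Synthetic stand-in for spectra: 40 noisy signals of length 1024.
# Class 1 carries an elevated region at positions 512..543.
random.seed(0)
labels = [i % 2 for i in range(40)]
spectra = []
for y in labels:
    s = [random.gauss(0.0, 1.0) for _ in range(1024)]
    if y == 1:
        for i in range(512, 544):
            s[i] += 3.0
    spectra.append(s)

coeffs = [haar_approx(s, levels=5) for s in spectra]   # 1024 points -> 32 coefficients
selected = select_top(coeffs, labels, k=3)
print(len(coeffs[0]), selected[0])                     # prints: 32 16
```

In this toy setting each level-5 coefficient summarizes a window of 32 adjacent points, so the elevated region falls entirely within coefficient 16, which the score ranks first. The selected coefficients would then be passed to a linear classifier and evaluated on held-out data.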