Most methods for variable selection work from the top down and steadily remove features until only a small number remain. They often rely on a predictive model, and there are usually significant disconnections in the sequence of methodologies that leads from the training samples to the choice of the predictor, then to variable selection, then to choice of a classifier, and finally to classification of a new data vector. In this paper we suggest a bottom-up approach that brings the choices of variable selector and classifier closer together, by basing the variable selector directly on the classifier, removing the need to involve predictive methods in the classification decision, and enabling the direct and transparent comparison of different classifiers in a given problem. Specifically, we suggest ‘wrapper methods’, determined by classifier type, for choosing variables that minimize the classification error rate. This approach is particularly useful for exploring relationships among the variables that are chosen for the classifier. It reveals which variables have a high degree of leverage for correct classification using different classifiers; it shows which variables operate in relative isolation, and which are important mainly in conjunction with others; it permits quantification of the authority with which variables are selected; and it generally leads to a reduced number of variables for classification, in comparison with alternative approaches based on prediction.
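To illustrate the general idea of a bottom-up wrapper, the following is a minimal sketch and not the procedure developed in this paper. It starts with no variables and greedily adds the one that most reduces an estimate of the classification error rate for a chosen classifier, stopping when no addition helps. The nearest-centroid classifier, the leave-one-out error estimate, the stopping rule, the toy data, and all function names (`centroid_predict`, `loo_error`, `forward_select`, `make_toy_data`) are assumptions made purely for illustration.

```python
import random


def centroid_predict(train_X, train_y, x, features):
    """Nearest-centroid prediction using only the given feature indices.

    The classifier is an illustrative assumption; any classifier could be
    substituted here without changing the selection loop.
    """
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, j in enumerate(features):
            acc[k] += row[j]
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        dist = sum((x[j] - acc[k] / counts[label]) ** 2
                   for k, j in enumerate(features))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_error(X, y, features):
    """Leave-one-out misclassification rate using only `features`."""
    errors = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if centroid_predict(train_X, train_y, X[i], features) != y[i]:
            errors += 1
    return errors / len(X)


def forward_select(X, y, max_features=None):
    """Greedy bottom-up selection: add the variable that lowers the error most."""
    n_features = len(X[0])
    max_features = max_features or n_features
    selected, history = [], []
    best_error = float("inf")
    while len(selected) < max_features:
        candidates = [j for j in range(n_features) if j not in selected]
        # Score each candidate by the error of the classifier that includes it.
        err, j = min((loo_error(X, y, selected + [j]), j) for j in candidates)
        if err >= best_error:  # no strict improvement: stop (assumed rule)
            break
        selected.append(j)
        best_error = err
        history.append((j, err))
    return selected, best_error, history


def make_toy_data(n_per_class=10, seed=0):
    """Hypothetical data: variable 0 separates the classes, 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            X.append([centre + rng.uniform(-1, 1),
                      rng.gauss(0, 1), rng.gauss(0, 1)])
            y.append(label)
    return X, y


X, y = make_toy_data()
selected, error, history = forward_select(X, y)
print(selected, error)
```

On this toy data the loop selects variable 0 alone, because it already classifies every held-out point correctly and no further variable can lower the error. The `history` list records the error rate at each step, which gives one simple way to see how much each added variable contributes under a given classifier. Replacing `centroid_predict` with a different classifier allows the selected sets to be compared across classifiers.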