A novel support vector classifier for longitudinal high-dimensional data and its application to neuroimaging data



Recent technological advances have made it possible for many studies to collect high dimensional data (HDD) longitudinally, for example, images collected during different scanning sessions. Such studies may yield temporal changes of selected features that, when incorporated with machine learning methods, are able to predict disease status or responses to a therapeutic treatment. Support vector machine (SVM) techniques are robust and effective tools well suited for the classification and prediction of HDD. However, current SVM methods for HDD analysis typically consider cross-sectional data collected during one time period or session (e.g., baseline). We propose a novel support vector classifier (SVC) for longitudinal HDD that allows simultaneous estimation of the SVM separating hyperplane parameters and temporal trend parameters, which determine the optimal means to combine the longitudinal data for classification and prediction. Our approach is based on an augmented reproducing kernel function and uses quadratic programming for optimization. We demonstrate the use and potential advantages of our proposed methodology using a simulation study and a data example from the Alzheimer's disease Neuroimaging Initiative. The results indicate that our proposed method leverages the additional longitudinal information to achieve higher accuracy than methods using only cross-sectional data and methods that combine longitudinal data by naively expanding the feature space. © 2011 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 4: 604–611, 2011