Keywords:

  • object recognition;
  • Gabor filters;
  • template matching;
  • classification;
  • brain imaging;
  • PET;
  • SPECT;
  • ADNI;
  • Alzheimer's disease;
  • NFL

Abstract

Functional brain imaging is a common tool in monitoring the progression of neurodegenerative and neurological disorders. Identifying features derived from functional brain imaging that can accurately detect neurological disease is of primary importance to the medical community. Computer vision research on identifying objects in photographs has reported high accuracies in that domain, but the direct applicability of such techniques to identifying disease in functional imaging is still under investigation in the medical community. In particular, Serre et al. (2005: In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR-05). pp 994–1000) introduced a biophysically inspired filtering method emulating visual processing in striate cortex, which they applied to object recognition in photographs. In this work, the model described by Serre et al. [2005] is extended to three-dimensional volumetric images to perform signal detection in functional brain imaging (PET, SPECT). The filter outputs are used to train both neural network and logistic regression classifiers and tested on two distinct datasets: ADNI Alzheimer's disease 2-deoxy-D-glucose (FDG) PET and National Football League players' Tc99m HMPAO SPECT. The filtering pipeline is analyzed to identify which steps are most important for classification accuracy. Our results compare favorably with other published classification results and outperform those of a blinded expert human rater, suggesting the utility of this approach. Hum Brain Mapp 35:38–52, 2014. © 2012 Wiley Periodicals, Inc.


INTRODUCTION

Significant progress has been made in the diagnostic decision-making processes and in predicting the onset and the course of brain disorders [Kantarci and Jack, 2003; Lovestone, 2010; Rachakonda et al., 2004; Roe et al., 2011]. The traditional endpoint diagnosis, clinical measurements, and cognitive tests used in clinical trials have proved to be informative but have their own limitations in accurately quantifying the progression of brain disorders in an unbiased and objective manner [Borroni et al., 2007; Knopman et al., 2001]. Advances in brain imaging technologies have enabled researchers to investigate and test novel biomarkers that could serve either as diagnostic tools to aid clinical decision-making or as surrogates reflecting disease progression and underlying disease pathology [Biomarkers Definitions Working Group, 2001]. Accordingly, there is a growing body of evidence in the literature showing that structural and functional brain imaging can be valuable tools for predicting and classifying gradually progressive neurological and psychiatric disorders such as Alzheimer's disease (AD) [Drzezga, 2009; Kawachi et al., 2006; Mosconi et al., 2006; Nordberg et al., 2010; Tartaglia et al., 2011]. Although both PET and MRI modalities have been found to be discriminative in various neurological disorders, there is disagreement in the community about which are most sensitive for particular disorders. Specifically, differences in the sensitivity and specificity of structural magnetic resonance imaging (MRI) and 2-deoxy-D-glucose (FDG) positron emission tomography (PET) features in the prediction of early AD have been debated in the literature with no clear consensus [De Santi et al., 2001; Mosconi et al., 2006]. Nevertheless, AD research studies evaluating the diagnostic and predictive value of region-specific glucose metabolic rate and volume changes suggest the greater reliability of FDG PET over MRI in discriminating AD from subjects with intact cognition and from those with mild cognitive impairment (MCI) [De Santi et al., 2001; Kawachi et al., 2006; Mosconi et al., 2006]. However, De Santi et al. and Mosconi et al. indicate that image postprocessing influences the outcome of discriminative analyses and, subsequently, their predictive value.

Although advances in imaging have enabled researchers to visually inspect both functional and structural brain scans, it is often difficult for a human observer to identify the subtle differences in brain images that are necessary for reliable disease classification. Furthermore, visual identification of brain diseases by a human observer is time consuming and error prone. Automated image analysis algorithms that can reliably discriminate the diseased from the healthy brain are preferred because they save time, are generally less prone to errors, are not influenced by rater bias or inter-rater differences in neuroanatomical expertise, and can identify subtle statistical correlations in the data. For preventative and longitudinal studies in large populations, automated image analysis is critically important for evaluating the data. To achieve automated and reliable image analysis and classification, we can use computer vision techniques designed to extract information from images.

Object recognition in images and video is an active area of research in the computer vision community. Finding objects is fundamentally a pattern recognition problem, in which the presence of unique patterns of colors, edges, and/or textures is consistent with a particular class of object. Probabilistic models are particularly well suited to recognition problems because they provide a structured approach to modeling uncertainty and can be less sensitive to noise in the data. Object recognition systems often consist of a feature extraction component and a classifier. The feature extractor identifies the properties of the objects that are most important in discriminating one object from another. The features, along with a labeled training set, are then used to train a classifier to map the features to a class label for each object the detection system is built to recognize. Although the overall process is simple, there are many subtleties in real-world applications of detection systems, such as object illumination, scale, occlusion, and orientation, that affect accuracy. Most often we have a small set of images representing the objects to be recognized and do not have exhaustive examples at all possible scales, orientations, illuminations, etc. The challenge is therefore to find a feature space that avoids irrelevant variations in the objects and instead captures the most discriminating characteristics [Forsyth and Ponce, 2002].

One source of inspiration for engineering such invariant features is the primate visual system, which performs object detection robustly across a huge range of viewpoints, illuminations, and occlusions. One very successful method, the scale invariant feature transform (SIFT) proposed by Lowe [1999], uses features with partial invariance to local variations in scale and illumination, similar to the receptive fields of the neurons in the inferior temporal cortex, an area important for object recognition in primates. Serre et al. [2005] introduced a filtering method whose hierarchical architecture was designed specifically to emulate visual processing in the cat and primate striate cortex. They applied this method to detecting objects in photographs and reported high success rates from only a few training examples. Mutch and Lowe [2006] reported similar performance using a related filtering scheme that scaled the input images instead of scaling the filters as in Serre et al.'s work.

As with object recognition in photographs, automated image-based diagnosis requires ignoring some classes of variation across healthy individuals while identifying other, specific variations that are indicative of disease state. Differences in ligand uptake in the brain measured by functional brain imaging modalities such as FDG PET and Tc99m HMPAO single photon emission computed tomography (SPECT) result in spatially smooth patterns of differing intensities that can be used to differentiate a disease group from healthy subjects. Similarly, precise morphology/anatomy may vary among individuals, requiring some degree of local scale and orientation invariance. Based on this insight, we extend the neurologically inspired filtering model described by Serre et al. [2005] to signal detection in functional brain imaging. To evaluate how well Serre et al.'s feature model captures disease patterns in the human brain, the model is extended to three-dimensional volumetric space and applied to signal detection in functional brain imaging. The hierarchical filtering pipeline is analyzed to identify which steps are most important for classification accuracy, and the filter outputs are used to train both neural network (NN) and logistic regression (LR) classifiers. Two distinct and previously published datasets are tested using this feature extraction and classification method: (1) Alzheimer's Disease Neuroimaging Initiative (ADNI) AD FDG PET scans sampled at the baseline, 12-month, and 24-month time-points versus the study-specific age-matched healthy comparison (HC) subjects [Mueller et al., 2008]; (2) a Tc99m HMPAO SPECT National Football League (NFL) dataset versus study-specific age-matched HC subjects [Amen et al., 2011]. The AD classification results are further compared against those of a blinded expert human rater (J.H.F.), providing a baseline measure of how well a human counterpart can recognize disease in the same dataset.

METHODS

Filtering and Feature Extraction

The image filtering pipeline consists of alternating simple (S) and complex (C) filtering layers, briefly summarized here and discussed in detail in subsequent sections. The first simple layer (S1) responds to oriented edges at different spatial scales and orientations (see S1 Layer). Spatial scale in this context refers to the underlying spatial distribution of the signal in the images: filters with larger spatial scales respond to spatially larger image signals. S1 layer filters are separated into "bands," where each band is composed of two similar spatial scales (Table 1, row 1). The first complex layer (C1) combines the outputs from the S1 layer across scales but within orientations, providing scale invariance (see C1 Layer). The complex layers pool the simple layer outputs using a max operator, so the strongest simple layer output drives the complex layer output. The second simple layer (S2) matches the detections from the C1 layer against healthy subjects in a template matching framework, where higher scores indicate a closer match (see C1 Layer Training Patches and S2 Layer). The second complex layer (C2) combines the template matching scores across orientations, gaining invariance to orientation (see C2 Layer).

Table 1. S1 layer Gabor filter sizes and parameters by band (rows 1–3), where bands group similar filter sizes. Row 4 shows the C1 layer grid size for maximums over Gabor filter scales. Row 5 shows the template patch sizes, which are common to all bands.

Band       1          2          3           4           5            6            7            8
Filter     3, 5       7, 9       11, 13      15, 17      19, 21       23, 25       27, 29       31, 33
Sigma      1.4, 2.1   3.0, 3.9   4.6, 5.6    6.5, 7.5    8.5, 9.6     10.6, 11.7   12.9, 14.1   15.3, 16.5
Lambda     1.7, 2.6   3.6, 4.8   5.7, 6.8    8.0, 9.2    10.4, 11.8   13.1, 14.5   15.9, 17.4   18.9, 20.5
Max grid   4³         6³         8³          10³         12³          14³          16³          18³
Patch      5³, 9³, 13³, 17³ (all bands)
S1 layer

The S1 layer is computed by applying 16 oriented three-dimensional Gabor filters, at orientations θ ∊ {0, π/4, π/2, 3π/4} and ϕ ∊ {0, π/4, π/2, 3π/4} and wavelength λ, to each brain scan in the dataset. A Gabor filter is a linear filter whose impulse response is a harmonic function multiplied by a Gaussian function:

  G(x, y, z) = exp(−(x² + y² + z²) / (2σ²)) · cos((2π/λ)(x sin θ cos ϕ + y sin θ sin ϕ + z cos θ))   (1)

The cosine term in Eq. (1) controls the harmonic component through the λ wavelength parameter. The variables x, y, and z are the spatial variables defining the spatial extent of the filter. The standard deviation σ describes the size of the Gaussian envelope. The orientation of the filter is represented by the variables θ and ϕ, where ϕ orients the filter in the x-y plane and θ is the orientation from the positive z axis. For a detailed description of three-dimensional Gabor filters, refer to Bau et al. [Bau and Healey, 2009; Bau et al., 2008]. Frequency and orientation representations of the filter are similar to those of the human visual system. The original Serre method performed Gabor filtering in two dimensions, consistent with the image matrices of photographs. In this work, the Gabor filtering was performed in three dimensions, using filter sizes, sigmas, and lambdas over a series of eight bands. The parameters of each band are listed in Table 1, rows 1–3. The filter sizes and parameters were kept essentially the same as in the Serre work, but the spatial extents of the bands were decreased in order to make the features more sensitive to small activation differences in functional brain imaging. The relative proportions between sizes across the bands remained the same. The voxel sizes of the functional brain imaging data used in this study were 2 × 2 × 2 mm³ (see Methods for a detailed description of the test data). The smallest filter size in the Serre work (7 pixels), if directly applied as seven voxels, would be unlikely to respond to the small differential signals that could be discriminative in the context of functional imaging and disease. To avoid missing small signals, the lowest filter band was set to three voxels. Examples of AD PET scan slices filtered with the three-dimensional Gabor functions are shown in Figure 1. Oriented signals are indeed differentially selected by the filters, consistent with our hypothesized responses of the filters when applied to functional brain imaging data.

Figure 1. Examples of Gabor filtered slices. For each example, the filter size, σ, and λ remained constant at 5³, 2.1, and 2.6, respectively, while the orientation parameters θ and ϕ were varied. A) θ = 0, ϕ = 0; B) θ = π/4, ϕ = π/4; C) θ = π/2, ϕ = π/4; D) θ = 3π/4, ϕ = π/4. The maximum filter responses are shown in red. As the orientation of the filters changes (A–D), signals of similar orientations are selected by the filter.
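To make the S1 construction concrete, the sketch below builds one three-dimensional Gabor kernel following Eq. (1) and the band 1 parameters of Table 1. It is a minimal illustration in Python/NumPy, not the authors' implementation; the function name and the zero-mean normalization are our assumptions.

```python
import numpy as np
from scipy.ndimage import convolve

def gabor_3d(size, sigma, lam, theta, phi):
    """Build one 3D Gabor kernel per Eq. (1): an isotropic Gaussian envelope
    multiplied by a cosine carrier whose direction is set by theta (measured
    from the positive z axis) and phi (rotation in the x-y plane)."""
    half = size // 2
    x, y, z = np.mgrid[-half:half + 1, -half:half + 1, -half:half + 1]
    # Coordinate along the carrier's propagation direction (unit vector in
    # spherical form, as in Eq. (1)).
    u = (x * np.sin(theta) * np.cos(phi)
         + y * np.sin(theta) * np.sin(phi)
         + z * np.cos(theta))
    g = np.exp(-(x**2 + y**2 + z**2) / (2.0 * sigma**2)) * np.cos(2.0 * np.pi * u / lam)
    return g - g.mean()  # zero mean so the filter ignores the local DC level

# Band 1 of Table 1: a 5-voxel filter with sigma = 2.1 and lambda = 2.6.
kernel = gabor_3d(5, 2.1, 2.6, theta=np.pi / 4, phi=np.pi / 4)
# `volume` would be a 3D array holding one PET/SPECT scan:
# s1 = convolve(volume, kernel)
```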

C1 layer

The C1 layer combines incident S1 units of the same θ and ϕ orientations, creating tolerance to size and shift within each Gabor filter orientation. Complex cells in the hierarchical visual cortex model have larger receptive fields than the S1 layer [Serre et al., 2005]. To operationalize this relationship, the S1 layer volumes are filtered with a max operator over Gabor filter scales (Table 1, row 1) but within each orientation band (columns of Table 1). Max filtering is a nonlinear image processing technique where the value at each voxel in the filtered image is the maximum of the input image voxels in a local neighborhood defined by the filter size. The filter size over which the maximums are calculated depends on the Gabor filter size (Table 1, row 4). Gabor filters with larger spatial scales respond more strongly to spatially larger signals in the images at the same θ and ϕ orientations; the corresponding max filter sizes are therefore tuned accordingly. These operations are performed for each Gabor orientation and for each band, resulting in 16 × 8 volumes representing maximums over scales but within orientations. Due to the large number of voxels in the volumes, and thus the large number of max operations over increasing neighborhoods, we used the algorithm developed by van Herk [1992] to efficiently compute the maximums over neighborhoods for each voxel in the S1 layer volumes. The method requires only a small number of operations per voxel, independent of neighborhood size, and lowers the computational time of this stage of the processing pipeline.
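A sketch of the running-maximum idea is shown below, assuming an odd window size; because the max filter is separable, the one-dimensional pass is applied along each axis in turn. The helper names are ours, and scipy.ndimage.maximum_filter computes the same result.

```python
import numpy as np

def van_herk_max_1d(a, w):
    """Running maximum of a 1D array over an odd-sized window w using the
    van Herk scheme: block prefix/suffix maxima make the cost per element
    independent of the window size."""
    r = w // 2
    n = a.size
    ap = np.concatenate([np.full(r, -np.inf), a, np.full(r, -np.inf)])
    ap = np.concatenate([ap, np.full((-ap.size) % w, -np.inf)])  # pad to a block multiple
    blocks = ap.reshape(-1, w)
    pre = np.maximum.accumulate(blocks, axis=1).ravel()                    # max from block start
    suf = np.maximum.accumulate(blocks[:, ::-1], axis=1)[:, ::-1].ravel()  # max to block end
    lo = np.arange(n)                       # window for output i is ap[lo : lo + w]
    return np.maximum(suf[lo], pre[lo + w - 1])

def max_filter_3d(vol, w):
    """Separable 3D max filter: run the 1D running maximum along each axis."""
    for axis in range(3):
        vol = np.apply_along_axis(van_herk_max_1d, axis, vol, w)
    return vol
```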

C1 layer training patches

Template matching is a common approach to object recognition in computer vision systems. It matches image regions to stored representative templates using a specific scoring function [Brunelli, 2009]. In this work, representative templates were collected from a random subset of hold-out healthy subjects to be used in the subsequent S2 layer template-matching step. Ten randomly selected hold-out training images were chosen for template extraction. Templates were extracted at random locations within these training images, constrained to fall within the boundaries of user-specified regions of interest (see The Alzheimer's Disease Neuroimaging Initiative (ADNI)). The regions from which templates are randomly sampled are completely user defined and could be chosen based on some a priori hypothesis or from the literature. Selecting templates from specific regions of interest in the brain is similar to learning that a car is characterized by particular features in particular spatial locations, e.g., it rides on four tires, has doors on the sides, a hood on the front, etc.

Operationally, the user selects regions of interest and the number of features before pipeline execution. We uniformly divide the number of random locations across the number of regions of interest. To generate the random voxel locations within a region of interest, we use an atlas labelmap, which assigns a numerical code to each atlas region. Each atlas region is therefore defined by all the contiguous voxels in the labelmap volume that have equal numerical codes. From this information, we can find the cube containing this region. We then use rejection sampling: drawing a random point uniformly within the cube, we accept it if it falls within the bounded region; otherwise we reject and try again. This process continues until the required number of locations has been found for each region. In our experiments, 50 or 100 templates were chosen to describe the low level representation of the brain images. We chose the two sets such that we had a reasonable number of templates per region of interest selected and so we could evaluate the dependence of the classification results on the number of feature scores used. The original Serre work suggests a modest dependence of performance on the number of feature scores used. For each selected template location, 5³, 9³, 13³, and 17³ voxel "patches" were extracted from each of the 16 Gabor filtering orientations and bands from the C1 layer of the 10 randomly selected hold-out healthy subject training images. These patches are simply contiguous sets of voxels of differing spatial extents (5³, 9³, 13³, and 17³) centered on the template location and effectively give the vision system a "memory" of image feature examples from the functional brain images of healthy subjects.
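A minimal sketch of the rejection-sampling step, assuming the atlas labelmap is an integer volume of region codes; the function name is hypothetical.

```python
import numpy as np

def sample_roi_locations(labelmap, roi_code, n_locations, seed=None):
    """Draw voxel coordinates uniformly from one atlas region by rejection
    sampling: propose points inside the region's bounding cube and keep
    only those whose labelmap code matches the region."""
    rng = np.random.default_rng(seed)
    voxels = np.argwhere(labelmap == roi_code)       # all voxels in the region
    lo, hi = voxels.min(axis=0), voxels.max(axis=0)  # bounding cube corners
    picks = []
    while len(picks) < n_locations:
        p = rng.integers(lo, hi + 1)                 # uniform proposal in the cube
        if labelmap[tuple(p)] == roi_code:           # accept if inside the region
            picks.append(p)
    return np.array(picks)
```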

S2 layer

The S2 layer corresponds to the template-matching phase of the pipeline. For each C1 image in the test dataset and for each template patch collected from the hold-out healthy subject data, we compute the Gaussian radial basis function score shown in Eq. (2) for each band independently. The S2 unit's response depends on the Euclidean distance between the test dataset patch (X) and the stored prototype patch (P) sampled at the same location, scale, and orientation. If the functional activity profile in the test data is identical to the stored template patch, the score equals 1, whereas if the differences from the stored template patch are large, the score approaches 0. The parameter γ normalizes for different patch sizes (n ∊ {5, 9, 13, 17}) when computing the score in Eq. (2). The parameter γ is fixed to (n/5)³, where n is the patch size and the denominator is the smallest patch size. The parameter σ in Eq. (2) is the uncertainty or variance in the stored prototype patch (P). This parameter was set to 1 in all experiments. Alternatively, it could be set to the empirical variance of the training prototype patches discussed in C1 Layer Training Patches.

  R(X, P) = exp(−‖X − P‖² / (2σ²γ))   (2)
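As a sketch, the score of Eq. (2) for one test patch against one stored prototype can be computed as follows (function name ours; σ = 1 and γ = (n/5)³ as stated above):

```python
import numpy as np

def s2_score(X, P, sigma=1.0):
    """Gaussian radial basis score of Eq. (2) between a test patch X and a
    stored prototype patch P of the same shape. gamma = (n/5)**3 normalizes
    for the patch size n; sigma = 1 as in the experiments."""
    n = X.shape[0]                        # patch side length: 5, 9, 13, or 17
    gamma = (n / 5.0) ** 3
    sq_dist = np.sum((X - P) ** 2)        # squared Euclidean distance
    return np.exp(-sq_dist / (2.0 * sigma**2 * gamma))
```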
C2 layer

The final layer in the pipeline computes the maximum response of the S2 layer scores from all bands and orientations for each prototype template. The final feature sets therefore consist of 50 or 100 shift and scale invariant scores (i.e., for 50 and 100 prototype patches) that are subsequently used for classification. Conceptually, for each test image, for each prototype template patch sampled from a brain region, we are using the score that indicates the best match between the test image and a healthy subject regardless of signal size and orientation. We expect that subjects with neurological disorders will match less well with the healthy subjects and thus have a lower score. The final size of the feature vector therefore depends only on the number of patches extracted during training and not on the number of voxels in the full three-dimensional brain image. This allows the user to balance the number of template patches sampled during patch selection (i.e. number of features) and the number of subjects available in the dataset. Flexibility in choosing the number of features provides insulation from classifier overfitting, which can occur if the number of features greatly exceeds the number of examples.
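As a sketch, if the S2 scores are stacked in an array indexed by prototype, orientation, and band (an assumed layout, not prescribed by the text), the C2 stage reduces to a single max operation:

```python
import numpy as np

# Assumed layout: s2_scores[p, o, b] holds the Eq. (2) score for prototype p
# at Gabor orientation o (16 total) and band b (8 total).
s2_scores = np.random.rand(100, 16, 8)      # placeholder values for illustration
c2_features = s2_scores.max(axis=(1, 2))    # best match per prototype patch
assert c2_features.shape == (100,)          # one invariant score per template
```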

Evaluation

We used two datasets to evaluate the approach. Both are functional imaging datasets but distinctly different modalities. We selected these datasets to evaluate the generality of this approach and its application to distinctly different neurological abnormalities.

The Alzheimer's Disease Neuroimaging Initiative (ADNI)

ADNI was launched in 2003 by the National Institute on Aging (NIA), the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the Food and Drug Administration (FDA), private pharmaceutical companies, and nonprofit organizations as a $60 million, 5-year public–private partnership. The primary goal of ADNI is to test whether serial MRI, PET, other biological markers (such as cerebrospinal fluid (CSF) markers, Apolipoprotein E (APOE) status, and full-genome genotyping via blood sample), and clinical and neuropsychological assessments can be combined to measure the progression of MCI and AD. Determination of sensitive and specific markers of very early AD progression is intended to: (1) aid in the development of new treatments, (2) increase the ability to monitor their effectiveness, and (3) reduce the time and cost of clinical trials. The principal investigator of the initiative is Michael W. Weiner, M.D., of the Veteran's Affairs Medical Center and University of California, San Francisco. ADNI is the result of efforts of many coinvestigators from a broad range of academic institutions and private corporations, and participants have been recruited from over 50 sites across the U.S. and Canada. ADNI participants range in age from 55 to 90 years and include approximately 200 cognitively normal elderly followed for 3 years, 400 elderly with MCI followed for 3 years, and 200 elderly with early AD followed for 2 years. Participants are evaluated at baseline, 6, 12, 18 (for MCI only), 24, and 36 months (AD participants do not have a 36-month evaluation). Baseline and longitudinal follow-up structural MRI scans are collected on the full sample, and 11C-labeled Pittsburgh Compound-B (11C-PIB) and FDG PET scans are collected on a subset every 6 to 12 months (for study details see http://www.adni-info.org). A subset of these data, published in Mueller et al. [2008] and Langbaum et al. [2009], was used in this analysis.

AD dataset

The dataset used in this study consisted of 154 baseline FDG PET scans acquired as part of the ADNI study and published in Mueller et al. [2008] and Langbaum et al. [2009]. There were 82 HC subjects (mini-mental state exam (MMSE) 28.6 ± 1.1; age 75.1 ± 9.6 yr) and 72 AD subjects (MMSE 23.2 ± 3.5; age 75.1 ± 11.2 yr) from the baseline ADNI sample used for this study. The 12m and 24m ADNI samples contained subsets of the baseline dataset due to subject dropout. The 12m sample included 72 HC subjects (MMSE 29.2 ± 1.2; age 77.5 ± 8.4 yr) and 61 AD subjects (MMSE 20.9 ± 4.9; age 75.4 ± 11.8 yr). The 24m sample included 68 HC subjects (MMSE 28.6 ± 3.7; age 76.0 ± 10.2 yr) and 33 AD subjects (MMSE 18.4 ± 6.1; age 74.6 ± 15.2 yr). The acquisition protocol consisted of collecting six 5-min frames 30 to 60 min after ¹⁸F-FDG injection. During the uptake period, subjects were asked to rest comfortably in a dimly lit room with their eyes open. The collected frames were registered to the first frame (acquired at 30–35 min postinjection) and averaged to yield a single 30-min average PET image in "native" space. The image matrix, field of view, and resolution of the datasets from participating sites were then matched by the ADNI group. The images were spatially normalized to the MNI atlas using SPM8 software [Friston et al., 2007], resulting in image matrices of 79 × 95 × 68 voxels in the x, y, and z dimensions, respectively, with isotropic 2-mm voxels. The Automated Anatomical Labeling (AAL) atlas was used to constrain the region of interest selection based on the anatomical parcellations available in the atlas [Tzourio-Mazoyer et al., 2002]. The AAL atlas used to define the region of interest boundaries is consistent with the space defined by the MNI atlas.

Coordinates for template patch sampling and S2 layer matching scores were constrained to fall within regions identified in the literature as affected by AD (see C1 Layer Training Patches and S2 Layer). Delacourte et al. [1999] identified stages of AD neurofibrillary degeneration in patients of various ages and different cognitive statuses. Further, Langbaum et al. [2009] identified regions of reduced metabolic rates in AD. Regions included the cingulate cortex and the parietal and temporal lobes, among others. For this study, we chose AAL atlas regions (bilateral): anterior and posterior cingulum, temporal lobes (middle), hippocampus, amygdala, thalamus, frontal and orbital cortices (superior and middle), temporal pole (superior, middle, inferior), and parietal lobe (inferior) as being consistent with published findings on potentially discriminative regions.

NFL dataset

The NFL dataset used in this study consisted of 162 technetium-99m hexamethylpropyleneamine oxime (Tc99m HMPAO) SPECT scans acquired for a study by Amen et al. [2011] evaluating the impact of playing American football. There were 83 HC (age 41.7 ± 17.8 yr) and 79 NFL (age 57.5 ± 11.5 yr) subjects. Subjects were injected with an age/weight-appropriate dose of Tc99m HMPAO and performed the Conners' Continuous Performance Test II for 30 min during uptake. All subjects completed the task and were subsequently scanned on a high-resolution Picker Prism 3000 triple-headed gamma camera with fan beam collimators. The original reconstructed image matrices were 128 × 128 × 29 voxels with sizes of 2.16 mm × 2.16 mm × 6.48 mm. The images were spatially normalized to the MNI atlas using SPM8 software [Friston et al., 2007], resulting in image matrices of 79 × 95 × 68 voxels in the x, y, and z dimensions, respectively, with isotropic 2-mm voxels. Images were smoothed using an 8-mm FWHM isotropic Gaussian kernel. The preprocessing steps were identical to the previously published work by Amen et al. In the previously published work, a subset of the HC dataset was used and matched on gender and race. For this work, all subjects were used regardless of race and gender.

Coordinates for template patch sampling and S2 layer matching scores (see C1 Layer Training Patches and S2 Layer) were constrained to fall within regions identified in Amen et al. as the top discriminating regions for the NFL group. To our knowledge, the Amen study was the first brain imaging study evaluating NFL players and as such, the regions were picked based only on that publication. For this study, we used AAL atlas regions (bilateral): anterior and posterior cingulum, frontal pole, hippocampus, amygdala, and temporal pole (middle and inferior).

Ethics

The NFL study was approved by each of the participating sites' Institutional Review Boards (IRBs) and complied with the Code of Ethics of the World Medical Association (Declaration of Helsinki). Written informed consent was obtained from all participants after they had received a complete description of the studies.

The ADNI data were previously collected across 50 research sites. Study subjects gave written informed consent at the time of enrollment for imaging and genetic sample collection and completed questionnaires approved by each participating site's Institutional Review Board (IRB).

Feature sets

In order to identify which components of the feed-forward hierarchical model implemented in this study were most important for correct classification, three separate feature sets were computed. The FTM (Gabor filter + template match) feature set is the result of the full hierarchical pipeline as described in Methods. To understand the effect of the Gabor filtering, the TM (template matching) feature set was created using the same procedures outlined in Methods but without Gabor filtering; that is, template patches were selected from the unfiltered images (skipping the S1 and C1 layers) and the S2 and C2 layer computations were performed on them. To evaluate the effect of template matching, the AP (average patch) feature set was computed by simply averaging the voxels in the neighborhoods around the prototype patch locations selected in C1 Layer Training Patches, across the various filter sizes (Table 1, row 1), and taking the maximum response.

To compare the feature sets of the hierarchical model with more typical data reduction (DR) techniques, the maximum group difference (MaxT) and DR sets were computed. The MaxT feature set was computed by performing a typical voxel-wise independent two-sample t-test in the SPM8 software. The resulting SPM(t) maps were then thresholded at P < 0.01 (AD baseline), P < 0.001 (AD 12m), P < 0.001 (AD 24m), and P < 1e-6 (NFL) and corrected for multiple comparisons using the family-wise error rate (FWE) correction. Probability thresholds were chosen to limit the number of voxels in the resulting t-score maps such that similar numbers of voxels (∼3K) were obtained for each dataset. The absolute values of the resulting t-scores were ranked, and the data from the top 50 and 100 locations were then sampled from each subject and used for classification (MaxT). The DR feature set used all the locations remaining in the group difference maps after probability thresholding (∼3K), sampling the original data at those locations for each subject. The resulting N × K matrix (N subjects, K sampling locations) was mean-centered for each column and run through principal component analysis (PCA). Each subject's data were then projected onto the eigenvectors of the 50 and 100 largest eigenvalues from the PCA decomposition, giving a low-dimensional representation with 50 or 100 feature scores that were subsequently used for classification. The 50 and 100 largest eigenvectors were chosen so that the projected dataset contained 50 or 100 scores per subject, consistent with the number of feature scores calculated from the full feed-forward hierarchical model.
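A minimal sketch of the DR computation, with the PCA done via an SVD rather than the authors' exact tooling; `data` is assumed to hold the thresholded-voxel samples for all subjects:

```python
import numpy as np

def dr_features(data, k):
    """PCA data reduction: project the N x K matrix (subjects x sampled
    voxel locations) onto its top-k principal components."""
    centered = data - data.mean(axis=0)          # mean-center each column
    # SVD of the centered matrix; rows of vt are the principal directions
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T                   # N x k feature scores

# e.g., features = dr_features(sampled_voxels, k=50)
```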

Classification

Classification was done using both a multilayered perceptron NN and an LR classifier to understand the dependence of the results on the classifier chosen [Hall et al., 2009]. Each classifier was trained separately on the same datasets to compare the performance of the simpler LR classifier, able to find linear decision boundaries, with the NN classifier, able to model more complex nonlinear functions. The NN was constructed with one hidden layer (hidden layer nodes = (#features + #classes)/2) and trained with a learning rate of 0.3 and a momentum of 0.2. For each classifier, 10-fold cross-validation was used: in each fold, the dataset was divided into training and testing subsets, the classifier was trained on the training subset and tested on the testing subset, and this process was repeated 10 times. Areas under the receiver operating characteristic curves (ROC-AUC) were computed from the probability of class membership of the testing data from each of the trained classifiers. The full filtering pipeline (FTM) ROC-AUC curves were statistically compared with each of the alternative methods for each dataset and classifier using the DeLong et al. [1988] method of comparing areas under correlated ROC curves, as implemented in the pROC package [Robin et al., 2011]. To compute 95% confidence intervals and statistics, the data were resampled 2,000 times, stratified by group membership.
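The experiments used WEKA classifiers and the pROC R package; the sketch below reproduces the general protocol (10-fold cross-validation, out-of-fold class-membership probabilities, ROC-AUC) with scikit-learn and the hidden-layer rule from the text. The placeholder data and settings such as max_iter are our assumptions, not the authors' configuration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.metrics import roc_auc_score

# Placeholder feature scores and labels; in the study X would hold the 50 or
# 100 C2 scores per subject and y the AD (1) / HC (0) labels.
rng = np.random.default_rng(0)
X = rng.random((154, 50))
y = rng.integers(0, 2, 154)

n_hidden = (X.shape[1] + 2) // 2   # (#features + #classes) / 2, as in the text
models = {
    "LR": LogisticRegression(max_iter=1000),
    "NN": MLPClassifier(hidden_layer_sizes=(n_hidden,), solver="sgd",
                        learning_rate_init=0.3, momentum=0.2, max_iter=500),
}
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for name, model in models.items():
    # Out-of-fold class-membership probabilities, then one ROC-AUC per classifier.
    proba = cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
    print(name, "ROC-AUC:", roc_auc_score(y, proba))
```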

To compare the classifier results on the baseline AD dataset with the visual ratings of the neuroanatomist (J.H.F.), true-positive (TP) and false-positive (FP) rates were calculated. To calculate the TP and FP rates, the probability of class membership from the trained classifiers for each testing subset data point, in each fold, was computed, and the data point was assigned to the class with the largest probability. The TP rate was the proportion of examples in the testing subsets that were classified as class AD among all testing examples originally labeled as class AD, averaged across all folds. The FP rate was the proportion of examples in the testing subsets that were classified as class AD but were originally labeled as the alternative class, among all testing examples not of class AD, also averaged across all folds. The TP and FP results for Dr. Fallon were computed from his designation of either AD or healthy control for each of the baseline scans compared with the original class labels.
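For completeness, a small sketch of the per-class true/false-positive computation described here (the helper is hypothetical):

```python
import numpy as np

def tp_fp_rates(y_true, y_pred, positive):
    """Per-class true- and false-positive rates from hard class assignments."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.mean(y_pred[y_true == positive] == positive)  # sensitivity for the class
    fp = np.mean(y_pred[y_true != positive] == positive)  # alarms among the other class
    return tp, fp

# e.g., ad_tp, ad_fp = tp_fp_rates(labels, predictions, positive="AD")
```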

RESULTS

To summarize the performance of each classifier, the ROC-AUC results for the Alzheimer's disease (AD) baseline, 12m, and 24m datasets are shown in Figures 2-4, respectively, for the 50 and 100 feature datasets and both LR and NN classifiers. The confidence intervals for each ROC-AUC and statistical comparisons of the FTM with each of the other methods for all classifiers and datasets are shown in Tables 2-4. The FTM method outperformed the other methods in terms of ROC-AUC in 80% of the tests and was statistically better in 35%. No other method was statistically better than FTM, although the PCA DR strategy in the 50-feature, baseline AD, LR classifier was close (P = 0.064). Overall, the NN classifier generally outperformed the LR classifier in ROC-AUC. Further, the FTM method was statistically better than all other methods in 46% of the NN classification experiments compared with 25% of the LR experiments, suggesting a benefit of using the more sophisticated classifier with the FTM method. There was a small, nonsignificant increase on average in ROC-AUC across all the classifiers in the results using the larger 100 feature datasets. Overall performance of the FTM-trained classifiers was consistent with other published classification results (see Discussion) using the ADNI dataset, with maximum ROC-AUC at baseline of 0.962 ± 0.025 (NN, 100 features), at 12m of 0.837 ± 0.073 (NN, 100 features), and at 24m of 0.878 ± 0.070 (NN, 100 features).

Figure 2. Area under the ROC curve for AD classification of the ADNI baseline data set for logistic regression (LR) and neural network (NN) classifiers for both 50 and 100 feature datasets (MaxT = maximum t-score, DR = PCA data reduction, AP = average patch, TM = template matching, FTM = Gabor filtering + template matching). The FTM method outperforms the others in 94% of the cases and is statistically better in 50% of the cases.

Figure 3. Area under the ROC curve for AD classification of the ADNI 12m data set for logistic regression (LR) and neural network (NN) classifiers for both 50 and 100 feature datasets (MaxT = maximum t-score, DR = PCA data reduction, AP = average patch, TM = template matching, FTM = Gabor filtering + template matching). The FTM method outperforms the others in 88% of the cases and is statistically better in 38% of the cases.

Figure 4. Area under the ROC curve for AD classification of the ADNI 24m data set for logistic regression (LR) and neural network (NN) classifiers for both 50 and 100 feature datasets (MaxT = maximum t-score, DR = PCA data reduction, AP = average patch, TM = template matching, FTM = Gabor filtering + template matching). The FTM method outperforms the others in 56% of the cases and is statistically better in 19% of the cases.

Table 2. Results from the AD ROC-AUC analysis of the ADNI baseline data

Dataset (#feat)   Classifier   Method   ROC-AUC   95% Conf      z-score   P
AD-Bas (50)       LR           FTM      0.791     0.857–0.725
                               TM       0.644     0.721–0.567    3.259    0.001*
                               AP       0.729     0.801–0.657   −1.556    0.120
                               DR       0.861     0.919–0.803    1.854    0.064
                               MaxT     0.692     0.768–0.616    2.187    0.029*
AD-Bas (50)       NN           FTM      0.928     0.970–0.886
                               TM       0.783     0.858–0.709    4.321    1.55E-04*
                               AP       0.902     0.951–0.854   −1.132    0.258
                               DR       0.905     0.952–0.858   −0.833    0.405
                               MaxT     0.777     0.855–0.698    3.661    2.51E-04*
AD-Bas (100)      LR           FTM      0.763     0.832–0.694
                               TM       0.689     0.766–0.612   −1.761    0.078
                               AP       0.713     0.832–0.694   −1.320    0.187
                               DR       0.698     0.775–0.620   −1.604    0.109
                               MaxT     0.687     0.761–0.614   −1.574    0.115
AD-Bas (100)      NN           FTM      0.962     0.987–0.938
                               TM       0.644     0.722–0.567    8.623    2.20E-16*
                               AP       0.885     0.940–0.831    3.336    8.50E-04*
                               DR       0.678     0.763–0.594    6.697    2.13E-11*
                               MaxT     0.773     0.849–0.696    5.053    4.35E-07*

  1. The table lists ROC-AUC measurements, 95% confidence intervals, z-scores (comparing each method's AUC with the FTM AUC), and probabilities (P that the FTM AUC equals the method's AUC) for comparisons of the FTM method with the other methods within each dataset and classifier combination. Negative z-scores indicate methods that are lower in ROC-AUC than the FTM method. Significant differences are marked with an asterisk (*).

  2. MaxT = maximum t-score, DR = PCA data reduction, AP = average patch, TM = template matching, FTM = Gabor filtering + template matching.
Table 3. Results from the AD ROC-AUC analysis of the ADNI 12m data

Dataset (#feat)   Classifier   Method   ROC-AUC   95% Conf      z-score   P
AD-12m (50)       LR           FTM      0.778     0.851–0.705
                               TM       0.664     0.747–0.582    2.173    0.030*
                               AP       0.756     0.831–0.682   −0.499    0.618
                               DR       0.726     0.805–0.648   −1.060    0.289
                               MaxT     0.609     0.701–0.518    2.830    0.005*
AD-12m (50)       NN           FTM      0.825     0.898–0.753
                               TM       0.781     0.862–0.701   −0.952    0.341
                               AP       0.838     0.908–0.769    0.319    0.750
                               DR       0.771     0.854–0.689   −1.292    0.196
                               MaxT     0.681     0.776–0.585    2.371    0.019*
AD-12m (100)      LR           FTM      0.759     0.835–0.683
                               TM       0.648     0.734–0.561    2.166    0.030*
                               AP       0.699     0.781–0.618   −1.210    0.226
                               DR       0.676     0.763–0.588   −1.546    0.122
                               MaxT     0.671     0.754–0.588   −1.532    0.127
AD-12m (100)      NN           FTM      0.837     0.910–0.764
                               TM       0.783     0.861–0.706   −1.411    0.158
                               AP       0.855     0.919–0.791    0.590    0.555
                               DR       0.714     0.802–0.627    2.234    0.022*
                               MaxT     0.687     0.780–0.594    2.482    0.014*

  1. The table lists ROC-AUC measurements, 95% confidence intervals, z-scores (comparing each method's AUC with the FTM AUC), and probabilities (P that the FTM AUC equals the method's AUC) for comparisons of the FTM method with the other methods within each dataset and classifier combination. Negative z-scores indicate methods that are lower in ROC-AUC than the FTM method. Significant differences are marked with an asterisk (*).

  2. MaxT = maximum t-score, DR = PCA data reduction, AP = average patch, TM = template matching, FTM = Gabor filtering + template matching.
Table 4. Results from the AD ROC-AUC analysis of the ADNI 24m data

Dataset (#feat)   Classifier   Method   ROC-AUC   95% Conf      z-score   P
AD-24m (50)       LR           FTM      0.749     0.843–0.655
                               TM       0.658     0.763–0.553   −1.437    0.151
                               AP       0.794     0.886–0.702    0.794    0.427
                               DR       0.828     0.915–0.740    1.371    0.171
                               MaxT     0.787     0.902–0.673    0.502    0.616
AD-24m (50)       NN           FTM      0.841     0.924–0.758
                               TM       0.883     0.955–0.810    0.991    0.322
                               AP       0.865     0.942–0.788    0.736    0.462
                               DR       0.816     0.915–0.717   −0.459    0.646
                               MaxT     0.766     0.861–0.670   −1.335    0.182
AD-24m (100)      LR           FTM      0.822     0.906–0.737
                               TM       0.823     0.906–0.740    0.026    0.979
                               AP       0.822     0.892–0.716   −0.319    0.750
                               DR       0.561     0.426–0.696    4.620    3.84E-06*
                               MaxT     0.813     0.915–0.710   −0.143    0.886
AD-24m (100)      NN           FTM      0.878     0.948–0.806
                               TM       0.864     0.944–0.783   −0.383    0.702
                               AP       0.880     0.957–0.804    0.102    0.919
                               DR       0.677     0.788–0.566    8.214    2.20E-16*
                               MaxT     0.758     0.860–0.656    2.273    0.023*

  1. The table lists ROC-AUC measurements, 95% confidence intervals, z-scores (comparing each method's AUC with the FTM AUC), and probabilities (P that the FTM AUC equals the method's AUC) for comparisons of the FTM method with the other methods within each dataset and classifier combination. Negative z-scores indicate methods that are lower in ROC-AUC than the FTM method. Significant differences are marked with an asterisk (*).

  2. MaxT = maximum t-score, DR = PCA data reduction, AP = average patch, TM = template matching, FTM = Gabor filtering + template matching.

The neuroanatomist (J.H.F.) was given the baseline AD dataset images in transaxial, coronal, and sagittal orientations, without the diagnoses and with no practice set of normal or AD scans to examine before the analysis, and was asked to classify the scans as either AD or HC. These results are only available for the baseline AD data due to the significant effort required to manually rate so many scans. J.H.F. achieved a true/false-positive rate for AD of 0.718/0.380 and for the HC group of 0.671/0.244, as shown in Table 6. The FTM classifier performed better in both true and false positives for both the AD and HC groups while also outperforming the MaxT and DR methods, further suggesting the potential utility of this approach.

The ROC-AUC results for the NFL group are shown in Figure 5 for the 50 and 100 feature datasets and both LR and NN classifiers. The confidence intervals for each ROC-AUC and statistical comparisons of the FTM with each of the other methods for all classifiers and datasets are shown in Table 5. Interestingly, unlike the AD dataset, the FTM method did not dominate: it outperformed the other methods in 44% of the tests and was statistically better in only one. Alternatively, the MaxT method consistently outperformed the others in terms of ROC-AUC and was statistically better than the FTM method in three out of four comparisons. We speculate this result is related to specific brain functional changes accompanying the repeated head injuries evident in the NFL dataset (see Discussion). Overall performance of the FTM classifier was still quite good, with a maximum ROC-AUC of 0.939 ± 0.037 (LR, 100 features). Unlike the AD experiments, the NN classifier did not outperform the LR classifier for the FTM dataset, but it did for the best-performing MaxT dataset.

Figure 5. Area under the ROC curve for NFL classification for logistic regression (LR) and neural network (NN) classifiers for both 50 and 100 feature datasets (MaxT = maximum t-score, DR = PCA data reduction, AP = average patch, TM = template matching, FTM = Gabor filtering + template matching). The MaxT method outperformed the other methods and was statistically better than the FTM method in all comparisons except the LR 100-feature dataset. The FTM ROC-AUC was still very good, always greater than 0.900 and as high as 0.939 in the LR 100-feature dataset.

Table 5. Results from the NFL ROC-AUC analysis

Dataset (#feat)   Classifier   Method   ROC-AUC   95% Conf      z-score   P
NFL (50)          LR           FTM      0.909     0.954–0.864
                               TM       0.702     0.777–0.628    4.662    5.09E-06*
                               AP       0.876     0.931–0.821   −0.907    0.365
                               DR       0.942     0.979–0.906    1.137    0.255
                               MaxT     0.956     0.988–0.924    2.059    0.040*
NFL (50)          NN           FTM      0.908     0.957–0.860
                               TM       0.856     0.917–0.795   −1.306    0.193
                               AP       0.933     0.974–0.893    0.776    0.438
                               DR       0.974     0.995–0.954    2.405    0.016*
                               MaxT     0.986     1.000–0.969    2.944    0.003*
NFL (100)         LR           FTM      0.939     0.976–0.902
                               TM       0.877     0.931–0.822   −1.856    0.065
                               AP       0.883     0.936–0.830   −1.705    0.089
                               DR       0.900     0.947–0.853   −1.315    0.188
                               MaxT     0.977     1.000–0.954    1.753    0.080
NFL (100)         NN           FTM      0.920     0.964–0.876
                               TM       0.954     0.989–0.918    1.167    0.245
                               AP       0.942     0.977–0.907    0.765    0.445
                               DR       0.892     0.941–0.842   −0.866    0.386
                               MaxT     0.988     1.000–0.973    2.900    0.004*

  1. The table lists ROC-AUC measurements, 95% confidence intervals, z-scores (comparing each method's AUC with the FTM AUC), and probabilities (P that the FTM AUC equals the method's AUC) for comparisons of the FTM method with the other methods within each dataset and classifier combination. Negative z-scores indicate methods that are lower in ROC-AUC than the FTM method. Significant differences are marked with an asterisk (*).

  2. MaxT = maximum t-score, DR = PCA data reduction, AP = average patch, TM = template matching, FTM = Gabor filtering + template matching.
Table 6. Results from the visual ratings of neuroanatomist J.H.F. from the ADNI baseline data

Method   AD-TP   AD-FP   HC-TP   HC-FP
J.H.F.   0.718   0.380   0.671   0.244
FTM      0.875   0.122   0.878   0.125
DR       0.622   0.389   0.611   0.378
MaxT     0.829   0.375   0.625   0.171

  1. The table lists true-positive (TP) and false-positive (FP) rates for the Alzheimer's disease (AD) and healthy control (HC) classes for the human rater compared with the FTM, DR, and MaxT methods. The FTM method outperforms both the human rater and the other methods.

  2. MaxT = maximum t-score, DR = PCA data reduction, FTM = Gabor filtering + template matching.

DISCUSSION

The overall classification results suggest the biophysically inspired feed-forward hierarchical model used in these experiments is sensitive to differences in functional brain imaging data. Both the AD and NFL classification experiments showed impressive ROC-AUC rates using a method not specifically tuned for these imaging modalities. The FTM results are consistent with published classification rates for the ADNI AD dataset using brain imaging, although most reported results use a mix of structural and functional imaging features. For example, Hinrichs et al. used the ADNI dataset in a spatially augmented boosting framework and reported an ROC-AUC of 0.8716 when using just FDG PET [Hinrichs et al., 2009].

A benefit of using LR classifiers is the clear interpretation of which features are most informative for classification. For baseline AD classification, the most informative patches (highest weights) were sampled from the AAL atlas regions right hippocampus and left and right superior temporal lobes, while the posterior cingulate, a region commonly associated with disease progression, ranked fourth. For 12m AD classification, the most informative patches were sampled from the right superior frontal, left superior orbital frontal, and right superior temporal pole regions. For 24m AD classification, the most informative patches were sampled from the right superior frontal, left middle temporal pole, and left hippocampus regions. It is interesting that the frontal lobe was not included as a top discriminating location in the baseline dataset but was in both the 12m and 24m data, consistent with well-known structural changes in AD disease progression. We also evaluated the performance of the FTM features using ROIs that specifically excluded those selected for the AD dataset. The results were on average 10 to 15% lower in ROC-AUC for baseline AD than those reported in the Results, suggesting this method is sensitive to region of interest selection. We therefore suspect the filtering pipeline could be used to test competing hypotheses about specific regions of interest implicated in disease. The top three most informative patches from these excluded-region features were sampled from the AAL atlas regions left inferior orbital frontal, right insula, and right middle occipital. Other informative patches for AD included the right supramarginal, right lingual, and left inferior frontal operculum regions. Interestingly, the inferior orbital frontal cortex, the operculum, and the supramarginal gyrus are all associated with AD in the literature, suggesting the classification results are still picking up on areas related to the disease [Espasy and Jacobs, 2010; Grignon et al., 1998].

Overall, the average patch (AP) feature set outperformed the template matching (TM) feature set, suggesting no compelling benefit of template matching without Gabor filtering in this application. The utility of oriented Gabor filtering and template matching in deriving the feature set was most evident in AD classification. This trend was not observed in the NFL classification experiments. Why would oriented filtering improve classification rates in the AD but not the NFL dataset? It is well known in the literature that structural changes in AD follow an anatomical trajectory, starting in the entorhinal cortex and hippocampus, then moving to the temporal and parietal lobes, and finally affecting the frontal lobes in late-stage AD [Braak and Braak, 1997; Hua et al., 2008; Thompson et al., 2003]. These structural changes should be reflected in corresponding functional changes. In addition, the accumulation of amyloid plaques between nerve cells in the brain is a known hallmark of AD. Both the structural changes and the plaques may be altering the orientation, scale, and localized spatial extent of the functional imaging signal, due in part to brain plasticity and compensation.

Alternatively, the FTM might not perform as well in datasets with the widespread, global functional changes observed in the NFL data. Indeed, the article by Amen et al. reports that "significant decreases in regional cerebral blood flow were seen across the whole brain." The comparison feature sets MaxT and DR should perform well in that setting because they rely on group differences and maximal variation. It is possible that the FTM method performs better in settings with more localized functional differences. The NFL dataset differed from the AD dataset in both imaging modality (SPECT vs. PET) and uptake conditions (continuous performance test vs. rest), which could contribute to the differences in classifier performance. We suspect modality is not a factor, as the feature scores used in classification are modality neutral. Lower resolution imaging systems may contribute to lower true-positive rates if the regions of interest are small, despite the model's attempt to mitigate this effect using filter sizes of differing spatial scales. Regardless of how well the filtering method does, if the discriminating feature of a disease is too small to be accurately measured by the imaging device, performance of the classification system will undoubtedly suffer. The benefit of this method is that it uses information across spatial scales, orientations, and locations in the volumes to calculate the matching scores used for subsequent classification, and it should therefore be less reliant on any one discriminating feature. The uptake task will contribute to the functional signals and should be taken into account when selecting the regions of interest used to calculate feature scores (C1 Layer Training Patches). Choosing regions that are not affected by the disease will decrease the discriminative power of the method. Alternatively, if the number of subjects in the dataset is high and there is no fear of classifier overfitting, choosing many regions, some known to be related to the disease and/or task and others whose relationship is unknown, could provide interesting insight into whether the unknown regions contribute to classification accuracy. Further, because the features of the dataset are computed separately from the classifier, one could choose to sample features from all brain regions and either perform regularization in the classifier or choose a classification model that is less sensitive to overfitting (e.g., support vector machines). Each of these decisions should be made relative to the particular dataset and illness being studied.

CONCLUSIONS

In general, our volumetric variant of the hierarchical feed-forward model originally proposed by Serre et al. for detecting objects in photographs performed quite well on the functional brain imaging datasets used in this study. In fact, it outperformed both the comparison methods and the human counterpart at detecting AD in the FDG PET ADNI dataset. The method is very general and does not rely on particular imaging modalities; it could be used on many spatial maps commonly computed in diagnostic and research imaging studies. Furthermore, there is evidence that it could be used to test hypotheses about regions implicated in disease. In conclusion, models designed in the computer vision community for object recognition and tracking in images of natural scenes may indeed have applications in detecting and tracking disease progression in human functional brain imaging with minimal modifications.

ACKNOWLEDGMENTS

Data collection and sharing for the NFL dataset used in this study were made available by Daniel Amen, M.D., and the Amen Clinics Inc. For ADNI, industry partnerships are coordinated through the Foundation for the National Institutes of Health. The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer's Disease Cooperative Study at the University of California, San Diego. ADNI data are disseminated by the Laboratory of Neuro Imaging at the University of California, Los Angeles. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the article.

REFERENCES

  • Amen D, Newberg A, Thatcher R, Jin Y, Wu J, Phillips B, Keator D, Willeumier K (2011): Impact of playing professional American football on long-term brain function. J Neuropsychiatry Clin Neurosci 23:98–106.
  • Bau TC, Healey G (2009): Rotation and scale invariant hyperspectral classification using 3D Gabor filters. In: Shen SS, Lewis PE, editors. Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XV. Proc SPIE 7334:73340B.
  • Bau TC, Sarkar S, Healey G (2008): Using three-dimensional spectral/spatial Gabor filters for hyperspectral region classification. In: Shen SS, Lewis PE, editors. Algorithms and Technologies for Multispectral, Hyperspectral, and Ultraspectral Imagery XIV. Proc SPIE 6966:69660E.
  • Biomarkers Definitions Working Group (2001): Biomarkers and surrogate endpoints: Preferred definitions and conceptual framework. Clin Pharmacol Ther 69:89–95.
  • Borroni B, Premi E, Di Luca M, Padovani A (2007): Combined biomarkers for early Alzheimer disease diagnosis. Curr Med Chem 14:1171–1178.
  • Braak H, Braak E (1997): Staging of Alzheimer-related cortical destruction. Int Psychogeriatr 9:257–261.
  • Brunelli R (2009): Template Matching Techniques in Computer Vision: Theory and Practice. Chichester, UK: John Wiley & Sons.
  • De Santi S, De Leon MJ, Rusinek H, Convit A, Tarshish CY, Roche A, Tsui WH, Kandil E, Boppana M, Daisley K, Wang GJ, Fowler J (2001): Hippocampal formation glucose metabolism and volume losses in MCI and AD. Neurobiol Aging 22:529–539.
  • Delacourte A, David JP, Sergeant N, Buee L, Wattez A, Vermersch P, Ghozali F, Fallet-Bianco C, Pasquier F, Lebert F, et al. (1999): The biochemical pathway of neurofibrillary degeneration in aging and Alzheimer's disease. Neurology 52:1158.
  • DeLong ER, DeLong DM, Clarke-Pearson DL (1988): Comparing the areas under two or more correlated receiver operating characteristic curves: A nonparametric approach. Biometrics 44:837–845.
  • Drzezga A (2009): Diagnosis of Alzheimer's disease with [18F]PET in mild and asymptomatic stages. Behav Neurol 21:101–115.
  • Espasy A, Jacobs D (2010): Frontal lobe syndromes. eMedicine Specialties, Behavioral Neurology and Dementia Ed. http://emedicine.medscape.com/article/1135866-overview.
  • Forsyth DA, Ponce J (2002): Computer Vision: A Modern Approach. Upper Saddle River, NJ: Prentice Hall.
  • Friston KJ, Ashburner J, Kiebel SJ, Nichols TE, Penny WD, editors (2007): Statistical Parametric Mapping: The Analysis of Functional Brain Images. London: Academic Press.
  • Grignon Y, Duyckaerts C, Bennecib M, Hauw JJ (1998): Cytoarchitectonic alterations in the supramarginal gyrus of late onset Alzheimer's disease. Acta Neuropathol 95:395–406.
  • Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009): The WEKA data mining software: An update. ACM SIGKDD Explorations Newsl 11:10–18.
  • Hinrichs C, Singh V, Mukherjee L, Xu G, Chung MK, Johnson SC (2009): Spatially augmented LPboosting for AD classification with evaluations on the ADNI dataset. Neuroimage 48:138–149.
  • Hua X, Leow AD, Parikshak N, Lee S, Chiang MC, Toga AW, Jack CR Jr, Weiner MW, Thompson PM (2008): Tensor-based morphometry as a neuroimaging biomarker for Alzheimer's disease: An MRI study of 676 AD, MCI, and normal subjects. Neuroimage 43:458–469.
  • Hubel DH, Wiesel TN (1959): Receptive fields of single neurones in the cat's striate cortex. J Physiol 148:574.
  • Hubel DH, Wiesel TN (1961): Integrative action in the cat's lateral geniculate body. J Physiol 155:385–398.
  • Hubel DH, Wiesel TN (1962): Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. J Physiol 160:106–154.
  • Hubel DH, Wiesel TN (1965a): Binocular interaction in striate cortex of kittens reared with artificial squint. J Neurophysiol 28:1041–1059.
  • Hubel DH, Wiesel TN (1965b): Receptive fields and functional architecture in two nonstriate visual areas (18 and 19) of the cat. J Neurophysiol 28:229–289.
  • Hubel DH, Wiesel TN (1968): Receptive fields and functional architecture of monkey striate cortex. J Physiol 195:215.
  • Kantarci K, Jack CR Jr (2003): Neuroimaging in Alzheimer disease: An evidence-based review. Neuroimaging Clin N Am 13:197.
  • Kawachi T, Ishii K, Sakamoto S, Sasaki M, Mori T, Yamashita F, Matsuda H, Mori E (2006): Comparison of the diagnostic performance of FDG-PET and VBM-MRI in very mild Alzheimer's disease. Eur J Nucl Med Mol Imaging 33:801–809.
  • Knopman DS, DeKosky ST, Cummings JL, Chui H, Corey-Bloom J, Relkin N, Small GW, Miller B, Stevens JC (2001): Practice parameter: Diagnosis of dementia (an evidence-based review): Report of the Quality Standards Subcommittee of the American Academy of Neurology. Neurology 56:1143.
  • Langbaum J, Chen K, Lee W, Reschke C, Bandy D, Fleisher AS, Alexander GE, Foster NL, et al. (2009): Categorical and correlational analyses of baseline fluorodeoxyglucose positron emission tomography images from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Neuroimage 45:1107–1116.
  • Lovestone S (2010): Searching for biomarkers in neurodegeneration. Nat Med 16:1371–1372.
  • Lowe DG (1999): Object recognition from local scale-invariant features. In: Proceedings of the Seventh IEEE International Conference on Computer Vision, Vol. 2, pp 1150–1157.
  • Martinez LM, Alonso JM (2003): Complex receptive fields in primary visual cortex. Neuroscientist 9:317–331.
  • Mosconi L, Sorbi S, De Leon MJ, Li Y, Nacmias B, Myoung PS, Tsui W, Ginestroni A, Bessi V, Fayyazz M, et al. (2006): Hypometabolism exceeds atrophy in presymptomatic early-onset familial Alzheimer's disease. J Nucl Med 47:1778.
  • Mueller SG, Weiner MW, Thal LJ, Petersen RC, Jack C, Jagust W, Trojanowski JQ, Toga AW, Beckett L (2008): The Alzheimer's Disease Neuroimaging Initiative. In: Fisher A, Memo M, Stocchi F, Hanin I, editors. Advances in Alzheimer's and Parkinson's Disease. Springer US, 57:183–189.
  • Mutch J, Lowe DG (2006): Multiclass object recognition with sparse, localized features. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Vol. 1, pp 11–18.
  • Nordberg A, Rinne JO, Kadir A, Långström B (2010): The use of PET in Alzheimer disease. Nat Rev Neurol 6:78–87.
  • Rachakonda V, Pan TH, Le WD (2004): Biomarkers of neurodegenerative disorders: How good are they? Cell Res 14:349–358.
  • Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez JC, Muller M (2011): pROC: An open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 12:77.
  • Roe CM, Fagan AM, Williams MM, Ghoshal N, Aeschleman M, Grant EA, Marcus DS, Mintun MA, Holtzman DM, Morris JC (2011): Improving CSF biomarker accuracy in predicting prevalent and incident Alzheimer disease. Neurology 76:501–510.
  • Schwartz O, Pillow JW, Rust NC, Simoncelli EP (2006): Spike-triggered neural characterization. J Vis 6:484–507.
  • Serre T, Wolf L, Poggio T (2005): Object recognition with features inspired by visual cortex. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Vol. 2, pp 994–1000.
  • Tartaglia MC, Rosen HJ, Miller BL (2011): Neuroimaging in dementia. Neurotherapeutics 8:111.
  • Thompson PM, Hayashi KM, De Zubicaray G, Janke AL, Rose SE, Semple J, Herman D, Hong MS, Dittmer SS, Doddrell DM, et al. (2003): Dynamics of gray matter loss in Alzheimer's disease. J Neurosci 23:994.
  • Tzourio-Mazoyer N, Landeau B, Papathanassiou D, Crivello F, Etard O, Delcroix N, Mazoyer B, Joliot M (2002): Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage 15:273–289.
  • van Herk M (1992): A fast algorithm for local minimum and maximum filters on rectangular and octagonal kernels. Pattern Recogn Lett 13:517–521.