Clinical validation of an optimized multimodal neurocognitive assessment of chronic mild TBI

Abstract
Objective: Previous laboratory‐based studies have shown that neurocognitive eye‐tracking metrics are sensitive to chronic effects of mild traumatic brain injury (mTBI), even in individuals with normal performance on traditional neuropsychological measures. In this study, we sought to replicate and extend these findings in a military medical environment. We expected that metrics from the multimodal Fusion n‐Back test would successfully distinguish chronic mTBI participants from controls, particularly eye movement metrics from the more cognitively challenging “1‐Back” subtest.
Methods: We compared performance of participants with chronic mTBI (n = 46) and controls (n = 33) on the Fusion n‐Back test and a battery of conventional neuropsychological tests. Additionally, we examined test reliability and the impact of potential confounds to neurocognitive assessment.
Results: Our results supported hypotheses; Fusion 1‐Back metrics were successful in multimodal (saccadic and manual) classification of chronic mTBI versus control. In contrast, conventional neuropsychological measures could not distinguish these groups. Additional findings demonstrated the reliability of Fusion n‐Back test metrics and provided evidence that saccadic metrics are resistant to confounding influences of age, intelligence, and psychiatric symptoms.
Interpretation: The Fusion n‐Back test could provide advantages in differential diagnosis for complex brain injury populations. Additionally, the rapid administration of this test could be valuable for screening patients in clinical settings where longer test batteries are not feasible.


Introduction
Mild traumatic brain injury (mTBI) has a worldwide incidence of approximately 224/100,000 individuals. 1 Incidence of mTBI is even higher within military populations due to demographic characteristics and physical hazards associated with military operational/training activities. 2 Since 2000, U.S. military personnel have sustained over 340,000 mTBIs in both combat and garrison environments; 3 for up to 20%, post-concussive symptoms such as irritability, fatigue, or difficulty concentrating may persist into the chronic phase of injury (beyond 3 months), disrupting life activities and motivating patients to seek follow-up care. [4][5][6][7] These symptoms, however, can be influenced by a wide range of neurological and nonneurological factors. [8][9][10] Comprehensive neuropsychological evaluation is a widely accepted approach for identification of functional neural impairment; 11 however, this approach generally returns "normal" results beyond 90 days postinjury. [12][13][14][15] This raises questions about the value of neuropsychological assessment for the mTBI patient who seeks treatment after symptoms have become chronic. Accurate and efficient identification of chronic neural impairment related to mTBI is critical to guide treatments to reduce postconcussive symptoms and return the individual to optimum functioning.
Measures of oculomotor performance show considerable potential to improve assessment of mTBI. Several studies have shown that mTBI has a negative impact on eye movements, [14][15][16][17][18][19] particularly with multiple injuries or persistent post-concussive symptoms. 14,20,21 Eye movement abnormalities in TBI could be related to the broad network of regions that must communicate effectively to acquire and integrate information from the visual environment. 22 Impaired eye movements have been linked to axonal injury in postmortem tissue and on diffusion tensor imaging; [23][24][25] abnormal functional connectivity in chronic mTBI extends to the visual system and its interactions with higher-order cognitive processing. 26 Importantly, the types of eye movements that are most strongly impacted by mTBI are those that most heavily rely upon effective cognitive processing, as opposed to measures of basic neuromotor function. 14,18,[27][28][29] In particular, previous research conducted by our group 14,15 and others [28][29][30] has shown enhanced sensitivity of eye movements to effects of mTBI under conditions of increased cognitive load. Many oculomotor metrics are also less impacted by age, education, or intelligence, relative to conventional neuropsychological measures. 20,31,32 Concurrent measurement of multiple response modalities while an examinee completes a cognitive test may provide a means for improved assessment of chronic mTBI. 31 In previous studies, [14][15]31,33,34 our group has developed and validated methods to assess saccadic eye movements and manual motor performance in response to varying levels of cognitive load. This study extends this line of research using a more advanced system optimized to be clinically feasible for assessment of TBI in real-world medical environments.
Based on recent findings, we hypothesized that the Fusion n-Back test 15 (particularly saccadic metrics derived from the more cognitively challenging 1-Back subtest) would outperform conventional neuropsychological measures in distinguishing participants with chronic mTBI from controls.

Methods

Participants
A volunteer sample of U.S. Active Duty military personnel and Veterans was recruited from military treatment facilities in the San Diego area. The mTBI group consisted of adults (>18 years old) with persistent symptoms related to mTBI 35 sustained 3 months to 12 years previously; the control group had no history of TBI or other neurological conditions. Of 120 participants enrolled, n = 79 (n = 33 control; n = 46 mTBI) met full eligibility requirements and were included in analyses. Participants with moderate-to-severe TBI (n = 15), TBI that fell outside the allotted time window (n = 11), medical conditions other than TBI that would be expected to impact performance (n = 7), incomplete testing session and loss to follow-up (n = 2), and failure on two or more measures of response validity/effort or noncompliance with task instructions (n = 6) were excluded from analysis.

Procedures
After providing written informed consent, participants provided demographic information and medical history (Table 1). TBI history was obtained using the Ohio State University TBI Identification Method (OSU TBI-ID 36,37) and confirmed using medical records. Participants then completed a fixed battery of standardized self-report and neuropsychological measures and the Fusion n-Back test, described below. This study was approved by the Institutional Review Board at Naval Medical Center San Diego. Neuropsychological tests are detailed in Table 2. [38][39][40][41][42][43][44][45][46][47]

Fusion n-Back test
The Fusion n-Back test 15 is a multimodal cognitive task that combines the working memory demands of the classic 'n-Back' task 48 with the eye movement and visual attention demands of the Bethesda Eye & Attention Measure (BEAM) task. 31 In preparation for this study, development efforts were undertaken to optimize the previously described 15 Fusion prototype for enhanced mobility, efficiency, and clinical feasibility. In this study, the Fusion n-Back test was administered on a laptop computer, and testing procedures were streamlined, reducing testing time to 12 min total, plus instructions. Additionally, testing software was upgraded to provide enhanced ease-of-use, automated corrective feedback if an examinee failed to follow task instructions, and fully automated data processing/scoring. The Fusion n-Back test measures saccadic and manual responses to visual targets across two levels of cognitive load (0-Back and 1-Back; see Fig. 1). Details of the test have been described previously; 15 primary test metrics are shown in Table 3. Eye-tracking data were acquired at 150Hz using a Gazepoint GP3 HD Eye Tracker. Calibration was performed at the beginning of each task run using a 9-point rectangular calibration screen. Manual responses were recorded with a Cedrus RB-530 response pad. Stimuli were presented using PsychoPy v1.85.3 at 1920x1080 resolution on a 15" 60Hz LCD notebook computer display. Participants were seated with eyes positioned 24" from the stimulus display. Head movements were minimized with a chin rest. Gaze data and manual responses were synchronized with task event markers during data acquisition.
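The two load conditions can be illustrated with a minimal sketch of the expected manual response on each trial: in the 0-Back condition the examinee reports the colour of the current target (green vs. blue), and in the 1-Back condition the examinee reports whether the current colour is the same as or different from the previous target. The function name `expected_responses`, the response labels, and the handling of the first 1-Back trial (returned as `None` because it has no predecessor) are assumptions made for illustration; they do not describe the actual Fusion software.

```python
from typing import List, Optional


def expected_responses(colors: List[str], n_back: int) -> List[Optional[str]]:
    """Return the expected button label for each trial (hypothetical helper).

    0-Back: the label is the colour of the current target.
    1-Back: the label is "same" or "different" relative to the previous
    target; the first trial has no predecessor, so it is scored as None.
    """
    if n_back not in (0, 1):
        raise ValueError("this sketch only covers the 0-Back and 1-Back conditions")
    expected: List[Optional[str]] = []
    for i, color in enumerate(colors):
        if n_back == 0:
            expected.append(color)
        elif i == 0:
            expected.append(None)  # assumption: no scorable response on trial 1
        else:
            expected.append("same" if color == colors[i - 1] else "different")
    return expected


# Illustrative colour sequence, not taken from the study.
trial_colors = ["green", "blue", "blue", "green"]
zero_back = expected_responses(trial_colors, n_back=0)
one_back = expected_responses(trial_colors, n_back=1)
```

The sketch makes the source of the added load explicit: the 1-Back rule requires holding the previous target in working memory, whereas the 0-Back rule depends only on the current stimulus.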

Statistical analyses
Gaze data from the Fusion n-Back test were processed using custom software, which automatically removed invalid trials and coded saccadic and manual responses. Standardized t-scores from the conventional neuropsychological battery were averaged to represent 'global cognition,' with better performance yielding higher scores. Statistical analyses were conducted using SPSS 25.0. A two-tailed alpha level of .05 was used for all analyses. Missing data (2.8% of all primary metrics) were imputed using expectation maximization. Chi-square analyses and independent-samples t-tests were used to compare demographic characteristics and self-reported symptoms between groups. Analyses were conducted to compare Fusion n-Back test performance between groups and to identify a robust set of variables for use in identification of chronic mTBI. Mean test performance was compared between groups using ANOVA (for age-corrected neuropsychological tests) or ANCOVA controlling for age (for Fusion metrics). Receiver operating characteristic (ROC) analyses were conducted to identify Fusion n-Back metrics with the greatest sensitivity/specificity for classifying chronic mTBI versus control groups. ROC analysis predictors were selected based on the subset of variables that were P < 0.10 in ANOVA/ANCOVA.
Table footnotes: M, mean; SD, standard deviation; t, age-corrected t score; SS, age-corrected scaled score. (1) Percentage of participants performing ≥ 1 SD more poorly than the control group (Fusion n-Back) or normative (neuropsychological test) mean. (2) Statistical significance provided for ANCOVA covarying age: *P < 0.05, **P < 0.01, ***P < 0.001, †P < 0.10. (3) Odds ratios for mild TBI group membership associated with 1 SD poorer performance relative to the control group on each predictor (from accepted logistic regression model).
Forward stepwise logistic regression models were then used to evaluate joint classification accuracy of multiple metrics, with predictors selected based on the subset of variables that were P < 0.10 in ROC analyses, plus age. For these regressions, p-values were generated by bootstrapping across 1000 samples.
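As a rough illustration of what the ROC analyses measure, the area under the curve (AUC) for a single metric equals the probability that a randomly chosen mTBI participant scores worse than a randomly chosen control, with ties counted as one half. The sketch below computes that quantity by direct pairwise comparison. The function name `auc_from_scores` and the example values are hypothetical; the study itself used SPSS, and this is not a reproduction of that procedure or of the study data.

```python
from typing import Sequence


def auc_from_scores(cases: Sequence[float], controls: Sequence[float]) -> float:
    """AUC as the share of case/control pairs in which the case scores higher.

    Higher scores are assumed to indicate poorer performance (for example,
    more inhibition errors). Ties contribute one half.
    """
    if not cases or not controls:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Made-up scores for illustration only.
example_cases = [3.0, 5.0, 6.0, 8.0]
example_controls = [1.0, 2.0, 5.0, 4.0]
example_auc = auc_from_scores(example_cases, example_controls)
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation of the two groups on that metric.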
Reliability, sensitivity, and specificity to effects of chronic mTBI, correspondence with global cognitive performance, and estimates of effects of common confounds and psychiatric comorbidities were examined to evaluate potential use of this technology for assessment of mTBI in future research and clinical settings. Reliability analyses were conducted using split-half correlation with Spearman-Brown adjustment (for Fusion n-Back metrics) or Cronbach's alpha (for global cognition, based upon individual standardized scores from the battery of conventional neuropsychological tests). Partial correlations controlling for age and group (chronic mTBI vs. control) were used to evaluate relationships of Fusion metrics with demographic characteristics, psychiatric symptoms, and global cognition.
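The split-half procedure with Spearman-Brown adjustment can be sketched as follows: a metric is computed separately on two halves of the trials for each participant, the two sets of half-scores are correlated across participants, and the correlation r is stepped up to full test length as r_sb = 2r / (1 + r). How trials were assigned to halves is not stated in the text, so the half-scores are simply taken as inputs here; the function names and example values are illustrative assumptions.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_brown(r_half: float) -> float:
    """Step a half-test correlation up to the full test length."""
    return 2.0 * r_half / (1.0 + r_half)


def split_half_reliability(half_a: Sequence[float], half_b: Sequence[float]) -> float:
    """Spearman-Brown adjusted split-half reliability from per-person half-scores."""
    return spearman_brown(pearson_r(half_a, half_b))


# Made-up per-participant scores on the two halves (illustration only).
first_half = [210.0, 250.0, 190.0, 300.0, 275.0]
second_half = [220.0, 240.0, 200.0, 310.0, 260.0]
example_reliability = split_half_reliability(first_half, second_half)
```

The adjustment is needed because each half contains only half the trials, so the raw half-to-half correlation underestimates the reliability of the full-length test.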

Participant characteristics
Participant characteristics are presented in Table 1. Mild TBI (n = 46) and control (n = 33) groups did not differ
Figure 1. Participants were instructed for each test trial to look at the target and press the correct button as quickly as possible. Trials were presented in a pseudorandom order, with 56 directional cues and 32 misdirectional cues, starting with 10 practice trials per cognitive load condition. The test was performed across multiple conditions: Low Load (0-Back/Color Discrimination; press the button representing green vs. blue current target), and High Load (1-Back/Working Memory; press the button representing same vs. different color target relative to the previous target).

Comparison of neurocognitive performance in mTBI versus control groups
Test performance by group is shown in Table 2. Groups did not differ on premorbid IQ, global cognition, or any individual neuropsychological performance metrics examined. However, on the Fusion 1-Back subtest, mean performance of the mTBI group was poorer than that of the control group on saccadic inhibition errors (d = 0.69, P < 0.01), manual RT latency (d = 0.72, P < 0.05), and working memory score (d = −0.96, P < 0.01). There was also a trend for greater saccadic RT variability in the mTBI group on the 1-Back subtest (d = 0.47, P = 0.07). Next, analyses were performed to identify a robust set of Fusion n-Back metrics for identification of chronic mTBI. ROC analyses were conducted for the subset of variables that were P < 0.10 in ANCOVAs. As shown in Table 2, these analyses provided similar results to ANCOVA, with the addition of saccadic RT variability on the 1-Back subtest (AUC = 0.63, P < 0.05) as a significant classifier of chronic mTBI versus control group. AUC values for this group of individual metrics ranged from 0.63 to 0.75. Logistic regression was then performed to examine combined/incremental value of Fusion metrics for identification of chronic mTBI. All variables with significant AUC values from ROC analysis were entered as standardized z-scores using a stepwise forward method with P < 0.10 for entry. Regression diagnostics demonstrated that multicollinearity was not present for any variables (variance inflation factor < 4), and the model was well calibrated (Hosmer-Lemeshow test P > 0.20 for all steps). Model improvements for steps 1-3 were significant, P < 0.05. The accepted model (step 3) explained 38% (Nagelkerke R²) of the variance in group (chronic mTBI vs. control), χ²(3) = 25.90, P < 0.001, correctly classifying 74.7% of cases. In an additional step, age was entered as a predictor, but it did not significantly improve model fit (P > 0.05); therefore, this model was rejected.
Effect sizes of individual predictors were comparable between model steps 3 and 4. ROC curves for variables in the accepted logistic regression model are presented in Figure 2. Proportions of participants impaired in each group are presented for each metric in Figure 3. Odds ratios for variables included in the accepted logistic regression model are presented in Table 2; all successful predictors in this model were derived from the Fusion 1-Back subtest. As shown, participants with poor working memory scores (z = −1.0) were 2.30x more likely to be in the chronic mTBI group, Wald = 8.11, P = 0.001 (95% CI = 1.39-3.26). Membership in the mTBI group was 1.81x more likely (95% CI = 1.12-2.54) among participants with elevated saccadic inhibition errors (z ≥ 1.0; Wald = 5.38, P = 0.04) and 1.54x more likely (95% CI = 0.97-2.1) among participants with elevated saccadic RT variability (z ≥ 1.0; Wald = 3.43, P = 0.03). Combined value of these metrics was high, with positive predictive value of .76 and negative predictive value of .72.
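To make the relationship between the reported classification figures concrete, the sketch below derives accuracy, positive predictive value, and negative predictive value from a 2 x 2 classification table. The article does not report the table itself; the counts used here (38 correctly classified mTBI, 21 correctly classified controls) are back-calculated as one set of integers consistent with the group sizes (46 and 33), the 74.7% accuracy, and the predictive values of .76 and .72, and should be read as an assumption. Sensitivity and specificity computed from these counts are likewise implied values, not figures stated by the authors.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts in a 2 x 2 classification table (mTBI is the positive class)."""
    true_pos: int
    false_neg: int
    true_neg: int
    false_pos: int


def classification_summary(c: ConfusionCounts) -> Dict[str, float]:
    """Standard summary indices for a binary classifier."""
    total = c.true_pos + c.false_neg + c.true_neg + c.false_pos
    return {
        "accuracy": (c.true_pos + c.true_neg) / total,
        "ppv": c.true_pos / (c.true_pos + c.false_pos),
        "npv": c.true_neg / (c.true_neg + c.false_neg),
        "sensitivity": c.true_pos / (c.true_pos + c.false_neg),
        "specificity": c.true_neg / (c.true_neg + c.false_pos),
    }


# Assumed counts: back-calculated for illustration, not reported in the article.
assumed_counts = ConfusionCounts(true_pos=38, false_neg=8, true_neg=21, false_pos=12)
summary = classification_summary(assumed_counts)
```

Predictive values depend on the proportion of mTBI cases in the sample (46 of 79 here), so they would be expected to shift in a clinic with a different base rate.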

Psychometric characteristics of the Fusion n-Back test
Additional psychometric characteristics of Fusion n-Back metrics are presented in Table 3. Overall, reliability estimates for Fusion n-Back metrics ranged from acceptable to excellent. Split-half reliability ranged from r_sb = 0.76 to r_sb = 0.95 for saccadic metrics and r_sb = 0.70 to r_sb = 0.99 for manual metrics. As a point of comparison, Cronbach's alpha was 0.75 for global cognition. Partial correlations were used to evaluate relationships of Fusion metrics with age (controlling for group), as well as education, IQ, global cognition, and self-reported symptoms of post-traumatic stress and depression (each controlling for age and group). Increased age was associated with greater manual RT variability on the 1-Back subtest, r_p = 0.25, P < 0.05; however, neither age nor education was significantly associated with any other Fusion n-Back test metrics. Higher estimated premorbid IQ was associated with faster manual RT latency (r_p = −0.23, P < 0.05) and reduced manual RT variability (r_p = −0.25, P < 0.05) on the 0-Back subtest. Higher estimated premorbid IQ was also associated with higher levels of global cognition, r_p = 0.27, P < 0.05.
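The partial correlations reported above remove the linear influence of one or two covariates (age, group) before relating a Fusion metric to another variable. A minimal sketch of the textbook recursion is given below: the first-order formula removes one covariate, and applying it again to first-order partials removes a second. The function names and the numeric inputs are illustrative assumptions; the study's values were computed in SPSS from participant-level data.

```python
import math


def partial_r(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


def partial_r_two_covariates(
    r_xy: float, r_xz: float, r_yz: float,
    r_xw: float, r_yw: float, r_zw: float,
) -> float:
    """Correlation of x and y controlling for both z and w.

    Built by first partialling z out of every pair that involves w,
    then partialling w out of the resulting x-y correlation.
    """
    r_xy_z = partial_r(r_xy, r_xz, r_yz)
    r_xw_z = partial_r(r_xw, r_xz, r_zw)
    r_yw_z = partial_r(r_yw, r_yz, r_zw)
    return partial_r(r_xy_z, r_xw_z, r_yw_z)


# Made-up zero-order correlations for illustration only.
example_first_order = partial_r(r_xy=0.5, r_xz=0.3, r_yz=0.4)
example_second_order = partial_r_two_covariates(
    r_xy=0.5, r_xz=0.3, r_yz=0.4, r_xw=0.2, r_yw=0.1, r_zw=0.25
)
```

When the covariates are unrelated to both variables, the partial correlation equals the zero-order correlation, which is a convenient sanity check for the formula.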

Discussion
This study was designed to evaluate an optimized version of the Fusion n-Back test 15 for multimodal neurocognitive assessment of chronic mTBI. Consistent with previous findings obtained in laboratory settings, 14,15 the Fusion n-Back test also demonstrated good sensitivity and specificity for chronic mTBI when used with U.S. military personnel in a clinical setting. As shown previously, 14,15 in this study, conventional neuropsychological tests again failed to discriminate between individuals with chronic mTBI and controls. Additional psychometric strengths of the Fusion n-Back test included high levels of internal reliability and, for the best-discriminating metrics, no detectable confounding by demographic or psychiatric factors. Overall, these findings support the use of neurocognitive eye tracking for clinical assessment of individuals with chronic mTBI. As expected, 15 the more challenging 1-Back subtest provided superior classification of chronic mTBI relative to the 0-Back subtest. Considering similar findings with other eye movement tasks, [28][29][30] it appears likely that the additional working memory demands were instrumental in illuminating impairment related to chronic mTBI.
Additionally, current findings provide clear evidence for the value of a multimodal approach to neurocognitive assessment. By measuring concurrent eye movements and manual responses to test stimuli, the Fusion n-Back test produces distinct sets of saccadic and manual metrics. Examined individually, two of three saccadic metrics and two of three manual metrics from the Fusion 1-Back subtest were demonstrably poorer within the chronic mTBI group. When this set of variables was evaluated together, a set of three best-performing Fusion n-Back metrics (two saccadic, one manual) emerged as complementary predictors of TBI group.
The value of assessing multiple neurocognitive and motor processes appears to be heightened by the heterogeneous nature of neural dysfunction experienced by individuals with chronic mTBI. Approximately 30-50% of individual chronic mTBI group participants demonstrated impairments in each of the best-performing metrics, including saccadic RT variability, inhibition errors, and working memory score. Collectively, these metrics were better able to identify chronic mTBI than any one test metric alone. While none of the individual test metrics was a diagnostic "silver bullet," the Fusion 1-Back subtest appeared to effectively tap into multiple common forms of neural impairment associated with effects of chronic mTBI.
Aside from identifying persistent effects of neural injury, assessment of cognitive strengths and weaknesses can provide valuable information about an individual's capacity to complete real-world functional tasks. [49][50][51] In this study, manual metrics from the Fusion n-Back test were consistently and robustly associated with global cognitive performance. Therefore, these manual metrics may also be useful in predicting functional impairment, as defined by conventional neuropsychological measures with established (if modest) predictive relationships with functional capacity. 49 The functional relevance of the types of saccadic impairment elicited by the Fusion n-Back test has not yet been examined directly. However, the inconsistent and disinhibited eye movements demonstrated by many chronic mTBI participants in this study could reduce real-world performance by interfering with the acquisition of visual information from the environment. Even among individuals who are functionally intact, these saccadic impairments might serve as valuable biomarkers of neuronal injury. Additional research will be needed to investigate the functional relevance of different forms of saccadic impairment in comparison to conventional cognitive measures.
We also observed a psychometric divergence between saccadic and manual metrics in their relationships with estimated premorbid IQ and psychiatric symptoms. Consistent with previous research, 20,31,32 estimated intelligence was related to conventional neuropsychological measures and multiple manual metrics, but was not related to any saccadic metrics. Similarly, symptoms of depression and posttraumatic stress were related to manual, but not saccadic, performance. These findings provide additional evidence that saccadic metrics may provide advantages over manual metrics for detection of chronic mTBI effects with minimal interference from demographic or psychiatric factors that can confound conventional measures of cognitive performance. Interestingly, performance on the Fusion 1-Back subtest was sensitive to chronic mTBI, while the ostensibly more challenging Digit Span test from the conventional neuropsychological battery was not. The heightened sensitivity of Fusion n-Back metrics to chronic mTBI may be related to a synergy of multiple cognitive and motor demands embedded within this multimodal task.
While the mTBI and control groups were generally well matched, inclusion criteria restricted the mTBI group to those participants reporting persistent post-concussive symptoms. This criterion was selected to maximize clinical relevance of findings, as individuals are unlikely to seek clinical care if they feel their symptoms have resolved. As expected based on patterns of comorbidity, this symptomatic group of patients also had higher levels of depression and posttraumatic stress than the control group. However, these psychiatric symptoms were not related to primary metrics from the Fusion n-Back test, so we opted not to control for these factors in our analyses. Additional research within a larger sample may be useful to evaluate potentially subtle effects of psychiatric status on saccadic versus manual test metrics.
This study, building upon previous iterations of the Fusion system, [14][15]31,33,34 provides compelling support for the utility of the multimodal neurocognitive assessment of chronic mTBI. Consistent with our previous findings, 14,15 the clinically optimized version of the Fusion n-Back test used in this study successfully discriminated between service members with chronic mTBI and a well-matched group of controls. Fusion metrics were most sensitive under the higher cognitive load condition (1-Back subtest). In contrast, conventional neuropsychological measures were unable to distinguish chronic mTBI and control groups. Additional findings supported the reliability of the Fusion test and suggested that saccadic metrics may be uniquely resistant to confounding influences of age, intelligence, and psychiatric symptoms. These test characteristics could provide advantages in differential diagnosis for complex brain injury populations. Additionally, with testing time as short as 8 min (if the 1-Back subtest is used alone), the Fusion system could be valuable for screening patients within clinical settings where longer test batteries are infeasible. Follow-up research is needed to identify changes in multimodal test performance over time, including comparisons of pre- and postinjury measurements and examination of potential improvement across the acute and subacute stages of TBI recovery.