Validation of brief symptom indexes among patients with recurrent or metastatic squamous cell carcinoma of the head and neck: A trial of the ECOG‐ACRIN Cancer Research Group (E1302)

Abstract

Background: Patients with advanced head and neck cancer have identified pain, fatigue, and difficulties swallowing, breathing, and communicating as high‐priority disease‐related symptoms. The Functional Assessment of Cancer Therapy‐Head and Neck Symptom Index‐10 (FHNSI‐10) assesses these symptoms. We sought to validate the FHNSI‐10, another brief symptom index (FHNSI‐7), and individual symptom endpoints representing these high‐rated priority disease symptoms among patients with recurrent or metastatic squamous cell carcinoma of the head and neck (SCCHN).

Methods: Patients (N = 239) were enrolled in a phase III randomized clinical trial (E1302) and completed the FHNSI‐10 at multiple time points. We assessed the internal consistencies and test–retest reliabilities of the FHNSI‐10 and FHNSI‐7 scores, and the known‐groups validity, predictive criterion validity, and responsiveness‐to‐change of the symptom indexes and individual symptom endpoint scores.

Results: The FHNSI‐10 and FHNSI‐7 indexes showed satisfactory internal consistencies (Cronbach's alpha coefficient range 0.60‐0.75) and acceptable test–retest reliabilities (intraclass correlation coefficients = 0.75 and 0.74, respectively). The FHNSI‐10, FHNSI‐7, and the pain, fatigue, swallowing, and breathing symptom scores showed evidence of known‐groups validity by performance status at baseline. The FHNSI‐10, FHNSI‐7, and the pain, fatigue, and breathing symptom scores at baseline showed evidence of predictive criterion validity for overall survival, but not time‐to‐progression (TTP). Changes in the symptom indexes and individual symptom scores were not associated with changes in performance status over 4 weeks, though most patients had stable performance status.

Conclusions: There is initial evidence of validity for the FHNSI‐10 and FHNSI‐7 indexes and selected individual symptom endpoints as brief disease‐related symptom assessments for patients with recurrent or metastatic SCCHN.


| INTRODUCTION
Head and neck cancer (e.g., cancers of the oral cavity, pharynx, and larynx) accounts for approximately 4% of cancer diagnoses in the United States annually, which translates to more than 53,000 expected new cases in 2019. 1 Advances in head and neck cancer treatment have resulted in improved survival rates over the past several decades, with the 5-year relative survival rate for localized head and neck cancer currently estimated as 84%. However, the vast majority (>70%) of patients with head and neck cancer are diagnosed with regional or distant advanced disease, where the 5-year relative survival rates drop to 65% and 39%, respectively. 1 Due to their location, these tumors can interfere with vital functions including swallowing, breathing, and speaking. Further, while treatments for advanced head and neck cancer (surgery, radiotherapy, and chemotherapy) may prolong life, they are associated with toxicities that can contribute to even greater symptom burden. [2][3][4] Assessment of these disease- and treatment-related symptoms is critical for clinical trials in which therapeutic efficacy is evaluated not only by clinical outcomes (e.g., survival and tumor response), but also by patient-reported outcomes (PROs), such as symptoms and quality of life. 5 There are several validated instruments available to assess PROs among patients with head and neck cancer, 6 such as the MD Anderson Symptom Inventory for Head and Neck Cancer (MDASI-HN) 7 and the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire Head and Neck module (EORTC-QLQ-H&N35). 8,9 The National Comprehensive Cancer Network-Functional Assessment of Cancer Therapy (FACT)-Head and Neck Symptom Index-22 (NFHNSI-22) was developed from clinician and patient rankings of priority head and neck cancer concerns and includes items related to symptoms, treatment side effects, and function/well-being. 10 With a growing interest in isolating the assessment of specific symptoms, further item reduction to include only the highest priority disease symptoms experienced by patients with advanced head and neck cancer could help promote patient-centered outcome assessment that is fit for regulatory use. For example, the FACT-Head and Neck Symptom Index-10 (FHNSI-10) includes 10 items from the NFHNSI-22 that assess high-priority patient-reported head and neck cancer symptoms (e.g., pain, fatigue, swallowing, breathing, and communication). [11][12][13] Even further item reduction will be beneficial.
To address this need, this study sought to validate the scores of very brief symptom indexes for use among patients with advanced head and neck cancer based on prior identification of high-priority patient-reported disease symptoms, 11,12 first as clusters of symptoms (i.e., FHNSI-10 and FHNSI-7 symptom indexes) and then as individual symptom endpoints (i.e., pain, fatigue, and difficulty swallowing, breathing, and communicating) among patients with metastatic squamous cell carcinoma of the head and neck (SCCHN) enrolled in a large phase III randomized placebo-controlled trial (E1302). We evaluated the internal consistencies and test-retest reliabilities of the index scores, and the known-groups validity, predictive criterion validity, and responsiveness-to-change of the symptom index scores and the individual symptom endpoint scores using familiar clinical anchors (i.e., provider-rated Eastern Cooperative Oncology Group (ECOG) performance status (PS), overall survival, and disease progression). Secondary objectives were to explore the relationships between one additional item assessing overall treatment side effect bother with the symptom index scores, the individual symptom endpoint scores, and provider-rated adverse events. This work is informed by the conceptualization of the FACT symptom indexes as being causal indicators of symptom burden (vs. effect indicators). 14,15

2 | METHODS

| Participants and procedures
Participants in this study were enrolled in ECOG-ACRIN Cancer Research Group Study number E1302, a phase III randomized, placebo-controlled, double-blind trial of docetaxel with or without gefitinib to treat recurrent or metastatic SCCHN (ClinicalTrials.gov identifier: NCT00088907). 16 Eligible patients were at least 18 years old, had been diagnosed with incurable recurrent or metastatic SCCHN, and had a provider-rated ECOG PS of 0-2. Exclusion criteria included pregnancy or breastfeeding, recent major tumor-related hemorrhagic events, current therapeutic anticoagulation, and tumors that had invaded major blood vessels. After providing informed consent, participants were randomized to treatment with docetaxel plus placebo or docetaxel plus gefitinib, and monitored for therapeutic response and disease progression. The primary results of this trial are reported elsewhere. 16 All protocol procedures were approved by the relevant institutional review boards. Prior to protocol treatment, participants completed a baseline assessment including items assessing head and neck cancer symptoms. Symptom assessments were repeated mid-way through treatment cycle 1 (Week 2), at the end of treatment cycle 1 (Week 4), and at the end of treatment cycle 2 (Week 8). The data to support the findings of this study are available from the corresponding author upon reasonable request.

KEYWORDS
head and neck cancer, psychosocial studies, quality of life

| Head and neck cancer symptom assessment
Participants completed the 10-item FHNSI-10 index of high-priority patient-reported head and neck cancer disease symptoms (e.g., pain, fatigue, and difficulties swallowing, breathing, and communicating) 12,13 and one additional item assessing overall treatment side effect bother (i.e., "I am bothered by side effects of treatment," item GP5) that is positively associated with clinician-reported adverse events and negatively associated with patient-reported enjoyment of life (Figure 1). 17 Participants rated each item using a 7-day recall period, on an ordinal rating scale from 0 (not at all) to 4 (very much). As with all measures in the Functional Assessment of Chronic Illness Therapy (FACIT) system, high scores are better than low scores. Therefore, symptom responses were reversed as necessary, so that high scores represented less pain and fatigue as well as less difficulty swallowing, breathing, and communicating. Consistent with Pearman et al., 17 the single treatment side effect bother item (GP5) was not reverse scored and higher GP5 scores indicated more bother from side effects.
From the FHNSI-10 items, a 7-item index ("FHNSI-7") was computed to include only those items that correspond to symptoms identified in prior research as high-priority disease-related symptoms: pain, fatigue, swallowing, breathing, and communication (items GP4, HN12, GP1, HN7, HN11, HN3, and HN10). 11,12 In addition to the FHNSI-10 and FHNSI-7, individual symptom endpoints were computed for each of the following symptoms: pain (items GP4 and HN12); fatigue (item GP1); swallowing (items HN7 and HN11); breathing (item HN3); and communication (item HN10). FHNSI-10 and FHNSI-7 scores were computed as the prorated sum of the item responses, provided more than 50% of the items were answered (prorated score=(raw sum*number of total items)/number of items answered). Scores for each individual symptom endpoint were summed, and scores were only computed for patients who had answered all target items for a given symptom.
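The scoring rules described above can be illustrated with a minimal sketch. It assumes responses are coded 0-4 with `None` marking an unanswered item; the function names are hypothetical and are not taken from the FACIT scoring materials, and which items require reversal is left to the caller because it is not specified here.

```python
from typing import Optional, Sequence

MAX_RESPONSE = 4  # items are rated 0 (not at all) to 4 (very much)


def reverse_item(response: int) -> int:
    """Reverse a 0-4 response so that a higher value means less symptom burden."""
    return MAX_RESPONSE - response


def prorated_index_score(responses: Sequence[Optional[int]]) -> Optional[float]:
    """Prorated sum of item responses; None marks an unanswered item.

    Returns None unless more than 50% of the items were answered.
    """
    answered = [r for r in responses if r is not None]
    n_total = len(responses)
    if n_total == 0 or len(answered) * 2 <= n_total:
        return None
    # prorated score = (raw sum * number of total items) / number of items answered
    return sum(answered) * n_total / len(answered)


def symptom_endpoint_score(responses: Sequence[Optional[int]]) -> Optional[int]:
    """Plain sum of the target items; None if any target item is missing."""
    if any(r is None for r in responses):
        return None
    return sum(responses)
```

For instance, with made-up responses, a 7-item index with six answered items summing to 18 is prorated to 18 × 7 / 6 = 21, whereas a two-item endpoint with one missing response is left unscored.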

| Anchor variables
Anchor-based methods were used to evaluate the PRO measures' known-groups validity by ECOG PS, predictive criterion validity for meaningful clinical endpoints (i.e., change in ECOG PS, overall survival, and time-to-progression (TTP)), and responsiveness-to-change by ECOG PS. Patients were classified using the single-item provider-rated ECOG PS ranging from 0 (normal activity without symptoms) to 4 (unable to get out of bed), 18,19 and change in ECOG PS was defined as ECOG PS at Week 4 minus the value at baseline. Overall survival (OS) was defined as the time from study registration to death from any cause, censored at the date of last contact. TTP was defined as the time from study registration to evidence of disease progression, censored at the date of last disease evaluation.

FIGURE 1 The 10-item FHNSI-10 plus one additional item assessing overall treatment side effect bother (GP5) that is not scored with the other items. The following items comprise the FHNSI-7: GP4, GP1, HN7, HN12, HN3, HN10, and HN11. ©Copyright FACIT.org and reprinted with permission.
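The anchor definitions in this section (OS, TTP, and change in ECOG PS) can be sketched as follows. This is only an illustration under stated assumptions: durations are expressed in days, and the function and argument names are hypothetical rather than drawn from the trial's analysis code.

```python
from datetime import date
from typing import Optional, Tuple


def time_to_event(
    registration: date, event_date: Optional[date], last_followup: date
) -> Tuple[int, bool]:
    """Return (days from registration, event_observed).

    If no event date is recorded, the patient is censored at last follow-up
    (date of last contact for OS, date of last disease evaluation for TTP).
    """
    if event_date is not None:
        return (event_date - registration).days, True
    return (last_followup - registration).days, False


def ecog_change(baseline_ps: int, week4_ps: int) -> int:
    """Week 4 minus baseline; positive values mean worsened performance status."""
    return week4_ps - baseline_ps
```

The same helper serves both endpoints; only the meaning of `event_date` (death vs. documented progression) and of `last_followup` changes.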

| Statistical analyses
Descriptive statistics were used to characterize the patients and describe the distribution of PRO scores (i.e., FHNSI-10, FHNSI-7, and individual symptom endpoints) at baseline and over time. There was no treatment effect on OS in the larger trial, 16 and there was no main effect of treatment (p = 0.11) or a two-way interaction effect between treatment and time points on FHNSI-10 scores (p = 0.15). Therefore, all PRO scores were combined across treatment groups. All p-values were two-sided, and a value of <0.05 was considered statistically significant. These analyses were exploratory in nature, so no statistical adjustments were made for tests of multiple comparisons unless otherwise specified. We calculated Cronbach's alpha coefficients to assess the internal consistency reliability of the FHNSI-10 and FHNSI-7 index scores across time, and we calculated intraclass correlation coefficients (ICCs) to assess the test-retest reliability of the FHNSI-10 and FHNSI-7 scores from baseline to Week 4 among patients with stable ECOG PS. 20 We also assessed the known-groups validity, predictive criterion validity, and responsiveness-to-change for the FHNSI-10 and FHNSI-7 index scores and the symptom endpoint scores for pain, fatigue, swallowing, and breathing using anchor-based methods. Of note, we did not assess the validity or reliability of the communication symptom endpoint, as we do not hypothesize that difficulty communicating is related to ECOG PS, OS, or TTP. For known-groups validity, we used ANOVA tests to differentiate among ECOG PS at baseline with respect to PRO scores, with Scheffé tests to assess post-hoc pairwise differences. Non-parametric Kruskal-Wallis tests were further performed to confirm the ANOVA results for individual symptom endpoint scores.
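As an illustration of the internal consistency statistic named above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of the total score). It assumes complete-case data; the function name is hypothetical, and the handling of missing responses in the actual analysis is not described here.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(item_scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for complete-case data.

    item_scores[i][j] is the response of respondent i to item j.
    """
    n_items = len(item_scores[0])
    if n_items < 2 or len(item_scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(n_items)]
    # Sample variance of the summed score across respondents.
    total_var = variance([sum(row) for row in item_scores])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)
```

Because alpha rises with inter-item correlation, an index that deliberately combines distinct symptoms can yield modest values even when each item is measured well.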
For predictive criterion validity, we evaluated the relationships between baseline PRO scores and longitudinal anchor variables using univariate general linear models (for change in ECOG PS) and Cox proportional hazards (PH) models (for OS and TTP). We used multivariable models to confirm the results of the univariate models adjusting for age, sex, race, disease status, and prior treatments (i.e., chemotherapy, radiotherapy, and surgery, separately), 21 and we also adjusted for ECOG PS for Cox PH models assessing OS and TTP. We explored the PRO scores' change over time using mixed linear models with unstructured covariance, with the assessment time point considered as a categorical variable. For responsiveness-to-change, we used ANOVA to evaluate the relationships between changes in the PRO scores and changes in ECOG PS, with change scores defined as the value assessed at Week 4 minus the value assessed at baseline. Finally, we used univariate and multivariable general linear models to explore the relationships between the GP5 "bother" item, the PRO scores, and the incidence and severity of provider-rated adverse events over time.
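The ANOVA comparisons described in this section (PRO scores across baseline ECOG PS groups, and PRO change scores across ECOG PS change groups) rest on the one-way F statistic. The sketch below computes that statistic only; the function name is hypothetical, the group labels in any call are made up, and p-values and Scheffé post-hoc tests are outside its scope (a statistics package would normally supply them).

```python
from statistics import mean
from typing import Dict, Sequence, Tuple


def one_way_anova_f(groups: Dict[str, Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times squared deviation of the group mean.
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within
```

The two degrees-of-freedom values correspond to the pair reported alongside each F statistic, for example F(1, 210).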

| Sample characteristics
In total, 270 patients with recurrent or metastatic SCCHN were enrolled in the phase III E1302 trial, and 239 of those patients completed baseline PRO assessments and were eligible for this secondary analysis. See Table 1 for patients' baseline demographic and disease characteristics. Patients were mostly male (79.5%) and white (84.9%). Notably, most patients had poor prognosis, with a provider-rated ECOG PS of 2 (62.8%) and prior treatments with chemotherapy (74.5%), radiotherapy (84.9%), and/or surgery (61.1%). Primary head and neck cancer sites were mostly oropharynx (32.6%), larynx (25.5%), or oral cavity (22.2%), and almost half of the patients at baseline had eradicated disease but with local recurrence (46.1%).

| Internal consistency and test-retest reliability
See Table 2 for descriptive statistics of the PRO measures at each time point. Cronbach's alpha coefficients were satisfactory for FHNSI-10 (range 0.68-0.75) and FHNSI-7 (range 0.60-0.68) at all time points. In addition, among patients with stable ECOG PS from baseline to Week 4 (n = 123), test-retest reliability was acceptable for the FHNSI-10 (ICC = 0.76) and FHNSI-7 symptom indexes (ICC = 0.75).

| Known-groups validity
We assessed known-groups validity by examining the relationships between baseline PRO scores and patients' baseline ECOG PS. Across almost all PRO measures, mean PRO scores for participants with an ECOG PS of 0 were significantly higher (better) than for participants with an ECOG PS of 1 or 2 (Table 3). As an exception, there was not a significant difference between the mean breathing scores of participants with ECOG PS of 0 and 1. For the individual symptom endpoint scores, we confirmed these conclusions using non-parametric Kruskal-Wallis tests.

| Predictive criterion validity
We assessed predictive criterion validity by examining the relationships between baseline PRO scores and anchor variables over time (i.e., change in ECOG PS from baseline to Week 4, OS, and TTP; Table 4).

| Change in ECOG PS
Results from univariate general linear models showed that only higher (better) baseline scores for swallowing predicted increased (worsened) ECOG PS over time (F(1, 170)  The association between swallowing and OS did not reach statistical significance. These relationships were confirmed via multivariable models, with the exception of fatigue; the relationship between fatigue and OS was no longer significant after adjusting for demographic and clinical variables.

| Time-to-progression
None of the baseline PRO scores significantly predicted TTP as evaluated by univariate or multivariable Cox PH models.

| Responsiveness-to-Change
As seen in

| Treatment side effect bother, PRO scores, and adverse events over time
Higher scores on item GP5 ("I am bothered by side effects of treatment") have been associated with more clinician-reported adverse events and with lower patient-reported life enjoyment. 17 Thus, we used univariate general linear models to explore whether this item was associated with the PRO scores, the incidence of adverse events, and with the maximum grade  of adverse events over time (Table 6), and we confirmed the results using multivariable general linear models adjusting for age, sex, race, ECOG PS, disease status, and prior treatments. At baseline, higher GP5 (more treatment side effect bother) was significantly associated with lower (worse) FHNSI-10 (F(1, 210) = 3.94, p = 0.049) and breathing scores (F(1, 209) = 6.02, p = 0.01). However, the relationship between GP5 score and the FHNSI-10 was not sustained in a multivariable model. At Weeks 4 and 8, after patients initiated their assigned treatment, multiple associations emerged. Namely, higher Week 4 and Week 8 GP5 scores (more treatment side effect bother) were associated with lower (worse) FHNSI-10, FHNSI-7, pain, fatigue, and breathing scores. In addition, higher Week 4 GP5 was associated with more concurrent unique grade 1+ adverse events and with a higher maximum grade of adverse events. Higher Week 8 GP5 was also associated with a higher maximum grade of adverse events, but this relationship was not sustained in a multivariable model.

| DISCUSSION
This study sought to validate the FHNSI-10 and FHNSI-7, two very brief patient-reported symptom indexes, as well as individual symptom endpoints for use among patients with recurrent or metastatic SCCHN. We used data from an ECOG therapeutic trial (E1302) in which participants completed the FHNSI-10 13 plus one additional item related to treatment side effect bother (GP5) at multiple time points. Items from the FHNSI-10 were used to compute the even briefer FHNSI-7 and individual symptom endpoints for pain, fatigue, swallowing, breathing, and communication, each consisting of one or two items. The resulting symptom indexes and individual symptom endpoints included only the highest priority disease symptoms reported by patients with advanced head and neck cancer. 11,12 The FHNSI-10 and FHNSI-7 both performed adequately, with acceptable Cronbach's alpha internal consistency reliability coefficients over time and acceptable test-retest ICC reliabilities among patients with stable ECOG PS from baseline to Week 4. Of note, Cronbach's alpha is best suited as a measure of internal consistency reliability for scales that measure one latent construct as opposed to an index of various important elements (as is the case of the FHNSI-10 and FHNSI-7). 22 Thus, our finding that Cronbach's alpha coefficients fell within the low range of acceptable internal consistency reliability is not considered a weakness of these indexes. In addition, ECOG PS was not assessed at Week 2. Thus, a shorter interval for test-retest reliability was not available, and stronger test-retest reliability might occur across intervals shorter than 4 weeks. The PRO measures showed known-groups validity, as the FHNSI-10, FHNSI-7, and the pain, fatigue, and swallowing symptom endpoint scores successfully differentiated patients by provider-rated ECOG PS 0 vs. 1 and 0 vs. 2 at baseline (pre-treatment). In addition, the breathing symptom endpoint differentiated patients by ECOG PS 0 vs. 2, but was less sensitive to PS 0 vs. 1. The PRO measures were less successful in differentiating patients by change in ECOG PS over time; only better swallowing at baseline predicted worsened ECOG PS over time. This relationship is opposite to the direction we would expect. However, it should be noted that change in ECOG PS was only calculated for a subset of the analyzable patient population, and our results should be confirmed in a larger sample of patients who experience a change in ECOG PS over time.
Similar to past work that has linked quality of life with survival, we found evidence of predictive criterion validity such that better scores on the FHNSI-10 and FHNSI-7 symptom indexes at baseline predicted better survival. [23][24][25] Thus, these brief symptom indexes may have prognostic value among patients with recurrent or metastatic SCCHN. Moreover, less pain, fatigue, and breathing problems at baseline predicted better survival, though the relationship between fatigue and OS was not sustained after controlling for demographic and clinical variables. A recent review by Quinten and colleagues 26 identified emotional functioning, nausea/vomiting, and dyspnea as specific aspects of quality of life that are particularly relevant for predicting survival in patients with head and neck cancer. Our findings provide additional support that breathing-related symptoms provide prognostic information for patients with advanced head and neck cancer, particularly those with recurrent or metastatic SCCHN, and we extend prior work by identifying pain and possibly fatigue as other important markers of prognosis in this population. Interestingly, none of the PRO measures at baseline assessment predicted TTP, suggesting that factors other than patient health status may play a larger role in disease control.
We did not find evidence of responsiveness-to-change for the PRO measures by change in ECOG PS from baseline to Week 4. However, these null findings should be considered in the context of our data's limitations. As noted previously, longitudinal PRO data were only available for a subset of study participants. Moreover, the vast majority of patients with longitudinal data had stable ECOG PS and relatively few patients experienced changed ECOG PS. Nonetheless, for most PRO measures, mean changes were in the anticipated directions. Specifically, patients with improved ECOG PS tended to have positive PRO score changes, patients with worsened ECOG PS tended to have negative PRO score changes, and patients with stable ECOG PS tended to have minimal PRO score changes that fell between the two other groups. Future studies should consider assessing responsiveness-to-change for these PRO measures among a larger sample of patients in which a greater proportion of patients may experience changed ECOG PS.
Finally, we explored the associations between a single item that assesses how much patients are bothered by side effects of treatment and the PRO measures, the number of unique adverse events grade 1+, and the maximum grade of adverse events over time. At baseline, only more overall symptom burden on the FHNSI-10 index and more breathing problems were related to more treatment side effect bother. However, after the initiation of treatment, more treatment side effect bother was associated with more overall symptom burden (FHNSI-10 and FHNSI-7) and more pain, fatigue, and breathing problems. Further, at Week 4, more treatment side effect bother was associated with more concurrent adverse events grade 1+ and with a higher maximum grade of adverse events. The association between more treatment side effect bother and higher maximum grade of adverse events persisted to Week 8. Our findings complement past work, which also found that worse treatment side effect bother was associated with higher maximum grade of clinician-reported adverse events and less patient-reported enjoyment in life across four clinical trials of various cancer populations. 17 Our study provides additional support that a single item, "I am bothered by side effects of treatment," could have value as a very brief, patient-centered summary of treatment burden in clinical research and potentially clinical care.
Several limitations and considerations are noted. Although difficulty communicating is a high-priority head and neck cancer-related symptom, 11,12 our data did not include appropriate anchor variables by which to assess the validity or reliability of the communication symptom endpoint. Future studies should evaluate the psychometric properties of the communication symptom endpoint. Items assessing emotional functioning were not included in the symptom indexes or individual symptom endpoints evaluated here. This omission does not negate the importance of emotional functioning in this patient population. Rather, the brief symptom indexes and individual symptom endpoints evaluated here are meant to complement the assessments of other important aspects of the patient experience and provide options for very brief patient-centered PRO assessments in cases in which patient burden must be minimized as much as possible. As noted, there was substantial attrition in this study's sample with regard to the completion of the PRO measures after the baseline assessment. Thus, the longitudinal findings should be interpreted with caution. Some findings did not conform to expectations, possibly due to conducting multiple comparisons. In addition, the sample predominantly identified as non-Hispanic white, which limits the cross-cultural generalizability of our findings. Generalizability to patients with diagnoses other than recurrent or metastatic SCCHN and patients with worse performance status (ECOG PS 3-4) is also limited by parent study eligibility. Future work can evaluate the utility of these measures in expanded patient samples.
Nonetheless, these findings provide initial evidence of validity for the brief FHNSI-10 and FHNSI-7 symptom indexes and the even briefer one- to two-item individual symptom endpoints (i.e., pain, fatigue, swallowing, and breathing) among patients with recurrent or metastatic SCCHN. These PRO measures may have value in clinical research and perhaps even clinical practice, and they can assist providers in conducting patient-centered outcome assessments of patients with recurrent or metastatic SCCHN.