Long‐term health‐related quality of life in head and neck cancer survivors: A large multinational study

Head and neck cancer (HNC) patients suffer from a range of health‐related quality of life (HRQoL) issues, but little is known about their long‐term HRQoL. This study explored associations between treatment group and HRQoL at least 5 years' post‐diagnosis in HNC survivors. In an international cross‐sectional study, HNC survivors completed the European Organization for Research and Treatment of Cancer (EORTC) quality of life core questionnaire (EORTC‐QLQ‐C30) and its HNC module (EORTC‐QLQ‐H&N35). Meaningful HRQoL differences were examined between five treatment groups: (a) surgery, (b) radiotherapy, (c) chemo‐radiotherapy, (d) radiotherapy ± chemotherapy and neck dissection and (e) any other surgery (meaning any tumour surgery that is not a neck dissection) and radiotherapy ± chemotherapy. Twenty‐six sites in 11 countries enrolled 1105 survivors. They had a median time since diagnosis of 8 years, a mean age of 66 years and 71% were male. After adjusting for age, sex, tumour site and UICC stage, there was evidence for meaningful differences (10 points or more) in HRQoL between treatment groups in seven domains (Fatigue, Mouth Pain, Swallowing, Senses, Opening Mouth, Dry Mouth and Sticky Saliva). Survivors who had single‐modality treatment had better or equal HRQoL in every domain compared to survivors with multimodal treatment, with the largest differences for Dry Mouth and Sticky Saliva. For Global Quality of Life, Physical and Social Functioning, Constipation, Dyspnoea and Financial Difficulties, at least some treatment groups had better outcomes compared to a general population. Our data suggest that multimodal treatment is associated with worse HRQoL in the long‐term compared to single modality.


| INTRODUCTION
Head and neck cancer (HNC) encompasses a range of neoplasms with heterogeneous clinical presentation arising from the mucosal epithelia of the head and neck. It is usually treated in multidisciplinary teams involving head and neck surgeons, radiation oncologists and medical oncologists.2][3] Worldwide, 19.3 million cases were diagnosed in 2020, making HNC the seventh most frequent cancer diagnosis that year. 40][21][22][23][24] At 5 years' post-diagnosis, HNC patients are no longer routinely followed up. 166][27][28][29] Very little evidence on these survivors' long-term HRQoL is available, even though approximately half of patients with this diagnosis will reach this important milestone.
Through a project called 'Late Toxicity and Long-term Quality of Life in Head and Neck Cancer Survivors' (EORTC 1629), we aimed to address this by describing the HRQoL of a large, international cohort of HNC survivors. We also explored differences in HRQoL in this survivor population according to the type of treatment received.

| Study design and inclusion criteria
The EORTC 1629 study is a multinational cross-sectional study carried out by members of the Quality of Life Group and the HNC Group of the EORTC and coordinated at the University Hospital in Mainz, Germany. Survivors who fulfilled the inclusion criteria were identified at each participating centre and asked to participate by mailed invitation letter, at their follow-up appointment in hospital, or by telephone.
Eligibility criteria were: ≥18 years old, confirmed carcinoma of the larynx, lip, oral cavity, salivary glands, oropharynx, hypopharynx, nasopharynx, nasal cavity, nasal sinuses or unknown primary in the head and neck area, and the diagnosis more than 5 years in the past.
Exclusion criteria were eye, thyroid or orbit tumours, skin cancers or lymphoma in the head and neck region. Survivors with current evidence of disease or who had experienced a second primary were not excluded from the study, as these occurrences reflect the reality of some cancer survivors.

| Treatment groups
Five broad treatment groups were defined a priori: surgery only; radiotherapy (RT) only; chemo-radiotherapy (CRT); radiotherapy ± chemotherapy and neck dissection (RT ± CT and ND); and any other type of surgical intervention plus radiation ± chemotherapy (surgery and RT ± CT). The last two groups were separated to distinguish less extensive from more extensive surgeries. In the 'RT ± CT and ND' group, the assumption was that neck dissection was less extensive than the surgeries in the group that had any other kind of surgery plus RT or CT. The order of treatments was not considered in the multimodal treatment groups.
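As an illustration only, the five-group definition above could be expressed as a small classification rule. This is a sketch under stated assumptions, not the study's actual coding procedure: the `Treatment` record and the `treatment_group` function are hypothetical names, and the handling of edge cases (neck dissection alone counted as 'surgery only'; tumour surgery plus neck dissection plus RT counted as 'surgery and RT ± CT'; combinations without a matching group rejected) is assumed rather than taken from the protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Treatment:
    """Hypothetical per-survivor treatment record (not from the study's CRF)."""
    tumour_surgery: bool   # any tumour surgery other than neck dissection
    neck_dissection: bool
    radiotherapy: bool
    chemotherapy: bool


def treatment_group(t: Treatment) -> str:
    """Map a treatment record to one of the five groups described in the text."""
    any_surgery = t.tumour_surgery or t.neck_dissection
    if not t.radiotherapy:
        # Assumption: surgery of any kind without RT or CT is 'surgery only'.
        if any_surgery and not t.chemotherapy:
            return "surgery only"
        # Assumption: other combinations (e.g. CT alone) have no group.
        raise ValueError("combination not covered by the five treatment groups")
    # From here on, radiotherapy was given (with or without chemotherapy).
    if t.tumour_surgery:
        # Assumption: tumour surgery takes precedence over neck dissection.
        return "surgery and RT ± CT"
    if t.neck_dissection:
        return "RT ± CT and ND"
    return "CRT" if t.chemotherapy else "RT only"
```

The order of the checks mirrors the paper's statement that the order of treatments was not considered: only which modalities were received matters, not their sequence.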

| Data collection
Consenting survivors were invited to the local researcher's clinic to complete questionnaires. All documentation was completed with paper and pencil and then either scanned and emailed or sent by post to the coordinating centre in Mainz, where the data were entered into the Computer-Based Health Evaluation System (CHES®), a web-based database developed by the Evaluation Software Development company in Austria.30 The collaborators from Portugal and Greece chose to enter the data into the database themselves rather than send the documents to Mainz.

| Questionnaires
The questionnaires were the EORTC quality of life core questionnaire (EORTC QLQ-C30) and its head and neck cancer module (EORTC QLQ-H&N35).1,31 The EORTC QLQ-C30 consists of five functional scales, nine symptom scales and one global quality of life (QoL) scale, and has been validated in an international setting.1,31 The EORTC QLQ-H&N35 covers issues specific to HNC patients and includes 18 symptom scales; it has been validated in an international setting and is used extensively in HRQoL research.1,15,32,33 Both questionnaires use a four-point Likert scale to indicate the extent of problems experienced, ranging from 'not at all' to 'very much'. The answers for each domain are converted to a score ranging from 0 to 100; for functional scales, high scores represent a high level of QoL, and for symptom scales, high scores indicate a poor QoL. A difference in score of 10 or more points is considered clinically relevant and was the cut-off used in our study.34
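The conversion of item responses to a 0-100 score can be sketched as a linear transformation of the mean item response. The example below is a minimal illustration, not the official scoring code: the function name `scale_score` is hypothetical, and the rule that a scale is scored only when at least half of its items were answered is an assumption about missing-data handling that should be checked against the EORTC scoring manual.

```python
from statistics import mean


def scale_score(items, functional, item_range=3):
    """Convert item responses for one scale to a 0-100 score.

    items      -- responses coded 1..4, with None for a missing answer
    functional -- True for functional scales (high = good),
                  False for symptom scales (high = more symptoms)
    item_range -- highest minus lowest response code
                  (3 for the four-point items described in the text)
    """
    answered = [x for x in items if x is not None]
    # Assumption: score the scale only if at least half the items are answered.
    if not answered or 2 * len(answered) < len(items):
        return None
    raw = mean(answered)                      # raw score on the 1..4 metric
    scaled = (raw - 1) / item_range * 100     # stretch to 0..100
    # Functional scales are reversed so that a high score means good functioning.
    return 100 - scaled if functional else scaled


# Usage: two symptom items answered 'a little' (2) and 'quite a bit' (3)
print(scale_score([2, 3], functional=False))  # 50.0
```

The reversal for functional scales is what makes a high score favourable on those scales and unfavourable on symptom scales, as described above.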

| Clinical data
Physicians completed a Case Report Form for each survivor and recorded the survivor's sex, age, education, smoking status, diagnosis and treatment details, Karnofsky index and Charlson Comorbidity Score.35 Some clinicians reported UICC stage using version 7 and some using version 8, but all TNM values were reassessed using the version 7 classification, and these are the values reported here.

| Statistical analysis
The survivor characteristics are reported for the entire study population according to treatment group as frequencies and percentages.
The chi-square test for independence, Fisher's test or analysis of variance was used, depending on the type of data, to explore the distribution of demographic and clinical characteristics over the treatment groups.
Each of the HRQoL domain scores is reported for each treatment group as means with 95% confidence intervals (CI) and standard deviations (SD) for the raw data. Analysis of covariance (ANCOVA) was used to calculate adjusted means with 95% CI for all HRQoL domains and to assess statistical evidence for differences of 10 points or more between treatment groups; Tukey-Kramer post hoc tests were used to determine where the differences were. Age, sex, UICC stage and tumour sub-site were included as covariables, as we expected these to be the main sources of potential confounding. If adjusted means or CI for a HRQoL domain were less than 0 or more than 100, these were recorded as '0' and '100', respectively, as these are the limits of the HRQoL scores. As current evidence of disease or the occurrence of a second primary at some point since the HNC diagnosis was not an exclusion criterion, we also looked at whether our results from the ANCOVA changed if these survivors were removed from the analyses.
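The two post-processing rules described above (truncating adjusted means to the 0-100 score limits, and flagging pairs of treatment groups that differ by 10 points or more) can be sketched as follows. This is an illustration only: it does not fit the ANCOVA or run Tukey-Kramer tests, the function names are hypothetical, and the adjusted means in the usage example are invented numbers, not study results.

```python
from itertools import combinations


def clamp_score(value, low=0.0, high=100.0):
    """Truncate a model-based estimate to the valid HRQoL score range."""
    return max(low, min(high, value))


def meaningful_pairs(adjusted_means, threshold=10.0):
    """Return (group_a, group_b, difference) for every pair of groups whose
    truncated adjusted means differ by at least `threshold` points."""
    clamped = {group: clamp_score(m) for group, m in adjusted_means.items()}
    pairs = []
    for a, b in combinations(clamped, 2):
        diff = abs(clamped[a] - clamped[b])
        if diff >= threshold:
            pairs.append((a, b, round(diff, 1)))
    return pairs


# Usage with made-up adjusted means for one symptom domain
example = {"surgery only": 20.0, "RT only": 28.0, "CRT": 45.0}
print(meaningful_pairs(example))
# [('surgery only', 'CRT', 25.0), ('RT only', 'CRT', 17.0)]
```

A pair flagged here only meets the magnitude criterion; whether the difference is also statistically supported is a separate question answered by the post hoc tests.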
This study did not have a specific hypothesis and was aimed at exploring HRQoL differences between the treatment groups. A sample size of 1045 survivors would be necessary to assess differences across five groups in 10 HRQoL domains with 90% power and an alpha of 0.01, assuming a standard deviation of 25 points in each scale.

| Enrolment
The first survivor was enrolled in October 2018 and the last in October 2021, with the start of the COVID-19 pandemic slowing enrolment considerably in 2020. Twenty-six sites in 11 countries enrolled survivors, with the highest enrolment in Italy, Belgium, Germany, Norway and Brazil. Figure 1 shows the breakdown of enrolled survivors and the final number that could be included for analysis. Of the 1113 survivors with treatment information, eight did not complete the questionnaires, meaning that 1105 survivors are included in the analysis. The reasons for not completing the HRQoL questionnaires included that the participant did not wish to and, in one case, that the person died before completing them.

| Survivor characteristics
111 (10%) 'RT ± CT and ND' and 422 (38%) 'surgery and RT ± CT'. The average age was 66 years (range 23-93), most were male (71%) and former smokers (57%). The most frequent tumour sub-sites were oropharynx (34%), oral cavity (22%) and larynx (19%), and the majority were diagnosed at an advanced stage (22% stage III and 39% stage IV). The median time since diagnosis was 8 years for all treatment groups except for 'RT ± CT and ND', for which it was 9 years. Three percent had current evidence of disease and 15% had had a second primary (not necessarily in the head and neck).
There was evidence for differences in patient characteristics among the treatment groups (age: P < .0001, sex: P = .004, tumour sub-site: P < .0001 and UICC stage: P < .001), while performance status and comorbidity index were more evenly distributed (Karnofsky: P = .05, Charlson: P = .2). The characteristics with the largest differences among the treatment groups were the tumour sub-sites and the UICC stage.

| HRQoL results
The raw (unadjusted) data showed differences of 10 points or more between survivors in some of the treatment groups for Fatigue, Insomnia, Pain in the Mouth, Swallowing, Senses Problems, Trouble with Social Eating, Teeth, Opening Mouth, Dry Mouth and Sticky Saliva (Table 2).
This changed slightly in the models where we adjusted for age, sex, UICC stage and tumour sub-site: Teeth and Trouble with Social Eating no longer had a 10-point difference, but Sexuality did (Table 3 and Figure 3); in the adjusted model, Sexuality had a 10.3-point difference between 'surgery only' and 'RT ± CT and ND'. The adjusted means for Trouble with Social Eating all shifted down to zero or near zero. Among the domains with a difference of 10 points or more in the adjusted model, survivors in the 'surgery only' and 'RT only' treatment groups continued to have the lowest scores, indicating the lowest symptom burden. The survivors in the 'surgery and RT ± CT' group had the highest symptom scores compared to the other treatment groups for Fatigue, Pain in the Mouth, Senses Problems and Opening Mouth (Figures 2 and 3); the 'RT ± CT and ND' group had the highest scores for Insomnia and Sexuality (Figures 2 and 3); and CRT had the highest symptom scores for Swallowing, Dry Mouth and Sticky Saliva (Figures 3 and 4). All of the 10-point differences for Fatigue, Pain in the Mouth, Swallowing, Senses Problems and Dry Mouth were statistically significant in post hoc Tukey-Kramer tests (P ≤ .01), with the exception of the 10.6-point difference for Dry Mouth between RT and CRT (P = .05). The 11.8-point difference between 'surgery only' and 'RT ± CT and ND' for Insomnia had a P-value of .04, and the 10.3-point difference found for Sexuality had P = .2. Opening Mouth had a statistically meaningful 10-point difference between 'surgery only' and 'surgery and RT ± CT' (P < .0001), between 'RT only' and 'surgery and RT ± CT' (P = .0012) and between 'surgery only' and 'CRT' (P = .0124); all the 10-point differences in Sticky Saliva had P values <.01 except for 'surgery only' vs 'RT only' (P = .03). The three domains with the largest adjusted mean difference among the treatment groups were Dry Mouth (largest difference [Δ] was 31.0 between 'surgery only' and 'CRT'), Sticky Saliva (Δ was 20.9 between 'surgery only' and 'CRT') and Opening Mouth (Δ was 16.5 between 'surgery only' and 'surgery and RT ± CT') (Figures 3 and 4).
In the remaining groups with a clinically meaningful difference, the maximum differences in each domain were between 10.2 (Pain in the Mouth) and 13.8 (Swallowing). Where notable differences between the treatment groups were present, problems were predominantly experienced by survivors treated with multimodal therapies.
There were 909 survivors (89.5% of the study population) who had neither current evidence of disease nor had had a second primary. When the adjusted models were rerun to include only these 909 patients, the means and CI did not change in clinically

sexuality issues are related to some other aspect than HNC treatment.37 Nolte et al reported Fatigue and Insomnia scores of 29.5 and 26.6, which are within range of our results as well, indicating that while our survivor population experiences these problems to some extent, it is not substantially different from a general international population.38 The domains where no clinically meaningful differences between the treatment groups were found are also of interest for understanding how treatment may affect long-term HRQoL. For example, the adjusted mean Physical Functioning scores across our study's treatment groups were 100 (indicating the highest functioning possible), while Nolte et al reported an average of 85.1 in an international general population.38 This could indicate a selection of healthy survivors, whereby the survivors in our study were healthy enough to attend a clinical visit, while Nolte et al collected the HRQoL data through online surveys, which would have required less physical strength; alternatively, treatment may have little to no effect on this domain in the long term. Even in our unadjusted models, Physical Functioning was quite good, with means ranging from 81.3 to 84.6.
Indeed, 67% of our survivor population had a Karnofsky score of 90 or higher, which corresponds to being able to carry out normal activities. Speech, too, was not a notable problem for the survivors in our study, but this could be because the survivors had adjusted to their voice limitations and may no longer regard them as a significant problem. Dyspnoea is also interesting in that survivors across treatment groups had a low symptom burden (8.3 or less), whereas examples from a general population are 18.5 and 15.9.37,38 Across all treatment groups, Financial Difficulties was also a considerable problem, with adjusted mean scores ranging from 36.9 to 42.5, whereas the general population measurement was considerably lower at 10.6.38
The strengths of this study included the large sample of over 1100 individuals from an international setting and the use of well-established, validated questionnaires to ascertain HRQoL. Our study has good statistical power and adds substantial HRQoL information for HNC survivors on what to expect in the long term, along with an indication of where differences may be expected depending on treatment. Limitations include that, for multimodal treatment, the order of the treatments was not recorded. This means we cannot be sure whether the patients received the radiotherapy as primary or adjuvant therapy and the surgery as primary treatment or in salvage, which could affect HRQoL. Originally, the two multimodal surgery groups were together, and we separated these into RT ± CT plus neck dissection and RT ± CT plus any other surgery on the assumption that in the latter group the surgeries were more radical and would impact more on HRQoL than a neck dissection. The lack of additional treatment information, such as the type of chemotherapy agent and radiation dosage, limits a more precise analysis. Treatments have evolved over the last decades, and, for example, the use of laser surgery, robotic surgery, intensity-modulated radiotherapy and proton therapy has meant evolving acute toxicities, which may in turn affect the long-term outcomes.
Although we adjusted for stage of disease and tumour site, it remains possible that some of the differences we found are influenced by these important factors. We also do not have information on HPV status; adjusting for this factor would have been interesting as HPV-associated disease has a better prognosis than HPV-negative disease. It would have been preferable to also include the EORTC Survivorship questionnaire (SURV100) in this study, but it was not available at the time the study protocol was created and indeed is still in Phase IV testing.39 It is possible that some issues specific to survivorship were missed or that comorbidities not assessed by the Charlson Comorbidity Index are present. Using a 10-point difference between the treatment groups based on the findings of Osoba et al is a reasonable choice, but we realize that while Osoba et al were looking for a minimally significant change, we have investigated a minimally significant difference.1][42][43] It is likely that survivors who were not doing well were less likely to agree to participate than those who function well, particularly because physical attendance at the clinic was part of the study. The lack of information about those who declined to participate and those who did not respond at all prevents us from understanding the extent of differences between participants and non-participants.
Ideally, long-term prospective studies would be used to assess HRQoL, but the trajectory would cover many years and, given the long-term prognosis of the disease, a considerable number of patients would need to be enrolled at diagnosis in order to gain robust results after 5 years. An alternative could be to implement routine assessments of HRQoL and then examine these retrospectively.

| CONCLUSIONS
This study of long-term HRQoL among HNC survivors provides one of the most comprehensive overviews on this topic to date. Clinically meaningful differences in HRQoL between treatment groups were found among long-term HNC survivors in nine HRQoL domains, seven of which had statistical significance. Survivors who have had only surgery or RT had the smallest symptom burden compared to survivors with multimodality treatment, even after taking site and stage into account. In some domains, survivors' HRQoL scores were better than examples from the general population. Our conclusions on the problems experienced by long-term HNC survivors provide a basis to educate patients on specific long-term quality of life issues related to their treatment and could contribute to clinicians tailoring specific follow-up regimes. In addition, even before treatment begins, newly diagnosed patients can be informed about the possible long-term effects of treatment.

1105 survivors with HRQoL data included in analysis

Characteristics of the 1105 survivors by type of treatment.
Note: Percentages are column percentages except for the Totals row. Oropharynx includes base of tongue and tonsil. Salivary gland includes parotid gland and other salivary gland.
Abbreviations: CRT, chemo-radiotherapy; CT, chemotherapy; ND, neck dissection; RT, radiotherapy.
a The order of the treatments is not known.
b ANOVA model P < .001.
c Chi2 test for independence P < .005.
d Fisher P < .001.

Health-related quality of life measured with the EORTC QLQ-C30 and EORTC QLQ-H&N35 according to the type of treatment received reported as unadjusted means with 95% CI and standard deviation (SD).
Note: Bolded rows contain at least one difference of 10 or more points between treatment groups. For the functional scales and the global quality of life scale, high scores indicate good functioning and good quality of life; for all other scales, high scores are an indication of high symptom burden/poor quality of life in that area.

Health-related quality of life measured with the EORTC QLQ-C30 and QLQ-H&N35 according to the type of treatment received: means are adjusted by age, gender, UICC stage and tumour sub-site.
Means with 95% CI are reported. For the functional scales and the global quality of life scale, high scores indicate good functioning and good quality of life; for all other scales, high scores are an indication of high symptom burden/poor quality of life in that area. Results are adjusted for age, gender, UICC stage and tumour sub-site. Bolded rows contain at least one difference of 10 or more points between treatment groups but the difference is not statistically significant (Tukey post-hoc test >0.01). Bolded and italics contain at least one 10-point difference with evidence of a statistical difference (Tukey post-hoc test ≤0.01).
a Indicates all the 10-point differences between treatment groups were statistically significant (Tukey post-hoc P value ≤.01).
b Indicates at least one (but not all) of the 10-point differences between treatment groups was statistically significant (Tukey post-hoc P value ≤.01).
There was little change in the Insomnia scores between the raw means and the adjusted means, suggesting that the age, sex,