Long‐term health‐related quality of life among men with prostate cancer in the Finnish randomized study of screening for prostate cancer

Abstract
Background: The long‐term health‐related quality of life (HRQOL) impacts of prostate cancer (PCa) screening have not been adequately evaluated. We aimed to compare generic and disease‐specific HRQOL among men with prostate cancer in the screening arm with that in the control arm of a PSA‐based prostate cancer screening trial over up to 15 years of follow‐up.
Materials and methods: This study was conducted within the population‐based Finnish Randomized Study of Screening for Prostate Cancer (FinRSPC). During 1996‐1999, 80 458 men were randomized to the serum prostate‐specific antigen (PSA) screening arm (SA, N = 32 000) and the control arm (CA, N = 48 458). Men in the screening arm were screened at 4‐year intervals until 2007. HRQOL questionnaires were delivered to prostate cancer patients newly diagnosed in the screening and control arms during 1996‐2006 (N = 5128) at the time of diagnosis (baseline) and at 3‐month, 12‐month, and 5, 10, and 15‐year follow‐up. The validated UCLA Prostate Cancer Index (UCLA‐PCI) and the RAND 36‐Item Health Survey were used for HRQOL assessment. The data were analyzed with a random effects model for repeated measures.
Results: At baseline, men with prostate cancer in the screening arm reported better Sexual Function, as well as less Sexual and Urinary Bother. Long‐term follow‐up revealed slightly higher HRQOL scores in the screening arm in prostate cancer‐specific measures at 10 years post diagnosis, but the difference was statistically significant only in Urinary Bother (UCLA‐PCI score 77.9; 95% CI 75.2 to 80.5 vs 70.9; 95% CI 66.8 to 74.9; P = .005). The generic HRQOL scores were comparable between the trial arms. The overall differences in disease‐specific or generic HRQOL scores by trial arm did not vary during the follow‐up.
Conclusion: No major differences were observed in HRQOL in men with prostate cancer between the prostate cancer screening and control arms during 5 to 15‐year follow‐up.


| INTRODUCTION
Prostate cancer (PCa) is the most common cancer and one of the leading causes of cancer death in men in Western countries. 1 Long-term survival following PCa diagnosis is good; age-standardized 5-year survival is in the range 70%-100% in most countries. 2 PCa is the main global contributor to years lived with cancer disability, 3 and men with clinically detected PCa have been shown to experience treatment-related adverse effects at 5 years, 4 and even up to 10-15 years after diagnosis. 5,6 The European Randomized Study of Screening for Prostate Cancer (ERSPC) has shown a reduced incidence of advanced disease, 7 and a 20% relative reduction in PCa mortality over 16-year follow-up. 8 Hence, screening for PCa could potentially offer major benefits and improve quality of life in men with screen-detected PCa. However, frequent overdiagnosis and treatment of cancers that would not go on to cause symptoms or death may offset any such benefits. 9 Evaluation of the long-term health-related quality of life (HRQOL) effects of screening is important for the overall evaluation of PCa screening. 10,11 Few studies have reported long-term quality of life outcomes in PCa screening trials, and even those studies have been limited by their cross-sectional design and the absence of a pretreatment baseline HRQOL assessment. A previous analysis in the Finnish Randomized Study of Screening for Prostate Cancer (FinRSPC), using data from the year 2011 with a median of 6.7 years (control arm) and 8 years (screening arm) of follow-up, suggested slightly higher generic HRQOL scores for men diagnosed with PCa in the screening arm than for such men in the control arm (as measured by 15D, EQ-5D and SF-6D; statistically significant only for EQ-5D). No differences were found between the arms for men in the trial subsample free of PCa. 12
The current study differs from the earlier publication in that the data were collected at regular intervals after diagnosis, which permits more robust interpretation of the results, whereas the data for the previously published study were collected not at set points after diagnosis but at fixed points after the start of the trial. In addition, our current study has data for both generic (RAND-36) and disease-specific (UCLA-PCI) HRQOL. In the US Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial, no clear difference was found between trial arms in disease-specific quality of life scores among PCa survivors, suggesting that screening detection does not affect urinary, sexual, and bowel functioning at 5 to 10 years after diagnosis. 13 The majority of patients diagnosed through PCa screening will live at least 10-15 years after diagnosis; however, research evidence on the long-term HRQOL effects of PCa screening is still scarce. This study aims to broaden the knowledge of the overall impact of prostate-specific antigen (PSA)-based PCa screening at the population level by comparing both generic (RAND-36) and disease-specific (UCLA-PCI) HRQOL among PCa patients in the screening arm and control arm of a randomized trial with 15-year follow-up.

| MATERIAL AND METHODS
The study is based on the population-based Finnish Randomized Study of Screening for Prostate Cancer (FinRSPC), which is the largest component of the multicenter ERSPC trial. The Finnish trial recruited 80 458 men during 1996-1999 at two screening centers, Helsinki and Tampere. Annually, 8000 men aged 55, 59, 63, or 67 years were randomized to the screening arm (N = 32 000), and the remaining men (N = 48 458) formed the control arm. Men in the screening arm were screened at 4-year intervals until 2007. The trial protocol has been described previously in detail. 14 In the present study, men with PCa in the screening arm and control arm were compared on an intention-to-treat basis, ie regardless of screening compliance, to focus on the population-level effect of screening.
A total of 5128 incident PCa cases in both arms were identified from the Finnish Cancer Registry during 1996-2006. HRQOL questionnaires were delivered at the time of diagnosis (baseline, prior to primary treatment) and at 3 months, 12 months, and 5, 10, and 15 years after the PCa diagnosis (Figure 1).
HRQOL was evaluated using both generic and disease-specific domains. The UCLA Prostate Cancer Index (UCLA-PCI) has demonstrated validity as a disease-specific HRQOL measure in men with PCa. 15 It consists of six scales: Urinary Function, Sexual Function, and Bowel Function, and a Bother assessment for each of the Urinary, Bowel, and Sexual domains.
The RAND 36-Item Health Survey (also known as SF-36) is a validated and widely used generic health instrument. 16 It consists of eight scales: Physical Functioning, Role-Physical, Role-Emotional, Bodily Pain, General Health, Vitality, Social Functioning, and Mental Health.
The RAND-36 and UCLA-PCI scales both range from 0 to 100 after re-scoring, with a score of 100 representing optimal health, normal functioning, or no bother. Differences of 10 points on the UCLA-PCI 17 and RAND-36 18 scales are suggested by their authors to be clinically meaningful.
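The re-scoring and the 10-point threshold can be illustrated with a minimal sketch. It assumes a simple linear mapping of each item onto 0 to 100 and a scale score equal to the mean of the answered items; the actual RAND-36 and UCLA-PCI scoring keys are item-specific and are not reproduced here. The function names and the three-item example are hypothetical.

```python
from statistics import mean

CLINICALLY_MEANINGFUL = 10.0  # points on a 0-100 scale, as stated in the text


def rescale_item(raw, low, high, reverse=False):
    """Linearly map a raw response in [low, high] onto 0-100 (assumed mapping)."""
    if not low <= raw <= high:
        raise ValueError(f"response {raw} outside [{low}, {high}]")
    score = 100.0 * (raw - low) / (high - low)
    # Reverse-keyed items are flipped so that 100 always means best health.
    return 100.0 - score if reverse else score


def scale_score(item_scores):
    """Scale score = mean of the answered (non-missing) rescaled items."""
    answered = [s for s in item_scores if s is not None]
    return mean(answered) if answered else None


def is_clinically_meaningful(score_a, score_b, threshold=CLINICALLY_MEANINGFUL):
    """True when two 0-100 scores differ by at least the threshold."""
    return abs(score_a - score_b) >= threshold


# Hypothetical three-item scale on a 1-5 response format; one item unanswered.
items = [rescale_item(4, 1, 5), rescale_item(2, 1, 5, reverse=True), None]
example_scale = scale_score(items)

# Urinary Bother at 10 years as reported in this article: 77.9 vs 70.9.
urinary_bother_meaningful = is_clinically_meaningful(77.9, 70.9)
```

Under these assumptions the 7-point Urinary Bother difference falls below the 10-point threshold, so `urinary_bother_meaningful` is `False`.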
Background variables included age at diagnosis, screening center (Helsinki vs Tampere), comorbidity, 19 PCa tumor risk group, primary treatment group, and national registry data on educational level, socioeconomic status, and marital status. The European Association of Urology PCa tumor risk group classification, 20 based on PSA, Gleason score, and TNM stage, was used.

| Statistical analyses
Background variables were compared between the trial arms using χ² tests. In the main analyses, we analyzed changes over time in HRQOL scores in the two arms using a linear random effects model with a population-averaged estimator and an exchangeable correlation structure, which can take into account the statistical dependence among observations in repeated measures. Missing values were not imputed, as less than 5% of the cases had missing values, which was considered minor. We applied the inverse probability weighting method to reduce biases due to imperfections in the sample related to noncoverage and unit nonresponse. It involves weighting each participant's contribution to the estimation according to how likely they were, compared to the target population (ie diagnosed PCa cases), to be complete records, based on poststratification for arm, age, screening center, comorbidity, educational level, prostate cancer risk group, and the interaction term for the year of diagnosis and arm. All analyses were adjusted for the assigned sampling weights. All models included screening center, sampling weights, the main effects of trial arm and follow-up time, their interaction term (in order to assess whether the effect of screening varied over time), and an interaction term for age and time. Variables were treated as time-invariant based on the measure at the time of diagnosis. To avoid overadjustment, ie adjustment for causal intermediates of the intervention examined, the models did not include tumor risk group or treatment, which are likely affected by screening, as this would be expected to bias the results toward the null. To facilitate interpretation, we presented the main results using predictive margins and their 95% confidence intervals (CI) (Figures 2-7) and tested statistical difference with marginal effects P values.
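The weighting step can be sketched as follows. This is an illustration only: it poststratifies on two made-up variables (arm and risk group) rather than the full set listed above, all counts are invented, and `poststratification_weights` is a hypothetical helper, not the authors' Stata code.

```python
from collections import Counter


def poststratification_weights(target, respondents):
    """Weight per stratum = target-population share / respondent share.

    `target` and `respondents` are lists of stratum labels, one per person.
    A stratum present in the target but absent among respondents cannot be
    weighted and raises an error.
    """
    target_counts = Counter(target)
    resp_counts = Counter(respondents)
    n_target, n_resp = len(target), len(respondents)
    weights = {}
    for stratum, count in target_counts.items():
        if resp_counts[stratum] == 0:
            raise ValueError(f"no respondents in stratum {stratum!r}")
        target_share = count / n_target
        resp_share = resp_counts[stratum] / n_resp
        weights[stratum] = target_share / resp_share
    return weights


# Invented strata (arm, risk group); SA = screening arm, CA = control arm.
target = ([("SA", "low")] * 40 + [("SA", "high")] * 60
          + [("CA", "low")] * 30 + [("CA", "high")] * 70)
respondents = ([("SA", "low")] * 20 + [("SA", "high")] * 10
               + [("CA", "low")] * 10 + [("CA", "high")] * 10)

weights = poststratification_weights(target, respondents)
# Summing the weights over respondents recovers the respondent count.
weighted_total = sum(weights[s] for s in respondents)
```

Strata that are under-represented among respondents (here the high-risk groups) receive weights above 1, so their answers count for more in the weighted analysis.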
Predictive margins are statistical measures computed from predictions given by a regression model; individual predicted values are used to calculate the mean predicted values, ie predictive margins, while adjusting for the values of the covariates. 21 We conducted sensitivity analyses with generalized estimating equations models with robust standard errors, unstructured correlations, and maximum likelihood estimation as alternative approaches for analyzing the data. The sensitivity analyses yielded results equivalent to those of the models we used. We also conducted per-protocol sensitivity analyses comparing screen-detected cancers with cancers detected by other methods. In these analyses the results were not reversed, although for some disease-specific HRQOL variables the differences were more pronounced than in the original analyses. All P values were two-sided, and a significance level of 0.05 was applied in statistical tests. We carried out all the statistical analyses using Stata 14.0. 22
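A minimal sketch of how a predictive margin is computed may help. It assumes an already fitted additive linear model; the coefficients, covariates, and sample rows below are invented for illustration and are not estimates from this study.

```python
def predict(coef, arm, age, center):
    """Linear prediction from a fitted model (coefficients are hypothetical)."""
    return (coef["intercept"]
            + coef["arm"] * arm
            + coef["age"] * age
            + coef["center"] * center)


def predictive_margin(coef, sample, arm, weights=None):
    """Weighted mean prediction with everyone set to `arm`,
    keeping each person's other covariates as observed."""
    if weights is None:
        weights = [1.0] * len(sample)
    total = sum(w * predict(coef, arm, row["age"], row["center"])
                for w, row in zip(weights, sample))
    return total / sum(weights)


# Made-up coefficients and covariate rows, for illustration only.
coef = {"intercept": 90.0, "arm": 5.0, "age": -0.25, "center": -2.0}
sample = [
    {"age": 60, "center": 0},
    {"age": 64, "center": 1},
    {"age": 68, "center": 1},
    {"age": 72, "center": 0},
]

margin_screening = predictive_margin(coef, sample, arm=1)
margin_control = predictive_margin(coef, sample, arm=0)
marginal_effect = margin_screening - margin_control
```

In this purely additive sketch the marginal effect equals the arm coefficient. In a model with an arm-by-time interaction, such as the one described above, the margins would be computed separately at each follow-up time and could differ between time points.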

| RESULTS
During 1996-2006, 5128 new PCa cases were diagnosed in the study population (Figure 1). Data collection failed for the year 2002 in both centers (response proportion 2.5%), and after 2002 in the Tampere center. These data were excluded, and 624 eligible cases remained in the screening arm (response proportion 33%) and 411 in the control arm (response proportion 22%), with an overall response proportion of 27%. The mean follow-up time was 8.3 years, with a median of 10 years.
Compared to the screening arm, respondents in the control arm were significantly older, had a lower proportion of low-risk group cancers (P < .01), and were more commonly from the Helsinki screening center (P < .001) (Table 1). There were no differences between the arms in any other sociodemographic variables or comorbidity. Surgery and active surveillance were more frequent primary treatments in the screening arm, while radiotherapy and hormonal therapy were more common in the control arm (P < .001).
In the nonresponse analyses (data not shown), nonparticipation was more likely among patients with higher-risk cancers and patients treated with hormonal therapy in both arms, and among upper-level employees in the control arm. None of the other dimensions of socioeconomic status, marital status, or comorbidity was associated with nonresponse. Radical prostatectomy patients were over-represented in the HRQOL sample compared to the total prostate cancer cohort in both arms, as were the age group 60-64 years in the screening arm, and the age group 65-69 years and the intermediate-risk disease group in the control arm.

| Prostate cancer-specific quality of life
The model-based mean values for each UCLA-PCI scale present the changes in HRQOL scores over time in the two trial arms (Figures 2-7). At baseline, the patients in the screening arm showed statistically significantly higher scores, ie less bother, than those in the control arm in Urinary Bother (79.2; 95% CI 77.1 to 81.3 vs 74.0; 95% CI 71.0 to 76.9; P = .005) (Figure 3). The Urinary Function, Sexual Function, and Sexual Bother scores were also higher in the screening arm, though not significantly so (Figures 2, 6 and 7). The HRQOL scores in bowel domains did not differ by trial arm at baseline or in short-term follow-up. Urinary Function, Sexual Function, and Sexual Bother scores declined steeply (more than 10 points) at 3 and 12 months after diagnosis in both arms. Urinary Function showed an improvement at 12 months. In short-term follow-up, the differences in all disease-specific measures between the trial arms tended to decrease from the baseline.
At 5 to 15 years, the men in the screening arm showed a tendency toward somewhat higher disease-specific HRQOL scores in all domains compared with those in the control arm. The differences were statistically significant, however, only in Urinary Bother; at 10-year follow-up men in the screening arm reported higher scores, ie less bother (77.9; 95% CI 75.2 to 80.5), relative to the control arm (70.9; 95% CI 66.8 to 74.9; P = .005) (Figure 3). In Bowel Bother, higher scores emerged in the screening arm only at 5 to 15 years postdiagnosis (Figure 5). However, the interaction term of trial arm and time was not statistically significant in any domain, indicating a lack of systematic changes over time in the difference between the arms. In Bowel Function, there were no marked changes or differences between arms at any time point (Figure 4).
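As a rough plausibility check, the reported Urinary Bother P values can be approximated from the published estimates and confidence intervals alone. The sketch below assumes symmetric normal-theory 95% CIs and independent arm estimates; the article's own P values come from model-based marginal effects, so this is an approximation and not the authors' procedure. The function names are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile behind a two-sided 95% confidence interval


def se_from_ci(lower, upper, z=Z_95):
    """Standard error implied by a symmetric normal-theory CI."""
    return (upper - lower) / (2 * z)


def two_sided_p(est_a, ci_a, est_b, ci_b):
    """Approximate z-test for the difference of two independent estimates."""
    se_diff = math.hypot(se_from_ci(*ci_a), se_from_ci(*ci_b))
    z = (est_a - est_b) / se_diff
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Urinary Bother, screening vs control arm, values as reported in the text.
p_baseline = two_sided_p(79.2, (77.1, 81.3), 74.0, (71.0, 76.9))
p_10_years = two_sided_p(77.9, (75.2, 80.5), 70.9, (66.8, 74.9))
```

Both approximations come out close to .005, in line with the P values reported for Urinary Bother at baseline and at 10 years.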
Unadjusted HRQOL scores, together with the distribution of one item of each scale, are presented by arm in Table A1.
Patients in the screening arm with low-risk tumors reported significantly better Sexual Function at baseline and less Urinary Bother at 10-year follow-up compared to the control arm (data not shown). However, no statistically significant interaction with trial arm was found overall. Sensitivity analyses with adjustment for both tumor risk group and primary treatment showed no clear reduction in the differences between the arms in HRQOL.
Table 1 (fragment). Socioeconomic status, n (%): screening arm, control arm, total
Upper-level employee: 48 (8), 23 (6), 71 (7)
Higher-level employee: 40 (6), 17 (4), 57 (6)
Self-employed person: 21 (3), 15 (4), 36 (3)
Manual worker: 39 (6), 25 (6)

| Generic health-related quality of life
At baseline, the generic HRQOL scores did not differ significantly between the arms, although the mean scores for the screening arm were slightly (1-3 points) higher on all domains (Table 2). In contrast, at 3 to 12-month follow-up, patients in the screening arm reported similar or nonsignificantly lower scores than the control arm. There was a decline at 3 months postdiagnosis compared to the baseline, especially in Role-Emotional (by 6.9 points) and Role-Physical (by 7.4 points) functioning. At 5 to 15-year follow-up, the screening arm had similar or higher mean RAND-36 scores than the control arm, though the differences were not significant in any of the generic quality of life dimensions. Furthermore, there was no significant interaction between trial arm and time in any domain of generic HRQOL.

| DISCUSSION
The long-term HRQOL impacts of PCa screening have not been adequately evaluated. Our results revealed only minor differences in disease-specific HRQOL at 5-15 years, with generally slightly higher mean scores in the screening arm, though not of a magnitude that has been considered clinically meaningful (over 10 points). No substantial or consistent differences between the trial arms emerged in generic HRQOL. Opportunistic screening, ie PSA screening outside a screening program, will dilute the effect of screening. We compared groups by the trial arm allocated by randomization (intention-to-screen analysis). PSA testing has been shown to be common in the control arm of the FinRSPC trial. 23 At least one PSA test had been performed for 18% of the men in the control arm by four years, and the proportion reached 48% by eight years. The mortality reduction in the Finnish trial alone has been small, 24 likely reflecting both contamination and a shorter screening period with fewer screening rounds than in ERSPC Göteborg and ERSPC Rotterdam. 25 The PLCO trial showed no mortality reduction, likely due to widespread contamination and low biopsy compliance. 10,11 In centers with a larger mortality effect, the impact of screening on quality of life may also be greater. HRQOL results in the PLCO trial indicated no substantial long-term differences between arms in disease-specific measures. 13 Contamination has likely diluted differences between the arms, and this can be expected to also affect the current HRQOL results.
Overdiagnosis, ie detection of cases that would not have been diagnosed in the absence of screening, has been estimated to comprise 21%-50% of all screen-detected PCas. 9,10 Overdiagnosis, as well as lead-time and length-time bias, affects the distribution of prognostic factors among screen-detected cases. In the present study, low-risk cancers (T1-T2, Gleason < 7, PSA < 10) were more common in the screening arm than in the control arm (45% vs 30%), and the difference was even more pronounced among the quality of life survey respondents (49% vs 26%). However, adjustment for the tumor risk group did not substantially affect the results, suggesting that the influence of overdiagnosis on our findings is limited. Based on simulation model prediction, Heijnsdijk and colleagues 26 concluded that the overall benefit of PSA screening from averted deaths and advanced cancers was decreased by the loss of quality-adjusted life-years (QALYs) caused by overdiagnosed cancers. Recent reviews of PCa treatment have concluded that adverse sexual, urinary, and bowel effects occur more commonly in men with active treatments than in those with conservative management (eg active surveillance, watchful waiting). 4,10 Fenton and colleagues also concluded, based mainly on studies among clinically detected patients with ≤5 years of follow-up, that despite difficulties in PCa-specific domains of functioning, active treatments for PCa did not clearly compromise generic quality of life or physical or mental health status compared with conservative management. 10 Within the ERSPC trial, after accounting for disease and patient characteristics, trial arm had only a minor role in treatment choice compared to other variables. 27 We conducted supplementary analyses with adjustment for treatment, which did not materially affect the HRQOL differences between the trial arms.
Our study had some limitations. Firstly, the overall response proportion did not exceed 22% of the eligible men, which suggests a possibility of selection bias and may challenge the generalizability of our findings. After exclusion of data affected by failed data collection procedures, the response proportion was 27%, and we used weighting to improve the representativeness of the study population. However, the lack of statistical power may have prevented us from observing differential time trends between the screening and control arms. In the Sexual Bother question, 'sexual function' was translated as 'sexual life'. Therefore, responses may relate not only to sexual function but also to other sexual problems.
The major strength of our study was that the material was collected within a randomized screening trial, which maximizes the comparability of the groups and can minimize confounding and selection bias. To our knowledge, this is the first study within a PSA screening trial to evaluate HRQOL in a longitudinal prospective design including a baseline assessment, with an exceptionally long follow-up and both PCa-specific and generic HRQOL measures. Baseline measurement is necessary in longitudinal studies to evaluate changes in HRQOL over time and to distinguish new impairments from those already present at baseline. Our results and conclusions were independent of the chosen statistical methods. Finally, we were able to obtain comprehensive data on background sociodemographic characteristics from national registers.
In conclusion, our long-term evaluation of disease-specific and generic HRQOL did not reveal any large or systematic differences in men with PCa between the screening and control arms of the Finnish Randomized Study of Screening for Prostate Cancer.

CONFLICT OF INTERESTS
AA had financial support from Cancer Foundation Finland, AA and TLJT from Academy of Finland (grant #260931) and TLJT from Pirkanmaa Hospital District Competitive Research Funding (Grant No 9V065). Outside the submitted work: AA declares receiving a fee for expert consultation by Epid Research Inc. K.Taari declares receiving research funding from Medivation/Astellas/Pfizer, Orion and Myovant. TLJT declares receiving grants from Sigrid Juselius Foundation, personal fees from Bayer AG, Janssen-Cilag and Astellas. PK declares receiving travel costs to a meeting in Finland from Company Amgen. All other authors have no conflicts to disclose.

AUTHOR CONTRIBUTIONS
AA and TLJT are principal investigators, who conceived and designed the original proposal for the study and obtained trial funding. K.Talala, SH and AA contributed to the planning and reporting of the analyses. K.Talala processed the data, conducted the statistical analyses and prepared the first draft of the manuscript. All authors contributed to the interpretation of data, commented on the contents, revised the manuscript and approved the final version for submission.

ETHICAL STATEMENT
The Ethics Committee of the Pirkanmaa Hospital District evaluated the study protocol (tracking number R10167). The National Institute for Health and Welfare (Dnro THL/1601/5.05.00/2015) and Statistics Finland (TK-53-1330-18) have granted research permission.

DATA AVAILABILITY STATEMENT
No additional data available.