Differences in survival of prostate cancer Gleason 8–10 disease and the establishment of a new Gleason survival grading system

Abstract Background Although the 2014 update of the Gleason grading system distinguishes Gleason 3 + 4 from 4 + 3, Gleason 8 and Gleason 9–10 are each still classified as a single grade. Methods A total of 261,125 patients diagnosed with prostate cancer (PCa) between 2005 and 2015 were selected from the Surveillance, Epidemiology, and End Results (SEER) database. We used propensity score matching to balance clinical variables and then compared overall survival (OS) and cancer‐specific survival (CSS) between Gleason score subgroups. We further established a new Gleason survival grading system based on the hazard ratio (HR) values of each Gleason subgroup. Cox proportional hazards models and Kaplan–Meier curves were used to compare patient survival. Results Among PCa patients with Gleason score 8 disease, patients with Gleason 5 + 3 had significantly worse OS and CSS than those with Gleason 3 + 5 (OS: HR = 1.26, p = 0.042; CSS: HR = 1.42, p = 0.005) and 4 + 4 (HR = 1.50 for OS and HR = 1.69 for CSS, p < 0.001 for all). PCa patients with Gleason 5 + 3 and Gleason 4 + 5 may have similar OS and CSS (reference Gleason score <=6; 5 + 3: OS HR = 2.44, CSS HR = 7.63; 4 + 5: OS HR = 2.40, CSS HR = 8.92; p < 0.001 for all). The new Gleason survival grading system reclassified grades 4 and 5 of the 2014 Gleason grading system into three hierarchical grades, which makes the classification more detailed and accurate. Conclusion PCa patients with Gleason 8–10 may fall into three different survival subgroups: Gleason 3 + 5 and 4 + 4, Gleason 5 + 3 and 4 + 5, and Gleason 5 + 4 and 5 + 5. Our results refine risk stratification for PCa patients, provide guidance for clinicians in assessing survival and planning clinical management, and offer a recommendation for the next update of the Gleason grading system.


| INTRODUCTION
Prostate cancer (PCa) is the second most common malignancy in men worldwide and the most common malignancy in American men, accounting for 28% of all malignancies. 1,2 Clinically, PCa patients are divided into low-, medium-, and high-risk groups according to their combined prostate-specific antigen (PSA) values, tumor stage, and Gleason scores, which can be used to make a preliminary prediction of prognosis and to guide treatment decisions. 3 The Gleason scoring system, the pathological grading system for prostate adenocarcinoma, was introduced in 1974 and has become an important predictor of the survival of PCa patients. 4,5 Under the Gleason scoring system, PCa tissue is assigned a primary and a secondary grade, each scored from 1 to 5, with a higher score representing a higher grade of malignancy. However, not all 25 possible combinations occur in clinical practice. PCa biopsies with a Gleason sum of less than 4 (e.g., 1 + 1, 1 + 2, 2 + 1) are almost nonexistent, and differences of more than 2 between the primary and secondary grades (e.g., 1 + 4, 1 + 5, 4 + 1) are also infrequent. 6 The Gleason grading system was first codified by the International Society of Urological Pathology (ISUP) in 2005, but it used only the total score to assess the survival of PCa patients. 6 Many subsequent studies found that PCa patients with Gleason 4 + 3 had a significantly worse prognosis than those with Gleason 3 + 4. 7-9 In 2014, the ISUP updated the Gleason grading system and clarified the clinically significant difference between Gleason score 3 + 4 and 4 + 3. 10 The 2014 ISUP system arranges Gleason scores into five grades: grade 1, Gleason 2-6; grade 2, Gleason 7 (3 + 4); grade 3, Gleason 7 (4 + 3); grade 4, Gleason 8; and grade 5, Gleason 9-10. 10 However, the 2014 ISUP Gleason grading system is heterogeneous at grades 4 and 5: Gleason score 8 (grade 4) can be divided into three categories (4 + 4, 3 + 5, and 5 + 3), Gleason score 9-10 (grade 5) can also be divided into three categories (4 + 5, 5 + 4, and 5 + 5), and patients within the same grade are assumed to have a similar prognosis and receive similar treatment. 10 Is there a significant difference in survival between these subgroups within the same Gleason grade, just like the survival difference between Gleason 3 + 4 and 4 + 3?
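The 2014 ISUP grading described above can be written as a small lookup, which makes the heterogeneity of grades 4 and 5 explicit. The sketch below is illustrative only: the function name `isup_2014_grade` is hypothetical, and Gleason 7 combinations that the text does not assign to a grade (for example, 2 + 5) are rejected rather than guessed.

```python
from collections import defaultdict


def isup_2014_grade(primary: int, secondary: int) -> int:
    """Map a Gleason (primary, secondary) pair to the 2014 ISUP grade (1-5)."""
    for name, value in (("primary", primary), ("secondary", secondary)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} pattern must be between 1 and 5, got {value}")
    total = primary + secondary
    if total <= 6:
        return 1
    if total == 7:
        # Only 3 + 4 and 4 + 3 are assigned a grade in the text.
        if (primary, secondary) == (3, 4):
            return 2
        if (primary, secondary) == (4, 3):
            return 3
        raise ValueError(f"Gleason {primary} + {secondary} is not covered by this sketch")
    if total == 8:
        return 4
    return 5  # Gleason 9-10


# Group the combinations analyzed in this study by their 2014 ISUP grade.
studied = [(3, 3), (3, 4), (4, 3), (3, 5), (4, 4), (5, 3), (4, 5), (5, 4), (5, 5)]
by_grade = defaultdict(list)
for p, s in studied:
    by_grade[isup_2014_grade(p, s)].append(f"{p} + {s}")

for grade in sorted(by_grade):
    print(grade, by_grade[grade])
```

Grades 4 and 5 each collect three distinct primary and secondary combinations, which is the heterogeneity this study sets out to examine.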
To maximize the risk stratification of PCa patients and to enable them to obtain personalized survival prediction and treatment, we conducted a large-scale statistical study to analyze the effects of different Gleason scores on overall survival (OS) and cancer-specific survival (CSS) in patients with PCa.

| Data acquisition
The patient data analyzed in this study were acquired from the Surveillance, Epidemiology, and End Results (SEER) database [SEER 18 Regs Custom Data (with additional treatment fields), November 2018 submission, version 8.3.5]. The SEER database contains 18 cancer registries that cover about 28% of the United States population. 11 Patient information provided by the SEER database has greatly facilitated clinical cancer research.

| Study population
Patients diagnosed with PCa between 2005 and 2015 were selected from the SEER database. None of the PCa patients had a history of other cancers. We included only patients with pathologically diagnosed prostate adenocarcinoma. Available patient information, including race, age, marital status, TNM stage, PSA value, Gleason score (primary score and secondary score), radiotherapy, surgery, and chemotherapy, was collected. Patients missing any of this information were excluded. PCa patients with rare Gleason score combinations, such as Gleason 2 + 4, 3 + 1, and 4 + 2, were not included in our study.

| Statistical analysis
To analyze the effect of Gleason score on survival in PCa patients as accurately as possible, we used propensity score matching (PSM) to balance clinical variables and minimize statistical bias, and then compared patient survival. Race, age, marital status, TNM stage, PSA value, radiotherapy, surgery, and chemotherapy were matched between PCa patients with a Gleason score less than or equal to 6 and those with other Gleason score values. After PSM adjustment, we compared the OS and CSS for each patient characteristic subgroup between PCa patients with a Gleason score less than or equal to 6 and those with other Gleason score values. We also made pairwise comparisons of overall death risk and cancer-specific death risk between subgroups of PCa patients with Gleason 7, 8, and 9-10 after PSM adjustment. Cox proportional hazards models were used to compare the overall and cancer-specific death risk of patients. We further established a new Gleason survival grading system for PCa based on the hazard ratio (HR) values of each Gleason subgroup after PSM adjustment. We used Kaplan-Meier curves to compare the OS and CSS of PCa patients divided into grades by the new Gleason survival grading system and by the 2014 ISUP Gleason grading system.
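A minimal sketch of the matching step, assuming that a propensity score has already been estimated for each patient (for example, by logistic regression of subgroup membership on the listed covariates). The source does not report the matching algorithm, ratio, or caliper, so the greedy 1:1 nearest-neighbor rule, the `caliper` value, and the function name `greedy_match` are all assumptions made for illustration.

```python
def greedy_match(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on propensity score, without replacement.

    treated, control: dicts mapping patient id -> propensity score.
    caliper: maximum allowed score difference for a pair (assumed value).
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # controls not yet used
    pairs = []
    # Process treated patients in order of increasing score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is matched at most once
    return pairs


# Hypothetical scores: "treated" = a higher Gleason subgroup, "control" = Gleason <= 6.
treated = {"t1": 0.30, "t2": 0.62, "t3": 0.95}
control = {"c1": 0.28, "c2": 0.33, "c3": 0.60, "c4": 0.10}
pairs = greedy_match(treated, control)
print(pairs)
```

In this toy input, `t3` has no control within the caliper and is left unmatched; survival comparisons would then be run on the matched pairs only.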
Overall death and cancer-specific death were the primary endpoints of this study. In the OS analysis, patients who were alive were treated as censored. In the CSS analysis, patients who were alive and patients who died of causes other than the cancer were treated as censored. All statistical analyses were performed in R software (version 3.6.2), and two-sided p values less than 0.05 were considered statistically significant.
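The censoring rules above differ between the two endpoints, and a short sketch can make that concrete. The code below derives event indicators for OS and CSS and computes a plain Kaplan-Meier estimate in pure Python; the status labels, function names, and toy follow-up data are hypothetical and are not taken from the SEER extract or from the authors' R code.

```python
from collections import Counter


def event_indicator(status: str, endpoint: str) -> int:
    """Return 1 if the patient had the endpoint event, 0 if censored.

    status: "alive", "cancer_death", or "other_death" (assumed labels).
    endpoint: "OS" counts any death; "CSS" counts cancer deaths only.
    """
    if endpoint == "OS":
        return int(status != "alive")
    if endpoint == "CSS":
        return int(status == "cancer_death")
    raise ValueError(f"unknown endpoint: {endpoint}")


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event."""
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # events plus censorings at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]  # drop everyone who left at time t
    return curve


# Toy follow-up data (months) for five hypothetical patients.
months = [6, 12, 12, 20, 30]
status = ["cancer_death", "alive", "other_death", "cancer_death", "alive"]

os_curve = kaplan_meier(months, [event_indicator(s, "OS") for s in status])
css_curve = kaplan_meier(months, [event_indicator(s, "CSS") for s in status])
print("OS:", os_curve)
print("CSS:", css_curve)
```

In the toy data, the non-cancer death at 12 months lowers the OS curve but is censored for CSS, so the OS estimate ends near 0.3 while the CSS estimate ends near 0.4.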

| Baseline patient characteristics
A total of 261,125 patients diagnosed with PCa between 2005 and 2015 were included in this study. The Gleason score subgroups were less than or equal to 6 (2 + 3, 3 + 2, 3 + 3), 3 + 4, 3 + 5, 4 + 3, 4 + 4, 4 + 5, 5 + 3, 5 + 4, and 5 + 5. Baseline characteristics of each Gleason subgroup, including race, age, marital status, TNM stage, PSA value, radiotherapy, surgery, and chemotherapy, are listed in Table 1. The distribution of PCa invasion factors, TNM stage, and PSA values is shown in Table 2. Notably, PCa patients with higher Gleason scores, T stages, and PSA values were more likely to have regional lymph node and distant metastasis (Table 2).

| Association between Gleason score and survival
Kaplan-Meier curves were used for a preliminary comparison of the OS and CSS of PCa patients with different Gleason score values (Figure 1). We then used PSM to balance clinical variables and minimize statistical bias, and compared the OS and CSS between PCa patients with a Gleason score less than or equal to 6 and the other Gleason subgroups. The comparisons of overall death risk and cancer-specific death risk between Gleason subgroups are listed in Table 3.
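For readers less familiar with the hazard ratios compared between subgroups, the standard Cox proportional hazards model can be written as follows. The symbols $h_0$, $x$, and $\beta$ are generic textbook notation introduced here for illustration, with each Gleason subgroup assumed to enter the model as an indicator variable; they are not taken from the original analysis.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right),
\qquad
\mathrm{HR}_{\text{subgroup vs. reference}}
  = \frac{h(t \mid x_{\text{subgroup}})}{h(t \mid x_{\text{reference}})}
  = \exp\!\left(\beta_{\text{subgroup}}\right)
```

Under this model, the reported OS hazard ratio of 1.26 for Gleason 5 + 3 versus 3 + 5 corresponds to a 26% higher instantaneous risk of death at any given time, assuming the hazards remain proportional over follow-up.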

| Establishment of new Gleason survival grading system
According to the HRs of the subgroups shown in Tables 3 and 6, we established a new Gleason survival grading system to maximize the risk stratification of PCa patients. The new Gleason survival grading system is presented in Table 7. Kaplan-Meier curves were used to compare the OS and CSS of patients divided into grades by the new Gleason survival grading system and by the 2014 ISUP Gleason grading system (Figure 2), and we found that the survival of patients in different grades was separated more distinctly and accurately when our Gleason grading system was used (Table 7).
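As a rough illustration of the regrouping, the sketch below encodes the three Gleason 8-10 survival subgroups reported in this study as a lookup table. The grade numbers 1 to 6 are an assumed labeling: the text states only that grades 4 and 5 of the 2014 system are split into three hierarchical grades, so grades 1 to 3 are assumed unchanged and the name `new_survival_grade` is hypothetical.

```python
# Assumed numbering: grades 1-3 unchanged from ISUP 2014; Gleason 8-10
# split into three grades following the survival subgroups in the text.
NEW_SURVIVAL_GRADE = {
    (3, 4): 2,
    (4, 3): 3,
    (3, 5): 4, (4, 4): 4,  # best survival among Gleason 8-10
    (5, 3): 5, (4, 5): 5,  # intermediate survival
    (5, 4): 6, (5, 5): 6,  # worst survival
}


def new_survival_grade(primary: int, secondary: int) -> int:
    """Map a Gleason (primary, secondary) pair to the assumed new grade (1-6)."""
    for name, value in (("primary", primary), ("secondary", secondary)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} pattern must be between 1 and 5, got {value}")
    if primary + secondary <= 6:
        return 1
    try:
        return NEW_SURVIVAL_GRADE[(primary, secondary)]
    except KeyError:
        raise ValueError(
            f"Gleason {primary} + {secondary} was not analyzed in this study"
        ) from None


for pair in [(4, 4), (5, 3), (4, 5), (5, 5)]:
    print(pair, "->", new_survival_grade(*pair))
```

Under this labeling, Gleason 5 + 3 moves out of the group shared with 3 + 5 and 4 + 4 and joins 4 + 5, while 5 + 4 and 5 + 5 form the highest grade.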

| DISCUSSION
In 2014, the ISUP released the latest Gleason grading system, following the 2005 version. The modified system groups patients according to the histological characteristics of the Gleason score and the prognosis of PCa patients, and it improved significantly on the previous Gleason scoring system in three ways: 1) it provides a more accurate stratification based on the pathology of PCa; 2) it simplifies the division of Gleason scores; and 3) it groups patients according to survival. 10,12 However, the 2014 Gleason grading system still treats Gleason 8 and Gleason 9-10 each as a single grade. Therefore, the purpose of this study was to examine in detail the effects of Gleason score values on the OS and CSS of PCa patients. PCa with a Gleason score less than or equal to 6 has been considered indolent and has a high cure rate, even in cases with extra-prostatic invasion and positive margins. 12,13 Therefore, PCa with a Gleason score less than or equal to 6 was set as the matched reference group in this study. After matching the clinical information of PCa patients, we found that among the three subgroups with a Gleason score of 8, the risks of overall death and cancer-specific death in patients with Gleason 5 + 3 were significantly higher than in those with Gleason 3 + 5 and 4 + 4, and among the three subgroups with a Gleason score of 9-10, patients with Gleason 4 + 5 had better OS and CSS than the Gleason 5 + 4 and 5 + 5 groups. The difference in survival between Gleason 5 + 3 and Gleason 4 + 5 was not statistically significant, and the two groups may have similar OS and CSS (reference Gleason score <=6; 5 + 3: OS HR = 2.44, CSS HR = 7.63; 4 + 5: OS HR = 2.40, CSS HR = 8.92; p < 0.001 for all). Currently, Gleason scores of 8-10 are classified as a single high-risk clinical group, and all such patients are treated according to the same clinical guidelines. 
3 Clinicians generally recognize that patients with Gleason 9-10 have a worse prognosis than those with a Gleason score of 8, and the difference in prognosis may be one of the factors guiding treatment decisions. 14 For example, the National Comprehensive Cancer Network (NCCN) guidelines recommend that PCa patients with Gleason 8-10 receive 2 to 3 years of androgen deprivation therapy before radiation, and some clinicians may be inclined to give 2 years for Gleason 8 and 3 years for Gleason 9-10. 16 However, our study showed that PCa patients with Gleason 8-10 had significant differences in OS and CSS between subgroups and may be divided into three different prognostic subgroups: Gleason 3 + 5 and 4 + 4, Gleason 5 + 3 and 4 + 5, and Gleason 5 + 4 and 5 + 5. Patients in different prognostic groups may require more personalized treatment. We found that PCa patients with Gleason 3 + 5 had worse OS than those with Gleason 4 + 4 (HR = 1.13, 95% CI = 1.02-1.25, p = 0.016), while the difference in CSS was not significant (HR = 1.10, p = 0.315). Previous studies have shown that PCa patients with Gleason pattern 5 had poorer survival and a higher biochemical recurrence rate than patients without Gleason pattern 5. 17,18 However, due to the influence of primary and secondary pathology, the survival difference between Gleason 4 + 4 and 3 + 5 patients may not be very pronounced. 19 Another study reported that PCa patients with Gleason 5 + 3 had a significantly worse CSS than those with Gleason 4 + 4 or 3 + 5, but a CSS similar to that of Gleason 9 disease, which was basically consistent with our results. 20 The results of our study might have guiding significance for the development of treatment and follow-up strategies for PCa patients. We further developed a new Gleason survival grading system based on the survival of PCa patients in each Gleason subgroup. The proposed Gleason survival grading system reclassified grades 4 and 5 of the 2014 ISUP Gleason grading system into three hierarchical grades, which makes the Gleason grading system more detailed and accurate. Interestingly, we found that PCa patients with higher Gleason scores had higher T stages and PSA values, and were more likely to have regional lymph node and distant metastasis. Although PSA is often inaccurate in clinical diagnosis, it is still an important factor affecting the survival of PCa patients and a driving factor for almost all initial disease treatment and management decisions. 3,21 Studies have shown that PCa patients with higher PSA and Gleason scores had a higher risk of developing lymph node and distant metastases. 22,23 Therefore, PCa patients with higher Gleason scores may require more aggressive treatment and closer follow-up.
Table notes: CI, confidence interval; HR, hazard ratio. All comparisons were performed after propensity score matching adjustment. ※ indicates a p value less than 0.001. Because few patients underwent chemotherapy, the chemotherapy analysis was not performed.
Because PCa patients with a Gleason score of 7 are more common than those with a Gleason score of 8-10 (41.0% vs. 15.7% in this study), most studies have focused on the difference in survival between Gleason 3 + 4 and 4 + 3 disease, 7,24,25 and data on Gleason 8-10 disease have rarely been collected to study subgroup survival differences. This study is the first to examine the survival differences of Gleason 8-10 subgroups in detail based on a large sample size and long-term follow-up data of PCa patients, and we hope that our results can serve as a proposal for the next Gleason score update.
Our study has important clinical significance for guiding and revising clinical nomograms, follow-up schedules, and the treatment of high-risk PCa, but it has some potential limitations. First, the time to biochemical recurrence and the PSA growth rate after treatment are important factors affecting the survival of PCa patients, 26 and the absence of this information in the SEER database may affect our results. Second, Gleason grades are defined by both pathological morphology and patient survival, whereas the proposed new Gleason survival grading system is based entirely on the survival of PCa patients. Lastly, our study was retrospective in nature, so more prospective studies are required to verify our results.

| CONCLUSIONS
In conclusion, based on a large sample size and long-term patient data, we used a statistical method that minimizes between-group bias to study the effect of Gleason score on the OS and CSS of PCa patients and proposed a new Gleason survival grading system for the first time. Our results provide valid epidemiological evidence that PCa patients with Gleason 8-10 may fall into three different survival groups: Gleason 3 + 5 and 4 + 4, Gleason 5 + 3 and 4 + 5, and Gleason 5 + 4 and 5 + 5. Our results may provide guidance for clinicians in evaluating survival and clinical management for PCa patients, and offer a suggestion for the next ISUP update of the Gleason grading system.

CONFLICT OF INTEREST
The authors have no conflict of interest.

DATA AVAILABILITY STATEMENT
All patient information in the present study was acquired from the SEER database for research purposes.

ETHICS STATEMENT
All patient information in the present study was acquired from the SEER database for research purposes. Thus, no human or animal participants were involved in this study.