Latent heterogeneity of muscle‐invasive bladder cancer in patient characteristics and survival: A population‐based nation‐wide study in the Bladder Cancer Data Base Sweden (BladderBaSe)

Abstract
Background: Patients with muscle‐invasive bladder cancer (MIBC) constitute a heterogeneous group in terms of patient and tumour characteristics (‘case‐mix’) and prognosis. The aim of the current study was to investigate whether differences in survival could be used to separate MIBC patients into distinct classes using a recently developed latent class regression method for survival analysis with competing risks.
Methods: We selected all participants diagnosed with MIBC in the Bladder Cancer Data Base Sweden (BladderBaSe) and analysed inter‐patient heterogeneity in risk of death from bladder cancer and other causes.
Results: Using data from 9653 MIBC patients, we detected heterogeneity with six distinct latent classes in the studied population. The largest and most frail class included 50% of the study population and was characterised by a somewhat larger proportion of women, higher age at diagnosis, more advanced disease and lower probability of curative treatment. Despite this, treatment with curative intent by radical cystectomy or radiotherapy was associated with a lower risk of death in this class. The second largest class included 23% of the population and was substantially less frail than the largest class. The third and fourth classes each included around 9%–10%, whereas the fifth and sixth classes each included 3%–4% of the population.
Conclusions: Results from the current study are compatible with previous research, and the method can be used to adjust comparisons in prognosis between MIBC populations for influential differences in the distribution of sub‐classes.


| INTRODUCTION
Although patients with muscle-invasive bladder cancer (MIBC) constitute a very heterogeneous group in terms of tumour characteristics, age at diagnosis, gender, comorbidities and socioeconomic background factors, [1][2][3] very few studies have investigated whether these differences can be used to separate MIBC patients into distinct classes with possible differences in prognosis. 4,5 Characterisation of such classes could help individualise treatment and avoid both under- and over-treatment, provided the classes are associated with response to treatment. Characterising the clinical implications of classes can also help explain how MIBC cohorts develop over time: for example, a class with longer survival will come to dominate the cohort once a class with poor prognosis has been exhausted. Furthermore, separation into classes can increase our understanding of the differences in case-mix of MIBC patients that influence comparisons in observational studies of performance or survival between countries, regions, or hospitals.
One way to assess heterogeneity is through the concept of frailty, which could further enhance prognostication and identify more vulnerable patients. For prognosis of MIBC, different clinical scores assessing frailty based on known risk factors have already been proposed. 6,7 In the current study, we used a recently developed method to investigate frailty based on factors that are not present in the data as covariates, that is, unmeasured and hitherto unknown covariates associated with survival, while also accounting for the risk of competing causes of death. To our knowledge, this has never been investigated in MIBC patients.
The primary research question of the current study was whether we could identify latent classes of MIBC patients by using data from a nation-wide bladder cancer database in Sweden and applying a novel statistical approach to latent class proportional hazards models and, if so, to compute their class-specific characteristics, frailty and associations with risk of death from bladder cancer and from other causes. A secondary aim was to investigate whether the distribution of classes differed geographically.

| Study population
We selected all participants diagnosed with MIBC (defined as T-stage ≥2) in the Bladder Cancer Data Base Sweden (BladderBaSe). 8 The BladderBaSe was started in 2014 by merging the Swedish National Registry for Urinary Bladder Cancer (SNRUBC) with several national health care and sociodemographic registers. Details of the multiregister linkage and the variables included can be found elsewhere. 8 In short, we merged clinical data on all individuals with primary bladder cancer in the SNRUBC with data on socioeconomic factors, comorbidity, other clinical and cancer diagnoses, and cause and date of death. The project was authorised by the Research Ethics Board at Uppsala University, Sweden (EPN Dnr 2015/277).

| Definition of covariates
For this investigation we retrieved data from the SNRUBC on T, M and N stage and grade at diagnosis, type of treatment (no curative treatment, or curative treatment by radical cystectomy or radiotherapy), diagnosing hospital size (university, region or district hospital) and healthcare region (six in Sweden). From the National Patient Register, we used data from hospital admissions up to 10 years prior to the date of bladder cancer diagnosis. These data on discharge diagnoses were used to compute the Charlson Comorbidity Index (CCI), and the CCI was categorised into four groups: no comorbidities, mild, moderate, or severe comorbidity (0, 1, 2 and ≥3 comorbidities, respectively). 9,10 From the Longitudinal Integration Database for Health Insurance and Labor Market Studies, we retrieved data on educational level and categorised this into three groups: ≤9 years, 10-12 years and ≥13 years, which corresponded to low, intermediate and high education level. 11
The following covariates were used in all analyses: type of treatment, gender (male or female), age at diagnosis (continuous), calendar year at diagnosis (continuous), education level (in the categories described above), CCI (in the categories described above), hospital size, T stage at diagnosis (T2, T3 or T4), M stage at diagnosis (no distant metastases [M0], distant metastases [M1], or not measured [MX]), N stage at diagnosis (no regional lymph nodes with cancer [N0], cancer in at least one of the regional lymph nodes [N+] or not measured [NX]) and grade at diagnosis (G1, G2 or G3, according to the World Health Organization 1973 classification from 1997 to 2003 and according to World Health Organization 1999 from 2004 onwards).
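The two categorisations described above can be sketched in a few lines of Python. This is an illustration only, not the code used in the study; the function names and the integer inputs (a CCI score and completed years of education) are assumptions.

```python
def categorise_cci(score: int) -> str:
    """Map a Charlson Comorbidity Index score to the four groups in the text.

    Assumption: the score is a non-negative integer (0, 1, 2 or >=3).
    """
    if score < 0:
        raise ValueError("CCI score cannot be negative")
    if score == 0:
        return "no comorbidities"
    if score == 1:
        return "mild"
    if score == 2:
        return "moderate"
    return "severe"  # score of 3 or more


def categorise_education(years: int) -> str:
    """Map years of education to low (<=9), intermediate (10-12) or high (>=13)."""
    if years < 0:
        raise ValueError("years of education cannot be negative")
    if years <= 9:
        return "low"
    if years <= 12:
        return "intermediate"
    return "high"


if __name__ == "__main__":
    print(categorise_cci(2), categorise_education(11))
```

Keeping the cut-offs in one place like this makes it easy to check that every patient falls into exactly one category.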

| Definition of follow-up and endpoints
The start date of the study was defined as the date of diagnosis of MIBC, and the last date of the study was the date of death or December 31, 2014, whichever occurred first. The timescale for follow-up was years from diagnosis. Date and cause of death were acquired from the Cause of Death Register.
Death from bladder cancer was defined as code C67 (International Classification of Diseases, 10th revision). Bladder cancer death and death from other causes were used as endpoints in the analysis.
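As a minimal sketch of these definitions, the function below derives follow-up time and endpoint for one patient. It is not the study's code: the function name, the prefix match on C67 (so that subcodes such as C67.9 count), and the conversion of days to years by dividing by 365.25 are all assumptions made for illustration.

```python
from datetime import date
from typing import Optional, Tuple

STUDY_END = date(2014, 12, 31)   # last date of the study, as stated in the text
BLADDER_CANCER_ICD10 = "C67"     # ICD-10 code for bladder cancer


def follow_up(diagnosis: date,
              death: Optional[date],
              cause_icd10: Optional[str]) -> Tuple[float, str]:
    """Return (years from diagnosis, endpoint) for one patient.

    Assumptions: deaths after STUDY_END are treated as 'alive at end of study',
    and any cause code starting with 'C67' counts as bladder cancer death.
    """
    if death is not None and death <= STUDY_END:
        end = death
        if cause_icd10 is not None and cause_icd10.upper().startswith(BLADDER_CANCER_ICD10):
            endpoint = "bladder cancer death"
        else:
            endpoint = "death from other causes"
    else:
        end = STUDY_END
        endpoint = "alive at end of study"
    years = (end - diagnosis).days / 365.25  # assumed day-to-year conversion
    return years, endpoint


if __name__ == "__main__":
    print(follow_up(date(2010, 1, 1), date(2011, 1, 1), "C67.9"))
```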

| Statistical analysis
To investigate inter-patient heterogeneity taking competing risks into account, we used latent class proportional hazards models. Detailed information on the latent class proportional hazards models can be found elsewhere. 12 In short, latent class proportional hazards analysis was performed in two steps. The first step was to perform regression for models having a range of latent classes with different degrees of complexity. The second step was to use the Akaike information criterion to select the model best supported by the evidence in the studied survival data, that is, the model having the optimal number of sub-classes and complexity. In contrast to models in conventional survival analysis, the model does not assume homogeneous associations between covariates and endpoints, and it accounts simultaneously for the risks of both endpoints. The model builds on the assumption that the study population may be composed of several latent classes, and that differences in risks of the endpoints between classes are caused by latent heterogeneity, that is, heterogeneity not captured by the studied covariates. The latent heterogeneity between the classes can be quantified by relative frailty measures, that is, variability in base hazard rates and associations. For each latent class, proportional hazards are assumed for the associations of covariates with each endpoint in the study. Hence, the assumption of proportional hazards is not required to be valid for the full study population. If no latent classes are found in the population analysed, the method results in a model similar to the Cox proportional hazards model.
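The second step, model selection by the Akaike information criterion, can be illustrated as follows. This sketch assumes the standard definition AIC = 2k - 2 ln L (k parameters, maximised likelihood L); whether the software used in the study applies exactly this form is not stated in the text. The log-likelihoods and parameter counts in the example are made up for illustration.

```python
from typing import Dict, Hashable, Tuple


def aic(log_likelihood: float, n_parameters: int) -> float:
    """Standard Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_parameters - 2 * log_likelihood


def select_by_aic(candidates: Dict[Hashable, Tuple[float, int]]) -> Hashable:
    """Return the label of the candidate model with the lowest AIC.

    candidates maps a label (e.g. number of latent classes) to
    (maximised log-likelihood, number of free parameters).
    """
    return min(candidates, key=lambda label: aic(*candidates[label]))


if __name__ == "__main__":
    # Hypothetical fits for models with 1, 2 and 3 latent classes.
    fits = {1: (-1050.0, 10), 2: (-1000.0, 22), 3: (-995.0, 34)}
    print(select_by_aic(fits))
```

In this toy example the three-class model has the highest likelihood but is penalised for its extra parameters, which is the trade-off between fit and complexity that the selection step is meant to capture.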
We assessed differences in class membership, frailty and associations based on the estimates of the latent class proportional hazards analysis. Frailty was defined as the heterogeneity resulting from risk factors that were not explicitly present in the survival data as covariates, that is, unmeasured and hitherto unknown covariates associated with the risk of any of the endpoints. Associations of covariates and latent class membership with the risk of each endpoint were investigated by calculating hazard ratios (HRs) from the latent class model for competing risk analysis. Relative frailties between classes detected by the model were assessed by the HR for latent class membership. The analysis was carried out using the software package SaddlePoint-Mosaics version 1.1.0 (https://www.saddlepointscience.com/), which implements the method as presented previously in the literature. 12 The estimated latent classes were tabulated versus health care region to investigate differences between the six Swedish health care regions (North, Mid, Stockholm/Gotland [Sthlm], West, Southeast and South).
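The tabulation of latent classes against health care regions amounts to a simple cross-tabulation. The sketch below uses only the Python standard library; the function names and the toy input are hypothetical and are not taken from the study data.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence, Tuple


def crosstab(classes: Sequence[Hashable],
             regions: Sequence[Hashable]) -> Counter:
    """Count patients per (region, class) pair."""
    if len(classes) != len(regions):
        raise ValueError("classes and regions must have the same length")
    return Counter(zip(regions, classes))


def row_proportions(table: Counter) -> Dict[Tuple[Hashable, Hashable], float]:
    """Convert counts to the proportion of each class within its region."""
    totals: Counter = Counter()
    for (region, _cls), n in table.items():
        totals[region] += n
    return {(region, cls): n / totals[region]
            for (region, cls), n in table.items()}


if __name__ == "__main__":
    # Hypothetical assignments for four patients.
    table = crosstab([5, 5, 6, 3], ["North", "North", "South", "South"])
    print(row_proportions(table))
```

Comparing within-region proportions rather than raw counts avoids confusing differences in region size with differences in class distribution.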

| RESULTS
The study population included 9653 patients diagnosed with MIBC, 71% men and 29% women (Table 1). Mean age at start of study was 75 years (SD = 11), and during follow-up, 5235 (54%) patients died of bladder cancer and 2235 (23%) died of other causes. Mean follow-up time in the cohort was 2.6 years (SD = 3.5), and median survival time (excluding patients alive at end of study period) was 0.8 years (IQR = 0.4-1.8).
The baseline characteristics and endpoints used in fitting the latent class proportional hazards models are shown in Table 1. The analysis resulted in a model with six latent classes as being best supported by the survival data in this study. We determined the characteristics, frailty and associations with the risks of the endpoints for the study participants according to the most probable latent class to which they were assigned.
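Assigning each participant to the most probable latent class can be sketched as picking the class with the highest membership probability. The text does not describe how the software reports these probabilities, so the dictionary input, the function name and the example values below are assumptions.

```python
from typing import Dict, Hashable


def most_probable_class(membership: Dict[Hashable, float]) -> Hashable:
    """Return the class label with the highest membership probability.

    Assumption: membership maps each class label to a probability,
    and the probabilities for one patient sum to 1.
    """
    if not membership:
        raise ValueError("no membership probabilities given")
    if abs(sum(membership.values()) - 1.0) > 1e-6:
        raise ValueError("membership probabilities must sum to 1")
    return max(membership, key=membership.get)


if __name__ == "__main__":
    # Hypothetical probabilities for one patient.
    print(most_probable_class({"Class 5": 0.7, "Class 6": 0.2, "Class 3": 0.1}))
```

A hard assignment of this kind discards the uncertainty in class membership, which is worth keeping in mind when reading class-level summaries.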
Class 5 had a somewhat larger proportion of women, of patients older than 75 years at diagnosis, of patients with more advanced bladder cancer (by T, M and N stage and grade at diagnosis), and of individuals not receiving curative treatment (Table 1). Class 5 included only 9.6% of the patients who were alive at end of study period, but 76.8% of all patients who died of bladder cancer and 25.3% of those who died of other causes during follow-up (Table 2). Class 5 had a median survival of 0.5 years (IQR: 0.3-1.0 years), excluding patients alive at end of follow-up. In Class 5, we found that radical cystectomy and radiotherapy were both associated with a decreased risk of death from bladder cancer and from other causes, as compared to no treatment (Table S1, p < 0.001). Furthermore, T3 and T4 (as compared to T2), M1 (as compared to M0) and N+ (as compared to N0) were all associated with increased risk of death from bladder cancer, while NX (as compared to N0) was associated with risk of death from other causes (Table S1, p < 0.001). As Class 5 was both the largest and the most frail class, we used that class as the reference group for calculations of frailty of all other classes for both endpoints studied.

TABLE 1 Baseline characteristics of covariates included in the analysis of the full study population (9653 patients with muscle-invasive bladder cancer), and of the six latent classes resulting from the competing risk latent class analysis.

Class 6, the second largest class, had a higher proportion of men, of patients younger than 75 years at diagnosis and of patients with very few comorbidities (Table 1). Furthermore, this class had a higher proportion of patients with higher education level, diagnosed at university hospitals, with less advanced bladder cancer (with respect to T, N and M stage, and grade) and treated with radical cystectomy. Class 6 included 64.2% of the patients who were alive at end of study period, 9.6% of the patients who died of bladder cancer and 15.8% of those who died of other causes (Table 2). Within Class 6, median survival time was 1.6 years (IQR: 0.9-3.5 years). Severe comorbidity (as compared to no comorbidities) and T3 and T4 (as compared to T2) were associated with increased risk of bladder cancer death; furthermore, higher age at diagnosis and M1 (as compared to M0) were associated with risk of death from other causes (Table S1, p < 0.001). Class 6 was less frail than Class 5, with HRs of 0.2 and 0.15 for the endpoints, which implies that patients in Class 6 had hazard rates for the endpoints of one fifth and almost one seventh of those in Class 5.
Class 3 and Class 4, the third and fourth largest classes, were substantially smaller and more difficult to characterise (Table 1). Both of these classes had a higher proportion of patients with comorbidities, who did not receive curative treatment, had low education level, and were diagnosed at district hospitals. The median survival time was 2.2 years (IQR: 1.2-4.5 years) for Class 3 and 1.6 years (IQR: 0.9-3.9 years) for Class 4 (Table 2). Both classes had similar proportions of patients who were alive at end of study period and who died from bladder cancer, but Class 3 had a slightly higher proportion of study participants dying from other causes. Patients in Class 4 treated with radical cystectomy (as compared to no curative treatment) and patients in Class 3 with moderate comorbidity (as compared to no comorbidities) had a lower risk of bladder cancer death (Table S1, p < 0.001). For patients in both classes, several factors were associated with risk of death from bladder cancer and other causes. The HRs for class membership for bladder cancer death were very similar in the two classes, 0.06 and 0.07, which implies that these classes had a hazard rate for bladder cancer death about 15 times lower than that of Class 5. The HR for class membership for other causes of death was 0.31 for Class 3 and 0.16 for Class 4, which gives hazard rates of approximately one third and one sixth, respectively, of those in Class 5.
The two smallest classes, Class 1 and Class 2, together included 7.5% of the studied population and were also more difficult to characterise (Table 1). These classes had a higher proportion of women and of patients with comorbidities, lower education level, higher T stages at diagnosis and no curative treatment. Class 1 had a median survival time of 0.8 years (IQR 0.2-2.1 years) and Class 2 of 2.3 years (IQR 1.2-5.2 years). These classes had HRs for class membership for the endpoints of 0.07-0.21, except for the HR for Class 1 for death from other causes, which was 0.81. This implies that the hazard rate of death from other causes in Class 1 was only approximately 1.25 times lower than in Class 5, that is, substantially closer to Class 5 than the estimates for all other classes.
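The fold differences quoted for the class-membership HRs follow from taking the reciprocal of each HR. The short sketch below applies that arithmetic to the HR values reported in the text; the function name is illustrative.

```python
def fold_lower(hazard_ratio: float) -> float:
    """Convert an HR relative to the reference class into 'times lower hazard'."""
    if hazard_ratio <= 0:
        raise ValueError("hazard ratio must be positive")
    return 1.0 / hazard_ratio


# Class-membership HRs quoted in the text, all relative to Class 5.
QUOTED_HRS = [0.2, 0.15, 0.06, 0.07, 0.31, 0.16, 0.81]

if __name__ == "__main__":
    for hr in QUOTED_HRS:
        print(f"HR {hr:.2f}: hazard about {fold_lower(hr):.1f} times lower than in Class 5")
```

An HR close to 1, such as 0.81, therefore corresponds to a hazard only marginally below that of the reference class.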
The regional distribution of the classes is shown in Table S2. No significant differences in class distribution were found between regions.

| Key findings
Using data from a clinical database merged with other national registers and a latent class proportional hazards model, we could identify six distinct latent classes of MIBC patients. The largest and most frail class corresponded to nearly half of the studied population and was characterised by a somewhat higher proportion of women, older age at diagnosis, more advanced bladder cancer and a higher proportion not receiving curative treatment. Although this class had the worst prognosis, curative treatment by radical cystectomy or radiotherapy was associated with decreased risk of death from both bladder cancer and other causes. The second largest class corresponded to nearly one fourth of the studied population and was characterised by a higher proportion of men, younger age at diagnosis, higher education level, diagnosis at a university hospital, fewer comorbidities, less advanced bladder cancer and a higher proportion treated with radical cystectomy. In both these classes, higher T, N and M stages and more comorbidities were associated with higher risk of death. In addition to these two major classes, two smaller classes of around 9%-10% each were found; these classes were more difficult to characterise, but both had a higher proportion of patients diagnosed at district hospitals, with low education level, with more comorbidities and not receiving curative treatment. Lastly, two even smaller classes of 3%-4% each were found, with a larger proportion of women and of patients diagnosed with higher T stages. These two smaller classes had better bladder cancer survival and overall survival than the largest class, but the smallest class had almost as high a risk of death from other causes as the largest and most frail class. There were no tangible differences in the class distribution over health care regions.

| Strengths and limitations
We used advanced modelling based on high-quality register data from an entire nation 8 to map inter-patient heterogeneity, accounting for competing risks in a systematic approach that avoids violation of underlying statistical assumptions. This new method included an estimation of frailty and detected six latent classes in the study population of patients diagnosed with MIBC, with class-specific risk factors for death from bladder cancer and other causes. We used information on the patients' out- and in-patient hospital care to measure comorbidity as a covariate; comorbidities diagnosed in primary care or other healthcare institutions were not captured. Also, the original definition of CCI = 0 includes a wide range of health statuses. Thus, the CCI = 0 category may include a group of patients with medical contraindications to major surgery and/or chemotherapy. We had no information on smoking status or body mass index, factors that have previously been associated with molecular subtype of MIBC. 4 However, identification of molecular subtypes was beyond the scope of this study, as we had no access to pathological or molecular details of the bladder tumours, 1 nor to genetic factors or epistasis, 13 although such data could have been used to further characterise the heterogeneity found in the study.

| Interpretation
The data included in the study are 'real-world data' and reflect characteristics of the entire bladder cancer population in Sweden, with a mixture of patients with widely different prognosis and health status, many of them only offered palliative treatment. Thus, this population differs from many institutional cohorts in which a majority were offered treatment with curative intent. The follow-up times in the current study are in line with data already published from the BladderBaSe 14,15 and from similar settings in other countries. 16
The classes detected in our study had distinctly different bladder cancer prognosis as well as different risks of dying from other causes. These results indicate that there is meaningful information in a routinely collected database that could potentially be used further in observational studies or in comparisons of performance between counties, regions, or hospitals, as the class characterisation may show differences in 'case-mix' that explain differences in survival. T stage at diagnosis was an influential prognostic marker in the four largest classes, and M and N stage in several of the classes, indicating that early detection is important for patients with MIBC. 17
Moreover, in a future perspective our results could give clues for clinical management or for further risk stratification in interventions or in more specific observational research. For example, the largest class had the worst prognosis but, on the other hand, seemed to respond well to treatment with curative intent. Interestingly, 59% of this class was still not offered such treatment. These numbers are in line with other studies reporting that treatment with curative intent is underused, particularly in the elderly and in patients with low socioeconomic status. 18 A large proportion of patients in this class had missing N or M stage. A proportion likely had M1 disease that was not recorded; in patients whose poor general health status or high age precludes any treatment except best supportive care, verification of N and M status is often omitted and detailed status therefore not reported. We cannot deduce from this study whether abstaining from radical treatment reflects rational treatment decisions related to the high frailty or whether it also, in some cases, represents missed opportunities. In a wider perspective, the study is also a test of a method to find clinically relevant subclasses in cancer patients in general.
A further example of a research question that arises from the behaviour of the subclasses is the seemingly counterintuitive result for patients in Class 2. To interpret their outcome, one needs to consider that our model allocates individuals to classes accounting for both known and unknown covariates. Based on their known covariates, we would expect patients in Class 2 to have poorer outcomes. However, they still seem to be robust, with a longer survival time; hence, this must be associated with one or more unknown factors not captured by our data. We can only speculate whether this may be a factor related (i) to their age at diagnosis, that is, the sub-population in Class 2 older than 75 years may be more robust than the group below 75 years simply because they have not died already, owing to unknown factors associated with longevity that could be related to genes, lifestyle or other factors; (ii) to a factor that may be a contraindication to curative treatment but is associated with longevity; or (iii) to another molecular, clinical, or biological factor that is associated with a more slowly growing tumour and thus a better prognosis. One may argue that a small sub-class should be integrated with a larger one. However, that may hinder further investigation of an interesting sub-class and may dilute the characterisation of the larger class.
We did not intend to study the influence of molecular bladder cancer subtypes. However, our findings lend themselves to generating hypotheses about the association and/or interaction between clinical data of the kind we used and molecular subtype/class. Interestingly, a consensus article recently agreed on six molecular subtypes based on transcriptomic data from 1750 MIBC patients. 19 That study also reported that the largest class (basal/squamous, 35% of the studied population) was more common among women and had a median survival time of 1.2 years. The second largest class was the luminal papillary subtype, comprising 24% of all patients. Associations between gender and molecular subtypes have also been reported in later studies, 5 with higher rates of the basal/squamous subtype in women than in men, whereas men had higher proportions of the luminal subtypes. Other reports have shown that patients with basal-like tumours, compared to those with luminal subtypes, were more likely to be older at diagnosis, to be obese and to have started smoking at a younger age (n = 372). 4 Hence, the results of our study are compatible with previous research with respect to number of classes, gender, age and survival pattern for the largest class. Larger studies investigating heterogeneity in the characteristics of patients with MIBC at diagnosis are sparse, as are studies of heterogeneity in progression and response to treatment. Two studies have reported different responses to neoadjuvant chemotherapy for different molecular subtypes, 20,21 but to our knowledge, no studies have assessed other treatments or factors related to progression for latent classes of MIBC. Also, to our knowledge, there is no other study of risk factors, treatments, progression and survival that has accounted for frailty in the assessment of separate classes of MIBC.

| Generalisability
Our study is based on high-quality register data from an entire nation with more than 15 years of follow-up, and our main results should be generalisable to regions with similar patterns in risk factors, diagnostic routines, treatment guidelines and prognosis. We found practically no geographical differences in the distribution of classes. The patients contributing to this study have mostly north European ancestry, and regional differences may be larger in populations where, for example, the aetiology of bladder cancer differs more than in Sweden. For example, our results are probably not generalisable to MIBC patients infected by Schistosoma haematobium, as this infection is not endemic in Sweden.
The novel analysis used in this study is generalisable to cohort studies in a broad spectrum of diseases to analyse inter-patient heterogeneity in survival. Moreover, it also considers the risk of competing events, which is important since many patient groups are elderly and frail. Under the hypothesis that class-specific characteristics for a certain disease differ considerably between settings, our study shows that this analysis can be applied to find the relevant classes for the specific situation if baseline data from a clinical register are available.

| CONCLUSION
We identified six distinct latent classes of MIBC patients and their class-specific differences and associations with risk of death from bladder cancer and other causes. Our results are compatible with data from other investigations of heterogeneity in MIBC and raise research questions to further understand and characterise the subclasses of MIBC patients. The method used can be expanded to include biomarkers and molecular data to probe whether precision medicine 22 can be further used to individually tailor treatment of MIBC. These findings also have implications for risk stratification in both interventional and observational research.

TABLE 2 Status at end of study, survival times, frailty factors, relative frailty factors and HRs due to class membership (due to frailty) in the six latent classes, as compared to Class 5 (the largest class and also most frail) for both endpoints.
a Excluding study participants alive at end of study.
b Relative contribution due to class membership as compared to the largest class (Class 5), defined as relative frailty factor = abs[frailty (Class 5)] - abs[frailty (Class X)].
c Association due to class membership: HR = exp(relative frailty factor).
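The two definitions in the table footnotes can be written out as a small calculation. The sketch below follows the footnotes as printed; the function names are illustrative, and the frailty values in the example are hypothetical and are not taken from Table 2.

```python
import math


def relative_frailty_factor(frailty_reference: float, frailty_class: float) -> float:
    """Footnote b: abs[frailty (Class 5)] - abs[frailty (Class X)]."""
    return abs(frailty_reference) - abs(frailty_class)


def class_membership_hr(relative_factor: float) -> float:
    """Footnote c: HR = exp(relative frailty factor)."""
    return math.exp(relative_factor)


if __name__ == "__main__":
    # Hypothetical frailty factors for the reference class and another class.
    rff = relative_frailty_factor(-1.0, -2.5)
    print(rff, class_membership_hr(rff))
```

A relative frailty factor of 0 gives HR = 1 (the reference class itself), and negative factors give HRs below 1, consistent with all other classes being less frail than Class 5.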