Comprehensive clinical profiling of the Gauting locoregional lung adenocarcinoma donors

Abstract A comprehensive characterization of lung adenocarcinoma (LADC) clinical features is currently missing. We prospectively evaluated Caucasian patients with early-stage LADC. Patients with LADC diagnosed between 2011 and 2015 were prospectively assessed for lung resection with curative intent. Fifty clinical, pathologic, radiologic, and molecular variables were recorded. Patients were followed until death or study conclusion. The main findings were compared to a separate cohort from France. Of 1943 patients evaluated, 366 were enrolled (18.8%; 181 female; 75 never-smokers; 28% of registered Bavarian cases over the study period). Smoking and obstruction were significantly more prevalent in the Gauting Lung Adenocarcinoma Donors (GLAD) compared with adult Bavarians (P < 0.0001). Ever-smoker tumors were preferentially localized to the upper lobes. We observed 120 relapses and 74 deaths over 704 cumulative follow-up years. Median overall and disease-free survival were >7.5 and 3.6 years, respectively. Patients aged <45 or >65 years, resected >60 days postdiagnosis, with abnormal FVC or DLCO/VA, N2/N3 stage, or solid histology had significantly decreased survival estimates. These variables were fit into a weighted locoregional LADC death risk score that outperformed pTNM7 in predicting survival in GLAD and in our second cohort. We define the clinical gestalt of locoregional LADC and provide a new clinical tool to predict survival, findings that may aid future management and research design.


KEYWORDS
LADC, lung adenocarcinoma, obstruction, smoking, survival

... and percentage predicted values for forced vital capacity (FVC), forced expiratory volume in 1 sec (FEV1), FEV1/FVC ratio, lung diffusion capacity for carbon monoxide (DLCO), and DLCO corrected for alveolar ventilation (DLCO/VA) were recorded. Patients eligible and fit for surgery were prospectively enrolled. Baseline data obtained at entry were: blinded patient identifier (ID), age and sex, body mass and length, date and mode of clinical and tissue diagnosis, clinical TNM7 (cTNM7) stage including site and extent of metastatic disease, smoking start, stop, and intensity, and lung function results. Chronic obstructive pulmonary disease (COPD) was defined as smoking >30 pack-years with compatible symptoms and FEV1/FVC <70% and was graded by the Global Initiative for Chronic Obstructive Lung Disease (GOLD) 2001 classification [13]. All patients were re-evaluated at 30 days postsurgery, the benchmark of referral to oncology/radiotherapy (all stage III/IV patients received adjuvant therapy) or dismissal to out-patient follow-up according to current guidelines [14]. Data prospectively recorded included: date of surgery, time from diagnosis to treatment calculated from imaging/tissue diagnosis (whichever occurred first) to resection date, blinded tissue ID, lobar tumor location, relapse/metastasis date and site, histologic subtype, pathologic TNM7 (pTNM7) stage, and oncogene testing results. Follow-up data were retrospectively acquired from visits, medical charts, telephone consultations with treating physicians, and/or death certificate searches and included: adjuvant therapy, relapse/metastasis date, site, and extent, and death or last contact.
The primary endpoint was overall survival (OS), calculated from surgery to death (event) or last contact (censored); the secondary endpoint was disease-free survival (DFS), calculated from surgery until recurrence (event) or last contact (censored); tertiary endpoints were associations between the variables obtained.
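The endpoint definitions above can be illustrated with a minimal sketch, assuming each patient record holds a surgery date, an optional event date (death for OS, recurrence for DFS), and a last-contact date. The function name `survival_endpoint` and the example dates are hypothetical and are not taken from the study data.

```python
from datetime import date
from typing import Optional, Tuple


def survival_endpoint(
    surgery: date, event_date: Optional[date], last_contact: date
) -> Tuple[int, bool]:
    """Return (follow-up in days, event observed?) for one patient.

    If the event (death or recurrence) occurred, time runs from surgery
    to the event; otherwise the patient is censored at last contact.
    """
    if event_date is not None:
        return (event_date - surgery).days, True
    return (last_contact - surgery).days, False


# Hypothetical patient: alive at last contact, relapsed in between.
surgery_day = date(2012, 3, 1)
last_seen = date(2016, 6, 30)
os_days, died = survival_endpoint(surgery_day, None, last_seen)
dfs_days, relapsed = survival_endpoint(surgery_day, date(2013, 9, 15), last_seen)
print(os_days, died, dfs_days, relapsed)
```

The same patient thus contributes a censored observation to OS and an event to DFS, which is why the two endpoints can diverge within one cohort.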

| Tours comparison cohort
All patients with tissue-diagnosed LADC between January 2006 and December 2011 were prospectively evaluated for curative resection, staged according to TNM7 [2], preoperatively tested for lung function, and prospectively enrolled if eligible and fit for surgery. Data obtained and endpoints were identical to GLAD, except for histologic subtype, extent of metastatic disease, and oncogene test results.

| Histology and genotyping
LADC subtypes of GLAD were determined by our pathology expert (AMH) according to IASLC guidelines [1,2].

| Statistics
Minimal study size (n_MIN) was determined by power analyses (http://www.gpower.hhu.de/en.html) employing Fisher's exact test, proportion inequalities in two independent groups, α error = 0.05, 80% power, and 1:1 allocation ratio. n_MIN = 314 was required to detect the difference between 0% and 5% and n_MIN = 348 between 30% and 45%. We targeted recruitment to n = 350 and achieved n = 366 in September 2015. Data distribution was tested using the Kolmogorov-Smirnov test and summaries are given as frequencies or point estimates (mean or median) with descriptors of dispersion (standard deviation, SD; interquartile range, IQR; or 95% confidence interval, 95% CI), as appropriate and indicated. Survival was analyzed by Kaplan-Meier estimates and Cox proportional hazard models using backward Wald elimination. Log rank tests were used for comparisons. Associations between variables were examined using Fisher's exact or χ² tests, Student's t- or Mann-Whitney U-tests, one-way analysis of variance (ANOVA) with Bonferroni posttests or Kruskal-Wallis ANOVA with Dunn's posttests, Pearson's or Spearman's correlations, and linear regression, depending on input and target variable nature and distribution, as appropriate and as indicated. Probabilities (P) < 0.05 were considered significant. Least absolute shrinkage and selection operator (LASSO) regression analysis was carried out using the glmnet package in R (https://www.r-project.org/), where the number of regression coefficients was shrunk according to a penalization factor λ and their point estimates were determined with cross-validation using 244 samples with complete records. Unsupervised clustering of 362 GLAD patients was done using ConsensusCluster [15]; settings were K = 2-6, subsample size = 300, and fraction = 0.8, K-means algorithm with average linkages, hierarchical consensus, and Euclidean distance metric, and center principal component analysis normalization with fraction = 0.85 and eigenvalue weight = 0.25.
Receiver operating characteristic (ROC) curves were used to identify variables defining patient clusters. Analyses were done using the Statistical Package for the Social Sciences v24.0 (IBM, Armonk, NY) and Prism v5.0 (GraphPad, San Diego, CA).
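For readers unfamiliar with the Kaplan-Meier estimator named above, the following is a minimal pure-Python sketch of the product-limit calculation. It is not the SPSS or Prism implementation used in the study; the function name `kaplan_meier` and the toy follow-up times are illustrative assumptions.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Product-limit estimate of the survival function.

    times  -- follow-up time of each patient
    events -- True if the event was observed, False if censored
    Returns (time, survival probability just after that time) for every
    distinct time at which at least one event occurred.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        n_events = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        n_leaving = sum(1 for ti in times if ti == t)  # events plus censored
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Toy example: four patients, the second one censored at t = 2.
print(kaplan_meier([1, 2, 3, 4], [True, False, True, True]))
```

Censored patients leave the risk set without lowering the curve, which is how patients still alive at last contact contribute to the estimate.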

| The Gauting locoregional lung adenocarcinoma donors (GLAD)
During the period from February 2011 to September 2015, 1943 patients with LADC were prospectively assessed in the Asklepios Medical Center, Gauting, Germany. Among them, 455 were eligible and fit for curative surgery, and 366 were enrolled (89 patients were excluded due to cTNM7 N3 disease or unwillingness to provide informed consent). They represent ~28% of registered Bavarian locoregional LADC cases during the study period (21 588 lung cancer cases, corresponding to 8635 LADC cases at expected 40%, and to 1295 resectable LADC cases at expected 15%; http://www.krebsregister-bayern.de/index_e.html), as summarized in Figure 1A. [16][17][18] During the same period, another 1577 patients with LADC were not eligible or fit for lung resection, rendering 23% of patients screened resectable with intention to treat, and resulting in a 19% recruitment rate into GLAD (Figure 1B and C). Of the 366 patients resected, 41 had oligometastatic disease detected prior to surgery, seven were incompletely resected, and in 20 a malignant pleural disease was identified intraoperatively. Out of the 305 patients who were tumor-free after surgery, 301 remained tumor-free at the 30-day postoperative census (82.2%), and 181 (49.5%) at the mid-2016 census (Figure 1D). At this time, 8453 cumulative follow-up months (median [interquartile range, IQR] 18 [7-33] months/patient) had accrued, and 120 relapses and 74 deaths were observed. Median (95% confidence interval, CI) overall survival (OS) was not reached (>7.5 years), disease-free survival (DFS) was 3.64 (2.76-5.88) years, and 5-year OS and DFS rates were 62% and 39%, respectively (Figure 1E). GLAD will be re-censored mid-biannually; hence survival data are expected to evolve. A color-coded phenome plot of all information available at the mid-2016 census is shown in Figure 2 and Table S1, while a heat map of all the associations observed (discussed below) is given in Figure 3.
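The recruitment figures quoted in this paragraph can be retraced with simple arithmetic. The sketch below uses only numbers stated in the text; the function name `recruitment_summary` and the dictionary keys are illustrative.

```python
def recruitment_summary() -> dict:
    """Recompute the registry share, screening yield, and follow-up time."""
    lung_cancer_cases = 21588            # registered Bavarian lung cancer cases
    ladc_cases = lung_cancer_cases * 0.40    # expected LADC fraction (40%)
    resectable_cases = ladc_cases * 0.15     # expected resectable fraction (15%)
    screened, eligible, enrolled = 1943, 455, 366
    follow_up_months = 8453
    return {
        "expected_ladc_cases": round(ladc_cases),
        "expected_resectable_cases": round(resectable_cases),
        "share_of_registry_pct": round(100 * enrolled / resectable_cases),
        "resectable_pct": round(100 * eligible / screened),
        "recruitment_pct": round(100 * enrolled / screened, 1),
        "follow_up_years": round(follow_up_months / 12),
    }


for key, value in recruitment_summary().items():
    print(f"{key}: {value}")
```

The 23% resectability rate uses the 455 eligible patients as numerator, whereas the 18.8% (rounded to 19%) recruitment rate uses the 366 patients actually enrolled.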
The major findings from GLAD classified according to clinical variables are presented below.

| Age
In GLAD, median (IQR) age was 67 (59-72) years, including 11 (3%) and 195 (53%) patients younger than 45 and older than 65 years, respectively; these two groups had markedly decreased overall survival (OS) and disease-free survival (DFS) compared with the 160 (44%) patients aged between 45 and 65 years (Figure 4A and B). Age was positively associated with cumulative smoke exposure and lepidic/papillary histology. Conversely, it was negatively linked with current smoking, body length, FVC and FEV1, and time to surgery (Figure 3). In addition, more death and relapse events were observed in patients of extreme age (<45 or >65 years) (Figure S1A). Linear regression-calculated lung function decline rates with age were similar to those of the Framingham study [19], and lung function test results were tightly correlated with body metric indices, validating GLAD lung function data (Figure S1B-D). Interestingly, patients with affected resection margins and perioperative pleural relapse were significantly younger (Figure S1A).

| Sex
Surprisingly, 181 patients (49.5%) of GLAD were female, reflecting increasing local and worldwide female smoking trends [6,20]. Female sex was positively associated with percent predicted FVC and FEV1 values and FEV1/FVC ratio, and negatively linked with smoking rate and intensity, body metric indices, absolute FVC and FEV1 and percent predicted DLCO/VA, COPD frequency, solid histologic subtype, and adrenal relapse (Figure S2). However, sex did not significantly impact survival (Figure 4B). These results suggested that locoregional LADC in Caucasian women has distinct features as proposed elsewhere [6]. However, these do not profoundly alter the biologic course of the disease, in accord with published results from Norway [20].

| Obstruction and COPD
When GLAD patients were classified according to the original GOLD criteria [13], 230 patients had stage 0 (62.8%), 50 patients stage I (13.7%), 75 patients stage II (20.5%), 6 patients stage III (1.6%), and 5 patients indeterminate (1.4%) COPD status (Figure 2). Smoking was intimately linked with GOLD COPD stage (P < 0.0001, Fisher's exact test) and COPD was significantly more prevalent in GLAD compared with current Bavarian rates (P < 0.0001, Fisher's exact test; Figure 4D and E) [18]. These findings were validated using real-time statistics (https://knoema.com/REG_DEMO_TL2/demographic-statistics?region=1001010-bavaria, http://www.registrecancers59.fr/index.php/incidence) in GLAD and Tours cohorts (Figure 1A, Figure S3A) [24], underpinning the causative role of smoking in both COPD and LADC [25]. Lung function tests were concordant with the GOLD COPD definition (Figure S5A). COPD was positively associated with affected resection margins and perioperative pleuropulmonary relapse, likely attributable to adverse effects of distorted lung structure on surgical outcome, and correlated negatively with FVC and DLCO/VA (Figure S5B). However, COPD did not impact survival (Figure S5C and D). Of all lung function variables, only abnormal percentage predicted FVC and DLCO/VA negatively impacted survival (Figure S5E and F). Collectively, the data indicate that COPD and LADC show significant overlap, suggesting a common pathogenesis, in line with the literature [25]. Moreover, the percentage predicted FVC and DLCO/VA, but not other spirometry indices or a diagnosis of COPD, can predict survival.

| cTNM7 staging
All patients were staged according to cTNM7 to guide management [2,14]. We included history, physical exam, and chest-to-adrenal computed tomography in all. For stage III patients, an invasive bronchoscopy with mediastinal lymph node sampling, mediastinoscopy, and/or 18F-fluorodeoxyglucose positron emission tomography were also performed. Analysis of T, N, and cTNM7 stage showed a significant impact on survival (Figure S6) and validated GLAD against the reference IASLC study [2].

| Tumor location
The lobe of origin of GLAD tumors was definitively determined during surgery in 296 patients, while tumors involving multiple lobes, central airways, and/or mediastinal structures rendered this impossible in 70 patients. We identified a striking upper lobe predominance in both GLAD and Tours cohorts, which was disproportionate to published lobe ventilation or perfusion patterns, and was reminiscent of lobar ventilation/perfusion ratios (Figure 4G-I) [26,27]. Strikingly, right upper lobe (RUL) LADCs predominated in smokers of both cohorts, and patients with RUL LADC displayed higher FVC, FEV1, and N stage, but similar survival, compared with all other patients (Figure S7).

| Histology
After the pathologic review of multiple tumor sections and sites (AMH), GLAD were classified into 16 lepidic [...] lepidic-predominant tumors in LADC was more frequent in never-smokers and displayed lower overall TNM descriptors, decreased metastatic propensity, and prolonged overall survival, as opposed to solid-predominant LADC that displayed aggressive features and poor survival (Figure S8), further validating GLAD.

| Patterns of relapse
Over 704 cumulative follow-up years, 190 relapse events were identified in 120 patients (Figure 1D). In addition to the associations described above, patients with higher cTNM7 descriptors had higher relapse rates, both at the 30-day postsurgery and at mid-2016 benchmarks (Figures 4J and 5). Relapse timing and site did not significantly impact OS; however, pleural or multi-site relapse (5/20 patients with multiple relapses also had pleural relapse) adversely impacted DFS, indicating that pleural relapse occurs earlier than others (Figure S9).

| Survival
We next assessed the impact of each variable on OS and DFS. In a first step, Kaplan-Meier analyses using OS and DFS as target and single variables as inputs (continuous numerical variables were dichotomized at abnormal cutoffs) showed that patients with age outlying 45-65 years, abnormal percentage predicted FVC and DLCO/VA, high T, N, and cTNM7 descriptors, delayed and incomplete resection, solid histologic subtype, and pleural relapse had decreased OS and/or DFS (Figures 4A and 4F, Figures S5E and F, S6, S8C and D, S9B). In a second step, all variables were entered into a least absolute shrinkage and selection operator (LASSO) regression analysis that identified age, body mass, percentage predicted FVC, FEV1/FVC ratio, percentage predicted DLCO, GOLD COPD stage, T, N, and cTNM7 stage, time to surgery, right lower lobe origin, and histologic subtype as determinants of OS and/or DFS (Figure 5A). In a final step, all variables that emerged from both Kaplan-Meier and LASSO analyses were entered into Cox regression using backward Wald elimination, which identified age outlying 45-65 years, abnormal percentage predicted FVC and DLCO/VA, N2/3 disease, delayed resection, and solid histology as independent predictors of OS of GLAD (Figure 5B).

| The locoregional lung adenocarcinoma death risk score (LADERS)
We next built a model to predict OS at the 30-day postresection benchmark, using the six variables that withstood Kaplan-Meier, LASSO, and Cox regression testing using OS as the target. LADERS employs Cox proportional hazard points and was tailored for easy clinical use without extra imaging/procedures (Figures 5C and 6A and B, Table 1). LADERS displayed only 25% correlation with pTNM7 and was intimately linked with death events, while pTNM7 showed tight linkage with relapse events (Figure S10A and B). LADERS outperformed pTNM7 in predicting OS of GLAD in Kaplan-Meier and Cox regression analyses, while pTNM7 performed better in predicting DFS (Figure S10C; Table 2). LADERS also outperformed pTNM7 in predicting OS in the Tours cohort (Figure 6C and D).
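The general shape of such a weighted score can be sketched as follows. This is only an illustration of summing weights over the six risk factors named in the text. The actual LADERS points are given in Table 1 and are not reproduced here, so every weight below is a placeholder set to 1.0, and the `Patient` fields, the 60-day delay cutoff (taken from the abstract), and the function names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Patient:
    age_years: float
    days_diagnosis_to_resection: int
    fvc_abnormal: bool          # abnormal percentage predicted FVC
    dlco_va_abnormal: bool      # abnormal percentage predicted DLCO/VA
    n_stage: int                # 0-3
    solid_histology: bool


# PLACEHOLDER weights, NOT the published LADERS points (see Table 1).
PLACEHOLDER_WEIGHTS: Dict[str, float] = {
    "extreme_age": 1.0,
    "delayed_resection": 1.0,
    "fvc_abnormal": 1.0,
    "dlco_va_abnormal": 1.0,
    "n2_n3": 1.0,
    "solid_histology": 1.0,
}


def risk_factors(p: Patient) -> Dict[str, bool]:
    """Dichotomize a patient record into the six risk factors."""
    return {
        "extreme_age": p.age_years < 45 or p.age_years > 65,
        "delayed_resection": p.days_diagnosis_to_resection > 60,
        "fvc_abnormal": p.fvc_abnormal,
        "dlco_va_abnormal": p.dlco_va_abnormal,
        "n2_n3": p.n_stage >= 2,
        "solid_histology": p.solid_histology,
    }


def risk_score(p: Patient, weights: Dict[str, float]) -> float:
    """Sum the weights of all risk factors present in the patient."""
    return sum(weights[name] for name, present in risk_factors(p).items() if present)


example = Patient(70, 90, True, False, 2, True)
print(risk_score(example, PLACEHOLDER_WEIGHTS))
```

With real weights derived from Cox hazard ratios, factors carrying a higher hazard would contribute more points than others, which is what distinguishes a weighted score from a simple count of risk factors.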

| DISCUSSION
Here we present GLAD, a prospectively evolving biobank of phenotypes and tumor/normal paired tissues of patients with locoregional LADC. The longitudinal follow-up of the cohort suggests that locoregional LADC is currently a chronic lung disease with median survival >7.5 years. We corroborate pertinent findings of previous studies, such as the high frequency of these tumors in never/ex-smokers and women and the significant overlap of LADC with COPD, the upper lobe predominance of these tumors that appears to be right-sided in smokers, as well as the value of current staging and histologic typing systems in management and prognosis [1,2,22]. Using detailed phenotyping and prolonged follow-up, we discovered previously ill-defined and undefined clinical associations, such as the adverse effects of extreme age, poor lung function, and delayed resection on survival, as well as the early nature of pleural and the latency of pulmonary relapse. Most of our findings are corroborated in a separate patient cohort from France. Most importantly, we combined this wealth of clinical information to produce LADERS, a clinical score that accurately predicts survival in both cohorts. In accord with only one previous report [28], young GLAD patients developed more aggressive LADC, possibly attributable to germline tumor suppressor loss, a hypothesis that can be directly tested in GLAD tissues. On the other hand, extremely old patients appeared to have a worse surgical prognosis associated with reduced lung and overall function. Active smokers had low N stage and relapse rates, findings possibly related to the young age and increased surveillance of active smokers in our cohort.
For the first time, we report how important time to surgery is for incipient survival, underscoring the aggressiveness of the disease and the urgency of surgery. We also define a previously reported spatial pattern of LADC development in the upper lobes [29]. Although the clinical importance of this finding is unclear, it is likely the result of increased local conversion of inhaled precarcinogens to active carcinogens in the upper lobes of smokers. Of special note, we identify distinct temporal trends in organ-specific relapse of early-stage LADC, similar to biphasic metastatic patterns of other tumor types like breast cancer [30]. Importantly, we provide clinicians with LADERS, an easy-to-use and accurate clinical tool to predict survival.
In conclusion, the first results from a prospective cohort of patients with locoregional LADC corroborate the impact of current staging and histologic subtyping systems and identify important effects of age, lung function, and time to resection on survival. A clinical tool to assess survival is also provided. Importantly, the future combination of clinical information with tissue profiling is anticipated to unveil novel tumor genome-phenome links and unprecedented mechanistic insights into the evolution of carcinogenesis in the respiratory tract.
TABLE 2 Performance of pTNM7 and LADERS scores as predictors of survival of GLAD patients at discharge from thoracic surgery