Serum markers of pulmonary epithelial damage in systemic sclerosis‐associated interstitial lung disease and disease progression

The course of systemic sclerosis‐associated interstitial lung disease (SSc‐ILD) is highly variable, and accurate prognostic markers are needed. KL‐6 is a mucin‐like glycoprotein (MUC1) expressed by type II pneumocytes, while CYFRA 21‐1 is expressed by alveolar and bronchiolar epithelial cells. Both are released into the blood following cell injury.


INTRODUCTION
Interstitial lung disease in scleroderma (systemic sclerosis-associated interstitial lung disease, SSc-ILD) is the leading cause of death in SSc. 1 Although many patients have relatively mild and/or stable ILD, many others have progressive disease with reduced life expectancy. Patients at higher risk of ILD progression need to be identified in order to ensure optimal treatment and monitoring.
Krebs von den Lungen-6 (KL-6), a glycoprotein expressed mainly on type II pneumocytes, is highly expressed by proliferating and regenerating cells. 2,3 Serum KL-6 levels are increased in SSc-ILD, [4][5][6] with higher levels associated with more extensive SSc-ILD, [7][8][9][10][11][12] more rapid short-term decline in forced vital capacity (FVC) 13,14 and development of end-stage lung disease. 15 However, the prognostic value of serum KL-6 levels has not been evaluated against changes in measures of gas transfer, known to be strongly linked to mortality in SSc-ILD, [16][17][18] particularly a categorical worsening in diffusing capacity of the lung for carbon monoxide (DLCO) by ≥15% at 2 years, an independent predictor of survival in SSc-ILD. 18 A single-nucleotide polymorphism (SNP) in the KL-6 gene MUC1, rs4072037, relates to KL-6 serum levels in SSc, with higher levels in individuals carrying the G allele. 19
Cytokeratin 19 fragment, CYFRA 21-1, is expressed on type I/II pneumocytes and respiratory bronchiolar epithelial cells. Cytokeratin proteolytic fragments are soluble and released into the blood from cell lysis or necrosis. Serum CYFRA 21-1 appears to distinguish idiopathic pulmonary fibrosis (IPF) patients from controls, with higher levels associated with increased mortality. 20 In a study of patients with connective tissue diseases, CYFRA 21-1 was associated with ILD, although the number of patients was small (n = 23), with only seven SSc-ILD patients. 21
In this study, we evaluate serum KL-6 and CYFRA 21-1 as biomarkers of SSc-ILD and its worsening in SSc patients with long-term follow-up.

Study populations
Consecutive SSc patients attending clinics at the Royal Brompton and Royal Free Hospitals, London (retrospective cohort: 1991-2013, prospective cohort: 2014-2016), were recruited. A diagnosis of SSc was made according to established criteria. 22 Only patients with lung function within 6 months of serum collection were included. Patients with malignancies at the time of serum collection were excluded. All participants gave written informed consent, and the Ethics Committees of the Royal Brompton and the Royal Free Hospitals gave authorization for the study (REC 13/LO/0857).

Clinical assessment
Clinical data were recorded at the time of serum collection. ILD was defined as the presence of interstitial changes on chest imaging. Lung function tests, including FVC and DLCO, were performed in a single laboratory, as previously reported. 23 Further details are available in Appendix S1 (Supplementary Information).

Statistical analysis
Analyses were performed using Stata 15.1 software (StataCorp, College Station, Texas). Group comparisons were made using Wilcoxon's rank sum, Mann-Whitney or chi-square tests, as appropriate. KL-6 and CYFRA 21-1 levels were log transformed to normalize the data. Generalized linear models were used to assess whether the association between serum KL-6 and ILD extent was modified by MUC1 genotype. We performed linear mixed-effects analysis, which takes into account variations in test intervals, using FVC (L) and DLCO (mmol/min) as outcome measures, with subject as a random effect and time from baseline, age, gender, ethnicity and smoking status as fixed effects. A P-value of <0.05 was considered significant. Further details are available in Appendix S1 (Supplementary Information).
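The role of the log transformation can be illustrated with a short Python sketch. It is not the authors' Stata code: the KL-6 values are invented for illustration, and `rank_sum_u` is a hypothetical helper that computes the Mann-Whitney U statistic by direct pair counting. Because a rank-based test depends only on the ordering of values, U is unchanged by the log transform; the transform matters for the regression models, which assume approximately normal data.

```python
import math


def rank_sum_u(group_a, group_b):
    """Mann-Whitney U for group_a: pairs with a > b count 1, ties count 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical serum KL-6 values (U/mL), invented for illustration only.
kl6_ild = [1850.0, 920.0, 2400.0, 1310.0, 760.0]
kl6_no_ild = [310.0, 540.0, 280.0, 800.0]

# Natural-log transform, as would be applied before model fitting.
log_ild = [math.log(x) for x in kl6_ild]
log_no_ild = [math.log(x) for x in kl6_no_ild]

u_raw = rank_sum_u(kl6_ild, kl6_no_ild)
u_log = rank_sum_u(log_ild, log_no_ild)

print(f"U on raw values: {u_raw}, U on log values: {u_log}")
```

In this toy data set 19 of the 20 possible pairs favour the ILD group, on both the raw and the log scale.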

Patient cohorts
A total of 189 patients were recruited for the retrospective cohort and 118 patients for the prospective cohort. Further details are available in Appendix S2 (Supplementary Information).
Patient characteristics are described in Table 1. Compared to the retrospective cohort, patients in the prospective cohort were significantly older and more likely to be of non-European ancestry. Patients in the prospective cohort were also more likely to have more severe lung disease, to be on active treatment and to have an estimated pulmonary artery systolic pressure (PASP) ≥40 mm Hg on echocardiogram, and were less likely to have anti-centromere antibodies (ACA) (Table 1).

Serum KL-6 and CYFRA 21-1 are associated with the presence and extent of ILD and MUC1 rs4072037 allele carriage
Serum KL-6 and CYFRA 21-1 correlated with the presence and severity of ILD in both cohorts, with higher levels in SSc-ILD compared to SSc-no ILD, and in extensive compared to limited ILD (Fig. 1). In both cohorts, KL-6 levels were significantly higher in patients carrying the G allele of MUC1 rs4072037 (Fig. 2). Further details are available in Appendix S2 (Supplementary Information).

KL-6 and CYFRA 21-1 correlate with baseline levels of lung function
Serum levels of KL-6 and CYFRA 21-1 were inversely correlated with baseline lung function measurements (Appendix S2, Figs S1,S2 in Supplementary Information).

KL-6 and CYFRA 21-1 levels and active treatment
In both cohorts, patients on active treatment (Table S1 in Supplementary Information) at the time of serum collection had higher levels of KL-6 compared to those not on active treatment (P = 0.03 and P = 0.04, respectively). This association was lost once disease severity (composite physiologic index, CPI) was taken into account (Table S2 in Supplementary Information). There was no significant difference in CYFRA 21-1 levels in either cohort according to treatment status.

Association between KL-6 and CYFRA 21-1 and SSc-ILD progression
The association between serum KL-6 and CYFRA 21-1 and lung function worsening was evaluated in patients with SSc-ILD (retrospective n = 146, prospective n = 114). Only associations identified as significant in the retrospective cohort were tested in the prospective cohort for validation.
On linear mixed-effects model analysis, KL-6 was significantly associated with FVC (P < 0.005) and DLCO (P < 0.001) decline in the retrospective cohort. The association with DLCO decline was confirmed in the prospective cohort (P = 0.004) (Table 2). Serum CYFRA 21-1 was not significantly associated with decline in FVC or DLCO in the retrospective cohort (Table 3).
Having confirmed an association between serum KL-6 and DLCO decline in the prospective cohort, we evaluated whether KL-6 was predictive of lung function decline independent of disease severity. Because the two cohorts differed significantly in ILD severity and had markedly different follow-up times, we combined them and stratified according to median CPI (45.97), which allowed us to correct for ILD severity while retaining adequate numbers for statistical power in each subgroup. This definition of ILD severity was selected as it resulted in a near-equal number of patients in each severity group (n = 128/129), while subgrouping according to Goh et al.'s staging system would have resulted in unequal cohort sizes (n = 134 and n = 123).
In patients with less severe ILD (CPI < 45.97), KL-6 was significantly associated with decline in DLCO (P = 0.03). This association remained significant following correction for age, gender, ethnicity, smoking status and allele carriage (P = 0.007). Although the trend towards an association with FVC decline did not reach significance on univariable analysis, KL-6 was also significantly associated with decline in FVC on multivariable analysis (P = 0.01). In patients with more severe ILD (CPI ≥ 45.97), KL-6 was significantly associated with decline in DLCO on both univariable (P = 0.007) and multivariable analyses (P = 0.02) (Table 4).
For clinical purposes, we wanted to test whether knowledge of MUC1 rs4072037 allele carriage was necessary for the prognostic utility of KL-6. The associations with DLCO remained significant when allele carriage was omitted from the multivariable analysis (Table S3 in Supplementary Information). All associations remained significant when estimated PASP ≥ 40 mm Hg on echocardiogram was added as a covariate in the smaller group with echocardiographic data (Table S4 in Supplementary Information).
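The median-based severity stratification described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' analysis code: the CPI values are invented, `split_at_median` is a hypothetical helper, and the CPI formula is the commonly cited one from Wells et al. (2003), which this article does not restate.

```python
from statistics import median


def composite_physiologic_index(dlco_pct, fvc_pct, fev1_pct):
    """CPI from percent-predicted lung function (formula assumed from Wells et al., 2003)."""
    return 91.0 - 0.65 * dlco_pct - 0.53 * fvc_pct + 0.34 * fev1_pct


def split_at_median(cpi_values):
    """Split patients into less severe (CPI < median) and more severe (CPI >= median)."""
    cut = median(cpi_values)
    less_severe = [v for v in cpi_values if v < cut]
    more_severe = [v for v in cpi_values if v >= cut]
    return cut, less_severe, more_severe


# Hypothetical CPI values for five patients, invented for illustration only.
cpi_values = [31.2, 38.5, 45.97, 52.0, 60.4]

cut, less_severe, more_severe = split_at_median(cpi_values)
print(f"median CPI {cut}: {len(less_severe)} less severe, {len(more_severe)} more severe")
```

With an odd number of patients and a unique median value, the at-or-above-median group is one patient larger than the below-median group, which is the same pattern as the 128/129 split reported for the combined cohort.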

Predictive cut-off value for serum KL-6 in predicting DLCO decline by ≥15%
We sought to establish the optimal serum KL-6 threshold in predicting decline in DLCO by ≥15% at 2 years, an established surrogate marker of mortality in SSc-ILD, 18 by performing receiver operating characteristic (ROC) analysis in the retrospective cohort. The best cut-off level for serum KL-6 was 1472 U/mL, with a sensitivity of 41.94%, specificity of 80.67% and 74.03% of patients correctly classified. This cut-off value successfully predicted time to decline in DLCO by ≥15% in the prospective cohort (P = 0.003) (Fig. S3 in Supplementary Information).
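As a worked illustration of how the three reported figures relate to one another, the following Python sketch computes sensitivity, specificity and the proportion correctly classified from a 2 × 2 table. The cell counts are hypothetical: they were chosen only because they reproduce the reported percentages, and they are not taken from the article. The function names are likewise illustrative.

```python
def is_high_kl6(kl6_u_per_ml, cutoff=1472.0):
    """Flag a serum KL-6 value at or above the reported cut-off (U/mL)."""
    return kl6_u_per_ml >= cutoff


def classification_summary(tp, fn, tn, fp):
    """Return sensitivity, specificity and the fraction correctly classified."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    correctly_classified = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, correctly_classified


# Hypothetical counts, NOT reported in the article; chosen only to be
# consistent with the published percentages.
tp, fn = 13, 18   # patients with DLCO decline >= 15%: high KL-6 / low KL-6
tn, fp = 121, 29  # patients without such a decline: low KL-6 / high KL-6

sens, spec, acc = classification_summary(tp, fn, tn, fp)
print(f"sensitivity {sens:.2%}, specificity {spec:.2%}, correctly classified {acc:.2%}")
```

The sketch makes the trade-off at this threshold explicit: relatively few patients without subsequent decline are flagged, but more than half of those who do decline fall below the cut-off.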

KL-6 and CYFRA 21-1 and mortality
Both KL-6 (P = 0.015) and CYFRA 21-1 (P = 0.001) were significantly associated with mortality in patients with SSc-ILD in the retrospective cohort on univariate analysis, although the associations only bordered on statistical significance (P = 0.06 for both) after adjustment for CPI, age, gender, ethnicity, smoking status and allele carriage when appropriate (Table S5 in Supplementary Information). As the findings were borderline significant, we also tested the association with survival in patients with SSc-ILD in the prospective cohort, and did not find an association with mortality.

DISCUSSION
In this study, we found that serum levels of KL-6 and of CYFRA 21-1 were highest in SSc patients with lung involvement, and in those with extensive rather than limited ILD. In patients with SSc-ILD, KL-6, but not CYFRA 21-1, was significantly associated with lung function decline, regardless of ILD severity. Despite advances in the management of SSc-ILD, its impact on quality of life and mortality remains high. Accurate prognostication remains difficult. Evidence supports the need to treat patients with extensive and/or progressive SSc-ILD, while only a subset of patients with milder ILD may require treatment. 25 The last decade has seen the publication of landmark clinical trials for SSc-ILD. 26,27 While immunosuppression remains the mainstay of treatment, there is a subgroup of patients with progressive fibrotic disease despite treatment. Their early identification and prevention of progressive fibrosis remain a key objective. In addition to KL-6, a number of biomarkers have been reported to be associated with ILD presence and/or progression in SSc-ILD, including serum CCL18, 28 although none are currently available for routine clinical use in Europe. Our results suggest that serum KL-6 is a more powerful biomarker than CYFRA 21-1 for predicting SSc-ILD progression across ILD severity. In particular, KL-6 is predictive of lung function decline in patients with less severe SSc-ILD, the group for which predictive markers are most needed, particularly now that the range of options to treat progressive fibrotic lung disease has increased to include anti-fibrotic agents, 27,29 and further novel treatments are under investigation.
Interestingly, although carriage of the MUC1 allele was associated with ILD severity in both cohorts, the significance of the association between serum KL-6 and DLCO did not change even after omitting the allele carriage data from the multivariable analysis, suggesting that for clinical purposes, knowledge of the MUC1 allele carriage status is not indispensable for KL-6 to provide prognostically useful information.
Having observed an association between serum KL-6 and lung function worsening, in order to establish the best predictive cut-off value, we utilized DLCO decline at 2 years, identified as a stronger surrogate mortality marker than changes in FVC in SSc-ILD. 18 We identified optimal thresholds predictive of decline in DLCO by ≥15% at 2 years from baseline in the retrospective cohort, and confirmed that KL-6 ≥ 1472 U/mL was also significantly associated with earlier decline in DLCO by ≥15% in the prospective group. Considering the majority of patients in the prospective cohort were on treatment for their SSc-ILD, serum KL-6 thresholds could aid in identifying patients more likely to require intensification of treatment to prevent progression of disease. In particular, whether serum KL-6 thresholds could help in identifying patients more likely to benefit from the addition of anti-fibrotic treatments will require further study.
Our study has limitations. The prospective cohort was not an ideal validation cohort, as ILD severity was greater and follow-up time was much shorter than in the retrospective cohort. The difference reflected unexpected changes in referral patterns during the study period, with a shift in recent times towards the selective referral of severe SSc-ILD patients. As a result, meaningful analysis of prognostic differences between severe and less severe SSc-ILD was not possible in the prospective cohort, with shorter follow-up time in this cohort as an additional constraint. In view of the importance of severity distinctions, we therefore conducted a post hoc analysis in which the two cohorts were combined and subdivided according to median CPI. This definition of ILD severity was selected as it resulted in a near-equal number of patients in each severity group. Although the CPI has not specifically been tested in SSc-ILD, Wells et al. observed that the relationship between spirometric lung volumes and DLCO, which are components of the CPI score, and HRCT extent did not differ between SSc-ILD and IPF, suggesting that it is reasonable to use CPI as a measure of severity in SSc-ILD. 30
Another unavoidable limitation of our study is the inability to adjust for treatment differences. Although treatment was categorized broadly as active treatment within 3 months of serum collection, the later introduction of treatment could not be accounted for in the analyses. Baseline KL-6 levels were higher in patients on treatment in both cohorts, but this association was lost with adjustment for disease severity, with treatment status linked to disease severity, as expected. CYFRA 21-1 levels did not vary according to treatment status. Treatment regimens in SSc-ILD are too variable to allow categorical subanalysis during longer term follow-up. There is major variability in the choice, timing and duration of treatment, with large modifications often made due to side effects or non-efficacy.
Finally, although our main focus was the utility of serum KL-6 and CYFRA 21-1 as potential markers of SSc-ILD progression, we recognize that the relatively small number of patients without ILD is a limitation of the study.
Serum KL-6 and CYFRA 21-1 are markers of epithelial cell damage. Rapid clearance of radio-labelled DTPA, reflecting impaired alveolar epithelial integrity, is associated with progression of SSc-ILD, 31,32 suggesting that epithelial cell damage plays an important role in SSc-ILD pathogenesis. Interestingly, DTPA clearance was associated with lung function worsening, but not with mortality in SSc-ILD, 33 similar to our observations where we found only a weak association with mortality on multivariable analysis and only in the retrospective cohort. This would again suggest that KL-6, like DTPA clearance, is specifically a marker of epithelial events, and therefore linked with lung function worsening. It would be of interest to investigate whether KL-6 is purely a marker of progression in SSc-ILD or if it has a direct role in promoting fibrosis. There is evidence that KL-6 may promote a fibrotic phenotype in human lung fibroblasts, [34][35][36] although further data on its potential role are needed.
In conclusion, despite advances in the knowledge of SSc-ILD staging and pathogenesis, management of the disease remains challenging, with the need for more accurate predictors of disease progression. Serum biomarkers are easily obtainable, and could provide increased prognostic ability and potentially new insights into pathogenesis and potential therapeutic targets in SSc-ILD. Both serum KL-6 and CYFRA 21-1 are markers of pulmonary epithelial injury and abnormal repair. From our study, we conclude that serum KL-6 appears to be a better marker of progressive SSc-ILD than CYFRA 21-1. Ultimately, we need to develop an individualized risk index that incorporates clinical variables including ILD severity, complemented by easily obtainable biomarkers, to inform selective early treatment and frequent monitoring of patients with SSc-ILD at high risk of progression.

Appendix S2 Additional results.
Figure S1 KL-6 correlation with baseline lung function.
Figure S2 CYFRA 21-1 correlation with baseline lung function.
Figure S3 KL-6 cut-off with decline in DLCO ≥15%.
Table S1 Type of treatment at baseline.