Proteomic profiling of lung diffusion impairment in the recovery stage of SARS‐CoV‐2–induced ARDS

Dear Editor, In survivors of acute respiratory distress syndrome (ARDS) secondary to SARS-CoV-2 infection, lung diffusion impairment is consistently associated with a characteristic plasma proteome. The mechanistic pathways linked to the proteomic pattern provide novel evidence on multiple biological domains relevant to the postacute pulmonary sequelae.
Based on the increasing number of COVID-19 survivors affected by pulmonary abnormalities and the limited understanding of the pathophysiology of the sequelae,1 we analysed the systemic proteomic determinants of lung diffusion impairment in SARS-CoV-2-induced ARDS survivors.
This is a substudy of a 3-month prospective cohort study including survivors of severe COVID-19 (n = 88).2 Patients admitted to the Hospital Universitari Arnau de Vilanova-Santa María (Lleida, Spain) between March and August 2020 were included if they fulfilled the following criteria: aged over 18 years, developed ARDS during hospital stay and attended a 'post-COVID' evaluation 3 months after hospital discharge. The study received approval from the medical ethics committee (CEIC/2273) and was performed in full compliance with the Declaration of Helsinki. The patients received written information about the study and signed an informed consent form. A complete pulmonary evaluation was performed as previously detailed.2 Blood samples were collected in EDTA tubes (BD, NJ, USA) and processed using standardised operating procedures with support from the IRBLleida Biobank (B.0000682) and 'Plataforma Biobancos PT20/00021'. Plasma proteomic profiling was performed using the proximity extension assay (PEA) technology (Olink, Uppsala, Sweden). Four panels were analysed: organ damage, immune response, inflammation and metabolism. Additional details can be consulted at https://www.olink.com/resources-support/documentdownload-center/. A total of 364 proteins were measured. One hundred forty-five proteins were excluded from subsequent analyses due to undetectable levels in more than 50% of the samples (Table S1). SARS-CoV-2 RNA was detected as previously described.3 STRING,4 Reactome,5 GTEx (https://www.gtexportal.org/home/) and Drug-Gene Interaction6 databases were used for bioinformatic analyses. All statistical analyses were performed using R software, version 4.0.2.
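The detectability filter described above can be illustrated with a minimal Python sketch. It is not the code used in the study (the analyses were run in R); the function name, the use of None to mark a value below the limit of detection and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def filter_by_detectability(
    measurements: Dict[str, List[Optional[float]]],
    max_undetected_fraction: float = 0.5,
) -> Dict[str, List[Optional[float]]]:
    """Keep proteins that are undetected in at most the given fraction of samples.

    None marks a sample in which the protein was below the detection limit
    (an assumed encoding, not the Olink export format).
    """
    kept = {}
    for protein, values in measurements.items():
        if not values:
            continue  # no samples measured: nothing to evaluate
        undetected = sum(1 for v in values if v is None)
        # Exclude only when MORE than the allowed fraction is undetected.
        if undetected / len(values) <= max_undetected_fraction:
            kept[protein] = values
    return kept


# Hypothetical toy data: three proteins measured in four samples.
toy = {
    "PROT_A": [1.2, 0.8, None, 1.5],    # 25% undetected: kept
    "PROT_B": [None, None, None, 0.4],  # 75% undetected: excluded
    "PROT_C": [None, 2.1, None, 1.9],   # exactly 50% undetected: kept
}
filtered = filter_by_detectability(toy)
print(sorted(filtered))
```

Because the exclusion criterion is "more than 50%", a protein undetected in exactly half of the samples is retained in this sketch.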
The study flowchart is displayed in Figure S1. The most relevant demographic and clinical characteristics during the acute phase are shown in Table 1. The median (P25;P75) age was 60.0 years (53.0;65.5), and most patients were male (69.0%). At the 3-month follow-up, 30% of patients presented moderate-to-severe pulmonary diffusion impairment (DLCO < 60%) (Table 2). Using linear models for arrays, we found 15 differentially detected proteins (FDR < 0.05) in this study group (Figure 1A, Table S2). The 15 proteins separated the patients according to the grade of lung dysfunction (Figure 1B,C). All proteins showed higher concentrations in patients with DLCO < 60% (Figure 1D). Proteins showed a dose-response relationship with DLCO in unadjusted generalized additive models (GAMs) (Figure S2). Renal function at follow-up was associated with both diffusion impairment and several proteins (rho ≥ 0.3) (Table 2, Figure S3). Therefore, glomerular filtration was considered a confounder, together with age, sex, previous chronic pulmonary disease, smoking history and the use of corticoids after hospital discharge. No impact of these confounding factors was observed (Figure 1E). Except for KIM1 (rho ≥ 0.3 and rpb ≥ 0.3), there was no correlation between protein levels and disease severity (Figure S4). KIM1, LAMP3 and PGF correlated with the presence of fibrotic lesions (rpb ≥ 0.3) (Figure S5). Specific correlations were observed between protein levels and laboratory parameters (rho ≥ 0.3) (Figure S3).
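As an illustration of the FDR threshold applied in the differential analysis, the sketch below adjusts a list of raw p-values with the Benjamini-Hochberg procedure. The letter does not state which FDR method was used, so the choice of Benjamini-Hochberg, the function name and the p-values are assumptions; the study itself relied on linear models for arrays in R.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five proteins (not study data).
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, significant)
```

In this toy example only the first two proteins pass FDR < 0.05, although four have raw p-values below 0.05, which shows why the adjusted threshold is the more conservative criterion.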
The sparse partial least-squares discriminant analysis (sPLS-DA) generated a signature of 20 proteins that allowed optimal discrimination between study groups (AUC = 0.872) (Figure 2A-C). Based on the variable importance of component 1, the top five relevant contributors were PTN, KIM1, CALCA, CLEC7A and ENTPD6 (Figure 2A). The feature selection procedure based on random forest supported these results (Figure S6A). In addition, sPLS was used to determine the protein profile that best explained the DLCO levels (as a continuous variable) (Figure 2D-F). The analysis identified a signature of 35 proteins. PTN, PGF, NPDC1 and METRNL were the most heavily weighted factors defining component 1 (Figure 2D). The proteomic profile generated using random forest was in concordance with these findings (Figure S6B). IFN-γ, which participates in the response to infection,7 was associated with diffusion capacity. Therefore, we analysed viral load in plasma samples from a subset of 50 patients. Only one patient was positive for the presence of SARS-CoV-2 RNA. The larger signature (n = 35 proteins) was used for bioinformatic analyses. An enrichment in pathways associated with cell proliferation and differentiation, tissue remodelling, inflammation and immune response, angiogenesis, coagulation and fibrosis was observed (Figure S7A, Tables S3 and S4). Three independent protein networks were identified (Figure S8). The signature was broadly expressed in the lung but also in other tissues (Figure S7B). The proteomic pattern was enriched in lung epithelial, endothelial and immune cells (Figure S7C). The drug-gene interaction analysis identified several FDA-approved drugs that can target the proteins (Table S5).
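To make the reported discrimination metric concrete, the following sketch computes an AUC as the proportion of (impaired, non-impaired) patient pairs in which the impaired patient receives the higher classifier score. It does not reproduce the sPLS-DA model; the function name, the scores and the labels are hypothetical.

```python
from typing import List


def auc_from_scores(scores: List[float], labels: List[int]) -> float:
    """AUC as the probability that a positive case outranks a negative one.

    Labels are 1 (e.g. DLCO < 60%) or 0; tied scores count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for six patients (not study data).
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = auc_from_scores(scores, labels)
print(round(auc, 3))
```

Here 8 of the 9 possible pairs are ranked correctly, giving an AUC of about 0.889; an AUC of 0.872, as reported for the 20-protein signature, can be read in the same pairwise way.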
Postinfection long-term lung dysfunction has become clinically evident in a large percentage of SARS-CoV-2-induced ARDS survivors. Systemic molecular profiling constitutes a promising strategy to decipher the underlying biological mechanisms linked to the pulmonary outcomes and, consequently, to identify candidates that may be amenable to therapeutic intervention.8-10 Here, we provide compelling evidence that (i) a set of plasma proteins is differentially detected in survivors with moderate-to-severe diffusion impairment; (ii) diffusion capacity is associated with alterations in the proteomic profile, even after adjustment for confounding factors; (iii) survivors with the most serious sequelae show greater disturbances in the protein levels; (iv) sPLS and random forest define protein signatures highly associated with pulmonary function; (v) the signatures are composed of heterogeneous factors implicating multiple biological pathways; (vi) the signatures constitute a source of targets for candidate drugs; (vii) plasma proteomic profiles accurately classify patients with respiratory sequelae; and (viii) no association was observed between blood viral load and diffusion impairment.

CONCLUSION
The plasma proteomic profile linked to lung diffusion impairment improves our understanding of the pathophysiology of postacute pulmonary sequelae in COVID-19 and, consequently, constitutes a useful resource for the design of therapeutic strategies and the development of tools to improve medical decision-making in the 'post-COVID' syndrome. Additional cohorts and functional analyses are needed to corroborate our findings.

CONFLICT OF INTEREST
The authors declare that they have no competing interests.

FUNDING INFORMATION
Financial support was provided by the Instituto de Salud Carlos III de Madrid (COV20/00110), co-funded by the European Development Regional Fund (A Way to Achieve Europe programme) and Centro de Investigación Biomedica En Red Enfermedades Respiratorias (CIBERES). CIBERES is an initiative of the Instituto de Salud Carlos III. Supported by: Programa de donaciones "estar preparados" UNESPA (Madrid, Spain); and Fundación Francisco Soria Melguizo (Madrid, Spain). Funded by La Fundació La Marató de TV3, project code 202108-30/-31. COVIDPONENT is funded by Institut Català de la Salut and Gestió de Serveis Sanitaris. APT was funded by the Sara Borrell Research Grant CD018/0123 funded by the Instituto de Salud Carlos III and co-financed by the European Development Regional Fund (A Way to Achieve Europe programme). MCGH is the recipient of a predoctoral fellowship from the "University of Lleida". DdGC (Miguel Servet 2020: CP20/00041) and MM (PFIS: FI21/00187) have received financial support from the Instituto de Salud Carlos III, co-funded by the European Social Fund (ESF)/"Investing in your future".