Single‐cell analysis identified lung progenitor cells in COVID‐19 patients

Abstract Objectives The high mortality of severe 2019 novel coronavirus disease (COVID‐19) cases is mainly caused by acute respiratory distress syndrome (ARDS), which is characterized by increased permeability of the alveolar epithelial barriers, pulmonary oedema and consequent inflammatory tissue damage. Some but not all patients showed full functional recovery after the devastating lung damage, and so far there is little knowledge about the lung repair process. We focused on the crucial roles of lung progenitor cells in alveolar cell regeneration and epithelial barrier re‐establishment and aimed to uncover a possible mechanism of lung repair after severe SARS‐CoV‐2 infection. Materials and methods Bronchoalveolar lavage fluid (BALF) of COVID‐19 patients was analysed by single‐cell RNA‐sequencing (scRNA‐seq). Transplantation of a single KRT5+ cell‐derived cell population into damaged mouse lung and time‐course scRNA‐seq analysis were performed. Results In severe (or critical) COVID‐19 patients, there is a remarkable expansion of TM4SF1+ and KRT5+ lung progenitor cells. The two distinct populations of progenitor cells could play crucial roles in alveolar cell regeneration and epithelial barrier re‐establishment, respectively. The transplanted KRT5+ progenitors could engraft long‐term into host lung and differentiate into HOPX+ OCLN+ alveolar barrier cells, which restored the epithelial barrier and efficiently prevented inflammatory cell infiltration. Conclusions This work uncovered the mechanism by which various lung progenitor cells work in concert to prevent and replenish alveoli loss after severe SARS‐CoV‐2 infection.


| INTRODUCTION
However, severe virus infection leads to diffuse alveolar damage (DAD) characterized by apoptosis, desquamation of alveolar epithelial cells and infiltration of inflammatory cells into the alveolar cavity, which could eventually lead to hypoxaemia, pulmonary tissue fibrosis and death. Some but not all patients showed full functional recovery after the devastating lung damage, and so far there is little knowledge about the lung repair process. 1 Hyperplasia of type II alveolar cells (ATII) was also noted in most cases, which could suggest an ongoing regenerative process mediated by ATII lung progenitor cells. [2][3][4][5] Recent animal studies demonstrated the existence of multiple stem/progenitor populations in the lung, which can potently ameliorate pulmonary infection and inflammation and regenerate damaged bronchial and alveolar tissues. A rare population of Wnt-responsive ATII cells is regarded as the major facultative progenitors, 6,7 and can be specifically marked by TM4SF1 expression in human lung. 8 Distal airway stem cells (DASCs) with potential in stem cell-based therapies for acute and chronic lung diseases have also been identified. [9][10][11][12] They express SOX9, TP63 and KRT5, all known to play pivotal roles in the maintenance of stem/progenitor cell states and epithelial differentiation in multiple organ types, both in development and during adulthood. [13][14][15][16][17] These DASCs were able to respond to injury induced by virus infection and assist airway and alveolar epithelial regeneration. 12 These properties suggest a repair mechanism in COVID-19 patients in which multiple stem/progenitor populations work in concert.

| ScRNA-seq analysis of BALF cells
Public data sets (GEO: GSE145926) 18 and (GEO: GSE128033) 19, which contain scRNA-seq data from BALF cells of three patients with moderate COVID-19 (M1-M3), six patients with severe/critical COVID-19 (S1-S6), three healthy controls (HC1-HC3) and one fresh BALF sample (GSM3660650) from a lung transplant donor (HC4), were used for bioinformatic analysis. Seurat v.3 was used for quality control. The following criteria were then applied to each cell of all […] Epithelial cells were re-clustered with Seurat v.3. Epithelial cells of all samples were re-clustered using the same parameters as in the clustering step, with the resolution parameter set to 0.3. Immune cell clusters were removed, and the remaining cells were re-clustered using the same parameters as in the clustering step, with the resolution parameter set to 0.4. MAST 20 in Seurat v.3 (FindAllMarkers function) was used to perform differential gene expression analysis. For each cluster of epithelial cells, differentially expressed genes (DEGs) were generated relative to all of the other cells. A gene was considered significant with adjusted P < .05 (P-values were adjusted by false discovery rate in MAST).
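The significance rule above (adjusted P < .05 after false discovery rate correction) can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure, which the text does not name explicitly; the function names `benjamini_hochberg` and `significant_genes` are hypothetical and are not part of MAST or Seurat.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the FDR adjustment is Benjamini-Hochberg; the source only
    says "false discovery rate".
    """
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def significant_genes(gene_p_values, alpha=0.05):
    """Return genes whose adjusted p-value is below alpha.

    gene_p_values: dict mapping gene name -> raw p-value (illustrative input).
    """
    genes = list(gene_p_values)
    adjusted = benjamini_hochberg([gene_p_values[g] for g in genes])
    return [g for g, q in zip(genes, adjusted) if q < alpha]
```

In practice the adjusted values would come directly from the differential expression tool; the sketch only shows what the threshold is applied to.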
For analysis of engrafted GFP+ cells, single cells were captured and barcoded in a 10× Chromium Controller (10× Genomics).
Subsequently, RNA from the barcoded cells was reverse-transcribed and sequencing libraries were prepared using Chromium Single-Cell

| Gene functional annotation
Gene Ontology (GO) enrichment analysis and Gene Set Enrichment Analysis (GSEA) of differentially expressed genes were implemented with the clusterProfiler R package. 21 GO terms with corrected P-value < .05 were considered significantly enriched among differentially expressed genes. Enriched terms were visualized as dot plots with the enrichplot R package. For hypoxia gene analysis, the hallmark gene sets in MSigDB 22 were used for annotation.
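As a rough illustration of what a GO over-representation P-value measures, the sketch below computes a one-sided hypergeometric tail probability. It is a simplified stand-in, not the clusterProfiler implementation; the function name and the argument names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_in_term, n_selected, n_overlap):
    """P(X >= n_overlap) for a hypergeometric draw.

    n_background: number of genes in the background set
    n_in_term:    background genes annotated to the GO term
    n_selected:   differentially expressed genes drawn from the background
    n_overlap:    selected genes that are annotated to the term
    """
    total = comb(n_background, n_selected)
    upper = min(n_in_term, n_selected)
    # Sum the probability of every overlap at least as large as observed.
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

Each term's raw P-value would then be corrected for multiple testing before applying the < .05 cut-off described above.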

| Single-cell trajectory analysis
To construct the single-cell pseudotime trajectory and to identify genes that change as the cells undergo transition, the Monocle2 (version 2.4.0) algorithm 23 was applied to our data sets. Genes for ordering cells were selected if they were expressed in ≥1% of cells, their mean expression value was ≥0.3 and their empirical dispersion value was ≥1. Using the 'DDRTree' method, the data were reduced to two dimensions, and the cells were then ordered along the trajectory.
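The three gene-selection thresholds can be sketched as a simple filter. This is an illustration only, not Monocle2 code: the function name, the dictionary inputs and the assumption that "expressed" means a value greater than zero are all hypothetical, and the empirical dispersion is taken as a precomputed input rather than estimated here.

```python
def select_ordering_genes(expression, dispersion_empirical,
                          min_cell_fraction=0.01, min_mean=0.3,
                          min_dispersion=1.0):
    """Return genes passing the three ordering-gene thresholds.

    expression:           dict gene -> list of per-cell expression values
    dispersion_empirical: dict gene -> precomputed empirical dispersion
    """
    selected = []
    for gene, values in expression.items():
        n_cells = len(values)
        if n_cells == 0:
            continue
        # Assumption: a cell "expresses" a gene when its value is above zero.
        fraction_expressing = sum(v > 0 for v in values) / n_cells
        mean_expression = sum(values) / n_cells
        if (fraction_expressing >= min_cell_fraction
                and mean_expression >= min_mean
                and dispersion_empirical.get(gene, 0.0) >= min_dispersion):
            selected.append(gene)
    return selected
```

The selected genes would then be passed to the dimensionality reduction and ordering steps.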

| Correlation analysis of BALF cells and engrafted cells
The scHCL Model 24 was used to assess the similarity between

| Cell culture
Mouse lung P63+ KRT5+ progenitor cells were isolated and cultured as previously described. 12 Briefly, lung lobes were collected and processed into a single-cell suspension using protease, trypsin and DNaseI. Dissociated cells were passed through 70-μm nylon mesh, washed twice with cold F12 medium and then cultured on feeder cells (irradiated 3T3-J2 feeder cells) in culture medium including DMEM/F12, 10% FBS (Hyclone), Pen/Strep, amphotericin and a growth factor cocktail as previously described. Cells were grown in a humidified atmosphere of 7.5% (v/v) CO2 at 37°C. To generate a monoclonal cell population, cells were dissociated and diluted into a single-cell suspension. A single cell was aspirated and isolated by pipette under a microscope and then transferred into a 96-well plate for expansion.

| FACS sorting of engrafted GFP+ single cells
Transplanted lung was collected and immersed in cold F12 medium with 5% FBS, then minced into small pieces and digested with dissociation buffer (F12, 1 mg/mL protease, 0.005% trypsin and 10 ng/mL DNaseI) on a shaker at 37°C for 1 hour. Dissociated cells were filtered through a 100-μm cell strainer, and Red Blood Cell Lysis Buffer was used to remove erythrocytes. Cell pellets were washed twice, resuspended in DMEM containing 1% FBS and then passed through 30-μm strainers. Sorting and subsequent quantification were performed on BD FACS Arial cytometers. GFP+ cells were gated using SSC-A vs FSC-A, FSC-H vs FSC-W and SSC-H vs SSC-W gates, followed by an SSC-A vs FITC-A gate.

| Statistics
Differences in median percentage between the healthy control, moderate and severe groups for all cell types, for TM4SF1+ AGER+ cells among all TM4SF1+ cells and for BALF cells annotated as mature ABCs were compared using a Student's t test (two-sided, unadjusted for multiple comparisons) with R ggpubr v.0.2.5. Differences in gene expression levels between the healthy control, moderate and severe groups were compared using MAST in Seurat v.3. A gene was considered significant with adjusted P < .05 (P-values were adjusted by false discovery rate in MAST).
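For orientation, the statistic behind a two-sample Student's t test can be sketched as follows. The sketch assumes the classic equal-variance (pooled) form and returns only the t statistic and degrees of freedom; the two-sided P-value would be read from the t distribution with a statistics library. The function name is hypothetical and this is not the ggpubr code path.

```python
from math import sqrt
from statistics import mean, variance


def student_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for a pooled-variance two-sample t test.

    group_a, group_b: per-sample percentages for two groups
    (each group needs at least two values).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df
```

With only three to six samples per group, as in this data set, the degrees of freedom are small, so the t distribution has heavy tails and large absolute t values are needed to reach significance.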

| RESULTS
To fully elucidate the epithelial damage and repair mechanism, we analysed the single-cell transcriptomic profile of lung BALF to quantify the major events post-infection and focused on structural epithelial cells. BALF sampling is a useful technique for examining the human lung, providing landscape information on the whole lower respiratory tract. The current study was based on public scRNA-seq data sets on BALF cells from three patients with moderate COVID-19 (M1-M3), six patients with severe/critical infection (S1-S6) and four healthy controls (HC1-HC4). 18,19 Firstly, we performed unsupervised clustering analysis on the whole data set to separate EPCAM+/TPPP3+/KRT18+ epithelial cells from other cell types (mostly immune cells) in the BALF (Figure S1A,B). Re-clustering analysis identified 12 epithelial cell clusters, four of which co-expressed immune markers and could be epithelial cells engulfed by leucocytes (Figure S1C […] Figure 1E). This phenomenon was not obvious in moderate COVID-19 patients, which was also consistent with previous pathological observation. 25 Therefore, the number of alveolar cells (or the alveolar marker gene expression level) in BALF could be used clinically to measure the structural integrity of the lung and could serve as an index of disease severity for COVID-19 patients.
In the BALF of patients with severe infection, we also found significantly higher proportions of progenitor cell clusters (Cluster 7) (Figure 1B-D). Multiple stem/progenitor cell populations have been reported to play critical roles in damage repair after various types of acute lung injury. 26 Among them, a rare population of Wnt-responsive ATII cells is regarded as the major facultative progenitors, 6,7 which can be specifically marked by TM4SF1 expression in human lung. 8 In the current study, we found that in patients of the severe group, the number of TM4SF1+ cells increased remarkably, which implied rapid activation of such progenitor cells by tissue damage (Figure 2A,B). Consistently, we observed co-expression of TM4SF1 and the mature alveolar cell markers AGER (also known as RAGE) and SPC in a substantial proportion of TM4SF1+ cells (7.45% on average) in patients of the severe group (Figure 2C). These results […] To examine the process whereby KRT5+ progenitor cells restore the mature alveolar barrier in injured lung, we isolated mouse KRT5+ progenitor cells (previously also named distal airway stem cells) 12 for a transplantation assay as described in Figure 3A. Briefly, the cell population was trypsinized into a single-cell suspension and a single-cell-derived pedigree clone was propagated, followed by GFP labelling by lentiviral infection for further analysis. Immunostaining showed that all cultured GFP+ KRT5+ progenitor cells expressed the KRT5 and P63 marker genes (Figure 3B). Then, we transplanted the cultured […] Claudin7, Claudin4, AQP3, etc) whose function was related to lung morphogenesis, tight junction assembly and water homeostasis (Figure S3B,C).
To further dissect the lineage relationship between different clusters of engrafted cells, we performed Monocle pseudotime analysis on the scRNA-seq data. The result indicated that the P63+ KRT5+ progenitors could differentiate into P63− KRT5+ progenitors and immature ABCs, which eventually give rise to OCLN+ mature ABCs (Figure 4A,B). Consistent with the pseudotime analysis, we noticed that, compared to the 30-day engrafted cells, the 90-day engrafted cells had relatively more mature ABCs, fewer immature ABCs and fewer P63− KRT5+ progenitors (Figure 4C). Altogether, these data revealed that the tight alveolar barrier would be gradually established by KRT5+ progenitor cell differentiation.
Next, we asked which molecular signalling pathways were involved in the establishment of the alveolar barrier. Previous studies indicated that Notch signalling is critical for activation of P63+ KRT5+ progenitors in lung, but persistent Notch signalling prevents further differentiation of cells. 29 Consistently, here we found that the expression of multiple Notch pathway component genes was up-regulated in P63+ KRT5+ progenitors but gradually down-regulated as the cells differentiated into mature ABCs (Figure 4D). In addition, sensing of low oxygen levels through the hypoxia pathway is known to be critical for the expansion of KRT5+ progenitors. 34 Here, we found that the expression of hypoxia pathway component genes was relatively low in P63+ […] Figure 4F). Why, then, do these four patients have many more mature ABCs than the others? Interestingly, we found that the individuals' mature ABC numbers were generally positively correlated with their FCN1+ macrophage numbers in BALF (P = .002), and patients S2, S4, S5 and S6 had many more FCN1+ macrophages in BALF than most of the other individuals (Figure 4G). FCN1+ macrophages were reported to be highly pro-inflammatory and responsible for the tissue damage in COVID-19 patients. 18 Therefore, it seems that the establishment of a new alveolar barrier was closely associated with the severity of tissue damage and inflammation at the individual level.

| DISCUSSION
Altogether, our current studies uncovered a possible mechanism of lung repair after severe SARS-CoV-2 infection. In severe patients, the virus infection led to alveolar cell death and leakage of the epithelial barrier, which resulted in a massive flood of protein-enriched fluid and leucocytes into the alveolar cavity and finally led to pulmonary oedema and tissue hypoxia. In this situation, as repair backups, KRT5+ and TM4SF1+ progenitor cells could be activated by hypoxia and multiple other microenvironmental signals to expand, migrate and differentiate into new functional cells.
Eventually, the KRT5+ progenitor-derived ABCs could rapidly restore new epithelial barriers to cover the denuded alveoli and seal the leakage. Simultaneously, the TM4SF1+ progenitor cells could gradually regenerate new functional ATII and ATI cells. The synergistic action of the two types of progenitor cells could enable timely repair of alveoli in an acute injury scenario (summarized in Figure 4H). Of course, there could be more resident or circulating stem/progenitor cells working in concert with them to achieve maximal repair.
Of note, our current studies were based on limited sample data. Future studies in a larger patient cohort could further our understanding of the repair process in COVID-19 patients and facilitate the development of potential progenitor cell-based therapeutic strategies.

[…] Zuo. We thank Prof. Zheng Zhang's group for making their original COVID-19 BALF scRNA-seq data set publicly available. We salute the medical workers who sacrificed their lives in the fight against the COVID-19 pandemic.

CONFLICT OF INTEREST
All authors declare no conflicts of interest.

AUTHOR CONTRIBUTIONS
WZ and TZ designed the study. ZXZ, YZ, YQZ and XFW were involved in data collection and analysis. ZXZ, YZ, YQZ and WZ prepared the manuscript.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.