The composition of the lung microbiome differs between patients with dermatomyositis and rheumatoid arthritis associated with interstitial lung disease

Dermatomyositis and rheumatoid arthritis are inflammatory diseases that affect the skeletal muscles and joints, respectively. A common systemic complication of these diseases is interstitial lung disease (ILD), which leads to a poor prognosis and increased mortality. However, the mechanism for the initiation and development of ILD in patients with dermatomyositis is currently unknown. In the present study, we used 16S rRNA high‐throughput sequencing to profile the bacterial community composition of bronchoalveolar lavage fluid of patients with dermatomyositis associated with ILD (DM‐ILD; shortened to DM below), rheumatoid arthritis associated with ILD (RA‐ILD; shortened to RA below) and healthy controls (N) aiming to understand the differences in their lung microbiota and to predict gene function. We found that there were more operational taxonomic units (OTUs) in the lung microbiota of both RA and DM compared to N, although there was no significant difference in the number of OTUs between RA and DM. Similarly, the diversity in alphaproteobacteria differed between RA and DM compared to N, but not between RA and DM. The lung microbiota of RA, DM and N was mainly comprised of five phyla: Firmicutes, Bacteroidetes, Proteobacteria, Actinobacteria and Fusobacteria, with 10 dominant genera. Despite the similarity in microbiota composition, we also identified 41 OTUs of lung microbiota that differed among RA, DM and N. Additionally, linear discriminant analysis effect size and linear discriminant analysis genus scores confirmed that 31 microbial biomarkers were clearly distinguished among RA, DM and N. The functional and metabolic alterations of the lung microbiota among RA, DM and N were predicted using picrust, and differentially abundant KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways were identified. Research on the lung microbiota of patients with DM and RA may open new opportunities for developing biomarkers to identify high‐risk patients.

Dermatomyositis [1] is a type of idiopathic inflammatory myopathy that is characterized by the inflammation of the skeletal muscle and skin and involves a characteristic heliotrope skin rash and Gottron's papules [2]. Interstitial lung disease (ILD) is considered a common systemic complication of DM [3]. DM associated with ILD (DM-ILD) is one of the major extra muscular manifestations contributing to increased morbidity and mortality [4]. ILD is also one of the life-threatening complications of clinically amyopathic dermatomyositis. The overall prognosis is grave, with a 33%-67% 6-month mortality despite aggressive immunosuppressive therapy [5][6][7][8][9]. However, the mechanism of the initiation and development of disease progression in DM-ILD is not understood.
Subsequent to the first culture-independent report of the microbiota in the lower respiratory tract [10], accumulating evidence supports potential functional roles for the lung microbiota [11,12]. Dysbiosis of the lung microbiota can underlie lung diseases such as cystic fibrosis [13], asthma [10], chronic obstructive pulmonary diseases [14], bronchiectasis [15], sarcoidosis [16] or even lung cancer [17]. Recently, emerging evidence has shown that the increased bacterial burden of potentially pathogenic bacteria may lead to disease progression in idiopathic pulmonary fibrosis [18,19]. We are also now beginning to understand the composition of the lung microbiota in other ILDs associated with rheumatoid arthritis (RA) [20]. However, the potential contribution of the lung microbiome in DM-ILD has never been investigated.
The present study aimed to characterize the lung microbiota in patients with DM-ILD (DM group) and compare its composition and diversity with the results obtained from patients with RA (RA group) and historic healthy controls (N group).

Participant recruitment and sample collection
The study was approved by the Ethics Committee of Renji Hospital, School of Medicine, Shanghai Jiaotong University (2016075) and written informed consent was obtained from all participants. The study methodologies conformed to the standards set by the Declaration of Helsinki. All participants consented to a bronchoscopy examination at Renji Hospital. Medical history was collected from all participants and a set of routine pre-procedure tests, including physical examination, electrocardiogram, pulmonary function testing, computed tomography, routine blood count and blood coagulation function analysis, were carried out. Patients in the DM group were diagnosed with dermatomyositis in accordance with the classification criteria proposed by Bohan and Peter [21,22]. Patients in the RA group were diagnosed with rheumatoid arthritis according to the 1987 ACR criteria [23]. The diagnosis of interstitial lung disease was based on respiratory symptoms and the presence of bibasilar infiltrates on high-resolution computed tomography. The criteria for selecting healthy control participants were: good physical status, no significant respiratory conditions, and normal findings on pre-procedure examinations and bronchoscopy. The study methodologies conformed to the standards set by the Declaration of Helsinki.
Bronchoscopy was performed as previously described [24]. Bronchoscopy via the nasal route was performed to obtain bronchoalveolar lavage fluid (BALF) samples from patients in the RA and DM groups and one BALF sample from each control. All samples were immediately frozen and maintained at À80°C until further DNA extraction.

DNA extraction, 16S rRNA amplification and sequencing
All BALF samples were subjected to the same procedures for DNA extraction and PCR amplification by the same laboratory staff. The BALF sample was centrifuged at 17 465 g for 30 min at 4°C, and then, the pellet was suspended in 790 lL of sterile lysis buffer (4 M guanidine thiocyanate; 10% N-lauroyl sarcosine; 5% N-lauroyl sarcosine-0.1 M phosphate buffer, pH 8.0) in a 2-mL screwcap tube containing 1 g of glass beads (0.1 mm; BioSpec Products Inc., Bartlesville, OK, USA). This mixture was vortexed vigorously and then incubated at 70°C for 1 h. After incubation using bead beating for 10 min at maximum speed, DNA was extracted in accordance with the manufacturer's instructions for bacterial DNA extraction using The E.Z.N.A.ÒStool DNA Kit (Omega Bio-tek Inc., Norcross GA, USA), with the exception of the lysis steps, and stored at À20°C for further analysis. The blank reagents were set as a negative control to avoid contamination in the extraction progress. The extracted DNA from each sample and the negative control were used as the template to amplify the V3-V4 region of 16S rRNA genes.
The primers F1 and R2 (5 0 -CCTACGGGNGGCWGC AG-3 0 and 5 0 -GACTACHVGGGTATCTAATCC-3 0 ) corresponding to positions 341-805 in the Escherichia coli 16S rRNA gene were used to amplify the V3-V4 region of each BALF sample by PCR. The PCR reagents were set as a negative control for the PCR step. PCR reactions were run in an EasyCycler 96 PCR system (Analytik Jena Corp., Jena, Germany) using the program: 2 min of denaturation at 95°C, followed by 25 cycles for 30 s at 95°C (denaturation), 30 s for annealing at 55°C and 30 s at 72°C (elongation), with a final extension at 72°C for 5 min. The products from different samples were indexed and mixed at equal ratios for sequencing by Shanghai Mobio Biomedical Technology Co. Ltd (Shanghai, China) using the Miseq platform (Illumina Inc., San Diego, CA, USA) in accordance with the manufacturer's instructions.

Bioinformatics analysis for sequencing data
Clean data were extracted from the raw data using USE-ARCH, version 8.0 with the following criteria: (a) Sequences of each sample were extracted using each index with zero mismatches; (b) sequences with an overlap of < 50 bp were discarded; [18] the error rate of overlap > 0.1 was discarded; and (c) sequences < 400 bp after merge were discarded. Operational taxonomic units (OTUs) were classified based on 97% similarity after chimeric sequences were removed using UPARSE, version 7.1 (http://drive 5.com/ uparse). The phylogenetic profile of each 16S rRNA gene sequence was analyzed via RDP CLASSIFIER (http://rdp.cme. msu.edu) against the Silva (SSU123) 16S rRNA database with a confidence threshold of 70%.
Sample diversity metrics were assessed based on the nonparametric Shannon-Wiener diversity index, Chao index and abundance-based coverage estimators index [25]. The Bray-Curtis distance was calculated in QIIME (http://qiime. org). The QIIME pipeline was also used to generate principal coordinate analysis (PCoA) plots to visualize Bray-Curtis dissimilarity. The Mann-Whitney test and Kruskal-Wallis rank sum test were used to test for statistical significance between and among the bacterial types in different groups (QIIME package). Adonis analysis was used to estimate the significance between different groups. The linear discriminant analysis effect size (LEfSe) was used to detect taxa with differential abundance among groups. Bar plots, PCoA plots and Venn diagrams were all generated in R (http://www.R-project.org).
The Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUST) (http:// picrust.github.io/picrust) predicts the metabolic functions of bacterial flora and 16S rRNA gene sequences in the Kyoto Encyclopedia of Genes and Genomes (KEGG). PICRUST recaptures key findings from the Human Microbiome Project by an extended ancestral state reconstruction algorithm and accurately predicts the abundance in host-associated communities of the gene families, with quantifiable uncertainty.
Raw sequencing data of the 16S rRNA gene V3-V4 regions and accompanying information are available in the Sequence Read Archive database (https://www.ncbi.nlm. nih.gov/sra) under accession number PRJNA714758 (RA, DM) and SRP110884 (healthy controls).

Results
In total, 59 BALF samples were prospectively collected. After rigorous diagnosis and exclusion procedures, 46 samples were included for analysis, including 19 RA, seven DM and 18 N from the Renji Hospital.
All samples that belonged to different groups were used to characterize the differences in lung microbiota of the study participants ( Fig. 1).

Baseline characteristics
The present study included 18 healthy controls, seven patients with DM-ILD and 19 patients with RA-ILD. Serological profiling of each patient, including Creactive protein (CRP), erythrocyte sedimentation rate (ESR), interleukin (IL)-2 receptor, IL-6 and IL-8, was performed using standard methods. The clinical characteristics are presented in Table 1.

Characteristics of 16S rRNA gene sequencing in BALF samples
Bacterial 16S rRNAs were detected in all BALF specimens. Both negative controls (blank reagents) from the extraction and PCR steps showed no bands in the agarose gel; thus, all of the PCR products were from the samples, and there was no contamination from reagents or the environment.
Over 1 692 237 high-quality sequencing reads were obtained from the 46 specimens, with an average of 36 787.8 per sample. The Good's coverage of each sample was > 99.90%, indicating that the sequencing reads identified represent the majority of bacteria in each of the BALF samples.
The diversity of lung microbiota among RA, DM and N The mean AE SE number of OTUs was 214.84 AE 20.20 in the BALF samples from the RA group, which was higher than that of the N group (75.83 AE 10.47, P < 0.0001). There were 202.86 AE 31.80 OTUs in the BALF samples from the DM group, which was higher than that of the N group (75.83 AE 10.47, P = 0.001). However, there was no significant difference in the OTUs of BALF samples between the RA and DM groups (P = 0.82). Among the three groups, the angiotensin-converting enzyme (ACE) index was higher in RA than in N (254.49 AE 18.44 vs. 104.37 AE 13.97, P < 0.001) and it was higher in DM than in N (249.71 AE 28.03 vs. 104.37 AE 13.97, P < 0.001). There were no significant differences in the ACE index between the RA and DM groups (P = 0.65). The Chao indices were also higher in RA than in N (202. 20  A Venn diagram showed that 282 of the 1262 OTUs were common to all the groups, whereas 377 were unique to RA and 137 were unique to DM (Fig. 3A).
Notably, compared to the N group, there were many more OTUs increased in RA and DM, implying that there were microbial differences among RA, DM and N.
To compare the overall lung microbiota composition among the three groups, PCoA analysis based on the Bray-Curtis distance and weighted unique fraction distance according to the OTUs of each sample was conducted. Overall, in the total variation of all BALF samples, the first principal component (PC1) could account for 42.07% and PC2 could account for 6.27%. As shown in Fig. 3B, the PCoA revealed a difference among RA, DM and N with respect to lung bacterial composition (P = 0.0015, Adonis analysis).
The composition of lung microbiota in RA, DM and N At the phylum level, the lung microbiota mainly comprised five phyla (Fig. 4A); at the genus level, the lung microbiota mainly comprised 10 dominant genera (Fig. 4B); and the relative abundances for each phylum were comparable in RA, DM and N (mean AE SE) ( Table 2).
Despite the above similarity, the present study also identified 41 OTUs of lung microbiota that differed among RA, DM and N ( Fig. 5 and Table 2 Acinetobacter (OTU 943, OTU 1122) were lower in RA than in DM, and other genera in 41 OTUs were higher than in DM (Fig. 5).
Crucial bacteria of the lung microbial communities in RA, DM and healthy control groups An LEfSe analysis and the linear discriminant analysis (LDA) genus score (Fig. 6) confirmed that 31 Fig. 1. Study design and flow diagram. In total, 59 BALF samples were collected prospectively. After rigorous diagnostic and exclusion procedures, 46 samples were included for analysis, including 19 RA, seven DM and 18 N. In this discovery cohort, we used 16S rRNA high-throughput sequencing to characterize the lung microbiota between RA, DM, and N. microbial biomarkers were clearly distinguished among RA, DM and H. Moreover, the divergence among groups was highly significant (P < 0.05). Biomarker names, LDA scores, log values and P-values are provided in the Supporting information.

Gene function analysis in different groups
To elucidate the functional and metabolic alterations of the lung microbiomes among the RA, DM and N groups, the gene function prediction was inferred from the 16S rRNA data, and the functional potential of the lung microbiota was analyzed using PICRUST. The  differentially abundant KEGG pathways among the RA, DM and N groups were identified by LEfSe (Fig. 7). Biomarker names, LDA scores, log values and P-values are provided in the Supporting information. Seven KEGG pathways including ribosome, DNA repair and recombination proteins, and ribosome biogenesis were enriched in RA (P < 0.05, LDA > 2.5). Twelve pathways including purine metabolism, other ion coupled transporters and aminoacyl tRNA biosynthesis were enriched in DM (P < 0.05, LDA > 2.5). Seventeen pathways including sporulation L3 transcription factors, general function prediction only, methane metabolism, bacterial chemotaxis, arginine and proline metabolism, starch and sucrose metabolism, and pentose and glucuronate interconversions were enriched in the N group.

Discussion
Microbial-immune cell interactions can form an adaptive immune response in hosts, and the contribution of microorganisms to the pathogenesis of autoimmune diseases is credible and possible. Recently, an increasing number of studies have used 16S rRNA gene pyrosequencing technology to explore the lower and upper respiratory tract microbiota in health and disease [26,27]. A deeper understanding of the role of lung microbiota as a mediator of inflammation has recently emerged. Microbes including viruses, bacteria and environmental fungi have long been hypothesized to play a role in the pathogenesis of ILDs. The findings from novel relevant studies have revealed that the human distal airways harbor several bacterial species constituting a unique ecological community [28] and also that an increased bacterial burden and/or an abundance of potentially pathogenic bacteria are associated with disease progression, acute exacerbations and mortality in idiopathic pulmonary fibrosis [19,29,30]. A recent study also investigated the lung microbiota composition in patients with early rheumatoid arthritis and found reduced bacterial species diversity compared to controls [20]. The present study demonstrates that lung microbiota may play a different role in inflammatory lung disease. As a highly progressive disease, management of DM-ILD remains a challenge. Although DM-ILD etiologies include environmental factors such as infection, malignancy and drug toxicities from statins or immune checkpoint inhibitors [31], their role in the pathogenesis of DM-ILD initiation or progression remains poorly understood. To our knowledge, the present study is the first attempt to investigate the lung microbiome in this disease.
In our study, we profiled the differences in lung microbiota of RA, DM and healthy controls using 16S rRNA high-throughput sequencing. We showed that, compared to healthy controls, the alpha diversity in terms of both the evenness and richness is higher in patients with DM and RA, although there were no differences between DM and RA. As previously reported, the a-diversity of the lung microbiome in RA was decreased compared to healthy controls [20], and other research on mucosal sites has reported similar findings [32][33][34]. However, our results conflicted with these findings, possibly because bacteria dysbiosis caused a decline in lung function, which increased the communication between the lung and the environment. Furthermore, more taxa occupied the lung's ecological niche, and these taxa supplied a much more suitable environment for pathogenic bacteria. However, much more research is needed on this phenomenon. Importantly, the increased taxa were not shared among RA, DM and healthy controls. Indeed, some of the shared taxa between RA and DM had a different abundance, such as the genus Prevotella, which was decreased in DM. Some species of Prevotella, such as Prevotella copri, were considered pathogenic [35], and various species belonging to the Prevotella genus possess a different functional potential and therefore impact the clinical outcome differentially [35], including the generation of a 27-kDa protein by DR-presentation from P. copri, which could stimulate T-helper-cell 1 immune responses [36] and suppressed arthritis in humanized HLA-DQ8 mice [37]. The Streptococcus genus was more abundant in samples from lung tissues and bronchoscopy in patients with cancer [38]. In the present study, there was no significant difference in Streptofocus between RA, DM and healthy controls, which may potentiate cancerous processes. Corynebacterium is a common species in the skin microbial community [39], acting as the key stabilizer taxon or commensal in healthy skin, and this species might be replaced or inhibited once the competing network is disrupted by other pathogens [40]. Here, we saw more Corynebacterium in DM than in RA and healthy controls, and these findings in line with prior studies [40]. The taxa that we observed in RA, DM and healthy controls were similar to those reported previously, and there may be many unknown functions for each taxon [40].
Although the present study predicts the function of the lung microbiota, validation studies are still needed.
A combination of descriptive and hypothesis-driven research is essential for identifying potential microbial targets preclinically and establishing the treatment of autoimmune diseases. The main limitation of the present study is the relatively small sample size. Our preliminary findings require confirmation in larger studies. Importantly, our study revealed the differences in lung microbiota and prediction functions among RA, DM and healthy controls. We did not collect data on factors such as living environment, lifestyle and dietary habits, which can cause changes in lung microbiota. Therefore, a correlation between microbiota and such factors is lacking. Also, we knew that the greater the number of samples collected, the better the statistical power of the study. However, BALF sample collection was difficult to implement. Because bronchoscopy is invasive, patients with RA or DM are not willing to undergo bronchoalveolar lavage if there is no obvious lung CT abnormality; accordingly, it is difficult to include RA or DM patients without interstitial lung disease as control participants. To better elucidate the causal relationship between the occurrence of DM and RA with the lung microbiota, samples will be collected longitudinally for subsequent in-depth studies, which will shed further light on the mechanisms of autoimmune disease development.
In conclusion, we profiled the differences in the lung microbiota of patients with RA, DM and healthy controls using 16S rRNA high-throughput sequencing. Our data demonstrate that the a-diversity of lung microbiota in patients with DM and RA was higher than that of healthy controls, although there was no difference between DM and RA. However, comparisons of DM and RA showed much more varied genera, with 14 genera more prevalent in DM than in RA, such as Lachnospiraceae incertae sedis, Prevotella, Leptotrichia, Atopobium, Campylobacter, Acinetobacter, Corynebacterium, Blautia, Lactobacillus, Stenotrophomonas, Achromobacter, Aeromonas, Delftia and Acinetobacter. As the differences in microbiota between DM and RA based on LEfSe suggest, we cannot discount the role of these bacteria in lung disease. Gene function analysis has shown that DM and RA have different metabolic pathways. Although we investigated lung microbiota and gene function prediction, we note that the present study only described lung microbiota in patients with RA and DM, and therefore causality cannot be inferred. However, further studies are needed to elucidate the role of the lung microbiota and its potential association with lung disease. Research on the lung microbiome and lung disease may also open new opportunities for developing biomarkers to identify high-risk patients.