Interplay between SARS‐CoV‐2 and human long non‐coding RNAs

Abstract Long non‐coding RNAs (lncRNAs) play a critical regulatory role in the host response to viral infection. However, little is understood about the transcriptome architecture, especially the lncRNA expression pattern, during SARS‐CoV‐2 infection. In the present study, using publicly available RNA sequencing data of bronchoalveolar lavage fluid (BALF) and peripheral blood mononuclear cell (PBMC) samples from COVID‐19 patients and healthy individuals, three interesting findings are highlighted: (a) More than half of the interactions between lncRNAs and protein‐coding genes (PCGs) in BALF samples were established by three trans‐acting lncRNAs (HOTAIRM1, PVT1 and AL392172.1), which also exhibited a high affinity for binding to the SARS‐CoV‐2 genome, suggesting a major regulatory role for these lncRNAs during SARS‐CoV‐2 infection. (b) The lncRNAs MALAT1 and NEAT1 possibly contribute to the development of inflammation in SARS‐CoV‐2‐infected cells. (c) In contrast to the 3′ part of the SARS‐CoV‐2 genome, the 5′ part can interact with many human lncRNAs. Therefore, mRNA‐based vaccines will not show side effects caused by off‐target interactions with human lncRNAs. Overall, the putative functions of lncRNAs hold promise for designing non‐coding RNA‐based drugs and for inspecting the efficiency of vaccines to overcome the current pandemic.

base-pairing to either stabilize mRNAs and enhance or inhibit their translation. 2 The role of lncRNAs in viral infection, including the initiation and progression of infectious diseases, has been recently reported. RNA sequencing of SARS-CoV- and influenza A-infected lung tissues of mice also demonstrated the key roles of lncRNAs in respiratory virus pathogenesis via stimulating interferon (IFN) production. 3 In our recent work, we found that the miR-29 family has the most binding sites (11 sites) on the SARS-CoV-2 genome. 4 However, to our knowledge, there is no report investigating the physical interaction of differentially expressed human lncRNAs with SARS-CoV-2. In the present study, using the available transcriptomic data obtained from the peripheral blood mononu-

| Data collection and processing
The raw RNA sequencing data of 12 Chinese individuals (PBMC and BALF) deposited at the Beijing Institute of Genomics (BIG) Data Center (accession number: CRA002390) were used in the present study. 5 After checking the read quality and trimming, reads were mapped to the human genome (hg38) using STAR (V. 2.7.2b) with the ENCODE standard options. 6 Then, the count matrix was generated, and differentially expressed genes were identified using the edgeR package (V.3.7). 7 Genes with a read count greater than 15 were retained and normalized to counts per million (CPM). For the BALF data analysis, we summed the read counts from the two technical replicates of each COVID-19 patient to create an object with a single column of read counts per patient sample. Genes with |log2 fold change| > 1 and a false discovery rate (FDR) below 0.05 were considered significantly differentially expressed and retained for further analysis. Genes with the biotypes processed_transcript, pseudogene, lincRNA, 3prime_overlapping_ncrna, antisense, sense_intronic and sense_overlapping were considered lncRNAs for further analysis.
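The replicate merging, CPM scaling and significance thresholds described above can be sketched in a few lines. This is only an illustrative sketch in plain Python, not the edgeR workflow used in the study: the function names, the sample labels and the toy counts are hypothetical, and the CPM shown here is the plain library-size version without any normalization factors.

```python
def sum_technical_replicates(counts, replicate_groups):
    """Collapse technical replicates by summing raw counts per gene.

    counts: {gene: {column: raw_count}}
    replicate_groups: {sample: [column, ...]} (hypothetical sample naming)
    """
    return {
        gene: {
            sample: sum(cols[c] for c in members)
            for sample, members in replicate_groups.items()
        }
        for gene, cols in counts.items()
    }


def counts_per_million(counts):
    """Scale each sample by its library size (plain CPM, no normalization factors)."""
    samples = list(next(iter(counts.values())).keys())
    lib_size = {s: sum(cols[s] for cols in counts.values()) for s in samples}
    return {
        gene: {s: cols[s] / lib_size[s] * 1e6 for s in samples}
        for gene, cols in counts.items()
    }


def is_significant(log2_fc, fdr, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Apply the thresholds stated in the text: |log2 FC| > 1 and FDR < 0.05."""
    return abs(log2_fc) > lfc_cutoff and fdr < fdr_cutoff


# Toy counts (hypothetical): one patient with two technical replicates, one control.
toy_counts = {
    "geneA": {"p1_rep1": 10, "p1_rep2": 20, "h1": 30},
    "geneB": {"p1_rep1": 30, "p1_rep2": 40, "h1": 70},
}
toy_groups = {"p1": ["p1_rep1", "p1_rep2"], "h1": ["h1"]}

merged = sum_technical_replicates(toy_counts, toy_groups)
cpm = counts_per_million(merged)
```

Summing technical replicates before testing keeps one column per patient, so replicate runs of the same library are not treated as independent biological samples.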

| Identification of cis-acting lncRNAs
The lncRNAs located within 300 kbp upstream or downstream of protein-coding genes (PCGs) were considered cis-acting lncRNAs if their expression was highly correlated with that of the adjacent PCGs (correlation coefficient >.95 or <−.95 at an adjusted P-value cut-off of .05). The correlation coefficients between differentially expressed (DE) lncRNAs and DE PCGs were calculated with Spearman's rank correlation test using the Hmisc package in R.
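The cis-acting criterion combines a genomic distance window with a rank correlation, and the logic can be sketched as follows. This is a minimal illustration in plain Python rather than the Hmisc-based analysis: the locus representation `(chrom, start, end)`, the way the 300 kbp gap is measured between gene intervals, and all function names are assumptions, and the adjusted P-value filter is omitted for brevity.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def within_window(lnc, pcg, window=300_000):
    """True if both loci share a chromosome and lie within `window` bp of each other."""
    if lnc[0] != pcg[0]:
        return False
    gap = max(lnc[1], pcg[1]) - min(lnc[2], pcg[2])  # negative when the loci overlap
    return gap <= window


def is_cis_pair(lnc_locus, pcg_locus, lnc_expr, pcg_expr, rho_cutoff=0.95):
    """Distance filter plus |rho| > cutoff (P-value adjustment not shown)."""
    return (within_window(lnc_locus, pcg_locus)
            and abs(spearman(lnc_expr, pcg_expr)) > rho_cutoff)
```

Because Spearman's test works on ranks, a monotonic but non-linear relationship between an lncRNA and its neighbouring PCG still yields a coefficient of 1 or −1.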

| Identification of trans-acting lncRNAs
We screened for trans-acting lncRNAs by comparing the complementary bases between PCGs and lncRNAs using the LncTar tool. 8 Here, only PCGs and lncRNAs with a high fold change (|log2 FC| cut-off of 2) were used to assess whether a physical interaction between the lncRNAs and the target genes was possible. Additionally, we investigated the possible interaction of DE lncRNAs with the complete genome sequence of SARS-CoV-2 (GenBank: MN988668) using the LncTar tool.
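Before any pairwise binding prediction, the fold-change filter restricts the number of lncRNA-PCG pairs that need to be evaluated. The sketch below only illustrates that pre-filtering step; it does not call or reproduce LncTar, the gene names and fold changes are hypothetical, and treating the cut-off as inclusive (|log2 FC| ≥ 2) is an assumption.

```python
from itertools import product


def strongly_changed(log2_fc_by_gene, cutoff=2.0):
    """Genes whose absolute log2 fold change reaches the cut-off (assumed inclusive)."""
    return sorted(g for g, lfc in log2_fc_by_gene.items() if abs(lfc) >= cutoff)


def candidate_pairs(lnc_lfc, pcg_lfc, cutoff=2.0):
    """All lncRNA-PCG pairs that would be passed on to a binding-prediction tool."""
    return list(product(strongly_changed(lnc_lfc, cutoff),
                        strongly_changed(pcg_lfc, cutoff)))


# Hypothetical fold changes, for illustration only.
toy_lnc = {"lnc_a": 2.4, "lnc_b": -2.1, "lnc_c": 0.7}
toy_pcg = {"gene_x": 3.0, "gene_y": -1.2}
pairs = candidate_pairs(toy_lnc, toy_pcg)
```

Since every retained lncRNA is paired with every retained PCG, the number of predictions grows with the product of the two lists, which is one practical reason for a strict fold-change threshold.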

| Functional annotation of lncRNAs
The biological functions of DE lncRNAs were inferred by gene set enrichment analysis of the DE PCG targets of the lncRNAs using the g:Profiler tool. 9 GO terms or biological pathways with FDR < 0.05 were considered significant.
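To make the FDR threshold concrete, the following sketch adjusts a list of enrichment P-values with the Benjamini-Hochberg procedure. It assumes that the FDR mentioned above refers to this correction; g:Profiler performs its own multiple-testing correction internally, so this code is illustrative rather than a description of the tool, and the input P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values for four terms.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
significant = [i for i, q in enumerate(adj_p) if q < 0.05]
```

In this toy case three raw P-values are below 0.05, but only the first term remains significant after adjustment.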

| RESULTS AND DISCUSSION
We detected 207 and 223 lncRNAs as significantly altered genes in BALF and PBMC samples, respectively (File S1). LincRNA and antisense were the main classes of differentially expressed lncRNAs in both PBMC and BALF samples. Among the dysregulated lncRNAs, 17% in PBMC samples and about 50% in BALF samples were up-regulated.

| Identification of DE cis-acting lncRNAs in response to the SARS-CoV-2 infection
We found that the expression of 239 and 527 PCGs in the PBMC and BALF samples could be influenced by 106 and 162 cis-acting lncRNAs, respectively. Based on our enrichment results, these lncRNAs mainly play a role in immune-related processes in the PBMC samples. GO terms such as immune system process, myeloid leukocyte activation, neutrophil degranulation and regulation of ion homeostasis were significantly associated with this type of RNA molecule during SARS-CoV-2 infection (File S2). Specifically, nine cis-acting DE lncRNAs were highly correlated (correlation coefficient >.9 or <−.9, adjusted P-value <.05) with known genes involved in the immune system (Table 1).

| Identification of DE trans-acting lncRNAs in response to the SARS-CoV-2 infection
According to our results, 37 differentially expressed trans-acting lncRNAs had potential binding sites on 1603 differentially expressed protein-coding genes in the BALF samples. Interestingly, we found that 68% of the interactions between lncRNAs and PCGs were accounted for by three trans-acting lncRNAs: AL392172.1, HOTAIRM1 and PVT1.
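The share of interactions attributable to the most connected lncRNAs is a simple tally over the predicted lncRNA-PCG pairs. The sketch below shows one way to compute it; the pair list and the names in it are hypothetical and do not reproduce the study's interaction table.

```python
from collections import Counter


def top_k_share(pairs, k=3):
    """Return the k lncRNAs with the most interactions and the fraction they cover.

    pairs: iterable of (lncRNA, PCG) tuples, one per predicted interaction.
    """
    pairs = list(pairs)
    per_lnc = Counter(lnc for lnc, _ in pairs)
    top = per_lnc.most_common(k)
    covered = sum(n for _, n in top)
    return [name for name, _ in top], covered / len(pairs)


# Hypothetical interaction list: 10 pairs spread over four lncRNAs.
toy_pairs = (
    [("lnc_a", f"gene_{i}") for i in range(4)]
    + [("lnc_b", f"gene_{i}") for i in range(3)]
    + [("lnc_c", f"gene_{i}") for i in range(2)]
    + [("lnc_d", "gene_0")]
)
top_names, share = top_k_share(toy_pairs, k=3)
```

Applied to a real interaction table, the returned fraction corresponds to the kind of percentage reported above for the three dominant trans-acting lncRNAs.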

| The DE lncRNAs interaction with the SARS-CoV-2 genome
With an ndG score of less than −8, a very stringent cut-off, 20 DE Also, the portion of the SARS-CoV-2 genome that encodes the spike protein tends to interact neither with human proteins nor with human lncRNAs, implying that mRNA-based vaccines will not show side effects caused by off-target interactions with these macromolecules.

ACKNOWLEDGEMENTS
This work was supported by a COVID-19 grant obtained from Shahid Beheshti University of Medical Sciences.

CONFLICT OF INTEREST
The authors confirm that there are no conflicts of interest.