Host transcriptional response to SARS‐CoV‐2 infection in COVID‐19 patients

Background: One of the most perplexing aspects of infection with the SARS-CoV-2 virus has been the variable response elicited in its human hosts. Investigating the transcriptional changes in individuals affected by COVID-19 can help understand and predict the degree of illness and guide clinical outcomes in diverse backgrounds.
Methods: Analysis of host transcriptome variations via RNA sequencing from naso/oropharyngeal swabs of COVID-19 patients.
Results: We report strong upregulation of the innate immune response, especially the type I interferon pathway, upon SARS-CoV-2 infection. Upregulated genes were subjected to a comparative meta-analysis using global datasets to identify a common network of interferon-stimulated and viral response genes that mediate the host response and resolution of infection. A large proportion of misregulated genes showed a reduction in expression level, suggesting an overall decrease in host mRNA production. Significantly downregulated genes included those encoding olfactory, taste, and neurosensory receptors. Many pro-inflammatory markers and cytokines were also downregulated or remained unchanged in the COVID-19 patients. Finally, a large number of non-coding RNAs were identified as downregulated, with a few of the lncRNAs associated with functional roles in directing the response to viral infection.
Conclusions: SARS-CoV-2 infection results in the robust activation of the body’s innate immunity. Reduction of gene expression is well correlated with the clinical manifestations and symptoms of COVID-19 such as the loss of smell and taste, and myocardial and neurological complications. This study provides a critical dataset of genes that will enhance our understanding of the nature and prognosis of COVID-19.

Dear Editor, COVID-19 has an extremely variable prognosis, ranging from asymptomatic and mildly affected individuals to severe disease and death. We investigated the transcriptional changes in 36 COVID-19-positive Indian patients hospitalized during the first surge (Figure 1A, Table S1, and Supplementary Methods) against 5 COVID-19-negative samples. RNA was isolated from naso/oropharyngeal swabs for paired-end sequencing on the Illumina NovaSeq 6000. We identified 251 upregulated (220 protein coding) and 9068 downregulated (3252 protein coding) differentially expressed genes (DEGs) (adjusted p-value < 0.05 and absolute log2 fold change > 1) (Figure 1B, Tables S2 and S3). Seven patients were critical and required intensive care unit (ICU) intervention, while 23 were discharged from the COVID-19 ward (W); however, no significant differences could be seen between their transcriptional profiles (Figure 1C). The overall transcriptional reduction, irrespective of disease severity (Figure 1C), correlates well with the phenomenon of fading host cell functionality and prominent viral protein synthesis, and may be associated with interference in host cellular processes and responses. 1 The results indicate a diverse transcriptomic profile in response to SARS-CoV-2, in line with the variable prognosis seen in many COVID-19 patients. However, we find robust activation of the innate immune response concomitant with a reduction in the gene expression profiles associated with cardiac, muscular, and neurological processes, as well as peripheral neurosensory markers.
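The DEG criteria stated above (adjusted p-value < 0.05 and absolute log2 fold change > 1) can be illustrated with a minimal Python sketch. The function name, gene labels, and numeric values below are hypothetical and serve only to show how the two thresholds combine; they are not taken from the study's pipeline or data.

```python
from typing import Dict, Tuple

# Thresholds as stated in the text.
PADJ_CUTOFF = 0.05
LFC_CUTOFF = 1.0


def classify_gene(log2_fold_change: float, padj: float) -> str:
    """Return 'up', 'down', or 'ns' (not significant) for one gene.

    Both inequalities are strict, matching the wording in the text.
    A missing adjusted p-value (NaN) fails the comparison and yields 'ns'.
    """
    if padj < PADJ_CUTOFF and abs(log2_fold_change) > LFC_CUTOFF:
        return "up" if log2_fold_change > 0 else "down"
    return "ns"


# Hypothetical toy values: gene -> (log2 fold change, adjusted p-value).
toy_results: Dict[str, Tuple[float, float]] = {
    "GENE_A": (2.4, 1e-6),   # large positive change, significant
    "GENE_B": (-3.1, 2e-4),  # large negative change, significant
    "GENE_C": (0.6, 1e-3),   # significant but below the fold-change cutoff
    "GENE_D": (-1.8, 0.2),   # large change but not significant
}

calls = {gene: classify_gene(lfc, p) for gene, (lfc, p) in toy_results.items()}
print(calls)
```

Requiring both conditions means a gene with a large but statistically unsupported change, or a well-supported but small change, is not counted as a DEG.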
Immune response genes were highly upregulated (Figure 2A and Table S4), with prominent clusters of genes associated with multiple viral infections (Figure 2B and Table S5) marking the activation of infection clearance pathways. Meta-analysis of published datasets identified a signature of 19 upregulated genes (Table S6). Prominent nodes include well-documented IFN-stimulated genes (ISGs), IFIT paralogs that restrict viral translation, IFIH1 and ISG15 that drive the innate immune response upon sensing viral RNA, as well as proviral factors XAF1 and MX1, and DExD/H-Box Helicase antiviral factors that promote RIG-I-like receptor-mediated signaling. STAT1 binding to ISGs mediates the IFN-triggered host response, and its dysfunction has been associated with hyperactivation of inflammatory pathways in individuals with acute COVID-19 pathophysiology. IFN-mediated activation of the JAK-STAT signaling pathway may play a role in inducing necroptosis (Figure 2B), and is implicated, along with OAS1, in Acute Respiratory Distress Syndrome (ARDS) development and protection from severe COVID-19. 2,3 Though not a part of this network, all the MHC class I and some MHC class II genes (HLA-A, -B, -C, -E, -F, and HLA-DQB1, -DRB1, -DRB5), involved in T-cell mediated cell death and the antibody-mediated adaptive immune response, were also upregulated along with RFX5, which binds to MHC-II promoters. However, many of the proinflammatory markers remained unchanged, suggesting an absence of hyperinflammation and a better disease prognosis in these patients.
Downregulated protein coding genes were associated with processes related to neurotransmission and cardiac and muscular contraction (Figure 2A and Table S7). Multiple cardiomyopathy pathways appear to be affected (Figure 2B and Table S8), either as a direct result of the infection or as a downstream consequence of the immune response activation, which may shed light on adverse clinical outcomes. RAS and cAMP signaling pathways, CACNs related to cellular calcium signaling, and key cardiac proteins, such as troponin and tropomyosin, which together with calcium ions are required for proper cardiac muscle contraction, were also downregulated (Figure 2B). These results suggest myocardial issues and highlight the importance of continued follow-up in COVID-19 patients. 4 Pancreatic and insulin secretory systems-related genes were also downregulated, in agreement with recent work showing that the insulin requirement for patients with diabetes mellitus increases at the peak of COVID-19 illness. 5 Interestingly, there was a strong enrichment for "olfactory transduction" and "taste transduction" pathways among downregulated genes (Figure 2B), including 105 olfactory receptor genes.
[Figure legend, partial] Each bar marks the expression level of a gene from highest (red) to lowest (blue) as per the scale on the right. Sample names are indicated on the x-axis (clustered by negative and then positive samples) and their metadata is shown on the top. "Condition" indicates COVID-19 negative (gold bar) or positive (blue bar) status, while "Severity" indicates whether the patients needed ICU intervention (ICU, green bars) or were discharged from the general ward (W, dark blue bars). The Y-axis bar on the left marks the number of upregulated protein coding genes (orange) followed by the downregulated protein coding genes (pink) in patients compared to the controls.
Over the last year, olfactory dysfunction has emerged as a key symptom of COVID-19 and the loss of smell and taste is likely a consequence of the observed impairment of neurosensory perception pathways. 6 Genes associated with drug addiction and neuroactive ligand-receptor pathways also lost their expression pattern. Protein-protein interaction analysis showed two strong networks of genes from the family of gamma-aminobutyric acid type A (GABA) receptors (Figure 2D), important for normal neurological functioning, and the GRIN genes which are part of the N-methyl-D-aspartate receptors family involved in memory, learning, and synaptic development. Reduction in GABA and alterations in GABA receptor levels are associated with stress-induced anxiety and depression, increasingly recognized in COVID-19 patients. The effect on GABAergic interneurons in the olfactory bulb, connecting sensory neurons in the olfactory epithelium, might increase the potential for neurological complications observed in COVID-19 patients. 7 Further studies are underway to delineate the implications for neuronal infectivity via the olfactory and respiratory tracts and the nasopharyngeal compartment, 6 which are predominantly epithelial cells.
[Figure legend, partial] ... and downregulated genes (red nodes). Nodes sharing 30% or more genes are connected by edges whose thickness represents the percentage of common genes. Size of the node represents the number of genes in that pathway (ranging from 4 to 136).
(C) Protein-protein interactions of 19 genes showing a tightly connected network involved in "innate immune response" (red nodes), "defense response to virus" (blue nodes), and "type I interferon signaling pathway" (green nodes). These genes were upregulated during COVID-19 infection in our analysis as well as in other published datasets (Table S6). Edges depict both functional and physical associations. Edge thickness indicates the confidence in the interaction. All active interaction sources in the STRING database are considered. The minimum interaction score for an edge is set at a high confidence level of 0.7.
(D) Protein-protein interactions of 55 genes showing tightly connected clusters of genes involved in "cognition" (red nodes), "learning and memory" (blue nodes), and "sensory perception" (green nodes). These genes were involved in four addiction-related pathways. Edges depict both functional and physical associations. Edge thickness indicates the confidence in the interaction. All active interaction sources in the STRING database are considered. The minimum interaction score for an edge is set at a high confidence level of 0.7.
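The two network rules given in the figure legend (pathway nodes joined when they share 30% or more genes, and protein-protein edges kept at a minimum interaction score of 0.7) can be sketched as follows. The legend does not state the denominator for "percentage of common genes"; this sketch assumes the shared genes are counted relative to the smaller of the two gene sets. All names and values are hypothetical, and interaction scores are assumed to lie on a 0 to 1 scale.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def overlap_fraction(a: Set[str], b: Set[str]) -> float:
    """Shared genes as a fraction of the smaller set (an assumed definition)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def pathway_edges(
    pathways: Dict[str, Set[str]], min_fraction: float = 0.30
) -> List[Tuple[str, str, float]]:
    """Connect every pair of pathways whose overlap meets the cutoff."""
    edges = []
    for name_a, name_b in combinations(sorted(pathways), 2):
        frac = overlap_fraction(pathways[name_a], pathways[name_b])
        if frac >= min_fraction:
            edges.append((name_a, name_b, frac))
    return edges


def high_confidence(
    interactions: List[Tuple[str, str, float]], min_score: float = 0.7
) -> List[Tuple[str, str, float]]:
    """Keep protein-protein interactions at or above the score cutoff."""
    return [edge for edge in interactions if edge[2] >= min_score]


# Hypothetical pathway gene sets.
toy_pathways = {
    "pathway_1": {"g1", "g2", "g3", "g4"},
    "pathway_2": {"g3", "g4", "g5", "g6", "g7"},
    "pathway_3": {"g8", "g9", "g10", "g11"},
}
edges = pathway_edges(toy_pathways)

# Hypothetical interaction scores on a 0 to 1 scale.
toy_interactions = [("P1", "P2", 0.92), ("P1", "P3", 0.45), ("P2", "P3", 0.70)]
kept = high_confidence(toy_interactions)
print(edges, kept)
```

A different overlap definition (for example, the Jaccard index) would change which pathway pairs clear the 30% cutoff, so the denominator should be confirmed against the tool actually used.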
A large proportion of the DEGs were lncRNAs with relatively low expression (Figure 3A), including some known to have functional roles during viral infection. For example, ZBTB11-AS1, an antisense lncRNA to ZBTB11, which regulates neutrophil development, 8 was upregulated along with the cognate gene. HEIH, associated with recurrence in hepatitis C virus-related hepatocellular carcinoma, and IGF2-AS, associated with hepatitis C viral replication, 9,10 were also significantly misregulated. However, the role of many lncRNAs remains uncharacterized. We identified 720 differentially expressed protein coding genes nearest to the misregulated lncRNAs, most of which were found to overlap the cognate gene or its promoter on the antisense strand (Figure 3B) and potentially mediate many developmentally regulated processes (Figure 3C and Table S9).
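Pairing each misregulated lncRNA with its nearest protein coding gene, and checking whether the pair overlaps on opposite strands, can be sketched as below. This is a minimal illustration under stated assumptions: coordinates are 1-based and inclusive, distance is measured between gene bodies only (promoter regions are not modeled), and every feature name and coordinate is hypothetical rather than drawn from the study's annotation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Feature:
    name: str
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # inclusive
    strand: str  # "+" or "-"


def gap(a: Feature, b: Feature) -> int:
    """Distance in bp between two features on one chromosome; 0 if they overlap."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def nearest_gene(
    lnc: Feature, genes: List[Feature]
) -> Optional[Tuple[Feature, int, bool]]:
    """Return (nearest gene, distance, antisense-overlap flag), or None."""
    candidates = [g for g in genes if g.chrom == lnc.chrom]
    if not candidates:
        return None
    best = min(candidates, key=lambda g: gap(lnc, g))
    distance = gap(lnc, best)
    antisense_overlap = distance == 0 and best.strand != lnc.strand
    return best, distance, antisense_overlap


# Hypothetical lncRNA and protein coding genes.
lnc = Feature("LNC_X", "chr1", 1000, 2000, "-")
genes = [
    Feature("GENE_P", "chr1", 1500, 5000, "+"),   # overlaps, opposite strand
    Feature("GENE_Q", "chr1", 9000, 12000, "-"),  # same chromosome, far away
    Feature("GENE_R", "chr2", 1000, 2000, "+"),   # different chromosome
]
hit = nearest_gene(lnc, genes)
print(hit)
```

Extending each gene upstream by a chosen promoter window before computing the gap would also capture the promoter overlaps mentioned in the text; the window size would be an additional assumption.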
In conclusion, we have documented significantly misregulated genes and associated pathways during SARS-CoV-2 infection (Figure 4). Our results highlight a commonly upregulated network of innate immune response genes and an absence of hyperinflammatory markers. The downregulation of a majority of the genes suggests host shutdown and large-scale systemic effects spanning not just lung and respiratory complications but also cardiac, endocrine, and neurological issues. The downregulation of a large proportion of sensory receptors, including olfactory and taste receptors, and associated pathways stands out as a major correlate of SARS-CoV-2 infection. Such studies can help compare host responses in the current and subsequent waves of the pandemic across the globe and identify targets for monitoring and planning therapeutic approaches.

CONFLICT OF INTEREST
The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT
Raw data and the RNA-seq count data can be accessed from the Gene Expression Omnibus (GEO) database (accession number GSE166530).

FIGURE 4
Enriched GO processes and KEGG pathways associated with significant DEGs. Volcano plot showing rlog transformed expression values of all the genes. Each gene is plotted based on the log2 fold change value (X-axis) and -log10 adjusted p-value (Y-axis). Vertical dashed lines (absolute log2 fold change = 1) and the horizontal dashed line (padj = 0.05) show the criteria set for defining significant DEGs. Genes not changing significantly are colored gray. Upregulated genes (colored red) are on the right side of the plot, while downregulated genes (colored blue) are on the left side of the plot. Genes from significantly enriched GO terms and KEGG pathways are highlighted. For example, genes labeled as "Interferon related" are associated with the GO terms "type I interferon signaling pathway" and "cellular response to type I interferon." Genes labeled as "Response to virus" are associated with the GO terms "defense response to virus" and "response to virus." Genes marked as "Addiction related" are from four enriched KEGG pathways, namely, "Nicotine addiction," "Morphine addiction," "Cocaine addiction," and "Amphetamine addiction." The other labels correspond to genes from the KEGG pathways "Insulin secretion" and "Olfactory transduction."
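The coordinate transform and color scheme described in this legend can be sketched in a few lines of Python. The function names and the floor applied to very small adjusted p-values are assumptions added for illustration (the floor only avoids taking the logarithm of zero); no plotting library is used, so the sketch shows just where each gene and each dashed line would fall.

```python
import math
from typing import Tuple

PADJ_CUTOFF = 0.05
LFC_CUTOFF = 1.0
PADJ_FLOOR = 1e-300  # hypothetical floor so that padj == 0 stays finite

# Position of the horizontal dashed line on the Y-axis (about 1.30).
PADJ_LINE_Y = -math.log10(PADJ_CUTOFF)


def volcano_point(log2_fold_change: float, padj: float) -> Tuple[float, float]:
    """Map one gene to (x, y) = (log2 fold change, -log10 adjusted p-value)."""
    return log2_fold_change, -math.log10(max(padj, PADJ_FLOOR))


def volcano_color(log2_fold_change: float, padj: float) -> str:
    """Gray for non-significant genes, red for up, blue for down."""
    if padj < PADJ_CUTOFF and abs(log2_fold_change) > LFC_CUTOFF:
        return "red" if log2_fold_change > 0 else "blue"
    return "gray"


# Hypothetical gene: log2 fold change of -2.5 with padj of 0.001.
x, y = volcano_point(-2.5, 0.001)
color = volcano_color(-2.5, 0.001)
print(x, round(y, 2), color, round(PADJ_LINE_Y, 2))
```

Because the Y-axis is a negative logarithm, smaller adjusted p-values plot higher, and any gene above the horizontal line and outside the two vertical lines meets both DEG criteria.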