Smoking affects gene expression in blood of patients with ischemic stroke

Abstract Objective Though cigarette smoking (CS) is a well‐known risk factor for ischemic stroke (IS), there are no data on how CS affects the blood transcriptome in IS patients. Methods We recruited IS‐current smokers (IS‐SM), IS‐never smokers (IS‐NSM), control‐smokers (C‐SM), and control‐never smokers (C‐NSM). mRNA expression was assessed on HTA‐2.0 microarrays, and unique as well as commonly expressed genes were identified for IS‐SM versus IS‐NSM and C‐SM versus C‐NSM. Results One hundred and fifty‐eight genes were differentially expressed in IS‐SM versus IS‐NSM; 100 genes were differentially expressed in C‐SM versus C‐NSM; and 10 genes were common to both IS‐SM and C‐SM (P < 0.01; |fold change| ≥ 1.2). Functional pathway analysis showed the 158 IS‐SM‐regulated genes were associated with T‐cell receptor, cytokine–cytokine receptor, chemokine, adipocytokine, tight junction, Jak‐STAT, ubiquitin‐mediated proteolysis, and adherens junction signaling. IS‐SM showed more altered genes and functional networks than C‐SM. Interpretation We propose that some of the 10 genes that are elevated in both IS‐SM and C‐SM (GPR15, LRRN3, CLDND1, ICOS, GCNT4, VPS13A, DAP3, SNORA54, HIST1H1D, and SCARNA6) might contribute to the increased risk of stroke in current smokers, and that some genes expressed by blood leukocytes and platelets after stroke in smokers might contribute to the worse stroke outcomes that occur in smokers.


Introduction
Cigarette smoking (CS) is a significant, modifiable risk factor for ischemic stroke (IS). 1-10 CS accounts for roughly 15% of all stroke deaths and has a dose-response relationship in older subjects. 11 Smoking cessation not only rapidly reduces the risk of primary stroke, with the risk almost disappearing within 4 years of cessation, but also improves outcomes of recurrent IS. 4,7,10,11 Smoking has a variety of detrimental effects on the cerebrovascular and cardiovascular systems by promoting endothelial dysfunction, systemic inflammation, high levels of low-density lipoprotein cholesterol, atherosclerosis, platelet aggregation, and clot formation. 9 Thus, smoking increases the rates of stroke as well as myocardial infarction. Herein, we hypothesize that CS induces gene expression changes, especially in inflammation-related and platelet aggregation-related genes, resulting in an increased risk of IS.
To address this, we performed a whole blood transcriptome analysis to identify differences and similarities between IS-SM versus IS-NSM and C-SM versus C-NSM. Some of the genes unique to IS-SM may explain why smokers tend to have poorer outcomes following IS, and some of the genes common to IS-SM and C-SM may contribute to the risk of stroke among smokers. To our knowledge, this is the first study to describe genome-wide changes in whole-blood gene expression following ischemic stroke in human smokers compared with never smokers.

Methods and materials

Study participants
Subjects (n = 219) were recruited from the Universities of California at Davis and San Francisco, and the University of Alberta, Canada. These included 42 IS-current smokers (IS-SM), 68 IS-never smokers (IS-NSM), 23 control-smokers (C-SM), and 86 control-never smokers (C-NSM). Ethics approval was obtained from the institutional review boards, and written informed consent was obtained from each participant. All procedures followed institutional guidelines. Diagnosis of IS was made by two board-certified neurologists based on medical history and computed tomographic (CT) and/or magnetic resonance imaging (MRI) brain scans. Controls included healthy subjects and vascular risk factor subjects. Vascular risk factor subjects had hypertension, diabetes, and/or hypercholesterolemia. Exclusion criteria included prior stroke, active cancer, infection just before or after stroke, or a rheumatological disorder.
Smoking status was determined by questionnaire. Current smokers were those who had smoked on average at least one cigarette per day during the past 12 months. Never smokers (both stroke, IS-NSM, and controls, C-NSM) had never smoked by self-report.

Sample processing and total RNA isolation
One venous blood sample was collected from each IS patient. The time of sample collection ranged from 4.4 to 83.2 h after IS. Blood was drawn into PAXgene tubes (Qiagen) and stored frozen at −80°C until processed. 12,13 Total RNA was extracted according to protocol (PAXgene blood RNA kit; Qiagen). RNA quality and concentration were measured using an Agilent 2100 Bioanalyzer and a NanoDrop, respectively. An A260/A280 absorbance ratio ≥ 1.8, a 28S/18S rRNA ratio ≥ 1.8, and an RNA integrity number (RIN) ≥ 8 were used as RNA quality criteria. Reverse transcription, amplification, and sample labeling were carried out using Nugen's Ovation Whole Blood Solution (Nugen Technologies, San Carlos, CA) to generate cDNA for analysis on Affymetrix GeneChip® arrays.

HTA 2.0 microarray
Affymetrix HTA 2.0 GeneChip microarrays (Affymetrix, Santa Clara, CA) were used to measure expression of mRNAs and noncoding RNAs. Amplified cDNAs were hybridized to the microarrays, washed on a Fluidics Station 450, and scanned on a GeneChip Scanner 3000. Samples were randomly assigned to microarray batches to reduce batch effects. Raw microarray gene expression data were saved in CEL files.

Statistical analysis
Raw gene expression data were input into Partek Flow software (Partek Inc., St. Louis, MO) and normalization was performed using robust multichip averaging (RMA). Statistical analyses of mRNA data were performed for (1) IS-SM versus IS-NSM and (2) C-SM versus C-NSM. Univariate analyses were performed to determine confounding factors, including gender, age, race, and vascular risk factors, that might differ between IS-SM and IS-NSM, as well as between C-SM and C-NSM. Fisher's exact test was used for categorical variables and the unpaired t-test for continuous variables. A mixed-effects regression model was used for differential gene expression analysis, including diagnosis, smoking status, significant factors from the univariate analysis, technical variation (batch), gender, and the interaction between diagnosis and smoking status. The criteria for determining significantly differentially expressed genes were P < 0.01 and |fold change| ≥ 1.2. A fold change of 1.2 was used, as in our previous studies, to help ensure biologically significant changes and to provide enough genes for functional pathway analysis. 14-17

Functional pathway analyses and cross-validation analysis
Exploratory Gene Association Networks (EGAN) software was used to analyze the functional networks. Gene networks and pathways were generated according to the Kyoto Encyclopedia of Genes and Genomes (KEGG) database.
Cross-validation analysis was performed to determine prediction accuracy of the optimal model using forward selection and the k-nearest neighbor algorithm in Partek Genomics Suite 6.4. 18 Sensitivity and specificity of the best classifier were calculated.
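The differential-expression criteria stated in the statistical analysis (P < 0.01 and |fold change| ≥ 1.2) can be illustrated with a minimal sketch. It assumes group means on the log2 scale, as produced by RMA, and a signed fold-change convention in which down-regulation is reported as a negative value; that convention, the function names, and the example rows are hypothetical and are not taken from the study data or from Partek software.

```python
def signed_fold_change(mean_log2_a, mean_log2_b):
    """Signed linear fold change of group A versus group B from log2 means.

    Assumed convention: ratios below 1 are reported as -1/ratio, so a
    halving of expression is returned as -2.0.
    """
    ratio = 2.0 ** (mean_log2_a - mean_log2_b)
    return ratio if ratio >= 1.0 else -1.0 / ratio


def is_differentially_expressed(p_value, fold_change, p_cut=0.01, fc_cut=1.2):
    """Apply both criteria: P < 0.01 and |fold change| >= 1.2."""
    return p_value < p_cut and abs(fold_change) >= fc_cut


# Hypothetical rows: (label, mean log2 in smokers, mean log2 in never smokers, P)
rows = [
    ("gene_A", 7.90, 7.20, 0.0004),  # up ~1.6-fold, small P: passes
    ("gene_B", 6.10, 6.00, 0.0020),  # small P but ~1.07-fold: fails fold cut
    ("gene_C", 5.00, 5.60, 0.0080),  # down ~1.5-fold, P < 0.01: passes
    ("gene_D", 8.00, 7.00, 0.0500),  # 2-fold but P >= 0.01: fails P cut
]

hits = [
    name
    for name, smokers, never, p in rows
    if is_differentially_expressed(p, signed_fold_change(smokers, never))
]
print(hits)
```

Because both conditions must hold, a gene with a large fold change but weak statistical support (gene_D) and a gene with strong support but a negligible change (gene_B) are both excluded.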

Demographic and clinical characteristics
Demographic and clinical characteristics of the 219 subjects are summarized in Table 1. Average age of IS subjects (years ± SD) was 60.9 ± 12.6 and 75.5% were male. There were 55.5% white, 16 (Table 2) and C-SM versus C-NSM (Table 3). In addition, the time since stroke onset was not significantly different in the IS-SM versus IS-NSM groups (Table 2).

Functional analysis of regulated mRNAs for IS-SM versus IS-NSM and C-SM versus C-NSM
Functional pathway analysis showed that the 158 genes (113 up and 45 down) differentially expressed in blood of IS-SM versus IS-NSM were over-represented in T-cell receptor, cytokine-cytokine receptor, chemokine, adipocytokine, tight junction, Jak-STAT, ubiquitin-mediated proteolysis, and adherens junction signaling (Figs. 2 and 3). The 100 differentially expressed genes (75 up and 25 down) in blood of C-SM versus C-NSM were associated with cell adhesion, antigen processing and presentation, T-cell receptor, natural killer cell-mediated cytotoxicity, cytokine-cytokine receptor interaction, apoptosis, and endocytosis signaling pathways (Fig. S1).
The mRNA predictive model that best discriminated IS-SM from IS-NSM was generated using a k-nearest neighbor algorithm (k = 21) and 27 genes, with a normalized correct rate of 79.9%. The sensitivity was 64.3% and the specificity was 95.6% for IS-SM.
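As a worked check of how these three figures relate, the sketch below recomputes them from a confusion matrix. The counts are not reported in the text; they are inferred here as the whole-number counts that reproduce the stated percentages given 42 IS-SM and 68 IS-NSM subjects. The sketch also assumes that "normalized correct rate" means the average of the per-class correct rates (balanced accuracy).

```python
def classifier_rates(tp, fn, tn, fp):
    """Return sensitivity, specificity, and their mean (balanced accuracy)."""
    sensitivity = tp / (tp + fn)   # correctly classified IS-SM / all IS-SM
    specificity = tn / (tn + fp)   # correctly classified IS-NSM / all IS-NSM
    normalized = (sensitivity + specificity) / 2
    return sensitivity, specificity, normalized


# Inferred (assumed) counts: 27 of 42 IS-SM and 65 of 68 IS-NSM classified correctly.
sens, spec, norm = classifier_rates(tp=27, fn=15, tn=65, fp=3)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} normalized={norm:.1%}")
# prints: sensitivity=64.3% specificity=95.6% normalized=79.9%
```

Under these assumptions the classifier rarely mislabels a never smoker as a smoker but misses roughly one third of the smokers, which is why the normalized rate sits well below the specificity.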

Discussion
In this study, we identified 158 differentially expressed genes associated with IS-SM compared to IS-NSM, 100 altered genes associated with C-SM compared to C-NSM, and 10 common genes for IS-SM versus IS-NSM and C-SM versus C-NSM. Some of the 10 common genes (GPR15, LRRN3, CLDND1, ICOS, GCNT4, VPS13A, DAP3, HIST1H1D, SNORA54, and SCARNA6) could be associated with stroke risk since they are expressed in smokers before stroke and in smokers after stroke. For example, regulation of ICOS affects outcomes in experimental rodent stroke models by modulating T cells. 19 Chorea-acanthocytosis (ChAc), a neurodegenerative disease, results from loss-of-function mutations of the chorein-encoding gene VPS13A. 20 Perhaps smoking-induced changes in levels of VPS13A would affect red blood cells and predispose to clotting/stroke. HIST1H1D and DAP3 are involved in apoptosis, and SNORA54 is associated with Factor X, which plays a key role in clotting (GeneCards). GPR15, LRRN3, and CLDND1 are discussed next.
A previous transcriptome meta-analysis reported the top 25 smoking-related genes associated with current smokers versus never smokers. 21 Comparing our 100 smoking genes identified in C-SM versus C-NSM with these top 25 smoking-related genes, we found that three genes (GPR15, LRRN3, and CLDND1) overlapped, which is highly significant based on a hypergeometric probability test (P < 0.0002). These three genes were also among the 10 common genes differentially expressed in IS-SM as well as in C-SM (Fig. 1, overlap). Notably, smoking is the only condition known to increase the numbers of GPR15+ T cells in blood 22 which are proinflammatory Th17-like. 23 GPR15 is an orphan receptor that is involved in the regulation of innate immunity and T-cell trafficking. 24 GPR15 and LRRN3 have DNA methylation loci in their promoter regions that are reported to be hypomethylated among smokers, which might indicate smoking-induced epigenetic changes. 21 LRRN3 is an inflammatory regulatory gene related to T-cell function and immunosenescence whose expression declines with age 25 and smoking, perhaps resulting in a dysregulated immune system prior to and after stroke. Claudin domain-containing 1 (CLDND1), also known as claudin-25, plays a role in the development of cerebrovascular disease in stroke-prone spontaneously hypertensive rats. 26 Perhaps smoking induction of CLDND1 is a risk factor for stroke in humans as well.
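The hypergeometric overlap test mentioned above can be sketched as follows. The text does not state the size of the background gene universe, so the `population` values below are hypothetical placeholders; the resulting P-value depends strongly on that choice and is not expected to reproduce the reported value exactly. The function name is likewise an illustration only.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when `draws` genes are sampled without replacement
    from `population` genes, of which `successes` are in the reference list."""
    upper = min(successes, draws)
    favorable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favorable / comb(population, draws)


# 25 reference smoking genes, 100 C-SM versus C-NSM genes, 3 genes shared.
# The background sizes are assumptions made for illustration.
for population in (20000, 25000, 30000):
    p = hypergeom_upper_tail(population, successes=25, draws=100, observed=3)
    print(population, p)
```

With a background of tens of thousands of genes, fewer than one shared gene is expected by chance (100 × 25 / population), so an overlap of three is unlikely to be random under any of these assumed universe sizes.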
Some of the 158 genes expressed after stroke in smokers could be associated with the worse stroke outcomes in smokers. Notably, most of the genes differentially expressed in IS-SM versus IS-NSM were distinct from those differentially expressed in C-SM versus C-NSM. This suggests that smoking has a strong interaction with stroke-associated responses in peripheral leukocytes and platelets, producing a response quite different from that seen in control-smokers compared to control-never smokers.
Many of the IS-SM genes were associated either with T lymphocytes or with T-cell receptor (TCR) signaling, including CD3E, ICOS, PRKCQ, LEF1, TCF7, and IL7R. T cells increase inflammation in atherosclerotic plaques and contribute to lesion development, and T cells infiltrate ischemic brain, where they may be beneficial or harmful. 27,28 For example, an anti-CD3 antibody reduced atherosclerotic plaque development in mice. 29 Patients with myocardial infarction (MI) and/or stable angina who were diagnosed with atherosclerosis had associated decreases in the ICOS+ T-cell subset and Treg cells, suggesting a possible role for these cells. 30 A protective role of ICOS was observed in the ApoE-KO mouse model. 31 Protein Kinase C Theta (PRKCQ) is a serine/threonine protein kinase that is expressed in T lymphocytes, modulates proliferation and cytokine production, and is required for T-cell driven inflammatory responses. 32,33 PRKCQ is also highly expressed in platelets and positively regulates thrombin-mediated platelet activation and aggregation. 34 Interleukin receptor IL7R was upregulated in the peripheral blood of IS-SM. IL7 regulates T-cell differentiation, survival, and homeostasis, and mediates inflammation upon binding to IL7R. 35 Increased gene expression of IL7R and lnc-IL7R is observed in many inflammatory conditions, including atherosclerotic plaques in humans. 36-39 Thus, the dysregulation of some of the above T-cell receptor signaling pathways could contribute to worsened ischemic brain injury in smokers.
CCR4 and CCR8 are also upregulated in IS-SM. These are CC chemokine receptors that mediate leukocyte chemotaxis and play regulatory roles in the inflammatory response upon interaction with their chemokines. Atherosclerosis is associated with the recruitment of monocytes and T lymphocytes to the vascular wall of blood vessels. The dendritic cell-derived chemokine CCL17 is a ligand for CCR4. A study in atherosclerosis-prone mice showed that CCL17 deficiency resulted in a reduction of atherosclerosis, which was dependent on Tregs. 40,41 Another study found that CCR4 expression was decreased by simvastatin, accompanied by anti-inflammatory and lipid-lowering effects in human endothelial cells and macrophages. 42 These findings suggest that CCR4 is involved in promoting atherosclerosis. In addition to its role in atherosclerosis, studies have shown that CCR4 is expressed on platelets and triggers platelet activation and aggregation via binding to its ligands, including macrophage-derived chemokine, in Th2 diseases such as asthma and atopic dermatitis. 43,44 CCR8 is found on monocytes, T lymphocytes, endothelial cells, and vascular smooth muscle cells in humans. 45-47 These cells produce the CCR8 ligand, CCL1/I-309, mediating the effects of Lipoprotein(a) in atherosclerosis. CCR8 was also identified on the endothelium of human atherosclerotic plaques and induced angiogenesis when interacting with CCL1. 45,48 These data show that CCR8, an endothelial receptor, may regulate endothelial function and participate in arterial vessel wall pathology. In the current study, both CCR4 and CCR8 were upregulated in IS current smokers, which may suggest a role of current smoking in increasing stroke injury by promoting atherosclerosis and platelet aggregation.
Smoking also activates the Wnt signaling pathway via LEF1 and TCF7, which modulate adherens junctions. LEF1 is a transcriptional repressor that activates HDACs, which in turn downregulate expression of many genes important in the pathogenesis of stroke. 49 Indeed, HDAC inhibitors in humans can decrease the risk of stroke after a previous stroke or TIA. 50 Thus, smoking activation of LEF1 would be expected to activate HDACs, which could worsen stroke outcomes.
There were many noncoding RNAs (ncRNAs) associated with stroke in smokers (IS-SM). Many ncRNAs, including microRNAs and long noncoding RNAs, can regulate mRNA expression. As an example, SNORA54 appears to modify expression of Factor X and thus could play a role in clotting related to smoking-associated strokes (GeneCards). Another example is SCARNA6, a small Cajal body-specific RNA (scaRNA). ScaRNAs are a class of small nucleolar RNAs (snoRNAs) that localize to the Cajal body, a nuclear organelle involved in the biogenesis of small nuclear ribonucleoproteins (snRNPs or snurps). ScaRNAs guide methylation and pseudouridylation of the RNA polymerase II-transcribed spliceosomal RNAs U1, U2, U4, U5, and U12 to modulate splicing, which might be another impact of smoking in stroke. 51
One limitation is the small sample size. Future larger studies will be required to validate the findings in a separate cohort. Another limitation is the inability to prove a causal relationship between smoking, gene expression, and clinical outcomes in IS patients because this was a cross-sectional study. While the gene profiling provides evidence for hypothesis generation, in vitro and in vivo functional experiments need to be performed to establish whether the identified genes are stroke risk genes or genes that contribute to smoking-induced worsening of stroke.

Conclusions
These studies provide a novel list of targets to possibly decrease stroke risk or improve stroke outcomes in smokers.

Supporting Information
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Figure S1. Top functional pathways for regulated genes in blood associated with C-SM compared to C-NSM.
Table S1. 158 differentially expressed genes (P-value < 0.01; |fold change| ≥ 1.2) in ischemic stroke-smokers (IS-SM) versus ischemic stroke-never smokers (IS-NSM).
Table S2. 100 differentially expressed genes (P-value < 0.01; |fold change| ≥ 1.2) in control-smokers (C-SM) versus control-never smokers (C-NSM).
Table S3. Top functional pathways for regulated genes associated with IS-SM when compared to IS-NSM.
Table S4. Top functional pathways for regulated genes associated with C-SM compared to C-NSM.