Identification of novel differentially expressed genes in retinas of STZ‐induced long‐term diabetic rats through RNA sequencing

Abstract Background The aim of this research was to investigate the retinal transcriptome changes in long‐term streptozotocin (STZ)‐induced rats' retinas using RNA sequencing (RNA‐seq), to explore the molecular mechanisms of diabetic retinopathy (DR), and to identify novel targets for the treatment of DR by comparing the gene expression profile we obtained. Methods In this study, 6 healthy male SD rats were randomly divided into wild‐type (WT) group and streptozotocin (STZ)‐induced group, 3 rats each group. After 6 months, 3 normal retina samples and 3 DM retina samples (2 retinas from the same rat were considered as 1 sample) were tested and differentially expressed genes (DEGs) were measured by RNA‐seq technology. Then, we did Gene Ontology (GO) enrichment analysis and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis and validated the results of RNA‐seq through qRT–PCR. Results A total of 118 DEGs were identified, of which 72 were up‐regulated and 46 were down‐regulated. The enriched GO terms showed that 3 most significant enrichment terms were binding (molecular function), cell part (cellular component), and biological regulation (biological process). The results of the KEGG pathway analysis revealed a significant enrichment in cell adhesion molecules, PI3K‐Akt signaling pathway, and allograft rejection, etc. Conclusion Our research has identified specific DEGs and also speculated their potential functions, which will provide novel targets to explore the molecular mechanisms of DR.


| INTRODUCTION
With the notably increasing population of diabetes, diabetic retinopathy (DR), one of the most severe complications of diabetes, has the potential to be the leading cause of blindness worldwide (Klein, 2007;Yau et al., 2012). The main pathological change of DR is the retinal capillary endothelial damage. Existing researches have revealed that the major pathogenesis of DR includes susceptibility genes, increased polyol pathway flux, increased advanced glycation end-product (AGE) formation, abnormal activation of protein kinase C (PKC) pathway, and increased hexosamine pathway flux (Frank, 2004). These pathways can cause up-regulation of factors such as insulin-like growth factor (IGF, OMIM: 147440), vascular endothelial growth factor (VEGF, OMIM: 192240), tumor necrosis factor (TNF, OMIM: 191160), and basic fibroblast growth factor-2 (bFGF-2 OMIM: 134920) that can promote the occurrence of DR (Safi, Qvist, & Kumar, 2014). However, the mechanisms of DR are really complex, and the internal relations between pathways and biomarkers are complicated (Saxena, Singh, & Saklani, 2016). Identifying new significantly differentially expressed genes (DEGs) and speculating their roles might help us to better understand the molecular network of the pathogenesis of DR and provide new research ideas.
Streptozotocin (STZ) is a pancreatic islet β-cell-cytotoxic antibiotic, which can highly and selectively destroy pancreatic islet B cells. So it has been widely used to develop animal models of human condition with either type 1 diabetes mellitus or type 2 diabetes rat mellitus (Furman, 2015).
Previous RNA-seq studies of DR mostly use 2 months course diabetic rats, but streptozotocin-induced diabetic retinopathy in rats might not be apparent within 2 months as the previous paper showed (Chen, Bin, & Qian, 2013;Yuan, Zhi, & Geng, 2016). In this study, we analyzed whole transcriptome of mRNA in retina from both normal and 6 months diabetes mellitus (DM) male Sprague Dawley (SD) rats, identified differentially expressed genes, and also speculated their roles based on Illumina HiSeq sequencing platform combining with Gene Ontology (GO) and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analysis. We hope our study will enhance our understanding of the molecular mechanisms underlying the pathogenesis of DR and allow the development of novel therapeutic targets of DR.

| Ethical compliance
The experimental protocols used in this study followed guidelines established by the ARVO Statement for the Use of Animals in Ophthalmic and Vision Research and were approved by the Ethics Committee of Shanghai General Hospital, Shanghai Jiaotong University, Shanghai, China (Permit Number: 2009-0086

| Animals
Adult male SD rats (250-300 g) were obtained from the Shanghai Laboratory Animal Center. Healthy male SD rats were randomly divided into wild-type (WT) group and STZinduced group. STZ (Sigma-Aldrich) was injected in rats' abdomen. Rats in WT group received an intraperitoneal injection of citrate buffer only. Diabetes was confirmed by assaying the glucose concentration in blood collected from the tail vein using a precision glucometer (Roche Diabetes Care GmbH) weekly after STZ injection. Rats with blood glucose concentration >300 mg/dl were considered diabetic. Rats were sacrificed by intraperitoneal injection of excess 10% chloral hydrate (10 ml/kg). All rats had received cardiac perfusion using physiological saline to remove blood from retinas before they sacrificed 6 months after the injection of STZ.

| RNA-seq
Total RNA of each sample was extracted using Trizol. Total RNA of each sample was quantified and qualified by Agilent 2100 Bioanalyzer (Agilent Technologies), NanoDrop (Thermo Fisher Scientific Inc.), and 1% agarose gel. 1 μg total RNA with RIN value above 7 was used for following library preparation. Next-generation sequencing library preparations were constructed according to the manufacturer's protocol (NEBNext ® UltraTM RNA Library Prep Kit for Illumina ® ). The poly(A) mRNA isolation was performed using NEBNext Poly(A) mRNA Magnetic Isolation Module (NEB) or Ribo-ZeroTM rRNA removal Kit (Illumina). The mRNA fragmentation and priming were performed using NEBNext First Strand Synthesis Reaction Buffer and NEBNext Random Primers. First-strand cDNA was synthesized using ProtoScript II reverse transcriptase, and the second-strand cDNA was synthesized using Second Strand Synthesis Enzyme Mix. The purified double-stranded cDNA (by AxyPrep Mag PCR Clean-up (Axygen) was then treated with End Prep Enzyme Mix to repair both ends and add a dA-tailing in one reaction, followed by a T-A ligation to add adaptors to both ends. Size selection of Adaptor-ligated DNA was then performed using AxyPrep Mag PCR Clean-up (Axygen), and fragments of ~360 bp (with the approximate insert size of 300 bp) were recovered. Each sample was then amplified by PCR for 11 cycles using P5 and P7 primers, with both primers carrying sequences which can anneal with flow cell to perform bridge PCR and P7 primer carrying a six-base index allowing for multiplexing. The PCR products were cleaned up using AxyPrep Mag PCR Clean-up (Axygen), validated using an Agilent 2100 Bioanalyzer (Agilent Technologies), and quantified by Qubit 2.0 Fluorometer (Invitrogen). Then, libraries with different indices were multiplexed and loaded on an Illumina HiSeq instrument according to manufacturer's instructions (Illumina). Sequencing was carried out using a 2 × 150 bp paired-end (PE) configuration; image analysis and base calling were conducted by the HiSeq Control Software (HCS) + OLB + GAPipeline-1.6 (Illumina) on the HiSeq instrument. The sequences were processed and analyzed by GENEWIZ.

Quality control
In order to remove technical sequences, including adapters, polymerase chain reaction (PCR) primers or fragments thereof, and quality of bases lower than 20, pass filter data of fastq format were processed by Trimmomatic (v0.30) into high-quality clean data.

Mapping
Firstly, reference genome sequences and gene model annotation files of relative species were downloaded from genome website, including UCSC, NCBI, ENSEMBL. Secondly, Hisat2 (v2.0.1) was used to index reference genome sequence. Finally, clean data were aligned to reference genome via software Hisat2 (v2.0.1).

Expression analysis
In the beginning, transcripts in fasta format were converted from known gff annotation file and indexed properly. Then, with the file as a reference gene file, HTSeq (v0.6.1) estimated the expression level of genes and isoforms from the pair-end clean data.

Differential expression analysis
The DESeq Bioconductor package, a model based on the negative binomial distribution, was used in the differential expression analysis. Benjamini and Hochberg's approach was T A B L E 1 List of primers used in qRT-PCR

Gene
Primer sequence (5′−3′) applied to control the false discovery rate, and genes with pvalue <.05 were considered differently expressed.

GO and KEGG enrichment analysis
GO-TermFinder was used to identify GO terms that annotate a list of enriched genes with a significant p-value less than.05. KEGG is a collection of databases dealing with genomes, biological pathways, diseases, drugs, and chemical substances (http://en.wikip edia.org/wiki/KEGG). We used scripts in-house to enrich significant differential expression gene in KEGG pathways.

| Quantitative real-time PCR
Total RNA was isolated with Trizol and reverse-transcribed into cDNA using the PrimeScript RT Master Mix kit (Takara Bio). The expression of specific genes was quantitated by real-time PCR using SYBR Premix Ex Taq kit (Takara Bio) on an ABI 7500HT machine (Applied Biosystems) with 18sRNA (rat) as internal control. Primers used in qRT-PCR are listed in Table 1.

| General RNA-seq analysis
Six pairs of retina samples from 3 normal rats and 3 diabetic SD male rats were sequenced 3 times. In total, we established six RNA-seq libraries and obtained over 45,000,000 clean reads in each library, in which more than 90% clean reads can be mapped to the reference genome. The detailed mapping data are listed in Table 2. The correlation test between samples verified the reliability of our test and the rationality of sample selection: control 1/DM1 group: R 2 = .834, control 2/DM2 group, R 2 = .968 (Figure 1a

| GO enrichment analysis
GO contains three ontologies, which are molecular function, cellular component, and biological process. In our study, we identified 66 enriched GO terms, among which 20 belong to molecular function, 22 belong to cellular component, and 24 belong to biological process. The three most enriched molecular function terms were binding, catalytic activity, and signal transducer activity. In the ontologies of cellular component and biological process, cell part, organelle, membrane part and biological regulation, cellular process, metabolic process ranked the three most enriched go terms, respectively (Figure 2).

| KEGG pathway enrichment analysis
In total, we identified 41 KEGG pathways and classified these pathways into five categories. Specifically, Human Disease category contained 20 pathways. Organismal System category contained 9 pathways. Metabolism category and Environmental Information Processing category contained 7 and 4 pathways, respectively, and Cell Processes category contained 1 pathway which named Phagosome. Thirty most enrichment items chosen for this study are shown in Figure 3. The Details are listed in Table 5-8.

| Validation of DEGS
In this study, qRT-PCR was used to confirm the results of RNA-seq. Overall, 25 most differentially expressed genes were selected, such as Asb15, Ltk, Eqtn, RT1-Ba, RT1-Bb. Changes in the expression levels of these genes were similar to those obtained following RNA-seq. The complete results are shown in Figure 4.

| DISCUSSION
This study examined retinas from WT and diabetic SD male rats to investigate the changes in a variety of retinal transcripts as a result of diabetes using RNA-seq. We identified a total of 118 DEGs, of which 72 were up-regulated and 46 were down-regulated. We also found 66 GO terms and 41 KEGG pathways which were significantly enriched by GO and KEGG analysis.
Top 10 most down-regulated and up-regulated genes are listed in Tables 3 and 4, and were confirmed by qRT-PCR showed in Figure 4. Asb15 gene is the most up-regulated one we identified and confirmed. Asb15 is a member of Asb gene family; the family has been reported to be involved in cell proliferation and differentiation (Hancock et al., 1991;Kohroki et al., 2001;Liu et al., 2003). The presence of both Ankyrin repeat and suppressors of cytokine signaling (SOCS) box motifs are characters of members of Asb gene family (McDaneld, Hancock, & Moody, 2004). Member of SOCS family plays important roles in the negative regulation of signaling pathways (Kile & Alexander, 2001;Zhang et al., 2001). SOCS3 acts as a regulator of inflammation through inhibiting JAK/STAT pathway (Tamiya, Kashiwagi, & Takahashi, 2011). Down-regulating SOCS3-STAT3 can alleviate DR (Chen, Lv, & Gan, 2017;Jiang, Thaksan, & Bheemreddy, 2014;Ye & Steinle, 2015). Ladinin-1(Lad1), a largely uncharacterized protein to date, was found to be related to the proliferation and migration of breast cancer cells (Roth, Srivastava, & Lindzen, 2018). Cell proliferation and migration are processes of neovascularization. Neovascularization is the sign of PDR, which can lead to serious vision loss of patients. Fibroblast growth factor 2 (Fgf2) is a member of fibroblast growth factors (FGFs) family. FGFs and their receptors have important roles in FGF2 was found overexpression in the early stage of DR, and it can destroy the blood-retinal barrier (Yang et al., 2018). Hemoglobin alpha adult chain 1 (Hba-a1) is one of the hemoglobin genes. Hemoglobin plays an important role in neuronal respiration, oxidative stress, and response to injury (He et al., 2010;Poh, Yeo, Stohler, & Ong, 2012;Richter, Meurers, Zhu, Medvedeva, & Chesselet, 2009). Neuronal respiration is an important life activity of neuronal cells. Neurological injury is one of the performances of DR. Inositol monophosphatase domain containing 1 (Impad1) encodes gPAPP, which is a Golgi-resident nucleotide phosphatase that hydrolyzes phosphoadenosine phosphate (PAP), the by-product of sulfotransferase reactions, to AMP. AMP-activated protein kinase (AMPK) signaling pathway plays vital roles in the diabetes-induced retinal inflammation (Kubota, Ozawa, & Kurihara, 2011). RT1-Bb, RT1-Ba, belongs to RT1 complex, which is the major histocompatibility complex (MHC) of rat (Eberhard & Lutz, 2001). It is believed that the MHC region is vital because it plays an important role in diseases, such as autoimmune and infectious diseases, vascular diseases like DR, hematological and neurological diseases (John, 2005). Collagen type III alpha 1 chain (Col3a1) is a kind of type III collagen, mainly existing in the extracellular matrix. Lacking of type III collagen can destroy the structure of connective tissues (Cortini et al., 2017). According to previous researches, it is associated with the aneurysm. Retinal microaneurysm is the early performance of DR. Col3a1 was also found significantly changed in RNA-seq of human PDR fibrovascular membranes (Lam et al., 2017). αA-crystallin (Cryga) and αF-crystallin (Crygf) are members T A B L E 3 Top 10 up-regulated genes in the retina of STZ-induced rats, compared with normal rats of crystallins, which were involved in different functions in various tissues (Clayton, Jeanny, Bower, & Errington, 1986;Head, Peter, & Clayton, 1991;Smolich, Tarkington, Saha, & Grainger, 1994). Knockout of αA-crystallin can inhibit ocular neovascularization (Xu, Bai, & Huang, 2015). More and more evidence indicated that inflammation (Adamis, 2002;Gologorsky, Thanos, & Vavvas, 2012) and neovascularization (Gardner & Davila, 2017;Nguyen et al., 2018) are important in the pathogenesis of DR. The results of the KEGG pathway significant enrichment analysis revealed two most enrichment items-cell adhesion molecules (CAMs) and PI3K-Akt signaling pathway. CAMs are proteins located on cell surface; the binding of CAMs to their receptors is important in the mediation of inflammatory and immune reactions (Golias et al., 2007). Previous studies have suggested that CAMs are important in the development of DR (Khalfaoui et al., 2009;Ugurlu et al., 2013). PI3K-Akt signaling pathway is a main downstream molecular pathway of insulin and is associated with DR neovascularization (Qin, Zhang, & Xu, 2015;Sasore, Reynolds, & Kennedy, 2014). Compared with the results of transcriptomic analysis which use CD31+ vascular endothelial cells obtained from human PDR fibrovascular membranes (FVM) (Lam et al., 2017), we found the gene Col3a1 were both significantly changed, and according to the GO enrichment analysis of two researches, Col3a1 may belong to blood vessel development term (GO:0001568), suggested that Col3a1 may play a role in DR neovascularization. We also found that the positive regulation of immune system process term (GO: 0002684) was both significantly enriched in two researches. It is a pity that we have not found KEGG analysis of retinas from human DR patients.
According to previous research which used rats' retinas (Yuan et al., 2016), 3 GO terms-biological regulation, response to stimulus, and metabolic processes-were significantly enriched, and KEGG pathway analysis showed that CAMs, complement and coagulation cascades, and antigen processing and presentation, PI3K-Akt signaling pathway were highly enriched; the above were the same with our research.
Compared with previous similar studies (John, Ram, & Jiang, 2018), less DEGs were identified in this study, which probably due to the individual differences of experimental animals and longer course of diabetes. Previous studies related to DR usually use rats injected with STZ for 2-3 months, we used rats sacrificed 6 months after injecting STZ because DR was definitely occurred at 4-6 months after the development of diabetes (Gong, Lu, & Hu, 2013;Kumar et al., 2013;Si et al., 2013). H b a -a 1 T n n t 2 E d n 2 F 9 E q t n A n k a r C n t n a p 5 b L a d 1