An integrated analysis to predict micro‐RNAs targeting both stemness and metastasis in breast cancer stem cells

Abstract Several evidences support the idea that a small population of tumour cells representing self‐renewal potential are involved in initiation, maintenance, metastasis, and outcomes of cancer therapy. Elucidation of microRNAs/genes regulatory networks activated in cancer stem cells (CSCs) is necessary for the identification of new targets for cancer therapy. The aim of the present study was to predict the miRNAs pattern, which can target both metastasis and self‐renewal pathways using integration of literature and data mining. For this purpose, mammospheres derived from MCF‐7, MDA‐MB231, and MDA‐MB468 were used as breast CSCs model. They had higher migration, invasion, and colony formation potential, with increasing in stemness‐ and EMT‐related genes expression. Our results determined that miR‐204, ‐200c, ‐34a, and ‐10b contemporarily could target both self‐renewal and EMT pathways. This core regulatory of miRNAs could increase the survival rate of breast invasive carcinoma via up‐regulation of OCT4, SOX2, KLF4, c‐MYC, NOTCH1, SNAI1, ZEB1, and CDH2 and down‐regulation of CDH1. The majority of those target genes were involved in the regulation of pluripotency, MAPK, WNT, Hedgehog, p53, and transforming growth factor β pathways. Hence, this study provides novel insights for targeting core regulatory of miRNAs in breast CSCs to target both self‐renewal and metastasis potential and eradication of breast cancer.

significant role of miRNAs in controlling the stemness and metastasis in CSCs. In this way, several miRNAs were known to be differentially expressed in CSCs or normal stem cells, also their role has been studied in targeting genes and networks involved in stemness properties, 5 cell proliferation, and differentiation. [6][7][8][9] A relationship between epithelial to mesenchymal transition (EMT) and self-renewal and mammosphere formation capacity has recently defined with ectopic expression of TWIST or SNAI in human mammary epithelial cells. 10 Consistently, mammosphere-forming activity is abrogated in breast CSCs after the EMT is shut down. 11 Alignment of EMT with the CSCs signature was also found in cells derived from a breast cancer lung metastasis. 12 More importantly, many signalling pathways, such as Wnt, Notch, and Hedgehog, that regulate EMT also drive self-renewal. [13][14][15] Based on our knowledge, identifying potential regulatory miRNAs responsible for self-renewal and EMT controlling could facilitate the detection of metastatic cell with the ability of seeding and enabling the discovery of therapeutic targets. Here, we presented an integrative experimental and computational approach for identifying miRNAs probably responsible for of CSCs potential and metastasis.

| Bioinformatics and computational analysis
First, we performed a systematic literature review on Pubmed and Coremine website to identify all related articles to our study with keywords: "Human breast cancer cell lines, CSC, self-renewal, stemness, microRNA, metastasis, and EMT." Briefly, we also looked for both miRNA and mRNA expression profiles on NCBI GEO database by searching the same keywords. Consequently, after the literature mining, studies with incomplete data were excluded from the analysis if (i) the review articles or letters, (ii) studies with insufficient or inaccessible data, and (iii) studies that are not related to CSCs and homo sapiens. After full text reviewing, all the miRNAs reported in each study were compiled in a list, and then, the most frequent miRNAs regulate the stemness and metastasis genes were highlighted. The targets of the miRNAs were predicted using Tar-getScan 16 and miRWalk. 17,18 Each miRNA list with their target genes was reviewed. As the most of miRNAs at least connected to two genes in metastasis list and to three genes in stemness list, therefore, we selected common miRNAs regulating at least three stemness and two metastasis genes ( Figure S1). Subsequently, we computed the differential expression fold changes and P-values (using two-sided Student's t test) between mammospheres vs adherent culture (at least two fold-change differential expression, P < 0.05).
Enrichr 19,20 on KEGG pathways was used to identify pathways that were affected by the target genes of each miRNAs. We also performed GO functional enrichment analysis (biological process, molecular function, and cellular component) by the same tool. The cut-off criterion was P < 0.05. In addition, network analysis of miRNA targets was constructed to visualize the interaction between miRNAs and their target genes that were integrated and mapped in a network structure using miRTargetLink Human. 21

| Generation of mammosphere cultures
To form mammospheres, we prepared two types of non-adherent plates. In one experimental group, the standard tissue culture plates were covered with 1.2% poly 2-hydroxyethyl methacrylate (p-HEMA) (Sigma), and in the other group, plates were covered with 1% agar to prevent the cells attachment. Subsequently, 2 × 10 4 cells of single cells in serum-free medium consisted of DMEM and supplemented with 20 ng/mL epidermal growth factor (Royan Institute, Iran), 20 ng/ mL basic fibroblast growth factor (Royan Institute), 2% B27 (no vitamin A; GIBCO, USA), and 2 mmol/L L-Glutamine. The media were refreshed every 48 hours (without removing the old media), and finally, the mammospheres were formed at 37°C under a 5% humidified CO 2 atmosphere after 7 days.

| Mammosphere-and colony-forming efficiency assay
Mammosphere-forming efficiency was calculated by dividing the number of mammospheres, which are greater than 60 μm in number of seeding cells. All experiments were performed in each generation of mammospheres in triplicates.
To compare the colony-forming capacity of adherent cells and mammospheres, 200 cells of each group were counted and replated in a complete medium containing DMEM supplemented with 10% FBS, 1% non-essential amino acids, 2 mmol/L L-glutamine, and 1% penicillin/streptomycin in six-well plates. After 10 days, cell colonies were fixed with 4% paraformaldehyde and stained with 0.05% crystal violet (Sigma) and the round shape colonies with more than 400 μm diameter were counted using an inverted microscope (Tokyo, Japan Microscope brand).

| Cell invasion and migration assay
Adherent cells and mammospheres of luminal phenotype (MCF-7) and triple negative basal phenotype (MDA-MB231 and MDA-MB468) were grown to 80% confluence; then, adherent cells were starved in serum-free medium the day before the assay. The next day, 1 × 10 5 cells seeded onto the top chambers of transwell inserts of 8-μm pore size filter (BD, USA) coated with 0.5 mg/mL matrigel (BD, USA) in a six-well plate. At the bottom of the chambers, DMEM containing 10% of FBS was added, and the cells were then cultured for 10 hours at 37°C in a 5% humidified CO 2 incubator.
Finally, cells on the top surface of the filter were removed by using a cotton swab. Cells on the bottom of the filter were then fixed with 4% paraformaldehyde for 30 minutes and stained with 0.05% crystal violet. The chambers were then washed in PBS, counted using an inverted microscope with either a 4× or a 10× objective lens using cell science software and plotted as the percentage of invading of the total number of plated cells. For cell migration assay, all steps were carried out similar to those in the invasion assay except for the matrigel coating. The experiments were performed in triplicates.

| RNA isolation and cDNA synthesis
Total RNAs with retention of small RNAs were extracted from the adherent cells (as control groups) and mammospheres (as experimental groups) using TRIzol reagent (Qiagen) according to the manufacturer's instructions. The concentration and purity of extracted RNA were determined by UV absorbance at 260 and 280 nm (260/ 280 nm) in spectrophotometer. The integrity of RNA samples was checked by gel electrophoresis. Two micrograms of total RNA was used to generate cDNA using cDNA synthesis kit (TaKaRa, Japan) according to the manufacturer's instructions.

| Real-time reverse transcriptase PCR
The expression level of stemness-and metastasis-related genes was evaluated using quantitative real-time reverse transcriptase PCR (RT-PCR). Ten microlitres of reactions containing 2.5 μL of SYBR Green PCR mix (TaKaRa, Japan) and 1 μL of each primer with 5 pmol/μL concentration was subjected for QRT-PCR using Applied Biosystems Instrument (ABI) (Thermo Fisher) with specific primers including stemness-related genes (OCT4, SOX2, NANOG, KLF4, NOTCH, c-MYC, and CD133) and metastasis-related genes (CDH1, CDH2, SNAIL1, SNAIL2, TWIST1, TWIST2, and ZEB1) ( Table S1). Expressions of these genes were normalized according to the expression of β-ACTIN. The PCR thermal profile included 95°C for 10 minutes, 40 cycles of denaturation at 95°C for 10 seconds, annealing at 60°C for 20 seconds, and elongation at 72°C for 20 seconds. A final melting curve analysis from 65 to 95°C was performed and the relative level was analysed using the 2 −ΔΔCT values.

| MiRNAs validation by real-time PCR
MiRNAs were evaluated by performing SYBR green qRT-PCR. In brief, 100 ng of total RNA containing the miRNAs was poly adenylated by poly (A) polymerase and reverse transcribed to cDNA using RT enzyme. First-strand cDNA synthesis reaction was provided in the PARSGENOME MiR-Amp kit according to the manufacturer's instructions. Each reaction was performed in a final volume of 10 μL containing diluted cDNA and PCR master mix and all reactions were run in triplicates. The qRT-PCR reaction was performed using Applied Bio systems real-time PCR Instruments (ABI) according to the manufacturer's protocol. The expression levels of miRNAs were normalized against internal controls U6 primer as a housekeeping gene control.

| miRNAs vs pathways heat maps
The DIANA-miRPath v3.0 was applied to create advanced visualizations of miRNAs contributed in self-renewal, EMT, and both self-renewal and EMT vs pathways heat maps. Heat maps are graphical representations of data where values in a matrix are represented as colours. 22 This web server uses the hierarchical clustering results of pathways and miRNAs on separate axes, in order to make the heat map visualization. These visualizations enabled us to identify patterns, which were not simply distinct their relationships and levels of interaction. "Significance Heat Maps" and "Targeted Pathways Heat Maps" are two options for heat map calculation, in the case of cluster analysis. Therefore, numerous miRNA-miRNA, miRNA-pathway, and pathway-pathway relationships were identified by using this tool.

| Survival analysis and definition of miRNArelated prognostic signature
For assessment of overall survival implications for significant micro-RNAs, the PROGmiR tool was used as publicly available data sets. 23 The breast cancer expression data were included the Cancer Genome Atlas dataset (https://cancergenome.nih.gov) and included 841 cases of breast invasive carcinoma (BRCA).

| Statistical analysis
In vitro characterization of mammospheres derived from MCF-7, MDA-MB231, and MDA-MB468 cells is presented as the mean ± SD of at least three different experiments. Two-tailed Student's t test and analysis of variance (ANOVA) were performed to evaluate the difference between the mean values. To detect the correlation of miRNA and mRNA expression levels, Spearman's rank correlation test was used. For this, each group was done at three independent replicate and each replicate was done as duplicate. A two-tailed with P < 0.05 was considered statistically significant for all experiments.
For functional enrichment analysis, target genes of selected miRNAs were submitted to Enrichr database.

| Agar-coated plate and DMEM as culture media enhance sphere formation ability in breast cancer stem cells
To find the best coating layer for providing low attachment surface for mammospheres formation of MCF-7 cells, six-well plates were coated with 1% agar and 1.2% poly-HEMA, respectively (described in Section 2). Moreover, DMEM alone, mixture of DMEM with methylcellulose (1.2%), and mixture of DMEM with matrigel (25 mg/mL) were used as culture media to find the best medium for mammosphere culture. As shown in Figure   However, MDA-MB231 and MDA-MB468 formed loose and grape shape spheres compared to MCF-7 that formed compact and dense mammospheres (Figure 2A). All mammospheres could be passaged continuously with significant increasing in the spheres formation ability ( Figure 2B). All mammospheres were dissociated and subjected to colony formation assay in 2D and 3D models.

| Mammospheres revealed higher rate of selfrenewal and invasion compared to their parental cells
The central part of each colony consisted of several layers of undifferentiated cells, whereas marginal part of each colony consisted of spindle and differentiated cells. Mammospheres derived from MCF-7 were highly clonogenic; however, the MDA-MB231mammospheres had lower clonogenic ability compared to adherent cells ( Figure 2C). There were no differences in clonogenic ability of mammospheres derived from MDA-MB468 and their adherent cells ( Figure 2C). Morphologically, colonies in mammospheres were compact and large that is a characterization of holoclones ( Figure 2D).   TargetScan 16 and miRWalk. 17,18 To find the best miRNAs to target both stemness and metastasis genes, a threshold was defined (described in Section 2). As a result, eight miRNAs including miR-200c-3p, miR-21-5p, miR-204-5p, miR-30c-5p, miR-34a-5p, miR-10b-5p, mir-520c-3p, and mir-373-3p were found to target self-renewal and EMT. The experimentally validated of target genes of each miRNAs has shown in Table 1.

| Selection of MicroRNAs and prediction of their target genes and Gene ontology analysis
In order to gain a better understanding of the specific biological functions of eight selected miRNAs, their target genes were identified from three database: miRTarbase, TargetScan, 16 and miRWalk. 17,18 We arrived to the set of 34 target genes in three steps: first, we establish the set of genes that had overlapped among the selected miRNAs. As a second step, the list of target genes was registered in the GO annotation data set for biological process, molecular function, and cellular components using Enricher. Third, the list was sorted based on P-value, number of genes, and known functions for mRNAs with self-renewal, stemness, invasion, or migration. The most significantly enriched genes were involved in biological process of the cell-cell adhesion, stem cell proliferation process, cell cycle, and EMT process ( Figure 5A).  Figure 5D). This confirmed the network of miRNA-mRNA interactions for selected eight miRNAs in regulation of self-renewal and EMT process ( Figure S2).

MicroRNAs targeting both stemness and EMT pathways
Eight above-selected miRNAs with potential to target self-renewal and EMT pathways were evaluated in mammospheres derived from different cell lines. Interestingly, as shown in Figure 6  showed more strong correlation with EMT-related genes (CDH1, CDH2, TWIST2) and then miR-10b (ZEB1, SNAIL2) and miR-34a (CDH1, CDH2) were relevant in above pathway (Table 2). To identify prognosis-related miRNAs, we first used the bivariate correlation analysis to evaluate the associations between the expression level of each of the eight differentially expressed miRNAs with EMT-and stemness-related genes and found that eight miRNAs (Table 2) were significantly associated with the genes expression (P < 0.05).
In order to recognize the role of miRNAs in cancer development, the DIANA-mirPath analysis of the selected miRNAs in each group was performed. As shown in Figure 6, most of miRNAs significantly modulated in most of cancers including glioma, pancreatic, bladder, nonlong carcinoma, prostate, thyroid cancers, and also p53 pathway, cell cycle, pyrimidine metabolism, cell cycle, and DNA replication. Using PROGmiR made us able to create a significant diagnostic plot between the expression level of each set of miRNAs and patients overall survival. As shown in Figures S3 and S7, the expression levels of most miRNAs and also combination of those miRNAs in group 1 (as targeting for EMT) and group two (as targeting for stemness) had no significant effect on the survival rate of breast cancer patients ( Figure 7A,B). However, combination of miR-204, miR-200c, miR-T A B L E 1 miRNAs involved in breast cancer metastasis and selfrenewal along with their target genes microRNA Metastasis genes  Stemness genes   miR-10b  CDH1, CDH2, MYC,  SNAIL1, SALL4, SMAD4,  TWIST1, ZEB1 34a, and miR-10b significantly reduced the survival rate of breast invasive carcinoma (P: 0.03, Figure 7C). Moreover, their main targets ing ZEB1 and SNAIL1, 37,38 similarly stemness gene such as NOTCH1. 39,40 The down-regulation of miR-200c was speculated to be the reason of high radiation tolerance in tumour cells. 41 The expression of miR-30c also is associated with the acquisition of EMT phenotype by directly targeting of ZEB1, CDH1, and SNAIL1 in breast tumours 42,43 and has role in self-renewal by directly targeting of NOTCH1, c-MYC, and CD44. 40 Although the role of miR-200c, miR-21, and miR-30c in regulation of CSCs is well defined, the role of miR-204 in cancers and CSCs is controversial. Most of the studies have reported miR-204 as a tumour suppressor gene, and its lower expression is significantly associated with a more aggressive tumour phenotype in breast cancer, 44 with poor clinical outcome of acute myeloid leukaemia patients, 45 poor survival in colorectal cancer patients, 46 and inhibit migration and invasion of cervical cancer. 47 Several studies also suggested miR-204 as an onco-miR in breast cancer 48   miR-34a and miR-10b target/regulate some of stemness genes and EMT factors but with lower correlation to both pathways. The role of both miRNAs in BCSCs has been reported previously in several studies. [52][53][54] The pathway analysis revealed that these genes were significantly related to the "CAMs," "pathways in cancer," "MAPK signalling pathway," "Wnt signalling pathway," "Hedgehog signalling pathway," "Hippo signalling pathway," "TGF-β signalling pathway," "Signalling pathways regulating pluripotency of stem cells," "p53 signalling pathway," "ABC transporters," and "Cell cycle." All these pathways have been demonstrated to be linked to various cellular activities including proliferation, migration, invasion, formation of multicellular spheroid, regulation of oestrogen receptor signalling, cancer progression, metastasis, self-renewal in cancer and CSCs, maintenance of EMT and stemness, and breast cancer chemoresistance. Interestingly, TGF-β induces miR-21 55

| CONCLUSIONS
To the best of our knowledge, this is the first demonstration of the involvement of a combination of specific miRNAs in the coordinated regulation of BCSC proliferation, EMT, and differentiation. We suggested here that the miR-204, miR-200c, miR-34a, and miR-10b form a core regulatory network for induction of