In silico prediction of potential miRNA‐disease association using an integrative bioinformatics approach based on kernel fusion

Abstract Accumulating experimental evidence has demonstrated that microRNAs (miRNAs) have a huge impact on numerous critical biological processes and they are associated with different complex human diseases. Nevertheless, the task to predict potential miRNAs related to diseases remains difficult. In this paper, we developed a Kernel Fusion‐based Regularized Least Squares for MiRNA‐Disease Association prediction model (KFRLSMDA), which applied kernel fusion technique to fuse similarity matrices and then utilized regularized least squares to predict potential miRNA‐disease associations. To prove the effectiveness of KFRLSMDA, we adopted leave‐one‐out cross‐validation (LOOCV) and 5‐fold cross‐validation and then compared KFRLSMDA with 10 previous computational models (MaxFlow, MiRAI, MIDP, RKNNMDA, MCMDA, HGIMDA, RLSMDA, HDMP, WBSMDA and RWRMDA). Outperforming other models, KFRLSMDA achieved AUCs of 0.9246 in global LOOCV, 0.8243 in local LOOCV and average AUC of 0.9175 ± 0.0008 in 5‐fold cross‐validation. In addition, respectively, 96%, 100% and 90% of the top 50 potential miRNAs for breast neoplasms, colon neoplasms and oesophageal neoplasms were confirmed by experimental discoveries. We also predicted potential miRNAs related to hepatocellular cancer by removing all known related miRNAs of this cancer and 98% of the top 50 potential miRNAs were verified. Furthermore, we predicted potential miRNAs related to lymphoma using the data set in the old version of the HMDD database and 80% of the top 50 potential miRNAs were confirmed. Therefore, it can be concluded that KFRLSMDA has reliable prediction performance.


| INTRODUCTION
A microRNA (miRNA) is a small non-coding RNA molecule (containing about 22 nucleotides) found in plants, animals and some viruses, and functions in RNA silencing and post-transcriptional regulation of gene expression. 1,2 While miRNAs are usually located within the cell, some miRNAs have also been discovered in extracellular environment. 3 The miRNAs in distinct tissues and growth stages can differ significantly and thus may have different spatial and temporal expression patterns. 4 It is commonly believed that these small molecules have a wide range of regulation effects on eukaryotic gene expression based on a cornucopia of experiments. 5 Accumulating evidence revealed that miRNAs are important components in cells, which could play significant roles in multiple biological processes, including cell proliferation, 6 development, 7 differentiation, 8 signal transduction 9 and viral infection. 8 Furthermore, miRNAs play crucial roles in the regulation of stem cell progenitors differentiating into adipocytes. 10 Therefore, it is no surprise that the dysregulation of miRNAs is related to a number of human complex diseases.
The first human disease discovered to be associated with dysregulation of miRNAs is chronic lymphocytic leukaemia. 11 Since then, many miRNAs also have been verified to have links with cancers.
For instance, the levels of miR-27b and miR-134 were found to be significantly lower in lung tumours than in normal tissue, indicating that they are associated with lung cancer. 12 Also, five members of the microRNA-200 family (miR-200a, miR-200b, miR-200c, miR-141 and miR-429) are all down-regulated in tumour progression of breast cancer. 13 In addition to cancers, studies have shown that a mutation in the seed region of miR-96 caused hereditary progressive hearing loss 14 and a mutation in the seed region of miR-184 caused hereditary keratoconus with anterior polar cataract. 15 Although scientists have already discovered plenty of associations between miRNAs and diseases, identifying associations by applying experimental methods to each candidate association is extremely expensive and time-consuming. As plenty of miRNA-related data sets are currently available, computational methods can be applied to predict potential miRNA-disease associations. So far, computational methods have proven efficient in predicting miRNA-disease associations in that they can select the most promising candidate miRNAs for further experimental studies. Nevertheless, further efforts are still needed to develop more effective computational models for miRNA-disease association prediction.
Many computational methods have been proposed to predict the potential associations between miRNAs and diseases, most of which were developed based on the assumption that miRNAs with similar functions are more likely to have connections with diseases of similar phenotypes. [16][17][18][19][20][21] Every time a new model was proposed, the prediction accuracy would be increased. In 2010, a hypergeometric distribution-based model was presented by Jiang et al 22 to predict miRNA-disease associations, where disease phenotype similarity, miRNA functional similarity and known human disease-miRNA associations were integrated. In 2013, Shi et al 23 used the information of proteins as a bridge between miRNAs and diseases, according to the fact that miRNAs whose target genes are related to certain diseases are more likely to be associated with these diseases. Their model implemented a random walk algorithm on a protein-protein interaction (PPI) network and utilized miRNA-target interactions, disease-gene associations and PPI to obtain possible associations between miRNAs and diseases. Furthermore, in 2014, Mork et al 24  MiRAI to represent the distributional information on miRNAs and diseases in a high-dimensional vector space. The vector space consisted of the miRNA-disease association matrix, the miRNA-neighbour association matrix, the miRNA-target association matrix, the miRNA-word association matrix and the miRNA-family association matrix. Singular value decomposition (SVD) was performed on the space for dimensionality reduction, and the association score for a miRNA-disease pair was given by the cosine similarity between the miRNA in the miRNA space and the disease in the disease space.
However, all the above methods have a common problem of high false positives and false negatives in miRNA-target interactions, which resulted in a huge reduction of prediction accuracy.
To address the problem, several other researchers avoided using miRNA-target interactions in computational models. Instead, they built models from the known miRNA-disease association data, the miRNA similarity (a measure that quantifies the similarity between two miRNAs) and the disease similarity (a measure that quantifies the similarity between two diseases). In 2013, Xuan et al 27 proposed a model named HDMP that analysed disease-related miRNAs by considering the miRNAs' k most similar neighbours in the miRNA similarity network. HDMP assigned higher weights to miRNAs in the same cluster or family, and higher weights would indicate a greater association probability between miRNAs and diseases. HDMP was a pioneering work in the topic of miRNA-disease association inference. Nonetheless, it had a major drawback that it would fail to work when applied to new diseases without known related miRNAs, as it heavily relied on the neighbours of the miRNAs. In 2012, Chen et al 28 introduced Random Walk with Restart for MiRNA-Disease Association prediction (RWRMDA), which combined the miRNA similarity and known miRNA-disease associations to make predictions. As global similarity measures were superior to local similarity measures (as had been used in HDMP and others) in making predictions, the performance of RWRMDA was better than that of previous models. However, like HDMP, this method could not predict miRNAs associated with new diseases without any known related miRNAs, either. To solve this issue, Chen et al 29  where semi-supervised learning on the miRNA/disease space was implemented. However, it should be noted that it is usually hard to find appropriate parameters for the model and difficult to integrate the classifiers from miRNA space and disease space. 
In addition to RLSMDA, Chen et al 34  From the performance of these three models, it can be concluded that ensemble approach leverages the advantages of individual methods and thus is a powerful tool for link prediction.
In this paper, we presented such an ensemble-based model to

| Brief Introduction to KFRLSMDA
KFRLSMDA was based on a semi-supervised ensemble learning approach. Here, 'semi-supervised' means that unlabelled samples instead of negative samples (ie miRNA-disease pairs confirmed to be unassociated) were used to train the model; and 'ensemble' means that two classifiers from the miRNA and disease spaces, respectively, were combined to yield a higher predictive accuracy. The inputs to the model included three data sets: (a) the miRNA-miRNA functional similarity that was calculated using the overlap in disease associations of a given pair of miRNAs; (b) the disease-disease similarity that was gained through computing shared part of their directed acyclic graph (DAG); and (c) the miRNA-disease association network that described whether a miRNA-disease pair was linked or not. The model's output was a list of association scores for each miRNA-disease pair, and a high score would indicate a strong association likelihood between the pair.

| Performance evaluation
Cross-validations were used as the evaluation scheme for our model, and known miRNA-disease associations in the HMDD v2.0 database 38 were used as the training data. Specifically, we applied three types of cross-validations, namely, global leave-one-out cross-validation (LOOCV), local LOOCV and 5-fold cross-validation. To prove the effectiveness of the algorithm, KFRLSMDA was compared with 10 previous computational methods: MaxFlow,39 RKNNMDA, 40 MiRAI, 26 HDMP, 27 RWRMDA, 28 WBSMDA, 29 HGIMDA, 30 RLSMDA, 33 MIDP 41 and MCMDA. 42 In LOOCV evaluation, each known association in the database was considered as the test sample in turn while the other known associations were viewed as training samples. Additionally, those miRNA-disease pairs without known association evidence were regarded as potential candidates for true associations. KFRLSMDA generated association scores for all miRNA-disease pairs. In global LOOCV, the score of the test sample was ranked against that of all candidate samples, whereas in local LOOCV the score of the test sample was only ranked against that of candidate samples for a particular disease. In other words, local LOOCV evaluated predictions made for a specific disease, while global LOOCV assessed predictions made across all diseases. In 5-fold cross-validation, the known miRNA-disease associations were randomly divided into five subsets with equal size. Each time, we selected one subset as test samples, leaving the remaining four subsets as training samples. Again, those miRNA-disease pairs without association evidence were considered as candidate samples. Like in global LOOCV, the score of each test sample was ranked against that of all candidate samples, respectively. This procedure was repeated five times until each known association was used as test sample and with its score ranked; and those test samples whose ranks surpassed a given threshold would be considered as successful predictions.
Up to this point, the 5-fold cross-validation process was completed.
We repeated this process 100 times to examine the variance of KFRLSMDA's prediction performance.
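The ranking step shared by the three cross-validation schemes can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the score matrix `scores` and the adjacency matrix `A` (assumed here to have diseases as rows and miRNAs as columns) are hypothetical, and the scores are assumed to come from a model trained with the test association hidden.

```python
import numpy as np

def rank_test_sample(scores, A, test, scope="global"):
    """Rank of a held-out association among candidate pairs (1 = best).

    scores : (n_diseases, n_mirnas) predicted scores (hypothetical input).
    A      : binary matrix of known associations, same shape as scores.
    test   : (disease_index, mirna_index) of the held-out known association.
    """
    d, m = test
    candidates = (A == 0)  # pairs without known association evidence
    if scope == "local":
        # local LOOCV: only candidates of the same disease compete
        cand_scores = scores[d, candidates[d]]
    else:
        # global LOOCV and 5-fold CV: candidates of all diseases compete
        cand_scores = scores[candidates]
    return 1 + int(np.sum(cand_scores > scores[d, m]))

def five_fold_split(A, seed=0):
    """Randomly divide the known associations into five near-equal subsets."""
    rng = np.random.default_rng(seed)
    known = np.argwhere(A == 1)  # one (disease, miRNA) index pair per row
    rng.shuffle(known)           # shuffles the rows in place
    return np.array_split(known, 5)
```

A test sample counts as a successful prediction when its rank surpasses the chosen threshold; sweeping the threshold yields the points of the ROC curve.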
Subsequently, the receiver operating characteristics curve (ROC) was drawn to visualize KFRLSMDA's (and ten previous models') performance at different ranking thresholds, and thereby to calculate the performance evaluation metric, area under the ROC curve (AUC). The ROC curve is created by plotting the true-positive rate (TPR, sensitivity) against the false-positive rate (FPR, 1-specificity) at various threshold settings. In our study, sensitivity represented the percentage of positive miRNA-disease test samples whose rankings exceeded the given threshold while specificity represented the percentage of negative miRNA-disease associations whose rankings were lower than the threshold. When calculating FPR, we regarded all miRNA-disease pairs without confirmed associative relationship

| Case studies
To further demonstrate the reliable performance of KFRLSMDA, we carried out case studies on five diseases, namely, Breast Cancer, Colon Cancer, Esophageal Cancer, hepatocellular cancer and lymphoma. These diseases were selected in our case studies because they are the most common cancer types, with high incidence and death rate each year. In addition, they have been used as case studies in many previous publications. 22,27,30,33,40,41,43 Unlike cross-validations that solely depended on HMDD v2.0, our case studies used Omnibus (GEO) database, which was the largest public repository for high-throughput gene expression data. To control the data quality, authors of dbDEMC only selected experiments with at least three biological duplicates. From our perspective, the two databases were both considered to be reliable in validating the case studies, although they seemed to have different focuses: one consisted of more disease types while the other covered more miRNAs. By inner joining the two databases, we found that there were 374 overlap associations between them. This was 19.3% of miR2Disease and 20.6% of dbDEMC. As for the statistical analysis between these two databases and HMDD v2.0, the results showed that 232 and 546 miRNA-disease associations were overlapped between miR2Disease and HMDD v2.0, dbDEMC and HMDD v2.0, respectively. The ratios of the overlaps were both small relative to the number of 5430 samples in training database.
The top 10 and top 50 predicted candidate miRNAs related to these diseases were examined by the two validation databases. In our work, the way of validating top 10/50 miRNAs against evidence databases was consistent with that in most previous studies on miRNA-disease association prediction. 23,27,28,30,33,40,41,43 A candidate miRNA was unlinked with the investigated disease according to HMDD v2.0. This means that there has been no evidence supporting the association between the miRNA and the disease. Thus, their associative relationship was to be examined by our model, and the miRNA was named 'candidate'. It is worth emphasizing that only candidate miRNAs for each investigated disease were prioritized and subsequently verified by evidence databases. Therefore, there was no overlap between the training samples and the prediction lists.  Table 1).
Among the 42 confirmed miRNAs, three were supported by both databases. Among the eight unconfirmed miRNAs, six were verified by more recent studies and their PMID is recorded in Table 1.
For example, miR-151's association with breast neoplasms was suggested by recent studies because miR-151-3p was found to target TWIST1 gene to suppress the migration of breast cancer cells 48 and miR-151-5p up-regulation might inhibit metastasis in primary breast tumours. 49 Another example is that miR-216b could suppress breast cancer cell growth and metastasis by targeting SDCBP gene. 50 Therefore, 48 of the top 50 candidate miRNAs for breast neoplasms were supported by either database or literature evidence.
Colon Neoplasm, diagnosed mostly in the boundary of rectum and sigmoid colon, 51 is the third most common cancer and imposes great threats on both men and women in the United States. 52 Studies showed that about half of the Colon Neoplasm patients die of metastatic disease within 5 years from diagnosis. 53,54 Detecting this disease is difficult, particularly at early stages, because only subtle symptoms can be noticed in early Colon Neoplasm patients. 55 MiRNAs seem to be a novel, potential diagnostic tool for colon neoplasms, and many miRNAs have been confirmed to be correlated with the disease. For example, miR-126, often found to be deficient in Colon Neoplasm patients, can restrict neoplastic cells growth via targeting phosphatidylinositol 3-kinase signalling. 56 Another example is miR-145 targeting the insulin receptor substrate-1 and also suppressing Colon Neoplasm cell growth. 57 KFRLSMDA was implemented to predict the top 50 potential miRNAs related to colon neoplasms. As a result, nine of the top 10 and 45 of the top 50 candidates were verified by dbDEMC and miR2Disease database (see Table 2). Among the 45 confirmed miRNAs, 26 were supported by both databases. In addition, all the five unconfirmed miRNAs were verified by more recent studies and their PMID is recorded in Table 2. For example, miR-92a was suggested by experiments to be correlated with the tumour-node-metastasis (TNM) stage, the lymph node and distant metastases, and the survival rate of colon neoplasms. 58 Another example is that overexpressed miR-101 could suppress the proliferation, stimulate cell cycle arrest and promote apoptosis of colon cancer SW620 cells. 59 Therefore, 50 of the top 50 candidate miRNAs for colon neoplasms were supported by either database or literature evidence.
As reported, Esophageal Neoplasm is the sixth leading cause of deaths related to cancers and the eighth most common cancer worldwide based on the pathological characteristics. 60 Males are more likely to get the disease based on the fact that the number of male patients is three to four times higher than the number of the female patients. 61 As has been suggested, if the tumours could be diagnosed at an early stage, the survival rate could increase to 90%, 62 which means that the early detection of oesophageal neoplasms is critical to cancer treatment. 63 database (see Table 3). Among the 44 confirmed miRNAs, one was supported by both databases. Among the six unconfirmed miRNAs, miR-218 was found to inhibit the growth of oesophageal squamous cell carcinoma (ESCC) and could enhance the chemo-sensitivity of ESCC to cisplatin. 66 The PMID of the supporting literature for miR-218 is recorded in Table 3. Therefore, 45 of the top 50 candidate miRNAs for oesophageal neoplasms were supported by either database or literature evidence.
To analyse the distributional difference between the scores of confirmed candidate miRNAs and the scores of unconfirmed ones, for each disease we separated its candidate miRNAs into two groups. One group contained candidates confirmed by miR2Disease and/or dbDEMC and the other held the remaining unconfirmed candidates. Then, we obtained the corresponding scores of miRNAs in the two groups and carried out the non-parametric Wilcoxon rank sum test for a difference in mean ranks of the distributions for the two groups' scores. The null hypothesis was that the two lists' distributions had the same mean rank, and the alternative hypothesis was unequal mean ranks. The significance level was set to be α = 0.05.
For breast neoplasms, there were 145 confirmed candidate miRNAs and 148 unconfirmed ones (the scores can be found in Table S1).
The predicted scores were higher for the confirmed group than for the unconfirmed group (means: 0.009386613 and 0.005127228, respectively; P = 1.511e-09). For colon neoplasms, there were 145 confirmed candidate miRNAs and 346 unconfirmed ones (the scores can be found in Table S2). The predicted scores were higher for the confirmed group than for the unconfirmed group (means: 0.0009716386 and 0.0001703209, respectively; P < 2.2e-16). For oesophageal neoplasms, there were 208 confirmed candidate miRNAs and 213 unconfirmed ones (the scores can be found in Table S3). The predicted scores were higher for the confirmed group than for the unconfirmed group (means: 0.00471542 and 0.00225310, respectively; P < 2.2e-16). It can be seen from the test results that across all three diseases the scores for confirmed and unconfirmed miRNAs differed significantly from each other.
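The test described above can be illustrated with a small pure-Python sketch of the two-sided Wilcoxon rank sum test. It uses the normal approximation with average ranks for ties and no tie or continuity correction, so its P-values may differ slightly from those of a statistical package; the function name is hypothetical and the sketch is not the code used to produce the reported values.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation).

    Ties receive average ranks; no tie or continuity correction is applied.
    Returns the z statistic (positive when x tends to rank higher) and P.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg
        i = j + 1
    n1, n2 = len(x), len(y)
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)  # rank sum of x
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided P-value
    return z, p
```

Here `x` would hold the scores of the confirmed candidates and `y` those of the unconfirmed ones; the null hypothesis is rejected when P falls below α = 0.05.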
The results of the case studies on the three human diseases mentioned above support that KFRLSMDA had satisfactory prediction performance. Moreover, we prioritized the potentially associated miRNAs for all the human diseases in the HMDD database (see Table S4). Readers interested in the predicted miRNAs associated with a specific disease can find them by searching for that disease in the provided list. We also provided the code of KFRLSMDA, which can be obtained from https://github.com/AnnaGuan/KFRLSMDA. We hope that the predictions of KFRLSMDA can be verified in future research.
In order to evaluate the prediction ability of KFRLSMDA in special diseases without any known related miRNAs, hepatocellular cancer is used as an example in our experiment. This cancer was chosen as the case study because it is a major cancer type and has been frequently used in previous literatures. Including it in our case studies would enable further comparison of different models' predictive performance for the same disease. Basically, all miRNAs known to be related to hepatocellular cancer were removed and we predicted potential related miRNAs by using other diseases-related miRNA information and similarity information. As a result, 10 out of the top 10 and 44 out of the top 50 predicted hepatocellular cancer-related miRNAs were experimentally verified by reports from dbDEMC, miR2Disease and HMDD database (see Table 4). Among the six unconfirmed miRNAs, five were verified by more recent studies and their PMID is recorded in    Table 5). Among the 12 unconfirmed miRNAs, miR-128b was found to be down-regulated in classic Hodgkin lymphoma (cHL) with Epstein-Barr virus (EBV) 69 ; miR-142-5p, the 5p arm of miR-142, suppressed the proapoptotic gene TP53INP1 as its target and played a pivotal role in the pathogenesis of gastric MALT lymphoma. 70 The PMIDs of the supporting literatures for these two miRNAs are recorded in Table 5. Therefore, 40 of the top 50 candidate miRNAs for lymphoma were supported by either database or literature evidence.

| DISCUSSION
To date, many computational methods have been proposed to predict the potential associations between miRNAs and diseases.
It is widely believed that computational models could yield the most potential miRNAs related to human diseases and are a valuable complementary tool for experimental methods. 28 We believe that the following factors are the main reasons for KFRLSMDA's reliable performance. First, although other methods are also using HMDD, our model was the first to apply the fusion technique that integrated multiple data sets in a novel way.
KFRLSMDA fused the miRNA functional similarity matrix and the Gaussian interaction profile kernel similarity matrix together instead of simply averaging these two matrices, and the same was true with

TABLE 4 Prediction of the top 50 predicted miRNAs associated with hepatocellular cancer by removing miRNAs known related to hepatocellular cancer and predicting potential related miRNAs using other diseases-related miRNAs.

research. For example, we would consider adding the difference between tissue-specific expression of miRNAs to our model.

| MATERIALS AND METHODS
As mentioned in the Results section, KFRLSMDA took three input data sets, namely, the miRNA-disease associations, the miRNA functional similarity and the disease semantic similarity. The miRNA-disease association data would firstly be used to generate

| Human miRNA-disease associations
The miRNA-disease association data were the first input data set of our model. As with previous studies, 27,30,33,41,43 we used HMDD v2.0 38 as the training database to learn KFRLSMDA for cross-validations and case studies, and adopted miR2Disease 45

| Disease semantic similarity
The third input data set was the disease semantic similarity, which we obtained from Xuan et al 27 and which was calculated by describing each disease as a directed acyclic graph (DAG) according to the disease MeSH descriptors from the National Library of Medicine (http://www.nlm.nih.gov). In a DAG, the nodes denoted the disease itself as well as its ancestor diseases, while the links between the parent nodes and the children nodes represented the relationship between diseases. To illustrate this, disease D could be described as DAG(D) = (D, T(D), E(D)), where T(D) was the node set including D and its ancestors and E(D) was the corresponding link set.
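We defined the contribution of disease d in DAG(D) to the semantic value of disease D as follows:

$$D_D(d) = \begin{cases} 1 & \text{if } d = D \\ \max\{\delta \cdot D_D(d') \mid d' \in \text{children of } d\} & \text{if } d \neq D \end{cases} \tag{1}$$

where δ was the semantic contribution factor, which was fixed at 0.5. 27 The contribution score for disease d therefore decreased as the distance between d and D increased. We defined the semantic value of disease D as follows:

$$DV(D) = \sum_{d \in T(D)} D_D(d) \tag{2}$$

Intuitively, if two diseases shared a larger part of their DAGs, they should have a higher similarity score. In this regard, the semantic similarity between disease d(i) and d(j) was defined as follows:

$$S_D(d(i), d(j)) = \frac{\sum_{t \in T(d(i)) \cap T(d(j))} \left( D_{d(i)}(t) + D_{d(j)}(t) \right)}{DV(d(i)) + DV(d(j))} \tag{3}$$

The resulting matrix S_D was the disease semantic similarity.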
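The DAG-based similarity described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the DAG is given as a hypothetical dictionary `parents` mapping each disease to its direct parents, and the function names are not taken from the original work.

```python
def contributions(D, parents, delta=0.5):
    """Semantic contribution of D and of each of its ancestors to disease D."""
    contrib = {D: 1.0}  # a disease contributes 1 to its own semantic value
    frontier = [D]
    while frontier:
        node = frontier.pop()
        for p in parents.get(node, []):
            c = delta * contrib[node]
            if c > contrib.get(p, 0.0):  # keep the maximum over all paths
                contrib[p] = c
                frontier.append(p)
    return contrib

def semantic_similarity(d1, d2, parents, delta=0.5):
    """Shared contributions divided by the sum of the two semantic values."""
    c1 = contributions(d1, parents, delta)
    c2 = contributions(d2, parents, delta)
    shared = set(c1) & set(c2)  # nodes common to both DAGs
    numerator = sum(c1[t] + c2[t] for t in shared)
    return numerator / (sum(c1.values()) + sum(c2.values()))
```

For two sibling diseases that share a single parent and δ = 0.5, each semantic value is 1.5 and the shared contribution is 1, giving a similarity of 1/3; a disease compared with itself scores 1.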

| Gaussian interaction profile kernel similarity
Inspired by the literature, 77 we computed the Gaussian interaction profile kernel similarity for diseases and miRNAs to capture the key features of the miRNA-disease association data. Construction of this kernel similarity was based on the assumption that similar diseases tend to have associations with miRNAs with similar functions. Binary vector IP(d(u)) was defined to represent the interaction profile of disease d(u) by observing whether there were known associations between disease d(u) and each miRNA. In this regard, we defined the Gaussian interaction profile kernel similarity for diseases d(u) and d(v) as:

$$K_D(d(u), d(v)) = \exp\left(-\gamma_d \left\| IP(d(u)) - IP(d(v)) \right\|^2\right) \tag{4}$$

where γ_d was a parameter used for kernel bandwidth control, which could be acquired by normalizing a new bandwidth parameter γ'_d by the average number of associated miRNAs for each disease:

$$\gamma_d = \gamma'_d \Big/ \left( \frac{1}{n_d} \sum_{u=1}^{n_d} \left\| IP(d(u)) \right\|^2 \right) \tag{5}$$

where n_d was the number of diseases. In the same way, the Gaussian interaction profile kernel similarity between miRNA m(i) and m(j) was defined as:

$$K_M(m(i), m(j)) = \exp\left(-\gamma_m \left\| IP(m(i)) - IP(m(j)) \right\|^2\right) \tag{6}$$

$$\gamma_m = \gamma'_m \Big/ \left( \frac{1}{n_m} \sum_{i=1}^{n_m} \left\| IP(m(i)) \right\|^2 \right) \tag{7}$$

where n_m was the number of miRNAs. Together with the abovementioned three input data sets, matrices K_D and K_M calculated from Equations (4) and (6) were also fed into KFRLSMDA to facilitate subsequent computational steps.
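The kernel construction described above can be sketched with NumPy as follows. This is a minimal sketch, not the authors' code: the function name is hypothetical, the default `gamma_prime=1.0` is an assumption (the value of the unnormalized bandwidth is not stated in this section), and the adjacency matrix is assumed to have diseases as rows and miRNAs as columns.

```python
import numpy as np

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel.

    profiles    : binary matrix with one interaction profile per row.
    gamma_prime : unnormalized bandwidth (assumed value, not from the text).
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = np.sum(profiles ** 2, axis=1)  # associations per entity
    gamma = gamma_prime / sq_norms.mean()     # bandwidth normalization
    # squared Euclidean distances between all pairs of profiles
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2 * profiles @ profiles.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))
```

With an adjacency matrix `A` laid out as assumed, `gip_kernel(A)` would give the disease kernel and `gip_kernel(A.T)` the miRNA kernel.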

| KFRLSMDA
We developed the computational model of KFRLSMDA by combining the miRNA-disease association data, the miRNA functional similarity, the disease semantic similarity and the Gaussian interaction profile kernel similarity to predict potential miRNA-disease associations (see Figure 2). Basically, our algorithm was divided into three parts, namely, kernel fusion of data sets, regularized least squares classifiers in the miRNA and disease spaces, and ensemble of the two classifiers.

| Kernel fusion of data sets
Instead of simply integrating similarity matrices using linear combination like many previous studies in computational biology, here we adopted nonlinear kernel fusion on our data sets. To be more specific, kernel fusion was carried out in both the miRNA space (involving S M and K M ) and the disease space (involving S D and K D ).
In the miRNA space, we firstly made S_M positive semi-definite by adding a scaled identity matrix using the formula K_SM = S_M + ε · I_SM, where I_SM was the identity matrix with the same size as S_M 77 and ε was a small positive value assumed to be 0.1 (and could be optimized further). Secondly, K_SM was row-normalized so that each row summed to one, and its symmetric version PM_1 was obtained by taking the average of the row-normalized K_SM and its transpose. Thirdly, the local similarity matrix for PM_1 was calculated by the following equation

$$LM_1(i,j) = \begin{cases} \dfrac{PM_1(i,j)}{\sum_{l \in N_i} PM_1(i,l)} & \text{if } j \in N_i \\ 0 & \text{otherwise} \end{cases} \tag{8}$$

where N_i denoted the nearest neighbours of the current miRNA m(i).
In our work, we used four nearest neighbours (k = 4). This matrix LM_1 captured the local information of PM_1. In addition, we also calculated a row-normalized symmetric version of K_M, which was denoted by PM_2; and we obtained the local similarity matrix LM_2 according to Equation (8).
Inspired by Tu et al, 78 in the ensuing step we iteratively updated PM_1 and PM_2 according to

$$PM_1^{(t+1)} = LM_1 \times PM_2^{(t)} \times LM_1^{T}$$

$$PM_2^{(t+1)} = LM_2 \times PM_1^{(t)} \times LM_2^{T}$$

where PM_1^{(t+1)} was the status matrix of K_SM after t + 1 iterations and PM_2^{(t+1)} was the status matrix of K_M. As has been pointed out by Tu et al, 78 the process above could loosely be considered as a diffusion process. Notice that, at the end of each iteration, both status matrices were further changed by adding an identity matrix to each of them. The resulting matrices were then used in the next iteration. The number of iterations could be set by the user, and we set it to 2 in our study. After the iterations, the two final status matrices were averaged, K^M_kf = (PM_1^{(t)} + PM_2^{(t)}) / 2, and then K^M_kf was row-normalized. Here, M was the shorthand for miRNAs, meaning K^M_kf was the kernel fusion matrix in the miRNA space. Finally, we further transformed the resulting matrix by K^M_kf = (K^M_kf + (K^M_kf)^T + I) / 2, which was the final fusion matrix. The fusion steps are illustrated in the left part of Figure 2. We computed the fusion matrix K^D_kf in the disease space in the same way (as depicted in the right part of Figure 2).
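The fusion steps in this subsection can be sketched with NumPy as follows. This is a hedged illustration and not the released KFRLSMDA code: the function names are hypothetical, the cross-diffusion update (each status matrix propagated through the other one's local matrix) is assumed from the cited approach, and the neighbour set of a row is taken to be its k largest entries, which may include the entry itself.

```python
import numpy as np

def row_normalize(M):
    """Scale each row so that it sums to one."""
    return M / M.sum(axis=1, keepdims=True)

def symmetric_status(K):
    """Row-normalize K and average the result with its transpose."""
    N = row_normalize(K)
    return (N + N.T) / 2

def local_similarity(P, k=4):
    """Keep the k largest entries of each row, renormalized to sum to one."""
    L = np.zeros_like(P)
    for i in range(P.shape[0]):
        nbrs = np.argsort(P[i])[::-1][:k]  # assumed neighbour definition
        L[i, nbrs] = P[i, nbrs] / P[i, nbrs].sum()
    return L

def kernel_fusion(S, K, eps=0.1, k=4, iterations=2):
    """Fuse a similarity matrix S with a kernel matrix K of the same size."""
    I = np.eye(S.shape[0])
    P1 = symmetric_status(S + eps * I)
    P2 = symmetric_status(K)
    L1, L2 = local_similarity(P1, k), local_similarity(P2, k)
    for _ in range(iterations):
        # simultaneous update; an identity matrix is added after each step
        P1, P2 = L1 @ P2 @ L1.T + I, L2 @ P1 @ L2.T + I
    fused = row_normalize((P1 + P2) / 2)
    return (fused + fused.T + I) / 2
```

Calling `kernel_fusion(S_M, K_M)` would give the fused matrix in the miRNA space and `kernel_fusion(S_D, K_D)` the one in the disease space; the final symmetrization guarantees a symmetric output.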

| Regularized Least Squares Classifiers in the MiRNA and Disease Spaces
After kernel fusion, we further used regularized least squares (RLS) 79 to construct the two classifiers in the miRNA and disease spaces, respectively. In the miRNA space, the RLS classifier was obtained by defining a cost function to minimize.
where ‖·‖_F was the Frobenius norm and η_M was the trade-off parameter. Fortunately, this optimization problem had a closed-form solution:

$$F_M^* = K^M_{kf} \left( K^M_{kf} + \eta_M I_M \right)^{-1} A^T$$

where A was the adjacency matrix of known miRNA-disease associations, with rows indexing diseases and columns indexing miRNAs, and I_M was the identity matrix with the same size as matrix K^M_kf. F*_M was the final RLS classifier in the miRNA space. Similarly, we could acquire the classifier F*_D in the disease space as follows

$$F_D^* = K^D_{kf} \left( K^D_{kf} + \eta_D I_D \right)^{-1} A$$

where I_D was the identity matrix with the same size as matrix K^D_kf. Here, we set the two trade-off parameters η_M and η_D to 0.3 according to previous work. 79
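As a rough illustration of the closed-form solution mentioned above, the sketch below uses the standard kernel regularized least squares form, K (K + ηI)^{-1} Y. That this is exactly the form used by KFRLSMDA is an assumption, as are the function name and the layout of the label matrix `Y` (one row per entity of the kernel).

```python
import numpy as np

def rls_classifier(K, Y, eta=0.3):
    """Kernel RLS in closed form: K (K + eta * I)^(-1) Y.

    K   : (n, n) kernel (fusion) matrix.
    Y   : (n, m) label matrix whose rows match the rows of K.
    eta : trade-off parameter (0.3 in the text).
    """
    n = K.shape[0]
    # solve (K + eta * I) X = Y instead of forming the inverse explicitly
    return K @ np.linalg.solve(K + eta * np.eye(n), Y)
```

Under these assumptions, the miRNA-space classifier would be obtained from the miRNA fusion matrix with the association matrix arranged as miRNAs by diseases, and the disease-space classifier from the disease fusion matrix with the matrix arranged as diseases by miRNAs. Solving the linear system is generally more stable than computing the inverse directly.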

| Ensemble of two classifiers
As the last step, F*_M and F*_D were combined in a simple weighted average operation:

$$F^* = \omega \left( F_M^* \right)^T + (1 - \omega) F_D^*$$

F* was the output of the trained model and could be used to make miRNA-disease association predictions. The entry in row i and column j of F* was denoted by F*(i,j), which represented the association score for miRNA j and disease i. The higher the score was, the more likely this miRNA-disease pair was to be associated. The value of ω could be optimized from 0 to 1 using the grid search method.
Here, we set ω = 0.1, which could be regarded as the starting point.
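The ensemble step and the subsequent ranking of candidate miRNAs can be sketched as follows. The sketch makes assumptions that should be read with care: the weight is applied to the miRNA-space classifier (which classifier receives the weight of 0.1 is not recoverable from this section), the miRNA-space scores are stored as miRNAs by diseases, and the function names are hypothetical.

```python
import numpy as np

def ensemble(F_M, F_D, weight=0.1):
    """Weighted average of the two classifiers.

    F_M : (n_mirnas, n_diseases) scores from the miRNA space.
    F_D : (n_diseases, n_mirnas) scores from the disease space.
    The weight is assumed to multiply the miRNA-space scores.
    """
    return weight * F_M.T + (1 - weight) * F_D

def top_candidates(F, A, disease, k=50):
    """Indices of the k best-scoring miRNAs with no known link to `disease`."""
    is_candidate = (A[disease] == 0)
    scores = np.where(is_candidate, F[disease], -np.inf)  # mask known links
    order = np.argsort(scores)[::-1]
    return order[:min(k, int(is_candidate.sum()))]
```

A grid search over the weight would simply repeat the cross-validation for values between 0 and 1 and keep the value with the highest AUC.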