Using ensemble learning and genetic algorithm on magnetic resonance imaging radiomics to classify molecular subtypes of breast cancer

Breast cancer (BRCA) is the malignant tumor with the highest incidence in women and their second most common oncologic cause of death. BRCA can be classified into different molecular subtypes, such as basal-like, represented by triple-negative BRCA (estrogen receptor [ER] negative, progesterone receptor [PR] negative, and human epidermal growth factor receptor 2 [HER-2] negative). This study aims to determine whether radiomics features extracted from magnetic resonance imaging (MRI) can be used to distinguish between BRCA molecular subtypes. We retrospectively collected a dataset of 922 BRCA patients with MRIs and experimental genomic profiles. A genetic algorithm was employed to select the optimal MRI features for each subproblem. Subsequently, stacking ensemble learning was used to learn from these features and generate the predictions. Our model achieved areas under the curve (AUCs) of 0.700, 0.732, and 0.642 in predicting ER, PR, and HER-2 status, respectively. For the multiclass classification of Luminal A, Luminal B, HER2, and TNBC, the AUCs reached 0.672, 0.624, 0.639, and 0.669, respectively. Our model is superior in most subtypes compared to the state-of-the-art predictors on the same dataset. In conclusion, the combination of a genetic algorithm and ensemble learning can be suitable for BRCA subtype classification with high performance.


| INTRODUCTION
Breast cancer (BRCA) is the malignant tumor with the highest incidence in women and their second most common oncologic cause of death. 1 BRCA can be stratified into different subtypes based on hormone receptor status. 2 Genomic and profiling studies have identified several distinct BRCA subtypes: basal-like, represented by triple-negative BRCA (estrogen receptor [ER] negative, progesterone receptor [PR] negative, and human epidermal growth factor receptor 2 [HER-2] negative); luminal A, characterized by hormone receptor-positive tumors with low proliferative activity; luminal B, denoted by ER-positive tumors with high proliferative activity; and HER2-positive, represented by tumors with high ERBB2 gene expression. [3][4][5] These subtypes differ significantly in prognosis as well as in the spectrum of therapeutic targets they exhibit. 6 Treatment of patients with triple-negative BRCA (TNBC), a subtype that lacks ER, PR, and HER2 expression, has proven difficult due to the disease's complexity and the lack of well-defined molecular targets. 7 TNBC is highly invasive, has high metastatic potential, is prone to relapse, and tends to be detected when it is already large and high grade.
Breast magnetic resonance imaging (MRI) is the best imaging tool for assessing and predicting the pathologic response of BRCA. 8 "Radiomics" refers to the high-throughput extraction and analysis of multiple quantitative imaging features from medical images acquired using computed tomography (CT), positron emission tomography (PET), or MRI. 9,10 These data are extracted from standard images, which leads to a large potential subject pool. Radiomics data are in a mineable form that can be used to create descriptive and predictive models relating image characteristics to phenotypes or gene-protein signatures.
The central concept of radiomics is that these models, which may include biological or medical data, can provide effective therapeutic, prognostic, or predictive information. Radiomics has been applied successfully to predict different outcomes in BRCA patients, such as pathologic complete response (pCR) to neoadjuvant chemotherapy, 11,12 axillary lymph node metastasis, and disease-free survival. 13 This study aims to determine whether radiomics features extracted from reconstructed MRI images can be used to distinguish between BRCA molecular subtypes. By using several machine learning (ML)-based models and optimal features obtained from previous reports, 15 we hypothesize that an ensemble model could perform better than individual algorithms and present promising predictive values for BRCA radiogenomics.

| Data collection
We collected MRI and other data for 922 patients with invasive BRCA who had preoperative MRI available and no prior breast surgery, as described in a previous study. 15 Ethical review and approval were waived for this study because all data derive from a public database.
All axial breast MRI images were acquired in the prone position on 1.5 T or 3 T scanners. The MRI sequences are shared in DICOM format as follows: a nonfat-saturated T1-weighted sequence, a fat-saturated gradient-echo T1-weighted precontrast sequence, and, in most cases, three to four postcontrast sequences.

| Feature extraction
Our feature set included multiple modalities as follows:
• Demographic, pathology, clinical, treatment, outcomes, and genomic data: It is believed that systematically evaluating the effect of scanner parameters on the extraction of features for radiomics and radiogenomics studies is highly important.
• Five hundred twenty-nine imaging features were already extracted from the typical characteristics such as size, shape, texture of the pri-  20 This set is the main one to assess the potential of MRI radiomics in BRCA subtype classification.
• The lesion locations were all labeled as annotation boxes provided by radiologists in the images.

| Genetic algorithm for feature selection
Feature selection is essential for eliminating noisy variables and retaining features with a high degree of separability between two classes, leading to more precise predictions. Because we have 529 radiomics features, we need to reduce their dimensionality. In this study, we used a genetic algorithm (GA) to select the best features for our models. Inspired by the theory of evolution, the GA is a metaheuristic technique that mimics the evolutionary process and can provide efficient solutions when the hypothesis space of a problem is too large to be evaluated exhaustively. It has proven efficient in reducing high-dimensional features in different types of data. 21,22 This study used the sklearn-genetic package (https://pypi.org/project/sklearn-genetic/) to implement the genetic approach. We systematically tuned the GA over different parameters, including the population size, number of generations, crossover probability, and mutation probability. The search ranges and optimal parameters are shown in Table 1. eXtreme Gradient Boosting (XGBoost) was selected as the classifier that the GA used to separate useful features from detrimental ones.
Table S1. Moreover, among the powerful ML algorithms for each problem, we then assessed the ability of ensemble models to reach optimal performance. We stacked individual models one by one to find the optimal combination for our final ensemble model. The second level of the stacking ensemble model used another LR algorithm. All ML and ensemble learning models were implemented using the sklearn package in Python.
Table 2 shows the patient characteristics of our study cohort. According to these statistics, for most profiles the positive and negative groups differed significantly in age (p < .05). The table also shows the differences in the other characteristics (i.e., menopause, race, and stage). The positive and negative groups of the three genomic profiles did not differ much in menopausal status.

| GA-based features for radiogenomics BRCA subtypes
We used GA 21 to find the optimal set of radiomics features for each problem (BRCA subtype). Because we addressed four different classification problems (ER, PR, HER2, and molecular subtype [MS]), we searched for the optimal feature set for each problem in turn. Our models achieved the best performance using 18, 14, 14, and 13 features for ER, PR, HER2, and MS classification, respectively. The detailed results are shown in Table S2. We combined all optimal features into a heatmap analysis to show the distribution of features among classes (Figure 2). As shown in Figure 2, our radiomics features could separate BRCA patients into subgroups according to their molecular subtypes. These feature sets hold the potential to reach optimal performance and to serve as radiomics signatures for specific problems.
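The selection loop described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the sklearn-genetic implementation used in the study: the toy data, the nearest-centroid fitness (a stand-in for the XGBoost classifier), the per-feature penalty, the single-point crossover, and all function names are hypothetical. The parameter names only echo the four hyperparameters tuned in the text.

```python
import random


def make_toy_data(n_samples=120, n_features=20, n_informative=4, seed=0):
    """Hypothetical data: only the first n_informative columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += 1.5 * label
        X.append(row)
        y.append(label)
    return X, y


def centroid_accuracy(X, y, mask):
    """Hold-out accuracy of a nearest-centroid classifier on the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    half = len(X) // 2
    centroids = {}
    for c in (0, 1):
        rows = [X[i] for i in range(half) if y[i] == c]
        centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for i in range(half, len(X)):
        dist = {c: sum((X[i][j] - m) ** 2 for j, m in zip(cols, centroids[c]))
                for c in (0, 1)}
        correct += min(dist, key=dist.get) == y[i]
    return correct / (len(X) - half)


def select_features(X, y, n_population=30, n_generations=20,
                    crossover_proba=0.5, mutation_proba=0.2,
                    penalty=0.005, seed=1):
    """Evolve binary feature masks; returns (best_mask, best_fitness_history)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    cache = {}

    def fitness(mask):
        # Accuracy minus a small penalty per selected feature (an assumption).
        key = tuple(mask)
        if key not in cache:
            cache[key] = centroid_accuracy(X, y, mask) - penalty * sum(mask)
        return cache[key]

    def tournament(population):
        # Pick the fittest of three randomly drawn individuals.
        return max(rng.sample(population, 3), key=fitness)

    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(n_population)]
    best = max(population, key=fitness)
    history = [fitness(best)]
    for _ in range(n_generations):
        children = [best[:]]  # elitism: the current best always survives
        while len(children) < n_population:
            child, mate = tournament(population)[:], tournament(population)
            if rng.random() < crossover_proba:
                cut = rng.randrange(1, n_features)  # single-point crossover
                child = child[:cut] + mate[cut:]
            if rng.random() < mutation_proba:
                k = rng.randrange(n_features)  # flip one random bit
                child[k] = 1 - child[k]
            children.append(child)
        population = children
        best = max(population, key=fitness)
        history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, history = select_features(X, y)
    print("selected columns:", [j for j, bit in enumerate(mask) if bit])
    print("best fitness per generation:", [round(h, 3) for h in history])
```

Because the best mask is carried over unchanged into each new generation, the best fitness in this sketch can never decrease from one generation to the next. In the study, the fitness would instead come from the XGBoost classifier evaluated on the 529 radiomics features.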
TABLE 1 Search ranges and optimal hyperparameters of the genetic algorithm (GA)

| Improving the predictive performance using ensemble stacking
In the previous step, we evaluated several algorithms to identify good single classifiers for each molecular subtype. Although the individual MRI-based radiomics ML models performed well, there was room to improve their predictive ability. Recent studies have suggested that an ensemble of the best-performing models may work better than a single technique. [24][25][26] Therefore, an ensemble technique was applied to improve predictability. We ensembled three individual models one by one via the stacking method to find the optimal combination. In detail, we took the prediction probabilities of these three models and fed them into another LR model, which served as the stacking model. We assessed several combinations to select the best one and to avoid correlations among classifiers. Finally, we found that the best performance was achieved by stacking RF, ET, and LDA. As shown in Figure 4, the stacking ensemble model performed better than the individual algorithms.
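The stacking step, in which base-model probabilities become the inputs of a second-level model, can be sketched as follows. This is an illustrative sketch in plain Python, not the sklearn code used in the study. The three base learners are hypothetical stand-ins (small logistic models on different feature subsets) rather than RF, ET, and LDA; the meta-learner is assumed to be a logistic regression; and the use of out-of-fold probabilities, the toy data, and all function names are assumptions.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))


def fit_logistic(X, y, lr=0.1, epochs=300):
    """Batch gradient-descent logistic regression; returns a probability function."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for row, target in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) - target
            for j, xi in enumerate(row):
                grad_w[j] += err * xi
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return lambda row: sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b)


def subset_learner(cols):
    """Hypothetical base learner: a logistic model restricted to some columns."""
    def fit(X, y):
        model = fit_logistic([[r[j] for j in cols] for r in X], y)
        return lambda row: model([row[j] for j in cols])
    return fit


def out_of_fold_probs(fit, X, y, n_folds=5):
    """Predict each sample with a model that never saw it during training."""
    probs = [0.0] * len(X)
    for k in range(n_folds):
        train = [i for i in range(len(X)) if i % n_folds != k]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in range(k, len(X), n_folds):
            probs[i] = model(X[i])
    return probs


def fit_stacking(base_fits, X, y):
    """Level 1: base probabilities. Level 2: logistic meta-learner on top of them."""
    columns = [out_of_fold_probs(fit, X, y) for fit in base_fits]
    meta_X = [list(row) for row in zip(*columns)]
    meta = fit_logistic(meta_X, y)
    bases = [fit(X, y) for fit in base_fits]  # refit base learners on all data
    return lambda row: meta([base(row) for base in bases])


def make_toy_data(n_samples=100, n_features=6, n_informative=4, seed=0):
    """Hypothetical data: the first n_informative columns are shifted for class 1."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        X.append([rng.gauss(1.5 * label if j < n_informative else 0.0, 1.0)
                  for j in range(n_features)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    bases = [subset_learner([0, 1]), subset_learner([1, 2]), subset_learner([2, 3])]
    stacked = fit_stacking(bases, X, y)
    print("stacked probability for first sample:", round(stacked(X[0]), 3))
```

Training the meta-learner on out-of-fold probabilities keeps it from simply rewarding whichever base model overfits the training data most. With scikit-learn, the same pattern is available as `StackingClassifier`; whether the authors used that class is not stated in the text.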

| Comparison to previously published works
Recently, a variety of studies have addressed this BRCA radiogenomics problem using different datasets. 15

| DISCUSSION
In this study, we used a large public cohort of 922 patients with 529 breast DCE-MRI extracted radiomics features to assess the predictive performance of various ML and ensemble learning models. We proposed that the stacking ensemble generated by combining the best prediction from different well-performing models could improve the prediction accuracy of receptor status and surrogate molecular subtypes of invasive BRCAs.
As shown in Figure 4, the stacking ensemble approach outperformed the other three base learner models (RF, ET, and LDA) in pre-
A predictor based on a single model may fail to reflect true performance if the data labels are biased or the selected model is overfitted to the data. Ensemble approaches have therefore emerged to address these issues and improve predictive outcomes. 30 The best-performing meta-learner model was found when identical models were employed in the base learner and meta-learner. However, the stacking method is not
HER2 protein and other novel anti-HER2 medicines. It is also a prognostic factor for both node-negative and node-positive patients, which plays a vital role in treatment decision-making. 29 For clinical applications, our model could be integrated into a clinical decision support system (CDSS) to assist physicians in their decision-making processes related to BRCA genomics. A noninvasive MRI radiomics model is necessary for early screening ahead of further diagnosis and treatment.
Our study has some limitations. First, the data analyzed did not come from our own resources. Therefore, some characteristics related to the MRI radiomics and genomics features could not be optimized to train the base learners and generate ideal data for the meta-learner. In this study, we combined the three most accurate base learners by voting without evaluating how uncorrelated the individual models were. Furthermore, the stacking ensemble approach is not always superior to single ML methods, for reasons such as the complexity of the problem, the adequacy of the training data, and whether the base learners' predictions are sufficiently uncorrelated, as mentioned above. In addition, this study relied on surrogate molecular subtypes based on immunohistochemistry (IHC) and derived from ER, PR, and HER2 status, which are not as accurate as formal genetic analysis in predicting outcomes.
Consequently, the modest relationship between MRI phenotypes, genomics, and BRCA subgroups indicates the potential of such findings as part of a composite marker for stratifying invasive BRCAs, which may incorporate additional clinical factors and imaging features from other modalities.

| CONCLUSION
Overall, the findings of our study might help to decide which ML models to use in future stacking ensemble research for early diagnosis and correct classification of various invasive BRCA types, as well as finding the best treatments that could improve patients' survival rates.
Our proposed model outperformed earlier models in classifying the receptor and hormonal status of invasive BRCA, which could allow for more individualized care with improved clinical outcomes.