MRI radiomics model for predicting TERT promoter mutation status in glioblastoma

Abstract Background and purpose The presence of TERT promoter mutations has been associated with worse prognosis and resistance to therapy for patients with glioblastoma (GBM). This study aimed to determine whether the combination model of different feature selections and classification algorithms based on multiparameter MRI can be used to predict TERT subtype in GBM patients. Methods A total of 143 patients were included in our retrospective study, and 2553 features were obtained. The datasets were randomly divided into training and test sets in a ratio of 7:3. The synthetic minority oversampling technique was used to achieve data balance. The Pearson correlation coefficients were used for dimension reduction. Three feature selections and five classification algorithms were used to model the selected features. Finally, 10‐fold cross validation was applied to the training dataset. Results A model with eight features generated by recursive feature elimination (RFE) and linear discriminant analysis (LDA) showed the greatest diagnostic performance (area under the curve values for the training, validation, and testing sets: 0.983, 0.964, and 0.926, respectively), followed by relief and random forest (RF), analysis of variance and RF. Furthermore, the relief was the optimal feature selection for separately evaluating those five classification algorithms, and RF was the most preferable algorithm for separately assessing the three feature selectors. ADC entropy was the parameter that made the greatest contribution to the discrimination of TERT mutations. Conclusions Radiomics model generated by RFE and LDA mainly based on ADC entropy showed good performance in predicting TERT promoter mutations in GBM.


INTRODUCTION
Glioblastoma (GBM) is the most common and aggressive primary brain tumor in adults (Chougule et al., 2022; Gonçalves et al., 2020), with high recurrence and mortality rates despite standard therapies (Campos et al., 2016; Parvaze et al., 2023). Molecular classification of GBM has been proposed to identify subtypes with distinct clinical, genetic, and epigenetic features for risk stratification (Gritsch et al., 2022; Yang et al., 2022). TERT promoter mutations, one of the most important molecular biomarkers, are present in up to 73.6% of GBM cases (Kanas et al., 2017). The presence of TERT promoter mutations is associated with worse prognosis and resistance to therapy, making it an important biomarker for personalized treatment strategies (Śledzińska et al., 2021).
TERT is an enzyme that is essential for the maintenance of telomeres, the protective caps on the ends of chromosomes (Arita, Narita et al., 2013; Stichel et al., 2018). Telomeres shorten with each cell division, eventually leading to cellular senescence. TERT adds DNA to the ends of telomeres, preventing them from shortening and allowing the cell to continue dividing (Amen et al., 2021). Usually, TERT is highly expressed in stem cells and cancer cells. TERT mutation is a hallmark of cancer and is often used as a diagnostic and prognostic marker. In GBM, TERT promoter mutation has been shown to be an independent prognostic factor for poor overall survival (OS) and progression-free survival (PFS), even after adjusting for age, grade, and other molecular markers. Previous research has demonstrated that suppressing TERT expression increases the sensitivity of cells to radiation- and chemotherapy-induced DNA damage, making it a target for novel therapeutic approaches (Nakamura et al., 2005; Rohwer et al., 2013).
The ability to identify TERT mutations is essential for the risk stratification of GBM patients in clinical settings. Molecular diagnostic procedures like polymerase chain reaction or next-generation sequencing are frequently used to detect TERT promoter mutations (Fujioka et al., 2021; Jovčevska, 2018). However, the acquisition of pathology specimens is challenging in clinical settings where surgery is not possible or cannot be tolerated by the patient. Furthermore, there is always the possibility of false-positive or false-negative results.
To overcome these limitations, it is important to integrate multiple diagnostic technologies to increase the diagnostic efficiency and accuracy.
Radiomics has recently been increasingly used as a supplementary tool for tumor diagnosis and the monitoring of therapeutic response (Li, Li et al., 2022; Li, Liu et al., 2022). Radiomics involves the extraction of quantitative features from medical images, which can then be used to identify patterns and relationships within the data (Jiang et al., 2017). This approach can help identify subtle changes in tumor characteristics that may be missed by traditional imaging techniques.
However, some traditional radiomics approaches tend to use a single algorithm to build models, which may not be able to adequately represent the complexity and heterogeneity of the imaging data. Therefore, further research is required to apply a variety of algorithms along with feature selections in order to increase the accuracy of the models.
To the best of our knowledge, two studies have been conducted on the radiomics analysis of TERT classification in GBM (Gerardi et al., 2023; Zhang et al., 2023); however, these studies did not address the selection of optimal algorithms and feature selectors for predicting TERT classification. Despite the valuable insights gained from these two studies, several limitations remain. Neither study examined the potential impact of combining different feature selections and algorithms on the accuracy of TERT classification. Integrating multiple sources of information could potentially lead to more precise and reliable predictions, but this hypothesis has not been thoroughly tested. We therefore investigated a radiomics model for distinguishing TERT promoter mutations based on preoperative multiparameter MRI by applying three feature selections and five classification algorithms.
(4) patients were not on steroids at the time of imaging. The patient selection flow chart is shown in Figure 1.

MRI protocol
Imaging data included axial T2-weighted, DWI, and ADC sequences obtained on a 1.5T (GE, Octane; Siemens) or 3.0T MRI system (Philips, Achieva; GE, Premier). The MRI parameters are provided in Table 1.

Pathological assessment
Pathological diagnosis was made according to the 2021 (fifth edition) World Health Organization classification of central nervous system tumors. IDH and TERT mutation statuses were obtained by next-generation sequencing, as described elsewhere (Arita, Narita, Takami et al., 2013; Hasanau et al., 2022).
The datasets were randomly divided into training and test sets in a ratio of 7:3. To correct the class imbalance of the training dataset, we used the synthetic minority oversampling technique. On the feature matrix, we computed the mean value and the standard deviation for each feature vector. Each feature vector was normalized by subtracting its mean value and dividing by its standard deviation, so that each vector had a zero center and a unit standard deviation after normalization. Because of the large dimension of the feature space, we compared the similarity of each feature pair. One feature of a pair was eliminated if the Pearson correlation coefficient of the pair was more than 0.99. This process reduced the dimension of the feature space and removed highly redundant features.
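The normalization and correlation-based reduction described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function names are hypothetical, the population standard deviation is assumed, the absolute value of the Pearson coefficient is compared against 0.99, and the first feature of each correlated pair is the one retained. The text does not specify any of these choices.

```python
import math
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each feature column to zero mean and unit standard deviation."""
    scaled = []
    for col in zip(*rows):
        mu, sigma = mean(col), pstdev(col)  # population SD is an assumption
        # Constant columns carry no information; map them to zeros.
        scaled.append([(v - mu) / sigma if sigma > 0 else 0.0 for v in col])
    return [list(r) for r in zip(*scaled)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def drop_correlated(rows, threshold=0.99):
    """Return the indices of columns kept after removing one of each correlated pair."""
    columns = list(zip(*rows))
    kept = []
    for j, col in enumerate(columns):
        # Keep a column only if it is not a near-duplicate of a column already kept.
        if all(abs(pearson(col, columns[k])) <= threshold for k in kept):
            kept.append(j)
    return kept


# Toy feature matrix (rows = patients, columns = features); the values are made up.
toy = [[1, 2, 5], [2, 4, 1], [3, 6, 4], [4, 8, 2]]
kept_columns = drop_correlated(zscore_columns(toy))  # column 1 duplicates column 0
```

Because the Pearson coefficient is unchanged by standardization, the two steps can be applied in either order in this sketch.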
Analysis of variance (ANOVA), recursive feature elimination (RFE), and relief were the three feature selections employed before building the model to select the features that showed the strongest correlation to the label (TERT mutations). To assess the association between the features and the label, the F-value was determined. We ranked the features by their F-values and built the model with a specified number of the top-ranked features. The selected features were then modeled using support vector machine (SVM), autoencoder (AE), linear discriminant analysis (LDA), random forest (RF), and logistic regression via the least absolute shrinkage and selection operator (LR-Lasso) classification methods. The performance of the model was evaluated using receiver-operating characteristic (ROC) curve analysis, and the area under the curve (AUC) values were calculated for quantification. At the cutoff value associated with the maximum Youden index, the accuracy, sensitivity, specificity, positive predictive value, and negative predictive value were also determined. We also estimated the 95% confidence interval by bootstrapping with 1000 samples.
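The evaluation metrics named above can be sketched as follows. This is an illustrative example rather than the software used in the study: the function names, the toy labels and scores, and the percentile form of the bootstrap interval are all assumptions, since the text does not state which bootstrap variant was applied.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(labels, scores):
    """Return the cutoff that maximizes J = sensitivity + specificity - 1."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        # A case is called positive when its score is at or above the cutoff.
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cut)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cut)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (assumed variant)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes for the AUC to exist
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    return stats[round(alpha / 2 * n_boot)], stats[round((1 - alpha / 2) * n_boot) - 1]


# Made-up example: 1 = TERT-mutant, 0 = wild-type; scores are model outputs.
toy_labels = [0, 0, 0, 1, 1, 0, 1, 1]
toy_scores = [0.1, 0.2, 0.3, 0.45, 0.6, 0.65, 0.7, 0.9]
toy_auc = auc(toy_labels, toy_scores)               # 14 of 16 pairs ranked correctly
toy_cutoff = youden_cutoff(toy_labels, toy_scores)  # cutoff with the largest Youden index
```

Accuracy, sensitivity, specificity, and the predictive values then follow from the confusion matrix at the returned cutoff.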

TABLE 2 The classification and parameter composition of radiomics features.

Basic clinical information of participants
The basic clinical characteristics of the GBM patients are shown in Table 3. Sixty-six cases (46.15%) were diagnosed as TERT mutation-positive and 77 (53.85%) as TERT mutation-negative. There were significant differences in age among the TERT subgroups. However, the sex distribution did not show a significant difference in the TERT mutation subgroups (p = .03).

Radiomics results
We compared all models using FeAture Explorer (FAE) on the validation set. The results of ROC curve analysis for the three feature selections and five classification algorithms with 10-fold cross validation are shown in Table 4.
The pipeline based on RFE and LDA generated the highest AUC with nine features. When the "one standard error" rule was applied (Wang et al., 2022), FAE produced a simpler model with eight features. The AUC values of the cross-validation set, training set, and test set were 0.964, 0.983, and 0.926, respectively. The eight features selected from this pipeline are listed in ascending order, as shown in Figure 3. The feature that contributed most to the whole pipeline was the original first-order entropy feature from ADC.
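The "one standard error" rule mentioned above can be illustrated with a short sketch. It shows the general form of the rule, not FAE's internal implementation: the function name, the use of per-fold validation AUCs, and the numbers are all hypothetical.

```python
import math
from statistics import mean, stdev


def one_standard_error_choice(cv_auc_by_k):
    """Pick the smallest feature count whose mean AUC is within one SE of the best.

    cv_auc_by_k maps a candidate number of features to its per-fold validation AUCs.
    Returns (chosen_k, best_k).
    """
    summary = {
        k: (mean(folds), stdev(folds) / math.sqrt(len(folds)))
        for k, folds in cv_auc_by_k.items()
    }
    best_k = max(summary, key=lambda k: summary[k][0])
    best_mean, best_se = summary[best_k]
    threshold = best_mean - best_se
    # Prefer the simplest model that is statistically close to the best one.
    chosen_k = min(k for k, (m, _) in summary.items() if m >= threshold)
    return chosen_k, best_k


# Made-up per-fold AUCs for three candidate model sizes.
toy_cv = {
    7: [0.90, 0.92, 0.94],
    8: [0.95, 0.96, 0.97],
    9: [0.95, 0.97, 0.99],
}
chosen, best = one_standard_error_choice(toy_cv)
```

In this toy input the nine-feature model has the highest mean AUC, but the eight-feature model falls within one standard error of it and is therefore preferred as the simpler model.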
The pipeline combining relief and RF produced the highest AUC with 14 features, and the "one standard error" rule was used to select a simpler model with six features. The AUC values of the cross-validation set, training set, and test set were 0.961, 1.0, and 0.913, respectively. The six features selected from this pipeline are listed in ascending order, as shown in Figure 4. The feature that contributed most to the whole pipeline was the original first-order interquartile range feature from ADC. When the five classification algorithms were used to assess the ANOVA, RFE, and relief feature selectors, RF was the most stable, with AUC values higher than 0.913. The Delong test revealed no significant difference among the combinations of RF with the three feature selectors in distinguishing TERT subtypes in GBM patients.
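The two first-order features highlighted in these results, entropy and interquartile range, can be computed as in the sketch below. It follows the usual textbook definitions with a fixed bin width of 25 for the entropy histogram; it is not guaranteed to match the PyRadiomics implementation in every detail, and the intensity values are invented.

```python
import math
from collections import Counter
from statistics import quantiles


def first_order_entropy(values, bin_width=25):
    """Shannon entropy (in bits) of the intensity histogram after fixed-width binning."""
    bins = Counter(math.floor(v / bin_width) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in bins.values())


def interquartile_range(values):
    """Difference between the 75th and 25th percentiles (linear interpolation)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Invented voxel intensities inside a volume of interest.
toy_voxels = [0, 10, 30, 60]
toy_entropy = first_order_entropy(toy_voxels)       # bins hold 2, 1, and 1 voxels
toy_iqr = interquartile_range([1, 2, 3, 4, 5])
```

A homogeneous region concentrates its voxels in few bins and yields low entropy, whereas a heterogeneous region spreads them across many bins and yields high entropy.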

DISCUSSION
This study explored the potential value of radiomic models based on T2WI, DWI, and ADC maps using different feature selection (ANOVA, RFE, and relief) and classification algorithms (SVM, LDA, AE, RF, and LR-Lasso) to predict the presence of TERT promoter mutations in patients with GBM. In our cohort of 143 GBM patients, 46.15% of tumors carried a TERT mutation. Age differed significantly among the TERT subgroups; however, no statistically significant variation was observed in the sex distribution across the TERT mutation subgroups.
Some previous studies focused on a single radiomic approach, and the diagnostic efficacy of these models was far from satisfactory. In a retrospective study of 112 patients with newly diagnosed GBM by Yamashita et al. (2019), the AUC, accuracy, sensitivity, and specificity of the radiomics model using SVM for predicting TERT promoter mutation were 0.776, 0.857, 0.548, and 0.741, respectively. In another study of 105 gliomas by Peng et al. (2021), the AUC and accuracy of classifying IDH status by the Lasso method were 0.770 and 0.823, respectively. In general, the performance of a model increases with the size of the dataset (Rogers et al., 2022). However, even in a large cohort, the classification results obtained with a single algorithm are still unsatisfactory.
Yan et al. investigated 357 glioma patients by using Bayesian neural networks and integrated radiomics features from Gd-T1 and ADC; the AUC, accuracy, sensitivity, and specificity for predicting TERT mutations were 0.598, 0.685, 0.976, and 0.290, respectively (Rogers et al., 2022). It is therefore very important to use appropriate algorithms to build an effective model. Our hypothesis is further supported by comparisons of model performance across different feature selectors and classification algorithms (Chen et al., 2022; Zhang et al., 2023). A previous study sought to distinguish between 36 low-grade gliomas and 42 GBMs using four machine learning classifiers (SVM; K-nearest neighbors, K-NN; LDA; and adaptive boosting using decision stumps as the base learner, AdaBoost); the results showed that the AdaBoost classifier generated higher predictive accuracy than the other algorithms (AUC, sensitivity, specificity, and accuracy: 0.96, 91%, 86%, and 89%, respectively) (Malik et al., 2021) (Calabrese et al., 2022; Tian et al., 2020; Yan et al., 2021). In this study, we aimed to compare the effectiveness of ADC, DWI, and T2WI in differentiating TERT mutation status in GBM patients using radiomics analysis. Our study indicated that ADC was the most significant parameter for the predictive model.
This result is consistent with previous studies (Gihr et al., 2020; Park et al., 2021). Furthermore, among the radiomics features analyzed in this study, ADC entropy made the greatest contribution to the discrimination between TERT-mutant and wild-type GBMs. Specifically, the entropy feature was able to capture the heterogeneity of the tumor microenvironment, which is a crucial factor in the development and progression of GBM and showed a strong correlation with TERT mutations (Gihr et al., 2022; Kim et al., 2020; Wang et al., 2023). Entropy is a measure of the randomness or disorder in the pixel intensity distribution of an image (Just, 2014). Theoretically, the greater the entropy, the greater the degree of dispersion within the tumor. In addition, entropy values were also significantly higher in

were resampled to a voxel size of 1 mm × 1 mm × 1 mm, and the gray level was discretized with a bin width of 25. These steps helped reduce the variability caused by differences in scanning parameters and equipment. The volume of interest was semiautomatically plotted on T2WI along the tumor margin slice by slice and automatically registered to DWI and ADC images. Tumor segmentation was manually performed by two neuroradiologists (T.L and L.Z.H) with 10 years of experience in neuroradiology. An intraclass correlation coefficient between 0.75 and 1 was considered indicative of good agreement. Any disagreement between the two neuroradiologists was resolved by consensus. The radiomics process is shown in Figure 2.
FIGURE 2 Radiomics processing flow.
Feature extraction
For each patient, a total of 2553 features were extracted. These features can be broadly categorized into four groups, including shape features (n = 14), histogram features (n = 18), textural features (gray-level co-occurrence matrix [n = 24], gray-level dependence matrix [n = 14], gray-level run length matrix [n = 16], gray-level size zone matrix [n = 16], neighborhood gray-tone difference matrix [n = 5]), and wavelet transform (n = 744). The texture features of the above segmentation images were extracted and quantified with the PyRadiomics package after wavelet transformation. Each stage of wavelet filtering results in eight decompositions. In three dimensions, all feasible combinations of high-pass or low-pass filters (LLH, LHL, LHH, HLL, HLH, HHL, HHH, LLL) were applied, and three types of texture features were retrieved at each decomposition. Finally, the wavelet transform yielded 744 features, and a total of 851 features were retrieved per sequence. The feature classification and radiomics parameters are shown in Table 2.
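The feature counts reported in this section can be cross-checked with a few lines of arithmetic. The sketch assumes, as an inference from the totals rather than a statement in the text, that the wavelet features comprise the histogram and texture classes (but not the shape class) computed on each of the eight decompositions, and that the same 851 features were extracted from each of the three sequences.

```python
# Per-class feature counts as reported in the text.
SHAPE = 14
HISTOGRAM = 18
TEXTURE = {"GLCM": 24, "GLDM": 14, "GLRLM": 16, "GLSZM": 16, "NGTDM": 5}

WAVELET_DECOMPOSITIONS = 8  # LLH, LHL, LHH, HLL, HLH, HHL, HHH, LLL
N_SEQUENCES = 3             # T2WI, DWI, and ADC

texture_total = sum(TEXTURE.values())                # 75
original_image = SHAPE + HISTOGRAM + texture_total   # 107
# Assumption: shape features are not recomputed on wavelet-filtered images.
wavelet = WAVELET_DECOMPOSITIONS * (HISTOGRAM + texture_total)  # 8 * 93 = 744
per_sequence = original_image + wavelet              # 851
total = N_SEQUENCES * per_sequence                   # 2553
```

Under these assumptions the arithmetic reproduces the reported 744 wavelet features, 851 features per sequence, and 2553 features per patient.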

Finally, we used 10-fold cross validation on the training dataset to establish the model's hyperparameters. The hyperparameters were chosen based on the performance of the model on the validation dataset.
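As an illustration of the 10-fold cross validation described above, the sketch below generates the train/validation index splits. The function name, the fixed random seed, and the sample count of 100 are hypothetical; the study does not report how folds were assigned or whether they were stratified by TERT status.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation


# Hypothetical training set of 100 cases: every case is validated exactly once.
splits = list(k_fold_indices(100, k=10))
```

For each candidate hyperparameter setting, a model would be fitted on the training indices of every split and scored on the corresponding validation indices, and the setting with the best average validation performance would be retained.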

FIGURE 3 Model performance generated by recursive feature elimination (RFE) and linear discriminant analysis (LDA): (a) receiver-operating characteristic (ROC) curves of different datasets; (b) a simpler model of eight features based on the "one standard error" principle; (c) feature contribution arrangement in the final model generated by RFE and LDA.
The three features selected from the ANOVA and RF pipeline (test-set AUC, 0.946) are listed in ascending order, as shown in Figure 5. The feature that contributed most to the whole pipeline was the original first-order interquartile range feature from ADC. The comparison of different feature selectors and classification algorithms is shown in Figure 6a-h. Relief demonstrated remarkable stability, with AUC values exceeding 0.800 on the validation, training, and test sets, when the classification algorithms (SVM, LDA, AE, RF, and LR-Lasso) were evaluated individually in conjunction with the three feature selectors (ANOVA, RFE, and relief). However, the Delong test revealed no significant differences in distinguishing TERT subtypes in patients with GBM among the combinations of relief with the five classification algorithms. The performance of the five classification algorithms was also evaluated across the ANOVA, RFE, and relief feature selectors. Among them, RF demonstrated the highest stability and achieved AUC values exceeding 0.913. Furthermore, the Delong test indicated no significant differences in distinguishing TERT subtypes in patients with GBM among the combinations of RF with the three feature selectors.
FIGURE 4 Model performance generated by relief and random forest (RF): (a) receiver-operating characteristic (ROC) curves of different datasets; (b) a simpler model of six features based on the "one standard error" principle; (c) feature contribution arrangement in the final model generated by relief and RF.
Furthermore, when the feature selectors were used to separately evaluate the five classification algorithms, relief was considered the best feature selector, with AUC values of more than 0.800 across the validation, training, and test sets. On the other hand, RF was found to be the preferred classifier for separately evaluating the performance of the three feature selectors, with AUC values greater than 0.913. In our research, different combinations of feature selectors and classification algorithms generated different results in predicting the TERT mutation. Among the combinations, the model constructed based on RFE and LDA yielded the best diagnostic performance (AUC, accuracy, sensitivity, and specificity: 0.964, 0.940, 0.891, and 0.982, respectively).

FIGURE 5 Model performance generated by analysis of variance (ANOVA) and random forest (RF): (a) receiver-operating characteristic (ROC) curves of different datasets; (b) a simpler model of three features based on the "one standard error" principle; (c) feature contribution arrangement in the final model generated by ANOVA and RF.
TERT mutations, indicating increased heterogeneity within the tumor environment. The presence of high entropy values in TERT-mutant GBM may be attributed to dysregulated cellular metabolism, which results in differences in the tumor microenvironment between TERT-mutant and wild-type tumors. From this perspective, entropy may serve as a useful marker and provide insights into the complex nature of tumor heterogeneity. Further research is required to validate these findings in larger cohorts and to investigate the clinical implications of using more imaging biomarkers in the diagnosis and treatment of GBM.
Some limitations of our study need to be considered. First, we need to expand the sample size of our dataset in order to draw more robust conclusions. Future studies should incorporate samples from more centers. Second, only conventional sequences were employed in our investigation; subsequent analyses should use more advanced imaging techniques, such as perfusion imaging and amide proton transfer weighted imaging, which are sensitive to tumor heterogeneity. Last, we did not assess OS or PFS in GBM patients. Our future research will entail more in-depth analysis incorporating the clinical characteristics.
In conclusion, our study highlights the value of ADC entropy as a potential noninvasive imaging biomarker for identifying TERT status in patients with GBM. The combination of feature selectors and classification algorithms has an important impact on predicting TERT mutations in GBM. The model obtained by RFE and LDA showed the best predictive value, which is of great significance for the development of more personalized therapeutic strategies in clinical settings.
FIGURE 6 The comparison of different feature selectors and classification algorithms. Panels a-c show the ANOVA, relief, and RFE feature selectors combined with the five algorithms to evaluate model performance. Panels d-h show the AE, LDA, LR-Lasso, RF, and SVM algorithms integrated with the three feature selectors to evaluate the diagnostic efficiency of the model.
Feature parameters (n = 851)
Abbreviations: TERT-mt, telomerase reverse transcriptase mutant-type; TERT-wt, telomerase reverse transcriptase wild-type.
SPSS 27.0 was used for statistical analyses. Age was compared using the independent-samples t test, and sex distribution was compared using the chi-square test. The diagnostic accuracy in the training set, validation set, and test set was evaluated by ROC curve analysis. p values <.05 were considered indicative of statistical significance.

Table 3.
One hundred and forty-three patients (58 females, 85 males; due to a previous history of brain surgery. Sixty-six cases (46.15%)
Results of receiver-operating characteristic (ROC) curve analysis for various feature selections and classification algorithms with 10-fold cross validation.
The pipeline combining ANOVA and RF produced the highest AUC with seven features, and the "one standard error" rule was used to select a simpler model with three features. The AUC values of the cross-validation set, training set, and test set were 0.957, 1.0, and 0.946, respectively.
TABLE 4 Abbreviations: ROC, receiver-operating characteristic; AUC, area under the curve; YI, Youden index; Acc, accuracy; Sen, sensitivity; Spe, specificity; PPV, positive predictive value; NPV, negative predictive value.
However, the majority of previous studies have primarily focused on grades 2-4 gliomas. Limited clinical research has been conducted using multiple feature selection techniques (such as ANOVA, RFE, and relief) and classification algorithms (including SVM, AE, LDA, RF, and LR-Lasso) specifically for TERT subtypes in GBM. Our findings suggest that the model constructed based on the combination of RFE and LDA demonstrated superior diagnostic performance compared to other combinations. Previous studies have examined variations in perfusion metrics, magnetic resonance spectroscopy, T1 pre- and postcontrast, and T2WI-FLAIR in GBMs harboring TERT mutations, yielding promising results