Lung cancer subtyping from gene expression data using general and enhanced Fuzzy min–max neural networks

In this article, we address the problem of lung cancer diagnosis from gene expression data, which is now recognized as an effective means for early treatment and prevention of cancer. Specifically, we employ the Fuzzy min–max (FMM) classifier, a well-known neuro-fuzzy neural network, for the task. The idea is to take advantage of fuzzy class definitions whose boundaries are set using the min–max hyperboxes constructed for each class. We implement two advanced FMM neural network architectures, the general Fuzzy min–max (GFMM) and the enhanced Fuzzy min–max (EFMM), for the classification of lung cancer subtypes from gene expression data. The advantage of GFMM is that it involves very simple operations for hyperbox manipulation and can handle both labeled and unlabeled data. EFMM, on the other hand, introduces three heuristic rules, related to hyperbox expansion, contraction, and the overlap test, which enhance the learning algorithm. We classify gene expression data using these two models, analyze the performance by visualizing the hyperboxes obtained after training, and compare the accuracies of these classifiers with the state of the art. The least absolute shrinkage and selection operator (LASSO) is used for selecting informative genes from the high-dimensional gene expression data. From the empirical results, we observe that GFMM with LASSO gives the best performance of all, with a validation accuracy of 98.04% and a cross-validation accuracy of 94.06%.

samples. The problem is complicated by the presence of noise, overlap between classes, and an imbalanced class distribution in which one type or subtype of cancer has a larger population than the other classes. Slonim et al.1 distinguished between class discovery and class prediction for gene expression data in the Bayesian inferencing framework; using a leukemia dataset, they found that uncorrelated genes gave better results, with a median prediction of 0.86. Gene discovery studies use very few genes for cancer diagnosis.2 The authors of Reference 3 use the expression levels of three pairs of genes for cancer diagnosis. Such clinical experiments allow scrutiny of individual samples and do not require extensive training data.
A large variety of machine learning algorithms have been applied in the literature for the classification of gene expression data. A host of classifiers such as the support vector machine,4 random forests of decision trees,5 logistic regression,6 and the naïve Bayes classifier7 have been successfully used for the classification of gene expression data. All these classifiers work on crisp data without transiting to the fuzzy domain. Feature selection techniques like LASSO have been shown to improve the performance of these classifiers on gene expression data.8 Khan et al.9 used artificial neural networks (ANN) for the categorization of cancer using gene expression profiles; the main advantages found were that the ANN could work with nonlinear features and had high sensitivity. For the classification of gene expression profiles, Ahmed et al.10 used the deep neural network (DNN), an improved DNN, a CNN, and an RNN along with preprocessing techniques; the improved DNN gave the best results of all. Urda et al.11 proposed a DNN with 2 to 4 hidden layers, each containing fewer than 200 neurons. This model was extended in Reference 12 with minority oversampling followed by feature selection, which was found to improve the accuracy of the DNN whose four hidden layers contained 512, 256, 128, and 64 neurons, respectively. Lyu et al.13 performed tumor classification using a convolutional neural network (CNN) to learn from gene expression data, achieving an accuracy of 95.59%, better than other related works. In a recent work, Mohammed et al.14 presented the 1D-CNN classifier as an optimal classifier for gene expression data. On top of that, they proposed training the 1D-CNNs on cancer types and stacking them together in an ensemble formation to distinguish between different types of cancers. The authors proposed that the stacked ensemble could perform better classification of cancer subtypes than the individual models.
Fuzzy logic has been applied in different forms to gene expression data in the past. Vinterbo et al.21 defined fuzzy sets for gene expressions based on the qualitative levels of a gene, such as up, neutral, or down. They showed that fuzzy logic-based classifiers constructed in this manner outperformed machine learning algorithms such as logistic regression. A fuzzy rule-based classification system was proposed a few years later22 for identifying input patterns in gene expression data; the fuzzy rules were made more compact using a genetic algorithm. A distinctive work involving fuzzy logic and gene expressions is Reference 23, in which a neural network is trained on non-fuzzy data, and the weights are later used to compute the fuzziness, delete the outliers, and compute the membership probability at the output of the neural network. Such an approach was observed to work for inadequate data such as gene expressions. A recent notable work is that of Halder and Kumar,24 who proposed a rough-fuzzy classifier to select the most informative samples for active learning of gene expressions. The authors claim that this strategy helps to mitigate the uncertainty and overlap that exist between cancer subtypes. Motivated by these previous works, in this article we further explore the application of fuzzy logic to gene expression data in the form of the neuro-fuzzy classifiers general Fuzzy min-max (GFMM)25 and enhanced Fuzzy min-max (EFMM)26 neural networks, which have never been investigated for application to gene expressions. A brief introduction to both is given below.
The FMM, GFMM, and EFMM neural networks are examples of successful applications of fuzzy set theory to pattern recognition and classification problems.27 In a crisp set, an element or data point in the universe of discourse either belongs to the set (i.e., 1) or does not (i.e., 0). Fuzzy sets are more general: they regard all samples as members of a set and handle data points that partially belong to a set by calculating their membership values with respect to that set. The fuzzy membership value is continuous in [0, 1] and may be different for each data point. Using this concept, Simpson28 introduced the Fuzzy min-max (FMM) neural network in 1992, which has been applied successfully to classification and clustering problems. In FMM, a fuzzy set is represented by a hyperrectangle, also known as a hyperbox. All hyperboxes have min and max points within a unit hypercube, and a data point that lies inside a hyperbox has a membership value equal to 1. The Fuzzy min-max classifier connects each hyperbox to its respective class node (the one with the highest membership value), which can then be used for classification. The FMM learning algorithm is divided into three phases: first, it checks for hyperbox expansion; after a successful expansion, the second phase performs the overlap test; if there is overlap, the final phase, contraction, removes the unwanted overlap between hyperboxes. Many researchers have modified this learning algorithm to make it more efficient and faster.29 The two popular improved versions of the FMM classifier that we apply in this article are GFMM and EFMM.25,26 The GFMM improves the effectiveness of the original Fuzzy min-max algorithm by suggesting a few modifications to the general FMM architecture and functioning, some of which are listed below.
1. In the pattern space, the input patterns can be fuzzy hyperboxes or crisp points.
2. The membership function and the hyperbox expansion constraints are modified.
3. GFMM can be used for both clustering and classification because it can process labeled and unlabeled inputs at the same time.
4. In the original algorithm, the number of hyperboxes created depends on the maximum hyperbox size hyperparameter. The smaller the value, the more hyperboxes are created, which leads to overfitting; a larger value creates fewer hyperboxes, which increases the generalization ability but decreases the ability to capture the boundaries between classes. GFMM implements a trade-off between these two cases.
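As a minimal illustration of the three-phase FMM-family learning process (expansion, overlap test, contraction), the following sketch implements only a simplified expansion phase; the helper names, the per-dimension expansion check, and the default θ are our own illustrative choices, not the authors' code:

```python
import numpy as np

def fit_fmm(X, y, theta=0.3):
    """Simplified online FMM-style training: try to expand a same-class
    hyperbox around each sample, or create a new hyperbox otherwise."""
    boxes = []  # each box is [v (min point), w (max point), class label]
    for x, label in zip(X, y):
        # Phase 1 - expansion: grow the first same-class box that stays within theta.
        for box in boxes:
            v, w, lbl = box
            if lbl == label and np.all(np.maximum(w, x) - np.minimum(v, x) <= theta):
                box[0], box[1] = np.minimum(v, x), np.maximum(w, x)
                break
        else:
            # No box could absorb the sample: create a point-sized hyperbox.
            boxes.append([x.copy(), x.copy(), label])
        # Phases 2 and 3 (overlap test and contraction) would run here.
    return boxes
```

In a full implementation, every successful expansion would be followed by the overlap test against hyperboxes of the other classes and, if needed, contraction.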
In the original FMM,28 Simpson proposed two different algorithms for classification and clustering problems; GFMM combines them into one algorithm. The training of GFMM is extremely efficient in almost every case because it uses very simple compare, add, and subtract operations for hyperbox manipulation.
The other very popular version of FMM is the enhanced Fuzzy min-max (EFMM),26 which is known to give high classification performance given adequate training data. EFMM introduces three heuristic rules that enhance the learning process. First, it reduces the overlapping regions between hyperboxes during the expansion phase, which reduces classification errors. Second, the existing overlap testing phase is extended so that all the overlapping cases can be identified. Third, the existing hyperbox contraction rule in FMM is not able to cover all the overlapping cases, so EFMM introduces new contraction rules for solving the different overlapping cases.
In this article, we investigate the application of the GFMM and EFMM neuro-fuzzy classifiers to the classification of lung cancer gene expression data, an application that has not yet been explored. In previous work,30 the authors successfully applied the FMM classifier to the classification of lung cancer gene expression data. The current work advances on Reference 30 by exploring two advanced architectures of the FMM classifier for application to microarray data. The aim is to exploit the improved functionalities of the hyperboxes and the expansion-contraction learning process for determining the decision boundaries between cancer subtypes.
One problem with microarray data is that they have thousands of genes (or features), and processing all the features simultaneously increases time complexity. To overcome this issue, we use the least absolute shrinkage and selection operator (LASSO) for feature selection, a high-performing algorithm for selecting a sparse subset of features.31,32 Next, we analyze the performance of GFMM and EFMM, and compare the results with those of several other machine learning algorithms. We perform cross-validation for all the classification algorithms, and all comparisons are made based on the accuracy and execution time of each algorithm. A nomenclature table is provided in Table 1, which expands the different abbreviations and linguistic variables used in the article.
The organization of this article is as follows. Sections 2 and 3 contain a brief discussion of GFMM and EFMM, respectively. Section 4 presents the methodology used for the experiments. Finally, Section 5 analyzes the classification results, and Section 6 summarizes the article and outlines the future scope of this work.

GENERAL FUZZY MIN-MAX NEURAL NETWORK
In this section, we discuss the input patterns of GFMM, the phases of its learning algorithm, and the neural network at the core of GFMM for the current task of classifying microarray gene expression data.

Input pattern
The input processed by GFMM is the ordered pair of the hth input pattern and the class index c_h of one of the classes:

{I_h, c_h}, where I_h = [I_h^l, I_h^u]

and I_h^l and I_h^u are the lower and upper bound vectors of the hth input pattern. Therefore, unlike the original FMM, where the input is a data point, the input to GFMM is itself a hyperbox with a min point I_h^l and a max point I_h^u. The class index c_h ∈ {0, 1, 2, 3, …, p} refers to one of the p + 1 classes; c_h = 0 means the input is unlabeled.
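As a minimal illustration, this ordered-pair input could be represented as follows; the class name `GFMMInput` and its field names are our own illustrative choices, not part of the GFMM specification:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class GFMMInput:
    """Ordered pair {[I_h^l, I_h^u], c_h}: a hyperbox input with a class index.
    A crisp point is the special case lower == upper; label 0 marks an
    unlabeled sample."""
    lower: np.ndarray   # I_h^l, the min point of the input hyperbox
    upper: np.ndarray   # I_h^u, the max point of the input hyperbox
    label: int = 0      # c_h in {0, 1, ..., p}; 0 means unlabeled
```

A crisp training point with class 2 would then be `GFMMInput(x, x, label=2)` for a feature vector `x`.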

Membership function
The fuzzy hyperbox membership function plays an important role in deciding whether a particular input belongs to a particular class. GFMM defines a new membership function that overcomes a limitation of the original Fuzzy min-max: with the original function, the membership does not decrease steadily as the distance from the hyperbox increases, which is a major drawback. In GFMM, the degree of membership b_q for the hyperbox B_q is 1 if I_h lies inside B_q, and the membership decreases steadily as the distance from the hyperbox increases:

b_q(I_h) = min_{i=1,…,n} min([1 − f(I_hi^u − w_qi, γ_i)], [1 − f(v_qi − I_hi^l, γ_i)]),

where f(x, γ) = 1 if xγ > 1, f(x, γ) = xγ if 0 ≤ xγ ≤ 1, and f(x, γ) = 0 if xγ < 0. Here γ = [γ_1, γ_2, …, γ_n] is the sensitivity parameter, which regulates how fast the membership value decreases.
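A sketch of this membership computation, following the standard GFMM definition (the function names `ramp` and `gfmm_membership` are our own illustrative choices):

```python
import numpy as np

def ramp(x, gamma):
    """Two-sided threshold ramp f(x, gamma): 0 below 0, x*gamma in [0, 1], 1 above."""
    return np.clip(x * gamma, 0.0, 1.0)

def gfmm_membership(xl, xu, v, w, gamma=1.0):
    """Membership of the hyperbox input [xl, xu] in the hyperbox [v, w].
    Equals 1 when the input lies inside the box; decays with distance,
    with the decay rate controlled by gamma."""
    upper_side = 1.0 - ramp(xu - w, gamma)   # penalty for exceeding the max point
    lower_side = 1.0 - ramp(v - xl, gamma)   # penalty for falling below the min point
    return float(np.min(np.minimum(upper_side, lower_side)))
```

For a point inside the box both penalties vanish and the membership is exactly 1; outside, the worst-violating dimension determines the value.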

GFMM Learning algorithm
The steps of the GFMM learning algorithm are given below.

Min and Max point initialization
For a new hyperbox, the algorithm initializes the min point V_q = 0 and the max point W_q = 0; this initialization is then used automatically in the expansion phase of the algorithm. When the qth hyperbox is adjusted for the first time by an input pattern, its min and max points become identical to the input pattern, that is, V_q = I_h^l and W_q = I_h^u.

Hyperbox expansion
Suppose the hth input pattern is to be included by expanding the hyperbox B_q that has the highest degree of membership; before expansion, the following condition must be satisfied for every dimension i = 1, …, n:

max(W_qi, I_hi^u) − min(V_qi, I_hi^l) ≤ θ, (5)

where θ is a user-defined value that sets an upper bound on the maximum size of a hyperbox. If the condition in (5) is satisfied, the new min and max points of the hyperbox B_q are given by (6) and (7), respectively:

V_qi^new = min(V_qi, I_hi^l), (6)
W_qi^new = max(W_qi, I_hi^u). (7)

If the expansion condition is not satisfied, we look for another hyperbox of the same class to expand. If no hyperbox can be expanded, a new hyperbox B_k is created for the input pattern.
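The expansion test and the subsequent min/max update can be sketched as follows (the helper name `try_expand` is our own illustrative choice):

```python
import numpy as np

def try_expand(v, w, xl, xu, theta):
    """Attempt to expand hyperbox [v, w] to cover the input hyperbox [xl, xu].
    Expansion is allowed only if every edge of the candidate box stays
    within the user-defined bound theta."""
    new_v = np.minimum(v, xl)   # candidate new min point
    new_w = np.maximum(w, xu)   # candidate new max point
    if np.all(new_w - new_v <= theta):
        return new_v, new_w, True
    return v, w, False          # condition violated: leave the box unchanged
```

If the expansion fails, the caller falls back to the next same-class hyperbox or creates a new point-sized hyperbox for the input.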

Hyperbox overlap test
After a successful expansion, there is a chance of overlap between two hyperboxes, and if the overlapping hyperboxes belong to different classes, the classifier will give wrong results. The algorithm therefore conducts the hyperbox overlap test. Let B_q be the hyperbox that was just expanded; we test it for overlap against the hyperboxes of the other classes, and if an overlap is detected, we proceed to the contraction phase.

Hyperbox contraction
The Δth dimension of the two hyperboxes is adjusted only if the detected overlap is greater than zero. To minimize the effect on the size and shape of the hyperboxes, only one dimension is adjusted in each hyperbox. The contraction phase of GFMM is very similar to that of the original Fuzzy min-max.
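As an illustration, the contraction of a single dimension for the simplest overlap case (V_qΔ < V_rΔ < W_qΔ < W_rΔ, where the overlapping interval is split at its midpoint) could look like this; a full implementation handles several more overlap cases, and the function name is our own:

```python
def contract_simple_case(vq, wq, vr, wr, d):
    """Contract two overlapping hyperboxes along dimension d for the case
    vq[d] < vr[d] < wq[d] < wr[d]: the overlap [vr[d], wq[d]] is split at
    its midpoint, so only one dimension of each box changes."""
    mid = (wq[d] + vr[d]) / 2.0
    wq[d], vr[d] = mid, mid   # boxes now touch at mid instead of overlapping
    return vq, wq, vr, wr
```

After contraction the two boxes share only the single boundary value `mid` in dimension `d`, which removes the ambiguous overlapping region while disturbing each box as little as possible.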

FIGURE 1 The network architecture of GFMM.

Network architecture of GFMM
There are only two differences between the GFMM network architecture shown in Figure 1 and Simpson's original FMM network architecture. First, the number of input nodes is doubled to 2 × n. Second, an additional node is introduced in the output layer to handle the unlabeled hyperboxes from the second layer of the network.

ENHANCED FUZZY MIN-MAX NEURAL NETWORK
The enhanced Fuzzy min-max neural network (EFMM)26 overcomes the limitations of the original FMM learning algorithm and enhances its performance. It introduces three heuristic rules into the learning algorithm, as discussed in this section.

Shortcomings of FMM
The three shortcomings of FMM that are overcome by EFMM are summarized below.

Hyperbox expansion:
In FMM, growing overlap between the hyperboxes of two classes degrades classification performance. FMM first computes the sum of the differences between the min and max points across all dimensions, and then compares this (averaged) sum with the expansion coefficient θ. Even if a single dimension exceeds θ, the averaged sum can still stay under the expansion coefficient, which creates a high chance of wrong predictions and leads to overlapping regions between hyperboxes of different classes.

Hyperbox overlap test:
The four existing cases for detecting overlap between hyperboxes of different classes are not sufficient. For some inputs, an overlapping region exists but the test assumes it is non-overlapping and stops. Therefore, more conditions are added to the overlap test of EFMM.

Hyperbox contraction:
In FMM, the contraction is based on the hyperbox overlap test, but the overlap test can let some overlapping regions pass undetected, which creates problems in the contraction phase. In EFMM, all three phases are modified to overcome these problems, and the modified version improves the classification results.

EFMM Learning algorithm
The three heuristic rules that overcome these limitations of FMM are: 1. Hyperbox expansion rule: To solve the expansion problems of FMM, a new condition is formulated. The qth hyperbox is checked in each dimension separately to see whether it exceeds θ:

max(w_qi, x_hi) − min(v_qi, x_hi) ≤ θ, for all i = 1, …, n.

Expansion is permitted only if no dimension exceeds θ.
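A sketch of this per-dimension check, in contrast to the summed (averaged) check of the original FMM; the function name is our own illustrative choice:

```python
def efmm_can_expand(v, w, x, theta):
    """EFMM expansion rule: EVERY dimension of the candidate box must
    individually stay within theta. In original FMM only the sum over
    dimensions was bounded, so one oversized dimension could slip through."""
    return all(max(wi, xi) - min(vi, xi) <= theta
               for vi, wi, xi in zip(v, w, x))
```

Note how a single oversized dimension now blocks the expansion even when the other dimensions are small enough that the averaged FMM check would have passed.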
2. Hyperbox overlap test rule: In the original FMM, the four cases are insufficient for the hyperbox overlap test. EFMM modifies the test phase and includes additional overlap testing cases, as observed from (8). There are now nine cases in total to detect possible overlap regions; (10) and (11) are carried over from FMM.
An overlapping region is detected in a dimension only when δ_new < δ_old. To check the next dimension, we set Δ to the current dimension index and δ_old = δ_new; this loop ends when no more overlapping regions are detected.

Hyperbox contraction rule
For the contraction of the overlapping hyperboxes, EFMM introduces nine cases, all of which are based directly on the overlap test rules.
For example, Case 7(a) applies when V_qΔ < V_rΔ ≤ W_rΔ < W_qΔ, subject to an additional condition on the relative overlap sizes. Together, these nine cases cover hyperbox contraction in EFMM.
These three heuristic rules are the main reason for the enhancement of the EFMM learning algorithm over FMM.

METHODOLOGY
The task at hand in our current work is to identify lung cancer subtypes from gene expression profiles pertaining to lung cancer data. The details of the dataset are given in Section 5. The process flow of the training and testing procedures for GFMM/EFMM is shown in Figure 2. The dataset is split into two equal halves using alternate samples for training and testing (a 50:50 train:test split ratio). For each classifier, we perform validation (V) and cross-validation (CV) steps. In the V step, the classifier is trained on the train set, and the trained model is used to classify the test set. The CV results are obtained by swapping the training and test sets. We followed the train:test split and validation procedure of Reference 32. LASSO feature selection is used to select significant and informative genes prior to the classification phase. With a reduced gene subset, the whole classification process becomes faster, and the results are also impressive.
The steps of the methodology are detailed below.
i. Perform the 50:50 train:test split and then carry out classification with GFMM, EFMM, and the other classifiers and models used for comparison.
ii. Extract the important features from the training set of the lung cancer dataset. This step is required because the lung cancer dataset has 12,600 genes (i.e., features), and not all the genes make an impact on the final result. We used the LASSO feature selection technique for extracting the features from both the training and test sets. This step makes the whole classification process faster and more efficient. After the feature selection stage, the selected gene pool is used for training the model, and the trained model is applied to obtain the test accuracies, also known as the validation (V) accuracies.
iii. Swap the training and test sets and repeat step (ii), performing feature selection using LASSO on the new training set. The resulting test accuracies are known as the cross-validation (CV) accuracies.
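A condensed sketch of steps (i)-(iii) using scikit-learn's LASSO for feature selection; here a k-nearest-neighbor classifier stands in for GFMM/EFMM, and the regularization strength α is an illustrative assumption rather than the article's setting:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.neighbors import KNeighborsClassifier  # stand-in for GFMM/EFMM

def lasso_select(X_train, y_train, alpha=0.01):
    """Fit LASSO on the training half only; keep genes with nonzero
    coefficients. Falls back to all genes if the selection is empty."""
    sel = Lasso(alpha=alpha, max_iter=10000).fit(X_train, y_train)
    genes = np.flatnonzero(np.abs(sel.coef_) > 1e-8)
    return genes if genes.size else np.arange(X_train.shape[1])

def validate(X_a, y_a, X_b, y_b):
    """Train on split A, test on split B (the V step).
    Swapping the two splits and calling again gives the CV score."""
    genes = lasso_select(X_a, y_a)
    clf = KNeighborsClassifier().fit(X_a[:, genes], y_a)
    return clf.score(X_b[:, genes], y_b)
```

Calling `validate(train, test)` and then `validate(test, train)` reproduces the V/CV protocol: LASSO is always refit on whichever half currently plays the role of the training set.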

RESULTS
In this section, we first discuss the experimental setup and the hyperparameter settings of the classification algorithms, including GFMM and EFMM. We then compare the results of GFMM and EFMM with those of the other classification algorithms.

Experimental setup
The hyperparameters used for FMM, GFMM, EFMM, and the various machine learning algorithms used in the experimentation are given in Table 2. We set the parameters that give the best results using grid search.
For the other published works used for comparison,11,12,14 we followed the guidelines in the original articles for the hyperparameter settings. All the experiments were performed in Python version 3.7.0 on an Intel 2.00 GHz core PC.

TABLE 2 Hyperparameter settings for FMM, GFMM, EFMM, and machine learning classifiers.

Classifier Hyperparameter Value
Enhanced Fuzzy min-max Maximum size of the hyperbox (θ) 0.

Dataset details and preprocessing steps
We used the lung cancer dataset33 for our experiments. This dataset has 203 samples and 12,600 features (genes). There are five classes indicating five subtypes of lung cancer.33 The class distribution is highly imbalanced. The different cancer subtypes and their class populations are: lung adenocarcinomas (139), squamous cell lung carcinomas (21), lung carcinoids (20), small cell lung carcinomas (6), and normal samples (17). Under such a scenario, defining accurate class boundaries is an obvious challenge, which we propose to counter using the FMM classifiers GFMM and EFMM. We first normalize the dataset using min-max normalization with the range set to [0, 1]. After obtaining the train and test sets, we extract the selected features by applying LASSO to the train set. The reduced feature set is used to train the model and compute the test accuracy.
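The min-max normalization step can be sketched as follows; each gene (column) is scaled to [0, 1] independently, and mapping constant genes to 0 is a defensive choice of ours rather than something the article specifies:

```python
import numpy as np

def minmax_normalize(X):
    """Scale each column of X to [0, 1]: the column minimum maps to 0,
    the maximum to 1, and intermediate values to decimals in between.
    Constant columns (max == min) are mapped to 0 to avoid division by zero."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against zero-width columns
    return (X - lo) / span
```

In the paper's protocol this normalization is applied to the gene expression matrix before splitting into the train and test halves.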

GFMM Results
The hyperbox visualization of the general Fuzzy min-max model after training is complete is shown in Figure 3; the five colors indicate the five classes. As observed from Figure 3, the majority class, lung adenocarcinoma, is segregated well from the other classes, which indicates a good classification performance. The minority classes are also distinctly separated from each other. For the GFMM learning algorithm, we choose a hyperbox expansion coefficient of 0.5 and a sensitivity value of 1. The classification results for all methods are shown in Table 3 (accuracy) and Table 4 (execution time). The general Fuzzy min-max classifier gives the best result among all the classifiers, as observed from Table 3. The accuracy achieved with LASSO is 98.04% for validation and 94.06% for cross-validation, the best among all the classifiers applied to this microarray dataset. For GFMM, the execution time of the classification process with selected features is 4.57 s (with hyperbox visualization), which is faster than all the other fuzzy models. The reason for the accurately defined class boundaries of GFMM is the simplicity of the operations involved, for which even the few samples in the training set are sufficient, thereby reducing the overlap between classes to a great extent.

EFMM Results
The enhanced Fuzzy min-max classifier does not give good results for the small-sample dataset in our experiments. The accuracy achieved with LASSO is 90.2% for validation and 93.07% for cross-validation, as observed from Table 3. Figure 4 shows the EFMM hyperbox visualization obtained after the training process is complete. Comparing the hyperbox visualization of GFMM in Figure 3 with that of EFMM in Figure 4, we observe a better segregation of classes for GFMM, indicating that the fuzzy membership functions for the five classes are better defined in the case of GFMM. The EFMM visualization shows some degree of overlap between the majority class and a few minority classes, and also among the minority classes. This implies that EFMM struggles to learn from small-sample datasets such as gene expression datasets, because the intricate learning procedures involved require sufficient samples to learn from.

Results comparison
Table 3 shows the test accuracy comparison of all the classification algorithms. Other than GFMM and EFMM, we compared the results with the machine learning models support vector machine (SVM), K-nearest neighbor, logistic regression, naïve Bayes, and random forest, and with some existing works on cancer gene expression classification: FMM,30 1D-CNN,14 DNN,11 and SMOTE with DNN.12 For the hyperparameter settings of References 11, 12, 14, and 30, we referred to the original articles. LASSO is applied for feature selection in all cases. Out of the 12,600 features, 98 features are extracted during validation and 95 features during cross-validation, LASSO being applied only to the training set in each case. LASSO thus produces a reduced gene pool, which is used for training.
From these results, we find that GFMM stands out among all the algorithms in terms of accuracy (validation accuracy = 98.04%, cross-validation accuracy = 94.06%). The second-best performance is that of SMOTE, LASSO, and DNN12 (validation accuracy = 95.68%, cross-validation accuracy = 94.06%).
Figure 5 compares the performance of GFMM and EFMM with and without LASSO. The application of LASSO creates a reduced and optimized gene pool, hence the system is faster and the performance is boosted, as observed from the test accuracy scores in Figure 5. We note the following observations from Figure 5A,B, which show the validation and cross-validation accuracies, respectively.
i. Feature selection by LASSO significantly boosts the performance of the FMM models, especially GFMM, due to the inclusion of the most informative features in the selected gene pool.
ii. In the absence of feature selection, EFMM marginally outperforms GFMM, signifying that informative features are required for the efficient operation of GFMM.
From the execution times of all models summarized in Table 4, we make the general observation that the FMM-based methods take more time to execute than the other machine learning algorithms. GFMM is the fastest among the fuzzy algorithms when feature selection is used, and EFMM takes approximately the same time with and without feature selection. The execution time of GFMM is 4.57 s with online (progressive) hyperbox visualization and around 0.32 s without it.
After analyzing all the comparison methods, we can say that the general Fuzzy min-max classifier is a very suitable option for the classification of microarray data, and it performs best when used with feature selection to select the optimal gene set that identifies the cancer subtype accurately.

Implications of the study
The advantages of our method are as follows.
1. Neuro-fuzzy classifiers are not a new concept; however, they assume significance in contemporary times due to the rising interest in explainable AI.
2. The fuzzy class definitions assume overlap between classes and set new rules for defining outliers, which is not the case for crisp classifiers such as naïve Bayes, logistic regression, and SVM, where crisp decision boundaries are defined.
3. As observed from the experimental results, the general Fuzzy min-max neural network proved effective for classifying the gene expression dataset, achieving a high validation accuracy of 98.04% and a cross-validation accuracy of 94.06%.
4. GFMM involves simple computations and executes faster than most neural networks; it performs even better and executes even faster with feature selection.
5. The combination of GFMM with LASSO feature selection outperformed EFMM, the state-of-the-art machine learning algorithms, and some recently published works11,12,14,30 on cancer gene expression classification.
The limitations of our method are as follows.
1. Feature selection algorithms are data-dependent, and the number of selected features depends on the training set, whose size is limited in our case due to the small size of gene expression datasets. Therefore, advanced neuro-fuzzy architectures incorporating some form of embedded feature selection should be developed in the future.
2. The lung cancer dataset is imbalanced in nature, a characteristic of most cancer gene expression datasets. Therefore, balancing strategies such as resampling, cost-sensitive learning, and hybrid approaches should be tested for their suitability to gene expression datasets.
3. The present work focuses only on classifying lung cancer subtypes. Future models could attempt to simultaneously classify types and subtypes of different cancers, as in Reference 14, which presented a stacked ensemble framework; the computational overhead in such a case needs to be reduced.

CONCLUSION
In this article, we have explored neuro-fuzzy systems for cancer diagnosis from gene expression data. We applied two advanced neuro-fuzzy models, GFMM and EFMM, for classifying gene expression data into five subtypes of lung cancer. We found that GFMM was the most suited for classifying small-sample datasets such as gene expressions, owing to the simple computations of fuzzy memberships that require fewer samples to learn from. To the best of our knowledge, no other work has investigated the application of these neuro-fuzzy models to the classification of gene expression data. For all the experiments we used the microarray lung cancer gene expression dataset containing 203 samples and 12,600 genes. LASSO is used for selecting the important genes, and this optimized subset of genes is used for training all the models. In the performance analysis, we found that GFMM is more efficient than EFMM and Simpson's FMM in terms of both accuracy and execution time. GFMM also outperformed all the other machine learning algorithms and some existing works on cancer gene expression classification published in the last few years. GFMM in combination with LASSO achieved a validation accuracy of 98.04% and a cross-validation accuracy of 94.06%. SMOTE with LASSO and deep neural networks is the second-best method, with a validation accuracy of 95.68% and a cross-validation accuracy of 94.06%. The hyperbox visualizations indicate that the fuzzy membership values for the hyperboxes pertaining to the five lung cancer subtypes are better defined for GFMM than for EFMM. The decision boundaries of GFMM are more accurate due to the improved expansion-contraction process that reduces the overlap between the hyperboxes representing the different classes. GFMM is also the fastest among all the fuzzy classifiers: its execution time is 4.57 s with online (progressive) hyperbox visualization and around 0.32 s without it. Our work provides insights for the future development of neuro-fuzzy models incorporating feature selection and imbalance treatment that can be customized for the classification of small-sample, high-dimensional gene expression datasets.

FIGURE 2 Flow chart of the training and testing process: (1) load the microarray gene expression dataset; (2) normalize the dataset using min-max normalization to the range [0, 1], so that minimum values map to 0, maximum values map to 1, and intermediate values map to decimals in between; (3) select the important features with LASSO and then perform the classification task.

TABLE 1 Nomenclature table of abbreviations and linguistic variables used in the article.

TABLE 3 Test accuracies of different models for lung cancer classification with LASSO as feature selector.

TABLE 4 Average execution time of classification algorithms.

FIGURES 3 and 4 GFMM and EFMM hyperboxes constructed after training is complete.

FIGURE 5 Performance of GFMM and EFMM models with and without LASSO.