Contextualizing the current state of research on the use of machine learning for student performance prediction: A systematic literature review

Today, educational institutions produce large amounts of data with the deployment of learning management systems. These large datasets provide an untapped potential to support and enhance decision‐making and operations. In recent times, machine learning (ML) has been applied to develop models utilizing this "big" data to assist in decision‐making. This study presents a systematic literature review into the application of ML to predict student performance. A total of 162 research articles from January 2010 to October 2022 were critically reviewed and analyzed by applying Kitchenham's systematic literature review approach. Our analysis categorized the literature predicting students' academic performance into two categories: (i) predicting student performance in assessments, courses or programs, and identifying students at risk of failing their course/program (129 studies); and (ii) predicting student dropout or retention in a course or program (33 studies). Classification is the most commonly used approach for predicting student performance (138 studies), followed by regression (25 studies) and clustering (9 studies). Supervised learning methods are used more often than semi‐supervised learning. The five most popular ML algorithms are the Decision Tree, Random Forest, Naïve Bayes, Artificial Neural Network, and Support Vector Machine. Historical records of students' grades and class performance, academic related data from learning management systems, and students' demographics are the most common features used for predicting students' performance. The most common methods used for feature selection are Information Gain‐based selection algorithms, Correlation‐based feature selection, and Gain Ratio. The general platforms/tools/libraries used in the studies include WEKA, Python, R, Rapid Miner, and MATLAB. We also investigated possible actions considered in the literature to help at‐risk students.
We found only a few studies that deployed remedial actions and evaluated their impact on students' performance. In conclusion, ML has shown great potential for predicting student performance, but many areas remain open for further research.


INTRODUCTION
Today, large amounts of data are produced by educational institutions with the widespread use of online educational systems and learning technologies. 1 Consequently, the potential to extract knowledge and insights from educational data has gained momentum. These insights can significantly impact students' learning experiences and outcomes, the efficiency of managing learning organizations' services, and more. Educational Data Mining (EDM) is an emerging field aimed at addressing these needs. 2 EDM is an interdisciplinary field of study that applies Data Mining (DM) methods to educational data in order to solve problems or gain insights from the massive amounts of data in educational systems. 3 The International Educational Data Mining Society defines EDM as an evolving field that aims to establish methods to investigate the particular types of data from educational settings and to use these methods to better understand learners and the environments they study in. 4 DM methods use techniques from statistics, databases, artificial intelligence and, more recently, Machine Learning (ML) to analyze data. While conventional data mining techniques typically require manual intervention, ML requires little human intervention and can continuously learn and improve its models as more data becomes available.
More specifically, ML is a branch of artificial intelligence that enables systems to automatically learn and optimize models from data without being explicitly programmed. 5 ML focuses on building models that can learn complicated relationships or patterns from large datasets. 6,7 These models can then be used for prediction. EDM has successfully used ML methods to build models that meet various educational goals. 8 ML has been successfully applied to predict students' academic performance, identify at-risk students, define key learning requirements for different students, predict student graduation rates, and more. 9 Predicting student performance is an important area of research in EDM and has been studied extensively. 10-14 In the work of Roy and Garg, 10 an analysis of the different DM techniques, the pros and cons of DM methods, and DM tools and trends in the EDM literature is presented. Another study detailing DM techniques for predicting student performance is presented by Christy and Rama. 11 In the study by Shahiri and Husain, 12 a systematic literature review (covering 30 studies from 2002 to 2015) on predicting students' performance using DM techniques is discussed. They investigated significant attributes used for prediction, DM techniques, and the accuracy of predicting student performance. A similar systematic literature review, covering papers published between 2007 and 2016, is presented by Kumar et al. 13 Abu Saa et al. 14 also conducted a systematic literature review analyzing 36 studies from 2009 to 2018, with the aim of identifying the common factors that influence the prediction of student performance and the DM techniques that identify those factors. Albreiki et al. 15 presented a systematic literature review, covering 78 research articles from 2009 to 2021, on using EDM to identify student dropouts and students at risk of failure. Hellas et al. 16 presented a comprehensive systematic literature review analyzing articles from 2010 to 2018 to define what performance means and the purpose of predicting students' performance.
Our study complements existing work by conducting an extensive systematic literature review on the use of ML methods to predict student performance in the context of EDM, covering work published from January 2010 to October 2022. Although ML methods have shown promise in predicting student performance, previous literature reviews have considered DM techniques in general. Our study is unique in that we focus on ML methods and algorithms, as well as their context and use in predicting student performance. In addition, we identified the features/factors affecting the prediction models and the feature selection methods/algorithms applied in the literature. Finally, we examined what remedial actions were taken in the literature for students predicted to be at risk. The research questions (RQs) addressed in this study are as follows: • What are the goals/objectives of applying ML techniques in predicting student performance?
• What are the most common ML methods and algorithms applied in the literature for predicting student performance?
• What are the most commonly used features/factors for the prediction of student performance?
• What methods for feature selection are employed in the literature?
• What learning environments have ML methods been applied to predict student performance?
• What are the ML tools/platforms/libraries used to implement ML algorithms for predicting student performance?
• What actions are taken, based on the prediction results, for the identified at-risk students?
The above seven RQs help us contextualize the current state of research in using ML to predict students' performance. To the best of our knowledge, no other systematic literature review has considered the feature selection methods/algorithms applied for predicting students' performance, or reviewed studies that take actions based on student performance prediction results.
Through this literature review, we identify the emerging areas and current applications of ML algorithms in predicting students' academic performance, students at risk of failure or dropping out, and student retention. This review not only highlights the main areas of focus in the literature, but also exposes areas that require further attention. Additionally, the review provides an analysis of the most prominent ML techniques, algorithms, platforms and tools used for predicting student performance.
The rest of this article is organized as follows: Section 2 presents some background on EDM and ML methods. Section 3 details the literature search procedure and criteria. Section 4 analyses the literature, with the goal of addressing the seven RQs outlined above, and reports the results. The discussion is presented in Section 5. Section 6 concludes the article with a view on future research directions.

BACKGROUND
EDM is an interdisciplinary field of study that seeks to better understand students' interests and behaviors by applying DM techniques and algorithms to explore data from educational settings. 17 ML methods, a popular subset of DM techniques, can be divided into three types: supervised learning, unsupervised learning, and semi-supervised learning. Supervised learning utilizes input data with labelled responses, aiming to develop models that infer relationships between input and output observables in the data. 18 Classification and regression are the two main approaches used in supervised learning. Classification has a categorical output label, while regression has a numerical value as the output label. In contrast, unsupervised learning utilizes input data without labelled responses in order to find unseen patterns or clusters in the data. 19,20 Cluster analysis is the most common method used in unsupervised learning. 18 Semi-supervised learning exploits a small pool of labelled data together with a large pool of unlabelled data in order to build efficient learning models by extensively analyzing the unlabelled data. 21 It is typically used when the labelled data is too small to create an efficient model. This section briefly introduces some popular ML methods used in EDM.

Supervised ML methods for classification
Classification is a supervised learning approach that learns from input data and then uses that learning to classify new observations; the goal is to predict categorical output variables from input data. When learning a model, classification uses labelled input data as a training dataset to build the model. Unseen data is then used to test the model's efficiency and analyze its results by evaluating the accuracy of the model. 22 The most common metrics used to evaluate a classification model include the Confusion Matrix, Receiver Operating Characteristic (ROC), and F-measure (F1) Score. 23 Classification is the most popular approach used in dealing with educational data. 22 Some of the common ML algorithms used in classification problems are C4.5, ID3, K-Nearest Neighbour (K-NN), Naïve Bayes (NB), Support Vector Machine (SVM), and Artificial Neural Networks (ANN).
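To make these ideas concrete, the following minimal Python sketch classifies hypothetical students as pass/fail using a plain K-NN majority vote, then evaluates the predictions with confusion-matrix counts and an F1 score. The feature names (quiz average, LMS logins) and values are invented for illustration and are not drawn from any study reviewed here.

```python
import math

# Toy training data: (quiz_avg, lms_logins) -> label (1 = pass, 0 = fail).
# Values are illustrative only.
train = [
    ((85, 40), 1), ((78, 35), 1), ((90, 50), 1), ((72, 28), 1),
    ((45, 10), 0), ((55, 12), 0), ((38,  5), 0), ((60, 15), 0),
]
test = [((80, 30), 1), ((50, 8), 0), ((70, 25), 1), ((40, 20), 0)]

def knn_predict(x, data, k=3):
    """Classify x by majority vote among the k nearest training points."""
    nearest = sorted(data, key=lambda p: math.dist(x, p[0]))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

preds = [knn_predict(x, train) for x, _ in test]
truth = [y for _, y in test]

# Confusion-matrix counts and F1 score for the "pass" class.
tp = sum(p == 1 and t == 1 for p, t in zip(preds, truth))
fp = sum(p == 1 and t == 0 for p, t in zip(preds, truth))
fn = sum(p == 0 and t == 1 for p, t in zip(preds, truth))
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
print(f"predictions={preds}, F1={f1:.2f}")
```

In practice the reviewed studies rely on library implementations (e.g., in WEKA or Python toolkits) rather than hand-rolled classifiers, but the train/predict/evaluate cycle is the same.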

Supervised ML methods for regression
Regression is a supervised learning approach used mainly when there is a need to predict a continuous outcome variable (y), the dependent variable, based on the value of one or more predictor variables (x_i), called independent variables, by evaluating the relationship between the variables. 10 The goal of regression models is to find an equation that explains (y) as a function of the (x_i) variables, which can then be used to predict the (y) value from the (x_i) variables. The Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and Adjusted R-squared are commonly used to evaluate the efficiency of a regression model.
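The following small Python sketch shows the idea end to end: it fits a closed-form simple linear regression (one predictor) to hypothetical data and reports RMSE and MAE. The relationship between study hours and final mark is invented for illustration, not taken from the reviewed literature.

```python
# Toy data: weekly study hours (x) vs. final course mark (y). Illustrative only.
xs = [2, 4, 6, 8, 10, 12]
ys = [50, 58, 63, 72, 80, 85]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form ordinary least squares for y = intercept + slope * x.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

preds = [intercept + slope * x for x in xs]
rmse = (sum((p - y) ** 2 for p, y in zip(preds, ys)) / n) ** 0.5
mae = sum(abs(p - y) for p, y in zip(preds, ys)) / n
print(f"y = {intercept:.2f} + {slope:.2f}x, RMSE={rmse:.2f}, MAE={mae:.2f}")
```

With multiple predictors the same least-squares principle applies; the studies reviewed here typically use library implementations that also report Adjusted R-squared.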

Unsupervised ML methods for clustering
Clustering is an unsupervised learning method that divides objects into groups, where objects in the same group are more similar to each other than to objects in other groups. 10 The goal of cluster analysis is to gain insights from grouping unlabelled data; clustering is considered unsupervised learning mainly because it utilizes unlabelled data. 24 K-means clustering is a widely used algorithm for clustering problems in the education domain. The Silhouette Score and Silhouette Plot, which measure the separation distance between clusters, are the metrics most commonly used to measure the performance of a clustering algorithm.
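As a minimal sketch of both the algorithm and the metric, the following Python example runs plain k-means (k = 2) on hypothetical, unlabelled student data and computes the mean silhouette coefficient by hand. The feature values are invented for illustration; the fixed initial centroids keep the run deterministic.

```python
import math

# Toy unlabelled data: (avg quiz mark, LMS logins per week). Illustrative only.
points = [(85, 9), (88, 8), (90, 10), (82, 7),
          (45, 2), (50, 3), (40, 1), (55, 2), (48, 3), (52, 4)]

def kmeans(data, k=2, iters=20):
    """Plain k-means with fixed initial centroids for reproducibility."""
    centroids = [data[0], data[4]]  # deterministic seed points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in data:
            idx = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        centroids = [
            tuple(sum(c) / len(pts) for c in zip(*pts)) if pts else centroids[j]
            for j, pts in enumerate(clusters)
        ]
    return clusters

def silhouette(clusters):
    """Mean silhouette coefficient over all points (two-cluster case)."""
    scores = []
    for i, own in enumerate(clusters):
        other = clusters[1 - i]
        for p in own:
            a = sum(math.dist(p, q) for q in own if q != p) / max(len(own) - 1, 1)
            b = sum(math.dist(p, q) for q in other) / len(other)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

clusters = kmeans(points)
print(f"cluster sizes: {[len(c) for c in clusters]}, "
      f"silhouette: {silhouette(clusters):.2f}")
```

A silhouette value near 1 indicates well-separated clusters; values near 0 suggest overlapping groups.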

Semi-supervised ML methods
Semi-supervised ML algorithms fall between supervised and unsupervised learning. Generally, semi-supervised learning uses a small amount of labelled data and a significant amount of unlabelled data for training, in order to uncover hidden information in the unlabelled data and build effective learning models by analyzing the unlabelled data extensively. 21 The aim of semi-supervised learning is to consider how the mixture of labelled and unlabelled data affects learning behavior, and to develop algorithms that benefit from such a combination. Semi-supervised learning is important in ML and DM, as it can use accessible unlabelled data to enhance supervised learning tasks when labelled data is unavailable or expensive to obtain. 25,26 Self-training, co-training and tri-training are common semi-supervised techniques. 27 Metrics such as the Confusion Matrix, ROC, Area Under the ROC Curve (AUC), and F1 Score 23 are widely used to evaluate the performance of a semi-supervised learning method.
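The self-training idea can be sketched in a few lines of Python: a 1-NN model is trained on a tiny labelled pool, then the most confidently predicted unlabelled points are pseudo-labelled and folded back into the training set. The margin-based confidence and the 0.7 threshold are simplifying assumptions for this toy example; the data values are invented for illustration.

```python
import math

# A small labelled pool and a larger unlabelled pool. Feature tuples are
# (quiz_avg, lms_logins); label 1 = pass, 0 = fail. Values are illustrative.
labelled = [((85, 40), 1), ((40, 8), 0)]
unlabelled = [(80, 35), (45, 10), (88, 42), (38, 6), (75, 30), (50, 12)]

def predict_with_margin(x, data):
    """1-NN prediction plus a margin-based confidence in (0.5, 1.0]."""
    ranked = sorted((math.dist(x, p), lab) for p, lab in data)
    label = ranked[0][1]
    # Distance to the nearest example of the *other* class.
    d_other = next(d for d, lab in ranked if lab != label)
    conf = d_other / (ranked[0][0] + d_other + 1e-9)
    return label, conf

# Self-training loop: repeatedly pseudo-label the unlabelled point the model
# is currently most confident about, and fold it into the training set.
pool = list(unlabelled)
while pool:
    scored = [(x, *predict_with_margin(x, labelled)) for x in pool]
    x, label, conf = max(scored, key=lambda t: t[2])
    if conf < 0.7:  # stop once no confident pseudo-label remains
        break
    labelled.append((x, label))
    pool.remove(x)

print(f"training set grew from 2 to {len(labelled)} examples")
```

Co-training follows the same loop but maintains two models over two feature views, each supplying pseudo-labels to the other.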

LITERATURE SEARCH PROCEDURE AND CRITERIA
The main objective of this article is to present a systematic literature review on utilizing ML techniques to predict student performance. This study followed the methodological guidelines of Kitchenham et al., 28 which offer the following benefits:
• Rigor and consistency: The guidelines provide a structured and systematic approach to conducting a literature review, which ensures rigor and consistency in the review process.
• Increased transparency: The guidelines provide a clear and transparent method for documenting the literature review process, which makes it easier to evaluate the quality of the review and the conclusions drawn from it.
• Improved reproducibility: By following a structured and systematic approach, the guidelines help to ensure that the literature review can be easily reproduced by others, thereby promoting the transparency and replicability of the research.
• Better evidence synthesis: The guidelines promote the systematic synthesis of evidence from multiple sources, which can lead to more comprehensive and robust conclusions.
• Enhanced reliability: The guidelines help to ensure that the literature review is reliable and free from bias, which increases the credibility and validity of the research findings.
Overall, Kitchenham et al.'s 28 methodological guidelines provide a useful framework for conducting a systematic literature review and can help to improve the quality and rigor of the research. Therefore, we followed Kitchenham's methodological guidelines to conduct this review. In this section, we present the literature search procedure in detail.
For the purposes of this literature review, we defined the scope of "student academic performance" broadly, to include students' performance on a single assessment, performance in a course, and retention in or continued progress through a program/course. However, we only considered quantifiable variables directly connected to the course or program, such as grades, pass/fail rates, program retention and course/program dropout. We did not include papers that predict factors that are not measurable or directly connected to academic performance, such as depression or team cohesiveness, even if they are likely to impact student performance. We broadly defined "student dropout" as a student withdrawing from a course, school, college, or university before completing their qualification. We also broadly defined "student retention" as the percentage of students who re-enroll in courses in a program in the subsequent year. Finally, we broadly considered "early prediction" as predictions that occur before or during the term, while predicted low-performing students still have the opportunity to reflect on and improve their performance to successfully complete a course or program.
Several keywords were used to identify papers related to the RQs. Specifically, the terms "educational data mining," "academic performance," "student performance," and "machine learning" were used in Scopus, Web of Science, and the ACM Digital Library to search for relevant studies. Boolean operators such as AND, OR and NOT were used in the search strings. This study covered papers published from January 2010 to October 2022. The inclusion criteria used to examine relevant studies are listed below:
1. Papers that have used ML methods and algorithms in the context of educational settings.
2. Papers that are either journal or conference publications.
3. Papers that are not review papers.
4. Papers published in English.
The exclusion criteria used to exclude studies that are not relevant are as follows:
1. Papers that are not relevant to the RQs.
2. Papers that do not use ML methods and algorithms.
The approach to selecting studies relevant to the RQs had three steps. First, published papers satisfying the search keywords were selected; the search query was: ("educational data mining" AND ("academic performance" OR "student performance") AND "machine learning"). Second, papers were selected based on the selection criteria by reading the title, abstract, and keywords of each paper. Finally, papers were selected based on a reading of their full text.
The process of selecting relevant studies was performed by the first author. The second and third authors checked and confirmed the results to reduce bias from the first author. The final selection included 162 papers. Figure 1 illustrates the process of selecting the literature in this review. A summary of the reviewed papers in terms of their aims of prediction, ML methods, datasets/data sources, types of features, feature selection methods, and platforms/tools used to implement the ML algorithms is provided as supplementary material to this paper. In addition, we have listed the top 25 cited research articles in Table 1, and a year-wise distribution of included publications in Figure 2.

RESULTS
In this section, we report the results of our systematic literature review study, where we address our RQs and elaborate on the findings derived from analyzing the extracted data.

RQ 1: ML and students' performance
Analyzing the results of the systematic literature review, we can broadly classify the main objectives of applying ML to predict student performance into two categories:
• Predicting students' academic performance in a course or program and identifying students at risk of failing.
• Predicting student dropout or retention in a course or program.
Predicting students' academic performance includes applying ML algorithms to predict marks/grades in assessments and courses, predict the Cumulative Grade Point Average (CGPA), predict students' graduation time, and identify students who are at risk of failing their courses/programs. There were 129 studies in this category (see Figure 3). Identifying at-risk students early on can assist in intervention strategies aimed at achieving better outcomes. A second category of applying ML algorithms to predict student performance is to predict whether a student will drop out of a course, a program, or a university. There were 33 studies in this category (see Figure 3).

Predicting students' academic performance and detecting students at risk of failing

This category has attracted the attention of many researchers who seek to understand students' learning processes and behaviors with the aim of improving students' performance and outcomes. Various researchers used ML algorithms, both supervised and unsupervised, to extract hidden knowledge and patterns from student data. 53 Many applications used classification, regression, or clustering methods on educational datasets to predict students' performance in their exams, predict whether a student will pass certain courses, predict students' CGPA, predict students' graduation time, and more. A high-level analysis of some studies that predict students' performance is given below. Predicting students' performance in their assessments is crucial for academic institutions, as identifying students at risk of failing in the early stages allows effective interventions to be provided early on to improve their performance. 54 Extensive research has been conducted to predict students' performance in assessments. For instance, Kostopoulos et al. 55 carried out a study to predict undergraduate students' performance (either pass or fail) in the final examination at the end of the academic year in an e-learning course, using a co-training method based on two views of the dataset. The first view includes features concerning students' demographics and academic performance; the second view includes features tracking students' online activities in the course's learning management system (LMS). The researchers used five ML algorithms (K-NN, Extra Tree Classifier, Random Forest, Gradient Boosting Classifier (GBC), and Gaussian NB Classifier) as base learners to form a number of co-training and self-training algorithms. The results showed that co-training outperforms self-training algorithms. More specifically, their proposed co-training (Extra, GBC) algorithm provided the highest accuracy (77.69%-85.19%, confirmed by the Friedman Aligned Ranks nonparametric test, with F1 scores ranging from 0.767 to 0.847).

FIGURE 3: Studies grouped by their purpose of prediction (percentages based on all reviewed studies). Percentages are the number of studies in each category divided by the total number of studies (162), multiplied by 100.
In a similar study, Athani et al. 56 classified students' final exam results into five levels ("A," "B," "C," "D," and "F") using two ML algorithms: Multiclass SVM (MSVM) and ANN. The results showed that the MSVM outperforms the ANN (accuracy of 84.16% for the MSVM and 68.9% for the ANN). Another study was conducted by Kotsiantis et al. 29 to predict students' performance in a distance learning environment by implementing an ensemble classification method that combines an incremental version of NB, the WINNOW algorithm, and 1-NN using the voting methodology. Ensemble methods are techniques that seek to improve the performance of a model by training multiple models and combining their predictions, instead of relying solely on a single model. The combined models generally improve a model's accuracy. 57,58 The dataset used in this study consisted of 1347 instances and five features (the 1st, 2nd, 3rd and 4th written assignment grades, and the final exam grade) from an online Informatics course. Results showed that the proposed ensemble method outperforms the NB, 1-NN and WINNOW algorithms as well as other state-of-the-art ML algorithms (Back-Propagation, 3-NN, the Radial Basis Function algorithm, C4.5, Sequential Minimal Optimization, and RIPPER). The average accuracy of the proposed ensemble method was 78.95%.
EDM can play a significant role in identifying issues that affect students' performance, especially for students who are at risk of failing. A number of studies have presented early warning systems or models to identify students at risk of failing. The first step towards improving academic performance is the early detection of low-performing students during the academic term. Early warning systems are considered effective tools for the early detection of students who are at risk. 59 Through the use of such warning systems, educators can deliver early intervention strategies and real-time support to assist struggling students, as well as adjust the course to cater to students' needs. 60 Many studies applied ML techniques to educational data to predict at-risk students who might fail their courses or programs. For instance, Trakunphutthirak et al. 61 used a university Internet access log file to predict students' academic performance and detect students at risk of failing from their browsing behavior. The study applied five ML algorithms: Decision Tree, NB, Logistic Regression, ANN, and Random Forest. The average accuracy of the Decision Tree was 76.7%, NB 58.3%, Logistic Regression 68.9%, ANN 73.9%, and Random Forest 78.9%; the Random Forest technique provided the highest accuracy for predicting students at risk of failure. Another study, conducted by Iqbal et al., 62 proposed an early prediction of students' grades in order to provide special attention to at-risk students at the early stages of the semester. The study evaluated multiple ML techniques for predicting students' grades, such as Matrix Factorization, Nuclear Norm Minimisation, Collaborative Filtering, Bayesian Ridge Regression, and the Restricted Boltzmann Machine (RBM). Results indicated that the RBM provides the best results in terms of RMSE and MAE for predicting students' grades.
In another study, Cano and Leonard 48 presented a multi-view early warning system that aims to identify students who are at risk of failure, course withdrawal and dropping out. The study built an incremental model on a weekly basis using Multi-View Genetic Programming (MVGP) classification and evaluated the model's performance against single-view algorithms. The average accuracy of the single-view algorithms ranged between 82.5% and 91.1%, while the average accuracy of the proposed MVGP was 91.7%. In addition, the average AUC values of the single-view algorithms were between 70.4% and 86.3%, while the average AUC of the proposed MVGP was 90.1%.
We observe that most work along this line of research used classification methods rather than regression for classifying students' results at the end of their courses. This makes sense, since the nature of the problem is mostly to categorize students into different levels, such as pass or fail in binary classification, or low, average and high performance in multiclass classification. Moreover, many different ML algorithms have been utilized for predicting student academic performance. Various factors affect the choice of ML algorithm, such as the type of problem, the size, quality and nature of the data, the number of features or parameters, and linearity.

Predicting students' retention and dropout
Another critical issue for academic institutions is student retention (in courses, programs, etc.) and successful completion (e.g., graduating on time). Low student retention rates can have significant implications for tertiary institutions. According to the Complete College America dashboard,* only 5% of full-time students at 2-year institutions graduate on time, 12% of full-time students in 1- to 2-year certificate programs graduate on time, and only 20% of full-time American college students in 4-year Bachelor degree programs graduate on time.
Several studies have applied ML techniques to predict students' graduation time, in order to identify students who are in danger of not graduating from their programs within the expected time. The work of Livieris et al. 63 proposed a semi-supervised, self-trained, two-level classification algorithm to predict whether a student will graduate within 6 years, followed by predicting the student's graduation time if so. The study applied several ML algorithms in their proposed model: NB, Multi-Layer Perceptron, Sequential Minimal Optimization, K-NN, C4.5, and RIPPER (JRip). Results indicated that the proposed model can identify students at risk of not graduating and accurately classify students by their graduation time (four, five, or six years). More specifically, K-NN showed the best average classification accuracy of 66.17%, 77.92%, and 81.48% when utilizing 20%, 30%, and 40% labelled data ratios, respectively.
Student dropout is also a critical issue faced by higher education institutions in many countries, and reducing the student dropout rate has been a common goal of many universities. A number of studies have applied ML techniques to predict whether or not a student will drop out of a course. In the work of Burgos et al., 64 a study was conducted to predict whether a student will drop out of an e-learning course, using a dataset of 104 students across five courses. The study built an incremental model using Logistic Regression to detect whether a student is at risk of dropping out of a course, and also created a tutoring plan to advise students who are at risk of dropping out. The proposed model predicts students' performance based on their weekly activity grades. The model was able to achieve an accuracy of 97.13%, precision of 98.95%, recall of 96.73%, and specificity of 97.14% at week 10 (the middle of the term). By implementing dropout prevention actions at different times during the term, targeting identified at-risk students, the authors were able to reduce the dropout rate by 14% compared with previous academic years.
Another study, conducted by Iam-On and Boongoen, 65 presented a new data transformation model, based on the summarized data matrix of link-based cluster ensembles, to improve the classification accuracy of predicting students' dropout by converting the original data into a new form. The study used a dataset of 811 records and 21 features from an operational database system at Mae Fah Luang University, Thailand, from 2009 to 2012. The aim was to predict students' dropout before and after the first year at the university. Their results indicated that the proposed method achieved an accuracy of 91% and performed better than many well-known reduction methods.
In summary, both student retention and dropout are of significant concern to academic institutions due to their impact on student graduation. About 30% of first-year students at baccalaureate institutions in the United States do not return for their second year. 66 As discussed, several studies have used ML algorithms to identify students who are at risk of dropping out. A dropout early warning system can assist universities in identifying and responding to students who are at risk of dropping out of their courses or university programs. At-risk students tend to drop out without properly evaluating the negative implications of their decision, or without having the chance to seek expert advice and support. Dropout early warning systems can guide students who are likely to drop out towards intervention strategies that can lead them to successful completion and graduation. 67

Summary
The above analysis addressed our first RQ: "What are the goals/objectives of applying ML techniques in predicting student performance?" We can broadly classify the main goals of applying ML algorithms and methods for predicting student performance (in assessments, courses, or programs) as: (i) predicting student academic performance and identifying students at risk of failing, and (ii) predicting students at risk of dropping out. It is also interesting to note that most of the research has focused on predicting academic performance to detect students at risk of failing (129 studies), while fewer studies have focused on predicting students at risk of dropping out (33 studies). These findings highlight areas to which future research can devote attention, such as student dropout and retention, given that it is a prevalent and significant issue. 66 The next section looks at the main ML methods/algorithms used for predicting student performance.

RQ 2: ML methods and algorithms
Three main approaches are utilized for predicting students' performance in the literature: classification, regression, and clustering. Classification is the main approach used to predict student performance (around 80% of studies, see Figure 4). In our survey, 25 studies used regression to predict student performance, and only nine studies applied clustering (i.e., unsupervised learning). Our analysis (detailed in Figure 4) indicates that most research work examined supervised learning via classification or regression methods, while only six studies utilized semi-supervised learning for classification and two for regression. Note that some studies used more than one method and are thus classified into multiple groups.
We extracted more than 50 distinct ML algorithms used by the 162 papers reviewed. The most widely used algorithms for predicting students' performance, applied in ten or more studies, are shown in Figure 5. As can be seen in Figure 5, the Decision Tree, Random Forest, NB, ANN, and SVM are the five most used ML algorithms in the literature. Based on this analysis, we are able to answer the second RQ: "What are the most common ML methods and algorithms applied for predicting student performance in the literature?" An ML algorithm's computational time complexity plays a vital role in its selection for creating prediction models. Each ML algorithm takes time to train on a dataset and perform a prediction, which is referred to as its computational complexity. Table 2 shows the time complexity of the ML algorithms most widely referred to in this review. We noted that NB and Decision Trees are the least complex algorithms in Table 2, which could explain why these two are also the most used algorithms.
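A minimal sketch of how these five algorithms are typically compared in the reviewed studies, using scikit-learn on a synthetic binary "pass/fail" dataset; the dataset, feature counts, and parameters below are illustrative assumptions, not drawn from any reviewed study:

```python
# Comparing the five most-used algorithms (Figure 5) on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a labelled student dataset (pass = 1, fail = 0).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "ANN (MLP)": MLPClassifier(max_iter=1000, random_state=0),
    "SVM": SVC(random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: test accuracy = {model.score(X_test, y_test):.2f}")
```

In practice the reviewed studies report additional metrics (precision, recall, AUC) alongside accuracy, and tune each algorithm's hyperparameters rather than using defaults as above.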

RQ 3: Common features/factors used for predicting students' performance
We classified each study into one or more of the categories described in Table 3. These include: (1) historical records of students' grades and class performance; (2) students' demographics related features; (3) academic related features from LMSs; (4) learning/eLearning activity related features (log files); (5) instructor related features; (6) students' social information; (7) Internet access log files; (8) parents' related features; (9) students' economic related features; and (10) students' behavioral related features. Table 3 describes each of these categories.
As can be seen in Figure 6, the most commonly used features for creating ML models for predicting students' performance are "Historical records of students' grades and class performance" (present in 122 studies, 27%), followed by "Academic related features from LMSs" (98 studies, 21.6%), "Students' demographics" (97 studies, 21.4%), "Students' social information" (44 studies, 10%), "Learning/eLearning activity related features" (37 studies, 8%), and "Students' behavioral information" (31 studies, 7%). The two least used categories, "Internet access log files" (3 studies, 0.7%) and "Instructor related features" (2 studies, 0.4%), were used by only a few studies and were thus considered ad hoc features.
These findings, especially the top four categories, are in agreement with a previous systematic literature review by Abu Saa et al.,14 which showed that historical records of students' grades and class performance, students' eLearning activity, students' demographics, and students' social information are the most common features used for predicting students' performance. The systematic literature review by Shahiri and Husain12 also showed that CGPA and historical records of students' grades and class performance are the most commonly used features, followed by students' demographic features, extra-curricular activities, and students' social related features. Hellas et al.16 also indicated that course performance features, pre-course performance features, demographic features, and student engagement are the top factors used for predicting students' performance.

Table 3. Feature categories.
1. Historical records of students' grades and class performance: historical records of students' grades on assessments/courses or other performance indicators from previous courses, semesters, or years.
2. Students' demographics related features: student demographic data, including gender, age, nationality, ethnicity, etc.
3. Academic related features from LMSs: information related to students/courses/programs/universities acquired from an LMS, such as the course name, course information, school subjects, major, absences, admission entry requirements, etc.
4. Learning/eLearning activity related features (log files): the activity logs of students in eLearning systems, such as the number of logins, time spent on activities, idle time, and activity in system components such as forums.
5. Instructor related features: information related to a student's instructor and their evaluation results.

Figure 6. Distribution of studies by feature categories.

RQ4: Feature selection methods
Feature selection is the process of selecting subsets of variables that are relevant to predictive models. It reduces the number of predictor variables used in ML algorithms. It also reduces the time needed to compute ML models, improves their generalization ability, prevents overtraining, and reduces the resources (memory and CPU time) required for prediction.69,70 Three main approaches are used for feature selection: filter methods, wrapper methods, and embedded methods.71 • Filter methods employ univariate statistics to determine the intrinsic properties of the features instead of relying on cross-validation measures to assess their performance. These methods have a considerable advantage in speed and computational load compared to wrapper methods.70,72 Information Gain, the Chi-square Test, Fisher's Score, and the Correlation Coefficient are the most often used filter methods.
• Wrapper methods search through the space of possible subsets of features and assess the quality of each subset by training and evaluating a classifier on it. This approach evaluates candidate subsets against the evaluation criteria, typically using a greedy search.70,72 The most common wrapper techniques are Forward Feature Selection, Backward Feature Elimination, Exhaustive Feature Selection, and Recursive Feature Elimination.
• Embedded methods combine the advantages of wrapper and filter methods while maintaining a reasonable level of interaction between features at a reasonable computational cost.70,72 The most common embedded techniques are LASSO Regularization (L1) and Random Forest Importance.
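The three families above can be sketched with scikit-learn on a synthetic dataset; the dataset size and parameters are illustrative assumptions, and mutual information is used here as a stand-in for the Information Gain filter named above:

```python
# One example of each feature-selection family: filter, wrapper, embedded.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=8, n_informative=3, random_state=0)

# Filter: rank each feature by a univariate statistic (mutual information,
# closely related to Information Gain); no predictive model is trained.
filter_sel = SelectKBest(mutual_info_classif, k=3).fit(X, y)
print("Filter keeps features:", np.flatnonzero(filter_sel.get_support()))

# Wrapper: Recursive Feature Elimination repeatedly trains a classifier
# and drops the weakest feature each round.
wrapper_sel = RFE(LogisticRegression(max_iter=1000), n_features_to_select=3).fit(X, y)
print("Wrapper keeps features:", np.flatnonzero(wrapper_sel.get_support()))

# Embedded: selection happens inside model training. L1 regularization
# shrinks uninformative coefficients to exactly zero; Random Forest
# exposes a per-feature importance ranking.
l1 = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
print("L1 keeps features:", np.flatnonzero(l1.coef_[0] != 0))
rf = RandomForestClassifier(random_state=0).fit(X, y)
print("Top RF importances:", np.argsort(rf.feature_importances_)[::-1][:3])
```

The filter step is the cheapest, as noted above, because it needs no model training; the wrapper step trains a classifier once per eliminated feature.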
In this systematic literature review, only 56 research articles considered using feature selection methods/algorithms in their models to identify the features most relevant for predicting students' performance. Forty-nine (49) research articles explicitly mention the methods/algorithms used in their feature selection process, where 7 research articles

Figure 8. Most commonly used ML software and tools (percentages).

RQ5: Student performance and learning environments
Next, we classify studies based on learning environments: e-learning versus traditional and blended learning environments (see Figure 7). Traditional and blended learning environments have face-to-face contact between instructors and students (e.g., on-campus labs, tutorials, etc.), while in e-learning, all learning interactions are online. Since all interactions in an e-learning environment are online, e-learning platforms can collect a more complete dataset of students' interactions and behavior using log files. Despite this, we observe that more studies in the literature (62% of the studies in this survey) use datasets from traditional and blended learning environments than from e-learning (38%). These results address the RQ: "What learning environments have ML methods been applied to for predicting student performance?"

RQ 6: Tools and platforms
This section outlines the tools/platforms/libraries used to implement ML algorithms in the literature, addressing the RQ: "What are the main ML tools/platforms/libraries used for predicting student performance?" Although ML algorithms can be implemented from scratch, a number of programming language-specific libraries as well as tools/platforms are available. It is no surprise that many studies in the literature used these libraries/tools/platforms to implement ML algorithms (Figure 8). It is noteworthy that not all papers specified the ML implementation language/tool used. Thus, Figure 8 is derived from the 106 reviewed papers that mentioned the tools/platforms/libraries used in their studies. As seen in Figure 8, WEKA is the most common platform used in the literature (used by 42 studies, 40%), followed by Python (used by 34 studies, 32%).

RQ 7: Remedial actions for at-risk students
Predicting student performance has received a lot of attention in the literature.74-78 In these research studies, interventions have generally been determined by the instructor, who contacts the identified at-risk students personally (usually via email),76 uses pre-defined email templates that are automatically sent,75,77 or manually contacts students anticipated to be at risk.64 The effectiveness of these treatments has been demonstrated in lowering the dropout rate,64 increasing retention,76 enhancing students' achievements and grades,75 and addressing the causes of why students struggled.77 Although it is evident that actions based on predictions of student performance can have a substantial influence, we note that very few research studies have looked at this topic. Purdue's Course Signals initiative was among the first to use student performance predictions in higher education.76 Course Signals' predictive models detect at-risk students using four data sources: current course grades, Blackboard engagement data, past academic history such as high school GPAs, and demographic data such as residence and age. Based on the outcomes of the predictive models, students receive a traffic-light style signal indicating their risk level.76 Across several different cohorts at Purdue, the approach has increased both the success and retention rates.
Jayaprakash et al.75 described the Open Academic Analytics Initiative, which created an open-source platform to identify at-risk students early in the semester to enable interventions. Four data sources (student demographics, course grades and related data, student interaction data with the LMS, and students' progress towards the final grade so far) were used to determine whether a student was at risk of failing their courses. Logistic Regression, SVMs employing Sequential Minimal Optimization, J48 Decision Trees, and NB were initially used to develop predictive models. Logistic Regression performed better than the other algorithms and was chosen as the eventual predictive model. Two intervention strategies, the "Awareness Messaging Intervention" and the "Online Academic Support Environment Intervention", were also evaluated in their study. The "Awareness Messaging Intervention" group received a message informing them that they were at risk of failing the course, and instructors were encouraged to suggest actions (such as visits during office hours, setting up appointments with tutors, accessing web-based resources, etc.). The "Online Academic Support Environment Intervention" group received a similar message, except that instead of specific instructions, the students were invited to join institution-wide online support services, such as access to Open Educational Resources instructional materials (e.g., Khan Academy videos, Flat World Knowledge textbooks, etc.) and a variety of mentoring from peers and professional support personnel. In Spring 2012, a comparison between the control and intervention groups was performed (1739 students were enrolled in the study, with 451 students identified as being at risk). The findings indicated a 6% improvement in final grades in the two intervention groups compared to the non-intervention control group, and a statistically significant difference in final grades between the control group and the intervention groups. There was no statistically significant difference between the two intervention groups.
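A minimal sketch of an at-risk logistic regression model in the spirit of the initiative described above; the four features and the synthetic data below are illustrative inventions standing in for the four data sources, not the published pipeline:

```python
# Toy logistic-regression risk model over four illustrative data sources.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 300
# Illustrative stand-ins for: demographics, course grades, LMS interaction,
# and progress towards the final grade so far.
X = np.column_stack([
    rng.integers(18, 40, n),        # age (demographic)
    rng.uniform(0, 100, n),         # current course grade
    rng.poisson(30, n),             # LMS logins this term
    rng.uniform(0, 1, n),           # fraction of final grade earned so far
])
y = (X[:, 1] < 50).astype(int)      # toy label: at risk if grade below 50

model = LogisticRegression(max_iter=1000).fit(X, y)
risk = model.predict_proba(X)[:, 1]  # probability of being at risk
print("students flagged at risk:", int((risk > 0.5).sum()))
```

The probability output is what makes such a model useful for interventions: students can be ranked by risk, and messaging can be triggered above a chosen threshold.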
Burgos et al.64 were able to identify students who were likely to withdraw from a course early in the term based on their assessed activities. With 104 students across five courses, the prediction model was deployed to identify at-risk students who could otherwise drop out. At week 10 (the middle of the term), the Logistic Regression model achieved an accuracy of 97.13%, precision of 98.95%, recall of 96.7%, and specificity of 97.1%. An intervention tutoring plan was also developed, allowing academics/instructors to intervene at different times during the semester (weeks 4, 7, and 10) to advise students at risk of dropping out of the course. The dropout rate was reduced by 14% compared with previous cohorts when instructors and academic advisors contacted these students by email and phone.
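The four metrics reported above all derive from a binary confusion matrix; a minimal sketch, with counts that are illustrative rather than Burgos et al.'s actual data:

```python
# Deriving accuracy, precision, recall, and specificity from a binary
# confusion matrix for an "at-risk" (positive) vs "not at-risk" classifier.
def binary_metrics(tp, fp, tn, fn):
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # all correct / all predictions
    precision = tp / (tp + fp)                  # flagged at-risk who truly are
    recall = tp / (tp + fn)                     # true at-risk that were caught
    specificity = tn / (tn + fp)                # not-at-risk correctly cleared
    return accuracy, precision, recall, specificity

# Illustrative counts for a 100-student cohort.
acc, prec, rec, spec = binary_metrics(tp=30, fp=2, tn=66, fn=2)
print(f"accuracy={acc:.3f} precision={prec:.3f} "
      f"recall={rec:.3f} specificity={spec:.3f}")
```

Reporting specificity alongside recall matters in this setting: recall measures how many truly at-risk students are caught, while specificity measures how many safe students avoid being contacted unnecessarily.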
A group of academics at a fully online university (Universitat Oberta de Catalunya (UOC)) have been working to create an Early Warning System (EWS) and an intervention strategy to identify students who are most likely to fail. The EWS employs a variety of classifiers, including NB, CART Decision Trees, K-NN, and SVMs.77 The prediction models are created using data from the UOC data mart, which includes information on students' involvement and interaction (navigation, interaction, contact with LMSs, etc.), student profiles, and course performance data (student enrolment, assessment, etc.). The prediction results are shown to students using dashboards displaying Green-Amber-Red signals.
Additionally, the minimum marks necessary in the following assessment to receive a pass grade in the course are also shown.77,79,80 Interventions are automatically provided through email when at-risk students have been identified.81 The EWS and interventions have been evaluated77,79-81 in a variety of courses at UOC. The results are encouraging, showing higher pass rates, lower dropout rates, and positive student and instructor acceptance. Bañeres et al.82 extended this work to predict students at risk of dropping out. When a student is detected as being at risk of dropping out of a course, the intervention system automatically sends them personalized emails. The intervention was evaluated in a large online course at UOC, and the end-of-course dropout rate for students who participated in the intervention was significantly lower (by 12%) than for those who did not.
Albreiki et al.83 proposed a framework that employs ML algorithms and a rule-based model to identify students who are at risk of failing their courses and to provide remedial actions. Prediction models classifying student performance as Good/At-Risk/Failed were developed for two datasets using eight different ML algorithms. Dataset 1 comprised student performance (assessment) data for a course, and Dataset 2 comprised both student performance data and historical data (such as student demographics, past performance, etc.). Of the eight ML methods, ExtraTrees performed the best, with an accuracy score of 0.86 and an AUC score of 0.96 for Dataset 1, and 0.87 and 0.95, respectively, for Dataset 2. A framework for automatic remedial actions, mapping course learning outcomes to topics, was established, with instructors determining a list of remedial actions that can be provided to predicted at-risk students at different times (checkpoints). The framework is yet to be implemented and tested in real-world settings.
Borrella et al.74 evaluated four treatments to reduce dropouts in Massive Open Online Courses (MOOCs). The first two treatments used ML prediction models to identify and intervene with students at risk of dropping out. The first intervention consisted of an email sent to at-risk students with the intention of increasing their motivation before an important assessment. The second intervention provided at-risk students with exam preparation resources and study suggestions. In the third and fourth interventions, data analysis was used to identify at-risk students as intervention targets. In the third intervention, the difficulty of the assignments was gradually increased. In the fourth intervention, the most difficult sections of the course were selected for redesign with scaffolding to help students gradually improve their understanding of the content. The first and second interventions had no statistically significant effect on the dropout rate, whereas the third and fourth interventions produced a statistically significant reduction in the dropout rate.
In conclusion, taking prompt actions based on identifying students who are at risk of failing and/or dropping out shows promise.However, it is evident that only a few studies have considered this important area, and this is a gap that future studies should consider exploring.

DISCUSSION
This study reports a systematic literature review on the use of ML algorithms for predicting students' academic performance. The study aimed to identify: i. the aims of applying ML algorithms and methods in predicting student performance; ii. the ML methods and algorithms for predicting student performance; iii. the features/factors that affect the prediction of students' performance; iv. the methods employed for feature selection; v. the learning environments where ML methods have been applied to predict student performance; vi. the ML tools/platforms/libraries used to implement ML algorithms for predicting student performance; and vii. the actions taken based on the prediction results. The study reviewed 162 research articles related to ML, and specifically to EDM, for predicting students' performance, and presents the distribution of research across multiple dimensions.
In terms of the first research direction, we have seen many studies apply ML algorithms, both supervised and unsupervised, to extract hidden knowledge and patterns from student data. Many applications have used classification (138 studies), regression (25 studies), or clustering (9 studies) methods on educational datasets to predict students' performance in exams, predict whether a student will pass certain courses, predict students' CGPA, predict students' graduation time, and more. We broadly classified the main goals of applying ML algorithms and methods for predicting student performance (in assessments, courses, or programs) as: (i) predicting student academic performance and identifying students at risk of failure, and (ii) predicting students at risk of dropping out. Our synthesis showed that most research studies have focused on predicting academic performance to detect students at risk of failure (129 studies), while fewer studies have focused on predicting students' dropout and retention (33 studies). Interestingly, most of the dropout/retention prediction models are proposed in online or MOOC environments due to the highly prevalent issue of students dropping out in these environments. These findings are also consistent with the findings in Albreiki et al.15 Features used for predicting student performance can vary based on several factors, such as data availability, the type of educational institution, specific courses, and the goals of the prediction model. Our analysis revealed that historical records of students' grades and class performance were used in 122 studies, academic related features from LMSs in 98 studies, and students' demographic data in 97 studies. These are the top categories of features used for predicting students' performance. The findings here are consistent with those of Abu Saa et al.,14 Shahiri and Husain,12 and Albreiki et al.,15 where historical records of students' grades and class performance, students' e-learning activity, and students' demographics are mainly considered for predicting students' performance. However, the relative importance of each feature will depend on the specific educational context and the goals of the prediction model.
As mentioned earlier, the three main approaches used for predicting students' performance in the literature are classification, regression, and clustering. More than 80% of the studies used classification to predict student performance. The main reason is the nature of the problem, where most research studies aimed to group/classify students who are at risk of either failing or dropping out of a course. A second reason could be the nature of the datasets/features used for creating prediction models, where most of the datasets are labelled and thus appropriate for supervised learning methods (i.e., classification or regression). Fewer studies (14.5%) applied the regression approach, and only nine studies applied the unsupervised approach (i.e., clustering). Clustering is typically considered an exploratory technique because it helps discover patterns and groupings in data rather than making predictions.
There are several reasons why supervised learning is a popular approach for predicting student performance,1,84 including: i. Availability of labelled data: in supervised learning, the algorithm is trained on a labelled dataset, meaning that the data includes the target variable (e.g., "pass" or "fail") and input features (e.g., past grades, attendance, demographics, etc.). This type of data is easily accessible in educational settings. ii. Predicting a continuous or categorical output: in the context of student performance prediction, the target variable is often a continuous value (e.g., a numerical grade) or a categorical value (e.g., pass/fail). Supervised learning algorithms, particularly regression and classification algorithms, are well suited to these types of prediction problems. iii. Understanding relationships between inputs and outputs: supervised learning algorithms can uncover the relationships between input features and the target variable, which can provide valuable insights into what factors influence student performance.
However, the unsupervised approach (i.e., clustering) can still be used to make predictions about student performance by using the cluster assignments as features in a supervised learning model. It can also be used to improve classification accuracy, as in Almasri et al.,85 who proposed a cluster-based classifier that groups historical student records into a set of homogeneous clusters. By using clustering in this way, one can leverage the information gained from exploratory analysis to make predictions about student performance. This approach can provide insights into the relationship between the features and the target variable and can help improve the accuracy of the predictions.
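The cluster-then-classify idea described above can be sketched as follows, with cluster assignments appended as an extra input feature for a supervised model; the synthetic data and cluster count are illustrative assumptions, not Almasri et al.'s exact pipeline:

```python
# Using unsupervised cluster labels as an extra feature for classification.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=400, n_features=6, random_state=0)

# Unsupervised step: group records into homogeneous clusters (labels unused).
clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

# Supervised step: append the cluster assignment as an additional feature.
X_aug = np.column_stack([X, clusters])
score = cross_val_score(RandomForestClassifier(random_state=0), X_aug, y, cv=5).mean()
print(f"CV accuracy with cluster feature: {score:.2f}")
```

Whether the extra feature helps depends on how well the cluster structure aligns with the target variable; comparing cross-validated accuracy with and without it is the usual check.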
More than 50 different ML algorithms are used in the literature, with Decision Tree, Random Forest, and NB being the most commonly used.
We found many different methods/algorithms in the literature for selecting features that are relevant to predictive models. These methods/algorithms are helpful, especially for large datasets, in reducing the computational time and resources needed to develop a predictive ML model. However, fewer than 40% of the research articles in the literature considered adopting feature selection methods/algorithms for their models. Many studies tested their predictive models with and without feature selection methods,86-90 and their models produced better results when feature selection methods were applied. Albreiki et al.'s systematic literature review15 also revealed that most research articles did not consider feature selection methods/algorithms for identifying features that are relevant to the prediction models.
There could be several reasons why many research articles in the literature do not use feature selection methods in their student performance prediction models: i. Insufficient data: if the data collected for the study is limited in size, feature selection may not be necessary, as the model will not over-fit the data and the number of features will be relatively low; ii. Data is already cleaned: if the data collected is already cleaned and does not contain a large number of redundant features; iii. Strong feature interpretation: if the researchers believe that the features included in the model are already highly informative and strongly interpretable; and iv.93 These reasons/factors may influence a researcher's decision to use or not use feature selection methods in their studies. We have seen many research studies in the literature proposing models for predicting students' performance and identifying at-risk students. However, limited work exists on actions that can be taken based on these prediction results and on evaluating their impact on educational outcomes. Fewer than 10% of the research articles in this literature review considered taking timely actions based on prediction results. These studies64,73-78 have shown promising results, with improved student pass rates, reduced dropout rates, increased retention, and enhanced students' achievements and grades.
As can be seen in this literature review, ML has great potential to predict students' performance and accurately identify students who are at risk. However, most of these predictive models are not easily interpretable/explainable to those unfamiliar with them, especially instructors, and are often considered black-box models. When a model is used for decision-making, it is important to explain the reasons for a specific decision. The benefits of using explainable ML models for predicting student performance include94-96: • Transparency: explainable models provide a clear understanding of how the model makes its predictions, allowing stakeholders to understand the reasoning behind them. This can help build trust in the model and ensure that its predictions are aligned with educational goals and values.
• Fairness: Explainable models can help identify potential sources of bias in the data used to train the model, allowing organizations to address these issues and ensure that the model is making fair predictions.
• Improved model performance: By understanding how the model is making predictions, stakeholders can identify areas where the model may be struggling and adjust the data or model to improve performance.
• Compliance: In some cases, regulations may require that organizations use explainable models, especially when sensitive data is involved.
Overall, the use of explainable ML models for predicting student performance can lead to better decision-making, improved model performance, and increased trust in the predictions made by the model. However, we found only two research articles83,97 in this literature review that considered adopting explainable ML predictive models. For example, Albreiki et al.83 proposed an explainable ML model in which academics are able to understand the causes affecting students' performance and why students are classified as being at risk. Specifically, they provided academics with visualizations of the 10 most important features that led to the prediction results and demonstrated the strength of each feature in the prediction. These types of models are needed, especially when non-experts make decisions or take actions based on the predictive model results.
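One simple, model-agnostic route to the kind of feature-level explanation described above is permutation importance; the sketch below uses synthetic data and generic feature names as illustrative assumptions, not Albreiki et al.'s model:

```python
# Ranking the most influential input features with permutation importance.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=12, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn; the resulting accuracy drop on held-out
# data measures how much the model relies on that feature.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
ranking = result.importances_mean.argsort()[::-1]
for i in ranking[:5]:
    print(f"feature_{i}: importance={result.importances_mean[i]:.3f}")
```

A ranked list like this (with real feature names such as "assignment 2 grade" or "LMS logins") is the kind of output that can be visualized for instructors who are not ML experts.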

CONCLUSION AND FUTURE RESEARCH DIRECTIONS
This literature review considered 162 papers published between January 2010 and October 2022 that aimed to predict student performance using ML methods. We classified the main goals of predicting student academic performance into two broad categories: (i) predicting student performance in assessments, courses, or programs and identifying students at risk of failure (129 studies); and (ii) predicting student dropout or retention in a course or program (33 studies). We also observed that classification is the most popular approach used in the prediction of student performance (138 studies), followed by regression (25 studies) and clustering (9 studies). This makes sense, as predicting student performance can be modelled as a classification problem, where students' results are grouped into different levels, such as pass or fail in binary classification or different grades in multiclass classification. Also, historical data provides the dataset required for training such models. The supervised learning approach is widely used, as most datasets consist of labelled data. In terms of ML algorithms, more than 50 distinct ML algorithms were used by the 162 papers reviewed. The most commonly used features for predicting students' performance are historical records of students' grades and class performance (present in 122 studies), followed by academic related features from LMSs (98 studies) and students' demographics (97 studies). Around 30% of the research articles adopted feature selection methods for selecting features relevant to the prediction models. We identified 32 distinct feature selection methods/algorithms used in the literature. The most frequently used feature selection methods are Information Gain-based selection algorithms, Correlation-based feature selection, Gain Ratio, and Relief Attribute Evaluation. Most of the studies were conducted on data collected through traditional/blended learning environments rather than e-learning environments. We also identified 12 different tools/platforms/libraries used to implement ML algorithms for developing prediction models. Finally, very few research articles considered evaluating actions that can be taken based on the prediction results and their impact.
The EDM field aims to gain knowledge and insights from educational data for better outcomes in education. Through this study, we also observed areas that future research can focus on to achieve better educational outcomes. One of the main goals of prediction is to identify at-risk students in the early stages and assist them in real time.54,59,60 However, in this literature review, we observed that most studies focus on prediction (data, models, accuracy, ML methods, etc.), while few focused on actions that can be taken based on the prediction results. Those studies that took actions based on prediction results demonstrated potential for better educational outcomes. Thus, we feel that there is potential for future studies to also focus on taking timely action based on prediction results. The challenge is to build systems that offer prediction of students' performance not only as an early and effective warning, but also provide meaningful, valuable, and timely feedback to students and interventions to improve outcomes.48
Another area that can lead to better educational outcomes from ML prediction results is identifying the causes/reasons behind such predictions. However, few studies focus on the causes of such results. In this literature review, we observed one study39 that investigated the correlation between programs' admission requirements/criteria and students' performance. In their work, Adekitan and Noma-Osaghae39 examined the association between Nigerian universities' acceptance requirements for selecting eligible prospective candidates and the students' academic success during the first semester. Their results indicated that students' performance in the first year is not correlated with the entry requirements. In general, a strong correlation between entry requirements and success in the program is expected. This points to a need for further investigation into whether entry requirements are appropriately aligned to the programs for which they are used.
Investigating the correlation of learning activities to student performance can lead to better course design.García-Torres et al. 98 investigated and evaluated the relevance of Physics course activities to overall students' performance.Their results indicated that there are several activities that are not relevant or helpful towards students' performance in the course.These results can lead to new insights and improvements in course designs, activities, and assessments.
The course curriculum usually changes over the years, and this process can impact the performance of prediction models. Many studies in the literature use assessment data (students' marks) as features in their prediction models. In the next term/year, however, the weights, types, and number of assessments may differ, and the prediction models' performance can be affected by these changes. This issue will likely affect the reliability of ML prediction models, especially in the first instance that course revisions are applied. Therefore, evaluations of prediction models' reliability and accuracy need to consider changes in the data (and other features used in prediction models) as courses and programs evolve. Also, each iteration of a course provides additional data that can be used to further improve the accuracy of ML prediction models. Deep learning, a class of ML algorithms, may be better suited for prediction in evolving course structures and needs further investigation.
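One way to reduce a model's sensitivity to year-to-year changes in assessment weighting, as a hedged illustration, is to store assessment features as weight-independent fractions rather than raw marks. A minimal sketch, with hypothetical assessment names and maxima:

```python
# Sketch: normalize raw assessment marks into weight-independent features
# so a predictor is less sensitive to changes in assessment weighting
# between course iterations. Names and maxima are hypothetical.
def normalize_marks(marks, max_marks):
    """Return each assessment mark as a fraction of its maximum."""
    return {name: marks[name] / max_marks[name] for name in marks}

# Same course, two iterations with rescaled assessments.
marks_2021 = {"quiz": 8, "assignment": 18}   # out of 10 and 20
marks_2022 = {"quiz": 16, "assignment": 27}  # out of 20 and 30
print(normalize_marks(marks_2021, {"quiz": 10, "assignment": 20}))
print(normalize_marks(marks_2022, {"quiz": 20, "assignment": 30}))
```

With this representation, a quiz feature means "fraction of available quiz marks earned" in every cohort, so models trained on earlier iterations remain applicable after a reweighting.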
In summary, we hope that researchers can build on the findings of this systematic literature review with an understanding of existing work on using ML to predict student performance, and also consider future research directions. While this study focused on how ML algorithms can predict student performance in general, without considering features such as students' degree level, ethnicity, background, disabilities, university program and others, future literature reviews can expand the scope of the current work by considering how ML has been used in these different contexts. For instance, it would be interesting to examine what factors need to be considered in predicting the performance of international students, postgraduate coursework and research students, students with disabilities, indigenous students, students in different programs (Medicine vs. Computer Science vs. Humanities), and students from different countries, among others.

FIGURE 2 Year-wise distribution of included publications.

FIGURE 4 Studies grouped by ML methods (frequency and percentage).

FIGURE 5 Frequency of the most common ML algorithms used in the literature.
The Course Signals system categorizes students into three groups: red signals indicate students are likely to fail, yellow signals suggest students might fail, and green signals indicate students are likely to pass their course. Academics create an intervention strategy based on the outcomes of the students' signals, which may consist of posting the students' signals on the LMS home pages, sending them customized emails or texts, referring them to academic advisors or academic resource centers, or scheduling a face-to-face meeting. Purdue University utilized Course Signals in the 2007, 2008, and 2009 cohorts to assess the initiative's impact on student performance and retention.
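The actual Course Signals algorithm is proprietary and not reproduced here; as a minimal sketch, a model's predicted pass probability could be mapped to the three signal groups with hypothetical thresholds:

```python
# Minimal sketch: map a predicted pass probability to a
# Course-Signals-style traffic light. Thresholds are hypothetical,
# not Purdue's actual cut-offs.
def signal(pass_probability: float) -> str:
    if pass_probability < 0.4:   # likely to fail
        return "red"
    if pass_probability < 0.7:   # might fail
        return "yellow"
    return "green"               # likely to pass

for p in (0.25, 0.55, 0.90):
    print(f"{p:.2f} -> {signal(p)}")
```

In practice, the thresholds would be tuned on historical cohorts so that each color corresponds to a meaningful difference in observed pass rates.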
Common ML algorithms used in regression problems in the education domain include Multiple Linear Regression, Polynomial Regression, Decision Tree Regression, Random Forest Regression, and Support Vector Regression.
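As a concrete illustration, the regressors named above could be fitted on synthetic student data using scikit-learn (an assumed library; the "study hours vs. final mark" data and hyperparameters here are hypothetical):

```python
# Sketch: fitting the common education-domain regressors on
# synthetic "weekly study hours -> final mark" data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(0, 40, size=(200, 1))           # weekly study hours
y = 30 + 1.5 * X[:, 0] + rng.normal(0, 5, 200)  # synthetic final mark

models = {
    "multiple linear": LinearRegression(),
    "polynomial": make_pipeline(PolynomialFeatures(2), LinearRegression()),
    "decision tree": DecisionTreeRegressor(max_depth=4),
    "random forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "support vector": SVR(kernel="rbf", C=10.0),
}
for name, model in models.items():
    model.fit(X, y)
    print(f"{name}: R^2 = {model.score(X, y):.2f}")
```

On real student data, the comparison would of course use held-out test sets or cross-validation rather than training-set R^2.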
TABLE 1 Top 25 most-cited articles in this literature review.

… Data mining approach to predicting the performance of first year student in a university using the admission requirements
12. [40] Can online discussion participation predict group project performance? Investigating the roles of linguistic features and participation patterns
13. [41] Systematic ensemble model selection approach for educational data mining
14. [42] Learning to represent student knowledge on programming exercises using deep learning
15. [43] Transfer learning from deep neural networks for predicting student performance
16. [1] Student academic performance prediction using supervised learning techniques
17. [44] Analysis of educational data mining using classification
18. [45] Predicting secondary school students' performance utilizing a semi-supervised learning approach
19. [46] EMT: Ensemble meta-based tree model for predicting student performance
20. [47] Tracking student performance in introductory programming by means of machine learning
21. [48] Interpretable multiview early warning system adapted to underrepresented student populations
22. [49] Ensemble learning for estimating individualized treatment effects in student success studies
23. [50] Implementing AutoML in educational data mining for prediction tasks
24. [51] Multi-split optimized bagging ensemble model selection for multi-class educational data mining
25. [52] Student performance prediction model based on supervised machine learning algorithms

a No. of citations was retrieved from Scopus and Web of Science on 10/2/2023.
Computation complexity of ML algorithms.
Note: https://www.thekerneltrip.com/machine/learning/computational-complexity-learning-algorithms/.
Abbreviations: (k), number of nearest neighbours; (d), number of dimensions; (n), number of training samples; (p), number of features; (n trees), number of trees (for tree-based methods); (n sv), number of support vectors; (n li), number of neurons at layer (i) in a neural network.
TABLE 4 List of feature selection methods/algorithms used in the literature.

We have identified 68 distinct feature selection methods/algorithms used in the literature across the 49 research articles. As shown in Table 4, the top four methods/algorithms used in the literature are Information Gain-based selection algorithms, Correlation-based feature selection, Gain Ratio Attribute Evaluator Filter (Gain Ratio), and Relief Attribute Evaluation. Information Gain-based selection algorithms are the most used in the literature (23.5%), appearing in 16 studies. Correlation-based feature selection was used in 7 research articles (10%), followed by Gain Ratio, used in six research articles (9%), and Relief Attribute Evaluation, used in 5 research articles (7.5%).

Studies grouped by learning environments.
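For illustration, an information-gain-style feature ranking can be approximated with mutual information, for example via scikit-learn's mutual_info_classif (an assumed library; the features and pass/fail labels below are synthetic):

```python
# Sketch: rank candidate student features by mutual information with a
# pass/fail label, as one common implementation of information-gain-style
# feature selection. Data are synthetic.
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(42)
n = 300
attendance = rng.uniform(0, 1, n)          # fraction of classes attended
forum_posts = rng.integers(0, 50, n)       # LMS forum activity
noise = rng.uniform(0, 1, n)               # irrelevant feature

# Synthetic label driven mainly by attendance.
passed = (attendance + 0.1 * rng.normal(size=n) > 0.5).astype(int)

X = np.column_stack([attendance, forum_posts, noise])
scores = mutual_info_classif(X, passed, random_state=0)
for name, s in zip(["attendance", "forum_posts", "noise"], scores):
    print(f"{name}: {s:.3f}")
```

Features scoring near zero (here, the deliberately irrelevant column) are the candidates such methods discard before model training.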