A novel task scheduling approach for dependent non-preemptive tasks using fuzzy logic

The multiprocessor task scheduling problem is a pressing problem that affects system performance and is still being investigated by researchers. Finding an optimal schedule is considered to be computationally hard. Recently, researchers have used fuzzy logic in task scheduling to achieve optimal performance, but this area of research is still not well investigated. In addition, most of the existing scheduling algorithms that use fuzzy logic target uniprocessor systems. This article presents a new algorithm in which the priorities of the tasks are derived from fuzzy logic and the bottom-level parameter. The approach is designed to find task schedules with optimal or sub-optimal lengths in order to achieve high performance in a multiprocessor environment. In the proposed algorithm, the precedence constraints between the non-preemptive tasks and their execution times are known and described by a directed acyclic graph. The number of processors is fixed, the communication costs are negligible and the processors are homogeneous. The suggested technique is tested on the Prototype Standard Task Graph Set and compared with the optimal schedules reported for it.


1 | INTRODUCTION
Task scheduling is one of the main factors that affect the performance of multiprocessor systems. The task scheduling problem is considered to be NP-complete [1]; that is, no polynomial-time algorithm for finding an optimal schedule is known, so exact methods become impractical as the number of tasks grows. In multiprocessor systems, the tasks must be allocated to the processors such that the total Makespan is as low as possible. Many approaches have been developed to solve the task scheduling problem on multiprocessor systems. Some of them are heuristic-based approaches [2][3][4], some are evolutionary approaches [5][6][7][8][9][10][11][12] and some follow hybrid methods [13][14][15][16].
There are various heuristic-based procedures for solving the multiprocessor task scheduling problem. The best of them are based on the task list technique, known as list scheduling [17], which is widely regarded as a promising approach because it pairs low complexity with good results. It builds a list of tasks ordered by descending priority; the first task in the list is then selected and assigned to a processor. Most list-scheduling techniques consist of two phases. The first is the task prioritizing phase, in which a priority is calculated and given to each task of the Directed Acyclic Graph (DAG), and the second is the processor selection phase, in which each task is assigned to a processor in the order of its priority. A scheduling algorithm is considered static if the processor selection phase begins after the task prioritizing phase has ended, and dynamic if the two phases are interleaved. In other words, static scheduling takes place at compile time, before the parallel application runs, whereas in dynamic scheduling the scheduling decisions are made at run time. Assigning priorities is an important issue that affects the performance of scheduling techniques, and many attributes can be used for it. Some scheduling techniques use the top level (t-level) and the bottom level (b-level) to assign priorities to tasks. There are several level-based heuristics [18] [3], as well as Heavy Node First (HNF) [17]. All of these work on the level concept in the DAG without taking the communication cost into account. Table 1 summarizes the contributions of these algorithms and of the proposed algorithm.
In recent years, numerous researchers have applied fuzzy techniques to task scheduling in order to obtain optimal performance, but this area of research is still not well investigated. In addition, multiple scheduling algorithms have been developed to guarantee the time constraints of real-time processes. These algorithms usually base their decisions on parameters that are assumed to be crisp [19,20]. Nevertheless, in many situations, the values of those parameters are vague. This vagueness suggests that fuzzy logic can be used to determine the order in which requests should be executed, so that the system is properly utilized and the chance of missing a request is reduced [21,22]. Numerous scheduling algorithms use fuzzy logic, such as [23], but some of them target uniprocessor systems, such as [24], as shown in Table 1.

TABLE 1 The contributions of the mentioned algorithms and the proposed algorithm

Algorithm: Contribution

Highest level first with estimated times (HLFET) [18,25,26]
• It is considered to be the first list-scheduling algorithm designed for homogeneous multiprocessor systems. It is a priority-based algorithm that uses the static b-level for assigning priorities to tasks.
• No communication costs are considered in this algorithm.
• The drawback of the HLFET algorithm is that it does not utilise the idle time slots of the processors, which degrades its performance.

Highest levels first with no estimated times (HLFNET) [18,25,26]
• In this algorithm, all tasks are assumed to have the same execution time and their priorities are their levels.
• It can be applied when HLFET is desirable but the tasks' execution times cannot be known in advance.

Smallest co-levels first with estimated times (SCFET) [18,25,26]
• It is a list-scheduling algorithm in which a task's priority depends on its co-level.
• The co-level of a task is the length of the longest path from the entry node to that task. The smaller a task's co-level, the higher its priority.

Smallest co-levels first with no estimated times (SCFNET) [18,25,26]
• It is the same as SCFET, except that all tasks are assigned equal weights on the assumption that they have the same execution time.

Critical path/most immediate successors first (CP/MISF) [3,26,27]
• CP/MISF is proposed as a variation of the HLFET algorithm.
• It assigns high priorities to tasks that have high level values and a large number of immediate successors.

Heavy node first (HNF) [17,27]
• The general idea behind HNF is to use local analysis of the DAG nodes at each level. Tasks are assigned level by level and the heaviest node is assigned first.
• It produces an optimal solution when tasks are assigned a unit weight and the number of nodes is small.
• Since HNF uses local information from the DAG, the DAG can be divided into small levels and each one can be processed independently.

New fuzzy scheduling algorithm (NFSA) [23]
• It is designed for independent periodic tasks whose periods are equal to their deadlines.
• This algorithm assigns priorities to tasks using a single fuzzy inference system (FIS). The inputs to the FIS are the task's arrival time, deadline and computation time, while the output is the run-time priority of the task, which is used to schedule tasks on a multiprocessor.

Unnamed algorithm [24]
• The author has not given a name to this algorithm. It is designed for non-periodic independent tasks on uniprocessor soft real-time systems.
• This algorithm depends on two fuzzy inference engines in decision making. The first FIS is used to calculate the tasks' priorities, while the second FIS is used to modify these priorities when a new task arrives, taking the deadlines into account.

Priority-Fuzzy-B-Level algorithm (PFB) [proposed]
• A scheduling algorithm that uses a single fuzzy inference engine along with the bottom level parameter to solve the scheduling problem for real-time systems with non-preemptive tasks that have precedence relations and timing constraints. The communication costs between tasks are negligible.
• The performance metrics of the developed algorithm are evaluated and compared with those of a selected online benchmark. The comparison is done in terms of Makespan, minimum schedule length, speedup and efficiency.

Consider a Directed Acyclic Graph $G = \{V, E\}$, where $V$ is a set of nodes (tasks) and $E$ is the set of directed edges that represent the precedence constraints between the tasks. Each edge has a source node, which is the parent node, and a sink node, which is the child node. A node that has no parent is called an 'Entry node', while a node that has no children is called an 'Exit node'. For any edge $E_{ij}$ between two nodes $T_i$ and $T_j$, $T_i$ precedes $T_j$. The height of any task $T_i$ can be calculated using Equation (1):

$$ height(T_i) = \begin{cases} 0, & \text{if } PRED(T_i) = \emptyset \\ 1 + \max_{T_j \in PRED(T_i)} height(T_j), & \text{otherwise} \end{cases} \tag{1} $$

where $PRED(T_i)$ is the set of all nodes that precede $T_i$, and $T_j \in PRED(T_i)$. Equation (1) implies that the height of a task is always greater than the height of each of its predecessors. Hence, for any edge $E_{ij}$ from $T_i$ to $T_j$, $height(T_i)$ is less than $height(T_j)$, and $T_i$ must be executed before $T_j$ so that the precedence constraints are preserved. Figure 1 shows an example of a DAG with 11 tasks, together with their execution times and heights; the node label stands for the task ID. Nodes 0 and 11 are dummy nodes with zero execution times.
This article proposes a new algorithm called 'Priority-Fuzzy-B-Level' (PFB). This approach uses a fuzzy logic engine and the bottom level parameter for assigning priorities to tasks.
The motivation behind this work is to find a task schedule that achieves high performance on a multiprocessor system by finding optimal or sub-optimal schedule lengths. The proposed algorithm is based on a deterministic model, which means that the precedence constraints between tasks and their execution times are known in advance and are described by means of a DAG. The number of processors is fixed, and the communication costs are negligible. The tasks are non-preemptive: a task must complete its execution before another task starts executing on the same processor. The processors are homogeneous, which means that they have the same speed or processing capability, and they are fully connected with each other through identical links.
The rest of this article is organized as follows. The proposed system model is explained in Section 2. Section 3 is dedicated to the simulation results and discussion. Finally, Section 4 concludes the paper and outlines future work directions.
A dedicated processor called the 'Central Scheduler' is responsible for assigning tasks to their appropriate processors. When tasks arrive, they are placed in a queue called the Task Queue, in which they wait to be scheduled. There are two additional queues, the Waiting Queue and the Ready Queue. The Waiting Queue is reserved for tasks that need to be scheduled but have to wait because of their precedence constraints. The Ready Queue is dedicated to tasks that have finished waiting in the Waiting Queue and need to be assigned to a processor. In addition, a fuzzy inference engine is embedded in the scheduler; it assigns priorities to tasks depending on their heights and execution times. All these parts are controlled by the Central Scheduler according to the proposed algorithm, as explained in the following section. The Scheduler stops working when the three queues become empty.

FIGURE 1 An example of a DAG
FIGURE 2 Scheduler model block diagram
Figure 3a shows the block diagram of the proposed fuzzy inference system. Its input stage contains two variables: the height of the node in the DAG (HEIGHT) and its execution time (EXT). The output stage consists of one variable, the priority (PRIORITY). Membership functions are defined for these variables, and the rule base maps the fuzzified inputs to the output variable. The membership functions and the surface of the FIS are given in Figure 3b-e. Nine rules are created and a Mamdani-type fuzzy inference system is built. Some of these rules are mentioned below:
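To make the inference step concrete, the following is a minimal sketch of a two-input Mamdani engine. Everything specific in it is an assumption made for illustration: the inputs are taken to be normalised to [0, 1], each variable has three triangular sets (LOW, MEDIUM, HIGH), and the nine-entry table `RULES` is hypothetical. None of these are the membership functions of Figure 3 or the rule base of the proposed FIS; only the general scheme (min for AND, max aggregation, centroid defuzzification) is the standard Mamdani method.

```python
"""Illustrative Mamdani inference: (HEIGHT, EXT) -> PRIORITY, all in [0, 1]."""


def tri(x, a, b, c):
    """Triangular membership with feet a and c and peak b (a == b or b == c gives a shoulder)."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)


# Hypothetical fuzzy sets, shared by both inputs and by the output.
SETS = {"LOW": (0.0, 0.0, 0.5), "MEDIUM": (0.0, 0.5, 1.0), "HIGH": (0.5, 1.0, 1.0)}

# Hypothetical rule base: (HEIGHT label, EXT label) -> PRIORITY label.
RULES = {
    ("LOW", "LOW"): "MEDIUM", ("LOW", "MEDIUM"): "HIGH", ("LOW", "HIGH"): "HIGH",
    ("MEDIUM", "LOW"): "LOW", ("MEDIUM", "MEDIUM"): "MEDIUM", ("MEDIUM", "HIGH"): "HIGH",
    ("HIGH", "LOW"): "LOW", ("HIGH", "MEDIUM"): "LOW", ("HIGH", "HIGH"): "MEDIUM",
}


def fuzzify(x):
    """Degree of membership of a crisp value in each fuzzy set."""
    return {label: tri(x, *abc) for label, abc in SETS.items()}


def priority(height, ext, samples=101):
    """Crisp priority for normalised height and execution time."""
    mu_h, mu_e = fuzzify(height), fuzzify(ext)
    # Firing strength per output label: AND is min, rules sharing a consequent are combined by max.
    strength = {label: 0.0 for label in SETS}
    for (h_label, e_label), out in RULES.items():
        strength[out] = max(strength[out], min(mu_h[h_label], mu_e[e_label]))
    # Clip each output set at its strength, aggregate by max, defuzzify by centroid.
    num = den = 0.0
    for k in range(samples):
        y = k / (samples - 1)
        mu = max(min(strength[label], tri(y, *SETS[label])) for label in SETS)
        num += y * mu
        den += mu
    return num / den if den else 0.0


if __name__ == "__main__":
    for h, e in [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]:
        print(h, e, round(priority(h, e), 3))
```

With this hypothetical rule table, a task close to the entry node with a long execution time receives a higher priority than a task close to the exit node with a short one; a different rule table would change that ordering.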

| The proposed algorithm
The main concept of the PFB algorithm is to make a list of tasks, assign them priorities and sort them according to these priorities in descending order. This is achieved in the following steps:
1. Once the tasks arrive at the Task Queue, the Scheduler extracts their heights and execution times and passes them to the fuzzy inference engine, which assigns them priorities.
2. Then, it calculates their b-levels, as shown in Equation (2). The b-level of a task $T_i$ [28] is the length of the longest path from $T_i$ to an exit task, taking the execution cost of $T_i$ into consideration:

$$ b\text{-}level(T_i) = w_i + \max_{T_j \in SUCC(T_i)} \left( c_{ij} + b\text{-}level(T_j) \right) \tag{2} $$

where $w_i$ is the execution cost of $T_i$, $c_{ij}$ is the communication cost, $SUCC(T_i)$ is the set of immediate successors of $T_i$, and $b\text{-}level(T_j)$ is the b-level value of a successor node of $T_i$; the maximum is taken as zero for an exit task. By ignoring $c_{ij}$, Equation (2) becomes:

$$ b\text{-}level(T_i) = w_i + \max_{T_j \in SUCC(T_i)} b\text{-}level(T_j) \tag{3} $$

3. Then, it sorts the tasks of the Task Queue according to their b-levels in descending order.
4. Finally, it sorts the tasks of the Task Queue that have the same b-level value according to their priorities in descending order.
Table 2 shows the priorities given by the FIS, the b-level values and the order of execution of each task according to these steps.
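The prioritizing steps can be sketched as follows, assuming zero communication cost. The six-task graph `DAG` and the values in `PRIORITY` are hypothetical (they are neither Figure 1 nor the FIS output of Table 2); `PRIORITY` simply stands in for whatever the fuzzy inference engine returns. The function names are illustrative as well.

```python
# Hypothetical DAG: task -> (execution time, immediate successors).
DAG = {
    "T1": (2, ["T2", "T3"]),
    "T2": (3, ["T4"]),
    "T3": (3, ["T5"]),
    "T4": (4, ["T6"]),
    "T5": (4, ["T6"]),
    "T6": (1, []),
}

# Stand-in for the priorities produced by the fuzzy inference engine.
PRIORITY = {"T1": 0.5, "T2": 0.4, "T3": 0.7, "T4": 0.6, "T5": 0.3, "T6": 0.2}


def predecessors(dag):
    """Invert the successor lists: task -> list of immediate predecessors."""
    preds = {task: [] for task in dag}
    for task, (_, successors) in dag.items():
        for s in successors:
            preds[s].append(task)
    return preds


def heights(dag):
    """Height is 0 for an entry node, otherwise 1 + the largest predecessor height."""
    preds, memo = predecessors(dag), {}

    def height(task):
        if task not in memo:
            memo[task] = 1 + max(height(p) for p in preds[task]) if preds[task] else 0
        return memo[task]

    return {task: height(task) for task in dag}


def b_levels(dag):
    """b-level = own execution time + largest successor b-level (0 for an exit node)."""
    memo = {}

    def b_level(task):
        if task not in memo:
            weight, successors = dag[task]
            memo[task] = weight + max((b_level(s) for s in successors), default=0)
        return memo[task]

    return {task: b_level(task) for task in dag}


def pfb_order(dag, priority):
    """Sort by descending b-level, breaking ties by descending priority."""
    bl = b_levels(dag)
    return sorted(dag, key=lambda task: (-bl[task], -priority[task]))


if __name__ == "__main__":
    print(heights(DAG))
    print(b_levels(DAG))
    print(pfb_order(DAG, PRIORITY))
```

A single sort on the pair (b-level, priority) gives the same result as sorting by b-level first and then reordering ties by priority. Because execution times are positive, a predecessor always has a larger b-level than its successors, so the resulting list never places a task before one of its predecessors.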
After the tasks in the Task Queue have been assigned priorities and sorted, they are assigned to the processors according to the following procedure:
Step 1. While the Ready Queue is empty and the Task Queue is not empty, check the dependencies of the first node in the Task Queue.
• If all of its predecessors are done, remove it from the Task Queue and assign it to the first available processor.
• Otherwise, move it to the Waiting Queue until its predecessors finish their execution.
Step 2. If the Waiting Queue is not empty, then
• Move the successors of the finished task from the Waiting Queue to the Ready Queue.
• Update the status of the processor to Idle.
• If the Ready Queue is updated with a new task, sort it according to the b-level values of the tasks, and sort tasks with the same b-level value according to their priorities.
Step 3. While the Ready Queue is not empty,
• Remove the first task from the Ready Queue and assign it to the first idle processor.
• Repeat Step 2.
Step 4. Repeat all the previous steps until the Task Queue, the Waiting Queue and the Ready Queue are all empty.
Figure 4 shows the Gantt chart of the schedule produced by the proposed algorithm.
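The assignment procedure can be approximated by a small event-driven simulation. This is a simplified sketch rather than the scheduler described above: the Task Queue and Waiting Queue are collapsed into a count of unfinished predecessors per task, the processors are assumed identical, and communication cost is zero. The graph `DAG`, the list `ORDER` (tasks already sorted by descending b-level and priority) and the function name `list_schedule` are hypothetical.

```python
import heapq

# Hypothetical DAG: task -> (execution time, immediate successors).
DAG = {
    "T1": (2, ["T2", "T3"]),
    "T2": (3, ["T4"]),
    "T3": (3, ["T5"]),
    "T4": (4, ["T6"]),
    "T5": (4, ["T6"]),
    "T6": (1, []),
}
# Hypothetical result of the prioritizing phase (descending b-level, then priority).
ORDER = ["T1", "T3", "T2", "T4", "T5", "T6"]


def list_schedule(dag, order, m):
    """Return ({task: (processor, start, finish)}, makespan) on m identical processors."""
    if m < 1:
        raise ValueError("at least one processor is required")
    rank = {task: i for i, task in enumerate(order)}
    pending = {task: 0 for task in dag}  # unfinished predecessors per task
    for _, successors in dag.values():
        for s in successors:
            pending[s] += 1
    ready = sorted((t for t in dag if pending[t] == 0), key=rank.get)
    idle = list(range(m))  # idle processor ids, lowest first
    running = []           # min-heap of (finish time, task, processor)
    now, schedule = 0, {}
    while ready or running:
        # Hand the best-ranked ready tasks to the idle processors.
        while ready and idle:
            task, proc = ready.pop(0), idle.pop(0)
            finish = now + dag[task][0]
            schedule[task] = (proc, now, finish)
            heapq.heappush(running, (finish, task, proc))
        # Jump to the next completion time and retire every task ending then.
        now = running[0][0]
        while running and running[0][0] == now:
            _, done, proc = heapq.heappop(running)
            idle.append(proc)
            for s in dag[done][1]:
                pending[s] -= 1
                if pending[s] == 0:
                    ready.append(s)
        idle.sort()
        ready.sort(key=rank.get)
    return schedule, max(finish for _, _, finish in schedule.values())


if __name__ == "__main__":
    schedule, makespan = list_schedule(DAG, ORDER, m=2)
    for task in ORDER:
        print(task, schedule[task])
    print("makespan:", makespan)
```

The returned dictionary holds the same information as a Gantt chart: one (processor, start, finish) triple per task. Running the function with `m=1` gives the uniprocessor Makespan, which is the total execution time of all tasks.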

3 | SIMULATION RESULTS AND DISCUSSION
The Prototype Standard Task Graph Set, which is available online at [29], was used as a benchmark to test the proposed PFB algorithm. It is composed of 300 task graphs, each with 50 to 2500 tasks. Table 4 shows 50 task graphs selected from the 300. Several optimization objectives can be considered for this problem, which is multiobjective in its general form. The fundamental criterion here is minimizing the Makespan, that is, the time taken to finish the last job. This criterion is defined by Equation (4):

$$ Makespan = \min_{S \in Sched} \left\{ \max_{j \in jobs} F_j \right\} \tag{4} $$

where $F_j$ is the time when job $j$ finishes, $Sched$ is the set of all possible schedules and $jobs$ is the set of all jobs to be scheduled.
The simulation was carried out, using MATLAB, on all 300 prototypes to measure the performance of the proposed PFB algorithm in terms of Makespan, Minimum Schedule Length, Speedup ($S$) and Efficiency ($E$). These metrics are obtained from Equations (4)-(7), respectively [28,30]:

$$ \text{Minimum Schedule Length} = \frac{\text{Makespan on a uniprocessor}}{\text{Number of processors } (m)} \tag{5} $$

$$ S = \frac{\text{Makespan on a uniprocessor}}{\text{Makespan on } (m) \text{ processors}} \tag{6} $$

$$ E = \frac{S}{m} \tag{7} $$

The performance of the proposed algorithm is discussed below.

TABLE 2 b-level values of each task and order of execution of each task

A comparison was made between the schedule lengths produced by the proposed PFB algorithm and the optimal schedules for the Prototype Standard Task Graph Set [29]. These optimal schedules were obtained by a practical parallel optimization algorithm, PDF/IHS (Parallelized Depth First/Implicit Heuristic Search), on a shared main memory multiprocessor system, the Sun Ultra Enterprise 3000 (E3000). The E3000 has six UltraSPARC 167 MHz CPUs and 384 MB of shared main memory. Because of the NP-completeness of the scheduling problem, the upper limit on the search time was set to 10 min. The label 'Time Over' means that the PDF/IHS algorithm using six processor elements could not find an optimal schedule within 10 min. PDF/IHS found optimal schedules for 255 of the 300 prototypes; for the other 45 prototypes it could not find optimal schedules within 10 min, so their values were set to 'Time Over'. The results of the PFB algorithm are shown in Table 3. Table 3 and Figure 5 show that the PFB algorithm achieved 205 optimal schedules out of 255. It obtained a better solution once, for 'proto293', with a schedule length of 1299 [u.t.], whereas the reported optimal schedule length is 1472 [u.t.]; the PFB result is therefore shorter than the reported optimum by 173 [u.t.] (11.7525%), while the minimum schedule length for this task graph is 1298 [u.t.], as shown in Table 4 and Figure 5. For the remaining 49 prototypes, it missed the optimal schedules with error values of less than 0.5% in 45 cases, between 0.5% and 1% in two cases, and between 1% and 2.5% in two cases. The error value is calculated from Equation (8).
$$ Error = \frac{L_{PROPOSED} - L_{OPTIMAL}}{L_{OPTIMAL}} \times 100\% \tag{8} $$

where $L_{OPTIMAL}$ is the optimal schedule length and $L_{PROPOSED}$ is the schedule length of the proposed algorithm. Figure 6a shows the difference between the schedule lengths obtained by the proposed PFB algorithm and the minimum schedule lengths for each task graph of the 255 prototypes. It also shows the efficiency of the PFB algorithm for them. From this figure, it can be concluded that as the difference between the minimum schedule length and the schedule length obtained from the proposed algorithm decreases, the efficiency increases. If they are equal, the efficiency is 100%, which means that all the processors are fully utilized during the whole of the programme execution, for example, Proto053, Proto253, Proto295 and Proto298 in Table 4. The quality of any algorithm is strongly related to its speedup and efficiency. Figure 6b shows the speedup values for the 255 prototypes. It shows that the PFB algorithm achieved satisfactory speedup values, which reached 19.99.
For further clarification of the performance of the PFB algorithm, Figure 7a shows the difference between the minimum schedule lengths and the schedule lengths obtained from the PFB algorithm for the 45 prototypes for which the benchmark reports 'Time Over'. The PFB algorithm also achieved high efficiency values for them, reaching 99.934%, while Figure 7b shows their speedup values, which reached 19.937 in some cases.
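The metrics used in this section can be computed as in the short sketch below. The function and argument names are illustrative, efficiency is returned as a fraction rather than a percentage, and the signed form of the error (negative when the proposed schedule is shorter than the reference) is an assumption. The first example uses a hypothetical graph with 17 time units of total work; the second uses the proto293 schedule lengths quoted above.

```python
def minimum_schedule_length(t_uni, m):
    """Lower bound on the schedule length: uniprocessor Makespan divided by m."""
    return t_uni / m


def speedup(t_uni, t_multi):
    """Uniprocessor Makespan divided by the Makespan on m processors."""
    return t_uni / t_multi


def efficiency(t_uni, t_multi, m):
    """Speedup per processor, as a fraction (1.0 means fully utilized processors)."""
    return speedup(t_uni, t_multi) / m


def error_percent(l_proposed, l_optimal):
    """Relative deviation from the reference schedule length, in percent (assumed signed)."""
    return 100.0 * (l_proposed - l_optimal) / l_optimal


if __name__ == "__main__":
    # Hypothetical graph: 17 time units of work, finished in 10 on m = 2 processors.
    t_uni, t_multi, m = 17, 10, 2
    print(minimum_schedule_length(t_uni, m))
    print(speedup(t_uni, t_multi))
    print(efficiency(t_uni, t_multi, m))
    # proto293: PFB schedule length 1299 [u.t.], reported optimum 1472 [u.t.].
    print(round(error_percent(1299, 1472), 2))
```

When the schedule length on m processors equals the minimum schedule length, `efficiency` returns exactly 1.0, which matches the observation that equal lengths correspond to 100% efficiency.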

4 | CONCLUSION
This paper proposed an innovative algorithm for scheduling tasks on multiprocessor systems. According to that algorithm, priorities of tasks are calculated depending on their bottom level values and the priorities given by the fuzzy inference engine. The fuzzy engine gives priorities to tasks according to their execution times and heights. After that, the tasks are assigned to the appropriate processors with the aid of the scheduler processor. The motivation behind the proposed algorithm is to increase the performance of the system by finding optimal or near-optimal schedule lengths with high efficiency and speedup values.
FIGURE 6 (a) Efficiency and difference between minimum schedule lengths of 255 prototypes. (b) Speedup of 255 prototypes
FIGURE 7 (a) Difference between minimum schedule lengths and schedule lengths obtained from PFB algorithm for the 45 prototypes. (b) Speedup values of them
The Prototype Standard Task Graph Set was used as a benchmark to test the proposed algorithm. The simulation was carried out to measure the performance of the proposed algorithm in terms of Makespan, Speedup and Efficiency. A comparison was made between the schedule lengths produced by the proposed algorithm and the optimal schedules for the Prototype Standard Task Graph Set. The quality of any algorithm is closely related to its speedup and efficiency values, and the proposed algorithm achieved efficiency values that reached 100% and speedup values that reached 19.99.
In the future, artificial intelligence tools such as the Genetic Algorithm may be used to extend the proposed algorithm so that it covers more types of scheduling problems and becomes more general and flexible.