Load balanced fuzzy‐based unequal clustering for wireless sensor networks assisted Internet of Things

Wireless sensor networks (WSNs) are a fundamental building block of the Internet of Things (IoT). Due to their low cost and prospective use, WSNs have drawn interest in various applications over recent years. Since sensor nodes are equipped with a limited battery, it is vital that energy consumption be carefully managed. Minimizing node energy consumption is therefore essential to extend the network lifetime. In this paper, a fuzzy‐based unequal clustering protocol is proposed. Not only does the proposed protocol prolong the network lifetime, but it also balances the load among nodes, which former protocols do not consider. Fuzzy logic is employed with four inputs: residual energy, distance to the base station, node degree and centrality. To validate its efficiency, the novel protocol is compared with other relevant protocols. The obtained results show that the proposed protocol outperforms other state‐of‐the‐art protocols.

• The cluster radius has a significant role in maximizing the network lifetime; hence, unequal clustering is proposed.
• It is a fuzzy-based distributed unequal clustering algorithm designed for large-scale nonuniform networks.
• Each node calculates the probability of being elected as a CH by the fuzzy logic approach.
• The clusters of variable sizes are formed using a fuzzy logic approach by considering the same four parameters.
• It blends the probabilistic approach with the fuzzy logic approach in a suitable manner.
• Finally, an extensive simulation is done to evaluate the performance of the proposed protocol. The simulation results illustrate that the proposed protocol significantly prolongs the network lifetime when compared with state-of-the-art schemes.
The remainder of the paper is organized as follows: Section 2 describes the related work in a comprehensive manner. Section 3 presents the network topology and the model used for experimental analysis. Section 4 gives a detailed description of the proposed protocol. Section 5 gives an analysis of the protocol. Section 6 describes the simulation environment and the parameters used for simulation. Section 7 analyzes the experimental results. Section 8 gives the conclusion, followed by some future work.

RELATED WORK
IoT involves deploying a huge number of sensor nodes, so a main issue of IoT is managing them, and their servicing and maintenance cost is very high. 28 Moreover, the cost of replacing the batteries of already deployed sensor nodes is also very high, as they are deployed in very harsh environments. 29 For IoT applications, the LEACH protocol 30 has been extensively improved by several researchers to enhance the network performance of IoT systems. Other WSN protocols can also be modified to enhance the network performance of IoT systems. 31 The IoT is a ubiquitous network that connects smart entities to the cloud. A WSN senses physical phenomena, collects the sensed data and communicates the data to the server for monitoring purposes. 3 Nearly 70% of the energy is exhausted in the communication process. Since IoT contains a huge number of smart entities that are connected to the internet, energy preservation is a crucial issue in the IoT model.
The development of energy-efficient techniques for WSNs has always been a crucial task for researchers. When WSNs are merged with IoT, it becomes an even more vital task. To maintain IoT standards, the focus is on device energy-conserving methods, for example clustering, in which the selection of Cluster Heads (CHs) can be done wisely by considering various parameters. Several techniques have been developed for efficient CH selection that prolong the network lifetime.
Gupta et al 32 proposed a protocol that implements fuzzy logic during CH election. Inputs given to the system are residual energy, node's degree, and node centrality. This work is not compatible with large-scale WSN as it suffers from scalability issues.
Cluster Head election mechanism using Fuzzy logic (CHEF) 33 uses a fuzzy logic mechanism to elect the final CHs. First, tentative CHs are chosen by a probabilistic method; then fuzzy logic is employed to elect the final CHs. Inputs given to the fuzzy system are residual energy and the average distance of a node from its neighbors.
The above-mentioned protocols do not consider the energy hole issue of WSNs. In recent times, some unequal clustering algorithms have been developed. The main idea behind unequal clustering is to create smaller clusters near the sink. This variable cluster size is defined by the distance to the base station (BS). A node near the sink has a smaller cluster size than nodes that are located far away from the sink.
A fuzzy energy-aware unequal clustering algorithm (EAUCF) 22 takes into account other factors for the computation of cluster size. Fuzzy logic is used for this purpose. Tentative CHs are elected by a probabilistic approach. Final CHs are elected by considering the residual energy of nodes only. Inputs to the fuzzy logic are residual energy and the distance of a node from the BS.
Fuzzy based unequal clustering (FBUC) and the multiobjective fuzzy clustering algorithm for wireless sensor networks (MOFCA) 34,35 are advancements of EAUCF. MOFCA considers node density as an additional parameter for the computation of cluster size, while FBUC considers node degree as an additional parameter. Final CH selection is based on a single parameter. This leads to a nonuniform distribution of CHs.
An optimization problem in terms of load balance and energy consumption is considered for the proposed network structure, with the aim of implementing an efficient and scalable IoT. 8 A framework is designed for IoT deployment, based on which an optimization technique that considers the energy expenditure is proposed. The ME-CBCCP protocol is then proposed as a solution to this optimization problem. It shows a significant improvement in terms of delay, packet delivery ratio, throughput and energy consumption. A dynamic heterogeneous clustering algorithm is proposed by considering features like connectivity, mobility, and communication. 36 The cluster formation is done by considering parameters like neighbor count and residual energy.
A new energy-efficient clustering protocol for sensor devices in IoT is proposed. 11 A genetic algorithm is used to select the cluster heads with the highest energy level and the lower amount of data transmission. This algorithm decreases energy consumption and end-to-end delay.
Fuzzy-based unequal clustering algorithm (FUCA) 37 is a protocol that considers other factors when calculating the cluster size. The competition radius and rank are calculated for each node by using the fuzzy logic approach. Inputs given to the fuzzy system are the distance to the base station, residual energy, and density. The outputs produced are competition radius and rank. However, it suffers from intracluster communication overhead.
A novel Lifetime Maximizing optimal Clustering Algorithm (LiMCA) is proposed for IoT devices. 38 Here, both Cluster Heads (CHs) and Member Nodes (MNs) are deployed at predetermined locations. For deployment, a novel deployment scheme is proposed. In addition, a training protocol is proposed to train CHs and MNs about their cluster identity.
Improved grid-based hybrid network deployment (IGHND) 39 is proposed for grid-based WSNs. This protocol considers several parameters for CH selection. However, it suffers from poor load balancing and a high energy dissipation rate.
To the best of my knowledge, none of these works focuses on the issue of intracluster overhead. Reducing the intracluster communication cost of a node reduces its energy dissipation.

PRELIMINARIES
The network consists of N sensor nodes with the same capabilities, deployed randomly in an X × Y sensor field. The deployed nodes, as well as the sink, are static. The sink is placed at the center of the field and its location is known to every deployed node. The nodes are homogeneous in terms of energy, and each node knows the locations of the other nodes. The links are assumed to be symmetric. The cluster heads aggregate the data and send the aggregated data directly to the sink. Wireless links are used to transmit data and control messages.

Energy model
The first-order radio model is used to calculate the energy consumed by the nodes. This model employs free space and multipath fading channels depending on the distance between the transmitting node and the receiving node. The free space (fs) model is used if the distance is less than a threshold ($d_0$); otherwise, the multipath (mp) model is used. 32 Therefore, the radio energy consumed to transmit a k-bit message over a distance of d meters is as follows:

$$E_{TX}(k, d) = \begin{cases} k E_{elec} + k \varepsilon_{fs} d^{2}, & d < d_{0} \\ k E_{elec} + k \varepsilon_{mp} d^{4}, & d \ge d_{0} \end{cases}$$

where $E_{elec}$ denotes the energy consumed by the electronic circuit per bit, and $\varepsilon_{fs}$ and $\varepsilon_{mp}$ denote the energy consumption for the free space and multipath fading channels, respectively. The energy required for the radio to receive a k-bit message is as follows:

$$E_{RX}(k) = k E_{elec}$$
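The radio model described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulation code: the numeric constants are values commonly used with the first-order radio model and are assumptions here (the paper's Table 4 is not reproduced in this text), as is the choice of the crossover distance `D0` as the point where the two amplifier terms are equal.

```python
import math

# Assumed, commonly used constants (NOT taken from the paper's Table 4).
E_ELEC = 50e-9       # J/bit, energy of the electronic circuit
EPS_FS = 10e-12      # J/bit/m^2, free space amplifier energy
EPS_MP = 0.0013e-12  # J/bit/m^4, multipath amplifier energy
# Assumed threshold d0: distance at which both amplifier terms are equal.
D0 = math.sqrt(EPS_FS / EPS_MP)


def tx_energy(k_bits: int, d: float) -> float:
    """Energy in joules to transmit k_bits over d meters."""
    if d < D0:
        # Free space channel: amplifier cost grows with d^2.
        return k_bits * E_ELEC + k_bits * EPS_FS * d ** 2
    # Multipath channel: amplifier cost grows with d^4.
    return k_bits * E_ELEC + k_bits * EPS_MP * d ** 4


def rx_energy(k_bits: int) -> float:
    """Energy in joules to receive k_bits."""
    return k_bits * E_ELEC
```

With these assumed constants the threshold is roughly 88 m, so a short intracluster hop falls in the free space regime while a long CH-to-sink link can fall in the multipath regime, where the cost rises with the fourth power of the distance.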

Data aggregation model
For data aggregation, the infinite compressibility model is used. Each cluster head collects the data from its member nodes and aggregates it into a single packet, regardless of the number of received packets.

PROPOSED PROTOCOL
The proposed protocol is designed for nonuniform large-scale WSN with homogeneous nodes. This section is divided into two parts:

Fuzzy system
The fuzzy logic controller is employed to select the CHs and to calculate the competition radius. Four inputs are given to the system and two outputs are produced, as shown in Figure 1. The inputs are residual energy, distance to the base station, node degree and centrality, while the outputs are chance and radius. Input variables are:
1. Residual energy: It is the most important criterion for the selection of a Cluster Head. The cluster head spends a considerably higher amount of energy than other nodes, since it collects data from the cluster members, aggregates it and transmits the aggregated data to the sink. In addition, the competition radius should decrease as energy reduces.
2. Distance to the base station: When transmitting data, the energy consumed by a node increases with the distance between the transmitter and receiver nodes. Hence, the distance between the CH and the BS should be small. Moreover, clusters near the BS should be small compared to the clusters that are far away from the BS.
3. Centrality: Centrality is a measure of how well the sensor node is located at the center of its neighbors in the whole network. This is an important measure for reducing the intracluster communication cost. Less energy is required by the cluster members to transmit the data to the CH if the value of the CH's centrality is lower.
4. Node degree: The node degree is the number of neighbor nodes within communication range. A higher node degree gives a node a greater chance of being selected as a CH, while it also means a smaller competition radius.
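To make the two topology-dependent inputs concrete, the following sketch computes a node degree and a centrality value from node coordinates. The degree follows the definition given above. The centrality formula, the mean squared distance to the neighbors (lower means more central), is an assumption chosen for illustration; the exact expression is not stated in this text. The function names and the `comm_range` parameter are hypothetical.

```python
import math


def neighbors(idx, positions, comm_range):
    """Indices of all other nodes within comm_range of node idx."""
    xi, yi = positions[idx]
    return [
        j
        for j, (x, y) in enumerate(positions)
        if j != idx and math.hypot(x - xi, y - yi) <= comm_range
    ]


def node_degree(idx, positions, comm_range):
    """Number of neighbor nodes within communication range."""
    return len(neighbors(idx, positions, comm_range))


def centrality(idx, positions, comm_range):
    """Assumed measure: mean squared distance to the neighbors.

    A lower value means the node sits closer to the center of its
    neighborhood. A node without neighbors gets infinity.
    """
    nbrs = neighbors(idx, positions, comm_range)
    if not nbrs:
        return float("inf")
    xi, yi = positions[idx]
    total = sum(
        (positions[j][0] - xi) ** 2 + (positions[j][1] - yi) ** 2 for j in nbrs
    )
    return total / len(nbrs)
```

Both values would typically be normalized before being passed to the fuzzy system; the normalization is left out here because it depends on the membership function ranges.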
Output variables are:
1. Chance: It defines the eligibility of a node to be elected as a CH. The higher the value of the chance output, the higher the probability of a node being elected as CH. Nodes that have high residual energy, a short distance to the BS, close centrality and a high number of neighbor nodes will have a smaller competition radius and a very high probability of becoming a CH (Table 1).
The fuzzy set for each input and output variable is shown in Figures 2-7, respectively. Boundary variables follow the trapezoidal membership function while the middle variable follows the triangular membership function as shown in figures.
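The two membership function shapes mentioned above can be written as plain Python functions. This is a minimal sketch: the breakpoints used in `fuzzify_energy` and the labels `low`, `medium` and `high` are hypothetical, since the actual fuzzy sets are defined only in the paper's figures.

```python
def triangular(x, a, b, c):
    """Triangle with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoid with feet at a and d and plateau [b, c] (a <= b <= c <= d)."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def fuzzify_energy(x):
    """Fuzzify a residual energy value normalized to [0, 1].

    Boundary sets are trapezoidal and the middle set is triangular.
    The breakpoints are hypothetical.
    """
    return {
        "low": trapezoidal(x, 0.0, 0.0, 0.25, 0.5),
        "medium": triangular(x, 0.25, 0.5, 0.75),
        "high": trapezoidal(x, 0.5, 0.75, 1.0, 1.0),
    }
```

A crisp value that lies between two sets receives a nonzero degree in both, which is what lets neighboring rules fire together during inference.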
In the FIS, the crisp input values are converted into fuzzy linguistic variables. The fuzzy if-then rules based on the Mamdani method are applied to map the input variables to the corresponding fuzzy output variables. A total of 81 ($3^4$) rules are made with four parameters having different membership functions. Some of the fuzzy if-then rules are shown in Table 2. The Center of Area (CoA) method is used for defuzzification, that is, to convert output linguistic variables into a crisp output value.
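The rule count and the two core Mamdani operations can be illustrated as follows. This sketch only enumerates the 81 rule antecedents, evaluates a rule's firing strength with the usual minimum operator, and computes a discrete center of area; it does not reproduce the consequents of Table 2. The level names are hypothetical labels.

```python
from itertools import product

LEVELS = ("low", "medium", "high")  # hypothetical linguistic labels
INPUTS = ("residual_energy", "distance_to_bs", "node_degree", "centrality")

# Every combination of one level per input: 3^4 = 81 antecedents.
antecedents = list(product(LEVELS, repeat=len(INPUTS)))


def rule_strength(memberships, antecedent):
    """Mamdani AND: the minimum membership degree over the four inputs.

    memberships maps input name -> {level: degree}.
    """
    return min(memberships[name][level] for name, level in zip(INPUTS, antecedent))


def center_of_area(xs, mu):
    """Discrete CoA: membership-weighted mean of the sampled output values."""
    total = sum(mu)
    if total == 0:
        return 0.0  # no rule fired; returning 0 is an arbitrary choice here
    return sum(x * m for x, m in zip(xs, mu)) / total
```

In a full Mamdani system, each rule's output set would be clipped at its firing strength, the clipped sets combined with a maximum, and `center_of_area` applied to the combined curve once for chance and once for radius.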


Clustering process
This phase is divided into rounds like LEACH protocol. The proposed protocol operates in two phases:

Cluster formation
At the beginning of each round, each sensor node generates a random number between 0 and 1. If the generated number is less than the predefined threshold, the node becomes eligible for CH election and its state becomes ELIGIBLE_NODE. The eligible node then computes its radius and chance using the FIS and broadcasts an ELIGIBLE_NODE_MSG to all the neighbors within its competition radius. This ELIGIBLE_NODE_MSG contains the node id and chance value. The node whose chance value is the highest among its neighbors is declared a final CH and broadcasts an ELECTED_CH_MSG to its neighbors. The nodes that do not become CHs, as well as the nodes that were not eligible, then send a JOIN_CH_MSG to the nearest CH. After cluster formation, the CH nodes generate the TDMA schedule and broadcast it to all their members.
FIGURE 5 Fuzzy set for "centrality"
FIGURE 6 Fuzzy set for "chance"
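The cluster formation steps can be sketched as a centralized simulation of one round. Message passing is replaced by direct lookups, and the chance and radius of every node are assumed to have been computed by the FIS beforehand. The function names, the dictionary layout and the tie-break on node id are assumptions made for this illustration.

```python
import math
import random


def elect_cluster_heads(nodes, threshold, rng):
    """Return the ids of the final CHs for one round.

    nodes maps id -> {"pos": (x, y), "chance": float, "radius": float}.
    """
    # A node is eligible if its random draw falls below the threshold.
    eligible = [i for i in nodes if rng.random() < threshold]
    heads = []
    for i in eligible:
        # Node i hears ELIGIBLE_NODE_MSG from j if it lies within j's radius.
        heard = [
            j
            for j in eligible
            if j != i
            and math.dist(nodes[i]["pos"], nodes[j]["pos"]) <= nodes[j]["radius"]
        ]
        # Node i wins unless it heard a node with a higher chance
        # (ties are broken by node id, which is an assumption).
        if all((nodes[i]["chance"], i) > (nodes[j]["chance"], j) for j in heard):
            heads.append(i)
    return heads


def join_clusters(nodes, heads):
    """Assign every non-CH node to its nearest CH (JOIN_CH_MSG)."""
    members = {h: [] for h in heads}
    for i in nodes:
        if i in members or not heads:
            continue
        nearest = min(
            heads, key=lambda h: math.dist(nodes[i]["pos"], nodes[h]["pos"])
        )
        members[nearest].append(i)
    return members
```

For example, calling `elect_cluster_heads(nodes, threshold, random.Random(0))` followed by `join_clusters(nodes, heads)` yields one member list per elected CH, from which a TDMA schedule could then be derived.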

Data collection
After the cluster formation and generation of TDMA schedules, the CH nodes wait for the data. A non-CH node senses the data and transmits it to its CH within the allotted time slot. Nodes go to the sleep state in other time slots. Each CH node aggregates all the collected data into a single message, which it then transmits to the BS. The CHs must stay awake all the time, which leads to higher energy consumption by the CHs. The role of CH is therefore rotated among the sensor nodes in order to balance the energy consumption.
FIGURE 7 Fuzzy set for "radius"
The total energy drained by a CH during data collection from all its sensor nodes is expressed in terms of $E_{RX}$, the energy spent receiving the data; $E_{DA}$, the energy spent on aggregation; and $E_{non-CH}$, the energy spent by the non-CH nodes of that particular Cluster Head.
Here, m denotes the number of members within a cluster, n denotes the number of clusters, $E_{TX}(j, CH_i)$ is the energy spent for transmission from node j to its CH in the ith cluster, k is the number of bits, $E_{perdatabit}$ is the aggregation energy for a single data bit, and N is the number of nodes. Two cases are distinguished: the CH acts as a gateway node for its members and does not generate any data of its own, or the CH is also involved in sensing.

PROTOCOL ANALYSIS
The following points are drawn from the findings of the literature survey and from the design of the proposed protocol:
a. Intracluster communication cost is not considered in many protocols.
b. Reducing the intracluster communication cost of a node reduces its energy dissipation.
c. The proposed protocol selects cluster heads based on four parameters, that is, residual energy, distance to the base station, node degree and centrality.
d. Centrality ensures a lower intracluster communication cost.
e. It helps in prolonging the network lifetime, as the selection of CHs involves residual energy.
f. The unequal clustering helps in balancing the load among the nodes in the network.
g. The competition radius is also calculated by considering the same four parameters.
h. The fuzzy logic controller is employed to select the cluster heads as well as to calculate the competition radius. Fuzzy logic makes the process simple by handling all the uncertainties.
i. Load balancing and the prolongation of the network lifetime are considered in the proposed protocol.

Simulation environment
The proposed algorithm is simulated using MATLAB. To prove the efficacy of the proposed algorithm, different scenarios are created for the simulations. The random placement of nodes introduces chance effects that may influence the results of an experiment. Hence, the results are averaged over 50 experiments. Networks of 100, 200, 300 and 400 nodes are deployed in a 100 × 100 region of interest. The sink is placed first at the center of the region of interest, then at the corner and finally outside the region of interest. In total, 12 scenarios are created, as represented in Table 3. Table 4 lists the parameters of the simulation. The first order radio energy model is used for calculating the dissipation of energy. The performance of the proposed protocol is evaluated against some well-known equal and unequal clustering protocols, namely MOFCA, 35 IGHND 39 and FUCA. 37
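The scenario grid and the averaging over repeated runs can be sketched as below. The paper's simulations were done in MATLAB; Python is used here only for illustration. The sink coordinates are assumptions (the text states only center, corner and outside), and the scenario numbering, with the sink position varying slowest, is inferred from the order in which the scenarios are discussed in the results. `run_once` is a hypothetical callback standing in for one full simulation.

```python
import random
from itertools import product
from statistics import mean

FIELD = 100.0  # side of the square region of interest
NODE_COUNTS = (100, 200, 300, 400)
# Assumed sink coordinates for the three placements.
SINK_POSITIONS = {
    "center": (50.0, 50.0),
    "corner": (100.0, 100.0),
    "outside": (50.0, 150.0),
}

# 3 sink placements x 4 node counts = 12 scenarios, numbered from 1.
scenarios = [
    {"id": n, "sink": sink, "nodes": count}
    for n, (sink, count) in enumerate(product(SINK_POSITIONS, NODE_COUNTS), start=1)
]


def deploy(count, rng):
    """Place count nodes uniformly at random in the field."""
    return [(rng.uniform(0.0, FIELD), rng.uniform(0.0, FIELD)) for _ in range(count)]


def average_over_runs(run_once, scenario, runs=50, seed=0):
    """Average a scalar metric over independent random deployments.

    run_once(scenario, rng) is a hypothetical function that simulates
    one experiment and returns the metric, for example the FND round.
    """
    return mean(run_once(scenario, random.Random(seed + r)) for r in range(runs))
```

Seeding each run separately keeps the 50 deployments independent yet reproducible, so that all compared protocols can be evaluated on the same node placements.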

Network lifetime evaluation
The first node death (FND) value for each of the protocols is represented in Table 5. The most important parameter for estimating performance in an IoT-based WSN is the network lifetime. It is defined as the round in which the first coverage hole or the death of the first node occurs. Hence, prolonging the lifetime of the network is of utmost importance in WSNs. The proposed protocol performs better than its counterparts because it uses the centrality parameter to cope with the issue of high intracluster communication cost. IGHND is based on equal clustering and hence balances the load among nodes poorly. MOFCA is based on unequal clustering but selects cluster heads based on energy only. FUCA considers all the parameters but suffers from high intracluster communication costs. Table 5 illustrates the simulation results for all the scenarios. For scenario 1, considering the first dead node time, the proposed protocol increases the network lifetime by 41.07% compared to MOFCA, by 37.54% compared to IGHND, and by 9.72% compared to FUCA.
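A minimal sketch of how such a comparison could be computed is given below. Two assumptions are made: the FND is taken as the earliest round in which any node dies, and the percentage improvement is measured relative to the baseline protocol's FND. The exact formula behind the reported percentages is not stated in the text, and the numbers in the usage note are hypothetical, not values from Table 5.

```python
def first_node_death(death_rounds):
    """Round in which the first node dies, given each node's death round."""
    return min(death_rounds)


def improvement_pct(proposed_fnd, baseline_fnd):
    """Relative lifetime gain of the proposed protocol over a baseline.

    Assumed definition: gain expressed as a percentage of the baseline FND.
    """
    return 100.0 * (proposed_fnd - baseline_fnd) / baseline_fnd
```

For instance, with hypothetical FND values of 150 rounds for the proposed protocol and 100 rounds for a baseline, `improvement_pct(150, 100)` gives 50.0.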

Number of dead nodes evaluation
For scenario 2, considering the first dead node time, the proposed protocol increases the network lifetime by 52.76% compared to MOFCA, by 44.64% compared to IGHND, and by 13.30% compared to FUCA.
For scenario 3, considering the first dead node time, the proposed protocol increases the network lifetime by 47.90% compared to MOFCA, by 38.46% compared to IGHND, and by 8.51% compared to FUCA.
For scenario 4, considering the first dead node time, the proposed protocol increases the network lifetime by 46.98% compared to MOFCA, by 46.33% compared to IGHND, and by 19.39% compared to FUCA. Figures 12-15 show the network lifetime for the 100, 200, 300, and 400 node setups, respectively. Here, the sink is placed at the corner of the region of interest. The proposed protocol shows an improvement over all the compared protocols. For scenario 5, considering the first dead node time, the proposed protocol enhances the performance by 55.43% compared to MOFCA, by 41.14% compared to IGHND, and by 29.18% compared to FUCA.
For scenario 6, considering the first dead node time, the proposed protocol enhances the performance by 67.43% compared to MOFCA, by 49.54% compared to IGHND, and by 39.68% compared to FUCA. For scenario 7, considering the first dead node time, the proposed protocol enhances the performance by 60.63% compared to MOFCA, by 51.35% compared to IGHND, and by 43.43% compared to FUCA.
For scenario 8, considering the first dead node time, the proposed protocol enhances the performance by 65.27% compared to MOFCA, by 53.34% compared to IGHND, and by 46.23% compared to FUCA. Figures 16-19 show the network lifetime for the 100, 200, 300, and 400 node setups, respectively, when the sink is outside the region of interest. The proposed protocol shows a major improvement over all the protocols in all the scenarios. The proposed protocol selects the optimal CHs with the proposed meta-heuristics and balances the load among the nodes, thus reducing the dissipation of energy. The protocols used for comparison deplete energy much faster than the proposed protocol.
For scenario 9, considering the first dead node time, the proposed protocol enhances the performance by 92.18% compared to MOFCA, by 82.73% compared to IGHND, and by 72.47% compared to FUCA.

FIGURE 18 Dead nodes for SN#11
For scenario 10, considering the first dead node time, the proposed protocol enhances the performance by 95.97% compared to MOFCA, by 83.72% compared to IGHND, and by 74.16% compared to FUCA.
For scenario 11, considering the first dead node time, the proposed protocol enhances the performance by 93.19% compared to MOFCA, by 82.79% compared to IGHND, and by 66.54% compared to FUCA.
For scenario 12, considering the first dead node time, the proposed protocol enhances the performance by 95.98% compared to MOFCA, by 84.78% compared to IGHND, and by 73.74% compared to FUCA. The proposed protocol shows a significant improvement over its counterparts, as it considers all the essential parameters affecting the network lifetime. The use of the centrality parameter reduces the intracluster communication cost in the proposed protocol and thus prolongs the network lifetime. The proposed protocol supports a larger number of rounds even when node density increases, and the placement of the sink does not affect the performance of the proposed protocol. Table 6 shows the average remaining energy per round for all the algorithms in every scenario. This energy calculation includes all the costs, that is, the cluster formation cost, the intracluster communication cost and the intercluster communication cost during a round. The remaining energy of the proposed protocol is high with respect to the protocols used for comparison. The energy thus saved will prolong the network lifetime, and the nodes will be able to transmit data for a longer duration.

Network remaining energy evaluation
It can be seen from the table that the average remaining energy for the proposed protocol is higher when compared to other protocols like MOFCA, IGHND, and FUCA. The proposed protocol balances the load among the nodes as well as solves the energy hole problem by creating small clusters near the sink.

CONCLUSION
In this paper, two issues for IoT-based wireless sensor networks, namely load balancing and minimization of energy dissipation, are considered. The selection of cluster heads is done with the goal of minimizing the energy dissipation of the network and balancing the load among the nodes. Here, fuzzy logic is used for the selection of cluster heads. The proposed protocol is compared with different well-known equal and unequal clustering protocols over different network scenarios. In all scenarios, a significant improvement in terms of network lifetime is observed. In future work, the proposed protocol can be evaluated by incorporating the mobility of nodes and obstacles in the region of interest. It can further be implemented by considering various other scenarios for IoT-based WSNs.