Cluster optimization in wireless sensor network based on optimized Artificial Bee Colony algorithm

Deepa, S.R., SCOPE, VIT, Chennai, 600127, India. Email: deepaprashanth.g@gmail.com

Abstract

Wireless sensor networks (WSNs) have emerged as a potential research area owing to their wide range of applicability in various fields. Critical application areas of WSN include defence and military surveillance, weather monitoring, health care monitoring, and Internet of Things. Extensive research efforts have been made to improve energy and data delivery performance in WSN with different bio-inspired optimized clustering methodologies such as particle swarm optimization (PSO), and the bacterial foraging algorithm for optimization (BFAO). However, most constrained solutions are limited to data aggregation performance and enhance the energy efficiency of the network to some extent. Therefore, balancing energy and data delivery performance to a greater extent is crucial because of design limitations imposed on existing hierarchical solutions. This article introduces a novel clustering paradigm, namely optimal clustering using the Artificial Bee Colony (OCABC) algorithm, which improves energy efficiency based on a simplified and robust ABC algorithm. The central idea is to increase the network lifetime of the WSN by optimizing the cluster formation process. The implemented structured module of OCABC attempts to overcome challenges encountered in existing baselines. The extensive numerical analysis with respect to significant performance parameters assists in benchmarking the OCABC compared with the PSO and BFAO.


| INTRODUCTION
The wireless sensor network (WSN) clustering paradigm has become popular and has made significant progress with data-driven computing solutions in a resource-constrained ad hoc environment. Current technological advancements in WSN have enhanced its capacity toward improving reliability and quality of service (QoS). The operational environment of a WSN is defined by the baseline theory of the clustering principle. A cluster head (CH) usually has higher computational capability than its member nodes [1]. Here, the hierarchical clustering paradigm forms multihop routes from node to CH and then from CH to sink. The aggregated data are finally accumulated at an external sink computing system. A wide range of potential WSN application areas exist, including habitat monitoring, weather monitoring, and intrusion detection. Sensor nodes operate within a resource-constrained environment in which operational factors such as power supply, bandwidth, central processing unit frequency and processing capacity are extremely limited. Hence, simple, structured, energy-aware routing protocols with low computational overhead need to be implemented. Energy use and security are major issues in wireless networks [2][3][4].
The design principle of the Artificial Bee Colony (ABC) algorithm is mimicked in WSN in terms of the collective foraging behaviour of honeybees, which includes both the self-organization principle and the division of labour. The conceptual optimized clustering model is validated by numerical analysis in which it is benchmarked against previously established optimization models such as particle swarm optimization (PSO) and the bacterial foraging algorithm for optimization (BFAO). Incorporating hierarchical routing in optimal clustering using ABC (OCABC) to reduce energy consumption enhances network performance and data delivery performance, which is satisfactory compared with low energy adaptive clustering hierarchy (LEACH) and other optimization models.
In this article, Section 2 discusses a few of the significant works comparing PSO- and BFAO-based energy-efficient clustering approaches and describes how optimization-based modelling has considerably enhanced the performance of hierarchical clustering. Section 3 presents the analytical design and core concepts of ABC from both a procedural and a computational algorithmic viewpoint. Section 4 describes the gap explored in the conventional mode of research in cluster-based routing. It also highlights the deficits of the PSO- and BFAO-based approaches that affect the overall energy performance of the network. Section 5 explains the analytical modelling of OCABC. Section 6 illustrates the experimental outcome observed by simulating PSO, BFAO, and LEACH, and offers a comparison of these baselines with the formulated OCABC approach. Finally, Section 7 concludes by providing contributory remarks about the formulated solution.

| RELATED WORK
This phase of the study explores a few significant works carried out in the field of optimal clustering and highlights design limitations associated with their approaches. In a constrained environment, it is difficult to overcome the bottleneck during cluster-based routing; it is still an open-ended problem in terms of the resources of energy, bandwidth, and processing capacity.
Clustering is considered an effective methodology to date to ensure the proper use of energy in a single-tier network. In clustered communication, single-tier attributes involve data aggregation and fusion to minimise packet collisions and redundancy [5][6][7]. LEACH is a prominent and widely preferred clustering hierarchy methodology [8]. The established baseline design concept of LEACH addresses the limitations of direct transmission or flooding, which exhaust critical sensor resources. It introduced a hop-to-hop fashion of communication and routing involving both intracluster and intercluster communication to reduce computational effort as well as the communication cycle. LEACH is superior for hierarchical routing and data aggregation, but it lacks efficiency where energy is concerned [9]. Bio-inspired algorithms have a crucial role in implementing cost-effective cluster-based routing in WSN; hence, they have gained the attention of many researchers. Conventional hierarchical clustering paradigms have mimicked the concepts of PSO and BFAO to optimize network performance, yet some bottlenecks still appear. To address limitations that arise in different well-known optimization solutions, the study introduces a robust hierarchical clustering paradigm based on an optimized ABC algorithm analytically modelled with a simplified execution flow. The prime objectives of the ABC algorithm here are to perform optimal clustering and balance network performance with energy [10][11][12].
The most recent study of Krishnan et al. introduced an improved clustering schema based on PSO theory. The study addresses the problem of fixed sink location and data delivery performance aspects. The optimization principle, in this case, implemented an analytical methodology to address energy problems. The authors claimed that the formulated system attains a better outcome and increases the life span of a WSN [13].
A similar optimization problem is considered in the study of Lee et al. This design and implemented methodology considers the heterogeneity of WSN. The prime objective of the study was to minimise network cost using potential node optimization. The outcome shows that the PSO-based optimization solution has a considerable outcome in the context of energy [14].
The study of Sheta et al. also explored the strength of energy optimization using PSO for WSN and highlighted the remarks on the manuscript from a theoretical viewpoint [15]. On the other hand, a similar pattern of quantitative exploration along with a novel methodology formulation was described in the study of Parvin et al. [16]. The formulated design modelling is conceptualised based on PSO clustering. The PSO clustering in this phase explores the best possible route formation using the search algorithm. The formulated system is simulated using the NS-2 simulator and is superior to some extent. The study of Kaur [17], Singh [18] and Kaur [19] also extended work considering the PSO-based clustering optimization problem.
Apart from PSO-based optimization, much research has focussed on assessing the performance of bacterial foraging optimization (BFO) toward WSN clustering in terms of both energy and data delivery.
Annu et al. introduced a novel BFO-based solution to optimize the energy performance of WSN [20]. The authors claimed that BFO has potential for energy-efficient clustering and for solving other multidimensional problems. The comparative performance analysis showed that the formulated system attains better energy performance compared with LEACH.
Similarly, Lalwani et al. formulated a novel approach to CH selection and routing in WSN using BFO [21]. The formulated system computationally performs better than the existing baselines.
Considering the beneficial and contributory aspects, Deepa et al. also formulated a novel PSO [22] and BFAO clustering algorithm [23]. The design principle of the BFAO attained better energy performance as well as QoS compared with LEACH and PSO. The study also explored the bottleneck conditions that exist within the design limitations of PSO- and BFAO-based clustering problem formulations. A fitness-based glowworm swarm with fruit fly (FGF) algorithm was proposed by Kale et al. based on the foraging behaviour of glowworms for cluster optimization [24]. The firefly cyclic grey wolf optimization (FCGWO) proposed by Murugan et al. is a hybrid algorithm that uses the foraging behaviour of grey wolves and fireflies [25]. Most studies considered only energy problems and overlooked computational complexity and the reliability of communication. Therefore, a problem formulation to fill this gap is needed to make the future direction of research stronger and more practical.
The research problem in the context of energy-aware clustering is explored from various aspects. Incorporating a LEACH-based clustering policy in a one-hop or multihop hierarchical solution performs energy-aware data aggregation to some extent but does not ensure the reliability of data delivery. Therefore, in the long term, it exhausts critical WSN computational resources from an energy viewpoint. To overcome this bottleneck, different efficient optimization-based clustering schemes have been used, but the existing PSO and BFAO solutions involve a huge set of recursive and iterative computational operations. Hence, converging on an optimal solution with faster process execution is challenging in both PSO- and BFAO-based clustering approaches. PSO also leads to computational complexity owing to its heavier computation of particles. Moreover, PSO and BFAO do not provide a foolproof solution where the trade-off between data delivery performance and energy efficiency of WSN is concerned. Thereby, the energy and clustering problem in WSN is still an open issue.

| ANALYTICAL DESIGN AND CORE CONCEPT OF ARTIFICIAL BEE COLONY
The notion of ABC is an interesting topic to many scientists for distributed problem solving especially in the area of WSN and its energy-efficient communication aspects. The concept is analytically designed and modelled by mimicking the foraging behaviour of swarms of honeybees locating food.

| Self-organization principle in swarm of bees
It refers to a dynamically defined paradigm that involves cooperative communication, learning, and computation. The prime objective is to quantify the global-level solution by exploring interactions among lower-level attributes of the system.

| The implication of the rule set
The system executes a procedure that involves the computation of basic rules that define the interaction among components. The interaction during computations happens purely on the basis of local factors and does not impose dependencies on the global pattern of factors.
The following four basic properties associated with the rule set in the ABC concept help to determine the optimal path to the exact food source.
Positive Feedback Factor: This is the rule of thumb that promotes a favourable situation attracting entities within a swarm through recruitment and reinforcement. In a bee colony, as the amount of nectar at a food source increases, the number of visitors to that source also increases over time, for example through the recruitment dances of bees.
Negative Feedback Factor: This indicates a process that can improve system performance by stabilising collective behavioural patterns. Negative feedback systems are useful to avoid a state of saturation in which the system experiences a lack of resource availability. The negative feedback system helps honeybees to stop exploiting poor food sources.
Fluctuations Factor: Exploration at the level of local components, together with randomness and task switching among swarm members, lets an emergent structure evolve so that new solutions (e.g. new food sources) can be discovered.
Minimal Density Factor: This refers to a situation in which different attributes of a swarm share individual experiences gathered from their own activities. It helps the system use individual results to come up with a significant solution. For example, bees corresponding to the same nest usually communicate with each other in the same dance region. Here, the prime purpose of communication lies in sharing information about different food sources.

| Honeybee swarm characteristics
The prime objective of a honeybee swarm is to come up with an optimal model of foraging that locates food within a search space. The primary requirement in this case is to obtain collective intelligence from the swarm that emerges with respect to time. A swarm of bees consists of three different preliminary components: food source, employed bees and unemployed bees.
Swarm behaviour defines and enables two prominent modes of operation: recruitment to and rejection of a food source. The ABC system prioritises food sources based on factors that include proximity to the hive, the density of energy and the ease of extracting that energy. These factors determine the profitability of choosing a particular food source.
The employed bees exploit a particular food source and carry corresponding metadata about that source in terms of distance, direction, and profitability. The computation generates a probability factor that reflects the effectiveness of the food source. The unemployed bees, or foragers, always try to discover new food sources to exploit. Two types of unemployed bees exist: onlookers and scouts, which have different roles in exploiting food sources. The role of scouts is to explore the environment surrounding the hive to discover new food sources. Onlookers reside within the hive and decide which food sources to pursue based on information gathered from employed foragers. The exchange of information among bees determines the collective knowledge, which provides the global best solution, or the optimal food source in this context. The quality of a food source (the fitness of the solution) is represented conceptually by its amount of nectar. Only one employed bee is associated with a food source, and with respect to knowledge of the old food source, the bee determines a new food source. The working procedure of ABC applies optimization to converge to the global best candidate solution.

Let the initial food source positions in a search area be defined by positional vectors $P_i$, where $i = 1, \dots, S$ and $S \in \mathbb{Z}^+$ (positive whole numbers starting from 1). Here $P_i$ is an n-dimensional row vector denoting optimized parameters and $S$ is the size of the onlooker or employed bee population. The initial food source positions can be computed with respect to the nectar amount and random number generation, which can be represented with the analytical expression:

$$f_1(P):\quad P_{i,j} = P_{\min,j} + \mathrm{rand}(0,1)\,\left(P_{\max,j} - P_{\min,j}\right) \tag{1}$$

Equation (1) shows that the positional vector of the initial food source is bounded.
Here, $j$ indexes the dimensions of the vector. By incorporating Equation (1), the ABC system lets an employed bee determine whether the nectar amount computed for the newly explored food source is greater than that of the current food source. If the nectar value of the new food source is higher, the employed bee adopts the new food source; otherwise, it retains the previous one. Here, $\Delta_{i,j}$ represents a random number in [-1, 1]. Every employed bee shares information regarding this search process with the onlookers; the onlookers collectively gather the information to determine the optimal selection of food sources with a probability factor and fitness value.
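The bounded initialization and the keep-the-better rule described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the standard ABC form $P_{i,j} = P_{\min,j} + \mathrm{rand}(0,1)(P_{\max,j} - P_{\min,j})$ for Equation (1), and the names `init_food_sources`, `greedy_select` and `nectar` are hypothetical.

```python
import random


def init_food_sources(num_sources, lower, upper, rng=random):
    """Draw each coordinate uniformly inside [lower[j], upper[j]].

    Assumed form of Equation (1):
    P[i][j] = P_min[j] + rand(0, 1) * (P_max[j] - P_min[j])
    """
    if len(lower) != len(upper):
        raise ValueError("lower and upper must have the same length")
    sources = []
    for _ in range(num_sources):
        point = [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
        sources.append(point)
    return sources


def greedy_select(current, candidate, nectar):
    """Keep the candidate only if its nectar (fitness) is strictly higher."""
    return candidate if nectar(candidate) > nectar(current) else current
```

Because every coordinate is drawn between its lower and upper bound, each initial food source stays inside the search area, which is the boundedness property noted for Equation (1).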
The information exchange between the employed and onlooker bees helps the ABC system to determine the fitness of the function corresponding to the nectar value of a new food source.
Here, each employed bee with its current position $P_i$ explores a new food source and validates it. The amount of nectar of the food source, $f_2(P)$, can be computed with the mathematical expression in Equation (2), in which $P_{i,j}(t+1)$ is the nectar amount at the new place, $P_{i,j}(t)$ is the previous value of the amount of nectar and $\Delta_{i,j}$ represents a random number in [-1, 1]. $P_{k,j}(t)$ is selected randomly with $k \neq i$, and $j$ is the random dimension index. Using the position vector, each employed bee computes the updated location of the food source. Here, $P_i$ represents the position of onlooker bees and $P_k$ is a randomly chosen employed bee. The exchange of information between them determines the updated food source information and location. Here, $t$ represents the number of time instances for a cycle of execution.
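The neighbourhood search described above can be sketched in a few lines. The sketch assumes the standard ABC update $P_{i,j}(t+1) = P_{i,j}(t) + \Delta_{i,j}\,(P_{i,j}(t) - P_{k,j}(t))$ as the form of Equation (2); the function name `neighbour_solution` is hypothetical, and clipping the result back into the search bounds is left out for brevity.

```python
import random


def neighbour_solution(sources, i, rng=random):
    """Perturb one randomly chosen dimension of food source i.

    Assumed form of Equation (2):
    P[i][j](t+1) = P[i][j](t) + delta * (P[i][j](t) - P[k][j](t))
    """
    if len(sources) < 2:
        raise ValueError("need at least two food sources")
    # Partner k must differ from i; j is a random dimension index.
    k = rng.choice([idx for idx in range(len(sources)) if idx != i])
    j = rng.randrange(len(sources[i]))
    delta = rng.uniform(-1.0, 1.0)
    candidate = list(sources[i])  # copy so the current source is untouched
    candidate[j] = sources[i][j] + delta * (sources[i][j] - sources[k][j])
    return candidate
```

Only one coordinate changes per call, so the candidate stays close to the current source; the step size shrinks automatically as the sources converge toward each other.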
The fitness of the solution, which indicates the quality of the newly found solution, is computed with Equation (3), where $\alpha = 0.5$; $E_{NCavg}$ and $E_{CHavg}$ are the average values of residual energy for noncluster nodes and CHs, respectively; and $D_{NCavg}$ and $D_{CHavg}$ are the average distance of noncluster nodes to the base station and the average distance of CHs to the base station, respectively. During the exploration of new food sources, the onlooker bees stay in the hive, communicate with the employed bees, gather information about various food sources in a cycle and update their knowledge. This determines a probabilistic factor calculated from the fitness of the solution, which decides which employed bee has to move to the new food source location most suitable for the entire swarm. It can be computed with the analytical model of fitness factors $\varepsilon_i$ (Equation (4)).
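The probabilistic choice made by the onlookers can be illustrated with a roulette-wheel selection. This sketch assumes the usual ABC form $\varepsilon_i = \mathrm{fit}_i / \sum_n \mathrm{fit}_n$ for Equation (4) and non-negative fitness values; the function names are hypothetical.

```python
import random


def selection_probabilities(fitness):
    """Normalise non-negative fitness values into probabilities eps_i."""
    total = sum(fitness)
    if total <= 0:
        # Degenerate case: fall back to a uniform choice.
        return [1.0 / len(fitness)] * len(fitness)
    return [f / total for f in fitness]


def choose_source(fitness, rng=random):
    """Roulette-wheel pick: index i is drawn with probability eps_i."""
    probs = selection_probabilities(fitness)
    threshold = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if threshold < cumulative:
            return i
    return len(probs) - 1  # guard against floating-point round-off
```

Fitter food sources attract proportionally more onlookers, while weaker sources still receive occasional visits, which preserves some exploration.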
If, during the entire food source search with collective information gathering, communication, and computation, a bee fails to provide information about an improved food source after a predetermined set of decisions, the ABC system detects that it cannot provide an improved solution. Therefore, the bee becomes a scout and applies a random search to explore new food sources around the hive. The mathematical model in this context is formulated in Equation (5), where $r$ is a random number between 0 and 1.

These computational procedures are involved in ABC optimization to solve a distributed optimization problem in which a set of constraints and trade-offs must be modelled properly. A WSN relies on distributed computing and networking to reach a specific goal of interest collectively. Therefore, the concept of ABC optimization can be applied in WSN to optimize energy and communication performance. The next section elaborates on how ABC optimization is essential to model the enhanced clustering operation of WSN.
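The scout step can be sketched as below. It assumes that Equation (5) re-draws an abandoned source inside the search bounds, $P_{i,j} = P_{\min,j} + r\,(P_{\max,j} - P_{\min,j})$, and that abandonment is detected with a per-source trial counter compared against a `limit`; the counter, the `limit` parameter and the function name are assumptions made for illustration.

```python
import random


def scout_phase(sources, trials, limit, lower, upper, rng=random):
    """Replace every source whose trial counter exceeds `limit`.

    trials[i] counts consecutive failed improvements of source i.
    """
    for i, stalled in enumerate(trials):
        if stalled > limit:
            # Assumed Equation (5): random re-draw inside the bounds.
            sources[i] = [lo + rng.random() * (hi - lo)
                          for lo, hi in zip(lower, upper)]
            trials[i] = 0  # the fresh source starts with a clean record
    return sources, trials
```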
The study explores the scope of clustering optimization in the WSN routing and data aggregation phase, which mimics the exact previous pattern of honeybees while searching for cost-effective food sources.
The energy model used here is shown in Equations (6)-(10). $T_{CH}$ is the total energy of the CH nodes, $E_{el}$ is the electrical energy, $\varepsilon_{fs}$ is the amplification energy, $n_i$ is the number of ordinary nodes, $D_i$ is the transmission distance of nearer nodes, $l$ is the number of data bits, $N$ is the total number of nodes, $T_O$ is the total energy of the ordinary nodes, $d_{ji}$ is the distance of node $j$ from node $i$ and $E$ is the energy consumption of the network. The path loss analytical model used in this work is the radiative energy transfer model [26].
The depletion of energy by CH node $j$ owing to the reception of data is given by $E_{RCH_j}$ (Equation (6)).
The depletion of energy by CH node $j$ owing to the transmission of data is given by $E_{TCH_j}$ (Equation (7)).
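To make the roles of $E_{el}$, $\varepsilon_{fs}$, $l$ and the transmission distance concrete, the sketch below uses the widely cited first-order radio model with free-space ($d^2$) amplification. This is an assumption: it may differ from the exact form of Equations (6) and (7) and from the path loss model of [26]. The function and parameter names are hypothetical.

```python
def ch_receive_energy(bits, e_elec, num_members):
    """Energy a CH spends receiving `bits` from each of its members.

    Assumed first-order model: l * E_el per received packet.
    """
    return bits * e_elec * num_members


def ch_transmit_energy(bits, e_elec, eps_fs, distance):
    """Energy a CH spends sending `bits` over `distance`.

    Assumed first-order model with free-space loss:
    l * (E_el + eps_fs * d^2).
    """
    return bits * (e_elec + eps_fs * distance ** 2)
```

Under this assumed model the transmission cost grows quadratically with distance, which is why selecting CHs close to their members and to the base station reduces the total energy drawn per round.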

| SCOPE OF ABC OPTIMIZATION IN ENERGY AWARE CLUSTERING
The ABC algorithm can overcome bottlenecks that often appear in PSO- and BFAO-based clustering approaches. Ensuring a higher degree of reliability in a resource-constrained environment such as WSN is challenging. It requires a cost-effective design and modelling of energy-efficient optimal routing under energy, processing and bandwidth constraints. Probabilistic optimization approaches assisted by cluster-based communication outperform direct transmission and flooding-based routing to a significant extent. The scope of ABC-based optimization for improving the network lifetime of a WSN is greater than that of existing PSO- and BFAO-based optimization principles. To overcome limitations that arise with PSO- and BFAO-based clustering and routing designs, the study emphasises strengthening the ABC design and clustering process and offers multidimensional solutions. The basic methodology associated with ABC can be applied to the clustering problem of WSN, a critical constrained-resource optimization problem that includes energy, bandwidth, data delivery performance, and so on. The core design similarity for the formulated optimization problem quantifies swarm-based collective behaviour for cluster-based communication. Here, self-organization and division of labour have a significant role.

| ANALYTICAL MODELLING OF OCABC
The study explores the strength factors of ABC optimization modelling and formulates the optimization problem for WSN clustering and routing.
The system modelling of OCABC imposes mapping of bee colony optimization with an energy-efficient clustering principle that ensures cost-effective communication with higher data delivery.
The system modelling of OCABC aims to allocate the energy dissipation factor with minimal cost during routing and clustering while satisfying the constraints. The prime concern is to optimize the total energy performance without compromising network throughput. Energy use is minimised during cluster-based routing, in which CH selection solely follows the ABC methodology. The degree of variance factors helps to formulate the objective function where the constraints can be defined differently. The study formulates the clustering problem with an objective function that minimises the total energy consumption of all nodes subject to the constraints.
The study hypothetically and analytically conceptualises the OCABC design principle to optimize clustering performance in WSN. The computational execution flow follows the set of procedures described below.

| Formulation: wireless sensor network model
The system deploys the network with a set of sensor nodes and employs a static base station in the initial phase of computation. It also defines the core components of OCABC clustering by mapping the design attributes and then initialising clustering parameters such as the probability of a node becoming CH ($P_{CH}$), the initial energy (J) of a node ($\alpha$) and the packet length (bits). A mapping function $f_{MAP}$ relates the core component of OCABC clustering with ABC core design attributes to solve the multimodel energy problem (Figure 1).
The design principle of OCABC is that an effective radio scheduling schema is implemented to schedule the data exchange intelligently among member nodes and CH. It also imposes solutions pertaining to the active and sleep modes of scheduling.

| Hypothetical consideration
The design methodology is defined so that a bee colony can be mapped to a cluster-based WSN. The colony size is the sum of the number of employed bees and the number of onlookers. Assume that nodes A and B reside in close proximity within a cluster. The computational model initially treats all nodes as ordinary sensor nodes and then selects the CH based on a predefined set of criteria and attributes. Thus, initially it is not known which of node A and node B will become CH at each communication round. The formulated OCABC computational execution flow converges toward the best possible solution with a higher fitness value. A member node may initiate the search for the best CH solution, in which route establishment and other processes consume a minimal amount of energy. Once a CH is elected, all other member nodes update their buffers and transmit corresponding data by defining membership variables.

| OCABC formulated design and implementation
The design shows the way OCABC imposes the optimal design and modelling pertaining to the clustering operation, which ensures cost-effective and energy-aware communication.
Selection of the optimal CH is done by enhancing the strength factor of the OCABC algorithm. It also shows how the system converges with minimum iterative steps toward the best possible solution of CH, which allocates less transmission power to amplify the signal data. Therefore, energy allocation for data transmission and processing is computationally reduced with an optimized formulation of the cost-effective solution (sol). It also shows how the concept of communication, learning, and computation is addressed here by cluster-based mapping. Here, a node can become CH based on certain parameters in which the cost-effective allocation of energy with a higher degree of data transmission has a crucial role. Thus, maintaining buffer memory in each node is necessary for every learning and communication process.
The honeybees communicate with each other at every stage of foraging to find the best possible food source. With simplicity, flexibility and fewer control parameters, OCABC outperforms PSO and BFAO. Optimization is obtained by sharing the solution and searching in the proper direction with updates provided instantly by onlooker bees. The computation finally provides numerical analysis with respect to the global optimal solution provided by onlookers and employed bees. The system finally computes a set of parameters such as throughput and energy variance to justify the quantitative outcome obtained by simulating OCABC in a numerical computing platform.
The final outcome is compared with existing optimization-based clustering approaches such as PSO and BFAO for better inference and validation of the OCABC model.

Algorithm 1: OCABC algorithm
Generate the initial population $x_i$, $i = 1, \dots, S$, $S \in \mathbb{Z}^+$
Evaluate the population
Initialise C to 1
Repeat
    FOR every employed bee
        Produce a new solution $f_2(P)$ (see Equation (2))
        Calculate fitness
        Keep the better solution
    FOR every onlooker bee
        Choose a solution based on probability (see Equation (4))
        Produce new solutions $f_1(P)$ (see Equation (1))
        Calculate fitness
        Select the better solution
    If a scout has abandoned a solution, replace it with a new solution (see Equation (5))
    Memorise the best solution obtained so far

The optimal solution obtained provides the selection of CHs, which are sorted in order of fitness value. Clusters are formed by taking each CH from the list and including the nodes within the transmission range of that CH. When all nodes are included in some cluster, the communication round begins on event occurrence.
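The cluster formation step that follows CH selection can be sketched as below. The sketch is illustrative only: the data layout (`nodes` as a mapping from node id to coordinates, `ch_candidates` as `(node_id, fitness)` pairs), the Euclidean range test, and the rule that a candidate already absorbed into a fitter CH's cluster is skipped are all assumptions.

```python
import math


def form_clusters(nodes, ch_candidates, tx_range):
    """Assign nodes to CHs, visiting CHs from highest to lowest fitness.

    nodes: {node_id: (x, y)}; ch_candidates: [(node_id, fitness), ...].
    Returns (clusters, unassigned), where clusters maps each CH id to
    its member ids and unassigned holds nodes no CH could reach.
    """
    unassigned = set(nodes)
    clusters = {}
    ranked = sorted(ch_candidates, key=lambda c: c[1], reverse=True)
    for ch_id, _fitness in ranked:
        if ch_id not in unassigned:
            continue  # already a member of a fitter CH's cluster
        unassigned.discard(ch_id)
        members = [n for n in sorted(unassigned)
                   if math.dist(nodes[n], nodes[ch_id]) <= tx_range]
        unassigned.difference_update(members)
        clusters[ch_id] = members
        if not unassigned:
            break  # every node belongs to some cluster
    return clusters, unassigned
```

An empty `unassigned` set corresponds to the condition under which the communication round can begin; a non-empty set signals that more CHs or a larger transmission range would be needed.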

| NUMERICAL ANALYSIS AND EXPERIMENTATION
This section assesses the performance of OCABC modelling and validates its outcome with respect to an elaborate comparative study. It is performed by referencing two of the most popular optimization approaches: PSO and BFAO. The simulation is done using MATLAB. Table 1 shows the initial values of the parameters.

Parameter | Values
Initial energy of sensor node | 0.

Figure 2 clearly shows that the formulated OCABC system keeps the network working for a longer duration than BFAO and PSO. Therefore, the lifetime of the network is significantly improved in the case of the formulated ABC. The network lifetime is measured as the time until only half of the total nodes remain in the network.
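The lifetime measure can be computed from a per-round trace of alive nodes. The following sketch assumes such a trace is available from the simulation; the function name and the 1-based round numbering are illustrative choices.

```python
def lifetime_rounds(alive_per_round, total_nodes):
    """Return (first_node_dead_round, half_nodes_dead_round), 1-based.

    alive_per_round[r-1] is the number of alive nodes after round r.
    An entry is None if the event never occurs in the trace.
    """
    first_dead = None
    half_dead = None
    for rnd, alive in enumerate(alive_per_round, start=1):
        if first_dead is None and alive < total_nodes:
            first_dead = rnd
        if half_dead is None and alive <= total_nodes / 2:
            half_dead = rnd
            break  # lifetime ends once half of the nodes are dead
    return first_dead, half_dead
```

For example, `lifetime_rounds([4, 4, 3, 3, 2, 1], 4)` returns `(3, 5)`: the first node dies in round 3 and half of the nodes are dead by round 5.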
The cluster selection based on the ABC concept converges to the best solution in a way that strengthens data delivery performance, measured as throughput. The throughput curve in Figure 3 shows that OCABC achieves a better outcome than BFAO and PSO. Figure 4 shows that the residual energy of the network is higher for OCABC than for PSO and BFAO in all communication rounds. Figure 5 shows that the variance of energy is stable for OCABC, whereas it fluctuates strongly for PSO and BFAO.
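The two energy metrics plotted per round can be computed as follows. This is a small sketch that assumes the population variance over all nodes is intended; whether the original simulation uses the population or the sample variance is not stated, and the function name is hypothetical.

```python
import statistics


def residual_energy_stats(residual):
    """Return (total residual energy, population variance) for one round.

    residual: list of per-node residual energies in joules.
    """
    total = sum(residual)
    variance = statistics.pvariance(residual)
    return total, variance
```

A low variance indicates that energy is drained evenly across nodes, which is the behaviour attributed to OCABC above.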
FIGURE 2 Assessment of number of dead nodes. BFAO, bacterial foraging algorithm for optimization; OCABC, optimal clustering using the Artificial Bee Colony; PSO, particle swarm optimization

FIGURE 3 Assessment of delivery performance (throughput). BFAO, bacterial foraging algorithm for optimization; OCABC, optimal clustering using the Artificial Bee Colony; PSO, particle swarm optimization

FIGURE 4 Assessment of residual energy. BFAO, bacterial foraging algorithm for optimization; OCABC, optimal clustering using the Artificial Bee Colony; PSO, particle swarm optimization

FIGURE 5 Assessment of variance of residual energy. BFAO, bacterial foraging algorithm for optimization; OCABC, optimal clustering using the Artificial Bee Colony; PSO, particle swarm optimization

Table 2 shows the average communication round at which the first node dies and half of the nodes die when simulation is run 50 times with an initial 100 nodes in the network. OCABC improves the lifetime of the network compared with the BFAO, PSO, FCGWO and FGF algorithms. Table 3 shows that OCABC performs better than BFAO and PSO in terms of improving the network lifetime with the dense and sparse deployment of nodes in the 100 × 100 m area.

| CONCLUSION AND FUTURE WORK
This article proposes optimized modelling of the clustering approach, OCABC, which attempts to balance the trade-off between energy and communication performance in WSN. The design and modelling of OCABC imposes lightweight computation. The design and the conceptual notion of OCABC consider the baseline principle of ABC with respect to different strength factors. Validation of the OCABC model was performed based on a quantitative study showing that OCABC outperforms PSO and BFAO clustering schema with respect to both energy and data delivery performance. Implementing the bio-inspired techniques for clustering wireless sensor nodes in Internet of Things applications can be considered as future work.