Age-optimal path planning for finite-battery UAV-assisted data dissemination in IoT networks

Unmanned aerial vehicles have been widely used to assist wireless sensor networks due to ever-increasing demands for Internet-of-things applications. To support timely delivery of information, characterised by a recently introduced metric termed the age of information, this paper explores the freshness of data in an unmanned aerial vehicle assisted wireless sensor network. Specifically, the authors consider a limited-energy unmanned aerial vehicle moving towards the Internet-of-things devices to disseminate data packets provided by a data centre. Since the unmanned aerial vehicle cannot visit all the nodes in each flight turn due to its finite-sized battery, the best sequence of nodes, from an age of information perspective, should be selected at the beginning of each flight turn. Thus, an unmanned aerial vehicle trajectory planning scheme for data dissemination is proposed, taking into account both maximal use of energy and freshness of data. To minimise the weighted sum age of information metric, by utilising the well-known knapsack and travelling salesman problems, the authors propose an algorithm to efficiently select devices and the corresponding visiting order in each flight turn. Finally, simulation results are provided to highlight the performance of the proposed algorithm against other benchmark algorithms and to investigate the effect of the limited energy of the unmanned aerial vehicle, the number of nodes and the number of flight turns.


INTRODUCTION
Recently, unmanned aerial vehicles (UAVs) have attracted a lot of research attention in many Internet-of-things (IoT) services such as real-time monitoring of traffic [1], disaster management [2] and crowd surveillance [3]. Due to their high mobility and line-of-sight (LoS) communication opportunities, UAVs can be used as flying base stations (BSs) for data collection or dissemination, specifically enhancing connectivity when there is no sufficiently reliable communication link between the IoT devices and a BS [4]. However, the deployment of UAVs as aerial BSs leads to several challenges such as optimal trajectory planning, placement and energy consumption [5]. These challenges have been extensively studied in the literature from various perspectives: minimising delay and flight time or maximising network coverage and throughput. For instance, Mozaffari et al. [6] studied optimal deployment of UAVs to maximise network coverage. In [7], the authors minimised the UAV's total flight time from a starting point to a destination while allowing each sensor to successfully upload a certain amount of data using a given amount of energy. In [8], the authors studied the throughput maximisation problem in mobile-relaying systems by optimising the source/relay transmit power along with the relay trajectory. However, the performance metrics used in these works cannot capture the freshness demands of emerging real-time applications.
To characterise information freshness, the age of information (AoI) metric was first introduced in [9] and defined as the time elapsed since the latest received status update packet at a destination node was generated at the source node. Then, other AoI-related metrics were also introduced in the variations of queuing models (refer to [10] for a comprehensive survey). Different from the traditional metrics, AoI captures freshness from the receiver's perspective and recent research advances on AoI suggest that many well-known design principles of traditional data networks need to be re-examined for enhancing information freshness in rapidly emerging real-time applications [11].
Due to the significance of AoI and the ubiquitous use of UAVs in IoT systems, many related studies on AoI-oriented UAV-assisted networks have been conducted. For instance, Abd-Elmagid and Dhillon [12] and Cao et al. [13] used a UAV as a mobile relay for a source-destination pair and designed the flight trajectory with the goal of minimising the peak AoI. In [14], multiple source-destination pairs were connected via a relay, and the joint optimisation of the sampling and updating policy of the relay was investigated from the perspective of the average sum-AoI. In an IoT network with multiple sensors, two age-optimal trajectories for collecting information from sensors, referred to as the Max-AoI optimal and Avg-AoI optimal trajectories, were designed [15]. These trajectories corresponded to a shortest Hamiltonian path and a stage-weighted shortest Hamiltonian path (WSHP) in the wireless sensor network, respectively. This work was extended in [16], minimising the maximum AoI in more general scenarios where the UAV collects data from a set of sensors when hovering at each collection point. Similarly in [17], an online AoI-based trajectory planning for UAV-assisted data IoT networks was investigated, where the traffic generation pattern and network topology were unknown. Different from the commonly formulated problems in the literature, Li et al. [18] studied the problem of minimising the number of expired data packets in a UAV-assisted system. The authors relaxed the original problem into a Min-Max-AoI optimal path scheme due to the complexity of the constraints. Nonetheless, in the aforementioned works, the energy consumption of the UAV was not considered in the design of age-optimal trajectories, although energy and power constraints have a significant impact on the AoI characteristics.
On the other hand, AoI-aware trajectory planning for energy-limited UAVs has also been studied in the literature [19][20][21][22][23][24]. For instance, Jia et al. [19] employed the concept of age to study a UAV path planning and data acquisition problem by jointly considering the data acquisition mode selection, the energy consumption at each node and the age evolution of the collected information. In [20], the authors formulated a joint sensing time, transmission time, UAV trajectory and task scheduling optimisation problem to minimise the AoI in a model of the cellular Internet of UAVs. They decoupled the problem into two parts and solved the subproblems using extreme principles and a dynamic programming (DP) approach. In [21], the authors considered a UAV-assisted network and modelled the limited battery capacity of the UAV as a time constraint. They formulated an age minimisation problem as a finite-horizon Markov decision process and utilised reinforcement learning (RL) due to the extremely large state space of their problem. Similarly, the works in [22][23][24][25][26] proposed RL techniques to learn age-optimal transmission policies considering the UAV energy consumption. Meanwhile, the concept of AoI has been investigated in cooperative relaying, and joint sensing-transmission protocols were designed to schedule UAVs performing tasks [26][27][28].
However, in most of the existing works, the UAV collects or disseminates data only once and then flies back to the BS. In other words, most of the analyses in the literature are one-shot, while this paper proposes a multi-shot analysis, that is, a multi-turn flight. Besides, little attention is devoted to the optimal selection of the tasks done by the UAV. In particular, due to the limited energy of the UAV, the UAV may not be able to visit several nodes in a flight turn and therefore, efficient node selection and visiting order are critical in the limited-energy regime. However, in the literature, the infinite-sized battery UAV visits all nodes or performs all tasks. Even for a UAV with a finite-sized battery, by dynamically changing the velocity of the UAV or adopting specific data acquisition strategies, all the devices are visited and there is no need for device selection in each flight turn. In this paper, we study age-optimal data dissemination in a UAV-assisted IoT network while considering both maximal use of energy and freshness of data. Specifically, we study a scenario in which a limited-battery UAV moves towards IoT devices to disseminate data packets provided by a data centre (DC). In each flight turn, the UAV takes off from the DC, disseminates data to the selected IoT devices and finally returns to the DC. Due to the limited battery, the UAV cannot visit all nodes in each flight turn and therefore the best devices should be selected from the information freshness perspective. As multiple devices can be selected in each flight turn and the number of visiting sequences for each set of selected devices is extremely large, an efficient algorithm for trajectory planning is of paramount importance. Under such a setting, the main contributions of this paper are summarised as follows.
• An optimisation problem to minimise the weighted sum-AoI of the IoT devices in a UAV-assisted data dissemination network is studied. The formulated problem jointly considers the UAV's trajectory, the limited energy of the UAV and the age evolution of each IoT device in multiple flight turns.
• To propose an efficient solution, we have divided the optimisation problem into two parts: device selection and determination of the visiting order. To obtain the age-optimal policy, we have utilised two well-known and still non-straightforward problems: the knapsack problem (KP) for device selection and the travelling salesman problem (TSP) for the visiting order. Since the formulated problem is NP-hard, a solution framework based on DP is also proposed.
• We have shown that the proposed solution outperforms other strategies in terms of minimising the weighted sum-AoI and the maximum age, especially in the middle-energy regime.
The remainder of the paper is organised as follows. In Section 2, we present the proposed system model. The problem formulation is presented in Section 3. The solution and the proposed algorithm are discussed in Section 4. We present numerical results and discussions in Section 5, and finally the paper is concluded in Section 6.

SYSTEM MODEL
As depicted in Figure 1, we consider a UAV-assisted data dissemination network consisting of a finite-battery UAV, a DC $v_0$, and a set of single-antenna IoT devices represented by $\mathcal{V} = \{v_1, v_2, \dots, v_M\}$. It is assumed that the IoT devices are randomly located on the ground. The union of the IoT devices and the DC is $\mathcal{V} \cup \{v_0\}$, each node $v_i$ in this union has a location on the ground denoted by $s_i$, and $d_{i,j} = \lVert s_i - s_j \rVert$ is the Euclidean distance between $v_i$ and $v_j$. This paper uses the notations that are listed in Table 1 for ease of reference. The DC continuously generates packet updates for each IoT device $v_i$, and the UAV flies to send status updates to the corresponding IoT devices. The energy consumption of the UAV mainly consists of two parts: the first part is the data communication related consumption and the second part is the propulsion energy consumption. This paper assumes that the two parts are supplied by distinct energy sources. We also assume that $E_{\max}$ and $E'_{\max}$ are the limited energy available for the first and second parts, respectively. This work simply models the propulsion energy as linear in the distance travelled: a displacement of $d$ units of distance consumes $d$ times the energy consumed over a reference (unit) displacement. Moreover, the altitude and the velocity of the flight are fixed and are, respectively, given by $h$ and $V$.
Since the UAV has a limited battery with capacity $E_{\max}$ and a limited fuel supply with capacity $E'_{\max}$, it may not be able to disseminate data to all the devices. As a consequence, $N \le M$ IoT devices are selected and visited in an optimal order in each flight turn. Let $X^{(k)}_{\mathrm{tp}}$ be the ordered set of devices in the optimal trajectory in the $k$th flight turn. Then, the UAV returns to the DC for recharging and for checking whether any new packet has been generated. It is worth mentioning that the UAV receives data from the DC in a set of $M$ queues with a first-come-first-served discipline, one for each IoT device.
To denote the decision of which devices are visited in the $k$th flight turn, let us define an $(M+1)$-square matrix $X^{(k)}$ with binary elements $x^k_{i,j}$, where $x^k_{i,j} = 1$ denotes that the $i$th IoT device is visited immediately before the $j$th IoT device in the $k$th flight turn. The matrix $X^{(k)}$ must satisfy the properties (1a)-(1c).
Here, constraint (1a) indicates that the node $v_0$ belongs to $X^{(k)}_{\mathrm{tp}}$, constraint (1b) denotes visiting each device at most once during each flight turn and constraint (1c) ensures that $X^{(k)}$ specifies only one connected visiting sequence.
The wireless channel between each IoT device and the UAV is modelled by an LoS channel and the channel gain is denoted by $g = \beta_0 h^{-\alpha}$, where $\beta_0$ and $\alpha$ denote the channel power gain at a reference distance and the environmental attenuation factor, respectively. Without loss of generality and for the sake of simplicity, it is assumed that the UAV transmits its packet with constant power $P_i = P$, the system bandwidth is $B$ and the noise power at the IoT device receiver is $\sigma^2$. Thus, the data transmission rate becomes

$$R = B \log_2\left(1 + \frac{P \beta_0 h^{-\alpha}}{\sigma^2}\right), \quad (2)$$

which is equal for all of the IoT devices.
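As a quick numerical illustration of the rate expression, the following minimal sketch assumes the usual Shannon form, that is, bandwidth times $\log_2(1 + Pg/\sigma^2)$ with the fixed-altitude LoS gain described above. The function name, the argument names and all parameter values are assumptions chosen for illustration and are not the settings of Table 2.

```python
import math

def los_rate(bandwidth_hz, tx_power_w, ref_gain, altitude_m, path_loss_exp, noise_power_w):
    """Rate in bit/s over an LoS link whose gain is ref_gain * altitude**(-path_loss_exp)."""
    gain = ref_gain * altitude_m ** (-path_loss_exp)
    snr = tx_power_w * gain / noise_power_w
    return bandwidth_hz * math.log2(1.0 + snr)

# Illustrative values only (assumed, not taken from the paper).
rate = los_rate(1e6, 0.1, 1e-3, 100.0, 2.0, 1e-11)
tx_time = 1e6 / rate  # seconds needed to deliver a 1 Mbit packet at this rate
```

Because the altitude and the transmit power are fixed, the rate is a single constant, so the transmission time of a packet depends only on its length.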
In the $k$th flight turn, $L_i^k$ [bits] denotes the length of the data packet to be disseminated to the IoT device $v_i$. Besides, we define $T_i^k$, $i = 1, \dots, N$, for the best $N$ selected nodes to denote the time instants at which packets are delivered. It is also assumed that $T_0^k$ denotes the time at which the UAV takes off from the DC in its $k$th flight turn and, for simplicity of exposition, we assume that $T_0^1 = 0$.

We use AoI as a performance measure to analyse the freshness of the disseminated data. The AoI at time $t$ is defined as

$$A(t) = t - u(t), \quad (3)$$

where $u(t)$ is the time at which the most recently received packet at the destination node was generated at the DC. Based on the above definition, the AoI of device $i$, denoted by $A_i(t)$, increases linearly while the device is not visited or there is no update to be delivered. Otherwise, the age drops to the age of the head-of-line update in queue $i$. As we assume that there is no reliable connection between the UAV and the DC during the data dissemination process, the UAV should decide which devices to visit at the beginning of each flight turn. Therefore, we need the age of each node at the beginning of each flight turn. Let $A_i^k$ be the age of the $i$th device at the end of the $k$th flight turn. Its evolution is given by (4), where $T^k_{\mathrm{flight}}$ is the time spent by the UAV to visit the selected nodes in the $k$th flight turn. Moreover, $y_i^k$ is a binary variable: $y_i^k = 1$ indicates that the IoT device $i$ is selected to be visited in that turn of data dissemination, and $y_i^k = 0$ indicates that the corresponding device is not visited.
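The per-turn age evolution described above can be sketched as a simple "grow or reset" update. This is only an illustration under an assumed form of the update: a device that is not visited ages by the flight duration, and a visited device ends the turn with the age of the packet it received. The function and argument names are hypothetical.

```python
def update_ages(prev_ages, visited, flight_time, delivered_packet_ages):
    """Ages at the end of a flight turn.

    prev_ages[i]             : age of device i at the end of the previous turn
    visited[i]               : 1 if device i is visited in this turn, else 0
    flight_time              : duration of the whole flight turn
    delivered_packet_ages[i] : age, at the end of the turn, of the packet
                               delivered to device i (ignored if not visited)
    """
    return [
        packet_age if y else age + flight_time
        for age, y, packet_age in zip(prev_ages, visited, delivered_packet_ages)
    ]

# Assumed toy values: devices 0 and 2 are visited, device 1 is skipped.
new_ages = update_ages([5.0, 2.0, 7.0], [1, 0, 1], 3.0, [1.5, 0.0, 2.0])
```

In this toy run the skipped device ages from 2.0 to 5.0, while the two visited devices drop to the ages of their delivered packets.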

PROBLEM FORMULATION
Our prime objective is to find a trajectory, or equivalently a matrix $X^{(k)}$, for each flight turn such that the weighted sum-AoI is minimised under the constraint that the UAV has a finite-sized battery with battery capacity $E_{\max}$ and fuel capacity $E'_{\max}$. From (1a)-(1c) and (4), the weighted sum-AoI minimisation problem is written as (5a)-(5d), where $\omega_i$ is the importance weight of device $v_i$, and $E_i^k = P L_i^k R^{-1}$ denotes the energy needed for the $i$th packet to be transmitted by the UAV in the $k$th flight turn. The constraints (5c) and (5d) guarantee that the energy consumed for data transmission and for propulsion is less than or equal to the battery capacity and the fuel capacity of the UAV, respectively. Moreover, $s$ is the index of the flight turn at the end of which the objective function is to be minimised.
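To make the two energy constraints concrete, the sketch below evaluates, for a candidate visiting order, the communication energy (transmit power times packet length divided by rate, summed over visited devices) and the propulsion energy (proportional to the length of the closed tour from the DC). All names and numbers are hypothetical; in particular, `energy_per_unit_distance` stands in for the reference-displacement energy of the system model.

```python
import math

def energy_usage(route, packet_bits, positions, tx_power, rate, energy_per_unit_distance):
    """Return (communication energy, propulsion energy) for one flight turn.

    route lists device indices in visiting order; index 0 is the DC.
    """
    comm = sum(tx_power * packet_bits[i] / rate for i in route)
    path = [0, *route, 0]  # take off from the DC and return to it
    distance = sum(math.dist(positions[a], positions[b]) for a, b in zip(path, path[1:]))
    return comm, energy_per_unit_distance * distance

def satisfies_energy_limits(usage, e_max, e_max_prime):
    """Check the battery limit (communication) and the fuel limit (propulsion)."""
    comm, propulsion = usage
    return comm <= e_max and propulsion <= e_max_prime

# Assumed toy instance: DC at the origin and two devices.
positions = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0)]
packet_bits = [0, 1000, 2000]
usage = energy_usage([1, 2], packet_bits, positions,
                     tx_power=2.0, rate=1000.0, energy_per_unit_distance=0.5)
```

Note that the communication energy depends only on which devices are selected, whereas the propulsion energy also depends on the order in which they are visited.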
In the given problem statement, the elements of the matrix $X^{(k)}$ are binary variables. Therefore, the formulated problem falls into the category of integer optimisation problems. It can be easily proved that the problem is NP-hard. Therefore, in the following, we propose a solution based on a DP approach.

ALGORITHM DESIGN AND THE PROPOSED SOLUTION
In this section, we present an algorithm based on a DP approach. Before solving the problem, let us present a few definitions.

Definition 1 (Feasible trajectory plan, FTP). Given a set of IoT devices $\mathcal{V}$, an FTP is a matrix $X$ which satisfies (5c).

Definition 2 (Maximally FTP, MaxFTP). Given a set of IoT devices $\mathcal{V}$, a MaxFTP is an FTP such that adding any further IoT device to the trajectory makes $X$ no longer an FTP.
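The two definitions can be checked mechanically. The sketch below works on the set of selected devices rather than on the matrix $X$, since constraint (5c) depends only on which devices are visited; the function names and the toy energies are assumptions made for illustration.

```python
def is_ftp(selected, energy, e_max):
    """Feasible: the total transmission energy of the selected devices fits the battery."""
    return sum(energy[i] for i in selected) <= e_max

def is_max_ftp(selected, energy, e_max):
    """Maximally feasible: feasible, and no unselected device can still be added."""
    if not is_ftp(selected, energy, e_max):
        return False
    spare = e_max - sum(energy[i] for i in selected)
    return all(energy[i] > spare for i in range(len(energy)) if i not in selected)

# Assumed toy instance: per-device transmission energies and battery capacity.
energy = [4, 3, 2]
e_max = 6
```

For example, selecting only device 0 is feasible but not maximal (device 2 still fits), whereas selecting devices 0 and 2 uses the battery completely.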
As it is not reasonable for the UAV to return to the DC while still having the energy for data transmission, the objective is to find the age-optimal trajectory over MaxFTP matrices. It can also be seen that the form of problem (5a) is similar to both 0-1 KP, as a decision should be made to select which IoT devices to visit (with a slight difference in our problem in which we have to minimise an objective function rather than maximising it), and TSP as it is needed to visit the selected nodes in the best order. There already exist many forms and solutions for each of the mentioned problems individually (refer to [29,30] for a comprehensive survey), but our problem is a combination of both well-known problems.
Thus, we decouple the problem into two parts: one is for device selection which is modelled by a KP, and the other is for determining the visiting order of the selected devices modelled as a TSP. Due to this decoupling, it can be seen that constraint (5c) is related to the device selection part while constraint (5d) is related to the visiting order part. Moreover, the data communication energy consumption is much smaller than the propulsion energy consumption. Therefore, constraint (5c) is more restrictive than constraint (5d). As a consequence, we first neglect constraint (5d) to propose an algorithm, and then the proposed algorithm is modified to consider constraint (5d).
In the classical KP, one packs a fixed-size knapsack with capacity $C$ by selecting some items, each having a profit $p_i$ and a weight $w_i$, to maximise the total profit of the selected items. The simplest form of this problem, the 0-1 KP, can be formulated as

$$\max_{y_1, \dots, y_M} \; \sum_{i=1}^{M} p_i y_i \quad (6a)$$
$$\text{s.t.} \quad \sum_{i=1}^{M} w_i y_i \le C, \qquad y_i \in \{0, 1\}, \; i = 1, \dots, M.$$

Here, there is a slight difference. In our problem, the objective function (the weighted sum-AoI) is to be minimised, and the selected items should be maximally feasible. To handle this in each flight turn, it is sufficient to assign, as the profits of the KP, the weighted ages of the nodes in the previous flight turn, as presented in the following lemma.

Lemma 1. Given $A_1$ and $A_2$, such that $A_1 < A_2$, as the ages of nodes $v_1$ and $v_2$ right before time $t$, respectively, disseminating data to $v_2$ rather than $v_1$ at time $t$ results in a lower sum-AoI at time $T > t$.

Proof. The age of the node that receives the update at time $t$ drops to the age of the delivered packet and then grows by $T - t$; assuming equally fresh packets for both nodes, this contribution is the same whichever node is selected. The age of the other node keeps growing linearly, so it contributes $A_1 + (T - t)$ if $v_2$ is selected and $A_2 + (T - t)$ if $v_1$ is selected. Due to the condition $A_1 < A_2$, the first value is smaller. □

Remark 1. Note that in the case of the weighted sum-AoI, the comparison would be between $\omega_1 A_1$ and $\omega_2 A_2$ instead of $A_1$ and $A_2$.
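A small numerical check of Lemma 1 may help. The sketch assumes, for illustration only, that the delivered packet has the same age (zero by default) regardless of which node receives it; the function name and the values are hypothetical.

```python
def sum_aoi_at(ages, updated, elapsed, delivered_age=0.0):
    """Sum-AoI after `elapsed` time units when only node `updated` is refreshed now."""
    total = 0.0
    for i, age in enumerate(ages):
        start = delivered_age if i == updated else age
        total += start + elapsed  # every age grows linearly after the update instant
    return total

ages = [2.0, 9.0]                           # A_1 < A_2
update_older = sum_aoi_at(ages, 1, 4.0)     # refresh the node with the larger age
update_younger = sum_aoi_at(ages, 0, 4.0)   # refresh the node with the smaller age
```

The difference between the two outcomes equals $A_2 - A_1$, which is why the age before the flight turn is a natural profit for the selection step.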
Let $C$ denote the knapsack capacity and let $w_i^k$ and $p_i^k$ denote the weight and the profit of each IoT device, respectively, in the $k$th flight turn. To decide which devices to select, we assign the following parameters of the KP (6a) based on our system model:

$$p_i^k = \omega_i A_i^{k-1}, \qquad w_i^k = E_i^k, \qquad C = E_{\max}. \quad (7)$$

To select the devices using the KP, let $m[i, w]$ denote the maximum profit that can be attained with weight less than or equal to $w$ using the first $i$ items. Then, $m[i, w]$ can be defined recursively with $m[0, w] = 0$ as follows:

$$m[i, w] = \begin{cases} m[i-1, w], & w_i^k > w, \\ \max\left\{ m[i-1, w], \; m[i-1, w - w_i^k] + p_i^k \right\}, & w_i^k \le w, \end{cases} \quad (8)$$

where the solution of this subproblem is found by calculating $m[M, C]$.

After selecting the devices for data dissemination in each flight turn, the visiting order needs to be determined. This is the same as finding the stage-WSHP studied by Liu et al. [15], except for a slight modification in the stage-weighted distance due to the importance weights of the devices. For further details, refer to [15].
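The recursion for $m[i, w]$ is the standard 0-1 knapsack DP. The following minimal sketch fills the table and traces back the selected devices; it assumes integer weights and capacity (after scaling) and uses made-up profits that play the role of weighted ages. The function name and the toy values are illustrative assumptions.

```python
def knapsack_select(profits, weights, capacity):
    """0-1 knapsack by DP; weights and capacity must be non-negative integers.

    Returns (best total profit, sorted list of selected item indices).
    """
    n = len(profits)
    m = [[0.0] * (capacity + 1) for _ in range(n + 1)]  # m[0][w] = 0
    for i in range(1, n + 1):
        p, w = profits[i - 1], weights[i - 1]
        for c in range(capacity + 1):
            m[i][c] = m[i - 1][c]                       # item i is skipped
            if w <= c:                                  # item i fits: try taking it
                m[i][c] = max(m[i][c], m[i - 1][c - w] + p)
    # Trace back which items produced m[n][capacity].
    chosen, c = [], capacity
    for i in range(n, 0, -1):
        if m[i][c] != m[i - 1][c]:
            chosen.append(i - 1)
            c -= weights[i - 1]
    return m[n][capacity], sorted(chosen)

# Assumed toy instance: profits are weighted ages, weights are scaled energies.
best_profit, selected = knapsack_select([12.0, 10.0, 9.0, 4.0], [5, 4, 3, 2], 7)
```

When all profits are strictly positive, an optimal selection cannot be extended by another device that still fits, so the selected set is also maximally feasible in the sense of Definition 2.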
To determine the visiting order, assume that $v_1, \dots, v_N$ are selected in the $k$th flight turn. The time elapsed from $T_i^k$ to $T_{i+1}^k$ consists of two parts: the time spent for data transmission at node $v_i$, denoted by $T_i^{k,\mathrm{tr}}$, and the flight time from node $v_i$ to node $v_{i+1}$, denoted by $T_{i,i+1}^{k,\mathrm{fl}}$. Therefore, we have

$$T_{i+1}^k - T_i^k = T_i^{k,\mathrm{tr}} + T_{i,i+1}^{k,\mathrm{fl}} = \frac{L_i^k}{R} + \frac{d_{i,i+1}}{V} \triangleq \tau_{i,i+1}^k. \quad (9)$$

To track the weighted sum-AoI of the selected devices, we first derive $T_i^k$ based on definition (9), which gives (10). Hence, (11) follows, where $\tau_{N,N+1}^k \triangleq L_N^k / R + d_{N,0} / V$. Thus, the weighted sum-AoI for the selected devices is formulated as (12).

Let $g_k(i, S)$ denote the minimum weighted cost of the path starting from node $v_i$, passing through all nodes in the set $S$ exactly once and returning to the DC $v_0$ in the $k$th UAV flight turn. To find the stage-WSHP in the $k$th flight turn, $g_k(i, S)$ is expressed recursively as in (13), where $I_k$ denotes the set of indices of the devices selected in the $k$th flight turn. The minimum path cost is then obtained from (14), where the cost function $g_k(i, \mathcal{V} - \{v_i\})$ is calculated iteratively by (13).

According to the above discussion, Algorithm 1 is proposed. To summarise, in each flight turn, we first assign the weight and profit of each device using (7) and select a few devices. Then, using (9), we calculate $\tau_{i,j}$ between the selected devices (for $i, j \in I_k$). To find the best visiting order, we calculate (13) for each node $v_i \in I_k$ and all the subsets $S \subseteq I_k - \{v_i\}$ and record the values in a table. Finally, the optimum path is determined using (14).
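The subset DP over $g_k(i, S)$ can be illustrated with a Held-Karp style recursion. The sketch below is not the exact cost of (12)-(13): as a simplifying assumption, it minimises the importance-weighted sum of the instants at which the devices are reached, and it ignores the return leg to the DC. Under that assumption, each leg is charged once for every device that has not been reached yet, which is the "stage-weighted" idea. The function name, the matrix `tau` and the weights are hypothetical.

```python
from functools import lru_cache

def stage_weighted_order(tau, weights):
    """Order minimising sum_i weights[i] * (instant at which device i is reached).

    tau[i][j] is the time from node i to node j (index 0 is the DC);
    weights[0] is unused. Returns (minimum cost, visiting order as a tuple).
    """
    n = len(tau) - 1  # number of selected devices

    @lru_cache(maxsize=None)
    def g(i, remaining):
        # remaining is a bitmask: bit j-1 is set while device j is unvisited.
        if remaining == 0:
            return 0.0, ()
        w_rem = sum(weights[j] for j in range(1, n + 1) if remaining >> (j - 1) & 1)
        best_cost, best_tail = float("inf"), ()
        for j in range(1, n + 1):
            if remaining >> (j - 1) & 1:
                cost, tail = g(j, remaining & ~(1 << (j - 1)))
                cost += tau[i][j] * w_rem  # this leg delays every unvisited device
                if cost < best_cost:
                    best_cost, best_tail = cost, (j,) + tail
        return best_cost, best_tail

    return g(0, (1 << n) - 1)

# Assumed toy instance with two devices; device 2 is five times more important.
tau = [[0, 1, 2], [1, 0, 2], [2, 2, 0]]
cost, order = stage_weighted_order(tau, [0, 1, 5])
```

In the toy instance the important device is served first even though it is farther from the DC, which is the behaviour the importance weights are meant to induce.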
To use the DP, we multiply all the parameters of the KP by a large number to obtain integer values. As mentioned, constraint (5d) is not considered in the proposed solution so far. This issue is resolved by the DP approach used for the first subproblem, because the selected devices for all knapsack capacities less than $C$ (in integer format), or at least for some values near $C$, can be easily recorded. Then, the optimal visiting order is determined using an algorithm called the stage-WSHP. If the visiting order requires more energy than the propulsion energy limit, or equivalently constraint (5d) is not satisfied, it is sufficient to consider the devices selected for a knapsack capacity less than $C$ to satisfy constraint (5d).

ALGORITHM 1 DP-based age-optimal maximally feasible trajectory planning
Input: …, the UAV parameters ($V$, $h$, $E_{\max}$, $P$) and the channel parameters ($\beta_0$, $\alpha$, $B$, $\sigma^2$) for all $i \in I$.
Output: Age-optimal trajectory $X^{(k)}_{\mathrm{tp}}$ for the $k$th flight turn, battery use, weighted sum-AoI
1: Initialize: Set profits $p_i^k$, weights $w_i^k$ and capacity $C$ using (7) for all $i \in I$. Multiply the weights and the knapsack capacity by a large number, e.g. $K$, to have integer weights and capacity: $w_i^k \leftarrow K w_i^k$ and $C \leftarrow KC$. Set $I_k = \emptyset$.
2: for $j = 0, \dots, C$ do
⋮
26: Find the first optimal node $v_1^k = \arg\min_{v_i \in \mathcal{V}} g_k(i, \mathcal{V} - \{v_i\})$.
27: Trace back to find the optimal trajectory starting with node $v_1^k$ and ending with node $v_0$.
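The fallback for the propulsion constraint (5d) can be sketched as a scan over the selections recorded for decreasing knapsack capacities. This is only an illustration: `selection_by_capacity` is assumed to hold the devices selected by the KP for every integer capacity, and `route_energy` is a hypothetical callable returning the propulsion energy of the best visiting order for a given set of devices.

```python
def plan_with_propulsion_limit(selection_by_capacity, route_energy, capacity, e_max_prime):
    """Return (capacity used, devices) for the largest capacity whose route fits the fuel limit.

    Returns None if no recorded selection satisfies the propulsion limit.
    """
    for c in range(capacity, -1, -1):
        devices = selection_by_capacity[c]
        if route_energy(devices) <= e_max_prime:
            return c, devices
    return None

# Assumed toy data: selections recorded for capacities 0..4, and a stand-in
# propulsion model in which every visited device costs 10 units of energy.
recorded = [[], [], [3], [3], [1, 3]]
plan = plan_with_propulsion_limit(recorded, lambda devices: 10 * len(devices), 4, 15)
```

Since the DP table already contains the optimal selections for all smaller capacities, this scan requires no additional knapsack computations.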
FIGURE 2 The weighted sum-AoI under different policies for different network sizes

FIGURE 3 The maximum age considering the corresponding node importance weight under different policies for different network sizes

Remark 2. The computational complexity of the device selection modelled as a KP is about $\mathcal{O}(MC)$, and the computational complexity of the visiting order selection modelled by a TSP is $\mathcal{O}(N^2 2^N)$. It is worth mentioning that, due to the device selection and the limited energy of the UAV, $N$ is much smaller than $M$. Moreover, there exist extensive studies on efficient approximate algorithms for the TSP. Therefore, for large-scale networks, one needs to utilise those methods; specifically, the genetic algorithm is a suitable candidate.
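To give a feeling for the orders of magnitude in Remark 2, the following sketch counts elementary table updates for the two DP stages and compares the ordering stage with the number of permutations an exhaustive search would enumerate. The chosen sizes are assumptions for illustration only.

```python
import math

def dp_operation_counts(num_devices, capacity, num_selected):
    """Rough operation counts: M*C for the KP table, N^2 * 2^N for the ordering DP."""
    return num_devices * capacity, num_selected ** 2 * 2 ** num_selected

def exhaustive_orders(num_selected):
    """Number of visiting orders an exhaustive search would have to compare."""
    return math.factorial(num_selected)

# Assumed sizes: 20 devices, integer capacity 1000, 12 devices selected per turn.
kp_ops, order_ops = dp_operation_counts(20, 1000, 12)
all_orders = exhaustive_orders(12)
```

For these assumed sizes the ordering DP needs several hundred thousand updates, whereas enumerating all orders would require several hundred million comparisons.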

NUMERICAL RESULTS
This section provides numerical results to highlight the effectiveness of the proposed scheme, which is called KP-WSHP, for age-optimal MaxFTP. Besides, we also evaluate the effect of the UAV's energy level, the number of nodes and the number of flight turns on the performance metrics. The simulation parameters are summarised in Table 2. We compare the performance of the proposed approach with four other benchmarks, which are summarised as follows.
• Greedy: In each flight turn, the UAV selects the devices with the highest AoI and recursively finds the nearest predecessor $v_{i-1}$ for the node $v_i$ at each stage $i = N+1, N, \dots, 2$. Specifically, among the selected nodes, first the closest node to the DC $v_0 = v_{N+1}$ is marked as $v_N$ in the visited sequence $X_{\mathrm{tp}}$. Similarly, among the unmarked selected nodes, the node nearest to $v_i$ is marked as $v_{i-1}$. This iterative procedure repeats until all $N$ selected nodes constitute a trajectory.
• RS-WSHP: In this algorithm, the UAV utilises a random selection (RS) policy for the selection step in each turn and visits the selected nodes using a stage-WSHP.
• MD-WSHP: The selected devices are determined using a maximum device (MD) policy, wherein as many devices as possible are selected in each turn, and the visiting order is a stage-WSHP.
• KP-Priority: In this scheme, the UAV selects the devices using the KP modelling and visits the selected devices in the order of their importance weights.
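The ordering step of the Greedy benchmark, which builds the trajectory backwards from the DC by repeatedly picking the nearest unmarked node, can be sketched as follows. The function name, the tie-breaking by index and the toy distance matrix are assumptions made for illustration.

```python
def greedy_order(selected, dist):
    """Visiting order built backwards from the DC (index 0) by nearest predecessor."""
    remaining = set(selected)
    reversed_order = []
    current = 0  # v_{N+1} is the DC, where the flight turn ends
    while remaining:
        # Nearest unmarked node to the current one; ties are broken by index.
        nearest = min(remaining, key=lambda j: (dist[current][j], j))
        reversed_order.append(nearest)
        remaining.remove(nearest)
        current = nearest
    return reversed_order[::-1]

# Assumed toy layout: nodes on a line at coordinates 0 (DC), 1, 5 and 3.
coords = [0, 1, 5, 3]
dist = [[abs(a - b) for b in coords] for a in coords]
order = greedy_order([1, 2, 3], dist)
```

In the toy layout the last visited device is the one closest to the DC, and the UAV starts with the farthest one.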
In Figure 2, we plot the weighted sum-AoI for different network sizes. As can be seen, the proposed scheme outperforms the benchmark algorithms in terms of minimising the weighted sum-AoI. This difference is more pronounced in larger networks. The KP-Priority algorithm does not perform as well as the proposed scheme or even the greedy algorithm. However, if the devices' importance weights are notably different, this algorithm performs more efficiently. Besides, MD-WSHP is the worst policy, as maximising the number of selected devices can cause problems when large packets are allocated. To be specific, the nodes with large packets may not be visited for several flight turns, and this greatly increases the AoI of the system.

Figure 3 shows the maximum age for different network sizes considering the corresponding devices' importance weights. As can be observed, this value is minimum for the proposed algorithm. This is due to the fact that the profits assigned in the device selection of the proposed scheme, modelled by a KP, are proportional to the product of each node's age in the previous turn and its importance weight. It is worth mentioning that this value relates more to the selection step. Therefore, the greedy algorithm achieves a higher maximum age compared to KP-WSHP and KP-Priority. Moreover, the descending trend for MD-WSHP is due to the fact that in larger networks, there is a higher probability of varying packet lengths being generated. Therefore, nodes with maximum age in larger networks would be selected within fewer flights compared to the smaller ones.
In Figure 4, we study the impact of the energy constraint $E_{\max}$ on the weighted sum-AoI. As expected, the weighted sum-AoI decreases as $E_{\max}$ increases due to the possibility of visiting more nodes. As can be seen, the proposed KP-WSHP algorithm outperforms the other heuristic approaches for all energy levels, specifically in the middle-energy regime. This is due to the fact that the weighted sum-AoI is mainly controlled by the selected nodes and there exist more combinations of nodes to visit in the middle-energy regime. To explain the different behaviour of KP-Priority and RS-WSHP at high energy levels, it is worth mentioning that almost all devices are selected to be visited in the high-energy regime. Therefore, the visiting order plays a more critical role in the weighted sum-AoI than the selection step. Thus, for this case, visiting based on the WSHP method is more efficient than visiting based on the Priority method.

Figure 5 illustrates the impact of the number of UAV flight turns on the weighted sum-AoI. In the first flight turns, there exist devices that have not yet been visited by the UAV. Therefore, the initial age values of the devices that have not been visited affect the weighted sum-AoI metric. This is due to the definition of age, as the AoI of each device increases linearly until there is an update to deliver. Thus, the results of the various approaches do not differ notably in the first flight turns. After a while, there comes a flight turn in which all the devices have been visited at least once and therefore, the results are no longer sensitive to the initial values and reach their steady state. As expected, the proposed algorithm outperforms the other heuristic approaches. Moreover, the weighted sum-AoI in MD-WSHP increases linearly as the number of flight turns increases, since this policy tries to maximise the number of visited devices in each flight turn. Therefore, when a large packet is allocated to one device, it takes time for that packet to be disseminated.
Finally, in Figure 6 we plot the AoI of each node at the end of each flight turn under different policies in a network of size $M = 5$. The results are presented for a single run of each algorithm. As can be seen, the proposed algorithm performs better than the others. In this run, MD-WSHP again performs the worst, because large packets are allocated to two devices and, under that policy, the UAV does not disseminate the allocated large packets until the DC provides larger packets for other devices.

CONCLUSION
This paper studies age-optimal trajectory planning for a UAV-assisted IoT network in a multi-shot scenario, where a limited-energy UAV moves towards IoT devices to disseminate data in multiple flight turns. For this set-up, we have formulated an optimisation problem to jointly optimise the weighted sum-AoI of the IoT devices and the energy consumption of the UAV. In order to minimise the weighted sum-AoI, we have divided the problem into two parts. In the first part, we have selected the best devices utilising a KP modelling. In the second part, we have determined the visiting order of the selected devices utilising a TSP modelling. In simulations, we have shown that the proposed algorithm performs more effectively than several heuristic algorithms and works much better than the brute-force method in terms of computational time complexity.