Optimal path planning using psychological profiling in drone‐assisted missing person search

Search and rescue operations are time-sensitive, and this is especially true when searching for a vulnerable missing person, such as a child or an elderly person suffering from dementia. Recently, the Police Scotland Air Support Unit has begun deploying drones to assist in missing person searches with success, although the efficacy of the search relies upon the expertise of the drone operator. In this paper, several algorithms for planning the search path are compared to determine which approach has the highest probability of finding the missing person in the shortest time. In addition, the use of a priori psychological profile information about the subject to create a probability map of likely locations within the search area was explored. This map is then used within a nonlinear optimization to determine the optimal flight path for a given search area and subject profile. Two optimization solvers were compared: genetic algorithms and particle swarm optimization. Finally, the most effective algorithm was used to create a coverage path for a real-life location, for which the Police Scotland Air Support Unit completed multiple test flights. The generated flight paths based on the predicted intent of the lost person were found to perform statistically better than those of the expert police operators.

Air-based SAR is a core operational requirement of the Police Scotland Air Support Unit (PS ASU), which regularly carries out SAR operations using a Eurocopter EC135 helicopter. However, operating a single helicopter platform to service the whole of Scotland introduces significant limitations. The main problem is that the transit time taken to reach the more remote locations adds considerable lead time to the search operation, in instances where seconds could be the difference between life and death. Hence, PS ASU is placing a small fleet of drones in key locations around Scotland for rapid deployment to assist in missing person search (MPS) operations.2 These will not replace the helicopter, but are rather intended to complement its operations.
To achieve a spatial resolution, and thus a probability of detection, similar to the costlier high-resolution sensors onboard the helicopter, the sensor footprint of each drone must be significantly smaller. Therefore, if little thought is given to the path-planning algorithm used to guide the drones, their use may well be suboptimal in time-sensitive MPS scenarios. As with all optimization problems, the efficacy of any optimal path will only be as good as the information used in the cost function construction. Unfortunately, there exists no deterministic algorithm to predict the exact location of any specific missing person, but there are psychological markers, extracted via forensic analysis of previous cases by organisations such as the CSR, that could be exploited to assign typical behaviors to classes of subject1,3-7 and in so doing generate location probabilities. Our research hypothesis is that by using the predicted intent of the lost person in the form of psychological profiles, better probability maps may be generated for a given search area, resulting in lower times-to-find compared with naïve methods.
This article is organized as follows. Section 2 presents a short overview of the current popular search patterns and algorithms. Following this, in Section 3, the models used to capture the drone dynamics and sensor characteristics, and to generate trajectories, are presented. Section 4 develops the nonlinear program, introduces the probability heat-map concept, and illustrates how this is integrated into the final cost function. Section 5 presents the results of a Monte-Carlo analysis of a typical scenario, comparing the performance of each optimization algorithm. Finally, the most successful technique is compared against real-world flight data from PS ASU.

CURRENT SEARCH ALGORITHMS
The resulting trajectories13-15 can be flown either by a pilot or autonomously.
Search algorithms may then be interpreted as trajectory generation problems with additional constraints and should ideally exhibit the following properties:

• incorporate as many constraints as possible
• yield a feasible trajectory in the form of a waypoint sequence
• be computationally efficient
• be simple to use and interpret
Trajectory generation with simple search patterns is a core functionality in many ground station control software packages. Figure 1 shows example images from the QGroundControl and Mission Planner displays showing the standard trajectory specification software. Waypoints specified by the user can be seen defining a search area in Figure 1A and a specific flightpath in Figure 1B.
The most basic kind of search algorithm is known as the parallel swaths pattern (also known as the lawn-mower pattern)16,17 due to its distinctive, regular structure. Parallel swaths are particularly well suited to flat areas with uniform probability distributions18 and to operations where complete coverage of the search area is the goal. Furthermore, parallel swaths is highly accessible and is the de facto coverage algorithm within popular open-source ground control software such as QGroundControl and Mission Planner (Figure 1). However, as it does not use any a priori knowledge of the search area, although yielding complete coverage, it is suboptimal for a time-critical missing person search task. Some of the simplest informed search algorithms are the LHC_GW_CONV (local hill climb, global warming, convolution) algorithm of Reference 19 and the greedy algorithm of Reference 20. Both superimpose a grid over the search area and use local search21 to move to the adjacent cell with the best fitness, with each using tie-breaker algorithms in the case of two cells having equal values. Reference 19 used convolution kernels to determine which direction is the most promising, whereas Reference 20 proposes multiple solutions, with LA_MaxMax (a greedy l-step look-ahead algorithm) being the most successful. However, Reference 20 still ran into the classic greedy-algorithm problem of getting stuck in a mode and not exploring other areas. This is where the algorithm of Reference 19 is superior, as it introduces the global warming aspect, which forces exploration of the area by artificially making probability more scarce. An issue with LHC_GW_CONV is that it does not consider the area as a whole. It is computationally fast but potentially misses regions of high interest because, at its core, it is a greedy algorithm.
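As an illustration of the baseline, the following minimal sketch (ours, not from the paper) generates a lawn-mower waypoint sequence for a rectangular area, assuming the swath spacing equals the sensor footprint diameter b so that adjacent swaths tile the area exactly:

```python
def parallel_swaths(width, height, b):
    """Generate lawn-mower (parallel swaths) waypoints covering a
    width x height rectangle with swath spacing b, the sensor
    footprint diameter. Swaths alternate direction (boustrophedon)."""
    waypoints = []
    x = b / 2          # first swath centred half a footprint from the edge
    up = True
    while x <= width - b / 2 + 1e-9:
        ys = (b / 2, height - b / 2) if up else (height - b / 2, b / 2)
        waypoints.append((x, ys[0]))
        waypoints.append((x, ys[1]))
        up = not up    # reverse direction for the next swath
        x += b
    return waypoints

# For a 300 m x 300 m area and b = 30 m this yields 10 swaths (20 waypoints).
wps = parallel_swaths(300, 300, 30)
```

Because the spacing never adapts to the PDM, every cell is visited exactly once regardless of its probability, which is precisely the weakness discussed above.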
To explore the full search area, Reference 22 proposes the use of hierarchical heuristics using the mode goodness (MG) ratio, which is based on Gaussian mixture models (GMM). Furthermore, this work22 uses a glimpse factor to differentiate areas where a sensor might struggle, which most other literature does not. This allowed the authors of Reference 22 to tune their algorithm more accurately to what a real-world scenario might require. Ultimately, their TopN search algorithm was found to be substantially better than the previously mentioned LHC_GW_CONV due to its ability to easily break free from modes. However, LHC_GW_CONV did beat both TopN and Top2 in certain scenarios. A reason for this could be that LHC_GW_CONV was tested using single- or dual-mode PDMs, which it is well suited to, and the authors of Reference 19 may have changed aspects of the algorithm had their PDMs used more information about the search area. Nonetheless, the algorithms of Reference 22 only managed to reach a maximum efficiency of 77.36% in realistic testing, leaving 22.64% unexplored. Efficiency was defined as the probability accumulated along a path divided by the theoretical maximum, found by teleporting between the cells with the maximum probabilities.
Another way to break free from modes within a multi-modal area, whilst potentially exploring closer to 100% of it, is to use optimization algorithms. Reference 23 proposes LRH-B for search planning, whilst Reference 20 proposes a partially observable Markov decision process (POMDP). LRH-B primarily focuses on being faster than standard methods (such as interior point), whereas the POMDP focuses on being more optimal. However, the computational complexity of the two cannot be fairly compared, as Reference 23 used a single 2D robot whereas Reference 20 used multiple quadrotors at various altitudes. Moreover, Reference 23 does not compare its paths against ones of merit, nor does it use informed PDMs, so its performance cannot be determined. The POMDP, on the other hand, was extensively compared against various permutations of the aforementioned greedy algorithms (which it reliably outperformed) and used PDMs based on what a real-world SAR scenario might look like.
Whilst a single optimization algorithm might excel in a specific search area, the work of Reference 24 combines five different standard optimization algorithms to create the hyper-heuristic method (HHM). This works on the principle that a single optimizer cannot be ideal for every case. Thus, HHM uses a fitness function to make the low-level heuristics (LLH) compete. Wang compared 15 differently sized areas, and in each scenario the HHM out-competed the five individual algorithms. In particular, the larger areas proved highly difficult for the LLHs, with the HHM performing 34% better than the next best LLH in the largest area.
It is easy to see that, relative to parallel swaths, the more information available to an algorithm the better it performs. The greedy algorithm of Reference 20 was trumped by the LHC_GW_CONV of Reference 19, as the global warming aspect allowed it to explore more of the area. However, the POMDP of Reference 20 outperforms the greedy algorithms further still due to its ability to see the map as a whole. Similarly, Reference 22 extracts further heuristics from the PDM and applies comparatively simple search algorithms, yet outperforms LHC_GW_CONV.
The optimal way to generate the heuristic itself, the PDM, would be to fit a surface across the landscape based on large amounts of historical data. This would be the most representative of where a lost person may be found, resulting in a highly accurate map for path-planning algorithms and searchers to use. However, the reality is that SAR data is sparse at best, with the most complete dataset being ISRID7 at 50,000 entries from fewer than 20 countries. High-incident areas, such as Yosemite National Park, may have denser datasets, but in general there is not enough high-quality data for accurate curve fitting. Thus, Reference 25 developed MapScore, which scores PDMs through heuristics. Similarly, Reference 26 used transfer learning to make smaller datasets usable based on non-SAR data. Another type of historical data is the general trends in the behavior of a lost person's movement. This can be in the form of locations found,1 dispersion angle,7 and so forth. This type of data is usually split into various profiles with different characteristics, leading to highly customized PDMs for every incident.
Due to the lack of historical data in SAR, combining models is a common occurrence. A popular method to plan a search is the Mattson model,27 which uses a consensus across all searchers at hand to inform, in an objective manner, where to begin the search. Reference 28 generalizes this concept by showing that combining models can yield better results because the total information available increases. As current SAR datasets do not typically incorporate psychological data, due to the difficulty of fusing environmental/geographic data with psychological profiles, there is an opportunity to significantly increase the performance of path-planning algorithms through the use of intelligently designed, high-fidelity PDMs.27,28

SYSTEM MODELLING
As the purpose of this research is to assess algorithms for autonomously creating optimal trajectories for missing person search, realistic constraints must be applied to the optimization to ensure that any computed trajectories are feasible. Physical constraints require physical models, and there are plenty of dynamic UAV models in the literature.12,14,29 However, as the physical model is going to be placed inside a numerical optimization cost function, model fidelity must be minimized to reduce computational overhead while retaining only the necessary physical constraints; that is, the physical model needs to be fit-for-purpose.
Figure 2 illustrates the typical drone operation for search and rescue missions. Quadcopter UAVs are nonholonomic, requiring the aircraft to first pitch or roll to induce forward or lateral motion in the x–y plane. Any platform attitude change will alter the sensor footprint if the sensor is rigidly mounted to the UAV base. Fortunately, most drones use a form of platform stabilization (usually a two-axis gimbal) to isolate the camera orientation from platform angular motion. Assuming the sensor is stabilized, this motion can be neglected. No specific sensor configuration or modality was assumed, to ensure the generality of the results, apart from the sensor instantaneous field-of-view (IFoV), which is shown in Figure 2 as the angle θ. The final assumption is that the drone maintains a constant altitude h throughout the mission.
FIGURE 2 UAV configuration showing sensor instantaneous field-of-view and footprint.

UAV platform model
To ensure sufficient physical constraints are included, to minimize computational complexity, and to impose constant altitude, a simple holonomic 2D point-mass model provides sufficient fidelity to represent the UAV platform:

m ẍ = F_T − d ẋ, (1)

where d is the drag coefficient, F_T = [F_Tx, F_Ty] is the thrust acting on the UAV, and x = [x, y] is the UAV's position vector. Drag is added to the system to match flight through still air. An average flight speed of 1.5 m s⁻¹ was used.
The model was controlled through a basic inverse-kinematics controller such that

F_T = m ẍ_des + F̂_D,

where ẍ_des is the desired acceleration (defined in Section 3.3) and F̂_D is the estimated force due to drag on the model. The resultant force input applied to the model was limited to prevent unrealistic behavior:
‖F_T‖ ≤ F_T,max,

where F_T,max is the maximum force magnitude that can be applied at any time step. For the use-case of PS ASU, an equivalent acceleration limit of 3 m s⁻² was used; however, during testing it was found that the acceleration never exceeded 0.42 m s⁻² regardless.
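The model above can be sketched in a few lines. The following is an illustrative implementation (ours, not the paper's code), assuming a unit mass, a linear drag coefficient d, and the 3 m s⁻² acceleration limit quoted above; the thrust command compensates estimated drag and is then saturated:

```python
import numpy as np

def step(x, v, a_des, dt, m=1.0, d=0.5, a_max=3.0):
    """One Euler step of the 2D point-mass UAV model with linear drag.
    m, d are illustrative values; a_max matches the 3 m/s^2 limit."""
    F_drag = -d * v                   # linear drag opposing velocity
    F_T = m * a_des - F_drag          # inverse-dynamics thrust command
    F_max = m * a_max
    norm = np.linalg.norm(F_T)
    if norm > F_max:
        F_T *= F_max / norm           # saturate to the actuator limit
    a = (F_T + F_drag) / m            # resulting acceleration
    v = v + a * dt
    x = x + v * dt
    return x, v
```

From rest, a commanded acceleration within the limit is tracked exactly, while a command of, say, 10 m s⁻² is clipped to 3 m s⁻².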

Sensor characteristics
The UAV is assumed to fly at h m above a flat ground plane. A perfect sensor, which will detect a person with 100% accuracy, is mounted below the UAV with unobstructed vision of the ground plane, and parallel to said ground plane. A typical UAV sensor is the DJI Zenmuse Z30, which has a field of view (FOV) range from 2.3° to 63.7°.

Trajectory generation
A quintic polynomial scheme30 with low-computational-complexity boundary conditions was employed. These polynomials give adequate estimations of a valid and realistic path for the point-mass model to follow. The position, velocity, and acceleration for the quintic polynomial scheme are defined by

x(t) = c₀ + c₁t + c₂t² + c₃t³ + c₄t⁴ + c₅t⁵,
ẋ(t) = c₁ + 2c₂t + 3c₃t² + 4c₄t³ + 5c₅t⁴,
ẍ(t) = 2c₂ + 6c₃t + 12c₄t² + 20c₅t³,

where the coefficients c = {cᵢ | i ∈ [0, …, 5]} are found by solving the system of equations at x_{t=0} and x_{t=T_n}. T_n is the finishing time of line segment n, calculated from the segment length and the mean desired speed; x_{t=0} and x_{t=T_n} are the starting and finishing waypoints, respectively. Furthermore, ẍ_{t=0} = 0 m s⁻² and ẍ_{t=T_n} = 0 m s⁻².
FIGURE 3 Geometric layout of waypoints for Equation (7).
For the velocity vector v_g at each waypoint g, the waypoints immediately before and after are used to calculate the boundary conditions for ẋ: the velocity is directed from the previous waypoint f toward the next waypoint h and scaled to the mean desired speed V, as seen in Figures 3 and 4.
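The boundary-value problem above reduces to a 6×6 linear solve per axis. The following sketch (ours, under the stated rest-acceleration boundary conditions) recovers the quintic coefficients and evaluates the polynomial:

```python
import numpy as np

def quintic_coeffs(x0, xT, v0, vT, T, a0=0.0, aT=0.0):
    """Solve for quintic coefficients c0..c5 (one axis) given position,
    velocity and acceleration boundary conditions at t=0 and t=T."""
    A = np.array([
        [1, 0, 0,    0,      0,       0],        # x(0)
        [0, 1, 0,    0,      0,       0],        # x'(0)
        [0, 0, 2,    0,      0,       0],        # x''(0)
        [1, T, T**2, T**3,   T**4,    T**5],     # x(T)
        [0, 1, 2*T,  3*T**2, 4*T**3,  5*T**4],   # x'(T)
        [0, 0, 2,    6*T,    12*T**2, 20*T**3],  # x''(T)
    ], dtype=float)
    b = np.array([x0, v0, a0, xT, vT, aT], dtype=float)
    return np.linalg.solve(A, b)

def eval_quintic(c, t):
    """Evaluate the quintic polynomial with coefficients c at time t."""
    return sum(ci * t**i for i, ci in enumerate(c))
```

A rest-to-rest segment from 0 to 10 m over 5 s passes through the midpoint at half time, as expected from the symmetry of the minimum-jerk profile.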

Optimization algorithms
Probability accumulation based optimization (PABO) is the concept of using probability accumulation over an area to create a path using numerical optimization. As with most topics, there is no clear best solution, as Reference 24 showed. For the case of optimum path planning, optimization is taken to mean finding the set of waypoints in a problem space that optimizes various cost parameters. In this case, the cost function J(q), where q is the set of control inputs, is to be maximized to find the set of optimal control inputs q*. This is achieved through the numerical solution of the following optimization problem:

q* = arg max_q J(q) = arg max_q ∫₀ᵀ p(q(t)) dt, subject to T ≤ {endurance limit}, (9)

where q* is the optimal path with q ∈ ℝᵐ, m is the number of 2D coordinates in path q, and p(q(t)) is the probability seen by the sensor at time t. T is the total path flight time, and n_waypoint = 10 for the experiments conducted.
The bounds x_min = [x_min, y_min] and x_max = [x_max, y_max] create a rectangular constraint around the area where p(x, y) > 0.
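A cost function of this form can be sketched as follows (ours, an illustrative stand-in for the paper's exact formulation): probability is accumulated along the waypoint path over a gridded PDM, each cell counted once, with a soft penalty once the flight time exceeds the endurance limit. The penalty weight and sampling scheme are assumptions:

```python
import numpy as np

def path_cost(q, pdm, cell, endurance, speed, penalty=1e3):
    """Negative probability accumulated along the waypoint path q
    (flat array [x1, y1, ..., xm, ym]) over a gridded PDM, plus a soft
    penalty when flight time T = length/speed exceeds the endurance.
    Returned negated so a generic minimizer can be used."""
    wps = q.reshape(-1, 2)
    seen = set()          # each grid cell contributes its probability once
    total = 0.0
    length = 0.0
    for p0, p1 in zip(wps[:-1], wps[1:]):
        seg = np.linalg.norm(p1 - p0)
        length += seg
        for s in np.linspace(0, 1, max(2, int(seg / cell) + 1)):
            i, j = ((p0 + s * (p1 - p0)) // cell).astype(int)
            if (i, j) not in seen and 0 <= i < pdm.shape[0] and 0 <= j < pdm.shape[1]:
                seen.add((i, j))
                total += pdm[i, j]
    T = length / speed
    return -total + penalty * max(0.0, T - endurance)
```

A path passing through the single high-probability cell of a toy PDM accumulates exactly that cell's probability.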
To find the solution to the optimization problem, and thus the optimal path, two optimization algorithms were selected due to their evolutionary nature: particle swarm optimization13,31-33 (PSO) and the genetic algorithm34-36 (GA). For PSO, the particles within the swarm (the initial values of q(t)) are randomly initialized and maintain a record of their individual best scores. To avoid convergence on local minima, PSO filters out identical particles and selects the top third for the next generation. The remaining particles are discarded and replaced with new particles in different, randomly chosen dimensions. Similarly, GA undergoes a random mutation phase at each step, allowing beneficial traits to be passed onto the next generation whilst the population continues to explore. This ability to incorporate random mutations in both algorithms encourages exploration of the global space, which makes them well suited to exploring the problem space at hand.
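To make the GA mechanics concrete, here is a minimal real-coded genetic algorithm (an illustrative sketch, not the authors' implementation): truncation selection keeps the best third, uniform crossover recombines elite parents, and Gaussian mutation maintains exploration. Population size, mutation rate, and scale are assumed values:

```python
import numpy as np

rng = np.random.default_rng(0)

def ga_minimize(f, dim, lo, hi, pop=40, gens=60, mut=0.1):
    """Minimal real-coded GA: truncation selection (best third kept),
    uniform crossover, Gaussian mutation, box constraints [lo, hi]."""
    P = rng.uniform(lo, hi, size=(pop, dim))
    for _ in range(gens):
        scores = np.array([f(p) for p in P])
        elite = P[np.argsort(scores)[: pop // 3]]   # keep the best third
        children = []
        while len(children) < pop - len(elite):
            a, b = elite[rng.integers(len(elite), size=2)]
            mask = rng.random(dim) < 0.5            # uniform crossover
            child = np.where(mask, a, b)
            # mutate ~80% of children with Gaussian noise
            child = child + mut * (hi - lo) * rng.standard_normal(dim) * (rng.random() < 0.8)
            children.append(np.clip(child, lo, hi))
        P = np.vstack([elite, children])
    scores = np.array([f(p) for p in P])
    return P[np.argmin(scores)]
```

Plugging in a path cost function of the form above in place of the toy objective gives the PABO GA concept; in practice the swapped-in objective would be the accumulated-probability cost over the PDM.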

Probability heatmap
The Centre for Search and Rescue regularly publishes a study on the behaviors of missing persons 1 which is used by the likes of Police Scotland and Northumberland National Park Mountain Rescue Team.This study analyses data gathered by multiple Mountain Rescue Teams (MRT) and segments it into useful categories such as dementia, despondent (defined as "any person who is thought to have disappeared deliberately" 1 ), and developmental problems.Within these categories, there are common statistics analyzed such as location found.
The "location found" statistic is of particular interest, as it gives powerful insight into the search area (the search area itself is predefined by other means; how it was determined is not important here). Table 1 gives an understanding of how the data is organized. A person with dementia, for example, is more likely to be near a travel aid (i.e., a road) than a body of water. This allows us to draw a PDM for an area containing travel aids and water, focusing the search on the travel aids first, then the water.
To do this, one can represent a higher likelihood of a person being at (x₁, y₁) than at (x₂, y₂) by placing a bivariate Gaussian centred at (x₁, y₁), where p(x, y, Σ) is the probability of finding a person at (x, y) for covariance Σ. This is a continuous probability function and would yield a different probability for any x, y ∈ ℝ, which is favorable for numerical optimization methods. It can then be discretized into a probability distribution map (PDM) for performance. However, using a PDM in a numerical optimization problem poses some issues. Primarily, numerical optimization methods rely on small changes to the input producing a change in the cost function. If these small changes are carried out on a discrete probability map, there may be no change in the cost function, and thus the optimization algorithm assumes a minimum has been found, exiting prematurely. A solution is to use a continuous Gaussian function to represent hotspots. This would work with a few locations and associated radii; however, once the problem is applied to the real world, this becomes tedious and difficult to set up. A method whereby a local expert (mountain rescue leader, police search leader, etc.) uses brushes, akin to digital drawing, to manually draw hotspots on a map is more intuitive, and one can imagine someone in the field with a tablet taking seconds to create a map. An example of this can be seen in Figure 5.
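Building such a map from a handful of hotspots can be sketched as follows (our illustration, assuming isotropic Gaussians with per-hotspot standard deviation and weight):

```python
import numpy as np

def gaussian_pdm(size, cell, hotspots):
    """Discretize a sum of isotropic bivariate Gaussians into a PDM.
    hotspots is a list of (cx, cy, sigma, weight) tuples; the result
    is normalized so the whole grid sums to 1."""
    n = int(size / cell)
    xs = (np.arange(n) + 0.5) * cell          # cell-centre coordinates
    X, Y = np.meshgrid(xs, xs, indexing="ij")
    pdm = np.zeros((n, n))
    for cx, cy, sigma, w in hotspots:
        pdm += w * np.exp(-((X - cx) ** 2 + (Y - cy) ** 2) / (2 * sigma ** 2))
    return pdm / pdm.sum()                    # normalize to a distribution

# e.g. one hotspot in the middle of a 300 m area on a 10 m grid
pdm = gaussian_pdm(300.0, 10.0, [(150.0, 150.0, 30.0, 1.0)])
```

The expert-drawn "brush" map of Figure 5 can be thought of as replacing the analytic hotspot list with a hand-painted weight array that is normalized the same way.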
To solve this problem, it is possible to extrapolate a pseudo-continuous function from the discrete probability map.This is done by taking the theoretical search path and moving a search radius along it.All grid squares within this radius are completely seen (100% of the grid square is within the search area), but those at the edge will be partially seen (< 100% of the grid square is within the search area).Figure 6 shows a grid square within the b∕2 search radius of the path being partially seen.The intersection polygon is the polygon created by the common intersection area of the grid square and the search area.Its area is then normalized and the probability from this square is A int ⋅ P x,y where A int is the area of the intersection polygon and P x,y is the probability of the grid square at (x, y).This is repeated for all grid squares intersecting with the edge of the search area.
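The paper computes the exact intersection polygon between each edge cell and the search footprint; as an illustrative approximation (ours), the covered fraction of a cell can be estimated by supersampling points inside the cell and testing them against the circular footprint:

```python
import numpy as np

def cell_coverage(cell_x, cell_y, cell, cx, cy, radius, sub=20):
    """Approximate the fraction of a grid cell (lower-left corner at
    (cell_x, cell_y), side length cell) that falls inside the circular
    search footprint of the given centre and radius. Supersampling
    stand-in for the exact intersection-polygon area used in the paper."""
    xs = cell_x + (np.arange(sub) + 0.5) * cell / sub
    ys = cell_y + (np.arange(sub) + 0.5) * cell / sub
    X, Y = np.meshgrid(xs, ys)
    inside = (X - cx) ** 2 + (Y - cy) ** 2 <= radius ** 2
    return inside.mean()
```

Multiplying this fraction by the cell probability P_{x,y} reproduces the A_int · P_{x,y} weighting described above, up to the sampling error.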

Experimental setup
To test the effectiveness of each algorithm, a Monte Carlo simulation was run over multiple generated paths to gather data on the time to find.Lost persons were then placed over the abstract search area using the generated PDM as seen in Figure 7.

Comparison against figures-of-merit
There were two PDMs used for the comparison of the algorithms-a 300 m × 300 m area and a 700 m × 700 m area (Figure 8A and Figure 8B respectively).The larger area is used to give an insight into the endurance capabilities of the algorithm.In this section, Figure 8A and Figure 8B will be referred to as PDM A and PDM B respectively.
FIGURE 8 The PDMs used for this project: (A) PDM A; (B) PDM B.
The parallel swaths path can be seen in Figure 9A. The back-and-forth path covers the whole left area of PDM A. However, due to its inability to use a priori information, the parallel swaths path does not cover the right region of higher probability before the endurance limit. In addition, the algorithm spends a substantial amount of time in the bottom-left region of very low probability.
Looking at Figure 9B, it can be seen that the first sweep of PDM B completely misses all higher probability regions.The second sweep is not much better, barely gathering any probability from any of the high-probability regions.
It is easy to see that this algorithm is great for complete coverage if either endurance is not a problem or multiple flights are possible. However, for the SAR use case, this algorithm is unusable due to its inability to use the information at hand.
The genetic algorithm works very well for PDM A. GA also has long parallel lines (from (50, 225) to (250, 50) in Figure 10A), showing the probability-accumulation aspect of the cost function in action. The traversed path avoids the low-probability regions of the PDM very effectively, and it targets the regions of high probability. In Figure 10B, GA can be seen flying out to (650, 250) whilst covering much of the high-probability area and then returning home. Most importantly, it adheres to the endurance constraint.
The particle swarm path yields similar results to GA. Particle swarm spends too much time in the low-probability regions of PDM A (Figure 11A). Moreover, it does not stick to the PDM limits like GA. Nevertheless, particle swarm sticks to the high-probability regions of PDM A very effectively, although it does not adhere to the endurance constraint nearly as well as GA for PDM A.
Like GA, the PDM B performance of particle swarm is very targeted to high-probability areas. It may not return home exactly within the endurance limit, but it follows a realistic path (Figure 11). The data below was gathered by creating new paths for each PDM multiple times and combining the time-to-find data to give a general insight into performance. This is preferable to a single attempt, as it prevents a single excellent path for an algorithm from influencing the conclusion. SAR object placements were random given the PDM, and hence all algorithms were treated the same.
Inspecting Figure 12A, it can be seen that parallel swaths finds fewer people at any given time than the other algorithms, up until the very end. This is expected, as it gives almost perfect coverage of the area but does not use the a priori information of the PDM like the optimization-based algorithms.
Similarly, for PDM B in Figure 12B, the parallel swaths algorithm is outperformed by all others.The optimization-based algorithms accumulate more probability at all times.
From these results, it is evident that increasing the information available to a search planning algorithm massively increases its performance.The paths over PDM B in particular show this as the parallel swaths will search all areas no matter how irrelevant they are.

Comparison of PABO against human pilots
PS ASU performed experimental flights to collect data for analysis. In total, three test flights were performed for this project: two were piloted by trained officers based on their training and experience, and one attempted to follow a parallel swaths path. These will be referred to as PS ASU 1, PS ASU 2, and Attempted Parallel Swaths henceforth. The area used for these flights can be seen in Figure 13A, and the flights were designed to be as close to a real-case SAR scenario as possible in a simulated environment. PABO GA was selected to be pitted against the pilots, as it outperformed particle swarm optimization across both tests. The simulated flight for PABO GA was given an input value for an average flight speed of 1.2 m s⁻¹, which was derived from the provided flight data. The PDM for this area can be seen in Figure 13B, where emphasis was put on the tree lines, the dry pond, and linear features as per Table 1. Note that PS ASU could not fly outwith their grounds, which limited the search area. The PDM was created with this in mind.
Figure 14 shows the aforementioned paths, as well as PABO GA, superimposed onto Figure 13A along with the PDM associated with the area. From this it can be seen that PS ASU 1 and PABO GA share a common take-off location (this was intentional); however, the others take off around 100 m further east and 50 m further south. This should not affect performance but is noted nonetheless.
Inspecting PS ASU 1, it can be seen that the path follows the edge of the PDM. PS ASU 2 seemed to find the bottom-right of the PDM very interesting, which corresponds to a grouping of trees, something that the CSR outlines as being an area of higher probability for finding missing persons.1 Looking at PABO GA, it is clear that a major problem with down-sampling the PDM to b-sized pixels is that the cost function believes it is in an area of higher probability, but once converted back into a higher-resolution PDM it is grossly off the mark. This could be fixed by not down-sampling the PDM, but that would require much longer computing times or higher computational power. Ignoring this, however, the path does explore the area very well, focusing on locations like the group of trees in the bottom-right, similar to PS ASU 2.
From Figure 15, the effectiveness of the paths at finding a person can be seen. PABO GA performs well from t = 0 s, staying ahead of the real-world paths for almost the whole flight time and reaching a maximum of ≈73%.

CONCLUSION AND FUTURE WORK
A benchmark search algorithm and two cost-function-based algorithms were implemented. A 2D point-mass model was used along with a quintic polynomial trajectory generation scheme to simulate a SAR flight. Two different PDMs were created to compare the three algorithms through Monte Carlo simulations. PABO GA was found to be the most effective algorithm overall. Parallel swaths, the benchmark, was by far the least effective across all PDMs. PS ASU officers conducted multiple flights against which PABO GA was compared. The PS ASU officers are highly trained in SAR and piloted the UAV in a manner similar to what the PDM created from the CSR report1 would predict. Even so, PABO GA outperformed all PS ASU SAR scenario flight paths.
Overall, this study has shown the viability of merging the multiple information streams available to an optimization-based search algorithm to greatly increase its performance over an area. The psychological profile of a lost person (LP) is a core part of this information, as is the physical landscape at hand. With a simple algorithm like PABO GA outperforming trained experts, it is clear that the future of SAR is data-driven.
This work is a keystone result to build from. Future work will explore automatically generating a PDM based on psychological and geographical information. This has potential applications for machine learning, which will also need to be explored.
As well as advancing the state of the art of PDM generation using ML, the application of a deep Q-network will be explored to create a generalized search-planning agent for all PDMs. If the PDM generation is advanced enough, reinforcement learning will have the massive amounts of data it needs to closely predict a lost person's intent.

FIGURE 1 Illustration of the path-planning tools in (A) QGroundControl and (B) Mission Planner.

The value of b, the sensor footprint diameter, is then calculated by b = 2 · h · tan θ, where b is the footprint diameter, h is the flight altitude, and θ is the FOV angle. It was assumed that the normal flight altitude is from h = 20 m to h = 30 m. These parameters are further defined in Figure 2. To simplify the sensor model, a constant FOV of θ = 31° and a constant flight altitude of h = 25 m were chosen, with a resultant search footprint diameter of b = 30 m.
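The footprint formula can be checked directly; with the stated θ = 31° and h = 25 m it gives b ≈ 30 m:

```python
import math

def footprint_diameter(h, theta_deg):
    """Sensor footprint diameter b = 2 * h * tan(theta) for flight
    altitude h and the angle theta as defined in the paper."""
    return 2 * h * math.tan(math.radians(theta_deg))

b = footprint_diameter(25, 31)   # approximately 30 m
```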

FIGURE 5 Example of how the brush method might be used in the field by an expert to create a PDM.

FIGURE 6 Pseudo-continuous probability map technique.

FIGURE 7 Using a PDM to place a reduced number (200) of persons.

Once the LPs have been placed, the distance comparison |p_i − x| < b/2 was used to determine if the ith LP's position p_i was within the search range of the UAV at position x.
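This detection check is a one-liner in vectorized form (our sketch of the |p_i − x| < b/2 test above):

```python
import numpy as np

def detected(lp_positions, uav_xy, b):
    """Boolean mask of lost persons within the UAV's footprint radius b/2.
    lp_positions is an (n, 2) array; uav_xy is the UAV position."""
    d = np.linalg.norm(lp_positions - uav_xy, axis=1)
    return d < b / 2
```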
FIGURE 11 PABO particle swarm.

FIGURE 14 Police Scotland and PABO GA paths.

FIGURE 15 Percentage found up until time.
TABLE 1 Location found by terrain and gender for a person with dementia.1