UAV‐based simultaneous localization and mapping in outdoor environments: A systematic scoping review

This study aims to investigate the current knowledge of unmanned aerial vehicle (UAV)‐based simultaneous localization and mapping (SLAM) in outdoor environments and to discuss challenges and limitations in this field. A literature search was conducted in three online databases (Web of Science, Scopus, and IEEE) for articles published before October 2022 related to UAV‐based SLAM. A scoping review was carried out to identify the key concepts and applications, and discover research gaps in the use of algorithm‐oriented and task‐oriented, open‐source studies. A total of 97 studies met the criteria after a two‐step screening that followed the Preferred Reporting Items for Systematic Reviews and Meta‐Analyses guidelines. The 97 eligible studies were classified into two main categories: algorithm‐oriented studies and task‐oriented studies. The analysis of the literature revealed that the majority of the studies focused on the development and implementation of new algorithms. This review highlights the significance and diversity of the sensors that UAVs employ across different tasks and application scenarios. Evaluation through field experiments can show the real results and performance of new algorithms in the target scenarios, in contrast to evaluation on public data sets and simulation platforms.

and structured light sensor), passive sensors that passively receive and measure radiation emitted by natural or artificial sources instead of actively emitting energy or signals (e.g., monocular camera, stereo camera, and RGB-D camera) and orientation sensors which measure the orientation or rotation of the robot or vehicle (e.g., IMU) (Siegwart et al., 2011). Early SLAM methods primarily used range sensors, such as acoustic sensors or LiDAR, which can provide accurate depth data (Dissanayake, Newman, et al., 2001; Zaffar et al., 2018). However, since 2005, research into visual SLAM has gained momentum, particularly in the use of RGB cameras due to the increasing ubiquity of cameras (Karlsson et al., 2005). Although visual sensors offer abundant feature information, they lack depth or distance information (Zaffar et al., 2018). However, depth information can be acquired from stereo cameras or RGB-D cameras. As a result, visual SLAM has become a prominent research topic and practical application area.
FIGURE 1 Overall SLAM framework consisting of five major components: sensors, odometry, back-end optimization, loop closure, and mapping. The sensors provide data to odometry; after processing in odometry, the data are passed on for back-end optimization and loop-closure detection. Finally, the map of the surroundings can be generated in different forms. SLAM, simultaneous localization and mapping.

| Odometry
Sensors supply data with landmarks such as features and objects for SLAM (Khairuddin et al., 2015), which can be used to initialize the position of the robot (Cadena et al., 2016; Khairuddin et al., 2015).
Odometry involves measuring the movement of the robot or sensor over time and estimating its distance traveled by extracting the features and landmarks and calculating the changes in them (Zhang & Singh, 2015). By monitoring changes in movements, the robot can determine its position within a known map of the environment or create a map from scratch.
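The accumulation of motion increments described above can be sketched in a few lines. The following is a minimal planar (2D) illustration, not the method of any reviewed study: the function names `compose` and `integrate` and the square test path are assumptions introduced here, and a real UAV would require a full 3D pose.

```python
import math

def compose(pose, delta):
    """Apply a relative motion (dx, dy, dtheta), expressed in the robot
    frame, to a world-frame pose (x, y, theta)."""
    x, y, theta = pose
    dx, dy, dtheta = delta
    # Rotate the body-frame displacement into the world frame, then add it.
    x_new = x + math.cos(theta) * dx - math.sin(theta) * dy
    y_new = y + math.sin(theta) * dx + math.cos(theta) * dy
    # Wrap the heading into [-pi, pi].
    heading = theta + dtheta
    theta_new = math.atan2(math.sin(heading), math.cos(heading))
    return (x_new, y_new, theta_new)

def integrate(deltas, start=(0.0, 0.0, 0.0)):
    """Chain relative motions into a trajectory of world-frame poses."""
    trajectory = [start]
    for delta in deltas:
        trajectory.append(compose(trajectory[-1], delta))
    return trajectory

# Hypothetical input: four 1 m forward steps, each followed by a 90-degree
# left turn, which traces a square and returns to the start.
square = [(1.0, 0.0, math.pi / 2)] * 4
path = integrate(square)
```

Because every pose is built on the previous one, any error in a single increment is carried into all later poses, which is why odometry alone drifts over long trajectories.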
Odometry, including visual odometry and LiDAR odometry, can use various sensors to obtain features and estimate motions. Two common visual odometry methods are currently available: the feature-based method and the appearance-based method (Aqel et al., 2016; Nistér et al., 2004; Scaramuzza & Fraundorfer, 2011).
The camera captures a sequence of individual images (video frames), and the feature-based method extracts distinctive features (e.g., points, lines, and planes) from these frames and matches and tracks them across frames to estimate sensor motion (Naroditsky et al., 2011; Nistér et al., 2006). Currently, there are many different feature points (such as oriented FAST and rotated BRIEF [ORB] points, Rublee et al., 2011; scale-invariant feature transform points, Lowe, 2004; speeded up robust features points, Bay et al., 2006) which can be used in visual odometry. Instead of extracting and tracking features, the appearance-based method monitors changes in the appearance of acquired images and the intensity of pixel information within the images (Bellotto et al., 2008; Gonzalez et al., 2012, 2013). Normally, the optical flow-based method computes the displacement of brightness patterns from one image frame to another by using the intensity values of neighboring pixels (Barron et al., 1994; Campbell et al., 2004). Similar to visual odometry, LiDAR odometry acquires distance feature information (point to point, point to plane, or point to edge) from LiDAR point cloud data and estimates the motion and the position of the sensors (Ji & Singh, 2014).
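To make the matching step of the feature-based method concrete, the sketch below pairs binary descriptors between two frames by Hamming distance, which is the usual way of comparing binary descriptors such as ORB. It is an illustrative toy under stated assumptions: the descriptors are hypothetical 8-bit integers (real ORB descriptors are 256 bits long), and the function names and the `max_distance` threshold are introduced here for illustration only.

```python
def hamming(a, b):
    """Number of differing bits between two integer-encoded descriptors."""
    return bin(a ^ b).count("1")

def match_descriptors(query, train, max_distance=1):
    """Brute-force nearest-neighbour matching.

    Returns (query_index, train_index, distance) for every query descriptor
    whose closest train descriptor is within max_distance bits.
    """
    if not train:
        return []
    matches = []
    for qi, q in enumerate(query):
        best = min(range(len(train)), key=lambda ti: hamming(q, train[ti]))
        distance = hamming(q, train[best])
        if distance <= max_distance:
            matches.append((qi, best, distance))
    return matches

# Hypothetical 8-bit descriptors from two consecutive frames.
frame_a = [0b10110010, 0b01001101, 0b11110000]
frame_b = [0b01001111, 0b10110010, 0b00001111]
matches = match_descriptors(frame_a, frame_b)
```

With these toy values the first two descriptors of `frame_a` find close partners in `frame_b`, while the third is rejected because its nearest neighbour differs in two bits. The matched pairs would then feed a motion estimator, which is outside the scope of this sketch.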

| Back-end optimization and loop closure
Odometry can be quite accurate in predicting the robot's position and orientation over a short time interval but can be subject to drift and accumulated errors over longer distances and time. Therefore, solutions like back-end optimization and loop closure are required to overcome the drift and accumulated error.
In the back end of SLAM, the mapping and localization process is performed recursively by computing the movements from the features and landmarks extracted from input data (Khairuddin et al., 2015). To optimize the estimation of movement, the extended Kalman filter (EKF) method was first introduced in SLAM as EKF-SLAM (Leonard & Durrant-Whyte, 1991). EKF can handle nonlinear systems, reducing the impact of noise on the estimation and prediction of the SLAM system (Bar-Shalom et al., 2001). Later, Fast-SLAM was introduced by integrating particle filter (PF) and EKF (Montemerlo et al., 2002), further improving the accuracy of motion and position estimation. However, unavoidable noise (such as sensor observation noise and odometry noise) can still lead to inaccuracies in the estimation of SLAM over time. To reduce these errors, loop-closure detection was introduced (Ho & Newman, 2007). With loop closure, the robot can recognize when it returns to a previously visited location and make the necessary corrections to the global error. Currently, popular loop-closure detection methods include bag of words (Sivic & Zisserman, 2008), pose graph (Lu & Milios, 1997), and bundle adjustment (Triggs et al., 2000).
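The predict and update cycle underlying Kalman-filter-based back ends can be illustrated with a scalar example. The sketch below is a linear, one-dimensional simplification and not EKF-SLAM itself: an EKF would additionally linearize nonlinear motion and observation models with Jacobians, and a SLAM state would include landmark positions. The function name `kalman_step`, the noise values, and the measurements are assumptions chosen for illustration.

```python
def kalman_step(x, p, u, z, q, r):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : prior position estimate and its variance
    u    : odometry increment (motion since the last step)
    z    : direct position measurement
    q, r : process and measurement noise variances (assumed known)
    """
    # Predict: apply the odometry increment; uncertainty grows by q.
    x_pred = x + u
    p_pred = p + q
    # Update: blend prediction and measurement using the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new

# Hypothetical run: the platform moves 1 m per step and receives a noisy
# position measurement after each move.
x, p = 0.0, 1.0
for u, z in [(1.0, 1.2), (1.0, 1.9), (1.0, 3.1)]:
    x, p = kalman_step(x, p, u, z, q=0.1, r=0.5)
```

In this toy run the variance `p` shrinks with every measurement, which mirrors how the filter reduces the impact of noise on the estimate.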

| Mapping
When the SLAM is executed, many local maps and submaps (Figure 1) containing local coordinate systems are generated for the estimation of motion (Khairuddin et al., 2015). These maps are processed every few seconds or per frame. Eventually, various representations of maps are generated in SLAM, such as point cloud map (Rusu & Cousins, 2011), OctoMap (Hornung et al., 2013), and occupancy grid map (Fankhauser & Hutter, 2016). The resulting maps can be used in different fields for path planning, obstacle avoidance, remote sensing, and 3D reconstruction.
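As an illustration of one of these map representations, the sketch below updates a small occupancy grid using the common log-odds formulation. It is a simplified example under stated assumptions: the inverse sensor model probabilities (0.7 for a hit, 0.3 for a miss), the 3 x 3 grid, and the function names are illustrative choices, not values taken from the reviewed studies.

```python
import math

# Assumed inverse sensor model: a "hit" suggests 70% occupancy,
# a "miss" (beam passed through) suggests 30% occupancy.
L_OCC = math.log(0.7 / 0.3)
L_FREE = math.log(0.3 / 0.7)

def update_cell(log_odds, hit):
    """Add the evidence of one observation to a cell's log-odds value."""
    return log_odds + (L_OCC if hit else L_FREE)

def probability(log_odds):
    """Convert log-odds back to an occupancy probability."""
    return 1.0 - 1.0 / (1.0 + math.exp(log_odds))

# 3 x 3 grid initialised to log-odds 0, i.e. probability 0.5 (unknown).
grid = [[0.0] * 3 for _ in range(3)]

# Hypothetical observations: cell (1, 2) is hit three times,
# cell (1, 1) is observed free twice.
for _ in range(3):
    grid[1][2] = update_cell(grid[1][2], hit=True)
for _ in range(2):
    grid[1][1] = update_cell(grid[1][1], hit=False)
```

Working in log-odds turns repeated Bayesian updates into simple additions, which keeps per-cell updates cheap on payload-limited platforms.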

| Milestones of SLAM
In the last three decades, the field of SLAM has made significant progress based on the theory of object position estimation (Smith & Cheeseman, 1986). Currently, there are various architectures of SLAM algorithms using different types of sensors. To clearly illustrate the differences between different SLAM algorithms and to show the evolution of SLAM, the flowchart in Figure 2 presents the SLAM milestones and main algorithms from the first complete SLAM method (EKF-SLAM) (Smith, 1988) to the most recent approaches.
We divide SLAM methods into two categories according to the sensors: visual SLAM and LiDAR SLAM. Initially, SLAM was conducted on ground vehicles, but as UAVs became vehicles for remote sensing-based monitoring, SLAM also started to be developed for UAV-based applications. The adoption of SLAM for UAV-based sensors with a monocular camera started to emerge in 2003 (Kim & Sukkarieh, 2003), which was also the beginning of visual SLAM (Davison, 2003).
Until the introduction of MonoSLAM (Davison, 2003) in 2003, LiDAR was the dominant sensor in SLAM. Furthermore, filtering methods such as EKF and PF were the main optimization solution in SLAM. After that, both LiDAR and visual SLAM went through several eras marked by milestone studies, which are explained in the following subsections.

| Filtering methods or smoothing methods
In SLAM, filtering methods are used to estimate the state of the system based on sensory information, while smoothing methods are used to improve the accuracy of these estimates over time. Filtering methods, such as Kalman filtering and particle filtering, use a mathematical model of the system and sensory information to estimate the state of the system at a particular point in time.
Smoothing methods, on the other hand, use information about the past history of the system to improve the accuracy of these estimates. While filtering methods focus on estimating the state of the system at a single point in time, smoothing methods take into account the entire history of the system to produce more accurate estimates.

| Direct visual odometry or indirect visual odometry
Direct visual odometry involves using visual sensors, such as cameras, to directly measure the motion of a vehicle or robot. This is typically done by tracking the features of the images from the cameras and using that information to estimate the movement of the vehicle. Indirect visual odometry, on the other hand, involves using visual sensors to indirectly estimate the motion of a vehicle or robot by comparing the images from the cameras to a prebuilt map of the environment. This allows the vehicle or robot to localize itself within the map and use that information to estimate its motion over time.

| Two-dimensional (2D) and 3D LiDAR
While LiDAR SLAM research debated whether filtering methods or smoothing methods were better, the choice between 2D and 3D LiDAR SLAM also started to be debated. The difference between 2D and 3D LiDAR is that 2D LiDAR measures the distance to objects in a single plane while 3D LiDAR measures in multiple planes. LiDAR odometry and mapping (LOAM; Ji & Singh, 2014) is one of the most important milestones in 3D LiDAR SLAM, as it proposed a complete SLAM framework based purely on a LiDAR sensor. LeGO-LOAM (Shan & Englot, 2018) was later developed on the basis of LOAM as a lightweight variant with better optimization.

| Inertial fusion era
Inertial fusion is an approach that combines information from inertial sensors, such as gyroscopes, accelerometers, and IMUs, with other types of sensors, such as cameras, to improve the accuracy of SLAM. Previous studies in visual SLAM, such as monocular visual-inertial system (VINS-Mono; Qin et al., 2018) and VINS-Fusion (Qin et al., 2019), have reported that inertial sensors provide high-frequency information about the motion of the robot or sensor, which can be used to improve the accuracy of the SLAM estimates over time. Inertial sensors also help correct drift and accumulated errors.
Moreover, inertial fusion allows SLAM to be more robust to individual sensor errors or failures, which can improve the overall reliability of SLAM. A similar era also occurred in LiDAR SLAM, with methods such as LiDAR inertial odometry via smoothing and mapping (LIO-SAM; Shan et al., 2020) and LiDAR-inertial 3D plane SLAM (Geneva et al., 2018), which fuse LiDAR and inertial sensors to improve the robustness and accuracy of SLAM.

| METHODOLOGY
The purpose of the scoping review approach (Arksey & O'Malley, 2005; Munn et al., 2018) is to survey and outline the key concepts, types of evidence, and research gaps in a specific subject area. It is typically performed at the start of a research project to pinpoint the most pertinent and significant studies and to determine areas in need of further investigation. Unlike a systematic review, which is a comprehensive and thorough review of the literature, a scoping review is broader and takes a more exploratory approach aimed at gaining an overview of the state of research in a particular field.
With this in mind, a scoping review was carried out to identify the key concepts and applications, and discover research gaps in the use of SLAM for outdoor UAV-based applications.
This scoping review adopted a systematic approach in designing search strategies and selection criteria to ensure rigor, credibility, transparency, and reproducibility. The designing, searching, and implementation of the systematic method were conducted in October 2022. The screening workflow followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Moher et al., 2009).

| Search strategy
The search strategy for this scoping review was designed according to the guidelines of Peters et al. (2015) for conducting systematic scoping reviews. A literature search was conducted in three online databases (Web of Science, Scopus, and IEEE) for articles published before October 2022 related to UAVs, SLAM, and outdoors. Detailed literature search terms are described in Table 1 and the search expressions used in the databases can also be found in Appendix A.
The search strategy was specifically designed to identify literature related to UAV-based SLAM in outdoor environments, using a combination of keywords, synonyms, and abbreviations to maximize the retrieval of relevant studies. The resulting literature was then imported into EndNote 20.04 for further screening and literature management.

| Inclusion/exclusion criteria for screening searched literature
This study employed a systematic approach to carry out a scoping literature review of the published literature in the field of UAV-based SLAM. Duplicates were removed, and a five-step literature screening strategy was implemented with defined inclusion and exclusion criteria, as detailed in Table 2. The screening process involved two stages (Figure 3), with the first being a rapid screening based on criteria A-E that only evaluates the title, abstract, and keywords. The second stage was a thorough screening based on criteria B-E, where the full text of the literature that passed the first stage was evaluated.

| Screening results and data extraction
The search returned 575 records: 113 from Web of Science, 271 from Scopus, and 191 from IEEE (Figure 3). From these, duplicate records (n = 205) and records that could not be accessed (n = 12) were removed. A total of 97 studies met the criteria after the two-step screening. Necessary information such as the title of the study, publication year, authors, journal or conference, country, sensors used, SLAM simulation platforms, UAV platforms, flight height, and so forth was extracted from the selected literature into an Excel file, which can be found in Appendix A.
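The record counts reported above can be cross-checked with simple arithmetic. The sketch below is only a bookkeeping aid: the input numbers come from the text, while the variable names and the assumption that both removals apply to the pooled search result are introduced here and should be verified against Figure 3.

```python
# Counts reported in the text.
records_by_database = {"Web of Science": 113, "Scopus": 271, "IEEE": 191}
duplicates_removed = 205
inaccessible_removed = 12
included = 97

# The per-database counts should add up to the reported search result (575).
identified = sum(records_by_database.values())

# Assumption: duplicates and inaccessible records are both removed from the
# pooled search result before the two-stage screening.
screened = identified - duplicates_removed - inaccessible_removed

# Records excluded across the two screening stages, under the same assumption.
excluded_in_screening = screened - included
```

Printing `screened` and `excluded_in_screening` gives the derived counts, which are not stated in the text and therefore depend on the assumption noted in the comments.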

| Overview of the reviewed papers
Overall, this review analyzed 97 studies on UAV-based outdoor applications with SLAM published in a range of journals (n = 58) (Figure A1) over a span of 19 years (2003-2022) (Figure 4).
TABLE 1 The search terms and expressions used for searching the selected literature databases.
TABLE 2 Inclusion/exclusion criteria for a two-step screening of the searched literature.
A. The article is written in English.
If yes, proceed to the next criterion.
If not, tag as "not English," retain but exclude.

B. The study is concerned with the SLAM algorithm.
If yes, proceed to the next criterion.
If not, tag as "not SLAM," retain but exclude.

C. The study is concerned with new algorithms, including improvements or applications to existing algorithms, proposing completely new algorithms, excluding review, and survey.
If yes, proceed to the next criterion.
If not, tag as "not new," retain but exclude.

D. The study is concerned with a single UAV, which means that the sensors will be deployed on UAV to implement SLAM algorithms.
If yes, proceed to the next criterion.
If not, tag as "not UAV or not single UAV," retain but exclude.

E. The study is concerned with outdoor sceneries, including park, orchard, forest, university, factory, and simulated outdoor scenes, where SLAM will be evaluated and applied.
If yes, proceed to the next criterion.
If not, tag as "not outdoor," retain but exclude.
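The sequential logic of the criteria in Table 2 can be expressed as a short decision procedure. The sketch below is a hypothetical illustration and not the tooling used in this review (screening was managed in EndNote): the record field names, the function `screen`, and the example record are assumptions, while the criterion letters and exclusion tags follow Table 2.

```python
# (criterion letter, hypothetical record field, exclusion tag from Table 2)
CRITERIA = [
    ("A", "written_in_english", "not English"),
    ("B", "concerns_slam", "not SLAM"),
    ("C", "proposes_new_algorithm", "not new"),
    ("D", "single_uav", "not UAV or not single UAV"),
    ("E", "outdoor_scene", "not outdoor"),
]

def screen(record, criteria=CRITERIA):
    """Check criteria in order and stop at the first one that fails."""
    for letter, field, tag in criteria:
        if not record.get(field, False):
            return {"included": False, "failed": letter, "tag": tag}
    return {"included": True, "failed": None, "tag": None}

# Hypothetical record: an English-language SLAM paper that is a survey.
survey = {
    "written_in_english": True,
    "concerns_slam": True,
    "proposes_new_algorithm": False,
    "single_uav": True,
    "outdoor_scene": True,
}
first_stage = screen(survey)                   # title/abstract stage, A-E
second_stage = screen(survey, CRITERIA[1:])    # full-text stage, B-E
```

Passing `CRITERIA[1:]` reproduces the second stage, which evaluates only criteria B-E on the full text.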

| Application scenarios of task-oriented studies
In both task-oriented and algorithm-oriented studies, the application scenarios can be classified into aquatic, agricultural, industrial, urban, rural, forest, and hybrid (indoor and outdoor) environments on the basis of the experiment and application scenarios (Figure 5). Among the seven types of scenarios mentioned above, urban environments (n = 46) were the most used for evaluation, experiments, or applications, whereas the aquatic scenario was used by only two studies (Yang, Dani, et al., 2017; Yang et al., 2011). Although this review focused on SLAM in outdoor environments, many studies first evaluated SLAM in indoor environments or on indoor open-source data sets and then conducted experiments in outdoor environments. Therefore, hybrid (indoor and outdoor) environments (n = 26) also accounted for a relatively large proportion of the application scenarios.

| Experimental methods in SLAM
Table 3 lists some well-known public data sets that illustrate how SLAM algorithms have typically been evaluated. Among them, the KITTI data set (Geiger et al., 2012) was used most often in the selected papers (n = 7), followed by the EuRoC data sets (Burri et al., 2016) (n = 6). In addition, only one study used the ICL-NUIM data set for evaluation (Forster et al., 2016) and one study used the NewCollege data set (Mur-Artal et al., 2015). LILI-OM (Li, Li, et al., 2021), LIO-SAM (Shan et al., 2020), utbm (Yan et al., 2020), ulhk (Wen, Zhou, et al., 2020), and nclt (Carlevaris-Bianco et al., 2016) were all used in only a single study (Xu et al., 2022).
To evaluate SLAM in a specific scenario, simulation platforms can also be used to perform SLAM experiments. The investigation of simulation platforms has shown that Gazebo (n = 8) (Koenig & Howard, 2004), an open-source software library for robotics developers, was the most popular simulation platform in the eligible papers. It provides an essential robotic toolbox to simulate robots accurately and efficiently in complex indoor and outdoor environments. However, other simulation platforms accounted for a very small proportion. AirSim, a simulator developed by Microsoft for drones and cars based on Unreal Engine, was used by two studies (Nguyen et al., 2021; Xie et al., 2021) in the eligible papers. In addition, there was one study using the Unity engine (Liu et al., 2022) and one study using Xplane 10 (Yang et al., 2019).
FIGURE 5 Overview of application characteristics in the eligible studies, flowing from Research Objectives, Task Categories, UAV flight height ("not mentioned" means the investigated study did not mention the flight height), and Application Scenarios ("urban" includes city, university, highway; "rural" includes village, suburb; "hybrid" includes indoor and outdoor; "forest" includes jungle, forest, rainforest; "industrial" includes wind power station, ship, factory, bridge; "agricultural" includes crop field; "aquatic" includes river, lake, ocean; "unknown" means the investigated studies did not mention the scenarios). UAV, unmanned aerial vehicle.
TABLE 3 Public data sets for UAV-based SLAM used in the selected papers, including the name of data sets, publication year, number of the data sets used in the selected papers, detailed sensor categories, original reference, and the link to the data sets.
studies (Bryson & Sukkarieh, 2006; Kim & Sukkarieh, 2003)
Table 4 provides an overview of the major sensor types with sensing principle and basic parameters of the sensors.

| SLAM algorithms
SLAM could be classified according to the name of the open-source architecture into 16 categories in the selected studies (Figure 8). In addition to using efficient optimization algorithms to reduce processing times and improve accuracy, having a powerful processing unit such as a high-performance CPU or GPU is also crucial for the real-time performance of SLAM. Therefore, to better illustrate the computational performance of SLAM, the computational platform was classified in terms of the chip architecture and computation into CPU and GPU (Table 5). Approximately three-quarters of papers

| DISCUSSION
There have been reputable studies pointing out some issues in current SLAM research (Cadena et al., 2016). For example, there are problems in robustness and scalability under long-term autonomy, such as the failure and recovery of SLAM and metric relocalization. Additionally, metric maps, semantic map models, SLAM combined with deep learning, and novel and unconventional sensors (e.g., range camera, light-field camera, and event camera) all have some limitations to be addressed in SLAM research. And when it comes to UAV-based SLAM, these issues will remain and may give rise to new challenges.
The acoustic sensor is similar to LiDAR and consists of a transmitter, receiver, and signal processing module. The transmitter will continuously emit an acoustic signal, and the receiver will receive a return signal when the acoustic wave encounters an obstacle. The distance between the sensor and the obstacle can be calculated by the time of the sound wave round trip.
Note: The parameters of the sensors were based on Amazon and JingDong (JD) products search and description by the end of 2022. Abbreviations: 3D, three dimensional; LiDAR, Light Detection and Ranging; SLAM, simultaneous localization and mapping.
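The round-trip timing principle described for the acoustic sensor reduces to distance = wave speed x round-trip time / 2, since the signal travels to the obstacle and back. The sketch below illustrates this under stated assumptions: the speed of sound of 343 m/s corresponds to dry air at about 20 degrees Celsius, and the function name and example timing are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at about 20 degrees Celsius

def range_from_round_trip(round_trip_s, wave_speed_m_s=SPEED_OF_SOUND_M_S):
    """Distance to an obstacle from the round-trip time of an emitted pulse.

    The pulse covers the sensor-obstacle distance twice, hence the factor 2.
    """
    if round_trip_s < 0:
        raise ValueError("round-trip time must be non-negative")
    return wave_speed_m_s * round_trip_s / 2.0

# Hypothetical echo received 10 ms after emission.
distance_m = range_from_round_trip(0.010)
```

The same relation applies to other time-of-flight sensors when the appropriate propagation speed is passed as `wave_speed_m_s`.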
To enable the comparison and discussion among the selected studies with the same scenarios, a data synthesis for UAV-based SLAM was prepared (Figure 9). This data synthesis provides a blueprint and guidance to researchers new to the field of UAV-based SLAM. New researchers can refer to the setups to prepare their studies and experiments according to the synthesis. The categories of scenarios in the synthesis were based on the results of this review (Figure 5).
The distribution of the SLAM algorithms using UAVs. The y-axis is algorithms ordered by the published year (earlier year at the lower end of the y-axis), except for "Not Mention" and "Kaarta Stencil SLAM." For the computational platform, "Only CPU" means the studies only use CPU to run, test, and evaluate SLAM algorithms, "CPU and GPU" means the studies use both CPU and GPU to run, test, and evaluate SLAM algorithms, "Not mentioned" means the studies did not specify the computational platforms used. CPU, central processing unit; EKF, extended Kalman filter; FAST-LIO, fast direct lidar-inertial odometry; GPU, graphics processing unit; LOAM, LiDAR odometry and mapping; LSD, large-scale direct; MSCKF, multistate constraint Kalman filter; PTAM, parallel tracking and mapping; SLAM, simultaneous localization and mapping; SVO, semidirect visual odometry; UAV, unmanned aerial vehicle; VINS-mono, monocular visual-inertial system.
Note: The number of symbols "*" in this table represents the level of computational performance, power consumption, and price. Abbreviations: CPU, central processing unit; GPU, graphics processing unit; SLAM, simultaneous localization and mapping; UAV, unmanned aerial vehicle.
WANG ET AL.
To better illustrate the challenges of UAV-based SLAM in different scenarios, the following sections of this discussion focus on sensors, algorithms, tasks and application scenarios, and the factors impacting UAV-based SLAM explored in these studies.

| Sensor
This review highlights the significance and diversity of sensors utilized in UAV-based SLAM (Figure 7). Studies using LiDAR accounted for 20% of the total investigated studies. LiDAR accounted for a large proportion of all application scenarios except for the aquatic environments. This is likely due to its extreme accuracy, simple error model, and robustness in various illumination conditions (Huang, 2021), especially for forest and industrial applications that may involve tasks requiring very accurate measurements, such as forest growth estimation and industrial scene modeling.
Although LiDAR is currently accurate enough to be applied to most tasks and application scenarios, it has limitations (Table 4) due to its high cost, large size, and the single type of information it acquires (Huang, 2021).
In comparison, visual sensors such as monocular cameras (70.7%), stereo cameras (15.5%), and RGB-D cameras (3.1%) were widely used in all kinds of scenarios for UAV-based outdoor SLAM (Figure 7).
RGB-D cameras can acquire both depth information and visual information directly, which makes them very suitable for SLAM research.
FIGURE 9 Synthesis of SLAM studies based on the 97 selected papers. The order of the terms in each block reflects the number of studies, decreasing from top to bottom. 3D, three dimensional; EKF, extended Kalman filter; FAST-LIO, fast direct lidar-inertial odometry; GPS, global positioning system; LiDAR, Light Detection and Ranging; LOAM, LiDAR odometry and mapping; LSD, large-scale direct; MSCKF, multistate constraint Kalman filter; PTAM, parallel tracking and mapping; SLAM, simultaneous localization and mapping; SVO, semidirect visual odometry; UAV, unmanned aerial vehicle; VINS-mono, monocular visual-inertial system.
However, some studies (Jin et al., 2019) pointed out that since the acquisition of depth information is based on structured light or ToF sensors, the measurement accuracy can be easily affected by environmental illumination changes, which means that RGB-D cameras are not reliable in outdoor environments. The monocular camera was more stable and robust than RGB-D cameras in complex, light-changing environments. Moreover, it was low cost, had a simple structure and small size, and could be easily mounted on all kinds of robot platforms. Although monocular cameras cannot directly acquire depth information to determine the absolute scale from images in SLAM, many studies illustrated that the absolute scale issue can be solved from IMU data (Nützi et al., 2011) or estimated by detected objects (Sucar & Hayet, 2018) and self-induced oscillations in hover control of SLAM (Lee & de Croon, 2018). In addition, the stereo camera, inspired by the human eye, uses the disparity between two camera images of the same scene to calculate depth information and does not suffer from the scale-drift problem of monocular cameras (Table 4).
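The way a stereo camera recovers metric depth from disparity can be written as Z = f B / d, where f is the focal length in pixels, B the baseline between the two cameras, and d the disparity in pixels. The sketch below assumes an ideal rectified pinhole stereo pair; the function name and the example parameters (700 px focal length, 0.12 m baseline) are hypothetical and not taken from the reviewed studies.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres of a point seen by a rectified stereo pair.

    focal_px     : focal length in pixels
    baseline_m   : distance between the two camera centres in metres
    disparity_px : horizontal pixel shift of the point between both images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

# Hypothetical camera: 700 px focal length, 12 cm baseline.
near = depth_from_disparity(700.0, 0.12, 14.0)  # larger disparity, closer point
far = depth_from_disparity(700.0, 0.12, 7.0)    # halving disparity doubles depth
```

Because depth is inversely proportional to disparity, a fixed baseline resolves distant points poorly, which is one reason a wider and therefore bulkier stereo rig may be needed at typical UAV flight heights.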
However, this review found that only 20% of the investigated studies used stereo camera in UAV-based outdoor SLAM.This may be caused by the complexity and difficulty of initial configuration, the calibration of stereo camera and its large size compared with monocular camera for all types of robot platforms, especially for UAV.
Overall, UAV platforms equipped with visual sensors, especially monocular cameras, accounted for the majority of the outdoor SLAM studies investigated. Compared with other robotic platforms, this may be due to the characteristics of UAVs, such as limited payload, specific flight motion modes, and limited power.

| Conventional SLAM
Regardless of visual sensors or LiDAR sensors, almost all the investigated studies at the algorithmic level were based on geometric features (such as points, lines, and planes) for outdoor UAV-based SLAM applications (Figure 8). Among them, the ORB-SLAM series
As for LiDAR SLAM, Cartographer (Hess et al., 2016) was the current mainstream algorithm of 2D LiDAR SLAM, which was a graph-based optimization algorithm for indoor map construction. However, 2D LiDAR has limitations in handling complex outdoor environments with factors such as smoke, dust, and rain. The introduction of LOAM (Ji & Singh, 2014) addressed these challenges with 3D LiDAR.
LOAM proposed a novel feature extraction (edge and planar features) and developed a motion compensation for SLAM using timestamps but did not include back-end optimization to improve mapping accuracy. To avoid trajectory drift and further improve the accuracy of pose estimation and map construction, LeGO-LOAM (Shan & Englot, 2018) utilized Georgia Tech Smoothing and Mapping (GTSAM) as back-end optimization. Although these traditional SLAM algorithms performed well on current tasks, SLAM still lacked intelligent perception capability (Figure 2), especially for some specific scenarios and tasks. For example, to solve the problem that the monocular camera cannot acquire depth information, which leads to scale uncertainty (Table 4), a monocular depth estimation solution (Kuznietsov et al., 2017) was proposed by using a semisupervised deep learning model. In addition, conventional geometric feature-based SLAM cannot handle some tasks with special requirements (Figure 5), such as detection, recognition, and classification. Thus, semantic information can be combined with SLAM by deep learning (Li et al., 2017; McCormac et al., 2017). Although there were some works reporting such solutions, they have been mostly used with ground vehicles or other handheld equipment (Lategahn et al., 2011; Quan et al., 2021; Roussillon et al., 2011). When addressing this problem with UAVs, few or no studies have been found due to the complexities and features of UAVs, such as the limited payload capacity, flight time, and energy consumption, as stated by some studies (Abeywickrama et al., 2018; Cabreira et al., 2019).
Dealing with highly dynamic and unstructured surroundings is one of the most difficult issues in UAV-based SLAM. For example, UAVs are typically operated in open outdoor areas where illumination conditions may vary rapidly and wind can severely impair the UAVs' stability and control (Floreano & Wood, 2015). Thus, the robustness of SLAM is more critical for UAV-based SLAM than ground-based SLAM (Ravankar et al., 2018). However, there are more factors affecting UAV-based SLAM compared with other platforms (e.g., ground vehicle, handheld equipment, and backpack device). For example, the relatively short flight time of UAVs, limited by the batteries, might restrict the quantity of data acquired by the sensors, making it difficult to generate reliable maps of the area (Hardin & Jensen, 2011). As a result, relocalization capability for UAVs from the previous SLAM results can also be a challenge, which was also reported in a previous study (Zaffar et al., 2018).
In comparison to UAV-based SLAM, which can involve working in various environments (Figure 9) with complex features (Opromolla et al., 2016; Semsch et al., 2009), ground vehicles are often used in more controlled settings such as indoor or outdoor locations with well-defined paths and prominent landmarks (Bostelman et al., 2015; Kumar et al., 2021; Martínez-Barberá & Herrero-Pérez, 2010). This is because ground vehicles typically follow predetermined routes and rely on pre-existing maps or landmarks for navigation, which can limit their flexibility in unstructured environments (Lynch et al., 2018; Shahzadi et al., 2021). Although both UAVs and ground vehicles may encounter obstacles in their navigation leading to mission failure, the excellent maneuverability (Dvorak et al., 2015) and cross-terrain capabilities (Sujiwo et al., 2016) of UAVs make it easier to deal with obstacles and uneven terrain. However, these characteristics of UAVs also add many challenges at the algorithmic level of SLAM. For example, when performing highly maneuverable UAV tasks (e.g., rescue, military mission, and rapid navigation), the geometric feature-based SLAM algorithm mentioned above (e.g., ORB-SLAM) may fail to initialize or fail to extract features due to the interference with sensors when tracking features with excessive speed of the UAV (Esfahlani, 2019). Therefore, SLAM algorithms should be optimized for different tasks in different application scenarios to increase their robustness.
(Steenbeek & Nex, 2022). There are different ways to build the monocular depth estimation model, such as Unsupervised Learning (Zhan et al., 2018), Semisupervised Learning (Kuznietsov et al., 2017), and Selfsupervised Learning (Godard et al., 2019). Moreover, some computer vision models can extract semantic information from environments for understanding surroundings, improving pose estimation accuracy, and supporting loop-closure detection (Qin et al., 2021; Tateno et al., 2017). There are also some deep learning models for LiDAR SLAM. For instance, Chen et al. introduced a novel semantic mapping method to detect dynamic objects and improve pose estimation accuracy with LiDAR solely (Chen et al., 2019).

| Tasks and application scenarios
As mentioned in this review, most of the tasks in the analyzed studies focused on navigation and detection, and most of the application scenarios were urban, rural, and forest (Figure 5). Our analysis revealed some interesting patterns in the use of UAV-based SLAM in different scenarios. Specifically, we found that algorithm-oriented research is often carried out in urban and rural scenarios, such as university campuses or parks. These scenarios are usually chosen not for a specific application but simply as open outdoor spaces in which to evaluate the algorithm. Other scenarios, such as forest, agriculture, industry, and aquatic, were selected to address challenges specific to those scenarios (Cui et al., 2014; Schultz et al., 2016; Yang, Dani, et al., 2017). However, far fewer of the selected studies worked in forest, agriculture, industry, and aquatic scenarios than in urban and rural ones. This suggests that more work is needed to explore the potential applications of UAV-based SLAM in real-world, task-oriented settings in these environments, such as mapping and monitoring of infrastructure, precision agriculture, and disaster response. Algorithm-oriented tasks account for 40% of all tasks (Figure 5), and most of them are evaluated on public data sets rather than in field experiments. The benefit of evaluating SLAM on public data sets is that a uniform benchmark (Bujanca et al., 2019) can be established to illustrate differences in algorithm performance. Excluding algorithm-oriented tasks, however, most other studies tended to conduct field experiments (60%). A possible explanation is that the SLAM algorithms tested may perform well in only a few specific scenarios and do not achieve comparable performance elsewhere. Another possible explanation is that UAVs have demanding experimental requirements, such as specialist pilots to operate the UAVs, ground equipment to assist SLAM in collecting and processing data, and measures to address UAV safety. Some studies (Kim & Eustice, 2013; Porav et al., 2019) have also conducted dedicated field experiments for different tasks (inspection, detection) and scenarios (rainy weather, underwater).
Therefore, these studies further support the explanation that field experiments in specific scenarios are decisive in task-oriented SLAM research.
In addition, this review found significant differences in the flight height of UAVs across tasks and application scenarios (Figure 5).
Another important finding was that flight heights under 10 m accounted for almost 50% of both task-oriented and algorithm-oriented studies. Some studies have found that low-flying UAVs perform well in tasks such as identifying ruts and potholes (Saad & Tahar, 2019), precision weed management (Huang et al., 2018), and crop characterization (Uto et al., 2013). A likely reason is that UAVs flying at low altitude are able to acquire more precise and detailed information. These results match those observed in earlier studies. However, some studies have also pointed out that high altitudes are more beneficial in other UAV tasks, such as digital elevation model mapping (Ageed & Abulrahman, 2020), georeferenced mosaics of wheat (Gómez-Candón et al., 2014), and single tree height estimation in broadleaf forests (Sadeghi & Sohrabi, 2019). Thus, a low flight altitude may be more beneficial in tasks with high accuracy requirements, while a high flight altitude may be more beneficial in tasks that demand high efficiency and large-scale coverage.
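The altitude trade-off described above can be made concrete with the standard pinhole-camera relation between flight height and ground sampling distance (GSD). The following is a minimal sketch and is not taken from the reviewed studies. It assumes a nadir-pointing camera over flat terrain, and the function names and camera parameters (8 mm focal length, 2.4 µm pixel pitch, 4000-pixel image width) are hypothetical values chosen only for illustration.

```python
def ground_sampling_distance(height_m, focal_length_mm, pixel_pitch_um):
    """Ground size of one pixel in metres (nadir pinhole camera, flat terrain)."""
    if height_m <= 0 or focal_length_mm <= 0 or pixel_pitch_um <= 0:
        raise ValueError("all inputs must be positive")
    # Similar triangles: ground size / height = pixel pitch / focal length.
    return height_m * (pixel_pitch_um * 1e-6) / (focal_length_mm * 1e-3)


def footprint_width(height_m, focal_length_mm, pixel_pitch_um, image_width_px):
    """Ground width covered by a single image, in metres."""
    gsd = ground_sampling_distance(height_m, focal_length_mm, pixel_pitch_um)
    return gsd * image_width_px


# Hypothetical camera, used only for illustration.
FOCAL_MM, PITCH_UM, WIDTH_PX = 8.0, 2.4, 4000

for height in (10, 50, 100):
    gsd_cm = 100 * ground_sampling_distance(height, FOCAL_MM, PITCH_UM)
    width_m = footprint_width(height, FOCAL_MM, PITCH_UM, WIDTH_PX)
    print(f"{height:>4} m: GSD = {gsd_cm:.2f} cm/pixel, footprint width = {width_m:.0f} m")
```

Under these assumptions both quantities scale linearly with height: a tenfold increase in altitude gives tenfold coarser pixels but a tenfold wider footprint, which is consistent with the accuracy versus coverage trade-off noted above.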

| Exploring experimental factors affecting UAV-based SLAM
Experimental factors that influence the performance of SLAM in UAV applications can be divided into hardware, software, and external factors based on the components of UAV-based SLAM (Figure 10).
In terms of hardware, there are several experimental factors to consider for UAV-based SLAM. The sensor is one of the most important hardware components affecting SLAM performance. All investigated studies clearly stated the sensors used (Figure 7) in their experimental setup. SLAM relies on sensor data to map the environment and estimate the UAV's location and orientation. The quality of the raw data dictates the upper limit of SLAM performance (Aguilar-Moreno & Graña, 2022; Zaffar et al., 2018). If the raw data are noisy, inaccurate, or missing, the SLAM results, including the map and pose estimates, will be correspondingly less accurate (Moosmann & Stiller, 2011). The computational platform (Figure 8) can also be a crucial factor for SLAM in UAV applications. In total, 81% of the studies reported a detailed computational platform setup, including the type of CPU or GPU (such as Intel Atom, Intel i7-4900MQ, Nvidia Jetson TX1, and Nvidia GeForce GTX-650), which can satisfy the real-time processing requirements of massive sensor data. Nearly 1% of the studies used both a GPU and a CPU for SLAM because some SLAM algorithms combine deep learning and computer vision models. High-performance computational platforms can enhance the processing efficiency of SLAM (Tuna et al., 2012), especially for deep learning-based SLAM (McCormac et al., 2017; Rosinol et al., 2020). In addition, different types of UAVs can affect the performance of SLAM because of their different flight styles.
In terms of software, sensor calibration and the UAV control system should be considered in the SLAM experiment setup. Only one investigated study focused on sensor calibration, developing a self-calibration system for visual SLAM (Heng et al., 2015). Sensor calibration is a preprocessing procedure for SLAM that is crucial for obtaining accurate data from sensors (Sturm et al., 2012; Trejos et al., 2022). However, sensor calibration is ignored in some SLAM studies for UAV applications. This is mainly because most of the UAVs used in the studies were commercial products (Figure 6a) that did not require secondary development and whose sensors were calibrated at the factory in advance. Another software factor is the control system of the UAV, which contains communication modules, flight control modules, mission planning modules, and so forth (Figure 10). For example, software that controls the UAV's flight path and the timing of sensor measurements can affect the consistency and quality of sensor data (He et al., 2006), which can in turn affect SLAM performance.
For external experimental factors, flight environment conditions and UAV flight setup can impact SLAM in UAV applications. All investigated studies described the scenarios in which the UAV was flown with SLAM (Figure 5), but few, if any, reported detailed information about environmental conditions. The flight environment, such as terrain, wind, weather, and illumination conditions, can impact sensor data quality and consistency (Mohammed et al., 2020; Vargas et al., 2021). For example, in poor light environments, camera sensors may produce low-quality images, leading to inaccuracies in feature identification and matching, erroneous pose estimations, and poor map quality (Laible et al., 2012). Furthermore, flying UAVs in windy or turbulent environments might cause them to move unpredictably, resulting in errors in estimating the UAV's position and orientation (Kothari et al., 2014). In addition, terrain can also have an impact on the performance of the SLAM algorithm. For example, in environments with numerous identical or repetitive features, such as forests or agricultural fields, the SLAM algorithm may struggle to distinguish between different features, leading to errors in feature matching and pose estimation (Zhu et al., 2021). Flight setup, including the speed and flight height of the UAV, can also have an impact on SLAM (Figure 5).
The motion of the UAV, especially its speed, has a considerable influence on SLAM (Henein et al., 2020; Zhou et al., 2021). If the UAV is moving too quickly, it may not be able to acquire enough sensor measurements to accurately estimate its position and orientation, which can result in errors and drift in the estimates and lead to poor navigation and control performance (Dissanayake, Sukkarieh, et al., 2001). Also, in the case of cameras, motion blur can occur if the UAV is moving too quickly, leading to blurry or distorted images that are more difficult to process and use for SLAM (Hayakawa & Ishikawa, 2016). The accuracy of data acquired by sensors may also be affected when the UAV is flying at different altitudes, which can in turn influence SLAM (Hyyppä et al., 2020).
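The effect of speed on image quality can be estimated to first order. The sketch below is an illustrative assumption rather than a method from the cited studies: it considers a nadir-pointing pinhole camera translating parallel to flat ground and ignores rotation, vibration, and rolling-shutter effects. The function names and the example values (a focal length of 1000 pixels, a 10 ms exposure, and a 10 m flight height) are hypothetical.

```python
def motion_blur_px(speed_m_s, exposure_s, height_m, focal_length_px):
    """Approximate blur length in pixels caused by translation during one exposure."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    # Ground distance travelled during the exposure, projected into the image.
    return focal_length_px * speed_m_s * exposure_s / height_m


def max_speed_m_s(max_blur_px, exposure_s, height_m, focal_length_px):
    """Highest speed that keeps the blur at or below max_blur_px."""
    if exposure_s <= 0 or focal_length_px <= 0:
        raise ValueError("exposure and focal length must be positive")
    return max_blur_px * height_m / (focal_length_px * exposure_s)


# Hypothetical flight setup, used only for illustration.
blur = motion_blur_px(speed_m_s=5.0, exposure_s=0.01, height_m=10.0, focal_length_px=1000.0)
limit = max_speed_m_s(max_blur_px=1.0, exposure_s=0.01, height_m=10.0, focal_length_px=1000.0)
print(f"blur at 5 m/s: {blur:.1f} px; speed limit for 1 px of blur: {limit:.1f} m/s")
```

In this simplified model, blur grows linearly with speed and exposure time and shrinks with flight height, so a low and fast flight is the most demanding case for camera-based SLAM.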

| Strengths and limitations
UAV-based SLAM has several strengths and weaknesses that stem from the advantages and disadvantages of the UAV platform.
UAVs are fast and, compared with ground vehicles, have an excellent capability to cross terrain and obstacles. Moreover, UAVs offer an advantageous view from high altitude that is not available to ground-based equipment (Mohr & Fitzpatrick, 2008).
More importantly, the use of UAVs causes less damage to the ecosystem (Kushleyev et al., 2013). However, current UAV technologies have some limitations. Vibrations caused by wind at high flight altitudes can shift the center of gravity rapidly, creating a high risk of losing control (Coppola et al., 2020). Also, because of limited flight time, limited remote-control distance, and limited power, UAVs cannot be used in critical missions, particularly UAVs weighing less than 4 kg (Nieto et al., 2003).
In terms of algorithms, there are currently plenty of SLAM algorithms with high performance and robustness in UAV applications (Figure 8). However, UAV-based SLAM still faces challenges in specific scenes. For example, limitations appear when UAVs perform long-term autonomous SLAM that covers a large area and spans a long period, such as environmental monitoring in ocean or land exploration and large-scale precision agriculture. Because outdoor environments are dynamic and harsh, data association in SLAM may fail easily (Martinez-Cantin & Castellanos, 2005; Sossalla et al., 2022). Many parameters are tuned when SLAM is initialized, and leaving them unchanged may cause SLAM to fail when the environment changes (Saeedi et al., 2019). Besides, hardware, including sensors and UAV platforms, may introduce errors during a task, which can have serious consequences such as a crashed or lost UAV. In addition, because the payload of a UAV is extremely limited, onboard resources constrain computational time and memory footprint, and generating maps may consume a large share of those resources.
At the hardware level, the main sensors in SLAM are LiDAR and cameras (Figure 7). LiDAR sensors are limited by the lack of variety in the information they acquire, while visual sensors are limited by a small field of view, which leads to incomplete information acquisition. Some SLAM studies try to address these sensor challenges, for example by compensating for the limited information from LiDAR through fusion of different sensors (Xu et al., 2022) or by developing new sensors such as event cameras (Gehrig et al., 2020)
et al., 2019), EuRoC (Burri et al., 2016), UZH-FPV Drone Racing (Majdik et al., 2017), Zurich Urban MAV (Oettershagen et al., 2016), Solar-powered UAV Sensing and Mapping data set (Oettershagen et al., 2016), Kagaru Airborne Stereo data set (Warren et al., 2014), and Event-Camera Data set (Mueggler et al., 2017). The lack of UAV data sets has led to a lack of benchmark evaluations of UAV-based SLAM.
However, most algorithm-oriented studies have relied heavily on public data sets for implementation and evaluation. Some public data sets are overexposed, which may lead to biased evaluation (Liu, Fu, et al., 2021). In addition, most public data sets cannot adequately reflect the real-world circumstances that a UAV might face in a new environment because the environments in public data sets are frequently preselected. Therefore, avoiding evaluation bias in order to improve the robustness and generalizability of SLAM algorithms is a challenge. Moreover, most public data sets include only a single type of basic sensor, which differs from what a real-world UAV would have access to. For example, the Middlebury (Seitz et al., 2006), EPFL (Strecha et al., 2008), and TUM MonoVO (Engel et al., 2016) data sets only provide video data captured by cameras, and the KinectFusion (Meister et al., 2012), TUM RGB-D (Sturm et al., 2012), and ETH RGB-D (Oleynikova et al., 2017) data sets only provide RGB-D sensor data.
On the basis of the results and discussions, several research directions for UAV-based SLAM emerge for future studies. In terms of sensors, multisensor fusion can better support the robustness, stability, and accuracy of SLAM. For UAV-based SLAM, sensor miniaturization and the development of new sensors are major challenges for future research. For example, wide field-of-view cameras such as fisheye cameras or omnidirectional cameras (Caruso et al., 2015) allow for observing and reconstructing a wider scene. Event cameras (Vidal et al., 2018) are robust to motion blur and have very high dynamic range. With the miniaturization of sensors, it becomes possible to carry a wider variety of sensors on a UAV with limited payload. The fusion of multiple sensors will not only improve the accuracy of SLAM but will also provide diverse features that may be suitable for different specific tasks. Thus, SLAM with fusion of multiple sensors (LiDAR, cameras, IMU, GPS, sonar) will likely be the next research focus in UAV-based SLAM (De Pazzi et al., 2022; Zhang et al., 2012).
For UAV platforms, although current UAV devices are sufficiently stable and have a wide range of exploration, fast battery consumption, lack of durability, and limited communication range for SLAM are issues that need to be addressed in future research (Galkin et al., 2019; Olsson et al., 2010). Furthermore, UAV swarm collaboration and collaboration between UAVs and ground vehicles can also address the shortcomings of individual UAVs (Zhou et al., 2022). For example, a UAV swarm can increase coverage area, enhance fault tolerance, and provide higher mission efficiency than a single UAV (Zhou et al., 2020, 2022). Collaboration between UAVs and ground vehicles can supply energy to the UAVs, which ensures long-term flight (Li, Cheng, et al., 2021; Papachristos & Tzes, 2014).
Regarding SLAM algorithms, lightweight and real-time algorithms need to be developed for small mobile platforms like UAVs to ensure a reasonable allocation of computational resources. Recent advances in computer vision and deep learning have shown great promise for improving the accuracy and efficiency of UAV-based SLAM. For example, semantic segmentation and object detection algorithms such as Swin Transformer (Liu, Lin, et al., 2021), DeepLab-V3 (Chen et al., 2017), and YOLO-V7 (Wang et al., 2022) can address the identification and classification of objects in the environment, enabling more effective localization and mapping (e.g., dynamic feature removal and object-based features) as well as rich semantic information (Wu et al., 2022; Zhang et al., 2018). As mentioned above, the lack of absolute scale in monocular camera SLAM can also be effectively addressed by recent monocular depth estimation algorithms (Cordts et al., 2016). Similarly, 3D reconstruction methods such as NeRF (Mildenhall et al., 2021) can be used to generate more detailed and accurate maps of the environment. By integrating techniques such as computer vision, deep learning, and reinforcement learning, UAV-based SLAM can achieve more advanced and complex tasks, such as autonomous navigation, object manipulation, and so forth. Several studies (Botteghi et al., 2020; Li et al., 2018; Wen, Zhao, et al., 2020)
with SLAM (Aslan et al., 2022; Cheein et al., 2011; Krul et al., 2021).
UAVs with SLAM can also be used to inspect and monitor structures, such as bridges, ships, pipelines, and buildings, to identify potential hazards and carry out maintenance and repairs more effectively (Bian et al., 2018; Sato & Anezaki, 2017). Similarly, UAV-based SLAM can be applied in disaster management, allowing rapid mapping and assessment of affected areas to direct response activities (Lee et al., 2016). These are just a few examples of the many potential applications of UAV-based SLAM. As technology develops, it is likely that many new opportunities will arise.
The autonomy of UAVs depends on their ability to perceive and navigate their environment independently, and UAV-based SLAM is a key enabling technology to achieve this. Several recent studies have highlighted the potential of UAV-based SLAM for improving autonomy and increasing the range of tasks that UAVs can perform. For example, several studies (Naveed et al., 2022; Wang et al., 2019) showed that by integrating visual SLAM with reinforcement learning, robots could learn to navigate complex indoor environments with high efficiency and autonomy. Another study (Azpúrua et al., 2021) demonstrated the potential of UAVs for performing complex inspection tasks in hazardous environments.
In the future, the development of UAV-based SLAM can be combined with deep learning, sensor, battery, and communication technologies, leading to an unmanned, autonomous UAV system for every scenario in city, agriculture, forest, industry, river, and ocean.

| CONCLUSION
This scoping review provides an overview of different tasks and  (Harmat et al., 2015).

FIGURE 2 The history and milestones of general SLAM, the path to SLAM divided into LiDAR-based (green blocks) and Visual-based (blue blocks). The red blocks show the beginning of UAV-based SLAM in 2003, the explosion of UAV applications, and the new era of UAV-based SLAM. The path in LiDAR-based SLAM was through 2D versus 3D LiDAR era, and the path in visual SLAM was through Filtering versus Smoothing era, Direct versus Indirect era, and Inertial Fusion era. And the next stage for LiDAR-based and visual SLAM was deep learning era and sensor fusion. 2D, two dimensional; LiDAR, Light Detection and Ranging; SLAM, simultaneous localization and mapping; UAV, unmanned aerial vehicle.
WANG ET AL. | 1621

FIGURE 3 PRISMA 2020 flow diagram for updated systematic scoping reviews, which included searches of databases, registers, and other sources. PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses; SLAM, simultaneous localization and mapping; UAV, unmanned aerial vehicle.
studies were conducted in 29 countries, with the largest number of studies coming from the USA (n = 20), China (n = 19), Germany (n = 13), Switzerland (n = 12), and Spain (n = 11). The number of publications peaked in 2021 (n = 13) with fluctuations in the field, although there was a dip in 2022 due to the search period limitations (by October 8, 2022). Rapid growth after 2010 could probably be due to the improvement of UAV technologies and the increased availability of off-the-shelf UAV systems. Due to the sharply increasing interest in UAVs after 2010 (Nex et al., 2022), there was a definite upward trend in UAV-based SLAM research after 2010. Most studies were published in robotics and remote sensing-related journals and conferences (Figure A1), including IEEE International Conference on Robotics and Automation (n = 8), IEEE Transactions on Robotics (n = 7), Journal of Field Robotics (n = 5), Journal of Intelligent and Robotic Systems (n = 5), and IEEE/RSJ International Conference on Intelligent Robots and Systems (n = 5).
evaluated the methods of data collection in forests by handheld devices, UAVs flying under the canopy, and UAVs flying above the canopy, concluding that features collected by UAVs flying under the canopy are more accurate. As a result, the flight height of a UAV is employed as an important evaluation indicator when performing a UAV task. Flight height can best be treated under five levels: low altitude (height <10 m), medium-low altitude (10-30 m), medium altitude (30-50 m), medium-high altitude (50-100 m), and high altitude (height >100 m). Flight heights of less than 10 m clearly accounted for the greatest number of tasks (n = 52) across the UAV missions.
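The five flight-height levels above can be expressed as a small binning function. This is a minimal sketch: the function name and the sample heights are hypothetical, and because the review does not state how exact boundary values are assigned, the code assumes that lower bounds are inclusive and that exactly 100 m counts as medium-high.

```python
from collections import Counter


def classify_flight_height(height_m):
    """Map a flight height in metres to one of the five levels used in this review."""
    if height_m < 0:
        raise ValueError("flight height cannot be negative")
    if height_m < 10:
        return "low"
    if height_m < 30:  # assumption: 10 m belongs to medium-low
        return "medium-low"
    if height_m < 50:  # assumption: 30 m belongs to medium
        return "medium"
    if height_m <= 100:  # assumption: 50 m and 100 m belong to medium-high
        return "medium-high"
    return "high"


# Hypothetical flight heights in metres, not data from the reviewed studies.
sample_heights = [3, 8, 12, 45, 80, 120]
print(Counter(classify_flight_height(h) for h in sample_heights))
```

Tallying studies with such a function makes the boundary convention explicit, which matters when reported heights fall exactly on 10, 30, 50, or 100 m.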
The publication trend in the field of UAV-based SLAM from 2003 to 2022.The publications per year of five main countries are shown in this figure.SLAM, simultaneous localization and mapping; UAV, unmanned aerial vehicle.
the experimental methods are categorized in three ways: public data sets, simulation platform experiments, and field experiments (Figure 6). Although some studies combined public data sets, simulation platforms, and field experiments, field experiments (n = 66) accounted for the vast majority of the SLAM evaluations in the selected papers. Generally, field experiments carry out SLAM with UAVs and the necessary equipment, such as a ground computing platform and sensors, in the targeted scenarios mentioned in Figure 6. Unlike evaluation on public data sets and simulation platforms, this method shows the real results and performance of SLAM in the target scenarios. However, public data sets and simulation platforms are very efficient for developing new SLAM algorithms and convenient for benchmarking against other SLAM algorithms.

4.5 | Hardware for UAV-based outdoor SLAM
4.5.1 | UAV platforms
The UAV platforms adopted in the selected studies can be categorized as commercial or self-made and, by airframe, as quadrotor, hexacopter, octorotor, helicopter, or fixed wing. The ratio of commercial UAVs to self-made UAVs in the eligible papers was almost 1:1 (Figure 6). Quadrotor UAVs accounted for a very high proportion of both commercial (n = 30) and self-made (n = 30) platforms. Fixed-wing UAVs accounted for a small percentage of UAV-based SLAM studies compared with multirotor UAVs, with two studies using commercial fixed-wing UAVs and three studies (Hinzmann et al., 2016; Suzuki, Amano, & Hashizume, 2011; Suzuki, Amano, Hashizume, & Suzuki, 2011) using self-made fixed-wing UAVs.
FIGURE 6 Experimental method categories. The chord chart (a) indicates the relationship and proportion among experimental methods (fieldwork, simulation, and open-source data set), type of UAVs (commercial and self-developed), and the number of rotors of UAVs (quadrotor, hexacopter, octorotor, fixed wing, and helicopter). The tree chart (b) illustrates the detailed simulation platforms and open-source data sets in experimental methods as well as their proportion used in the literature by the size of the squares. UAV, unmanned aerial vehicle.
4.5.2 | Sensor categories
The sensors used in the eligible studies can be classified into three categories based on their sensing principles: camera, LiDAR, and others (Figure 7). Monocular cameras (n = 68) were the most popular sensors used in SLAM. Both LiDAR (n = 21) and stereo camera (n = 15) accounted for 11.5%. Depth camera (Li-Chee-Ming & Armenakis, 2018) and offline reference map (Shao et al., 2021) were only used in one study each. Monocular cameras accounted for a significant portion of UAV-based SLAM research using only cameras. In SLAM research involving sensor fusion, monocular cameras and LiDAR account for the largest share. No studies for outdoor applications have yet combined cameras, LiDAR, and other sensors.
Among the existing SLAM methods, studies using EKF-SLAM (n = 24) and ORB-SLAM (n = 23) represented the largest proportion of the selected papers. By sensor type, LOAM (n = 5) was the most frequently used method among the LiDAR-based SLAM studies, while ORB-SLAM (n = 18) accounted for the largest proportion of visual-based SLAM. Most of the remaining SLAM algorithms in the selected studies were visual SLAM, such as SVO (n = 5), LSD-SLAM (n = 5), PTAM (n = 10), VINS-Mono (n = 3), VINS-Fusion (n = 1), and MSCKF (n = 2), whereas FAST-LIO (n = 1) was based on LiDAR SLAM.

(Figure 8) in the eligible studies used only the CPU (n = 76) in their SLAM implementations. Only a very small proportion of the studies involved a GPU (n = 8) in UAV-based SLAM. In the field of SLAM, the availability of open-source code is significant for community development. Yet only one in five of the investigated studies were open source. Both task-oriented and algorithm-oriented open-source studies played a significant role among the eligible studies in this review. Although open-source studies made up a relatively small proportion of the total research, they have become milestones in SLAM.

FIGURE 7 The distribution of sensors in the investigated studies. The sensors were divided into camera-based, LiDAR-based, and others. The red dots on the left columns indicate the studies using a single type of sensor, and green dots indicate the studies using two types of sensors. LiDAR, Light Detection and Ranging.
TABLE 4 Overview of sensor in SLAM, including current main sensor types with their sensing principle, and basic parameters, including sensing distance, data types, price, power consumption, and the weight of the sensor from small to large.
images or videos with a single lens and convert the light into digital data that can be processed by computers. While it lacks depth perception, it may be integrated with other sensors or algorithms to estimate depth and provide extra environmental information.
images or videos from slightly varied perspectives, allowing it to discern depth and create 3D images. The principle of the sensor is that they mimic the human eyes perceive depth by using the differences in the two images to calculate the distance to objects in the scene.
cameras captures both color and depth information using a combination of traditional RGB cameras and infrared or time-of-flight (ToF) sensor. It allows for more precise and comprehensive 3D modeling and object detection, which are widely utilized in robotics, virtual reality, and computer vision applications.
hemispherical or panoramic images of the world using a wide-angle lens. The fisheye cameras use lens with a very short focal length and a highly curved front element to achieve a wider field of view than traditional camera
composed of laser transmitter, receiver, scanner, and signal processing module. The current mainstream LiDAR ranging principle is to calculate ToF of the transmitted pulse signal and the reflected signal received by

TABLE 5 CPU and GPU model examples used in UAV-based SLAM.
For example, in poor light environments, camera sensors may produce low-quality images leading to inaccuracies in feature identification and matching in erroneous pose estimations and poor
FIGURE 10 Experimental factors impacting on UAV-based SLAM. The first column (blue) shows the hardware factors, including sensor, processing power, and UAV types; the second column (orange) shows software factors, including the sensor calibration and SLAM control system; and the third column (green) shows the external factors, which include flight environment (e.g., wind and weather) and UAV flight setup (e.g., speed and flight height). Some examples mentioned these factors in the selected literature. LiDAR, Light Detection and Ranging; SLAM, simultaneous localization and mapping; UAV, unmanned aerial vehicle.
have demonstrated the effectiveness of these techniques in improving the performance of SLAM. Moreover, current UAV-based SLAM algorithms are applicable to relatively few scenarios, and future research should collect and establish open-source data sets for different scenarios to support SLAM algorithm development and benchmark tests. Finally, current research on UAV-based SLAM is mainly focused on urban and rural environments, while the proportion of research in agricultural and industrial areas is low. The high mobility and autonomy of UAVs combined with SLAM offer significant potential for a wide range of applications, such as smart agriculture, infrastructure, and natural hazard monitoring. For instance, UAVs can collect high-resolution imagery and generate comprehensive maps of crop growth, health conditions, and yield estimation, enabling precision agriculture practices and more effective resource utilization
applications scenarios employing SLAM and UAVs with various sensors. Currently, the navigation of UAVs relies on GPS/global navigation satellite systems or expert operation within visible range. However, in GPS-denied outdoor environments, SLAM could be an important technology for navigating UAVs and generating maps of their surroundings. UAV-based SLAM is still in its infancy compared with SLAM on other robotic platforms, such as autonomous cars, warehouse robots, and industrial robots. UAV applications with conventional geometric feature-based SLAM algorithms (e.g., ORB-SLAM, LSD-SLAM, and VINS-Mono) are relatively mature and have some practical applications, such as fire detection, forest growth assessment, and infrastructure inspection. However, there are still relatively few applications of UAV-based SLAM in other fields, and compared with SLAM for other robotic platforms, UAV-based SLAM is still not widely combined with current technologies such as deep learning. Lack of related data sets, real-time processing performance, and robustness for long-term autonomy are still challenges. As a next step, UAV-based SLAM will be developed in specific application scenarios and related data sets will be built. At the moment, the monocular camera is the most widely used sensor in UAV-based SLAM because of its small size, light weight, ease of use, and robust performance. Different types of sensors have their pros and cons in different tasks and application scenarios. Therefore, some studies have focused on multisensor fusion, which compensates for the deficiencies of individual sensors and can improve the accuracy of SLAM. However, apart from IMU and GPS sensor data, few, if any, studies adopted sensor fusion for more than three different types of sensors. This may be due to the limitations of the UAV's payload, so it is equally important to balance the weight of the battery, computing platform, and sensors within the limited UAV payload. Given the current limitations of UAVs, sensors, and computational power, UAV-based SLAM experiments should be designed carefully, and challenging influences should be considered in different experimental scenarios. Thus, we provide three groups of factors, at the hardware, software, and external levels, for the UAV-based SLAM experiment setup. Researchers can design their UAV-based SLAM experiments with reference to these experimental factors, which may influence the SLAM results.