Optimizing aerial imagery collection and processing parameters for drone‐based individual tree mapping in structurally complex conifer forests

Recent advances in remotely piloted aerial systems (‘drones’) and imagery processing enable individual tree mapping in forests across broad areas with low‐cost equipment and minimal ground‐based data collection. One such method involves collecting many partially overlapping aerial photos, processing them using ‘structure from motion’ (SfM) photogrammetry to create a digital 3D representation and using the 3D model to detect individual trees. SfM‐based forest mapping involves myriad decisions surrounding methods and parameters for imagery acquisition and processing, but it is unclear how these individual decisions or their combinations impact the quality of the resulting forest inventories. We collected and processed drone imagery of a moderate‐density, structurally complex mixed‐conifer stand. We tested 22 imagery collection methods (altering flight altitude, camera pitch and image overlap), 12 imagery processing parameterizations (image resolutions and depth map filtering intensities) and 286 tree detection methods (algorithms and their parameterizations) to create 7,568 tree maps. We compared these maps to a 3.23‐ha ground reference map of 1,775 trees >5 m tall that we created using traditional field survey methods. The accuracy of individual tree detection (ITD) and the resulting tree maps was generally maximized by collecting imagery at high altitude (120 m) with at least 90% image‐to‐image overlap, photogrammetrically processing images into a canopy height model (CHM) with a twofold upscaling (coarsening) step and detecting trees from the CHM using a variable window filter after applying a moving window mean smooth to the CHM. Using this combination of methods, we mapped trees with an accuracy exceeding expectations for structurally complex forests (for canopy‐dominant trees >10 m tall, sensitivity = 0.69 and precision = 0.90). Remotely measured tree heights corresponded to ground‐measured heights with R2 = 0.95. 
Accuracy was higher for taller trees and lower for understorey trees and would likely be higher in less dense and less structurally complex stands. Our results may guide others wishing to efficiently produce broad‐extent individual tree maps of conifer forests without investing substantial time tailoring imagery acquisition and processing parameters. The resulting tree maps create opportunities for addressing previously intractable ecological questions and informing forest management.


| INTRODUCTION
Forest inventories characterize the species, size, condition and location of individual trees and are critical resources for advancing ecological theory and informing forest management (Hubbell et al., 1999; Lasky et al., 2014; North et al., 2021; Whittaker, 1956; Wright et al., 2010; Young et al., 2020). Forest inventories are traditionally completed by ground-based field crews and require substantial time, labour and financial investment, which limits their spatial extent and continuity (Gray et al., 2012; USDA Forest Service, 2016). To address these constraints, forest mapping approaches have more recently employed remote sensing data to create continuous forest inventories over broad areas. Remote sensing-based forest mapping has traditionally taken an 'area-based' approach in which remote sensing data (e.g. spectral reflectance data from satellite or aerial imagery) are used to estimate forest summary statistics such as tree density, mean tree height and above-ground biomass (De Luca et al., 2019; Jayathunga et al., 2018; Lamping et al., 2021; Puliti et al., 2019; Rodman et al., 2019). However, the increasing quality of remote sensing data and processing workflows has recently enabled remote forest mapping more analogous to field-based approaches that involve detecting and characterizing individual trees (Jeronimo et al., 2018; Koontz et al., 2021; Swayze et al., 2021).
Small remotely piloted aerial systems (RPAS, or 'drones') provide data at a scale particularly well suited for individual tree detection (ITD). A fundamental technique in drone-based forest mapping involves collecting many partially overlapping images in a dense grid over the study area. The images are supplied to a photogrammetry algorithm, which employs principles of perspective and triangulation to estimate the 3D structure of the landscape by quantifying the amount by which landscape features move relative to each other between images. This method is commonly referred to as 'structure from motion' (SfM; Dandois & Ellis, 2013; Iglhaut et al., 2019; Westoby et al., 2012) because the many optical perspectives from the drone as it moves allow modelling of the 3D structure of objects and landscapes. The structure data can be represented as a point cloud in which each point identifies a surface (e.g. leaf, stem, ground) that appears in multiple photos. The point cloud data can be processed into raster-format vegetation canopy height models (CHMs).
SfM-derived point cloud data share many characteristics with point clouds derived from aerial light detection and ranging (lidar; also known as aerial laser scanning, ALS), which can also be used for ITD (Jeronimo et al., 2018; Zaforemska et al., 2019) and may better capture subcanopy structure because some laser pulses penetrate the canopy (Jayathunga et al., 2018; Lamping et al., 2021; Lisein et al., 2013). Because ALS-derived point clouds have historically been collected using crewed aircraft flying higher than a typical drone operation, they generally cover more ground area per mission but have substantially reduced resolution and point density (e.g. <10 points/m²; USGS, 2018; Weinstein et al., 2021) compared to drone-based SfM point clouds (e.g. >100 points/m², this study). Technological advances have enabled drone-mounted lidar instrumentation that can achieve a density of thousands of points/m² (Kellner et al., 2019; Lin et al., 2011; Sankey et al., 2017). In comparison to drone-mounted lidar, however, drone-based SfM data are much less costly to obtain (data can be collected with standard RGB [red, green, blue] cameras) and can be collected over moderately sized focal areas with high frequency and minimal advance planning (Camarretta et al., 2020; Mlambo et al., 2017).
Numerous algorithms have been developed to detect individual trees from CHMs (e.g. Popescu & Wynne, 2004) and directly from point clouds (e.g. Li et al., 2012; Xiao et al., 2019). ITD accuracy varies considerably depending on the stand structural characteristics and algorithms used, with higher accuracy in lower density stands and in overstorey versus understorey trees. ITD accuracy is arguably best summarized using the F score, which incorporates the rates of both true and false positive detections. The F score is calculated as the harmonic mean of the sensitivity (proportion of ground reference trees detected; also called recall) and the precision (proportion of detected trees that match ground reference trees) and ranges between 0 (no ground trees detected) and 1 (all ground trees detected and no false positive detections).
Recent ITD work using drone-derived SfM products for overstorey trees (Creasy et al., 2021; Mohan et al., 2017) or for all trees in low- to moderate-density stands (Belmonte et al., 2020; Bonnet et al., 2017; Swayze et al., 2021) has obtained F scores ranging roughly between 0.75 and 0.85, whereas for high-density stands or understorey trees, performance tends to be lower (e.g. F < 0.65; Creasy et al., 2021). The height and canopy extent of automatically detected trees can usually be measured from CHM or point cloud data with high accuracy (RMSE: 3%-7% and R² > 0.70; Belmonte et al., 2020; Creasy et al., 2021; Silva et al., 2016), though the narrow tops of standing dead trees can be missing in the 3D reconstruction, leading to underestimates of dead tree height.
Despite the promise of drone-based tree mapping using SfM, relatively little work has quantitatively evaluated the influence of different imagery collection, imagery processing and tree detection methods on the accuracy of the resulting tree maps. Using an oblique (as opposed to directly downward, or 'nadir') camera pitch can increase the accuracy of digital terrain models derived from drone images in areas with low vegetation cover (Nesbit & Hugenholtz, 2019) and in forests can increase the point cloud density in the lower canopy and understorey (Díaz et al., 2020;Lamping et al., 2021).
However, the only published evaluation of camera pitch specifically in the context of ITD found that tree detection accuracy was greater with a nadir versus oblique camera pitch. Flight (image collection) altitude may additionally affect 3D reconstruction quality, likely through its effect on the spatial resolution of the resulting imagery (higher altitude results in coarser grain imagery; Dandois et al., 2015). However, previous work has found little difference in ITD performance among flights conducted between 64 and 115 m above ground level and between 50 and 100 m above ground level (Torres-Sánchez et al., 2018).
Finally, while increased image collection density (i.e. overlap) is associated with increased point cloud quality and density (Dandois & Ellis, 2013; Frey et al., 2018; Ni et al., 2018), it also increases image dataset size and acquisition and processing times. Increasing image overlap can increase ITD accuracy, but it provides diminishing returns to accuracy at increasingly high overlap (Torres-Sánchez et al., 2018).
Image resolution and outlier filtering are key parameters that can be adjusted during the SfM processing. A strong understanding of photogrammetric analysis principles can provide key insights into how these parameters may be adjusted to yield more successful 3D reconstructions (Over et al., 2021; USGS, 2017), but empirical validation of these workflows in the context of forest inventories is generally lacking. Only one study to our knowledge has evaluated image resolution and point cloud filtering parameters in the context of ITD (Tinkham & Swayze, 2021). Using the Metashape v1.6.4 photogrammetry software (Agisoft, LLC), Tinkham and Swayze (2021) found that retaining maximal image resolution and minimizing outlier filtering during point cloud generation yielded the greatest ITD performance. However, this study did not evaluate the influence of image resolution during the alignment stage. Using full image resolution during processing may increase point cloud detail and density (Jayathunga et al., 2018; Lisein et al., 2013), but (a) higher resolution data can substantially increase processing times, (b) high-resolution images may be difficult to align and compare when they include small surfaces like leaves and branches that move or blow in the wind and (c) the extent to which any increase in point cloud detail translates to improved ITD performance is not well known.
[…] identifying a parameterization of a point cloud-based method (Roussel, 2021b) as the most accurate. While they also included a test of a variable window filtering algorithm (Plowright, 2021), they tested only three parameter sets for this algorithm based on previous results from lidar acquisitions (Popescu & Wynne, 2004) and thus provide limited opportunity for comparison with previous studies that employed this method.
While work to date has quantified the influence of flight and photogrammetry parameters on the resulting photogrammetry products, few studies have evaluated how these parameters affect the accuracy of the forest inventory, the ultimate product that informs ecological inference and management decisions.
Furthermore, evaluations of SfM-based tree detection algorithms to date have generally not simultaneously considered the role of imagery collection and processing parameters used to create the photogrammetry products. Evaluating the influence of these categories of variables jointly may allow the detection of consistent effects versus idiosyncrasies and may reveal important interactions that enable meaningful improvements in ITD accuracy and efficiency. In addition, many evaluations of ITD methods have been conducted in stands with relatively simple structure and low tree density, potentially yielding parameter selection and tree detection performance different than may be expected in higher density, more structurally complex stands. In the present study, we evaluate multiple factorial combinations of imagery collection parameters (flight altitude, camera pitch and image overlap), imagery processing parameters (image resolution for image alignment and for dense cloud generation, and point cloud outlier filtering intensity) and tree detection methods (algorithm and parameterization), for a total of 7,568 combinations, in a moderately dense, structurally complex mixed-conifer stand in the Sierra Nevada of California.

| Overview
We created a ground reference tree map of 1,775 trees >5 m tall in a 3.23-ha focal area using traditional survey methods. We also used automated algorithms to create 7,568 alternative tree maps for this area from aerial imagery collected by drone to evaluate the influence of image acquisition and processing parameters on aerial tree mapping accuracy. In Stage 1, we identified the best-performing photogrammetry processing parameters and automated tree detection methods. In Stage 2, we applied those methods to identify the best image acquisition parameters (flight altitude, image overlap and camera pitch; Figure 1).

| Focal area
Our study site was a 3.23-ha area of mixed-conifer forest (Safford & Stevens, 2017) in Emerald Bay State Park on the shore of Lake Tahoe in the Sierra Nevada of California (Figure 2a). The stand is co-dominated, in decreasing order of abundance, by ponderosa pine Pinus ponderosa, incense cedar Calocedrus decurrens, Jeffrey pine Pinus jeffreyi and white fir Abies concolor. The stand has high structural complexity, with a continuous size distribution and small trees interspersed with larger trees and often underneath their canopies (Figures 2b and 3). The topography is flat (elevation range: 1,900 to 1,905 m). We developed a 3.23-ha ground reference stand inventory by establishing a grid of points and measuring the distance (using a laser rangefinder) and azimuth (using a sighting compass) to each tree with DBH > 7.5 cm from a nearby grid point (Appendix S1

| Imagery collection and pre-processing
We collected RGB aerial photographs using a DJI Phantom 4 […]

FIGURE 1
The optimization workflow employed in this study. For each of the two optimization stages (vertical bars), the ranges of tested parameters appear on the left, and the ranges of selected parameters appear on the right in black text, with the best-performing sets bolded. Additional parameters tested in Stage 2 that were not derived from Stage 1 results appear in purple text. For each optimization stage, all factorial combinations of parameters across each category (e.g. altitude, photogrammetry and tree detection) were tested, except that only certain pitch-overlap combinations were tested (factorial combinations within blue outlines). While Stage 2 revealed several parameter combinations with strong performance (see text), the right-hand side of the diagram shows the most consistently best (e.g. vwf_196 tree detection) and/or most practical ( […] where y is the radius of the search window for higher points centred on a focal point, and x is the height of the focal point; this algorithm is applied to the CHM following a 9 × 9 pixel moving window mean smooth. Other tree detection parameter sets are defined in Appendix S2: Tables S1 and S2; photo sets are described in Table 2; and photogrammetry parameter sets are described in Table 3. Photogrm.: photogrammetry parameterization; Tree det.: tree detection algorithm; VWF: variable window filter; Pt. cld.: point cloud-based method

[…] Table 1). We report image overlap percentage for a mission in the format 'front overlap/side overlap', with units in per cent (e.g. '80/80'). Actual image overlap inevitably differs slightly from the specified overlap (e.g. due to occasionally missed photos, a normal occurrence with some common DJI drones; Figure 2c), so we treat the overlap amount as a 'nominal overlap', reflecting what a future user may expect when using these settings with a similar aircraft. We collected multiple image datasets using different flight parameters (altitude, gimbal pitch and image overlap; […] All missions used automatic exposure and automatic white balance settings and were flown in MapPilot's 'connectionless' mode. We used the 'terrain awareness' function so that the aircraft re- […]

FIGURE 2
[…] (c) Spatial locations of drone photos from two drone photo sets ('high nadir' 95% front and side overlap in yellow; and 'high nadir' 90% front and side overlap in purple; Table 2). (d) Canopy height model (lighter indicates taller) of a section of the focal area, with ground-mapped trees shown as light blue points, drone-mapped trees shown as dark red points and pairings between ground- and drone-mapped trees shown as purple lines. In (d), the canopy height model was created by applying photogrammetry parameter set 16 (Table 3) to the 'high nadir' photo set with 90% front and side overlap (Table 2). The drone-derived stem map was obtained by applying tree detection algorithm 'vwf_059' (Appendix S2: Table S1) to this same canopy height model

[…] small clouds for brief periods. We used natural features, with geographic locations identified using Google Earth imagery, as ground control points (Appendix S1). To test the effect of image overlap on the quality of the resulting tree maps, we subsetted the photo sets to effectively reduce image overlap by retaining every nth image on every nth transect (Appendix S1).
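As a rough illustration of why retaining every nth image (or transect) reduces nominal overlap, the sketch below assumes evenly spaced photos with a constant ground footprint, so that overlap equals 1 minus spacing divided by footprint. The function name `thinned_overlap` is hypothetical and the calculation is idealized geometry only; the thinning procedure actually used is the one described in Appendix S1.

```python
def thinned_overlap(nominal_overlap_pct: float, n: int) -> float:
    """Nominal overlap (%) after keeping every n-th image or transect.

    Assumes evenly spaced photos with a constant footprint: keeping every
    n-th photo multiplies the spacing, and hence the non-overlapping
    fraction of the footprint, by n.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    remaining = 100.0 - n * (100.0 - nominal_overlap_pct)
    return max(remaining, 0.0)  # photos no longer overlap once this hits zero


# Thinning a nominal 95% overlap mission by n = 1..4
for n in (1, 2, 3, 4):
    print(n, thinned_overlap(95.0, n))
```

Under these assumptions, a mission flown at 95/95 and thinned with n = 2 behaves like a nominal 90/90 mission.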
Finally, we tested whether combining nadir (0°) and oblique (25°) camera pitch missions into a single composite photo set yielded improved photogrammetric performance and ultimately more accurate tree maps. For each flight elevation (90 m and 120 m), we prepared two composite photo sets with different overlaps (Appendix S1). We report the overlap percentages of these composite datasets as the

| Photogrammetric processing and post-processing
We performed photogrammetric SfM processing of the aerial image sets (see Introduction) to produce 3D point clouds and digital surface models using Metashape version 1.6.5 (Agisoft, LLC). We interfaced with Metashape via its Python API using the UC Davis Metashape workflow software version 0.1.0, which executes a full photogrammetry workflow from end to end using the processing parameters specified in a configuration file by the user. The workflow reads GCP location data from delimited text files prepared in advance.
Our first objective was to determine the combination of photogrammetry processing parameters that maximized the quality of the photogrammetric products for the purpose of tree mapping (Stage 1). We evaluated all factorial combinations of the photo alignment quality parameter (low, medium or high, corresponding to image upscaling factors of 4, 2 or 1 respectively), the dense cloud quality parameter (medium or high, corresponding to image upscaling factors of 4 or 2 respectively) and the depth filtering intensity parameter (mild or moderate). Upscaling factors refer to the amount by which the image resolution was upscaled (coarsened) in each dimension; for example, with an upscaling factor of 4, the resulting image resolution in x and y dimensions would be ¼ of the original, with each coarse pixel representing the average of 16 original pixels (a 4 × 4 square; Agisoft, 2020). Recent work has suggested the 'medium' and 'high' dense cloud quality parameters yield superior ITD results while avoiding extreme computational expense associated with the 'very high' parameter; it has similarly shown the 'mild' and 'moderate' depth filtering parameters to be among those yielding best ITD performance. The three-way factorial combination of photo alignment quality, dense cloud quality and depth filtering parameters yielded 12 different processing configurations (Table 3), which we ran on two different aerial photo sets: the 120 m nadir (0° camera pitch) mission with 90% front and side photo overlap and the 90 m nadir mission with 90% front and side photo overlap.
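The arithmetic behind an upscaling factor can be sketched as a block average. This is a minimal illustration of the description above (each coarse pixel is the mean of a factor × factor square) and is not a reproduction of Metashape's internal resampling; the function name `coarsen` and the toy 8 × 8 image are hypothetical.

```python
def coarsen(image, factor):
    """Block-average a 2D list of pixel values by `factor` in each dimension."""
    rows, cols = len(image), len(image[0])
    if rows % factor or cols % factor:
        raise ValueError("image dimensions must be divisible by factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            # Collect the factor x factor square whose top-left corner is (r, c)
            block = [image[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Toy 8 x 8 image; an upscaling factor of 4 leaves 2 x 2 pixels,
# each the mean of 16 original pixels.
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
coarse = coarsen(image, 4)
print(len(coarse), len(coarse[0]))
```

With a factor of 4 the pixel count falls by a factor of 16, which is one reason coarser settings reduce processing time.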
After identifying four Metashape parameter sets that yielded the best tree detection results for these two photo sets (see Individual tree detection and Identification of best-performing methods, below), we used each of these four parameter sets to process all of the photo sets (Table 2). We normalized and resampled the products of the photogrammetry workflow to obtain, for each run of the workflow, a CHM with 0.12 m resolution (Figure 2d) and a point cloud with 70 to 100 points/m² (Appendix S1).

| Individual tree detection (ITD)
During Stage 1 of methods evaluation (focused on identifying the best Metashape photogrammetry parameters and tree detection algorithms), we tested a wide range of tree detection algorithms. We first tested the 'variable window filter' (VWF) algorithm of Popescu and Wynne (2004) as implemented in the R package ForestTools version 0.2.1 (Plowright, 2021). This function uses the CHM raster and evaluates each pixel as a potential treetop by searching all pixels within a particular radius around the focal pixel and labelling the focal pixel as a treetop if it has the maximum height value within the search radius.
The search radius is determined by a linear function of the height of the focal pixel. We tested 76 different combinations of the intercept and slope parameters of this linear function (Appendix S2: Table S1).
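The logic of a variable window filter can be sketched in a few lines. This is an illustrative pure-Python version under stated assumptions and is not the ForestTools implementation: the function name `detect_treetops`, the 5 m minimum height, the treatment of ties and the example intercept and slope are all hypothetical (the parameter sets actually tested are listed in Appendix S2: Table S1).

```python
import math


def detect_treetops(chm, resolution, intercept, slope, min_height=5.0):
    """Return (row, col, height) for pixels that are the strict maximum
    within a search radius (metres) of intercept + slope * height."""
    rows, cols = len(chm), len(chm[0])
    tops = []
    for r in range(rows):
        for c in range(cols):
            h = chm[r][c]
            if h < min_height:  # assumed cut-off, not taken from the paper
                continue
            radius_px = (intercept + slope * h) / resolution
            reach = int(math.floor(radius_px))
            is_top = True
            for dr in range(-reach, reach + 1):
                for dc in range(-reach, reach + 1):
                    rr, cc = r + dr, c + dc
                    if (dr == 0 and dc == 0) or not (0 <= rr < rows and 0 <= cc < cols):
                        continue
                    inside = dr * dr + dc * dc <= radius_px ** 2
                    # Simplification: a tie with a neighbour disqualifies the pixel
                    if inside and chm[rr][cc] >= h:
                        is_top = False
                        break
                if not is_top:
                    break
            if is_top:
                tops.append((r, c, h))
    return tops


# Toy 9 x 9 CHM at 1 m resolution: two crowns, one with a lower shoulder
chm = [[0.0] * 9 for _ in range(9)]
chm[2][2] = 20.0
chm[2][3] = 18.0  # shoulder of the first crown, not a separate treetop
chm[6][6] = 12.0
tops = detect_treetops(chm, resolution=1.0, intercept=0.0, slope=0.1)
print(tops)
```

Because the radius grows with height, the 18 m shoulder is suppressed by its 20 m neighbour, while the isolated 12 m crown is retained with a smaller window.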
For each of these parameter sets, we also tested three different CHM smoothing options. These smoothing functions were implemented as moving window algorithms which, for each pixel, computed the mean of all pixels in an n × n pixel square centred around the focal pixel and assigned the resulting value to the focal pixel. We tested a 5 × 5 pixel window (0.6 × 0.6 m; Smooth: 1), a 9 × 9 pixel window (1.08 × 1.08 m; Smooth: 2) and no smooth (Smooth: 0). The smooths were applied prior to running the VWF algorithm. We included these smoothing options with the thought that they may smooth over 3D reconstruction artefacts of the photogrammetry algorithm. Factorially combining the 3 smooth options with each of the 76 variable window filter parameter sets resulted in testing 228 implementations of the VWF-based tree detection algorithm (Appendix S2: Table S1).
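A moving window mean smooth of this kind can be sketched as follows. The function name `mean_smooth` is hypothetical, and the edge handling (averaging only the pixels that fall inside the raster) is an assumption, since the treatment of raster edges is not stated here.

```python
def mean_smooth(chm, window):
    """Moving-window mean over a window x window pixel square (odd window)."""
    if window % 2 == 0:
        raise ValueError("window must be odd so it can be centred on a pixel")
    half = window // 2
    rows, cols = len(chm), len(chm[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Assumed edge rule: truncate the window at the raster boundary
            vals = [chm[rr][cc]
                    for rr in range(max(0, r - half), min(rows, r + half + 1))
                    for cc in range(max(0, c - half), min(cols, c + half + 1))]
            out[r][c] = sum(vals) / len(vals)
    return out


# A single 25 m spike (e.g. a reconstruction artefact) in a flat 5 x 5 raster
chm = [[0.0] * 5 for _ in range(5)]
chm[2][2] = 25.0
smoothed = mean_smooth(chm, 5)
print(smoothed[2][2])

# Window widths in metres at the 0.12 m CHM resolution
for w in (5, 9):
    print(w, round(w * 0.12, 2))
```

The spike is spread across the window, which is the intended effect on isolated artefacts; the cost is that genuinely narrow treetops are lowered as well.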
We additionally tested six algorithms designed to identify trees directly from 3D point clouds, implemented in the R packages lidR […]

TABLE 2
The photo set (flight and image overlap) parameters tested to evaluate the effect of flight altitude, camera pitch and image overlap on quality of the resulting photogrammetry products for tree mapping (Stage 2). The multiple image overlap values were obtained by thinning the originally collected image datasets (

TABLE 3
Metashape photogrammetry processing parameter combinations tested (Stage 1). The image upscaling factor (in parentheses) refers to the factor by which image resolution was upscaled (coarsened), in each dimension, prior to processing (photo alignment or dense cloud creation steps). The four parameter sets that yielded the best tree detection results and that were subsequently used in the evaluations of flight altitude, camera pitch and overlap (Stage 2) are bolded. Parameter set numbering starts at 7 because we preliminarily tested sets with 'low' dense cloud quality (upscaling factor of 8; set IDs 1-6) but excluded them from full testing due to poor 3D reconstruction; we retain the original numbering so that it matches the configuration files and analysis code in the repository accompanying this paper.
[…] Medium (2) | Medium (4) | Moderate

| ITD performance evaluation
We quantified the accuracy of the drone-derived tree maps by comparing each one against the 3.23-ha ground-based stem map. An initial coarse filter was applied to eliminate the very poor-quality drone-derived maps: if the number of drone-mapped trees >10 m height was more than five times the number of ground-mapped trees >10 m height, or if it was less than 1/10 the number of ground-mapped trees >10 m height, it was eliminated from the pool of candidates.
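The coarse filter amounts to a simple range check on tree counts, sketched below. The function name `passes_coarse_filter` and the example counts are hypothetical.

```python
def passes_coarse_filter(n_drone_tall: int, n_ground_tall: int) -> bool:
    """Keep a drone-derived map unless its count of trees >10 m is more than
    five times, or less than one tenth of, the ground-mapped count."""
    return n_ground_tall / 10 <= n_drone_tall <= 5 * n_ground_tall


# Hypothetical ground count of 1,000 trees >10 m
for n_drone in (99, 100, 800, 5000, 5001):
    print(n_drone, passes_coarse_filter(n_drone, 1000))
```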
For all remaining drone-derived maps, we programmatically performed a comparison to the ground-derived map on a tree-by-tree basis, determining whether each ground-mapped tree was present in the drone-based map (true positives) and whether there were any additional trees in the drone-derived map that were not present in the ground-derived map (false positives). This required determining which tree (if any) from the ground-based map corresponded to which tree in the drone-based map (and vice versa), a challenging and subjective exercise given that we never expect trees in a ground-based map to perfectly coincide with those in a drone-based map. Differences can arise due to spatial errors in both mapping techniques and also due to the fact that the treetop (the point identified in the drone-based map) is often not located precisely above the stem (the point identified in the ground-based map).
For a drone-mapped tree to match with a ground-mapped tree, it was required to be within a distance (d_max) of the ground-mapped tree defined as a function of the height (h) of the ground-mapped tree as d_max = 0.1h + 1, where units are in metres. Its height was also required to be within ±50% of the height of the ground-mapped tree. Thus, for a ground-mapped tree 10 m tall, a drone-mapped tree needed to be within 2 m distance and its height needed to be between 5 and 15 m to […] paired ground-mapped trees) and (b) error substantially less than the matching threshold in drone- versus ground-mapped height would provide evidence that trees were correctly matched.
For each ground tree, the nearest matching drone tree was assigned as its match. If the same drone tree was assigned to multiple ground trees, it was removed from all of the ground trees except the one spatially closest to it. This procedure was repeated two more times, each time for the ground and drone trees remaining (unmatched) following the previous iteration. After the third iteration, no further matches were possible.
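The matching procedure can be sketched as an iterative nearest-neighbour assignment. This is one reading of the steps above and is not the authors' code. The names `max_match_distance` and `match_trees`, the toy coordinates and the use of 'less than or equal to' at the thresholds are hypothetical, and the default distance rule (0.1h + 1) should be treated as an assumption chosen to reproduce the worked example of a 2 m limit for a 10 m tree.

```python
import math


def max_match_distance(height_m: float) -> float:
    # Assumed linear threshold; gives 2 m for a 10 m ground-mapped tree
    return 0.1 * height_m + 1.0


def match_trees(ground, drone, dist_fn=max_match_distance,
                height_tol=0.5, rounds=3):
    """ground, drone: lists of (x, y, height). Returns {ground_idx: drone_idx}."""
    matches = {}
    free_ground = set(range(len(ground)))
    free_drone = set(range(len(drone)))
    for _ in range(rounds):
        proposals = {}  # drone_idx -> (distance, ground_idx) of closest claimant
        for g in sorted(free_ground):
            gx, gy, gh = ground[g]
            best = None
            for d in sorted(free_drone):
                dx, dy, dh = drone[d]
                dist = math.hypot(gx - dx, gy - dy)
                eligible = dist <= dist_fn(gh) and abs(dh - gh) <= height_tol * gh
                if eligible and (best is None or dist < best[0]):
                    best = (dist, d)
            if best is None:
                continue
            dist, d = best
            # A drone tree claimed by several ground trees stays with the closest
            if d not in proposals or dist < proposals[d][0]:
                proposals[d] = (dist, g)
        if not proposals:
            break
        for d, (_, g) in proposals.items():
            matches[g] = d
            free_ground.discard(g)
            free_drone.discard(d)
    return matches


# Toy example: ground trees 0 and 1 both prefer drone tree 0 in the first pass
ground = [(0.0, 0.0, 10.0), (1.5, 0.0, 10.0), (20.0, 20.0, 30.0)]
drone = [(0.5, 0.0, 11.0), (3.0, 0.0, 9.0), (20.0, 21.0, 14.0)]
matches = match_trees(ground, drone)
print(matches)
```

In this toy case the second ground tree loses its first choice and is paired with the next drone tree in the second pass, while the 30 m ground tree stays unmatched because the nearby 14 m drone tree fails the ±50% height rule.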
To quantify ITD accuracy, we computed the true positive rate ('sensitivity' or 'recall', the proportion of ground-mapped trees that had a matching drone-mapped tree) and the precision (the proportion of drone-mapped trees that matched a ground-mapped tree).
Because it is possible for a tree detection algorithm to achieve high sensitivity at the expense of precision (and vice versa), we also computed the F score, which integrates sensitivity and precision by computing their harmonic mean, thus disproportionately penalizing low values and favouring balanced sensitivity and precision. We computed sensitivity, precision and F score for two different tree size groups: trees ≥10 m height and trees ≥20 m height.
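The three accuracy metrics follow directly from the match counts. The sketch below uses hypothetical function names and hypothetical counts; the final lines apply the harmonic mean to the sensitivity (0.69) and precision (0.90) reported for canopy-dominant trees >10 m tall.

```python
def f_score(sensitivity: float, precision: float) -> float:
    """Harmonic mean of sensitivity and precision (0 if both are 0)."""
    if sensitivity + precision == 0:
        return 0.0
    return 2 * sensitivity * precision / (sensitivity + precision)


def itd_scores(n_matched: int, n_ground: int, n_drone: int):
    """Sensitivity, precision and F score from tree counts."""
    sensitivity = n_matched / n_ground  # share of ground trees detected
    precision = n_matched / n_drone     # share of drone trees that are real
    return sensitivity, precision, f_score(sensitivity, precision)


# Hypothetical counts: 90 matches among 120 ground and 100 drone trees
print(itd_scores(90, 120, 100))

# Reported sensitivity and precision for dominant trees >10 m tall
print(round(f_score(0.69, 0.90), 2))

# The harmonic mean penalizes imbalance: arithmetic mean here would be 0.6
print(round(f_score(0.2, 1.0), 2))
```

The second call returns 0.78, in agreement with the F score reported for that scenario.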
There is potential for edge effects to confound the tree detection accuracy inferred via tree matching. If a ground-mapped tree were just inside the analysis boundary and the corresponding drone-mapped tree just outside it, the ground-mapped tree would be considered to not have a match (and thus constitute a false negative detection). To minimize this effect, when calculating the proportion of drone-mapped trees that matched ground-mapped trees, we considered only drone-mapped trees that were at least 5 m inside the project boundary (so that they all had an opportunity to be matched with ground-mapped trees in any direction). We did the same with ground-mapped trees when calculating the proportion of ground-mapped trees that matched drone-mapped trees.
While it is valuable to know the proportion of all ground reference trees that can be detected from aerial imagery, it is unrealistic to expect all trees to be detected, particularly in structurally complex stands like ours where small trees may be hidden under large trees or two immediately adjacent trees of similar size appear as one. Therefore, in addition to evaluating ITD performance across all trees, we evaluated performance in mapping 'dominant' trees that did not have any immediately adjacent taller neighbours (Appendix S1).

| Identification of best-performing methods
To identify the best-performing photogrammetry and tree detection parameter sets (Stage 1), we first identified the photogrammetry parameter sets that most consistently yielded the highest F score (or within 0.005 of the highest F score) across all factorial combinations of photogrammetry parameter set (7-18; […] Table S4). We selected eight tree detection parameter sets.
To quantify the influence of flight altitude, image overlap and camera pitch on ITD performance (Stage 2), we used the four best-performing photogrammetry parameter sets, combined factorially with the eight best-performing tree detection methods, to produce 32 tree maps from each of the 22 different photo sets (Table 2). When plotting or describing tree detection performance achieved with any of these photo sets, we report the F score obtained from the tree detection method that produced the maximum F score for that photo set.
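The total number of tree maps can be checked by enumerating the two factorial designs. The decomposition below is one reading of the methods (Stage 1: 2 photo sets × 12 photogrammetry parameter sets × 286 tree detection methods; Stage 2: 22 photo sets × 4 parameter sets × 8 detection methods), and the variable names are hypothetical.

```python
from itertools import product

# Stage 1: two nadir photo sets, all photogrammetry and detection options
stage1 = sum(1 for _ in product(range(2), range(12), range(286)))

# Stage 2: all photo sets, only the best-performing options from Stage 1
stage2 = sum(1 for _ in product(range(22), range(4), range(8)))

maps_per_photo_set = 4 * 8
total = stage1 + stage2
print(stage1, stage2, maps_per_photo_set, total)
```

Under this reading the two stages sum to the 7,568 tree maps reported, with 32 maps per photo set in Stage 2.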

| Evaluation of drone-based tree height measurement
To evaluate the potential to measure tree heights using drone imagery, we extracted tree heights from the canopy height model produced by using photogrammetry parameter set 16 (

| Stage 1: Optimal photogrammetry and tree detection parameters
Photogrammetry parameter combination 16 (medium alignment quality, high dense cloud quality, moderate depth filtering) consistently enabled the most accurate tree detection performance as quantified by the F score (Figure 4 and Appendix S2: Table S3) […] Table S3). We thus selected a total of four photogrammetry parameter sets for further evaluation.
The most accurate tree detection methods, as quantified by F score, were all CHM-based VWF methods (Appendix S2: Table S4).

TABLE 4
Tree mapping accuracy achieved with the optimal combination of photogrammetry parameter set and tree detection method (Category 1) or with the specific combination of photogrammetry parameter set 16 and tree detection method vwf_196 (Category 2) for each factorial combination of tree position (dominant or all) and tree height class (>20 m or >10 m). All scenarios use the high (120 m altitude) nadir photo set with 90% front and side overlap. For two scenarios (dominant trees >10 m and all trees >20 m), the combination of photogrammetry parameter set 16 and tree detection method vwf_196 is the optimal combination. For accuracy metrics for different flight altitudes, camera pitches and photo overlaps, see Appendix S2: Table S5.
For a given scenario (e.g. flight altitude, tree position, tree height, camera pitch and photo overlap), the F score achieved by the combination of photogrammetry parameter set 16 and tree detection method vwf_196 was generally within 0.01 of the maximum F score achieved by the optimal combination of photogrammetry parameter set (out of the four best-performing options) and tree detection method (out of the eight best-performing options; e.g. Table 4).
The difference in the F score was less than 0.01 in approximately 80% of scenarios, particularly those with at least 90% front and side photo overlap and nadir images (Appendix S2: Table S6). In photo sets with less overlap and/or oblique images, other photogrammetry and tree detection parameters often performed better (F score difference >0.01), but for these scenarios, even the optimal photogrammetry and tree detection parameter combinations yielded inferior tree mapping performance relative to higher overlap and/or nadir imagery (see Stage 2: Optimal image collection parameters).

FIGURE 4
Individual tree detection performance of different photogrammetry parameter combinations at two flight altitudes, for all trees (a, b) and dominant trees (c, d) with height >10 m (a, c) or height >20 m (b, d). The number in each cell indicates the ID of the tree detection method (Appendix S2: Tables S1-S2) that yielded the maximum F score for the particular combination of parameters; three-digit numbers refer to VWF methods, while four-digit numbers refer to point cloud-based methods. The F score itself is indicated by the colour. The aerial photo sets processed were the high nadir set (Alt: 120 m) and the low nadir set (Alt: 90 m; both with 90% front and side image overlap; bolded entries in Table 2).
For nadir photo sets with at least 90% front and side overlap, the optimal combination of photogrammetry and tree detection parameters achieved tree mapping accuracy ranging between F = 0.67 and F = 0.87 (Appendix S2: Table S6). For example, for dominant trees >10 m tall, the optimal methods (photogrammetry parameter set 16 paired with tree detection method vwf_196, applied to the 90% front and side overlap 120 m nadir mission) achieved an F score of 0.78, with a sensitivity of 0.69 and a precision of 0.90 (Table 4). Generally, precision was greater than sensitivity (Table 4 and Appendix S2: Table S6). Among nadir image sets, higher altitude (120 m) sets tended to yield greater accuracy than lower altitude sets when image overlaps were lower (below 90/90) and similar accuracy when overlaps were greater (90/90 and greater; Figure 5).
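As a worked check of how the reported metrics relate to one another, the sketch below computes the F score as the harmonic mean of sensitivity and precision. It assumes the conventional definition of the F score; the function name is ours and is purely illustrative.

```python
def f_score(sensitivity, precision):
    """Harmonic mean of sensitivity (recall) and precision."""
    if sensitivity + precision == 0:
        return 0.0  # no correct detections at all
    return 2 * sensitivity * precision / (sensitivity + precision)


# Values reported for dominant trees >10 m tall (Table 4).
f = f_score(sensitivity=0.69, precision=0.90)
print(round(f, 2))  # 0.78
```

Because the harmonic mean is pulled towards the smaller of its two inputs, the F score here sits closer to the sensitivity (0.69) than to the precision (0.90).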

| Stage 2: Optimal image collection parameters
Interestingly, even though the 90/80 and 80/90 per cent overlap image sets contained roughly the same image density, the former consistently enabled substantially greater tree mapping accuracy, for both 120 m and 90 m flights (Figure 5).
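The equal nominal image density of the 90/80 and 80/90 sets follows from simple flight geometry. The sketch below is illustrative only: it assumes a regular grid mission over flat terrain with a fixed image footprint, so that the spacing between photos is proportional to one minus the overlap in each direction. The function name is hypothetical.

```python
def relative_image_density(front_overlap, side_overlap):
    """Images per unit ground area, relative to a survey with zero overlap.

    Overlaps are fractions in [0, 1). Photo spacing along the flight line
    scales with (1 - front_overlap); spacing between flight lines scales
    with (1 - side_overlap).
    """
    if not (0 <= front_overlap < 1 and 0 <= side_overlap < 1):
        raise ValueError("overlaps must be fractions in [0, 1)")
    return 1.0 / ((1.0 - front_overlap) * (1.0 - side_overlap))


d_90_80 = relative_image_density(0.90, 0.80)  # about 50x a zero-overlap survey
d_80_90 = relative_image_density(0.80, 0.90)  # same nominal density
d_90_90 = relative_image_density(0.90, 0.90)  # about 100x
```

Under these assumptions the two sets differ only in how the overlap is distributed between the along-track and across-track directions, not in the number of images per unit area.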
Nadir imagery tended to achieve accuracy greater than or comparable to that of oblique or nadir-oblique composite imagery of similar image density (Figure 5). The 120 m composite nadir-oblique image set performed similarly to (at higher overlap) or substantially better than (at lower overlap) the 90 m composite pitch image set.

FIGURE 5
Individual tree detection F score for different flight altitude, camera pitch and image overlap combinations. For each combination, eight top-performing tree detection methods were combined factorially with four top-performing photogrammetry processing parameter sets (see Figure 1), and the F score depicted is the maximum across these 32 combinations. From left to right, the x-axis represents categories of image overlap with a monotonic, but not consistent, increase in nominal image density. All overlap values within vertical blue-shaded regions correspond to the same nominal image density. Points are connected by lines simply for ease of associating altitude and pitch with points. Overlap pairings in brackets (e.g. [90/90 + 90/90]) refer to the combination of N-S and E-W missions with oblique camera pitch. Oblique pairings are combined with nadir missions to create composite pitch sets (e.g. [90/90 + 90/90] + 95/90). For a version of this figure that uses only the single consistently best-performing tree detection method (vwf_196) combined with the best-performing photogrammetry parameter set (16), see Figure S2 (Appendix S1).
All of the results presented thus far in this section assume that a given image set is processed using the photogrammetry and tree detection parameters that yield the greatest accuracy for that set. When using a single top-performing photogrammetry parameter set (16) combined with a single top-performing tree detection method (vwf_196), the patterns remain qualitatively very similar, with generally only very small shifts in tree detection accuracy (Appendix S1: Figure S2).

| Tree height measurement
Tree height measurement was generally highly accurate, with drone-measured and ground-measured tree heights corresponding with R² = 0.95, a mean bias of −0.86 m (with drone-derived heights generally shorter than their ground reference counterparts) and a mean absolute error of 1.82 m (Figure 6). The mean absolute error as a percentage of each tree's height was 9% and the mean bias was −3%.
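For readers wishing to compute comparable height accuracy metrics from their own matched tree pairs, a minimal sketch follows. It assumes that R² is the squared Pearson correlation between drone and ground heights, that bias is defined as drone height minus ground height and that percentages are relative to the ground-measured height of each tree. The function name and the three example trees are hypothetical.

```python
def height_accuracy(drone_heights, ground_heights):
    """Summarize agreement between matched drone and ground tree heights (m)."""
    if len(drone_heights) != len(ground_heights) or len(ground_heights) < 2:
        raise ValueError("need two equal-length sequences with at least two trees")
    n = len(ground_heights)
    pairs = list(zip(drone_heights, ground_heights))
    errors = [d - g for d, g in pairs]  # negative: drone height is shorter

    # Squared Pearson correlation (assumed definition of R^2).
    mean_d = sum(drone_heights) / n
    mean_g = sum(ground_heights) / n
    s_dg = sum((d - mean_d) * (g - mean_g) for d, g in pairs)
    s_dd = sum((d - mean_d) ** 2 for d in drone_heights)
    s_gg = sum((g - mean_g) ** 2 for g in ground_heights)

    return {
        "r_squared": s_dg ** 2 / (s_dd * s_gg),
        "mean_bias_m": sum(errors) / n,
        "mean_abs_error_m": sum(abs(e) for e in errors) / n,
        "mean_bias_pct": 100 * sum(e / g for e, g in zip(errors, ground_heights)) / n,
        "mean_abs_error_pct": 100 * sum(abs(e) / g for e, g in zip(errors, ground_heights)) / n,
    }


# Three made-up matched trees, for illustration only.
metrics = height_accuracy(drone_heights=[9.0, 19.0, 28.0],
                          ground_heights=[10.0, 20.0, 30.0])
```

The function expects trees that have already been matched between the drone-derived and ground reference maps; it does not perform the matching itself.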

| Imagery acquisition and processing
Our work helps to identify top-performing approaches to imagery collection and processing for SfM-based forest mapping in structurally complex conifer forests using relatively low-cost RGB drones.
Several clear and consistent results can help forest scientists and managers efficiently produce high-quality forest maps. First, a high flight altitude (120 m, the maximum flight altitude generally allowed by the FAA) consistently yielded tree maps with accuracy better than or effectively equivalent to those obtained from lower altitude (90 m) flights (Figure 5), consistent with previous observations that flight altitude has minimal impact (Torres-Sánchez et al., 2018). Even in contexts where stem map quality is insensitive to flight altitude (in our case, when image overlap is 90% or greater), 120 m flights will likely be preferred given that they require fewer images to cover a landscape (as each image encompasses more ground area) and therefore less flight time.
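The saving in image count at higher altitude can be approximated from footprint geometry. The sketch below is a rough illustration, assuming nadir imagery over flat terrain, the same camera and the same front and side overlap at both altitudes, so that the ground footprint of each image scales with the square of the flight altitude. The function name is hypothetical.

```python
def relative_image_count(altitude_m, reference_altitude_m=120.0):
    """Images needed to cover a fixed area, relative to the reference altitude.

    Each side of the image footprint grows linearly with altitude, so the
    footprint area grows with altitude squared and the image count shrinks
    by the same factor.
    """
    if altitude_m <= 0 or reference_altitude_m <= 0:
        raise ValueError("altitudes must be positive")
    return (reference_altitude_m / altitude_m) ** 2


# A 90 m mission needs (120 / 90)^2, or roughly 1.8 times, as many images
# as a 120 m mission over the same area under these assumptions.
ratio_90_vs_120 = relative_image_count(90.0)
```

Flight time does not necessarily scale by exactly the same factor, since it also depends on flight speed, turns between flight lines and the camera's minimum photo interval.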
Similarly, our work reveals little if any gain in ITD accuracy by increasing image overlap above 90% (front and side; Figure 5). Alternatively or in addition, greater overlap along the shorter dimension of the image may disproportionately facilitate image matching and/or depth mapping, as it makes the overlap portion more square (as opposed to a thin strip that may lack sufficient spatial context).
Our tests of camera pitch revealed that oblique (25°) and oblique-nadir composite imagery, regardless of flight altitude, yielded ITD accuracy worse than nadir imagery collected at 120 m. This finding is surprising because oblique imagery is known to yield more accurate terrain models (Nesbit & Hugenholtz, 2019) and increase understorey point cloud density (Díaz et al., 2020), especially in stands without a closed canopy (Lamping et al., 2021). However, our findings corroborate existing evidence that for ITD specifically, greater accuracy is achieved with nadir imagery. Although the improved understorey imaging that is achieved by using oblique imagery can improve the estimates of tree DBH (by enabling more accurate 3D modelling of tree stems; Swayze et al., 2021), it apparently does not improve the potential for detection of understorey trees.
This limitation to improvement may be explained by the fact that all CHM-based tree detection algorithms and many point cloud-based tree detection algorithms (e.g. Li et al., 2012) are not designed to detect one tree beneath another, so improved imaging of the understorey cannot translate to improved tree detection. Improvements to multilayer tree detection algorithms (e.g. Torresan et al., 2020; Xiao

FIGURE 6
Drone-based tree height measurements (height value of the CHM at each treetop location) relative to ground reference tree heights for the most consistently best-performing tree detection method (vwf_196) applied to the CHM produced using the most consistently high-performing photogrammetry parameter combination (16).
We expect our results are applicable to many widely used, relatively low-cost drones with an RGB camera that has a resolution and field of view similar to ours. In fact, given that all image processing steps in the optimal parameterization utilize images that have been upscaled (coarsened) twofold in both dimensions (thus converting a 20 megapixel image to 5 megapixels), the same dataset could in theory be generated with a 5 megapixel camera by eliminating the upscaling step, assuming optical quality is otherwise similar. Similarly, imagery from a higher resolution camera could be used optimally by increasing the upscaling factor. While this may represent a waste of data, the coarser scale may actually achieve greater mapping accuracy given that tree canopies largely consist of small surfaces (e.g. leaves, branches) that can move in the wind and thus confound the image matching algorithms central to the photogrammetry software.
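The resolution arithmetic above can be written out explicitly. This is a simple sketch with hypothetical function names; the 45 megapixel camera is an arbitrary example and not one we tested.

```python
import math


def effective_megapixels(megapixels, upscale_factor):
    """Pixel count after coarsening an image by upscale_factor in both dimensions."""
    return megapixels / upscale_factor ** 2


def matching_upscale_factor(megapixels, target_megapixels=5.0):
    """Upscaling factor that brings a camera to the target effective pixel count."""
    return math.sqrt(megapixels / target_megapixels)


# Twofold upscaling of a 20 megapixel image leaves 5 megapixels.
mp_after = effective_megapixels(20.0, 2)

# An arbitrary 45 megapixel camera would need threefold upscaling to match.
factor_45mp = matching_upscale_factor(45.0)
```

This comparison concerns pixel count only and, as noted above, presumes that optical quality and field of view are otherwise similar.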

| Tree detection algorithms
Despite testing six point cloud-based ITD algorithms (and 58 different parameterizations of them), the CHM-based VWF algorithm consistently performed the best (Appendix S2: Tables S4 and S5), potentially a consequence of the fact that the point cloud-based methods we tested are not designed to detect one tree beneath another and therefore provide little additional fidelity relative to a CHM (see Discussion section Imagery acquisition and processing).
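To make the CHM-based approach concrete, the sketch below applies a moving window mean smooth to a small synthetic CHM and then detects treetops with a variable window filter. It is a simplified pure-Python illustration, not the implementation or parameterization used in this study (vwf_196): the linear window function (radius = a + b × height), its coefficients, the 3 × 3 smoothing window, the cone-shaped synthetic crowns and all function names are assumptions made for the example.

```python
import math


def mean_smooth(chm, radius=1):
    """Moving window mean over a (2*radius+1)^2 window, truncated at the edges."""
    rows, cols = len(chm), len(chm[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = [chm[i][j]
                    for i in range(max(0, r - radius), min(rows, r + radius + 1))
                    for j in range(max(0, c - radius), min(cols, c + radius + 1))]
            out[r][c] = sum(vals) / len(vals)
    return out


def detect_treetops(chm, cell_size=1.0, a=1.0, b=0.1, min_height=5.0):
    """Variable window filter on a CHM given as a list of rows of heights (m).

    A cell is a treetop if it is at least min_height tall and strictly taller
    than every other cell within a search radius of a + b * height (assumed
    linear window function). Exact ties yield no treetop in this sketch.
    """
    rows, cols = len(chm), len(chm[0])
    tops = []
    for r in range(rows):
        for c in range(cols):
            h = chm[r][c]
            if h < min_height:
                continue
            rad = max(1, int(round((a + b * h) / cell_size)))  # radius in cells
            is_top = all(
                chm[i][j] < h
                for i in range(max(0, r - rad), min(rows, r + rad + 1))
                for j in range(max(0, c - rad), min(cols, c + rad + 1))
                if (i, j) != (r, c) and (i - r) ** 2 + (j - c) ** 2 <= rad ** 2
            )
            if is_top:
                tops.append((r, c, h))
    return tops


def synthetic_chm(rows, cols, crowns):
    """Cone-shaped crowns given as (row, col, top_height_m, slope_m_per_cell)."""
    return [[max([0.0] + [h - s * math.hypot(r - cr, c - cc)
                          for cr, cc, h, s in crowns])
             for c in range(cols)] for r in range(rows)]


# Two made-up neighbouring trees: a 20 m tree and a 12 m tree six cells apart.
chm = synthetic_chm(7, 13, [(3, 3, 20.0, 4.0), (3, 9, 12.0, 3.0)])
treetops = detect_treetops(mean_smooth(chm))
```

Because the search radius grows with height, tall trees suppress nearby lower maxima over a wider area than short trees do. Any tree that lies beneath a taller crown never forms a local maximum in the CHM, which is why a method of this kind cannot detect one tree beneath another.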
As with other SfM-based work (e.g. Creasy et al., 2021) and lidar-based work (Ferraz et al., 2012; Jeronimo et al., 2018), we observed substantially improved ITD performance for taller trees and canopy-dominant trees versus all trees (e.g. can be limited, especially when the overstorey is dense and/or tall (Campbell et al., 2018). This limitation has led some to refocus detection and mapping of individual trees (ITD) towards detection and mapping of tree-approximate objects (TAOs), which can include single trees and clusters of trees that are not differentiable (Jeronimo et al., 2018; North et al., 2017). Maps of the size and arrangement of TAOs may be valuable for some management applications (Jeronimo et al., 2018; North et al., 2017), and important ecological questions can be addressed using maps of the specific trees visible from above (Brandt et al., 2020; Weinstein et al., 2021) or detectable using SfM that is not canopy penetrating. Our calculation of ITD accuracy metrics specifically for canopy-dominant trees helps to provide a sense of TAO mapping accuracy. Given that we used a conservative set of parameters for classifying a tree as 'canopy dominant' (Appendix S1: Figure S1), our accuracy metrics may be underestimates.

TABLE 5
Summary of recent forest ITD studies that compare drone-derived tree detections to ground reference trees in conifer-dominated forests and assess ITD accuracy using the F score. Note that Koontz et al. (2021) is included here despite not reporting an F score because of the strong similarities in forest structure/type to this study.
Notably, our ITD precision values were consistently higher than the sensitivity values, especially for all trees (as opposed to canopy-dominant trees; Table 4 and Appendix S2:

| Tree height measurement and matching of ground and drone trees
The canopy height model resulting from the optimal photogrammetry parameter set provided a relatively accurate representation of tree heights (Figure 6). The small negative height bias (CHM heights < field-measured heights) generally increased with increasing tree height, suggesting either (a) disproportionate overestimation of tall tree heights during ground surveys or (b) disproportionate underestimation of tall tree heights by the photogrammetry algorithm. Given that CHM generation involves some degree of interpolation and smoothing of the point cloud, it may make sense that objects that are disproportionately tall relative to their surroundings are underestimated by the CHM. Nonetheless, the mean absolute height error was relatively small (1.8 m or 9% of tree height).
Furthermore, given that our algorithm for matching SfM-detected trees with ground-measured trees required the SfM tree to be within ±50% of the height of the ground tree, the fact that the mean height difference was only 9% strongly suggests that trees were generally matched correctly. Our SfM-based tree height measurement accuracy was generally comparable to or better than other SfM-based approaches, which have obtained R² = 0.71 (Belmonte et al., 2020),

| Conclusions
Our comprehensive evaluation of numerous SfM imagery collection, imagery processing and tree detection methods led to ITD performance that meets or exceeds expectations based on previous work (Table 5). The majority of SfM-based ITD work to date has been conducted in relatively low-density monodominant stands with low structural complexity, and our work demonstrates that SfM-based ITD can also be a practical approach to tree mapping in denser, more structurally complex stands, especially if the focus is on canopy-dominant trees or TAOs. To evaluate the extent to which the ITD accuracy and optimal parameter sets we identified may extend to other forest stands, perhaps the most important considerations are stand density and structural complexity (Jeronimo et al., 2018). In forests with lower tree density and limited multi-stratum structure, such as many ponderosa pine-dominated forests of the southwestern United States (e.g. Swayze et al., 2021), we might expect higher accuracy than we achieved; we might expect the reverse for denser or more structurally complex stands. Historical densities of trees with DBH > 10 cm in the yellow pine and mixed-conifer forests of California's Sierra Nevada averaged roughly 195 trees/ha (Safford & Stevens, 2017; Young et al., 2020), relative to the 591 trees/ha in our mixed-conifer stand. Considering contemporary stands are roughly two to fourfold denser than the historical average (Safford & Stevens, 2017; therefore, roughly 400-800 trees/ha), our focal stand may be roughly reflective of mean contemporary California mixed-conifer forest structure and thus of expected ITD performance. In denser stands with strong multi-stratum structure, the use of oblique images, coupled with a point cloud-based ITD algorithm, will likely become more important for capturing understorey trees (see Discussion section Imagery acquisition). With additional refinements (e.g. use of a more sensitive tree detection algorithm with a false positive filtering step, improvement in point cloud-based multilayer tree detection algorithms and application of deep learning computer vision to tree detection; Weinstein et al., 2020, 2021), the accuracy and applicability of drone-based forest mapping will continue to improve.

ACKNOWLEDGEMENTS
The authors' Emerald Point study site is part of the ancestral lands of

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1111/2041-210X.13860.

DATA AVAILABILITY STATEMENT
The raw and processed data and code supporting this publication are available via the Open Science Framework (https://doi.