Mitigating systematic error in topographic models derived from UAV and ground‐based image networks

High resolution digital elevation models (DEMs) are increasingly produced from photographs acquired with consumer cameras, both from the ground and from unmanned aerial vehicles (UAVs). However, although such DEMs may achieve centimetric detail, they can also display systematic broad‐scale error that restricts their wider use. Such errors, which in typical UAV data are expressed as a vertical ‘doming’ of the surface, result from a combination of near‐parallel imaging directions and inaccurate correction of radial lens distortion. Using simulations of multi‐image networks with near‐parallel viewing directions, we show that enabling camera self‐calibration as part of the bundle adjustment process inherently leads to erroneous radial distortion estimates and associated DEM error. This effect is relevant whether a traditional photogrammetric or newer structure‐from‐motion (SfM) approach is used, but errors are expected to be more pronounced in SfM‐based DEMs, for which use of control and check point measurements is typically more limited. Systematic DEM error can be significantly reduced by the additional capture and inclusion of oblique images in the image network; we provide practical flight plan solutions for fixed wing or rotor‐based UAVs that, in the absence of control points, can reduce DEM error by up to two orders of magnitude. The magnitude of doming error shows a linear relationship with radial distortion and we show how characterization of this relationship allows an improved distortion estimate and, hence, existing datasets to be optimally reprocessed. Although focussed on UAV surveying, our results are also relevant to ground‐based image capture. © 2014 The Authors. Earth Surface Processes and Landforms published by John Wiley & Sons Ltd.


Introduction
Unmanned aerial vehicles (UAVs), and systems such as tethered blimps and kites, are being increasingly used to provide high resolution, detailed imagery and associated digital elevation models (DEMs) for surface process and geomorphological research (e.g. Gimenez et al., 2009; Marzolff and Poesen, 2009; Smith et al., 2009; Niethammer et al., 2010; d'Oleire-Oltmanns et al., 2012; Harwin and Lucieer, 2012; Rosnell and Honkavaara, 2012; Fonstad et al., 2013; Hugenholtz et al., 2013). DEM generation is also being facilitated by the rapidly widening availability of 'structure-from-motion' (SfM) software which offers significantly easier image processing workflows than traditional aerial photogrammetric techniques. SfM-based approaches have been successfully used with oblique images from both terrestrial and manned airborne platforms for assessing processes such as soil and coastal erosion and lava emplacement (Castillo et al., 2012; James and Robson, 2012; James and Varley, 2012; Tuffen et al., 2013). However, some DEMs derived from vertical UAV-imagery show systematic broad-scale deformations, expressed as a central 'doming' (e.g. Rosnell and Honkavaara, 2012; Javernick et al., 2014), that can make data unsuitable for broader comparative studies or for modelling gradient-sensitive processes such as rainfall runoff. This fundamental drawback needs to be overcome in order to fully exploit future data from UAVs and from similar ground-based image networks.
Here, we show how such systematic DEM deformation is associated with processing image sets with dominantly parallel viewing directions, and is correlated with inaccuracies in modelling radial camera lens distortion. Using fixed camera models with deliberately introduced radial distortion error, the 'doming' effect has been previously illustrated in single stereo image pairs (Fryer and Mitchell, 1987; Chandler, 2008, 2011). UAV data collection strategies, particularly those for fixed-wing platforms (e.g. Eisenbeiss, 2009), are largely based on the tried and tested flight plan styles developed for traditional aerial surveying (Krauss, 1993; Abdullah et al., 2013) which are built up from sequential stereo pairs. Images are acquired in regular linear patterns, with overlapping images taken along a flight line forming an 'image strip', and parallel, overlapping image strips used to survey areas as an 'image block'. Thus, if systematic error can persist from a single image pair to full blocks, multi-image DEMs should be anticipated to display similar deformation.
In traditional aerial surveying, doming artefacts are largely minimized through the use of purpose designed and constructed 'metric' survey cameras with well-defined camera models (and generally negligible radial distortion), along with dense networks of control points. In contrast, UAV and increasing volumes of light aircraft and ground-based data are acquired using consumer-grade cameras, with less accurate models derived from 'self-calibration' procedures (in which camera calibration is carried out simultaneously with the image processing), and projects are supported by weaker control. Consequently, as previously shown for self-calibration in stereo pairs (Wackrow and Chandler, 2008), the opportunity for radial distortion error and ensuing DEM deformation will be significantly enhanced in such surveys. We demonstrate that, for dominantly parallel image sets, systematic error is also an inevitable result of self-calibration (which is increasingly being used with consumer camera data).
Although we focus on airborne surveying, the issue is also relevant to ground-based image networks. For example, using locally near-parallel images to reconstruct a ~50-m-long coastal cliff results in systematic deformation of the recovered surface when compared with benchmark laser scanner data (Figure 1). In this complex case, the deformation is not a straightforward dome or arc, probably due to the curvature of the cliff and the variability in the camera orientations. However, Chandler (2008, 2011) illustrated that a convergent imaging geometry (in which the camera viewing directions are not parallel, but converge on the area of interest) mitigated systematic error in stereo-pairs and, by adopting a convergent strategy within the coastal cliff image strip, the systematic deformation is reduced to negligible levels [see Figure 1 and James and Robson (2012)].
In this letter, we use simulated data within established photogrammetric software to allow the fundamental geometric sensitivities to be assessed. Processed image network simulations show how self-calibration of parallel camera axes image networks leads to systematic DEM error, and we explore the use of convergent imaging geometries to mitigate this effect. Collecting convergent imagery from UAVs is not necessarily as easy as from the ground, so we present adaptations to flight plan strategies to help maximize the accuracy of UAV-derived DEMs. Finally, we illustrate how the underpinning relationship between radial distortion error and surface form can be defined to optimize surface reconstruction when oblique imagery is not available.

Figure 1. [...] shows systematic distortion of the surface if only near-parallel images are used (c), but negligible systematic error when convergent images are also included in the processing (d). Differences are highlighted by calculating average errors for ~0.1° azimuth segments along the cliff (e). Redrawn in part from James and Robson (2012). This figure is available in colour online at wileyonlinelibrary.com/journal/espl

Three-dimensional (3D) Surface Models from Photographs
Prior to detailing our simulations, we review the three-dimensional (3D) reconstruction methods that underpin our analyses. DEM creation from images requires that all areas of the surface to be modelled are photographed from two or more different positions. Features in the photographs are then identified, matched across multiple images, and a mathematical 'camera model', along with information on camera position and orientation, is used to determine 3D point coordinates from the two-dimensional (2D) image coordinates. A variety of techniques are used to make initial estimates of the unknowns (such as camera positions, pointing directions and the 3D point coordinates) in the resulting system of equations. These initial values are then simultaneously optimized in a 'bundle adjustment' (Granshaw, 1980), which produces a self-consistent 3D model with associated camera parameters, by minimizing the overall residual error. The camera model itself can be fixed (invariant) within the bundle adjustment or, alternatively, can be included in the optimization (i.e. a refinement of focal length, radial distortion, principal point offset, tangential distortion, affinity and orthogonality, for example) to form the 'self-calibration' process.
The results of such a 3D reconstruction will be in an arbitrary coordinate system so, to reference to a real-world system, either the camera positions, or the positions of control points are usually measured in the field [e.g. by differential global positioning system (dGPS)]. Both 'traditional' photogrammetry and SfM approaches use bundle adjustment optimization, but typically differ in whether the control data are used prior to, and within, the bundle adjustment process (photogrammetry), or only after bundle adjustment in the form of a separate coordinate transformation (SfM). Where control measurements can be included within the bundle adjustment, they represent observations which are 'external' to the image set, that need to be satisfied in the adjustment process. In comparison, features identified in the images, and their associated matches, represent measurements that are 'internal' to the image set, which also need to be satisfied. Thus, including control measurements in the bundle adjustment (such as in a 'traditional' photogrammetric approach) represents a minimization under independent inner and external constraints which, together, determine the shape, scale and orientation of a 3D model.
In contrast, typical SfM approaches use significantly fewer control points because the 3D model is built from information in the image set alone ('inner constraints' only). Control data are then used to scale and orient the model to the 'real' coordinate system, but do not contribute to reducing any distortion of the model shape. Nevertheless, convergence between the workflows of the photogrammetric and computer vision communities is increasing, and several SfM-based applications now allow control measurements to be included in the bundle adjustment.
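The distinction drawn above, in which control data applied only as a post-adjustment coordinate transformation cannot change the model shape, can be illustrated with a minimal sketch. It assumes NumPy is available and uses a closed-form, Umeyama-style similarity fit, which is a stand-in chosen for illustration and not necessarily the transformation used by any particular SfM package. The control points, the 0.2 m dome and all function names are hypothetical.

```python
import numpy as np


def fit_similarity(src, dst):
    """Closed-form least-squares similarity transform (Umeyama-style).

    Returns scale s, rotation R (3x3) and translation t such that
    dst is approximated by s * R @ src + t.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    sc, dc = src - mu_s, dst - mu_d
    cov = dc.T @ sc / len(src)              # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                      # guard against a reflection
    R = U @ S @ Vt
    s = np.trace(np.diag(D) @ S) / (sc ** 2).sum(axis=1).mean()
    t = mu_d - s * R @ mu_s
    return s, R, t


def apply_similarity(points, s, R, t):
    """Apply the similarity transform to an (n, 3) array of points."""
    return s * (np.asarray(points, dtype=float) @ R.T) + t


def control_z_rms(model_xyz, control_xyz):
    """RMS vertical misfit at control points after the best similarity fit."""
    s, R, t = fit_similarity(model_xyz, control_xyz)
    fitted = apply_similarity(model_xyz, s, R, t)
    dz = fitted[:, 2] - np.asarray(control_xyz, dtype=float)[:, 2]
    return float(np.sqrt(np.mean(dz ** 2)))


# Hypothetical control points on gently sloping, slightly rough ground (metres).
rng = np.random.default_rng(0)
xy = rng.uniform(-50.0, 50.0, size=(30, 2))
truth = np.column_stack([xy, 0.02 * xy[:, 0] + rng.normal(0.0, 0.5, 30)])

# An arbitrary model frame: rotated about z, scaled and shifted.
a = np.deg2rad(30.0)
R0 = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a), np.cos(a), 0.0],
               [0.0, 0.0, 1.0]])


def to_model_frame(xyz):
    """Express real-world points in the arbitrary (hypothetical) model frame."""
    return 0.1 * (xyz @ R0.T) + np.array([5.0, -3.0, 2.0])


# The same surface with a hypothetical 0.2 m radial dome added.
r2 = (xy ** 2).sum(axis=1)
domed = truth.copy()
domed[:, 2] += 0.2 * (1.0 - r2 / r2.max())

rms_undistorted = control_z_rms(to_model_frame(truth), truth)
rms_domed = control_z_rms(to_model_frame(domed), truth)
```

In this sketch the undistorted model registers to the control points essentially exactly, whereas the domed model retains a vertical misfit, because scale, rotation and translation alone cannot flatten a curved surface.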

Methods
We assess systematic DEM error by simulating UAV surveys using the close range photogrammetry software VMS (Vision Measurement System, http://www.geomsoft.com). Hypothetical camera networks were generated in which virtual 3D points were defined to represent the topographic surface, and the initial position and orientation of the cameras were described. Using a specified camera model, the pixel coordinates at which each 3D point would be observed in each image were then calculated, with small pseudo-random offsets added to represent a component of measurement noise. Offsets were generated from a normal distribution with a 0.5-pixel standard deviation, a magnitude representative of the precision of commonly used image feature detectors in SfM software (Remondino, 2006; Barazzetti et al., 2010). A bundle adjustment was then carried out to minimize overall error, and any resulting systematic DEM deformation was determined by comparing the post-bundle-adjustment 3D point coordinate estimates with their pre-bundle-adjustment equivalents. The simulations thus represent synthetic data processed with the same algorithms and workflow as real image networks. In order to focus on the effects of imaging arrangement, simulations were run without control measurements using an inner constraints method (Granshaw, 1980) to define the coordinate datum. In the Discussion section, we consider the implications of including control, but a quantitative assessment, relevant to more than just one specific scenario, exceeds the scope of this letter.
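The generation of synthetic image observations described above can be sketched as follows. This is a minimal illustration assuming NumPy, a nadir-looking pinhole camera with no distortion, and an image x-axis aligned with world x; it does not reproduce VMS, and the function names and the test patch of ground points are hypothetical. The default camera values follow the 20 mm lens and 4000 × 3000 sensor of 5 μm pixels used in the simulations.

```python
import numpy as np


def project_nadir(points_xyz, camera_xyz, focal_mm=20.0, pixel_mm=0.005,
                  width_px=4000, height_px=3000):
    """Pixel coordinates (column, row) of ground points seen by a nadir-looking,
    distortion-free pinhole camera whose image x-axis is aligned with world x."""
    rel = np.asarray(points_xyz, dtype=float) - np.asarray(camera_xyz, dtype=float)
    depth = -rel[:, 2]                      # distance below the camera (m)
    x_mm = focal_mm * rel[:, 0] / depth     # image-plane coordinates (mm)
    y_mm = focal_mm * rel[:, 1] / depth
    cols = width_px / 2.0 + x_mm / pixel_mm
    rows = height_px / 2.0 - y_mm / pixel_mm
    return np.column_stack([cols, rows])


def add_measurement_noise(pixels, sigma_px=0.5, seed=0):
    """Add zero-mean Gaussian offsets (default 0.5-pixel standard deviation)."""
    pixels = np.asarray(pixels, dtype=float)
    rng = np.random.default_rng(seed)
    return pixels + rng.normal(0.0, sigma_px, size=pixels.shape)


# Hypothetical flat 40 m x 30 m patch of surface points below a camera at 50 m.
gx, gy = np.meshgrid(np.linspace(-20.0, 20.0, 9), np.linspace(-15.0, 15.0, 7))
ground = np.column_stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)])
ideal = project_nadir(ground, camera_xyz=(0.0, 0.0, 50.0))
observed = add_measurement_noise(ideal)
```

A full simulation would repeat this for every camera in the network and pass the noisy observations to a bundle adjustment, which is outside the scope of this sketch.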
Although the simulations are effectively scale independent, we attribute parameter values relevant to geomorphological studies, to aid familiarity and interpretation. A straightforward camera model described a 4000 × 3000 sensor of 5 μm pixels with a 20 mm lens, in which lens distortion was given by only one radial term (K₁), taken from a standard distortion model (Brown, 1971). The value of K₁ defines a radial distortion that increases with the cube of the distance from the effective image centre, with our default camera (used to determine the image measurements) having zero distortion, i.e. K₁ = 0 mm⁻². To simulate scenarios with radial distortion error, K₁ was fixed during bundle adjustment at 10⁻⁵ mm⁻², which corresponds to a maximum geometric distortion of ~5.5 pixels in an image corner.
Simulations were carried out to represent a nominal flying height of 50 m, giving a ground pixel size of ~13 mm and image footprint of ~50 × 38 m. Firstly, the simulation process was validated using a single stereo pair (with 60% overlap) to reproduce the 'domed' DEM deformation in the presence of radial distortion error as illustrated by Chandler (2008, 2011). Idealized standard aerial survey image strips and blocks were then constructed using photograph centres positioned to give 60% along-strip image overlap and 20% overlap between adjacent strips. Thus, a block represented 40 images, acquired through four parallel flight lines separated by 40 m, with images taken at 15 m intervals along each line. Such scenarios were used to assess the effect of enabling self-calibration adjustment of the K₁ parameter which, initially set to zero, was given an uncertainty of ±10⁻⁵ mm⁻² and allowed to vary within the bundle adjustment.
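The survey geometry quoted above follows from simple scaling, which the short sketch below reproduces. The function names are hypothetical, and the assignment of the short sensor side to the along-strip direction is an assumption, inferred because it is the orientation that yields the stated 15 m image interval and 40 m line separation.

```python
def survey_geometry(height_m=50.0, focal_mm=20.0, pixel_um=5.0,
                    width_px=4000, height_px=3000,
                    overlap_along=0.60, overlap_across=0.20):
    """Ground pixel size, image footprint and camera spacings for a nadir block.

    Assumes the short sensor side lies along the flight direction.
    """
    gsd_m = height_m * (pixel_um * 1e-6) / (focal_mm * 1e-3)
    footprint_long_m = width_px * gsd_m      # across-strip extent
    footprint_short_m = height_px * gsd_m    # along-strip extent
    return {
        "gsd_m": gsd_m,
        "footprint_m": (footprint_long_m, footprint_short_m),
        "along_spacing_m": footprint_short_m * (1.0 - overlap_along),
        "strip_spacing_m": footprint_long_m * (1.0 - overlap_across),
    }


def block_centres(n_strips=4, n_per_strip=10, height_m=50.0):
    """Camera centres (x along strip, y across strips, z) for an idealized block."""
    g = survey_geometry(height_m=height_m)
    return [(i * g["along_spacing_m"], j * g["strip_spacing_m"], height_m)
            for j in range(n_strips) for i in range(n_per_strip)]


geometry = survey_geometry()
centres = block_centres()
```

With the default values this gives a ground pixel size of 12.5 mm and a 50 m × 37.5 m footprint, consistent with the rounded figures in the text, and a block of 40 camera positions.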
However, these idealized scenarios, with one set of perfectly parallel flight lines and vertically oriented cameras, contain none of the variability that would naturally exist in real UAV flights. During practical fixed-wing UAV surveys, a straightforward image block may be supplemented by a second set of flight lines at a slightly different azimuth heading to ensure good image overlap. Camera altitude and pointing direction will also be subject to a degree of natural flight variability. Consequently, to explore error sensitivity for a more complex flight plan under realistic flight conditions, simulations were performed in which small random variations were added to the camera pointing direction and flying height. Systematic variations in camera height or orientation, as produced by flying over sloping terrain, at different altitudes or with an off-nadir installed camera, were also explored.
With convergent imaging geometry known to significantly reduce DEM error in image pairs processed with invariant camera models (Chandler, 2008, 2011), we consider multi-image convergent imaging scenarios similar to strip and block-like layouts, and present practical flight plan solutions to implement the advantages of convergent imaging in self-calibrated UAV surveys. Finally, for scenarios in which data collection with some convergent geometry is not possible (such as existing datasets), but where there is sufficient ground control to characterize systematic error, we describe an invariant camera model processing strategy to mitigate error by better defining K₁ values. The approach is demonstrated using a scenario similar to the practical flight plan described earlier, but with natural variability simulated with pseudo-random variations of both camera angle and altitude, with standard deviations of 2° and 1 m, respectively. The relationship between K₁ and the doming magnitude can be characterized by carrying out multiple bundle adjustments using an invariant camera model, with each adjustment using a different value of K₁. An initial self-calibrated bundle adjustment will provide a first K₁ estimate that subsequent values can bracket. For each adjustment, the gradient of a linear fit to the resulting dome profile (as expressed by the vertical error on control points or with respect to a reference DEM) can be used as a metric for the doming magnitude, to associate with K₁. Thus, modelling this gradient-K₁ relationship allows a zero-doming (zero-gradient) K₁ estimate to be made, which can then be used in an invariant camera model for optimized reprocessing.
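The final step of this strategy, estimating the zero-gradient K₁ from a set of trial adjustments, can be sketched as two straight-line fits. The example assumes NumPy and expresses K₁ in units of 10⁻⁶ mm⁻² to keep the numbers well scaled. The bundle adjustments themselves are not performed here: the trial gradients are generated from a made-up linear relationship, and all names and values are hypothetical.

```python
import numpy as np


def doming_gradient(radius_m, z_error_m):
    """Slope of a straight-line fit to vertical error against radial distance."""
    slope, _intercept = np.polyfit(radius_m, z_error_m, 1)
    return float(slope)


def zero_doming_k1(k1_values, gradients):
    """K1 at which a straight-line fit of gradient against K1 crosses zero."""
    slope, intercept = np.polyfit(k1_values, gradients, 1)
    return float(-intercept / slope)


# Hypothetical trial values of K1 (units of 1e-6 mm^-2) bracketing a first estimate.
trial_k1 = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])

# In practice each gradient would come from doming_gradient() applied to the
# control point z-errors of one invariant-camera bundle adjustment. Here they
# are synthesized from assumed (made-up) values so the sketch is self-contained.
assumed_sensitivity = -1.5e-3   # change in gradient per unit K1 (made up)
assumed_best_k1 = 0.25          # K1 giving zero gradient (made up)
trial_gradients = assumed_sensitivity * (trial_k1 - assumed_best_k1)

k1_estimate = zero_doming_k1(trial_k1, trial_gradients)
```

The estimated value would then be fixed in an invariant camera model for a final bundle adjustment. With real data the points will scatter about the line, so inspecting the fit residuals before trusting the zero crossing seems prudent.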

Idealized parallel geometries
The results of simulating a standard stereo image pair in which an invariant camera model has radial distortion error (Figure 2a, central panel) reproduce the symmetrical domed deformation observed by Chandler (2008, 2011), and contrast with the negligible DEM deformation produced with an invariant error-free camera (Figure 2a, left panel). If adjustment of the camera model is allowed within the bundle adjustment, the presence of image measurement noise allows the self-calibration process to converge on a non-zero radial distortion term, with associated DEM deformation (Figure 2a, right panel), again similar to that seen in stereo image pairs (Wackrow and Chandler, 2008). Another way of expressing this is that, in a parallel-axes image network, the computed surface form (the DEM) and the radial distortion estimation are correlated, and are inseparable without additional information. The simulated image strip (Figure 2b) shows that this self-calibration issue persists over multiple images, and the processed image block (Figure 2c) demonstrates the characteristic DEM doming observed in real UAV data (e.g. Rosnell and Honkavaara, 2012).

Sensitivity to variability
Augmenting the image block with an additional set of flight lines at a different azimuth heading did not significantly reduce the systematic DEM deformation (Figure 3a). However, adding variability to the camera pointing direction and altitude provides notable improvements (Figure 3b). Ground slope (effectively representing a systematic rather than random variation in flight height) also helped but, even with a gradient of 20% (giving image heights of between 36.5 and 63.5 m above the ground), deformation of amplitude ~0.2 m remained. Similar levels of doming mitigation can be achieved over flat terrain by flying the additional flight lines at a different altitude, or by inclining the camera within the airframe (Figure 3b), although these solutions have unfavourable implications in terms of the scale of the acquired imagery. Nevertheless, the results indicate that all parameter variabilities have mitigating effects on the deformation, with camera angle being particularly effective.

Figure 2. Vertical DEM error in idealized simulations of (a) a single stereo pair, (b) a 10-image strip and (c) a four-strip image block. The simulated camera positions (cones) for each scenario are shown in the left-most column above the area of reconstructed surface. Three simulation results are given for each scenario, showing the results of bundle adjustment processing carried out using either a fixed, error-free camera model (left plots), a fixed camera model with error introduced into the radial distortion term (middle plots), or a camera model in which self-calibration of the K₁ radial distortion parameter was enabled within the bundle adjustment (right plots). Each plot shows the systematic component of vertical error in the resulting DEM as both a 3D wireframe and shaded image. Note the magnitudes of the vertical scales, which differ between the different scenarios shown. This figure is available in colour online at wileyonlinelibrary.com/journal/espl

Convergent imaging geometries
For a convergent image pair with an invariant camera model and error in the radial distortion term, the simulations reproduce the findings of Chandler (2008, 2011), showing mitigation of systematic DEM error (i.e. compare the central panels of Figures 2a and 4a). However, if camera models are allowed to vary during bundle adjustment, extending the image strip scenario (Figure 2b) by adding images angled at 5° either side of the original images also significantly decreased the deformation (compare Figures 4b and 2b). Collecting convergent images from a circular orbit (Figure 4c, e.g. Cecchi et al., 2003; James and Robson, 2012; James and Varley, 2012) is particularly effective, with the maximum error shown in Figure 4c being < 2 mm.

Practical flight plans
Thus, convergent imaging clearly mitigates surface deformation in self-calibrating multi-image blocks but, because practical implementation of the scenarios in Figure 4 may be difficult for UAVs, we suggest alternative solutions (Figure 5) based on augmenting a traditional flight plan (which has advantages in terms of efficiency in areal coverage and for producing nadir imagery for ortho-image mosaics). For fixed-wing systems, oblique images could be captured during additional, gently curved overpasses (Figure 5a) or, for systems in which the camera inclination can be varied, fewer, more highly angled images may be possible (Figure 5b). The results of both scenarios demonstrate the effective minimization of doming error.

Optimized processing
When convergent imagery is not available, then the direct relationship between K₁ and the magnitude of the doming can be exploited to minimize systematic error. Carrying out multiple bundle adjustments with invariant camera models demonstrates the resulting variation in systematic vertical error magnitude with K₁ (Figure 6a). Note that due to using synthetic data in the simulations, the error-free topographic surface is effectively known, so error magnitude can be calculated for each 3D point. In most real cases, a reference DEM will not be available and z-error would be calculated from the control points. The gradients of linear fits to the resulting error profiles show a linear relationship with the K₁ parameter (Figure 6b), reflecting the correlation between K₁ and deformation of the surface form. Using this relationship to estimate the zero-doming K₁ value for optimized processing gives K₁ = −3.7 × 10⁻⁸ mm⁻² (compare with the actual default camera value of zero mm⁻²). Using the estimated K₁ value results in an order of magnitude systematic error reduction over the self-calibrated bundle adjustment solution (Figure 6c and blue data in 6a) although the small remaining distortion error does leave some residual, more complex DEM error.

Figure 3. [...] Figure 2c). (b) Systematic vertical DEM errors plotted by radial distance from the survey centre. To facilitate comparisons, all results are translated vertically to give zero error at zero radius. Upper-row plots illustrate the effect of adding a component of random noise to the camera pointing directions (i.e. variability in UAV roll, pitch and yaw) or camera altitude, and the effect of surveying over sloping ground. Results are labelled by the standard deviation, σ, of the varied parameter, or the ground slope (in percent). In the lower row, plots demonstrate the effect of non-nadir installation of the camera in the UAV (with the camera forward-pointing by the given angle), and by flying the second set of flight lines at increased altitude (labels give the magnitude of the increases). In all plots, the greatest error is given by the idealized scenario shown in (a) that represents zero noise and flat topography. This figure is available in colour online at wileyonlinelibrary.com/journal/espl

Discussion
The simulations demonstrate how auto-calibration processing of parallel-axis image networks can produce the metre-magnitude systematic dome or arch deformation as seen in some UAV reconstructions (e.g. Rosnell and Honkavaara, 2012; Javernick et al., 2014). Rosnell and Honkavaara (2012) mitigated the systematic error in their SfM-derived DEM by including control measurements and processing the image set with standard aerial photogrammetry software (BAE Systems SocetSet). Javernick et al. (2014) were able to reduce z-error to the decimetre level by including control points in their bundle adjustment and by using a more complex camera model. However, evidence of systematic error remains (figure 5 of Javernick et al., 2014) and the standard deviation of z-error for check points was seven times that for control points. This suggests that the additional parameters in the camera model were enabling the control measurements to be accommodated better locally, but at the expense of a more general solution with poorer 3D coordination precision. Our simulations did not include any control measurements within the bundle adjustment, thus, the results represent maximum expected DEM deformations, as anticipated for SfM-based reconstructions in which control data are only used to scale and orient the resulting 3D model (e.g. figure 10 of Rosnell and Honkavaara, 2012). Where spatially well distributed (and suitably accurate) control measurements can be included within the bundle adjustment, deformation magnitudes should be reduced. For an idealized case with no error on any image or control measurements, and where the camera model provides an accurate representation of the imaging process, then bundle adjustment should equally satisfy both inner and external constraints simultaneously. However, with the inner and external constraints being independent, there are no implicit guarantees that the bundle adjustment goals for them will fully agree; indeed, with all measurements subject to noise and other natural variability, this is unlikely to be the case. Thus, when both types of information are considered, the results represent a balance between optimizing for either purely inner or external constraints, achieved by weighting the measurements based on their estimated precision. Consequently, even when control data are present, if inner constraints tend to a systematic error, then this will persist in the output model to a degree determined by the relative weightings of the image and control measurements. To enable doming error to be detected (and corrected), control points should be widely distributed, covering both the survey centre and peripheral regions, enabling radial z-error plots (e.g. Figures 3b and 6a) to be generated.
The concept of combining parallel and oblique images is equally valid for ground-based image collection (e.g. Figures 4b and 4c), although this is seldom carried out in a dominantly vertical direction. James and Robson (2012) attributed the lack of detectable deformation in their cliff reconstructions to such a strategy, as demonstrated in Figure 1. Nevertheless, analysis of output statistics from the adjustment process and correlations between estimated parameters, as well as the inclusion of check points (known coordinates of measured locations that are compared against estimated positions) are essential to provide confidence in the output solution.
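One simple form of the check point comparison recommended above is to summarize vertical error separately for control and check points and compare the two. The sketch below assumes NumPy. The function names and the example elevations are hypothetical and are not taken from any of the cited studies.

```python
import numpy as np


def z_error_summary(estimated_z, surveyed_z):
    """Mean, sample standard deviation and RMSE of vertical error (metres)."""
    err = np.asarray(estimated_z, dtype=float) - np.asarray(surveyed_z, dtype=float)
    return {
        "mean": float(err.mean()),
        "std": float(err.std(ddof=1)),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
    }


def check_to_control_ratio(control_est, control_ref, check_est, check_ref):
    """Ratio of check point to control point z-error standard deviation."""
    control = z_error_summary(control_est, control_ref)
    check = z_error_summary(check_est, check_ref)
    return check["std"] / control["std"]


# Hypothetical surveyed elevations and model estimates (metres).
control_ref = np.array([10.00, 10.50, 11.00, 11.50])
control_est = control_ref + np.array([0.01, -0.01, 0.01, -0.01])
check_ref = np.array([10.20, 10.70, 11.20, 11.70])
check_est = check_ref + np.array([0.07, -0.07, 0.07, -0.07])

ratio = check_to_control_ratio(control_est, control_ref, check_est, check_ref)
```

A ratio well above one, as in the factor of seven noted earlier for Javernick et al. (2014), may indicate that the adjustment is accommodating the control points locally rather than recovering a generally accurate surface.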

Conclusions
For image sets with near-parallel viewing directions, self-calibrating bundle adjustment (as normally used in SfM-based reconstructions) will not be able to derive radial lens distortion accurately, and will give associated systematic 'doming' DEM deformation. In the presence of image measurement noise (at levels characteristic of SfM software), and in the absence of control measurements, simulations representative of UAV surveys (with camera angles only deviating from parallel by a standard deviation of 2°) display domed deformation with magnitude of ~0.2 m over horizontal distances of ~100 m. Deformation will be reduced if suitable control points can be included within the bundle adjustment, but residual systematic vertical error may remain, accommodated by the estimated precision of the control measurements.
The likelihood of detectable systematic DEM error in UAV (or similar) surveys can be reduced in a number of ways: (a) If an accurate camera model is available, then self-calibration is not required and systematic error should be negligible (Figure 2, first column). (b) In the more usual case where self-calibration is necessary, systematic error can be significantly reduced through the collection of oblique imagery. The exemplar practical flight plans given in Figure 5 reduce DEM deformation by one to two orders of magnitude and demonstrate the advantage of gimballed camera mounts (as on many rotor-UAVs). We propose this as a particularly useful strategy when SfM-based software is to be used, which may provide less access to processing parameters than typical aerial photogrammetry techniques. If a nadir image set is not required, then flying overlapping flight lines in opposing directions with an off-nadir installed camera is also effective. (c) Finally, if oblique imagery is not available but suitably distributed control points are present, the relationship between deformation magnitude and radial distortion can be characterized. Through repeated bundle adjustment using an invariant camera model with different distortion parameter values, the parameter value associated with minimal systematic DEM error can be estimated (Figure 6), and then used for optimized processing.
Whichever approach is adopted, for critical DEM extraction surveys, especially over large flat regions, the established photogrammetric practice of inspecting bundle adjustment output statistics, estimated parameter correlations and the use of check points is highly recommended.

Figure 6. Minimizing systematic DEM error through optimized estimation of K₁. (a) Radial profiles from the survey centre through six DEMs produced from the flight plan scenario of Figure 3, with variability in the camera altitude and pointing directions given by standard deviations of 1 m and 2°, respectively. The grey datasets show the results of using an invariant camera model within the bundle adjustment, with different K₁ values (given by the number labels, ×10⁻⁶ mm⁻²). The red data (with a systematic deformation of up to ~0.2 m in magnitude) result from a self-calibrated image network which recovered a K₁ value of 2.2 × 10⁻⁶ mm⁻². The black lines show linear fits to each dataset. (b) The gradient values for the linear fits demonstrate a linear relationship with K₁ (reflecting the correlation between K₁ and the surface form) from which the zero-gradient (i.e. minimum-doming) K₁ value can be estimated (+ symbol, K₁ = −3.74 × 10⁻⁸ mm⁻²). Using this value with an invariant camera model in a bundle adjustment results in strong mitigation of the doming effect [(c), and blue data in (a)]. This figure is available in colour online at wileyonlinelibrary.com/journal/espl