A simple, fast, and repeatable survey method for underwater visual 3D benthic mapping and monitoring

Abstract Visual 3D reconstruction techniques provide rich ecological and habitat structural information from underwater imagery. However, an unaided swimmer or diver struggles to navigate precisely over larger extents with consistent image overlap needed for visual reconstruction. While underwater robots have demonstrated systematic coverage of areas much larger than the footprint of a single image, access to suitable robotic systems is limited and requires specialized operators. Furthermore, robots are poor at navigating hydrodynamic habitats such as shallow coral reefs. We present a simple approach that constrains the motion of a swimmer using a line unwinding from a fixed central drum. The resulting motion is the involute of a circle, a spiral‐like path with constant spacing between revolutions. We test this survey method at a broad range of habitats and hydrodynamic conditions encircling Lizard Island in the Great Barrier Reef, Australia. The approach generates fast, structured, repeatable, and large‐extent surveys (~110 m2 in 15 min) that can be performed with two people and are superior to the commonly used “mow the lawn” method. The amount of image overlap is a design parameter, allowing for surveys that can then be reliably used in an automated processing pipeline to generate 3D reconstructions, orthographically projected mosaics, and structural complexity indices. The individual images or full mosaics can also be labeled for benthic diversity and cover estimates. The survey method we present can serve as a standard approach to repeatedly collecting underwater imagery for high‐resolution 2D mosaics and 3D reconstructions covering spatial extents much larger than a single image footprint without requiring sophisticated robotic systems or lengthy deployment of visual guides. 
As such, it opens up cost‐effective novel observations to inform studies relating habitat structure to ecological processes and biodiversity at scales and spatial resolutions not readily available previously.

| 1771 PIZARRO et Al. & Puotinen, 2012) to longer term stressors associated with climate change (Hughes, 2003). The physical habitat structure built by benthic communities provides diverse niches and supports an array of associated organisms. Benthic habitats with higher levels of structural complexity support greater levels of species abundance and diversity (e.g., fishes and crustaceans) (Graham & Nash, 2013), and show faster rates of recovery following disturbances (Graham, Jennings, MacNeil, Mouillot, & Wilson, 2015). As a consequence, habitat complexity can be used as an indirect indicator of the health and functioning of some ecosystems (e.g., productivity and trophic redundancy). Enabling fast and reliable observation of benthic community composition and structural complexity over large areas, especially during difficult field conditions, will greatly improve tests of ecological theory and the effectiveness of monitoring programs.
Traditional techniques to estimate community composition and habitat structural complexity are labor-intensive, low-dimensional, and capture data at small spatial and temporal scales (Friedlander & Parrish, 1998; Loya, 1972; Luckhurst & Luckhurst, 1978). For instance, the line intercept transect records the one-dimensional length of overlap with different benthic categories (e.g., coral, seaweed, and sand), which is used as an estimate of the two-dimensional coverage of these categories. Similarly, a common measure of habitat structural complexity (or rugosity) is the ratio between the length of a chain draped over the benthos and the absolute distance between the start and end points. The limited scale of such techniques requires high levels of replication for accurate estimates. Furthermore, transects and chains are usually set randomly, resulting in a loss of spatial relationships.
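As a rough illustration of the chain measure described above, the following Python sketch computes a rugosity ratio from a sampled elevation profile. The function name `chain_rugosity` and the profile values are hypothetical and are not taken from the study; the sketch assumes the chain follows the profile exactly and that the "absolute distance" is the straight line between the two end points.

```python
import math


def chain_rugosity(xs, zs):
    """Ratio of draped chain length to straight-line end-to-end distance.

    xs, zs: horizontal positions and elevations (m) sampled along one transect.
    Assumes the chain follows the sampled profile exactly (an idealization).
    """
    # Chain length: sum of the segment lengths along the profile.
    chain = sum(
        math.hypot(x1 - x0, z1 - z0)
        for x0, x1, z0, z1 in zip(xs, xs[1:], zs, zs[1:])
    )
    # Straight-line distance between the first and last points.
    straight = math.hypot(xs[-1] - xs[0], zs[-1] - zs[0])
    return chain / straight


# Hypothetical profiles: a flat seafloor and a single 1-m-high ridge.
flat = chain_rugosity([0.0, 1.0, 2.0], [0.0, 0.0, 0.0])
ridge = chain_rugosity([0.0, 1.0, 2.0], [0.0, 1.0, 0.0])
print(flat, ridge)
```

A flat profile gives a ratio of one, and the ratio grows as the profile becomes more convoluted. Because the measure is taken along a single line, it depends strongly on where that line is placed.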
Recent advances in computer vision have enabled three-dimensional reconstructions of bathymetry from which benthic categories (Beijbom et al., 2015; Bewley et al., 2012; Shihavuddin, Gracias, Garcia, Gleason, & Gintert, 2013) and multiscale structural complexity (Friedman, Pizarro, Williams, & Johnson-Roberson, 2012) can be estimated. Several recent studies have used off-the-shelf Structure from Motion (SfM) software such as Photoscan to build 3D models of colonies and broader reef patches and to characterize the quality of these reconstructions (Burns, Delparte, Gates, & Takabayashi, 2015; Figueira et al., 2015; Leon, Roelfsema, Saunders, & Phinn, 2015; Storlazzi, Dartnell, Hatcher, & Gibbs, 2016), establishing confidence in the use of visual reconstructions to address ecological questions (Burns et al., 2016). These techniques rely on combining overlapping images into a composite 3D reconstruction, and while they can scale to areas of tens to thousands of square meters consisting of tens of thousands of images, they need a systematic way of covering the survey site. Otherwise, poor coverage, whether in the form of gaps or holes (missing imagery for parts of the benthos) or of low overlap (a low number of views of the same scene point, resulting in low-precision triangulations and structure estimates), compromises the usefulness of the imagery. Systematic coverage is an ideal task for a properly instrumented underwater robot, which can carry down-looking cameras and be preprogrammed to follow a survey pattern to collect the desired imagery. However, the use of robots is still logistically complex, requiring specialized personnel in the field, and robots do not operate well in shallow-water, high-energy conditions. Diver-held imaging systems can also deliver broad area coverage, although the replication of systematic "mow the lawn" patterns requires ropes as visual guides and additional people in the water to handle them (Henderson, Pizarro, Johnson-Roberson, & Mahon, 2013).
Others have relied on swimming an approximate grid pattern unaided by external guides (Burns et al., 2015), but this approach does not scale well to larger areas and tends to break down for narrow line spacing or in strong swell or currents (Andersen, 1968). It is also a tedious task that depends heavily on the skill of the diver. A simpler approach, the "minute mosaic" (Gintert et al., 2012), uses a rebar pin as a visual reference for a diver to complete three revolutions with increasing radius. As it depends on the diver's assessment of distance to the pin, the areas covered varied from 19 to 44 m². This is also likely to result in variable image overlap between revolutions, affecting the quality of the composite and limiting its value for repeat surveys. Other domains use spirals to survey areas.
For example, Archimedean spirals, which resemble involutes of a circle, are used in surface reconstruction in metrology (Wieczorowski, 2001) and in estimating patchy distributions (Kalikhman, 2006), although these uses are concerned with sparse sampling of the area of interest.
We present a simple, repeatable, and low-cost method to generate systematic surveys for visual three-dimensional reconstructions of benthic habitats. It removes the need for high-end navigation and controls and relies instead on constraining the motion of a swimmer carrying the imaging equipment. Consistent coverage is attained using a line wound around a fixed drum as a guide. Unwinding the line under natural swimming tension constrains motion to a spiral-like pattern. The curve traced by the tip of the line (and the imaging package attached to it) corresponds to the involute of a circle, with constant separation distance between revolutions corresponding to the circumference of the drum. The approach we present is practical for full coverage of ~110 m² areas, which corresponds to a radius of ~6 m. Much larger lengths increase the chances of entanglement.
FIGURE 1 Footprint, b, seen by a down-looking camera with angle of view β at an altitude of h. In general, the aspect ratio of imaging sensors is not square, so the angles of view across track (across the direction of motion) and along track differ. We refer to these by the subscripts β_across and β_along, and likewise for the corresponding footprints b_across and b_along.
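The geometry of the unwinding line can be sketched numerically. The Python below uses the standard parametric form of the involute of a circle; the function names, the symbol `a` for the drum radius, and the use of the unwound angle `alpha` as the parameter are assumptions made for illustration and are not the notation of the original figures. The drum diameter (0.16 m) and line length (6 m) are the values quoted in the figure captions of this study.

```python
import math


def involute_point(drum_radius_m, alpha_rad):
    """Tip of a taut line unwound by angle alpha (radians) from a drum at the origin."""
    a = drum_radius_m
    x = a * (math.cos(alpha_rad) + alpha_rad * math.sin(alpha_rad))
    y = a * (math.sin(alpha_rad) - alpha_rad * math.cos(alpha_rad))
    return x, y


def spiral_summary(drum_diameter_m, line_length_m):
    """Basic survey quantities for a fully unwound line (idealized, no slack)."""
    a = drum_diameter_m / 2.0
    alpha_max = line_length_m / a  # unwound line length equals a * alpha
    r_max = math.hypot(a, line_length_m)  # tip radius r = a * sqrt(1 + alpha^2)
    return {
        "spacing_m": 2.0 * math.pi * a,  # separation between revolutions
        "revolutions": alpha_max / (2.0 * math.pi),
        "max_radius_m": r_max,
        "path_length_m": line_length_m ** 2 / (2.0 * a),  # arc length a * alpha^2 / 2
        "area_m2": math.pi * r_max ** 2,  # disc swept by the tip
    }


summary = spiral_summary(drum_diameter_m=0.16, line_length_m=6.0)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under these idealized assumptions, a 0.16-m drum and a 6-m line give a spacing of about 0.5 m between revolutions, roughly 12 revolutions, a swum path of about 225 m, and an area of about 113 m². If such a survey takes the roughly 15 min quoted in the abstract, the implied mean swimming speed is about 0.25 m/s.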

| MATERIALS AND PROCEDURES
For down-looking cameras, systematic surveys covering areas much larger than the footprint of a single image require multiple views of the same scene points (i.e., "image overlap") to relate the multiple images into a composite representation such as a 3D reconstruction or an orthographic mosaic. In the case of a down-looking camera, image overlap along the direction of motion depends on the angular field of view, altitude, and motion between image capture instants (Figure 1).
The footprint b is given by b = 2h tan(β/2), where β is the angular field of view and h the altitude (distance from camera to seafloor).
The angular field of view can be estimated from a camera calibration (Bouguet, 2004) or, approximately, using the effective focal length in water and the imaging sensor size. For example, the configuration that captured the imagery used here has an across-track field of view of 42° and an along-track field of view of 34°. At a desired altitude of 2 m, the across-track footprint is b_across = 1.54 m and the along-track footprint is b_along = 1.22 m (Figure 2), giving a footprint of just under 2 m² per image. The footprint size and the displacement between frames Δ determine the number of views of a scene point, n = b/Δ. The along-track displacement is Δ_along = s·T, where s is the survey speed and T the period between frames. For the purposes of designing a survey with a target number of views, we can determine the displacement as Δ = b/n. For along-track motion, the desired survey speed is s = b_along/(n·T). The trackline spacing (across track) is Δ_across = b_across/n.
FIGURE 2 Contours of constant footprint b in meters as a function of altitude h and angular field of view β. Across-track footprint (blue marker) and along-track footprint (green marker) for the camera used to generate the results in this study, with a target altitude of 2 m. The effect on footprint of a ±0.5-m variation in altitude is illustrated by the range bars.
FIGURE 3 Geometry relating the tip of the unwinding line (r, ϕ) to the drum diameter R and angle along the drum circumference α made by the involute of a circle of diameter 0.16 m and a 6-m-long line. In practice, the tip of the line will trace this pattern as it is unwound around a drum while keeping it in tension.
FIGURE 5 Contours of constant survey path length in meters, as a function of drum diameter (or spacing between revolutions) and desired survey area. The drum diameter sets the spacing between revolutions and should be selected considering the camera footprint (Figure 2) and desired overlap. Given the path length and a swimming speed, the survey time can be calculated. The blue marker indicates the survey design used to generate the results presented in this study, with a drum diameter of 0.16 m and an approximate survey area of 113 m².
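The design relations in this section can be collected into a short calculation. The following Python sketch is illustrative only: the function names, the dictionary keys, and the target of three views per scene point are assumptions rather than values from the study. The fields of view (42° and 34°), the 2-m altitude, and the 2-Hz frame rate (T = 0.5 s) are taken from the text.

```python
import math


def footprint(altitude_m, fov_deg):
    """Image footprint b = 2 h tan(beta / 2) for a down-looking camera."""
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)


def survey_design(altitude_m, fov_across_deg, fov_along_deg, n_views, frame_period_s):
    """Trackline spacing and survey speed that give n_views of each scene point."""
    b_across = footprint(altitude_m, fov_across_deg)
    b_along = footprint(altitude_m, fov_along_deg)
    return {
        "b_across_m": b_across,
        "b_along_m": b_along,
        "spacing_across_m": b_across / n_views,  # delta_across = b_across / n
        "speed_m_per_s": b_along / (n_views * frame_period_s),  # s = b_along / (n T)
    }


# n_views=3 is a hypothetical design target chosen for this example.
design = survey_design(
    altitude_m=2.0,
    fov_across_deg=42.0,
    fov_along_deg=34.0,
    n_views=3,
    frame_period_s=0.5,
)
for name, value in design.items():
    print(f"{name}: {value:.2f}")
```

With these inputs, the footprints reproduce the 1.54 m and 1.22 m quoted above. The across-track spacing for three views comes out near 0.51 m, close to the circumference of a 0.16-m drum (about 0.50 m). The speed returned is the fastest that still gives the target number of views along track, so swimming more slowly only increases along-track overlap.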

| Materials
We rely on three major components for data acquisition:

Star picket or base.
For reef structures, a star picket driven into the substrate at the center of the survey patch serves as the anchor point to hold the drum and pole. The pole is keyed to the holes on the star picket so that a pin or screwdriver locks the pole from rotating around the star picket. See Section 5 for a discussion on using the method with other bottom types.

| Methods
We characterized the technique's performance based on a cyclone recovery monitoring program involving 21 shallow reef flat (approx. 1-2 m depth) sites around Lizard Island on the Great Barrier Reef  winds. Therefore, the 21 sites capture a broad range of habitats and fieldwork conditions, ranging from sheltered back reef and lagoons through to exposed reef crests.  in points in the survey or artificial markers using surface GPS (Burns et al., 2015) or in shallow-water near-shore cases, using a total station theodolite approach (Henderson et al., 2013).

| Survey consistency
To quantify the image overlap consistency of the spiral survey procedure, we also performed six surveys using the "mow the lawn" method (Henderson et al., 2013; Mahon et al., 2011). Note that these surveys were conducted on sheltered reef and required three swimmers.
Any survey pattern should ensure image overlap across track, with each image observing common scene points with other nearby images, and thus forming a well-constrained photogrammetric network that produces reliable estimates of camera poses and 3D scene points. We quantify this effect by comparing the local density of connections between cameras in the photogrammetric networks formed by the "mow the lawn" and spiral patterns.
Specifically, we propose a metric based on the shortest path along linked images in the resulting photogrammetric network. For example, if images are directly linked to each other, the shortest path length between them is one; if they have to go through another image, the shortest path length is two. A "hole" (lack of matches because of poor or nonexistent overlap) between two spatially close cameras requires a long path through several other cameras. We considered six "mow the lawn" and 33 spiral surveys. For each one, we calculated the median length of direct links between cameras and then defined a circular neighborhood using a radius twice that size to consider cameras that are "close." This provides invariance to the size of the image footprint (which changes with imaging altitude). For each camera in a survey, we found the shortest path (in number of links) to all nearby cameras within this neighborhood. Figure 10 illustrates the metric.
FIGURE 12 Feature matches between two stereo pairs (top row and bottom row) across two revolutions of the spiral pattern. The colored lines' start and end points correspond to the same feature on the first pair and second pair. The reduced overlap results in fewer matches when compared to Figure 11.
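The metric described above can be sketched as a breadth-first search over the network of linked cameras. The Python below is a minimal illustration, not the processing code used in the study: the function names, the data layout (a dictionary of planar camera positions and a list of linked pairs), and the six-camera toy network are all assumptions.

```python
import math
from collections import Counter, deque
from statistics import median


def neighbor_hops(positions, links):
    """Shortest path, in links, from each camera to every spatially close camera.

    positions: dict camera_id -> (x, y) in meters.
    links: list of (i, j) camera pairs that share matched image features.
    Returns dict (i, j) -> number of links, or None if j is unreachable from i.
    """
    adjacency = {cam: set() for cam in positions}
    for i, j in links:
        adjacency[i].add(j)
        adjacency[j].add(i)

    # Neighborhood radius: twice the median length of the direct links.
    radius = 2.0 * median(math.dist(positions[i], positions[j]) for i, j in links)

    result = {}
    for start in positions:
        # Breadth-first search gives the hop count to every reachable camera.
        hops = {start: 0}
        queue = deque([start])
        while queue:
            cam = queue.popleft()
            for nxt in adjacency[cam]:
                if nxt not in hops:
                    hops[nxt] = hops[cam] + 1
                    queue.append(nxt)
        for other in positions:
            if other != start and math.dist(positions[start], positions[other]) <= radius:
                result[(start, other)] = hops.get(other)
    return result


def hop_histogram(pair_hops):
    """Normalized histogram of shortest path lengths, ignoring unreachable pairs."""
    counts = Counter(h for h in pair_hops.values() if h is not None)
    total = sum(counts.values())
    return {h: c / total for h, c in sorted(counts.items())}


# Toy network: two adjacent tracks, 1 m apart, swum out and back.
positions = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0),
             3: (2.0, 1.0), 4: (1.0, 1.0), 5: (0.0, 1.0)}
along_track = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
with_hole = neighbor_hops(positions, along_track)
well_linked = neighbor_hops(positions, along_track + [(0, 5), (1, 4)])
print(hop_histogram(with_hole))
print(hop_histogram(well_linked))
```

In the toy network, cameras 0 and 5 are only 1 m apart, but without across-track matches the shortest path between them is five links long. Adding the across-track links reduces it to one, which moves the histogram mass toward the one- and two-link bins.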

| Operational simplicity and survey speed
The equipment used is easy to handle. Driving a star picket temporarily into the reef is a standard task for field ecologists. Once clipped onto the line, the swimmer only needs to advance while keeping tension on the line and maintain a desired altitude. The time to perform a spiral survey is consistent, with variations depending on currents  Table 1 for details. In the case of snorkeling on reef flats, tides affect water depth which determines the imaging altitude and image footprint (see Section 2 and Figure 1). Swell and currents act as disturbances that affect speed and the actual path followed (the line only constrains motion away from the pole). In comparison with "mowing the lawn" (Mahon et al., 2011), our approach is significantly simpler and more reliable. Table 2 contrasts these two survey techniques.

| Survey consistency
The spiral survey by design results in constant separation between revolutions. When matched to the field of view and altitude, it guarantees high overlap. Image features are matched automatically between image pairs to provide estimates of the relative pose of camera positions both across and along track. Figures 11 and 12 show examples of along- and across-track matches between pairs of images.
For the results in this study, we acquire stereo pairs at 2 Hz, providing ample overlap along track. Figure 13 shows examples of spiral surveys.
The black dots represent the estimated location of the camera throughout the survey. The red lines join camera locations for which image features  The distribution of the lengths of the shortest paths is indicative of the quality of the survey. Figure 15 shows the histogram of the lengths of the shortest paths to the neighbors of each camera on all the dives.
Ideally, the mass of the distribution will be concentrated in shortest path lengths of one and two links. It is clear that the spiral survey approximates this, while the "mow the lawn" distribution is skewed to much higher path lengths of three up to nine, indicating the presence of significant holes. Figure 16 shows an example of the texture-mapped model for one of the spiral surveys, while Figure 17 shows the underlying three-dimensional surface model.

| Revisiting sites
Given a waterproof printout of the mosaic from a previous survey and its coordinates, an experienced swimmer aided by GPS can relocate the central point in seconds to a few minutes, depending on how much the site has changed. We have successfully completed at least 63 revisits of monitoring sites using this approach (21 sites revisited three times, approximately every 6 months) in an area that was subject to a cyclone after the second visit. If the particular application allows the star picket to be left embedded in the substrate, relocating the survey site is trivial. This approach will be robust to substantial changes in appearance that can occur after events such as large storms. Figure 8 shows locations of sites revisited on Lizard Island for April 2014, October 2014, May 2015, and November 2015. Figures 18 and 19 show details of six sites around the island. Our method was able to consistently survey and revisit sites with varying levels of exposure to waves, wind, and currents, ranging from sites in the protected lagoon to those open to ocean swell.
FIGURE 15 Normalized histogram of the minimum distance or path length (in links) between camera poses within a radius of twice the median length of the links in a survey, based on six "mow the lawn" surveys and 33 spiral surveys in similar conditions and terrain. The minimum distances distribution for the "mow the lawn" surveys has a longer tail and less mass in the one and two link distance bins. Greater minimum distances correspond to holes in coverage or images that do not overlap enough to reliably find common features between them, leading to poorly constrained photogrammetric networks (see Figure 10).

| DISCUSSION
Our constrained motion survey provides a simple yet robust and effective way to systematically cover an area much larger than a single image footprint. The successive passes in the spiral path can be spaced precisely to allow overlap across revolutions and enable 3D visual reconstructions. This approach facilitates georeferencing and revisiting sites for monitoring. With this type of survey data, it is straightforward to generate multiscale terrain complexity measures.
This method enables scientists to reliably generate high-resolution, broad-scale representations of reef environments without depending on engineering specialists and complex robotic systems. It can be integrated into their standard fieldwork with modest additional effort providing novel views of structural complexity and larger scale spatial patterns. For example, reconstructions from spiral surveys have been color-printed onto underwater paper and uploaded into underwater tablet GIS software, for in situ coral species identification and habitat feature annotation. When coupled with ecological surveys (e.g., corals and fish), the method can offer valuable data at multiple scales for understanding the relationship between species diversity and habitat complexity. When repeated and coupled with environmental data and observations of the physical disturbances, it enables powerful insights into the ecological and evolutionary processes operating in marine systems.

| COMMENTS AND RECOMMENDATIONS
One of the limitations of the technique is that the line between the imaging platform and drum must be free to "sweep" the site unobstructed. This is satisfied by relatively planar, though not necessarily horizontal, surfaces. It is also satisfied if the center of the survey is at a local minimum or maximum. In practice, constant survey altitude is not achieved, so the imaging system must remain in focus over the range of altitudes encountered and provide an image footprint that still achieves overlap with neighboring revolutions at the low end of the altitude range. In cases of significant surfaces that are not captured by a down-looking camera, it should be possible to complement the systematic spiral survey with additional imagery at oblique angles, as long as there is a sequence of images that gradually changes the orientation of the camera while observing the same scene points. This ensures that the additional images can be used by the reconstruction pipeline. This technique has been mostly used on carbonate reefs, where a temporary or permanent star picket can be driven into the substrate and then serve as an attachment point for the pole. In cases of rocky reefs or soft sediments, different attachment methods are required.
An alternative would be to use a pole with a heavy base or tripod. The increase in the range of bottom types on which the technique can be used comes at the price of more awkward transport in the water. While the results presented in this study are based on surveys using snorkel, the method has also been used with scuba to collect data at greater depths. In such cases, care must be taken to keep the line length (i.e., maximum radius) under the maximum allowed safe separation distance between divers. In cases of near-vertical slopes, consideration of the dive profile would also be necessary, as the final revolutions would result in changes in depth for the diver comparable to the diameter of the survey.
FIGURE 18 Six spiral surveys collected at sites around Lizard Island showing the variability in the reef cover. Each spiral survey covers an area of approximately 113 m². These are rendered as the orthographic projection of the image-textured mosaics. The variability in cover is readily apparent. Clockwise, from top left: North Reef 3, Washing Machine, Easter Point, South Island, Lagoon 2, and Resort. See Figure 19 for the corresponding underlying bathymetry and Figure 8 for location of these sites around the island (red points).

Research in this study was supported by the Australian Research Council grants DP1093448 and FT110100609 and the University of Sydney. A special thanks to Tom Bridge for capturing some of the work with his camera, and for helping out with fieldwork on Lizard Island.