Towards monitoring groundwater‐dependent ecosystems using synthetic aperture radar imagery

Mapping of groundwater‐dependent ecosystems (GDEs) relies largely on assumption‐laden evaporation models, and few global, direct, and real‐time monitoring techniques exist. We propose a new synthetic aperture radar imagery‐derived index, SARGDE, to identify and monitor these ecosystems across Australia. The index captures vegetation reliance on groundwater during dry periods by estimating the relative stability of foliage and branch structure from the vertical/horizontal cross‐polarized band and InSAR coherence. SARGDE is tested over two contrasting study sites in Australia. To build and verify the index, a total of 90 Sentinel‐1 interferometric wide images are processed and stacked in two data‐cubes. GDE response to the SAR signal is explored using a non‐linear dimension reduction algorithm. Relevant statistical parameters are derived from data‐cubes and combined to form the index. As the index relies on a 1‐year time series of globally, freely available, and cloud‐insensitive SAR imagery, SARGDE offers unprecedented capabilities for large‐scale, annual monitoring of GDEs. Such monitoring will aid reconciliation of human and ecosystem groundwater needs by acting as a systematic monitoring tool, helping policy makers to assure ecosystem sustainability where impacts related to mining, agriculture, or climate change may occur.


| INTRODUCTION
Groundwater extraction associated with development activities such as mining, hydropower, or irrigated agriculture, coupled with future climate impacts, pose considerable threats to the planets groundwaterdependent ecosystems (GDEs). As surface water abstractions reach (and often exceed) ecological limits in many areas of the world (Acreman et al., 2008;Datry et al. 2014;Eamus & Froend, 2006), groundwater exploration and use are increasing. As a result, hydrological alteration by development activities is triggering attention and stimulating scrutiny in relation to their potential current and future impact on the hydrological processes that govern connectivity between GDEs and groundwater. Interest in GDEs has largely been driven from Australia where tree decline and death have been noted in response to excessive groundwater abstraction (Arrowsmith, 1996;Clifton & Evans, 2001;Hatton & Evans, 1998).
GDEs have been defined in Australia by Richardson et al. (2011) as "natural ecosystems that require access to groundwater to meet all or some of their water requirements on a permanent or intermittent basis, so as to maintain their communities of plants and animals, ecosystem processes and ecosystem services." As such, they are complex and dynamic ecosystems that have varying temporal and spatial dependency on groundwater. GDEs can include vegetation, wetlands, and subsurface stygofauna to name a few. Altered connectivity between GDEs and groundwater results in ecosystem decline (e.g., Gnangara Mound, Western Australia (Froend et al., 2004;Bekesi et al., 2009), or Ida Bay karst system, Tasmania (Boulton, Humphreys, & Eberhard, 2010), leading to irreversible change over extended periods, with ecosystem components succeeded and often by opportunistic invasive weed species (Doody & Benyon, 2011b;van Wilgen et al., 2008).
Despite their hydrological and ecological significance, consideration of GDEs in water management planning is hampered by the challenges in identifying and monitoring GDEs. Eamus et al. (2015) and Doody et al. (2017) provide an overview of various GDE identification methods. Although accurate, field detection is often expensive and localized, limiting its use in regional or national studies. Efficient remote sensing mapping techniques are therefore actively sought. Barron et al. (2012) map GDEs at 25-m resolution using Landsat band ratios; Howard and Merrifield (2010) and Klausmeyer et al. (2018) take a multiple lines of evidence approach in California, as does Doody et al. (2017) to provide national GDE mapping in Australia, combining . Additionally, Gow et al. (2016) addresses a significant knowledge gap, by taking a "model-data" approach using land surface temperature (1-km resolution) to map GDEs and provide daily temporal groundwater connectivity. These methods, however, often require a high degree of user interaction, represent only a single point in time, have low temporal or spatial resolution, and/or are limited by cloud cover. The latter is especially an issue in low latitudes, where tropical monsoons may occur. There is thus a paucity in efficient and repeatable remote sensing methods to support GDE mapping and monitoring at finer temporal and spatial scales.
Globally acquiring SAR satellites opens the door to potential applications in GDE monitoring, and its use as a monitoring tool warrants investigation. The period 2014-2018 marks a turning point in SAR science with automated and global acquisition beginning in 2014 with the launch of Sentinel-1A satellite, followed by its synchronous twinsatellite Sentinel-1B in 2016. Although this is a short period of image availability compared with the global imagery from multispectral (MS) platforms, available since the early 1970s, SAR offers the advantage over MS of global day-and-night sensing capability, which is relatively insensitive to cloud cover. This makes Sentinel SAR data particularly suitable for monitoring global changes in the structure and composition of the Earth's surface (Hornacek et al., 2012). SAR, by design, greatly differs from MS remote sensing; it is an active sensing system using longer electromagnetic (EM) waves, with wavelengths in the range 3-24 cm. The letters X, C, S, and L (in order of wavelength) refer to the most commonly used wavelengths. Longer EM waves penetrate deeper into vegetation and soil. They are also less sensitive to phase decorrelation (e.g., due to changes in surface structure). Several authors compared and/or combined wavelengths for different applications, such as inferring displacement from InSAR (Samsonov & d'Oreye, 2012) or characterisation of vegetation cover (Khosravi et al., 2017;Mohammadimanesh et al., 2018). As of 2019, global and free SAR imagery datasets are only available in C-band, which corresponds to a~5.6-cm wavelength and provides sufficient capabilities for a wide array of applications.
The integration of SAR imagery into hydrology and hydrogeology largely lags behind MS imagery, from which several routinely used indices, such as the normalized difference vegetation index (NDVI), are derived. This is mostly due to challenges in understanding the complex driving mechanisms related to SAR waves and its scattering in both space and time, which often is counterintuitive (Freeman & Durden, 1998;Sabel et al., 2012;Ulaby et al., 1990). Additional limiting factors include processing complexity (which often includes numerous intermediary steps), data storage, and considerable CPU/GPU computing needs. Several open-access tools, such as the Sentinel-1 toolbox (S1TBX), are streamlining processing for nonspecialists by providing processing libraries usable within a Python environment.
Products derived from SAR imagery can be divided into two categories: SAR intensity and InSAR ( Figure 1). The intensity products are derived by extracting information related only to the strength of the back-scattering signal, referred to as amplitude or intensity. InSAR products are derived by interpreting differences in the phase signal (in addition to intensity) between two acquisitions acquired from the same orbital position at different times. The former can be performed with images acquired from different acquisition geometries after proper line-of-sight (LOS) and terrain corrections. The latter can only be performed with stable acquisition geometries in a repeat path orbital configuration.
In this study, we explore the value of SAR imagery derived from a globally acquiring SAR system (Sentinel-1) to map evergreen vegetation GDEs in Australia by (a) observing the stability of the vegetation structure, (b) inferring vegetation-groundwater interactions, and (c) differentiating GDEs from other ecosystems. Towards these objectives, we explore three out of the four SAR-derived data-cubes described in Figure 1, that is, two intensity and InSAR coherence timeseries, and discuss their spatial and temporal patterns in relation to current knowledge of GDE occurrence. From there, we test a combination of summary statistics, concentrating on the information embedded in SAR imagery to produce an operational GDE mapping index, which is deployable as an annual monitoring tool.

| SAR SENSING AND VEGETATION RESPONSES
The SAR active antenna sends an EM wave to Earth along its orbital path. The measured response is the strength of the back-scattering signal bouncing back to the satellite along the same angular plane, producing one or two "like-polarized" images: horizontal-horizontal (HH) and/or vertical-vertical (VV) bands. Some sensors can also record the signal returned along a perpendicular angular plane, producing "crosspolarized" images: horizontal-vertical (HV) or vertical-horizontal (VH) bands. Current orbital SAR systems are single-polarized (i.e., are able to produce either HH or VV images, not both), dual-polarized (HH and HV, VV and VH, or HH and VV), alternating (HH and HV, alternating with VV and VH), or fully polarimetric (all bands acquired synchronously: HH, VV, HV, and VH). Currently, global and automatic acquiring SAR systems like Sentinel-1 are dual-polarized, acquiring simultaneously VV and VH bands. The application presented here can be potentially replicated using data from any dual-polarized C-band SAR system.
A variety of scattering mechanisms occur once the SAR EM waves reach vegetation cover ( Figure 2). The measured value of a SAR intensity pixel is the sum of contribution from all scattering mechanisms, F I G U R E 1 Typical products derived from repeat-path SAR imagery. Here, a dual-polarized SAR sensor such as Sentinel-1 is presented as an exemple. "Intermediary steps" refers typically to radiometric calibration, filtering, radiometric terrain correction, and normalization for the creation of SAR intensity products, or to temporal phase-shift signal corrections and interpretations (e.g., phase to displacement inversion) for InSAR products thus limiting our ability to precisely discriminate between them. However, they are not affecting like-polarized bands (here, VV) and crosspolarized bands (here, VH) similarly. Generally speaking, the more bands acquired by a system, the more information is available to feed scattering models, infer underlying scattering mechanisms, and understand their relative importance (Freeman & Durden, 1998;Ulaby et al., 1990). Note that in all cases, only partial polarimetric information is available; hence, explanatory models of SAR scattering are always simplifying the true complexity.
The like-polarized bands (VV or HH) gather information about the signal that was not angularly shifted by the ground target. Over vegetation, this is the sum of three main scattering mechanisms. Penetration of the SAR signal through vegetation is highly dependent on wavelength (Figure 2c; L-band penetrates vegetation deeper than X-band). Consequently, the relative contribution of each scattering mechanism is highly related to SAR system design. For example, over vegetation, L-band produces double-bounce scattering return more frequently than X-band. Shorter wavelength SAR is dominated by volume scattering mechanisms more rapidly than longer wavelength SAR, as vegetation height and density increases. Shorter wavelengths also produce stronger direct scattering over vegetation as the signal does not penetrate the canopy as much as longer wavelengths. Interferometric coherence (Zebker & Villasenor, 1992) is a measure of the variance of phase between two acquisitions, usually subsequent or close in time and acquired from a similar orbital position. It is linked to reflective ground structure similarity between the two images. Coherence is mathematically represented as a correlation coefficient ranging from 0 to 1. Low coherence (below 0.3/1) occurs over vegetated areas, where the phase signal is unstable due to wind and plant growth affecting the vegetation volume structure ( Figure 2b). This explains why areas dominated by volumetric scattering mechanisms (the higher values in VH or HV "cross-polarized" bands) usually have low InSAR coherence (below 0.2/1). High coherence (0.5/1 and higher) occurs over stable scattering surfaces, such as rock outcrops, buildings, or dry bare soils. Typically, coherence is low and temporally stable over persistent vegetation. Contrastingly, it varies in space and time if vegetation density varies. For example, the highly coherent double-bounce scattering mechanism (Figure 2a) F I G U R E 2 Simplified SAR signal scattering over vegetation. (a) Main mechanisms influencing VV or HH like-polarized bands (1, canopy-only direct scattering; 2, soil-trunk or trunk-soil double-bounce scattering; and 3, soil-only scattering); (b) volumetric scattering dominating HV or VH cross-polarized bands; and (c) the correlation between SAR wavelength and signal penetration into the vegetation (Ulaby et al., 1990) increases as leaf and branch density decreases during leaf abscission caused by tree stress or deciduous periods. verify if a single year of data was sufficient to identify GDEs. For all data-cubes, the time-series consist of images acquired along the same track and LOS angle, which facilitates the interpretation of SAR intensity change over time (i.e., no significant LOS change to take into account or compensate for). All information extracted from SAR imagery are structured into coregistered stacks, where every pixel of each SAR acquisition are projected to the exact same ground footprint, forming a 3D data matrix Space * Space * Time, referred to as data-cube.

| Processing
Sentinel-1 interferometric wide images were processed using SARSCAPE 5.5. The three swaths inherent to Sentinel-1 TOPSAR acquisition strategy are merged to produce a stripmap-like SAR image (i.e., a more typical format of SAR image), followed by multilooking (i.e., decrease in resolution) with a factor 8/2 in Range/Azimuth to produce a regular matrix grid at~30-m resolution. This reduces data size and granular noise ("speckle") inherent to SAR data. All images were coregistered and spatio-temporally filtered (De Grandi et al., 1997) to reduce noise. Images are then converted into normalized backscattering coefficients following a Gamma Nought calibration (correction for local incidence angle variations), and projected along a linear scale. Coherence matrices are computed at the same resolution as the intensity matrices (30 m) and with a 5 × 5 pixel analysis window. They were produced in a "time-line" process, where one coherence matrix is produced per image by matching with the subsequent image. All data-cubes are similarly coregistered to facilitate comparison and interpretation. After exploring the data-cubes, they are normalized with the mean value set to 10 and standard deviation to 1.

This spreads the values over an easily interpretable value domain
ensuring that the combined statistical indexes have equal weighting within the final index.

| Data-cube exploration and visualization
The challenge in exploratory data analysis of high-dimensional spatiotemporal data-cubes is to find areas with similar temporal behaviour.
Linking these spatio-temporal patterns with other spatially varying data, such as vegetation, or temporally varying data, such as rainfall, helps to generate hypotheses on the underlying process. Traditionally,

F I G U R E 3 (a) Location of Sites 1 and 2. (b,c) Footprints of the 90
Sentinel-1 scenes used in this study. 30 subsequent scenes are used for Site 1, and 60 subsequent scenes (2 × 30) were used for Site 2. Three spatial subsets are selected for detailed analysis related to data-cube exploration (SUB3) or index performance evaluation (SUB1 and SUB2) this is achieved by examining temporal summary statistics (mean, coefficient of variation, and linear trend), unsupervised classification, or dimensionality reduction, which includes the commonly applied principal component analysis.
This study builds on the principle of dimensionality reduction using a non-linear machine learning technique: the Generative Topographic Mapping (GTM; Bishop et al., 1998), to project the n-dimensional time-series onto a two-dimensional plane (map). This projection aims to preserve dataset topology so that similar time-series data plot closer together in the 2D colour plane, whereas dissimilar time-series plot further apart. GTM has an advantage over principal component analysis in that it can account for non-linear structure in the dataset, and it has an efficiency advantage over other non-linear dimensionality reduction techniques, such as multidimensional scaling, in that the computational load for GTM scales with the number of data vectors (i. e., pixels), not the square of the number of data vectors (Gisbrecht & Hammer, 2015).
Each time-series is assigned a colour, based on its position in the 2D colour plane. This colour is chosen from a perceptually uniform colour scheme (HSL UV , www.hsluv.org). This colour scheme is designed so that the perceived colour difference between two points is proportional to the distance in the 2D colour plane and thus the difference in time-series behaviour.

| OBSERVING GDES THROUGH SAR-DERIVED DATA-CUBES
In this section, we explore the three data-cubes and compare the map  Figure 4D. Similar visualisations are presented for the VV and VH data-cubes in Supporting Information ( Figures S1 and S2).
GDEs closer to floodplains ( Figure 4C) show temporal patterns in the InSAR coherence data-cube represented by the (b1) cluster, whereas GDEs along ephemeral streams are mostly represented in (d3) and (d4). The former shows little seasonal pattern and low coherence, and the latter shows seasonal patterns less pronounced than surrounding non-GDEs (e.g., a4) occurring further away from the floodplain. GDEs, in comparison with other surface cover, can therefore be identified by their low temporal average and high temporal stability of coherence, indicating vegetation persistence throughout the year (Figure 4). Presence of similar temporal patterns outside of GDEs also suggests that InSAR coherence is not sufficiently discriminating to constitute an efficient GDE index by itself.
VV and VH data-cubes are likely to add relevant, GDE-discriminating, information (Figures S1 and S2). The VV data-cube visualization presents GDEs in very different clusters, in both the a1-a2-a3-a4 domain and in c3-c4 ( Figure S1). GDEs present a spatially variable degree of seasonal variation in the VV data-cube. Contrastingly, the VH data-cube yields alternate results ( Figure S2) where GDEs are represented by nearly similar clusters: c4 and d4. It corresponds to smoother temporal curves in comparison with most other vegetation covers present in this data-cube, as represented by the different clusters and their allocated colours.
The contrast between the VV and VH data-cubes is explained by a higher concentration of volume scattering within the VH bands than within the VV bands. GDE environments are often mixed with non-GDE perennial plants that are shallow-rooted and temporally unstable. They potentially have a stronger seasonal influence on the soil-trunk or trunk-soil (double-bounce) scattering than on the canopy-only (volumetric) scattering, the latter not going through any type of ground interaction. Consequently, it is reasonable to assume the VH bands elucidates more GDE-discriminating information than the VV bands. Within the VH data-cube, GDEs are mainly characterized by their lower seasonal amplitude than non-GDEs, which would translate into a relatively low temporal standard deviation.

| DERIVING A GDE INDEX FROM SAR IMAGERY
As demonstrated, GDEs have (a) low InSAR coherence, which is typical for dense vegetation, as leaves and branches are moving targets producing temporally unstable scattering mechanisms; (b) limited seasonal changes of InSAR coherence, as vegetation structure is expected to be stable in time and potentially more stable than non-GDE vegetation during droughts. In fact, some non-GDEs would either grow at an uneven rate throughout the year or would shed leaves during drier months. This would modify the relative proportions of scattering mechanisms, increasing the strong and phase-coherent double-bounce scattering and, as a consequence, InSAR coherence; (c) a relatively limited seasonal change in VH band intensity, reflecting low temporal variability of canopy volume.
From three statistical parameters containing deterministic information about GDEs (Figure 5), it is apparent that both coherence mean and standard deviation (Figure 5a-d) yield significant information on GDE occurrence but also contain False Positives (FPs), where non-GDEs also appear dark-grey or black. The VH band standard deviation (Figure 5e,f) also contains GDE information but also produces numerous FPs. It also provides complementary information to InSAR coherence over both sites (Figure 5e,f), where the coherence data-cube fails in discriminating GDEs surrounding the floodplain/wetland from F I G U R E 4 InSAR coherence datacube exploration and visualization. The time-series of normalized coherence are grouped according to similarity and organized in 16 groups to visualize transitions between endmember timeseries behaviour (d). The colour coding of the time-series is designed to be perceptually uniform, such that a perceived difference in colour corresponds to the difference between the coherence time-series. The pixels in A, B, and C are coloured according to the colour associated with the coherence time series in D. The black contours in A, B, and C are GDE as identified in the GDE atlas (Doody et al., 2017). GDEs along perennial streams appear to coincide with pink colours (corresponding to subplot D-b1), whereas GDEs along ephemeral streams are associated with green colours (corresponding to subplots D-d3 and D-d4). Note that the axis in D are not labelled to minimize visual clustering. Time is on x-axis of each subplot and spans 1 year, from January 8, 2017, to December 22, 2017, which correspond to~30 repeat cycles of Sentinel-1. Normalized coherence is on the y-axis, spanning the 0-1 interval the floodplain/wetland itself (Figure 5a-d). This suggests that combining these parameters could potentially help concentrating GDE information, with one parameter potentially masking FPs from the others.
In order to discriminate GDEs from non-GDEs, an index combining these three observations and concentrating the information contained within the two notably relevant data-cubes, VH and InSAR coherence, is proposed. Hence, Version 1 of SARGDE is defined through Equation (1). Note that all three parameters are normalized (mean = 10; SD = 1) across each scene so they contribute equally to the index.
where μ represents the annual mean, σ the annual Standard Deviation (SD), vh the VH intensity data-cube, and cc the InSAR coherence.
SARGDE will be high when all three parameters (σ cc ,σ vh ,μ cc ) are low, whereas the index will be low when any of the parameters is high.
Hence the index is designed to be high where the potential for GDE occurrence is high. The index is tested over spatial samples from the MG and WK study areas ( Figures 5 and 6), normalized (mean = 10, SD = 1), and compared with the GDE Atlas (Doody et al., 2017 The degree of similarity between SARGDE results and the GDE Atlas is stronger for the MG site (Site 1; Figure 6) than for the WK site (Site 2; Figure 7). In fact, the FP pixel percentage decreases significantly before the true positive pixel percentage starts to decrease.
Multiple factors could explain this. First, spatial patterns of GDEs in the WK area are sinuous as they are usually confined to river banks and narrow ephemeral streams (e.g., Figure 4C), making the pixel-perpixel comparison (Figure 7c) more sensitive to the resolution difference between the two methods (Figure 7b,d). Second, the frequent cloud-cover occurring during monsoon season in WK site limits the accuracy of NDVI and evapotranspiration models on which the GDE Atlas relies.

| INTERPRETING SAR-DERIVED LARGE-SCALE GDE MAPS
Observing GDEs over large scales provides insights over hydrogeologic, climatic, geomorphologic, and lithologic features ( Figure 8). Lithological boundaries related to coastal sediment deposition episodes provide a strong spatial control over aquifer transmissivity, depth to groundwater static levels, and as a consequence, over GDE occurrence (Arrow 1 in Figure 8). Lakes formed by collapse of ground voids (sinkholes) created by hydro-chemical erosion of the F I G U R E 5 Statistical parameters proposed for GDE discrimination over two subsets of SUB2 (a,c,e) and SUB1 (Bool Lagoon; b,d,f) shown on Figure 3. (a,b) Temporal average of InSAR coherence, (c,d) standard deviation of InSAR coherence, and (e,f) standard deviation of VH intensity data-cube. GDE spatial distribution presented in the GDE Atlas (Doody et al., 2017) is overlaid with red contours. Note that the standard deviation of the VH intensity band in Bool Lagoon (f) yields contrasting results compared with the coherence information (b,c). More information is given in Supporting Information, section 2 unconfined/surficial calcareous aquifer occur throughout Site 1 ( Figure 8a). These surface and subsurface karstic features provide easier access to groundwater for vegetation (Doody & Benyon, 2011a) and, as such, are often surrounded by GDEs (Arrow 2). When groundwater levels are close to the ground surface and groundwater supply is sufficient, GDEs can be part of a wetland system with strong groundwater/surface-water interactions (Arrow 3). Particularly sharp contrasts can be noted in Figure 8a (e.g., Arrow 4) as GDEs often occur in agricultural areas. The latter have a lower SARGDE v1 value due to high seasonal variability of the SAR response in all source data-cubes. Site 2 shows contrasting spatial features. GDEs mainly occur along floodplains (Arrows 5 and 6) or ephemeral streams (Arrows 6, 7, and 8). An interesting observation can be drawn from the other end of the SARGDE v1 value spectrum (from 8 to 10). Important spatial variations occur within the non-GDE vegetation across Site 2 (Figure 8b), particularly areas more than 100 km inland (Arrow 9). This confirms the increasing contrasts in temporal SAR response between GDEs and non-GDEs as precipitation rates decrease going inland. A clear North-South gradient occurs across Site 2 ( Figure 8b); however, it is uncertain if vegetation from the northern part of this area is structurally stable because of its groundwater dependence or the temporally smoother precipitation patterns providing sufficient water to the vegetation throughout the year. However, this is also observed in fluxbased GDE maps (e.g., figure 7b in Doody et al., 2017), confirming the classification of these ecosystems as GDEs. This work focuses on Australian GDEs; hence, more work is needed to adapt it to non-Australian vegetation, particularly for deciduous regions affected by structural changes unrelated to water availability. Even though our analysis shows the effectiveness of SARGDE v1 , further work is required to fully understand its advantages and limitations. First, SARGDE v1 should be compared with direct in situ GDE mapping across a wide range of vegetation types and climates. Second, the index should be tested for its ability to F I G U R E 6 (a) SARGDE v1 index values over the SUB1 area (see Figure 3). (b) GDE map derived from the GDE Atlas. Permanent water bodies are delimited using an arbitrary thresholding of the minimum VV intensity map derived from the VV data-cube. F I G U R E 8 SARGDE v1 over Sites 1 and 2 ( Figure 3). Large-scale observation of GDE using SARGDE v1 allows to identify lithological boundaries controlling aquifer transmissivity and groundwater depth from ground surface (a -black arrows). GDE occurs throughout the landscape over the MG area (a), whereas they are mostly confined to floodplains and ephemeral stream banks across the WK area (b). Black arrows are highlighting GDE features described in the main text F I G U R E 7 Idem as Figure 5 for a spatial subset of the WK area: SUB2 (see Figure 3). The optimized threshold for this area is set at Tr = 10.92 (i.e., mean + 0.92 SD) and is shown in green on (a) and (c). The resulting GDE (d) map is 89% identical to the Global GDE Atlas (b) on a pixelper-pixel comparison comprising both GDEs and other pixel groups