In the Arctic, surface hydrology plays an important role in controlling plant community composition and ecosystem processes such as land-atmosphere carbon and energy balance. Investigating how climate change in this region will affect surface hydrology and subsequent biotic, atmospheric, and climatic feedbacks could be key to understanding the future state of the Arctic and Earth systems. Improved methods for monitoring surface hydrology at large spatial scales are needed in the Arctic. Near Barrow, Alaska, a large-scale experiment with flooded, drained, and control treatment areas, each exceeding 9 ha, was initiated during summer 2008 following 3 years of monitoring under nonmanipulative conditions. Throughout the 2008 growing season, hyperspectral reflectance data were collected in the visible to near-infrared (IR) range using a 300 m long robotic tram system. Water table depth, surface water depth, and percent surface water cover were also measured. A spectral index (Normalized Difference Surface Water Index (NDSWI)) was developed using reflectance in the IR region (R1000 strong absorbance) and blue region (R460 poor absorbance). NDSWI was strongly correlated with both surface water depth and surface water cover, and was used to monitor spatial and temporal patterns of surface hydrology in the experimental treatment. Using 2002 and 2008 Quickbird satellite imagery, the index was also used to examine differences in NDSWI between experimental treatments. Using this approach, we demonstrate that the flooded treatment was significantly different from the other two treatments (drained and control) and that the new index can be used to monitor surface hydrology in arctic wetlands.
The degree to which changes in land-atmosphere carbon exchange dynamics will interact with and offset the balance and stability of the substantial store of soil organic carbon in the Arctic [Tarnocai et al., 2009; Ping et al., 2008] is a primary concern [McGuire et al., 2009]. If the increased primary productivity predicted for the Arctic does not balance the net loss of CO2-equivalent carbon to the atmosphere, a positive feedback to the regional and potentially the global climate system will occur [Kimball et al., 2006]. This positive feedback response to warming in the Arctic is likely to be controlled by soil moisture and surface hydrology [McGuire et al., 2006; Huemmrich et al., 2010], so there is a need to monitor these parameters and determine how they are changing over time and space to better understand the future state of the arctic system. Substantial efforts before, during, and now following the 2007–2009 International Polar Year have focused on improving environmental observing capacities in the Arctic, particularly with respect to the future fate and transport of the Arctic soil organic carbon store [Arctic Observing Network, 2006]. Such efforts highlight the importance of observations that span plot to global scales. Because of the many advantages remote sensing offers to such scaling challenges, the development of remote sensing approaches and technologies in an integrated arctic observing network is a key priority.
Several remote sensing methods have been developed for monitoring surface hydrology or soil moisture over large spatial scales. To assess the state of surface hydrology in terrestrial ecosystems, remote sensing has been used to monitor inundation area [Smith, 1997; Smith et al., 2005], snowpack [Green et al., 2006], and vegetation water content [Roberts et al., 1997; Ustin et al., 1998; Serrano et al., 2000]. Radar remote sensing using microwave electromagnetic radiation has been used successfully for assessing soil moisture status in some arctic landscapes [Kasischke et al., 2009; Meade et al., 1999]. While radar is useful for estimating inundation area, the return signal is easily confounded by wind or the presence of vegetation that can alter the backscatter properties [Smith, 1997]. Light Detection and Ranging (LiDAR) remote sensing has also been used for habitat mapping [Wedding et al., 2008], characterizing coral reef structure [Storlazzi et al., 2003; Brock et al., 2004, 2006], and mapping the bathymetry of coastal regions [Irish and White, 1998]. Optical remote sensing methods (using visible or near-infrared wavelengths) can penetrate water bodies, and have been applied to bathymetry [Brando et al., 2009; Legleiter and Roberts, 2009], seagrass mapping [Phinn et al., 2008], and coral reef assessment [Hochberg et al., 2003]. Additionally, optical remote sensing has often been used to evaluate vegetation moisture content through reflectance indices based on water absorption bands [Sims and Gamon, 2003]. Examples of the latter include the Water Index (sometimes called the Water Band Index) based on reflectance at the 970 nm water band [Peñuelas et al., 1993], and the Normalized Difference Water Index [Gao, 1996] based on the 1240 nm water band.
Some methods apply a form of Beer's law to fit the water absorption coefficient spectrum to a reflectance spectrum, yielding an “Equivalent Water Thickness” (EWT) [Gao and Goetz, 1995; Roberts et al., 1997; Sims and Gamon, 2003; Green et al., 2006]. Based on the assumption of Beer's law, which states that reflected or transmitted light is related to the exponent of the amount of an absorbing compound, EWT offers a physically based alternative to other index based methods for assessing vegetation water content [Gao and Goetz, 1995]. To the extent that water bodies or vegetation behave according to Beer's law, and to the extent that calculation of EWT is based on a full spectrum (multiple bands rather than two bands), it seems that EWT should offer a superior method of assessing moisture status, but not all studies have supported this conclusion [Serrano et al., 2000; Sims and Gamon, 2003]. Consequently, the “best” method for retrieving surface hydrology information with optical remote sensing remains an open question and a key challenge.
Most of the optical remote sensing methods for assessing surface moisture properties utilize one or more water absorption features present in reflectance spectra [Green et al., 2006]. Because these methods rely on water absorption bands, they readily saturate and are easily confounded by atmospheric water vapor, whose absorption bands overlap with liquid water features [Sims and Gamon, 2003; Green et al., 2006]. These issues create practical problems for quantitative retrieval of surface moisture measurements, particularly in areas with high atmospheric moisture such as the Arctic. Additionally, instrument limitations (e.g., low signal to noise) and artifacts (e.g., “second-order” or “stray light” errors) in certain spectral regions (e.g., the 970 nm water band) often confound a clear interpretation of water signals, particularly with silicon photodiode detectors [Sanchez-Azofeifa et al., 2009].
 To our knowledge, optical remote sensing has not been applied to assess surface water depth, a sensitive indicator of surface hydrology and soil moisture in the Arctic, especially on the coastal plain of northern Alaska. One goal of this study was to develop a spectral index from optical remote sensing that could be used to estimate surface water depth and surface water cover, and to characterize changes in surface hydrological properties at multiple spatial and temporal scales. To maximize the prospects for scaling, we specifically sought an index that could be least confounded by atmospheric moisture, would be relatively free from instrument errors, and would be commonly available from ground-based, airborne or satellite-borne instruments. Another key goal of this study was to evaluate the treatment effect associated with the Biocomplexity experiment, a large-scale flooding and draining manipulation conducted in northern Alaska near the city of Barrow to assess the impact of altered soil moisture status on land-atmosphere carbon, water and energy balance. Like other large-scale experimental manipulations, the Biocomplexity experiment was unreplicated and required the development of new environmental metrics to capture changing surface properties across a large treatment area. Our goal was to develop an index that could be used to evaluate hydrological treatment effects across a heterogeneous, dynamic landscape, and to evaluate discrepancies between various sampling “footprints” (flux towers, optical sampling transects, and the entire treatment basin).
2.1. Study Area
 The study was conducted within the Biocomplexity flooding and draining experiment on the Barrow Environmental Observatory (BEO) near Barrow, Alaska, 71°17′01″N, 156°35′48″W (Figure 1). The BEO is situated on the Alaskan Arctic Coastal Plain and has a low relief with an average elevation of 4 m [Aguirre et al., 2008]. Seventy two percent of the landscape near Barrow contains oriented lakes, vegetated drained thaw-lake basins and small ponds [Hinkel et al., 2003]. The Biocomplexity experimental area is located in a series of three coalesced drained thaw-lake basins that include moist and wet tundra vegetation dominated by graminoid tundra (C. E. Tweedie et al., Land cover classification and change detection near Barrow, Alaska using QuickBird satellite imagery, submitted to Arctic, Antarctic, and Alpine Research, 2010). The study site is underlain by continuous permafrost and includes thermokarst terrain typical of the Alaskan Arctic Coastal Plain [Brown et al., 1980], as well as thaw lakes, high- and low-centered polygons, shallow ponds and lakes, and a shallow active layer that is generally less than 50 cm in the experimental basin [Shiklomanov et al., 2010]. Soils of the area are described by Bockheim et al.  and include cryoturbated gelisols, specifically typic aquorthels with high soil moisture content, histoturbels, and aquaturbels. The upper layer of this soil consists of carbon rich peat (ca. 50 kg/C/m3) [Bockheim et al., 1999]. Soils are generally moisture rich due to shallow drainage gradients, relatively low rates of evapotranspiration, and impeded drainage caused by ice-rich continuous permafrost [Bockheim et al., 1999; Miller et al., 1998].
 Winter is generally long, dry and cold and the summers are relatively short, moist and cool [Brown et al., 1980]. The sun is above the horizon continuously from 10 May to 2 August, and below the horizon from 18 November to 24 January [Brown et al., 1980]. Air temperature remains below freezing for 9 months of the year and can fall below freezing during any of the 3 summer months. A gradual warming trend begins in April, and snowmelt typically occurs in early June. Wind speed varies little during the year, averaging 5.3 m/s, with the fall months being the windiest. Fog and low cloud persist throughout the summer and the relative humidity generally exceeds 80% from June through September. Mean June–August precipitation is 58.4 mm [Engstrom et al., 2008].
2.2. Experimental Infrastructure
 As part of the Biocomplexity experiment, water tables were manipulated in a vegetated thaw-lake basin to investigate the impact of variation in soil moisture on land-atmosphere carbon, water and energy balance. In 2008 (the year of this study), following 3 years of baseline measurements (2005–2007), the lake basin was divided into three treatments, a flooded section (+10 cm water table depth), a drained section (−10 cm water table depth), and a control section where the water table was maintained relative to water levels outside the manipulation area (Figure 1b). The experimental infrastructure established for this site includes a robotic tram system similar to that described by Gamon et al. . The tram system consists of three 300 m long transects (“tramlines”) with one tramline located in each of the three treatment areas. Each tramline spanned the entire width of the lakebed and was oriented east–west to avoid midday shading of the south side (the sampling footprint of the tramline). An eddy covariance flux tower designed to measure trace gas flux for each of the treatment areas [Zona et al., 2009] was situated adjacent to each of the tramlines (Figure 1b). A range of other collaborative studies were conducted throughout the experimental area, including several featured in this special issue [Olivas et al., 2010; Shiklomanov et al., 2010]. The infrastructure established for the Biocomplexity experiment provided an ideal research platform for this study as it established the capacity to repeatedly assess the surface spectral properties of the same land cover type affected by contrasting surface hydrology regimes throughout the snow-free period.
2.3. Reflectance Measurements
 Field reflectance data used in this study were collected from early June 2008 to late August 2008. This period spanned the majority of the 2008 snow-free period for the study site. Spectral data were obtained using a dual-detector field portable spectrometer (Unispec DC, PP Systems, Amesbury, MA, USA), which collects radiance (radiation from the target) and irradiance (radiation from the sky) simultaneously, thereby permitting correction of surface reflectance under varying sky conditions [Gamon et al., 2006]. The two detectors were cross-calibrated using a white panel with 99% reflectance (Spectralon, Labsphere, North Sutton, NH, USA) at the beginning and at the end of each set of measurements along each tramline. Our Unispec-DC had a nominal range of operation between 303 and 1148 nm in 256 contiguous bands with a spectral resolution of approximately 3 nm and a full width at half maximum of approximately 10 nm. The optimal range of this detector (range with reasonable signal to noise) is approximately 400–1000 nm, which was used to limit the spectral range used in most subsequent analyses.
 Along each tramline, reflectance data were collected by placing the field spectrometer on a semiautonomous robotic cart [Gamon et al., 2006], which traveled along each of the 300 m long tramlines in a west–east direction (Figure 2). The cart speed was set to approximately 1 m every 5 s, allowing each 300 m long tramline to be sampled in approximately 25 min. A mechanical switch mounted on the base of the robotic cart was triggered when the cart passed over crossbars situated at every meter along each tramline. This activated the spectrometer to make a measurement. Three hundred spectral measurements were made along each tramline every time the robotic cart was operated. Each tramline was sampled up to three times per week if rain and mist did not prevail. The downward looking foreoptic was positioned at approximately 3 m above the ground and provided a field of view of approximately 20 degrees from the tip of the foreoptic. This equated to approximately a 1 m diameter sampling footprint at ground level (Figure 2). By repeating measurements throughout the snow-free period, the tram system enabled us to obtain a spatially explicit time series of surface reflectance of the same land cover type under different surface hydrology regimes. Reflectance was calculated as the ratio of the two channels, corrected by the mean cross-calibration spectrum measured for each run of the cart along each tramline (see Gamon et al.  for details). Reflectance data were processed using the software Multispec (Version 5.1, available at http://specnet.info), which calculated interpolated reflectance at 1 nm intervals between 303 and 1148 nm.
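The sampling geometry and timing quoted above can be checked with a short calculation. The sketch below assumes that the stated 20 degree field of view is the full cone angle of a nadir-viewing foreoptic and that the ground is flat; the function names are illustrative and do not come from the study.

```python
import math

def footprint_diameter(height_m: float, fov_deg: float) -> float:
    """Diameter of the circular ground footprint of a nadir-viewing foreoptic.

    Assumes fov_deg is the full cone angle and the ground is flat.
    """
    return 2.0 * height_m * math.tan(math.radians(fov_deg / 2.0))

def transect_minutes(length_m: float, seconds_per_meter: float) -> float:
    """Time needed to traverse a transect at a constant cart speed."""
    return length_m * seconds_per_meter / 60.0

# Values quoted in the text: 3 m height, 20 degree field of view,
# 300 m tramline, roughly 5 s per meter.
diameter = footprint_diameter(3.0, 20.0)
minutes = transect_minutes(300.0, 5.0)
print(f"{diameter:.2f} m footprint, {minutes:.0f} min per tramline")
```

Under these assumptions the footprint comes out at about 1.06 m and the traverse at 25 min, consistent with the approximate figures given in the text.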
2.4. Surface Hydrology
Water table depth was measured manually every 10 m along each tramline every time spectral measurements were made, using a method similar to that described by Olivas et al. . Perforated PVC tubing with a diameter of 4 cm was placed in holes drilled into the tundra, leaving approximately 20 cm of tubing above ground level. A ruler was used to measure the height of the water table relative to ground level. Water table depth was recorded as negative when the water level was below the surface and as positive when it was above the surface. In cases of below ground water (negative water table depth values), surface water depth was assigned a value of zero; consequently, surface water depth simply refers to positive water table depth values (i.e., areas of visible standing water).
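The sign convention described above amounts to clipping water table depth at zero. A minimal sketch, using made-up readings in centimeters:

```python
def surface_water_depth(water_table_depth_cm: float) -> float:
    """Surface water depth is the positive part of water table depth.

    Negative values (water below ground level) map to zero standing water.
    """
    return max(water_table_depth_cm, 0.0)

# Hypothetical well readings (cm relative to ground level), for illustration only.
readings_cm = [-12.0, -3.5, 0.0, 4.0, 9.5]
depths = [surface_water_depth(w) for w in readings_cm]
print(depths)
```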
 Percent surface water cover was determined by running supervised classifications of digital color photographs using image processing software (ENVI, Version 4.2, ITT Visual Information Solutions, Boulder, CO, USA). Photos of each tramline footprint (matching the optical sampling areas) were acquired on 18 July 2008, using a digital camera (Coolpix 5400, Nikon) that was mounted on the boom of the robotic cart and triggered manually using an electronic shutter cable. The photo locations were also adjacent to water table depth measurements (i.e., every 10 m along each of the three tramlines; n = 90), allowing direct comparison with water table depth and surface water depth measurements.
2.5. Reflectance and Surface Hydrology
 The development of a spectral index suitable for characterizing surface hydrology (water table depth, surface water cover and surface water depth) required the assessment of spectral sensitivity to surface water. To do this, reflectance was measured at separate locations along the tramline having different surface water cover, water table depth, and surface water depth. Additionally, reflectance at every 10 nm from 400 to 1000 nm was regressed with surface water cover, water table depth and surface water depth collected every 10 m along the tram line on 18 July 2008. Both linear and logarithmic regressions were run to determine the best fit for each.
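The band-by-band screening described above could be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the text does not say which variable was log transformed, so the sketch takes the "logarithmic" model to be a linear fit against the natural log of reflectance (water table depth can be negative and so cannot itself be logged). All data values are invented.

```python
import math

def r_squared(x, y):
    """R^2 of the least-squares line y = a + b*x (squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)

def compare_fits(reflectance, water_metric):
    """Return (linear R^2, logarithmic R^2) for one wavelength."""
    linear = r_squared(reflectance, water_metric)
    logarithmic = r_squared([math.log(r) for r in reflectance], water_metric)
    return linear, logarithmic

# Hypothetical reflectance at four stations for two wavelengths (nm).
reflectance_by_band = {
    460: [0.030, 0.032, 0.029, 0.031],
    1000: [0.05, 0.10, 0.20, 0.40],
}
# Hypothetical water table depth (cm), constructed to be exactly
# log-linear in the 1000 nm reflectance.
water_table_cm = [-10.0 * math.log(r) - 25.0 for r in reflectance_by_band[1000]]

fits = {band: compare_fits(values, water_table_cm)
        for band, values in reflectance_by_band.items()}
for band, (linear, logarithmic) in fits.items():
    print(band, round(linear, 3), round(logarithmic, 3))
```

In practice the same loop would run over every 10 nm band from 400 to 1000 nm and over each of the three water metrics.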
 We used the regression equations from the analysis described above to derive two versions of a surface water index, the “Normalized Difference Surface Water Index” (NDSWI). The first version of NDSWI calculated the normalized difference of reflectance at 460 nm and 1000 nm (“NDSWI-linear,” equation (1) below). The second version of NDSWI used a similar equation and the same reflectance bands but required reflectance to be log transformed (“NDSWI-log,” equation (2) below). The relative success of NDSWI-linear and NDSWI-log in assessing surface water cover and depth (both surface water depth and water table depth) is presented below and is compared to two additional spectral indices that have also been used to describe plant or surface water status: the Water Band Index (WBI, equation (3) below [Peñuelas et al., 1993]) and the Equivalent Water Thickness index (EWT, equation (4) below [Roberts et al., 1997; Sims and Gamon, 2003]). EWT was calculated, using a Beer's law approximation [Gao and Goetz, 1995; Roberts et al., 1997; Sims and Gamon, 2003], where the impact of surface water on reflectance is determined as the negative slope of the linear regression between the water absorption coefficient spectrum [Sims and Gamon, 2003; Green et al., 2006] and the natural log of the reflectance spectrum over a wavelength range of 900–1000 nm. The formulas for calculating NDSWI-linear, NDSWI-log, WBI and EWT are:
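As a hedged sketch of how two of these indices could be computed from the verbal definitions above: the code assumes NDSWI-linear takes the form (R460 − R1000)/(R460 + R1000), with the sign chosen so that the index rises as near-infrared reflectance falls over water, and it implements EWT as the negative least-squares slope of ln(reflectance) against the water absorption coefficient. The exact published forms of equations (1)–(4) should be checked against the original article; NDSWI-log and WBI are not attempted here. The absorption coefficients and reflectances below are placeholders, not measured values.

```python
import math

def ndswi_linear(r460: float, r1000: float) -> float:
    """Assumed form of NDSWI-linear: normalized difference of blue and NIR."""
    return (r460 - r1000) / (r460 + r1000)

def slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

def equivalent_water_thickness(absorption_per_cm, reflectance):
    """Negative slope of ln(reflectance) against the absorption coefficient.

    Under Beer's law, R = R0 * exp(-a * d), so ln R = ln R0 - a * d
    and the fitted slope is -d.
    """
    return -slope(absorption_per_cm, [math.log(r) for r in reflectance])

# Placeholder two-band reflectances for a vegetated and a flooded spot.
dry = ndswi_linear(0.03, 0.40)
wet = ndswi_linear(0.03, 0.05)

# Placeholder absorption coefficients (1/cm) across 900-1000 nm and a
# synthetic spectrum built with a 2 cm water layer.
absorption = [0.07, 0.12, 0.30, 0.45]
spectrum = [0.4 * math.exp(-a * 2.0) for a in absorption]
ewt_cm = equivalent_water_thickness(absorption, spectrum)
print(round(dry, 3), round(wet, 3), round(ewt_cm, 3))
```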
The 18 July values of EWT, WBI, NDSWI-linear, and NDSWI-log were regressed with surface water cover estimated from digital image analysis as described above (n = 90). For assessing which index was most effective at assessing water table depth and surface water depth, we ran regressions between these indices and water depths from a range of sampling dates and locations within the 2008 sampling year. For example, these indices (equations (1)–(4)) measured for 5 days (18, 23, and 28 July and 4 and 9 August 2008) were first calibrated against corresponding surface water depth measurements for those dates using regression analysis. The two spectral indices that most strongly correlated with surface water cover and surface water depth in the above analyses (equations (1) and (2)) were then applied to the entire growing season by using these calibrations to predict surface water depth and water table depth throughout the 2008 sampling period. These predictions were then tested against independent surface water depth and water table depth measurements and used to make plots illustrating the temporal and spatial dynamics of surface water depth for the treatment areas over the 2008 season. Note that since photographic estimates of percent water cover were only available for one date (18 July), surface water cover could not be independently tested in this way.
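The calibrate-then-test workflow described above can be illustrated with a simple least-squares fit on one set of dates and an error check on held-out measurements. The index values and depths below are invented for illustration, and the root-mean-square error is an assumed choice of test statistic; the text itself reports R2 values.

```python
import math

def fit_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(predicted, observed):
    """Root-mean-square error between predictions and observations."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

# Hypothetical calibration data (index value, water table depth in cm).
calibration_index = [-0.8, -0.6, -0.4, -0.2]
calibration_depth_cm = [-4.0, 2.0, 8.0, 14.0]
slope, intercept = fit_line(calibration_index, calibration_depth_cm)

# Hypothetical independent measurements from other dates.
validation_index = [-0.7, -0.3]
validation_depth_cm = [-1.5, 11.5]
predicted_cm = [slope * x + intercept for x in validation_index]
error_cm = rmse(predicted_cm, validation_depth_cm)
print(round(slope, 3), round(intercept, 3), round(error_cm, 3))
```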
 Particular attention was paid to assessing the accuracy of the model for estimating surface water depth across time, space, and experimental treatments. Data from alternate sampling dates during the peak of the growing season (18, 23, and 28 July and 4 and 9 August 2008) were combined to derive the final model for seasonal analysis. Seasonal surface water depth trends for each tramline were then modeled and compared to measured surface water depth and water table depth to determine if the model over or underestimated water depths in each of the experimental treatments. Plots combining all treatment dates and positions were derived to assess the spatiotemporal behavior of the model along each tramline and throughout the sampling period. To assist in the interpretation of these plots, microtopographic variation along each tramline was also examined. Data for microtopography were acquired from a 1 m digital elevation model developed for the study area from airborne LiDAR acquired in late August 2006 (Tweedie et al., submitted manuscript, 2010).
2.6. Spectral Mixture Analysis
 To further understand the effect of surface water on spectral reflectance, we also applied spectral mixture analysis. In remote sensing, this method is most typically used to “unmix” pixels containing more than one cover type into fractions of their component cover types, or “spectral end members,” representing different cover classes [Adams and Gillespie, 2006]. In our case, we used the reflectance spectra of the dominant cover types (green vegetation and water) and combined them to simulate various levels of percent water cover by area. To do this, we selected end-member spectra from sites along the tram lines that were either 100% water covered or 100% vegetation covered on 18 July 2008, and created synthetic spectra of various fractions (F) of the two spectral types, ranging from 0% surface water cover (100% vegetation cover) to 100% surface water cover (0% vegetation cover), assuming linear (additive) mixing (equation (5)).
$$\rho_{mix} = F_{veg}\,\rho_{veg} + F_{water}\,\rho_{water}, \qquad F_{veg} + F_{water} = 1 \qquad (5)$$
In this equation, ρ indicates the spectral reflectance (from 400 to 1000 nm) for the mixture, pure vegetation (veg) or pure water, and F is a coefficient representing the contribution of each cover type (vegetation or water), with the two portions always adding to one. These synthetic mixtures were then compared to actual field spectra of plots having a known surface water cover.
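The linear mixing described above can be sketched in a few lines. The end-member spectra here are reduced to two invented bands (460 and 1000 nm) for brevity; a real application would use the full 400–1000 nm spectra.

```python
def mix_spectra(veg, water, f_water):
    """Linear (additive) mixture of two end-member spectra.

    f_water is the fractional water cover; the vegetation fraction is
    1 - f_water so that the two fractions always sum to one.
    """
    if not 0.0 <= f_water <= 1.0:
        raise ValueError("f_water must lie between 0 and 1")
    f_veg = 1.0 - f_water
    return [f_veg * v + f_water * w for v, w in zip(veg, water)]

# Hypothetical end members: reflectance at [460 nm, 1000 nm].
veg_spectrum = [0.03, 0.40]
water_spectrum = [0.03, 0.05]

# Synthetic mixtures from 0% to 100% surface water cover.
mixtures = {f: mix_spectra(veg_spectrum, water_spectrum, f)
            for f in (0.0, 0.25, 0.5, 0.75, 1.0)}
for fraction, spectrum in mixtures.items():
    print(fraction, [round(r, 4) for r in spectrum])
```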
2.7. Scaling Analysis
 Two high spatial resolution multispectral satellite images (QuickBird, Digital Globe, Longmont, Colorado, USA) acquired on 2 August 2002 and 27 July 2008, were used to calculate NDSWI for the entire study area for pretreatment (2002) and posttreatment (2008) years. Note that due to frequent cloud cover in this coastal region [Hope et al., 2004], these were the only clear-sky high-resolution Quickbird images we were able to obtain for the experimental period. Radiometrically corrected QuickBird images for these dates were available with four spectral bands: blue (450–520 nm), green (520–600 nm), red (630–690 nm) and near-IR (760–900 nm) at 2.8 m resolution. Using ArcGIS (ESRI, Redlands, California), NDSWI was derived using equation (1) above for both the 2002 and 2008 Quickbird images where R460 and R1000 were substituted with the blue and IR bands of the Quickbird image, respectively. The linear version of the NDSWI equation was used to avoid any problem with band math calculations using the satellite bands. The difference in NDSWI between images (2008 minus 2002) was calculated to visualize and quantify treatment effects for different regions of the basin (flooding, versus drained and control regions; tramline regions versus flux tower regions versus the entire treatment basin).
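The band substitution and image differencing described above could look like the following sketch. It assumes the same normalized-difference form used for the field index, (blue − NIR)/(blue + NIR), and treats each image as a small nested list of reflectance values; the study itself performed this step in ArcGIS, and the pixel values here are invented.

```python
def ndswi_image(blue, nir):
    """Per-pixel normalized difference of blue and NIR bands.

    Pixels where blue + nir is zero are returned as NaN.
    """
    return [
        [(b - n) / (b + n) if (b + n) != 0 else float("nan")
         for b, n in zip(blue_row, nir_row)]
        for blue_row, nir_row in zip(blue, nir)
    ]

def difference(later, earlier):
    """Per-pixel change between two index images (later minus earlier)."""
    return [
        [a - b for a, b in zip(later_row, earlier_row)]
        for later_row, earlier_row in zip(later, earlier)
    ]

# Hypothetical 1 x 2 pixel scenes: the first pixel floods, the second is unchanged.
blue_2002, nir_2002 = [[0.04, 0.04]], [[0.30, 0.30]]
blue_2008, nir_2008 = [[0.04, 0.04]], [[0.10, 0.30]]

ndswi_2002 = ndswi_image(blue_2002, nir_2002)
ndswi_2008 = ndswi_image(blue_2008, nir_2008)
change = difference(ndswi_2008, ndswi_2002)
print(change)
```

A positive change indicates a shift toward wetter conditions under this sign convention.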
2.8. Assessment of Experimental Treatment Effects
The tramline transects and idealized flux tower footprints (based on experimental layout), and the inundated lake basin were delineated for each treatment area. For the 2002 and 2008 images, the NDSWI pixel values for the tram and flux tower footprints and treatment basins were extracted for statistical analyses (Systat Software Inc., Chicago, Illinois). To assess differences between the 2002 and 2008 NDSWI coverages for a given sampling area and treatment, t tests were run. A univariate analysis of variance (ANOVA) was used to assess the difference between treatments within a given year. To determine the difference in the frequency distribution of NDSWI between sampling areas (tramline, flux tower, and treatment basin) within a given experimental treatment and year, a two-sample Kolmogorov-Smirnov test was performed. This nonparametric test assesses whether the tramline footprint area and flux tower footprint area, for example, have a similar NDSWI frequency distribution or whether they are significantly different from one another. This is an important consideration in testing the scalability of measurements taken in different areas of a particular treatment area. It should be noted that the analyses outlined above do not necessarily imply causation between surface hydrology and the spectral indices analyzed. This is because we cannot fully correct for the effects of covariation between vegetation and surface hydrology on spectral signature, and instead treat NDSWI as a proxy of surface hydrologic state.
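To make the two-sample Kolmogorov-Smirnov comparison concrete, the sketch below computes only the test statistic D, the largest gap between the two empirical cumulative distributions. It does not compute a p value, and it is not the Systat routine used in the study; the NDSWI samples are invented.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D.

    D is the maximum absolute difference between the two empirical
    cumulative distribution functions, evaluated at every observed value.
    """
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for value in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, value) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, value) / len(b_sorted)
        d = max(d, abs(cdf_a - cdf_b))
    return d

# Hypothetical NDSWI pixel values for two sampling areas in one treatment.
tramline_ndswi = [-0.62, -0.58, -0.55, -0.51, -0.47]
flux_footprint_ndswi = [-0.60, -0.52, -0.44, -0.40, -0.35]
d_stat = ks_statistic(tramline_ndswi, flux_footprint_ndswi)
print(round(d_stat, 2))
```

A value of D near zero means the two footprints sample similar NDSWI distributions; a value near one means they barely overlap.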
Spectral reflectance was clearly affected by varying surface water cover within the treatment basin (Figure 3). The least change was observed in the blue spectral region (450–500 nm), and the biggest difference was observed in the near-infrared (>700 nm) with reflectance decreasing markedly with increasing surface water (water table depth, surface water depth, and surface water cover). Synthetic mixtures behaved in a similar manner, and closely matched field spectra. This response to surface water coverage indicated that an index that accounted for the contrasting response to water between the blue and NIR bands could be used as an indicator of surface water. To assess which wavelengths would be optimal for such an index, we correlated several metrics of surface water (water table depth, surface water depth and surface water cover) with reflectance at every 10 nm from 400 to 1000 nm for every location where reflectance, water table depth, and surface water cover were measured on 18 July 2008 (Figure 4). Regardless of which water metric was used (water table depth, surface water depth, or surface water cover), R2 values were lowest in the blue (460 nm) and highest in the NIR (approximately 1000 nm), confirming that reflectance in the blue is poorly correlated with surface water, whereas reflectance in the NIR is strongly correlated with surface water. Correlations between reflectance in the NIR and water table depth or surface water cover were greatest for a logarithmic model. R2 values between reflectance and water table depth were higher than those between reflectance and surface water depth, while R2 values between reflectance and surface water cover were similar to those between reflectance and surface water depth.
Of the indices tested, the best predictor of water table depth was NDSWI-log, closely followed by NDSWI-linear (Table 1). Both NDSWI versions (log and linear) exhibited identical R2 values with surface water cover (Table 2). All indices showed significant R2 values with water table depth and surface water cover, although EWT had stronger R2 values with water table depth than WBI, and WBI had stronger R2 values with surface water cover than EWT (Tables 1 and 2). Relative to NDSWI-linear, NDSWI-log showed slightly higher R2 values with water table depth throughout most of the sampling period (Figure 5), although the overall differences were very small. Since independent surface water cover measurements were not made over the entire season (just 18 July 2008), we could not evaluate the seasonal dependence of the fits between these indices and surface water cover.
Table 1. Results of Regressions Between EWT, WBI, NDSWI-Log, NDSWI-Linear, and Water Table Depth for 18, 23, and 28 July and 4 and 9 August 2008^a
^a Y, water table depth; x, index value. Here R2 values are calculated from all dates combined.
Y = 37.303x – 1.6882
Y = 19.271x – 18.854
Y = 31.53x + 22.578
Y = 57.093x + 20.95
Table 2. Results of the Regressions Between EWT, WBI, NDSWI-Log, NDSWI-Linear, and Percent Surface Water Cover for 18 July 2008^a
^a Y, percent surface water cover; x, index value. Here R2 values are calculated from all dates combined.
Y = 133.15x + 6.5587
Y = 80.555x + 67.566
Y = 108.67x + 88.562
Y = 188.83x + 81.127
 The model for predicting water table depth and surface water cover using NDSWI-log is given in Figure 6. Linear models best described the relationship between NDSWI-log and water table depth. However, when only surface water depth was included, a nonlinear model improved the fit slightly (not shown), and NDSWI tended to saturate at larger water table depth or surface water depth values. When belowground values were examined separately (Figure 6a, open circles) the R2 value was greatly reduced, but was still significant (p < 0.001, Figure 6a). These results demonstrate that the NDSWI-based model's greatest predictive power was for surface water depth (i.e., when the water table was above the surface of the ground). The R2 value for the linear regression between NDSWI-log and surface water cover was also highly significant (Figure 6b). Modeled results from synthetic mixtures of water and vegetation (Figure 6b, open squares) closely matched the results of the field measurements (Figure 6b, solid circles), providing further support of a strong link between surface water cover and NDSWI-log for this landscape. Surface water cover and surface water depth from 18 July (the date photographic estimates of surface water cover were calculated) were strongly correlated (Figure 7), suggesting that surface water cover, surface water depth, or some combination of the two could be driving this index.
In seasonal analyses, modeled water table depth closely followed measured water table depth for all the tramlines but appeared to be more accurate for the north tramline. Here, water table depth was higher compared to that of the other tramlines, which often had water tables below the ground surface (Figure 8). When water table depth values were most negative (e.g., between days 190 and 200 in the central and south treatment basins), the index tended to overestimate water table depth (i.e., to produce less negative water table depth values, Figures 8b and 8c). The index also had an aberrant peak during a snowfall event (day 213) in the central and south basins (Figures 8b and 8c), but not in the flooded north basin (Figure 8a), where surface water prevented snow from accumulating. Direct comparisons of modeled and measured water table depth for each tramline and for all treatments combined (Figure 9) showed that the model tended to overpredict water table depth in the control treatment and underpredict water table depth in the drained treatment. In the flooded treatment, where surface water was present, the model predicted water table depth (i.e., surface water depth) most accurately.
 The model developed for water table depth (Figure 6a) depicted the spatiotemporal dynamics of water table depth along each tramline and throughout the sampling period (Figure 10). Measured water table depth and modeled water table depth varied with microtopography (Figure 10, top). Locally high-elevation areas had the lowest water table depth values, and low-elevation areas had the highest water table depth values (green-yellow areas in the water table depth image, Figure 10). In early June, soon after snowmelt (ca. day 170), water table depth was high. As the snow-free period progressed, water table depth decreased throughout each treatment. The horizontal banding in the modeled water table depth around day 210 for the central and south tramlines was caused by a snowfall event, which was also prominent as an anomalous spike in these same treatments shown in Figures 8b and 8c.
 NDSWI extrapolated across the experimental area using Quickbird satellite imagery from 2 August 2002 (Figure 11a, pretreatment) and 27 July 2008 (Figure 11b, posttreatment) showed spatial and temporal changes in surface water. These changes were particularly evident for the northern basin (flooded treatment). In Figure 11, three sampling areas are shown: the treatment thaw-lake basin, the hypothetical flux footprint, and the tramline transects. The most conspicuous difference between these NDSWI extrapolations is an increase in NDSWI (becoming wetter) in the flooded treatment area. Although significant differences between years were recorded in the drained and control treatment areas (Figure 12, asterisks), these are relatively minor compared to the flooding treatment effect illustrated in Figure 11 and further documented in Figure 12.
 Significant differences (t test, p < 0.05) were noted between years for all sampling areas and treatments except for the tramline footprint in the drained treatment (see Figure 12, asterisks). Univariate ANOVA showed no significant difference between the tramline footprints in the control and drained treatments for both years (see Figure 12, letters). The tramline footprint in the flooded treatment was significantly different from the tramline footprints in the drained and control treatments for both years (see Figure 12, letters). Kolmogorov-Smirnov tests showed a significant difference between tramline and idealized flux footprints and treatment basins for all years and treatments except for the 2002 flooded treatment, where the tramline was similar to the idealized flux tower footprint and treatment basin (Figure 12, Roman numerals). These results generally highlight a successful flooding treatment in 2008 but reveal little apparent effect in the draining treatment, which yielded no detectable change in NDSWI. These results also indicate substantial variability between sampling areas (tramline, flux footprint area, and treatment basin) in all three treatments (flooded, drained, and control) for both 2002 and 2008. Note that the surface hydrology of the control treatment varied between years for each sampling region, indicating interannual variability in surface hydrology independent of any experimental treatment effect.
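To illustrate the kind of distribution comparison applied to the sampling areas, the following Python sketch computes the two-sample Kolmogorov-Smirnov statistic D, the largest gap between two empirical cumulative distributions. It is a plain-Python illustration with hypothetical NDSWI values, not the analysis code used in this study, and it omits the p value that a statistics package would also report.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D (max ECDF difference)."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    n_a, n_b = len(a), len(b)
    if n_a == 0 or n_b == 0:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    d = 0.0
    # Walk through the pooled values in ascending order and compare the
    # two empirical CDFs immediately after each distinct value.
    while i < n_a and j < n_b:
        x = min(a[i], b[j])
        while i < n_a and a[i] <= x:
            i += 1
        while j < n_b and b[j] <= x:
            j += 1
        d = max(d, abs(i / n_a - j / n_b))
    return d


if __name__ == "__main__":
    # Hypothetical NDSWI pixel values for two sampling areas.
    tramline = [-0.62, -0.55, -0.48, -0.41, -0.37]
    basin = [-0.58, -0.44, -0.30, -0.21, -0.10, 0.05]
    print("D =", round(ks_statistic(tramline, basin), 3))
```

Unlike a t test, which compares means, D responds to any difference in the shape of the two distributions, which suits the question of whether a small footprint is representative of a whole basin.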
 In the Arctic, soil moisture and surface hydrology are important in controlling plant community composition and ecosystem functional processes such as land-atmosphere carbon and water exchange and surface energy balance [Merbold et al., 2009; Walker et al., 2006; Chapin et al., 2005]. Understanding how surface hydrology is changing with climate change could be key to understanding the future state of the Arctic. This study focused on developing a spectral index capable of estimating surface water status that would not be confounded by atmospheric moisture, would be relatively free from instrument errors, and would be commonly available from ground-based, airborne or satellite-borne instruments. The Normalized Difference Surface Water Index (NDSWI) was able to accurately estimate surface water cover and surface water depth in an experimental flooding and draining experiment situated in a vegetated thaw-lake basin on the Arctic Coastal Plain of northern Alaska.
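A normalized difference index of this kind can be sketched in a few lines of Python. The band pair (reflectance near 460 nm and near 1000 nm) follows the description of the index in this paper, but the order of the bands in the numerator is an assumption made here so that wetter surfaces give higher values, consistent with the reported increase in NDSWI under flooding. The function name and the sample reflectances are hypothetical.

```python
def ndswi(r_blue, r_nir):
    """Normalized difference of blue (~460 nm) and NIR (~1000 nm) reflectance.

    ASSUMPTION: the blue band is placed first in the numerator so that
    strong water absorption in the NIR band raises the index value.
    """
    total = r_blue + r_nir
    if total == 0:
        raise ValueError("reflectances sum to zero; index is undefined")
    return (r_blue - r_nir) / total


if __name__ == "__main__":
    # Hypothetical reflectances (fractions), not measurements from the study.
    print("open water:", round(ndswi(0.05, 0.02), 3))
    print("dry vegetation:", round(ndswi(0.04, 0.40), 3))
```

Because the index is a ratio of a difference to a sum, a uniform scaling of both bands (for example, from a change in overall illumination) cancels out, which is one reason normalized difference indices are convenient across sensors.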
 Compared to EWT and WBI, two other spectral indices that have been widely used to estimate surface hydrological properties using remote sensing [Peñuelas et al., 1993; Gao and Goetz, 1995; Roberts et al., 1997; Sims and Gamon, 2003; Green et al., 2006], NDSWI was a better predictor of surface water depth, surface water cover, and water table depth within the study area. We initially expected that EWT could have captured these surface properties better than NDSWI because the calculation of EWT employs more wave bands, and more closely adheres to physical principles of Beer's law. However, both WBI and EWT used wavelengths within the 970 nm water absorption band, a region which is prone to instrument errors with the detector in our instrument [Sanchez-Azofeifa et al., 2009]. It is likely that these factors contributed to the poorer fit with these indices. Additionally, the presence of more than one cover type (vegetation and water, having different scattering properties), may have reduced the efficacy of EWT in this case, since Beer's law assumes minimal scattering. Our instruments and sampling protocols were designed to correct for changing sky conditions [Gamon et al., 2006], but these corrections are often difficult under rapidly varying cloud conditions. Since the 970 nm water band can be affected by varying atmospheric moisture, the often cloudy and misty conditions of the site may have further contributed to the weaker fit with WBI and EWT. One reason why NDSWI may have performed better was that it uses wavelengths that are on the edge of (rather than near the middle of) the 970 nm water band. The slightly better prediction with the log version (versus the linear version) of NDSWI suggests that a Beer's law approximation may actually be a reasonable assumption for modeling surface water depth, since Beer's law predicts an exponential extinction with depth. 
This becomes problematic for wavelengths affected by varying atmospheric water vapor absorption [Sims and Gamon, 2003] and by poor instrument performance (low signal-to-noise and stray light errors).
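The link between Beer's law and the log version of the index can be made explicit with a short worked equation. The symbols used here ($I_0$ for incident radiation, $I(d)$ for radiation remaining after passing through water of depth $d$, and $k$ for an absorption coefficient) are notation assumed for illustration, and scattering is neglected:

```latex
I(d) = I_0 \, e^{-k d}
\quad\Longrightarrow\quad
\ln\!\left(\frac{I(d)}{I_0}\right) = -k \, d
```

Under this idealization, a log-transformed optical signal varies linearly with water depth, which is consistent with the slightly better fit obtained with the log version of NDSWI.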
 Electromagnetic radiation in the optical region cannot penetrate deep into opaque surfaces (e.g., soil), which explains why negative water table depth values were not modeled well with NDSWI (low R2 values for open circles, Figure 6a). When combined with positive water table depth values (i.e., standing water above the surface), these negative water table depths did not detract from the accuracy of the NDSWI-derived model. However, the low R2 values between NDSWI and negative water table depth values indicate that this model is not capable of accurately predicting belowground water depth. This weakness explains the poorer fit in the control (south) and drained (central) treatment basins relative to the flooded (north) treatment, which was characterized by positive water table depth values (i.e., aboveground surface water) for the seasonal sampling period (Figures 8 and 9). Thus, we caution that the detection of belowground water is not logically possible using the optical remote sensing techniques we have employed in this study, but note that they work well for visible surface water, whether expressed as surface water cover or as surface water depth (which were strongly correlated with each other, Figure 7). The weak but statistically significant correlation between NDSWI and negative water table depth may have been driven by covariance between water table depth and percent standing water associated with the frequency of local depressions where small amounts of subsurface water may have been visible to the sensor. An NDSWI-log value of approximately −0.4 indicates a water table depth at the ground surface using the model derived in Figure 6 and could be used as a reasonable cutoff for studies wishing to maximize the predictive power of NDSWI in modeling aboveground water table depths.
Although this study shows a strong potential of NDSWI to be used as an index of surface water depth and surface water cover, we caution that further testing is required to determine the spatiotemporal scalability of NDSWI across multiple sensing platforms, a broader range of land cover types, and regimes of surface hydrology.
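The suggested cutoff can be applied as a simple screening step, sketched below in Python. The value of -0.4 is the approximate NDSWI-log value reported above for a water table at the ground surface. The function name, the choice to treat values at or above the cutoff as aboveground water, and the sample values are assumptions for illustration.

```python
# Approximate NDSWI-log value at a water table depth of zero (from the text).
SURFACE_CUTOFF = -0.4


def split_at_surface_cutoff(ndswi_log_values, cutoff=SURFACE_CUTOFF):
    """Separate index values into (at or above cutoff, below cutoff).

    ASSUMPTION: higher NDSWI-log values correspond to wetter surfaces,
    so values at or above the cutoff are treated as aboveground water.
    """
    above = [v for v in ndswi_log_values if v >= cutoff]
    below = [v for v in ndswi_log_values if v < cutoff]
    return above, below


if __name__ == "__main__":
    # Hypothetical NDSWI-log values along a transect.
    values = [-0.9, -0.4, -0.1, 0.3]
    above, below = split_at_surface_cutoff(values)
    print("use for depth modeling:", above)
    print("exclude (likely belowground water):", below)
```

Restricting depth modeling to the values retained by such a screen keeps the model within the range where optical measurements can respond to water depth.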
 For largely pragmatic reasons discussed above, we sought a spectral index that would not be overly confounded by atmospheric moisture and could be calculated from readily available ground-based, airborne, or satellite-borne instruments. It appears that NDSWI meets this requirement. Further testing of NDSWI to assess its significance and extrapolation potential in the Arctic and elsewhere will need to address many remote sensing challenges. These include challenges that are specific to the Arctic, that span multiple spatial scales [Vanderbilt et al., 2007], that arise from spectral variation between sensors, particularly optical and spaceborne sensors [Ganguly et al., 2008], and that are specific to differing surface cover types and hydrological regimes. Because a wide range of NIR wavelengths is sensitive to surface water (high coefficients of determination in Figure 4), this index can be readily adapted to a variety of sensors with wave bands in this region, facilitating its broad use but also presenting challenges in intercomparison. Additional remote sensing challenges inherent to the Arctic include but are not limited to characterization of diverse and fine-scale landscape heterogeneity [Stoy et al., 2009], frequent cloud cover and high atmospheric moisture [Hope et al., 2004], and characterization of sometimes fine-scale and often short-term biophysical phenomena that are associated with ecosystem structure and function [Laidler et al., 2008; Vanderbilt et al., 2007]. An example of the latter is the occurrence of a midsummer snowfall (day 213 in Figures 8b and 8c) documented in this study. At this time, snow accumulated in the drained and control treatment areas but not in the flooded treatment area (Figure 8a). The response of NDSWI to this event demonstrated a strong sensitivity to snow, which appears as an anomalously high NDSWI value (Figures 8 and 10).
 The capacity of NDSWI to characterize the surface hydrology of the study area enabled us to evaluate the performance of the flooding and draining experiment. Like many other large-scale experimental manipulations in the ecological sciences, our flooding and draining experiment was unreplicated, was a logistical and operational challenge, and displayed a high degree of “natural” variability in land cover and surface hydrology within and between treatment areas (Figures 11 and 12). The integration of various subproject or discipline-based sampling results from this long-term ecosystem experiment is just beginning, and this study has facilitated this process by providing a tool for evaluating the surface hydrological properties of different experimental treatments, time periods, and sampling areas (e.g., tram versus flux tower footprint). This study suggests that prior to experimental manipulation (i.e., in 2002), the basin of each experimental treatment had significantly different surface hydrology properties (Figures 11 and 12), as did the flux tower and tramline sampling footprints (Figure 12). There were significant differences between years, indicated by a difference in NDSWI for the control treatment (Figures 11 and 12) and elsewhere throughout the study area outside of the experimental thaw-lake basin (Figure 11). In 2008, flooding appeared to be more effective than draining throughout each treatment basin and within the flux tower and tramline sampling footprints (Figure 12). During the experimental manipulation, further draining of the “drained” basin by pumping water lower than the soil surface (water table depth equal to 0) proved to be logistically difficult, and this may help account for the lack of a clear “drained” treatment effect.
Additionally, we caution that our NDSWI-based model cannot readily distinguish between water table depth values of zero and negative values (belowground water), because optical remote sensing methods cannot penetrate far below the soil surface, so a successful drainage treatment may not have been clearly detectable with this method.
 A distinct advantage of NDSWI over the other spectral indices tested in this study is that it can be readily adapted to a range of available satellite remote sensing platforms. Considering the importance of surface hydrology for ecosystem processes and properties such as carbon dioxide and methane flux [Merbold et al., 2009; Wolf et al., 2008], surface energy balance [Chapin et al., 2005; Euskirchen et al., 2007], plant phenology and response to warming [Arft et al., 1999; Walker et al., 2006], and geomorphic processes [Lawrence and Slater, 2005; McNamara and Kane, 2009], the potential for NDSWI to facilitate the advancement of modeling and spatial extrapolation of these processes seems promising. Similar to how satellite-derived NDVI is increasingly being used to model land-atmosphere carbon flux, leaf area index, or plant biomass [Running et al., 2004], we believe that, pending further testing and refinement, NDSWI could be applied in a similar manner to improve models of carbon dioxide or methane fluxes, which are highly dependent on surface hydrology [Merbold et al., 2009]. If possible, this could facilitate the development of spatially explicit models that estimate net greenhouse warming potential through the combination of land-atmosphere carbon dioxide and methane flux models. Such development, including ground, air, and satellite-based estimates of NDSWI, could be integrated within regional studies or observatories focused on carbon dynamics as recently called for by McGuire et al.
 This study has addressed a critical need in the Arctic terrestrial sciences: an improved capacity to detect and monitor surface hydrological properties that are associated with or regulate important ecological processes and phenomena. Our goal was to develop a spectral index from optical remote sensing that could be used to estimate surface water depth and characterize changes in surface hydrology at multiple spatial and temporal scales. We also sought a method that could be used to assess the impact of a large-scale unreplicated flooding and draining experiment that supported a range of simultaneous multidisciplinary and multiscale investigations. The Normalized Difference Surface Water Index (NDSWI) outperformed other spectral indices that have been used to estimate similar properties. This index appears to accurately estimate aboveground surface water depth and surface water cover for this landscape, and detected experimental treatment effects at multiple spatial and temporal scales in the experimental manipulation. We caution that while our results describe a vegetated thaw-lake basin on the Arctic Coastal Plain of northern Alaska, further work will be needed to extend this approach beyond the context of this particular study site. We recommend further testing and refinement of NDSWI, and an assessment of its potential use in the development of models that estimate ecological processes and phenomena that are sensitive to surface hydrology, such as land-atmosphere carbon dioxide, methane, and water vapor exchange.
 This project was supported by the U.S. National Science Foundation (ASSP-0421588). We are grateful to the Ukpeaġvik Iñupiat Corporation (UIC) for permitting the Biocomplexity experiment on the Barrow Environmental Observatory and the Barrow Arctic Science Consortium and CH2M Hill Polar Services (formerly VECO Polar Services) for logistical support, construction services, and ongoing tramline maintenance. We are grateful to the following for their assistance in the field: Sergio Vargas, Christian Andresen, Irbis Gallegos, Mark Lara, Ryan Cody, Gilda Victorino, Paulo Olivas, and Steven Oberbauer. Vanessa Lougheed and David Johnson in the Department of Biology at UTEP provided statistical advice, and Ryan Cody, Christian Andresen, and Adrian Aguirre provided GIS support. Yufu Cheng, Loren MacKinney, Ann Kelly, and Lekealem Taku assisted with the initial tram assembly. Rob Green and Susan Ustin provided helpful comments on the manuscript.