The transposition of atmospheric turbulence statistics from the time domain, as conventionally sampled in field experiments, is explained by the so-called ergodic hypothesis. In micrometeorology, this hypothesis assumes that the time average of a measured flow variable represents an ensemble of independent realizations from similar meteorological states and boundary conditions. That is, the averaging duration must be sufficiently long to include a large number of independent realizations of the sampled flow variable so as to represent the ensemble. While the validity of the ergodic hypothesis for turbulence has been confirmed in laboratory experiments and in numerical simulations for idealized conditions, evidence for its validity in the atmospheric surface layer (ASL), especially for nonideal conditions, continues to defy experimental efforts. There is some urgency to make progress on this problem given the proliferation of tall tower scalar concentration networks that are aimed at constraining climate models yet are impacted by nonideal conditions at the land surface. Recent advancements in water vapor concentration lidar measurements that simultaneously sample spatial and temporal series in the ASL are used to investigate the validity of the ergodic hypothesis for the first time. It is shown that ergodicity is valid in a strict sense above uniform surfaces away from abrupt surface transitions. Surprisingly, ergodicity may be used to infer the ensemble concentration statistics of a composite grass-lake system using only water vapor concentration measurements collected above the sharp transition delineating the lake from the grass surface.
 In its strictest form, the ergodic hypothesis states that ensemble statistics (mean and higher-order moments) at any given time or position are identical to the temporal or spatial statistics. It is a central concept invoked in a wide variety of subjects including chaotic systems [Eckmann and Ruelle, 1985], thermodynamics [Evans and Searles, 2002; Brody et al., 2007], stochastic processes [Deodatis, 1996; Ding et al., 2011], hydrology [Benson et al., 2000; Veneziano and Tabaei, 2004], and turbulence [Stanisic, 1985]. In atmospheric sciences, ergodicity provides the mathematical underpinnings for the Monin and Obukhov similarity theory, which is the most common framework for describing the atmospheric surface layer [Brutsaert, 1982; Stull, 2003]. When considering atmospheric motions, Monin and Yaglom [1971, pp. 209–210] noted that the statistical approach to the theory of turbulence involves a “transition from the consideration of a single turbulent flow to the consideration of the statistical ensemble of all similar flows, created by some set of fixed external conditions.” However, in the case of atmospheric turbulence, it is clear that for any given atmospheric observation, the external meteorological and hydrological conditions are not precisely controlled and cannot be repeated as may be the case in laboratory studies. Given that almost all turbulence theories employ ensemble averaging of the equations of motion while almost all atmospheric surface layer (ASL) measurements report time (or space)-averaged statistics, it is logical to ask under what conditions the two averaging operators converge in light of the difficulties alluded to by Monin and Yaglom [1971]. Support for the ergodic hypothesis has been reported via direct numerical simulations of the Navier-Stokes equations for statistically stationary and homogeneous flows [Da Prato and Debussche, 2003; Galanti and Tsinober, 2004].
In laboratory studies, the ergodic hypothesis has also been tested using velocity time series measurements in a channel with repeated independent yet similar experiments at the Institut de Mecanique de Grenoble [Lesieur, 1990, p. 102]. In the ASL, a weaker form of “similar experiments” is implicitly adopted if “similarity” refers to the mean surface heating (Hs) and friction velocity (u*), the two surface boundary conditions that the flow experiences. Support for this weaker version of “similar experiments,” discussed in Monin and Yaglom [1971], has met with much success in the form of the Monin and Obukhov similarity theory, where changes in the flow statistics scale with the changes in Hs and u* (i.e., external conditions). While similarity theory provides indirect validation for the use of Hs and u* as indices or surrogates for quantifying similarity in “external conditions,” direct testing of the ergodic hypothesis in the ASL has frustrated all experimental efforts and frames the scope of this work. The main novelty here is to demonstrate how Raman lidar (light detection and ranging)-based water vapor concentration (q) measurements can provide, for the first time, an evaluation of the ergodic hypothesis for ASL flows using the “similar experiments” concept. By using space and time as proxies for multiple experiments performed under the same external forcing, it is shown via a field study that ergodicity may be robust to transitions in the land surface cover when the system is defined by the composite cover bounding the transition.
2 Case Study: Seedorf Measurement Campaign
 The Raman lidar water vapor mixing ratio q [g/kg] measurements were collected at a height z = 5 m above an agricultural grass field and adjoining pond [Froidevaux et al., 2013; Higgins et al., 2012]. An aerial photograph of the field site is provided in Figure 1. A unique feature of this setup is the sharp lake-grass transitions over which the data are collected, which provide a well-defined step change in the surface humidity and latent heat flux at 140 and 470 m from the lidar location. The resolution of the sampled q is 1.25 m in space and 1.0 s in time. The multitelescope array used for light collection (in the Ecole Polytechnique Federale de Lausanne (EPFL) Raman lidar) is designed such that the signal-to-noise ratio does not diminish over the first 500 m; therefore, the spatial domain of the analysis was restricted to the region extending from about 50 to 500 m from the lidar mirror. While it is desirable to have a larger region for such an analysis, it should be noted that this region is at least two orders of magnitude larger than the effective eddy size kv z (≈2.0 m) responsible for the water vapor mixing at z = 5 m for near-neutral conditions, where kv = 0.4 is the von Karman constant. Also, the spatial sampling interval (1.25 m, giving 360 sample points) is commensurate with kv z, thereby reducing the possibility of one effective eddy influencing several successive spatial sampling locations.
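The scale argument in this paragraph reduces to a few lines of arithmetic. The sketch below only recomputes the quantities quoted in the text (effective eddy size, number of spatial sample points, and the ratio of domain length to eddy size); the variable names are illustrative and are not taken from the original analysis.

```python
import math

KV = 0.4                     # von Karman constant (from the text)
Z = 5.0                      # measurement height above the surface [m]
DX = 1.25                    # spatial resolution of the lidar [m]
R_MIN, R_MAX = 50.0, 500.0   # analysed range from the lidar mirror [m]

eddy_size = KV * Z                       # effective eddy size kv*z [m]
domain_length = R_MAX - R_MIN            # length of the analysed transect [m]
n_points = round(domain_length / DX)     # number of spatial sample points
scale_ratio = domain_length / eddy_size  # domain length in units of eddy size

print(f"effective eddy size: {eddy_size:.1f} m")
print(f"spatial sample points: {n_points}")
print(f"domain / eddy size: {scale_ratio:.0f} "
      f"(about 10^{math.log10(scale_ratio):.1f})")
```

The ratio exceeds 100, which is the sense in which the analysed region is at least two orders of magnitude larger than the effective eddy size.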
 Thirty-four horizontal lidar scans were taken between 19 August and 25 August 2008 representing 21 h of data. This sampling covers an atmospheric stability (z/L) range from −1.4 to 0.07, where L is the Obukhov length. While the entire data set is analyzed, results from a single segment are illustrated. The conclusions derived from this segment are shown to hold for the more expansive data set. A sample lidar scan used in this analysis, presented in Figure 1, was collected over the course of 45 min in the early morning of 19 August 2008. The impact of the transition jump is not readily apparent from this Raman lidar scan, as there are no discernible features in the humidity signal at the land surface transitions (140 and 470 m) denoted by the black dashed lines in Figure 1. At this time, the Hs over the agricultural field was minimal, resulting in near-neutral atmospheric conditions (|z/L| = 0.05).
 The lidar experiment can be viewed in one of the following two configurations, both constructed under the same mean “external” conditions (i.e., Hs and u*):
 (1) An array of 360 towers, separated by 1.25 m, simultaneously sampling q every 1 s (i.e., time variations are used to construct statistics when stationarity and ergodicity are considered).
 (2) An experiment repeated 2700 times where q is simultaneously sampled every 1.25 m (i.e., spatial variations are used to construct statistics when homogeneity and ergodicity are considered).
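The two configurations above can be pictured as the two axes of a single array q[time][range]. The following is a minimal sketch, assuming a synthetic Gaussian field in place of real lidar data; the function names, the mean of 8.0 g/kg, and the spread of 0.3 g/kg are hypothetical values chosen only for illustration.

```python
import math
import random

def make_field(n_times, n_gates, mean=8.0, sd=0.3, seed=0):
    """Synthetic stand-in for one scan: q[t][x] in g/kg (hypothetical values)."""
    rng = random.Random(seed)
    return [[rng.gauss(mean, sd) for _ in range(n_gates)] for _ in range(n_times)]

def mean_and_sd(values):
    """Mean and (population) standard deviation of a sequence."""
    values = list(values)
    m = math.fsum(values) / len(values)
    var = math.fsum((v - m) ** 2 for v in values) / len(values)
    return m, math.sqrt(var)

def stats_in_time(q):
    """Tower view: one time series per range gate, statistics over time."""
    return [mean_and_sd(column) for column in zip(*q)]

def stats_in_space(q):
    """Repeated-experiment view: one transect per time step, statistics over range."""
    return [mean_and_sd(row) for row in q]

# 45 min at 1 s gives 2700 time steps; 450 m at 1.25 m gives 360 range gates.
q = make_field(n_times=2700, n_gates=360)
per_tower = stats_in_time(q)
per_snapshot = stats_in_space(q)
print(len(per_tower), "towers,", len(per_snapshot), "repeated experiments")
```

The same array serves both views; only the axis along which statistics are accumulated changes.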
 In these two configurations, time or space samplings are used individually as proxies for an ensemble of independent experiments. Autocorrelation decay in both space and time is rapid, thus verifying the assumption of independence. It is clear from the measurement span (360 points corresponding to 450 m of data) and time interval (2700 points corresponding to 45 min of data) that statistics computed from temporal fluctuations should converge better owing to the larger sample size. It is for this reason that the repeated experiments in time (configuration 1 above) are used as surrogates for independent replications, and ergodicity is explored spatially along the lidar transect.
 Lagged autocorrelations are computed in both space and time; the humidity data become uncorrelated at a length scale of 5 m and a time scale of 4 s. To construct an ensemble probability density function (pdf), a subsample of the data set presented in Figure 1 is taken. This subset consists of all points that are separated by the autocorrelation length/time (5 m separation in space and 4 s separation in time), so that each point may be viewed as an independent sample. A second ensemble, restricted to the lake system, is delineated by the lake surface 30 m or more away from the transitions, and it is compared to the full ensemble that includes the lake-grass system. The ensemble over the lake system allows testing for ergodicity over a uniform surface (i.e., lake), while the ensemble collected from the composite lake-grass system allows testing for ergodicity at the transition to assess its representativeness of the composite (or heterogeneous) system. Both ensembles are shown in Figure 2a and are hereafter referred to as Pl(q) for the lake-only system and Pgl(q) for the composite grass-lake system. Notice in Figure 2a that Pgl(q) (black solid line) has heavier tails than Pl(q) (dashed blue line), presumably due to the added variability introduced by the heterogeneity in the composite system.
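The ensemble construction described above can be sketched as follows. This is an illustration under stated assumptions, not the processing used in the study: the field is synthetic white noise, the threshold that defines "uncorrelated" (0.2 here) is an assumed criterion that the text does not specify, and all function names are hypothetical.

```python
import random

def autocorrelation(series, lag):
    """Sample autocorrelation of a 1-D series at a non-negative integer lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var

def decorrelation_lag(series, threshold=0.2, max_lag=50):
    """First lag whose autocorrelation falls below `threshold` (assumed criterion)."""
    for lag in range(1, max_lag + 1):
        if autocorrelation(series, lag) < threshold:
            return lag
    return max_lag

def independent_subsample(q, dt, dx, t_decorr, x_decorr):
    """Keep points of q[time][range] separated by the decorrelation time and length."""
    stride_t = max(1, round(t_decorr / dt))
    stride_x = max(1, round(x_decorr / dx))
    return [row[j] for row in q[::stride_t] for j in range(0, len(row), stride_x)]

def histogram_pdf(samples, n_bins=30):
    """Normalised histogram: bin centres and densities that integrate to one."""
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in samples:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    centres = [lo + (k + 0.5) * width for k in range(n_bins)]
    return centres, [c / (len(samples) * width) for c in counts]

rng = random.Random(1)
q = [[rng.gauss(8.0, 0.3) for _ in range(360)] for _ in range(2700)]  # synthetic field
# Decorrelation scales quoted in the text: 4 s in time, 5 m in space.
ensemble = independent_subsample(q, dt=1.0, dx=1.25, t_decorr=4.0, x_decorr=5.0)
centres, density = histogram_pdf(ensemble)
print(len(ensemble), "approximately independent samples in the ensemble")
```

With the quoted resolutions, both strides equal four sample intervals, so the ensemble retains one point in sixteen of the original scan.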
 Figure 2 also presents pdfs computed using only (Figure 2b) time variations, where location is treated as a repeated experiment, or (Figure 2c) spatial variations, where time is treated as a repeated experiment. Here it is apparent that although the pdfs are similar and independent of spatial or temporal origin, there are small but measurable differences. To test the statistical similarity of the mean of each of these distributions in Figure 2b to the mean of P(q) (composite domain or lake-only system) in Figure 2a, a t test is performed. H values from the t test are presented in Figure 3 for (Figure 3a) pdfs computed from individual time-varying data sampled at each location and compared to the mean of Pl(q) and (Figure 3b) the same as Figure 3a but with the data compared to the mean of Pgl(q). In Figure 3a, the mean of individual realizations collected over the lake cannot be statistically distinguished (at the 95% confidence level) from the mean of Pl(q). That is, fewer than 10% of the statistical tests performed between 200 and 450 m reject the null hypothesis. When repeating the same analysis for all the spatial locations and comparing the individual means to the mean of Pgl(q), only the means near the transitions (in the ranges of 50–175 m and 470–500 m) are potentially indistinguishable from the mean of Pgl(q). Still, in the transitions, a significant fraction of the statistical tests reject the null hypothesis (~50% and ~30% of the statistical tests performed in each region, respectively). However, only the transition region appears to be capable of approximating the ensemble mean of the lake-grass system.
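A per-location test of the mean of this kind can be sketched as below. The sketch makes several assumptions that are not from the text: it uses an unequal-variance (Welch) statistic, it replaces Student's t distribution with a large-sample normal approximation for the critical value, and it uses synthetic Gaussian samples. The function names are hypothetical, and H = 1 is taken to mean that the null hypothesis of equal means is rejected.

```python
import math
import random
from statistics import NormalDist

def mean_and_var(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    m = math.fsum(values) / n
    return m, math.fsum((v - m) ** 2 for v in values) / (n - 1)

def mean_test_h(sample, ensemble, alpha=0.05):
    """Return 1 if equal means are rejected at level alpha, else 0.

    Welch statistic with a normal critical value (large-sample assumption).
    """
    m1, v1 = mean_and_var(sample)
    m2, v2 = mean_and_var(ensemble)
    t = (m1 - m2) / math.sqrt(v1 / len(sample) + v2 / len(ensemble))
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return int(abs(t) > critical)

def rejection_fraction(h_values):
    """Fraction of tests in a region that reject the null hypothesis."""
    return sum(h_values) / len(h_values)

rng = random.Random(2)
ensemble = [rng.gauss(8.0, 0.3) for _ in range(5000)]   # synthetic ensemble
matching = [rng.gauss(8.0, 0.3) for _ in range(600)]    # same mean as the ensemble
shifted = [rng.gauss(8.5, 0.3) for _ in range(600)]     # mean offset by 0.5 g/kg
h_values = [mean_test_h(matching, ensemble), mean_test_h(shifted, ensemble)]
print("H values:", h_values, "rejection fraction:", rejection_fraction(h_values))
```

Applying `mean_test_h` at every range gate and averaging the H values over a span of gates gives the kind of regional rejection percentages quoted in the text.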
 While this analysis focused on mean behavior, Figure 4 explores the applicability of ergodicity to higher-order statistics, which is conventionally called ergodicity in the “strict sense.” This exploration uses the Kolmogorov-Smirnov (KS) test, which employs the H statistic at the 95% confidence level for bin-by-bin comparisons between the individual pdf and P(q). A stencil of data points is selected that is composed of two intersecting segments. The first segment is space local and time-varying (10 min), while the second segment is time local and spatially varying (300 m). The segments intersect, and the relative length of the segments is fixed by Taylor's hypothesis. The point of intersection is the point at which the H statistic value of the KS test is reported here. The comparison is then repeated for data collected only over the lake and later for all possible points representing the composite lake-grass system, resulting in a map of H values from the KS test. Since the KS test determines the statistical likelihood that two separate data segments have been drawn from the same underlying probability distribution shown in Figure 2a, the H value returned can be interpreted as a measure of the ergodicity for all statistical moments. For H values of zero, the underlying ergodic assumption (null hypothesis) cannot be rejected (Figure 4). The KS test results in Figure 4 support the findings in Figure 3. That is, the strict ergodicity approximation appears to be valid for the lake system. Moreover, near the transitions in the lake-grass system (particularly in the range of 300–470 m), the water vapor concentration fluctuations measured cannot be distinguished from Pgl(q).
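The stencil-based KS comparison can be illustrated with a short sketch. Several choices below are assumptions rather than details from the text: the segments are centred on the intersection point, the two-sample KS statistic is compared against its standard asymptotic critical value, the field is synthetic, and the function names are hypothetical. The segment lengths follow the quoted 10 min (600 samples at 1 s) and 300 m (240 gates at 1.25 m).

```python
import math
import random

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def ks_test_h(sample, ensemble, alpha=0.05):
    """Return 1 if a common distribution is rejected at level alpha, else 0."""
    n, m = len(sample), len(ensemble)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))  # about 1.36 for alpha = 0.05
    return int(ks_statistic(sample, ensemble) > c_alpha * math.sqrt((n + m) / (n * m)))

def stencil_samples(q, t0, x0, n_t=600, n_x=240):
    """Time segment at gate x0 plus space segment at time t0 (centring assumed)."""
    t_range = range(max(0, t0 - n_t // 2), min(len(q), t0 + n_t // 2))
    x_range = range(max(0, x0 - n_x // 2), min(len(q[0]), x0 + n_x // 2))
    time_part = [q[t][x0] for t in t_range]
    space_part = [q[t0][x] for x in x_range if x != x0]  # count (t0, x0) once
    return time_part + space_part

rng = random.Random(3)
q = [[rng.gauss(8.0, 0.3) for _ in range(360)] for _ in range(2700)]  # synthetic field
ensemble = [row[j] for row in q[::4] for j in range(0, 360, 4)]       # subsampled ensemble
stencil = stencil_samples(q, t0=1350, x0=180)
print("stencil size:", len(stencil), "H:", ks_test_h(stencil, ensemble))
```

Sliding the intersection point over all admissible times and range gates and recording H at each point would produce a map of H values of the kind described for Figure 4.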
 From this Raman lidar field campaign, “necessary” conditions for homogeneity and ergodicity in the strict sense over the lake cannot be rejected when the measurements are collected for “similar” mean meteorological conditions as long as the measurement location is sufficiently far from land surface transitions. However, for the lake-grass system, it appears that ergodicity applies only near the transition. Hence, if a sufficient number of transitions are included to meet statistical spatial homogeneity requirements [e.g., Brutsaert, 1998], ergodicity could hold once again for the entire system. The latter finding is “good news” for interpreting ensemble concentration measurements from tall towers, including those near land-ocean interfaces, where the scalar source distribution is nonuniform [Bakwin et al., 1998].
4 Discussion and Conclusions
 The lidar technique opens up new possibilities for atmospheric measurements and analysis by providing simultaneous high-resolution spatial and temporal atmospheric information. This analysis of Raman lidar water vapor concentration data supports the use of the ergodic hypothesis in the ASL near the ground, away from transitioning land surface conditions. We repeated the analysis for all available scans, and the findings remain consistent across a range of atmospheric stabilities (moderately stable, near-neutral, and unstable). While it is clear that transitioning terrain affects the local turbulence statistics and ergodicity, the severity of a step change in the surface and the resulting extent of influence on atmospheric ergodicity are yet to be quantified. The work here suggests that water vapor concentrations collected above sharp transitions in surface humidity might encode the ensemble statistics of the composite lake-grass system if sampled long enough.
 The issue of similar mean meteorological conditions becomes harder to define for some mean meteorological states. Synoptic atmospheric conditions may lead to nonstationarity and lack of ergodicity. For example, nonstationarity can be introduced through entrainment of water vapor at the capping inversion or the passage of clouds [Cava et al., 2004]. Quantifying these effects in ASL turbulence necessitates new lidar experiments, particularly ones aimed at investigating the robustness of the atmospheric stability-stationarity-homogeneity-ergodicity question, perhaps using an analysis similar to the one presented here. Answering these questions may clearly show the need for new theories that do not assume, a priori, the ergodic hypothesis.
 The authors would like to thank the FNRS (project no. 200021-107910/1, 200021_120238/1, 2008-2011) and NCCR_Mics3 for their support of this research. Katul acknowledges support from the National Science Foundation (NSF-EAR-10-13339, NSF-AGS-1102227), the U.S. Department of Agriculture (2011-67003-30222), and the United States Department of Energy (DOE) through the Terrestrial Ecosystem Science program (DE-SC0006967).
 The Editor thanks two anonymous reviewers for their assistance in evaluating this paper.