We analyze the requirements for detecting changes in midlatitude land carbon sources or sinks from sampling the atmospheric CO2 concentration. Programs to sample the continental lower troposphere have only recently started, and it is not yet clear which atmospheric sampling strategy is most suitable. To shed some light on this question, we use simulations of atmospheric CO2 over Eurasia with two regional-scale atmospheric transport models. Our analysis focuses on the detection of the monthly mean CO2 signal caused by a perturbation of Eurasian summer biospheric fluxes by 20% (0.06 PgC/month). The main results are (1) that several measurements per day, preferably during the afternoon, are necessary to permit the detection of the additional land sink and (2) that the ratio between signal and background variation, corrected for autocorrelation in time, suggests no preferred level in the vertical for sampling. However, (3) the signals in the free troposphere are very small (0.2 ppm per 0.06 PgC/month) given the precision of atmospheric measurements. In contrast, signals in the planetary boundary layer (PBL) are on the order of 1 ppm per 0.06 PgC/month. This suggests that optimal sampling on continents should concentrate on the mixed portion of the PBL during the afternoon. (4) Finally, the spatial correlation structure of the atmospheric CO2 concentrations suggests that a horizontal sampling density on the order of a few hundred kilometers is needed.
 Most of our information about the dynamics and magnitude of large-scale carbon sources and sinks on land is derived from a limited spatial sampling of atmospheric CO2 concentration. Available data indicate that approximately one quarter of carbon emitted to the atmosphere as a result of fossil fuel burning, cement manufacture, and land-use change is absorbed by terrestrial ecosystems [e.g., Prentice et al., 2001]. One of these uptake regions is likely located in the Northern Hemisphere midlatitudes [Keeling et al., 1989; Tans et al., 1990]. The same data indicate also that there are large interannual variations in the growth rate of atmospheric carbon [e.g., Prentice et al., 2001]. While results from several studies based on atmospheric CO2 data [e.g., Randerson et al., 1997; Bousquet et al., 2000; Rödenbeck et al., 2003] indicate a pattern of response of the land biosphere to the anthropogenic perturbation of the atmospheric composition, it has been difficult to determine the underlying mechanisms on a large scale. Also, recent model-based predictions of the future behavior of the land biosphere under a warming climate exhibit very different temporal and spatial patterns [cf. Friedlingstein et al., 2001; Cox et al., 2000].
 One important reason for the limited capability to relate source/sink processes on land to external forcing is the sparse sampling of atmospheric CO2 on continents; another is that atmospheric source/sink signatures are degraded rapidly by strong mixing in the lower troposphere. Indeed, atmospheric observations have traditionally been made mainly at the Earth's surface at remote oceanic stations with weekly to biweekly flask air sampling, and the number of stations is small (currently on the order of 100 stations) (e.g., Conway et al. , GLOBALVIEW). Sampling of the atmosphere has focused on remote oceanic stations because the variability of CO2 signals over vegetated areas is very large compared to mean gradient signals. As an example, a Northern Hemisphere midlatitude carbon sink on the order of 1–3 PgC yr−1 has been inferred from an observed interhemispheric annual mean CO2 gradient on the order of 1–2 ppm [Keeling et al., 1989; Tans et al., 1990]. In comparison, Bakwin et al.  measured day-night differences of up to 100 ppm during summer at the Wisconsin tower site as a consequence of the diurnal cycle of photosynthesis and respiration.
 However, as attempts to sample the continental atmosphere are relatively recent and the best sampling strategy is not yet clear, it is of interest to ask what simulations with high-resolution transport models, driven by fluxes from land biosphere models with a realistic diurnal cycle, suggest. The reason for following such an approach is threefold. First, dissipative processes responsible for tracer transport within the PBL (planetary boundary layer) and exchange between the PBL and the free troposphere, as well as the dynamics of the PBL itself, can be represented more realistically with high-resolution models than with the global transport models used so far. Traditionally used models have a spatial resolution on the order of 5° × 5° latitude by longitude. Second, the spatial variability of transport processes and fluxes should be as close to observed variability as possible, which calls for a flux resolution in time on the order of an hour. Finally, a major factor for properly interpreting lower troposphere CO2 data is that the interplay between the diurnal cycle of surface fluxes and vertical transport processes is represented adequately in the models. Thus the diurnal cycle of photosynthesis and respiration of the land biosphere should be represented in a realistic way.
 To meet, at least in part, the requirement for high spatial resolution, we use two regional high-resolution transport models (REMO and MM5/HANK) for the atmospheric transport simulations. The use of more than one model gives us a handle on how strongly our conclusions depend on the representation of transport. To simulate the variability of the land biosphere realistically, we use the TURC (Terrestrial Uptake and Release of Carbon) biosphere model [Lafont et al., 2002], which is based on incoming solar radiation (including the intermittent nature of cloud cover) and temperature from meteorological analysis together with the NDVI (normalized difference vegetation index) from satellite observations. To simulate the remaining two carbon flux components that determine the atmospheric CO2 distribution, atmosphere-ocean CO2 exchange and CO2 release as a result of fossil fuel burning and cement manufacture, we use monthly varying fluxes estimated by Takahashi et al.  and data from the Emission Database EDGAR V2.0 [Olivier et al., 1996], respectively. In our study we focus on Europe and western Siberia, but as the main determinants of the variability of atmospheric CO2 are similar, the conclusions of this study should be transferable to North America as well. The simulations cover a summer period (July 1998). This is the period when the terrestrial biosphere is most active and consequently the CO2 variability is particularly large.
 Our approach to answer the question posed follows roughly a study by Kjellström et al. . They investigated the response of the atmospheric CO2 concentrations over Eurasia to an alteration of the surface fluxes and concluded that small differences (<20%) in the terrestrial surface fluxes would be difficult to detect on a monthly basis with existing stations.
 We conceptualize the source/sink detection problem as being analogous to detecting a signal with an analyzer in the laboratory. The signal is the difference in the mean CO2 concentration over the Eurasian continent that is caused by a specific biosphere sink scenario, averaged over a time interval during which fluxes are likely to be of approximately constant strength (here 1 month). The analogue of instrumental noise is the monthly variance of the composite CO2 concentration, which consists of the contributions from fossil fuel emissions, ocean-atmosphere CO2 exchange, and photosynthesis and respiration of the undisturbed land biosphere. One difference from the analyzer analogue is that concentration observations at a fixed location cannot be treated as independent measurements, and thus temporal correlations in the signal need to be taken into account. The central quantity of the approach is the ratio between signal and noise. In general this approach is similar to the one of Kjellström et al. , with a major difference being that the autocorrelation structure of the simulated signal and noise is properly taken into account, which alters some of the conclusions of the exercise. In addition, we investigate the spatial coherence pattern of atmospheric CO2 in order to infer some general guidelines for sampling the continental troposphere.
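 The signal-to-noise concept described above can be illustrated with a minimal Python sketch. It is not the formulation used in this paper: the function names are hypothetical, and the effective sample size is approximated with the common first-order autoregressive correction n_eff = n (1 − r1)/(1 + r1), where r1 is the lag-1 autocorrelation. The signal is taken as the difference of the two time means, and the noise as the standard error of the reference mean based on n_eff.

```python
import math


def lag1_autocorrelation(x):
    """Lag-1 autocorrelation of a time series (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    if var == 0:
        return 0.0
    cov = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    return cov / var


def effective_sample_size(x):
    """ASSUMPTION: AR(1)-type correction n_eff = n * (1 - r1) / (1 + r1)."""
    r1 = max(lag1_autocorrelation(x), 0.0)  # do not inflate n for negative r1
    return len(x) * (1.0 - r1) / (1.0 + r1)


def signal_to_noise(reference, perturbed):
    """|mean difference| divided by the standard error of the reference mean."""
    n = len(reference)
    mean_ref = sum(reference) / n
    signal = sum(perturbed) / len(perturbed) - mean_ref
    std = math.sqrt(sum((v - mean_ref) ** 2 for v in reference) / (n - 1))
    noise = std / math.sqrt(effective_sample_size(reference))
    return abs(signal) / noise
```

 Positive autocorrelation lowers n_eff and therefore the ratio, which is why neglecting it would make a signal appear easier to detect than it is.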
 The paper is structured as follows. First, the models and the model setups are presented. As any inferences on sampling of the atmosphere for the purpose of estimating land-atmosphere CO2 exchange hinge critically on the ability of the models to reproduce realistically the diurnal cycle of CO2 in the lower troposphere, we then confront model simulations with observations. Once the realism of the simulated variability and its limitations have been assessed, we proceed to discuss the atmospheric CO2 concentration across Eurasia and its variability as predicted by our modeling framework. Next, we investigate the signal-to-noise ratio of the hypothetical signal caused by a difference of 20% in the terrestrial biosphere fluxes in Eurasia. On the basis of this signal-to-noise analysis we finally infer the temporal and spatial sampling requirements for detecting land carbon sources and sinks from the model simulations.
2. Model Setup
2.1. Model Descriptions
 The first model used for the simulations is the regional atmospheric model REMO [Jacob and Podzun, 1997; Langmann, 2000]. REMO calculates tracer transport online together with meteorology. The dynamical part and physical parameterizations are based on the regional weather forecast model Europamodell (EM) of the German Weather Service (Deutscher Wetterdienst DWD) [Majewski, 1991]. Tracer transport is represented by horizontal and vertical advection according to the algorithm of Smolarkiewicz , vertical diffusion following Louis  in the surface layer and using a second-order closure scheme [Mellor and Yamada, 1974] in the layers above, and convective transport by the mass flux scheme of Tiedtke . The performance of REMO in simulating the atmospheric circulation on the regional scale has been evaluated in several studies [e.g., Karstens et al., 1996; Jacob et al., 2001; Rockel and Karstens, 2001]. The transport properties of REMO have recently been evaluated through a detailed comparison of simulations of the passive tracer 222Rn with continuous measurements in Europe [Chevillard et al., 2002a] showing the ability of REMO to reproduce fairly well the transport features on a synoptic and subsynoptic scale. In a companion study, Chevillard et al. [2002b] simulated the atmospheric CO2 concentration in Europe and western Siberia using REMO together with the terrestrial biosphere model TURC. Comparisons with continuous measurements at ground stations and vertical profiles from aircraft measurements showed that for their study period of July 1998 REMO realistically simulates the short-term variability of the CO2 concentration.
 In the present study, REMO (version 5.0) with the DWD physical parameterization package uses 20 vertical levels in a hybrid coordinate system with six layers below 1500 m and an average thickness of the lowest layer of 60 m. The horizontal grid resolution is 0.5° in a rotated spherical coordinate system with the equator almost in the center of the computational domain. This results in a horizontal grid size of roughly 55 km × 55 km. The model domain encompasses Europe and western Siberia and covers an area of 36 × 10^6 km^2 (cf. Figure 5b). For the meteorological part of the model, analysis data from the European Centre for Medium-Range Weather Forecasts (ECMWF) are used for initialization and as lateral boundary information with a time resolution of 6 hours.
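 The quoted grid size follows from simple spherical geometry, as the short Python sketch below shows. The Earth radius of 6371 km and the function name are assumptions made for illustration. Placing the equator of the rotated coordinate system near the domain center keeps the zonal spacing close to the meridional spacing, because the cosine factor stays near 1.

```python
import math

EARTH_RADIUS_KM = 6371.0  # ASSUMPTION: mean Earth radius


def grid_spacing_km(delta_deg, rotated_lat_deg=0.0):
    """Return (meridional, zonal) spacing in km of a regular lat-lon grid."""
    km_per_degree = 2.0 * math.pi * EARTH_RADIUS_KM / 360.0
    meridional = delta_deg * km_per_degree
    # Zonal spacing shrinks with the cosine of the (rotated) latitude.
    zonal = meridional * math.cos(math.radians(rotated_lat_deg))
    return meridional, zonal


north_south, east_west = grid_spacing_km(0.5)
print(f"0.5 degree at the rotated equator: {north_south:.1f} km x {east_west:.1f} km")
```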
 To enable direct comparison of model results with observations, REMO is run in the so-called forecast mode [e.g., Chevillard et al., 2002b], in which the results of consecutive short-range forecasts (30 hours) are combined. REMO is started for each day at 0000 UTC from analysis and a 30-hour forecast is computed. To account for a spin-up time, the first 6 hours of the forecast are neglected. This spin-up time has been shown to be sufficient for the Europamodell of DWD and is therefore assumed adequate also in the present study. By restarting the model every day from ECMWF analysis, the model state is forced to stay close to the weather situation of these analyses. The tracer transport is calculated continuously by simulating only the meteorology in the first 6 hours of each run and passing the tracer fields directly from the last time step of the previous 30-hour forecast. For a limited area model like REMO, the influence of CO2 sources and sinks outside the model domain has to be accounted for by using global CO2 concentration fields as initialization and at the lateral boundaries during the model run. Furthermore, we have to allow CO2 that has been emitted from a source inside the model area and has then left the area to be transported back into it. This is taken into account by using simulation results from the global transport model TM3 [Heimann and Körner, 2003]. TM3 is used with a horizontal resolution of 4° × 5° along with 19 vertical layers and provides concentration fields at 3-hour intervals. To ensure consistency when nesting REMO into TM3, the CO2 concentration is generated globally in TM3 with the same ECMWF analysis and the same set of CO2 surface fluxes.
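 The bookkeeping of the forecast mode can be sketched in a few lines of Python. The function below is hypothetical and only illustrates the scheme described above: forecasts start every 24 hours at 0000 UTC, the first 6 hours of each are discarded as spin-up, and lead times from 6 up to (but not including) 30 hours are used. Treating the window as half-open, so that the overlapping hour is taken from the newer forecast, is an assumption of this sketch and not stated in the text.

```python
SPINUP_HOURS = 6      # discarded at the start of every forecast
FORECAST_HOURS = 30   # length of each short-range forecast
RESTART_HOURS = 24    # a new forecast starts every day at 0000 UTC


def forecast_for_valid_hour(valid_hour):
    """Map a valid time to (forecast index, lead hour).

    valid_hour counts hours since 0000 UTC of the first forecast start.
    ASSUMPTION: lead times 6 <= lead < 30 are used (half-open window).
    """
    if valid_hour < SPINUP_HOURS:
        raise ValueError("no usable forecast before the first spin-up ends")
    lead = (valid_hour - SPINUP_HOURS) % RESTART_HOURS + SPINUP_HOURS
    index = (valid_hour - lead) // RESTART_HOURS
    assert SPINUP_HOURS <= lead < FORECAST_HOURS
    return index, lead


for hour in (6, 29, 30, 53):
    print(hour, forecast_for_valid_hour(hour))
```

 With these settings the usable windows of consecutive forecasts tile the time axis without gaps, which is what allows the tracer fields to be handed from one forecast to the next.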
 The second regional model that we employ in this study is the off-line tracer transport model HANK [Hess et al., 2000]. HANK uses meteorological fields from the regional weather forecast model MM5 to drive tracer transport. MM5 is the fifth generation version of a regional weather forecast model developed at Pennsylvania State University and NCAR (National Center for Atmospheric Research) [Grell et al., 1993]. As input MM5 uses preprocessed meteorological fields provided by weather forecast centers like ECMWF or NCEP (National Centers for Environmental Prediction). The high-resolution meteorological fields calculated by MM5 are obtained using a nudging technique that forces the model fields to be close to the analyzed wind fields provided by ECMWF or NCEP. MM5 offers model physics that either do make or do not make use of the hydrostatic approximation. Here the version based on the hydrostatic approximation is employed. The modeling of nonreactive and reactive constituent concentration fields in the atmosphere with MM5/HANK proceeds in two steps. First, the regional weather forecast model MM5 is used to simulate meteorological fields within a model domain that can be flexibly chosen and with high spatial resolution up to a mesh size of a few kilometers. Three different projections can be chosen to set up the model coordinate system and up to nine levels of subdomains can be nested into the main domain. In a second step the winds and cumulus mass flux fields provided by the MM5 simulation are used off-line to drive transport within the atmospheric chemistry and transport model HANK.
 For the simulations performed for this paper a polar stereographic coordinate system with the coarsest grid centered at the North Pole, covering approximately two thirds of the Northern Hemisphere is used. Within this larger domain, a domain with three times finer resolution centered over Europe and western Siberia is embedded (cf. Figure 5a). The horizontal resolution of the coarse grid and the fine grid is 270 km × 270 km and 90 km × 90 km, respectively. In the vertical direction a sigma coordinate system with 27 layers is used, which has a resolution of approximately 25 m next to the ground. The meteorological fields used to simulate the driver fields for tracer transport with MM5 are from NCEP reanalysis. Driver fields for HANK are provided by MM5 once every hour. In HANK the initial conditions for the tracer concentration fields are zero. Boundary conditions for the larger domain are also zero while the boundary values for the smaller domain centered over Europe are set equal to the concentrations of the larger pole-centered domain. The algorithm used for advection is from Smolarkiewicz ; vertical diffusion in the PBL follows Holtslag and Boville  and Grell et al.  above the PBL. Deep convection is parameterized following Grell .
 When comparing the setups of REMO and MM5/HANK, the main differences are (1) an approximately two-fold higher horizontal resolution of REMO compared to HANK, (2) higher vertical resolution of the PBL by MM5/HANK compared to REMO, (3) different meteorological forcing with ECMWF analysis used for REMO and NCEP reanalysis used for MM5/HANK, (4) differences in CO2 lateral boundary conditions due to embeddings in a global (TM3) versus a coarse-grid Northern Hemisphere domain (MM5/HANK), and (5) online transport of tracers in REMO compared to off-line transport in HANK.
2.2. Surface Fluxes
2.2.1. Terrestrial Biosphere
 In this study biosphere-atmosphere exchange of CO2 is simulated by the TURC model [Lafont et al., 2002; Ruimy et al., 1996]. TURC is a production efficiency model, which estimates carbon uptake (photosynthesis) and release by vegetation and soils (respiration) from satellite information about the location and phenology of the land biosphere. This is done via the use of the normalized difference vegetation index (NDVI) in combination with meteorological information on downward solar radiation and air temperature close to the surface, as well as soil temperature and soil wetness from ECMWF analysis. Daily biospheric fluxes, split into gross primary productivity, maintenance respiration, growth respiration, and heterotrophic respiration, are estimated on a 1° × 1° grid. The primary driver of the day-to-day variability of the biosphere acting as CO2 source or sink is the incoming radiation at the surface, which determines the amount of photosynthesis [Lafont et al., 2002]. The day-to-day variability of direct shortwave radiation itself is mainly due to changes in cloudiness. In contrast to photosynthesis, soil respiration varies more slowly. As closure assumption annual soil heterotrophic respiration is set to be in balance with net primary productivity (photosynthetic uptake minus autotrophic respiration) for every grid cell. This assumption restricts the use of the simulated fluxes to seasonal or shorter timescales. To permit realistic simulations of the interaction between the diurnal cycle of biospheric fluxes and vertical transport in the troposphere, an approximately hourly temporal resolution of the biospheric fluxes is necessary. Therefore the diurnal variations of the individual components are parameterized using ECMWF analysis in accordance with the basic assumptions of TURC following the method described by Chevillard et al. [2002b]. 
Two previous studies demonstrated that TURC is able to simulate CO2 flux variability on hourly to daily timescales in good agreement with measurements at eddy flux sites [Lafont et al., 2002; Chevillard et al., 2002b]. Integrating the biospheric fluxes in July 1998 over the month and the model domain results in a net sink of −0.35 PgC for the REMO model setup.
2.2.2. Ocean
 Air-sea flux of CO2 is prescribed following Takahashi et al. , who assembled and interpolated measurements of CO2 partial pressure differences across the air-sea interface and estimated monthly mean net fluxes using the wind speed dependent gas exchange coefficient formulation of Wanninkhof . In July 1998 the net uptake of CO2 by the ocean in the REMO model domain is −0.05 PgC.
2.2.3. Fossil Fuel
 Fossil fuel CO2 emissions are prescribed in the models using data from the Emission Database for Global Atmospheric Research (EDGAR V2.0) [Olivier et al., 1996]. EDGAR V2.0 provides global 1° × 1° maps of CO2 emissions for the year 1990. The emission estimates are based on national statistics on energy consumption (from IEA) and industrial production (from United Nations) together with CO2 emission factors (amount of CO2 emitted per unit of energy consumed). The emissions are distributed within each country according to point-source and area-source information as well as population density maps. Great Britain and central Europe are the regions with highest emissions within the model domain. Only very few sources are mapped north of 60°N and there are almost no fossil sources in northern Siberia. The total emission for July 1998 is 0.17 PgC in the REMO model domain.
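 As a simple cross-check, the three monthly flux totals quoted in this section for the REMO model domain can be summed. The Python sketch below only adds the numbers given in the text; the dictionary keys and the function name are illustrative, negative values denote uptake from the atmosphere, and the resulting net value is plain arithmetic on those three numbers and not a result reported in the paper.

```python
# Monthly totals for July 1998 in the REMO domain, in PgC (values from the text).
JULY_1998_FLUXES_PGC = {
    "terrestrial_biosphere": -0.35,  # net sink
    "ocean": -0.05,                  # net uptake
    "fossil_fuel": 0.17,             # emission
}


def net_flux(fluxes):
    """Sum of all surface flux components (negative = net uptake)."""
    return sum(fluxes.values())


print(f"net surface flux: {net_flux(JULY_1998_FLUXES_PGC):+.2f} PgC")
```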
3. Realism of Simulated CO2 Variability
 In order to illustrate to what extent our modeling framework is capable of reproducing the temporal and spatial variability of atmospheric CO2, we confront our model results with continuous in situ measurements at several stations in Europe and western Siberia (cf. Table 1). We show comparisons of simulated CO2 time series for July 1998 with observations at the European WMO station Hegyhátsál and at the flux tower sites Aberfeldy, Bayreuth, and Tharandt. Vertical profiles between 100 and 3000 m at a station in western Siberia are compared to illustrate the performance of the models in simulating the diurnal variability in the vertical CO2 distribution. A more detailed comparison of model simulations with observations at various stations is presented by Geels et al.  for several transport models, including REMO and HANK. For the comparison of model results to point measurements it is important to keep in mind that the influence of local conditions at the measurement sites cannot be fully represented in the models with an average grid size of 55 km and 90 km for REMO and HANK, respectively.
Table 1. List of Atmospheric Stations Used in This Study Including References to Site Descriptions
 Our models do not provide an absolute concentration of CO2, but only its variation in time with respect to an arbitrary initial value. We therefore added an “offset” of 355 ppm to all the simulation results. The offset was chosen such that the modeled CO2 concentration is closest to observations at 3000 meters at Zotino during July 1998. Since CO2 measurements on flux towers are not calibrated against air standards as it is done at WMO atmospheric stations, an additional arbitrary offset was used for some of these stations.
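 One way to choose such an offset is sketched below in Python. Interpreting "closest" in the least-squares sense is an assumption; under that reading the best constant offset is the mean difference between the observations and the simulated anomalies. The function name and the example numbers are purely illustrative and are not data from this study.

```python
def fit_offset(observed_ppm, simulated_anomaly_ppm):
    """Constant offset that minimizes the squared model-observation misfit.

    ASSUMPTION: 'closest' is read as least squares, for which the optimum
    is the mean of (observation - simulated anomaly).
    """
    if not observed_ppm or len(observed_ppm) != len(simulated_anomaly_ppm):
        raise ValueError("need two non-empty series of equal length")
    differences = [o - s for o, s in zip(observed_ppm, simulated_anomaly_ppm)]
    return sum(differences) / len(differences)


# Illustrative numbers only.
offset = fit_offset([356.0, 355.0, 354.0], [1.0, 0.0, -1.0])
```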
 In Figures 1a–1c the temporal variability of CO2 at ground stations is illustrated for the Aberfeldy, Bayreuth, and Tharandt sites. During the growing season, strong diurnal variations in the atmospheric CO2 concentration are observed that are a result of the covariance between the diurnal cycle of the atmosphere-biosphere CO2 exchange and the daily evolution of the boundary layer mixing regime. Synoptic weather events contribute to the day-to-day variability of the diurnal cycle during the growing season by influencing the photosynthetic activity and hence the CO2 flux as well as the strength of the vertical mixing in the atmosphere. Both models are able to reproduce the phase of the diurnal oscillation, and they also capture the variability of the amplitudes due to changes in the meteorological situation (passing fronts). Nevertheless, the quantitative agreement is sometimes poor because measurements close to the surface are representing very local conditions, which cannot be resolved in regional scale models (root mean square errors (RMSE) range between 4 ppm for REMO at Aberfeldy and 13 ppm for HANK at Bayreuth). At Hegyhátsál, CO2 is measured at different heights (10, 48, 82, and 115 m) along a tall tower. REMO and HANK simulations are interpolated to the measurement heights and compared to the observation time series in Figure 2. The data show a large diurnal cycle with higher nighttime maxima at the lower levels compared to 82 m and 115 m, which are caused by strong gradients of CO2 near the ground under stable nocturnal conditions. During midafternoon, when the boundary layer is well mixed, CO2 values are similar at all levels. Both models reproduce the main features of the vertical structure: higher diurnal amplitude at 10 m and 48 m compared to 115 m and vertically almost constant afternoon CO2 values. RMSE values are generally higher at the lower levels (15–24 ppm) compared to the uppermost level (9–12 ppm). 
The comparison of the CO2 time series at the stations presented here shows that the general differences in terms of mean diurnal amplitude and monthly mean concentration are well captured by the models.
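 The root mean square errors quoted above follow the standard definition, which can be written in Python as below. The function name is illustrative, and the sketch assumes that the simulated values have already been interpolated to the observation times and heights.

```python
import math


def rmse(simulated, observed):
    """Root mean square error between two equally long series (same units)."""
    if not observed or len(simulated) != len(observed):
        raise ValueError("need two non-empty series of equal length")
    squared = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared) / len(squared))
```

 Because the errors are squared before averaging, a few large nighttime mismatches near the ground can dominate the RMSE even when the daytime agreement is good.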
 In order to characterize the variability of the CO2 signals, and because of its later use for calculating effective sample sizes in section 4, we present in Figure 3 autocorrelation functions of observed CO2 concentrations and model simulations at a selection of stations. As we will later focus on daytime sampling, average daytime (1100–1700 local time) concentrations have been used here for calculating the autocorrelation functions. The autocorrelation functions reveal a dominant timescale on the order of approximately 3–4 days, which we interpret as reflecting the time between passing fronts that “reset” the lower troposphere CO2 concentration. The agreement between simulations and observations is quite good with regard to this timescale of several days.
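 An autocorrelation function of a series of daily daytime means can be estimated as in the following Python sketch. The estimator (biased, normalized by the lag-0 sum of squares) and the use of the first lag at which the function drops below 1/e as a timescale measure are assumptions of this illustration; the paper does not specify either choice, and the function names are hypothetical.

```python
import math


def autocorrelation(x, max_lag):
    """Autocorrelation of series x for lags 0..max_lag (biased estimator)."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    if denom == 0:
        raise ValueError("series is constant")
    return [
        sum((x[i] - mean) * (x[i + k] - mean) for i in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


def decorrelation_lag(acf, level=1.0 / math.e):
    """First lag at which the autocorrelation falls below `level`, else None."""
    for lag, value in enumerate(acf):
        if value < level:
            return lag
    return None
```

 Applied to one value per day, the returned lag is expressed in days and can be compared directly with the several-day timescale discussed above.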
 The performance of HANK and REMO in simulating the vertical structure and daily course of the CO2 concentration in the lower troposphere is illustrated by a comparison of model results with data from an aircraft campaign at the station Zotino, central Siberia [Lloyd et al., 2002a]. In Figure 4 CO2 concentration profiles sampled three times a day during several days in July 1998 are presented together with model results. The measurements show a characteristic pattern with small variations of the CO2 concentration at 3000 m and a large diurnal change within the PBL. Morning profiles show an accumulation of respired CO2 near the surface below 500 m and a sharp decrease of the concentration above. At noon, the CO2 profiles are much more uniform and the CO2 concentration slightly increases with height because the net uptake of CO2 by the vegetation reduces the concentration and daytime mixing due to convection transports CO2-depleted air higher up. The comparison of model simulations with data reveals that both models reproduce qualitatively the diurnal cycle of the CO2 profiles fairly well but have difficulties in positioning and resolving the sharp gradients marking the top of the convective boundary layer. One possible reason for this could be an underestimation of net CO2 uptake in the biospheric model. Chevillard et al. [2002b] showed a comparison of TURC fluxes with observations at Zotino, which gives no clear indication of a systematic underestimation of the daytime uptake. Hence we conclude that both models are not adequately representing turbulent mixing in the PBL. While in REMO vertical mixing seems to be too vigorous and as a consequence air depleted in CO2 is mixed too high up into the free troposphere during daytime, the vertical exchange in HANK is much weaker and the uptake signal is often confined to the lowest few hundred meters of the daytime boundary layer.
 In summary, we conclude from the comparisons that both models are able to reproduce the general characteristics of the atmospheric CO2 concentration fairly well. As the temporal variability is well captured, the employed models are suited for the purpose of this study even though quantitatively the agreement with observations is relatively poor. However, the models are less successful in modeling the PBL dynamics and hence tend to underestimate the magnitude of the vertical CO2 gradient during daytime.
4. Detectability of CO2 Signals
 To set the stage for the signal detection and optimal sampling analysis, we present here first the spatial pattern and spatiotemporal variability of the simulated atmospheric CO2 concentration in Europe and western Siberia during summer. We then investigate the detectability of signals and sampling needs.
4.1. Spatial Distribution and Variability of Simulated CO2
 The monthly averaged CO2 fields obtained from daytime (defined here as 1100–1700 local time) values for July 1998, as simulated by REMO and HANK, are shown in Figure 5 for a model level at approximately 300 m above ground within the atmospheric boundary layer. Over the continent CO2 concentrations show a west to east gradient with high values over western and central Europe and low concentrations over Siberia (region to the east of the Ural Mountains). While the main patterns are similar when using REMO or HANK, maxima and minima are higher for simulations with HANK, indicating less efficient ventilation of the near ground CO2 field by this model compared to REMO.
 The motivation for applying a daytime sampling is to avoid the high nighttime concentrations close to the surface that are difficult to model because the amount of accumulated CO2 in the shallow nocturnal boundary layer depends strongly on the local surface fluxes and on the stability of the PBL. Hence nighttime values are only representative for a small area around the measurement station. Afternoon values in contrast are distributed over a much larger fraction of the air column [Geels et al., 2006]. In order to be less dependent on the ability of the models to resolve local effects, we restrict most of our analysis to 1100–1700 local time.
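 The daytime selection can be expressed compactly in Python, as sketched below. Approximating local time by local solar time (UTC plus longitude divided by 15° per hour) and treating the window as 1100 ≤ t < 1700 are assumptions of this sketch, and the function names are hypothetical.

```python
def local_hour(utc_hour, lon_deg):
    """ASSUMPTION: local time approximated by local solar time."""
    return (utc_hour + lon_deg / 15.0) % 24.0


def daytime_mean(hourly, lon_deg, start=11.0, end=17.0):
    """Mean of the values whose local hour falls in [start, end).

    hourly: list of (hours since 0000 UTC of the first day, value) pairs.
    """
    selected = [
        value
        for hours, value in hourly
        if start <= local_hour(hours % 24, lon_deg) < end
    ]
    if not selected:
        raise ValueError("no samples inside the daytime window")
    return sum(selected) / len(selected)
```

 Applied to a month of hourly model output at a grid box, this gives the conditionally sampled monthly mean; averaging all hourly values instead gives the unselected mean used for comparison.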
 The contributions of biospheric and fossil fuel sources to the simulated CO2 for July 1998 are illustrated in a zonal cross section along a line between 45°N/20°W and 64°N/120°E (cf. red line in Figure 5) that extends from the North Atlantic across western and central Europe into western Siberia (Figure 6). This cross section is chosen such that CO2 concentration patterns characteristic for both densely populated areas in Europe as well as remote forest areas like Siberia are captured. The CO2 components are shown relative to a constant background value. The effect of a selective daytime sampling on the computation of monthly averages is illustrated in Figure 6 by the differences between solid and dotted lines. For the solid lines the hourly sampling is restricted to 1100–1700 local time and for the dotted lines the monthly average is computed from all hourly values without any conditional sampling. Fossil fuel emissions in Europe cause locally elevated CO2 concentrations reflecting emissions from major towns and industrial areas. The terrestrial biosphere signal is characterized by some spatial variability along the cross section superimposed on a mean negative west to east gradient between the Atlantic coast and western Siberia. The negative zonal gradient in biospheric CO2 at 300 m height above ground reflects the cumulative effect of the exposure of air to CO2 uptake by the vegetation under the predominantly westerly winds. At the lowest model level (at approximately 30 m) the biospheric CO2 component causes an increase of the unselected monthly average CO2 (dotted line). This indicates that the signal is on average dominated by the accumulation of respired CO2 in the shallow nocturnal boundary layer. Selecting only daytime values results in a decrease of the monthly mean at 30 m (solid line). 
At the 300 m level, which is located in the well-mixed boundary layer during day and in the residual layer above the shallow nocturnal boundary, the influence of the afternoon sampling is much smaller than near ground. This conclusion may be somewhat influenced by the limited realism of the simulation of vertical mixing (see section 3). Above the boundary layer (at approximately 3000 m) the contributions from the fossil fuel sources as well as from the biosphere are strongly diluted by atmospheric mixing and the contribution to monthly averages is less than 1 ppm. At this height the simulated mean concentrations are insensitive to the sampling protocol.
 The comparison between REMO and HANK simulations reveals generally larger signals near ground when using HANK. The signal caused by fossil fuel emissions is similar when using the two models in contrast to the biosphere signal, which in HANK is characterized both by a larger east-west gradient with a smaller minimum value over Siberia, as well as a much larger near-ground gradient. Both model results show a similar response to daytime sampling with the exception of the 30 m daytime values, which differ from the 300 m values in HANK but not in REMO. This may partially be a reflection of the higher vertical resolution of HANK near ground.
 The temporal variability of atmospheric CO2 is exemplified by the standard deviation of hourly values from the monthly average of the total CO2 concentration, which includes all contributions from fossil fuel emissions, ocean-atmosphere carbon exchange, and from the interaction with the terrestrial biosphere. This variability is the “noise” against which the signal of surface flux differences has to be detected. Without daytime sampling the standard deviation in the boundary layer shows very high values along the zonal cross section (Figure 7, dotted line). The large variability close to the surface is mainly caused by the strong diurnal cycle of the land biosphere component (cf. Figures 1 and 2). In the well-mixed layers above the surface layer (e.g., at 300 m), the variability of the atmospheric CO2 is much smaller because it is less affected by the temporal variability of the surface fluxes and mainly reflects the source-sink signal integrated along the path of the air mass. As expected, afternoon sampling (Figure 7, solid line) strongly reduces the standard deviation close to the surface (at 30 m) where the diurnal cycle is most pronounced. Above the boundary layer the standard deviation is small corresponding to the small spatial structure in the monthly average.
 Our estimate of variability, or noise, is limited by the spatial resolution of the surface fluxes and the atmospheric models, because the temporal variability in atmospheric CO2 that is caused by subgrid variations in surface fluxes is not resolved. We therefore have to keep in mind that our analysis probably uses a rather low estimate of the noise level.
 To illustrate the spatial coherence of the simulated atmospheric CO2, REMO time series at each model grid box are correlated with those at the surrounding grid boxes. The resulting correlation length is defined as the mean radius around each grid box within which the correlation coefficients are higher than 0.7. Figure 8a displays this correlation length in July 1998 for a model level in the boundary layer (at 300 m). Again, selective afternoon sampling was used to exclude the otherwise dominant influence of the diurnal cycle on the correlation of the time series in the lower boundary layer. Areas with far-reaching correlation, like in the North Atlantic or the Polar Ocean, are characterized by large-scale coherent changes of CO2. In these areas the spatial patterns of the surface fluxes are relatively smooth and only slowly varying. The large-scale concentration patterns are transported into the model area from the boundaries where the smooth CO2 distributions from the TM3 results are prescribed. Over the continent the spatial and temporal variability of sources and transport is much larger. This results in a small correlation length of 170–260 km of the CO2 patterns over central Europe. In western Siberia, which is under the influence of a stable high-pressure system in July 1998 with only very few synoptic disturbances passing by, the correlation length is larger (on the order of 380–490 km). The spatial coherence of the underlying afternoon biospheric fluxes from TURC simulations is slightly higher, especially in Europe (Figure 8b). This indicates that the coherence of CO2 concentrations is reduced by both turbulent transport in the atmosphere and high spatial variability of fossil fuel emissions. A similar analysis of simulations for a winter month shows that the correlation length of the atmospheric CO2 patterns over the continent is generally larger in winter with maximum values of 1000 km in western Siberia.
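The definition of the correlation length can be illustrated with a minimal sketch. It assumes one particular reading of "mean radius", namely the radius of a circle whose area equals the total area of all grid boxes correlated above the threshold with the reference box. The function names, the dictionary layout, and the absence of a contiguity check are illustrative assumptions, not the procedure actually applied to the REMO output.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_length(series, center, cell_km, threshold=0.7):
    """Equivalent-circle radius (km) of the area correlated with `center`.

    series  : dict mapping a grid index (i, j) to its afternoon CO2 series
    center  : grid index of the reference box
    cell_km : edge length of a (square) grid box in km
    Assumption: every box above the threshold counts, contiguous or not,
    and the reference box itself is included.
    """
    ref = series[center]
    n_cells = sum(1 for ts in series.values() if pearson(ref, ts) > threshold)
    area = n_cells * cell_km ** 2
    return math.sqrt(area / math.pi)

# Toy example with five grid boxes of 55 km edge length (hypothetical values).
toy = {
    (0, 0): [1, 2, 3, 4, 5],
    (0, 1): [2, 4, 6, 8, 10],   # perfectly correlated with the reference
    (0, 2): [0, 1, 2, 3, 4],    # perfectly correlated with the reference
    (1, 0): [1, -1, 1, -1, 1],  # uncorrelated
    (1, 1): [5, 4, 3, 2, 1],    # anticorrelated
}
length_km = correlation_length(toy, (0, 0), cell_km=55.0)
```

In this toy case three boxes (including the reference box) exceed the threshold, so the result is the radius of a circle with an area of three grid boxes.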
 Given our flux fields, we find spatial correlation length scales of PBL concentrations to be on the order of a few hundred kilometers. This is in rough agreement with the findings of Gerbig et al.  based on PBL-free troposphere CO2 profile measurements across the USA during the COBRA (CO2 Budget and Rectification Airborne) experiment in summer 2000. The analysis of the spatial coherence indicates that over continental areas the horizontal distance between stations should be no more than 300–500 km in order to allow a detection of carbon sources and sinks from measurements of the CO2 concentration in the PBL.
4.2. Signal-to-Noise Ratio
 The signal-to-noise ratio is used as a measure of the detectability of a specific CO2 signal given the variability introduced by all sources and transport processes. We investigate here the signal that an enhanced biospheric activity would cause in July 1998. The magnitude of the biospheric fluxes in Europe and western Siberia is increased by 20%, which corresponds for the REMO domain to an additional sink of −0.063 PgC/month in July spatially and temporally distributed according to the fluxes themselves. This results in a partitioning of the additional sink in Europe and western Siberia of 0.038 and 0.025 PgC/month in July, respectively. A 20% difference is within the year-to-year variability of the terrestrial CO2 fluxes in Europe according to the recent global inversion study of Rödenbeck et al. . Monthly averages as well as standard deviation are computed from daytime (1100–1700 local time) values only. In Figure 9 REMO and HANK simulation results of the spatial pattern of signal, noise, and signal-to-noise ratio are shown. Within the planetary boundary layer (e.g., at 300 m) the signal of the 20% biospheric flux difference is strongest over southern Europe and western Siberia when using REMO, while there is also a maximum over Scandinavia when using HANK. The high temporal variability in the European area results in low signal-to-noise ratios. Large areas with signal-to-noise ratios higher than 0.4 are only found over western Siberia. In REMO maximum values are located over mountain areas in southern Siberia, suggesting that a biospheric signal could best be detected over areas where the local variability is small and at the same time the signal is advected from the nearby vegetation. While the HANK simulations differ from REMO regarding the high concentrations and noise over Scandinavia the resulting horizontal distribution of the signal-to-noise ratio is very similar.
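As a minimal sketch of how such a ratio could be computed at a single grid box, the following takes the signal as the difference of the daytime monthly means between a perturbed and a control run, and the noise as the standard deviation of the daytime values of the control run. The function names, the use of the population standard deviation, and the half-open hour window (1100 included, 1700 excluded) are assumptions made for illustration.

```python
import statistics

def daytime(values, hours, start=11, end=17):
    """Keep values whose local hour h satisfies start <= h < end (assumed window)."""
    return [v for v, h in zip(values, hours) if start <= h < end]

def signal_to_noise(control, perturbed, hours):
    """Ratio of the daytime monthly-mean difference to the daytime variability.

    control   : hourly total CO2 (ppm) from the reference simulation
    perturbed : hourly total CO2 (ppm) with the modified biospheric fluxes
    hours     : local hour of day for each value
    """
    c = daytime(control, hours)
    p = daytime(perturbed, hours)
    signal = statistics.fmean(p) - statistics.fmean(c)
    noise = statistics.pstdev(c)  # assumption: population standard deviation
    return abs(signal) / noise

# Two days of hypothetical hourly data: the control alternates between
# 400 and 401 ppm, and the perturbation lowers every value by 0.2 ppm.
hours = list(range(24)) * 2
control = [400.0 + (h % 2) for h in hours]
perturbed = [v - 0.2 for v in control]
stn = signal_to_noise(control, perturbed, hours)
```

Here the daytime standard deviation is 0.5 ppm and the signal is 0.2 ppm, giving a ratio of 0.4.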
 The cross section along the line between 45°N/20°W and 64°N/120°E (cf. red line in Figure 9) shows an example of the vertical structure of the signal-to-noise ratio in the lower troposphere (0–3000 m) (Figure 10). The signal as well as the variability of the total concentration is strongest in the boundary layer. The main features of the vertical structure are generally similar in REMO and HANK; however, there is a larger uptake signal to the east of the Ural Mountains when using HANK compared to REMO. In both model simulations the signal is comparably small over western Europe. Together with the high noise level, this results in very low signal-to-noise ratios. The model results suggest that in July 1998 the signal would be most difficult to detect to the west of the 40°E meridian. The highest signal-to-noise ratio along this cross section is found in the region east of 80°E within the lowest 1000 m above ground. Between 1000 and 3000 m height in the atmosphere the pattern of the signal-to-noise ratio is slightly smoother than below. In this height range the detectability is less influenced by the local surface fluxes and therefore less dependent on the exact location of a CO2 concentration measurement.
 Although the overall features of the signal-to-noise ratio are quite similar in both model simulations, there are also some differences between REMO and HANK in the horizontal and vertical distribution. While HANK simulates low signal-to-noise ratios close to the surface and maxima around 500 m in the region east of 60°E, REMO results do not show a strong vertical gradient in the lowest 1000 m anywhere along the cross section as long as daytime sampling is applied. Since both models use the same set of surface fluxes, these differences are caused by the different transport characteristics of the models. The comparison indicates that caution is necessary in the interpretation of the results because they depend to some extent on modeled transport and thus its biases.
 Similar simulations have also been performed for a winter month, when the biosphere is acting as carbon source. The analysis shows that the patterns of the detectability differ substantially for different periods in the year. In the REMO simulations for December 1998 the signal-to-noise ratio for a 20% increased respiration is generally small, with maximum values of 0.3 over Siberia between 55°N and 65°N. The weaker signal-to-noise ratio in winter is the result of a smaller signal in combination with a similar noise level. In general the uptake signal in summer is stronger and confined to a shorter time period than the smoother respiration signal during the rest of the year.
Kjellström et al.  used simulations with the regional tracer transport model MATCH together with TURC biospheric surface fluxes and found a similar pattern of the signal-to-noise ratio over Eurasia for July 1998. For a 20% difference in the biospheric fluxes they found the highest values in a model level at approximately 1000 m height. The signal-to-noise ratio in this level showed maximum values of more than 0.8 in western Siberia between 45°N and 60°N. Since they did not apply selective daytime sampling, their results are not directly comparable with ours, especially concerning the vertical distribution of the signal-to-noise ratio. Daytime sampling would reduce the variability in the lower part of the PBL and hence result in higher signal-to-noise ratios. That they found slightly higher signal-to-noise ratios at 1000 m over Siberia is probably due to differences in the simulated transport, because their biospheric fluxes are very similar to the version we use in our study.
4.3. Sampling Requirements Inferred From Model Simulations
 On the basis of the signal-to-noise ratio and the spatial coherence patterns presented in the previous section, we now investigate sampling strategies to detect surface flux signals. The simulated concentrations are assumed to represent in a statistical sense the population from which the measurements (=samples) are taken. The signal-to-noise ratio STN, in the version presented above, relates the expected value of a given signal S (=monthly mean) to the dispersion within the population σ (=standard deviation), representing the background noise. The population is defined for each grid box and consists of the hourly values in the time series. If there is one measurement during the measurement period of 1 month, then a signal is detectable at the 1-σ level if S > σ or STN = S/σ > 1. If there are n measurements and if the measurements were independent, the variance of the sample mean would decrease with the number of measurements n according to $\mathrm{Var}(\bar{x}) = \sigma^2/n$, with σ² the population variance, and a signal would be detectable at the 1-σ level if $S > \sigma/\sqrt{n}$ or $\mathrm{STN} \cdot \sqrt{n} > 1$. However, the measurements are temporally correlated, which results in the sample being effectively smaller than n. Following von Storch and Zwiers  in such a case, an equivalent sample size n′ can be estimated according to
$$n' = \frac{n}{1 + 2\sum_{k=1}^{n-1}\left(1 - \frac{k}{n}\right)\rho(k)} \qquad (1)$$
with ρ(k) the autocorrelation function at lag k.
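The effect of temporal correlation on the sample size can be sketched numerically. The example assumes the standard form of the equivalent sample size, n′ = n / (1 + 2 Σ (1 − k/n) ρ(k)) with the sum running over lags k = 1, ..., n − 1, and takes the autocorrelation values as given input. The function name and the input convention are illustrative.

```python
def equivalent_sample_size(n, rho):
    """Equivalent sample size n' for n correlated measurements.

    n   : number of measurements
    rho : autocorrelation values, rho[k - 1] being the value at lag k;
          lags beyond len(rho) are treated as zero (assumption)
    """
    if len(rho) > n - 1:
        raise ValueError("at most n - 1 lags can be used")
    weighted = sum((1.0 - k / n) * r for k, r in enumerate(rho, start=1))
    return n / (1.0 + 2.0 * weighted)

# Hypothetical case: 210 afternoon values (30 days with 7 hourly values each)
# and an autocorrelation that decays as 0.5**k.
n = 210
rho = [0.5 ** k for k in range(1, n)]
n_eff = equivalent_sample_size(n, rho)
```

With uncorrelated measurements (all ρ(k) = 0) the function returns n itself. With the decaying autocorrelation assumed here, only about one third of the 210 values count as effectively independent.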
 The variance of the sample mean must then be computed using the equivalent sample size n′. It should be noted that the approach to increase the signal-to-noise ratio by simply increasing the number of samples at a fixed location is thus effectively limited due to an increasing temporal autocorrelation, and therefore there is a saturation in the return of high-frequency sampling. The higher-order terms of the sum in equation (1) are poorly estimable because of the increasing uncertainty in the autocorrelation function at longer time lags [Thiébaux and Zwiers, 1984]. An alternative approach is to approximate the time series by an autoregressive process and use the corresponding autocorrelation function in the estimation of n′ [von Storch and Zwiers, 1999]. We use here a first-order autoregressive process to represent the afternoon CO2 time series. For a first-order autoregressive process the following approximation
$$1 + 2\sum_{k=1}^{n-1}\left(1 - \frac{k}{n}\right)\rho(k) \approx \frac{1 + \rho(1)}{1 - \rho(1)} \qquad (2)$$
holds [cf. von Storch and Zwiers, 1999]. This expression for the calculation of the ratio n/n′ effectively truncates higher-order terms in the left-hand-side sum and thus avoids noisy features (confirmed by calculating n′ in both ways).
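A minimal sketch of the autoregressive shortcut: the lag-1 autocorrelation ρ(1) is estimated from the time series, and the equivalent sample size follows from n′ ≈ n (1 − ρ(1)) / (1 + ρ(1)), the usual large-n result for a first-order autoregressive process. The estimator below treats consecutive afternoon values as equally spaced, which ignores the overnight gaps; this simplification and the function names are assumptions.

```python
def lag1_autocorrelation(x):
    """Sample autocorrelation at lag 1 (lag-1 covariance sum over variance sum)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    var_sum = sum(d * d for d in dev)
    cov_sum = sum(dev[t] * dev[t + 1] for t in range(n - 1))
    return cov_sum / var_sum

def equivalent_sample_size_ar1(x):
    """n' ~ n (1 - rho1) / (1 + rho1) for a first-order autoregressive process."""
    rho1 = lag1_autocorrelation(x)
    return len(x) * (1.0 - rho1) / (1.0 + rho1)

# Hypothetical short series with a steady trend, hence positive autocorrelation.
series = [1.0, 2.0, 3.0, 4.0, 5.0]
rho1 = lag1_autocorrelation(series)
n_eff = equivalent_sample_size_ar1(series)
```

For this toy series ρ(1) = 0.4, so the five values correspond to only about two effectively independent ones.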
 Besides the variability in CO2 signals, instrument precision also puts a limit on the signal detectability. Assuming 0.5 ppm as an upper limit of the precision of atmospheric CO2 measurements (inferred from comparisons of simultaneous measurements [e.g., Levin et al., 2002]), we can estimate the limit of a detectable monthly mean difference in the CO2 concentration to be 0.5 ppm/$\sqrt{n}$. From Figure 10a it appears that in the free troposphere, where the signals are strongly diluted, instrument precision limits the detectability if only few measurements are available. The instrument precision is therefore included in the signal-to-noise ratio by defining the noise as the quadratic sum of variability and instrument precision.
 Accordingly, the signal-to-noise ratio for a 20% difference in biospheric activity in Europe and western Siberia as presented in the previous section (cf. Figure 9c) is modified by including instrument precision and also multiplied by the square root of the number of effectively independent observations. The scaled signal-to-noise ratio $\mathrm{STN} \cdot \sqrt{n'}$ is shown in Figure 11 for a layer within the PBL (300 m) and along the cross section from 45°N/20°W to 64°N/120°E, again assuming hourly afternoon sampling every day of the month. Both models predict a similar pattern at the 300 m level, but they deviate in the vertical distribution. In HANK, taking into account the temporal autocorrelation further reduces the detectability in the boundary layer east of 60°E.
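The scaling described above can be written as a short sketch. It reads "quadratic sum" as the square root of the sum of squares of the variability σ and the instrument precision, and multiplies the resulting ratio by the square root of the equivalent sample size n′. The function name and the example numbers are illustrative assumptions.

```python
import math

def scaled_signal_to_noise(signal, sigma, n_eff, precision=0.5):
    """Signal-to-noise ratio including instrument precision, scaled by sqrt(n').

    signal    : monthly mean CO2 difference (ppm)
    sigma     : standard deviation of the daytime CO2 values (ppm)
    n_eff     : equivalent number of independent measurements n'
    precision : instrument precision (ppm); 0.5 ppm is the upper limit
                assumed in the text
    """
    noise = math.sqrt(sigma ** 2 + precision ** 2)  # assumed quadrature sum
    return abs(signal) / noise * math.sqrt(n_eff)

# Hypothetical free-troposphere case: small signal, small variability.
free_troposphere = scaled_signal_to_noise(signal=0.2, sigma=0.3, n_eff=40)
# Hypothetical boundary-layer case: larger signal, larger variability.
boundary_layer = scaled_signal_to_noise(signal=1.0, sigma=3.0, n_eff=40)
```

Values above 1 indicate detectability at the 1-σ level under these assumptions. In the free-troposphere example the instrument precision dominates the noise term, whereas in the boundary-layer example it is nearly negligible.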
 The scaled signal-to-noise ratio $\mathrm{STN} \cdot \sqrt{n'}$ in Figure 11 can be translated either into a minimum sampling frequency needed to detect a given signal or into a minimum signal that would still be detectable at a 1-σ significance level with a given sampling frequency. For the detection of the 20% difference in biospheric fluxes from observations during daytime, a signal-to-noise value of 1 defines the limit at which all (i.e., hourly) measurements are needed to detect the signal at a 1-σ significance level. At higher values the sampling frequency can be reduced while reaching the same detection limit. For example, a signal-to-noise value of 3 would still allow the detection with 1/9 of the measurements, i.e., approximately one measurement per day during the preselected daytime period (1100–1700 local time). Accordingly, signal-to-noise values of 5 or 7 would imply a minimum sampling frequency of approximately twice or once per week, respectively. Both models agree in the general pattern that in central Europe hourly observations at ground stations and in the PBL are needed to allow the detection of possible changes; daily values do not suffice. In western Siberia the sampling can be less frequent, but at least daily observations are needed. The simulations suggest that above the PBL (at 2000–3000 m) a lower sampling frequency could be accommodated, but weekly observations, as are available from aircraft profiles, might still not always suffice. Conversely, the detectability of a signal depends on the strength of the source changes. From Figure 11 we can infer that in some regions (like parts of Siberia) observations with high temporal sampling frequency would still allow the detection of much smaller signals; e.g., from hourly observations during daytime even a signal of 10 or 5% difference in the biospheric sink in July would still be detectable in areas with a signal-to-noise value of 2 or 4, respectively.
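The arithmetic behind these translations can be sketched as follows. Because the scaled ratio grows with the square root of the number of measurements, a value v permits a reduction to 1/v² of the measurements, and because the signal scales linearly with the flux difference, the smallest detectable difference is 20%/v. The functions and the assumption of seven hourly values per afternoon (1100 to 1700 inclusive) are illustrative.

```python
def required_fraction(scaled_stn):
    """Fraction of the hourly daytime measurements needed for 1-sigma detection."""
    if scaled_stn <= 0:
        raise ValueError("scaled signal-to-noise ratio must be positive")
    return min(1.0, 1.0 / scaled_stn ** 2)

def measurements_per_week(scaled_stn, per_day=7):
    """Minimum number of measurements per week (assumes `per_day` hourly values)."""
    return 7 * per_day * required_fraction(scaled_stn)

def min_detectable_percent(scaled_stn, reference_percent=20.0):
    """Smallest flux difference (percent) detectable with full hourly sampling."""
    return reference_percent / scaled_stn

for value in (3, 5, 7):
    print(value, round(measurements_per_week(value), 2))
```

Under these assumptions a value of 3 corresponds to roughly 0.8 measurements per day, a value of 5 to about two per week, and a value of 7 to one per week. Values below 1 mean that even full hourly sampling does not reach the 1-σ limit, which the capped fraction does not express by itself.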
 Our model results do not clearly indicate a preferred height level for the detection of large-scale surface flux differences. However, as a consequence of the existing model deficiencies in resolving the structure of vertical gradients in CO2 profiles (cf. section 3), the predictions of our study on where to distribute sensors in the air column have to be interpreted with some caution.
 Next, we investigate whether the described signals could be detected with the current network of CO2 measurement stations in Europe and western Siberia. This network consists of a combination of stations belonging to several global networks (NOAA/CMDL, CSIRO), national agencies, and stations installed in the framework of special projects (AEROCARB, CARBOEUROFLUX, TCOS-Siberia, and EUROSIBERIAN CARBONFLUX). The number of stations is still increasing, and at the moment the network includes approximately 40 ground stations, of which approximately 20 provide continuous (hourly) and approximately 10 weekly high-precision measurements; another 10 stations provide continuous, but less well calibrated, measurements. At least three of the stations are towers with continuous measurements at several heights up to 200 m. Monthly aircraft profiles are currently carried out at seven sites. For each station the signal-to-noise ratio $\mathrm{STN} \cdot \sqrt{n'}$ computed using the REMO simulations is evaluated at four height levels, corresponding to 30 m, 300 m, 1000 m, and 3000 m, and is represented by a bar in Figure 12. Here we split up the signal into signals emanating from Europe (west of 60°E) and from western Siberia (east of 60°E) in order to separately investigate their detectability. Those stations and levels where a 20% difference in the biospheric activity in Europe or western Siberia could be detected at the 1-σ level are shown in green and dark blue, respectively. The light blue color indicates that both signals would be detectable at the same place. Where the difference in the monthly mean is not significant, the bar remains black. In this evaluation we do not specify the type of each station (surface, tower, or aircraft), and we also assume an hourly sampling at all levels. Assuming a 20% difference in surface fluxes that extends all over Europe, it will be possible to detect this signal at most of the surface stations and towers in the central part of the network (Figure 12, upper panel). 
In measurements above the PBL the signal will be detectable downwind of sources and even at stations east of 60°E. Reducing the sampling frequency to only one measurement per day at 1000 m and 3000 m would significantly reduce the number of sites, which permit signal detection. The present network includes only five stations east of 60°E. A 20% difference in the biospheric fluxes in this area will be detectable at most of these stations. A comparison with the detectability of a 20% increase of fossil fuel emissions (Figure 12, lower panel) shows that measurements will be affected by both signals at most European stations except for the stations in southwestern Europe. Figure 12 presents the artificial situation where at each site measurements are available within and above the PBL. At present most of the stations measure the CO2 close to the surface (corresponding to the lowest level in Figure 12). With these stations combined in a network it would be possible to detect large-scale changes in the surface fluxes like the ones we prescribe in our study. However, it is obvious from Figure 12 that in some areas differences confined to small regions could easily go undetected. The picture clearly shows the need for an extension of the network to the east. In reality surface stations are much more influenced by local conditions than in the model simulations where they are represented by a grid cell with a size of 55 km × 55 km and 90 km × 90 km in REMO and HANK, respectively. The underestimation of the heterogeneity of surface fluxes probably results in an overestimation of the signal-to-noise ratios and therefore restricts our conclusions concerning individual surface stations.
5. Summary and Conclusions
 From our model study we infer some general guidelines for sampling the continental troposphere for the purpose of constraining carbon sinks in general and land sinks in particular. Comparisons of model simulations with near-ground observations of summertime CO2 concentration in Europe suggest that nighttime concentrations are predicted only with very large uncertainty, which renders them useless for a quantification of regional-scale carbon sources and sinks that relies on atmospheric transport models to interpret the signals. The influence of nighttime trapping of respired CO2 decreases with altitude and becomes negligible in the free troposphere. The remaining summary restricts itself to results based on selective daytime sampling (1100 to 1700 local time).
 Our simulation results do not indicate a preferred location in the vertical for sampling of the continental CO2 field when taking into account autocorrelation of the signals. Near-surface, mixed layer, and free-troposphere measurements lead approximately to the same uncertainty range of inferences on the magnitude of sources and sinks. However, because signals in the free troposphere are quite small, precision and accuracy of the CO2 measurements will also influence the detectability of sources and sinks. As furthermore local conditions, which are not properly represented in our modeling framework, affect near-ground measurements, sampling the mixed layer somewhat above ground (a few 100 m) may be the most promising strategy.
 With regard to sampling in time, the simulations indicate that carbon sources and sinks can be detected and quantified with frequent sampling of the continental troposphere. However, detecting the assumed additional sink of 0.06 PgC/month in Eurasia generally requires at least daily sampling and, for most locations, more frequent (effectively continuous) measurements.
 Our analysis is limited in its capability to make predictions on the necessary spatial density of sampling stations. Nevertheless, we can consult the correlation length of the simulated atmospheric CO2 field. The results indicate that a spatial density of stations with mesh size on the order of a few 100 km is necessary for detecting continental carbon sources and sinks at the one-sigma uncertainty level.
 In summary, we conclude that in order to detect regional-scale changes in the surface fluxes on continents from atmospheric CO2 concentration measurements, a network of stations is needed that permits sampling of the PBL several times per day, with a distance between sampling stations not exceeding a few hundred kilometers. A possible realization is an array of continuously measuring tower sites located approximately every 300–500 km across the continent, ideally complemented by frequent vertical aircraft profiles.
 We would like to thank all participants of the projects AEROCARB and TCOS-Siberia for useful discussions and for providing their measurements and valuable advice. Similarly, we would like to thank CARBODATA for providing CO2 records that have been measured as part of the EUROFLUX program. MG would like to thank P. Hess and A. Klonecki for providing the HANK model and for help with the model. We also thank P. Ciais for constructive comments and S. Lafont who provided the biospheric fluxes from the TURC model. ECMWF and NCEP are acknowledged for providing meteorological analysis and reanalysis data. The work was partially funded by the European Commission under contracts EVK2-CT-1999-00013 (AEROCARB) and EVK2-CT-2001-00131 (TCOS-Siberia).