A regional variational inverse modeling system for the estimation of European biogenic CO2 fluxes is presented. This system is based on a 50 km horizontal resolution configuration of a mesoscale atmospheric transport model and on the adjoint of its tracer transport code. It exploits hourly CO2 in situ data from 15 CarboEurope-Integrated Project stations. Particular attention in the inversion setup is paid to characterizing the transport model error and to selecting the observations to be assimilated as a function of this error. Comparisons between simulations and data of CO2 and 222Rn concentrations indicate that the model errors should have a standard deviation of less than 7 ppm when simulating the hourly variability of CO2 at low altitude during the afternoon and evening or at high altitude at night. Synthetic data are used to estimate the uncertainty reduction for the fluxes using this inverse modeling system. The improvement brought by the inversion to the prior estimate of the fluxes, for both the mean diurnal cycle and the monthly to synoptic variability in the fluxes and associated mixing ratios, is checked against independent atmospheric data and eddy covariance flux measurements. Inverse modeling is conducted for summers 2002–2007, which should reduce the uncertainty in the biogenic fluxes by ∼60% during this period. The trend in the mean flux corrections between June and September is to increase the uptake of CO2 by ∼12 gC m−2. Corrections at higher resolution are also diagnosed that reveal some limitations of the underlying prior model of the terrestrial biosphere.
 Measurements of atmospheric CO2 mixing ratio have been widely used along with atmospheric transport models to infer statistical information about surface CO2 fluxes based on data assimilation or inverse modeling techniques [Enting, 2002]. In particular, atmospheric inversions are used to yield statistical corrections to the estimate of the surface fluxes from ecosystem exchange models, oceanic and anthropogenic flux inventories. Their applications have improved the quantification and understanding of natural carbon sources and sinks at global scale [e.g., Tans et al., 1990; Rayner et al., 1999; Gurney et al., 2002; Rödenbeck et al., 2003; Rayner et al., 2008].
 However, global inversions still bear large uncertainties at regional scale [Baker et al., 2006], which strongly limits the projection of the future climate. The uncertainty in biogenic flux in Europe is illustrated by the differences in the estimates of the land natural fluxes (over an area of 1.05 × 10⁷ km²) during 2001–2004 given by the systems of Peylin et al. (−0.3 PgC y−1 ∼ −29 gC m−2 y−1; negative and positive values corresponding to CO2 uptake and sources respectively), of Rödenbeck et al. (−0.86 PgC y−1 ∼ −82 gC m−2 y−1) and of Peters et al. (−0.12 PgC y−1 ∼ −11 gC m−2 y−1). These differences have the same order of magnitude as the anthropogenic emissions of ∼1 PgC y−1 for the European Union 25 member states (EU-25) [Ciais et al., 2010a]. They are mainly due to errors in the transport model or in the setup of inverse modeling systems and are enhanced by the lack of data for assimilation.
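The area-normalized values quoted above follow from dividing each total flux by the land area. A minimal sketch of this arithmetic is given below; the function and constant names are illustrative and are not taken from any of the cited systems.

```python
PG_TO_G = 1.0e15     # grams per petagram
KM2_TO_M2 = 1.0e6    # square metres per square kilometre

def total_to_areal_flux(total_pgc_per_year, area_km2):
    """Convert a total flux in PgC/yr into a mean areal flux in gC/m2/yr."""
    return total_pgc_per_year * PG_TO_G / (area_km2 * KM2_TO_M2)

EUROPE_LAND_AREA_KM2 = 1.05e7  # land area quoted in the text

# Totals quoted in the text for the three global inversion systems (PgC/yr).
for total in (-0.3, -0.86, -0.12):
    areal = total_to_areal_flux(total, EUROPE_LAND_AREA_KM2)
    print(f"{total:+.2f} PgC/yr -> {areal:+.0f} gC/m2/yr")
```

The rounded results match the values given in the text (about −29, −82 and −11 gC m−2 y−1).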
 There are even larger uncertainties at higher spatiotemporal scales because fluxes are influenced by strongly heterogeneous climatic, biogenic and anthropogenic drivers in Europe [Peters et al., 2009]. A better knowledge of the variability in the European biogenic fluxes would help characterize the responses of the ecosystems to climate variability and their feedback on climate. Extreme climatic conditions emphasize these responses e.g. during the heat wave of summer 2003 when the summer anomaly of Net Ecosystem Exchange (NEE) was comparable to a year of anthropogenic emissions [Ciais et al., 2005].
 Compared to other continental atmospheric networks, a relatively dense CO2 atmospheric concentration and flux observation network has been developed in Europe, within the framework of CarboEurope-Integrated Project (CE) [Ciais et al., 2010a, 2010b, 2010c; Luyssaert et al., 2010; Schulze et al., 2010], to improve the quantification and understanding of the biogenic and oceanic carbon sinks. It should expand in the future in the framework of the Integrated Carbon Observation System (ICOS, http://www.icos-infrastructure.eu/). Much of the European concentration data collected over the last decade are in situ continuous measurements, providing accurate hourly data at ∼10–15 stations (depending on the year considered). The application of inverse modeling to European fluxes should thus be a relatively good assessment of the skill of this approach for the estimation of regional fluxes.
 This skill depends strongly on the ability of the transport models to reproduce the variability of observations when they are driven by realistic fluxes [Gerbig et al., 2003a, 2003b]. This ability is limited by the description of CO2 measurements, surface fluxes and surface topography using average values at the model spatial and temporal resolution, i.e. by representation errors. It is also limited by the use of imperfect forcing and parameters for transport at the model resolution. This explains why recent global coarse resolution inversions assimilate daily, weekly or monthly rather than hourly averages of data (e.g., Rivier et al. and the setup of most of the inversions used for the Regional Carbon Cycle Assessment and Processes project (RECCAP, http://www.globalcarbonproject.org/reccap/products.htm)), and why they still yield strong uncertainties on European NEE, even though they use the CE data. Carouge [2010a, 2010b] showed that the reduction in flux uncertainties that could be expected from the assimilation of in situ daily CE data with a global hydrostatic transport model zoomed over Europe (with a ∼60 km horizontal resolution) is not significant at scales smaller than 1000 km and 10 days. Due to the lack of confidence in the corrections inverted at the model resolution, and due to the need to extrapolate the information on fluxes that is constrained only in the vicinity of the observation locations (while observation networks are relatively sparse), the corrections to fluxes are generally aggregated at lower resolution. Peters et al. also used a global configuration zoomed over Europe (with a 1° horizontal resolution) and they applied such an aggregation by inverting fluxes homogeneously in each of their 2 European climate zones and for each of their 19 Plant Functional Types (PFTs, which are used in biosphere models to sort biomes into synthetic classes with specific modeling parameters), every week. This provides 30 degrees of freedom per week. However, the aggregation of corrections introduces some errors in the inversion even at the scale of aggregation [Kaminski et al., 2001].
 As part of the CE Regional Experiment Strategy (CERES), Lauvaux et al. [2008, 2009a, 2009b] developed an inversion system based on the Mesoscale Non Hydrostatic model (MésoNH) coupled to the Lagrangian Particle Dispersion Model (LPDM). They applied the transport model at 8 km resolution over the CERES domain in the south west of France. They showed that this transport model coupled to a biosphere model could reproduce the variability of hourly in situ data from two stations in the region with enough accuracy for the mesoscale inversion system to produce a promising 30% uncertainty reduction in fluxes at 8 km and 6 day scales over distances of a few hundred km from the atmospheric measurement stations. Intercomparisons [Law et al., 2008; Patra et al., 2008] confirmed the greater ability of high resolution transport models compared to global low resolution transport models to fit continental observations. As a consequence, European mesoscale inverse modeling systems are being developed [Rödenbeck et al., 2009].
 In that context, Aulagnier et al.  and Aulagnier  set up a European 50 km horizontal resolution configuration of the mesoscale transport model CHIMERE [Schmidt et al., 2001], in order to assimilate CE atmospheric data for high resolution flux inversion. They used NEE from the carbon-water-energy model Organizing Carbon and Hydrology In Dynamic Ecosystems process-based model (ORCHIDEE) [Krinner et al., 2005] to provide prior fluxes to this configuration. Aulagnier  checked the skill of the transport model for capturing the variability in the observations. Aulagnier et al.  studied the modeled versus observed decadal trends and interannual variability in atmospheric CO2 between 2001–2006. They discriminated the role of transport versus flux trends in driving trends in the concentrations. Aulagnier  initiated the development of a variational data assimilation system based on the adjoint of CHIMERE [Pison et al., 2007] and on the methods of Chevallier et al. [2005, 2007]. Variational data assimilation is particularly efficient in dealing with many observations and high resolution flux inversions. Continuing the development of this mesoscale flux inversion system, this study aims at characterizing the reliability of its results and its potential to improve the knowledge on European NEE by assimilating hourly CE observations.
 Even though the model transport and representation errors, also called “the model error” hereafter (i.e. the error in the concentrations when using perfect fluxes), are strongly reduced when using increasing resolution, they are still significant at 50 km resolution when compared to the errors in the concentrations due to errors in the fluxes [Lauvaux et al., 2009a; Dolman et al., 2009]. A good estimate of the model error is necessary to avoid over-fitting the data, to distinguish between these errors and the errors from the fluxes, and thus to avoid projecting them into the fluxes during the inversion. Following Chevillard et al. [2002a], the estimation of the model error is based here on comparisons of simulations and data of 222Rn (hereafter Radon), a short-lived radioactive gas emitted by soils.
 In this study, the confidence in the inverted fluxes is characterized using different types of evaluation: (1) The flux uncertainty reduction from inversion is estimated using a Monte Carlo approach with the assimilation of synthetic data [Chevallier et al., 2007]. (2) The temporal variability in the fluxes and concentrations obtained when assimilating real data is checked against independent atmospheric CO2 data and eddy covariance surface flux measurements [Lauvaux et al., 2009b]. (3) The monthly to interannual variability and the spatial distribution of the inverted NEE in summer are analyzed for the period 2002–2007. The corrections derived by the inversion in summer should help characterize the errors in the fluxes from ORCHIDEE. Focusing on the summer period is a sensible way to study the skill of the inverse modeling for the estimate of NEE at high resolution, even though the variability caused by the seasonal cycle cannot be analyzed. The response of the NEE to strong temperature anomalies can be analyzed through the impact of the heat wave in 2003.
Section 2 describes the components of the inverse modeling system, section 3 the estimate of the flux uncertainty reduction from inversion, and section 4 the comparison of the inverted fluxes and associated concentrations to assimilated and independent data. Section 5 shows a series of diagnostics on the summer NEE from ORCHIDEE and from the inversion. Section 6 summarizes the most important results and concludes the study.
2. Mesoscale Inverse Modeling Setup
 The inversion is applied to correct the European NEE at adapted temporal and spatial resolutions of aggregation (cf. section 2.3), denoted $f$, given its prior estimate from ORCHIDEE $f^b$ and using hourly CE in situ data, denoted $y^o$. Based on the Bayesian framework, the aim of data assimilation is to provide the conditional probability of the “true $f$”, denoted $f^t$, given $f^b$ and $y^o$: $p(f^t|f^b, y^o) = p(y^o|f^t, f^b)\,p(f^t|f^b)/p(y^o|f^b)$. The atmospheric transport model CHIMERE, denoted $M$, is used to project the $f$-space into the $y$-space of the observations. $y^o$ is independent of $f^b$, so $p(y^o|f^t, f^b)$ is derived from the so-called observation error $p(y^o - M(f^t)|f^t)$, whose estimate should account for measurement errors, model representation and transport errors, and aggregation errors (when corrections are aggregated at a resolution lower than that of the model).
 Assuming that statistical errors on $f^b$, i.e. $p(f^t - f^b|f^b)$, and on $y^o$, i.e. $p(y^o - M(f^t)|f^t)$, are Gaussian and unbiased, with respective covariance matrices $\mathbf{B}$ and $\mathbf{R}$ that are known, and assuming that $M$ is linear ($M = \mathbf{M}$, which is the case for CO2 atmospheric transport), $p(f^t|f^b, y^o)$ is actually a Gaussian distribution $N(f^a, \mathbf{A})$ and data assimilation solves $f^a = \arg\min J$, where $J(f) = (f - f^b)^T \mathbf{B}^{-1} (f - f^b) + (\mathbf{M}f - y^o)^T \mathbf{R}^{-1} (\mathbf{M}f - y^o)$ and $\mathbf{A} = (\mathbf{B}^{-1} + \mathbf{M}^T \mathbf{R}^{-1} \mathbf{M})^{-1}$.
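For reference, the minimizer of this quadratic cost function can also be written in closed form. The short derivation below is the standard result for linear Gaussian problems, written with a column-vector convention (an assumption about notation). It is given only to make the link between $J$, its gradient and $\mathbf{A}$ explicit; it does not describe how the system is implemented, since the minimization is performed iteratively.

```latex
\begin{align}
J(f) &= (f - f^b)^T \mathbf{B}^{-1} (f - f^b)
      + (\mathbf{M} f - y^o)^T \mathbf{R}^{-1} (\mathbf{M} f - y^o) \\
\nabla J(f) &= 2\,\mathbf{B}^{-1} (f - f^b)
      + 2\,\mathbf{M}^T \mathbf{R}^{-1} (\mathbf{M} f - y^o) \\
\nabla J(f^a) = 0 \;&\Rightarrow\;
      (\mathbf{B}^{-1} + \mathbf{M}^T \mathbf{R}^{-1} \mathbf{M})\, f^a
      = \mathbf{B}^{-1} f^b + \mathbf{M}^T \mathbf{R}^{-1} y^o \\
f^a &= f^b + \mathbf{A}\,\mathbf{M}^T \mathbf{R}^{-1} (y^o - \mathbf{M} f^b),
      \qquad
      \mathbf{A} = (\mathbf{B}^{-1} + \mathbf{M}^T \mathbf{R}^{-1} \mathbf{M})^{-1}
\end{align}
```

The gradient expression shows why only products with $\mathbf{M}$ and with its adjoint $\mathbf{M}^T$ are needed to evaluate $\nabla J$, so that neither $\mathbf{M}$ nor $\mathbf{A}$ has to be formed explicitly during the minimization.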
 This section describes the models, the data, the setup of error covariances and the minimization method involved in the practical implementation of this flux inversion.
2.1. Configuration of the Carbon Cycle Regional Modeling
 The setup of the atmospheric transport modeling is summarized in Table 1. CHIMERE (http://www.lmd.polytechnique.fr/chimere/) is an Eulerian mesoscale chemical transport model, initially designed for air quality studies [Schmidt et al., 2001], but used here for CO2 transport modeling. The model has provided some of the best results in transport model intercomparison exercises for the simulation of high temporal variability in CO2 concentrations [Law et al., 2008; Patra et al., 2008]. CHIMERE can be used at regional (several thousand km) to urban (100–200 km) scale with resolutions from 100 km to 1 km.
Table 1. Atmospheric Transport Configuration
Atmospheric Transport Model: 0.5° (∼50 km) horizontal resolution; 20 layers up to 500 hPa; hourly outputs from a European configuration of MM5, nudged every 6 hours toward ECMWF operational analyses
Open boundary conditions for CO2: daily outputs from the LMDZ global inversion of Chevallier et al.; LMDZ configuration: 3.75° lon. × 2.5° lat. resolution / 19 vertical layers up to 3 hPa
Prior NEE for CO2 fluxes: global simulation of ORCHIDEE at 3-hourly/0.3° spatial resolution
 The configuration used here has a 0.5° horizontal resolution and 20 vertical layers from the surface up to the mid-troposphere. The first seven vertical levels lie below 300 m altitude, yielding a good discretization of the boundary layer. The domain of this configuration (Figure 1) covers ∼3.9 × 10⁶ km² of land surface.
 CHIMERE is an off-line model requiring 3D mass-fluxes as inputs for transport calculations. A simulation with a European configuration of the MM5 meteorological model [Grell et al., 1994], is used to provide hourly mass-fluxes. MM5 is nudged toward the operational analyses of the European Centre for Medium Range Weather Forecasts (ECMWF). This simulation and its coupling with CHIMERE has been analyzed by Xueref-Remy et al. , Aulagnier  and Aulagnier et al. . The global inversion of Chevallier et al.  based on the LMDZ model [Hourdin et al., 2006], is used to impose the boundary conditions for CO2 concentration at the lateral and top boundaries of the CHIMERE domain. This inversion assimilates in situ data including daily averages of CE continuous data in Europe. These boundary conditions account for the large scale incoming transport of CO2 from optimized fluxes outside the model domain or from optimized fluxes in the domain leaving the domain and entering it later [Chevillard et al., 2002b; Rödenbeck et al., 2009].
 Climatological data from Takahashi et al.  are used to impose ocean fluxes. These fluxes are set to 0 wherever there is no climatological data e.g. in the Mediterranean Sea. Anthropogenic CO2 emissions are derived from the EDGAR V3.2 fast-track product [van Aardenne et al., 2005] giving mean annual fluxes for the year 2000. It has been convolved by Peylin et al.  with diurnal, weekly and seasonal variations provided by EMEP [Vestreng et al., 2005] to get hourly emissions. Annual fossil fluxes are rescaled according to the country level yearly fossil estimates of the Carbon Dioxide Information Analysis Center (CDIAC, http://cdiac.ornl.gov/).
 The prior estimate of CO2 NEE is provided by ORCHIDEE, a process based ecosystem model which accounts for half hourly to interannual variability in photosynthesis and respiration of plants and soils [Krinner et al., 2005]. ORCHIDEE is initialized with the steady state equilibrium of long term mean NEE even though the equilibrium of annual total NEE is not imposed after the initialization. ORCHIDEE has been extensively used for land-surface modeling e.g. in the North American Carbon Program (NACP) model intercomparison (http://www.nacarbon.org/nacp/), in the biogenic flux model intercomparisons of the CE project [Vetter et al., 2008; Churkina et al., 2010] or as part of the Institut Pierre-Simon Laplace (IPSL) Earth System model which has participated in the coupled climate-carbon simulations of the Intergovernmental Panel on Climate Change (IPCC) 4th Assessment Report. Several inverse modeling systems derive their prior estimates of biosphere fluxes using ORCHIDEE [Peylin et al., 2005; Chevallier et al., 2010]. The typical values and uncertainties in the CO2 fluxes are discussed in section 2.3.
 Atmospheric transport simulations are conducted for the period 31 May to 2 October of each year from 2002 to 2007 to study the synoptic to interannual variability in summer. The initial conditions for CHIMERE are spatially interpolated from 3D LMDZ CO2 fields. A general offset of the CO2 concentrations is applied so that a bias in the background concentrations from the boundary forcing using LMDZ would not impact inversions. Inversions should be strongly constrained by spatial and temporal gradients of concentrations in the data, but such a general bias may have a strong impact on the inversion of fluxes outside the dense parts of the observation network. The offset is estimated using the spatiotemporal mean of all CE data that will be assimilated (Table 2). During a given summer, concentrations $x$ from CHIMERE are thus reestimated using $x^{\mathrm{offset}} = x^{\mathrm{no\,offset}} + \sum_j (y_j^o - y_j^{\mathrm{no\,offset}})/N_{\mathrm{obs}}$, where $N_{\mathrm{obs}}$ is the number of data $y_j^o$ which will be assimilated and where the $y_j^{\mathrm{no\,offset}}$ denote the simulated concentrations at the corresponding time and space locations obtained using CHIMERE forced by LMDZ, ORCHIDEE, and the ocean and anthropogenic fluxes described above (the offset is kept constant during subsequent inversions). In the following, we only consider the unbiased concentrations $x^{\mathrm{offset}}$.
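The offset described above is a single number per summer: the mean difference between the data to be assimilated and the corresponding simulated concentrations. A minimal sketch is given below, assuming the observations and the co-located simulated values are available as two equally long lists; the function names and the numerical values are hypothetical.

```python
def mean_offset(observed, simulated_no_offset):
    """Mean of (y_j^o - y_j^no_offset) over all data that will be assimilated."""
    if not observed or len(observed) != len(simulated_no_offset):
        raise ValueError("need two non-empty series of equal length")
    diffs = [o - s for o, s in zip(observed, simulated_no_offset)]
    return sum(diffs) / len(diffs)

def apply_offset(concentrations, offset):
    """Shift every simulated concentration by the same constant offset."""
    return [x + offset for x in concentrations]

# Hypothetical CO2 values in ppm, for illustration only.
obs = [380.0, 382.0, 378.0]
sim = [377.0, 380.0, 377.0]
offset = mean_offset(obs, sim)          # 2.0 ppm for these values
sim_unbiased = apply_offset(sim, offset)
```

By construction the mean misfit between `sim_unbiased` and `obs` is zero, while the spatial and temporal gradients of the simulated field are left unchanged.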
Table 2. CE Stations Used for CO2 and Radon Measurements
2.2. Model Error: Comparisons to CE in Situ Continuous Data
2.2.1. Data and Aim of the Comparisons
 The CE continuous stations for atmospheric measurements used in this study (Table 2 and Figure 1) deliver hourly data with less than 0.2 ppm measurement errors for CO2, which is far smaller than model errors. A good deal of care is taken to ensure stable intercalibration of the stations. Measurement errors are relatively larger for Radon. Five of the CE stations used here are high towers with several vertical levels of CO2 measurements (CBW, HUN, LMU, OXK and TRN). For comparisons with the simulations, the levels of measurement are placed into the model grid at their correct altitude above sea level.
 In the following section, these comparisons are used to characterize the model error, which is assumed to be unbiased and Gaussian in the inversion framework. The analysis of the distributions of Radon or CO2 differences between the simulation and the data (hereafter “misfits”), indicates that they can be approximated reasonably well by Gaussian distributions (not shown), which supports the characterization of the model error by the quantification of its variance. However, there are potential biases in the model error which may bias the inversion of fluxes and data should not be assimilated during the periods most prone to them. Because radon fluxes can be biased, Radon misfits can hardly be used to identify model biases.
 At a given CE station, the temporal variability of the Radon fluxes is smaller than that of the CO2 fluxes because it is essentially driven by soil humidity and precipitations [Szegvary et al., 2007]. Therefore, for Radon, most of the temporal variability in the concentrations and thus in the model error can be related to atmospheric transport. The Radon lifetime scale is close to the synoptic scale so that this variability catches the features of some of the main sources of model error when simulating CO2, e.g. the representation error, inaccurate synoptic events, vertical mixing, Planetary Boundary Layer (PBL) height (PBLH) [Chevillard et al., 2002a]. At a given CE station, the variance of the misfits for Radon is thus expected to be a good estimate of the variance of the model error when simulating CO2.
 This estimate of the model error using Radon misfits has some limitations: (1) there can be more than 20% error in the Radon fluxes used for the simulations [Szegvary et al., 2007] which may alter the temporal variability of the Radon simulation. Biases can impact the variability of the concentrations due to interactions with the atmospheric transport. Errors in the variability of Radon fluxes may also be significant because the weekly product used here cannot account correctly for variations due to precipitations. (2) Locally, the model error may strongly depend on the spatiotemporal variability in the fluxes which is different for Radon and CO2, due to interactions with the atmospheric transport, and on the radioactive decay which affects Radon but not CO2. The model error may thus be different when simulating CO2 or Radon. The analysis of the mean diurnal cycles (Figure 2) gives indications about the sites or the time windows where or when those limitations become too strong.
 The model error should be estimated for each measurement location as a function of the time, but only four CE stations provide both Radon and CO2 data for this study: HEI, GIF, MHD and PUY. In order to extrapolate the estimates from these sites, the model error is assumed here to depend only on the 6-hour window of the day: 0:00–6:00, 6:00–12:00, 12:00–18:00 or 18:00–0:00 (UTC times are used hereafter) and on the vertical location of the station at low, intermediate or high altitude (respectively corresponding to model levels 1 to 2, 3 to 8, 9 to 20; Table 4), and not to vary during the summer period. GIF, HEI and MHD are at low altitude. Concentrations are more difficult to model with 50 km resolution at HEI, which is in the Rhine valley (below the model ground at this location, Table 2), than at GIF, which is in a relatively flat area. MHD is located on the coast and its location in a 50 km × 50 km grid cell is a source of representation error [Chevillard et al., 2002a]. MHD receives background air entering Europe with westerly winds. Its data are mainly used to constrain gradients of concentrations, and thus the fluxes, between continental sites and MHD, during the inversion. Larger problems of representation occur at mountain sites such as PUY, because they are far from the model ground, which is lower than the real ground (Table 2). At these sites, the model hence simulates incorrectly the influence of local fluxes including those transported through orographically forced mixing and injection of PBL air by upslope winds.
2.2.2. GIF, HEI and MHD (Low Altitude Stations)
 At HEI and GIF, the main misfits between the simulation and the data in the diurnal cycle of Radon concentrations (Figures 2a and 2d) are an anticipation of the extrema of the cycle, with a bias of ∼1–2 hours in the simulation, and misfits in the amplitude of the cycle. These should be related to errors in the simulation of the diurnal cycle of the PBL, inside which HEI and GIF are positioned throughout the day. At HEI, the bias in the Radon fluxes is likely negative (because the model concentrations are too low during daytime), and cannot explain the too high amplitude of the diurnal cycle in the simulation. The too low diurnal cycle at GIF is likely due to tendencies in the model to overestimate PBLH during the night, as noticed by Aulagnier, who compared model PBLH to radiosonde measurements from Dolman et al. at 12:00 and 0:00 during May-June 2005. Differences in the biases at HEI and GIF indicate that such biases cannot be solved by applying a spatially homogeneous correction to the PBLH. However, at these sites, the impact of errors in the PBLH should be similar for Radon and CO2 (Figures 2b and 2e). Therefore, the use of Radon to estimate the model error when simulating CO2 should be reliable. At HEI, the better simulation of the diurnal amplitude of CO2 than that of Radon likely reveals a too low diurnal cycle in fluxes from ORCHIDEE which compensates for the increase in the amplitude due to model error. The evening minimum of concentration at HEI or GIF, which is too late in the CO2 simulation while it is too early in the Radon simulation, suggests that the time window of uptake is too long in ORCHIDEE.
 The temporal statistics of the misfits between the simulations and the hourly data at HEI and GIF during summer 2007 are given in Table 3 for two 6-hour windows, 0:00–6:00 and 12:00–18:00; results are similar respectively for the windows 6:00–12:00 and 18:00–0:00. The ability of the simulations to capture synoptic events is also illustrated by the evolution of daily concentrations provided in the auxiliary material (Figures S1 and S2 in Text S1). The simulation generally underestimates the amplitude of synoptic events and thus the standard deviation of the concentrations. Correlations between the simulation and the data are ∼0.6–0.7. The ratio of the standard deviation for the Radon hourly misfits relative to the standard deviation for the Radon data is larger between 0:00 and 12:00 than between 12:00 and 0:00. The model error thus seems too large to filter information about the fluxes from the CO2 data between 0:00 and 12:00. Furthermore, biases due to erroneous PBLH (likely the main source of model error at GIF and HEI) should be larger during nighttime (after 20:00) than during daytime [Aulagnier, 2009]. Therefore, the CO2 observations made during the windows 0:00–6:00, 6:00–12:00 and 20:00–0:00 will not be assimilated for inverse modeling.
Table 3. Temporal Statistics of Misfits Between the Simulations and CE CO2/Radon Hourly Data at GIF and HEI (a)

Type of Data             STDobs          STDmod          RMS             BIAS              STD             CC            STDmoderr
GIF
Radon                    1.50 / 0.93     0.98 / 0.69     1.17 / 0.68     −0.17 / −0.07     1.15 / 0.67     0.64 / 0.69
Prior CO2                13.01 / 5.19    5.39 / 3.11     16.19 / 4.08    −12.11 / −0.55    10.75 / 4.04    0.59 / 0.63   9.74 / 3.74
[row label missing]      13.01 / 5.19    13.32 / 4.71    12.11 / 3.00    5.59 / −0.36      10.75 / 2.98    0.67 / 0.82
Inverted CO2 in LMDZ     13.01 / 5.19    6.81 / 3.12     18.82 / 5.06    −13.72 / −0.18    12.88 / 5.05    0.28 / 0.34
HEI
Radon                    2.88 / 1.75     2.04 / 0.94     2.37 / 1.57     0.21 / −0.93      2.36 / 1.26     0.58 / 0.72
Prior CO2                13.66 / 8.42    12.26 / 3.73    11.87 / 7.39    3.25 / −3.50      11.42 / 6.50    0.62 / 0.68   11.19 / 6.06
[row label missing]      13.66 / 8.42    15.74 / 6.82    18.30 / 4.47    13.27 / 0.13      12.60 / 4.46    0.64 / 0.85
Inverted CO2 in LMDZ     13.66 / 8.42    7.62 / 4.35     16.43 / 7.71    −10.24 / −3.00    12.85 / 7.10    0.38 / 0.54

(a) Results are given for the windows 0:00–6:00 / 12:00–18:00 during summer 2007. STDobs, standard deviation in the data; STDmod, standard deviation in the simulation; RMS, RMS misfits; BIAS, mean of the misfits; STD, standard deviation of the misfits; CC, correlation between the simulation and the data; and STDmoderr, estimate of the model error standard deviation when simulating CO2 with CHIMERE = STD(Radon)/STDobs(Radon) * STDobs(prior CO2).
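The statistics listed in Table 3 can be computed from a pair of co-located hourly series. The sketch below is illustrative only: it assumes that misfits are defined as simulation minus data and that standard deviations are population values (division by N), in which case RMS² = BIAS² + STD², a relation that the tabulated values satisfy to rounding (e.g. 12.11² + 10.75² ≈ 16.19²). The function name and the test series are hypothetical.

```python
import math

def misfit_statistics(simulated, observed):
    """Return RMS, BIAS, STD of the misfits and the simulation/data correlation."""
    n = len(observed)
    misfits = [s - o for s, o in zip(simulated, observed)]  # assumed sign convention
    bias = sum(misfits) / n
    rms = math.sqrt(sum(m * m for m in misfits) / n)
    std = math.sqrt(sum((m - bias) ** 2 for m in misfits) / n)  # population STD

    mean_s = sum(simulated) / n
    mean_o = sum(observed) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(simulated, observed))
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    cc = cov / math.sqrt(var_s * var_o)  # Pearson correlation

    return {"RMS": rms, "BIAS": bias, "STD": std, "CC": cc}

# Hypothetical short series, for illustration only.
stats = misfit_statistics([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 4.0, 4.0])
```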
 At MHD, the representation error seems to be very different when simulating Radon or CO2 (Figures 2g and 2h), and the estimate of the model error based on the Radon method does not seem reliable at this station. The model error for low altitude stations is thus quantified using the temporal standard deviation of Radon misfits $STD^{\mathrm{Radon}}_{\mathrm{model\,error}}$ for the whole period of simulation (June-September) at HEI and GIF. For CO2 the following expression is used (it is the one given for STDmoderr in the footnote of Table 3):
$$STD^{\mathrm{CO_2}}_{\mathrm{model\,error}} = STD^{\mathrm{Radon}}_{\mathrm{model\,error}} \times STD^{\mathrm{CO_2}}_{\mathrm{obs}} \,/\, STD^{\mathrm{Radon}}_{\mathrm{obs}}$$
where $STD^{\mathrm{CO_2}}_{\mathrm{obs}}$ and $STD^{\mathrm{Radon}}_{\mathrm{obs}}$ are the temporal standard deviations respectively for the CO2 and the Radon data. The setup of the inverse system is based on hourly time series (with a view to assimilating hourly data) from HEI (instead of GIF, where values are smaller) for the period June-September 2006 and for each of the four 6-hour windows of the day. This yields the values displayed in Table 4 for time windows when data will be assimilated, and 12.7 ppm and 12.5 ppm for the time windows 0:00–6:00 and 6:00–12:00 respectively. These values are quite similar to those for summer 2007 (Table 3) and to the values derived by Aulagnier, who studied the sensitivity of CHIMERE concentrations to erroneous PBLH, errors in the simulation of synoptic events and erroneous model topography.
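As a worked example of this scaling, the sketch below applies the formula from the footnote of Table 3 to one set of statistics listed in that table (Radon misfit STD 2.36 / 1.26, Radon STDobs 2.88 / 1.75, prior CO2 STDobs 13.66 / 8.42 for the windows 0:00–6:00 / 12:00–18:00). The function name is illustrative.

```python
def co2_model_error_std(std_radon_misfit, std_radon_obs, std_co2_obs):
    """Scale the relative Radon misfit variability to CO2 units (ppm).

    Implements STD(Radon) / STDobs(Radon) * STDobs(prior CO2),
    the expression given in the footnote of Table 3.
    """
    return std_radon_misfit / std_radon_obs * std_co2_obs

# Values read from Table 3 for the two 6-hour windows.
night = co2_model_error_std(2.36, 2.88, 13.66)      # window 0:00-6:00
afternoon = co2_model_error_std(1.26, 1.75, 8.42)   # window 12:00-18:00
print(round(night, 2), round(afternoon, 2))         # 11.19 6.06
```

The two results reproduce the STDmoderr entries 11.19 / 6.06 ppm of Table 3.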
Table 4. CE CO2 Hourly Data Selection for Assimilation and Typical RMS Misfits to These Data During the Corresponding Time Window for the Simulations Without Inversion
Columns: Type of Station; Time Selection for Assimilation; Setup of the Model Error for Hourly Data; Typical RMS of Hourly Misfits Simulation-Data
Low alt. station (lev 1–2): GIF, HEI, LMP, LMU, MHD, WES, low levels of CBW, HUN and TRN; 5.4 ppm before 18:00, 6.7 ppm after 18:00
Interm alt. station (lev 3–8): top levels of CBW, HUN and TRN; 4.9 ppm before 18:00, 5.3 ppm after 18:00
High alt. station (lev 9–20): CMN, JFJ, OXK, PUY, PRS
 Temporal autocorrelations with 3 to 6-hour timescales are analyzed in the Radon misfits (not shown). There should be such correlations in the model error based on the assumption that the temporal variability in Radon misfits is a good estimate of that in the model error. However, presently, these correlations are not included in the inversion setup which may underestimate the model error even though it overestimates its standard deviation at sites such as GIF by using values from HEI.
2.2.3. PUY (High Altitude Stations)
 PUY is above the PBL at night and included in a deep PBL during the day (Figures 2j and 2k). Due to the problem of vertical location of the station in the model, the local fluxes of Radon and CO2 have a far weaker signature on concentrations at PUY in the simulations than in the data. This explains the far smaller concentrations and diurnal cycles in the simulations. These errors are enhanced by the PBL which seems too low in the model because the time window during which Radon concentrations are increased is shorter in the model than in the data. For Radon, the fluxes seem also too low in the model because the concentrations in the first level of the model are smaller than the data during daytime (the Radon fluxes from the local volcanic rocks are likely underestimated). Furthermore, the Radon emitted from nearby regions decays in the simulation before reaching the model location for PUY. Therefore, the model error variability strongly depends on the variability of the local fluxes which is very different for the two gases, on biases in these fluxes and on the radioactive decay when simulating Radon. An estimation of the model error when simulating CO2 based on comparisons between Radon simulations and data at PUY would not be reliable.
 The model error is likely smaller by night, when the station is in the free troposphere, than by day (Figure 2j). Thus, following Patra et al. and Chevallier et al., at high altitude stations, only nighttime (0:00–6:00) values are used in the inversion. The setup of the model error variance for these data is based on the estimate from Aulagnier of the sensitivity of concentrations to meteorological synoptic events at high altitude (Table 4). The very high altitude stations (JFJ and PRS) are thus mainly used to anchor the inversion with background concentrations, as is MHD. However, data at PUY, CMN and OXK may catch a significant signal from the European NEE even between 0:00 and 6:00.
2.2.4. Intermediate Altitude Stations
 The vertical structure of CO2 concentrations at the multi level sites CBW and TRN is used to derive some information about the model error at intermediate altitudes because their lower and top levels of measurements are located respectively below and above the second layer of the model (Figure 3). For 6:00–12:00 and 12:00–18:00 (Figures 3a, 3b, 3d and 3e), the misfits between the simulation and the data have a quite homogeneous structure, which indicates that all the vertical levels are inside a well mixed PBL. For 18:00–0:00 (Figures 3c and 3f, results are similar for 0:00–6:00), these misfits have a larger negative bias and a higher variance at the lowest levels than at the highest ones. The simulated PBL seems too deep or too strongly mixed. The ratio between the standard deviations of the error and of the signal does not change much with height and is similar to that defined by values for the model error at low and intermediate altitude from Aulagnier. Therefore, at intermediate altitude, data are assimilated between 12:00 and 20:00 only, as at low altitude, with estimates of the model error derived from the estimates at low altitudes using the ratio of the values from Aulagnier (Table 4).
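The data selection described in sections 2.2.2 to 2.2.4 can be summarized as a simple rule on the model level of the measurement and the hour of the day. The sketch below is illustrative: the function names are hypothetical, and the treatment of the window boundaries (half-open intervals in UTC hours) is an assumption that the text does not specify.

```python
def altitude_class(model_level):
    """Classify a measurement by the CHIMERE model level in which it lies."""
    if 1 <= model_level <= 2:
        return "low"
    if 3 <= model_level <= 8:
        return "intermediate"
    if 9 <= model_level <= 20:
        return "high"
    raise ValueError("model level must be between 1 and 20")

def is_assimilated(model_level, hour_utc):
    """Return True if an hourly CO2 datum falls in the assimilation window."""
    if altitude_class(model_level) == "high":
        return 0 <= hour_utc < 6      # nighttime data only at high altitude
    return 12 <= hour_utc < 20        # afternoon and evening data otherwise

# Examples: a low-altitude datum at 14:00 is kept, one at 21:00 is rejected.
kept = is_assimilated(1, 14)
rejected = is_assimilated(1, 21)
```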
2.3. Setup of the Variational Data Assimilation
 At the CE stations, the scale of RMS misfits for CO2 (and for Radon) does not exceed that of the data variability (Table 3), even though they combine model errors and high errors from the prior estimate of fluxes. Biases can be smaller in the LMDZ simulation of Chevallier et al. (which does not assimilate data at HEI and GIF) than in the CHIMERE simulation using fluxes from ORCHIDEE (Table 3), due to the large scale optimization of concentrations in LMDZ from the global flux inversion, but the error standard deviation is generally smaller with CHIMERE due to its greater ability to capture the variability in the concentrations. The error from the fluxes is likely smaller than the model error for hourly data, but it significantly increases the misfits to these data when combined with the model error (Tables 3 and 4). The temporal autocorrelation scales should be far larger in the error from the fluxes (see the characterization of the error in the prior fluxes below) than the 3 to 6-hour timescales diagnosed for the model error. Therefore, the inversion system should be able to filter a large signal of the errors from the fluxes even in short time series of hourly misfits.
 Data assimilation is thus applied to correct f, the 6-hour mean European biosphere fluxes (for windows 0:00–6:00, 6:00–12:00, 12:00–18:00, 18:00–0:00 every day) at the horizontal resolution of CHIMERE (i.e. on 0.5° × 0.5° surface areas). Splitting the flux corrections, which are usually applied to daily, weekly or monthly means, into four 6-hour periods of the day enables the correction of errors in the diurnal cycle of the fluxes. However, the variation of fluxes from ORCHIDEE at the 3-hourly scale within the 6-hour windows is kept unchanged by the inverse modeling system.
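 The way a correction to a 6-hour mean leaves the 3-hourly variation of the prior untouched can be illustrated with a minimal sketch. It assumes the correction is additive and constant within each window; the function name and the flux values are hypothetical and only serve the illustration.

```python
import numpy as np

def apply_window_increments(prior_3h, increments_6h):
    """Add one increment per 6-hour window to a 3-hourly prior flux series.

    prior_3h: 3-hourly prior fluxes, two values per 6-hour window.
    increments_6h: correction to each 6-hour mean flux (assumed additive).
    """
    prior_3h = np.asarray(prior_3h, dtype=float)
    increments_6h = np.asarray(increments_6h, dtype=float)
    if prior_3h.size != 2 * increments_6h.size:
        raise ValueError("expected two 3-hourly values per 6-hour window")
    # Each increment is repeated over the two 3-hourly steps of its window,
    # so the difference between those two steps is left unchanged.
    return prior_3h + np.repeat(increments_6h, 2)

# One day of made-up 3-hourly fluxes and made-up 6-hour increments.
prior = np.array([2.0, 1.5, -1.0, -4.0, -5.0, -3.0, 0.5, 1.8])
incr = np.array([0.2, -0.6, 0.3, 0.1])
posterior = apply_window_increments(prior, incr)
```

 In this sketch the four window means shift by the four increments while the step between the two 3-hourly values inside each window is identical before and after the correction.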
 The method of Chevallier et al. [2005, 2007], which is used here, is similar to 4D variational data assimilation as applied in meteorology by Courtier et al. The minimization of J is handled iteratively, here using the M1QN3 algorithm [Gilbert and Lemaréchal, 1989]. At each iteration, ∇J is estimated using the adjoint of CHIMERE, M^T. During the inversions conducted for this study, estimates fe of argmin J are obtained after only 13 iterations, because 13 iterations are sufficient for ||∇J|| to become very low in all experiments (e.g. less than 8% of its initial value during the inversion for summer 2007), and for J to get very close (within 10% relative error) to its theoretical minimum during experiments using synthetic data (see section 3). The estimation of A is detailed in section 3.
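 The iterative minimization can be sketched on a toy linear problem. Everything below is an assumption made for illustration: the operators are small random matrices rather than CHIMERE, J is written without a factor 1/2, the control variable is preconditioned with a Cholesky factor of B, and a plain conjugate-gradient loop stands in for M1QN3. The only point carried over from the text is that the gradient is obtained by applying the transpose (adjoint) of the transport operator to the weighted misfits.

```python
import numpy as np

rng = np.random.default_rng(0)
n_flux, n_obs = 20, 5                      # toy sizes, not those of the paper

# Hypothetical stand-ins for the transport operator and the covariances.
M = rng.normal(size=(n_obs, n_flux))       # linear "transport" operator
idx = np.arange(n_flux)
B = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 5.0)   # prior error covariance
R = np.diag(np.full(n_obs, 0.5))           # diagonal observation error covariance
f_b = np.zeros(n_flux)                     # prior fluxes
y = rng.normal(size=n_obs)                 # observations

L = np.linalg.cholesky(B)                  # B = L L^T, used as preconditioner
R_inv = np.linalg.inv(R)

def cost_and_grad(chi):
    """J and its gradient in the control variable chi, with f = f_b + L chi."""
    f = f_b + L @ chi
    d = M @ f - y                          # model-minus-data misfit
    J = chi @ chi + d @ R_inv @ d
    grad = 2.0 * chi + 2.0 * L.T @ (M.T @ (R_inv @ d))   # M.T acts as the adjoint
    return J, grad

chi = np.zeros(n_flux)
J0, g = cost_and_grad(chi)
g0_norm = np.linalg.norm(g)
p = -g
for _ in range(13):                        # 13 iterations, as in the text
    if np.linalg.norm(g) < 1e-10 * g0_norm:
        break
    # Exact line search along p, valid because J is quadratic.
    _, g_step = cost_and_grad(chi + p)
    curvature = p @ (g_step - g)           # equals p^T H p for a quadratic J
    alpha = -(g @ p) / curvature
    chi = chi + alpha * p
    _, g_new = cost_and_grad(chi)
    beta = (g_new @ g_new) / (g @ g)       # Fletcher-Reeves update
    p = -g_new + beta * p
    g = g_new

f_e = f_b + L @ chi                        # estimate of argmin J
J_e, g_e = cost_and_grad(chi)
grad_ratio = np.linalg.norm(g_e) / g0_norm
```

 With only five observations the preconditioned Hessian has few distinct eigenvalues, so this toy case converges much faster than a real inversion would; the monitoring of the ratio of gradient norms is the part that mirrors the convergence criterion quoted above.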
 The final corrections fe − fb that are applied to the fluxes are mapped from the y-space to the f-space by BM^T, which highlights the importance of a correct specification of B. Following Chevallier et al., correlations in B for the NEE are modeled using exponential functions of distance and time. Chevallier et al. [2006, also What eddy-covariance flux measurements tell us about prior errors in CO2-flux inversion schemes, submitted to Global Biogeochemical Cycles, 2011] estimate the statistics of errors on ORCHIDEE based on local comparisons to FLUXNET eddy covariance flux measurements and provide a formulation for the statistical upscaling of this error at lower scales. This is used to derive the correlation e-folding lengths (the length required for the correlations to decrease by a factor of e) in B for 6-hour/50 km aggregated fluxes: 1 month in time and 250 km in distance. Chevallier et al. (submitted manuscript, 2011) do not show any dependence of the correlations in the error on fluxes on the PFTs in ORCHIDEE. There should be large errors in the description of biomes by the different PFTs, but PFT-independent errors arising from wrong soil composition parameters or meteorological forcing may have a larger influence on the correlations in errors on fluxes in ORCHIDEE. Correlations in B are thus set up without accounting for PFTs. Here, temporal correlations apply only to 6-hour flux errors related to identical 6-hour windows of the day, and are set to 0 between 6-hour flux errors related to different 6-hour windows of the day. This assumes that errors in ORCHIDEE are uncorrelated between different 6-hour windows due to the diurnal cycle in the carbon-cycle dynamics. Actual correlations are likely negative between nighttime and daytime and positive between two consecutive nighttime or daytime windows. However, it seems difficult to derive an estimate of such correlations.
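 The correlation structure described above can be written down in a few lines. The sketch below is an illustration under simplifying assumptions: positions are one-dimensional distances in km rather than locations on the model grid, "1 month" is taken as 30 days, and the function and variable names are hypothetical.

```python
import numpy as np

L_SPACE_KM = 250.0    # spatial e-folding length quoted in the text
L_TIME_DAYS = 30.0    # temporal e-folding time (1 month, taken as 30 days)

def prior_correlation(x_km, day, window):
    """Correlation between flux errors at 1-D positions x_km (km), on days
    `day`, in 6-hour windows `window` (0 to 3)."""
    x_km = np.asarray(x_km, dtype=float)
    day = np.asarray(day, dtype=float)
    window = np.asarray(window)
    dist = np.abs(x_km[:, None] - x_km[None, :])
    lag = np.abs(day[:, None] - day[None, :])
    same_window = window[:, None] == window[None, :]
    corr = np.exp(-dist / L_SPACE_KM) * np.exp(-lag / L_TIME_DAYS)
    # Errors in different 6-hour windows of the day are left uncorrelated.
    return np.where(same_window, corr, 0.0)

# Four made-up control elements: a reference, one 250 km away, one 30 days
# later, and one in a different window of the day.
corr = prior_correlation([0.0, 250.0, 0.0, 0.0], [0, 0, 30, 0], [1, 1, 1, 2])
```

 In this example the reference element has a correlation of 1/e with both the element 250 km away and the element 30 days later, and no correlation with the element in the other 6-hour window.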
 Standard deviations are set proportional to the respiration fluxes given by ORCHIDEE. The multipliers applied to respiration do not vary in space or during summer but they are a function of the time of day. They have lower values at night because the errors on fluxes appear to be larger at daytime. They are set up so that daily errors on fluxes are ∼2 gCm−2day−1 and do not exceed 3 gCm−2day−1, which is consistent with the estimations of Chevallier et al. (submitted manuscript, 2011). The dependence of respiration on PFT and forcing means the prior variances on the diagonal of B are less homogeneous in space and time than the correlations. This setup for B yields a typical uncertainty with standard deviation ∼0.2 PgC for the summer flux over the ∼3.9 × 10^6 km2 of land surface in the CHIMERE European domain, about 60% of the uptake of ∼0.3 PgC (∼79 gCm−2) simulated by ORCHIDEE.
 Ocean fluxes are adjusted along with NEE using prior error covariances with 500 km/1 month lengths of exponential decay for the correlations and standard deviations of 0.2 gCm−2day−1, as in Chevallier et al. This value, derived at global scale, is very low when compared to the uncertainty in NEE, and the uncertainty at regional scale may be larger [Cai et al., 2006], especially here, because null fluxes are used in the Mediterranean Sea. However, estimates of ocean fluxes from Takahashi et al. in the Atlantic Ocean and North Sea are relatively small compared to the NEE in the domain and for the period considered here (<15 gCm−2 locally) and the Mediterranean Sea is a small sink for CO2 [Ortenzio et al., 2008]. Furthermore, at the CE stations and at the timescales considered here, the impact of changes in ocean fluxes is very low. The weak corrections that occur in the ocean from the inversion will thus be ignored hereafter.
 The CE sites (which are far from urban areas) see mostly the effect of natural fluxes, but they can also be influenced by anthropogenic sources. During particular synoptic events, the quantity of CO2 from anthropogenic sources can even exceed that from NEE [Levin and Karstens, 2007]. The anthropogenic fluxes bear significant uncertainties (∼19% in the estimate of the annual emissions for EU-25 [Ciais et al., 2010a]). However, according to Peylin et al., the impact of this uncertainty at CE sites is relatively weak. They investigate the differences in the concentrations simulated by several models, due to the use of different anthropogenic flux estimates, including the one which is used here. The standard deviation of such hourly differences when using CHIMERE is smaller than 0.5 ppm at CE sites in summer except at CBW, GIF and HEI where it is ∼1–1.5 ppm. Errors in the anthropogenic fluxes are thus not accounted for [Ciais et al., 2010d; Rayner, 2010], either in the control vector (in B) or in the observational uncertainty.
 In the present configuration, the system assimilates data from all the CE stations listed in Table 2 except TRN whose data are kept for validation. At tall tower stations, only data at the top level of measurement are assimilated to avoid over-weighting information from these stations and to avoid dealing with correlations in the model error between the top and the bottom levels of a given station. The top level is selected because it integrates the signature of fluxes at larger scales. One observation is assimilated per hour (when available) at a given site and during the time windows defined in section 2.2. The selection of the observations used for inverse modeling is summarized in Tables 2 and 4. Here, R is set up diagonal. Model transport and representation errors are assumed to dominate in the observation error and values for R are based on Table 4. The aggregation error due to the correction of fluxes at 6 hourly resolution instead of hourly resolution is not accounted for.
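 The selection rules above (top level only, one observation per hour and site, assimilation windows depending on station altitude, TRN withheld) can be sketched as a simple filter. The record layout, the station categories passed by the caller and the CO2 values are hypothetical; the hour ranges are the 12:00–20:00 and 0:00–6:00 windows stated in section 2.2.

```python
from datetime import datetime

# Hour ranges [start, end) during which data are assimilated: 12:00-20:00 at
# low and intermediate altitude, 0:00-6:00 at high altitude.
WINDOWS = {"low": (12, 20), "intermediate": (12, 20), "high": (0, 6)}

def select_observations(records, station_type, withheld=("TRN",)):
    """Return the records that would be assimilated.

    records: list of dicts with keys "station", "time" (datetime),
             "level_magl" and "co2_ppm" (hypothetical layout).
    station_type: dict mapping a station code to a key of WINDOWS.
    withheld: stations kept for validation only.
    """
    # Highest measurement level available at each station.
    top = {}
    for r in records:
        top[r["station"]] = max(top.get(r["station"], r["level_magl"]), r["level_magl"])

    selected, seen = [], set()
    for r in sorted(records, key=lambda rec: rec["time"]):
        station = r["station"]
        if station in withheld or r["level_magl"] != top[station]:
            continue
        start, end = WINDOWS[station_type[station]]
        hour_slot = (station, r["time"].replace(minute=0, second=0, microsecond=0))
        if start <= r["time"].hour < end and hour_slot not in seen:
            seen.add(hour_slot)            # at most one observation per hour and site
            selected.append(r)
    return selected

# Made-up records for illustration only.
records = [
    {"station": "CBW", "time": datetime(2007, 7, 1, 13, 0), "level_magl": 200, "co2_ppm": 378.1},
    {"station": "CBW", "time": datetime(2007, 7, 1, 13, 0), "level_magl": 20, "co2_ppm": 376.4},
    {"station": "CBW", "time": datetime(2007, 7, 1, 22, 0), "level_magl": 200, "co2_ppm": 384.0},
    {"station": "PUY", "time": datetime(2007, 7, 1, 3, 0), "level_magl": 10, "co2_ppm": 380.2},
    {"station": "PUY", "time": datetime(2007, 7, 1, 14, 0), "level_magl": 10, "co2_ppm": 377.9},
    {"station": "TRN", "time": datetime(2007, 7, 1, 15, 0), "level_magl": 180, "co2_ppm": 379.0},
]
station_type = {"CBW": "intermediate", "PUY": "high", "TRN": "intermediate"}
kept = select_observations(records, station_type)
```

 Of the six made-up records, only the 13:00 value at the top level of CBW and the 03:00 value at PUY pass the filter.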
 Neither the initial condition for the 3D concentrations nor the open boundary conditions are adjusted in the inverse modeling system even though the concentrations from the LMDZ inversion used to impose these boundary conditions bear significant uncertainties. Lauvaux et al.  showed that the influence of the boundary conditions in their configuration is very low at the CE stations they use. However, mountain stations are used here and they are far more sensitive to concentrations transported from the boundaries than low altitude stations [Chevillard et al., 2002b].
3. Estimation of the Uncertainty Reduction
 Experiments assimilating synthetic data which are generated using known fluxes can be conducted to assess the potential of an inversion system to retrieve information about such fluxes. Following Chevallier et al. , a Monte Carlo approach is used here to estimate A as the posterior covariance of fluxes based on an ensemble of inversions using synthetic prior fluxes fib and observations yio which sample the assumed prior distributions N(ft, B) and N(Mft, R). The fluxes from ORCHIDEE are used to provide a truth ft. Posterior fluxes are denoted fia. 15 inversions are used for periods Jun–Sep 2003 and Jun–Sep 2006. For each summer, the error estimates are pooled for all 4 months giving 60 ensemble members. This assumes the error characteristics are the same month to month during summer. The setup of R does not evolve and the setup of B and the data availability do not evolve much from June to September every year. However, the variability in the atmospheric transport may imply some variability in the posterior uncertainty even when assimilating synthetic data.
 Characterizing uncertainties in fluxes by their standard deviations, the uncertainty reduction for any scalar function G of the monthly fluxes is given by ρmonthG = 1 − (σamonthG)/(σbmonthG) where σamonthG and σbmonthG denote the standard deviations of the G(fia) and G(fib) respectively. ρmonthG is assumed to be a lower limit of the estimate of the uncertainty reduction for a similar function of the seasonal fluxes ρG, based on the hypothesis that time correlations in A are smaller than in B, so that the uncertainty in posterior fluxes decreases more when upscaling to seasonal fluxes than the uncertainty in prior fluxes. This hypothesis is based on the fact that the errors in the allocation of misfits in concentrations to fluxes (by increasing/decreasing the wrong flux in order to increase/decrease the concentration at a given observation location) yield errors in the inverted fluxes which have negative correlations. A statistical estimate of the temporal autocorrelation in the error to the “true” 6-hour mean fluxes as a function of the lag-time (based on samples from many space and time locations and from the 15 experiments in 2006) gives confidence in this assumption by indicating a 29.7 day correlation e-folding time for the prior uncertainty (a good approximation of the 30 day correlation e-folding time used to set up B), and a 27.6 day correlation e-folding time for the posterior uncertainty. Values of the posterior uncertainty in seasonal fluxes are thus derived from the estimate of the prior uncertainty in seasonal fluxes based on the B used to set up the inversion, from the ensemble estimate of ρmonthG, and from the assumption ρG = ρmonthG.
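 The ensemble estimate of the uncertainty reduction can be illustrated on a toy linear Gaussian problem. This is a sketch under stated assumptions, not the configuration of the paper: the operators are small made-up matrices, the truth is set to zero, and each inversion is solved in closed form with the gain matrix instead of the iterative minimization. Only the ensemble size of 60, the sampling of priors and observations around a known truth, and the definition of the reduction as one minus the ratio of posterior to prior standard deviations follow the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n_flux, n_members = 10, 60

idx = np.arange(n_flux)
B = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 3.0)   # toy prior covariance
M = np.eye(n_flux)[::2]                    # toy operator: observe every other cell
R = 0.25 * np.eye(M.shape[0])              # toy observation error covariance
f_t = np.zeros(n_flux)                     # known "true" fluxes
G = np.ones(n_flux)                        # G = (1, ..., 1): total flux

# Gain of the linear inversion (closed form, replacing the minimization).
K = B @ M.T @ np.linalg.inv(M @ B @ M.T + R)

f_b = rng.multivariate_normal(f_t, B, size=n_members)       # synthetic priors
y_o = rng.multivariate_normal(M @ f_t, R, size=n_members)   # synthetic data
f_a = f_b + (y_o - f_b @ M.T) @ K.T                          # posterior fluxes

sigma_b = np.std(f_b @ G, ddof=1)          # spread of prior totals
sigma_a = np.std(f_a @ G, ddof=1)          # spread of posterior totals
rho_mc = 1.0 - sigma_a / sigma_b           # ensemble uncertainty reduction

# Reference value from the analytical posterior covariance A = B - K M B.
A = B - K @ M @ B
rho_exact = 1.0 - np.sqrt(G @ A @ G) / np.sqrt(G @ B @ G)
```

 Comparing rho_mc with rho_exact for different ensemble sizes gives a feel for the sampling noise that the convergence check on the number of members is meant to control.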
 During these inversions using synthetic data, the setup of the prior and observation covariances matches the actual noise on prior fluxes and observations. Thus, the iterative minimization of J converges toward the expected value of its minimum in theory, i.e. the number of observations that are assimilated, Nobs [Weaver et al., 2003]. After 13 iterations, J(fe) is within 10% of this minimum. Real data cases generate cost functions ∼20% less than the theoretical minimum, suggesting some errors in the specification of the uncertainties. The incomplete convergence of J along with the underestimate of ρG using ρmonthG suggest that posterior uncertainties are overestimated, but this may compensate for the neglect of other sources of error, such as errors in the characterization of the actual uncertainties in prior fluxes when setting up B.
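 The statement that the minimum of J is expected to equal the number of observations can be checked numerically on a toy problem. The sketch assumes that J is the sum of the two quadratic terms without a factor 1/2, which is the form consistent with an expected minimum of Nobs; the matrices are made up, and the minimum is evaluated in closed form from the innovation d = y − M fb rather than by iterating.

```python
import numpy as np

rng = np.random.default_rng(2)
n_flux, n_obs, n_trials = 30, 50, 200

M = rng.normal(size=(n_obs, n_flux))       # toy linear operator
idx = np.arange(n_flux)
B = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 4.0)   # toy prior covariance
R = np.eye(n_obs)                          # toy observation error covariance
S_inv = np.linalg.inv(M @ B @ M.T + R)     # inverse covariance of the innovation

J_min = np.empty(n_trials)
for k in range(n_trials):
    # Truth is zero, so the prior and the data are pure noise drawn from B and R.
    f_b = rng.multivariate_normal(np.zeros(n_flux), B)
    y = rng.multivariate_normal(np.zeros(n_obs), R)
    d = y - M @ f_b
    J_min[k] = d @ S_inv @ d               # value of J at its minimum

ratio = J_min.mean() / n_obs               # close to 1 when B and R are consistent
```

 When the covariances used in the inversion differ from those of the actual noise, this ratio departs from 1, which is how the ∼20% deficit in real data cases can be read as a sign of misspecified uncertainties.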
 The convergence of the ensemble estimations of σamonthG and σbmonthG is verified for G = (1, …, 1) (for the monthly total fluxes) in summer 2006. The estimates using 40 and more members are within 6% of the estimates using 60 members. The estimate of σbmonthG using 60 members is also within 6% of the estimate of the prior uncertainty in monthly fluxes based on the setup of B in the inversion system. 60 members should thus be sufficient to get a fair estimate of ρmonthG.
 Those estimates indicate an uncertainty reduction for monthly fluxes of the full domain that is ∼60% for both summer 2003 and summer 2006, although 11 stations are used in 2006 versus 9 stations in 2003. Changes in the observing network between 2002 and 2007 may have little impact on the uncertainty reduction at continental scale but changes in the climatic conditions may also favor the extrapolation of the information from the data during the inversion in 2003. The uncertainty reductions of up to 60% are promising given planned expansion in the network. They will be used for each month and season in the study period.
 Maps of uncertainty reduction for local monthly fluxes for each 6-hour window of the day are shown in Figure 4 for 2006 (and in Figure S3 in Text S1 of the auxiliary material for 2003). The uncertainty reduction is stronger close to the CE stations, in particular close to low altitude stations at daytime (e.g. close to GIF, CBW and HEI). A large part of the spread of uncertainty reduction is due to covariances in the setup of B, but the transport plays a critical role too, especially in the extrapolation of the information from nighttime (daytime) at mountain (low altitude) sites to daytime (nighttime) in nearby areas. At night, the uncertainty reduction is smaller because data are not assimilated at low altitude. An inversion has been conducted with the assimilation of nighttime synthetic data at low altitude (not shown). This demonstrated a large uncertainty reduction in nighttime fluxes, as for Lauvaux et al. and Aulagnier, because experiments using synthetic data deal with a model error that is unbiased and perfectly characterized in the setup of R in the inversion system. In real cases, the assimilation of nighttime observations at low altitude (not shown) has added errors to flux estimates because of poor model performance (with large errors on PBLH at night). This shows that conclusions from experiments and network design studies using synthetic data must be analyzed carefully before applying them to real cases.
4. Validation of the Results From Experiments Using Real Data
 In this section, concentrations and fluxes are compared with independent (i.e. not assimilated) data to assess the performance of the inversion. This process is historically termed validation within the data assimilation community.
4.1. Comparisons to CO2 Mixing Ratio Measurements
 Most of the CE data are assimilated so that an overall decrease in the misfits between the simulation and these data is necessarily obtained. Except for 3 cases (LMU and HEI for 0:00–6:00 and HUN for 6:00–12:00) temporal RMS misfits between the simulations and all (assimilated or not) CE data are decreased for any 6-hour window of the day by the inversion in 2007 (Table 5). This shows that the inversion improves the simulated atmospheric state, which is a first indication of improvement in the flux estimates. This decrease is not homogeneous, and the reduction of RMS misfits at high altitude is far smaller (∼15%–20% for data assimilated between 0:00 and 6:00, less than 10% for non assimilated data from 6:00 to 0:00) than at low altitude (∼40% for data assimilated between 12:00 and 20:00, ∼15%–25% for non assimilated data from 0:00 to 12:00).
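 The reduction of RMS misfits discussed here can be computed as sketched below. The definition used, 100 × (1 − posterior RMS / prior RMS), is an assumption, since the exact formula behind Table 5 is not spelled out in the text; the concentration values are made up.

```python
import numpy as np

def rms(x):
    """Root mean square of a series."""
    return float(np.sqrt(np.mean(np.square(x))))

def rms_reduction_percent(obs, prior_sim, post_sim):
    """Percentage by which the RMS misfit to the data decreases after inversion.

    A negative value means that the misfit increased.
    """
    obs = np.asarray(obs, dtype=float)
    rms_prior = rms(np.asarray(prior_sim, dtype=float) - obs)
    rms_post = rms(np.asarray(post_sim, dtype=float) - obs)
    return 100.0 * (1.0 - rms_post / rms_prior)

# Made-up hourly CO2 values (ppm): prior misfits of 4 ppm, posterior of 2 ppm.
obs = [380.0, 382.0, 379.0, 381.0]
prior_sim = [384.0, 378.0, 383.0, 377.0]
post_sim = [382.0, 380.0, 381.0, 379.0]
reduction = rms_reduction_percent(obs, prior_sim, post_sim)
```

 With this sign convention the three exceptions mentioned above (LMU and HEI for 0:00–6:00, HUN for 6:00–12:00) would appear as negative values.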
Table 5. Reduction of Temporal RMS Misfits Between the Simulations and CE CO2 Hourly Data During Summer 2007 (a)
(a) Results are given for each 6-hour window of the day: 0:00–6:00 / 6:00–12:00 / 12:00–18:00 / 18:00–0:00.
Station | Assimilated | Reduction of RMS misfits (%)
CBW 200 magl | no / no / yes / before 20:00 | 10.12 / 4.15 / 13.98 / 12.41
(unnamed) | no / no / yes / before 20:00 | 25.18 / 23.58 / 26.46 / 44.86
(unnamed) | no / no / yes / before 20:00 | −54.16 / 12.57 / 39.54 / 6.73
HUN 115 magl | no / no / yes / before 20:00 | 9.55 / −6.92 / 42.71 / 30.96
(unnamed) | yes / no / no / no | 5.07 / 6.26 / 10.37 / 7.09
LMU 79 magl | no / no / yes / before 20:00 | −6.27 / 10.22 / 52.14 / 22.76
(unnamed) | no / no / yes / before 20:00 | 13.78 / 12.99 / 12.27 / 10.59
OXK 163 magl | yes / no / no / no | 12.41 / 7.44 / 1.15 / 7.65
(unnamed) | yes / no / no / no | 6.44 / 7.80 / 14.22 / 12.77
(unnamed) | yes / no / no / no | 12.31 / 4.14 / 0.86 / 6.42
TRN 180 magl | no / no / no / no | 19.13 / 5.32 / 10.56 / 28.60
 At HEI, the increase in RMS misfits during the window 0:00–6:00 is essentially due to an increase of the bias in the simulation (Table 3). Adjusting concentrations from 12:00 to 20:00, the inversion raises night concentrations to higher values than the data (Figures 2b and 2c), probably a signature of model error, as suggested by the diurnal cycle of the Radon simulation which is too high (Figure 2a and section 2.2). Increases in RMS misfits at LMU and HUN in the morning should also be related to model error. At GIF (Figures 2e and 2f), the inversion raises concentrations from 20:00 to 5:00 with a peak at 0:00, while the diurnal cycle of the Radon simulation is too low (auxiliary material Figure S1d in Text S1). Therefore, close to GIF and before 0:00, the fluxes may be increased too much by the inversion. Even though no data are assimilated at TRN and at the lower levels of CBW, the misfits are decreased at any level of these towers and for any 6-hour window of the day by the inversion. However, the standard deviation of the misfits is increased and the bias is kept large at the bottom levels during 18:00–0:00, when the PBL stratifies (Figure 3). This can be related to errors in the PBLH which is positioned between the bottom and the top levels of these towers at that time. Finally, for most of the low and intermediate altitude sites, and for any 6-hour window of the day, all components of the misfits to the data are decreased (the correlation between the simulation and the data is increased, the difference of standard deviation and the bias between the simulation and the data are decreased) by the inversion in the same way they are decreased at HEI for the window 12:00–18:00 or at GIF (Table 3). This shows that the inversion improves the variability in the concentrations at high temporal resolution.
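 The three components of the misfits mentioned above (bias, difference of standard deviations, correlation) add up to the mean square misfit through a standard identity, MSE = bias² + (σsim − σobs)² + 2 σsim σobs (1 − r). A minimal sketch with made-up series follows; the function name is hypothetical.

```python
import numpy as np

def misfit_components(sim, obs):
    """Split the mean square misfit into bias, spread and correlation terms."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    bias = sim.mean() - obs.mean()
    sd_sim, sd_obs = sim.std(), obs.std()  # population standard deviations
    r = np.corrcoef(sim, obs)[0, 1]        # correlation between the two series
    mse = np.mean((sim - obs) ** 2)
    mse_from_parts = bias**2 + (sd_sim - sd_obs) ** 2 + 2.0 * sd_sim * sd_obs * (1.0 - r)
    return {"bias": bias, "sd_diff": sd_sim - sd_obs, "corr": r,
            "mse": mse, "mse_from_parts": mse_from_parts}

# Made-up hourly concentrations (ppm) for illustration.
obs = np.array([380.0, 384.0, 379.0, 376.0, 382.0, 385.0])
sim = np.array([383.0, 385.0, 383.0, 381.0, 384.0, 388.0])
parts = misfit_components(sim, obs)
```

 An inversion that lowers the bias and the difference of standard deviations while raising the correlation therefore necessarily lowers the RMS misfit, whereas a lower RMS misfit alone does not say which of the three terms improved.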
 At PUY (Figures 2e and 2f) and at all CE mountain sites, the simulated concentrations in the free troposphere are improved, as can be expected from the assimilation of data at nighttime only, because the model does not capture the decreasing trend during the day. Improvements also exist at the synoptic scale (not shown). The inversion abnormally perturbs the diurnal cycle of concentrations at lower model levels at PUY as at GIF (Figures 2f and 2l), but not at other mountain sites, due to the assimilation of data at GIF.
4.2. Comparisons to Eddy Covariance Flux Measurements
 The gap-filled CE L4 product [Papale et al., 2006] gathering eddy covariance flux measurements from most parts of Europe is used to check the consistency of inverted fluxes. Hourly data from the stations given in Figure 1 are used for comparisons with the fluxes from ORCHIDEE and those from the inversion sampled on the grid cells in which the stations are located. These data are generally not available for all years between 2002 and 2007 at a given station. They represent scales from a few hectares to a few km2 (depending on the height of the sensors above the canopy, on the roughness of the surface and on the air stability) while fluxes in the model are given at 50 km resolution. Thus large scale averaging over the set of stations or in time is needed before comparison to reduce the representation error. The validation focuses here on the time variability in the spatial averages over all CE L4 stations (Figures 5 and 6).
 The analysis of the diurnal cycle of 6-hour mean fluxes during summers 2006 and 2007 (Figure 5) yields conclusions which apply to other summers. The inversion improves the mean fluxes for windows 0:00–6:00, 6:00–12:00 and 18:00–0:00. The largest improvement is obtained during 6:00–12:00. At night, the improvement is not systematic but biases are never significantly increased. The corrections are smaller than at daytime which is consistent with the fact that errors on the prior NEE are smaller during the night. However, between 12:00 and 18:00, for every year, the inversion increases the mean error to CE L4 data. This is assumed to be a result of a temporal aggregation error [Thompson et al., 2010]. The uptake simulated by ORCHIDEE is larger (smaller) than the CE L4 data after (respectively before) 16:00 (not shown). Due to the stronger stratification of the atmosphere, errors in the fluxes generate larger misfits at the observation locations in the PBL even though they may be smaller after 16:00 than before. Therefore, during 12:00–18:00, the mean misfits that are assimilated are negative and the 6-hour mean flux increment is, on average, positive. Although the inversion increases this bias during the afternoon, it improves the match of the daily mean fluxes to the CE L4 data.
 An artifact of the 6-hour averaged display of the fluxes indicates that inversion deteriorates the general shape of the flux diurnal cycle, with a stronger uptake in the morning than in the afternoon in the inverted fluxes while ORCHIDEE and CE L4 data indicate a stronger uptake in the afternoon than in the morning. Actually the maximum for uptake erroneously occurs at ∼13:00–14:00 in ORCHIDEE while it occurs at ∼11:00–12:00 in the CE L4 data. The uptake is balanced between the morning and the afternoon in the CE L4 data with differences smaller than the uncertainty in these data (in particular other CE products for fluxes that are not used here display larger uptake in the morning than in the afternoon). These differences are thus not significant enough to assess whether the change of the diurnal cycle of the 6-hour mean fluxes by inversion is sensible or not.
 Time series of daily 6-hour averages of mean fluxes for all CE L4 locations (Figure 6) show that the inversion improves the monthly variability of these fluxes. Furthermore, the split of corrections between the different 6-hour windows of the day enables the system to apply monthly corrections that are not identical from one 6-hour window to the other, in keeping with the differences observed between ORCHIDEE and the CE L4 data, which improves the variability in the diurnal cycle. Those two points are illustrated by the fact that in 2006, consistently with the data, corrections applied in June-July are stronger than in August (when ORCHIDEE fits better with the data) for the window 18:00–0:00, while, in the opposite sense, smaller corrections are applied in June-July than in August for the window 0:00–6:00. Time correlations between the simulations and the data are increased by the inversion so that error standard deviations are generally decreased. For 12:00–18:00, although the inversion increases the RMS error and bias, it improves the variations in monthly average. Improvements are obtained when restricting the estimate of NEE to CE L4 stations within model grid cells covered mostly by forest or crop PFTs in ORCHIDEE (auxiliary material Figure S4 in Text S1), which tends to increase the confidence in the spatial variability (at least in the split of the contributions to the NEE from the different PFTs) from the inversion. However, such stations are so few that these comparisons should bear large errors of representativeness in terms of length scales and because the vegetation surrounding a station may not correspond to the PFT which dominates in the 50 km × 50 km model grid cell where this station is located.
5. Diagnostics for Summer Fluxes
 The previous sections give confidence in estimating the net seasonal biogenic flux over the European domain of CHIMERE. The mean of the prior summer NEE for 2002–2007 is ∼−0.3 PgC (∼−79 gCm−2 over ∼3.9 × 10^6 km2). The uncertainty (expressed hereafter by its standard deviation) in this mean is ∼0.08 PgC if errors are independent or ∼0.2 PgC if errors are fully correlated from year to year. ORCHIDEE has some biases but does not completely capture the interannual variability of fluxes, therefore the actual uncertainty should lie between these two values. The value 0.2 PgC seems more reasonable considering the potential errors in the estimate of the prior uncertainty for each year. The mean of the inverted summer NEE for 2002–2007 is ∼−0.35 PgC (∼−91 gCm−2) with uncertainty ∼0.08 PgC, conservatively assuming perfect correlation of errors from year to year. This uptake is larger than the mean summer anthropogenic emissions used here (∼0.31 PgC for 2002–2007). However, the NEE increments from inversion have a scale comparable with that of the uncertainty on the anthropogenic fluxes in summer.
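 The orders of magnitude quoted in this paragraph can be reproduced with a few lines of arithmetic. The sketch uses only the rounded numbers given in the text (0.2 PgC per summer, six summers, a ∼60% uncertainty reduction, a land surface of ∼3.9 × 10^6 km2), so small differences with the quoted per-area values are expected; reading the posterior value of 0.08 PgC as 0.2 PgC reduced by 60% is an interpretation, not something the text states explicitly.

```python
import math

AREA_KM2 = 3.9e6          # land surface of the domain quoted in the text
G_PER_PG = 1e15           # grams per petagram
M2_PER_KM2 = 1e6          # square meters per square kilometer

def pgc_to_gc_per_m2(flux_pgc, area_km2=AREA_KM2):
    """Convert a domain-total flux in PgC to a mean flux in gC per m2."""
    return flux_pgc * G_PER_PG / (area_km2 * M2_PER_KM2)

n_years = 6               # summers 2002 to 2007
sigma_year = 0.2          # PgC, prior uncertainty for one summer

# Uncertainty in the 6-year mean for the two limiting cases of the text.
sigma_mean_independent = sigma_year / math.sqrt(n_years)
sigma_mean_correlated = sigma_year

# Posterior uncertainty if a 60% reduction applies and errors are fully
# correlated from year to year.
sigma_post_correlated = sigma_year * (1.0 - 0.60)

prior_density = pgc_to_gc_per_m2(-0.3)     # compare with the quoted ~ -79 gC/m2
post_density = pgc_to_gc_per_m2(-0.35)     # compare with the quoted ~ -91 gC/m2
```

 The independent-error case gives about 0.08 PgC and the fully correlated case 0.2 PgC, which brackets the prior uncertainty as described above.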
 The increase in the summer uptake by the inversion occurs every year except for 2002 (Figure 7a). This increase is mainly applied in the morning (Figures 5 and 7b). One reason for ORCHIDEE's underestimate of uptake is that its run commences from equilibrium, negating one important mechanism for carbon sinks. The amplitude of the interannual variability is slightly decreased by the inversion (by ∼0.03 PgC). The positive anomaly in 2003 is slightly modified but its amplitude becomes far larger than that for other years except 2004. For the window 6:00–12:00 (Figure 7b), the anomaly in 2003 is very large because smaller increases are made to prior uptakes for 2003 than for other years. The response of ORCHIDEE to the anomalous weather in 2003 counteracts the mean bias. However, the calculation of anomalies from the relatively short 2002–2007 period calls for caution in interpreting such results.
 According to estimates of NEE in areas covered mostly by a given PFT of ORCHIDEE, the inversion increases the summer uptake of forests for any year except 2004 while it decreases the uptake for croplands for 2002, 2004, 2005 and 2007 (Figure 8). These corrections may be sensitive to the crop sowing in winter, which is not accounted for in the version of ORCHIDEE used for this study [Smith et al., 2010a]. The main corrections applied in 2003 during the drought occur for areas dominated by croplands (Figure 8). Smith et al. [2010b] also increase the anomaly of the uptake by cropland in 2003 by improving the modeling of crops in ORCHIDEE. In particular, they account for irrigation, which reduces the negative impacts of drought on NEE, and for winter- and spring-type crop phenology, which attenuates the carbon loss in July–September. In 2003, human management moderated the impact of the heat wave so that the summer 2003 NEE positive anomaly should be stronger for forest than for croplands. This can be seen in the inverted fluxes but not in the fluxes from ORCHIDEE, for which the NEE from croplands also displays a strong positive anomaly in summer 2003. However, the inversion produces a NEE for croplands in 2006 and 2007 that is similar to that for 2003, which seems abnormal. This may highlight some problems in the inversion system or a large dependence of the inversion results on the CO2 observation network, which changed significantly from 2003 to 2006.
 Corrections to prior fluxes occur at quite high resolution despite the 250 km length-scale of correlations in B. The main patterns of the corrections for mean summer NEE in regions with reasonable uncertainty reduction (Figure 9; see also the map of mean summer NEE and corresponding anomalies in Figures S5 and S6 in auxiliary material Text S1) consist of an increase of the uptake in the northern Balkans and Italy, a decrease of the uptake in northern France and western Germany, and a strong interannual variability elsewhere. There are contiguous areas with positive and negative flux increments but the presence of stations in each of these areas shows that this is constrained by the data and that it does not correspond to the dipoles which are artifacts of inversion in weakly constrained regions [Peylin et al., 2002]. Abnormal corrections occur in areas of low uncertainty reduction far from CE stations (auxiliary material Figure S5 in Text S1), e.g. in Spain in 2002, 2004 and 2005. Their amplitude does not exceed that of the uncertainty in prior fluxes but there are clear indications that they yield unrealistic patterns, e.g. a strong negative anomaly of NEE despite the drought in Spain during summer 2005 (auxiliary material Figure S6 in Text S1). Estimates of NEE over the whole European domain in Figures 7 and 8 are however sensible because the spatial aggregation decreases the uncertainty in the resulting NEE [Carouge et al., 2010b] and the uncertainty reduction for the whole domain, even with restrictions to croplands, forest areas, or to the window 6:00–12:00, should be larger than 30% according to the results from section 3.
 The inversion modifies the shape of the monthly variability every year (Figure 10; see also the monthly anomalies on auxiliary material Figure S7 in Text S1) even though inverted estimates lie within the error bars of the prior estimates. It decreases the uptake in June except in 2005, it always increases the uptake in July, it increases the uptake in August except in 2005, and it increases the uptake in September except in 2004 and 2007. Smith et al. [2010b] obtain similar corrections for croplands by improving the modeling of crops in ORCHIDEE. In fluxes from ORCHIDEE, summers 2002, 2006 and 2007 display an uptake which slightly increases in September while inverted fluxes display an uptake which keeps on decreasing in September every year. Monthly anomalies are reduced by the inversion. The monthly anomalies during 2003 are kept positive by the inversion, despite a large decrease in the anomaly for September. Those modifications confirm that the system identifies specific corrections at the monthly scale rather than mean corrections in time.
 This paper describes the setup of an inverse modeling system assimilating hourly atmospheric data from ground based stations for the estimate of European CO2 NEE. It is based on a mesoscale atmospheric transport model whose skills are evaluated qualitatively and quantitatively using comparisons to Radon and CO2 data. Radon data are used to compute the standard deviations of the model error when simulating CO2 at low altitude. The estimate of temporal correlations in the misfits between simulated and observed Radon data should be used to set up correlations in the model error for future studies. The method would benefit from the measurement of Radon at more continuous stations. Measurements of PBLH with ceilometers or LIDARs would also be useful to correct for errors from the simulation of PBLH, but the use of Radon data provides an estimate of the model error which accounts for other important error sources. The confidence in this estimate would benefit from improved estimates of Radon fluxes, but a more complex method is needed to account for the differences when simulating Radon and CO2 in the impact of the flux variability on the model error.
 The time windows during which the model errors and biases should be small enough so that hourly misfits can be used to retrieve corrections in the fluxes at fine scale are identified as a function of the measurement locations. This defines when data are selected for assimilation in the inverse modeling system. Increasing the spatial resolution of the model would enable a better representation of mountain sites and of their vicinity so that far more data could be assimilated at high altitude. This would provide a further reduction of uncertainty in fluxes over large areas due to the large spatial influence of such observations. At low altitude stations, better estimates of PBLH are needed to assimilate data from 20:00 to 12:00 or to improve their use between 12:00 and 20:00. Based on an improved simulation of the PBL, future studies should determine how useful the assimilation of measurements from several levels at the high tower stations (using non diagonal R) can be, especially if data can be assimilated during nighttime.
 Analyzing hourly misfits to concentration data, the inverse modeling system is used to derive corrections to fluxes from ORCHIDEE at 6-hour/0.5° resolution. However, the setup of correlations between the uncertainties in fluxes from the same 6-hour window of the day in ORCHIDEE using 250 km/1 month correlation lengths strongly smooths these corrections. Here, the main improvements of the regional inversion compared to the most recent global inversions are linked to the decrease of model error by using a mesoscale model at high resolution rather than to the increase of the resolution in the corrections. There are structures of uncertainties in the fluxes from ORCHIDEE at scales larger than 250 km/1 month. Setting up the correlations using exponential functions and shorter lengths would produce a smaller uncertainty in the area-integrated prior fluxes. However, the present setup defines a ∼60% uncertainty in prior total fluxes for summer, which already seems too low given the confidence generally placed in biosphere models [Schulze et al., 2010; Ciais et al., 2010b, 2010c]. A modeling of the correlations in B which would account for the several dominant length scales could form a useful input. A new definition of the control vector may also help characterize prior error covariances. The system could adjust the Gross Primary Production (GPP) and the respiration separately [Tolk et al., 2011]. However, it would have to invert far more unknowns while assimilating the same amount of data. It could also invert corrections for the parameters in the ecosystem model underlying the CO2 fluxes, as in Carbon Cycle Data Assimilation Systems (CCDAS) [Rayner et al., 2005; Knorr et al., 2010; Tolk et al., 2011]. However, as in the system of Peters et al., the limited number of parameters inverted would imply aggregation errors. The definition and the modeling of a limited number of PFTs in the ecosystem models would also be a new source of model error which should be characterized.
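 The dependence of the area-integrated prior uncertainty on the correlation length can be illustrated with a minimal numerical sketch. It is not the configuration of B used in this study: the 1-D transect, the unit standard deviations, the purely exponential correlation model, and the 100 km alternative length are all assumptions chosen for illustration. Only the 250 km length and the 50 km spacing echo values quoted in the text.

```python
import numpy as np

def exponential_correlation(coords_km, length_km):
    """Correlation matrix exp(-d / L) between points on a line."""
    dist = np.abs(coords_km[:, None] - coords_km[None, :])
    return np.exp(-dist / length_km)

def integrated_sigma(sigma, corr):
    """Standard deviation of the summed flux error for B = D C D."""
    b = sigma[:, None] * corr * sigma[None, :]
    ones = np.ones(sigma.size)
    return float(np.sqrt(ones @ b @ ones))

# Hypothetical 1-D transect with 50 km spacing and unit prior
# standard deviation in every cell (arbitrary units).
coords = np.arange(0.0, 2000.0, 50.0)
sigma = np.ones(coords.size)

sigma_long = integrated_sigma(sigma, exponential_correlation(coords, 250.0))
sigma_short = integrated_sigma(sigma, exponential_correlation(coords, 100.0))
sigma_uncorr = integrated_sigma(sigma, np.eye(coords.size))

print(sigma_uncorr, sigma_short, sigma_long)
```

 With identical cell-level standard deviations, the integrated uncertainty grows with the correlation length, which is why shortening the lengths in B would lower a prior total uncertainty that already appears too small.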
 The present setup of R likely overestimates the weight that should be given to the observations during the inversion because it does not take into account space or time correlations and several sources of observation error such as errors in the anthropogenic forcing, errors in the boundary conditions, and aggregation errors which will be explored in future studies.
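 The effect of neglecting correlations in R can be sketched with a scalar toy problem, in which one flux correction is observed repeatedly by hourly data. All numbers below (24 observations, a 3 ppm observation error, a prior standard deviation of 2, a 6-hour e-folding time for the error correlations) are hypothetical and are not taken from the system described here.

```python
import numpy as np

def posterior_sigma(prior_sigma, r_matrix):
    """Posterior std of one scalar correction observed n times (H = ones)."""
    h = np.ones(r_matrix.shape[0])
    # Information brought by the data: H^T R^-1 H
    information = h @ np.linalg.solve(r_matrix, h)
    return float(1.0 / np.sqrt(1.0 / prior_sigma**2 + information))

n_obs = 24          # hypothetical number of hourly observations
obs_sigma = 3.0     # hypothetical observation error std
prior_sigma = 2.0   # hypothetical prior std of the correction

hours = np.arange(n_obs)
lag = np.abs(hours[:, None] - hours[None, :])

r_diag = obs_sigma**2 * np.eye(n_obs)
# Assumed exponential temporal correlation with a 6-hour e-folding time.
r_corr = obs_sigma**2 * np.exp(-lag / 6.0)

sigma_diag = posterior_sigma(prior_sigma, r_diag)
sigma_corr = posterior_sigma(prior_sigma, r_corr)

print(sigma_diag, sigma_corr)
```

 In this toy case the diagonal R yields the smaller posterior standard deviation, which is the sense in which a diagonal R gives too much weight to the observations when their errors are in fact correlated.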
 The uncertainty reduction by the inversion is estimated using experiments assimilating synthetic data. Such estimates should be considered qualitatively rather than quantitatively because they rely on strong hypotheses and approximations of the prior uncertainty. The estimate of uncertainty reduction is a general weakness of inverse modeling systems even though this is a critical result. In particular, the value of ∼60% that is extrapolated here as an upper limit for the uncertainty reduction for summer fluxes may be very optimistic, despite all the precautions detailed in section 3. The experiments indicate a reduction of uncertainty which is large in the areas of influence of the CE stations defined by the length scales of the correlations in B, but which is small far from the CE stations. The atmospheric transport seems to play a weaker role than B in the spatial extrapolation of the information from these stations. A clear lack of CE continuous stations is thus highlighted by the maps of uncertainty reduction. Analysis of inverted fluxes when using real data confirms that the increments are more reliable in the areas of influence of the CE stations than in areas far from the CE stations. However, the atmospheric transport plays a critical role in the temporal extrapolation of the information from one 6-hour window of the day to another, and in particular from windows during which data are assimilated to windows during which there is no data assimilation. This extrapolation generally decreases misfits to concentration data that are not assimilated, but in the area of influence of a given station, the uncertainty reduction for fluxes is far larger during windows when data are assimilated than during other windows.
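 The spatial pattern of uncertainty reduction described above can be reproduced qualitatively with a linear-Gaussian sketch. It is a strong simplification and not the variational system of this study: the observation operator H simply samples the flux in the cell of each station (transport is ignored), B uses an assumed exponential correlation with a 250 km length on a hypothetical 1-D domain, and the station positions and observation error are invented for illustration.

```python
import numpy as np

# Hypothetical 1-D domain: 60 cells with 50 km spacing.
coords = np.arange(0.0, 3000.0, 50.0)
dist = np.abs(coords[:, None] - coords[None, :])

# Prior covariance B: unit variance, assumed exponential correlations.
b = np.exp(-dist / 250.0)

# Hypothetical stations clustered in one part of the domain.
station_cells = np.array([10, 14, 18])
h = np.zeros((station_cells.size, coords.size))
h[np.arange(station_cells.size), station_cells] = 1.0

# Diagonal observation error covariance R (assumed std of 0.5).
r = 0.5**2 * np.eye(station_cells.size)

# Posterior covariance A = B - B H^T (H B H^T + R)^-1 H B.
gain = b @ h.T @ np.linalg.inv(h @ b @ h.T + r)
a = b - gain @ h @ b

# Uncertainty reduction per cell: 1 - sigma_posterior / sigma_prior.
reduction = 1.0 - np.sqrt(np.diag(a)) / np.sqrt(np.diag(b))

near = float(reduction[14])  # cell hosting a station
far = float(reduction[55])   # cell far from every station
print(near, far)
```

 In this sketch the reduction is large only within a few correlation lengths of the stations, so the footprint of the result is set by B rather than by the data themselves.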
 Despite the influence of the long length scales in the correlations of B, the proximity of numerous CE stations, the use of hourly data from these stations, and to a lesser extent the heterogeneity of the standard deviations of B at fine scale can yield corrections which have relatively fine spatial and temporal variability. Such corrections at high spatial scale must be analyzed carefully but local improvements can be expected in the regions well covered by the CE network.
 Comparisons to independent data, especially to CE L4 flux data, give a relatively high confidence in corrections by the inversion to the mean fluxes or to the monthly variability of the fluxes. The inversion successfully increases the respiration at night and the uptake between 6:00 and 12:00 every summer, with larger corrections during the day than during the night, which decreases the biases in the fluxes from ORCHIDEE. This indicates that the use of independent corrections for each of the different 6-hour windows of the day is sensible. However, it leads to a decrease of the uptake and thus to an increase of the bias relative to the CE L4 data during the window 12:00–18:00. The problem is a form of temporal aggregation error which can be ameliorated by increasing the time resolution (by adjusting 3-hourly to hourly fluxes). However, the need for estimating correlations will be more critical between errors in the prior fluxes for different 3-hour or 1-hour windows than between errors for different 6-hour windows.
 The restriction of the experiments to summer periods highlights several trends in the corrections which are used to assess the reliability of the inversion or to reveal some limitations in ORCHIDEE. The inverted fluxes will be analyzed in more depth in future experiments covering all seasons for the period 2002–2007, and they will be compared to estimates from recent studies such as that of the CE project [Schulze et al., 2010; Ciais et al., 2010b, 2010c], which have tried to improve the knowledge of the mean European carbon balance or of the interannual variability of the annual fluxes, or to the upscaling of flux measurements by Jung et al. However, given the few stations used during the inversion, the changes in the observation network from year to year over the last decade may have a significant impact on the interannual variability of the inverted fluxes, an impact which should be difficult to characterize. Another issue is that the interannual variability may not be completely accounted for by the inverse modeling system due to the offset applied to the spatial and temporal mean of concentrations to unbias misfits every summer. This bias is corrected because a part of it comes from errors (from transport, representativity or analyzed fluxes) in the global inversion used to impose the boundary conditions, but another part of this bias could be related to errors in the interannual variability of summer fluxes in ORCHIDEE. The definition of an offset based on data from MHD is difficult due to the problems of representation that this station raises. An extension of the domain for the regional inversion may ensure that errors in the global inversion would not have a significant impact at the CE sites. A two-way coupling of the regional and the global inversions or the inversion of boundary increments [Peylin et al., 2005] along with flux increments in the regional system would be other solutions to deal with these errors.
 The inversion increases the summer uptake, likely because ORCHIDEE does not account for all sources of growth and management in the ecosystems or because it has a seasonal variability that is too low. Paradoxically, several results (Figures 2 and 5) indicate that the window of uptake in the diurnal cycle of the fluxes from ORCHIDEE is too long. This problem cannot be handled correctly by the use of 6-hour mean corrections during the inversion, which supports the idea of adjusting 3-hourly to hourly fluxes in future studies. Analyses of the corrections as functions of the PFTs seem to reveal some particular weaknesses in the modeling of crops in ORCHIDEE.
 Finally, many of the results displayed in this study are significantly sensitive to the setup of the system and should therefore be analyzed carefully. Many experiments have been conducted to refine the setup of the CHIMERE inverse modeling system (in particular the setup of B and R). The next improvements in the estimate of the fluxes should be obtained with an increase of the spatial/temporal resolution of the model (which should decrease the model error) or of the control vector (which should decrease the aggregation error), with a better characterization of the prior uncertainty and of the observation errors (in particular by accounting for correlations in R, which is supported by Lauvaux et al. [2009a]), with an account of the errors from the boundaries, and with an improved data treatment before assimilation.
 We would like to thank Philippe Peylin, Nicolas Viovy, Philippe Bousquet and Leonard Rivier, who have provided simulations of ORCHIDEE, LMDZ and MM5, and the anthropogenic emission product. We also thank all the principal investigators and scientists of the CarboEurope-Integrated Project who have provided the data used in this paper, and in particular Ingeborg Levin (Universität Heidelberg), Attilio Di Diodato and Marco Alemanno (Servizio meteorologico dell' Aeronautica Militare Italiana), Laszlo Haszpra (Hungarian Meteorological Service), Markus Leuenberger (Universität Bern), Alcide Di Sarra and Salvatore Piacentino (Agenzia nazionale per le nuove tecnologie, l'energia e lo sviluppo economico sostenibile), Josep-Anton Morgui (Universitat de Barcelona), Jošt Valentin Lavrič (Max-Planck-Institute for Biogeochemistry), Francesco Apadula (Research on Energy Systems) and Frank Meinhardt (Umwelt Bundes Amt). The setup of CarboEurope atmospheric measurements was also supported by the CHIOTTO project and the Max-Planck-Society. We thank the custodians of the International Foundation High Altitude Research Stations Jungfraujoch and Gornergrat (HFSJG) for their help with the continuous CO2 measurements at Jungfraujoch. The L4 eddy covariance data have been downloaded from http://gaia.agraria.unitus.it/database/carboeuropeip/. We thank Dario Pappale and the scientists involved in the setup of this database. This study was co-funded by the European Commission under the EU Seventh Research Framework Programme (grant agreement 218793, MACC). Peter Rayner is in receipt of an ARC Professorial Fellowship (DP1096309).