Carbon dioxide (CO2) fluxes at the Earth's surface may be recovered (or inverted) from the observed spatial and temporal gradients of CO2 concentrations in the atmosphere by applying Bayes' theorem [e.g., Enting et al., 1995; Bousquet et al., 2000; Gurney et al., 2002]. Atmospheric mixing makes the problem ill-constrained, and therefore prior information about the CO2 flux originating from the land and water surface is also used in the inversion process. In statistical terms, this approach transforms the prior probability density p(x) of the CO2 fluxes, jointly called the state vector x here, into the posterior probability density p(x∣y) conditioned on the atmospheric measurements, jointly called y. The statistically optimal estimator of the fluxes, given the available information, corresponds to the maximum of the function p(x∣y). By design, it critically depends on the assumed prior density function p(x). Under the numerically convenient assumption of a multivariate Gaussian density, describing p(x) requires assigning means, variances and correlations. The atmospheric inversion studies of CO2 fluxes published so far have assumed various probability distributions centered on climatology, regional inventory statistics or the output of terrestrial ecosystem models, as well as ocean carbon cycle models [Gurney et al., 2002]. In practice, some of the key characteristics of the prescribed a priori flux error distributions p(x) in use stem from the capacity of the current flux-inversion systems to deal with large state vectors x, rather than from the statistics of the inference problem: the largest correlation patterns in space and time are specified in classical analytical systems (i.e., coarse-region inversions [e.g., Gurney et al., 2002]), while the narrowest structures (i.e., pixel size) can be introduced in variational (i.e., adjoint-based) schemes [Chevallier et al., 2005; Rödenbeck, 2005; Baker et al., 2006]. Ensemble methods lie in between [Zupanski et al., 2007; Peters et al., 2007; Feng et al., 2009]. This subjective choice of error correlation structures critically influences the way the information from a single atmospheric measurement is spread in space and time by the flux inversion systems.
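For readers less familiar with the formalism, the Bayesian update described above can be written out as a short sketch. The symbols x_b (prior mean), B (prior error covariance), H (a linearized observation operator that maps fluxes to concentrations) and R (observation error covariance) are notational assumptions introduced here for illustration; they are not defined in the text above. The Gaussian and linear forms are the textbook case, not necessarily the exact setup of any cited system.

```latex
% Bayes' theorem for the flux state vector x given the measurements y
p(x \mid y) \;\propto\; p(y \mid x)\, p(x)

% Assumed Gaussian prior and Gaussian observation errors with a linear operator H:
% maximizing p(x | y) is equivalent to minimizing the cost function
J(x) = \tfrac{1}{2}\,(x - x_b)^{\mathrm{T}} B^{-1} (x - x_b)
     + \tfrac{1}{2}\,(y - Hx)^{\mathrm{T}} R^{-1} (y - Hx)

% whose minimizer (the posterior mode, which is also the posterior mean here) is
\hat{x} = x_b + B H^{\mathrm{T}} \left( H B H^{\mathrm{T}} + R \right)^{-1} (y - H x_b)
```

In this form the role of the prior is explicit: the variances and correlations assigned in B determine, through the gain B H^T (H B H^T + R)^{-1}, how the departure y - H x_b of a single measurement is spread over the elements of x.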
Two studies have attempted to shed light on the characteristics of p(x) based on observations. Michalak et al. used CO2 concentration measurements within a flux inversion system by introducing some poorly known characteristics of the prior errors in the state vector x. They highlighted the power of their method but stressed its subjectivity. In the second study, Chevallier et al. relied on the raw, non-gap-filled CO2 flux measurements at eddy-covariance flux sites (34 in total) in the northern hemisphere to constrain p(x). They showed a heavy-tailed distribution p(x) that contradicts the usual assumption of a multivariate Gaussian distribution. Further, the error correlations appeared to follow a linear temporal dependency after the second lag day, without any particular spatial structure.
Following the approach of Chevallier et al., we examine the characteristics of p(x) for terrestrial ecosystem CO2 fluxes when p(x) is centered on the Organizing Carbon and Hydrology In Dynamic Ecosystems (ORCHIDEE) model, a process-based ecosystem model [Krinner et al., 2005]. Our study advances previous knowledge in two ways. First, it uses a much wider archive of eddy-covariance sites (156 in total) with gap-filled records, which provides more detailed information on p(x) for a variety of biomes. Second, we explore the influence of temporal and spatial aggregation on the statistics in order to bridge the gap between the local scale of the daily eddy-covariance flux measurements that are used to define p(x) and the typically much larger spatial and temporal scales of the inversion systems.
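The reason temporal aggregation matters for the error statistics can be illustrated with a small numerical sketch. It is not the method of this study: the function names `aggregate_std` and `ar1_series` are hypothetical, and the "residuals" are synthetic series standing in for daily model-minus-observation flux differences. The sketch compares how the standard deviation of window-averaged residuals shrinks for temporally uncorrelated errors and for errors with an assumed lag-1 autocorrelation of 0.8.

```python
import numpy as np


def aggregate_std(residuals, window):
    """Sample standard deviation of non-overlapping `window`-day means."""
    residuals = np.asarray(residuals, dtype=float)
    n_blocks = residuals.size // window
    if n_blocks < 2:
        raise ValueError("need at least two full windows")
    # Drop the incomplete trailing window, then average within each block.
    blocks = residuals[: n_blocks * window].reshape(n_blocks, window)
    return blocks.mean(axis=1).std(ddof=1)


def ar1_series(n, rho, rng):
    """Synthetic AR(1) series with unit marginal variance (illustration only)."""
    eps = rng.standard_normal(n) * np.sqrt(1.0 - rho**2)
    x = np.empty(n)
    x[0] = rng.standard_normal()
    for t in range(1, n):
        x[t] = rho * x[t - 1] + eps[t]
    return x


rng = np.random.default_rng(0)
white = rng.standard_normal(3650)   # uncorrelated daily errors (assumed)
red = ar1_series(3650, 0.8, rng)    # temporally correlated daily errors (assumed)

for window in (1, 7, 30):
    print(window, aggregate_std(white, window), aggregate_std(red, window))
```

For uncorrelated errors the standard deviation of an N-day mean falls roughly as 1/sqrt(N), whereas positively correlated errors average out much more slowly, which is why the correlation structure of p(x) has to be known before daily site-level statistics can be carried over to coarser scales.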