Although Siberia is one of the largest carbon reservoirs in the world, its carbon sink remains poorly understood because observations are limited. We present the first results of atmospheric CO2 inversions utilizing measurements from a Siberian tower network (Japan-Russia Siberian Tall Tower Inland Observation Network; JR-STATION) and four aircraft sites, in addition to surface background flask measurements by the National Oceanic and Atmospheric Administration (NOAA). Our inversion with only the NOAA data yielded a boreal Eurasian CO2 flux of −0.56 ± 0.79 GtC yr−1, whereas we obtained a weaker uptake of −0.35 ± 0.61 GtC yr−1 when the Siberian data were also included. This difference is mainly explained by a weakened summer uptake, especially in East Siberia. We also found that including the Siberian data had significant impacts on inversion results over northeastern Europe as well as boreal Eurasia. The inversion with the Siberian data reduced the regional uncertainty by 22% on average in boreal Eurasia, and further uncertainty reductions of up to 80% were found in eastern and western Siberia. Interannual variability was clearly larger in the inversion that included the Siberian data than in the inversion without them. In the inversion with NOAA plus Siberian data, east Siberia showed larger interannual variability than west and central Siberia. Finally, we conducted forward simulations using the estimated fluxes and confirmed that the fit to independent measurements over central Siberia, which were not included in the inversions, was greatly improved.
The terrestrial biosphere is an important carbon reservoir and controls much of the observed variability of atmospheric CO2, including its seasonal cycles and interannual variations. Northern high-latitude ecosystems are thought to be a significant sink of anthropogenic CO2 emissions, but the magnitude and distribution of this carbon sink are still uncertain [McGuire et al., 2009; Hayes et al., 2011, and references therein]. Northern high-latitude regions are particularly sensitive to climate variations and are expected to be greatly influenced by future climate warming [e.g., Ito, 2005; Zhuang et al., 2006; Intergovernmental Panel on Climate Change (IPCC), 2007; McGuire et al., 2009]. Siberia, in northern Eurasia, contains large amounts of plant biomass and soil organic carbon, making it one of the largest carbon reservoirs in the world [e.g., Houghton et al., 2007; Tarnocai et al., 2009; Kurganova et al., 2010; Schepaschenko et al., 2011]. Accurate estimates of carbon fluxes in Siberia are therefore essential, both for understanding global and regional carbon cycles and for predicting future changes in the Siberian carbon cycle. Zimov et al. predicted that future warming in high latitudes would release CO2 from Siberian permafrost, which contains large amounts of organic carbon, with a positive feedback on climate change. A recent analysis of Coupled Model Intercomparison Project phase 3 (CMIP3) models using surface temperature data suggested that temperature changes at higher latitudes of the Northern Hemisphere have been larger than previously estimated by equal-weight multimodel means of the CMIP3 models [Abe et al., 2011]. A process-based modeling study by Hayes et al. [2011] indicated that the sink for atmospheric CO2 in land areas of the northern high latitudes (arctic tundra and boreal forests) may have weakened over a recent 10 year period. This highlights the sensitivity of these regions to certain controlling factors, such as climate conditions, atmospheric CO2 concentrations, tropospheric ozone (O3) levels, nitrogen (N) deposition rates, and disturbances due to wildfire, timber harvest, and agriculture.
Many studies have used “bottom-up” or “top-down” approaches to estimate current carbon fluxes. For example, “bottom-up” approaches, such as direct flux measurement or process-based ecosystem modeling, sometimes with satellite remote sensing analysis added, have indicated that the Siberian region absorbs 0.520 GtC yr−1 [Nilsson et al., 2003], 0.563 GtC yr−1 to 0.761 GtC yr−1 [Dolman et al., 2012], or 0.112 GtC yr−1 (for all of boreal Asia) [Hayes et al., 2011]. A “top-down” approach, that is, inverse modeling using atmospheric transport models and atmospheric CO2 observations, can also be effective for estimating regional and global carbon fluxes from limited atmospheric observations, and this approach has been successful in deriving reasonable carbon fluxes for most land and ocean areas [e.g., Gurney et al., 2002, 2003; Rödenbeck et al., 2003; Gurney et al., 2004; Chevallier et al., 2010; Bruhwiler et al., 2011]. However, until now, few inverse modeling studies have focused on Siberia because the available observations there are still sparse relative to its large area. In the TransCom 3 (TC3) control inversion, Gurney et al. used 17 models to estimate fluxes for boreal Eurasia and obtained values that ranged from −1.70 to 0.71 GtC yr−1, even though there was no observation site in the target region. By adding data from three Siberian aircraft sites [Machida et al., 2001] to the non-Siberian TC3 sites used by Gurney et al., Maksyutov et al. obtained a boreal Eurasian flux of −0.63 ± 0.36 GtC yr−1, a sink larger by 0.2 GtC yr−1 than that obtained with the TC3 setup. Downwind observations could also be used to constrain huge regions such as boreal Eurasia, where geography and climate often make it impractical to maintain observatories [Patra et al., 2004]. Chevallier et al., however, argued that extending the observational network into Eastern Europe and Siberia is important to reduce uncertainty in fluxes estimated by inversion methods over these regions.
Periodic atmospheric CO2 measurements were carried out from aircraft at altitudes of up to 4000 m over Zotino (ZOTTO; 60.75°N, 89.38°E) in central Siberia at 12 to 21 day intervals [Levin et al., 2002; Lloyd et al., 2002] from 1998 to 2005 by the Max Planck Institute for Biogeochemistry, and data from four altitudes have been provided to the GLOBALVIEW-CO2 data product. Inferring CO2 surface fluxes around ZOTTO with an inverse approach, Van der Molen and Dolman analyzed the impact of horizontal gradients of CO2 concentration and mesoscale atmospheric heterogeneity on the estimated fluxes by using measurements at ZOTTO and the Regional Atmospheric Modeling System. High-quality, continuous CO2 measurements began at the Zotino Tall Tower Observatory at ZOTTO in April 2009, and Winderlich et al. reported large seasonal amplitudes of CO2, larger than those observed at continental tall towers under oceanic influences or at mountainous tower sites. Others have carried out campaign CO2 measurements over Siberia. For example, Nakazawa et al. [1997a] measured tropospheric concentrations of CO2 and trace gases by aircraft campaigns for several years, although only in summer; YAK-AEROSIB (Airborne Extensive Regional Observations in Siberia) aircraft campaigns in 2006 precisely measured the variability of CO2, CO, and O3 [Paris et al., 2008]; and the Trans-Siberian Observations Into the Chemistry of the Atmosphere (TROICA) project has measured CO2 and other species such as carbon compounds, O3, nitrogen oxides, and aerosols about once per year along the Trans-Siberian Railroad from Moscow to Khabarovsk since 1995 [e.g., Turnbull et al., 2009].
Despite these efforts to precisely observe CO2 concentrations, most of the observations were obtained from several campaigns during specific seasons or over a few years. As a result, the number of available CO2 observations remains too small to fully constrain carbon fluxes in Siberia with inverse modeling. To overcome the problem of sparse observations of atmospheric CO2 over Siberia and to capture seasonal cycles, vertical profiles, and long-term trends, the Center for Global Environmental Research (CGER) of the National Institute for Environmental Studies (NIES) of Japan, with the cooperation of the Russian Academy of Sciences (RAS), began periodic, precise aircraft measurements over Surgut (SUR) in 1993, Yakutsk (YAK) in 1996, and Novosibirsk (NOV) in 1997, which Maksyutov et al. used in the inversion analysis described above. In the aircraft data, seasonal amplitudes of CO2 variations were found to be larger over Siberia than those at background sites in the same latitudes [Machida et al., 2001], which suggested that the observations were greatly influenced by the high activity of Siberia's terrestrial ecosystem. In addition, CGER/NIES and RAS constructed a new Siberian tower network, the Japan-Russia Siberian Tall Tower Inland Observation Network (JR-STATION), in 2002 to observe regional and short-term variations of greenhouse gases (CO2 and CH4) and to produce data for inverse modeling that would obtain regional carbon estimates [Sasakawa et al., 2010, 2012; Watai et al., 2010]. The network (eight towers in western Siberia and one in eastern Siberia at Yakutsk) continuously measures CO2. At one site, Berezorechka (BRZ), aircraft measure vertical profiles of CO2 from the planetary boundary layer (PBL) to the lower free troposphere.
This study was designed to estimate monthly fluxes of CO2 on a subcontinental scale over Siberia from 2000 to 2009 by using an inverse modeling approach with these unique Siberian data (from nine towers and four aircraft sites) together with data from global background sites, thereby overcoming the lack of Siberian observations in inversion studies. Section 'Material and Methods' outlines the inversion method, the transport model, and the observations used for the study. In section 'Results', we describe annual fluxes and uncertainties in Siberia and global fluxes. To highlight the impact of the observation network on the inverted fluxes, we assess how the Siberian data affect global and regional estimates of carbon flux by comparing these results with results obtained without Siberian data. In section 'Discussion', we compare our estimated fluxes with previous bottom-up and inversion (top-down) studies. Our conclusions follow in section 'Conclusions'.
2 Material and Methods
We used a Bayesian inversion framework, which uses an atmospheric transport model as a linear observation operator to infer CO2 surface fluxes from 2000 to 2009 from a priori fluxes, their uncertainties, and CO2 measurements. We first describe the four components of the inverse modeling system (the transport model, the inversion method, the a priori flux data set, and the observations) and then the inversion experiments that we conducted.
2.1 The Atmospheric Transport Model
 We used an off-line atmospheric transport model developed by NIES (NIES-TM) [Maksyutov et al., 2008; Belikov et al., 2011] to calculate CO2 concentrations using an a priori data set and response functions corresponding to each basis function of the model. An earlier version of NIES-TM [Maksyutov et al., 2008] was one of the models participating in TC3 [e.g., Gurney et al., 2003, 2004; Baker et al., 2006; Patra et al., 2008]. The NIES-TM used in this study was implemented with a 2.5° × 2.5° horizontal resolution and 32 vertical levels in a hybrid sigma-isentropic (σ-θ) coordinate system with the isentropic part above 350 Kelvin (K). Consequently, it can reproduce the age of stratospheric air more reasonably than the earlier version, which used a sigma coordinate system for the vertical levels. The model advection was calculated with a flux-form algorithm following a second-order van Leer scheme [van Leer, 1977] and driven by a mass-flux-corrected meteorological field produced from the Japanese 25 year reanalysis (JRA-25)/Japan Meteorological Agency (JMA) Climate Data Assimilation System (JCDAS) data set [Onogi et al., 2007] to precisely maintain the model linearity. The PBL height was taken from the Interim Re-analysis data set of the European Centre for Medium-range Weather Forecasts [Dee et al., 2011]. Vertical mixing was represented in the model by cumulus convection based on the convective precipitation rate, provided by JCDAS and calculated using a Kuo-type scheme following Grell et al. , and turbulent diffusion with explicitly parameterized physical processes in the PBL. This new version of the model is one of the participating models of the “Comprehensive Observation Network for Trace gases by Airliner (CONTRAIL)” transport model intercomparison for CO2 [Niwa et al., 2011] and the TransCom-CH4 experiment [Patra et al., 2011].
2.2 Inversion Method
 To infer regional carbon fluxes in this study, we used a fixed-lag Kalman smoother, which is a more computationally efficient scheme for handling a large number of measurements than a full-matrix batch inversion [e.g., Bruhwiler et al., 2005; Peters et al., 2005; Bruhwiler et al., 2011; Tang and Zhuang, 2011].
Assuming a linear system, the vector of observations (or of differences between model predictions and observations) z is described in the model space by

$$\mathbf{z} = \mathbf{H}\mathbf{s} + \boldsymbol{\nu} \tag{1}$$

where s is the vector of sources and sinks to be estimated, H is an observation operator that maps the fluxes in the model space into the measurement space, and ν is the “data uncertainty” of the approximate observations Hs, which includes the observation uncertainty and the error of the transport model in representing the observations z. By assuming that z, s, and ν have Gaussian probability distributions and adding a priori information (invoking Bayes' Theorem), the cost function J to be minimized can be described as follows:

$$J = \tfrac{1}{2}\left[(\mathbf{z}-\mathbf{H}\mathbf{s})^{T}\mathbf{R}^{-1}(\mathbf{z}-\mathbf{H}\mathbf{s}) + (\mathbf{s}-\mathbf{s}_0)^{T}\mathbf{Q}^{-1}(\mathbf{s}-\mathbf{s}_0)\right] \tag{2}$$

where s0 is the vector of the a priori flux, R and Q are error variance-covariance matrices for the model-data mismatch and a priori flux estimates, respectively, and T represents the transpose of a matrix. The Kalman gain matrix is

$$\mathbf{K} = \mathbf{Q}\mathbf{H}^{T}\left(\mathbf{R}+\mathbf{H}\mathbf{Q}\mathbf{H}^{T}\right)^{-1} \tag{3}$$

and the a posteriori fluxes s and the a posteriori covariance matrix Q′ are expressed as

$$\mathbf{s} = \mathbf{s}_0 + \mathbf{K}(\mathbf{z}-\mathbf{H}\mathbf{s}_0) \tag{4}$$

$$\mathbf{Q}' = \mathbf{Q} - \mathbf{K}\mathbf{H}\mathbf{Q} \tag{5}$$

The inverse of the matrix $(\mathbf{R}+\mathbf{H}\mathbf{Q}\mathbf{H}^{T})$ was computed by LU factorization with the LAPACK library [Anderson et al., 1999].
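The Bayesian update described above (Kalman gain, a posteriori fluxes, and a posteriori covariance) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `bayesian_update` and the toy two-region problem are hypothetical, and `numpy.linalg.solve` (which calls a LAPACK LU solver) stands in for the explicit LU-based inversion mentioned in the text.

```python
import numpy as np

def bayesian_update(s0, Q, H, R, z):
    """Posterior fluxes and covariance for a linear Gaussian inversion.

    s0: a priori fluxes (n,); Q: a priori covariance (n, n);
    H: observation operator (m, n); R: data covariance (m, m);
    z: observations (m,).
    """
    S = R + H @ Q @ H.T                 # matrix to be "inverted" in the gain
    # Solve S X = H Q rather than forming S^-1 explicitly.
    X = np.linalg.solve(S, H @ Q)       # X = S^-1 H Q
    K = X.T                             # K = Q H^T S^-1 (Q and S are symmetric)
    s_post = s0 + K @ (z - H @ s0)      # a posteriori fluxes
    Q_post = Q - K @ H @ Q              # a posteriori covariance
    return s_post, Q_post

# Toy problem: two regions, three observations (all numbers are made up).
H = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
s_true = np.array([-0.5, 0.2])
s0 = np.zeros(2)
Q = np.eye(2)
R = 0.01 * np.eye(3)
z = H @ s_true                          # noise-free synthetic observations
s_post, Q_post = bayesian_update(s0, Q, H, R, z)
```

In this toy case the a posteriori fluxes move from the zero prior toward the synthetic truth, and the diagonal of `Q_post` is smaller than that of `Q`, which is the uncertainty reduction discussed later in the paper.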
 In a batch mode inversion, all data and all inferred sources are handled simultaneously. Hence, the response functions (each element of H) must be stabilized by the transport process during the target period. For this reason, in the TC3 interannual variability (IAV) experiment [Baker et al., 2006], the response functions were calculated over a 3 year period by each transport model, and after the 3 years the subsequent signals were approximated by using exponential decay to represent the fully mixed state. This calculation of H creates more computations when estimating fluxes at finer resolution and for a longer period; therefore, to reduce the computational burden, the TC3 IAV experiments used meteorological data from a single year to derive the carbon fluxes for 1998–2003 [Baker et al., 2006].
In the lagged form of the Kalman smoother with lag length L, equation (1) at the current time step t can be expressed as

$$\mathbf{z}_t = \mathbf{H}_u\mathbf{s}_u + \mathbf{H}_v\mathbf{s}_v + \boldsymbol{\nu} \tag{6}$$

where u represents on-line state variables (still being estimated) from t to t − L + 1, and v represents off-line state variables (no longer being updated) from t − L to the first time step. In the same way, equations (4) and (5) can be rewritten in terms of on-line variables u and off-line variables v. In this study, before the next time step was calculated, dropped response functions (in this case, the CO2 concentration) no longer being used for the inversion were multiplied by the final estimate of $\mathbf{s}_v$ at t − L + 1 and then added to the forward model prediction, that is,

$$C_t^{\mathrm{new}} = C_t + \sum_{i=1}^{n_{\mathrm{reg}}} R_{i,t}\, s_{i,\,t-L+1} \tag{7}$$

where $C_t$ is the model prediction at time t, which is used to find the model-data mismatch z in equation (6); $n_{\mathrm{reg}}$ is the number of regions for which fluxes are to be estimated, that is, the dimension of vector s; and R (here distinct from the error covariance matrix R) represents the three-dimensional CO2 concentrations in the response functions at time t, corresponding to the final estimate for the source-sink vector, $\mathbf{s}_{t-L+1}$. Thus, $C_t^{\mathrm{new}}$ includes the newly dropped variables and is used as the new model prediction for the next time step. Accordingly, off-line variables are transported as forward model predictions throughout the inversion period.
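The bookkeeping for a flux month that leaves the lag window can be illustrated as follows. This is a hedged sketch: the function name `fold_dropped_pulse` is hypothetical, and for brevity the response functions are assumed to be already sampled at the observation locations rather than held as three-dimensional concentration fields.

```python
import numpy as np

def fold_dropped_pulse(c_pred, responses, s_final):
    """Add the contribution of a flux month that leaves the lag window.

    c_pred:    forward-model prediction at the observations, shape (n_obs,)
    responses: response at each observation to a unit pulse from each region
               in the dropped month, shape (n_reg, n_obs)
    s_final:   final flux estimates for the dropped month, shape (n_reg,)
    """
    # Sum over regions of (final flux) x (unit-pulse response).
    return c_pred + s_final @ responses

c_pred = np.array([380.0, 381.0])                 # ppm, made-up values
responses = np.array([[0.5, 0.1], [0.2, 0.4]])    # ppm per unit flux, made up
s_final = np.array([-1.0, 2.0])                   # made-up final estimates
c_new = fold_dropped_pulse(c_pred, responses, s_final)
```

After this step the dropped month no longer appears in the state vector; its effect is carried only through the updated prediction `c_new`.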
 Less computation is needed to calculate the elements of the observation matrix H in the fixed-lag Kalman smoother than in a batch inversion. Also, the sizes of the matrices to be solved are significantly smaller in the Kalman smoother method. Bruhwiler et al.  showed that most of the CO2 signal from each basis function (i.e., source region) is apparent at a sampling site within the first 4–6 months, in contrast to the 3 year run needed in the TC3 IAV experiments. In this study, we set the lag-length L to 4 months because our focus was on Siberia. Note that although the inversion period extended from January 2000 to December 2009, the fluxes for the last 3 months are still in the “on-line” state because we set the lag length to 4 months.
We also tested a truncated singular value decomposition (t-SVD) method as another regularization technique to stabilize the linear regression. It is valuable to quantify the robustness of the estimated fluxes to the parameters of the inversion setup, such as the prescribed uncertainties. Truncated SVD has been proposed by Brown [1995] and Fan et al. [1999] as a regularization technique for reducing the effects of remote fluxes on local flux estimates. This method approximates matrix A as

$$\mathbf{A} \approx \mathbf{A}_k = \mathbf{U}\boldsymbol{\Sigma}_k\mathbf{V}^{T} \tag{8}$$

where U and V are orthogonal matrices, $\boldsymbol{\Sigma}_k$ is a diagonal matrix whose elements are the singular values $\sigma_1$ to $\sigma_k$ of matrix A, and k is the truncation parameter or “rank.” This approximation eliminates small singular value elements that are sensitive to measurement error [e.g., Brown, 1995; Fan et al., 1999]. In our case, the target was the estimation of monthly carbon fluxes, whereas the Siberian tower observations are continuous and expected to be highly variable. Thus, we expected the t-SVD method to reduce high-frequency noise in the observed CO2 concentrations, which cannot be reproduced by the model. In this method, when the a posteriori covariance matrix Q′ (equation (5)) is calculated, the variances for ranks larger than k are replaced by the a priori variances because t-SVD estimates variances only for ranks 1 to k. As a result, the a posteriori uncertainty tended to be larger than that of the full-rank inversion.
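The rank-k approximation itself is straightforward to write down. The sketch below uses `numpy.linalg.svd` on a random matrix; the function name `truncated_svd` and the test matrix are illustrative assumptions and say nothing about which matrix the authors decomposed.

```python
import numpy as np

def truncated_svd(A, k):
    """Rank-k approximation of A that keeps the k largest singular values."""
    U, sigma, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(sigma[:k]) @ Vt[:k, :]

rng = np.random.default_rng(0)
A = rng.normal(size=(6, 4))          # made-up matrix
A2 = truncated_svd(A, 2)             # discard the two smallest singular values
sigma_all = np.linalg.svd(A, compute_uv=False)
```

The spectral-norm error of the truncation equals the first discarded singular value, which is why dropping small singular values removes the components most sensitive to measurement error at a limited cost in fit.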
We deduced monthly fluxes for 68 subcontinental regions (46 regions on land and 22 in the ocean) over the globe (Figure 1). The 68 regions are based on the 64 regions defined by Patra et al., who subdivided the 22 original regions of the TC3 protocol [Gurney et al., 2003]. To utilize the dense Siberian tower network, we further subdivided the two west Siberian regions [Patra et al., 2005] into six smaller regions according to the distributions of 17 land cover classes in the International Geosphere-Biosphere Program-Data and Information System's (IGBP-DIS) DISCover land cover classification system [Loveland et al., 2009] (Figure 1b).
The response functions corresponding to unit monthly pulse emissions from each of the 68 regions were calculated by NIES-TM and sampled at the observed locations and times. Thus, each element of the observation operator H in equation (5) was calculated. Then, the CO2 concentrations simulated using the a priori flux data set were used to calculate the differences between the model predictions and the observations, as the model-data mismatch z in equation (4).
2.3 A Priori CO2 Flux Data Set
 The a priori flux data set for CO2 forward simulations was composed of four subdata sets: (1) fossil fuel CO2 emissions from the Open Source Data Inventory of Anthropogenic CO2 (ODIAC) [Oda and Maksyutov, 2011]; (2) daily net ecosystem exchange (NEE) from a process-based model, the Vegetation Integrative Simulator for Trace gases (VISIT) [Ito, 2010; Saito et al., 2011a; M. Saito, manuscript in preparation, 2012]; (3) monthly biomass-burning CO2 emissions from the Global Fire Emissions Database (GFED) version 3.1 [van der Werf et al., 2010]; and (4) monthly air-sea CO2 fluxes produced by an ocean pCO2 data assimilation system using the Offline Ocean Tracer Transport Model (OTTM) [Valsala et al., 2008; Valsala and Maksyutov, 2010]. All of these flux data sets have interannual variabilities and covered the period from 2000 to 2009 at 1° × 1° spatial resolution for the NIES-TM input. These a priori flux data sets have also been used for atmospheric inversions using satellite-observed CO2 column abundance obtained from short-wavelength infrared spectra from the Greenhouse Gases Observing Satellite (GOSAT) [Takagi et al., 2011]. Note that emissions due to anthropogenic land-use changes (deforestation, logging, etc.) were not explicitly included in this a priori flux data set; thus, they were included implicitly in the inverted land fluxes.
 For land regions, a unit flux of 1 GtC yr−1 for each region was spatially distributed based on the absolute values of VISIT NEE averaged over 2000–2009 under the assumption, also made in the TC3 experiment, that terrestrial fluxes reflect the biological activity in each region [Gurney et al., 2003]. The spatial distribution of the oceanic basis functions was uniform, but the distribution in time varied in the northern and southernmost regions because of winter sea-ice cover, as in the TC3 experiments.
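One way to build such a land basis function is to normalize the absolute mean NEE so that the region totals one unit. The sketch below is an assumption-laden illustration: the function name `unit_flux_pattern`, the area weighting, and all numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

def unit_flux_pattern(mean_nee, cell_area, total=1.0):
    """Distribute `total` (e.g., 1 GtC/yr) over grid cells in proportion to
    |mean NEE| x cell area (the area weighting is an assumption)."""
    weight = np.abs(mean_nee) * cell_area
    return total * weight / weight.sum()

mean_nee = np.array([-2.0, 1.0, -1.0])   # made-up 2000-2009 mean NEE per unit area
cell_area = np.array([1.0, 1.0, 2.0])    # made-up relative cell areas
pattern = unit_flux_pattern(mean_nee, cell_area)
```

Because absolute values are used, cells that are net sources and cells that are net sinks both receive a share of the unit flux, so the pattern reflects the magnitude of biological activity rather than its sign.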
The a priori flux uncertainty for land regions was determined from the standard deviation of the monthly mean VISIT NEE averaged from 1979 to 2009; that is, the simple standard deviation of all 30 years times 12 months of NEE was used. The oceanic a priori flux uncertainty was prescribed as the sum of the standard deviations of the oceanic flux assimilated by OTTM averaged over 2001–2009 and the residual mismatches between the OTTM flux and the climatological CO2 flux maps of Takahashi et al., with a minimum uncertainty of 0.02 gC m−2 day−1 [Valsala and Maksyutov, 2010]. In this study, off-diagonal elements of the error covariance matrix Q of a priori fluxes in equation (2) were set to zero under the assumption that the a priori errors in different regions were uncorrelated.
2.4 Atmospheric Observations
We used two observational data sets: a worldwide network that mainly observes background CO2 concentrations and the Siberian network. For the background CO2 network, we used surface flask atmospheric CO2 sampling data from 58 terrestrial sites and one ship cruise of the Cooperative Air Sampling Network, coordinated by the Global Monitoring Division of the Earth System Research Laboratory, National Oceanic and Atmospheric Administration (NOAA) (hereafter called NOAA data) [Conway et al., 2011]. This data set covers a large area of the world (Figure 1) and is well maintained, with rigorous quality control of the data. Because the Cooperative Air Sampling Network is a key network for observing greenhouse gases, it has been used in many CO2 inversion studies [e.g., Chevallier et al., 2010; Bruhwiler et al., 2011]. Conway et al. describe the measurement method in detail. In this study, we used the NOAA flask event data directly in the inversions, that is, without processing the data (we did not use statistical monthly means or filled values as in the GLOBALVIEW data set [GLOBALVIEW-CO2, 2011]). Data flagged by NOAA as having quality control problems were rejected; then, the remaining event data were averaged if appropriate data existed at the same location and time.
The Siberian sites, operated by CGER/NIES and RAS, consist of nine tower sites (JR-STATION) and four aircraft sites (Tables 1 and 2; see Figure 1 for their locations). Here we briefly describe the methods of aircraft sampling and tower observation; details are available in the cited references. Routine air sampling has been conducted once per month with a chartered aircraft over a wetland near Surgut (61°N, 73°E) since 1993, over a mixed forested/cultivated area near Novosibirsk (55°N, 83°E) since 1997, and over a forested area near Yakutsk (62°N, 130°E) since 1996 [Machida et al., 2001]. For sampling, a diaphragm pump draws air from outside the aircraft into pressurized Pyrex glass flasks. The CO2 mixing ratios of the SUR air samples were measured with a nondispersive infrared analyzer (NDIR, type-VIA, Horiba, Japan) at Tohoku University until 2004; the SUR samples collected after 2005 and the YAK and NOV samples were analyzed with an NDIR at NIES (LI-6252, LI-COR, Lincoln, NE, USA). At the fourth aircraft site, we used continuous measurements from a small aircraft (Antonov An-2) over the BRZ tower (M. Sasakawa, manuscript in preparation, 2012). A small CO2 measurement device based on an NDIR (LI-800, LI-COR) equipped with a flow and pressure regulation system was developed and installed in the aircraft. The An-2 ascended to 2 km (in winter) or 3 km (in summer) above the tower, then descended to 0.15 km to obtain a vertical profile of CO2 concentration. Routine aircraft measurements at BRZ were generally conducted on sunny days in the afternoon 2–4 times per month until March 2007, and less often after that. For the inversion calculation, we used data averaged over 500 m height intervals from the surface on each day. We used the daytime mean data (13:00–17:00 LT) from JR-STATION (Table 2) [Sasakawa et al., 2010]. Atmospheric air was sampled at two levels on eight towers and at four levels on the BRZ tower. Sampled air was dried and then introduced into an NDIR (LI-820, LI-COR). The CO2 observation data were calibrated against the NIES 09 CO2 scale, which is lower than the WMO X2007 CO2 scale by 0.07 ppm at around 360 ppm and consistent with it in the range between 380 and 400 ppm [Machida et al., 2011]. The Siberian data are available at the CGER/NIES Global Environmental Database website http://db.cger.nies.go.jp/ged/data/siberia/.
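The 500 m averaging of the BRZ aircraft profiles can be pictured with a small binning routine. This is only a sketch: the function name `bin_profile`, the bin edges starting at 0 m, and the sample values are assumptions made for illustration.

```python
from collections import defaultdict

def bin_profile(heights_m, co2_ppm, width_m=500.0):
    """Average CO2 samples in consecutive height bins starting at the surface."""
    bins = defaultdict(list)
    for h, c in zip(heights_m, co2_ppm):
        bins[int(h // width_m)].append(c)        # index of the bin containing h
    # Map (lower edge, upper edge) in metres to the mean concentration.
    return {(i * width_m, (i + 1) * width_m): sum(v) / len(v)
            for i, v in sorted(bins.items())}

# Made-up samples from one flight: heights in metres, CO2 in ppm.
profile = bin_profile([150, 400, 700, 1200, 1400],
                      [390.0, 388.0, 386.0, 384.0, 382.0])
```

Each flight then contributes at most one value per 500 m layer, which keeps densely sampled profiles from dominating the monthly flux estimate.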
Table 2. Tower Network Sites in Siberia (JR-STATION). Columns: sampling height (m); data period used for the inversion.
The tower network is in fact suited to regional-scale inversions with a grid-based setup such as those of Carouge et al. and Lauvaux et al.; thus, our large-region inversion setup may not use the observations to full advantage. However, the size of the regions in west Siberia (~800 km) is close to the spatial correlation length adopted in many global inversion studies performed at grid-size resolution [e.g., Chevallier et al., 2010] and to the size of the regions in Feng et al., which was selected by a resolution optimization approach as in Carouge et al.
2.5 Data Uncertainty
As described by Gurney et al. [2002, 2003], the data-mismatch error covariance matrix R (equation (2)) is a weighting term to determine the degree to which the concentrations predicted by the model match the observed data; hence, in addition to measurement precision, the weighting term should take into account the uncertainty associated with the model itself, such as imperfect transport, the coarse resolution of the model's spatial and temporal grid, a priori flux resolution, and aggregation errors [e.g., Kaminski et al., 2001]. Here, we assumed that R was a diagonal matrix whose diagonal elements were the averaged residual standard deviation (RSD) of the measurements about smoothed curves at each site. The RSDs at each NOAA site were taken from the “wts” file of the corresponding site in the GLOBALVIEW-CO2 data set with a minimum data uncertainty of 0.25 ppmv, following the TC3 experiments [Gurney et al., 2002, 2003]. For the Siberian network sites, where the variations in the observed data were expected to be larger than those at the background sites, average monthly RSDs were determined by using a digital filtering technique [Nakazawa et al., 1997b] with averaged seasonal cycles expressed as the sum of the Fourier harmonics with periods of 12, 6, and 4 months. All RSDs were further scaled by a factor of 1.5 so that the chi-square value for the inversion with only NOAA data would be nearly equal to unity. Here chi-square was calculated as
after Patra et al., where T, N, and M are the number of time intervals, observation stations, and source regions, respectively. Observations made at neighboring sites or vertical aircraft observations at multiple levels might be correlated with each other; although nondiagonal elements should therefore be added to the error covariance matrix for these sites for a more consistent treatment of correlated observations [Tarantola, 1987], here we assumed R to be a diagonal matrix. Because our sites are well separated, we did not include correlations between sites in the data uncertainty. The vertical profiles also show vertical structure, so the correlation between levels was neglected as well.
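To make the idea of a residual standard deviation about a smooth seasonal curve concrete, the sketch below fits a linear trend plus Fourier harmonics with periods of 12, 6, and 4 months by ordinary least squares. Note the substitution: this simple fit stands in for the digital filtering technique of Nakazawa et al. [1997b], which is not reproduced here; the function name `harmonic_rsd` and the synthetic series are hypothetical.

```python
import numpy as np

def harmonic_rsd(t_months, co2, periods=(12.0, 6.0, 4.0)):
    """Residual standard deviation about a linear trend plus Fourier harmonics."""
    t = np.asarray(t_months, dtype=float)
    columns = [np.ones_like(t), t]                  # offset and linear trend
    for p in periods:                               # one sine/cosine pair per period
        columns += [np.sin(2 * np.pi * t / p), np.cos(2 * np.pi * t / p)]
    G = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(G, co2, rcond=None)  # least-squares fit
    residual = co2 - G @ coef
    return float(np.sqrt(np.mean(residual ** 2)))

t = np.arange(120.0)                                # 10 years of monthly values
smooth = 380.0 + 0.17 * t + 8.0 * np.cos(2 * np.pi * t / 12.0)   # made-up curve
scatter = np.where(np.arange(120) % 2 == 0, 1.0, -1.0)           # made-up +/-1 ppm
rsd = harmonic_rsd(t, smooth + scatter)
```

With the alternating ±1 ppm scatter, the recovered RSD is close to 1 ppm, while the smooth curve alone gives a residual near zero.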
During the inversion, observations with a model-data mismatch of more than 15.0 ppmv were given a large data uncertainty ($10^{4}$) to reduce their weight. Such a large model-data mismatch might be caused by the model's inability to reproduce the measurements or by outlier data. We also tried an 8 ppm filter, but in that case many tower observations were excluded, so we adopted the 15 ppm filter in this study.
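The data-uncertainty rules stated in this section (a 0.25 ppm floor, scaling by 1.5, and de-weighting of mismatches beyond 15 ppm) can be combined in one helper. This is a sketch under assumptions: the function name `data_uncertainty` is hypothetical, the floor is assumed to be applied before the scaling, and the inflated value is taken to be 10^4.

```python
import numpy as np

def data_uncertainty(rsd, mismatch, floor=0.25, scale=1.5,
                     threshold=15.0, inflated=1.0e4):
    """Per-observation data uncertainty (ppm).

    rsd:      residual standard deviation assigned to each observation
    mismatch: model-data mismatch of each observation
    """
    sigma = scale * np.maximum(rsd, floor)          # floor first, then scale (assumed order)
    # Observations with a very large mismatch get an inflated uncertainty.
    return np.where(np.abs(mismatch) > threshold, inflated, sigma)

rsd = np.array([0.1, 2.0, 3.0])                     # made-up RSDs
mismatch = np.array([0.5, -4.0, 20.0])              # made-up mismatches
sigma = data_uncertainty(rsd, mismatch)
```

Because the weight of an observation scales with the inverse variance, an uncertainty of 10^4 effectively removes the observation without changing the dimensions of the matrices.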
 We conducted four inversion experiments with three data sets and two methods of regularization:
Case 1 used only NOAA flask data.
Case 2 used NOAA data and the monthly Siberian aircraft observations over three sites, SUR, YAK, and NOV.
Case 3 used all data (i.e., case 2 + BRZ aircraft data + network data from all nine towers).
Case 4 used all data, as in case 3, and was solved by the truncated SVD method (described in section 'Inversion Method') to reduce noise in the estimated fluxes under the same conditions as for case 3 but using a rank of 37, corresponding to a singular value of 2, instead of the full rank of 272 (68 regions × 4 months). The rank 37 corresponds to cumulative squared covariance fraction of about 90%, which is enough to represent the overall signals.
 Cases 1 to 3 were solved by the basic Bayesian inversion method using the full rank.
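Choosing the truncation rank from a cumulative squared fraction, as described for case 4, can be sketched as below. The function name `rank_for_fraction` and the singular-value spectrum are made up for illustration; only the full rank of 272 (68 regions × 4 months) comes from the text.

```python
import numpy as np

def rank_for_fraction(singular_values, fraction):
    """Smallest k whose k leading singular values hold `fraction` of sum(sigma^2)."""
    energy = np.cumsum(np.square(singular_values))
    cumulative = energy / energy[-1]                 # cumulative squared fraction
    return int(np.searchsorted(cumulative, fraction) + 1)

full_rank = 68 * 4                       # regions x lag months
sigma = np.array([4.0, 2.0, 1.0, 0.5])   # made-up spectrum, sorted descending
k = rank_for_fraction(sigma, 0.90)
```

For the made-up spectrum, the first two singular values already carry more than 90% of the squared sum, so the routine returns a rank of 2.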
The model was initialized on 1 January 2000 with the zonal average CO2 concentration based on the 3-D CO2 climatology, called the Gap-filled and Ensemble Climatology Mean [Saito et al., 2011b], which had been produced in the framework of the TransCom satellite experiment. We treated the global offset as one unknown parameter in the first inversion. After the first 4 month run, the estimated global offset value was treated as a known value and used to correct the initial field. Then, the inversion was recalculated by using the initial concentration field plus the estimated global offset with an uncertainty of 0.01, which was kept constant during the subsequent inversion process.
3 Results
 The results presented here are the first carbon flux estimations obtained for 2000 to 2009 by inverse modeling using the dense Siberian observational network plus NOAA data. We describe the results mainly over Siberia, where we expected the network to provide additional constraints.
3.1 Global Carbon Fluxes and Their Spatial Distributions
 The estimated (a posteriori) results for annual carbon flux for 2000–2009 after the case 3 inversion (NOAA and all Siberian data) (Figure 2) showed that the land regions were generally sinks, except for Central Asia, tropical America, and a part of North America. The estimated ocean fluxes showed the same tendencies as the a priori OTTM fluxes; that is, the North Pacific Ocean, the equatorial ocean, and the Southern Ocean were CO2 sources, and other oceanic regions were sinks. The eight small regions of boreal Eurasia were mainly annual net sinks. An especially strong sink was present in the central Siberian Highland (region 30), the southern part of which is broadly covered by mixed forests and deciduous needle-leaf forests according to IGBP-DIS classification (Figure 1b). Both the a priori flux of VISIT (Figure 2, top) and the estimated flux showed a weak annual net source over the mountainous areas east of the Lena River.
 We compared the 2000–2009 mean natural carbon fluxes of the a priori data set with the inversion results of the four cases for three aggregated regions: global, global land, and global ocean (Table 3). In this table, biomass-burning emissions, which averaged 1.7 GtC yr−1 during the inversion period, are included, but fossil fuel emissions, which increased from 6.7 to 8.1 GtC yr−1 from 2000 to 2009, are excluded. We calculated the uncertainty of each aggregated large region as

$$\sigma_{\mathrm{agg}} = \sqrt{\sum_{i=1}^{n} \sum_{j=1}^{n} \sigma_{ij}^{2}}$$

where $\sigma_{ij}^{2}$ is the covariance of small regions i and j, and n is the number of subcontinental regions in the aggregated region (in this case, global, global land, or global ocean).
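The aggregation rule described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the aggregated uncertainty is the square root of the sum of all elements of the posterior covariance matrix of the member regions; the function name `aggregated_uncertainty` and the 3 × 3 matrix are hypothetical and are not values from the inversion.

```python
import math


def aggregated_uncertainty(cov):
    """Return the uncertainty of an aggregated region.

    cov is an n x n covariance matrix (list of lists) whose element
    cov[i][j] is the covariance of small regions i and j.
    """
    n = len(cov)
    if any(len(row) != n for row in cov):
        raise ValueError("covariance matrix must be square")
    # Sum every element, so off-diagonal (cross-region) terms are included.
    total = sum(cov[i][j] for i in range(n) for j in range(n))
    return math.sqrt(total)


# Hypothetical 3-region covariance matrix in (GtC/yr)^2, for illustration only.
cov_example = [
    [0.25, -0.05, 0.00],
    [-0.05, 0.16, -0.02],
    [0.00, -0.02, 0.09],
]

sigma_agg = aggregated_uncertainty(cov_example)

# For comparison: the value obtained if cross-region covariances are ignored.
sigma_diag_only = math.sqrt(sum(cov_example[i][i] for i in range(3)))

print(f"aggregated uncertainty: {sigma_agg:.3f} GtC/yr")
print(f"diagonal terms only:    {sigma_diag_only:.3f} GtC/yr")
```

In this made-up example the negative covariances between neighboring regions make the aggregated uncertainty smaller than the value obtained from the diagonal terms alone, which is why an aggregated region can be better constrained than its parts.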
Table 3. A Priori and Estimated Fluxes and Their Uncertainties [GtC yr−1 Region−1] Globally and for Boreal Eurasia Averaged over 2000–2009 (a)

| Region | A Priori Flux | Estimated Flux: NOAA Data Only (case 1) | Estimated Flux: NOAA Data + 3 Siberian Aircraft Site Data (case 2) | Estimated Flux: NOAA Data + All Siberian Data (case 3) | Estimated Flux: NOAA + All Siberian Data Solved by t-SVD (case 4) |
| --- | --- | --- | --- | --- | --- |
| Global | −1.80 ± 5.28 | −3.50 ± 3.26 | −3.50 ± 3.21 | −3.51 ± 3.18 | −3.51 ± 4.00 |
| Global land | −0.39 ± 5.04 | −1.95 ± 3.08 | −2.00 ± 3.03 | −1.90 ± 3.00 | −1.48 ± 3.78 |
| Global ocean | −1.41 ± 1.58 | −1.55 ± 1.06 | −1.51 ± 1.06 | −1.61 ± 1.06 | −2.03 ± 1.31 |

NOAA, National Oceanic and Atmospheric Administration.
(a) Biomass-burning emissions are included in the land fluxes, but fossil fuel emissions are not.
 The inverted global fluxes were −3.50 ± 3.26, −3.50 ± 3.21, −3.51 ± 3.18, and −3.51 ± 4.00 GtC yr−1 for cases 1 to 4, respectively, compared with the global a priori flux of −1.80 ± 5.28 GtC yr−1. The estimated total global fluxes were thus almost identical: a carbon sink of about −3.50 GtC yr−1. For cases 1–3, the estimated uncertainty ranged from 3.18 GtC yr−1 (case 3) to 3.26 GtC yr−1 (case 1), a reduction of about 40% compared with the a priori uncertainty of 5.28 GtC yr−1. For case 4, the estimated uncertainty of 4.00 GtC yr−1 is larger than those of cases 1–3 because we used only the 37 largest singular values and assigned the a priori uncertainty to the elements after the 37th, as described in section 'Inversion Method'. The land flux, including biomass-burning emissions, estimated by the four inversions ranged from −1.48 to −2.00 GtC yr−1 (42–57% of the global total), and the estimated ocean fluxes ranged from −1.51 to −2.03 GtC yr−1. The small difference in ocean uptake between case 1 (−1.55 GtC yr−1) and case 3 (−1.61 GtC yr−1) is due to the difference in estimated North Pacific uptakes (−0.27 and −0.33 GtC yr−1 for cases 1 and 3, respectively). In case 4, the estimated ocean uptake of −2.03 GtC yr−1 is larger than the land uptake of −1.48 GtC yr−1, which is opposite to the land-ocean partitioning in the inversions of cases 1–3. About a quarter of the difference in ocean uptake between cases 3 and 4 comes from the larger Southern Ocean uptake in the case 4 inversion (−0.63 ± 0.44 GtC yr−1) than in the case 3 inversion (−0.52 ± 0.28 GtC yr−1). We compare these results with previous studies in section 'Discussion'.
 To examine the role of the Siberian data in the estimated regional fluxes, we compared cases 1 and 3. Figure 3 shows their estimated fluxes and differences in January and July 2008, the year when most of the Siberian sites were in operation. Flux differences between cases 1 and 3 appear over a broad area of the mid-latitudes and high latitudes of the Northern Hemisphere (northeastern Europe, North America, temperate Eurasia, and boreal Eurasia). Generally, in northeastern Europe (region 46) and southern Siberia (regions 25, 26, 27, and 31), the difference between cases 3 and 1 is positive in January and negative in July, which means that inclusion of the Siberian data gives the estimated fluxes a larger seasonal amplitude than those estimated with NOAA data only. Region 30 shows a slight sink in January, which might be caused by transport error in winter, when a strong inversion layer is formed. An exception is northeastern Siberia (region 32), where the YAK tower and aircraft sites are located. There, the case 3 inversion fluxes were positive relative to the case 1 fluxes in both January and July. We discuss the seasonal patterns of the estimated fluxes in the next section (3.2). Although in TC3 experiments Siberia is typically treated as a single large region (“boreal Eurasia”), by dividing Siberia into subregions and using Siberian network data, we obtained temporally and spatially heterogeneous estimated fluxes.
 Next we compared the reduction in estimated uncertainty between cases 1 and 3. The uncertainty reduction rate (UR), expressed as a percentage, is defined as

$$\mathrm{UR} = \left(1 - \frac{\sigma_{\mathrm{All}}}{\sigma_{\mathrm{NOAA}}}\right) \times 100$$

where σAll and σNOAA denote the estimated uncertainty in cases 3 and 1, respectively. The maximum and mean reductions of the estimated uncertainty for each region over all months from January 2000 to September 2009 are shown in Figure 4. As expected, the reduction rate was pronounced in boreal Eurasia, where no NOAA site exists, in part because of the higher a priori flux uncertainty in this region that accompanied NEE variations in the Siberian forest. The maximum reduction of around 80% was seen in eastern Siberia and a part of western Siberia. The reduction in eastern Siberia became particularly pronounced after the YAK tall tower began operation. The Siberian network also reduced the uncertainty in northeastern Europe, upwind of the Siberian network. A reduction in region 35 (Central Asia) appeared mainly in the winter at the end of 2004, when observations became available from multiple towers. From January 2000 to September 2009, the mean uncertainty reduction in boreal Eurasia and northeastern Europe (case 3 versus case 1) was 22% (Figure 4). The spatial pattern of the mean reduction was almost the same as that of the maximum reduction; that is, inclusion of Siberian data reduced the uncertainty, particularly in northeastern Europe and boreal Eurasia. The mean uncertainty reduction for 2008–2009, when most of the Siberian sites were in operation, was about 40–60% in most of boreal Eurasia (Figure 4 bottom).
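As a small worked example of the uncertainty reduction rate, the sketch below assumes the usual form UR = (1 − σAll/σNOAA) × 100 and applies it to the decadal-mean aggregated boreal Eurasian uncertainties of cases 1 and 3 reported in this study (0.79 and 0.61 GtC yr−1). The function name `uncertainty_reduction` is hypothetical, and the result is only indicative: the 22% quoted in the text is a mean over monthly regional values, not this single ratio.

```python
def uncertainty_reduction(sigma_all, sigma_noaa):
    """Uncertainty reduction rate in percent (assumed form, see lead-in).

    sigma_all  : posterior uncertainty with NOAA plus Siberian data (case 3)
    sigma_noaa : posterior uncertainty with NOAA data only (case 1)
    """
    if sigma_noaa <= 0:
        raise ValueError("reference uncertainty must be positive")
    return (1.0 - sigma_all / sigma_noaa) * 100.0


# Aggregated boreal Eurasia, 2000-2009 mean uncertainties in GtC/yr.
ur_boreal_eurasia = uncertainty_reduction(sigma_all=0.61, sigma_noaa=0.79)
print(f"UR for aggregated boreal Eurasia: {ur_boreal_eurasia:.1f}%")
```

A positive UR means the additional data tightened the estimate; a value of zero means the Siberian data added no constraint for that region and month.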
 Until the end of 2001, when continuous airborne measurements were started over BRZ, the monthly flights over SUR, YAK, and NOV at several altitudes provided unique data for Siberia. Comparing case 2 with case 1, we found a mean uncertainty reduction of 8.7% for boreal Eurasia and northeastern Europe. The Siberian aircraft data effectively reduced the uncertainty in eastern Siberia (where airborne observations were made over YAK) by up to about 40%, and by up to about 20% in northeastern Europe and western Siberia (data not shown). The uncertainty reduction in southwestern temperate Asia (region 33), which reached 40%, was due to Siberian aircraft data collected in the boreal winter. Although these flights took place only about once a month, the monthly aircraft data reduced the uncertainty over northern Eurasia and northeastern Europe by about 10–20% and over regions adjacent to YAK by about 40%, especially in summer.
3.2 Boreal Eurasian Fluxes
 Here we focus on the fluxes estimated for boreal Eurasia, our target region (Table 4, Figure 5). The aggregated boreal Eurasian a priori flux averaged from 2000 to 2009 was −0.05 ± 1.39 GtC yr−1 region−1, and the averaged fluxes from cases 1–4 were −0.56 ± 0.79, −0.52 ± 0.69, −0.35 ± 0.61, and −0.35 ± 0.87 GtC yr−1 region−1, respectively, including an average biomass-burning emission of 0.11 GtC yr−1. Thus, a smaller uptake of carbon was estimated in case 3 (NOAA plus Siberian data) than in case 1 (NOAA data only). Moreover, the Siberian data helped to reduce uncertainties in the estimated fluxes over boreal Eurasia and northeastern Europe, as described in section 'Global Carbon Fluxes and Their Spatial Distributions'. Fluxes estimated by inverse modeling for regions remote from observations are known to be underconstrained [e.g., Gurney et al., 2004], and thus the relatively close match in the estimated fluxes between cases 1 and 2 could be accidental. Although the difference in the estimated fluxes between cases 1 and 3 is smaller than the posterior uncertainty of the latter case, and the change in the posterior uncertainties between the cases is not drastic, the case 3 value is more reliable because it takes into account the local Siberian observations. In this sense, the case 3 inversion provides information that the case 1 inversion cannot, by tuning the flux to the local Siberian observations. The two different regularization methods, full-rank inversion and t-SVD (cases 3 and 4, respectively), both resulted in a net sink of 0.35 GtC yr−1, although the estimated uncertainty differed. The 0.35 GtC yr−1 sink for boreal Eurasia accounts for 18.4% and 23.6% of the global land sink in case 3 (1.90 GtC yr−1) and case 4 (1.48 GtC yr−1), respectively.
Table 4. The A Priori and Estimated Fluxes and Their Uncertainties for Aggregated Boreal Eurasia and for Eight Small Boreal Eurasian Regions and Eastern Europe in [GtC yr−1 Region−1] and [gC m−2 yr−1] (a)

| Region | Area [m2] | Unit | A Priori Fluxes | NOAA Data Only | NOAA Data + 3 Siberian Aircraft Site Data | NOAA Data + All Siberian Data | NOAA + All Siberian Data Solved by t-SVD |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Aggregated boreal Eurasia (averaged over 2000–2009) | 1.33E+13 | GtC yr−1 region−1 | −0.05 ± 1.39 | −0.56 ± 0.79 | −0.52 ± 0.69 | −0.35 ± 0.61 | −0.35 ± 0.87 |
| | | gC m−2 yr−1 | −3.76 ± 104.51 | −42.11 ± 59.40 | −39.10 ± 51.88 | −26.32 ± 45.86 | −26.32 ± 73.68 |
| 8 small regions in boreal Eurasia (regions 25–32) | | | | | | | |
| Region 25 (southern west Siberia) | 6.41E+11 | GtC yr−1 region−1 | −0.0059 ± 0.052 | −0.0066 ± 0.0023 | −0.0078 ± 0.011 | −0.0055 ± 0.03 | −0.0053 ± 0.0048 |
| | | gC m−2 yr−1 | −9.20 ± 81.12 | −10.30 ± 3.59 | −12.17 ± 17.16 | −8.58 ± 46.80 | −8.27 ± 7.49 |
| Region 26 (west central west Siberia) | 4.74E+11 | GtC yr−1 region−1 | −0.0070 ± 0.15 | −0.015 ± 0.020 | −0.024 ± 0.063 | −0.0042 ± 0.09 | −0.0071 ± 0.046 |
| | | gC m−2 yr−1 | −14.77 ± 316.46 | −31.65 ± 42.19 | −50.63 ± 132.91 | −8.86 ± 189.87 | −14.98 ± 97.05 |
| Region 27 (east central west Siberia) | 4.07E+11 | GtC yr−1 region−1 | −0.010 ± 0.12 | −0.015 ± 0.012 | −0.023 ± 0.070 | −0.0081 ± 0.11 | −0.014 ± 0.048 |
| | | gC m−2 yr−1 | −24.57 ± 294.84 | −36.86 ± 29.48 | −56.51 ± 171.99 | −19.90 ± 270.27 | −34.40 ± 117.94 |
| Region 28 (northern west Siberia) | 7.45E+11 | GtC yr−1 region−1 | −0.00058 ± 0.23 | −0.037 ± 0.072 | −0.064 ± 0.13 | −0.039 ± 0.14 | −0.019 ± 0.11 |
| | | gC m−2 yr−1 | −0.78 ± 308.72 | −49.66 ± 96.64 | −85.91 ± 174.50 | −52.35 ± 187.92 | −25.50 ± 147.65 |
| Region 29 (southern central Siberia) | 1.77E+12 | GtC yr−1 region−1 | −0.024 ± 0.26 | −0.051 ± 0.067 | −0.044 ± 0.12 | −0.026 ± 0.15 | −0.032 ± 0.052 |
| | | gC m−2 yr−1 | −13.56 ± 146.89 | −28.81 ± 37.85 | −24.86 ± 67.80 | −14.69 ± 84.75 | −18.08 ± 29.38 |
| Region 30 (northern central Siberia) | 2.10E+12 | GtC yr−1 region−1 | −0.013 ± 0.58 | −0.18 ± 0.31 | −0.25 ± 0.49 | −0.29 ± 0.49 | −0.10 ± 0.26 |
| | | gC m−2 yr−1 | −6.19 ± 276.19 | −85.71 ± 147.62 | −119.05 ± 233.33 | −138.10 ± 233.33 | −47.62 ± 123.81 |
| Region 31 (southern east Siberia) | | GtC yr−1 region−1 | −0.012 ± 0.99 | −0.13 ± 0.71 | −0.052 ± 0.81 | −0.0073 ± 0.82 | −0.082 ± 0.57 |
| | | gC m−2 yr−1 | −3.59 ± 296.41 | −38.92 ± 212.57 | −15.57 ± 242.51 | −2.19 ± 245.51 | −24.55 ± 170.66 |
| Region 32 (northern east Siberia) | | GtC yr−1 region−1 | 0.024 ± 0.68 | −0.13 ± 0.45 | −0.058 ± 0.46 | 0.028 ± 0.39 | −0.095 ± 0.48 |
| | | gC m−2 yr−1 | 6.27 ± 177.55 | −33.94 ± 117.49 | −15.14 ± 120.10 | 7.31 ± 101.83 | −24.80 ± 125.33 |
| Region 46 (northeastern Europe) | | GtC yr−1 region−1 | −0.022 ± 0.61 | −0.18 ± 0.33 | −0.29 ± 0.61 | −0.19 ± 0.61 | −0.0086 ± 0.37 |
| | | gC m−2 yr−1 | −9.73 ± 269.91 | −79.65 ± 146.02 | −128.32 ± 269.91 | −84.07 ± 269.91 | −3.81 ± 163.72 |

(a) Biomass-burning emissions but not fossil fuel emissions are included in the land fluxes.
 The estimated fluxes of eight small boreal Eurasian regions and northeastern Europe are also listed in Table 4 in units of both GtC yr−1 region−1 and gC m−2 yr−1, and their climatological seasonal cycles are shown in Figure 6. Generally, the taiga of west Siberia (regions 26 and 27) had larger uptakes than croplands further south (region 25) or the northern open shrublands (region 28), and central and eastern Siberia (regions 29–32), which are mainly covered by mixed forest and deciduous needle-leaf forests, had large uptakes in summer.
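The two units used in Table 4 are related by the region area. The sketch below shows the conversion, using 1 GtC = 10^15 gC and the areas and fluxes listed in the table; the function name `flux_density` is a hypothetical helper introduced only for this illustration.

```python
GRAMS_PER_GT = 1.0e15  # 1 GtC = 1e15 gC


def flux_density(flux_gtc_per_yr, area_m2):
    """Convert a regional flux [GtC/yr] to a flux density [gC/m2/yr]."""
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    return flux_gtc_per_yr * GRAMS_PER_GT / area_m2


# Aggregated boreal Eurasia, case 3: -0.35 GtC/yr over 1.33e13 m2.
boreal_case3 = flux_density(-0.35, 1.33e13)

# Region 30 (northern central Siberia), case 3: -0.29 GtC/yr over 2.10e12 m2.
region30_case3 = flux_density(-0.29, 2.10e12)

print(f"boreal Eurasia, case 3: {boreal_case3:.2f} gC m-2 yr-1")
print(f"region 30, case 3:      {region30_case3:.2f} gC m-2 yr-1")
```

Expressing fluxes per unit area makes regions of very different size comparable, which is why region 30 stands out as the strongest sink even though several regions have similar totals.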
 The a priori flux and the estimated fluxes of the four cases in the southern part of west Siberia (region 25), which is covered by croplands and includes the sites Savvushka (SVV), Azovo (AZV), and Vaganovo (VGN) (Figure 1b), were almost the same at around −10 gC m−2 yr−1 because of the small a priori uncertainty, and the seasonal variations were also similar (Figure 6a). In contrast, in central west Siberia (regions 26 and 27; Figures 6b and 6c), characterized by mixed forest, the estimated fluxes of cases 2 and 3 had slightly larger uptakes in summer, from June to August, whereas those of cases 1 and 4 differed little from the a priori fluxes. In region 27, case 3 had a larger release of CO2 in winter than the other three cases, possibly because of high CO2 events observed at the Karasevoe (KRS) and BRZ towers in winter (the results at KRS are discussed in section 'Predicted CO2 Concentration'). In region 28 (Figure 6d), northernmost west Siberia, large uptakes occurred in July for all cases, but the maximum a priori uptake occurred in June.
 A priori and estimated fluxes in central Siberia (region 29), an upwind area with no observation sites, were almost the same in magnitude (−13 to −29 gC m−2 yr−1) and seasonality (Figure 6e), whereas the estimated fluxes in region 30 (Figure 6f) had large uptakes of 1−2 GtC yr−1 in July, and for all four cases the releases from October to April were lower than the a priori flux. As a result, after the inversions, region 30 had the largest uptake (48 to 138 gC m−2 yr−1) among the eight small regions of boreal Eurasia. Also in region 30, cases 2 and 3 had almost the same fluxes in summer, which indicates that aircraft data could constrain the summer fluxes as effectively as tower data in this region.
 The a priori fluxes in regions 31 and 32 (Figures 6g and 6h) in east Siberia showed large uptakes in summer. Among the four cases of this study, case 3 had the smallest uptakes and peak-to-peak amplitudes in both regions, that is, small uptakes in summer and small emissions from January to April. The maximum uptake in region 32 occurred in July in the case 3 inversion but in June in the a priori flux. Dolman et al.  observed net ecosystem exchange by the eddy correlation method above a larch forest at Yakutsk and found that the maximum uptake occurred in June in 2001, although the net carbon exchange was very sensitive to small changes in weather, which can easily turn the land vegetation from a sink into a source. In region 32, the minimum a priori flux occurred in June when biomass burning was included, whereas in the inversion results the fluxes were at a minimum in July, except that the uptake was similar in June and July for case 4. Uptake was large in summer for case 1 (without the Siberian data), especially in region 32. Although the estimated fluxes differed greatly among the four inversions, the annual flux density was almost the same (within 10 gC m−2 yr−1) in regions 31 and 32 for each case.
 The Siberian network data also influenced the estimated fluxes in northeastern Europe, region 46 (Figure 6i), upwind of the network. There were large uptakes of about 2.5 GtC yr−1 in June for cases 2 and 3. Unlike the Siberian results, which had minimum values in July, the a priori flux and all of our cases had minima in June (in region 46).
3.3 Interannual Variations of Estimated Fluxes
 We next examined the estimated annual carbon fluxes from 2000 to 2008 for boreal Eurasia and west, central, and east Siberia, and northeastern Europe for cases 1 and 3 (Figure 7); boreal Eurasia and west, central, and east Siberia are aggregations of regions 25–32, 25–28, 29–30, and 31–32, respectively. Emissions from biomass burning are included in the figure. Note also that the Siberian network changed during the period as new Siberian sites were constructed (Tables 1 and 2). As pointed out by Dargaville et al. , it is preferable to avoid changing the size of an observation network over the period of inversions because introducing new sites might modify the calculated fluxes of surface CO2, and this may be misinterpreted as interannual variability.
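The regional aggregation used here can be illustrated with the 2000–2009 mean fluxes listed in Table 4. The sketch below sums the small-region fluxes of cases 1 and 3 into west, central, and east Siberia and boreal Eurasia; the dictionary and function names are hypothetical, and the sums are decadal means, not the annual values plotted in Figure 7.

```python
# 2000-2009 mean fluxes from Table 4 in GtC/yr: region -> (case 1, case 3).
REGION_FLUXES = {
    25: (-0.0066, -0.0055),
    26: (-0.015, -0.0042),
    27: (-0.015, -0.0081),
    28: (-0.037, -0.039),
    29: (-0.051, -0.026),
    30: (-0.18, -0.29),
    31: (-0.13, -0.0073),
    32: (-0.13, 0.028),
}

# Aggregations described in the text.
GROUPS = {
    "west Siberia": range(25, 29),     # regions 25-28
    "central Siberia": range(29, 31),  # regions 29-30
    "east Siberia": range(31, 33),     # regions 31-32
    "boreal Eurasia": range(25, 33),   # regions 25-32
}


def aggregate(case_index):
    """Sum small-region fluxes per group; 0 selects case 1, 1 selects case 3."""
    return {
        name: sum(REGION_FLUXES[r][case_index] for r in regions)
        for name, regions in GROUPS.items()
    }


case1 = aggregate(0)
case3 = aggregate(1)
for name in GROUPS:
    print(f"{name:16s} case 1: {case1[name]:+.3f}  case 3: {case3[name]:+.3f} GtC/yr")
```

The boreal Eurasian sums reproduce the aggregated values of about −0.56 and −0.35 GtC yr−1 to within rounding, and most of the difference between the two cases comes from east Siberia.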
 The annual fluxes in boreal Eurasia (Figure 7a) were negative or near zero when biomass-burning emissions were included. The interannual variability (all “variability” is interannual in this section) was larger in case 3 than in case 1. In case 3, the uncertainty clearly diminished as Siberian sites were placed in operation through the years. The same tendency, larger variability in the estimated fluxes and smaller uncertainties in case 3, can be seen in west Siberia (Figure 7b), central Siberia (Figure 7c), and northeastern Europe (Figure 7e), which shows that the Siberian data contribute to both the variability and the reduction in uncertainties. The fluxes in east Siberia (Figure 7d) were the most variable, with similar amplitudes of about 0.50 GtC yr−1 but different mean fluxes of −0.23 and +0.03 GtC yr−1 in cases 1 and 3, respectively. In the GFED database used here, fire emissions are 1.25, 1.07, 1.86, 3.33, 0.12, 0.37, 0.57, 0.43, and 1.71 GtC yr−1 from 2000 to 2008, respectively, which are larger in east Siberia than in west or central Siberia. Fire emissions mainly occur in the spring and summer. Positive fluxes in case 3 in east Siberia in 2003, 2004, and 2008 may result from biomass burning in the a priori fluxes (from the GFED database), which are larger than the uptake of CO2 by the land biosphere. Indeed, smoke from large forest fires in east Siberia was transported downwind to western North America and influenced air quality there in summer 2003 [Jaffe et al., 2004]. The Siberian network (mainly the YAK sites in east Siberia) captured these large biomass-burning fires well, and their influence was seen in the fluxes. Fluxes in case 3 ranged from −0.54 to 0.13 GtC yr−1 in northeastern Europe (Figure 7e), whereas all case 1 results were negative with small variability. The impact of the Siberian network can be seen here as well.
 It is well known that the interannual variability of regional carbon fluxes can be explained by local or synoptic-scale meteorological conditions, such as the Siberian high, and global climatic events such as El Niño or La Niña and volcanic eruptions [Patra et al., 2005; Gurney et al., 2008; Deng and Chen, 2011; Gurney et al., 2012]. Relationships between the variability of fluxes in this region and climatic events require further analysis, however, because the Siberian network changed size during the period and the available measurements from tower sites increased greatly in frequency.
3.4 Predicted CO2 Concentration
 We evaluated atmospheric CO2 concentrations a posteriori at each station for case 3 in terms of χ2, defined as follows:

$$\chi^{2} = \frac{1}{N} \sum_{n=1}^{N} \frac{\left(z'_{n} - z_{n}\right)^{2}}{\sigma_{n}^{2}}$$

where z′ is the a posteriori concentration after the inversion, z is the observed concentration at a site, σ is its uncertainty, and N is the number of observations at the site [Peylin et al., 2002; Gurney et al., 2004]. At the NOAA and Siberian sites, χ2 values ranged from 0.63 to 4.83. At 44% of all sites, χ2 values fell within the interval 0.8–1.2, so that the inversion was quite consistent in χ2. At nine Siberian towers, χ2 ranged from 0.79 at SVV to 1.09 at AZV; that is, χ2 ≈ 1. At the Siberian aircraft sites, χ2 values were 1.06–1.43, 0.99–1.34, 0.94–1.19, and 1.2–2.87 at BRZ, NOV, SUR, and YAK, respectively. At YAK, χ2 tended to be greater than at the other aircraft sites because there were few data at some altitudes and the a priori VISIT flux had a large seasonal variation in east Siberia, which was poorly reproduced by the transport model. Siberia is one region where atmospheric transport models fail to reproduce CO2 vertical profiles because of the covariation between ecosystem fluxes and vertical transport [Paris et al., 2008]. Although χ2 values were large at some sites, overall the NOAA and Siberian data made valuable contributions to the cost function.
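The χ2 diagnostic can be sketched as follows. This is a minimal illustration that assumes the normalized form, in which the squared model-observation mismatch divided by the squared data uncertainty is averaged over the observations at a site, so that χ2 ≈ 1 indicates a fit consistent with the assigned uncertainties. The function name `chi_square` and the concentration values (in ppm) are hypothetical.

```python
def chi_square(simulated, observed, sigma):
    """Normalized chi-square of a posteriori concentrations at one site.

    simulated : a posteriori concentrations z'
    observed  : observed concentrations z
    sigma     : data uncertainties assigned to each observation
    """
    if not (len(simulated) == len(observed) == len(sigma)):
        raise ValueError("all inputs must have the same length")
    if not observed:
        raise ValueError("at least one observation is required")
    total = sum(
        ((zp - z) / s) ** 2 for zp, z, s in zip(simulated, observed, sigma)
    )
    return total / len(observed)


# Hypothetical monthly values in ppm, for illustration only.
z_post = [380.0, 382.0, 379.0, 384.0]
z_obs = [381.0, 381.0, 380.5, 383.0]
z_sigma = [1.0, 1.0, 1.5, 1.0]

chi2_site = chi_square(z_post, z_obs, z_sigma)
print(f"chi-square at the example site: {chi2_site:.2f}")
```

Values well above 1 indicate that the mismatch exceeds the assumed data uncertainty, as at YAK, and values well below 1 suggest that the uncertainty was set too conservatively.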
 We show time series of simulated and observed CO2 concentrations at the KRS tower and at 7000 m over the SUR aircraft site in Figure 8 as examples of the inversion performance. At KRS, the results of the prediction model after the inversion reproduced the observed CO2 concentration better than the model free run with the a priori data set, although the inversion results still could not adequately reproduce the high concentrations in winter or the very low concentrations in summer. These observed concentrations, which show a model-observation mismatch of more than 15 ppmv, were given high uncertainty when the inversion was performed (see section 'Data Uncertainty'). In winter, the Siberian high-pressure system [Lloyd et al., 2002] causes a strong inversion layer to develop from 200 to 600 m above the ground, and the transport model sometimes has difficulty in reproducing such a strong winter inversion. Furthermore, the coarse model grid and the 1° × 1° resolution of the a priori flux data may make the model incapable of reproducing these locally high winter concentrations. The model should be improved to address these problems. At 7000 m at SUR, where CO2 observations are made monthly, the optimized CO2 concentration captured the observed seasonal variation and trend well (χ2 = 1.01, correlation = 0.97, overall bias = −0.51, centered root-mean-square [RMS] difference = 1.78, compared with χ2 = 4.56, correlation = 0.96, overall bias = 2.19, centered RMS difference = 2.42 for the model “free run”). The free-run model results (gray triangles, Figure 8) had larger increases after 2005, compared with the observations and inversions, because of changes in the interannual variability of VISIT. Otherwise, the modeled vertical transport reproduced observations in the free troposphere fairly well.
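The comparison statistics quoted above (correlation, overall bias, and centered RMS difference) can be computed as in the following sketch. It assumes that the bias is the mean of model minus observation and that the centered RMS difference is the RMS of the mismatch after the means of both series are removed; the function name `comparison_stats` and the sample values are hypothetical.

```python
import math


def comparison_stats(model, obs):
    """Return (correlation, overall bias, centered RMS difference)."""
    if len(model) != len(obs) or len(obs) < 2:
        raise ValueError("need two series of equal length (>= 2 points)")
    n = len(obs)
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    bias = mean_m - mean_o  # assumed sign convention: model minus observation

    # Anomalies about each series' own mean.
    dm = [m - mean_m for m in model]
    do = [o - mean_o for o in obs]

    var_m = sum(a * a for a in dm)
    var_o = sum(b * b for b in do)
    if var_m == 0.0 or var_o == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    corr = sum(a * b for a, b in zip(dm, do)) / math.sqrt(var_m * var_o)

    # RMS of the mismatch once the mean offset (bias) has been removed.
    centered_rms = math.sqrt(sum((a - b) ** 2 for a, b in zip(dm, do)) / n)
    return corr, bias, centered_rms


# Hypothetical monthly concentrations in ppm: the model is offset by +2 ppm.
obs_ppm = [370.0, 372.0, 368.0, 371.0]
model_ppm = [372.0, 374.0, 370.0, 373.0]

corr, bias, crms = comparison_stats(model_ppm, obs_ppm)
print(f"correlation = {corr:.2f}, bias = {bias:.2f}, centered RMS = {crms:.2f}")
```

In this example a constant offset shows up only in the bias, while the correlation and centered RMS difference are unaffected, which is why the three statistics are reported together.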
3.5 Comparison with Independent Observations
 To verify our inverted fluxes, we performed forward model simulations with the fluxes derived from cases 1 to 4 and compared the simulated monthly CO2 concentrations with independent airborne observations over Zotino (ZOT) in central Siberia. The ZOT data [Lloyd et al., 2002] are from the GLOBALVIEW-CO2 data set (GV data set). The model setup was the same as for the inverse modeling (section 'Material and Methods'), except that the inverted fluxes were added to the a priori fluxes during the forward simulations. The time series of the ZOT observations and simulated CO2 concentrations with the four flux data sets at four levels (0.5, 1.5, 2.5, and 3.5 km) are plotted in Figure 9, with statistics of the comparisons in Table 5. Note that ZOT observations stopped in mid-2005 and the data in the GV data set after that are climatological data derived by fitting curves to the available GV data. The correlation coefficients between the observations and all model simulations are fairly good (>0.79), but the performance of the forward simulations with inverted fluxes was much better than that of the free-run simulation (Table 5). At altitudes of 1.5, 2.5, and 3.5 km, the inversion cases had overall biases (Table 5) of less than 1 ppmv, while the free-run simulations had biases larger than 1.28 ppmv. In contrast, although the free-run simulation at 0.5 km had the smallest bias among the five simulations, it overestimated the climatological concentrations at all altitudes after 2005. Generally, the simulations using inverted fluxes could reproduce the observations over ZOT well. For some years, the simulations for higher altitudes could not reproduce the observed summer minimum, although the 0.5 km simulations matched the observations quite well in 2004, 2005, and 2007. These differences might be caused by interannual variability of the PBL height over ZOT. Lloyd et al. estimated the boundary layer height at ZOT from observed profiles of temperature and CO2 and water vapor concentrations and found a clear seasonal variation; it varied from 200 to 600 m in winter and from 1 to 2.8 km in summer. The 1.5 and 2.5 km levels in the ZOT observations are within the highly variable summer PBL, and the PBL height in NIES-TM might sometimes be lower than the actual PBL. The model-data mismatch is likely to be caused by errors in the PBL height and by the crude representation in the transport model of vertical transport when shallow cumuli exist. The variability of the strong winter Siberian high, which suppresses vertical transport from the surface to the free troposphere, might cause the large model-data mismatch in winter (for example, in winter 2001–2002). Above 1.5 km, differences among the four forward simulations using inverted fluxes were very small, and the simulations correlated well with the GV observations (Table 5), with small overall biases of less than 1 ppmv and RMS differences of 3–4 ppmv, but their peak-to-peak amplitudes were about 4 ppmv smaller than the observed amplitudes. At 0.5 km, the peak-to-peak amplitude differed slightly among the five forward simulations. The peak-to-peak amplitude of the ZOT observations derived from the average seasonal cycle was 23.4 ppmv; the free-run and case 1 simulations slightly underestimated the observed amplitude, while the other three simulations, from cases 2, 3, and 4, matched it better. It is difficult to say from these statistics which result best reproduced the ZOT observations, but cases 2, 3, and 4 explain the ZOT variations most reasonably.
Table 5. Pearson's Correlation Coefficient, Overall Bias, Root-Mean-Square (RMS) Difference, and Peak-To-Peak Amplitude of the Four Inversion Cases in Comparison with ZOT Observations at Altitudes of 0.5, 1.5, 2.5, and 3.5 km from GLOBALVIEW-CO2 Data Set
“Free-run” indicates a forward simulation with the a priori fluxes only. “Inv. case 1” indicates a forward simulation with the inverted fluxes from the case 1 inversion in addition to the a priori fluxes, and so on.
Peak-to-peak amplitudes were defined as the difference between the minimum and maximum monthly concentrations during an average seasonal cycle derived by fitting curves by the digital filtering technique (see section 'Data Uncertainty').
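The peak-to-peak amplitude defined above can be illustrated with a simplified calculation. The sketch below does not reproduce the digital filtering technique used in this study; instead it assumes an already detrended monthly series and builds the average seasonal cycle by averaging all values for each calendar month. The function name `peak_to_peak_amplitude` and the synthetic two-year series are hypothetical.

```python
import math
from collections import defaultdict


def peak_to_peak_amplitude(monthly_values):
    """Peak-to-peak amplitude of the mean seasonal cycle.

    monthly_values : iterable of (month, value) pairs with month in 1..12.
    The series is assumed to be detrended already (simplification; the
    study itself fits curves with a digital filter).
    """
    by_month = defaultdict(list)
    for month, value in monthly_values:
        if not 1 <= month <= 12:
            raise ValueError("month must be in 1..12")
        by_month[month].append(value)
    if not by_month:
        raise ValueError("no data")
    # Average seasonal cycle: one mean value per calendar month.
    cycle = [sum(v) / len(v) for v in by_month.values()]
    return max(cycle) - min(cycle)


# Synthetic detrended series: two years of a 10 ppm cosine seasonal cycle.
series = [
    (m % 12 + 1, 10.0 * math.cos(2.0 * math.pi * m / 12.0)) for m in range(24)
]

amplitude = peak_to_peak_amplitude(series)
print(f"peak-to-peak amplitude: {amplitude:.1f} ppm")
```

With real data, the long-term CO2 increase must be removed first; otherwise the trend within each year inflates the apparent amplitude.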
 We estimated that the global mean carbon flux was about −3.51 GtC yr−1 for the period 2000–2009 for all four inversion cases (Table 3). The sink on land was estimated to be −1.90 ± 3.00 GtC yr−1 and that in the ocean −1.61 ± 1.06 GtC yr−1 for case 3 (NOAA and Siberian data), and −1.48 ± 3.78 and −2.03 ± 1.31 GtC yr−1, respectively, for case 4 (same data set but solved by t-SVD). We found that the land-ocean partitioning was different between cases 3 and 4, although the global totals remained the same. The estimated land flux for case 1 (NOAA data only) was −1.95 ± 3.08 GtC yr−1, a larger sink by 0.05–0.47 GtC yr−1 than the sinks of cases 3 and 4. Previous bottom-up and top-down studies have reported the carbon budget for the 2000–2009 period. For example, IPCC [Intergovernmental Panel on Climate Change, 2007] summarized the carbon balance of anthropogenic emissions (fossil fuel plus cement) to be about 7.2 ± 0.3 GtC yr−1 and the net land and ocean fluxes to be −0.9 ± 0.6 and −2.2 ± 0.5 GtC yr−1, respectively, for 2000–2005. Le Quéré et al.  reported bottom-up flux estimations of 7.7 ± 0.4 GtC yr−1 for fossil fuel plus cement industry emissions for 2000–2008, of 1.4 ± 0.7 GtC yr−1 for land-use changes, and of 3.0 ± 0.9 and 2.3 ± 0.4 GtC yr−1 for land and ocean sinks, respectively. Using nonprocessed (monthly mean) observations from the World Data Centre for Greenhouse Gases in their atmospheric inversion system, Maki et al.  inferred a mean global flux of −3.24 GtC yr−1 for 2001–2007 (−1.36 and −1.88 GtC yr−1 for land and ocean fluxes, respectively). Deng and Chen  derived a land sink of 3.63 ± 0.49 GtC yr−1 (excluding biomass-burning emission of 2.56 GtC yr−1) and an ocean sink of 1.94 ± 0.41 GtC yr−1 for 2002–2007 by using hourly terrestrial ecosystem exchanges as the a priori flux and time-dependent Bayesian inversion. In light of these results, our global, all-land, and all-ocean estimates of the carbon budget in this period are reasonable. 
Carbon emissions due to land-use change (1.4 ± 0.7 GtC yr−1, Le Quéré et al. ; 1.10 ± 0.11 GtC yr−1, during 2000–2009, Houghton et al. ) were not explicitly included in our inversion and would be implicitly included in the global land flux.
 Focusing on boreal Eurasia, our target region, Table 6 summarizes estimated fluxes in this study and published studies. The estimated annual mean flux there by the TC3 annual mean control inversion with 17 models was −0.60 ± 0.52 GtC yr−1 (from −1.70 to 0.71 GtC yr−1), depending on the participating atmospheric models [Gurney et al., 2003]. In a TC3 seasonal inversion, the mean boreal Eurasian flux was −0.36 GtC yr−1, with “within-model” uncertainty of ± 0.23 GtC yr−1 and “between-model” uncertainty of ± 0.51 GtC yr−1 [Gurney et al., 2004]. The TC3 time-dependent inversion (TDI) experiment estimated the average flux to be −0.37 GtC yr−1 for 1992–1996 and −0.33 GtC yr−1 for 1991–2000 based on the results of 13 atmospheric models [Baker et al., 2006]. Using the same TC3 TDI inversion system but a new method for site selection, Maki et al.  obtained a boreal Eurasian flux of −1.46 GtC yr−1 for 2001–2007. Gurney et al.  estimated that flux to be −0.267 ± 0.467 GtC yr−1 during 2000–2004 and −0.284 ± 0.472 GtC yr−1 during 2003–2006, but when the three models that best represented observed vertical profiles of CO2 in the Northern Hemisphere [Stephens et al., 2007] were used, the estimated fluxes were −0.033 ± 0.093 and 0.023 ± 0.206 GtC yr−1 for 2000–2004 and 2003–2006, respectively (Table 1 in Hayes et al. ). Many other studies also estimated boreal Eurasian fluxes but without using observations from Siberia because no adequate observations were available. Few “top-down” studies have used Siberian observations, as described in section 'Introduction'. Using the TC3 seasonal inversion approach, Maksyutov et al.  estimated fluxes with and without measurements from the three Siberian aircraft sites used here and observations around Japan. For 1992–1996, they estimated a boreal Eurasian flux of −0.41 ± 0.56 GtC yr−1 without and −0.63 ± 0.36 GtC yr−1 with the Siberian aircraft data. 
In our inversion results, the estimated boreal Eurasian flux was −0.56 ± 0.79 and −0.52 ± 0.69 GtC yr−1 without and with the Siberian aircraft data, respectively. Our results thus showed a different tendency from those obtained by Maksyutov et al. , mainly because of the difference in model transport: the results of Maksyutov et al. were ensemble means of TransCom 3 models, whereas ours are from a single model. Other possible reasons are differences in the inversion components, including the inversion method, the background data set (GLOBALVIEW in Maksyutov et al. and NOAA data in this study), the inversion period, the a priori flux data set, and the NIES-TM version.
Table 6. Comparison of Carbon Fluxes for Boreal Eurasia from the Current Study and Published Studies*

| Flux [GtC yr−1] | Source (this study; area 13.3 × 10^12 m2) |
| --- | --- |
| −0.05 ± 1.39 | A priori flux |
| −0.56 ± 0.79 | Case 1 (with NOAA data) |
| −0.52 ± 0.69 | Case 2 (with NOAA plus Siberian aircraft data) |
| −0.35 ± 0.61 | Case 3 (with NOAA plus all Siberian data) |
| −0.35 ± 0.87 | Case 4 (with NOAA plus all Siberian data) |

*Bottom-up inventories, dynamic global vegetation models, atmospheric inversions
 We now turn our attention to bottom-up studies on carbon fluxes in Siberia. As part of the Regional Carbon Cycle Assessment and Processes (RECCAP) project, Dolman et al.  obtained their best estimate for the net biosphere to atmosphere flux of −0.659 GtC yr−1 for Russian territory (including Ukraine, Belarus, and Kazakhstan; 17.1 × 10^12 m2), an average of three independent estimates: (1) a bottom-up estimate of −0.563 GtC yr−1 using eddy covariance measurements and Dynamic Global Vegetation Models; (2) a top-down estimate of −0.690 ± 0.246 GtC yr−1 (including fire emissions, but not fossil fuel emissions) by 12 inverse models over various periods; and (3) another bottom-up estimate of −0.761 GtC yr−1 by the landscape approach. Hayes et al.  also used a terrestrial ecosystem model to estimate the cumulative net uptake of CO2 in northern high-latitude lands from 1960 to 2006. They estimated that the average annual NEE for boreal Asia (“BOAS” in their paper) was −0.112 GtC yr−1 during 1997–2006, and also found a substantially smaller uptake rate of 0.01 GtC yr−1 between 1999 and 2006 than before 1999. Quegan et al.  presented five independent carbon flux estimates for central Siberia (roughly corresponding to our regions 29 and 30) using landscape-ecosystem-based carbon accounting approaches and dynamic global vegetation models. They also compiled Siberian flux estimates from previous studies, including the TC3 experiments and that by Maksyutov et al. . From 13 estimates, they calculated an average carbon sink for boreal Eurasia with net biome production of 0.352 ± 0.092 GtC yr−1 (or a density of 27.5 ± 7.2 gC m−2 yr−1, ranging from −6 to 49 gC m−2 yr−1 for the 13 estimates; only the estimate from Lund-Potsdam-Jena was negative). According to Dolman et al. , the result of Quegan et al. corresponds to about −0.470 GtC yr−1 when converted to the Russian territory that was the target region of their study, and they suggested that the difference from their own estimate might be caused by differences in the target areas and in the variety of land-use types and climate in those areas.
 In this study, we estimated a boreal Eurasian flux of −0.56 ± 0.79 GtC yr−1 for case 1; including the Siberian data reduced the uptake to −0.35 ± 0.61 GtC yr−1 (case 3) and −0.35 ± 0.87 GtC yr−1 (case 4) (Table 4). These values include biomass-burning emissions of 0.11 GtC yr−1 (8.27 gC m−2 yr−1). The case 1 sink is rather close to the RECCAP top-down estimate of −0.690 ± 0.246 GtC yr−1, which was obtained without the Siberian data. This suggests that inversions performed without Siberian data might overestimate the boreal Eurasian sink. Our two regularization methods (cases 3 and 4) yielded similar sinks (−0.35 GtC yr−1 = −26.32 gC m−2 yr−1 for boreal Eurasia) (Table 4). In comparison with the previously mentioned studies, our results are close to the average estimate of −0.352 ± 0.092 GtC yr−1 (−27.5 ± 7.2 gC m−2 yr−1 for boreal Eurasia) by Quegan et al., which is a smaller sink than that of RECCAP (−0.659 GtC yr−1; −40.35 gC m−2 yr−1 for Russian territory) [Dolman et al., 2012], although the areas investigated differ.
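The paired values quoted above (a regional total in GtC yr−1 and a flux density in gC m−2 yr−1) are related by the area of the region. The following is a minimal sketch of that conversion, not the authors' code. The area of 1.33 × 10^13 m2 is an assumption back-calculated from the quoted pairs (−0.35 GtC yr−1 and −26.32 gC m−2 yr−1); it is not a figure stated in the text, and the function names are hypothetical.

```python
# Sketch: convert between a regional total flux (GtC yr-1) and an
# area-normalized flux density (gC m-2 yr-1).

GRAMS_PER_GT = 1.0e15  # 1 GtC = 1e15 gC

# ASSUMPTION: area of the boreal Eurasian region in m2, back-calculated
# from the value pairs quoted in the text; not given by the authors.
BOREAL_EURASIA_AREA_M2 = 1.33e13


def flux_density(total_gtc_per_yr, area_m2):
    """Return the flux density in gC m-2 yr-1 for a total in GtC yr-1."""
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    return total_gtc_per_yr * GRAMS_PER_GT / area_m2


def total_flux(density_gc_per_m2_yr, area_m2):
    """Return the regional total in GtC yr-1 for a density in gC m-2 yr-1."""
    return density_gc_per_m2_yr * area_m2 / GRAMS_PER_GT


# Values quoted in the text for boreal Eurasia.
quoted_totals = [
    ("net flux, cases 3 and 4", -0.35),
    ("biomass-burning emissions", 0.11),
]

for label, total in quoted_totals:
    density = flux_density(total, BOREAL_EURASIA_AREA_M2)
    print(f"{label}: {total:+.2f} GtC yr-1 = {density:+.2f} gC m-2 yr-1")
```

With the assumed area, the sketch reproduces the quoted densities of −26.32 and 8.27 gC m−2 yr−1. Because each study normalizes by its own target area, densities from different studies are only comparable when that area difference is kept in mind.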
 We performed an inversion analysis to estimate carbon fluxes, mainly for Siberia, by using, for the first time, measurements from a Siberian observational network of nine towers (JR-STATION) and four aircraft sites, in addition to NOAA's surface background flask measurements. We performed analyses for four cases using different observational data and two different regularization methods (full-rank inversion and truncated SVD) for 2000 to 2009 by using NIES-TM and a fixed-lag Kalman Smoother approach. For the average of the four cases, we obtained a total global flux of about −3.51 GtC yr−1, which is consistent with previous studies. Our main focus was boreal Eurasia, where the Siberian network is expected to constrain carbon flux estimation. By comparing results with (case 3) and without (case 1) the Siberian data, we found clear differences in the estimated fluxes over northeastern Europe and North America as well as over Siberia. The Siberian data also reduced the regional uncertainty by 22% for boreal Eurasia and northeastern Europe. The uncertainty was reduced by up to 80% in eastern and western Siberia, which suggests that the Siberian network can increase our confidence in estimated fluxes. With only NOAA data (case 1), the boreal Eurasian flux was −0.56 ± 0.79 GtC yr−1, but when the Siberian data were included (case 3), the same flux was −0.35 ± 0.61 GtC yr−1. The case 4 analysis similarly resulted in −0.35 ± 0.87 GtC yr−1. This sink for the boreal Eurasian land area accounted for about 20% of the estimated total land sink. The case 2 inversion, which added only the Siberian aircraft data, inferred −0.52 GtC yr−1, almost the same flux as in case 1, but the aircraft data contributed to a reduction in the uncertainty of estimated fluxes over northern Eurasia and northeastern Europe. Compared with other recent studies, our estimate with the Siberian data agrees better with the average of 13 different estimates (−0.352 ± 0.092 GtC yr−1) obtained by Quegan et al. than with the RECCAP estimate of −0.659 GtC yr−1 [Dolman et al., 2012]. When we divided boreal Eurasia into eight small regions to take advantage of the dense Siberian network, our analysis showed that the seasonal cycles and interannual variability of the fluxes were spatially heterogeneous. Maximum uptake of carbon occurred in July in all parts of Siberia and in June in Eastern Europe, and there were large sinks in central and east Siberia in summer. In west Siberia, where the network sites are concentrated, we found a latitudinal difference in the estimates, with maximum emissions located in central west Siberia. The interannual variability of the fluxes was larger when Siberian data were included than when only NOAA data were used, but it should be noted that the network size changed during the study period, which may affect the variability results. To verify the performance of our inverted fluxes, we performed forward simulations with our estimated fluxes added to the a priori fluxes. Comparisons with independent observations over ZOT in central Siberia showed that CO2 concentrations simulated with the inverted fluxes agreed better with observations than those simulated with only a priori fluxes, but the differences were statistically small, and it was difficult to determine which inversion best reproduced the observations at ZOT. Our results confirmed the importance of the Siberian network data. The dense Siberian network is still in operation and should be able to constrain future inverse calculations for estimating carbon fluxes, especially over boreal Eurasia.
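As a rough cross-check of the uncertainty reduction discussed above, the case 1 and case 3 uncertainties on the boreal Eurasian total can be compared directly. The sketch below assumes that the reduction is defined as the relative decrease in the quoted uncertainty between the two inversions; this section does not state the formula the authors used, and the function name is hypothetical.

```python
# Sketch: relative uncertainty reduction between two inversions.
# ASSUMPTION: reduction = 1 - sigma_constrained / sigma_reference.
# The formula used by the authors is not given in this section.


def uncertainty_reduction(sigma_reference, sigma_constrained):
    """Fractional decrease in uncertainty relative to the reference case."""
    if sigma_reference <= 0:
        raise ValueError("reference uncertainty must be positive")
    return 1.0 - sigma_constrained / sigma_reference


# Uncertainties on the boreal Eurasian total flux (GtC yr-1) quoted in the text.
case1_sigma = 0.79  # NOAA data only
case3_sigma = 0.61  # NOAA plus all Siberian data

reduction = uncertainty_reduction(case1_sigma, case3_sigma)
print(f"Uncertainty reduction, case 1 to case 3: {100.0 * reduction:.1f}%")
```

Under this assumed definition the aggregated total gives a reduction of roughly 23%, of the same order as the 22% quoted for the regional uncertainties; the two numbers need not match exactly, because an average over regions and the reduction of the aggregated total are different quantities.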
 This research was supported by the Global Environment Research Account for National Institutes (estimation of CO2 and CH4 fluxes in Siberia by the tower observation network), in the Ministry of the Environment, Japan. The work of R.J.A. was sponsored by the U.S. Department of Energy, Office of Science, Biological and Environmental Research (BER) programs and was performed at Oak Ridge National Laboratory (ORNL) under U.S. Department of Energy contract DE-AC05-00OR22725. We thank Sergey Mitin (Institute of Microbiology, Russian Academy of Science (RAS)), Boris Belan, Denis Davydov, Aleksandr Fofofonov, Oleg Krasnov (Institute of Atmospheric Optics of the Siberian Branch, RAS) and Nikolai Fedoseev (Melnikov Permafrost Institute of the Siberian Branch, RAS) for coordinating the observations and administrative procedures. The model simulations were performed with the NIES supercomputer system (NEC SX-8R/128M16) and the Greenhouse gases Observing Satellite Research Computation Facility (GOSAT RCF/SGI Asterism ID318 cluster with NVIDIA C2050). We also thank Kaduo Hiraki, the administrator of the RCF, for his support during the porting of the inversion system to RCF. We are also grateful to two anonymous reviewers for their useful comments, which improved the manuscript. The Japan Meteorological Agency (JMA) Climate Data Assimilation System (JCDAS) data sets used for this study were provided by the cooperative research project of the Japanese Re-Analysis 25 years (JRA-25) long-term reanalysis of JMA and the Central Research Institute of the Electric Power Industry (CRIEPI).