Recent comparisons between satellite observed and global model simulated glyoxal (CHOCHO) have consistently revealed a large unknown source of CHOCHO over China. We examine this missing CHOCHO source by analyzing SCIAMACHY observed CHOCHO vertical column densities (VCDs) using a Regional chEmical trAnsport Model (REAM). This missing source is first quantified by the difference between SCIAMACHY observed and REAM simulated CHOCHO VCDs (ΔCCHOCHO), which shows little overlap with high biogenic isoprene emissions but is collocated with dense population and high anthropogenic NOx and VOC emissions. We then apply inverse modeling to constrain CHOCHO precursor emissions based on SCIAMACHY CHOCHO and find that this missing source is most likely caused by substantially underestimated aromatics emissions (by a factor of 4–10, varying spatially) in the VOC emission inventories over China used in current regional and global models. Comparison with in situ observations in Beijing, Shanghai, and a site in the Pearl River Delta shows that the large model biases in aromatics concentrations are greatly reduced after the inversion. The top-down estimated aromatics emission is 13.4 Tg yr−1 in total, about 6 times the bottom-up estimate (2.4 Tg yr−1). The resulting impact on regional oxidant levels is large (e.g., ∼100% increase of PAN in the afternoon). Furthermore, since aromatics are important precursors of secondary organic aerosol (SOA), such an increase of aromatics could lead to a ∼50% increase of global aromatic SOA production and thereby help to reduce the low bias of simulated organic aerosols over the region in previous modeling studies.
1. Introduction
 Glyoxal (CHOCHO) is the smallest di-carbonyl compound and an oxidation product of many unsaturated volatile organic compounds (VOCs) such as isoprene and aromatics [e.g., Fu et al., 2008]. Observations of CHOCHO provide useful constraints on emissions of these VOCs since the primary emissions of CHOCHO are small [Volkamer et al., 2005]. Recently, satellite observations of tropospheric CHOCHO vertical column densities (VCDs) became available from the SCanning Imaging Absorption SpectroMeter for Atmospheric CHartographY (SCIAMACHY) [Vrekoussis et al., 2009; Wittrock et al., 2006] and the Global Ozone Monitoring Experiment-2 (GOME-2) [Vrekoussis et al., 2010]. A number of CHOCHO hot spots were identified in these observations, such as over the tropical oceans and eastern China. CHOCHO abundance over China was found to have increased from 2003 to 2008 [Vrekoussis et al., 2009].
 Global modeling studies of CHOCHO [Fu et al., 2008; Myriokefalitakis et al., 2008; Stavrakou et al., 2009] have consistently shown a substantial underestimation of the VCDs of CHOCHO over China when compared to satellite observations. In contrast, much better agreement was found over the high CHOCHO region of the southeastern U.S., driven by biogenic isoprene emissions [e.g., Fu et al., 2008]. Model discrepancies have also been found over tropical regions and the causes are still under discussion. Previous studies over China [e.g., Carmichael et al., 2003; Fu et al., 2007; Q. Zhang et al., 2009] highlighted the complexity and large uncertainties in the emissions of reactive VOCs, many of which are precursors of CHOCHO. Satellite CHOCHO VCDs are particularly valuable for fast-reacting VOCs, the emissions of which cannot be reliably constrained by in situ measurements in the outflow region [e.g., Carmichael et al., 2003]. In this work, we explore the missing source of CHOCHO over China by comparing CHOCHO VCDs from SCIAMACHY with those calculated by a Regional chEmical trAnsport Model (REAM).
2. Data and Model Descriptions
2.1. CHOCHO VCDs
 The differential optical absorption spectroscopy (DOAS) technique is applied to observations of the upwelling radiation leaving the top of the atmosphere and measured by SCIAMACHY [Burrows et al., 1995; Bovensmann et al., 1999, and references therein] to retrieve CHOCHO VCDs in the spectral window of 435–457 nm. The retrieval error includes (1) systematic errors in trace gas absorption cross sections, atmospheric temperature, instrument calibration, and air mass factor calculations, and (2) random errors due to the noise on the measured backscattered electromagnetic radiation as a function of wavelength range relative to the measured absorption. The total uncertainty of the monthly average CHOCHO VCD (CCHOCHOobserved) at a given location (60 × 30 km2 and cloud fraction <20%) is given by α × CCHOCHOobserved + 2 × 10^14 molecules cm−2, where the value of α lies in the range of 0.1–0.3. For details of the retrieval process, we refer readers to Vrekoussis et al.
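The uncertainty expression above can be written as a small helper. This is only an illustrative sketch: the function name `vcd_uncertainty` and the default α = 0.2 (the midpoint of the quoted 0.1–0.3 range) are assumptions, not part of the retrieval.

```python
def vcd_uncertainty(c_observed, alpha=0.2, floor=2e14):
    """Total uncertainty of a monthly mean CHOCHO VCD (molecules cm^-2).

    Implements alpha * C_observed + 2e14, as quoted in the text.
    alpha=0.2 is an assumed midpoint of the stated 0.1-0.3 range.
    """
    if not 0.1 <= alpha <= 0.3:
        raise ValueError("alpha is expected to lie in the range 0.1-0.3")
    return alpha * c_observed + floor


# Example: a column of 5e14 molecules cm^-2 with the assumed alpha
sigma = vcd_uncertainty(5e14)
```

Because of the constant term, the relative uncertainty is larger for small columns than for large ones.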
2.2. In Situ VOC Measurements
 VOC observations at three sites, in Beijing (39.99°N, 116.31°E, August 2007), Shanghai (31.17°N, 121.43°E, July 2010) and the Back Garden site (23.5°N, 113.03°E, July 2006) in the Pearl River Delta (PRD), are used in the model evaluation. The first two are urban sites on the roof top of a building (∼20 m above ground). The last one (∼10 m above ground) is a rural site 60 km northwest of Guangzhou. C2 (or C3) − C9 (or C12) VOC species were measured by online GC-FID/PID systems. The detection limits are several to hundreds of pptv and the estimated uncertainties are 1–10%. More details about these VOC measurements are given in the auxiliary material.
2.3. REAM Model
 The 3-D REAM model has been applied over North America, East Asia and the polar regions [e.g., Choi et al., 2005; Wang et al., 2007; Choi et al., 2008a, 2008b; Zhao et al., 2009a, 2009b; Zhao and Wang, 2009; Zhao et al., 2010; Yang et al., 2011]. The model has a horizontal resolution of 70 km with 21 vertical layers in the troposphere. Transport is driven by WRF (v3.2) assimilated meteorological fields constrained by the NCEP reanalysis products [Kalnay et al., 1996]. Most meteorological inputs are archived every 30 min except those related to convective transport and lightning parameterizations, which are archived every 5 min [Zhao et al., 2009a, 2009b]. Chemical initial and boundary conditions for chemical tracers in REAM are obtained from the global simulation for the same period using the GEOS-Chem model (v7-03-06) driven by GEOS-4 assimilated meteorological fields [Bey et al., 2001]. The chemistry mechanism for CHOCHO in REAM is extended from that of standard GEOS-Chem by including production from the oxidation of aromatics, acetylene, and ethene [Carter, 2009; Fu et al., 2008], which are major precursors for CHOCHO. Biogenic emissions of isoprene and other species are based on MEGAN (v2.1) [Guenther et al., 2006]. The anthropogenic VOC emission inventory by Q. Zhang et al. [2009] is used. The biomass burning emissions are obtained from the Global Fire Emissions Database, Version 2 (GFEDv2.1; available at http://daac.ornl.gov/). The precursor emissions and chemistry for CHOCHO in this work are generally consistent with those used by Fu et al. [2008], except that we neglect the aerosol sink, as done by Myriokefalitakis et al. [2008]. We note that including this additional sink of CHOCHO would only make the missing source of CHOCHO even larger. The model simulated CHOCHO VCDs are sampled at the SCIAMACHY overpass time (10:00 am local time) and averaged over a month to compare with the satellite monthly average VCDs.
 We chose to conduct inverse modeling of SCIAMACHY CHOCHO observations for August 2007 in part because of the full suite of observation data available and well-characterized photochemical simulations during the CAREBeijing-2007 Experiment [Liu et al., 2010, 2012] and in part because the influence of biomass burning sources is much smaller in August than in July. Simulations were also conducted for July of 2006 and 2010 to compare model simulated VOC concentrations to in situ observations in PRD and Shanghai, respectively.
3. Results and Discussions
3.1. Spatial Distribution of CHOCHO Source Underestimates
 We determine the spatial distribution of CHOCHO VCD underestimates (ΔCCHOCHO) by calculating the difference of SCIAMACHY observed and model simulated CHOCHO VCDs (ΔCCHOCHO = CCHOCHOobserved − CCHOCHOmodel). Figure 1a shows the spatial distribution of ΔCCHOCHO over China in August 2007. While the spatial pattern of ΔCCHOCHO is generally consistent with those of previous global modeling results, showing large model underestimates over eastern China [e.g., Fu et al., 2008], the finer spatial resolution of REAM allows more detailed scrutiny of the spatial distribution of the sources that are being underestimated. High ΔCCHOCHO values (>3.5 × 10^14 molecules cm−2), much larger than observational errors, are identified over the East China Plain, the Pearl River Delta, the Northeast Plain, and the Sichuan Basin.
 In order to assess the potential origins of ΔCCHOCHO, the spatial pattern of ΔCCHOCHO is examined and compared with those of various emission sources. Previous studies have proposed that significant and quite uncertain biomass burning emissions may result from large-scale agricultural fires over the East China Plain in June [Fu et al., 2007]. The ATSR fire hotspot map (http://wfaa-dat.esrin.esa.int/) shows that biomass burning is insignificant and restricted to a few areas in August (Figure S1 in the auxiliary material). Biomass burning is therefore not likely to be responsible for the widespread source underestimation of CHOCHO in August shown in Figure 1a. Further, ΔCCHOCHO is mostly significant over regions having low biogenic isoprene emissions (Figure 1b) and high anthropogenic emissions, and is associated with high population densities (Figure 1c) and high satellite vertical columns of NO2 (Figure 1d), the latter of which is an indicator of fossil fuel combustion emissions. This implies that the unidentified source is likely to be anthropogenic rather than biogenic. Previous global modeling studies also showed that CHOCHO sources from biogenic isoprene oxidation at northern mid-latitudes are reasonably represented in the models [e.g., Fu et al., 2008]. Figure S2 shows the model predicted distributions of CHOCHO VCDs resulting from the emissions of isoprene, ARO1 (benzene, toluene, ethylbenzene), ARO2 (xylenes and trimethylbenzenes), and acetylene, and minor contributions from alkenes and monoterpenes, respectively. Comparing Figure S2 to Figure 1, it is clear that the contributions by aromatics (ARO1 and ARO2) are most likely underestimated. Neither the isoprene nor the acetylene derived CHOCHO distribution resembles that of ΔCCHOCHO. Inverse modeling is therefore used in this study to quantify the underestimation of the aromatics emissions.
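As an illustration of the ΔCCHOCHO definition, the difference field and the high-value mask might be computed on gridded arrays as sketched here. The function name and the toy arrays are hypothetical; only the difference (observed minus model) and the 3.5 × 10^14 molecules cm−2 threshold come from the text.

```python
import numpy as np


def vcd_underestimate(observed, modeled, threshold=3.5e14):
    """Return (delta, mask) for gridded CHOCHO VCDs in molecules cm^-2.

    delta = observed - modeled, following the definition in the text.
    mask flags grid cells whose delta exceeds the quoted threshold.
    """
    observed = np.asarray(observed, dtype=float)
    modeled = np.asarray(modeled, dtype=float)
    delta = observed - modeled
    return delta, delta > threshold


# Toy two-cell example (values are made up for illustration)
delta, high = vcd_underestimate([6e14, 3e14], [1e14, 2e14])
```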
3.2. Top-Down Inversion
 Following previous inverse modeling studies of VOC emissions using satellite column observations [Fu et al., 2007; Shim et al., 2005], we assume a linearized relationship between glyoxal VCDs (CCHOCHO) and the precursor emissions (Ei):
 The relationship is valid only for precursors that are short-lived, i.e., isoprene and fast-reacting aromatics. We first remove the contribution to the regional background by the long-lived acetylene, which cannot explain the ΔCCHOCHO distribution in Figure 1. Previous analyses using aircraft measurements in outflow regions suggested that current emission estimates of long-lived VOC species, including acetylene and light alkanes are reasonable over China [Carmichael et al., 2003]. We also remove the minor contributions by alkenes and monoterpenes in the inversion since these small contributions cannot be estimated by inverse modeling when the difference between the model and observations is much larger. The contribution by these sources is shown in Figure S2. We obtain:
where ΔC′CHOCHO denotes the difference between observed and simulated CHOCHO VCDs after removing the contributions from acetylene, alkenes, and monoterpenes.
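One way to write the two relationships so that they agree with the definitions above is sketched here. The notation is an assumption for illustration: $\partial C_{\mathrm{CHOCHO}}/\partial E_i$ is the model sensitivity of the column to the emission of precursor $i$ (the quantity whose uncertainty is discussed later in this section), $\Delta E_i$ is the emission adjustment, and $C_{\mathrm{bg}}$ is a hypothetical symbol for the background term that is not attributed to local emissions. The exact expressions used in the inversion may differ.

```latex
\begin{align}
C_{\mathrm{CHOCHO}} &\approx C_{\mathrm{bg}}
  + \sum_{i} \frac{\partial C_{\mathrm{CHOCHO}}}{\partial E_i}\, E_i \\
\Delta C'_{\mathrm{CHOCHO}} &\approx
  \sum_{i \,\in\, \{\mathrm{isoprene},\ \mathrm{ARO1},\ \mathrm{ARO2}\}}
  \frac{\partial C_{\mathrm{CHOCHO}}}{\partial E_i}\, \Delta E_i
\end{align}
```

In this reading, the second line keeps only the short-lived precursors, because the contributions from acetylene, alkenes, and monoterpenes have already been subtracted from the observed-minus-simulated difference.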
 A general approach of inverse modeling is to assume that the spatial distributions of the sources are correct and that only the emission magnitudes are adjusted to improve model simulations of the observations [e.g., Shim et al., 2005]. Applying this approach, in which the state vectors of the inversion are the emissions of ARO1, ARO2, and isoprene, we found poor agreement between the simulated and observed distributions of CHOCHO after the inversion. The result is not surprising given the large spatial difference between observed CHOCHO and simulated contributions from precursor emissions (Figures 1 and S2).
 We apply a different approach, in which we derive the emission adjustment for each grid cell, such that the spatial variability of emission adjustments can be examined. Since satellite CHOCHO VCD observations are not precursor specific, we make use of surface observations of VOCs from the Beijing, Shanghai, and PRD sites to further evaluate the validity of the inversion results. We conduct four sets of top-down inversions, for the emissions of ARO1 + ARO2, ARO1 only, ARO2 only, and isoprene only (Figure S3 in the auxiliary material). In the case of isoprene only, we need to increase isoprene emissions by a factor of 6–20 over eastern China (Figure S3), where estimated biogenic emissions are low (Figure 1). Comparison with in situ isoprene measurements at the three surface sites using the standard model shows that the ratios of simulated to observed isoprene concentrations are close to or somewhat higher than 1, indicating that the large increases suggested by top-down inversion are unrealistic. Similarly, top-down inversion of ARO1 or ARO2 only leads to large overestimates of ARO1 or ARO2, respectively, at the surface sites.
 The best comparison with surface measurements is obtained when both ARO1 and ARO2 are constrained by top-down inversion (Figure 2). Top-down inversion increases ARO1 and ARO2 concentrations by a factor of 4–10, bringing the model results into much closer agreement with the observations, although the simulated concentrations are still a factor of 2–3 lower than the observations at the two sites in Shanghai and PRD. This top-down inversion significantly increases simulated CHOCHO VCDs, bringing them into much better agreement with SCIAMACHY observations than the standard model (Figure 3).
 Top-down inversion suggests large increases in the emissions of aromatics over eastern China. The increase is a factor of 4–8 for central eastern China and a factor of >10 in the Yangtze River Delta (YRD) and in southern China surrounding PRD. The latter two regions are known for spearheading economic development in China. We now consider the results presented here in light of the uncertainties of the top-down inversion. The uncertainty of ΔCCHOCHO comes mainly from the SCIAMACHY retrieval (30%) since simulated CHOCHO VCDs are too low in the standard model (Figure 3). If we assume that the uncertainty of ∂CCHOCHO/∂Ei is 100% for ARO1 + ARO2 and that the retrieval and model errors are independent, the top-down emission uncertainty is 104%, considerably less than the estimated emission changes.
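A per-grid adjustment of this kind might look like the sketch below under a simple linear assumption, where the column residual in each cell is divided by a local sensitivity to obtain an emission increment. This is not the study's actual inversion code: the function name, the purely local (cell-by-cell) sensitivity, and the toy numbers are all assumptions made for illustration.

```python
import numpy as np


def per_grid_scale_factor(delta_c, sensitivity, prior_emission):
    """Per-cell multiplicative emission adjustment (assumed linear model).

    delta_c        : residual column, observed minus simulated (molecules cm^-2)
    sensitivity    : assumed local dC/dE in each cell (column per unit emission)
    prior_emission : bottom-up emission in each cell
    """
    delta_c = np.asarray(delta_c, dtype=float)
    sensitivity = np.asarray(sensitivity, dtype=float)
    prior = np.asarray(prior_emission, dtype=float)

    # Emission increment; cells with no sensitivity are left unchanged.
    delta_e = np.divide(delta_c, sensitivity,
                        out=np.zeros_like(delta_c), where=sensitivity > 0)
    posterior = np.clip(prior + delta_e, 0.0, None)  # emissions cannot be negative

    # Ratio of posterior to prior; cells with zero prior keep a factor of 1.
    return np.divide(posterior, prior,
                     out=np.ones_like(prior), where=prior > 0)


# Toy example: the first cell needs a fourfold increase, the second none.
scale = per_grid_scale_factor([3e14, 0.0], [1e14, 1e14], [1.0, 2.0])
```

Because the column alone cannot separate precursors, a factor derived this way for one species still has to be checked against independent data, which is the role the surface VOC observations play in the text.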
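The 104% figure follows from adding the two independent relative errors in quadrature, as this short check shows. The helper name is hypothetical; the 30% and 100% inputs are the values quoted above.

```python
import math


def combined_relative_uncertainty(*relative_errors):
    """Combine independent relative errors by root-sum-of-squares."""
    return math.sqrt(sum(r * r for r in relative_errors))


# 30% retrieval uncertainty and 100% sensitivity uncertainty
total = combined_relative_uncertainty(0.30, 1.00)
print(f"{total:.0%}")
```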
 Large aromatic VOC abundance over polluted regions in China appears to be the major contributor to the elevated CHOCHO concentrations observed over these regions by the satellite. This is quite different from the eastern U.S., where concentrations of aromatics are much lower and isoprene is a major source of CHOCHO (particularly over the Southeast). Our results demonstrate that satellite observations of CHOCHO provide the critical information to better characterize the emissions of aromatics over polluted regions, where the information of anthropogenic VOC emission sources is incomplete, and VOC measurements are relatively sparse.
 The results from the top-down inversion imply a factor of 5–6 increase of total aromatics emission over China, i.e., an increase from 2.4 Tg yr−1 in the standard inventory to 13.4 Tg yr−1. The resulting increase of the total VOC emission over China by 47% (from 23.2 Tg yr−1 to 34.2 Tg yr−1) is still within the 68% uncertainty of the total VOC emission estimated by Q. Zhang et al. [2009]. The underestimates of aromatics at Shanghai and PRD after inversion (although much reduced compared to the standard model) and the exclusion of CHOCHO loss to aerosols in the model indicate that the top-down estimates of aromatics emissions from China are probably a lower limit and that real emissions are even larger.
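The emission totals quoted above are mutually consistent, as a few lines of arithmetic confirm. The variable names are illustrative; all input numbers are taken from the text.

```python
# All values in Tg per year, as quoted in the text
aromatics_bottom_up = 2.4
aromatics_top_down = 13.4
total_voc_prior = 23.2

ratio = aromatics_top_down / aromatics_bottom_up        # factor of 5-6
added = aromatics_top_down - aromatics_bottom_up        # extra aromatics
total_voc_posterior = total_voc_prior + added           # new VOC total
percent_increase = 100.0 * added / total_voc_prior      # relative change

print(f"ratio={ratio:.1f}, total={total_voc_posterior:.1f}, "
      f"increase={percent_increase:.0f}%")
```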
 The large uncertainty of reactive VOC emissions in China has been noted by some previous regional modeling studies [e.g., Chen et al., 2010; Sarwar et al., 2011; Yang et al., 2011]. The inaccurate and incomplete information for anthropogenic emission sources, e.g., their source profiles, emission factors and source activities, is a major source of uncertainty in current emission inventories [Q. Zhang et al., 2009; Chen et al., 2010]. The specific causes for the large underestimation of aromatics emissions merit further investigations.
 The consequences of such a nationwide large underestimation of aromatic VOC emissions are manifold. Sensitivity simulations show that, as a result of the increased aromatics emissions, surface concentrations of PAN over the regions with underestimated emissions increase by ∼100%. This is supported by in situ observations over the region [Liu et al., 2010; J. M. Zhang et al., 2009]. The O3 response to VOC changes is much more complex. Aromatic VOCs are the major precursors of oxygenated VOCs in Beijing, the photolysis of which provides a major primary radical source and increases O3 production [Liu et al., 2012]. Furthermore, aromatic VOC emissions become a large contributor to the organic aerosol budget over the region through the formation of secondary organic aerosols (SOA). A recent modeling study by Henze et al. estimated that the global aromatics emission of 18.8 Tg yr−1 produces SOA at a rate of 3.5 Tg yr−1. Applying this SOA production rate, the additional aromatics emissions from China, which result in a more than 50% increase of the global aromatics emissions (from 18.8 Tg yr−1 to 29.9 Tg yr−1), would lead to an over 50% increase of global aromatic SOA production, to 5.5 Tg yr−1. The increase of 2 Tg yr−1 in SOA production helps explain the model underestimation of organic aerosols found previously over China and the outflow region [Fu et al., 2011; Heald et al., 2005].
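The SOA estimate amounts to scaling production linearly with emissions, as sketched below. The assumption of a constant mass yield (SOA produced per unit aromatics emitted) is what "applying this SOA production rate" implies here; variable names are illustrative and the inputs are the figures quoted above.

```python
# Global aromatics emissions and SOA production, Tg per year (from the text)
global_emission_before = 18.8
global_emission_after = 29.9
soa_before = 3.5

# Assumed constant mass yield: SOA produced per unit of aromatics emitted
soa_yield = soa_before / global_emission_before

soa_after = soa_yield * global_emission_after
soa_increase = soa_after - soa_before
emission_increase_pct = 100.0 * (global_emission_after / global_emission_before - 1.0)

print(f"SOA after: {soa_after:.2f} Tg/yr, increase: {soa_increase:.2f} Tg/yr")
```

With these inputs the scaled production comes out close to the 5.5 Tg yr−1 quoted in the text, with small differences attributable to rounding of the inputs.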
 This work was supported by the National Science Foundation Atmospheric Chemistry Program. The authors thank the two anonymous reviewers for their constructive comments.
 The Editor thanks two anonymous reviewers for assisting with the evaluation of this paper.