A new approach to scenario analysis using simplified chemical transport models



[1] Given the computational burden of running full chemical transport models, it is highly desirable to have alternative ways to obtain fast and accurate approximations to at least some of these model outputs. We propose two methods that closely approximate the ammonia wet deposition of the Community Multiscale Air Quality (CMAQ) model, a regional-scale three-dimensional chemical transport model. The first method uses a greatly simplified version of CMAQ, called here “Tracer,” which requires one fortieth the processing time of CMAQ. The second method uses an extension of the Tracer model called “Multitracer.” Both methods make use of a CMAQ run under a reference emission scenario and are shown to provide good approximations to CMAQ outputs under different emission scenarios. The first fast approximation method proposed here requires a Tracer run under the new emission scenario while the second proposed approximation only requires a matrix multiplication between a precomputed matrix that approximates the transport in the model, obtained from a single Multitracer run, and the new emission. It will be shown that this method is not a simple source-receptor approximation of the model. As an important application of the second predictor, we propose an inverse modeling method for ammonia that makes it possible to adjust emissions by a different factor for each of the 100 subregions of the spatial domain. Testing with pseudodata yields a good match between the inverse modeled emissions and the actual emissions. Estimates of emissions using actual observations in the eastern United States show a reasonable adjustment field.

1. Introduction

[2] The current generation of regional-scale chemical transport models (CTMs) is complex and requires large computational resources for scenario and episode studies. One such model, the Community Multiscale Air Quality (CMAQ) model [Byun and Ching, 1999], is currently favored by the Environmental Protection Agency for the simulation of multiple pollutant concentration levels at urban and regional scales. The main goals of these CTMs are to simulate the physical and chemical processes that transport and transform gas and particulate pollutants emitted into the atmosphere and to assess the impact of changes in emissions on air quality. They are also used as an air quality management tool.

[3] There is an increasing interest in the emission, distribution and deposition of ammonia since it primarily reacts with sulfuric acid and nitric acid in the atmosphere forming fine particulate matter (PM2.5), designated as a criteria pollutant by the National Ambient Air Quality Standards set by the U.S. Environmental Protection Agency (USEPA). Positive association between high levels of particulate matter and adverse health effects has been noted in a number of recent studies [Peng et al., 2005; Pope, 2004; Daniels et al., 2000]. The emissions of ammonia into the atmosphere are highly uncertain [Bouwman et al., 1997]. Ammonia emission measurements are very difficult, indirect [Aneja et al., 2000; Roelle and Aneja, 2002], and sparse. Current USEPA inventories [U.S. Environmental Protection Agency, 2000] are based on annual averages that are calculated by multiplying the source abundance by emission factors published in the literature [Asman et al., 1998]. Since the primary sources of ammonia are from farm animals and seasonal agricultural practices and because emission factors are significantly affected by temperature, annual average emissions are grossly inadequate. Gilliland et al. [2003] estimated seasonal adjustments to ammonia emissions in the eastern United States by using an inverse modeling method. They considered the entire region as one source region, which means that for each month they estimated one global adjustment factor for the whole region. Their method demanded at least 3 simulations per month, and at the time of their study, each one required 2 weeks of runtime on a CRAY T3E system. These times have been reduced dramatically with newer versions of CMAQ and improved processors. As of October 2004, EPA was achieving annual continental U.S. runs in about a week. 
Nevertheless, the number of runs necessary to solve inverse problems grows rapidly as we increase the number of source regions to resolve, so the main limitation of the scope of inverse problems is still the computational burden. Therefore a fast approximation is crucial for inverse modeling purposes.

2. CMAQ, Tracer, and Multitracer Models

[4] We ran CMAQ in the eastern U.S. region with 67 × 68 square cells of 36 by 36 km and 28 vertical layers. The CMAQ model version 4.3, with carbon bond scheme (CB4) chemistry, organic and inorganic, aerosols and aqueous chemistry (CB4_ae3_aq in CMAQ terminology) was employed for making the model runs. The emissions were calculated using Sparse Matrix Operator Kernel Emissions Modeling System (SMOKE) with National Emission Inventory (NEI) 1996, generated by EPA, and the meteorology input to CMAQ was computed with the Fifth Generation Penn State/NCAR Mesoscale Model (MM5). The initial and boundary condition for the MM5 (V3.6) were generated from the 4 times a day National Centers for Environmental Prediction (NCEP)/Department of Energy Atmospheric Model Intercomparison Project II (AMIP-II) Reanalysis data set [Kistler et al., 2001]. The input data has a spatial resolution of 2.5° by 2.5° in the horizontal and 17 mandatory pressure levels. The selected physics parametrizations for these runs include Dudhia's [1989] simple ice moisture scheme; Grell's [1993] basic cumulus parametrization scheme; the medium range forecast (MRF) planetary boundary layer (PBL) scheme and a radiation scheme based on the Climate Radiation Model (CRM) and updated every 30 min. The CMAQ trace gas and aerosol boundary conditions were set to nominal continental background conditions for the respective trace gases and aerosols.

[5] Most of the runs were for the 8 day period from 26 June 1996 to 4 July 1996. We also did some additional runs for the 8 day period from 4 July 1996 to 12 July 1996 and for the 28 day period from 2 July 1996 to 30 July 1996. Ammonia wet deposition was aggregated over each period. The length of the shorter period used, 8 days, is approximately the lifetime of NHx.

[6] The Tracer model is a reduced version of CMAQ in which ammonia is the only species and it is treated as a tracer without any chemical interactions. Ammonia is allowed to undergo wet and dry deposition as gaseous ammonia. The full version of the CMAQ model converts emitted NH3 to NH4, which is partitioned into particulate ammonia (ammonium sulfate and ammonium nitrate) and removed from the atmosphere by dry and wet deposition. Ammonia dry and wet depositions are computed using the dry deposition velocities and wet deposition rates computed in the model for ammonia. The emissions and meteorology are the same as for the full version of CMAQ.

[7] The Multitracer model is a generalization of Tracer that simultaneously calculates the transport and deposition of 100 ammonia-like tracers released from 100 different aggregated surface grid locations in the model.

[8] Our computations were carried out on a Beowulf cluster with 4 computational nodes and one data server. The nodes are dual AMD MP 1800+ processors with 2 gigabytes of memory, connected via gigabit ethernet. The operating system is Debian Linux and the parallel software is Mpich 1.2.4. Under this system, the processing time for CMAQ was 35 min per day using 4 processors while the Tracer model would take 3.5 min per day on one processor (there was no gain in using more processors for the Tracer model). The Multitracer took about the same time as CMAQ using the same number of processors.
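The factor of one fortieth quoted for the Tracer model can be recovered from these timings if the comparison is made in processor-minutes per simulated day. That basis is an assumption about how the figure was obtained, not something stated explicitly in the text:

```latex
\frac{35~\text{min} \times 4~\text{processors}}{3.5~\text{min} \times 1~\text{processor}}
  = \frac{140}{3.5} = 40
```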

3. Methodology

3.1. Predicting CMAQ

[9] Our goal is to predict CMAQ ammonia wet deposition output under new emission scenarios making use of a reference CMAQ run and a few Tracer runs or a single Multitracer run. For this purpose we have analyzed the changes in CMAQ generated ammonia wet deposition using different emission scenarios.

[10] We use as the reference emission field our current best estimate using EPA's inventory, which will be referred to as So. We modified the reference field to obtain new scenarios in order to test our predictors. Three target scenarios, SσZ with σ = 0.1, 0.4, and 1, were generated by multiplying So by the exponential of a Gaussian random field. This field was simulated using a covariance function exp(−d/r), where d is the distance between cells and r is the range of the correlation (400 km). The Gaussian random field is shown in Figure 1, where we can see regions of high and low values with diameter of the order of 400 km consistent with the range of the correlation. Modifying the reference emission field with this method generates a wide range of plausible scenarios one may want to test in an inverse modeling procedure. When σ = 0.1 (σ = 0.4, σ = 1), the base emission field is changed on average about 10% (50%, 300%). Depending on the location, changes were much more extreme; for example, for σ = 1 the base emission was multiplied by factors as large as 40. Another scenario, Sp, was generated by multiplying the base emission by a plane that takes value 1 in the northern end of the region (latitude 50.216°N) and 0 in the southern end (latitude 27.125°N). Last, S1 is a completely unrelated emission field that has value 1 (mol/s) in the whole domain. Figure 2 shows the reference emission scenario So and target scenarios S0.4Z, SZ, and Sp.
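As an illustration of how such scenarios could be generated, the sketch below simulates a Gaussian random field with covariance exp(−d/r) by Cholesky factorization and multiplies a base field by its exponential. It assumes that SσZ means So·exp(σZ), which the notation suggests but the text does not spell out; the 10 × 10 toy grid, the function name `simulate_scenario`, and the flat base field are hypothetical choices for the example.

```python
import numpy as np

def simulate_scenario(base_emissions, coords_km, sigma, corr_range_km=400.0, seed=0):
    """Return base_emissions * exp(sigma * Z) and Z, where Z has covariance exp(-d/r)."""
    rng = np.random.default_rng(seed)
    # Pairwise distances between cell centres, in km.
    diff = coords_km[:, None, :] - coords_km[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    cov = np.exp(-dist / corr_range_km)
    # The Cholesky factor maps independent normals to a correlated field;
    # a tiny diagonal jitter guards against round-off.
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(len(cov)))
    z = chol @ rng.standard_normal(len(cov))
    return base_emissions * np.exp(sigma * z), z

# Toy domain: 10 x 10 cells of 36 km (the study domain is much larger).
nx = ny = 10
xs, ys = np.meshgrid(np.arange(nx) * 36.0, np.arange(ny) * 36.0)
coords = np.column_stack([xs.ravel(), ys.ravel()])
base = np.ones(nx * ny)  # hypothetical flat reference field
scenario, z = simulate_scenario(base, coords, sigma=0.4)
```

Because the perturbation is multiplicative and exponential, the scenario stays positive wherever the base emission is positive, whatever the value of σ.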

Figure 1.

Simulated Gaussian random field with covariance function exp(−d/r), where d is the distance between cells and r is the range of the correlation (400 km).

Figure 2.

(a) Reference emission scenario So and (b–d) target emission scenarios S0.4Z, SZ, and Sp. The range of the figure is set to 20 mol/s for visualization purposes, but the actual maximum is much larger than this value.

[11] If we scale the original emission by a constant factor α, the resulting CMAQ output is scaled by the same factor raised to the power of 0.7. We have tested this relationship with α taking values 0.5, 0.8, 1.5, 2, and 3. Table 1 shows the slopes of the least squares fit of CMAQ wet deposition under the scaled emission versus the base CMAQ wet deposition and, for comparison, the values of $\alpha^{0.7}$. The last column of Table 1 shows the coefficient of determination $R^2$, which is calculated using $1 - \sum_i (y_i - \alpha^{0.7} x_i)^2 / \sum_i y_i^2$. We can see that these values are all above 0.99 except for α = 3, in which case $R^2$ is 0.979. Therefore, for constant adjustments of the emissions, as long as the adjustment is within this range of 0.5–3, a very good predictor of CMAQ ammonia wet deposition is the reference CMAQ wet deposition multiplied by the adjustment factor raised to the power of 0.7:

$$C(\alpha S_o) \approx \alpha^{0.7}\, C(S_o) \qquad (1)$$

where we denote CMAQ wet deposition under emission scenario Sx as C(Sx). Motivated by this result, we could propose the following naive predictor when the new emission Sn is not a simple rescaling of So

$$C(S_n) \approx \left(\frac{S_n}{S_o}\right)^{0.7} C(S_o) \qquad (2)$$

This expression reduces to equation (1) when Sn is α times the reference emission So.
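The scaling rule and the fit statistic described above can be sketched as follows. The function names are hypothetical, the ratio of emission fields is taken elementwise (an assumption consistent with the convention stated later for division and powers), and the arrays are made-up numbers rather than model output.

```python
import numpy as np

EXPONENT = 0.7  # power-law exponent reported for CMAQ ammonia wet deposition

def scaled_prediction(c_ref, alpha):
    """Constant-factor rule: C(alpha * S_o) ~ alpha**0.7 * C(S_o)."""
    return alpha ** EXPONENT * c_ref

def r_squared(y, x, alpha):
    """1 - sum((y - alpha**0.7 * x)**2) / sum(y**2), as in the text."""
    resid = y - alpha ** EXPONENT * x
    return 1.0 - np.sum(resid ** 2) / np.sum(y ** 2)

def naive_prediction(c_ref, s_new, s_ref):
    """Cell-by-cell version of the rule; requires s_ref > 0 in every cell."""
    return c_ref * (s_new / s_ref) ** EXPONENT

c_ref = np.array([1.0, 2.0, 4.0])   # made-up reference deposition
s_ref = np.array([3.0, 1.0, 5.0])   # made-up reference emission
alpha = 2.0
from_scaling = scaled_prediction(c_ref, alpha)
from_naive = naive_prediction(c_ref, alpha * s_ref, s_ref)
```

When the new emission is a uniform rescaling, the two functions agree, which is the reduction to equation (1) noted in the text. The naive form breaks down in cells where the reference emission is zero.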

Table 1. Relationship Between CMAQ Wet Deposition With Scaled Emissions and Base Emissions

[12] A more useful approximation can be generated by examining the relationship between ammonia emission source regions and the regions calculated by the model to experience significant wet deposition. Figure 3 shows CMAQ wet deposition versus emissions where each cross represents a cell of the domain. The lack of correlation between the two variables indicates that ammonia is transported away from its source before being scavenged by precipitation. This result is consistent with the fact that most of CMAQ wet deposition comes from ammonium in aerosol form with sizes ranging between 0.1 and 2 μm, which has residence times of the order of 4 to 7 days. Thus a more comprehensive approximation to CMAQ wet deposition of ammonia can be generated by including the effect of transport in our predictors. One way of doing so is by substituting the Tracer model's generated wet deposition in equation (2) as a substitute for just the emission scenario's (So, Sn, etc.). Each Tracer calculation is an approximation of the impact of the transport history from source to sink (wet deposition in this case) of ammonia emissions, yielding an approximation for the effect of transport on calculated wet deposition. Hence our first proposed predictor of C(Sn) is

$$C_n^{(1)} = C_o \left(\frac{t_n}{t_o}\right)^{0.7} \qquad (3)$$

where tn and to are the Tracer output under emission scenarios Sn and So, respectively, and Co = C(So) corresponds to CMAQ output under the base scenario. Notice that a superscript number in parentheses indicates that the variable is a predictor, not the actual CMAQ output, and the subscript indicates the corresponding emission scenario. Once we have a base run of CMAQ (Co) and Tracer (to), we only need to run the Tracer model with the new emission scenario (Sn) in order to compute this predictor. As already mentioned, this effectively reduces the computation time by a factor of 40.
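A minimal sketch of the first predictor, assuming it has the form Co (tn/to)^0.7 with elementwise division and power. The arrays are made-up numbers; `predictor_1` and `bias` are hypothetical names. The second call shows that a constant multiplicative bias in the Tracer output cancels in the ratio.

```python
import numpy as np

def predictor_1(c_ref, t_new, t_ref, exponent=0.7):
    """First predictor: reference CMAQ output times the Tracer ratio to the power 0.7."""
    return c_ref * (t_new / t_ref) ** exponent

c_ref = np.array([0.8, 1.5, 0.3])   # made-up reference CMAQ wet deposition
t_ref = np.array([0.4, 0.9, 0.2])   # made-up Tracer output, reference scenario
t_new = np.array([0.6, 0.9, 0.1])   # made-up Tracer output, new scenario

pred = predictor_1(c_ref, t_new, t_ref)

# A Tracer model that is off by a constant factor gives the same prediction.
bias = 0.5
pred_biased = predictor_1(c_ref, bias * t_new, bias * t_ref)
```

Cells where the Tracer output does not change (the second cell here) keep their reference CMAQ value.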

Figure 3.

CMAQ ammonia wet deposition versus ammonia emission.

[13] We can further simplify our prediction procedure by exploiting the near linearity of the Tracer model. Since the Tracer model consists of transport, wet and dry deposition, it is very nearly linear in emissions, i.e., if we multiply emissions by α the resulting wet deposition is multiplied by α, except for some modest effect, which means that we do not need to rerun the Tracer model when the new emission is a scaling of the base emission. This almost linearity means that if we arrange the model calculated wet deposition and emissions into vectors, which will have length 4556 (67 × 68 cells in the domain), the tracer output can be reasonably represented as a product between a matrix T (as yet undetermined) and the vector of emissions Sn:

$$t_n \approx T S_n \qquad (4)$$

We will refer to the matrix T as the transport matrix. We can calculate T by noticing that its ith column is the wet deposition resulting from a unit emission from location i. Thus one method to calculate T is by setting emissions to 1 in one location and 0 elsewhere, running the Tracer model and using the output to read off the column of T corresponding to that location. Repeating this procedure for all locations would allow us to construct all of T's columns. However, running the Tracer model 4556 times is too time consuming to be carried out in practice. We propose aggregating the 4556 locations into 100 subregions so that the transport matrix has 100 columns, and we could then obtain an approximate version of T that we denote by $\tilde{T}$ by running the Tracer model only 100 times. Instead, we find $\tilde{T}$ by using the Multitracer model, which runs all 100 Tracer models simultaneously. Figure 4b shows the Tracer model's output versus the aggregated transport matrix times the aggregated emissions. Note that $t(S_o) \neq \tilde{T}\tilde{S}_o$ because of the aggregation and the small nonlinearity of the Tracer model.
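The column-by-column construction can be sketched with a toy linear model standing in for a Tracer run. Everything here is hypothetical: `toy_tracer` is just a matrix product, each region's unit emission is spread evenly over its cells, and aggregated emissions are region totals. The text does not say how the Multitracer run distributes emissions within a subregion, so these are assumptions made for the example.

```python
import numpy as np

def toy_tracer(emissions, transport):
    """Stand-in for a linear Tracer run: deposition = transport @ emissions."""
    return transport @ emissions

def aggregated_transport_matrix(transport, region_of_cell, n_regions):
    """One 'run' per subregion; each run fills one column of the aggregated matrix."""
    n_cells = transport.shape[1]
    t_agg = np.zeros((transport.shape[0], n_regions))
    for j in range(n_regions):
        in_region = region_of_cell == j
        unit = np.zeros(n_cells)
        unit[in_region] = 1.0 / in_region.sum()  # unit total emission, spread evenly
        t_agg[:, j] = toy_tracer(unit, transport)
    return t_agg

def aggregate_emissions(emissions, region_of_cell, n_regions):
    """Total emission in each subregion."""
    return np.bincount(region_of_cell, weights=emissions, minlength=n_regions)

rng = np.random.default_rng(1)
n_cells, n_regions = 12, 3
transport = rng.random((n_cells, n_cells))
region_of_cell = np.repeat(np.arange(n_regions), n_cells // n_regions)
t_agg = aggregated_transport_matrix(transport, region_of_cell, n_regions)

# Emissions that are uniform within each region are reproduced exactly.
uniform = np.repeat([2.0, 5.0, 1.0], n_cells // n_regions)
exact = toy_tracer(uniform, transport)
approx = t_agg @ aggregate_emissions(uniform, region_of_cell, n_regions)
```

In this toy setting the aggregated product is exact only when emissions are uniform within each subregion; any within-region variability introduces the aggregation error mentioned above.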

Figure 4.

(a) Tracer versus CMAQ under So (the least squares fit slope is 0.56, and the correlation is 0.92). (b) Transport matrix approximation versus tracer under So (the least squares fit slope is 0.88, and the correlation is 0.97).

[14] Transport or source-receptor matrices have been used in integrated models such as RAINS [Amann et al., 2004] (available at http://www.iiasa.ac.at/rains/review/review-full.pdf). Our approach differs in that we use the transport matrix only as an approximation to the simplified model and further combine it with baseline full model runs to obtain a much better approximation, as can be seen in section 4.

[15] Denoting by $\tilde{T}$ and $\tilde{S}$ the aggregated transport matrix and emission vector, our second proposed predictor of C(Sn) is

$$C_n^{(2)} = C_o \left(\frac{\tilde{T}\tilde{S}_n}{\tilde{T}\tilde{S}_o}\right)^{0.7} \qquad (5)$$

Once we have a base run of CMAQ and one Multitracer run, we only need a matrix multiplication in order to compute this predictor for any new emission scenario.

[16] In equation (5) we have two types of matrix/vector operations. One is the usual matrix-vector multiplication that returns a vector (as in $\tilde{T}\tilde{S}_n$) and the other one is a vector-vector multiplication that has to be interpreted as elementwise multiplication and returns a vector. We use no symbol to denote either operation, but it should be clear from the arguments involved which type of multiplication is being used. Division and raising to a power should always be understood as elementwise operations.
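In array code the two kinds of multiplication are written with different operators, which removes the ambiguity. The sketch below assumes the second predictor is the reference CMAQ output times the elementwise ratio of aggregated transport products raised to the power 0.7; the function name and the random inputs are hypothetical.

```python
import numpy as np

def predictor_2(c_ref, t_agg, s_agg_new, s_agg_ref, exponent=0.7):
    """Second predictor from a precomputed aggregated transport matrix."""
    dep_new = t_agg @ s_agg_new    # matrix-vector product
    dep_ref = t_agg @ s_agg_ref    # matrix-vector product
    # Division, power and the final product are all elementwise.
    return c_ref * (dep_new / dep_ref) ** exponent

rng = np.random.default_rng(2)
n_cells, n_regions = 8, 3
t_agg = rng.random((n_cells, n_regions)) + 0.1   # made-up positive transport matrix
s_agg_ref = np.array([1.0, 2.0, 3.0])            # made-up aggregated emissions
c_ref = rng.random(n_cells) + 0.5                # made-up reference CMAQ output

doubled = predictor_2(c_ref, t_agg, 2.0 * s_agg_ref, s_agg_ref)
```

Doubling every aggregated emission scales the prediction by 2^0.7 in every cell, matching the constant-factor rule.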

3.2. Inverse Modeling Emissions

[17] Let us assume that we have observations in every location of our domain. This is not realistic but, in practice, we can interpolate the observations, for example, by kriging [Cressie, 1993], to obtain values in each location. The inverse modeling problem consists of finding the emissions that originated this field assuming that the only source of error in the model is the emission field.

[18] The form of our second predictor suggests a simple method to obtain a direct estimate of the unknown emission scenario. In this case, the left hand side of equation (5) is considered to be known and equal to the observed field. We can invert this relation to obtain an estimate of the unknown emission scenario $\hat{S}_{obs}$

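$$\hat{S}_{obs} = \tilde{T}^{-1}\left[\left(\frac{C_{obs}}{C_o}\right)^{1/0.7} \tilde{T}\tilde{S}_o\right] \qquad (6)$$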

We use the caret to denote estimated value and tilde to denote aggregation. However, we do not use caret and tilde simultaneously to avoid complicated expressions.

[19] In our case, we are using an approximate transport matrix that has been aggregated into 100 columns, so it is not a square matrix. It is common in inverse modeling problems [Enting, 2002] to encounter matrices that are not full rank. Both problems can be addressed by using a generalized inverse or pseudoinverse. By the properties of the pseudoinverse [Golub and Van Loan, 1996], $\hat{S} = T^{\#}X$, where we are denoting pseudoinverse by superscript #, is a least squares solution of the problem X = TS, i.e.,

$$T^{\#}X = \arg\min_{S} \|X - TS\|^2$$

where $\|\cdot\|^2$ is the sum of the squares of the elements. When the matrix is full rank, as is the case in this study, the least squares solution is unique.
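The least squares property of the pseudoinverse is easy to check numerically. The sketch below uses a made-up tall matrix with full column rank and compares `numpy.linalg.pinv` with `numpy.linalg.lstsq`; the sizes and random values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.random((20, 5))   # tall matrix, full column rank with probability 1
X = rng.random(20)

# Pseudoinverse solution.
S_pinv = np.linalg.pinv(T) @ X

# Direct least squares solution of X = T S.
S_lstsq, *_ = np.linalg.lstsq(T, X, rcond=None)

# At the minimizer the residual is orthogonal to the columns of T.
residual = X - T @ S_pinv
normal_eq = T.T @ residual
```

For a full-rank matrix the two solutions coincide. For a rank-deficient matrix the pseudoinverse returns the minimum-norm member of the set of least squares solutions.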

[20] On the basis of equation (6), we propose the following quick inverse modeling method: (1) run CMAQ with the best initial estimate of emissions to get Co; (2) interpolate the observations to all the locations of the domain to get Cobs; (3) run the Multitracer model in order to obtain the approximate transport matrix $\tilde{T}$; (4) aggregate emissions to the 100 subregions used in the calculation of the transport matrix to get $\tilde{S}_o$; (5) calculate the new aggregated emissions estimate using

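$$\hat{S}_{obs} = \tilde{T}^{\#}\left[\left(\frac{C_{obs}}{C_o}\right)^{1/0.7} \tilde{T}\tilde{S}_o\right] \qquad (7)$$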

(6) calculate adjustment factors for each region as the ratio between the aggregated emissions estimate and the aggregated reference emissions; (7) multiply the base emission field (nonaggregated) by the correction factors to get the corrected emission field.
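Steps 4 through 7 can be sketched as below, taking the outputs of steps 1 to 3 as given arrays. The sketch assumes the estimate in step 5 is the pseudoinverse of the aggregated transport matrix applied to (Cobs/Co)^(1/0.7) times the aggregated transport product, and that aggregation means summing emissions within each subregion. The function name, the toy sizes, and the random inputs are hypothetical.

```python
import numpy as np

def inverse_model(c_ref, c_obs, t_agg, s_ref, region_of_cell, exponent=0.7):
    """Return aggregated emission estimate, per-region factors, corrected cell emissions."""
    n_regions = t_agg.shape[1]
    # Step 4: aggregate reference emissions to subregion totals.
    s_agg_ref = np.bincount(region_of_cell, weights=s_ref, minlength=n_regions)
    # Step 5: invert the second predictor with the pseudoinverse.
    target = (c_obs / c_ref) ** (1.0 / exponent) * (t_agg @ s_agg_ref)
    s_agg_hat = np.linalg.pinv(t_agg) @ target
    # Step 6: one adjustment factor per subregion.
    factors = s_agg_hat / s_agg_ref
    # Step 7: apply each region's factor to its (nonaggregated) cells.
    s_corrected = s_ref * factors[region_of_cell]
    return s_agg_hat, factors, s_corrected

rng = np.random.default_rng(3)
n_cells, n_regions = 12, 3
region_of_cell = np.repeat(np.arange(n_regions), n_cells // n_regions)
t_agg = rng.random((n_cells, n_regions)) + 0.1   # made-up transport matrix
s_ref = rng.random(n_cells) + 0.5                # made-up reference emissions
c_ref = rng.random(n_cells) + 0.5                # made-up reference CMAQ output

# Pseudo-observations consistent with emissions that are 1.5 times the reference.
c_obs = 1.5 ** 0.7 * c_ref
s_agg_hat, factors, s_corrected = inverse_model(c_ref, c_obs, t_agg, s_ref, region_of_cell)
```

With pseudo-observations generated by a uniform 1.5-fold emission increase, the recovered factors are 1.5 in every subregion. Cells where the reference CMAQ output is near zero make the ratio in step 5 unstable.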

[21] In order to avoid the artificial discontinuities introduced by the subdivision in subregions, we interpolate (bilinearly) the adjustment factors to get a smoother field. In essence, we get 100 correction factors for the emission which is a substantial gain from previous methods that could only be applied in practice to get one global correction factor for the whole region [Gilliland et al., 2003].
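A bilinear interpolation of subregion factors to model cells can be sketched with two passes of one-dimensional linear interpolation. The 10 × 10 arrangement of subregions, the positions of their centres, and the factor values are hypothetical. Cells outside the outermost centres take the nearest edge value here, which is one possible choice rather than the treatment used in the study.

```python
import numpy as np

def bilinear(factors, region_x, region_y, cell_x, cell_y):
    """Interpolate factors[(y, x)] given at region centres to cell centres."""
    # Pass 1: interpolate along x for each row of region centres.
    along_x = np.array([np.interp(cell_x, region_x, row) for row in factors])
    # Pass 2: interpolate along y for each cell column.
    return np.array([np.interp(cell_y, region_y, col) for col in along_x.T]).T

rng = np.random.default_rng(4)
region_x = np.linspace(3.0, 63.0, 10)      # hypothetical region centres (cell units)
region_y = np.linspace(3.0, 64.0, 10)
factors = rng.uniform(0.2, 1.2, size=(10, 10))

cell_x = np.arange(67.0)                   # 67 x 68 model cells
cell_y = np.arange(68.0)
smooth = bilinear(factors, region_x, region_y, cell_x, cell_y)
```

The interpolated field passes through the subregion values at the centres and never leaves the range spanned by the original factors.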

4. Results

4.1. CMAQ Versus Tracer

[22] CMAQ ammonia wet deposition constitutes about half of the total deposition, whereas Tracer's ammonia wet deposition only accounts for 25% of the total. As mentioned earlier, most of the CMAQ wet deposition comes from aerosol (NH4) and most of the dry deposition comes from gas phase (NH3). Since there is no aerosol formation in Tracer model all the deposition is in the form of NH3. It is reasonable then that in the Tracer model a larger fraction of ammonia is settled by dry deposition.

[23] Figure 4a shows Tracer output versus CMAQ output under the reference emission scenario So. The correlation is relatively high (0.92) but the slope is 0.56. This bias is consistent with the proportion of wet deposition in both models. The high correlation between CMAQ output and Tracer output indicates that gas phase ammonia in the Tracer has followed a similar transport path as the aerosol in the full CMAQ. This fact supports the idea that the Tracer output incorporates the right transport information into the predictors. Although the Tracer model does not convert NH3 into NH4, the gas phase ammonia is dissolved in water and taken up by clouds at roughly the same rate as aerosols are. The gas phase constituents of the Tracer models do not behave chemically like ammonia but have the right transport and wet deposition properties for our purpose.

[24] We notice that the Tracer's ammonia wet deposition is a biased approximation to full CMAQ. Nevertheless, our predictors are not affected by this bias since we only use the ratio between Tracer (or Multitracer) model outputs under the new and base emission scenarios. Our predictors combine this imperfect approximation with the base CMAQ output in a way that exploits the information contained in each component and achieve much better approximation than just directly using Tracer output, as will be shown below.

[25] Figure 4b shows Tracer output versus the approximation obtained by multiplying the aggregated transport matrix by the aggregated emission ($\tilde{T}\tilde{S}_o$). The slope is 0.88 and the correlation is 0.97.

4.2. Predicting CMAQ

[26] In order to assess the performance of the predictors we use three measures of performance: root-mean-square error (RMSE), slope and median relative error (MRE). The RMSE is the square root of the average squared difference between the predictor and the CMAQ output. The slope is calculated by using least squares regression between the predictand and the predictor, with intercept fixed to 0. Values larger (smaller) than one indicate overprediction (underprediction). The MRE is calculated by taking the difference between the predictor and the predictand, dividing it by the predictand and taking the median value. The RMSE penalizes large absolute errors so in general it gives more weight to larger values of predictands, whereas the MRE gives similar weights to large and small values of predictands. MRE captures errors that are usually not easily detected by looking at scatter plots, in which large relative deviation of small values are not apparent.
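The three measures can be written compactly as follows. Two details are assumptions: the slope regresses the predictor on the predictand through the origin, so that values above one mean overprediction, and the relative error is taken in absolute value before the median. The function names and the data are made up.

```python
import numpy as np

def rmse(pred, target):
    """Root-mean-square error between predictor and predictand."""
    return np.sqrt(np.mean((pred - target) ** 2))

def slope_through_origin(pred, target):
    """Least squares slope of pred versus target with zero intercept."""
    return np.sum(pred * target) / np.sum(target ** 2)

def median_relative_error(pred, target):
    """Median of |pred - target| / |target| (absolute value is an assumption)."""
    return np.median(np.abs(pred - target) / np.abs(target))

target = np.array([0.01, 0.1, 1.0, 2.0])   # made-up CMAQ output
pred = 1.1 * target                        # uniform 10% overprediction

err = rmse(pred, target)
slope = slope_through_origin(pred, target)
mre = median_relative_error(pred, target)
```

In this example the RMSE is dominated by the two largest cells, whereas every cell contributes the same 10% to the MRE, which is the contrast drawn in the paragraph above.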

[27] Figure 5 shows, with black crosses, the predictor C(1) from equation (3) versus the actual CMAQ output under four different target emission scenarios S0.4Z, SZ, Sp and S1. We show the CMAQ output under So with gray circles on the same figure to give an idea of how different the reference and target scenarios are. For σ = 0.4, which corresponds to changes of the order of 50% relative to the base emissions, the predictor C(1) does a very good job since all the points are aligned around the unit slope line. For σ = 1 the changes relative to the base emissions are more extreme so the predictor is more dispersed around the unit slope line. Nevertheless, we get a relative error of less than 6%. For scenario Sp the performance is good with a median relative error of 12% and a slope of 1.08, which indicates a slight overprediction. For S1, which is completely unrelated to the base emission, we see that there is some overprediction with a slope of 1.29 but it is still substantially improved compared to the base CMAQ output. If we compare Figures 5b and 5c, the predictor under scenario SZ seems to be performing much worse than under scenario Sp. This is true if performance is measured by RMSE, which is 0.043 for the former case and 0.010 for the latter case. However, the median relative error under scenario SZ is 5.7%, less than half the MRE under scenario Sp, which is 12%.

Figure 5.

Predictor C(1) (black crosses) and reference CMAQ output (gray circles) versus CMAQ output under (a) S0.4Z, (b) SZ, (c) Sp, and (d) S1.

[28] Figure 6 shows the predictor C(2) from equation (5) versus the actual CMAQ output under the same target scenarios as in Figure 5. We can see that the errors are a bit larger than for predictor C(1), but the overall performance is comparable.

Figure 6.

Predictor C(2) (black crosses) and reference CMAQ output (gray circles) versus CMAQ output under (a) S0.4Z, (b) SZ, (c) Sp, and (d) S1.

[29] Table 2 summarizes various measures of performance of the predictors. The first three columns of Table 2 show the summary of the performance of the first predictor C(1) for scenarios S0.1Z, S0.4Z, SZ, Sp, and S1. The following three columns correspond to predictor C(2) and the last three columns show the difference between the CMAQ output under the target scenario and the CMAQ output under the base scenario. The latter serve as reference scales for assessing the performance of the predictors. They also serve as measures of the difference between the target and reference scenarios.

Table 2. Performance Measures of Predictors 1 and 2 and Difference Between Reference and Target CMAQ Output (a)

Column groups (RMSE, slope, and MRE under each): Predictor C(1) | Predictor C(2) | C(So)

(a) RMSE, root-mean-square error; MRE, median relative error.

[30] The first three rows show the performance measures of predictor C(1) under scenarios SσZ with σ = 0.1, 0.4, and 1. The RMSE increases from 0.001 to 0.043 as σ increases from 0.1 to 1. The slopes have the ideal value of 1 for σ = 0.1, and 0.4. For σ = 1 the slope is 1.05, a slight overprediction that can be seen in Figure 5b. The MRE also increases from 0.4% to 5.7%.

[31] Since the performance of the predictor depends on the difference between the base and target emissions, it makes sense to compare the prediction error with the difference between base and target CMAQ outputs. In all 5 scenarios we see a reduction by a factor of 6 or more in the MRE of the predictor C(1) compared to the base CMAQ output. The slopes are improved from 0.98 to 1, 0.87 to 1, 0.61 to 1.05, 1.72 to 1.08, and 3.25 to 1.29. The RMSE is improved by factors of 6.0, 4.4, 2.5, 7.9, and 8.1.

[32] For predictor C(2) the improvements are slightly less dramatic but overall it also performs very well. The RMSEs are improved by factors of 3.0, 3.9, 2.4, 7.9, and 6.1. The slopes improved from 0.98 to 1.00, 0.87 to 1.00, 0.61 to 1.06, 1.71 to 1.07, and 3.25 to 1.37. The MREs are improved by factors of 3.7, 4.2, 4.0, 6.8, and 7.9.

[33] All the computations in this subsection and the following one were done for the period from 26 June 1996 to 4 July 1996. Some runs were done for the period from 4 July 1996 to 12 July 1996 and no substantial change in the results was found.

4.3. Inverse Modeling With Simulated Observations

[34] In this section, we consider the CMAQ outputs to be the “observed” ammonia wet deposition and the emissions to be unknown. We use equation (7) to get an estimate of the emission field. Figure 7 shows the estimated emissions (black crosses) and the base emissions (gray circles) versus the target emissions for scenarios SσZ with σ = 0.4 and σ = 1, Sp, and S1. In all four cases we see an improvement in the slope and dispersion of the new estimated emissions as compared to the initial emissions $\tilde{S}_o$. Table 3 shows the RMSE, slope and correlation of these estimates and the initial emission scenario compared to the target emission fields. The first 3 columns correspond to the estimated emission fields relative to the actual target emission fields and the second 3 columns correspond to the initial emission field So relative to the target emission fields. The RMSEs, slopes, and correlations of the estimated emission fields are consistently better than the initial field, which indicates that the inverse modeling method proposed should be useful for improving the emission fields. The RMSEs are reduced by factors of 2.3, 2.4, 1.9, 5.3, and 4.5 for the emission fields S0.1Z, S0.4Z, SZ, Sp, and S1, respectively. In all 5 scenarios the slopes are improved substantially. The correlations are also improved from 0.99 to 1, 0.89 to 0.98, 0.53 to 0.87, 0.85 to 0.96 and 0.04 to 0.43. In this subsection we compare only the aggregated emission fields; that is, we do not apply steps 6 and 7 from the proposed method because the performance of the method is well represented by the aggregated emissions.

Figure 7.

Estimated emissions (black crosses) versus actual emissions for (a) S0.4Z, (b) S1Z, (c) Sp, and (d) S1. Initial emission So is included for comparison (gray circles).

Table 3. Summary Statistics of Estimated Emission Fields and Base Emission Field Compared to the Target Field (a)

Column groups (RMSE, slope, and Corr under each): $\hat{S}_n$ Versus Sn | So Versus Sn

(a) Corr, correlation.

4.4. Inverse Modeling With Actual Observations

[35] We use NH3 monthly wet deposition concentration data from the National Atmospheric Deposition Program National Trends Network (NADP) (available at http://nadp.sws.uiuc.edu). We have 63 sites within our domain with valid observations for the period of 2 July 1996 to 30 July 1996. We have run CMAQ for the same period using the base emission So and aggregated the ammonia wet deposition for the whole period. We have also run the multitracer model in order to obtain the transport matrix for the same period of time. Figure 8 shows the observed wet deposition versus the CMAQ wet deposition at these sites. The straight line has the least squares regression slope, which is 0.29. The correlation between observations and CMAQ is only 0.27 and the modeled depositions are generally much higher than the observed depositions. This poor agreement suggests that a method based on assuming that emission errors are the main source of the discrepancies between CMAQ and observations may have problems. Nevertheless, as an illustration of our method, we apply the inverse modeling scheme, keeping in mind that one would want to reduce other sources of error (e.g., meteorology, especially rainfall amount) before one could reliably use these estimated adjustments in practice.

Figure 8.

Observed ammonia wet deposition concentration (NADP) versus CMAQ wet deposition concentration for the period from 2 July 1996 to 30 July 1996. The slope of the straight line is 0.29.

[36] We use concentrations instead of depositions because the former are less sensitive to errors in precipitation [Gilliland et al., 2003; Stein et al., 1993; Styer and Stein, 1992]. This has the same effect as adjusting the wet deposition by the ratio of actual and model precipitation as done by Yarwood et al. [2003].

[37] In order to interpolate the observations we modeled the observation as a constant mean plus a stationary and isotropic Gaussian random field characterized by a covariance function in the Matérn class [Stein, 1999] plus independent observation errors. We estimated the parameters of the observed process by maximizing the restricted likelihood [Stein, 1999].

[38] When we calculated the estimated emissions with equation (7), we ran into problems because the CMAQ wet concentration, which appears in the denominator, can take values very close to zero. As a quick fix, we interpolated the log of the ratio between the observations and the CMAQ output instead of the observations themselves. An additional benefit of this approach is that the log of the ratio between observed and CMAQ concentrations has a simpler correlation structure than the observation field itself, so it is better represented by a stationary and isotropic Gaussian random field [Jun and Stein, 2004]. In this modified method, we use the CMAQ output only at the observation sites. This makes sense because we do not have enough data to get good estimates of the ratio between observation and CMAQ away from the observation sites. Figure 9 shows the estimated emission field versus the base emission field; the least squares line has a slope of 0.27, which is close to the slope of the least squares fit of observation versus CMAQ. The overall average correction is therefore close to what we would expect. We did get two regions where the new emissions were negative, but their magnitudes were small (less than 0.02 mol/s) and they corresponded to regions where the base emissions were close to 0. Figure 10 shows the interpolated values of the ratio between observed wet deposition and CMAQ output (Figure 10a) and the adjustment factors for emissions interpolated to the 4556 cells of the domain (Figure 10b). The ratio between observed and CMAQ depositions shows very small values in the southeast; the model is overpredicting by a factor of 3 to 4 in this region and by a factor of at least 2 in the remaining regions. In Figure 10b the regions where the reference emissions were close to zero are represented in white. The adjustment field has a pattern similar to that of the ratio between observation and CMAQ, with a shift to the west. This is reasonable since, in order to compensate for the small ratio between observation and CMAQ in the lower left corner of the figure, one would need to decrease the emissions upwind.
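The log-ratio idea can be sketched as follows. This is only an illustration: inverse distance weighting stands in for the Gaussian random field interpolation actually used, and the function names and the `power` parameter are hypothetical. The sketch also assumes strictly positive observed and modeled values at the sites.

```python
import math

def log_ratios(observed, modeled):
    """Log of observed/modeled at each site (both assumed strictly positive)."""
    return [math.log(o / m) for o, m in zip(observed, modeled)]

def interpolate_ratio(sites, site_log_ratios, target, power=2.0):
    """Interpolate the log ratio to `target`, then back-transform to a ratio.

    Inverse distance weighting is a simple stand-in for the kriging-type
    interpolation described in the text.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for site, value in zip(sites, site_log_ratios):
        d = math.dist(site, target)
        if d == 0.0:
            # The target coincides with a site: return that site's ratio.
            return math.exp(value)
        w = d ** -power
        weighted_sum += w * value
        weight_total += w
    # Averaging in log space keeps the back-transformed ratio positive.
    return math.exp(weighted_sum / weight_total)
```

Because the interpolation acts on the log of the ratio, the interpolated ratio is positive everywhere, and the model output is needed only at the observation sites.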

Figure 9.

Inverse modeled ammonia emission versus initial emission estimate S0. These are aggregated values in each of the 100 subregions. The slope of the straight line is 0.27.

Figure 10.

(a) Interpolated ratio between observed and CMAQ ammonia wet deposition concentration; (b) adjustment factors interpolated bilinearly. The regions where the base emissions were close to zero are shown in white.

[39] As far as we are aware, no previous study has calculated emission adjustments with spatial variation, as is done here. Gilliland et al. [2003] report overall adjustment factors for the whole eastern U.S. region for some of the months in 1990. Their adjustment factor for July 1990 was close to 1, whereas our adjustment factor for July 1996 is around 0.27, the least squares regression slope of the estimated emission field versus the base emission field. This discrepancy would be smaller if we applied a 15% upward adjustment to the observed ammonia wet deposition, as done by Gilliland et al. [2003] on the basis of comparisons of daily versus weekly sampled ammonia wet deposition data at 4 sites in the NADP network [Butler and Likens, 1998]. Another difference between their runs and ours is that we used the 1996 National Emission Inventory instead of the 2000 inventory. Further study is needed to identify the source of the substantial discrepancy.

5. Discussion

[40] We have proposed two fast methods to approximate CMAQ ammonia wet deposition under new emission scenarios when we have CMAQ output under a reference emission scenario. The first method requires one fortieth the processing time of CMAQ. The second method requires computing a transport matrix once, which requires roughly the same time as a CMAQ run. After that, a simple matrix multiplication between the matrix and any new emission vector gives good approximations to CMAQ under new emission scenarios. These methods allow us to solve forward and inverse problems that were in practice not tractable because of the computational burden.
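The cost argument for the second method rests on the fact that, once the transport matrix is available, each new scenario requires only a matrix-vector product. A minimal sketch of that step follows; the name `apply_transport` and the toy dimensions are hypothetical, and the way this product is combined with the reference CMAQ output to form the predictor is defined earlier in the paper and is not reproduced here.

```python
def apply_transport(transport, emissions):
    """Multiply a precomputed transport matrix (one row per receptor cell,
    one column per emission subregion) by an emission vector."""
    if any(len(row) != len(emissions) for row in transport):
        raise ValueError("each matrix row must have one entry per subregion")
    return [sum(t * e for t, e in zip(row, emissions)) for row in transport]

# Toy example: 3 receptor cells and 2 emission subregions (made-up numbers).
toy_transport = [[0.5, 0.1],
                 [0.2, 0.4],
                 [0.0, 0.3]]
toy_emissions = [10.0, 20.0]
toy_result = apply_transport(toy_transport, toy_emissions)
```

With 100 subregions and 4556 cells, such a product is negligible in cost compared with a full model run.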

[41] We have found that the performance of the predictors depends on the difference between the base and target emission fields. The closer the scenarios, the better the predictors work. However, even for emission scenarios that are quite different from the base scenario, we get predictions that are close to the actual CMAQ output. This means that our predictors can be safely used for inverse modeling purposes. To get more accurate results, we can use our predictors to search quickly in the space of emission scenarios not too far from the base case. When the new and base emission fields are judged to be sufficiently different, one can execute a new CMAQ run, use its output as the new reference, and continue the process.
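The search-and-rebase strategy just described can be sketched as a simple loop. Everything specific in this sketch is an assumption made for illustration: the relative Euclidean distance, the threshold value, and the callables `run_full_model` and `fast_predict` are hypothetical placeholders, since the text does not fix a criterion for when two emission fields are "sufficiently different."

```python
import math

def relative_distance(new, base):
    """Relative Euclidean distance between two emission vectors (assumed metric)."""
    num = math.sqrt(sum((n - b) ** 2 for n, b in zip(new, base)))
    den = math.sqrt(sum(b * b for b in base))
    return num / den

def evaluate_scenarios(scenarios, base_emission, run_full_model, fast_predict,
                       threshold=0.5):
    """Evaluate scenarios with the fast predictor, rerunning the full model
    and rebasing whenever a scenario drifts too far from the reference."""
    reference_emission = base_emission
    reference_output = run_full_model(base_emission)
    full_runs = 1
    outputs = []
    for emission in scenarios:
        if relative_distance(emission, reference_emission) > threshold:
            # Too far from the reference: pay for a full run and rebase.
            reference_emission = emission
            reference_output = run_full_model(emission)
            full_runs += 1
            outputs.append(reference_output)
        else:
            # Close enough: use the cheap approximation.
            outputs.append(
                fast_predict(emission, reference_emission, reference_output))
    return outputs, full_runs
```

The threshold trades accuracy against the number of full model runs; a suitable value would have to be chosen from experiments such as those reported above.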

[42] On the basis of the form of predictor C(2), we have proposed a new approach to inverse modeling ammonia emissions. This method requires running the Multitracer model once in order to get emission corrections for each of the 100 subregions of the domain. This allows us to obtain spatially varying corrected emission fields with resolution of the order of 500 km. We generated pseudodata by running CMAQ under different emission fields to test our method. We found that the performance depends on the difference between the base emission and the target emission. Our method gave a substantially improved emission field compared to the base emission field.

[43] We have applied our inverse modeling method using observed NH3 wet deposition data. Since the match between observation and model is very poor, we do not trust the actual values of the estimated emissions. Nevertheless, the relationship between the deposition and the estimated emission adjustments showed a reasonable westward shift, reflecting the transport of ammonia between sources and sinks, as well as the right overall average correction factor.

[44] The ill-posedness that affects most inverse problems was addressed here by two features of the method. First, the use of an aggregated transport matrix forces the calculated emissions to be aggregated and, as a consequence, smoother than what one would get with a separate adjustment for each of the 4556 cells of the domain. We aggregated the data because of resource constraints, but this aggregation had the added benefit of smoothing the adjustment field. The second feature is the interpolation of the ratio between observation and CMAQ output, which also has a smoothing effect. Further study on how to regularize adjustment factors is needed.

[45] A forward problem that can be addressed with a large reduction in computation time is the comparison between different ammonia inventories. For example, the CMU Ammonia Emission Inventory provides an alternative ammonia inventory for the continental United States. Using our second predictor, one could check which of the two inventories, CMU or EPA (the one used for our runs), matches the observations better. In practice, one would need much better agreement between CMAQ and observations in order to trust the result of this comparison. This may be the case if the aggregation period is much longer than a few weeks. We will pursue this work in the future.

[46] Extension of our results to wet deposition of other species should be possible as long as the nonlinearity is not severe and one chooses the right temporal aggregation scale. A good preliminary test of whether the approximation method can be applied or not is to compare the output under two emission scenarios that only differ by a constant factor and check how much information the output under one scenario has about the output under the other scenario.
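One possible way to carry out the preliminary test described above is sketched here. The squared correlation is used as a stand-in for "how much information" one output carries about the other; this choice of measure, and the function name `r_squared`, are assumptions for illustration rather than the criterion used in this paper.

```python
def r_squared(output_a, output_b):
    """Squared correlation between model outputs under two emission scenarios
    that differ only by a constant factor. Values near 1 suggest that the
    response is close enough to linear for the approximation to apply."""
    n = len(output_a)
    mean_a = sum(output_a) / n
    mean_b = sum(output_b) / n
    s_ab = sum((a - mean_a) * (b - mean_b) for a, b in zip(output_a, output_b))
    s_aa = sum((a - mean_a) ** 2 for a in output_a)
    s_bb = sum((b - mean_b) ** 2 for b in output_b)
    return s_ab * s_ab / (s_aa * s_bb)
```

In use, `output_a` and `output_b` would hold the temporally aggregated deposition in each cell under the two scenarios; repeating the check at several aggregation periods would indicate which temporal scale is adequate.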


[47] We thank Alexis Zubrow for his invaluable help in running CMAQ, for providing preprocessed MM5 and SMOKE data, and for developing software that facilitated the analysis of output data. The authors would also like to thank Alice Gilliland for helpful comments and discussions. The research described herein has been funded wholly or in part by the United States Environmental Protection Agency through STAR Cooperative Agreement R-82940201-0 to the University of Chicago; it has not been subjected to the Agency's required peer and policy review and therefore does not necessarily reflect the views of the Agency, and no official endorsement should be inferred.