Journal of Geophysical Research: Atmospheres

Comparison of Canadian air quality forecast models with tropospheric ozone profile measurements above midlatitude North America during the IONS/ICARTT campaign: Evidence for stratospheric input

Abstract

[1] During July and August, 2004, balloon-borne ozonesondes were released daily at 12 sites in the eastern USA and Canada, producing the largest single set of free tropospheric ozone measurements ever compiled for this region. At the same time, a number of air quality forecast models were run daily as part of a larger field experiment. In this paper, we compare these ozonesonde profiles with predicted ozone profiles from several versions of two of these forecast models, the Environment Canada CHRONOS and AURAMS models. We find that the models show considerable skill at predicting ozone in the planetary boundary layer and immediately above. Individual station biases are variable, but often small. Standard deviations of observation-forecast differences are large, however. Ozone variability in the models is somewhat higher than observed. Most strikingly, none of the model versions is able to reproduce the typical tropospheric ozone profile of increasing mixing ratio with altitude. Results from a sensitivity test suggest that the form of the ozone lateral boundary condition used by all model versions contributes significantly to the large ozone underpredictions in the middle and upper troposphere. The discrepancy could be reduced further by adding a downward flux of ozone from the model lid and by accounting for in situ production of ozone from lightning-generated NOx.

1. Introduction

[2] Ozone plays a major role in the chemical and radiative balance of the troposphere, controlling the oxidizing capacity of the lower atmosphere and also acting as an important greenhouse gas. It is also a principal indicator of air quality (AQ), and ozone in association with particulate matter in the lower troposphere has implications for human health. The Canadian AQ forecast models CHRONOS (Canadian Hemispheric and Regional Ozone and NOx System) and AURAMS (A Unified Regional Air-quality Modeling System) have been developed by Environment Canada (EC) in order to understand better the atmospheric processes governing air quality, to evaluate different possible AQ management options, and to provide public forecasts of air quality in the short (48-hour) term. Given these different applications, it is important to characterize model performance for a range of model chemical species, geographic locations, seasons, and heights. However, evaluation of model forecasts has been conducted to date primarily with surface measurements [e.g., Sirois et al., 1999; Gong et al., 2006].

[3] During the ICARTT (International Consortium for Atmospheric Research on Transport and Transformation) field campaign (1 July to 15 August 2004), EC, NASA, NOAA, and several U.S. universities pooled resources to release 275 ozonesondes from a dozen sites across the eastern USA and Canada under the IONS-04 (INTEX Ozonesonde Network Study 2004) program (see overview by Thompson et al. [2007a]). At the same time, daily 48-hour forecast runs were performed with both CHRONOS and AURAMS during ICARTT. The model forecasts provided guidance for planning EC aircraft operations during the field campaign, and they were also submitted to NOAA as part of an evaluation of real-time AQ forecast models that was a subcomponent of ICARTT [McKeen et al., 2005, 2007]. The IONS measurements present an unprecedented opportunity to compare vertical ozone distributions predicted by the two AQ models with time- and space-resolved ozonesonde data during the most photochemically active part of the year.

[4] This comparison reveals that in numerous cases the models perform well, apparently reproducing surface boundary layer ozone production and loss, concentration gradients, and pollution plumes with some skill. It is also immediately apparent from this comparison that the models fare poorly in the middle and upper troposphere, underpredicting ozone by as much as 90%. The magnitude of the discrepancy is surprising, and seems far too large to be explained by deficiencies in either the chemistry of the models (which is fairly comprehensive, and has been extensively tested in field studies) or the dynamics (as the models are driven by the Canadian operational weather forecast model). However, both models are limited area (regional) models and may therefore be affected by the imposed chemical boundary conditions, either lateral or vertical. In the remainder of this paper, the comparison methodology and results are described, the effects of these boundary conditions are explored, and the possible importance of free tropospheric ozone to surface concentrations is briefly discussed.

2. Data Set and Model Simulations

2.1. Ozonesonde Profiles

[5] During the IONS-04 campaign ozonesonde profile data were collected at the 12 sites described in Table 1. Site locations are also indicated in Figure 1. At all sites electrochemical concentration cell (ECC) ozonesondes were used, either the 2Z model manufactured by EnSci Corp. or the 6A model manufactured by Science Pump, with some variation in concentration of the KI sensing solution and of its phosphate buffer. The maximum variation in tropospheric response resulting from these differences is likely of the order of 2–3% [Smit et al., 2007] and so is of minor importance for the purposes of this comparison. ECC ozonesondes have a precision of about 5% and an absolute accuracy of about 10% in the troposphere [World Climate Research Programme, 1998; Kerr et al., 1994; Thompson et al., 2007c; Smit et al., 2007]. Data are typically reported at 10-s intervals, and the balloon ascent rate is about 4–5 m s−1. As the ozone sensor has a response with an exponential time constant of about 20 s, this rapid ascent rate can lead to some distortion of the profile that may be important to consider here for sharp vertical transitions in the planetary boundary layer (PBL). Data have not been corrected for this effect. Sonde releases were generally at the same time of day at each site, but there was some variation in release time between sites (generally midafternoon, but in some cases at synoptic times). Sounding frequency varied from daily (at four sites) to as little as weekly (one site).
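The profile distortion described above can be illustrated with a first-order lag. The following is a minimal sketch, assuming the sensor behaves as a simple exponential filter with the 20-s time constant quoted in the text; the step profile (20 ppbv below 300 m, 60 ppbv above) and the function name `lagged_response` are hypothetical and serve only as illustration.

```python
import math

def lagged_response(true_values, dt_s, tau_s):
    """Apply a first-order (exponential) lag to a series of true values.

    Each sample relaxes toward the current true value by the fraction
    1 - exp(-dt/tau), the exact response of a first-order sensor to an
    input held constant over one sampling interval.
    """
    alpha = 1.0 - math.exp(-dt_s / tau_s)
    measured = [true_values[0]]
    for value in true_values[1:]:
        measured.append(measured[-1] + alpha * (value - measured[-1]))
    return measured

ASCENT_RATE_M_S = 5.0  # upper end of the 4-5 m/s range quoted in the text
TAU_S = 20.0           # sensor time constant quoted in the text
DT_S = 10.0            # reporting interval quoted in the text

# Hypothetical step profile: 20 ppbv below 300 m, 60 ppbv at and above 300 m.
altitudes_m = [i * ASCENT_RATE_M_S * DT_S for i in range(20)]
true_o3 = [20.0 if z < 300.0 else 60.0 for z in altitudes_m]
sonde_o3 = lagged_response(true_o3, DT_S, TAU_S)

# Vertical e-folding length of the smoothing: ascent rate times time constant.
smoothing_length_m = ASCENT_RATE_M_S * TAU_S
```

Under these assumptions the smoothing length is of the order of 100 m, so a sharp transition appears spread over a few hundred metres in the reported profile.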

Figure 1.

Location of 12 IONS ozonesonde sounding sites relative to the AURAMS and CHRONOS model domains used in this study. The ozonesonde site numbers correspond to the site list given in Table 1. Note that the Boulder (3) and Trinidad Head (10) sites are outside of the AURAMS domain.

Table 1. INTEX Ozonesonde Network Study Sites for 1 July to 15 August 2004 Study Period
| Site Number | Sounding Site | Location | Altitude, m | Number of Profiles | Mean Release Time, UTC | Mean Release Time, LST |
|---|---|---|---|---|---|---|
| 1 | Ronald H. Brown research vessel | Gulf of Maine (∼43.27°N, 69.70°W) | 0 | 33 | 1500 | 1000 |
| 2 | Beltsville, MD, USA | 39.04°N, 76.52°W | 24 | 8 | 1400 | 0900 |
| 3 | Boulder, CO, USA | 40.30°N, 105.20°W | 1743 | 7 | 1700 | 1000 |
| 4 | Egbert, ON, CAN | 44.23°N, 79.78°W | 251 | 5 | 1100 | 0600 |
| 5 | Houston, TX, USA | 29.72°N, 95.40°W | 19 | 25 | 1900 | 1300 |
| 6 | Huntsville, AL, USA | 35.28°N, 86.58°W | 196 | 14 | 1900 | 1300 |
| 7 | Narragansett, RI, USA | 41.49°N, 71.42°W | 21 | 39 | 1800 | 1300 |
| 8 | Pellston, MI, USA | 45.57°N, 84.68°W | 235 | 38 | 1800 | 1300 |
| 9 | Sable I., NS, CAN | 43.93°N, 60.01°W | 4 | 33 | 2300 | 1900 |
| 10 | Trinidad Head, CA, USA | 40.80°N, 124.15°W | 20 | 40 | 1800 | 1000 |
| 11 | Wallops I., VA, USA | 37.85°N, 75.50°W | 13 | 18 | 1700 | 1200 |
| 12 | Yarmouth, NS, CAN | 43.87°N, 66.12°W | 9 | 15 | 1700 | 1300 |

2.2. Model Descriptions

[6] CHRONOS is a chemical transport model (CTM) that was developed originally by EC to provide guidance to Canadian policymakers on managing photochemical oxidants. More recently it has been used operationally by EC to issue short-term (48-hour) public forecasts of ozone (since 2001) and PM2.5 (since 2004) concentrations. The CHRONOS operational domain covers most of the North American continent (see Figure 1). In the horizontal a 350 × 250 grid with 21-km grid cell spacing is used on a polar-stereographic map projection; in the vertical 24 Gal-Chen terrain-following levels are used with a top at 6 km.

[7] Both CHRONOS and AURAMS are off-line CTMs that are driven by the Canadian operational weather forecast model, GEM (Global Environmental Multiscale model). GEM is a nonhydrostatic, two-time-level, semi-implicit, semi-Lagrangian model [Côté et al., 1998a, 1998b]. For AQ applications, meteorological fields from a high-resolution regional window positioned over the AQ modeling domain are stored at the frequency required by the CTM.

[8] The advection scheme used in CHRONOS to treat tracer transport is a nonoscillatory, semi-Lagrangian scheme [Pudykiewicz et al., 1997; Sirois et al., 1999]. Vertical diffusion is treated via a turbulence kinetic energy closure scheme using a first-order solver. The ADOM-II gas phase chemistry mechanism used by CHRONOS considers 114 chemical reactions and 47 species [Pudykiewicz et al., 1997]. Dry deposition of gases is based on a resistance parameterization with the dry deposition velocity of each dry-depositing species parameterized as a weighted combination of two master species, SO2 and O3: 13 gaseous species are assumed to dry deposit [Zhang et al., 2002]. A simple two-section particle size distribution (diameter ranges of 0–2.5 μm and 2.5–10 μm) is employed to represent particulate matter (PM). Secondary organic aerosol formation is parameterized on the basis of a scheme proposed by Pandis et al. [1992]. Treatment of size-dependent particle dry deposition and sedimentation is based on Zhang et al. [2001]. For inorganic heterogeneous chemistry (gas-particle partitioning of H2SO4, HNO3, and NH3), a numerically efficient and stable code that was developed for AURAMS, based on the ISORROPIA algorithms, is used [Makar et al., 2003a]. However, aqueous phase chemistry is not considered.

[9] CHRONOS was also used as the starting point for AURAMS, a more comprehensive AQ model designed to treat photochemical oxidants and acid deposition, and PM. As a consequence, CHRONOS and AURAMS share the same grid structure, advection scheme, gas phase chemical mechanism, inorganic heterogeneous chemistry scheme, meteorological driver, and anthropogenic emissions inputs, but AURAMS also contains considerably more detailed treatments of aerosol kinetics and chemistry as well as some other process representations (e.g., aqueous phase chemistry, plume rise from major point sources, below-cloud evaporation of hydrometeors, natural PM sources) that are not included in CHRONOS.

[10] AURAMS employs a sectional approach to represent the size distribution and chemical composition of atmospheric PM. Twelve size bins span the diameter size range from 0.01 to 40.96 μm and the following aerosol processes are considered: emissions; nucleation; condensation; coagulation; hygroscopic growth; dry deposition/sedimentation; aerosol activation; and below-cloud scavenging [Gong et al., 2003a]. Up to nine PM chemical components are considered: sulphate; nitrate; ammonium; black carbon; primary organic matter; secondary organic matter; crustal material; sea salt; and particle-bound water. The AURAMS aqueous phase chemistry mechanism includes 20 reactions, including mass transfer and aqueous phase sulphur oxidation: 13 aqueous phase species are considered, and nucleation scavenging of aerosol particles by cloud droplets is directly linked to particle activation [Gong et al., 2006]. Sea-salt emission from wave breaking is modeled online [Gong et al., 2003a]. Secondary organic aerosol formation is parameterized using one of two schemes [Odum et al., 1996; Jiang, 2003]. A second-order scheme is used to treat vertical diffusion. Plume rise is considered for major point sources. Wet deposition includes the removal of soluble gases and particles by cloud-to-rain conversion and below-cloud scavenging (impact scavenging of aerosol particles, reversible and irreversible scavenging of gases), and below-cloud evaporation is also considered [Gong et al., 2006]. Mass-consistency and mass-conservation corrections are also applied [Gong et al., 2003b].

2.3. Model Runs During ICARTT

[11] CHRONOS forecasts were readily available during the ICARTT period since the model was run operationally throughout 2004 at the Canadian Meteorological Centre in Montreal, Quebec. In order to participate in ICARTT, a special real-time AURAMS run was set up to complement the operational CHRONOS run. The AURAMS domain used for ICARTT covered eastern North America (Figure 1). In the horizontal, an 85 × 105 grid with 42-km grid cell spacing was used on the same polar-stereographic map projection as CHRONOS; in the vertical, 28 vertical levels with a top at 29 km were used. Each AURAMS 48-hour forecast was launched at 0000 UTC daily using the previous day's forecast at 24 hours to specify the initial atmospheric chemical state. The AURAMS integration time step was 450 s while the CHRONOS time step was 3600 s. For the ICARTT runs both models used meteorological fields from GEM version 3.1.2, but with 15-km horizontal grid spacing for CHRONOS versus 24 km for AURAMS.
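The grid and time step figures quoted for the two models translate into nominal domain extents and step counts as sketched below. This is simple arithmetic under stated assumptions: the extent is taken as the number of cells times the grid spacing, ignoring the map-scale factor of the polar-stereographic projection, and the names `CONFIGS` and `summarize` are hypothetical.

```python
FORECAST_HOURS = 48  # forecast length quoted in the text

# (nx, ny, grid spacing in km, time step in s), as quoted in the text.
CONFIGS = {
    "CHRONOS": (350, 250, 21.0, 3600),
    "AURAMS": (85, 105, 42.0, 450),
}

def summarize(nx, ny, spacing_km, step_s):
    """Return nominal domain extents and the number of steps per forecast."""
    return {
        "extent_x_km": nx * spacing_km,
        "extent_y_km": ny * spacing_km,
        "steps_per_forecast": FORECAST_HOURS * 3600 // step_s,
    }

summary = {name: summarize(*cfg) for name, cfg in CONFIGS.items()}
```

On these assumptions the CHRONOS domain is nominally 7350 km by 5250 km and is integrated in 48 steps per forecast, while the AURAMS domain is nominally 3570 km by 4410 km and takes 384 steps.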

[12] Both models used emission fields that were based on the 1990 Canadian and U.S. national criteria air contaminant inventories scaled to 1995 and 1996 levels by Canadian province and U.S. state, respectively. The Canadian Emissions Processing System was used to prepare hourly point-, area-, mobile-, and biogenic-source emission fields on the CHRONOS grid shown in Figure 1 from these inventories, including 17 gas phase species and primary bulk PM2.5 and PM10 emissions [e.g., Scholtz et al., 1999; Makar et al., 2003b]. AURAMS used the same emissions fields but aggregated to its 42-km grid. Biogenic emissions of NOx and VOCs were calculated “online” in CHRONOS using BEIS2 algorithms, the BELD3 vegetation database [Pierce et al., 1998; Pierce et al., 2000], and meteorological fields from GEM, whereas in AURAMS biogenic emissions were calculated off-line using BEIS2 algorithms but an older, less detailed BEIS1 vegetation database. The advantage of the BEIS1 vegetation database was its use of the same vegetation classes in both Canada and the United States, unlike the BELD3 data set, which contains much more detail for the U.S. Emissions from biomass burning, including wildfires, were not considered.

2.4. Additional Model Runs

[13] The CHRONOS and AURAMS runs made during the ICARTT field experiment in 2004 will be referred to hereinafter as the CHRONOS-OP (short for “operational”) and AURAMS-RT (short for “real-time”) runs. Three additional runs that were made retrospectively for the ICARTT period using modified versions of CHRONOS and AURAMS are also considered here. During ICARTT, in addition to the operational CHRONOS 48-hour forecast, an experimental 48-hour forecast with assimilation of near-real-time surface O3 data was generated at 0000 UTC. This version of CHRONOS had 24 levels with a top at 8 km. Assimilation of surface O3 data was for a 3-hour period, from 1200 UTC to 1500 UTC. This second CHRONOS version will be referred to as CHRONOS-SDA (surface data assimilation).

[14] One significant difference between CHRONOS-OP and AURAMS-RT was in the treatment of biogenic emissions. Following the ICARTT experiment, a comparison of CHRONOS and AURAMS predictions of free-tropospheric isoprene concentrations versus aircraft measurements showed the AURAMS values to be significantly lower than both the CHRONOS values and the aircraft measurements (S. McKeen, personal communication, 2005). Since biogenic sources are the dominant source of atmospheric isoprene, an ozone precursor, a second run of AURAMS was carried out after implementation of an improved treatment of biogenic emissions based on the BEIS3 (version 3.09) algorithms (U.S. Environmental Protection Agency, http://www.epa.gov/asmdnerl/biogen.html, 2001, viewed 23 June 2006) and the BELD3 vegetation database. This second AURAMS run will be referred to as the AURAMS-BIO (biogenic) run.

[15] The third additional run will be referred to as the AURAMS-NEW run and was performed with a newer version of AURAMS. The AURAMS-RT and AURAMS-BIO ICARTT simulations were both run using AURAMS version 1.1, whereas the AURAMS-NEW run was made using version 1.3.1. New features in version 1.3.1 include the treatment of CO as a prognostic species rather than as a fixed, horizontally homogeneous field, the implementation of a more accurate solver for vertical diffusion, the use of the Jiang [2003] scheme for secondary organic aerosol formation instead of the Odum et al. [1996] scheme, and the same treatment of biogenic emissions as in the AURAMS-BIO run. The pseudo-1995/1996 anthropogenic emission files used by the other four CHRONOS and AURAMS runs were replaced by an updated set of emissions files generated from the 2000 Canadian and 2001 U.S. national emission inventories using the SMOKE emissions processing system (version 2.1) (see Carolina Environmental Program, http://cf.unc.edu/cep/empd/products/smoke/index.cfm, visited 23 June 2006). In addition, a newer version of the GEM weather forecast model (version 3.2.1 plus a treatment of urban heat fluxes following Makar et al. [2006]) was used to prepare input meteorological fields. Only fields for the first 20 vertical levels (up to 5125 m) were saved from this run.

[16] Table 2 provides a summary of the key similarities and differences between this set of five model versions with respect to forecasting ozone. The CHRONOS-OP and CHRONOS-SDA pair differed in two main respects, the use of surface data assimilation and the extension of the domain in the vertical from 6 to 8 km, and the AURAMS-RT and AURAMS-BIO pair differed in only one respect, the change in the treatment of biogenic emissions. Comparing CHRONOS and AURAMS, the CHRONOS-OP and AURAMS-BIO versions were the most similar, and AURAMS-NEW was the most different from all of the other versions.

Table 2. Overview of Key Characteristics of the Five Model Versions
| Model Characteristic | CHRONOS-OP | CHRONOS-SDA | AURAMS-RT | AURAMS-BIO | AURAMS-NEW |
|---|---|---|---|---|---|
| GEM version | 3.1.2 | 3.1.2 | 3.1.2 | 3.1.2 | 3.2.1 |
| CTM version | 2.5 | 2.5 | 1.1 | 1.1 | 1.3.1 |
| Horizontal grid dimensions (X × Y) | 350 × 250 | 350 × 250 | 85 × 105 | 85 × 105 | 85 × 105 |
| Horizontal grid spacing, km | 21 | 21 | 42 | 42 | 42 |
| Number of vertical levels | 24 | 24 | 28 | 28 | 28 |
| Altitude of model top, km | 6 | 8 | 29 | 29 | 29 |
| Time step, s | 3600 | 3600 | 450 | 450 | 450 |
| Emissions base year (Cda/U.S.) | 1995/1996 | 1995/1996 | 1995/1996 | 1995/1996 | 2000/2001 |
| Biogenic emissions algorithm | BEIS2 | BEIS2 | BEIS2 | BEIS3 | BEIS3 |
| Biogenic vegetation database | BELD3 | BELD3 | BEIS1 | BELD3 | BELD3 |
| Surface O3 data assimilation | no | yes | no | no | no |
| Prognostic CO field | no | no | no | no | yes |

3. Profile Comparisons

[17] Several different comparisons of the 275 IONS ozone soundings with CHRONOS and AURAMS predictions are presented in this section. Six examples of model forecasts compared with single ozone soundings are shown in Figures 2–7. These examples have been chosen to illustrate different features of model performance, and are not necessarily representative of average performance. Average observed and predicted ozone profiles are compared at each IONS sounding site, and some statistics are presented in Figure 8. Time series comparisons at 0 and 1000 m are shown in Figures 9 and 10, and differences in the upper troposphere are examined in Figure 11.

Figure 2.

Ozone profile comparisons of the five model runs and the ozonesonde data, for 14 July at Egbert, Ontario. All the models show some skill at reproducing the sharp boundary layer transition in the vertical, and the CHRONOS runs also reproduce the secondary feature at 2 km.

Figure 3.

Ozone profile comparisons of the five model runs and the ozonesonde data, for 30 July at Sable Island, Nova Scotia. The boundary layer transition in the vertical is less pronounced for the evening comparison. Two of the AURAMS runs predict this well, but are biased low overall. The CHRONOS runs predict a sharper transition.

Figure 4.

Ozone profile comparisons of the five model runs and the ozonesonde data, for 30 July at Yarmouth, Nova Scotia. What appears to be a marine boundary layer transition in the vertical is surprisingly sharp for this late afternoon comparison. Nevertheless, the two CHRONOS runs predict this well.

Figure 5.

Ozone profile comparisons of the five model runs and the ozonesonde data, for 16 July at Huntsville, Alabama. All the models predict large ozone production in the surface layer, although underpredicting higher up.

Figure 6.

Ozone profile comparisons of the five model runs and the ozonesonde data, for 28 July at Narragansett, Rhode Island. None of the models predicts a surface depletion of this magnitude, although several forecast a surface layer that is depleted of ozone to some degree.

Figure 7.

Ozone profile comparisons of the five model runs and the ozonesonde data, for 7 July at Wallops Island, Virginia. Four of the models predict the ozone feature near 1 km. Some of the runs show some indication of the secondary peak at 2 km.

Figure 8.

Model-sonde average differences in ppb for the five model runs. Error bars correspond to one standard deviation. Sonde mean release time and the number of profiles available for comparison at each site are indicated. Boulder and Trinidad Head are not shown as these sites were located outside of the AURAMS domain.

Figure 9.

Surface ozone from ozonesondes at six IONS sites compared with the five model runs. Although individual differences are often significant, all the models track major changes in ozone concentration well. Variability in the model values is somewhat higher than in the measured values, by 12%, 32%, 38%, 27% and 17%, for AURAMS-RT, AURAMS-BIO, AURAMS-NEW, CHRONOS-OP and CHRONOS-SDA, respectively.

Figure 10.

Ozone at 1000 m from the five model runs and the ozonesonde data, at six IONS sites. Although individual differences are often significant, all the models track major changes in ozone concentration well. Variability in the model values is somewhat higher than in the measured values, by 13%, 34%, 27%, 23%, and 29%, for AURAMS-RT, AURAMS-BIO, AURAMS-NEW, CHRONOS-OP and CHRONOS-SDA, respectively.

Figure 11.

Average differences (model-sonde) between the observed and forecast ozone profiles for each model at Yarmouth, Nova Scotia. Dashed lines indicate 1σ envelopes. Differences for other sites are similar in the upper troposphere.

3.1. Lower Troposphere

[18] Figure 2 compares an early morning ozone sounding (1100 UTC/0600 LST) on 14 July 2004 at Egbert, Ontario, with the predicted ozone profiles from the five model versions for that time and location (obtained by bilinear interpolation in the horizontal). The sharp transition in ozone concentration in the vertical from the surface through the nocturnal inversion to the residual layer above is captured by the models, although in all cases the vertical gradient is apparently overestimated. This may be due in part to the response time of the ozone sensor (see section 2.1), since this will tend to smooth the observed profile. The vertical gradient of the true ozone profile below 500 m may therefore more closely resemble that forecast by the models than it would appear from this comparison. However, all the models forecast higher ozone than is observed between 500 and 1000 m. The two CHRONOS runs also reproduce the secondary feature at 2000 m.
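The horizontal interpolation of a gridded model field to a sounding site can be sketched as follows. This is a generic bilinear interpolation on a unit-spaced grid, not the code used for the comparison; the function name `bilinear`, the 2 × 2 field, and the fractional site position are hypothetical.

```python
def bilinear(field, x, y):
    """Bilinearly interpolate field[j][i] at fractional grid indices (x, y).

    x runs along the second (i) index and y along the first (j) index.
    Both must be non-negative and lie within the grid.
    """
    # Index of the lower-left corner of the enclosing cell, clamped so that
    # points on the upper edges still use a valid cell.
    i0 = min(int(x), len(field[0]) - 2)
    j0 = min(int(y), len(field) - 2)
    fx = x - i0
    fy = y - j0
    bottom = (1.0 - fx) * field[j0][i0] + fx * field[j0][i0 + 1]
    top = (1.0 - fx) * field[j0 + 1][i0] + fx * field[j0 + 1][i0 + 1]
    return (1.0 - fy) * bottom + fy * top

# Hypothetical ozone values (ppbv) at the four grid points around a site.
ozone_level = [[40.0, 50.0],
               [60.0, 70.0]]
site_value = bilinear(ozone_level, 0.25, 0.5)
```

Repeating the interpolation at each model level yields a predicted profile at the site location that can be compared directly with the sounding.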

[19] Figure 3 shows an evening sounding (2300 UTC/1900 LST) on 30 July at Sable Island, Nova Scotia. The boundary layer transition in the vertical is less pronounced for this case. The three AURAMS runs predict this low-level feature well, though two are biased low overall and one high overall. The two CHRONOS runs, by contrast, predict much more pronounced PBL effects than are seen in the measurements. All of the models, however, are low relative to the sonde above 3000 m. Similar behavior above 3000 m is apparent in Figures 4–7, and as discussed in section 3.2 this bias becomes more pronounced at higher altitudes.

[20] An early afternoon sounding (1700 UTC/1300 LST) on 30 July at Yarmouth, Nova Scotia (Figure 4), shows a shallow layer of high ozone molar mixing ratio at 500 m, just above the top of the marine boundary layer. The two CHRONOS runs predict this feature fairly well, with the operational version (CHRONOS-OP) doing a somewhat better job of reproducing the narrowness of the layer, while the version with assimilation of surface ozone data (CHRONOS-SDA) correctly places the altitude of the layer. The three AURAMS runs are all quite different among themselves but all predict a broader feature and all are biased low.

[21] In Figure 5 the early afternoon ozone profile (2000 UTC/1400 LST) for 16 July at Huntsville, Alabama shows a deep (2 km) boundary layer of photochemically produced ozone. All five model versions predict large ozone production in this layer, and all get the PBL depth about right, but none predicts the ozone increase from the surface to 1500 m. As a result the AURAMS run results are much closer to the sonde measurement at the surface, whereas the CHRONOS run results are close to the sonde value at the top of the PBL (1500 m). All of the models underpredict ozone values above 3000 m, the AURAMS runs especially so.

[22] Some of the sites near the ocean (Narragansett, RHBrown) appear also to be subject to daylight titration under certain conditions (temperature inversions, clouds or fog). A dramatic example of this was observed at Narragansett on 28 July (Figure 6). Interestingly, three of the model runs (AURAMS-NEW, CHRONOS-OP and CHRONOS-SDA) appear to reproduce this, although the predicted loss is only half that observed. The assimilation of surface ozone data appears to have helped the CHRONOS profile very little in this case, probably because the titration is a transient phenomenon.

[23] In Figure 7, the early afternoon sounding (1900 UTC/1400 LST) for 7 July at Wallops Island, Virginia shows a very complex, multilayered ozone profile. Four of the model versions predict the ozone maximum at 700 m. One run (CHRONOS-SDA) also shows some indication of the secondary peak at 2000 m. The AURAMS-NEW profile is quite different from the others and does not predict any layering above 500 m.

[24] These examples demonstrate that while the models all show some skill in forecasting ozone in the boundary layer and lowermost troposphere, they all show at times large differences from the actual profile, and in general large differences from each other. This is quite surprising, because the models have major features in common: they all use the same gas phase chemistry, and all are driven by the same meteorological forecast model. This implies that the differences in predicted ozone are due to differences in horizontal resolution, integration time step, treatment of biogenic emissions and aqueous phase chemistry, all of which might be expected to be of minor importance.

[25] A statistical summary of overall model performance in predicting the IONS ozone profiles is given in Figure 8 for the five model versions. Calculated biases are variable, and in some cases quite modest. Differences in sonde preparation between stations may contribute a minor part of the station-to-station variation in model-sonde bias. As noted in section 2.1, such differences are small (2–3%), but they are systematic between stations. Model-sonde differences for individual profiles, however, are often large, as evidenced by the error bars (one standard deviation), which are generally in the range of 10–30 ppbv, or 25–75% of typical tropospheric ozone amounts. Over all sites, in the first 1000 m, biases are lowest for the AURAMS-NEW run, while standard deviations are lowest for the AURAMS-RT run. Average biases over the first 1000 m are −5.6, 2.8, 1.8, 5.3 and 2.6 ppbv for AURAMS-RT, AURAMS-BIO, AURAMS-NEW, CHRONOS-OP and CHRONOS-SDA, respectively, while average standard deviations are very similar, ranging between 15.5 and 18.1 ppbv. The surface ozone data assimilation appears to reduce both biases and standard deviations for CHRONOS, although for some sites actual surface biases increase (e.g., Huntsville, where the bias for CHRONOS-SDA is the largest surface-level bias of any of the models, at any site). In general, agreement in the first 1000 m is best at Egbert, Yarmouth, Pellston and Sable Island, that is, at the northernmost IONS stations (see Figure 1). One possible explanation is that the Canadian emissions used as input to CHRONOS and AURAMS were more accurate than those for the United States. Interestingly, implementation of pollutant control legislation in the United States (“NOx SIP Call”) resulted in a significant reduction in U.S. NOx emissions occurring between 2001 and 2004, after the applicable years for the two U.S. emission inventories that were used for these runs [Frost et al., 2006]. The biases in Figure 8 at U.S. sites in the lowest 1000 m are predominantly overpredictions, and this may be partly due to the reduction in actual versus forecast emissions. In addition, several of the U.S. sites (Beltsville, Houston, Narragansett) are near or downwind of large pollution sources, and so see large variability in surface ozone depending on local winds, insolation and temperature inversions, rendering forecasting more difficult. In Figure 8 the standard deviations of the model-sonde differences for these sites decline markedly from the surface to 3000 m.
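The two statistics summarized in Figure 8 can be computed as sketched below, taking the bias as the mean of the model-minus-sonde differences and the spread as their sample standard deviation. The paired values are hypothetical, the function name `bias_and_spread` is illustrative, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not state which estimator was used.

```python
import statistics

def bias_and_spread(model_ppbv, sonde_ppbv):
    """Return (mean, sample standard deviation) of model-minus-sonde differences."""
    diffs = [m - s for m, s in zip(model_ppbv, sonde_ppbv)]
    return statistics.mean(diffs), statistics.stdev(diffs)

# Hypothetical paired values (ppbv) at one altitude for four soundings.
model_ppbv = [50.0, 62.0, 45.0, 70.0]
sonde_ppbv = [45.0, 60.0, 50.0, 58.0]
bias, spread = bias_and_spread(model_ppbv, sonde_ppbv)

# The quoted correspondence of 10-30 ppbv to 25-75% implies a reference
# ozone amount of about 40 ppbv; that value is inferred, not stated.
REFERENCE_PPBV = 40.0
relative_spread_percent = 100.0 * spread / REFERENCE_PPBV
```

A small bias together with a large spread, as in this hypothetical case, corresponds to the pattern described above: modest average errors but large differences for individual profiles.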

[26] The AURAMS-RT and AURAMS-BIO runs show much larger (negative) biases than the two CHRONOS runs and the AURAMS-NEW run above about 1500 m. As noted above, all of the models show exclusively negative biases above 2000 m.

[27] Another aspect of model performance, one that is perhaps the most important for an AQ forecast model, is how well the model predicts changes in ozone concentration from day to day. Several of the IONS sites launched sondes on a daily or near-daily schedule. Figure 9 shows time series of surface ozone from the ozonesondes at six of these sites, compared with the five model runs. Although individual differences are often significant, all the models track major changes in ozone concentration well overall. Variability in the model values is somewhat higher than in the measured values, by 12%, 32%, 38%, 27% and 17%, for AURAMS-RT, AURAMS-BIO, AURAMS-NEW, CHRONOS-OP and CHRONOS-SDA, respectively. Figure 10 is similar, comparing time series of measured ozone at 1000 m with those forecast by the five model versions for the same six sites. All of the models also track major changes in ozone concentration at 1000 m well, although individual differences are often significant. This is probably in part due to the fact that the models use emissions inventories for ozone precursors, and lack data on actual emissions. For example, none of the model runs predicts the large increases in ozone at 2000 m seen over Houston on 19 and 20 July, which were apparently due to pollution from Alaskan and Canadian forest fires [Morris et al., 2006]. Variability in the model values is somewhat higher than in the measured values, by 13%, 34%, 27%, 23%, and 29%, for AURAMS-RT, AURAMS-BIO, AURAMS-NEW, CHRONOS-OP and CHRONOS-SDA, respectively.
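The percentage excess in model variability can be expressed as a ratio of standard deviations, as in the sketch below. The text does not define the variability measure, so the use of the standard deviation of each time series is an assumption; the two series and the function name `excess_variability_percent` are hypothetical.

```python
import statistics

def excess_variability_percent(model_series, observed_series):
    """Percentage by which the model standard deviation exceeds the observed one."""
    ratio = statistics.pstdev(model_series) / statistics.pstdev(observed_series)
    return 100.0 * (ratio - 1.0)

# Hypothetical daily ozone values (ppbv) at one site and altitude.
observed = [30.0, 40.0, 50.0, 40.0]
modelled = [28.0, 40.0, 52.0, 40.0]
excess = excess_variability_percent(modelled, observed)
```

For these hypothetical series the model standard deviation exceeds the observed one by 20%, a figure within the range of the values quoted for the five runs.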

3.2. Upper Troposphere

[28] In marked contrast to the skill shown in the first 2000 m, above this level all of the models show exclusively negative biases with respect to measurements, and these biases become quite severe, particularly for AURAMS, in the upper troposphere (UT). Figure 11 shows average differences at Yarmouth, Nova Scotia, between the observed and forecast ozone profiles for each model. Other IONS sites show similar differences in the middle and upper troposphere. These differences can be as much as 80–90% in the UT for AURAMS; that is, the model is only showing 10–20% of the actual ozone values at these heights (compare Figures 11 and 13). For CHRONOS the low bias is less marked, but can be nearly 50% at 8 km. Possible reasons for this behavior will be discussed in the next section.

4. Discussion

4.1. Interversion Differences

[29] Inspection of Figures 2–7 suggests some systematic differences between the five model versions. For example, comparing CHRONOS-OP with CHRONOS-SDA, it is evident from these figures that the addition of surface data assimilation of ozone does not always improve the forecast of surface ozone but at the same time the impact of surface data assimilation reaches into the free troposphere. However, as noted above, examination of Figure 8 suggests slightly better evaluation statistics for CHRONOS-SDA. Turning to the AURAMS-RT/AURAMS-BIO pair, the higher isoprene emissions in the AURAMS-BIO run have resulted in higher mean ozone concentrations at all sites, although the magnitude varies from site to site. The profiles from the AURAMS-NEW run presented in Figures 2–7 are quite distinct from those of the other two AURAMS runs. The underprediction at upper levels is also significantly reduced for this run, compared to the other two AURAMS runs (Figure 8) but is generally still larger than for the two CHRONOS runs.

[30] It is also instructive to compare the ensemble of the five model-predicted ozone profiles to the measured profile in Figures 2–7. In most cases the ensemble of profiles brackets the measured profile, suggesting that an ensemble-average profile might compare better to measurements than any individual model version. On the basis of the forecasts submitted for the ICARTT real-time AQ model intercomparison, including the CHRONOS-OP and AURAMS-RT forecasts, McKeen et al. [2005] and Pagowski et al. [2005] found that an ensemble forecast from the six participating AQ models performed better on average than any of the individual models. It is also worth noting, given the range of forecasts from basically similar model versions, how sensitive model performance can be to changes in model configuration or input files.

4.2. Role of Chemical Initial and Boundary Conditions

[31] As discussed by Brost [1988], the treatment of moderately long-lived trace species such as ozone poses a challenge for limited area CTMs, since species whose chemical lifetimes are on the order of days will be long-lived enough to travel from the model boundary across the model domain but reactive enough to be transformed or removed within the model domain. This suggests that ozone concentrations at inflow boundaries will influence ozone concentrations in the model interior. As shown by Brost [1988], Langmann and Bauer [2002], and Tong and Mauzerall [2006], lateral boundary influences will also increase with height, since most ozone precursors are emitted at, or near, the Earth's surface and hence have the most immediate impact close to the ground. Thus lateral boundary influences will be more important in the free troposphere than in the PBL. Lin et al. [1996] found similar results for Rn-222, an inert gas which has an e-folding lifetime of 5.5 days and only surface sources. However, both CHRONOS and AURAMS employ a zero-gradient boundary condition for each chemical species at inflow lateral boundaries, including ozone, and thus in essence ignore inward fluxes at the lateral boundaries. They thus assume that species abundances in the model interior are determined only by processes within the interior, and especially by surface emissions.

[32] Brost [1988] and Berge et al. [2001] demonstrated as well that regional-scale CTMs can be significantly influenced by initial vertical distributions of ozone for three days or more after the start of a simulation before horizontal winds have had time to “flush” the model interior. They also found the influence of initial ozone concentrations to be larger in the free troposphere than in the PBL. This phenomenon will be enhanced for the two models considered in this study as they ignore inward fluxes at the lateral boundaries.

[33] It is apparent from the model evaluation results presented in section 3 that model performance was better in the PBL than the free troposphere. This is consistent with the greater role of local emissions of ozone precursors in the former. However, even for the PBL, initial and boundary conditions are likely to have a greater influence for periods when emissions are reduced or transport is from an unpolluted area or photochemistry is reduced (e.g., winter).

[34] The large differences between the AURAMS and CHRONOS deficiencies in the UT are surprising, since, as noted above, the models have many features in common, including the same gas phase chemistry and the same emissions inventory. Although the two models employ the same chemical lateral boundary conditions (CLBCs), there is a difference in the initial ozone fields. CHRONOS assumes a horizontally homogeneous, uniform initial O3 profile of 60 μg kg−1 (36.1 ppbV), whereas AURAMS assumes a uniform initial O3 profile of 80 μg kg−1 (48.2 ppbV). CHRONOS and AURAMS also have different horizontal grid spacing, which can affect the magnitude of mixing ratio extrema. However, other factors may contribute as well. For example, the difference in biogenic emissions between the AURAMS-RT and AURAMS-BIO runs results in somewhat smaller biases in the UT for the latter run. Possible candidates for the even smaller biases for the AURAMS-NEW run are the change in the anthropogenic emission files, the change in the meteorological input files, and the addition of CO as a prognostic species to the gas phase chemistry mechanism. One other difference between CHRONOS and AURAMS is the difference in domain size and lateral boundary locations (Figure 1), which may affect the impact of the inflow zero-gradient CLBCs: for example, dry deposition of O3 will be larger on the AURAMS western boundary over land than on the CHRONOS western boundary over water.
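
The conversion between the mass mixing ratios and the volume mixing ratios quoted for the initial profiles can be checked with a minimal sketch. The molar masses below are assumptions (the text does not state the constants used); a dry-air value of about 28.9 g/mol reproduces the quoted figures to one decimal place.

```python
M_AIR = 28.9   # g/mol, approximate molar mass of dry air (assumed constant)
M_O3 = 48.0    # g/mol, molar mass of ozone

def ug_per_kg_to_ppbv(mass_mixing_ratio_ug_kg):
    """Convert an ozone mass mixing ratio (ug/kg) to a volume mixing ratio (ppbV).

    1 ug/kg is one part per billion by mass; multiplying by M_AIR / M_O3
    converts parts by mass to parts by volume (moles).
    """
    return mass_mixing_ratio_ug_kg * M_AIR / M_O3

for value in (60.0, 80.0):
    print(f"{value:.0f} ug/kg = {ug_per_kg_to_ppbv(value):.1f} ppbV")
```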

[35] One way to investigate the impact of the zero-gradient CLBC for ozone used by both CHRONOS and AURAMS is to run a sensitivity test with an alternative CLBC. To rerun the models for the ICARTT period with a different CLBC is a large task; fortunately a similar experiment has already been conducted with AURAMS for the summer of 2002. In this sensitivity experiment the zero-gradient CLBC for O3 was replaced in AURAMS by a time-invariant climatological CLBC in which a fixed O3 concentration profile was prescribed at all inflow lateral boundaries. CLBCs of this type have been used in a number of other limited area AQ models such as CMAQ and CHIMERE [Hogrefe et al., 2004; Tong and Mauzerall, 2006; Vautard et al., 2005].
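
The qualitative difference between the two kinds of inflow boundary condition can be illustrated with a toy one-dimensional model. This is only a sketch under strong assumptions: a first-order upwind advection scheme, a uniform wind, a single first-order loss term standing in for deposition and chemical destruction, and no interior sources. It is not the numerical scheme of AURAMS or CHRONOS, and the function name and parameter values are hypothetical.

```python
def run_advection(inflow, n_cells=50, n_steps=2000, c_init=48.2,
                  c_boundary=48.2, cfl=0.5, loss_per_step=0.002):
    """Advect a tracer from left to right with first-order loss in every cell.

    inflow = "zero_gradient": the ghost cell copies the first interior cell,
                              so no new tracer enters through the boundary.
    inflow = "prescribed":    the ghost cell holds a fixed boundary value.
    """
    if inflow not in ("zero_gradient", "prescribed"):
        raise ValueError("inflow must be 'zero_gradient' or 'prescribed'")
    c = [c_init] * n_cells
    for _ in range(n_steps):
        ghost = c[0] if inflow == "zero_gradient" else c_boundary
        upwind = [ghost] + c[:-1]
        # Upwind advection step followed by fractional loss.
        c = [(ci - cfl * (ci - up)) * (1.0 - loss_per_step)
             for ci, up in zip(c, upwind)]
    return c

base = run_advection("zero_gradient")
test = run_advection("prescribed")
print(f"zero-gradient mean: {sum(base) / len(base):.2f} ppbV")
print(f"prescribed mean:    {sum(test) / len(test):.2f} ppbV")
```

In this toy setting the zero-gradient run loses its initial tracer steadily because nothing replaces what the sink removes, while the prescribed-inflow run settles to a profile close to the boundary value.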

[36] Two AURAMS runs that differed only in their treatment of the O3 CLBC were made for the same multimonth period in 2002. The domain used for these two runs was continental in scale, similar to the CHRONOS domain (see Figure 1), and considerably larger than the subcontinental domain used for the ICARTT period simulations. The run with the original zero-gradient O3 CLBC was started on 15 May 2002 at 0600 UTC and ended on 1 October 2002 at 0600 UTC. The sensitivity run with the prescribed O3 vertical profile at inflow lateral boundaries was started on 1 August 2002 at 0600 UTC and also ended on 1 October 2002 at 0600 UTC. A uniform vertical O3 profile of 80 μg kg−1 (48.2 ppbV) was prescribed, which is identical to the initial O3 vertical profile assumed at the start of both simulations.

[37] Figure 12 shows domain-average vertical O3 profiles for the two runs at the same time, 30 September 2002 at 1900 UTC (1400 Eastern Standard Time), 4.5 months after the start of the first (base case) simulation and 2 months after the start of the second (sensitivity test) simulation. The O3 profile prescribed at the inflow lateral boundaries for the sensitivity test simulation is also plotted. The two domain-average O3 vertical profiles are strikingly different. The base case vertical profile for the zero-gradient CLBC is lower by a factor of two at the surface and decreases rapidly with height, falling by ∼75% in the first 5 km. The sensitivity test vertical profile, on the other hand, is not very different from the prescribed O3 lateral boundary profile; it is reduced by a few ppb close to the Earth's surface relative to the boundary profile and is increased by a few ppb in the mid troposphere relative to the boundary profile.

Figure 12.

Domain-average vertical O3 profiles for the two runs at the same times, 4.5 months after the start of the first (base case) simulation and 2 months after the start of the second (sensitivity test) simulation. “Old CLBC” indicates the results using the base case, zero-gradient CLBC; “new CLBC” indicates the results using a prescribed O3 profile at the inflow lateral boundaries (shown in black). The two sets of domain-average O3 vertical profiles are strikingly different.

[38] This result suggests that for the sensitivity test simulation, the inflow lateral boundaries act as an important additional source of O3, counteracting the O3 sinks of dry deposition to the Earth's surface and aboveground chemical destruction that overwhelm O3 production from North American anthropogenic and biogenic sources of O3 precursors in the base case simulation. Two other important sources of tropospheric O3, downward transport from the stratosphere and in situ production in the UT from lightning-generated NOx, were not considered directly in either of these runs. However, they may be considered to have made an indirect contribution along with anthropogenic and biogenic sources of O3 precursors outside of North America in that it is assumed implicitly in the prescription of the time-invariant climatological O3 profiles at the lateral boundaries that there is a balance of processes outside of the model domain that maintains the prescribed profile. Thus, while the new time-invariant CLBC may appear to have “fixed” the problem, it should really be considered to have merely transferred it outside the model domain. Moreover, the new CLBC has produced a domain-average O3 profile that is much closer to the climatology near the surface as well, a result that emphasizes the importance in this case of having accurate O3 profiles at the domain boundaries. Nevertheless, this exercise demonstrates the importance of transport across lateral model boundaries. Interestingly, Tong and Mauzerall [2006] have recently suggested that even the specification of accurate and time-varying O3 CLBCs for a limited area AQ model may need to be supplemented by stratospheric inputs of O3 at the upper boundary of the limited area model in order to account for all UT O3.

4.3. Role of Stratosphere-Troposphere Exchange

[39] There are also several processes not represented in the models that likely contribute to the significant underestimate of ozone in the UT. Emissions of NOx in the UT due to lightning and to in-flight aircraft emissions have not been considered. Such emissions could lead to in situ production of ozone. Vertical transport of ozone and its precursors from the PBL to the UT by subgrid-scale deep convective systems such as large thunderstorms or squall lines is also not considered. However, the stratosphere is a large reservoir of ozone, and so another potential source of the “missing” ozone is injection from the stratosphere, which is also not presently considered in AURAMS or CHRONOS.

[40] Observational studies on stratosphere-troposphere exchange of ozone comprise a large literature [e.g., Danielsen, 1968; Davies and Schuepbach, 1994; Cho et al., 1999; Monks, 2000]. A number of these studies have suggested that the process is quite important to the tropospheric ozone budget [e.g., Dutkiewicz and Husain, 1985; Oltmans et al., 1989; Bachmeier et al., 1994; Browell et al., 1994; Mauzerall et al., 1996; Dibb et al., 1997, 2003; Allen et al., 2003], while others have concluded that it is a minor source [e.g., Dibb et al., 1994; Bazhanov and Rodhe, 1997; Elbern et al., 1997; Li et al., 2002; Browell et al., 2003]. In general, the former studies dealt with the UT while the latter concluded that stratospheric ozone was a minor source at the surface. Other ozonesonde-model comparison studies [Hoff et al., 1995; Mauzerall et al., 1996] have found it necessary to assume a stratospheric ozone source in order to reproduce the observed vertical distribution of ozone. Indeed, consideration of the average vertical profile of ozone molar mixing ratio at any of the IONS sites (e.g., Figure 13) strongly suggests that the stratosphere must be a source of at least some of the ozone in the troposphere, since the observed monotonic decline of ozone mixing ratio from the tropopause to the PBL cannot readily be explained by means of only tropospheric sources.

Figure 13.

Ozone molar mixing ratio from ozonesondes at Yarmouth, Nova Scotia, averaged on AURAMS model levels. As for other sites, ozone decreases monotonically from the stratosphere to the surface boundary layer.

[41] Most recently, several observational studies using the IONS-04 data set have concluded that the stratosphere is an important source of free tropospheric ozone. Thompson et al. [2007a, 2007b], using a number of observational criteria to classify portions of ozone soundings, calculate the stratospheric contribution to the total tropospheric column to be between 16 and 34% at IONS-04 sites. Cooper et al. [2006], using the PV-based FLEXPART retroplume technique, estimate that between 13 and 27% of ozone in the UT at IONS-04 sites is of recent stratospheric origin.

[42] Estimates by global CTMs of the cross-tropopause flux of ozone from the stratosphere vary between about 400 and 1400 Tg(O3) yr−1 [World Meteorological Organization, 1999; Brasseur et al., 2003]. More recently Lelieveld and Dentener [2000] have estimated it to be 565 Tg(O3) yr−1, on the basis of a model study using ECMWF meteorological reanalyses and ozonesonde data. The flux has also been estimated from measurements of N2O and ozone, based on the observed correlation of N2O and ozone, at 400 Tg(O3) yr−1 [Murphy and Fahey, 1994], and at 475 Tg(O3) yr−1 [McLinden et al., 2000], from measurements of N2O and NOy, based on the observed N2O:NOy and NOy:O3 correlations. These fluxes are comparable to the total tropospheric burden of ∼350 Tg(O3).
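
As a rough illustration of what "comparable" means here, the time the stratospheric flux $F$ alone would need to supply one tropospheric burden $B$ can be written as a ratio. This is simple arithmetic on the numbers quoted above, not a statement about the chemical lifetime of ozone.

```latex
\tau_{\mathrm{supply}} = \frac{B}{F}, \qquad
\frac{350\ \mathrm{Tg}}{1400\ \mathrm{Tg\,yr^{-1}}} = 0.25\ \mathrm{yr}
\;\le\; \tau_{\mathrm{supply}} \;\le\;
\frac{350\ \mathrm{Tg}}{400\ \mathrm{Tg\,yr^{-1}}} \approx 0.88\ \mathrm{yr}
```

That is, the quoted range of fluxes would deliver a mass equal to the whole tropospheric burden in roughly 3 to 10.5 months.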

[43] Using these estimates it is possible to calculate, using a simple model, how much of the “missing” ozone in Figure 11, for example, can be accounted for by transport from the stratosphere. The vertical ozone flux f through a horizontal surface can be written

$$f = K \rho \frac{d\mu}{dz} \qquad (1)$$

where $\mu = \rho_{\mathrm{O_3}}/\rho$ is the ozone mass mixing ratio, ρ is air density, and z is the vertical coordinate. Then assuming no in situ production or loss, the change in ozone concentration within a horizontal layer, ρdμ/dt, is equal to the vertical derivative of the flux, so that

$$\rho \frac{d\mu}{dt} = \frac{d}{dz}\left(K \rho \frac{d\mu}{dz}\right) \qquad (2)$$

[44] This will be recognized as the equation for diffusion of μ in one dimension. K (often written Kzz) is the coefficient of vertical diffusion and has a value appropriate to represent all vertical motion in the atmosphere (as opposed to the K described in a three-dimensional atmospheric model like GEM, which represents only turbulent eddy diffusion, since large-scale vertical motions in GEM are modeled explicitly).

[45] For a steady state, the left-hand side of (2) must be balanced by an equal rate of chemical loss. We assume a chemical loss rate, L = 2 ppb/day. This corresponds to a lifetime for ozone at the surface of about 2 weeks, and 40 days in the UT. We scale this assumed total L by the fraction of stratospherically derived ozone and set it equal to the flux divergence, ρdμ/dt, above. From the sonde data, we know ρ and dμ/dz, and so for a given value of K can solve for the amount of ozone that is supplied by downward transport.
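
The link between the assumed loss rate and the quoted lifetimes can be made explicit. If the lifetime is taken as the mixing ratio divided by the loss rate (an assumption about how the lifetimes were derived), the quoted values imply the following mixing ratios:

```latex
\tau = \frac{\mu}{L}
\quad\Longrightarrow\quad
\mu_{\mathrm{surface}} \approx 2\ \mathrm{ppb\,day^{-1}} \times 14\ \mathrm{days} = 28\ \mathrm{ppb},
\qquad
\mu_{\mathrm{UT}} \approx 2\ \mathrm{ppb\,day^{-1}} \times 40\ \mathrm{days} = 80\ \mathrm{ppb}
```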

[46] K is not a real (i.e., measurable) atmospheric variable, nor is it expressed in global 3D models like GEM, so it must be estimated indirectly. Using an average of sonde data over several IONS stations for dμ/dz, the range of values for the cross-tropopause flux of ozone quoted above implies, from (1), values of K near the tropopause between about 3 and 6 m2 s−1. This is similar to values of K used in 1D and 2D models, which are generally in the range 4–10 m2 s−1 [Mauzerall et al., 1996; Intergovernmental Panel on Climate Change, 1999]. Using values of K between 3 and 6 m2 s−1 and L = 2 ppb/day in (2), this order-of-magnitude estimate gives values for the stratospheric contribution to ozone in the upper troposphere of about 15–35%, similar to that estimated from more sophisticated observational analyses using the IONS data set [Thompson et al., 2007a, 2007b; Cooper et al., 2006]. Thus the lack of a stratospheric flux could account for most of the low bias in the UT for CHRONOS, but only about one third to one half of that for the AURAMS runs.
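
The way a flux estimate translates into a value of K through equation (1) can be sketched as below. Everything numerical here is an assumption made for illustration: the flux is spread evenly over the whole globe, the air density is a round value for the tropopause region, and the mixing ratio gradient is hypothetical rather than the IONS average used in the text. The printed values therefore should not be expected to match the 3–6 m2 s−1 range quoted above.

```python
SECONDS_PER_YEAR = 365.25 * 86400.0
EARTH_SURFACE_AREA_M2 = 5.1e14  # approximate global surface area

def area_mean_flux(flux_tg_per_year, area_m2=EARTH_SURFACE_AREA_M2,
                   seconds_per_year=SECONDS_PER_YEAR):
    """Convert a flux in Tg(O3)/yr to kg m^-2 s^-1 (1 Tg = 1e9 kg)."""
    return flux_tg_per_year * 1.0e9 / (area_m2 * seconds_per_year)

def diffusion_coefficient(flux, air_density, mixing_ratio_gradient):
    """Solve the flux-gradient relation f = K * rho * dmu/dz for K (m^2 s^-1)."""
    return flux / (air_density * mixing_ratio_gradient)

RHO_TROPOPAUSE = 0.4   # kg m^-3, assumed round value near the tropopause
DMU_DZ = 2.0e-11       # kg/kg per metre, hypothetical gradient

for flux_tg in (400.0, 1400.0):
    k = diffusion_coefficient(area_mean_flux(flux_tg), RHO_TROPOPAUSE, DMU_DZ)
    print(f"{flux_tg:6.0f} Tg/yr -> K = {k:.1f} m^2/s")
```

Because K scales linearly with the flux and inversely with the gradient, the uncertainty in either input carries directly into the estimate of K.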

[47] It seems quite possible, therefore, that the addition of a realistic stratosphere, and the corresponding stratospheric ozone flux, would significantly improve the model profiles of ozone in the UT. It also seems likely that some additional chemical source of ozone in the UT will be required. One likely candidate is lightning-generated NOx [Huntrieser et al., 1998; Cooper et al., 2006], which is not currently considered in either model.

[48] It is reasonable to ask whether or not correcting the profile in the free troposphere would have an important effect on modeled ozone values at the surface. Stratospheric ozone intrusions are occasionally observed to reach the ground [e.g., Lefohn et al., 2001; Elbern et al., 1997; Davies and Schuepbach, 1994; Wakamatsu et al., 1989; Oltmans et al., 1989], so the addition of a stratospheric source would clearly improve modeling of these events; however, they are relatively infrequent. Much more frequently, intrusion events are observed to reach the upper or middle troposphere, where they appear to dissipate and contribute to the “background” ozone, generally defined as tropospheric ozone that is more than seven days old and therefore of uncertain origin, and estimated at about 20–45 ppb [Naja et al., 2003; Altshuller and Lefohn, 1996; Hirsch et al., 1996; Lin et al., 2000]. Since the troposphere is generally well mixed on a timescale of 2–3 weeks, and the lifetime of ozone in the lower troposphere is of similar duration, this background ozone likely makes a significant, seasonally varying contribution to ozone values at the surface. Indeed, long-term trend studies of ozonesonde data find statistically significant (95% confidence) correlations between ozone mixing ratio in the lower stratosphere and in the troposphere, right down to the surface, at middle and high latitude sites far from major anthropogenic pollution sources [Tarasick et al., 2005; Taalas et al., 1997]. The CLBC sensitivity test (Figure 12) gives some indication of the importance of ozone that has been subject to long-range horizontal transport to the total ozone budget. The fact that all the models overpredict the variance of ozone in the surface layer (Figures 9 and 10) is not inconsistent with such a background contribution, since the addition of a (constant) background term would reduce this variance (although it would also increase the bias). 
Further evidence that ozone from the free troposphere affects the surface is found in the observed diurnal cycle of ozone at most urban sites; the ozone that is destroyed by NO titration is replenished each day when the nighttime surface inversion is dispersed in the morning and ozone is mixed down from the residual layer.

5. Conclusions

[49] The availability of the IONS-04 data set, which consists of a sizable number of quasi-daily ozone vertical profiles at 12 sites in North America during a 5-week period in summer 2004, constitutes a valuable new resource for evaluating the performance of complex regional chemical transport models above the surface. Two such models, AURAMS and CHRONOS, both show considerable skill at forecasting boundary layer ozone but have serious discrepancies in the free troposphere on average when compared to the IONS-04 profiles. These findings would not have been attainable from model evaluations based only on surface observations, which is all that is usually available to modelers, and they help to identify areas of individual models that require further work. For example, the analysis presented here suggests that significant improvement in model performance in the free troposphere may be obtained by adding a realistic stratospheric ozone flux term. Work is now in progress to develop a version of GEM with inline chemistry, which will also carry a stratospheric ozone tracer. This should improve the skill of the air quality forecast system.

Acknowledgments

[50] Thanks are due to B. Pabla, M. Sassi, S. Gaudreault, and J. Zhang of Environment Canada for their assistance in developing and running AURAMS and to D. Dégardin of the University of Quebec at Montreal for his work on the AURAMS-BIO simulation. R. Pavlovic and M. Samaali provided the results from the AURAMS CLBC sensitivity test. The provision of a set of evaluations of CHRONOS and AURAMS performance during the ICARTT period by S. McKeen of NOAA in Boulder, Colorado, was also much appreciated. We also thank the many observers who obtained the measurements at the sites used in this study. Their careful work is gratefully acknowledged. We also thank J. Davies and R. Mittermeier, who processed all of the Canadian data in near real-time.
