This paper presents one of the first extensive intercomparisons of models and methods used for estimating stratosphere-troposphere exchange (STE). The study is part of the European Union project Influence of Stratosphere Troposphere Exchange in a Changing Climate on Atmospheric Transport and Oxidation Capacity (STACCATO). Nine different models and methods, including three trajectory methods, one Eulerian method, two Lagrangian and one Eulerian transport model, and two general circulation models, applied the same initialization. Stratospheric and tropospheric tracers have been simulated, and the tracer mass fluxes through the tropopause and the 700 hPa surface have been calculated. For a 12-day case study over Europe and the northeast Atlantic the simulated tracer mass fluxes have been intercompared. For this case the STE simulations show the same temporal evolution and the same geographical pattern of STE for most models and methods, but with generally different amplitudes (up to a factor of 4). For some simulations, however, the amplitudes are also very similar.
 Despite extensive research on stratosphere-troposphere exchange (STE) and its effect on atmospheric chemistry, there are still large uncertainties in the qualitative and quantitative characteristics of STE. Model estimates of the global annual cross-tropopause ozone flux, e.g., range between 400 and 1400 Tg O3 year−1 [Prather et al., 2001]. For several purposes it is important to reproduce the transport processes throughout the upper troposphere and lower stratosphere correctly, e.g., for the assessment of the atmospheric impact of aircraft emissions [Rogers et al., 2002], or for the estimation of the influence of stratospheric ozone on tropospheric chemistry [Roelofs and Lelieveld, 1997]. Recently, Butchart and Scaife  predicted an increase of air-mass exchange between the troposphere and the stratosphere of 3% per decade due to enhanced greenhouse gas concentrations. In order to investigate the impact of enhanced greenhouse gas concentrations on STE it is important to reproduce the involved transport processes correctly. To assess the overall influence of STE on the dynamics and chemistry of the atmosphere, not only is more research on the dynamical and physical details of the exchange processes required, but confidence in the diagnostic methods used also needs to be increased.
 For the estimation of STE a range of methods is in use. On the global scale residual mean mass fluxes in the stratosphere have been calculated, e.g., by Holton  and Rosenlof and Holton  using the downward control principle. Follows  estimated the global cross-tropopause mass flux from the evolution of the budget of CFCs in the troposphere and stratosphere. The net mass transport across the tropopause has been analyzed by Appenzeller et al.  by estimating the global-scale stratospheric meridional circulation and the seasonal mass variation of the stratosphere. Hoerling et al.  made a global analysis of the monthly mean air-mass flux across the tropopause including both the diabatic transport and the quasi-horizontal isentropic transport. The contour advection technique was used by Dethof et al.  to quantify the global quasi-horizontal, isentropic mass transport across the dynamical tropopause due to small-scale filamentation. Other studies calculated isentropic cross-tropopause mass exchange, using a semi-Lagrangian transport model [Chen, 1995] or a two-dimensional model of isentropic turbulence [Hartjenstein, 2000].
 STE has also been investigated for specific events, e.g., for stratospheric intrusions or cut-off lows. To this end, a variety of approaches have been applied. These approaches can be divided into three subgroups: estimates of STE derived from observations of trace gases, from three-dimensional analyzed fields of wind and temperature, and computed from general circulation models (GCMs). Several quantitative estimates of STE have been derived, e.g., from observations of ozone and nitrogen oxides [Danielsen, 1968; Murphy et al., 1993], radioactive tracers [Staley, 1960; Danielsen, 1968] and various other stratospheric constituents that are conserved on a relatively long time scale. Ancellet et al.  combined lidar measurements with trajectories to calculate air-mass exchange for several cases. Gouget et al.  investigated mechanisms for STE and the associated mass flux in a cut-off low with the help of trajectories and MOZAIC (Measurement of Ozone and water vapor by Airbus In-service airCraft) data. The exchange of air-mass between the stratosphere and the troposphere for specific events or longer time periods can also be estimated directly from the three-dimensional fields of wind and temperature that are computed by numerical weather prediction and general circulation models (GCMs). Several studies calculated instantaneous spatial distributions of air-mass exchange using the method described by Wei , e.g., those of Grewe and Dameris , Siegmund et al. , and Gettelman and Sobel . A disadvantage of this method is that it is less reliable, because it suffers from a near cancellation of large terms [Wirth and Egger, 1999]. Spaete et al.  estimated the STE for a single event with a semi-Lagrangian transport model. Recently new methods for the calculation of STE based on trajectory calculations were developed. For instance, Sigmond et al.  and Meloen et al.  applied the Wei formula to trajectory model output.
GCMs are used to study the effect of transport on, e.g., the ozone distribution. Early model estimates of the vertical transport of ozone across the tropopause have been made by Mahlman et al.  and by Gidel and Shapiro . Recently GCMs have been interactively coupled with global chemistry, with which STE events and the associated amount of stratospheric ozone transferred into the troposphere during the event have been estimated [Kentarchos et al., 1999].
 Quantitative comparison of the results of studies on STE with different methods is difficult, because of the use of different time periods and different events. Given the wide range of available methods it is important to perform objective intercomparisons in well-defined circumstances. Some intercomparison studies with a few different models and methods for an individual case have already been performed. Kowol-Santen et al. , for example, implemented a trajectory based method and the method developed by Wei  in a mesoscale model and compared the results of these two methods, which showed good agreement between the net flux values. Wirth and Egger  compared five methods to diagnose STE, three of which were derived from Wei's general formula, one involved the computation of a large number of trajectories, and one evaluated the flux directly as the difference between the motion of the air and the motion of the tropopause. They found that the different methods to quantify STE yield quite different results.
 The present STE intercomparison study adds to the existing studies in three respects. First, the number of applied models and methods is larger; second, the range of the applied models and methods is wider; and finally, the model results are evaluated against measurements in a companion paper [Cristofanelli et al., 2003]. Within the scope of STACCATO (Influence of Stratosphere-Troposphere Exchange in a Changing Climate on Atmospheric Transport and Oxidation Capacity), nine different models and methods to estimate mass exchange have been applied. These models and methods range from trajectory-based analysis to Lagrangian models to GCMs, with different horizontal and vertical model resolutions and different sub-grid scale parameterizations. Although these models and methods are very different, they have all been used in the past to perform similar types of calculations, which makes an intercomparison desirable. To enable such an intercomparison, they were applied in this study to the calculation of idealized tracer mass exchange. This was done for a specific event, from the 26th of May until the 7th of June 1996 over central Europe. In this period a deep stratospheric intrusion with associated STE occurred. Together, the intercomparison setup, the case study episode and the broad range of models and methods provide an appropriate basis for an intercomparison of estimated STE.
 In section 2 a summary is given of the applied models (section 2.1), the intercomparison setup is described (section 2.2) and an outline of the quantification methods is given (section 2.3). Section 3 gives a short overview of the meteorological case for which the intercomparison is performed. In section 4 the results of the model and method intercomparison are described, starting with the flux through the tropopause (section 4.1), followed by the results for the tracer fluxes across the 700 hPa surface (section 4.2). Finally, time versus height plots are presented for the observation station Mt. Cimone (section 4.3). Section 5 provides a discussion and the main conclusions.
2. Intercomparison: Models, Setup, and Methods
 The nine models and methods that are used in this intercomparison are briefly described in Table 1. Three trajectory models have been used (LAGRANTO, FLEXTRA and TRAJKS), two Lagrangian models (FLEXPART and STOCHEM), one Eulerian transport model (TM3) and two GCMs (ECHAM4 and MA-ECHAM4). One method directly used output of the ECMWF model (Wei method). The three trajectory models have been intercompared by Stohl et al. . All these models have been used in previous studies for the simulation of STE. For the present intercomparison all models use ECMWF data. This has the advantage that differences in the results cannot be due to differences in input data. On the other hand, the effect of uncertainties in the input data on the results is not considered in this way.
Table 1. Overview of the Models Used in the Intercomparison^a
^a The input for all models is first guess ECMWF data.
 For the intercomparison a case has been selected that occurred from the 26th of May 00 UTC to the 7th of June 00 UTC 1996. All models and methods have generated output for the region of interest, which extends from 20°N to 70°N and from 20°W to 40°E. Models which require a spin up time (FLEXPART, ECHAM4, MA-ECHAM4 and TM3) are started on the 1st of May 1996.
 All models are initialized as similarly as possible. The tropopause is defined at a potential vorticity (Q) value of 2 pvu (1 pvu = 10⁻⁶ K m² kg⁻¹ s⁻¹). Ideal stratospheric and tropospheric tracers with a mixing ratio of 1 kg/kg are then inserted in the stratosphere and troposphere, respectively. Stratospheric tracers are only inserted above the 700 hPa surface, to exclude tropospheric air with high Q values due to friction or diabatic heating in the boundary layer. If a stratospheric (tropospheric) air parcel crosses the tropopause and enters the troposphere (stratosphere), the tracer mixing ratio decays exponentially with a time constant of 2 days. The stratospheric (tropospheric) tracer mixing ratios are kept at a constant value of 1 kg/kg in the stratosphere (troposphere).
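The initialization rule can be sketched in a few lines. This is a minimal illustration and not the code of any participating model: the function name `initial_tracers`, the use of `>=` at exactly 2 pvu, and the treatment of high-Q air at or below the 700 hPa surface as tropospheric are assumptions made for the example.

```python
TROPOPAUSE_PVU = 2.0    # tropopause defined at Q = 2 pvu (from the text)
TRACER_TOP_HPA = 700.0  # stratospheric tracer only inserted above 700 hPa


def initial_tracers(q_pvu, p_hpa):
    """Return (stratospheric, tropospheric) tracer mixing ratios in kg/kg.

    Assumption: a grid point counts as stratospheric when Q >= 2 pvu and
    its pressure is lower than 700 hPa; everything else is tropospheric.
    """
    in_stratosphere = q_pvu >= TROPOPAUSE_PVU and p_hpa < TRACER_TOP_HPA
    return (1.0, 0.0) if in_stratosphere else (0.0, 1.0)


# Hypothetical grid points: (Q in pvu, pressure in hPa)
lower_stratosphere = initial_tracers(5.0, 200.0)
mid_troposphere = initial_tracers(0.5, 500.0)
boundary_layer_high_q = initial_tracers(3.0, 900.0)
```

The third call shows the purpose of the 700 hPa limit: boundary-layer air with a high Q value receives no stratospheric tracer.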
 A tracer decay has been used, because this gives the atmosphere the possibility to establish an equilibrium between supply and decay of the tracers. Without this decay the troposphere (stratosphere) would slowly fill up with stratospheric (tropospheric) tracer. A decay time of 2 days has been chosen in order to limit the trajectory calculations to 10 days, a period beyond which trajectories become unreliable.
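As a worked illustration of how the 2-day time constant relates to the 10-day trajectory length, the decay can be written out explicitly. The symbols c_0 (mixing ratio at the time of crossing), τ (decay time) and t (time since crossing) are introduced here for the example only.

```latex
\begin{align}
c(t) &= c_0 \, e^{-t/\tau}, \qquad \tau = 2~\text{days} \\
\frac{c(10~\text{days})}{c_0} &= e^{-5} \approx 6.7 \times 10^{-3}
\end{align}
```

Under this decay, a parcel that crossed the tropopause 10 days earlier retains less than 1% of its tracer, so contributions from crossings further back than the trajectory length are small.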
 If the tracer crosses the tropopause several times, different approaches are applied by the different methods. For the trajectory methods (LAGRANTO, FLEXTRA and TRAJKS) no STE takes place before the last crossing. The other models and methods (FLEXPART, Wei method, STOCHEM, TM3, ECHAM4 and MA-ECHAM4) cannot follow air parcels. Therefore, with these models and methods STE can take place every time a certain amount of air passes the tropopause.
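The difference between the two treatments of multiple crossings can be illustrated with a small sketch. The parcel history below is hypothetical, and the helper names are not taken from any of the models; the code only shows how counting every crossing differs from using the last crossing alone.

```python
def crossing_steps(in_stratosphere):
    """Indices i at which the parcel is on a different side of the
    tropopause than at step i - 1."""
    return [i for i in range(1, len(in_stratosphere))
            if in_stratosphere[i] != in_stratosphere[i - 1]]


def last_crossing_step(in_stratosphere):
    """Index of the final crossing, or None if the parcel never crossed."""
    steps = crossing_steps(in_stratosphere)
    return steps[-1] if steps else None


# Hypothetical parcel: stratosphere, brief excursion, then troposphere.
history = [True, True, False, True, False, False]
every_crossing = crossing_steps(history)   # all crossings are registered
last_only = last_crossing_step(history)    # only the final crossing is used
```

For this history, a method that registers every crossing sees three exchange events, whereas a last-crossing method sees one.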
 Because the 700 hPa surface is situated entirely in the troposphere for the considered case study, the stratospheric tracer flux at this pressure surface can be used as a measure of deep STE (see section 4.2). Therefore, every 3 hours upward and downward fluxes of the inserted stratospheric and tropospheric tracers are calculated at this pressure surface. In addition, the vertical velocity and the tracer concentrations at this surface are considered. As a direct measure of STE, the air-mass fluxes through the tropopause have been calculated by several of the models and methods. To highlight the meteorological events, 24-hour running means have been calculated for all model output.
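A 24-hour running mean of 3-hourly output corresponds to averaging over 8 consecutive samples. The sketch below is one possible implementation under that assumption; whether the window was centered or trailing is not stated in the text, so the function simply returns one mean per complete window.

```python
def running_mean(values, window):
    """Mean over every complete window of `window` consecutive samples."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    total = sum(values[:window])
    means = [total / window]
    for i in range(window, len(values)):
        # Slide the window by one sample: add the new value, drop the oldest.
        total += values[i] - values[i - window]
        means.append(total / window)
    return means


SAMPLES_PER_DAY = 8  # 24 h of 3-hourly output

# Small demonstration with a 2-sample window.
demo = running_mean([1, 2, 3, 4], 2)
```

For a 12-day series of 3-hourly values, `running_mean(series, SAMPLES_PER_DAY)` would give the smoothed time series.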
 For ten measurement sites, the models computed vertical profiles of the stratospheric tracer concentration, which will also be intercompared. The complementary study by Cristofanelli et al. [2003] compares the model results from the present study with vertical profile or surface measurements of water vapor, ozone and radionuclides at these measurement sites.
 In Table 2 the methods applied for calculating STE are summarized. The three trajectory-based methods use the trajectory models LAGRANTO, FLEXTRA and TRAJKS. The methods of the flux calculation through the 700 hPa surface applied by LAGRANTO and FLEXTRA are similar. They used the formula F = cw, where F denotes the tracer mass flux (kg m−2 s−1), c the tracer concentration (kg m−3) and w the vertical velocity (m s−1). This expression reproduces more than 90% of the isobaric mass flux [see, e.g., Holton, 1992, section 3.5]. The concentration is determined by means of 10-day backward trajectory calculations, starting on the pressure surface every 3 hours on a 1° × 1° grid. Along each trajectory Q is computed, which allows tropopause crossings to be identified. To calculate the concentration of the stratospheric and tropospheric tracers on the pressure surfaces, it is only necessary to know the time of the last crossing through the tropopause.
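The flux calculation F = cw can be sketched as follows. The conversion from mixing ratio to concentration through an air density and the numerical values used are assumptions for the illustration; the 2-day decay follows the intercomparison setup.

```python
import math

DECAY_DAYS = 2.0  # decay time constant from the intercomparison setup


def stratospheric_concentration(days_since_last_crossing, air_density):
    """Stratospheric tracer concentration (kg m-3) at a tropospheric point.

    days_since_last_crossing is None when the backward trajectory never
    crossed the tropopause; air_density (kg m-3) is a hypothetical input.
    """
    if days_since_last_crossing is None:
        return 0.0
    mixing_ratio = math.exp(-days_since_last_crossing / DECAY_DAYS)
    return mixing_ratio * air_density


def tracer_flux(concentration, w):
    """F = c * w in kg m-2 s-1, with w in m s-1 (positive upward)."""
    return concentration * w


# Illustrative values: crossing 2 days ago, density 0.9 kg m-3, sinking air.
c = stratospheric_concentration(2.0, 0.9)
flux = tracer_flux(c, -0.01)  # negative value means downward flux
```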
Table 2. Overview of the Applied Methods
Surfaces for which flux is calculated
LAGRANTO: 700 hPa + 2 pvu
FLEXPART: 700 hPa + 2 pvu
Method for calculating flux through 700 hPa surface
LAGRANTO: F = cw. Tracer concentration: determined with back trajectories, starting at 700 hPa.
FLEXTRA: F = cw. Tracer concentration: determined with back trajectories, starting at 700 hPa.
FLEXPART: Lagrangian advection of 8 million particles.
STOCHEM: Lagrangian advection of 100,000 particles.
TM3: Direct calculation of flux from vertical transport of tracers.
MA-ECHAM4: Direct calculation of flux from vertical transport of tracers.
ECHAM4: F = (1/g) × ω × mixing ratio
Method for calculating flux through 2 pvu surface
LAGRANTO: Budget study of when and where trajectories pass the tropopause. Trajectories are started in the troposphere and the stratosphere in the entire Northern Hemisphere.
TRAJKS: Trajectory model output used as input for Wei's  equation. Trajectories are started on the tropopause.
FLEXPART: Budget study of particles that crossed the 2 pvu surface.
Wei method: ECMWF model output directly used as input for Wei's  equation.
 With LAGRANTO the flux through the tropopause is also calculated. The approach to obtain fluxes across the 2 pvu surface is identical to the method described by Wernli and Bourqui [2002]. Trajectories are started in the entire Northern Hemisphere every 24 hours on a regular grid with a horizontal (vertical) spacing of 80 km (30 hPa) between 80 and 600 hPa. The air parcels represented by these trajectories are only considered as exchange events if they cross the tropopause within 24 hours and if they have resided for at least 1 day in the stratosphere before crossing the 2 pvu surface and reside at least 1 day in the troposphere after the crossing, or vice versa for troposphere to stratosphere exchange. With this second criterion, parcels that move transiently across the tropopause on short time scales are eliminated. By keeping a budget of when and where the trajectories pass the tropopause, the flux in the domain specified for this case study can be calculated.
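The residence time criterion can be sketched as a simple filter on a parcel's history. The 3-hourly sampling, the boolean representation and the function name are assumptions made for this illustration; it is not the LAGRANTO implementation.

```python
def is_exchange_event(in_stratosphere, cross_index, min_steps):
    """Check a residence time criterion around a tropopause crossing.

    in_stratosphere: booleans along the trajectory, one per time step.
    cross_index: first step after the crossing.
    min_steps: required residence time, in steps, on each side.
    """
    before = in_stratosphere[max(0, cross_index - min_steps):cross_index]
    after = in_stratosphere[cross_index:cross_index + min_steps]
    if len(before) < min_steps or len(after) < min_steps:
        return False  # history too short to verify the residence time
    origin = before[0]
    stayed_before = all(state == origin for state in before)
    stayed_after = all(state != origin for state in after)
    return stayed_before and stayed_after


ONE_DAY = 8  # steps per day, assuming 3-hourly trajectory output

# Hypothetical histories: a lasting exchange and a transient excursion.
lasting = [True] * 8 + [False] * 8
transient = [True] * 8 + [False] * 2 + [True] * 6
```

With `min_steps=ONE_DAY`, the lasting history qualifies as stratosphere to troposphere exchange, whereas the transient excursion is rejected. Passing twice the number of steps would correspond to a 48-hour criterion.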
 With the TRAJKS trajectory model only fluxes through the tropopause are calculated, with a different method compared to LAGRANTO [Meloen et al., 2001]. For the TRAJKS method the equation derived by Wei  is used, with Q as the vertical co-ordinate. Every 3 hours a 48-hour forward and 48-hour backward trajectory is calculated at the tropopause from a regular grid with 1° resolution. In the present study the method used by Meloen et al. [2001] is extended with a residence time criterion [Wernli and Bourqui, 2002]. This means that the flux is calculated only for those air parcels which reside 48 hours in the stratosphere and troposphere before and after the exchange. This is done to eliminate air parcels that move rapidly to and fro across the tropopause. For those air parcels that satisfy the residence time criterion, only the first 12 hours of the trajectories are used as input for the Wei equation, from which the flux is derived.
 In addition to the trajectory methods, two Lagrangian transport models, FLEXPART and STOCHEM, are used in this intercomparison. In these models, 8 million and 100,000 particles are initialized, respectively. Unlike the trajectory models, these models contain parameterizations of sub-grid scale processes. Tracer concentrations are determined by the tracer masses of the particles located within a grid cell. Upward and downward tracer fluxes are determined by keeping a budget of the particle tracer masses crossing the surfaces within 3-hour periods.
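A budget of particle tracer masses crossing a pressure surface might look as follows. This is a sketch under simplifying assumptions: only the particle positions at the start and end of the interval are compared, so repeated crossings within the interval are not counted, and all names and numbers are hypothetical.

```python
def budget_fluxes(particles, surface_hpa, area_m2, dt_s):
    """Upward and downward tracer fluxes (kg m-2 s-1) through a surface.

    particles: iterable of (tracer_mass_kg, p_start_hpa, p_end_hpa).
    """
    up_mass = 0.0
    down_mass = 0.0
    for mass, p_start, p_end in particles:
        if p_start > surface_hpa >= p_end:
            up_mass += mass      # moved to lower pressure: upward
        elif p_start <= surface_hpa < p_end:
            down_mass += mass    # moved to higher pressure: downward
    scale = area_m2 * dt_s
    return up_mass / scale, down_mass / scale


# Hypothetical particles in one grid cell during one interval.
particles = [
    (10.0, 720.0, 690.0),  # rises through 700 hPa
    (4.0, 650.0, 710.0),   # sinks through 700 hPa
    (7.0, 800.0, 750.0),   # stays below the surface
]
up, down = budget_fluxes(particles, 700.0, area_m2=2.0, dt_s=5.0)
```

The net flux is the difference of the two returned values. For 3-hour periods, `dt_s` would be 10800.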
 With the Eulerian Wei method the mass flux across the tropopause has been calculated by applying Wei's equation in isobaric coordinates [Siegmund et al., 1996]. The cross-tropopause flux for the time interval [t0, t0 + 3h] is computed from data at t0 and t0 + 3h. To ensure physical consistency, they are taken from the same ECMWF forecast. For example, the flux for the 9–12 UTC time interval is computed from the 3-hour and 6-hour forecasts based on the analysis at 6 UTC. Fluxes with an amplitude smaller than 0.005 kg m−2 s−1 are considered as noise and are not taken into account.
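The noise filter in the Wei method can be illustrated in one line. Setting rejected values to zero, rather than excluding them from the average in some other way, is an assumption made here.

```python
NOISE_THRESHOLD = 0.005  # kg m-2 s-1, threshold quoted in the text


def mask_noise(fluxes, threshold=NOISE_THRESHOLD):
    """Replace fluxes whose amplitude is below the threshold by zero."""
    return [f if abs(f) >= threshold else 0.0 for f in fluxes]


# Hypothetical local flux values in kg m-2 s-1.
filtered = mask_noise([0.02, -0.001, -0.03, 0.004])
```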
 Besides FLEXPART and STOCHEM, three other global models, TM3, ECHAM4 and MA-ECHAM4, are used to calculate the stratospheric tracer flux through and the concentration on the 700 hPa surface. Like FLEXPART and STOCHEM, TM3 is driven by ECMWF wind fields. ECHAM4 and MA-ECHAM4 are atmospheric general circulation models. TM3 and MA-ECHAM4 calculate the stratospheric tracer fluxes directly from the vertical transport of the tracers. ECHAM4 calculates the stratospheric tracer fluxes through the 700 hPa surface as (1/g) × ω × mixing ratio of the tracer on the pressure surface, where ω is the vertical velocity in pressure coordinates (Pa s−1) and g the acceleration due to gravity (m s−2). With STOCHEM, TM3, ECHAM4 and MA-ECHAM4 the fluxes through the tropopause have not been calculated, because the tropopause is not a predefined model level and rapidly moves up- and downward. As a consequence, interpolation to it would lead to large errors in the computed fluxes.
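The ECHAM4 expression and the formula F = cw used by LAGRANTO and FLEXTRA can be related through the hydrostatic approximation. In the sketch below, ρ (air density) and q (tracer mixing ratio) are symbols introduced for the illustration, and the local pressure tendency and horizontal pressure advection are neglected.

```latex
\begin{align}
\omega &\approx -\rho\, g\, w \\
\frac{1}{g}\,\omega\, q &\approx -\rho\, w\, q = -c\, w, \qquad c = \rho\, q
\end{align}
```

Under these assumptions the two expressions agree in magnitude and differ only in sign convention, because ω is positive for downward motion whereas w is positive for upward motion.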
3. Meteorological Situation
 The period for which the model simulations have been performed is from the 26th of May 00 UTC until the 7th of June 00 UTC 1996. This period has been considered previously in studies by, e.g., Stohl et al. , Eisele et al.  and Bonasoni et al. . Extensive observations made in this period indicate a deep stratospheric intrusion with associated STE. Therefore, this period is an attractive case for this model intercomparison. During this period several weather systems developed and decayed within the region of study, as described by Stohl et al. . An example is shown in Figure 1. Q on the 320 K isentropic surface on the 28th of May 1996 12 UTC shows a southward extrusion of stratospheric air over central Europe (Figure 1a). A cross-section through this stratospheric filament (Figure 2) reveals the considerable depth of the extrusion into the troposphere. During the next day, the extrusion transformed into a cut-off low (Figure 1b), which decayed rapidly over the eastern Mediterranean (Figure 1c).
4. The Model and Method Intercomparison
 In this section the results of the intercomparison are described. In section 4.1 the computed cross-tropopause mass fluxes are presented. In section 4.2 the results for the 700 hPa surface are considered by discussing the time series of the domain-integrated stratospheric tracer concentration (4.2.1), the tropospheric tracer flux (4.2.2), the stratospheric tracer flux (4.2.3) and the latitude/longitude fields of the stratospheric tracer flux (4.2.4). Finally, in section 4.3 time versus height plots are shown for the station Mt. Cimone (44°N, 10.5°E, 2165 m asl).
4.1. Cross-Tropopause Mass Fluxes
 A direct measure of STE is the air-mass flux through the tropopause, which is defined in this study as the 2 pvu surface. Unfortunately only four of the nine models are able to calculate the air-mass flux through the 2 pvu surface. From Figure 3, which displays the net and the separate up- and downward fluxes averaged over the considered domain, it can be seen that TRAJKS and LAGRANTO give very similar results. Although both methods use ECMWF data and trajectories, this similarity is quite surprising since the applied methods are very different. The LAGRANTO method starts trajectories through almost the entire troposphere and stratosphere and keeps a budget of trajectories passing the tropopause, whereas the TRAJKS method only starts trajectories on the tropopause and uses the potential vorticity along the trajectories to solve the Wei formula which gives the air-mass exchange through the tropopause.
 The Eulerian Wei method and the FLEXPART model show the same temporal evolution as LAGRANTO and TRAJKS, but with a larger amplitude. This is most clearly illustrated by the up- and downward tracer fluxes through the tropopause in Figure 3b. FLEXPART in particular, whose results have been multiplied by 0.1 in Figure 3b, yields up- and downward mass fluxes about ten times larger than those of the three other methods. The difference from the LAGRANTO and TRAJKS methods is that no residence time criterion is applied in FLEXPART. James et al.  found that most of the air parcels that cross the tropopause return to their original reservoir within 24 hours. Thus, if these fluxes had been excluded, the FLEXPART results would likely have been more similar to the LAGRANTO and TRAJKS results. Furthermore, boundary layer and convective fluxes are included in the FLEXPART model, possibly explaining the larger FLEXPART net fluxes.
4.2. Tracer Concentrations and Fluxes at the 700 hPa Surface
 The 700 hPa surface has been chosen for the intercomparison of the models and methods because in the considered case it is entirely located in the troposphere (compare Figure 2). The 700 hPa stratospheric tracer concentration and flux are, therefore, indirect measures of deep STE. Pressure surfaces at altitudes higher than 500 hPa are in the considered case situated partly in the stratosphere and partly in the troposphere. On these levels differences in the fluxes between the models can therefore not only be attributed to differences in STE, but also partly to differences in tropopause height between the models. When comparing the 700 hPa stratospheric tracer results with results for pressure surfaces at higher altitudes, it is found that the different model results are more similar for the higher altitude pressure surfaces. This is as expected, because these surfaces are closer to the stratospheric tracer's source region, and because they are situated partly in the stratosphere. For the 700 hPa surface, the obtained stratospheric tracer concentrations and fluxes are entirely due to STE and the subsequent transport of the stratospheric tracer down into the lower troposphere.
4.2.1. Time Series of the Stratospheric Tracer Concentration
 The time series of the domain-averaged stratospheric tracer concentration at 700 hPa is shown for several models in Figure 4. It can be seen that MA-ECHAM4 and ECHAM4 (ECHAM4 results have been multiplied by 0.5) give much larger stratospheric tracer concentrations than the other models. This might be partly due to numerical diffusion. Especially in the presence of large gradients, as is the case for the stratospheric tracer at the tropopause level, these models tend to decrease the gradient by numerical diffusion, bringing stratospheric tracer into the troposphere. It is then rapidly transported throughout the troposphere by vertical motions. An increased horizontal and vertical model resolution reduces the numerical diffusion. The tracer advection scheme used in ECHAM4 is more diffusive than the scheme used by MA-ECHAM4. On the other hand, ECHAM4 has a finer horizontal resolution than MA-ECHAM4. Because the stratospheric tracer concentrations at 700 hPa are much larger for ECHAM4 than for MA-ECHAM4, the difference in numerical diffusion is assumed to be mainly due to differences in the tracer advection scheme. The higher vertical resolution of MA-ECHAM4 around the tropopause might also have contributed to the lower numerical diffusion. TM3 also suffers from numerical diffusion, but its tracer advection scheme is less diffusive than the scheme used by MA-ECHAM4. The other models and methods only suffer from weak numerical diffusion.
 Another reason why the ECHAM4 and MA-ECHAM4 models give much larger stratospheric tracer concentrations might be that these models are on-line, whereas the other models are off-line. The off-line models use the ECMWF wind fields every 6 or 3 hours, and interpolate between these two values to obtain the wind field in between. ECHAM4 and MA-ECHAM4, on the other hand, are nudged by the 6-hourly ECMWF meteorology but produce wind fields every time step. This means that the vertical wind in the on-line models can display a larger variability than the interpolated, hence smoother, vertical wind in the off-line models. As a result of these more fluctuating winds the transport of air across the tropopause can be larger in the GCMs, and the stratospheric tracer is expected to be transported faster throughout the troposphere, leading to larger stratospheric tracer concentration at 700 hPa.
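The smoothing effect of temporal interpolation can be illustrated with a synthetic example. The hourly values below are invented and do not represent any model output; the sketch only shows that a linearly interpolated series cannot exceed the values at its end points, so variability between the input times is lost.

```python
def interpolate_series(samples, steps_between):
    """Linearly interpolate between consecutive samples."""
    out = []
    for a, b in zip(samples, samples[1:]):
        for k in range(steps_between):
            out.append(a + (b - a) * k / steps_between)
    out.append(samples[-1])
    return out


# Synthetic hourly vertical wind (Pa/s), invented for illustration only.
hourly = [0.1, 0.3, -0.4, 0.5, -0.2, 0.1, -0.1]
six_hourly = [hourly[0], hourly[-1]]         # what an off-line model receives
rebuilt = interpolate_series(six_hourly, 6)  # hourly values after interpolation

peak_fluctuating = max(abs(v) for v in hourly)
peak_interpolated = max(abs(v) for v in rebuilt)
```

In this example the interpolated series never exceeds the magnitude of its two end points, whereas the fluctuating series peaks at five times that value.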
 The nudging is not expected to bias the stratospheric tracer concentration on the 700 hPa surface. Although the tendency introduced by the nudging is not physical, the perturbation of the model's physical balance is smaller than the physical tendencies. Therefore, it is assumed that the nudging reproduces the observed meteorology without introducing substantial noise [Jeuken et al., 1996].
4.2.2. Time Series of the Tropospheric Tracer Flux
 In the troposphere the tropospheric tracer flux gives an indication of the vertical velocity in the models. An accurate vertical velocity is necessary for a correct representation of STE in general, and for a correct representation of the stratospheric tracer fluxes. In Figure 5 the time series of the domain-averaged net (a) and up- and downward tropospheric tracer fluxes (b) at 700 hPa are shown. As can be seen in Figure 5a, the net tropospheric tracer fluxes computed by the different models are qualitatively similar, but the absolute values differ by about a factor of two or three. The net tropospheric tracer flux for TM3 has relatively low, and even negative, values. The differences between the models and the methods might be due to, e.g., differences in the vertical velocity ω, different parameterizations (e.g., convective parameterizations in FLEXPART) or the different methods to calculate the tropospheric tracer fluxes.
 In Figure 6 the latitude/longitude fields of ω at a particular time are shown for some of the models and methods. As expected, the models with a fine resolution (i.e. ECHAM4, TM3 and LAGRANTO) show more small-scale variability than the models with a relatively coarse resolution (MA-ECHAM4 and STOCHEM). It can also be seen that the geographical pattern of ω is not the same for the different models and methods. For example, over Scandinavia ω is upward for TM3 whereas it is downward for ECHAM4. Such differences in ω imply differences in the calculated stratospheric and tropospheric tracer fluxes.
 The net tropospheric tracer flux (Figure 5a) is a residual of relatively large up- and downward fluxes (Figure 5b). FLEXPART in particular displays very large up- and downward fluxes compared to the net flux and also compared to the up- and downward fluxes of the other models. Probably, these large FLEXPART fluxes arise because in some parts of the domain the 700 hPa surface lies within the boundary layer. In the boundary layer of the FLEXPART model the particles are rapidly transported up- and downward by the parameterized turbulent eddies, causing larger up- and downward fluxes than if the effects of boundary layer turbulence had been neglected or treated as a grid box average. The absolute differences between the other models are about a factor of three, with LAGRANTO, FLEXTRA, FLEXPART and MA-ECHAM4 giving relatively large net tropospheric tracer fluxes, and the others relatively small fluxes.
4.2.3. Time Series of the Stratospheric Tracer Flux
 In Figure 7 the time series of the domain-averaged net (panel a) and up- and downward (panel b) stratospheric tracer fluxes at 700 hPa are shown. The temporal evolution in the stratospheric tracer fluxes is more comparable to the stratospheric tracer concentration (Figure 4) than to the tropospheric tracer fluxes (Figure 5). The differences between the fluxes and the concentration of the stratospheric tracer are entirely due to differences in the vertical velocity.
 Unlike the results of FLEXPART for the cross-tropopause flux in Figure 3 and the tropospheric tracer flux in Figure 5, the up- and downward stratospheric tracer fluxes of FLEXPART are comparable to those of the other models. Comparing FLEXTRA and FLEXPART, the latter being an extension of the former, it can be seen that FLEXPART gives in general larger fluxes than FLEXTRA. This is as expected, because in FLEXPART more physical processes are included, such as boundary layer turbulence, which enhance the stratospheric and tropospheric tracer fluxes. STOCHEM produces relatively small stratospheric tracer fluxes, probably because in STOCHEM the number of air parcels is relatively small, leading to only a small chance of a parcel crossing the 700 hPa surface in a grid cell in a 3-hour time period, and, consequently, to a small air mass flux across this surface.
 Comparing the time series of the stratospheric and tropospheric tracer fluxes it can be seen that they show signatures of various events. The tropospheric tracer flux (Figure 5a) has a maximum around 36 h and a minimum around 96 h, whereas the reverse holds for most of the modeled stratospheric tracer fluxes (Figure 7a). At the time of the intrusion of stratospheric air into the troposphere (Figure 1a, t = 60 hours), the net tropospheric tracer fluxes at 700 hPa are enhanced (Figure 5a). The net stratospheric tracer flux at 700 hPa (Figure 7a) becomes large only when the low has been cut off and is decaying (Figure 1c, t = 84 hours). In the latter situation the tropospheric tracer flux is reduced. TRAJKS, which was used to calculate the flux through the tropopause, displays the same features, i.e. less exchange at the time of the intrusion of stratospheric air into the troposphere and more transport from the stratosphere to the troposphere at the time and place of the decaying cut-off low (not shown).
4.2.4. Latitude/Longitude Fields of the Stratospheric Tracer Flux
Figure 8 shows the latitude/longitude fields of the stratospheric tracer flux through the 700 hPa surface on the 29th of May 1996 9–12 UTC. Here the differences between the GCMs and the other models and methods can be clearly seen. The fluxes of the stratospheric tracer in ECHAM4 and MA-ECHAM4 are larger than those in the other models and methods, as explained in section 4.2.1. The patterns are similar for all models and methods except for STOCHEM where the flux equals zero in most regions. This is probably because in STOCHEM the number of air parcels is relatively small, leading to only a small chance of a parcel crossing the 700 hPa surface in a grid cell in a 3-hour time period, and, consequently, to a small air mass flux across this surface.
4.3. Time Versus Height Plots of the Stratospheric Tracer Concentration
 In Figure 9 the time versus height plots of the stratospheric tracer concentration above the station Mt. Cimone are shown for the different models and methods. In this figure the tropopause heights, which correspond to where the gradient of stratospheric tracer concentration is largest, are at about the same altitudes in all models. The plots display some similar features, i.e., enhanced stratospheric tracer concentrations in the first 24 hours and an intrusion of stratospheric air into the troposphere around 48 hours. However, the penetration depth into the troposphere differs between the models. All models and methods except STOCHEM show a pattern of high stratospheric tracer concentration around 96 hours at an altitude of about 600 hPa. At the end of the period all models and methods again show a slightly increased stratospheric tracer concentration, whose duration and penetration depth into the troposphere again differ among the models and methods.
 In Figure 9 it can be seen that the ability to capture the intrusion depends strongly on the model resolution. STOCHEM, which has the coarsest resolution, hardly shows the intrusion, and MA-ECHAM4 shows a broader intrusion than ECHAM4.
 The similarity in spatial pattern between LAGRANTO and FLEXTRA is not surprising, since the applied methods are nearly identical. Both show a smaller amount of stratospheric tracer in the troposphere and very localized patches of stratospheric tracer in the troposphere. This is probably because these are the only two methods that, in case of multiple tropopause crossings, only allow STE after the last tropopause crossing.
 ECHAM4 and MA-ECHAM4 show larger stratospheric tracer concentrations in the troposphere than the other models and methods. As already mentioned in section 4.2.1, this is probably due to numerical diffusion and more varying vertical winds.
5. Discussion and Conclusions
 This paper presents one of the first extensive case study intercomparisons of models and methods used for estimating stratosphere-troposphere exchange (STE). In the present study the number and range of applied models and methods is relatively large. Also, the model and method results are evaluated with measurements in a companion paper by Cristofanelli et al. [2003]. The intercomparison has been performed in the framework of the EU project STACCATO with nine different models and methods. To this end, for all models and methods an idealized stratospheric tracer is inserted in the stratosphere and an idealized tropospheric tracer in the troposphere, both with a mixing ratio of 1 kg/kg. When a tracer leaves its source region (the stratosphere or the troposphere, respectively), it decays exponentially with a time constant of 2 days. Three trajectory methods (LAGRANTO, FLEXTRA and TRAJKS), one Eulerian method (the Wei method), two Lagrangian transport models (FLEXPART and STOCHEM), one Eulerian transport model (TM3) and two nudged GCMs (ECHAM4 and MA-ECHAM4) participated in this intercomparison.
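The tracer decay described above can be illustrated with a minimal sketch. It assumes plain first-order decay, dC/dt = -C/τ with τ = 2 days, acting from the moment a tracer is outside its source region. The function name and the sample times are illustrative and are not taken from any of the participating models.

```python
import math

TAU_HOURS = 48.0  # e-folding time of 2 days, as stated in the text


def remaining_fraction(hours_outside: float, tau_hours: float = TAU_HOURS) -> float:
    """Fraction of tracer left after `hours_outside` hours outside its source region."""
    if hours_outside < 0:
        raise ValueError("hours_outside must be non-negative")
    return math.exp(-hours_outside / tau_hours)


if __name__ == "__main__":
    # With an initial mixing ratio of 1 kg/kg the fraction equals the mixing ratio.
    for hours in (0, 24, 48, 96):
        print(f"{hours:3d} h outside source region: {remaining_fraction(hours):.3f} kg/kg")
```

After one time constant (48 hours) the mixing ratio has dropped to 1/e, roughly 37% of its initial value, so on the time scale of the 12-day case study only recently exchanged tracer contributes noticeably.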
 For a correct representation of STE several processes need to be included in the models. For a correct representation of the spatial and temporal distribution of STE, synoptic-scale weather systems, like tropopause foldings and cut-off lows, need to be correctly represented. Another important quantity is the vertical velocity, which is affected by diabatic processes, especially the release of latent heat, and by turbulent mixing in the tropopause region. For the models and methods in which the dynamics entirely depend on ECMWF data (i.e. all models and methods except the two GCMs), these processes are captured by the ECMWF analyses. In addition, the ECMWF model has a high horizontal (0.5° × 0.5°) and vertical (31 model levels) resolution and the utilized analyses are based on observations, which makes the ECMWF model an appropriate source of input data for the models and methods that are used to calculate STE. For GCMs the resulting representation of the spatial and temporal distribution of STE strongly depends on the chosen horizontal and vertical resolution, as is indicated by the differences between the ECHAM4 and MA-ECHAM4 results.
 Apart from physical processes that can lead to STE, there are also several artificial sources of STE. In the ECMWF model the addition of measurement data disturbs the dynamical consistency of the model at every analysis time step. In the GCMs there is strong numerical diffusion, which restricts the reliability of the results. The nudging of the GCMs can also be an artificial source of STE. However, the perturbation of the model's physical balance has been found to be much smaller than the physical tendencies. Therefore, this artificial source of STE is assumed to be small [Jeuken et al., 1996].
 The goal of this intercomparison study is to intercompare a wide range of models and methods, in order to learn more about the advantages and disadvantages of each model and method. It should be realized that this intercomparison does not draw unambiguous conclusions, because the number of differences between the models and methods is too large to attribute the differences in results to a single model/method difference. Ideally, an intercomparison as performed in this study should be accompanied by an additional intercomparison in which the method of flux calculation, the horizontal and vertical resolution, the treatment of multiple tropopause crossings, etc. are as similar as possible. An intercomparison like the present one, with the commonly used (differing) model and method characteristics, can then be used to draw conclusions on the normally used model and method characteristics. An overall comment is that a more realistic tracer distribution would have prevented difficulties with numerical diffusion in some of the models and would have facilitated the intercomparison of the results with measurements [Cristofanelli et al., 2003].
 Nevertheless, we have tried to draw some conclusions from the results found in this study. The results indicate that the ECHAM4 and MA-ECHAM4 results are influenced by a larger numerical diffusion, especially in the vicinity of large tracer gradients, and by a larger variability of the vertical winds. Both effects might be partly due to the applied intercomparison setup, because the stratospheric tracer was inserted with a mixing ratio of 0 kg/kg in the troposphere and a mixing ratio of 1 kg/kg in the stratosphere. Therefore, the stratospheric tracer gradient in the vicinity of the tropopause is very large, and consequently, both the numerical diffusion near the tropopause and the additional exchange due to the more varying vertical winds will be very large. In practice, tracers do not have such a sharp gradient. Ozone, for example, is initialized a few model levels above the tropopause, with a concentration that gradually increases going up in the stratosphere. Therefore, the numerical diffusion and more varying vertical winds will not have such a large impact on the ozone transport into the troposphere. Simulations with GCMs of the exchange of ozone are therefore expected to be more realistic than the simulation of the exchange of the stratospheric tracer in this study, which was also found by Cristofanelli et al. [2003].
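The sensitivity of numerical diffusion to the sharpness of the tracer gradient can be illustrated with a toy one-dimensional experiment. The sketch below uses a first-order upwind scheme as a simple stand-in. It is not the advection scheme of ECHAM4 or MA-ECHAM4, and the grid size, Courant number, number of steps and both initial profiles are hypothetical choices made only for illustration. A 0-to-1 step and a gradual ramp are advected by the same distance, and each result is compared with the exactly shifted initial profile.

```python
def upwind_advect(profile, courant, steps):
    """Advect a 1-D profile towards higher indices with a first-order upwind scheme.

    The value in cell 0 is held fixed (inflow boundary).
    """
    values = list(profile)
    for _ in range(steps):
        previous = values[:]
        for i in range(1, len(values)):
            values[i] = previous[i] - courant * (previous[i] - previous[i - 1])
    return values


def shifted(profile, cells):
    """Exact solution of pure advection: the initial profile moved by `cells` cells."""
    return [profile[0]] * cells + list(profile[: len(profile) - cells])


def l1_error(a, b):
    """Sum of absolute differences between two profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical setup: index 0 is the "stratospheric" end of the column.
N_CELLS = 60
COURANT = 0.5
STEPS = 20
SHIFT = int(COURANT * STEPS)  # exact displacement in grid cells

# Sharp 1 -> 0 step, as in the idealized tracer initialization.
step_profile = [1.0 if i < 20 else 0.0 for i in range(N_CELLS)]
# Gradual transition over 10 cells, centred on the same location.
ramp_profile = [min(1.0, max(0.0, (25 - i) / 10)) for i in range(N_CELLS)]

step_error = l1_error(upwind_advect(step_profile, COURANT, STEPS),
                      shifted(step_profile, SHIFT))
ramp_error = l1_error(upwind_advect(ramp_profile, COURANT, STEPS),
                      shifted(ramp_profile, SHIFT))

print(f"spurious spreading, sharp step:   {step_error:.3f}")
print(f"spurious spreading, gradual ramp: {ramp_error:.3f}")
```

In this toy setting the sharp step accumulates a larger error than the gradual ramp, in line with the expectation that a tracer with a smoother distribution near the tropopause is less affected by numerical diffusion.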
 From the results of the present study it is not possible to conclude which of the applied models or methods simulated the most realistic STE. To this end, the simulations are compared with measurements in the complementary study by Cristofanelli et al. [2003]. Nevertheless, the results from the present study give some insight into the dependence of the simulated STE on several aspects of the models and methods, such as the spatial resolution or the strength of the numerical diffusion. A high horizontal and vertical resolution is important for a correct spatial and temporal distribution of STE, as has already been shown by van Velthoven and Kelder. This can be seen by comparing ECHAM4 and MA-ECHAM4, the latter having a coarser horizontal resolution (but a slightly higher vertical resolution in the vicinity of the tropopause), and by comparing STOCHEM with the other models, STOCHEM having a relatively coarse resolution.
 LAGRANTO, FLEXTRA, TRAJKS and FLEXPART all show the same aspects of STE in the considered time period, with only slightly differing amplitudes. The results for the air mass flux through the tropopause for TRAJKS and LAGRANTO are almost identical. That these four models show similar results is perhaps not surprising, because they are all based on trajectories, and LAGRANTO and FLEXTRA are even identical apart from the applied trajectory model and the temporal resolution of the input data. LAGRANTO, FLEXTRA, TRAJKS and FLEXPART also share the same, fairly high resolution, which favors similar results. TM3 shows an amplitude similar to that of these models, but has a slightly different pattern and temporal evolution. STOCHEM shows the same temporal evolution but has an amplitude that is two to three times smaller than that shown by the majority of the other models and methods. This is due to the relatively coarse resolution of the model, which leads to smaller vertical transport and mixing. The crude meteorological assimilation scheme used in this model may also be responsible for the underestimation, since an improved assimilation scheme gave results that were more in line with the results of the other models (not shown).
 In conclusion, for the period and region considered in the present study the STE simulations with nine different models and methods show the same temporal evolution and the same geographical pattern of STE, but with generally different amplitudes. On the other hand, for some simulations the amplitudes are also very similar. However, any model estimate of STE should be confronted with observations. This is presented in the companion paper by Cristofanelli et al. [2003].
 This study is funded by the EU project STACCATO (EVK2-1999-00316). We would like to thank the reviewers of this manuscript.