Monthly zonal mean climatologies of atmospheric measurements from satellite instruments can have biases due to the nonuniform sampling of the atmosphere by the instruments. We characterize potential sampling biases in stratospheric trace gas climatologies of the Stratospheric Processes and Their Role in Climate (SPARC) Data Initiative using chemical fields from a chemistry climate model simulation and sampling patterns from 16 satellite-borne instruments. The exercise is performed for the long-lived stratospheric trace gases O3 and H2O. Monthly sampling biases for O3 exceed 10% for many instruments in the high-latitude stratosphere and in the upper troposphere/lower stratosphere, while annual mean sampling biases reach values of up to 20% in the same regions for some instruments. Sampling biases for H2O are generally smaller than for O3, although still notable in the upper troposphere/lower stratosphere and Southern Hemisphere high latitudes. The most important mechanism leading to monthly sampling bias is nonuniform temporal sampling, i.e., the fact that for many instruments, monthly means are produced from measurements which span less than the full month in question. Similarly, annual mean sampling biases are well explained by nonuniformity in the month-to-month sampling by different instruments. Nonuniform sampling in latitude and longitude is also shown to lead to nonnegligible sampling biases, which are most relevant for climatologies that are otherwise free of biases due to nonuniform temporal sampling.
 Stratospheric trace gas observations are often used to produce monthly zonal mean data sets, or climatologies [e.g., von Clarmann et al., 2012; Grooß and Russell, 2005; Hassler et al., 2009; Jones et al., 2012; Randel and Wu, 2007]. Monthly zonal mean data products are typically used as prescribed forcing for models [e.g., Cionni et al., 2011] and are also useful for comparison with similarly averaged chemistry climate model output [e.g., Grewe et al., 2012; SPARC CCMVal, 2010].
 Monthly zonal mean data sets can be constructed through a variety of methods. A simple method involves binning observations into latitude and month bins and calculating average values for each bin. This is the method used by the Stratospheric Processes and Their Role in Climate (SPARC) Data Initiative [Hegglin and Tegtmeier, 2011; M. I. Hegglin et al., SPARC Data Initiative: Comparison of trace gas and aerosol climatologies from international satellite limb sounder, manuscript in preparation, 2013], which aims to produce such climatologies for a number of stratospheric trace gas measuring limb sounding instruments to be used in model-observation comparisons and help guide future data merging activities.
 Each observation-based climatology is an estimate of the true atmospheric mean state; however, differences between the observation-based climatology and the truth may arise for a number of reasons. In many cases, the largest source of error in the climatology is due to errors in the measurements themselves. These errors are present in individual measured profiles and can be best estimated through careful comparison of coincident profiles. However, the construction of climatologies may itself introduce additional errors. The choice of averaging technique can lead to differences in the produced climatologies [Funke and von Clarmann, 2012]. The random error in a climatological mean due to the finite and potentially nonuniform sampling of an atmospheric field was investigated by Toohey and von Clarmann. Here we investigate how such sampling issues may lead to systematic errors, or biases, in atmospheric climatologies.
 Sampling bias is an error in a computed quantity which arises due to unrepresentative sampling of the population. In the estimation of an atmospheric mean value, sampling bias may occur when the atmospheric state within the time-space domain to be averaged over is not uniformly sampled. The magnitude of sampling bias will also be related to the degree and structure of the variability: If the atmospheric variability is weak, then nonuniformity of sampling will not strongly bias the sample mean compared to the population mean. Conversely, when variability is strong, then nonuniform sampling may lead to significant sampling bias.
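The dependence of sampling bias on both the sampling pattern and the variability of the field can be illustrated with a minimal sketch. The field values, the 30-day month, and the function name `sampling_bias_percent` are hypothetical and chosen only for illustration; they are not taken from the model or instruments used in this study.

```python
from statistics import mean

def sampling_bias_percent(field_by_day, sampled_days):
    """Percent difference between the sample mean and the full-month mean."""
    population_mean = mean(field_by_day)
    sample_mean = mean(field_by_day[d] for d in sampled_days)
    return 100.0 * (sample_mean - population_mean) / population_mean

# Hypothetical 30-day month: one field rises linearly, the other is constant.
days = range(30)
trending = [1.0 + 0.01 * d for d in days]
constant = [1.0 for _ in days]

first_five_days = list(range(5))   # nonuniform temporal sampling
all_days = list(days)              # uniform temporal sampling

bias_trend_partial = sampling_bias_percent(trending, first_five_days)
bias_trend_full = sampling_bias_percent(trending, all_days)
bias_const_partial = sampling_bias_percent(constant, first_five_days)
```

In this toy case, sampling only the first five days of the trending field gives a negative bias of roughly 11%, while the same sampling of the constant field, or complete sampling of the trending field, gives no bias.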
 The impact of sampling and averaging on observation-based climatologies of tropospheric fields such as temperature, clouds, and chemical trace gases has been examined by a number of prior studies [e.g., Aghedo et al., 2011; Engelen et al., 2000; Guan et al., 2013]. The central technique of these studies is to subsample climate model or reanalysis fields based on the sampling patterns of the instruments of interest and then to quantify differences between the fully resolved raw fields and the instrument-sampling-pattern-based sampled fields.
 In the present study, we use a technique similar to that of the works cited above, making use of chemistry climate model output, and sampling the model fields based on the sampling patterns of a suite of satellite instruments. However, we focus on instruments measuring stratospheric trace gas species, many of which have smaller sample density than the (often nadir viewing) instruments investigated in previous studies focusing on tropospheric measurements. Furthermore, we focus on the effect of averaging within 5° latitude bands, which is a coarser horizontal resolution than that typically used in prior studies. We use sampling patterns for 16 instruments, all of which are participants in the SPARC Data Initiative. We apply the sampling exercise to modeled O3 and H2O fields and characterize the structure and magnitude of the potential sampling biases associated with the sampling pattern of each instrument for the two chemical species. The results of this study should be used to help interpret instrumental climatology comparisons within the SPARC Data Initiative for O3 and H2O (Tegtmeier et al. and Hegglin et al., respectively) and to inform future use of the climatologies for other purposes such as model comparisons.
2 Data and Methods
2.1 Instrumental Sampling Patterns
 Instruments participating in the SPARC Data Initiative and which are used in the present study are listed in Table 1, and include the limb emission sounders Aura MLS, HIRDLS, MIPAS, SMR, SMILES and UARS MLS, the limb scattering sounders SCIAMACHY and OSIRIS, the solar occultation instruments ACE-FTS, HALOE, POAM II, POAM III, SAGE II and SAGE III as well as the stellar occultation instrument GOMOS. We have also included the nadir-viewing TES emission sounder, which is used in the SPARC Data Initiative evaluation of upper troposphere/lower stratosphere (UTLS) ozone observations (J. L. Neu et al., The SPARC Data Initiative: Comparison of upper troposphere/lower stratosphere ozone climatologies from limb-viewing instruments and the nadir-viewing Tropospheric Emission Spectrometer (TES), submitted to Journal of Geophysical Research, 2013). TES has much coarser vertical resolution (~6–7 km versus ~2–4 km) but much higher horizontal resolution (~10 km versus ~200 km) than the other instruments in this study. Detailed information on the individual instruments including their sampling patterns and retrieval techniques can be found in the SPARC Data Initiative report (section 2) and Hegglin et al. (manuscript in preparation, 2013).
Table 1. Participating Instruments With the Time Period Used to Define the Sampling Pattern for Each Instrument
[Table body not recovered; surviving column header: Sample Reference Period]
Table 1 note: The MIPAS sampling pattern used here refers to the nominal reduced spectral resolution measurement mode, which was the regular MIPAS measurement mode from 2005 to 2012.
 Sampling patterns have been compiled for each instrument, defined as day, time, latitude, and longitude of measurement locations. For most instruments, a typical year of actual sampling locations has been used in the analysis, rather than, for instance, a time series of all possible measurements (which may differ for reasons such as data download limitations). The particular years that were used to define each instrument's sampling pattern are included in Table 1. Year-to-year variations in sampling patterns due to such factors as changing orbital states, changing instrument capabilities, or irregular data gaps (which may be significant for a number of instruments) are not addressed by this study. For Aura MLS, HIRDLS, MIPAS, and TES, month-to-month variations in sampling are generally negligible, and we have therefore used a typical month from their sampling pattern and repeated this for all months of the year. For SMR, sampling patterns are defined separately for O3 and H2O, based on locations of retrieved profiles from the 501.8 GHz and 488.9 GHz measurement bands, respectively. (In addition to the 488.9 GHz band H2O retrievals, which extend from 20 to 70 km, SMR also produces climatologies of H2O from ~16 to 20 km through measurements of the 544.6 GHz band [Hegglin et al., 2013]; however, these measurements are not considered here.) For SMILES, the sampling patterns correspond to the samples over the full lifetime of the mission, from October 2009 to April 2010.
 Monthly sample counts within the 5° bins of the SPARC Data Initiative grid are shown for the instrumental sampling patterns in Figure 1. The different sample patterns of remote sounders of the stratosphere lead to substantial differences in the sample density patterns shown in Figure 1. Solar occultation instruments typically have the sparsest sample density, and therefore the smallest monthly sample counts, because their number of measurements is limited to a maximum of two per orbit. The latitudinal coverage of solar occultation instruments varies with time, leading to samples concentrated within finite latitude ranges (usually one in the Northern Hemisphere (NH) and one in the Southern Hemisphere (SH)) each month and also leaving unsampled latitudes each month. Solar occultation instruments with high inclination orbits (~57 to 74°, e.g., ACE-FTS, HALOE, and SAGE II) have sampling which spans nearly the full globe over a year. Other solar occultation instruments (POAM II, POAM III, and SAGE III) with Sun-synchronous orbits have sample ranges concentrated in the middle to high latitudes. Climatologies produced for the stellar occultation instrument GOMOS use only night measurements and therefore exclude measurements from the polar summer. Instruments which measure scattered sunlight (SCIAMACHY, OSIRIS) have denser sampling but are restricted to measuring within the sunlit portion of the atmosphere, leading to a smaller number of samples in the winter hemisphere, which may be just the polar latitudes, or a larger portion of the winter hemisphere depending on the pointing of the instrument. Instruments which measure atmospheric emission (Aura MLS, HIRDLS, MIPAS, SMILES, SMR, TES, and UARS MLS) do not have such constraints related to the radiation source, and thus typically have sampling coverage that is denser than solar occultation and limb scattering instruments and relatively uniform with latitude and time. 
An exception to the latter case is the sampling pattern of UARS MLS, which produced measurements spanning 34° on one side of the equator to 80° on the other side, with hemispheric coverage switching when the satellite performed a 180° yaw maneuver 10 times per year, at approximately 36 day intervals.
2.2 Model Fields
 Chemical fields are taken from an integration of the Whole Atmosphere Community Climate Model version 3 (WACCM3), a fully coupled chemistry-climate model, spanning the range of altitude from the Earth's surface to the thermosphere [Garcia et al., 2007]. The particular version of the model used here (3.4.58) is essentially the same as that used for the second Chemistry-Climate Model Validation Activity [Morgenstern et al., 2010; SPARC CCMVal, 2010]. The model's horizontal resolution is 1.9° by 2.5° (latitude by longitude). WACCM has been extensively evaluated as part of the SPARC CCMVal report [SPARC CCMVal, 2010]. It simulates the distribution and variability of O3 and H2O well in the stratosphere [SPARC CCMVal, 2010, chapter 9] and UTLS [Gettelman et al., 2010; Hegglin et al., 2010]. We use here model output with daily resolution at 0 UTC from 1 year of a transient simulation under modern conditions.
 Annual zonal mean mixing ratios for O3 and H2O are shown in Figure 2. Since sampling biases result from nonuniform sampling of a varying field, we also show for reference a measure of “average variability” for the WACCM fields, calculated by taking the standard deviation of the daily fields for each month and averaging over the 12 calendar months. The monthly standard deviations are calculated over all model values for each latitude and pressure level, i.e., over all times and longitudes, and as such represent the magnitude of the variability due to temporal progression of the seasonal cycle within any month and that due to dynamically induced disturbances to zonal symmetry. Maximum variability of O3 is seen in the high latitudes, where the O3 seasonal cycle is strongest, and where meridional gradients in the mean O3 field mean that dynamical variability results in chemical variability. The variability of H2O mixing ratios is notably weaker, a result of its weaker seasonal cycle and weaker meridional gradients. Strongest variability of H2O is seen in the SH lower stratosphere high latitudes and is a result of the dehydration of the SH winter polar vortex.
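The "average variability" measure described above can be sketched as follows for a single latitude and pressure level. The function name `average_variability` and the toy values are hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the text does not specify which was used.

```python
from statistics import mean, pstdev

def average_variability(values_by_month):
    """Mean over months of the standard deviation of all values in each month.

    values_by_month: one list per calendar month, each holding every model
    value (all days and all longitudes) at one latitude and pressure level.
    """
    # Assumption: population standard deviation within each month.
    monthly_std = [pstdev(values) for values in values_by_month]
    return mean(monthly_std)

# Hypothetical two-month example: one variable month, one constant month.
toy_months = [[1.0, 1.0, 3.0, 3.0], [2.0, 2.0, 2.0, 2.0]]
toy_variability = average_variability(toy_months)
```

Because the standard deviation is taken over days and longitudes together, both the progression of the seasonal cycle within a month and departures from zonal symmetry contribute to the result.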
2.3 Sampling Bias Calculation
 Starting with full resolution model fields, the goal of the exercise is to emulate as closely as possible the sampling of each instrument and the climatology procedure used to produce the SPARC Data Initiative climatologies and compare these climatologies to the “true” model climatology.
 First, the instrument sampling patterns for each month of the year are used to subsample the model data. For each sample, model fields from the corresponding Julian day are linearly interpolated in space to the latitude and longitude of the sample location. Once the model data have been interpolated to each sample location, the subsampled fields are processed in the same manner as the real measurements in the production of climatologies, i.e., binned according to the SPARC Data Initiative latitude grid, and the mean calculated for each latitude bin and pressure surface. The only exception is that in the case of MIPAS SPARC Data Initiative climatologies, measured mixing ratios were interpolated to the central latitude of the bin, in order to avoid sampling biases. This operation has not been accounted for in this study; our analysis of the sampling error is based on the original geolocations of the measurements. The true model climatology, or population mean, is produced by first calculating the mean of all model fields on each latitude circle of the model's latitude grid, then linearly interpolating these mean values to the midpoint of each latitude bin. This interpolation is performed due to the fact that the two model latitudes (at 1.9° resolution) within each 5° latitude bin will not be perfectly centered with respect to the bin midpoint, therefore simply averaging the model data within a latitude bin could introduce a small sampling bias to the true model climatology. The difference between the instrument-sampling-pattern-based subsampled field mean and the full-model-resolution field mean gives the sampling bias. For each month and for each instrument, this bias is calculated for every latitude bin in which an instrument has measurements, and at all pressure levels of the model fields. The vertical range of presented results for each instrument is then “cropped” based on the range of measurements in the corresponding SPARC Data Initiative climatologies.
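The procedure above can be sketched for a single latitude bin and pressure level. This is a simplified illustration rather than the processing code used for the study: it assumes a zonally symmetric field so that interpolation is needed only in latitude, it expresses the bias in percent of the true mean, and all names (`interp`, `bin_sampling_bias`) and values are hypothetical.

```python
from bisect import bisect_right
from statistics import mean

def interp(x, xs, ys):
    """Linear interpolation of ys(xs) at x; xs must be ascending."""
    i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    w = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + w * (ys[i + 1] - ys[i])

def bin_sampling_bias(model_lats, daily_fields, samples, bin_edges):
    """Percent sampling bias for one latitude bin.

    model_lats:   ascending model latitudes (degrees)
    daily_fields: {day: [field value at each model latitude]}
    samples:      [(day, latitude)] of the instrument, all inside the bin
    bin_edges:    (south, north) edges of the bin (degrees)
    """
    # Subsample: interpolate the field of the matching day to each sample.
    sampled = [interp(lat, model_lats, daily_fields[day]) for day, lat in samples]
    # "Truth": time mean on each model latitude, interpolated to the bin midpoint.
    time_mean = [mean(f[j] for f in daily_fields.values())
                 for j in range(len(model_lats))]
    midpoint = 0.5 * (bin_edges[0] + bin_edges[1])
    truth = interp(midpoint, model_lats, time_mean)
    return 100.0 * (mean(sampled) - truth) / truth

# Hypothetical field: uniform in latitude, doubling from day 1 to day 2.
model_lats = [80.0, 82.0, 84.0, 86.0]
fields = {1: [2.0] * 4, 2: [4.0] * 4}
bias_day1_only = bin_sampling_bias(model_lats, fields,
                                   [(1, 81.0), (1, 83.0)], (80.0, 85.0))
bias_both_days = bin_sampling_bias(model_lats, fields,
                                   [(1, 82.5), (2, 82.5)], (80.0, 85.0))
```

In this toy case, sampling only day 1 yields a bias of about −33%, whereas sampling both days yields no bias. Interpolating the truth to the bin midpoint, rather than averaging the model latitudes that fall inside the bin, mirrors the reasoning given in the text.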
 This method produces an estimate of sampling bias due to the horizontal sampling of the instruments, and no effort is made here to understand differences between instrumental climatologies due to differences in the vertical resolution of the measurements. Furthermore, because the model data are available only once per day, we cannot address possible biases in instrumental climatologies due to different sampling of a chemical species' diurnal cycle. As such, we apply our study here only to chemical species with long stratospheric lifetimes.
 Ideally, in an exercise such as this, one would like the resolution of the sampled model fields to be similar to the horizontal resolution of the measurements they are meant to emulate. This is because the variability of a sampled field depends on the resolution of the measurement: measurements made at fine resolution will have larger variance than those made at coarser resolution. While such issues are important when comparing measured and modeled variability [e.g., Toohey et al., 2010], when examining average values, the horizontal resolution should not have a large impact on the results, as long as sample sizes are large enough that the mean is well estimated (i.e., the standard error of the mean is small). Nevertheless, the resolution of the model fields used here is roughly consistent with that of the limb-sounding instruments: the horizontal resolution of limb sounding retrievals is typically on the order of a few hundred kilometers [e.g., von Clarmann et al., 2009]. The model fields used here have latitudinal resolution of about 200 km and longitudinal resolution that ranges from about 250 km at the equator to 200 km in the midlatitudes. At high latitudes, the rough agreement between model and observation resolution no longer holds, as the longitudinal resolution of the model fields is finer than that of the observations; however, we find (as shown in section 3) that estimated sampling biases are well explained by mechanisms for which the horizontal resolution of the model fields should not impact the results.
 Sampling biases calculated by the method described above are the biases introduced to the monthly zonal mean of model fields by the respective instrument sampling pattern. Whether the results represent reasonable estimates of the true sampling bias for any instrument's measurement of the real atmosphere depends on how similar the model fields are to reality. The model can be expected to reproduce in a fairly accurate way the seasonal cycle in chemical trace gases, and short-term (subseasonal) variability can be expected to be consistent with reality in a statistical sense. However, it should be clear that any sampling bias estimate resulting from short-term variability within this 1 year of model data is simply one realization of potential sampling bias. In this way, the sampling bias estimates should be considered example cases, and we focus on characterizing the general features of sampling bias and its dependence on latitude and height for the given sampling patterns.
 Since sampling bias depends on the space-time gradients of the measured field, it can be different for different trace gases. In the following, we explore sampling bias for two trace gas species included in the SPARC Data Initiative climatologies: O3 and H2O.
3.1 O3 Sampling Bias
3.1.1 Monthly Mean Sampling Bias
 Zonal mean sampling bias for O3 is shown for each instrument and each calendar month as a function of latitude and height in the supporting information. To summarize the absolute magnitude and spatial structure of sampling bias, Figure 3 shows the root-mean-square (RMS) of the 12 monthly sampling bias estimates for each instrument.
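The RMS summary used for Figure 3 can be sketched as below for a single latitude bin and pressure level. The function name `rms` is hypothetical, and skipping months without measurements (marked `None`) is an assumption, since the handling of unsampled months is not stated in the text.

```python
from math import sqrt

def rms(monthly_bias_percent):
    """Root-mean-square of monthly sampling biases, ignoring unsampled months."""
    # Assumption: months with no measurements (None) are excluded.
    valid = [b for b in monthly_bias_percent if b is not None]
    return sqrt(sum(b * b for b in valid) / len(valid))

# Hypothetical biases that alternate in sign: the plain mean cancels to zero,
# while the RMS retains the typical magnitude.
alternating = [1.0, -1.0] * 6
rms_alternating = rms(alternating)
mean_alternating = sum(alternating) / len(alternating)

rms_with_gap = rms([3.0, -3.0, None, 3.0])
```

The RMS is used rather than the mean because monthly biases of opposite sign would otherwise cancel and understate the typical magnitude.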
 In general, at pressure levels above ~70 hPa, O3 sampling bias for all instruments is weak in the tropics where intramonthly variability is weak (see Figure 2). In the extratropics and polar latitudes, where variability is stronger, sampling bias becomes much larger. Sampling bias is found to be weakest for the instruments with dense and uniform sampling density, Aura MLS, HIRDLS, MIPAS, and TES. RMS bias for these instruments is less than 1% over most of the stratosphere, with maximum values of 1–3% occurring in isolated regions at the high latitudes, and within the upper troposphere, where O3 mixing ratios are quite small. SMR, SMILES, and SCIAMACHY have similarly small (<1%) sampling bias through most of the tropical and subtropical stratosphere, but with larger RMS sampling biases in the middle to high latitudes. SMILES and SCIAMACHY show RMS sampling bias of >1% poleward of 45° in the NH and SH, respectively, while SMR and SCIAMACHY sampling bias reaches >5% in regions of the high latitudes of both hemispheres. RMS sampling bias of >2% covers most of the middle to high latitudes for OSIRIS and UARS MLS, and inspection of the monthly sampling bias plots reveals that the results are variable in time, with large sampling bias (>10%) in the middle to high latitudes for certain months of the year. Finally, the occultation instruments ACE-FTS, GOMOS, HALOE, POAM II, POAM III, SAGE II, and SAGE III show strong sampling bias, with RMS values greater than 5% over much of the high latitudes, and reaching values of >10% in isolated locations. A number of these instruments also show RMS sampling biases of 1–2% through much of the tropical stratosphere. 
The large sampling biases at high altitudes (above 0.4 hPa) calculated for many instruments should be considered to be an artifact of the exercise, due to the fact that O3 has an appreciable diurnal cycle at these heights, and since the model fields are saved at 0 UTC, the diurnal variability of O3 is expressed as a longitudinal structure in the O3 field.
 The largest sampling biases in Figure 3 can be understood to be a product of nonuniform sampling through the days of a month, as can be seen when one examines variations in O3 over a month, and the correlation of these variations with instrument sampling patterns. Sampling biases for the month of March are shown in Figure 4 for ACE-FTS, MIPAS, and OSIRIS, as examples of sampling bias estimates that span the range of results discussed above. Sampling bias for this month is strong for ACE-FTS (reaching values of >5%), weak for MIPAS, and strong for OSIRIS in the SH. In Figure 4 (right column), the model O3 field in March is shown in terms of percent anomalies from the monthly zonal mean and plotted as a function of latitude and Julian day for the 1 and 10 hPa pressure surfaces. Included in these plots are gray markers indicating the latitude bins in each day which contain measurements based on each instrument's sampling pattern.
 The MIPAS sampling pattern contains measurements in all latitude bins for all days: There is no variation in its sampling locations with time, and as a result, the sampling bias is small. ACE-FTS, on the other hand, as a solar occultation instrument, samples each latitude band over only a few days of the month. For example, in the month of March, SH midlatitudes (45°S) are sampled only at the very beginning of the month, while SH high latitudes (80°S) are sampled only at the very end of the month. At 1 hPa, O3 mixing ratios are increasing through the month over this latitude range; therefore, the ACE-FTS sampling pattern leads to negative sampling bias around 45°S, and slightly positive sampling bias at the highest SH latitudes. The seasonal cycle of O3 is comparatively reversed at 10 hPa, leading to slightly positive bias in the SH midlatitudes and negative bias in the SH high latitudes. In this way, it can be seen that the sampling biases of ACE-FTS can be well explained by its sampling of the temporal variations in O3, which depend strongly on height and latitude. At 100 hPa, intramonthly O3 variations are relatively noisy (not shown), and as a result, the sampling bias is dependent on the sampling of short-lived variability. We therefore can expect that in regions where the sampling bias is due to the nonuniform sampling of the slow seasonal variability through a month, the sign and approximate magnitude of the sampling bias calculated through our model exercise is a reasonably accurate estimate of the real sampling bias for each instrument. However, in regions where variability is dominated by short-term (time scale of days) variations, limited sampling of such a chemical field will lead to a random sampling error. In this case, the sign and magnitude of the sample error calculated through our model exercise is only an example and should be used only to identify regions where sampling error may be important.
 The sampling pattern of OSIRIS in March is dense and uniform in the NH, but it samples SH latitudes only within the first half of the month. The OSIRIS sampling pattern, and thus its sampling bias, is intermediate to the two extreme cases explored above. As a result of the sampling pattern, OSIRIS means in the SH are biased low at 1 hPa and high at 10 hPa. This type of sampling bias occurs for OSIRIS in months where its sampling pattern latitudinal range shifts from one hemisphere to the other. Sampling biases for UARS MLS come from a similar source due to its latitudinal coverage shifting within months, leading to the mid to high latitudes being sampled for only a portion of each month.
 The applicability of the sampling bias estimates to real data is demonstrated through a brief case study. Assuming small O3 sampling bias for MIPAS (as discussed above), one would expect that the difference of another instrument's climatology with respect to that of MIPAS, e.g., ACE-FTS minus MIPAS, should contain the effect of sampling bias in the ACE-FTS climatology, in addition to measurement errors. Figure 5 shows such differences for March climatologies of ACE-FTS and OSIRIS, with differences calculated with respect to both MIPAS and Aura MLS. Average values are shown over 2 years: 2005–2006 for ACE-FTS comparisons and 2008–2009 for OSIRIS comparisons. Differences between ACE-FTS and the dense sampling instruments are dominated by positive values in the upper stratosphere and mesosphere (3–0.1 hPa). This feature is independent of latitude and is a common feature of comparisons between ACE-FTS and other instruments [Dupuy et al., 2009]. Aside from this primary feature, the signature of the estimated sampling bias can be seen in the ACE-FTS comparisons: For example, the main features of the sampling bias in the SH, namely a negative bias at the highest SH latitudes between ~30 and 1 hPa and a negative bias in the midlatitude upper stratosphere (60–45°S, 3–1 hPa) can be detected in the ACE-FTS–MIPAS difference plot. In the NH, both ACE-FTS–MIPAS and ACE-FTS–Aura MLS show negative differences in the high latitudes between 30 and 1 hPa, much of which is consistent with the sampling bias estimate. For OSIRIS, the sampling bias signature of a strong negative bias in the SH upper stratosphere (3–0.8 hPa) is clearly evident in both OSIRIS–MIPAS and OSIRIS–Aura MLS difference plots. The predicted signature of positive bias in the high-latitude SH lower stratosphere is also seen in the OSIRIS–Aura MLS difference plots. 
The sampling bias estimate plot for OSIRIS suggests however that the positive difference between OSIRIS and both comparison instruments in the NH lower stratosphere high latitudes is not a result of sampling bias.
 Clearly, there are large differences between the climatologies shown in Figure 5 that are not explained by the sampling bias estimates, such as the positive difference between ACE-FTS and other instruments in the upper stratosphere and the negative bias of OSIRIS compared to MIPAS in the upper stratosphere. Such biases are almost certainly due to systematic biases in the measurements themselves. The comparisons highlight, however, that in certain regions and times, sampling biases make up a significant portion of the difference between monthly mean climatologies.
 In addition to temporal considerations discussed above, nonuniformity in latitude sampling can also lead to sampling bias, which can be most notable at the northern and southern limits of instruments' sampling patterns. In these cases, instrumental sampling patterns typically do not sample the full latitudinal extent of the latitude bin. If the measured species has a significant gradient from one side of the bin to the other, a sampling bias can occur. This sampling bias likely occurs for all instruments and all months to some degree but is most noticeable in cases where other sampling biases are not present, e.g., at the extreme southern latitudes for the dense samplers Aura MLS, HIRDLS, MIPAS and SMR.
 As an example, Figure 6 shows sampling biases from selected instruments in the SH high latitudes for the month of September, when Antarctic O3 depletion leads to a strong gradient in O3 mixing ratio across the polar vortex. Figure 6 (bottom row) shows the bias in sampled latitude (mean sampled latitude minus latitude midpoint), which is seen to be significant in the southernmost latitude bin of each instrument's sampling range, due to the incomplete coverage of this latitude band. Aura MLS, for example, samples only the northern half of the 80–85°S bin, which leads to a positive sampling bias (of around 4%) in the Aura MLS climatology in this latitude bin at around 3 hPa. In comparison, MIPAS samples the full 80–85°S bin, but only half of the 85–90°S bin. However, the latitudinal gradient in O3 is weaker over this bin, so the MIPAS sampling bias at 3 hPa is smaller (~2%). All instruments display some degree of latitudinal sampling bias at the edges of their sampling domain; however, the bias in the measured field may be insignificant compared to other sources. For example, the bias in sampled latitude at high latitudes for SCIAMACHY is similar to the other instruments shown in Figure 6; however, the sampling bias for O3 is dominated by temporal considerations since it measures only sunlit latitudes, in a way that varies with time during this time of year in the southern high latitudes. The bias seen for SCIAMACHY in the mesosphere and in the UTLS at latitudes lower than 70°S is most probably due to an exclusion of measurements over the southern Atlantic (see below for details).
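The link between the bias in sampled latitude and the resulting bias in the measured field can be sketched to first order. The linear relation used here (field bias ≈ latitude offset × meridional gradient) is an illustrative approximation that assumes a locally linear field, and the sample latitudes and the gradient of 2% per degree are hypothetical values, not model output.

```python
from statistics import mean

def latitude_offset(sample_lats, bin_south, bin_north):
    """Mean sampled latitude minus the bin midpoint, in degrees."""
    return mean(sample_lats) - 0.5 * (bin_south + bin_north)

def first_order_bias(offset_deg, gradient_percent_per_deg):
    """Linear estimate of the field bias (percent) from a latitude offset."""
    return offset_deg * gradient_percent_per_deg

# Hypothetical samples confined to the northern half of the 80-85°S bin.
northern_half = [-80.5, -81.0, -81.5, -82.0]
offset = latitude_offset(northern_half, -85.0, -80.0)

# Hypothetical gradient: field increases by 2% per degree toward the north.
estimated_bias = first_order_bias(offset, 2.0)
```

Here the mean sampled latitude lies 1.25° north of the bin midpoint, which for the assumed gradient gives a positive bias of 2.5%. A weaker gradient across the bin gives a proportionally smaller bias for the same latitude offset.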
 The potential for sampling bias is greatest when temporal and spatial gradients of the measured field are largest. The Antarctic O3 hole leads to extreme temporal change in O3 mixing ratios and creates a strong gradient across the vortex edge in SH spring. The temporal evolution of sampling bias in the 80–85°S latitude bin in the lower stratosphere (50 hPa) is examined in Figure 7. Sampling bias is quite small through most of the year but reaches values of ±30% or more for some instruments in the SH spring months. Most of the behavior shown in Figure 7 can be understood in terms of previously discussed mechanisms. The low bias of SCIAMACHY and OSIRIS in September is a result of the fact that their measurements at high latitudes cover only the latter half of the month, when O3 depletion has begun. A similar situation occurs for UARS MLS sampling in October and December, leading to a low bias in October and a high bias in December (for the sample year of UARS MLS sampling used here). Slight biases in the 80–85°S bin can be seen for Aura MLS and TES in November and December due to latitudinal bias in this latitude bin, as discussed above.
 Nonuniformity in sampled longitude can also lead to sampling bias if the measured field is not zonally symmetric. This is, for example, the case for the SCIAMACHY climatologies compiled for the SPARC Data Initiative. Due to data quality issues, measurements over the South Atlantic were not included in these climatologies, which results in a systematically nonuniform sampling in longitude for latitudes between 20°S and 70°S. The effect of this longitudinal sampling irregularity can be seen in the monthly mean sampling bias plots for SCIAMACHY (see the supporting information), which show systematically larger absolute magnitude sampling biases (with RMS values of >5%, see Figure 3) between 20 and 70°S latitude in the upper troposphere/lower stratosphere (UTLS, approximately 300 to 70 hPa) and lower mesosphere (approximately 1 to 0.1 hPa). The sampling biases at high altitudes should be considered to be an artifact, as discussed above. However, the sampling biases shown for SCIAMACHY climatologies in the UTLS are likely due to real longitudinal dependence of UTLS O3 due to variations in tropospheric chemistry and dynamics and should be considered when making use of the SCIAMACHY climatologies in the UTLS.
 It is clear from the discussion above that sampling bias results from a combination of nonuniform sampling and variability of the sampled field. With this in mind, it is interesting to consider sampling biases normalized by the variability of the chemical field. Figure 8 shows RMS O3 sampling biases, as in Figure 3, but normalized by the standard deviation of the O3 field. The RMS normalized O3 sampling bias shown in Figure 8 is more uniform with respect to latitude and height than the percent values shown in Figure 3. Maximum RMS normalized sampling biases, as seen for the solar occultation instruments, are approximately 0.5–1σ. For the dense sampling instruments, RMS normalized sampling biases are typically 0–0.1σ. Normalizing by the standard deviation better isolates the impact of nonuniform instrumental sampling on the sampling bias, making it independent of the magnitude of the chemical field variability. In principle, the normalized O3 sampling bias estimates could be used to calculate rough sampling bias values for other long-lived trace gases which show variability coherent with that of O3, if the variability of the chemical field is known.
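 The normalization described above can be sketched in a few lines of Python. This is a minimal illustration for a single latitude-pressure bin, not the processing used to produce Figure 8: the function name, the use of one standard deviation value per month, and all numerical values are assumptions made for the example.

```python
import numpy as np

def rms_normalized_bias(sampled_means, true_means, field_std):
    """RMS over months of (sampled - true) / std for one latitude-pressure bin.

    sampled_means, true_means: length-12 monthly means (NaN where the
    instrument provides no data). field_std: standard deviation of the
    field in the bin for each month (assumed definition of the variability).
    """
    sampled = np.asarray(sampled_means, dtype=float)
    true = np.asarray(true_means, dtype=float)
    std = np.asarray(field_std, dtype=float)
    normalized = (sampled - true) / std
    # nanmean skips months without measurements
    return float(np.sqrt(np.nanmean(normalized ** 2)))

# Hypothetical values: every sampled month is biased high by half a sigma
true_o3 = np.full(12, 3.0)
o3_std = np.full(12, 0.4)
sampled_o3 = true_o3 + 0.5 * o3_std
sampled_o3[5:8] = np.nan  # three months without measurements
rms_sigma = rms_normalized_bias(sampled_o3, true_o3, o3_std)

# Rough bias for another gas with coherent variability (hypothetical std)
h2o_std = 0.1
rough_h2o_bias = rms_sigma * h2o_std
```

 Multiplying the normalized bias by the standard deviation of another trace gas, as in the last line, gives only a rough magnitude and presumes that the variability of that gas is coherent with that of O3.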
3.1.2 Annual Mean Sampling Bias
 To assess how sampling bias can affect annual mean climatologies, we produce annual mean sampling bias estimates for each instrument by calculating annual averages of the sampled monthly mean fields for each instrument, then calculating the percent difference of this quantity compared to the model annual mean. The annual mean bias therefore includes not only the sampling biases present in the monthly mean biases discussed above but also any bias due to an instrument's incomplete sampling over the year for any latitude bin. Annual sampling biases for O3 are shown for each instrument in Figure 9.
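 The calculation described above can be sketched as follows for a single latitude-pressure bin. This is a minimal illustration, assuming that months without measurements are marked as NaN and that all months are weighted equally; the function name and the numerical values are hypothetical.

```python
import numpy as np

def annual_mean_sampling_bias(sampled_monthly, model_monthly):
    """Percent difference between the annual mean of the sampled monthly
    means and the model annual mean, for one latitude-pressure bin."""
    sampled = np.asarray(sampled_monthly, dtype=float)
    model = np.asarray(model_monthly, dtype=float)
    sampled_annual = np.nanmean(sampled)  # averages only months with data
    model_annual = np.mean(model)         # all 12 months, equal weights (assumption)
    return 100.0 * (sampled_annual - model_annual) / model_annual

# Hypothetical seasonal cycle: low values in months 1-6, high in months 7-12
model = np.array([2.0] * 6 + [4.0] * 6)
# Instrument with no monthly bias that samples only the first half of the year
sampled = np.array([2.0] * 6 + [np.nan] * 6)
bias_percent = annual_mean_sampling_bias(sampled, model)
```

 In this toy case the monthly means are unbiased wherever they exist, yet the annual mean is biased low by about 33% because only the low half of the seasonal cycle is sampled.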
 The instruments with uniform sampling throughout the year (Aura MLS, HIRDLS, MIPAS, SMR, and TES) show very weak annual mean sampling biases of only a few percent. In contrast, the annual mean sampling biases for the other instruments, all of which have latitudinal sampling that varies throughout the year, are considerable, often exceeding 10%, and reaching values of >20% in some cases. The annual mean sampling biases can be qualitatively explained by the seasonality of sampling. For example, the SMILES sampling pattern covers only approximately half the year, during NH winter and spring. Clearly, some sampling bias is to be expected when calculating an annual mean with a half year of SMILES data, and while this example is perhaps not realistic, it helps to illustrate the mechanism behind the sampling biases of other instruments with less dramatic sampling nonuniformity throughout the year. The annual mean sampling bias for the SMILES sampling pattern in each hemisphere is characterized by a “tripole” pattern, with alternating sign with height, and is opposite in sign between the two hemispheres. OSIRIS does not sample the high-latitude winters (Figure 1), and as a result, the annual mean sampling bias pattern shows the characteristic tripole pattern in the high latitudes, which is similar in morphology to the SMILES sampling bias in the “summer” SH. HALOE, SAGE II, and SCIAMACHY show similar biases toward summer sampling (see again Figure 1), leading to similar sampling bias morphology for these instruments. GOMOS sampling is biased toward winter in the high latitudes; as a result, its annual mean sampling bias pattern has a similar structure to that of OSIRIS, HALOE, SAGE II, and SCIAMACHY but with opposite sign. The more complicated annual mean sampling bias morphologies of ACE-FTS, POAM II, POAM III, and SAGE III result from the more complicated patterns of their sampling. For example, ACE-FTS samples the highest latitudes during autumn and early spring, with a lack of samples at these latitudes through the summer months. As a result, the annual mean sampling bias for the highest latitudes resembles that of GOMOS, while in the midlatitudes, ACE-FTS sampling covers most of the summer months but less of the winter, leading to a sign change in the sampling bias for any pressure level moving equatorward. A similar mechanism explains the complicated structures of the POAM II, POAM III, and SAGE III annual mean sampling biases.
 Figure 10 details the calculated annual mean sampling biases within the UTLS for instruments with applicable measurement ranges. Many of the instrumental sampling biases share a common feature: positive biases around 30°N and 30°S between 200 and 100 hPa. In these regions, the tropopause slopes downward with latitude, creating a strong horizontal O3 gradient on pressure surfaces. Instrumental sampling densities tend to increase modestly with latitude. As a result, within the latitude band that straddles the tropopause for a given pressure surface, instruments tend to sample the poleward side of the latitude band slightly more often than the equatorward side, leading to positive bias in the average O3 mixing ratio. This feature is apparent to different degrees in the UTLS sampling bias estimates for Aura MLS, HIRDLS, MIPAS, and TES. The sampling biases for ACE-FTS and OSIRIS are dominated by nonuniform month of year sampling issues as discussed above, and for SCIAMACHY, the nonuniformity of longitudinal sampling is seen to have an influence between 20 and 70°S, seemingly amplifying the magnitude of the tropopause-related bias at 30°S.
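 The tropopause-related mechanism can be illustrated with a toy calculation for a single 5° latitude bin. This is a sketch under stated assumptions rather than the method used for Figure 10: the linear O3 profile, the sample latitudes, and the function name are hypothetical, and the reference mean is taken as uniform in latitude without area weighting.

```python
import numpy as np

def latitude_bin_bias(sample_lats, field, bin_edges, n_ref=1001):
    """Percent bias of the mean over sampled latitudes relative to the
    mean over a uniform latitude grid spanning the bin."""
    lo, hi = bin_edges
    sampled_mean = np.mean(field(np.asarray(sample_lats, dtype=float)))
    ref_lats = np.linspace(lo, hi, n_ref)  # uniform reference coverage
    true_mean = np.mean(field(ref_lats))   # no cos(latitude) weighting (assumption)
    return 100.0 * (sampled_mean - true_mean) / true_mean

def o3_profile(lat):
    """Hypothetical O3 on a pressure surface, increasing poleward across
    the 30-35 degree bin that straddles the tropopause."""
    return 1.0 + 0.2 * (lat - 30.0)

# Samples slightly denser toward the poleward side of the bin
poleward_weighted = [31.0, 33.0, 34.0, 34.5]
bias_percent = latitude_bin_bias(poleward_weighted, o3_profile, (30.0, 35.0))
```

 Because the samples cluster toward the poleward side, where the assumed mixing ratio is higher, the binned mean is biased high (about 8% for these illustrative numbers).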
3.2 H2O Sampling Bias
3.2.1 Monthly Mean Sampling Bias
 Monthly mean sampling bias results for H2O are shown in the supporting information plots. To summarize the absolute magnitude and spatial structure of H2O sampling bias for each instrument, Figure 11 shows the RMS of the 12 monthly sampling bias estimates for each instrument.
 Sampling bias for H2O is generally smaller than for O3. For example, solar occultation instruments which had sampling biases of greater than 5% over large portions of the high latitudes (poleward of ±50°) for O3 have H2O sampling biases of 2% or less over most of the NH. This is related to the weaker temporal and spatial variability of stratospheric H2O compared to that of O3 (see Figure 2), and in fact RMS normalized H2O sampling biases (normalized by the H2O standard deviation, not shown) are quite similar to those of O3 shown in Figure 8. One location of strong H2O sampling bias is the high SH latitudes, where dehydration of the SH winter vortex leads to strong gradients in space across the vortex edge and temporal gradients during the period of dehydration and recovery. Most solar occultation instruments (all but SAGE III, which does not measure in the high SH latitudes) show signs of the impact of vortex dehydration in their H2O sampling biases. Larger biases are seen for the instruments with irregular temporal sampling, e.g., ACE-FTS, HALOE, POAM II, POAM III, SAGE II, SCIAMACHY, SMR, and UARS MLS.
 Figure 12 illustrates the impact of irregular temporal sampling in the month of September, when SH vortex H2O values increase over the month, with short-term variability superimposed on the steady recovery in the lower stratosphere. Aura MLS sampling is uniform, and as a result, the calculated sampling bias is small (<2%, except for the highest SH latitudes, as discussed below). ACE-FTS samples the highest SH latitudes only at the beginning of the month, leading to a low bias in its monthly mean estimate, which is strongest in the lowest stratosphere (>10%) but also present in the upper stratosphere (>4%). In contrast, SCIAMACHY sampling follows the return of sunlight to the highest latitudes, so its sampling of the highest SH latitudes occurs in the latter half of the month, leading to a positive bias (reaching values >10%) compared to the true monthly mean.
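 The effect of day-of-month sampling on a steadily recovering field can be reproduced with a toy calculation. This is a minimal sketch, assuming a linear increase over a 30-day month; the daily values, the sampled day ranges, and the function name are hypothetical and do not correspond to the actual ACE-FTS or SCIAMACHY sampling.

```python
import numpy as np

def day_of_month_bias(sample_days, daily_field):
    """Percent bias of the mean over the sampled days relative to the
    mean over all days of the month. sample_days are 1-based day numbers."""
    daily = np.asarray(daily_field, dtype=float)
    idx = np.asarray(sample_days) - 1
    true_mean = daily.mean()
    return 100.0 * (daily[idx].mean() - true_mean) / true_mean

days = np.arange(1, 31)
h2o = 2.0 + 0.05 * (days - 1)  # hypothetical steady recovery over the month

early_bias = day_of_month_bias(np.arange(1, 11), h2o)   # days 1-10 only
late_bias = day_of_month_bias(np.arange(16, 31), h2o)   # days 16-30 only
uniform_bias = day_of_month_bias(days, h2o)             # every day sampled
```

 Sampling only the start of the month yields a low bias and sampling only the latter half yields a high bias, while uniform sampling recovers the true monthly mean, in line with the behavior described above for the three sampling patterns.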
 Aura MLS and MIPAS, which have temporally invariant sampling patterns, also show sampling bias in the high SH latitudes. As was shown for O3, the strong gradient in H2O across the vortex edge and the nonuniform sampling of the southernmost latitude bin lead to bias at that latitude. Figure 13 shows SH high-latitude H2O sampling biases for Aura MLS, MIPAS, and SCIAMACHY in September and the bias in sampled latitude. The incomplete sampling of the latitude band can lead to biases of more than 10%, due to the strong gradient in H2O and its very low mixing ratios.
3.2.2 Annual Mean Sampling Bias
 Annual mean sampling biases are shown in Figure 14 for H2O. As was the case for O3, the annual mean sampling bias for Aura MLS and MIPAS—instruments with relatively uniform seasonal sampling—is small throughout all of the stratosphere, while sizeable sampling biases are computed for instruments with seasonally varying coverage. Many instruments (HALOE, POAM II, SAGE II, SCIAMACHY, and SMR) show a low bias in the SH high-latitude lower stratosphere (reaching values of at least −8%), resulting from the fact that these instruments sample the H2O mixing ratio minimum in SH spring following the dehydration of the polar vortex but tend to undersample the maximum in H2O mixing ratios during SH fall and winter. ACE-FTS, on the other hand, samples the highest SH latitudes only during March and April, leading to a >10% positive sampling bias there. For occultation instruments whose vertical range extends into the UTLS, sampling bias is large in the UTLS, with values >10%. It is clear that due to the significant variability of H2O in the UTLS region, an accurate estimation of the annual mean H2O mixing ratio requires the high density and uniform sampling of instruments like Aura MLS and MIPAS.
 Monthly zonal mean climatologies produced through the method of binning and averaging measurements into months and 5° latitude bands may contain nonnegligible biases of >10% in some cases due to the nonuniform temporal and spatial sampling of the atmosphere performed by the instruments. We have examined sampling biases produced by the sampling patterns of 16 instruments participating in the SPARC Data Initiative when applied to coupled chemistry-climate model output of O3 and H2O fields. Keeping in mind that our results are based on a single year of simulated fields from a single model, and the sampling patterns are based on single years of satellite measurements, a number of general statements regarding the impact of sampling bias on trace gas climatologies can be made. Specifically, we find that:
 Climatologies built from measurements from instruments with regular and uniform sampling patterns have generally small sampling bias. Sampling bias may still exist, however, when the sampled latitudes within latitude bins are not uniformly distributed. For example, incomplete coverage of the northernmost and southernmost latitude bins can lead to sampling bias in those bins. This type of sampling bias appears most significantly in the high southern latitudes during SH winter, when the Antarctic vortex produces particularly strong latitudinal gradients.
 Climatologies built from measurements from instruments whose latitudinal coverage varies with time can have strong sampling biases for certain months and locations. Sampling biases for O3 were found in some instances to be as large as 10–40%. This is primarily due to nonuniformity in day-of-month sampling and occurs whenever an instrument provides measurements over only a portion of the month. In cases for which the atmospheric variability is dominated by the seasonal cycle, this type of sampling error could in theory be reasonably well quantified or even corrected. However, when variability is dominated by short-term random fluctuations, only the absolute magnitude of the sampling bias can be estimated from model studies. This type of sampling bias is most relevant for solar occultation instruments but is also important for any instrument whose sampling of certain latitudes is not uniform in time.
 Sampling bias is a function not only of the sampling pattern but also the time-space variability of the field being averaged. Throughout most of the stratosphere, sampling bias is much more important for O3 than for H2O, since the variability of O3 is stronger. A normalized sampling bias, shown in Figure 8, isolates the impact of an instrument's nonuniform sampling and can in principle be used to produce rough estimates of potential sampling bias for other long-lived trace gases if the variability of the trace gas is known.
 In the UTLS region, space-time gradients in O3 and H2O (and in fact most trace gas species) are strong, and sampling bias is important. Instruments with uniform temporal sampling have monthly mean sampling biases of a few percent in the UTLS, while instruments with nonuniform temporal sampling have sampling biases of up to 10%. Due to the random nature of sampling bias in the UTLS, sampling biases are somewhat reduced in annual means. However, even for instruments with dense and uniform sampling, the strong horizontal gradients in chemical mixing ratios across the subtropical tropopause can lead to annual mean sampling biases of 1–2%.
 Users of monthly zonal mean climatologies of trace gases, including those of the SPARC Data Initiative, are encouraged to keep these results in mind when using such data products for chemistry-climate model validation, assessments of observed variability, or other applications. It should be understood that in some cases, the inter-instrument spread of the climatologies is larger than the uncertainty in the true fields due to measurement errors. In such cases, the climatologies of instruments with highest sampling density should be considered more representative estimates of the truth, in the sense that differences between the climatologies and the true atmospheric mean state should be dominated by measurement error rather than sampling biases. It should also be noted that no instrumental climatology produced by binning of measurements into latitude bins is completely free of the influence of sampling bias, as even modest irregularities in sampling over latitude bins can lead to sampling biases of a few percent in regions of strong meridional gradients, such as across the polar vortices and the tropopause.
 The SPARC Data Initiative thanks the International Space Science Institute in Bern (ISSI), which supported the activity through its ISSI International Team activity program, and the World Climate Research Programme and the Toronto SPARC office for generous travel funds. The work by Matthew Toohey is supported by the BMBF MIKLIP project ALARM through the grant 01LP1130B. Michaela Hegglin thanks the Canadian Space Agency (CSA) and the European Space Agency for supporting her work for the SPARC Data Initiative. The work from Susann Tegtmeier was funded from the WGL project TransBrom and the EU project SHIVA (FP7-ENV-2007-1-226224). The work from Hampton University was partially funded under the National Oceanic and Atmospheric Administration's Educational Partnership Program Cooperative Remote Sensing Science and Technology Center (NOAA EPP CREST). Work at the Jet Propulsion Laboratory, California Institute of Technology, was performed under contract with the National Aeronautics and Space Administration. ACE is a Canadian-led mission mainly supported by the CSA. Development of the ACE-FTS climatologies was supported by grants from the Canadian Foundation for Climate and Atmospheric Sciences and the CSA. Odin is a Swedish-led satellite project funded jointly by the Swedish National Space Board (SNSB), the CSA, the National Technology Agency of Finland (Tekes), the Centre National d'Etudes Spatiales (CNES) in France, and the Third Party Mission program of ESA. The work of E. Kyrölä was supported by the Academy of Finland through the project MIDAT (134325). Work on HIRDLS was supported by NASA's EOS Program in the U.S. and in the UK by NERC. IAA was supported by the Spanish MINECO under grant AYA2011-23552 and EC FEDER funds. The work of the University of Bremen team on SCIAMACHY climatologies was funded in part by the German Aerospace Agency (DLR) within the project SADOS (50EE1105), by the DFG Research Unit FOR 1095 “Stratospheric Change and its role for Climate Prediction (SHARP)” (project GZ WE 3647/3-1), and by the State and University of Bremen. WACCM simulations were performed in the Centro de Supercomputacion de Galicia under a Reto2009 Project, and the authors thank Rolando García, who developed the WACCM model version used here, and Andrew Gettelman for guidance with the model. Finally, the authors thank Ted Shepherd, Peter Braesicke, and Karen Rosenlof for helpful feedback on early results from this work, and two anonymous reviewers who provided valuable comments on the submitted manuscript.