Corresponding author: C. Lucas, Centre for Australian Weather and Climate Research, GPO Box 1289, Melbourne, VIC 3001, Australia. (email@example.com)
 Historical radiosonde data are analyzed using the tropopause height frequency method to investigate the variation of the Southern Hemisphere tropical edge from 1979/80–2010/11, independently of reanalysis-derived data. Averaged across the hemisphere we identify a tropical expansion trend of 0.41 ± 0.37 deg dec−1, significant at the 90% level. A comparison with four reanalyses shows generally consistent results between radiosondes and reanalyses. Estimated rates of tropical expansion in the SH are broadly similar, as is the interannual variability. However, notable differences remain. Some of these differences are related to the methodology used to identify the height of the tropopause in the reanalyses, which produces inconsistent results in the subtropics. Differences between radiosondes and reanalyses are also more manifest in data-poor regions. In these regions, the reanalyses are not fully constrained, allowing the internal model dynamics to drive the variability. The performance of the reanalyses varies temporally compared to the radiosonde data. These differences are particularly apparent from 1979 to 1985 and from 2001 to 2010. In the latter period, we hypothesize that the increased availability and quality of satellite-based data improves the results from the ERA Interim reanalysis, creating an inconsistency with earlier data. This apparent inhomogeneity results in a tropical expansion trend in that product that is inconsistent with the radiosonde-based observations. These results confirm the need for careful evaluation of reanalysis-based data for use in studies of long-term climate variability.
Recent studies have identified an expansion of the tropics, or a widening of the Hadley cell, since the late 1970s. This means that air with tropical characteristics, variously defined but broadly found between the equator and ∼30° latitude, is extending further toward the poles. This expansion has been observed across numerous data sets, suggesting that it is a robust feature of the atmosphere [Seidel et al., 2008]. Hudson et al. used total ozone measurements to identify a tropical expansion in the Northern Hemisphere (NH). Fu et al. inferred a global expansion of approximately 2° based on satellite-derived stratospheric and tropospheric temperature trends. Hu and Fu reported expansion trends of 0.7 to 1.1 degrees latitude per decade (deg dec−1) from zonal mean outgoing longwave radiation measurements. However, the primary sources of data for the identification of this expansion have been the various global reanalysis products, typically the NCEP/NCAR [Kalnay et al., 1996] and/or ERA-40 [Uppala et al., 2005]. In combination with the reanalyses, a variety of metrics have been used to estimate the expansion of the tropics. Archer and Caldeira analyzed the position of the various jet streams and found small poleward shifts in their position. Hu and Fu used isobaric mass stream function calculations to identify the limits of the tropical Hadley cell, identifying strong expansionary trends in both hemispheres, particularly in the respective summer and autumn seasons of each hemisphere.
One metric that has attracted considerable interest in tropical expansion studies is the tropopause height frequency defined by Seidel and Randel. On annual time scales, the subtropical regions have a distinct bimodal structure of tropopause height, with a ‘tropical’ mode centered around 16 km altitude and an ‘extratropical’ mode found near 12 km. The degree of bimodality can be tracked on long time scales to estimate the amount of tropical expansion. They identified an expansionary trend of the total tropics of 1.7 to 3.1 deg dec−1, and identified related trends in the radiosonde data that broadly supported their assertion. Lu et al. used this methodology with the ERA-40 reanalysis to find trends of 0.5 to 0.7 deg dec−1 in the Southern Hemisphere (SH) and 0.0–0.2 deg dec−1 in the NH extending back to 1958. Birner used a more objective technique and additional reanalyses to arrive at broadly similar conclusions: distinct expansionary trends in the SH in most reanalyses, and small or near-zero trends in the NH. One reanalysis suggested the possibility of contraction trends in the SH, depending on the exact methodology used. Overall, the tropopause frequency method suggests larger expansion trends in the SH, while other methods and/or data sources suggest even or perhaps greater expansion in the NH. Clearly, some discrepancies exist in estimates of the amount of tropical expansion, depending on a variety of factors. Davis and Rosenlof compared several methodologies and suggested that most of these differences relate to the use of different data sets and definitions of the tropical edge.
The various reanalyses are widely used products with global coverage. However, they are not observations per se, but rather a hybrid of numerical weather prediction model output and assimilated data from various observing platforms. The global observing network is not spatially uniform and is continuously evolving, and these changes affect the output of the reanalysis products. In seasonal and annual means, the reanalyses do a good job of reproducing ‘dynamic’ quantities like wind and temperature [Reichler and Kim, 2008]. However, where observational data are poor, the reanalysis model is not fully constrained and it can develop its own variability [e.g., Sterl, 2004]. Temporal changes in the observations can impact the long-term homogeneity of the data and reduce the reliability of climate trends calculated from these sources [e.g., Trenberth et al., 2001; Bengtsson et al., 2004a]. Different representations of physical processes in the underlying reanalysis models [e.g., Mitas and Clement, 2005; Mitas and Clement, 2006; Song and Zhang, 2007] also result in circulation changes and different climate trends between the reanalyses.
A largely independent source of data that has been poorly exploited in understanding tropical expansion is the radiosonde record. Radiosonde data are included in most reanalysis products, but are a relatively small component in the SH. Seidel and Randel used radiosonde data in their study, mainly to support their results in the reanalysis. They did not explicitly calculate trends of the tropical edge from these data as we do here. However, the use of these data is potentially problematic. On a global scale, the radiosonde sampling of the atmosphere is relatively poor. Further, the historical records are often incomplete, with missing data and partial records. These problems are particularly pressing in the SH. However, Antuña et al. suggest that tropical means are relatively unaffected by missing data. Further, the missing data issues can be overcome with appropriate statistical techniques. Spatial sampling difficulties can be overcome with a focus on regional scales, where the sounding network is denser. In this study, we use the historical radiosonde network to investigate the expansion of the tropics using the tropopause height methodology over three broad regions of the SH from 1979 to 2010, a 32-year period.
Reported trends of tropical expansion cover a broad range of values. Most show a modest expansion, especially in the SH, but values range from weak contraction to rapid expansion. A main objective of this paper is to document the trends in SH tropical expansion over the study period using a data set that is largely independent from the reanalyses. Our focus is regional, but we extrapolate the results to create a hemispheric picture. Three broad regions – Australia/New Zealand, South America and Africa – are chosen for further analysis. While the explicit radiosonde-based estimates of the tropical expansion are interesting and useful by themselves, direct comparison with identical calculations made with reanalysis data highlights potential deficiencies in the reanalyses, as well as shedding light on differences between the various reanalysis products. This helps answer the question of ‘How trustworthy are these data in understanding long-term tropical expansion?’ These objectives are addressed in the following text. Section 2 discusses the characteristics of the data and the tropopause height calculation. In section 3, the tropopause height frequency methodology is illustrated, as are the processing techniques used to overcome some of the limitations of the data. This includes the calculations of errors and biases associated with the data and methodology. Section 4 discusses the results from a mean perspective and shows the temporal evolution of the edge of the tropics, along with its sensitivity. A comparison with four reanalysis products is also presented. In section 5, the SH-wide view of tropical expansion is developed and the drivers of the observed variability are explored. Section 6 presents the conclusions of the study.
The primary data source used in this study is the Integrated Global Radiosonde Archive (IGRA) described by Durre et al. These sounding data are used to evaluate characteristics of the tropopause following the World Meteorological Organization (WMO) lapse-rate definition, i.e., the tropopause is the first layer where the lapse rate is wholly less than 2 K km−1 extending for a depth of 2 km. We apply an additional criterion that the tropopause must occur at a pressure lower than 500 hPa. The pressure and temperature of the tropopause are directly reported by the radiosonde. The geopotential height of the tropopause (ZT), the main variable of interest here, is reported on mandatory levels and interpolated to the tropopause level. While many radiosonde observations explicitly report a tropopause level, we determine this independently from the available mandatory and significant level data (with no interpolation) to ensure a consistent calculation. In long records, the reported and independently identified ZT values agree approximately 90% of the time, with most of the disagreements occurring earlier in the record. Further, soundings that terminate below 100 hPa are not used, nor are tropopause estimates that occur on a mandatory level (e.g., 250, 200, 150, 100 hPa). While the latter are not necessarily incorrect, they are more likely to be an artifact of the data reporting methodology than a valid measurement. These observations are most prevalent prior to the mid-1980s, and the soundings where this is observed generally have few if any significant level observations reported at heights where the tropopause is found. In this case, the tropopause ‘defaults’ to a mandatory level.
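The lapse-rate criterion described above can be sketched in a few lines. This is a minimal illustration and not the processing code used in the study: it assumes the standard WMO reading of the definition (the lowest level where the lapse rate falls to 2 K km−1 or less, with the mean lapse rate to every higher level within 2 km also at or below that value), works only on the reported levels without interpolation, and applies the 500 hPa limit. The function name `wmo_tropopause` and the synthetic sounding are hypothetical, and the screening of mandatory-level tropopauses is not included.

```python
import math


def wmo_tropopause(z_km, t_k, p_hpa, max_lapse=2.0, depth_km=2.0, p_limit=500.0):
    """Return the index of the lapse-rate tropopause level, or None.

    Levels must be ordered from the surface upward. No interpolation is done:
    the tropopause is assigned to one of the reported levels.
    """
    n = len(z_km)
    for i in range(n - 1):
        if p_hpa[i] >= p_limit:
            continue  # tropopause must lie at a pressure lower than 500 hPa
        # Lapse rate (K/km) of the layer immediately above level i.
        lapse = -(t_k[i + 1] - t_k[i]) / (z_km[i + 1] - z_km[i])
        if lapse > max_lapse:
            continue
        # Mean lapse rate from level i to each higher level within depth_km.
        ok = True
        reached_depth = False
        for j in range(i + 1, n):
            dz = z_km[j] - z_km[i]
            if dz <= depth_km and -(t_k[j] - t_k[i]) / dz > max_lapse:
                ok = False
                break
            if dz >= depth_km:
                reached_depth = True  # sounding extends through the full layer
                break
        if ok and reached_depth:
            return i
    return None


# Synthetic sounding (illustrative only): 6.5 K/km up to 16 km, isothermal above.
z = [float(k) for k in range(21)]
t = [288.0 - 6.5 * min(k, 16) for k in range(21)]
p = [1013.25 * math.exp(-k / 7.0) for k in range(21)]
idx = wmo_tropopause(z, t, p)
```

For the synthetic profile, `z[idx]` gives the tropopause height in km. A real application would pass the mandatory and significant levels of each IGRA sounding in place of the synthetic lists.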
Our primary interest in this study is the region where the troposphere transitions from the tropics to the extratropics. As a whole, the sampling of the tropopause in the SH is poor. To overcome this, we analyze three broad regions independently: Australia-New Zealand (ANZ), South America (SA) and Africa (AFR). Figure 1 identifies the limits of these regions on a map. In each region, stations between approximately 10°S and 55°S are selected. The SA region extends further equatorward than the nominal limit to fully sample the deep tropics. Fewer stations are available in equatorial Africa; hence the AFR region starts at 15°S. The locations of these stations are also plotted on Figure 1. The average distance to the nearest station is 530 km, 510 km and 710 km in ANZ, SA and AFR respectively, with the minimum value being 340 km, 250 km and 220 km (excluding those that are close due to a station relocation) in the same three regions.
Figures 2, 3, and 4 show the data coverage by year and station in the three regions. Nominally, 1 or 2 soundings per day are available. In the figures, the number of ‘unique days’ per year sampled is reported, as distinct from the number of individual observations. The figures indicate that data coverage is not uniform. In many cases the data start well after 1979 or end before 2010. Extended periods of missing data can and do occur, and some of the chosen stations only report data for a few years. Stations close and (sometimes) reopen in a different location. The philosophy behind this study is not to rely exclusively on complete records of data, but rather to use all available (and suitable) information to inform the analysis. This is potentially problematic, but considerable effort, described in sections 3.2 and 3.3, is made to work around these limitations and quantify the uncertainty they introduce.
 In general terms, the ANZ region has the best data coverage (Figure 2), with the largest number of stations and the most complete spatial sampling. Further, many of the stations have records that span the entire 32-year period with few periods of extensive missing data, although the mandatory level tropopause does affect the completeness of the annual samples before the mid-1980s. The data coverage in the SA region is mixed (Figure 3). Equatorward of ∼20°S, the sampling before 1997 is poor, with few stations. Further south there are longer records available, but the spatial distribution of the available stations is sparse. Overall, annual samples are less complete in SA. The data coverage from the AFR region is also mixed (Figure 4). Equatorward of ∼30°S, data availability is poor. Few long-term records exist, and the annual sampling is poor, especially after the mid-1990s. Southward of this latitude, the data coverage is much better, especially in South Africa, where long-term records exist at many stations. Poleward of the southern edge of the continent (∼34°S), the spatial sampling is very poor, with only four widely spaced island stations available.
 The historical radiosonde data are known to have inhomogeneities that produce discontinuities in stratospheric and tropospheric temperature measurements. These have been detected via analysis of instrumentation changes [e.g., Gaffen, 1994; Lanzante et al., 2003] and comparison with satellite measurements [Randel and Wu, 2006]. Seidel and Randel suggest that tropopause observations also contain inhomogeneities. Of the 19 stations in this study that overlap with their station set, they eliminated all but 7 from consideration in their study. During our own processing of the data, obvious inhomogeneities were noted, and could generally be related to the number of significant levels reported in the sounding. Many of these are associated with the ‘mandatory level’ tropopause phenomenon noted above, and clearly visible on extended time series plots of individual tropopause observations (not shown). In most regions, these soundings became less prevalent in the mid-1980s. In Australia, the introduction of sounding systems with automated reporting clearly eliminated these minimal information soundings.
Despite all this, we make little effort to correct for inhomogeneities (beyond removing the mandatory level tropopause observations). Some inhomogeneities likely remain within the data. However, their effects are apparently too subtle to make much difference to the final result. The simple tropopause height frequency metric (sections 3.1 and 3.3) used here is relatively insensitive to small jumps in the data. Further, the technique used to build composite time series (section 3.2) is largely resistant to any data artifacts that may exist. The consistency of individual time series when making the composite series and the general agreement between regions in the mean and in interannual variability support the assertion that inhomogeneities do not significantly affect the results. The broad consistency seen between the observations and reanalysis results (section 4.4) also lends support to the hypothesis that the presence of inhomogeneities in the radiosonde data does not adversely affect the main conclusions of this paper.
3.1. Defining the Edge of the Tropics
Figure 5 shows time series of ZT based on radiosonde observations at Charleville, Queensland and Cobar, New South Wales from 2000 to 2003. The stations are separated by about 5 degrees of latitude and lie roughly on the same longitude (146°E) in eastern Australia. Over that distance, the tropopause shows distinct differences in behavior. At the more equatorward station, Charleville, the height of the tropopause varies between 15 and 18 km over an annual cycle. On occasion, the tropopause moves out of this range, likely reflecting a tropopause fold associated with a baroclinic wave or ‘cutoff’ low. These intermittent events can occur in any season, although less frequently in summer [e.g., Elbern et al., 1998]. In contrast, the more southerly station, Cobar, has a larger-amplitude annual cycle, with ZT varying from near 18 km in summer down to 10 km (and below) during winter. Despite the seasonal cycle, observations of winter ZT at 14–15 km are common.
Following Seidel and Randel, these data can be used to identify the edge of the tropics from the annual frequency distribution of ZT, shown on the right of Figure 5 for each calendar year. At Charleville, distributions of ZT show a single broad peak above 15 km with a long tail to lower heights. Moving equatorward from this locale (not shown), the distribution narrows and the long tail recedes. At Cobar, the annual height distribution of ZT is bimodal, with one peak near 16 km and the second centered at 11 km. This distribution is representative of the subtropics, a transition zone between the tropics and the extratropics, and displays characteristics of both regions for some part of the year. Moving poleward from this locale (not shown), the lower peak grows at the expense of the upper peak.
For our investigation of the tropical edge, we use the sounding data to estimate the number of ‘tropical tropopause days’ (TTD), a normalized count of the number of days with ZT greater than some height threshold representative of tropical tropopause values. Let Nt and N be vectors representing the monthly count of radiosonde observations above the chosen threshold and in total, respectively, for a given year. Then the observed number of TTD is

$$\mathrm{TTD}_O = 365\,\frac{\mathbf{N}_t \cdot \mathbf{I}}{\mathbf{N} \cdot \mathbf{I}} \qquad (1)$$

where I is a 12-point identity vector (each element equal to 1), so that the dot products sum the monthly counts over the year. For this paper, the year is defined to run from June through May of the following year to better match the timing of the El Niño-Southern Oscillation (ENSO) and to not split the warm season across two years. This metric is computed for each year and station in the network.
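As a rough illustration of the TTD count, the sketch below tallies the monthly vectors Nt and N for each June–May year and scales the exceedance fraction to a 365-day year. The scaling by 365 is an assumption consistent with TTD being expressed in days per year; the function names, the labelling of each year by the calendar year in which it starts, and the sample observations are hypothetical. The 14.5 km default is the threshold adopted later in this section.

```python
from datetime import date


def analysis_year(d):
    """Label a June-May year by the calendar year in which it starts."""
    return d.year if d.month >= 6 else d.year - 1


def raw_ttd(observations, threshold_km=14.5, days_per_year=365.0):
    """Raw (observed) TTD per June-May year.

    observations: iterable of (datetime.date, tropopause height in km).
    Returns {year label: TTD in days per year}.
    """
    counts = {}  # year label -> (n_t, n), each a 12-element monthly count list
    for d, zt in observations:
        n_t, n = counts.setdefault(analysis_year(d), ([0] * 12, [0] * 12))
        n[d.month - 1] += 1
        if zt > threshold_km:
            n_t[d.month - 1] += 1
    # Dotting with a vector of ones is just a sum over the 12 months.
    return {yr: days_per_year * sum(n_t) / sum(n)
            for yr, (n_t, n) in sorted(counts.items())}


# Illustrative observations only (not real soundings).
obs = [
    (date(2000, 7, 1), 16.0),
    (date(2000, 12, 1), 15.0),
    (date(2001, 1, 1), 12.0),
    (date(2001, 5, 31), 11.0),
    (date(2001, 6, 1), 16.0),
]
ttd_by_year = raw_ttd(obs)
```

Keeping the monthly vectors rather than a single annual tally is deliberate: the same counts are what a sampling-bias adjustment needs to judge how evenly the year was sampled.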
 The choice of the most appropriate tropical tropopause height threshold is unclear. Seidel and Randel  used a threshold equivalent to 15.5 km in geopotential height. Lu et al.  used a tropopause pressure of 120 hPa, on average equivalent to a height of 15.5 km. Birner , using reanalysis data, showed that the tropical expansion trends are sensitive to the choice of height threshold, particularly when it exceeds ∼15 km. He proposed a varying threshold based on the location of the minimum between the two peaks of the bimodal distribution. Davis and Rosenlof  suggest that trends based on the use of an absolute threshold in reanalyses are inferior to those that use variable or relative thresholds, as they do not account for observed rising trends in global tropopause heights of 40–80 m dec−1 [Seidel and Randel, 2006; Schmidt et al., 2008], although the uncertainty in this trend value remains large [Wang et al., 2012].
For this study, we use a fixed tropical tropopause height threshold of 14.5 km. While this choice is subjective, it is based on meteorological interpretation of the radiosonde data. The frequency distribution indicates that the threshold should lie between 13 and 15 km. As the threshold is moved higher than 15 km, where Birner shows that the trend sensitivity is particularly large, even regions of the deep tropics (∼10°S) increasingly become classified as ‘extratropical’ for some part of the year. With a lower threshold, the tropical edge is further poleward in the mean. Considering the time series in Figure 5, it can be argued that the majority of observations between 13 and 15 km arise from extratropical processes; stations from deeper in the tropics (not shown) only rarely report ZT lower than 15 km.
The consequence of the choice of a fixed threshold is a possible overestimate of the ‘true’ rate of SH tropical expansion, as no explicit account is made of the tropopause trend. The variable thresholds in the different reanalyses described by Birner show similar interannual variability but little consistency in the trends, and the methodology does not produce robust results for the tropical expansion trend values. Similarly, Davis and Rosenlof [2012, Table 4] report amounts of ‘trend reduction’ caused by using relative ZT thresholds that are seemingly unrelated to ZT trends computed in the different reanalyses; the largest reduction occurs with one of the smallest ZT trends, which itself is an order of magnitude smaller than the observed. Taken together, it appears that this methodology produces results that are strongly reanalysis-dependent and not physically robust.
 The choice of 14.5 km is at the upper end of the suggested range. It roughly corresponds to the minimum tropopause heights observed in a tropical atmosphere and is well above the absolute minimum between the modes. This high threshold also minimizes the effect of formerly extratropical soundings ‘crossing over’ to become tropical soundings solely as the result of the modest (40–80 m dec−1) rising tropopause trends. The overall sensitivity of the result to threshold choices is shown in section 4.3, with explicit estimates of the value of this ‘sympathetic trend’ presented in section 5.
3.2. Creating a Time-Latitude Array of TTD
 From the station records in a region, we derive a time-latitude array that shows the temporal evolution of the meridional profile of TTD. Treating the data in this manner tacitly assumes that the broad-scale structure of the tropopause, at least over the study area, is axisymmetric – the meridional variations are larger than the zonal variations. Zonal variability in the structure of the tropopause introduces a degree of uncertainty into the analysis.
 In reanalysis products, the data are available on a regular uniform grid and the annual meridional profile is a straightforward exercise of zonal averaging. With the radiosondes, the available data are sparse in time, unevenly spaced and require special consideration. With these data, a composite time series over broadly defined latitude bands is created. Defining a latitude band requires a balancing between the desired latitudinal resolution, the number of available stations and the completeness of the data at the individual stations making up the band. Greater latitudinal resolution allows for greater nuance in the final analysis, with less meridional smoothing in the subtropics where TTD changes rapidly. A larger number of stations in a band suggests a more representative and robust zonal average and provides greater confidence in the results.
The bands are depicted spatially in Figure 1, and Figures 2–4 show the station composition and nominal latitude ranges of each band and region. The bands vary between the different regions, subjectively chosen to maximize latitudinal resolution while providing adequate station sampling. Ten bands are chosen in ANZ, 9 in SA and 8 in AFR, dictated by the realities of the available data. Within a region, the spacing of the bands is not uniform. Generally, the heart of the subtropical regions is well-sampled, and most bands there are between 3 and 5 degrees in width and contain three or more stations. Outside of those regions the bands are generally broader as there are fewer stations available. In some cases, the bands have significant shortcomings, notably poor station coverage, which reduces confidence in the results. This is further discussed in section 4.2.
Creating the composite time series of TTD for a band is a multistep process. The initial step is the removal of poorly sampled years. Minimum thresholds based on the numbers of unique days and months with usable data (defined as 4 or more ZT observations in a month) are applied. In AFR and SA, the minimum thresholds are 60 unique days per year and 6 individual months sampled. In ANZ, the minimum thresholds are 120 unique days per year and 6 months. These low thresholds of data acceptance are necessary to complete the time-latitude array; the ‘cost’ is an increase in the uncertainty of the value of TTD, as well as a possible bias. The uncertainty and bias are detailed in section 3.3.
With the refined and bias-corrected station TTD time series, a first difference technique analogous to that described in Peterson et al. is used to produce the composite time series. This technique is relatively resistant to inhomogeneities in the data, limiting their impact to the single time when they occur [Peterson and Easterling, 1994]. However, Free et al. note that it may introduce a ‘random walk’ error when used in time series with missing data because of an ‘endpoint outlier’ effect. Here, no automatic removal of these points was applied, but all series were visually inspected to identify blatant instances of such errors, and any such points were removed. Subtle effects of these errors may remain, and are reflected in the calculated uncertainty. The time series are calibrated using the 2010 mean TTD as the initial point; an ‘absolute’ bias is introduced to the degree that this initial point is in error.
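The general idea of a first difference composite can be sketched as follows. This is a simplified illustration of the technique, not the authors' implementation: year-to-year differences are formed at each station wherever two consecutive years are both present, the differences are averaged across stations, and the averaged differences are integrated backward from the final year, which is anchored on the mean of the stations reporting in that year. The function name, the handling of broken chains (returning `None`), and the sample values are assumptions made for illustration.

```python
def first_difference_composite(stations, years):
    """Composite series from station series via first differences.

    stations: {station name: {year: TTD}}; years may be missing at a station.
    years: ascending list of consecutive years; at least one station is
           assumed to report in the final year, which anchors the series.
    """
    # Mean across stations of the change between each pair of adjacent years.
    mean_diff = []
    for y0, y1 in zip(years[:-1], years[1:]):
        d = [s[y1] - s[y0] for s in stations.values() if y0 in s and y1 in s]
        mean_diff.append(sum(d) / len(d) if d else None)

    # Anchor on the final year, then integrate the differences back in time.
    anchor = [s[years[-1]] for s in stations.values() if years[-1] in s]
    composite = {years[-1]: sum(anchor) / len(anchor)}
    for k in range(len(years) - 2, -1, -1):
        later = composite[years[k + 1]]
        if later is None or mean_diff[k] is None:
            composite[years[k]] = None  # chain broken: no station spans the pair
        else:
            composite[years[k]] = later - mean_diff[k]
    return composite


# Illustrative values only.
example = {
    "A": {2008: 200, 2009: 210, 2010: 205},
    "B": {2009: 190, 2010: 195},
}
series = first_difference_composite(example, [2008, 2009, 2010])
```

Because only differences enter the average, a station that starts or stops reporting does not shift the composite by its own mean level. The cost is that any error in a differenced value carries into every earlier year of the reintegrated series.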
Figure 6 shows examples of the composite time series and their individual components for each region. The individual station series of TTD cover a different range of values, reflecting the different latitudes of the stations in a region where TTD changes rapidly (cf. Figure 8). In AFR and ANZ the stations are within 2° latitude and the individual TTD series cover a narrow range; in SA the spread is closer to 4°, with an increase in the range of TTD values. However, the variations in time across stations within a given band are broadly consistent; years with high or low values of TTD generally coincide (e.g., around 1990). The sample bias correction plays a significant role in producing the similarity between time series, particularly in the SA and AFR regions where the data are generally less complete. The station composition in any given year is variable; not all stations are present the entire time. However, with a few exceptions noted later, at least two stations go into the composite in all years. From Figure 6 it is apparent that the composite methodology successfully captures the essential variability within the selected latitude bands. Large effects due to ‘random walk’ errors associated with the technique have apparently been minimized, although subtle effects may remain.
A composite annual time series from each zonal band forms the required time-latitude array suitable for examining the temporal variability of the edge of the tropics. Variations in this transition zone are illuminated by contouring the resulting time-latitude array as shown in previous studies [e.g., Seidel and Randel, 2007]. In this study, we show four contours where TTD equals 300, 200, 100 and 50 days per year, reflecting the transition from the tropics to the extratropics. As indicated by Birner, the specific choice of TTD threshold (contour) can also introduce some sensitivity into the results. A particular focus will be on the 200 days per year contour (hereafter referred to as the 200-day contour). This contour, found near 30°S with the ZT threshold used here, broadly coincides with the edge of the Hadley cell as traditionally defined by using the isobaric stream function [e.g., Oort and Yienger, 1996; Trenberth and Stepaniak, 2003]. To investigate tropical expansion, linear trends are computed on the positions of these contours using ordinary least squares regression. Confidence intervals, reported at the 90% confidence level, account for the measurement uncertainties following Press et al., calculated as described in section 3.3. The statistical effects of autocorrelated residuals are also included following Santer et al., although we note this effect is generally small.
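A common way to allow for autocorrelated residuals in a least squares trend is to shrink the sample size using the lag-1 autocorrelation of the residuals, in the manner of the effective sample size adjustment. The sketch below shows only that part, under stated assumptions: it does not include the weighting by measurement uncertainty, the function name is hypothetical, and capping the effective sample size at the actual sample size is a conservative choice made here and not something taken from the text.

```python
import math


def trend_with_autocorrelation(x, y):
    """OLS slope of y on x with a lag-1 autocorrelation-adjusted standard error.

    Returns (slope, standard error of slope, effective sample size).
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    intercept = y_mean - slope * x_mean

    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in resid)
    # Lag-1 autocorrelation of the residuals (zero for a perfect fit).
    r1 = sum(a * b for a, b in zip(resid, resid[1:])) / sse if sse > 0 else 0.0

    # Effective sample size; capped at n so negative r1 cannot inflate it.
    n_eff = min(n, n * (1.0 - r1) / (1.0 + r1))
    if n_eff <= 2:
        raise ValueError("effective sample size too small for a trend estimate")
    std_err = math.sqrt(sse / (n_eff - 2.0) / sxx)
    return slope, std_err, n_eff


# Illustrative series only: contour latitude (deg) against year index.
slope, std_err, n_eff = trend_with_autocorrelation([0, 1, 2, 3], [0, 1, 1, 2])
```

With years as the predictor, multiplying the slope by 10 gives a trend in deg dec−1. A 90% interval would then use the Student's t quantile with `n_eff - 2` degrees of freedom, which is left to the reader's statistics library of choice.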
3.3. Sampling Bias and Uncertainty
 Even the best radiosonde stations have some degree of missing data. Disruptions in the radiosonde sampling of the tropopause range from a few scattered missing days to gaps in the data coverage extending for months (or even years), and are attributable to a variety of causes. With the methodology of this study, missing data potentially introduces a bias and creates uncertainty on the number of TTD. The effects on the calculation depend on the amount and timing of the missing data and the location of the station. These issues must be addressed to obtain reliable and consistent results.
A sampling bias arises from the uneven measurement of the tropopause over the complete annual cycle. If missing days are few and/or distributed somewhat evenly throughout the year, the effects on the TTD calculation are minimal. However, ‘solid blocks’ of missing data are more problematic. For example, if the summer months (DJF) are missing then the TTD will be biased low. A second consideration is the amplitude of the seasonal cycle. Consider the stations in Figure 5. Charleville, the ‘tropical’ station, is less likely to be substantially biased than the more subtropical Cobar, where ZT ranges seasonally across the chosen threshold. Missing data are not the only potential cause of sampling bias; seasonal differences in sonde performance (e.g., the often-lower winter tropopause is easier to sample) can create a subtle effect on the sample where soundings are launched more often than once a day.
Accounting for these effects requires an adjustment to the raw TTD value (equation (1)). Let H be a monthly vector defining the historical probability of an observation exceeding the chosen threshold, as derived using the complete data set. A climatological TTD (TTDC) can then be defined as

$$\mathrm{TTD}_C = \mathbf{H} \cdot \mathbf{D} \qquad (2)$$

where D is a vector of the number of days in each month (i.e., 31, 28, etc.). From H, a ‘predicted’ value for TTD (TTDP) can be made based on the observed distribution of the sample for a given year:

$$\mathrm{TTD}_P = 365\,\frac{\mathbf{H} \cdot \mathbf{N}}{\mathbf{N} \cdot \mathbf{I}} \qquad (3)$$

A direct estimate of the magnitude of the annual sampling bias is obtained from bias = TTDP − TTDC. This bias estimate is then subtracted from the raw value (TTDO, equation (1)) to obtain the bias-adjusted TTD,

$$\mathrm{TTD} = \mathrm{TTD}_O - \left(\mathrm{TTD}_P - \mathrm{TTD}_C\right) \qquad (4)$$
Beyond this sampling bias, uncertainties still exist in the TTD estimate if anything less than the full 365-day sample is used; whether the tropopause exceeds the threshold on the missing days is not known with absolute certainty. Logically, the more unique days the tropopause is sampled, the lower the uncertainty. Like the sampling bias, this random uncertainty is also larger where the amplitude of the annual cycle varies across the chosen threshold. It is largest where the annual chance of being above or below the threshold is near 50% (i.e., TTDC = 182.5 days). At the extremes (i.e., where TTDC is near 365 or zero days), the uncertainties approach zero.
To quantify this uncertainty, we derive a relationship between the amount of uncertainty, the number of unique days and TTDC, a multistep process. As a basis, eight stations from ANZ during 2003–04 are used to define a unique-days/uncertainty relationship. These chosen stations lie in bands ANZ2-ANZ9 and have nearly complete annual samples. To estimate the uncertainty, a Monte Carlo approach is used. The stations are randomly sub-sampled to mimic a year with incomplete sampling. For each degraded sample, TTD is re-computed as in equations (1)–(4). The error is defined as the difference of the new TTD from the known value of the full sample. Statistics of the error for 10 000 tests are calculated for a range of unique days retained. The mean error for all values of unique days is near zero; the bias correction removes any systematic bias. The standard deviation of each set of tests is used as the uncertainty for that value of unique days. With more than 60 unique days sampled, the uncertainty linearly decreases as more days are included in the sample.
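The sub-sampling experiment can be sketched as below. Several assumptions are made and should be read as such: the bias adjustment is written as the raw value minus (predicted minus climatological), with the raw and predicted values scaled to a 365-day year; one observation per sampled day is assumed; the monthly probabilities `h` are supplied by the caller; and all function names and the synthetic year are hypothetical. The default of 1000 trials is chosen for speed, whereas the text uses 10 000.

```python
import random
import statistics

DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]  # Jan..Dec


def adjusted_ttd(n_t, n, h):
    """Bias-adjusted TTD from monthly counts n_t, n and monthly probabilities h."""
    total = sum(n)
    ttd_obs = 365.0 * sum(n_t) / total                           # raw value
    ttd_clim = sum(hm * dm for hm, dm in zip(h, DAYS_IN_MONTH))  # climatology
    ttd_pred = 365.0 * sum(hm * nm for hm, nm in zip(h, n)) / total
    return ttd_obs - (ttd_pred - ttd_clim)


def monthly_counts(sample):
    """sample: list of (month 1-12, above-threshold bool) -> (n_t, n)."""
    n_t, n = [0] * 12, [0] * 12
    for month, above in sample:
        n[month - 1] += 1
        n_t[month - 1] += int(above)
    return n_t, n


def subsample_uncertainty(days, h, n_keep, n_trials=1000, seed=0):
    """Mean and standard deviation of the TTD error when only n_keep days remain."""
    rng = random.Random(seed)
    reference = adjusted_ttd(*monthly_counts(days), h)
    errors = []
    for _ in range(n_trials):
        degraded = rng.sample(days, n_keep)  # random days, without replacement
        errors.append(adjusted_ttd(*monthly_counts(degraded), h) - reference)
    return statistics.mean(errors), statistics.stdev(errors)


# Synthetic year (illustrative): tropopause above threshold in Nov-Mar only.
warm = {11, 12, 1, 2, 3}
h = [1.0 if m in warm else 0.0 for m in range(1, 13)]
year = [(m, m in warm) for m in range(1, 13) for _ in range(DAYS_IN_MONTH[m - 1])]
full_value = adjusted_ttd(*monthly_counts(year), h)
summer_only = adjusted_ttd(*monthly_counts([d for d in year if d[0] in (12, 1, 2)]), h)
mean_err, sd_err = subsample_uncertainty(year, h, 60, n_trials=50)
```

In this idealized year every day within a month behaves identically, so the adjustment recovers the full-sample value exactly and the spread is zero. With real soundings, where the exceedance probability within a month is neither 0 nor 1, the standard deviation is nonzero, and repeating the experiment for a range of `n_keep` values traces out the unique-days/uncertainty relationship.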
 The linear regression coefficients for these relationships vary at the eight stations. As expected, uncertainty is higher in the subtropics and lower toward the ‘extremes’. From these 8 stations, a linear relationship between the linear parameters and the TTDC is derived. We assume that the slope and intercept of the first fit are symmetric about TTDC = 182.5 days and transform the predictor accordingly. The resulting relationships predict the slope and intercept parameters for the unique-days/uncertainty relationship based on TTDC. The relationship as initially derived over-predicts the uncertainty at the extremes. To account for this we split the stations and derived two separate relationships for the slope/intercept-TTDC valid over different ranges.
Figure 7 shows the calculated uncertainty as a function of the number of unique days sampled and TTDC. This two-step process is used to make annual estimates of the uncertainty at each station. With 60 unique days sampled in the heart of the subtropics, the estimated uncertainty (one standard deviation) is approximately 18 days, decreasing to about 5 days when 300 unique days are sampled. The uncertainty is smaller moving away from the subtropics, and is relatively low at the extremes.
 When applying the multipart first difference technique for creating the composite time series, these uncertainties are propagated through each step of the calculation – the differencing, the averaging and reintegration. This process gives the annual measurement errors for the band-averaged TTD time series relative to the determined initial 2010 starting point. As a result of the iterative nature of the calculation, the uncertainties accumulate toward the beginning of the record. A Monte Carlo procedure is again used to translate the measurement errors into estimates of the uncertainty in the position of the contours. Normally distributed random errors scaled by the measured uncertainty are applied to each point in the time-latitude array and new contour positions are then determined. This procedure is repeated 1000 times, and the standard deviation of the contour positions is used as the uncertainty in the trend calculations.
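The final Monte Carlo step, in which measurement errors are translated into contour-position uncertainty, might be sketched as follows. The band latitudes, TTD values and the 10-day error are invented for illustration, and `contour_latitude` and `contour_uncertainty` are hypothetical helpers rather than the routines used in the study.

```python
import numpy as np


def contour_latitude(lats, ttd_profile, level):
    """Latitude of the first equator-to-pole crossing of `level` (linear interpolation)."""
    for j in range(len(lats) - 1):
        a, b = ttd_profile[j], ttd_profile[j + 1]
        if (a - level) * (b - level) <= 0 and a != b:
            frac = (a - level) / (a - b)
            return lats[j] + frac * (lats[j + 1] - lats[j])
    return np.nan  # the profile never crosses the requested level


def contour_uncertainty(lats, ttd, sigma, level, n_trials=1000, seed=0):
    """Standard deviation of the contour latitude, per year, under random errors.

    ttd   : array (n_years, n_bands) of band-averaged TTD
    sigma : measurement error in days (scalar or array broadcastable to ttd)
    """
    rng = np.random.default_rng(seed)
    n_years = ttd.shape[0]
    positions = np.full((n_trials, n_years), np.nan)
    for k in range(n_trials):
        noisy = ttd + rng.normal(0.0, 1.0, size=ttd.shape) * sigma
        for y in range(n_years):
            positions[k, y] = contour_latitude(lats, noisy[y], level)
    return np.nanstd(positions, axis=0)


# Illustrative values only: six latitude bands (degrees, negative = south), two years.
lats = np.array([-22.0, -26.0, -30.0, -34.0, -38.0, -42.0])
ttd = np.array([[340.0, 300.0, 200.0, 100.0, 50.0, 20.0],
                [345.0, 310.0, 215.0, 110.0, 55.0, 20.0]])

print(contour_uncertainty(lats, ttd, sigma=10.0, level=200.0))
```

Because the contour position is found by interpolating between bands, the same error in days maps to a larger latitude uncertainty wherever the meridional TTD gradient is weak.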
4.1. Mean Structure of TTD
Figure 8 shows the mean structure of TTD, highlighting its broad meridional characteristics in the atmosphere. North of ∼26°S, the tropopause is at tropical levels for more than 300 days of the year. This region is unambiguously tropical. South of ∼38°S, the tropopause is at tropical heights for fewer than 50 days of the year. These are the extratropics. The shift from the tropics to the extratropics occurs as a transition between these limits. This region is the subtropics, and can be loosely defined as the region between the 300- and 50-day TTD contours. Seidel and Randel similarly identified the subtropics as lying between 28° and 40° latitude in both hemispheres. The location of these limits is reasonably robust across all three regions sampled.
 The mean results support the analysis methodology used in this work. The broad similarity of the meridional structure over the three regions indicates that the assumption of an axisymmetric tropospheric structure is reasonable on the annual time frame used here. Further confidence in the results is gained by comparing the estimates of band-average TTD produced from the average climatological TTD at individual stations and the mean of the composite time series. With a few exceptions, the two estimates of the mean are within 7 days of each other, indicating that any ‘random walk’ errors introduced by the first difference methodology are minimal. Despite the deficiencies of the data, the methodology for producing the latitude band composite time series is reasonably accurate and produces meaningful results.
4.2. Variation of the Southern Hemisphere TTD Since 1979
Figure 9 presents the contours of the TTD time-latitude array for the three regions of the SH analyzed here; the calculated uncertainty in the position of those contours (section 3.3) is also shown. With the exception of parts of SA, the contours in all three regions show modest expansionary trends overlain with different degrees of interannual variability.
The uncertainty in the contour positions ranges from 0.8° to 2.6° latitude across all regions. Uncertainty is largest early in the record, a result of both the first-difference technique used and the generally lower data availability then. The contour uncertainty is also comparatively lower on the 300- and 200-day contours. The uncertainty is lowest in the ANZ region, where the data are most complete. None of the bands are obviously deficient. The largest uncertainty is found in the SA region, particularly on the 100- and 50-day contours, where the uncertainty ranges overlap. The high uncertainty is a reflection of the poor data quality on bands SA6 and SA7 (Figure 3), where limited data are available after 2001, and likely accounts for the apparently high trends on those contours. While the calculated uncertainty in the AFR region is comparable to that in ANZ, we have less confidence in these results. The AFR1–4 bands are data-sparse (Figure 4), possibly affecting the 300-day contour from this region. South of the continent, only four widely spaced stations are available. The bands are wide, and the 100- and 50-day contours are bounded by the same two bands, likely reducing the resolution of detail there.
Table 1 shows the linear trend values for each contour for the entire 1979–2010 period, along with the associated 90% confidence intervals, which include the effects of the position uncertainties. Overall, 11 of the 12 contours show expansionary trends, although these values only exceed the confidence intervals in 6 of 12 cases. This is a result of including measurement errors; 9 of 12 are significant without them. Discounting the lower-confidence contours (i.e., SA 100-day, SA 50-day, AFR 300-day) as identified above, linear trends range from −0.2 to −0.7 deg dec−1. The AFR region shows distinctly lower trends; values in SA and ANZ are comparable.
Table 1. Linear Trends and 90% Confidence Intervals of TTD Contours From Radiosondes
Units are degrees per decade and negative trends indicate an expansion of the tropics in the SH. Confidence intervals include effects of measurement errors. Results significant at the 90% level are in bold.

| Contour | ANZ | SA | AFR |
| --- | --- | --- | --- |
| 300-day | **−0.37 ± 0.32** | **−0.67 ± 0.52** | +0.09 ± 0.32 |
| 200-day | **−0.62 ± 0.39** | −0.39 ± 0.40 | −0.25 ± 0.39 |
| 100-day | **−0.59 ± 0.49** | **−0.95 ± 0.52** | −0.29 ± 0.49 |
| 50-day | −0.21 ± 0.48 | **−1.21 ± 0.91** | −0.21 ± 0.47 |
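As a quick consistency check on the counts quoted in the text, the Table 1 values can be tallied in a few lines. This is a sketch only: the assignment of each value to a region and contour is inferred from the order of the table entries and from the discussion in section 4.2, and the averaging at the end is our own arithmetic, not the authors' calculation.

```python
# (region, contour) -> (trend, 90% confidence half-width), in degrees per decade.
# The region/contour labels are an inferred layout of Table 1 (assumption).
table1 = {
    ("ANZ", 300): (-0.37, 0.32), ("SA", 300): (-0.67, 0.52), ("AFR", 300): (0.09, 0.32),
    ("ANZ", 200): (-0.62, 0.39), ("SA", 200): (-0.39, 0.40), ("AFR", 200): (-0.25, 0.39),
    ("ANZ", 100): (-0.59, 0.49), ("SA", 100): (-0.95, 0.52), ("AFR", 100): (-0.29, 0.49),
    ("ANZ", 50): (-0.21, 0.48), ("SA", 50): (-1.21, 0.91), ("AFR", 50): (-0.21, 0.47),
}

# Negative trends indicate expansion of the tropics in the SH.
expanding = [key for key, (trend, ci) in table1.items() if trend < 0]

# A trend is counted as significant when its magnitude exceeds the confidence half-width.
significant = [key for key, (trend, ci) in table1.items() if abs(trend) > ci]

# Simple average of the three regional 200-day trends.
mean_200 = sum(table1[(region, 200)][0] for region in ("ANZ", "SA", "AFR")) / 3

print(len(expanding), len(significant), round(mean_200, 2))
```

Under this layout the tally gives 11 expanding contours and 6 significant ones, matching the text, and the average of the regional 200-day trends is close to the hemispheric value reported later for the averaged contour.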
Interannual variability is apparent in the position of all contours, although its magnitude differs between contours. Generally speaking, the 300- and 200-day contours tend to show less interannual variability than the more extratropical 100- and 50-day contours. An exception is the SA 300-day contour, which shows large excursions in the early 1980s and 1990s. The AFR 100- and 50-day contours also show less variability than in other regions. The position of the contours can vary by 4° latitude from year to year.
4.3. Sensitivity to Threshold Choices
Figure 10 shows the sensitivity of the calculated trends to the choice of tropical ZT and TTD thresholds for each region. Overall, the details of the pattern of sensitivity in each region are different. The trends are relatively stable in the parameter space between TTD = 150–250 days and ZT = 13.5–15.0 km. Here the range of trend variability is around 0.2 to 0.4 deg dec−1. In ANZ, with the largest variability overall, the trend varies more strongly with the ZT threshold. In the other regions, the trend is more sensitive to the TTD parameter. Overall, the smallest sensitivity is seen in AFR. Outside of this central portion of parameter space, greater sensitivity is seen in the ANZ and SA regions, particularly as the TTD threshold is decreased. The SA region also shows a strong region of sensitivity where the ZT threshold exceeds 14.5 km and TTD is larger than 250 days.
Overall, there is no consistent pattern to the sensitivity between the three regions or with the results in Birner. This suggests that the methodology itself is largely free from systematic bias. Our analysis is clearly not free from uncertainty due to the subjective choice of thresholds; however, these sensitivities are relatively small, less than the size of the computed confidence intervals for reasonable choices of the parameters. We speculate that the sensitivities arise, at least in part, from deficiencies in the underlying data. Generally, greater sensitivity is observed where the data are of poorer quality and/or the composite time series is less trustworthy. For example, the large sensitivities in the SA region around TTD = 100 days coincide with data quality issues noted earlier. Following this logic, the stronger sensitivities noted in the reanalyses could be related to issues with the detection of the tropopause in those data. This is examined in the next section.
4.4. Comparison of Radiosonde and Reanalysis TTD
The results of the radiosonde analysis of TTD are here compared to the same metric computed using data from four global reanalysis products. Specifically, we use the NCEP/NCAR (NCEP), the NCEP/DOE (NCEP2) [Kanamitsu et al., 2002], the ERA-40 and the ERA Interim (ERA-I) [Simmons et al., 2006] reanalyses for this purpose. These reanalyses are chosen simply on the basis that the necessary daily data are available to us. The data used here are interpolated to specified pressure levels, rather than the original model levels. The 1979–2010 period is analyzed, with the exception of the ERA-40, which stops in 2001. Instead of ZT, the daily tropopause pressure (PT) is used. Neither ZT nor PT is explicitly measured in the reanalyses. Instead, the technique of Reichler et al. is used to estimate PT from the coarse reanalysis vertical resolution at each grid point within the limits of the ANZ, SA and AFR regions (Figure 1), following the WMO tropopause definition. While this method is reasonably accurate at most locales, Reichler et al. note that significant uncertainty is observed in the subtropics, where large differences from the radiosondes (up to 60 hPa or more) were identified in monthly means. Similarly, Son et al. show that reanalysis-based PT values are biased low (by 20 hPa or more) in the global subtropics when compared to Global Positioning System (GPS) radio occultation derived measurements. Our testing found similar results. In particular, the reanalysis-detected tropopause fails to correctly reproduce the annual cycle of mean monthly tropopause height (not shown), particularly in the subtropics. Toward the deep tropics, the PT is too low (i.e., ZT is too high) all year round, with particular problems during the late winter and early spring. Further south, the summer mean tropopause heights are too high. These artifacts result in an overestimate of TTD.
Another possible contributing factor was noted by Birner. In subtropical regions where a double tropopause structure is frequently observed [e.g., Randel et al., 2007; Añel et al., 2008], the detection of the lower tropopause is sensitive to the way the thickness criterion is implemented within the tropopause identification routine. With a more restrictive criterion, the lower tropopause was not identified roughly 10% of the time in his sensitivity study, which would result in a higher estimate of TTD. Reanalysis tropopause heights computed in this study use the equivalent of Birner's ‘less restrictive’ criterion, while radiosondes implement stricter depth criteria. We do not feel that these differences in the implementation of the tropopause detection routine play a significant role in the differences between radiosondes and reanalyses described below, and they may in fact result in a slight underestimate of those differences. Our investigation suggests that these problems in the reanalysis largely result from the relatively coarse resolution of the fixed model levels, which do not adequately capture the complex thermodynamic structure in the upper levels of the subtropics. The resolution and the subsequent interpolation ‘smooth out’ this structure, resulting in a misidentification of the tropopause. Our results show that the routine better reproduces the radiosonde-based annual cycle (but still not correctly) when the number of levels in the upper troposphere is increased, as in newer reanalysis products (e.g., ERA-I).
Figure 11 shows the TTD contours from both the radiosondes and the reanalyses. The tropical tropopause threshold for PT is set at approximately 140 hPa (it varies between 142 and 137 hPa with latitude), matching (on average) the 14.5 km threshold used for ZT in the radiosonde analysis. The most obvious feature of the plot is the poor match of the reanalysis and radiosonde TTD contours. Overall, the reanalysis contours are shifted poleward by 0.5–3.5 degrees latitude. This is a result of the overestimate of TTD by the reanalyses. Given the typical TTD gradient (Figure 8) in the subtropics, we estimate that this too-high tropopause occurs on 10 to 70 days per year. The amount of this shift depends on the choice of contour, region and the reanalysis product. The shift is larger on the 300- and 200-day contours compared to the 100- and 50-day contours. The shift is also smaller over the ANZ region, where the data quality is highest; the AFR and SA regions have noticeably larger shifts. Finally, the NCEP and NCEP2 reanalyses have a significantly larger shift, particularly in comparison to the ERA-I, which shows the smallest shift and, for many contours, a tendency toward convergence with the sonde-reported contour positions in the latter part of the record. The amount of shift also varies with the choice of threshold. Choosing a PT threshold of 120 hPa (roughly ZT of ∼15.5 km; not shown), as in the studies by Seidel and Randel and Lu et al., results in a smaller shift (<1.8 degrees).
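The conversion between a poleward contour shift and the number of excess tropical tropopause days can be checked with simple arithmetic. The sketch below assumes a constant TTD gradient across the subtropics, taken from the approximate contour latitudes given in section 4.1 (300-day contour near 26°S, 50-day contour near 38°S); the function name `excess_days` is hypothetical.

```python
# Approximate contour latitudes (degrees south) from the mean structure in section 4.1.
LAT_300_DAY = 26.0
LAT_50_DAY = 38.0

# Assumed constant gradient across the subtropics, in days per degree of latitude.
GRADIENT = (300.0 - 50.0) / (LAT_50_DAY - LAT_300_DAY)


def excess_days(shift_deg):
    """Days per year of overestimated TTD implied by a poleward contour shift."""
    return shift_deg * GRADIENT


for shift in (0.5, 3.5):
    print(shift, round(excess_days(shift)))
```

With a gradient of roughly 21 days per degree, shifts of 0.5° and 3.5° correspond to about 10 and just over 70 days, in line with the range quoted above.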
 Despite this mismatch in position, the radiosonde and reanalysis contours show a broad agreement in interannual variability. Correlations of de-trended contour positions generally range between +0.5 and +0.9, with a typical value near +0.6. Larger correlations are seen with the ERA-I on the ANZ 100- and 50-day contours (Figure 9). The contractions of the SA 300-day contour are not well captured with the threshold of 140 hPa, but are resolved with the higher threshold (not shown). A similar result is also seen in the sensitivity for SA (Figure 10). As with the shift, the best correspondence of interannual variability with the observational data is seen in the ANZ region and with the ERA-I reanalysis.
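The de-trended correlations quoted here can be computed along the following lines. This is a sketch under the assumption that de-trending means removing a least-squares linear fit from each annual series; the two series below are synthetic and the function names are hypothetical.

```python
import numpy as np


def detrend(y, t):
    """Remove the least-squares linear trend of y against t."""
    slope, intercept = np.polyfit(t, y, 1)
    return y - (slope * t + intercept)


def detrended_correlation(a, b, t):
    """Pearson correlation of two series after linear de-trending."""
    return np.corrcoef(detrend(a, t), detrend(b, t))[0, 1]


# Synthetic example: two contour-position series (degrees latitude) that share the
# same interannual signal but differ in mean position and in trend.
t = np.arange(32.0)                      # years since the start of the record
signal = np.sin(t)                       # shared interannual variability
sonde = -30.0 - 0.04 * t + signal
reanalysis = -32.0 - 0.01 * t + signal

print(round(detrended_correlation(sonde, reanalysis, t), 3))
```

The example illustrates why a constant position shift and a different linear trend do not, by themselves, lower the de-trended correlation; only differences in the year-to-year variability do.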
Table 2 gives the linear trend values and 90% confidence intervals for the reanalysis contours. No measurement errors were applied here, which reduces the width of the confidence intervals in this case. Reanalysis trends cover a wider range of values compared to the radiosondes. The ERA-I is most variable, with contraction trends in SA, expansion trends in Africa and near-zero trends in ANZ. The other reanalyses show a general expansion trend, usually stronger in the NCEP and NCEP2 products. The radiosonde trends (Table 1) in ANZ typically show more expansion, particularly on the 200-day contour; NCEP and NCEP2 are the closest here. In SA, the expansion trends are generally weaker than seen in the radiosondes. In AFR, the observed weak trends are only reproduced by the ERA-I; the others suggest an expansion far in excess of the observations. The AFR trends are generally stronger than those found in other regions, as well.
Table 2. Linear Trends and 90% Confidence Intervals of TTD Contours From Four Reanalyses
Units are degrees per decade and negative trends indicate an expansion of the tropics in the SH. ERA-40 analysis runs from 1979 to 2001. Results significant at 90% level are in bold. Measurement errors are unavailable or not included in confidence intervals.
+0.13 ± 0.16
−0.02 ± 0.27
−0.34 ± 0.20
−0.50 ± 0.24
+0.07 ± 0.20
−0.13 ± 0.36
−0.41 ± 0.22
−0.53 ± 0.19
−0.02 ± 0.28
−0.28 ± 0.56
−0.52 ± 0.30
−0.51 ± 0.19
+0.04 ± 0.50
−0.42 ± 0.95
−0.83 ± 0.57
−0.88 ± 0.34
+0.19 ± 0.24
−0.38 ± 0.54
−0.45 ± 0.36
−0.42 ± 0.32
+0.18 ± 0.22
−0.29 ± 0.39
−0.25 ± 0.22
−0.28 ± 0.22
+0.23 ± 0.24
−0.37 ± 0.38
−0.21 ± 0.27
−0.29 ± 0.24
+0.32 ± 0.35
−0.65 ± 0.55
−0.33 ± 0.25
−0.44 ± 0.25
−0.09 ± 0.19
−0.48 ± 0.37
−0.63 ± 0.23
−0.62 ± 0.22
−0.20 ± 0.21
−0.56 ± 0.39
−0.64 ± 0.17
−0.63 ± 0.17
−0.40 ± 0.36
−0.92 ± 0.67
−0.83 ± 0.24
−0.88 ± 0.25
−0.27 ± 0.47
−0.86 ± 0.60
−1.39 ± 0.36
−1.30 ± 0.35
The regional results presented in section 4 represent three independent samples of the variation of the Southern Hemisphere tropical edge. Figure 12 presents the contours of the individual TTD time-latitude arrays on the same figure, combining the regional results from Figure 9 onto one graph to allow the identification of global commonalities and regional differences.
Overall, the results are broadly consistent between regions (see also Figure 8). This similarity in the regional results indicates that the methodology used is robust and gives consistent, believable results despite the shortcomings of the historical radiosonde data. The broad similarity of the radiosonde and reanalysis interannual variability also supports this conclusion. Despite this confidence, some bands contain only limited data and the resulting contours should be treated with caution. In particular, this applies to the 100- and 50-day contours in the AFR and SA regions, as noted in section 4.2. The 300-day contour in AFR is also questionable, and the same contour in SA indicates a significant regional effect, strongly subject to the choice of threshold. Considering these factors, we conclude that the 200-day contour is the most robust and most suitable threshold for capturing the global signal. While the effects of sensitivity due to choice of thresholds cannot be completely eliminated, Figure 10 shows that our threshold choices lie in a relatively insensitive region of parameter space in all three regions. The 200-day contour is located near 30°S in all regions (Figure 8) and has been previously justified for use as a proxy for the tropical edge, albeit with different ZT thresholds [e.g., Lu et al., 2009; Birner, 2010]. Correlations between the de-trended regional 200-day contours range between +0.5 and +0.7; not only are the individual contours co-located, but they vary interannually in a similar fashion.
The regional 200-day contour positions for both reanalyses and radiosondes are averaged to create a global picture of the variation of the SH tropical edge. Figure 13 shows the relative positions of the average contours. The relative positions remove the position shift of the contours (section 4.4) and emphasize the similarities and discrepancies between the different analyses. Broadly speaking, the results track one another quite well. Overall, root mean square differences between the radiosonde and reanalysis relative contours are between 0.3 and 0.7 degrees. However, the performance is variable. In the first few years, particularly 1979–1982, the radiosonde contours are relatively equatorward of the reanalysis positions, by 0.5° or more. Both the ERA-40 and ERA-I in particular are a poor match here. From approximately 1985 to 1996, the agreement is better, with the ERA-I closely following the sonde contour. Both NCEP reanalyses match well until 1992. During 1997–2000, all four reanalyses and the radiosondes display excellent agreement, with a distinct expansion of ∼1°. After 2000, the ERA-I and the two NCEP products diverge, with a strong retreat of the tropical edge seen in the ERA-I contour; the NCEP products show a better match, but with differences largely in the opposite direction.
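A root mean square difference between relative contours can be sketched as below. The text does not spell out how the relative positions are formed; here we assume, for illustration only, that each series is expressed relative to its own time mean. The function names and sample values are hypothetical.

```python
import numpy as np


def relative_position(contour):
    """Contour position relative to its own time mean (assumed definition)."""
    contour = np.asarray(contour, dtype=float)
    return contour - contour.mean()


def rms_difference(a, b):
    """Root mean square difference between two relative contour series."""
    diff = relative_position(a) - relative_position(b)
    return float(np.sqrt(np.mean(diff ** 2)))


# Illustrative values: a reanalysis contour displaced 2 degrees poleward of the sonde
# contour but with identical variability gives zero RMS difference.
sonde = np.array([-30.0, -29.0, -30.0, -31.0])
shifted = sonde - 2.0
flat = np.full(4, -32.0)

print(rms_difference(sonde, shifted), round(rms_difference(sonde, flat), 3))
```

Working with relative positions means a constant poleward offset, such as the shift discussed in section 4.4, does not contribute to the RMS difference.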
Table 3 presents the linear trends for the relative contours. The radiosonde-based ‘global 200-day’ contour shows an expansionary trend of −0.41 ± 0.37 deg dec−1 from 1979/80 through 2010/11. This trend is significant at the 90% level. The magnitude is comparable to the trends over the full period in the NCEP and NCEP2 values, while the ERA-I shows a near-zero trend. The ERA-40 shows a smaller trend than the radiosonde trend (−0.73) over the corresponding period. In general, ERA-based trends are weaker than those found in the NCEP-based reanalyses. Birner and Davis and Rosenlof also suggest the possibility of contraction trends using tropopause-based methodologies in some reanalysis data sets, particularly in the JRA-25 product (not used here). The possibility of decadal variability cannot be excluded. From the radiosonde analysis, the rate of the expansion of the tropics is clearly variable in time. Most expansion has occurred in the first half of the record; little expansion is suggested in the latter half. Similar results were noted in Birner and Davis and Rosenlof, both of whom suggested a possible link to stratospheric composition and temperature changes identified by Randel et al.
Table 3. Linear Trends and 90% Confidence Intervals for the ‘Global 200’ Contour
Units are degrees per decade and negative trends indicate an expansion of the tropics in the SH. Trend calculated over 1979–2010, with the exception of ERA-40 (1979–2001). Measurement errors included in confidence interval of radiosonde, but not in reanalyses. Significance at 90% level indicated by bold text.
−0.41 ± 0.37
−0.48 ± 0.18
−0.43 ± 0.19
−0.32 ± 0.42
+0.01 ± 0.18
As noted in section 3.1, the use of a fixed ZT threshold may introduce a spurious trend caused by the observed trend in tropopause heights. The magnitude of this effect is estimated with an idealized calculation. We create a simulated time-latitude array with observations of ZT distributed following the mean annual frequency distribution. These observations are perturbed only by a specified (constant) ZT trend amount. The resulting change to the time-latitude array is then used to estimate the tropical expansion trend. Table 4 shows the sensitivity using the thresholds chosen here (i.e., 200 days and 14.5 km). Graphs of the sensitivity similar to Figure 10 (not shown) indicate that this ‘sympathetic’ trend is generally a minimum with ZT between 14 and 15 km, increasing both above and below that height range. Expansion trends also increase where both ZT threshold and TTD are at higher (lower) values, particularly when the imposed ZT trend is large. For observed ZT trend amounts, Table 4 shows that the tropical expansion trend associated with choosing static thresholds is less than 0.1 deg dec−1, 10–20% of the observed value of the tropical expansion trend. The effect of static tropopause thresholds is apparently not large enough to explain the entire observed trend.
Table 4. Magnitude of Tropical Expansion Trends Arising From Trend in Tropopause Height
ZT Trend (m dec−1)
Tropical Expansion Trend (deg dec−1)
Trends are estimated using the thresholds of TTD = 200-day and tropical ZT threshold of 14.5 km.
Modeling-based studies [Lu et al., 2007; Johanson and Fu, 2009; Lu et al., 2009] typically show that an expansion of 1–2° latitude is expected in those scenarios by the end of the 21st century. These studies hypothesize that the expansion of the tropics is related to an increase in the static stability in the extratropics, a result of changes in the hydrologic cycle following the mechanism described by Held and Soden. This static stability change inhibits baroclinic wave activity in the subtropics and results in the expansion. Johanson and Fu and Lu et al. raised the possibility that stratospheric ozone depletion in the SH is a factor in the expansion. In fact, Son et al. and Polvani et al. suggest that stratospheric ozone depletion is the main driver of 20th century atmospheric circulation changes, including a broadening of the SH tropics, particularly during the summer season. Other factors like absorbing aerosol or tropospheric ozone may also contribute to widening [Allen et al., 2012]. The forcing and dynamics behind tropical expansion trends remain an open question.
Fluctuations on an interannual time-scale are a significant component of the variability of TTD. Although the amplitude of this variability is larger in the first half of the record, it is present throughout the period of study. Year-to-year changes in TTD are noted in all regions and on all contours, albeit to different degrees in both a regional and latitudinal sense. In general, higher amplitude variability is seen on the 100- and 50-day contours. However, there are exceptions to these broad statements. The AFR 100- and 50-day contours show smaller amplitude variability compared to other regions. The SA 300-day contour shows larger displacements compared to other regions. Figure 11 supports the idea that these individual contour differences are real. Figure 12 indicates that the interannual signal is more-or-less coherent between regions. Figure 13 suggests coherence between the radiosondes and reanalyses. These results indicate that the variability observed in the radiosonde calculation of TTD is real.
One major driver of interannual variability in the global climate system is ENSO. Oort and Yienger demonstrated its effect on the Hadley Cell, showing that during the ENSO warm phase (i.e., El Niño) the meridional extent of the circulation contracts. The state of ENSO can be represented by the Multivariate ENSO Index (MEI) [Wolter and Timlin, 1993], in this case annually averaged over the same period as the radiosonde data (i.e., June to May).
Table 5 shows correlations of the annual MEI and the positions of the individual contours for each region, as well as for the global average. All variables are de-trended to emphasize the interannual relationship. Globally, significant correlations (at the 90% level) are found for all contours. This relationship is stronger on the 100- and 50-day contours, consistent with studies [e.g., Cook, 2001; Vera et al., 2004] identifying a wave response to ENSO in the SH that attains its maximum amplitude in the midlatitudes. The relationships also vary with region. The strongest relationship is found in ANZ, where a significant correlation is found on all contours, and is largest on the 100- and 50-day contours. The SA region is similar, although no significant correlation was identified on the 300- and 200-day contours. The correlations are weaker in AFR, where a significant relation is only identified on the 300- and 200-day contours.
Table 5. Linear Correlations of Regional and ‘Global’ Contour Positions With Annually Averaged MEI
All variables are de-trended. The 90% significance level is r = 0.30 for 32 points.
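The quoted significance threshold of r = 0.30 for 32 points can be reproduced with a short calculation. The sketch assumes a two-sided Student's t test on the correlation coefficient with n − 2 degrees of freedom, which is a common convention but is not stated in the text; the t value of 1.697 (two-sided 90%, 30 degrees of freedom) is taken from standard tables, and `critical_r` is a hypothetical helper.

```python
import math


def critical_r(n_points, t_crit):
    """Smallest |r| that is significant, given the critical t value for n - 2 dof."""
    dof = n_points - 2
    return t_crit / math.sqrt(dof + t_crit ** 2)


# Tabulated critical value (assumption: two-sided test at the 90% level, 30 dof).
T_90_TWO_SIDED_30DOF = 1.697

r_crit = critical_r(32, T_90_TWO_SIDED_30DOF)
print(round(r_crit, 3))
```

Under these assumptions the threshold comes out just under 0.30, consistent with the value given in the table note. The formula follows from inverting t = r·sqrt(n − 2)/sqrt(1 − r²) for r.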
 Large volcanic eruptions are another source of interannual variability. These eruptions can increase stratospheric aerosols, which warm the lower stratosphere [e.g., Randel, 2010] and depress the tropopause height [Santer et al., 2003], and result in a contraction of the tropics by this measure [e.g., Lu et al., 2009]. During the period of this study, the globally significant eruptions of El Chichón in 1982 and Mt. Pinatubo in 1991 occurred. There is a clear signal of both eruptions in the data and the reanalyses – an equatorward shift in contour lines during the early 1980s and again in the 1990s. These volcanic effects especially stand out in the SA analyses. Based on the position change of the contours, we estimate that the magnitude of the contraction is on the order of 1° latitude.
6. Concluding Remarks
This study uses historical radiosonde data derived from the IGRA to investigate the variation of the SH tropical edge using the tropopause height frequency methodology described in Seidel and Randel. An annual record of the number of tropical tropopause days (TTD) is created from 1979 to 2010, where a fixed ZT threshold of 14.5 km is chosen to delineate a tropical tropopause day. While the quality of the observational data is potentially problematic, an extensive effort including bias adjustment, a first difference compositing technique and uncertainty estimation has been made to address these shortcomings. Across the three continental regions analyzed, the radiosonde-based results demonstrate a high degree of similarity in their mean structure and a general consistency in their temporal evolution. Most TTD contours show an expansionary trend overlain with interannual variability, although in many cases the trend is only marginally statistically significant, partly due to the consideration of measurement uncertainty in the calculation of the confidence intervals. Based on the estimated uncertainties and regional intercomparison, we consider the 200 days/year TTD contour as the most robust across all the regions in the radiosonde analysis. Averaging this contour across all three regions, we estimate a hemispheric expansion of the tropics from 1979 to 2010 of −0.41 ± 0.37 deg dec−1, significant at the 90% level. This trend is not uniform across all three regions.
 Data from four reanalysis products are analyzed using the same methodology and compared to the observations. Overall, the behavior of the tropical edge as pictured by reanalyses and radiosondes is broadly consistent, but the details are different. The position of contours is shifted poleward in the reanalyses, and the computed trends show a wider range of values. One fundamental difference lies in the determination of the tropopause level. In the radiosondes, the tropopause is generally reported as a significant level, allowing for a precise determination of its height. In the reanalyses, ZT (or PT) is estimated using interpolated data from between comparatively coarse model levels. Subtle features in the subtropical atmosphere are often not correctly identified in these products [e.g., Reichler et al., 2003], resulting in errors in the representation of the tropopause.
Furthermore, differences in model characteristics and the assimilation techniques are likely important. Bengtsson et al. [2004b] show that where radiosonde data are available the details of weather systems are better resolved by a reanalysis product. In the SH, the reanalyses are largely based on satellite-derived data, with only ANZ and a small part of AFR influenced by the radiosonde data. In this study, the reanalysis results are most consistent (both with each other and with the radiosonde data) where the radiosonde data are more complete. In the data-rich ANZ region, reanalyses and radiosondes show similar contour locations and temporal variability. In the data-sparse regions of SA and AFR, the agreement is poorer, and the various reanalysis responses more likely reflect differences in internal model dynamics or assimilation techniques rather than actual meteorological changes.
The ERA-I is an interesting case. It is a newer-generation reanalysis, with better horizontal and vertical resolution and state-of-the-art physical parameterizations and assimilation schemes. Overall, the observed shift of the contours is less and the interannual variability is well resolved, as indicated by Figure 11 and the measures presented in section 4.4. Careful examination of Figure 11 shows that the amount of the shift in the ERA-I has been reducing since the early 2000s, suggesting that its representation of the atmosphere is becoming more realistic. However, the calculated tropical expansion trends differ considerably from those of the radiosonde-based data and the other reanalyses (Tables 1 and 2).
While we cannot unequivocally eliminate subtle differences associated with the methodology and the choice of tropical ZT threshold from consideration at this time, we speculate that one or more of the various reanalyses, the ERA-I in particular, may contain inhomogeneities which are the source of this disparity. This is illustrated in Figure 13 and discussed in more detail in section 5. At times, the reanalyses and radiosondes show good agreement in their relative behavior; at other times their behavior diverges. The most notable changes occur after the year 2000. These differences in performance and their variability in time cannot simply be attributed to differences in model resolution, which remains fixed in time. Given the strong influence of satellite data on the SH reanalyses, we can speculate that changes to the satellite observing system may have introduced these artifacts into the data. Rienecker et al. indicate a dramatic enlargement of the observation network in the early 21st century with the inclusion of the NOAA polar-orbiting satellite-based Advanced TIROS Operational Vertical Sounder (ATOVS) and the Atmospheric Infrared Sounder (AIRS) on the NASA Aqua satellite. Ironically, the (potential) improvement in performance of the ERA-I with the introduction of these satellite systems may have produced an inhomogeneity that reduces the reliability of the long-term climate trend in that product. The first-generation NCEP and NCEP2 reanalyses may see these changes differently as a result of different assimilation methods (e.g., direct assimilation of satellite radiance against assimilation of retrieved products) or some other factor. This hypothesis does not address the differences in the earlier part of the record, when satellite data were less prevalent. The possibility of inhomogeneities in the ERA-I and other reanalyses is not further considered here, as a full evaluation of this hypothesis is beyond the scope of the current paper. We offer this as a potential explanation of the observed discrepancies in reanalysis performance and as a direction for future research.
The findings presented in this study provide a unique view of the temporal evolution of the tropical width, independent of products derived from the reanalyses. The possibility of inhomogeneities in reanalysis products suggests that their use for the computation of climate trends should be treated with caution. Taken at face value, our results are broadly consistent with previous studies' findings that there is an expansion of the tropics in the Southern Hemisphere. Our hemispheric results are significant at the 90% level and show regional differences which may or may not be related to the underlying data issues. Our rates of expansion are generally consistent with other studies using this methodology, but considerably smaller than some derived using different approaches [e.g., Hu and Fu, 2007; Johanson and Fu, 2009], a reflection of the different physics represented by different diagnostic measures [Davis and Rosenlof, 2012]. Our results suggest the possibility of inhomogeneities within the reanalysis data. To fully understand the differences between metrics, as well as the forcings and underlying physics of tropical expansion, the source of these inconsistencies in the reanalysis data must be identified and understood. This knowledge is essential for understanding the impacts that tropical expansion may have now and in the future.
 We thank Scott Power, Robert Fawcett, and three anonymous reviewers for their helpful suggestions. This research was funded by the South Eastern Australian Climate Initiative (SEACI): Phase 2.