The quasi-biennial oscillation (QBO) is important for understanding the dynamical and chemical variability of the global stratosphere. Currently available wind data from the equatorial stratosphere extend back to 1953. Here we present reconstructions of the QBO extending back to 1900 that can be used to constrain climate model simulations. The reconstructions are based on historical pilot balloon data as well as hourly sea-level pressure (SLP) data from Jakarta, Indonesia. The latter were used to extract the signal of the solar semi-diurnal tide in the middle atmosphere, which is modulated by the QBO. The reconstructions are in good agreement with the QBO signal extracted from historical total ozone data extending back to 1924. Further analyses suggest that the maximum phases of the QBO are captured relatively well after about 1910.
 The quasi-biennial oscillation (QBO) is an essential component of stratospheric variability. It affects circulation and chemistry of the stratosphere in both tropics and extratropics, as well as the underlying troposphere [Baldwin et al., 2001]. It is therefore important to have a good QBO representation when attempting to model stratospheric variability. As many models do not produce a QBO, they are often nudged to the observations [e.g., Giorgetta, 1996] in order to reproduce QBO-related variability or to synchronize modeled and observed QBO. Currently available wind data in the equatorial stratosphere extend back to 1953. Only indirect or extremely sparse information is available prior to this and is insufficient for directly constraining simulations of the entire 20th century. Here we present reconstructions of the QBO (i.e., zonal mean zonal wind in the equatorial stratosphere) back to 1900. The purpose was to obtain a QBO that is physically realistic with respect to amplitude, phase propagation, and periodicity, while maintaining consistency with all historical QBO information.
 Three types of historical QBO information were used. The first is direct: wind measurements in the tropical stratosphere made with pilot balloons (free-flying balloons tracked from the ground). The second and third types are indirect: the amplitude of the solar semi-diurnal tide in the middle atmosphere (SSDT, which is modulated by the QBO and whose imprint is found in SLP data) and historical total ozone data.
 Stratospheric wind data are available from ERA-40 [Uppala et al., 2005], 1957–2002. The Free University of Berlin (FUB) provides radiosonde wind data between 70 and 10 hPa, some levels reaching back to 1953 [Labitzke et al., 2002]. Some earlier pilot balloon wind data from the tropical stratosphere are also available (TD52/TD53, provided by Roy Jenne, NCAR). We have previously used these data to estimate the phase of the 50 hPa QBO back to 1942 [Labitzke et al., 2006], but with considerable uncertainty.
 Only sparse pilot balloon data from the tropical stratosphere are available prior to the 1940s. The most notable example is a series of observations performed at Jakarta during the 1910s [Ebdon, 1963]. Hamilton has compiled most of the available pre-1945 stratospheric wind data; however, they are insufficient for obtaining a QBO reconstruction. Of these observations, only the wind direction is used here, not the wind speed, which is more uncertain.
 In order to extract the SSDT signal, hourly SLP data from Batavia (Jakarta), 1866–1945 were used. We also used the Extended Reconstructed Sea Surface Temperature data, Vers. 2 [Smith and Reynolds, 2004] for calculating the Niño3.4 index ([5°S-5°N, 170°W-120°W] average). Total ozone data from Oxford, UK, 1924–1975 [Vogler et al., 2007], Arosa, Switzerland, 1926–2006 [Staehelin et al., 1998], Tromsø, Norway, 1935–2006 [Hansen and Svenøe, 2005], Shanghai, China, 1932–1942 [Brönnimann et al., 2003], and Tateno, Japan, 1957–2005, were also used. They were supplemented using TOMS V8 (Total Ozone Mapping Spectrometer) satellite data.
3.1. General Approach
 The concepts normally used for climate reconstructions are mostly based on covariances between variables, assuming stationary relationships over time. These fail in the case of the QBO. For instance, zonal winds at 20 and 50 hPa are uncorrelated, even though they are highly coherent. Also, in case of low covariance, conventional reconstructions tend towards a “no knowledge prediction” such as climatology or a mean annual cycle, which in the case of the QBO would be physically implausible. However, the QBO can be very well characterized in the time domain: In the observational record since 1953 it is cyclic (though the period varies slightly) and shows characteristic time evolution (downward propagation of phases) and amplitudes. Changing from the “variable domain” to the “time domain”, these characteristics can be used to constrain a reconstruction (formulate stationarity assumption and “no knowledge prediction”). Note that stationarity remains an assumption; the QBO characteristics could change in a changed climate.
 The general approach (detailed in auxiliary material) is as follows: we start with a perpetually repeating idealized QBO cycle as “no knowledge prediction” (termed raw QBO hereafter). Rather than reconstructing zonal wind, we use all available historical QBO information to reconstruct a time axis onto which the raw QBO is interpolated. In other words, the raw QBO is phase shifted and stretched in time using linear interpolation to match the true QBO signal.
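As an illustration of this time-axis approach, the following minimal Python sketch warps a perpetually repeating cycle onto a piecewise-linear time axis. The function name `warp_cycle`, the sinusoidal stand-in for the raw QBO, and the anchor dates are hypothetical and are not taken from the reconstruction itself.

```python
import numpy as np

def warp_cycle(anchor_times, anchor_phases, target_times, raw_cycle):
    """Interpolate a repeating idealized cycle onto a reconstructed time axis.

    anchor_times  : increasing calendar times (months) where the phase is known
    anchor_phases : cumulative phase at those times, in raw-cycle months
    target_times  : calendar times (months) at which output is wanted
    raw_cycle     : 1-D array holding one idealized cycle, sampled monthly
    """
    period = len(raw_cycle)
    # Piecewise-linear map from calendar time to position in the raw cycle.
    phase = np.interp(target_times, anchor_times, anchor_phases)
    # Close the cycle so that fractional positions near the end wrap around.
    grid = np.arange(period + 1)
    wrapped = np.append(raw_cycle, raw_cycle[0])
    return np.interp(phase % period, grid, wrapped)

# Hypothetical 28-month sinusoid standing in for one level of the raw QBO.
raw = 20.0 * np.sin(2 * np.pi * np.arange(28) / 28)
# Two cycles: the first is compressed to 24 months, the second stretched to 30.
anchors_t = np.array([0.0, 24.0, 54.0])
anchors_ph = np.array([0.0, 28.0, 56.0])
months = np.arange(0, 55)
wind = warp_cycle(anchors_t, anchors_ph, months, raw)
```

In this sketch the wind values are never reconstructed directly; only the mapping from calendar time to cycle phase changes, so the amplitude and shape of the cycle are preserved by construction.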
3.2. “No Knowledge Prediction”
 The idealized QBO cycle is determined from a compositing approach adapted from Giorgetta et al. From the ERA-40 QBO, the mean annual cycle was subtracted, but the annual mean was added back. This is necessary because the stratospheric zonal winds show an annual cycle in the middle stratosphere and a semi-annual cycle in the upper stratosphere. The monthly wind profiles were divided into E and W phases according to the wind at 20 hPa. The individual half-cycles were stretched to match the average duration of E and W phases, respectively (Figure S2). The maximum amount of stretching is used as a stationarity constraint in the reconstructions. The other characteristics of the time evolution (peak amplitudes, downward propagation) are stationary by construction.
 The averaged stretched cycles for the levels 7 to 70 hPa constitute the idealized cycle (Figure S2). It compares well with a non-linear principal component analysis approach [Hsieh, 2004].
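The stretching and averaging of half-cycles can be sketched as follows. This is a simplified single-level illustration under stated assumptions: the function name `stretch_and_average`, the use of linear interpolation for the stretching, and the half-sine test segments are hypothetical choices, not the procedure documented in the auxiliary material.

```python
import numpy as np

def stretch_and_average(segments, n_out):
    """Resample half-cycles of different length to n_out points and average them."""
    grid = np.linspace(0.0, 1.0, n_out)
    stretched = [
        # Map each segment onto a normalized [0, 1] time axis, then resample.
        np.interp(grid, np.linspace(0.0, 1.0, len(seg)), seg)
        for seg in segments
    ]
    return np.mean(stretched, axis=0)

# Hypothetical westerly half-cycles lasting 11, 14, and 17 months.
segments = [np.sin(np.pi * np.linspace(0.0, 1.0, n)) for n in (11, 14, 17)]
composite = stretch_and_average(segments, n_out=14)
```

The ratio between each segment length and `n_out` gives the amount of stretching applied to that half-cycle, which is the quantity used above as a stationarity constraint.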
3.3. Defining the Time Axis
3.3.1. Extracting the QBO Signal From Hourly SLP Data
Hamilton analyzed whether the amplitude of the SSDT can be used to infer QBO information back in time. The SSDT is related to ozone heating, influenced by the amount of ozone present, which is affected by the QBO [Baldwin et al., 2001]. It causes a small semi-diurnal surface pressure oscillation with peaks near 9:00 local time [Hamilton, 1983], whose amplitude is termed S2(p) hereafter. Hamilton, Teitelbaum et al., and others have shown that a QBO signal can indeed be found when extracting S2(p) from 9:00 am SLP data from Batavia. We follow a similar approach, but use additional pressure readings in order to reduce random errors:
where the subscript refers to the local hour. The long-term mean annual cycle was determined based on daily data, smoothed with a 5-day moving average, and removed before forming monthly averages. A visual inspection showed obvious inhomogeneous periods (Jan. 1880 to Mar. 1880, Feb. 1912 to Jul. 1916) which were corrected by adding a constant offset (+18 and +9.1 Pa, respectively). Moreover, the period 1865–1875 was excluded because of an obvious drift. The corrected series was detrended and spectral analysis was performed. The resulting spectrum shows a clear QBO signal (Figure 1a). However, other processes such as ENSO (El Niño/Southern Oscillation) also have power in this spectral region (Niño3.4 in Figure 1a). Since ENSO affects ozone in the tropical stratosphere, it could also affect S2(p) and interfere with its QBO signal. In fact, S2(p) is significantly correlated with Niño3.4 in boreal winter. We therefore subtracted the linear influence of ENSO using a linear regression model for each calendar month (no evidence for a lag was found). The resulting spectrum (Figure 1a) still shows the characteristic QBO peak near 26–28 months. There are secondary peaks near 22 and 32 months [see Tung and Yang, 1994; Baldwin et al., 2001].
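The removal of the linear ENSO influence for each calendar month can be sketched as follows. This is a minimal illustration assuming ordinary least squares with an intercept; the function name `remove_linear_influence` and the synthetic input series are hypothetical.

```python
import numpy as np

def remove_linear_influence(series, predictor, months):
    """Subtract the fitted slope times the predictor, separately per calendar month."""
    series = np.asarray(series, dtype=float)
    predictor = np.asarray(predictor, dtype=float)
    months = np.asarray(months)
    residual = series.copy()
    for m in range(1, 13):
        idx = months == m
        if idx.sum() < 2:
            continue  # not enough data to fit a regression for this month
        x = predictor[idx]
        design = np.column_stack([x, np.ones_like(x)])
        coef, *_ = np.linalg.lstsq(design, series[idx], rcond=None)
        # Remove only the part that varies linearly with the predictor.
        residual[idx] = series[idx] - coef[0] * x
    return residual

# Synthetic example: 40 years with a month-dependent sensitivity to Nino3.4.
rng = np.random.default_rng(0)
months = np.tile(np.arange(1, 13), 40)
nino34 = rng.standard_normal(months.size)
slopes = np.linspace(-2.0, 2.0, 12)
s2p = slopes[months - 1] * nino34 + 1.0
s2p_clean = remove_linear_influence(s2p, nino34, months)
```

Fitting each calendar month separately allows the sensitivity to ENSO to vary through the year, which matters here because the correlation with Niño3.4 is significant mainly in boreal winter.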
 In order to extract the QBO signal from S2(p), we used a band-pass filter with half-power points at 22 and 32 months (Figure 1b), in agreement with the stationarity assumption. The timing of peaks in the filtered series (termed S2(p)QBO, Figure 1c) gives information on the QBO. The amplitude of S2(p)QBO reflects the reliability of the QBO signal (assuming quasi-stationarity of the QBO period) rather than its amplitude. If it is larger than ∼1 Pa (which in random permutation experiments occurs in 6 cycles per 70 yrs), peaks can be identified unambiguously and their spacing is within the range of observed QBO periods. This is the case after around 1913. Before then, the spacing of peaks often violates the stationarity assumption. Therefore the unstretched raw QBO was perpetually repeated backward from 1913.
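A band-pass filter with half-power points at 22 and 32 months can be sketched as below. The filter design is not specified in the text, so this sketch assumes a Butterworth-shaped gain applied in the frequency domain (zero phase shift); the function names, the filter order, and the synthetic test series are hypothetical.

```python
import numpy as np

def bandpass_gain(freqs, period_short=22.0, period_long=32.0, order=3):
    """Butterworth-shaped amplitude gain with half power at the two periods."""
    f_lo, f_hi = 1.0 / period_long, 1.0 / period_short
    freqs = np.asarray(freqs, dtype=float)
    gain = np.zeros_like(freqs)
    nz = freqs > 0  # the mean (zero frequency) is removed entirely
    f = freqs[nz]
    ratio = (f**2 - f_lo * f_hi) / (f * (f_hi - f_lo))
    gain[nz] = np.sqrt(1.0 / (1.0 + ratio ** (2 * order)))
    return gain

def bandpass_filter(x, dt=1.0, **kwargs):
    """Filter a regularly sampled series (dt in months) without phase shift."""
    x = np.asarray(x, dtype=float)
    freqs = np.fft.rfftfreq(x.size, d=dt)
    spectrum = np.fft.rfft(x) * bandpass_gain(freqs, **kwargs)
    return np.fft.irfft(spectrum, n=x.size)

# Synthetic 70-year monthly series: 28-month, annual, and decadal components.
t = np.arange(840)
qbo = np.sin(2 * np.pi * t / 28)
series = qbo + np.sin(2 * np.pi * t / 12) + np.sin(2 * np.pi * t / 120)
filtered = bandpass_filter(series)
```

A zero-phase filter is used in this sketch because the timing of the maxima and minima in the filtered series is what carries the QBO information.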
3.3.2. Calibrating the Phase
 Next we attribute the QBO signal in S2(p)QBO to a specific pressure level in the wind QBO based on historical pilot balloon information. For this purpose, the raw QBO was filtered in the same way as S2(p). For each level, the dates of maxima/minima in S2(p)QBO were mapped onto the maxima/minima of the filtered raw QBO, thus defining a time axis (linearly interpolated between subsequent maxima and minima) that would be obtained if S2(p)QBO represents zonal wind at that level. The (unfiltered) raw QBO was then interpolated onto this new time axis as well as a finer altitude resolution (for better comparison with observations). Then the seasonal anomaly cycle was added. This reconstruction was compared to the QBO phase information derived from historical pilot balloon wind data (Figure 2), expressed on the same altitude grid.
 A skill score R was defined as the number of matching phases $n_m$ minus the number of non-matching phases $n_{nm}$, divided by their sum:
$$R = \frac{n_m - n_{nm}}{n_m + n_{nm}}$$
R is between −1 and 1, with an expectation value of 0. For the calibration, we attributed weights of 0.25 to the data from Ebdon [1963] from 1908–1910 and 1914–1918, 0.5 to the chronology of easterly phases by Schove, and 1 to all other observations. Also, we excluded winds below 3 m/s (i.e., phase changes) in the reconstructions. In the first round, observations prior to 1913 were discarded. R was maximized when assuming that S2(p) represents 30 hPa wind. This is around the region of the maximum ozone partial pressure and is consistent with results given by Hamilton and Teitelbaum et al. This reconstruction was therefore chosen as a basis.
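The skill score can be computed as in the following sketch. It assumes that the observation weights enter as weighted counts of matching and non-matching phases, which is one plausible reading of the text; the function name `skill_score` and the sample values are hypothetical.

```python
import numpy as np

def skill_score(recon_wind, obs_sign, weights=None, min_speed=3.0):
    """Weighted phase agreement R = (n_m - n_nm) / (n_m + n_nm).

    recon_wind : reconstructed zonal wind (m/s) at the observation times/levels
    obs_sign   : +1 for an observed westerly, -1 for an observed easterly
    weights    : weight per observation (defaults to 1)
    min_speed  : reconstructed winds weaker than this are excluded
    """
    recon_wind = np.asarray(recon_wind, dtype=float)
    obs_sign = np.asarray(obs_sign, dtype=float)
    if weights is None:
        w = np.ones_like(recon_wind)
    else:
        w = np.asarray(weights, dtype=float)
    use = np.abs(recon_wind) >= min_speed
    match = np.sign(recon_wind[use]) == obs_sign[use]
    n_m = w[use][match].sum()
    n_nm = w[use][~match].sum()
    total = n_m + n_nm
    return np.nan if total == 0 else (n_m - n_nm) / total

# Hypothetical comparison of five observations with the reconstruction.
recon = [12.0, -8.0, 1.5, -20.0, 15.0]
obs = [1, -1, -1, 1, 1]
weights = [1.0, 0.25, 1.0, 0.5, 1.0]
R = skill_score(recon, obs, weights)
```

In this example the third observation is excluded because the reconstructed wind is below 3 m/s, leaving a weighted match of 2.25 against a weighted mismatch of 0.5.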
 There were still cases where reconstructions and observations did not agree. Therefore, in a second step, the time axis was adjusted. This was achieved by slightly shifting the maxima and minima (while obeying the stationarity constraint). Pre-1913 observations were also used now, as well as observations after the end of the reconstruction period. R for the resulting reconstruction is 0.67 and reproduces most observations (note that observations are contradictory at times, limiting the maximum possible R).
3.4. Reconstructions From 1944 to 1957
 Since S2(p)QBO ends in fall 1945, QBO reconstructions for 1945–1953 require direct observations. We used the same approach (mapping of maxima and minima) as described above, but replaced S2(p)QBO with the 50 hPa QBO from Labitzke et al. [2006]. From June 1953 on, direct QBO observations available from FUB are used, but not for all levels. The missing levels were supplemented in the same way as described above, replacing S2(p)QBO with filtered 30 hPa winds from FUB (termed RECFUB).
4. Results and Validation
 The reconstructions, merged with FUB and ERA-40 data, are shown in Figure 2 together with wind directions from historical observations prior to 1945. The figure demonstrates that the resulting product is realistic in terms of amplitude, phase propagation, and period. However, the inter-cycle variability is clearly smaller in the early period. In order to assess the uncertainty of the reconstructions, validation experiments were performed.
4.2. Validation in ERA-40 Period
 It was not possible to assess the reconstructions by performing the same procedure in the ERA-40 period (and comparing with ERA-40 winds) as no sufficiently long, high-resolution and high-quality equatorial SLP series could be found. S2(p) calculated from possible candidates did not exhibit a clear spectral QBO peak and data from sites further away (5°) from the equator may not be comparable.
 It is instructive, however, to analyze the skill of RECFUB in the ERA-40 period in order to assess the effect of inter-cycle variability (Figure 3). Adding back the annual cycle clearly improves the reconstructions (grey curve compared to thin black curve in Figure 3a), though not all of the inter-cycle variability in ERA-40 (thick black line) is captured. R is between 0.7 and 0.8 depending on the pressure level (Figure 3b). To simulate the uncertainty in the timing of peaks in S2(p)QBO (assuming that the number of cycles is correct), we perturbed the position of peaks by a systematic plus a random component of the same magnitude (bias = standard deviation = P). If the stationarity assumption was violated, the random component was discarded. Up to P = 4 months, R remains above 0.5. For low P and levels below 30 hPa, R even reaches values of 0.9 when excluding wind speeds below 3 m/s in the reconstructions (Figure 3c).
4.3. Validation Using Historical Total Ozone
 Further validations were performed using historical total ozone data. First, the 1979–2000 mean annual cycle was subtracted and the data were detrended (using separate trends before and after 1978). Because extratropical total ozone is affected by ENSO [Brönnimann et al., 2004], the Niño3.4 signal was removed using a linear regression for each calendar month with a 6-month lag (which maximizes the explained variance). Gaps of up to 6 months were linearly interpolated (most gaps were much shorter) in order to avoid filtering artifacts. By averaging the three European series, a close to complete series from 1924 to present can be obtained (termed EUR). Similarly, the two Asian series were averaged (ASI). Spectral analysis (Figure 1b) reveals a clear QBO signature in EUR (less pronounced in ASI), with side peaks similar to those found by Tung and Yang [1994].
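The gap-filling step can be sketched as follows. This minimal example assumes that missing months are marked as NaN and that gaps at the start or end of a series are left untouched; the function name `fill_short_gaps` and the sample values are hypothetical.

```python
import numpy as np

def fill_short_gaps(values, max_gap=6):
    """Linearly interpolate interior runs of NaN no longer than max_gap samples."""
    x = np.asarray(values, dtype=float).copy()
    missing = np.isnan(x)
    if missing.all() or not missing.any():
        return x
    idx = np.arange(x.size)
    interpolated = np.interp(idx, idx[~missing], x[~missing])
    start = 0
    while start < x.size:
        if not missing[start]:
            start += 1
            continue
        # Find the end of the current run of missing values.
        stop = start
        while stop < x.size and missing[stop]:
            stop += 1
        interior = start > 0 and stop < x.size
        if interior and (stop - start) <= max_gap:
            x[start:stop] = interpolated[start:stop]
        start = stop
    return x

# Hypothetical monthly anomalies with a 2-month gap, a 7-month gap,
# and a missing final value.
ozone = np.array([1.0, np.nan, np.nan, 4.0] + [np.nan] * 7 + [12.0, np.nan])
filled = fill_short_gaps(ozone)
```

Only the 2-month gap is filled here; the 7-month gap exceeds the limit and the trailing value has no later neighbor to interpolate toward.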
 The filtered total ozone series were compared with the filtered ERA-40 QBO. The strongest (negative) correlation with EUR was found for 30 hPa winds, with ASI for 20 hPa winds. The corresponding reconstructed (filtered) winds and total ozone agree very well in the historical period (Figure 4). Agreement is worse in the early 1950s, but here reliable wind observations are available. Note also that there is some uncertainty during 1940–1942 (ASI ends, EUR shows a very long cycle, and direct observations are sparse), which could be a remaining effect of the prolonged 1939–1942 El Niño [Brönnimann et al., 2004].
 This validation can be used to estimate P in the reconstruction period. For EUR, 1924–1929 and 1938–1949 (when the QBO amplitude in total ozone is sufficiently large), we find an average offset and standard deviation of 2 and 2.4 months, respectively (Figure 4). For ASI (1932–1942), the corresponding values are 0.6 and 3.2 months (in the ERA-40 period they are ∼1 month). Based on these results, P ≈ 3 months is a conservative estimation for the historical period.
 We also compared the QBO reconstructions (RECS2p) with alternative QBO reconstructions derived from the maxima/minima in the filtered ozone data (RECTOZ). In terms of R (Figures 3d and 3e), the two reconstructions agree relatively well. If wind speeds below 3 m/s are excluded in RECS2p, R is between 0.6 and 0.8. However, the inter-cycle variability is not fully considered as both reconstructions are based on the same idealized cycle.
 We have reconstructed the QBO back to 1900 and have performed a number of validation experiments. Following is a summarized assessment (see also Figure 3b). Prior to around 1910, reconstructions are not constrained by any historical information. Afterwards, the first direct observations become available and constrain P to around 2 months for a short period. Between 1914 and 1924 observational evidence is extremely sparse and the amplitude of S2(p)QBO is small. The QBO signal becomes clearly apparent ten years later. It is likely that the number of cycles in this intermediate period is four. In this case, according to our stationarity assumption, P can hardly be larger than 4 months.
 Between 1924 and 1944, the amplitudes of S2(p)QBO and total ozone are large and in good mutual agreement, with P ≈ 3 months. It is unlikely that the number of cycles is wrong. After around 1945, the QBO can be constrained with direct observations, limiting P to around 2 months. From 1953 to 1957 (for levels not contained in the FUB QBO) we can assume that P ≈ 0.
 The final reconstruction shows all characteristic features of the QBO (periodicity, amplitude, downward propagation) and co-varies with most of the historical observations. Moreover, it reproduces the mean annual cycle of ERA-40 winds. Uncertainties remain because the QBO signal in both S2(p) and total ozone is weak and historical pilot balloon observations may be wrong. The inter-cycle variability can only partly be reproduced. Finally, as for all reconstructions, the stationarity assumption might be wrong. For instance, Teitelbaum et al. suggested a longer QBO period prior to 1905. If real, such changes would strongly affect our quality assessment.
 SB, JA, and CV were funded by the Swiss National Science Foundation. PDJ has been supported by the U.S. Department of Energy (grant DE-FG02-98ER62601). We thank NCAR (TD52/53), NASA (TOMS data), ECMWF (ERA-40), the Free University of Berlin (QBO), and WOUDC (total ozone) for providing data.