We have reevaluated the original total ozone measurements made in Oxford between 1924 and 1957, with a view to extending backward in time the existing total ozone series from 1957 to 1975. The Oxford measurements are the oldest Dobson observations in the world. Their prime importance, when coupled with the series from Arosa (since 1926) and Tromsø (since 1935), is for increasing basic understanding of stratospheric ozone and dynamics, while in relation to studies of the recent ozone depletion they constitute a baseline of considerable (and unique) significance and value. However, the reevaluation was made difficult on account of changes to the instruments and wavelengths as the early data collection methods evolved, while unknowns due to the influence of aerosols and the possible presence of dioxides of sulphur and nitrogen created additional problems. Our reevaluation was based on statistical procedures (comparisons with meteorological upper air data and ozone series from Arosa) and also on corrections suggested by Dobson himself. The comparisons demonstrate that the data are internally consistent and of good quality. Nevertheless, as post-1957 data were not assessed in this study, the series cannot be recommended at present for trend analysis, though the series can be used for climatological studies. By supplementing the Oxford data with other existing series, we present a European total ozone climatology for 1924–1939, 1950–1965, and 1988–2000 and analyze the data with respect to variables measuring the strength and the temperature of the polar vortex.
 Long-term total ozone series can provide valuable information on the variability of the ozone layer and the connection to other geophysical variables. For the pre-CFC period in particular, when the ozone layer was barely affected by anthropogenic emissions [see, e.g., Solomon, 1999; Staehelin et al., 2001], they provide a background against which the current variability can be compared and allow us to address interannual variability such as that related to El Niño events [Brönnimann et al., 2004], the North Atlantic Oscillation [Appenzeller et al., 2000], and other climate indices [Steinbrecht et al., 2001]. In addition, they are an important source for the validation of numerical chemistry-climate models, both of the chemically undisturbed pre-CFC era and of current conditions. The further back an ozone series reaches, the more thoroughly the numerical models can be checked for natural fluctuations and hence the more reliable their results in predicting the future state of the ozone layer (assuming a decreasing influence of anthropogenic halogens on stratospheric ozone).
 The first systematic total ozone observations started in Oxford in 1924, where G. M. B. Dobson constructed his spectrophotograph and pioneered ozone research [Dobson and Harrison, 1926]. The Dobson Spectrophotometer, developed a few years later, is still the standard instrument for the total ozone network run under the auspices of the WMO. However, continuous reevaluated ozone series going back before the International Geophysical Year (IGY, 1957–1958) are very sparse and are restricted to Arosa (since 1926 [Staehelin et al., 1998a, 1998b]), Tromsø (since 1935 [Hansen and Svenøe, 2005]) and Svalbard (since 1950 [Vogler et al., 2006]). Total ozone measurements from Oxford were formerly only available back to 1957 (except some data reported by Goldsmith et al. ). We have now reevaluated the 1924–1957 data based on the original observing records. It was Dobson himself who performed and continuously developed ozone observations at Oxford [see Dobson, 1968; Houghton and Walshaw, 1977; Walshaw, 1989].
 The data set can be divided into different periods, which are characterized by the use of different instruments and wavelengths (see Figure 1). Dobson used three different instruments. The early work on atmospheric ozone (1924–1928) started at Robinwood, Boar's Hill, about 6 km from the center of Oxford. In autumn 1924 the first solar spectra were obtained with a Féry spectrograph in Oxford and in 1925 the first regular ozone measurements were started. The spectral regions included the Huggins ozone bands. In general two long wavelengths and two short wavelengths (but not yet the relative intensity of the pairs) were measured for the differential determination of ozone column densities.
 The Féry spectrograph had two disadvantages: the processing time of the ozone values was too long, and measurements were limited to high Sun and therefore were not possible at higher latitudes [Dobson, 1968]. In order to overcome these limitations, a new instrument, the photoelectric Dobson Spectrophotometer, was developed in the late 1920s. This instrument additionally allowed measurements during cloudy conditions. According to Walshaw , Spectrophotometer 1 was constructed during 1928 and Spectrophotometer 2 was produced by R&J Beck Ltd in 1931/32. In optical design instrument 2 was identical to instrument 1, but mechanically it was different, owing to the production methods of R&J Beck Ltd. The first information on the measurements with one of the new instruments dates back to 1930, when a few measurements were made, probably with instrument 1 [Dobson, 1931]. Measurements which are registered on our forms start in October 1933 with instrument 2. Apparently, the investigations of stratospheric ozone and atmospheric circulation, entailing measurements worldwide in these years, left little time for routine measurements at Oxford. Instrument 1 was evidently used for Umkehr observations at Arosa in 1932/1933 [Götz et al., 1934]. Until autumn 1935, observations in Oxford were performed at Boar's Hill, but later observations were performed at Watch Hill (The Ridings, Shotover) on the opposite side of Oxford.
 In August 1938 measurements with instrument 2 stopped, and 1 year later regular measurements restarted with instrument 1. During World War II the frequency of the observations was increased tremendously, in connection with investigations of the humidity of the stratosphere. This work (in collaboration with the Meteorological Office and A. W. Brewer) was in the service of the Royal Air Force [Dobson et al., 1946; Dobson, 1968]. Unfortunately, the data from October 1942 till August 1943 are missing. The regularity of observing before and after the gap suggests that the data corresponding to the gap itself have been lost, though it is also possible that, for reasons undisclosed, observing for that period actually ceased. Between September 1944 and the autumn of 1949 regular observations stopped; there were very few observations in 1946, while some test measurements with another instrument (see section 2.3) were carried out in 1948.
 After World War II most of the time was taken up with adjusting and calibrating newly produced instruments. At the same time photomultipliers became commercially available. The instruments accordingly became much more sensitive and it was possible to use shorter wavelengths (A and B pair, see section 2) with larger ozone absorption coefficients. On the other hand it was desirable to have a longer-wavelength pair too (D pair). However, there seems to be a loss of data from our records at this point. In his “ozone history” Dobson states “The first instrument fitted with a photomultiplier had been sufficiently well adjusted and calibrated for daily routine measurements of the ozone to be restarted in November 1947 and they have been continued ever since then” [Dobson, 1968], whereas the data presently available to us do not recommence until autumn 1949. It would therefore seem that our archive does not contain all the observations which were made at Oxford. Only from 1951 onward were regular observations performed again. In 1949 Sir Charles Normand (coresearcher since 1947 and then Secretary of the International Ozone Commission) suggested observing double wavelength pairs (e.g., AD) in order to minimize the impact of aerosol scattering. From then on the measurements of the AD double pair increasingly became the norm, and since the end of 1952 almost no other type of observation was made (except CC′ measurements in the ZC mode). The AD double pair was denoted the official standard mode by the WMO during the IGY [see Dobson, 1957a], and it still is today.
 From 1949 till September 1955 observations were performed with instrument 1. From September 1955 till September 1957 instrument 2 was back in use, though the reason for the change is not reported. There is some information suggesting problems with the accuracy of instrument 2 (see below) which necessitated the change.
 In this paper, following a detailed description of the methods used and the reprocessing of Oxford data (section 2), the results and proposed adjustments are presented (section 3). Section 4 shows possible applications of the reevaluated data, while the results of the study are summarized in section 5.
2. Methods and Measurements
2.1. Total Ozone Measurements By Dobson Spectrophotometry
 Our determination of the total amount of ozone in the atmosphere is based upon the following principle. The relative intensity ratio at two wavelengths (a wavelength pair) of the Huggins bands (3000–3400 Å), one of which is strongly absorbed by ozone and the other is much less absorbed, is determined; over time, several wavelength pairs were used (see Table 1). This relative intensity is expressed with the N value:

$$N = \log_{10}\frac{I_0}{I_0'} - \log_{10}\frac{I}{I'} = L_0 - L \qquad (1)$$

where I0 is the light intensity outside the atmosphere and I is the intensity at the Earth's surface. In this paper the primed quantities always refer to the longer wavelength (the one that is only weakly absorbed by ozone). The light source can be direct sunlight (DS), light from the blue/cloudy zenith (ZB/ZC), or moonlight. Including the effect of Rayleigh and aerosol scattering, one can calculate an ozone value from a DS observation by:
where X is the ozone column amount in Dobson units [DU]. N is the relative wavelength intensity (see equation (1)), α (α′), β (β′) and δ (δ′), each in cm−1, are the ozone absorption coefficient, the Rayleigh scattering coefficient, and the aerosol scattering coefficient, respectively, m is the geometrical path length of the light through the atmosphere, μ is the ozone slant path of the light through the ozone layer, SZA is the solar zenith angle, p is the barometric pressure at the observatory, and p0 is that at sea level.
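As a minimal sketch of the direct-Sun calculation, assume that equation (2) takes the standard Dobson form X = [N − (β − β′) m p/p0 − (δ − δ′) sec(SZA)] / [(α − α′) μ], with N in log10 units and the coefficients expressed per atm-cm. The function and argument names below are illustrative, the factor 1000 converts atm-cm to DU, and the example N value is synthetic (forward-modelled from an assumed 300 DU); only the coefficient values are taken from section 2.3.3.

```python
def ozone_column_du(n_value, mu, m, d_alpha, d_beta,
                    d_delta=0.0, sec_sza=1.0, p=1013.25, p0=1013.25):
    """Total ozone in DU from a direct-Sun N value (assumed standard Dobson form).

    n_value : N in log10 units (assumption: not multiplied by 100)
    mu, m   : ozone slant path and airmass
    d_alpha : alpha - alpha' (ozone absorption, per atm-cm)
    d_beta  : beta - beta'   (Rayleigh scattering)
    d_delta : delta - delta' (aerosol scattering; 0 if unknown)
    """
    if d_alpha <= 0 or mu <= 0:
        raise ValueError("d_alpha and mu must be positive")
    rayleigh_term = d_beta * m * p / p0
    aerosol_term = d_delta * sec_sza
    x_atm_cm = (n_value - rayleigh_term - aerosol_term) / (d_alpha * mu)
    return 1000.0 * x_atm_cm  # 1 atm-cm = 1000 DU


# Coefficients quoted for the 1939-1946 wavelength pair (section 2.3.3).
d_alpha = 0.910 - 0.056
d_beta = 0.452 - 0.352

# Synthetic N value: forward model for 300 DU at mu = m = 2, sea-level pressure.
n_example = d_alpha * 2.0 * 0.300 + d_beta * 2.0
x_du = ozone_column_du(n_example, mu=2.0, m=2.0, d_alpha=d_alpha, d_beta=d_beta)
print(round(x_du, 3))
```

For a dual pair such as AD, the same function could be called with the AD differences of N, α, and β, in which case the aerosol term is left at zero.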
Table 1. Wavelengths (λ) Used at Different Times Prior to IGY
 In the case of single wavelength pairs the aerosol scattering term has to be ignored because δ and δ′ are unknown. However, atmospheric conditions at Oxford require an aerosol correction. In the IGY the use of a dual wavelength pair (AD) was defined as the standard, so N, (α-α′), (β-β′) and (δ-δ′) in equation (2) can be replaced by the corresponding AD differences, for example, NA-ND [see Vogler, 2006].
 The scattering due to aerosols is almost independent of wavelength, so its effect on a dual wavelength pair measurement such as AD can be neglected. Although Dobson started to use the AD dual wavelength pairs in 1952, most of the reevaluated data go back further. A suitable way of deriving corrections for aerosol interference is therefore needed. Where possible, the correction method of Svendby  which is based on observations of the C' wavelength pair, is used. For details of its application, see Vogler .
 The C′ pair is much less influenced by ozone than the C pair and hence gives a measure of the attenuation by factors other than ozone. It was introduced by Dobson [Dobson, 1957a] as a method to estimate total ozone under cloudy conditions. However, sometimes the C′ pair was also observed in DS mode, which in retrospect allows one to determine empirical corrections Δ for the aerosol effect. In order to apply the corrections to all DS observations where no information from the C′ wavelength pair is available, we developed a linear regression model in which Δ was modelled in terms of the variables X[DU], SZA, and Julian Day (JD):
When DS observations of AD are available the DS observations of C can be corrected by the same equation, but now with respect to simultaneous AD measurements, which are considered to be free of aerosol influence.
 Ozone columns cannot be calculated directly from ZB or ZC observations. However, those so-called “Zenith Sky” (ZS) data are important for extending the number of daily means per month. The determination of ozone columns from ZB and ZC measurements is usually done empirically. Since the weather in Oxford is often cloudy, we do not have very many ZB observations (see Figure 2), so the standard procedure cannot be used. The ZB and the ZC data are therefore handled by the procedure described by equation (4), including an additional independent variable to account for the effect of the clouds. We chose to use L′ (from the longer-wavelength C′ pair in the case of CC′ observations) for this purpose, though modern-day measurements also include information on cloud height/thickness and opacity. During an earlier study [Vogler et al., 2006] we found that it is very difficult to convert hand-written information on weather conditions from the original observing record to systematic variables such as cloud height/thickness and opacity. In the light of the results from the previous study we decided to work just with L′ using the following polynomial, which is similar to ones in use earlier [Vogler et al., 2006; Vanícek et al., 2003; Svenøe, 2000] but with additional terms including L′:
The polynomial is then calibrated with DS ozone values derived from observations made close in time to the ZB/ZC one.
2.2. Reevaluation Procedure
 The Handbook for Dobson ozone data reevaluation [Bojkov et al., 1993] describes how a reevaluation has to be done. The process is applied to Dobson instruments which were operated according to the procedures described by Dobson [1957a] and Komhyr . Since the 1970s, Dobson instruments have been regularly compared with standard instruments [Basher, 1994], which was not the case in earlier times. Hence this standard method can only be applied to ozone data which have been gathered since the IGY. If, as in the present instance, the data are much older, one needs to apply different methods because not all the required information is available.
 If the original observing record sheets are available, the ozone values are recalculated from the instrument reading (R) with modern-day (Bass-Paur, BP) absorption and scattering coefficients. If no observing records are available (as for 1924–1928), correction factors based on the ratio between old and new absorption coefficients [Brönnimann et al., 2003] can be applied to the total ozone calculations.
 The information required for the recalculation of ozone values includes the exact location of observations (longitude, latitude, and altitude above sea level), date and time (including time zone), N-value (converted from R-reading), airmass and ozone slant path (derived from SZA), ozone absorption and Rayleigh scattering coefficients, observation mode, mean barometric pressure, and weather information.
 The conversion from the dial (R-) reading to the N-value is done by R-N conversion tables, which are checked by regular tests. Since those tables were not available to us, they were reconstructed by polynomial regression. The relationship between R and N is theoretically linear, but in practice it is not (K. Vanícek and M. Stanek, private communication, 2006). Airmass (m) and ozone slant path (μ) were derived from SZA with equations by Young  and Komhyr . SZA was calculated with a “LibRadtran” tool which uses an algorithm by Blanco-Muriel et al. . In order that the ozone values can be compared with other recent measurements, one has either to use absorption and scattering coefficients on the BP scale [Komhyr et al., 1993] or to scale the original ozone values to that level. Since the wavelengths employed prior to 1949 were different from those used today, one cannot use the standard coefficients from the BP scale (see sections 2.3.1 to 2.3.3).
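The derivation of the ozone slant path from SZA can be illustrated with the thin-layer geometric relation commonly used in Dobson practice, μ = (R + h) / sqrt((R + h)² − (R + r)² sin²(SZA)). This is a sketch only: it is an assumption that the equation cited above has this form, and the Earth radius R, the ozone layer height h, and the function name are illustrative values chosen here, not taken from the observing records. The airmass m is not treated.

```python
import math

EARTH_RADIUS_KM = 6371.0       # assumed mean Earth radius
OZONE_LAYER_HEIGHT_KM = 22.0   # assumed mean height of the ozone layer


def ozone_slant_path(sza_deg, station_alt_km=0.0,
                     layer_height_km=OZONE_LAYER_HEIGHT_KM,
                     earth_radius_km=EARTH_RADIUS_KM):
    """Relative slant path mu through a thin ozone layer at a given SZA."""
    sin_z = math.sin(math.radians(sza_deg))
    layer_radius = earth_radius_km + layer_height_km
    station_radius = earth_radius_km + station_alt_km
    return layer_radius / math.sqrt(layer_radius ** 2
                                    - (station_radius * sin_z) ** 2)


mu_overhead = ozone_slant_path(0.0)   # Sun in the zenith
mu_60 = ozone_slant_path(60.0)        # slightly below sec(60 deg) = 2
print(mu_overhead, round(mu_60, 3))
```

Because the layer sits above the station, μ grows more slowly than sec(SZA) at low Sun, which is why μ and m are kept as separate quantities in the ozone calculation.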
 One can finally calculate an ozone column density from equation (2) or a corresponding version for a dual wavelength pair. In order to eliminate unreliable observations, an outlier removal procedure is applied. For each day a mean ozone value is calculated; then each observation is divided by the daily mean, and excluded if it deviates from the daily mean by more than 3 standard deviations (calculated from all deviations). Usually, this corresponds to about ±7–8%. However, it is possible that a few correct data were also removed, since such high variability in the ozone layer can occur naturally (though only on very exceptional days).
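The outlier screening described above can be sketched as follows. This is one possible reading of the procedure, not the code used in the study: it assumes a single pass (no iteration), a population standard deviation pooled over the relative deviations of all observations, and an input format of (day, ozone) tuples that is chosen here for illustration. The sample values are synthetic.

```python
from collections import defaultdict
from statistics import mean, pstdev


def remove_outliers(observations, n_sigma=3.0):
    """Drop observations deviating from their daily mean by > n_sigma std devs.

    observations: list of (day, ozone_du) tuples.
    Returns the list of tuples that pass the test.
    """
    by_day = defaultdict(list)
    for day, value in observations:
        by_day[day].append(value)
    daily_mean = {day: mean(values) for day, values in by_day.items()}

    # Relative deviation of each observation from its own daily mean.
    deviations = [value / daily_mean[day] - 1.0 for day, value in observations]
    sigma = pstdev(deviations)  # pooled over all deviations

    return [obs for obs, dev in zip(observations, deviations)
            if abs(dev) <= n_sigma * sigma]


# Synthetic example: 20 quiet days plus one day with a suspicious reading.
obs = [(day, v) for day in range(20) for v in (298.0, 300.0, 302.0)]
obs += [(20, v) for v in (300.0, 300.0, 300.0, 390.0)]
kept = remove_outliers(obs)
print(len(obs), len(kept))
```

In this example only the 390 DU reading is rejected; the other three readings of the same day stay within the threshold even though the outlier has pulled their daily mean upward.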
 If the Sun is low (high value of μ), intensity at the shorter wavelength becomes very small compared to that at the longer wavelength (this is especially true for the A-pair), and internal scattered light becomes critical to the instrument's ability to measure intensity differences accurately. The recommended ranges of μ [Komhyr, 1980] for the operation were therefore respected in most cases from the time the A, B, C, and D pairs came into use. The recommendations do not include DS observations of the C pair, since this is not a WMO standard mode. However, these observations are a valuable extension of the data set. They were therefore included up to μ = 4.5. The reliability of this inclusion was tested by investigating μ * X versus μ [Komhyr and Evans, 2007]. All observations that were performed before the introduction of modern wavelength pairs were also restricted to μ = 4.5, since the wavelength pairs in actual use were close to the C pair. For the Féry spectrograph (1924–1928) this problem did not exist, since the instrument could not be used at low Sun.
 In the case of AD-ZC observations (since 1952) the recommended μ-limit of 2.4 was not respected (the upper limit was set to 4.2) as the polynomial used for calibrating the ZC observations accounts for the effect of low Sun. A plot of μ * X versus μ justifies this threshold [see Vogler et al., 2006].
2.3. Availability of Measurements and Data Processing
 The frequency of the observations performed between 1924 and 1957 was very irregular, rising from a low level during the early years to a steady peak during World War II, falling dramatically when new postwar instrumental methods were developed, and returning to the former high level in 1951 until the end of Dobson observations at Oxford in 1975 (see Figure 2). The use of the different wavelengths, instruments, and platforms is summarized in Figure 1.
 During the examination of the original records a number of observations made with other Dobson instruments were discovered (probably performed for tests and calibrations before the instruments were transported to their measurement sites, see Table 2). The data reprocessing was therefore carried out in a manner that was appropriate, within each period, for the different instruments and their selected wavelengths (see Figure 1). We now explain the data processing for the individual periods.
Table 2. Observations With Other Instruments at Oxford^a
^a Dates are given as yyyy-mm-dd. Asterisk refers to measurements with an Oxford instrument made in Tromsø.
2.3.1. Years 1924–1928
 Dobson et al.  claim that the (few) ozone values from 1924 and 1925 should be corrected by 0.015 cm (equal to 15 DU), which is exactly 5% for an ozone value of 300 DU. We therefore corrected all values from 1924–1925 by +5%. In order to transfer the original values to the BP scale, the data were scaled up by a factor of 1.246 [Brönnimann et al., 2003]. If Dobson marked an observation as unreliable, we excluded it. If there was more than one measurement per day (rarely there were two), we calculated the mean value. The data cover the period from August 1924 till October 1928, with missing data during winter months (mostly November to January) because of the low sensitivity of the instrument (that did not change until the construction of the photoelectric version).
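The arithmetic of these two adjustments can be illustrated with a short sketch. It assumes that the +5% correction is applied first and only to 1924–1925, and that the factor of 1.246 is then applied to every value of the 1924–1928 period; the function name and the order of operations are illustrative choices, not documented details of the reprocessing.

```python
BP_SCALE_FACTOR = 1.246   # scaling to the Bass-Paur level quoted in the text
EARLY_CORRECTION = 1.05   # +5% for values from 1924 and 1925


def rescale_fery_value(ozone_du, year):
    """Apply the 1924-1925 correction and the BP scaling to a Féry-era value."""
    if not 1924 <= year <= 1928:
        raise ValueError("only valid for the 1924-1928 spectrograph data")
    corrected = ozone_du * EARLY_CORRECTION if year <= 1925 else ozone_du
    return corrected * BP_SCALE_FACTOR


# An original value of 300 DU becomes 315 DU after the +5% correction,
# and both cases are then multiplied by 1.246.
value_1924 = rescale_fery_value(300.0, 1924)
value_1927 = rescale_fery_value(300.0, 1927)
print(round(value_1924, 2), round(value_1927, 2))
```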
2.3.2. Years 1933–1938
 These first years after the introduction of the photoelectric instruments are characterized by a low frequency of irregular observations. This makes the reevaluation difficult. The wavelengths used (λ = 3110 Å and λ′ = 3290 Å) are no longer in use today. Owing to a lack of information the original ozone values from the observation forms have been used and scaled [Brönnimann et al., 2003] to the BP level by using the original absorption coefficients (as indicated in the original observing records). For this purpose it was assumed that the wavelengths used are equal to the shorter wavelength of the C pair and the longer wavelength of the B pair (3114.5 Å and 3291 Å). We neglected the difference in (β − β′). There are several uncertainties in this method: the wavelengths used differ to a certain extent from today's (the resulting error is expected to be less than 1%; for details, see Vogler ), and the slit functions (the slit width is needed for weighting the absorption spectrum) are most probably not the same as today's. In the work of Dobson [1957a] and Komhyr et al.  the slit widths are different from those described by Dobson . We do not know if the slit widths changed between 1931 and the 1950s.
 The information necessary for developing an aerosol correction is not available. The correction used for the period 1939–1946 (see below) was therefore applied for this period, thereby assuming that the aerosol load for the two periods was the same.
2.3.3. Years 1939–1946
 For a considerable part of this period very regular observations were performed. The instrument reading (R-reading) was not converted to an N-value but to an L-value (see equation (1)); L0 = 2.943 was used, as indicated in the observing records. The Statistical Langley plot method [Dütsch, 1984; Dobson and Normand, 1957] was applied, in order to check the reliability of L0 = 2.943. The error in L0 is about ±0.002. For values of μ = 1.5, 2 and 4 the uncertainty is about ±0.45%, 0.34% and 0.17% respectively for an ozone value of 350 DU. This uncertainty is in any case much smaller than the instrument's precision (> 2%) and can be neglected [see Vogler, 2006].
 From 1939 to 1946 the slits were set to the widths used for λ = 3110 Å, λ′ = 3300 Å and λ″ = 4450 Å, where the combination of λ/λ′ is the shorter-wavelength pair (herein denoted the E pair) and λ′/λ″ the longer one. Because those wavelength pairs were later discontinued, we do not know the absorption coefficients on the BP scale. Svendby  recalculated the absorption coefficients for instrument 8, which employed the same wavelengths, by assuming the slit widths to be 0.62 mm and 1.20 mm (quoted from Dobson ). This is indeed a plausible assumption, but it is not possible to verify it. In the work of Dobson  the wavelengths are given as λ = 3110 Å, λ′ = 3265 Å, and λ″ = 4435 Å (and in the work of Meetham and Dobson  as λ = 3110 Å, λ′ = 3290 Å), which do not match the ones used here. The question is whether the slit widths were changed or whether the estimation of the central wavelengths was somewhat subjective. From the literature no further information can be found until the introduction of the photomultiplier and the adoption of the A, B, C, and D pairs known today (the corresponding slit widths are 0.40 mm and 1.20 mm [see Dobson, 1957b]). Despite these doubts we applied the absorption coefficients (α = 0.910, α′ = 0.056) and the Rayleigh scattering coefficients (β = 0.452, β′ = 0.352) proposed by Svendby .
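The L0 uncertainties quoted for this period can be checked with a simple error propagation. The sketch assumes that an error ΔL0 enters the ozone value as ΔX = ΔL0 / [(α − α′) μ], that the coefficients are expressed per atm-cm (so 350 DU = 0.350 atm-cm), and that the absorption coefficients given above are the ones that apply; the function name is illustrative.

```python
def relative_ozone_error_percent(delta_l0, mu, ozone_du, d_alpha):
    """Relative ozone error (%) caused by an error delta_l0 in L0.

    Assumes dX = delta_l0 / (d_alpha * mu), with X in atm-cm.
    """
    x_atm_cm = ozone_du / 1000.0
    return 100.0 * delta_l0 / (d_alpha * mu * x_atm_cm)


D_ALPHA = 0.910 - 0.056  # alpha - alpha' for the 1939-1946 pair

errors = {mu: relative_ozone_error_percent(0.002, mu, 350.0, D_ALPHA)
          for mu in (1.5, 2.0, 4.0)}
for mu, err in errors.items():
    print(f"mu = {mu}: about {err:.3f} %")
```

Under these assumptions the results agree with the quoted ±0.45%, 0.34%, and 0.17% to within rounding, and they show why the effect of an L0 error shrinks as μ increases.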
 Since DS observations at the longer-wavelength pairs are only occasionally available (mainly at the beginning of this period) a linear regression model was developed (see equation (3)) for the available observations and applied to all DS observations (R² = 29%).
Figure 3 shows the original aerosol corrections based on λ′/λ″ (top) and the fitted corrections obtained with equation (3) (bottom). While the magnitude of the corrections is similar, the variability is smaller for the fitted ones. However, the main features of variability are reproduced.
 For the development of the ZS polynomial (equation (4)), the maximum offset between DS and ZS (ZB/ZC) observations was set to 60 min. If we compare the ozone values for ZB/ZC data (derived with the polynomial) with their direct Sun reference values (DS), we see that they agree to within ±8%, corresponding to ±2 standard deviations.
2.3.4. Years 1949–1955
 Owing to inconsistencies in the data and a comment from Dobson in the observation record, the following corrections were applied to the N-values from 10 June 1954 to September 1955 (change from instrument 1 to instrument 2): NA + 2.2 and ND − 3.3. In the case of DS AD a time delay of 10 min between the observation of the A and the D pair was accepted as a maximum.
 Some special features in this period deserve note. According to Dobson [1957a] the skylight becomes important relative to the sunlight if μ > 3.5, so the measured ozone value will be too low. It is therefore necessary to form a focused image of the Sun on the slit. The included sky contribution is then reduced relative to the sunlight. This mode is called “Focused image (DS FI).” Dobson states that N values which have been obtained from a DS FI observation need to be corrected. This is done by simultaneous observations of the standard mode (DS with Ground Quartz Plate) and the DS FI mode. Our data include DS FI observations, especially in the winter half of the year when the Sun is low. In the observing records the N-values from DS FI observations of the A pair have always been corrected by −1.4. In rare cases there was also a correction for the observations of the C pair. We adopted the correction for the A pair (including a correction called “S3-scan” that was indicated in some cases; for details, see Vogler ). For most of the C pair we did not apply any correction, in accordance with the original observing records.
 Since DS observations of the AD pairs are today's standard we have adopted them as our standard as soon as they became available, that is, from September 1951. Fortunately, the DS C observations continued after that date and allow one to derive an aerosol correction (see above), so that all DS C observations (1949 to September 1951) could be brought to the level of DS AD measurements. The correction was developed with equation (3), yielding R² = 33% (see auxiliary material). This function was then applied to all DS C observations before 10 June 1954. From then on, slightly different coefficients had to be used for the correction, owing to changes in the R-N conversion tables.
 In contrast to the previous analysis period, the offset between ZS and its reference observation was increased from 60 to 180 min in order to include sufficient matches with reference observations. Both DS AD and DS C (corrected) were taken as reference ozone values. The AD Zenith observations were calibrated as described by Vogler et al. .
2.3.5. Years 1955–1957
 These data were generally treated like those from the previous period (1949–1955). However, some additional treatment was required because the observation records show that the R-N tables were changed twice during the period. The reconstruction of the R-N tables had therefore to be done separately for the following periods: 10 September 1955 to 17 December 1955, 18 December 1955 to 31 December 1956, and 1 January 1957 to 30 June 1957. For the shortest period this was especially difficult since there were barely enough data to reconstruct the R-N tables.
 From 1 January 1957 onward the exact times of observations are not given, so it is not possible to calculate an accurate SZA. We therefore adopted the originally calculated value of μ. That led to a higher level of uncertainty compared to the other data from this period, since the lack of information on the timing does not allow the identification of errors from the original calculations of μ or through the process of digitizing the observing records. From 20 April 1957 onward the R-value was no longer given in the observation records. Hence the R-N tables could not be applied, so the N-values as originally noted had to be used.
 The calculated ozone values were inconsistent over the three subperiods; in particular, the relation between the DS measurements from the C and the AD wavelength pairs was different for the subperiod from 10 September 1955 to 17 December 1955. The observation records reveal a comment by Dobson, suggesting the following corrections for that subperiod: NA + 2, ND − 6.3 and NC − 2.5. After the adoption of those corrections the offset referred to above vanished. The C pair shows only a small difference compared to the standard observations of the AD pairs; this can be attributed to the recreation of R-N tables which must have taken the differences of observations of the C and AD pairs into account. In order to find a good fit for the aerosol correction function we used the first four (instead of two) harmonics of JD. The fitted function is otherwise identical to the one used in the previous section (equation (3), R² = 15%).
 In the case of a DS FI observation the measurements from the C wavelength pair were also corrected by −2.5 since that was clearly indicated by the observers. The observations of the AD pairs were corrected by +1.5 (for the previous period the correction was +1.4).
 It is not known why instrument 2 was used again for such a short period. The changes of the R-N tables and the corrections we have found in the observation records indicate problems with the instrument. The data from this period must therefore be considered less reliable. The same is true of the corresponding data already stored at WOUDC, since they were derived from the same measurements.
2.3.6. After 1957
 In addition to the reevaluated data there are data at WOUDC in Toronto (starting from June 1952) and satellite data. The WOUDC data from 1952 to 1957 have been discussed by Brönnimann et al. ; they are based on daily means which have been scaled to BP but are less reliable than our reevaluated data, which are based on recalculations starting with each single instrument reading. From September 1957 (not July, the beginning of IGY, as stated at WOUDC) until the end of Dobson observations in Oxford in 1975, instrument 1 was used exclusively. The observations for the period 1957–1975 are available in a similar frequency as in 1952–1957.
 The last 27 years can be covered by satellite data from TOMS (Version 8 overpass data; http://toms.gsfc.nasa.gov/ozone/ozoneother.html). Unfortunately there are no ground-based Dobson observations from Oxford during that time which would enable us to make a comparison with the satellite data.
2.4. Validation of the Oxford Series
2.4.1. Calibration of the Instruments
 The basic information about the calibration of Dobson instruments derives from intercomparison with standard instruments, standard lamp tests, mercury lamp tests, and wedge calibrations. Since we have no information about such tests from 1924 to 1957, we need to find other information which informs us about the stability of the instruments. Several statements [Dobson, 1968; D. Moore, UK Met Office, private communication, 2006] show that until 1964 all Dobson spectrophotometers made by Ealing Beck were sent to Oxford for comparisons with instruments 1 and 2. One of those instruments (41) showed only a small offset of −0.8% compared to the World Standard Dobson instrument [Basher, 1994], which is surely a good indicator of the quality of our data. Unfortunately, no information concerning other intercomparisons (e.g., UK national calibrations) was found at the archive of UK Met Office. However, Dobson instrument 1 was regarded as the standard instrument until 1964 [Dobson, 1968]. More details about these issues are given by Vogler .
 For the period 1924–1957, the only method which enables us to test the stability of the instrument is the statistical Langley plot method [Dütsch, 1984]. If the instrument is stable one expects that L0 will not show any trend. However, should an instrumental drift have occurred, due to optical or spectral problems/changes, it should be clearly visible as a trend in L0. This method was used in the present study when possible. The results are shown in section 3.1.
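The stability check described above amounts to testing whether L0 has a significant linear trend. A minimal sketch of such a test follows, assuming an ordinary least-squares fit of L0 against time; the function name and the L0 values are hypothetical and serve only as illustration, and the derivation of L0 itself by the statistical Langley method is not shown.

```python
import math

def slope_t_statistic(times, values):
    """Ordinary least-squares slope of values against times and its t statistic."""
    n = len(times)
    if n < 3:
        raise ValueError("need at least three points to test a trend")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    # Residual variance with n - 2 degrees of freedom.
    sse = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, values))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    t_stat = slope / std_err if std_err > 0 else math.inf
    return slope, t_stat

# Hypothetical L0 values (arbitrary units), one per year, for illustration only.
years = [1940, 1941, 1942, 1943, 1944]
l0 = [1.02, 0.99, 1.01, 1.00, 1.03]
slope, t_stat = slope_t_statistic(years, l0)

# Two-sided 95% critical value of Student's t for n - 2 = 3 degrees of freedom.
T_CRIT_95_DF3 = 3.182
trend_is_significant = abs(t_stat) > T_CRIT_95_DF3
```

With these illustrative numbers the slope is small compared with its standard error, so no drift would be inferred. As in the tests described in this paper, serial correlation is ignored in this sketch.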
2.4.2. Comparison With Meteorological Variables and Arosa Series
 Since we do not have the necessary information to track the calibration of the instruments, the validation and homogenization procedure had to be based on statistical methods. The method we applied is very similar to that used by Brönnimann et al. [2003] and is based on a comparison between the Oxford total ozone series (candidate series) and a synthetic reference series. Any series that can be considered homogeneous and has a high correlation with the candidate series can be used as a reference. For the present study we principally used the 400 hPa geopotential height and the temperature at Oxford (termed Z400hPa and T400hPa), and we also tested the consistency of our results using total ozone from Arosa [Staehelin et al., 1998b]. All series were found to be sufficiently well correlated with Oxford total ozone. The actual synthetic reference series were then obtained by calibrating a regression model (either with the two meteorological series or with Arosa total ozone) against Oxford total ozone for a reference period and applying that model to the earlier period. The reference period was chosen as October 1957 (start of instrument 1) to 1963 [see Brönnimann et al., 2003], and seasonal effects were removed from all variables. Hence:
where a, b, c are determined from the reference period via regression:
The difference series between the total ozone at Oxford and the synthetic reference series in historical periods should ideally have a mean value of zero, and we could check that using a t-test. Because of changes in instruments and wavelength pairs we selected the following periods: 1924–1928, 1933–1938, 1939–1946, 1949–1955, and 1955–1957. In addition to step inhomogeneities, we also tested for SZA-dependent errors (such as stray light or the calibration of the optical wedges) by fitting to the difference series a sine curve that is symmetric around the solstice dates and testing the significance of the coefficient. Notice that in both tests serial correlation was neglected, which was justified by the very large number of observations (and hence degrees of freedom). This general method can be performed on either a daily or a monthly scale, the only difference being that in the former case the first two harmonics of the annual cycle are used instead of the mean annual cycle.
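The procedure described above can be sketched as follows, for the variant using the two meteorological series. The linear form ozone = a + b·Z400hPa + c·T400hPa is an assumption made here for illustration (three coefficients, two predictors), as are all function names and numerical values; the inputs are taken to be deseasonalized anomalies. The sketch fits the coefficients in the reference period, applies them to a historical period, and computes the mean difference with its one-sample t statistic.

```python
import math

def solve_linear(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution

def fit_reference_model(ozone, z400, t400):
    """Least-squares fit of ozone = a + b*z400 + c*t400 (assumed model form)."""
    rows = [[1.0, z, t] for z, t in zip(z400, t400)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ozone)) for i in range(3)]
    return solve_linear(xtx, xty)

def synthetic_reference(coeffs, z400, t400):
    """Apply the fitted model to another period."""
    a, b, c = coeffs
    return [a + b * z + c * t for z, t in zip(z400, t400)]

def mean_difference_t(candidate, reference):
    """Mean of (candidate - reference) and its one-sample t statistic."""
    diffs = [x - r for x, r in zip(candidate, reference)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean, mean / math.sqrt(var / n)

# Hypothetical reference-period anomalies (illustration only).
z_ref = [-40.0, -10.0, 0.0, 20.0, 30.0, -5.0]
t_ref = [-2.0, 1.0, 0.5, -1.0, 2.0, 0.0]
ozone_ref = [2.0 - 0.5 * z + 1.5 * t for z, t in zip(z_ref, t_ref)]
coeffs = fit_reference_model(ozone_ref, z_ref, t_ref)

# Hypothetical historical period with an offset of about -10 DU.
z_hist = [-20.0, 10.0, 5.0, -15.0]
t_hist = [1.0, -1.0, 0.0, 2.0]
reference_hist = synthetic_reference(coeffs, z_hist, t_hist)
offsets = [-9.0, -11.0, -10.0, -10.0]
candidate_hist = [r + o for r, o in zip(reference_hist, offsets)]
mean_diff, t_stat = mean_difference_t(candidate_hist, reference_hist)
```

The t statistic would then be compared with the Student-t critical value for n − 1 degrees of freedom. The test for SZA-dependent errors would follow the same pattern, with a sine term fitted to the difference series instead of a constant.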
 The 400 hPa GPH and temperature for Oxford can be extracted from NCEP/NCAR data [Kistler et al., 2001] for dates after 1948. They were supplemented with daily aircraft data from locations close to Oxford from 1939 to 1944 (Brönnimann , S026DCRD, and S028DCRD). For earlier periods monthly reconstructions of T400hPa and Z400hPa must be used. They were calibrated for 1958–1980 and are based on the first four principal components of the SLP field between 70°W–30°E and 20°N–75°N and temperature data from Oxford, Valentia, De Bilt, Nantes, and Aberdeen (expressed as monthly anomalies with respect to the 1961–1990 mean seasonal cycle). In the period where the reconstruction overlaps with the NCEP reanalysis (1948–1957) the correlations were 0.94 for Z400hPa and 0.83 for T400hPa. Although there are NCEP data from 1948 onward, our reconstructions were carried out for the whole period, in order to avoid introducing an inhomogeneity into the historical years.
3. Results and Readjustments
3.1. Statistical Langley Plot
 The statistical Langley plot method can only be performed for the periods 1940–1944 and 1949–1955 owing to data restrictions. For the first (1940–1944) period the instrument was found to be stable; L0 shows only a small and insignificant trend (Figure 4), and the possible effect on total ozone is clearly less than 1% for this period.
 For the period 1949–1955, the required conditions are not met often enough for the calculation of a trend, but a visual assessment shows no obvious instrumental drift.
3.2. Comparison With Synthetic Reference Series
 This comparison was carried out on the basis of daily values where possible, but monthly values had to be used for 1924–1928 and 1933–1938 since there are no daily 400 hPa data available for this period.
 The resulting correlations between the candidate and the synthetic reference series (see Table 3) are 0.55 to 0.75 for Ref400hPa and 0.3 to 0.5 for RefAro. The correlation for 1924–1928 (on a daily basis) with respect to RefAro is lower than for the other periods, which is in fact expected since that is the period with the largest uncertainty (for both the synthetic reference and the candidate series). The correlation for 1955–1957 with respect to RefAro is clearly lower than for the previous periods; as that is not also true with respect to Ref400hPa it could point to a possible problem in the Arosa series between 1955 and 1957.
Table 3. Comparison of Candidate Series (Oxford Total Ozone) With Ref400hPa and RefAro on a Daily and a Monthly Basis (Reference Period October 1957–1963)a
Unless specified otherwise, the data set contains all DS and ZS observations. “Corr” refers to the correlation coefficient. “Diff” refers to the mean difference (in DU) between the Oxford and the reference series. “Correction” refers to corrections applied to the ozone values.
Difference is significant at the 95% confidence level.
There is a significant (95% confidence level) seasonality in the differences.
 The mean difference between the candidate series and the synthetic reference series for the different periods can be found in Table 3. This value ought to be close to zero but is clearly negative by about the same amount for all periods except 1924–1928. It is encouraging that the differences with respect to both synthetic reference series are very similar (though somewhat less so for 1924–1928 and 1955–1957). As already suspected earlier, the Oxford ozone values for 1924–1928 are too high, probably due to uncorrected aerosol interference, but the inconsistencies for 1955–1957 are such as to suggest a problem with the Arosa data for this period and not with the Oxford ones.
 Apart from 1939–1946 and 1949–1955 we do not find significant seasonality in the differences for either synthetic reference series. This suggests that the data are of good quality and do not suffer from instrumental problems. The seasonality for 1949–1955 can be explained by the use of different wavelength pairs. We are of course comparing AD observations from Oxford (or C observations which have been transformed to the AD-level) with C observations from Arosa. In Arosa there were no regular AD observations before 1957. It is known that the AD and the C observations have a different seasonal cycle (mainly due to aerosol effects). Problems with the aerosol correction could also be the reason for the seasonality of the differences for 1939–1946.
3.3. Applied Adjustment
 Because the differences with respect to the synthetic reference series are consistent for 1933–1957 (see section 3.2), the ozone values from Oxford were adjusted relative to the reference period. Because of the higher correlations, only the differences with respect to Ref400hPa (from daily data if available, see Table 3) were used. The approximate annual mean ozone value at Oxford is 330 DU, so the ozone values from each period were scaled by −diff/330.
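As an illustration of this adjustment, the following sketch reads "scaled by −diff/330" as a multiplicative correction factor of 1 − diff/330 applied to every value of a period. That reading, the function name, and the example offset of −10 DU are assumptions made here and are not taken from Table 3.

```python
ANNUAL_MEAN_DU = 330.0  # approximate annual mean at Oxford, as stated in the text

def adjust_ozone(values_du, mean_diff_du, annual_mean_du=ANNUAL_MEAN_DU):
    """Apply the relative correction -diff / annual mean to a list of ozone values."""
    factor = 1.0 - mean_diff_du / annual_mean_du
    return [v * factor for v in values_du]

# Hypothetical mean difference (candidate minus reference) of -10 DU for one period.
adjusted = adjust_ozone([300.0, 330.0, 360.0], mean_diff_du=-10.0)
```

Under this reading a negative mean difference raises the historical values, and a value equal to the annual mean is shifted by exactly −diff.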
 As Table 3 shows, only the difference for 1933–1938 with respect to meteorological data (on monthly basis) is not significant. This can be explained by the low availability of data. Nevertheless, as the offset is nearly identical to those for the other periods we still made the adjustment.
3.4. Comparison of Annual Means With Arosa
Figure 5 shows the reevaluated data (original and corrected) together with the WOUDC data (1957–1975) and the TOMS data (1978–2005). It is obvious that the adjusted data match the measurements of the later periods better. The applied adjustment was determined from the comparison with meteorological data, so the ozone series from Oxford is still independent with respect to the one from Arosa. We can therefore compare the annual means from Oxford with Arosa ones (see Figure 6).
 From 1957 to 1970 (in fact till 2005 if TOMS V8 data are also used) the Oxford values are always higher than, or at the same level as, those from Arosa, whereas (apart from the earliest data) the opposite is true before 1957. That the relationship between Oxford and Arosa values for the very first period differs from that for other periods was shown at the beginning of this section, while the 1940s and 1950s show clearly that the adjusted data match the pattern of the later data better than the original values do.
3.5. Effects of Planetary Boundary-Layer Air Pollution
 Atmospheric trace gases such as SO2 and NO2 can interfere with Dobson total ozone observations [Dobson, 1963; Komhyr and Evans, 1980]. De Muer and De Backer  found fictitious total ozone trends which were in fact caused by sulfur dioxide trends; their observations were performed in the 1970s and 1980s at a heavily polluted site close to Brussels. Although we expect that the pollution levels during our period of investigation were much lower, we attempted to estimate the possible influence of SO2 and NO2.
 In 1963 Dobson concluded that observations at that time did not indicate an amount of SO2 which would cause an appreciable error in the total ozone measurements [Dobson, 1963]. This statement was reassuring regarding the possible interference of trace gases in total ozone data from Oxford, and we have attempted to evaluate this effect in a more quantitative way. The total ozone measurement error due to the presence of an interfering absorbing species was calculated according to Komhyr and Evans  for the C and AD pairs (the wavelength pairs used earlier are close to the C pair and the interference by trace gases is assumed to be similar), but with updated absorption coefficients [Komhyr et al., 1993; Vogler, 2006].
 For the English Midlands, Mylona  estimated a concentration of 10–15 μg/m3 for 1924–1952 and about 15–20 μg/m3 for the maximum of SO2 pollution during the 1960s and 1970s. Garland and Branson  estimated a mean mixing height of 1200 m for SO2 in the UK. We adopted a constant concentration up to an assumed mixing height of 1500 m. The following errors were estimated: +0.45% and +0.90% for the C pair for concentrations of 10 and 20 μg/m3, respectively, and +0.26% and +0.52% for the AD pair for the same concentrations, respectively.
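To give a feeling for the amounts involved, the following sketch converts a constant surface concentration within a mixing layer into a vertical column expressed in Dobson units. It covers only the column conversion; the translation into an ozone error requires the absorption coefficients of the wavelength pairs and is not reproduced here. The function name is hypothetical, and the physical constants are standard values rather than values taken from this paper.

```python
AVOGADRO = 6.02214076e23              # molecules per mole
MOLAR_MASS_SO2 = 64.066               # g/mol
MOLECULES_PER_CM2_PER_DU = 2.6867e16  # definition of one Dobson unit

def column_du(conc_ug_m3, mixing_height_m, molar_mass_g_mol):
    """Column amount (DU) of a gas at constant concentration up to a mixing height."""
    mass_g_m2 = conc_ug_m3 * 1e-6 * mixing_height_m      # g per square metre
    molecules_m2 = mass_g_m2 / molar_mass_g_mol * AVOGADRO
    molecules_cm2 = molecules_m2 / 1e4
    return molecules_cm2 / MOLECULES_PER_CM2_PER_DU

# Concentrations and mixing height as adopted in the text.
so2_low = column_du(10.0, 1500.0, MOLAR_MASS_SO2)
so2_high = column_du(20.0, 1500.0, MOLAR_MASS_SO2)
```

For 10 μg/m3 up to 1500 m this gives an SO2 column of roughly 0.5 DU, which is small compared with a total ozone column of about 330 DU.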
 These uncertainties are well below the theoretical instrumental precision of >2%. We therefore conclude that for 1924 to 1957 there is no evidence of any substantial error in total ozone observations at Oxford for average conditions. That might be different for (say) winter smog conditions such as those that prevailed during 5–9 December 1952, when London experienced an extreme smog situation with SO2 concentrations up to 0.69 ppm [Bell and Davis, 2001]. If one could perform a Dobson ozone observation during such an SO2 load, increases of 28% in the DS(C) mode and 16% in the DS(AD) mode would be expected (assuming a mixing height of 500 m). For a more moderate concentration of 0.1–0.2 ppm SO2 (the winter-average SO2 concentration in London at that time) the increases in the C and AD modes would be about 6% and 3.5%, respectively. There may be grounds for thinking that during this period Oxford also experienced an inversion and some associated elevated pollution levels; nevertheless, although the observation records indicate low visibility, the calculated ozone values are in the normal range of variation. Furthermore, comparison of DS observations from the C and the AD pairs does not show any influence of SO2 (DS(C) observations are influenced more strongly by SO2 than DS(AD) observations). We therefore conclude that even in situations where the pollution levels were likely to have been elevated the ozone observations (at least until 1957) were not noticeably disturbed by the presence of SO2. The situation between 1957 and 1975 needs further investigation, however.
 To estimate the possible influence of NO2, the same procedure as for SO2 was used. In the case of NO2 it is more difficult to estimate an approximate surface concentration for 1924–1957. Since the main source of NO2 is high-temperature combustion, the surface concentration is strongly dependent upon the distance to the source (road traffic, power plants). The observation site (Watch Hill) is about 4 km from Oxford City Center, but a major local industry (automobile manufacture) was situated only 2 to 3 km to the SW (the direction of the prevailing wind), and the trunk route to London passed no more than 1 km to the N in a valley. Although we cannot give a reliable estimate, we consider two examples. We take an NO2 concentration of 50 μg/m3 (60 μg/m3 is the approximate annual mean for Zürich, Switzerland, at the beginning of the 1980s, when the NO2 pollution reached its maximum) and of 100 μg/m3, which represents a very high daily mean in the 1990s. The resulting changes for the C pair are −2.59% and −5.17% for concentrations of 50 and 100 μg/m3, respectively, while for the AD pair they are +0.42% and +0.83% for the same two extreme concentrations, respectively.
 The NO2 levels at Oxford for 1924–1957 are not expected to be as high as in the examples just given; 20 μg/m3 would already be a quite high concentration for that time (leading to a reduction of about 1% for the C pair). The AD pair is clearly much less affected than the C pair. The NO2 emissions started to rise strongly after World War II (by a factor of up to 10) and peaked in the 1970s [Van Aardenne et al., 2001]. However, from 1952 onward observations were carried out in the AD mode, which is affected much less. Furthermore, the observations of the C pair since 1949 have been corrected with respect to AD observations, which means that they are possibly partly corrected for the influence of SO2 and NO2.
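The figure of about 1% for 20 μg/m3 can be checked by simple proportionality, since the reported errors are, to within rounding, linear in the concentration (−2.59% at 50 μg/m3 and −5.17% at 100 μg/m3 for the C pair). The sketch below assumes that this linearity also holds at lower concentrations; the function name is hypothetical.

```python
# Reported C-pair change for 50 ug/m3 of NO2, taken from the text above.
C_PAIR_CHANGE_AT_50 = -2.59  # percent

def scaled_change(conc_ug_m3, ref_change_pct=C_PAIR_CHANGE_AT_50, ref_conc=50.0):
    """Linearly scale a reported percentage change to another concentration."""
    return ref_change_pct * conc_ug_m3 / ref_conc

change_at_20 = scaled_change(20.0)    # about -1.04 percent
change_at_100 = scaled_change(100.0)  # about -5.18 percent, versus -5.17 reported
```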
 We therefore believe that NO2 is not a significant problem for the period of the reevaluated data for average conditions. However, it might be a problem for the 1957–1975 data which are already stored at WOUDC. In particular, in terms of a trend analysis there might be a detectable influence since NO2 emissions had that very strong trend between the 1950s and 1970s.
4. Ozone Climatologies for Europe
4.1. Comparison of Total Ozone in the Pre-CFC and CFC Era
 The Oxford series allows one to address ozone variability on daily or interannual scales during the pre-CFC era (the reevaluated Oxford Dobson series will be available at WOUDC). This information could be important for validating chemistry-climate models. However, the Oxford series lacks overlapping observations between two periods when the instrument or the wavelengths used were changed, and moreover the Dobson observations stopped in 1975 and there is no possible connection with the start of the TOMS observations at the end of 1978. For those reasons the series cannot be recommended for trend analysis. Nevertheless, it can be useful for certain other analyses. In this section we present the Oxford series together with other short- or long-term historical total ozone series from Europe in the form of climatologies. In contrast to the climatology from London and Kelly , which is still widely used as reference climatology for the pre-CFC period, we include new data from Oxford and Svalbard [Vogler et al., 2006], and compare them with the climatology of the CFC-age (1988–2000).
 We compare the annual cycle of total ozone at six European stations (identified in Figure 7; their coordinates are listed in Table 4) for three different periods (1924–1939, 1950–1965, and 1988–2000).
Table 4. Data Used for Analysis and Station Coordinates
 The availability of ozone data for the different stations is very mixed, and that influences how the results are interpreted. We used monthly mean values with no restrictions on the number of days of observations per month. If there are different sources at a given station for the same dates, we give Dobson (D) ozone data first priority, with Brewer (B) data next; if there are no ground-based data available we take TOMS (Version 8) measurements for dates since 1978. One has to keep in mind that there is a systematic difference in the annual cycle of Brewer and Dobson data with an amplitude of about ±2%. The data we used are listed in Table 4.
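The source priority described above can be sketched as a simple merge of monthly means. The data layout (dictionaries keyed by year and month) and the function name are assumptions made for illustration; the values are hypothetical.

```python
def merge_monthly(dobson, brewer, toms):
    """Pick one value per month: Dobson first, then Brewer, then TOMS (1978 onward).

    Each argument maps (year, month) to a monthly mean in DU. The result maps
    (year, month) to a (source, value) pair, with source "D", "B", or "T".
    """
    merged = {}
    for key in sorted(set(dobson) | set(brewer) | set(toms)):
        if key in dobson:
            merged[key] = ("D", dobson[key])
        elif key in brewer:
            merged[key] = ("B", brewer[key])
        elif key[0] >= 1978:
            # Only TOMS is left for this month; it is used for dates since 1978.
            merged[key] = ("T", toms[key])
    return merged

# Hypothetical monthly means (DU) for three months of one year.
merged = merge_monthly(
    dobson={(1990, 1): 340.0},
    brewer={(1990, 1): 338.0, (1990, 2): 360.0},
    toms={(1990, 2): 365.0, (1990, 3): 380.0},
)
```

Keeping the source label alongside each value makes it straightforward to check afterwards how much of a station's climatology rests on each instrument type.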
 Notice that the availability of data varies according to station. Only three stations (Tromsø, Oxford, and Arosa) have data before the 1950s. Only Tromsø, Arosa, and Lerwick have a record between 1988 and 2000 which consists entirely of ground-based observations. In addition, one needs to keep in mind that the winter data from the northern stations (Svalbard and Tromsø) are limited by illumination restrictions. For example, DS and ZS Dobson observations cannot be made at Svalbard between November and February; the only possibility then is to use moonlight, for which we have data in the 1950s and 1960s. However, those observations are much more difficult to perform and need specific conditions, and are hence less reliable.
 We are aware of slight differences in the seasonal cycles deduced from Dobson, Brewer, and satellite observations. If the monthly ozone data from Arosa for 1988–2000 are replaced by TOMS data we find a positive offset of TOMS versus Dobson of 0 to 3.7% for March to June, though other months more or less agree; see Figure 8. Despite these findings the differences are sufficiently small as not to affect the main conclusions of our investigation.
 As expected, the ozone annual cycles from one period have the same shape but a different amplitude which is strongly dependent on latitude (164 DU in Svalbard and 93 DU in Arosa for 1950–1965). The annual means are around 335 DU for most stations, though the two north of the Arctic Circle have values of about 350 DU (1950–1965). A comparison of 1924–1939 with 1950–1965 shows no substantial changes. On the other hand, between 1950–1965 and 1988–2000 the annual means decreased at all stations; not only have the annual means dropped, but the shape of the annual cycle has also changed. The minimum values in autumn have stayed more or less constant, but the spring values have dropped substantially at all stations, though in Oxford the decrease seems to have been smaller. Except at the high-latitude stations, the spring peak values for 1988–2000 are more pronounced, a feature also visible in the TOMS data.
 In the 1960s and 1970s several studies of ozone climatologies and trends were published [Komhyr et al., 1971; London and Kelly, 1974; Dütsch, 1974]. They all show an increase of total ozone during the 1960s. Our comparison does not cover that period, since we cannot examine the interval between 1965 and 1988. Nevertheless, our climatological comparison does allow some new insights.
 The comparison of the 1988–2000 monthly means with earlier ones shows clearly the difference between the pre-CFC age and a stratosphere which we know has been influenced by human activities. The ozone depletion is most pronounced during late winter and early spring months and very weak in autumn. This is well known and has been shown by trend studies [Staehelin et al., 2001]. The depletion becomes stronger toward the pole. The reason given is the establishment of the cold polar vortex at this time, which allows the formation of PSCs owing to its low temperature, the latter in turn activating the ozone-destroying catalysts. This catalytic destruction commences as soon as sunlight returns to the polar stratosphere. The strongest ozone depletions are therefore detected in the polar region (Svalbard and Tromsø). Since ozone depletion also occurs at midlatitudes and some air-mass exchange takes place between polar and midlatitudes, no systematic difference between annual mean ozone trends at polar and midlatitudes is detected.
4.2. Ozone Compared With Atmospheric Circulation Indices
 It is also interesting to analyze interannual variability of total ozone. Here we focus on the effect of the polar vortex using two variables, Z100 and VPSC. We define Z100 as the difference in 100 hPa GPH anomalies between 75°–90°N and 40°–55°N; it measures the state of the polar vortex (February to April) and captures its dynamical effect on ozone. VPSC (accumulated between December and March) is a rough description of the meteorological potential for chemical ozone depletion on polar stratospheric clouds during the respective winter. In addition, Arctic ozone depletion influences ozone in midlatitudes after the breakup of the vortex [e.g., Knudsen and Gross, 2000]. However, we restrict our analysis in this paper to the general comparison between VPSC and northern midlatitude ozone, as a more detailed study of the influence of polar ozone depletion on midlatitudes is beyond the scope of the paper.
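A minimal sketch of an index of the Z100 type follows. It assumes zonal-mean 100 hPa GPH anomalies on discrete latitudes, cosine-latitude weighting within each band, and the sign convention "polar band minus midlatitude band"; none of these details is specified above, and the function names and values are hypothetical.

```python
import math

def band_mean(anomaly_by_lat, lat_min, lat_max):
    """Cosine-latitude weighted mean of zonal-mean anomalies within a band."""
    pairs = [(math.cos(math.radians(lat)), anomaly)
             for lat, anomaly in anomaly_by_lat.items()
             if lat_min <= lat <= lat_max]
    total_weight = sum(weight for weight, _ in pairs)
    return sum(weight * anomaly for weight, anomaly in pairs) / total_weight

def z100_index(anomaly_by_lat):
    """Polar (75-90N) minus midlatitude (40-55N) mean anomaly, assumed sign."""
    return band_mean(anomaly_by_lat, 75, 90) - band_mean(anomaly_by_lat, 40, 55)

# Hypothetical February-April mean anomalies in geopotential metres.
anomalies = {40: -20.0, 45: -20.0, 50: -20.0, 55: -20.0,
             75: 100.0, 80: 100.0, 85: 100.0, 90: 100.0}
index = z100_index(anomalies)
```

With this sign convention a positive index corresponds to anomalously high geopotential height over the pole, that is, a weak vortex.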
 The indices were compared with ozone anomalies on a monthly basis for 1924–1965 and 1988–2000. Figure 9 shows high positive correlations of Z100 with ozone for both periods, exceeding 0.5 for Tromsø and Svalbard. A comparison of VPSC against ozone shows no significant correlation for 1949–1965 (the VPSC proxy was not available before 1949), while for 1988–2000 the correlations are below −0.5 for most stations (even Oxford shows a correlation of −0.442, although this is not significant owing to the shortness of the period).
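The remark about the Oxford correlation can be illustrated with the usual t-test for a correlation coefficient. The sketch assumes one value per year for 1988–2000 (n = 13) and a two-sided test at the 95% level; the sample size actually used for Figure 9 is not stated here, so n is a labeled assumption, as is the function name.

```python
import math

def correlation_t(r, n):
    """t statistic of a correlation coefficient r from n pairs (n - 2 d.o.f.)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

# Two-sided 95% critical value of Student's t for 11 degrees of freedom.
T_CRIT_95_DF11 = 2.201

N_YEARS = 13  # assumed: one value per year, 1988-2000
t_oxford = correlation_t(-0.442, N_YEARS)
is_significant = abs(t_oxford) > T_CRIT_95_DF11
```

Under these assumptions the t statistic is about −1.63, which does not reach the critical value, in line with the statement that the Oxford correlation is not significant.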
 Our analysis suggests that the influence of atmospheric circulation (Z100) on total ozone did not change substantially between 1924–1965 and 1988–2000, while the effect of PSCs seems to be much stronger for 1988–2000. This is consistent with polar ozone depletion by CFCs catalyzed by Polar Stratospheric Clouds using an approach as described by Rex et al. .
5. Outlook and Conclusions
 The Dobson total ozone data set from Oxford has been successfully extended back to 1924. Comparisons with meteorological upper-air data and the Arosa series allowed us to calculate an adjusted, internally consistent series. It was necessary to develop an aerosol correction since it was realized that the pollution level at Oxford had a large impact on ozone observations represented by the single C wavelength pair (or on other pairs close to this). On the other hand we found that the influence of SO2 and NO2 between 1924 and 1957 was probably small. Although the reevaluated data are of good quality, they, along with existing data from Oxford, are hardly suitable for trend analysis in their current state owing to possible inhomogeneities in the 1957–1975 period.
 The mean ozone cycles for 1924–1939, 1950–1965, and 1988–2000 have been analyzed. There are no apparent changes between 1924–1939 and 1950–1965. As has been shown by earlier studies, we found significant ozone depletion (1988–2000) at all stations, with the depletion being strong in late winter/spring and negligible in autumn. The ozone decrease is larger toward the pole.
 Comparisons of total ozone with indices of dynamical and chemical effects related to the polar vortex show that the influence of atmospheric circulation remains unchanged, while the influence of chemistry on stratospheric ozone is only detected in the most recent decades.
 The extension of the Oxford ozone series allows us to examine the dynamical variability of total ozone over timescales ranging from that of ozone miniholes [Brönnimann and Hood, 2003] to that of interannual variability [Brönnimann et al., 2004] or the QBO. This is considered particularly valuable in the period prior to about 1950, when upper-air observations were extremely sparse. In fact, historical total ozone data provide an independent test of consistency for new pre-1948 upper-air data products that are now being developed [Brönnimann et al., 2005].
 We thank Clive Rodgers, Oxford University, for providing access to the original observation and calculation sheets from the Dobson ozone observations made in Oxford from 1933 to 1957, and the Swiss National Science Foundation for funding our reevaluation project, “Past climate variability from an upper-level perspective.” The authors also thank Karel Vancek and Martin Stanek (Czech Hydrometeorological Institute) for valuable discussions on the reevaluation of historical ozone data, David Moore (UK Met Office) for information on the calibration history of Dobson Spectrophotometers, Georg Hansen (Norwegian Institute for Air Research) for providing ozone data from Tromsø, and Marjory Abraham, Walter Dann, Claudia Mohr, Bernhard Krähenmann, and Stefan Krähenmann for digitizing the ozone data.