4.1. SST Versus 0–20 m Near-Surface Analysis
Figure 1a shows a time series of monthly global average near-surface ocean temperature anomalies. Series calculated from SST and from subsurface profiles within the upper 20 meters of the ocean show very similar interannual variability back to around 1945, although the 0–20 m series is generally cooler than the SST series between 1960 and 1975. Prior to 1960, the coverage of hydrographic observations (shown as insets in the main diagram) becomes much sparser. Despite this, the global-average SST and near-surface estimates continue to track each other, although the agreement becomes progressively worse earlier in the record as the sampling in the subsurface analysis becomes more regionally confined. As illustrated by Figure 1b, the mean monthly absolute temperature time series for the sea surface and for the 0–20 m level diverge before 1940, because the near-surface sampling is biased toward higher latitudes, which results in a colder mean temperature. Before 1940, both time series show a warming trend, but the rate of warming is higher in the near-surface data than it is in the SST. Figure 1c shows the estimated 2-sigma uncertainty range and the difference between the estimates. The differences mostly fall within the uncertainty range back to around 1930. Both time series show a temperature increase from 1900 to about 1945, a slight decrease to the mid-1970s, and a temperature rise to the end of the record. While it is possible that the agreement is fortuitous, the fact that two independently derived series agree so closely indicates that there is a common signal and that it is being faithfully represented, albeit with some uncertainty, by the data. An alternative view of global temperatures is provided by creating histograms of the point anomalies (Figure 1d). The point anomalies are characterized by a uni-modal distribution with the majority of the values concentrated within a narrow band around the mode. As in Figure 1a, the histograms show an overall warming of the global ocean since 1900.
Figure 1. (a) Global temperature anomaly time series (monthly values relative to the average for 2001–2010) calculated for the sea surface (red) and for the near-surface (0–20 m) layer (blue). Both time series are obtained by area-weighted averaging of all available monthly 5 × 5-degree box anomalies. Inset maps show the spatial distribution of the temperature observations/profiles and the number of observations/profiles within 5 × 5-degree boxes for selected time periods. (b) Time series of the sea surface temperature (red) and of the 0–20 m layer temperature (blue). Both time series are obtained by weighted averaging of 5 × 5-degree box absolute temperatures. (c) Difference between the sea surface and the subsurface temperature anomaly (black line) with respective two-sigma uncertainties shown in color. (d) Histograms of temperature anomalies for the 0–20 m layer based on temperature profiles from the global hydrographic database. Anomalies are calculated relative to the 10-year reference period 2001–2010. The color scale shows the percentage of anomalies falling within each 1 month × 0.01°C bin, relative to the total number of anomalies for each calendar month and year. Histogram bins colored in grey correspond to the 5 percent of original anomalies rejected through the application of the median filter. The white line denotes the position of the distribution mode (5-year running mean of monthly anomaly values). N gives the total number of the original temperature profiles.
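The area-weighted averaging of 5 × 5-degree box anomalies mentioned in the caption can be sketched as follows. This is a minimal illustration, not the procedure actually used for the figure: the weighting by spherical band area (sin of the upper edge minus sin of the lower edge), the function names `box_area_weights` and `area_weighted_mean`, and the synthetic grid are all assumptions introduced here.

```python
import numpy as np

def box_area_weights(lat_edges):
    """Relative area of each latitude band on a sphere: sin(top) - sin(bottom)."""
    s = np.sin(np.deg2rad(lat_edges))
    return s[1:] - s[:-1]

def area_weighted_mean(anom, lat_edges):
    """Area-weighted mean of a (n_lat, n_lon) anomaly grid; NaN marks empty boxes."""
    w = box_area_weights(lat_edges)[:, None] * np.ones_like(anom)
    valid = np.isfinite(anom)
    if not valid.any():
        return np.nan
    return float(np.sum(w[valid] * anom[valid]) / np.sum(w[valid]))

# Synthetic example (hypothetical values): 36 latitude bands x 72 longitude columns.
lat_edges = np.arange(-90, 91, 5)
anom = np.full((36, 72), np.nan)
anom[17:19, :] = 1.0   # two equatorial bands (5S to 5N) with a +1 anomaly
anom[35, :] = -1.0     # one polar band (85N to 90N) with a -1 anomaly

result = area_weighted_mean(anom, lat_edges)
unweighted = float(np.nanmean(anom))
print(f"area-weighted: {result:.3f}, unweighted: {unweighted:.3f}")
```

In this synthetic case the small polar boxes barely affect the weighted mean, whereas an unweighted box average would give them the same influence as the much larger equatorial boxes.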
Figure 2a shows time series of 5-year running average near-surface ocean temperatures, which highlight the difference in low-frequency variability between the data sets. The higher trend in the subsurface analysis prior to 1945 is also seen in the SST analysis subsampled to have the same coverage as the subsurface analysis and is therefore likely to be due largely to poor geographical sampling. The subsurface data for the years before ca. 1920 have a strong geographical bias, with the majority of the data coming from the (North) Atlantic Ocean (see Figure 1a). Both the near-surface and the subsampled surface time series indicate a warming of about 1.3–1.4°C since 1900, which is characteristic of the Atlantic Ocean. This is larger than the warming of the whole global ocean, which amounts to around 0.7–0.8°C according to the more spatially complete SST time series. Levitus et al. also reported a higher warming rate for the Atlantic Ocean since the 1950s.
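The effect of subsampling a spatially complete field to the coverage of a sparser one can be sketched as below. This is only an illustration under stated assumptions: the helper names `subsample_to_coverage` and `weighted_mean`, the cosine-of-latitude weights, and the synthetic warming field (a uniform background with one more strongly warming, well-sampled sector) are hypothetical and are not taken from the data sets analyzed here.

```python
import numpy as np

def subsample_to_coverage(dense, sparse):
    """Return `dense` with NaN wherever `sparse` has no data."""
    return np.where(np.isfinite(sparse), dense, np.nan)

def weighted_mean(field, weights):
    """Weighted mean over boxes with data (NaN boxes are skipped)."""
    valid = np.isfinite(field)
    return float(np.sum(field[valid] * weights[valid]) / np.sum(weights[valid]))

# Synthetic 5 x 5-degree grid: 36 latitude bands x 72 longitude columns.
lat_centres = np.arange(-87.5, 90.0, 5.0)
weights = np.cos(np.deg2rad(lat_centres))[:, None] * np.ones((36, 72))

# Hypothetical warming field: 0.7 everywhere, 1.3 in one "well-sampled" sector.
sst = np.full((36, 72), 0.7)
sst[:, :10] = 1.3

# Hypothetical subsurface coverage: data exist only in that sector.
subsurface = np.full((36, 72), np.nan)
subsurface[:, :10] = 1.3

full_mean = weighted_mean(sst, weights)
masked_mean = weighted_mean(subsample_to_coverage(sst, subsurface), weights)
print(f"full coverage: {full_mean:.2f}, subsampled: {masked_mean:.2f}")
```

With these synthetic numbers the subsampled mean reproduces the regional value rather than the global one, which is the kind of coverage effect the comparison of the full and subsampled SST series is designed to expose.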
Figure 2. (a) Five-year running mean temperature anomalies estimated using the sea surface temperature data and the near-surface (0–20 m) temperature profiles. Red: anomaly for the full sea surface temperature dataset; magenta: anomaly for the sea surface temperature dataset sub-sampled to the coverage of the less abundant subsurface data; dark blue: temperature anomaly for the 0–20 m layer based on the vertically interpolated (1 meter) temperature profiles; cyan: the same time series based on the observed level data. Shading shows uncertainties due to the incomplete sampling for the 0–20 m time series (light blue) and to the sum of the uncertainties due to bias adjustment, measurement, sampling and coverage for the sea surface time series (light orange). All solid-line curves correspond to the datasets adjusted for biases, whereas the dotted lines are based on the original unadjusted datasets. (b) Temperature anomaly time series for the 0–400 m layer estimated in this study (green) and similar to Levitus et al. (orange). Shading shows uncertainties due to the incomplete sampling for the 0–400 m layer (light green) and the error bars show the formal standard error of the 5-year mean value for the 0–400 m layer by Levitus et al. (yellow). All solid-line curves correspond to the datasets adjusted for biases, whereas the dotted lines are based on the original unadjusted datasets. (c) The same as in Figure 1d but for the layer 0–400 m.
The fact that the estimated uncertainties of the SST and near-surface analyses do not overlap implies that the coverage uncertainties estimated by subsampling globally complete GECCO reanalyses do not capture the full uncertainty due to poor coverage in the early record. Figure 2a also shows the effect of using observed depths or interpolated depths for estimating the temperature in the upper 20 meters. The divergence between the two curves prior to 1945 arises from an abrupt change in the composition of the subsurface data set. Before 1940, all the data are from bottle cast temperature profiles, which have a relatively coarse vertical resolution. MBTs, which have a greater vertical resolution, were introduced during the Second World War, greatly improving the accuracy with which the temperature structure of the upper layers could be determined. Consequently, the uncertainty in estimating the average temperature in the upper 20 meters, as represented by the difference between the two curves, is larger before 1940.
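The difference between the two ways of estimating a 0–20 m mean can be illustrated with a short sketch. It assumes simple linear interpolation onto a 1 m grid and a plain average of the observed levels; the function names and the coarse synthetic "bottle cast" are hypothetical and only meant to show why sparse vertical sampling makes the two estimates diverge.

```python
import numpy as np

def layer_mean_interpolated(depths, temps, bottom=20.0, step=1.0):
    """Mean temperature over 0..bottom from a profile interpolated to `step` m."""
    grid = np.arange(0.0, bottom + step, step)
    # np.interp holds the end values constant outside the observed depth range.
    t = np.interp(grid, depths, temps)
    # Trapezoidal integration over the regular grid, divided by layer thickness.
    return float(np.sum(0.5 * (t[1:] + t[:-1]) * np.diff(grid)) / bottom)

def layer_mean_observed(depths, temps, bottom=20.0):
    """Plain average of the observed levels that fall within 0..bottom."""
    depths = np.asarray(depths, dtype=float)
    temps = np.asarray(temps, dtype=float)
    return float(temps[depths <= bottom].mean())

# Synthetic coarse profile (hypothetical values): levels at 0, 10, 25 and 50 m.
depths = [0.0, 10.0, 25.0, 50.0]
temps = [20.0, 19.8, 17.0, 15.0]

interp_mean = layer_mean_interpolated(depths, temps)
observed_mean = layer_mean_observed(depths, temps)
print(f"interpolated: {interp_mean:.2f}, observed levels: {observed_mean:.2f}")
```

In this synthetic profile the observed-level average ignores the cooling between 10 m and 20 m because no sample falls there, so it is warmer than the interpolated estimate. With densely sampled profiles the two estimates converge.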
Also shown in Figure 2a are the series from the unadjusted data. The adjustments do not improve the agreement between the two data sets at all times, nor does the act of sub-sampling the SST data to have the same coverage as the near-surface analysis. There are two possible reasons for this, which might be acting together. The first possibility is that the 0–20 m layer and the sea surface warmed and cooled at different rates, though always in tandem. Grodsky et al. showed relative trends between SST and mixed-layer temperature for certain regions, but their analysis did not consider the possible effects of biases in the measurements. The second possibility is that the bias adjustments applied to one or the other data set are incorrect. If the differences are due to unresolved biases, then they suggest an uncertainty of around 0.1°C at decadal time scales. It is interesting to consider on which side such a bias is most likely to lie. The bias adjustments applied to the SST data are generally larger than the adjustments applied to the hydrographic profiles. Furthermore, the adjustments applied to the bathythermographic profiles are calculated relative to reference datasets (based on CTD and bottle measurements), whereas the SST adjustments, based as they are on estimates of biases from the literature, are not. Independent subsets of the SST series based on adjusted bucket measurements and adjusted engine room measurements agree well on a global and hemispheric scale, but not perfectly. Collocated differences between the two suggest an uncertainty of around 0.1–0.2°C in the 1940s and 1950s, but a much smaller uncertainty from the 1960s onwards [Kennedy et al., 2011b]. Although the subsurface measurements are adjusted relative to a reference data set, there is a possibility that existing adjustment schemes suffer from over-fitting, with spuriously good agreement occurring where simultaneous XBT and CTD measurements exist, but poorer performance where they do not. There are also systematic drifts in the average absolute temperature of the near-surface measurements (shown in Figure 1b), suggesting systematic changes in the water masses and geographical regions sampled. The issue of residual biases continues to be the focus of research for both the surface and subsurface data, and at present there are no strong reasons to presume that one anomaly estimate is less biased than the other prior to World War II.
To further compare the two data collections, we produced two sets of decadal temperature anomaly maps based on the surface and near-surface data respectively (Figure 3). These maps demonstrate that the first decade of the 21st century (2001–2010) was not uniformly warmer than previous decades. Before about 1920, the global ocean was almost everywhere colder than the reference decade of 2001–2010. After 1920, several regions of the global ocean were warmer than in the reference decade. The tropical regions of the East Pacific Ocean were dominated by positive temperature anomalies during the 1980s and 1990s due to several strong El Niño events (1982–83, 1986–87, 1991–92 and 1997–98). In contrast, the period 2001–2010 was marked by relatively modest El Niño events and strong La Niña events (2000–01, 2007–08 and late 2010). Evidence of a similar anomaly pattern, albeit of smaller magnitude, can be identified in the same region during the 1930s and 1940s, possibly due to the protracted El Niño of the early 1940s combined with the positive phase of the PDO at the time. However, the data coverage is much poorer than in the later decades.
Figure 3. Decadal temperature anomaly (relative to 2001–2010) maps based on the subsurface (0–20 m) data (first and third columns from left) and on the sea surface temperature data (second and fourth columns). Maps are produced by averaging all temperature anomalies in each 5 × 5-degree geographical square for the respective time period. Numbers shown over Asia give the global average of the 5 × 5-degree box anomalies.
Another large-scale pattern of positive anomalies (relative to 2001–2010) occurred in the Southern Ocean from the 1970s to the 1990s. As in the East Pacific, surface and near-surface water temperatures in this region were higher than in the reference decade of 2001–2010. The belt of positive anomalies was most pronounced for the decades between 1970 and 2000 within the Atlantic and the Western Indian sectors of the Southern Ocean. It should be noted that the decadal maps for the layer 0–400 m (not shown) do not reveal any significant positive anomalies within the same regions of the Southern Ocean. The anomalies seem to be confined to the near-surface layer, suggesting a different time evolution below the upper mixed layer. However, Gille reported a warming in the even deeper layers, 700–1100 m, between the 1950s and 1980s. Thus, a rather abrupt cooling since the end of the 1990s both in the East Pacific (connected to the weakening of El Niño and the shift to the negative phase of the Pacific Decadal Oscillation) and in the Southern Ocean [see also Knight et al., 2009] may have contributed to a flattening of the global temperature anomaly series after about 2000. The flattening can be clearly seen in Figure 2a. It should be noted that the changes revealed in the Southern Ocean refer mostly to the austral summer, since there were only sporadic winter observations before the implementation of Argo floats.
4.2. Layer 0–400 m
Sub-surface observations in the upper 400 meters are more limited in number. However, the good agreement between the independent temperature anomaly time series for the ocean surface and for the near-surface layer points to the possibility of monitoring secular changes in the deeper ocean. Extension of the analysis back to the beginning of the 20th century reveals an evolution of the average temperature over the upper 400 meters that is similar to the near-surface time series, with two periods of warming separated by a period between about 1945 and 1970 when the oceans cooled slightly (Figure 2b).
The uncertainty bounds due to imperfect sampling, estimated using the GECCO reanalysis, are very wide at the beginning of the analyzed time period and between 1930 and the mid-1950s. A reduced number of observations (especially during World War I and to a lesser degree during World War II) and a narrower geographical scope of observations during the wars make the global temperature anomaly estimates much less reliable. The observational gaps are most clearly seen in the point anomaly histogram (Figure 2c). For this reason, we omit the time periods 1913–1920 and 1939–1945 from the analysis. The introduction of profiling floats has led to a significant reduction in the sampling uncertainty after about 2003. Overall, our anomaly estimates based only on the geographical squares with observations suggest a warming of about 0.5°C for the upper 400 meter layer since the beginning of the 20th century. If these are used to estimate the temperature change of the 0–400 m layer across all the global oceans, the warming is between 0.1 and 0.9°C since 1900, or between 0.3 and 0.7°C since 1910. Our estimates of the temperature rise for the 0–400 m layer agree, within the rather broad uncertainty ranges, with the results of Roemmich et al., who found a spatial mean warming of about 0.5°C for the upper 366 meters since the Challenger expedition (1872–76). It is worth noting that, while the increase of the mean temperature anomaly for the 0–400 m layer is smaller than that of the 0–20 m layer, it represents a volume of water that is twenty times larger.
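The point about layer volume can be made concrete with a back-of-the-envelope calculation of the heat gained per unit area, which scales as density × specific heat × layer thickness × temperature change. The sketch below is illustrative only: the nominal seawater density and specific heat, the function name `heat_gain_per_area`, and the choice of 0.75°C for the 0–20 m layer (the midpoint of the 0.7–0.8°C SST-based range quoted earlier) are assumptions made here, not results of the analysis.

```python
# Nominal seawater properties (assumed round values, not taken from this study).
RHO = 1025.0   # density, kg m^-3
CP = 3990.0    # specific heat capacity, J kg^-1 K^-1

def heat_gain_per_area(delta_t, thickness):
    """Heat gained per unit area (J m^-2) by a layer warming uniformly by delta_t."""
    return RHO * CP * thickness * delta_t

# 0-400 m layer warming of about 0.5 degC (value quoted in the text).
deep = heat_gain_per_area(0.5, 400.0)
# 0-20 m layer warming of 0.75 degC (assumed midpoint of the 0.7-0.8 degC range).
shallow = heat_gain_per_area(0.75, 20.0)

ratio = deep / shallow
print(f"0-400 m: {deep:.2e} J/m^2, 0-20 m: {shallow:.2e} J/m^2, ratio: {ratio:.1f}")
```

Under these assumptions the thicker layer stores roughly an order of magnitude more heat per unit area despite its smaller temperature change, because the twenty-fold difference in thickness outweighs the difference in warming.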
Global analyses of MBT and XBT data [Levitus et al., 2009; Gouretski and Reseghetti, 2010] demonstrate that the MBT and XBT bias correction schemes do bring them into better agreement with the reference CTD and bottle data. The accuracy of the thermometers used on historical Nansen casts (the only subsurface data type before 1940) was about 0.05°C or better [Wüst et al., 1932]. However, the sample depth was often calculated from the length of the wire paid out and the angle of the wire to the vertical measured at the ship deck. This method could lead to large errors in sample depth, systematically underestimating the depth actually reached and usually causing a corresponding warm bias in the data. The introduction of the thermometric method of depth estimation, based on the simultaneous use of protected and unprotected reversing thermometers, has essentially solved the depth bias problem [Wüst et al., 1932]. Following the above considerations, and taking into account the significant differences between the full-coverage and masked SST time series, we believe that the remaining uncertainties in our upper-ocean time series are largely due to imperfect sampling. However, the impact of residual or unknown biases cannot be excluded.
Levitus et al. estimate global temperature anomalies using extensive interpolation of the gridded anomaly fields. A modified version of the time series from Levitus et al., of mean temperature anomaly 0–400 m, averaged only over gridboxes with data, is shown in Figure 2b. Their time series suggests a more gradual warming since the 1950s than in our analysis. Differences between the two 0–400 m time series may be due to the XBT corrections applied, mapping techniques employed, and/or differences in data and quality control decisions for the data. For instance, modifying the mapping method described here to make it more similar to the method used by Levitus et al. by not applying the depth extension to shallow profiles results in a change in calculated mean temperature anomaly 0–400 m of <0.05°C for years prior to 1955, <0.02°C for 1955–1970, and <0.01°C after 1970. Regarding bias corrections, Lyman et al. found that differences due to XBT bias corrections were the major factor in differences between ocean heat content estimates. A more detailed discussion of the differences between the two curves is beyond the scope of the present work. However, the differences between the time series are smaller than their respective uncertainty bounds and both time series show the same increase of about 0.2°C since the mid-1950s.
As the global temperature time series available in the literature all start in the 1950s, no independent comparison could be made for this study before 1950. The comparison of the near-surface time series with the surface time series indicates that the GECCO-derived uncertainties might underestimate the sampling error on the global scale for years with extremely poor data coverage. However, our calculations do reveal a significant warming at least since the mid-1920s, when the German Atlantic Expedition (1925–27) [Wüst et al., 1932] provided a good-quality full-depth data set for the whole Atlantic Ocean between 20°N and 60°S. Consequently, the GECCO-derived uncertainties for the mean 0–400 m temperature suggest a much smaller sampling uncertainty during 1925–29.