Uncertainty in annual rankings from NOAA's global temperature time series



[1] Annual rankings of global temperature are an important component of climate monitoring. However, there is some degree of uncertainty for every yearly value in the global temperature time series, which leads to uncertainty in annual rankings as well. This study applies a Monte Carlo uncertainty analysis to the National Oceanic and Atmospheric Administration (NOAA) National Climatic Data Center's global land-ocean surface temperature (NOAATMP) time series. Accounting for persistence between years does not materially affect the results versus presuming statistical independence. The highest probabilities for the warmest year analysis (1880–2012) are associated with the years 2010 (~36%), 2005 (~28%), and 1998 (~11%). The current separation among the warmest observed years is relatively small compared to the standard errors of the NOAATMP time series. However, each year between 1997 and 2012 was warmer than the vast majority of all other years since 1880 at the 95% confidence level.

1 Introduction

[2] The National Oceanic and Atmospheric Administration (NOAA) National Climatic Data Center (NCDC) is one of a handful of groups that produce a global temperature product (land and ocean). Other groups include the Met Office Hadley Centre and the Climatic Research Unit, which jointly developed the HadCRUT4 data set [Morice et al., 2012], and the Goddard Institute for Space Studies (GISS), which developed the GISS Surface Temperature (GISTEMP) analysis [Hansen et al., 2010]. As part of its continuous monitoring activities, NCDC routinely assesses the rankings of individual months and years at numerous spatial scales (e.g., national, regional, and state), based on the period of record for the time series in question. A higher-profile example is the annual ranking of global temperature, derived from the NOAA NCDC global land-ocean surface temperature (NOAATMP) time series [Vose et al., 2012]. This leads to statements like the following: the 15 warmest years on record have occurred in the past 16 years (1997–2012; see Table 1).

Table 1. The 15 Warmest Years on Record (1880–2012) in the NOAATMP Time Series
Ranking | Year | Anomaly (°C)

[3] Such statements of observational “fact” do not take into account the nontrivial degree of uncertainty for each annual time series value. A common characterization of this uncertainty is the standard error time series. For annual time series of U.S. and global temperature, the standard errors can be quite large compared to the typical year-to-year fluctuation in the time series values. This can lead to substantial overlap of uncertainty ranges when years are ranked from coldest to warmest. This concept was explored by Guttorp and Kim [2013, hereinafter GK13], who used a Monte Carlo approach to characterize the uncertainty of annual rankings for the U.S. temperature time series. In this note, we apply a variant of the GK13 uncertainty approach to the NOAATMP time series. The data and methodology are presented in section 2, followed by the results in section 3. The discussion and conclusions are offered in section 4.

2 Data and Methodology

[4] In this study, we utilize the global time series and associated total standard errors produced by the NOAATMP analysis. This data set is referred to as Merged Land-Ocean Surface Temperature (MLOST) version 3.5 analysis in Vose et al. [2012]. We have chosen not to use the MLOST naming convention, as NOAA NCDC is in the process of revising its naming conventions. NOAATMP consists of monthly gridded temperature anomalies over the globe for the period 1880 to present. Data over land areas originate from weather stations measuring near-surface air temperature, whereas data over ocean areas originate mainly from ships and buoys measuring sea surface temperature (SST). Gridded anomalies are produced using a statistical reconstruction method that sequentially extracts low- and high-frequency components from the historical temperature record [Smith and Reynolds, 2005; Smith et al., 2008]. An outcome of the reconstruction process is a standard error estimate that explicitly quantifies the random, sampling, and bias components of uncertainty [Smith and Reynolds, 2005]. Random errors mainly originate in the input data, but these errors are largely filtered out by the reconstruction process. Sampling error reflects the density and distribution of the original observations. Bias error mainly results from systematic changes in observing practice, such as from historical changes in instrumentation, particularly over the ocean [Smith and Reynolds, 2002].

[5] Table 1 shows the 15 warmest years on record from 1880 to 2012. The NOAATMP time series and its standard errors are plotted in Figure 1. The observed warmest year on record was 2010, followed by 2005. The difference between the two years is less than one hundredth of a degree Celsius, which is substantially smaller than the associated standard errors. Note that the standard errors are significantly higher before the mid-1940s; this is primarily due to errors associated with SST bias adjustments [Vose et al., 2012]. The subtle rise in standard errors from 1880 to the mid-1930s is due exclusively to bias uncertainty over the oceans, coinciding with the general switchover from wooden buckets to canvas buckets [Kennedy et al., 2011].

Figure 1.

(a) NOAATMP anomaly time series (1880–2012). The smooth curve indicates the residue of the empirical mode decomposition (EMD) analysis described in section 2. (b) The standard errors of the NOAATMP time series (1880–2012) are presented in bold. The standard errors of the associated HadCRUT4 (solid line) and GISTEMP (dashed line) analyses are also displayed.

[6] A Monte Carlo analysis can be used to assess the uncertainty of annual rankings. The simplest approach is to assume that all years in the annual time series, X(t), are independent. A large number of simulated time series can be created by shifting each estimated annual temperature up or down slightly by a random z score (i.e., Gaussian white noise), zi(t), scaled by the standard error (Figure 1b), s(t), i.e.,

Yi(t) = X(t) + zi(t) s(t)    (1)

[7] Here Yi(t) represents the ith simulation time series. GK13 called this the “independent model.” We simulate 100,000 realizations of the annual NOAATMP time series in this manner. From this pool of synthetic time series, we then calculate the probability that a given year was the warmest on record, as well as the probability that a given year was among the top 10 warmest years on record. The synthetic time series are also used to construct two-tailed 95% confidence intervals for each annual ranking.
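The independent-model procedure above can be sketched in a few lines. The paper's implementation is in R; the following Python sketch, using invented toy anomalies and standard errors rather than the actual NOAATMP data, illustrates the counting step behind the warmest-year probabilities:

```python
import random
from collections import Counter

def warmest_year_probs(years, anomalies, stderrs, n_sims=10_000, seed=42):
    """Monte Carlo estimate of P(a given year was the warmest on record):
    perturb each annual anomaly with Gaussian noise scaled by its standard
    error (the 'independent model' of equation (1)), then count how often
    each year tops the simulated series."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_sims):
        sim = [x + rng.gauss(0.0, 1.0) * s for x, s in zip(anomalies, stderrs)]
        counts[years[sim.index(max(sim))]] += 1
    return {yr: counts[yr] / n_sims for yr in years}

# Toy illustration -- these anomalies and standard errors are invented,
# not the actual NOAATMP values:
years = [1998, 2005, 2010]
probs = warmest_year_probs(years, [0.60, 0.62, 0.63], [0.05, 0.05, 0.05])
```

Because the standard errors overlap heavily relative to the separations, no single year dominates the simulated pool, mirroring the behavior reported below for the real series.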

[8] A more complicated approach for assessing the uncertainty in ranks involves modeling the dependence structure of the annual time series using a Box-Jenkins type of analysis. The NOAATMP global temperature time series (Figure 1a) is clearly nonstationary. Removal of an ordinary least squares regression line does not yield a stationary time series, as considerable spectral energy resides in the first two harmonics (not shown). The same is true if a generalized least squares regression line is modeled, the approach applied by GK13 to an annual U.S. time series. One way to achieve a stationary time series is to estimate a nonlinear trend using empirical mode decomposition (EMD), a nonparametric scheme for extracting intrinsic mode functions (IMFs) from a time series, leaving behind a “residue” that can be considered a time-dependent mean function [Huang and Shen, 2005]. EMD is increasingly being utilized in climate science applications [Wang et al., 2013]. We utilize the emd function in the R programming language [R Core Team, 2013] to extract the four leading IMFs, resulting in the smooth residue shown in Figure 1a. The residuals of this nonlinear trend fit (not shown) are stationary as determined by the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test [Kwiatkowski et al., 1992] and the Phillips-Perron test [Phillips and Perron, 1988], using the kpss.test and pp.test functions in R, respectively. Following GK13, we estimate the dependence structure of these residuals using the R auto.arima function, which determines the optimal autoregressive integrated moving average (ARIMA) parameters of the residual time series. Results are reported using both the Akaike information criterion and the Bayesian information criterion (AIC and BIC, respectively).
We calculate 100,000 synthetic time series as in (1), except that instead of using a time series of purely randomized normal deviates, we utilize the ARIMA coefficients along with an innovation series [see Wilks, 2006] to create a time series of z scores, zi(t), that reflects the dependence structure of the residual time series. This “dependent model” takes the following form:

Yi(t) = X(t) + zi(t) s(t)    (2)

[9] The dependent simulation time series, Yi(t), are used to calculate the uncertainty of the warmest and top 10 warmest years as before. In essence, the dependent approach accounts for the fewer effective degrees of freedom due to the redness of the NOAATMP time series.
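The paper builds the correlated zi(t) from the ARIMA coefficients fitted by R's auto.arima together with an innovation series. As a minimal stand-in for that fitted model, the Python sketch below uses a simple AR(1) process for the z scores; the coefficient phi and all inputs are illustrative assumptions, not fitted values:

```python
import random

def ar1_zscores(n, phi, rng):
    """Serially correlated standard scores: z(t) = phi*z(t-1) + e(t),
    with the innovation variance chosen so each z(t) has unit variance."""
    sigma_e = (1.0 - phi * phi) ** 0.5  # innovation sd for unit marginal variance
    z = [rng.gauss(0.0, 1.0)]
    for _ in range(1, n):
        z.append(phi * z[-1] + rng.gauss(0.0, sigma_e))
    return z

def simulate_dependent(anomalies, stderrs, phi=0.3, n_sims=5_000, seed=1):
    """Dependent-model simulations Yi(t) = X(t) + zi(t)*s(t), where zi(t)
    now carries serial dependence (AR(1) stand-in for the fitted ARIMA)."""
    rng = random.Random(seed)
    n = len(anomalies)
    return [
        [x + z * s for x, z, s in zip(anomalies, ar1_zscores(n, phi, rng), stderrs)]
        for _ in range(n_sims)
    ]
```

Because each zi(t) still has unit variance, the marginal spread of the simulated values matches the independent model; only the joint behavior across years changes.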

3 Results

[10] As summarized in Table 2, no single year exceeds a 50% probability of having been the warmest year on record globally in the NOAATMP data set. The year 2010 has the highest probability for all three cases, with values ranging from 36.5% to 38.3%. The year 2005 comes in second (28.1–29.2%), followed by the year 1998 in third (~11%). Five additional years have probabilities between 3% and 7%: 2003, 2002, 2006, 2009, and 2007. Ranks 6–8 across the three cases differ slightly (although they are different orderings of the same 3 years), but the rankings for the top 5 years are the same and are relatively well separated.

Table 2. The Probability (%) That a Year was the Warmest on Record (1880–2012) in the NOAATMP Time Series, Calculated Using the Independent Years Assumption and the Dependent Years Approach With Both the AIC and the BICa
Rank | Independent NOAATMP | Dependent NOAATMP (AIC) | Dependent NOAATMP (BIC) | Independent HadCRUT4 | Independent GISTEMP
     | Year / Probability (%) | Year / Probability (%) | Year / Probability (%) | Year / Probability (%) | Year / Probability (%)

a The corresponding independent approach results for HadCRUT4 and GISTEMP are also shown. Only years with a probability above 1% across all five cases are shown.


[11] With respect to the probability that a given year was among the 10 warmest years on record, the results (Table 3) are remarkably consistent and well separated across all three cases for the top 14 years. Note that the top 14 years coincide with the observed values (Table 1). All 14 years presented have occurred in the last 16 years. All other years had a probability of less than 1% of having been among the warmest 10 years on record.

Table 3. The Probability (%) That a Year was Among the 10 Warmest on Record (1880–2012) in the NOAATMP Time Series, Calculated Using the Independent Years Assumption and the Dependent Years Approach With Both the AIC and the BICa
Rank | Independent NOAATMP | Dependent NOAATMP (AIC) | Dependent NOAATMP (BIC) | Independent HadCRUT4 | Independent GISTEMP
     | Year / Probability (%) | Year / Probability (%) | Year / Probability (%) | Year / Probability (%) | Year / Probability (%)

a The corresponding independent approach results for HadCRUT4 and GISTEMP are also shown. Only years with a probability above 1% are shown.


[12] Figure 2 shows the 95% confidence intervals associated with each annual ranking (using the independent assumption simulations). Not surprisingly, the range of rankings is larger during the earlier half of the record when the NOAATMP standard errors were larger. The ranges can vary widely. For example, the confidence interval for the year 1934 ranges from 33rd to 116th warmest, whereas for 1995, the range is from thirteenth to nineteenth warmest. The confidence intervals for the 3 years with the warmest anomalies in the NOAATMP global time series are first to ninth for both 2010 and 2005 and first to tenth for 1998, although they are all clearly skewed toward warmer rankings.
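The ranking confidence intervals come directly from the same simulation pool: compute the target year's rank in every synthetic series and take the empirical percentiles. A minimal Python sketch of that step, with invented values rather than the NOAATMP data:

```python
import random

def rank_ci(anomalies, stderrs, index, n_sims=20_000, seed=7, level=0.95):
    """Two-tailed confidence interval for the warmth rank of the year at
    position `index` (rank 1 = warmest), under the independent model."""
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_sims):
        sim = [x + rng.gauss(0.0, s) for x, s in zip(anomalies, stderrs)]
        # Rank = 1 + number of simulated years warmer than the target year.
        ranks.append(1 + sum(v > sim[index] for v in sim))
    ranks.sort()
    tail = (1.0 - level) / 2.0
    return ranks[int(tail * n_sims)], ranks[int((1.0 - tail) * n_sims) - 1]

# Toy example (invented values): the last year is far warmer than the rest
# relative to the tiny standard errors, so its rank interval collapses to (1, 1).
lo, hi = rank_ci([0.0, 0.1, 0.2, 0.3, 1.0], [0.01] * 5, index=4)
```

With realistic standard errors that overlap across years, the interval widens, which is exactly the behavior seen in the early part of the record in Figure 2.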

Figure 2.

The 95% confidence intervals for each annual ranking of the NOAATMP anomaly time series (1880–2012) using the independent approach. The horizontal black line indicates a ranking of seventeenth warmest, the lowest confidence interval value between 2001 and 2012.

4 Discussion and Conclusions

[13] All of the associated probabilities presented herein for NOAATMP differ by ~2% or less across the three cases considered, so we contend that the independent and dependent methods yield comparable results for annual rankings, the same conclusion drawn by GK13 for the U.S. time series. Our results suggest that accounting for the dependence of annual values may be a prudent academic exercise, but the practical effect is muted for the global temperature time series. Any effect on the rankings due to the dependence of years is dwarfed by the effect associated with the large ratio of the standard errors to the year-to-year variability and the actual separation of annual values. In terms of practicality, the independent approach has the advantage of being automatable, whereas the dependent approach requires some degree of manual inspection to ensure the appropriateness of the EMD and ARIMA time series modeling involved. Therefore, we recommend the use of the Monte Carlo approach using the independent assumption to diagnose the uncertainty of annual rankings in climate monitoring applications involving global temperature time series.

[14] For example, the NCDC's most recent annual state of the climate report (http://www.ncdc.noaa.gov/sotc/global/2012/13) includes the following analysis: “The year 2012 was the 10th warmest year since records began in 1880…. Including 2012, all 12 years to date in the 21st century (2001–2012) rank among the 14 warmest in the 133-year period of record.” Using the independent approach, we calculate a 12% probability that 2012 was the tenth warmest year on record. The two-tailed 95% confidence interval suggests that 2012 is likely ranked anywhere between the second and fourteenth warmest years (Figure 2). We calculate a 68% likelihood that all years between 2001 and 2012 were among the 14 warmest years (there is a 31% chance that only 11 of the years rank among the top 14). Of the 12 confidence intervals over 2001–2012 shown in Figure 2, the rankings all range between the first and seventeenth warmest.
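A joint statement such as "all 12 years rank among the 14 warmest" is likewise a simple count over the simulation pool: the fraction of synthetic series in which every year of interest lands in the top k. A hedged Python sketch with invented inputs:

```python
import random

def prob_all_in_top_k(anomalies, stderrs, indices, k, n_sims=20_000, seed=3):
    """Fraction of independent-model simulations in which every year in
    `indices` falls within the k warmest of the simulated series."""
    rng = random.Random(seed)
    idx = set(indices)
    hits = 0
    for _ in range(n_sims):
        sim = [x + rng.gauss(0.0, s) for x, s in zip(anomalies, stderrs)]
        top_k = sorted(range(len(sim)), key=sim.__getitem__, reverse=True)[:k]
        hits += idx <= set(top_k)  # True counts as 1
    return hits / n_sims

# Toy example: three clearly warm years (invented values) always occupy the top 3.
p = prob_all_in_top_k([0.0] * 5 + [1.0] * 3, [0.01] * 8, indices=[5, 6, 7], k=3)
```

Note that the joint probability is generally lower than the product of intuition might suggest, since a single year slipping out of the top k fails the whole event; this is why the paper's 68% figure is well below the individual top-14 probabilities.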

[15] Although the three global temperature time series are by no means derived from independent data values [Hansen et al., 2010], it is instructive to compare the probabilities generated using the independent approach for HadCRUT4 and GISTEMP as well (as with NOAATMP, using the dependent approach on the HadCRUT4 and GISTEMP global temperature series does not materially alter the probabilities). The GISTEMP standard errors (Figure 1b) consist of three period averages (R. Reto, personal communication, 2013). HadCRUT4 characterizes uncertainty by reporting the 95% confidence intervals from its 100 ensemble members, and we estimate its standard error time series (Figure 1b) by presuming normality. HadCRUT4 and GISTEMP both show 2010 and 2005 as the warmest and second warmest years on record, respectively. As is the case for NOAATMP, the highest probabilities (associated with the year 2010) are in the 30–40% range, shy of a majority of the simulations. While rankings 3–8 consist of the same set of years, their ordering varies across the data sets, most notably for the year 2007, which has the third highest probability in GISTEMP and eighth highest in both NOAATMP and HadCRUT4. For all but one of the top 8 years (2002), the NOAATMP probability lies between the associated HadCRUT4 and GISTEMP probabilities. Similarly, the top 10 analysis (Table 3) shows that the years with probabilities above 1% consist of the same 14 years for all three data sets (all 14 have occurred since 1997), with the ordering of the probabilities being reasonably comparable, except once again for the year 2007. The overall story for all three data sets is the same: while we cannot irrefutably pinpoint which year was the warmest year on record through 2012, there is strong statistical evidence (e.g., Figure 2) that each individual year in the 1997–2012 period has been consistently and significantly warmer than the vast majority of all other years since 1880.

[16] We also repeated the independent approach uncertainty analysis for NOAATMP after multiplying its standard error time series by two. Characterizing uncertainty is an inexact science by nature, and this simple exercise tests the sensitivity of the Monte Carlo analysis with respect to the magnitudes of the standard errors. As expected, hypothetically doubling the standard error time series would reduce the probability that 2010 was the warmest year on record (from 36% to 22%), and the years 2005 (from 28% to 18%) and 1998 (from 11% to 10%) would also show varying degrees of reduced probabilities. However, the vast majority of the hypothetical shift away from the top 3 years would go toward increasing the probabilities of other years in the 1997–2012 period. In fact, even after doubling the standard errors, there is a 97% (down from 100%) probability that the warmest year occurred between 1997 and 2012, and no year prior to 1997 reaches a 10% probability of having been among the 10 warmest years on record.

[17] As a final check on the sensitivity of our methodology, we repeated the independent approach uncertainty analysis for NOAATMP using different end years between 1997 and 2012 (Table 4). Prior to 1997, the warmest year to date was 1995. However, 15 of the 17 years since 1995 exceeded the 1995 global temperature average in NOAATMP (only 1996 and 2000 were less warm), and four of these years went on to become the warmest years on record, at least briefly. After 1997 came to a close, it became the warmest year on record with a ~92% probability and was eclipsed the very next year by 1998 which garnered a probability above 99%. Subsequently, 2005 and 2010 became the warmest years on record, and due to smaller separations between them, the probabilities have come down. This demonstrates that the probabilities can change a great deal with the addition of just a single year. Another dramatic example of this was the U.S. temperature time series after the addition of 2012, which was significantly warmer than any previous year on record according to GK13 and NCDC (http://www.ncdc.noaa.gov/sotc/national/2012/13).

Table 4. The Probability (%) That a Year was the Warmest on Record (Since 1880) in the NOAATMP Time Series, Calculated Using the Independent Years Assumption and Variable End Years Between 1997 and 2012a
End Year | Ranking #1 | Ranking #2 | Ranking #3
         | Year / Probability (%) | Year / Probability (%) | Year / Probability (%)

a Only years with a probability above 1% are shown. Years shown in bold indicate years that at one point in time were the warmest years to date.


[18] Annual ranking uncertainty estimates can vary from year to year, and they are particularly volatile when annual values persist near record levels, as we have seen over the last two decades. Articulating an uncertainty range alongside an annual ranking therefore makes the climate scientist's already formidable communications challenge even more difficult. However, stakeholders are better served by, and often clamor for, a more thorough accounting of climatic conditions. This requires climate monitoring centers to provide not only a historical perspective on the most recent annual or monthly observation but also the context of the uncertainty inherent in that historical perspective.


[19] We thank P. Guttorp for providing his R code and engaging us on this topic. We also acknowledge R. Reto and P. Jones for their fruitful discussions. Lastly, we wish to thank P. Thorne and B. Huang for their valuable insights while reviewing earlier versions of this manuscript.

[20] The Editor thanks two anonymous reviewers for their assistance in evaluating this paper.