Independent confirmation of global land warming without the use of station temperatures
Gilbert P. Compo
Cooperative Institute for Research in Environmental Sciences, University of Colorado, Boulder, Colorado, USA
Physical Sciences Division, Earth System Research Laboratory, National Oceanic and Atmospheric Administration, Boulder, Colorado, USA
Corresponding author: G. P. Compo, CIRES University of Colorado, NOAA Physical Sciences Division, Earth System Research Laboratory, 325 Broadway R/PSD1, Boulder, CO 80305, USA. E-mail: (email@example.com)
 Confidence in estimates of anthropogenic climate change is limited by known issues with air temperature observations from land stations. Station siting, instrument changes, changing observing practices, urban effects, land cover, land use variations, and statistical processing have all been hypothesized as affecting the trends presented by the Intergovernmental Panel on Climate Change and others. Any artifacts in the observed decadal and centennial variations associated with these issues could have important consequences for scientific understanding and climate policy. We use a completely different approach to investigate global land warming over the 20th century. We have ignored all air temperature observations and instead inferred them from observations of barometric pressure, sea surface temperature, and sea-ice concentration using a physically based data assimilation system called the 20th Century Reanalysis. This independent data set reproduces both annual variations and centennial trends in the temperature data sets, demonstrating the robustness of previous conclusions regarding global warming.
 The observed increase in near-surface air temperature over land (2 m air temperature, hereafter TL2m) is a core indicator of global warming [Trenberth et al., 2007]. However, the accuracy of data sets documenting the increase continues to be debated [Pielke et al., 2007; Fall et al., 2011; Montandon et al., 2011; Christy, 2012], mainly because the TL2m record consists of observations taken irregularly in space and time using a variety of instruments and measurement techniques [Peterson et al., 1998; Hansen et al., 2010; Vose et al., 2012a; Brohan et al., 2006; Parker, 2011; Karl et al., 1986; Jones and Wigley, 2010]. The observation sites are not fully representative of the global land cover, tending to be weighted more toward urban and crop settings [Montandon et al., 2011], and represent their larger scale environment to varying degrees [Pielke et al., 2007; Christy, 2012; Fall et al., 2011]. Instrument relocations introduce further inhomogeneities in the record [Peterson et al., 1998; Brohan et al., 2006]. Changes in some sites resulting, for example, from conversion to cropland [Montandon et al., 2011], from the construction of buildings [Christy, 2012], or other urbanization effects [Hausfather et al., 2013; Christy, 2012; Parker, 2011; Jones and Wigley, 2010; Brohan et al., 2006; Peterson et al., 1998] are also a concern [Pielke et al., 2007; Christy, 2012; Fall et al., 2011; Montandon et al., 2011]. These and other heterogeneities affect the accuracy of climate variability deduced from such observations [Pielke et al., 2007; Christy, 2012; Peterson et al., 1998; Brohan et al., 2006; Karl and Williams, 1987; Karl et al., 1986; Ellis, 1890; Jones and Wigley, 2010; Hausfather et al., 2013]. Any artifacts in observed climate variations associated with these issues could have important consequences for scientific understanding [Trenberth et al., 2007] and climate policy [IPCC, 2007].
 Previous global analyses of TL2m have addressed many of these issues [Jones and Wigley, 2010; Trewin, 2010]. Adjustments to correct for potential biases have been developed [Peterson et al., 1998; Hansen et al., 2010; Vose et al., 2012a; Brohan et al., 2006; Karl and Williams, 1987; Karl et al., 1986]. Uncertainties in these adjustments, as well as in the observations and their aggregation into gridded analyses, have been estimated and included in the Climatic Research Unit gridded Temperature version 3 (CRUTEM3) [Brohan et al., 2006] and version 4 (CRUTEM4) [Jones et al., 2012] data sets developed by the University of East Anglia Climatic Research Unit and Met Office Hadley Centre and the Merged Land Ocean Surface Temperature version 3.5.2 (MLOST) data set [Vose et al., 2012a] developed by the National Oceanic and Atmospheric Administration (NOAA). Despite these efforts, debate continues as to the reliability of the global temperature record, particularly for assessing trends and decadal variability [Pielke et al., 2007; Fall et al., 2011; Montandon et al., 2011; Jones and Wigley, 2010], as highlighted recently before the United States Congress [Christy, 2012].
 In light of these issues, we have taken a completely different approach to deduce global 20th century TL2m warming without using any TL2m observations. We use the 20th Century Reanalysis (20CR [Compo et al., 2011], see supporting information), a physically based state-of-the-art data assimilation system, to infer TL2m given only CO2, solar, and volcanic radiative forcing agents; monthly averaged sea surface temperature (SST) and sea-ice concentration fields (both from the HadISST1.1 data set [Rayner et al., 2003]); and hourly and synoptic barometric pressure observations (from the International Surface Pressure Databank [Compo et al., 2011]). In our analysis, we find that TL2m warming is reproduced using the independent 20CR data, confirming that the observed warming is not an artifact of deficiencies in station-temperature measurements.
 Details of data processing are given in the supporting information. Data access is found in Table S1.
2.1 The 20th Century Reanalysis and Simulation
 The 20CR is based on an Ensemble Kalman Filter technique [Whitaker and Hamill, 2002] and uses a time-varying weighting between pressure observations and an ensemble of 56 nine-hour forecasts made with a NOAA atmosphere/land general circulation model (AGCM) to estimate the three-dimensional state of the atmosphere every 6 h (see supporting information). A detailed assessment of the overall quality has been reported [Compo et al., 2011], and the data set has been used in a wide range of climate and weather applications [reanalyses.org, 2013], including examining the United States [Vose et al., 2012b] and global [Parker, 2011] TL2m trend since 1979.
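 To make the idea of weighting observations against an ensemble of forecasts concrete, the following is a minimal sketch of a serial ensemble square-root filter update for a single observation, in the spirit of Whitaker and Hamill [2002]. It is not the 20CR implementation: the function name, the toy two-variable state, and the assumption that the observation measures one state element directly are all illustrative, and operational details such as covariance localization and inflation are omitted.

```python
import math

def ensrf_update(ensemble, obs_index, obs_value, obs_var):
    """Update an ensemble with one observation (hypothetical helper).

    ensemble  : list of members, each a list of floats (the state vector)
    obs_index : index of the state element that the observation measures
                (assumes a trivial observation operator)
    obs_value : observed value
    obs_var   : observation error variance R
    """
    n = len(ensemble)
    dim = len(ensemble[0])
    mean = [sum(m[i] for m in ensemble) / n for i in range(dim)]
    pert = [[m[i] - mean[i] for i in range(dim)] for m in ensemble]

    # Ensemble estimate of H P H^T (variance of the observed element).
    hph = sum(p[obs_index] ** 2 for p in pert) / (n - 1)
    # Ensemble estimate of P H^T (covariance of every element with it).
    pht = [sum(p[i] * p[obs_index] for p in pert) / (n - 1)
           for i in range(dim)]

    # Kalman gain for the mean, and the reduced gain factor that the
    # square-root form applies to the perturbations.
    gain = [c / (hph + obs_var) for c in pht]
    alpha = 1.0 / (1.0 + math.sqrt(obs_var / (hph + obs_var)))

    innovation = obs_value - mean[obs_index]
    new_mean = [mean[i] + gain[i] * innovation for i in range(dim)]
    return [
        [new_mean[i] + p[i] - alpha * gain[i] * p[obs_index]
         for i in range(dim)]
        for p in pert
    ]

# Toy usage: four members, two state variables, one observation of the
# first variable. The weighting depends on the ensemble spread, so it
# changes as the spread changes.
prior = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
posterior = ensrf_update(prior, obs_index=0, obs_value=5.0, obs_var=1.0)
```

 In this sketch the weight given to the observation grows when the forecast ensemble spread is large relative to the observation error, which is one way to read the "time-varying weighting" described above.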
 A 138 year set of 56 simulations using the same AGCM, boundary forcing, and radiative forcing agents as 20CR (AMIP20C hereafter) was also generated as an additional independent estimate of TL2m (see supporting information).
2.2 TL2m Observational Data Sets
 Eight different near-global data sets constructed from observations of TL2m are used here (Table S1). We have used Climatic Research Unit (CRU) and Met Office Hadley Centre grids of CRUTEM3 [Brohan et al., 2006] and CRUTEM4 [Jones et al., 2012]; the CRU in-filled time series grids (CRU_TS3.10) [Mitchell and Jones, 2005; Jones and Harris, 2011; Harris et al., 2013]; the NASA Goddard Institute for Space Studies grids at 250 km and 1200 km smoothing (GISTEMP250 and GISTEMP1200) [Hansen et al., 2010]; the NOAA MLOST grids [Vose et al., 2012a]; the Japan Meteorological Agency temperature grids (JMATEMP, JMA, unpublished data 2012); and the University of Delaware temperature grids (UDELv3.01) [Willmott and Robeson, 1995].
 As the 20CR does not use temperature observations from land stations, it is entirely independent of those observations. Nevertheless, the time variations of TL2m in the 20CR are very similar to those previously reported in the station-based data sets [Brohan et al., 2006; Hansen et al., 2010; Vose et al., 2012a; Jones et al., 2012], both over the 1901 to 2010 period and the more rapidly warming 1952 to 2010 period (Figure 1 and Table S2). We have focused on the TL2m anomalies averaged over 90°N–60°S in these two periods because this coverage is common to all the data sets considered (see supporting information). The temporal (Pearson) correlations between the 20CR and station-based estimates of the 1901–2010 globally averaged annual TL2m range from 0.84 (P = 0.006) with the UDELv3.01 data set to 0.92 (P = 0.001) with the MLOST data set. These and all temporal significance tests take into account the reduction in temporal degrees of freedom (dofs) arising from the auto-correlation in all series [Livezey and Chen, 1983].
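 The effect of auto-correlation on a correlation test can be sketched as follows. This is an illustration only: it uses a common lag-1 approximation for the effective sample size, N_eff = N (1 - r1 r2) / (1 + r1 r2), which is an assumption here and may differ from the procedure actually applied in the paper. All function names are hypothetical, and the input series are synthetic.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lag1_autocorr(x):
    """Lag-1 autocorrelation, estimated by correlating x[t] with x[t+1]."""
    return pearson(x[:-1], x[1:])

def effective_sample_size(x, y):
    """Lag-1 approximation to the number of independent samples (assumed)."""
    n = len(x)
    rr = lag1_autocorr(x) * lag1_autocorr(y)
    n_eff = n * (1.0 - rr) / (1.0 + rr)
    # Keep the estimate between 3 and N so the t statistic stays defined.
    return max(3.0, min(float(n), n_eff))

def correlation_with_reduced_dof(x, y):
    """Return (r, n_eff, t); t has n_eff - 2 degrees of freedom.

    Not defined for perfectly correlated series (|r| == 1).
    """
    r = pearson(x, y)
    n_eff = effective_sample_size(x, y)
    t_stat = r * math.sqrt((n_eff - 2.0) / (1.0 - r * r))
    return r, n_eff, t_stat

# Synthetic, persistent series standing in for two annual anomaly records.
a = [math.sin(0.3 * i) + 0.5 * math.sin(2.1 * i) for i in range(60)]
b = [math.sin(0.3 * i + 0.5) + 0.5 * math.sin(2.1 * i + 0.5) for i in range(60)]
r, n_eff, t_stat = correlation_with_reduced_dof(a, b)
```

 Because persistent series carry fewer independent samples than their length suggests, the same correlation is less significant once N is replaced by N_eff, which is the reason for the adjustment described above.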
 Good agreement is seen even after removing the long-term trend and multidecadal variability (Table S2). The high-pass filtered data, constructed by removing 7 year running means from the monthly anomalies, show correlations over the 1901–2010 period from 0.74 (P < 0.0001) with JMATEMP to 0.81 (P < 0.0001) with four different data sets (Table S2). The correlations are higher for the shorter 1952–2010 period. All P values are 0.0001 or smaller.
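 A high-pass filter of this kind can be sketched as subtracting a centered running mean from each monthly value. The window handling below is an assumption: it uses a half-width of 42 months (about 7 years in total) and simply shortens the window near the ends of the record, which may not match the processing described in the supporting information. The function name is hypothetical.

```python
def high_pass(series, half_width=42):
    """Subtract a centered running mean from a monthly series.

    half_width is in months; the full window spans 2 * half_width + 1
    values, truncated at the start and end of the record (assumed).
    """
    n = len(series)
    filtered = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        window = series[lo:hi]
        filtered.append(series[i] - sum(window) / len(window))
    return filtered

# Synthetic check: a pure linear trend is removed away from the edges,
# because the centered mean of a straight line equals its midpoint.
trend_only = [0.001 * month for month in range(300)]
residual = high_pass(trend_only)
```

 Away from the ends of the record, slow variations such as the long-term trend are removed and only month-to-month and interannual departures remain.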
 This consistency of TL2m from land stations and the 20CR strongly suggests that the determinations of TL2m variations from monthly to centennial scales using the station temperatures are robust and reliable. The comparison is insensitive to the choice of data set (Figure 1 and Table S2) and the precise near-global region over which it is performed (Figure S1 and Table S3).
 Despite being based on an independent set of observations, the uncertainty estimates from 20CR are comparable to those from CRUTEM4 (Figure 1) and also from CRUTEM3 and MLOST (Figure S1 and Table S3). The uncertainties in both the 20CR and station estimates are smaller if the comparison is restricted to the 60°N–60°S region (Figure S1 and Table S3). This avoids the sparsely observed Arctic and regions of low sea-ice concentrations that have a warm bias in 20CR [Compo et al., 2011; Brönnimann et al., 2012]. The data sets agree even better over the 60°N–60°S domain (Table S3). For example, over the 1901–2010 period, four data sets correlate with 20CR at 0.91 (P < 0.0012).
 Although the agreement between 20CR and the station-based data sets is strong, the mean square differences between them are somewhat larger than expected from their respective confidence intervals (Table S3). This suggests that the data sets underestimate their uncertainty, particularly 20CR during the periods of disagreement in 1944–45, and the 1960s and 1970s (Figure S1).
 The same general agreement, with differences in details, is evident in the spatial patterns of the least-squares linear trends (Figures 2, 3, S2, and S3 and Table S2). The pattern correlations [Miyakoda et al., 1972] for trends over 1901–2010 range from 0.67 (P = 0.035) with JMATEMP to 0.78 (P = 0.011) with MLOST (Table S2). All P values are smaller than 0.035 assuming 8 spatial dofs [Jones et al., 1997]. A few regions, however, do deviate from this general agreement, notably the midwestern United States, eastern Brazil, and Argentina (Figures 2, S2, and S3). If the differences come from random effects, local linear trends (Figures 2, S2, and S3) will be evenly divided between being larger or smaller than the 20CR trend, with an expected 50% areal coverage of larger trends and a binomial sampling distribution. If the differences are systematic, indicating biases in either data set, a different distribution is likely. None of the areal percentages (Table S2) for either period are statistically distinguishable from the 50% expected for a binomial distribution if the spatial dofs of the TL2m trend field are less than 13. Estimates of the spatial dofs range from 3 to 8 [Jones et al., 1997]. Assuming 8 degrees of freedom, all P values are greater than 0.144.
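 The areal-percentage test can be illustrated with a normal approximation to the binomial distribution. This is a sketch under stated assumptions, not the paper's calculation: the function name, the use of a normal approximation, and the example fraction of 0.70 are all illustrative, while the value of 8 spatial dofs is taken from the text.

```python
import math

def areal_fraction_p_value(fraction, spatial_dofs):
    """Two-sided P value for an areal fraction against an expected 50%.

    Treats the field as `spatial_dofs` independent regions, each equally
    likely to show a larger or smaller trend, and uses the normal
    approximation to the binomial sampling distribution (assumed).
    """
    std_err = math.sqrt(0.25 / spatial_dofs)
    z = (fraction - 0.5) / std_err
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical example: 70% of the land area has a larger station trend.
p_8 = areal_fraction_p_value(0.70, 8)
# The same fraction becomes harder to attribute to chance as dofs grow.
p_30 = areal_fraction_p_value(0.70, 30)
```

 The sketch shows why the conclusion depends on the assumed number of spatial dofs: with few independent regions, even a noticeably uneven split is compatible with random differences.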
 The global average trend of TL2m, computed as the area-weighted average of the local trends (Figures 2, S2, and S3), also shows quantitative agreement (Table S2). The 20CR trend is 0.45°C/50 years for 1901–2010 and 0.67°C/50 years for 1952–2010. None of the trends in Table S2 is significantly different from the corresponding 20CR trend assuming 8 spatial dofs (P ≥ 0.11).
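 As an illustration of this calculation, the sketch below fits a least-squares linear trend at each grid cell and then averages the local trends with weights proportional to the cosine of latitude. The grid layout, the cosine weighting for a regular latitude-longitude grid, and the function names are assumptions made for the example, and the input values are synthetic.

```python
import math

def linear_trend(values):
    """Least-squares slope per time step for evenly spaced values."""
    n = len(values)
    t_mean = (n - 1) / 2.0
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

def area_weighted_mean(field, lats_deg):
    """Average a field given as one row per latitude.

    field    : list of rows; each row is a list of values, with None
               marking cells without data (for example, ocean cells)
    lats_deg : latitude of each row in degrees
    Cells are weighted by cos(latitude), assuming a regular grid.
    """
    total = 0.0
    weight_sum = 0.0
    for row, lat in zip(field, lats_deg):
        w = math.cos(math.radians(lat))
        for value in row:
            if value is not None:
                total += w * value
                weight_sum += w
    return total / weight_sum

# Synthetic annual series warming by 0.01 degC per year at one cell.
annual = [0.01 * year for year in range(110)]
trend_per_50_years = linear_trend(annual) * 50.0

# Two latitude rows with one land cell each (local trends, synthetic).
global_trend = area_weighted_mean([[1.0, None], [3.0, None]], [0.0, 60.0])
```

 Multiplying the per-year slope by 50 gives the °C/50 years units used for the trends quoted above.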
 We have also inferred TL2m from a parallel ensemble of AGCM runs without using any pressure observations (AMIP20C). Though the SSTs are the dominant contributor to land warming, both regionally [e.g., Compo and Sardeshmukh, 2009; Dommenget, 2009] and globally [Compo and Sardeshmukh, 2009; Dommenget, 2009; Hoerling et al., 2008], the 20CR agrees better with the station data sets than it does with even this AGCM ensemble (Figure 3), particularly on the monthly timescale (Table S2), confirming the important influence of the pressure observations in 20CR.
 Unlike 20CR, the AMIP20C simulation is not intended to and cannot reproduce the observed day-to-day (or month-to-month) variations, i.e., the climate system's particular chaotic “sample path” over the 20th century, and thus cannot be regarded as providing independent observational confirmation of the long-term variations (just as successful simulation of 20th century global warming using climate models with prescribed radiative forcings cannot be viewed as providing observational confirmation of that warming). This is consistent with the much lower correlation (~0.35) in Table S2 of 20CR with the AMIP20C high-pass monthly TL2m time series than with all the thermometer-based series. On the other hand, that 20CR represents not just the low-frequency but even high-frequency aspects of the observed “sample path” is reflected in its good match with independent estimates of subdaily and monthly tropospheric-average temperatures [Compo et al., 2011], daily temperatures [Parker, 2011], monthly high-pass filtered TL2m (Table S2), and also in 24 h forecast skill when used as initial conditions [Compo et al., 2011].
 The independent estimate of TL2m from 20CR demonstrates that, in spite of recently published [Pielke et al., 2007; Fall et al., 2011; Montandon et al., 2011] and public [Christy, 2012] concerns with the station temperature record, the temperature analyses [Brohan et al., 2006; Hansen et al., 2010; Vose et al., 2012a; Jones et al., 2012; Japan Meteorological Agency, unpublished data, 2012; Harris et al., 2013] used by the Intergovernmental Panel on Climate Change (IPCC) [Trenberth et al., 2007] and many others for climate science and for input to climate policy [IPCC, 2007] are reliable and robust estimates of large-scale TL2m variability and change. Still, while more than 80% of the temporal variation of global land-average temperature (Tables S2 and S3) and 60% of the spatial variation in the century and recent half-century trends are captured (Table S2 and Figure 3), 20CR does show interesting differences in some regions such as the midwestern United States, Argentina, and eastern Brazil (Figures 2, S2, and S3) and during some time periods such as 1944–1945, the 1960s, and 1970s (Figures 1 and S1). Resolving these differences may include addressing previously unrecognized issues with the pressure observations, time-varying land use and land cover [Nuñez et al., 2008; Pielke et al., 2007; Brohan et al., 2006], aerosols [Parker, 2011; Jones et al., 2012], remaining inhomogeneities in the station records [Jones and Wigley, 2010; Trewin, 2010], or the prescribed boundary conditions in the 20CR system.
 All data are publicly available (Table S1). We thank the NOAA ESRL/PSD IT and Data group for support. 20CR used resources at NERSC (DE-AC02-05CH11231) and OLCF (DE-AC05-00OR22725) funded by the Department of Energy (DoE) Office of Science. This work was supported by NOAA Climate Program Office and the Office of Science (BER), U.S. Department of Energy. P.B. was supported by the Joint DECC/Defra Met Office Hadley Centre Climate Programme (GA01101). P.D.J. has been supported by the USDoE (Grant DE-SC0005689). We also thank two anonymous reviewers for their comments.
 The Editor thanks two anonymous reviewers for their assistance in evaluating this paper.
Auxiliary materials are available in the HTML. doi:10.1002/grl.50425.