An assessment quantifying the impact of urbanization on temperature trends from the U.S. Historical Climatology Network (USHCN) is described. Stations were first classified as urban and nonurban (rural) using four different proxy measures of urbanity. Trends from the two station types were then compared using a pairing method that controls for differences in instrument type and via spatial gridding to account for the uneven distribution of stations. The comparisons reveal systematic differences between the raw (unadjusted) urban and rural temperature trends throughout the USHCN period of record according to all four urban classifications. According to these classifications, urbanization accounts for 14–21% of the rise in unadjusted minimum temperatures since 1895 and 6–9% since 1960. The USHCN version 2 homogenization process effectively removes this urban signal such that it becomes insignificant during the last 50–80 years. In contrast, prior to 1930, only about half of the urban signal is removed. Accordingly, the National Aeronautics and Space Administration Goddard Institute for Space Studies urban-correction procedure has essentially no impact on USHCN version 2 trends since 1930, but effectively removes the residual urban-rural temperature trend differences for years before 1930 according to all four urban proxy classifications. Finally, an evaluation of the homogenization of USHCN temperature series using subsets of rural-only and urban-only reference series from the larger U.S. Cooperative Observer (Coop) Network suggests that the composition of Coop stations surrounding USHCN stations is sufficiently “rural” to limit the aliasing of urban heat island signals onto USHCN version 2 temperature trends during homogenization.
 Urbanization has long been recognized as having the potential to impact near-surface temperature readings by altering the sensible and latent heat fluxes in affected areas [e.g., Mitchell, 1953; Oke, 1982; Arnfield, 2003]. The concentration of high thermal mass impermeable surfaces in urbanized regions commonly leads to higher surface temperatures compared to those in less developed or rural areas, especially at night [Oke, 1982; Parker, 2010]. To mitigate the potential for an urban bias in temperature records used for climate monitoring, stations that comprise the U.S. Historical Climatology Network (USHCN) were selected to be largely from rural or small town locations [Quinlan et al., 1987; Menne et al., 2009]. Still, station locations tend to be correlated with inhabited areas. Relative to the percentage of total land area that is built up, “urban” observation stations are likely overrepresented in general, even in networks like the USHCN.
 Given the potential for urban biases, a number of studies have been undertaken to quantify the impact of the “urban heat island” (UHI) signal on land surface air temperature trends globally [e.g., Peterson et al., 1999; Parker, 2006; Jones et al., 2008; Hansen et al., 2010] and regionally within the United States [e.g., Kukla et al., 1986; Karl et al., 1988; Gallo et al., 1999; Gallo and Owen, 2002; Peterson, 2003; Peterson and Owen, 2005]. Unfortunately, quantifying the impact of urbanization on temperature trends is complicated by multiple confounding factors. For example, an instrument originally installed in an urban environment may well have warmer absolute temperatures than one in a nearby rural area, all else being equal, but will not necessarily show a higher trend over time unless the composition of the city or the microclimate around the sensor changes in such a way as to cause the city observations to further diverge from temperatures at nearby rural locations [Jones and Lister, 2010], or the nature of urban land use leads to an amplifying of warm events whose frequency may change with time [McCarthy et al., 2010]. It follows that urban heat island effects will lead to larger temperature trends compared to rural areas only if UHI-related factors cause incremental increases over rural temperatures during the period over which the trend is calculated [Boehm, 1998]. Moreover, cooling biases can be introduced into the temperature record when stations move from city centers to more rural areas on the urban periphery. This may have occurred, for example, during the period between about 1940 and 1960 when stations were moved from urban centers to newly constructed airports [Hansen et al., 2001] and, in the case of the USHCN, airports, waste water treatment plants, and other locations that lie outside the urban core [National Climatic Data Center, 2012].
Conversely, an instrument that is constructed in a relatively rural area that becomes more urban over time may exhibit a warming bias, and stations in small towns are not necessarily free of urban influences.
 To further complicate matters, changes associated with urbanization may have impacts that affect both the mesoscale (10²–10⁴ m) and the microscale (10⁰–10² m) signals. Small station moves (e.g., closer to nearby parking lots/buildings or to an area that favors cold air drainage) as well as local changes such as the growth of or removal of trees near the sensor may overwhelm any background UHI signal at the mesoscale [Boehm, 1998]. Notably, when stations are located in park-like settings within a city, the microclimate of the park can be isolated from the urban heat island “bubble” of surrounding built-up areas [Spronken-Smith and Oke, 1998; Peterson, 2003]. Furthermore, changes in observation practice such as time of observation and instrument changes may lead to artifacts (inhomogeneities) in the data record that complicate the quantification of urban heat island signals [Peterson, 2003], especially if these changes are correlated with urban form.
 Here, an analysis is described whose aim is to quantify the potential UHI contribution to U.S. temperature trends by more fully controlling for external factors that impact the trends but are otherwise unrelated to urbanization. A range of estimates for the UHI contribution to average U.S. temperature trends is provided by making use of four separate ways to differentiate urban and rural station environments to help assess uncertainty associated with identifying urban environments. The impact of data homogenization on the UHI signal is also evaluated. Homogenization is necessary to account for shifts in the station-based data caused by historical changes in the circumstances behind surface temperature measurement (e.g., changes in instrument type, station relocations) rather than by true changes in the climate. The artifacts caused by these kinds of changes have large, systematic impacts on U.S. temperature trends [Menne et al., 2009; Williams et al., 2012a]. Consequently, homogenized data sets are essential for evaluating temperature changes from the observational record [Venema et al., 2012; Lawrimore et al., 2011; Hansen et al., 2010; Vose et al., 2012]. Benchmarking the approach to homogenizing the U.S. monthly temperature data has essentially reaffirmed previous assessments regarding the nature and impact of these artifacts on USHCN temperature trends [Williams et al., 2012a].
 Homogenization of the USHCN monthly version 2 temperature data does not specifically target changes associated with urbanization. Instead, the procedure used involves identifying and accounting for shifts in the monthly temperature series that appear to be unique to a specific station, the assumption being that a spatially isolated and sustained shift in a station series is caused by factors unrelated to background climate variations [Menne et al., 2010]. Given that UHI-related changes may manifest as highly localized shifts or creeping changes, the focus in this analysis is to determine to what extent homogenization is removing apparent, local urban influences on the USHCN temperature record. Because homogenization may be removing local shifts caused by land use changes at nonurban stations, the same methodology used here could be applied to evaluating the impact of other types of land use changes.
 The paper is organized as follows. Some additional background and motivation for the study are provided in section 2. The data sets and methods are discussed in section 3. Results are presented in section 4. Conclusions are provided in section 5.
2 Background and Motivation
 Motivation for assessing urban influences on temperature trends comes largely from interest in quantifying the contribution of urbanization in overall temperature trends relative to other factors. To that end, measures of ambient population [Kukla et al., 1986] and satellite-derived nightlights [Gallo et al., 1999] have been used to differentiate urban and rural environments. Using these measures, monthly temperatures from U.S. weather stations designated as urban have been found to have decadal trends as much as 0.12°C/decade higher than those classified as rural [Kukla et al., 1986]. Because differences of this magnitude represent a non-negligible fraction of the likely background climate change signal, Karl et al.  developed a specific adjustment to control for the apparent contribution of the urban heat island signal in USHCN temperature data. After adjusting for shifts in the data associated with time of observation and other changes documented in station histories, the Karl et al.  evaluation suggested that an additional urban bias was present in the USHCN average temperature of about 0.06°C during the period from 1900 to 1984. Essentially all of the bias was associated with minimum temperatures in urban areas, which were about 0.13°C higher on average than in rural areas; maximum temperatures appeared to have little urban bias.
 The Karl et al.  UHI correction was used to produce the fully adjusted USHCN (version 1) monthly temperature data until the release of version 2 [Menne et al., 2009]. As in version 1, the version 2 release includes bias adjustments for time of observation and other station history changes, but version 2 also includes adjustments for changes (inhomogeneities) that are not documented in digital station histories (roughly 50% of all changes). Providing adjustments for both documented and undocumented station changes reduced the overall magnitude of minimum temperature trends from USHCN stations more than the fully adjusted version 1 temperatures even though version 1 contained the additional Karl et al.  UHI adjustment. The reason for this may be that the more comprehensive homogenization in version 2 removes the impact of incremental, but previously unidentified step changes associated with mesoscale and microscale urbanization factors, or that the signal arising from local UHI trend changes is sometimes aliased (i.e., inadvertently accounted for; DeGaetano, 2006) onto estimates of the more comprehensive version 2 step-type adjustments [Menne et al., 2009]. In any case, the development of a method for identifying and adjusting undocumented shifts appeared to account for more than the signal attributed to urban effects on minimum temperatures by Karl et al. . Thus, no separate UHI-specific correction was provided in USHCN version 2.
 Another reason that the Karl et al.  corrections were not used in version 2 is that they are monotonic functions of city population; that is, these adjustments always reduced minimum temperature trends based on the total population of the city. In contrast, Hansen et al. [1999, 2001, 2010] have used a nightlights-based method that forces urban (and “periurban”) station trends to conform to surrounding rural trends in the National Aeronautics and Space Administration (NASA) Goddard Institute for Space Studies (GISS) surface temperature analysis. In the process, the Hansen et al. approach actually increases the trend for about 40% of urban stations. The fact that so many urban trends are larger after the urban adjustment likely reflects the degree to which the confounding factors discussed above can mitigate or otherwise obscure potential urban heat island signals.
 For the U.S. data contribution to the NASA GISS analysis, Hansen et al. [2001, 2010] use the USHCN data that have been adjusted by the National Oceanic and Atmospheric Administration/National Climatic Data Center (NOAA/NCDC) for time of observation and station history changes, but apply their own UHI adjustment. The GISS urban adjustment reduced the otherwise adjusted USHCN version 1 temperature trends by an additional 0.15°C/century, more than twice that of the Karl et al.  method [Hansen et al., 2001], even though the NASA GISS UHI corrections are not monotonic. When one uses the USHCN version 2 adjusted data, the impact of the GISS UHI correction is on the order of 0.07°C/century [Hansen et al., 2010].
 The differential impacts of these approaches to assessing and correcting for the UHI are indicative of the need to better frame the uncertainty of urban influences on temperature trends in the United States. As noted more recently by Peterson  and Peterson and Owen , this requires controlling for the many confounding issues like differences in instrumentation and other observation practices that may blur the urban signal. Whereas Peterson  and Peterson and Owen  focused primarily on a snapshot of mean urban-rural differences, here we build on their work by looking specifically at the time evolution of urban-rural differences. We use four rather than two proxy measures of urbanity and quantify the impact of data homogenization on the apparent UHI signal, focusing in particular on the potential magnitude of residual UHI contamination and whether there is evidence that homogenization transfers UHI bias from urban to nonurban station series.
 The Conterminous United States (CONUS) has some of the densest publicly available digital surface temperature data in the world, with over 7000 Cooperative Observer (Coop) Network Program stations reporting daily maximum and minimum temperatures for at least 10 years of the network's 120-plus year history. A subset of 1218 stations, generally those with long records, composes the USHCN [Menne et al., 2009]. This highly sampled surface temperature field allows for the comparison of subsets of station data in a manner that avoids inherent biases due to changes in spatial coverage. The Coop Program also now maintains accurate geolocational information on the present location of observing stations, with coordinates expressed in degrees, minutes, and seconds (roughly 30 m accuracy) available for most stations. This also allows for the accurate indexing of current Coop station locations against high-resolution georeferenced data sets that are useful for delineating urban and nonurban areas.
 Because there is not an obvious mesoscale metric that determines the impact of urban form on temperature in all situations, we examined four different measures of urbanity that are available as georeferenced data sets: satellite-derived nightlights, urban boundary delineations, percent of impermeable surfaces, and historical population growth during the period for which high-resolution data are available (1930–2000). These four measures, which represent different snapshots of urban boundaries, were used to classify a station as urban or nonurban by retrieving the pixel values coincident with each station's coordinates. When the proxy for urban form involved continuous measurements (all but urban boundaries), a cutoff point to divide stations between urban and rural was chosen based on urban designations present in the literature (e.g., Hansen et al.  for nightlights and Elvidge et al.  for impermeable surface area). Each of these proxies is described in section 3.1 below.
3.1 Data Sets Used to Classify Station Types
3.1.1 Satellite Nightlights
 Satellite-derived brightness values associated with the Coop Network stations (including the USHCN) were taken from the Global Radiance Calibrated Nighttime Lights data set produced by the Earth Observation Group using instruments flown on Defense Meteorological Satellite Program (DMSP) satellites. We used the data from the F16 satellite recorded between 28 November 2005 and 24 December 2006. The values we associate with each station are linearly interpolated from the four neighbor pixels in the image file and are converted to radiance by multiplying by 1.51586 × 10⁻¹⁰, giving a result in W sr⁻¹ cm⁻² [Baugh et al., 2010]. To determine a radiance value threshold for designating urban stations that is consistent with the 32 µW/m²/sr/µm used in Hansen et al.  (who used data from Imhoff et al. ), we divided radiance values by the optical bandwidth of the F16 satellite (0.7 µm), resulting in a cutoff of 14.78 (i.e., 32 × 0.7 ÷ 1.51586) as the equivalent value for the 2005–2006 satellite nightlight series. This is rounded to the nearest integer (15) for the purpose of assigning a cutoff to separate urban from nonurban pixels.
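 The unit conversion behind the cutoff can be checked with a few lines of Python. This is only an illustrative sketch, not the code used in the study: the function and constant names are our own, and the pixel layout assumed for the four-neighbor (bilinear) interpolation is hypothetical. Multiplying the 32 µW/m²/sr/µm threshold by the 0.7 µm bandwidth gives 22.4 µW/m²/sr; because 1 µW/m² equals 10⁻¹⁰ W/cm², dividing by 1.51586 yields the equivalent pixel value.

```python
# Constants quoted in the text; the names are illustrative assumptions.
SPECTRAL_THRESHOLD = 32.0  # µW/m²/sr/µm, urban threshold from the literature
BANDWIDTH_UM = 0.7         # µm, optical bandwidth stated for the F16 sensor
SCALE = 1.51586            # pixel value x 1.51586e-10 = radiance in W/sr/cm²


def pixel_cutoff():
    """Pixel value equivalent to the spectral-radiance threshold."""
    radiance_uw_m2_sr = SPECTRAL_THRESHOLD * BANDWIDTH_UM  # 22.4 µW/m²/sr
    # 1 µW/m² = 1e-6 W / 1e4 cm² = 1e-10 W/cm², so the 1e-10 factors cancel.
    return radiance_uw_m2_sr / SCALE


def bilinear(v00, v10, v01, v11, fx, fy):
    """Interpolate among four neighboring pixels.

    v00/v10 are the left/right pixels of one row, v01/v11 those of the
    next row; fx and fy are fractional offsets in [0, 1] (assumed layout).
    """
    first_row = v00 * (1 - fx) + v10 * fx
    second_row = v01 * (1 - fx) + v11 * fx
    return first_row * (1 - fy) + second_row * fy


cutoff = pixel_cutoff()       # about 14.78
urban_cutoff = round(cutoff)  # 15
```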
3.1.2 Urban Boundaries (GRUMP)
 For the urban boundaries urbanity proxy, we use binary designations from the Global Rural-Urban Mapping Project (GRUMP), produced by the Center for International Earth Science Information Network (CIESIN) of the Earth Institute at Columbia University . GRUMP designations are based on the identification of urban areas using national census data (including the National Imagery and Mapping Agency database of populated places). GRUMP purports to identify cities and towns with populations exceeding 1000 residents. Urban boundaries surrounding identified cities and towns are estimated based on DMSP Operational Linescan System (OLS) data from 1994 to 1995 as well as data from the Digital Chart of the World's Populated Places (DCW) [Balk et al., 2004].
3.1.3 Impermeable Surfaces
 The Impervious Surface Area (ISA) for pixels coincident with Coop stations is taken from the Global Distribution and Density of Constructed Impervious Surfaces data set produced by the Earth Observation Group. The 1 km resolution data used for this study were derived from 30 m ISA data generated by the U.S. Geological Survey as described in Elvidge et al. . The data product has a nominal date of 2000/2001 and represents the percentage of the surface area that comprised man-made structures such as roads, buildings, and parking lots. Station latitude and longitude were used to reference the data set and extract the percentage of impervious surface in the surrounding 1 km. To determine the urban/nonurban classification, a cutoff of 10% was employed. As noted by Schueler  and Arnold and Gibbons , the impacts to hydrology typically begin above this figure. ISA values below 10% were classified as rural. This approach is consistent with, although somewhat more conservative than, the recent work of Potere et al. , who used a figure of 20% for detecting urban extent.
3.1.4 Population Growth
 For the population growth proxy, we utilized gridded 1 km population estimates for the CONUS, 1930–2000. This data set was also used by Peterson and Owen  and Peterson  to classify USHCN stations into urban and nonurban categories. The gridded population was created by using two U.S. Census Bureau data sets: the 2000 U.S. Census Bureau 1 km² population density grid for CONUS [National Geophysical Data Center/NESDIS/NOAA, 2002] and tabular U.S. Census county data [U.S. Census Bureau, 2002]. Urban sites were defined as those characterized by a 1930–2000 population growth of ≥10 people/km², which yields similar-sized numbers of urban and nonurban stations, as shown in Table 1. While there is no available justification in the literature for this or any specific 1930–2000 population growth cutoff as a proxy for urbanization, this value was chosen to be reasonably conservative and to produce an urban/rural division generally in line with the other urbanity proxies. As Table 1 indicates, the GRUMP, Nightlights, and Population Growth urbanity proxies result in a relatively even distribution of stations in the rural and urban categories, while the ISA proxy identifies the majority of stations as rural. Information on retrieving these data sets is provided as auxiliary material.
Table 1. Number of USHCN Stations Classified by Urbanity for Each Urbanity Proxy (a)
(a) Four stations could not be classified using the ISA urbanity proxy due to data set limitations.
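 As a compact summary of section 3.1, the four cutoffs can be expressed as a single classification function. The sketch below is illustrative only: the function name, argument names, and return structure are hypothetical, and treating a nightlights value exactly equal to 15 as urban is our assumption, since the text specifies only the cutoff value itself.

```python
# Cutoffs as stated in section 3.1; names are illustrative assumptions.
NIGHTLIGHTS_CUTOFF = 15     # pixel value (rounded from 14.78)
ISA_CUTOFF_PERCENT = 10.0   # ISA values below 10% are rural
POP_GROWTH_CUTOFF = 10.0    # people/km², growth over 1930-2000


def classify_station(nightlights, in_grump_urban, isa_percent, pop_growth):
    """Return True (urban), False (rural), or None (unclassifiable) per proxy."""
    return {
        # Assumption: the cutoff value itself counts as urban.
        "nightlights": nightlights >= NIGHTLIGHTS_CUTOFF,
        # GRUMP is already a binary urban-extent designation.
        "grump": bool(in_grump_urban),
        # A few stations have no ISA value and cannot be classified.
        "isa": None if isa_percent is None else isa_percent >= ISA_CUTOFF_PERCENT,
        "population_growth": pop_growth >= POP_GROWTH_CUTOFF,
    }
```

 A station can therefore be urban under one proxy and rural under another, which is why results are reported separately for each classification.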
3.2 Calculation of Rural and Urban Temperature Trend Differences
 Urban-rural temperature differences were calculated by subsetting the USHCN station data according to the urban/nonurban station classifications described above (for simplicity, nonurban stations are referred to as rural). To examine the possible UHI signal present in the USHCN temperature record, we use two different but complementary methods to compare urban and rural station temperatures: station pairing and spatial gridding.
3.2.1 Station Pairing Method
 The station pairing method creates pairs of nearby urban and nonurban (rural) stations as classified by the four urban proxy measures. Pairs were created by forming all possible permutations of urban and rural stations, excluding those that were more than 161 km (100 miles) apart, those that had differing or unknown sensor equipment types (e.g., Maximum Minimum Temperature Sensors [MMTS] versus Liquid in Glass Thermometers in Cotton Region Shelters [CRS]), and cases in which both stations currently have MMTS sensors but installation dates differ by more than 5 years. This pairing method yields a set of proximate urban/rural station pairs for each classification method that should be relatively unaffected by bias introduced through sensor-type transitions [Quayle et al., 1991; Menne et al., 2009]. Time series of monthly maximum and minimum temperature anomalies relative to a 1961–1990 baseline were calculated for all urban and rural series. Difference series for each urban-rural station pair were then created for the full period of the USHCN version 2 records (1895 to present).
 More specifically, the approach in the station pairing method was to take all permutations of urban and rural stations and produce a set containing unique pairs but non-unique occurrences of individual urban and rural station series (Table 2). For example, a specific urban station would create distinct pairs with all surrounding rural stations within 161 km (100 miles) with the same instrumentation type. To avoid overweighting regions with large numbers of adjacent urban and rural stations (and thus disproportionately more possible station pair combinations), we weight the urban-rural differences by the inverse of the number of station pairs associated with each unique urban station. The mean urban-rural differences for unique urban stations are averaged for each month to obtain a best estimate of the underlying urban-rural temperature differences.
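 The pairing and weighting rules can be sketched as follows. This is a minimal illustration, not the code used in the study: the station record layout (dictionary keys `id`, `lat`, `lon`, `urban`, `sensor`, `mmts_year`) is hypothetical, and great-circle distance via the haversine formula is our assumption for how separation is measured.

```python
import math
from collections import Counter

MAX_DISTANCE_KM = 161.0     # 100 miles, the exclusion distance
MAX_MMTS_GAP_YEARS = 5      # allowed difference in MMTS installation dates


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def build_pairs(stations):
    """Return (urban_id, rural_id) pairs that pass all exclusion rules."""
    urban = [s for s in stations if s["urban"]]
    rural = [s for s in stations if not s["urban"]]
    pairs = []
    for u in urban:
        for r in rural:
            # Exclude unknown or differing sensor types.
            if u["sensor"] is None or u["sensor"] != r["sensor"]:
                continue
            # Both MMTS: installation dates must be within 5 years.
            if u["sensor"] == "MMTS" and abs(u["mmts_year"] - r["mmts_year"]) > MAX_MMTS_GAP_YEARS:
                continue
            if haversine_km(u["lat"], u["lon"], r["lat"], r["lon"]) > MAX_DISTANCE_KM:
                continue
            pairs.append((u["id"], r["id"]))
    return pairs


def pair_weights(pairs):
    """Weight each pair by 1 / (number of pairs sharing its urban station)."""
    counts = Counter(urban_id for urban_id, _ in pairs)
    return {pair: 1.0 / counts[pair[0]] for pair in pairs}
```

 With these weights, every unique urban station contributes a total weight of one regardless of how many rural neighbors it has.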
Table 2. Number of Total Urban/Rural Station Pairs and Unique Urban Stations by Urbanity Proxy (rows: total station pairs; unique urban stations)
 The trend and confidence intervals for two periods, 1895–2010 and 1960–2010, are calculated from the station pair data using a weighted regression with clustered standard errors, with unique urban stations used for both the weighting and clustering. Standard errors are clustered by unique urban station because station pairs contain non-unique occurrences of individual urban and rural stations (e.g., a single urban station might be paired with four different rural stations), and treating each station pair as independent would result in erroneously narrow confidence intervals. As mentioned previously, each urban-rural pair is given a weight in the regression proportionate to the inverse of the number of station pairs that share the same urban station.
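 To make the role of clustering concrete, the following sketch fits a weighted linear trend and computes a cluster-robust (sandwich) standard error for the slope. It is an illustration under stated assumptions rather than the procedure actually used: the function name is hypothetical, the model is a single straight line with intercept, and no small-sample correction is applied to the variance estimate.

```python
import math
from collections import defaultdict


def clustered_wls_trend(t, y, w, cluster):
    """Weighted least squares fit of y = a + b*t.

    Returns (slope, cluster-robust standard error of the slope).
    Assumption: plain sandwich estimator, no small-sample correction.
    """
    sw = sum(w)
    swt = sum(wi * ti for wi, ti in zip(w, t))
    swtt = sum(wi * ti * ti for wi, ti in zip(w, t))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swty = sum(wi * ti * yi for wi, ti, yi in zip(w, t, y))
    det = sw * swtt - swt ** 2
    intercept = (swtt * swy - swt * swty) / det
    slope = (sw * swty - swt * swy) / det

    # Sum weighted score contributions within each cluster (urban station).
    scores = defaultdict(lambda: [0.0, 0.0])
    for ti, yi, wi, gi in zip(t, y, w, cluster):
        resid = yi - intercept - slope * ti
        scores[gi][0] += wi * resid
        scores[gi][1] += wi * ti * resid
    m00 = sum(s[0] * s[0] for s in scores.values())
    m01 = sum(s[0] * s[1] for s in scores.values())
    m11 = sum(s[1] * s[1] for s in scores.values())

    # Slope row of the inverse of X'WX, then the sandwich product.
    r0, r1 = -swt / det, sw / det
    var_slope = r0 * r0 * m00 + 2 * r0 * r1 * m01 + r1 * r1 * m11
    return slope, math.sqrt(max(var_slope, 0.0))
```

 Here `t` would hold the dates, `y` the urban-rural differences for each pair and month, `w` the inverse pair counts, and `cluster` the identifier of the urban station in each pair. Residuals from pairs that share an urban station are summed before squaring, so their correlation widens the interval instead of being ignored.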
 The station pairing method allows us to control for both spatial coverage and sensor type, avoiding potential complications introduced by differing locations of urban and rural stations as well as urban-correlated bias in the transition to MMTS sensors in the 1980s. The results will not necessarily be as representative of the entire CONUS temperature field as those produced by spatial gridding, however, as station pairing does not explicitly weight based on spatial coverage.
3.2.2 Spatial Gridding Method
 The spatial gridding method is used to create separate gridded fields for the CONUS using the subsets of urban and rural station series (and separately for maximum and minimum temperatures) as classified by each urban proxy measure. Station temperatures are converted to anomalies relative to a 1961–1990 baseline period, and station series that fall within 2.5° latitude × 3.5° longitude grid cells are averaged together; then each grid cell average is cosine weighted to produce a CONUS average time series. The CONUS average urban and rural station series are then differenced. Trends and confidence intervals for the urban-rural differences during the 1895–2010 and 1960–2010 periods are calculated by regressing against the date using an AR(1) model to account for autocorrelation.
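 For a single month, the gridding step can be sketched as below. This is an illustrative simplification, not the operational NOAA/NCDC code: the function name is hypothetical, grid cell edges are assumed to be aligned to 0° latitude and longitude, and the cosine weight is assumed to be evaluated at the latitude of each cell center.

```python
import math
from collections import defaultdict

CELL_LAT_DEG = 2.5  # grid cell height
CELL_LON_DEG = 3.5  # grid cell width


def gridded_average(stations):
    """Cosine-weighted average of grid cell means for one month.

    stations: iterable of (lat, lon, anomaly) tuples, with anomalies
    already expressed relative to the 1961-1990 baseline.
    """
    cells = defaultdict(list)
    for lat, lon, anomaly in stations:
        # Assumption: cell edges are multiples of the cell size from 0 degrees.
        key = (math.floor(lat / CELL_LAT_DEG), math.floor(lon / CELL_LON_DEG))
        cells[key].append(anomaly)

    weighted_sum = 0.0
    weight_total = 0.0
    for (lat_index, _), values in cells.items():
        center_lat = (lat_index + 0.5) * CELL_LAT_DEG
        weight = math.cos(math.radians(center_lat))
        weighted_sum += weight * (sum(values) / len(values))
        weight_total += weight
    return weighted_sum / weight_total
```

 The urban-rural difference for that month is then `gridded_average(urban_stations) - gridded_average(rural_stations)`, and repeating this for every month gives the series whose trend is estimated.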
 The gridding method described above is commonly used by NOAA/NCDC to produce spatially averaged time series for climate monitoring. In addition to this method, results using the gridding method described in Menne et al. [2009, 2010] are provided as supplementary information.
3.3 USHCN Version 2 Monthly Temperature Data
 Urban-rural differences for mean monthly maximum and minimum temperatures were calculated using four different versions of the USHCN version 2 monthly temperature data. The four versions were used to help quantify the magnitude of the UHI in the underlying raw (unhomogenized) data, to isolate the impact of data homogenization on the UHI signal, and to evaluate the impact of the GISS UHI correction when applied as an additional correction over and above homogenization. The data set versions include:
time of observation-only adjusted data (called TOB);
adjusted version 2 (TOB + pairwise homogenization adjustments; v2);
adjusted version 2 data produced by running the pairwise homogenization algorithm using (a) neighboring series classified only as rural (v2-rural neigh) and (b) neighboring series classified only as urban (v2-urban neigh);
adjusted version 2 data with the GISS UHI correction (TOB + pairwise homogenization + GISS UHI adjustments; v2 + step2).
Each of these variants is described below.
3.3.1 Time of Observation Bias-Adjusted Data (TOB)
 The TOB station series are the raw monthly temperature data adjusted only for the time-of-observation bias [Schaal and Dale, 1977; Karl et al., 1986]. The time of observation bias is an artifact of the starting/ending hour for the 24 h interval over which the maximum and minimum temperature occurred. This bias is unrelated to any physical artifacts associated with urbanization and only leads to biased trends when the time of observation changes through time. However, such changes are likely more prevalent at rural stations, which are commonly run by volunteer observers who have been systematically transitioning from afternoon to morning observation times [Menne et al., 2009]. In order to remove the time of observation bias as a confounding factor in assessing UHI impacts, we use data adjusted according to the method described by Karl et al.  and Vose et al. . Results using completely unadjusted (raw) data are provided as auxiliary material using the Menne et al. [2009, 2010] gridding method.
3.3.2 Data Adjusted by the Pairwise Homogenization Algorithm (USHCN Version 2)
 Running the TOB-adjusted data through the Pairwise Homogenization Algorithm (PHA) [Menne and Williams, 2009] produces the USHCN version 2 fully adjusted data [Menne et al., 2009]. The PHA works by identifying and removing abrupt shifts in monthly temperature series that appear to be unique to a particular station. The shifts can be caused by small station moves, by a change in instrumentation, or possibly by local impacts of any kind of land use change. The shifts are identified via automated pairwise comparisons of monthly temperature series in which the relative homogeneity of a given station's series is evaluated by looking for breaks in difference series formed between the target station and a number of highly correlated neighboring series. The adjustments are based on the median shift magnitude calculated from pairwise temperature differences between the target and neighbors before and after the shift. For any particular target adjustment, the neighbor pool is drawn from those that appear to be homogeneous according to the PHA for a minimum period (24 months) before and after the target shift. The PHA does not specifically target urban station changes. Instead, the algorithm targets all shifts that appear to be unique to a particular station. Removing these local signals at all stations (rural and urban alike) produces temperature trend fields that more accurately reflect the general background climate signal than the raw data.
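 The median-of-pairwise-shifts idea can be illustrated with a toy example. The sketch below is not the PHA itself: it assumes the break date is already known and that the supplied neighbors are homogeneous around it, whereas the actual algorithm detects breaks statistically and screens neighbors. The function name and the use of a fixed 24-month comparison window on each side of the break are our assumptions.

```python
from statistics import mean, median


def estimate_shift(target, neighbors, break_index, window=24):
    """Median step size at break_index across target-minus-neighbor series.

    target: list of monthly values for the target station.
    neighbors: list of monthly series of the same length as target,
        assumed homogeneous within the window around the break.
    window: months compared on each side of the break (assumption).
    """
    shifts = []
    for neighbor in neighbors:
        # Differencing removes the climate signal shared by both stations.
        diff = [t - n for t, n in zip(target, neighbor)]
        before = diff[max(0, break_index - window):break_index]
        after = diff[break_index:break_index + window]
        if before and after:
            shifts.append(mean(after) - mean(before))
    return median(shifts)
```

 Subtracting the returned value from the target series after the break would remove the step in this simplified setting. Using the median rather than the mean limits the influence of any single neighbor that is itself affected by an undetected shift.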
 For version 2, USHCN monthly temperatures were compared to sets of highly correlated neighboring series within the larger Coop Network. Details regarding the mechanics of the PHA and evaluations of the algorithm's efficiency can be found in Menne and Williams  and Williams et al. [2012a]. Version 2.0 of the adjusted monthly data was released in 2008 based on PHA version “52d.” Urban-rural differences in version 2.0 adjusted data are discussed below. An evaluation of the UHI signal in a new version of the data set (termed version 2.5) is provided as auxiliary material using the Menne et al. [2009, 2010] gridding method. Version 2.5 fully homogenized data are produced by algorithm version “52i,” which contains some bug fixes relative to version 52d [Williams et al., 2012b].
 To evaluate the potential for UHI bias to be transferred from urban Coop stations that may be used as neighbors in the homogenization of USHCN station records, we also ran the USHCN station series through the PHA using only Coop stations that were classified as rural in one case and using only stations classified as urban in the other according to the same set of four urban proxies.
3.3.3 Version 2 Homogenized Data With the Additional NASA/GISS “GISTEMP” UHI Correction
 Finally, we apply the GISS surface temperature (GISTEMP) urban heat island adjustment (described by Hansen et al. ) to the version 2.0 series to see how it addresses any remaining urban-related signal from the homogenized monthly temperature records. The GISTEMP UHI correction adjusts the trend of stations classified as urban or periurban to match the trend of a distance-weighted composite record made from nearby rural stations. An urban station is adjusted only if there are at least three nearby rural stations with values that overlap at least two thirds of the urban station's period of record. Periods and urban stations that fail the rural station requirement are excluded from the GISS analysis. Rural stations are ideally selected to be within 500 km of the urban station, but in some cases could be as far as 1000 km away to meet the selection requirement. Note that in performing this adjustment, only rural stations from USHCN have been used. This contrasts with the usual GISTEMP analysis, which will use any suitable rural stations in GHCN, possibly including stations not in USHCN (such as in Canada and Mexico). Given the spatial density of stations in USHCN, we expect any differences in adjustment to be minimal.
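 The rural-neighbor requirement described above can be sketched as a simple selection rule. This is an illustration only and is not taken from the GISTEMP or ccc-gistemp source: the function name and input layout are hypothetical, and we assume that the two-thirds overlap condition is checked for each rural station individually, which is one possible reading of the requirement.

```python
def select_rural_neighbors(candidates, min_count=3, min_overlap=2 / 3,
                           radii_km=(500.0, 1000.0)):
    """Pick rural stations usable for adjusting one urban station.

    candidates: list of (station_id, distance_km, overlap_fraction), where
        overlap_fraction is the share of the urban station's period of
        record covered by the rural station (assumed per-station test).
    Returns the selected ids, or an empty list if the requirement fails
    even at the larger radius, in which case the urban station would be
    left out of the analysis.
    """
    for radius in radii_km:
        chosen = [station_id for station_id, distance, overlap in candidates
                  if distance <= radius and overlap >= min_overlap]
        if len(chosen) >= min_count:
            return chosen
    return []
```

 The preferred 500 km radius is tried first, and the search widens to 1000 km only when fewer than three rural stations qualify.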
 The scheme for identifying stations as urban has changed in the history of the GISTEMP analysis [Hansen et al., 1999, 2001, 2010]; here we use nighttime radiances from the DMSP-calibrated radiance product described earlier. The analysis was carried out using the ccc-gistemp software supplied by the Climate Code Foundation [Barnes and Jones, 2011]. The resulting version 2.0 series with the GISTEMP UHI correction should be essentially the same as the USHCN data used in NASA's GISTEMP product, albeit with a slightly more up-to-date data set used for determining nighttime brightness and separate application of the step 2 (UHI correction) process to average monthly minimum and maximum data rather than applying it to the mean monthly data only.
 The analysis described above produces estimates of urban-rural differences for each month from 1895 to 2010 for mean monthly minimum and maximum temperatures, for the TOB, v2, v2 + Step 2, and v2-rural neigh/v2-urban neigh variants, for each of the four urbanity proxies, via both the station pairing and spatial gridding methods. This results in 64 distinct urban-rural differences for each month.
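 The count of 64 follows if the rural-neighbor and urban-neighbor runs are treated as a single data set variant, which is an assumption inferred from the total rather than stated explicitly. A short enumeration, with labels chosen here for illustration, makes the bookkeeping explicit:

```python
from itertools import product

# Labels are illustrative; the grouping of the two neighbor runs into one
# variant is an assumption that reproduces the stated total of 64.
variables = ["tmin", "tmax"]
datasets = ["TOB", "v2", "v2 + Step 2", "v2-rural neigh/v2-urban neigh"]
proxies = ["ISA", "GRUMP", "Nightlights", "Population Growth"]
methods = ["station pairing", "spatial gridding"]

combinations = list(product(variables, datasets, proxies, methods))
print(len(combinations))  # 2 * 4 * 4 * 2 = 64
```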
4.1 Unhomogenized (TOB-Adjusted) Data
 Figure 1, which summarizes the urban minus rural (urban-rural) trend differences for all data set versions, indicates that the USHCN unhomogenized (TOB-only adjusted) data contain significant urban warming signals (p<0.05 for linear trend fit) over the period from 1895 to present in both the minimum and maximum temperatures according to nearly all urban classification and comparison methods (the exception being GRUMP and nightlights maximum temperatures evaluated via spatial gridding).
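 As a rough illustration of the kind of test implied by "p<0.05 for linear trend fit," the sketch below fits an ordinary least squares line to a series of urban-rural differences and reports the slope in °C per century. It is not the authors' procedure: the two-sided p-value uses a normal approximation to the t distribution (reasonable only for long series), and serial correlation in the residuals is ignored. Both simplifications are assumptions of this example.

```python
from statistics import NormalDist


def trend_per_century(years, diffs):
    """OLS slope of urban-minus-rural differences in deg C per century,
    plus a two-sided p-value from a normal approximation (an assumption)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, diffs))
    slope = sxy / sxx                      # deg C per year
    intercept = mean_y - slope * mean_x
    resid_ss = sum((y - (intercept + slope * x)) ** 2
                   for x, y in zip(years, diffs))
    se = (resid_ss / (n - 2) / sxx) ** 0.5  # standard error of the slope
    if se == 0:
        p = 0.0 if slope != 0 else 1.0
    else:
        z = abs(slope) / se
        p = 2 * (1 - NormalDist().cdf(z))
    return slope * 100, p
```

 A trend would be called significant in this simplified sense when the returned p-value is below 0.05.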
 As expected, the urban signal is larger in minimum temperatures than in maximum temperatures. For the unhomogenized data, urban-rural difference trends in minimum temperature range between 0.05°C and 0.5°C per century for the 1895–2010 period, depending on the classification and comparison method (e.g., station pairing or spatial gridding).
 As shown in Figure 2, there is also evidence of a significant urban signal in the unhomogenized data during the past 50 years, with urban-rural difference trends of between 0.2°C and 0.6°C per century across all urbanity proxies for the period 1960–2010. This large urban warming signal does not appear to be a result of any correlation between instrument changes and urban form because it occurs with a similar magnitude in both the station pairing method (which controls for instrument type) and the spatial gridding method (which does not).
 For minimum temperatures, the urban warming signal over both century and half-century time frames is larger in the more restrictive urban classification (ISA) that contains relatively few urban stations, and the signal is smaller in the classifications (GRUMP, Nightlights, and Population Growth) that contain a more even split between urban and rural designations. The station-pairing method often shows significantly larger urban warming than the spatial gridding method; however, the pairing method does not account for the potential biases related to the spatial distribution of the station pairs. As shown in Figure 3, the divergences between station pairing and spatial gridding methods are particularly pronounced prior to 1950, which may be indicative of a larger geographic bias to the station pairs during that period. In contrast, both methods produce similar results for periods after 1950.
 As Figure S1 shows, the rural-urban differences are even larger in the raw minimum temperatures than in the TOB-adjusted data, especially for the period since 1950 when time-of-observation changes were prevalent. However, as mentioned above, this difference is not likely driven by any physical phenomena related to UHI. Instead, it probably reflects a higher frequency of time of observation changes at nonurban stations.
 Maximum temperature urban-minus-rural trends in the unhomogenized (TOB) data are also significantly larger than zero over the period 1895–2010 for most urban classifications, but are smaller than the trends in minimum temperature urban-rural differences (Figure 4). They also show considerably less variation across urbanity proxy, with urban warming trends of around 0.08–0.22°C per century for the station pairing method and −0.04–0.2°C per century for the spatial gridding method. However, maximum temperature urban-rural difference trends are larger over the period 1960–2010, particularly in the GRUMP and Population Growth proxies, where they exceed minimum urban minus rural trends. In this case, there is also a greater divergence between analysis methods, with the station-pairing method showing much larger urban warming than the spatial gridding method, which, again, probably reflects a spatial bias caused by the non-uniform distribution of station pairs.
 By comparing the trends of rural stations to those of all USHCN stations, we can use the spatial gridding method to get an estimate of the extent to which overall CONUS minimum temperature trends over the past century may have been driven by the urban warming signal (Table S1). By this estimate, the unhomogenized minimum temperature data from rural USHCN stations yield trends that are between 14% and 21% smaller on average over the 1895–2010 period than the trends from the full USHCN network. This difference decreases to between about 6% and 9% during the last 50 years.
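 The percentage quoted here is simply the relative shortfall of the rural-only trend with respect to the all-station trend. A minimal sketch follows; the function name and the example trend values are hypothetical and are not taken from Table S1.

```python
def urban_share_of_trend(all_station_trend, rural_trend):
    """Percent by which the rural-only trend falls short of the all-station trend."""
    if all_station_trend == 0:
        raise ValueError("all-station trend must be nonzero")
    return 100.0 * (all_station_trend - rural_trend) / all_station_trend


# Hypothetical trends in deg C per century, for illustration only.
share = urban_share_of_trend(all_station_trend=1.0, rural_trend=0.82)
print(round(share, 1))  # 18.0
```

 With these made-up inputs the rural trend is 18% smaller than the all-station trend, which would fall inside the 14–21% range reported for the unhomogenized minimum temperatures.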
4.2 Homogenized Version 2 Data (v2)
 The PHA significantly reduces the difference between urban and rural minimum temperature trends according to all analysis methods and station classifications. This is particularly true over the 1960–2010 period, for which the vast majority of the urbanity proxies and methods indicate no significant urban warming in the minimum data. Maximum temperatures are a bit more mixed, although most proxies and methods show no significant urban warming in the maximum data over the period. As shown in Figure 5, there is still a small but significant minimum urban warming prior to 1960 in all urbanity proxies except for Population Growth. The station-pairing method suggests some residual urban signal before 1960, but this residual signal is small in the spatial gridding method for all proxies after 1930.
 The effect of homogenization is most pronounced in the more restrictive urbanity proxies like ISA that contain relatively few urban stations and show larger urban warming trends prior to homogenization. The divergences between urban and rural temperatures that remain prior to 1930 even after homogenization are probably caused in part by the combination of poorer metadata for that time period and fewer Coop station records that can be used as neighbors.
 Urban-rural differences in maximum temperatures over the century time frame are not reduced as much as minimum temperatures in the version 2.0 homogenization, as shown in Figure 6, but are smaller to begin with in the unhomogenized data (Figure 1).
 Comparing homogenized rural HCN stations to all HCN stations, we find that rural stations show between 0% and 10% (0% to 14%) less warming in minimum (maximum) temperature data in the version 2.0 data over the 1895–2010 period, and a slight reduction in warming over the 1960–2010 period that is not significantly different from zero (Table S1). Thus residual urban signals not removed by data homogenization appear to be significant only for the period prior to 1960 and effectively only prior to about 1930. In summary, pairwise homogenization effectively removes the urban signal present in minimum temperature data from the last 50–80 years and reduces it by around 50% or more for the period prior to 1930 (as can be seen when comparing Figures 3 and 5).
4.3 Homogenized Version 2 Data With Added GISTEMP Correction (v2 + Step 2)
 As reported in Hansen et al. , applying the GISTEMP step 2 UHI correction to the USHCN version 2 data has the impact of reducing the mean CONUS temperature trend from 0.73°C to 0.65°C over the period 1900–2009. As shown in Figure S1, this reduction appears to result almost entirely from trend adjustments in the data for years prior to 1930. After 1930, the version 2.0 (52d) and version 2.5 (52i) data are not significantly impacted by the step 2 adjustment. Moreover, this trend reduction is required only because of an urban signal in the early minimum temperature data, which get reduced by about 0.0113°C/decade by the step 2 adjustment. The impact on maximum temperature is only 0.00288°C/decade. The average of these impacts is equivalent to the impact reported by Hansen et al. . As shown in Figures 7 and 8, the GISS step 2 adjustment is effectively removing the residual urban signal in both minimum and maximum temperatures across all proxies without any significant overadjustment, even for the most restrictive definitions of urbanity.
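 The consistency between the per-decade impacts and the overall trend reduction can be checked with simple arithmetic. The sketch below assumes that the mean temperature impact is the plain average of the minimum and maximum impacts, and that the 0.73°C and 0.65°C figures represent total change over the 110 years of 1900–2009; both are assumptions of this example rather than statements from the source.

```python
tmin_impact = 0.0113    # deg C per decade, from the text
tmax_impact = 0.00288   # deg C per decade, from the text
mean_impact = (tmin_impact + tmax_impact) / 2   # about 0.0071 deg C per decade

decades = (2009 - 1900 + 1) / 10                # 11 decades in 1900-2009
implied_change = mean_impact * decades          # about 0.078 deg C
reported_change = 0.73 - 0.65                   # 0.08 deg C, from the text

print(round(mean_impact, 5), round(implied_change, 3), round(reported_change, 2))
```

 Under these assumptions the implied reduction of roughly 0.078°C agrees with the reported 0.08°C to within rounding.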
4.4 Homogenized Version 2 Data Using Only Coop Neighbors Classified as Rural (v2-Rural Neigh)
 In all of the urbanity proxies and analysis methods, the differences between urban and rural station minimum temperature trends are smaller in the homogenized data than in the unhomogenized data, which suggests that homogenization can remove much and perhaps nearly all (since 1930) of the urban signal without requiring a specific UHI correction. However, the trends in rural station minimum temperatures are slightly higher in the homogenized minimum temperature data than in the TOB-only adjusted data. One possible reason for this is that the PHA is appropriately removing inhomogeneities caused by station moves or other changes to rural stations that have had a net negative impact on the CONUS average bias (e.g., many stations now classified as rural were less rural in the past because they moved from city centers to airports or wastewater treatment plants). Another possibility is that homogenization is causing nearby UHI-affected stations to “correct” some rural station series in a way that transfers some of the urban warming bias to the temperature records from rural stations. In such a case, a comparison of the homogenized data between rural and urban stations would then show a decreased difference between the two by removing the appearance of an urbanization bias without actually removing the bias itself.
 To help determine the relative merits of these two explanations, the PHA was run separately allowing only rural-classified and only urban-classified Coop stations to be used as neighbors in calculating the PHA corrections for USHCN stations. In Figure 9, the spatially averaged U.S. minimum temperature anomalies for rural stations are shown for four different data sets: the unhomogenized (TOB-adjusted only) data; the version 2 (all-Coop-adjusted; v2) data; the homogenized data set adjusted using only Coop stations classified as rural; and the homogenized data set adjusted using only urban Coop stations.
 The large difference in the trends between the urban-only adjusted and the rural-only adjusted data sets suggests that when urban Coop station series are used exclusively as reference series for the USHCN, some of their urban-related biases can be transferred to USHCN station series during homogenization. However, the fact that the homogenized all-Coop-adjusted minimum temperatures are much closer to the rural-station-only adjustments than the urban-only adjustments suggests that the bleeding effect from the ISA-classified urban stations is likely small in the USHCN version 2 data set. This is presumably because there are a sufficient number of rural stations available for use as reference neighbors in the Coop network to allow for the identification and removal of UHI-related impacts on the USHCN temperature series. Furthermore, as the ISA classification shows the largest urban-rural difference in the TOB data, it is likely that greater differences between rural-station-only-adjusted and all-coop-adjusted series using stricter rural definitions result from fewer identified breakpoints because of less network coverage, and not UHI-related aliasing. Nevertheless, it is instructive to further examine the rural-only and urban-only adjustments to assess the consequences of using these two subsets of stations as neighbors in the PHA.
 Figure S2 shows the cumulative impact of the adjustments using the rural-only and urban-only stations as neighbors to the USHCN. In this example, the impermeable surface extent was used to classify the stations. The cumulative impacts are shown separately for adjustments that are common between the two runs (i.e., adjustments that the PHA identified for the same stations and dates) versus those that are unique to the two separate urban-only and rural-only reference series runs. In the case of both the common and unique adjustments, the urban-only neighbor PHA run produces adjustments that are systematically larger (more positive) than the rural-only neighbor run. The magnitude of the resultant systematic bias for the adjustments common to both algorithm versions is shown in black. The reason for the systematic differences is probably that UHI trends or undetected positive step changes pervasive in the urban-only set of neighboring station series are being aliased onto the estimates of the necessary adjustments at USHCN stations. This aliasing from undetected urban biases becomes much more likely when all or most neighbors are characterized by such systematic errors.
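 The distinction between common and unique adjustments can be illustrated with a small sketch. It assumes, hypothetically, that each PHA run is summarized as a mapping from a (station, date) pair to an adjustment in °C; the data structure, function names, and example values are invented for illustration and do not reflect the PHA output format or the cumulative-impact calculation behind Figure S2.

```python
def split_adjustments(run_a, run_b):
    """Split two adjustment sets into common and unique parts.

    Each run maps (station_id, date) -> adjustment in deg C (hypothetical
    format). An adjustment is 'common' when both runs flag the same
    station and date.
    """
    common_keys = run_a.keys() & run_b.keys()
    common = {k: (run_a[k], run_b[k]) for k in common_keys}
    unique_a = {k: v for k, v in run_a.items() if k not in common_keys}
    unique_b = {k: v for k, v in run_b.items() if k not in common_keys}
    return common, unique_a, unique_b


def mean_common_difference(common):
    """Average (run_a minus run_b) adjustment over the common breakpoints."""
    if not common:
        return 0.0
    return sum(a - b for a, b in common.values()) / len(common)


# Made-up adjustments for an urban-only and a rural-only neighbor run.
urban_run = {("A", "1950-06"): 0.30, ("B", "1972-01"): -0.10, ("C", "1988-03"): 0.20}
rural_run = {("A", "1950-06"): 0.20, ("B", "1972-01"): -0.20, ("D", "1990-09"): 0.15}

common, urban_only, rural_only = split_adjustments(urban_run, rural_run)
bias = mean_common_difference(common)  # positive when urban-run adjustments are larger
```

 In this toy case the adjustments at the two shared breakpoints are each about 0.1°C more positive in the urban-only run, which is the kind of systematic offset the text attributes to aliasing.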
 Figure S3 provides a similar comparison of the rural-only neighbor PHA run and the all-Coop (v2) neighbor run. In this case, the adjustments that are common to both the rural-only and the all-Coop neighbor runs have cumulative impacts that are nearly identical. This is evidence that, in most cases, the Coop neighbors that surround USHCN stations are sufficiently “rural” to prevent a transfer of undetected urban bias from the neighbors to the USHCN station series during the homogenization procedure. In the case of the adjustments that are unique to the separate runs, the cumulative impacts suggest that the less dense rural-only neighbors are missing some of the negative biases that occurred during the 1930–1950 period, which highlights the disadvantage of using a less dense station network. In fact, the all-Coop neighbor v2 data set has about 30% more adjustments than the rural-only neighbor PHA run produces. Results using the other three station classification approaches are similar and are provided as Figures S3–S8.
 According to all four proxy measures used to identify station environments that are currently urban, there is consistent evidence that urban stations have a systematic bias relative to rural stations throughout the USHCN period of record. This bias has led to an apparent urban warming signal in the unhomogenized data that accounts for approximately 14–21% of the total rise in USHCN minimum temperatures averaged over the CONUS for the period since 1895 and 6–9% of the rise over the past 50 years. Homogenization of the monthly temperature data via NCDC's PHA removes the majority of this apparent urban bias, especially over the last 50–80 years. Moreover, results from the PHA using the full set of Coop station series as reference series and using only those series from stations currently classified as rural are broadly consistent, which provides strong evidence that the reduction of the urban warming signal by homogenization is a consequence of the real elimination of an urban warming bias present in the raw data rather than a consequence of simply forcing agreement between urban and rural station trends through a spreading of the urban signal to series from nearby stations.
 As noted in section 1, one of the challenges in quantifying the UHI signal in land surface air temperature records is that changes affecting urban stations can occur at both the microscale and the mesoscale. Changes at the microscale (e.g., small station moves, growth of a tree) are not necessarily of interest in evaluations of the UHI signal because they are highly localized and may have no relevance to the broader land use changes associated with urbanization that can affect the mesoscale temperature signal. For this reason, microscale changes can be reasonably included in the list of inhomogeneities that should be corrected for via homogenization (along with instrument changes and time of observation changes). In contrast, it may be desirable to preserve changes in the mesoscale signal because these changes encompass a broader footprint and are arguably more likely to be related to larger scale land use changes. Unfortunately, it may not be possible to distinguish (at least automatically) changes occurring at the microscale from changes at the mesoscale, especially if only one station record is available to sample the mesoscale signal. Whatever the cause, when any station series exhibits a sustained change relative to highly correlated surrounding stations, the change is likely to be identified by the PHA as uniquely local, and its impact on that station's temperature trend will be removed with a bias adjustment. This happens whether the USHCN station is from a rural or urban environment, which means that the same challenge that exists for identifying UHI impacts also exists for identifying the impacts of other types of (nonurban) land use changes.
 Nevertheless, the pairing of urban and rural stations in a manner that controls for instrument type and time of observation changes reveals larger trends at urban stations, which is consistent with the understanding that land use changes associated with urbanization lead to larger historic temperature trends at urban stations. However, the fact that this larger trend signal is effectively removed through homogenization suggests that the urban environments characterized by larger trends do not have spatial scales large enough to be sampled by a number of Coop stations (or that the urban temperature signal is heterogeneous), and thus the local urban signal is being effectively removed via homogenization.
 Because homogenization is largely successful in removing urban bias in the USHCN temperature data, it appears that only about 5% of the period-of-record USHCN version 2 minimum temperature trends across the CONUS can be attributed to local urban influences and, furthermore, that most of this contribution is coming from data for years prior to 1930. This residual urban bias for the earlier years in the record may be a consequence of the reduced station density of the Coop network in the early part of the 20th century, which limits the number of pairs available for detecting inhomogeneities, some of which may be related to urbanization.
 The NASA GISS (GISTEMP) step 2 nightlight-based UHI adjustments effectively remove the remaining urban-rural differences during this early period, suggesting that the additional UHI-specific adjustment is achieving the goal of forcing agreement between urban and rural temperature trends. Nevertheless, the recently released USHCN version 2.5 data (homogenized with the PHA algorithm version “52i,” as shown in Figure S1) improves the pre-1930 period considerably vis-à-vis version 2.0 (except in the case of GRUMP), which suggests that homogenization procedures may be able to more fully account for urban-related biases in the future, at least in areas with sufficient station density. In any case, at present, the net effect of urban-correlated biases on the version 2.5 adjusted data is evidently small, accounting for less than 5% of the trend since 1895 (and between 0% and 2% since 1960). While it would probably be worthwhile to further characterize the uncertainty in UHI-related warming in data sets like the USHCN (e.g., by exploring a range of cutoffs for classifying a station as urban with the various proxies or by quantifying more site-specific aspects of a station's environment), UHI does not appear to represent a significant contributing factor in the homogenized CONUS-average maximum and minimum temperature signal over the past 50–80 years.
 The authors thank Peter Thorne, Russ Vose, Jay Lawrimore, Tom Peterson, and three anonymous reviewers for helpful comments on this manuscript. The authors also thank Steven Mosher for useful feedback and help in obtaining Impermeable Surface Area values for U.S. stations.