Gregorian calendar bias in monthly temperature databases



[1] In this study we address a systematic bias in climate records that arises from the structure of the Gregorian calendar system and exerts a statistically significant effect on monthly and seasonal temperature records. The addition of one extra day in February, normally every fourth year, produces a significant seasonal drift in the monthly values of that year in four major temperature datasets used in climate change analysis. For the Northern Hemisphere, the addition of a ‘leap year day’ creates statistically significantly colder months of July to December and, to a lesser degree, warmer months of February to June than the corresponding common (non-leap year) months. The discovery of such a fundamental bias in four major temperature datasets used in climate analysis (and likely present in any dataset displaying strong annual cycles, e.g., U.S. streamflow data) indicates the continued need for detailed scrutiny of climate records for such biases.

1. Introduction

[2] Monthly surface temperature is one of the most used climate variables to establish the human impact on climate change [Intergovernmental Panel on Climate Change, 2007]. An inherent assumption of using such a metric is that all other factors potentially influencing the record have been addressed. For example, temperature biases have been identified resulting from factors such as changes in land-use, the consistency of equipment/measurement/observation at site, the constancy of site location over time, and the influence of winds and moisture. It has been suggested that these potential biases have not been adequately addressed or noted in general assessments of climate change [Pielke et al., 2007]. Beyond these physical biases, we postulate that there are other potentially significant biases still remaining in climate records.

[3] The Gregorian calendar was designed to keep the vernal equinox on or close to 21 March, so that the date of the Christian Easter (observed on the first Sunday after full moon that falls on or after the vernal equinox) remains correct with respect to the vernal equinox [Meeus, 1991]. A key feature of the Gregorian (and of the earlier Julian) calendar system is the incorporation of periodic ‘leap years’ (or intercalary years), specifically years containing an extra day, in order to keep the calendar year synchronised with the astronomical calendar [Sajjad et al., 2005]. In particular, in the Gregorian calendar, February periodically contains twenty-nine days instead of the normal twenty-eight.

[4] A few researchers have identified anomalies with the calendar system that could lead to biases in environmental and epidemiological databases [e.g., Sagarin, 2001; Walter, 1994]. Sagarin demonstrated that the slow shift in the physical occurrence of the vernal equinox relative to its calendar date over the course of a century can bias the trend analysis of spring events such as the timing of migration and egg laying. Walter identified that several calendar effects (e.g., variation in month length, the irregular number of weekend days in each month and the occurrence of holidays) can impact trend analyses in epidemiologic data. Although climate researchers have identified that discernible differences in month length and year length can produce anomalies in comparisons, e.g., “monthly [climate] averages incorporate different lengths for different months and different years (leap years)” [Vinnikov et al., 2002, paragraph 2], there has been no examination of the one-day offset seasonal drift evident in the climate data for monthly values of leap years as compared to common (non-leap) years. Accordingly, we examine the consistent seasonal drift induced in long-term monthly climate data as a result of the incorporation of leap days into the calendar system.

[5] Conceptually, for the Northern Hemisphere, the effect of adding an extra day in February is to shift the climatological monthly average of the monthly data for February to June (termed for this study as ‘rising-sun’), such that, for example, an average April for a leap year would eliminate what would normally be the first day of April and conversely include what would normally be the first day of May. Conversely, the effect is reversed for the months of July to December (a period defined for this study as ‘sinking-sun’) where the addition of an extra day in February effectively shifts the average of those monthly data after the annual thermal maximum into the colder periods of autumn and winter. Because the specific rising-sun and sinking-sun months are reversed in the Southern Hemisphere, those basic periods should demonstrate a similar but reversed response to the Northern Hemisphere.

[6] To demonstrate this effect theoretically, we segregated a simple sinusoidal wave into 365 equal intervals (‘days’) such that the maximum occurs on Day 180 and the minimum on Day 1 and divided the intervals into the 12-month increments of common years. Next we created an equivalent leap-year sinusoidal wave divided into 366 equal intervals and divided those intervals into the 12-month increments associated with a leap year (i.e., February containing 29 days). We then constructed a hypothetical forty-year time series of these segregated sinusoidal waves (consisting of ten leap years and thirty common years) and computed the standardized monthly averages for all months. We then categorized the data into average common and leap year values.
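The construct described above can be sketched in a few lines of code. The following is a minimal illustration rather than the computation used for Figure 1: it assumes a single-cycle sinusoid with its minimum at the first interval, and it compares raw monthly means instead of standardized values (per-month standardization does not change the sign of the leap-minus-common difference). The function name monthly_means and the list names are hypothetical.

```python
import math

# Month lengths for common and leap years (January..December).
COMMON = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
LEAP = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def monthly_means(n_days, month_lengths):
    """Monthly means of one sinusoidal cycle sampled at n_days equal intervals.

    Assumed phase: the minimum falls on the first interval and the maximum
    half a cycle later.
    """
    values = [-math.cos(2 * math.pi * d / n_days) for d in range(n_days)]
    means, start = [], 0
    for length in month_lengths:
        means.append(sum(values[start:start + length]) / length)
        start += length
    return means


common = monthly_means(365, COMMON)
leap = monthly_means(366, LEAP)

# Leap-year minus common-year monthly mean, index 0 = January.
diff = [l - c for l, c in zip(leap, common)]

names = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
for name, d in zip(names, diff):
    print(f"{name}: {d:+.5f}")
```

Under these assumptions the differences are positive for February to June and negative for July to December, which is the sign pattern of the rising-sun and sinking-sun periods.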

[7] Following the discussion above, it is readily evident that, using such a theoretical sinusoidal construct, the rising-sun months are ‘warmer’ in leap years than in common years (Figure 1). Conversely, in the sinking-sun months, leap years are ‘colder’ than their common year counterparts (Figure 1). The difference in average ‘January’ values between leap and common years arises because this model is based on segregating a sine wave into either 365 or 366 parts, as opposed to the calendar system of actually adding a day to the length of the year, i.e., the theoretical model holds wavelength constant, which does not occur in the real calendar system.

Figure 1.

Theoretical average monthly ‘temperature anomalies’ with thermal maximum occurring on 1 July expressed as standardized anomalies for a ‘40-year’ sinusoidal construct such that leap year (366 day) (dotted line) and common year (365 day) (solid line) monthly values are differentiated.

[8] The effect, of course, should be reversed in the Southern Hemisphere. For this study, we therefore label for the Southern Hemisphere ‘rising-sun’ as referring to the months of July to December and ‘sinking-sun’ as referring to the months of February to June.

[9] The question is whether this calendar bias effect is significant enough to be visible in existing climatic change datasets. For this study, we relate the temperature differences between leap year and common year months to four independent temperature databases commonly and widely used in major climate change studies (the continental U.S. Historical Climatology Network Version 2; the Hadley Centre's CRUTEM3V dataset; the Microwave Sounding Unit (MSU) satellite record; and the composited global rawinsonde temperature record).

[10] For analysis of the leap year seasonal drift effect, one of the best quality-controlled regional temperature databases is the United States Historical Climatology Network (U.S. HCN), which consists of observations from the U.S. Cooperative Observer Network operated by NOAA's National Weather Service (NWS). The 1218 HCN stations were originally selected according to factors such as record longevity, percentage of missing values, spatial coverage, and the number of station moves and/or other station changes that may affect data homogeneity. The HCN dataset has been developed at NOAA's National Climatic Data Center (NCDC) in collaboration with the U.S. Department of Energy's Carbon Dioxide Information Analysis Center (CDIAC). The most recent monthly data release, the U.S. HCN Version 2, was produced using a new set of quality control and homogeneity assessment algorithms. The Version 2 homogenization algorithm and overall assessment of the Version 2 maximum and minimum temperature trends demonstrate the quality of this dataset [Menne and Williams, 2008; M. J. Menne et al., An overview of the United States historical climatology network serial monthly temperature data—Version 2, submitted to Bulletin of the American Meteorological Society, 2008].

[11] The monthly values of each of the 1218 HCN stations were standardized to that particular month's long-term mean and standard deviation (the period 1949 to 2005). Months were categorized into those occurring in leap years and those associated with common years. Following the example of the theoretical sinusoidal construct, we created two broad seasons: ‘rising-sun’, consisting of the aggregated months of February to June, and ‘sinking-sun’, comprising the aggregated months of July to December. Of the 1218 stations, 1181 (97.0%) demonstrated colder leap-year temperatures for the composited sinking-sun months than for common years, following the results from the theoretical model. A plot of the average HCN monthly values for leap year and common year values (Figure 2 and Table S1) displays a strikingly similar pattern to that of the theoretical sinusoidal construct (Figure 1).
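The standardization and compositing steps for a single station can be illustrated as follows. This is a sketch under stated assumptions, not the processing code used for the HCN analysis: the input layout (a dictionary keyed by (year, month)), the function names, and the use of the sample standard deviation are all hypothetical choices made for illustration.

```python
import calendar
import statistics

RISING = (2, 3, 4, 5, 6)           # February-June (Northern Hemisphere)
SINKING = (7, 8, 9, 10, 11, 12)    # July-December (Northern Hemisphere)


def standardize_by_month(series):
    """Convert {(year, month): temperature} to per-calendar-month z-scores.

    Each value is standardized against the long-term mean and sample
    standard deviation of its own calendar month.
    """
    z = {}
    for month in range(1, 13):
        vals = {k: v for k, v in series.items() if k[1] == month}
        if len(vals) < 2:
            continue  # not enough data to estimate a standard deviation
        mean = statistics.mean(vals.values())
        sd = statistics.stdev(vals.values())
        if sd == 0:
            continue  # constant series cannot be standardized
        for key, v in vals.items():
            z[key] = (v - mean) / sd
    return z


def leap_common_means(z, months):
    """Mean standardized anomaly over the given months, split by year type."""
    leap = [v for (y, m), v in z.items() if m in months and calendar.isleap(y)]
    common = [v for (y, m), v in z.items()
              if m in months and not calendar.isleap(y)]
    return statistics.mean(leap), statistics.mean(common)
```

Calling leap_common_means(z, SINKING) on the output of standardize_by_month returns the pair (leap-year mean, common-year mean) for the sinking-sun composite; a station counts as following the theoretical model when the first value is lower than the second.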

Figure 2.

Average monthly temperature anomalies expressed as standardized anomalies (°C) for the composite of the 1218 station HCN database such that leap year (366 day) (dotted line) and common year (365 day) (solid line) monthly values are differentiated. Bars indicate one standard error of the mean.

[12] When all 1218 stations are composited and the differences between leap-year and common-year temperatures are evaluated, aggregated sinking-sun monthly anomalies are colder for leap years (x̄ = −0.136°C) compared to temperature anomalies for common years (x̄ = 0.044°C). Similarly, aggregated rising-sun month anomalies are warmer for leap years (x̄ = 0.061°C) compared to composited rising-sun temperature anomalies for common years (x̄ = −0.020°C).

[13] Mapping the sinking-sun differences between the aggregated leap and common monthly temperature anomalies (Figure 3 and Table S2) shows, first, that, as noted above, almost the entire country demonstrates the expected leap-year effect, such that sinking-sun (July–December) leap year month temperatures are colder (negative values in Figure 3) compared to common year months. Second, the largest differences between leap year months and common year months (i.e., the strongest leap-year effect) occur in the Southwest and the Eastern United States. In contrast, the smallest differences (although still predominately demonstrating colder leap-year sinking-sun months) are evident primarily in the Great Plains.

Figure 3.

Differences in sinking-sun (July through December) standardized monthly temperature anomalies (°C) of the aggregated leap years minus aggregated common years for the U.S. Historical Climatology Network Version 2 (for the years 1949 to 2005). Negative numbers indicate colder leap year months compared to common year months for the sinking-sun period.

[14] Conceptually, this spatial variability suggests that regions where the intraseasonal temperature variability is high (Great Plains) are not as strongly impacted by the one-day seasonal shift created by leap years as those regions where the seasonal temperature progression occurs more consistently (Southwest and Northeast U.S.). Such a concept can be evaluated for each station over the length of record by comparing the standard deviation of sinking-sun months against the average temperature difference between leap and common years. Stations demonstrating smaller differences between the leap- and common-year months also record higher standard deviations for the sinking-sun months. For the 1218 U.S. HCN stations, the adjusted shared variance (r²) between the two variables (leap year/common year differences and station standard deviation) is 0.208 (Z = 17.169, p < 0.001).
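For readers wishing to reproduce this type of comparison, the shared variance between two station-level variables can be computed as sketched below. The sketch assumes the conventional Pearson correlation and the standard adjustment for a single predictor; the text does not state the exact adjustment used, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjusted_r2(r, n, predictors=1):
    """Standard adjusted r-squared: 1 - (1 - r^2)(n - 1)/(n - p - 1)."""
    return 1 - (1 - r * r) * (n - 1) / (n - predictors - 1)
```

Here x would hold each station's sinking-sun standard deviation and y the corresponding leap-minus-common difference, with n equal to the number of stations.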

[15] An explanation for this relationship is most plausibly linked to studies demonstrating that, in the Great Plains (the region of highest seasonal variability and smallest average temperature difference between leap- and common-year months), individual significant snowstorms are critical to total annual snowfall amounts [Leathers and Harrington, 1999], i.e., the absence or presence of a single storm can be the major determining factor in the annual snow total. We suggest that such dependency on single snow events also strongly influences the region's short-term temperature variability [e.g., Cerveny and Balling, 1992] and, consequently, the degree to which the leap-year temperature effect is evident.

[16] Given that a high-quality, regional climate dataset demonstrates the seasonal drift associated with the Gregorian calendar, the next question is whether larger-scale, hemispheric databases also show this systematic bias. One of the major temperature datasets in climate change is the Hadley Centre's CRUTEM3V dataset of land-air temperature values [Brohan et al., 2006] using a 5° × 5° grid-box basis from 1850 to 2007, expressed as anomalies from 1961–1990 [Jones et al., 1999]. For the Northern Hemisphere, the sinking-sun months demonstrate significantly (t = −19.31, p < 0.001) colder leap-year temperatures (x̄ = −0.115) as opposed to common year temperatures (x̄ = −0.064). This follows the theoretical expectations of colder sinking-sun months for leap years. In comparison, the CRUTEM3V dataset for the Southern Hemisphere also indicates the expected colder sinking-sun (February–June) average temperatures for leap years than for common years (t = −6.99, p < 0.001). Composited leap year sinking-sun Southern Hemisphere temperatures (x̄ = −0.127°C) were colder than common year sinking-sun months (x̄ = −0.040°C). Similar leap-year seasonal drift is evident in the Hadley Centre's HadCRUT3, the combined land and marine temperature anomalies on a 5° × 5° grid-box basis.

[17] A third independent analysis of the potential influence of the Gregorian calendar bias is possible using the Microwave Sounding Unit (MSU) satellite record. Daily bulk atmospheric temperature values derived from microwave emissions near the 60-GHz oxygen absorption band are available as gathered by a series of U.S. National Oceanic and Atmospheric Administration (NOAA) polar-orbiting satellites [Christy et al., 2003]; for this study, we use the period from 1979 to 2004. In contrast to other datasets, the leap year seasonal drift effect is most apparent in the MSU satellite record in the rising-sun (February through June) months. In those months, as theoretically expected, the composited Northern Hemispheric leap-year temperatures average significantly warmer (t = 1.940, p = 0.0524) than their common-year equivalents. Similarly, in the Southern Hemisphere, the rising-sun months (July–December) demonstrate significantly warmer leap years than their common year counterparts.

[18] A fourth independent dataset involves the surface temperature deviations in the troposphere and low stratosphere created from rawinsonde observations and compiled by the NOAA Air Resources Laboratory [Angell, 2006]. The data used in this study are available for the period 1958 to 2005 and are presented as seasonal deviations from their twenty-year average (1958–1977). This is a particularly stringent test of leap year seasonal drift in that the data are available in seasonal rather than monthly format, so that the effect has already been aggregated into three-month periods (and therefore diluted by a factor of three).

[19] For this study, we first selected the radiosonde surface temperatures aggregated between 0° and 90°. When these values are combined to create the rising-sun (spring) and sinking-sun (summer and fall) values comparable to the previous analyses, we again identify leap sinking-sun seasons as having colder average temperatures than their common counterparts (t = −2.356, p = 0.025). As with the Hadley Centre temperature data, the leap-year seasonal drift is not significantly apparent in the rising-sun data. Similar results were identified in regional Northern Hemisphere aggregated bands (polar, temperate, and subtropical). For the Northern Hemisphere, the leap year drift effect is most intense in the polar and temperate regions; for the subtropical region, the results were correct in the expected direction of difference but not statistically significant.

[20] For the radiosonde Southern Hemispheric surface temperature data, the leap-year seasonal drift is not as strongly apparent as for the Northern Hemisphere. Although the direction of the difference (colder leap sinking-sun seasons compared to common year equivalents) matches expected theoretical results, the differences between leap and common years are not statistically significant for any of the available bands (0°–90°, 10°–30°, 30°–60°, 60°–90°).

[21] These empirical results from four major climatic datasets (the continental U.S. Historical Climatology Network Version 2; the Hadley Centre's CRUTEM3V dataset; the MSU satellite record; and the composited global rawinsonde temperature record) reflect the same results as found using a simple sinusoidal temperature curve as a theoretical proxy. For the Northern Hemisphere, rising-sun (February–June) temperatures for leap years tend to be warmer than their common year counterparts, while sinking-sun (July–December) temperatures tend to be colder than their corresponding common-year temperatures. In general, the sinking-sun seasonal drift appears to be more apparent, likely resulting from a lesser amount of interannual seasonal variability. We note that this slight difference in degree of influence between the rising/sinking sun ‘seasons’ indicates that simple averaging of all months into an annual average will not completely eliminate the calendar bias from the data.

[22] Can this seasonal drift induced by the calendar system be isolated and removed from monthly climate databases? Following procedures developed for time of observation bias [Vose et al., 2003], a method for removing this bias is to compute a correction factor using daily data for leap years, to recompute common year (i.e., no 29 February occurrence) monthly temperatures for the set of available leap years, to establish the average difference between the two datasets and to create a correction factor.
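One possible reading of this procedure is sketched below. It is an assumption-laden illustration, not an established correction algorithm: we assume that ‘recomputing common year monthly temperatures’ means re-partitioning the same run of daily values using common-year month boundaries (so that what the calendar labels 29 February is treated as the first day of ‘March’), and that the correction factor is the mean difference between the two sets of monthly averages across the available leap years. The function names and input layout are hypothetical.

```python
import math

COMMON = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
LEAP = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def partition_means(daily, month_lengths):
    """Average consecutive daily values into months of the given lengths."""
    means, start = [], 0
    for length in month_lengths:
        chunk = daily[start:start + length]
        means.append(sum(chunk) / length)
        start += length
    return means


def leap_correction(daily_by_leap_year):
    """Monthly correction factors from {leap_year: 366 daily values}.

    Returns 12 values: mean of (common-boundary monthly mean minus
    leap-boundary monthly mean) over all supplied leap years.
    """
    if not daily_by_leap_year:
        raise ValueError("at least one leap year of daily data is required")
    totals = [0.0] * 12
    for daily in daily_by_leap_year.values():
        if len(daily) != 366:
            raise ValueError("expected 366 daily values for a leap year")
        as_leap = partition_means(daily, LEAP)
        # Common-year boundaries consume only the first 365 daily values.
        as_common = partition_means(daily, COMMON)
        for m in range(12):
            totals[m] += as_common[m] - as_leap[m]
    return [t / len(daily_by_leap_year) for t in totals]
```

Under this reading, the resulting factors would be added to the observed leap-year monthly values; for a smooth annual cycle with a midsummer maximum they are negative for February to June and positive for July to December.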

[23] In this study, we have identified an embedded seasonal drift in monthly climate data resulting from the Gregorian calendar system. As the drift effect is evident in both a theoretical sinusoidal construct and in a set of major temperature datasets, this drift effect is likely present in any monthly data demonstrating a strong annual cycle, climate or otherwise. This conclusion of potential bias in all such databases follows from calendar biases identified in epidemiological and phenological research [Sagarin, 2001; Walter, 1994], and our findings may, indeed, be incorporating some of the effects identified by those authors. As a supplemental test of that conclusion, we examined a network of 609 United States Geological Survey streamflow stations with data extending from 1939 to 1988 [Slack and Landwehr, 1992]. For the sinking-sun period (May to September, corresponding to a hydrologic peak in April), leap year stream discharges were significantly lower than their common year counterparts (t = −5.32, p < 0.001) while, following expectations, leap year months produced larger discharge than their common year counterparts for the rising-sun period, October to April excluding January (t = 1.94, p = 0.053). Consequently, the discovery of such a fundamental bias in monthly datasets indicates the continued need for detailed scrutiny of all such datasets.


[24] Cerveny and Balling were partially supported under NSF Geography and Regional Science/Atmospheric Dynamics grant 0751790.