Many long-term records of climate variables have missing data or have had changes in their times of observation. Here we present a technique to analyze such inhomogeneous records. We assume that the underlying climatic processes are nonstationary, where the observations contain a long-term trend superimposed on periodic shorter time seasonal and diurnal cycles. The seasonal and diurnal variations are approximated using a limited number of Fourier harmonics, while the trend is represented by a monotonic function of time whose amplitude can also vary seasonally and diurnally. A least squares method is used to estimate the unknown Fourier coefficients. As an example of the technique, we present an analysis of multi-decadal hourly observations of surface air temperature obtained from several meteorological stations within the United States.
 The amplitude of seasonal and diurnal variations of meteorological variables is generally much larger than weather-related fluctuations. As such, the instruments and analytical techniques used for climatic studies are often different than those used for studying short-term localized weather events. From the very beginning, meteorologists used monthly averages to approximate the seasonal cycle, maximum/minimum temperatures to describe the diurnal cycle of temperature, and other simplifications. Techniques were developed to avoid excessive computations and to fix the calendar-related problems of leap years and unequal length of months.
 In addition to diurnal and seasonal variations, much smaller amplitude climatic trends are an important component of meteorological variables, and they can also display diurnal and seasonal cycles. Modern climatologists mainly consider the long-term trend to be linear in time rather than a higher order polynomial. For relatively short time intervals, trend estimates are always contaminated by trend-like and episodic (El Niño events and volcanic eruptions) components of natural climate variability. This is why such trend analysis can only be used as a diagnostic tool and not for extrapolation of climatic data into the future. To address this problem and filter out short-time random variations in climatic trend, trend analysis must use very long time periods.
 Although seasonal and diurnal variations in multi-year averages of surface air temperature have been analyzed in the past [e.g., Fassig, 1907], we know of no attempts to determine the trend of such variations. Polyak investigated different techniques for approximating the seasonal cycle in multi-year averages of meteorological variables. He estimated multi-year averages for five-day periods and then approximated their seasonal variation using Fourier harmonics, polynomials, or smoothing of the pentad averages with statistically optimal numerical filters. The most recent global analysis of the geographical pattern of amplitude and phase of the first two harmonics of the diurnal cycle in observed temperature for two seasons, winter and summer, revealed the importance of the semidiurnal component of the diurnal cycle [Dai and Trenberth, 2004].
 The goal of this paper is to introduce a simple technique for approximating both the diurnal and seasonal cycles as well as the climatic trend using a limited number of Fourier harmonics. The main advantage of this technique is that it can be applied to data with changing and arbitrary observation times. Changes in observation times are well known in the history of meteorological observation in different countries. The same problem also arises with satellite observations of surface and atmospheric variables. Here we use long-term surface air temperature observations at a few regular meteorological stations to illustrate our technique. Simpler versions of this technique can be found in our recent papers [Vinnikov and Robock, 2002; Vinnikov et al., 2002a, 2002b; Cavalieri et al., 2003]. This exact technique was used by Vinnikov and Grody, but they did not have room to describe it completely. The purpose of this paper is to describe the general technique for others to use.
2. Approximation of Nonstationary Component in Climatic Records
 Let us suppose that a climatic variable y(t) has been observed more or less regularly during a long period of time. We can denote the observed value of y at time t as
$$y(t) = Y(t) + y'(t), \tag{1}$$
where Y(t) is the expected value of y(t) and y′(t) is the residual (or anomaly, in the language of meteorologists), which is mostly related to natural weather variability. Vinnikov et al. [2002a, 2002b] allowed for a linear trend in the expected value by expressing Y(t) as
$$Y(t) = A(t) + t \cdot B(t), \tag{2}$$
where A(t) = A(t + T) and B(t) = B(t + T) are periodic functions with a fundamental period of 1 yr (T = 365.25 days). That approach was applied to analyze processes without a diurnal cycle. Here we extend the analysis by assuming that A(t) and B(t) consist of short-time diurnal variations with a fundamental period of 1 day (H = 1 day) superimposed on the longer time annual cycle. By representing the diurnal and annual cycles by their Fourier series, the amplitudes become a double Fourier series,
$$A(t) = \sum_{k=-K}^{K} \sum_{n=-N}^{N} a_{kn}\, e^{i 2\pi t \left(\frac{k}{H} + \frac{n}{T}\right)}, \qquad B(t) = \sum_{l=-L}^{L} \sum_{m=-M}^{M} b_{lm}\, e^{i 2\pi t \left(\frac{l}{H} + \frac{m}{T}\right)}, \tag{3}$$
where N and M are the number of harmonics needed to approximate the seasonal variations, while K and L are the number of harmonics needed to approximate the diurnal variations in A(t) and B(t).
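As a rough count of what has to be estimated, the following sketch assumes that the double series is written in complex form with indices running from −K to K and −N to N for A(t) (and from −L to L and −M to M for B(t)), and that A(t) and B(t) are real, so that coefficients with opposite indices are complex conjugates. Under those assumptions the number of independent real parameters is:

```latex
\begin{aligned}
a_{-k,-n} = \overline{a_{kn}} \quad &\Rightarrow \quad \#A = (2K+1)(2N+1), \\
b_{-l,-m} = \overline{b_{lm}} \quad &\Rightarrow \quad \#B = (2L+1)(2M+1), \\
K = L = N = M = 2 &: \quad 25 + 25 = 50 \ \text{real unknowns}, \\
K = L = N = M = 4 &: \quad 81 + 81 = 162 \ \text{real unknowns}.
\end{aligned}
```

Either count is small compared with the number of observations in a multi-decadal hourly record.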
 The Fourier coefficients a₀₀ and b₀₀ in equation (3) are real numbers but all others are complex numbers. These unknown coefficients are found using the least squares condition,
$$\sum_{j=1}^{p} \left[ y(t_j) - A(t_j) - t_j B(t_j) \right]^2 = \min, \tag{4}$$
where t = t₁, t₂, t₃, …, tₚ are the observation times. Although the observations can be irregular, they should cover the different phases of the diurnal and seasonal cycles over different parts of the record. The number of observations should also be much larger than the number of unknown coefficients. Approximations (2) and (3) assume a linear trend, but they can easily be extended to a polynomial trend by adding higher degree terms (t²·C(t), t³·D(t), …) in equation (2). This, however, increases the number of unknown coefficients and should only be used if it is really necessary and makes physical sense.
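The least squares fit described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code: it uses the equivalent real-valued cosine/sine basis instead of complex exponentials, measures time in days from the middle of the record, and tests the fit on synthetic, irregularly sampled data. The function names `periodic_basis`, `fit_expected_value`, and `evaluate`, and all synthetic values, are hypothetical.

```python
import numpy as np

T = 365.25  # annual period in days, as in the text
H = 1.0     # diurnal period in days


def periodic_basis(t, n_seasonal, n_diurnal):
    """Constant plus cos/sin columns at frequencies n/T + k/H (cycles per day).

    Real-valued equivalent of a complex double Fourier series; it has
    (2*n_seasonal + 1) * (2*n_diurnal + 1) columns.
    """
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for n in range(n_seasonal + 1):
        for k in range(-n_diurnal, n_diurnal + 1):
            if n == 0 and k <= 0:
                continue  # (0, 0) is the constant; (0, -k) repeats (0, k)
            phase = 2.0 * np.pi * t * (n / T + k / H)
            cols += [np.cos(phase), np.sin(phase)]
    return np.column_stack(cols)


def fit_expected_value(t, y, N=2, K=2, M=2, L=2):
    """Ordinary least squares estimate of A(t), B(t) in Y(t) = A(t) + t*B(t)."""
    t = np.asarray(t, dtype=float)
    XA = periodic_basis(t, N, K)                # columns for A(t)
    XB = periodic_basis(t, M, L) * t[:, None]   # columns for t * B(t)
    X = np.hstack([XA, XB])
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coef[:XA.shape[1]], coef[XA.shape[1]:]


def evaluate(t, a, b, N=2, K=2, M=2, L=2):
    """Return A(t), B(t), and Y(t) at arbitrary times t (days)."""
    t = np.asarray(t, dtype=float)
    A = periodic_basis(t, N, K) @ a
    B = periodic_basis(t, M, L) @ b
    return A, B, A + t * B


# Synthetic test: 10 years of irregular observation times (assumed values).
rng = np.random.default_rng(0)
t_obs = np.sort(rng.uniform(-5 * T, 5 * T, size=20000))
A_true = 10 + 8 * np.cos(2 * np.pi * t_obs / T) + 4 * np.cos(2 * np.pi * t_obs / H)
B_true = 1e-4 + 5e-5 * np.cos(2 * np.pi * t_obs / H)   # degrees C per day
y_obs = A_true + t_obs * B_true + rng.normal(0.0, 1.0, t_obs.size)

a_hat, b_hat = fit_expected_value(t_obs, y_obs)
A_fit, B_fit, Y_fit = evaluate(t_obs, a_hat, b_hat)
rms_error = np.sqrt(np.mean((Y_fit - (A_true + t_obs * B_true)) ** 2))
print(f"a00 = {a_hat[0]:.3f}, b00 = {b_hat[0]:.2e}, rms error = {rms_error:.3f}")
```

Because the observation times enter only through the design matrix, nothing in the fit requires them to be regular. Centering the time axis is a numerical convenience that keeps the trend columns from dwarfing the others.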
3. Application of the Theory to Long-Term Climatic Records of Surface Air Temperature
 The technique outlined above is not very sensitive to gaps or irregularities in the observations. As such, it can be applied to observations with arbitrary observation times. Here, however, we apply the technique to a homogeneous 50–60 year period of hourly surface air temperature observations from six reporting stations over the United States (Table 1). As with every temperature record, these data contain diurnal and seasonal variations in both the multi-year averaged temperature and linear trend. These variations require a minimum of two Fourier harmonics in equation (3) to accurately approximate the seasonal and diurnal cycles, i.e., K = L = N = M = 2. The Fourier coefficients in equation (3) are estimated for each station. They depend on the choice of the beginning time coordinate t = 0, but the estimated function Y(t) does not depend on this choice. The functions Y(t), A(t), and B(t) have a leap-year cycle, because the length of a year is not equal to an integer number of days. An analog to a multi-year average temperature (for a linear trend) can be obtained by taking the average of Y(t) over two years which are in the same phase of a leap year cycle. The corresponding trend estimates are equal to B(t) for any year in the same phase of a leap-year cycle.
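The leap-year averaging can be written out explicitly. The sketch below assumes the linear-trend form Y(t) = A(t) + t·B(t) and two times t₁ and t₂ separated by a whole number j of four-year leap cycles (4T = 1461 days, an integer multiple of both T and H), so that A and B take the same values at both times. The symbols t₁, t₂, and j are introduced here for illustration only.

```latex
\begin{aligned}
t_2 &= t_1 + 4jT, \qquad A(t_2) = A(t_1), \qquad B(t_2) = B(t_1), \\
\tfrac{1}{2}\left[ Y(t_1) + Y(t_2) \right] &= A(t_1) + \tfrac{1}{2}\,(t_1 + t_2)\, B(t_1), \\
Y(t_2) - Y(t_1) &= 4jT \, B(t_1).
\end{aligned}
```

The average is therefore the expected value at the midpoint of the two epochs, and the difference between them is set entirely by B. For two leap years 52 years apart, j = 13 and the separation is 18,993 days.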
Table 1. List of Stations
Bellingham International Airport
Bellevue Offutt Air Force Base
Belleville Scott Air Force Base
Andrews Air Force Base
Maxwell Air Force Base
Honolulu International Airport
Figure 1 shows the result derived for each station in the form of contour plots. The plots in the left column display the diurnal and seasonal variation of the multi-year average temperature, estimated as Y(t), averaged for the two leap years, 1948 and 2000. Similarly, the middle column displays the trend estimates for one of the leap years, B(t). The diurnal and seasonal variations of the average temperature show reasonable patterns, with the amplitude and phase of these variations depending on climatic conditions. The four northernmost stations show a tendency for winter warming. Three of them show a diurnal asymmetry in the trends with more warming at night and less warming or even a small cooling trend in the daytime. Such asymmetry has been found at many continental meteorological stations using maximum and minimum temperatures [Karl et al., 1993], but our technique displays the pattern in much more detail. The diurnal asymmetry is mostly a summertime phenomenon, as previously shown by Vinnikov et al. [2002b]. An opposite diurnal asymmetry of temperature trend, with maximum warming in the daytime, is found in Honolulu, Hawaii. The diurnal and seasonal cycles of the standard deviation of surface air temperature are shown in the right column of Figure 1 and display details of the expected winter and daytime maximums in temperature variability. They were estimated by applying the same technique (but with B(t) = 0) to the time series of squared detrended anomalies (y′)² of surface temperature. We analyzed seasonal and diurnal cycles in variance, but displayed standard deviations for convenience.
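The variance estimate described above (the same fit with B(t) = 0, applied to squared anomalies) can be sketched as follows. This is an illustrative sketch only: it uses a real-valued cosine/sine basis, takes the detrended anomalies as given, and is tested on synthetic anomalies whose standard deviation varies seasonally. The names `periodic_basis` and `fit_std` and all synthetic values are hypothetical, and clipping the fitted variance at zero before taking the square root is a safeguard added here, not a step taken from the text.

```python
import numpy as np

T = 365.25  # annual period in days
H = 1.0     # diurnal period in days


def periodic_basis(t, n_seasonal, n_diurnal):
    """Constant plus cos/sin columns at frequencies n/T + k/H (cycles per day)."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for n in range(n_seasonal + 1):
        for k in range(-n_diurnal, n_diurnal + 1):
            if n == 0 and k <= 0:
                continue  # (0, 0) is the constant; (0, -k) repeats (0, k)
            phase = 2.0 * np.pi * t * (n / T + k / H)
            cols += [np.cos(phase), np.sin(phase)]
    return np.column_stack(cols)


def fit_std(t, anomalies, N=2, K=2):
    """Fit a periodic variance to squared anomalies (no trend term).

    Returns a function that gives the standard deviation at arbitrary times.
    """
    X = periodic_basis(t, N, K)
    squared = np.asarray(anomalies, dtype=float) ** 2
    coef, *_ = np.linalg.lstsq(X, squared, rcond=None)

    def std_at(times):
        variance = periodic_basis(times, N, K) @ coef
        # Safeguard: a truncated series can dip slightly below zero.
        return np.sqrt(np.clip(variance, 0.0, None))

    return std_at


# Synthetic anomalies with a seasonally varying standard deviation (assumed).
rng = np.random.default_rng(1)
t_obs = rng.uniform(0.0, 10 * T, size=20000)
std_true = 3.0 + np.cos(2.0 * np.pi * t_obs / T)
anomalies = std_true * rng.standard_normal(t_obs.size)

std_at = fit_std(t_obs, anomalies)
rms_error = np.sqrt(np.mean((std_at(t_obs) - std_true) ** 2))
print(f"rms error of fitted standard deviation: {rms_error:.3f}")
```

Fitting the variance and converting to a standard deviation afterwards follows the order given in the text. Squared anomalies average to the variance, whereas absolute anomalies would not average to the standard deviation.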
4. Discussion and Conclusions
 We have developed a model of the expected value of a variable that combines three climatic variations (diurnal, seasonal and trend) into a single function. Vinnikov et al. [2002b] previously treated the diurnal and seasonal scales separately. The seasonal and diurnal variations are approximated using a limited number of Fourier harmonics, while the trend is represented by a monotonic function of time whose amplitude can also vary seasonally and diurnally. A least squares method is used to estimate the unknown Fourier coefficients. This new approach to data analysis improves the description of diurnal and seasonal cycles in the observed data and provides a more efficient means of analyzing data with changing observation times. As an example of the technique, an analysis is presented of the hourly observations of surface air temperature obtained from several meteorological stations within the United States. Both the multi-year averaged temperature and linear trend estimates show detailed diurnal and seasonal patterns that were difficult to analyze previously.
 Application of this technique has a few obvious limitations. The main question is whether seasonal and diurnal variations in a climatic record can be approximated by only a small number of Fourier harmonics. What is quite acceptable for surface air temperature may not be good at all for precipitation or other climatic variables. If we need too many harmonics to approximate the variations we would have to estimate too many unknown parameters from observed data. Available climatic records are often not long enough, and the length of a climatic record limits the number of harmonics in approximations (2) and (3). If the record is too short, the errors of approximation of the expected value Y(t) will be included in the residuals y′(t) and misinterpreted as meteorological anomalies. To solve this problem we can use other non-harmonic periodic functions for the approximation of A(t) and B(t) in equation (2) that require only a small number of parameters. This will be a subject of one of our future publications. In the example here, we set the number of Fourier harmonics to 2 (i.e., K = L = N = M = 2) in equations (1)–(3) since these harmonic components provide a number of statistically significant amplitudes and result in a qualitatively good approximation of the seasonal and diurnal cycles in multi-year averages and trends. To test this assertion we assume that the autocorrelation function of the meteorological anomalies can be approximated as σ² exp(−τ/λ), where σ² is the variance, τ is the time lag, and λ is the autocorrelation scale. This model ignores seasonal and diurnal variations in the variance and lag correlation. To evaluate the effect of such an autocorrelation on the number of independent data in the time series of observed hourly temperatures we used an empirically estimated value of λ ≈ 40 hours.
By increasing the number of harmonics up to K = L = N = M = 4, we find that we do not add statistically significant amplitudes and that we decrease the variance of residuals by not more than 0.6% of its value (decrease of standard deviation of ∼0.015°C). Even for K = L = N = M = 2, approximately half of the coefficients in the approximation (equations (2)–(4)) are not significant compared to their root mean squared errors estimated for independent observations.
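The two figures quoted for the residuals can be related by a short calculation. Here ε denotes the fractional decrease in variance (a symbol introduced for this sketch), and the residual standard deviation σ that results is an inference from the quoted numbers, not a value stated in the text:

```latex
\begin{aligned}
\frac{\Delta\sigma}{\sigma} &= 1 - \sqrt{1 - \varepsilon} \approx \frac{\varepsilon}{2} = 0.003
\qquad (\varepsilon = 0.006), \\
\sigma &\approx \frac{0.015\,^{\circ}\mathrm{C}}{0.003} = 5\,^{\circ}\mathrm{C}.
\end{aligned}
```

On this reading, doubling the number of harmonics changes the residual standard deviation by a fraction of a percent of a value of roughly 5°C.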
 Another problem appears when we are trying to analyze short climatic records that include very large natural climatic anomalies, for example those related to very strong El Niños or volcanic eruptions. If such a phenomenon occurs near the beginning or the end of the record, the trend estimates may be significantly biased. There is no better solution of this problem than to extend the record into the past or to update it if possible. If trend estimates change when a few years of observations are added to the record, they should not be simply interpreted as climatic trends.
 Evaluation of statistical significance of the estimates is always a problem. Here we used the ordinary least squares technique for independent observations, although the observations are in fact not independent. As a result, the errors of the coefficients of model (2)–(3) are underestimated. The number of hourly observations for 50 years is very large (438,300) compared to the number of unknown parameters. But the number of independent data is approximately 4.3 times smaller than the number of hourly observations. More realistic models for lag-correlation of residuals have to be studied and taken into account to estimate the actual confidence intervals of the unknown parameters. Nevertheless, our experience shows that taking into account the autocorrelation of the residuals affects the estimated Y(t) only a little, but improves the statistics of the errors quite a bit. We would first have to estimate the non-stationary expected value and then estimate the non-stationary variance and autocorrelation. We would then use them to improve the non-stationary expected value estimates and iterate.
 Despite the above limitations, this is a powerful technique that can be used for traditional and satellite-derived climatic records. The technique presented in this paper was used by Vinnikov and Grody in their recent analysis of the satellite-observed trend in globally averaged mean tropospheric temperature, and we look forward to its application to many other climatological time series.
 We thank Abram Kagan, Semyon Grodsky, and Anandu Vernekar for useful discussions, and an anonymous reviewer for valuable suggestions. Meteorological data were selected and kindly provided by NCDC/NOAA scientists. This work was supported by NOAA grants NAO6GPO403 and NA17EC1483.