### Abstract

We propose the use of Fourier series for representing the precipitation regime at a given location and predicting it at ungauged locations, which allows maps to be produced. We analyse monthly average precipitation data from 2043 gauging stations covering the Italian territory. The Fourier series represents a curve as a sum of sinusoidal components, each characterized by its period, amplitude and phase. Because the different harmonics are uncorrelated, they can be fitted with stepwise multiple linear regressions. The Fourier series gives a parsimonious representation of the regime, as the 12- and 6-month harmonics are usually able to reproduce the observed values with small residuals [in this exercise the fitting gave an average monthly root mean square error (RMSE) of 9.21 mm and a correlation coefficient of 0.979]. Once the at-station harmonic parameters are obtained, they can be mapped to predict the regime at ungauged locations. Here we use ordinary kriging and a leave-one-out validation scheme to evaluate the amplitudes and phases of the 12- and 6-month harmonics and to reconstruct the precipitation regime. We use the same scheme to interpolate the station data on a month-by-month basis, and use those results as a benchmark. The two analyses provide similar results, with overall RMSEs of 17.53 and 15.97 mm and correlation coefficients of 0.909 and 0.921, respectively. The spatial patterns of the reconstruction error are similar in the two cases. The stations with higher RMSE are clustered in areas with high precipitation gradients, such as the Apennines, or where major changes in precipitation regime occur. To demonstrate that the Fourier series approach is more suitable for regionalization purposes, a *k*-means cluster analysis was performed on the Fourier parameters, and the effect of this stratification on the mapping of the precipitation regime by regression kriging was assessed. Copyright © 2010 Royal Meteorological Society

### 1. Introduction

The problem of assessing the spatial behaviour of variables measured at a limited number of locations, mostly at the point scale, is of major interest in several branches of science. Advances in computer science and geographical information system tools, together with the increased availability of spatial data, have facilitated and stimulated research in this field.

Although the most common techniques were developed several decades ago, the discussion on definitions, ranges of applicability and standards for assessing the quality of the estimates and for representing the results is still open and productive. For instance, only recently did Hengl *et al.* (2007) demonstrate that universal kriging and regression kriging (RK) are mathematically equivalent; moreover, authors using RK commonly do not report the relative weight of the two components in the final result, or use predictors that, although statistically significant, are not meaningful for the process under investigation.

Large-scale spatial variability of climatic variables, such as temperature or rainfall, has recently received considerable attention (e.g. Zheng and Basher, 1996; Nalder and Wein, 1998; Prudhomme and Reed, 1999).

In several studies, the main effort is spent on maximizing the efficiency of statistical spatial interpolation techniques (e.g. Nalder and Wein, 1998; Lapen and Hayhoe, 2003). However, approaches that investigate the role of a given predictor in the spatial variability of a climatic variable have the advantage of providing greater insight into the processes controlling the phenomenon; moreover, the physical consistency of such estimates allows potentially spurious samples in the base data to be recognized, which purely statistical estimates could not detect. Considering, for instance, the studies on air temperature (see e.g. Zheng and Basher, 1996; Agnew and Palutikof, 2000; Ninyerola *et al.*, 2000; Gyalistras, 2003; Claps *et al.*, 2008), the authors start from the assumption that elevation and latitude already explain much of its spatial variability and then evaluate other factors that can have a significant influence on it, including the position of the site with respect to seas and continents and, at small scales, terrain attributes (aspect and morphology), atmospheric factors (humidity, precipitation and wind) and maritime factors (configuration and aspect of coasts and effects of sea currents).

When dealing with climatic regimes, here meant as the sequence of the average monthly values of a climatic variable, it is usual to process the data for each month separately. As a result, a separate model is defined for each month (see e.g. Zheng and Basher, 1996; Attorre *et al.*, 2008; Fiorenzo *et al.*, 2008). The separate monthly models, although able to represent the data thoroughly, have the major drawback of not being consistent within the regime, e.g. they may use different explanatory variables in the regression models or different semi-variogram models from one month to the next.

To overcome this problem, Claps *et al.* (2008) proposed representing the temperature regime at gauged locations by means of Fourier series and obtaining its spatial representation by interpolating the Fourier series parameters through regression on geographic and morphometric covariates. The amplitudes and phases of the 12- and 6-month harmonics used in that study were able to represent the regime and showed clear relations with the proposed covariates. The mean root mean square error (RMSE) was 0.53 °C, which is comparable to the results obtained by Attorre *et al.* (2008) and Fiorenzo *et al.* (2008) for Italy and Basilicata, respectively.

The aim of this article is to highlight the effectiveness of the Fourier series approach for representing the regime of climatic and environmental variables in general, and of precipitation in particular, and as a support for map production.

After presenting the Fourier series framework and providing some hints on its usage, we evaluate its capabilities based on the rainfall regime of 2043 gauging stations in Italy.

The Fourier series potential in fitting the at-station observed regime is first assessed. We present the classification of the Italian precipitation regime as obtained by means of unsupervised clustering of the Fourier parameters.

We evaluate the effectiveness of the Fourier series approach in the framework of the estimation of the precipitation regime in ungauged locations. The monthly estimates obtained by reconstructing the series after interpolating the Fourier parameters by means of ordinary kriging and the leave-one-out scheme are compared with those obtained when processing the monthly data directly within the same interpolation scheme.

Finally, we investigate the possibility of using geographic and morphologic variables extracted from a digital elevation model within regression and RK frameworks, and we assess how far the stratification based on the regions previously obtained improves the performance of such schemes.

### 2. Methods

The curves representing the regime of climatic variables can be reproduced by means of Fourier series as the sum of sinusoidal curves having different periods:

- (1)

where *j* = month of the year (1, …, 12); *A*_{0} = mean of *V*(*j*); τ (= 12) = period of the cycle; *T*_{i} = period of the *i*^{th} harmonic; *A*_{Ti} = amplitude of the *i*^{th} harmonic; and ϕ_{Ti} = phase of the *i*^{th} harmonic.
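As a minimal sketch, one harmonic form consistent with the symbols defined above is written out below. The cosine convention, the sign of the phase, the relation $T_i = \tau / i$ and the auxiliary coefficients $a_{T_i}$ and $b_{T_i}$ are assumptions made for illustration, not necessarily the exact notation of Equation (1).

```latex
\begin{aligned}
V(j) &= A_0 + \sum_{i=1}^{n} A_{T_i} \cos\!\left( \frac{2\pi j}{T_i} - \phi_{T_i} \right),
  \qquad T_i = \frac{\tau}{i} \\
     &= A_0 + \sum_{i=1}^{n} \left[ a_{T_i} \cos\!\left( \frac{2\pi j}{T_i} \right)
        + b_{T_i} \sin\!\left( \frac{2\pi j}{T_i} \right) \right], \\
a_{T_i} &= A_{T_i} \cos \phi_{T_i}, \qquad b_{T_i} = A_{T_i} \sin \phi_{T_i}, \\
A_{T_i} &= \sqrt{a_{T_i}^2 + b_{T_i}^2}, \qquad
\phi_{T_i} = \operatorname{atan2}\!\left( b_{T_i},\, a_{T_i} \right)
\end{aligned}
```

With $n = 2$ this gives the 12- and 6-month harmonics. The second line is linear in $a_{T_i}$ and $b_{T_i}$, which is why the parameters can be estimated by linear regression. Under this sign convention the phase, once expressed in months, marks the month at which the harmonic peaks.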

The phases are now represented in radians. It is possible to obtain their values in months as:

- $\phi_{T_i}^{\mathrm{(months)}} = \phi_{T_i}\,\dfrac{T_i}{2\pi}$ (4)

As an example, Figure 1 shows the *A*_{0} component and the 12- and 6-month harmonics obtained by fitting the precipitation regime of the gauging station located in Potenza. The regime reconstructed with these two harmonics and the observed data are shown in Figure 2.
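The at-station fit can be sketched as an ordinary least-squares problem on cosine and sine terms. The sketch below assumes the convention V(j) = A0 + Σ A_i cos(2π j / T_i − ϕ_i) with j = 1, …, 12; the function name `fit_harmonics` and the synthetic regime are hypothetical and are not the Potenza data.

```python
import numpy as np

def fit_harmonics(values, periods=(12.0, 6.0)):
    """Least-squares fit of a mean plus one harmonic per period.

    Assumed convention (not taken from the article):
    V(j) = A0 + sum_i A_i * cos(2*pi*j/T_i - phi_i), with j = 1..len(values).
    Returns A0, the amplitudes and the phases expressed in months.
    """
    values = np.asarray(values, dtype=float)
    j = np.arange(1, values.size + 1)
    columns = [np.ones(values.size)]
    for period in periods:
        angle = 2.0 * np.pi * j / period
        columns += [np.cos(angle), np.sin(angle)]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    a0, a, b = coef[0], coef[1::2], coef[2::2]
    amplitudes = np.hypot(a, b)                       # A = sqrt(a^2 + b^2)
    phases_rad = np.arctan2(b, a) % (2.0 * np.pi)     # phase in [0, 2*pi)
    phases_months = phases_rad * np.asarray(periods) / (2.0 * np.pi)
    return a0, amplitudes, phases_months

# Synthetic monthly regime (mm), invented purely to exercise the function.
months = np.arange(1, 13)
synthetic = (80.0
             + 30.0 * np.cos(2.0 * np.pi * (months - 2.0) / 12.0)
             + 10.0 * np.cos(2.0 * np.pi * (months - 1.5) / 6.0))
a0, amplitudes, phases_months = fit_harmonics(synthetic)
```

Because the cosine and sine columns of the two harmonics are mutually orthogonal over twelve equally spaced months, each harmonic can be estimated independently of the others.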

Once the Fourier parameters are evaluated, it is worth performing some checks on the values obtained. First, any negative amplitude can be made positive by shifting the phase of that harmonic by half a period. It is then necessary to verify that the phase lies between zero and the period of the harmonic, correcting it if necessary by adding or subtracting one period.
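These two checks can be sketched as follows, assuming phases expressed in months and a cosine harmonic of the form A cos(2π (j − ϕ) / T). Both function names are hypothetical.

```python
import numpy as np

def harmonic(j, amplitude, phase_months, period):
    """Evaluate one harmonic under the assumed cosine convention."""
    return amplitude * np.cos(2.0 * np.pi * (j - phase_months) / period)

def normalise_harmonic(amplitude, phase_months, period):
    """Return an equivalent (amplitude, phase) with amplitude >= 0
    and phase wrapped into [0, period)."""
    if amplitude < 0:
        # -A*cos(x) equals A*cos(x - pi): flip the sign, shift half a period.
        amplitude = -amplitude
        phase_months = phase_months + period / 2.0
    # Adding or subtracting whole periods leaves the curve unchanged.
    phase_months = phase_months % period
    return amplitude, phase_months
```

The normalised pair describes exactly the same curve as the original one, so the step changes only the parameter values that are later interpolated, not the reconstructed regime.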

When working on datasets that include several sampling points to be used for regional studies or interpolation, a further check on the phases is needed to ensure that their distribution is well represented within the lower and upper bounds mentioned above. Once the histogram is plotted, the data may turn out to be split across the bounds, with one part of the distribution close to zero and the other close to the upper bound. In such cases it is possible to set a threshold value falling in an interval where no data are located and to transform the values lower than the threshold by adding one period, which gives a better representation of the distribution. As an example, in Figure 3 the distribution of the phase of the 12-month harmonic is transformed from the initial one, bounded between zero and 2π, to a new one more suitable for further processing.
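The threshold shift can be sketched as below. Choosing the threshold as the midpoint of the widest empty interval is an assumption of this sketch (the text only requires that it fall where no data are located), and both function names are hypothetical.

```python
import numpy as np

def widest_gap_threshold(phases):
    """Midpoint of the widest interval between consecutive sorted phases.

    Only gaps inside the sample are considered; if the data are not split
    across the bounds, no shift is needed in the first place.
    """
    ordered = np.sort(np.asarray(phases, dtype=float))
    k = int(np.argmax(np.diff(ordered)))
    return 0.5 * (ordered[k] + ordered[k + 1])

def shift_phases(phases, period, threshold):
    """Add one period to every phase lower than the threshold."""
    phases = np.asarray(phases, dtype=float)
    return np.where(phases < threshold, phases + period, phases)

# Invented phases (months) of a 12-month harmonic, split across the bounds.
phases = np.array([0.3, 0.8, 11.2, 11.6, 10.9])
threshold = widest_gap_threshold(phases)
shifted = shift_phases(phases, 12.0, threshold)
```

After the shift the values form a single compact group, which is easier to interpolate than two groups sitting at opposite ends of the interval.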

Once the at-site estimates of the Fourier parameters are obtained, they can be used to produce the monthly precipitation maps. In particular, we decided to apply ordinary kriging within a leave-one-out validation scheme. The estimates obtained for the Fourier series parameters are used to reconstruct the monthly precipitation values through Equation (1). The same scheme has been applied to the observed data to build 12 separate models of the monthly normals, which serve as a benchmark in the analysis. Both sets of estimates are compared with the observed values through the RMSE and the correlation coefficient (*R*). The indices are calculated both station by station and month by month, and are reported as maps and tables, respectively.
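The validation scheme described above can be sketched with a bare-bones ordinary kriging solver. The exponential variogram and its fixed sill and range are assumptions of this sketch (in practice the variogram is fitted to the data, and the model used in the article is not stated here), and all function names are hypothetical.

```python
import numpy as np

def exponential_variogram(h, sill=1.0, range_=1.0):
    """Assumed variogram model without nugget; parameters are placeholders."""
    return sill * (1.0 - np.exp(-np.asarray(h, dtype=float) / range_))

def ordinary_kriging(xy, z, target, variogram=exponential_variogram):
    """Ordinary kriging estimate at one target point."""
    xy, z = np.asarray(xy, dtype=float), np.asarray(z, dtype=float)
    n = len(z)
    system = np.ones((n + 1, n + 1))
    system[:n, :n] = variogram(np.linalg.norm(xy[:, None] - xy[None, :], axis=2))
    system[n, n] = 0.0                      # Lagrange multiplier block
    rhs = np.ones(n + 1)
    rhs[:n] = variogram(np.linalg.norm(xy - np.asarray(target), axis=1))
    weights = np.linalg.solve(system, rhs)[:n]   # weights sum to one
    return float(weights @ z)

def leave_one_out(xy, z, variogram=exponential_variogram):
    """Estimate each station from all the others."""
    xy, z = np.asarray(xy, dtype=float), np.asarray(z, dtype=float)
    estimates = np.empty(len(z))
    for k in range(len(z)):
        keep = np.arange(len(z)) != k
        estimates[k] = ordinary_kriging(xy[keep], z[keep], xy[k], variogram)
    return estimates

def rmse(observed, estimated, axis=None):
    """RMSE; for (stations, months) arrays, axis=1 is per station, axis=0 per month."""
    diff = np.asarray(observed, dtype=float) - np.asarray(estimated, dtype=float)
    return np.sqrt(np.mean(diff ** 2, axis=axis))

def correlation(observed, estimated):
    """Pearson correlation coefficient R over all values."""
    return float(np.corrcoef(np.ravel(observed), np.ravel(estimated))[0, 1])
```

In the Fourier approach `leave_one_out` would be run once per parameter (the mean and the amplitude and phase of each harmonic) and the monthly values reconstructed afterwards, whereas the benchmark runs it once per month on the observed normals.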

To demonstrate the suitability of the Fourier series approach for regionalization studies, we present the results of a *k*-means unsupervised classification based on the Fourier parameters. The clusters are compared with prior climatic and geographic knowledge.
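A minimal *k*-means sketch on a table of Fourier parameters (one row per station) is given below. The standardisation step, the deterministic farthest-point initialisation and the function names are choices of this sketch, not details taken from the article.

```python
import numpy as np

def standardise(features):
    """Scale each column to zero mean and unit variance (assumes no constant column)."""
    x = np.asarray(features, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0)

def kmeans(features, k, n_iter=100):
    """Plain Lloyd iterations; returns labels and cluster centres."""
    x = np.asarray(features, dtype=float)
    # Deterministic start: first row, then repeatedly the row farthest
    # from the centres chosen so far.
    centres = [x[0]]
    for _ in range(1, k):
        dist = np.min([np.linalg.norm(x - c, axis=1) for c in centres], axis=0)
        centres.append(x[int(np.argmax(dist))])
    centres = np.array(centres)
    for _ in range(n_iter):
        dist = np.linalg.norm(x[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        updated = np.array([
            x[labels == c].mean(axis=0) if np.any(labels == c) else centres[c]
            for c in range(k)
        ])
        if np.allclose(updated, centres):
            break
        centres = updated
    return labels, centres
```

Averaging the monthly regimes of the stations in each cluster then gives one mean curve per region. Since phases are periodic, they should be checked for a split across the bounds before being used as clustering features.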

In a previous work (Claps *et al.*, 2008), the parameters of the thermal regime proved to be well correlated, within a linear regression scheme, with geographical and morphological indices extracted from a digital elevation model (DEM), namely elevation (*Z*), the geometric average of the distance from the sea in the eight cardinal directions (*M*), an exposure index based on the direction and distance from the closest sea coast (*E*) and a measure of terrain concavity (IC), together with the two location variables, latitude (Lat) and longitude (Lon), of the sampling points. Such an approach would also be helpful for precipitation. However, defining morphological indices that are meaningful over a large spatial domain is a critical issue that goes beyond the aim of this article. The independent variables proposed by Claps *et al.* (2008) have been used within an RK interpolation scheme, in combination with the stratification provided by the regime classification, to assess the effectiveness of that classification in improving the reconstruction of the regime at ungauged locations. We expect the stratification to play a major role in the selection of the most suitable covariates at the regional scale, i.e. combining the Lat and Lon variables through a proper set of coefficients will make it possible to mimic the effect of the distance of the stations from the most relevant moisture source in a given region.

#### 2.1. Available data

The analyses were carried out on a dataset of 2043 gauging stations. A large part of the data was made available by the Central Office for Agricultural Ecology (UCEA) within a coordinated collection of its own stations and former National Hydrographic Survey (SIMN) stations, for a total of 1437 locations. An additional 983 stations, partly overlapping the UCEA ones, were made available within national research projects. The data have been harmonized by discarding the overlapping stations for which the normals had been calculated over shorter periods.

Unfortunately, the data do not cover the Italian territory evenly: part of northern Italy is less densely covered and the Sardinia Region is almost ungauged (two stations covering 24 000 km^{2}).

With respect to elevation, the distribution is quite even, apart from the mountainous part of the Italian territory, where the station density is lower (Table I). Unfortunately, the higher spatial variability of precipitation in such areas will produce higher estimation errors.

Table I. Distribution of the gauging stations of the database with elevation and related percentages of the area of the Italian territory

| Elevation (m a.s.l.) | % Stations | % Area |
|---|---|---|
| *E* < 100 | 20 | 23 |
| 100 < *E* < 800 | 64 | 54 |
| 800 < *E* < 1200 | 11 | 11 |
| *E* > 1200 | 5 | 12 |

### 4. Conclusions

The precipitation regime in ungauged locations reconstructed by means of the interpolation of the two harmonics Fourier series parameters derived from observed data has been demonstrated to be as accurate as the one obtained through the application of the month-by-month estimation based on observed data, with average RMSEs of 17.53 and 15.97 mm and correlation coefficients of 0.909 and 0.921, respectively.

The Fourier series approach allows the regime to be reproduced in a more compact and consistent way and reduces the complexity of the geostatistical problem by cutting the number of kriging operations from 12 (the monthly precipitation values) to 5 (the average and the amplitudes and phases of the Fourier series harmonics).
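How five interpolated parameter maps yield twelve monthly maps can be sketched as follows. The sketch assumes phases expressed in months and the cosine convention A cos(2π (j − ϕ) / T); the function name and the grid values in the usage example are hypothetical.

```python
import numpy as np

def reconstruct_regime(a0, amp12, phase12, amp6, phase6):
    """Rebuild 12 monthly fields from five parameter fields of equal shape.

    Returns an array of shape (12, *a0.shape); index 0 is month 1.
    """
    a0 = np.asarray(a0, dtype=float)
    # Month axis first, then singleton axes so it broadcasts over the grid.
    months = np.arange(1, 13).reshape((12,) + (1,) * a0.ndim)
    h12 = np.asarray(amp12) * np.cos(2.0 * np.pi * (months - np.asarray(phase12)) / 12.0)
    h6 = np.asarray(amp6) * np.cos(2.0 * np.pi * (months - np.asarray(phase6)) / 6.0)
    return a0 + h12 + h6

# Invented 2 x 3 grid of parameter values, for illustration only.
shape = (2, 3)
maps = reconstruct_regime(
    a0=np.full(shape, 80.0),
    amp12=np.full(shape, 30.0), phase12=np.full(shape, 3.0),
    amp6=np.full(shape, 10.0), phase6=np.full(shape, 1.5),
)
```

Because both harmonics average to zero over the twelve months, the annual mean of the reconstructed maps equals the interpolated *A*_{0} field.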

Moreover, it allows a fast and meaningful delineation of homogeneous regions by means of simple unsupervised clustering of its parameters, and a compact characterization of the precipitation regime, as highlighted by the representation of the mean curves of the regions.

The regions obtained have been used for the stratification of the dataset within a multiple linear regression approach based on geographic and morphologic variables. Although the regression models were outperformed by ordinary kriging and RK, it was possible to appreciate the effectiveness of the stratification in improving their performance and to assess the efficiency of the independent variables in the different regions.

The approach proposed here can be applied to a broad range of problems. We have successfully tested it on the representation of the regime of other climatic variables, as well as of monthly runoff and of vegetation phenology as captured by remote sensing. It makes it possible to simplify the study of complex processes, emphasize their main features and identify the relationships within complex systems.

Some caution is needed when dealing with datasets that lack a marked seasonal behaviour, as in regions with an almost flat precipitation regime. We would discourage the use of this approach when the signal has impulsive characteristics, as in regions with a significant number of months with zero precipitation.