Most observations of (near-) surface meteorological variables like temperature, precipitation and wind velocity, particularly long time series, are obtained at surface weather stations, that is at fixed locations in space. From these measurements, selective information is obtained on intrinsically two- or three-dimensional fields that vary on diverse spatial scales. Statistical methods have thus been widely applied for characterising this spatial variability (e.g. Huff and Shipp, 1969; Bacchi and Kottegoda, 1995; Gunst, 1995; Grimes and Pardo-Iguzquiza, 2010). In particular, knowledge of the spatial autocorrelation of the variables is important for many applications. For example, it is essential for many modelling techniques that are applied for the interpolation of the meteorological fields in space and time (Baigorria et al., 2007; Grimes and Pardo-Iguzquiza, 2010; Baigorria and Jones, 2010). Moreover, it is required for data assimilation and can be helpful in subseasonal climate forecasting (Koster et al., 2008). For paleoclimate studies dealing with the reconstruction of a climate parameter from a proxy archive, the spatial correlation of the reconstructed variable can be used to assess the spatial representativeness of the archive. Spatial correlation of wind damages should be accounted for in the construction of insurance policies (De Silva et al., 2008). Finally, spatial correlations have to be taken into account when exploring regional trends in surface climate (Gunst, 1995; Douglas et al., 2000).
Precipitation has been the focus of many studies dealing with spatial correlation patterns in surface fields, since, on the one hand, it is an important input parameter for several applications, e.g. in hydrology and, on the other hand, exhibits a very large variability in space, demanding the application of advanced statistical methods (Grimes and Pardo-Iguzquiza, 2010). Typically, correlation-length scales are larger for steady precipitation related to the passage of synoptic-scale low-pressure systems than for showers and convective events (Huff and Shipp, 1969). In mid-latitudes, this leads to larger correlation lengths in winter than in summer (Baigorria et al., 2007). Furthermore, the spatial coherency increases with increasing accumulation period of the precipitation (Bacchi and Kottegoda, 1995). In addition to station measurements, remote sensing data have also been used as a basis for investigating the spatial autocorrelation of precipitation, particularly in the tropics (e.g. Ricciardulli and Sardeshmukh, 2002). For temperature, the variability in space usually is smaller and correlation-length scales are much larger than for precipitation (Shen et al., 2001; Koster et al., 2008). Correlation scales of near-surface wind velocity are relatively large over the ocean and for weak winds, but become smaller when wind speed increases (Brown and Swail, 1988). However, absolute values of the correlation lengths differ substantially between different measurement systems (Wylie et al., 1985), also because wind measurements are rather complicated. Over land, the correlation scales are smaller than over the ocean and depend on the complexity of the terrain (Reid and Turner, 2001).
An intrinsic feature of classical correlation analysis is that two time series are compared to each other as a whole, making the correlation coefficient sensitive to the coherency of the bulk of the data, but not that much to the extremes, since these are relatively rare (of course, this sensitivity also depends on the specific type of correlation statistics). Nevertheless, for several of the applications mentioned above, e.g. related to paleoclimatology, insurance issues or climate change, it is important to know about the spatial coherency of extreme meteorological events. It is not a priori clear if these extremes have the same spatial correlation properties as the bulk of the data. Several studies addressed this issue by calculating correlation coefficients of frequencies of extreme events on a yearly or seasonal basis (e.g. Douglas et al., 2000; Gellens, 2000). In this way, the spatial coherency of the influence of a climate state on extreme weather is explored. However, it is also an interesting question in which way single events and related weather patterns are spatially coherent in a statistical sense. This question of representativeness (cf. Reid and Turner, 2001) of extreme weather events cannot be addressed using seasonal or yearly data, but must be analysed in an event-based manner. This is the focus of the present study, which investigates the representativeness of extreme precipitation, temperature and wind events at different observation sites in central Europe. Therefore, a simple measure is defined for the coherency and statistical representativeness of such events. Furthermore, possible reasons for the different coherency characteristics of the variables are explored by analysing the synoptic-scale atmospheric patterns accompanying the extreme events.
The main data base for the present investigation is a set of meteorological surface observations from Germany and Switzerland, which is presented in Section 2.1. On the basis of these data, a simple statistical method for analysing the spatial coherency of extreme events is proposed, as described in Section 2.2. Results of this method are presented in Section 3.1. Since one major motivation for this study has been to investigate the spatial representativeness of proxy data, the main reference point of the analysis is the station Trier-Petrisberg, close to the Eifel region in western Germany where several maar lakes are located that may be used as archives of extreme weather proxies in future research (Pfahl et al., 2009; Sirocko, 2009). In Section 3.2, the atmospheric conditions during extreme weather events at Trier-Petrisberg are explored with the help of meteorological reanalysis data. Finally, Section 4 summarizes the results and gives a short outlook on future activities.
2. Data and methods
2.1. Meteorological data
Meteorological surface observations from 28 weather stations in Germany and Switzerland, provided by the German weather service (DWD) and the Swiss Federal Office of Meteorology and Climatology (MeteoSwiss), were analysed in this study. The geographical positions of these stations are shown in Figure 1 and given numerically in Table I. The station density is larger in Switzerland, where the terrain is more complex. Measurements of daily mean temperature at 2 metres above ground for the years 1950–1999, daily accumulated precipitation for 1951–1997, and daily maximum wind gusts at 10 metres height during 1981–1999 were used. These periods were chosen to keep the number of missing values in the time series as small as possible. For the temperature data, not more than 0.2% missing values were allowed (except for the station Deuselbach), and the station Chaumont was not used because of too many missing data. All stations have at most one missing value in the precipitation time series. The wind gust measurements contain more gaps; thus all stations with less than 0.6% missing data were analysed (again, except for Deuselbach). The stations Château d'Oex, Chaumont, Engelberg and Segl-Maria could not be used for the wind analysis. The station Deuselbach has 1.6% missing values in the temperature and 2.2% in the gust time series. Nevertheless, it was used here since it is located rather close to the main reference station Trier-Petrisberg (at a distance of 30 km) and can thus provide important information on local-scale variability. It has to be kept in mind that the results for Deuselbach might be slightly biased by the higher frequency of missing data. All station measurements used in this study were routinely quality-controlled by MeteoSwiss and the DWD. In addition, the temperature and precipitation data from the Swiss stations (all stations south of 47.6°N) were homogenised by MeteoSwiss using a method introduced by Begert et al. (2005). However, this homogenisation is not considered essential for the present study because no trend analysis is performed here.
Table I. Geographical locations and altitudes of weather stations
In order to analyse the atmospheric conditions during extreme weather events at the reference station Trier-Petrisberg, ERA40 reanalysis data were employed (Uppala et al., 2005). These reanalyses are available with a spectral horizontal resolution of T159 and 60 vertical levels and were interpolated on a 1 × 1 degree horizontal grid. For each day with an extreme event (cf. Section 2.2) within the ERA40 period 1958–2001, six-hourly values of different atmospheric variables were extracted or calculated from the reanalysis dataset, and composite maps for all extreme days (i.e. averages over all six-hourly fields during these days) were compiled. This was done for the following variables: sea level pressure, temperature and geopotential height on the 850 hPa pressure level, and potential vorticity (PV) on the 310 and 320 K isentropes.
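The compositing step described above can be sketched as follows. This is a minimal illustration, not the processing chain actually used: the function names, the representation of each six-hourly field as a nested list, and the dictionary keyed by day are all assumptions made for the example.

```python
def composite_mean(fields):
    """Element-wise mean of equally shaped 2-D grids (lists of rows)."""
    if not fields:
        raise ValueError("no fields to average")
    n_rows, n_cols = len(fields[0]), len(fields[0][0])
    total = [[0.0] * n_cols for _ in range(n_rows)]
    for grid in fields:
        for i in range(n_rows):
            for j in range(n_cols):
                total[i][j] += grid[i][j]
    return [[value / len(fields) for value in row] for row in total]


def composite_for_days(six_hourly, extreme_days):
    """Average all six-hourly grids that fall on the given extreme days.

    six_hourly: dict mapping a day key to the list of grids for that day
    (hypothetical data layout chosen for this sketch).
    """
    selected = [grid for day in extreme_days for grid in six_hourly.get(day, [])]
    return composite_mean(selected)


# Toy example with a 1 x 2 grid; day 3 is not an extreme day and is ignored.
six_hourly = {
    1: [[[1.0, 2.0]], [[3.0, 4.0]]],
    2: [[[5.0, 6.0]]],
    3: [[[100.0, 100.0]]],
}
composite = composite_for_days(six_hourly, [1, 2])
```

Every six-hourly field enters the average with equal weight, so days with fewer available fields contribute less to the composite.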
2.2. Analysis of simultaneous extreme events
The main objective of this study has been the identification of a simple measure for the statistical representativeness and spatial coherency of extreme events. Hence, the first step was to define a criterion of what shall be considered as extreme. This criterion should be rather general and not depend on a specific statistical distribution, enabling the measure of coherency to be used with many different data sets and variables. Therefore, the extreme events were defined as exceedance of a percentile-based threshold (Salinger and Griffiths, 2001; Klein Tank et al., 2006; Donat et al., 2010). This threshold was calculated separately for each station by computing sample percentiles for every year within the analysis period and then averaging these yearly percentiles over the whole period. In this way, relatively robust thresholds can be obtained that are not as much influenced by single extreme events (e.g. a long-lasting heat wave) as whole-sample percentiles. Furthermore, the thresholds are nonparametric and automatically take the varying climate conditions at the different stations into account. On the basis of these thresholds, the following weather extremes were defined:
hot day: daily average temperature exceeds its mean yearly 99% percentile,
cold day: daily average temperature is lower than its mean yearly 1% percentile,
heavy precipitation: daily precipitation exceeds its mean yearly 99% percentile,
strong gust: daily gust peak is larger than its mean yearly 99% percentile.
With these definitions, the number of extreme temperature and precipitation events at each station was in the order of 150–250; due to the shorter analysis period, the number of wind speed extremes was around 80.
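The threshold definition can be illustrated with a short sketch. It assumes linear interpolation between order statistics for the sample percentile (the text does not specify the estimator), and the function names and the use of None for missing values are hypothetical choices for this example.

```python
from collections import defaultdict
import math


def percentile(values, q):
    """Sample percentile (0 <= q <= 100), linear interpolation between order statistics."""
    xs = sorted(values)
    if not xs:
        raise ValueError("empty sample")
    pos = (len(xs) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def mean_yearly_percentile(years, values, q):
    """Average of the per-year q-th percentiles; missing values (None) are skipped."""
    by_year = defaultdict(list)
    for year, value in zip(years, values):
        if value is not None:
            by_year[year].append(value)
    yearly = [percentile(vs, q) for vs in by_year.values()]
    return sum(yearly) / len(yearly)


def extreme_days(values, threshold, upper=True):
    """Indices of days above (upper=True) or below (upper=False) the threshold."""
    if upper:
        return [i for i, v in enumerate(values) if v is not None and v > threshold]
    return [i for i, v in enumerate(values) if v is not None and v < threshold]


# Toy series: two "years" of five days each.
years = [2000] * 5 + [2001] * 5
values = [0, 1, 2, 3, 4, 10, 11, 12, 13, 14]
median_threshold = mean_yearly_percentile(years, values, 50)
```

With real data, q = 99 would give the thresholds for hot days, heavy precipitation and strong gusts, and q = 1 with upper=False the threshold for cold days.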
In order to explore the spatial coherency of these extreme events, a method was used similar to what was termed conditional probability by Ricciardulli and Sardeshmukh (2002), who explored the spatial scales of tropical convection. A reference station was selected, and all days when an extreme event occurred at this station were considered further. For all other stations in the dataset, called target stations in the following, the relative fraction f of extreme events occurring simultaneously with an event at the reference station was used as a measure for the spatial coherency. For the calculation of this fraction, not only events on the same day were taken into account (in contrast to Ricciardulli and Sardeshmukh, 2002), but a temporal shift of plus or minus two days at the target station was allowed. This accounts for the fact that there may have been some lag in the occurrence of the events on the synoptic spatial scales considered here, e.g. due to the movement of a low-pressure system. Thus, f at each target station is given by f = Nt/Nr, where Nr is the number of extreme events at the reference station and Nt the number of events at the target station occurring simultaneously (within a time window of plus or minus two days) with an event at the reference station. In other words, f gives the proportion of the weather extremes at the reference station that also affected the target station and thus directly measures the representativeness of these events.
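A minimal sketch of this calculation is given below. It assumes that event days are given as integer day indices and that a reference event counts as matched if at least one target event lies within the window, which keeps f between 0 and 1; the exact counting convention is not spelled out in the text, and the function name is hypothetical.

```python
def coherency_fraction(ref_event_days, target_event_days, window=2):
    """Fraction of reference-station events with a target-station event
    within +/- `window` days (days given as integer indices)."""
    if not ref_event_days:
        raise ValueError("no events at the reference station")
    target = set(target_event_days)
    hits = 0
    for day in ref_event_days:
        if any((day + shift) in target for shift in range(-window, window + 1)):
            hits += 1
    return hits / len(ref_event_days)


# Toy example: two of three reference events have a target event within 2 days.
f_example = coherency_fraction([10, 20, 30], [11, 25, 32])

# The measure is not symmetric under exchange of the two stations.
f_ab = coherency_fraction([10, 20], [10])
f_ba = coherency_fraction([10], [10, 20])
```

In the toy example, swapping reference and target changes the result from 0.5 to 1.0, purely because the normalisation uses the number of events at the reference station.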
It should be noted here that f is not symmetric under exchange of the reference and target stations. One reason for this is the normalisation with Nr (note that the total number of extreme events varies from station to station owing to their definition based on mean yearly and not whole-sample percentiles). Another reason is that the clustering of events in time may differ between the two stations, and due to the introduction of the time window at the target station this influences the number of events counted as simultaneous. Because of the time window and the low number of extremes, a simple binary correlation between the occurrence of events at different stations (cf. Moron et al., 2007) is not an appropriate alternative to the calculation of f, since this binary correlation would be totally dominated by non-extreme days.
For analysing the statistical significance of a certain value of f being different from zero, a simple test was applied, assuming the days with extreme events at the reference station to be statistically independent from each other. In this case, and with the null hypothesis of no correlation between extreme events at two stations, the allocation of events at the stations was modelled with a binomial distribution. The success probability p was calculated by dividing the number of all days with or close to (within plus and minus two days around) an extreme event at the target station by the length of the time series. For the hot (cold) extremes, not the whole time series, but only the three summer (winter) months were considered in the calculation of p. The p-value of the statistical test was then read from the binomial distribution function. In the following, a significance level of 1% is used.
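The test can be sketched as follows, under two assumptions that go beyond the text: the p-value is taken as the one-sided upper tail P(X >= k) of the binomial distribution, and days are integer indices from 0 to n_days - 1. Function names are hypothetical. For the hot and cold extremes, n_days and the event list would be restricted to the summer or winter months.

```python
from math import comb


def success_probability(target_event_days, n_days, window=2):
    """Share of days that lie on or within +/- `window` days of a target-station event."""
    covered = set()
    for day in target_event_days:
        for shift in range(-window, window + 1):
            if 0 <= day + shift < n_days:
                covered.add(day + shift)
    return len(covered) / n_days


def binomial_p_value(n_hits, n_ref, p):
    """One-sided P(X >= n_hits) for X ~ Binomial(n_ref, p)."""
    return sum(
        comb(n_ref, k) * p**k * (1.0 - p) ** (n_ref - k)
        for k in range(n_hits, n_ref + 1)
    )


# Toy example: two adjacent target events in a 100-day series cover days 3..8.
p_success = success_probability([5, 6], 100)
# Probability of at least 4 matches among 10 independent reference events.
p_val = binomial_p_value(4, 10, p_success)
significant = p_val < 0.01
```

The direct summation is adequate for event counts of a few hundred, as in this study; much larger samples would call for a numerically more careful evaluation.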
When using this test, the assumption of statistically independent events at the reference station is critical and often not fulfilled. In particular for the temperature data, the time series certainly are autocorrelated, i.e. there is clustering of the extreme events. The simplest form of autocorrelation in this case was avoided by using only one season of the year for the calculation of p, but this may not be sufficient. Nevertheless, developing a more exact test would be relatively complex, as the statistical distribution in the case of clustering is not known. Moreover, even when using a nonparametric test (e.g. a form of bootstrapping), this test would have to be adapted for each meteorological variable, because these variables have different autocorrelation properties. Since the intention of this study is to provide a relatively simple measure for spatial coherency of extreme events, such a test was not pursued here; instead, the test described above was applied to give an indication of the statistical significance. It should be kept in mind that, if the significance is critical for some future application of the method, a more advanced test might have to be developed.
As mentioned in Section 2.1, the analyses performed in this study are based on daily meteorological data. In order to investigate the sensitivity of the results with respect to this time scale (and thereby the maximum duration of the extreme events), the representativeness f of events at the main reference station Trier-Petrisberg was not only determined based on the original daily time series, but also using temporally smoothed data. Two sensitivity experiments were performed, applying a two- and three-day running mean filter, respectively, to the time series before the calculation of f.
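The smoothing step can be sketched as below. The alignment of the averaging window is not specified in the text; this example assumes a simple window over consecutive days with no padding at the ends, so the output is shorter than the input by width - 1 values. The function name is hypothetical.

```python
def running_mean(values, width):
    """Running mean over `width` consecutive values (no padding at the ends)."""
    if width < 1 or width > len(values):
        raise ValueError("width must be between 1 and len(values)")
    window_sum = sum(values[:width])
    out = [window_sum / width]
    for i in range(width, len(values)):
        # Slide the window by one day: add the new value, drop the oldest.
        window_sum += values[i] - values[i - width]
        out.append(window_sum / width)
    return out


two_day = running_mean([1, 2, 3, 4], 2)
three_day = running_mean([3, 6, 9], 3)
```

The smoothed series would then be passed through the same threshold and coherency calculations as the daily data.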
Finally, the impact of the choice of the reference station on the spatial representativeness of extreme events was explored. First, the spatial coherency analysis was also performed using station Säntis as reference, which is relatively distinct from Trier-Petrisberg as it is a high-elevation station (2502 m asl) in alpine terrain (Figure 1 and Table I). Second, a measure for the mean representativeness of extremes at an arbitrary reference station was obtained by averaging f from all target stations in a given distance range. A range of 100–300 km was chosen, and the average frequency fav100–300 was calculated separately for each measurement station as reference.
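The distance-range average can be sketched as follows. The great-circle (haversine) distance on a sphere of radius 6371 km is an assumption, since the text does not state how station distances were computed; station names, coordinates and f values in the example are invented for illustration.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def mean_f_in_range(ref, coords, f_values, d_min=100.0, d_max=300.0):
    """Mean of f over target stations whose distance from `ref` lies in [d_min, d_max] km.

    coords: station name -> (lat, lon); f_values: target name -> f with `ref` as reference.
    Returns None if no target station falls into the range.
    """
    ref_lat, ref_lon = coords[ref]
    selected = [
        f
        for name, f in f_values.items()
        if name != ref
        and d_min <= great_circle_km(ref_lat, ref_lon, *coords[name]) <= d_max
    ]
    return sum(selected) / len(selected) if selected else None


# Hypothetical stations: B is about 111 km from A, C about 11 km, D about 556 km.
coords = {"A": (50.0, 7.0), "B": (51.0, 7.0), "C": (50.1, 7.0), "D": (55.0, 7.0)}
f_from_a = {"B": 0.8, "C": 0.9, "D": 0.2}
f_av = mean_f_in_range("A", coords, f_from_a)
```

Repeating the call with each station in turn as reference gives one average value per station.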
3. Results and discussion
3.1. Coherency of meteorological extremes
In Figure 2, the spatial coherency of meteorological extreme events at the reference station Trier-Petrisberg (white circle) is visualized by plotting the relative fraction of simultaneous extremes f at each target station. Target stations where f is not statistically significant, using the test described in Section 2.2, are marked with a red outer circle. Figure 2(a) shows that hot days at Trier-Petrisberg often occur simultaneously with hot days at other stations in Germany and Switzerland. f is larger than 0.8 at 5 stations in western Germany and between 0.6 and 0.8 for most of the other stations. Only at 5 alpine and pre-alpine stations in eastern Switzerland and southern Germany, f is lower than 0.6. A gradual decrease of f with distance from Trier-Petrisberg can be observed at most of the German stations; in the southern regions, where terrain complexity is larger (cf. Figure 1), this decrease also seems to be more complex and steep. For the cold extremes (Figure 2(b)), the fraction of simultaneous extremes is even larger than for hot days at most of the stations. More than 80% of the coldest days in Trier-Petrisberg occur at the same time as cold days at all other German stations. In Switzerland, f is between 0.4 and 0.9, with the lowest values occurring at the high-elevation station Säntis and at the station Sion in the inner mountain valley. With respect to precipitation extremes, Figure 2(c) indicates a much lower spatial coherency. Only at the station Deuselbach, which is located at a distance of just 30 km from Trier-Petrisberg, f is larger than 50%. Values of f range from 0.2 to 0.4 at the 4 other stations in western Germany and are lower than 0.2 elsewhere. For several stations in northeastern Germany and Switzerland, the values of f are not statistically significant. 
Finally, the fraction of simultaneous wind gusts with gust events at Trier-Petrisberg is in between the values of f for temperature and precipitation extremes at most of the stations, as shown in Figure 2(d). Values range from 0.4 to 0.6 at many stations in central Germany and two stations in Switzerland. For most of the other, more distant stations, f is between 0.2 and 0.4. Only at Sion and at the southern alpine station Lugano, it is below 20%, with a non-significant value at the latter. This indicates that the alpine ridge is a relatively strong barrier for the simultaneous occurrence of heavy windstorms.
The spatial coherency of the extreme events shown in Figure 2 is related to the spatial correlation scales of the underlying meteorological fields analysed in other studies (Section 1). For temperature, correlation scales are very large, and the extreme events at the reference station are representative for a large region in central Europe. Precipitation, on the contrary, has small correlation-length scales and only a very local representativeness of extreme events. Wind velocity correlation and representativeness scales fall in between. When only stations in relatively smooth terrain are considered, the distance from the reference station mainly controls the relative fraction of simultaneous extremes. This is shown in Figure 3(a), where f is plotted against the distance from Trier-Petrisberg. Here only stations north of 48°N are taken into account, where the terrain is less complex than in the more southerly pre-alpine and alpine region. For all four categories of extreme events, f decreases relatively smoothly with distance. If the more alpine stations are added to this plot (not shown), there is much more scatter in the curves, indicating that local effects apart from distance play a larger role. By fitting the curves in Figure 3(a) to some kind of analytical decay function, it would, in principle, be possible to derive a spatial representativeness scale for each of the extreme events (cf. Ricciardulli and Sardeshmukh, 2002). However, since geographical factors other than distance also seem to play an important role in complex terrain, the low number of stations used in this study and their inhomogeneous geographical distribution do not allow such a length scale to be calculated objectively. In particular for precipitation, many more stations in the local surroundings of the reference station would be required (cf. Baigorria et al., 2007). This may be a starting point for future research.
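Purely as an illustration of what such a fit could look like, the sketch below estimates a length scale L from an exponential decay f(d) = a exp(-d/L) by ordinary least squares on ln f. The exponential form, the function name and the synthetic data are assumptions; no such fit was performed in this study.

```python
import math


def fit_exponential_decay(distances_km, f_values):
    """Least-squares fit of ln f = ln a - d / L; returns (a, L). Points with f <= 0 are skipped."""
    points = [(d, math.log(f)) for d, f in zip(distances_km, f_values) if f > 0]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points with f > 0")
    mean_d = sum(d for d, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((d - mean_d) ** 2 for d, _ in points)
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_d
    return math.exp(intercept), -1.0 / slope


# Synthetic data generated from a = 0.9 and L = 250 km.
distances = [0.0, 100.0, 200.0, 300.0]
fractions = [0.9 * math.exp(-d / 250.0) for d in distances]
a_fit, length_scale = fit_exponential_decay(distances, fractions)
```

With only a handful of irregularly distributed stations, the uncertainty of such a fitted scale would be large, which is the limitation noted above.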
Figure 3(b) shows f of precipitation extremes separately for the summer and winter half-year (here, the percentiles for the definition of the extremes also have been separately calculated). Similar to the spatial correlation of precipitation (Baigorria et al., 2007), f usually is smaller in summer than in winter, owing to the greater importance of local convective (heavy) precipitation in mid-latitude summer compared to winter, when (heavy) precipitation is more often triggered by synoptic-scale cyclones and their attendant frontal systems.
An interesting difference between extreme precipitation and windstorm events is obvious from Figures 2 and 3. At the station Deuselbach, that is on the local scale, f is similar for both variables, showing that about 40–50% of both heavy precipitation and heavy wind gusts at Trier-Petrisberg are related to very local events, e.g. thunderstorms, that affect only the reference station. However, for the remaining part of the events the spatial coherency is rather different: the heavy gusts have a larger spatial representativeness than the precipitation events, leading to different slopes of f in Figure 3(a). This difference in slopes is more pronounced than in standard correlation plots of the time series (not shown) and seems to be specifically related to extreme events.
The spatial representativeness of extreme events at Trier-Petrisberg is almost unchanged if not the original daily time series, but data smoothed with a two- or three-day running mean are used for the calculation of f. The absolute differences between the values of f obtained after such a smoothing and the original values for heavy precipitation, hot and cold days are smaller than 0.1 almost everywhere (there is only one station where this difference reaches 0.12 for hot days). This shows that the results presented for these variables are not very sensitive to the time scale on which the extreme events are identified. Note, however, that based upon the daily measurement time series used in this study, this aspect cannot be investigated for sub-daily time scales. For gust peaks, values of f from the smoothed time series typically are slightly larger than from the daily data, with a maximum difference of 0.19. From a physical point of view such a smoothing of gust-speed time series is not very meaningful though, because windstorms typically do not last longer than a few hours.
In order to explore which effect the choice of the reference station has, Figure 4 shows maps of f for the reference station Säntis. The largest differences in the magnitude of f compared to Figure 2 occur with respect to cold extremes. While cold days at Trier-Petrisberg are representative for a very large region, values of f with regard to Säntis are higher than 60% at only 5 target stations. For the three other extremes, the geographical patterns of f are shifted from Trier-Petrisberg to Säntis with more or less comparable magnitudes. Of course, some modifications occur related to local surroundings; for example, the effect of the Alpine ridge on gust representativeness is still visible. Also, for wind gusts the area with f larger than 40% is slightly smaller for Säntis than for Trier-Petrisberg, reflecting the influence of surface elevation and complexity on the occurrence and spatial coherency of windstorms (cf. Reid and Turner, 2001). Nevertheless, the gust representativeness of this very remote alpine station still is surprisingly high.
Figure 5(a) shows the average representativeness fav100–300 of cold days at each station (using this station as reference and all others within a distance of 100–300 km as target stations, as described in Section 2.2) as a function of station altitude. There is a strong relationship between the two quantities, in particular for the stations at higher altitudes; the Spearman rank correlation coefficient is −0.71 (Table II). The correlation with latitude is even slightly more significant; note however that station latitude and altitude are highly interrelated in our case. The difference in altitude is the plausible cause for the discrepancy between cold day representativeness of Trier-Petrisberg and Säntis (Figures 2(b) and 4(b)). The important role of station altitude indicates that the vertical temperature distribution considerably influences the occurrence of cold extremes at the surface. For example, temperature inversions with stronger cooling at low altitudes, e.g. in valleys, are typical for cold winter days. For hot days and heavy precipitation events, the differences in fav100–300 between the stations are relatively minor, and only a very small north–south gradient with slightly lower values in the south can be identified (not shown). This gradient in fav100–300 is more pronounced for wind gusts, as shown in Figure 5(b). Here, the station latitude seems to be the most important factor (Spearman correlation coefficient of 0.89, compared to −0.57 for altitude; Table II). However, this conclusion has to be handled with care, since there are only few stations within the Alps and specifically south of the Alpine ridge, and station latitude and altitude are strongly related within this dataset, as mentioned above. Nevertheless, Figure 5(b) shows that there is a clear difference between windstorm representativeness in the northern German plains and in the more complex terrain in the south.
Table II. Spearman rank correlation coefficients of fav100–300 with reference station altitude and latitude
Correlation with altitude
Correlation with latitude
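For reference, a Spearman rank correlation of the kind reported in Table II can be computed as sketched below: both variables are converted to ranks (ties receive the average rank) and the Pearson correlation of the ranks is taken. The function names and the example numbers are hypothetical and are not the station data of this study.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Invented example: representativeness decreasing monotonically with altitude.
altitude_m = [100, 300, 500, 1500, 2500]
f_av_cold = [0.85, 0.80, 0.78, 0.60, 0.40]
rho = spearman(altitude_m, f_av_cold)
```

Because only ranks enter the calculation, the coefficient measures monotonic rather than strictly linear association.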
3.2. Atmospheric conditions during extremes at Trier-Petrisberg
In the following section, the atmospheric conditions during extreme events at Trier-Petrisberg are explored in order to identify coherent atmospheric structures related to these events. Figure 6(a) shows a composite map of temperature on 850 hPa and sea level pressure for hot days at the station Trier-Petrisberg. A tongue of warm air reaching from northwestern Africa over the western Mediterranean to western and central Europe can be identified in the figure. This warm tongue is related to two shallow low-pressure systems centred over France and Spain that drive northward advection on their eastern side. The low-pressure anomalies are embedded within high-pressure values near the Azores and over the Baltic Sea, which are close to climatology (cf. Kållberg et al., 2005). The low-pressure systems most probably are not heat lows, but of dynamical origin, because there are related anomalies in the geopotential height on 850 hPa (not shown) and in the potential vorticity on 320 K, as shown in Figure 6(b). Just to the west of France and Spain and to the west of the low-pressure centres, a trough with high upper-tropospheric PV values is located, which also drives warm air advection towards central Europe. Also, there is a pronounced ridge with low PV values located over Europe, indicating that there has been southerly advection of air throughout the whole troposphere.
In Figure 7, the same composite maps as in Figure 6 (except that the PV is displayed on 310 K) are shown for cold days at Trier-Petrisberg. Here, a cold tongue is located over central to northeastern Europe. A low-pressure system over Italy and the Mediterranean contributes to easterly advection over Europe. The cold air temperatures in winter over the Eurasian continent induce high pressure values in this region, which also extend westwards over central Europe in Figure 7(a). In the PV field shown in Figure 7(b), a strong Rossby wave pattern can be identified, with low PV values over the eastern North Atlantic and very high values over northeastern Europe, enforcing northerly winds over central Europe. This field is very well aligned with the 850 hPa temperature, showing that the anomalies again extend throughout the troposphere and that there is a strong dynamical upper-level forcing. In summary, Figures 6 and 7 show that extreme temperature events at Trier-Petrisberg are related to well defined and coherent atmospheric patterns, and that large-scale advection and dynamical forcing play an important role for these events. This is a major reason for the large spatial representativeness of the extreme events described in Section 3.1.
With the help of the dynamical aspects of temperature extremes at Trier-Petrisberg described here, it is also possible to better understand the geographical patterns of spatial representativeness of the events shown in Figure 2. For example, the higher values of f in northern and eastern Germany for cold days than for hot days are related to the fact that the cold air typically is advected from the east to Trier-Petrisberg, covering also the respective target stations. On the contrary, the warm air coming from the south does not necessarily extend further to the northeast than Trier-Petrisberg. However, with this argumentation the relatively low values of f at the southern stations on hot days (Figure 2(a)) are more difficult to interpret. Possibly, the air temperature anomaly in the warm tongue may not always be extreme at the more southerly stations. Moreover, there might be some additional local effects during summer, e.g. connected to land-atmosphere feedbacks (cf. Fischer et al., 2007), that modify the impact of large-scale forcing on temperature extremes in a way that the spatial representativeness of hot days is reduced.
For extreme precipitation events at Trier-Petrisberg, no clear pattern can be identified in the atmospheric composites (not shown). Only a small negative PV anomaly on 320 K over Great Britain gives some indication of a coherent upper-tropospheric forcing of the events. This lack of pronounced large-scale signature relates directly to the low spatial representativeness of the heavy rain events (Section 3.1). This is different from the results of Martius et al. (2006), who found a consistent relationship between southern alpine heavy precipitation events and elongated PV streamers. However, these authors defined their extreme events based on area mean and not single-station precipitation. In this way, only events with a certain spatial extent and, thus, a more large-scale character were selected.
Days with strong gusts at Trier-Petrisberg are not connected to clear signals in the temperature and PV fields either, but a pronounced sea level pressure anomaly is obvious from the composite maps, as shown in Figure 8. The high-pressure values near the Azores are again close to climatology, whereas the low pressure centred between Scotland and Norway is about 20 hPa lower than climatology (cf. Kållberg et al., 2005). This leads to a relatively strong pressure gradient over Europe even in this composite. Accordingly, windstorms at Trier-Petrisberg are often caused by surface cyclones over the North Sea and the associated frontal systems and pressure gradients. These results are in agreement with the more detailed, weather type-based analysis of European windstorms by Donat et al. (2010). Most probably, these large-scale storms are the reason for the weak decline of the frequency of simultaneous extreme gusts with distance shown in Figure 3(a).
In this study, a simple measure for the spatial coherency and representativeness of extreme meteorological events has been proposed. On the basis of a nonparametric, percentile-based definition of extremes, the relative frequency of simultaneous events at two stations has been used as an indicator of their coherency. This approach has the advantage that it can be applied to various datasets, independent of the statistical distribution. Moreover, its application is straightforward and does not require any complicated statistical transformations. However, this simplicity also entails certain disadvantages with respect to some statistical details. The measure of coherency f is not totally symmetric under exchange of reference and target station (in contrast to a standard correlation coefficient), and the significance test suggested here does not account for autocorrelation in the time series. For the results obtained in this study, statistical significance has not been the focus of the interpretation, because the time series are rather long and because the values of f appeared to be very high for temperature extremes in particular, where autocorrelation is most pronounced. For future applications, a more advanced test might have to be developed.
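As an illustration of the measure summarised above, the following minimal Python sketch computes a coherency value from two daily time series. It is not the code used in this study (the analyses were carried out in R). It assumes that f is the fraction of extreme days at the reference station on which the target station also records an extreme, and that extremes are values above an empirical percentile of each station's own series. The function names `percentile_threshold` and `coherency`, the default percentile of 99 and the linear interpolation rule are choices made for this sketch only.

```python
from typing import Sequence


def percentile_threshold(values: Sequence[float], q: float) -> float:
    """Empirical q-th percentile (0-100), linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("empty time series")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def coherency(reference: Sequence[float], target: Sequence[float], q: float = 99.0) -> float:
    """Assumed definition of f: share of reference-station extremes that are
    also extremes at the target station on the same day.

    Both series must cover the same days in the same order. An extreme is a
    value strictly above the station's own q-th percentile (use the negated
    series for lower-tail extremes such as cold days).
    """
    if len(reference) != len(target):
        raise ValueError("series must have equal length")
    ref_thr = percentile_threshold(reference, q)
    tgt_thr = percentile_threshold(target, q)
    ref_extreme = [x > ref_thr for x in reference]
    tgt_extreme = [x > tgt_thr for x in target]
    n_ref = sum(ref_extreme)
    if n_ref == 0:
        return float("nan")  # no extremes at the reference station
    n_both = sum(r and t for r, t in zip(ref_extreme, tgt_extreme))
    return n_both / n_ref
```

Under these assumptions, the asymmetry mentioned above arises when the two stations do not have the same number of extreme days, for example because of ties in the data. With the synthetic series `a = [0]*95 + [1, 2, 3, 4, 5]` and `b = [0]*90 + [1]*10`, `coherency(a, b, q=90)` gives 1.0, whereas `coherency(b, a, q=90)` gives 0.5.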
The highest spatial representativeness, also in relatively complex terrain, has been found for temperature extremes, analogously to the large spatial correlation scales of temperature (e.g. Wulfmeyer and Henning-Müller, 2006). This high coherency is related to a strong, large-scale atmospheric forcing typically triggering the extreme events. Heavy wind gusts sometimes are of local character, but those affecting more than one station have spatial representativeness scales comparable to the temperature extremes. Again, this relates to a large-scale atmospheric forcing, in this case by mid-latitude cyclones over the North Sea. The spatial coherency of precipitation extremes is low, corresponding to the small spatial correlation-length scales found in other studies (e.g. Bacchi and Kottegoda, 1995). Accordingly, no coherent large-scale atmospheric pattern connected to these events could be identified. Whereas the average representativeness of hot days and heavy precipitation events does not vary a lot between the stations, reference station altitude and terrain complexity have a clear impact on the representativeness of cold days and windstorms, respectively.
With respect to the main motivation for this study, an assessment of the spatial representativeness of proxy reconstructions of extreme events, these results have several implications. On the one hand, the reconstruction of temperature extremes from an archive in central Europe would be representative of a large area. This conclusion is not only based on the statistical evaluation (modern analogue concept), but also on the physical argument that large-scale advection is the most important process triggering such extremes. Thus, it is expected to hold for past climates as well. On the other hand, reconstructed precipitation extremes would reflect only very local conditions. In order to get a more complete picture of past heavy precipitation changes, a set of spatially distributed proxy archives would be required. For wind speed extremes, the representativeness of reconstructions would be intermediate, as outlined above.
In future research, the method proposed here may be applied to different datasets. In particular, it would be worthwhile to use a set of rain gauges with a much higher density in space in order to analyse the representativeness of precipitation extremes on small scales. With the help of denser station networks, it should be possible to properly define a representativeness-length scale. In addition, the present method can also be employed for other data types, e.g. satellite data or gridded data from models or reanalyses. Finally, more advanced statistical methods for the spatial dependence of extreme events, e.g. applying multivariate statistics, as suggested by Bodini and Cossu (2010), may be used for comparative studies.
We thank the German Weather Service DWD and the Swiss Federal Office of Meteorology and Climatology MeteoSwiss for providing access to meteorological measurements and ECMWF analysis data. Collaboration and discussions with Frank Sirocko were helpful for motivating this project. The analyses and graphics for this study were produced using the software package R. We are grateful to Christoph Frei for giving us access to his R plotting routine for geographical data. Parts of this study were funded by the Earth System Science Research Centre ‘Geocycles’ at the University of Mainz.