Applying stochastic small-scale damage functions to German winter storms



[1] Analyzing insurance-loss data, we derive stochastic storm-damage functions for residential buildings. At the district level we fit power-law relations between daily loss and maximum wind speed, typically spanning more than 4 orders of magnitude. The estimated exponents for 439 German districts range roughly from 8 to 12. In addition, we find correlations among the parameters and socio-demographic data, which we employ in a simplified parametrization of the damage function with just 3 independent parameters for each district. A Monte Carlo method is used to generate loss estimates and confidence bounds of daily and annual storm damages in Germany. Our approach reproduces the annual progression of winter storm losses and enables the estimation of daily losses over a wide range of magnitudes.

1. Introduction

[2] A storm-damage function describes losses as a function of observable meteorological parameters, typically maximum wind speed. For winter storms occurring in central Europe, several storm-damage functions for residential buildings are described in the literature. The reinsurance company Münchener Rückversicherungs-Gesellschaft [1993, 2001] found a power-law damage function of maximum wind speed with varying exponents of roughly 3 as well as 4–5, depending on the storm event and country being analyzed. Klawa and Ulbrich [2003] proposed a power-law damage function with exponent 3, refined by Donat et al. [2011], using excess wind speed over a threshold instead of absolute maximum wind speed. Similarly, Heneka and Ruck [2008] used a power-law damage-propagation function of excess wind speed with an exponent of either 2 or 3, assuming proportionality to the force or the kinetic energy of the wind, respectively. Both groups define the threshold wind speed as the empirical 98th percentile of the wind distribution. For the Netherlands, Dorland et al. [1999] derived a damage function for residential property that can be reformulated as a power law of maximum wind speed with exponent 0.5. When comparing these studies with literature on hurricane losses in the United States (see Watson and Johnson [2004] for an overview), one must be aware of the many differences in building structure and the nature of the hazard. However, following an approach similar to that of this article, Huang et al. [2001] describe an exponential damage model for residential property in the Southeastern United States based on 10-min-averaged wind speed.

[3] Our work is based on daily insurance-loss data (years 1997–2007) with a regional resolution of administrative districts. From theoretical considerations we propose a stochastic power-law damage function depending on maximum daily wind speed to describe empirical losses. We find exponents typically ranging from 8 to 12. Statistical deviations are modeled by a spatially correlated stochastic variable drawn from a log-normal distribution. Correlations among parameters and with socio-demographic data are exploited to reduce the number of independent parameters to three per district. The model quality is assessed by out-of-sample calculations based on Monte Carlo simulations of losses in daily and annual resolution. We demonstrate good agreement between annual model results and empirical values, albeit observing a small, potential underestimation of high losses. For the majority of districts we find high correlations between annual loss estimates and data. For the three most severe storms, predicted absolute daily losses in Germany agree well with empirical values across 4 orders of magnitude.

[4] This article is structured as follows: After a brief discussion of data, we describe motivation and details of the damage function in section 3. A simplified parametrization of the damage function is demonstrated in section 4. Finally, we present modeled loss estimates and close with the discussion of our results in sections 5 and 6, respectively.

2. Data

[5] Insurance-loss data from the years 1997 to 2007 were provided by the German Insurance Association (Gesamtverband der Deutschen Versicherungswirtschaft e.V., GDV). The data comprise daily losses due to wind and hail for 439 German administrative districts. To eliminate economic influences such as growing market penetration and price effects, loss data were divided by the total insured value of each district to obtain a dimensionless loss ratio. Further description of the loss data is given by Donat et al. [2011]. As the empirical loss data do not differentiate between wind and hail damages, we limit the scope of the analysis to the winter months from October through March, during which damages are predominantly driven by high winds.

[6] Data of daily maximum wind speed (3 s gust wind) are publicly available and were obtained from the German Weather Service (Deutscher Wetterdienst, DWD) for 78 wind stations across Germany. Wind stations lacking more than 5n measurements for a sampling period of n years were discarded. Typically, measurements were taken at a height of 10 m above ground.

3. Motivation of the Damage Function

[7] The aim of this section is to derive a stochastic small-scale damage function that, for each district i and on a daily basis, relates the loss ratio D_i (recorded storm loss over insured value) to the maximum wind speed v_i. As all calculations are performed at the district level, the subscript i will be omitted for simplicity.

[8] A damage function should naturally have a sigmoid shape with steep initial increase and saturation at large wind speeds. Such growth processes are often modeled by a logistic function

\[ d(x) = \frac{d_{\max}}{1 + e^{-c x}} \tag{1} \]

where d_max is the asymptotic upper bound and the exponent c determines the steepness of the function. We apply the transformation x = ln(v/b_v), with maximum daily wind speed v scaled by the local constant b_v. Taking the logarithm reduces the broadness and skewness of the distribution of daily maximum wind speeds and ensures that lim_{v→0} d(v) = 0. Since recorded data show that d ≪ d_max for Germany, d(v) can be approximated as

\[ d(v) \approx \left( \frac{v}{b} \right)^{c} \tag{2} \]

where constants were combined into the new scaling parameter b ≈ b_v d_max^(−1/c).
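The quality of this approximation can be checked numerically. The following minimal sketch assumes that the logistic function has the standard form d = d_max / (1 + exp(−c x)) with x = ln(v/b_v); all parameter values are hypothetical and chosen only for illustration. Under this assumption the ratio of the power law to the logistic curve is 1 + (v/b_v)^c, so the relative error stays small as long as v is well below b_v.

```python
import math

def logistic_damage(v, d_max, b_v, c):
    """Assumed logistic curve in x = ln(v / b_v): d = d_max / (1 + exp(-c * x))."""
    x = math.log(v / b_v)
    return d_max / (1.0 + math.exp(-c * x))

def power_law_damage(v, d_max, b_v, c):
    """Small-loss approximation d = (v / b)**c with b = b_v * d_max**(-1/c)."""
    b = b_v * d_max ** (-1.0 / c)
    return (v / b) ** c

# Hypothetical parameter values (not taken from the fitted model).
D_MAX, B_V, C = 0.5, 60.0, 10.0

for v in (10.0, 20.0, 30.0, 40.0):
    exact = logistic_damage(v, D_MAX, B_V, C)
    approx = power_law_damage(v, D_MAX, B_V, C)
    rel_err = (approx - exact) / exact
    print(f"v = {v:4.0f} m/s  logistic = {exact:.3e}  "
          f"power law = {approx:.3e}  relative error = {rel_err:.2%}")
```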

[9] Figure 1a shows the empirical loss data for an arbitrarily chosen district. By inspection, the logarithmically binned data reveal a strong increase for wind speeds above approximately 13 m s−1 and an approximately constant regime for lower wind speeds. To capture this behavior, an additional constant offset a is introduced, giving

\[ d(v) = \left( \frac{v}{b} \right)^{c} + a \tag{3} \]

Calculating the residuals inline image between empirical data and d(v), we find an approximately log-normal distribution of residuals with nearly constant scale parameter σ. For simplicity, we utilize this finding for modeling statistical deviations ϵ and hence describe losses via a stochastic variable

display math

where inline image represents the log-normal distribution and μ(v) = ln d(v). Here μ(v) and σ are the mean and standard deviation, respectively, of the natural logarithm of the variable.
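As an illustration of this stochastic description, the sketch below draws loss ratios from a log-normal distribution whose log-mean is ln d(v). It assumes the deterministic curve has the form d(v) = a + (v/b)^c, as implied by the offset and power-law terms described above; the parameter values are hypothetical. Because the median of a log-normal variable equals exp(μ), the sample median should lie close to d(v).

```python
import math
import random
import statistics

def damage_curve(v, a, b, c):
    """Deterministic part, assumed as d(v) = a + (v / b)**c."""
    return a + (v / b) ** c

def sample_loss(v, a, b, c, sigma, rng):
    """One draw from a log-normal with mu = ln d(v) and scale parameter sigma."""
    mu = math.log(damage_curve(v, a, b, c))
    return rng.lognormvariate(mu, sigma)

# Hypothetical parameter values for a single district.
A, B, C, SIGMA = 1e-7, 60.0, 10.0, 1.0

rng = random.Random(42)
v = 30.0  # maximum daily wind speed in m/s
draws = [sample_loss(v, A, B, C, SIGMA, rng) for _ in range(20000)]
median_draw = statistics.median(draws)

print(f"d(v)          = {damage_curve(v, A, B, C):.3e}")
print(f"sample median = {median_draw:.3e}")
```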

Figure 1.

Example of damage function and occurrence probability for an arbitrary district. (a) The damage function d(v) is plotted against the maximum daily wind speed v. Confidence bounds of ±2σ are shown by dashed lines. Grey points represent daily loss data. (b) The fitted occurrence probability p(v) is shown. Binned empirical data, shown as circles, are given as reference only.

[10] So far the analysis accounts for the loss intensity given a loss event, leaving aside the probability of an event. An empirical occurrence rate of loss events (Figure 1b) was calculated from linearly binned binary data, where a loss event was coded as ‘1’ and days without loss as ‘0’. While the empirical occurrence rate is approximately 1 at high v, it drops to a constant base rate for v → 0. Ideally, the occurrence rate could be derived from Dϵ(v) as the probability of exceeding a certain loss threshold. We were not able to identify such a threshold via censored-regression modeling and hence chose to fit the data with an empirical occurrence-probability function

display math

with base probability (1 − α), shift β, and slope γ. Multiplying Dϵ(v) with a stochastic weight function w(v) based on p(v), we obtain the complete stochastic damage function

display math

Maximum-likelihood estimation was applied to calculate the parameters of Dϵ(v) in an iterative process, alternating between computing parameters a, b, c while keeping the scale parameter σ constant and vice versa (see the pseudo-likelihood algorithm by Ruppert et al. [2003]). A least-squares approach was used to fit the parameters of p(v).
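The interplay of occurrence probability and loss intensity can be sketched as follows. The functional form of p(v) used here, a sigmoid rising from (1 − α) to 1 with shift β and slope γ, is an assumption made for illustration and may differ from the fitted function; the weight w(v) is modeled as a Bernoulli variable that equals 1 with probability p(v). All parameter values are hypothetical.

```python
import math
import random

def occurrence_probability(v, alpha, beta, gamma):
    """Assumed sigmoid: tends to (1 - alpha) at low v and to 1 at high v."""
    return 1.0 - alpha / (1.0 + math.exp(gamma * (v - beta)))

def sample_daily_loss(v, params, rng):
    """One draw of w(v) * D_eps(v), where w is 1 with probability p(v), else 0."""
    p = occurrence_probability(v, params["alpha"], params["beta"], params["gamma"])
    if rng.random() >= p:
        return 0.0  # no loss event on this day
    d = params["a"] + (v / params["b"]) ** params["c"]
    return rng.lognormvariate(math.log(d), params["sigma"])

# Hypothetical parameter set for a single district.
PARAMS = {"a": 1e-7, "b": 60.0, "c": 10.0, "sigma": 1.0,
          "alpha": 0.9, "beta": 15.0, "gamma": 0.8}

rng = random.Random(0)
losses = [sample_daily_loss(30.0, PARAMS, rng) for _ in range(1000)]
share_nonzero = sum(1 for x in losses if x > 0.0) / len(losses)
print(f"share of days with loss at v = 30 m/s: {share_nonzero:.3f}")
```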

[11] As some wind stations may not be representative for a given district, the wind station featuring the highest predictive power was chosen from the 5 wind stations closest to the geographical center of the district. The coefficient of determination for non-linear regression models, generalized R2, was chosen as a measure of predictive power. For the given shape of the damage curve d(v), R2 values related to nearby wind stations indicate the level of variance inherent to the specific combination of district loss and wind data. Due to the high level of statistical deviation around d(v), low R2 scores would be expected for any smooth damage curve. In fact, all estimated R2 scores lie within the interval [0.2, 0.6], with an average of 0.42. High R2 is seen for north-western coastal regions, which often experience high winds. Regions with an R2 score of 0.4 and below largely coincide with German low mountain ranges (Mittelgebirge) and the southern alpine border. Best scores are hence generally obtained for regions with homogeneous elevation and high frequency of strong winds.
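The selection step can be illustrated with a small sketch. As a stand-in for the generalized R2 of the full damage function, it scores each candidate station by the ordinary R2 of a straight-line fit of ln(loss) on ln(wind speed); this simplification, the station labels, and the toy data are assumptions made for illustration only.

```python
import math

def r_squared_loglog(winds, losses):
    """R^2 of a least-squares line of ln(loss) on ln(wind); a stand-in score."""
    xs = [math.log(v) for v in winds]
    ys = [math.log(d) for d in losses]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy * sxy / (sxx * syy)

def best_station(candidates, losses):
    """Return the id of the candidate station with the highest score."""
    return max(candidates, key=lambda sid: r_squared_loglog(candidates[sid], losses))

# Toy data: daily maximum wind speeds (m/s) at two hypothetical stations.
candidates = {
    "A": [12.0, 16.0, 20.0, 25.0, 30.0],
    "B": [14.0, 13.0, 22.0, 21.0, 33.0],
}
# District losses constructed to follow station A exactly as a power law.
losses = [(v / 60.0) ** 10 for v in candidates["A"]]

print(best_station(candidates, losses))
```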

[12] The spatial distribution of the exponent c estimated for all German districts is shown in Figure 2. We find a slightly right-skewed distribution with mean 9.8. Of all values, 80% are contained within the interval from 8.3 to 11.8. Values of 15 and beyond can be conceived as outliers, occurring in districts where wind measurements insufficiently differentiate losses even at high wind speeds. Geographically, values of c below 10 predominate in Western, Central, and Northern Germany, while values above 10 are most often found across Southern Germany and the southern districts of East Germany.

Figure 2.

Spatial distribution of exponent c and DWD wind stations. The color code indicates the local values of c, summarized in the histogram inset. Markers indicate DWD wind stations that were used for calculations or excluded due to inhomogeneities or missing data.

[13] Our analysis is based on the assumption that maximum wind speed is the dominating criterion for the occurrence and severity of storm damages. It was not feasible to quantify the effects from other potential factors (e.g., storm duration, precipitation, or turbulent winds). However, the presence of systematic large-scale deviations should be reflected in spatial correlations of the statistical deviations ϵ. In fact, calculations of Spearman's correlation coefficient from normalized residuals inline image showed significant spatial rank correlations between districts, ranging from −0.30 to 0.67. While insignificant for the estimation of loss in single districts, these correlations must be accounted for when spatially accumulating loss across Germany. In order to reproduce the spatial correlations during the Monte Carlo calculations, the empirically estimated rank correlations were enforced on the random deviations ϵ of Dϵ. The algorithm was implemented as follows:

[14] 1. Determine pairwise Spearman's correlation coefficients ρi,j of inline image between every possible combination of districts and thus populate matrix inline image.

[15] 2. Determine the nearest positive-definite correlation matrix M using the algorithm derived by Higham [2002].

[16] 3. Use the iterative procedure by Iman and Conover [1982] to create spatially correlated random deviations ln(ϵ).
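The rank-reordering idea behind step 3 can be sketched for the simplest case of two districts. The code below is a reduced illustration in the spirit of Iman and Conover [1982], not the full iterative procedure: independent log-deviations are rearranged so that their ranks follow a pair of correlated reference scores, which leaves each marginal distribution unchanged. The target value, the sample size, and the use of normal reference scores are assumptions. Note that the resulting Spearman coefficient is somewhat lower than the Pearson correlation imposed on the reference scores.

```python
import math
import random

def ranks(values):
    """Ranks 0..n-1 of the values (no ties expected for continuous draws)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result

def spearman(x, y):
    """Spearman's rank correlation as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mean = (len(x) - 1) / 2.0
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var = sum((a - mean) ** 2 for a in rx)
    return cov / var

def reorder_like(sample, reference):
    """Rearrange `sample` so that its ranks match those of `reference`."""
    sorted_sample = sorted(sample)
    return [sorted_sample[r] for r in ranks(reference)]

rng = random.Random(1)
n, target = 2000, 0.6  # hypothetical sample size and reference correlation

# Independent log-deviations ln(eps) for two districts.
eps1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
eps2 = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Correlated reference scores (2x2 Cholesky factor written out by hand).
z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
z2 = [target * a + math.sqrt(1.0 - target ** 2) * rng.gauss(0.0, 1.0) for a in z1]

eps1_c = reorder_like(eps1, z1)
eps2_c = reorder_like(eps2, z2)

print(f"rank correlation before: {spearman(eps1, eps2):+.3f}")
print(f"rank correlation after:  {spearman(eps1_c, eps2_c):+.3f}")
```

For more than two districts, the reference scores would be generated from a factorization of the full matrix M obtained in step 2.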

[17] We assume two main processes giving rise to the statistical deviations seen in Figure 1a. Firstly, the correlation between wind-speed measurements at separate sites is known to decrease significantly with growing distance. To assess the significance of this effect on small scales, we compared two closely situated wind stations within the same district (Berlin Tempelhof and Berlin Tegel, distance ≈ 11 km). From the empirical distribution we estimate that 75% of statistical deviations lie within the interval [−1.5 m s−1, 1.4 m s−1], while roughly 5% exceed [−3 m s−1, 2.9 m s−1]. Hence, a significant part of the observed deviations may be attributed to this source of error. Secondly, insurance data may be subject to statistical fluctuations caused by incorrect or delayed reporting of losses. However, we expect that for large losses the latter errors are small and negligible.

4. Parametric Simplification

[18] In order to simplify the parametrization of the damage model, we identified global statistical relationships and reduced the number of local fitting parameters. As additional predictors we used the number of residential buildings per district h, the long-term damage rate δ defined as the share of days with recorded damages during the observation period, and the wind speed ν = b a^(1/c) at the intersection of the constant a and the power-law term in d(v). The raw data for the 439 districts and the corresponding least-squares fits are shown in Figures 3a–3c. Parameters a, α, and β could hence be replaced with the fitted global relationships

display math
display math
display math

Intuitively, the inverse proportionality between loss offset a and number of buildings h (equation (7a)) follows from the definition of the loss ratio, defined as the absolute loss divided by the insured value, since the insured value scales linearly with the total number of houses. This suggests a common minimum noise level for all districts. Furthermore, the approximate direct proportionality between ν and β in equation (7c) hints at a common threshold that separates the regime of noise at lower wind speeds from storm-driven losses at high wind speeds. In line with this proposition, we interpret (1 − α) as the probability of a random loss event in the noisy regime of the curve. Accordingly, equation (7b) shows that the regional differences of the long-term damage rate δ are dominated by random loss events. The remaining third parameter of p(v), γ, could furthermore be replaced by its mean value over all districts, inline image. As p(v) generally increases rapidly from (1 − α) to 1, results were insensitive to the error induced by this replacement.
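The crossover wind speed ν used as a predictor above follows directly from equating the offset a with the power-law term (v/b)^c. A minimal numerical check, with hypothetical parameter values, is given below.

```python
def crossover_wind_speed(a, b, c):
    """Wind speed nu at which the power-law term (v / b)**c equals the offset a."""
    return b * a ** (1.0 / c)

# Hypothetical parameter values for a single district.
A, B, C = 1e-7, 60.0, 10.0

nu = crossover_wind_speed(A, B, C)
power_term = (nu / B) ** C

print(f"nu = {nu:.2f} m/s")
print(f"power-law term at nu = {power_term:.3e} (offset a = {A:.3e})")
```

Below ν the constant offset dominates d(v), while above ν the power-law term takes over.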

Figure 3.

Correlations among model parameters and external factors. (a–c) Scatter plots show the correlations found for model parameters a, α, and β, respectively. Dots represent individual districts and dashed lines indicate fitted curves (cf. equations (7a)–(7c)). (d) Parameter b versus the elevation (above MSL) of the wind stations used. Circles denote binned data.

[19] In summary, the above global relationships can be used to reduce the model parametrization to three local parameters (exponent c, and scaling parameters b and σ). Additionally, we observe a weak dependence of scale parameter b on the elevation of the respective wind stations above mean sea level (Figure 3d). However, it is expected that b comprises a multitude of scaling effects due to orography or land use, and that hence the altitude dependence is not sufficient for a robust approximation.

[20] In the following, all calculations are based on the full parametric model unless we refer to the reduced model.

5. Modeling Results

[21] In order to assess the predictive power of the proposed damage function, calculations of regional and country-wide loss figures were compared to empirical values. Due to the availability of only 11 years of spatially resolved loss data, an out-of-sample-test algorithm was implemented as follows:

[22] 1. Exclude year x from empirical loss data.

[23] 2. Train the storm damage model on the remaining data.

[24] 3. Predict country-wide daily and cumulated losses for year x based on daily maximum wind speeds.

[25] 4. Vary x and repeat all calculations.

[26] In order to estimate the distribution of daily losses, the Monte Carlo method was used and 500 realizations of daily loss estimates were calculated.
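The out-of-sample procedure can be sketched for a single synthetic district as follows. This is a simplified illustration under several assumptions: the training step uses a least-squares line in log-log space as a stand-in for the maximum-likelihood fit, the offset a, the occurrence probability p(v), and spatial correlations are ignored, and the data are generated from hypothetical parameter values.

```python
import math
import random
import statistics

def fit_power_law(pairs):
    """Least-squares line ln d = c * ln v + k; returns (c, k, sigma)."""
    xs = [math.log(v) for v, _ in pairs]
    ys = [math.log(d) for _, d in pairs]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    c = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    k = my - c * mx
    sigma = statistics.pstdev([y - (c * x + k) for x, y in zip(xs, ys)])
    return c, k, sigma

def simulate_annual_loss(winds, c, k, sigma, rng, n_real=500):
    """Monte Carlo: n_real realizations of the summed daily loss of one year."""
    return [sum(rng.lognormvariate(c * math.log(v) + k, sigma) for v in winds)
            for _ in range(n_real)]

def leave_one_year_out(data, rng):
    """data maps year -> [(wind, loss)]; returns year -> (q2.5, median, q97.5)."""
    results = {}
    for held_out in data:
        # Steps 1 and 2: exclude one year and train on the remaining data.
        training = [p for year, pairs in data.items() if year != held_out
                    for p in pairs]
        c, k, sigma = fit_power_law(training)
        # Step 3: predict the held-out year from its wind speeds only.
        sims = simulate_annual_loss([v for v, _ in data[held_out]], c, k, sigma, rng)
        cuts = statistics.quantiles(sims, n=40)  # cut points in 2.5 % steps
        results[held_out] = (cuts[0], statistics.median(sims), cuts[-1])
    return results

# Synthetic data from hypothetical "true" parameters, 60 loss days per year.
rng = random.Random(7)
TRUE_C, TRUE_B, TRUE_SIGMA = 10.0, 60.0, 0.8
data = {}
for year in range(1997, 2008):
    winds = [rng.uniform(8.0, 35.0) for _ in range(60)]
    data[year] = [(v, rng.lognormvariate(TRUE_C * math.log(v / TRUE_B), TRUE_SIGMA))
                  for v in winds]

results = leave_one_year_out(data, rng)  # step 4: every year is held out once
for year, (low, median, high) in sorted(results.items()):
    observed = sum(d for _, d in data[year])
    print(f"{year}: observed {observed:.2e}, median {median:.2e}, "
          f"95% bounds [{low:.2e}, {high:.2e}]")
```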

[27] Figure 4 shows daily loss predictions in Germany for the time periods around the three major storm events named ‘Lothar’ (24.-27.12.1999), ‘Jeanett’ (26.-29.10.2002), and ‘Kyrill’ (17.-19.1.2007). These storms are of particular interest, as they caused the largest insurance losses during the period under consideration. For most days empirical values lie within the uncertainty bounds of the model estimates. Peak empirical losses of storm events ‘Lothar’ and ‘Kyrill’ are contained within the 80% uncertainty bound, while ‘Jeanett’ is found in the 95% interval. The results demonstrate the model performance for predicting losses over 4 orders of magnitude.

Figure 4.

Out-of-sample calculations for daily German absolute losses during three severe winter storms (‘Lothar’, ‘Jeanett’, and ‘Kyrill’). Circles denote the median of the damage distribution and diamonds empirical values. 50%, 80%, and 95% confidence bounds are shaded from dark to light grey, respectively.

[28] Annual loss estimates during winter months are shown in Figure 5a. Regarding absolute loss figures, we estimate a very high Pearson correlation of 0.99 between the model estimates (median) and the empirical values, which indicates a good reproduction of the annual progression of empirical storm-loss data. Annual losses are dominated by the storm events ‘Lothar’, ‘Jeanett’, and ‘Kyrill’ in the years 1999, 2002, and 2007, respectively. Loss estimates for these years hence reflect the peaks seen in Figure 4. Additionally, we observe a small positive bias for years with loss ratio below 10^−4, which may be due to ignoring correlations in the estimation of p(v) (equation (5)). In total, we find approximately 12% underestimation of absolute loss accumulated over 11 years.

Figure 5.

Out-of-sample calculations for the annually accumulated loss ratio during winter months (Oct–Mar). (a) Loss estimates for Germany. Circles denote the median of the estimated damage distribution, while 50%, 80%, and 95% confidence bounds are shaded from dark to light grey, respectively. Empirical values are represented by diamonds. (b) A histogram of Pearson's correlation coefficients between annual loss estimates and empirical values for each district. Correlations above 0.6 are statistically significant. The solid and dashed lines relate to the fully parameterized and the reduced model, respectively.

[29] Figure 5b summarizes the correlation per district between the median of the annual loss estimates and the empirical values. Approximately 1/3 of all districts show high Pearson correlation coefficients above 0.9. The mean correlation over all districts is 0.74. The correlations allow for a comparison of the full model and the reduced model with only three fit parameters per district. The histogram shows an increase of correlations between 0.5 and 0.9, while the number of correlations with values above 0.9 is slightly decreased. Together with a slight increase of mean correlation to 0.76 this demonstrates the sufficiency of the three remaining fit parameters. Since both the original and the reduced model produce nearly identical quantitative loss estimates for Germany, we show results for the original model only.

6. Discussion

[30] Empirical data of daily insurance losses across German administrative districts show a strong increase of losses with maximum daily wind speed. We find that these losses are well described by power-law damage functions with regionally varying exponents that typically range between 8 and 12. For the out-of-sample calculations we generated successive parameter fits based on varying time slices of the available data. The estimated parameters were insensitive to these variations, thus demonstrating model robustness even under exclusion of the major loss events.

[31] While these results are in contrast to damage functions published before, a direct comparison of the exponents may be misleading. In fact, excess-over-threshold models, as applied by Klawa and Ulbrich [2003] and Heneka and Ruck [2008], imply a much steeper increase of loss in the threshold vicinity than pure power-law models of absolute wind speed [e.g., Münchener Rückversicherungs-Gesellschaft, 1993, 2001]. The basic conjecture of our approach is a monotonic relationship between insured loss and maximum wind speed applicable to both small and extreme storm loss, which enables us to exploit information from a wide range of recorded losses. Since we found a universal power-law increase of loss for all districts, we think that the use of damage functions with differing asymptotic shape may result in significant extrapolation error.

[32] In Figure 4 we demonstrated in an out-of-sample test that daily modeled losses across Germany closely match empirical values ranging over four orders of magnitude. Judging from the comparison of median loss estimates and empirical data, peak losses may be slightly underestimated while still being within the uncertainty bounds of the model. Next to being a purely statistical effect (e.g., insufficient length of time series), this may be due to other aspects such as underdetermination of the model based on maximum wind speed only. Where available, empirical data regarding such aspects as the temporal wind profile, storm duration, or gustiness may be used to improve loss estimation. Socio-economic effects, such as demand surges [see, e.g., Olsen and Porter, 2011], are expected to play a minor role.

[33] Inspired by other studies, the proposition of an exponential damage function was tested, but rejected due to strong overestimation of damages for large wind speeds. Bearing in mind that the damage function was fitted on the whole range of available loss data and thus not specifically calibrated to extreme losses, we conclude that the model results demonstrate good reproduction of both daily and annual extremes.

[34] Strong country-wide correlations of model parameters support the universal applicability of our damage function and permit the separation of the damage curve into an approximately constant noisy regime and a physical power-law regime. Employing these correlations, the model parametrization was successfully reduced to three independent parameters determining the basic shape of the damage curve. While the power-law exponent determines the curve's steepness, the scale parameter accounts for regional variation between districts and wind stations (e.g., distance and orography). The third parameter specifies the width of the log-normal loss distribution around the central curve and thus relates to the expected level of statistical deviations. In particular, the value of the exponent may be interpreted as an indicator for regional vulnerability to extreme winds. Its spatial distribution indicates reduced vulnerability within Western and Northern Germany. As these regions, and especially the coastal regions, are highly exposed to extreme winds, the relatively low values of the exponent suggest a greater level of adaptation to the current wind climate than for Southern Germany.

[35] All model calculations were deliberately based on raw measurements of maximum wind speed as provided by DWD. While most wind stations are known to be subject to inhomogeneities due to changes in measurement apparatus, location, or surrounding surface roughness, they may nonetheless possess predictive power for neighboring districts. Due to the selection criterion of maximizing generalized R2, wind stations with inhomogeneities causing significant additional variance were excluded. Unlike for temperature or pressure data, correction of inhomogeneities in maximum wind speed data would require case-specific non-linear transformations that are beyond the scope of this study.

[36] Additional insight, in particular regarding the significance for extreme loss modeling, could be gained from a dedicated model intercomparison on the basis of common meteorological and insurance-loss data. In further work we moreover intend to apply our model to loss data for other European countries and regions. A cross-national comparison of model parameters could enable the identification of clusters of similar vulnerability and reveal regional adaptation potential.


[37] We appreciate useful discussions with M. Boettle, E. Faust, and C. Walther. We thank the German Insurance Association (GDV) and the German Weather Service (DWD) for providing the data. This work was further supported by the German Federal Ministry for Education and Research under grant number 031SZ191B (PROGRESS) and by the Baltic Sea Region Programme 2007–2013 (BaltCICA project).

[38] The Editor wishes to thank two anonymous reviewers for their assistance evaluating this paper.