Journal of Geophysical Research: Earth Surface

Probabilistic prediction of barrier-island response to hurricanes


Corresponding author: N. G. Plant, U.S. Geological Survey, 600 4th St. South, Saint Petersburg, FL 33705, USA.


[1] Prediction of barrier-island response to hurricane attack is important for assessing the vulnerability of communities, infrastructure, habitat, and recreational assets to the impacts of storm surge, waves, and erosion. We have demonstrated that a conceptual model intended to make qualitative predictions of the type of beach response to storms (e.g., beach erosion, dune erosion, dune overwash, inundation) can be reformulated in a Bayesian network to make quantitative predictions of the morphologic response. In an application of this approach at Santa Rosa Island, FL, predicted dune-crest elevation changes in response to Hurricane Ivan explained about 20% to 30% of the observed variance. An extended Bayesian network based on the original conceptual model, which included dune elevations, storm surge, and swash, but with the addition of beach and dune widths as input variables, showed improved skill compared to the original model, explaining 70% of dune elevation change variance and about 60% of dune and shoreline position change variance. This probabilistic approach accurately represented prediction uncertainty (measured with the log likelihood ratio), and it outperformed the baseline prediction (i.e., the prior distribution based on the observations). Finally, sensitivity studies demonstrated that degrading the resolution of the Bayesian network or removing data from the calibration process reduced the skill of the predictions by 30% to 40%. The reduction in skill did not change conclusions regarding the relative importance of the input variables, and the extended model's skill always outperformed the original model.

1. Introduction

[2] Coastal morphologic features including shorelines, beaches, and dunes are characteristically variable in both space and time. Their temporal evolution is driven by hydrodynamic and sediment transport processes that also exhibit significant spatial and/or temporal variation, associated with variations in wind, water levels, waves, and currents. The morphologic and hydrodynamic variables interact via a rich set of feedback mechanisms that produce complex morphological patterns including sandbars, dune and berm ridges, breaches, inlets, and overwash fans. The processes or interactions of processes responsible for shaping these features are well documented and include sediment transport due to waves, alongshore currents, slumping, wave runup, overwash, and inundation.

[3] During extreme storms, rapid coastal evolution is possible, and the rate and style of response depend on both existing morphological features and hydrodynamic processes. This interdependence has been described in terms of storm regimes [Sallenger, 2000] wherein the type of morphological response (Figure 1) is categorized according to the type of hydrodynamic-morphology interaction. The categorization depends on the relative elevations of the beach, dune, wave-averaged water levels, and wave runup levels. Under the mildest conditions, waves and water levels will rise only to the level of the beach and beach erosion is possible (swash regime). Under more severe conditions, waves will reach the dune toe, causing dune erosion by scarping (collision regime). A further increase in wave height or decrease in dune height will allow waves to overtop the dune and cause dune erosion and landward sediment transport (overwash regime). Under the most severe conditions, the combination of storm surge, tides, and wave setup exceeds the dune-crest elevation, and overland flow is possible, leading to dune erosion and, in extreme cases, island breaching (inundation regime).

Figure 1.

Examples of dune response outcomes described by the storm-scaling model: (a) swash, (b) collision, (c) overwash, (d) inundation. Photos were obtained along Santa Rosa Island, FL, by the (Figures 1a, 1b, and 1d) U.S. Geological Survey and (Figure 1c) U.S. Army Corps of Engineers after landfall of Hurricane Ivan (2004).
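The four regimes reduce to threshold comparisons among the dune and water-level elevations. The following is a minimal sketch in Python, not the authors' implementation. The function name, the argument names (z_crest, z_base, eta50, eta98), and the choice of inclusive thresholds are assumptions made for illustration.

```python
def storm_regime(z_crest, z_base, eta50, eta98):
    """Classify the storm-response regime from four elevations (m).

    z_crest: dune-crest elevation, z_base: dune-base elevation,
    eta50: storm-induced mean water level, eta98: extreme runup level.
    Whether the thresholds are inclusive is an assumption of this sketch.
    """
    if eta50 >= z_crest:
        return "inundation"  # mean water level submerges the dune crest
    if eta98 >= z_crest:
        return "overwash"    # runup overtops the dune crest
    if eta98 >= z_base:
        return "collision"   # runup reaches the dune base
    return "swash"           # runup is confined to the beach


# Hypothetical elevations, chosen only to exercise each branch.
for case in [(4.0, 2.0, 0.8, 1.5), (4.0, 2.0, 1.2, 3.0),
             (4.0, 2.0, 2.0, 4.5), (2.0, 1.5, 2.5, 4.0)]:
    print(case, storm_regime(*case))
```

The checks run from most to least severe, so each location receives exactly one regime.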

[4] The storm-regime model has been evaluated for a number of storm events [Stockdon et al., 2007b, 2007a] using lidar topography from both before and after major hurricane events [Bonnie, 1998; Floyd, 1999; Ivan, 2004] and both observed and modeled surge levels. These efforts show that the conceptual model can be used to make accurate predictions of actual barrier island response. The response was inferred both from quantitative changes in dune and beach elevations and from qualitative observations of the nature of those changes to determine if they were consistent with the response regimes. Measures of beach-volume and shoreline-position change were shown to correlate strongly with the response regime.

[5] The storm regime concept satisfies a physical requirement that prediction of dune evolution depends on knowledge of the initial topography as well as on the hydrodynamic conditions that drive sediment transport. Another approach to solving the same problem is with detailed numerical modeling that resolves bathymetry, topography, waves, tides, storm surge, and currents. There are a number of examples of this modeling approach that have demonstrated significant predictive skill [Roelvink et al., 2009; Lindemer et al., 2010; McCall et al., 2010]. The detailed modeling approach resolves the topography at O(1 m) in a 2-D spatial domain. The hydrodynamic processes are simulated at a time step of O(1 s), which resolves the long-wave component (20 s period and longer) associated with wave groups and parameterizes (i.e., does not resolve) the short-wave component. Predictions of specific storm events are skillful, explaining 90% of the observed topographic change variance [McCall et al., 2010]. This skill comes after substantial effort needed to constrain hydrodynamic boundary conditions, initial topography, and some model parameters.

[6] The advantage of the conceptual model approach over detailed numerical modeling is that an accurate characterization of the nature of barrier-island beach response is returned with minimal input data requirements (i.e., dune and runup height estimates are required for the former as opposed to a 2-d elevation grid for the latter) and simple calculations (i.e., subtracting runup and dune elevations as opposed to solving partial differential equations). But, the conceptual model only provides a qualitative description of the response (dune scarping, overwash, dune obliteration) as opposed to quantitative estimation of the magnitude of response (e.g., dune elevation changes). This limitation is significant if the approach is to be used to forecast future beach vulnerability to multiple storms, in which the dune characteristics must be updated after each storm.

[7] Other approaches to forecasting storm response span the range between the conceptual model proposed by Sallenger [2000, hereinafter referred to as S2000] and the detailed numerical implementations. These include models that predict the response of a small number of dune and beach parameters (e.g., dune elevation and position) with coupled differential equations [McNamara and Werner, 2008a, 2008b], also including predicted human alterations to a naturally evolving environment. More detailed models resolve the cross-shore profile with coupled partial differential equations [Larson and Kraus, 1989] and add the ability to resolve hydrodynamics and morphology simultaneously. Many approaches to predicting storm response are hybrid methods that use relatively detailed models to compute hydrodynamic properties and then use less detailed and more parameterized models to compute the morphologic response [e.g., Cañizares and Irish, 2008]. In the end, these approaches are all strongly dependent on observational data to achieve meaningful prediction accuracy, through both model initialization and calibration of model parameters [Thieler et al., 2000]. This last statement is also true for the new approach that is developed and tested in this paper, and it points out that we are still in the hypothesis-testing phase of our understanding and application of morphologic prediction. To this end, hindcast evaluation is used to determine if models can recover observed response through parameter fitting and to understand prediction sensitivity to these parameter choices.

[8] Here, we develop a Bayesian network (BN) that makes quantitative predictions of morphologic responses to storms. The new approach is based on the storm-regime conceptual model, but results in more quantitative predictions, handles uncertainty in all the inputs, and provides a methodology for assimilation of detailed numerical model results as well as observational data. BNs (for a review, see Wikle and Berliner [2007]) predict the probability of a large, but finite, suite of possible model outcomes and account for uncertainties in both inputs and outputs. Recent examples of applications of BNs to predict coastal processes have demonstrated the utility of the approach. For example, wave evolution across the surf zone with uncertain bathymetry and uncertain forcing conditions was predicted with a BN that used both observational data and numerical simulations [Plant and Holland, 2011a, 2011b]. Gutierrez et al. [2011] used a BN to identify sensitivities of long-term shoreline change to variations in sea level rise, wave height, tides, and geomorphic setting. Hapke and Plant [2010] applied a BN to predict episodic sea-cliff retreat. These studies demonstrate that BNs are useful for quantifying the current state of prediction skill and uncertainty, capturing complicated interactions between relevant system variables, and identifying sensitivities of predictions to input uncertainties.

[9] Our objective is to develop and test the performance of a BN model applied to hurricane-driven coastal changes. In Section 2 (Methods), a BN for hurricane-induced coastal change is formulated and the observational and model-derived data that are used to train the BN are presented. In Section 3 (Results), the BN is used to make a series of hindcast predictions of the training set in order to illustrate predictive skill and sensitivity to input uncertainty. In Section 4 (Discussion), we analyze the relative contribution of each input variable to improved prediction skill and we discuss the sensitivity of the results to different BN implementation approaches by degrading the model resolution and reducing the amount of data used to train the BN. Conclusions are presented in Section 5.

2. Methods

2.1. Bayesian Network Formulation

[10] To test our hypothesis that a Bayesian approach can be exploited to make quantitative, probabilistic predictions of barrier-island dune evolution, we focus on prediction of the changes in dune-crest elevation, Zc. This variable is required by the S2000 conceptual model to identify the storm-response regime, so, in principle, if we can predict changes in Zc, then we can predict future storm-response regimes.

[11] We wish to predict changes in the dune elevation via a generalized difference equation that is based on the S2000 model inputs:

$$\Delta Z_c = f(Z_c, Z_b, \eta_{50}, \eta_{98}) \qquad (1)$$

where ΔZc is the change in dune-crest elevation that we want to predict, Zb is the dune-base elevation, η50 is the storm-induced mean water level, and η98 is the storm-induced extreme runup level (Figure 2a). Because we introduce additional variables that describe both vertical and horizontal components of morphology and morphologic change (Figure 2a), our nomenclature differs slightly from S2000. The correspondence between variables is Zc = Dhigh, Zb = Dlow, η50 = Rlow, and η98 = Rhigh.

Figure 2.

(a) Definition sketch of the beach and dune environment. Dashed line shows the post-event beach and dune surface, with elevations referenced to the still water level (swl). (b) Schematic diagram of the BN that represents a conceptual model of this system and its evolution. Variables in boxes with bold outline represent the original S2000 model and the additional variables extend this model.

[12] The elevation change, ΔZc, is evaluated at the location of the initial dune crest, rather than defined to migrate with an evolving dune location. This choice of elevation-change metric maximizes the sensitivity to erosion under a variety of storm regimes and avoids some interpretation complexities. For instance, during overwash of low dunes, it is possible that onshore migration of a dune will cause considerable erosion but very little change in the actual elevation of the migrating dune. Also, if there are multiple dune lines, it is possible that the seaward-most dune will erode completely and a secondary dune with an equal (or even higher) dune-crest elevation remains as a protective barrier. In addition to the variables defining dune elevation, we have included several other variables that will be used to explore the role played by beach and dune widths (Wbeach,Wdune) in controlling morphologic response, and we will investigate the possibility of predicting shoreline and dune-crest position changes (Xsl, Xc) (Figure 2a).

[13] A probabilistic version of the generalized difference equation, using Bayes' Rule, is

$$p[\Delta Z_{c,i} \mid O_i] = \frac{p[O_i \mid \Delta Z_{c,i}]\; p[\Delta Z_{c,i}]}{p[O_i]} \qquad (2)$$

where Oi represents a vector of observed inputs (e.g., Oi = {Zc,i, Zb,i, η50,i, η98,i}, where the ith observation indicates a specific location or time). The term on the left-hand side of (2) is the posterior probability of finding the specific value of dune-crest elevation change given the observed input. The first term on the right, p[Oi|ΔZc,i], is the likelihood of finding the observed inputs if the dune elevation change were already known. This term represents the understanding of the correlation between the inputs and dune elevation change that will be extracted from prior data sets. This term is multiplied by the prior probability of dune elevation changes, p[ΔZc,i], which is the probability of occurrence of this result based on all available information (e.g., historical observations), but without constraining the other inputs. The prior distribution is valuable when input data are uncertain or missing. The term in the denominator is a normalization based on the likelihood of encountering the set of observations.
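On discrete bins, Bayes' rule is a multiply-and-normalize operation: the likelihood of the observed inputs in each dune-change bin is multiplied by the prior for that bin, and the products are divided by their sum. The following sketch assumes three dune-change bins with hypothetical probabilities that are not taken from the paper. The function name posterior is also an assumption.

```python
import numpy as np

# Hypothetical values for three dune-elevation-change bins (illustration only).
prior = np.array([0.2, 0.3, 0.5])        # p[dZc_j], prior probability of each bin
likelihood = np.array([0.7, 0.2, 0.1])   # p[O_i | dZc_j] for one observed input O_i


def posterior(prior, likelihood):
    """Apply Bayes' rule on discrete bins: likelihood * prior / evidence."""
    joint = likelihood * prior
    evidence = joint.sum()               # p[O_i], the normalization term
    return joint / evidence


post = posterior(prior, likelihood)
print(post, post.sum())
```

With these numbers the posterior is approximately 0.56, 0.24, and 0.20, so the observation shifts most of the probability to the first bin even though the prior favored the third.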

[14] A practical solution to equation (2) can be obtained using a BN [Charniak, 1991; Cooper and Herskovits, 1992; Ihler et al., 2007; Wikle and Berliner, 2007], which represents Bayes' Rule with a hierarchical chain of conditional probabilities. This hierarchical chain can be represented graphically (Figure 2b) with boxes representing variables and arrows representing correlations between variables. These correlations are stored as conditional probability tables (CPTs) that must be estimated from data or other models. For each variable (e.g., dune elevation or its change), the BN computes the probability that its value falls within several discrete range bins (Table 1). While it is entirely possible to represent correlations between each variable and all other variables, we have chosen to represent correlations that we expect to be physically meaningful. For example, arrows are included that represent the influence of dune elevation on its own evolution, the relationship between extreme runup and mean water levels, and the dependence of dune erosion on the forcing as well as on the initial dune elevation. Many other correlations (e.g., between beach width and surge height) are omitted because they are not expected to be physically meaningful or provide predictive skill.

Table 1. Definition of Discrete Bins for Each Variable
Variable | Description | Bin Ranges (m)
Zc | Dune-crest elevation | 0 to 1.5; 1.5 to 2.5; 2.5 to 3.5; 3.5 to 4.5; 4.5 to 12
Zb | Dune-base elevation | 0 to 1.5; 1.5 to 1.75; 1.75 to 2; 2 to 3; 3 to 3.5
η50 | Storm-induced mean water level | 0 to 1.5; 1.5 to 1.75; 1.75 to 2; 2 to 3; 3 to 3.5
η98 | Storm-induced extreme runup level | 0 to 3; 3 to 3.25; 3.25 to 3.5; 3.5 to 3.75; 3.75 to 4; 4 to 7
Wbeach | Beach width | 0 to 25; 25 to 50; 50 to 75; 75 to 100; 100 to 200
Wdune | Dune width | 0 to 25; 25 to 50; 50 to 100; 100 to 150; 150 to 500
ΔZc | Dune-crest elevation change | −10 to −5; −5 to −2.5; −2.5 to −1.5; −1.5 to −0.5; −0.5 to 0.5
ΔXc | Dune-crest position change | −250 to −150; −150 to −100; −100 to −50; −50 to 0; 0 to 50
ΔXsl | Shoreline position change | −150 to −50; −50 to −25; −25 to −10; −10 to 0; 0 to 10
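Assigning a continuous observation to one of the discrete bins in Table 1 can be sketched as follows. The bin edges for Zc and η98 are copied from Table 1. The dictionary name BIN_EDGES, the function name bin_index, the −1 code for out-of-range values, and the convention that a value on a shared edge falls in the upper bin are assumptions of this sketch.

```python
import numpy as np

# Bin edges (m) copied from Table 1 for two of the variables.
BIN_EDGES = {
    "Zc": [0.0, 1.5, 2.5, 3.5, 4.5, 12.0],
    "eta98": [0.0, 3.0, 3.25, 3.5, 3.75, 4.0, 7.0],
}


def bin_index(name, values):
    """Return the 0-based bin index of each value; -1 marks out-of-range values."""
    edges = np.asarray(BIN_EDGES[name])
    values = np.asarray(values, dtype=float)
    # Only interior edges are passed, so indices run from 0 to n_bins - 1.
    idx = np.digitize(values, edges[1:-1])
    out_of_range = (values < edges[0]) | (values > edges[-1])
    return np.where(out_of_range, -1, idx)


print(bin_index("Zc", [1.0, 3.0, 7.2]))   # bins 0, 2, 4
print(bin_index("eta98", [3.3, 6.5]))     # bins 2, 5
```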

2.2. Field Observations

[15] The proposed BN serves as a conceptual framework that organizes this particular morphologic-response problem by selecting important variables and interactions based on an understanding of the physical processes that are involved. Also, the BN serves as a quantitative model that can be used to exercise and test our presumed understanding. However, the BN cannot capture this understanding directly, such as might be done with a deterministic model built on fundamental principles (e.g., Navier-Stokes and continuity equations). Instead, as described above, important relationships must be captured in terms of joint probabilities. In our case, prior estimates of the inputs (morphology and hydrodynamics) and output (dune elevation change) during actual or simulated storm conditions are required to construct the correlations needed by the BN. We use observations and numerical simulations associated with Hurricane Ivan, which made landfall near Gulf Shores, AL, as a category-3 storm on September 16, 2004 (Figure 3). We adapt the BN to adequately resolve the conditions that characterize this storm and morphologic response along the coast of Santa Rosa Island, FL.

Figure 3.

Map of study area showing Hurricane Ivan's track, the lidar survey extents, and the study area where the models were applied.

2.2.1. Dune Elevation and Elevation-Change

[16] Dune elevations and elevation changes were derived from two topographic surveys. Both surveys were conducted using airborne lidar systems. The first survey (Figure 3) was conducted in May 2004 (four months prior to the September landfall of Hurricane Ivan) by the Army Corps of Engineers using the Compact Hydrographic Airborne Rapid Total Survey (CHARTS) system. This system is capable of measuring topography to 1-m horizontal resolution at nominally 15 cm (2σ) vertical accuracy [Wozencraft and Millar, 2005]. A post-storm survey (Figure 3) was conducted September 19, 2004, three days following hurricane landfall, by the U.S. Geological Survey using the Experimental Advanced Airborne Research Lidar (EAARL) system, which has resolution and accuracy capabilities that are similar to CHARTS [Nayegandhi et al., 2009].

[17] The lidar topography was analyzed along cross-shore transects spaced every 20 m alongshore. The horizontal position of the shoreline was extracted as the mean-high-water contour (Figure 4). The position and elevation of the seaward-most dune crest were also extracted at each alongshore location. The dune-base elevation, defined as the location of maximum curvature between the dune crest and shoreline [Stockdon et al., 2009], was extracted where it was sufficiently well defined. No dune-base elevation was extracted if the dune-crest elevation was lower than 3 m. Mean beach slope and beach width were calculated, where possible, between the shoreline and dune base. The distance between the dune crest and dune base defined the dune width (i.e., a half-width).

Figure 4.

Example elevation profile for a cross-shore transect in the Santa Rosa Island study area. The location of the dune crests (Xc,Zc, filled circle), dune base (Xb,Zb, filled square—not present in post-storm profile), and shoreline (Xsl, filled triangle) are identified in the pre-storm (blue, dashed line) and post-storm (red, solid line) surveys. The horizontal dashed line indicates the mean high water (MHW) level and the solid line is the mean water level.

[18] A wide range of initial dune elevations were observed along the length of Santa Rosa Island (Figure 5). Pre-storm elevations ranged from about 2 m to over 7 m elevation. Post-storm elevations exhibited similar spatial variability, ranging from less than 1 m to over 7 m elevation. The average elevation change of Zc was −1.8 m, the spatial variability (standard deviation) of this change was 0.9 m, and the largest change was −6.0 m. The wide range of variability in both the initial dune elevation and the dune elevation changes provides an excellent test for any model's predictive ability. These data have been used for this purpose in testing the conceptual part of the storm scaling model [Stockdon et al., 2007a] as well as for testing a detailed numerical model of the dune erosion process [McCall et al., 2010].

Figure 5.

(a) Alongshore distribution of storm-induced mean water level (dark blue shading), extreme water level (light blue shading), and dune-crest elevation prior to storm (black dots) and after the storm (red dots). (b) The change in dune-crest elevation.

2.2.2. Water Levels

[19] The hydrodynamic inputs that are required at alongshore locations where morphological data exist are the storm-induced mean (η50) and extreme (η98) water levels. These were computed using a parameterization developed previously by Stockdon et al. [2006]. Required inputs to this parameterization include pre-storm beach slope (β, derived from the pre-storm lidar survey), storm surge (ηsurge), tides (ηtide), significant wave height (H, evaluated in 20 m water depth) and wavelength (L, computed from the peak wave period at 20 m depth). The parameterization computes setup (ηsetup), swash (S), η50, and η98 as follows:

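$$\eta_{setup} = 0.35\,\beta\,(HL)^{1/2} \qquad (3a)$$
$$S = \left[HL\,(0.563\,\beta^{2} + 0.004)\right]^{1/2} \qquad (3b)$$
$$\eta_{50} = \eta_{surge} + \eta_{tide} + \eta_{setup} \qquad (3c)$$
$$\eta_{98} = \eta_{surge} + \eta_{tide} + 1.1\left(\eta_{setup} + \frac{S}{2}\right) \qquad (3d)$$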

Thus, the storm-induced mean water level (η50) includes contributions from tides, surge and setup. Swash and setup, in this case corrected for a 10% parameterization bias in the last term in equation (3d), are added to the tide and surge to estimate the storm-induced extreme water level (η98).
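The water-level calculation can be sketched in a few lines. This sketch assumes the published coefficients of the Stockdon et al. [2006] runup parameterization (0.35 for setup, 0.563 and 0.004 for swash, and a factor of 1.1 for the bias correction). The function name storm_water_levels and the input values are hypothetical.

```python
import math


def storm_water_levels(beta, surge, tide, wave_height, wavelength):
    """Return (setup, swash, eta50, eta98) in meters.

    The coefficients are assumed from the published Stockdon et al. [2006]
    parameterization; they are assumptions of this sketch.
    """
    hl = wave_height * wavelength
    setup = 0.35 * beta * math.sqrt(hl)
    swash = math.sqrt(hl * (0.563 * beta ** 2 + 0.004))
    eta50 = surge + tide + setup                          # mean water level
    eta98 = surge + tide + 1.1 * (setup + swash / 2.0)    # extreme runup level
    return setup, swash, eta50, eta98


# Hypothetical inputs: beach slope 0.1, surge 1.0 m, tide 0.2 m, H = 4 m, L = 100 m.
setup, swash, eta50, eta98 = storm_water_levels(0.1, 1.0, 0.2, 4.0, 100.0)
print(round(setup, 3), round(swash, 3), round(eta50, 3), round(eta98, 3))
```

Because η98 adds half the swash excursion to the setup, it always exceeds η50 for nonzero waves, which is one reason the two variables are strongly correlated.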

[20] Tide, surge, wave height, and wavelength required to support predictions of η50 and η98 (equations (3a) and (3b)) were computed using a numerical model (Delft-3D) [Delft Hydraulics, 2003]. Several nested model domains supported the hydrodynamic calculations and have been described in detail elsewhere [McCall et al., 2010]. The model was initialized 53 h prior to landfall, starting from unperturbed waves and water levels, using re-analysis of observed winds and atmospheric pressure obtained from the National Oceanic and Atmospheric Administration (NOAA) Hurricane Research Division. Simulations ran for 66 h and generated hourly output. Wave height and period (and, therefore, L) were extracted from the model along the 20 m isobath. Tides from the TPXO6.2 global model [Egbert and Erofeeva, 2002] were included at four points along the model boundary and propagated through the model to the shoreline. The hydrodynamic parameters from the 66-h simulation were used to compute η50 and η98 at hourly intervals, and the maximum value at each shoreline grid cell was extracted to characterize the water levels used in this study. Variation in storm duration and variations in water levels during the storm were not considered here.
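Extracting the storm maximum from hourly output amounts to taking a maximum along the time axis at each shoreline grid cell. The sketch below assumes the hourly values are stored as an array of shape (hours, cells). The function name storm_peak and the sample numbers are hypothetical.

```python
import numpy as np


def storm_peak(levels):
    """Return the per-cell maximum of hourly water levels, ignoring NaN gaps.

    levels has shape (n_hours, n_cells).
    """
    return np.nanmax(np.asarray(levels, dtype=float), axis=0)


# Hypothetical 3-hour series of eta98 (m) at two shoreline grid cells.
eta98_hourly = [[2.1, 2.4],
                [3.6, 3.1],
                [2.9, 3.3]]
print(storm_peak(eta98_hourly))
```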

[21] Modeled wave height and period were verified using measurements from NOAA National Data Buoy Center buoy 42040, located in 300 m water depth. The modeled peak wave height (18 m) exceeded the observed value (16 m) by about 10%. The modeled peak wave period (20 s) exceeded the observed value (15 s) by about 25%. This error was sufficiently large that we used the observed wave period values to calculate the wavelength required in equations (3a) and (3b). Comparing observations to predictions over a 2-day period, mean errors were 0.3 m for the wave height and 4 s for the wave period. At a shallower location (about 80 m depth) occupied by a temporary buoy (SAX04 buoy 7) [Fernandes, 2005], the modeled wave height at the storm peak (11.8 m) was slightly higher than the buoy measurement (11.4 m), and the mean error over the duration of the simulation was 0.40 m. At this location, the modeled wave period (20 s) exceeded the observed value by 5 s at the storm peak and, when averaged over the storm duration, the model overestimated the period by 3 s.

2.3. Assimilating Data Into the Bayesian Network

[22] Each variable used in the BN must be discretized into a set of bins that span the range of values that are included in the data sets. Using guidelines proposed by Plant and Holland [2011a], we attempted to allow each variable five different bins and still resolve each variable adequately. Topographic elevation variables were resolved at a minimum of 1.0 m increments for the dune-crest elevation and 0.25 m increments for the dune base, which had a narrower range of possible values. Surge and runup were resolved at a minimum of 0.25 m increments. Horizontal distances (widths and positions) were resolved at a minimum of 25 m. Coarser resolution was allowed at the lowest and highest bins in order to include the extreme values. The resolution constraints required resolving η98 with 6 bins (Table 1).

[23] Netica software (Norsys, Ver. 3.25, 1990–2007) was used to build, train, and run the BN. The morphologic variables derived from the lidar observations and the hydrodynamic simulation results were assimilated into CPTs for all of the variables. An iterative expectation maximization method [Dempster et al., 1977] was used to update the CPTs. The BN that represented the S2000 model included 26,315 possible states representing the combinations of different variables taking on different values (e.g., low dunes and low surge, high dunes and high surge, high dunes and low surge, etc.). These were constrained with 26,808 data values. There were some missing values, such that not all variables were available at all locations. The prior probability distribution for each of the variables in the BN is, essentially, a histogram of the data for each variable (Figure 6). Properties such as the mean value, variance, and most likely value can be estimated from the prior distributions.
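For intuition, a CPT can be estimated from binned data by counting co-occurrences. The sketch below is a simplified stand-in, not the expectation maximization procedure used in Netica: it skips cases with missing values (coded here as −1), whereas expectation maximization would use them fractionally. The function name, the pseudo-count, and the sample data are hypothetical.

```python
import numpy as np


def conditional_probability_table(parent_bins, child_bins, n_parent, n_child, pseudo=1.0):
    """Estimate p(child bin | parent bin) by counting complete cases.

    A pseudo-count keeps unobserved combinations from getting zero probability.
    Cases with a missing value (coded as -1) are skipped in this sketch.
    """
    counts = np.full((n_parent, n_child), pseudo, dtype=float)
    for p, c in zip(parent_bins, child_bins):
        if p >= 0 and c >= 0:
            counts[p, c] += 1.0
    # Normalize each row so the probabilities for one parent state sum to 1.
    return counts / counts.sum(axis=1, keepdims=True)


# Hypothetical binned data: parent = Zc bin, child = dZc bin, last case incomplete.
zc_bins = [0, 0, 1, 1, 1, -1]
dzc_bins = [1, 1, 0, 1, 0, 0]
cpt = conditional_probability_table(zc_bins, dzc_bins, n_parent=2, n_child=2)
print(cpt)
```

Each row of the result is the conditional distribution of the child variable for one parent bin. With these data the rows are (0.25, 0.75) and (0.6, 0.4).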

Figure 6.

Prior probability distributions of each variable. The height of each bar is the total probability in each discrete bin. The width of each bar indicates the bin widths. All units are in meters.

2.4. Prediction Evaluation

[24] The degree to which the BN can recover observed morphological changes and provide uncertainty estimates that are consistent with the actual prediction errors was evaluated with two skill metrics. First, the prediction skill was quantified using a regression model relating the predictions to the actual observations:

$$\widehat{\Delta Z}_{c,i} = b_0 + b_1\,\overline{\Delta Z}_{c,i} \qquad (4a)$$

Here, $\widehat{\Delta Z}_{c,i}$ is the regression estimate based on $\overline{\Delta Z}_{c,i}$, the Bayesian-mean predicted value from the BN output. The Bayesian-mean value is

$$\overline{\Delta Z}_{c,i} = \sum_{j=1}^{J} \Delta Z_{c,j}\; p[\Delta Z_{c,j} \mid O_i] \qquad (4b)$$

where the summation is over j = 1, 2, …, J discrete bins from the prediction obtained at each alongshore location i. As in equation (2), each Oi represents a particular combination of the possible input variables. The regression model includes corrections for bias, b0, and gain, b1. The regression skill is

display math

where the summation is over all alongshore locations. This measure of skill describes a weighted percentage of the observed variance that is explained by the Bayesian-mean prediction. The weighting factors are the prediction uncertainties (i.e., variances) around the Bayesian-mean value and do not depend on the observations. The variances at each location were computed as

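$$\sigma_i^{2} = \sum_{j=1}^{J} \left(\Delta Z_{c,j} - \overline{\Delta Z}_{c,i}\right)^{2} p[\Delta Z_{c,j} \mid O_i] \qquad (4d)$$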

where the summation is, again, over the J discrete values of ΔZc. The Bayesian-mean value represents a robust predicted value, and the variance of the prediction provides a measure of prediction uncertainty used as a weighting term in the regression and skill estimates.
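The Bayesian mean, the prediction variance, and a weighted regression skill can be sketched as follows. Several choices here are assumptions rather than the authors' stated method: each bin is represented by a single value (for example its center), the regression weights are the inverse of the prediction variances, and the skill is one minus the ratio of weighted residual variance to weighted observed variance. The function names and sample numbers are hypothetical.

```python
import numpy as np


def bayesian_mean_and_variance(probs, centers):
    """probs: (n_locations, n_bins) predicted probabilities; centers: (n_bins,)."""
    probs = np.asarray(probs, dtype=float)
    centers = np.asarray(centers, dtype=float)
    mean = probs @ centers
    var = (probs * (centers[None, :] - mean[:, None]) ** 2).sum(axis=1)
    return mean, var


def weighted_regression_skill(observed, predicted, variance):
    """Fit observed ~ b0 + b1 * predicted with weights 1/variance (assumed).

    Returns (b0, b1, skill). A zero variance would need special handling.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    w = 1.0 / np.asarray(variance, dtype=float)
    design = np.column_stack([np.ones_like(predicted), predicted])
    sw = np.sqrt(w)
    # Scaling rows by sqrt(w) turns ordinary least squares into weighted least squares.
    (b0, b1), *_ = np.linalg.lstsq(design * sw[:, None], observed * sw, rcond=None)
    fitted = b0 + b1 * predicted
    obs_mean = np.sum(w * observed) / np.sum(w)
    skill = 1.0 - np.sum(w * (observed - fitted) ** 2) / np.sum(w * (observed - obs_mean) ** 2)
    return b0, b1, skill


# Hypothetical two-bin predictions at three locations; bin values are -2 m and 0 m.
mean, var = bayesian_mean_and_variance([[0.5, 0.5], [0.1, 0.9], [0.8, 0.2]], [-2.0, 0.0])
print(mean, var)
```

A location with an evenly split prediction gets the largest variance and therefore the smallest weight in the regression, so uncertain predictions are penalized less when they miss.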

[25] BNs should accurately account for prediction uncertainty in more detail than can be described by the mean and variance estimates. We will show some examples where the predicted probabilities are bimodal. In this case, the Bayesian-mean value will fall between the two most likely outcomes and is a poor prediction. While this is an inaccurate prediction, the Bayesian variance (equation (4d)) will reflect higher uncertainty in this situation. It is possible to retain the full probability distribution as the prediction and to describe the skill of the updated probabilities by evaluating the predicted likelihood of each observation and comparing this likelihood to the prior likelihood. We use the log likelihood ratio to do this:

$$LR_i = \log\left\{p[\Delta Z_{c,i} \mid O_i]\right\} - \log\left\{p[\Delta Z_{c,i}]\right\} \qquad (5)$$

where the first term on the right is the updated probability evaluated at the discrete bin that matched the observed dune elevation change at the ith observation location. If the predicted probability in that bin is high, the likelihood is also high. The second term is the prior probability at the ith observation location. If the updated probability is higher than the prior probability for the observed value, then the prediction is improved compared to the prior and the log likelihood ratio is greater than zero. This indicates that the updated probability distribution is both different from the prior and more accurate. On the other hand, if the updated probability at the observed value is lower than the prior probability, this indicates that the update is either more uncertain than the prior or it is more confident but actually wrong. Thus, the likelihood ratio scores the ability of the BN to make skillful estimates of both the mean value and uncertainty. Summing the log likelihood ratio over all observation locations provides a measure of how much better (or worse) the BN prediction performed over the entire data set compared to using the prior.
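The log likelihood ratio can be computed by looking up the updated and prior probabilities in the bin that contains each observation. The sketch below assumes base-10 logarithms and uses hypothetical probabilities. The function name is also an assumption.

```python
import numpy as np


def log_likelihood_ratio(updated, prior, observed_bin):
    """Per-location log10 ratio of updated to prior probability in the observed bin.

    updated: (n_locations, n_bins), prior: (n_bins,), observed_bin: (n_locations,).
    The base-10 logarithm is an assumption; a zero probability yields -inf.
    """
    updated = np.asarray(updated, dtype=float)
    prior = np.asarray(prior, dtype=float)
    observed_bin = np.asarray(observed_bin, dtype=int)
    rows = np.arange(len(observed_bin))
    return np.log10(updated[rows, observed_bin]) - np.log10(prior[observed_bin])


# Hypothetical three-bin example at three locations.
prior = np.array([0.25, 0.25, 0.5])
updated = np.array([[0.5, 0.25, 0.25],
                    [0.1, 0.1, 0.8],
                    [0.05, 0.9, 0.05]])
observed_bin = np.array([0, 2, 2])
lr = log_likelihood_ratio(updated, prior, observed_bin)
print(lr, lr.sum())
```

The first two locations score above zero because the update raised the probability of the observed bin. The third location is a confident but wrong prediction and scores −1, which outweighs the two gains in the sum.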

3. Results

[26] Assimilation of observed morphologic variables and modeled hydrodynamic variables that represented the peak of Hurricane Ivan's surge and swash produced a single quantitative BN that included input and output variables consistent with both the S2000 model and our proposed extended model. If input to the BN includes dune-crest and base elevations and mean and extreme water levels, it represents the S2000 model. Or, if we add beach and dune widths to the inputs, the BN represents an extension of the S2000 model. After illustrating some general properties of the BN prediction approach, we compare the performance differences of these two model choices.

3.1. Example Prediction Scenarios

[27] Example BN predictions and their sensitivity to variation in the inputs are demonstrated by constraining some of the inputs and observing how the probabilities were updated for the unconstrained variables. An interesting case corresponded to the analysis of high dunes wherein the dune-crest elevation was constrained to the highest range (Zc > 4.5 m) by specifying that this was known with 100% probability. Similarly, the dune-base elevation was constrained to its most-likely range (Zb = 2–3 m). Then, we sequentially constrained the storm-induced mean water levels (η50) from 0.5 to 3 m (Figure 7). Because η50 and η98 were highly correlated, we omitted η98 from the input and demonstrated that its value could be recovered from the BN along with dune-crest elevation changes. This additional prediction is implemented by removing η98 from the observation set (Oi) in equation (2) and expanding the prediction set to include this variable (i.e., the left side becomes p[{η98, ΔZc}i|Oi]). For example, when η50 was low (<1.5 m), the predicted η98 also had a high probability of being low (89% likelihood, Figure 7, top left). As η50 increased from 1.5 m to higher levels, the most probable value of η98 increased as well.

Figure 7.

Demonstration of prediction scenarios with initial dune-crest and base elevations held constant (Zc > 4.5 m, 2 < Zb < 3 m). The storm-induced mean water level (η50) is varied in each row from 0.75 to 2.25 m. Extreme water level (η98) and dune elevation change (ΔZc) distributions are updated (red shading) for each value of η50. Dashed lines depict the prior distributions of η98 and ΔZc.

[28] Predictions of dune-crest elevation changes (ΔZc, Figure 7, right) were generated concurrently with the η98 predictions. For low values of η50, the most likely predicted value of ΔZc was near zero (88% likelihood). Higher storm-induced water levels (e.g., η50 = 1.5–1.75 m) yielded a bimodal response in ΔZc, with 60% probability near the zero-change outcome and 10–30% probability of the highest change. For this scenario, extreme water level (η98) was not likely to exceed the 4.5 m high dune-crest elevation, implying that the initial erosion regime was likely to be that of collision. The prediction that dunes may in fact suffer catastrophic erosion (i.e., ΔZc = Zc) arises, perhaps, because the collision regime could persist long enough to reduce the dune elevation to the point where overwash or inundation could occur [Stockdon et al., 2007b]. Finally, at the highest water levels, the dune-crest change is most likely (65% probability) to equal the initial crest elevation, and predicted extreme water levels (η98 > 4 m) are consistent with the overwash regime.

3.2. Testing the Sallenger 2000 Model

[29] Predictions of ΔZc were generated at all locations where all of the inputs required by the S2000 conceptual model (Zc, Zb, η50, η98) were available, resulting in 1672 test cases. The hindcast predictions are presented as confidence regions and compared to the observations at each alongshore location (Figures 8a and 8b). The confidence intervals range from narrow (a meter or two) in some locations to extremely broad (over 6 m) in others. This indicates that the BN is more confident under some prediction scenarios than others. For instance, the most uncertainty in the dune-crest change predictions tended to correspond to the cases where the most erosion was observed. These locations correspond to high dunes, since dunes generally did not erode more than their initial elevation, and, as Figure 7 shows, high dunes exposed to moderately high surge and waves will either survive with little erosion or fail completely. The uncertainty is relatively low at locations where the initial dune elevations were already relatively low (e.g., less than 4 m, locations at the western end of the study area: alongshore coordinates −25 km to −20 km).

Figure 8.

S2000 model hindcast probabilities (shading) for ΔZc compared to observed values (+) for (a) the entire study area and for (b) a smaller focus area indicated by the region outlined by a dashed line in Figure 8a. The shading corresponds to the 50% (darkest), 90%, and 95% (lightest) confidence regions. The initial dune-crest elevation is shown with dots connected by a solid line. (c) Correlations of predictions to observations are shown.

[30] The regression skill (Figure 8c) for the hindcast predictions was relatively low (0.23, Table 2). The mean error was 0.1 m and RMS error was 0.8 m. Predictions of large negative values (dune-crest erosion exceeding 3 m) were generally consistent with the observations, even if they were biased such that they underpredicted the observations. There was high variability in the prediction error when the predicted change was between −3 and −1 m, such that the observations were either consistent with the predictions (falling along the 1:1 line) or near zero. This result is consistent with an expected failure of the Bayesian-mean value for cases where the actual response is bimodal. This is not necessarily a failure of the BN itself, which may be aware of the ambiguity but cannot resolve it with the input variables that were provided.
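
The failure of the Bayesian-mean value for a bimodal response can be illustrated with a short Python sketch. The bin centers and probabilities below are hypothetical (chosen to resemble a case with about 60% probability near zero change and 30% probability of large erosion) and the function name is an assumption for illustration only.

```python
import numpy as np

# Hypothetical bin centers for dune-crest elevation change (m) and a bimodal
# posterior: most mass near zero change, the rest near complete dune loss.
bin_centers = np.array([-4.5, -3.0, -1.5, -0.25])
probs = np.array([0.30, 0.05, 0.05, 0.60])

def bayesian_mean_and_std(centers, p):
    """Probability-weighted mean and standard deviation of a discrete posterior."""
    mean = np.sum(p * centers)
    std = np.sqrt(np.sum(p * (centers - mean) ** 2))
    return mean, std

mean, std = bayesian_mean_and_std(bin_centers, probs)
```

For these illustrative numbers the mean falls between −3 and −1 m, a range to which the posterior assigns only 10% probability, and the standard deviation is large. The mean is therefore a poor point estimate even though the full distribution correctly represents both likely outcomes.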

Table 2. Error and Skill Statistics for S2000 and Extended Models
Statistic          Test Name   ΔZc   ΔXsl   ΔXc
Likelihood ratio   Check       896   578    964

[31] Comparing the observed dune elevation changes to the updated and prior probabilities produced a likelihood ratio for the dune elevation change of 222 (Table 2). The highest possible ratio was 896, determined by performing a “check,” where the observation of the quantity that we are predicting (ΔZc) was supplied as input to the network (Table 2). The scores indicate that the updated predictions are better than using the prior probabilities (i.e., LR > 0). However, there is room for improvement: the low likelihood score results from predictions with broad probability distributions (i.e., high uncertainty) that are not strongly different from the prior distributions.
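
A minimal Python sketch of a likelihood-ratio score of this kind follows. The defining equation is not reproduced in this section, so the form used here is an assumption: the sum over cases of the base-10 logarithm of the updated probability of the observed bin minus that of the prior probability. The function name, the three-bin discretization, and all probabilities are hypothetical.

```python
import numpy as np

def log_likelihood_ratio(p_updated, p_prior, observed_bins):
    """Sum over cases of log10[p_updated(obs)] - log10[p_prior(obs)].

    p_updated: (n_cases, n_bins) updated probabilities for each case.
    p_prior:   (n_bins,) prior probabilities shared by all cases.
    observed_bins: (n_cases,) index of the bin containing each observation.
    """
    idx = np.arange(len(observed_bins))
    p_obs = p_updated[idx, observed_bins]
    return float(np.sum(np.log10(p_obs) - np.log10(p_prior[observed_bins])))

# Hypothetical three-bin example with four cases.
prior = np.array([0.5, 0.3, 0.2])
observed = np.array([0, 2, 1, 0])
updated = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.2, 0.6],
    [0.3, 0.5, 0.2],
    [0.4, 0.4, 0.2],   # worse than the prior for this case
])

lr_model = log_likelihood_ratio(updated, prior, observed)
# "Check": the observation itself is supplied, so its bin gets probability 1.
lr_check = log_likelihood_ratio(np.eye(3)[observed], prior, observed)
```

Under this assumed definition, a prediction identical to the prior scores zero, the check provides the upper bound, and broad updated distributions that resemble the prior yield scores well below that bound.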

3.3. Testing the Extended Model

[32] The extended BN implementation (Figure 2b) captures more features of the morphologic system than the S2000 model by including dune and beach width as input variables. This model is also more complex because it includes the coupling between changes in other morphologic features (shoreline and dune position) and dune elevation changes. We performed the same hindcast evaluation of the extended model, using the same data that were used in the previous evaluation, augmented with the new variables. Although there were some cases where the extended inputs were missing, all of the cases from the previous test were used. When data are missing, the BN uses the variable's prior distribution.

[33] More confident and more accurate ΔZc predictions were obtained with the extended BN (Figure 9a). The prediction skill (Figure 9b) and likelihood ratios obtained using the extended BN improved substantially compared to the results from the original BN (Table 2). The extended model added just two new input variables (beach and dune width), yet the skill increased threefold. The improvements include both more accurate predictions (in the Bayesian-mean sense) and a corresponding reduction in the prediction uncertainties. The majority of the improvement occurred in the previously ambiguous range (predicted erosion between −3 and −1 m) as well as for extreme erosion, where there is no longer a bias.

Figure 9.

Extended model (a) hindcast probabilities for ΔZc within the focus area and (b) correlations to observations. Lines and symbols are the same as in Figure 8.

[34] To illustrate the role played by the new variables, we compare prediction errors from both the S2000 and extended models to values of beach width (Figure 10). The S2000 errors were correlated with beach width: predictions of ΔZc were too negative (too much erosion predicted) for wide beaches and too positive (not enough erosion predicted) for narrow beaches. For the extended BN predictions, the error variance was reduced and errors were no longer correlated with beach width. In the discussion, we explore the contribution of all the input variables to the prediction skill in order to further explain the source of the extended model's improvement.

Figure 10.

Dependence of ΔZc prediction error (predicted ΔZc – observed ΔZc) on beach width. When beach width is not included in the prediction (S2000 model, +), the elevation change is predicted to be too positive (not enough erosion) for narrow beaches and too negative (too much erosion predicted) for wide beaches. The extended model (dots) removes this systematic error.

3.4. Prediction of Horizontal Changes

[35] The hindcast predictions of the change in dune-crest and shoreline positions were also evaluated using the inputs required by the S2000 model as well as those included in the extended model (Figure 11). The shoreline-change prediction skill (Table 2) was about 0.3 using the S2000 model, and the skill (0.6) was improved by a factor of two using the extended model inputs. The prediction skill of the dune-position change was similar to that of the shoreline-position change and showed the same amount of improvement using the extended model (skill = 0.7) compared to the S2000 model (skill = 0.3). Dune-position changes (Figure 11b) were typically higher where initial dune widths were wider and where initial beach widths were narrower. Note that the BN makes predictions even when the initial dune width was missing (e.g., between km 1.5 and 2). Nonetheless, an accurate prediction was possible using the available information. Patterns of shoreline change were not clearly related to the initial beach width or dune width, indicating that the extended model's prediction skill resulted from incorporation of some complicated relationships between the morphologic and hydrodynamic variables. Likelihood ratios indicated that both models produced predictions that were improvements over the prior probabilities. Predictions of dune position changes improved more than shoreline change using the extended model compared to the S2000 model. It is well known that shoreline changes occur whenever there are changes in wave conditions or even tides and, therefore, some of the observed changes were not associated entirely with the response to Hurricane Ivan. Because the dunes are not continually affected by wave-driven sediment transport processes, the observed dune changes were more likely to reflect storm response, assuming negligible aeolian processes over the relatively short time intervals investigated here.

Figure 11.

Extended model predictions of (a) dune and (b) shoreline position changes and corresponding correlations between predicted and observed dune (c) and shoreline (d) changes. Dots indicate initial (Figure 11a) dune width and (Figure 11b) beach width.

4. Discussion

[36] The S2000 storm-scaling model was developed to make predictions of the type of coastal response (e.g., collision or overwash), but not to quantify the amount of morphologic change (e.g., specific vertical or horizontal changes). We have used BNs to make quantitative and skillful predictions and have shown that improvements in the original S2000 formulation are possible if more morphologic variables are added to the model. To determine whether the skill is based on physically meaningful correlation relationships as opposed to a result of a massive model-tuning exercise, we first examine the role played by each input variable in making skillful predictions. Then we examine the sensitivity of the prediction skill to implementation choices that include reducing the model complexity and reducing the amount of data used in model calibration.

4.1. Extended Model Improvement

[37] A substantial increase in the hindcast skill was obtained with the implementation of the extended BN, which added dune width and beach width variables to the original inputs. In order to identify which variables (or combinations of variables) contributed to the improved skill, a series of additional hindcast prediction tests and evaluations were performed in which different combinations of input variables were provided or withheld. The skill was computed for these scenarios and compared to the predictions that were presented in the results section. Results of other statistical tests are presented in the auxiliary material.

[38] The “check” scenario is provided as a reference for the best-possible skill, which is only a function of the level of detail in the discretization of the bins in the BN. In the first set of tests (Figure 12, red bars), just one variable was removed from the prediction. For example, “no beach width” indicates that all of the input variables except Wbeach were used. A large decrease in prediction skill results when an important variable is removed. An unimportant or redundant variable will have little effect on the skill when added or removed. Thus, when the initial dune elevation was removed, the dune-crest elevation change prediction skill (Figure 12a) was reduced by roughly 50% compared to the implementation that used all of the input variables. Beach width and runup height were the next most important variables. Beach width as well as island width has been identified previously in descriptive studies of the impacts of Hurricane Ivan along Santa Rosa [Claudino-Sales et al., 2008; Houser, 2009].

Figure 12.

Comparison of hindcast skill for dune elevation, shoreline position, and dune position change predictions using different input variables. (a–c) Result for the full-resolution network. (d–f) Reduced-resolution results. The colors differentiate results from the hindcast predictions (black), removal of one variable (red), and inclusion of only one variable (blue).

[39] The least important variable to the ΔZc predictions was storm-induced mean water level (η50, Figure 12). The prediction skill was reduced by only 10–15% when η50 was removed. This illustrates an important, but purely statistical, result, as storm surge is physically important to driving morphologic change. However, for this analysis based on the Hurricane Ivan data set, the surge elevation was strongly correlated to the extreme water level (η98), such that surge is redundant and largely unnecessary if η98 values are available, and vice versa. The relationship between η50 and η98 will depend on storm characteristics not included here, such as distance from landfall. It is possible to capture such variations in a more detailed treatment of the hydrodynamic problem [e.g., Irish et al., 2011]. A potential advantage of statistical redundancy is that the impact of errors in the hydrodynamic inputs can be reduced through the inherent data assimilation capability of the BN approach.

[40] Applying the same sensitivity test to dune and shoreline position changes, the most important input variable was beach width. For the shoreline change prediction, withholding beach width caused a 40% skill reduction compared to the full set of inputs (Figure 12b). Only a 20% skill reduction resulted when beach width was removed from the dune position change prediction (Figure 12c). The surge height was, again, the least important variable when all the other input variables were included.

[41] The next set of experiments (Figure 12, blue bars) provided just one input variable to the BN prediction (e.g., “Only beach width”). The purpose here was to further identify the relative importance of the inputs and, by removing interactions between variables, to indicate the importance of these interactions. In this test, dune-crest elevation changes were predicted best by the initial beach width, followed by extreme water level, and then initial dune-crest elevation. The prediction with only beach width had a skill that was 70% as high as that based on the S2000 inputs, indicating, for our study, that this variable is as important to consider as the originally conceived variables. For predicting shoreline change with a single variable, only η50 and η98 stand out as important. However, the skill was 30% lower than predicted using S2000 inputs and 50% lower than the extended model. The sum of the individual skill values from the single-variable predictions is about 0.5 and is less than the skill of the combined prediction (0.6), implying that the interactions between hydrodynamics and initial morphology are important. This is consistent with other recent predictions that explicitly recognize this feedback. The dune position predictions were skillful when only hydrodynamic variables were included. Morphologic variables are required for the best predictions. In fact, using the dune width variable alone outperformed the prediction using S2000 variables.
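
The design of the two sets of experiments above (remove one variable, or provide only one variable) can be sketched in Python. The helper names, the string labels for the inputs, and the example skill values are hypothetical; the sketch only enumerates the input sets and shows how a relative skill reduction would be computed, and it does not run a BN.

```python
def ablation_scenarios(variables):
    """Build the input sets for 'all', 'no X', and 'only X' hindcast tests."""
    scenarios = {"all": list(variables)}
    for v in variables:
        scenarios[f"no {v}"] = [u for u in variables if u != v]
        scenarios[f"only {v}"] = [v]
    return scenarios

def relative_reduction(skill_full, skill_subset):
    """Fractional loss of skill relative to the full-input prediction."""
    return (skill_full - skill_subset) / skill_full

# Input names follow the variables discussed in the text; the string labels
# themselves are hypothetical.
inputs = ["Zc", "Zb", "eta50", "eta98", "Wbeach", "Wdune"]
scenarios = ablation_scenarios(inputs)

# Illustrative skill values only (not results from this study).
example_loss = relative_reduction(0.8, 0.4)
```

Each scenario would be passed to the same trained network with the withheld inputs left at their prior distributions, so that differences in skill reflect only the information content of the supplied variables.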

4.2. Implementation Choices

[42] The BN design relied on several somewhat subjective choices that can affect network performance. These choices include (1) selecting variables to include in the network, (2) deciding which of the possible correlations between variables should be resolved, and (3) choosing the bin resolution for each variable. We have already addressed the impact of choice of variables by removing or including subsets of the input variable set. In this section, we address the issue of bin resolution by evaluating a modified version of the extended network that had only three bins per variable, compared to at least five bins in the BN that we have presented so far. The reduced-resolution network was capable of resolving only about 35,000 possible combinations of the inputs (compared to about four million for the full-resolution BN). We used the reduced-resolution network to identify differences in prediction skill (Table 3) and to determine whether changing the resolution affected our conclusion on the relative importance of the different input variables.
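
The effect of bin resolution on the number of resolvable combinations can be sketched with simple arithmetic in Python. The sketch assumes nine discretized variables and a uniform number of bins per variable; neither assumption is confirmed here, and the actual networks used bin counts that may have varied by variable. The totals therefore indicate only the order of magnitude of the values quoted above, not the exact counts.

```python
from math import prod

def n_combinations(bins_per_variable):
    """Number of distinct joint states for a set of discretized variables."""
    return prod(bins_per_variable)

n_vars = 9  # assumed count of discretized variables (hypothetical)
reduced = n_combinations([3] * n_vars)   # three bins per variable
full = n_combinations([5] * n_vars)      # five bins per variable
ratio = full / reduced
```

Under these assumptions the state space shrinks by roughly two orders of magnitude when the resolution is reduced from five to three bins per variable.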

Table 3. Skill Statistics and Percent Change for Reduced-Resolution Network
Statistic    Test Name   ΔZc          ΔXsl         ΔXc
Likelihood   Check       731 (−18)    406 (−30)    756 (−22)
             S2000       135 (−39)    58 (−43)     104 (−35)
             Extended    309 (−45)    126 (−55)    318 (−45)
Skill        Check       0.59 (−31)   0.51 (−40)   0.84 (−6)
             S2000       0.20 (−14)   0.24 (−21)   0.29 (+4)
             Extended    0.53 (−33)   0.60 (−4)    0.81 (+22)

[43] In all cases, including the check, the likelihood ratio was lower for the reduced-resolution BN (Table 3). At reduced resolution, there is less ability to distinguish the prior from the update, and it becomes harder to achieve high likelihood scores. The skill scores were typically lower as well, at least partly due to the reduction in resolution. The dune elevation change predictions showed more reduction in skill (by about 30%) than the dune and shoreline position predictions. The dune position prediction skill improved using the reduced-resolution network. However, the likelihood ratio decreased. This result is an artifact of using a weighted skill metric; the skill increase came from an increase in the prediction uncertainty for those cases that were poorly predicted, which, in turn, reduced the weights in the regression model for the worst predictions. Tables in the auxiliary material provide exhaustive error statistics including the unweighted (i.e., σ_{x,i}^2 = 1) mean and RMS errors, which were both larger, and the skill, which was lower in magnitude under the reduced-resolution network.
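
The artifact described above can be reproduced with a minimal Python sketch. The exact weighted regression used for the skill is not reproduced in this section, so the sketch assumes a squared weighted correlation with weights equal to the inverse predicted variance; the function name and all data values are hypothetical.

```python
import numpy as np

def weighted_skill(pred, obs, sigma):
    """Squared weighted correlation between predictions and observations.

    Weights are 1 / sigma**2, so uncertain predictions count for less.
    """
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2
    w = w / w.sum()
    mp, mo = np.sum(w * pred), np.sum(w * obs)
    cov = np.sum(w * (pred - mp) * (obs - mo))
    var_p = np.sum(w * (pred - mp) ** 2)
    var_o = np.sum(w * (obs - mo) ** 2)
    return cov ** 2 / (var_p * var_o)

# Hypothetical data: four good predictions and one poor one (last case).
pred = np.array([-0.5, -1.0, -2.0, -3.0, -3.5])
obs = np.array([-0.4, -1.1, -2.1, -2.8, -0.2])

# Equal uncertainties reproduce the unweighted skill.
skill_equal = weighted_skill(pred, obs, np.ones(5))
# Inflating the uncertainty of the poor case reduces its weight.
skill_inflated = weighted_skill(pred, obs, [1, 1, 1, 1, 3.0])
```

In this toy case the predictions themselves are unchanged, yet the weighted skill rises sharply once the poorly predicted case is assigned a larger uncertainty. This is why the unweighted statistics are a useful complement when comparing networks of different resolution.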

[44] The sensitivity tests were repeated under reduced resolution (Figures 12d–12f). The sensitivity patterns were nearly identical to the full-resolution results, indicating that changes in the network resolution did not alter its ability to describe the relative importance and interactions of the input variables. Specifically, beach width and initial dune-crest elevation were, again, the most important variables to include in the ΔZc prediction. The change in importance may be a result of smearing across important elevation thresholds: beach width may not have corresponding threshold values that are as important. Rather, as beaches become wider, the dunes behind them are progressively and continuously more resistant to erosion. In contrast, dune-crest elevations are actual thresholds that dramatically change the erosion response when they are exceeded by overwash and mean water levels [Stockdon et al., 2007b].

[45] Finally, the sensitivity of the BN skill to a reduction in training information was tested using the original higher-resolution BN in order to determine how much skill was due to overfitting of the model parameters. The data were divided into two sets, selected randomly such that each set included data that spanned the full study area and the variables in each set had similar mean values and variances. The BN was trained using one of the random sets and evaluated against the other, hidden data set. The predictions from the newly trained BN were compared to the original BN, which had been trained on all the data and reevaluated on just the hidden data. Using the S2000 model, the new BN's likelihood scores for the ΔZc prediction were reduced by 28%. The likelihood scores were greater than zero for all the variables (Table 4). The skill for the ΔZc predictions was reduced by 36%, from 0.23 for the original BN to 0.15 for the newly trained BN.
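
A random division of the cases into training and hidden sets can be sketched in Python as follows. The function name and the fixed seed are assumptions for illustration; the sketch uses the 1672 test cases mentioned earlier and does not reproduce the additional check that the two sets have similar means and variances.

```python
import numpy as np

def random_split(n_cases, seed=0):
    """Randomly divide case indices into two disjoint, nearly equal halves."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_cases)
    half = n_cases // 2
    return np.sort(order[:half]), np.sort(order[half:])

# One half would be used for training and the other held back for testing.
train_idx, test_idx = random_split(1672)
```

Because the indices are drawn at random rather than by alongshore position, each half is expected to span the full study area, which is the property required of the two sets described above.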

Table 4. Skill Statistics and Percent Change for Random Split Test
Statistic    Test Name   ΔZc             ΔXsl            ΔXc
Original BN Tested Against 2nd Random Set
BN Trained on 1st Random Set and Tested Against 2nd Random Set
Likelihood   S2000       89 (−28%)       1 (−97%)        44 (−47%)
             Extended    −181 (−161%)    −177 (−231%)    −163 (−157%)
Skill        S2000       0.15 (−36%)     0.24 (−33%)     0.26 (−11%)
             Extended    0.44 (−44%)     0.26 (−66%)     0.56 (−38%)

[46] Using the extended model, the likelihood scores for the ΔZc prediction were reduced by 160%, and they were negative for all the variables, indicating poorer prediction performance compared to the priors. The prediction skill for ΔZc was reduced by 44% compared to the original prediction trained on the full data set. While less skillful than the original prediction, the extended model prediction was substantially better than the S2000 prediction (0.44 versus 0.15). Shoreline position change prediction skill was similar for both the S2000 and extended predictions (about 0.25), while the dune-crest position change prediction was more skillful using the extended model (0.56 versus 0.26).

[47] The combination of poor likelihood and relatively high skill for the extended prediction is an indication of some overfitting of the BN such that the uncertainties of the predictions were underestimated in some cases, even though the correlations (CPTs) leading to the prediction skill were adequately estimated. The cause for the negative likelihood scores can be diagnosed by comparing the actual dune-crest elevation prediction errors (i.e., the Bayesian-mean predicted value minus the observed value) to the predicted uncertainty (equation (4d)). The RMS prediction errors (dashed line in Figure 13) are very close to the expected uncertainty for both the S2000 and extended predictions. This demonstrates that the posterior errors are generally well characterized by the BN: when the uncertainty is predicted to be small, the actual errors are relatively small, and when the uncertainty is predicted to be large, the errors are indeed larger. However, the extended-model predictions include a number of cases where the predicted uncertainty was low (about 0.5 m) whereas the actual prediction errors exceeded expected errors (e.g., maximum errors >2 m; Figure 13). There were nine cases (not shown) where predicted erosion exceeded 5 m (falling in the −5 to −10 m bin) while actual erosion was about 4 m. These cases were severely penalized by the likelihood score because the BN prediction indicated a very low probability of a large error, and it was wrong. Summarizing the results in Table 4 and Figure 13, the extended model made more skillful predictions of the true dune-crest elevation changes at the expense of the accuracy of the prediction uncertainty in some cases. The S2000 model made less skillful predictions, but the corresponding uncertainties more accurately reflected the true prediction accuracy; it is a more cautious estimate that hedges by admitting that the errors might be large.
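
The comparison of actual errors to predicted uncertainty can be sketched in Python as a binned calibration check. The function name, the bin edges, and the synthetic data are hypothetical; the errors are drawn so that they are well calibrated by construction, which is the idealized behavior against which an overconfident prediction would stand out.

```python
import numpy as np

def rms_error_by_uncertainty(errors, sigma, edges):
    """RMS of actual errors within bins of predicted uncertainty.

    Returns NaN for bins that contain no cases.
    """
    errors = np.asarray(errors, dtype=float)
    which = np.digitize(sigma, edges) - 1
    rms = np.full(len(edges) - 1, np.nan)
    for k in range(len(edges) - 1):
        in_bin = which == k
        if in_bin.any():
            rms[k] = np.sqrt(np.mean(errors[in_bin] ** 2))
    return rms

# Synthetic, well-calibrated example: errors drawn with the predicted sigma.
rng = np.random.default_rng(1)
sigma = rng.uniform(0.25, 2.0, size=5000)
errors = rng.normal(0.0, sigma)
edges = np.array([0.25, 0.75, 1.25, 2.0])
rms = rms_error_by_uncertainty(errors, sigma, edges)
```

For calibrated predictions the binned RMS error increases with the predicted uncertainty and plots near the 1:1 line. Cases with small predicted uncertainty but large actual error, such as those that penalized the extended model's likelihood score, would appear as maximum errors far above that line.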

Figure 13.

Relationship between the actual ΔZc prediction errors (Bayesian mean − observed) and the predicted uncertainty (standard deviation of the Bayesian prediction). The root mean square error is shown for the S2000 model (red dash-dot line) and for the extended model (black, dashed line). Maximum errors are shown with symbols (+ for S2000, o for extended). The solid black 1:1 line indicates where the observed RMS error would equal the predicted uncertainty.

5. Conclusions

[48] We have demonstrated that a conceptual model intended to make qualitative predictions of beach response to storms can be reformulated in a Bayesian network to make quantitative and skillful predictions of changes in beach and dune morphology. The original conceptual model, proposed by S2000 and tested by Stockdon et al. [2007b], reduced the dimensionality of the problem by considering only dune, surge, and wave runup elevations as input and predicting beach response in one of just four types: beach erosion due to swash, dune erosion due to collision, dune overwash, and inundation. The BN reformulation of this model included the original input variables and replaced the prediction of the beach response regime (swash, collision, overwash, inundation) with predictions of dune elevation changes as well as changes in dune and shoreline positions. Prediction skills achieved by the BN using the S2000 input variables explained about 20–30% of the observed dune-crest elevation changes. The probabilistic approach was evaluated using the log likelihood ratio, and predictions were shown to accurately represent uncertainty and outperform the prediction based on the prior distribution.

[49] The BN based on the original conceptual model was extended by adding beach and dune widths as input variables. The prediction skill of this extended model improved substantially over the original model, explaining about 80% of dune elevation and shoreline change variance and about 60% of dune position change variance. Sensitivity studies indicated that beach width, which was not in the original S2000 model, was the second-most important input for predicting dune elevation changes and shoreline position changes. Initial elevation was the most important variable. Dune width, also not a part of the original storm-scaling model, was the most important variable for predicting dune position changes. This result was insensitive to alternative formulations that reduced the resolution of the network. Dividing the data into two randomly selected sets, one for training and one for testing, resulted in a 40% reduction in skill, and uncertainties associated with the largest prediction errors were underestimated. Nonetheless, even this test showed that the extended model's prediction skill improved compared to the S2000 model and that root mean square errors were accurately predicted.


[50] We are indebted to our colleagues who helped to develop the data used in this paper. K. Doran prepared the data sets to be suitable for our approach and conducted the initial simulations that pointed out the need to include dune widths. D. Thompson provided hydrodynamic model results. K. Morgan extracted compelling photos and K. Guy assisted with drafting figures. The lidar data were collected through a joint program between USGS, NASA, and USACE. And, we are indebted to the careful reviews and substantial comments provided by our internal USGS reviewers (C. Hapke and K. Rankin), editors at JGR, and three anonymous reviewers.