Accuracy of pan-European coastal flood mapping

Coastal flood maps covering the whole European continent have become available in recent years. However, their ability to complement or replace high-resolution local flood maps has not been investigated so far. In this paper we compare pan-European estimates of extreme sea levels and coastal flood extents at given return periods with observations and high-resolution reference maps. The analysis is done for two pan-European assessments and one global study. We find that whereas the models have good accuracy in estimating storm surge heights, large disparities exist between the large-scale flood maps and four local maps of flood extents from England, the Netherlands, Poland, and France. Moreover, the accuracy of the underlying digital elevation model and assumptions about flood protection existing in a given area significantly influence the results. In addition, the first pan-European projection of temporal trends in the size of flood zones is presented, with and without assuming flood protection levels.

KEYWORDS
climate change, coastal inundation, EURO-CORDEX, flood hazard, return periods, sea level rise, storm surges

1 | INTRODUCTION
Continental or global assessments of flood hazard have become commonplace in recent years. Yet, large-scale analyses of coastal inundation hazard are less well developed than those dealing with riverine events. Many studies on river floods have been conducted even on a global scale, using a large variety of methods, as presented in detail by Bierkens et al. (2015). The latest publications on this topic have also routinely validated the accuracy of large-scale flood maps by comparing them with high-resolution, locally-produced maps (e.g., Alfieri et al., 2014; Sampson et al., 2015; Paprotny, Morales-Nápoles, & Jonkman, 2017). By contrast, only a handful of large-scale storm surge or coastal flood studies are available (e.g., Forzieri et al., 2016; Hinkel et al., 2014; Mokrech, Kebede, Nicholls, Wimmer, & Feyen, 2015; Vousdoukas, Voukouvalas, Mentaschi, et al., 2016), and none has so far used local flood hazard maps for validation. At the same time, there is growing concern in Europe that changes in storm patterns and sea level rise may lead to a significant increase in the level of hazard, even more so than in the case of river floods (Church et al., 2013; Forzieri et al., 2016). Therefore, large-scale assessments of coastal flood hazard that take climate change into account will be needed, and more effort is required to ensure that they are informative and accurate.
The first attempt at creating an overarching collection of coastal hazard information was made by Vafeidis et al. (2008), who devised the Dynamic Interactive Vulnerability Assessment (DIVA) database. However, though it contains extreme water levels with given return periods for 12,000 coastal segments around the world, the estimates were not derived through hydrodynamic modelling. In addition, the only validation of flood maps was done by comparing the number of people at risk of a 1-in-1,000-year flood with some national studies (Hinkel et al., 2014). The DIVA database has more recently been supplemented by the Coastal Fluvial Flood (CFFlood) database (Mokrech et al., 2015) for European countries. Yet, no validation was presented apart from stating that the flood zones delimited in that study were consistent with a national floodplain map for the United Kingdom. Only with the work of Ward (2015, 2016) has a global dataset of storm surges, extreme sea levels and coastal floods become available (hereafter, "GTSR model"). Still, only the results of a reanalysis have been published so far. Also, no validation of the global coastal flood map was presented.
On the European scale, Vousdoukas, Voukouvalas, Mentaschi, et al. (2016), Vousdoukas, Voukouvalas, Annunziato, Giardino, and Feyen (2016), and Vousdoukas, Mentaschi, Voukouvalas, Verlaan, and Feyen (2017) provided storm surge and extreme sea levels for both present and future climate, as well as coastal flood extent estimates for the present climate (hereafter, "JRC model"). In Vousdoukas, Voukouvalas, Mentaschi, et al. (2016) a comparative analysis of four methodologies of calculating coastal flood extents was presented, together with a juxtaposition of the pan-European maps with the actual inundation limit observed during the Xynthia storm in France in 2010. Still, climate change projections have not yet been published for coastal floods; even though a multi-hazard assessment for Europe by JRC (Forzieri et al., 2016) did include future projections of coastal floods, they were made only by taking into account global sea level rise. Pan-European coastal flood modelling was also done within the "Risk analysis of infrastructure networks in response to extreme weather" (RAIN) project (hereafter, "TUD model") and presented in Groenemeijer et al. (2016) and Paprotny, Morales-Nápoles, and Nikulin (2016), but only storm surge estimates were validated, not flood extents. The main advantage of that study was that it made the first attempt at future projections of extreme water levels and coastal flood extents, including factors such as changes in storminess, regional sea level rise and glacial isostatic adjustment.
In light of the above, there is clearly a need for a more systematic analysis of the accuracy of large-scale flood maps in the context of their potential applications to flood risk management, as well as climate change mitigation and adaptation. The objectives of this paper are therefore twofold. Firstly, to extend storm surge modelling work presented in  into coastal inundation delimitation, both for present and future climate. Secondly, to analyse the accuracy of pan-European modelling work, namely (a) estimation of extreme sea levels (surges together with tides) at given return periods and (b) calculation of coastal flood extents, using different methodologies. To complete this work, results of the TUD, JRC, and GTSR models will be compared with measurements made at several dozen tide gauges, and with local flood maps made in four high-resolution assessments of coastal flood risk from England, France, the Netherlands, and Poland.

2.1 | Boundary conditions
In this study, for the purpose of calculating coastal flood extents, the boundary conditions are extreme water levels with given return periods. Extreme water levels (EWLs) were derived in the TUD study as described in  and Groenemeijer et al. (2016), and also in the JRC study as shown by Vousdoukas, Voukouvalas, Mentaschi, et al. (2016) and Vousdoukas, Voukouvalas, Annunziato, et al. (2016). Briefly, in TUD's analysis the EWL ($E_{p,T,S}$) was defined as follows:

$$E_{p,T,S} = R_{p,T,S} + D + M + L_{T,S} + G_T \qquad (1)$$

where:
• $p$ is the probability of occurrence (or, conversely, return period);
• $T$ is the time period (1971-2000, 2021-2050, 2071-2100);
• $S$ is the climate model run scenario (historical for 1971-2000, RCP4.5 and RCP8.5 for other periods);
• $R_{p,T,S}$ is the storm surge height (relative to local mean sea level) with a given probability of occurrence $p$, time period $T$, and scenario $S$;
• $D$ is the mean high tide height;
• $M$ is the baseline mean sea level (MSL), that is, the difference between the actual mean topography of the ocean (1993-2012) and the geoid;
• $L_{T,S}$ is the difference between mean sea level in time period $T$ and scenario $S$ compared to the baseline MSL. This factor includes several components;
• $G_T$ is the accumulated effect of glacial isostatic adjustment between time period $T$ and year 2000 (the elevation model's epoch).
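As a minimal sketch of how the components listed above combine, assuming they are simply additive and all expressed in metres, the EWL for one coastal segment could be computed as below. The function name and the numerical values are hypothetical illustrations and are not taken from the TUD dataset.

```python
def extreme_water_level(surge, mean_high_tide, baseline_msl, slr=0.0, gia=0.0):
    """Sum the EWL components for one coastal segment (all in metres).

    surge          -- R: storm surge height for a given return period
    mean_high_tide -- D: mean high tide height
    baseline_msl   -- M: baseline mean sea level relative to the geoid
    slr            -- L: mean sea level change relative to the baseline
    gia            -- G: accumulated glacial isostatic adjustment effect
    """
    return surge + mean_high_tide + baseline_msl + slr + gia


# Hypothetical component values for a single coastal segment.
historical = extreme_water_level(surge=1.80, mean_high_tide=0.60, baseline_msl=0.10)
future = extreme_water_level(surge=1.75, mean_high_tide=0.60, baseline_msl=0.10,
                             slr=0.40, gia=-0.10)
print(f"historical EWL: {historical:.2f} m, future EWL: {future:.2f} m")
```

In the historical scenario the last two terms are zero by construction, since they are defined as changes relative to the baseline.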
The JRC's analysis of coastal flood extents was done only for a historical reanalysis, therefore the EWL equation is simpler, but also adds another component, that is, wave setup:

$$E_p = (R + 0.2\,W)_p + H \qquad (2)$$

where $R$ is the storm surge height, $W$ is the significant wave height and $H$ is the maximum high tide height.
In both studies the hydrodynamic model Delft3D (Deltares, 2014) was used to derive storm surge heights; their setups are summarised in Table 1. The two studies used the same climate reanalysis, ERA-Interim (Dee et al., 2011), for validation, though they differed in grid resolution and output timestep. In the TUD analysis, the hindcast and projections were made utilising climate data generated within the EURO-CORDEX activities at the Rossby Centre of the Swedish Meteorological and Hydrological Institute (Strandberg et al., 2014) specifically for the RAIN project. The two studies also chose different approaches to extreme value analysis. It can be noted from Equation (2) that in the JRC study surge and wave time series were merged before undertaking an extreme value analysis. The contribution of waves to EWLs was estimated by taking 20% of the offshore significant wave height directly from the ERA-Interim reanalysis dataset. To validate the results, storm surge heights with given return periods were also calculated from measurements made at tide gauges around Europe (see section 3.1).
Tides were not included in the hydrodynamic model, but rather added from external datasets. The assumption is that tides and surges are independent, hence mean high tide was chosen for estimating extreme water levels, as it constitutes the expectation of tide height in a given tidal cycle. The assumption of independence was analysed in , where it was observed that tide-surge interaction at most gauge stations is negligible. Tidal constituents for the calculation were obtained from the TPXO8 model (Egbert, Bennett, & Foreman, 1994). The JRC study utilised the same dataset, but opted to use the maximum high tide instead. A further difference between the two studies is in the extreme value analysis. TUD chose to fit annual maxima of storm surges to a Gumbel distribution. The Gumbel distribution was selected after comparing the fit of several distributions to the observations using the Akaike Information Criterion. Meanwhile, JRC used the nonstationary extreme value statistical analysis from Mentaschi et al. (2016), combined with a design hydrograph. Finally, three additional components of extreme water levels (MSL, SLR, GIA) were obtained for TUD's model by combining several external databases. Briefly, MSL is from the MDT_CNES-CLS13 dataset (Aviso, 2015), SLR is a combination of the CNRM-CM5 model (Voldoire et al., 2013) and regional SLR estimates by Slangen et al. (2014), and GIA is from the ICE-6G_C (VM5a) model (Peltier, Argus, & Drummond, 2015). For more details we refer to .
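The Gumbel step described above can be illustrated with a short sketch. The text does not state how the distribution parameters were estimated, so the method of moments used here is an assumption made for simplicity, and the annual maxima are hypothetical values rather than gauge or model data.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(annual_maxima):
    """Estimate Gumbel location and scale by the method of moments."""
    mean = statistics.mean(annual_maxima)
    std = statistics.stdev(annual_maxima)
    scale = std * math.sqrt(6) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def return_level(location, scale, return_period):
    """Level exceeded on average once every `return_period` years."""
    non_exceedance = 1.0 - 1.0 / return_period
    return location - scale * math.log(-math.log(non_exceedance))


# Hypothetical annual maximum surge heights (m) at one station.
maxima = [0.92, 1.10, 0.85, 1.34, 1.02, 0.97, 1.21, 0.88, 1.45, 1.05]
loc, scale = fit_gumbel_moments(maxima)
for period in (10, 100, 1000):
    print(period, round(return_level(loc, scale, period), 2))
```

A maximum likelihood fit would be an equally valid choice; the return level formula is the same for either estimator.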
In addition, the global GTSR model from Muis et al. (2016) was used in the analysis, for which information is provided in Table 1. The main differences with respect to the European studies are that tides were included in the hydrodynamic computation and that the model uses an irregular grid. Extreme value analysis was carried out by analysing the combined surge and tide time series.

TABLE 1 Basic information on calculating extreme water levels (EWLs) for coastal extent modelling by TUD and JRC. Based on  for TUD, Vousdoukas, Voukouvalas, Mentaschi, et al. (2016) and Vousdoukas, Voukouvalas, Annunziato, et al. (2016)

2.2 | Coastal flood extents
After the boundary conditions were obtained, they were applied to calculate flood extents. All model results analysed here used the same method, albeit with different setups. The static inundation approach, also known as "bathtub fill," is the simplest applicable method (Poulter & Halpin, 2008). In this approach, it is assumed that all land lying below the extreme water level is flooded, as long as it is hydraulically connected with the sea. TUD's study was carried out in two variants: with or without correction of elevation in the underlying digital elevation model (DEM). The model used here is EU-DEM, which has a resolution of 1 arcsec, but for the flood analysis it was resampled and projected to a 100 m resolution. Though the dataset is comprehensive and consistent, accuracy issues have been reported (DHI GRAS, 2014). The validation report shows that the average error of the model is −0.56 m and the root-mean-squared error is 2.9 m, with significant differences between countries. It was therefore decided to analyse floods not only on the original dataset, but also on a corrected one. The correction was based on the assumption that in the coastal zones the bias of the model is the same as that averaged over the whole country, as indicated in the EU-DEM validation report.
To test this assumption, three nationally-produced DEMs were collected, for Poland, the Netherlands, and the United Kingdom.
The first two datasets have a resolution of 100 m, while the last one has a resolution of 50 m. All datasets were resampled to fit into the grid of EU-DEM, and the Polish dataset was modified by adding 18 cm (Augath & Ihde, 2002) in order to move it from the national vertical datum (Kronsztadt) to EVRF-2000, which was used in EU-DEM. The comparison between the national and European DEMs was made for the coastal zone extending up to +3 m above mean sea level. On the Polish coast the bias of EU-DEM was found to be equal to −2.31 m, very close to the country average of −2.38 m reported by DHI GRAS. In the Netherlands the values were −0.96 and −0.85 m, respectively, and in the UK +0.70 m on the coast and +0.72 m for the whole territory. Due to the very close alignment of those figures, EU-DEM was corrected using country-specific values from the validation report.
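The comparison and correction described above can be sketched as follows. This is only an illustration under stated assumptions: bias is taken as model elevation minus reference elevation, the coastal zone is selected on the reference elevation, and the correction is a constant shift. The function names and elevation values are hypothetical.

```python
def mean_bias(model_dem, reference_dem, max_elevation=3.0):
    """Mean of (model - reference) over cells at or below max_elevation.

    The coastal zone is selected on the reference elevation, which is an
    assumption of this sketch.
    """
    diffs = [m - r for m, r in zip(model_dem, reference_dem) if r <= max_elevation]
    return sum(diffs) / len(diffs)


def correct_dem(model_dem, bias):
    """Remove a constant bias from every cell of the model DEM."""
    return [m - bias for m in model_dem]


# Hypothetical elevations (m) for five co-located cells.
reference = [0.5, 1.0, 2.0, 2.5, 6.0]
model = [-1.5, -1.0, -0.5, 0.0, 3.0]

bias = mean_bias(model, reference)
corrected = correct_dem(model, bias)
print(f"coastal bias: {bias:.2f} m")
print(f"residual bias after correction: {mean_bias(corrected, reference):.2f} m")
```

In the study itself the shift applied was the country-wide value from the validation report, not a value estimated from the coastal cells; the coastal estimate served only to check that the two agree.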
Coastal flood extent estimates in the TUD analysis were made using a delimitation of the coast from the CCM2 river and catchment database (De Jager & Vogt, 2010). For each coastal segment in CCM2, its nearest neighbourhood landward was calculated. Then, the nearest grid cell for each dataset representing a component of extreme water levels (Equation (1)) was assigned, so that the EWL could be calculated separately at each segment. It is important to note that the existence of coastal defences was not taken into account at this stage. Instead, maps were post-processed by removing flood zones where extreme water levels were below estimated protection standards. Those estimates were taken from two databases. Firstly, the FLOPROS database (Scussolini et al., 2016) shows nominal protection levels in terms of return periods, which were assumed equal to EWLs in the historical scenario at a given coastal segment. The standards are either design levels of actual defences, or legal requirements for flood protection in a given area, or estimates based on expected annual damages. Secondly, a revised version of coastal protection levels from Vousdoukas, Voukouvalas, Mentaschi, et al. (2016) was used, where they are given as heights above mean sea level, linking return periods with EWLs including tides. This database was constructed by considering all available information on flood extents and number of people affected known from past events and local studies. A high-resolution population grid from JRC was combined with modelled flood zones to estimate the affected population so that the most probable protection standards for all return periods could be found.
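The post-processing with nominal protection standards can be illustrated with a small sketch. It assumes, as described above for FLOPROS, that a nominal standard expressed as a return period translates into the historical EWL at that return period for the segment. The function names and the water levels are hypothetical.

```python
def protection_height(historical_ewl, nominal_return_period):
    """Height implied by a nominal standard: the historical EWL at that return period."""
    return historical_ewl[nominal_return_period]


def remains_flooded(scenario_ewl, protection_level):
    """A flood zone is removed only where the EWL is below the protection level."""
    return scenario_ewl >= protection_level


# Hypothetical historical EWLs (m) by return period (years) at one segment,
# protected to a nominal 100-year standard.
historical_ewl = {10: 2.1, 30: 2.4, 100: 2.7, 300: 3.0, 1000: 3.3}
level = protection_height(historical_ewl, 100)
kept = {rp: remains_flooded(ewl, level) for rp, ewl in historical_ewl.items()}
print(kept)
```

Because the protection height stays fixed at its historical value, a future scenario with higher EWLs can bring back flood zones for return periods below the nominal standard.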
The JRC study included approximately 11,000 segments of equal length (25 km), with the nearest neighbourhood extending 100 km landward. The elevation model used in the analysis was derived from CCM2, and is mostly based on the Shuttle Radar Topography Mission (SRTM) elevation model (Rabus, Eineder, Roth, & Bamler, 2003). An important difference from the TUD approach was a modification of the DEM using estimated coastal protection levels. The elevation in all DEM cells found on the coastline and having an elevation lower than one of the protection levels was raised to that level. Those levels were estimated using high-resolution DEMs, information on flood protection standards, personal communication with national authorities etc. The JRC results analysed here are a revised version of those presented in Vousdoukas, Voukouvalas, Mentaschi, et al. (2016). Finally, the GTSR model used the static inundation method on the original SRTM DEM, with no coastal protection included.
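The static inundation ("bathtub") method with the hydraulic connectivity condition can be sketched as a flood fill starting from sea cells. This is a minimal illustration and not the code used in any of the three studies; the 4-neighbour connectivity rule, the function name and the small grid are assumptions.

```python
from collections import deque


def bathtub_flood(dem, sea_mask, water_level):
    """Return a boolean grid of land cells flooded at `water_level`.

    dem      -- 2D list of elevations (m)
    sea_mask -- 2D list of booleans, True where the cell is sea
    A land cell floods if its elevation is below the water level and it is
    connected to the sea through flooded cells (4-neighbour connectivity).
    """
    rows, cols = len(dem), len(dem[0])
    flooded = [[False] * cols for _ in range(rows)]
    queue = deque((r, c) for r in range(rows) for c in range(cols) if sea_mask[r][c])
    seen = set(queue)
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in seen:
                if not sea_mask[nr][nc] and dem[nr][nc] < water_level:
                    flooded[nr][nc] = True
                    seen.add((nr, nc))
                    queue.append((nr, nc))
    return flooded


# Hypothetical grid: sea in the first column, a 5 m ridge in the third column.
dem = [
    [0.0, 1.0, 5.0, 1.0],
    [0.0, 2.0, 5.0, 0.5],
    [0.0, 4.0, 5.0, 1.5],
]
sea = [[c == 0 for c in range(4)] for _ in range(3)]
for row in bathtub_flood(dem, sea, water_level=3.0):
    print(row)
```

The low cells behind the ridge stay dry because they are not connected to the sea, which is the difference between this approach and simply thresholding the DEM. Raising coastline cells to a protection level, as in the JRC setup, would act on the `dem` input before the fill.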

2.3 | Validation of coastal flood maps
The validation of pan-European maps was done by comparison with four "reference" maps: two from official national flood studies, one from published research and one representing actual observed flood extent during an extreme event.
The map with the largest spatial extent is for England. The "Risk of Flooding from Rivers and Sea" map (April 2015 version) was produced during 2005-2013 by the Environment Agency (2015). It utilised local-scale modelling and takes into account the height, type and condition of the flood defences. Therefore, the actual reliability of the defences, and not only the nominal protection level, is included in the probability of flooding. The flood maps were validated locally using experts' assessments and have a resolution of 50 m. The map was prepared in four scenarios, for return periods below 30 years, 30-100 years, 100-1,000 years, and above 1,000 years. In addition, a map is available of areas benefiting from flood protection against riverine events with a return period of 100 years and coastal events with a return period of 200 years. Both river and coastal floods are represented in the same maps for different return periods; therefore, for the comparison only those flood extent patches were selected that were connected to the coast, but extended no further than 25 km from it. This choice might have included some influence of rivers in the delimitation of flood zones; however, there was no other possibility of disentangling the riverine and coastal phenomena from the map.
The other official study is "The National Flood Risk Analysis for the Netherlands" (Jongejan & Maaskant, 2015; Vergouwe, 2015), also known by the abbreviation VNK. In this dataset, the probability of failure of each dike section is provided together with a corresponding flood zone. Nine different failure mechanisms are considered, including four for dikes (overflow/overtopping, piping, instability, erosion), four for hydraulic structures (overflow/overtopping, seepage, structural failure, failure to close) and one for dunes (erosion). The inundation calculation was done using 1D or 2D dynamic models, depending on location, taking into account the development of breaches in dikes. Flood zones of 400 dike sections were combined into a single map, and the probability of flooding was added up when flood zones of different dike sections overlapped. It was therefore assumed that failure of each dike section would be mutually exclusive with failure of any other section. In this way, flood maps for 30-, 100-, 300-, and 1,000-year return periods could be extracted at 100 m resolution. Here, we use maps for 36 "dike rings" (Van der Most & Wehrung, 2005; numbered consecutively from 1 to 35, plus ring 13b), which solely or predominantly protect against high sea levels, out of the 58 dike rings covered by the VNK study.
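The way the dike section flood zones were merged can be sketched as a per-cell sum of annual failure probabilities, which is valid under the mutual exclusivity assumption stated above. The data structure, the cell labels and the rule that a cell belongs to the T-year zone when its summed probability is at least 1/T are assumptions of this illustration.

```python
def combine_flood_probabilities(section_maps):
    """Sum annual flooding probabilities per cell across dike sections."""
    combined = {}
    for cell_probabilities in section_maps:
        for cell, probability in cell_probabilities.items():
            combined[cell] = combined.get(cell, 0.0) + probability
    return combined


def flood_zone(combined, return_period):
    """Cells whose summed annual probability is at least 1 / return_period."""
    threshold = 1.0 / return_period
    return {cell for cell, p in combined.items() if p >= threshold}


# Two hypothetical dike sections, each failing with probability 1/2000 per
# year; their flood zones overlap in cell "b".
section_a = {"a": 1.0 / 2000, "b": 1.0 / 2000}
section_b = {"b": 1.0 / 2000, "c": 1.0 / 2000}
combined = combine_flood_probabilities([section_a, section_b])
print(sorted(flood_zone(combined, 1000)))
print(sorted(flood_zone(combined, 2000)))
```

Only the overlapping cell reaches the 1-in-1,000 threshold, which shows how overlapping zones enlarge the floodplain at a given return period compared with looking at each section in isolation.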
The third dataset was obtained from a study on coastal floods and sea level rise in Poland (Paprotny & Terefenko, 2017). In contrast to the previous two datasets it used only the static "bathtub" method; however, it utilised a detailed 1 m-resolution DEM from lidar scanning. The study analysed the effects of floods for every 5 cm increase in water level, and the maps for specific return periods were obtained by assigning flood zones to one of eight tide gauges on the coast for which extreme value analysis could be done. The map was obtained for all five scenarios considered in the pan-European TUD map. Validation presented in Paprotny and Terefenko (2017) shows that the exposed population in the main cities calculated in their study is very similar to that indicated in the official flood maps.
Last but not least, one case study of an actual storm surge was used for validation. On February 27-28, 2010, the extra-tropical cyclone "Xynthia" caused a devastating flood in the Vendée and Charente-Maritime departments of France, with a death toll of 41 (Lumbroso & Vinet, 2011). The total flooded area was 413 km². Analysis of tide gauge data from La Pallice harbour in La Rochelle, which was located in the very centre of the event, showed that the water levels had a return period of more than 100 years (Pineau-Guillou et al., 2012). The observed flood extent was digitised from maps presented by Breilh, Chaumillon, Bertin, and Gravelle (2013).
All maps were projected and resampled to fit the same grid as the pan-European maps. For more detailed analysis, all flood maps were split into regions utilising Eurostat's (2015) "Nomenclature of territorial units for Statistics" (NUTS), 2013 version. The maps divided by regions are presented in Figure 1.
The pan-European maps were evaluated with two measures, originally used for flood map validation by Bates and De Roo (2000), and later by many studies, for example, Alfieri et al. (2014) or Vousdoukas, Voukouvalas, Mentaschi, et al. (2016). The test for "correctness" (or "hit rate", $I_{cor}$) indicates what percentage of the reference map is recreated in the pan-European map (Equation 3). However, this test does not penalise overestimation, therefore another measure for "fit" (or "critical success index", $I_{fit}$) is applied (Equation 4). They are calculated as follows:

$$I_{cor} = \frac{A_{EM} \cap A_{RM}}{A_{RM}} \times 100 \qquad (3)$$

$$I_{fit} = \frac{A_{EM} \cap A_{RM}}{A_{EM} \cup A_{RM}} \times 100 \qquad (4)$$

where $A_{EM}$ is the area indicated as flooded in the pan-European/global map and $A_{RM}$ is the area indicated as flooded in the reference map. From the two measures the "false alarm ratio" can also be inferred:

$$I_{false} = \left(\frac{I_{cor}}{I_{fit}} - 1\right) \times 100 \qquad (5)$$

3 | RESULTS

3.1 | Validation of modelled extreme water levels
Extreme water levels with a return period of 100 years obtained from TUD, JRC, and GTSR models are compared with observations in Figure 2. Data from 79 or 84 gauges, depending on time series availability, were used for the analysis and their positions are shown in Figure 3. Results of all three studies indicate similar performance for the historical reanalysis (1979-2014 with ERA-Interim climate model).
The TUD model has the highest correlation ($R^2$ = 0.94), but also the highest bias, as indicated by the lower value of Nash-Sutcliffe efficiency ($I_{NSE}$ = 0.80). It is visible that using mean high tide instead of combining surges and tides in the hydrodynamic model underestimates EWLs compared to EWLs obtained by applying extreme value analysis to water [...] higher and the bias the lower relative to the simulation using ERA-Interim, the comparison in Figure 3 excludes most stations with very large tidal amplitudes. The tidal component is the same for analyses done with both climate models, therefore using EURO-CORDEX gave higher storm surge estimates than the ERA-Interim runs. Spatially, the distribution of error is very uneven: errors are low in the Baltic and North seas, and significantly higher for the coasts exposed directly to the Atlantic Ocean. No stations with long series could be found for the Mediterranean or Black seas. Also, some variability can be observed when analysing different return periods: the higher the return period, the lower the correlation and the higher the bias: $R^2$ = 0.94 and $I_{NSE}$ = 0.89 for the 10-year EWL compared to $R^2$ = 0.89 and $I_{NSE}$ = 0.83 for the 1,000-year EWL. This relatively minor decrease in accuracy for higher return periods can be largely attributed to the increasing uncertainty of the EWLs at a given return period.

FIGURE 1 Location of the local reference maps with corresponding NUTS codes (see Table 2)

TABLE 2 Validation results for the pan-European TUD map, with and without DEM correction, by countries and regions. A 100-year flood zone area is taken from the reference maps.

3.2 | Validation of coastal flood maps
The results of the comparison of the TUD map with reference maps in five scenarios (10-, 30-, 100-, 300-, and 1,000-year return periods) are shown in Table 2, and all large-scale maps are validated for the 100-year event in Table 3. A snapshot of the comparison for the Humber river estuary on the eastern coast of England is presented in Figure 4. National flood maps for England indicate more than 4,000 km² at risk of a 1-in-100-year flood, the largest area of the four high-resolution studies. For that return period, 68% of the flood zone is recreated in the TUD pan-European map ($I_{cor}$); however, the "fit" ($I_{fit}$) is rather low, at 32%. In effect, for each km² correctly predicted another km² is falsely indicated as being at risk of flood (111% false alarm ratio, $I_{false}$). Considering areas that are normally protected by flood defences against a 1-in-200-year event, the results for the pan-European map improve slightly. They improve further for the 1,000-year event, but are very poor for a 30-year event. This is caused by the effect of flood defences, which are not included in the TUD map: most areas are protected against floods with a high probability of occurrence, but few are large enough to prevent a millennial flood. The same effect could be observed for the Dutch and Polish maps. In the case of the former, the performance is very low for all return periods due to the high level of flood protection in the Netherlands. In Poland, where flood protection has lower standards (mostly below a 100-year return period), the maps were prepared with the same methodology. Hence, the difference in flood zone delimitation compared to the pan-European map is caused primarily by the use of a more detailed DEM. At the same time, there is a noticeable influence of DEM correction on the results. This effect is lower in England, where the EU-DEM is less biased, and unnoticeable in the Netherlands, where flood zones are mostly depressions.

TABLE 3 Validation results for the pan-European and global maps with a 100-year return period, by countries and regions. The 100-year flood zone area is taken from the reference maps. The indicators for correctness ($I_{cor}$) and fit ($I_{fit}$) are in %. For location of the regions, see Figure 1

The performance of the GTSR model is, on average, similar to the TUD analysis (for the 100-year return period; Table 3). It also uses the static inundation technique, with no flood protection assumed, and only the EWLs and the underlying DEM are slightly different. However, at the regional level disparities are sometimes significant, as for the South East and South West of England. The bias in the global DEM over Poland reduces the accuracy of flood zone delimitation, similarly to when the uncorrected EU-DEM is applied in the TUD analysis. Again, the whole low-lying territory of the Netherlands is indicated as flooded, and the extent of the Xynthia storm surge is substantially overestimated.
The JRC model included estimated flood protection levels in the calculation, generating smaller flood extents than the other two assessments. In general this leads to substantially lower overestimation: a false alarm ratio ($I_{false}$) of only 65% for the whole study area, compared to almost 300% for the remaining models. However, this comes at the expense of missing many flood zones indicated in the local maps. Only for England are the $I_{cor}$ and $I_{fit}$ measures similar. In the case of the Netherlands, no flood hazard is indicated in the 100-year scenario, as nominal protection standards are much higher.
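The validation measures used in this section can be computed from two sets of flooded grid cells, as in the sketch below. The false alarm ratio is written here as the correctness divided by the fit, minus one, which equals the falsely flooded area relative to the reference area; this form is inferred from the statement that the ratio follows from the two measures and should be read as an assumption. The function name and the cell sets are hypothetical.

```python
def validation_measures(model_cells, reference_cells):
    """Return (I_cor, I_fit, I_false) in % from two sets of flooded cells.

    Requires a non-empty reference set and at least one overlapping cell.
    """
    overlap = len(model_cells & reference_cells)
    union = len(model_cells | reference_cells)
    i_cor = 100.0 * overlap / len(reference_cells)
    i_fit = 100.0 * overlap / union
    i_false = 100.0 * (i_cor / i_fit - 1.0)
    return i_cor, i_fit, i_false


# Hypothetical maps: 100 reference cells, 120 modelled cells, 60 in common.
reference_map = set(range(0, 100))
model_map = set(range(40, 160))
i_cor, i_fit, i_false = validation_measures(model_map, reference_map)
print(f"I_cor = {i_cor:.1f}%, I_fit = {i_fit:.1f}%, I_false = {i_false:.1f}%")
```

With the rounded England values quoted above (68% and 32%) the same relation gives roughly 112%, in line with the reported 111%.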

3.3 | European coastal flood extents under present and future climate
Results of the coastal flood extent analysis using the corrected DEM for present and future climate are visualised in Figure 5 for the 100-year event. For the sake of clarity, the flood extents were aggregated in 50 km blocks from the original 100-m resolution maps. Those scenarios exclude flood protection; the influence of such structures on the results is discussed in the next section. The maps are also indicative of the regional distribution and future trends in flood extents with other return periods. In total, the static method with the corrected DEM indicated 68,000 km² at risk of flooding within the domain (100-year return period), which included all European coasts except for parts of the Russian and Ukrainian coasts. That is approximately 1.2% of the corresponding inland area of the domain. The flood zones concentrate around the North Sea, where EWLs are among the highest, and many low-lying areas occur. Other pockets of flood hazard are mostly located at river deltas, which often feature depressions, for example, in Italy (Po), Spain (Guadalquivir, Ebro), Poland (Vistula), Lithuania (Nemunas), and Romania (Danube).

FIGURE 4 An example of the differences between the pan-European map from this study and the local reference map, for the Humber river estuary in England, both for the 100-year flood scenario (Environment Agency 2016)
Future coastal flood hazard was projected taking into account three factors, namely changing meteorological conditions, rising mean sea levels (SLR) and the effects of glacial isostatic adjustment (GIA). On average, 100-year surges would decrease slightly in future (2-9 cm) according to projections based on the RCA4 climate model, and GIA will also contribute negatively to EWLs (8-15 cm). In effect, by mid-century, mean EWLs would decline, only to become larger in the subsequent decades due to SLR, which would contribute 30-45 cm compared to the historical scenario. However, the trends will be very uneven around Europe, as the complex coastlines of Finland, Norway, the UK or Greece skew the average figures. In effect, an increase of the potential flood zones is expected for 2021-2050 as well, albeit a very small one of about 0.4-0.6%, depending on the emission scenario. For 2071-2100, the growth will amount to 3-8%. In the north of Europe, mostly reductions in flood hazard are projected for the near term due to lower surge heights, while in the long term only parts of Sweden and Finland around the Gulf of Bothnia will see a decrease in hazard as a consequence of intense GIA. By contrast, in the Mediterranean GIA is negligible and surges are mostly below 1 m, hence sea level rise will be the predominant factor causing an increase of EWLs in the entire region.
The size of the flood zones and their changes in the future are highly dependent on the assumed flood protection standards, as can be ascertained from Figure 6. Without considering them, most of the study area is already marked as inundated by a 10-year surge; relatively little is added for higher return periods or future time points. When adding protection standards from the FLOPROS database, the flood zones become much smaller: only 1,000 km² is within the 30-year flood zone, and 21,000 km² in the 100-year zone. Under climate change projections, the 30-year zone would expand even up to 40,000 km², and the 100-year zone to 48,000 km². The 1,000-year zone would decline by about 1% around mid-century, but increase by 12-17% by the end of the century. Using JRC's estimates of protection levels the inundation extents become even smaller: 2,800 km² (10-year) to 12,600 km² (1,000-year). In general, such reworked maps give a more complicated picture, with the 100-year zone decreasing by 2021-2050 before increasing again above 1971-2000 levels by 2071-2100. However, the 1,000-year zone would then be expected to expand by 48-67% by 2071-2100. The influence of the different flood protection assumptions is also visible in Figure 7, where only the effect of a uniform SLR is added to EWLs under the historical scenario. The flooded area is the smallest when JRC's estimates are used, but it also increases more steeply than when FLOPROS or no flood defences at all are used.

| DISCUSSION
The analysis has shown that the results are very sensitive to the flood protection assumptions, especially in the context of future projections. However, information on the dimensions and condition of natural (dunes, cliffs, beaches) or artificial (sea walls, dikes) coastlines is only obtainable through detailed local studies; it is therefore not possible to have complete information at the European scale. This information also changes relatively quickly compared with the climate or the socio-economic situation, let alone the underlying properties of the terrain: the coast erodes or builds up by accumulation, and flood defences deteriorate over time, are renovated, or are newly constructed. Using nominal protection levels, for which some information could be found for selected countries or localities, has pitfalls as well. For instance, the Netherlands has high protection standards, ranging from 1 in 1,250 to 1 in 10,000 years in coastal dike rings (Vergouwe, 2015). When considering only overtopping of the dikes, indeed only 3 out of 400 dike sections in the study area have a probability of failure higher than 1 in 1,000 years. However, when considering other failure mechanisms, 48 dike segments are above this threshold according to the VNK study. Further, the different segments are not independent of each other. Hence, the probability of flooding in a given area is higher when failure of more than one segment can inundate it. In effect, the area of the 1-in-1,000 years floodplain for the Netherlands is 3,837 km², or more than a fifth of the area of dike rings included in the study.
FIGURE 6 Coastal flood extents in Europe under present and future climate by return period, according to TUD model, using different assumptions of flood protection (from FLOPROS database by Scussolini et al., 2016 and JRC). Only results for EU countries and Norway were included in this graph
FIGURE 7 100-year coastal flood zone area in Europe (EU plus Norway) assuming uniform sea level rise and using different assumptions of flood protection
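The effect of several dike segments protecting the same area can be illustrated with the simple bounds commonly used for series systems. This is a minimal sketch and not the VNK procedure: it assumes that the area floods if any one segment fails and that failures are positively correlated, in which case the flooding probability lies between the weakest segment's probability (full dependence) and the value obtained for independent segments. The segment probabilities below are hypothetical.

```python
from math import prod

def system_failure_bounds(segment_probs):
    """Bounds on the annual flooding probability of an area that is inundated
    when at least one of its dike segments fails (series system).

    Lower bound: segments fully dependent -> probability of the weakest segment.
    Upper bound: segments independent     -> 1 - product of survival probabilities.
    Both bounds assume positively correlated failures (an assumption of this sketch).
    """
    if not segment_probs:
        raise ValueError("at least one segment probability is required")
    lower = max(segment_probs)
    upper = 1.0 - prod(1.0 - p for p in segment_probs)
    return lower, upper

# Hypothetical annual failure probabilities of three segments around one area.
probs = [1 / 2000, 1 / 1250, 1 / 4000]
lower, upper = system_failure_bounds(probs)

print(f"Weakest segment alone: 1 in {1 / lower:.0f} years")
print(f"Independent segments:  1 in {1 / upper:.0f} years")
```

Even though every hypothetical segment meets a 1-in-1,250 standard, the area as a whole can flood more often than that, which is why nominal standards alone may understate the hazard.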
More factors influence the performance of the large-scale maps. As Vousdoukas, Voukouvalas, Mentaschi, et al. (2016) and Ramirez, Lichter, Coulthard, and Skinner (2016) have shown, the hazard area is overestimated by the static method to a varying degree depending on the type of coast. Low-lying vicinities of estuaries and deltas are particularly prone to errors compared with steeper coasts. Part of the inaccuracy might stem from the influence of river discharge, which is neglected in all large-scale assessments but included in the English and Dutch flood hazard maps.
The construction of boundary conditions for flood modelling also influences the results: including or excluding waves in the model changed the flooded area estimate significantly. Comparing flooded area by country (Table 4) shows that even with the static method and no flood defences, the models can yield very different results, depending on the forcing EWLs and the underlying DEM. This is especially noticeable for countries around the Baltic Sea, Romania or Spain. Similarly, transposing the protection standards between models is not fully feasible. The estimates by JRC were based on EWLs including waves, therefore applying them to the TUD model results in a very large drop in flood estimates. Only in the Baltic Sea, where waves are less significant, are the protection standards lower than the 100-year EWLs without waves. There is also a large change in flood area when switching between the JRC and FLOPROS databases of flood protection standards. This is because the two datasets were made using different assumptions and sources: FLOPROS was mostly focused on river flood protection, while the JRC study was dedicated to coastal floods only. Also, FLOPROS relied on available information on nominal standards, while JRC's estimates were made taking into account recorded flood damages and expert judgment.
Future changes in flood hazard contain many uncertainties. Small-scale effects such as ground subsidence or coastal erosion/accumulation were not taken into account due to the lack of pan-European information, but could be locally significant. SLR could also have effects on tides and tide-surge interaction (Idier, Paris, Le Cozannet, Boulahya, & Dumas, 2017). GIA is a very slow process, and the rate of vertical motion of the crust changes very little over time, though the resolution of available data is low (Peltier et al., 2015). Meanwhile, sea level rise is a combination of several climate-related factors, which are understood and quantified to a varying degree (Carson et al., 2016; Slangen et al., 2014). Last but not least, there is uncertainty related to climate data, as the accuracy of storm surge estimates depends on the quality of air pressure and wind speed/direction data. As can be noticed from Figures 5 and 6 and the text, the difference between the RCP 4.5 and RCP 8.5 scenarios is sometimes very large, to the point that opposite trends are indicated. The projections used were generated by only one regional climate model, RCA4, driven by one global model, EC-EARTH, which could give different results than other regional-global model combinations or the general circulation models used by Vousdoukas, Voukouvalas, Annunziato, et al. (2016) and Vousdoukas et al. (2017).
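To make the behaviour of the static method concrete, the sketch below floods every cell that lies below the water level and is connected to the sea through other such cells. It is a generic "bathtub" illustration under stated assumptions, not the implementation of any of the assessments compared here: the grid, the 4-neighbour connectivity rule and all names are hypothetical.

```python
from collections import deque

def static_inundation(dem, water_level, sea_cells):
    """Return the set of (row, col) cells flooded by a static water level.

    A cell floods if its elevation is below `water_level` and it can be reached
    from a sea cell through 4-connected cells that are also below that level.
    Flow duration and volume are ignored, as in any static approach.
    """
    rows, cols = len(dem), len(dem[0])
    flooded = set()
    queue = deque()
    for r, c in sea_cells:
        if dem[r][c] < water_level and (r, c) not in flooded:
            flooded.add((r, c))
            queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in flooded
                    and dem[nr][nc] < water_level):
                flooded.add((nr, nc))
                queue.append((nr, nc))
    return flooded

# Hypothetical elevations in metres: sea in column 0, a 3 m ridge in column 2,
# and a low-lying hinterland in column 3.
dem = [
    [0.0, 0.5, 3.0, 1.0],
    [0.0, 1.0, 3.0, 0.5],
    [0.0, 1.5, 3.0, 0.8],
]
sea_cells = [(0, 0), (1, 0), (2, 0)]

flooded_2m = static_inundation(dem, 2.0, sea_cells)
flooded_35 = static_inundation(dem, 3.5, sea_cells)
print(len(flooded_2m), "cells flooded at 2.0 m;", len(flooded_35), "cells at 3.5 m")
```

Once the water level exceeds the ridge, the whole hinterland is marked as flooded at once, regardless of how long the surge lasts or how much water could actually pass. This is one way in which a static approach can overestimate flood extents on flat, low-lying coasts.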

| CONCLUSIONS AND RECOMMENDATIONS
In this study we compared three large-scale assessments of coastal flood hazard with observations and local flood maps. The analysis has shown that hydrodynamic models combined with pan-European or global datasets have good accuracy in estimating extreme water levels, with R² in the range of 0.9, and NSE around 0.8 (which indicates little bias compared to observations). At the same time, the analysis indicated that including non-linear interactions between tides and storm surge in the modelling results in a lower bias compared with extreme values obtained from observations. The study also investigated the accuracy of the pan-European EU-DEM elevation model in the context of its impact on flood modelling. We found that the vertical errors in the coastal floodplains of the three countries analysed (United Kingdom, the Netherlands, Poland) were not negligible. In the case of Poland, the average error of −2.3 m in the coastal zone causes a significant difference between inundation limits derived without and with correction of the DEM. A similar disparity in results was found for several other countries, mainly along the Baltic Sea (Table 4). It is therefore recommended to analyse the accuracy of DEMs underlying the flood analysis before calculating coastal flood extents.
Further, the three large-scale models were juxtaposed with four case studies in Europe, where the 100-year flood hazard zone covers more than 7,000 km². Performance of the static method in all models was not satisfactory. For England, the accuracy of flood zone delimitation was similar in all variants, and was lower than in river flood maps over the same territory analysed separately by TUD and JRC. For Poland the performance was better, as the flood maps for that country come from research which also used static inundation, albeit utilising a high-resolution DEM. For the Netherlands, either the whole low-lying area of the country was indicated as at risk of inundation or none of it was, due to the difficulty of recreating the "dike rings" system in large-scale models. All models largely overestimated the flood zones recorded during the Xynthia storm surge in France. This region includes extensive low-lying areas that could not have been flooded during the actual event because of its short duration, whereas the static method indicates the whole area as being at risk of flooding.
Finally, projections of future changes in flood zones from the TUD assessment were presented (and are publicly available). Depending on the time period and climate change scenario, different factors are the main contributors to future trends: storm patterns, mean sea level rise or glacial isostatic adjustment. However, it was shown that the results are highly influenced by the flood protection levels that were assumed. The two external databases of flood protection levels used different sources and methods, yielding widely disparate results in terms of future flood extents. As noted above, relying on protection standards can lead to underestimation of hazard due to neglecting dike reliability. More research is therefore needed to establish an intermediate solution which would give estimates aligned with local-scale information and expert knowledge. Overall, continental analyses are useful because they are homogeneous and allow drawing conclusions on larger scales. However, at this level of development they cannot replace local assessments when high accuracy is the priority.
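The two skill scores respond differently to bias, which a small example can show. This is a minimal sketch, assuming that R² is the squared Pearson correlation and that NSE is the Nash-Sutcliffe efficiency in its standard form; the water levels are hypothetical and the function names are illustrative.

```python
from math import sqrt

def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated values."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return (cov / sqrt(var_o * var_s)) ** 2

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    mean_o = sum(obs) / len(obs)
    err = sum((o - s) ** 2 for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - err / var_o

# Hypothetical extreme water levels (m): the simulation is 0.5 m too high everywhere.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [o + 0.5 for o in observed]

print("R2 :", round(r_squared(observed, simulated), 3))
print("NSE:", round(nse(observed, simulated), 3))
```

A constant offset leaves the correlation perfect but lowers NSE, so a high NSE alongside a high R² suggests that systematic bias is small.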