Developing a Fluvial and Pluvial Stochastic Flood Model of Southeast Asia

Flood event set generation, as employed in catastrophe risk models, relies on gauge information that is not available in data‐scarce regions. To overcome this limitation, we develop a stochastic fluvial and pluvial flood model of Southeast Asia, using freely and globally available discharge data from the global hydrological model GloFAS and rainfall from the ERA5 reanalysis. We use a conditional multivariate statistical model to produce a synthetic catalog of 10,000 years of flood events. We calculate the flood population exposure associated with each flood event using freely available population data from WorldPop and generate exposure probability exceedance curves. We validate the population exposure curves against observed flood disaster data from EM‐DAT, showing that our methodology provides exposure estimates that are in line with historical observations. We find that there is a 1% probability that more than 30 million people will be exposed to flooding in a given year according to our event set. This number is roughly half the population living in the 100‐year return period flood zone of Fathom's hazard maps, suggesting most studies based on static flood maps overestimate exposure. This analysis provides significant progress over previous non‐stochastic studies which are only able to compute total or average exposure within a given floodplain area and demonstrates that a reanalysis‐based stochastic flood model can be designed to generate reliable estimates of population exposure probability exceedance. This study is a step toward a fully global catastrophe model for floods capable of providing exposure and loss estimates worldwide.


Introduction
Floods are among the most frequent natural hazards, affecting millions of people worldwide every year and causing billions of US$ in economic losses. In 2022, flood events caused US$ 44.9 billion in economic losses and affected 57.1 million people (CRED, 2023, Disasters in Numbers, Brussels: CRED). With changes in climate and urbanization, most regions worldwide are projected to face increases in climate hazards, often including floods. For example, projected increases in the frequency and intensity of heavy precipitation will increase rain-generated local flooding, as reported by the Intergovernmental Panel on Climate Change (IPCC, 2022). Estimating flood hazard and risk at large scales (national to continental) using hydrodynamic models is now more necessary than ever, especially in developing countries and data-scarce regions.
Hydrodynamic models are typically configured to generate a hazard map associated with a streamflow (or rainfall) event with a set probability of exceedance. Implicitly, the return period of the event over the model domain is therefore deemed constant. For example, an event with a peak inflow discharge return period of 1 in 20 years is assumed to generate water depths over the domain that are exceeded once every 20 years, on average. This assumption is acceptable at small scales but breaks down when modeling larger regions. A country-scale event in which all the rivers experience a 1-in-20-year flood simultaneously is a physical impossibility; instead, at large scales the return period of an event will differ at different points on the river network. This is often termed the spatial footprint of the event. Despite this obvious feature of real floods, national-scale flood hazard maps, for example, those created by FEMA in the US and the Environment Agency in England, are typically a patchwork quilt of local modeling studies that each assume a constant return period in space, and which are then stitched together to produce the large-scale data set.
Constant-in-space, or static-probability, national hazard maps are useful for many applications, such as floodplain zoning and infrastructure planning, but have limitations when it comes to calculating flood losses. Typically, flood losses are computed by combining spatial maps of water depth with depth-damage functions. However, applying these at large scales without accounting for the spatial variations in return period that are characteristic of real floods can result in a misestimation of the loss that could realistically occur (Metin et al., 2020). In fact, this approach can only tell us about the average exposure or loss we might expect over a particular region and time period (e.g., the Expected Annual Exposure or Expected Annual Damage), or the total exposure or loss within a floodplain zone with a given probability of inundation. For many risk management decisions, such as insurance and reinsurance pricing or disaster preparedness, we also need to know the expected loss in non-average years. For example, insurance firms are typically required to set aside sufficient capital to ensure they remain solvent even under very challenging conditions (e.g., when confronted by country-scale annual losses with a 1-in-200-year return period, as stated in the European Commission's Solvency II regulations, 2009). To provide quantitative answers to such questions, the full exposure-exceedance or loss-exceedance probability curve must be calculated.
Computing the exposure-exceedance curve for large-scale problems therefore requires that we simulate flood spatial footprints in a realistic manner. Such model boundary conditions could come from a network of gauging stations, but the limited duration of most observational records means we have a very limited number of samples of event spatial footprints relative to the return periods of the very rare exposure and loss years in which we are most interested. Instead, we need an approach that extends gauge records to generate many hundreds or thousands of years of extreme events with realistic spatial patterns of flow return period across a river network. These extended records can provide a much larger sample of plausible boundary conditions for large-scale hydrodynamic models, which can then be used to compute the flood depths from each event. These water depths can then be intersected with population or land use maps and coupled with damage functions to give the exposure and loss for each event. Hydrodynamic models built in this way avoid the constant-return-period-in-space assumption of local studies and properly capture the spatial heterogeneity of flood events. In turn, these large catalogs of plausible flood events, also known as event sets, extend limited observational records and allow us to look more clearly into the tails of the exposure and loss distributions to obtain a more comprehensive view of flood risk.
Flood event sets can be generated through continuous hydrological simulation or by using statistical dependence models. The latter approach has the advantage of significantly lowering the computational cost of generating the many thousands of years of synthetic events needed, when compared to running a full rainfall-runoff model over such long periods. In this study we use an established multivariate extreme frequency analysis statistical dependence model, based on the Heffernan and Tawn (2004) method, further developed by Quinn et al. (2019) and previously used in a number of large-scale studies (e.g., Olcese et al., 2022; Wing et al., 2020).
The multivariate statistical modeling approach, like the one applied by Quinn et al. (2019), is based on modeling the spatial dependency between extreme events measured at streamflow gauges, and then using the fitted model to stochastically simulate flood events with realistic spatial patterns. This has proven effective in data-rich areas such as the United States, with a relatively dense gauge network and many years of observations. However, since this methodology relies on historical observations, it is not immediately applicable to ungauged basins, or data-scarce regions in general, where we do not have a long enough timeseries of measured streamflow from a large enough number of gauges. Moreover, there are many regions worldwide where there is almost no observed data, such as large parts of Africa, Asia, and Central and South America.
Global hydrological models (GHMs) and climatological reanalyses can provide discharge and rainfall hindcasts over large areas for sufficiently long periods of time to drive a stochastic analysis, and could therefore be used as input to a statistical dependence model. Whilst we know that the absolute discharges and rainfalls simulated by these models can have significant biases (Jafarzadegan et al., 2023), previous work (Olcese et al., 2022; Wing et al., 2020) has shown that GHMs do have skill in simulating the spatial dependence in extreme flows between different points in space. Put simply, if flow at site X is extreme then the probability that flow at site Y is also extreme is similar irrespective of whether this is identified using flow data from ground gauges or from GHM output. The spatial patterns in return period across an event footprint are therefore well simulated in GHMs, and we convert back to absolute discharge using standard at-a-site or regionalized flood frequency analyses developed from ground data. Wing et al. (2020) developed and tested this model-based method in the United States by comparing it to a gauge-based approach, and later research from Olcese et al. (2022) tested it globally wherever gauge data were available. This model-based approach to estimating the spatial dependence of extremes uses discharge return period from GHMs (instead of measured streamflow). The locations are appropriately chosen along the hydrological model's river network, daily discharge data and rainfall are extracted from the model at these locations, and the statistical model then outputs a catalog of events in the form of values of the return period at each site for each synthetic flood. We use the return periods to obtain the corresponding water depths for the area associated with each dependence location using precomputed hazard maps. In our case we use the Fathom global hazard maps (Sampson et al., 2015; Wing et al., 2017, 2023), which have been shown to have good skill at simulating flood water depths at sub-100 m scales over large areas (Bates et al., 2021, 2023; Wing et al., 2017, 2019, 2021). For boundary conditions, the hydrodynamic inundation modeling uses regionalized frequency analysis of ground gauge observations to estimate the magnitude of extreme rainfall or river flow events anywhere in the domain of interest, and the inundation model and stochastic model only connect in terms of event probabilities. By working in probability space and separating the stochastic model (which just uses correlations) from the inundation model (which uses ground-based observations of discharge and rainfall), we neatly sidestep the problem of GHM discharge bias.
The previous studies from Wing et al. (2020) and Olcese et al. (2022) showed that the use of discharge data from GHMs gave promising results in terms of recreating the observed dependence structures between sites as seen in historical records. However, these studies focused on a site-to-site comparison to existing gauges and did not explore designing a dense network of synthetic gauges to test the approach in practice. There was no focus on the production of realistic spatial patterns and little validation beyond river discharge co-correlation. The tests in data-scarce regions did not include an analysis of how the hazard dependence might propagate into exposure, and the modeling chain was not validated beyond river discharge. These tasks are therefore the purpose of this paper, in which we apply the methodology to Southeast Asia, designing a synthetic gauge network with which to train the statistical model and validating the resulting exposure against observed flood events. We have also extended the method beyond previous studies by including pluvial data in our analysis.
At a global level, Southeast Asia is one of the regions with the highest flood risk, due to climate conditions and population exposure. The region may be even more at risk from floods in the future considering the effects of climate change, urbanization trends, and economic development (Winsemius et al., 2016). As a result, the demand for comprehensive flood risk analysis in the area is ever growing. Brunei, Cambodia, Laos, Malaysia, Myanmar, the Philippines, Thailand and Vietnam have a total population of 398 million people (United Nations, World Population Prospects, 2022), many of whom live in river deltas and floodplains (Tierolf et al., 2021). More specifically, 1,973 people per km of river reach length are exposed in laterally unconfined floodplains (with minimal or no levees to contain the floodwaters, where flooding and exposure grow exponentially with discharge return period), compared to 183 in North America, whilst 59% of cropland area in this region is located in floodplains and thus requires a human presence (Devitt et al., 2023). The region is often affected by fluvial flooding along its major rivers, such as the Chao Phraya, the Red River and the Mekong, as a result of monsoons and tropical cyclones. Most of the floods that occur in mainland Southeast Asia are associated with the East Asian summer monsoon (Loo et al., 2015), while tropical cyclones often affect the Philippines and Vietnam (Tran et al., 2022).
In contrast with the clear need for flood models, the availability of historical records for discharge and rainfall in the Southeast Asian region is very limited, especially when working at a large scale, and this makes the use of gauge-based approaches difficult. As a comparison, the GRDC (Global Runoff Data Centre) database contains only ∼300 river discharge gauges in the whole of Southeast Asia (∼4.5 million km²), giving a gauge density of 1 gauge per ∼15,000 km², while the USGS (United States Geological Survey) operates over 11,000 gauges in the United States (∼9.4 million km²), equating to a density of 1 gauge per ∼855 km². By using GHMs to condition the stochastic model we can overcome the limited sampling density of observed data and generate many thousands of years of realistic flood events for Southeast Asia, thereby providing useful new information on the flood risk in the region. The gauge density is then constrained only by the computational power available.

Study Area
Our reanalysis-based stochastic flood model of Southeast Asia includes Malaysia, Myanmar, Thailand, Vietnam, Brunei, Cambodia, Laos and the Philippines. We simulate fluvial and pluvial flooding, using as inputs the new GloFAS v4.0 river discharge reanalysis (Grimaldi et al., 2022) and ERA5 precipitation data (Hersbach et al., 2020), both freely available globally. As previously stated, we chose this area because of its high exposure to flood events and the lack of gauge data, making it a good candidate for developing and testing our reanalysis-based stochastic modeling approach.

GloFAS v4.0 Hydrological Reanalysis
The GloFAS v4.0 hydrological reanalysis is the latest generation of the ERA5-forced hydrological reanalysis produced by the Copernicus Emergency Management Service (Grimaldi et al., 2022). It has a grid resolution of 0.05° (∼5 km) with a quasi-global extent (from −180° to 180° of longitude and from 90° north to 60° south latitude) and provides daily maps of discharge in m³/s from 1980 to 2022. The meteorological input variables for GloFAS v4.0 are ERA5 2-m temperature, 2-m dew point temperature, 10-m U wind component, 10-m V wind component, surface solar radiation downwards and surface thermal radiation. These variables are used to run LISFLOOD v4.1.0, a semi-distributed, physically based hydrological model (De Roo et al., 2000) that represents catchment hydrological processes such as the partition of precipitation into rain and snow, snow melting, canopy interception of rain, water infiltration into the soil, groundwater storage, surface runoff, lakes, dams, irrigation and other human water uses, and flow in the rivers and floodplains (Grimaldi et al., 2022).

ERA5 Reanalysis Precipitation
The ERA5 reanalysis is produced by the European Centre for Medium-Range Weather Forecasts (ECMWF) and based on the Integrated Forecasting System (IFS) Cy41r2 (Hersbach et al., 2020). It comprises records of the global atmosphere, land surface and ocean waves from 1950 onwards. It has a horizontal resolution of 31 km and provides hourly outputs of accumulated precipitation. In this study we use daily accumulated precipitation, derived from the hourly data, to drive the spatial dependency of pluvial events, for consistency with other data (e.g., daily flows).

Site Selection of Synthetic Gauges
Site selection is an important element of the model setup process: the chosen locations are the ones that will be analyzed for spatial dependence, and each point will be associated with an area that will then be used to sample from pre-computed flood hazard maps (based on Sampson et al., 2015). For each simulated event, flooding in the area around each dependence point will be considered as having the same magnitude as the event at that location. The selection of the locations is therefore optimized to be associated with the appropriate drainage basin (for fluvial points) or with a surrounding area (for pluvial points). Without having to rely on existing gauges for data input, we are able to choose any locations to represent fluvial and pluvial flooding in the area. We selected pluvial sites located on a grid of 0.25° (∼31 km), selecting one point for each tile in the ERA5 data set, while fluvial points are positioned at the outlet of each modeled reach in the Fathom global hydrodynamic model hazard layers (Sampson et al., 2015); boundaries between reaches are created wherever there is a change in stream order, and larger rivers are also split into multiple reaches based on changes in upstream accumulation exceeding a threshold. With these criteria, we obtain a total of 14,390 locations (4,747 fluvial points and 9,643 pluvial points) over Southeast Asia, for a gauge density of 1 fluvial site per ∼520 km² and 1 pluvial site per ∼256 km², significantly higher than what the observation network can provide in this region (1 gauge per ∼15,000 km²). For each location we can extract daily discharge data (for fluvial points) or daily rainfall (for pluvial points). This hindcast data is the input to our stochastic modeling framework.

Stochastic Multivariate Conditional Model
The stochastic flood model used in this study is based on Quinn et al. (2019); this approach has been previously used by multiple studies, such as Diederen et al. (2019), Ghizzoni et al. (2012), Keef et al. (2009), Neal et al. (2012), Wing et al. (2020), and Olcese et al. (2022). A conceptual diagram of the model can be seen in Figure 2. It uses a statistical approach to derive the spatial dependency between peak river flows or rainfall at different sites, using the multivariate conditional extremes model given in Heffernan and Tawn (2004). The fitted model is then used to stochastically simulate 10,000 years of synthetic flood events, hereafter termed the event set.

Pre-Processing and Clustering
Fluvial data are collected from the GloFAS grid, and we make sure to link each site to the correct GloFAS cell by comparing the locations and the upstream catchment area values between the Fathom model and GloFAS. This is done using a fully automated procedure that iteratively searches for the cell with the best match in accumulation (less than 20% difference) within a 10-cell radius from the starting point. Precipitation data are extracted from the ERA5 1-hr accumulation layer and then aggregated to daily rainfall. After extracting daily discharge and precipitation timeseries for fluvial and pluvial points, we look for local peaks that exceed a 1-in-0.5-year threshold and use a spatiotemporal algorithm to cluster them into unique events, using different temporal windows tailored to the typical event duration at each site. The typical event duration for each site is evaluated using a large moving window to first identify the flow/precipitation peaks at each site, and then defining the event duration as the number of days it takes the flow to rise (and then fall) by a 50% variation from the mean value in our time window. After identifying fluvial and pluvial events separately and obtaining unique fluvial and pluvial clusters, we check for temporal and spatial overlap between the two perils and merge any combined fluvial and pluvial events. The output from this clustering phase is a table where each cell is a streamflow/rainfall value, each row represents a single event and each column a site.
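To make the clustering step concrete, the sketch below groups threshold-exceeding peaks into events using a purely temporal window. This is a simplified illustration, not the paper's algorithm: the real procedure is spatiotemporal and uses site-specific windows, and all names, series and thresholds here are invented toy values.

```python
def find_peaks(series, threshold):
    """Return indices of local maxima that exceed the site's peak threshold."""
    idx = []
    for t in range(1, len(series) - 1):
        if series[t] >= threshold and series[t] >= series[t - 1] and series[t] > series[t + 1]:
            idx.append(t)
    return idx

def cluster_peaks(peaks_by_site, window_days):
    """Greedy temporal clustering: consecutive peaks (at any site) closer in
    time than window_days are merged into a single event."""
    flat = sorted((day, site) for site, days in peaks_by_site.items() for day in days)
    events, current = [], []
    for day, site in flat:
        if current and day - current[-1][0] > window_days:
            events.append(current)
            current = []
        current.append((day, site))
    if current:
        events.append(current)
    return events

# Toy example: two sites peaking on nearby days form one event; a later peak is separate.
peak_idx = find_peaks([0.0, 1.0, 5.0, 2.0, 1.0], threshold=3.0)
events = cluster_peaks({"siteA": [10, 40], "siteB": [12]}, window_days=7)
```

In the toy example the peaks on days 10 and 12 merge into one event while the day-40 peak forms a second one, mirroring how nearby fluvial and pluvial peaks end up in the same cluster.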

Dependence Model
After preparing the training event set (derived from the hydrological models and reanalysis), we use the semiparametric approach given in Coles and Tawn (1991) to capture the marginal distribution at each site. For this approach, a Generalized Pareto Distribution (GPD) is used to model observations exceeding some high extreme threshold (Coles, 2001), while observations below the threshold are modeled empirically via a probability rank transform. To select the GPD thresholds, we utilize the approach given in Murphy, Tawn, and Varty (2023), whereby a threshold is selected to minimize the absolute difference between modeled and input quantiles; the corresponding software for this approach is given in Murphy, Tawn, Varty, et al. (2023). The following step is to transform the marginal data sets onto the standard Laplace scale using the probability integral transform and fit the multivariate extremes model, originally proposed in Heffernan and Tawn (2004) (H&T model henceforth) and further extended in Keef et al. (2013), to each conditioning site. A generic formulation of the H&T model, which simulates extremes at site Y_j when site Y_i is extreme, can be described by the following equation:

(Y_j | Y_i = y) = α_{j|i} y + y^{β_{j|i}} Z_{j|i}

where y denotes an extreme streamflow/rainfall at site Y_i, the parameter −1 ≤ α_{j|i} ≤ 1 controls the form and strength of the dependence, the parameter β_{j|i} < 1 affects how the dependence relationship changes as the magnitude at the conditioning site increases, and Z_{j|i} is a vector of independent residual terms. This formulation is used to capture all possible pairwise dependencies between a conditioning site and its neighboring sites (every site within the study area can be a neighboring site; we do not set any distance limit when evaluating correlation).
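The conditional simulation implied by the H&T formulation can be sketched as follows, given hypothetical fitted parameters. Here α, β, the residual sample and the conditioning value are all invented for illustration; in practice the parameters are estimated per site pair and the residuals come from the model fit.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_ht(alpha, beta, residuals, y_cond, n_sim):
    """Simulate the dependent site Y_j on Laplace margins, given an extreme
    conditioning value y_cond at site Y_i, by resampling fitted residuals Z."""
    z = rng.choice(residuals, size=n_sim, replace=True)
    return alpha * y_cond + (y_cond ** beta) * z

# Hypothetical fitted parameters and residual sample for one site pair.
alpha, beta = 0.8, 0.3
residuals = rng.normal(0.0, 1.0, size=500)
y_extreme = 6.0  # a large value on the Laplace scale
sims = simulate_ht(alpha, beta, residuals, y_extreme, n_sim=1000)
```

With α close to 1 the simulated values concentrate near the conditioning value (strong dependence); with α near 0 and light-tailed residuals they revert toward independence.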
Using the fitted dependence model, we can approximate the observed multivariate distribution for any given conditioning site. With this information we then simulate 10,000 years of extreme events at our locations (Keef et al., 2012). Simulating 10,000 years allows the generation of a good sample of very large events with return periods in the range of 100-200 years, which are of interest for insurers. The main output of the model is an event set, in the form of return levels (actual streamflow and rainfall values) and magnitude (Laplace margins and return periods) for all the sites used as inputs. In our case the 10,000-year simulation produced 832,880 distinct flood events over the Southeast Asia region.

Population Exposure
The calculation of population exposure was the final step of the stochastic model validation. Knowing the return period associated with each site for each event, it is possible to calculate the corresponding population exposure caused by each event and compare it to observed data. A conceptual diagram of the exposure calculation method can be seen in Figure 3.
As a first step, each fluvial and pluvial site in Figure 1 was associated with a corresponding area, in order to obtain the flooded area associated with the location being extreme: each fluvial site was linked to its upstream area (based on the basin subdivision in HydroBASINS, Lehner & Grill, 2013) and each pluvial point was linked to the surrounding 0.25° by 0.25° area. After identifying the affected area associated with each location, and knowing the return periods linked to each event, we can calculate the local area affected by each flood event and subsequently use high-resolution population data to derive the population exposure. The same process can be used to create event flood maps from the event set and the hazard maps. Each site is associated with a corresponding area, and the event flood map is created by extracting the regions associated with the flooding sites from the hazard maps with the appropriate return periods.
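The mosaicking of per-site hazard footprints into an event flood map can be sketched as below. The 4x4 grid, site masks, return periods and the nearest-layer snapping rule are all illustrative assumptions, not the paper's implementation (which samples precomputed Fathom hazard maps per site area).

```python
import numpy as np

AVAILABLE_RPS = [5, 10, 20, 50, 100, 200, 500, 1000]  # precomputed hazard layers

def snap_rp(rp):
    """Snap an event return period to the nearest precomputed hazard layer."""
    return min(AVAILABLE_RPS, key=lambda r: abs(r - rp))

def event_flood_map(event_rps, site_masks, hazard):
    """Union over sites of (site area) AND (hazard footprint at that site's
    snapped return period)."""
    shape = next(iter(site_masks.values())).shape
    flood = np.zeros(shape, dtype=bool)
    for site, rp in event_rps.items():
        flood |= site_masks[site] & hazard[snap_rp(rp)]
    return flood

# Toy 4x4 domain: two sites covering the left/right halves; the 100-year
# hazard layer floods everything, the 10-year layer only the central 2x2 block.
left = np.zeros((4, 4), dtype=bool)
left[:, :2] = True
right = ~left
hazard = {10: np.pad(np.ones((2, 2), dtype=bool), 1), 100: np.ones((4, 4), dtype=bool)}
fmap = event_flood_map({"A": 12, "B": 95}, {"A": left, "B": right}, hazard)
```

Intersecting the resulting mask with a gridded population layer then yields the event's exposure.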
The hazard data used to determine the flood footprint associated with each site at various return periods was produced by Fathom and is the latest update (version 3) of their global flood maps (Wing et al., 2023). This hazard data has been extensively validated in different regions (Wing et al., 2023), including Central Vietnam (Hawker et al., 2023). The data are produced with a variant of the LISFLOOD-FP 1D/2D hydrodynamic model (Bates et al., 2010) run at 1 arcsecond (∼30 m) spatial resolution over the entire terrestrial land surface. The modeling uses the FABDEM digital elevation model (Hawker et al., 2022) for the 2D model component, and the MERIT-Hydro hydrography (Yamazaki et al., 2019) and the channel bathymetry optimization algorithm of Neal et al. (2021) to define the 1D rivers.
The hazard map for each return period was first converted into a binary image representing flooded and non-flooded cells, considering as fluvial flooding anywhere in the fluvial hazard layer with a water depth greater than 15 cm, as suggested by the UK Environment Agency (2019) (we later explored changing this threshold to 30 cm), and as pluvial flooding anything with more than 10 cm of water in the pluvial hazard maps, as in Mediero et al. (2022). The choice of a suitable water depth threshold to define what is or isn't considered flooded, and which compares appropriately to EM-DAT records, is not an obvious one: it is highly likely that floods with trivial water depths won't be recorded in the EM-DAT database, so zero would not be an appropriate choice; 15 cm is approximately the property threshold height, while 30 cm of water depth starts to cause issues for human stability and household electrics. As we cannot be sure about what has and hasn't been recorded in EM-DAT, we chose to include a plausible range in our study. This step produced 16 binary hazard maps: 8 for each peril (fluvial and pluvial), representing the various return periods simulated (5, 10, 20, 50, 100, 200, 500, and 1,000 years). We also created 16 "clean" binary layers, considering pluvial flooding only in the cells where there was no fluvial flooding at each return period, and vice versa. These "clean" layers were used to avoid double counting exposure in mainly fluvial events with a pluvial component and mainly pluvial events with a fluvial component.
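The thresholding and "clean"-layer logic reduces to a few array operations; the sketch below uses the thresholds quoted above on toy 2x2 depth grids (the real layers are ∼30 m resolution rasters).

```python
import numpy as np

FLUVIAL_THRESHOLD = 0.15  # m, after UK Environment Agency (2019) guidance
PLUVIAL_THRESHOLD = 0.10  # m, after Mediero et al. (2022)

# Toy water-depth grids in metres (illustrative values only).
fluvial_depth = np.array([[0.00, 0.20],
                          [0.50, 0.05]])
pluvial_depth = np.array([[0.12, 0.30],
                          [0.00, 0.20]])

# Binary flooded / non-flooded masks for each peril.
fluvial_flooded = fluvial_depth > FLUVIAL_THRESHOLD
pluvial_flooded = pluvial_depth > PLUVIAL_THRESHOLD

# "Clean" pluvial layer: pluvial flooding only where there is no fluvial
# flooding at the same return period, so exposure is not double counted.
pluvial_clean = pluvial_flooded & ~fluvial_flooded
```

Repeating this for both perils at each of the eight return periods yields the 16 binary maps and 16 "clean" layers described above.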
Using population data from the WorldPop database (more specifically, the 100 m spatial resolution 2020 Constrained without UN adjustment data set, Tatem, 2017), we calculated fluvial and pluvial population exposure for the sub-areas associated with each synthetic gauge for each simulated return period. This allowed us to then extract the population exposure associated with each synthetic event in our data set, using linear interpolation between the return periods.
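The per-event lookup can be sketched as a one-dimensional interpolation over the precomputed exposure-versus-return-period table. The exposure values below are toy numbers, and interpolating linearly on the return periods themselves is an assumption for illustration.

```python
import numpy as np

# Precomputed per-site exposure at the eight simulated return periods
# (people; toy values for illustration).
rps = np.array([5, 10, 20, 50, 100, 200, 500, 1000], dtype=float)
exposure_at_rp = np.array([1e3, 5e3, 1.2e4, 3e4, 5e4, 8e4, 1.3e5, 2e5])

def event_exposure(event_rp):
    """Linearly interpolate exposure for an event's return period."""
    return float(np.interp(event_rp, rps, exposure_at_rp))

exp_75 = event_exposure(75.0)  # falls between the 50- and 100-year values
```

Summing these interpolated values over all sites flooding in an event (using the "clean" layers to avoid double counting) gives the event's total exposure.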
To validate our results, we calculated the total annual exposure for each year in our 10,000-year simulation and compared it to available disaster observations. The observed exposure data we chose as a benchmark for our model is from EM-DAT, the International Disaster Database produced by the Centre for Research on the Epidemiology of Disasters (CRED), part of the University of Louvain (UCLouvain). This database is a global inventory of hazards and disasters, compiled using data from UN agencies, non-governmental organizations, reinsurance companies, research institutes and press agencies. EM-DAT includes disasters related to natural and technological hazards, such as floods, droughts, earthquakes, landslides or industrial accidents. It focuses on major disasters and includes events that meet at least one of the following criteria: 10 fatalities, 100 affected people, a formal declaration of a state of emergency, or a call for international assistance. We extracted EM-DAT disaster data for the countries within our study area (Malaysia, Myanmar, Thailand, Vietnam, Brunei, Cambodia, Laos and the Philippines) and filtered it by disaster type to only include flooding. We obtained a data set of 368 events across 23 years (from the year 2000 to the present day). For 349 of the 368 events, data about the total number of affected people was provided. EM-DAT doesn't provide a definition of affected people beyond describing it as "the sum of injured, homeless and affected." As EM-DAT uses a variety of sources to compile its database, it is likely that the definition used to calculate the number of affected people is event dependent. As we do not know exactly the criteria used to determine the number of affected people during each event, we decided to compare it to the total number of people living in the flooded areas during each event in our model. The exposure values associated with each event from EM-DAT were normalized according to the population growth rate (World Bank, World Development Indicators: population growth, annual %) in each country, depending on the year of the event, in order to allow a fair comparison to our model-derived exposure, which is based on present-day (2020) population.
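One way to read the normalization step is as compounding a country's annual growth rates from the event year forward to 2020. The sketch below uses invented growth rates, not World Bank values, and assumes simple multiplicative compounding.

```python
# Hypothetical annual population growth rates (fractions) for one country.
# Real rates come from the World Bank World Development Indicators.
growth = {2015: 0.011, 2016: 0.010, 2017: 0.010, 2018: 0.009, 2019: 0.009}

def normalize_to_2020(affected, event_year):
    """Scale an EM-DAT 'affected' count from its event year up to the 2020
    population basis used by the model, compounding annual growth."""
    factor = 1.0
    for year in range(event_year, 2020):
        factor *= 1.0 + growth[year]
    return affected * factor

adjusted = normalize_to_2020(100_000, 2015)  # ~5% larger on a 2020 basis
```

This puts historical EM-DAT counts and the model's 2020-population exposure on a comparable footing.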

Model Validation
The first step of model validation is to check how accurately the spatial dependency seen in the training data can be simulated by the statistical model. We use standard metrics to evaluate the accuracy of the model, such as estimates of the extremal dependence coefficient χ (Coles et al., 1999), the fits of the marginal distributions, the return levels, and the event footprint sizes.

GPD Fits Evaluation
The fits of the GPD to each marginal distribution are evaluated for each site. We produce quantile-quantile (QQ) plots to determine whether our simulated data set has the same marginal distribution as the training data. Because of the threshold selection technique, we expect the QQ plots to be very close to the y = x line. We produced one plot for each site in order to visually check the results, and we calculated the mean absolute percentage error (MAPE) for each peril across all sites.
The following equation was used to calculate the MAPE, where n is the number of sites, Y_e is the empirical quantile and Y_m is the modeled quantile:

MAPE = (100/n) Σ |(Y_e − Y_m)/Y_e|,

with the sum taken over all n sites.
For both fluvial and pluvial sites, most locations have errors below 5%, with the mean being around 2%-3%, as we can see in Figure 4. We are therefore satisfied with the performance of the GPD fit.
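The MAPE computation above reduces to a single vectorized expression; the quantile values below are toy numbers for three hypothetical sites.

```python
import numpy as np

# Empirical (training) vs modeled quantiles at three toy sites.
empirical = np.array([100.0, 250.0, 400.0])
modeled = np.array([98.0, 260.0, 390.0])

# MAPE = (100/n) * sum(|(Y_e - Y_m) / Y_e|) over the n sites.
mape = 100.0 / len(empirical) * np.sum(np.abs((empirical - modeled) / empirical))
```

Values in the 2%-3% range, as reported above, indicate close agreement between the fitted GPD quantiles and the training data.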

Chi Estimates
The extremal dependence coefficient χ is the conditional probability that site Y is extreme when site X is extreme. If we accurately represent the dependence structure seen in the input data set, we expect the simulated χ estimates to be similar to the input ones. In Figure 5, we plot the synthetic estimates of χ from our model (on the y axis) against the historical (training) ones derived from GloFAS and ERA5 data (on the x axis). Some scatter in the plot is expected, but the modeled χ values are similar to what we see in the input data, albeit with a slight bias. We calculated the absolute percentage error between modeled and empirical χ estimates across all sites and found that the majority of sites have error values below 30%. We show a summary of the error statistics in the top left of Figure 5. As the overall error values are adequately small and other validation metrics show adequate model performance, we are satisfied with the representation of the dependence structure produced by the model at this stage.
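An empirical χ estimate at a fixed high quantile can be sketched as below: among the days on which site X exceeds its u-quantile, the fraction on which site Y also exceeds its own u-quantile. The paired series here are synthetic, built with a shared component purely to illustrate dependence.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
shared = rng.exponential(size=n)          # common driver (e.g., regional rainfall)
x = shared + rng.exponential(size=n)      # two dependent toy flow series
y = shared + rng.exponential(size=n)

def chi_hat(x, y, u=0.95):
    """Empirical chi at quantile u: P(Y extreme | X extreme)."""
    xq, yq = np.quantile(x, u), np.quantile(y, u)
    exceed_x = x > xq
    return float(np.mean(y[exceed_x] > yq))

chi = chi_hat(x, y)
```

Under independence chi_hat would be near 1 − u (here 0.05); the shared component pushes it well above that, which is the signal the dependence model must reproduce.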

Return Levels Comparison
Historical (training) and synthetic return level estimates are compared for various return periods (1, 5, 10, 20, 30, 35, 100, 1,000, and 10,000 years). The return levels (streamflow for fluvial sites and precipitation for pluvial sites) are computed using the estimated marginal (GPD) distributions for the historical events and empirically for the synthetic events. The results show a good correspondence between training and synthetic return levels. In Figure 6 we show the results for some return periods (5, 10, 20, and 30 years), with the historical return levels on the x axis (as seen in the input data from GloFAS and ERA5) and the modeled synthetic return levels on the y axis. The top four panels refer to the 4,747 fluvial locations, while the bottom four panels refer to the 9,643 pluvial sites. Whilst there is an expected spread around the y = x 1:1 red line, especially for the pluvial sites and particularly at higher return periods, most sites show a good correspondence between the return level estimates, similar to that seen in previous applications such as Quinn et al. (2019). Further graphs (for return periods of 1, 35, 100, 1,000, and 10,000 years) are available in the supplementary materials. The top left corner of each figure summarizes the statistics of the absolute percentage errors in the return level estimates: 90% of the sites have errors below 30% in all cases.
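For reference, the T-year return level implied by a fitted GPD (as used for the historical side of the comparison) has a standard closed form. The sketch below is a generic illustration with invented parameter values, not the fitted values from this study.

```python
import numpy as np

def gpd_return_level(T, threshold, scale, shape, rate):
    """T-year return level from a GPD fitted above `threshold`, where
    `rate` is the mean number of threshold exceedances per year."""
    if abs(shape) < 1e-9:  # exponential limit as shape -> 0
        return threshold + scale * np.log(rate * T)
    return threshold + (scale / shape) * ((rate * T) ** shape - 1.0)

def empirical_return_level(annual_maxima, T):
    """Empirical T-year return level from simulated annual maxima,
    taken as the (1 - 1/T) quantile."""
    return np.quantile(annual_maxima, 1.0 - 1.0 / T)

# Example: 10-year return level for a GPD with 5 exceedances per year.
rl10 = gpd_return_level(T=10, threshold=100.0, scale=20.0, shape=0.1, rate=5.0)
print(rl10)
```

Plotting `gpd_return_level` values for the training fits against `empirical_return_level` values from the synthetic catalog, site by site, reproduces the kind of comparison shown in Figure 6.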

Footprint Sizes
As part of our standard validation metrics, we evaluate the simulated footprint sizes (defined as the number of sites experiencing a level greater than the 1-in-0.5-year event), to determine whether the sizes of our simulated events are similar to those of the events used as model input. The synthetic event sizes are slightly smaller than the historical benchmark, but overall the correspondence is good. In Figure 7 we can see the cumulative distribution of the event sizes and notice how larger events are associated with higher non-exceedance probabilities in the synthetic data set.
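The footprint-size check reduces to comparing two empirical CDFs. A minimal sketch, with invented log-normal sizes standing in for the historical and synthetic catalogs, is:

```python
import numpy as np

def footprint_cdf(event_sizes):
    """Empirical CDF (non-exceedance probability) of event footprint sizes,
    where size = number of sites exceeding their 1-in-0.5-year level."""
    sizes = np.sort(np.asarray(event_sizes))
    probs = np.arange(1, len(sizes) + 1) / len(sizes)
    return sizes, probs

rng = np.random.default_rng(1)
# Illustrative stand-ins for the historical and synthetic catalogs.
hist_sizes = rng.lognormal(mean=3.0, sigma=1.0, size=500)
synt_sizes = rng.lognormal(mean=2.9, sigma=1.0, size=5_000)
s_hist, p_hist = footprint_cdf(hist_sizes)
s_synt, p_synt = footprint_cdf(synt_sizes)
# If, at a given size, the synthetic CDF sits above the historical one,
# synthetic events tend to have smaller footprints.
```

This is the sense in which higher non-exceedance probabilities in the synthetic curve of Figure 7 indicate slightly smaller simulated events.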

Qualitative Validation of Event Footprints
Having established that the synthetic event set adequately reproduces the characteristics of the model-based historical event set used as input, we qualitatively evaluated a sample of events to visually check their spatial structure. More specifically, the aim was to test whether events had a realistic spatial structure and represented the effect of physically plausible phenomena, such as monsoon systems, convective storms or tropical cyclones. To do this, we visually examined the events with the largest footprints at each conditioning site and a sample of events with return periods at the conditioning site between 100 and 200 years. A small subset of events is shown in Figure 8 to give an indication of the findings.
As a first test, we checked whether our model produced any implausibly large events, by visually examining the events with the highest number of affected sites. We identified some very large events, especially ones associated with high return periods at the conditioning site, but no events showed clearly implausible footprints. Major flood events affecting large portions of mainland Southeast Asia are not uncommon in this region, often caused by monsoon systems (an example of such an event produced by the model is shown in the top left panel of Figure 8). We therefore conclude that the large events generated by the model are plausible for this region.
When visually checking maps of randomly sampled 100-200 year events, we also identified some cases where the events consist of two apparently separate clusters of sites affected at the same time. These instances are likely caused by the pre-processing of the input, more specifically in the pre-clustering phase, when historical events are generated from the GloFAS discharge hindcasts. During this process, two individual floods caused by distinct physical mechanisms can sometimes be merged into one, creating a bimodal event with more than one hotspot. This behavior could be avoided by shortening the time window within which the clustering process is allowed to look for and merge events. An example of this pattern is shown in the top right panel of Figure 8.
Aside from this multiple-hotspot pattern, the vast majority of the events that were qualitatively assessed resemble realistic flood events that can occur in the region, and we believe they can provide useful flood risk information. The bottom two panels in Figure 8 show two examples of plausible events: a fluvial event in the Mekong area and a pluvial event affecting Indonesia.
Due to the large number of events produced by our 10,000-year simulation (a total of 832,880 events), it is not possible to quality-check all of them, but a visual sanity check of a non-trivial sample was still considered a valuable element of model validation and showed no evident issues with the event footprints.

Validation of Population Exposure
As a first comparison between modeled and observed data, we computed the total Exceedance Probability (EP) curves for flood exposure (caused by all events, fluvial events only, and pluvial events only) and compared them to EM-DAT disaster data. An EP curve is generated by calculating the exposure for each event and then summing the values by year to obtain total annual exposure. The exposure values are then sorted in descending order to associate a return period with each annual value. This process was first done considering our entire ∼800,000-event catalog and then by partitioning the event set into separate fluvial and pluvial floods. The resulting curves are shown in Figure 9. The same method was then applied to EM-DAT data and plotted on the same graph for comparison. Figure 9 shows that our modeled exposure is overall higher than what we see in EM-DAT records, which is not surprising, as EM-DAT is likely missing a large number of small flood events that do not fall within its disaster category.
Since EM-DAT selectively samples larger events and, in addition, has only ∼23 years of observations, a better way to compare it to our model results is to sample similar subsets of events from our simulation and compare the exposure that they cause. We extracted 1,000 samples of 23 years from our model and, for each of these, selected the biggest 349 events (as EM-DAT is likely to contain only the most impactful floods). This event sampling was done to match the frequency of our modeled events to the historical records, and then to validate the magnitude of the exposure. We also explored including all simulated events for each year and found it made a negligible difference to the results shown in Figure 10. We then produced EP curves for each subset and compared the resulting exposure to the exposure reported in EM-DAT. We initially computed the exposure using a 15 cm depth threshold to determine which areas are considered as fluvial flooding, and then changed this to 30 cm to evaluate the sensitivity of the model to this parameter. The results are shown in Figure 10. We can see an overall good correspondence between the model samples and the observed exposure data. Considering the sampling noise in the available record length of just 23 years and the uncertainty in the observed exposed population associated with the EM-DAT data set (dashed blue lines in Figure 10), we conclude that the model produces results that are indistinguishable from the observations. The observed exposure exceedance curve aligns with what we would expect from a 23-year sample of our simulation. There is some sensitivity to the choice of flooding threshold, as one would expect, but this does not materially change our overall conclusion.
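The EP-curve construction described above (sum exposure by year, sort, assign return periods) can be sketched as follows. The event catalog here is invented for illustration, and the Weibull plotting position is one common choice of rank-to-return-period mapping; the study's exact plotting convention is not specified here.

```python
import numpy as np

def exceedance_curve(event_years, event_exposures, n_years):
    """Build an exceedance probability (EP) curve: aggregate per-event
    exposure into annual totals, sort descending, and assign each rank a
    return period via the Weibull plotting position (n + 1) / rank."""
    annual = np.zeros(n_years)
    np.add.at(annual, event_years, event_exposures)  # sum exposure per year
    annual_sorted = np.sort(annual)[::-1]
    return_periods = (n_years + 1) / np.arange(1, n_years + 1)
    return return_periods, annual_sorted

# Toy catalog: 2,000 events spread over 100 simulated years.
rng = np.random.default_rng(7)
years = rng.integers(0, 100, size=2_000)
exposures = rng.lognormal(mean=10.0, sigma=1.5, size=2_000)
rp, annual_exposure = exceedance_curve(years, exposures, n_years=100)
# rp[0] is the longest return period, paired with the largest annual total.
```

Running the same routine on 23-year subsamples of the catalog, and on the EM-DAT record, yields directly comparable curves of the kind shown in Figure 10.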

Conclusions
In this study we developed a stochastic fluvial and pluvial flood model of Southeast Asia, generating 10,000 years of flood events using freely available streamflow and rainfall data from the GloFAS and ERA5 hindcasts. We tested this model-based approach using a large number of sites distributed across the region and showed that the model reproduces well the spatial dependency seen in the data from the GHM. Previous work (Olcese et al., 2022; Wing et al., 2020) has already demonstrated the ability of GHMs to represent observed extremal dependency. We were then able to convert the simulated events from return periods to water depths using pre-computed hazard maps from the Fathom global flood model (although any other large-scale flood model could equally be used). This approach provides a view of flood risk in a data-scarce area where we do not have sufficient observations of discharge and precipitation, and could be applied in other data-scarce regions.
Using freely available population data from WorldPop, we were also able to calculate the flood population exposure caused by our event set and validate this against EM-DAT records of people affected by floods in the past 23 years. As we can see from Figure 10, when assuming the same event frequency, the observations and the model are indistinguishable from each other, showing that our methodology can produce sensible estimates of flood exposure. According to our event set, there is a 1% probability that more than 30 million people will be exposed to flooding in a given year when considering the effects of both fluvial and pluvial events, and around 23 million for fluvial exposure only. This number is roughly half the population living in the 100-year return period (1% probability) flood zone of Fathom's hazard maps. This tells us that the number of exposed people obtained from hazard maps is useful but will overestimate the number of people that can be affected in a single year with a 1-in-100-year recurrence. This is one of the key advantages of using an event set approach for risk calculations instead of static hazard maps.
This paper also provides a more complete analysis of population exposure in Southeast Asia than previous research, by focusing on an event-based view of risk and considering both fluvial and pluvial flooding. Some previous studies have analyzed population exposure specifically in Southeast Asia, but focused only on river floods and used static hazard maps rather than an event set (Tierolf et al., 2021). Various papers examine flood population exposure at a global scale, but they have focused on river or coastal flooding using static flood hazard maps (Jongman et al., 2012; Tiggeloven et al., 2020), and few include pluvial flooding using the same Fathom data set we have employed here (Andreadis et al., 2022).
This research demonstrates that it is possible to design a fluvial and pluvial stochastic flood model in a data-scarce region, using globally available synthetic streamflow and rainfall data. The model generates events with physically plausible spatial structures and, when combined with hazard maps and population data, can provide a sensible estimate of flood population exposure exceedance in a data-scarce region such as Southeast Asia. This implies that the methodology could be used globally, proving especially useful in other data-scarce regions. We plan to use this method in the future to further explore flood risk by estimating economic losses as well as population exposure. This opens the possibility of future developments such as a global catastrophe model with a global flood event set and associated exposure and loss estimates, without relying on the availability of historical data.

Figure 1 .
Figure 1. Map of the selected 14,390 locations for the study area of Southeast Asia. The blue dots represent the fluvial locations (4,747), located at the outlet of each modeled reach, and the red dots represent pluvial locations (9,643), one for each tile of the ERA5 reanalysis.

Figure 2 .
Figure 2. Conceptual diagram of our stochastic multivariate conditional model framework. The first stage (upper section of the flow chart) is the preprocessing and clustering of flow and precipitation time series into events, while the lower section of the diagram describes the main steps of the dependence model. The dependence model first fits a Generalized Pareto Distribution to the upper tail of the marginal data at each site, transforms the data into Laplace margins, and then uses a multivariate extremes model to derive the spatial dependency between sites and simulate 10,000 years of synthetic floods.

Figure 3 .
Figure 3. Conceptual diagram of the exposure calculation method. We use population data from the WorldPop database and the hazard maps produced by Fathom to calculate the number of exposed people at each return period. We then combine that with our event set to obtain the population exposure caused by each event.

Figure 4 .
Figure 4. Histograms representing the distribution of the mean absolute percentage error in our quantile estimations. Most sites show errors lower than 5%; the dashed blue line represents the mean error. These low errors show that the model accurately fits a Generalized Pareto Distribution and reproduces the marginal distributions seen in the input data.

Figure 5 .
Figure 5. Comparison of historical (training) and synthetic (simulated) extremal dependence coefficient χ estimates. The values are overall close to the red y = x 1:1 line, showing how the model represents the dependence structure seen in the input data. The panel in the top left shows a summary of the error statistics for the absolute percentage error between synthetic and historical χ estimates.

Figure 6 .
Figure 6. Comparison between historical (training) and simulated return levels for fluvial and pluvial sites. The points are overall close to the y = x line, showing that the model adequately recreates the input return levels, and the values of the 90th percentile errors are always below 30%.

Figure 7 .
Figure 7. Cumulative distribution of the size of global events. The CDF curves for the historical (training) and synthetic data sets show the non-exceedance probability associated with events of different sizes (with size defined as the number of sites experiencing an event over 1 in 0.5-year). We can see a good correspondence between the curves, although the synthetic events have on average a smaller footprint than the "historical" ones.

Figure 8 .
Figure 8. Maps showing some of the events produced by the stochastic model. The conditioning site of each event is marked with a blue triangle, and the colors represent the return period experienced by each location, according to the colormaps. In the top left panel, we can see a large event affecting a large portion of mainland Southeast Asia, but not incompatible with a large monsoon system with a high return period. The top right panel shows an event with two clusters of sites affected at the same time, possibly an artifact caused by the event clustering methodology. The bottom left panel shows an event with fluvial flooding around the Mekong area, and the bottom right panel shows a pluvial event affecting the Philippines, compatible with a tropical storm.

Figure 9 .
Figure 9. Exceedance Probability curve showing the probability of different exposure levels being exceeded with increasing return period. The dark blue curve shows the total annual exposed people for all the events produced by our stochastic model, the green curve represents the exposure associated with fluvial events, and the light blue curve the exposure from pluvial events. The red dots represent the annual exposure caused by the flood events reported in EM-DAT (23 years of observations), adjusted to account for population growth.

Figure 10 .
Figure 10. Exceedance probability curves for flood exposure. The gray lines represent 1,000 23-year samples extracted from our simulated fluvial events. The blue dots represent EM-DAT annual exposure data, and the blue line is a GEV fit through the EM-DAT points. The dashed lines represent the ±1 standard deviation confidence bounds of the GEV fit.