A Flood Risk Framework Capturing the Seasonality of and Dependence Between Rainfall and Sea Levels—An Application to Ho Chi Minh City, Vietnam

State‐of‐the‐art flood hazard maps in coastal cities are often obtained from simulating coastal or pluvial events separately. This method does not account for the seasonality of flood drivers and their mutual dependence. In this article, we include the impact of these two factors in a computationally efficient probabilistic framework for flood risk calculation, using Ho Chi Minh City (HCMC) as a case study. HCMC can be flooded subannually by high tide, rainfall, and storm surge events or a combination thereof during the monsoon or tropical cyclones. Using long gauge observations, we stochastically model 10,000 years of rainfall and sea level events based on their monthly distributions, dependence structure and cooccurrence rate. The impact from each stochastic event is then obtained from a damage function built from selected rainfall and sea level combinations, leading to an expected annual damage (EAD) of $1.02 B (95th annual damage percentile of $2.15 B). We find no dependence for most months and large differences in expected damage across months ($36–166 M) driven by the seasonality of rainfall and sea levels. Excluding monthly variability leads to a serious underestimation of the EAD by 72–83%. This is because high‐probability flood events, which can happen multiple times during the year and are properly captured by our framework, contribute the most to the EAD. This application illustrates the potential of our framework and advocates for the inclusion of flood drivers' dynamics in coastal risk assessments.

Flood risk studies including seasonal variability have to date typically only considered a single flood driver, such as discharge (Förster et al., 2008;Klaus et al., 2016). Studies that have considered multiple drivers and their potential dependence have usually tended to focus on one generating mechanism, ignoring a potential second flood season (Apel et al., 2016;Du et al., 2020;Han et al., 2020;Lian et al., 2017;Tu et al., 2018;Xu et al., 2019Xu et al., , 2014Yang & Qian, 2019).
In this study, we develop a framework that can account for the monthly variability in rainfall and sea levels and their dependence to calculate flood risk. We apply the framework to Ho Chi Minh City (HCMC), the selected case study. HCMC is one of the 10 cities most at risk from flooding under current socioeconomic conditions (Dasgupta et al., 2009;Hallegatte et al., 2013;Hanson et al., 2011;Kron, 2013;Nicholls et al., 2008;Sundermann et al., 2014), and is affected by multiple meteorological flood drivers. It is also the largest economic center of Vietnam, accounting for about 21% of foreign direct investments (General Statistics Office, 2020) and around 22% of the national gross domestic product in 2019 (HCMC Statistical Office, 2020). Flood research in HCMC based on single drivers of flood hazard shows that sea level rise and changes in precipitation or sea level extremes are expected to increase the frequency of minor floods and the impact of extreme flood events, unless appropriate adaptation measures are taken (Hallegatte et al., 2013;Khoi & Trang, 2016;H .Q. Nguyen et al., 2019;Scussolini et al., 2017;Storch & Downes, 2011;Vachaud et al., 2019;Vafeidis et al., 2019;Zhao et al., 2021).
Extremes of rainfall and of sea level have been acknowledged as the main climate-related drivers of flooding in HCMC compared to river discharges which are controlled by a set of upstream reservoirs (ADB, 2010;Binh et al., 2019;Vachaud et al., 2019). In the urban city center, districts along the river banks are particularly vulnerable to coastal flooding and inland districts, more elevated, to pluvial flooding (ADB, 2010). Dedicated flood hazard modeling studies have mainly focused on either one flood driver or have neglected the seasonality of flood drivers (Binh et al., 2019;Lasage et al., 2014;Scussolini et al., 2017). Yet, empirical evidence suggests that flood frequency is higher between September and November, when intense rainfall and extreme sea levels can also occur (Duy et al., 2018). To the best of our knowledge, the only study to examine these combinations at a monthly time scale for HCMC is Dahm et al. (2013). However, they do not report on the dependence between flood drivers for this location nor on the expected annual damage (EAD) and the seasonality of flood risk. The documentation of their model framework is also very limited.
To address these gaps, we develop a novel multivariate flood risk framework to quantify the impact of interactions between seasonal variations in rainfall and sea levels and their dependence on flood risk dynamics. Unlike the approach from Dahm et al. (2013), we decompose sea levels between the tide and the skew surge components because this allows for a better estimate of extreme sea level distributions (Batstone et al., 2013;Haigh et al., 2010) and for a more accurate characterization of the dependence . We apply this framework to calculate the EAD in HCMC and to understand the relative contribution of different flood drivers to the EAD. We show that seasonality is a key determinant of flood risk for HCMC even when compared to fully dependent flood drivers.

Case Study: Ho Chi Minh City
Located in the downstream reaches of the Saigon and Dong Nai rivers, HCMC can flood from various mechanisms throughout the year such as extreme rainfall, extreme sea levels, tsunamis, and reservoir dam breaks (Vachaud et al., 2019). Between May and November, HCMC receives about 90% of its 2,000 mm of yearly average rainfall, driven by the East Asian summer monsoon (Binh et al., 2019;Dircke et al., 2010;Horton et al., 2010;K. A. Nguyen et al., 2019). During this season, downpours can paralyze the city for multiple hours (Duy et al., 2018), as observed in August of 2020 (Tuoi Tre News, 2020). Pluvial nuisance flooding has been reported to already occur for events with over 50 mm of accumulated rainfall (Horton et al., 2010) and extensive flooding for events with an accumulated daily rainfall of more than 100 mm (Dahm et al., 2013;Dircke et al., 2010;Duy et al., 2018;Vachaud et al., 2019). High tidal peaks can cause low-lying areas like the southern districts to be flooded on a monthly basis. Between September and January, the seasonal cycle in the tide leads to the largest tidal events, with a maximum tidal amplitude of 1.5 m (Devlin et al., 2018). Long-sustained strong winds and low pressure systems from the monsoon can also cause significant storm surge heights along the southern coast of Vietnam, in the order of 30-50 cm (Thuy et al., 2019). When combined with high tides, sea levels can exceed critical warning levels and lead to flooding (Thuy et al., 2019). These coastal flood events affect large parts of the city, given that 45% of HCMC is located below 1 m above sea level (m.a.s.l.) and 65% below 2 m.a.s.l (ADB, 2010). Finally, while less common, high storm surge events can stem from tropical cyclones and lead to impacts in HCMC (Pham et al., 2019). In the current climate, 4-6 tropical cyclones make landfall along the coast of Vietnam on average each year (Almar et al., 2017;Takagi et al., 2014), and about 10% of those affect HCMC (ADB, 2010). A notable disruptive example is typhoon Linda (November 1997), which brought heavy rainfall and storm surge resulting in 48% of the HCMC population being affected (ADB, 2010).

Data and Methods
To quantify the impact of seasonal variations in rainfall and sea levels on flood risk and capture statistical dependencies between flood drivers, the following general steps are applied ( Figure 1) and explained in detail in the following subsections. First, we preprocess the observed time series of the individual flood drivers, whereby they are detrended to obtain stationary time series of rainfall, skew surge and daily high tide (Section 3.1; step I in Figure 1). Second, we characterize the multivariate joint distribution of rainfall, storm surge and high tide. We report the pairwise dependency distributions between flood drivers at a monthly time step and the number of observed cooccurring extremes (Section 3.2; step II in Figure 1). Third, we use a hydrodynamic model, land cover map and vulnerability curves to calculate the damage from different combinations of rainfall and sea level and build a damage function (Section 3.3; step III in Figure 1). Fourth, we quantify the monthly flood risk (Section 3.4; step IV in Figure 1) by multiplying the monthly joint probability mass function obtained from multiple drivers with their estimated direct economic impacts from steps II and III, respectively. Finally, we carry out a sensitivity analysis to quantify the role of seasonality and of the statistical dependence on the expected annual damage (Section 3.5).

Preprocessing Observed Time Series
We compile multidecadal time series of observed rainfall and sea levels and remove long-term trends to correct for changes in the current climate (Read & Vogel, 2015) and obtain the three flood driver time series used for the statistical analysis: the daily average rainfall monthly maxima over the HCMC center, skew surge monthly maxima, and high tide levels at the coast.
Rainfall. We use daily accumulated rainfall between January 1, 1980 and December 31, 2017 from six rain gauges located close to or in HCMC's center, as shown in Figures 2a and 2b. The data are obtained from the Southern Regional Hydrometeorological Center. We apply the Thiessen polygon method to convert the point rainfall to areal rainfall, which is used for representing the daily average over the city center ( Figure S5 in Supporting Information S1). We extract rainfall monthly maxima from this resulting time series, and correct the time series of monthly maxima for long-term drifts to ensure stationarity. To do this, we test for a significant monotonic trend at the 5% significance level using a modified Mann-Kendall test for autocorrelated data from Yue et al. (2002) with the pyMannKendall package in Python (Hussain & Mahmud, 2019), and linearly detrend maxima of months for which a significant trend is found, that is, September to December (Table S1 in Supporting Information S1).
Tide. We use hourly sea level time series between January 1, 1980 and December 31, 2017 at the coast (station Vung Tau; Figure 2b). This time series is obtained from the Southern Regional Hydrometeorological Center via the Center of Water Management and Climate Change. We remove the mean sea level variability from the sea level times series by subtracting the annual moving median (Cid et al., 2017) and subsequently use the MATLAB UTide (Unified Tidal Analysis and Prediction Functions) package (Codiga, 2011) to extract the tide. This package applies a harmonic analysis on the observed sea level time series with up to 146 tidal constituents, which are either automatically selected based on an automated decision tree or manually selected by the user. It also performs a nodal correction for time series longer than 2 years which is the case here. We extracted the tide based on 13 tidal constituents (M2, K1, O1, S2, sa, p1, N2, K2, Q1, SSA, MK3, NU2, and NO1), instead of the 68 automatically selected, to avoid overfitting. Together, these constituents explain more than 99.5% of the tidal variability at the Vung Tau station. Daily high tide events are extracted from this time series and we estimate the empirical distribution per month using a nonparametric kernel density distribution (see Section 3.2).
Skew Surge. We subtract the predicted tide from the sea level series to obtain the residual water level, which is also referred to as the storm surge (Cid et al., 2017;Wu et al., 2018). We find spurious peaks in the residual signal due to minor phase shifts in the tidal predictions, a common challenge in tidal systems with large tidal amplitudes (Calafat & Marcos, 2020;Williams et al., 2016). We circumvent this problem by calculating the skew surge, that is, the difference between the highest observed sea level and high tide within a tidal period. In practical terms, the skew surge represents the incremental water level on top of the high tide, which is the relevant metric for the driver of coastal flooding (Haigh et al., 2016;Williams et al., 2016). As this study focuses on seasonal variability, we remove any remaining periodicities longer than 1 year by applying spectral filtering (Cid et al., 2017). We extract monthly maxima from this resulting time series.

Characterizing the Multivariate Joint Distribution of Rainfall, Skew Surge, and High Tide Per Month
The monthly joint distribution of rainfall and sea level events, ( , ) , is obtained from combining the joint distribution of rainfall, , and skew surge maxima, , with the high tide levels, .
Parametric marginal distributions of monthly rainfall and skew surge maxima are selected based on the Akaike Information Criterion (AIC) measure (Mutua, 1994) among different types (Generalized Extreme Value types I, II, and III, Pearson III, and Exponential) with parameters estimated by the L-moments method using the lmoments3 Python package (summarized in Tables S3-S6 in Supporting Information S1). The monthly distribution of daily high tide levels is estimated using a nonparametric kernel density distribution with the statsmodels Python package. The distribution of monthly sea level maxima is obtained from the convolution of the extreme value distribution of the skew surge with the empirical distribution of the daily high tide (Batstone et al., 2013;Caires, 2011;Haigh et al., 2014). This is done by sampling a random high tide with a random skew surge maxima, following Mcinnes et al. (2013). While tide and skew surge can be dependent in mixed tidal regimes (Santamaria-Aguilar & Vafeidis, 2018), here we find no dependence except for during the month of May (Text S1 in Supporting Information S1). Figure S1 in Supporting Information S1 compares the annual return levels per month with observed sea levels at the Vung Tau gauge which visually shows that our method reduces uncertainty and provides an appropriate fit for return periods higher than a 3-year return period in a certain month.
Within a given month, we assume high tide levels to be independent of the surge and rainfall, and therefore we only consider the dependence between rainfall and skew surge. We specifically focus here on the dependence between rainfall and skew surge, instead of sea level, to highlight potential relationships not influenced by the deterministic tide. Regions with high tidal range can exhibit low or no statistical dependence when considering total sea level instead of skew surge as tidal variability becomes dominant .
We model the joint distribution of monthly maxima of rainfall and skew surge per month based on their dependence structure and on the number of cooccurrences between the maxima of the two variables. This method is different than lag-based conditional exceedance models (Lamb et   Rainfall gauges used to calculate the daily rainfall over the city center are denoted with red circles and the sea level gauge with a yellow triangle. The area shown in (b) is shown in red in (c). Note. The Bien Hoa rainfall gauge is located outside the spatial extent of (a). extremes are selected using monthly maxima instead of the peaks-over-threshold method. Our approach allows us to easily define independent and identically distributed events and create a balanced set of events while still include their dynamics Santos et al., 2021), as follows: • For the dependence structure between rainfall and skew surge monthly maxima, we fit and select the copula model with the smallest AIC value among 21 copula models (independence, Gaussian, Student t, Clayton, Gumbel, Frank, Joe, BB1, BB6, BB7, BB8 copula family and their rotation) using the function BiCopSelect() of the VineCopula package in R (Nagler et al., 2021). We report the upper tail dependence coefficient, the empirical Kendall's τ rank correlation coefficient to assess the degree of monotonic dependence, and the Cramér-von Mises goodness of fit test (Genest et al., 2009). However, we select the independence copula if the null hypothesis of independence fails to be rejected at the 5% level • For the number of cooccurrences between the rainfall and skew surge monthly maxima, we count the number of observed cooccurrences from the time series. We define an event as cooccurring when both maxima occur on the same calendar day (i.e., with a time lag of 0 days). This time lag is selected as it takes about 2-6 hr for high waters at the coast to reach HCMC center districts. Then, similar to Couasnon et al. (2020), we compare the number of observed cooccurrences to the number of expected cooccurrences assuming statistical independence. The expected number of cooccurrences from statistical independence is calculated using a binomial distribution Based on the monthly marginal distributions, the dependence model between rainfall and skew surge maxima and their cooccurrence rate, we numerically construct the monthly distribution of rainfall and sea levels as follows. The identified copula model is sampled 10,000 times for each month, to create stochastic pairs of rainfall and skew surge maxima corresponding to 10,000 years of data. We use the observed cooccurrence rate to determine the proportion of pairs that are cooccurring rainfall and skew surge events. The rest of the pairs are noncooccurring, that is, the rainfall and skew surge maxima are separate events that do not occur on the same day. In this case, the rainfall maximum is linked with a mean daily surge level for that month and the skew surge maximum with the mean daily rainfall of that month. Both for the cooccurring and noncooccurring events, we sample a random daily high tide to obtain the sea level. Note that in the case of noncooccurring events, this can sometimes lead to two damaging events in that month, one from the rainfall maximum and one from the skew surge maximum. In such case, only the event causing the highest damage is kept since we focus on monthly maxima damage.

Calculating Damage for Different Combinations of Drivers
The direct expected damage ( , ) from a given rainfall event of magnitude and from a given sea level event is based on maximum water depths obtained from hydrodynamic modeling, and combined with exposure and vulnerability data, using the DamageScanner Python package (Koks, 2019).
Maximum flood depths are modeled using the TELEMAC 2D hydrodynamic model (Hervouet, 2000), previously validated and calibrated for flood risk studies in HCMC (Scussolini et al., 2017). The model domain is shown in Figure S2 in Supporting Information S1. The model timestep is 20 s and a simulation takes about 3 hr to run on a high-performance computer. The hydrodynamic model includes pooling of water from rainfall, rainfall into the river and overtopping of river banks but omits local underground drainage systems. For each simulation, boundary conditions of upstream discharge, rainfall, and sea level are required. Upstream discharges are controlled by reservoirs ( Figure 2b) and were previously found not to play a significant role in flood outcomes (Vachaud et al., 2019). Therefore, in this study, discharges are set to constant values corresponding to the annual 1.5-year return period (Binh et al., 2019). Daily rainfall data are converted into events with higher temporal resolution for the scope of the hydrodynamic simulation: these correspond to events of a total duration of 3 hr, found to be representative for extreme rainfall in HCMC (Dahm et al., 2013) and explained in Text S3 in Supporting Information S1. The hyetograph (temporal evolution) of the rainfall event follows volume ratios previously used for pluvial studies in HCMC (FIM, 2013;Scussolini et al., 2017). Rainfall is assumed to be spatially uniform over the model domain as the hydrodynamic model cannot model spatial differences. While this simplifies the representation of pluvial events, we do not expect this to substantially impact our results since the size of the case study area remains limited. The temporal evolution of the sea level events is derived from the average tidal hydrograph, calculated from the tide time series. The peak of the sea level events is determined from the sum of the high tide and skew surge levels, as per Section 3.2. For every run, the start of the rainfall event is set to occur 1 hr before the peak of the water level at the downstream limit of the city center area, corresponding to the most frequent observed time lag between rainfall and sea level peaks. This time lag also results in the interaction of rainfall and sea level peaks in the city center, a conservative assumption. A complete description of the boundary conditions is added in Texts S2-S4 in Supporting Information S1.
The same land cover attributes and corresponding local depth-damage curves previously used for HCMC in Lasage et al. (2014) and Scussolini et al. (2017) are used to represent exposure and vulnerability, respectively. The land cover maps have been reclassified to correspond with the six categories of the local depth-damage curves (agriculture, public buildings, urban or rural residential, and small or large business). Both the hazard raster and land cover are resampled to have the same resolution of 7.3 by 8.7 m. For each simulation, the maximum flood depth at each cell is used to determine the proportion of economic damage and multiplied by the land-use specific maximum value per unit area, as shown in Figure S9 and Table S7 in Supporting Information S1, respectively.
We estimate the damage ( , ) from 36 combinations of rainfall and sea level events, with values ranging from 0 to 300 mm/day for the rainfall and from 0.61 to 1.86 m for the sea level. These combinations cover a wide range of cases from normal to extreme conditions (up to the approximately 1,000-year marginal return level of rainfall and sea level, respectively) but limit the number of hydrodynamic runs. The damage for other combinations of rainfall and sea level than the one modeled is bilinearly interpolated across the 36 estimated damage values. For events with a rainfall or sea level magnitude outside of the minimum (maximum) value modeled, the damage is linearly extrapolated based on the lowest (highest) points.

Quantifying Monthly Flood Risk
The seasonal variability in flood risk is obtained by calculating the monthly flood risk. Focusing on a monthly time scale allows us to cover all flood seasons and to include rainfall and sea level events less extreme than the annual maxima, but that are nevertheless potentially damaging. For each month, we model the joint distribution of rainfall and sea level events, and generate 10,000 years of data for that month, as explained in Section 3.2. We estimate the damage from these events using the damage function (Section 3.3) to obtain the monthly maxima damage distribution. The expected monthly damage, , for the th month can be calculated as follows: where ( ) is the damage from given rainfall and sea level events, and ( ) is the monthly probability density function of damaging rainfall and sea level events. Equation 1 is solved by numerical integration.
The EAD, the expected damage over any given year, is a key indicator in flood risk assessments (Quinn et al., 2019;Verkade & Werner, 2011;Ward et al., 2013Ward et al., , 2017Winsemius et al., 2013). Damage in a month is assumed to be independent from other months. The annual average damage can thus be calculated as the sum of the average monthly damages, or equivalently the average of the sum of randomly combined monthly damages:

Influence of Monthly Variability and Dependence on the Expected Annual Damage
Traditional flood risk assessments using block maxima sampling are based on annual maxima and do not take into account the dependence between flood drivers (Scussolini et al., 2017;Ward et al., 2013;Winsemius et al., 2013). We use our case study to report on the influence of the monthly variability in rainfall, skew surge and tide, and their dependence on the EAD. First, to investigate the role of the seasonality of rainfall, skew surge and high tide on the EAD, we apply the framework to yearly data instead of monthly data. By using annual maxima instead of monthly maxima, we do not provide any information on the seasonal variability of rainfall, skew surge and tide. Second, we test the influence of the multivariate joint distribution on the results, comparing results under the assumption of statistical independence and under the assumption of full positive dependence between the rainfall and skew surge. For the assumption of independence, the number of cooccurring events is set as the expected number under statistical independence, obtained from a binomial distribution. The magnitude of rainfall and skew surge events are independently sampled. For the assumption of positive dependence, we first modify the number of cooccurrences to assume that rainfall and skew surge maxima always cooccur and then we also assume a complete positive correlation (τ = 1) between the rainfall and skew surge maxima.

Results and Discussion
In this section, we present and discuss the results for each step from the flood risk framework presented in Figure 1 as well as the sensitivity analysis. We first present a brief overview of the stationary times series of flood drivers (Section 4.1). Second, we show the dependence structure and number of cooccurrences between rainfall and skew surge maxima (Section 4.2). Third, we show the obtained damage surface for different combinations of rainfall and sea level events (Section 4.3). Finally, we present the monthly flood risk results from our framework (Section 4.4) and assess the role of seasonality and dependence on the results obtained (Section 4.5). We also reflect on the limitations from this study (Section 4.6).

Flood Drivers Time Series
In Figure 3, we visualize the seasonality of the individual flood drivers using boxplots per month. Figure 3a shows specific percentiles (5th, 25th, 50th, 75th, and 95th) from the empirical distribution of the daily high tide per month and Figures 3b-3d from monthly maxima of the skew surge, sea level, and rainfall, respectively. The month of occurrence of the highest 95th percentile is different for each of the flood drivers, namely: November for high tide, March for skew surge, and September for rainfall.
There is a clear seasonal variation in the daily high tide levels (Figure 3a). Median high tide levels are lowest in July (0.55 m) and highest in December (1.01 m). Averaged across all months, the difference in daily high tide level between the 5th and 95th percentile, Δ 5−95 , is 0.57 m, and this difference shows little fluctuation throughout the year (standard deviation (std): 0.08 m). The coefficient of variation, CV, defined as the ratio between the standard deviation and the mean, varies between 0.14 (December) and 0.30 (August). For the monthly maxima of the skew surge (Figure 3b), the highest median is measured in September (0.21 m) and the lowest in December (0.11 m). The relative standard deviation in surge extremes between months is larger than for the tide, with a CV varying between 0.26 (July) and 0.51 (November). However, the average difference between the 5th and 95th percentile is lower than for the tide (average Δ5 − 95 : 0.18 m, std:0.03 m). As a result, the impact of the skew surge on the monthly maxima of sea levels is not constant throughout the year (Figure 3c). For example, while the highest tides occur in December, the highest median sea level (1.38 m) and 95th percentile (1.52 m) occur in October. This impact is also visible for the sea level in March which shows a higher median than February even though tides are lower in March than in February (Figure 3a).
Finally, the distributions of the rainfall maxima per month show strong seasonality (Figure 3d) with the lowest median value (0.1 mm/day) in February. Between December and March, monthly maxima are low (average 95th percentile of 31.6 mm/day). April has the highest 5th -95th percentile range (Δ 5−95 : 70.2 mm/day), and is the month signaling the usual start of the monsoon season. During the monsoon, from May to November, the rainfall maxima are high (median: 45.7 mm/day), with the highest 95th percentile value (103.4 mm/day) reached in September. Figure 4 shows the number of monthly maxima that occurred on the same day for the different pairs of drivers, for the whole time series between 1980 and 2017. While our methodology focuses solely on the dependence between skew surge and rainfall (Figure 1, part II), we also present results for the tide and rainfall (Figure 4b) and sea level and rainfall (Figure 4c) to highlight the influence of the tides on the observed number of cooccurrences per month.

Characterizing the Multivariate Joint Distribution of Rainfall, Skew Surge, and High Tide Per Month
Monthly skew surge maxima cooccur with rainfall maxima more often on the same day (on average 2.42 events across months) than is the case for the tide or sea level maxima with rainfall maxima (average of 1.08 and 0.92 event, respectively). This is probably due to the following phenomenon. Daily tide variability is much larger than the monthly maxima skew surge variability for any given month (average Δ 5−95 = 0.57 m and average Δ 5−95 = 0.18 m, see Figure 3), and skew surge extremes can happen during any high tide. As a result, skew surge maxima happening during a lower high tide will not necessarily lead to a sea level extreme, thereby diminishing the chance that a monthly sea level maximum coincides with a rainfall maximum.
In Figure 4, we use a binomial distribution to calculate the expected value of cooccurring monthly maxima (dashed lines) assuming statistical independence, and the 5th-95th confidence interval (dotted lines). For all months, except for March and April for the skew surge and rainfall, we find that the number of cooccurrences series is within the confidence interval of that can be explained by chance alone using a 5% significance level. This means that even though the number of cooccurrence from skew surge and rainfall maxima is higher than for the high tide or sea level, this signal is not strong enough to establish that there is temporal dependence between skew surge and rainfall. April corresponds to the start of the southwest monsoon season, which is characterized by abrupt rainfall events and sustained strong winds. It is known that bursts of rainfall events due to cold fronts happen before April but these are isolated events, followed by multiple days of no rain (Dang et al., 2020;Pham et al., 2010). Within the last decades, early monsoon rainfalls (between February and April) have consistently increased, and accompanying trade winds have strengthened (Li et al., 2019). This could explain why we observe more cooccurrences for these months.
We focus further on rainfall and skew surge maxima to investigate to which extent the magnitudes of monthly extremes are correlated. Table 1 presents the copula models with the lowest the AIC measure, the p value from testing the null hypothesis of independence and the selected copula model. The empirical Kendall's τ rank correlation coefficient between monthly storm surge and rainfall is low, ranging from −0.09 for the month of October to 0.27 in July, with most months showing no correlation. The independence copula is selected for most months. For June, July, October, and November, the AIC measure is lowest for other copula models. However, for these months except July, we cannot reject the assumption of independence at the 5% significance level. Therefore, the independence copula is selected to model pairs of rainfall and skew surge maxima for all months except in July where the Frank copula is used. In the latter case, the Cramér-von Mises goodness of fit test indicates an appropriate fit with a p value of 0.048. Figure S10 in Supporting Information S1 shows a visual comparison between the observed and modeled monthly maxima of rainfall and skew surge and suggests a good agreement between the modeled and observed joint distributions. Figure 5 shows the damage calculated from the selected combination of rainfall and sea level events. The shape of the damage surface is a result of the interaction between flooded areas, exposed assets and their vulnerability. The highest damage obtained from these simulations is $1.3 B, due to a cooccurring rainfall event of 300 mm/ day and a sea level magnitude of 1.86 m. In Figure 5b, we show a subset of the results, where the relationship between sea level and damage is conditional on the lowest, the highest and an intermediate rainfall intensity, respectively. The relationship is mostly linear, except for simulations with no rainfall. This is different than other Note. The row in bold for July indicates rejection of the null hypothesis of the independence copula and a significant Cramérvon Mises goodness of fit test at the 5% level (p value ≤ 0.05).

Table 1 Copula Models for the Rainfall and Skew Surge Maxima With the Lowest Akaike Information Criterion (AIC) and the Selected Copula Models for This Study
low-lying areas that have reported an exponential increase in damage from an increase in sea level (Habete & Ferreira, 2017). For HCMC, as the sea level increases, the increase in damage becomes constant, as indicated visually by the near-linear slopes. Similar behavior is observed in Figure 5c, for the relationship between rainfall and damage conditional on different sea levels.
Given that around 63% of the study area is classified as residential, we associate the dominant effect observed here as a result of the site topography and the depth-damage curves used. Figure S11 in Supporting Information S1 presents the area inundated (a, d), volume (b, e), and damage (c, f) from the combination of selected rainfall and sea level events. When rainfall intensity increases, the total area inundated also increases ( Figure  S11a in Supporting Information S1). However, for a high rainfall intensity this difference becomes much smaller than for lower rainfall intensity ( Figure S11d in Supporting Information S1). This means that for high rainfall intensity events, the flood extent tends to be the same and the sea level mainly modulates the average flood depth (Figures S11b and S11e in Supporting Information S1). Since most of the flooded area is classified as residential, the increase in damage (Figures S11c and S11f in Supporting Information S1) becomes then mainly driven by the corresponding depth-damage curve ( Figure S9 in Supporting Information S1).

Monthly Flood Risk
Based on the number of cooccurring rainfall and storm surge monthly maxima and the selected copula models (Section 4.2), we simulate 10,000 years of data and estimate the damage from these events using the damage function shown in Section 4.3. By doing so, we obtain, for every month, a monthly maxima damage distribution, shown as boxplots in Figure 6a. The box edges and whiskers indicate specific percentiles which can also be interpreted as annual return periods of damage in a certain month. Using Weibull plotting position on the 10,000 years of simulated data for each month, we can derive the annual return period of damage for a given month. For example, the top whiskers represents the 95th percentile or the 20-year (=1/(1 − 0.95)) return period for that month. The expected monthly damage is the expected value of this distribution, represented by a black circle. The highest expected monthly damage is obtained for the month of November ($166M) and lowest in July ($36M). Summing all expected monthly damages, we find an EAD of $1.02 B. From the monthly maxima damage distribution, we also derive the yearly damage distribution (not shown here) by summing randomly sampled monthly damages for each month. We estimate the annual 20-year return period damage to be $2.15 B (i.e., the 5% chance in any given year of total annual flood losses being exceeded).
The variation in monthly expected damage in Figure 6a result from the interplay between the frequency and the damage from the rainfall and sea level events. Figure 6b shows the rainfall and sea level events of all 10,000 years modeled, along with isolines of the damage surface from Figure 5. The most damaging event is estimated to cause a damage of $1.23 B but this event occurs rarely, namely 1-in-10,000 years in our simulated data. In Figure 6c, we show the binned contribution from the rainfall and sea level events to the EAD, a widely adopted metric of risk. This is done by multiplying the relative frequency of the simulated events with the damage function, that is, the probability of an event times its consequence. Bins with the darkest color contribute the most to the EAD, in this case low rainfall and high sea level events. These events have an annual return period varying between 1 and 10 years for the sea level and about 1 year for the rainfall. Therefore, even though the damage from one low rainfall and high sea level event is lower than that with a more extreme rainfall (Figures 6b and 5), cooccurring high rainfall and sea level are so seldom that they contribute less to the EAD than cooccurring low rainfall and high sea level events. These results are in line with riverine flood studies that assumed low to no protection levels and found low return periods to account for a large part of the EAD (Klaus et al., 2016;Merz et al., 2009;Ward et al., 2011). For example, in the Seckach catchment in Germany, events with return periods between 5 and 20 years already account for 50% of the EAD (Merz et al., 2009). However, it is important to note that these results are dependent on the dynamics of the hazard, vulnerability, and exposure. For example, for highly protected locations, we expect low-probability events to have the highest contribution to the EAD.
We classify all the simulated damaging events as coastal-driven, pluvial-driven or compound-driven and compare this with flood records from the Steering Center of the Urban Flood Control Program (Figure 7). This data set reports between 2009 and 2017 the number of areas flooded for each flood event and further categorizes each event as either a pluvial or a coastal flood. From this record, we define a flood as a compound flood when Figure 6. (a) Boxplots of the monthly damage distributions obtained from applying our framework to Ho Chi Minh City (HCMC) center. Black lines represent the median, the box edges correspond to the 25th and 75th percentiles, the whiskers correspond to the 5th and 95th percentiles of the data. The expected monthly flood risk, that is, the expected value from the distribution, is represented by a black circle. (b) Simulated 10,000 years of rainfall and sea level pairs, in black. The red contour lines represent the damage in dollars for any combination of rainfall and sea level. (c) Normalized bivariate histogram (relative frequency) of the simulated rainfall and sea level events multiplied with the corresponding damage. The integral of this graph, that is, summing over all the bins, is the expected annual damage (EAD). The darker the color of a bin, the higher its contribution to the annual expected damage.
both a rainfall and coastal flood was reported on the same date. Comparing Figures 7a and 7b, we note that the variations in monthly expected damage closely follow the average number of flood events per month, indicating that our flood risk framework properly captures flood seasonality in HCMC. Between May and July, even though there is on average a high number of flood events, we find relatively low damages. This is because even though rainfall-driven floods generate larger flood volumes than coastal or compound floods for these months (Figure 7c), rainfall-driven events are spread out over the whole city center which results in lower average flood depth compared to coastal-driven flood (not shown). Scussolini et al. (2017) found an EAD of $1.14 B for the whole extent of HCMC based on yearly maxima of sea levels. A direct comparison with our study is complicated because of the fundamental differences in methodology and difference in spatial extent. Therefore, to be able to compare this value with the one found in this study, we calculate the EAD using their methodology (i.e., assuming an annual rainfall return period of 5 years always cooccurring with sea level extremes) for an area common to both studies. We find a difference in EAD of 30% (not shown here). We attribute this difference mainly to the fact that we impose the rainfall peak to cooccur with the first tidal peak instead of the second tidal peak during the day in Scussolini et al. (2017). They also consider a slightly larger case study area than in this study. Therefore, we conclude that while Scussolini et al. (2017) obtain an EAD in the range of the one found of this study, this is mainly due to the opposite and compensating effect of several fundamental differences in methodology. In the light of our study, the assumption of Scussolini et al. (2017) that extreme sea levels always cooccur with a 5-year rainfall event is too conservative since rainfall and extreme sea levels do not cooccur more often than what is expected by chance (Figure 4c). We also find that subannual flood events, not accurately captured by their methodology, contribute considerably to the EAD. This conclusion is also supported by findings presented in a report from the Asian Development Bank (ADB, 2010).

Influence of Monthly Variability and Dependence on the Expected Annual Damage
We compare the EAD found in Section 4.4 with one obtained with a similar methodology based on annual maxima instead of monthly maxima. By doing so, we neglect the influence of the monthly variability in rainfall, skew surge and high tide. We fit rainfall and skew surge marginal distributions from yearly maxima and generate the equivalent of 10,000 years of paired data. With this method, we obtain an EAD of $177M. This is about 6 times less than the EAD obtained using monthly data, an underestimation of 83%. We attribute this difference to the following two reasons. First, by assuming independence between skew surge and high tide at a yearly time scale, their seasonality is neglected which results in an underestimation of extreme sea levels. In fact, in our case, the occurrence of skew surge yearly maxima is not uniform, with most events (18%) happening in March, a month where tide levels are usually high (Figure 3a). Using sea level directly instead of the sum of high tide and skew surge to take this effect into account increases the EAD to $285M, but this remains around 3.6 times less than the EAD from monthly data. Second, flooding can occur multiple times a year in HCMC and we just showed that lower intensity floods contribute the most to the EAD (Figure 6c). By using annual maxima, those high-probability flood events are discarded.
The monthly joint distribution of damaging rainfall and sea level events is based on the number of cooccurring rainfall and storm surge monthly maxima and on the selected copula models (Section 4.2). In Figure 8, we show the influence of these two components on the monthly maxima damage distribution, first by assuming complete independence between the magnitudes of rainfall and skew surge maxima and the number of cooccurrences (case I, in green). For the assumption of positive dependence, we first assume monthly maxima to always cooccur (case O, in yellow), and additionally assume that rainfall and skew surge maxima are fully positively correlated (case OC, in red). We observe negligible differences between the modeled dependence structure (case M, in black) and independence (case I). The influence of dependence can increase the expected monthly maxima by up to $32M (from $166M to $198M for case OC in November) or decrease it by $3M (from $100M to $97M for case OC in January) when compared to the modeled dependence structure (case M, in black; same as Figure 6). This is because the monthly damage distributions are different, as shown by the difference in boxplots between the cases. Assuming full temporal and statistical dependence (case OC) consistently leads to higher damage for higher return periods (such as the 20-year return period in a certain month indicated by the upper whisker in the boxplot) but also the lowest damage for the lower return period (about the 1-year return period in a certain month; lower whisker). This is a result from the assumption of fully positive correlation which implies that low skew surge will consistently cooccur with low rainfall (or negligible rainfall for some month such as in January, February or March) and high skew surge with high rainfall. This difference in damage distribution explains why we can obtain a lower monthly expected damage in January and February than the modeled monthly expected damage.
Summing all months, we find the EAD to be $1.02 B for case I, $1.15 B for case O, and $1.17 B for case OC. This represents a decrease of 0.13% and an increase of 13% and 15% compared to the modeled EAD from the observed rainfall and skew surge maxima time series (case M), respectively. Figures 8b and 8c show the difference in the normalized bivariate histogram between case O and OC with the modeled damaging events. Imposing rainfall and sea level maxima to always cooccur and to be fully positively correlated result in more damage from sea level events happening with a higher rainfall.

Limitations
Our framework presents an advancement with respect to traditional flood risk methods, as it can capture the seasonality of flood drivers and their potential dependence. Because hydrodynamic modeling of stochastic flood events is computationally expensive (Parker et al., 2019), our framework provides an alternative to limit the number of hydrodynamic runs while still accounting for the probability of multivariate flood driver events. Nevertheless, it is important to recognize that the representation of the flood drivers as boundary conditions for these hydrodynamic simulations is simplified. For example, due to the selected temporal dynamics of the coastal and pluvial events, our framework ignores consecutive events (de Ruiter et al., 2020) or events of longer durations. Long lasting flood rainfall events with consecutive high tide levels could also create high damage. Due to the model setup, rainfall boundary conditions are assumed to be spatially uniform. While the impact of this assumption is negligible on the overall flood risk considering the size of the study area, this may not be the case for larger regions (Apel et al., 2016). In such cases, these spatial and temporal differences should be added to our framework, potentially adding considerable complication.
The flood risk framework was applied to HCMC using 38 years of data. Such record length is insufficient to characterize the coastal flood risk with confidence (Dullaart et al., 2021) and the upper tail dependence structure between rainfall and storm surge (Gori et al., 2020). Even though the selected copula models were found to be acceptable models, we expect the upper tail dependence between rainfall and skew surge to be underestimated given the low number of typhoons that hit the area in the last decade (Takagi, 2019). To circumvent this challenge, a sensitivity analysis can help pinpoint which elements contribute the most to the damage. For HCMC, we showed seasonality to largely drive damages and the influence of the dependence to be limited. In coastal locations where the dependence between rainfall and sea level is stronger and influenced by tropical cyclones, such as in northern Vietnam (Bevacqua et al., 2020), synthetic tracks can help better characterize the (joint) distribution of typhoon-induced surge or rainfall (Gori et al., 2020).
For this study, we consider the monthly joint distribution from the identified flood drivers in HCMC, limiting our analysis to the impact of rainfall and sea levels. This simplification is acceptable for HCMC since discharge is not a dominant flood driver, but not in regions where discharge is not controlled upstream. Adding additional flood drivers in our model framework would increase the number of necessary hydrodynamic model runs exponentially, which could become a limiting factor to the feasibility of the analysis. In other locations, high tide and skew surge can exhibit a stronger dependence at the monthly time scale (Santamaria-Aguilar & Vafeidis, 2018). In such case, the framework should be expanded to include this dependence. This can be done by conditionalizing the tide distribution on the sampled skew surge level. Wave-driven effects can also contribute to extreme sea levels, especially in steep coastal profiles (Serafin et al., 2017). Estimating this contribution often requires other data sets however than from tide gauges records since tide gauges are often located in environments protected from waves (Calafat & Marcos, 2020).
Important sources of nonstationarities should be considered when applying our framework to future risk projections. First, HCMC experiences high subsidence rates between 20 until 60 mm/year close to the rivers (Duffy et al., 2020;Minh et al., 2015). A simplified way to include this would be to assume spatially uniform subsidence, even though this might result in an underestimation or overestimation of flood impacts. Alternatively, with our framework this could be done by running the selected scenarios with the projected bathymetry and elevation model. Second, future climate projections for 2100 predict an increase of 8.5-16.5% in rainfall between September and November (MONRE, 2009), and changes in tides and storm surges . HCMC urban expansion is expanding fast and projected to continue (Kontgis et al., 2014), potentially worsening the impact of pluvial floods in the future. These changes could be implemented by modifying the distribution of the flood drivers, and by adopting future land-use maps as in Scussolini et al. (2017).

Conclusion and Outlook
In this study, we developed a flood risk framework that considers the seasonality of multiple flood drivers and their potential dependence and applied it to Ho Chi Minh City (HCMC), Vietnam. We included the three most important local flood drivers, namely extreme rainfall, skew surge, and high tide events. By focusing on monthly time scales, we considered multiple flood seasons per year, which is of particular importance for HCMC.
Under the observed seasonality and statistical dependence structure between rainfall and skew surge, we found the EAD for HCMC to be $1.02 B. The impact of the statistical dependence between rainfall and sea levels on the EAD is negligible compared to that of the seasonality of rainfall and sea levels. Damages can result from both coastal and pluvial floods, with some months being more coastal-dominated (December to March), pluvial-dominated (April to August) or a mix (September to November). The dependence between rainfall and skew surge is low or absent for most months. While the assumption of full positive dependence would increase at most the EAD by 15%, not considering the seasonality of rainfall and sea level leads to an underestimation of 72-83%. We also found that the damage from the highest simulated return period of 10,000-year is mainly driven by rainfall extremes, and that the events contributing the most to the EAD are extreme sea levels cooccurring with a low or moderate rainfall. This suggests that investing in protection measures aimed at reducing the impact of extreme sea level events can potentially greatly reduce flood risk in HCMC's center, along with measures that reduce local runoff on urban surfaces. However, proper flood risk management and adaptation should also incorporate future projections of climate and sea level with projections of exposure and if possible, vulnerability. In order to investigate the optimal choices regarding such measures, more detailed adaptation studies including a complete cost-benefit analysis should be performed (Scussolini et al., 2017). Our framework could be used to assess the impact of disaster risk reduction measures on flood risk, by modifying the exposure, vulnerability functions, and/ or hazard layers to model the expected changes from these measures.
The challenges faced by HCMC are common to many other coastal megacities in the world. In future work, we aim to apply the model framework presented in this study to other coastal locations and contrast our findings. While for HCMC we found the dependence between rainfall and skew surge to be absent for most months and only to marginally affect the EAD, dependence is more consequential in other locations (Bevacqua et al., 2020;Couasnon et al., 2020). We foresee future versions of the framework to be more flexible, mainly by updating the dependence modeling to be able to include synthetic event data sets from large ensemble simulations and by considering other time lags. More generally, we suggest that decomposing flood risk to a higher temporal frequency, for example, per month as done in this study, can improve estimates of flood risk and be beneficial for flood mitigation, planning, and emergency response.

Data Availability Statement
The original time series at the rainfall and sea level gauges have been provided by the Southern Regional Hydrometeorological Center via the Center of Water Management and Climate Change in HCMC, Vietnam. The data used to produce the results presented in this study (the stationary time series of rainfall, daily high tide, skew surge, and the damage surface) and the code developed for this analysis is available on Github at https://github. com/couasnonanais/seasonality_risk (doi:10.5281/zenodo.5763288).