Toward Street‐Level Nowcasting of Flash Floods Impacts Based on HPC Hydrodynamic Modeling at the Watershed Scale and High‐Resolution Weather Radar Data

In our era, the rapid increase of parallel programming coupled with high‐performance computing (HPC) facilities allows for the use of two‐dimensional shallow water equation (2D‐SWE) algorithms for simulating floods at the “hydrological” catchment scale, rather than just at the “hydraulic” fluvial scale. This approach paves the way for the development of new operational systems focused on impact‐based flash‐floods nowcasting, wherein hydrodynamic simulations directly model the spatial and temporal variability of measured or predicted rainfall on impacts even at a street scale. Specifically, the main goal of this research is to make a step to move toward the implementation of an effective flash flood nowcasting system in which timely and accurate impact warnings are provided by including weather radar products in the HPC 2D‐SWEs modelling framework able to integrate watershed hydrology, flow hydrodynamics, and river urban flooding in just one model. The timing, location, and intensity of the street‐level evolution of some key elements at risk (people, vehicles, and infrastructures) are also discussed considering both calibration issues and the role played by the spatial and temporal rainfall resolution. All these issues are analyzed and discussed having as a starting point the flood event which hit the Mandra town (Athens, Greece) on the 15 November 2017, highlighting the feasibility and the accuracy of the overall approach and providing new insights for the research in this field.

• The potential of radar data and high-performance computing two-dimensional shallow water equation solvers at the watershed scale for impact-based flash-flood nowcasting system is highlighted • Reliable prediction of water depths within the urban area, with run time 30 times faster than real-time using a high-resolution grid • Rain resolution can affect simulated arrival times, peak and time-to-peak values, and street-level prediction of the effects on a specific target Supporting Information: forecasting has shifted the focus away from providing details of "what the weather will be" to warning of "what the weather will do" (Harrowsmith et al., 2020;Kaltenberger et al., 2020;Silvestro et al., 2019).The benefit of real-time hydrodynamic simulations is the ability to directly model the forecast spatial variability of rainfall on inundation and impacts at a city scale (Speight et al., 2019), setting thresholds based on the flow velocity of flow that is essential when considering a danger to people, movement of vehicles, or damage to property (Hofmann & Schüttrumpf, 2019).
Moreover, the interactions between rainfall variability in space and time and catchment characteristics strongly influence the hydrological response, especially in urbanized areas, where the runoff generation is fast (Hapuarachchi et al., 2011;Ivanov et al., 2021).For these reasons, the use of high-resolution rainfall data is necessary to investigate and predict hydrological response and the two-dimensional behavior of flood flow within urbanized areas (Corral et al., 2019;Ochoa-Rodriguez et al., 2015;Thorndahl et al., 2017;Zhou et al., 2017).
In the past, the prohibitive computational times of the 2D approaches in domains typical of hydrological applications hindered their application in operational flood forecasting systems (Echeverribar et al., 2019).However, nowadays, the use of high-performance computing (HPC) is common and several 2D-SWEs codes are developed using parallelization schemes.As a consequence, approaches based on both the message passing interface (MPI) or open multi-processing, compatible with clusters composed of several central processing units and the general purpose graphics processing unit have been shown highly efficient, leading to a drastic reduction of the computational times (Buttinger-Kreuzhuber et al., 2022;Carlotto et al., 2021;Caviedes-Voullième et al., 2023;Dazzi et al., 2018;Morales-Hernández et al., 2021;Noh et al., 2018;Sanders & Schubert, 2019;Vacondio et al., 2017;X. Xia et al., 2019).
The use of HPC flood modeling technologies in flash flood nowcasting is still at an early stage, and more research effort is needed (Ming et al., 2020;Morsy et al., 2018;Schubert et al., 2022).In particular, an emerging research need is the use of HPC technology to develop a flash flood forecasting system by directly coupling fully hydrodynamic models and weather predictions/measurements to forecast the impact of heavy storms.
Hydrodynamic models are becoming more and more attractive in rainfall-runoff studies because hydrologic and hydrodynamic flood processes are no longer modeled in two different systems but, entirely, within the framework of the 2D-SWEs approach (see, e.g., Bout & Jetten, 2018;Cea & Bladé, 2015;Costabile et al., 2013;Hadid et al., 2020).In this growing research field, several challenges can be identified: numerical methods to get stable and reliable results (Zhao & Liang, 2022), issues related to microtopography and roughness (Caviedes-Voullième et al., 2021;Ӧzgen et al., 2015;Taccone et al., 2020), infiltration (Fernández-Pato et al., 2016;Ni et al., 2020) analysis of channels networks (Bernard et al., 2022;Costabile et al., 2019), analysis and generation of efficient numerical grids (Caviedes-Voullième et al., 2012;Costabile & Costanzo, 2021;Dasallas et al., 2022;García-Alén et al., 2022;Kirstetter et al., 2021).The same modeling approach is used also in the context of urban flooding, considering the presence of buildings and the sewer system (Bellos & Tsakiris, 2015;Cea et al., 2010;Chang et al., 2018;Jang et al., 2018;Leitão et al., 2016; Q. Li et al., 2020;Padulano et al., 2021b;Tokarczyk et al., 2015). 10.1029/2023WR034599 3 of 31 The increasing awareness of the potential of the 2D-SWE integrated modeling at the catchment scale can pave the way to the development of new operational systems focused on the impact-based flash-floods nowcasting, in which hydrodynamic simulations directly model the spatial and temporal variability of measured or predicted rainfall on impacts at a street scale.This modeling strategy is potentially able to face the demands of emergency managers that are not addressed by the common approaches for the nowcasting of flash flood events, essentially based on the peak discharge predictions and providing only indications about possible flood magnitudes (Le Bihan et al., 2017).Therefore, considering the need to enhance the existing tools to get information on the location of affected people, disruption of services, damage to buildings and other elements present in hazard zones (Pittore et al., 2017;UNISDR, 2015;WMO, 2015), it seems relevant to start analyzing the feasibility and the accuracy of that approach to feed the ongoing debate on flash floods impact forecasting and warning (B.Merz et al., 2020;Taylor et al., 2018;Q. Zhang et al., 2019).
Within the research needs highlighted so far, the main goal of this research is to move toward an effective flash flood nowcasting system in which, starting from radar weather products, timely and accurate impact warnings are provided based on 2D-SWEs modeling framework able to integrate watershed hydrology, flow hydrodynamics and river urban flooding in just one model.In particular, the feasibility and the accuracy of the overall approach are discussed keeping in mind who and what is exposed to impacts and, specifically, what is going to happen, when it will happen, how bad it will be and where, providing timing, location and intensity of the street-level evolution of some key elements at-risk (people, vehicles, and infrastructures).This approach inevitably requires understanding the role played by calibration strategies and the influence of spatial and temporal rainfall resolution, which may be considered secondary objectives of the work.
The paper is organized as follows: methodological issues are reported in Section 2 whereas the case study description (flood event which hit the Mandra town [Athens, Greece] on 15 November 2017) and the available data are described in Section 3. The assumptions for model setup and parameter calibration are shown in Section 4. Model results and discussion are presented in Sections 5 and 6, respectively.Finally, the conclusions are highlighted in Section 7.

Methodology
The flowchart reported in Figure 1 summarizes the work done to address the research issues mentioned above.The methodological steps are described below.

2D Integrated Shallow Water Model
The simulation tool proposed in this work aims at integrating watershed hydrology, flow hydrodynamics and urban flooding in just one model, within the framework of the fully dynamic version of the 2D-SWEs.It can be expressed as in Equation 1.
where t is time; x and y are the spatial coordinates, U is the vector of the conserved variables, F and G are the flux vectors in the x-and y-directions, S is the term describing the effects of bed slope and bed friction, S RR represents the effects of rainfall and infiltration rates, S D takes into account the presence of the sewer system.The vectors reported in Equation 1 can be written as in Equations 2 and 3: (2) in which: h (m) is the water depth; u, v (m/s) are the depth-averaged flow velocities in x-and y-directions; g (m/s 2 ) is the gravitational acceleration; S 0,x (−) and S 0,y (−) are the bed slopes in x-and y-directions; S f,x (−) and S f,y (−) are the friction slopes in x-and y-directions; R (m/s) is the rain intensity, F (m/s) represents the hydrologic loss in terms of infiltration rate, D is the term related to the effects of the urban drainage system, that can be modeled in different ways depending on both the availability of detailed data and the specific purposes of the study (Cea & Costabile, 2022).
In other words, the vector S RR allows one to extend the application of the 2D-SWEs modeling from the traditional river flood propagation and mapping studies to rainfall-runoff simulations in rural areas.In particular, the infiltration module used here is based on the well-known Green-Ampt model, widely popular even in the recent literature (Fernández-Pato et al., 2016;Ni et al., 2020;Tügel et al., 2022), implemented according to the approach proposed in Fiedler and Ramirez (2000).
The term S D further enables the model to simulate the urban drainage system.When comprehensive input data are available (including the capacity and location of drain inlets, manholes, pipes etc.), the dual-drainage approach can be applied (Chang et al., 2018;Q. Li et al., 2020;Martins et al., 2017;Rubinato et al., 2017;Sañudo et al., 2020;H. Zhang et al., 2021).In this case, D assumes the meaning of discharge rate per unit area (m 3 /s/m 2 ) due to the exchange between the sewer system and the surface flow and may be seen as a sink/source terms depending on the difference between surface water draining to the sewer system through inlets or the sewer flow surcharging to the ground surface.The flow in the sewer network is simulated using a 1D approach.
However, in many cases, information on the location and characteristics of stormwater infrastructures is missing or incomplete (Mignot & Dewals, 2022), forcing the use of simplified methods (Bertsch et al., 2017) that neglect the sewer flow surcharging to the ground surface.In this case, D can be computed using specific formulas for the inlet drainage capacity (Cosco et al., 2020;Kitsikoudis et al., 2021;Palla et al., 2016;J. Xia et al., 2022) or an equivalent infiltration capacity (Y.Wang et al., 2018), that is quite common for large scale studies involving both rural and urban areas (Noh et al., 2019;Qiang et al., 2021;Schubert et al., 2022;Xing et al., 2022).
The modeling strategy adopted in this work, for the sewer system, will be discussed in Section 3, on the basis of the specific features that characterize the case study analyzed in this paper.
The model is written in Fortran 90 and is configured for parallel execution using MPI directives (Text S1 and Figure S1 in Supporting Information S1), so it applies to shared memory, distributed memory, or high-performance hybrid computing systems.

Impact/Vulnerability Estimation
Impact forecasts can be organized using different levels of detail, including direct and indirect effects on the built environment, man-made mobile objects (i.e., vehicles), natural environment, human population, financial loss due to direct damage, business interruption, and so on (Diakakis et al., 2020b;Le Bihan et al., 2017;B. Merz et al., 2020;Ritter et al., 2020).
Performing an effective impact analysis goes beyond the purposes of the work and, therefore, we did not include in our framework any quantification of exposure, expected impacts and damages, and the related uncertainties (Molinari et al., 2020).The key element we want to stress is the benefit associated with the approach based on the 2D-SWEs modeling at the catchment scale from a nowcasting perspective.In this context, the model is not only able to predict the hydrologic response of the watershed but, contextually, may provide essential information to automatically move to vulnerability analysis and, then, impact estimation at least at a street level.A suitable combination of flow depth and velocity, computed by the integrated 2D-SWE model, through vulnerability and fragility curves (Lazzarin et al., 2022) allows one to estimate the potential effects of the flow on a target, such as the assessment of direct potential damage to structures and buildings (Bermúdez & Zischg, 2018;Milanesi et al., 2018), people (Arrighi et al., 2017;Milanesi et al., 2015;Musolino et al., 2020;Postacchini et al., 2021;N. Wang et al., 2021) and vehicles (Martínez-Gomariz et al., 2016;Milanesi & Pilotti, 2020;N. Wang et al., 2021).
In this work, we used the AIDR (2017) approach (Figure S2 in Supporting Information S1) essentially due to its simplicity and because it takes into account the main targets exposed to flood risk (Costabile et al., 2020).

Mobile X-Band Polarimetric Radar
A key component of the nowcasting system proposed in this work is the high-resolution X-band dual-polarization Doppler weather radar (XPOL).The radar system used in this study is the XPOL mobile radar of the National Observatory of Athens (NOA), which has a peak power of about 30 kW per polarization, and a parabolic antenna of 2.4 m diameter which corresponds to a half-power beam width of about 1°.It is usually based at the top of a 500 m high hill in the north suburbs of Athens, Greece and operates during rain events in the broader area of Athens.Its volume scan includes plan position indicator (PPI) near horizontal scans at various elevation angles (typically at 1°, 2°, and 3°) of its antenna and range height indicator vertical section scans at selected azimuth angles of interest in each event.It takes about 2 min for a volume scan to complete.The typical range of the radar is 120 km with a gate resolution of 150 m and angular resolution of 0.6°.
High-quality rainfall rates require careful processing of raw radar data, whose basic steps for XPOL are described below.Ground clutter is removed with a real-time narrow bandwidth IIR filter on I and Q components of the radar signal, and noise and remaining clutter are detected using criteria based on co-polar correlation coefficient, radial Doppler velocity and spectrum width (Kalogiros et al., 2013a).The reflectivity data of the system are corrected for signal attenuation from the rain using an iterative algorithm based on differential phase shift (Kalogiros et al., 2013b).The effect of the bright band layer during stratiform rain, which was not observed in the event considered in this work, is detected and corrected using an algorithm based on co-polar correlation coefficient and apparent reflectivity profile (Kalogiros et al., 2013a).The rainfall rate at surface level, which is the only rainfall data used in this paper, and rain microphysics parameters are then estimated using polarimetric algorithms, which have been obtained from T-matrix scattering simulation for a wide range of values of physical parameters and measurement errors (Anagnostou et al., 2013;Kalogiros et al., 2012).Hybrid PPI scans are constructed for each volume scan using the data from the lowest antenna elevation with low beam power blockage, which is estimated using beam geometry and 150 m resolution terrain.Finally, cell average or inverse squared distance weighted averaging of nearest neighbors (if no data was within a cell) on a rectangular grid with a spatial resolution of 200 m is made.The bias calibration of radar reflectivity measurements is checked using self-consistency of polarimetric variables (Kalogiros et al., 2013b).Differential reflectivity is calibrated using an average theoretical relationship between reflectivity and differential reflectivity as described in Kalogiros et al. (2013a).

Description of the Event
The flood event occurred on the 15 November 2017 at Mandra town, which is located 30 km away from the center of Athens (Greece) in the western part of the greater metropolitan area of Athens (Figure 2a).It was the most catastrophic flood in Greece in the last 40 years and it caused the loss of 24 people and extended damages to the infrastructure of Mandra.Several attempts have been made in the literature to reconstruct, with hydrologic or hydrodynamic models, the estimated peak discharge at the outlets of the catchments draining in the town (Bellos et al., 2020;Bournas & Baltas, 2022;Diakakis et al., 2019Diakakis et al., , 2020a;;Mitsopoulos et al., 2022aMitsopoulos et al., , 2022b;;Spyrou et al., 2020;Theodosopoulou et al., 2022;Varlas et al., 2019) but not the observed peak-water depths at various locations in the town.
The town is built in the outlet of two small to medium catchments (about 20-25 km 2 each), which drain two streams named Agia Aikaterini and Soures (Figure 2a).Agia Aikaterini stream is part of the Mandra urban drainage system.Recently, Bellos et al. (2022c) presented an open flood data set that includes all the data used in this study, namely: (a) the maximum water depths at 44 observation points (Figure 2b); the digital terrain model (DTM) of the area with a resolution of 5 m; shape files with the buildings' footprint; the rainfall field recorded by the NOA's weather radar.Regarding the catchment characteristics, the soil of the entire computational area is characterized as loam.The building footprints are derived manually, using satellite photos of the Google Earth platform.

Radar Rainfall Measurements and Estimations
The event occurred at the outer edge of Medicane Numa-Zenon (a location of possibly weak rainfall and, thus, no flash flood was expected), which was centered over the Tyrrhenian Sea and was approaching Greece from the west (Figure 3a).However, the estimates of accumulated rainfall during the event from direct infrared or passive microwave satellite data in the area were limited to about 50 mm and the Integrated Multi-satellitE Retrievals for Global Precipitation Measurement mission (IMERG) Early Run product estimation (not shown here) was about 150 mm, which was significantly reduced in Late Run product.This was probably due to the small vertical development of the convective clouds in this event which led to significant underestimation of rainfall by satellite data.The reflectivity calibration of XPOL radar against a disdrometer and rain gauges located within a 10-20 km distance from the radar in a post-event period of about 10 days and the comparison of IMERG estimations against radar data are presented in Varlas et al. (2019).According to the XPOL Doppler measurements (not shown here), the wind speed and direction near the surface during the core of the event was moderate to high (around 15 m/s) with a general direction from the south and an eastern flow to the east of the flood area, which converged in the area of the Pateras mountain.The spatial and temporal characteristics of the storm were captured by XPOL radar.The elevation of the radar operation site, the small beam width, the use of antenna elevation angles of at least 1° and the peak height of about 1,100 m of the Pateras mountain at the core of the current rainfall event contributed to very small-beam blockage and ground clutter at the range (about 40 km) and area of the event.
Figure 3b shows the accumulated rainfall during the whole period of the event.Accumulated precipitation height reached 300 mm in 8 hr (200 mm in the time period 3:00-6:00 GMT) in a small area (4 km × 18 km) at the slope of Pateras mountain.Due to the remote location and the small size of the area of intense rainfall and the lack of rain gauges or stream gauges in this area, rainfall was recorded only by XPOL.Outside the area of intense rainfall, the nearby meteorological stations of the Hellenic National Meteorological Service and the National Observatory of Athens recorded almost no rain.Also, operational weather models did not forecast any significant rain in the area.This fact indicates that the local orographic effect was a significant cause of the flood event, as well as the importance of radar measurements in recording and nowcasting flash-flood events in complex terrain.
A normalized variogram analysis of the rainfall field of the event, recorded by the radar at 2 min and 200 m resolutions over the two catchments was obtained, similar to Berne et al. (2004).The variogram of a spatial field (rainfall in this case) is the structure-function, which is half the variance of the differences between values of the spatial function of the field at all possible couples of points separated by a specified distance.
The spherical model reported in Equation 4 is frequently used to describe the variogram: where γ is the variogram, C 0 and C are the nugget effect and sill parameters, d is the distance, and r is the sill range.According to this model, the variogram increases almost linearly for small separation values and tends to a fixed value for large separation values.The sill range corresponds to the largest scale of spatial variability of the field.Similarly, the temporal variogram can be defined for temporal instead of spatial separations.This analysis showed a temporal sill range of about 28 min and a spatial sill range of about 3.9 km (Figures 3c and 3d), which are required to fully resolve the event.These sill ranges imply that the characteristic temporal and spatial scales (one-third of sill spatial and temporal ranges) of the rainfall field, which captures about 87% of the variability information assuming a spherical model, are 9.3 min and 1.3 km, respectively.The zero nugget effect of model fit indicates that both temporal and spatial resolutions of our radar data are sufficient to capture the full structure of the rainfall field.However, the hydrologic/hydrodynamic response of the catchment may be affected by small scales of rainfall field, which we examine in this paper.

Model Setup
The computational domain is based on an unstructured grid, made of triangular irregular elements having a different size.Two different grid resolutions have been set for the whole area upstream of the town.Specifically, following the approach proposed by Costabile and Costanzo (2021), a first simulation was performed using a uniform coarse resolution, in order to separate the most hydrodynamically active areas (essentially channel network and unchanneled valleys) from the hillslope.Then, the grid resolution of the former areas has been further refined (approximately 25-36 m 2 ) whereas 49-64 m 2 has been set for the hillslopes (see Figures 4a  and 4b).Finally, the part of the domain covered by the urban area (roads, squares, etc.) has been discretized using a resolution of up to 9 m 2 (see Figures 4a and 4c).The total number of computational elements is approximately equal to 10 6 .
As shown in Figure 4c, buildings are treated by the "building hole" approach (Costabile & Macchione, 2015;Schubert & Sanders, 2012) setting proper solid wall boundary conditions.As for the other boundary conditions, we assumed open boundary (transmissive conditions) in the most downstream part of the domain, and solid wall conditions for the catchment boundaries (Figure S3a in Supporting Information S1).
Following Bellos et al. (2020), three roughness classes have been assumed throughout the domain (Figure S3b in Supporting Information S1).We have assumed just one class for the soil (loam), uniformly distributed in the two watersheds.
As far as the urban area is concerned, no water losses have been considered, meaning that the effects of the drainage structures have been assumed as negligible since the sewer network was either destroyed or did not work at all (Bellos et al., 2022c;Diakakis et al., 2019).Moreover, the sewer system was designed for a maximum flood capacity of 10 m 3 /s, which is about 10% of the discharge peak value flowing in the town.Therefore, even assuming that the sewer system worked during the event, its influence is expected to be very limited and, thus, not so significant to model it.
The last step was the generation of the rainfall input starting from the radar-estimated accumulated rain.Specifically, for each computational element, the rain input was computed by the interpolation of the recorded data on the nodes of the computational grid for each available time step.The simulation refers to 15 hr of the real event, from 23:00 GMT on 14 November 2017 to 14:00 GMT on the next day.

Parametric Calibration Strategies
The numerical model incorporates six gray-box parameters required by two sub-models used in the 2D-SWEs, namely the friction and the infiltration sub-models, used in the momentum and mass continuity equations respectively.In particular, three values of the Manning coefficient have been considered (namely n L , n M , n H ), associated with the three main land uses identified in the domain (urban, vegetated, and forested areas).Moreover, the effective porosity E, the suction head S and the conductivity C need to be estimated as regards the infiltration sub-model.
In a previous work (Bellos et al., 2022a), we show that the coupling between radar data and the 2D-SWE model proved to be reliable when performing blind simulation, in which we assumed typical values of literature for the parameters involved in the computations.In this paper, the calibration process assumes particular importance because we are interested in understanding the influence of rainfall resolution on the overall results.To that purpose, the model was initially calibrated considering a reference spatial and temporal rainfall resolution.Then we checked the effects of denser and coarser resolutions to realize if there are spatial and temporal thresholds above which the model calibration is dependent on the rainfall resolution.Finally, we are interested in the comparison between the model predictions when using ad hoc calibrations for the denser and finer rainfall resolutions.
For the calibration strategy, we followed the principle of parsimony, in order to define a number of parameters under calibration which could be supported by the existing data (Efstratiadis & Demetris Koutsoyiannis, 2010).
It can be summarized in the following steps.
1. Defining a plausible range of the parametric values, based on the natural characteristics of the field and engineering judgment.For the estimation of the Manning coefficients, we combined the land use of the domain, the bibliographic values for each land use and generic suggestions for this issue (Bellos et al., 2018).
For the range of the infiltration parameters we followed the respective bibliography (Rawls et al., 1983).Therefore, we pre-defined the following plausible ranges: n L = (0.02-0.05) s m −1/3 ; n M = (0.04-0.07) s m −1/3 ; n H = (0.05-0.10) s m −1/3 ; E = (0.334-0.534); S = (1.33-59.38)cm; C = (0.3-0.8) cm/hr.2. Screening the number of the parameters using a global sensitivity analysis (GSA) method, with respect to their impact on the results (Kourtis et al., 2021).This is achieved using the GSA methods, such as Morris or Sobol methods (Morris, 1991;Sobol, 1993), with which the influence of each parameter on the final result is quantified.In this methodology, the elementary effects (EEs) and their statistical characteristics (namely the mean and the standard deviation) are calculated for each parameter.Since the statistical characteristics are estimated, the higher the mean the higher the contribution of the parameter to the dispersion of the output and vice versa.The standard deviation of EEs is a metric of the linearity and/or the interaction effects of the considered parameter in question.High values of the standard deviation denote nonlinear effects or interactions with at least one other parameter.With this information, the modeler can screen which parameters are critical to the analysis and reduce the dimensional space.The application of the Morris method was performed here using the SAFE toolbox (Pianosi et al., 2015).To derive the EEs, the step size Δ was defined as equal to 2/3 and the number of the trajectories, which define the number of the runs, was equal to 15. 3. Performing a grid-search calibration scheme, since the required computational time of the model is prohibitive for implementing an evolutionary optimization algorithm.The objective function to minimize is the root mean square error (RMSE) between the modeled and the observed maximum water depths at the 44 observational points.Specifically, 105 runs were performed using the 10 min (according to the estimated hydrologic response mentioned in Section 3.1) and the highest 200 m resolution of rainfall field according to radar data.4. Implementing a more robust optimization algorithm in order to investigate whether a better solution can be found with respect to the optimum found from the grid-search calibration, using a Machine Learning (ML) surrogate assistance (Christelis & Mantoglou, 2016;Tsattalios et al., 2023;Tsoukalas et al., 2016).To that purpose, we developed 44 data-driven fast emulators for all the observational points, trained by numerical results derived from the numerical model for the calibration phase.The ML method used for the development of the emulators is the Gaussian Process Regression (GPR) technique, which is used in several hydrological applications (Carbajal et al., 2017).For the GPR, we used the Regression Learner application in MATLAB.
Briefly, with this method, assuming that we have a training data set x, y, where x is the input data (an array with all the values of the parameters selected for the calibration phase) and y is the response vector (the maximum flood depth), we can predict a new response y for every new array of x.
The GPR technique assumes that the response y is a function of input data that follows a multivariate Gaussian distribution with the following form (Equation 5): where m(x) is the mean function, the k(x, x′) is the covariance or Kernel function and (x, x′) are all possible pairs of the input array x.The Kernel function used in our study, which is also used in other hydrological studies (Carbajal et al., 2017), is the squared exponential function, which has the following form (Equation 6): where σ 2 is the overall variance or amplitude and l is the length scale.The latter parameters are known as hyperparameters and should be optimized against the training data.
A detailed presentation of the technique can be found in Rasmussen and Williams (2006).

Calibration
The results of the GSA, depicted in Figure 5 for all 44 points, highlight that the most influential parameters are the Manning coefficient n L and the Green-Ampt parameters S and C (Figure 5a).This picture was expected since n L involves the urban environment and the 44 watermarks are all inside the Mandra town.
An interesting (but also expected) point here is that if we substitute the maximum water depths with the flow peaks of the flood hydrographs computed in the outlet of the two catchments that run to Mandra town (namely Agia Aikaterini and Soures catchments) we have a different picture (Figure 5b), in which infiltration parameters S and C are by far the most influential, while friction parameters are not of high importance.This picture is logical: n M is the majority of the computational area, while there is no area related to n L at the upstream part of the catchments.
As regards the grid-search calibration scheme, a set of 100 (4 × 5 × 5) runs is implemented, assuming that the steps for n L , S, and C are respectively equal to 0.01 s/m 1/3 , 14.5125 and 0.125 cm/hr.The remaining three parameters took the average values of their plausible range, specifically n M = 0.055 s/m 1/3 , n H = 0.075 s/m 1/3 , and E = 0.434.
A preliminary data handling showed that the objective function was trapped in the boundaries and hence an additional set of 36 runs is performed, assuming that n L ranges (0.075, 0.125) s/m 1/3 and the step is 0.025 s/m 1/3 , S ranges (15.8425, 44.8675) cm and the step is 14.5125 cm and C ranges (0.3-0.675) cm/hr and the step is 0.125.The number of new runs and the grid step are decided with an engineering judgment, considering the additional computational cost and the benefit to the objective function.Grid-search calibration suggested that the optimum parametric set is n L = 0.075 s/m 1/3 , S = 30.355cm, C = 0.425 cm/hr and the objective function takes the value RMSE = 0.586.
A 4D plotting (three parameters and the RMSE which is the objective function) is depicted in Figure 5c.A graphical observation of the 4D plot indicates that the optimization has the following weaknesses: (a) there is a lower threshold of the objective function which means that there is a structural weakness in finding better solutions and (b) it suffers from the most common illness of the optimization problems, namely the equifinality (Beven & Freer, 2001).This may be explained by the type of data used for calibration, represented by the water depths.Specifically, for a given mesh resolution, similar maximum levels can probably be obtained, for example, for low infiltration (leading to larger flood volumes) combined with low roughness, and for high infiltration (lower flood volumes) combined with higher roughness.
If we rank these 136 runs from the biggest to the smallest value of the RMSE we derive Figure 6, where the upper part depicts the way in which the objective function tends to the optimum, while the down part depicts the evolution of all the parametric values against the number of runs, as previously ranked.The y-axis is in the logarithmic scale for illustrative purposes (in order to plot all three parameters).Besides, the 10-period moving average is also plotted in order to detect whether it is a trend.
It seems that there is a lower threshold in the objective function which is equal to 0.59.About 7% of the runs have a percentage difference of less than 1% from the optimum value of the objective function, while this percentage becomes about 20% for a corresponding percentage difference of less than 5%.Therefore, there are strong indications that the objective function cannot be improved more and the lower threshold is the optimum solution.
Regarding the parameter estimation, with the help of the moving average we can observe a general trend for higher Manning coefficient and lower suction head and conductivity, but we cannot prove that there is a trend that results in a unique set of solutions.A similar conclusion can be obtained when introducing a random error in the maximum water depths recorded at the 44 observational points, with a uniform distribution, to investigate its influence on the objective function (Text S2 and Figure S4 in Supporting Information S1).
Finally, we developed 44 data-driven fast emulators for all the observational points, as explained in Section 4.2.The input for each emulator is the parametric set of n L , S, C, and the output (prediction) is the maximum water depth.The developed emulators replicate the simulator in good accordance, as the errors, are less than 5%, even though there are very few errors that can reach 20%-30% (Figure S5 in Supporting Information S1).
Having a very fast prediction tool instead of a relatively slow numerical model, we implemented a more robust optimization algorithm in order to investigate whether it is a better optimum compared with the optimum found from the grid-search calibration.The optimization algorithm selected is the interior-point method for non-linear constrained problems (fmincon function of MATLAB).If we assume that the optimization problem is constrained, the optimum parametric combination is n L = 0.0634 s/m 1/3 , S = 10.0427cm, C = 0.6307 cm/hr, and the objective function is RMSE = 0.582, which is not a significant improvement in comparison with the grid-search calibration.As a further confirmation, no benefit in the solution was obtained using the parametric set deduced from ML-based calibration in the 2D-SWEs model.This was expected since the objective function is rather smooth with no discontinuities.
Even with the notice that there is a level of equifinality, in what follows we assume that the optimum combination of the parameters required by the numerical model is the following: n L = 0.075 s/m 1/3 (calibrated), n M = 0.055 s/m 1/3 (average value), n H = 0.075 s/m 1/3 (average value), E = 0.434 (average value), S = 30.355cm (calibrated), C = 0.425 cm/hr (calibrated).

Flash Flood Evolution: Comparison to Observations, Combined Analysis of the Hydrologic and Hydraulic Response in the Town and Street-Level Prediction of the Temporal Evolution of Elements at Risk
Figure 7 shows the overall results obtained by using the 10 min and 200 m resolution radar rainfall and the set of parameters calibrated as before.Figure 7a, in particular, shows the capability of the model in the simulation of quick catchment response to the high variability of rainfall both in time and space, describing multiple peaks.According to this simulation, the maximum peak values are approximately equal to 160 and 100 m 3 /s for the Soures and Agia Aikaterini catchments, respectively.The peak discharge values simulated upstream of the town are within the range of the results of previous studies of the same flood event using the same radar rainfall data (Bellos et al., 2020;Diakakis et al., 2019;Varlas et al., 2019) with differences depending on the model used and setup, confirming the physical soundness of the approach.The model described the channel networks in the two catchments (Figure 7b) and provided an overall good agreement between the observed and the computed flood extent in the town (Figure 7c).
Figure 8a shows the water depth evolution computed in six points located along the main roads, from which the close agreement between the maximum values and the observed water marks can be appreciated (see for example P5, P27, and P21).However, as shown in Figure 8b, the discrepancies are significant in other points (but not more than 0.6 m in absolute values).They are mainly located on secondary and narrow streets, highlighting in some cases important underestimations (see point P30) and overestimations (point P44), even though the maximum water levels seem to be generally underestimated (points P32, P42, P35).This is not surprising and can be easily explained by the poor level of topographical information for the description of alleys and other small topographical features, that cannot be described by the available DEM, and the uncertainty in the exact elevation of the observation points, which should affect the estimated water level, at the elements of the computational grid chosen for comparison purposes.
We note that, between these points with underestimation of water depth, there is a topographic feature (small hill) about 20 m higher than the surrounding terrain and, thus, the exact location and elevation of these observation points (at the foot of this hill) is quite important for validation purposes.
Figure 8c shows the observed flood extent and the location of the observed watermarks.The points have been grouped in five areas, according to street-level logic, as our interest lies in identifying local clusters of points with higher/lower errors.Figure 8d shows five boxplots of the percentage errors, each corresponding to a group of observed watermarks.This picture further shows that the largest errors occur in areas characterized by narrow and peripheral streets (Figures S4 and S5 in Supporting Information S1).Particularly, the group with larger errors is located close to the inundation boundary (indicated by the green squares in Figure 8c).
According to an impact-based nowcasting perspective, the street-level prediction of the temporal evolution of elements at risk is essential.In this context, the application of the AIDR criterion reported in Section 2.2 would lead to the results shown in Figure 9. Significant variation within the town can be observed during the evolution of the event, identifying streets involved by completely different hazard classes.For example, a significant part of the main road analyzed in Figure 9a is characterized by the higher class (H6), from 6:30 to 8:00 GMT, and only from 8:30 GMT the situation seems to improve.On other streets, the situation is better than that of the main road, but significant hazards for people and vehicles, with different levels of hazard, can be identified as well.

Effects of Rainfall Resolution on Hydrologic and Hydraulic Responses
The measured rain fields have been arranged according to the following resolutions: 200 m, 1 km, and 3 km for spatial resolution and 2, 10, and 30 min for the temporal resolution, assuming that the most common weather data are lying in this order of magnitude, starting with the more detailed resolution (usually obtained by radar data) until coarser resolutions (usually obtained by satellite data).The lower values of resolution relate to the highest measurement resolution and the upper values include the characteristic spatial and temporal scales of the event and catchments (Section 3.2).As a consequence, eight more runs of the 2D integrated models have been performed using these different rainfall inputs, using the same set of calibrated parameters of the reference solution (200 m and 10 min as spatial and temporal resolution, respectively) and the same computational grid.
The effects on discharge hydrographs are shown in Figure 10.Whereas the variations between the simulations related to 200 m and 1 km are substantially negligible (Figures 10a and 10b), important differences can be detected when considering the coarsest resolution (3 km).In particular, the peak values are lower than the other two simulations, with variations of approximately up 23% and 53% for the Agia Aikaterini and Soures catchments respectively, the multiple peaks are significantly smoothed and flood volumes are significantly lower.These results are somehow expected, considering the general lowering of the highest rainfall intensities that, in turn,    induces a reduction of the surface runoff.However, the impact on the arrival times and times to peak, which is key information from a nowcasting perspective, seems to be negligible.
These considerations do not hold when attention is focused on the effects induced by different rainfall temporal resolutions keeping constant the spatial resolution (200 m).As shown in Figures 10c and 10d, the effects on the arrival times and time to peak are limited when comparing the results related to 2 and 10 min but are much more significant when considering the coarsest temporal resolution (30 min), for which a delay approximately equal to 30 min can be observed for both the catchments.The maximum peak values are not so different but the associated variations in terms of times-to-peak are more significant.
Finally, Figures 10e and 10f highlight the tendency of the model to underestimate the observed watermarks, especially when considering the coarsest spatial resolution, as highlighted by the metrics RMSE and E (mean percentage error) (Table S1 in Supporting Information S1).The highest are the temporal and spatial resolutions the lowest are the values of the metrics and, therefore, more accurate are the simulation results.Moreover, the use of the lowest rainfall resolution induces underestimation greater than 0.3 m and up to 0.5 along the main road, whereas medium to low differences can be detected in other streets of the town (Figure S5 in Supporting Information S1).

Effects of Rainfall Resolution on the Dynamic Evaluation of Element at Risk
Figure 11 summarizes the effects of spatial and temporal rainfall resolution on the street-level prediction of elements at risk, which may be essential in an impact-based nowcasting system.Specifically, for the sake of brevity, the focus here is only on the extent of areas characterized by specific hazard conditions in instants of times representative of the hydraulic peak condition within the town (approximately 6:00-6:30 GMT).
In particular, we are interested in the quantification of the variation induced by the lowering of the resolution, so that the highest resolution (200 m and 2 min) is considered as the reference solution.The variation index D HR (Equation 7) is thus computed for the hazard class j (j = 2, 3, 4, 5, 6) in which A GR,j is the area associated with the hazard class j using a generic spatial and temporal resolution and A HR,j is the homologous one related to the highest resolution.
The overall results show that the areas associated with the lowest hazard classes seem not to be significantly influenced by the rainfall resolution (except when the lowest temporal resolution is considered).This is expected due to the spatial and temporal aggregation of the rainfall field that, in turn, induced the reduction of the peak values.As further proof of this, greater variations can be seen for the highest hazard classes, which seem to be very important, especially when considering the lowest spatial resolution.Therefore, the lowest resolution should be 1 km and 10 min to have limited errors in hazard class predictions (less than about 10%) and a minimal error in arrival times for this event.However, even for these resolutions in the case of the highest hazard class the difference from the simulation using the finest resolutions is not negligible.
Figure 12 further highlights the consequences of the lowering of the rainfall resolution.Specifically, the comparison between the areas characterized by specific elements at risk, as predicted by the 2D integrated model using the highest and lowest resolution at 6:30 GMT, is shown.Different hazard classes are estimated in the two cases in several streets (see for example the black ellipses).In particular, along the main road (ellipse A) the difference is equal to one class, whereas in other parts of town, the variation can be up to two classes (ellipse B) or can be considered generally safe (ellipse C).These considerations gain importance in a nowcasting perspective because different predictions of hazard conditions in specific streets could suggest different emergency strategies.

Sensitivity of Model Calibration on Rainfall Resolution
The results shown in the previous section pose a question about the influence of rainfall resolution on model calibration.Therefore, we performed a grid-search calibration for the 3 km and 30 min rainfall, coarse resolution hereafter, under the assumptions that: (a) the outcome of the sensitivity analysis using 200 m and 10 min rainfall, hereafter identified as dense resolution, still works for the coarse resolution, so that the most influential parameters are the Manning coefficient n L (friction sub-model), the suction head S (infiltration sub-model) and the conductivity C (infiltration sub-model); (b) the parametric values for this calibration were the same as the initial one related to the dense rainfall resolution.
The two objective functions for 136 runs and the parametric values with 10-period moving averages, for both dense and coarse resolutions are depicted in Figure 13.It is observed that the threshold of 0.59 of the objective function is still not overcome.
Regarding the parameter estimation, the picture for both rainfall resolutions is the same: a general trend for higher Manning coefficient and lower suction head and conductivity, but there is no proof for encountering the issue of equifinality.Even with that limitation, the optimum combination of the parameters required by the numerical model are the following: n L = 0.125 s/m 1/3 (calibrated), n M = 0.055 s/m 1/3 (average value), n H = 0.075 s/m 1/3 (average value), E = 0.434 (average value), S = 15.8425cm (calibrated), C = 0.675 cm/hr (calibrated).In comparison with the dense resolution, it seems that a higher Manning coefficient and conductivity and a smaller suction head are required.In particular, the value of the Manning coefficient seems to be unrealistic.This value could be acceptable in the "building-resistance" approach (Schubert & Sanders, 2012) whereby a large resistance parameter value is assigned to cells that fall within building footprints to describe, indirectly, the effects of buildings on the flow behavior.This is not the case in our simulation, as already explained in Section 4.1, in which the presence of buildings is explicitly taken into account.Similarly to what has been observed in Section 5.1, the equifinality issues highlighted before can be explained by the type of data used for calibration (maximum water depths).Since the low-resolution rainfall simulation leads to discharge underestimation, the calibration procedure attempts to compensate by increasing the roughness coefficient (up to almost unphysical values) in order to obtain the same maximum depths.
From an impact-based nowcasting point of view, it is interesting to focus on the combined effects of calibration and rainfall resolution inside the town.Figure 14 summarizes the impact on the simulated water depth hydrographs at a point located along the main road, P8, between P5 and P13 (see Figure 8a).In particular, Figure 14a shows the effect of model calibration using the coarse resolution, which increases the peak value from 2.0 m obtained with the calibration related to dense resolution (green line) to 2.2 m (black line) which, in turn, is the same value obtained using higher spatial resolutions, related to dense rainfall calibration (blue and red lines).This is representative of what has been observed in Figure 13, about the similar values assumed by RMSE using calibrated values for dense and coarse resolution.
However, the simulated arrival time is higher than before (approximately 10 min).This is essentially due to the higher value of the Manning coefficient, resulting in about 67% greater than the calibrated value using the dense resolution.The delay in the prediction of arrival times induced by coarse resolution further increases (approximately 40 min) when compared to the simulation related to the dense one (red line in Figure 14b), performing even worse than the results obtained using the same rainfall resolution but using the set of parameters calibrated using the dense resolution (green line in Figure 14b).

Discussion
The experience gained in the analysis of the Mandra 2017 flood events highlighted several relevant aspects to take into account for the implementation of street-level nowcasting of flash flood impacts in watersheds involving urbanized areas.Though Bellos et al. (2022a) showed that the proposed modeling strategy is able to provide reasonably good results when performing blind simulations, that is in the absence of model calibration, a careful evaluation of both the accuracy and the feasibility of the proposed approach should be analyzed in the light of calibration issues and rainfall resolution that, in turn, are strictly linked together.
Actually, we are still far from describing a flood event in such an accurate way, without calibrating the required parameters (white-box modeling).With the current detail of data and the existing computational power, even the most sophisticated mechanistic flood simulators still involve numerous parameters which have a gray-box nature.
Of course, a more detailed input data set, such as a DTM with denser resolution or a more thorough description of the built-up environment identifying gardens, houses, fences, etc. can always improve the accuracy and the efficiency of the simulations, but a careful calibration strategy is still essential for a proper rainfall-runoff analysis, even with the less "uncertain" SWE-based simulators.
Our results highlighted that the calibration suffers from equifinality, due to structural weakness.There are several sources for this weakness.The major factor has to do with the input data, namely the resolution of the mesh and the building's footprint.For the first, even a 5 × 5 m DTM is not enough to identify all the flow complexities, since the streets are rather narrow (5-10 m).For the second, even though the "building hole" approach is the more efficient method in comparison with the others (Bellos & Tsakiris, 2015), still there are disadvantages: (a) the flood volume stored in the houses cannot be captured; (b) the building's footprint is manually drawn and therefore it was very difficult to separate the main house and the garden, which is probably flooded in the case of a picket fence.Except for the latter, there is always the structural uncertainty which amplifies the errors.First, the nature of the form of 2D-SWE used in our simulator, which does not include turbulence modeling.Second, the errors introduced by the numerical method used for solving the equations, since the numerical approach to the partial derivative is always an abstract from the analytical solution.
However, the good agreements in terms of spatial extent (Figure 7c) and water levels within the town (Figure 8) represent clear indications of the accuracy of the approach proposed in this work.The physical soundness of the results can be further appreciated in terms of arrival time.Specifically, the first emergency phone call to the authorities requesting help at about 4:38 GMT, which probably came from a house at the entrance of the town from the side of Agia Aikaterini catchment and agrees with some reasonable reaction delay with water depth results (arrival time at 4:15 GMT as reported e.g., in Figure 14c) and it is a further confirmation of simulation results.Moreover, the simulated response time (the first small peak discharge at Agia Aikaterini outlet approximately at 4:00 GMT, Figure 10c) corresponds to a similar first small peak of average catchment rainfall at 2:55 10.1029/2023WR034599 23 of 31 GMT (Figure 7a) agrees with the mean value (1 hr and 25 min) of the formulas proposed by Barbero et al. (2022), specifically derived for small basins considering also the characteristics of the storm events.The usefulness of the overall approach proposed here can be further appreciated considering the different information that an emergency manager can extract from the model result.For example, the duration of the event is not the same along the main road.Specifically, as one can observe in Figure 8, whereas the beginning of the event is substantially the same everywhere (approximately 4:20 GMT), it is quite difficult to detect a common timing for its ending.In particular, the event seems to be finished after 9:30-10:00 GMT in some points (see e.g., P27) due to the very low value of the water depths that, in contrast, are still significant in other locations (see e.g., P13 and P21).Looking at the results shown in Figure 8b, the difference in terms of flood evolution inside the town is exacerbated.In some points (P30 and P32) no hydraulic response can be seen after 7:00 GMT whereas in other locations this occurs after 8:00 GMT (see e.g., P35 and P44).
Further quantitative key information for an emergency manager can be provided, such as the spatial extent of the areas involved by a specific hazard condition during the event (Table S2 in Supporting Information S1), providing more specific information than the water depth only or discharge hydrographs.From this point of view, it is possible to predict the timing of the worst situation at the town scale, that is the instant of time for which the maximum exposure of the elements at risk is simulated (6:30 GMT) and the duration of the most critical condition (6:00-8:00 GMT).This information can be combined with specific curves reporting the expected damages in relation to the element at risk and the related spatial extension (Huizinga et al., 2017) but this goes beyond the purpose of the work.
Since the approach proposed here potentially enables decision-makers to identify the streets most vulnerable to flood impact, it might contribute to the improvement of the emergency response by evacuation planning (Coles et al., 2017;Musolino et al., 2022) and effectively support the design of emergency measures (Dottori et al., 2017).Specifically, logistics of operations and road accessibility are known to be essential for emergency activities and flood preparedness (Arrighi et al., 2019;Coles et al., 2017;Green et al., 2017).The potential of the proposed system is the ability to forecast, dynamically, the most dangerous roads, that can be the scene of fatalities or inaccessible during and after the event, as a function of the quick hydrologic response of the watershed upstream to intense and localized rainfall forecasted by the same model.However, these actions were not examined in the current paper and more research is needed to address them.
The influence of rainfall resolution on both the model calibration and overall predictions represented a key element of this work.According to the simple empirical equations proposed by Berne et al. (2004), the temporal and spatial resolutions of rainfall measurements for hydrological analysis are equal to 7.5 min and 4 km, respectively.Actually, the variogram analysis of the current event (Figures 3c and 3d) showed little different required resolutions of 9.3 min and 1.3 km.Overall, the hydrodynamic results highlighted that in order to avoid important underestimations on both flood discharges generated by the two catchments and water depths within the urban areas, resolutions up to 1 km are recommended, at least for situations similar to that analyzed here.The temporal resolution of rainfall data significantly affected the arrival times, which are essential from a nowcasting perspective, but its effect can be neglected when using resolutions up to 10 min.The difference between the highest and lowest spatial resolution in terms of water depth values inside the town is approximately 40 cm, except for the 30 min-resolution in which the variation is slightly lower (30 cm).It should be observed that the flooding of Mandra was mostly induced by the Agia Aikaterini catchment, since the Soures stream impacted only the downstream part of the town.Considering that the effects of the spatial resolution are much more important for the Soures catchment (see Figures 8b and 8f) due to the specific features of the rainfall event (about 30% more accumulated rainfall in Soures than in Agia Aikaterini catchment), it may be deduced that the variation in terms of hydraulic response, within the town, could be potentially even more significant if the situation was reversed.
The influence of rainfall spatial and temporal resolution was not only shown in terms of the hydrological and hydrodynamic responses, as shown in other studies (Cristiano et al., 2017(Cristiano et al., , 2019;;Fatone et al., 2021;Ochoa-Rodriguez et al., 2015;Shuai et al., 2022;Smith et al., 2013;ten Veldhuis et al., 2018), but also highlighting the effects on the estimation of street-level impacts, for which different evaluations of the areas characterized by hazard conditions for people, vehicles and buildings have obtained.Based on Figure 11, a good practice guidance would suggest that if the input of the flood warning system is a high-resolution weather radar rainfall product, the latter system can more accurately predict the time at which the flood peak will hit each part of the city.On the other hand, if the input data is a weather product with a coarser resolution such as a weather satellite, the developers of the flood warning system shall avoid predicting specific time moments but provide the predicted maximum depths in the time horizon of the system (e.g., 15-30 min for geostationary satellites or more than 6 hr for polar satellites).The finest resolutions should be used whenever available, which is the case of rainfall data from high-resolution radars, for nowcasting purposes (when characteristic temporal and spatial scales are not known) instead of post-event analysis and when higher accuracy of hazard class estimation is needed.
Finally, the results of this study highlight that particular care should be devoted when the calibration is performed using coarse rainfall resolution.A corrective mechanism during the calibration exists, that masks biases caused by different precipitation resolutions, which leads to a similar performance metric.This is mainly due to the very high unphysical value of friction within the town that induced the increase of the water level.However, this caused a further delay when comparing the estimated arrival time at the entrance of the town with respect to the observations, because of the expected drastic reduction of the flow velocity caused by the unrealistic value of the friction, which has consequences on impact estimations.
As concerns the practical feasibility of the proposed approach for flash flood impact nowcasting, we carried out simulations using different clusters.Specifically, the analysis of runtime changes with the number of processors for a fixed grid size has been carried out using the GALILEO 100 cluster, available at SuperComputing Applications and Innovation (SCAI, which is the HPC department of CINECA), the largest computing centre in Italy and one of the largest in Europe.In particular, we were able to simulate the whole event (15 hr) in approximately 30 min, approximately, using 192 threads (Text S3 and Table S3 in Supporting Information S1).Therefore, as a rough indication, the numerical model requires 2 min to predict 1 hour of the real event, that is the order of magnitude related to the hydrologic response of small watersheds, like those considered in this work.This is a clear indication of its feasibility for flash flood nowcasting, at least when using recent and competitive hardware, despite the high-resolution grid used in the urban area (up to 9 m 2 ) which mostly determines the run time.However, even using an older cluster, characterized by lower computing capabilities (Text S3 in Supporting Information S1), the computational time seems to be reasonable from a nowcasting perspective (approximately 1 hr for the whole event).
In the future, it would be interesting to extend this modeling approach to bigger watersheds to test the overall run time, even considering more HPC-enabled solutions to enhance the computational efficiency (Caviedes-Voullième et al., 2023;Morales-Hernández et al., 2021;Sharifian et al., 2023).

Conclusions and Final Remarks
The results of this paper have been oriented to feed the emergent and ongoing line of research on flash flood impact forecasting and warning, providing a contribution to the enhancement of existing tools to get information on the location of affected people, disruption of services, damage to buildings and other elements present in hazard zones.The main findings of this work refer to: (1) the feasibility and the accuracy of the proposed approach and (2) the influence of both model calibration and rainfall resolution on the expected flood impact forecasting and warning.
1.It seems that a nowcasting flood platform in the street-level scale is now feasible.The coupling of high-resolution DTMs, radar products for weather measurements and short-time forecast (up to 3 hr) based on measurements (taking into account the rainfall temporal and spatial heterogeneity) and robust integrated numerical models running in HPC facilities has provided good results, reproducing, satisfactorily, complex phenomena characterized by convective-type storms over two catchments.Specifically, the accuracy of the approach has been tested not so much using flood discharge data or water levels in the rivers but against more than 40 points located within the urban district, according to a logic of street-level nowcasting of flash floods impacts.The calibration process highlighted that the overall solution does not seem to be heavily affected by the parameters assumed in the simulations.This can be an indication about the reliability of the approach proposed in this work, that in any case requires further analysis and applications to be generalized.The feasibility of the approach is further testified by the run time of the numerical simulations, which was 30 times faster than the real-time, despite the use of high-resolution cells (9 m 2 ) required to manage urban flooding, so that the hydrological response time (approximately 1 hr) for watershed like those analyzed here can be simulated in 2 min.2. Both spatial and temporal heterogeneity of the rainfall field are essential for accurate predictions of street-level impacts.In particular, in order to avoid important underestimations on both flood discharges generated by the two catchments and water depths within the urban areas, resolutions up to 1 km are recommended, at least for situations similar to that analyzed here.The temporal resolution of rainfall data significantly affected the arrival times, which are essential from a nowcasting perspective, but its effect can be neglected when using resolutions up to 10 min.However, despite these suggested resolutions the use of the highest and the finest spatial-temporal rainfall resolution analyzed here may have significant effects on the estimation of street-level impacts, leading to different evaluations of the areas characterized by hazard conditions for people, vehicles and buildings.Therefore, these results suggest that high-resolution weather radar products, having more detail than the corresponding satellite products or rain gauges information, are preferable in impact-based flash-flood simulations.3. The results of this study highlighted that a set of calibrated parameters for a dense rainfall resolution may not work for a coarser one.The use of ad hoc calibration for the coarsest resolution allowed us to obtain similar metrics for the RMSE value referred to the maximum water depths but using a too-high, unphysical value of the Manning coefficient for the urban area, that reduces the flow velocity in the town inducing further delay in terms of arrival times and potential underestimation of expected impacts.Overall, the objective function related to the coarse resolution is sharper and converges to the optimum value slower in comparison with the dense one.The latter means that a coarse resolution is more vulnerable to errors in case of no data and therefore no calibration phase exists.Finally, it should be observed that the computational cost, the structural weakness of the model to capture the reality which is characterized by high complexity and the specific features of every area, should be taken into account for the calibration scheme.Our findings highlight that the use of a surrogate ML-based emulator was not able to find a better solution than the parametric set derived by the grid-search calibration.
These conclusions may be limited, in part, by uncertainties in the distribution of soil type, the way the buildings' footprints were derived and the DEM resolution, probably too coarse to describe in detail the complex topography of the town and local infrastructure.
Finally, two issues will be faced in the future.First, the approach proposed here will be configured as an operational system, in order to address its effective feasibility for flash flood events in situations different from those analyzed here and, mostly, to evaluate the overall time needed for the full workflow, from data acquisition to decision-support output.Furthermore, though the computational times seem to be feasible from a nowcasting perspective, it will be interesting to analyze the performance of the system using the latest GPU technology, especially for domains larger than those considered in this work.

Figure 1 .
Figure 1.General framework of the research.

Figure 2 .
Figure 2. General view of the event location (a), observed flooded areas (processing of a WordlView-4 satellite high resolution (31 cm) image on 21 November 2017 and ground observations by BEYOND group of the National Observatory of Athens, Greece, available online at http://beyond-eocenter.eu/index.php/web-services/floodhub) and locations of the 44 water/mud marks derived by Bellos et al. (2022c) (b), example of: mud footprint (c), effects on buildings and business (d, e), urban drainage failure (f, g).

Figure 3 .
Figure 3. (a) Accumulated rainfall height during the event according to precipitation estimates from Meteosat Second Generation (MSG) infrared satellite data, (b) total accumulated rainfall height recorded by XPOL radar (range circles from the radar are shown at 10 km intervals, azimuth angles are 30° interval and elevation at 250 m interval), (c) normalized temporal variogram of average rainfall, and (d) mean normalized spatial variogram of rainfall field in the catchments during the period of more intense rainfall 2:30-6:00 GMT.

Figure 4 .
Figure 4. Set-up of the computational grid: resolutions used within the domain (a) and details for rural (b) and urban areas (c).Buildings (white areas in panel (c)) are modeled using the "building hole" approach.

Figure 5 .
Figure 5. Global sensitivity analysis results for the 44 maximum water depths (a), flood peaks (b), and 4D plotting of the objective function (root mean square error [RMSE]) against the Manning coefficient of the low friction zones, the suction head and the conductivity (c).

Figure 6 .
Figure 6.Objective function (up) and parametric values (bottom) against the number of runs.

Figure 7 .
Figure 7. Results of the integrated two-dimensional shallow water equations model at the watershed scale after calibration (10 min and 200 m resolution radar rainfall data): discharge hydrographs computed for the Soures and Agia Aikaterini catchments (a), maximum water depths map throughout the watersheds (b), and comparison between observed and computed flood extent in the town (c).

Figure 8 .
Figure 8.Comparison between the simulated water depths temporal evolution and observed high water marks at specific locations after model calibration, using 200 m and 10 min as spatial and temporal rainfall resolution: predictions along the main roads (a) and narrow streets (b), grouping of points (c), and boxplot of the errors for each group.

Figure 9 .
Figure 9. Street-level prediction of the temporal evolution of elements at risk, using 10 min and 200 m resolution radar rainfall data as input for the model.

Figure 10 .
Figure 10.Influence of rainfall resolution on the discharge hydrographs: effects of spatial resolution (a, b), temporal resolution (c, d) and comparison between the highest and the lowest resolution (e, f).

Figure 11 .
Figure 11.Heat map of the variation index D HR (percentage variation in terms of areas involved by a specific hazard vulnerability class in respect to the prediction associated with the highest spatial and temporal resolution) at specific instants of times (6:00 and 6:30 GMT) and considering the maximum values.

Figure 12 .
Figure 12.Street-level hazard mapping as predicted by the 2D integrated model using the highest (a) and the lowest rainfall resolution (b) at 6:30 GMT.

Figure 13 .
Figure 13.Comparison between the objective functions (up) and the parametric values for coarse and dense resolution (down).

Figure 14 .
Figure 14.Influence of rainfall resolution and model calibration on the water depth hydrograph computed at a point located on the main road.