The Role of Global Data Sets for Riverine Flood Risk Management at National Scales

Over the last two decades, several data sets have been developed to assess flood risk at the global scale. In recent years, some of these data sets have become detailed enough to be informative at national scales. The use of these data sets nationally could have enormous benefits in areas lacking existing flood risk information and allow better flood management decisions and disaster response. In this study, we evaluate the usefulness of global data for assessing flood risk in five countries: Colombia, England, Ethiopia, India, and Malaysia. National flood risk assessments are carried out for each of the five countries using six data sets of global flood hazard, seven data sets of global population, and three different methods for calculating vulnerability. We also conduct interviews with key water experts in each country to explore what capacity there is to use these global data sets nationally. We find that the data sets differ substantially at the national level, and this is reflected in the national flood risk estimates. While some global data sets could be of significant value for national flood risk management, others are either not detailed enough, or too outdated to be relevant at this scale. For the relevant global data sets to be used most effectively for national flood risk management, a country needs a functioning, institutional framework with capability to support their use and implementation.

There are two approaches to representing flood hazard globally: either through remote sensing (RS) of historical flood events or through global flood models (GFMs). The two are often considered complementary, as RS data are used to validate the global models (Bernhofen et al., 2018; Mester et al., 2021). GFMs use global data sets, automated methods, and simplified hydraulic equations to simulate flood hazard globally (Trigg et al., 2020). These models, which began as research experiments, are now being used for disaster response (Emerton et al., 2020), to inform policy decisions, and to assess business risks (Ward, Winsemius, et al., 2020). Recently, their modeling frameworks have also incorporated detailed national-level data to assess national flood risk (Wing et al., 2017, 2018). Several GFMs have been developed in academia (Yamazaki et al., 2011; Ward et al., 2013), by research institutions (Rudari et al., 2015), and by commercial companies (Sampson et al., 2015). Their differing approaches to global flood hazard mapping result in flood extent disagreement (Aerts et al., 2020) and varied performance (Bernhofen et al., 2018), suggesting no single model is uniformly fit for purpose.
Similarly, global population maps, necessary for calculating exposure, adopt equally divergent approaches to mapping human population. These range in complexity from simply distributing census data across administrative boundaries to statistically estimating population distribution and density from auxiliary data sets that relate to human presence (Leyk et al., 2019). Recent studies by Smith et al. (2019) and Bernhofen et al. (2021) found that flood exposure estimates are significantly impacted by the population data set used. As these data sets become increasingly locally relevant, there is an urgent need to investigate their fitness-for-use in flood risk assessments at these scales.
A key component of flood risk assessments, frequently absent from global studies, is vulnerability (Ward, Blauhut, et al., 2020). Vulnerability is multifaceted; it can be assessed through societal, economic, environmental, or physical means (Birkmann et al., 2006). The most visible, and most commonly assessed, aspect of vulnerability is direct damages (Meyer et al., 2013). Direct damages are typically calculated using some form of vulnerability curve, which translates a component of the modeled flood hazard (often depth) into a degree of damage. Depth-damage curves are derived from data collected from historical flood events, from expert judgment, or from a combination of the two (van Westen, 2014). As a result, vulnerability functions are globally disparate. In countries with a wealth of historical data, such as the UK, the Netherlands (Kok, 2004), and the US (Davis & Skaggs, 1992), there are detailed vulnerability functions, whereas in other countries there are none at all. These data gaps were addressed by Huizinga et al. (2017), who developed a global database of depth-damage functions for multiple land use classes. Significant uncertainties remain, however, both in the data sets used to identify assets at risk in the vulnerability calculations and in the assumptions made about the land use classes.
A cross-disciplinary approach needs to be taken when evaluating global data sets for FRM at national scales (Morrison et al., 2018). The physical science (global data) needs to be understood in the context of the social science. There needs to be a focus on the capabilities of regional and local governance in interpreting and using these data to inform and address flood risk. Governing bodies require data sets to be accessible, unambiguous, and easy to use; however, variability between data sets poses risks for effective policy and decision making. For example, the different conceptualizations of vulnerability may not translate to actual administrative and political structures. Also integral is the capacity of organizations and other governance structures to use the data. Human, technical, and financial resources of services are often lacking. Failures in coordination and communication between related departments and other relevant stakeholders across scales may result in the incorrect use of data. For example, the dissemination of data to the local scale can be complicated and challenged by local priorities, alternative perceptions, elite capture, and language. Data also have the potential to be manipulated and/or abused in power struggles or for political motives (Venot et al., 2021; Wissman-Weber & Levy, 2018).
The UKRI GCRF Water Security and Sustainable Development Hub project (https://www.watersecurityhub.org/) connects water experts in five different countries, spanning four continents. This project provides a unique opportunity to test the global data sets for use at these scales in countries with vastly different histories of flooding and flood management structures and allows us to explore the commonality and variability of global data used locally. In this paper, we use global data sets and methods from previously published studies of global flood risk to carry out flood risk assessments in five countries: Colombia, England, Ethiopia, India, and Malaysia. We calculate national flood risk using a 20-year catalog of historical flooding, five riverine GFMs, seven global population data sets, and three approaches to calculating vulnerability. We then assess the credibility of this data for use at the national scale considering the variability of the flood risk estimates and exploring the implications this has on their usefulness. We also examine the capacity to use this data for FRM in each country.

National Flood Risk Management Approaches
There are two distinct approaches to FRM, as laid out by Morrison et al. (2018): the resistance approach and the adaptive approach. The resistance, or standard, approach to FRM consists of mitigating flood risks through infrastructure or laws and regulations. The adaptive approach focusses less on preventing flooding and places greater emphasis on increasing resilience in high-risk areas (Schelfaut et al., 2011). The approaches are complementary, and successful examples of FRM often consist of a marriage of the two (van Wesenbeeck et al., 2014).
The implementation of FRM strategies typically falls on the government. The level of government responsible for executing FRM strategies is dependent on the country and the strategy being implemented (Merz et al., 2010). Governance strategies to FRM vary, as countries prioritize certain approaches over others (Driessen et al., 2018). Governance strategies can be hierarchical, consisting of a traditional "top down" decision-making structure (Alexander, Priest, & Mees, 2016); they can be decentralized, where policy decisions are made at the local level with a greater emphasis placed on stakeholder engagement (Driessen et al., 2012); they can be polycentric, where policy power is shared between different levels of government and nongovernment stakeholders (Garvey & Paavola, 2021; Loeschner et al., 2019); or they can take the form of panarchy, which is an adaptive approach to governance that consists of a nested set of adaptive cycles (Alexander, Priest, & Mees, 2016; Gunderson & Holling, 2002), where certain conditions can trigger "bottom up" changes in the system (Garmestani & Benson, 2013).
To evaluate global flood risk data for use at the national scale, it is important to understand a country's approach to national FRM. Where, and how, the data will be used will depend on the national FRM strategy and who is responsible for implementing it. Taking a multicountry approach, as we are doing, enables us to pick apart the differences and commonalities in national strategies and how these influence the applicability of global flood risk data in a national flood risk context.
10.1029/2021WR031555

Study Countries
We evaluate the global data for use at the national level in five countries: Colombia, England, Ethiopia, India, and Malaysia. These five countries bring together local communities and 46 different stakeholder partners that work together to address water security issues in the Global Challenges Research Fund (GCRF) funded Water Security and Sustainable Development Hub (https://www.watersecurityhub.org/). Below, we briefly summarize flood risk in each country and how it is managed.

Colombia
Colombia is particularly susceptible to extreme weather events such as hurricanes, storms, and flooding because of its hydroclimatology, which arises from its location in the Intertropical Convergence Zone (ITCZ). The ITCZ is where warm, humid winds from northern and southern latitudes converge, creating a belt of clouds. This generates a constant supply of wind and humidity that, when interacting with topography, defines the rainy and dry seasons. The hydroclimatology is further influenced by the El Niño-Southern Oscillation (ENSO). The cold phase of ENSO, otherwise known as La Niña, increases rainfall, which leads to increased river flow and flooding. For example, in 2011, 4 million people were affected by a strong La Niña event, which caused losses of $7.8 billion through damage to economic infrastructure, flooding of agricultural land, and the issuing of government subsidies (Hoyos et al., 2013). Climate change is also projected to increase rainfall by 2.5% by 2050, which will further increase incidences of flooding (Ramirez-Villegas et al., 2012). Colombia manages flood risk alongside other risks posed by volcanoes, landslides, and earthquakes under its National Disaster Risk Management System (UNGRD in Spanish). Policy, legislation, and regulations under this system are decentralized over the global, national, regional, and local levels to directly include public entities, nonprofit entities, and communities within the policy's remit and subsequent activities. Colombia takes an ex post approach to FRM, reacting to flood events after they occur.

England
Flooding has been recognized by the UK government as one of the most serious threats facing the country. The National Flood Risk Assessment (NaFRA) estimates that one in six commercial and residential properties are at risk from pluvial, riverine, and coastal flooding. These risks are exacerbated by factors such as population growth, deteriorating drainage infrastructure, land use change, and natural erosive processes, and will worsen with climate change (Alexander, Priest, Micou, et al., 2016). Extreme flood events have become more frequent in recent years. For example, Kendon et al. (2019) report that in 2019 England and Wales had their fifth wettest autumn since 1766, resulting in severe flooding in Yorkshire, Nottinghamshire, Derbyshire, and Lincolnshire, the most severe flood event to occur in the UK since 2015. Governed by the Department for Environment, Food and Rural Affairs (DEFRA), current flood risk policy centers on resilience to manage flood and climate change risk and to protect economic growth and infrastructure. It recognizes the importance of public participation over a decentralized structure to nurture long-term and flexible approaches that enable life to continue alongside water rather than keeping water out (Forrest et al., 2017). This entails community groups working alongside other flood-related agencies to come up with long-term solutions.

Ethiopia
Ethiopia is exposed to a wide range of disasters associated with the country's diverse geoclimatic and socioeconomic conditions, but floods and droughts represent major challenges to communities and livelihoods. Flooding has become one of the most frequent and severe natural disasters in Ethiopia, affecting lowland, highland, and urban areas, displacing thousands, and causing loss of property and livelihoods. Increased rainfall variability and extreme events have increased the likelihood of flooding, while risk is exacerbated by rapid population growth and urbanization, particularly in the capital, Addis Ababa (Beshir & Song, 2021; Haile, Habib, & Rientjes, 2013). Environmental degradation, poverty, and conflict further aggravate the risks and reduce the coping capacity and resilience of communities. For example, Haile, Kusters, and Wagesho (2013) illustrate how resettlement programs by the Ethiopian Government between 1983 and 1996 in the lowland region of Gambela, and the consequent land use change, resulted in increased flood events that affected up to a third of the population in some woredas. FRM in Ethiopia is governed by the National Disaster Risk Management Commission (NDRMC), established in 2015 to coordinate an integrated, all-hazards approach and to streamline disaster risk management over multiple administrative scales, including an early warning and response system across all government sectors. The Government of Ethiopia (GOE) has a long institutional history of addressing disaster risk management (DRM), starting with the establishment of the Relief and Rehabilitation Commission (RRC) following the 1974 famines. Since then, the country has taken several steps to shift to a more proactive approach to DRM. This includes updating the National Policy and Strategy on DRM (2013) and developing a DRM Strategic Program and Investment Framework (SPIF) for government and donor interventions in 2014 (DRMFSS, 2014).

India
Flood risk in India differs across the country due to varied geomorphological settings and atmospheric circulations. The Indian Summer Monsoon, through several transient atmospheric conditions, brings rain to different parts of the country in distinct monsoonal phases: onset and advance (mid-May to mid-July), peak rainfall (July to August), and withdrawal (mid-September to mid-October). Rainfall and extreme flood events increased in intensity between 1951 and 2015 (Ray et al., 2019; Vinnarasi & Dhanya, 2016). Flood risk is particularly severe for urban settlements in India due to the huge populations who reside in mega-cities (population of over 1 million). The number of mega-cities has risen exponentially to 52 cities over the last two decades due to migration from rural areas (De et al., 2013). Flooding and water-logging have become common occurrences due to the reduction of green spaces and aging storm drains that struggle to cope, especially during the monsoon seasons, leading to loss of income and increased disease risk (Ali et al., 2021). In the capital, Delhi, 24,840 ha of the city is built on floodplains, 68% of which are the low-lying Yamuna floodplains. The apex organization for flood management schemes in India is the Central Water Commission (CWC). However, FRM in India is always state led, with the federal government only assisting when relief measures (e.g., through the National Disaster Response Force (NDRF), State Disaster Response Fund (SDRF), etc.) are required. Many States, especially those that are flood-prone, have established Flood Control Boards, organized by the respective Irrigation Departments, to assess the flood problems and evaluate the flood schemes. For example, the Irrigation and Flood Control Department leads FRM in Delhi. The city is demarcated into six drainage zones, and 12 municipal zones manage the storm run-off between them for the whole city.
This approach reflects the structural approach of policy to flood risk, which focuses mainly on infrastructural measures to control flooding. Different structural/administrative measures have been adopted by these organizations to reduce the flood losses and protect the floodplains across India. In addition to several laws enacted by the Central Government (e.g., Inter-state River Disputes Act, 1956; The River Boards Act, 1956; Damodar Valley Corporation Act, 1948; Betwa River Board Act, 1976; Brahmaputra Board Act, 1980; The Land Acquisition Act, 1894), a few States have also enacted laws to deal with disputes related to flood control works (CWC, 2018).

Malaysia
Malaysia is severely affected by flooding. Eighty-five of Malaysia's 189 river basins are prone to recurrent flooding, all of which flow into the South China Sea (Saifulsyahira et al., 2016). Rainfall intensity in Malaysia is high all year round, with most of the flooding occurring between November and February during the Northeast Monsoon. For example, in January 2021, six people died and 50,000 were displaced during the monsoon on the east coast (Al Jazeera, 2021). Flash floods have also become more common with increased urbanization, infrastructure development alongside rivers, and the poor maintenance of drains and waterways (Mabahwi et al., 2020; Yusoff et al., 2018). FRM in Malaysia is driven by the federal government and characterized by a mostly technocratic approach. The Department of Irrigation and Drainage Malaysia (DID) is the main entity involved with flood management, which includes the management of hydrological data, planning and development of flood defenses, planning and development of flood mitigation, management of national river resources, and coordination of other relevant agencies over federal, state, and district administrative levels (Mabahwi et al., 2020).

The Role of Global Data
The use of global data sets to assess national flood risk is dependent on the extent to which countries have the capacity, institutions, and governance structures to use and interpret the information. Many countries, especially those in the Global South, frequently lack the resources, expertise, and strong institutional frameworks needed to access, collect, interpret, and analyze available data sets to implement effective FRM. For example, in Colombia, the gap between policy, political will, and the capacity to act on flooding influences FRM; the country's ex post approach to FRM may also limit the usefulness of these data sets (Key Informant Interview, O. Patricia Quintero Garcia, 2021). In Malaysia, efforts to manage flood disasters are hampered by a lack of legislative guidance on the management of flooding within the National Disaster Management Agency (NADMA), the federal agency in charge of disaster risk management, despite NADMA's close association with flood management agencies. Obstructive bureaucracy over administrative scales and between agencies, together with limited authority in decision making, also restrains the ability to manage flood risk in Malaysia (Mabahwi et al., 2020). Effective use of global flood risk data by these countries also entails corroboration with locally collected flood risk data; however, local data may be limited, unavailable, or incompatible with global data sets due to poor data management, a lack of resources and systems, and restricted access to data.

Global Data
The number of global data sets for calculating climate risks is large and growing (Lindersson et al., 2020). In this study, we use global data sets that have been used in previously published studies of global flood risk. The data sets we use are free and can be easily obtained by the end-user, either by directly downloading them online or by contacting the developer of the data sets. In total, we use six global data sets of flood hazard, seven global data sets of human population, and three global approaches to calculating vulnerability. These data sets are detailed in the sections below and in Tables 1 and 2.

Global Flood Hazard Data
We use both models and satellite-observed flood events to represent hazard. We use five GFMs that have been used in previous studies of global flood risk. The models are CaMa-UT (Yamazaki et al., 2011; Zhou et al., 2020), CIMA-UNEP (Rudari et al., 2015), Fathom (Sampson et al., 2015), GLOFRIS (Ward, Winsemius, et al., 2020; Winsemius et al., 2013), and JRC. These models represent the state of the art in publicly available global riverine flood hazard maps. Some of the models have incorporated other flooding mechanisms into their modeling frameworks, such as coastal (Ward, Winsemius, et al., 2020) and pluvial flooding (Sampson et al., 2015); however, because these mechanisms are not present in all the models, we use only riverine flood maps. The flood maps are spatially continuous, meaning that the return period simulated is assumed constant across the modeled domain. The five models can be categorized into two distinct structures: the cascade model structure and the gauged flow data model structure. Cascade models use global climate precipitation data to force land surface models, which predict extreme flows across the river network. Gauged flow data models use global gauge data and regional flood frequency analysis to estimate extreme flows in ungauged basins globally. Previous intercomparison studies found large differences between these models in Africa and in China (Aerts et al., 2020). Validation of the same models against observed flooding in Nigeria and Mozambique found that the best models performed favorably compared with historical flood events (Bernhofen et al., 2018). Some, but not all, of the GFMs have incorporated flood defenses into their modeling frameworks. To maintain consistency between the GFMs, we use only the riverine undefended flood hazard maps. The 100-year return period, or 1% annual probability flood, is used for our calculations.
In England, we use Fathom-UK flood extents, which use the same modeling framework as the Fathom global model but draw on more detailed national data. It is worth mentioning that JRC has also produced more detailed flood maps for Europe using their global modeling approach (Dottori et al., 2021), but these maps are not used in this study. In addition to globally modeled flood extents, we use satellite-derived flood extents from the Global Flood Database (GFD), a 20-year catalog of observed flood events (Tellman et al., 2021). The GFD categorizes flood events as being caused by heavy rain, storm surge, snow melt, or dam breaks. We only consider flood events caused by heavy rain and snow melt to maintain consistency with the GFM outputs; however, we are not able to further distinguish between observed riverine and pluvial flooding. The global flood hazard data sets are outlined in Table 1. Detailed descriptions of the data sets and how to access them, as well as the previous flood risk studies they have been used in, are included in Supporting Information S1.

Global Population Data
To identify who is exposed to flooding it is essential to understand where people live. Gridded population data sets, which distribute census information over spatial data, are the tools commonly used to calculate flood exposure at the global scale. The methods applied to distribute census data differ in complexity. These methods, their development, and their wide-ranging applications are reviewed in detail by Leyk et al. (2019). To summarize the different methods briefly, census data can be distributed across a grid by areal weighting or by dasymetric weighting. The areal weighting approach distributes census data evenly across an area. The dasymetric weighting approach uses ancillary data sets to weight the distribution of census data. This can vary in complexity from binary weighting (settlement or no settlement) to statistical weighting approaches based on multiple ancillary data sets. Another way to distinguish the population data sets is whether they are constrained or unconstrained.
The constrained approach masks out all nonsettled areas as uninhabited, while the unconstrained approach assumes that not all settlements can be accurately mapped globally and residual census data are distributed across nonsettled area to account for any unmapped settlements (Thomson et al., 2021).
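The weighting schemes described above can be illustrated with a minimal sketch. It assumes a single hypothetical administrative unit of five grid cells and uses Python with NumPy purely for illustration; the function names, the settlement mask, and the residual weight of 0.05 used to mimic an unconstrained data set are all assumptions, not values taken from any of the population products.

```python
import numpy as np

def areal_weighting(census_total, n_cells):
    """Spread a census total evenly over every cell of an administrative unit."""
    return np.full(n_cells, census_total / n_cells)

def dasymetric_weighting(census_total, weights):
    """Distribute a census total in proportion to ancillary weights."""
    weights = np.asarray(weights, dtype=float)
    total_weight = weights.sum()
    if total_weight == 0:
        # No ancillary signal at all: fall back to an even spread.
        return np.full(weights.size, census_total / weights.size)
    return census_total * weights / total_weight

# Hypothetical unit: 1,000 people, 5 cells, 2 cells mapped as settlement.
settlement = np.array([0, 1, 1, 0, 0])

even = areal_weighting(1000, settlement.size)

# Constrained: binary weights, non-settled cells receive no population.
constrained = dasymetric_weighting(1000, settlement)

# Unconstrained (illustrative): a small residual weight on non-settled
# cells stands in for settlements that could not be mapped.
residual_weights = np.where(settlement == 1, 1.0, 0.05)
unconstrained = dasymetric_weighting(1000, residual_weights)
```

In this toy case all three grids sum to the same census total; they differ only in where the people are placed, which is what drives differences in flood exposure.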
The use, and limitations, of gridded population data in flood exposure studies specifically are addressed in the studies of Smith et al. (2019) and Bernhofen et al. (2021). The two studies collectively consider four different global population data sets; however, many more have been used in previous studies of global flood risk. The seven global population data sets used in this study (Table 2) are GHS-POP (Freire et al., 2016), GPW4, GRUMP (Balk et al., 2005), HRSL (Tiecke et al., 2017), HYDE (Klein Goldewijk et al., 2010), LandScan, and WorldPop (Stevens et al., 2015). These data sets have all been used in previous studies of global flood risk. In our analysis, we use the most up-to-date epoch for each population data set, which is then scaled to 2020 national population totals for exposure comparison. Detailed descriptions of the population data sets and how to access them, as well as the previous global flood risk studies they have been used in, are included in Supporting Information S1.

Global Vulnerability Approaches
Vulnerability is the susceptibility of a community or system to experience losses from a hazardous event (UNISDR, 2004). It is a complex, multifaceted concept that can be experienced directly or indirectly across human, physical, economic, and environmental spheres (Van Westen, 2014). Vulnerability has received less attention at the global scale than hazard and exposure (Ward, Blauhut, et al., 2020). Below, we identify and summarize three intercomparable methods for calculating vulnerability that have been used in previous studies of global flood risk. The three methods calculate direct economic damages using land cover maps to identify assets at risk and depth-damage curves to determine the degree of damage experienced by the asset. We name the three vulnerability approaches based on the global land cover map used to represent assets at risk.

GHSL
The approach to calculating vulnerability in the Aqueduct Floods project (Ward, Winsemius, et al., 2020) is based on the global depth-damage function database developed by Huizinga et al. (2017). Only urban damages are considered. The urban area is split into three classes: residential, commercial, and industrial. Because current global land cover data sets do not differentiate between urban classes, assumptions are made about the fractional split of urban classes globally. Based on the spatial distribution of urban classes in Europe derived from the Corine Land Cover data set and the findings of a report by the Buildings Performance Institute Europe (Economidou et al., 2011) the global fractional split of urban areas used in Ward, Winsemius, et al. (2020) is 75% residential, 15% commercial, and 10% industrial. Urban areas are defined as cells in the 1 km resolution Global Human Settlement Layer (GHSL) data set (Corbane et al., 2019) that correspond to a percentage of built-up area of 50% or greater (Ward, Winsemius, et al., 2020).

GlobCover
The same global depth-damage function database (Huizinga et al., 2017) was used alongside the 10 arc sec resolution (∼300 m at the equator) GlobCover (v2.3) land cover map (Bontemps et al., 2011) to calculate vulnerability in a number of other studies of global flood risk (Alfieri et al., 2017; Dottori et al., 2018). In these studies, five land use classes were considered in the vulnerability assessment: four urban classes (residential, commercial, industrial, and infrastructure) and agriculture. This is the only approach of the three that considers any nonurban (agricultural) damages. While the GlobCover data set explicitly represents agricultural area, it makes no distinction between urban land use classes, which are represented as "artificial areas." These "artificial areas" are split into the four urban land use classes using globally consistent ratios, derived from studies of land use occupation in cities across different continents. The urban land use ratios used are 56% residential, 20% commercial, 16% industrial, and 8% infrastructure (L. Alfieri, personal communication, December 1, 2020).
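To make the role of the fractional urban splits concrete, the sketch below blends per-class depth-damage curves into a single damage fraction for an urban cell, once with the 75/15/10 split of the GHSL approach and once with the 56/20/16/8 split of the GlobCover approach. The splits come from the text; the depth-damage curve values are hypothetical placeholders and are not the Huizinga et al. (2017) functions, and the function name is an assumption. Python with NumPy is used only for illustration.

```python
import numpy as np

# Fractional urban splits quoted in the text.
GHSL_SPLIT = {"residential": 0.75, "commercial": 0.15, "industrial": 0.10}
GLOBCOVER_SPLIT = {
    "residential": 0.56,
    "commercial": 0.20,
    "industrial": 0.16,
    "infrastructure": 0.08,
}

# HYPOTHETICAL depth-damage curves (flood depth in m -> damage fraction).
# Placeholder numbers for illustration only, not published values.
DEPTHS = np.array([0.0, 1.0, 2.0, 4.0, 6.0])
CURVES = {
    "residential": np.array([0.0, 0.3, 0.5, 0.8, 1.0]),
    "commercial": np.array([0.0, 0.2, 0.4, 0.7, 1.0]),
    "industrial": np.array([0.0, 0.2, 0.3, 0.6, 1.0]),
    "infrastructure": np.array([0.0, 0.1, 0.2, 0.4, 1.0]),
}

def blended_damage_fraction(depth_m, split):
    """Share-weighted damage fraction of an urban cell at a given depth."""
    return sum(
        share * np.interp(depth_m, DEPTHS, CURVES[land_class])
        for land_class, share in split.items()
    )

# GHSL approach: a 1 km cell counts as urban at 50% built-up area or more.
built_up_percent = np.array([10.0, 50.0, 85.0])
is_urban = built_up_percent >= 50.0

ghsl_at_1m = blended_damage_fraction(1.0, GHSL_SPLIT)
globcover_at_1m = blended_damage_fraction(1.0, GLOBCOVER_SPLIT)
```

Even with identical curves, the two splits give different blended damage fractions at the same depth, which is one route by which the choice of land cover map feeds into the damage estimate.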

HYDE
In Ward et al. (2013), a single depth-damage function, derived by averaging the high and low urban density land class functions in the Damagescanner tool (Klijn et al., 2007), is used to calculate vulnerability globally.

GFM and Population Agreement Calculations
The data sets are aggregated following the approach of Aerts et al. (2020). GFM output is aggregated by first resampling the five GFMs to the finest GFM resolution (1 arc sec in England and 3 arc sec in the remaining countries) using the nearest-neighbor approach, which ensures depths of the resampled flood map are the same as the native resolution flood map. The GFM flood depth maps are converted to binary wet/dry rasters for any nonzero flood depth and then summed to produce the aggregated GFM map. Permanent water bodies are masked out using the G3WBM permanent water body mask (Yamazaki et al., 2015). Values in the aggregated GFM map range from 5 (highest agreement) to 1 (lowest agreement). Similarly, to produce the aggregated population map, the seven global population data sets are resampled to the finest population resolution (1 arc sec). The population maps are then converted to binary populated area maps where any cell with a nonzero population is defined as a populated cell. It should be noted that this approach only represents the agreement between the population data in terms of populated area and does not account for variations in population density. Values in the aggregated population map range from 7 (highest agreement) to 1 (lowest agreement).
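The aggregation steps described above can be sketched as follows, assuming the depth maps have already been resampled to a common grid. The function name and the toy 2 x 2 arrays are hypothetical, and Python with NumPy is an illustrative choice rather than the toolchain used in the study.

```python
import numpy as np

def aggregate_agreement(depth_maps, water_mask=None):
    """Count, per cell, how many maps are wet (nonzero depth).

    depth_maps: sequence of equally shaped 2D arrays of flood depth.
    water_mask: optional boolean array, True where permanent water
                should be excluded from the aggregated map.
    """
    stack = np.stack([np.asarray(d, dtype=float) for d in depth_maps])
    wet = stack > 0            # binary wet/dry; NaN cells count as dry
    agreement = wet.sum(axis=0)
    if water_mask is not None:
        agreement = np.where(np.asarray(water_mask, dtype=bool), 0, agreement)
    return agreement

# Three toy depth maps (m) standing in for resampled GFM outputs.
model_a = [[0.0, 0.5], [1.2, 0.0]]
model_b = [[0.0, 0.1], [0.0, 0.0]]
model_c = [[0.3, 2.0], [0.4, 0.0]]

agreement = aggregate_agreement([model_a, model_b, model_c])

# Masking the lower-left cell as permanent water removes it from the map.
permanent_water = np.array([[False, False], [True, False]])
masked = aggregate_agreement([model_a, model_b, model_c], permanent_water)
```

The same function applies to the population maps, since a populated cell is defined in the same way as a wet cell: any nonzero value.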
Agreement between the data sets is calculated using the Model Agreement Index (MAI) first introduced by see Trigg et al., 2018) for correct formulation of the MAI) and three variations of this index. The MAI is calculated using the aggregated GFM map. For each model agreement level, the total flooded area is multiplied by the fractional level of agreement. These values are summed for all agreement levels and then divided by the total flooded area to give a fraction of model agreement, which ranges from 0 (no agreement) to 1 (total agreement).
where is the total flooded area in the aggregated GFM map, is the agreement level, is the total number of models, and is the flooded area at the agreement level . An illustrative example of the calculation of the MAI is included in Supporting Information S1. The Population Agreement Index (PAI) is calculated in the same way that the MAI is calculated. The only difference is that the aggregated population map rather than the aggregated GFM map is used in the calculations.
where is the total populated area in the aggregated population map and is the total populated area at agreement level . Values for the PAI range from 0 (no populated area agreement) to 1 (total populated area agreement). The Exposure Agreement Index (EAI) is another variation of the MAI. Similar to the exposure weighted metrics used in Pappenberger et al. (2007) and Wing et al. (2019), the EAI uses exposed population, rather than flooded area, to calculate agreement. EAI is calculated for each of the seven population data sets.
where E is the total population exposed to the entire aggregated GFM map and e_i is the population exposed at agreement level i. The EAI ranges from 0 (no model exposure agreement) to 1 (total model exposure agreement) and is an indicator of the level of agreement between the models when used for exposure calculations. The final agreement index is the Volume Agreement Index (VAI). While the MAI calculates agreement between the models in two dimensions, the VAI calculates model agreement in three dimensions by incorporating flood depth. The VAI needs to be calculated using the aggregated GFM map alongside all the GFM flood depth maps.
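Following the same pattern as the area-based MAI, a volume-based index could take the form below. This is only a sketch: the symbols, the lower summation limit, and the exact way volumes are assigned to agreement levels are assumptions.

```latex
% Assumed notation: V   = maximum volume possible for the aggregated flood extent,
%                   i   = agreement level, n = total number of models,
%                   v_i = volume on which i models agree.
\mathrm{VAI} = \frac{1}{V} \sum_{i=2}^{n} \frac{i}{n}\, v_i
```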
where V is the maximum volume possible for the aggregated flood extent and v_i is the volume of models in agreement at agreement level i (in three dimensions). The VAI ranges from 0 (no agreement) to 1 (total agreement).
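The area-based and exposure-based indices share one calculation that differs only in the per-cell weight. The sketch below assumes that cells flagged by a single source contribute nothing to the numerator (the hypothetical `min_level` parameter), so that the index spans 0 to 1. The function name and the toy grids are illustrative, and the VAI is not covered because its depth handling is not fully specified here.

```python
from collections import Counter


def agreement_index(aggregated, n_sources, weights=None, min_level=2):
    """Weighted agreement fraction for an aggregated agreement map.

    aggregated: 2-D grid of agreement levels (0 = flagged by no source).
    n_sources: number of models or data sets that were aggregated.
    weights: optional 2-D grid of per-cell weights, for example cell area
        or population; every flagged cell weighs 1 if omitted.
    min_level: lowest agreement level that counts toward the numerator
        (assumed to be 2 in this sketch).
    """
    totals = Counter()
    for r, row in enumerate(aggregated):
        for c, level in enumerate(row):
            if level > 0:
                totals[level] += 1.0 if weights is None else weights[r][c]
    total = sum(totals.values())
    if total == 0:
        return 0.0
    agreed = sum(
        level / n_sources * weight
        for level, weight in totals.items()
        if level >= min_level
    )
    return agreed / total


aggregated = [[0, 1, 3], [1, 2, 3]]
population = [[0, 10, 50], [5, 20, 15]]

mai_like = agreement_index(aggregated, n_sources=3)
eai_like = agreement_index(aggregated, n_sources=3, weights=population)
print(round(mai_like, 3), round(eai_like, 3))
```

Passing an aggregated population map and the number of population data sets gives a PAI-style score from the same function.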

Flood Exposure Calculations
Flood exposure is calculated for each country using observational flood data, five GFMs, and seven population data sets outlined in Section 3. Observational flood data for the last 20 years is collated from the GFD and merged into one 20-year flood map. We remove any observed flood events caused by storm surges or by dams. In total, 237 events are merged across the five countries. There are two resolutions at which exposure calculations are carried out: 1 arc sec and 3 arc sec. Exposure calculations for the HRSL population map are carried out at 1 arc sec resolution (the native resolution of HRSL). Similarly, in England exposure calculations are all carried out at 1 arc sec resolution (the native resolution of the Fathom-UK flood map). The remaining exposure calculations are carried out at 3 arc sec resolution. The six flood hazard data sets are resampled to 3 arc sec resolution (if not already native at 3 arc sec) and 1 arc sec resolution using the nearest-neighbor approach. Global population data sets coarser in resolution than 3 arc sec (GHS-POP, GPW4, GRUMP, HYDE, and LandScan) are resampled and the population is evenly distributed to a 3 arc sec resolution grid (or 1 arc sec in England). Flood exposure is calculated by intersecting a flood map with a global population data set. Permanent water bodies are masked out using the G3WBM water body map (Yamazaki et al., 2015). To account for any differences in total national populations between the seven global population data sets (and because not all population data is in the same epoch), each data set's total national population is scaled to match the WorldPop 2020 national population totals.
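The exposure steps described above (even redistribution of coarse population, masking of permanent water, scaling to a reference national total, and intersection with a flood map) can be sketched as follows. The function names and toy values are hypothetical, and the order in which scaling and masking are applied is an assumption.

```python
def spread_evenly(coarse, factor):
    """Split each coarse cell's population evenly over factor x factor fine cells."""
    fine = []
    for row in coarse:
        fine_row = [value / factor**2 for value in row for _ in range(factor)]
        fine.extend([list(fine_row) for _ in range(factor)])
    return fine


def national_exposure(flooded, population, water, target_total):
    """Sum the rescaled population in flooded, non-water cells.

    flooded: 2-D grid of booleans (True where the flood map is wet).
    population: 2-D grid of people per cell on the same grid.
    water: 2-D grid of booleans (True for permanent water cells).
    target_total: reference national population the grid is scaled to.
    """
    grid_total = sum(sum(row) for row in population)
    scale = target_total / grid_total
    exposed = 0.0
    for r, row in enumerate(population):
        for c, people in enumerate(row):
            if flooded[r][c] and not water[r][c]:
                exposed += people * scale
    return exposed


# One coarse row of two cells, refined by a factor of 2 in each direction.
fine_population = spread_evenly([[8.0, 4.0]], 2)
flooded = [[True, False, False, True], [False, False, True, True]]
water = [[False, False, False, False], [False, False, False, True]]

exposed = national_exposure(flooded, fine_population, water, target_total=24.0)
print(exposed)
```

Even redistribution preserves the grid total, so the scaling factor is the same whether it is computed before or after refinement.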

Flood Damage Calculations
Flood damages are calculated in each country using the five GFMs and three vulnerability methods outlined in Section 3. Observational data are not used for the vulnerability calculations as the maps contain no information about flood depth. Because the depth-damage curves are in units of meters, depths for the CIMA-UNEP GFM are first converted from centimeters to meters. Each vulnerability method uses a different landcover map (GlobCover, GHSL, and HYDE). These maps are resampled to the analysis resolution (1 arc sec in England, 3 arc sec in the rest) using the nearest-neighbor approach. Permanent water bodies are masked out in the GFMs with the G3WBM water body map (Yamazaki et al., 2015).
For the GlobCover vulnerability method, the approach follows that of Alfieri et al. (2017) and Dottori et al. (2018). Damages are calculated across five different sectors: agriculture, commercial, industrial, infrastructure, and residential, the latter four making up the urban class. Those areas defined as "Artificial" in the GlobCover landcover map are classified as urban areas. Because the GlobCover map does not distinguish between urban sectors, we use constant urban ratios of 56% residential, 20% commercial, 16% industrial, and 8% infrastructure that have been used in the aforementioned studies. When defining agricultural areas, we use the GlobCover "Cropland" class. Where a range of potential cropland area is given in the GlobCover documentation we use the average value (e.g., for 20-50% coverage we use 35%). Damage curves and maximum damages for each sector in each country are taken from the Huizinga et al. (2017) global database of depth-damage functions.
For the GHSL vulnerability method, we follow the Aqueduct approach (Ward, Winsemius, et al., 2020). Damages are calculated for three urban sectors: residential, commercial, and industrial. Urban areas are defined as those cells in the GHSL data set with a built-up area greater than 50%. Constant ratios of 75% residential, 15% commercial, and 10% industrial are used for the urban sector split. The same Huizinga et al. (2017) database is used to determine maximum damage values and damage curves per sector in each country.
For the HYDE vulnerability method, we follow the approach outlined in Ward et al. (2013). Maximum damages for each country are calculated using a GDP normalization equation from Jongman et al. (2012) applied to a maximum damage value from the Damagescanner model (Klijn et al., 2007). To convert the maximum damage values from 2005 USD into 2010 EUR (to ensure consistency with the Huizinga et al. (2017) database), we use the average annual inflation from 2005 to 2010 and the average USD to EUR exchange rate for 2010. Urban areas are calculated using the HYDE urban land cover data set for the year 2015, which shows the percentage urban coverage per grid cell. This percentage urban coverage is converted to an urban area, to which we assign the calculated country-specific maximum damage value. A single depth-damage function is used, which is the average of the functions for the high and low urban density classes in the Damagescanner model.
Damages are calculated for each of the three approaches by intersecting a GFM flood depth map with the relevant land use data set. Where the flooding and the land use data intersect, percentage damage is calculated for the specific land use type using the flood depth at that location and the depth-damage curve for that sector. Damages are calculated by multiplying the percentage damage by the maximum damage value for that land use type. Damages are reported in 2010 Euros.
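The per-cell damage calculation can be sketched as below. The sector shares are the GHSL ratios quoted above; the depth-damage curve, the maximum damage values, and the cell area are placeholder numbers chosen for illustration and are not values from the Huizinga et al. (2017) database. Expressing maximum damage per unit area and interpolating linearly between curve points are also assumptions.

```python
def damage_fraction(depth_m, curve):
    """Linearly interpolate a depth-damage curve of (depth_m, fraction) pairs."""
    if depth_m <= curve[0][0]:
        return curve[0][1]
    for (d0, f0), (d1, f1) in zip(curve, curve[1:]):
        if depth_m <= d1:
            return f0 + (f1 - f0) * (depth_m - d0) / (d1 - d0)
    # Deeper than the last curve point: hold the final damage fraction.
    return curve[-1][1]


def cell_damage(depth_m, urban_area_m2, shares, curves, max_damage_per_m2):
    """Damage in one cell, summed over urban sectors."""
    total = 0.0
    for sector, share in shares.items():
        fraction = damage_fraction(depth_m, curves[sector])
        total += fraction * max_damage_per_m2[sector] * urban_area_m2 * share
    return total


# GHSL sector split from the text; all other numbers are hypothetical.
shares = {"residential": 0.75, "commercial": 0.15, "industrial": 0.10}
toy_curve = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]
curves = {sector: toy_curve for sector in shares}
max_damage_per_m2 = {"residential": 100.0, "commercial": 200.0, "industrial": 150.0}

damage = cell_damage(0.5, 1000.0, shares, curves, max_damage_per_m2)
print(round(damage, 2))
```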

Institutional Capacity of Flood Risk Management
Qualitative interviews were conducted with key water experts in the five countries to explore the extent to which they access these global data sets and their capacity to use them. Data from these interviews were used to illustrate the national context of FRM, as outlined in Section 2, and feed into the discussion in Section 6.

Global Flood Hazard and Population Data Agreement
Aggregated maps of GFM hazard extent (Figure 1) were used to evaluate model agreement (see agreement scores in Table 3). In Colombia, the country with the best MAI score (0.363), the models showed the highest levels of agreement in the north of the country on the Magdalena River. In India, the country with the second highest MAI score (0.322), the areas of highest GFM agreement were in the northeast of the country, along the Ganges and the Brahmaputra rivers. This was a trend seen across the five countries: the models tended to agree more on larger rivers and disagree more on smaller rivers. This is evident in the Orinoquia region in central Colombia, where only one of the five models predicts significant inundation. Most of the rivers here have an upstream drainage area of less than 500 km². Of the five GFMs, Fathom is the only one that models rivers this small (rivers with an upstream drainage area greater than 50 km²). The impact of river thresholds was most marked in England, where Fathom-UK has ingested higher accuracy national elevation and gauge data to model flooding on all rivers. By comparison, JRC only models flooding on six rivers in England. The models also disagree in low-lying coastal areas and deltas, such as the western Ganges delta and the Godavari delta in India, Sarawak's Rajang River Delta in Malaysia, and the Fens in eastern England near The Wash. In these low-lying areas, the flood extent is more sensitive to differences in modeled flood depth, leading to lower model agreement. To examine model agreement at finer scales, we split the countries into drainage basins from level 4 to level 6 according to the HydroAtlas (Linke et al., 2019) classification. Maps of basin level agreement scores can be found in Supporting Information S1.
When examining the relationship in level 6 basins between the catchment area upstream of the basin and the MAI score within the basin, we found a positive monotonic association between the two (Spearman's rank coefficient, ρ = 0.429), evidence that GFM agreement improves as the size of river modeled increases. Comparing MAI scores between coastal level 6 basins and inland level 6 basins, we found that the mean inland MAI score (0.293) was 38% larger than the mean MAI score for coastal basins (0.212). The same trends were found when examining the relationships at basin levels 4 and 5 (results included in Supporting Information S1).
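For reference, Spearman's rank coefficient is the Pearson correlation of the ranks of two variables, which is why it measures monotonic rather than linear association. The plain-Python sketch below uses made-up basin values (not data from this study) and hypothetical function names.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in order[start:end + 1]:
            ranks[k] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank coefficient: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical upstream areas (km2) and basin MAI scores.
upstream_area = [120, 900, 4500, 30000, 250000]
basin_mai = [0.10, 0.22, 0.18, 0.35, 0.41]

rho = spearman_rho(upstream_area, basin_mai)
print(round(rho, 3))
```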
The preceding section considered agreement between the modeled flood extents in two dimensions. Agreement was also measured in three dimensions using the VAI score, which incorporates modeled flood depth in the calculation. In general, VAI scores showed similar trends to MAI scores: scores were higher in basins with larger rivers and lower in coastal basins compared to inland basins. At the national level, Colombia and India remained the two highest scoring countries with VAI scores of 0.217 and 0.183, respectively. Interestingly, Ethiopia had the third highest VAI score (0.169) despite having the lowest MAI score, suggesting there was greater agreement between the modeled flood depths in Ethiopia than in Malaysia or England.
To evaluate GFM agreement in a risk context, exposure agreement when intersected with a population map was calculated using the EAI score. EAI scores were calculated for each of the seven population maps (see Table 3). The lower a population map's EAI score, the greater the proportion of exposure that falls within the low agreement zones of the aggregated flood map. As the EAI score decreases, the effect the choice of GFM has on calculated flood exposure increases. For example, in Colombia the choice of GFM has a greater impact on exposure
Comparing the maximum aggregated GFM extent with 20 years of observational flooding from GFD (see Figure 1), we find that in Colombia the GFMs capture over 92% of the historical flooding. Almost 40% of the captured flooding is in the high agreement zone of the aggregated map (5 models agree), likely because a large proportion of the observed flooding occurred in the north of the country, where the models showed higher levels of agreement. In India, much of the observed flooding on the Ganges and Brahmaputra rivers is captured by the models. However, there are large areas of observed flooding missed by the models in central India in the state of Madhya Pradesh. In England, the 20-year observed flood extent (10,938 km²) is almost as large as the maximum aggregated 100-year return period GFM flood extent (13,608 km²), but with little overlap. The GFD observed flooded area of over 10 thousand square kilometers is nearly double the flooded area recorded by the Environment Agency since 1946 in their historical flood map for England (Environment Agency, 2022). Much of the GFD observed flooding can be attributed to commission errors from cloud cover.
When assessing population map agreement, we consider only binary populated or unpopulated areas; we do not consider variations in population density. England's PAI score (0.782; see Table 3) is much higher than those of the other four countries. This can also be seen visually in Figure 1, where the aggregated population map for England has more dark green areas relative to the other countries. Population disagreement stems largely from the differing approaches to modeling rural and sparsely populated areas. Unconstrained population data sets (which spread residual census data across uninhabited areas) are responsible for the large areas of low population agreement in Colombia and Ethiopia in Figure 1 and contribute to their comparatively low PAI scores. Another contributing factor to population disagreement is the difference in data set resolution. The finest resolution data set (HRSL, 1 arc sec) is detailed enough to identify individual buildings, while the coarsest resolution data set (HYDE, 5 arcminute) is detailed enough to identify only cities.

Flood Exposure
National flood exposure estimates calculated for each country using 35 different combinations of GFM and global population data set are shown in Figure 2. No single GFM consistently predicted the most or least exposure across the five countries. The same is true for the global population data sets. In Colombia, Fathom predicted more than double the average exposure of any of the other GFMs. Here, Fathom's flood extent (152,304 km²) was significantly larger than those of the other GFMs (the next largest extent is JRC at 87,961 km²). In Malaysia, the model with the highest exposure was GLOFRIS. This was because it predicted far more exposure on the Malaysian coast than the other GFMs. This was a trend seen across the five countries: GLOFRIS predicted far more coastal inundation than any other GFM. Flooding in level 6 coastal basins accounted for 21.5% of the total GLOFRIS flood extent, compared with 10.2% (CaMa-UT), 7.7% (Fathom), 7.2% (CIMA), and 6.8% (JRC). In each of the five countries, the average exposure calculated using Fathom was consistently above the 35-data set exposure average, while exposure calculated using CaMa-UT and CIMA was consistently below the 35-data set average.
Figure 2. National flood exposure dot plots. Thirty-five national flood exposure estimates calculated using five global flood models and seven global population data sets. The column on the right shows the average national flood exposure estimate calculated with each population data set.

Table 3 Country Level Model Agreement Index (MAI), Volume Agreement Index (VAI), Exposure Agreement Index (EAI), and Population Agreement Index (PAI) Scores
The choice of global population data set also had a significant effect on exposure estimates. In Ethiopia, when LandScan and HRSL population maps were used, national flood exposure estimates were far lower across all the GFMs. In Colombia, flood exposure estimate disagreement in the Rio Negro basin in the south-east of the country was a result of the use of different population data sets rather than the use of different GFMs. In this basin, average HRSL (47 thousand) and GHS-POP (39 thousand) exposures were far greater than LandScan (17 thousand), HYDE (9 thousand), WorldPop (8 thousand), GPW4 (4 thousand), and GRUMP (2 thousand) exposures. Much of the exposure disagreement in this basin came from the town of Mitú (see Figure S1 in Supporting Information S1). Here, the GPW4 and GRUMP data sets did not even represent a town (populations below 100); WorldPop and HYDE picked up some population (below 4,000); and only LandScan, GHS-POP, and HRSL represented population totals over 10,000 (the 2018 Mitú population estimate was 29,850; DANE, 2019). The difficulty in accurately representing rural towns and populations is one of the major contributing factors to exposure disagreement, especially if the population is exposed to a river, as in Mitú. The population data sets agreed better in large urban areas. This is especially evident in Figure 2 when examining the spread of the population exposure estimates for the JRC GFM in England. The majority (65%) of JRC's national exposure came from the River Thames in Greater London. Because the population data sets show greater agreement in dense urban areas, the differences in exposure estimates here are smaller.
Across the five countries, the only population data set whose average exposure showed a consistent trend above or below the 35-data set average was HYDE, suggesting there are fewer cross-national trends in exposure estimates for global population data than there are for GFMs. The HYDE data set maps population distribution at a resolution of nearly 9 km at the equator, which is between 10 and 300 times coarser than the other population data sets and between 10 and 100 times coarser than the GFMs. At such a coarse resolution, HYDE represents the interaction between the inundation and the exposure with significantly less precision; therefore, the resulting HYDE exposure estimates are influenced more by the modeled inundated area than by the location of the population exposed. Conversely, HRSL exposure estimates were typically lower than the average (except for Malaysia). This is because the detailed representation of individual buildings in the HRSL data set better captures the population's avoidance of obvious floodplains.
The spread of average GFM exposure is larger than the spread of average global population exposure in each of the five countries, suggesting that the choice of GFM used has a greater impact on exposure estimates than the choice of global population data set used. To explore this further at the basin level, we compare the average coefficient of variation of flood exposure estimates when the choice of GFM is held constant to the average coefficient of variation when the choice of global population data set is held constant. Across the five countries, we find that the choice of GFM had a greater influence on exposure estimates than the choice of population data set in 90% of level 4 basins, 80% of level 5 basins, and 78% of level 6 basins. A figure identifying these basins is included in Supporting Information S1.
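The comparison described above can be sketched for a single basin as follows. Holding the GFM constant isolates variation caused by the population data set, and holding the population data set constant isolates variation caused by the GFM. The exposure values and names are hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample standard deviation is an assumption.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


def compare_influence(exposure):
    """exposure[gfm][population_data_set] -> exposed people in one basin.

    Returns (cv_fixed_gfm, cv_fixed_pop): the average coefficient of
    variation with the GFM held constant and with the population data set
    held constant, respectively.
    """
    gfms = list(exposure)
    pops = list(exposure[gfms[0]])
    cv_fixed_gfm = mean(
        [coefficient_of_variation([exposure[g][p] for p in pops]) for g in gfms]
    )
    cv_fixed_pop = mean(
        [coefficient_of_variation([exposure[g][p] for g in gfms]) for p in pops]
    )
    return cv_fixed_gfm, cv_fixed_pop


# Hypothetical basin: two models and two population data sets.
exposure = {
    "model_a": {"pop_x": 100.0, "pop_y": 120.0},
    "model_b": {"pop_x": 300.0, "pop_y": 340.0},
}

cv_fixed_gfm, cv_fixed_pop = compare_influence(exposure)
gfm_dominates = cv_fixed_pop > cv_fixed_gfm
print(round(cv_fixed_gfm, 3), round(cv_fixed_pop, 3), gfm_dominates)
```

In this toy basin the estimates vary more across models than across population data sets, so the choice of GFM would be counted as the dominant influence.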
In Figure 3, exposure results are normalized and combined to produce box and whisker plots for cross-country comparison. The spread of national flood exposure estimates is smaller in England and Ethiopia than it is in Colombia, India, or Malaysia. The range of potential normalized national flood exposures calculated using global data in the latter three countries is substantial. In Colombia, normalized national exposure ranges between 34 and 175 people exposed per 1,000; in India, it ranges between 72 and 244 people exposed per 1,000; and in Malaysia, it ranges between 50 and 219 people exposed per 1,000.
We also calculate population exposure to 20 years of historical flood events from the GFD. These exposure results are listed in Table 4 along with exposure to the maximum combined GFM extent and exposure where the two data sets overlapped. The population data set used to calculate exposure has as significant an impact on observed flood exposure estimates as it does on modeled flood exposure estimates. For example, in India, observed flood exposure calculated using HRSL (50 people per 1,000) is 43% smaller than observed flood exposure calculated using GPW4 (81.1 people per 1,000). This is significant because these data sets are often used in immediate disaster response to estimate the number of people exposed to flood events.

Flood Damages
Direct economic damages for the five countries were calculated using five GFMs and three different vulnerability approaches. The total economic damages and the GDP normalized economic damages for each country are shown in Figure 4. Total flood damages were largest in India, ranging from 29.7 billion EUR (39.4 billion USD) to 109 billion EUR (145 billion USD) depending on the GFM and vulnerability approach used. Damages were most acute in Malaysia, where normalized damages made up between 2.2% and 29% of national GDP. Flood damages were comparatively small in Ethiopia, never exceeding 0.5% of national GDP. Here, when the GlobCover vulnerability approach was used, agricultural damages accounted for the majority of total damages (between 83% and 100%). Only two of the five GFMs (Fathom and CaMa-UT) calculated any urban GlobCover damages in Ethiopia. This is because the rivers running through the two cities where urban damages were calculated (Addis Ababa and Dire Dawa) are too small to be modeled by three of the five GFMs.
In each of the five countries, the choice of vulnerability approach used had a greater impact on direct damage estimates than the choice of GFM. In Colombia, the average total damages calculated using the GlobCover approach were 650 million EUR (862 million USD) compared to 3.5 billion EUR (4.6 billion USD) and 9.9 billion EUR (13.1 billion USD) when GHSL and HYDE approaches were used, respectively. No vulnerability approach consistently predicted the most or least damages. In Colombia, England, and Malaysia the HYDE method predicted the most damages, while in Ethiopia and India the GlobCover approach predicted the most damages. Differences in direct damages between the three approaches are a function of the land cover data set used and the assumptions made during the calculations. Apart from GlobCover, which also considers agriculture, damages are only calculated for urban areas. The three landcover data sets differ in their classification of urban areas. In Colombia, the total urban area defined by GlobCover was just 99 km², for GHSL it was 946 km², and for HYDE it was 2,436 km². The differences in damages calculated in Colombia reflect these differences in urban area. This trend was similar in Malaysia, where urban areas were 1,178, 1,929, and 4,929 km² for GlobCover, GHSL, and HYDE, respectively. The three approaches also make different assumptions about the categorization of damages. Inclusion of agricultural damages is significant in Ethiopia, but not in the other four countries. Infrastructure damages, which are only considered in the GlobCover approach, make up less than 2% of total damages in each of the five countries.

Global Data Used Nationally
We identify 16 different global flood risk data sets and methods that have been used in previous studies of global flood risk and use them to calculate national flood risk in five countries: Colombia, England, Ethiopia, India, and Malaysia. These data sets, which have been instrumental in improving our understanding of global flood risk over the past two decades, are becoming increasingly relevant at the national scale. However, as Ward et al. (2015) postulated about GFMs, "there is often a mismatch between their actual ability and the envisaged use by practitioners." We have shown that there is also a mismatch between the different global data sets, which is reflected in what they tell us about national flood risk.
Disagreement between the GFMs is substantial, and this is reflected in their MAI scores. The scores, which range from 0.24 to 0.363, are in line with the scores of the intercomparison of the first generation of GFMs in Africa. As the models develop, one would expect convergence in their modeled output. However, these models are not being developed at the same rate. Only three of the five GFMs tested in this study (Fathom, CaMa-UT, GLOFRIS) have updated their model outputs since the first intercomparison. Fundamental differences between the models remain, most notably the thresholds set on the size of river modeled (Bernhofen et al., 2021). These thresholds impact estimates of flood risk at the national scale, as seen with the large Fathom risk estimates relative to the other models (especially in Colombia and Ethiopia), and risk estimates at the basin and city scale, such as in the capital of Ethiopia, Addis Ababa, where only two of the five GFMs estimated any flood risk. Beyond differences in modeled domain, the models differ in their very structure. Although there are limits to the conclusions that can be drawn from a comparison of raw modeled output alone, results suggest that differing levels of hydrodynamic representation and coastal boundary conditions contribute to disagreement in low-lying and coastal areas. This is evident across the five countries, but especially in England and Malaysia, the two countries with higher coast to area ratios. In these countries, GLOFRIS (a volume spreading model) predicts far higher coastal exposure, and subsequently national exposure, relative to the other (more hydrodynamic) models. The final resolution of the modeled flood extent should also be considered. The detail lost when using a 1 km resolution flood map compared with a 30 m or 90 m resolution flood map is not insignificant (Fleischmann et al., 2019; Horritt & Bates, 2001; Savage et al., 2016).
National flood risk management strategies will often have to account for the combined risk posed by riverine, pluvial, and coastal flooding. This study only considered modeled riverine flooding, due to the scarcity of global data sets considering multiple flood drivers. Thus, the risks reported in this study will likely be smaller than the combined flood risks faced by each country. Similarly, flood protection measures are either crudely represented in global flood models, often as a function of GDP (Rudari et al., 2015; Sampson et al., 2015) or through protection standard databases (Scussolini et al., 2016), or not represented at all. Our comparison of undefended riverine flood maps will likely have overestimated flood risk in countries with extensive flood protection, such as England.
Equally important to globally modeled flood data is global observational flood data, although the limits of these data have been shown in England, Ethiopia, and Malaysia. The incorrect classification of flooding from satellite imagery due to cloud or terrain shadows, as is apparent in England, can lead to significant overprediction of flooding and to potentially misclassified exposure (Revilla-Romero et al., 2015). In Ethiopia and Malaysia, the limited 20-year timeframe of the satellite observed data is evident, as the observed flooded area in each country is a small fraction of the 100-year return period GFM flooded area. Rather than being used in isolation, global flood observations and GFM data should be used to complement each other.
Population disagreement was almost as significant as GFM disagreement. This was most notable in rural areas, such as in south-east Colombia, where the town of Mitú was captured by only some of the population data sets. Global population data were classified in Leyk et al. (2019) by the complexity of modeled population distribution. Unmodeled population data sets, such as GPW4, evenly distribute population data over census enumerated areas. This means the detail at which population is represented is entirely dependent on the size of the enumerated areas, something which is highly variable across countries. In Colombia, where the size of the average census enumerated area is 1,021 km², GPW4 calculated exposure is a lot less accurate than in England, where the size of the average census enumerated area is 0.76 km² (CIESIN, 2018). Indeed, it is surprising that GPW4 is still so widely used in studies of flood exposure, as it does not capture the population distribution at any detail finer than the census unit.
The discrepancy in detail between national census data is something that needs to be considered for all population data sets that use GPW data as input (GHS-POP, GRUMP, HRSL, WorldPop), although the impact this has on final population estimates is smaller due to the additional population distribution modeling carried out by these data sets. The resolution of the population data is equally important to consider. The most resolved of the population data sets, HRSL, identifies individual dwellings at 30 m resolution. Highly resolved population data was shown in the studies of Smith et al. (2019) and Bernhofen et al. (2021) to be of significant importance in accurately representing the avoidance of flood-prone areas. Indeed, the low HRSL flood exposure estimates in four of the five countries examined in this study support this finding. There are obvious limits to the conclusions that can be drawn with coarse resolution population data. Those data sets with a resolution of 1 km (GPW4, GRUMP, LandScan) will struggle to accurately model exposure on anything but the largest rivers. Even GHS-POP, which has a resolution of 250 m, was shown in Bernhofen et al. (2021) to be too coarse to accurately represent exposure on some smaller rivers. Certain population data, such as GRUMP and HYDE, are not relevant at the national scale. GRUMP, which has not been updated since 2000, is obsolete for flood risk analysis under current conditions. The HYDE data, at a resolution of roughly 9 km, is far too coarse to draw any meaningful conclusions at the national level. It was the only population data set that consistently overpredicted exposure across the five countries examined. Population data should be chosen with the intended use in mind. Previous studies have highlighted the benefits of HRSL (Bernhofen et al., 2021; Smith et al., 2019); however, HRSL population estimates are limited to 2018. If consistent population estimates across time are required, data sets such as GHS-POP or WorldPop would be better suited. Similarly, if exposed daytime population rather than nighttime population is required, LandScan is the only one of these data sets that can be used.
The range of low national VAI scores (0.132-0.217), which consider both extent and depth disagreement between the models, would suggest that the choice of GFM has a large effect on calculated national damages. What we found, however, was that the choice of vulnerability approach has a far greater effect on national damages than the choice of GFM. This was largely due to how the three different land cover maps used in the vulnerability calculations identified urban areas. Across the five countries, the size of urban area defined by each data set was reflected in the national damage estimates. Equally important, but less reflected in the national flood damage estimates, was how the GlobCover and GHSL approaches split urban sector damages. Each approach applied constant global ratios of urban sector split, which were based on studies of either global or European cities (Ward, Winsemius, et al., 2020). Sector level damages were directly impacted by these constant global ratios, meaning GHSL residential damages always made up a larger proportion of urban damages than GlobCover residential damages. The lack of a ceteris paribus comparison between vulnerability approaches limits the definite conclusions that can be drawn about the impact different aspects of the vulnerability calculation had on disagreement. Previous work by de Moel and Aerts (2011) found that the valuation of assets and the choice of damage curve had the greatest impact on damage uncertainty in the Dutch basin they were investigating. Similarly, when examining loss data in the US from the National Flood Insurance Program, Wing et al. (2020) found that claims data do not fit the monotonic shape of traditional damage curves. It is well established that the vulnerability component of any flood risk assessment carries the most uncertainty at any scale. The assumptions and uncertainties associated with the three global vulnerability approaches tested in this study do not translate well into the national context.

Uncertainty and Decision Making
Exploring the hazard, vulnerability, and exposure components of flood risk using different models and data sets provides a useful basis for discussing uncertainty across all three components. Often quantifiable uncertainties are understood as risks, while unquantifiable uncertainties are understood as uncertainties. Some of these uncertainties are amenable to quantitative or qualitative evaluation, while some cannot be evaluated (Riesch, 2013).
Key sources of modeling related uncertainty include context and framing, input, model structure, parameter, and model technical uncertainty (Refsgaard et al., 2007). GFMs are not always developed for answering questions associated with context and framing, such as social, environmental, economic, technological, and infrastructural characteristics at the local scale. The extent to which these characteristics (often not accounted for) affect uncertainties differs from location to location and involves complex interactions. These characteristics may not always be captured in the calibration and validation process, which does not always include all streamflow observation stations (Hirpa et al., 2021; Wing et al., 2021). This means that model parameterization is not always sensitive to local characteristics. Similarly, while the implicit representation of flood protection in GFMs may produce accurate aggregate estimates of risk, local risk cannot be accurately understood without explicit representation of flood defenses. Input data across GFMs show commonalities (e.g., the DEM), interdependences, and differences, making it difficult to isolate the impact of different factors on uncertainty. Substantial model structure, parameterization, and technical differences also mean that without a significantly complex sensitivity analysis it would be difficult to ascertain how and to what extent they individually or collectively affect uncertainty (Hoch & Trigg, 2019). As a consequence, while we use multiple models and data sets to illustrate uncertainties, the contribution of different sources of uncertainty is difficult to ascertain quantitatively or qualitatively.
As GFM information is combined with exposure and vulnerability information from different data sets, uncertainty cascades and the uncertainty space expands. Flood risk assessments, which are based on all three components, would involve a further increase in uncertainty. Furthermore, the flood risk information may not always account for smaller scale flood prevention interventions or be relevant at hyperlocal scales where other socioeconomic factors may affect flood risk. Further scrutiny may also reveal that there are important differences between extreme flood magnitudes (not explored in this study), as demonstrated in Africa and the conterminous United States (Devitt et al., 2021). There may also be other models, data sets, and even factors affecting risk that need to be accounted for to better understand risks. This suggests that although this study brings together multiple variants of the contributors to flood risk assessment, the uncertainty space is unclear and the contribution of different factors to exposure, damage, and risk is not well characterized. This makes it challenging to interpret results in a way that can aid decision making at local scales. For instance, we discuss disagreement in typically flood-prone areas like low-lying deltas and low-relief coastal areas, and higher agreement in larger river basins. This could affect the salience and credibility of the information for local to national level decision makers, some of whom may have primary knowledge and experience of dealing with flooding in such areas. Such information may be relevant to decision makers who are interested in hotspots and may focus on areas with greater agreement, or on areas where populations also face other hazards. An alternative would be for decision makers to treat these different combinations (of exposure, damage, and risk estimates) as scenarios that are plausible but do not necessarily capture the full range of possibilities.
Decision makers, who are often used to dealing with uncertainty, may find that a structured approach to decision making under uncertainty (Bhave et al., 2016) helps them assess the value of the information and make more informed flood management decisions.
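The idea of treating each combination of hazard, population, and vulnerability input as one scenario can be sketched in a few lines of Python. This is an illustrative sketch only: the data set labels are placeholders rather than the names used in this study, and the `summarize` helper is a hypothetical convenience, not part of the study's workflow. It assumes the six hazard data sets, seven population data sets, and three vulnerability methods evaluated here, which gives 6 × 7 × 3 = 126 scenarios per country.

```python
from itertools import product

# Placeholder labels (assumptions, not the study's actual data set names).
HAZARD = [f"hazard_{i}" for i in range(1, 7)]                # six flood hazard data sets
POPULATION = [f"population_{i}" for i in range(1, 8)]        # seven population data sets
VULNERABILITY = [f"vulnerability_{i}" for i in range(1, 4)]  # three vulnerability methods


def enumerate_scenarios(hazard, population, vulnerability):
    """Return every (hazard, population, vulnerability) combination."""
    return list(product(hazard, population, vulnerability))


def summarize(estimates):
    """Summarize scenario estimates as a range rather than a single value.

    `estimates` maps a scenario tuple to a risk estimate in any consistent unit.
    """
    values = sorted(estimates.values())
    low, high = values[0], values[-1]
    return {
        "n_scenarios": len(values),
        "min": low,
        "max": high,
        # Ratio of largest to smallest estimate; None if the minimum is not positive.
        "max_to_min": high / low if low > 0 else None,
    }


scenarios = enumerate_scenarios(HAZARD, POPULATION, VULNERABILITY)
print(len(scenarios))  # 126
```

Reporting the minimum, maximum, and their ratio keeps the emphasis on the spread of plausible outcomes. As noted above, that spread should not be read as the full range of possibilities.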

National Capacity to Use Global Data
The inconsistencies and variation between these global data sets may call into question their "usefulness" for evaluating flood risk, especially for countries with limited institutional capacity and policy for FRM. Corroborating global data with locally collected data could help ensure accuracy and consistency. However, the quality and availability of data in countries in the Global South is frequently very poor. For example, spatiotemporal time series are often incomplete, and bureaucratic or administrative barriers or political motivations can prevent access to data. These global data sets can hold exceptional value in areas that are data poor; for example, planetary level data sets have been used to detect long-term meteorological changes in Pakistan, India, and Inner Mongolia (Lindersson et al., 2020).
Evaluations at the national scale may have limited impact on the most vulnerable, who usually inhabit agriculturally dependent rural areas, especially if population data sets are not sufficiently resolved for these areas or if vulnerability is interpreted narrowly or in a misaligned way. For example, calculations of economic damage reflect inequality, so how agricultural damage is accounted for becomes important given the centrality of agriculture to the livelihoods of many: damage that is not economically catastrophic in GDP terms is potentially devastating at the local, livelihoods scale.
Effective FRM and the ability to use these global flood risk data sets require policy and institutions that recognize the interconnected and interdependent systems that inevitably arise, not only in technical interventions and infrastructure but also in the sociopolitical networks that provide expertise and coordination (Jonkman & Dawson, 2012). However, the resources and agency to achieve this are frequently lacking. For example, some countries do not have specific FRM policies, and other disaster management areas take priority, such as drought in Ethiopia and earthquakes in Colombia, resulting in a lack of agency regarding FRM.

Conclusions
As global flood risk data develops and becomes increasingly relevant at national scales there is an urgent need to evaluate its credibility in a national flood risk context. By carrying out national flood risk assessments using global data in five different countries we explore the commonality and variability of global data used nationally. Global data sets vary significantly at the national level, and this is reflected in the national flood risk estimates. We find that the choice of GFM has a larger effect on exposure estimates than the choice of population data set. Across the five countries, the choice of GFM is more significant than the choice of population data in 78% of level 6 basins. The choice of vulnerability approach has the greatest influence on national flood damage estimates. In Colombia, national flood damages differed by a factor of 15 depending on the global vulnerability approach used. The detail of the data sets becomes increasingly important at the national scale. GFMs that do not model the flooding of small rivers are leaving a substantial amount of national flood risk unaccounted for, which results in risk estimates that are often halved compared to the risk estimates of more representative GFMs. Similarly, coarse resolution global population data limits the detail at which risk can be evaluated and diminishes the usefulness of certain data sets at this scale. Global approaches to calculating vulnerability are limited both by the uncertainty of global land cover data sets (where classified urban areas can differ by up to a factor of 25) and the assumptions made to calculate damages at the global scale.
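One plausible way to compare the influence of the GFM choice with that of the population data choice in a single basin is sketched below. This is an assumption-laden illustration, not a restatement of the metric used in this study: it takes the coefficient of variation of mean exposure across GFMs and across population data sets and labels the larger one as dominant. The function names and the nested dictionary layout (`exposure[gfm][population]`) are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0.0 if the mean is zero)."""
    m = mean(values)
    return pstdev(values) / m if m else 0.0


def dominant_choice(exposure):
    """Return "gfm" or "population" for one basin.

    `exposure` maps GFM name -> population data set name -> exposed population.
    Every GFM is assumed to be paired with the same population data sets.
    """
    gfms = list(exposure)
    pops = list(next(iter(exposure.values())))
    # Mean exposure per GFM (averaged over population data) and vice versa.
    gfm_means = [mean(exposure[g][p] for p in pops) for g in gfms]
    pop_means = [mean(exposure[g][p] for g in gfms) for p in pops]
    cv_gfm = coefficient_of_variation(gfm_means)
    cv_pop = coefficient_of_variation(pop_means)
    # Ties are assigned to "population" so that "gfm" requires a strictly larger spread.
    return "gfm" if cv_gfm > cv_pop else "population"


def share_gfm_dominant(basins):
    """Fraction of basins in which the GFM choice has the larger spread."""
    labels = [dominant_choice(exposure) for exposure in basins.values()]
    return labels.count("gfm") / len(labels)
```

Applied to every level 6 basin in a country, `share_gfm_dominant` would yield a fraction comparable in form to the basin share reported above, although a different spread metric could change the result.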
Beyond these challenges and inconsistencies, and just as significant, is the question of whether countries have the capacity to access and use these data sets. These data sets can only be effective if a country has a functioning institutional framework with the capability to support their use and implementation. This can include informed and proactive policies that both monitor and plan for future flood risk, as well as strong institutions that effectively implement these policies, encourage expertise, and assist consultation and coordination between a diverse range of key stakeholders (Jonkman & Dawson, 2012). Technical and financial capital is significant in introducing and maintaining the infrastructure needed to monitor and assess flood risk, as is the availability of good quality, compatible data to complement and use alongside these global data sets. Variation in the methods of conceptualizing population and vulnerability could be particularly problematic for compatibility.

Recommendations and Future Work
Global flood risk data sets were evaluated in this study by quantifying the uncertainty that arises when they are used interchangeably for flood risk assessments at the national scale. Further work should incorporate locally sourced data and locally calibrated models to test the global data sets. Only then could one definitively conclude which data is "best" for a given locality or use. These data sets could have considerable potential for assisting and furthering FRM in countries that have limited capacity to access local data. However, further investigation is needed to reveal the extent to which countries find these data sets useful and have the capacity to use them. Further work should examine in greater detail the institutional capability of national and local FRM to access and apply such data sets. The variation between these data sets also requires a technical understanding of the nature and limitations of the data. Further, the application of such data sets as evidence for decision making entails choices over the allocation of resources. Future work should seek to examine the types of policy and resource allocations that result from the application of such data sets to FRM.

Data Availability Statement
All the data used in this study are openly available for research purposes. CaMa-UT model outputs can be obtained from the developer at http://hydro.iis.u-tokyo.ac.jp/~yamadai/. Aggregated 1 km CIMA-UNEP flood maps can be downloaded from https://preview.grid.unep.ch; the native 3 arc sec maps used in this study can be obtained directly from the developer. The Fathom Global 2.0 and Fathom-UK flood maps can be obtained from the developer at https://www.fathom.global/. GLOFRIS flood maps can be downloaded from the Aqueduct floods platform https://www.wri.org/applications/aqueduct/floods/. JRC flood maps