The impact of different rainfall products on landscape modelling simulations

Rainfall products can contain significantly different spatiotemporal estimates, depending on their underlying data and final constructed resolution. Commonly used products, such as rain gauges, rain gauge networks, and weather radar, differ in their information content regarding intensities, spatial variability, and natural climatic variability, therefore producing different estimates. Landscape evolution models (LEMs) simulate the geomorphic changes in landscapes, and current models can simulate timeframes from event level to millions of years and some use rainfall inputs to drive them. However, the impact of different rainfall products on LEM outputs has never been considered. This study uses the STREAP rainfall generator, calibrated using commonly used rainfall observation products, to produce longer rainfall records than the observations to drive the CAESAR‐Lisflood LEM to examine how differences in rainfall products affect simulated landscapes. The results show that the simulation of changes to basin geomorphology is sensitive to the differences between rainfall products, with these differences expressed linearly in discharges but non‐linearly in sediment yields. Furthermore, when applied over a 1500‐year period, large differences in the simulated long profiles were observed, with the simulations producing greater sediment yields showing erosion extending further downstream. This suggests that the choice of rainfall product to drive LEMs has a large impact on the final simulated landscapes. The combination of rainfall generator model and LEMs represents a potentially powerful method for assessing the impacts of rainfall product differences on landscapes and their short‐ and long‐term evolution. © 2020 The Authors. Earth Surface Processes and Landforms published by John Wiley & Sons Ltd


Introduction
Landscape evolution models (LEMs; see review by Tucker and Hancock, 2010) are tools for understanding the role large-scale processes have in the long-term development of the Earth's surface (10³-10⁶ years). LEMs are most often operated in an exploratory manner, employing careful simplifications of physical laws to maximize computational efficiency (Tucker and Hancock, 2010). Recently, the increase in available computational power has allowed increasing complexity and detail to be simulated, yet the output uncertainties of these models, and their sensitivity to differences between input data, have received far less attention (Coulthard and Skinner, 2016; Hancock et al., 2016; Chandra et al., 2019).
Rainfall, both spatially and temporally, is considered a major source of input uncertainty in hydrological modelling (Keijsers et al., 2011; McMillan et al., 2011, 2012; Peleg et al., 2017a). Measurements of rainfall are made using a variety of methods that produce estimations of the spatial and temporal distribution of rainfall, and these different techniques are referred to herein as rainfall products. Rainfall products can be formed from single-point observations (e.g. rain gauge), interpolated from many points in space (networks of rain gauges, disdrometers, etc.), or spatially estimated using remote sensing techniques (i.e. weather radar and satellite). For each of the products, differences in the estimates, spatially and temporally, are expected (e.g. Ciach, 2003; Villarini et al., 2008; Villarini and Krajewski, 2010; Peleg et al., 2013).
Rain gauges provide an estimate of rainfall intensity at a single point, and when used in a network the spatial pattern of rainfall can be estimated using spatial interpolation methods (e.g. kriging). These spatial datasets contain uncertainties originating from the point measurements, the interpolation (dependent on the density and design of the network), the terrain of the catchment, the spatial distribution of rainfall, and the interpolation method selected (e.g. Hofstra et al., 2009; McMillan et al., 2011). Weather radar is a useful technique for estimating the distribution of a rain field over a large area (Berne and Krajewski, 2013). Weather radar systems measure the backscattered signal and use it as a proxy for rainfall. Quantitative precipitation estimation from weather radar is considered to be highly uncertain, but the quality has improved in recent years, even allowing the study of extreme rainfall intensities (e.g. Marra and Morin, 2015; Marra et al., 2017). In combination, rain gauges and weather radar can provide a more detailed view of the properties of a rain field, with the added value of spatial coverage provided by the radar, especially between gauges.
Given the length of many LEM simulations, the impact of differences in the driving rainfall product might seem trivial. However, research has shown that simulated geomorphic processes and sediment output from LEMs can be sensitive to the spatial and temporal resolution of rainfall inputs. Coulthard and Skinner (2016) showed that increased rainfall intensities, which may be hidden within averaged longer timesteps or spatial averaging, led to greater first-order stream incision and second/third-order stream aggradation for the same rainfall totals. They therefore suggested that the patterns of erosion and deposition predicted by LEMs are highly likely to be affected by uncertainties and variabilities in rainfall products. Coulthard et al. (2012b) showed that basins acted as a 'geomorphic multiplier', with a non-linear relationship between rainfall, discharge, and sediment transport acting to exponentially amplify the effects of increases in rainfall on sediment yield.
As landscape evolution operates over timescales spanning tens to millions of years, the choice of rainfall product with which to drive a LEM is often motivated by the requirement for an indicative record of rainfall to sample from. However, because rainfall observation records often lack the required temporal coverage, one approach has been to use as long a record as possible from a nearby rain gauge (even if located tens of kilometres away) and loop it to the required length (e.g. Coulthard et al., 2000; Hancock, 2009; Hancock et al., 2010, 2011, 2015; Hancock and Coulthard, 2012; Saynor et al., 2012; Poeppl et al., 2013; Hoober et al., 2017). This clearly introduces several limitations: climate variability is reduced to that contained within the length of the record, and non-stationarity (e.g. climate change) can only be represented in a basic way (e.g. by altering rainfall totals). A rainfall time series produced in this way will never contain higher extremes than those observed in the record, which is unrealistic. Rainfall generator models present an alternative way to derive long-term rainfall records (Wilks and Wilby, 1999; Smith et al., 2014) and can be used to produce ensembles of synthetic rainfall calibrated using the climate variability and spatiotemporal characteristics from observed rainfall products. Rainfall generators have been used previously to produce time series of basin-averaged precipitation rates for LEMs (Tucker and Bras, 2000; Coulthard et al., 2012a; Howard et al., 2016; Coulthard and Van De Wiel, 2017; Hancock et al., 2017), with no spatial distribution, and there has been no appraisal of how different methods of rainfall simulation, and hence different rainfall products, can alter model outcomes.
This study aims to test how the different spatiotemporal estimates associated with different rainfall observation techniques affect simulations of changes to basin geomorphology as they cascade through to LEM outputs. Three starting rainfall products were used: (i) a single rain gauge product; (ii) an interpolated gridded rain field based on a rain gauge network; and (iii) weather radar observations. Fifty-year records were produced by looping and bootstrapping (i), and by using (ii) and (iii) to calibrate the STREAP weather generator (Paschalis et al., 2013; Peleg et al., 2017b) and derive 30 ensemble members at a 1 h/1 km resolution. Whilst this is not a complete uncertainty evaluation of all plausible rainfall scenarios, it allows us to start quantifying the impact of these different products on the LEM outputs, and thus the sensitivity of those outputs to the choice of product.

Study catchment
The study area was the Upper Swale catchment, UK (Figure 1), which has an area of 181 km² and an elevation range between 182 and 712 m. There is a strong orographic rainfall gradient across the catchment, with mean annual rainfalls of 1000 mm in the north-east and up to 1940 mm in the south-west. The catchment was chosen as it has been widely used in previous studies testing the CAESAR-Lisflood model (Coulthard and Macklin, 2001; Coulthard et al., 2012b; Coulthard and Van De Wiel, 2013; Coulthard and Skinner, 2016), and it has good coverage of the rainfall products.
In order to measure spatial patterns of simulated geomorphic change, the catchment was subdivided into areas based on stream orders derived from the proportion of the catchment drained, as per .
Rainfall products: Single gauge, gauge network, and weather radar
Three rainfall products were examined. The first (Tow) is based on a relatively long record from a single rain gauge located 2 km south-west of the study area, at a 1 h timestep. Even though this is the closest available gauge to the catchment, it is located in a topographically higher area with a greater mean annual rainfall than much of the catchment (see Figure A4 in the online Supporting Information A); the record was not adjusted to account for this. It provided a 30-year record; however, years with more than 2000 missing hours (83 days) of observations were excluded from the analysis, reducing the record to 24 years (removing 1987-1993, 1995-2009, 2011, and 2014). This product reflects intensities sampled from a single point and therefore provides no representation of the spatial rainfall field, so rainfall intensities were assumed to be uniformly distributed across the catchment.
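The completeness screening applied to the Tow record can be illustrated with a minimal sketch. It assumes that missing hours are stored as NaN and that the record is already split by calendar year; the function name, the data layout, and the synthetic values are hypothetical and are not taken from the study.

```python
import numpy as np

MAX_MISSING_HOURS = 2000  # threshold quoted in the text (about 83 days)

def complete_years(hourly_by_year, max_missing=MAX_MISSING_HOURS):
    """Return the years whose hourly record has at most `max_missing` gaps.

    `hourly_by_year` maps a year to a 1-D array of hourly rainfall (mm),
    with missing observations stored as NaN (an assumed convention).
    """
    kept = []
    for year, values in sorted(hourly_by_year.items()):
        missing = int(np.isnan(np.asarray(values, dtype=float)).sum())
        if missing <= max_missing:  # only years with MORE than the limit are dropped
            kept.append(year)
    return kept

# Synthetic three-year record (8760 hours each) with different gap lengths.
rng = np.random.default_rng(0)
record = {}
for year, n_missing in [(1987, 100), (1994, 3000), (1995, 2000)]:
    series = rng.exponential(0.2, size=8760)
    series[:n_missing] = np.nan
    record[year] = series

kept_years = complete_years(record)
```

In this toy case the year with 3000 missing hours is dropped, while the year with exactly 2000 missing hours is retained, matching the "more than 2000" wording.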
The second product (TBR) aims to address the lack of spatial coverage in the single-gauge approach above by interpolating the point data from a network of rain gauges to produce a distributed and gridded rain field estimation (McMillan et al., 2011). The gridded interpolated rainfall data was derived from a tipping bucket rain gauge (TBR) network at 1 h/1 km resolution, as described by Blenkinsop et al. (2017).
The third product (NIMROD) is derived from a rain radar network (UK NIMROD Composite product; Met Office, 2003), provided at a 5 min/2 km resolution, which was aggregated to a 1 h timestep.
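As an illustration of the temporal aggregation step, the sketch below averages twelve consecutive 5 min intensity fields into one hourly field. It assumes the radar values are intensities in mm h⁻¹ held in a (time, rows, cols) array; the function name and the synthetic field are hypothetical, and the sketch does not reproduce any spatial regridding that may also have been applied.

```python
import numpy as np

def aggregate_to_hourly(intensity_5min, steps_per_hour=12):
    """Average 5 min rainfall intensities (mm/h) into hourly intensities.

    `intensity_5min` has shape (time, rows, cols). Trailing time steps that
    do not fill a whole hour are dropped.
    """
    arr = np.asarray(intensity_5min, dtype=float)
    n_hours = arr.shape[0] // steps_per_hour
    trimmed = arr[: n_hours * steps_per_hour]
    # Group the time axis into (hour, step-within-hour) and average the steps.
    blocks = trimmed.reshape(n_hours, steps_per_hour, *arr.shape[1:])
    return blocks.mean(axis=1)

# Two hours of synthetic data on a 2 x 2 grid.
field = np.concatenate([np.full((12, 2, 2), 6.0), np.full((12, 2, 2), 0.0)])
field[0] = 18.0  # one intense 5 min step in the first hour
hourly = aggregate_to_hourly(field)
```

The single 18 mm h⁻¹ step is reduced to an hourly mean of 7 mm h⁻¹, a small example of how short-lived peaks are damped by aggregation.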
Despite a much longer record being available for the TBR product (from 1990), both TBR and NIMROD were limited in length to a 6-year period where records for both the TBR and the NIMROD were available (2006-2011). This was so that it would be possible to directly compare their relative abilities in capturing the spatial rain field and point intensities over the same time period, using them in the calibration of STREAP, and subsequently propagating the impacts on changes to basin geomorphology.
An example of the spatial differences in rainfall intensity between the three products is shown in Figure 2, which illustrates the most extreme hourly rainfall intensities that were observed by the three products.
Further in-depth analysis of the differences between the observation products can be found in the online Supporting Information A. To summarize, the three products differed in their spatiotemporal representation of rainfall over the catchment. The Tow rain gauge record contained more information on the natural variability of climate due to the length of the record, but only supplied this information for a single point; this point is just outside the catchment in the area of highest annual rainfall (1940 mm based on TBR). Point rainfall intensity estimates from the NIMROD weather radar are considered less accurate (not reported here) compared with rain gauges (Cecinati et al., 2017), but the weather radar supplies valuable information on the spatial structure of rainfall. Lastly, because TBR was produced from interpolation of a network of ground rain gauges, its rainfall intensity estimates were likely more reliable than the estimates from the weather radar (Lewis et al., 2018); however, the spatial structure of the rainfall is smoothed compared to the radar (Jewell and Gaussiat, 2015), especially as the interpolation included no rain gauges from within the catchment itself.

STREAP rainfall generator
The STREAP (space-time realizations of areal precipitation) model is used to generate gridded rainfall at high spatial (sub-kilometre) and temporal (minute) resolution. The model was originally developed by Paschalis et al. (2013) and was developed further by Peleg et al. (2017b). It has been applied in several hydrological applications (e.g. over a large rural catchment for flood investigations, Paschalis et al., 2014; to simulate extreme rainfall intensity over small scales, Peleg et al., 2018; and to study the impacts of spatial and climatological rainfall variability in urban drainage, Peleg et al., 2017a), and was applied with the CAESAR-Lisflood model to study the landscape sensitivity to extreme rainfall events in the context of climate change (Peleg et al., 2020b). STREAP reproduces the storm arrival process (i.e. the length of the storm and intra-storm periods), the temporal evolution of the storm (i.e. the areal intensity over the catchment and the fraction of wet areas), and the space-time structure of rainfall (i.e. intermittent rain fields). The model is well suited to LEM studies as it can be used to produce ensembles of synthetic rainfall reflecting the natural variability of rainfall (assuming climate is stationary) from relatively short observation records, whilst reproducing the spatiotemporal characteristics of the rain fields.
FIGURE 2. The largest hourly rainfall intensity (19/06/2007, 19:00) as recorded by the uniform Tow rain gauge (left), the distributed NIMROD weather radar (middle), and TBR (right) products. μ is areal rainfall intensity (i.e. the average rainfall over the field), CV is the spatial rainfall coefficient of variation (standard deviation of the rainfall intensities over the field divided by the areal rainfall intensity), and max is the maximum rainfall intensity observed in that rain field.
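The three field statistics defined in the Figure 2 caption (μ, CV, and max) can be computed for any gridded rain field as in the sketch below. The function name and the two 3 × 3 fields are hypothetical and purely illustrative; the population standard deviation is assumed because the caption does not specify the estimator.

```python
import numpy as np

def field_statistics(field):
    """Return (mu, cv, maximum) for one gridded rain field (mm/h).

    mu  : areal rainfall intensity (mean over all cells)
    cv  : spatial coefficient of variation (std / mu); NaN for a dry field
    max : largest cell intensity in the field
    """
    arr = np.asarray(field, dtype=float)
    mu = arr.mean()
    cv = arr.std() / mu if mu > 0 else float("nan")
    return mu, cv, arr.max()

# A uniform field (as implied by a single gauge) and a patchy field with
# the same areal mean but a localized storm core.
uniform = np.full((3, 3), 4.0)
patchy = np.array([[0.0, 0.0, 0.0],
                   [0.0, 12.0, 0.0],
                   [0.0, 0.0, 24.0]])

mu_u, cv_u, max_u = field_statistics(uniform)
mu_p, cv_p, max_p = field_statistics(patchy)
```

Both fields share the same areal mean, so μ alone cannot distinguish them; the difference appears only in CV and max, which is why the caption reports all three.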

CAESAR-Lisflood landscape evolution model
CAESAR-Lisflood is a LEM based on a regular grid that has been used extensively to simulate river basin morphodynamics over a wide range of time (1-10 000 years) and spatial (0.02-1000 km²) scales. The model can operate in a catchment mode where a spatially distributed rainfall input is converted into surface runoff using the TOPMODEL hydrological model (Beven and Kirkby, 1979), and this is routed across the grid using LISFLOOD-FP (Bates et al., 2010), which generates flow depths and velocities. An active-layer system is used to simulate fluvial erosion and the model can handle up to nine grain sizes. The CAESAR model was originally conceived to explore the impact large-scale processes have on the long-term evolution of landscapes (>1000 years); however, the incorporation of the LISFLOOD-FP hydraulic code enabled full hydrodynamics and validated inundation patterns and velocities to be simulated. Consequently, CAESAR-Lisflood is well suited to testing the impact that differences in high-resolution (temporal and spatial) rain data have on model outputs and on changes to basin geomorphology. To assess the model's sensitivity to rainfall observation uncertainty, both within and between products, all model parameters were kept constant throughout the experiments. We used a pre-calibrated CAESAR-Lisflood model of the Swale catchment, UK, as used in several previous studies (e.g. Coulthard and Skinner, 2016) and used the same digital elevation model (DEM) derived from an airborne LiDAR scan that had been resampled to 50-m grid cells. The DEM and grain size distributions were 'spun-up' before the experiments using a 10-year rainfall time series at 24 h/lumped (catchment-average) resolution based on the UK NIMROD 5 km Composite dataset (Met Office, 2003).
It is important here to note that the model application is intended as a testbed for the impacts of rainfall products, and therefore does not include any representation of bedrock or vegetation that could alter the model sensitivity.

LEM ensemble simulations
The experimental procedure uses the different gridded rainfall products described to calibrate STREAP. Each calibration is used to produce a rainfall ensemble at a 1 h timestep and a 1 km spatial resolution. The generation of the ensembles follows a common setup when exploring climate impacts including stochastic uncertainties (e.g. Fatichi et al., 2016), with each consisting of 30 individual realizations (to account for the natural climate variability) and covering a period of 50 years (the upper length boundary allowing us to assume climate stationarity). Each ensemble is designed to show how the different information provided by the TBR or NIMROD weather radar rainfall products spatially influences the generated rainfall records; any differences, effectively a source of representative uncertainty, will then be observed through the model cascade. Two records were generated using the Tow gauge: (i) a simple looped record, used as a proxy for a typically used rainfall product to allow comparison; and (ii) a bootstrapped ensemble (of 30 realizations), generated by block bootstrapping entire years of the original record with replacement, meaning that a specific recorded year can appear numerous times or never in each realization. It is possible to reduce the uncertainty in rainfall observations by merging multiple products, using the most useful information from each. Here, for example, a 'merged' rainfall record was produced by calibrating STREAP using the best information from the three original products: the temporal structure of rainfall follows that obtained from the Tow rain gauge, the spatial structure of rainfall follows that obtained from the radar, and rainfall intensities are adjusted to follow the TBR data (Table I). The generated rainfall ensembles will henceforth be referred to as products in their own right.
The products used are shown in Table I, and an example of a single 50-year realization for each of the products is presented in Figure 3.
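The year-block bootstrap used for the gauge ensemble can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the record is assumed to be pre-split into calendar years of equal length (leap years are ignored), and the function name, seeds, and synthetic record are hypothetical.

```python
import numpy as np

def bootstrap_years(record_by_year, n_years=50, seed=None):
    """Build one realization by sampling whole years with replacement.

    Returns the sequence of drawn years and the concatenated hourly series.
    """
    rng = np.random.default_rng(seed)
    years = sorted(record_by_year)
    drawn = rng.choice(years, size=n_years, replace=True)
    series = np.concatenate([record_by_year[int(y)] for y in drawn])
    return drawn, series

# Synthetic 24-year record; each year is filled with a distinct tag value
# so that the origin of every block in a realization can be traced.
record = {1987 + i: np.full(8760, float(i)) for i in range(24)}

# 30 realizations of 50 years each, one seed per ensemble member.
ensemble = [bootstrap_years(record, n_years=50, seed=member)
            for member in range(30)]
```

Because whole years are drawn, within-year sequencing (seasonality, storm clustering) is preserved, but no realization can contain an hourly intensity higher than the observed maximum.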
Whilst some difference in the discharge, sediment yields, and landscape change will likely be observed between simulations over the 50-year timeframe, longer simulation times will make landscape changes more prominent. To overcome this limitation, 1500-year records were produced: for TowLoop and TowBoot this was done by repeating the 50-year record, and for all other products all 30 ensemble realizations were combined into one long record.
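The two ways of building a 1500-year record (30 × 50 years) can be sketched as below. The helper names are hypothetical, time is assumed to be the first array axis, and very short placeholder records stand in for the real 50-year series.

```python
import numpy as np

def extend_by_looping(record, n_repeats):
    """Repeat one record end to end along the time axis."""
    arr = np.asarray(record)
    return np.concatenate([arr] * n_repeats, axis=0)

def extend_by_chaining(members):
    """Join different ensemble members end to end along the time axis."""
    return np.concatenate([np.asarray(m) for m in members], axis=0)

steps = 4  # placeholder for the number of hourly steps in 50 years

# Looping: a single (lumped) record repeated 30 times.
looped = extend_by_looping(np.arange(steps), n_repeats=30)

# Chaining: 30 distinct gridded realizations, each tagged with its index.
members = [np.full((steps, 2, 2), float(i)) for i in range(30)]
chained = extend_by_chaining(members)
```

The looped record contains only the variability of its single 50-year block, whereas the chained record carries the variability of all 30 realizations; this difference is worth bearing in mind when comparing the long simulations.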
Finally, to ascertain the impact of the extra variability afforded by the longer observation record with the Tow gauge, a further test was performed with normalized rainfall means and spatial representation. This was done using TowLoop, and the first members of the sNIM and sTBR ensembles. The two spatially distributed products were first lumped across the domain and then TowLoop and sNIM were normalized to the mean rainfall of sTBR. These three inputs were then used to drive a 50-year simulation. The peak rainfall intensities in these records were 13.7 mm h⁻¹ for TowLoop (adjusted from 17.5 mm h⁻¹), 6.0 mm h⁻¹ for sNIM (adjusted from 5.0 mm h⁻¹), and 6.9 mm h⁻¹ for sTBR.
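The lumping and mean-normalization steps can be sketched as follows, assuming the normalization is a uniform multiplicative rescaling (the text does not state the exact method). Function names and the synthetic series are hypothetical; only the peak intensities used for the two scale factors at the end come from the text.

```python
import numpy as np

def lump(field):
    """Spatially average a (time, rows, cols) field to a catchment-mean series."""
    return np.asarray(field, dtype=float).mean(axis=(1, 2))

def normalise_to_mean(series, target_mean):
    """Rescale a series by one constant factor so its mean equals target_mean."""
    arr = np.asarray(series, dtype=float)
    return arr * (target_mean / arr.mean())

# Synthetic stand-ins for one member of each product.
rng = np.random.default_rng(1)
stbr = lump(rng.gamma(0.5, 0.4, size=(1000, 3, 3)))
snim = lump(rng.gamma(0.5, 0.3, size=(1000, 3, 3)))
tow = rng.gamma(0.2, 1.5, size=1000)  # already a single (uniform) series

target = stbr.mean()
tow_norm = normalise_to_mean(tow, target)
snim_norm = normalise_to_mean(snim, target)

# Scale factors implied by the peak intensities quoted in the text,
# valid only if the rescaling was a single constant factor per product.
tow_factor = 13.7 / 17.5   # about 0.78: Tow intensities scaled down
snim_factor = 6.0 / 5.0    # 1.2: sNIM intensities scaled up
```

Under this assumption, a constant factor equalizes the means but leaves the shape of each distribution untouched, so the normalized Tow series keeps its relatively heavier upper tail.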

Evaluation of STREAP Products
STREAP's ability to reproduce the hourly areal rainfall was examined (Figure 4). The areal rainfall was well reproduced for the Tow rain gauge (Figure 4a). The simulated rainfall for the distributed rainfall products (gridded and radar, Figures 4b and c) is in general also well reproduced, but the high rainfall intensities (i.e. larger than the 95th percentiles) are underestimated. It is worth highlighting this underestimation, as it is these high rainfall intensities that can result in significant geomorphic activity, and subsequently sediment yields may be underestimated. The simulated spatial correlation of rainfall is slightly higher than observed, but the differences are not significant (see Figure A5 in the online Supporting Information B). More details from the evaluation, such as the comparison of the annual areal rainfall variability between observed and simulated products, can be found in the online Supporting Information B.

LEM rainfall-discharge-sediment cascade
The cumulative rainfall, river discharge, and sediment volumes for all the products are shown in Figure 5. The cascade from the rainfall (Figure 5a) to the discharge (Figure 5b) was observed to be broadly linear, as observed previously (Coulthard et al., 2012b). Each product showed a small increase in volume in the conversion from rainfall to discharge that was caused by the setting of a low flow threshold in the CAESAR-Lisflood model (the model assumes that at discharges below the threshold, the erosion and transport of sediment is negligible and increases the model timestep to improve efficiency, assuming hydrology is running in a steady state). The spread of the products, expressed as the range of values as a percentage of the mean of the members, was similar for both rainfall and discharge, with the spread for sTowMer slightly reducing in the rainfall to discharge cascade (10.7 to 8.7%).
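The spread measure used here, the range of the ensemble members expressed as a percentage of their mean, can be written as a short function. The function name and the example totals are hypothetical.

```python
import numpy as np

def ensemble_spread_percent(member_totals):
    """Range of the member totals as a percentage of the ensemble mean."""
    totals = np.asarray(member_totals, dtype=float)
    return 100.0 * (totals.max() - totals.min()) / totals.mean()

# Three illustrative cumulative totals (arbitrary units).
spread = ensemble_spread_percent([9.0, 10.0, 11.0])
```

For these illustrative totals the range is 2 and the mean is 10, giving a spread of 20%. Because the measure depends on the two most extreme members, it is sensitive to single outlying realizations.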
TowLoop, TowBoot, and sTowMer showed much greater rainfall and discharge volumes (Figures 5a and b), and, as expected, TowLoop and the mean of the gauge ensembles were similar (rainfall = 1.6 × 10¹⁰ m³; discharge = 2.0 × 10¹⁰ m³). sTBR (mean of ensemble = 1.2 × 10¹⁰ and 1.4 × 10¹⁰ m³) and sNIM (mean of ensemble = 9.6 × 10⁹ and 1.2 × 10¹⁰ m³) showed lower volumes, with sNIM the lowest. Despite being based on different observations of the same rainfall, the ensembles of the two products showed no overlap after the first few years of simulation. sTowMer (mean of ensemble = 1.4 × 10¹⁰ and 1.5 × 10¹⁰ m³) estimated greater volumes than all the other gridded products, but less than TowLoop and TowBoot.
The sediment yields produced using each rainfall product (Figure 5c) were more varied and complex than the hydrological response (Figure 5b). The same pattern is seen, however, with the Tow-based products having produced the most sediment yield (TowLoop = 1.7 × 10⁶ m³), and sNIM the least (mean of ensemble = 1.5 × 10⁵ m³). Again, there was no overlap between the products after the first few years. sTowMer had a mean cumulative sediment yield (7.3 × 10⁵ m³) over four times greater than sTBR and sNIM yet lower than TowBoot; there was no overlap in range after the first few years.
In all instances the spread of ensembles, as a percentage of the ensemble mean, was greater for sediment yields than for discharges, with the greatest increase seen in sTowMer (discharge = 8.5%; sediment yield = 28.8%). The spread of ensembles was lower for products that did not use observations from the Tow rain gauge as part of the calibration.

Changes to basin geomorphology
Changes to basin geomorphology were assessed using the longer 1500-year simulations. Figure 6 shows the change in mean elevations for different stream order sub-basins when the model was driven by each of the rainfall products. The difference in the scale of changes between the products is evident, with TowLoop and TowBoot showing the greatest change. sTowMer also shows greater changes than sNIM and sTBR. By normalizing the changes as a proportion of the total changes, the different spatial patterns between the products become clearer: TowLoop and TowBoot show a dominance of change (erosion) in the third-order region of the catchment, and also greater changes in the fifth-order region, with a relatively low proportion of change in the fourth-order region. This is in contrast to sNIM and sTBR, which both show a much greater proportion of change in the fourth-order region, a similar level to changes in the third-order region, and less relative change in the fifth-order region. sTowMer shows a mixture of the two patterns, with third-order dominant, but more change in the fourth than the fifth.
Table II shows statistics drawn from elevation and volume changes at the individual cell level. The same differences observed in Figure 6 are also evident here, with all values greater for the products using the Tow rain gauge record. The mean elevation change across all cells appears low compared to the values seen in Figure 6, but this indicates that high erosion in the upper parts of the catchment is offset by deposition in the fifth-order region, which is a larger area.
TABLE I (excerpt). Merged (sTowMer): STREAP calibrated using intensity information from the gridded TBR record (6 years), spatial information from the NIMROD record (6 years), and climatic variability information from the Tow gauge (24 years) to generate an ensemble of 30 individual 50-year realizations, applied at 1 h/1 km resolution.
By looking at the proportion of the overall mean elevation change compared to the total erosion across the catchment, it is possible to see what proportion of eroded sediment is lost from the system at the catchment outlet; this is greatest for the two products containing the information from the Tow rain gauge. Figure 7 shows the river profiles after 1500 years of simulation, and the changes from the initial river profile. The difference in the scale of the changes is evident, as with Figure 6 and Table II; however, there is a difference in the transition zone where the dominant change goes from erosion to deposition downstream. For sNIM and sTBR, the transition occurs around 7000 m downstream, after 7300 m for sTowMer, and 8000 m for TowLoop. The level of deposition tails off dramatically towards the catchment outlet, with some minor erosion at the edge of the model; this is possibly a result of a model parameter (which sets a fixed hydraulic slope at the edge of the domain).
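One way to express the proportion of eroded sediment leaving the catchment is the net loss divided by the total erosion, computed from a grid of cell elevation changes. The sketch below is an interpretation of the description above, not the exact calculation behind Table II; it assumes equal-area cells (so that cell area cancels in the ratio), and the function name and values are hypothetical.

```python
import numpy as np

def exported_fraction(elevation_change):
    """Fraction of eroded material that is not redeposited within the grid.

    `elevation_change` holds final minus initial elevation per cell (m);
    negative values are erosion and positive values are deposition.
    """
    dz = np.asarray(elevation_change, dtype=float)
    eroded = -dz[dz < 0].sum()
    deposited = dz[dz > 0].sum()
    if eroded == 0:
        return float("nan")  # nothing eroded, ratio undefined
    return (eroded - deposited) / eroded

# Illustrative grid: 2.0 m of erosion in total, 1.4 m redeposited downstream.
dz = np.array([[-1.0, -1.0],
               [0.5, 0.9]])
fraction = exported_fraction(dz)
```

In this toy grid 30% of the eroded material leaves the domain. A value near 1 would mean almost everything eroded reaches the outlet, and a value near 0 would mean almost everything is stored within the catchment.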

Differences between products
The sensitivity of the CAESAR-Lisflood model to different rainfall products was assessed using the characteristics of different rainfall observation products to calibrate a weather generator, then using ensembles of generated rainfall as an input for CAESAR-Lisflood. Differences in rainfall products occur as each product observes rainfall differently, with different abilities and spatial and temporal coverage. We have not assessed CAESAR-Lisflood's sensitivity to uncertainties associated with individual products, such as observation error in rain gauges or interpolation error in rain gauge networks, although many of the observations made here would apply.
Our simulations show that the variation of rainfall rate had the largest influence on landscape evolution in the tests. The Tow rain gauge had the greatest variability in rainfall of the products, and the simulations that used rainfall derived from the Tow rain gauge consequently showed the greatest sediment yields and changes in elevation. The difference in rainfall variability between TowLoop, sNIM, and sTBR was isolated as a factor by spatially averaging a single ensemble member from sNIM and sTBR to a catchment average, and standardizing the rainfall means of each. This produced cumulative discharges where TowLoop was 2% greater than sNIM and 4% greater than sTBR, yet cumulative sediment yields were 184% greater than sNIM and 273% greater than sTBR. Even when normalized, the Tow record contained higher rainfall intensities in the upper quantiles of events compared to the other two products, and it is these rare, high-intensity events that drive much geomorphic activity. The normalized TowLoop produced a sediment yield of 59% of the original. Rainfall variability, and by extension discharge variability, is important to erosion rates and changes to basin geomorphology (Deal et al., 2017; Scherler et al., 2017), and here it has been identified as a key sensitivity and source of uncertainty in LEMs. The spatial representation of the rainfall is also important here, as the most extreme values of rainfall are often associated with convective storms that exist at a smaller spatial scale than the catchment. When using the Tow gauge, there is no spatial representation, so any observation of a storm is applied to the whole catchment, yet for the other products the peak of the storm is smoothed across the catchment. Typically, applying a spatially distributed rainfall record at a coarser resolution (spatially or temporally) results in a decrease in simulated sediment yields (Coulthard and Skinner, 2016; Battista et al., 2019).
Ultimately, there is no reliable method of directly observing all characteristics of rainfall. No single product can be said to be correct, or even more correct than another. Rather, each product has its strengths and deficiencies that must be balanced by the user, often through combining the best elements of each. This is what this study achieves with the calibration of the sTowMer product, taking intensity from the rain gauge network, spatial and temporal coverage from the radar, and finally conditioning the distribution of events using the longer record afforded by the single gauge. This is by no means a novel approach for producing rainfall estimates, but its application to a LEM is. The merged product produced outputs that were between those from the gauge ensemble and those of the other two products, and showed greater variability than sTBR and sNIM. Consequently, the additional variability from the Tow gauge allowed for higher-intensity upper-quantile events and increased discharges and sediment yields, yet the better representation of the rainfall gradient across the catchment allowed by the information from the NIMROD and TBR network removed the spatial bias that is seen in the Tow-only products (bringing the total closer to that produced by the rainfall-mean-normalized TowLoop).

Geomorphic multiplier
The term 'geomorphic multiplier' was coined in Coulthard et al. (2012b) to describe the amplifying effect a catchment has on the signal moving from rainfall to discharge to sediment yield. For example, in Coulthard et al. (2012b) the predicted increase in rainfall by 1.28 times for a single extreme event resulted in estimated sediment yields over five times greater. Sediment response to flow variation is complex, non-linear, and difficult to predict (Gomez and Church, 1989; Cudden and Hoey, 2003; Coulthard et al., 2007; Van De Wiel and Coulthard, 2010). Field and laboratory observations have been used to develop sediment transport formulas in an attempt to quantify the relationship between flow velocities, shear stress, and transport initiation, and they often feature a cubic term linking changes in flow velocity to transport initiation. In the model itself, the geomorphic multiplier is expressed through the application of these physically derived equations (in this case, Wilcock and Crowe, 2003). The importance of this is clearly illustrated in our findings, where geomorphic responses to any differences between products are amplified. Effectively, as part of an uncertainty cascade, the geomorphic multiplier magnifies any upstream uncertainty before uncertainty associated with the model (e.g. parameter uncertainty) is accounted for, increasing overall sensitivity. This is evident in the results, where the differences in the rainfall transfer broadly linearly to the discharge yet result in much greater, non-linear, differences in the sediment yields (Figure 5). For example, both TowLoop and TowBoot produced cumulative discharge totals that were 150% greater than sTBR, yet cumulative sediment yields that were 1000% greater. A large difference was also observed in the scale of changes to the basin geomorphology (Figure 6).
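To give a feel for the size of this amplification, the sketch below back-calculates the power-law exponent that would be needed to turn a 1.28-fold rainfall increase into a five-fold sediment increase, and contrasts it with a purely cubic response. The single power law is an illustrative assumption and is not the transport formulation used in the model; the function name is hypothetical.

```python
import math

def implied_exponent(driver_ratio, response_ratio):
    """Exponent k such that response_ratio = driver_ratio ** k."""
    return math.log(response_ratio) / math.log(driver_ratio)

# Figures quoted from Coulthard et al. (2012b): rainfall x1.28, sediment x5
# (the text says "over five times", so this is a lower bound on k).
k = implied_exponent(1.28, 5.0)

# For comparison, a purely cubic dependence on a 10% velocity increase.
cubic_response = 1.10 ** 3
```

Under this assumption the implied exponent is roughly 6.5, well above three, which suggests that the catchment-scale multiplier reflects more than the cubic term in a transport formula alone (thresholds for transport initiation being one plausible contributor).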

Spatial divergence in geomorphic response and equifinality
Some authors have discussed the possibility of a 'geomorphic equifinality' within LEMs (e.g. Hancock et al., 2016), suggesting that landscape evolution is dominantly driven by the topography of the landscape itself, and thus LEMs are robust to input uncertainties over long-term simulations (e.g. Willgoose et al., 1991a,b; Howard, 1994; Howard et al., 1994; Hancock et al., 2016). This concept would suggest that, regardless of differences in the rainfall products used here, given enough time, each would produce the same spatial pattern of geomorphic change, albeit to different extents. However, Figure 7 shows that there are differences in the patterns of erosion and deposition, especially between the Tow-based products and those of sNIM and sTBR, particularly in the third-order areas of the catchment where TowLoop and sTowMer showed deep incision. One explanation would be that, given enough time, sNIM and sTBR would eventually match these levels of incision in third-order areas. For this to be the case, however, the proportion of eroded material leaving the catchment would be expected to be the same, yet this is 0.30 for both sNIM and sTBR, 0.49 for TowLoop, and 0.40 for sTowMer (Table II). The greater rainfall (volumes and intensities) provided by the climatic information in the Tow rain gauge produced greater stream powers within the catchment and consequently a greater proportion of the eroded sediment is removed from the system, causing the upstream area of erosion to extend further downstream. As sNIM and sTBR did not generate these greater stream powers, a greater proportion of the eroded sediment was deposited within the system, and this makes it unlikely that, given enough time, the river profiles will resemble those for TowLoop and sTowMer.
Therefore, our results should provoke concern: there are some clear and sometimes large differences (an order of magnitude) in the geomorphic response to different products, even over short timescales, which could potentially produce diverging landscapes over longer timescales than used in this study. Further work would be required to understand to what extent rainfall product differences influence the outputs of tests across different timeframes.

LEMs and climate change studies
The use of a rainfall generator model in combination with a LEM presents some intriguing benefits going forward that have not been explored here. The calibrations of a rainfall generator model are based on observations and assumed to represent the present climate. By altering the calibration parameters, including altering the spatial structure of rainfall and locations of convective events within catchments, possible future climates can be generated (Peleg et al., 2019, 2020a). This has great potential for modelling future climate scenarios, where changes to the type of rainfall an area receives are likely to occur. For example, in the UK, climate predictions suggest that the frequency and intensity of extreme convective events are likely to increase disproportionately to changes in mean annual rainfall (Fowler and Ekström, 2009), and in Coulthard et al. (2012b) these changes were shown to have a non-linear impact on simulated sediment yields. However, as the version of CAESAR applied in that study could only utilize catchment average rainfall values, the impact is possibly underestimated, as representing convective events in this way smooths out local rainfall intensities and this has been shown to influence sediment yields (Coulthard and Skinner, 2016). This combination of a gridded high-resolution rainfall generator model and LEM could be used to investigate the impacts of convective events on catchments at high spatial and temporal resolution.
However, caution is needed in such applications. As in Coulthard et al. (2012b), we urge the geomorphic community to embrace probabilistic methods, even for exploratory use. Owing to the uncertain nature of rainfall estimation and the propagation of this uncertainty into downstream applications, relevant fields, such as meteorology and hydrology, predominantly use probabilistic methods. Instead of representing rainfall as a single deterministic input, an ensemble of unique, yet equally probable, representations of rainfall is used. These are often combined with downstream models that also use probabilistic methods to represent model uncertainty (e.g. parameter uncertainty), providing a large ensemble output covering the range of probable outcomes. Both Coulthard et al. (2012b) and this study made use of rainfall ensembles but have not applied them to a probabilistic representation of the model for a full uncertainty cascade. The framework outlined by Pappenberger et al. (2005) could be adapted/expanded to do this.
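The structure of such an uncertainty cascade can be sketched in a few lines. The example below is a toy illustration only: the power-law function stands in for a full LEM run, the lognormal rainfall members stand in for a rainfall generator ensemble, and all names, distributions, and parameter ranges are hypothetical assumptions rather than values from this study or from Pappenberger et al. (2005).

```python
import random
import statistics


def toy_sediment_yield(rainfall_total: float, exponent: float) -> float:
    """Hypothetical stand-in for one LEM run (a simple power law)."""
    return rainfall_total ** exponent


def run_cascade(n_rain: int = 50, n_param: int = 20, seed: int = 0) -> list:
    """Pair every rainfall member with every parameter sample."""
    rng = random.Random(seed)
    # Equally probable rainfall realizations (assumed lognormal spread).
    rainfall_members = [rng.lognormvariate(0.0, 0.2) for _ in range(n_rain)]
    # Model parameter uncertainty (assumed uniform range for the exponent).
    exponents = [rng.uniform(2.0, 3.0) for _ in range(n_param)]
    return [
        toy_sediment_yield(rain, exponent)
        for rain in rainfall_members
        for exponent in exponents
    ]


yields = run_cascade()

# Summarize the ensemble output as a range of probable outcomes.
cuts = statistics.quantiles(yields, n=20)  # 19 cut points at 5% steps
p05, p50, p95 = cuts[0], statistics.median(yields), cuts[-1]
print(f"ensemble size: {len(yields)}")
print(f"5th percentile: {p05:.3f}, median: {p50:.3f}, 95th percentile: {p95:.3f}")
```

The design point is the nested pairing: input uncertainty (rainfall members) and model uncertainty (parameter samples) are crossed, so the output spread reflects both sources rather than the rainfall ensemble alone.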

Wider implications
For the potential application of LEMs in forecasting, in a similar way to hydrological models at present, several issues need to be addressed. First, as discussed above, they currently do not make use of probabilistic methods to assess uncertainty. Second, the models display far greater sensitivity to the rainfall input with regard to the choice of product (this study), resolution of the data (Coulthard and Skinner, 2016), and changes to the storm structure (Peleg et al., 2020b). Third, geomorphic models retain a 'memory' of past events in a way hydrological models do not, where an erroneous change to the landscape is retained and will influence all future outputs. Fourth, there is a paucity of both metrics and observation data suitable for assessing the performance of the models (Tucker and Hancock, 2010). Developments in high-performance and cloud computing remove some of the barriers to probabilistic modelling using LEMs, yet it is likely that the remaining barriers are challenging enough to impede the application of LEMs in the same way as hydrological models for decades to come. The development of the next generation of LEMs should seek to address these issues and also focus on determining what information can be extracted from the models that is of interest to stakeholders and decision-makers and is also reliable (i.e. useful). Some previous studies provide examples of what useful information can be extracted (e.g. Lane et al., 2007; Coulthard et al., 2012b).

Conclusions
Numerical models of geomorphic processes have been shown to be sensitive to numerous factors, with these sensitivities often being more acute than in models that simulate hydraulic processes alone. Here, the sensitivity of model outputs to rainfall inputs has been assessed using rainfall observations produced by different methods, showing that whilst the differences between these products transfer linearly to the hydraulic outputs from the model, the differences produced in the sediment yields were non-linear, magnifying the differences between the inputs. This poses a problem for the user as it is not possible to state that any single product is correct, or even the most correct, with each having a different skill for observing different characteristics of rainfall. The use of probabilistic methods to represent rainfall in LEM studies should be common practice in order to account for these uncertainties, and future development of LEMs should focus on better handling the non-linear uncertainty cascade resulting from the use of deterministic sediment transport equations. The use of LEM and a rainfall generator in combination, with full accounting for uncertainties, has the potential to provide useful information relating to the impact climate change will have on the landscape, and consequently on society.