Corresponding author: A. C. V. Getirana, Hydrological Sciences Laboratory, NASA Goddard Space Flight Center, Greenbelt, MD 20771, USA. (email@example.com)
 This paper describes and evaluates a procedure that integrates radar altimetry data into the automatic calibration of large-scale flow routing schemes (LFRS). The Hydrological Modeling and Analysis Platform, coupled in off-line mode with the Interactions between Soil, Biosphere, and Atmosphere land surface model, is used to simulate daily surface water dynamics of the Amazon basin at a 0.25° spatial resolution. The Multiobjective Complex Evolution optimization algorithm is used to optimize one parameter (subsurface runoff time delay) and three parameter multiplier factors (Manning roughness coefficient for rivers, river width, and bankfull height) by minimizing two objective functions for the 2002 to 2006 period. Four calibration experiments are performed by combining water discharge observations and Envisat data to evaluate the potential of using radar altimetry in the automatic calibration of LFRS. One experiment is based on daily discharge observations, another combines discharge with altimetric data, and the other two are driven exclusively by radar altimetry data, at 16 or four virtual stations, depending on the experiment. The calibration process is validated against discharge observations at five gauging stations located on the main tributaries. This study shows the feasibility of calibrating LFRS using radar altimetry data. Results demonstrate that reasonable parameters can be obtained by using radar altimetry in an optimization procedure with competitive computational costs. However, there is evidence of equifinality among model parameters. Furthermore, the automatic calibration driven by altimetric data can reliably reproduce discharge time series, and significant improvements are noticed in simulated water level variations.
 Currently, no accurate global river geometry data set is available for hydrological studies. River width and bankfull height are usually parameterized globally via empirical mathematical formulations as functions of the upstream drainage area or water discharge at grid cells [e.g., Decharme et al., 2012; Yamazaki et al., 2011]. Other hydraulic characteristics such as the river bed roughness (frequently represented by the Manning roughness coefficient) are also unknown for most rivers in the world and must be guessed intelligently. Although efforts have been made toward refined estimates of hydrological variables by taking into account climatologically similar regions [e.g., Decharme et al., 2012], such coarse estimates are important sources of uncertainty in global water surface modeling studies. Recent studies on large-scale flow routing schemes (LFRS) have shown advances in representing backwater effects on water level and discharge, water storage in floodplains, and interactions between floodplains, soil, and atmosphere [e.g., Dadson et al., 2010; Decharme et al., 2012; Yamazaki et al., 2011]. Some of these studies, in addition to several previous ones [e.g., see Chow, 1988], demonstrate the high sensitivity of flow routing schemes to the river geometry and hydraulic coefficients.
 A common way to estimate model parameters is the automatic calibration based on optimization techniques. In the past decades, optimization techniques have been widely used in the calibration of lumped, semidistributed and distributed hydrological models at the mesoscale and regional scale. In most of these situations, model parameters are conceptual representations of abstract watershed characteristics and are simply determined through a trial-and-error process adjusting the parameter values to minimize the error between the model output and observed data [Gupta et al., 1998]. Water discharge time series have been traditionally assumed as the “truth” in model parameter calibration processes. This assumption assures, in most cases, satisfactory streamflow simulations.
 Recent advances in radar altimetry have improved the monitoring of river and lake water height variability in ungauged or poorly gauged regions [e.g., Koblinsky et al., 1993; Birkett, 2000; Calmant et al., 2008; Roux et al., 2010]. The accuracy of radar altimetry has motivated the application of these data for (1) estimating discharge in poorly gauged basins based on rating curve fitting and river bed slope estimates [e.g., León et al., 2006; Getirana and Peters-Lidard, 2012; Michailovsky et al., 2012] and (2) evaluating hydrodynamic [e.g., Wilson et al., 2007] and hydrological [e.g., Coe et al., 2008; Getirana et al., 2010] models. The next and most promising step for spatial altimetry technology is the Surface Water and Ocean Topography (SWOT) mission [Durand et al., 2010], planned to be launched within the decade. SWOT will measure water elevation with a spatial resolution on the order of 100 m, with two to four revisits at low latitudes to midlatitudes and up to 10 revisits at high latitudes per 22 day orbit repeat period. In this sense, efforts have been made toward the improvement of model parameter estimation techniques based on assimilation [e.g., Pereira-Cardenal et al., 2011] and optimization techniques [e.g., Getirana, 2010]. These efforts have mainly addressed mesoscale and regional scale models. To date, the combination of optimization techniques and radar altimetry has been little explored, especially with LFRS. Indeed, as a general rule, LFRS are parameterized based on the little river geometry information available and by evaluating maximum likelihood functions that measure the “closeness” of model outputs to in situ or satellite-based observations. However, the potentially readily available and massive quantity of altimetric data and the successful results obtained by previous studies using these data raise the question of whether similar methodologies could be used to drive LFRS and to represent spatial and temporal surface water fluxes consistently.
 A previous study demonstrated the viability of integrating spatial altimetry data into the automatic calibration of rainfall-runoff models [Getirana, 2010] used to simulate the Branco River basin in the Amazon basin. A robust postprocessing approach was used to convert simulated discharges into water depths by using stage-discharge relationships (also known as rating curves) at the catchment scale. Eight parameters related to vertical water and energy fluxes were automatically calibrated. This study adopts a different approach, evaluating the feasibility of using altimetric data in the automatic parameterization of LFRS, while assuming both the meteorological forcings and the default land surface model (LSM) parameterization as the truth in the simulation of vertical water and energy fluxes. The Hydrological Modeling and Analysis Platform (HyMAP) [Getirana et al., 2012] and the Multiobjective Complex Evolution (MOCOM-UA) [Yapo et al., 1998] multicriteria global optimization algorithm are considered in this context. As both HyMAP and MOCOM-UA have been comprehensively described and discussed in previous papers, sensitivity analyses evaluating the different options of these tools are not provided in this study. However, it must be mentioned that HyMAP parameters have been set for the Amazon basin in a previous study [Getirana et al., 2012] based on expert knowledge and observed data. As described later in this paper, this parameter set is used as the basis for constructing the optimization experiments.
 The objective functions (OFs) derived from radar altimetry data are adapted to remove systematic biases between simulated water levels and altimetric data. The performance of the calibration procedure is evaluated considering the following criteria, as suggested by Sorooshian et al. : (1) the variation of parameter estimates as a function of the different data sets used in the calibration procedure; (2) the reliability of water discharge forecasts obtained using these parameter estimates; and (3) the conceptual meaning of parameter estimates.
 The Amazon basin has been selected as the study area. This choice has been motivated by the wide collection of altimetric data recently made freely available on the web [Crétaux et al., 2011]. In addition, a more detailed evaluation can be performed at the regional scale before extending the proposed methodology over the globe.
 Four experiments are performed, differing from each other according to the data set used in the calibration procedure. The data sets are composed of daily discharge observations at four gauging stations and Envisat data at four or 16 virtual stations (VSs; a VS represents the intersection between the open water and the satellite ground tracks) along the Amazon River.
 This paper is organized into four sections. Section 2 presents the materials and methods used in this study, including a brief description of the Envisat altimetric data set, the main aspects of the HyMAP global flow routing scheme, and the MOCOM-UA global optimization algorithm. Results of the calibration experiments using different data sources (altimetric data and water discharge) are presented, compared, and discussed in section 3. Finally, section 4 details the conclusions and next steps of this study.
2. Data Sets and Methods
 HyMAP is a global-scale flow routing scheme specially designed to be coupled with any LSM. The model is based on the Catchment-based Macroscale Floodplain (CaMa-Flood) model [Yamazaki et al., 2011] and Interactions between the Soil, Biosphere and Atmosphere (ISBA)-Total Runoff Integrating Pathways (TRIP) [Decharme et al., 2012] with improvements in the surface and subsurface runoff time delays, floodplain dynamics, and evaporation from surface water. HyMAP simulates water level, discharge, and storage in rivers and floodplains at a spatial resolution of 0.25° and at a daily time step. For this study, the internal computational time step was set to 3 h. The surface and subsurface runoffs generated by an LSM are routed using a kinematic wave formulation through a prescribed river network to oceans or inland seas. The model is composed of four modules accounting for (1) the surface and subsurface runoff time delays, (2) flow routing in river channels, (3) flow routing in floodplains, and (4) evaporation from open water surfaces. HyMAP is fully described in Getirana et al. [2012], and only its main features are presented in the Appendix of this paper. A multiobjective optimization algorithm has been implemented in the platform in order to calibrate model parameters.
2.2. The ISBA Land Surface Model
 In this study, HyMAP is forced by outputs provided by the ISBA [Noilhan and Mahfouf, 1996]. In terms of hydrology, the three-layer force restore approach is used for the soil [Boone et al., 1999], and the subgrid runoff is parameterized following Habets et al. . The surface and subsurface runoffs derived from ISBA at the daily time step are used as inputs in HyMAP. In addition, some meteorological forcings and the actual evapotranspiration provided by the ISBA are also needed to calculate evaporation from floodplains. The subsurface runoff represents the gravitational drainage.
 The evaporation is computed using a standard resistance analog between the surface and a reference atmospheric level with contributions from transpiration, bare soil, and intercepted water. A single bulk surface energy budget temperature is used together with standard surface layer atmospheric stability corrections based on similarity theory in order to resolve the daily cycle [for further details, see Noilhan and Mahfouf, 1996].
2.3. MOCOM-UA Algorithm
 The MOCOM-UA is a global multiobjective optimization algorithm. It is based on the SCE-UA single-criterion optimization algorithm adapted for multiobjective problems. It provides an effective and efficient distribution of solutions on the Pareto optimum space [Boyle et al., 2000]. Its main advantage is the requirement of only one coefficient to be defined: the set (or population) ns of points randomly distributed within the parameter hyperspace defined by the n-dimensional feasible parameter space. The population of ns points is ranked and sorted according to a Pareto ranking procedure for each iteration, as suggested by Goldberg . A multicriteria version of the downhill simplex method is used to evolve each simplex in a multiobjective improvement direction [Boyle et al., 2000]. The optimization process stops when all ns points are ranked evenly. This means that the entire population converged toward the Pareto optimum. For further details about the MOCOM-UA algorithm, descriptive papers can be found in the literature [e.g., Yapo et al., 1998; Boyle et al., 2000].
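The Pareto ranking step and the stopping rule described above can be illustrated with a minimal sketch. It is not the MOCOM-UA implementation: the simplex evolution is omitted, the function names are hypothetical, and it assumes that both OFs are to be maximized (as for Nash-Sutcliffe-type coefficients, where 1 is optimal); a minimization problem is handled the same way after changing the sign of the OFs.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b in every OF and strictly better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_rank(points: List[Sequence[float]]) -> List[int]:
    """Rank 1 = nondominated points; remove them, rank the next front 2, and so on."""
    ranks = [0] * len(points)
    remaining = set(range(len(points)))
    current = 1
    while remaining:
        front = {
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        }
        for i in front:
            ranks[i] = current
        remaining -= front
        current += 1
    return ranks


def converged(points: List[Sequence[float]]) -> bool:
    """Stopping rule: every point of the population shares rank 1."""
    return all(rank == 1 for rank in pareto_rank(points))


# Hypothetical (OF1, OF2) values for a population of four points.
population = [(0.70, 0.60), (0.65, 0.66), (0.60, 0.55), (0.50, 0.50)]
ranks = pareto_rank(population)  # [1, 1, 2, 3]
```

In this toy population the first two points trade off one OF against the other and therefore share rank 1, so the population has not yet converged.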
2.4. Meteorological Forcings
 The meteorological data set used to force ISBA is provided by Princeton University at a 3 h time step and a 1° resolution [Sheffield et al., 2006]. This data set is based on the National Centers for Environmental Prediction-National Center for Atmospheric Research (NCEP-NCAR) reanalysis. Sheffield et al. [2006] carried out corrections of the systematic biases in the 6 h NCEP-NCAR reanalysis via hybridization with global monthly gridded observations. In addition, the precipitation was disaggregated in both space and time at 1° resolution via statistical downscaling and at 3 h time step using information from the 3 h Tropical Rainfall Measuring Mission data set. The 3 h precipitation from Sheffield et al. [2006] is then corrected to match the monthly value from the Global Precipitation Climatology Centre Full Data Product V4, as described in Decharme et al. [2012].
2.5. In Situ Data
 Water discharge observations at nine gauging stations operated by the Brazilian Water Agency (Agência Nacional de Águas (ANA)) are considered in the automatic calibration and validation procedures. Drainage areas vary from 165,501 to 4,688,170 km2 (see detailed list of gauging stations in Table 1), and all nine stations are used to validate the model. Only four gauging stations located along the Amazon River (Tabatinga, Sto Antonio do Iça, Manacapuru, and Óbidos) are used in the automatic calibration. In addition, cross-sectional information (river width, bankfull height, and wet area) at these four stations is compared with the optimization estimates.
Table 1. Gauging Stations Considered for the Automatic Calibration and Model Evaluation Steps^a
Station | Station ID (ANA) | Drainage Area (km2) | Weight, W (%) | Mean Discharge (m3 s−1)
Faz. Vista Alegre
Sto Antonio do Iça
^a Tabatinga, Sto Antonio do Iça, Manacapuru and Óbidos stations were used in the calibration experiments QQ and QH.
2.6. Radar Altimetry Data
 Data provided by the altimeter onboard the Envisat satellite are considered in this study. Envisat has a 35 day repeat period (the duration of the orbital cycle), covers latitudes from 81.5°N to 81.5°S, and has a 70 km intertrack spacing at the equator. Its beam footprint width is about 3.5 km. Time series used in this study result from a signal selection based on a fixed-size window at a VS, which is the location where radar satellite ground tracks transect open water surfaces. The water height at a VS is computed as the average of all signals selected within the window during an orbital cycle. The ranges used in this study are those issued by the ICE-1 algorithm [Bamber, 1994]. Errors in altimetric time series along rivers within the Amazon basin are on the order of tens of centimeters. Envisat data are freely available on Hydroweb (available at http://www.legos.obs-mip.fr/soa/hydrologie/hydroweb). Readers should refer to Calmant et al. [2008] for in-depth information related to the use of altimetry for continental waters. Altimetric data at 16 VSs along the Amazon River, from 2002 to 2006, are considered in this study (see Figure 1 and Table 2). The VSs provide time series with 34–41 altimetric observations for the study period, depending on the track.
Table 2. Virtual Stations Considered for the Automatic Calibration^a
Station ID (Hydroweb) | Drainage Area (km2) | Weight in the Objective Function (%) | Number of Cycles (from 2002 to 2006)
^a Virtual stations vs1, vs7, vs11, and vs16 were used to calibrate the model in the experiment HH4.
2.7. Calibration Experiments
 HyMAP was automatically calibrated for the period from January 2002 to December 2006 using the MOCOM-UA algorithm, which is implemented within the modeling platform. Four calibration experiments have been performed, differing from each other according to the data sets used to drive the optimization algorithm. These data sets are composed of (1) daily observed water discharges at four gauging stations (Tabatinga, Sto Antonio do Iça, Manacapuru, and Óbidos stations) and (2) spatial altimetry data at 16 VSs. A third data set (3) composed of only four VSs derived from (2) is also considered with the objective of checking the viability of an automatic calibration with a reduced amount of altimetric data. These four VSs have been selected with the objective of keeping an equal distance between two VSs. All of the stations are located along the Amazon River.
 Two OFs have been used in each experiment. To evaluate the influence of altimetric data on model results, experiments have the OFs derived (1) both from data set 1 (water discharge at four gauging stations), called experiment QQ hereafter, (2) one from data set 1 and the other one from data set 2 (altimetric data at 16 VSs), called experiment QH, (3) both from data set 2 (experiment HH), and (4) both from data set 3 (altimetric data at four VSs), called experiment HH4. QQ represents the standard calibration, i.e., OFs are functions of in situ data, QH allows one to identify the gains of combining both water discharge and altimetric data, HH represents a calibration completely free of in situ observations, and HH4 allows one to evaluate how a reduced amount of altimetric data influences discharge simulations. An overview of the calibration experiments performed in this study is given in Table 3.
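Table 3. Overview of the Calibration Experiments
Experiment | Data Sets | Objective Functions
QQ | Water discharge at four gauging stations | NSQ and LNSQ
QH | Water discharge at four gauging stations and altimetric data at 16 virtual stations | NSQ and NSAH
HH | Altimetric data at 16 virtual stations | WR and NSAH
HH4 | Altimetric data at four virtual stations | WR and NSAH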
 At each iteration of the optimization process, OFs were computed as the weighted sum of the performance coefficients at the stations (virtual or gauge, depending on the OF) and can be generally represented as follows:
$$OF = \sum_{k=1}^{n} W_k \, f_k(t, S, O) \qquad (1)$$
where f is a function of the time step t and the simulated (S) and observed (O) signals. W_k is the weight attributed to each gauging or virtual station k, which, in this paper, has been defined as a function of the drainage area (see Tables 1 and 2). n is the total number of stations considered in each experiment. f is represented by different performance coefficients, which are selected as a function of the experiment. For experiment QQ, the Nash-Sutcliffe (NS) coefficient for discharges (NSQ) and the NS of the logarithm of discharges (LNSQ) have been considered as functions f. To avoid systematic biases between simulated water levels and altimetric data (these are mostly due to the digital elevation model (DEM) used to derive river bed heights), experiments driven by altimetric data were evaluated with the NS of unbiased water levels (NSAH) and the weighted determination coefficient (R2) for water levels (WR). Not considering biases between simulated and observed water levels means that constant biases eventually present in the forcings are neglected during the optimization process. The four f functions are represented by equations (2)-(5):
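The four coefficients and the weighted sum can be sketched as follows. This is a hedged illustration under stated assumptions, not the paper's exact formulation: NSQ and LNSQ use the standard Nash-Sutcliffe form, NSAH is taken here as the Nash-Sutcliffe coefficient of the anomalies (each series minus its own mean), and WR is taken as a commonly used weighted R2 in which R2 is multiplied by |α| when |α| ≤ 1 and divided by |α| otherwise, with α the slope of the regression of simulated on observed values. All function names are hypothetical.

```python
import math
from typing import Sequence


def mean(x: Sequence[float]) -> float:
    return sum(x) / len(x)


def nash_sutcliffe(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Standard NS: 1 - sum((O - S)^2) / sum((O - mean(O))^2)."""
    mo = mean(obs)
    num = sum((o - s) ** 2 for s, o in zip(sim, obs))
    den = sum((o - mo) ** 2 for o in obs)
    return 1.0 - num / den


def log_nash_sutcliffe(sim: Sequence[float], obs: Sequence[float]) -> float:
    """NS of the logarithms (values must be strictly positive)."""
    return nash_sutcliffe([math.log(s) for s in sim], [math.log(o) for o in obs])


def unbiased_nash_sutcliffe(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Assumed NSAH form: NS computed on anomalies, so a constant bias is ignored."""
    ms, mo = mean(sim), mean(obs)
    return nash_sutcliffe([s - ms for s in sim], [o - mo for o in obs])


def weighted_r2(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Assumed WR form: R2 weighted by the regression slope alpha (sim on obs)."""
    ms, mo = mean(sim), mean(obs)
    cov = sum((o - mo) * (s - ms) for s, o in zip(sim, obs))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    r2 = cov ** 2 / (var_o * var_s)
    alpha = abs(cov / var_o)
    return alpha * r2 if alpha <= 1.0 else r2 / alpha


def objective(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum over stations; weights are fractions that sum to 1."""
    return sum(w * f for w, f in zip(weights, scores))


# Hypothetical water levels with a constant 5 m bias (e.g., a DEM offset).
obs = [1.0, 2.0, 3.0, 4.0]
sim = [o + 5.0 for o in obs]
```

With this constant offset, the standard coefficient is strongly penalized while the anomaly-based one is not, which is the motivation given above for using unbiased water levels.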
 In the above equations, nt is the total number of days with observed data, and $\bar{O}$ and $\bar{S}$ are the respective mean values of the target and simulated signals for the entire period. R2 and α are the determination coefficient and the slope derived from a linear regression between simulated and observed signals, respectively. NSQ, LNSQ, and NSAH range from −∞ to 1, where 1 is the optimal case, and WR ranges from −1 to 1, where 1 is the best value. R2 is given as
$$R^2 = \frac{\left[\sum_{t=1}^{nt} (O_t - \bar{O})(S_t - \bar{S})\right]^2}{\sum_{t=1}^{nt} (O_t - \bar{O})^2 \, \sum_{t=1}^{nt} (S_t - \bar{S})^2} \qquad (6)$$
 In this study, relatively large parameter domains have been defined with the purpose of making the automatic calibration an impartial process. The first guess (the initial parameter set) has been set as far from reliable parameter values as possible (see Table 4). It has been demonstrated that higher ns values (200 or higher) used to explore optimal parameter sets within the hyperdomain can provide better Pareto solutions [Yapo et al., 1998]. However, as the objective of this study is to evaluate the potential of using radar altimetry data in the calibration process of LFRS rather than the optimization algorithm itself, ns has been fixed as 100.
Table 4. Model Parameters Subjected to the Automatic Calibration^a
Parameter | Description | First Guess | Domain
Tb | Subsurface runoff time delay (days) | |
H | River bankfull height (m) | 2.0 × HParDef | [0.1–2.5] × HParDef
W | River width (m) | 2.0 × WParDef | [0.1–2.5] × WParDef
nr | Manning coefficient for rivers | 0.5 × nParDef | [0.25–2.5] × nParDef
^a Except for Tb, the values defined for first guess and domain are the product of the default parameter set (ParDef) and a multiplier.
 The calibration experiments have been evaluated qualitatively, by means of visual inspection of observed and simulated hydrographs, and quantitatively, through the analysis of performance coefficients for discharges during a 5 year calibration period, limited to 2002 to 2006 by Envisat data availability. Results have also been compared against a simulation using the default parameter set [Getirana et al., 2012], as presented in the Appendix. The model validation has been performed for the 1997 to 2001 period. Results were analyzed at the nine gauging stations (including the four stations used in the optimization experiments) and at the 16 VSs. Four parameters are calibrated: the subsurface runoff time delay (Tb), Manning roughness coefficient for rivers (nr), river width (W), and bankfull height (H). The default parameter set and the simulation outputs derived from it are referred to as ParDef and Def hereafter. Except for Tb, all the parameters are heterogeneously distributed in space. Thus, to reduce the computational cost, parameter sets are represented by the product of ParDef and a spatially uniform multiplier. In this sense, the optimization procedure calibrates Tb directly and the other three parameters through their multiplier factors.
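The multiplier approach can be illustrated with a short sketch: each spatially distributed default field is scaled by a single factor, so the optimizer searches over a few numbers rather than one value per grid cell. The field values and the function name below are hypothetical; the multipliers are the first-guess factors listed in Table 4.

```python
from typing import Dict, List


def apply_multipliers(
    par_def: Dict[str, List[float]], multipliers: Dict[str, float]
) -> Dict[str, List[float]]:
    """Scale every grid-cell value of each default field by one uniform factor."""
    return {
        name: [value * multipliers[name] for value in field]
        for name, field in par_def.items()
    }


# Hypothetical default fields for two grid cells (not the ParDef values).
par_def = {
    "n_r": [0.03, 0.04],    # Manning coefficient for rivers
    "W": [500.0, 2000.0],   # river width (m)
    "H": [8.0, 15.0],       # bankfull height (m)
}
# First-guess multipliers from Table 4.
first_guess = {"n_r": 0.5, "W": 2.0, "H": 2.0}

scaled = apply_multipliers(par_def, first_guess)
```

The spatial pattern of each field is preserved; only its overall magnitude changes, which is what keeps the search space small.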
3. Results and Discussion
3.1. Results of the Automatic Calibration
 All of the automatic calibration experiments resulted in refined solutions, converging to optimal parameter sets. The objective space with the Pareto solutions of each experiment is shown in Figure 2. As expected, OFs improved considerably when compared with those provided by the initial guess. Experiment QQ needed 292 evolutions (2328 model runs). The evolutions represent the number of times a group of evenly ranked points evolved toward the Pareto front. Experiment QH converged faster than QQ, with 283 evolutions and 2127 model runs. This can be explained by the low correlation between water discharge-based and level-based OFs. On the other hand, experiments guided exclusively by altimetric data required more evolutions than QQ: 39% more for HH (405 evolutions and 3058 model runs to retrieve the Pareto solutions) and 67% more for HH4 (488 evolutions and 3818 model runs).
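As a quick arithmetic check, the reported cost increases can be recomputed from the counts given above. The sketch below assumes that experiment QQ is the reference; under that assumption, the 39% and 67% figures correspond to the number of evolutions, whereas the number of model runs gives somewhat smaller increases.

```python
# (evolutions, model runs) per experiment, as reported in the text.
costs = {
    "QQ": (292, 2328),
    "QH": (283, 2127),
    "HH": (405, 3058),
    "HH4": (488, 3818),
}


def relative_increase(value: float, reference: float) -> float:
    """Percentage increase of value with respect to reference."""
    return 100.0 * (value - reference) / reference


ref_evolutions, ref_runs = costs["QQ"]
evolution_increase = {
    name: relative_increase(evolutions, ref_evolutions)
    for name, (evolutions, _) in costs.items()
}
run_increase = {
    name: relative_increase(runs, ref_runs)
    for name, (_, runs) in costs.items()
}
```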
 Figure 3 shows the evolution of OFs and parameters over the optimization process for the four experiments. One can first notice that the process can be divided into two steps: the first takes about 25% of the optimization process and is characterized by a fast OF convergence to near-optimal values, and the second refines the OFs in most cases, consuming 75% of the total processing time. Second, in most cases, parameter sets converged to values very different from the first guess. The sensitivity of OFs and parameters is discussed below.
 Parameter sets varied substantially from one experiment to another, revealing the influence of the type and quantity of data used as the reference in the automatic calibration. Discharge-based experiments, i.e., QQ and QH (technically, QH is a discharge and altimetry-based experiment), found optimal Tb values spanning a large range, from 30 to 55 days. This indicates the low sensitivity of Tb to water discharge data. On the other hand, altimetry-based experiments (HH and HH4) resulted in lower Tb values with narrower ranges, from 25 to 30 days (HH) and 20 to 25 days (HH4).
 Except for Tb in experiments QQ and QH, none of the optimal parameters matched the default values, which were chosen on a physical basis. However, if one is less demanding, i.e., if near-optimal OFs can be accepted as effective solutions, parameter sets of all experiments contain the default parameter set. This is explained by the fact that the optimization process adapts parameters as a function of errors in the modeling process. For example, optimal nr values reached the lower domain boundary in experiments QQ, QH, and HH. This might indicate that the parameter domains do not represent the entire range of possible values or that optimal parameters have no physical meaning. As results represent Manning coefficients for rivers ranging from 0.0075 to 0.0125, which are significantly lower than those suggested in the literature, one can say that the automatic calibration procedure does optimize OFs but is not always capable of providing physically based parameter values.
 Optimal W values were about 20% and 75% of WParDef in experiments QQ and QH, respectively. The increase in river width associated with the use of altimetry data in the optimization process is compensated by a decrease in river bankfull height. Optimal H values in QQ, ranging from 1.25 × HParDef to 1.55 × HParDef, were reduced to values ranging from 0.125 × HParDef to 0.45 × HParDef in QH. This reveals equifinality among the model parameters. Except for Tb, all optimal parameter values were approximately the same in experiments QH and HH. The use of a reduced number of altimetric observations in the automatic calibration resulted in a slight change in parameter values between experiments HH and HH4.
 Even though the optimal parameter sets differ from the default parameter set, the water discharge time series derived from both cases are very similar at Óbidos. Figure 4 illustrates hydrographs at Tabatinga, Sto Antonio do Iça, Manacapuru, and Óbidos during 2003 for the four experiments. As one can see, hydrographs evolved from a very perturbed form (as derived from the first guess parameter set) to refined results close to observations. Systematically overestimated peaks observed in all model outputs can be attributed to errors in the forcing data set and/or vertical water balance computed by the ISBA.
 Based on a visual inspection, water discharge peaks derived from the joint use of water discharge and spatial altimetry (experiment QH) are slightly better phased with observations than the outputs of experiment QQ. NSQ values at Óbidos evolved from −0.79 (as given by the first guess parameter set) to optimal values ranging between 0.65 and 0.70 for the experiment QQ. The NSQ coefficients were slightly improved by the experiment QH, when WR replaces LNSQ in the OF, ranging from 0.67 to 0.73. In the case of altimetry-based experiments, one can see a small degradation of NSQ values in comparison with QQ and QH experiments. Experiment HH had NSQ varying from 0.66 to 0.67, and when a limited number of altimetric observations is used, the coefficient remains nearly constant, ranging between 0.64 and 0.65 for HH4.
 Results at the other three gauging stations used for the automatic calibration did not behave in the same way as at Óbidos. At Tabatinga for example, QQ performed better than the other three experiments, with NSQ values ranging from 0.55 to 0.59. QH had the best NSQ equal to 0.49, and the other best NSQ values were 0.42 (HH) and 0.38 (HH4). Sto Antonio do Iça and Manacapuru also performed better in experiment QQ than QH. Optimal results of experiments HH and HH4 had worse performances in comparison with the first two, but still better than the first guess. Sto Antonio do Iça had best NS values of 0.71, 0.64, 0.59, and 0.58 for experiments QQ, QH, HH, and HH4, respectively. At Manacapuru, the best coefficients were of 0.78, 0.77, 0.69, and 0.70.
 Interesting findings are revealed when modeled water levels derived from calibration experiments are evaluated against Envisat data. As shown in Figure 5 at VSs vs1, vs7, vs11, and vs16, the experiment QQ, driven exclusively by water discharge data, was unable to represent the amplitude of water level time series satisfactorily. Standard deviations of Envisat altimetric water levels (sobs) were 2.39, 2.69, 3.10, and 2.93 m, whereas optimal QQ solutions had mean standard deviations (scal) of 4.04, 5.84, 8.13, and 8.10 m, respectively. This gives scal/sobs ratios of 1.69, 2.17, 2.62, and 2.77. In addition, the set of solutions resulting from QQ contained a high degree of water level uncertainty, with a mean value for the four VSs of 2.03 m, varying up to 5 m from one solution to another (e.g., time step 26 of vs3).
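The amplitude ratios quoted above follow directly from the two sets of standard deviations. The short check below recomputes them; differences in the last digit with respect to the quoted ratios (e.g., at vs16) are presumably due to the rounding of the standard deviations to two decimals.

```python
# Standard deviations (m) at vs1, vs7, vs11, and vs16, as quoted in the text.
s_obs = [2.39, 2.69, 3.10, 2.93]   # Envisat water levels
s_cal = [4.04, 5.84, 8.13, 8.10]   # mean of the optimal QQ solutions
quoted = [1.69, 2.17, 2.62, 2.77]  # scal/sobs ratios quoted in the text

ratios = [cal / obs for cal, obs in zip(s_cal, s_obs)]
max_difference = max(abs(r - q) for r, q in zip(ratios, quoted))
```

A ratio above 1 means that the simulated water level amplitude is larger than the observed one; for QQ the simulated amplitude is roughly two to three times too large at these stations.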
 In the case of QH, the integration of altimetric data into the optimization process gave the system the ability to represent water level variations more consistently. Both amplitude and uncertainty values decreased. scal/sobs ratios fell to 0.62, 1.13, 0.84, and 1.03, respectively, and the mean uncertainty was reduced to about 0.73 m. According to the formulation of the kinematic wave equation (equation (A7)), given a discharge Qr, the amplitude of the water level hr depends on the river width W, Manning coefficient nr, and river slope ir. As ir is not calibrated and is constant over time for a given river reach, and optimal nr coefficients are nearly the same for QQ and QH, the adjustment of water level amplitudes (bringing scal/sobs ratios closer to 1) is mainly due to changes of river cross-sectional geometry, i.e., river bankfull height and width (see Figure 3). In other words, given the same discharge variations, low W values (as provided by the optimal solutions of experiment QQ) result in higher H and hr values. Higher W values (resulting from experiments using altimetric data in the automatic calibration) are compensated by lower H and hr values.
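The dependence of water level amplitude on river width can be illustrated with Manning's equation. The sketch below assumes a wide rectangular channel (hydraulic radius approximately equal to the water depth), so that Q = (1/n) W h^(5/3) i^(1/2); this is a common simplification and is not necessarily identical to equation (A7). The discharge, roughness, slope, and width values are hypothetical.

```python
import math


def water_depth(q: float, width: float, manning_n: float, slope: float) -> float:
    """Depth h (m) from Q = (1/n) * W * h**(5/3) * sqrt(i), wide rectangular channel."""
    return (manning_n * q / (width * math.sqrt(slope))) ** (3.0 / 5.0)


def depth_amplitude(q_low: float, q_high: float, width: float,
                    manning_n: float, slope: float) -> float:
    """Difference in depth between a high and a low discharge."""
    return (water_depth(q_high, width, manning_n, slope)
            - water_depth(q_low, width, manning_n, slope))


# Hypothetical values: same discharge range, roughness, and slope; two widths.
q_low, q_high = 60000.0, 120000.0   # m3 s-1
n_r, slope = 0.03, 2.0e-5

narrow = depth_amplitude(q_low, q_high, 1000.0, n_r, slope)
wide = depth_amplitude(q_low, q_high, 2000.0, n_r, slope)
ratio = wide / narrow  # equals 2 ** (-3/5), about 0.66
```

Under this assumption the depth scales with W to the power −3/5, so doubling the width reduces the water level amplitude by about one third for the same discharge variation, consistent with the compensation between width and water level described above.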
 When the automatic calibration is driven exclusively by spatial altimetry data at 16 VSs (experiment HH), scal/sobs ratios showed a slight improvement (0.70, 1.11, 1.03, and 0.99, respectively) in comparison with QH. The mean uncertainty of water levels was reduced to 0.35 m. Finally, the best agreement between observed and modeled water levels at VSs vs1, vs7, vs11, and vs16 was obtained with HH4. scal/sobs ratios were closer to 1 (1.01, 1.02, 0.98, and 0.99, respectively) in comparison with the previous experiments. This is certainly expected because HH4 minimizes errors between observed and modeled water levels at these same four VSs. The uncertainty deteriorated slightly in comparison with experiment HH, with a mean value of 0.45 m. It is important to mention that the other VSs (not shown) reveal the same characteristics as illustrated in Figure 5.
3.2. Validation of the Automatic Calibration Against Observed Discharge
 Results of the optimization experiments were evaluated for the 1997 to 2001 period using daily water discharge data at nine gauging stations, including the four stations used in the automatic calibration. To simplify the analysis, only one Pareto solution, hereafter called the validation solution (represented by the black square in Figure 2), is used in the validation process. The validation solution was defined for each experiment as the solution providing the 50th (of 100 solutions) best OF1 value (NSQ for experiments QQ and QH, and NSAH for experiments HH and HH4).
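The selection of the validation solution amounts to sorting the Pareto solutions by OF1 and taking the one in 50th position. A minimal sketch follows, assuming that a higher OF1 is better (as for Nash-Sutcliffe-type coefficients); the data structure and the OF1 values are hypothetical.

```python
from typing import Dict, List


def validation_solution(solutions: List[Dict[str, float]], position: int = 50) -> Dict[str, float]:
    """Return the solution with the position-th best OF1 value (1-based)."""
    ranked = sorted(solutions, key=lambda s: s["of1"], reverse=True)
    return ranked[position - 1]


# Hypothetical population of 100 Pareto solutions with OF1 from 0.00 to 0.99.
solutions = [{"id": float(i), "of1": i / 100.0} for i in range(100)]
chosen = validation_solution(solutions)
```

Taking a mid-ranked solution rather than the best OF1 value avoids favoring one objective over the other when a single parameter set has to represent the whole Pareto front.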
 Figure 6 shows NSQ coefficients derived from the four experiments (QQ, QH, HH, and HH4) and Def for the nine gauging stations and the weighted mean of Tabatinga, Sto Antonio do Iça, Manacapuru, and Óbidos, as used in the automatic calibration process. For comparison purposes, the coefficients are shown for both calibration (2002–2006) and validation (1997–2001) periods.
 At Óbidos, coefficients varied little among experiments. NSQ values for the validation period ranged from 0.83 (HH4) to 0.87 (QH). Both QQ and Def had NSQ = 0.86. At Manacapuru, Def performed better (NSQ = 0.82) than the validation solution of the optimization experiments. QQ, QH, HH, and HH4 had NSQ = 0.80, 0.80, 0.73, and 0.75, respectively. The other two gauging stations located upstream along the Amazon River (Tabatinga and Sto Antonio do Iça) had similar results for the validation period, with the best NSQ values for QQ (0.60 and 0.58, respectively) and the worst for HH4 (0.44 and 0.46, respectively). Figure 7 shows hydrographs at the four gauging stations used in the optimization process in 1999, within the validation period.
 QQ provided the best weighted-mean NSQ (0.78), and HH and HH4 the worst values (0.72 and 0.73, respectively). The combination of both water discharge and level data in the optimization process, i.e., the QH experiment, led to a very good performance coefficient for the weighted mean (NSQ = 0.76), which is close to the result provided by QQ. However, as one could expect, the replacement of discharge observations by radar altimetry resulted in a degradation of discharge-based coefficients. Except for QH at Óbidos, all altimetry-based experiments resulted in equal or worse NSQ values at all stations used in the calibration process, including the weighted mean. This observation is valid for both calibration and validation periods. It is also worth examining the difference in NSQ coefficients between the calibration and validation periods. For most gauging stations, values for the validation period are significantly higher than those obtained in the calibration period. Exceptions are Sto Antonio do Iça, Serrinha, and Gavião. At Óbidos, the difference between the validation and calibration periods is about 0.10. Other stations located along the Amazon River and used in the optimization process had differences of 0.05 (Manacapuru) and 0.13 (Sto Antonio do Iça). The weighted means of NSQ coefficients presented an average difference of 0.09 between the calibration and validation steps. Results at the other stations vary irregularly with geographic location. Differences at the other stations used in the evaluation ranged from 0.09 at Serrinha to 0.50, as was the case for experiment QQ at Gavião. These highly heterogeneous coefficients are explained by both forcing uncertainty and imperfect simplifications of the physical system. This means that certain parameter sets can appropriately explain the physical processes or compensate for errors intrinsic to forcings and model simplifications for a given time and location; however, in other periods and areas, a recalibration might be needed. In some cases, the forcings and the model are simply not appropriate for the particular domain.
3.3. Reliability of Parameter Estimates
 To evaluate the reliability of the automatically calibrated river bankfull height, H, and width, W, estimates of these parameters were compared with in situ observations at four gauging stations (Tabatinga, Sto Antonio do Iça, Manacapuru, and Óbidos). The observed data correspond to the average of several measurements performed by ANA during the last 50 years. Observed W represents the observed distance between river banks at the gauging station, and the observed H is obtained by dividing the cross-sectional wet area, Aw (m2), by W. This means that the observed H is the bankfull height of a river with an equivalent rectangular cross section. Estimates provided by previous parameterizations [Decharme et al., 2012; Yamazaki et al., 2011; Getirana et al., 2012] are also used for comparison.
 As shown in Figure 8, with a few exceptions, parameterized H values (optimization outputs and default parameterizations) underestimate observations at all stations. QQ provides the closest river bankfull heights among the four experiments. On the other hand, most of the parameterized W values overestimate observed data. The three optimization experiments using Envisat altimetry data provided similar river widths, which reveals the sensitivity of this parameter to altimetry-based OFs.
 Despite the differences found among W and H, the estimated cross-sectional areas, Aw, are more homogeneous; however, they are underestimated when compared with the observations. Similar Aw values for different experiments are a result of a balance between W and H in order to keep consistency in water discharge variation. This results in substantial changes in hr values (see the case of experiments QQ and QH in Figure 5). It must be emphasized that the gauging stations are generally installed in locations where floodplains rarely occur in order to guarantee better accuracy in flow measurements for most time periods. This suggests that river geometry at these gauging stations might not be representative of the whole river system, especially in locations where floodplains are more frequent. In addition, meteorological forcing uncertainties and errors from LSM outputs may play a crucial role in the parameter estimation. This means that completely different results can be obtained if the flow routing scheme is forced with outputs from other LSMs. In this sense, comparison between estimated and observed river width and height must be carried out with caution.
 The potential of using altimetric data in the automatic calibration of a global flow routing scheme was evaluated in this study, and the first results of the proposed methodology were presented for the Amazon basin using the HyMAP model. The evaluation considered four experiments that differed from each other according to the data set used to calculate the OFs of the optimization scheme. Three data sets were used to conduct the search for optimal model parameters: (1) discharge data at four gauging stations, (2) Envisat satellite data at 16 VSs, and (3) Envisat satellite data at four VSs. The experiments reported in this paper involved the estimation of four model parameters: subsurface runoff time delay (Tb), the Manning coefficient for rivers (nr), river width (W), and bankfull height (H). These parameters were selected to represent the most important aspects of the model behavior. The evaluation of the optimization experiments was performed on the basis of the reliability of parameter estimates and the performance coefficients for water discharges at nine gauging stations located within the basin for both the calibration (2002–2006) and the validation (1997–2001) periods.
 Results demonstrated the feasibility of using altimetric data in the automatic calibration of LFRS parameters. Even though the experiments provided different parameter sets, NS values for discharge at Óbidos are nearly the same for all experiments. QQ and QH resulted in better performances in terms of water discharge than HH and HH4, as one would expect. The integration of altimetric data into the automatic calibration, as represented by the QH experiment, brought significant improvements in water level modeling due to the slight increase in W and the decrease in H. However, results provided by HH and HH4 are still acceptable. Both experiments provided competitive results, showing their ability to predict discharge time series in different time periods and locations of the Amazon basin. Noise inherent to the altimetric data acquisition and processing did not seem to be a restriction for obtaining optimal hydrographs.
 The resulting hydrographs reveal errors in the modeling process that are mainly due to both forcing uncertainty and imperfect simplifications of the physical system, commonly found in hydrological models [Gupta et al., 1998]. These imperfect simplifications include parameter estimates. For simplicity, all of the model parameters are considered constant in time, and some of them have a homogeneous spatial distribution, as is the case for Tb. In addition, the use of a single set of empirical equations over the entire Amazon basin for determining the cross-sectional shape introduces large uncertainty. Another source of error is the kinematic wave assumption, which is not capable of simulating hysteresis caused by backwater effects in flat water surfaces. However, the use of a diffusive wave approach requires a much finer temporal resolution in order to avoid numerical instabilities. These considerations, along with the simplifications of the representation of the physical system, result in additional errors and increase the uncertainty of parameter estimates. These errors are also seen in the large differences in performance coefficients derived from the same experiment at a few gauging stations (e.g., Gavião, Lábrea, Porto Velho, and Óbidos), which means that the forcings, an insufficiently represented physical process in the model, or a parameter set can perform well for one period and poorly for another. In addition, the meaningful differences between observed river geometry and parameter estimates (derived from both the optimization experiments and proposed in the literature) suggest that the parameterization of global flow routing schemes should be less physically based in cases where more detailed data sets representing river geometry are unavailable or nonexistent. To improve the reliability of simulations of the water budget and discharge, future studies should address the simultaneous calibration of both LSM and LFRS. However, OFs taking into account other observed variables should be combined with those using discharge and radar altimetry in order to constrain different physical processes in the calibration process and to reduce the equifinality among model parameters.
 The findings presented in this paper have significant implications for the benefit that can be obtained by using satellite-based altimetry data, such as those from the forthcoming SWOT mission [Durand et al., 2010]. As a first attempt, the proposed method has been evaluated in the Amazon basin. However, it can be easily transferred to other basins where radar altimetry data are available. Evidently, optimal parameters will vary as a function of several factors, including data accuracy and availability and geomorphology. In this sense, the use of this method in different large basins toward a global-scale application is suggested as a future study. The application of such an approach at the global scale and its adaptation to SWOT Virtual Mission data will considerably improve the modeling, understanding, and streamflow forecasts in poorly gauged or ungauged basins.
Appendix A: Model Description
 HyMAP is composed of four modules: (1) surface and subsurface runoff time delays; (2) river-floodplain interface; (3) flow routing in river channels and floodplains; and (4) evaporation from floodplains.
A1. Module 1: Surface and Subsurface Runoff Time Delays
 The concentration time (or time delay factor) is a physically based parameter representing subgrid-scale routing. For each grid cell, both surface, Is (mm Δt−1), and subsurface, Ib (mm Δt−1), runoffs derived from an LSM pass through separate linear reservoirs with appropriate time-delay factors. These values can vary from a few hours to several days, depending on the hydrogeological characteristics of the catchment. The linear reservoir outflows can be represented by the following equation:
where the subscripts s and b represent surface and subsurface runoff variables, respectively. Os,b (mm Δt−1) stands for the outflow at time step t, Vs,b (mm) the volume stored in the linear reservoir, and Ts,b the concentration time of the grid cell. V is updated twice at each time step: at the beginning, summing the inflow Is,b, and at the end, subtracting Os,b.
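The two-stage update of V described above can be sketched as follows. This is a minimal illustration that assumes the usual linear-reservoir law O = V / T, with T expressed in time steps; the function name and the sample values are hypothetical.

```python
def step_linear_reservoir(volume, inflow, t_delay):
    """Advance one linear reservoir by one time step.

    volume  : water stored at the start of the step (mm)
    inflow  : runoff received from the LSM during the step (mm per step)
    t_delay : concentration time expressed in time steps
    """
    volume += inflow            # first update: add the inflow I
    outflow = volume / t_delay  # assumed linear-reservoir law O = V / T
    volume -= outflow           # second update: subtract the outflow O
    return volume, outflow


# A single 10 mm runoff pulse routed through a reservoir with T = 4 steps.
volume = 0.0
outflows = []
for runoff in [10.0, 0.0, 0.0, 0.0]:
    volume, out = step_linear_reservoir(volume, runoff, t_delay=4.0)
    outflows.append(out)
print(outflows, volume)
```

The outflow decays geometrically after the pulse, and the sum of outflows plus the remaining storage equals the total inflow.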
 The subsurface runoff time delay factor Tb is considered spatially uniform and constant in time. The current parameterization of HyMAP considers Tb = 45 days. Ts is computed for each grid cell following Kirpich's formula:
where Δxj (km) is the distance between the farthest point within a grid cell and its outlet, and Δhj (m) is the difference between the maximum and minimum elevations of the pathway. Both Δxj and Δhj are derived from the high-resolution DEM. At a 0.25° resolution, Ts values are quite low in comparison with Tb, varying from several minutes to a few days.
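For orientation, a common metric form of Kirpich's formula gives the concentration time in minutes as roughly 57 (Δx³/Δh)^0.385, with Δx in km and Δh in m. The sketch below uses that form as an assumption; the exact coefficients and output units used in HyMAP are not recoverable from the text here, and the function name and sample values are hypothetical.

```python
def kirpich_minutes(length_km, drop_m):
    """Concentration time (minutes) from a common metric form of Kirpich's formula.

    length_km : flow path length within the grid cell (km)
    drop_m    : elevation difference along the flow path (m)
    """
    return 57.0 * (length_km ** 3 / drop_m) ** 0.385


# Hypothetical grid cell: 20 km flow path with a 50 m elevation drop.
ts_min = kirpich_minutes(20.0, 50.0)
print(f"Ts = {ts_min / 60.0:.1f} h")
```

With this form, a steeper or shorter flow path gives a shorter concentration time, consistent with the statement that Ts stays small relative to Tb at 0.25° resolution.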
 Finally, the total discharge produced in each grid cell Qc (m3 Δt−1) is computed as
where Ac stands for the grid cell area.
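From the units given for the terms involved, the grid cell discharge can plausibly be written as below. This is a reconstruction under the assumption that Ac is expressed in m2, so the factor 10^-3 converts mm to m; it is not the original typeset equation.

```latex
Q_c = \left(O_s + O_b\right) \, A_c \times 10^{-3}
```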
A2. Module 2: River-Floodplain Interface
 The river channel reservoir of a grid cell is composed of three parameters: channel length, L; channel width, W; and bank height, H. If water height in the river channel hr (m) is higher than H, water is exchanged between river and floodplain reservoirs. This process is considered instantaneous at each time step. This means that water surface elevations of the river channel and the floodplain are the same.
 A floodplain reservoir has a parameter for the unit-catchment area, Ac, and a floodplain elevation profile, hf=f(A), as suggested by Yamazaki et al. The topographic parameters used to create the elevation profile are derived from the 30 arc-second Shuttle Radar Topography Mission (SRTM30) DEM and the Global Drainage Basin Database (GDBD) flow direction map at 1 km resolution [Masutomi et al., 2009] processed with the flexible location of waterways method [Yamazaki et al., 2009].
 The river channel and floodplain water exchanges at each time step are represented as follows:
where subscripts r and f represent river channel and floodplain variables, respectively. S (m3) stands for the total water storage in the grid cell, Sr (m3) and Sf (m3) the river channel and floodplain water storages, hr (m) and hf (m) water depths, W (m) the river width, L (m) the river length, and Af (m2) the flooded area. Srmax (m3) stands for the river bankfull water storage and is given as Srmax=H × W × L, where H (m) is the river bankfull height.
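The partition of S between river channel and floodplain can be sketched as follows. The two-case structure is inferred from the description above (exchange only when hr exceeds H, and a common water surface elevation). The linear floodplain profile hf = k · A is a hypothetical stand-in for the DEM-derived profile, and the function and parameter names are illustrative.

```python
import math


def partition_storage(storage, width, length, bank_height, profile_slope):
    """Split total storage S (m3) into river and floodplain parts.

    profile_slope is a hypothetical linear floodplain profile, hf = profile_slope * A
    (m of depth per m2 of flooded area); the model uses a DEM-derived profile instead.
    Returns (Sr, Sf, hr, hf, Af).
    """
    channel_area = width * length
    s_rmax = bank_height * channel_area  # bankfull storage, Srmax = H * W * L
    if storage <= s_rmax:
        # Water stays in the channel: no floodplain storage or flooded area.
        return storage, 0.0, storage / channel_area, 0.0, 0.0
    excess = storage - s_rmax
    # S = (H + hf) * W * L + hf**2 / (2 * k); take the positive root for hf.
    hf = profile_slope * (
        -channel_area + math.sqrt(channel_area ** 2 + 2.0 * excess / profile_slope)
    )
    sf = hf ** 2 / (2.0 * profile_slope)  # volume between water surface and profile
    sr = storage - sf
    hr = sr / channel_area                # equals H + hf (shared water surface)
    af = hf / profile_slope
    return sr, sf, hr, hf, af


# Hypothetical reach: W = 2 km, L = 25 km, H = 10 m, storage above bankfull.
sr, sf, hr, hf, af = partition_storage(6.0e8, 2000.0, 25000.0, 10.0, 1.0e-8)
print(sr, sf, hr, hf, af)
```

The sketch ignores the upper bound on Af set by the unit-catchment area Ac, which a full implementation would need to enforce.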
 The temporal evolution of water storage in river channels and floodplains of a grid cell, S, is defined by the continuity equation (A6) considering linear reservoir outputs, Qc, river and floodplain discharges to the downstream grid point, Qr and Qf, river and floodplain discharges from the upstream grid points, Qr,k and Qf,k, and evaporation from floodplains, E:
where t is time, and dt is the time step. The index k stands for the nUp upstream grid cells of the target grid point.
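Based on the terms listed above, the storage update can plausibly be written as below. This is a reconstruction rather than the original typeset equation; it assumes that inflows from upstream cells and the local runoff Qc are sources, that outflows and evaporation are sinks, and that all flux terms are expressed per unit time in consistent units.

```latex
S^{t+dt} = S^{t} + \left( \sum_{k=1}^{nUp} Q_{r,k} + \sum_{k=1}^{nUp} Q_{f,k} - Q_r - Q_f + Q_c - E \right) dt
```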
A3. Module 3: Flow Routing in River Channels and Floodplains
 Water discharge in rivers and floodplains is calculated by the kinematic wave equation. Using the Manning formula for a rectangular cross section and large width-to-depth ratio, water discharge in the river channel, Qr (m3 s−1), can be defined as
where nr is the roughness coefficient for rivers. ir is a constant river bed slope derived from topographic information and corresponds to the slope between the target and downstream grid cells.
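For a rectangular cross section with a large width-to-depth ratio, the hydraulic radius is approximately equal to the water depth hr, and the Manning formula takes the standard form below. It is given here as a reconstruction consistent with the symbols defined above, not as the original typeset equation (A7).

```latex
Q_r = \frac{1}{n_r} \, W \, h_r^{5/3} \, i_r^{1/2}
```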
 Similarly, water discharge in the floodplains Qf (m3 s−1) is given as
where nf is the Manning roughness coefficient for floodplains, and for simplification, if is considered equal to ir.
A3.1. River Width and Bankfull Height
 River width and bankfull height are both defined based on an empirical relationship between observed river geometry and the mean annual discharge at each river cross section:
where Qmed (m3·s−1) is the annual mean discharge in each grid cell estimated using the global runoff database from Cogley. β is fixed for five different hydrological regions of the world. For equatorial or subtropical basins, which include the Amazon basin, β=18.
H is defined as
 or via a linear relationship with the river width
A3.2. Manning Coefficient for River Channels
 The Manning coefficient of river channels nr varies according to the following formula:
where nmax and nmin are the maximum and minimum values of the Manning coefficient (equal to 0.05 and 0.03, respectively, in the current version of HyMAP), and Hmax and Hmin are the maximum and minimum river bankfull heights as computed by equation (A11).
A3.3. Manning Coefficient for Floodplains
 The Manning coefficient for floodplains, nf, is spatially distributed as a function of 12 vegetation types at 0.25° resolution derived from the 1 km ECOCLIMAP data set [Masson et al., 2003]. nf values are larger in densely vegetated areas and lower in more sparsely vegetated regions.
A4. Module 4: Evaporation From Floodplains
 A simple approach is used to estimate the evaporation from open waters Ew (m3 dt−1). First, the potential evaporation rate E (mm·dt−1) is calculated by the Penman–Monteith equation, with the surface resistance set to zero
where Δ (kPa·°C−1) is the gradient of the saturated vapor pressure-temperature function, A (MJ·m−2·s−1) is the available energy; ρA (kg·m−3) and ρW (kg·m−3) are the specific mass of air and water, respectively; cp is the specific heat of moist air (MJ·kg−1·°C−1); D (kPa) is the vapor pressure deficit; γ (kPa·°C−1) is the psychrometric constant; ra (s·m−1) is the aerodynamic resistance; λ (MJ·kg−1) is the latent heat of vaporization; and M is a time-step unit conversion from m·s−1 to mm·dt−1. Available energy and aerodynamic resistance can be calculated following Shuttleworth. For simplification purposes, water albedo and emissivity were fixed as 0.07 and 1, respectively.
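With the surface resistance set to zero, the Penman–Monteith equation reduces to the form below. This is a reconstruction assembled from the symbols and units defined above, not the original typeset equation; the placement of M and of λρW follows from dimensional consistency.

```latex
E = M \, \frac{\Delta A + \rho_A c_p D / r_a}{\lambda \, \rho_W \left( \Delta + \gamma \right)}
```

The fraction has units of m·s−1, which M converts to mm·dt−1.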
 Then, the actual evapotranspiration rate, ET (mm·dt−1), diagnosed by the LSM, is subtracted from E and the result is multiplied by the water surface Af, resulting in the effective evaporation from open waters
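The step described above amounts to a difference of two rates scaled by the flooded area. The sketch below assumes a factor of 10^-3 to convert mm to m, so that the result is in m3 per time step; the function and argument names are hypothetical, and whether negative differences are clipped to zero is not stated in the text, so no clipping is applied.

```python
def open_water_evaporation(e_pot_mm, et_lsm_mm, flooded_area_m2):
    """Effective open-water evaporation Ew (m3 per time step).

    e_pot_mm        : potential evaporation E (mm per time step)
    et_lsm_mm       : actual evapotranspiration ET from the LSM (mm per time step)
    flooded_area_m2 : flooded area Af (m2)
    """
    return (e_pot_mm - et_lsm_mm) * 1.0e-3 * flooded_area_m2


# Hypothetical values: E = 5 mm, ET = 3 mm, Af = 1 km2.
print(open_water_evaporation(5.0, 3.0, 1.0e6))
```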
 The computation is done once per day using standard input meteorological forcing variables and assuming that the water in the floodplains and rivers has the same temperature as the air (a predicted or prescribed surface water temperature is not needed).
 The first author thanks the Centre National d'Études Spatiales (CNES) for the financial support. The study benefited from data made available by Agência Nacional de Águas (ANA) and by the European Space Agency (ESA) under the form of Geophysical Data Records (GDRs). The multimission database of GDRs is maintained by the Centre de Topographie des Océans et de l'Hydrosphère (CTOH) at LEGOS. The authors also thank G. Cochonneau (IRD) and M. C. Gennero (IRD) for their help in data acquisition and processing and three anonymous reviewers for their valuable comments.