Deep Learning Integrating Scale Conversion and Pedo‐Transfer Function to Avoid Potential Errors in Cross‐Scale Transfer

Pedo‐transfer functions (PTFs) relate soil/landscape static properties to a wide range of model inputs (e.g., soil hydraulic parameters) that are essential to soil hydrological modeling. Combining PTFs and hydrological models is a powerful strategy allowing the use of soil/landscape static properties for the generalization of large‐scale modeling. However, since the spatial scales of soil hydraulic parameters required for model inputs and soil/landscape static properties are often not identical, cross‐scale transfer is required, which can be a significant source of errors. Here, we investigate uncertainties in cross‐scale transfer and develop an approach that avoids them. The proposed method uses the convolutional neural network (CNN) as a cross‐scale transfer approach to directly map soil/landscape static properties to soil hydraulic parameters across different spatial scales. The proposed CNN approach is applied under two different estimation strategies to invert the hydraulic parameters of a soil‐water balance model and subsequently the quality of the parameters is assessed. Both synthetical and real‐world results around the conterminous United States indicate that in general the employed end‐to‐end strategy is superior to the two‐step strategy. The CNN‐based integrated model successfully reduces potential errors in cross‐scale transfer and can be applied to other areas lacking information on hydraulic parameters or observations. The proposed method can be extended to improve parameter estimation in earth system models and enhance our understanding of key hydrological processes.


Introduction
Soil moisture is an essential hydrological component connecting surface water and groundwater (Peng et al., 2021; Seneviratne et al., 2010; Zha et al., 2019). The spatio-temporal soil moisture movement at different scales can be simulated by soil-water models (Paul et al., 2021; Telteu et al., 2021). In general, the performance of these models partly depends on model parameterization. Reducing its uncertainties is still one of the major unsolved problems in hydrology (Blöschl et al., 2019; Clark et al., 2017; Feng et al., 2023; Shen et al., 2023), (a) and up-/down-scaling technique at step (b) in MPR are predefined before inversely estimating soil hydraulic parameters using hydrological observations. Based on the MPR model, Klotz et al. (2017) proposed a symbolic regression approach to automatically estimate PTFs, which relaxes the restriction of predefined PTFs. Feigl et al. (2020) proposed a calibration method of PTFs based on a text-generating neural network to transfer text semantics into a continuous space, further improving the PTF estimation. However, the state-of-the-art MPR model (a) still requires the predefined mathematical forms of an up-/down-scaling technique and (b) requires solving high-dimensional optimization problems during calibration, which is prohibitive for large-scale hydrological modeling.
With the help of advanced deep learning techniques, some progress has been made in soil hydraulic parameter estimation using PTFs in conjunction with hydrological models. For example, by using long short-term memory models instead of predefined PTFs, Tsai et al. (2021) proposed a novel differentiable PTF learning model based on regional surface soil moisture observations, while Kraft et al. (2022) developed a similar method to obtain model parameters varying in time. Furthermore, both models estimate the PTF using automatic differentiation implemented in PyTorch (Paszke et al., 2019), which is very efficient and handles high-dimensional optimization well (Shen et al., 2023). However, Tsai et al. (2021) did not consider scale conversion and implicitly assumed that soil/landscape static properties have a scale consistent with that of model parameters, like Beck et al. (2020) and Hundecha and Bárdossy (2004). While Kraft et al. (2022) used an independent encoder to derive a low-dimensional representation of soil/landscape static properties to avoid their initial high dimensionality, spatial scale mismatch issues are still not fully resolved.
Therefore, several issues in PTF estimation and the corresponding up-/down-scaling techniques still exist (Van Looy et al., 2017). The uncertainties in cross-scale transfer should be assessed. In addition, there are two choices of scale conversion framework (i.e., scaling soil/landscape static properties before applying the PTF or scaling soil hydraulic parameters after applying the PTF), which could lead to different hydraulic parameter estimates. Although scaling soil hydraulic parameters after applying the PTF is suggested (e.g., Feigl et al., 2020; Samaniego et al., 2010), scaling soil/landscape static properties before applying the PTF is also widely used (e.g., Beck et al., 2020; Hundecha & Bárdossy, 2004; Tsai et al., 2021). These issues can undermine confidence in utilizing the published soil hydraulic parameter data sets (Dai et al., 2019; Zhang et al., 2018) or proposed PTF functions (Van Looy et al., 2017). Developing a more flexible model, which simultaneously avoids the potential errors from scale conversion frameworks and the assumptions about the mathematical forms of PTFs and up-/down-scaling techniques, is an urgent task.
Deep learning can approximate any given complex nonlinear function from structured data sets (Cybenko, 1989; Hornik et al., 1989), and has shown promising results for representing PTFs. Particularly, the convolutional neural network (CNN) (Lecun et al., 2015) has been reported to be highly suitable for learning multi-scale spatial patterns from multi-scale gridded data (Sun et al., 2019), a feature that can be used to derive soil hydraulic parameters across different scales. Moreover, auto-differentiation techniques (Paszke et al., 2019) allow training the CNN model and inverting the hydraulic parameters efficiently, even with high-dimensional parameters and states. Thus, in order to reduce uncertainties in the cross-scale transfer, a CNN-based model that integrates PTFs and up-/down-scaling techniques to derive soil hydraulic parameters from soil/landscape static properties at different scales is proposed. The model does not require the pre-definition of the mathematical formula for PTFs and up-/down-scaling techniques, or a choice between the two alternative scale conversion frameworks, namely scaling soil/landscape static properties before applying the PTF or scaling soil hydraulic parameters after applying the PTF. Before testing its performance, we first assess the uncertainties in PTF and scale conversion. Then, two different optimization strategies are introduced and compared to estimate the CNN-based integrated model. Whether these uncertainties from cross-scale transfer can be reduced effectively is tested both in synthetic and real-world cases.

PTF and Scale Conversion
Pedo-transfer functions are typically described as
$$\beta^{l_{PTF}} = \mathrm{PTF}\left(u^{l_{PTF}}\right) \quad (1)$$
where $u$ denotes soil/landscape static properties, $\beta$ denotes the parameters required in modeling, such as soil hydraulic parameters (Van Looy et al., 2017), and the superscript $l_{PTF}$ denotes the spatial scale of the PTF. A traditional PTF model has no mechanism for crossing scales, so the input $u$ and output $\beta$ must share the same scale $l_{PTF}$, although this scale itself can be quite flexible. In this study, two Cosby PTF models (Cosby et al., 1984) that link hydraulic conductivity to sand fraction (CB1, Equation 2) or to sand and clay fractions (CB2, Equation 3) are used. Here, $f_{sand}$ and $f_{clay}$ are point-scale sand and clay fractions of soil, and $K_{sat}$ is the saturated hydraulic conductivity (m/d) of the corresponding scale. However, in many practical applications, as shown in Figure 1, the spatial resolution of the available $u$ ($l_0$) and that of the required $\beta$ ($l_1$) are generally different. Therefore, scale conversion is required for traditional PTFs. Two opposite scale conversion frameworks are widely employed: (a) MPR-type scale conversion that scales $\beta^{l_f}$ into $\beta^{l_0}$ (e.g., Feigl et al., 2020; Samaniego et al., 2010); (b) traditional regionalization (TR)-type scale conversion that scales $u^{l_0}$ into $u^{l_f}$ ready for PTF conversion (e.g., Beck et al., 2020; Hundecha & Bárdossy, 2004; Tsai et al., 2021). The choice of PTF forms (e.g., CB1 and CB2) and scale conversion frameworks (e.g., MPR-type and TR-type scale conversions) may significantly affect the final soil hydraulic parameters (Van Looy et al., 2017; Paschalis et al., 2022; Weihermüller et al., 2021; Y. Zhang & Schaap, 2019). In addition, the choice of an up-/down-scaling algorithm can also introduce uncertainty. In this study, we investigate eight up-/down-scaling techniques: the nearest neighbor (NN), bilinear (BL), bicubic (BC), mean (ME), minimum (MI), maximum (MA), median (MD), and mode (MO) algorithms.
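The difference between the two scale conversion frameworks can be illustrated with a minimal sketch. It assumes a hypothetical single-variable PTF (the coefficients in `ptf` are placeholders, not the CB1 or CB2 coefficients), a synthetic 4 × 4 sand field, and simple block aggregation standing in for the ME and MA techniques; all function names are illustrative only.

```python
from statistics import mean

def ptf(sand_pct):
    """Hypothetical monotonic, nonlinear PTF (placeholder coefficients, not CB1/CB2)."""
    return 10 ** (0.015 * sand_pct - 0.9)

def aggregate(grid, block, reducer):
    """Coarsen a square 2-D list by applying `reducer` to each block x block window."""
    n = len(grid)
    return [
        [
            reducer([grid[i + di][j + dj] for di in range(block) for dj in range(block)])
            for j in range(0, n, block)
        ]
        for i in range(0, n, block)
    ]

def tr_type(u_fine, block, reducer):
    """TR-type: scale the static property first, then apply the PTF."""
    return [[ptf(v) for v in row] for row in aggregate(u_fine, block, reducer)]

def mpr_type(u_fine, block, reducer):
    """MPR-type: apply the PTF at the fine scale, then scale the parameter."""
    beta_fine = [[ptf(v) for v in row] for row in u_fine]
    return aggregate(beta_fine, block, reducer)

# Synthetic 4 x 4 sand-percentage field coarsened to 2 x 2 (block = 2).
sand = [
    [10.0, 80.0, 30.0, 35.0],
    [20.0, 90.0, 40.0, 45.0],
    [55.0, 55.0, 5.0, 95.0],
    [55.0, 55.0, 15.0, 85.0],
]

tr_mean, mpr_mean = tr_type(sand, 2, mean), mpr_type(sand, 2, mean)
tr_max, mpr_max = tr_type(sand, 2, max), mpr_type(sand, 2, max)
```

In this toy setup the two frameworks disagree under mean aggregation wherever a block is heterogeneous, and agree under maximum aggregation because the placeholder PTF is monotonic in a single variable. Neither observation should be read as a result for the PTFs or data used in this study.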
To avoid these preconceived assumptions, CNN (Fukushima, 1980), which can flexibly integrate the two tasks (PTF and up-/down-scaling technique) in soil hydraulic parameter identification, is proposed here. A CNN usually consists of different concatenating, convolutional, and pooling layers, which allows for transforming different multi-scale soil/landscape static properties ($u^{l_0}$) directly into parameters at the target scale for modeling ($\beta^{l_1}$), as demonstrated by the red arrows in Figure 1.

Model Estimation Strategies
In order to invert the hydraulic parameters based on the proposed CNN approach using the information on soil/landscape static properties and dynamic soil state variables (e.g., soil moisture, which can be effectively retrieved by remote sensing techniques) (Vereecken et al., 2008, 2016; Wanders et al., 2014), we employed two parameter estimation strategies for the CNN approach, namely, the classical two-step (Beck et al., 2016) and the end-to-end (Feigl et al., 2020; Tsai et al., 2021) strategies.

Two-Step Strategy
The two-step strategy first obtains the hydraulic parameters and subsequently trains the CNN-based integrated model. In step one (blue dashed box in Figure 2a), we invert soil hydraulic parameters cell-by-cell based on the fitting between the simulated and the observed soil moisture data using inverse methods, such as the ensemble Kalman filter or ensemble smoother (Crestani et al., 2013). In this study, we adopt the iterative ensemble smoother, which performs well in nonlinear problems (Chen & Oliver, 2013; P. Li, Zha, Shi, et al., 2020; P. Li, Zha, Tso, et al., 2020). In step two (yellow dashed box in Figure 2a), the soil hydraulic parameters inverted in step one and the data for soil/landscape static properties are used as targets and inputs, respectively, to train the CNN-based integrated model based on the gradient descent method implemented in PyTorch (Paszke et al., 2019). A proper design of the CNN structure should bridge the input and output of different spatial scales. Finally, the trained CNN-based integrated model can be used to obtain the soil hydraulic parameters in regions without soil moisture observations.

End-To-End Strategy
In the end-to-end strategy, the CNN and the soil-water model are jointly trained. That means the soil hydraulic parameters, the output of the last layer of the CNN, are directly fed into the soil-water model, which serves as an additional nonlinear layer. Its output, the soil moisture, is compared with the observed counterpart to calculate the loss function, which is minimized during training. At the prediction stage, we keep only the CNN layers by discarding the last soil-water model layer, and this CNN-based integrated model is again used to obtain the soil hydraulic parameters in regions without soil moisture observations. Compared to the two-step strategy, the main benefit of the end-to-end strategy is that it can simultaneously optimize the PTF, scale conversion, and the soil hydraulic parameters based on soil/landscape static properties and soil moisture observations, achieving a global optimization. Furthermore, since the soil hydraulic parameters are obtained through the second-last layer, their spatial scale is quite flexible and depends solely on the design of the CNN layers (Tsai et al., 2021), which could save significant computational time compared to the two-step strategy involving cell-by-cell parameter inversion.
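The idea of training a transfer function through a process model can be sketched with a deliberately small example. Everything here is a hypothetical stand-in: a one-weight linear map replaces the CNN, an exponential decay replaces the soil-water model, and the gradient is written out by hand rather than obtained from automatic differentiation as in PyTorch.

```python
import math

THETA0 = 0.4  # hypothetical initial soil moisture of the toy process model

def process_model(k):
    """Toy differentiable 'soil-water model': moisture left after one unit of drainage."""
    return THETA0 * math.exp(-k)

def transfer(w, u):
    """Toy transfer function mapping a static property u to a parameter k."""
    return w * u

def train_end_to_end(u_values, theta_obs, lr=2.0, steps=2000):
    """Fit the transfer-function weight w by gradient descent on the moisture misfit."""
    w = 0.0
    n = len(u_values)
    for _ in range(steps):
        grad = 0.0
        for u, obs in zip(u_values, theta_obs):
            sim = process_model(transfer(w, u))
            # Chain rule: dL/dw = 2 (sim - obs) * d(sim)/dk * dk/dw,
            # with d(sim)/dk = -sim and dk/dw = u.
            grad += 2.0 * (sim - obs) * (-sim) * u / n
        w -= lr * grad
    return w

u_values = [0.2, 0.4, 0.6, 0.8, 1.0]  # synthetic static property
w_true = 0.8                          # synthetic "true" weight
theta_obs = [process_model(transfer(w_true, u)) for u in u_values]
w_hat = train_end_to_end(u_values, theta_obs)
```

The parameter `k` is never fitted directly: only the weight `w` is updated, and the loss is computed on the simulated state, which is the defining feature of the end-to-end setup described above.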

Soil-Water Model
The soil water balance model used in this study is based on the root zone water balance computation described in Allen et al. (1998): where i is the current time (day); $D_r$ [L]  small in our simulation cases and considered as zero. For the detailed hydrological calculation process in the soil-water model, please see Text S1 in Supporting Information S1.
Figure 2. In the two-step strategy, model parameters are first estimated by the inversion method using the regional soil moisture observations. Then, the soil/landscape static property data and the derived model parameters are used as inputs and outputs, respectively, to train the CNN-based integrated model. In the end-to-end strategy, the soil hydraulic parameters inferred by the CNN-based integrated model based on the soil/landscape static property data are directly sent to the soil-water model to obtain the soil moisture. The gradient descent method is employed to train the CNN-based integrated model by reducing the loss function between the regional soil moisture observations and simulations.
To calculate DP, we ignore the matric potential gradient and assume that precipitation and evapotranspiration do not affect the soil moisture distribution, leading to the equation $\partial\theta/\partial t = \partial K(\theta)/\partial z$ (5), where $t$ [T] is time; $z$ [L] represents the vertical coordinate, positive upward; and $K(\theta)$ [L/T] is the unsaturated hydraulic conductivity, which is a function of soil moisture ($\theta$) parameterized by $K_{sat}$ [L/T], the saturated hydraulic conductivity, and by $\theta_{fc}$ and $\theta_s$, the $\theta$ values at the field capacity point and saturation point. $DP_{\Delta t}$ can be derived by integration, where $Z_r$ [L] is the root zone depth. Based on Equations 5 and 6, $\theta_{t+\Delta t}$ can then be obtained, where $\Delta t$ is the numerical calculation step, which is generally 1 hr. Therefore, 24 sub-steps must be run for each daily step i. We implement this model in Python based on the open-source swb code (Christofides, 2018). Unlike the original swb model, which uses an empirical parameter, drain time, to calculate DP, the revised DP calculation procedure (Equations 7 and 8) is widely used in many regional-scale hydrological models, for example, SWAT (Neitsch et al., 2011), to provide a more accurate description of soil water redistribution during drainage (Zha, 2014). In addition, compared with drain time, the parameter $K_{sat}$ has a clearer soil physical meaning, which is convenient for evaluating $K_{sat}$ estimates against the open data sets for saturated hydraulic conductivity obtained by other PTF models (e.g., Dai et al., 2019; Y. Zhang et al., 2018).
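As an illustration of hourly sub-stepping within a daily step, the sketch below uses the percolation form documented for SWAT (drainable excess above field capacity released exponentially with a travel time of $(\theta_s - \theta_{fc}) Z_r / K_{sat}$). This form is an assumption adopted because the text cites SWAT; it is not claimed to reproduce Equations 7 and 8 of this study, and the function names are illustrative.

```python
import math

def drain_substep(theta, k_sat, theta_fc=0.2, theta_s=0.5, z_r=1.0, dt=1.0 / 24.0):
    """One sub-step of SWAT-style percolation (assumed form, not the paper's Equations 7-8).

    theta: volumetric soil moisture (-); k_sat: m/d; z_r: root zone depth (m); dt: days.
    Returns (new_theta, deep_percolation_in_m).
    """
    excess = max(theta - theta_fc, 0.0) * z_r          # drainable water depth (m)
    travel_time = (theta_s - theta_fc) * z_r / k_sat   # percolation travel time (d)
    dp = excess * (1.0 - math.exp(-dt / travel_time))  # water leaving the root zone (m)
    return theta - dp / z_r, dp

def drain_one_day(theta, k_sat, n_sub=24):
    """Run the hourly sub-steps that make up one daily step."""
    total_dp = 0.0
    for _ in range(n_sub):
        theta, dp = drain_substep(theta, k_sat, dt=1.0 / n_sub)
        total_dp += dp
    return theta, total_dp

theta_wet, dp_wet = drain_one_day(0.45, k_sat=0.5)  # above field capacity: drains
theta_dry, dp_dry = drain_one_day(0.15, k_sat=0.5)  # below field capacity: no drainage
```

The default values of `theta_fc`, `theta_s`, and `z_r` follow the constants stated for CONUS in this section; the `k_sat` value of 0.5 m/d is arbitrary.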
In this regard, the parameter $K_{sat}$ is selected to be estimated by the CNN-based integrated model. That means we keep all the other parameters known and optimize $K_{sat}$ to reflect the information embedded in soil moisture observations. In the meantime, the CNN training parameters (i.e., weights and biases) are also optimized to establish the link between the optimized $K_{sat}$ and observations of soil/landscape static properties. As mentioned before, these two optimizations are done simultaneously in the end-to-end strategy but sequentially in the two-step strategy. However, previous studies (e.g., Peng et al., 2021) have noticed that there are intrinsic differences between model-simulated and satellite-derived soil moisture. Directly assimilating satellite-derived soil moisture into the model without accounting for this difference could cause the parameter inversion to fail. To resolve this issue, we follow the procedures employed by De Lannoy et al. (2007), P. Li, Zha, Tso, et al. (2020), R. H. Reichle et al. (2004), Tian et al. (2019), Tsai et al. (2021), and Varble and Chávez (2011) and adopt a linear model to describe the difference between the simulated soil moisture $\theta_{sim}$ and the observed soil moisture $\theta_{obs}$. In addition, the parameters of this linear model, namely, the scaling parameter A and bias parameter B, are believed to be linked to some soil/landscape static/statistical attributes (e.g., standard deviation and mean values of soil moisture), so they can possibly be derived by the PTF in conjunction with $K_{sat}$ estimation (Tsai et al., 2021). In the synthetic scenarios (see Section 3 for details), the scaling (A) and bias (B) parameters associated with soil/landscape static properties are simply set based on Equations 10 and 11, respectively.
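A linear rescaling between simulated and observed soil moisture can be sketched as follows. The direction of the relation, $\theta_{obs} \approx A\,\theta_{sim} + B$, is an assumption made for illustration, and ordinary least squares is used here only as a simple stand-in: in this study A and B are estimated by the CNN-based integrated model, not by a per-cell regression.

```python
def fit_linear_rescaling(theta_sim, theta_obs):
    """Ordinary least-squares estimate of A and B in theta_obs ~ A * theta_sim + B."""
    n = len(theta_sim)
    mean_sim = sum(theta_sim) / n
    mean_obs = sum(theta_obs) / n
    cov = sum((s - mean_sim) * (o - mean_obs) for s, o in zip(theta_sim, theta_obs))
    var = sum((s - mean_sim) ** 2 for s in theta_sim)
    a = cov / var
    b = mean_obs - a * mean_sim
    return a, b

# Synthetic series generated with A = 0.8 and B = 0.05 (arbitrary illustrative values).
theta_sim = [0.20, 0.25, 0.30, 0.35, 0.40]
theta_obs = [0.8 * s + 0.05 for s in theta_sim]
a_hat, b_hat = fit_linear_rescaling(theta_sim, theta_obs)
```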
To sum up, the CNN-based integrated model is only used to estimate parameters $K_{sat}$, A, and B, while the uncertainties from the other soil hydraulic parameters are omitted, and parameters $\theta_s$, $\theta_{fc}$, $\theta_{wp}$, $Z_r$, $p$, and $K_c$ over CONUS are set to constant values of 0.5, 0.2, 0.078, 1 m, 0.1, and 0.5, respectively.

Water Resources Research
10.1029/2023WR035543 LI ET AL.

CNN-Based Integrated Model
Due to the convolution function of CNN (Fukushima, 1980), the spatial scales represented by the CNN input and output depend solely on the CNN structure. For the basic CNN model, the first layer consists of x inputs (i.e., soil/landscape static properties), each with dimensions of 10 × 10 grids and each grid with a spatial size of 1 × 1 km². The final output layer consists of y outputs (e.g., soil hydraulic parameters), each with the shape of 1 × 1 and each grid with a spatial size of 10 × 10 km². Using this CNN setting, inputs at the 1 km scale are linked to outputs at the 10 km scale. Similarly, inputs with dimensions of 50 × 50 and 100 × 100 grids are used for output parameters at the 50 and 100 km scales. The detailed configuration of the CNN adopted in this study is shown in Text S2 and Figure S1 in Supporting Information S1.
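How a layer stack fixes the input-to-output scale ratio can be checked with the standard output-size formula for convolution and pooling layers. The layer stacks below are hypothetical and chosen only so that 10 × 10, 50 × 50, and 100 × 100 inputs each reduce to a 1 × 1 output; the configuration actually used is the one given in Text S2 and Figure S1.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer (floor convention)."""
    return (n + 2 * padding - kernel) // stride + 1

def trace(n, layers):
    """Return the list of sizes after each (kernel, stride, padding) layer."""
    sizes = [n]
    for kernel, stride, padding in layers:
        n = out_size(n, kernel, stride, padding)
        sizes.append(n)
    return sizes

# Hypothetical tail shared by all scales: 10 -> 10 -> 5 -> 3 -> 1
TAIL = [(3, 1, 1), (2, 2, 0), (3, 1, 0), (3, 1, 0)]
# Hypothetical pooling front ends that first reduce 50 or 100 cells to 10
STACKS = {
    10: TAIL,
    50: [(5, 5, 0)] + TAIL,
    100: [(10, 10, 0)] + TAIL,
}
sizes = {n: trace(n, layers) for n, layers in STACKS.items()}
```

With 1 km input cells, a final size of 1 for an input of n cells means that one output value represents an n km × n km area.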

Soil/Landscape Static Data
Soil/landscape static data are needed as input for training the PTF and scale conversion. To evaluate the proposed CNN-based integrated model, a series of test cases under synthetic and real-world scenarios are designed (see Section 3 for details). Under the synthetic scenario, percentages of sand and clay with a resolution of 1 km (soil basic properties) around the conterminous United States (CONUS) (60°-130°W, 20°-54°N) are used as the inputs for the PTF; they are derived by scaling (nearest neighbor sampling, i.e., the NN algorithm mentioned in Section 2.1) from SoilGrids 250 m V2 (Hengl et al., 2017; Poggio et al., 2021). Other static attributes, such as NDVI, Slope, etc. (Table 1), are also collected for the real-world scenario. They are also scaled to 1 or 10 km based on their raw resolutions. Please refer to Figures S2-S4 in Supporting Information S1 for these variables' spatial distributions.

Dynamic Data
Dynamic data are usually used for forcing the soil-water model (e.g., precipitation and evaporation) and also serve as the observations (e.g., soil moisture) for inversion. Daily total precipitation and potential evaporation at 10 km resolution around CONUS during 2014-2017 are obtained from the ERA5-Land analysis by aggregating hourly data to daily values (Muñoz Sabater, 2019). Daily root zone soil moisture (0-1 m) data at 10 km resolution (Brocca et al., 2012) around CONUS from 1 April 2015 (the earliest available date) to 31 December 2017 are derived from Soil Moisture Active Passive (SMAP) L4 V6 Analysis Update data (Reichle et al., 2021).
It is known that soil moisture time series may have a strong temporal correlation. Very high-frequency observations for data assimilation may significantly increase the computational burden while not improving model performance because of the information redundancy. Previous studies indicate that the optimal temporal sampling

Evaluation Criteria
To evaluate the accuracy of parameter estimates (i.e., $K_{sat}$, A, and B) and state simulation (i.e., soil moisture), several metrics are used: correlation coefficient (CORR), root mean square error (RMSE), and its normalized version (NRMSE). Please see Text S3 in Supporting Information S1 for the calculation details of these metrics.
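A minimal sketch of the three metrics is given below. CORR is taken as the Pearson correlation coefficient, and NRMSE is assumed here to be the RMSE divided by the range of the reference values; other normalizations (mean or standard deviation) are also common, and the definition actually used is the one in Text S3.

```python
import math

def corr(est, ref):
    """Pearson correlation coefficient between estimates and reference values."""
    n = len(ref)
    mean_est, mean_ref = sum(est) / n, sum(ref) / n
    cov = sum((e - mean_est) * (r - mean_ref) for e, r in zip(est, ref))
    var_est = sum((e - mean_est) ** 2 for e in est)
    var_ref = sum((r - mean_ref) ** 2 for r in ref)
    return cov / math.sqrt(var_est * var_ref)

def rmse(est, ref):
    """Root mean square error."""
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(est, ref)) / len(ref))

def nrmse(est, ref):
    """RMSE normalized by the range of the reference (assumed normalization)."""
    return rmse(est, ref) / (max(ref) - min(ref))

reference = [1.0, 2.0, 3.0, 4.0]
estimate = [1.5, 2.5, 3.5, 4.5]  # constant offset of 0.5
scores = (corr(estimate, reference), rmse(estimate, reference), nrmse(estimate, reference))
```

The example shows why the metrics are reported together: a constant offset leaves CORR at 1 while RMSE and NRMSE are nonzero.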

Case Implementations
In this section, we first take different combinations of PTFs, up-/down-scaling techniques, and scale conversion frameworks to evaluate the effects of uncertainties in PTF and scale conversion on the derived parameters (Section 3.1). Then, we design several synthetic cases (Section 3.2): seven cases for comparisons of estimation strategies and effects from potential errors (Section 3.2.1) and three cases for investigating the performances of the proposed methods at different scales (Section 3.2.2). Finally, the comprehensive performance of the proposed methods is tested in three cases designed under real-world scenarios.

Assessment of Uncertainty in PTF and Scale Conversion
Unlike the proposed CNN-based integrated model, the MPR-type and TR-type scale conversions involve explicit implementation of a scale conversion framework, either pre-mapping $u^{l_0}$ to $u^{l_{PTF}}$ (TR) or post-mapping $\beta^{l_{PTF}}$ to $\beta^{l_0}$ (MPR) (Figure 1). However, the choice between MPR and TR seems to be undetermined, and there are various choices for up-/down-scaling techniques and forms of PTF functions. In this section, we extensively assess the uncertainty in PTF and scale conversion, that is, from $u^{l_0}$ to $\beta^{l_1}$, based on MPR-type and TR-type scale conversion. During the assessment, soil basic properties within the 0-5 cm depth covering CONUS at the spatial scale of 1 km are employed as inputs, while the outputs are the $K_{sat}$ maps at the spatial scale of 10 km. For the TR-type scale conversion, we first aggregate the 1 km soil basic properties into 10 km ones, and then they are used as inputs for the given PTF function to obtain the 10 km $K_{sat}$ maps. In contrast, the MPR model first obtains the 1 km $K_{sat}$ maps by running the PTF at the 1 km scale, and then the results are aggregated to the scale of 10 km. Here, two forms of PTF (i.e., CB1 and CB2) and eight up-/down-scaling techniques (i.e., NN, BL, BC, ME, MI, MA, MD, and MO, see Section 2.1) are used exhaustively for both MPR-type and TR-type scale conversions, leading to a total of 8 × 2 × 2 = 32 $K_{sat}$ maps at 10 km resolution (please refer to Figure S5 in Supporting Information S1 for the visual illustration). The evaluation metrics (see Section 2.6) are calculated for each pair of $K_{sat}$ maps, leading to three metric matrices of size 32 × 32. For brevity, we removed diagonals and duplicate elements in the matrices and only present the RMSE matrix in Figure 3; the remaining two can be found in Figure S6 (CORR) and Figure S7 (NRMSE) in Supporting Information S1. It should be noted that in this section, there are no "true" values for $K_{sat}$, and the RMSE values only reflect the difference in $K_{sat}$ prediction induced by the choice among the 32 combinations. For convenience, the 32 combinations are named with the format XX-YY (for MPR-type scale conversion) or YY-XX (for TR-type scale conversion), where XX and YY are selected from the names of the PTFs and the names of the eight up-/down-scaling techniques, respectively. For instance, CB1-NN means that the PTF of the form CB1 and the up-/down-scaling technique NN are used in MPR-type scale conversion.
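The naming rule and the size of the comparison can be reproduced with a short enumeration. This is only a bookkeeping sketch: it builds the 32 labels and counts the unique off-diagonal pairs, without computing any $K_{sat}$ map.

```python
from itertools import combinations, product

PTFS = ["CB1", "CB2"]
TECHNIQUES = ["NN", "BL", "BC", "ME", "MI", "MA", "MD", "MO"]

# MPR-type labels use the format XX-YY (PTF first, technique second).
mpr_names = [f"{ptf}-{tech}" for ptf, tech in product(PTFS, TECHNIQUES)]
# TR-type labels use the format YY-XX (technique first, PTF second).
tr_names = [f"{tech}-{ptf}" for ptf, tech in product(PTFS, TECHNIQUES)]

names = mpr_names + tr_names
# Unique unordered pairs: the 32 x 32 matrix without its diagonal and duplicates.
pairs = list(combinations(names, 2))
```

Removing the diagonal and the duplicated triangle from a 32 × 32 matrix leaves 32 × 31 / 2 = 496 pairwise comparisons per metric.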

Cases Designed Under Synthetic Scenarios
In this section, the proposed CNN-based integrated model is assessed by different cases under synthetic scenarios, with the benefit of knowing the "true" hydraulic parameters and the exact relationship between soil basic properties and hydraulic parameters.

Comparisons of Estimation Strategies and Potential Errors
As for the reference model in the synthetic cases, CB2-MA is selected as the reference PTF, up-/down-scaling technique, and scale conversion framework to derive the reference 10 km $K_{sat}$, A, and B values, based on which the soil-water model is run from 2014 to 2017 to provide reference observations. Table 2 shows the seven cases (including the reference truth) that are designed and tested. They are: (a) case "Reference," in which CB2-MA (see Section 3.1 for naming rules) generates the reference soil hydraulic parameters, and then the synthetic "observed" soil moisture data are generated by the soil-water model with reference parameters $K_{sat}$, A, and B; (b) case "Inverse," estimating the soil hydraulic parameters using the first step of the two-step strategy with only "observed" soil moisture, which does not consider the information from soil basic properties and thus is similar to traditional inverse approaches; (c) case "CNN (two-step)," estimating the parameters by sequentially incorporating the information of "observed" soil moisture and soil basic properties by the two-step strategy; (d) case "CNN (end-to-end)," estimating the parameters by simultaneously assimilating information from "observed" soil moisture and soil basic properties by the end-to-end strategy; (e) case "TR," in which the 1 km soil basic properties are first mapped to 10 km ones via the MA up-/down-scaling technique, and then CNN is used to map the 10 km soil basic properties to 10 km soil hydraulic parameters (TR-type scale conversion). The hydraulic parameters are estimated by simultaneously assimilating information from soil moisture observations and soil basic properties (end-to-end strategy); (f) case "MPR (with error)," in which the 1 km soil basic properties are first mapped to 1 km soil hydraulic parameters by CNN, and then aggregated to 10 km ones via the MI up-/down-scaling technique (MPR-type scale conversion). The CNN model is optimized using the end-to-end strategy; (g) case "MPR (without error)," similar to case "MPR (with error)," but using the MA up-/down-scaling technique instead of the MI up-/down-scaling technique. For each case, in addition to the visual inspection of the spatial distribution of $K_{sat}$, A, and B, the estimates are also quantitatively evaluated by independent model simulations using these inputs and compared with the three "observed" data sets that are used for temporal, spatial, and spatio-temporal generalization (see Section 2.5.2).
The results from case "CNN (two-step)," case "CNN (end-to-end)," and case "Inverse" can be compared and used to demonstrate the performance of the CNN-based integrated model optimized by two different estimation strategies (two-step and end-to-end) and the traditional inverse method. The results are analyzed and discussed in Section 4.2.1. Meanwhile, based on the same end-to-end strategy, the comparison between case "CNN (end-to-end)," case "TR," case "MPR (with error)," and case "MPR (without error)" can demonstrate the effects of different sources of errors on parameter estimation (presented in Section 4.2.2).

Cross-Scale Transfer in Other Scales
In this section, the ability of the CNN-based integrated model to transfer parameters at different scales is tested in comparison with the MPR-type scale conversion framework. Therefore, soil hydraulic parameters at other model scales, namely, 50 and 100 km, are derived. To this end, the forcing data and soil moisture observations at the corresponding scales are generated by averaging the 10 km forcing data and soil moisture observations (Vereecken et al., 2007) to run the models at the corresponding scales. Serving as the comparative benchmark, the calibrated PTF models in Section 3.2.2 using the MPR-type methods from cases "MPR (with error)" and "MPR (without error)" are used. Accordingly, three kinds of cases are designed and conducted. They are (a) case "MPR (with error)," (b) case "MPR (without error)," and (c) case "CNN (end-to-end)." As for the MPR-type cases, their PTF and scale conversion are the same as in the cases with the same name in Section 3.2.1, and we do not recalibrate the model, using the previously adopted up-/down-scaling technique, scale conversion framework, and trained PTF to directly transfer parameters $K_{sat}$, A, and B at the scales of 10, 50, and 100 km. As for the CNN-based integrated method, we recalibrate the model using the aggregated observations and forcing data, and all other settings are similar to the case "CNN (end-to-end)" in Section 3.2.1. The evaluation is based on the fit of the model simulation to the observations during the test period and in the test grids (spatio-temporal generalization) at different scales.

Cases Designed Under Real-World Scenarios
In this section, three cases under real-world scenarios are designed and conducted to examine the robustness of our proposed CNN-based integrated model. They are (a) case "Inverse," (b) case "CNN (two-step)," and (c) case "CNN (end-to-end)." Unlike synthetic scenarios, under real-world scenarios, estimations of parameters $K_{sat}$, A, and B can be improved by including more factors in addition to sand and clay fractions (Y. Zhang & Schaap, 2019). Therefore, additional static attributes, that is, MTP, MET0, AvgSMAP, StdSMAP, NDVI, Slope, and Sand and Clay fractions at different depths, are included as predictors for training the PTF and scale conversion (Table 1). Besides, instead of using model-simulated soil moisture as the benchmark as under synthetic scenarios, SMAP-derived root-zone soil moisture data around CONUS are used as the truth to evaluate the soil moisture predictions based on estimated parameters for the three cases. All other settings and evaluations are identical to the synthetic cases with the same names.

Differences in $K_{sat}$ Values
To evaluate the effects of the uncertainties in PTF and scale conversion on derived parameters, different combinations of two PTFs, eight up-/down-scaling techniques, and two scale conversion frameworks are used to derive the parameter $K_{sat}$ values (Section 3.1). The abbreviations for specific PTF, up-/down-scaling technique, and scale conversion framework are described in Section 2.1. Here, the differences between the derived parameter $K_{sat}$ values are illustrated in Figure 3 (RMSE), Figure S6 in Supporting Information S1 (CORR), and Figure S7 in Supporting Information S1 (NRMSE). The three evaluated metrics present similar results; hence, we focus on RMSE values here. The results reveal that differences resulting from different up-/down-scaling techniques (e.g., CB2-MI&CB2-MA) surpass those arising from different scale conversion frameworks (e.g., CB2-MA&MA-CB2) and different PTFs (i.e., CB1-MA&CB2-MA). This discrepancy may be attributed to the relatively linear mathematical forms of the PTFs. When considering only different scale conversion frameworks, scenarios involving CB2 with MA, MI, and MO, especially with the MA up-/down-scaling technique (i.e., CB2-MA&MA-CB2), generally exhibit larger differences compared to scenarios with other up-/down-scaling techniques (refer to the diagonal to the left in the panel divided by the gray lines in the fourth row and second column of Figure 3). When only different up-/down-scaling techniques are considered, the largest differences relate to the MA and MI up-/down-scaling techniques (refer to the four frames divided by the gray lines located on the side of the hypotenuse in Figure 3). The second largest differences relate to MO/MD, followed by the remaining up-/down-scaling techniques. MPR-type scale conversion results in larger differences than TR-type scale conversion (refer to the comparisons between the frames divided by black lines in the upper left corner and lower right corner of Figure 3). This discrepancy suggests that the TR-type scale conversion considers less spatial heterogeneity resulting from the fine-scale soil/landscape static properties (Samaniego et al., 2010). Within MPR-type scale conversion, CB2 introduces larger differences than CB1 (refer to the comparison between the frames divided by the gray lines in the second row and second column and in the first row and first column in Figure 3). Conversely, within TR-type scale conversion, CB1 with one independent variable brings larger differences than CB2 (refer to the comparison between the frames divided by the gray lines in the third row and third column and the fourth row and fourth column of Figure 3). In general, when different PTFs, up-/down-scaling techniques, and scale conversion frameworks are combined, significantly larger differences may emerge, compared with the case when only one type of uncertainty exists (e.g., CB2-MA&MI-CB1 vs. CB1-NN&CB1-BL). These findings underscore the critical importance of method development to mitigate errors caused by the different sub-processes in cross-scale transfer, and provide a guideline for the subsequent model evaluation.

Performance of the Synthetic Cases
All the results from the synthetic cases are presented in this section. In synthetic cases, the "true" parameters and soil moisture states are known to us. Therefore, it is easy to conduct in-depth evaluations of the performances of our proposed methods. Here, we followed the implementation details of Section 3.2 to first investigate the differences between the two model estimation strategies: the two-step and end-to-end strategies (Section 4.2.1). Then, based on the better estimation strategy, the effects of different potential errors in the scale conversion framework (case "TR") and in the up-/down-scaling technique (case "MPR (with error)") on the model estimation are assessed (Section 4.2.2). Finally, the ability of our proposed CNN-based integrated model to conduct cross-scale transfer at different spatial scales (10, 50, and 100 km) is examined (Section 4.2.3).

Performance of Two Estimation Strategies
The results on the soil moisture estimates from different estimation methods (the first-third bars of each group in Figure 4; scatter plots between reference and estimated soil moisture data are shown in the first-third columns of Figure S8 in Supporting Information S1) indicate that the inverse method (labeled as "Inverse") performs best, followed by the end-to-end (labeled as "CNN (end-to-end)") and two-step (labeled as "CNN (two-step)") strategies. However, the case "CNN (end-to-end)" performs consistently over different scenarios (e.g., RMSE = 0.0056-0.057), followed by the case "CNN (two-step)" (e.g., RMSE = 0.0084-0.0088) and case "Inverse" (e.g., RMSE = 0.0005-0.0013). Note that the high performance yielded in the case of "Inverse" results from a lack of uncertainties, for example, the model structure error (Q. Zhang et al., 2019). Its performance may deteriorate quicker than the results in the case "CNN (end-to-end)" if other sources of errors are considered. This could be caused by the nature of the inverse method, which is optimized cell-by-cell, while the end-to-end strategy considers the global performances of all the training grids.
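The idea of fitting one global transfer function through the process model, rather than one parameter per cell, can be sketched in a few lines. This is a toy illustration under strong assumptions and not the model of this study: a two-weight logistic function stands in for the CNN, a linear reservoir stands in for the soil-water model, the forcing and attribute data are synthetic, and finite differences replace backpropagation. All function names are hypothetical.

```python
import numpy as np


def transfer(weights, attribute):
    """Stand-in for the CNN: map a static attribute to a rate parameter in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-(weights[0] + weights[1] * attribute)))


def simulate(rate, precip, theta0=0.2):
    """Toy soil-water model: theta <- theta + P - rate * theta, for every grid cell."""
    theta = np.full(rate.shape, theta0)
    states = []
    for p in precip:
        theta = theta + p - rate * theta
        states.append(theta)
    return np.stack(states)  # shape (time, cells)


def loss(weights, attribute, precip, observed):
    """Mean squared error between simulated and observed soil moisture over all cells."""
    simulated = simulate(transfer(weights, attribute), precip)
    return float(np.mean((simulated - observed) ** 2))


def numerical_gradient(func, weights, eps=1e-6):
    """Central finite differences; a real implementation would use automatic differentiation."""
    grad = np.zeros_like(weights)
    for i in range(weights.size):
        step = np.zeros_like(weights)
        step[i] = eps
        grad[i] = (func(weights + step) - func(weights - step)) / (2.0 * eps)
    return grad


rng = np.random.default_rng(0)
attribute = rng.uniform(-1.0, 1.0, size=50)   # synthetic static property, 50 cells
precip = rng.uniform(0.0, 0.02, size=60)      # synthetic forcing, 60 time steps
observed = simulate(transfer(np.array([-2.0, 1.0]), attribute), precip)

weights = np.zeros(2)
initial_loss = loss(weights, attribute, precip, observed)
for _ in range(500):
    objective = lambda w: loss(w, attribute, precip, observed)
    weights = weights - 1.0 * numerical_gradient(objective, weights)  # gradient descent
final_loss = loss(weights, attribute, precip, observed)
```

The loss is evaluated on the simulated state across all cells at once, so the transfer function is constrained globally by the static attribute instead of being fitted to cell-wise parameter estimates.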
Surprisingly, the best performance in regional soil moisture estimations by the inverse method ("Inverse") has the worst $K_{sat}$ estimation (column 1 of Figure S9 in Supporting Information S1 and row 2 of Figure 5). However, the inverse method still yields good estimates of parameters A and B. In the reference model, parameters A and B are sensitive to the observations for all grids, while the parameter $K_{sat}$ is a key factor controlling the soil moisture simulation when soil moisture is between saturation and field capacity as described in Equation 6. Once out of this scenario, the inverse method will estimate functionally equivalent $K_{sat}$ values that are significantly different from the "true" values. This is the common issue known as "the issue of parameter identifiability" (Beven, 2006; Yi & Park, 2021), which is elusive in real-world cases but may significantly affect model performance. For example, the wrong parameters significantly deteriorate the CNN-based integrated model estimation in step two of the two-step strategy ("CNN (two-step)"). Although the $K_{sat}$ estimates by the inverse method and the two-step strategy are similar to the reference in some regions, many parts are affected by the "issue of parameter identifiability." In these parts, the estimated wrong but functionally equivalent $K_{sat}$ values are very homogeneous and low, which can be attributed to the initial guess values, parameter-constrained range, and logarithm-based parameter estimation of the parameter $K_{sat}$ (please refer to Figure S10 in Supporting Information S1). All of these factors have effects on the derived $K_{sat}$ values when the parameter $K_{sat}$ is not sensitive to the observations. For example, logarithm-based parameter estimation makes the calibrated logarithm-based distribution of $K_{sat}$ that is not sensitive to the observations close to the initial logarithm-based $K_{sat}$ distribution, but the logarithm-based $K_{sat}$ distribution becomes much narrower when transformed into a normal distribution. On the contrary, the end-to-end strategy ("CNN (end-to-end)"), directly regularized (or constrained) by soil/landscape static properties, can obtain $K_{sat}$ estimates as accurate as its performance of soil moisture simulation. Compared with the performances of the case "CNN (two-step)" and case "Inverse," the spatial patterns of $K_{sat}$ estimated by the end-to-end strategy are also closest to the reference. Furthermore, the parameters $K_{sat}$ and A estimated by the two-step strategy ("CNN (two-step)") have a globally better performance than the parameters directly derived from the inverse method ("Inverse"), showing the advantages of global regularization based on the soil/landscape static properties. However, the soil moisture estimates by the two-step strategy ("CNN (two-step)") perform worse than the inverse method ("Inverse"), probably because parts of $K_{sat}$ values estimated by the inverse method in step one are sensitive to the observations. This part of the estimated $K_{sat}$ values is relatively correct in step one and becomes more biased relative to the reference after global regularization in step two due to false $K_{sat}$ estimations in step one in some grids where $K_{sat}$ is not sensitive to the observations. Feigl et al. (2020) also stated that pursuing a parameter transfer function after conducting parameter optimization might generate a weak or false regionalization due to the issue of parameter identifiability (Beven, 2006; Yi & Park, 2021).
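The identifiability issue described above can be reproduced with a toy bucket model. The sketch below is only an illustration and is not Equation 6 of this study: the drainage formulation, the parameter values, and the constant forcing series are assumptions chosen so that the conductivity-like parameter acts only when soil moisture exceeds field capacity.

```python
import numpy as np


def simulate(k_sat, precip, theta0=0.2, fc=0.3, sat=0.45, et_rate=0.05):
    """Toy bucket model in which drainage depends on k_sat only above field capacity."""
    theta = theta0
    series = []
    for p in precip:
        drainage = k_sat * max(theta - fc, 0.0) / (sat - fc)
        theta = min(theta + p - et_rate * theta - drainage, sat)
        series.append(theta)
    return np.array(series)


dry_forcing = np.full(50, 0.005)  # soil moisture never reaches field capacity
wet_forcing = np.full(50, 0.03)   # soil moisture rises above field capacity

dry_low, dry_high = simulate(0.01, dry_forcing), simulate(0.1, dry_forcing)
wet_low, wet_high = simulate(0.01, wet_forcing), simulate(0.1, wet_forcing)

# Largest difference between the two simulations under each forcing.
dry_gap = float(np.max(np.abs(dry_low - dry_high)))
wet_gap = float(np.max(np.abs(wet_low - wet_high)))
```

Under the dry forcing the two parameter values give identical soil moisture series, so observations from such a cell carry no information about the parameter; under the wet forcing the series diverge and the parameter becomes identifiable.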

Influence of Potential Errors on Performances
The soil moisture estimates obtained with different potential errors by the end-to-end strategy are shown in the fourth-fifth bars of each group in Figure 4 (scatter plots between reference and estimated soil moisture data are shown in the fourth-fifth columns of Figure S8 in Supporting Information S1). The results show that both the mistaken scale conversion framework (labeled as "TR") and the mistaken up-/down-scaling technique (labeled as "MPR (with error)") generate errors in the soil moisture simulation that are several times larger than those of our proposed CNN-based integrated model, which avoids these potential errors (case "CNN (end-to-end)"). The mistaken scale conversion framework shows fewer adverse effects on estimating parameters than the mistaken up-/down-scaling technique, since the parameters are assumed to be only related to sand and clay fractions based on the relatively simple Equations 10 and 11 in the synthetic cases, which may explain the better performance of case "TR" than case "MPR (with error)." Therefore, it is not easy to determine the relative magnitudes of effects in optimizing PTF. The parameters estimated in cases 4-5 are shown in columns 4-5 of Figure S9 in Supporting Information S1 and Rows e-f of Figure 5.
Compared with the errors in the up-/down-scaling technique, namely, case "MPR (with error)", all the parameters estimated in the case "TR" have relatively larger CORR values with the reference values, an expected result since there are fewer adverse effects from the mistaken scale conversion framework. The mistaken up-/down-scaling technique significantly affects all parameter estimations (i.e., $K_{sat}$, A, and B) in the case "MPR (with error)", consequently biasing the results relative to the reference values. Also, the spatial pattern of parameters estimated in case "TR" is more similar to the reference, although absolute differences between them exist. Meanwhile, the spatial pattern of parameters estimated in the case "MPR (with error)" differs more from the reference, particularly for the parameter $K_{sat}$. However, the error for soil moisture retrieved in case "MPR (with error)" seems to be on a similar level in comparison with results from the case "TR" due to the issue of parameter identifiability (Beven, 2006; Yi & Park, 2021). The utilization of derived parameters in case "MPR (with error)" should be handled with caution since the wrong but functionally equivalent parameters may lead to (a) a wrong hydrological process and (b) wrong simulated soil moisture values in the case of a nonstationary future scenario. The parameters estimated in case "CNN (end-to-end)" have a much more similar spatial pattern to the reference and better statistical results than those obtained in the two cases (i.e., cases "TR" and "MPR (with error)"), further showing the harmful effects of potential errors in cross-scale transfer on parameter and state estimations and the advantages of using the proposed CNN-based integrated model.
Additionally, the case "MPR (without error)" (column 6 of Figure S9 in Supporting Information S1 and Row g of Figure 5) yields better results than the cases that have the mistaken scale conversion framework (case "TR") and the mistaken up-/down-scaling technique (case "MPR (with error)"), than our proposed CNN-based integrated model (case "CNN (end-to-end)"), and even than the case "Inverse" globally for the training grids. This highlights the importance of a correct up-/down-scaling technique and scale conversion framework for the parameter and regional soil moisture estimation, and the great practical significance of the MPR method once the up-/down-scaling technique has been cautiously determined.

Performance on the Other Scales
The reference up-/down-scaling technique from 1 to 10 km is set as the MA up-/down-scaling technique, which can bring much nonlinearity in the cross-scale transfer. The nonlinearity in the cross-scale transfer from 1 to 50 km or 100 km tends to decrease as the nonlinearity in cross-scale transfer for other scales tends to be less. These factors may have caused the up-/down-scaling technique to significantly differ between scale changes (Binley et al., 1989). Besides, the "true" cross-scale transfer process from 1 to 50 km and 100 km is unknown even in these synthetic cases. Therefore, we evaluate the performances according to the model simulation fitting to the observations at different scales. Figure 6 shows that although case "MPR (without error)" performs best at the 10 km spatial scale, the CNN-based model (case "CNN (end-to-end)") performs best and most stably at other spatial scales, followed by case "MPR (without error)", and case "MPR (with error)" performs worst. This is not surprising since there is a high probability that the up-/down-scaling technique is not fixed in the processes of conducting parameter transfer at these different scales. In these scenarios, it is better to recalibrate PTF and scale conversion when deriving the parameter at another scale. It should be noted that we do not change the soil-water model over different scales, which may bring some uncertainties. It would be better to select the most appropriate hydrological model in the real world since different hydrological models generally have their best applicable scales, and the parameter in one hydrological model to describe the scale-mismatch hydrologic process may not exist. The performance results indicate that these complexities and uncertainties in the cross-scale transfer have deteriorated the performance of case "CNN (end-to-end)" from 10 to 100 km. Nevertheless, it shows that our proposed CNN-based integrated model with the end-to-end strategy, combining the features of the "forward upscaling" and "inverse upscaling" (Vereecken et al., 2007), has made great progress on the issues of parameter scaling and derivation of effective values.

Performances of Real-World Case
The performance of our proposed CNN-based integrated model is tested in this section based on the comparisons between the simulated and the SMAP-derived soil moisture values in the real-world case. It should be noted that there are many more uncertainties (e.g., observations, models, and forcing data) under real-world scenarios than under synthetic scenarios that only include uncertainties in parameters. Figure 7 shows that the errors in the soil moisture estimated by all methods are very similar to the inherent error of the SMAP-derived data (Reichle et al., 2021). Unsurprisingly, the soil moisture performances simulated by all methods, especially for the inverse method ("Inverse"), significantly deteriorate from the training period to the test period since more significant uncertainties exist in the real world than in the synthetic cases. Besides, the end-to-end strategy ("CNN (end-to-end)") outperforms the inverse method ("Inverse") for temporal generalization, which is different from the synthetic cases. Scatter plots between the SMAP-derived and model-simulated soil moisture data (Figure S11 in Supporting Information S1) show that the inverse method ("Inverse") and two-step strategy ("CNN (two-step)") produce more outliers far away from the 45-degree line than the end-to-end strategy ("CNN (end-to-end)"), especially during temporal and spatio-temporal generalization, which indicates that the inverse method ("Inverse") and two-step strategy ("CNN (two-step)") may capture fewer "true" parameters due to the issue of parameter identifiability (Beven, 2006; Yi & Park, 2021) as discussed in the synthetic cases.
All the model parameters (i.e., $K_{sat}$, A, and B) estimated by these methods (i.e., the inverse method, the two-step strategy, and the end-to-end strategy) in the real-world case are presented in Figure 8. It shows that the $K_{sat}$ values estimated by the inverse method ("Inverse") and two-step strategy ("CNN (two-step)") are generally smaller and more homogeneous than those estimated by the end-to-end strategy ("CNN (end-to-end)"), except for a few small areas where soil moisture observations are sensitive to $K_{sat}$. This performance is similar to that in the synthetic case, where the $K_{sat}$ values estimated by the inverse method ("Inverse") tend to have some lower values (Figure S9 in Supporting Information S1 and Figure 5) than the rest of the methods. All methods have more similar spatial patterns of estimated parameter A and B values. The performances are similar to those in the synthetic cases since the soil moisture observations in all scenarios are sensitive to the values of parameters A and B. Furthermore, the spatial distribution of the parameters estimated by the end-to-end strategy ("CNN (end-to-end)") has a more similar pattern to the different static attributes (Figures S3 and S4 in Supporting Information S1). That is, the spatial distribution of $K_{sat}$ is similar to that of MET0, the spatial distribution of A is similar to that of stdSMAP, and the spatial distribution of B is similar to that of AvgSMAP.
A more intuitive comparison between parameters estimated by the end-to-end strategy ("CNN (end-to-end)") and inverse method ("Inverse")/two-step strategy ("CNN (two-step)") is shown in Figure S12 in Supporting Information S1. The patterns of $K_{sat}$ are similar to the patterns of the relationship between reference $K_{sat}$ and $K_{sat}$ estimated by the inverse method ("Inverse")/two-step strategy ("CNN (two-step)") in the synthetic cases shown in Figure S9 in Supporting Information S1. Meanwhile, the CORR values between the parameter $K_{sat}$ values estimated by the two-step strategy ("CNN (two-step)") and those estimated by the end-to-end strategy ("CNN (end-to-end)") are higher than the CORR values between the parameter $K_{sat}$ values estimated by the inverse method ("Inverse") and those estimated by the end-to-end strategy ("CNN (end-to-end)"). The above performances may serve as a supplement to indicate that the parameter $K_{sat}$ values estimated by the end-to-end strategy ("CNN (end-to-end)") are closer to the "true" parameters, which also shows the improvements of using the global constraints from the static attributes as revealed in the synthetic cases.
The use of different parameters derived from the three methods analyzed above also leads to differences in estimates of other variables, like runoff, in comparison with ERA5-Land (Figure S13 in Supporting Information S1). Runoff simulations using parameters estimated by the end-to-end strategy ("CNN (end-to-end)") are generally closer to the runoff values from ERA5-Land than the runoff simulations using parameters derived by the other two methods, particularly for the metrics RMSE and NRMSE. Although the soil-water model used in this study was not particularly developed for the runoff simulation, it also shows the effects of using worse parameter estimates from the inverse method and two-step strategy, as well as the impact of the issue of parameter identifiability (Beven, 2006; Yi & Park, 2021).

Discussions
While this study analyzes the potential errors in PTF and scale conversion as well as some progress made by our proposed method, further investigation into the following issues is still necessary.
In our synthetic cases, the reference Cosby PTF model is relatively simple and may not be appropriate for all kinds of soil types (Weihermüller et al., 2021), and it results in fewer effects from different scale conversion frameworks on the retrieved soil moisture values and parameter estimates. The impact of other kinds of up-/down-scaling techniques (Schweppe et al., 2022) (e.g., harmonic mean and geometric mean) and scale conversion frameworks (e.g., first scaling, then conducting PTF, and finally scaling) is not investigated. In investigating the effects of errors in the scale conversion framework and the up-/down-scaling techniques (i.e., cases "TR" and "MPR (with error)"), the scenarios with relatively large errors (i.e., MA in MPR-type and TR-type scale conversions, MA and MI in the MPR-type scale conversion) are selected to demonstrate the performance. However, we believe that errors with such magnitude exist with more complex PTF and more heterogeneous static attribute maps or real-world scenarios. Therefore, a future iteration of this analysis should consider more realistic scenarios and attribute maps to evaluate each method's performance better.
Figure 8. Spatial distributions of parameters estimated in the real-world case. Rows a-c denote the parameters estimated by the inverse method (step one in the two-step strategy, labeled as "Inverse"), the two-step (labeled as "CNN (two-step)"), and the end-to-end (labeled as "CNN (end-to-end)") strategies, respectively, also labeled at the left of each row. Columns 1-3 denote the parameters $K_{sat}$, A, and B, respectively, also labeled at the bottom of each column.

We also compared our $K_{sat}$ values estimated by the end-to-end strategy ("CNN (end-to-end)") in the real-world case with a publicly available data set, that is, Zhang et al. (2018) (Figure 9), although their spatial resolutions, generating predictors, and modeling methods are quite different. Results show that similar spatial patterns exist in some areas, for example, the southeast part around CONUS. However, some significant differences arise at other locations, even showing values with opposite spatial patterns. These different results may be attributed to the following reasons. First, there are inherent errors in the predictors of both studies, and Zhang et al. (2018) only included texture, bulk density data, field capacity, and wilting point to estimate the parameter $K_{sat}$, while some other covariates (e.g., the organic content and the large cracks) are not included in one or both studies, affecting the estimates. Second, there are differences in the spatial coverage of the parameter $K_{sat}$ values estimated by regional observations in this study and the point-scale measurements of $K_{sat}$ in Zhang et al. (2018). Dong et al. (2020) also found some negative CORR values between parameters derived using the static attribute data only or the remote sensing observations only in their model. Similarly, the spatial scale of $K_{sat}$ values estimated in Zhang et al. (2018) is 1 km for surface soils (0-5 cm), which is not consistent with our 10 km resolution for root-zone soils (0-100 cm). Because of this difference in the spatial scale between both data sets, significantly high nonlinearity in up-/down-scaling techniques and scale conversion frameworks can result in huge differences between the estimated parameters. Third, the soil-water model used in the current study has known limitations for representing soil moisture. For instance, the soil-water model does not consider the irrigation water and capillary rise terms, which can affect the retrieved results. In addition, there may be inconsistencies in the SMAP-derived soil moisture data (Fang et al., 2020) and ERA5-Land data, as well as in the static attribute inputs or predefined parameters in the soil-water model (e.g., NDVI and crop coefficient). Fourth, only the static attributes located within each grid cell of the model in this study are selected as inputs. Therefore, the effect of cells outside each model grid but correlated to the model parameters could be missed. Establishing an additional buffer distance for the static attributes, including the grid cells outside each model grid, may allow us to consider such effects (Xu et al., 2023). Also, future improvements on the CNN-based integrated model can involve adopting novel structures like ConvLSTM (Li et al., 2021; Shi et al., 2015) that can enhance the utilization of the spatio-temporally varying predictors (e.g., dynamic landscape features), adopting more advanced loss functions like Kling-Gupta efficiency (Gupta et al., 2009), and applying multi-objective optimization (e.g., soil moisture, terrestrial water storage, and runoff) to further avoid the issue of parameter identifiability (Beven, 2006; Yi & Park, 2021). Our approach can be flexibly transferred to other regions, improving the regionalization of the model and increasing our understanding of the natural environment. Furthermore, our CNN-based integrated model can serve as one candidate for parameter transfer functions and be integrated into the MPR framework to consider more scenarios in the future.
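The buffer idea mentioned above, in which each model grid cell also receives the fine-scale static attributes surrounding it, could be prototyped as follows. This is a hedged sketch and not part of the present implementation: the function name `extract_patches`, the edge padding at the domain boundary, and the grid sizes are assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def extract_patches(fine, factor, buffer):
    """Return one (factor + 2 * buffer)-wide square fine-scale patch per coarse grid cell."""
    padded = np.pad(fine, buffer, mode="edge")  # edge padding is an assumption
    size = factor + 2 * buffer
    windows = sliding_window_view(padded, (size, size))
    return windows[::factor, ::factor]  # keep the windows aligned with coarse cells


fine = np.arange(400, dtype=float).reshape(20, 20)    # hypothetical 1 km attribute map
patches = extract_patches(fine, factor=10, buffer=2)  # one patch per 10 km cell
```

Each patch contains the fine-scale cells inside the coarse cell plus a ring of `buffer` neighboring cells, so a convolutional model applied to the patch can draw on information just outside the model grid.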
Overall, when significant nonlinearity exists in the cross-scale transfer (Binley et al., 1989) and the training data is adequate for avoiding overfitting (Van Looy et al., 2017), using our proposed CNN-based integrated model can yield great returns. After we successfully avoid the assumptions for mathematical forms of PTFs and up-/down-scaling techniques and selections for scale conversion frameworks, it is of great importance to determine the relevant static or even spatiotemporal predictors for the cross-scale transfer (Y. Zhang & Schaap, 2017). Also, highly accurate predictors are necessary and need to be generated (Chaney et al., 2016). The loss function is based on the differences between the model simulations and observations in training the CNN-based integrated model. Therefore, structure errors (Scanlon et al., 2018; Q. Zhang et al., 2019) in the applied hydrology model have significant negative effects on mining observation information and should be eliminated as much as possible.
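As an example of an alternative loss function of the kind mentioned in this discussion, the Kling-Gupta efficiency (Gupta et al., 2009) combines correlation, variability ratio, and bias ratio. The snippet below is a minimal sketch of the standard formulation; the function names and the sample values are illustrative, and this loss was not used in the present study.

```python
import numpy as np


def kge(simulated, observed):
    """Kling-Gupta efficiency: 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)."""
    simulated = np.asarray(simulated, dtype=float)
    observed = np.asarray(observed, dtype=float)
    r = np.corrcoef(simulated, observed)[0, 1]     # linear correlation
    alpha = np.std(simulated) / np.std(observed)   # variability ratio
    beta = np.mean(simulated) / np.mean(observed)  # bias ratio
    return float(1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2))


def kge_loss(simulated, observed):
    """Loss to minimize: zero for a perfect simulation."""
    return 1.0 - kge(simulated, observed)


observed = np.array([0.10, 0.20, 0.30, 0.25])  # illustrative soil moisture values
perfect_score = kge(observed, observed)
doubled_score = kge(2.0 * observed, observed)
```

A simulation that doubles every observed value keeps a perfect correlation but doubles both the variability and the mean, which is why its score drops well below one.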

Summary and Conclusions
This study first evaluates the uncertainties in PTF and scale conversion. A CNN-based model to integrate PTF, up-/down-scaling techniques, and scale conversion framework is developed to mitigate these uncertainties. Based on the soil moisture simulation and parameter estimates, two optimization strategies (i.e., the end-to-end and two-step strategies) are compared, and the effects of errors in the up-/down-scaling technique and the scale conversion framework in optimizing PTF are investigated. Finally, the proposed method is tested in a real-world case. The major conclusions are drawn as follows:
1. Different PTFs, up-/down-scaling techniques, and scale conversion frameworks can yield markedly different parameters. Specifically, MA, MI, and MO produce the most significant differences in the retrieved soil hydraulic parameters compared to other considered up-/down-scaling techniques. The order in which the process (scale conversion framework) is applied is also relevant. Conducting PTF first and scaling then (i.e., MPR-type scale conversion) tends to cause more significant parameter space variability than the opposite scale conversion framework (i.e., TR-type scale conversion).
2. Parameters estimated by the inverse method (the first step in the two-step strategy) provide reasonable soil moisture estimates at training grids, but generally encounter the issue of parameter identifiability (Beven, 2006; Yi & Park, 2021) and large parameter deviations. After regularizing by the CNN-based integrated model, the parameter estimation improves globally, while the soil moisture simulation deteriorates. The optimized CNN-based integrated model by the end-to-end strategy has stronger global constraints and gives the best overall parameter and soil moisture estimations.
3. Errors in up-/down-scaling technique and scale conversion framework degrade performance in both soil moisture and parameter estimations. Among them, errors in up-/down-scaling techniques tend to encounter more significant parameter identifiability issues than errors in scale conversion frameworks when the PTF is relatively simple. The CNN-based model effectively mitigates these uncertainties by integrating the up-/down-scaling technique and scale conversion framework, since selecting them in advance becomes unnecessary. Also, the performance of the CNN-based model is robust when applied to derive parameters at different scales.
4. The performance by the end-to-end strategy is more robust than the two-step strategy, even when several sources of uncertainties are included. More effort is still required to eliminate discrepancies between advanced published soil hydraulic parameter data sets and applicable soil hydraulic parameters of regional models.

Figure 1. Schematic representations of different processes for estimating the soil hydraulic parameters $\beta$ at the spatial scale $l_1$ (i.e., $\beta_{l_1}$) with the soil/landscape static properties $u$ at the spatial scale $l_0$ (i.e., $u_{l_0}$). The established pedo-transfer function model (PTF) is originally developed based on $\beta_{l_{PTF}}$ with $u_{l_{PTF}}$. The reasonable way may be to first scale $u_{l_0}$ to $u_{l_{PTF}}$, then map $u_{l_{PTF}}$ to $\beta_{l_{PTF}}$, and finally scale $\beta_{l_{PTF}}$ to $\beta_{l_1}$, which is characterized by the deep learning convolutional neural network (CNN) model in this study. We can adopt TR-type scale conversion (first scaling $u_{l_0}$ to $u_{l_{PTF}}$ and then mapping from $u_{l_{PTF}}$ to $\beta_{l_{PTF}}$) when spatial scale $l_{PTF} \approx l_1$, and adopt MPR-type scale conversion (first mapping $u_{l_{PTF}}$ to $\beta_{l_{PTF}}$ and then scaling $\beta_{l_{PTF}}$ to $\beta_{l_1}$) when spatial scale $l_0 \approx l_{PTF}$. The CNN model is used to represent the cross-scale transfer from $u_{l_0}$ to $\beta_{l_1}$. The red arrow denotes the scaling process by up-/down-scaling technique. The blue arrow represents the scale-consistent transfer by PTF or soil-water model. The dashed arrow represents the possible scale-consistent transfer by PTF when $l_{PTF} \approx l_1$ or $l_0 \approx l_{PTF}$. Different circle colors denote different spatial scales.

Figure 2. The (a) two-step and (b) end-to-end strategies for the cross-scale transfer estimation.In the two-step strategy, model parameters are first estimated by the inversion method using the regional soil moisture observations.Then, the soil/landscape static property data and the derived model parameters are used as inputs and outputs, respectively, to train the CNN-based integrated model.In the end-to-end strategy, the soil hydraulic parameters inferred by the CNN-based integrated model based on the soil/landscape static property data are directly sent to the soil-water model to obtain the soil moisture.The gradient descent method is employed to train the CNN-based integrated model by reducing the loss function between the regional soil moisture observations and simulations.

Figure 3. RMSE matrix of uncertainties from cross-scale transfer. Sand and clay contents within the 0-5 cm depth (1 km) around CONUS are used as the inputs to generate the parameter $K_{sat}$ (10 km). The RMSE value is calculated between different $K_{sat}$ estimates around CONUS by different combinations of PTFs, up-/down-scaling techniques, and scale conversion frameworks. The black solid lines divide MPR-type and TR-type scale conversions, forming four big frames. The gray solid lines segment the PTFs of CB1 and CB2.
PTF) and MA (up-/down-scaling technique) in MPR-type scale conversion (CB2-MA)
Inverse: Estimating parameters of the soil-water model by the inverse method in step one of the two-step strategy
CNN (two-step): Training the CNN-based integrated model based on the two-step strategy
CNN (end-to-end): Training the CNN-based integrated model based on the end-to-end strategy
TR: Mistaking MPR-type scale conversion as TR-type scale conversion and only training the PTF based on the end-to-end strategy
MPR (with error): Mistaking the MA up-/down-scaling technique as MI and only training the PTF based on the end-to-end strategy (in MPR-type scale conversion with errors in up-/down-scaling technique)
MPR (without error): Only training PTF based on the end-to-end strategy (in MPR-type scale conversion without errors in up-/down-scaling technique)

Figure 4. Statistical results, that is, (a) CORR, (b) RMSE, and (c) NRMSE, between reference and estimated soil moisture data in different cases of Table 2. "Inverse" indicates the performances at the training grids based on the soil-water model directly estimated by the inverse model. "CNN (two-step)" and "CNN (end-to-end)" indicate the performance using our proposed models trained by two different estimation strategies. "TR" indicates the performances based on TR-type scale conversion. "MPR (with error)" and "MPR (without error)" indicate the performances based on the MPR-type scale conversion with and without errors in up-/down-scaling techniques. Training, temporal, spatial, and spatio-temporal generalizations are respectively used to exhibit the results based on the data at the training grids during the training period, at the training grids during the test period, at the test grids during the training period, and at the test grids during the test period. Note that the y-axis in subplot (a) is not from zero.

Figure 5. Spatial distributions of reference and estimated parameters in the synthetic cases. Rows a-g denote the reference and estimated parameters from different cases in Table 2, respectively, also labeled at the left of each row. Columns 1-3 denote the parameters $K_{sat}$, A, and B, respectively, also labeled at the bottom of each column.

Figure 6. Comparisons between soil moisture reference data and simulation data based on the parameters at different scales from different cases during the test period and at test grids (spatio-temporal generalization). Note that the y-axis in subplot (a) is not from zero.

Figure 7. Statistical results between the SMAP-derived and model-estimated soil moisture data in the real-world case, that is, (a) CORR, (b) RMSE, and (c) NRMSE. Training, temporal, spatial, and spatio-temporal generalizations are respectively used to exhibit the results based on the data at the training grids during the training period, at the training grids during the test period, at the test grids during the training period, and at the test grids during the test period. The "Inverse," "CNN (two-step)," and "CNN (end-to-end)" denote the different estimation methods, respectively, that is, the inverse method used in step one in the two-step strategy, the two-step strategy, and the end-to-end strategy. Note that the y-axis in subplot (a) is not from zero.

Figure 9. Maps of saturated hydraulic conductivity (log10 K_sat) at 1 km resolution, estimated with the Kosugi K3 model using sand, silt, and clay percentages and bulk density from the SoilGrids product at 1 km resolution (Zhang et al., 2018).
is the root zone depletion relative to the field capacity at the end of a time period; P [L] is the effective precipitation; RO [L] is the runoff; IR_n [L] is the net irrigation depth; CR [L] is the capillary rise; ET_c [L] is the evapotranspiration; DP [L] is the water loss through deep percolation. IR_n and CR are

Table 1
in the Real-World Case
Note. The values in parentheses in the resolution column represent the original resolution of the data.

interval for soil moisture data assimilation ranges from one to more than 10 days (Yu et al., 2021). We therefore sample the observations every 3 days for inversion from 3 April 2015 to 31 December 2017, leading to 335 observations at each location. Temporally, the data from 2015 and 2016 (213 temporal points) are used for training, and the rest (data from 2017, 122 temporal points) are kept for testing. Spatially, we randomly select 80% of all spatial grids (i.e., 164,623) as training grids and the remaining 20% as test grids. Combining the temporal and spatial divisions yields four data sets: observations at the training grids during the training period, at the training grids during the test period, at the test grids during the training period, and at the test grids during the test period. The first data set is used to train the PTF and scale conversion, while the latter three are used to test the temporal, spatial, and spatio-temporal generalization of the model simulation, respectively. In addition, to eliminate the influence of the initial condition on soil-water modeling, the forcing data from 2014 are used to warm up the soil-water model described in Section 2.3 based on the IC-WUP method (Yu et al., 2019).
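The temporal and spatial division described above can be sketched as follows. This is an illustrative sketch only: the function names, the random seed, the toy grid count of 1,000, and the array layout (grids by time) are assumptions, not details taken from the study.

```python
from datetime import date, timedelta
import numpy as np

def split_dates(start, end, step_days=3, last_train_year=2016):
    """Sample dates every `step_days` and flag those in the training period."""
    n = (end - start).days // step_days + 1
    dates = [start + timedelta(days=step_days * k) for k in range(n)]
    is_train = np.array([d.year <= last_train_year for d in dates])
    return dates, is_train

def split_grids(n_grids, train_fraction=0.8, seed=0):
    """Randomly flag `train_fraction` of the grids as training grids."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_grids)
    is_train = np.zeros(n_grids, dtype=bool)
    is_train[order[: int(round(train_fraction * n_grids))]] = True
    return is_train

def four_subsets(obs, grid_train, time_train):
    """Split obs (n_grids x n_times) into the four generalization subsets."""
    return {
        "training": obs[np.ix_(grid_train, time_train)],
        "temporal": obs[np.ix_(grid_train, ~time_train)],
        "spatial": obs[np.ix_(~grid_train, time_train)],
        "spatio-temporal": obs[np.ix_(~grid_train, ~time_train)],
    }

# Sampling period from the text; the grid count is a small toy value.
dates, time_train = split_dates(date(2015, 4, 3), date(2017, 12, 31))
grid_train = split_grids(1000)
obs = np.zeros((1000, len(dates)))  # placeholder observations
subsets = four_subsets(obs, grid_train, time_train)
print({name: arr.shape for name, arr in subsets.items()})
```

With a 3-day step, the date split reproduces the counts given in the text: 335 samples in total, 213 in 2015 and 2016, and 122 in 2017.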

Table 2
Synthetic Cases in Sections 3.2.1, 4.2.1, and 4.2.2 Developed to Investigate the Performance of Different Estimation Strategies and the Effects of Errors in the Scale Conversion Framework and Up-/Down-Scaling Technique on the Estimation