Downscaling SMAP soil moisture product in cold and arid region: Incorporating NDSI and BSI into the random forest algorithm

Soil moisture (SM) is a critical element of the hydrological cycle, land surface processes, and surface energy balance. However, the low spatial resolution of commonly used SM products limits their application in agriculture and eco‐hydrology in cold and arid regions. In this study, the normalized difference soil index (NDSI) and bare soil index (BSI) were added to the traditional downscaling factors (land surface temperature, normalized difference vegetation index, digital elevation model, apparent thermal inertia, Albedo, and temperature vegetation dryness index), as they are more strongly correlated with surface SM in the bare soil‐vegetation alternation zone of such regions. Using the random forest algorithm, an SM downscaling model was constructed for such regions. The accuracy of the downscaled SM estimates was validated by comparing them with the original SM data from May to September 2021, which is the non‐freezing period of the soil. The findings indicate that the newly added NDSI and BSI correlate well with SM. Incorporating NDSI and BSI into the downscaling model enhances the accuracy by over 19% compared to excluding them, while also providing a more comprehensive representation of SM information. NDSI and BSI are well suited to SM downscaling research in the bare soil‐vegetation alternation zone, which is of great value for the study of eco‐hydrology and agricultural drought monitoring in cold and arid regions.


INTRODUCTION
Soil moisture (SM) is a vital aspect of the hydrological processes on land (Fang et al., 2022; Petropoulos et al., 2015) and surface energy balance (Cosh et al., 2021; Y. Li et al., 2018). It plays a crucial role in water and energy exchange across the soil-plant-atmosphere continuum (Fang et al., 2022; Petropoulos et al., 2015). Particularly in arid and semiarid regions, SM is considered to be a crucial limiting factor for plant growth, development, and regeneration, as well as a prerequisite for achieving various ecological functions and services (Reynolds et al., 2007). However, SM is not a typical hydrological or meteorological observation element due to its susceptibility to various factors such as vegetation, topography, climate, and soil properties, which can result in significant spatial heterogeneity (Z. Zhang et al., 2022; T. Zhao et al., 2020). This heterogeneity poses challenges for direct and effective observation. In recent times, advanced remote sensing technology has made high-precision SM information a research hotspot (Sabaghy et al., 2020).
Compared to traditional methods, deriving SM from remote sensing data offers several advantages, including wide coverage, long-term monitoring, and cost-effectiveness (Z.-L. Li et al., 2021; Lievens et al., 2017). Microwave remote sensing, in particular, is a highly sensitive method for monitoring SM that enables round-the-clock, all-weather monitoring, regardless of environmental conditions (Koley & Jeganathan, 2020). Therefore, it is considered one of the most significant means of obtaining SM information. However, most existing SM products, such as soil moisture active passive (SMAP), Advanced Microwave Scanning Radiometer-Earth Observing System (AMSR-E), and soil moisture and ocean salinity (SMOS), have low spatial resolution. The SMAP satellite mission, launched by NASA on January 31, 2015, is designed to provide soil surface moisture content at a depth of 5 cm with spatial resolutions of 3, 9, and 36 km. The mission aims to conduct measurements simultaneously using both L-band radar (active) and radiometer (passive) sensors.
To estimate surface SM, the single channel algorithm with V-polarization was utilized through observations of vertically polarized brightness temperature (Zeng et al., 2016).
In addition, SMOS and AMSR-E SM products have a spatial resolution of 25 km. Recent research suggests that SMAP SM products exhibit a high degree of accuracy and are well suited for use in watersheds situated in arid regions of Northwest China (F. Chen et al., 2018; L. Zhang et al., 2019).
Enhancing the spatial resolution of SM data to obtain highly detailed information holds significant practical value and application.

Core Ideas
• Adding soil factors improves the downscaling of soil moisture in cold and arid regions.
• Compared with the traditional model, the improved model showed a significant increase in accuracy.
• This improved model better characterized soil moisture spatial and temporal variability.

To achieve higher spatial resolution of data, researchers both domestically and internationally have extensively investigated SM downscaling methods. These methods commonly involve the fusion of satellite remote sensing data, geographic information-based techniques, and model-based and machine-learning approaches (Peng et al., 2017). Satellite-based remote sensing data fusion methods primarily comprise active-passive microwave data fusion and fusion of optical/thermal infrared data with microwave data. Visible/thermal infrared remote sensing is advantageous over microwave remote sensing for providing high-resolution land surface parameters (N. Chen et al., 2017). The fundamental principle of satellite-based remote sensing data fusion is to derive downscaling factors from high-resolution visible/thermal infrared data to enhance the representation of spatial variations in low-resolution microwave SM data. However, this method is vulnerable to the impact of atmospheric conditions and cloud cover (Djamai et al., 2016).
The method utilizing geographic information (GI) data aims to establish a correlation between GI and SM, resulting in high-resolution SM data. However, the majority of GI data utilized in this approach necessitates field measurements, limiting the method's applicability in areas with inadequate or inconsistent meteorological stations (Ranney et al., 2015).
Model-based methods require in situ data input and bias correction, as well as a complex process of merging SM spatial and temporal variability to analyze SM products at different scales in order to downscale the data. The process can be operationally complex (Mascaro et al., 2010). With advancements in computer science and artificial intelligence, an increasing number of researchers are utilizing machine learning methods to incorporate MODIS and topographic data into SM downscaling studies, leading to improved downscaling accuracy (Abbaszadeh et al., 2019; Xu et al., 2022). Machine learning techniques help capture the complex relationships between SM and downscaling factors, leading to the development of more reliable downscaling models and enabling the handling of large amounts of input and output data. Among various machine learning approaches, the random forest (RF) model is widely employed for downscaling microwave SM products. This is due to its ability to establish relationships between SM and surface parameters in the absence of continuous data, as well as its strong nonlinear learning capability and flexibility in integrating multiple data sources (Abowarda et al., 2021; Long et al., 2019). RF is composed of multiple decision trees and does not require feature selection. It exhibits excellent classification/regression accuracy for high-dimensional datasets that are imbalanced or have missing features. It is used in downscaling research due to its simplicity, speed, and high accuracy without the need for complex physical mechanisms (Hutengs & Vohland, 2016). Im et al. (2016) used three machine learning methods to establish the complex relationship between AMSR-E SM products and other surface variables from moderate resolution imaging spectroradiometer (MODIS) data, which opened up a research path for downscaling. The research results showed that the RF method had better downscaling performance.
It has been demonstrated that traditional downscaling models tend to select the normalized difference vegetation index (NDVI), temperature vegetation dryness index (TVDI), land surface temperature (LST), Albedo, digital elevation model (DEM), and apparent thermal inertia (ATI) as downscaling factors (Q. Chen et al., 2019; Fang et al., 2021; Mohanty et al., 2017). While these factors have been shown to effectively reflect SM in areas with dense vegetation cover, their ability to do so may be limited in cold and arid regions with bare soil-vegetation alternation due to the presence of large amounts of bare soil. Recent research suggests that land cover is also a crucial contributor to SM in arid and semi-arid regions (Deng et al., 2020; L. Xia et al., 2021). Thus, the identification of downscaling factors that accurately reflect land cover based on the surface characteristics of bare soil-vegetation alternation is a crucial step toward effective downscaling in cold and arid regions.
This study constructs an SM downscaling model for the bare soil-vegetation alternating zone in the cold and arid region by incorporating the normalized difference soil index (NDSI) and bare soil index (BSI) into the traditional downscaling factors (Albedo, LST, NDVI, TVDI, ATI, and DEM) using the RF model. The NDSI and BSI provide rich surface feature information, and they are also highly applicable in the analysis of soil physical and chemical indicators (K. Xia et al., 2021). Because they combine different bands, they can better extract soil-vegetation information in alternating areas of bare soil and vegetation (Bala et al., 2019). These indices are typically used in remote sensing monitoring of ecological environments in arid and semi-arid regions, allowing for the differentiation of vegetation cover and the severity of drought. The aim is to compare an improved SM downscaling model that integrates sparse vegetation and bare soil with the traditional downscaling model, in order to explore the rationality of the improved model for downscaling SMAP SM products. The goal is to obtain large-scale and high spatial resolution SM data, provide ideas for studying SM downscaling in the cold and arid region, and establish a theoretical and methodological foundation for agricultural drought monitoring and hydrological change analysis in this area.

STUDY AREA AND DATA

Study area
The Weigan River Basin (Figure 1) is located in the Aksu region of Xinjiang and encompasses five counties: Xinhe, Shaya, Kuqa, Baicheng, and Wensu. The terrain slopes gradually from north to south. With a latitude ranging from 40˚57′ to 42˚39′N and a longitude ranging from 80˚8′ to 84˚E, it is positioned south of the Tianshan Mountain Range and north of the Tarim Basin. The basin is primarily formed by the convergence of the Muzati River, Kabuslang River, Taile Waiqiuq River, Karasu River, and the Weigan and Kuqa Rivers, collectively known as the Heizi River. The main stream of the Weigan River spans 452 km, with a total drainage area of 6.79 × 10⁴ km². The Weigan River Basin exhibits significant regional variations in climate. The Tianshan Mountains and the Baicheng Basin experience a cold-temperate climate, while the Kuqa-Shaya-Sinhe Plain features a warm-temperate climate (T. Liu et al., 2023). The region is characterized by low precipitation, high evapotranspiration rates, large diurnal temperature differences, and long hours of sunshine (Z. Wang et al., 2020). These factors contribute to the relatively fragile ecological environment in the area, where bare-ground vegetation is the dominant land use type and cotton and maize are the primary crops. Representative plant species include rooibos and poplar. Additionally, there is considerable spatial variability in soil texture within the region.

Data used and image preprocessing
(1) MODIS data
The data were obtained from the NASA LAADS website (https://search.earthdata.nasa.gov/search). The image data (Table 1) comprised the NDVI band from the MOD13Q1 product, the MOD09A1 surface reflectance data, and the MOD11A2 LST product. Preprocessing tasks such as resampling, projection transformation, mosaicking, and cropping were conducted using the MCTK tool in ENVI 5.3 software and the Google Earth Engine platform.
(2) SMAP data
This study utilized the freely accessible SMAP L3 product, which can be downloaded from the NASA website (https://search.earthdata.nasa.gov/search). The product provides information on SM distribution from 0 to 5 cm of the surface. It includes data from two orbits: the descending (morning) orbit and the ascending (afternoon) orbit, which occur at 6:00 a.m. and 6:00 p.m. local time, respectively. Because the morning observations are believed to better represent the near-surface reality due to the heat balance and homogeneous atmosphere (Entekhabi et al., 2010), we utilized the descending orbit data for the downscaling study. The data are stored in HDF5 format and utilize the EASE-Grid 2 projection. Data reading and preprocessing were performed using the Python programming language, with the data format converted to the commonly used GeoTIFF format and the geographic coordinate system set to WGS-84.

Selection of downscaling factor
(1) Albedo
Since the energy of solar radiation is mainly concentrated in the range of 0.25-1.5 μm, the broadband Albedo can be approximated from the reflectance of the visible and near-infrared (NIR) bands. In this paper, the Albedo was calculated using Liang's broadband reflectance formula (Liang, 2000).
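For reference, Liang's narrowband-to-broadband conversion for MODIS shortwave albedo is commonly written as shown below. The choice of this MODIS variant, and the reading of the subscripts as MOD09A1 bands 1-5 and 7, are assumptions here rather than details stated in the text.

```latex
\mathrm{Albedo} = 0.160\,\rho_{1} + 0.291\,\rho_{2} + 0.243\,\rho_{3}
                + 0.116\,\rho_{4} + 0.112\,\rho_{5} + 0.081\,\rho_{7} - 0.0015
```

Here each ρ with a numeric subscript denotes the surface reflectance of the corresponding MODIS band.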
(2) Apparent thermal inertia
Due to the complexity of the parameters in the thermal inertia model, thermal inertia is often approximated by the ATI in practical applications. Price (1985) proposed the concept of ATI based on a systematic elaboration of the thermal inertia method for monitoring SM and the imaging principle of thermal inertia. This method is simple to calculate, relies on easily acquired data, and is therefore well suited to remote sensing applications. The main principle is the positive correlation between ATI and SM content. The calculation formula is as follows:
$$\mathrm{ATI} = \frac{1 - A}{\Delta T} \qquad (2)$$
In the equation, A represents the Albedo calculated by Equation (1) and ∆T represents the diurnal temperature difference obtained from MOD11A2 data.
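The ATI calculation can be sketched per pixel as follows. This is a minimal illustration, assuming that the diurnal temperature difference is taken as daytime LST minus nighttime LST; the function name and the handling of non-positive differences are hypothetical choices, not part of the study.

```python
def apparent_thermal_inertia(albedo, lst_day, lst_night):
    """Return ATI = (1 - A) / dT for a single pixel.

    albedo    : broadband albedo A (unitless, 0-1)
    lst_day   : daytime land surface temperature (K)
    lst_night : nighttime land surface temperature (K)
    Returns None when the diurnal difference is not positive
    (assumed here to indicate an invalid or cloud-affected pixel).
    """
    delta_t = lst_day - lst_night
    if delta_t <= 0:
        return None
    return (1.0 - albedo) / delta_t


# Example: albedo 0.2 and a 20 K diurnal range
example_ati = apparent_thermal_inertia(0.2, 310.0, 290.0)
```

A larger diurnal temperature range gives a smaller ATI, which matches the stated positive relationship between ATI and SM: wetter soil warms and cools more slowly.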
(3) Temperature vegetation dryness index
Sandholt et al. (2002) employed LST and NDVI to construct the LST-NDVI feature space, and subsequently proposed a simplified TVDI based on this space. TVDI is closely associated with SM conditions and can effectively reflect the level of water stress on vegetation. TVDI ranges from 0 to 1; the smaller the TVDI, the greater the SM, and the larger the TVDI, the smaller the SM.
(4) Normalized difference soil index
The NDSI is an index used to monitor SM content on the earth's surface. It is calculated as the normalized difference between the reflectance values of the short-wave infrared (SWIR) and NIR bands. NDSI can effectively reflect changes in SM and is widely used for this purpose (Rogers & Kearney, 2010).
$$\mathrm{NDSI} = \frac{\rho_{\mathrm{SWIR}} - \rho_{\mathrm{NIR}}}{\rho_{\mathrm{SWIR}} + \rho_{\mathrm{NIR}}} \qquad (6)$$
where $\rho_{\mathrm{NIR}}$ is the NIR reflectance and $\rho_{\mathrm{SWIR}}$ is the SWIR reflectance from MOD09A1 data.
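(5) Bare soil index
The BSI is an indicator widely used in soil and water conservation to measure the risk of soil erosion and sediment loss. It can effectively reflect the degree of soil exposure (Bhunia et al., 2017).
$$\mathrm{BSI} = \frac{(\rho_{\mathrm{SWIR}} + \rho_{\mathrm{RED}}) - (\rho_{\mathrm{NIR}} + \rho_{\mathrm{BLUE}})}{(\rho_{\mathrm{SWIR}} + \rho_{\mathrm{RED}}) + (\rho_{\mathrm{NIR}} + \rho_{\mathrm{BLUE}})} \qquad (7)$$
where $\rho_{\mathrm{RED}}$ is the red reflectance, $\rho_{\mathrm{NIR}}$ is the NIR reflectance, $\rho_{\mathrm{SWIR}}$ is the SWIR reflectance, and $\rho_{\mathrm{BLUE}}$ is the blue band reflectance from MOD09A1 data.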
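The two soil indices can be sketched in a few lines. The code below is an illustration under assumptions: it uses the commonly published forms of NDSI (normalized SWIR-NIR difference) and BSI (SWIR plus red against NIR plus blue), and it does not fix which MOD09A1 band supplies each reflectance, since that mapping is not given in the text. Function names are hypothetical.

```python
def ndsi(swir, nir):
    """Normalized difference soil index from SWIR and NIR reflectance."""
    denom = swir + nir
    if denom == 0:
        return None  # undefined for zero total reflectance
    return (swir - nir) / denom


def bsi(swir, red, nir, blue):
    """Bare soil index: (SWIR + RED) contrasted with (NIR + BLUE)."""
    soil_term = swir + red        # bands where bare soil tends to be bright
    vegetation_term = nir + blue  # bands contrasted against the soil term
    denom = soil_term + vegetation_term
    if denom == 0:
        return None
    return (soil_term - vegetation_term) / denom


# Example reflectances for a sparsely vegetated pixel (illustrative values)
pixel_ndsi = ndsi(swir=0.30, nir=0.20)
pixel_bsi = bsi(swir=0.30, red=0.20, nir=0.25, blue=0.05)
```

Both indices are bounded between -1 and 1 for non-negative reflectances, which makes them directly comparable with NDVI as RF inputs.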

RF model
To better understand the intricate correlation between SM and surface parameters, an increasing number of multivariate nonlinear statistical models are employed in the field of SM downscaling. RF, a prominent example of the bagging algorithm, is a nonlinear ensemble learning approach that comprises numerous randomized decision trees. It has found extensive application in various domains such as classification, regression, and machine learning (Bhuiyan et al., 2018). In this study, the regression model of the RF algorithm is utilized to develop the SM downscaling model (Figure 2).
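$$\mathrm{SM}_{\mathrm{trad}} = f_{\mathrm{RF}}(\mathrm{LST}, \mathrm{NDVI}, \mathrm{Albedo}, \mathrm{ATI}, \mathrm{TVDI}, \mathrm{DEM})$$
$$\mathrm{SM}_{\mathrm{imp}} = f_{\mathrm{RF}}(\mathrm{LST}, \mathrm{NDVI}, \mathrm{Albedo}, \mathrm{ATI}, \mathrm{TVDI}, \mathrm{DEM}, \mathrm{NDSI}, \mathrm{BSI})$$
where $\mathrm{SM}_{\mathrm{trad}}$ represents the SM inverted by the traditional downscaling model, $\mathrm{SM}_{\mathrm{imp}}$ represents the SM inverted by the improved downscaling model, and $f_{\mathrm{RF}}$ denotes the RF regression. In this study's regression analysis, the built-in functions of the R language are employed to implement the task. Initially, multiple decision trees are constructed during the training phase, with each tree being built using bootstrap samples. The K-fold cross-validation method, reported to yield optimal validation results when K is set to 10 (Khellouk et al., 2019), is utilized. To enhance the generalization capability of the RF model, the predictions from the independent regression trees are averaged arithmetically to obtain the final model predictions.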
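The training procedure described here (bootstrap samples, one tree per sample, arithmetic averaging, K-fold validation) can be illustrated with a deliberately simplified sketch. It is not the R implementation used in the study: each "tree" below is a one-split stump on a randomly chosen feature, and all function names are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement."""
    return [rows[rng.randrange(len(rows))] for _ in rows]


def fit_stump(rows, rng):
    """Fit a one-split regression tree on one randomly chosen feature.

    rows is a list of (features, target) pairs. The split threshold is
    the median of the chosen feature (a simplification of a real tree).
    """
    j = rng.randrange(len(rows[0][0]))
    values = sorted(x[j] for x, _ in rows)
    threshold = values[len(values) // 2]
    left = [y for x, y in rows if x[j] <= threshold]
    right = [y for x, y in rows if x[j] > threshold]
    overall = sum(y for _, y in rows) / len(rows)
    left_mean = sum(left) / len(left) if left else overall
    right_mean = sum(right) / len(right) if right else overall
    return j, threshold, left_mean, right_mean


def predict_stump(stump, x):
    j, threshold, left_mean, right_mean = stump
    return left_mean if x[j] <= threshold else right_mean


def fit_forest(rows, n_trees=50, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng), rng) for _ in range(n_trees)]


def predict_forest(forest, x):
    """Average the predictions of all trees arithmetically."""
    return sum(predict_stump(s, x) for s in forest) / len(forest)


def cross_validated_rmse(rows, k=10, n_trees=50):
    """Mean RMSE over k held-out folds."""
    folds = k_fold_indices(len(rows), k)
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [r for i, r in enumerate(rows) if i not in held]
        forest = fit_forest(train, n_trees)
        sq = [(predict_forest(forest, rows[i][0]) - rows[i][1]) ** 2 for i in held_out]
        scores.append((sum(sq) / len(sq)) ** 0.5)
    return sum(scores) / len(scores)
```

In practice an established RF implementation would replace the stumps, but the structure of the loop (resample, fit, average, validate on held-out folds) is the same.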

Evaluation indicators
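In this study, the downscaling results of the RF model were evaluated using two indicators: the coefficient of determination (R²) and the root mean square error (RMSE). These evaluation indexes were employed to quantitatively assess the prediction accuracy of the RF downscaling model for SM. Additionally, the correlation between the eight input factors of the model and SMAP (0-5 cm) SM was analyzed using the Pearson correlation coefficient (r).
$$R^{2} = 1 - \frac{\sum_{i=1}^{n} (\mathrm{SM}_{p,i} - \mathrm{SM}_{0,i})^{2}}{\sum_{i=1}^{n} (\mathrm{SM}_{0,i} - \overline{\mathrm{SM}_{0}})^{2}}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\mathrm{SM}_{p,i} - \mathrm{SM}_{0,i})^{2}}$$
$$r = \frac{\sum_{i=1}^{n} (X_{i} - \overline{X})(\mathrm{SM}_{0,i} - \overline{\mathrm{SM}_{0}})}{\sqrt{\sum_{i=1}^{n} (X_{i} - \overline{X})^{2} \sum_{i=1}^{n} (\mathrm{SM}_{0,i} - \overline{\mathrm{SM}_{0}})^{2}}}$$
where $\mathrm{SM}_{p}$ is the SM predicted by the RF model, $\mathrm{SM}_{0}$ is the original SMAP L3 SM data, $n$ is the sample size of SM, $X$ is the downscaling factor, and overbars denote sample means.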
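As a worked illustration of these indicators, the sketch below computes RMSE, R², and Pearson's r for paired lists of values. The R² form used (one minus the ratio of residual to total sum of squares) is an assumption, since some studies report the squared correlation instead; function names are hypothetical.

```python
import math


def rmse(pred, obs):
    """Root mean square error between predicted and observed SM."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def r_squared(pred, obs):
    """Coefficient of determination as 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for p, o in zip(pred, obs))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot  # undefined if obs is constant


def pearson_r(x, y):
    """Pearson correlation between a downscaling factor x and SM y."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Illustrative SM values in cm3 cm-3 (not data from the study)
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.31, 0.39]
scores = (rmse(predicted, observed), r_squared(predicted, observed))
```

A lower RMSE and an R² closer to 1 both indicate that the downscaled SM reproduces the SMAP values more closely.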

Correlation analysis of downscaling factor and SM
Pearson correlation coefficients and significance tests were calculated for eight downscaling factors and surface (0-5 cm) SM, as shown in Table 2. LST showed a strong overall correlation.In spring (May), ATI showed the highest correlation with SM, while in summer (July), LST showed the highest correlation with SM, and in autumn (September), BSI showed the highest correlation with SM.The newly added factors, BSI and NDSI, showed much higher correlation with SM in summer and autumn than NDVI, and in spring, they showed higher correlation with SM than TVDI.

Importance analysis of downscaling factors
The RF algorithm enables a quantitative assessment of the input variables' contribution to the final outcome. In this study, the importance of the variables was evaluated using %IncMSE (percentage increase in mean square error), considering the range and distribution of error rates in the samples. The significance of individual independent variables in the RF regression model was determined using the importance function. The mean importance scores of the downscaling factors during the soil non-freezing period from May to September 2021 are depicted in Figure 3.
As shown in Figure 3, all surface environmental factors contribute to SM inversion, with DEM, LST, NDVI, NDSI, TVDI, Albedo, and BSI ranked in descending order of importance.DEM makes the largest contribution, accounting for 43.93% of the SM estimation.
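The idea behind a permutation-based importance score such as %IncMSE can be sketched as follows: shuffle one input at a time and record how much the prediction error grows. This is a simplified illustration, assuming a generic `predict` callable and a held-out sample rather than the out-of-bag bookkeeping and scaling used by R's RF implementation; all names are hypothetical.

```python
import random


def mean_squared_error(predict, X, y):
    """MSE of a prediction function over rows X and targets y."""
    return sum((predict(x) - t) ** 2 for x, t in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average increase in MSE after shuffling each feature column.

    X is a list of feature tuples. A larger value means the model relies
    more heavily on that feature; zero means the feature is unused.
    """
    rng = random.Random(seed)
    base = mean_squared_error(predict, X, y)
    scores = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [x[j] for x in X]
            rng.shuffle(column)  # break the link between feature j and y
            permuted = []
            for x, v in zip(X, column):
                row = list(x)
                row[j] = v
                permuted.append(tuple(row))
            increases.append(mean_squared_error(predict, permuted, y) - base)
        scores.append(sum(increases) / n_repeats)
    return scores


# Toy model that depends only on the first feature
X_demo = [(float(i), float((i * 7) % 5)) for i in range(20)]
y_demo = [2.0 * x[0] for x in X_demo]
demo_scores = permutation_importance(lambda x: 2.0 * x[0], X_demo, y_demo)
```

In the toy case the first feature receives a positive score and the second scores zero, mirroring how a dominant factor such as DEM would stand out in the ranking.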

Comparative analysis of downscaling results
To validate the model's accuracy, this paper employs the original SMAP SM data, ensuring validation at the same spatial scale. This approach allows for testing the applicability of both the traditional and improved downscaling models in cold and arid regions. A comprehensive analysis of the validation results reveals that the improved model, which considers the environmental characteristics of the bare soil-vegetation alternating zone in cold and arid regions, exhibits a slight enhancement in overall accuracy compared to the traditional model. It demonstrates higher coefficients of determination and lower errors. Consequently, the improved SM downscaling model is better suited for studying SM downscaling in cold and arid regions with bare soil-vegetation alternating zones.

Spatial distribution analysis of SM
Figure 6 presents the spatial distribution of the original SMAP L3 SM in the study area compared with the downscaled SM.
The downscaling effect of SMAP SM products during the non-freezing period of May-September 2021 is studied and analyzed. The spatial resolution of SM data in the study region is enhanced from 9 to 1 km. The downscaled product effectively retains detailed information on SM in most areas, resulting in a more comprehensive and detailed representation while maintaining spatial consistency. This addresses the limitations of low resolution in the SMAP SM products and partially alleviates the issue of missing SM data. As shown in Figure 6, the SM levels are generally higher in the northern, southeastern, and southwestern regions of the study region, while lower in the central and eastern bare soil areas. The average SM exceeds 0.19 cm³ cm⁻³, and the downscaled products accurately reflect the changing patterns of SM.

DISCUSSION
(1) The analysis of downscaling factors and their correlation with SM: The strong overall correlation of LST is mainly due to the decisive role of surface temperature in water evaporation (Song et al., 2019). In general, areas with high surface temperatures (or bare soil areas) tend to have lower SM content. The strong correlation of DEM in summer may be due to rising temperatures during this season, which can lead to reduced precipitation and subsequently affect SM, thereby increasing the impact of DEM on surface SM (Hu et al., 2020). NDVI varies seasonally and fluctuates due to the melting of ice and snow at the end of spring, which increases SM and strengthens the correlation between SM and NDVI. The water uptake of vegetation roots also affects changes in SM (Maurya et al., 2021). Albedo can cause energy exchange between the surface and the atmosphere, thereby affecting the evapotranspiration of soil and vegetation through a coupling effect (Long et al., 2019), resulting in a higher correlation in summer. The correlation between TVDI and SM is high in summer and low in spring, while the opposite is true for ATI. The reason is that ATI is usually suitable for monitoring SM in flat, bare areas with sparse vegetation or in the pre-growing season of crops. When there is vegetation cover, monitoring accuracy will decrease (Yuan et al., 2020), while TVDI is more suitable for periods with better vegetation cover (J. Wang et al., 2016). An increase in summer temperature leads to an enhancement of soil brightness, which enhances the relationship between NDSI and surface temperature (Sayão et al., 2020), thereby improving the response characteristics of NDSI to SM. BSI shows a higher correlation with SM in autumn (September), as vegetation decreases and bare ground increases, enhancing the ability to identify bare soil (Yao et al., 2022) and thus increasing its correlation with SM.
(2) Analysis of the importance of downscaling factors: The dominant role of DEM in the SM downscaling model is consistent with the findings of Y. Liu et al. (2020). This is because DEM has a relatively stable response relationship with SM, which is less influenced by the sparsity or density of vegetation, resulting in less seasonal variability and a higher average importance score in the downscaling model. Additionally, the study area is located on the southern slope of the Tianshan Mountains, with a high altitude and year-round snow cover. In this cold and arid region, SM mainly comes from snowmelt and precipitation (Williams et al., 2009), leading to higher surface SM in high-altitude areas and further increasing the importance score of DEM in the downscaling model. Furthermore, other factors, except for DEM, rely on optical remote sensing data and are susceptible to cloud and weather interference, resulting in data loss and lower importance scores than DEM.
Although the importance of BSI is relatively low, its contribution to the model is 7.28%. This is because the study period is concentrated from May to September, during which vegetation in the vegetated areas grows well. The results indicate that even the weakly correlated NDVI can provide some information for the construction of SM downscaling models in cold and arid regions.
(3) Analysis of the advantages based on BSI and NDSI: The combination of different bands can reduce data dimensionality and eliminate redundant information, thereby maximizing the indication effect of spectral indices, attributable to the strong correlation between different bands or spectral indices (Brown et al., 2000). The BSI combines spectral information from blue, red, NIR, and SWIR bands, thereby improving its ability to identify bare and non-bare land (Yao et al., 2022), which is consistent with the findings of Gao et al. (2022) in their study of bare soil and vegetation alternation areas in Khongtai Lik, Xinjiang. The BSI also exhibits a high correlation with SM. The NDSI combines the NIR and SWIR bands. It effectively distinguishes between bare soil and vegetation and is suited for the study area, which is characterized by aridity, large temperature differences, long hours of sunlight, and sparse vegetation with a lot of bare soil (Bala et al., 2019). Both NDSI and BSI exhibit good and stable correlations with SM, as they both contain the SWIR band. Studies have shown that the SWIR band is more sensitive to changes in SM and contains abundant SM information. It can effectively detect changes in vegetation and SM and is less sensitive to background (Y. Liu et al., 2021), which improves the relationship between downscaling factors and SM.
(4) Analysis of soil moisture downscaling model based on RF: In this study, an SM downscaling model was constructed using the RF algorithm to describe the complex relationship between SM and surface parameters. Both the traditional and improved models achieved R² above 0.9, demonstrating the excellent generalization ability and strong robustness of the RF algorithm in downscaling, which is consistent with the results of W. Zhao et al. (2018). Traditional downscaling models often select NDVI, LST, DEM, Albedo, and drought factors such as ATI and TVDI as downscaling factors to construct SM downscaling models. Based on the regional characteristics of the bare soil-vegetation alternation zone in the cold and arid regions, this study added NDSI and BSI to the traditional downscaling model to improve the identification accuracy of vegetation and bare soil in the bare soil-vegetation alternation zone, reflecting the influence of bare soil on surface SM. The improved SM downscaling model showed an accuracy improvement of more than 19% compared to the traditional model and had strong stability, indicating that the downscaling model in this study is better suited to the regional characteristics of the bare soil-vegetation alternation zone in the cold and arid region.
Hence, the SM downscaling model developed in this study effectively addresses the limitations of conventional downscaling factors for SM in the bare soil-vegetation alternation area of cold and arid regions, considering the influence of vegetation cover and other factors.This model offers valuable insights for future investigations on SM downscaling in similar environmental conditions.

CONCLUSION
In this study, the suitability of the improved downscaling model for SMAP SM (0-5 cm) in the bare soil-vegetation alternating area of the cold and arid region was explored, utilizing the RF algorithm and employing Albedo, ATI, DEM, LST, NDVI, TVDI, NDSI, and BSI as auxiliary data. The following conclusions were drawn from the study:
(1) The correlation analysis between each downscaling factor and SM revealed that the new NDSI and BSI exhibit higher correlations with SM during summer and fall compared to the NDVI. Additionally, during spring, the new downscaling factors display higher correlations with SM than TVDI. These results suggest that the new downscaling factors exhibit improved correlations with SM in the bare soil-vegetation alternating zone.
(2) LST, NDVI, Albedo, ATI, TVDI, and DEM were selected as the traditional SM downscaling factors. In order to improve the precision of the downscaling model in the bare soil-vegetation alternating zone of the cold and arid region, we incorporated NDSI and BSI into the model. As a result, the accuracy of the improved downscaling model increased by 22.46%, 19.26%, 23.57%, 23.88%, and 20.74% for LST, NDVI, Albedo, ATI, and TVDI, respectively. Moreover, the overall accuracy of the model improved by 20%. These improvements were observed when comparing the results with the pre-improved SM downscaling model.
(3) The descending order of importance for the downscaling factors was determined as DEM, LST, NDVI, NDSI, TVDI, Albedo, and BSI. Among these factors, DEM had the highest contribution to the SM downscaling study, accounting for 43.93% of the variability. Although the importance of BSI during the study period is not high, it compensates to some extent for the impact of vegetation coverage such as NDVI on SM, thereby improving the accuracy of the downscaling model for SM.
(4) The downscaling results of both the traditional and improved downscaling models exhibited similar patterns to the spatial distribution of SM in the original SMAP dataset, thereby preserving and enhancing the detailed information of SM. The northern and southwestern regions of the study area exhibited higher SM levels, while the central and eastern areas of the basin, characterized by bare soil, displayed lower SM levels, with an average exceeding 0.19 cm³ cm⁻³. Enhancing the spatial resolution of the original SMAP SM data from 9 to 1 km partially addressed the issue of data scarcity, providing a valuable foundation for monitoring ecological and hydrological conditions, as well as drought conditions, in cold and arid agricultural areas. These findings offer effective support for ecological hydrology and agricultural drought monitoring in cold and arid regions.

AUTHOR CONTRIBUTIONS
Mingxing Gao: Conceptualization; data curation; formal analysis; project administration; validation; visualization; writing-original draft. Kui Zhu: Conceptualization; investigation; project administration; writing-review and editing. Yanjun Guo: Writing-review and editing. Xuhang Han: Writing-review and editing. Dongsheng Li: Writing-review and editing. Shujian Zhang: Writing-review and editing.

ACKNOWLEDGMENTS
This study was supported by the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2022D01C41, 2023B03009-1), Third Comprehensive Scientific Expedition to Xinjiang (2022xjkk0105), and National Natural Science Foundation of China (52279029). We are indebted to the editors and reviewers for their insightful and constructive comments and suggestions during manuscript review.

CONFLICT OF INTEREST STATEMENT
The authors declare no conflicts of interest.
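The TVDI is calculated as follows:
$$\mathrm{TVDI} = \frac{T_{s} - T_{s\,\min}}{T_{s\,\max} - T_{s\,\min}} \qquad (3)$$
$$T_{s\,\min} = a + b \times \mathrm{NDVI} \qquad (4)$$
$$T_{s\,\max} = c + d \times \mathrm{NDVI} \qquad (5)$$
In Equations (3)-(5), NDVI represents the vegetation index corresponding to the MOD13Q1 image. $T_{s}$ is the LST of any pixel in the MOD11A2 image, $T_{s\,\min}$ is the minimum LST corresponding to the same NDVI value in the MOD11A2 image, and $T_{s\,\max}$ is the maximum LST corresponding to the same NDVI value in the MOD11A2 image. Here, a, b, c, and d are the coefficients of the TVDI dry-wet edge equation.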
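The TVDI definition can be sketched per pixel once the edge coefficients are known. This is a minimal illustration, assuming that a and b describe the wet edge (minimum LST) and c and d the dry edge (maximum LST); the function name and the example coefficients are hypothetical, and fitting the edges from the LST-NDVI scatter is not shown.

```python
def tvdi(lst, ndvi, a, b, c, d):
    """Temperature vegetation dryness index for one pixel.

    lst  : land surface temperature of the pixel (K)
    ndvi : NDVI of the pixel
    a, b : intercept and slope of the wet edge, Ts_min = a + b * NDVI (assumed)
    c, d : intercept and slope of the dry edge, Ts_max = c + d * NDVI (assumed)
    Returns None when the two edges coincide at this NDVI.
    """
    ts_min = a + b * ndvi
    ts_max = c + d * ndvi
    if ts_max == ts_min:
        return None
    return (lst - ts_min) / (ts_max - ts_min)


# Illustrative coefficients, not values from the study
pixel_tvdi = tvdi(lst=300.0, ndvi=0.5, a=280.0, b=10.0, c=320.0, d=-20.0)
```

A pixel on the wet edge gives TVDI = 0 (high SM) and a pixel on the dry edge gives TVDI = 1 (low SM), consistent with the inverse relationship between TVDI and SM.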

FIGURE 2 Technical roadmap of soil moisture downscaling model based on random forest algorithm. ATI, apparent thermal inertia; BSI, bare soil index; DEM, digital elevation model; LST, land surface temperature; NDSI, normalized difference soil index; NDVI, normalized difference vegetation index; SM, soil moisture; SMAP, soil moisture active passive; TVDI, temperature vegetation dryness index.

FIGURE 4 Comparison chart of error (coefficient of determination [R²], root mean square error [RMSE]) before and after model improvement from May to September 2021.

The validation results, presented in Figures 4 and 5, demonstrate the robustness of the RF algorithm SM downscaling model, albeit with some errors introduced by various factors. The scatter plot reveals that the improved downscaling model exhibits an enhanced accuracy of 22.46%, 19.26%, 23.57%, 23.88%, and 20.74% compared to the traditional downscaling model, respectively.

FIGURE 5 Scatter plots of downscaled soil moisture (SM) versus soil moisture active passive (SMAP) SM from May to September 2021. RMSE stands for root mean square error, and R² stands for coefficient of determination. A smaller RMSE value indicates a smaller model error and higher accuracy. A higher R² value indicates a better fit for the model. The red line is the fitted line and the black line is the 1:1 line.

FIGURE 6 Comparison of the spatial distribution of soil moisture active passive (SMAP) original soil moisture (SM) products and the downscaled SM data from May to September 2021.