A Multi‐Scale Soil Moisture Monitoring Strategy for California: Design and Validation

A multi‐scale soil moisture monitoring strategy for California was designed to inform water resource management. The proposed workflow classifies soil moisture response units (SMRUs) using publicly available datasets that represent the soil, vegetation, climate, and hydrology variables that control soil water storage. The SMRUs were classified using principal component analysis and unsupervised K‐means clustering within a geographic information system, and validated using summary statistics derived from measured soil moisture time series. Validation stations, located in the Sierra Nevada, include a transect of sites that cross the rain‐to‐snow transition and a cluster of sites located at similar elevations in a snow‐dominated watershed. The SMRUs capture unique responses to varying climate conditions characterized by statistical measures of central tendency, dispersion, and extremes. A topographic position index and landform classification is the final step in the workflow and guides the optimal placement of soil moisture sensors at the local scale. The proposed workflow is highly flexible: it can be implemented over a range of spatial scales, and the input datasets can be customized. Our approach captures a range of soil moisture responses to climate across California and can be used to design and optimize soil moisture monitoring strategies to support runoff forecasts for water supply management or to assess landscape conditions for forest and rangeland management.


INTRODUCTION
Expected increases in climatic variability (IPCC 2014) and more frequent episodes of water scarcity make strategic management of water resources increasingly important. As climate change results in novel hydrologic conditions, real-time soil moisture monitoring is now considered an integral component of operational water resource management at local to regional scales (Schaefer et al. 2007; Illston et al. 2008; Dobriyal et al. 2012; Trenberth and Asrar 2014; Harpold et al. 2017).
Soil moisture monitoring and early recognition of anomalous conditions can improve runoff forecasting and support time-sensitive implementation of effective water-management solutions. For example, when spring snowmelt occurs on dry soils, a portion of snowmelt must fill the existing soil water deficit and runoff generation is both delayed and diminished. Deficits in soil moisture can result in the earlier onset of low flow conditions, while saturated soils more readily produce runoff that can result in earlier and larger streamflow peaks.
Incorporating soil moisture conditions into runoff forecast models will enable more accurate predictions of the timing and volume of runoff (Crow and Ryu 2009; Tayfur et al. 2014; Harpold et al. 2017). This information will greatly improve water resource management and reservoir operations; however, the ability to manage for anomalous soil moisture conditions requires new water-management strategies supported by real-time soil moisture monitoring.
Despite the importance of soil moisture in determining water balance conditions and associated hydrologic responses (Flint et al. 2008), there are relatively few soil moisture monitoring stations in comparison to streamflow stations (Quiring et al. 2015). There are even fewer soil moisture stations that measure deep subsurface soil moisture (>0.5 m) because many of the existing monitoring networks were developed for the calibration of remote sensing products (Larson et al. 2008, 2010; Entekhabi et al. 2010). These shallow networks capture only the top few centimeters (~5 cm; Ochsner et al. 2013) and do not characterize the deeper root zone soil moisture that is utilized by transpiring vegetation. These shallow monitoring networks cannot effectively characterize surplus or deficit soil moisture conditions that exert first order controls on hydrologic responses, and the spatial resolution of satellite observations of soil moisture is too coarse for many hydrologic applications (Peng et al. 2017). This highlights the need for real-time soil moisture monitoring to characterize soil water storage.
Hydrologic classifications, which group rivers into hydrologic response units with similar streamflow regimes, have been implemented successfully over a range of spatial scales (Wolock et al. 2004; Olden et al. 2012). Unlike streamflow, which represents a cumulative watershed response, soil moisture data are fine-scale point measurements with high spatial and temporal variability (Vereecken et al. 2014), which creates a unique challenge for classification.
Soil moisture patterns have been described using cluster analysis at the watershed (Lin et al. 2006; Schroter et al. 2015) and hillslope-scale (Lee and Kim 2017; Liao et al. 2017). Hydrologic-based soil classifications have been developed using conceptual models of soil response (Boorman et al. 1995). Building on these studies, we developed an analytical workflow to classify soil moisture response units (SMRUs) at the regional to local-scale across California. The proposed workflow can be used to optimize the number and location of soil moisture stations, to design monitoring strategies that capture the range of soil moisture responses to climate, and to inform water resource management.
This study represents the first attempt to define SMRUs at a fine-spatial scale (~270 m) throughout California (Figure 1). A SMRU is defined as a landscape region with similar soil, climatic, hydrologic, and vegetation characteristics that produce a similar soil moisture response to climate. The size of a SMRU can range from relatively small areas that are typically located in mountainous regions, to very large areas, which tend to be located in gently sloping valleys. We developed a deductive approach that uses unsupervised cluster analysis to classify SMRUs across California. Deductive methods, which are often referred to as "top down" methods, approach classification from the general to the specific. This type of deductive classification is commonly used when an unsupervised approach, which does not constrain the classification process, is required due to the absence or scarcity of measured data (Sayre et al. 2009; Olden et al. 2012; Auerbach et al. 2016). For these reasons, unsupervised cluster analysis is considered an exploratory first principle tool that uses pattern recognition to define statistical groupings, which maximize within group similarities and between group differences.
Because of the spatial variability of soil properties, a primary challenge in the implementation of soil moisture monitoring is optimizing the number and location of sensors to represent the range of potential conditions (Robinson et al. 2008). We used preexisting environmental datasets discretized at a 270-m spatial resolution (Wieczorek 2014, 2015; http://phenology.cr.usgs.gov; Table 1) and then validated watershed-scale SMRUs using measured soil moisture time series. Our workflow generates a series of map products that identify where monitoring will best support water resource management. We further illustrate the workflow with an example application in the Hetch-Hetchy water supply basin and discuss other predictive applications.

Study Area
California's Mediterranean climate is naturally variable, and the state's large latitude and elevation ranges produce strong climate gradients. For example, California's north coast region receives over 2,500 mm of average annual precipitation compared to 130 mm in the southeastern deserts. About 75% of the state's annual precipitation falls within a narrow window of time between November and March. Annual precipitation anomalies are routinely 50%-200% of long-term averages, which result in greater inter-annual variability than most other locations in the conterminous United States (U.S.) (Dettinger et al. 2011). Rain-dominated landscapes occur at lower elevations and the rain-to-snow transition occurs at about 2,000-2,500 m elevation (Curtis et al. 2014). The state's largest reservoirs are located in northern California where water is captured and stored during the runoff season. Stored water is then released and transferred to population and agricultural centers in southern California via a complex system of dams, canals, aqueducts, levees, bypasses, weirs, and pumping plants.
We selected study regions at three nested spatial scales to optimize our workflow for classifying SMRUs (Figure 1). The regional scale includes the entire state of California (CA) (419,550 km²), which can be subdivided into 10 unique Jepson ecological regions (Hickman 1993) on the basis of similar air temperature and precipitation patterns. At the intermediate scale, we selected the Sierra Nevada ecoregion (SN) (62,560 km²), where more than 60% of California's developed water supply originates (http://www.sierranevada.ca.gov/our-region/ca-primarywatershed). At the watershed-scale, we selected the combined Merced-Tuolumne watersheds (MT) (7,010 km²), which include the rain-to-snow transition boundary.
We further illustrate our workflow with example applications in the Hetch-Hetchy (1,175 km²) and Upper Carson River (CR) (2,480 km²) watersheds. We selected the Hetch-Hetchy watershed because it is an important water supply basin. Hetch-Hetchy reservoir, one of California's larger water supply reservoirs, supplies water via an aqueduct to 2.4 million San Francisco Bay Area residents. Because much of California's surface water supply is derived from Sierra Nevada snowmelt, we selected the Upper Carson River watershed to further validate our SMRU classification in a snow-dominated basin.

Multivariate Analysis Geographic Information System-Workflow
A multivariate workflow (Figure 2) was developed to classify SMRUs and to support design and optimization of the number and location of soil moisture monitoring stations. Publicly available environmental raster datasets (Wieczorek 2014, 2015; U.S. Geological Survey (USGS) Remote Sensing Phenology: http://phenology.cr.usgs.gov) were reviewed, and 13 datasets that govern soil moisture response to climate were selected as explanatory variables (Table 1). The explanatory variables, referred to here as soil moisture response variables, represent a range of soil, climate, hydrology, and vegetation properties. Variables that describe soil properties included: soil depth, percent sand, percent clay, porosity, and available water capacity. All the soil variables, except soil depth, were weighted averages computed for the entire soil profile. Available water capacity, defined as the amount of water a soil can store, was calculated as water content at field capacity (i.e., the amount of [...] "greenness" or photosynthetic activity across the landscape.
Prior to classification, the 13 soil moisture response raster datasets (Table 1) were masked and standardized. A mask was developed using a land-use map from the California Department of Forestry's Fire and Resource Assessment Program (Department of Forestry and Fire Protection, FRAP 2015). The land-use mask eliminated locations where soil moisture monitoring was either infeasible or not recommended. Masked land-use areas were classified as: water, urban, barren, irrigated crops, agricultural land, and marsh land. An additional mask was derived from existing recharge and runoff maps to eliminate areas that did not produce enough recharge and/or runoff to be considered useful in a water supply context, which applies primarily to desert regions. Recharge plus runoff maps were created and a minimum threshold was set at 10 mm/year. All regions below this threshold were eliminated from our analysis.
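The two masking rules described above can be sketched in a few lines. This is a minimal illustration, not the ArcGIS procedure used in the study; the function name `build_exclusion_mask`, the array names, and the small example rasters are hypothetical, and the land-use labels are taken from the list of masked classes in the text.

```python
import numpy as np

# Land-use classes that the text lists as masked
EXCLUDED_LAND_USE = ("water", "urban", "barren", "irrigated crops",
                     "agricultural land", "marsh land")

def build_exclusion_mask(land_use, recharge_mm, runoff_mm, min_yield_mm=10.0):
    """Return True where a cell is excluded from the analysis."""
    excluded_use = np.isin(land_use, EXCLUDED_LAND_USE)
    # Cells whose recharge plus runoff falls below the threshold are excluded
    low_yield = (recharge_mm + runoff_mm) < min_yield_mm
    return excluded_use | low_yield

# Hypothetical 2 x 2 rasters (mm/year)
land_use = np.array([["forest", "urban"], ["shrub", "forest"]])
recharge = np.array([[120.0, 80.0], [2.0, 6.0]])
runoff = np.array([[300.0, 50.0], [3.0, 4.0]])
mask = build_exclusion_mask(land_use, recharge, runoff)
```

In this example the urban cell and the shrub cell with 5 mm/year of recharge plus runoff are masked, while the forest cell that sits exactly at 10 mm/year is retained because only values below the threshold are removed.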
The 13 masked rasters were then standardized to convert all variables to a similar scale. This method reduces the artificial dominance of any single variable but allows extreme conditions to be preserved. A z-score method (Equation 1) was used to standardize the masked soil response rasters:

$$z = \frac{n - \bar{x}}{s} \qquad (1)$$

where z is the standardized value, n is the original raster pixel value, x̄ is the mean of the raster layer, and s is the standard deviation of the raster layer.
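As an illustration of the z-score standardization, the sketch below standardizes one raster layer while ignoring masked cells. It assumes the population standard deviation (NumPy's default) and represents masked cells as NaN; both choices, along with the function name `standardize_raster` and the example values, are assumptions made for this example.

```python
import numpy as np

def standardize_raster(raster, mask):
    """Return z-scores for unmasked cells; masked cells become NaN."""
    values = np.where(mask, np.nan, raster.astype(float))
    layer_mean = np.nanmean(values)   # mean of the unmasked raster layer
    layer_std = np.nanstd(values)     # standard deviation of the unmasked layer
    return (values - layer_mean) / layer_std

# Hypothetical 2 x 3 soil-depth raster (cm); True marks a masked cell
soil_depth = np.array([[50.0, 100.0, 150.0],
                       [200.0, 999.0, 250.0]])
masked = np.array([[False, False, False],
                   [False, True, False]])
z = standardize_raster(soil_depth, masked)
```

After standardization every layer has a mean of zero and unit standard deviation over the unmasked cells, so no single variable dominates the later distance calculations simply because of its units.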
Principal Component Analysis. Each of the 13 soil moisture response variables was correlated with at least one other variable at all three spatial scales (Tables 2-4). Because data redundancy complicates pattern recognition, a key decision is how many variables should be used for classification. Although methods for eliminating variables are generally heuristic, the goal is to minimize the number of variables that can be used to explain the pattern of variance across the study region.
We used principal component analysis (PCA), a common statistical method (Pearson 1901;Abdi and Williams 2010;Shlens 2014), to define an optimal number of uncorrelated variables for cluster analysis and SMRU classification. An orthogonal transformation was used to convert the soil moisture response variables into a set of linearly uncorrelated principal components (PCs). Orthogonal transformation moves the largest possible variance into the first PC and each subsequent PC has the highest variance possible under the constraint that it is orthogonal to the prior component.
A separate PCA was completed for each study region using the 13 standardized soil moisture response rasters (Table 1) and the PRINCIPAL COMPONENTS function in ArcGIS (http://pro.arcgis.com/en/pro-app/tool-reference/spatial-analyst/principal-components.htm). Note that ArcGIS functions are denoted here by capitalizing the function name. Again, all raster datasets were discretized at a 270-m spatial resolution.
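The orthogonal transformation can be illustrated outside ArcGIS with an eigendecomposition of the covariance matrix. The sketch below is not the PRINCIPAL COMPONENTS tool itself; it assumes the standardized rasters have been flattened into a table with one row per cell and one column per variable, and it uses synthetic data with two deliberately correlated variables.

```python
import numpy as np

def principal_components(X):
    """PCA of X (n_cells x n_variables) via the covariance matrix."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # returned in ascending order
    order = np.argsort(eigvals)[::-1]           # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                       # uncorrelated PC scores
    explained = eigvals / eigvals.sum()         # share of variance per PC
    return scores, eigvals, explained

# Synthetic example: two correlated variables and one independent variable
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 1))
X = np.hstack([base + 0.1 * rng.normal(size=(500, 1)),
               base + 0.1 * rng.normal(size=(500, 1)),
               rng.normal(size=(500, 1))])
scores, eigvals, explained = principal_components(X)
```

Because the first two variables are nearly redundant, most of their shared variance is moved into the first PC, which is the behavior the text describes for correlated soil moisture response variables.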
There are numerous methods for selecting the number of PCs to retain for further analysis (Henson and Roberts 2006). A common approach is to simply select the first two PCs for further analysis. For example, Thorne et al. (2017) used PCA in a study that determined the sensitivity of vegetation groups to future climate distributions across California; they selected the first two PCs, which [...]
We selected PCs to retain for the cluster analysis and SMRU classification using a break-in-slope scree-test criterion (Cattell 1966), which is a common empirical method that defines a suite of PCs with the largest explanatory power. Larger eigenvalues represent greater explanatory power and the eigenvalues decrease sequentially for each additional PC, while asymptotically approaching zero. When the eigenvalues are plotted, a user-defined break-in-slope scree-test criterion defines the optimal number of PCs for further analysis. The scree-test criterion method allows the user to objectively identify the optimal number of PCs with adequate explanatory power for further analysis.
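Because the break in slope is user-defined, there is no single formula for it. The sketch below shows one possible way to make the choice repeatable: retain every PC up to the last steep drop between successive eigenvalues. The `min_drop` threshold, the function name, and the eigenvalues are hypothetical and are not values from this study.

```python
def retained_by_scree(eigenvalues, min_drop):
    """Number of leading PCs kept before the scree curve flattens."""
    drops = [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]
    keep = 1
    for i, drop in enumerate(drops):
        if drop >= min_drop:
            keep = i + 1    # PCs 1..i+1 sit before this steep drop
    return keep

# Hypothetical eigenvalues, sorted from largest to smallest
eigenvalues = [4.5, 3.0, 2.0, 1.5, 0.4, 0.3, 0.2, 0.1]
n_keep = retained_by_scree(eigenvalues, min_drop=0.3)
explained = sum(eigenvalues[:n_keep]) / sum(eigenvalues)
```

Here the last steep drop occurs between the fourth and fifth eigenvalues, so four PCs are retained and they account for about 92% of the total variance.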
Cluster Analysis. The PCs for each study region with the largest explanatory power were used to classify SMRUs using an unsupervised K-means cluster analysis (Jain 2010). Separate cluster analyses were performed for each study region using the ISOCLUSTER function in ArcGIS (http://pro.arcgis.com/en/pro-app/tool-reference/spatial-analyst/iso-cluster.htm). ISOCLUSTER uses K-means clustering, one of the most widely used classification algorithms (Pang-Ning et al. 2006), and an optimization clustering algorithm, known as the migrating means technique, which iteratively minimizes the sum of the squared error. Error is defined as the distance between the raster cell's multi-dimensional value and a cluster centroid. The cluster centroid is defined as the group mean using multi-dimensional Euclidean distance. ISOCLUSTER runs iteratively until <2% of the cells change from one cluster to another within an iteration.
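The migrating means idea and the 2% stopping rule can be sketched with a plain K-means loop. This is an illustrative stand-in under simplifying assumptions, not the ISOCLUSTER implementation; the random initialization, the function name, and the synthetic two-group data are assumptions made for the example.

```python
import numpy as np

def kmeans_migrating_means(X, k, max_iter=100, change_tol=0.02, seed=0):
    """K-means that stops when fewer than change_tol of cells switch cluster."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.full(len(X), -1)
    for _ in range(max_iter):
        # Euclidean distance from every cell to every centroid
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dist.argmin(axis=1)
        changed = np.mean(new_labels != labels)
        labels = new_labels
        # Move each centroid to the mean of its current members
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
        if changed < change_tol:
            break
    return labels, centroids

# Synthetic PC scores: two well-separated groups of 50 cells each
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, size=(50, 2)),
               rng.normal(5.0, 0.3, size=(50, 2))])
labels, centroids = kmeans_migrating_means(X, k=2)
sse = float(np.sum((X - centroids[labels]) ** 2))  # sum of squared error
```

The sum of squared error computed on the last line is the quantity that the iterations reduce; comparing it across different numbers of classes is one way to inform the class-reduction step described next.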
The ISOCLUSTER function requires the user to define an initial number of classes. Because the optimal number of classes is usually unknown, we started with a conservatively high number (N = 30), analyzed the clusters, reduced the number of classes, and repeated the analyses until we achieved an optimal number of classes. Necessarily, the optimization of the number of classes will be constrained by project resources, objectives, or other limitations. We used the DENDROGRAM and MAXIMUM LIKELIHOOD CLASSIFICATION functions to interpret the cluster results and reduced the number of classes stepwise until we determined an optimal number of classes that captured variability across all the study regions.
In our worked example, a primary objective was to compare the classifications across the three nested spatial scales. For this reason, after defining an optimal number of classes, we specified the same number of classes (N = 11) for all the study regions in the final ISOCLUSTER analyses to support comparison across spatial scales. However, the workflow allows a user to define the number of clusters on the basis of resources for implementation, ability to capture site heterogeneity, or other monitoring constraints or objectives.
The DENDROGRAM function (http://pro.arcgis.com/en/pro-app/tool-reference/spatial-analyst/dendrogram.htm) constructs a table and tree diagram that show the classification hierarchy, which were used to explore the potential for merging classes. The MAXIMUM LIKELIHOOD CLASSIFICATION function (http://pro.arcgis.com/en/pro-app/tool-reference/spatial-analyst/maximum-likelihood-classification.htm) calculates a percent certainty or probability for each raster cell, which represents the highest probability or "maximum likelihood" that the raster cell belongs to the assigned class. The probability values range from 0 to 1.0. Lower values indicate a higher likelihood of misclassification, whereas higher values indicate lower likelihood of misclassification. During the optimization, the confidence rasters were used to explore the potential for reducing the probability of misclassification.
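To make the per-cell probability concrete, the sketch below computes class membership probabilities under the common textbook assumption that each class is multivariate Gaussian with equal prior probability. It is not the ArcGIS MAXIMUM LIKELIHOOD CLASSIFICATION tool, whose internal details may differ; the function name and the synthetic three-class data are hypothetical.

```python
import numpy as np

def class_probabilities(X, labels):
    """Posterior class probabilities per cell for Gaussian classes, equal priors."""
    classes = np.unique(labels)
    log_like = np.empty((len(X), len(classes)))
    for j, c in enumerate(classes):
        members = X[labels == c]
        mean = members.mean(axis=0)
        cov = np.cov(members, rowvar=False)
        diff = X - mean
        # Squared Mahalanobis distance of every cell to the class mean
        maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
        _, logdet = np.linalg.slogdet(cov)
        log_like[:, j] = -0.5 * (maha + logdet)
    log_like -= log_like.max(axis=1, keepdims=True)   # numerical stability
    post = np.exp(log_like)
    post /= post.sum(axis=1, keepdims=True)
    return classes, post

# Synthetic PC scores for three overlapping classes of 100 cells each
rng = np.random.default_rng(2)
centers = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
X = np.vstack([rng.normal(c, 1.0, size=(100, 2)) for c in centers])
labels = np.repeat([0, 1, 2], 100)
classes, post = class_probabilities(X, labels)
confidence = post.max(axis=1)                  # probability of the best class
share_confident = np.mean(confidence >= 0.5)   # fraction of cells at >= 50%
```

The final line mirrors the kind of summary used to compare study regions: the share of cells whose highest class probability is at least 50%.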
Landform Classification. After the SMRUs are classified, landform classification is the final step in the workflow to determine the location for installation of monitoring equipment on the basis of local topographic conditions. Fine-scale topographic analysis was used to develop a landform map and explore potential locations for soil moisture monitoring in the Hetch-Hetchy watershed, an important water supply basin. This final step in the workflow is necessary to determine optimal sensor locations and to ensure that a wide range of landscape positions are represented when designing a monitoring network.
A landform map was derived in ArcGIS using a multi-scale topographic position index (TPI) and a slope raster (Weiss 2001; Jenness 2006; Deumlich et al. 2010) derived from a 10-m digital elevation model (DEM) (https://nationalmap.gov/elevation.html). The multi-scale TPI compares the elevation of individual focal cells to the mean elevation of the surrounding cells, where the surrounding cells are referred to as the "neighborhood" and the size and shape of the neighborhood are user-defined. We calculated a fine-scale TPI (circular neighborhood with a radius of 750 m) and large-scale TPI (circular neighborhood with a radius of 2,000 m) using the ArcGIS FOCAL MEAN function (https://pro.arcgis.com/en/pro-app/tool-reference/spatial-analyst/focal-statistics.htm) and a raster calculation:

$$\mathrm{TPI} = \mathrm{DEM} - \mathrm{focal\ mean}(\mathrm{DEM})$$

where positive TPI values are assigned to cells with elevations higher than surrounding areas and negative TPI values are assigned to cells with elevations lower than surrounding areas. A slope raster was then calculated using the ArcGIS SLOPE function (http://pro.arcgis.com/en/pro-app/tool-reference/spatial-analyst/slope.htm). The final landform map was created using a nested conditional statement that applies classification thresholds to the 750-m TPI, 2,000-m TPI, and slope rasters (Weiss 2001; Jenness 2006). We used thresholds recommended by Deumlich et al. (2010) to define 10 landform classes (Table 5). The multi-scale TPI enables delineation of small depressions, large valleys, small hummocks, and large ridges. The resulting landform map provides a useful base for determining sensor location and facilitates consideration of landscape position in the design of soil moisture monitoring networks.
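The TPI calculation can be sketched as the cell elevation minus the mean elevation of a circular neighborhood. The example below is a simplified stand-in for the ArcGIS focal statistics step: it assumes the neighborhood includes the focal cell and that cells beyond the raster edge are ignored, and the function name and the tiny DEM are hypothetical. With a 10-m DEM, radii of 750 m and 2,000 m would correspond to 75 and 200 cells.

```python
import numpy as np

def topographic_position_index(dem, radius_cells):
    """Cell elevation minus the mean elevation of a circular neighborhood."""
    r = radius_cells
    padded = np.pad(dem.astype(float), r, constant_values=np.nan)
    rows, cols = dem.shape
    total = np.zeros(dem.shape)
    count = np.zeros(dem.shape)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dy * dy + dx * dx > r * r:
                continue                      # outside the circular window
            window = padded[r + dy:r + dy + rows, r + dx:r + dx + cols]
            valid = ~np.isnan(window)         # skip cells beyond the edge
            total += np.where(valid, window, 0.0)
            count += valid
    return dem - total / count

# Hypothetical 5 x 5 DEM (m): flat terrain with a single 10-m knoll
dem = np.full((5, 5), 100.0)
dem[2, 2] = 110.0
tpi = topographic_position_index(dem, radius_cells=1)
```

The knoll receives a positive TPI (8 m), its immediate neighbors receive a negative TPI (-2 m) because the knoll raises their neighborhood mean, and flat cells far from the knoll have a TPI of zero.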
Validation of SMRUs. We used direct measurements of soil moisture as an independent dataset to validate the classification of SMRUs at the watershed-scale on the basis of summary statistics. We did not define relevant summary statistics a priori and the number of validation stations was constrained by available data. Eleven validation stations, located within the Sierra Nevada ecoregion (Figure 1), were selected on the basis of their geographical distribution, a continuous period of record spanning wet to dry conditions from 2009 to 2012, no more than 10% missing data, and multiple soil probe depths that span at least 50 cm to provide an integrated estimate of soil water storage (Table 6).
In a separate analysis, we applied the SMRU workflow in the Upper Carson River watershed, which is located just above the rain-to-snow transition, and then validated the classification using a cluster of five validation stations (Poison Flat, Monitor Pass, Forestdale Creek, Blue Lakes, Burnside Lake). The cluster of Carson River validation stations was selected to show an example study region with low climate variability among stations located at similar elevations across a snow-dominated watershed.
TABLE 5 (partial). Landform classes with the two TPI thresholds and the slope criterion.
[...] | Mid-slope drainages, shallow valleys | TPI ≤ −1 | −1 < TPI < 1 | -
3 | Upland drainages, headwaters | TPI ≤ −1 | [...]
[...] | Local ridges, hills in valleys | TPI ≥ −1 | TPI ≤ −1 | -
9 | Mid-slope ridges, small hills in plains | TPI ≥ −1 | −1 < TPI < 1 | -
10 | Mountain tops, high ridges | TPI ≥ −1 | TPI ≥ −1 | -

Datasets were downloaded from the Scripps Institution of Oceanography (Scripps and University of California, San Diego; http://meteora.ucsd.edu/), the North American Soil Moisture Database (Quiring et al. 2015), the Department of Water Resources California Data Exchange Center (http://cdec.water.ca.gov/), the Natural Resources Conservation Service (NRCS) Snowpack Telemetry (SNOTEL) (http://www.wcc.nrcs.usda.gov/snow/), and USGS datasets (Stern et al. 2018). These continuous soil moisture time series were checked for missing data. Cumulative soil moisture, which is a measure of soil water storage, was calculated on a daily basis using the equation:

$$\mathrm{SM} = \frac{\sum_{\mathrm{layers}} \left( \mathrm{depth} \times X \right)}{\mathrm{profile}}$$

where SM is the cumulative soil moisture (%), depth is summed across all soil layers in cm, profile is the total soil profile thickness in cm, and X is the soil moisture value for each soil layer (%). We selected 2010, an average water year (Jian et al. 2011), for validation using statistical measures of central tendency, dispersion, and extremes.
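The cumulative soil moisture calculation can be illustrated as a thickness-weighted average of the layer values, which is the reading that is consistent with the variable definitions given for SM, depth, profile, and X. The function name, layer thicknesses, and moisture values below are hypothetical.

```python
def cumulative_soil_moisture(layer_thickness_cm, layer_moisture_pct):
    """Thickness-weighted soil moisture (%) for one day at one station."""
    profile = sum(layer_thickness_cm)     # total soil profile thickness (cm)
    weighted = sum(d * x for d, x in zip(layer_thickness_cm, layer_moisture_pct))
    return weighted / profile

# Hypothetical probe: layers of 10, 20, and 20 cm at 30%, 25%, and 20%
sm = cumulative_soil_moisture([10, 20, 20], [30.0, 25.0, 20.0])
```

For this example the 50-cm profile has a cumulative soil moisture of 24%, slightly below the simple average of the three readings because the thicker, drier layers carry more weight.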

RESULTS
The explanatory power of the PCs differed across the range of spatial scales represented by the study regions (Table 7). A break in slope scree-test criterion (Figure 3) retained eight PCs for cluster analysis at the California-scale and seven PCs for the Sierra Nevada ecoregion-scale and the Merced-Tuolumne watershed-scale. The first PC for the California-scale analysis explained 35% of the variance compared to 42% and 52% for the Sierra Nevada and Merced-Tuolumne regions, respectively. The PCs used for SMRU classification at the state (Figure 4), ecoregion (Figure 5), and watershed (Figure 6) scales explained 96%, 94%, and 97% of the variance, respectively.
At the California-scale, the SMRU classification (Figure 4a) is strongly influenced by elevation. At this scale, the Merced-Tuolumne watersheds are dominated by six SMRUs (Figure 4b), with the highest likelihood of misclassification occurring near the rain-to-snow transition (Figure 4c), which coincides with the transition from SMRU #3 to #10. The lowest elevations, which are rain-dominated, are primarily classified as SMRUs #5 and #7 and the highest elevations that are snow-dominated are primarily classified as SMRUs #10 and #11.
Note: Variables highlighted in bold were used for cluster analysis and classification of SMRUs. Variable names are described in Table 1.

JOURNAL OF THE AMERICAN WATER RESOURCES ASSOCIATION
At the ecoregion-scale (Figure 5a), topography exerts a larger influence on the SMRU classification and the Merced-Tuolumne SMRUs (Figure 5b) illustrate greater variability across the study region. Previous climate modeling at the ecoregion-scale indicates that ridge and valley topography strongly influences the variability of snow processes (Curtis et al. 2014) and this assertion is supported by the SMRU classification. At this intermediate-scale, the Merced-Tuolumne watersheds are dominated by three SMRUs (#3, #4, and #7) below the rain-to-snow transition that remain strongly influenced by the elevation gradient. Above the rain-to-snow transition, the spatial extent of four SMRUs (#7, #8, #10, and #11) is more variable and shows the influence of ridge and valley topography. At this spatial scale, the highest likelihood of misclassification remains near the rain-to-snow transition (Figure 5c).
At the watershed-scale (Figure 6a), the Merced-Tuolumne SMRUs below the rain-to-snow transition remain strongly influenced by the elevation gradient and are dominated by four SMRUs (#6, #7, #9, and #10) but the spatial extent of the five SMRUs above the rain-to-snow transition (#1, #2, #3, #4, and #5) is highly variable and influenced by finer-scale topography and associated hydrologic processes. At this scale, the highest likelihood of misclassification moved away from the rain-to-snow transition to topographic highlands and lower elevation ridges (Figure 6b). A comparison of the maximum likelihood classification rasters across spatial scales illustrates a lower confidence and higher likelihood for misclassification at finer-scales (Table 8). The percent of each study region with a 50% or greater likelihood of cells assigned to the correct class is 67%, 62%, and 59%, respectively, for the state, ecoregion, and watershed scales. The decrease in confidence and associated increase in the probability of misclassification at finer-scales is further exemplified in the SMRU classification for the Upper Carson River watershed.
To expand the number of validation sites, we applied the SMRU workflow to the Upper Carson River watershed, which is a snow-dominated basin located along the Sierra Nevada crest (Figure 1). We selected five PCs for the cluster analysis and defined the optimal number of classes (N = 11) for classification but the ISOCLUSTER function only classified four SMRUs (Figure 7a). The likelihood of misclassification (Figure 7b) was much higher in comparison to the Merced-Tuolumne watersheds and the percent of the Carson River watershed with a 50% or greater likelihood of cells assigned to the correct class was only 34% (Table 8).
The watershed-scale SMRUs were validated using measured data from six stations along an elevation gradient within the Merced-Tuolumne watersheds (Figure 8a) and a cluster of five stations within the Upper Carson River watershed (Figure 8b). The transect of stations across the Merced-Tuolumne watersheds illustrates the range of soil moisture responses along an elevation gradient that spans the rain-to-snow transition. In comparison, the cluster of stations in the Upper Carson River watershed illustrates a range of soil moisture responses within a narrow elevation range across a snow-dominated watershed. We used the 2010 water year for validation because it represented an average year, but we also show a series of wet to dry years for the period spanning 2009-2012.
We used summary statistics for water year 2010 to describe the shape of the annual soil moisture time series and to further illustrate that the SMRUs capture a range of soil moisture responses to climate across the study regions. The summary statistics demonstrate that SMRU classes are poorly explained by measures of central tendency and better explained by statistical metrics of dispersion and extremes (Figure 9). In the Merced-Tuolumne and Carson River watersheds, measures of dispersion (standard deviation, coefficient of variation, range, variance, and the interquartile range) and extremes (maximum, minimum, skewness, and kurtosis) vary systematically and highlight the primary differences among the SMRUs.
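The summary statistics named above can be computed for a daily soil moisture series as in the following sketch. The choices of sample standard deviation, moment-based skewness, and excess kurtosis are assumptions, since the text does not specify estimators; the function name and the short example series are hypothetical.

```python
import numpy as np

def summary_statistics(series):
    """Central tendency, dispersion, and extremes of a soil moisture series."""
    x = np.asarray(series, dtype=float)
    x = x[~np.isnan(x)]                       # drop missing days
    mean, std = x.mean(), x.std(ddof=1)       # sample standard deviation
    q1, q3 = np.percentile(x, [25, 75])
    z = (x - mean) / x.std()                  # population std for moment ratios
    return {
        "mean": mean, "median": np.median(x),
        "std": std, "cv": std / mean, "variance": std ** 2,
        "range": x.max() - x.min(), "iqr": q3 - q1,
        "max": x.max(), "min": x.min(),
        "skewness": np.mean(z ** 3),
        "kurtosis": np.mean(z ** 4) - 3.0,    # excess kurtosis
    }

# Hypothetical series (%) with one wet spike
stats = summary_statistics([10.0, 12.0, 14.0, 16.0, 18.0, 40.0])
```

In this example the single wet spike leaves the median at 15% but produces a range of 30% and positive skewness, which illustrates why dispersion and extremes can separate stations whose means and medians are similar.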
The response of soil moisture to climatic variations is further illustrated by comparing summary statistics for three stations (Figure 9) that span the rain-to-snow transition within the Merced-Tuolumne watersheds. Merced Grove (MT10) is rainfall-dominated, with a low variability in soil moisture over an annual scale but a rapid response to rainfall. This station had a low standard deviation, coefficient of variation, and skewness but the highest minimum value. The lack of snowpack at Merced Grove results in a narrower range of soil water content because the soil properties enable the soil to continually drain. Gin Flat (MT6) is located within the rain-to-snow transition, where the soil responds to early season rainfall, but soil moisture rapidly declines and stabilizes during the winter as snowpack accumulates. There is a second gradual rise followed by a rapid decline in soil moisture during the spring snowmelt runoff season. Although Gin Flat had mean and median values that were similar to Merced Grove, values for the standard deviation, coefficient of variation, range, variance, maximum, skewness, and kurtosis were larger and the minimum and interquartile range were smaller. Dana Meadows (MT2) is snow-dominated with a small response to early season precipitation and a strong snowmelt signal from May to August. Colder air temperatures maintain snowpack at Dana Meadows for a longer period resulting in stable soil moisture contents over the winter and throughout much of the summer. Dana Meadows had the lowest mean, median, and minimum values but highest standard deviation, coefficient of variation, range, variance, interquartile range, maximum, skewness, and kurtosis. Dana Meadows also had the largest range of soil moisture conditions, but loses soil moisture rapidly, indicating better soil drainage.
[...] Table 6). Note that there are only nine landform classes; plains (Class 5, Table 5) do not exist in this high-relief mountain watershed.
The final step in the proposed workflow is a landform classification that can be used to design and optimize the number and location of soil moisture monitoring stations. We show an example application for the Hetch-Hetchy watershed (1,175 km², Figure 10), an important water supply basin in the Sierra Nevada and located upstream of the Hetch-Hetchy reservoir within the MT watershed boundary (Figure 1). A comprehensive soil moisture network would have stations located in several combinations of landforms and SMRUs but costs and accessibility for operation and maintenance of monitoring equipment must be considered. A more realistic application is to select a transect across a gradient of SMRUs and landforms similar to the Merced-Tuolumne validation transect. The recommended workflow uses a landform classification (Figure 11) along with the maximum likelihood classification (Figure 10) to select soil moisture monitoring locations at the local scale.

DISCUSSION
The primary objective of this study was to design and validate a statewide soil moisture monitoring strategy that can be applied over a broad range of spatial scales. A workflow within a geographic information system was developed to classify SMRUs using a deductive or "top down" approach. Our workflow combines several user-friendly ArcGIS statistical tools and is intended to be easily repeatable. Our nested-scale analysis illustrates how spatial scale influences the SMRU classification. At the California-scale, SMRUs were strongly influenced by elevation. At the ecoregion-scale, the SMRUs captured more heterogeneity above the rain-to-snow transition. At the watershed-scale, the SMRUs capture the influence of finer-scale topography and the integrated effects of soil, vegetation, climate, and hydrology.
Validation was performed using available soil moisture time series data, and our results indicate that the SMRUs capture soil moisture variability across the study regions. In heavily instrumented watersheds, where the relationship between soil water storage and runoff is well understood, a supervised classification could be developed with classes defined on the basis of user criteria.
Maps of the maximum likelihood classification can be used to explore the potential for reducing statistical misclassification and to optimize site selection and sensor location. This is particularly relevant at the watershed-scale because the likelihood of misclassification increased at finer spatial scales.
The final step in our workflow utilizes a landform classification to ensure that a wide range of topographic conditions are represented within a soil moisture monitoring strategy. Site selection and sensor location at the local-scale will benefit from the fine-scale landform classification, which facilitates selection of a range of hillslope positions (e.g., a soil catena) for soil moisture monitoring.
Our proposed workflow will be used to develop a statewide soil moisture monitoring strategy to inform water resource management throughout California. The relation between soil moisture conditions and spring snowmelt is of particular interest. An optimized soil moisture monitoring network can help inform reservoir operations managers by providing an assessment of soil water storage that might be filled with snowmelt, thus reducing snowmelt discharge available for reservoir storage. Other potential applications include: quantifying landscape stress and wildfire risk assessment (Krueger et al. 2016), estimating irrigation demand and crop productivity (Rivers et al. 2015), and risk assessment and early warning of floods (Ionita et al. 2015) and landslides (Mirus et al. 2016).
Many soil moisture sensors are installed at shallow depths because they are intended to calibrate satellite data, but these shallow sensors fail to characterize the complete soil profile. Although deeper measurements are imperative for determining soil response to climate, relatively few stations have sensors installed below the top 5-10 cm of soil. We recommend a standardized approach for monitoring that includes installation of deeper subsurface sensors. Ideally, sensors would be installed in at least three different soil horizons that include the rooting zone and the soil-bedrock interface (Flint et al. 2008).

CONCLUSIONS
Hydrologic conditions under future climates are expected to be more variable and more extreme, which will make quantifying the role of soil moisture within the water balance increasingly important for runoff forecasting and water resource management. Soil moisture monitoring is essential to understanding soil moisture response to variable climate conditions and to forecasting extreme or anomalous conditions related to spring snowmelt discharge and summer baseflows. Our workflow for designing and validating statewide soil moisture monitoring strategies will be an integral component of water resource management. Our example applications captured a range of soil moisture responses to climate across regional to watershed scales and can be used to optimize the number and location of soil moisture stations necessary to improve hydrologic characterization. The workflow developed here is unique and can be implemented over a range of spatial scales using existing, publicly available input datasets and software, or it can be customized by users for their specific needs.
Better characterization of the linkages between soil water storage and runoff, particularly during snowmelt discharge and summer baseflow periods, could greatly advance the understanding of soil-runoff processes, improve runoff forecasts, and inform the timing of surface water reservoir operations. Improvements in data collection platforms are needed, particularly real-time data collection over a range of depths within the soil profile, and the network of soil moisture monitoring locations should be expanded. Implementing a real-time soil moisture monitoring strategy for California could usher in a new paradigm of water supply management that incorporates a broader spectrum of measured data to monitor the hydrologic cycle as completely as possible.