Evaluation and Optimization of Snow Albedo Scheme in Noah‐MP Land Surface Model Using In Situ Spectral Observations in the Colorado Rockies

The Biosphere‐Atmosphere Transfer Scheme (BATS) ground snow albedo algorithm is commonly used in land‐surface models (LSM), weather forecasting and research applications. This study addresses key uncertainties in BATS simulated ground snow albedo within the Noah‐MP LSM framework through evaluation and optimization of the Noah‐MP BATS ground snow albedo formulation using 2‐band (visible and near‐infrared (NIR)) in situ albedo observations at Rocky Mountain field stations. The Noah‐MP BATS ground snow albedo scheme is extremely sensitive to its input parameters. Namely, an ensemble generated by varying BATS input parameters within potentially plausible ranges provides an average daily range (maximum ensemble member minus minimum ensemble member) of ground snow albedo exceeding 0.45 in visible and NIR bands. Parameter optimization improves agreement between simulated and in situ observed ground snow albedo in visible, NIR and broadband spectrums. Importantly, optimized parameters result in reduced biases relative to observed fresh‐snow albedo and better agreement with observed albedo decay. Our analysis across different sites supports that the optimized BATS ground snow albedo parameters are appropriate to transfer in space and time, at least within the region studied (the central‐southern Rocky Mountains). The primary error source remaining after parameter optimization is that observed fresh‐snow albedo is highly variable, particularly in the NIR spectrum, whereas BATS fresh‐snow albedo is constant, an issue which requires further investigation. This study shows significant correlations between observed fresh‐snow albedo and surface meteorological conditions (e.g., downward shortwave radiation and temperature) which can support future model development that attempts to include a time‐varying formulation for fresh‐snow albedo.

Snow albedo schemes in LSMs, particularly those used in weather and hydrological forecast models, typically attempt to maintain established theoretical relationships that govern snow albedo evolution with high computational efficiency (Hansen et al., 1983;He & Flanner, 2020;Pedersen & Winther, 2005;Roeckner et al., 2004;Verseghy, 1991;Yang et al., 1997). Relatively simple schemes, such as the CLASS parameterization (Verseghy, 1991), decreases snow albedo in time at an exponential rate modulated by a user-defined decay factor. This attempts to implicitly account for the suit of factors that affect snow albedo decay with a single time-constant parameter, and thus assumes that optical or environmental factors that affect snow albedo do not change in time. Thus, although simpler schemes may be effective at modeling average albedo decay rates, they are not capable of reflecting temporal variability in decay rates as a function of time-changing factors. Schemes increase in complexity as more factors that affect snow albedo evolution are explicitly accounted for by the model (e.g., the complex Snow, Ice, and Aerosol Radiative (SNICAR) model which is implemented in the Community Land Model (CLM)) (Flanner et al., 2007(Flanner et al., , 2021. Relationships between snow albedo and the environment that have been incorporated into more complex models relate to the inherent optical properties of snow, and environmental factors, and are: (a) Snow grain size increases as snow ages, which reduces albedo, primarily in the near-infrared (NIR) spectrum (Wiscombe & Warren, 1980). (b) Deposition of light absorbing particles (e.g., dust and soot) on snow can significantly decrease snow albedo, primarily in the visible spectrum (He et al., 2018;Painter et al., 2010Painter et al., , 2013Skiles et al., 2012). (c) Albedo increases as solar zenith increases, with near-infrared (NIR) albedo being more sensitive to sun angle than visible albedo (Wiscombe & Warren, 1980). (d) Cloud cover converts direct radiation into diffuse radiation, which has been observed to increase snow albedo (Gardner & Sharp, 2010;Wiscombe & Warren, 1980). (e) Increasing liquid water content decreases snow albedo; a direct minor reduction due to water absorption in the NIR and indirect reduction by accelerating snow grain growth and filling the pore space (Colbeck, 1979;Donahue et al., 2021Donahue et al., , 2022Grenfell & Maykut, 1977). (f) NIR and visible albedo are mostly insensitive to snow depth when the snowpack's snow water equivalent exceeds 20 cm (Wang et al., 2020;Wiscombe & Warren, 1980). (g) Albedo decay is partially driven by shortwave incident radiation and near surface air temperature, which relates to the rate of snow grain growth (Amaral et al., 2017;Calleja et al., 2021).
The Biosphere-Atmosphere Transfer Scheme (BATS) ground snow albedo formulation (Dickinson et al., 1986;Yang et al., 1997) is inferred from snowpack radiative transfer calculations of Wiscombe and Warren (1980). BATS is an intermediately complex algorithm which allows for high computational efficiency while maintaining representations of the above interactions between direct and diffuse albedo over visible and NIR spectrums with snow age, surface temperature, solar illumination angle, and absorptive impurities. Given the balance between model complexity and computation efficiency, BATS is widely used in research-application and operational modeling systems such as the Noah-MP LSM, Weather Research and Forecasting model (WRF) (Powers et al., 2017), versions 2.0-3.0 of the WRF-Hydro/NOAA National Water Modeling system (Gochis et al., 2020), the Common Land Model (CoLM) (Li, Lu, et al., 2017), and the land component of the MPI Earth System Model (JSBACH3) (Reick et al., 2021).
Previous research has compared BATS simulated snow albedo with in situ and remotely sensed observations from point to global scales (Jin et al., 1999;Malik et al., 2014;Molotch & Bales, 2006;Wei et al., 2001;Xu & Shu, 2014;Yang et al., 1997;Zhou, 2003). BATS can accurately represent snow albedo dynamics (r > 0.9) (Malik et al., 2014), and has been found to outperform albedo models that are solely based on snow surface aging (Molotch & Bales, 2006). However, BATS has suffered from systematic overestimation of snow albedo relative to observations when using default and optimized (minimized RMSE) parameters (Jin et al., 1999;Malik 3 of 21 et al., 2014;Molotch & Bales, 2006). Molotch and Bales (2006) pointed out a likely overestimation of BATS simulated spring snow albedo because the Wiscombe and Warren (1980) parameterization, which BATS is based on, tends to overestimate albedo for large grains that tends to be more prevalent during ablation periods. BATS overestimated snow albedo has been shown to be exacerbated through snow albedo feedbacks: high albedo producing lower snowmelt, and lower snowmelt resulting in prolonged snowpack which increases albedo (Malik et al., 2014). BATS snow albedo is determined to be particularly error prone immediately following snowfall events (Jin et al., 1999;Molotch & Bales, 2006). This finding is consistent with an independent global evaluation of BATS, which found MODIS observes substantially lower pure snow NIR albedo than the default BATS setting in high latitude regions (Zhou, 2003).
Assimilation of albedo observations within the BATS scheme can improve simulated snowpack evolution (Xu & Shu, 2014), but has limited benefit for future forecasts. Furthermore, nonlinear interactions between snow and other processes in LSMs has made it difficult for these studies to address if simulated snow albedo is attributable to deficiencies in snow albedo formulations or other model processes (e.g., snow cover area dynamics) (Wang & Zeng, 2010). Unfortunately, albedo model evaluations are lacking because detailed spectral measurements of radiation are rare in remote snow-covered regions (Bair et al., 2019). Therefore, the need to better understand and improve BATS physical representation of snow albedo through novel field experiments and parameter optimization remains an important area of research (Malik et al., 2014;Pirazzini, 2009).
This study addresses known uncertainties in BATS simulated ground snow albedo to improve accuracy of LSM simulated snowpack evolution by evaluating and optimizing the Noah-MP BATS ground snow age and albedo formulation using in situ two-band (visible and NIR) albedo observations at Rocky Mountain field stations. Comparisons between BATS simulated and observed snow albedo are designed to better understand BATS ground snow albedo error characteristics and reduce errors via parameter optimization. The three science questions this manuscript addresses are: (a) How sensitive is the Noah-MP BATS snow albedo scheme to its input parameters which are commonly assumed to be equivalent to default values defined in Yang et al. (1997)? (b) To which degree can parameter optimization improve agreement with observed snow albedo? (c) Are the optimized parameters spatially and temporally transferable?

Study Sites and In Situ Observations
Observational data used in this study come from three different high-elevation study sites within the southern Rocky Mountains in the central and southwestern parts of the state of the Colorado (Figure 1). The sites are divided into one model evaluation and training site and two parameter transferability assessment sites. The primary model training and evaluation site is called the "Irwin" study site which is located approximately 11 km west-northwest of the town of Crested Butte, Colorado at an elevation of 3,168 m (10,423 ft.). Pairs of Huskeflux SR-11 upward-looking and downward-looking broadband solar radiation (285-3,000 nm) and NIR (695-3,000 nm) pyranometers sensors were mounted at two different levels on an ∼10-m-tall tower. Upward-looking sensors are positioned at the top of the tower while downward-looking sensors are located approximately 4 m above the bare ground surface. In this separated manner, both pairs of sensors should have minimal "field-of-view" contamination from the tower structure. Any changes in upward versus downward exchange between the upward and downward-looking sensors due to sensor location is considered to be insignificant. The snow surface slope is not characterized at the site, no slope corrections for local solar zenith angle is applied to the down-looking pyranometers , which may introduce some uncertainty into calculated albedo. The Irwin site is characterized as a high-elevation subalpine site with sparse forest canopy within a cold continental climate regime. Care was taken in tower and sensor siting to minimize influences of surrounding trees on incoming and outgoing radiation measurements. In addition to basic meteorological measurements of air temperature, humidity, and wind speed automated, ultrasonic snow depth measurements are made on the Irwin site. Grass understory exists at the site but the understory is fully buried by snow during the periods of investigation used in this study as analysis periods are limited to times when in situ measurements of snow depth exceed 0.2 m.
The first model parameter transferability assessment site is the "Senator Beck" Basin Study Area operated by the Center for Snow and Avalanche Studies, near Silverton, Colorado, which lies approximately 140 km southwest of the Irwin study site (https://snowstudies.org/senator-beck-study-area-overview/). The Senator Beck site 4 of 21 is located in an alpine region at an elevation of 3,703 m (12,186 ft.). Alpine tundra vegetation at the site is very short and easily buried by seasonal snowpack but the same minimum criteria for snow depth of 0.2 m was applied at this site for our analysis. Numerous investigations on snowpack processes, including snow albedo, have been conducted at the Senator Beck site (e.g., Bair et al., 2019;Skiles et al., 2012;Skiles & Painter, 2017). The Senator Beck measurement suite, described in Landry et al. (2014), has a large number of snowpack thermodynamic and radiative transfer sensors including automated snow depth measurements and pairs of upward-looking and downward-looking broadband and NIR-filtered Huskeflux SR-11 sensors allowing for direct comparison of observational data and all model evaluation metrics, including the two-band BATS albedo estimates, between the Irwin and Senator Beck sites.
The second model parameter transferability assessment site is referred to as the "East River" pumphouse eddy-covariance/energy balance (EC/EB) study site, located approximately 6.5 km northeast of the town of Crested Butte (14.3 km east-northeast of the Irwin site) at an elevation of 2,762 m (9,061 ft.). The East River tower was installed as part of the Department of Energy East River Watershed Science Focus Area program and was designed to characterize land-atmosphere exchanges of radiation, energy and water from a high-mountain riparian area (Ryken et al., 2021). As is typical of many EC/EB tower installations, a single four-way radiometer (Kipp and Zonen CNR-4) was installed at the top of the EC/EB tower at a height of 6 m above the ground surface. The spectral channels of the East River radiometer are limited to broadband incoming and outgoing shortwave radiation (300-2,800 nm) as well as incoming and outgoing longwave radiation (4.5-42 µm). So, a

of 21
primary limitation of the East River site is that only broadband albedo, as opposed to partitioned visible and NIR components, can be assessed. However, like the East River site, many existing EC/EB flux towers possess four-way radiometer measurements similar or identical to the East River Kipp and Zonen CNR-4 sensor, so this site is used as an assessment site of opportunity to evaluate broadband shortwave albedo. Vegetation at the East River site is composed of riparian grasses and willow galleries of varying densities up to 2 m in height when not buried by snow. Both the grasses and willow galleries at this site get fully buried by seasonal snowpack and the willows typically get bent down by accumulating snow as the seasonal pack develops. Nevertheless, because of the greater grass and willow heights, evaluations of snow albedo are limited to periods when snow depth at the site, as measured by an ultrasonic snow depth sensor, is in excess of 0.5 m.
Lastly, all three sites reside in areas of very strong topographic variability and are surrounded, in various directions, by higher mountain peaks and ridges. As such, topographic shading of radiation measurements occurs at each site. Additionally, nearly all field measurements of incoming solar radiation suffer from sensor dome reflection and refraction at low sun angles which can greatly affect albedo measurements. To eliminate impacts from topographic shading and sensor dome contamination issues, measurements of snow albedo used in this study are limited to 1 hr before and after local solar noon, when the sun is nearly directly overhead. While this limited period may not fully account for albedo-relevant processes at low sun angles, it should provide a robust estimate of snow albedo with the least amount of uncertainty introduced by measurement issues and local terrain effects.
At the Irwin and Senator Beck stations, observed visible solar radiation is computed by subtracting the NIR incoming and outgoing solar radiation from the broadband incoming and outgoing solar radiation (Equations 1 and 2, respectively) : where ↓ and ↑ represent incoming and outgoing solar radiation, respectively. Subscripts "V", "NIR," and "broadband" denote visible, NIR, and broadband, respectively. Visible albedo is computed as the ratio: of ↑ ∕ ↓ .

Noah-MP Model Setup
Noah-MP  is implemented as a land component of the community Weather Research and Forecasting model (WRF), the community WRF-Hydro modeling system in its configuration as the operational NOAA National Water Model, and in the NOAA Unified Forecast System. Noah-MP simulations conducted in this analysis use model-physics options from the WRF/Noah-MP options used in the continental-scale convection-permitting regional climate simulations (He et al., 2019;Liu et al., 2017), except the BATS scheme is chosen to compute snow albedo instead of CLASS. Noah-MP snow-related parameters follow the values used in the recent release of WRF/Noah-MP version 4.3 (https://github.com/wrf-model/WRF/tree/release-v4.3), where the snow cover parameter has been updated to improve simulated surface albedo and temperature . fications.pdf) data set used to drive NOAA's National Water Model (NWM). At the Irwin site, in situ observed downward shortwave radiation is used instead of AORC data at times with available observations to minimize simulation uncertainty attributable to uncertain downward shortwave radiation. At the Senator Beck site, in situ observed downward shortwave and longwave radiation, precipitation, air temperature, wind speed, pressure, and specific humidity are used instead of AORC data at times with available observations. At the East River site, in situ observed downward shortwave and longwave radiation and air temperature are used instead of AORC data at times with available observations.

BATS Ground Snow Albedo Scheme
The BATS ground snow albedo formulation, described in Yang et al. (1997), is the most sophisticated snow albedo physics option within the Noah-MP LSM. It computes ground snow albedo accounting for direct and diffuse radiation in visible and NIR bands. Broadband ground snow albedo ( ) is computed as the average of visible ( ; < 0.7 μm) and NIR ( ; ≥ 0.7 μm) snow albedo: and are the fractions of total transmitted solar radiation that is direct and diffuse, respectively. Direct visible ( − ) and NIR ( − ) snow albedos are solved as: − and − are diffuse visible and NIR albedo, respectively. V DIR and NIR DIR are the cosZ factor for direct visible and NIR snow albedo, respectively. is a factor, ranging between 0 and 1, to parameterize the effect of solar zenith angle on snow albedo: where cosZ is the cosine of solar zenith angle and b is an adjustable parameter. Diffuse albedos are calculated as: where and are fresh-snow visible and NIR albedo with solar zenith angle less than 60°. V age and NIR age are tunable parameters.
is a snow age factor, ranging between 0 and 1, computed as where is defined as t denotes the current time step, ∆ is the change in snow water equivalent between t and t − 1, and SWE MX is the snow water equivalent required to fully cover old snow. ∆ is computed as where 0 is a tunable parameter with a unit of seconds. 1 represents the effects of snow grain growth from vapor diffusion, where Grain growth is a tunable parameter, T FRZ = 273.16 K, and T g is the ground temperature. 2 parameterizes additional effects of snow grain growth near or at the freezing of meltwater: where Extra growth is a tunable parameter. 3 (i.e., dirt-soot) parameterizes the effects of snow impurities, and is a tunable parameter.
We created a box model comprised of snow age and snow albedo subroutines from the Noah-MP BATS snow albedo scheme, as summarized by Equations 3-15. This box model is used to directly constrain the aforementioned BATS parameters and reduce computation time in the parameter optimization process (described in Section 2.4) where tens of thousands of simulations were performed. The box model effectively simulates visible and NIR albedo nearly identical to the full Noah-MP simulations, with minimal differences due to SWE (as input to the box model in Equation 12) having slightly different temporary values within Noah-MP time steps ( Figure S1 in Supporting Information S1). The box model requires dynamic inputs of T g , SWE, cosZ and downward transmitted direct and diffuse solar radiation, which are taken from the reference Noah-MP simulation (using default parameters listed in Table 1). We tested the adequacy of this step by finding that 5,000 box model simulations reproduce nearly identical results when inputting default parameters and an ensemble of dynamic time series (generated from Noah-MP simulations using unique BATS parameter sets) during periods when snow depth exceeds 0.2 m. Thus, the simplified box model that does not account for parameter tuning feedbacks on T g and SWE is an adequate tool because parameter tuning does not have noticeable effects on reference T g or SWE that translate to significant changes in simulated ground snow albedo ( Figure S2 in Supporting Information S1). Note, parameter tuning does result in large changes to SWE and T g , but these changes do not noticeably alter BATS simulated snow albedo during periods when snow depth is greater than 0.2 m. The box model (MATLAB R2020b) function is publicly available here: https://github.com/RAbolafiaRosenzweig/BATS-Snow-Albedo-for-MATLAB.

BATS Ground Snow Albedo Sensitivity Testing and Parameter Optimization
The BATS ground snow albedo algorithm is typically run assuming default parameter values proposed by Yang et al. (1997). We test the sensitivity of the BATS scheme to each parameter by varying one parameter at a time through running the above box model over the Irwin site, holding all the other parameters at reference values. Ten simulations are run for each parameter (i.e., 10 different values covering evaluated ranges for each parameter), where evaluated parameter ranges (based on expert opinion from the authors of this manuscript) are defined in Table 1. Evaluation metrics for parameter sensitivity include: (a) the 10-member ensemble range of average daily albedo and (b) the 10-member ensemble range of variance. Insensitive parameters, which are held at default values during optimization, were only considered as those that are quantified within the four least sensitive parameters for both of these metrics. This is a strict categorization of insensitivity which likely allows relatively insensitive parameters to be tuned, however, the box model we use is computationally inexpensive and allows for a large number of simulations that can reasonably evaluate a large parameter space.
We next consider the sensitivity of the BATS scheme to variability across all tunable parameters (i.e., parameters categorized as sensitive based on the above procedure) by running 15,000 BATS snow albedo simulations using the aforementioned box model at the Irwin study site, where these simulations vary by parameter settings determined through Latin hypercube sampling the parameter space (consisting of nine tunable parameters in this case). The corresponding 15,000-member ensemble range provides a metric for the range of potentially simulated ground snow albedo at the Irwin site based on unique combinations of parameter settings. This is important in understanding if previously reported BATS biases (e.g., Molotch & Bales, 2006;Zhou, 2003) fall within the range of BATS parameter-based uncertainty. BATS visible and NIR ground snow albedo parameters are optimized at the Irwin study site by choosing the parameter set (of the 15,000 unique realizations) that maximizes the Nash-Sutcliffe Efficiency (NSE; Equation 19) relative to observed visible and NIR albedo.

Evaluation Metrics
Time series analyses are conducted to compare BATS simulated ground visible, NIR, and broadband snow albedo to corresponding observations. All evaluations consider daily time series, averaged during midday hours (±1 hr from solar noon) (see Section 2.1 for details). Importantly, we directly compare BATS simulated ground snow albedo outputs with observed ground snow albedo, which are not affected by snow cover area or vegetation due to the selection of analysis periods with snow fully covering short-vegetated ground at the measurement sites (see Section 2.1). This allows BATS ground snow albedo to be evaluated with reduced confounding effects from other LSM processes.
We consider a range of evaluation metrics that quantify simulation accuracy uniquely, as described below.
The Pearson correlation coefficient (r) (Pearson & Henrici, 1896), ranging from −1 to 1, is widely used in hydrological modeling studies, and thus serves as a well-known benchmark for performance: The bias in mean albedo considers the average difference between simulated and observed albedo. This metric is used to understand the degree to which predicted albedo can accurately simulate the mean state of observed albedo.

= −
Similarly, the bias in albedo variability is computed as the difference between simulated and observed time series variance. This metric is used to understand the degree to which predicted albedo can accurately simulate the variance of observed albedo.
Nash-Sutcliffe Efficiency (NSE) (Nash & Sutcliffe, 1970) is a commonly used robust metric to compare simulated and observed processes which normalizes model performance into an interpretable scale. NSE = 1 indicates 9 of 21 perfect correspondence between simulations and observations; NSE = 0 indicates that model simulations have the same explanatory power as the mean of the observations and NSE <0 indicates that the model is a worse predictor than the mean of the observations (Schaefli & Gupta, 2007).
We also report the Taylor score (S) (Taylor, 2001), which combines metrics of correlation and the root mean difference, where S has a lower bound of 0 and higher S corresponds with higher simulated accuracy.
where O t and P t represent observed and simulated albedo at timestep t, respectively. and represent the time series mean of observed and simulated albedo, respectively. n is the number of days in each time series. and are the standard deviation of observed and simulated time series, respectively. 0 is the maximum attainable correlation, assumed to be 1.0 in all cases in this analysis.

Albedo Decay Curves
We construct observed and simulated albedo decay curves at the Irwin and Senator Beck sites using daily averaged albedo time series (averaged over ±1 hr solar noon) to evaluate simulated albedo decay dynamics (Sections 3.2 and 3.3, respectively). These decay curves display albedo categorized by the number of days after a previous snowfall event. Snowfall events are considered instances when observed snow depth increases by more than 20 mm.

BATS Parameter Sensitivity and Optimization at the Irwin Site
The mean state of BATS simulated daily broadband snow albedo is most sensitive to , , and 0 , whereas the simulated daily broadband albedo variability is most sensitive to , , and 0 ( Figure 2 and  Figure S3 in Supporting Information S1). and define fresh-snow albedo in visible and NIR spectrums, and thus exert strong control on the mean state of visible and NIR albedo, respectively. and scale snow aging effects on albedo, and thus these parameters exert strong control on the temporal variability of visible and NIR albedo, respectively. 0 modulates f age and thus the albedo decay rate, where lower 0 corresponds with higher f age and faster decay. Note, 0 only exerts strong control on the mean albedo state when this parameter has a low setting (e.g., 1 × 10 5 s).
To optimize these parameters, a 15,000-member ensemble of ground snow albedo is generated through Latin hyper cube sampling unique parameter sets, considering all BATS snow albedo parameters to be tunable excluding b, Extra growth and V DIR given the low sensitivity of simulated albedo mean and variability to these parameters ( Figure 2). These 15,000 unique realizations of ground snow albedo have large variability. Specifically, daily average ensemble ranges of simulated visible and NIR albedo are 0.49 and 0.47, respectively ( Figure 3). This is quite large considering the observed range of daily visible and NIR albedo is less than 0.38 at this site throughout the snow-covered periods when observations are available. Hence, systematic biases from BATS snow albedo (e.g., Molotch & Bales, 2006;Zhou, 2003) can likely be ameliorated through optimal parameter selection.
Here, we note key differences between reference and optimized parameter sets that can aid future BATS snow albedo modeling efforts. Table 1 presents optimized values for each Noah-MP snow age and BATS parameter. The optimal value for (0.91) is lower than the default value (0.95) based on the field measurement from Wiscombe and Warren (1980). Conversely, the optimal value for (0.74) is significantly higher than the default value (0.65). However, this provides only a slightly higher resultant fresh broadband albedo (0.825 from the optimized case and 0.8 from the reference case). These optimal settings for fresh-snow albedo closely correspond with the median of observed fresh-snow albedo in visible (0.90) and NIR (0.75) wavelengths at the Irwin site (Figure 5a), thus the parameter optimization methodology (i.e., maximizing NSE) results in physically realistic parameter selection for fresh-snow albedo. We find that optimized V age and NIR age are both much larger than the default settings defined in Yang et al. (1997), which is qualitatively consistent with the Malik et al. (2014) BATS parameter calibration. Using optimized and higher V age and NIR age values better represent the observed fast albedo decay, relative to the too-slow reference simulated decay rate (e.g., Figure 4 and Table 2). Optimal 0 and SWE MX are significantly different for optimal visible and NIR albedo cases; however, Noah-MP currently considers this parameter equivalent in computations of visible and NIR albedo. Thus, Noah-MP simulations may benefit from computing the effects of snow albedo aging in visible and NIR wavelengths separately. Therefore, all optimized results presented herein compute visible and NIR albedo aging effects separately. Optimal dirt-soot settings for visible and NIR albedo simulations (0.25 and 0.11, respectively) are substantially lower than the default setting (0.3). This supports an argument provided by Molotch and Bales (2006) that the dirt-soot parameter should be calibrated at multiple sites rather than accepting the default 0.3 from Yang et al. (1997). Our optimal

of 21
set of the parameters shows a NSE larger than the other sets of the parameters ( Figure S4 in Supporting Information S1), although the NSE variability in performance of the top 20 parameter sets is relatively small, whereas the parameter set variability in the top 20 sets for some tunable parameters (e.g., 0 , NIR age , dirt-soot) is large ( Figure  S4 in Supporting Information S1). This indicates that the optimal parameter set selection is not robust because equifinality may play a role in selecting the optimal parameter sets with very close NSE.

BATS Ground Snow Albedo Evaluation at the Irwin Study Site
The reference BATS simulation systematically overestimates visible albedo (mean bias = 0.08) but underestimates NIR albedo (mean bias = −0.02); however, these biases mostly cancel out when considering broadband albedo (mean bias = 0.01) ( Table 2). The reference simulation provides an adequate correlation for visible, NIR, and broadband albedo (r ≥ 0.69) ( Table 2 and Figure 4). Small biases in reference NIR albedo are partially attributable to compensatory errors: reference NIR fresh-snow albedo is lower than observations but too-slow albedo decay rates (controlled by low NIR age ) result in a higher NIR albedo, with the mean NIR albedo close to observations (Figures 4-6). Reference visible albedo has large biases due to overestimated fresh-snow visible albedo and underestimated albedo decay rate (controlled by low V age ). Parameter optimization improves agreement between simulated and observed visible, NIR and broadband albedo across all evaluation metrics (Table 2 and Figure 4). Lowered biases in simulated visible and NIR albedo are largely attributable to optimized fresh-snow albedo settings (i.e., and ), which exert a strong control on the mean albedo state, and optimized (i.e., increased) aging parameters (i.e., V age and NIR age ) which enable faster and more accurate decay rates ( Figure 6). Overall, parameter optimization provides valuable improvements for simulated ground snow albedo at the Irwin site, in particular when considering the S and NSE metrics (Table 2).
BATS simulated ground snow albedo is much more accurate over warm months, relative to winter months when albedo variability is largely modulated by variability in fresh-snow albedo (Table 2; Figure S5 in Supporting Information S1). Over winter months, both optimized and reference BATS albedo fail to capture the temporal variability of observed visible, NIR, and broadband albedo (r ≤ 0.26). However, simulated winter albedo generally maintains low biases (−0.03 and −0.02 for reference and optimized broadband albedo, respectively) ( Table 2). The low correlation between simulated and observed albedo during winter months indicates that factors controlling variability in observed albedo may not be well represented in the BATS scheme. For instance, observations show 12 of 21 a large range of fresh-snow albedo ( Figure 5), which is currently constant in BATS. Two processes may be responsible for the observed fresh-snow variability: (a) fresh-snow is deposited with time-varying grain size, and (b) fresh-snow experiences a rapid decay between the time of deposition and the time a valid albedo observation is recorded at midday. Both explanations are consistent with the observed correlation between ↓ and fresh-snow NIR albedo (r = −0.47, p < 0.01; Figure 5); however, variability in fresh-snow visible albedo has an insignificant correlation with ↓ . Given the moderate correlation between fresh-snow NIR albedo and ↓ , observed variability in fresh-snow albedo is likely controlled by other time-varying factors such as temperature, unmodeled impurities, cloud cover, humidity and/or measurement uncertainties (e.g., sensor obstruction). Discrepancies between BATS fresh-snow albedo and observed fresh-snow albedo motivates future research to explore the physics of new snow albedo variability.
Parameter optimization results in large improvements for BATS snow albedo over warm months. For instance, the optimized broadband snow albedo has an NSE of 0.73 over warm months whereas the reference broadband snow albedo has an NSE of −2.12 (Table 2; Figure S5 in Supporting Information S1). Improvements in broadband snow albedo are attributable to improved simulated visible and NIR snow albedo (Table 2). Improvements in simulated albedo from March-June strongly suggests that simulated ablation from the Noah-MP LSM can be substantially improved via optimized BATS parameterization.
Parameter optimization results in better agreement with observed visible, NIR, and broadband albedo decay ( Figure 6). The average bias for the median and range (10th to 90th percentiles) of optimized visible albedo following snowfall events (up to 16 days) is −0.04 and −0.08, respectively; whereas reference biases are 0.07  and −0.16, respectively. The average bias for the median and range of optimized NIR albedo following snowfall events is 0.01 and −0.00, respectively; whereas reference biases are 0.07 and −0.14, respectively. Albedo variability following snowfall is partially controlled by meteorological conditions (e.g., ↓ and air temperature) (Figures 7, 8 and Figures S6, S7 in Supporting Information S1) (Amaral et al., 2017;Calleja et al., 2021). Observed and simulated albedo, categorized by days following snowfall, have significant (p ≤ 0.05) correlations with air temperature and ↓ which supports that these meteorological conditions partially explain decay rate Figure 6. Ground snow albedo decay at the Irwin site n-days (horizontal axis) following snowfall. Solid dots represent the median albedo across all instances of albedo records following snowfall and whiskers represent the 10th and 90th percentiles. Ground snow albedo decay is shown in (a) visible, (b) near-infrared (NIR), and (c) broadband wavelengths. Presented data for days 7-8 after snowfall events only represent two decay events and afterward only one.

Figure 7.
Comparison of in situ observed ↓ and ground snow albedo, grouped by days following snowfall at the Irwin site. Scatter plots compare variability in visible (blue) and near-infrared (NIR) (red) albedo with daily averaged ↓ for each specific day. The relationship between ↓ and albedo n-days after snowfall (indicated by column titles), is shown for observed (top row), Biosphere-Atmosphere Transfer Scheme (BATS) optimized (middle row) and BATS reference ground snow albedo (bottom row). Underlined correlations are statistically significant (p ≤ 0.05).
14 of 21 variability across multiple albedo decay events (5-47 decay events observed, varying by days following snowfall events). Correlations between observed and simulated albedo with meteorological conditions at the Irwin (Figures 7 and 8) and the Senator Beck sites ( Figures S6 and S7 in Supporting Information S1) support that the BATS snow albedo scheme correctly represents directional controls of ↓ and air temperature on snow albedo decay rate. Observed NIR albedo tends to have higher correlations with these meteorological conditions than visible albedo at both Irwin and Senator Beck sites because NIR snow albedo is more sensitive to snow grain growth (He et al., 2018;Wiscombe & Warren, 1980) during aging driven by high downward radiation and temperature. Observed fresh-snow albedo is more sensitive to ↓ than simulated albedo at both Irwin and Senator Beck sites (Figure 7 and Figure S6 in Supporting Information S1). Thus, the BATS scheme may be improved by incorporating a dependence of fresh-snow albedo on ↓ and other potential meteorological variables, rather than the assumed constant representation. However, more detailed research outside the scope of this analysis is required to integrate fresh-snow's dependence on meteorological conditions considered herein and other factors (e.g., humidity and cloud fraction) (Wang et al., 2020) that govern variability in fresh-snow albedo. Overall, lower biases in albedo variability from the optimized simulation, relative to the reference simulation (Table 2), supports that parameter selection is important to accurately simulate albedo decay rate but there is still room for BATS model improvement by including a temporally dynamic representation of fresh-snow albedo. Furthermore, remaining errors after parameter optimization are likely at least partially attributable to discrepancies between the 1 km AORC atmospheric forcing and the site's true meteorological conditions. f age (Equation 11) controls the average albedo decay rate and variability in decay rates in the BATS scheme (Figure 9 and Figure S8 in Supporting Information S1), where the sensitivity of simulated albedo to f age is modulated by input parameters V age and NIR age (Equations 9 and 10). For instance, in Figure 9a f age curves have high correlations with BATS simulated visible and NIR albedo decay curves (r = −1.00). f age also has high correlations with observed visible and NIR albedo decay curves (r = −0.95 and −0.96 for visible and NIR albedo, respectively). Thus, the BATS f age formulation successfully simulates average observed decay dynamics. Simulated fresh-snow variability strongly relates with f age ; however, variability in observed fresh-snow albedo has weak relationships with f age at the Irwin (Figure 9b) and Senator Beck ( Figure S9 in Supporting Information S1) sites (|r| ≤ 0.35). This further supports that simulated controls on variability in snow albedo immediately following snowfall are inaccurate. During two to 5 days following snowfall, observed albedo decay rate variability tends to have a significant relationship with f age (Figure 9b). Thus, factors that modulate variability in albedo decay Figure 8. Comparison of Analysis of Record for Calibration (AORC) air temperature and in situ observed ground snow albedo, grouped by days following snowfall at the Irwin site. Scatter plots compare variability in visible (blue) and near-infrared (NIR) (red) albedo with daily averaged air temperature for each specific day. The relationship between air temperature and albedo n-days after snowfall (indicated by column titles), is shown for observed (top row), Biosphere-Atmosphere Transfer Scheme (BATS) optimized (middle row) and BATS reference ground snow albedo (bottom row). Underlined correlations are statistically significant (p ≤ 0.05). 15 of 21 rates more than a day after snowfall events tend to be reasonably represented through BATS snow aging formulation. Figure 9a shows differences between f age curves following snowfall events for BATS simulations using parameters optimized for visible and NIR albedo. Importantly, these differences support that the Noah-MP BATS snow albedo scheme may benefit from an update that allows snow albedo aging for NIR and visible albedo to be computed separately. This separate treatment is physically reasonable because aging-driven changes in snow grain properties (e.g., size, shape, packing structures) have different impacts on visible and NIR snow albedo (He & Flanner, 2020).

Parameter Transferability to Senator Beck Site
We use the optimized parameters from the Irwin site (Table 1) to generate an optimized simulation at the Senator Beck site during different time periods to test the spatial and temporal transferability of the optimized parameters. The optimized Noah-MP simulation produces more accurate snow albedo than the reference simulation at Figure 9. (a) Optimized Biosphere-Atmosphere Transfer Scheme (BATS) simulated ground snow albedo and f age n-days (horizontal axis) following snowfall. Solid dots represent the median albedo across all instances of albedo records following snowfall and whiskers represent the 10th and 90th percentiles. The "*" symbol represents the median f age across all instances of albedo records following snowfall, where the black line and symbols correspond to f age using optimal near-infrared (NIR) albedo parameters and the gray line and symbols correspond to f age using optimal visible albedo parameters. (b) Comparison of daily f age from optimized BATS visible and NIR albedo simulations, grouped by days following snowfall with observed (top row) and BATS optimized ground snow albedo (bottom row). Underlined correlations are statistically significant (p ≤ 0.05).

of 21
the Senator Beck site ( Figure 10; Table 3). Specifically, visible and NIR albedo from the optimized simulation have substantially lower biases than the reference simulation and similar correlations. There are similar biases from reference and optimized broadband snow albedo due to the offset effects of negative visible albedo bias and positive NIR albedo bias in the reference simulation. The optimized snow albedo shows improvements when considering the S and NSE metrics as well as reduced biases in albedo variability. Similar correlations between optimized and reference simulations suggests there are factors other than BATS parameter uncertainty which drives errors in simulated daily snow albedo variability (e.g., model physics and uncertain atmospheric forcing). Summary statistics calculated from the period (water years 2013-2018) before available observations at the   Irwin site qualitatively agree with the aforementioned results in this paragraph (Table S1 in Supporting Information S1). Hence, parameters optimized at the Irwin site are successfully transferred in space and time to improve BATS simulated albedo performance at the Senator Beck site.
The median broadband albedo decay curve from the optimized simulation shows slightly worse agreement with the median observed broadband snow albedo decay curve, relative to the reference simulation ( Figure 11); where the bias from the optimized curve is −0.03 and the bias from the reference curve is −0.01. However, the bias of the 80% range (10th-90th percentiles) for the optimized broadband albedo curve is 2.7 times smaller than that of the reference simulation, indicating substantial improvements in simulating albedo decay rate variability. For visible snow albedo, the optimized simulation provides a faster and less accurate median decay rate in the beginning of decay cycles (e.g., first 7 days), but results in a more accurate median visible albedo 8-9 days in decay cycles, relative to the reference simulation ( Figure 11a). The bias of the 80% range (10th-90th percentiles) for the optimized visible albedo curve is 1.7 times smaller than that of the reference simulation, indicating substantial improvements in simulating visible albedo decay rate variability. For NIR snow albedo, the optimized simulation provides a higher and more accurate fresh-snow albedo parameterization and faster and more accurate decay rates, resulting in a more accurate representation of NIR snow albedo throughout decay cycles relative to the reference simulation ( Figure 11b). The bias of the 80% range (10th-90th percentiles) for the optimized NIR albedo curve is 4.7 times smaller than that of the reference simulation, indicating substantial improvements in simulating NIR albedo decay rate variability. Thus, the optimized simulation has similar performance as the reference simulation when considering the ability to model the average decay rate, but the optimized simulation has much better agreement with the observed decay rate variability. These results support that transferring optimized BATS parameters in space and time results in non-degraded to improved performance.

Parameter Transferability to East River Site
We also use the optimized parameters from the Irwin site (Table 1) to generate an optimized simulation at the East River site to further test the spatial transferability of the optimized parameters. There are relatively small differences in skill between optimized and reference simulated broadband snow albedo at this site averaged over winter and spring: marginal correlation and bias reductions and increases to NSE and S (Table 4 and Figure 12). This  is likely because of the offset effects of the visible and NIR albedo bias in the reference simulation as seen at the Irwin and Senator Beck sites, which however cannot be verified due to the lack of spectral measurements at the East River site. Additionally, similar to the Senator Beck analysis, similar skill scores between optimized and reference simulations suggests there are factors other than BATS parameter uncertainty which contribute to snow albedo simulation errors (e.g., model physics and uncertain atmospheric forcing). Broadband snow albedo from the optimized simulation has better agreement with observations during late spring when albedo decay is particularly fast ( Figure 12). Specifically, from 10 April 2019 to 20 April 2019, the optimized simulation shows a 37% bias reduction in broadband albedo relative to the reference simulation. Thus, although the overall statistics highlight only marginal differences between reference and optimized albedo performance, these results support that parameter optimization will likely increase the accuracy of modeled late season ablation which depends on accurate representation of snow albedo (e.g., Abolafia-Rosenzweig et al., 2021). Overall, we consider the spatial transferability of optimized parameters to the East River site appropriate and nondegrading.

Discussions
Previous research that has evaluated and discussed BATS snow albedo has consistently reported overestimates (Malik et al., 2014;Molotch & Bales, 2006;Niu et al., 2011;Zhou, 2003). The results presented here highlight those systematic biases in ground snow albedo can be resolved through parameter optimization. Thus, reported BATS biases are largely attributable to parameter uncertainty rather than a misrepresentation of processes in the BATS formulation. A previous effort to calibrate BATS ground snow albedo (Malik et al., 2014) likely failed to resolve systematic biases because only two tunable parameters were considered in that analysis (V age and NIR age ). Importantly, Malik et al. (2014) did not tune and parameters, which we show exert strong controls on the mean state of BATS simulated ground snow albedo. Given the high sensitivity of BATS ground snow albedo to appropriate parameters, we conclude that the BATS scheme can reasonably well simulate snow albedo if appropriate parameter selection is considered.
The primary error source remaining after parameter optimization is that observed fresh-snow albedo is highly variable, whereas BATS simulated fresh-snow albedo is constant. This error source can dominate BATS snow albedo performance in winter months when snowfall is frequent. This error source may only modestly affect the accuracy of simulated snowpack where winter ablation is small relative to spring ablation, but it may drive substantial errors in the simulated surface energy budget. Our observational analysis indicates that fresh-snow albedo is partially sensitive to ↓ and air temperature, and a recent study also found sensitivities of fresh-snow albedo to relative humidity and cloud cover (Wang et al., 2020). Thus, future snow albedo model improvement can greatly benefit from adding a time-varying formulation for fresh-snow albedo into the BATS scheme that reflects combined effects of these conditions. Both Irwin and Senator Beck study sites are known to get relatively heavy dust depositions in spring that impacts visible albedo. Thus, optimized parameters presented herein may be considered regionally relevant; however, these parameters may not be appropriate for locations where dust deposition is lighter. Future snow modeling with Noah-MP can benefit from future studies that evaluate these locally optimized parameters over broader areas (e.g., regional or global) through comparisons with remotely sensed albedo observations (e.g., from MODIS).

Conclusions
The Noah-MP BATS ground snow albedo scheme is highly sensitive to fresh-snow albedo and snow age parameters that are commonly assumed to be the same as those proposed by Yang et al. (1997). Mean daily BATS simulated visible and NIR snow albedo can vary by more than 0.45 based on unique parameter settings. Therefore, parameter calibration of BATS ground snow albedo is key to accurately simulating snow albedo. Indeed, parameter optimization in this study improves agreement between simulated and in situ observed ground snow albedo in visible, NIR and broadband spectrums relative to a reference simulation using default BATS parameters (e.g., improving simulated broadband snow albedo NSE from −2.03 to 0.66). Importantly, optimized BATS albedo parameters result in reduced biases relative to observed fresh-snow albedo and better agreement with observed albedo decay. Our analysis supports that the optimized BATS ground snow albedo parameters are appropriate to transfer in time and space to sites with a similar climate. Namely, BATS ground snow albedo simulations using parameters optimized at the Irwin site from 2019 to 2020 were spatially and temporally transferred to other Rocky Mountain stations (i.e., Senator Beck and East River) for simulations spanning water years 2013-2020. Optimized simulations at Senator Beck and East River sites provide better or non-degraded performance relative to a reference simulation using default parameters, and in most cases optimized simulations have higher accuracy. Differences between optimized snow age parameters from visible and NIR snow albedo simulations indicate that the Noah-MP BATS snow albedo scheme may benefit from an update that allows NIR and visible snow albedo aging to be computed separately. The primary error source remaining after parameter optimization is that observed fresh-snow albedo is highly variable, particularly in the NIR band, whereas BATS fresh-snow albedo is constant. Correlations between observed fresh-snow albedo and meteorological conditions (downward shortwave radiation and temperature) quantified in this study can support future model development that attempts to include a time-varying formulation for fresh-snow albedo that can be linked to meteorological conditions.