Disentangling the potential effects of land‐use and climate change on stream conditions

Abstract Land‐use and climate change are significantly affecting stream ecosystems, yet understanding of their long‐term impacts is hindered by the few studies that have simultaneously investigated their interaction and high variability among future projections. We modeled possible effects of a suite of 2030, 2060, and 2090 land‐use and climate scenarios on the condition of 70,772 small streams in the Chesapeake Bay watershed, United States. The Chesapeake Basin‐wide Index of Biotic Integrity, a benthic macroinvertebrate multimetric index, was used to represent stream condition. Land‐use scenarios included four Special Report on Emissions Scenarios (A1B, A2, B1, and B2) representing a range of potential landscape futures. Future climate scenarios included quartiles of future climate changes from downscaled Coupled Model Intercomparison Project ‐ Phase 5 (CMIP5) and a watershed‐wide uniform scenario (Lynch2016). We employed random forests analysis to model individual and combined effects of land‐use and climate change on stream conditions. Individual scenarios suggest that by 2090, watershed‐wide conditions may exhibit anywhere from large degradations (e.g., scenarios A1B, A2, and the CMIP5 25th percentile) to small degradations (e.g., scenarios B1, B2, and Lynch2016). Combined land‐use and climate change scenarios highlighted their interaction and predicted, by 2090, watershed‐wide degradation in 16.2% (A2 CMIP5 25th percentile) to 1.0% (B2 Lynch2016) of stream kilometers. A goal for the Chesapeake Bay watershed is to restore 10% of stream kilometers over a 2008 baseline; our results suggest meeting and sustaining this goal until 2090 may require improvement in 11.0%–26.2% of stream kilometers, dependent on land‐use and climate scenario. These results highlight inherent variability among scenarios and the resultant uncertainty of predicted conditions, which reinforces the need to incorporate multiple scenarios of both land‐use (e.g., development, agriculture, etc.) and climate change in future studies to encapsulate the range of potential future conditions.

Small streams are particularly susceptible to land-use and climate change (Meyer et al., 1999;Woodward et al., 2010). Land-use change has been shown to modify small stream chemistry, hydrology, geomorphology, and biology (Allan, 2004;Walsh et al., 2005), likely because of the strong coupling of streams to upstream landscapes (Hynes, 1975;Kärnä et al., 2019). Climate change also affects small streams by altering streamflow and temperature regimes (Dhungel, Tarboton, Jin, & Hawkins, 2016;Guse et al., 2015;Woodward et al., 2010). The effects of land-use and climate change on small streams may have a proportionally large impact on global freshwater biodiversity given that small streams make up more than 88% of global freshwater stream length (≤3rd Strahler order streams, Downing et al., 2012) and provide habitat to many freshwater taxa (Meyer et al., 1999).
Benthic macroinvertebrates are a major component of small stream communities providing important functions, including nutrient and energy transfer through food webs (Covich, Palmer, & Crowl, 1999). A single stream can contain hundreds of macroinvertebrate taxa with diverse life histories and tolerances to external stressors (Meyer et al., 2007); as such, they are useful in assessing cumulative stress and are critical bioindicators in many stream monitoring programs (Bonada, Prat, Resh, & Statzner, 2006;Carter, Resh, & Hannaford, 2017). Macroinvertebrates also can be sensitive to the effects of climate change and land-use (Durance & Ormerod, 2007;Kuemmerlen et al., 2015;Mustonen et al., 2018;Nelson et al., 2009;Pyne & Poff, 2017). Incorporating projected future land-use and climate change into assessments of macroinvertebrates and stream condition could not only improve our understanding of their effects but it also may improve regulatory and policy decisions.
Given the complex and unpredictable interactions between landuse and climate projections, they are not precise predictions of future conditions but rather provide a range of possible futures and future uncertainties (Oliver & Morecroft, 2014;Van Vuuren & Carter, 2014). Therefore, studies on future biological conditions need to incorporate a multi-scenario approach that captures the range of possible future conditions. Here, our objectives were to (a) predict stream biological conditions as defined by use of the Chesapeake Basin-wide Index of Biotic Integrity (Chessie BIBI; Smith, Buchanan, & Nagel, 2017) under a suite of land-use and climate projections for the Chesapeake Bay watershed, and (b) use these predictions to determine the percent of stream length improvement needed under different scenarios to improve health and function of 10% of stream miles over a 2008 baseline developed using the Chessie BIBI (Chesapeake Bay Program, 2017b).
The Chesapeake Bay watershed lies in the northeast United States and by some predictions will experience a 2.0°C air temperature increase by 2035, the greatest warming in the contiguous United States and a level that is two decades ahead of global average values (Dupigny-Giroux et al., 2018). Streams in this region may therefore experience the effects of climate change earlier than in other regions. Human population is also expected to increase by 2 million in the watershed, from 18 to 20 million, by 2030 (Chesapeake Bay Program, 2017a), which may result in drastic changes to landuse patterns in coming decades. Together, the predicted early onset of climatic and population changes in this region makes it an excellent test case of how land-use and climate change may affect global freshwater biological conditions.

| The Chesapeake Basin-wide Index of Biotic Integrity (Chessie BIBI)
The Chessie BIBI provides a standardized measure of stream condition for the watershed (Smith et al., 2017). It was developed using stream benthic macroinvertebrate raw counts from 21,343 sampling events collected by 28 state, county, regional, federal, and other monitoring programs from 1992 to 2015. The analysis focused on first-to fourth-order streams and excluded data collected in December, January, and February because of limited surveying. Smith et al. (2017) standardized the dataset's macroinvertebrate taxonomy (e.g., excluded taxa enumerated by just one or two sampling programs) and rarefied sample counts to approximately 100 individuals per sample event prior to calculating richness and diversity metrics. Smith et al. (2017) evaluated over 200 commonly used metrics (Barbour, Gerritsen, Snyder, & Stribling, 1999), and identified those most sensitive to degradation in stream habitat and water quality conditions in the watershed's 12 geographically distinct bioregions. Each sample's metrics were scored on a continuous gradient (0-100) and averaged to produce the final index score. High scores indicate macroinvertebrate communities like those in high quality, relatively undisturbed streams (i.e., reference). Final index scores were then assigned a categorical rating of Very Poor, Poor, Fair, Good, or Excellent based on thresholds derived from the 10th, 25th, and 50th percentiles of BIBI scores in reference samples. See Smith et al., 2017 for a detailed description of the Chessie BIBI. We used the family-level, bioregion-scale Chessie BIBI Smith et al., 2017). We chose this version because it (a) optimized the trade-off between taxonomic resolution and sample size, (b) adjusted for natural effects of geology, topography, and biogeography (i.e., bioregion), and (c) selected by stakeholders as their preferred stream health indicator . The Chessie BIBI was built 100 times using a bootstrap approach to account for slight differences in the taxa selected during the probabilistic rarefaction step, because it randomly selects which rare taxa are included in the BIBI calculation. The median result of those 100 runs was then used in modeling. In total, 21,266 stream sampling events were available with the family-level, bioregion-scale Chessie BIBI, of which 17,385 were from our baseline period of January 1, 2000 to December 31, 2011. This baseline was chosen to capture two complete surveys of the watershed because each survey program takes 6 years to cover the watershed . We spatially linked each sample to an NHDPlusV2 catchment. We removed 14 events due to incomplete data and 1,458 sites with upstream drainages ≥200 km 2 , the threshold for our definition of small streams, dropping the sample size to 15,913. For NHDPlusV2 catchments with more than one Chessie BIBI score, we randomly selected one sample

| Physical environmental dataset
We used data from StreamCat (Hill, Weber, Leibowitz, Olsen, & Thornbrugh, 2016), which contains 517 metrics representing both natural and anthropogenic landscape information summarized using NHDPlusV2 at local, cumulative upstream and riparian scales. We selected 15 uncorrelated (r < 0.70) StreamCat metrics that were identified in literature as important surrogates of instream drivers of stream condition as they relate to habitat for benthic macroinvertebrates, including: • Upstream cumulative watershed area because it is strongly related to many stream variables, such as discharge, energy process, and biological communities (Vannote, Minshall, Cummins, Sedell, & Cushing, 1980); • Elevation since it was important in previous modeling efforts of stream condition in the study area and because it correlates with slope and instream temperature ; • Seven soil predictors (mean season water depth (cm), mean organic matter content (% by weight), mean permeability (cm/hr), mean depth (cm) to bedrock, mean percent clay content, mean percent sand content, and mean soil erodibility (Kf) factor) because of the importance of soils and resultant drivers (e.g., sediment) on stream macroinvertebrates (Waters, 1995); • Three measures of geochemical content in surface or near surface geology-mean percent of lithological calcium oxide (CaO) because of its high correlation with many stream chemistry variables, mean lithological hydraulic conductivity (micrometers per second) because of its influence on rock/waters interaction, and mean lithological uniaxial compressive strength as a measure of susceptibility to weathering (megaPascals; Olson & Hawkins, 2012) all of which affect local habitat for macroinvertebrates; • Summaries of mean runoff (mm) and baseflow index because of the importance of hydrology to streams (Poff et al., 1997); and • Mean composite topographic index (topographic wetness index), which relates upslope area to local slope and is used to quantify topographic control on hydrological processes and estimate water accumulation-that is, valley bottoms have a high index whereas ridge and crests have a low index (Beven & Kirkby, 1979;Sörensen, Zinko, & Seibert, 2006). We hypothesized streams near or surrounded by ridges and crests would have less anthropogenic stress due to less accessibility or suitability, thus improving stream condition.
We used cumulative upstream watershed values for these predictors because local stream conditions are influenced by upstream catchment conditions (Scott, Helfman, McTammany, Benfield, & Bolstad, 2002) and the coarse resolution of many predictor layers can lead to inaccurate estimations when used at a local reach scale.

| Land-use
We assessed baseline land-use and land cover (
Departures of temperature and precipitation were expressed as mean seasonal departures for 19 year future periods centered on 2030, 2060, and 2090. For temperature and precipitation, the 25th, 50th, and 75th percentiles of projected deviations of temperature (°C) and precipitation (mm) were computed for each HRU separately for each season and period (e.g., Figures S7-S12) and then rasterized at 30 m resolution to coincide with the resampled PRISM and other covariate data. We added these deviations to historical PRISM climate estimates (1980)(1981)(1982)(1983)(1984)(1985)(1986)(1987)(1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999) to produce future projections of average seasonal temperature and total seasonal precipitation across the Chesapeake Bay watershed (n = 36 average temperature rasters and 36 total precipitation rasters). For a more detailed description of the CMIP5 processing, see Appendix S1. Lynch, Seth, and Thibeault (2016) used the CMIP5 scenarios to develop projected seasonal changes in total precipitation and average temperature in the Northeast United States.
We used these uniform, generalized projections to assess whether responses among stream conditions were influenced by the localized effects of downscaling by adding the uniform projected seasonal changes in total precipitation and average temperature to all historical values (Lynch et al., 2016, see Table S2).

| Model development, validation, and interpretation
We used random forests analysis (Breiman, 2001) (Cutler et al., 2007). In total, 36 predictors were used (Table S1), including cumulative upstream watershed values for 15 natural predictors from StreamCat (Hill et al., 2016), 12 LULC categories representing baseline 2005 landscape conditions, seasonal average air temperature and total precipitation from PRISM, and dominant bioregion upstream of each stream reach given its importance in previous modeling efforts . The model was developed using the randomForest R package (Liaw & Wierner, 2002) with 1,000 trees and 19 variables randomly sampled at each split, which was optimal during tuning.
We divided the Chessie BIBI data into training (75%, n = 2,775) and test (25%, n = 925) datasets. Predictive performance was evaluated using model mean of squared residuals and percentage of explained variation in the training data. Since an important stakeholder goal is determining stream length in either Poor, Fair, or Good condition, we also tested model performance by how well it correctly classified these groups (i.e., percent correctly classified, PCC) and calculated the Kappa statistic (values <0.00 indicate poor agreement, 0.00-0.20 slight agreement, 0.21-0.40 fair agreement, 0.41-0.60 moderate agreement, 0.61-0.8 substantial agreement, and 0.81-1.00 almost perfect agreement; Landis & Koch, 1977). The test dataset was used to validate the model by calculating PCC and Kappa for the three-category classification.
Several covariates were correlated in the training dataset (r > .70; Table S3); however, to avoid potentially missing an important climate covariate during prediction, all variables were retained regardless of correlation because random forest models are robust to multicollinearity. That said, correlated variables may affect variable importance plots and partial dependence plots (Molnar, 2019).
Furthermore, land-use and climate factors affect stream conditions through complex and multivariate interactions, which inhibits drawing inferences on functional relationships using partial dependence plots (Friedman, 2001). We used these plots only to interpret predictor strength and to evaluate relationship forms between predictor variables and raw stream condition scores and caution against overinterpreting these figures.

| Model application and prediction
The Chesapeake Bay watershed has 83,637 catchments represented in the NHDplusV2; of these 10,666 were larger than 200 km 2 and 2,199 had missing data and were not used here. We first used the model to predict scores and rating categories for the remain- To align with management goals, we linked predicted scores to the NHDPlusV2 Flowline dataset and calculated total stream kilometers and percentages predicted as Poor, Fair, and Good. We also calculated total kilometers and percentages that were predicted in improved or degraded condition categories for each scenario relative to baseline conditions. By using three categories to classify stream condition, there are three possibilities for a stream to be predicted to a degraded (Good to Fair, Good to Poor, and Fair to Poor) or improved (Poor to Fair, Poor to Good, and Fair to Good) condition. We calculated the total increase or decrease of stream kilometers and net change as the difference between these two values. An additive effect was calculated by summing net changes from associated land-use and climate only scenarios and an interaction outcome was determined as the difference in magnitude between this effect and the net change for each combined scenario.
All analyses and figure generation were performed in the R Statistical Software Package, version 3.6.0 (R Development Core Team, 2019).

| Model validation and interpretation
The baseline random forests model had a mean of squared re- The top two most important predictors in our baseline model were topographic wetness index and developed land cover ( Figure S13) with predicted Chessie BIBI raw score showing a rapid decrease between a 700 and 800 topographic wetness index and between 0% and 20% developed land ( Figure S14). Spring precipitation was within the top 10 most important variables; other climate variables were 13th or less most important ( Figure S13). Predicted Chessie BIBI scores for all seasonal precipitation variables showed an initial increase and then the response either flattened (spring) or decreased (summer, winter, and fall, Figure S15). Predicted Chessie BIBI scores showed an initial flat response followed by a decrease for summer and spring temperatures, and initial increase response followed by no clear response to fall and winter temperatures ( Figure S16).

| Model predictions
Mean and median values of model covariates associated with the Chessie BIBI training dataset samples (n = 2,775) aligned closely to those in the NHDplusV2 dataset (n = 70,772); however, maximum values were larger for many covariates in the NHDPlusV2 dataset, especially for some land covers (e.g., mechanically disturbed land, croplands, and wetlands,

| Combined land-use and climate scenario predictions
All combined land-use and climate scenarios predicted a decrease in stream kilometers in Good condition; however, this decrease dampened with increasing percentile and the Lynch2016 scenarios ( Figure 3; Table S4). The A2 climate scenarios generally pre-

| Climate only predictions
By 2090, CMIP5 p25 predicted the largest amount of stream kilometers in a degraded condition, followed by CMIP5 p50 and CMIP5 p75, and then Lynch2016 (

| Combined land-use and climate predictions
For all climate scenarios, by 2090, A2 combined scenarios predicted the most streams in degraded conditions followed by A1B, B1, and then B2; an opposite pattern was predicted for streams in improved conditions (B2 > B1 > A1B > A2, Table 3).
F I G U R E 2 Predicted percentage of stream kilometers in Poor or Good condition under baseline and 2030, 2060, and 2090 land-use projections (a) and climate projections (b). Fair condition not plotted for clarity, but results are in Table S4.
Land-use projections are described in  (Table 3). These additive effects were all larger than analogous net changes, indicating combining climate and land-use predicted a less degraded condition across the watershed (Table 3; Table S5).

Interaction outcome
Land-use only projections A1B    Table 1. Values for stream lengths are in Table S5.

| Spatial patterns in stream condition changes-2090
Land-use only scenarios showed a wide range in how predicted changes in stream condition spatially organized across the watershed. Scenario A2 predicted widespread degradation in stream conditions, whereas B2 predicted widespread improvements in stream conditions by 2090; A1B and B1 scenarios predicted spatial patterns in between A2 and B2 ( Figure S17). All climate only scenarios predicted improved stream conditions for northern and mideastern portions of the watershed; the number of improved streams increased with CMIP5 quartiles and was highest for the Lynch2016 scenario ( Figure S18). All climate only scenarios predicted degraded million/year, 1.1%). Global climate models also are highly variable and to reduce the need to independently test many global climate models we used quartiles of projections from the CMIP5 program.
Weighting all scenarios equally, by 2090 our results suggest that changes to watershed-wide stream conditions could range from a 16.2% (A2 CMIP5 p25) to 1.0% (B2 Lynch2016) degradation of stream kilometers.

| Model interpretation and prediction
Model accuracy is an important component of any prediction study.
Our accuracy diagnostics are similar to prediction-based ordinal stream condition studies performed previously within the study region (e.g., Maloney, Weller, Russell, & Hothorn, 2009) but slightly less than those from binomial stream condition studies both inside and outside the region (Hill et al., 2017;Maloney, Smith, et al., 2018). Overall, this suggests our model is suitable for use in both prediction of current conditions of unsampled areas and under future land-use and climate conditions. However, another issue is temporal consistency of prediction-based models. We are unaware of a study that has addressed how accurately models predict

F I G U R E 5
Maps showing NHDPlusV2 catchments with predicted change in stream conditions in 2090 under CMIP5 p50 climate projections and A1B (a), A2 (b), B1 (c), and B2 (d) land-use projections. Coding convention: "None" = a stream reach was predicted in the same condition in baseline and 2090 conditions (e.g., Poor to Poor), "Degraded" = a stream reach was predicted at a lower category in 2090 than baseline period (e.g., Good to Fair), and "Improved" = a stream reach was predicted at a higher category in 2090 than baseline period (e.g., Fair to Good). Similar maps for 2030 under CMIP5 p50 climate projections and 2060 under CMIP5 p50 climate projections are in Figures S19 and S20. Stream kilometers and percentages by predicted stream condition change are in Table S5 and Table 3 a response in different time periods in a single region, likely a result of limited long-term, large-scale data.  Table S6).
Although only two periods, these results suggest robustness in temporal accuracy; however, longer term data are needed to more rigorously test temporal consistency.
Our prediction of 66.1% of small stream reaches in the Chesapeake Bay watershed currently in Fair or Good condition is slightly higher than the 64.0% predicted in Maloney, Smith, et al. (2018). Close agreement was surprising given the latter used a finer base layer Land-use and climate covariates were important drivers of stream condition. The importance of developed cover supports previous studies (Maloney, Schmid, & Weller, 2012;Roy, Rosemond, Leigh, Paul, & Wallace, 2003;Walsh et al., 2005) as does the importance of agriculture cover (Hay_Pasture and Cropland; Allan, Erickson, & Fay, 1997) and forest cover . The high importance of upstream developed land as a driver was expected given it alters streams through modified flow and thermal regimes as well as increases in sediments, toxic contaminants, and other stressors (Walsh et al., 2005)-all of which negatively affect benthic macroinvertebrates. Positive trends between precipitation and Chessie BIBI scores suggest that stream conditions are higher in more humid areas. However, trends between temperature and Chessie BIBI scores suggest there may be an upper limit to increased temperature on stream condition. Many streams in the Chesapeake are cold water, which could be more sensitive than warmer systems to increasing temperature (Heino, Virkkala, & Toivonen, 2009). Together, the strong effects of landuse and climate underscore the potential sensitivity of streams to changes in these features.
The importance of topographic wetness index and the rapid decrease in predicted Chessie BIBI suggests valleys and areas with lower upstream catchment slopes had a lower Chessie BIBI score, and thus more degraded conditions. Larger topographic wetness index values, indicative of valley bottoms, were in the southeastern portion of the study area ( Figure S21), largely the Southeastern Plains and Middle Atlantic Coastal Plains bioregions (Figure 1). In a contiguous US study, Hill et al. (2017) reported topographic wetness index was an important driver of stream conditions in the Coastal Plains region, which aligned with these two Chesapeake Bay watershed plains bioregions. This area also had disproportionate amounts of developed and agricultural lands ( Figure S21); thus, there could be some interaction between topographic wetness index and anthropogenic stress occurring (i.e., these areas are more supportive of intensive agriculture and urban or suburban development).

| Land-use only
Results from land-use only predictions suggest net changes in stream condition ranged from degradation in 11.5% to 0.2% of stream kilometers. A net increase in streams in degraded condition from 2030 to 2090 was predicted for both economic development scenarios (A1B and A2), and although at much smaller changes, also for the more environmentally focused B1 and B2 scenarios. Developed land more than doubled for A1B and A2 by 2090 (Table S7), which likely drove the degradation in stream conditions given its negative functional response with Chessie BIBI ( Figure S14). B1 and B2 were projected to have a smaller increase in developed land than A1B or A2, which is a likely reason behind the relatively small increases in streams in degraded condition. An important concern for predicting future conditions is that distributions of data in the predictive model cover the range in projected years. Our data for developed land in the training (Table S1) and baseline scenarios ( Figure S22) overlapped all future scenario distributions, so the model was suitable to predict across the range of developed land in all scenarios.

| Climate only
Except for the Lynch2016 2030 scenario, all climate only scenarios projected a net degradation in stream conditions for the Chesapeake Bay watershed; however, the magnitude of degradation decreased with increasing percentiles of the downscaled climate change models (CMIP5 scenarios) and was smallest for the watershed-wide uniform scenarios (Lynch2016). Thus, the scenario analyses suggest increasing precipitation may mollify the effects of increasing temperature in the future. The distribution of total precipitation in the training dataset overlapped those for baseline 2005 data, but baseline data were slightly higher than historical and all projections (Table S1, Figure S23). The distribution of future temperature projections was shifted above the upper limits of the baseline data for all projections, especially by 2090 (Table S1, Figure S24). Thus, due to potentially novel climate conditions in the future, our model may not effectively predict the effects of lower precipitation projections and higher temperature projections especially for the later year (2090) and higher quartile (p75) scenarios.
We observed an effect of downscaling climate models on stream condition predictions with the regionally uniform climate scenario (Lynch et al., 2016) showing a smaller net proportion of degraded stream kilometers than downscaled CMIP5 scenarios. Lynch2016 projected slightly higher precipitation levels than the CMIP5 scenarios but projected changes in temperature within the range of the CMIP5s (Figures S7-S12). Therefore, it is likely that local variation afforded by downscaled CMIP5 scenarios was behind these differences and highlights an importance of including downscaled data when appropriate. predicted to an improved condition in out years for these scenarios.

| Spatial patterns
Displayed graphically, predictions from combined land-use and climate scenarios identify geographic areas where climate change may mediate effects of land-use in the Chesapeake Bay watershed (e.g., north and middle eastern part of basin) and other areas where such effects of climate may magnify land-use effects (e.g., high development scenario A2 in 2090, Figure 5). A spatial understanding of such interactive effects could be used by researchers, managers, and policy makers to minimize risk or take advantage of opportunities afforded by these interactions (Oliver & Moorecraft, 2014). For example, areas where climate has a larger mediating effect could be prioritized for restoration because climate may help amplify restoration effects, whereas, areas predicted to experience a magnified effect of climate change may warrant strong intervention to stymie possible future loss and to realize restoration goals.

| Limitations of models
We acknowledge we were limited on available land-use and climate covariates, which may have affected the predictive baseline model and thus future predictions. Both were at coarse resolutions (landuse = 250 m, climate = 800 m), which may not encapsulate changes in land cover and climate variables that could be seen at finer scales.
To facilitate summation of these factors with the NHDPlusV2 dataset, we also resampled each to 30 m. Doing so enabled better summation for smaller catchments (e.g., enabled raster centroids to be placed within catchments); however, we acknowledge such a procedure generated finer scaled rasters but not finer scaled representations of land cover.
We also acknowledge limitations in defining seasons by months defined by water years and there are other aggregation options (e.g., meteorological seasons). Any such aggregation is an oversimplification of seasons in this area, which are likely not uniform but rather transitional based on spatial (e.g., latitude) and landscape (e.g., elevation) factors. Defining seasons that incorporate these factors would likely improve seasonal inference but is beyond the scope of the current study.
We also limited our study to using four SRES land-use projections (A1B, A2, B1, and B2) because these were available for our study area. Representative Concentration Pathways (RCP) 2.6, 4.5, 6.0, and 8.5 and shared socioeconomic pathways (SSP) projections exist but are currently not available for the Chesapeake Bay watershed at similar spatial and thematic resolution. results should be robust to these more recent scenarios. We also note that land-use projections are based on existing information at the time of model development, and thus do not contain technological advances that lead to unexpected land-use changes, such as the development of shale oil and gas development that began in the northern portion of the watershed in 2010 .
Optimally long-term biological, land-use, and climate datasets would be available to more definitively tease out effects of land-use and climate changes on biological process (Northrup et al., 2019) and distinguish them from other factors that change biodiversity and abundance (e.g., diseases, introductions, extirpations, and genetic drift). In the absence of such data, models are developed under a baseline scenario where biological data are sufficient, and then, this model is used to predict biological conditions under projected future conditions. Doing so we assume contemporary relationships will hold in the future; however, we acknowledge this may not be the case, particularly when novel climate and land-use combinations may exist in the future without current correlates.
For our climate change predictions of stream condition, we used surrogates for instream biologically relevant measures of water temperature and streamflow, which are an oversimplification of the ways in which changing climatic conditions will affect stream thermal and flow regimes (Knouft & Ficklin, 2017;Morrill, Bales, & Conklin, 2005). Using more biologically relevant metrics would likely improve model strength and our mechanistic understanding of climate effects on stream condition (Kuemmerlen et al., 2015;Merriam, Petty, & Clingerman, 2019;Pyne & Poff, 2017). Unfortunately, such models for headwaters streams are not available for the entire watershed.

| Management implications
Managers are faced with protecting, conserving, and restoring biological populations and associated ecosystems under continually changing land-use, water demands, and climate conditions. to assuage effects of possible future land-use and climate changes and sustain the 10% goal; a range that has dramatically different management implications. For example, at a median cost to restore a stream kilometer in the Chesapeake Bay watershed of $10,500 (Hassett et al., 2005), based on our study, the total cost to reach the 10% goal would range between 82.7 and 197.0 million US dollars. Where to focus restoration efforts also changes among scenarios ( Figure 5; Figures S17-S20). Under some scenarios (e.g., B2 CMIP5 p50), a more spatially targeted (central and southern portion) approach may be fruitful, whereas other scenarios (e.g., A2 CMIP5 p50) highlight a watershed-wide need for intervention to assure more streams are in improved versus degraded conditions.
Globally, our results can be used as an early case study for other regions given the expected relatively early projected temperature changes and population growth in the watershed. As mentioned above, the Chesapeake Bay watershed is in a geographic region expected to experience increased temperatures decades before global averages and that may experience population increases similar to a worst-case global scenario. Furthermore, the Chesapeake Bay watershed has a restoration goal set by the Chesapeake Bay Program and our results highlight how various possible future scenarios may affect not only attaining such restoration goals but also sustaining them. Other regions will be faced with similar uncertainties in future land-use and climate scenarios, and therefore should acknowledge these uncertainties when attempting to predict future conditions.