Hydrologic and water quality responses to biomass production in the Tennessee river basin

Reducing dependence on fossil‐based energy has raised interest in biofuels as a potential energy source, but concerns have been raised about potential implications for water quality. These effects may vary regionally depending on the biomass feedstocks and changes in land management. Here, we focused on the Tennessee River Basin (TRB), USA. According to the recent 2016 Billion‐Ton Report (BT16) by the US Department of Energy, under two future scenarios (base‐case and high‐yield), three perennial feedstocks show high potential for growing profitably in the TRB: switchgrass (Panicum virgatum), miscanthus (Miscanthus × giganteus), and willow (Salix spp.). We used the Soil & Water Assessment Tool (SWAT) to compare hydrology and water quality for a current landscape with those simulated for two future BT16 landscapes. We combined publicly available temporal and geospatial datasets with local land and water management information to realistically represent physical characteristics of the watershed. We developed a new autocalibration tool (SWATopt) to calibrate and evaluate SWAT in the TRB with reservoir operations, including comparison against synthetic and intermediate response variables derived from gage measurements. Our spatiotemporal evaluation enables to more realistically simulate the current situation, which gives us more confidence to project the effects of land‐use changes on water quality. Under both future BT16 scenarios, simulated nitrate and total nitrogen loadings and concentrations were greatly reduced relative to the current landscape, whereas runoff, sediment, and phosphorus showed only small changes. Difference between simulated water results for the two future scenarios was small. The influence of biomass production on water quantity and quality depended on the crop, area planted, and management practices, as well as on site‐specific characteristics. These results offer hope that bioenergy production in the TRB could help to protect the region's rivers from nitrogen pollution by providing a market for perennial crops with low nutrient input requirements.


| INTRODUCTION
The Energy Independence and Security Act (EISA) of 2007 set a target for US production of over 36 billion gallons (1 gallon = 3.785 L) of renewable fuels annually by 2022 (US Congress, 2007). Because agricultural development has historically been associated with impacts on water quality (Dodds & Oakes, 2008), converting the lands needed to meet EISA targets heightened concerns for the nation's rivers and lakes, as well as for downstream estuaries.
The health of rivers in the Tennessee River Basin (TRB) is of particular interest because they support one of the most biologically diverse river fauna in North America (Haag & Williams, 2014;Keck et al., 2014). The basin has retained a large proportion of forested land, in part because significant amounts are protected as National Forest or Parks. The largest historical impact to aquatic ecosystems of the region is the building of dams. Other threats to aquatic biota include water quality degradation associated with higher nutrient and sediment loadings from nonforested, human-influenced watersheds (Scott, Helfman, McTammany, Benfield, & Bolstad, 2002). Therefore, it is important to understanding the potential effects of converting lands from other purposes to grow bioenergy crops in this region.
Evaluating changes in water quality associated with large-scale regional shifts in land-use and management require process-based modeling of hydrology and nutrient dynamics (Wellen, Kamran-Disfani, & Arhonditsis, 2015). Models, such as the Soil & Water Assessment Tool (SWAT) (Arnold & Fohrer, 2005;Srinivasan, Arnold, & Jones, 1998), incorporate current understanding of linkages between watershed properties and water quality responses. SWAT model has been widely used to evaluate hydrologic and water quality responses to land-use change including the conversion of land to growing perennial bioenergy crops, with changes that depend on differences in management between the new bioenergy crop and the previous land use. Simulation of a river basin projected to grow perennial feedstocks, the Arkansas-White-Red River Basin, found decreases in water yield, nutrient (nitrogen and phosphorus) loadings, and concentrations when pasture and agricultural crops were replaced by switchgrass (Jager et al., 2015). In that study, sediment concentration increased due to decreased water yield, but sediment loadings decreased. A reduction in nitrate load was also reported in the Salt Creek watershed in Illinois when conventional crops were converted to miscanthus, with the magnitude of change depending on the proportion of land-use change and the associated change in application of N fertilization (Ng, Eheart, Cai, & Miguez, 2010). Another study focused on four watersheds in Michigan determined that the expansion of perennial grasses (switchgrass and miscanthus) would reduce sediment and total phosphorus (P) loads in most areas, but also led to a short-term increase N and sediment loadings from marginal lands with high pre-existing levels of soil nitrogen (Love & Nejadhashemi, 2011). A few studies have examined the effects of cultivations of short-rotation coppices (SRCs, e.g., willow) on hydrology and water quality. Hartwich et al. (2016) found that the actual evapotranspiration of SRCs was 16% higher than that of annual crops in the North German Plain, though exceptions existed. Ssegane and Negri (2016) predicted similar reductions (33%) in annual sediment export but lower decreases (18% vs. 26%) in NO À 3 export when comparing willow to switchgrass cultivated on marginal lands in the Indian Creek watershed, Illinois. These studies imply that changes in water quality depend on the land-cover types before and after the conversions, management practices, metrics (load or concentration), and soil properties.
In the TRB, no comprehensive evaluation has previously been performed to understand how growing bioenergy crops across the basin would influence hydrology and water quality. In one study, SWAT modeling of TRB was performed as part of an intensive simulation for a large region (Upper Mississippi River and Ohio-Tennessee River) based on 12-digit hydrologic units (Panagopoulos et al., 2015). This study demonstrated the capability to implement SWAT modeling at a higher spatial resolution to better represent hydrologic units and management practices. However, the hydrologic and water quality results for the TRB were only investigated at the basin outlet owing to its small area compared with other regions. In addition, bioenergy crops were not specifically considered and reservoir operations were turned off in that study due to high uncertainties in their simulations (Panagopoulos et al., 2015). A subsequent study in the same region (Upper Mississippi River and Ohio-Tennessee River) showed that the cultivation of switchgrass and miscanthus could lead to significant reductions in sediment and nutrients by up to 60% (Panagopoulos, Gassman, Kling, Cibin, & Chaubey, 2017). Though a previous study examined the water quality effects of switchgrass cultivation in one subwatershed of TRB (Parish et al., 2012), projected changes for the entire basin have not been comprehensively evaluated. Furthermore, a recent national assessment of potential supply of biomass in the United States, the 2016 Billion-Ton report (BT16), highlighted opportunities for growing energy crops across the basin, including switchgrass, miscanthus, and willow (US DOE, 2016a). The economic scenarios evaluated by the BT16 study provided a basis for modeling changes to water quality associated with well-described assumptions.
In this study, we ask how dedicated perennial bioenergy crops (switchgrass, miscanthus, and willow) might change hydrology and water quality in the TRB. Specifically, we evaluate water responses to future land-use change associated with biomass production projected by a base-case and a high-yield scenario in the BT16 (US DOE, 2016a). However, such model simulations of future scenarios are convincing only if the model has been well parameterized and evaluated (Wang & Chen, 2012). Evaluation of multiple responses over time and space is strongly encouraged (Cao, Bowden, Davie, & Fenemor, 2006;Wellen et al., 2015). However, in large river basins, data may not be available in the right times or places to correspond with model outputs (Hoos & McMahon, 2009). This study presents solutions for situations when there is limited spatial and long-term water quality data available and for representing reservoirs in a highly regulated watershed.

| The Tennessee River Basin
The Tennessee River Basin, a tributary basin of the Mississippi River Basin, is located in the southeastern part of the United States (USGS, 2016) ( Figure 1). There are significant physiographic differences in the eastern and western portions of the basin (Price & Leigh, 2006). Forest cover is the dominant natural vegetation in the basin. Since the 1930s, the TRB has been impounded by a network of dams (reservoirs), most of which are managed by the Tennessee Valley Authority (TVA). Mainstem Tennessee River dams are operated in "run-of-river" mode to support river navigation and generate hydroelectric power. Dams on the tributaries function as storage impoundments and are used primarily for flood control during winter and spring (TVA, 2004). Kentucky Dam is 35 km (22 mi) upstream from Paducah, Kentucky, where the Tennessee River flows northwest into the Ohio River.

| SWAT implementation: Watershed delineation and definition of Hydrologic Response Units (HRUs)
We used the SWAT (Version 2012/Revision 627) model (Arnold et al., 2012) to evaluate the responses of hydrology and water quality to land-use changes in the TRB as illustrated by Figure 2. We delineated watersheds in the TRB using ArcSWAT (Winchell, Srinivasan, Di Luzio, & Arnold, 2013) based on 30-m resolution digital elevation data and USGS-defined HUC8 (Figure 1) draining to main stream gages and reservoirs from the National Hydrography Dataset Plus (NHD+) (McKay et al., 2012).
The SWAT model has a reservoir module that can represent water bodies in a watershed (Wang & Xia, 2010). Twenty-two (22) reservoirs in the TRB were included in the SWAT setup ( Figure 1). We selected reservoirs based on the following guidelines: (a) The reservoir can be placed on a river reach generated by ArcSWAT's subbasin delineation (Winchell et al., 2013); (b) the reservoir is managed by the TVA; and (c) the reservoir is large, generally with storage capacity greater than 100,000 ac-ft (123 km 3 ). However, we included two smaller reservoirs (Fort Patrick Henry and Ocoee No. 1) located near HUC8 outlets (HUC8-06010102 and HUC8-006020003). We collected required reservoir variables (see Supporting Information Table S1) as inputs to the SWAT model. Daily measurements of reservoir outflow rates from 1985 to 2013 were used in this study to represent the reservoir operation (Arnold et al., 2012).
We delineated the watershed into subbasins, each of which includes multiple hydrological response units (HRUs). Each HRU represents a unique combination of soil type, slope, and land use or land cover (Neitsch, Arnold, Kiniry, & Williams, 2011). STATSGO soil map (scale: 1:250,000) units (Soil Survey Staff, 1994) that comprised more than 10% of a subbasin were retained. We discretized slope into four categories: <1%, 1-2%, 2-5%, and >5%. This analysis is part of a larger effort to model the entire Mississippi River Basin to understand influences on hypoxia in the Gulf of Mexico. Standardized slope classes were chosen to allow comparison of alternative tile drainage scenarios at <1% and <2% slope. In addition, we required all HRUs with slope >5% to be included in the analysis because these steep areas are subject to erosion (Jager et al., 2015). We used the 2009 Cropland Data Layer (CDL-2009, spatial resolution: 63 × 63 m) (USDA-NASS, 2014) to represent the current landscape (Jager et al., 2015). Natural vegetation in current TRB is dominated by forest (59.4%) and grassland/pasture (11.7%). The major crops in current TRB are hay (nonalfalfa, 9.7%), soybeans (1.7%), and corn (1.5%). We retained land-use classes that comprised more than 2% area of a subbasin. This protocol created a total of 4,026 distinct HRUs in 55 subbasins under the current landscape.

| Meteorological forcings
We downloaded synthetic meteorological data from DAY-MET (Thornton, Running, & White, 1997) for the center of each HUC8 (Figure 1) over the period 1980-2014 (35 years). The original DAYMET data have a 1 × 1 km spatial resolution. Daily meteorological variables include total precipitation (mm), maximum and minimum temperatures (°C), and solar radiation (MJ m −2 day −1 ). Two additional variables (wind speed and relative humidity) were estimated by the SWAT model's climate generator (Gassman, Reyes, Green, & Arnold, 2007) based on the monthly weather database "WGEN_US_COOP_1980_2010." The mean annual precipitation (MAP) for HUC8 units ranged from 1,129 to 1715 mm with an average of 1,433 mm during 1980-2014.

| SWATopt model calibration & evaluation
Because flow data in this highly regulated system were difficult to compare against, we identified synthetic response variables (estimated from gaged measurements). However, available SWAT calibration tools [e.g., SWAT-CUP (Abbaspour, 2015), the autocalibration tool (Van Griensven, 2005), and the R-SWAT-FME framework (Wu & Liu, 2014)] do not provide the option of calibrating against userchosen intermediate response variables, that is, runoff on each HUC8 consisting of multiple subbasins and the total fluxes of nitrate and nitrite (NO À 3 +NO À 2 ). We therefore developed an autocalibration tool (SWATopt) by incorporating the SCE algorithm (Duan, Sorooshian, & Gupta, 1992;Wang, Xia, & Chen, 2009) into the source code of SWAT2012 (Revision 627) model (see Supporting Information Table S2 for different types of SWAT calibration). The SWAT calibration tool (SWATopt) developed in this study can be accessed upon request via GitHub (https://github.c om/wanggangsheng/SWATopt.git). SCE is a stochastic F I G U R E 1 Subbasins and reservoirs of the Tennessee River Basin (TRB) in the Soil & Water Assessment Tool (SWAT). The main-stem Tennessee River runs from east to west optimization algorithm that has been widely used in calibration of hydrological models including SWAT (Wang & Xia, 2010;Zhang, Srinivasan, Zhao, & Liew, 2009).
Using SWATopt, we estimated 39 parameters (Supporting Information Table S3) governing the hydrologic and water quality processes in SWAT. These parameters were selected based on the sensitivity analyses in previous studies (Abbaspour et al., 2007;Baskaran et al., 2010;Bekele & Nicklow, 2007;Santhi et al., 2001;Wang, Barber, Chen, & Wu, 2014;Wu & Liu, 2012). Generally, these parameters were calibrated step by step. The hydrologic parameters (No. 1-14 in Supporting Information Table S3) were first estimated using hydrologic response variables (i.e., hydrologic variables, e.g., streamflow or runoff). The second step was to estimate the water quality parameters (No. 15-39 in Supporting Information Table S3) using water quality measurements (e.g., sediment, nitrogen, and/or phosphorus).
Criteria used to assess model performance include the Nash-Sutcliffe Efficiency (NSE, i.e., coefficient of determination) and the percent bias (PBIAS; Moriasi et al., 2007). NSE quantifies the proportion of the variance in the response variables that is predictable from the independent variables. PBIAS denotes the relative deviation of predicted mean value from observed mean value. According to Moriasi et al. (2007), model simulation is satisfactory if NSE > 0.5 and if |PBIAS| ≤ 25% for streamflow (runoff), | PBIAS| ≤ 55% for sediment, and |PBIAS| ≤ 70% for nitrogen (N) and phosphorus (P).

| Evaluation of hydrology
One general issue with using streamflow (discharge) data for model evaluation is the influence of reservoir operation on downstream gaged data. Gage locations are often chosen to evaluate compliance with minimum flow regulations below dams, and streamflow is largely a measure of the outflow from the upstream reservoir(s).
To circumvent this issue and to ensure uniformity in data across a large region, previous large-scale studies have relied on runoff data instead of streamflow for comparison. For example, computed monthly runoff was used to evaluate the variable infiltration capacity (VIC) model for the conterminous US (Oubeidillah, Kao, Ashfaq, Naz, & Tootle, 2014). We used the USGS computed monthly runoff (1985)(1986)(1987)(1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995) in each HUC8 generated by combining historical flow data F I G U R E 2 Data and methods used to evaluate water quantity and quality responses to biomass production in the Tennessee River Basin One year (1985) was used for model spin-up followed by 10 years (1986)(1987)(1988)(1989)(1990)(1991)(1992)(1993)(1994)(1995) for model calibration. Another 18 years (1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013) of data were used for model validation. The objective of our multisite evaluation of runoff was to determine hydrologic parameters (No. 1-14 in Supporting Information Table S3) of the subbasins within each HUC8. As aforementioned, the USGS HUC8 runoff represents the local water yield (in units of mm), not the streamflow. There are 32 HUC8 units in the TRB, and the watershed was delineated into 55 subbasins for SWAT modeling. As a result, each HUC8 consists of 1-4 subbasins. For example, when we implemented evaluation in terms of the HUC8-06010102, the parameters in four subbasins (1, 4, 5 and 6) in this HUC8 were adjusted. We calculated simulated HUC8 runoff as the area-weighted average of runoff from the SWAT subbasins within each HUC8: where R HUC8 is the runoff in a HUC8; R sub (j) and Areasub(j) are the simulated runoff (mm) and the area (km 2 ) in the jth subbasin; and Area HUC8 is the total area of the HUC8 that includes m subbasins.

| Evaluation of water quality
Nutrient measurements are sparse in rivers of the TRB. We have attempted to collect in situ water quality monitoring data from over 6,000 USGS and EPA (Environmental Protection Agency) stations within the TRB through the National Water Quality Monitoring Council online Water Quality Portal (WQP) (NWQMC, 2015). However, these observed data are not ready and useful for model evaluation owing to the following reasons: (a) Although there are many measurement sites (stations), very few long-term time series are available within our study period (after 1980s); (b) not all of the water quality variables are measured at a specific site; and (c) there is scaling issue regarding the water quality data. Due to limited subdaily data points (i.e., one measurement in one month or several/many months), it is meaningless to do model evaluation at daily scale. If we want to do model evaluation at the monthly or yearly scale, we need to integrate the data from subdaily to monthly or yearly scale, which is difficult when there are lots of data gaps. We used the published LOADEST dataset (Runkel, Crawford, & Cohn, 2004) as reference to estimate water quality parameters. The LOADEST dataset provided estimates of monthly nutrient fluxes (1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006) at the Tennessee River near Paducah, KY (i.e., the outlet of TRB) (USGS, 2015). We used 3 years (1994)(1995)(1996) of data for model spin-up and 10 years (1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006) for model calibration. Another 7 years (2007)(2008)(2009)(2010)(2011)(2012)(2013) of data were used for model validation. Four water quality variables were available from the LOADEST dataset: sediment, total phosphorus (TP), total nitrogen (TN), and NO À 3 +NO À 2 . The hydrologic parameters (No. 1-14 in Supporting Information Table S3) calibrated against runoff were fixed during the calibration of water quality parameters (No. 15-39 in Supporting Information Table S3). When multiple response variables (e.g., Sediment, TP, TN, NO À 3 +NO À 2 ) were considered in model calibration, we calculated the overall objective function as the weighted average of individual objective function of respective response variable (Wang & Chen, 2012).
In addition, the spatial distribution of mean annual nutrient yields estimated by the SPARROW model (Hoos & McMahon, 2009) was used as another dataset for model evaluation at the HUC8 level. We conducted the water quality simulation for a longer period  than the period for model calibration and validation (1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013). The mean annual yields (MAYD) of nutrients at the HUC8 level were calculated as the area-weighted average of the MAYDs at all subbasins within the HUC8. We compared the TN and TP MAYDs between SWAT and SPARROW by (a) testing the significance of difference in average MAYDs across TRB by the Fisher's least significant difference (LSD) method (De Mendiburu, 2015); (b) testing the significance of difference in the probability distribution of the MAYDs by the Kruskal-Wallis (KW) test (Giraudoux, 2013); and (c) calculating the PBIAS of MAYDs between SWAT and SPARROW at the HUC8 level. All statistical tests were conducted at the significance level of α = 0.05.

| Assessment of water quality responses to biomass production
Future landscapes in the TRB were based on previous research in BT16 that optimally located where bioenergy crops were cost-competitive relative to each other and conventional crops. Future landscapes including cellulosic bioenergy crops in BT16 were generated by downscaling county-level forecasts from the POLYSYS model (US DOE, 2016b). We focused on future landscapes projected under two future scenarios described in BT16 (US DOE, 2016a): the base-case (BC1) and intermediate high-yield (HH3) scenarios. In the BC1 scenario, the landscape was extrapolated to 2040 with allocation of bioenergy crops based on price associated with yield improved by 1% annually. The HH3 landscape assumed an increase in energy crop yield by 3% annually (US DOE, 2016a,b).
The BT16 economic modeling projected future crops to a county scale. We downscaled future land use-land cover (LULC) categories from county-level POLYSYS results to USDA CLU (Common Land Unit) parcels (US DOE, 2016b). We formulated and solved a mixed-integer optimization to allocate the production of biomass crops to parcels within each county. Each parcel was assigned one crop. For each county, we minimized the difference between the total area converted from one LULC class to another class and the total area specified by the BT16 scenario for the county. Complete future 2,040 landscapes were produced for both the BC1 and HH3 scenarios by overlaying the CLU parcels on the current land-use map. The resulting layer was then converted into a raster file (spatial resolution: 1 × 1 km) for SWAT model setup. Using the same protocol as the current landscape (CDL-2009), we generated 4004 and 3994 SWAT HRUs for the BC1 and HH3 scenario, respectively.
We simulated a 10-year, 15-year, and 22-year cycle for switchgrass, miscanthus, and willow, respectively (Buchholz & Volk, 2013;US DOE, 2016a). Switchgrass and miscanthus were allowed to harvested every year (US DOE, 2016a), whereas willow was assumed to be harvested every 3 years starting from the fourth year after planting during a 22-year life-cycle (Buchholz & Volk, 2013). Genetic yield improvements on specific traits were not simulated. We followed fertilizer application scenarios in line with the BT16 assumptions (BT16 Volume 1 Appendix C; US DOE, 2016a), where in addition to N-fertilizers, P-fertilizers were applied for perennial bioenergy crops (US DOE, 2016a). We back-calculated annual N and P fertilizer amounts (Table 1) for the three bioenergy crops from projected future yields. We applied more N-fertilizer to switchgrass and miscanthus under the HH3 scenario than under the BC1 scenario because the crop yields are assumed to be higher for these two crops (but not willow) under HH3. The plant growth parameters in SWAT for bioenergy crops have been calibrated and improved by previous studies (Guo et al., 2018;Hartwich et al., 2016;Jager et al., 2015;Trybula et al., 2015). Thus, we adopted the parameter values from literature for three bioenergy crops ( Table 2): switchgrass (Jager et al., 2015), miscanthus (Trybula et al., 2015), and willow (Hartwich et al., 2016).
To quantify the impacts of land-use change on hydrology and water quality, we used the same forcing data to drive the SWAT model under different land-use scenarios (Current, BC1, and HH3). We calculated percent changes in mean annual water or nutrient loadings and concentrations under BC1 or HH3 compared to the current scenario. We constructed a distribution of changes comprised of responses of 55 subbasins to 29 years of climate and future land-use change scenarios (BC1 and HH3). We presented the boxplots of the distribution of percent changes in subbasin-area-weighted loadings (mm for water and kg/ha for sediment and nutrients) and concentrations (m 3 /s for streamflow and mg/L for sediment and nutrients).

| Summary of future bioenergy landscapes
Dedicated bioenergy crops represented 6.2% and 6.5% of the landscape under BC1 and HH3, respectively (Figure 3a). The main future bioenergy crops in the TRB were switchgrass (Figure 3b, 2% of the total TRB area in BC1 and HH3), miscanthus (Figure 3c, 0.4% and 0.7% of TRB in BC1 and HH3, respectively), and willow (Figure 3d, 3.7% and 3.8% in BC1 and HH3, respectively). In addition, the urban (low/medium/high-intensity developed) area increased from 1.1% to 2.1% under BC1 and HH3 scenarios. Pasture/grassland acreage represented 15.3% of the landscape in BC1 and HH3, which is 3.1% higher than that in the current landscape. The largest declines were projected in hay, that is, 1.1% and 1.0% under BC1 and HH3 compared to 10.1% in the current. The areas of landuse classes were not significantly different between BC1 and HH3 ( Figure 3a). Therefore, our analysis will focus on the comparison between BC1 and current scenarios.

| Spatiotemporal evaluation of runoff
Hydrologic parameters (see definitions of parameter No. 1-14 in Supporting information Table S3 and parameter values in Supporting Information Table S4) were estimated by comparing simulated monthly HUC8 runoff with the USGS dataset. As an example, Supporting Information Figure S1 shows the comparison between the SWAT-simulated monthly runoff (i.e., water yield) and USGS-estimated runoff in HUC8-06040006, which is the outlet HUC8 of TRB.

| Temporal evaluation of water quality using the LOADEST dataset
The water quality parameters (see definitions of parameter No. 15-39 in Supporting Information Table S3 and parameter values Supporting Information Table S4) were estimated against the LOADEST dataset by taking into account multiple objectives, that is, four response variables, including sediment, TP, TN, and NO À 3 +NO À 2 (see Supporting Information Figure S2 for comparisons between SWAT simulations and LOADEST estimates). The PBIAS value for model calibration was 21% for sediment, 2% for TP, −11% for TN, and −80% for NO À 3 +NO À 2 (Supporting Information Table S5). The PBIAS values for model validation of water quality were within ±75% except for NO À 3 +NO À 2 (−157%) (Supporting Information Table S5).

| Spatial evaluation of water quality using the SPARROW dataset
The spatial patterns of SWAT-simulated TN and TP at the subbasin level are shown in Figure 5 and other variables  (Jager et al., 2015), miscanthus (Trybula et al., 2015), and willow (Hartwich et al., 2016); otherwise, they are default values in SWAT database. Miscanthus is not included in original SWAT database.
(runoff, runoff coefficient, sediment, and NO À 3 ) are shown in Supporting Information Figure S3. The spatial distributions of SWAT-simulated mean annual yields (MAYDs)  of TN and TP were comparable to the SPAR-ROW-estimated MAYDs (1975MAYDs ( -2004 (Figure 6 and Supporting Information Figure S4). The average MAYDs of TN across the 32 HUC8 units were not significantly different between SWAT and SPARROW, although the SWATsimulated MAYD (5.5 kg N ha −1 ) was 12% lower than the SPARROW estimate (6.2 kg N ha −1 ) (Figure 6a). The 50% CIs of MAYD of TN were 2.5-6.7 kg N ha −1 (SWAT) and 4.7-7.4 kg N ha −1 (SPARROW), respectively. The KW test was significant, which implied that the MAYDs from the two models did not originate from the same probability distribution. The PBIAS values (between SWAT and SPARROW) for TN at 26 of 32 HUC8 units were within the range of ±70%, and the PBIAS values at three HUC8 were higher than 80% (Figure 6b).
The SWAT-simulated TP (SWAT_TP) consisted of three components, that is, organic P (ORGP), soluble P (SOLP), and sediment P (SEDP). The LSD test showed that the average MAYD of SWAT_TP (1.32 kg P ha −1 ) was not significantly different from that of SPARROW_TP (0.88 kg P ha −1 ) (Figure 6c). The KW test indicated that the SPARROW_TP and SWAT_TP did not originate from the same probability distribution, but there was no evidence of stochastic dominance between SPARROW_TP and SWAT_ORGP+SEDP or between SPARROW_TP and SWAT_ORGP+SOLP (Figure 6c). The PBIAS values between SWAT_TP and SPARROW_TP at 13 of 32 HUC8 units were within the range of ±70% (Figure 6d). In addition, the PBIAS values between SWAT_ORGP+SEDP and SPARROW_TP at 50% of the HUC8 fell into the range of ±70% (Supporting Information Figure S4c), and the PBIAS values between SWAT_ORGP+SOLP and SPARROW_TP at 59% of the HUC8 were within ±70% (Supporting Information Figure S4d).
The spatial patterns of TN from the SWAT and SPAR-ROW were moderately correlated with each other (correlation coefficient r = 0.54). The spatial pattern of SPARROW_TP was not correlated with SWAT_TP, but weakly correlated with SWAT_ORGP+SOLP (r = 0.38). In addition, the SPARROW-estimated spatial patterns of TN and TP were correlated with each other; however, the SWAT-simulated spatial distributions of TN and TP were decoupled because SEDP contributed most (65%) to TP, whereas TN was dominated by inorganic N in current TRB.

| Hydrology and water quality responses to biomass production
SWAT-simulated similar hydrologic and water quality responses between two future scenarios (BC1 and HH3), though the N and P fertilizer amounts (kg/ha) for switchgrass and miscanthus were 27%-33% higher under HH3 than under the BC1 scenario after the first year of rotation (Table 1).

| Feedstock production
SWAT-simulated mean annual yields of switchgrass biomass in the BC1 and HH3 scenarios were comparable to the BT16 projected annual yields (US DOE, 2016a) but higher than the observed annual yields for lowland ecotypes reported by Wullschleger, Davis, Borsuk, Gunderson, and Lynd (2010) (Figure 7a). SWAT-modeled miscanthus yields (Figure 7b) under BC1 and HH3 were consistent with the BT16 projections as well as the metadata analysis by Heaton, Voigt, and Long (2004). Previous studies have also found that SWAT simulates lower than expected growth for miscanthus (Zhang et al., 2011), suggesting that our water quality results for the HH3 scenario may be more realistic than those for BC1. Willow yields reported in the BT16 were higher than our SWAT-simulated yields ( Figure 7c) for BC1 and HH3 assumptions, but our SWAT simulations were comparable to field observations in North America (Volk et al., 2011(Volk et al., , 2018 and Europe (Fabio & Smart, 2018). Note that the HH3 scenario did not include higher fertilizer for willow, and so the same fertilizer amounts were simulated in both future scenarios and assumed genetic improvements leading to 3% yield escalation accounted for differences reported in the BT16 Volume I report (US DOE, 2016a).

| Percent changes in water yield
We observed a significant decrease in actual evapotranspiration (ET) and a significant increase in total runoff (WYLD) in response to conversion of lands to bioenergy crops, as measured by the distribution of percent changes in simulated hydrologic indicators over 55 subbasins across multiple years (Figure 8a). ET decreased in 87% of subbasins (Supporting Information Figure S5a), with a median change of −0.9% (−7.0 mm), and this decrease was reflected in the increased runoff from 91% of subbasins (Supporting Information Figure S5b), with a median change of +0.9% (+5.1 mm). Both surface runoff (SURQ, median change: +3.0% or +4.0 mm) and shallow groundwater flow (SGWQ, median change: +0.6% or +1.1 mm) increased markedly, yet the changes in lateral flow (LATQ) or deep groundwater flow (DGWQ) were insignificant.

| Percent changes in N loadings
We found significant decreases in both total N (TN) and nitrate (NO3) loadings, although organic N (ORGN) loadings became higher under BC1 compared to those under the Current scenario (Figure 8b). TN decreased in 80% of subbasins (Supporting Information Figure S5c). Constituents of TN, nitrate loadings decreased in 78% of F I G U R E 6 Comparison of spatial distribution of TN and TP yields between SWAT simulation and SPARROW dataset at 32 HUC8 units.

| Percent changes in sediment and P loadings
We did not observe significant changes in sediment (TSS, see Supporting Information Figure S5e) and total P (TP, see Supporting Information Figure S5f) loadings, though organic P (ORGP) and sediment P (SEDP) loadings increased in most subbasins (median change in ORGP: +1.9% or +0.001 kg/ha; median change in SEDP: +3.1% or +0.02 kg/ha) under BC1 compared to Current (Figure 8c). Decreased soluble P (SOLP) loadings (median change: −4.6% or −0.005 kg/ha) were offset by increased SEDP and ORGP loadings, which resulted in unchanged loadings of mineral P (MINP) and TP (Figure 8c).

| Percent changes in streamflow and nutrient concentrations
Higher streamflow (FLOW) and lower TN concentrations were found under BC1 than under current base landscape ( Figure 8d). FLOW increased in 82% of subbasins, with a median change of +1.1% (+1.0 m 3 /s). TN concentrations decreased in 87% of subbasins, with a median change of -40.7% (−2.7 mg/L). Similar to loadings, we did not observe notable changes in TSS and TP concentrations, despite significant increases in ORGP concentrations (median change: +3.8% or +0.01 mg/L). Significant declines in nitrate (NO3) concentrations (median change: -48.2% or −2.8 mg/L) and increases in ORGN concentrations (median change: +7.6% or +0.04 mg/L) were also found under BC1 relative to the current landscape.

| DISCUSSION
Our study in the TRB evaluated the hydrology and water quality changes associated with planting three bioenergy crops including switchgrass, miscanthus, and willow under a base-case and high-yield future scenario from the BT16.
A key finding of this study is that TN and nitrate levels decreased under both future scenarios. This is likely explained by the decrease (46%) in N fertilizer when hay was converted to miscanthus and willow (Supporting Information Figure S6). Our results did not show significant changes in sediment and P loadings under future scenarios. Our analysis indicated sediment yields were more dependent on topography (specifically, the elevation drop in a subbasin) than land-use classes in this region (Wellen et al., 2015). In addition, only up to 6.5% of the land would be converted to biofuel crops in the two BT16 scenarios, which might also limit the changes in sediment loadings. Under current landscape, very little P fertilizer was applied across the TRB. Although P fertilizer (16-37 kg ha −1 year −1 after the first year of rotation) was required when  Heaton et al. (2004); and (c) willow, observed values from Volk et al. (2011Volk et al. ( , 2018 bioenergy crops were planted, the area-weighted P fertilizer applied over TRB was only 1.2 kg ha −1 year −1 , which was not sufficient to result in significant changes in TP loadings and concentrations. Changes in streamflow rates were small (median: +1.4%), though significant. We showed that changes in nutrient loadings in the TRB were explained by the increased area planted in miscanthus and willow, as they replaced hay that usually required higher amounts of , nitrate in groundwater flow; TSS, total suspended sediment; TP, total P; ORGP, organic P; SOLP, soluble P; SEDP, mineral P attached to sediment; MINP = SOLP + SEDP. "**" and "*" indicate a significant difference between BC1 (or HH3) and current with p-value <0.001 and p-value <0.05, respectively. "ns" means no significant difference with p-value ≥ 0.05 (t test) fertilizer, whereas switchgrass was mainly converted from pasture (Supporting Information Figure S6). In addition, willow occupied higher acreage and required smaller amounts of fertilizers than switchgrass (see Table 1). A similar study in the Arkansas-White-Red (AWR) river basin also indicated decreases in nutrient loadings and concentrations, where switchgrass was the main bioenergy crop and represented 3.8-5.1% of future landscapes (Jager et al., 2015). The AWR study showed a reduction in sediment yields (median: −14%) but an increase in sediment concentrations (+1.8%) owing to decreased streamflow (−14%). In Indian Creek watershed in Illinois, NO À 3 loadings were decreased by 26% and 18% when simulating conversion of marginal lands to switchgrass and willow, respectively (Ssegane & Negri, 2016).
Our hydrologic results in the TRB were different from previous studies in other watersheds, where the conversion of conventional crops to perennial bioenergy crops led to lower runoff (Jager et al., 2015;Ssegane & Negri, 2016;Trybula et al., 2015). We acknowledge potential uncertainties associated with the modeling of various components of hydrology (i.e., surface runoff, lateral flow, and groundwater flow) as no measurements were available for the model evaluation of these components. In this study, projected increases in residential areas (Figure 3) resulted in more impervious area thereby leading to higher surface runoff (SURQ). In addition, our data analysis showed that the reduction in hay and expansion of pasture/grassland and bioenergy crops (particularly willow) tended to decrease ET and increase groundwater flow in the TRB. Compared to the mean annual ET from hay (889 mm), the ET from miscanthus (986 mm) was higher, whereas the ET losses from switchgrass (858 mm) and willow (774 mm) were lower. It ultimately resulted in lower area-weighted ET in the TRB under BC1 than current as switchgrass and willow acreages were much higher than miscanthus (Figure 3). Trybula et al. (2015) also found that ET from miscanthus was higher than for switchgrass. Although ET of willow is generally higher than that of conventional crops, pasture, and forest, significantly lower ET from willow than pasture and forest has also been observed (Hartwich et al., 2016). In this study, much lower ET from willow than other land-uses (hay, forest, miscanthus, and switchgrass) might be partly explained by relatively lower mean annual precipitation and smaller available water capacity (AWC) in the middle and north-east TRB growing willow. Our results indicate that lower ET of willow than other crops may occur, which is controlled by site-specific properties (Hartwich et al., 2016). Lower ET resulted in higher groundwater recharge and more groundwater flow.
Model comparisons against data are always challenging, especially when working at a large spatial scale with reservoir operations and evaluating multiple response variables. We developed innovations to overcome hurdles associated with limited data for model testing: (a) we identified empirical datasets interpolated in space and time to use in our comparison, and (b) we implemented an autocalibration approach to allow simultaneous calibration against multiple responses, including specific synthetic response variables (e.g., HUC8 runoff and total fluxes of NO À 3 +NO À 2 ). Using these innovations, we were able to successfully parameterize the SWAT model for the TRB and to evaluate model performance. We incorporated measured reservoir outflows into the modeling and used HUC8 runoff instead of streamflow rates as calibration targets, thereby avoiding the representation of complex operational strategies for many reservoirs. This is understandable as we focused on hydrologic and water quality responses to land-use change and the simulation of real-time reservoir operation is out of the scope of this study. Our spatiotemporal evaluation of SWAT enables to more realistically simulate the current hydrology and water quality, which gives us more confidence to simulate the effects of land-use change on water quality in the future.
In summary, we simulated changes in water quality and hydrology associated with land-use change from conventional lands to bioenergy crops using the SWAT model. Our results suggest that nitrate and TN loadings and concentrations could be greatly reduced by increases in biomass production consistent with the BT16 BC1 scenario, whereas runoff, sediment, and phosphorus were slightly or insignificantly affected in the TRB. The extent of influence of biomass production on water quantity and quality depends on the types of bioenergy crops, their cultivation areas and management practices as well as the site-specific characteristics.