Simulating the effects of behavioral and physical heterogeneity on nonpoint source pollution

To increase the effectiveness of conservation programs focused on reducing agricultural nutrient runoff and targeting management interventions, some have called for greater attention to the role of diversity in both management and physical context. To examine the independent and interactive effects of behavioral and physical heterogeneity on phosphorus loads, a sensitivity analysis was conducted using six different assumptions about distributions of phosphorus fertilizer application rates and soil test phosphorus (STP) levels for hydrologic response units in a SWAT model for the Maumee River Watershed. Results indicated that changing assumptions about behavior and STP levels can significantly affect estimated dissolved reactive phosphorus (DRP) loads and the level of disproportionality, which is a measure of the unequal distribution of pollutant loading. Placing the highest fertilizer application rates on fields with the most excessive STP produced 14% greater estimated DRP load and higher levels of disproportionality compared to a baseline model, where homogeneity in farmer fertilizer behavior and STP were assumed. In contrast, placing the lowest fertilizer application rates on the fields with the most excessive STP led to estimated DRP loads and level of disproportionality that were similar to the baseline model. Results from this analysis suggest that simplistic or uniform assumptions about behavior or STP levels may mask serious environmental risks in agricultural watershed models.

Federal and state authorities of Canada and the United States have established numerous regulatory and nonregulatory programs to limit the severity of harmful algal blooms (HABs) in Lake Erie (EPA & Environment and Climate Change Canada, 2022). Evaluating the effectiveness of these programs is an important step toward efficient and successful water quality management. Spatially distributed watershed-scale models can provide useful and convenient tools for this evaluation. One such hydrological model widely used to simulate the effectiveness of conservation practices is the Soil and Water Assessment Tool (SWAT) (e.g., Cooper et al., 2017; Gu et al., 2015; Lee et al., 2017; Wallace et al., 2017). While important advances have been made to improve the ability of SWAT to capture biophysical processes and built infrastructure (Arnold et al., 1998, 2012; Bieger et al., 2017; Neitsch et al., 2005), much less work has been done to account for the complexity of human land management behaviors.
The majority of spatially distributed watershed-scale models lack spatially refined representation of farm management practices such as fertilization (Getahun & Keefer, 2016; Merriman et al., 2018), assuming representative or homogenized inputs of farm management at large scales. For instance, around 70% of the published SWAT model applications use a single rate or the auto-fertilization algorithm to simulate fertilizer management practices (Arrueta et al., 2022). A dearth of detailed and spatially explicit data on farm management has contributed to the reluctance of SWAT modelers to incorporate this information in their work (Xu et al., 2018).
Using aggregated data to simulate phosphorus pollution may be misleading and inefficient. Previous research has shown that environmental pollution is not caused equally by all actors. Freudenburg's (2006) research on "disproportionality" demonstrates that a small percentage of actors are often responsible for a large portion of environmental pollution. This theory diverges from previous theories about responsibility for environmental pollution, which tend to focus on the overall, average, or homogeneous patterns of behavior. One example is the "Ecological Footprint" literature, which measures the average impact of human activities in terms of the amount of environmental impacts required to produce the goods consumed and assimilate the waste generated (Global Footprint Network, 2016). According to Lang (2011), these indices conceal within-group variations and overestimate the typical contributions by the majority of individuals within a group.
An assumption about an individual based on average data obtained for a group may result in an assessment error, known as "ecological fallacy" (Robinson, 1950). Just because nonpoint sources are the main cause of water quality degradation, we cannot assume that all farmers are equally contributing to the problem. In fact, extensive past research on actual management practices has suggested that farmer behaviors can be quite heterogeneous. Farmers are a highly diverse group, and their behavior related to the adoption of production and conservation practices (e.g., fertilizer, tillage, irrigation management) can vary widely based on personal characteristics and decision-making contexts (Prokopy et al., 2019; Reimer et al., 2014). For example, while published data are limited, it has long been apparent that farmer fertilizer use behaviors range widely. In one Wisconsin survey of 1708 farmers, Shepard (2000) found phosphate (P2O5) application rates that ranged from 0 to 1355 kg/ha, with a mean of 128 kg/ha and a standard deviation of 112 kg/ha. Most farmers applied levels of nutrients that were outside recommended ranges for the production of corn, and 12% of the farmers applied more than one standard deviation over the average, with nearly 2% of the farmers applying three standard deviations or more above the mean.
Moreover, human contributions to environmental degradation are also determined by the specific physical setting where risky behaviors take place (Nowak et al., 2006). Inappropriate management in a fragile physical setting may be more detrimental than the same practice in a well-buffered physical setting. There is growing evidence that a small portion of farm fields contributes the majority of phosphorus loading to waterbodies due to legacy P issues (Jarvie et al., 2013; Sharpley et al., 2013). For example, a recent study of soil testing laboratory data from the Western Lake Erie Basin found that approximately 5% of the agricultural fields have phosphorus concentrations 2.5 times greater than the maximum agronomic recommendations for corn and soybeans (Dayton et al., 2020). Numerous studies have linked these elevated phosphorus soils to excessively greater loads of phosphorus in surface runoff (Duncan et al., 2017; King et al., 2018).
Several studies have focused on identifying areas that contribute excessive amounts of nutrients, as an aid in targeting conservation practices (Ghebremichael et al., 2013; Jha et al., 2010; Scavia et al., 2017; Teshager et al., 2017). However, the majority of these studies identified such areas using only physical characteristics of the landscape (e.g., slope, soil characteristics, land cover). For example, Jha et al. (2010) targeted conservation practices to highly erodible areas. While human and physical dimensions can have independent impacts, the interactions of both dimensions may reveal new and complex patterns and processes not evident when studied separately (Liu et al., 2018). In this sense, Nowak et al. (2006) suggest examining the interactive effects of behavior and physical vulnerability since particular behaviors in particular fields may account for a disproportionate amount of the environmental impact in most watersheds. Disproportionality occurs when there are asymmetries between the appropriateness of social behaviors and the buffering capacity of the physical setting where these behaviors occur (Nowak et al., 2006). A better understanding of the dynamics of disproportionality can help target the people and locations for optimal implementation of conservation management practices. Therefore, integrative interdisciplinary research that investigates the interactions between farmers' management behavior and farm-level physical heterogeneity is needed to improve the reliability and validity of spatially distributed watershed-scale models simulating water quality outcomes.
The main goal of this study was to assess the sensitivity of SWAT-simulated riverine phosphorus loading to model assumptions about behavioral and physical heterogeneity. Specifically, we examined the independent and interactive effects of behavioral and physical heterogeneity on phosphorus loads by incorporating stronger assumptions about the heterogeneity of phosphorus fertilizer application rates and soil test phosphorus (STP) values across the Maumee River Watershed (MRW), a major contributing area associated with periodic HABs in Lake Erie.
Empirical results from surveys conducted in the MRW were used to characterize the distribution of phosphorus fertilizer application rates, and reports from regional soil testing laboratories were used to characterize the distribution of STP values in the MRW.

Research Impact Statement
Simulating heterogeneity in fertilizer management and soil phosphorus content affected dissolved reactive phosphorus export predicted by a watershed model.

| Study area
The MRW is the largest watershed contributing to Lake Erie, draining a 16,200 km² area that includes northwestern Ohio and parts of Indiana and Michigan (Maumee RAP et al., 2006, Figure 1a). The MRW is the largest single contributor of phosphorus to Lake Erie, accounting for 48% of total phosphorus (TP) and 20% of dissolved reactive phosphorus (DRP) (Maccoux et al., 2016).
The region is characterized by flat landscapes and poor surface drainage; around 50% of the area has a slope lower than 0.5% and has poorly drained soils (Gebremariam et al., 2014). Nearly 75% of the land cover is under row crops, primarily corn and soybeans, while urban areas occupy 10%, and grasslands and forest each cover 6% of the watershed area (USDA, 2017, Figure 1a). Moreover, about 70% of the row crop agriculture is drained through subsurface ("tile") drains (NRCS, 2020), and approximately 20% of the phosphorus input in the watershed is from manure (International Joint Commission, 2018; Scavia et al., 2017).
The average annual precipitation in the study region is 1052 mm (based on data from water years 1990 to 2022), with the summer season tending to be wetter than the winter season (NOAA, 2022). The average annual temperature is 10°C, with February (−8°C) as the coldest month and July (29°C) as the warmest month (Williams & King, 2020).

| SWAT model
The SWAT is a process-based, spatially distributed, watershed-scale hydrological model that can predict the impact of land management on water, sediment, and nutrient fluxes within watersheds (Arnold et al., 1998). The SWAT model of the MRW used in this study was published in Apostel et al. (2021). This model sets hydrologic response unit (HRU) boundaries based on approximate farm field boundaries, which assures spatial continuity. The model consists of 24,256 HRUs, of which 84% are agricultural HRUs with an average size of 70.9 ha, comparable to the average farm-field size in Ohio (72.4 ha), Indiana (106.8 ha), and Michigan (82.9 ha) (USDA, 2017). The model performance was assessed using updated performance metrics for satisfactory model fit in Moriasi et al. (2015). For further information on model development, process and configuration of HRUs, and model calibration and validation, see Apostel et al. (2021).

FIGURE 1 (a) Land cover in the Maumee River Watershed SWAT model and (b) estimated location of the HRUs that had non-manured corn fields in a corn/soybean rotation analyzed in the study. HRU, hydrologic response unit.

| Characterization of fertilizer behavior using survey data
Phosphorus fertilizer application rates were estimated for cropland in the watershed based on the results of a longitudinal survey conducted in the MRW by The Ohio State University during the winters of 2016 and 2018. The 2016 survey produced 748 respondents (a 29% response rate, Prokup et al., 2017). In 2018, a new survey was sent to all 2016 respondents and generated 381 useable responses (a 60% response rate, Beetstra & Wilson, 2018). Although the main purpose of the surveys was to investigate how farmers perceive and use recommended conservation practices, each survey instrument included a section about current fertilizer application rates on a typical farm field. The combined samples produced 1129 observations of fertilizer application behavior.
The sample population for this survey consisted of farmers growing corn and/or soybean. For the purpose of this study, we focused only on corn fields in a corn/soybean rotation that had not received manure during the previous year. Manured fields were excluded from the analysis due to limited observations. Based on data published by the USDA-ERS (2019), 53% of the commercial phosphorus fertilizer applied is used for corn production, and the corn/soybean rotation is the most common rotation in the study area. Of the total number of observations, 325 respondents reported that they grew corn in a corn/soybean rotation and did not apply manure in the previous year. Of these, 224 farmers reported phosphorus rates.
Before analyzing the self-reported phosphorus fertilizer management data, respondents who provided unrealistic or outlier values were removed using the Z-score approach (Cousineau & Chartier, 2010). The Z-score represents the number of standard deviations by which the value of an observation is above or below the population mean. We used a Z-score threshold of 3 to identify and remove the most extreme (and presumably unreliable) observations. This resulted in removing three observations from the analysis.
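The Z-score screening described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `remove_outliers`, the sample rates, and the use of the population standard deviation are assumptions made for the example.

```python
from statistics import mean, pstdev

def remove_outliers(values, z_threshold=3.0):
    """Drop observations whose absolute Z-score exceeds z_threshold."""
    mu = mean(values)
    # Assumption: population SD; the sample SD (statistics.stdev) is an
    # equally plausible choice and the text does not say which was used.
    sigma = pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) / sigma <= z_threshold]

# Hypothetical self-reported P rates (kg/ha) with one implausible entry.
rates = [30.0] * 20 + [45.0] * 20 + [900.0]
kept = remove_outliers(rates, z_threshold=3.0)
```

The same helper would cover the soil test data by passing `z_threshold=5.0`, the cutoff the text reports for STP values.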
Data from 221 observations were used for the analysis. Phosphorus fertilizer rates for non-manured corn fields in a corn/soybean rotation varied considerably in the study region (Figure 2a). The mean phosphorus rate from 221 observations was 44.6 kg/ha with a standard deviation of 28.6 kg/ha, and the median was 34.2 kg/ha, which was relatively close to the mean estimate for Ohio published by the USDA in 2014, 2016, and 2018 (36 kg/ha, USDA-ERS, 2019). The lowest phosphorus rate was 0 kg/ha, and the highest phosphorus rate (after removing outliers) was 134.5 kg/ha. The simulated fertilizer rates were generated by first arranging the observational data from lowest to highest and partitioning them into 10 sections or deciles (Table S1). Then, each HRU of the SWAT model was randomly assigned to a decile bin. Finally, within each decile bin, each selected HRU was randomly given a value between the boundaries of the decile bin. The simulated data matched the observational data reasonably well, with a Jensen-Shannon divergence index of 0.28 (Figure 2a). The Jensen-Shannon divergence index measures the level of difference between two probability distributions (Stewart, 2019). The index ranges from 0 to 1, where 0 indicates that both distributions are similar and 1 indicates that both distributions are maximally different.
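The decile-based sampling and the Jensen-Shannon comparison can be sketched as follows. This is an illustration under stated assumptions: all function names are hypothetical, values are drawn uniformly within each decile bin, the two distributions are compared on 20 equal-width histogram bins, and base-2 logarithms are used so that the divergence falls between 0 and 1. The text does not specify the binning or the exact variant of the index that was used.

```python
import math
import random

def decile_edges(observations):
    """Return 11 edges that split the sorted data into 10 equal-count bins."""
    data = sorted(observations)
    n = len(data)
    edges = [data[(k * n) // 10] for k in range(10)]
    edges.append(data[-1])
    return edges

def simulate_from_deciles(observations, n_units, rng):
    """Assign each unit (HRU) a random decile, then a value inside that bin."""
    edges = decile_edges(observations)
    values = []
    for _ in range(n_units):
        k = rng.randrange(10)  # random decile bin for this HRU
        values.append(rng.uniform(edges[k], edges[k + 1]))
    return values

def histogram(values, lo, hi, n_bins=20):
    """Relative frequencies on equal-width bins between lo and hi."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = max(0, min(n_bins - 1, int((v - lo) / width)))
        counts[idx] += 1
    return [c / len(values) for c in counts]

def js_divergence(p, q):
    """Jensen-Shannon divergence (base-2 logs) of two discrete distributions."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

rng = random.Random(42)
# Hypothetical stand-in for the 221 survey observations (kg/ha).
observed = [rng.lognormvariate(3.6, 0.6) for _ in range(221)]
simulated = simulate_from_deciles(observed, 5000, rng)
lo, hi = min(observed), max(observed)
jsd = js_divergence(histogram(observed, lo, hi), histogram(simulated, lo, hi))
```

Because every decile bin is equally likely to be chosen, the simulated values reproduce the observed decile boundaries while smoothing the distribution inside each bin.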

| Characterization of soil nutrient levels using soil data
To determine soil nutrient levels, results of STP analyses from regional soil testing laboratories from the years 2010 to 2015 were used. High STP values have been linked to an increased risk of phosphorus runoff (Duncan et al., 2017). These values were provided to an Ohio State University research team in 2016 by three major soil testing laboratories, A&L Great Lakes Laboratories, Inc. (Fort Wayne, Indiana), Brookside Laboratories (New Bremen, Ohio), and Spectrum Analytics Inc. (Washington Court House, Ohio) and published in Dayton et al. (2020).
Soil samples from counties in Ohio that are part of the MRW were used for this analysis (398,607 soil samples). STP values more than 5 standard deviations above or below the mean were removed from the analysis. This resulted in removing roughly 1% of the total observations. A Z-score of 5 was used because recent reviews of soil testing laboratory data from the Western Lake Erie Basin found that over 5% of soil samples in the region have an STP level that exceeds 100 mg/kg (Dayton et al., 2020; Williams et al., 2015).
Data from 396,332 observations were used for the analysis. Overall, STP values varied widely (Figure 2b). The mean Mehlich-3 STP value was 48.9 (standard deviation = 38.2), and the median was 38. The lowest Mehlich-3 STP value was 1 and the highest value was 293.2. The simulated STP values were generated by first arranging the observational data from lowest to highest and partitioning them into 10 sections or deciles (Table S2). Then, each HRU of the SWAT model was randomly assigned to a decile bin. Finally, within each decile bin, each selected HRU was randomly given a value between the boundaries of the decile bin. The simulated data matched the observational data quite well, with a Jensen-Shannon divergence index of 0.03 (Figure 2b).
In order to incorporate the STP information in SWAT, the SOL_SOLP parameter was modified for all agricultural HRUs (18,018 HRUs), while all others remained at the default value of 6.75 Mehlich-3 P STP (6238 HRUs) based on the criteria described in Table S3. The SOL_SOLP parameter controls the initial soluble phosphorus concentration in the soil. Since the SWAT model used in this study employs labile P to describe the initial soluble phosphorus concentration in the soil layer, and there is a published conversion between Bray1-P and SWAT labile P (Sharpley et al., 1984), STP values were converted from Mehlich3-P to Bray1-P and then converted to labile P concentrations. To convert from Mehlich3-P to Bray1-P, Mehlich3-P values were divided by 1.35. This conversion factor was developed for Ohio, Indiana, and Michigan by Culman et al. (2020). After Mehlich3-P values were converted to Bray1-P STP values, the latter values were converted to labile P values using the following linear relationship between Bray soil testing results and soil labile P developed by Sharpley et al. (1984):
(1) Labile P = (0.56 × Bray P) + 5.1
In order to represent the vertical stratification of STP values, soil phosphorus values were split into two depths: 0-5 cm and 5-20 cm below the soil surface. The first depth of 0-5 cm represents the surface layer that often has elevated soil phosphorus, and the second depth of 5-20 cm represents the common sample depth for traditional soil tests. The STP values were distributed vertically using the percentage distribution for a nearby watershed (Sandusky River watershed) in which mean values in the top 5 cm depth of soil samples were 68% higher than the mean values of the lower section (5-20 cm) (Baker et al., 2017).
The following relationships were then assumed: Substituting Equation (1) into Equation (2) and rearranging, we obtain Equation (3): where STP_0-5 = STP concentration (mg P/kg soil) in the top 5 cm of the soil core; STP_5-20 = STP concentration (mg P/kg soil) in the 5-20 cm of the soil core; STP_avg = average STP concentration (mg P/kg soil) in the 20 cm soil core.

| Simulating single and interactive effects of behavioral and physical heterogeneity on phosphorus loads
The single and interactive effects of behavioral and physical heterogeneity on phosphorus loads in the MRW were simulated in SWAT using the frequency distributions obtained for phosphorus fertilizer rate application and STP values in the previous analysis. To accomplish this, six scenarios were developed (Table 1). A detailed description of the assumptions about fertilizer use and STP levels in each scenario is presented in Table S3.
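Before turning to the individual scenarios, the STP preprocessing chain from the soil data section (Mehlich-3 to Bray-1 to labile P, followed by the split into two depths) can be sketched as below. The conversion constants come from the text. The depth split is an assumption of this example: it treats the 20 cm core average as the depth-weighted mean of the 0-5 cm and 5-20 cm layers, with the top layer 68% higher than the lower one. Function and constant names are hypothetical.

```python
M3_TO_BRAY1 = 1.35     # Mehlich-3 / 1.35 = Bray-1 (Culman et al., 2020, as cited)
TOP_ENRICHMENT = 1.68  # top 0-5 cm is 68% higher than 5-20 cm (Baker et al., 2017)

def mehlich3_to_labile_p(m3_stp):
    """Convert a Mehlich-3 STP value to labile P via Bray-1 (Sharpley et al., 1984)."""
    bray1 = m3_stp / M3_TO_BRAY1
    return 0.56 * bray1 + 5.1

def split_by_depth(stp_avg, top_cm=5.0, total_cm=20.0):
    """Split a core-average STP into (top, lower) layer values.

    ASSUMPTION (not stated in the text): stp_avg is the depth-weighted mean,
    stp_avg = w_top * stp_top + w_low * stp_low, with stp_top = 1.68 * stp_low.
    """
    w_top = top_cm / total_cm
    w_low = 1.0 - w_top
    stp_low = stp_avg / (w_top * TOP_ENRICHMENT + w_low)
    stp_top = TOP_ENRICHMENT * stp_low
    return stp_top, stp_low

# Example with the mean Mehlich-3 STP reported for the watershed (48.9).
labile_mean = mehlich3_to_labile_p(48.9)
stp_top, stp_low = split_by_depth(48.9)
```

Under these assumptions the lower layer equals the core average divided by 1.17, and the default non-agricultural value of 6.75 Mehlich-3 P corresponds to a Bray-1 value of 5.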
The first scenario is a Baseline scenario, where a uniform farmer fertilizer behavior and a single STP value were used based on the average values for phosphorus fertilizer rate and STP values obtained from the MRW surveys and laboratories' reports. The next two scenarios (P rate heterogeneity and STP heterogeneity) were developed to analyze the single effect of behavioral or biophysical heterogeneity on phosphorus loads. The P rate heterogeneity scenario incorporates a random distribution of farmer fertilizer behaviors grounded in the empirical results from the MRW surveys reported above, while a single STP value was used across all agricultural fields. In contrast, the STP heterogeneity scenario allocated STP levels randomly across agricultural fields based on actual reports of STP from regional soil testing laboratories, and a uniform farmer fertilizer behavior was used.
The final three scenarios (Random allocation, Low risk, and High risk) were analyzed to better understand the interactive effect on phosphorus pollution of farmers' fertilizer behavior and the vulnerability of the settings where this behavior takes place.
Therefore, the last three scenarios made different assumptions about the relationship between fertilizer application rate decisions and a field's STP levels. The Random allocation scenario assumes that fertilizer rate applications have no significant association with STP values, but rather that they are two different phenomena that occur independently. This would be the case if farmers do not take into account the STP values of their fields when deciding on the application rates.
The Low-risk scenario assumes that farmers adjust their fertilizer rates in response to soil testing. In other words, the most vulnerable fields (with the highest STP values) would be expected to receive the lowest rates of fertilizer, and fields with low STP levels would receive the highest fertilizer application rates. One reason for this pattern would be an assumption that farmers generally use soil test results to guide their fertilizer rate applications (as many agronomists and agricultural economists believe). This assumption could be supported by the results of Reimer et al. (2020), who found that around one-third of the farmers in the Midwest account for uncertainty in their decision-making process associated with N management through increased data collection about weather, yield, soil quality, and other agronomic conditions. In addition, farmers may be aware of the heightened environmental risks associated with high STP levels and thus be more motivated to implement nutrient management practices. According to Ranjan et al. (2019) and Prokopy et al. (2019), the vulnerability of the land was positively correlated with the adoption of conservation practices. A theory that could help explain this is the "Construal Level Theory" (Brügger et al., 2016), which describes the relationship between psychological distance and the extent to which people perceive an event or an object to be abstract or concrete. The more distant an object is from the agent, the more abstract it is perceived, while the closer the object, the more concrete it is perceived. Farmers who are located on fields with high STP may perceive the phosphorus runoff issue as more concrete than farmers who are not located in these places. Therefore, they would be more likely to adopt more conservative application rates. Various findings from studies about climate change adaptation have supported this theory, suggesting that farmers who perceive climate change to be occurring are also more likely to plan for adaptation (Spence et al., 2012). For example, farmers in California who experienced changes in water availability have been found to be more likely to adopt adaptation measures, with "adaptation driven by psychologically proximate concerns for local impacts" (Haden et al., 2012).
Finally, the High-risk scenario assumes that farmers located on more vulnerable landscapes (defined here as high STP values) are also applying the highest rates of fertilizer. This over-application of fertilizers could be the reason why fields have increased soil phosphorus concentrations in the first place. Farm management is complex, involving interactions of overlapping biological, physical, and social systems (Darnhofer, 2014). This complexity can lead to high levels of uncertainty, which could affect farmers' decisions (Aimin, 2010). In order to minimize this uncertainty, farmers (and most people) are known to use heuristics or shortcuts (such as rules of thumb, habits, or behaviors learned from parents) to make decisions. Reimer et al. (2020) found that around two-thirds of the farmers in the Midwest rely on previous experience and heuristics to make decisions related to nitrogen management. Moreover, since farmers make many decisions daily (on top of fertilizer rates), cognitive limitations or bounded rationality may also affect farmers' decision-making. In any case, if a farmer overapplies fertilizer for many years, their soils will more likely accumulate high STP levels.

TABLE 1 Scenarios used to analyze the single and interactive effect of behavioral and physical heterogeneity on phosphorus loads. High risk: the highest rates of phosphorus fertilizer were applied to the HRUs that had the highest values of STP.
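The three interactive scenarios differ only in how the simulated fertilizer rates are paired with the simulated STP values. The sketch below shows one way to express that pairing. It assumes strict rank matching for the Low-risk and High-risk cases, which is an interpretation of the scenario descriptions rather than the documented procedure (Table S3 holds the actual criteria). The function name `assign_scenario` and the sample values are hypothetical.

```python
import random

def assign_scenario(p_rates, stp_values, scenario, rng=None):
    """Pair one P rate with one STP value per HRU for an interactive scenario.

    Returns a list of (p_rate, stp) tuples ordered by ascending STP.
    """
    if len(p_rates) != len(stp_values):
        raise ValueError("need one P rate and one STP value per HRU")
    stp_sorted = sorted(stp_values)
    if scenario == "high_risk":
        # Highest rates land on the highest-STP HRUs (same rank order).
        rates = sorted(p_rates)
    elif scenario == "low_risk":
        # Lowest rates land on the highest-STP HRUs (opposite rank order).
        rates = sorted(p_rates, reverse=True)
    elif scenario == "random":
        # Rates and STP are independent of each other.
        rates = list(p_rates)
        (rng or random.Random()).shuffle(rates)
    else:
        raise ValueError(f"unknown scenario: {scenario}")
    return list(zip(rates, stp_sorted))

# Hypothetical values for four HRUs: P rates in kg/ha, STP in mg/kg.
rates = [10.0, 20.0, 30.0, 40.0]
stp = [200.0, 5.0, 100.0, 50.0]
high = assign_scenario(rates, stp, "high_risk")
low = assign_scenario(rates, stp, "low_risk")
rand = assign_scenario(rates, stp, "random", random.Random(1))
```

All three pairings use exactly the same set of rates and the same set of STP values, so any difference in simulated loads comes from the pairing alone.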

| Scenarios setup
Scenarios were simulated by running the baseline SWAT model for 4 years under observed climate and current management regimes as a warm-up period. Then, the scenarios were run for the next 2 years using the criteria described in Table 3. Simulation outputs were summarized by water year, from October of Year 5 to September of Year 6, from HRUs that had non-manured corn fields in a corn/soybean rotation (Figure 1b). Predicted annual TP and DRP loads were used to compare the five scenarios with the Baseline scenario, in order to quantify how consequential the different approaches to representing heterogeneity could be for watershed-scale predictions of total nutrient loading rates.
TP loads were calculated by adding organic phosphorus, sediment-bound phosphorus, soluble phosphorus, and subsurface phosphorus loads, while DRP loads were calculated by adding surface soluble phosphorus and subsurface phosphorus loads.
The level of disproportionality in nutrient loading for each scenario was quantified by using Gini coefficients (Gini, 1912). The Gini coefficient is a dimensionless metric that ranges from 0 to 1, with a Gini coefficient of 0 expressing perfect equality, where all values are the same, and a value of 1 expressing maximal inequality among values (Dniestrzański, 2015). In this case, a Gini coefficient of 1 would mean that only one HRU is emitting all the phosphorus load in the watershed, and a coefficient of 0 would mean that all the HRUs are emitting similar loads of phosphorus per unit area. The Gini coefficient has been previously used in hydrologic studies (Gall et al., 2013; Jawitz & Mitchell, 2011; Masaki et al., 2014; Saha et al., 2018; Williams et al., 2018).
All six scenarios were run three times with varying randomization to test whether the results varied due to inherent randomness. In this sense, the same P rate and STP values were assigned to differently randomized HRUs for each scenario. Finally, we ran all six scenarios for three different periods (2005-2010, 2006-2011, and 2007-2012) to better understand how temporal factors influence the direct and interactive effects of behavioral and physical heterogeneity on phosphorus loads. Simulated outputs of water years 2010, 2011, and 2012 were selected and compared for analysis. These years were selected because they represent extremes in spring precipitation conditions that are present in Ohio (Michalak et al., 2013; NOAA, 2022, Table 2). Spring precipitation events coupled with long-term trends in agricultural land use and practices have produced excessive algal growth in Lake Erie (USEPA et al., 2018; Michalak et al., 2013).
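For reference, a Gini coefficient over per-HRU loads can be computed with the standard rank-based formula, as sketched below. This is a generic implementation, not the authors' script; it assumes non-negative, area-normalized loads, and the function name `gini` is illustrative.

```python
def gini(values):
    """Gini coefficient of non-negative values using the sorted-rank formula."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

equal_loads = gini([1.0, 1.0, 1.0, 1.0])      # every HRU emits the same load
one_emitter = gini([0.0, 0.0, 0.0, 10.0])     # a single HRU emits everything
```

With a finite number of units, the single-emitter case gives (n - 1)/n rather than exactly 1; the value approaches 1 as the number of HRUs grows.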

| RESULTS
Six scenarios were run in the SWAT model (Baseline, P rate heterogeneity, STP heterogeneity, Random allocation, Low-risk, and High-risk) to assess the direct and interactive effects of behavioral and physical heterogeneity on phosphorus loads. Overall, the inclusion of greater behavioral and physical heterogeneity did not affect TP loads under any scenario compared to the Baseline scenario (Table 3). TP loads from the water year 2010 were relatively similar across all scenarios, ranging from 1975 g/ha (STP heterogeneity scenario) to 1992 g/ha (P rate heterogeneity scenario). Also, there were no differences in the level of disproportionality in TP loads under the studied scenarios compared to the Baseline scenario (Table 4). Disproportionality was measured in terms of disparities among the proportions of phosphorus loads contributed by each of the HRUs. A high level of disproportionality suggests that a small subset of HRUs is producing most of the phosphorus load.
In contrast to TP loads, the incorporation of behavioral and physical heterogeneity increased DRP loads and the level of disproportionality of DRP loads under all scenarios compared to the Baseline scenario. The introduction of heterogeneity into the SWAT model of either phosphorus fertilizer rates (P rate heterogeneity scenario) or STP values (STP heterogeneity scenario) increased DRP loads by 4% and 3%, respectively (Figure 3), and raised the level of disproportionality from 0.48 to 0.51 (Table 4) relative to the Baseline scenario.

TABLE 2 Observed annual and spring precipitation in Ohio across the three studied periods (The Great Lakes Water Quality Agreement Nutrients Annex Subcommittee has defined spring as the period between March 1 and July 31 each year; USEPA et al., 2018).

The random allocation of both fertilizer rates and STP values (Random allocation scenario) also increased DRP loads and the level of disproportionality compared to the Baseline scenario. Under this scenario, the disproportionality increased from 0.48 to 0.54 compared to the Baseline scenario (Table 4). This resulted in an increase of 7% in the DRP loads compared to the Baseline scenario (Figure 3).
The highest increase in DRP loads and disproportionality was obtained under the High-risk scenario, where the highest rates of phosphorus were allocated to the most vulnerable settings. In this case, the disproportionality increased from 0.48 to 0.59 relative to the Baseline scenario (Table 4). This increase in disproportionality resulted in an increase of DRP loads of 14% compared to the Baseline scenario (Figure 3).
In contrast, when the lowest rates of phosphorus fertilizer were allocated to the HRUs with the highest STP values (Low-risk scenario), DRP loads only increased by 1% (Figure 3), and the disproportionality level increased from 0.48 to 0.49 (Table 4) compared to the Baseline scenario.
Spatial randomization of fertilizer behavior and STP values did not influence TP and DRP loads, which did not vary substantially across randomizations for a given scenario (Table S4). For example, under the Random allocation scenario, the mean TP load was 1979 g/ha, with a standard deviation of 3.42 g/ha, and the mean DRP load was 232 g/ha, with a standard deviation of 2.37 g/ha. Also, spatial randomization did not have an effect on the level of disproportionality of TP and DRP loads. Gini coefficients were identical across all randomizations for each scenario.
Finally, temporal variation affected the disproportionality of DRP loads among all scenarios compared to the Baseline scenario. In water years with high rainfall, the disproportionality of DRP loads rose further above the Baseline scenario under all scenarios (… and S6). For example, the Gini coefficient for the High-risk scenario for the water year 2010 was 0.11 points greater than the Baseline scenario, while during the water year 2011, the Gini coefficient for the High-risk scenario was 0.15 points greater than the Baseline scenario.

FIGURE 3 Percent change in DRP loads from non-manured corn fields in a corn/soybean rotation relative to the Baseline scenario for the water year 2010.

| DISCUSS ION
This study examined the direct and interactive effects of behavioral and physical heterogeneity on phosphorus loads (DRP and TP) in nonmanured corn fields with a corn/soybean rotation in the MRW draining to Lake Erie using a SWAT model.Stronger assumptions about the heterogeneity of phosphorus fertilizer application rates and STP values increased estimated DRP loads under all the scenarios.The results reflect the fact that DRP is driven both by runoff of excess fertilizer nutrients, which move easily with water (Gildow et al., 2016), and in accordance to soil phosphorus concentration (Duncan et al., 2017;Hussain et al., 2021;King et al., 2018;McDowell & Sharpley, 2001).For example, in a study conducted in the Western Lake Erie Basin, Duncan et al. (2017) found a significant positive relationship between STP values and DRP loads (p < 0.05), suggesting that soils with greater STP tended to have greater DRP losses in subsurface drainage.Contrary to these results, TP loads did not change under any of the scenarios.This is likely due to the fact that TP loading includes organic phosphorus from plant material, which is controlled by plant growth and residue decomposition rather than fertilizer application (Neitsch et al., 2005), as well as sedimentbound phosphorus, which is less impacted by short-term fertilizer management (Gildow et al., 2016).
Incorporating a more realistic characterization of behavioral and physical heterogeneity generated greater estimated DRP loads and increased the level of disproportionality in the contribution of individual HRUs to phosphorus losses compared to a model that assumes homogeneous patterns of fertilization behavior and STP levels. The greatest increases in DRP loadings were obtained under the Random allocation and the High-risk scenarios, in which heterogeneous distributions of both phosphorus fertilizer rate and STP were combined at random or the highest rates of phosphorus fertilizer were allocated to the most vulnerable geographic settings, respectively. The changes in the assumed levels of disproportionality represented under these two scenarios caused an increase in annual DRP loads of 7% and 14%, respectively.
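The difference between the combined-heterogeneity allocations can be sketched as a pairing of two lists. This is only a minimal illustration, not the study's SWAT workflow: the function name `allocate_rates`, the HRU identifiers, and every numeric value are hypothetical.

```python
import random


def allocate_rates(stp_by_hru, rates, scenario, seed=0):
    """Pair fertilizer rates with HRUs according to a scenario.

    stp_by_hru: dict mapping HRU id -> soil test phosphorus (hypothetical values)
    rates:      list of fertilizer rates, one per HRU (hypothetical values)
    scenario:   "high_risk", "low_risk", or "random"
    """
    if len(rates) != len(stp_by_hru):
        raise ValueError("need exactly one rate per HRU")
    # HRUs ordered from lowest to highest STP
    hrus = sorted(stp_by_hru, key=stp_by_hru.get)
    if scenario == "high_risk":
        ordered = sorted(rates)                # highest rate goes to highest STP
    elif scenario == "low_risk":
        ordered = sorted(rates, reverse=True)  # lowest rate goes to highest STP
    elif scenario == "random":
        ordered = list(rates)
        random.Random(seed).shuffle(ordered)   # rates assigned without regard to STP
    else:
        raise ValueError(f"unknown scenario: {scenario}")
    return dict(zip(hrus, ordered))


# Hypothetical inputs, for illustration only
stp = {"hru_a": 20.0, "hru_b": 45.0, "hru_c": 150.0}
rates = [10.0, 25.0, 60.0]
high = allocate_rates(stp, rates, "high_risk")
low = allocate_rates(stp, rates, "low_risk")
rand = allocate_rates(stp, rates, "random")
```

In all three cases the same set of rates is applied across the watershed; only the pairing with STP changes, which is what isolates the interaction effect from the effect of each distribution alone.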
These results are consistent with findings from Wischmeier and Smith (1978) and Nowak et al. (2006), who argue that when both social and biophysical dimensions follow log-normal frequency distributions, their interaction will produce a distribution of pollution even more strongly skewed than either distribution on its own. Thus, the extreme values at the tail of these distributions have disproportionately large effects on the environment.
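The compounding of skew can be made concrete under a simplifying assumption that is ours rather than the cited authors': suppose the behavioral factor $X$ and the biophysical factor $Y$ are independent log-normal variables and that pollution scales with their product. Then

$$
\ln X \sim \mathcal{N}(\mu_X, \sigma_X^2), \quad \ln Y \sim \mathcal{N}(\mu_Y, \sigma_Y^2)
\;\;\Longrightarrow\;\;
\ln(XY) \sim \mathcal{N}\!\left(\mu_X + \mu_Y,\; \sigma_X^2 + \sigma_Y^2\right).
$$

For a log-normal variable with log-scale standard deviation $\sigma$, the Gini coefficient is

$$
G(\sigma) = 2\,\Phi\!\left(\frac{\sigma}{\sqrt{2}}\right) - 1 = \operatorname{erf}\!\left(\frac{\sigma}{2}\right),
$$

which increases with $\sigma$. Because $\sqrt{\sigma_X^2 + \sigma_Y^2}$ exceeds both $\sigma_X$ and $\sigma_Y$, the product is more unequal than either factor. With the hypothetical values $\sigma_X = \sigma_Y = 0.5$, each factor has $G \approx 0.28$, while the product has $G = \operatorname{erf}(\sqrt{0.5}/2) \approx 0.38$.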
In contrast to the previous findings, when the lowest rates of phosphorus fertilizer were allocated to the most vulnerable settings in the Low-risk scenario, the estimated DRP loading and disproportionality level were similar to the values obtained in the Baseline scenario. These results are in accordance with Nowak et al. (2006), who suggested that environmental degradation is determined by human behavior and the specific biophysical setting where this behavior takes place. An inappropriate behavior within a particularly buffered setting would not produce the same effect as it would in a more vulnerable setting. Targeting strategies sometimes focus only on the vulnerability of the physical landscape and ignore the fact that the land may already have adequate treatment to address pollutant loss. In this sense, a recent study published by Dayton et al. (2020) suggests that farmers in Ohio may be taking steps to manage phosphorus inputs and STP, based on the decreasing temporal trends of STP values and negative P balance in the majority of the counties. Increasing the efficiency of conservation programs may require a focus on fields or operations that need treatment as well as behaviors that are disproportionately likely to generate P losses.

TABLE 4 Gini coefficients for each scenario. Gini coefficients were used to measure disproportionality in TP and DRP loads. A Gini coefficient of 1 means that only one HRU (hydrologic response unit) emits all the P load in the watershed, while a coefficient of 0 means that all the HRUs emit similar loads of P per unit area.
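As a reminder of how a Gini coefficient summarizes disproportionality, the following is a minimal sketch of one common rank-based estimator. The source does not state which estimator was used, so the function `gini` and the example loads are assumptions for illustration only.

```python
def gini(loads):
    """Gini coefficient of non-negative per-unit-area loads (0 = equal shares)."""
    values = sorted(loads)
    n = len(values)
    total = sum(values)
    if n == 0 or total <= 0:
        raise ValueError("need at least one load and a positive total")
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # with x sorted ascending and ranks i = 1..n.
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical loads for four HRUs
equal_loads = gini([5.0, 5.0, 5.0, 5.0])   # every HRU emits the same load
single_source = gini([0.0, 0.0, 0.0, 10.0])  # one HRU emits everything
```

With this estimator, equal loads give 0, and a single emitting HRU gives (n - 1)/n, which approaches 1 as the number of HRUs grows.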
Results from this study showed that the interaction of human and physical dimensions can have a multiplicative effect, either reducing or exacerbating the overall vulnerability of the landscape. For example, estimated DRP loads in the High-risk scenario were approximately 11% greater than in the scenarios that consider the behavioral and physical dimensions independently. Therefore, accounting for both the distribution of farmer behaviors and the resiliency of the biophysical setting of those behaviors may improve our understanding of patterns of DRP pollution across the landscape. However, agricultural areas that generate disproportionately high pollutant loads are often identified in watershed models based only on physical characteristics such as slopes and soil types (Ghebremichael et al., 2013; Scavia et al., 2017; Teshager et al., 2017). While some efforts have been made to better represent the heterogeneity among farmers in their decision-making process in models (Kast et al., 2021), the absence of readily available local data on actual farmer management practices is a key limitation in simulating the variability of these practices.
Simulation of spatial randomness influenced neither the loads nor the disproportionality of DRP or TP loads. However, the inclusion of temporal variation provided a window into how rainfall cycles affect the level of disproportionality of DRP loads: in years with higher precipitation, the differences in DRP disproportionality between each scenario and the Baseline scenario were greater. This variation could be explained by changes in precipitation within a year. For example, the differences in disproportionality of DRP loads between all scenarios and the Baseline scenario were higher during the water year 2011 than during 2010. According to NOAA (2022), Ohio experienced an annual precipitation of 940 mm in 2010 and 1266 mm in 2011. This interpretation is further supported by the lower precipitation in 2012 (1051 mm) than in 2011, which coincided with smaller differences in DRP disproportionality in 2012 than in the previous year. In Lake Erie, it is the high-loading years that produce the most extensive harmful algal blooms (Michalak et al., 2013). Unfortunately, our findings suggest that the inclusion of behavioral and physical heterogeneity increases the level of disproportionality during the most problematic years, making management even more difficult if these fields are not properly treated. Throughout the next century, the Midwest region of the United States is projected to experience increases in precipitation intensity and annual precipitation volume (Michalak et al., 2013; Pryor et al., 2013). These results suggest that under future climate conditions, the estimated share of DRP loads produced by the highest emitting fields may increase. However, more research is needed to better understand the effect of temporal variation of behavioral and physical heterogeneity on the disproportionality of DRP loads.
This study likely represents a conservative estimate of the disproportionality in phosphorus loads produced in row crop agriculture in the MRW. Focusing only on non-manured fields because of insufficient data on fertilizer applications to manured fields may have understated the overapplication of phosphorus present in the watershed, as fields receiving liquid manure may be more likely to receive higher rates of phosphorus than fields receiving only inorganic fertilizers. The N/P ratio in manure is usually smaller than the N/P ratio required for crop production (Eghball, 2002). Thus, when manure is applied to meet crop nitrogen demands, phosphorus is generally oversupplied. In the future, collecting more local data on fertilizer and manure management would increase the number of empirical observations of farmer behaviors available to allocate diverse fertilizer application rates and would strengthen the validity of the current study. In addition, even though the studied scenarios considered heterogeneity in many physical characteristics, we simulated minimal heterogeneity in farmer behavior (e.g., crop rotations, manure application, etc.), which might have led to the underestimation of DRP loadings. Findings from Muenich et al. (2017) suggest that detailed field-level management information is needed to accurately model nutrient loadings.
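The oversupply argument for manure can be illustrated with simple arithmetic. The numbers below are hypothetical placeholders chosen only to show the direction of the effect; they are not measurements from the MRW or values from Eghball (2002). The sketch also assumes, as a simplification, that all manure nitrogen and phosphorus are available to the crop.

```python
def phosphorus_surplus(n_demand, manure_n_to_p, crop_n_to_p):
    """P applied minus P required when manure is applied to meet N demand.

    n_demand:      crop nitrogen demand (e.g., kg N per ha), hypothetical
    manure_n_to_p: mass of N per unit mass of P in the manure, hypothetical
    crop_n_to_p:   mass of N per unit mass of P required by the crop, hypothetical
    """
    p_applied = n_demand / manure_n_to_p  # P delivered by an N-based manure rate
    p_required = n_demand / crop_n_to_p   # P the crop needs alongside that N
    return p_applied - p_required


# Manure N/P (3) smaller than the crop's required N/P (6) gives a positive surplus
surplus = phosphorus_surplus(n_demand=180.0, manure_n_to_p=3.0, crop_n_to_p=6.0)
```

Whenever the manure N/P ratio is below the crop's required ratio, the surplus is positive, and it grows in proportion to the nitrogen demand being met.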
The scenarios were designed to test the sensitivity of the watershed to combined physical and behavioral heterogeneity, and yet uncertainty still exists about the effects of actual behaviors and STP levels on phosphorus loads. In particular, the lack of geographic information linking these two factors makes it impossible to make inferences about the true spatial patterns of high-risk or low-risk areas in the study watershed. If more spatially explicit data on fertilization behavior or STP levels were available, it would enable us to more accurately identify the places where P losses are highest. Also, the SWAT model used in this study, while well calibrated for daily discharge and nutrient loads at the watershed outlet (Apostel et al., 2021), has certain limitations in capturing internal processes within the watershed. Apostel et al. (2021) found that the sensitivity of SWAT's runoff processes to surface soil phosphorus concentration produced an unrealistic partitioning of surface and subsurface phosphorus loadings, tending to overpredict phosphorus loading through surface runoff and underpredict the contribution of phosphorus from tile drainage. This limitation in the model capability could have resulted in an overestimation of DRP losses in surface runoff. Finally, it is important to mention that simulated outputs are conditional on the procedures, datasets, and assumptions used to set up a SWAT model, and caution should always be taken when interpreting model results.
In the future, it would be interesting to examine more thoroughly the decision-making factors that influence heterogeneity in phosphorus fertilizer use. For example, it is possible that farmers who apply higher phosphorus fertilizer rates also have higher crop yields. However, past research on inequality has suggested that the production of environmental pollution is often not proportional to economic outputs (Freudenburg, 2006). Variation in phosphorus fertilizer use is likely driven by multiple economic and non-economic factors (Caswell et al., 2001; Prokopy et al., 2019). For example, Pannell (2017) found that nitrogen fertilizer use depends on both technical factors (e.g., crop type, soil characteristics, and precipitation) and socioeconomic factors (e.g., sale price of grain, purchase price of fertilizer, objective of the farmer, risk aversion level, nitrogen fixation from legumes, flat payoff functions for nitrogen fertilizer, and policies). Understanding the reasons for farmer decisions is critical to an integrated approach to managing the effects of agricultural diffuse pollution on water quality.

| CONCLUSION
The independent and interactive effects of behavioral and physical heterogeneity on phosphorus export from farmland were evaluated by incorporating stronger assumptions about the heterogeneity of phosphorus fertilizer application rates and STP levels in a SWAT model for the MRW. The inclusion of greater behavioral and physical heterogeneity did not affect TP loads or their disproportionality under any scenario compared to the Baseline scenario. In contrast, the incorporation of behavioral and physical heterogeneity increased DRP loads and the level of disproportionality of DRP loads under all scenarios compared to the Baseline scenario. The highest increase in DRP loads and disproportionality was obtained under the High-risk scenario, where the highest rates of phosphorus were allocated to the HRUs that had the highest STP values. Under this scenario, DRP loads increased by 14%, and the level of disproportionality or inequality increased from 0.48 to 0.59 compared to the Baseline scenario. In contrast, when the lowest rates of phosphorus fertilizer were allocated to the HRUs with the highest STP values (Low-risk scenario), DRP loads only increased by 1%, and the level of disproportionality was relatively similar to the Baseline scenario.
Overall, this analysis suggests that assuming an average fertilizer rate or soil phosphorus concentration across a watershed in spatially distributed watershed-scale hydrologic models may underestimate DRP loadings and mask the ability to identify and address the highest sources of environmental risk in agricultural watersheds. Increasing DRP loads from the MRW contributed to greater cyanobacterial biomass and more frequent occurrence of HABs in Lake Erie (Kane et al., 2014). The use of finer resolution data is important, as studies have shown that the incorporation of field-scale data can significantly impact nutrient loading in watershed models (Apostel et al., 2021; Muenich et al., 2017), and here, we expand on this finding to demonstrate an effect on disproportionality of nutrient loading. These results indicate that incorporating more nuanced approaches to capture both behavioral and physical heterogeneity in SWAT models may be particularly relevant in predicting phosphorus pollution and designing effective and targeted measures in agricultural watersheds. Special attention should be given to outliers in human and physical dimensions, which can have a multiplicative effect, either reducing or exacerbating the overall vulnerability of the landscape.

$$
\mathrm{STP}_{5-20} = 0.855 \times \mathrm{STP}_{\mathrm{avg}} \tag{4}
$$

FIGURE 2 Density plots of (a) phosphorus rate applications and (b) soil test phosphorus (STP) based on survey and simulated data.

Table 4
Simulated annual surface runoff, tile discharge, and DRP and TP loads for non-manured corn fields in a corn/soybean rotation.