Ignoring uncertainty in predictor variables leads to false con ﬁ dence in results: a case study of duck habitat use

. An assumption of most regression analyses is that independent variables are measured without error. However, in ecological studies it is common to use independent variables that are derived from samples and therefore contain some uncertainty. For example, when assessing the assumption that energy availability on the landscape is the primary driver of duck distribution during nonbreeding seasons, investigators typically sample energy availability at sites and use the site-level means as a covariate to predict duck abundance. This strategy ignores uncertainty in the estimates of energy availability, which should be propagated into estimates of effects and predicted values of the response variable. I used Bayesian hierarchical models to include uncertainty in site-level covariates when modeling dabbling duck count data during the spring in northeastern Colorado, USA. I found that even after accounting for uncertainty in energy availability, it was an important predictor of dabbling duck use of sites. Counts were greater at sites with more energy available; however, credible intervals were substantially wider when uncertainty in predictor variables was included. Therefore, ignoring uncertainty leads to overly precise model outputs. Furthermore, I found that larger sites and those further east also supported more dabbling ducks. Using a sample as a covariate is common in ecological studies, and researchers can use the methods outlined here to account for this additional level of uncertainty. These case study results can be used by habitat managers and planners to guide how and where wetland restoration occurs with a more accurate idea of the uncertainty associated with various effects.


INTRODUCTION
Uncertainty enters into our understanding of ecological processes through multiple sources. Increasingly, researchers are acknowledging various sources of uncertainty and attempting to account for it (Cressie et al. 2009). Process uncertainty represents incomplete knowledge or depiction of a phenomenon or the unpredictability of parameters, whereas observation uncertainty results from inability to measure perfectly or completely (i.e., sampling) and may include bias in measurements (Hilborn andMangel 1997, Hobbs and. Traditionally, process uncertainty has been the main source of uncertainty included in statistical analyses, whereas observation uncertainty is not typically mentioned unless there was an obvious source of bias in measurements (e.g., Dorazio 2014). Specifically with regard to predictor variables, observation uncertainty can lead to bias in parameter estimates, loss of power to detect effects, and potentially false conclusions (Davies and Hutton 1975, Reeves et al. 1998, Carroll et al. 2006. A fundamental assumption of most regression models is that independent variables are measured without error (Sokal and Rohlf 1995). Therefore, methods to deal with observation uncertainty in predictor variables have received much attention, and an entire class of models (measurement error or errors-in-variables models) have been developed (Fuller 1987). However, used in a likelihood-based framework, these models can be somewhat limiting in terms of how complex they can be. Achieving convergence and parameter estimation can be challenging due to their high dimensionality (Jitjareonchai et al. 2006, Bartlett and Keogh 2018).
Bayesian hierarchical models are being used more frequently by ecologists to model complex systems and represent one approach to address process and observation uncertainty. Hierarchical models allow uncertainty at one level (e.g., predictor variable) to be propagated into effect sizes and parameter estimates in another level (predictions of response variable; Gelman and Hill 2007, Bartlett and Keogh 2018). Furthermore, in a Bayesian framework with Markov Chain Monte Carlo methods, models can handle much greater complexity when compared with frequentist methods (Bartlett and Keogh 2018). Bayesian hierarchical models accounting for measurement error in covariates have been used in epidemiology (Richardson andGilks 1993, Bartlett andKeogh 2018), occupancy modeling (Rota et al. 2016), and population dynamics (Calder et al. 2003, Clark andBjornstad 2004) but have received limited use in other realms of ecology. The approach involves treating the true unobserved values of a predictor variable as a latent variable (Hoffmann et al. 2017). Measurement error can be modeled using either classical or Berkson errors. Classical measurement error models consist of the observed variable X* conditional on the latent true variable X and an error term (X* = X + ϵ), whereas with Berkson errors, the latent true variable is conditional on the observed values plus an error term (X = X* + ϵ, Carroll et al. 2006). Classical errors arise when researchers have repeated measurements around a true value, whereas Berkson errors occur when a group's mean is assigned to each individual (Heid et al. 2004). I focus on classical measurement error in this paper.
A common scenario in ecological research involves taking samples of some covariate at the site (or individual, group, etc.) level that is suspected to explain variation in a response variable among sites (or individuals, groups, etc.; Osborn et al. 2017, Nhu et al. 2019, Spaan et al. 2019. For example, in a study attempting to identify predictors of bird abundance among montane meadows, vegetation samples were taken within each site and the site-level means were used as covariates in regression models (Saveraid et al. 2001). Using site-level means as covariates seems to be the standard way information derived from samples is included as covariates in analyses (Kalies et al. 2012, Pilliod et al. 2013, Wimp et al. 2019, Lindstrom et al. 2020. Under this scenario, no bias is suspected because the covariate is based on multiple samples from each site. However, there is still uncertainty in the estimate of the site-level mean that should be accounted for but is rarely mentioned (Freckleton 2011). As with any sampling, this uncertainty is the result of data collected at a smaller scale (sample) than that at which inference is desired (population). As long as there is no systematic bias in sampling, parameter estimates should be unbiased but confidence intervals around predicted values and effects should widen. By taking multiple samples at each site, researchers have the ability to estimate the uncertainty associated with the site-level mean and include this information in their models.
Herein, I use a case study of duck habitat use among sites to show how observation uncertainty in predictor variables can be accounted for. Although I focus on predictors of habitat use among sites, these methods of accounting for uncertainty in predictor variables could be used in many other scenarios involving sampled variables that are used as predictors of variation among individuals, species, or any other grouping or classification. Methods have previously been developed to deal with observation uncertainty (Richardson and Gilks 1993, Fox and Glas 2003, Bartlett and Keogh 2018, and the effects of ignoring observation uncertainty have been documented (Linden andKnape 2009, Freckleton 2011), but they are rarely used in ecological research for the habitat use or selection scenario described in the case study. Therefore, my objective was to show how these methods can be applied to a common ecological research scenario and how they influence overall inference. v www.esajournals.org 2 October 2020 v Volume 11(10) v Article e03273

Case study
Current paradigms of habitat management and planning for nonbreeding waterfowl are focused on providing enough energy on the landscape to support regional population goals (U.S. Department of the Interior and Environment Canada 1986, Central Valley Joint Venture 2006, Soulliere et al. 2007, Playa Lakes Joint Venture 2008. These habitat management and planning strategies assume that energy availability is a limiting factor during nonbreeding periods (Playa Lakes Joint Venture 2008), and therefore, migrating and wintering ducks select habitat based on energy availability. This assumption should be thoroughly tested, given the amount of management resources that are spent to increase food availability on the landscape (Colorado Division of Wildlife 2011, U.S. Fish and Wildlife Service 2013). Generally, previous research has shown that local-scale duck abundance is positively related to energy density of sites although the effects were variable (Brasher 2010, O'Shaughnessy 2014. In addition to energy availability, many other factors have been shown to influence duck use of sites during migration and winter including wetland type or vegetation patterns in wetlands, size, presence of adjacent wetlands, distance to rivers, and water depth (Brasher 2010, Webb et al. 2010, Hagy and Kaminski 2012, Beatty et al. 2014, O'Shaughnessy 2014, Osborn et al. 2017). If ducks do not respond to changes in energy availability as strongly as predicted due to selection for other factors, the actual number of ducks supported may be less than what planners anticipated.
Examining the effects of energy availability on duck abundance requires estimating energy availability. Like many ecological covariates, food sampling is a time and labor-intensive process (Williams et al. 2014). Therefore, researchers take a small number of food samples (usually about 5-10, Olmstead et al. 2013, O'Shaughnessy 2014, Osborn et al. 2017 at each site that they are interested in representing. These samples contain uncertainty ) that is generally ignored in analyses that use site-level means of energy availability as predictor variables. Accounting for this uncertainty should broaden confidence intervals around the covariate's parameter estimate and overall model predictions, potentially to the point where the variable is no longer important.
In a Bayesian hierarchical modeling framework, I assessed the relationship between sitelevel characteristics and the intensity of duck use in northeastern Colorado during spring migration. My main objective was to examine the effect of accounting for observation uncertainty in energy availability while assessing its relationship with the intensity of duck use of sites. I expected other factors to influence site use as well so I also assessed how vegetation structure, size, and geographic location affected the intensity of duck use of sites. Like energy availability, vegetation density was also a sample and contained observation uncertainty for which I accounted. Vegetation structure or density can influence how ducks perceive predation risk (Behney et al. 2018). If perceived predation risk prevented ducks from using sites, I expected my measure of vegetation density to be a good predictor of use. There are energetic costs when traveling from roosting or loafing sites to foraging patches (Johnson et al. 2014). I assessed whether these energetic costs played an important role in shaping duck distribution by including variables quantifying the distance of each site to a large reservoir and the distance to a river, which are commonly used for roosting or loafing (Ringelman et al. 1989). Lastly, local-scale duck abundance is limited by duck abundance at larger spatial scales. If this large-scale flyway location effect is an important driver of local-scale duck abundance, then geographic location along an east-west gradient should be a good predictor of duck abundance.

Study area and site selection
I conducted this study over a 3,331,501-ha area in northeastern Colorado in Sedgwick, Washington, Logan, Morgan, Weld, and Larimer counties. The area is generally classified as shortgrass prairie with intensive agriculture focusing on cattle production and row crop farming. Most water features in the region (other than playas) are associated with the South Platte River and were found within the river basin corridor. Therefore, to ensure a spatially balanced sample, I divided the South Platte River corridor (~10 km from river) into four quadrants (~70 river km per quadrant). Using wetland geographic information system v www.esajournals.org 3 October 2020 v Volume 11(10) v Article e03273 data, I compiled a list of all potential sites (properties, both public and private) within each quadrant for each type of water feature and randomly selected sites within each quadrant. I used 6 general types of water features: actively managed emergent wetland, passively managed emergent wetlands, sloughs, playas, small reservoirs (<200 ha), and large reservoirs (≥200 ha). These water feature types represented a substantial proportion of the overall wetland habitat base used in regional avian habitat planning models (Playa Lakes Joint Venture 2008). I randomly selected sites equally among quadrants and water feature type. I was unable to gain access to 3 randomly selected sites so I randomly selected new sites to replace them. Playas are found farther from the South Platte River than other wetland types. Therefore, I selected study playas, by creating a grid (7.5 × 7.5 min topographic map quadrangles) across northeastern Colorado. I randomly selected two grid cells that contained at least 5 playas that were greater than 0.25 ha. I then randomly selected playas within each grid cell to sample. During this study, playas were infrequently inundated due to drought, which drastically reduced the sample size of playas.

Sampling methods
Duck counts. -In late winter, after most water bodies thawed and ducks started to arrive, I conducted weekly bird counts during morning (sunrise to 1000) between 28 February and 1 June 2016 and 2017 on all sites. I visited each site once per week in a random order and noted the number of each species of duck present. For most sites, I was able to conduct counts each week and collected 10 or 11 counts throughout each spring. However, for some sites, I was not able to visit them every week and I excluded sites from analyses with fewer than three samples in a year. I also excluded sites for which I did not collect energy availability or vegetation density data. I attempted to estimate detection probability during duck counts through double counting. First, I counted ducks from a vantage point or vehicle without disturbing them. Then, for water features less than~2 ha, I walked the perimeter and out into the wetland to flush and count all ducks. For sloughs, I walked a 500 m stretch to flush and count all ducks. For reservoirs, I randomly selected a quarter or an eighth of the water feature, and used aerial photographs in the field to identify the extent of, and counted all ducks within, the selected area. I also walked the bank or out into any vegetation to flush any ducks that were not initially visible. Farther from the bank, there was no vegetation that would have prevented detection of ducks. Observers used spotting scopes and binoculars to scan open water areas to search for ducks. Detection of ducks during flush counts was always ≥ to that of vantage surveys so I only used flush counts in analyses and did not account for detection probability. By flushing all ducks in smaller water features and those in vegetation in reservoirs, I believe that any bias due to detection probability was minimal.
Food and vegetation sampling.-I used core sampling (Williams et al. 2014) to estimate food density during fall and spring. Details of food sampling and estimates of food and energy density and depletion for these study sites are presented in Behney (2020). Briefly, I sampled in late winter (23 February through 16 March 2016 and 2017), as soon as wetlands began thawing to estimate food density at the beginning of spring migration. I randomly distributed seven core samples ) throughout portions of water features with water depth less than 50 cm. I used this cutoff because dabbling ducks (Genus Anas, Spatula, Mareca, Aix) generally do not feed in water deeper than 50 cm (Behney 2014). Core samples were processed in the laboratory where I visually searched through the material and picked out any seed, tuber, or invertebrate and identified items to lowest taxonomic level possible, generally, to genus for plant matter and to class or order for invertebrates. I dried all material at 60°C to a constant mass (about 48 h) and weighed to the nearest 0.00001 g to estimate food density (kg/ha) and facilitate conversion to energy density (kcal/ha) based on published true metabolizable energy values for duck food items at the lowest taxonomic level possible (Livolsi et al. 2015). I converted energy density to total available energy at a site by multiplying energy density by the area of each site for which water depth was shallow enough to facilitate feeding by dabbling ducks (<50 cm) presented in Behney (2020). At each core location, I estimated vegetation density/structure using a modified v www.esajournals.org Robel pole technique for which I noted the number of 10 cm bands on a 1.5 m tall pole (2.54 cm diameter) that were completely visible from 5 m, repeated for each of the four cardinal directions.

Statistical analyses
To assess factors that influence dabbling duck use of sites during spring, I modeled weekly dabbling duck counts (pooled across species) in a Bayesian hierarchical analysis. I used a negative binomial distribution to represent duck counts because initial data exploration suggested a Poisson distribution could not adequately represent the variation present in the duck count data. I limited analyses and inference to dabbling ducks because they are the primary focus of management activity in the region (Playa Lakes Joint Venture Waterfowl Team 2005). I included week as a count-level (i.e., individual-level) predictor and total available energy (all energy found at water depths less than 50 cm), an index of vegetation density (Robel pole measurements), site size (ha), distance to the nearest large reservoir (km), distance to the South Platte River (km), and UTM easting coordinate (km) as site-level (i.e., group-level) predictors. To calculate the distance from each site to large reservoirs and the river, I calculated the distances between the centroid of each site and the centroid of the nearest large reservoir or the centerline of the river in ArcMap (Esri, Redlands, California, USA).
Energy density and vegetation density metrics included some uncertainty because they, themselves, were samples. Therefore, in addition to the count model, I included energy and vegetation density models in the overall hierarchical model and used the parameters estimated from these models as covariates in the duck count model. I modeled energy using a lognormal distribution: where veg dijq is the number of sections of the pole that were visible out of 15 total sections, read from direction d, at sample location i, within site j, in year q. I included direction d as an additional level for the vegetation model but not in the energy availability model because at each sample location, a vegetation density reading was taken from 4 directions, whereas only 1 food sample was taken at each sample location. Sample location was nested within site so I used the vegetation density mean of the site as the prior mean for the sample location. I modeled weekly duck counts (k) at site j in year q using a negative binomial distribution where the intercept was allowed to vary by site and year. I included site-level covariates to explain variability in the intercept. The global count model took the form: were estimated in the energy and vegetation density submodels, respectively. Full model posterior and joint distributions are shown in Appendix S1.
By modeling energy and vegetation density in the overall hierarchical model, uncertainty in their estimates is propagated into the duck count model. All priors were vague; coefficients (β, τ) were assigned normal(0, 0.0001 [precision]) except in the vegetation density model where coefficients (α1 veg jq ) were assigned normal(0, 0.5[precision]) priors because it used a logit link function, and I wanted priors to be vague on the probability scale (Northrup and Gerber 2018) rather than the scale of the link function. Standard deviations were assigned uniform (0, 5) priors. I centered all predictor variables before inclusion in the model. I used a multi-phase modeling strategy. Firstly, I assessed the best form of the week variable v www.esajournals.org 5 October 2020 v Volume 11(10) v Article e03273 without including any site-level predictors. I included week as a cubic, quadratic, and linear predictor of duck counts and included an intercept-only null model in the model set. I compared models using deviance information criterion (DIC) and by assessing whether 95% credible intervals of the coefficients contained zero. Many options exist for Bayesian model selection but I chose DIC because it is simple to calculate and is similar to AIC, which is widely used and understood by ecologists . Secondly, using the best form of the week variable, I assessed whether energy density or vegetation density should be included in the final model of duck counts. I included these variables in this separate stage of modeling because they were samples themselves and required a mixture of models, each with their own parameters. I fit a model including vegetation and energy density in an additive relationship and determined that a variable should be in the final model if the 95% credible interval for its coefficient excluded zero (e.g., Hobbs et al. 2012). I did not use DIC to compare models in this stage because DIC measures the fit of a model penalized by the effective number of parameters (Gelman and Hill 2007). The goal of incorporating uncertainty in covariates is not necessarily to improve model fit, but to more realistically represent uncertainty that is present in the data, which requires estimating additional parameters. Therefore, a model incorporating predictor uncertainty will likely have a worse DIC than a model ignoring this uncertainty. Lastly, using the best model from stage two, I compared models including all additive combinations of distance to large reservoir, distance to river, and easting using DIC and assessing 95% credible intervals. In this final stage, I also included two models incorporating interactions between energy density, distance to reservoir, and distance to river (energy × reservoir and energy × river) to test for differences in how ducks select foraging sites based on how far they are from roost locations (Kennedy and Gray 1997). Because the response variable was raw counts (not density), to account for differences in area among sites, I included a size (ha) variable in all models during this last stage of modeling. None of the predictors were highly correlated (all Pearson correlation coefficients < 0.5).
I ran three chains for 30,000 Markov Chain Monte Carlo (MCMC) iterations following 5,000 iterations for burn-in and 5,000 iterations for adaptation for optimal sampling and fit models using Jags via the RJags package (Plummer 2018) in program R (R Core Team 2019). To assess convergence, I visually examined trace plots and calculated Gelman and Rubin's convergence diagnostic in the CODA package (Plummer et al. 2006) to ensure it was close to 1 (Gelman and Rubin 1992, Brooks and Gelman 1998, Gelman and Hill 2007. To check whether models adequately fit the data, I used posterior predictive checks with Bayesian p values based on means, standard deviations, and discrepancy (sums of squares) of the data and from simulated data from the fitted models Hill 2007, Hobbs and. Bayesian P values were calculated as the percentage of iterations for which the metric (i.e., mean, SD, discrepancy) from the observed data exceeded that of the simulated data where extreme values (close to zero or one) indicate poor fit.

RESULTS
I was able to include 498 counts over 28 sites in analyses. The three most common species observed were mallard (Anas platyrhynchos), northern shoveler (Spatula clypeata), and gadwall (Mareca strepera). For all models, trace plots indicated convergence for all parameters and Gelman and Rubin's convergence diagnostic was ≤ 1.01 for all parameters. Little evidence of lack of fit was revealed for models from posterior predictive checks; all Bayesian p values were between 0.5 and 0.7.
For the first stage of modeling, including week as a quadratic or a cubic term resulted in very similar DIC scores (within 0.1 DIC; Appendix S2: Table S1). Both models were substantially better than models including week in linear form (ΔDIC = 32.6) or a null, intercept-only model (ΔDIC = 148.7). The cubic coefficient's 95% credible interval overlapped zero (−0.001-0.005; Appendix S2: Table S1), and the predicted values were very similar between cubic and quadratic models. Therefore, I used the quadratic form of week as a count-level predictor in subsequent modeling.
v www.esajournals.org 6 October 2020 v Volume 11(10) v Article e03273 BEHNEY For the second stage of modeling, the 95% credible interval for the energy availability coefficient excluded zero (0.49, 95% CI 0.26-0.66), whereas the credible interval for the vegetation density coefficient included zero (0.17, 95% CI − 0.84-1.03). Therefore, I only included energy availability as a site-level predictor in subsequent modeling.
I included 10 models in the third stage of modeling. Based on DIC, the best model included additive effects of energy availability, distance to reservoir, and easting (Table 1). There were four competing models (Table 1) so I identified the model on which to base inference by looking at whether credible intervals overlapped zero. In the top model, credible intervals for energy availability (0.49, 0.30-0.67) and site size (0.004, 0.002-0.006) excluded zero, easting was borderline (0.01, 0.0005-0.01), and distance to reservoir included zero (−0.02, −0.06-0.01). In the second best model, the credible interval for the interaction term between distance to reservoir and energy availability included zero (0.009, −0.004-0.03). Therefore, I generated predictions and based inference on the model including energy availability, site size, and easting in an additive relationship. Energy availability, easting, and site size were positively related to duck counts (Fig. 1). Accounting for uncertainty in energy availability resulted in substantially wider credible intervals for both predicted duck counts and the group-level effect of energy availability on duck counts (Fig. 2).

DISCUSSION
Increasingly, investigators are recognizing the value of accounting for various sources of uncertainty in ecological studies. Observation uncertainty in predictor variables is one source that is rarely accounted for in ecology but can have substantial influence on overall inference (Carroll et al. 2006). Using a case study of duck habitat use among sites, I found that accounting for observation uncertainty in site-level covariates resulted in substantially less confidence in model predictions and effect sizes than if this uncertainty was ignored. The case study presented in this paper represents a common situation in ecological research where site-level predictors contain observation uncertainty, and my analyses confirm that researchers should account for this uncertainty in predictors or risk severely overestimating precision.
With the increasing use of Bayesian analyses, many researchers who are familiar with coding Bayesian models should find it straightforward to add a term to account for predictor uncertainty in their analyses. A simple rule that researchers can use to indicate whether accounting for predictor uncertainty is required is if they were planning on calculating site or some other group-level means of a predictor variable before inclusion in a model, then there is uncertainty in that variable and it should be accounted for. Studies involving vegetation sampling are one of the most common occurrences of predictor uncertainty because researchers typically collect a large amount of vegetation data to attempt to accurately represent a site and then use site-level means as covariates (e.g., Saveraid et al. 2001, Behney et al. 2018. In this situation, researchers can simply include a submodel in the overall hierarchical model that predicts the site-level vegetation means and use the predictions as a covariate in the primary model. In my case study, even after accounting for observation uncertainty, I detected a positive effect of energy availability on duck use. My results are consistent with other studies of duck habitat use during the nonbreeding season in that food or energy density was positively related to duck use but other factors contribute to how ducks distribute themselves among sites (Brasher 2010, O'Shaughnessy 2014 Fig. 1. The intercept in the count model was allowed to vary by site and site-level predictors were used to explain variation in the intercept (left column). The intercept was on the log scale, because I used a log link function in the count model. Back-transformed values of predicted number of ducks site −1 d −1 at the mean date during spring are shown in the right column. Predicted values are shown with heavy solid or dashed lines and 95% credible intervals are shown with dotted lines. Each circle represents the intercept for that site and its measured energy availability, size, or easting. The top row shows the effects of total energy at a site that is found at water depths less than 50 cm. The second row shows the effect of UTM easting coordinate, and the third row shows the effects of site size. All predictions were made from the model including energy + easting + size. When showing the effects of individual variables, other variables were held constant at their mean.
v www.esajournals.org 8 October 2020 v Volume 11(10) v Article e03273 BEHNEY 2017). I also found that site size and easting both contributed to explain duck counts in northeastern Colorado. Within the range of site-level predictor values observed in this study, energy availability and site size appeared to have the greatest effect on the number of ducks observed at a site. In other studies, factors that have been shown to influence duck use of sites during migration and winter include wetland type or vegetation patterns in wetlands (Hagy and Kaminski 2012, Beatty et al. 2014, O'Shaughnessy 2014, Osborn et al. 2017) and wetland structure/ morphology (Brasher 2010, Osborn et al. 2017 and landscape characteristics (Brasher 2010, Beatty et al. 2014). These studies did not account for observation uncertainty in predictor values which may be why they were able to detect an effect of vegetation density and I was not. Energetic carrying capacity models used for allocating resources for habitat acquisition or restoration assume that energy provided through these projects will be available and consumed by ducks (Central Valley Joint Venture 2006, Soulliere et al. 2007, Playa Lakes Joint Venture 2008. However, this and other research has shown that a variety of factors (e.g., geographic location, site size, vegetation patterns) can influence duck use of sites other than energy availability. If these factors are ignored, sites may not be exploited by ducks as predicted, resulting in a suboptimal use of resources for creation/enhancement of habitat projects.
In this study, site size and easting were both positively related to intensity of duck use of sites. Larger sites have previously been shown to support more ducks (Brasher 2010), and based on my results, the finding is not due to larger sites containing more total energy. My estimates of total energy availability only included energy found at depths shallow enough to be accessible to dabbling ducks. Larger sites contained proportionally less shallow water habitat than smaller, shallower sites (Behney 2020), and thus, energy availability was not positively correlated with size (r = −0.47). The relationship between site size and intensity of duck use was positive so it appears that ducks were selecting larger sites for reasons other than energy availability of the site. Ducks generally select larger, open water sites for roosting (Hopper 1968, Ringelman et al. 1989, possibly due to the lower perceived predation risk in open water (Behney et al. 2018), which may explain why we observed more ducks at larger sites. Furthermore, larger sites generally support greater species richness (MacArthur and Wilson 1967, Rosenzweig 1995, Connor et al. 2000, which may result in more total individuals at each site. My finding of more ducks using sites further east suggests there is a gradient of regional duck abundance ranging from more ducks in far eastern Colorado and fewer ducks as one moves west. Colorado lies at the far western edge of the Central Flyway, and if fewer ducks use the fringe of the flyway, I would expect a positive relationship with easting within Colorado. Alternatively, there tend to be more water features at the western edge (i.e., Colorado Front Range) of my study area, which may also result in fewer ducks on each individual water feature. However, the effect of easting was more a function of the extreme east sites supporting more ducks rather than the extreme western sites supporting fewer (Fig. 1).
Interestingly, I did not detect effects of distance to reservoir or river on duck counts. Ducks are known to select large reservoirs for roosting habitat (Hopper 1968, Ringelman et al. 1989) and would forage nearby if their goal was to minimize energy expenditure. In a meta-analysis, mean foraging flight distance for dabbling ducks was 5.1 km (Johnson et al. 2014). Many of my sites (n = 28) were farther than this reported mean foraging flight distance (range = 0-50 km), which may explain why I did not observe a relationship with distance to reservoir. Ducks are known to use the South Platte River, particularly during cold temperatures when other types of water features (e.g., reservoirs, wetlands) freeze (Hopper 1968, Ringelman et al. 1989. My study took place during spring, after water features thawed, so there likely was little selective pressure for ducks to use the river when so many other water features were available.
I attempted to estimate detection probability during duck counts through double counting. However, I always detected at least as many ducks during flush counts than vantage surveys so I only used flush counts in analysis and did not estimate detection probability. I acknowledge that failing to account for detection probability may have resulted in some bias in the count data but believe this to be minimal for two reasons. Firstly, the number of ducks I typically counted was low (75% of observations were ≤ 20 ducks, median = 5), which makes it easier to accurately count all individuals. Secondly, the main factor that might limit an observer's ability to see and count all ducks present at a site was vegetation that limited visibility. By flushing ducks out of any vegetation, observers could accurately count ducks as they flushed into the air. Flush counts are a relatively common way of counting ducks in areas where detection probability may otherwise be < 1 (Linz et al. 1998, St. James et al. 2013, Lindstrom et al. 2020. Inferences from this study are limited to spring migration and should not be used to infer patterns of habitat use during fall migration. Disturbance from waterfowl hunting plays a large role in habitat selection during the fall (Lancaster et al. 2015, Palumbo et al. 2019) but is absent during the spring. Therefore, managers wishing use this information to guide where habitat restoration takes place should keep in mind that habitat selection during fall migration may be different than what I present here. There is some evidence that ducks migrate faster and have shorter stopover durations during spring than fall (Nilsson et al. 2013), presumably to reach the breeding grounds earlier to increase reproductive success (Dzus and Clark 1998). Therefore, they may be using different migration corridors during spring to reach the breeding grounds early. I recommend future research incorporating both fall and spring habitat selection in the same region to discern any differences between the two periods.
These results can be incorporated into habitat planning strategies for nonbreeding ducks in a number of ways. Firstly, they confirm that providing abundant energy for ducks to consume should be a primary goal of habitat conservation or restoration projects. Secondly, factors other than energy can also be used to guide where and how habitat projects are carried out. For example, all other factors being equal, projects that are closer to reservoirs and farther east in Colorado will tend to provide the greatest value to ducks during spring. Furthermore, I show that Bayesian hierarchical models represent a good way to account for uncertainties that arise at multiple levels (i.e., hierarchical). This approach could be incorporated into many ecological studies to more accurately estimate uncertainty. Not only does appropriately accounting for such uncertainty present a more accurate depiction of v www.esajournals.org reality but also results in better predictions (Cressie et al. 2009).

ACKNOWLEDGMENTS
Funding for this research was provided by Colorado Parks and Wildlife's Wetland Wildlife Conservation Program, Avian Research Program, and a Playa Lakes Joint Venture ConocoPhillips grant. I would like to thank all the technicians who worked on this project, the landowners who graciously allowed access to their property, and Colorado Parks and Wildlife field staff for logistical support. J. Gammonley, J. Lemly, N. Marymor, B. Noonan, M. Reddy, G. Stoebner, B. Sullivan, and T. Verquer provided valuable input on the design or logistics of this project. I thank the National Socio-Environmental Synthesis Center (SESYNC) for statistical training. K. Aagaard and J. Runge provided valuable input on statistical analyses. This manuscript benefitted greatly from reviews by K. Aagaard and B. Walker.