Precision gain versus effort with joint models using detection/non‐detection and banding data

Abstract Capture-recapture techniques provide valuable information, but are often more cost-prohibitive at large spatial and temporal scales than less-intensive sampling techniques. Model development combining multiple data sources to leverage data source strengths and for improved parameter precision has increased, but with limited discussion on precision gain versus effort. We present a general framework for evaluating trade-offs between precision gained and costs associated with acquiring multiple data sources, useful for designing future or new phases of current studies. We illustrated how Bayesian hierarchical joint models using detection/non-detection and banding data can improve abundance, survival, and recruitment inference, and quantified data source costs in a northern Arizona, USA, western bluebird (Sialia mexicana) population. We used an 8-year detection/non-detection (distributed across the landscape) and banding (subset of locations within landscape) data set to estimate parameters. We constructed separate models using detection/non-detection and banding data, and a joint model using both data types to evaluate parameter precision gain relative to effort. Joint model parameter estimates were more precise than single data model estimates, but parameter precision varied (apparent survival > abundance > recruitment). Banding provided greater apparent survival precision than detection/non-detection data. Therefore, little precision was gained when detection/non-detection data were added to banding data. Additional costs were minimal; however, additional spatial coverage and ability to estimate abundance and recruitment improved inference. Conversely, more precision was gained when adding banding to detection/non-detection data at higher cost. Spatial coverage was identical, yet survival and abundance estimates were more precise.
Justification of increased costs associated with additional data types depends on project objectives. We illustrate a general framework for evaluating precision gain relative to effort, applicable to joint data models with any data type combination. This framework evaluates costs and benefits from and effort levels between multiple data types, thus improving population monitoring designs.

Despite advantages of combining multiple data sources, discussion is limited on precision gain versus effort or cost with these joint data models. Many joint model studies combine data already collected with ancillary information or information derived from existing databases (e.g., eBird [Sullivan et al., 2009, http://www.ebird.org]).
In these scenarios, there is little cost to incorporating additional data types, and cost-benefit analyses may not be necessary. In field studies, however, added costs for collecting additional data types may be considerable, making it desirable to evaluate trade-offs between precision gained from additional data and the data acquisition cost.
Consequently, projects frequently employ these techniques at smaller scales to reduce effort (time and money), potentially limiting the spatial scale of statistical inference (Zipkin et al., 2017; Zipkin & Saunders, 2018). Alternatively, presence-absence data for use in occupancy models (MacKenzie et al., 2002) and count data are considerably less expensive and thus can be collected at larger scales for similar cost.
These data types historically provided less information than more intensive sampling approaches, but new analytical approaches now allow estimation of survival, population gains from local recruitment and immigration, and abundance using these data types with dynamic N-occupancy models (Rossman et al., 2016; Zipkin et al., 2014, 2017). Here, we illustrate how joint models using detection/non-detection and banding data can be used to make inference on abundance at a larger spatial scale and with greater precision than was possible using single data type models. Our case study used an 8-year data set on western bluebird (Sialia mexicana) populations in ponderosa pine (Pinus ponderosa) forests of northern Arizona, USA. Our objectives were to: (a) estimate abundance, survival, and recruitment in a Bayesian hierarchical framework from separate models using detection/non-detection and banding data as well as a joint model using both data types, (b) compare precision of the resulting estimates among model types, and (c) evaluate differences in precision gain versus effort (a combination of time and money) among models. This approach results in a general framework for cost-benefit analysis that can be used to evaluate trade-offs between precision gained from and costs associated with collecting additional data types in designing future studies.

| Case study
Within ponderosa pine forests in the southwestern USA, fire is a common natural disturbance (Covington & Moore, 1994; Moir, Geils, Benoit, & Scurlock, 1997) and secondary cavity-nesting birds, like the western bluebird, rely on cavities in snags for nesting and protection from predators. Managers require information for predicting fire effects on avian community structure, especially in the Southwest where relatively little is known about avian responses (Bock & Block, 2005). Because employing capture-recapture monitoring schemes across the landscape is often cost-prohibitive, alternative study designs and model advancements could increase inference at larger spatial and longer temporal scales.

| Study area
Our study occurred in ponderosa pine forest within the Flagstaff Ranger District of the Coconino National Forest, northwest of Flagstaff, in north-central Arizona (for study location maps and study area details see: Latif, Sanderlin, Saab, Block, & Dudley, 2016; Sanderlin, Block, & Strohmeyer, 2015). Two wildfires (Horseshoe Fire and Hochderffer Fire) occurred in May and June 1996 within the study area. The Horseshoe Fire encompassed ~3,500 ha and the Hochderffer Fire encompassed >6,600 ha adjacent to the Horseshoe Fire. Burn severity, quantified by using the delta normalized burn ratio (dNBR; see description in "Data" section), in these areas ranged from low to high severity.

| Detection/non-detection
We sampled birds starting 1 year post-fire using the variable-radius point-count method (Reynolds, Scott, & Nussbaum, 1980

| Banding
Artificial nest boxes were distributed randomly among approximately half of the points from each fire-severity stratum (n = 72 points; 27 out of 59 high severity, 25 out of 50 moderate severity, 20 out of 40 unburned); each station had three nest boxes. After completing a point-count (see above), observers rewalked the transect line to check nest boxes for activity and band adult birds using the boxes.
Observers captured birds by target mist-netting (Ralph, Geupel, Pyle, Martin, & DeSante, 1993) or by removing incubating females from the nest, and banded each bird with a U.S. Geological Survey Patuxent Wildlife Research Center Bird Banding Laboratory (BBL) leg band on one leg and two BBL color bands in a unique combination on the other leg (Federal Bird Banding Permit Number 21653).
Juveniles were marked with a cohort band to identify the year of birth. Resighting occurred either when (a) observers removed a female incubating eggs from the nest and quickly read her bands, or (b) observers read the color bands of perched birds in the nest vicinity.
Observers attempted to recapture any juveniles resighted after their hatch year and mark them with unique BBL bands.

| Vegetation sampling
We sampled live tree basal area at all point-count stations using a 20-factor prism and snag basal area using a 5-factor prism. Unburned

| Data
Detection/non-detection data covered all points, whereas banding data were available only for nest box locations. Banding data were condensed to detections during primary periods (e.g., if an individual was detected at least once during a year, it was classified as "1", otherwise "0" for that year) due to inconsistent effort with secondary periods, which meant mark-resight models (Arnason, Schwarz, & Gerrard, 1991) were not possible. Therefore, banding data were used in Cormack-Jolly-Seber models (CJS; Cormack, 1964; Jolly, 1965; Seber, 1965) instead. We used full-identity birds only to reduce model complexity. We collapsed point-count detections to detection/non-detection data for use in dynamic N-occupancy models (Dail & Madsen, 2011; Rossman et al., 2016), and used detections within 100 m of point-count stations for correlating detection/non-detection data with area quantified by fire severity (see below).
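The two data reductions described above can be sketched in a few lines. This is a minimal illustration, not the authors' processing code: the function names and the toy input values are hypothetical, and the inputs are assumed to be simple nested lists (years by secondary periods for one banded bird, and points by sessions for counts).

```python
def collapse_to_primary(secondary_detections):
    """Collapse one bird's secondary-period detections (0/1) to one value per year.

    A year is scored 1 if the bird was detected at least once in that year,
    otherwise 0, which is the primary-period history used in a CJS model.
    """
    return [1 if any(year) else 0 for year in secondary_detections]


def counts_to_detection(counts):
    """Collapse point-count tallies to detection/non-detection (1 if count > 0)."""
    return [[1 if c > 0 else 0 for c in row] for row in counts]


# Hypothetical inputs: three years of resighting visits for one banded bird,
# and counts at two points over three sessions.
bird_history = [[0, 1, 0], [0, 0, 0], [1, 1, 0]]
point_counts = [[0, 2, 1], [0, 0, 0]]

print(collapse_to_primary(bird_history))   # [1, 0, 1]
print(counts_to_detection(point_counts))   # [[0, 1, 1], [0, 0, 0]]
```

Collapsing discards information on repeat detections within a year, which is the price of the inconsistent secondary-period effort noted above.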
We used dNBR generated from a comparison of Landsat TM imagery recorded before and after wildfire (Eidenshink et al., 2007, http://www.mtbs.gov/) to quantify burn severity. Raw dNBR values were compiled at a 30 × 30 m resolution, and a mean dNBR was calculated for a 100-m-radius neighborhood centered on the point-count station. We used the following model covariates for single data type and joint models: dNBR, an indicator for nest box location (nbox), time since fire (tfire; indicator for burned × time since fire), transect line (TR), snag basal area (snag), live tree basal area (live), point-count data observer (obs), and sex of banded adult bird (sex).
Transect line and observers were random effects, whereas all others were fixed effects. For numerical reasons, we used standardized covariates (mean zero and unit variance) for dNBR, snag basal area, and live tree basal area.
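As a small illustration of the standardization step, the sketch below centers and scales a covariate to mean zero and unit variance. The dNBR values are hypothetical, and the use of the sample standard deviation (rather than the population form) is an assumption, since the text does not say which was used.

```python
from statistics import mean, stdev


def standardize(values):
    """Return values centered to mean zero and scaled to unit variance.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical mean dNBR values for five point-count stations.
dnbr = [35.0, 210.0, 480.0, 95.0, 640.0]
z = standardize(dnbr)
print([round(v, 2) for v in z])
```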
FIGURE 1 Directed acyclic graphs (DAG) of joint data model including banding (B) and detection/non-detection (P) data for a western bluebird case study in ponderosa pine forests within Coconino National Forest in north-central Arizona, USA between 1999 and 2006. Notation is as follows: \(\lambda\) (expected count of individuals), \(N\) (abundance), \(\phi\) (apparent survival probability), \(S\) (number of individuals that survived), \(\gamma\) (recruitment), \(G\) (number of individuals gained through recruitment), \(p_P\) (point-count detection probability), \(p_B\) (banding detection probability), \(Z\) (latent alive matrix for banded individuals), \(Y_P\) (detection/non-detection data), and \(Y_B\) (banding data). For simplicity, case study regression coefficient parameters for \(\lambda\), \(\phi\), \(\gamma\), \(p_P\), and \(p_B\) were not included within the figure. Arrows indicate dependencies with parameters (circle nodes) and data (square nodes). Single arrows indicate probabilistic relationships, whereas double arrows indicate deterministic relationships. DAGs for the (a) first time period (t = 1) and (b) time periods after the first time period (t > 1) are displayed.

| Models
To evaluate precision gain with the joint model (Figure 1), we first constructed separate models for each data structure and then the joint model. Results from a simulation study using the joint model indicated that our model was valid using statistical properties of accuracy, bias, percent coverage, and Bayesian credible interval (BCI) length for abundance, survival, and recruitment parameter estimates (Supporting Information Appendices S1 and S2). Because the true parameter values were unknown for our case study data, we could not evaluate bias and accuracy. However, we evaluated precision of abundance, survival, and recruitment estimates (simulation study results indicated that patterns with precision were also reflected in patterns of bias and accuracy), and used the relative difference ([single model - joint model]/joint model) in BCI length as our response. We were also interested in quantifying any differences between point estimates of the joint versus separate models for each data structure.
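The response variable described above can be computed directly from posterior draws. The following is a minimal sketch under stated assumptions: the BCI is taken to be an equal-tailed 95% interval, the quantile rule is simple linear interpolation, and the posterior draws are simulated stand-ins rather than output from the case study models. All function names are hypothetical.

```python
import random


def quantile(sorted_x, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_x) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_x) - 1)
    frac = pos - lo
    return sorted_x[lo] * (1 - frac) + sorted_x[hi] * frac


def bci_length(draws, level=0.95):
    """Length of the equal-tailed credible interval (an assumed interval type)."""
    x = sorted(draws)
    tail = (1 - level) / 2
    return quantile(x, 1 - tail) - quantile(x, tail)


def relative_bci_difference(single_draws, joint_draws, level=0.95):
    """(single-model BCI length - joint-model BCI length) / joint-model BCI length."""
    single = bci_length(single_draws, level)
    joint = bci_length(joint_draws, level)
    return (single - joint) / joint


# Stand-in posterior draws for one survival parameter (not case study output).
rng = random.Random(1)
single = [rng.gauss(0.6, 0.10) for _ in range(4000)]
joint = [rng.gauss(0.6, 0.05) for _ in range(4000)]
print(round(relative_bci_difference(single, joint), 2))
```

A value of 0 means the joint model added no precision; a value of 1 means the single-model interval was twice as long as the joint-model interval.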

| Detection/non-detection data single model
We used dynamic N-occupancy models (Dail & Madsen, 2011; Rossman et al., 2016) to obtain demographic estimates (abundance, survival, recruitment) from detection/non-detection data using a state-space modeling approach. Dynamic N-occupancy models are more reliable when sites have fairly low densities (due to the reliance on detection heterogeneity to model abundance) and when studies have at least 75 survey sites and 5 years of data (Rossman et al., 2016).
Our study satisfied both criteria.
The state-space model, a first-order Markov process, describes two time series, the biological state process and the observation process, that run in parallel and incorporate both process and sampling error in the same framework (e.g., Buckland, Newman, Thomas, & Koesters, 2004). We used a detection/non-detection data matrix \(Y_P\), where element \(Y_{P,jkt}\) was a binary indicator of species detection.
When \(Y_{P,jkt} = 1\), a western bluebird was detected at point \(j\) (\(j = 1, \ldots, 149\)) during session \(k\) (\(k = 1, 2, 3\)) of year \(t\) (\(t = 1, \ldots, 8\)). Changes in abundance \(N_{jt}\) over time were a function of the biological state processes. We modeled abundance at each site \(j\) during the first year of sampling (\(t = 1\)) as a Poisson random variable (Equation 1):
\[N_{j1} \sim \text{Poisson}(\lambda_j) \quad (1)\]
where the expected count \(\lambda_j\) (Equation 2) was described using site-level covariates with parameters \(a_0\), \(a_1\), \(a_2\), and \(a_3\) and a transect effect \(a_{4R_j}\); each point \(j\) was located within a transect \(R\) (\(R = 1, \ldots, 15\)), indexed by \(R_j\). Parameter \(a_{4R}\) was a normal random effect for transect \(R\) with mean 0 and a Uniform(0, 5) prior on \(\sigma\) (i.e., \(a_{4R} \sim \text{Normal}(0, \sigma^2)\)). Parameters \(a_0\), \(a_1\), \(a_2\), and \(a_3\) had Normal(\(\mu = 0\), \(\sigma^2 = 0.1\)) priors.
Abundance at \(t > 1\) was a function of the number of individuals that survived (\(S_{jt}\)) and the number of individuals that were recruited (\(G_{jt}\)) from \(t-1\) to \(t\): \(N_{jt} = S_{jt} + G_{jt}\). We modeled \(S_{jt}\) (Equation 3) as:
\[S_{jt} \sim \text{Binomial}(N_{j,t-1}, \phi_{jt}) \quad (3)\]
where \(\phi_{jt}\) was the apparent annual survival probability from time \(t-1\) to \(t\) (Equation 4). The number of individuals gained, \(G_{jt}\), depended on \(\gamma_{jt}\), the expected number of individuals gained to site \(j\) between \(t-1\) and \(t\) (Equation 6). Parameters \(c_0\), \(c_1\), \(c_2\), \(c_3\), and \(c_4\) had Normal(\(\mu = 0\), \(\sigma^2 = 0.1\)) priors. This recruitment estimate included births and immigration, but empirical data suggested immigration was negligible (see next section).
FIGURE 2 Box and whisker plot for detection probability medians of all sampling locations by year for detection/non-detection data from the joint data model from a western bluebird case study in ponderosa pine forests within Coconino National Forest in north-central Arizona, USA between 1999 and 2006. Detection probability estimates from detection/non-detection data were originally for each secondary period but converted to primary periods.
We modeled the observation process (Royle & Nichols, 2003) (Equation 7) as:
\[Y_{P,jkt} \sim \text{Bernoulli}\left(1 - (1 - p_{P,jkt})^{N_{jt}}\right) \quad (7)\]
where \(p_{P,jkt}\) was detection probability (Equation 8), and each point \(j\), session \(k\), year \(t\) had observer \(o\) (\(o = 1, \ldots, 9\)), indexed by \(o_{jkt}\). Parameter \(d_{2o}\) was a normal random effect for observer \(o\) with mean 0 and a Uniform(0, 5) prior on \(\sigma\) (i.e., \(d_{2o} \sim \text{Normal}(0, \sigma^2)\)). We assumed there was no individual heterogeneity with detection probability. Priors for parameters \(d_0\) and \(d_1\) were Normal(\(\mu = 0\), \(\sigma^2 = 1\)) and Normal(\(\mu = 0\), \(\sigma^2 = 0.1\)), respectively.
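To make the state and observation processes concrete, the sketch below simulates one site forward in time. It is an illustration under explicit assumptions, not the fitted model: gains are drawn from a Poisson distribution with mean gamma, site-level detection follows the Royle-Nichols form 1 - (1 - p)^N, covariates and random effects are omitted, and all parameter values and function names are hypothetical.

```python
import math
import random


def rpois(rng, lam):
    """Poisson draw by Knuth's multiplication method (adequate for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def rbinom(rng, n, p):
    """Binomial draw as the number of successes in n Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_site(rng, lam, phi, gamma, p, n_years=8, n_sessions=3):
    """Return (N, Y): abundance per year and detection/non-detection per session."""
    N, Y = [], []
    for t in range(n_years):
        if t == 0:
            n_t = rpois(rng, lam)                    # initial abundance, mean lam
        else:
            survivors = rbinom(rng, N[t - 1], phi)   # S: survivors from last year
            gains = rpois(rng, gamma)                # G: gains (Poisson is assumed)
            n_t = survivors + gains                  # N = S + G
        N.append(n_t)
        # Probability that at least one of n_t individuals is detected.
        p_site = 1 - (1 - p) ** n_t
        Y.append([1 if rng.random() < p_site else 0 for _ in range(n_sessions)])
    return N, Y


rng = random.Random(42)
N, Y = simulate_site(rng, lam=2.0, phi=0.6, gamma=0.8, p=0.3)
print(N)
print(Y)
```

Only Y would be observed in the field; N is latent. Repeating the simulation across many sites shows why low densities help: the link between N and the detection frequency flattens quickly as N grows.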

| Joint model
Because we assumed these data structures were independent (see Supporting Information Appendix S3 for exploration of the independence assumption), we factored the joint posterior distribution into components (Equation 12), as also depicted in the directed acyclic graph (DAG) (Figure 1). For simplicity, case study regression coefficient parameters for \(\lambda\), \(\phi\), \(\gamma\), \(p_P\), and \(p_B\) were not included within Equation 12.
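One way to write a factorization consistent with the DAG in Figure 1 is sketched below. This is a reading of the figure's dependencies under the stated independence assumption, not a reproduction of the article's Equation 12; site and individual indices are suppressed, and the shared parameter \(\phi\) is what links the two data sources.

```latex
\begin{aligned}
[\lambda, N, S, G, \phi, \gamma, p_P, p_B, Z \mid Y_P, Y_B] \;\propto\;
  & [Y_P \mid N, p_P]\,[N_{1} \mid \lambda]
    \prod_{t>1} [S_t \mid N_{t-1}, \phi]\,[G_t \mid \gamma] \\
  & \times [Y_B \mid Z, p_B]\,[Z \mid \phi] \\
  & \times [\lambda]\,[\phi]\,[\gamma]\,[p_P]\,[p_B],
  \qquad N_t = S_t + G_t \quad (t > 1).
\end{aligned}
```

The first line corresponds to the detection/non-detection component, the second to the banding component, and the third to the priors; \(N_t\) for \(t > 1\) is deterministic given \(S_t\) and \(G_t\).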

| Inference
We conducted model selection for nested models using indicator variable selection (O'Hara & Sillanpää, 2009). Each coefficient \(\beta_i\) for our predictor variables was modified by a binary indicator parameter \(v_i\). For each single data type and joint model, we evaluated all coefficients for our predictor variables at the same time.
All indicator variables \(v_i\) had Bernoulli(0.5) priors. If the posterior mean for \(v_i\) was closer to one than zero, the covariate had more model support than if \(v_i\) was closer to zero. We defined strong model support as a posterior mean > 0.5. We implemented these Bayesian hierarchical models (Gelman, Carlin, Stern, & Rubin, 2004) in JAGS (Plummer, 2003). Convergence was reached (\(\hat{R}\) < 1.1 [Brooks & Gelman, 1998]). We assessed goodness-of-fit (GOF) using the squared loss statistic for a Bayesian p-value (Gelman et al., 2004:162).
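The support rule above reduces to averaging the sampled indicator values for each covariate. The sketch below shows that summary step only; the draws are hypothetical, the function name is an assumption, and it is not tied to any particular JAGS output format.

```python
def inclusion_support(indicator_draws, threshold=0.5):
    """Map covariate name -> (posterior mean of its indicator, strong support?).

    indicator_draws: dict of covariate name -> list of sampled 0/1 indicator values.
    """
    summary = {}
    for name, draws in indicator_draws.items():
        post_mean = sum(draws) / len(draws)
        summary[name] = (post_mean, post_mean > threshold)
    return summary


# Hypothetical MCMC draws of the indicators for two covariates.
draws = {
    "dNBR": [1, 1, 1, 0, 1, 1, 1, 1],
    "snag": [0, 1, 0, 0, 1, 0, 0, 0],
}
for name, (post_mean, strong) in inclusion_support(draws).items():
    print(name, post_mean, strong)
# dNBR 0.875 True
# snag 0.25 False
```

With a Bernoulli(0.5) prior, a posterior mean above 0.5 indicates that the data shifted support toward including the covariate.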

| Effort
We used estimated project costs for detection/non-detection and banding data to quantify effort, which was a combination of time and money. We evaluated effort with single and combined data sources.
Banding detection probability: \(\text{logit}(p_{Bi}) = e_0 + e_1 \times \text{sex}_i\).
To illustrate the types of costs that might be included within a study to evaluate effort and precision, we quantify costs here with data types specific to our study. We note that cost functions are often study-specific, but general cost categories exist with establishment, sampling unit, sampling occasion (and combinations of sampling unit by sampling occasion) components. We do not include costs associated with collecting covariate information, since relative costs would be the same in our case study for banding data and detection/non-detection data only models (e.g., we would include the same vegetation data in single data models). However, field costs may be substantial for collecting covariate data and differ between data types in other studies, so this would be important to include in such cases. We used the following cost function (\(C_P\)) for detection/non-detection data (Equation 13):
\[C_P = C_{0,P} + C_{1,P}\,s + C_{2,P}\,s\,k\,t + C_{3,P}\,k\,t \quad (13)\]
where \(s\) was the number of sampling units (\(s = 149\) points), \(k\) was the number of sampling occasions (\(k = 3\)), \(t\) was the number of years (\(t = 8\)), \(C_{0,P}\) was the initial project startup cost with detection/non-detection data, which included study design and equipment costs, \(C_{1,P}\) was the additional establishment cost per sampling unit for detection/non-detection data, \(C_{2,P}\) was the additional cost to sample each sampling unit per sampling occasion per year for detection/non-detection data, and \(C_{3,P}\) was the additional cost per sampling occasion per year for detection/non-detection data. For our example, we used the following cost estimates: \(C_{0,P}\) = $21,900, \(C_{1,P}\) = $19 (per point cost includes two observers' salaries and equipment costs), \(C_{2,P}\) = $12 (per point/occasion/year cost includes two observers' salaries), and \(C_{3,P}\) = $64 (per occasion per year cost for data entry with two observers). Based on our field study, we assumed that two observers (one biological technician, one crew leader) could sample or establish 20 points total per day (10 points per observer).
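As a worked example of the detection/non-detection cost function, the sketch below plugs in the cost estimates listed above. The additive form is inferred from the verbal definitions of the four cost terms (startup, per unit, per unit per occasion per year, per occasion per year) and should be read as an assumption; the resulting total is simple arithmetic on the listed values, not a figure reported in the study. The function name is hypothetical.

```python
def cost_detection(s, k, t, c0, c1, c2, c3):
    """Total detection/non-detection cost under an assumed additive form.

    c0: startup cost; c1: establishment cost per sampling unit;
    c2: cost per sampling unit per occasion per year;
    c3: cost per occasion per year.
    """
    return c0 + c1 * s + c2 * s * k * t + c3 * k * t


total = cost_detection(s=149, k=3, t=8, c0=21_900, c1=19, c2=12, c3=64)
print(total)  # 69179
```

Because the per-unit sampling term scales with s × k × t, it dominates the total here (12 × 149 × 3 × 8 = 42,912), so reducing sessions or years changes cost far more than reducing startup expenses.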
We used a separate cost function (\(C_B\)) for banding data (Equation 14). Based on our field study, we assumed that two observers (one biological technician, one crew leader) could sample or establish one box per hour.
We used the following cost function (\(C_J\)) for the joint data structures (Equation 15) of detection/non-detection and banding data:
\[C_J = C_{0,J} + C_{1,J}\,s + C_{2,J}\,s\,k\,t + C_{3,J}\,k\,t + C_{4,J}\,s + C_{5,J}\,s\,t + C_{6,J}\,t \quad (15)\]
where \(s\) was the number of sampling units (\(s = 77\) points for the \(C_{1,J}\) term, \(s = 149\) points for the \(C_{2,J}\) term, \(s = 216\) boxes for the \(C_{4,J}\) and \(C_{5,J}\) terms), whereas \(k\), \(t\), \(C_{0,J}\), \(C_{1,J}\), \(C_{2,J}\), \(C_{3,J}\), \(C_{4,J}\), \(C_{5,J}\), and \(C_{6,J}\) were the same as above, but for both detection/non-detection and banding data sources. For our example, we used the following cost estimates (note that cost estimates were not the same as above due to differing amounts of time allocated for sampling and how travel time was distributed between sampling methods): \(C_{0,J}\) = $28,400, \(C_{1,J}\) = $19 (per point cost includes two observers' salaries and equipment costs), \(C_{2,J}\) = $10 (per point/occasion/year cost includes two observers' salaries), \(C_{3,J}\) = $85 (per occasion per year cost for data entry with two observers), \(C_{4,J}\) = $105 (per nest box cost includes two observers' salaries and equipment costs), \(C_{5,J}\) = $46 (per nest box per year cost includes two observers' salaries), and \(C_{6,J}\) = $200 (equipment costs per year). Based on our field study, we assumed that two observers (one biological technician, one crew leader) could sample or establish 20 points total per day (10 points per observer) and sample or establish one box per hour.
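The joint cost can be worked through the same way. The sketch below assumes an additive form inferred from the per-term descriptions (per point, per point/occasion/year, per occasion per year, per nest box, per nest box per year, per year); the argument names are hypothetical, and the total is arithmetic on the listed estimates rather than a value reported in the study.

```python
def cost_joint(s_c1_points, s_points, s_boxes, k, t,
               c0, c1, c2, c3, c4, c5, c6):
    """Total joint-data cost under an assumed additive form.

    s_c1_points: sampling units entering the per-point establishment term;
    s_points: points sampled each occasion; s_boxes: nest boxes.
    """
    return (c0
            + c1 * s_c1_points        # per-point establishment
            + c2 * s_points * k * t   # per point, per occasion, per year
            + c3 * k * t              # per occasion, per year
            + c4 * s_boxes            # per nest box establishment
            + c5 * s_boxes * t        # per nest box, per year
            + c6 * t)                 # equipment, per year


total = cost_joint(s_c1_points=77, s_points=149, s_boxes=216, k=3, t=8,
                   c0=28_400, c1=19, c2=10, c3=85, c4=105, c5=46, c6=200)
print(total)  # 171431
```

Under these assumptions the nest box terms account for most of the total (105 × 216 + 46 × 216 × 8 = 102,168), which is consistent with the text's point that adding banding to detection/non-detection data is the expensive direction.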

| Estimates of abundance, survival, and recruitment
Model support for individual covariates varied among model types (Tables 1 and 2). To illustrate how precision of estimates varied among models, we used examples from two points (one with and one without a nest box) from three transects of high, moderate, and low/unburned fire severity. For apparent survival probability, both the joint and banding data only models had relatively constant survival over time, whereas survival with the detection/non-detection data only model increased slightly with increased time since fire at points of high and moderate severity (Figure 3). Precision increase was greatest for survival estimates, followed by abundance, especially at locations that did not have nest boxes, with minimal increases in precision for recruitment (Figures 4 and 5).

| Effort comparison
The largest difference in precision between estimates with single and joint data sources occurred with the apparent survival parameter. Because apparent survival estimates derived using banding data were more precise than apparent survival estimates derived using detection/non-detection data, there was little gained in precision by adding detection/non-detection data to banding data (but added cost also was minimal). In contrast, adding banding data to detection/non-detection data resulted in larger increases in precision, but also required significant increases in cost (Figure 6). Precision also increased for abundance estimates when banding data were added to detection/non-detection data, but again this addition required a significant cost increase (Figure 6).
Adding banding to detection/non-detection data resulted in minimal changes in precision for recruitment, despite the much higher cost involved (Figure 6).

| DISCUSSION
Demographic models that incorporate multiple data sources are often selected over single source models due to increased precision of resulting population parameter estimates (Besbeas et al., 2002; Schaub & Abadi, 2011). If precision were the only concern for a research program, the increased costs associated with collecting additional data types would not be a consideration. However, most research programs work with limited budgets, and evaluating precision gain versus effort for joint data models is warranted in such studies. This evaluation is particularly important when designing new studies or new phases of projects with multiple possible data collection opportunities. Inherently, some data sources will be more reliable than others, and costs relative to precision vary with the amount of effort required to sample across the landscape. Our general framework to evaluate differences in effort versus precision gain with one type of joint data model can also be applied to joint data models using other combinations of data types in the study design phase.
TABLE 1 Mean estimates of posterior support using indicator variable selection for model covariates in joint data and single data models with detection/non-detection and banding data from a western bluebird case study in ponderosa pine forests within Coconino National Forest in north-central Arizona, USA between 1999 and 2006. Note. Estimates marked with an "*" indicate strong model support (posterior mean > 0.5). "NA" indicated the parameter was not part of the model. Data sources included detection/non-detection and banding data. Parameters included \(N\) (abundance), \(\phi\) (apparent survival probability), \(\gamma\) (recruitment), \(p_P\) (point-count detection probability), \(p_B\) (banding detection probability).
TABLE 2 Median estimates (95% credible intervals) for model covariates in joint data and single data models with detection/non-detection and banding data from a western bluebird case study in ponderosa pine forests within Coconino National Forest in north-central Arizona, USA between 1999 and 2006. Note. We include all covariates that had strong model support with indicator variable selection (posterior mean > 0.50). "NA" indicated the parameter was not part of the model and "-" indicated that the covariate did not have strong model support. Parameters included \(N\) (abundance), \(\phi\) (apparent survival probability), \(\gamma\) (recruitment), \(p_P\) (point-count detection probability).
Our joint model combining occupancy and capture-recapture data yielded parameter estimates that were more precise than those resulting from single data type models, but the increase in precision varied by parameter. The greatest increase in estimate precision occurred for apparent survival probability, which was expected because this parameter was shared by both single data source models.
Abundance showed moderate improvement in estimate precision, followed by recruitment.
The amount of precision gained relative to cost also varied by data source. For apparent survival, estimates based on banding data were more precise than those derived from detection/non-detection data, so little precision was gained (but at minimal cost) when detection/non-detection data were added to banding data for the joint model. There was a gain, however, in spatial coverage and ability to estimate abundance and recruitment when adding detection/non-detection data to banding data because banding data were collected at a subset of locations where detection/non-detection data were collected. Detection/non-detection data collection is less time-intensive per area sampled and is a less invasive sampling method, which may be preferred for sampling threatened and endangered species. Conversely, adding banding to detection/non-detection data did not increase the spatial scope of inference since banding data were collected at a subset of detection/non-detection data locations, but resulted in larger increases in precision for survival and abundance. Precision increases with common parameters between data types are expected (Besbeas et al., 2002; Schaub & Abadi, 2011) (Fontaine & Kennedy, 2012; Kotliar et al., 2002; Saab, Russell, & Dudley, 2007).
In the detection/non-detection data only and joint data models, the snag covariate had a positive relationship with recruitment, but a negative relationship with survival. The relationships were not significant, however, with the joint model. Because the western bluebird is a cavity-nesting species, snags are likely to contribute to a positive relationship with recruitment (e.g., Saab, Powell, Kotliar, & Newlon, 2005; Wightman & Germaine, 2006). We did not include interaction effects within the model, but we expected an interaction between snag basal area and time since fire because snags decline over time in severely burned areas.
FIGURE 6 Violin plots showing the difference in relative Bayesian credible interval (BCI) length (a measure of precision) between models built from single and joint data sources using data from a western bluebird case study in ponderosa pine forests within Coconino National Forest in north-central Arizona, USA between 1999 and 2006 relative to cost (USD) of adding additional data sources. Relative difference in BCI length was calculated as (BCI length single - BCI length joint)/BCI length joint, so larger numbers equate to more precision gained by incorporating multiple data types. Individual plots show the kernel density distribution across all points sampled. Phi band (left) shows the gain in precision for apparent survival estimates when detection/non-detection data were added to existing banding data. Phi pnt, N pnt, and G pnt (left to right in right hand group) show the increase in precision of estimates for apparent survival, abundance, and recruitment, respectively, when banding data were added to existing detection/non-detection data.
We envision extending the integrated population model to include information on nest data to better inform the recruitment component for a full integrated model. We expect to have increased precision with the recruitment parameter, with no additional costs due to the way these data were collected in this study and parameterized within the cost equation. Depending on data collection limitations, additional data sources will vary in additional costs and effort.
Evaluating value added from different data sources relative to cost will depend on the objectives of a given project, as well as available resources. The explicit use of costs within our study establishes a framework for accomplishing this evaluation, and could be coupled with optimization procedures (e.g., Sanderlin, Block, & Ganey, 2014) to maximize accuracy (with a simulation study, e.g., Supporting Information Appendix S1) or precision (like this case study) subject to cost constraints. For example, if estimating recruitment precisely was the primary objective, our case study indicated that adding an additional data source carried high costs but resulted in limited gains in precision. The ability to evaluate costs versus precision, bias, and/or accuracy in parameter estimates is valuable for targeting where to allocate limited resources to meet study objectives and to evaluate power for a given effect size. Further, sampling design trade-offs involving not only the inclusion or exclusion of specific data sources, but also different levels of effort with each data source, could be evaluated with respect to parameter accuracy and associated costs within our framework, and warrant future exploration. For example, in our simulation study (Supporting Information Appendix S1), both number of banding sites and detection/non-detection sessions were important for estimating apparent survival (with banding sites being more important), while number of detection/non-detection sessions was more important for estimating abundance and recruitment. In conclusion, our general framework to evaluate differences in effort versus precision gain with one type of joint data model is applicable to other data type combinations of joint data models, and can also be used to evaluate trade-offs with different levels of effort within each data source.
This framework allows research and monitoring programs to evaluate optimal use of limited funds when multiple data sources are available within the study design phase to meet study objectives.

ACKNOWLEDGMENTS
We thank S. Auza, K. Covert, K. Cobb, J. Dwyer, L. Doll, L. Dickson, L. McGrath, T. Pope, G. Martinez, J. Iniguez, and S. Vojta for collecting point-count, banding, and vegetation data. National Fire Plan and Joint Fire Science Program (01-1-3-25) provided study funding. We also thank J. Dudley for calculations of dNBR and two anonymous reviewers for helpful comments on earlier drafts.

CONFLICT OF INTEREST
None declared.

AUTHORS' CONTRIBUTIONS
WB and VS conceived the overall project ideas, and JS, WB, and JG conceived the methodological application ideas for the manuscript; JS designed the modeling methods; WB, BS, and VS designed the sampling methodology; WB, BS, and VS collected the data; JS analyzed the data; all authors interpreted the data; JS led the writing of the manuscript. All authors contributed critically to the drafts and gave final approval for publication.

DATA ACCESSIBILITY
Data are available through the USDA Forest Service Data Archive (Block, Strohmeyer, & Sanderlin, 2018, https://doi.org/10.2737/