Effective sampling area is a major driver of power to detect long-term trends in multispecies occupancy monitoring

Occupancy-based monitoring has become an important tool in wildlife conservation and management. Nonetheless, meeting occupancy modeling assumptions and providing biologically accurate information are difficult tasks over long time periods, large areas, or when monitoring multiple species. In occupancy modeling frameworks, derived grids are commonly used to divide landscapes into discrete units. Grid sizes that match the home range size of the species of interest are considered optimal, but this practice is complicated as home range size may vary by sex, habitat quality, or among species. Additionally, studies often assume their survey methods sample an entire grid cell when the actual effective sampling area may be much smaller. The effect of reduced effective sampling area on occupancy estimation has received little attention to date, despite being flagged as a critical issue. In this study, we assessed (1) how the relationship between effective area, home range size, and grid size affects power to detect trends in occupancy; (2) how varying the sampling design factors of effective area, duration, detection probability, and resurvey interval influences monitoring efficiency; and (3) whether a single sampling design can simultaneously detect declines in two species with different home range sizes. We used a spatially explicit simulation framework to create biologically realistic declining populations over 10 yr and assessed statistical power to detect known declines using occupancy modeling. We found that effective area and detection probability had the greatest influence on statistical power. We could not reliably detect declines when detection probability was low or when effective sampling area was < 1/4 cell. We conclude that failing to account for effective area less than the cell size will result in overestimation of statistical power. Our simulations suggest occupancy models can detect declines for two species with different home range sizes using the same grid cell size under certain conditions, for instance, surveying > 25% of the landscape, ≥ 25% effective area, and fixed sampling locations. Further, increasing resampling interval greatly increased monitoring efficiency. Our results show monitoring planning requires explicit consideration of effective sampling area and methods with sufficient detectability to detect population declines.


INTRODUCTION
Occupancy monitoring, or monitoring change in the proportion of an area occupied by a species over time, has become a commonly used tool for describing wildlife populations and for use in conservation and management over the past two decades (MacKenzie et al. 2017). Recently, there have been concurrent advances in both occupancy modeling (Guillera-Arroita 2017, Valente et al. 2017) and non-invasive field detection methods for wildlife such as passive acoustic devices (Sugai et al. 2019, Wood et al. 2019a), genetic sampling (Schwartz et al. 2007), remote cameras (Long et al. 2008, Rich et al. 2017), and environmental DNA (Franklin et al. 2019, Harper et al. 2019). Additionally, there is a growing trend toward multispecies occupancy monitoring in wildlife (Pavlacky et al. 2017, Banner et al. 2019, Baumgardt et al. 2019, Wood et al. 2019b). Surveying for multiple species simultaneously allows for increased efficiency and provides insight into community ecology (Manley et al. 2004, Sanderlin et al. 2014). Although such methods may appear cost effective, it is unclear whether multispecies surveys achieve species-specific goals, particularly in detecting population trends for multiple species in a single survey effort. Moreover, multispecies monitoring may not be effective for rare species that are often more difficult to detect or isolated geographically (Manley et al. 2005, Raphael and Molina 2007).
Occupancy is a simplified and more often reported metric for species monitoring compared with abundance, which is logistically difficult and expensive to obtain, especially for rare or cryptic species distributed across large landscapes (Kendall et al. 2008, Roffler et al. 2019). Despite widespread use of occupancy modeling paired with large-scale survey efforts, in some situations sampling sufficient to detect population declines may be nearly impossible, even with intensive monitoring scenarios (Manley et al. 2004, Ellis et al. 2014, Latif et al. 2018). Consequently, ensuring an occupancy monitoring sampling design has sufficient statistical power (i.e., the ability to statistically detect a change if present) to detect population trends, for one or more species, is an essential component of effective wildlife conservation and management.
A critical consideration in occupancy estimation is the assumption of closure, meaning sites are closed to change in population status (i.e., sites are either occupied or unoccupied and do not oscillate between those states) during the sampling period (e.g., discrete ponds, isolated vegetation patches, short time periods). Many authors have acknowledged that this closure assumption is often violated in empirical studies (Efford and Dawson 2012, Otto et al. 2013, Valente et al. 2017). Violation of the closure assumption has been found to bias estimates, usually resulting in overestimates of predicted occupancy (Rota et al. 2009), a bias that is particularly concerning when monitoring rare or declining species. Nonetheless, occupancy estimation is frequently used in continuous vegetation types where home ranges span multiple sites and thus the closure assumption is violated via movement within a home range (Efford and Dawson 2012, but see Valente et al. 2017). While some have proposed definition-based solutions, such as treating "occupancy" as "use" rather than true occupancy (Latif et al. 2016), this is simply a change in definition that makes the index more removed from abundance and does not address the root of the closure assumption violations.
Field sampling data to inform occupancy models are often based on grids to create discrete spatial units. The most biologically relevant grid cell size (hereafter, cell size) is optimized at target species' home range size because this scaling is when occupancy and abundance are most closely correlated (Noon et al. 2012, Linden et al. 2017, Steenweg et al. 2018). Studies that estimate cell-level occupancy often include the assumption that the survey method effectively samples the entire cell (e.g., the North American Bat Monitoring Program with 10 × 10 km cells, Banner et al. 2019; Breeding Bird Surveys with 5 × 5 km cells, Zuckerberg et al. 2009, Rosenberg et al. 2019). Nonetheless, field survey techniques may actually sample a relatively small area (e.g., 50-500 m; camera traps, acoustic recorders, avian point counts; see Burton et al. 2015, Furnas and Callas 2015, Wilton et al. 2016). Effective sampling area, hereafter referred to as "effective area," can be defined as a combination of the probability a survey (either passive device or surveyor) detects an animal when encountered and the probability of an animal encountering the device and/or surveyor when the animal moves within some larger area, usually assumed to be an animal's home range. When effective area is substantially smaller than the cell size, survey devices are in essence acting as point sampling devices where the cell size simply acts to control the spacing between sampling sites (Steenweg et al. 2018). As such, monitoring programs failing to account for their effective area potentially being smaller than their cell size may be overly optimistic in their expectations of observing population changes. Quantifying the effect of effective area in occupancy monitoring has not been accomplished to our knowledge, although this challenge has been alluded to within Efford and Dawson (2012).
Thus, we see two primary knowledge gaps that are critical for effective long-term occupancy monitoring: (1) how effective area affects the ability to detect trends in occupancy; and (2) how effective area, home range size, and cell size interact. Additionally, monitoring design requires assessment of logistical trade-offs or how to allocate survey efforts to maximize effectiveness with finite resources (Baumgardt et al. 2019). Previous studies have identified a variety of other factors that can affect power to detect trends in monitoring, including (but not limited to): per-visit detection probability (Guillera-Arroita and Lahoz-Monfort 2012), number of visits (Ellis et al. 2014), sampling approach (Sanderlin et al. 2014, Specht et al. 2017), survey duration (Steenweg et al. 2018, Kays et al. 2020), and optimal monitoring interval (e.g., every 1, 2, or 3 yr, Rhodes and Jonzén 2011). As such, our objectives were to (1) assess how the relationship between effective area, home range size, and grid size affects statistical power to detect trends in occupancy estimation; (2) evaluate how sampling factors (per-visit detection probability, number of visits, proportion of landscape sampled, and resurvey interval) influence monitoring efficiency; and (3) determine whether, and under what conditions, a single sampling design has sufficient statistical power to simultaneously detect declines in two species with different home range sizes as a starting point to validate multispecies occupancy monitoring efforts. We used a spatially explicit, individual-based simulation framework (Ellis et al. 2015) to create biologically realistic declining populations and then assessed statistical power to detect known declines (20%, 50%) using occupancy modeling. We simulated two species, with different home ranges between species and between sexes within species. We used two sampling grid sizes, approximating the home range size for each species, to investigate how trends derived from occupancy monitoring could be influenced by grid size and the potential match or mismatch with home range size. Such extensive simulations have not been comprehensively documented, and we suggest understanding the interaction of field components and monitoring is an essential step in understanding the utility of one of wildlife biology's most prevalent tools, occupancy modeling, for documenting species distributions and trends.

METHODS
We conducted a series of spatially explicit simulations using the package rSPACE (Spatially based Power Analyses for Conservation and Ecology) in R (Ellis et al. 2014, R Development Core Team 2017). In summary, this simulation approach involved four steps (two biological and two observational): (1) simulating a biological landscape, and populations of individuals; (2) simulating an abundance decline by removing individuals over a pre-defined rate and time frame; (3) sampling the simulated landscape using two grid sizes (reflecting small and large home ranges) to create presence-absence encounter histories to estimate detection probability and annual occupancy; and (4) fitting a trend model to annual occupancy estimates and assessing statistical power to detect trend. We describe the parameters used at each step and justification for selecting them below. We refer to each combination of parameters as a simulation scenario.

Simulated landscape and populations
To realistically simulate two different home range sizes (small and large), we selected two closely related and well-studied mustelid species that have many similarities in physiology, behavior, and habitat, but differ in home range size: Pacific marten (Martes caurina, representing the small home range) and fisher (Pekania pennanti, representing the large home range). These species are sexually dimorphic, with males larger than females, often with males having home ranges ~2-3 times larger (fisher male ~30 km², female ~10 km²; marten male ~6 km², female ~3 km²; Powell 1994, Spencer et al. 2015b, Moriarty et al. 2017). All parameters related to movement and spatial arrangement were derived from these two species (Table 1), although they are meant to be representative for comparative purposes for any hypothetical animal. We selected these species because both have been petitioned for listing or listed under the Endangered Species Act (FWS 2016, 2018, 2020), and we have existing data to inform spatial parameters from both a long-term monitoring program (Zielinski et al. 2013) and numerous radio-telemetry studies (Sweitzer et al. 2016, Moriarty et al. 2017). For each simulation scenario, we conducted 250 randomized replicates of the process described below. Between replicates within a scenario, initial placement, addition and removal of individuals, and sampled cells varied. Within a replicate, cells sampled were fixed and cumulative over the entire monitoring period, meaning the same cells were sampled each year (fixed locations). For example, when comparing power to detect a 20% decline when sampling 5% vs. 25% of the landscape, the 100 cells for the 25% landscape scenario included the same 20 cells from the 5% landscape scenario.

Biological simulations
Simulated populations.-We conducted simulations on a uniform 50 × 200 km landscape with no underlying vegetation or predicted habitat covariates. We chose the long rectangular shape to approximate the shape of mountain ranges in which monitoring occurred (Zielinski et al. 2013). To investigate how population size and density affect statistical power to detect trend, we simulated three starting population sizes (N = 150, 250, and 400) to reflect a biologically realistic range of animal densities expected in populations of interest for monitoring.
For each species, we distributed individual home range centers across the landscape and assigned each individual a utilization distribution, modeled as a random distribution of individual activity centers based on parameters defining spatial use within and between home ranges (Table 1). Due to well-documented differences in home range size between sexes in many carnivore species, we defined spatial parameters by sex and placed female home ranges on the landscape in greater proportion than males (Powell 1994). To reflect territoriality, we made utilization distributions roughly uniform within each territory radius, with a 5% decrease in probability density between the movement center and territory edge. We also added a uniform, low-probability tail that extended a sex-specific distance outside a territory radius to reflect extra-territory movements (Moriarty et al. 2017). Because monitoring surveys are often completed within a single season, we assumed the potential for long-distance movements was low such that individuals spent 95% of their time within and 5% outside of their territory (Ellis et al. 2014). Finally, we combined individual utilization distributions to produce an encounter probability surface describing the probability of encountering at least one individual at any point on the landscape (Ellis et al. 2014, 2015).
Table 1 notes: † Effective area is expressed as full cell, 1/4 cell, and 1/16 cell. ‡ Landscape sampled is expressed as a percentage of grid cells.
Simulated declines.-After establishing a baseline population using the initial population size parameter for each scenario (year 0), we simulated two different magnitude declines, a 20% or 50% abundance decline, over a 10-yr period (years 1-10) according to an exponential model of population growth (λ = population growth rate) by randomly removing individuals successively over each time step. Therefore, the total monitoring period was 11 yr (year 0 baseline population + years 1-10 of population decline). To reflect natural variation due to turnover (e.g., individuals die and are replaced by new offspring or immigrants), we added a turnover parameter based on the observed average proportion of new individuals per year in >10-yr fisher and marten capture studies (R. Green, personal communication; K. Moriarty, unpublished data). We removed the turnover percentage of individuals each time step and then randomly added the same number of new individuals according to the same spatial parameters used to define the initial population (see Appendix S1 for starting parameters).
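The decline-with-turnover procedure can be illustrated with a minimal Python sketch. This is not the rSPACE implementation: the function names, the ordering of turnover and removal within a time step, and the turnover value of 0.1 are assumptions for illustration only (the turnover values actually used are in Appendix S1). The annual growth rate follows from the exponential model, where the population after t years is the starting population multiplied by λ^t.

```python
import random

def annual_growth_rate(total_decline, years):
    """Per-year multiplier (lambda) such that N_years = N_0 * lambda**years."""
    return (1.0 - total_decline) ** (1.0 / years)

def simulate_decline(n0, total_decline, years=10, turnover=0.1, seed=0):
    """Return population sizes for year 0 through `years`.

    `turnover` is a hypothetical value, not taken from the study.
    """
    rng = random.Random(seed)
    lam = annual_growth_rate(total_decline, years)
    population = list(range(n0))      # individual IDs
    next_id = n0
    sizes = [len(population)]
    for year in range(1, years + 1):
        # Turnover: remove a fraction of individuals and add the same
        # number of new individuals (assumed to happen before the decline).
        n_turn = round(turnover * len(population))
        rng.shuffle(population)
        population = population[n_turn:]
        population += list(range(next_id, next_id + n_turn))
        next_id += n_turn
        # Decline: randomly remove individuals down to the exponential target.
        target = round(n0 * lam ** year)
        rng.shuffle(population)
        population = population[:target]
        sizes.append(len(population))
    return sizes
```

Under these assumptions, a 50% decline over 10 yr corresponds to λ of about 0.933, and a starting population of 250 ends at 125 individuals.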

Observational simulations
Simulated surveys and monitoring.-We overlaid two sampling grids across the landscape reflecting the different home range sizes simulated. The small home range grid, modeled after a marten home range, had 6.25-km² cells (Moriarty et al. 2017) and included 1600 cells, and the large home range grid, modeled after a fisher home range, had 25-km² cells and included 400 cells (Thompson et al. 2010, Sweitzer et al. 2016). We nested these grids such that one large home range cell contained four small home range cells. We then simulated three successively smaller effective areas within each cell: full cell, 1/4 cell, and 1/16 cell. Simulating a full cell effective area mimicked the assumption that survey methods effectively sample an entire cell. The 1/16 cell size was chosen to approximate the small effective areas of a non-invasive survey device found in empirical field studies (effective survey radius 50-500 m, Furnas and Callas 2015, Wilton et al. 2016, Tucker et al. 2020). The intermediate 1/4 cell size was designed to approximate the effective area of survey arrays of multiple devices (e.g., 3-12 devices in a cluster) commonly used in carnivore survey protocols (Kucera 1995, Zielinski et al. 2013). We located effective area centers at random locations in each cell; we fixed these locations over time across all sampling years for each simulation replicate.
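The grid geometry described above can be checked with a short Python sketch. The function and variable names are hypothetical, and the equivalent-circle radius assumes a circular effective area, which is a simplification introduced here rather than a property of the simulations.

```python
import math

LANDSCAPE_KM = (50, 200)  # landscape dimensions from the text

def grid_summary(cell_area_km2, fractions=(1.0, 0.25, 0.0625)):
    """Return (number of cells, {fraction: effective area in km^2})."""
    landscape_area = LANDSCAPE_KM[0] * LANDSCAPE_KM[1]
    n_cells = int(landscape_area / cell_area_km2)
    effective = {f: cell_area_km2 * f for f in fractions}
    return n_cells, effective

def equivalent_radius_m(area_km2):
    """Radius (m) of a circle with the given area; assumes a circular footprint."""
    return math.sqrt(area_km2 / math.pi) * 1000.0

small_cells, small_eff = grid_summary(6.25)   # marten-sized grid
large_cells, large_eff = grid_summary(25.0)   # fisher-sized grid
```

With these inputs the small grid has 1600 cells and the large grid 400. The 1/16 effective area of a 6.25-km² cell is about 0.39 km², which corresponds to an equivalent radius of roughly 350 m under the circular assumption.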
Because our goal was to make the simulations as realistic as possible, an individual's home range could span multiple adjacent cells, potentially violating the closure assumption (MacKenzie et al. 2002, 2003). Therefore, occupancy represented use, similar to empirical studies (Latif et al. 2016). Detection probability (p) was thus the product of true detection probability (probability of detecting an individual if present; defined by the simulation parameter p_sim) and the probability at least one individual was present in the cell. The estimated per-visit detection probability (p) output by the simulation model was defined as: p = p_sim × probability of presence. We used two values for p_sim, 0.2 and 0.7, to parameterize simulations with a low to moderate range of detectability reflecting estimates of p found in empirical field studies. For instance, we would expect low detectability in cases where spotted owls (Strix occidentalis) vocalize less frequently in the presence of barred owls (Strix varia) (Bailey et al. 2009) or with passive, un-baited cameras relying on an animal walking within the range of a sensor. However, actual p is specific to each scenario depending on the simulation parameters. For instance, in the scenario for a large home range species with 250 individuals, effective area equal to the full cell, and moderate detectability (p_sim = 0.70), the resulting mean per-visit detectability was p = 0.37.
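The relationship between p_sim, presence, and the realized per-visit detection probability can be sketched in Python as follows. The function names are hypothetical, and combining individual use probabilities by assuming independence among individuals is an assumption made for this illustration; the actual encounter surface is built by rSPACE as described in Ellis et al. (2014, 2015).

```python
def prob_at_least_one(individual_probs):
    """Probability that >=1 individual is present, assuming independence."""
    prob_none = 1.0
    for q in individual_probs:
        prob_none *= (1.0 - q)
    return 1.0 - prob_none

def per_visit_detection(p_sim, prob_presence):
    """p = p_sim x probability of presence, as defined in the text."""
    return p_sim * prob_presence

def implied_presence(p, p_sim):
    """Back-calculate the mean probability of presence implied by p and p_sim."""
    return p / p_sim
```

For the example in the text (p_sim = 0.70, mean p = 0.37), the implied mean probability of presence is about 0.53. This back-calculation is only a rough reading of the reported mean, not a value reported in the study.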
To survey the landscape, we calculated the probability of ≥1 individual being present within each effective area. We created 11-yr encounter histories of presence/absence in a cell (year 0 to year 10). We then modified encounter histories to reflect imperfect detection (detection probability <1) using the methods described in Ellis et al. (2014, 2015). From these modified encounter histories, we assessed statistical power for small and large home range populations on both cell sizes (25 and 6.25 km²), across a wide range of spatial sampling intensities (5-95% of landscape [cells] sampled) to assess the efficacy of monitoring species with home ranges that are larger or smaller than the cell size.
We used the encounter histories we created as input to program MARK and obtained occupancy estimates by fitting Robust Design Occupancy models (White and Burnham 1999) using the R package RMark (Laake et al. 2013, Ellis et al. 2014, R Development Core Team 2017; Appendix S1). We used the variance components procedure to fit a linear random effects trend model to the annual occupancy estimates. To account for sampling effort, we applied a finite population correction to the trend parameter, which reduces the sampling variance by a factor of (N − n)/N, where N is the total cells in the grid and n is the number of sampled cells. We defined a trend as successfully detected if (1) the trend was in the correct direction; and (2) the 90% confidence interval of the trend parameter excluded zero. We estimated statistical power as the percentage of replicates for each simulation in which the known declining trend was correctly detected. Additional details regarding the simulation structure and trend analysis are found in Ellis et al. (2014).
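The detection criteria and power calculation can be summarized in a small Python sketch. The names are hypothetical, and the confidence interval here uses a normal approximation, which is an assumption; the study obtained trend estimates and their variances from the variance components procedure in program MARK.

```python
import math

Z_90 = 1.6449  # two-sided 90% quantile of the standard normal (assumed CI form)

def fpc_variance(variance, n_sampled, n_total):
    """Apply the finite population correction (N - n)/N to a sampling variance."""
    return variance * (n_total - n_sampled) / n_total

def trend_detected(trend, variance, n_sampled, n_total, true_direction=-1):
    """True if the trend has the correct sign and its 90% CI excludes zero."""
    se = math.sqrt(fpc_variance(variance, n_sampled, n_total))
    lower, upper = trend - Z_90 * se, trend + Z_90 * se
    correct_direction = trend * true_direction > 0
    excludes_zero = lower > 0 or upper < 0
    return correct_direction and excludes_zero

def power(replicates, n_sampled, n_total):
    """Fraction of (trend, variance) replicates in which the decline is detected."""
    hits = sum(trend_detected(t, v, n_sampled, n_total) for t, v in replicates)
    return hits / len(replicates)
```

As a usage example with made-up replicate values, `power([(-0.05, 0.0004), (-0.01, 0.0004), (0.02, 0.0004), (-0.04, 0.0004)], 100, 400)` counts two of four replicates as detections. Note that the correction shrinks the variance to zero when every cell is sampled.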
Factors influencing monitoring effectiveness.-To assess the effect of survey duration on statistical power, we tested three intervals representing short, moderate, and long duration surveys (3, 5, 10 visits, respectively). These simulations were not defined by specified time frame, but rather by a per-visit detection probability so that results were generalizable across a wide variety of studies. In practice, a visit can represent either a biological interval (e.g., 5 d to allow a marten to move throughout its range, Moriarty et al. 2017) or a logistical interval (e.g., fieldwork schedule). We compared our simulation detection probabilities with empirical fisher and marten detection history data and found similar per-visit detection probabilities were achieved with field survey durations of~6-10 d. Therefore, we conceptualized the survey durations for the simulation's 3, 5, and 10 visits as approximately 18-30, 30-50, and 60-100 d periods.
We varied the resample interval from 1 to 3 yr to assess opportunities to optimize efficiency over time. We summarized total field effort as number of cells × the number of years surveyed and then evaluated power vs. effort for surveys every 1, 2, or 3 yr. Lastly, as we focused on large monitoring areas (e.g., mountain ranges, states) which can be logistically difficult to sample, we simulated the effect of sampling irregularities where 20% of cells were randomly coded as missing data each time step to reflect common field logistical problems that prevent accomplishing sampling targets (e.g., bad weather, wildfire, broken vehicles).

RESULTS

Population size
Estimated occupancy was high for populations with large home ranges, ranging from 60% to 90% (starting N = 150-400), with occupancy for the small home range populations considerably lower, ranging from 20% to 45% according to the starting population size. Detection probability and power increased with population size, except for the large home range population, where statistical power decreased slightly as the population increased from 250 to 400 (Appendix S1). For clarity, in subsequent sections we report results for only the intermediate population size (N = 250), with complete results for all scenarios presented in Appendix S1.

Detection probability
Detectability was higher for the small home range populations and increased with population size for both home range sizes. Decreasing detectability resulted in an increase in the variation around occupancy estimates, reducing statistical power to detect trend (Fig. 1). Low detectability simulations were unable to detect a 20% abundance decline even when sampling 95% of the landscape; declines of 50% were not detectable unless the effective area was equal to the full cell and surveys were of long duration (10 visits) or across a large proportion of the landscape (≥75%). Notably, for most simulation scenarios, power with low detectability with 10 visits was the same or worse than moderate detectability simulations with only three visits (Appendix S1). As low detectability simulations were ineffective at detecting population declines, we report only moderate detectability simulation results in subsequent sections.

Fig. 1. Estimated occupancy over time for four potential study designs illustrating trade-offs between detectability and sampling intensity. Rows present two different detection probability parameterizations (A) low and (B) moderate with columns depicting two different sampling intensities (50% vs. 5% of the landscape). Transparent dots represent individual replicate results and box plots summarize occupancy over all replicates. When the spread of the box plots was large, detecting a biologically meaningful trend was difficult. Simulation parameters shown: initial population size = 250, 50% abundance decline (λ = 0.933) over 10 yr (Years 2-11), effective area = full grid cell, home range size = large.

www.esajournals.org | May 2021 | Volume 12(5) | Article e03519

Effective area
Detection probability (p) declined as effective area decreased for both small and large home range populations, with a reduction in effective area to 1/4 of their respective cell sizes resulting in a 40-50% decrease in p (Fig. 2). Notably, these 1/4 cell effective area detectability estimates more closely align with per-visit detection probability reported from empirical field data (Zielinski et al. 2013). Because reducing effective area greatly lowered estimated detection probability, it also drastically reduced statistical power. The smallest effective area simulated (0.39 km²), which we use as representative of the effective area of a single device, was ineffective at detecting trend for any scenario or cell size. Decreasing effective area to 1/4 cell increased required sampling to detect a 50% decline with 80% statistical power from ~5% of the landscape to ~25%. Further, reducing effective area to 1/16 cell resulted in insufficient power for either small or large home range populations unless a large percentage of the landscape was sampled (>75%) and surveys were of very long duration (Figs. 3, 4C).
Increasing proportion of the landscape sampled, effective area, and visits each decreased variation around estimates of population trend, thereby increasing ability to detect declines (results for large home range simulations in Fig. 4, small home range in Appendix S1). For most simulations, achieving at least 80% power to detect a 50% population decline with a moderate number of visits (5) required both sampling ≥25% of the landscape and an effective area ≥1/4 of the cell. Nonetheless, detecting a smaller 20% decline with the same ≥1/4 cell effective area and moderate detectability required 10 visits or sampling ≥50% of the landscape (Appendix S1, Scenarios 11 and 17). The effect of effective area on statistical power was consistent across both small and large home range simulations.
Monitoring multiple home range sizes in a single survey framework
Our simulations suggest it is possible to simultaneously detect declines in species with two different home range sizes using the same sampling grid under certain conditions: (1) sampling devices or survey methods must have an effective sampling area ≥1/4 of the cell; and (2) the protocol (e.g., device type, bait, lure) and visit duration result in moderate detectability. Only when the effective area reached ≥1/4 cell did we find sufficient power to detect a 50% decline while sampling ~25% of the landscape with the large home range cell size for both species (Fig. 5). Because the small home range populations had higher overall detectability, their power was higher than for the large home range populations. We found the smaller 6.25-km² cell size with a 1/4 effective area could also have sufficient power to detect declines for both populations but required a much higher proportion of the landscape to be sampled.

[Figure caption fragment] We considered the monitoring effort effective when there was >80% statistical power (i.e., colored line exceeds the horizontal gray line). Simulation parameters: N = 250, decline = 50%, visits = 5, moderate detectability.

Trade-offs to increase monitoring efficiency
Similar to other power analysis studies (Ellis et al. 2014), increasing visits resulted in increased precision in occupancy estimates and improved power to detect population declines. Increasing from 3 to 5 visits both improved power and could reduce total field effort with less of the landscape needing to be sampled (i.e., we assume more repeat visits at established sites require less work than fewer repeat visits across more sites), whereas increasing to 10 visits improved power nominally but doubled overall field effort. For example, in one scenario, 80% statistical power required surveying 200 cells with three visits (total field effort = 600 trips to devices) compared with 90 cells with five visits (total field effort = 450 trips) or 65 cells with 10 visits (total field effort = 650 trips). For the smaller home range simulations, we observed a similar pattern, but field effort differences were amplified due to the increased number of cells (Fig. 6).
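The visit trade-off in the example scenario above reduces to simple arithmetic, sketched here in Python. The function and variable names are hypothetical; the cell and visit counts are those quoted in the text.

```python
def visit_effort(cells, visits):
    """Total trips to devices = cells surveyed x visits per cell."""
    return cells * visits

# Cells needed for 80% power at each number of visits (example scenario in the text).
cells_needed = {3: 200, 5: 90, 10: 65}
efforts = {visits: visit_effort(cells, visits) for visits, cells in cells_needed.items()}
least_effort_visits = min(efforts, key=efforts.get)
```

For this scenario the five-visit design requires the fewest trips (450, compared with 600 and 650).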
Each successively larger resurvey interval (1, 2, 3 yr) increased the proportion of the landscape surveyed required to achieve adequate power, but this increase was relatively small when compared with the overall required field effort (Fig. 7, Table 2). For example, to detect a 50% fisher decline required either sampling 98 cells every year (25% of the landscape), 111 cells every 2 yr (28% of the landscape), or 143 cells (36% of the landscape) every 3 yr. When translated into total field effort or the number of cells required to be sampled over the total 11-yr monitoring period (year 0 to year 10), sampling every 2 yr would require ~38% less total field effort and every 3 yr would require ~47% less total field effort than annual sampling (Table 2). Simulating 20% missing data reduced power of annual sampling to almost equivalent to every other year sampling (Fig. 7).
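The effort savings quoted above can be reproduced with a short Python sketch. The names are hypothetical, and the sketch assumes surveys begin in year 0 and recur at the stated interval through year 10; that schedule is an assumption, though it yields the percentages reported in the text.

```python
def survey_years(interval, last_year=10):
    """Years surveyed when sampling starts in year 0 (assumed schedule)."""
    return list(range(0, last_year + 1, interval))

def total_effort(cells, interval, last_year=10):
    """Total field effort = cells x number of years surveyed."""
    return cells * len(survey_years(interval, last_year))

# Cells required to detect a 50% fisher decline at each resurvey interval (from the text).
cells_required = {1: 98, 2: 111, 3: 143}
efforts = {i: total_effort(c, i) for i, c in cells_required.items()}
savings = {i: 1 - e / efforts[1] for i, e in efforts.items()}
```

Under this schedule, annual sampling totals 1078 cell-surveys, biennial sampling 666 (about 38% less), and sampling every 3 yr 572 (about 47% less).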

DISCUSSION
Purposeful monitoring design is critical to ensure that collected field data are sufficient for detecting population trends. Implementation of occupancy monitoring as it exists now relies on the assumption that surveys detect an organism within a defined area. Our simulations reveal a strong reduction in the statistical power to observe a population decline when accounting for the actual effective sampling area compared with assuming survey methods are sufficient for an entire cell. Failing to account for an effective area smaller than the cell size resulted in overestimation of statistical power and underestimation of the amount of sampling needed to detect population declines. We reveal several opportunities to achieve the requisite power while improving efficiency (e.g., survey more of the landscape but every 3 yr) and factors that result in field sampling that would fail to identify population trends (e.g., low detectability, small effective area). Our simulations indicate it is possible to effectively monitor trend for two species with different home range sizes using the same cell size.

Fig. 5. Power to detect declining trends for large home range and small home range populations on the same survey grid for (A) small home range sized (marten) grid (6.25-km² grid, 1600 cells) and (B) large home range size (fisher) grid (25 km², 400 cells). 80% statistical power = horizontal gray bar. Simulation parameters: initial N = 250, 50% decline, five visits, moderate detectability, effective sampling area = 1/4 cell.
Balancing trade-offs is a challenge in monitoring design, especially for rare species that are found in large and remote landscapes. Recent research has suggested passive survey devices, such as remote cameras or acoustic recorders, can provide multispecies trend information (e.g., Kays et al. 2009, Furnas and Callas 2015, Loeb et al. 2015, Steenweg et al. 2018, Wood et al. 2019a). We caution that while such methods can be used to obtain detections of different species, it is likely that for each species, each device has a different effective area and therefore effective multispecies trend monitoring becomes much more complicated. Our data suggest that unless sample design is intentional and effective area is explicitly considered, single passive devices with small effective areas are likely not sufficient to use occupancy modeling to monitor population trends.

Effective area
The issue of effective area is rarely addressed in occupancy studies, with effective area often undefined or assumed to be equivalent to the entire cell regardless of field methods (Efford and Dawson 2012). Empirically measuring effective area is difficult because it depends on effectiveness of the survey device (e.g., trigger speed, recording capacity) and factors such as a species' home range size, movement rates, terrain, ambient temperature, or use of attractants. Nonetheless, research comparing GPS collar movement data with passive device detections suggests the effective radius of a single device may be limited. Wilton et al. (2016) found the probability of detecting black bears (Ursus americanus) declined to <5% when black bears were ≥330 m from a lured hair snare (100-km² cells). Popescu et al. (2014) found a similar detection probability at ~250 m in a fisher study. Paired GPS and camera station data from California for detecting a female fisher at a camera station over a six-visit survey showed a cumulative detection probability of 90% at a camera distance of 300 m and 80% at 480 m, declining to <50% at distances greater than 920 m (Tucker et al. 2020). Following these findings, we would assume a device like a camera trap with an effective detection radius of 100-500 m would generate an effective area of 0.05-0.8 km². We presume effective area is much smaller when animals are not lured in, as with passive bioacoustic recorders or un-baited cameras. Furnas and Callas (2015) found automated recorder surveys for forest birds to have an effective radius of 50 m, and Wood et al. (2020) found a 250 m effective radius of acoustic recorders for California Spotted Owls (Strix occidentalis occidentalis). Given relatively small detection radii, the effective area of a single camera or point sampling method is likely much smaller than derived cells meant to represent home range sizes. This must be accounted for in monitoring planning, as forewarned by Efford and Dawson (2012).
Fig. 6. Effect of number of visits (line color) on statistical power (y-axis) to detect trend. Grid sizes reflect two home range sizes, (A) large home range (25-km² grid; 400 cells) and (B) small home range (6.25-km² grid; 1600 cells). The intersection of each curve and the horizontal gray line indicates cells (x-axis) needed to reach 80% statistical power to detect trend. Simulation scenario parameters: decline = 50%, initial N = 250, moderate detectability, effective sampling area = 1/4 cell.
Fig. 7. Power curves evaluating trade-offs between resampling interval and statistical power to detect either a 50% (solid line) or 20% (dashed line) abundance decline. Resurvey intervals include annual, biennial (every second year), triennial (every third year), or continuous annual sampling with 20% missing observations. Simulation parameters plotted: initial N = 250, moderate detectability, effective sampling area = 1/4 cell, visits = 5.
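To make the scale mismatch concrete, the short sketch below converts the detection radii cited above into areas and expresses them as a share of the two cell sizes used in our simulations. It assumes a circular detection zone around a single stationary device, which is a simplification introduced here for illustration; it ignores animal movement during the survey, so it is best read as a rough lower bound rather than a measured effective area. The function names are hypothetical.

```python
import math

# Cell sizes used in the simulations (km^2).
CELL_AREAS_KM2 = {"marten grid": 6.25, "fisher grid": 25.0}

# Detection radii (m) taken from the studies cited in the text.
RADII_M = [50, 250, 330, 500]


def circular_area_km2(radius_m: float) -> float:
    """Area (km^2) of a circle with the given radius in metres.

    Assumption: the detection zone is a circle centred on the device.
    """
    radius_km = radius_m / 1000.0
    return math.pi * radius_km ** 2


def fraction_of_cell(radius_m: float, cell_area_km2: float) -> float:
    """Share of a grid cell covered by one circular detection zone."""
    return circular_area_km2(radius_m) / cell_area_km2


for radius in RADII_M:
    area = circular_area_km2(radius)
    shares = ", ".join(
        f"{name}: {fraction_of_cell(radius, cell):.2%}"
        for name, cell in CELL_AREAS_KM2.items()
    )
    print(f"radius {radius:>3} m -> {area:.3f} km^2 ({shares})")
```

Under this geometric assumption, even a 500 m radius covers well under 1/4 of either cell size, which is consistent with the argument that a single point device is unlikely to sample a whole cell.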
Another important consideration is the relationship between time and effective survey area. As effective sampling area includes the probability that an animal encounters a sampling device when moving through a larger area (home range), it follows that increasing survey duration (time) increases the overall movement of an animal during the sampling period and thereby also increases effective sampling area (Steenweg et al. 2018). The magnitude of the effect of time on effective area depends on an animal's movement rate (Stewart et al. 2018), as higher movement rates increase the probability that an animal will encounter the detection device, which increases detection probability. For example, Popescu et al. (2014) found detection probabilities increased dramatically with greater numbers of GPS locations within 500 m of a camera station. Our simulations showed that increasing survey duration increased power to detect trend (Fig. 6). However, even long-duration surveys did not completely compensate for the loss in power when effective area was very small. For scenarios with the smallest simulated effective area (1/16 cell), we observed increased variation in trend estimates and reduced statistical power even in our longest duration simulations (10 visits) when compared to scenarios with larger effective areas and shorter durations (Fig. 4, Appendix S1).
It is important to note that considerations for effective sampling area may differ considerably for non-point-based sampling methods such as snow tracking or scent detection dog surveys, which have been shown to be very effective survey methods for some highly mobile carnivores (Lindén 1996, Moriarty et al. 2018). For example, Squires et al. (2012) found a detection probability of >0.95 for lynx (Lynx canadensis) by surveying a single 10-km snow track survey route within a much larger 64-km² cell. In these transect-based survey methods, a positive detection can result from a survey route intersecting one of many longer, spatially distributed movement paths across a home range, which differs from passive devices like trail cameras or acoustic recorders that rely on an individual using a large home range area to enter a relatively small detection zone (Burton et al. 2015). The most efficient combination of field techniques to generate sufficient detectability will vary depending on the monitoring questions, species, and landscape.
Increasing effective area of surveys that employ passive devices may be accomplished by spacing out multiple detection devices within a cell. This improves detection probability both by increasing the number of devices and by expanding spatial extent, which increases the likelihood that a device falls within an animal's movement path. Increasing detectability with arrays of multiple devices within a cell has been a common practice for carnivore surveys (Zielinski and Kucera 1995, Zielinski et al. 2017), and our simulations provide insight into why the spatial aspect of this recommendation is so critical for effective monitoring. Our results indicate increasing effective area to at least 1/4 cell is necessary to achieve sufficient detectability to detect trend with a logistically feasible sampling effort. Due to the difficulty in precisely defining effective area, prescribing spacing and number of devices is difficult and further research is needed to inform such recommendations. Nonetheless, based on our simulations, evidence from large-scale field surveys (Zielinski et al. 2013, Barry 2018), and simulations for multispecies sampling design (Sun et al. 2014, Wilton et al. 2014, Evans et al. 2019), we suspect an efficient survey method includes clusters of >2 devices within an area, arranged so that their effective sampling areas overlap while increasing the likelihood of detecting an animal.
Notes: Although the proportion of the landscape required increases with longer resurvey intervals, the total survey effort for the entire monitoring period decreases (values in boldface). The total monitoring period = 11 yr (1-yr baseline population size + 10 yr of population decline). Simulation parameters: Fisher = 25-km² cells (400 cells), Marten = 6.25-km² cells (1600 cells), effective area = 1/4 cell, moderate detectability, five visits. Total trips refer to annual visits to grid cells and do not include five visits within a year.
For example, long-term fisher monitoring in California conducts sampling within grid cells using arrays of 3-6 track plates or trail cameras spaced 500-800 m apart. This device spacing reflects the approximate effective sampling area of each device and functions to increase both detection probability, through the use of multiple detection devices, and the overall effective sampling area within each cell (Zielinski et al. 2013). Alternatively, multiple device clusters spaced at twice the effective area could be used to survey a larger effective area, but also allow the clusters to be considered spatially independent for linking with landscape covariates.
All simulations relied on fixed monitoring locations over time to detect trend, such that the same cell was resampled every time step and sampling was cumulative with cells added in a fixed order over time. Similarly, the effective area locations were fixed within a cell over time. By using fixed monitoring locations in this manner, we minimized between-year variations in occupancy estimates to maximize statistical power. Previous simulation work has found that varying sampling locations, either by (1) sampling different cells each time step or (2) moving survey locations within a cell, drastically reduces power to detect trend by greatly increasing the variation around annual occupancy estimates (J. M. Tucker and M. M. Ellis, unpublished data). Also, when effective area is smaller than the cell size, sampling in a fixed location facilitates correlating changes in occupancy or detectability with fine-scale environmental covariates (e.g., canopy cover, basal area). In other words, our simulations strongly suggest that spatially variable random sampling over years is not effective for monitoring population trend.

Monitoring multiple home range sizes in a single survey framework
We explicitly wanted to test whether it was feasible to detect population trends for two species with different home range sizes using a single sampling grid. Our simulations suggested it was possible to detect declines for both home range sizes when effective area was ≥1/4 of the cell and moderate detectability was achieved, even when there was a mismatch between home range size and cell size. In such cases, our results indicated that it was more efficient to design monitoring using the cell size matched to the home range of the larger species. We observed similar power to detect declines for both small and large home range sizes on the larger cell, assuming a 1/4 cell effective area. In these simulations, 80% statistical power was achieved when sampling ~22% of the landscape (86 cells) for the small home range species or ~25% (98 cells) for the large home range species. We were also able to detect declines using the smaller cell, but this required a larger proportion of the landscape to be surveyed (smaller home range = 27% of the landscape [424 cells]; larger home range = 42% of the landscape [675 cells]). This difference is partly offset because a 1/4 cell effective area is smaller in absolute terms on the smaller grid, requiring fewer devices with closer spacing. Even so, one would need to survey many more cells and a higher proportion of the landscape using the smaller cell size. We found power for the small home range simulations was higher than for the large home range simulations. Assuming similar movement rates (Moriarty et al. 2017, 2019), an animal traversing a larger home range is less likely to be near a survey device at any given time compared with an animal traversing a smaller home range. Therefore, for a fixed survey duration, as home range size decreases, both detectability and power to detect trend increase (Popescu et al. 2014).
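The proportions quoted above can be checked with simple arithmetic on the reported cell counts. The sketch below converts each scenario into a landscape proportion and a surveyed area, using the grid dimensions given in the figure captions (400 cells of 25 km² or 1600 cells of 6.25 km²). The function and variable names are hypothetical, and the surveyed area is only a nominal cell area, not the effective area actually sampled.

```python
def summarize(cell_km2: float, total_cells: int, cells_needed: int) -> dict:
    """Landscape proportion and nominal area for a given sampling effort."""
    return {
        "proportion": cells_needed / total_cells,
        "area_km2": cells_needed * cell_km2,
    }


# (label, cell size in km^2, total cells in grid, cells needed for 80% power)
SCENARIOS = [
    ("large cell, small home range", 25.0, 400, 86),
    ("large cell, large home range", 25.0, 400, 98),
    ("small cell, small home range", 6.25, 1600, 424),
    ("small cell, large home range", 6.25, 1600, 675),
]

for label, cell_km2, total_cells, cells_needed in SCENARIOS:
    result = summarize(cell_km2, total_cells, cells_needed)
    print(
        f"{label}: {cells_needed} cells = "
        f"{result['proportion']:.1%} of landscape, "
        f"{result['area_km2']:.2f} km^2 nominal area"
    )
```

Expressed as nominal area, the larger grid requires 2150 to 2450 km² of cells, while the smaller grid requires 2650 to 4218.75 km², so the larger cell size needs both fewer cells and less total area in these scenarios.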

Population size
Our results showed a generally positive relationship between increasing population size and statistical power. However, when occupancy was very high (large home range N = 400, occupancy = 90%) we found power decreased. With high occupancy, multiple individuals will likely overlap within a cell such that ≥1 individual must be lost for that cell to change occupancy status. Therefore, there is a threshold, which will vary by species and landscape (in our study ~90% occupancy for fisher), where high population density or increasing social aggregation results in reduced power in occupancy monitoring. Similar situations would be expected in landscapes with densely clustered home ranges, such as those of the rare coastal Humboldt martens (Martes caurina humboldtensis), which have the smallest home ranges and highest densities of martens documented globally. To detect trends in a landscape saturated with individuals, occupancy models may be less appropriate compared with abundance estimates, especially when closure assumptions are violated and data reflect use rather than occupancy. When population sizes are unknown, surveys may need to include techniques that allow for individual or sex identification to inform the number of individuals within cells, such as hair snares (Zielinski et al. 2006), track identification (Slauson et al. 2008, Alibhai et al. 2017), or measuring strips at cameras (Buskirk et al. 2018).
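A toy calculation illustrates why saturated cells respond weakly to decline. The sketch below assumes, purely for illustration, that each individual overlapping a cell is lost independently with the same probability, so a cell only becomes unoccupied when all of its individuals are lost. This independence assumption and the function name are ours and are not part of the spatially explicit simulations described in this study.

```python
def prob_cell_empties(n_individuals: int, loss_prob: float) -> float:
    """Probability that a cell loses every individual overlapping it.

    Assumption: each individual is lost independently with probability
    loss_prob, so the cell empties with probability loss_prob ** n.
    """
    if n_individuals < 1:
        raise ValueError("cell must start with at least one individual")
    if not 0.0 <= loss_prob <= 1.0:
        raise ValueError("loss_prob must be between 0 and 1")
    return loss_prob ** n_individuals


# Example: a 50% decline applied to cells holding 1 to 4 individuals.
for n in range(1, 5):
    print(f"{n} individual(s): P(cell becomes unoccupied) = "
          f"{prob_cell_empties(n, 0.5):.4f}")
```

Under these assumptions, a cell with one individual empties half the time under a 50% decline, whereas a cell with three overlapping individuals empties only one time in eight, so occupancy changes far less than abundance.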

Factors influencing monitoring design and efficiency
We found that detection probability was the factor that had the biggest effect on power to detect population trends. Low detectability simulations, representative of short durations or lower detectability field methods (e.g., camera stations without bait or lure), had little power to detect even large magnitude trends (50% abundance decline). This is because low detection probability resulted in greater bias in yearly occupancy estimates, making it difficult to discern a true decline from sampling error. We conclude that trend monitoring must employ a method with moderate to high detectability to avoid requiring extremely long survey durations or sampling large proportions of the landscape, either of which may be logistically or financially infeasible. We also found that while increasing duration can slightly increase the statistical power to detect a population decline, increased temporal duration did not compensate for the loss of power resulting from small effective areas.
Per-visit detection probability is a function of time, effective area, and survey method. Although our simulations did not specify the amount of time and survey method, there are some real-world implications to these results. To increase detectability within a visit, practitioners can attempt to (1) increase visit duration (but for small home ranges, there may be little effect of duration on detectability); (2) increase the effective area; or (3) increase detectability of the method at each survey location. Methodological changes that can increase detectability include adding bait, lure, or other attractants (McDaniel et al. 2000, Gerber et al. 2012, Braczkowski et al. 2016), or adding a secondary survey device such as pairing a camera with a hair snare, track plate, or another camera in close proximity. Time of year can also significantly affect detectability, with previous fisher and marten studies finding detectability highest during winter (Zielinski et al. 2015, Sweitzer et al. 2016), when individuals may be more willing to move through uncharacteristic vegetation types due to increased energetic stress and reduced predation risk (Moriarty et al. 2015). Similarly, bird surveys have increased likelihood of detecting species in unsuitable areas during winter when coaxed by bait or lure (Rodríguez et al. 2001, Belisle and Desrochers 2002, MacKenzie and Royle 2005, Desrochers et al. 2011). However, some of these options for increasing detectability may reduce the biological significance of the survey, as winter does not reflect the reproductive period for these species. Regardless of the field method used, consistency in methodology over time, which minimizes variation in estimates between years, is an important component of effective monitoring design. Our simulations suggest there is a trade-off in statistical power between number of visits and number of cells sampled, with fewer visits requiring more cells (also see Ellis et al. 2014).
As it is generally more time-efficient to revisit the same station than to establish new stations, increasing number of visits (time) will usually be more cost-effective than surveying additional cells. However, simulating long surveys of 10 visits showed only minor improvements in power compared with half the field effort (five visits) at those locations. There are some instances where it may be preferable to spread sampling across a larger area with fewer visits. For example, in heterogeneous or rapidly changing environments, sampling a larger number of cells with fewer visits will better reflect spatial variability and increase opportunities to monitor population edge contractions or expansions.
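The diminishing return from additional visits can be illustrated with the standard relationship between per-visit and cumulative detection probability, 1 - (1 - p)^K for K visits. The sketch below assumes visits are independent with a constant per-visit probability p, which is the usual occupancy modeling simplification and not a description of how detection was generated in our spatially explicit simulations. The per-visit values shown are hypothetical and are not the detectability levels used in this study.

```python
def cumulative_detection(p_visit: float, n_visits: int) -> float:
    """Probability of at least one detection across n_visits.

    Assumption: visits are independent with constant per-visit
    detection probability p_visit.
    """
    if not 0.0 <= p_visit <= 1.0:
        raise ValueError("p_visit must be between 0 and 1")
    if n_visits < 0:
        raise ValueError("n_visits must be non-negative")
    return 1.0 - (1.0 - p_visit) ** n_visits


# Hypothetical per-visit detection probabilities, for illustration only.
for p in (0.1, 0.3, 0.5):
    five = cumulative_detection(p, 5)
    ten = cumulative_detection(p, 10)
    print(f"p = {p:.1f}: 5 visits -> {five:.3f}, 10 visits -> {ten:.3f}, "
          f"gain = {ten - five:.3f}")
```

For a per-visit probability of 0.3, for instance, doubling effort from five to ten visits raises cumulative detection from about 0.83 to about 0.97, a modest gain for twice the field effort.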
Increasing the resampling interval over time has great potential to make large-scale trend monitoring more efficient. Resurveying cells every second or third year rather than annually resulted in a substantial reduction in total monitoring effort over the 11-yr monitoring period to obtain equivalent statistical power. While there may be certain situations that warrant annual sampling, for instance monitoring a population undergoing rapid environmental changes (e.g., wildfire, logging, tree disease), for more general monitoring our results illustrate that considering longer monitoring intervals has the potential to greatly improve efficiency. At broad spatial scales, this could provide logistical opportunities for partnerships. If a team and equipment were available annually, those resources could potentially move between monitoring areas on a 3-yr time scale. We also found that missing observations caused a substantial reduction in power, such that annual sampling with 20% missing observations had similar power to sampling every other year. Therefore, for monitoring areas likely to have frequent logistical complications (e.g., difficult terrain, heavy snow, changes in land ownership or access), planning needs to adjust sampling targets upwards to account for potential missing observations to prevent loss of statistical power.
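The effort trade-off can be sketched as total trips = cells surveyed × number of survey years. The example below assumes surveys start in the baseline year and repeat at a fixed interval within the 11-yr period; that schedule is an assumption made here for illustration. The cell counts are hypothetical placeholders chosen only to show the pattern, not values from this study.

```python
def survey_occasions(period_years: int, interval_years: int) -> int:
    """Number of survey years when sampling starts in year 0 and
    repeats every interval_years within period_years (assumed schedule)."""
    if period_years < 1 or interval_years < 1:
        raise ValueError("period and interval must be positive")
    return len(range(0, period_years, interval_years))


def total_trips(cells: int, period_years: int, interval_years: int) -> int:
    """Total cell visits (one trip per cell per survey year)."""
    return cells * survey_occasions(period_years, interval_years)


PERIOD_YEARS = 11  # 1-yr baseline + 10 yr of decline

# Hypothetical cells needed for equal power at each interval.
HYPOTHETICAL_CELLS = {1: 100, 2: 130, 3: 160}

for interval, cells in HYPOTHETICAL_CELLS.items():
    occasions = survey_occasions(PERIOD_YEARS, interval)
    trips = total_trips(cells, PERIOD_YEARS, interval)
    print(f"every {interval} yr: {cells} cells x {occasions} survey years "
          f"= {trips} trips")
```

With these placeholder values, more cells are needed per survey year at longer intervals, yet total trips over the monitoring period still fall because there are fewer survey years.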
Our simulations have some limitations that are important to consider when interpreting results for monitoring design. In our study, we conducted simulations on a uniform landscape without underlying biotic or abiotic information and, therefore, the resulting populations were more homogeneously distributed than would occur naturally. In heterogeneous landscapes, an organism with a smaller home range can achieve higher densities and more easily use smaller fragments of suitable conditions, frequently resulting in a patchier distribution than organisms in the same landscape with a larger home range. The uniform landscape in our simulations also resulted in complete overlap between populations which, while representative of some carnivore communities (Manlick et al. 2017), is not always the case, as species frequently segregate spatially according to vegetation or climate factors (Spencer et al. 2015a). Future work should incorporate landscape, habitat, or climate factors to investigate how power may be impacted when populations are restricted to patchily distributed vegetation types. Additionally, these simulations were parameterized with two species that, while they have different home range sizes, are similar in many other characteristics (sex ratios, territoriality, turnover rates). More work is needed to expand simulations to encompass a greater range of variation in home range size or behavior and to assess whether a single grid can be effective for more ecologically disparate species.

CONCLUSIONS
This study is an important step forward in understanding the constraints of long-term occupancy monitoring. We caution practitioners designing long-term occupancy monitoring programs to carefully consider the field efforts that are most likely to influence power to detect population declines. Effective area of the sampling devices had far more of an effect than we anticipated. Given that empirical evidence suggests effective area is likely much smaller than a typical cell, we see this as a central problem for occupancy monitoring. Unless this is addressed in each program or for each field method, it is likely that occupancy monitoring with single devices meant to survey grid cells will not provide the power necessary to detect population trends, which is the fundamental intent of monitoring. In addition, this challenge is likely to vary for every species considered. Given the rise of multispecies occupancy monitoring programs, we suggest careful consideration of design factors, including effective area assumptions and maximizing detectability using non-invasive devices. With a better understanding of how effective area influences long-term occupancy trend monitoring, we are optimistic that well-designed programs using improved and innovative sampling methods can achieve effective population monitoring.