Of detectability and camouflage: evaluating Pollard Walk rules using a common, cryptic butterfly

Estimating distribution and abundance of species depends on the probability at which individuals are detected. Butterflies are of conservation interest worldwide, but data collected with Pollard walks, the standard for national monitoring schemes, are often analyzed assuming that changes in detectability are negligible within recommended sampling criteria. The implications of this practice remain poorly understood. Here, we evaluated the effects of sampling conditions on butterfly counts from Pollard walks using the Arctic fritillary, a common but cryptic butterfly in boreal forests of Alberta, Canada. We used an open population binomial N-mixture model to disentangle the effects of habitat suitability and phenology on abundance of Arctic fritillaries from the effects of sampling conditions (temperature, wind, cloud cover, and hour of the day) on their detectability. Detectability varied by one order of magnitude within the criteria recommended for Pollard walks (P varying between 0.04 and 0.45), and simulations show how sampling in suboptimal conditions substantially increases the risk of false-absence records (e.g., false-absences are twice as likely as true-presences when sampling 10 Arctic fritillaries at P = 0.04). Our results suggest that the risk of false-absences is highest for species that are poorly detectable, low in abundance, and with short flight periods. Analysis with open population binomial N-mixture models could improve estimates of abundance and distribution for rare species of conservation interest, while providing a powerful method for assessing butterfly phenology, abundance, and behavior using counts from Pollard walks, but requires more intensive sampling than conventional monitoring schemes.


INTRODUCTION
Species' distribution and abundance are the two most common state variables in ecology (Krebs 1972, Kéry and Schaub 2012). To estimate distribution and abundance, ecologists have historically drawn upon counts of organisms, treating them either as indices of abundance or censuses (Krebs 1972, Kéry and Schaub 2012). Yet, observers usually fail to detect all individuals present at a site, and when detection probability is neither constant nor adjusted for, inferences based on count data can be biased (Brown and Boyce 1998, MacKenzie et al. 2002). Therefore, detection probability (hereafter "detectability") plays an important role in our ability to understand the ecology, abundance, and distribution of organisms (Brown and Boyce 1998, MacKenzie et al. 2002, Ancona et al. 2017). Because ignoring imperfect detection can result in biased estimates of diversity, occupancy, and abundance (Kéry and Plattner 2007, Jarzyna and Jetz 2016, Ancona et al. 2017), ecologists have developed techniques to account for detectability (MacKenzie et al. 2002, Nowicki et al. 2008). Often, the probability of detecting an individual depends on the conditions under which sampling is conducted (even for plants; see, e.g., Dennett and Nielsen 2019) and thus can be modeled as a function of relevant covariates (MacKenzie et al. 2002, Kéry and Schaub 2012). Nevertheless, methods designed to incorporate the observation process are often not used, especially for insects (Nowicki et al. 2008, Kellner and Swihart 2014).
Butterflies (Lepidoptera: Papilionoidea) are one of the most studied and charismatic insect groups (Boggs et al. 2003, Thomas 2016), and losses in abundance and distribution of many butterfly species have prompted the launch of conservation initiatives worldwide (Melero et al. 2016, Thomas 2016, van Strien et al. 2019, Wepprich et al. 2019). Given widespread interest, the merits and limitations of different sampling protocols and analytical approaches have been widely scrutinized (Haddad et al. 2008, Nowicki et al. 2008, Isaac et al. 2011, Schmucki et al. 2016). Today, Pollard walks (PW) have emerged as the most widespread method for sampling butterflies (Nowicki et al. 2008).
Pioneered in the 1970s by Ernest Pollard (Pollard 1977), PW consist of repeated transect counts of adult butterflies conducted under specific sampling conditions (originally, samples between 10:45 h and 15:45 h, at temperatures >13°C with sunny weather and >17°C with cloudy weather). Since Pollard's seminal manuscript, only minor adjustments have been proposed to the original method, including controlling for the effects of wind and adapting time of sampling to different countries (Van Swaay et al. 2008). Today, decades of data from thousands of PW worldwide have been recorded (Dennis et al. 2013, Schmucki et al. 2016, Wepprich et al. 2019), with this massive effort resulting in prominent contributions in ecology and conservation (Warren et al. 2001, Pateman et al. 2012, Macgregor et al. 2019).
Although originally conceived for estimating population trends (Pollard 1977), PW have also been used to assess other applied and theoretical questions (Nowicki et al. 2008, Dennis et al. 2013, Matechou et al. 2014, Schmucki et al. 2016). Counts from PW were historically analyzed with regression methods (e.g., linear models), using the sum of multiple site visits as a site abundance index. However, analyzing PW counts as abundance indices assumes that the proportion of butterflies counted is equal among sites. This proportionality assumption is violated when detectability varies, undermining comparisons between samples (Nowicki et al. 2008) and between species within the assemblage (Isaac et al. 2011). Importantly, butterfly detectability varies not only with species' characteristics (e.g., natural history, sex, behavior, and morphology), but also with sampling conditions and the habitat assessed (Brown and Boyce 1998, Haddad et al. 2008, Isaac et al. 2011). For instance, detection is easier when organisms move (Caro 2005), and ectotherm movement is conditioned by the abiotic environment (Kevan and Shorthouse 1970). Therefore, variation in conditions during sampling affects PW counts, and changes in these counts are not necessarily due to variation in the true abundance of butterflies within a transect, but rather to our ability to detect them (e.g., due to changes in activity patterns). By defining sampling criteria as thresholds in the environmental conditions under which counts should be conducted, the PW method implicitly assumes that detectability does not vary substantially within recommended sampling criteria (e.g., temperature >13°C with cloud cover <50% and >17°C when cloud cover >50%, wind speed <5 Beaufort units, and hour of sampling within approximately ±3.5 h of the time of peak sun; Van Swaay et al. 2008). Yet, except for initial tests conducted by Pollard himself (Pollard 1977), this assumption has remained virtually untested (but see Harker and Shreeve 2008).
Here, our objective was to assess how sampling conditions influence PW counts for an abundant but cryptic (i.e., overall body color resembling the general color of the organism's habitat; Caro 2005) species, the Arctic fritillary (Boloria chariclea; Fig. 1). We hypothesized that changes in the abiotic environment would affect detectability of this species primarily through changes in its activity patterns due to constraints and relationships to body temperature (i.e., thermoregulation behavior). Specifically, we predicted that detectability would (1) increase with air temperature, because Arctic fritillaries are active at body temperatures that are generally high (~26°-31°C) relative to the study area; (2 and 3) decrease in windy and cloudy conditions that hinder the activity of butterflies; and (4) follow a unimodal trend with time of the day, because butterflies are generally more commonly observed a few hours before and after noon (Kevan and Shorthouse 1970, Van Swaay et al. 2008). To test recommended sampling conditions for PW, we used a hierarchical, open population binomial N-mixture model, treating the true abundance of butterflies as a latent (unobserved) state process influenced by habitat suitability and phenology, and the observation (detection) process as a function of sampling conditions. Finally, we used simulations to quantify the implications of the original PW assumptions for estimated abundance and occupancy of Arctic fritillaries.

METHODS

Study area
This study was conducted within boreal treed peatlands dominated by black spruce (Picea mariana), in the Wood Buffalo region of Alberta, Canada (56°37′22″ N, 111°58′71″ W, Fig. 1; see Riva et al. 2018a for additional information on the study area). The Wood Buffalo region has been subject to widespread anthropogenic disturbance associated with in situ oil sand developments, particularly the clearing of narrow (<10 m wide) linear corridors involved in seismic assessments of the underground bitumen reserve (seismic lines). A series of studies conducted in the area revealed effects of seismic lines on abundance, behavior, and responses to wildfire in butterflies (Riva et al. 2018a, b, c, 2019), as well as responses in other organisms (Dabros et al. 2018, Fisher and Burton 2018).

Model species
We studied the Arctic fritillary (Boloria chariclea), a Holarctic species previously used to assess climate change, thermoregulation, and behavior (Kevan and Shorthouse 1970, Bowden et al. 2015, Riva et al. 2018c). In the boreal forests of Alberta, Canada, the Arctic fritillary is univoltine and is the most common peatland butterfly, being more abundant on seismic lines than in the surrounding peatland forests (Riva et al. 2018b, 2019). Seismic lines in boreal Alberta are rich in many of the larval host plants for this generalist species, especially Salix spp. (Riva et al. 2019). Given the similarity in color between boreal peatland mosses and the dorsum of Arctic fritillaries (Fig. 1), and the species' relatively small size (wingspan ~38 mm; Burke et al. 2011), we considered it cryptic in these forests and thus appropriate for an investigation of the effects of detectability on PW counts. Specifically, we expected Arctic fritillaries to show larger differences between minimum and maximum detectability than species that are easier to detect (e.g., larger or more colorful), particularly while not flying. Arctic fritillaries often bask to thermoregulate (Kevan and Shorthouse 1970), here on Sphagnum spp. or the understory vegetation, where they are poorly detected (Fig. 1). Conversely, when observed during their gentle flight, Arctic fritillaries become more detectable. Therefore, changes in the abiotic environment affect this species' activity patterns, thereby affecting its detectability. Information on longevity and dispersal of Boloria chariclea is scarce, but three Boloria spp. are estimated to live between 3 and 12 d (Bubová et al. 2016), whereas Boloria napaea has a mean dispersal distance of ~100-150 m (Ehl et al. 2019).

Sampling design
We sampled 16 PW transects within the study area following Pollock's robust design (Pollock 1982), stratified into eight undisturbed forests vs. eight 9 m wide seismic lines. PW were 50 m long and 5 m wide, with centroids separated by at least 200 m to reduce spatial autocorrelation among samples (Fig. 1). Arctic fritillaries were common in the area (i.e., hundreds; Riva et al. 2018b, c, 2019), and 50-m transect lengths were sufficient to obtain adequate sample sizes. Each of the 16 PW was visited once per day during 12 sampling days between 20 July and 9 August 2018, for a total of 192 PW samples within 21 d. Samples were conducted between ~07:00 h and 19:00 h by the same observer (FR), randomizing sampling order within day, at any condition of air temperature and wind, but avoiding rain. Temperature (°C), wind speed (Beaufort units, BU, measured at a height of ~1.5 m), and hour of the day (h) were recorded for each PW using a Kestrel 5500 Pocket Weather Meter. We also visually estimated whether cloud cover was >50% (binary covariate). To assess the recommended sampling conditions for PW, we considered time between 09:00 and 15:00, that is, 3 h before and after peak sun. Note that this choice is conservative in relation to the recommended guidelines from Butterfly Conservation Europe (i.e., 3.5 h before and after peak sun; Van Swaay et al. 2008).
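The recommended sampling window described above can be written as a simple filter on survey records. The sketch below is illustrative only: the function name `within_pw_criteria` and its argument names are hypothetical, the thresholds are those quoted in the text (with the 09:00-15:00 window used in this study), and the handling of boundary values (e.g., exactly 13°C) is an assumption.

```python
def within_pw_criteria(temp_c, wind_bu, hour, cloud_over_50):
    """Return True if a survey falls within the recommended PW criteria.

    Thresholds follow the text: >13 C with cloud cover <50%, >17 C with
    cloud cover >50%, wind <5 Beaufort units, and 09:00-15:00.
    Boundary handling is an assumption of this sketch.
    """
    min_temp = 17.0 if cloud_over_50 else 13.0
    return temp_c > min_temp and wind_bu < 5 and 9.0 <= hour <= 15.0


# Hypothetical surveys: (temperature, wind, hour, cloud cover >50%)
surveys = [
    (24.0, 1, 12.0, False),  # warm, calm, noon, sunny
    (15.0, 2, 11.0, True),   # too cold for cloudy conditions
    (22.0, 1, 7.5, False),   # too early in the day
]
flags = [within_pw_criteria(*s) for s in surveys]
print(flags)
```

A filter of this kind only separates samples taken inside and outside the criteria; it says nothing about how detectability varies within them.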

Statistical model
We used a hierarchical open population binomial N-mixture model (hereafter N-mixture model; Kéry and Schaub 2012) to assess the effects of habitat suitability and phenology on abundance of Arctic fritillaries and to determine how environmental conditions affect detectability. The model estimates butterfly abundance $N_{ik}$ at site $i$ during each primary sampling period $k$, where $N_{ik} \sim \mathrm{Poisson}(\lambda_{ik})$. We modeled expected abundance ($\lambda_{ik}$) as a function of treatment type (forest or seismic line) and the quadratic of Julian day, together with a random effect of site $i$ on abundance, $\varepsilon_i \sim \mathrm{Normal}(0, \sigma^2)$. In our model, $N_{ik}$ is a latent, unobserved variable estimated by modeling detection probability at each survey $j$ done within primary sampling period $k$ at site $i$, so that each count $y_{ijk} \sim \mathrm{Binomial}(N_{ik}, p_{ijk})$, where the number of individuals $N_{ik}$ is the number of trials and $p_{ijk}$ is the probability of detecting a given individual during the survey. The model assumes sites are closed to changes in abundance within each primary sampling period $k$ (Kéry and Schaub 2012). We modeled $p_{ijk}$ on the logit scale as a function of survey-level environmental variables: temperature (Temp), wind speed on the Beaufort scale (Wind), time of day expressed in hours from midnight (Time, 0-24 h), and a binary covariate representing cloud cover >50% (Cloud). We used temperature, wind speed, hour of the day, and cloud cover because these factors have been shown to affect butterfly detectability and controlling for them is recommended for monitoring schemes based on PW (Brown and Boyce 1998, Van Swaay et al. 2008).
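The two-level structure of the model can be illustrated by simulating from it. The sketch below is not the fitted model: it holds expected abundance and detection probability constant (the values `lam` and `p` are hypothetical, and covariate effects are omitted), and it assumes two surveys per primary period purely for simplicity.

```python
import numpy as np

rng = np.random.default_rng(1)

# Sites i, primary periods k, surveys j (two surveys per period is an assumption)
n_sites, n_periods, n_surveys = 16, 6, 2
lam = 10.0  # hypothetical expected abundance, constant across sites and periods
p = 0.2     # hypothetical detection probability, constant across surveys

# State process: latent abundance N_ik ~ Poisson(lambda_ik)
N = rng.poisson(lam, size=(n_sites, n_periods))

# Closure: N_ik is held fixed across the surveys j within a primary period
N_rep = np.broadcast_to(N[:, None, :], (n_sites, n_surveys, n_periods))

# Observation process: counts y_ijk ~ Binomial(N_ik, p_ijk)
y = rng.binomial(N_rep, p)

print(y.shape, round(float(y.mean()), 2), round(float(N.mean()), 2))
```

Because each count is a binomial thinning of the latent abundance, the mean count sits well below the mean abundance whenever detection probability is low.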

Analysis
We fit the model in a Bayesian framework using programs JAGS version 4.3.0 (Plummer 2003) and R version 3.6.1 (R Core Team 2019) with the package R2jags (Su and Yajima 2015). We standardized continuous covariates to facilitate model convergence and allow comparisons between the effects of different covariates (Kéry and Schaub 2012). MCMC simulations were run for 600,000 iterations, retaining 400,000 iterations after discarding the first 200,000. We evaluated convergence of the MCMC chains using the Gelman and Rubin R-hat statistic (Brooks and Gelman 1998). Model fit was evaluated by simulating new data from the posterior distribution of model parameters and comparing these to the estimates from the actual data (Kéry and Schaub 2012). We used diffuse prior distributions for all estimated parameters. See Data S1 for model code and prior distributions.
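For readers unfamiliar with the R-hat diagnostic, the following sketch computes its basic (non-split) form from a set of chains. It is a simplified illustration and not necessarily the exact variant reported by R2jags; the function name `gelman_rubin`, the number of chains, and the simulated draws are all hypothetical.

```python
import numpy as np


def gelman_rubin(chains):
    """Basic (non-split) R-hat for an array of shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    n = chains.shape[1]
    between = n * chains.mean(axis=1).var(ddof=1)   # between-chain variance B
    within = chains.var(axis=1, ddof=1).mean()      # within-chain variance W
    pooled = (n - 1) / n * within + between / n     # pooled variance estimate
    return float(np.sqrt(pooled / within))


rng = np.random.default_rng(0)
mixed = rng.normal(0.0, 1.0, size=(4, 2000))        # chains sampling one target
stuck = mixed + np.arange(4)[:, None]               # chains with shifted means

print(gelman_rubin(mixed), gelman_rubin(stuck))
```

Values close to 1 indicate that between-chain and within-chain variability agree, whereas chains centered on different values produce values clearly above 1.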

Simulations
To assess the interplay between abundance and detectability in determining estimates of species' counts (and thus presence) in PW, we supplemented our analysis with simulations.
Specifically, we generated butterfly detection data according to binomial distributions with N trials, representing site-level abundance, and probability P, representing detectability. Note that this stochastic process is identical to that assumed in the N-mixture model, where the number of butterflies counted is treated as the realization of a random variable from a binomial distribution. Simulations were set to represent PW conducted on transects with N = 10, 20, and 40 Arctic fritillaries and detection probabilities of P = 0.04, 0.20, and 0.40. For each combination of N and P, we simulated 100,000 realizations of the stochastic process. We chose values of N and P based on the results of the hierarchical model to represent sites where Arctic fritillaries are low, moderate, or high in abundance, and samples that were poor, average, or good in terms of sampling conditions.
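A minimal sketch of this simulation is given below, using the values of N and P and the number of realizations stated above. The function name `false_absence_rate` and the random seed are hypothetical, and the code is an illustration rather than the authors' script. The simulated proportion of zero counts can be checked against the exact binomial probability of counting no individuals, $(1 - P)^N$.

```python
import numpy as np


def false_absence_rate(n_individuals, p_detect, n_sims=100_000, seed=42):
    """Proportion of simulated PW counts equal to zero (a false absence)."""
    rng = np.random.default_rng(seed)
    counts = rng.binomial(n_individuals, p_detect, size=n_sims)
    return float(np.mean(counts == 0))


for n_individuals in (10, 20, 40):
    for p_detect in (0.04, 0.20, 0.40):
        simulated = false_absence_rate(n_individuals, p_detect)
        exact = (1 - p_detect) ** n_individuals  # P(count = 0) under the binomial
        print(f"N={n_individuals:2d} P={p_detect:.2f} "
              f"simulated={simulated:.3f} exact={exact:.3f}")
```

For N = 10 and P = 0.04, the exact probability of a zero count is 0.96^10, or about 0.66, so a false absence is roughly twice as likely as a detection.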

RESULTS
The open population binomial-mixture model fitted the data well (posterior predictive check of model adequacy = 1.08), and MCMC mixing was adequate (all R-hat values = 1). Abundance of Arctic fritillaries was higher in seismic lines than in forests and peaked during the second sampling period (July 25-26; Fig. 2). Detectability peaked around 12:00, decreased with cloud cover >50%, and increased with temperature at cloud cover >50% (Fig. 3; Tables 1, 2). The Bayesian credible intervals for two covariates, the effects of temperature when cloud cover was <50% and wind, included zero, suggesting weak effects of these environmental conditions on detectability (Table 1).
Evaluating the recommended sampling conditions for PW (i.e., samples conducted between 9:00 and 15:00, at temperatures >13°C or 17°C if cloud cover >50%, and with wind speed <5 BU), we found that P varies between a minimum of 0.04 (CRI 0.01-0.09) when samples are conducted at 15:00 h, with 17°C, wind at 5 BU, and cloud cover >50%, and a maximum of 0.45 (CRI 0.15-0.72) when samples are conducted at noon, with 31°C, wind at 0 BU, and cloud cover >50%. The interactive effect between cloud cover and temperature predicts that at temperatures lower than 30°C, cloud cover >50% decreases the probability of detecting Arctic fritillaries, while for the few samples that occurred at temperatures higher than 30°C (n = 12), the probability of detecting Arctic fritillaries was higher at cloud cover >50%.
Fig. 3. Trends in detectability with temperature, hour of samples, and wind speed at cloud cover (CC) >50% (top row, in gold) and <50% (bottom row, in red). Predictions are calculated holding the other covariates at their average value (i.e., temperature ~24°C, wind speed ~1 BU, time of the day ~noon). Black vertical lines represent the sampling criteria recommended for PW for monitoring schemes (i.e., temperature >13°C with cloud cover <50% and >17°C when cloud cover >50%; wind speed <5 BU; sampling hour between 9:00 and 15:00).

DISCUSSION
This study demonstrates that, even within recommended sampling criteria, environmental conditions have substantial effects on detectability and thus the number of butterflies counted in PW. Specifically, we showed that an open population binomial N-mixture model can disentangle the effects of habitat suitability and phenology on abundance of butterflies from the effects of sampling conditions on their detectability. As expected for a cryptic species, detectability was generally low (average of ~0.22) and varied by an order of magnitude (~0.04 vs. 0.45) between poor and optimal sampling conditions within the recommended criteria for PW (i.e., samples conducted between 9:00 and 15:00, at temperatures >13°C or 17°C if cloud cover >50%, and with wind speed <5 BU; Fig. 3; Van Swaay et al. 2008). To our knowledge, this is the first comprehensive assessment of how changes in environmental conditions within the recommended PW criteria affect butterfly counts, despite several studies dealing with detectability in butterflies (Brown and Boyce 1998, Kéry and Plattner 2007, Matechou et al. 2014, Melero et al. 2016). Results are especially important in relation to initiatives such as the European Butterfly Monitoring Scheme (https://butterfly-monitoring.net/ebms), a continental effort that includes more than 6200 PW transects that have been sampled for decades, recording more than 400 species and used to inform the protection and management of habitats and species in Europe (e.g., using butterflies as indicators; van Strien et al. 2019).
Notes: Detectability did not differ between forests and seismic lines. For parameter notation (i.e., y, λ, and P), refer to Methods: Statistical model.

Detection and abundance of Arctic fritillaries
The most important covariates in determining detectability were cloud cover and hour of the day, while wind speed was less important, and temperature had an effect only when interacting with cloud cover (Table 1, Fig. 3). While rather surprising, these results are consistent with the ecology of this system. Because butterflies are ectotherms and behavioral thermoregulators, temperature plays an important role in determining their activity, and thus their detection (Kevan and Shorthouse 1970, Bried and Pellet 2012). Indeed, Arctic fritillaries are known to perform dorsal basking, reaching body temperatures between 26° and 32°C (Kevan and Shorthouse 1970). Assuming that the primary determinant of changes in detectability here was butterfly activity, our model suggests that, as long as sunlight is available, temperatures in the range of those observed when cloud cover was <50% (i.e., 17.1°-33.4°C) allow for equal Arctic fritillary activity. Presumably, Arctic fritillaries are efficient at solar basking, thus reaching the body temperature necessary to fly regardless of the air temperature when sunlight is available (Kevan and Shorthouse 1970). This is not surprising given that this species inhabits much colder, temperature-limited environments, for example, Zackenberg Research Station in Greenland, with an average temperature in July of ~6°C vs. ~16°C in our study area (Bowden et al. 2015). Conversely, when cloud cover was >50% (and thus basking was presumably more difficult), higher air temperatures were more important for Arctic fritillaries to reach the body temperature necessary to fly (Fig. 3). Interestingly, with all else being equal, Arctic fritillaries had a slight preference for flying in the morning. Finally, the smaller role of wind on detectability might reflect the fact that all assessed PW were embedded in forests. Wind speed was measured at a height of ~1.5 m, and while we did record a few windy PW, the presence of shrubs and trees buffered wind speed at lower heights (<0.5 m) where Arctic fritillaries usually fly (Kevan and Shorthouse 1970).
With respect to abundance estimates, our results confirm that PW counts that are not corrected for detection probability substantially underestimate the abundance of cryptic species (Brown and Boyce 1998, Isaac et al. 2011). Here, we counted approximately one fifth of the butterflies estimated by the model, but we note that some PW were deliberately conducted outside the recommended sampling criteria, for example, in the early morning or evening. Interestingly, because detectability did not change between forests and seismic lines, we confirmed the results of previous studies reporting increases in abundance of Arctic fritillaries on seismic lines (Riva et al. 2018b, 2019). Therefore, even canonical regression techniques applied to PW counts can be effective in addressing management questions, but only when detectability does not vary with factors of interest (e.g., space, time, or habitat; Nowicki et al. 2008). It has been argued that modeling detectability is not always necessary (Welsh et al. 2013, Hutto 2016, but see Guillera-Arroita et al. 2014, Marques et al. 2017). Our results suggest that the context, constraints, and objectives of a study should inform whether hierarchical models are the most appropriate approach.

Contextualization and implications of our study
How sampling conditions influence PW counts through variation in butterfly detectability remains poorly understood, especially at a fine temporal scale. Previous studies generally assess the effects of single covariates on the detection process, for example, temperature (Bried and Pellet 2012, Matechou et al. 2014), or the effects of site, time, or habitat type (Brown and Boyce 1998, Isaac et al. 2011, Melero et al. 2016), thereby ignoring the more complex effects documented here (Fig. 3). Other analyses using binomial-mixture models (e.g., Kéry and Schaub 2012, Melero et al. 2016) had different objectives, with coarser temporal scales. Our study therefore fills a knowledge gap and was possible thanks to an exhaustive sampling effort over a short period, with 192 PW samples in 21 d.
Complementing our study with simulations, we show how species that occur at low densities and are poorly detectable are likely to be overlooked under suboptimal sampling conditions. For instance, when sampling a transect containing 10 Arctic fritillaries at the lowest detectability observed here within PW conditions (P = 0.04), false-absences are twice as likely as true-presences (Fig. 4). Since species assemblages are usually uneven, composed of a few common species and many uncommon species (Preston 1948), the implications of imperfect detection often concern the majority of species, especially in areas of high biodiversity. Furthermore, we observed large variations in abundance within a period as short as one week (e.g., ~threefold decreases between 2 and 9 August; Fig. 2, Table 2). Therefore, when weekly visits (the sampling frequency required by most monitoring schemes) do not capture the peak of a species' flight curve, the risk of false negatives for species that are poorly detectable, low in abundance, and with short flight periods is exacerbated. We acknowledge that detectability estimates varied broadly here (e.g., 95% credible interval between 0.15 and 0.72 for the best sampling conditions), but rather than focusing on the accuracy of abundance estimates, we stress that low detection probabilities often result in false-absences when estimating the occupancy of rare species based on weekly PW counts. There are indeed success stories of rare butterflies recovering from the verge of extinction (Haddad 2018), such that it is undesirable to overlook even small populations of species of conservation concern.
Notably, the implications of this study should be evaluated in the light of the assumptions underlying the open binomial-mixture model: (1) no false positives recorded (including species misidentification); (2) equal detectability among Arctic fritillary individuals; (3) appropriateness of a Poisson distribution for the ecological state model; and (4) the ecological state did not change within each closure period (Kéry and Schaub 2012). The first three assumptions are reasonably met because (1) Arctic fritillaries were distinctive in the field (the only species of the genus Boloria flying at the end of July in the study area), and attention was given to avoid counting the same individual multiple times; (2) we had no reason to expect substantial variation in detectability between individuals (e.g., there is no pronounced sexual dimorphism), although sexes may behave differently and whether the sex ratio differed between seismic lines and forests was unknown (for instance, because male butterflies tend to cover larger distances than females when dispersing (Ehl et al. 2019), changes in sex ratio between sampling environments could affect detectability estimates through more severe violations of the closure assumption for PW with higher proportions of males); and (3) a Poisson distribution is appropriate because we usually counted few butterflies (no sign of overdispersion) and model fit was good (posterior predictive check of model adequacy = 1.08; Melero et al. (2016) considered adequate all models in the interval 0.8-1.2).
Conversely, the closure assumption (4) is likely violated in both the spatial (no emigration or immigration) and demographic (no recruitment or deaths) domains. Arctic fritillaries are common throughout these forests, with both seismic lines and forests providing seemingly suitable habitat rich in willow (Salix spp.), the larval host plant (Riva et al. 2019). Therefore, Arctic fritillaries most likely moved freely within the study area, leaving or colonizing PW. Concurrently, the life span of individual butterflies is shorter than the species' flight period, such that only a subset of the population is present on any given day. This temporal fragmentation, typical of butterflies (Nowicki et al. 2008), implies that Arctic fritillaries emerged or died within closure periods (longevity for three European Boloria spp. was estimated between 3 and 12 d; Bubová et al. 2016). However, we imposed short closure periods (i.e., the six k periods that lasted on average 2.8 ± 0.7 d, min = 2, max = 4; Fig. 2). Therefore, we consider it reasonable for this analysis to assume that the number of Arctic fritillaries on a transect was relatively constant within 2-3 d. Kéry and Schaub (2012) report that mild violations of the closure assumption lead to inflated estimates of abundance and that in such cases the estimated abundance can be interpreted as the number of individuals that used a PW during the surveys. In our case, it is possible that abundance was overestimated (thus underestimating detectability), but the effects of sampling conditions observed here are substantial and we consider our inference robust.
Notably, it is generally assumed that changes in detectability do not determine changes in butterfly counts when analyzing data from monitoring schemes (but see Matechou et al. 2014). Analyses of monitoring schemes usually assess trends over time for species' abundance, commonly evaluating yearly indices calculated using the approach of area under the flight period curve (Rothery and Roy 2001, Dennis et al. 2013, Schmucki et al. 2016, Wepprich et al. 2019). These analyses assume that the effects of detectability on PW counts are negligible in comparison with changes in time due to population trends. Indeed, national-scale trends are robust, with declines in species also corresponding to loss in their distribution (Thomas 2016). However, site-level estimates are likely less accurate due to changes in detectability (Isaac et al. 2011), limiting our ability to manage populations of rare and/or poorly detectable species. The European Butterfly Monitoring Scheme is expanding in the Mediterranean region, one of the biodiversity hotspots in Europe, and when an assemblage contains several rare species, the risks described in this paper are exacerbated. Therefore, approaches currently applied in other monitoring schemes might be unsuitable in the Mediterranean region. For instance, Italy has one of the richest butterfly faunas in Europe (i.e., 300 native species, more than 4 times the British butterfly fauna; Bonelli et al. 2018), and distinguishing extinction/colonization dynamics from false-absences might be problematic for uncommon species if using data recorded following conventional monitoring approaches. A hybrid approach where transects of interest are sampled twice within closure periods (ideally one day) might be a better use of survey effort for informing the species status at each transect site.
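The benefit of repeating a transect within a closure period can be illustrated with a short calculation. The sketch below assumes that abundance is constant across visits and that detections are independent among individuals and visits, which are simplifications; the function name `prob_missed` is hypothetical. Under these assumptions, the probability of recording no individuals in any of K visits is $(1 - P)^{NK}$.

```python
def prob_missed(n_individuals, p_detect, n_visits=1):
    """Probability that no individual is counted in any of n_visits surveys.

    Assumes constant abundance and independent detections across
    individuals and visits (simplifying assumptions of this sketch).
    """
    return (1.0 - p_detect) ** (n_individuals * n_visits)


# Ten individuals at the lowest detectability observed within PW criteria
for n_visits in (1, 2, 4):
    print(n_visits, round(prob_missed(10, 0.04, n_visits), 3))
```

Each additional visit multiplies the probability of a false absence by the single-visit value, so repeated visits are most valuable where single-visit detectability is lowest.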
Finally, even in cases of absolute spatial and demographic closure, it is not possible to estimate true population sizes using only PW counts analyzed with a binomial-mixture model, because information on butterfly longevity cannot be inferred from PW counts (Nowicki et al. 2008). However, in cases where marking individuals to obtain longevity estimates is not feasible (e.g., dealing with rare or delicate butterflies; Haddad et al. 2008), hierarchical models represent an effective approach to account for detectability. Substituting PW counts with true abundances estimated with an open population binomial-mixture model within the framework of area under the flight period curve (Dennis et al. 2013, Schmucki et al. 2016) would allow one to correct for the effect of different environmental conditions on PW counts, and thus compare more accurate indices of abundance for one species in space and time. This would require double visits of each PW within reasonable closure periods (i.e., ideally 1-2 d), but might have important implications when assessing rare or poorly detectable species. However, we note that because this model does not account for changes in longevity, it is thus inaccurate when mortality changes substantially within or between species, in time or with environmental characteristics.

CONCLUSIONS
Discriminating how much of the variation observed within samples is due to true differences in populations, rather than in detectability, is a critical challenge for biodiversity management and conservation (MacKenzie et al. 2002, Kéry and Schaub 2012, Marques et al. 2017). Our study addresses this matter for butterflies, where imperfect detectability has been identified as a conservation priority, especially for species that are rare, poorly detectable, and with limited distribution (Brown and Boyce 1998, Haddad et al. 2008, Nowicki et al. 2008, Isaac et al. 2011). We focused on Pollard Walks (PW), the most widely employed sampling approach for butterflies, demonstrating that ignoring species detectability can have important consequences in estimating occurrence and abundance of species even when sampling protocols follow recommended sampling conditions. Results would presumably differ when evaluating other systems and/or species (e.g., wind might be more important when sampling open habitats, or temperature when assessing thermophilous butterflies), but our work highlights the necessity to critically evaluate the assumptions underlying many studies that use PW. Here, we focused on a cryptic species, but we stress that detectability is an issue even for easily detectable species because variation in detectability with sampling conditions biases estimates of abundance and distribution regardless of the average detectability of a species.
To be clear, we are not suggesting that PW be abandoned. Ecologists routinely capitalize on all sorts of data, including opportunistic data that are arguably far more challenging than PW data. On the contrary, PW have been invaluable in providing some of the most important long-term, standardized field data. However, analyzing PW counts while disregarding detectability will yield unbiased results only when variation in detectability among samples is negligible. Other methods that incorporate detectability (e.g., distance sampling or mark-recapture techniques) are more appropriate when detectability varies substantially, and a reconciliation between such methods and PW has long been sought. For instance, Nowicki et al. (2008) stated that "there is apparently little space for reducing the labor requirements (for mark-recapture). . . it is, rather, refinements of transect surveys that should be sought. A vital advance would be a field method making possible the estimation of detection probability of individuals counted on transects." Hierarchical models akin to the one presented here seem the natural candidate to bridge the gap between PW and the analysis of detectability, but they require multiple samples within meaningful closure periods of assumed constant butterfly abundance.
Low detectability is especially likely to mislead when dealing with poorly detectable (e.g., cryptic) species that are low in abundance. When repeating PW counts in short periods is not feasible, we suggest that PW be conducted under more conservative conditions or under conditions that better represent idiosyncrasies in species' ecologies, especially in areas of high diversity or habitats potentially suitable for rare species. Further studies evaluating more species across a gradient of environments and ecological preferences will be necessary to understand the relation between environmental conditions and PW counts, and thus provide more robust recommendations in terms of sampling conditions.

ACKNOWLEDGMENTS
FR led study design, data analysis, and manuscript writing. GG participated in fieldwork. ADC and FVD contributed to the analysis. SB, JHA, and SEN provided important intellectual contributions on ecology of butterflies and detectability. All authors suggested edits to the manuscript drafted by FR. All authors agreed to submission and to be accountable for the contents of this manuscript.