Accounting for bias in prevalence estimation: The case of a globally emerging pathogen

1. Accurate quantification of infection parameters is necessary to ensure effective surveillance, investigation and mitigation of infectious diseases. However, hosts and pathogens are often imperfectly observed and key epidemiological parameters, such as infection prevalence, can be biased if this observational uncertainty is not properly accounted for. 2. Here, we evaluated the combined effects of imperfect pathogen detection and host [...] resources during mitigation strategies of infectious diseases. The methods used here can be applied to a wide range of host-pathogen systems, and will be of interest to both researchers and practitioners aiming to investigate and mitigate the impacts of infectious diseases on free-ranging populations.

KEYWORDS
Batrachochytrium dendrobatidis, chytridiomycosis, Darwin's frog, detection probability, emerging infectious diseases, host pseudoreplication, infection prevalence, occupancy models

| INTRODUCTION
Emerging infectious diseases pose an important threat to human health, food security, the economy and biodiversity; thus, their surveillance is crucial to assess any risks they may present (Baker et al., 2022; Daszak et al., 2000; Fisher et al., 2012). To efficiently surveil diseases, it is necessary to provide robust metrics at a host population level that can be reliably compared over space and time (Wobeser, 2007a). Prevalence, the proportion of infected individuals in a group at a specific point in time (i.e. point prevalence) or during a period of time (i.e. period prevalence), is a fundamental metric in epidemiology (Porta, 2014). This parameter is used, for instance, to assess the risk of emergence or stage of invasion of a pathogen; to determine the biotic and abiotic factors influencing the spatiotemporal dynamics of infection and disease; to assess the effectiveness of disease mitigation strategies and to parameterise models used to evaluate disease risk (Langwig et al., 2015; Wobeser, 2007a).
Thus, reliable prevalence estimates are essential as biased estimates can perpetuate erroneous inferences about infectious diseases, potentially rendering management measures ineffective and wasting resources (McClintock et al., 2010). However, prevalence estimates can be biased for several reasons, including inappropriate sampling design and imperfect pathogen detection (Lachish & Murray, 2018; Wobeser, 2007b). The latter relates to infection status misclassifications, that is, false positives and false negatives. False positives can be the result of lack of test specificity or cross-contamination during sampling or diagnostic processes, while false negatives may be due to the failure of sampling to collect the pathogen or some of its parts (e.g. antigens, nucleic acid) and/or the failure of the diagnostic test to detect the pathogen when present in the sample (sensitivity; Colvin et al., 2015; DiRenzo, Campbell Grant, et al., 2018; Thompson, 2007).
Obtaining reliable estimates of prevalence is particularly difficult when studying wildlife host-pathogen systems (Lachish & Murray, 2018; Wobeser, 2007c). First, designing adequate sampling schemes is inherently harder than in epidemiological studies on humans or domestic animals. For instance, some individuals could be sampled more than once if they are not individually identifiable, especially in investigations aiming to estimate a period prevalence. This issue is known as individual pseudoreplication, which can decrease the variability around estimates and violates the assumption of data independence (Hurlbert, 1984), leading to biased prevalence estimates (Miller et al., 2018). Second, infection status misclassifications are also more frequent in wildlife: most sampling and diagnostic methods have not been adequately validated because of the lack of means or reference tests with known sensitivity and specificity for the species under study (Wobeser, 2007c). In that regard, occupancy models, a commonly used method in ecology (MacKenzie et al., 2002), based on taking replicates to estimate the probability of observing an organism present during sampling (i.e. detection probability), have proved to be very useful for adjusting prevalence estimates to account for infection status misclassifications, especially false-negative errors (Lachish et al., 2012; Miller et al., 2012). However, most studies have focused only on the sensitivity of the diagnostic test in occupancy models (Colvin et al., 2015; Lachish et al., 2012; Miller et al., 2012; Thompson, 2007). A notable exception is DiRenzo, Campbell Grant, et al. (2018), who developed a hierarchical, multiscale occupancy model that accounts for false negative errors arising from both sampling and diagnostic testing (hereafter referred to as the 'detection-adjusted model'). This model was validated by DiRenzo, Campbell Grant, et al. (2018) using an extensive simulation study but, to our knowledge, it has never been fitted to empirical data.
The detection-adjusted model was originally developed to account for imperfect detection of Batrachochytrium dendrobatidis (Bd); however, this model can be used to account for imperfect detection of a variety of infectious agents (DiRenzo, Campbell Grant, et al., 2018). Bd is a fungus that causes amphibian chytridiomycosis in susceptible species, an emerging infectious disease responsible for extensive biodiversity loss (Scheele et al., 2019). Surveillance of Bd mostly relies on the use of non-invasive oral and skin swabs (for larval and post-metamorphic stages, respectively), followed by detection of Bd DNA using molecular techniques such as a TaqMan quantitative (real-time) polymerase chain reaction (qPCR) assay (Boyle et al., 2004; Hyatt et al., 2007; Kriger et al., 2006; Shin et al., 2014).
However, infection status misclassifications caused by the sampling and diagnostic techniques have rarely been considered when estimating prevalence (but see DiRenzo, Campbell Grant, et al., 2018; Hollanders & Royle, 2022; Miller et al., 2012). The detection probability of Bd using qPCR is generally high (Boyle et al., 2004; Hyatt et al., 2007; Kriger et al., 2006) but dependent on Bd load, that is, it has a low sensitivity when Bd load is low (Miller et al., 2012). Two studies showed that skin swabbing can yield erroneous results for both infection status and Bd loads in comparison with other sampling methods such as filtered water (Shin et al., 2014), or skin digests (Clare et al., 2016), especially at low infection intensities. This leads to an underestimation of the infection prevalence and of the average infection intensity (Clare et al., 2016; Shin et al., 2014). DiRenzo, Campbell Grant, et al. (2018) demonstrated that, if imperfect Bd detection due to sampling alone was not taken into account, Bd prevalence could be underestimated by as much as 71% in an amphibian assemblage in Panama.
Here, we aim to quantify infection prevalence over a period of 5 months (i.e. period prevalence; this period corresponds to spring and summer, which is when our target host species is active), while accounting for imperfect pathogen detection arising from both the sampling and diagnostic processes in an empirically studied wild host-pathogen system. To this end, we fitted the detection-adjusted model to Bd infection data obtained from free-living populations of the southern Darwin's frog (Rhinoderma darwinii) in Chile. As our data come from hosts individually identified using photographs taken at the time of sampling, we were also able to evaluate the effects of individual pseudoreplication in our prevalence estimates. We expected the naïve (i.e. observed) Bd infection prevalence to be negatively biased in our system, given the results obtained by previous studies assessing either Bd swab detection probability or qPCR detection probability (DiRenzo, Campbell Grant, et al., 2018; Hollanders & Royle, 2022; Miller et al., 2012). We also predicted that pseudoreplication would exacerbate the underestimation of Bd prevalence, since R. darwinii individuals are highly susceptible to lethal chytridiomycosis and infected individuals are less likely to be recaptured than uninfected individuals (Valenzuela-Sánchez et al., 2017, 2022). Finally, using a deterministic matrix population model parameterised with vital rates derived from previous studies, we examine whether using our naïve or corrected prevalence estimate changes the projections of the size of a R. darwinii population over a 20-year period, which would have implications for disease risk assessment and conservation prioritization.

| Sample collection
The field study was conducted in two areas of southern Chile: Neltume (39° 48′ S; 71° 57′ W) and Contulmo (38° 01′ S; 73° 10′ W; Figure 1). Each area was visited once a month from November 2018 to March 2019, with each visit comprising a three-consecutive-day sampling period. Six 20 × 20 m and two 50 × 50 m nearby plots were delineated within each area. A team of two people (one of whom participated in all fieldwork sessions) searched each plot for 30-60 min each day, walking haphazardly and capturing all R. darwinii individuals found. Each frog was handled with new disposable nitrile gloves and footwear was disinfected between plots separated by at least 1 km or when crossing a stream. Each captured frog was photographed and sampled for Bd using a dry, sterile rayon-tipped swab (MW100; Medical & Wire Equipment Co™) as described by Soto-Azat, Valenzuela-Sánchez, Clarke, et al. (2013). Individual identification is possible in R. darwinii using photographs of its natural ventral colour patterns (Soto-Azat, Valenzuela-Sánchez, Collen, et al., 2013). If a frog was captured more than once during the 3-day search period, it was sampled only during the first capture occasion.
On 607 occasions, this swabbing process was immediately repeated using a second swab, and the order of the swabs was recorded. Skin swabs were kept at ambient temperature (on average, 15°C) for no more than 4 h in the field (and always away from direct sunlight) before storage at −20°C, and later at −80°C on arrival at the laboratory.

| Laboratory diagnostic testing
DNA was extracted in June 2019 from skin swabs following the Prepman Ultra™ protocol and the presence of Bd DNA was detected using the validated TaqMan qPCR assay targeting the ITS1/5.8S DNA region, with extractions diluted 1:10 and including bovine serum albumin (BSA) to decrease PCR inhibition (Boyle et al., 2004; Garland et al., 2010; Hyatt et al., 2007; O.I.E., 2019). The quantification of Bd infection intensity was adjusted to account for the dilution of the original sample by multiplying the qPCR result by 120 (see Hudson et al., 2016). The results are presented as Zoospore Equivalents (ZE).
Negative controls and quantitation standards (also serving as positive controls) at 0.1, 1, 10 and 100 ZE were run in duplicate on each qPCR plate. Quantitation standards were made with a Bd isolate belonging to the Global Panzootic Lineage (ref. IA043, Spain). In general, two qPCR wells were run for each swab, but sometimes up to six wells were run per sample (see Appendix S1: Estimation of Bd infection prevalence with more stringent criteria to define positivity).
A swab was considered positive if a positive value and a clear amplification with a sigmoid curve appeared in at least one of its qPCR replicates. An individual was categorized as Bd-positive at each time of capture if any of the swabs taken from it at that time was positive.
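The positivity rule and the dilution adjustment described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the toy capture are hypothetical, and each qPCR well is assumed to have already been scored as positive or negative (positive value with a clear sigmoid amplification curve).

```python
DILUTION_FACTOR = 120  # multiplier stated in the text to account for sample dilution


def zoospore_equivalents(qpcr_value):
    """Scale a raw qPCR quantity to Zoospore Equivalents (ZE) of the original sample."""
    return qpcr_value * DILUTION_FACTOR


def swab_is_positive(well_results):
    """A swab is positive if at least one of its qPCR wells was scored positive."""
    return any(well_results)


def capture_is_positive(swabs):
    """A capture is Bd-positive if any swab taken at that capture is positive."""
    return any(swab_is_positive(wells) for wells in swabs)


# Hypothetical capture: first swab negative in both wells,
# duplicate swab positive in one of its two wells.
example_capture = [[False, False], [False, True]]
print(capture_is_positive(example_capture))
print(zoospore_equivalents(0.5))
```

Under this rule a single positive well is enough to classify the whole capture as positive, which is why the more stringent criteria mentioned in Appendix S1 would yield lower naïve prevalence.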

| Data analyses
All data manipulations and analyses were performed in the R environment (v4.2.0; R Core Team, 2022) unless stated otherwise. All code is available in Sentenac et al. (2023).

| Naïve period prevalence estimation
We calculated the naïve 5-month infection prevalence by dividing the number of captures found as Bd positive by the total number of captures made during the study period (i.e. period prevalence). Note that an individual can be related to more than one capture if that individual was sampled more than once during different months (i.e. pseudoreplication). We calculated naïve prevalence for two different scenarios: (i) considering only the first of the two swabs taken from the frogs (hereafter, 'classic Bd investigation with pseudoreplicates' scenario) and (ii) considering both of the duplicated swabs (hereafter, 'double swab with pseudoreplicates' scenario).
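As a minimal sketch of the two naïve scenarios, the function below counts a capture as positive either from the first swab only or from any swab. The data layout (a pair of swab results per capture, with `None` when no duplicate swab was taken) and the toy records are assumptions made for illustration.

```python
def naive_prevalence(captures, use_duplicate_swab):
    """Proportion of captures scored Bd-positive.

    captures: list of (first_swab_positive, second_swab_positive_or_None).
    use_duplicate_swab: False for the 'classic' scenario (first swab only),
    True for the 'double swab' scenario (either swab).
    """
    if not captures:
        raise ValueError("at least one capture is required")
    positives = 0
    for first, second in captures:
        if first or (use_duplicate_swab and bool(second)):
            positives += 1
    return positives / len(captures)


# Hypothetical captures: (first swab, duplicate swab or None if not taken)
toy_captures = [(False, None), (True, True), (False, True), (False, False)]
print(naive_prevalence(toy_captures, use_duplicate_swab=False))
print(naive_prevalence(toy_captures, use_duplicate_swab=True))
```

In the toy data the third capture is missed by the first swab and picked up by the duplicate, so the 'double swab' estimate is higher than the 'classic' one.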

| Removing individual pseudoreplication
Our sampling of individuals was not instantaneous; therefore, we expected the presence of individual pseudoreplication because a single frog could have been sampled during multiple months. As R. darwinii individuals were identified using photographs, we could remove pseudoreplicates from our dataset. To achieve this, we followed this procedure: (1) for individuals never captured as Bd-positive, a single capture occasion was randomly retained in the dataset; (2) for individuals captured multiple times but only once as Bd infected, the Bd-positive occasion was retained and (3) for individuals captured multiple times as Bd-positive (this occurred only with three individuals: No. 270, 313, 487), we randomly retained one of the Bd-positive occasions. Depending on which positive occasion is selected for these individuals, different final datasets without pseudoreplicates were possible, but we showed that this had little impact on the estimation of prevalence (see Figure S1 for details). We chose one for the main text and calculated 5-month prevalence for the 'classic Bd investigation without pseudoreplicates' and 'double swab without pseudoreplicates' scenarios.
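The three-step procedure above can be sketched as a single selection rule: keep a Bd-positive occasion when one exists (chosen at random if there are several), otherwise keep a random occasion. The record layout, the function name and the toy data are hypothetical; the study's own code is in R.

```python
import random
from collections import defaultdict


def remove_pseudoreplicates(captures, seed=0):
    """Retain one capture per individual.

    captures: list of dicts with keys 'frog_id', 'occasion', 'bd_positive'.
    """
    rng = random.Random(seed)  # seeded so the retained dataset is reproducible
    by_frog = defaultdict(list)
    for capture in captures:
        by_frog[capture["frog_id"]].append(capture)

    retained = []
    for frog_id in sorted(by_frog):
        records = by_frog[frog_id]
        positives = [c for c in records if c["bd_positive"]]
        # Steps 2 and 3: prefer a Bd-positive occasion (random pick if several).
        # Step 1: with no positive occasion, pick any occasion at random.
        pool = positives if positives else records
        retained.append(rng.choice(pool))
    return retained


# Hypothetical capture histories for three frogs.
toy_captures = [
    {"frog_id": 1, "occasion": 1, "bd_positive": False},
    {"frog_id": 1, "occasion": 2, "bd_positive": True},
    {"frog_id": 2, "occasion": 1, "bd_positive": False},
    {"frog_id": 2, "occasion": 3, "bd_positive": False},
    {"frog_id": 3, "occasion": 2, "bd_positive": True},
    {"frog_id": 3, "occasion": 4, "bd_positive": True},
]
retained = remove_pseudoreplicates(toy_captures, seed=1)
prevalence = sum(c["bd_positive"] for c in retained) / len(retained)
print(len(retained), prevalence)
```

Changing `seed` changes which occasion is kept for frogs 2 and 3, which mirrors the point that several final datasets are possible; the infection status retained for each individual does not change.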

| Accounting for imperfect pathogen detection resulting from sampling and diagnostic errors
We used the sampling and diagnostic detection-adjusted model described by DiRenzo, Campbell Grant, et al. (2018) to account for imperfect pathogen detection resulting from sampling and diagnostic errors. We fitted this model to the dataset with and without pseudoreplicates. An alternative approach for dealing with individual pseudoreplication would be to use a multi-season occupancy model to estimate point prevalence, but developing such a model was beyond the scope of our study. Briefly, the detection-adjusted model is a static multiscale occupancy model considering two latent ecological processes, the probability of a host being infected (ψ, which can be seen as the infection prevalence; here a host corresponds to a 'site' in classical site occupancy models) and, if so, its infection intensity; and two observational processes, the sampling process and diagnostic testing. For each observational process, pathogen detection probability is modelled by a classical logistic regression based on the history of pathogen detections in the series of samples (i.e. swabs) taken from the same capture, and in the series of diagnostic tests (qPCR) performed on the same sample, respectively. The model also accounts for heterogeneity in detection probabilities that arises due to variation in infection intensity. The full description and parametrisation of the model are available in the Appendix S1: Description of the detection-adjusted model. All parameters were estimated in a Bayesian framework by Markov Chain Monte Carlo methods using JAGS (Plummer, 2003) through the R package jagsUI (Kellner, 2015).
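To illustrate only the observational layers of such a multiscale model, the sketch below combines a logistic detection probability that depends on infection intensity with the probability that an infected host is detected at least once across replicate swabs and qPCR wells. It is not the fitted JAGS model: the coefficient values, the use of a log-scale load as the covariate and the function names are assumptions.

```python
import math


def logistic(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def detection_probability(intercept, slope, log_load):
    """Logistic regression of detection probability on (log) infection intensity."""
    return logistic(intercept + slope * log_load)


def prob_detected_at_least_once(p_swab, p_qpcr, n_swabs, n_wells):
    """Probability that an infected host yields at least one positive well.

    A swab contributes a detection only if it collects Bd (p_swab) and at
    least one of its n_wells qPCR replicates amplifies (p_qpcr per well).
    """
    p_swab_detects = p_swab * (1.0 - (1.0 - p_qpcr) ** n_wells)
    return 1.0 - (1.0 - p_swab_detects) ** n_swabs


# Hypothetical coefficients: detection increases with infection intensity.
for log_load in (0.0, 1.0, 3.0):
    p_swab = detection_probability(-0.5, 1.0, log_load)
    p_qpcr = detection_probability(0.0, 1.2, log_load)
    single = prob_detected_at_least_once(p_swab, p_qpcr, n_swabs=1, n_wells=2)
    double = prob_detected_at_least_once(p_swab, p_qpcr, n_swabs=2, n_wells=2)
    print(log_load, round(single, 3), round(double, 3))
```

The comparison between one and two swabs shows why replication at both levels matters most for hosts with low loads, where each individual swab or well is more likely to miss the pathogen.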

| Predictions of host population trajectories
We used a simple deterministic matrix population model to explore the implications of choosing our worst (i.e. 'classic Bd investigation with pseudoreplicates' scenario) and best (i.e. detection-adjusted model without pseudoreplicates, or 'full design' scenario) method to estimate prevalence on predicted host population trajectories. This population model considers three age classes (new-borns, 1-year-old juveniles and adults) and two infection states (Bd infected or uninfected). New-borns and juveniles stay in their respective classes for a year, and all individuals reach adulthood when 2 years old (Valenzuela-Sánchez et al., 2022). For the projection matrix, we used demographic parameters (per-capita fecundity and survival probabilities) estimated on an annual scale for a free-living R. darwinii population, which has an asymptotic population growth rate of 1.067 in the absence of Bd infection (i.e. slowly growing population; Valenzuela-Sánchez et al., 2017, 2022). To allow transition between infection states, the model was parameterized with annual infection and recovery probabilities (we used the same transition probability values across all the age classes). As we used time-constant parameters in our matrix model, once the stable stage distribution has been reached, the values selected for the transition probabilities must lead to an annual prevalence equal to that estimated using either the 'classic Bd investigation with pseudoreplicates' or the 'full design' scenarios. Therefore, it was necessary to calculate annual prevalence from our 5-month prevalence estimates. We used an individual-based model with monthly time steps to calculate annual period prevalence (see Appendix S1: Calculation of the annual prevalence).
We projected the fate of a population having an initial size of 100 individuals (57 juveniles and 43 adults, representing the stable stage distribution of the Bd-free population) for a period of 20 years, as described by Valenzuela-Sánchez et al. (2022). We made the simplifying assumption that this population was closed (no emigration or immigration occurred), and no compensatory mechanisms against infection existed (but see Valenzuela-Sánchez et al., 2022).
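The structure of such a model (three age classes by two infection states) can be sketched with a small projection matrix and power iteration to obtain the asymptotic growth rate and the infected fraction at the stable stage distribution. All numerical values below are hypothetical and are not the R. darwinii vital rates; the sketch also assumes that new-borns are born uninfected, that infected and uninfected adults have the same fecundity, and that infection-state transitions are applied after annual survival.

```python
def build_matrix(fecundity, surv_uninf, surv_inf, p_infect, p_recover):
    """6x6 annual projection matrix.

    State order: new-born U, new-born I, juvenile U, juvenile I, adult U, adult I
    (U = uninfected, I = infected). A[i][j] is the contribution of state j to state i.
    """
    s_nu, s_ju, s_au = surv_uninf
    s_ni, s_ji, s_ai = surv_inf
    b, g = p_infect, p_recover
    A = [[0.0] * 6 for _ in range(6)]
    A[0][4] = fecundity              # adults produce uninfected new-borns
    A[0][5] = fecundity
    A[2][0], A[3][0] = s_nu * (1 - b), s_nu * b   # new-born U -> juvenile U / I
    A[2][1], A[3][1] = s_ni * g, s_ni * (1 - g)   # new-born I -> juvenile U / I
    A[4][2], A[5][2] = s_ju * (1 - b), s_ju * b   # juvenile U -> adult U / I
    A[4][3], A[5][3] = s_ji * g, s_ji * (1 - g)   # juvenile I -> adult U / I
    A[4][4], A[5][4] = s_au * (1 - b), s_au * b   # adult U stays adult
    A[4][5], A[5][5] = s_ai * g, s_ai * (1 - g)   # adult I stays adult
    return A


def stable_growth(A, n_iter=500):
    """Power iteration: asymptotic growth rate and stable stage distribution."""
    v = [1.0 / len(A)] * len(A)
    lam = 1.0
    for _ in range(n_iter):
        w = [sum(a * x for a, x in zip(row, v)) for row in A]
        lam = sum(w)                 # v sums to 1, so this is the one-step growth
        v = [x / lam for x in w]
    return lam, v


def infected_fraction(v):
    """Share of the stable stage distribution that is in infected states."""
    return (v[1] + v[3] + v[5]) / sum(v)


# Hypothetical vital rates (new-born, juvenile, adult survival).
params = dict(fecundity=2.4, surv_uninf=(0.3, 0.5, 0.7), surv_inf=(0.1, 0.2, 0.3))
lam_free, v_free = stable_growth(build_matrix(p_infect=0.0, p_recover=0.1, **params))
lam_bd, v_bd = stable_growth(build_matrix(p_infect=0.1, p_recover=0.1, **params))
print(round(lam_free, 3), round(lam_bd, 3), round(infected_fraction(v_bd), 3))
```

In a calibration like the one described in the text, the infection and recovery probabilities would be tuned until the infected fraction at the stable stage distribution matches the target annual prevalence.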

| RESULTS
We made 1085 captures of R. darwinii: 583 in Contulmo and 502 in Neltume (Figure 1). As the number of positive individuals is generally very low in R. darwinii (Valenzuela-Sánchez et al., 2017), we combined data from both populations prior to analyses.

| Naïve period prevalence estimation
In total, we detected Bd-positive frogs on 43 occasions, yielding a naïve 5-month prevalence of 4.0% in the 'double swab with pseudoreplicates' scenario (43/1085, Figure 1). Had we not taken duplicate swabs, we would only have observed 36 positives and estimated a 5-month prevalence of 3.3% (Figure 1; Table S1). This is what a 'classic Bd investigation with pseudoreplicates' would have reported, although other investigations might have more stringent criteria to categorize individuals as infected and report an even lower prevalence estimate (see Table S1).

| Removing individual pseudoreplication
Of 1085 captures, we captured only 641 individuals, of which 282 were captured more than once (in total, there were 444 recaptures). Three individuals were captured multiple times as infected. Eleven individuals gained infection while three apparently cleared it over the study period (Table S2). Removing pseudoreplicates from our dataset led to a 5-month period prevalence of 6.2% (40/641) and 5.3% (34/641) in the 'double swab without pseudoreplicates' and 'classic Bd investigation without pseudoreplicates' scenarios, respectively. In other words, not properly accounting for individual pseudoreplication resulted in a considerable underestimation (of roughly 36%-38%) of the infection prevalence estimates, regardless of the scenario considered (Figure 1; Table S1).

| Accounting for sampling and diagnostic imperfect pathogen detection using the detection-adjusted model
The detection-adjusted model showed that both swab and qPCR detection probabilities significantly increased with infection intensity (their respective slope coefficient in the logistic regression was positive and did not overlap zero) but were not significantly different from each other (Figure 2, 95% Bayesian credible intervals, or CRI, are overlapping). The swabbing detection probability estimates had a higher uncertainty than the diagnostic testing process, possibly due to the lower number of duplicated observations in the former case (Figure 2). Posterior distributions for all model parameters in the scenarios with and without pseudoreplicates are shown in Figure S1.
Removing pseudoreplicates or not did not influence the magnitude of the bias attributable to false negative errors: overall, taking a second set of swabs and correcting for imperfect swab and qPCR detection avoided a 27% underestimation of the prevalence whether pseudoreplicates were removed (7.3% instead of 5.3%) or not (4.5% instead of 3.3%). Alone, the detection-adjusted model avoided a 16% underestimation when pseudoreplicates were removed (7.3% instead of 6.2%) and an 11% underestimation when they were not removed (4.5% instead of 4.0%; Figure 1).
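The underestimation percentages quoted in this section follow from a simple relative difference between a biased estimate and a reference estimate. The sketch below applies it to the rounded prevalence values reported in the text; because the inputs are rounded to one decimal, results can differ from the published percentages by about one percentage point. The function and dictionary names are illustrative.

```python
def relative_underestimation(biased, reference):
    """Fraction by which `biased` falls short of `reference`."""
    if reference <= 0:
        raise ValueError("reference prevalence must be positive")
    return 1.0 - biased / reference


# Rounded 5-month prevalence estimates (%) reported in the text.
prevalence = {
    "classic, with pseudoreplicates": 3.3,
    "double swab, with pseudoreplicates": 4.0,
    "adjusted, with pseudoreplicates": 4.5,
    "classic, without pseudoreplicates": 5.3,
    "double swab, without pseudoreplicates": 6.2,
    "adjusted, without pseudoreplicates (full design)": 7.3,
}

full = prevalence["adjusted, without pseudoreplicates (full design)"]
for name, value in prevalence.items():
    shortfall = 100 * relative_underestimation(value, full)
    print(f"{name}: {shortfall:.1f}% below the full-design estimate")
```

Comparing every scenario with the full-design estimate makes the cumulative effect of the two sources of bias visible in a single column.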

| Implications on predicted host population trajectories
Our deterministic matrix population model showed that the method chosen to estimate prevalence has a considerable impact on the projected host population trajectories. In the 'classic Bd investigation with pseudoreplicates' scenario, there was a reduction of 6.8% in the asymptotic population growth rate when comparing the Bd-free versus the Bd-positive population (1.067 vs. 0.994; Figure 3). In the 'full design' scenario, this reduction was 15.5% (1.067 vs. 0.902; Figure 3). Although the presence of Bd infection turned a growing R. darwinii population into a declining one in both scenarios, the difference in the magnitude of the predicted population decline without taking imperfect detection of Bd and pseudoreplication into account versus that of the predicted decline when doing so is substantial (Figure 3). For example, after 20 years the population size, in the 'classic Bd investigation with pseudoreplicates' scenario, was 90 frogs (i.e. 10% decrease in size during the period), while it was only 14 frogs in the 'full design' scenario (86% decrease in size during the period).

| DISCUSSION
Uncertainty is pervasive in disease ecology (Lachish & Murray, 2018), yet epidemiological investigations that address multiple sources of bias are relatively scarce. In our study system, not accounting for the combined effects of individual pseudoreplication and imperfect Bd detection from sampling and diagnostic methods would have led to a 55% underestimation in period prevalence. Our projection of the fate of a R. darwinii population experiencing different Bd infection prevalences helps illustrate the importance of such a bias in the context of disease risk assessment. In the 'classic Bd investigation with pseudoreplicates' scenario, our matrix population model predicted a 10% decrease in population size over a 20-year period, while the predicted decline was 86% for the same period in the 'full design' scenario (i.e. detection-adjusted model without pseudoreplicates).
Prevalence bias can, therefore, mislead researchers and conservation practitioners into underestimating the impacts of pathogen infection on wild populations. Accounting for bias in prevalence estimation is essential to correctly prioritize conservation actions, build appropriate mitigation strategies and adequately allocate resources.
In addition, prevalence estimates are used for purposes other than the assessment of disease risks, including the evaluation of mitigation measure effectiveness or the understanding of infection and disease dynamics. Our study shows the importance of correcting for bias in prevalence to provide robust inferences about the study system.
Pseudoreplication is often not considered in wildlife disease studies, but we show that failing to account for individual host identity can substantially affect the estimation of infection period prevalence (pseudoreplication alone led to a 36%-38% prevalence underestimation). This finding is in line with the conclusions of a meta-analysis on primate helminth infections, which showed that prevalence estimates were, on average, lower by over 12% when researchers did not identify individual hosts (Miller et al., 2018). In a study investigating faecal helminth parasite loads in six different populations of wild giant pandas (Ailuropoda melanoleuca) individually identified a posteriori via microsatellite markers, Zhang et al. (2011) reported that not accounting for host identity led to prevalence underestimation in half of the populations. It also led to the erroneous conclusion that there were significant differences in prevalence between populations, when there were none (Zhang et al., 2011).
Attempts to avoid bias in prevalence due to host pseudoreplication are rare in the literature, especially regarding studies on parasites other than intestinal parasites detected in faecal samples. This might be because such investigations require individual identification, and this can be difficult to achieve for free-living animals. While we acknowledge this difficulty, there is now a variety of individual identification methods available, including physical tags, genetic methods, radio-frequency identification (e.g. Passive-Integrated-Transponder [PIT] tags), and image-based techniques (Petso et al., 2021; Vidal et al., 2021). We used the latter technique, which can be inexpensive and non-invasive, although perhaps time-consuming to analyse compared with physical or PIT tags. The advent of deep and machine learning is likely to solve this problem and might also reduce human errors (Vidal et al., 2021). Interestingly, implementing any of these identification techniques can serve many other purposes additional to epidemiological investigations, such as demographic, behavioural and ecological surveys. We acknowledge that, in some circumstances, it might be impossible to implement any identification technique, for instance when studying extremely small species (e.g. Paedophryne frogs). Alternative approaches to avoid pseudoreplication in these cases can include adopting an instantaneous sampling design or sampling with removal. Hierarchical models that allow robust estimation of state-specific abundance (and, hence, of prevalence) using data from unmarked individuals captured at multiple sites have also been developed (DiRenzo et al., 2019; DiRenzo, Zipkin, et al., 2018). Although these models have the potential to improve our understanding of wildlife diseases, they are generally data-hungry and parameter identifiability problems can arise when the number of sampled sites is small (DiRenzo et al., 2019).

FIGURE 2 Estimated Batrachochytrium dendrobatidis (Bd) detection probabilities in this study (Chile) and others, with (a) and without (b) pseudoreplicates. Illinois: Miller et al. (2012); Panama: DiRenzo, Campbell Grant, et al. (2018); East Australia: Hollanders and Royle (2022). Shaded areas represent the 95% Bayesian credible intervals. qPCR, quantitative (real-time) polymerase chain reaction; ZE, Zoospore Equivalent.
Our results confirm that accounting for false negative errors through the use of occupancy modelling frameworks improves the estimation of epidemiological parameters (Colvin et al., 2015; Lachish et al., 2012; Miller et al., 2012; Thompson, 2007). Interestingly, prevalence underestimation due to imperfect detection alone was not as high in our study as in other Bd-amphibian systems where this has been investigated (DiRenzo, Campbell Grant, et al., 2018; Hollanders & Royle, 2022; Miller et al., 2012). For instance, while our estimated swab detection probability was very similar to that of DiRenzo, Campbell Grant, et al. (2018), their prevalence was much more biased due to swab errors than in our study because they captured a higher proportion of individuals with extremely low infection burdens (<1 ZE), for which swab detection probability imperfection is greater (Figure 2; Figure S3). In addition, pathogen detection probabilities can vary across different studies (Figure 2); therefore, caution should be taken when extrapolating false negative error rates from other (even similar) study systems in an attempt to correct prevalence estimates. Sources of variation in pathogen detectability can be attributed to factors associated with the pathogen (e.g. different Bd strains can have different numbers of DNA copies of the region targeted by the qPCR assay; Rebollar et al., 2017) or the observation process. For instance, sampling and diagnostic testing protocols as well as the criteria used to define infection may not be consistent among research teams. This is often the case with Bd investigations, with different swabbing techniques (Simpkins et al., 2014), different diagnostic testing protocols (e.g. using, or not, BSA to decrease qPCR inhibition), and different methods to correct for dilution. Therefore, we advise researchers to provide a detailed description of the procedures followed during sampling, pathogen detection and data analyses. This not only is important for replicability but also can allow researchers to assess the extent to which estimates of pathogen detectability can be extrapolated to other systems.
The hierarchical modelling framework used here is flexible and can be applied to a wide range of host-pathogen systems, including those with different sampling/diagnostic testing techniques (see DiRenzo, Campbell Grant, et al., 2018 for examples). The detection-adjusted model is also flexible in the sense that it could include additional sources of errors that we did not consider here, provided there are sufficient data. First, we did not account for variation in host detection probability. If infected individuals are less detectable than uninfected ones, which is the case in some host-pathogen systems (Briggs et al., 2010; Hudson et al., 2016), infection prevalence estimates can be negatively biased. For instance, Jennelle et al. (2007) showed in the house finch (Carpodacus mexicanus)-Mycoplasma gallisepticum system that uninfected finches were more detectable, leading to a significant underestimation of prevalence and spurious inference on disease dynamics. Second, the detection-adjusted model used in the current study assumes there are no false positives. A recent study described a hierarchical model able to account for false positive errors in a Bd-amphibian system and showed that ignoring false positives greatly influenced transition rates between infection states (Hollanders & Royle, 2022). A high number of false positives would cause an overestimation of prevalence, but we are confident that this error remained minimal in our study since we ensured strict protocol conditions (e.g. negative controls, biosecurity measures) and because the recapture history of some individuals showing swab and qPCR conflicting results was more indicative of imperfect Bd detection than it was of false positives resulting from sample-to-sample contamination or PCR-product carry-over (Kwok, 2012). For example, at the time of first capture, one individual (#270, Table S2) had a first swab that was negative and the duplicate swab (taken a few seconds later) showing either absence or very low amounts of Bd DNA depending on the qPCR replicates (all qPCRs <5 ZE). It was recaptured the following month with both swabs giving infection loads >2000 ZE, indicating that this individual probably was at the very early stage of infection and not a false positive when sampled the first time. Finally, the model might be improved if it could account for the order in which replicate samples were taken, for instance by including time-varying detection probabilities (we were not able to follow this approach because the low number of infected individuals in our study system precluded parameter estimation). While other amphibian-Bd studies have shown both replicate swabs to have the same probability of pathogen detection (DiRenzo, Campbell Grant, et al., 2018; Simpkins et al., 2014), our results suggest the order in which the swabs are conducted might impact their respective probability to detect Bd in our system: the second swabs detected Bd on seven occasions when the first swabs were negative, while the reverse was true on only one occasion. While more data are needed to robustly confirm these signals, it seems from our data that second swabs might be better able to detect Bd in our study system, and in greater quantity, when individuals have low infection intensity (<100 ZE; Figure S2).

FIGURE 3 Theoretical population growth curves of southern Darwin's frog (Rhinoderma darwinii) populations, under different scenarios of Batrachochytrium dendrobatidis (Bd) annual prevalence. The deterministic growth rate once the age structure of the population is stable (λ) is shown.

| CONCLUSION
Our findings highlight the importance of taking imperfect detection and pseudoreplication into account when estimating infection prevalence. Not considering these sources of error led to a considerable underestimation of period prevalence in our study system. As illustrated by our matrix population model, this bias in prevalence estimation can have important implications regarding the perceived impacts of an infectious disease, for instance by substantially changing the predicted population-level impacts of infection. Some sources of error arising during epidemiological studies can be avoided with an adequate sampling design (for instance, if possible, using cross-sectional instantaneous sampling to avoid pseudoreplication) or individual identification during data analysis. It is crucial that practitioners and researchers are aware of the origins and implications of the numerous sources of error that can bias the estimation of epidemiological parameters, and know how to address these problems using the most suited sampling design, pathogen detection technique and analytical methods. Failing to do so is likely to hamper our ability to effectively inform wildlife disease risk assessment and management.

FIGURE 1 Summary of the design (a) and approaches (b) followed in this study to estimate the 5-month prevalence of Batrachochytrium dendrobatidis (Bd) infections in the southern Darwin's frog (Rhinoderma darwinii). qPCR, quantitative (real-time) polymerase chain reaction; ZE, Zoospore Equivalent.