### Abstract

Early detection of population declines is essential to prevent extinctions and to ensure sustainable harvest. We evaluated the performance of two *N*_{e} estimators to detect population declines: the two-sample temporal method and a one-sample method based on linkage disequilibrium (LD). We used simulated data representing a wide range of population sizes, sample sizes and numbers of loci. Both methods usually detect a population decline only one generation after it occurs if *N*_{e} drops to less than approximately 100, and 40 microsatellite loci and 50 individuals are sampled. However, the LD method often outperformed the temporal method by allowing earlier detection of less severe population declines (*N*_{e} approximately 200). Power for early detection increased more rapidly with the number of individuals sampled than with the number of loci genotyped, primarily for the LD method. The number of samples available is therefore an important criterion when choosing between the LD and temporal methods. We provide guidelines regarding the design of studies targeted at monitoring for population declines. We also report that 40 single nucleotide polymorphism (SNP) markers give slightly lower precision than 10 microsatellite markers. Our results suggest that conservation management and monitoring strategies can reliably use genetic-based methods for early detection of population declines.

### Introduction

Managers of threatened populations face the challenge of early and reliable detection of population declines. Maintenance of large populations and associated genetic variation is important not only to avoid population extinction but also because loss of genetic variation affects the adaptation capability of a population. Timely detection of populations that have suffered a decline will allow for a broader and more efficient range of management actions (e.g. monitoring, transplanting, habitat restoration, disease control), which will reduce extinction risks.

The most widely used genetic method for short-term (contemporary) *N*_{e} estimation (Krimbas and Tsakas 1971; Nei and Tajima 1981; Pollak 1983) is based on obtaining two samples displaced over time (generations) and estimating the temporal variance in allele frequencies (*F*) between them. Luikart et al. (1999) demonstrated that the temporal method was far more powerful than tests for loss of alleles or heterozygosity for detecting population declines. However, little is known about the relative power of other *N*_{e} estimators for early detection of declines. Single-sample methods based on linkage disequilibrium (LD) have been proposed (Hill 1981; Waples 2006) and have been compared to the temporal method for equilibrium (i.e. stable population size) scenarios (Waples and Do 2010). Methods to estimate long-term effective size (Schug et al. 1997) are by definition not generally applicable to the problem of detecting a recent sudden change in effective size.

Here we evaluate and compare the power, precision and bias of both methods used to estimate *N*_{e} for early detection of population declines. We use simulated datasets from population declines with a wide range of bottleneck intensities, sample sizes and numbers of loci. We simulate both highly polymorphic loci (microsatellites) and biallelic loci (single nucleotide polymorphisms, SNPs). We also study, to a lesser extent, a more recent temporal method based on likelihood (Wang 2001; Wang and Whitlock 2003).

We address important questions posed by conservation biologists such as, ‘To establish a monitoring program, how many individuals and loci are needed to detect a decline to a certain *N*_{e}?’, ‘How many SNPs are required to achieve sensitivity equal to microsatellites to estimate *N*_{e} and detect declines?’, ‘How many generations after a population decline will a signal be detectable?’, ‘What is the probability of failing to detect a decline (type II error)?’.

### Methods

We conducted simulations using the forward-time, individual-based simulator simuPOP (Peng and Kimmel 2005). The default scenario was based on a constant-size population of *N* = 600, run until mean heterozygosity reached approximately 0.8 (10 generations) and then split into a number *n* of subpopulations (*n* = 1, 2, 3, 6, 12) without any migration. In practice this simulates a bottleneck (with the exception of *n* = 1). The average sex ratio was 1 with random mating, which approximates *N*_{c} = *N*_{e}. Each scenario was replicated 1000 times. For convenience, the census size before the bottleneck is called *N*_{1} and the census size after it is called *N*_{2}. Unless otherwise stated, when referring to equilibrium scenarios we mean a population of constant size (i.e. *N*_{1} = *N*_{2} above).
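The split design can be illustrated with a minimal Wright-Fisher sketch in plain Python. This is not the simuPOP code used for the study: the function names (`drift_one_generation`, `split_population`), the single-locus genome, the equal starting frequencies and the seed are all assumptions made for illustration.

```python
import random

def drift_one_generation(freqs, n_individuals, rng):
    """Resample 2N gene copies from the parental allele frequencies (diploid, random mating)."""
    copies = 2 * n_individuals
    alleles = list(range(len(freqs)))
    draws = rng.choices(alleles, weights=freqs, k=copies)
    return [draws.count(a) / copies for a in alleles]

def split_population(n_total, n_subpops):
    """Size of each isolated subpopulation after an even split (this plays the role of N2)."""
    return n_total // n_subpops

rng = random.Random(42)   # fixed seed so the sketch is repeatable
n1 = 600                  # prebottleneck size from the default scenario
freqs = [0.1] * 10        # hypothetical start: 10 equally frequent alleles

for n_subpops in (1, 2, 3, 6, 12):
    n2 = split_population(n1, n_subpops)
    after = drift_one_generation(freqs, n2, rng)
    print(n_subpops, n2, [round(f, 3) for f in after])
```

Splitting 600 individuals into 1, 2, 3, 6 or 12 isolated groups gives postsplit sizes of 600, 300, 200, 100 and 50, which is why the split acts as a bottleneck for every *n* except 1.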

The simulated genome includes 100 neutral, independent microsatellites initialized with a Dirichlet distribution (10 alleles, exhibiting a mean of eight at the generation before the bottleneck) and no mutation.

We also compared and evaluated both methods according to:

1. Sensitivity to mutation rate. We used the K-allele model (Crow and Kimura 1970) with 10 alleles and a relatively high mutation rate of 10^{−3}, typical of some microsatellites (Ellegren 2004).

2. Usage of SNPs. We conducted simulations using genomes with 100 physically unlinked SNPs initiated from a uniform distribution.

3. Sensitivity to initial population size. We used different initial population sizes (2400, 1200, 600, 300), all bottlenecking to an *N*_{2} of 50.

4. Benefits of using additional loci versus additional samples. While for equilibrium scenarios adding more loci is roughly equal to adding an equal proportion of individuals sampled (Waples 1989; Waples and Do 2010), we investigated whether this symmetry holds under a population decline. We constructed a scenario with *N*_{1} = 300 and *N*_{2} = 50 and used different sampling strategies: 50 loci with 10 individuals and 10 loci with 50 individuals.

The simulation application saves for analysis all individuals in the generation immediately before the bottleneck, along with those 1, 2, 3, 4, 5, 10 and 20 generations afterwards. Each replicate is then sampled to study the effect of the number of individuals and loci sampled. For *N*_{e} estimation we study only a single subpopulation after each bottleneck to ensure independence of all estimated values among replicates. We use 10, 20 and 40 loci (and 100 for SNPs) and 25 and 50 individuals. For each simulation replicate the following statistics are computed under different sampling conditions using Genepop (Rousset 2008) through Biopython (Cock et al. 2009): *F*_{st} (Weir and Cockerham 1984), expected heterozygosity, and allelic richness.

To study the LD method, each simulation replicate was analyzed with the LDNe application (Waples and Do 2008), which implements the bias correction (Waples and Gaggiotti 2006) to the original LD method (Hill 1981). Point estimates and 95% confidence intervals (parametric) are stored using only alleles with a frequency of 2% or more, a threshold reported to provide an acceptable balance between precision and bias (Waples and Do 2010) for the sampling strategies tested.

For the temporal method we implemented the *N*_{e} estimator from Waples (1989) based on Nei and Tajima (1981):

$$\hat{N}_e = \frac{t}{2\left[\hat{F}_k - \dfrac{1}{2S_0} - \dfrac{1}{2S_t}\right]} \tag{1}$$

where *t* is the number of generations between samples, *S*_{0} is the sample size at the reference, prebottleneck point and *S*_{t} is the sample size at the postbottleneck generation being considered. The *F*_{k} estimator was implemented for each locus (*l*) as (Krimbas and Tsakas 1971; Pollak 1983):

$$\hat{F}_{k,l} = \frac{1}{K-1}\sum_{i=1}^{K}\frac{(f_{ri}-f_{ti})^{2}}{(f_{ri}+f_{ti})/2} \tag{2}$$

where *K* is the number of alleles at the current locus, *f*_{ri} is the frequency of allele *i* at the reference time and *f*_{ti} is the frequency of allele *i* at the current time. The generation before the bottleneck is used as the reference point to which all the other postbottleneck samples are compared. The *F*_{k} value used in the *N*_{e} estimator is the arithmetic mean of all per-locus *F*_{k} estimates, weighted by the number of alleles at each locus.
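The temporal estimator described above can be sketched in a few lines of Python. The sketch assumes the standard moment forms, that is, a per-locus *F*_{k} equal to the sum of squared frequency changes divided by the mean frequency and scaled by 1/(*K* − 1), and an *N*_{e} estimate of *t* / (2[*F*_{k} − 1/(2*S*_{0}) − 1/(2*S*_{t})]). The function names and the handling of non-positive denominators (returning infinity) are illustrative choices, not taken from the original implementation.

```python
def fk_locus(ref_freqs, cur_freqs):
    """F_k for one locus from allele frequencies at the reference and current time."""
    k = len(ref_freqs)
    total = 0.0
    for f_r, f_t in zip(ref_freqs, cur_freqs):
        mean = (f_r + f_t) / 2
        if mean > 0:  # alleles absent from both samples contribute nothing
            total += (f_r - f_t) ** 2 / mean
    return total / (k - 1)

def fk_weighted(loci):
    """Mean F_k across loci, weighted by the number of alleles at each locus.

    `loci` is a list of (ref_freqs, cur_freqs) pairs, one pair per locus.
    """
    weights = [len(ref) for ref, _ in loci]
    values = [fk_locus(ref, cur) for ref, cur in loci]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def temporal_ne(fk, t, s0, st):
    """Moment estimate of Ne; infinity when drift cannot be separated from sampling noise."""
    denom = 2 * (fk - 1 / (2 * s0) - 1 / (2 * st))
    return t / denom if denom > 0 else float("inf")

# Hypothetical two-locus example, sampled one generation apart with 50 individuals each time.
loci = [([0.5, 0.5], [0.3, 0.7]), ([0.2, 0.3, 0.5], [0.1, 0.4, 0.5])]
print(temporal_ne(fk_weighted(loci), t=1, s0=50, st=50))
```

Returning infinity when the sampling correction exceeds the observed *F*_{k} is one common convention; it keeps such replicates in the analysis as "no detectable drift" rather than discarding them.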

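Confidence intervals on the *F*_{k} estimate, which can be used to calculate the CI of the *N*_{e} estimate, were computed as follows (Waples 1989; Sokal and Rohlf 1995; Luikart et al. 1999):

$$\left[\frac{n'\hat{F}_k}{\chi^{2}_{0.975[n']}},\ \frac{n'\hat{F}_k}{\chi^{2}_{0.025[n']}}\right] \tag{3}$$

where $\chi^{2}_{p[n']}$ is the *p* quantile of the chi-square distribution with *n*′ degrees of freedom and *n*′ is the number of independent alleles given by:

$$n' = \sum_{i=1}^{L}(K_i - 1) \tag{4}$$

where *K*_{i} is the number of alleles at locus *i* and *L* is the number of loci.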
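As a rough illustration of the interval, the sketch below assumes the usual chi-square form, in which *n*′ times the *F* estimate divided by the true *F* follows a chi-square distribution with *n*′ degrees of freedom. To stay within the Python standard library it substitutes the Wilson-Hilferty approximation for exact chi-square quantiles; that approximation, and all function names, are assumptions of this sketch rather than part of the published method.

```python
from statistics import NormalDist

def independent_alleles(alleles_per_locus):
    """n' = sum over loci of (K_i - 1)."""
    return sum(k - 1 for k in alleles_per_locus)

def chi2_quantile(p, df):
    """Wilson-Hilferty approximation to the chi-square quantile (approximate, not exact)."""
    z = NormalDist().inv_cdf(p)
    return df * (1 - 2 / (9 * df) + z * (2 / (9 * df)) ** 0.5) ** 3

def fk_confidence_interval(fk, alleles_per_locus, level=0.95):
    """Approximate (lower, upper) confidence limits for F under the chi-square assumption."""
    n_prime = independent_alleles(alleles_per_locus)
    alpha = 1 - level
    lower = n_prime * fk / chi2_quantile(1 - alpha / 2, n_prime)
    upper = n_prime * fk / chi2_quantile(alpha / 2, n_prime)
    return lower, upper

# Hypothetical case: 10 loci with six alleles each and an F estimate of 0.05.
low, high = fk_confidence_interval(0.05, [6] * 10)
print(low, high)
```

Because *N*_{e} decreases as *F* increases, the lower limit on *F* maps to the upper limit on *N*_{e} and vice versa when the limits are passed through the *N*_{e} estimator.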

We also studied a more recent version of a temporal based method, MLNE (Wang 2001; Wang and Whitlock 2003) which is based on likelihood estimation of effective population size. The number of cases studied was limited to only two bottleneck scenarios as the computational cost makes an exhaustive evaluation expensive.

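The coefficient of variation (CV) is commonly used as a measure of precision, and it is useful for comparing results with theoretical expectations, as these expectations hold for equilibrium. The CV for the *N*_{e} estimate based on LD is (Hill 1981; Waples and Do 2010):

$$\mathrm{CV}(\hat{N}_e) \approx \sqrt{\frac{2}{n}}\left[1 + \frac{3N_e}{S}\right] \tag{5}$$

where *S* is the sample size and *n* is:

$$n = \sum_{i=1}^{L-1}\sum_{j=i+1}^{L}(K_i - 1)(K_j - 1) \tag{6}$$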
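The theoretical CV of the LD estimator can be evaluated numerically. The sketch below assumes the equilibrium approximation CV ≈ √(2/*n*) × (1 + 3*N*_{e}/*S*), with *n* summed over all pairs of loci as (*K*_{i} − 1)(*K*_{j} − 1); the function names and the example values are hypothetical.

```python
from itertools import combinations
from math import sqrt

def ld_comparisons(alleles_per_locus):
    """n: sum over all locus pairs of (K_i - 1) * (K_j - 1)."""
    return sum((ki - 1) * (kj - 1) for ki, kj in combinations(alleles_per_locus, 2))

def cv_ld(ne, sample_size, alleles_per_locus):
    """Approximate CV of the LD-based Ne estimate under equilibrium assumptions."""
    n = ld_comparisons(alleles_per_locus)
    return sqrt(2 / n) * (1 + 3 * ne / sample_size)

# Hypothetical design: 10 loci with six alleles each, Ne = 50.
for s in (25, 50):
    print(s, round(cv_ld(50, s, [6] * 10), 3))
```

Because *n* grows with the number of locus pairs while the sample size *S* enters through the ratio *N*_{e}/*S*, the two design choices affect the expected precision in different ways, which is the motivation for testing them separately under a decline.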

The CV provides a theoretical insight on other potential sources of lack of precision of the estimator: number of alleles and sample size are also expected to influence the precision of the estimator and most previous simulation studies of equilibrium report behaviours in line with theory. It is therefore important to investigate if qualitative and quantitative results hold for bottleneck cases.

The CV of the temporal estimator was presented in Pollak (1983):

$$\mathrm{CV}(\hat{N}_e) \approx \sqrt{\frac{2}{n'}}\left[1 + \frac{2N_e}{tS}\right] \tag{7}$$

where *t* is the number of generations between samples and *S* is the sample size. The temporal estimator therefore has another expected source of imprecision: the temporal distance between samples.
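The dependence of the temporal estimator's precision on the sampling interval can be made concrete with a short sketch. It assumes the approximation CV ≈ √(2/*n*′) × (1 + 2*N*_{e}/(*tS*)), with *n*′ the number of independent alleles; the function name and the example design are hypothetical.

```python
from math import sqrt

def cv_temporal(ne, sample_size, t, alleles_per_locus):
    """Approximate CV of the temporal Ne estimate under equilibrium assumptions."""
    n_prime = sum(k - 1 for k in alleles_per_locus)  # number of independent alleles
    return sqrt(2 / n_prime) * (1 + 2 * ne / (t * sample_size))

# Hypothetical design: 10 loci with six alleles each, Ne = 50, 50 individuals sampled.
for t in (1, 2, 5, 10):
    print(t, round(cv_temporal(50, 50, t, [6] * 10), 3))
```

Under this approximation the expected CV shrinks as *t* grows, so samples taken only one generation apart are the least favourable case for the temporal method.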

We evaluated performance of both methods from three different perspectives:

1. *Detection* of a decline from the prebottleneck effective population size, for example detecting whether the *N*_{e} point estimate is below 0.8 × *N*_{1}. This is similar to bottleneck tests (e.g. Cornuet and Luikart 1996), as we are not concerned with the ability to approximate *N*_{2}, only with detecting whether the population size decreased. The value chosen is arbitrary, but close to, and a function of, *N*_{1}.

2. *Approximation* of an effective population size that has declined closer to *N*_{2} than to *N*_{1}. Here we try to understand whether, in addition to detecting a decline, an estimator (point estimate) can approach the new effective size. For instance, if there is a bottleneck from *N*_{1} = 600 to *N*_{2} = 50, we want to study how often an estimator's point estimate is below 75, which is 50% above *N*_{2}. This quantifies the ability to detect a change in *N*_{e}, but will not distinguish between an unbiased estimate of *N*_{2} and a downward-biased one.

3. *Estimation* of *N*_{2} with low bias, high precision and reliable confidence intervals. Most studies of equilibrium scenarios (stable population size) address bias and precision and are thus most comparable with this third perspective [e.g. England et al. (2006) and Wang and Whitlock (2003)].

The three perspectives above are presented as they might be useful in different situations: a practical research question might need only to detect that a population is declining (detection perspective) or it might require that a certain conservation threshold (e.g. *N*_{e} < 100) has been passed (approximation perspective) or, still, a precise and unbiased estimation of population size (estimation perspective). The first two perspectives are not applicable in equilibrium settings, but provide insights needed for practical conservation applications.
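The detection and approximation criteria reduce to simple threshold tests on each replicate's point estimate, and power is the fraction of replicates that pass. A minimal sketch follows; the function names and the list of estimates are made up for illustration and are not simulation output.

```python
def detected(ne_hat, n1, fraction=0.8):
    """Detection perspective: point estimate below 0.8 x N1."""
    return ne_hat < fraction * n1

def approximated(ne_hat, n2, margin=1.5):
    """Approximation perspective: point estimate below 1.5 x N2."""
    return ne_hat < margin * n2

def power(estimates, criterion):
    """Fraction of replicate estimates that satisfy the criterion."""
    hits = sum(1 for e in estimates if criterion(e))
    return hits / len(estimates)

# Hypothetical point estimates from five replicates of an N1 = 600 to N2 = 50 bottleneck.
estimates = [45.0, 62.0, 130.0, 510.0, float("inf")]
detection_power = power(estimates, lambda e: detected(e, n1=600))
approximation_power = power(estimates, lambda e: approximated(e, n2=50))
print(detection_power, approximation_power)
```

Applying the same `power` function with the detection criterion to replicates from a scenario with no decline gives the false positive rate instead.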

Methods for detection of population decline are reliable only if, when there is no decline, the method does not erroneously suggest one (type I error). This is especially important with *N*_{e} estimators, as their variance is known to increase with increasing true *N*_{e}. We therefore also assess how often each estimator detects a decline when there is none (false positive rate).

When characterizing the distribution of *N*_{e} estimates across simulations, we mainly use box plots. Box plots show the median, the 25th and 75th percentiles, the lowest datum within 1.5 times the interquartile range of the lower quartile and the highest datum within 1.5 times the interquartile range of the upper quartile. Other measures, for example mean squared error, can be calculated from the Supporting Information (statistics from simulations).
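For readers who want to reproduce these summaries from the supplied statistics, the following sketch computes the box plot quantities with the Python standard library. It assumes the conventional 1.5 × interquartile range whisker rule and the "inclusive" quantile method; plotting packages may use slightly different quantile definitions, and the function name and input values are hypothetical.

```python
from statistics import quantiles

def box_plot_stats(values):
    """Median, quartiles and whisker ends using the 1.5 x IQR convention."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": min(inside),  # lowest datum inside the lower fence
        "upper_whisker": max(inside),  # highest datum inside the upper fence
    }

# Hypothetical estimates with one extreme value that falls outside the whiskers.
stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(stats)
```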

We supply, as Supporting Information, the distribution of *N*_{e} estimates (point, upper and lower CI) according to the boundaries specified in the perspectives above (i.e. the percentage of estimates that fall above *N*_{1}, 0.8*N*_{1}, 1.5*N*_{2}, 0.5*N*_{2} or below 0.5*N*_{2}) for all scenarios studied for the first five generations following the population decline. We also supply a set of standard population genetics statistics (*F*_{st}, expected heterozygosity and allelic richness) starting from the generation before the bottleneck up to 50 generations after. This material can be loaded in standard spreadsheet software for further analysis. Furthermore, we include an extensive number of charts covering all statistical estimators for all scenarios studied. Supporting Information is made available on http://popgen.eu/ms/ne.

### Results

With a fixed initial effective population size (*N*_{1}) of 600 and a population decline to an *N*_{2} of 50, we could detect a reduction of the *N*_{e} estimate from the original *N*_{1} (detection perspective) after only one generation in 80% or more of cases for each method when sampling just 25 individuals and 20 microsatellite loci. For an *N*_{2} of 100, the temporal method detected the decline only after a few generations or by using more samples or loci, while the LD-based method still detected the decline immediately with just 25 individuals and loci. If *N*_{2} drops only to 200, the LD method still has power above 80% with 20 loci and 50 individuals in the first generation after the decline. Generally, the ability to detect a decline decreases with higher *N*_{2} for both estimators, as expected from their CVs (Waples and Do 2010).

Both methods were able to approximate *N*_{2} (i.e. compute an estimation below 1.5*N*_{2}) at generation two with a severe bottleneck of *N*_{2} = 50 if 50 individuals were sampled. However, the temporal method never had power above 80% for less severe bottlenecks (*N*_{2} = 100) in the first two generations. The power to detect an *N*_{e} < 1.5*N*_{2} (approximation perspective) is presented in Fig. 1.

As theoretically expected, power for early detection of a decline increases if more individuals are used. However, the following deviations from expectations (Waples 1989; Waples and Do 2010) are observed and further investigated in the discussion:

1. For the temporal method and for an *N*_{2} of 200, power decreased slightly with more samples.

2. Increasing the number of individuals sampled is more beneficial for both methods than increasing the number of loci. This effect is more noticeable with the LD method.

For the estimation perspective (i.e. low bias and small confidence intervals; see Methods), our bias and precision analysis showed that the temporal method has lower precision and, with larger *N*_{2}, greater upward bias than the LD method. With a very low number of individuals, the LD method is biased upwards (consistent with England et al. In press) and less precise than the temporal method, in line with the effect presented above (Fig. 2).

MLNE did not perform better than the original moments-based temporal method. We used MLNE with two bottleneck scenarios (*N*_{2} of 50 and 200). With a sampling strategy using only two time points, MLNE never provided a reliable estimate, even for a large sample of 50 individuals and 40 loci. MLNE results were usable only with three samples in time, but estimates were generally above *N*_{2}, in concordance with Wang (2001), who also reports overestimation of *N*_{e} in nonequilibrium scenarios (further details and an estimation perspective with MLNE are presented in the Supporting Information).

To understand the relative benefit of increasing the number of loci versus increasing the sample size, we simulated bottlenecks with an *N*_{1} = 300 and an *N*_{2} = 50 using two radically different sampling strategies: one maximizing the number of individuals (i.e. using a sample size equal to *N*_{2}) but using only 10 loci, and another using 50 loci but only 10 individuals. The scenario with five times more individuals than loci gave higher precision for both methods. This effect was more pronounced with the LD method, as both bias and precision are affected during all of the initial five generations (Fig. 3). The temporal method is mainly affected in precision, and only in the two initial generations for the scenario studied.

We also studied the behaviour of confidence intervals for both estimators. The upper confidence limit of the temporal method is often far higher than the initial population size during the initial bottleneck generations in most scenarios. This effect rarely occurs with the LD method: only in the very first generation and for high values of *N*_{2} (Fig. 4).

The usefulness of any estimator to detect a decline can be jeopardized by false positives, that is, detection of a reduction in *N*_{e} when none occurred. We assessed the false positive rate for both estimators, that is, with a true *N*_{e} of 600 (Fig. 5). The lower quartile of estimates for the LD-based method was always above 400, whereas the lower quartile of point estimates for the temporal method approaches 200 when the sample size is only 25 individuals. For a sample size of 50, the LD method point estimates were normally above 500, whereas the temporal method point estimates were occasionally only approximately 100 even though the true *N*_{e} was 600.

We also studied how the prebottleneck size (*N*_{1}) affects the behaviour of the estimators. We simulated bottlenecks with different initial population sizes (initial *N*_{1} = 1200, 600, 400, 300, 200 and final *N*_{2} = 50, Supporting Information). The LD method was little influenced by initial size, but the accuracy and precision of the temporal method decreased as *N*_{1} decreased. This effect was mostly visible in the first generation after the decline and disappeared shortly after. This means that, in addition to the type I errors that make the methods less reliable at high *N*_{1}, the temporal method also has precision problems at lower *N*_{1}.

We also quantified precision for bi-allelic markers (i.e. SNPs). Using 10 and 40 microsatellites and 40 and 100 SNPs, a comparison among the distributions of the point estimates reveals results consistent with theoretical expectations (Fig. 6). As an example, 10 microsatellite loci gave slightly higher precision than 40 SNPs: as the median allelic richness for the microsatellite scenarios after the bottleneck is six (Supporting Information), the number of degrees of freedom (i.e. approximately the number of independent alleles) of the 40-SNP scenario is smaller (40) than that of the 10-microsatellite scenario (50). The bias with SNPs is slightly lower, probably because rare allele effects occurred less often with the bi-allelic markers we simulated. Type I errors also behave as expected: for the sampling strategies shown and with equilibrium scenarios, there is not enough precision to differentiate between a type I error and a real decline, again making type I errors a fundamental consideration.

We also quantified the influence of mutation rate on the ability to estimate *N*_{e}. The number of new mutations is negligible in small populations over 1–10 generations even with high mutation rates. As an example, for an *N*_{e} of 100 and a relatively high mutation rate of 0.001 the expected number of new mutations per generation per locus would be 0.2 (2 × *N*_{e}μ). Simulation results show negligible effect (Supporting Information).
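The arithmetic behind this expectation is easy to check for other population sizes. The sketch below simply evaluates 2 × *N*_{e}μ for a diploid population; the function name and the set of population sizes are chosen for illustration.

```python
def expected_new_mutations(ne, mutation_rate):
    """Expected new mutations per locus per generation in a diploid population: 2 * Ne * mu."""
    return 2 * ne * mutation_rate

# Relatively high microsatellite mutation rate of 0.001, as in the example in the text.
for ne in (50, 100, 200):
    print(ne, expected_new_mutations(ne, 1e-3))
```

For *N*_{e} = 100 this reproduces the value of 0.2 quoted above, so that over the handful of generations relevant to early detection only a few new mutant copies are expected at each locus.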

### Conclusion

Early detection of population declines is increasingly feasible with the use of genetic monitoring based on effective population size estimators. If the number of samples is sufficiently high, the LD-based method is arguably more powerful and better suited for monitoring to detect declines because it is less prone to type I errors, has tighter confidence intervals, and is more flexible with regard to the design of different sampling strategies. Nonetheless, it is important to further research the behaviour of both estimators under an even broader set of realistic scenarios, for example with age structure or migration, and to understand whether variations of the temporal method (Jorde and Ryman 1996) or LDNe allow for earlier and more precise estimation of effective population size in declining populations. Both methods, along with others (e.g. loss of alleles), should often be used together when monitoring in order to gain a better understanding of the causes, consequences and severity of population declines (Luikart et al. 1999).

As the precision of both estimators requires the true effective population size to be relatively small, their use is currently limited to scenarios in conservation biology and perhaps studies of the ecology and evolution in small populations. For instance, they cannot be used to conduct reliable genetic monitoring when the effective size remains larger than approximately 500–1000 unless perhaps hundreds of loci and individuals are sampled and/or improved estimators are developed.

Combining simulation evaluations of new statistical methods with increasing numbers of DNA markers makes management and genetic monitoring increasingly useful for early detection of population declines, even with noninvasive sampling of elusive or secretive species. These results are encouraging and contribute to the excitement and promise of using genetics in conservation and management.