Keywords:

  • effective number of breeders;
  • effective population size;
  • genetic estimate;
  • molecular coancestry;
  • single cohort sample

Abstract

The effective population size, Ne, is an important parameter in population genetics and conservation biology. It is, however, difficult to directly estimate Ne from demographic data in many wild species. Alternatively, the use of genetic data has received much attention in recent years. In the present study, I propose a new method for estimating the effective number of breeders, Neb, from a parameter of allele sharing (molecular coancestry) among sampled progeny. The bias and confidence interval of the new estimator are compared with those from a published method, i.e. the heterozygote-excess method, using computer simulation. Two population models are simulated: the noninbred population, which consists of noninbred and nonrelated parents, and the inbred population, which is composed of inbred and related parents. Both methods give essentially unbiased estimates of Neb when applied to the noninbred population. In the inbred population, the proposed method gives a downward-biased estimate, but the confidence interval is remarkably narrowed compared with that in the noninbred population. The estimate from the heterozygote-excess method is nearly unbiased in the inbred population, but suffers from a larger confidence interval. By combining the estimates from the two methods as a harmonic mean, the reliability is remarkably improved.


Introduction

The effective population size, Ne, is one of the most important parameters in population genetics and conservation biology, because this parameter determines both the amount of genetic drift and the rate of inbreeding (Crow and Kimura 1970; Falconer and Mackay 1996). Ne can be estimated from demographic data such as the number of parents and the variance in their progeny number (Caballero 1994). However, the demographic data needed to estimate Ne are often not available for wild species. As an alternative, methods for estimating Ne from genetic data have been developed (for reviews, see Waples 1991; Schwartz et al. 1999; Beaumont 2003; Leberg 2005; Wang 2005). These methods differ in the time scale on which Ne is measured. Some of them infer the long-term Ne in the past on an evolutionary time scale, and others estimate the current or short-term Ne (Waples 1991; Wang 2005). For solving practical issues such as managing a small population of an endangered species, an accurate estimate of the current or short-term Ne is of special importance, and it is the major concern of this study.

To date, three methods are available for this purpose: the temporal method (Nei and Tajima 1981; Pollak 1983; Waples 1989), the linkage disequilibrium method (Hill 1981) and the heterozygote-excess method (Pudovkin et al. 1996; Luikart and Cornuet 1999). These methods actually assess the effective number of breeders (Neb) of a cohort from which a sample is obtained. If the sample consists of reproductive adults, Neb is nearly equivalent to Ne in populations with nonoverlapping generations (Schwartz et al. 1999; and as will be discussed later). Ne can be estimated from Neb in populations with overlapping generations, if the age structure is known (Waples 1991).

The logic behind the temporal method is that the change of allele frequency in samples separated in time is a reflection of genetic drift. This method is the most tested of the genetic Neb estimators and has been used to estimate Neb of various species (Schwartz et al. 1999). The primary weakness of this method is that two or more samples separated in time are necessary (Schwartz et al. 1999). This can be expensive and, by nature, time-consuming. The linkage disequilibrium method is based on the fact that genetic drift generates nonrandom association among alleles at different loci. Despite the obvious advantage that this method can be used to estimate Neb from a single cohort sample, there are several drawbacks (Schwartz et al. 1999; Wang 2005). Perhaps the most critical one is that the estimator assumes an isolated equilibrium population with a constant effective size, which may not be tenable for natural populations of endangered species. The heterozygote-excess method is based on the fact that when the breeding population is small, binomial sampling error produces allele frequency differences between male and female breeders, resulting in an excess of heterozygotes in their progeny (Robertson 1965). As with the linkage disequilibrium method, this method has the advantage that only a single cohort sample is required. Further, this method is appealing because the estimate is easily computed. However, there are few applications of this method, presumably because of its low precision, as empirically shown by Luikart and Cornuet (1999).

Several authors (Waples 1991; Pudovkin et al. 1996; Luikart and Cornuet 1999) emphasized the importance of exploring a method that gives an estimate independent of those from existing methods, because combining several independent estimates is expected to improve on the precision of the separate estimates. In the present study, a novel method for estimating Neb from genetic data of a single cohort sample is proposed. The estimator is obtained from a simple parameter (molecular coancestry) of allele sharing among sampled individuals. The reliability of the new estimator is compared with that of the heterozygote-excess method using computer simulation. The improvement in reliability attained by combining the two methods is also examined.

Methods

Estimation of Neb from parent-based coancestry

Although a monoecious diploid population is assumed throughout the following derivation, the extension to dioecious diploid species is straightforward and the same estimation method is applicable to the population.

Let ft be the coancestry between two randomly sampled individuals in generation t, and P be the probability that two randomly sampled alleles, each from a different individual in generation t, come from the same individual in generation t − 1. The recurrence equation for the coancestry is given by

  $$f_t = \frac{P\,(1 + F_{t-1})}{2} + (1 - P)\,f_{t-1} \tag{1}$$

(Crow and Kimura 1970, p. 102), where Ft−1 is the inbreeding coefficient of individuals in generation t − 1. Following the definition by Crow and Kimura (1970, p. 347), we define the effective number of breeders (Neb), or strictly the inbreeding effective number, as

  $$N_{eb} = \frac{1}{P} \tag{2}$$

We set the base population for ft at generation t − 1 by assuming Ft−1 = ft−1 = 0. Putting t − 1 = 0 in (1), we obtain from (1) and (2) $f_1 = 1/(2N_{eb})$, and hence

  $$N_{eb} = \frac{1}{2 f_1} \tag{3}$$

This means that an estimate of Neb can be obtained if the parent-based coancestry (f1) among individuals in one cohort is estimated.
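As a small numerical check of this step, the following Python sketch iterates the coancestry recurrence once from the base population and converts the result back to Neb. It assumes the recurrence has the standard form f_t = P(1 + F_{t−1})/2 + (1 − P) f_{t−1} with Neb = 1/P, so that Neb = 1/(2 f1); the function names are illustrative only, and the treatment of a zero f1 as an infinite Neb is an added assumption.

```python
def next_coancestry(f_prev, F_prev, P):
    """One step of the assumed recurrence: f_t = P*(1 + F_{t-1})/2 + (1 - P)*f_{t-1}."""
    return P * (1.0 + F_prev) / 2.0 + (1.0 - P) * f_prev


def neb_from_f1(f1):
    """Neb = 1 / (2 * f1); a non-positive f1 is mapped to an infinite Neb (assumption)."""
    if f1 <= 0:
        return float("inf")
    return 1.0 / (2.0 * f1)


# Base population: F_0 = f_0 = 0, and P = 1/Neb with Neb = 10 breeders.
f1 = next_coancestry(f_prev=0.0, F_prev=0.0, P=1.0 / 10.0)
print(f1, neb_from_f1(f1))
```

With ten effective breeders the parent-based coancestry among progeny is small, which is why a precise estimate of f1 matters for the final estimate.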

Estimation of parent-based coancestry

Molecular coancestry

For locus l, molecular coancestry fM,xy,l (frequently called ‘molecular similarity index’) between individual x having alleles a and b and individual y having alleles c and d is defined as (Malécot 1948)

  $$f_{M,xy,l} = \frac{1}{4}\left(I_{ac} + I_{ad} + I_{bc} + I_{bd}\right) \tag{4}$$

where indicator Iac is one when allele a of individual x is identical to allele c of individual y, and zero otherwise, etc. When there are L marker loci, molecular coancestry fM,xy is the average molecular coancestry over all loci (Toro et al. 2002, 2003):

  $$f_{M,xy} = \frac{1}{L}\sum_{l=1}^{L} f_{M,xy,l}$$
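The definition can be illustrated with a short Python sketch. It assumes each individual is stored as a list of loci, each locus being a pair of allele labels; this data layout and the function names are illustrative choices, not part of the original method description.

```python
def locus_coancestry(geno_x, geno_y):
    """Molecular coancestry at one locus: mean of the four allele-identity indicators."""
    a, b = geno_x
    c, d = geno_y
    return ((a == c) + (a == d) + (b == c) + (b == d)) / 4.0


def molecular_coancestry(ind_x, ind_y):
    """Average of the per-locus molecular coancestry over all marker loci."""
    per_locus = [locus_coancestry(gx, gy) for gx, gy in zip(ind_x, ind_y)]
    return sum(per_locus) / len(per_locus)


# Two individuals typed at three loci (allele labels are arbitrary integers).
x = [(1, 2), (3, 3), (5, 6)]
y = [(1, 2), (3, 4), (7, 8)]
print(molecular_coancestry(x, y))
```

Two identical heterozygotes share two of the four allele comparisons, so their coancestry at that locus is 0.5 rather than 1.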

Molecular coancestry arises not only from alleles that are identical by descent but also from alleles that are alike in state (AIS). Molecular coancestry is, therefore, an upward-biased estimator of the coancestry relative to an arbitrary base population. When sl denotes the probability that two alleles at locus l are AIS in the base population, the expected molecular coancestry between individuals x and y at locus l is (Oliehoek et al. 2006)

  $$E\left[f_{M,xy,l}\right] = f_{xy} + (1 - f_{xy})\,s_l \tag{5}$$

where fxy is the coancestry between individuals x and y expressed relative to the base population.
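The way this relation is used later can be made explicit by solving it for fxy. The worked step below assumes that the expectation takes the form E[f_{M,xy,l}] = f_{xy} + (1 − f_{xy}) s_l; the numerical values are hypothetical and chosen only for illustration.

```latex
\begin{align*}
E[f_{M,xy,l}] &= f_{xy} + (1 - f_{xy})\,s_l \\
E[f_{M,xy,l}] - s_l &= f_{xy}\,(1 - s_l) \\
f_{xy} &= \frac{E[f_{M,xy,l}] - s_l}{1 - s_l} \\[4pt]
\text{e.g. } s_l = 0.3,\; E[f_{M,xy,l}] = 0.475
  &\;\Rightarrow\; f_{xy} = \frac{0.475 - 0.3}{0.7} = 0.25
\end{align*}
```

In this illustration a pair of noninbred full-sibs (fxy = 0.25) shows a molecular coancestry of 0.475 at a locus with sl = 0.3, so the AIS correction nearly halves the raw similarity.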

Equation (5) shows that a value for sl is needed for each locus to obtain fxy. If allele frequencies in the base population are known without error, sl is computed as $s_l = \sum_{i=1}^{n_l} p_i^2$, where nl is the number of alleles at locus l and pi the frequency of the ith allele at locus l in the base population. Because allele frequencies in the base population are, however, usually unknown, sl needs to be estimated. A similar problem arises in estimating any relatedness from molecular markers. In most published works (e.g. Ritland 1996; Lynch and Ritland 1999), allele frequencies have been estimated from the current population for which relatedness is estimated, meaning that the base population is set equal to the current population. For our purpose, this approximation leads to an apparent contradiction, because it implicitly assumes no drift in allele frequencies between parent and progeny generations (i.e. Neb = ∞).

Estimation of f1 from fM,xy

Irrespective of the upward bias, simulations suggest that molecular coancestry can be a good indicator of the coancestry relative to an arbitrary base population (e.g. Toro et al. 2003; Oliehoek et al. 2006). We take advantage of this property to convert the molecular coancestry to the parent-based coancestry (f1).

Suppose that n individuals are sampled from progeny in a given generation, for which f1 is estimated. We assume that the sample consists of at least two nonsib families. This assumption will be satisfied except for a population with an extremely small number of parents, such as a population with only one male parent in a polygynous species. Thus, for a given individual in the sample, at least one nonsib pair should be included among the n − 1 possible pairs with other sampled members. The underlying concept of our estimation is that the nonsib pairs can be inferred from molecular coancestry. Fernández and Toro (2006) showed that a sibship can be reconstructed from molecular coancestry with high accuracy, suggesting that the inference of nonsib pairs based on molecular coancestry has fairly high precision.

We assume that pairs inferred to be nonsibs (putative nonsibs) are true nonsibs (i.e. fxy = 0). Thus, substituting the average molecular coancestry at locus l over all pairs of putative nonsibs ($\bar{f}_{M,l}$) into (5) gives an estimate of sl:

  $$\hat{s}_l = \bar{f}_{M,l} \tag{6}$$

With the weight wl to optimize the contributions of loci to the estimate of coancestry, suggested by Oliehoek et al. (2006), the parent-based coancestry between individuals x and y, f1,xy, is estimated as

  • image

where

  • image

(Oliehoek et al. 2006)

  • image

and $\hat{p}_i$ is the estimated frequency of allele i at locus l from the sampled individuals. Note that the weight wl puts more weight on loci with small sl and with many alleles at nearly equal frequencies. The estimate of f1 is simply obtained by averaging $\hat{f}_{1,xy}$ over the $n(n-1)/2$ pairs:

  $$\hat{f}_1 = \frac{2}{n(n-1)} \sum_{x<y} \hat{f}_{1,xy}$$

And from (3), Neb is estimated by

  $$\hat{N}_{eb} = \frac{1}{2\hat{f}_1} \tag{7}$$

Selection method for putative nonsib pairs

The simplest method for selecting putative nonsibs from all the possible pairs is to select a given number (n0) of pairs with the smallest molecular coancestry. However, this method leads to an underestimation of sl, because of the positive correlation between fM,xy and fM,xy,l due to the finite number of marker loci (L). For example, in the extreme case where only one marker locus is available (L = 1), the selection of the smallest fM,xy automatically results in the selection of pairs with the smallest fM,xy,l. When the number of selected pairs (n0) is much smaller than the number of actually existing nonsib pairs, the average of fM,xy,l over the selected n0 pairs is expected to be lower than that of fM,xy,l over all the actually existing nonsib pairs, leading to an underestimation of sl [cf. equation (6)].

In a strictly statistical sense, the selection of putative nonsibs for the estimation of sl should be based on data independent of the sample from which sl is estimated. This problem could be largely solved by excluding the information on locus l in selecting putative nonsib pairs for the estimation of sl. Denoting the molecular coancestry between individuals x and y excluding the information on locus l by fM,xy,/l, we can compute it as

  $$f_{M,xy,/l} = \frac{L\,f_{M,xy} - f_{M,xy,l}}{L - 1} \tag{8}$$

For estimating sl, the selection of n0 pairs with the smallest coancestry is based on this partial molecular coancestry.

In the present study, the following selection method was applied: (i) Assign sequential numbers (i = 1, 2, …, n) to the n sampled individuals. (ii) For the first individual (i = 1), the pair with the smallest fM,xy,/l [computed from (8)] is selected from the n − 1 pairs with other members. (iii) For each subsequent individual (i ≥ 2), the pair with the smallest fM,xy,/l is selected in the same manner, but pairs already selected in previous steps are excluded from the n − 1 candidate pairs to avoid selecting the same pair twice. (iv) As a result, we obtain n0 (= n) pairs with the smallest fM,xy,/l. (v) fM,xy,l [computed from (4)] is averaged over the n0 pairs; this average is the estimate of sl [cf. equation (6)]. (vi) Steps (ii)–(v) are repeated until estimates of sl are obtained for all marker loci.
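The selection steps above, followed by the conversion to Neb, can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the author's implementation: genotypes are lists of allele pairs, ties in the partial coancestry are broken by pair order, loci are given equal weight instead of the weights wl of Oliehoek et al. (2006), loci with an estimated sl of 1 are skipped, and at least three individuals and two loci are required. All function names are hypothetical.

```python
from itertools import combinations


def locus_coancestry(gx, gy):
    """Molecular coancestry at one locus: mean of the four allele-identity indicators."""
    (a, b), (c, d) = gx, gy
    return ((a == c) + (a == d) + (b == c) + (b == d)) / 4.0


def estimate_neb(genotypes):
    """genotypes[i][l] is the (allele, allele) pair of individual i at locus l."""
    n, L = len(genotypes), len(genotypes[0])
    pairs = list(combinations(range(n), 2))
    f_loc = {p: [locus_coancestry(genotypes[p[0]][l], genotypes[p[1]][l])
                 for l in range(L)] for p in pairs}
    f_all = {p: sum(f_loc[p]) / L for p in pairs}

    s_hat = []
    for l in range(L):
        # Partial molecular coancestry that excludes the information on locus l.
        partial = {p: (L * f_all[p] - f_loc[p][l]) / (L - 1) for p in pairs}
        chosen = set()
        for i in range(n):
            # Steps (ii)-(iii): best not-yet-selected pair involving individual i.
            candidates = [p for p in pairs if i in p and p not in chosen]
            chosen.add(min(candidates, key=partial.get))
        # Step (v): average locus-l coancestry over the n0 = n selected pairs.
        s_hat.append(sum(f_loc[p][l] for p in chosen) / len(chosen))

    # Equal locus weights (simplifying assumption); skip loci with s_hat == 1.
    usable = [l for l in range(L) if s_hat[l] < 1.0]
    f1_pairs = [sum((f_loc[p][l] - s_hat[l]) / (1.0 - s_hat[l]) for l in usable) / len(usable)
                for p in pairs]
    f1 = sum(f1_pairs) / len(f1_pairs)
    neb = 1.0 / (2.0 * f1) if f1 > 0 else float("inf")
    return s_hat, f1, neb


# Toy data: two pairs of identical homozygotes at two loci.
toy = [[(1, 1), (1, 1)], [(1, 1), (1, 1)], [(2, 2), (2, 2)], [(2, 2), (2, 2)]]
print(estimate_neb(toy))
```

In the toy data every individual has only one close relative, yet the third individual is forced to select that relative because its other pairs were already taken, which shows why the procedure presupposes a sample containing enough nonsib pairs.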

Computer simulation

Computer simulation was carried out to evaluate the reliability of the presented method. Genotypes of individuals in the initial population were generated by assigning alleles randomly sampled from an infinite (conceptual) gene pool with a uniform allele frequency distribution, with two alleles for the ‘low-polymorphic’ marker loci case or 10 alleles for the ‘high-polymorphic’ marker loci case. The number of loci was 80 in both cases. Prior to progeny sampling for the estimation of Neb, eight generations of random mating under a breeding system defined below were simulated to accumulate inbreeding and relationship. Two breeding systems, monogamy and polygyny, were modeled. Under the monogamy model, an equal number of male and female parents (N/2) were randomly paired to form N/2 permanent couples. A progeny (a parent of the next generation) was produced from a randomly sampled couple, and the sampling of a couple and the reproduction were repeated until N/2 replacements of each sex had been obtained. Under the polygyny model, Nm males and Nf (>Nm) females were generated, and each female was mated with a randomly sampled male (thus, there are Nf fixed matings). A progeny was produced from a randomly sampled mating, and this was repeated to obtain Nm males and Nf females as the parents of the next generation. In the final generation, a sample of n progeny was obtained in the same manner of reproduction as in the respective breeding system. From the loci with at least two segregating alleles in the sampled progeny, L = 5–30 loci were randomly chosen as marker loci. For the standard parental population size, N = 10 in monogamy, and Nm = 5 males and Nf = 20 females in polygyny, were used. The sample size of progeny (n) in the final generation was 100 for both breeding systems.

In the low-polymorphic marker loci case, all the marker loci have exactly two alleles (nl = 2), as in single nucleotide polymorphisms, but the allele frequency distribution varies among the loci. In the high-polymorphic marker loci case, not only the allele frequency distribution but also the number of alleles varies among the loci. With the above standard population sizes, the average number of alleles per marker locus was 3.83 in monogamy and 5.31 in polygyny, which would be comparable with the allele number of microsatellite markers in a practical survey. This type of data generation is referred to as the ‘inbred population’ model, in the sense that the parental population of the sampled progeny consists of inbred and related individuals, which will be the general situation in populations of endangered species.
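A rough Python sketch of the monogamy version of this data generation is given below. It is an interpretation, not the original simulation program: alleles are assumed to be equally frequent in the gene pool, the sex of each replacement is fixed in advance rather than drawn at random until N/2 of each sex are obtained, and the choice of marker loci is omitted. Function names, parameter names and the seed are hypothetical.

```python
import random


def random_genotype(n_loci, n_alleles, rng):
    """Diploid genotype: one (allele, allele) pair per locus, alleles equally frequent."""
    return [(rng.randrange(n_alleles), rng.randrange(n_alleles)) for _ in range(n_loci)]


def make_offspring(mother, father, rng):
    """Mendelian transmission: one randomly chosen allele from each parent per locus."""
    return [(rng.choice(m), rng.choice(f)) for m, f in zip(mother, father)]


def form_couples(females, males, rng):
    """Pair females and males at random into permanent couples."""
    shuffled = males[:]
    rng.shuffle(shuffled)
    return list(zip(females, shuffled))


def progeny_from_couples(couples, count, rng):
    """Each progeny comes from a couple sampled at random with replacement."""
    progeny = []
    for _ in range(count):
        mother, father = rng.choice(couples)
        progeny.append(make_offspring(mother, father, rng))
    return progeny


def simulate_monogamy(N=10, n_loci=80, n_alleles=10, generations=8, n_sample=100, seed=1):
    rng = random.Random(seed)
    half = N // 2
    females = [random_genotype(n_loci, n_alleles, rng) for _ in range(half)]
    males = [random_genotype(n_loci, n_alleles, rng) for _ in range(half)]
    for _ in range(generations):
        couples = form_couples(females, males, rng)
        # Simplification: produce exactly N/2 replacements of each sex.
        females = progeny_from_couples(couples, half, rng)
        males = progeny_from_couples(couples, half, rng)
    # Final generation: sample n progeny from the couples of the current parents.
    return progeny_from_couples(form_couples(females, males, rng), n_sample, rng)


sample = simulate_monogamy()
print(len(sample), len(sample[0]))
```

The returned sample has the same layout as the genotype lists used for computing molecular coancestry, so loci that still segregate in the sample could be filtered and passed on to an estimator.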

As another type of data generation, the ‘noninbred population’ model was also simulated. The assignment of initial genotypes and the advancement of generations were exactly the same as in the inbred population, except that the number of generations was seven. At the final generation, the allele frequency distribution of each locus was recorded. Then, genotypes of parents were regenerated by assigning alleles randomly sampled from an infinite gene pool with the recorded allele frequency distribution. The sampling of progeny and the choice of marker loci were the same as in the inbred population. These procedures produce a parental population consisting of noninbred and nonrelated individuals but having the same quality of molecular information as in the corresponding inbred population. This type of data generation could be an approximation of a recently recolonized population in an ephemeral habitat.

In additional computations, different sizes of parental population and progeny sample were examined. The effect of unequal contributions of parents on the estimates was also evaluated under monogamy with N = 10, by considering the following two patterns of unequal contributions of N/2 = 5 couples: (0.4, 0.3, 0.1, 0.1, 0.1) and (0.6, 0.1, 0.1, 0.1, 0.1). The number of replicated runs for each combination of population model, breeding system and variables was 5000.

Demographic effective number of breeders (Neb,demo) under monogamy model was computed from the standard formula of the inbreeding effective size (Caballero 1994):

  • image(9)

where inline image and inline image are the mean and variance of the number of progeny of couples, respectively. The expression of inline image under the simulated condition is given in Appendix A. Neb,demo under polygyny is computed as

  • image(10)

The derivation of this equation is shown in Appendix B. Neb from pedigree coancestry was also computed, simply by substituting the average parent-based pedigree coancestry of sampled progeny into (7). The computed Neb agreed well with Neb,demo. Thus, only the value of Neb,demo is presented in the results, and it is referred to as the true value in the simulation. In addition to the estimate obtained from (7), the estimate from the heterozygote-excess method (Pudovkin et al. 1996) was computed for comparison. The locus-specific heterozygote-excess estimate is

  • image

where

  • image

and Hobs,i and Hexp,i are the observed and expected proportions of heterozygotes having allele i, respectively. The multilocus estimate was computed as the harmonic mean of the locus-specific estimates over the marker loci, following previous simulation studies (Pudovkin et al. 1996; Luikart and Cornuet 1999). In both methods, when a negative estimate was obtained, the estimate was regarded as infinite (∞).

As a criterion of evaluation, the harmonic mean of estimates over 5000 replicates was computed. Furthermore, to characterize the variation and distribution of estimates, 10th, 50th and 90th percentiles in replicates were calculated. The xth percentile was obtained as the 5000 × (x/100)th smallest estimate in 5000 replicated estimates.
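These summary statistics are simple to compute once infinite estimates are handled consistently. The Python sketch below is illustrative only: it treats an infinite estimate as contributing zero to the sum of reciprocals, takes the xth percentile as the (number of replicates × x/100)th smallest value, and rounds a fractional rank up, which is an added assumption that does not matter for 5000 replicates. The same harmonic mean is reused to combine two estimates; function names are hypothetical.

```python
import math


def harmonic_mean(estimates):
    """Harmonic mean of positive estimates; an infinite estimate adds 0 to the reciprocal sum."""
    total = sum(1.0 / e for e in estimates)
    return len(estimates) / total if total > 0 else math.inf


def percentile(estimates, x):
    """x-th percentile as the (len * x/100)-th smallest estimate (1-based rank)."""
    ordered = sorted(estimates)
    rank = max(1, math.ceil(len(ordered) * x / 100))
    return ordered[rank - 1]


def combined_estimate(neb_h, neb_f):
    """Harmonic mean of the heterozygote-excess and molecular coancestry estimates."""
    return harmonic_mean([neb_h, neb_f])


replicates = [8.0, 12.0, math.inf, 9.5, 10.0, 15.0, 7.0, 11.0, 30.0, 10.5]
print(harmonic_mean(replicates), [percentile(replicates, x) for x in (10, 50, 90)])
print(combined_estimate(math.inf, 10.0))
```

Because the reciprocal of an infinite estimate is zero, a single infinite replicate does not make the harmonic mean infinite, whereas it would dominate an arithmetic mean.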

Results and discussion

Left and middle panels in Fig. 1 (A: monogamy and B: polygyny) illustrate the 10th, 50th and 90th percentiles, and the harmonic mean, of 5000 replicated estimates of the effective number of breeders (Neb) from the heterozygote-excess and molecular coancestry methods applied to the noninbred population with L = 5–20 high-polymorphic marker loci. The three percentiles indicate that the distributions of estimates from both methods are skewed upward. The 50th percentile and harmonic mean were, however, close to Neb,demo (10 for monogamy and 13.79 for polygyny) in both methods. Under monogamy, the interval between the 10th and 90th percentiles of the heterozygote-excess estimate tended to be wider than that of the molecular coancestry estimate, whereas the reverse tendency was observed under polygyny.

Figure 1.   Harmonic mean (open circle), and 10th, 50th and 90th percentiles (bars) of 5000 estimated effective numbers of breeders in the noninbred population under (A) monogamy with N = 10 (half of each sex) parents and (B) polygyny with Nm = 5 male and Nf = 20 female parents, for the case of high-polymorphic marker loci. The sample size of progeny is n = 100. The left, middle and right panels show the estimate from the heterozygote-excess method (Pudovkin et al. 1996), the estimate from equation (7), and the harmonic mean of these two estimates, respectively. The value at the top of each graph is the clipped 90th percentile, and the value in parentheses is the percentage of replicates with an infinite estimate. The dashed line shows the effective number of breeders expected from demographic parameters (Neb,demo = 10 under monogamy and 13.79 under polygyny).

The corresponding simulation results in the inbred population are shown in Fig. 2. Although the 50th percentile and harmonic mean show that the heterozygote-excess method gives an essentially unbiased estimate of Neb, the estimate from the molecular coancestry method tends to be biased downward. The degree of bias became larger as the number of marker loci increased. Inbreeding and relationship in the parental population had quite different impacts on the confidence interval in the two methods. The interval between the 10th and 90th percentiles of the heterozygote-excess estimate was widened in the inbred population, compared with that in the noninbred population (Fig. 1). The increase of the confidence interval was more remarkable under monogamy. In fact, the 90th percentile under monogamy was infinite even with L = 20 marker loci. In contrast, the interval of the molecular coancestry estimate was remarkably narrowed in the inbred population. For example, the 10th and 90th percentiles of the molecular coancestry estimate under monogamy with L = 20 marker loci were 3.75 and 12.93, respectively.

Figure 2.   Harmonic mean (open circle), and 10th, 50th and 90th percentiles (bars) of 5000 estimated effective numbers of breeders in the inbred population under (A) monogamy with N = 10 (half of each sex) parents and (B) polygyny with Nm = 5 male and Nf = 20 female parents, for the case of high-polymorphic marker loci. The sample size of progeny is n = 100. The left, middle and right panels show the estimate from the heterozygote-excess method (Pudovkin et al. 1996), the estimate from equation (7), and the harmonic mean of these two estimates, respectively. The value at the top of each graph is the clipped 90th percentile, and the value in parentheses is the percentage of replicates with an infinite estimate. The dashed line shows the effective number of breeders expected from demographic parameters (Neb,demo = 10 under monogamy and 13.79 under polygyny).

In a strict sense, the heterozygote-excess method is valid only when the progeny are produced by random union of gametes (Pudovkin et al. 1996; Luikart and Cornuet 1999). When the progeny are produced by individual-based pairwise matings such as monogamy and polygyny, the sample of progeny is family-structured. In such a sample, heterozygote deficiency generated by the interfamily Wahlund effect may mask the heterozygote excess, reducing the usefulness of the heterozygote-excess method (Luikart and Cornuet 1999). Using computer simulation, Luikart and Cornuet (1999) examined the effect of a family-structured sample on the reliability of the heterozygote-excess method. They found that the heterozygote-excess method gives an essentially unbiased estimate even with a family-structured sample. However, the existence of family structure in sampled progeny substantially increased the variance of estimates under monogamy. The simulation data of Luikart and Cornuet (1999) were generated in the same manner as the noninbred population of the present study. Thus, their sample of progeny contains only sib families. On the other hand, the sample of progeny from the inbred population consists of families with various degrees of relationship (e.g. cousins). The increased confidence interval observed in Fig. 2 indicates that the application of the heterozygote-excess method to such a sample reduces the reliability, although the method still gives an unbiased estimate. The reduction of reliability will be more serious under monogamy (Fig. 2).

As detailed information on the estimation process in the molecular coancestry method, Table 1 gives the observed and estimated [from equation (6)] AIS probability (sl) in the parental population, and the average estimated parent-based coancestry among actual nonsibs (NS), actual half-sibs (HS), actual full-sibs (FS) and all pairs of sampled progeny, for the case of monogamy and polygyny with L = 15 high-polymorphic marker loci. All the values are shown as the average over 5000 replicates (and over 15 marker loci for sl). In the noninbred population, the estimated AIS probability was close to the observed value, giving average estimates of the parent-based coancestries in the three categories (NS, HS and FS) close to the pedigree coancestries, i.e. 0, 0.125 and 0.25 for NS, HS and FS, respectively. Thus, the molecular coancestry method gives an essentially unbiased estimate of Neb for the noninbred population (Fig. 1). However, the process of selecting putative nonsibs in the molecular coancestry method causes a problem when applied to the inbred population. The selection method may select the actual nonsibs with a reasonably high probability. But the putative nonsibs selected from the inbred population may be less related, with regard to more distant ancestral relationships, than the average nonsibs among the sampled progeny. As seen from Table 1, this causes an underestimation of the AIS probability, implying that the base population for coancestry is set at a generation further back than the parental generation. This overrun in setting the base population results in an overestimation of the parent-based coancestry, leading to a downward bias of the molecular coancestry estimate as observed in Fig. 2. Despite this drawback, the narrow confidence interval of the molecular coancestry estimate in the inbred population is attractive in practical use. Although the molecular coancestry method will be less useful for a point estimate of Neb in inbred populations, it will be useful for detecting a small Neb.

Table 1.   Observed and estimated AIS probability, and estimated parent-based coancestries among actual nonsibs (NS), actual half-sibs (HS), actual full-sibs (FS) and all pairs of sampled progeny from the noninbred and inbred parental populations under monogamy with N = 10 parents or polygyny with Nm = 5 male and Nf = 20 female parents, for the case of L = 15 high-polymorphic marker loci and the sample size of n = 100.

Breeding system  Population  AIS probability       Estimated parent-based coancestry among
                             Observed  Estimated   Actual NS  Actual HS  Actual FS  All pairs
Monogamy         Noninbred   0.3587    0.3571      0.0045     -          0.2552     0.0546
                 Inbred      0.3565    0.3366      0.0346     -          0.2651     0.0806
Polygyny         Noninbred   0.2967    0.2972      0.0008     0.1259     0.2503     0.0370
                 Inbred      0.2981    0.2830      0.0237     0.1418     0.2592     0.0579

  1. The AIS probability is the average over 5000 replicates and 15 marker loci, and the coancestry is the average over 5000 replicates.

The simulation results for the estimation with the low-polymorphic marker loci are shown in the left and middle panels of Fig. 3(A) for the noninbred and Fig. 3(B) for the inbred population in monogamy. Results in polygyny (data not shown) were essentially similar to those in monogamy. As seen from the 10th and 90th percentiles of the heterozygote-excess estimate, the heterozygote-excess method suffers from a larger confidence interval. In fact, even with L = 30 marker loci, the 90th percentile of the heterozygote-excess estimate was still infinite in both noninbred and inbred populations. In contrast, the molecular coancestry method gave an estimate with a practically acceptable confidence interval when L = 30 marker loci were available.

Figure 3.   Harmonic mean (open circle), and 10th, 50th and 90th percentiles (bars) of 5000 estimated effective numbers of breeders in the (A) noninbred and (B) inbred populations under monogamy with N = 10 (half of each sex) parents, for the case of low-polymorphic marker loci. The sample size of progeny is n = 100. The left, middle and right panels show the estimate from the heterozygote-excess method (Pudovkin et al. 1996), the estimate from equation (7), and the harmonic mean of these two estimates, respectively. The value at the top of each graph is the clipped 90th percentile, and the value in parentheses is the percentage of replicates with an infinite estimate. The dashed line shows the effective number of breeders expected from demographic parameters (Neb,demo = 10).

Table 2 shows the results from simulation runs with additional combinations of the number of parents and sample size, for the case of L = 15 high-polymorphic marker loci. As the harmonic mean of replicated estimates agreed well with the 50th percentile, it is not shown in the table. The general properties of the estimates, e.g. a small bias of estimation from both methods in the noninbred population and a downward bias of the molecular coancestry estimate in the inbred population, were similar to those observed in Figs 1–3. A remarkable point in Table 2 is the narrower confidence interval of the molecular coancestry estimate in a small sample of progeny from a small inbred population. For example, under monogamy with N = 10 parents, the 90th percentile of the molecular coancestry estimate from n = 10 progeny was 38.2, while the corresponding percentile of the heterozygote-excess estimate was infinite. In most practical situations in conservation biology, the population in question will be small and inbred, and may suffer from a low reproductive ability. The molecular coancestry method could significantly contribute to the detection of a small Neb in such populations. The magnitude of the downward bias of the molecular coancestry estimate increased in a larger inbred population, as seen from the 50th percentiles in monogamy with N = 50 and polygyny with Nm = 20 and Nf = 80, which may limit the usefulness of the molecular coancestry method. However, even in these populations, the narrow confidence interval of the molecular coancestry estimate would be of practical significance for obtaining a conservative estimate of Neb.

Table 2.   Percentiles (10th, 50th and 90th) of estimated effective number of breeders for 5000 replicated simulation runs in the noninbred and inbred populations with several additional combinations of the number of parents and sample size.

| Population | Breeding system | N or Nm:Nf | Neb,demo | n | HE 10th | HE 50th | HE 90th | MC 10th | MC 50th | MC 90th | HM 10th | HM 50th | HM 90th |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Noninbred | Monogamy | 10 | 10 | 10 | 4.84 | 11.99 | ∞ (23.2) | 4.10 | 8.27 | ∞ (10.3) | 5.39 | 9.42 | 27.01 (2.1) |
| Noninbred | Monogamy | 10 | 10 | 20 | 5.24 | 11.01 | ∞ (16.7) | 4.48 | 8.81 | 114.5 (8.5) | 5.90 | 9.57 | 24.42 (1.2) |
| Noninbred | Monogamy | 50 | 50 | 50 | 19.73 | 55.33 | ∞ (26.5) | 17.0 | 45.80 | ∞ (23.1) | 22.58 | 44.75 | 285.37 (6.3) |
| Noninbred | Polygyny | 5:20 | 13.79 | 20 | 7.63 | 16.18 | ∞ (14.4) | 6.11 | 12.42 | ∞ (12.0) | 8.80 | 13.81 | 38.51 (1.7) |
| Noninbred | Polygyny | 5:20 | 13.79 | 50 | 8.73 | 15.17 | 73.97 (5.8) | 7.06 | 13.57 | 85.49 (6.7) | 9.09 | 14.15 | 30.01 (0.5) |
| Noninbred | Polygyny | 20:80 | 53.78 | 100 | 25.28 | 59.03 | ∞ (17.6) | 21.62 | 50.24 | ∞ (18.2) | 28.10 | 52.03 | 203.54 (3.0) |
| Inbred | Monogamy | 10 | 10 | 10 | 4.46 | 12.18 | ∞ (26.5) | 3.43 | 6.70 | 38.20 (5.7) | 4.90 | 8.03 | 18.09 (0.9) |
| Inbred | Monogamy | 10 | 10 | 20 | 4.81 | 10.99 | ∞ (22.8) | 3.51 | 6.60 | 22.29 (3.6) | 5.08 | 7.85 | 16.58 (0.3) |
| Inbred | Monogamy | 50 | 50 | 50 | 17.50 | 50.37 | ∞ (23.4) | 11.58 | 20.30 | 85.59 (4.7) | 16.58 | 27.83 | 69.50 (1.0) |
| Inbred | Polygyny | 5:20 | 13.79 | 20 | 7.52 | 16.19 | ∞ (17.6) | 5.00 | 9.31 | 41.06 (4.8) | 7.26 | 11.45 | 25.37 (0.6) |
| Inbred | Polygyny | 5:20 | 13.79 | 50 | 8.47 | 15.85 | ∞ (10.0) | 5.31 | 8.85 | 21.79 (1.6) | 7.71 | 11.33 | 19.90 (0) |
| Inbred | Polygyny | 20:80 | 53.78 | 100 | 23.61 | 57.84 | ∞ (19.7) | 15.01 | 24.62 | 73.89 (2.6) | 21.44 | 33.73 | 72.07 (0.4) |

  1. Fifteen high-polymorphic marker loci were assumed.

  2. N, the number of parents (half of each sex) in monogamy; Nm, the number of male parents; Nf, the number of female parents in polygyny; Neb,demo, effective number of breeders expected from demographic parameters; n, the number of sampled progeny; HE, estimated Neb from the heterozygote-excess method; MC, estimated Neb from the molecular coancestry method, equation (7); HM, harmonic mean of the HE and MC estimates.

  3. Figures in parentheses are the percentage of replicates with an infinite estimate.
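How percentile summaries of this kind can accommodate infinite estimates may be sketched as follows. The nearest-rank percentile rule, the function names and the replicate values are assumptions made for illustration; the paper does not state which percentile rule was used.

```python
import math

def percentile_nearest_rank(values, q):
    """q-th percentile (integer q, 1..100) by the nearest-rank rule; inf sorts last."""
    ordered = sorted(values)
    rank = (q * len(ordered) + 99) // 100      # integer ceiling of q * n / 100
    return ordered[max(rank, 1) - 1]

def summarize(estimates):
    """Return (10th, 50th, 90th percentile, percentage of infinite estimates)."""
    pct_inf = 100 * sum(math.isinf(x) for x in estimates) / len(estimates)
    p10, p50, p90 = (percentile_nearest_rank(estimates, q) for q in (10, 50, 90))
    return p10, p50, p90, pct_inf

# Hypothetical replicate estimates of Neb (not taken from the paper's simulations).
replicates = [4.0, 6.0, 7.0, 8.0, 9.0, 10.0, 12.0, 15.0, math.inf, math.inf]
print(summarize(replicates))
```

In this toy sample 20% of the replicates are infinite, so the 90th percentile is itself infinite. This mirrors the pattern in Table 2, where an infinite 90th percentile is accompanied by a parenthesized percentage of 10 or more.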

The effect of unequal contributions of parents on estimates of Neb is shown in Table 3, in which monogamy with N = 10 parents (half of each sex) and a sample size of n = 100 offspring was assumed. In all the cases computed, the 90th percentile from the molecular coancestry method was much smaller than that from the heterozygote-excess method. As unequal contribution of parents is an important factor making Ne smaller than the census number of breeders (Frankham 1995), the higher accuracy of the present method observed in Table 3 is of practical appeal.

Table 3.   Percentiles (10th, 50th and 90th) of estimated effective number of breeders for 5000 replicated simulation runs with unequal contribution of parents under monogamy in the noninbred and inbred populations with N = 10 parents (half of each sex) and a sample size of n = 100.

| Contribution | Neb,demo | Population | HE 10th | HE 50th | HE 90th | MC 10th | MC 50th | MC 90th | HM 10th | HM 50th | HM 90th |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.4, 0.3, 0.1, 0.1, 0.1 | 7.18 | Noninbred | 4.53 | 8.14 | 302.02 (9.3) | 3.59 | 6.91 | 18.55 (2.1) | 4.81 | 7.31 | 13.46 (0.2) |
| 0.4, 0.3, 0.1, 0.1, 0.1 | 7.18 | Inbred | 4.07 | 8.30 | ∞ (16.9) | 2.69 | 5.45 | 14.09 (1.1) | 4.09 | 6.31 | 10.95 (0) |
| 0.6, 0.1, 0.1, 0.1, 0.1 | 5.03 | Noninbred | 3.80 | 6.82 | 107.07 (8.8) | 2.26 | 4.74 | 13.90 (2.0) | 3.40 | 5.42 | 9.94 (0.1) |
| 0.6, 0.1, 0.1, 0.1, 0.1 | 5.03 | Inbred | 3.63 | 7.24 | ∞ (14.6) | 1.76 | 4.17 | 12.50 (1.6) | 2.96 | 5.02 | 8.90 (0.1) |

  1. Fifteen high-polymorphic marker loci were assumed.

  2. Contribution: expected contributions of the N/2 = 5 couples to the sample.

  3. Neb,demo, effective number of breeders expected from demographic parameters; HE, estimated Neb from the heterozygote-excess method; MC, estimated Neb from the molecular coancestry method, equation (7); HM, harmonic mean of the HE and MC estimates.

  4. Figures in parentheses are the percentage of replicates with an infinite estimate.
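The way unequal contributions reduce the expected Neb can be illustrated with a rough calculation. This is a minimal sketch, not equation (9): it assumes monogamous, noninbred and unrelated couples, so that two sampled progeny are full sibs (coancestry 1/4) with probability equal to the sum of squared contributions, and it uses the relation Neb = 1/(2 × average coancestry). The function name is hypothetical.

```python
def neb_from_contributions(contributions):
    """Approximate Neb for monogamous couples with expected contributions c_i."""
    total = sum(contributions)
    c = [x / total for x in contributions]   # normalise so the c_i sum to 1
    p_full_sib = sum(x * x for x in c)       # chance that two progeny share a couple
    mean_coancestry = 0.25 * p_full_sib      # full sibs of noninbred, unrelated parents
    return 1.0 / (2.0 * mean_coancestry)

print(round(neb_from_contributions([0.4, 0.3, 0.1, 0.1, 0.1]), 2))
print(round(neb_from_contributions([0.6, 0.1, 0.1, 0.1, 0.1]), 2))
print(round(neb_from_contributions([0.2] * 5), 2))
```

The approximation gives values close to, but not identical with, the Neb,demo values of 7.18 and 5.03 reported in Table 3, which were obtained from equation (9).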

Figure 4 shows the joint distribution of estimates from the heterozygote-excess and molecular coancestry methods applied to the inbred population under polygyny with Nm = 5 and Nf = 20 parents and 15 high-polymorphic marker loci. The product-moment and Spearman’s rank correlations, excluding pairs with an infinite estimate, were −0.003 and −0.164, respectively. Correlations of similar magnitude were obtained in all other cases simulated. An interesting point in Fig. 4 is that overestimation by the two methods tends to be mutually exclusive. At present, it is not theoretically obvious how to combine several estimates of Neb optimally to give a single best estimate (Wang 2005). As a tentative method, I combined the two estimates as the harmonic mean, following the suggestion of Waples (1991):

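  • $\hat{N}_{eb}^{(\mathrm{comb})} = \dfrac{2}{1/\hat{N}_{eb}^{(\mathrm{het})} + 1/\hat{N}_{eb}^{(\mathrm{coa})}}$

where the superscripts het, coa and comb denote the heterozygote-excess, molecular coancestry and combined estimates, respectively.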

Figure 4.   Joint distribution of estimates of the effective number of breeders from the heterozygote-excess and molecular coancestry methods in the inbred population under polygyny with Nm = 5 male and Nf = 20 female parents and a sample of n = 100 progeny. Estimates outside the graph were clipped.


The harmonic mean is expected to work well in the present case because overestimation by the two methods tends to be mutually exclusive; an overestimated Neb returned by one method is filtered out and the combined estimate is largely determined by the estimate from the other method. The properties of the combined estimate are shown in the right panels of Figs 1–3 and in the harmonic-mean columns of Tables 2 and 3. The combined estimate in the inbred population was biased downward because of the downward bias of the molecular coancestry estimate. However, as expected, the confidence interval of the estimate was substantially narrowed compared with those of the separate estimates. It is notable that the improvement is larger for lower marker quality, i.e. for a smaller number of marker loci and/or a smaller number of alleles at each locus (Figs 1–3), and for a smaller sample size (Table 2). Although the development of an optimal method for combining separate estimates into a single estimate deserves further investigation with sophisticated statistical tools, the above results strongly suggest that a highly reliable estimate can be obtained from an optimal combination.
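The filtering effect of the harmonic mean can be seen in a few lines of code. This is a minimal sketch assuming both inputs are positive or infinite; the function name and the numerical inputs are hypothetical and are not values from the simulations.

```python
import math

def combine_harmonic(neb_het, neb_coa):
    """Harmonic mean of two Neb estimates; an infinite estimate contributes 1/inf = 0."""
    inv_sum = 1.0 / neb_het + 1.0 / neb_coa
    return math.inf if inv_sum == 0 else 2.0 / inv_sum

print(combine_harmonic(12.0, 6.0))        # both estimates finite
print(combine_harmonic(math.inf, 8.0))    # one method returns an infinite estimate
print(combine_harmonic(math.inf, math.inf))
```

When one estimate is infinite, the combined value is bounded at twice the finite estimate, so a single extreme overestimate cannot dominate the result. The combined estimate is infinite only when both methods fail together.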

Some of the limitations of the method proposed in this study are shared by most of the published methods: marker alleles are assumed to be selectively neutral, mating within the population is assumed to be random and immigration from other populations is assumed to be absent (Leberg 2005). In addition, the present method involves a problem associated with age at sampling. Estimation of Ne from the recurrence equation (1) is based on the assumption that the average coancestries in two successive generations are measured at the same age stage. In fact, the application of the present method to a sample of juveniles gives an estimate of ‘the effective number of breeders’. But even in a population with nonoverlapping generations, the estimate can differ greatly from Ne, depending on the survival pattern of juveniles to adults. Following Crow and Morton (1955), we consider two extreme patterns of survival: (i) random survival and (ii) survival of the family as a unit. In the random survival model, survival from juvenile to adult is randomly determined with the expected survival rate s. Under this pattern of survival, the average coancestry among adults is expected to be unchanged from that among the juveniles. Thus, if the present method is applied to a population with nonoverlapping generations, the estimate of Neb is expected to approximate Ne. Under survival of the family as a unit, all juveniles in a family either survive or do not. With the average survival rate s in the population, the estimate of Neb obtained from a sample of juveniles is related to Ne through s (for the theoretical aspect of the above consideration, see Appendix C). Although this model describes an extreme pattern of survival, estimates of Neb for animals with low fecundity and high survival rate, such as mammals and birds in which parental care of the brood is generally observed, should be interpreted cautiously.
On the other hand, the estimate of Neb will give an appropriate estimate of Ne when the method is applied to animals with high fecundity and low survival rate, such as marine invertebrates and fishes, whose survival seems to be essentially random.
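The contrast between the two survival patterns can be illustrated with a small simulation. This is a sketch under assumptions that are not taken from the paper: 50 monogamous couples with 40 juveniles each, a survival rate of s = 0.5, and the fraction of same-family pairs used as a proxy for average coancestry (for noninbred, unrelated couples the average coancestry is one quarter of this fraction). All function names are hypothetical.

```python
import random

def sib_pair_fraction(family_sizes):
    """Fraction of distinct pairs of individuals that belong to the same family."""
    total = sum(family_sizes)
    same_family = sum(k * (k - 1) for k in family_sizes)
    return same_family / (total * (total - 1))

def random_survival(family_sizes, s, rng):
    """Each juvenile survives independently with probability s."""
    return [sum(rng.random() < s for _ in range(k)) for k in family_sizes]

def family_unit_survival(family_sizes, s, rng):
    """A whole family survives with probability s, otherwise it is lost."""
    return [k if rng.random() < s else 0 for k in family_sizes]

def mean_fraction(survival, family_sizes, s, replicates, seed):
    """Average same-family pair fraction among adults over replicate runs."""
    rng = random.Random(seed)
    values = []
    for _ in range(replicates):
        adults = survival(family_sizes, s, rng)
        if sum(adults) >= 2:              # need at least one pair of adults
            values.append(sib_pair_fraction(adults))
    return sum(values) / len(values)

juveniles = [40] * 50                     # hypothetical: 50 couples, 40 juveniles each
s = 0.5
base = sib_pair_fraction(juveniles)
ratio_random = mean_fraction(random_survival, juveniles, s, 200, seed=1) / base
ratio_family = mean_fraction(family_unit_survival, juveniles, s, 200, seed=1) / base
print(round(ratio_random, 2), round(ratio_family, 2))
```

Under random survival the ratio stays near 1, so the coancestry among adults matches that among juveniles. Under family-unit survival it rises to roughly 1/s, which is why an estimate from juveniles can overstate Ne in that case.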

The present method involves additional problems associated with the selection method for putative nonsibs. One is how to determine the number (n0) of pairs selected as putative nonsibs. Although the selection method applied in the present study automatically assigns the number (n) of sampled progeny to n0, this is an arbitrary choice. With a smaller n0, it is more likely that the selected pairs are actually nonsibs, but the coancestry among them will underestimate the AIS probability, and vice versa. Another problem is the drift-induced linkage disequilibrium among marker loci. In small populations, the drift-induced linkage disequilibrium may be an important factor (Hill 1981) and reduce the degree to which loci provide independent information about coancestry. This may reduce the effectiveness of the selection criterion for putative nonsibs defined by equation (8). One potential way of solving these problems and improving the estimates of Neb from molecular coancestry is the use of a sib-ship reconstruction technique. To date, several methods for sib-ship reconstruction from molecular markers have been developed using different algorithms, such as the Markov chain Monte Carlo (MCMC) algorithm (Almudevar and Field 1999; Thomas and Hill 2002; Wang 2004) and simulated annealing (Almudevar 2003; Fernández and Toro 2006), and have been reviewed by Blouin (2003) and Butler et al. (2004). Here I take the method proposed by Fernández and Toro (2006) as a trial example of the use of a sib-ship reconstruction technique for estimating Neb. With their method, we can find the sib-ships among sampled individuals that yield a parent-based coancestry matrix with the highest correlation with the molecular coancestry matrix. A notable feature of their method is that it is free from the assumption of linkage equilibrium among marker loci.
Two methods for the use of the reconstructed sib-ships were examined. In the first method (SR1), the reconstructed sib-ships were used directly in equation (7). In the second method (SR2), the average locus-specific coancestry among the inferred nonsib pairs was used for estimating sl as in equation (6). A simulation with 200 replicates was run for the case of polygyny in the inbred population with Nm = 5 and Nf = 20 parents, a sample of n = 100 progeny and 15 high-polymorphic marker loci. The results are summarized in Table 4. The two methods with sib-ship reconstruction worked quite well; they gave nearly unbiased estimates and narrower confidence intervals. Although further evaluations including other published methods for sib-ship reconstruction should be carried out under a wide range of scenarios, the results in Table 4 suggest the potential for improving the molecular coancestry method.

Table 4.   Harmonic mean and percentiles (10th, 50th and 90th) of two estimates (SR1 and SR2) of effective number of breeders from 200 replicated simulation runs with a combined use of the molecular coancestry method and a sib-ship reconstruction technique.

| Estimate | Harmonic mean | 10th | 50th | 90th |
|---|---|---|---|---|
| Heterozygote-excess | 16.11 | 9.10 | 16.41 | 111.56 (5.5) |
| Molecular coancestry | 8.07 | 5.32 | 8.14 | 16.33 (0.1) |
| SR1 | 14.39 | 10.74 | 15.07 | 18.54 (0) |
| SR2 | 12.84 | 9.66 | 13.38 | 17.67 (0) |

  1. The corresponding values from the heterozygote-excess and molecular coancestry methods are also presented. Polygyny with Nm = 5 male and Nf = 20 female parents in the inbred population with 15 high-polymorphic marker loci and a sample size of n = 100 was assumed. The effective number of breeders expected from demographic parameters is 13.79.

  2. Figures in parentheses are the percentage of replicates with an infinite estimate.
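How a reconstructed set of sib-ships might translate into an estimate of Neb can be sketched generically. This is not equation (7) and not the algorithm of Fernández and Toro (2006). It assumes that each sampled offspring has been assigned hypothetical (father, mother) labels, that the parents are noninbred and unrelated (which does not hold exactly in the inbred population), and that Neb = 1/(2 × average pedigree coancestry).

```python
from itertools import combinations

def pair_coancestry(a, b):
    """Pedigree coancestry of two offspring given (father, mother) labels."""
    shared = (a[0] == b[0]) + (a[1] == b[1])   # number of shared parents: 0, 1 or 2
    return {2: 0.25, 1: 0.125, 0: 0.0}[shared]

def neb_from_sibships(parents):
    """Neb from the average pairwise coancestry implied by inferred parentage."""
    pairs = list(combinations(parents, 2))
    mean_coancestry = sum(pair_coancestry(a, b) for a, b in pairs) / len(pairs)
    return float("inf") if mean_coancestry == 0 else 1.0 / (2.0 * mean_coancestry)

# Hypothetical reconstruction: two full sibs, one paternal half sib, one unrelated.
inferred = [("m1", "f1"), ("m1", "f1"), ("m1", "f2"), ("m2", "f3")]
print(neb_from_sibships(inferred))
```

A sample with no inferred sib pairs returns an infinite estimate, the same failure mode that appears as ∞ in the tables.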

Acknowledgments


I thank Troy Day and four anonymous referees for their helpful comments on the manuscript and Jesús Fernández for sending me Fortran code of his algorithm. This work was supported in part by grant-in-aid for scientific research (no. 19658104) from the Ministry of Education, Culture, Sports, Science and Technology of Japan.

Appendices


Appendix A – Expression of the variance of family size in equation (9)

In general, the variance of x can be written as

  • $V(x) = E_y[V(x \mid y)] + V_y[E(x \mid y)]$ (A1)

where $E(x \mid y)$ and $V(x \mid y)$ are the expectation and variance of x conditional on a given y, respectively (Mood et al. 1987, p. 159). We apply this formula to derive the expression of the variance of family size.

Let ci be the expected contribution of the ith couple to the cohort of offspring and ki the number of offspring of the ith couple in a sample of size n. Applying (A1), we obtain

  • image

where $\bar{c}$ is the mean of ci.

For example, in the simulation condition assumed in Figs 1–3 and Table 2, ci = 2/N for all i, giving

  • image

Substituting this variance and the corresponding mean of ki into (9) gives

  • $N_{eb,\mathrm{demo}} = N$

as expected.
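The step from (A1) to the variance of ki can be sketched as follows. The sketch assumes that, given ci, the number ki of sampled offspring from couple i is binomial with n trials and success probability ci; this is an assumption about the sampling scheme rather than a statement from the text. The symbols $\sigma_k^2$ and $\sigma_c^2$, for the variance of ki and the variance of ci over couples, are notation introduced here for illustration.

```latex
\begin{align*}
E(k_i \mid c_i) &= n c_i, \qquad V(k_i \mid c_i) = n c_i (1 - c_i), \\
\sigma_k^2 &= E\left[ n c_i (1 - c_i) \right] + V\left[ n c_i \right] \\
           &= n \bar{c} (1 - \bar{c}) - n \sigma_c^2 + n^2 \sigma_c^2 \\
           &= n \bar{c} (1 - \bar{c}) + n (n - 1) \sigma_c^2 .
\end{align*}
```

With equal contributions, $c_i = 2/N$ and $\sigma_c^2 = 0$, so under these assumptions the variance reduces to $(2n/N)(1 - 2/N)$ and the mean of ki is $2n/N$.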

Appendix B – Derivation of equation (10)

The effective size (Ne) of populations with unequal sex ratio and variation in mating success has been formulated in general by Nomura (2005). Consider a population with a polygynous (harem) breeding system with Nm male and Nf female parents, in which a male mates with several females and a female mates with only one male. Let dmi be the number of matings of male parent i (i = 1, ..., Nm), with mean $\mu_{dm}$ and variance $\sigma_{dm}^2$. Assuming a Poisson distribution of litter size (the number of newborns per mating), the equation given by Nomura (2005) reduces to

  • $N_e = \dfrac{4 N_m N_f}{N_m + N_f \left( 1 + CV_{dm}^2 \right)}$ (B1)

where $CV_{dm} = \sigma_{dm} / \mu_{dm}$ is the coefficient of variation of dmi. Under the condition of the present simulation, the number of matings (dmi) of male parents follows a binomial distribution with mean $\mu_{dm} = N_f / N_m$ and variance $\sigma_{dm}^2 = (N_f / N_m)(1 - 1/N_m)$, giving

  • $CV_{dm}^2 = \dfrac{N_m - 1}{N_f}$

Substituting this expression into (B1) leads to

  • $N_e = \dfrac{4 N_m N_f}{2 N_m + N_f - 1}$

Putting Neb,demo = Ne, we obtain equation (10).
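The derivation can be checked numerically against the Neb,demo values in Table 2. This is a sketch under two assumptions: each of the Nf females mates with one male chosen at random, so that dmi is binomial with Nf trials and probability 1/Nm, and (B1) has the form Ne = 4NmNf / (Nm + Nf(1 + CV²)). The function name is hypothetical.

```python
def neb_demo_polygyny(n_males, n_females):
    """Expected Neb under harem polygyny with random choice of a mate by each female."""
    mean_d = n_females / n_males                               # mean matings per male
    var_d = n_females * (1 / n_males) * (1 - 1 / n_males)      # binomial variance
    cv2 = var_d / mean_d ** 2                                  # squared CV of matings
    return 4 * n_males * n_females / (n_males + n_females * (1 + cv2))

print(round(neb_demo_polygyny(5, 20), 2))    # Table 2 reports 13.79 for Nm:Nf = 5:20
print(round(neb_demo_polygyny(20, 80), 2))   # Table 2 reports 53.78 for Nm:Nf = 20:80
```

Both values agree with Table 2 to two decimal places, which supports the assumed binomial mating scheme.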

Appendix C – Effect of age at sampling on relation between Ne and Neb

For simplicity, consider a population of monogamous species with an equal number (N/2 = Nm = Nf) of male and female parents. Generations are assumed to be discrete (nonoverlapping). Let kei be the number of offspring at the early age stage (juveniles) contributed by family (couple) i, and kai be the number of offspring at the later age stage (reproductive adults) contributed by family i. The average survival rate from juvenile to adult is s. According to the standard formula of effective population size (Caballero 1994), the effective number of breeders of juveniles Neb and the effective population size Ne (or equivalently the effective number of breeders of adults) are expressed as

  • image

and

  • image( (C1))

We consider two extreme survival models: (i) random survival and (ii) survival of the family as a unit. Although the mean family sizes satisfy μka = sμke in both models, the expression of the variance of kai, and consequently the relation between Neb and Ne, depends on the model of survival assumed, as shown below.

Random survival

Applying equation (A1) and noting inline image, we obtain an expression of inline image as

  • image( (C2))

Substituting (C2) into (C1) gives

  • image

Survival of the family as a unit

Under this model, the expression corresponding to (C2) is

  • image

Substituting this expression into (C1) leads to

  • image
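The variance of adult family size under the two survival models can be sketched from (A1). The sketch assumes that, under random survival, kai given kei is binomial with kei trials and success probability s, and that, under survival of the family as a unit, kai equals kei with probability s and 0 otherwise. The symbols $\mu_{ke}$, $\sigma_{ke}^2$ and $\sigma_{ka}^2$ for the mean and variances of family size are notation assumed here for illustration.

```latex
\begin{align*}
\text{Random survival:}\quad
  & E(k_{ai} \mid k_{ei}) = s k_{ei}, \quad V(k_{ai} \mid k_{ei}) = s(1-s) k_{ei}, \\
  & \sigma_{ka}^2 = s(1-s)\mu_{ke} + s^2 \sigma_{ke}^2 . \\
\text{Family as a unit:}\quad
  & E(k_{ai} \mid k_{ei}) = s k_{ei}, \quad V(k_{ai} \mid k_{ei}) = s(1-s) k_{ei}^2, \\
  & \sigma_{ka}^2 = s(1-s)\left( \sigma_{ke}^2 + \mu_{ke}^2 \right) + s^2 \sigma_{ke}^2
                  = s \sigma_{ke}^2 + s(1-s)\mu_{ke}^2 .
\end{align*}
```

In both models the mean adult family size is $s\mu_{ke}$. The additional term in $\mu_{ke}^2$ under family-unit survival inflates the variance of adult family size, which is the source of the discrepancy between Neb and Ne in that model.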