Nearly unbiased estimator of contemporary Ne/N based on kinship relationships

Abstract This study develops a nearly unbiased estimator of the ratio of the contemporary effective mother size to the census size in a population, as a proxy of the ratio of contemporary effective size (or effective breeding size) to census size (Ne/N or Nb/N). The proposed estimator is based on both known mother–offspring (MO) and maternal‐sibling (MS) relationships observed within the same cohort, in which sampled individuals in the cohort probably share MO relationships with sampled mothers. The rationale is that the frequency of MO and MS pairs contains information regarding the contemporary effective mother size and the (mature) census size, respectively. Therefore, the estimator can be obtained only from genetic data. We also evaluate the performance of the estimator by running an individual‐based model. The results of this study provide the following: (a) parameter range for satisfying the unbiasedness, and (b) guidance for sample sizes to ensure the required accuracy and precision, especially when the order of the ratio is available. Furthermore, the results demonstrate the usefulness of a sibship assignment method for genetic monitoring, providing insights for interpreting environmental and/or anthropological factors fluctuating Ne/N (or Nb/N), especially in the context of conservation biology and wildlife management.

N e /N fluctuates across years, the trends can clarify the interpretation of environmental and/or anthropological factors, causing the variance of reproductive potential, family-correlated survivorship, or fluctuating population dynamics. Besides, low precision and/or large bias for estimating N e /N may lead to a wrong interpretation of the population (Tallmon, Waples, Gregovich, & Schwartz, 2012).
The estimation of N e /N has been performed by utilizing the estimated values of contemporary effective population size (N e ) and census size (N), unless complete pedigree data and/or full census data are available. Additionally, there are numerous methods for estimating N e from genetic markers (Wang et al., 2016, and the ref-erences contained therein). There are also numerous methods for estimating N, such as a mark-recapture method or population dynamics modeling with survey data (e.g., Kéry & Schaub 2011;Methot & Wetzel, 2013;Quinn & Deriso, 1999;Seber, 2002). It is known that there are large variations in both estimators; thus, their combination (i.e., the estimator of N e /N) also shows large variation (Marandel et al., 2018;Palstra & Fraser, 2012). There is currently a little theoretical foundation for the estimator of N e /N, indicating no guidance for a sample size to ensure the required accuracy and precision.
Close-kin mark-recapture (CKMR) is a recently developed method for estimating N that utilizes the information about kinship in a sample.
This was possible owing to the recent advances in genetic methods for kinship determination (Bravington, Grewe, & Davies, 2016;Bravington, Skaug, & Anderson, 2016;Hillary et al., 2018;Skaug, 2017), although similar methods have been proposed in the beginning of the 21st century (Nielsen, Mattila, Clapham, & Palsbøll, 2001;Pearse, Eckerman, Janzen, & Avise, 2001;Skaug, 2001). Besides, the rationale is that the presence of a kinship pair in the sample is analogous to the recapture of a marked individual in mark-recapture. Kinship pairs in the sample are less likely to be observed in larger populations; thus, the number of kinship pairs may reflect N. While the original CKMR is designed to estimate adult abundance (i.e., N), the monitoring data for CKMR also produce the estimator of N e by detecting half-sibling (HS) pairs within the same cohort (Akita, 2020). This kinship-oriented estimation of N e was presented in the context of the sibship assignment method (Wang, 2009) and is expected to provide a much more accurate estimator as kinship determination becomes more accurate.
In this study, we propose a new method for estimating the ratio of contemporary effective mother size to the census size (N e,m ∕N m ) in a population, as a proxy of N e ∕N. Assuming that kinships are genetically detected without any error, this method is based on the numbers of maternal-sibling (MS) and mother-offspring (MO) pairs in a sample. Sampling is completed at a single breeding time; sampling offspring within the same cohort and mothers probably shares MO relationship with sampled offspring. Our model provides a nearly unbiased estimator of N e,m ∕N m that explicitly incorporates two types of overdispersed reproduction (i.e., parental and nonparental variations; Akita, 2020), making it possible to target a species that shows iteroparity (i.e., multiple reproductive cycles during the lifetime) and/or sweepstakes reproductive success. This estimator applying an iteroparous species corresponds to the estimator of the ratio of contemporary effective breeding mother size to the census size, N b,m ∕N m . First, we explain the modeling assumption and sampling scheme. Then, we analytically determine (nearly) the unbiased estimators of N e,m , 1∕N m , and N e,m ∕N m , which are based on the numbers of MS and/or MO pairs. Finally, by running an individual-based model, we investigate the performance of the estimator and provide a guide for a sample size. It is noteworthy that our modeling framework can be applied to diverse animal species. However, the description of the model focuses on fish species, which are presently the best candidate target of our proposed method.

| THEO RY
In this section, we present the theoretical foundation for estimating N e,m ∕N m . The estimator is based on previous studies that provide  (Akita, 2020) and 1∕N m (Akita, 2018) The main contribution of this paper is formulation of the estimator of N e,m ∕N m , presented in Equation 9. The main symbols used in the current paper are summarized in Table 1.

| Hypothetical population
We suppose that there is a hypothetical population comprising N m mothers and there is also no population subdivision or spatial structure. In this study, a mature female is called a mother even if she does not produce offspring. For mathematical tractability, we assume that only one spawning ground exists in which the mothers remain for the entire spawning season. Following (Akita, 2020), our modeling framework employs two types of overdispersed reproduction: parental and nonparental variations. Thus, the former indicates a variation caused by the mother's covariates, such as age, weight, and residence time on the spawning ground, while the latter indicates a variation caused by nonrandom stochastic events during a series of reproductive episodes, which are independent of the mother's covariates, such as family-correlated survivorship or the mating behavior effects (e.g., competition for males/females and correlation between mating opportunities of the mother and the number of her offspring). Figure 1 illustrates a schematic diagram of the effects of parental and nonparental variations exemplified by age-dependent reproduction and family-correlated survival on kinship relationships in a population.
Detailed definitions of parental and nonparental variations are stated in (Akita, 2020). Appendix 1 provides the theoretical foundation of both parental and nonparental variations in reproduction.

| Sampling
To obtain the estimator of N e,m ∕N m , we utilize the number of MS and MO pairs observed in a sample. For the mathematical tractability, only one reproductive season is targeted for sampling. Thus, the sampling method does not affect our modeling framework whether it is invasive or noninvasive. In the population, n M mothers are randomly sampled immediately after the end of the reproductive season. Additionally, in the population, n O offspring are also randomly sampled. The numbers of MS and MO pairs are determined by pairwise comparison of all the sample individuals ( n O C 2 and n M n O comparisons, respectively). As depicted in Figure 1, six offspring and three mothers are sampled and two MS and three MO pairs are observed. In our modeling framework, if an MS pair also shares a paternal-sibling (PS) relationship, we count it as an MS pair and ignore the existing full-sibling (FS) relationship.

| Linking N e /N m to kinship relationships
In this subsection, we provide the theoretical foundation for understanding how N e,m ∕N m is associated with kinship relationships in a population, based on work presented in previous studies. The rationale is that the observed number of MS and MO pairs has information about N e,m and N m , respectively, as noted below.
First, we consider the relationship between the number of MS pairs and N e,m . (Akita, 2020) defined the contemporary effective mother size as follows: where MS denotes the probability that two offspring share an MS relationship with an arbitrary mother. This definition is related to the inbreeding effective population size (Nordborg & Krone, 2002;Wang, 2009). When sampling from a single cohort in a population with overlapping generations, the effective mother size in our definition corresponds to the effective breeding mother size per breeding-time unit (e.g., year), which produces the single cohort and is denoted by N b,m (Waples, 1991). Hereafter, the description of the model focuses on species with discrete generations; thus, we use N e,m to denote the contemporary effective mother size, although N b,m is the appropriate notation in the left-hand side of Equation 1 when the target species is iteroparous with overlapping generations, as exemplified in Appendix 2.
(1) N e,m = 1 MS , F I G U R E 1 Example of relationships between mothers and their offspring number. The open, gray, and filled circles represent mothers, their eggs, and their offspring, respectively. The area of an open circle indicates the degree of reproductive potential of each mother (i.e., i ). The dotted and thin arrows denote mother-egg and egg-offspring relationships, respectively. The symbol × denotes a failure to survive at sampling. Sampled individuals are denoted with squares Given the total mother number and the degree of overdisepered reproduction in the population, (Akita, 2020)  The mathematical description of c is briefly summarized in Appendix 3.
Next, we consider the relationship between the number of MO pairs and N m . It is natural to consider that the probability of a randomly sampled mother and her offspring sharing an MO relationship (denoted by MO ) can be associated with the total mother number, given by It is noteworthy that MO is independent of the distribution of the offspring number (Akita, 2018).
Together with Equations 1 and 3, we finally obtain the ratio of the effective mother size to census size as follows: indicating that N e,m ∕N m is associated with kinship relationships (i.e., MS and MO) in a population. In other words, when 1∕ MS and MO is estimated from observed MS and MO pairs, respectively, the ratio can also be estimated. Meanwhile, (Akita, 2020) obtained an alternative formulation of the ratio using Equations 1 and 2: where N m ≫ 1 is assumed for approximation. This theoretical connection indicates that estimating N e,m ∕N m corresponds to estimating 1∕c.

| Estimator of N e,m /N
This subsection proposes the estimator of N e,m ∕N m as follows: A "hat" denotes the estimator of a variable in this study. The requisite condition that satisfies Equation 6 is independent of N e,m and 1 ∕N m . This property will be shown later in this subsection. Akita Let both n M and n O be given. We numerically confirmed that there is no correlation between H obs MO and H obs MS (results are not shown). To intuitively explain this independency, we consider three mothers (i = 1,2,3) and their offspring, and assume that k 1 ,k 2 ,k 3 = (3,1,1) and n M ,n O = (1,3). When the three offspring born to the first mother are sampled (i.e., H obs Meanwhile, when an offspring is sampled from each mother's offspring (i.e., H obs MS = 0), the expected number of MO relationship is also one (= 1∕3 × 1 + 1∕3 × 1 + 1∕3 × 1). Therefore, we conclude that both N e,m and 1 ∕N m are independent, and N e,m ∕N m is expected to work well (see details in the Section 3).
The bias of N e,m ∕N m is defined by b, which is approximately given by (see Appendix 5 for the derivation). ( AKITA It is noteworthy that N e,m ∕N m is downwardly biased, especially when n O is very small. However, this bias may be ignored for a wide range of parameters (see details in the Section 3). Theoretically, b asymptotically converges to zero as n O increases, making N e,m ∕N m a nearly unbiased estimator. Moreover, as demonstrated in the Section 3, it is observed that an extremely skewed reproduction breaks down the unbiasedness (e.g., in the case that c = 20 and 100 in the results).

| Individual-based model
We developed an individual-based model that tracks kinship relationships to evaluate the estimator's performance (e.g., N e,m ∕N m ). The population structure was assumed to be identical to that in the development of the estimators. In addition, the population comprised mothers and their offspring, and it was assumed to follow a Poisson or negative binomial reproduction (the degree of skewness due to the nonparental variation is controlled by a parameter ; see Appendix 2 for details). The expected number of the surviving offspring of a mother (denoted by ) followed the density distribution f ( ) (which is involved to the parental variation; see Appendix 2 for details). We calculated overdispersion parameter (c) from and f ( ) , as well as confirmed numerically that the value of c gives the same statistics of the estimators even if the combination of and f ( ) differs (results are not shown). Therefore, each offspring retained the mother's ID, making it possible to trace an MS and MO relationship.
Let a parameter set (n O , n M , N m , , and parameters that determine f ( )) be given. We simulated a population history in which N m mothers generated offspring; this process was repeated 100 times.
The sampling process for each history was repeated 10,000 times, acquiring 1,000,000 data points that were utilized to construct the distribution of the estimators for each parameter set. Furthermore, true value of N e,m was calculated from N m and c (Equations 1 and 2).  Figure S1 in Supporting Information). For most of the investigated parameter sets, we observed that their relative bias is less than 10%.

| RE SULTS
As expected, the relative bias is not affected by n M since 1 ∕N m is exactly an unbiased estimator of 1∕N m (see Equation A11 in Appendix 4 and also Figure S3 in Supporting Information). Meanwhile, N e,m is downwardly biased when n O is relatively small to true N e,m (e.g., see Figure S2 for c = 1 in Supporting Information), as presented in (Akita, 2020); thus, N e,m ∕N m is downwardly biased. Contrary to the theoretical prediction for the direction of the bias (Equation 10), relatively strong overdispersion results in an upwardly bias for N e,m ∕N m when N m is relatively small and c is relatively large (e.g., c = 20 and 100 in Figure 2a). This inconsistency may be caused by the breakdown of the approximation for deriving N e,m (equation S14 in Akita, 2020).

| D ISCUSS I ON
We theoretically developed a nearly unbiased estimator of the ratio CKMR method (for estimating 1∕N m ), which defines a kinship-oriented estimation of effective/census population size. Improvements to these methods directly contribute to the estimation of N e,m ∕N m .
Furthermore, the original theory of the sibship assignment method requires HS and FS pairs but does not require a distinction between the MS and PS pairs (Wang, 2009). This is a significant advantage due to the difficulty of the distinction from genetic data. However, the limitation of using MS or PS pair enables us to employ a nearly unbiased estimator of N e for particular sex (Akita, 2020), which greatly improves the accuracy of the estimation of the N e,m in this study and thus that of N e,m ∕N m (see figure E3 in Appendix 5). It is noteworthy that the estimator of 1∕N is given by where n P and H obs PO denotes the sample size of the parent and the observed number of parent-offspring (PO) pairs in a sample, respectively (Bravington, Skaug, et al., 2016). The development of the unbiased estimator of N e without a distinction between MS and PS pairs that could provide an unbiased estimator of N e ∕N coupled with Equation 11, is a study for the future. Furthermore, using cross-cohort HS pairs, the CKMR method also provides the estimator of N (Bravington, Skaug, et al., 2016) that does not require the sampling of the parent, which probably provides the estimator of N e ∕N only from unmatured samples. This perspective of the study will also be conducted in the future.

ACK N OWLED G M ENTS
The author thanks R. Nakamichi for fruitful discussions. The author also thanks Robin Waples and an anonymous reviewer for constructive feedback that substantially improved the quality of this manuscript. This work was supported by JSPS KAKENHI Grant Number 19K06862.

CO N FLI C T O F I NTE R E S T
The author declares no conflict of interest.

DATA AVA I L A B I L I T Y S TAT E M E N T
No datasets were generated or analyzed in this study.

D I S TR I B UTI O N O F TH E N U M B ER O F S U RV I V I N G O FF-S PR I N G PE R M OTH E R A N D IT S VA R I A N CE
Let k i denote the total number of surviving offspring from a mother i (i = 1,2, … ,N m ) at time of sampling. Following (Akita, 2020) and giving the expected number of the surviving offspring per mother at time of sampling, i (>0), k i is assumed to follow a negative binomial distribution through a conventional parametrization: where (>0) denotes the overdispersion parameter describing the degree of nonparental variation. In the current work, is assumed to be constant across mothers, whereas the expected number of surviving offspring ( i ) varies across mothers. The mean and variance of this distribution are denoted by i and i + 2 i ∕ , respectively. In the limit of infinite , this distribution becomes a Poisson distribution, which is We assume i to be independent and identically distributed with the density function f ( ), producing a parental variation. The shape of the density function is often complex, but may be described by information from the population, for example, the mother's weight composition in the population. The specific form of f ( ) is given in Appendix 2 and is used for running an individual-based model.
The variance of the offspring number can be given by

PRO BA B I LIT Y D EN S IT Y FU N C TI O N A N D ITS M O M ENT O F O FFS PR I N G N U M B E R PE R M OTH E R
As stated in the main text, our modeling framework does not require the specific form of the density function of offspring number per mother (denoted by f ( )); it only requires the ratio of the second moment to the squared first moment ( [ 2 ∕ ] 2 ) instead. However, the specific form is required for evaluating the theoretical results (i.e., calculating the moment or running the individual-based model).
Herein, we model an age-structured fish population that serves as a representative example, demonstrating both parental and nonparental variations. The following contents are almost the same as those of (Akita, 2020) except for the parameter values that produce the setting c = 20 and 100.
Suppose that the mean fecundity of a mother depends on her age. where S denotes a survival probability. The maturity at age (Q(a)) is assumed to be a knife-edge function, which is given by where a mat denotes the mature age.
To calculate [ 2 ∕ ] 2 , the required parameter set is a max , ,a 0 , ,S,a mat . In this study, for the purpose of representation, we fixed the values of several parameters as follows: a max = 20, Finally, we provide specific forms of f ( ); thus, when a and h mat (a) are obtained, f ( ) is given by

TA L VA R I ATI O N S
Following (Akita, 2020), combined effect of parental and nonparental variations can be partitioned into the two components, such as: The first and second parentheses give the effect of nonparental variation and parental variation, respectively. When is con-

A PPE N D I X 4
Derivation of the bias of N e,m ∕N m For calculation of the bias of N e,m ∕N m , we require an expectation of the estimator given by As stated in the main text, both N e,m and 1 ∕N m are independent. Thus, the first term in the right-hand side of Equation A10 can be ignored. The expectation of 1 ∕N m is given by

WA N G ' S E S TI M ATO R
Our modeling framework estimates the ratio of contemporary effective mother size to census size (denoted by N e,m ∕N m ), as a proxy for the ratio of contemporary effective size to census size (denoted by N e ∕N). To compare the performance of our method with that of other methods for directly estimating N e ∕N, we propose the estimator of N e ∕N, given by n P and n O indicate the sample numbers of parents and offspring, respectively, and H obs HS , H obs FS , and H obs PO indicate the observed number of half-sibling, full-sibling, and parent-offspring pairs, respectively. N e is based on (Wang, 2009) assuming random mating and 1 ∕N is defined in Equation 11 in the main text.
We evaluated the estimator's performance on data simulated by running an individual-based model under the Wright-Fisher process for a diploid population. In the current comparison, we did not consider the case of overdispersion (i.e., N e = N). Sex ratio was fixed to 0.5 in both whole and sampled populations. Each parent retained the ID of its offspring, making it possible to trace HS, FS, and PO relationships. Given a parameter set (N, n O , and n P ), we backwardly simulated a population history in which mother-offspring and father-offspring relationships were randomly specified from n O offspring; this process was repeated 10,000 times, acquiring 10,000 data points that were used to construct the distribution of N e ∕N for each parameter set. If neither HS nor FS pairs were found in a sample, we did not include that trial when constructing the distribution. For ease of comparison, we used the same sample size and population structure for both methods. Figure A1 illustrates the distribution of the relative bias of N e ∕N for limiting cases where parent and offspring sample numbers are identical (i.e., n P = n O ). Our results indicate that the estimator is upwardly biased, particularly when sample size is small. Alternatively, our estimator shows a broader region such that unbiasedness is approximately achieved. For example, when n = 100 and 150 in N = 2,000 (i.e., N m = 1,000), the bias of N e ∕N is clearly observed ( Figure E1a); meanwhile, N e,m ∕N m does not produce a bias in these conditions ( Figure 2a)