Age is not just a number: How incorrect ageing impacts close‐kin mark‐recapture estimates of population size

Abstract Population size is a key parameter for the conservation of animal species. Close‐kin mark‐recapture (CKMR) relies on the observed frequency and type of kinship among individuals sampled from the population to estimate population size. Knowledge of the age of the individuals, or a surrogate thereof, is essential for inference with acceptable precision. One common approach, particularly in fish studies, is to measure animal length and infer age using an assumed age‐length relationship (a ‘growth curve’). We used simulation to test the effect of misspecifying the length measurement error and the growth curve on population size estimation. Simulated populations represented two fictional shark species, one with a relatively simple life history and the other with a more complex life history based on the grey reef shark (Carcharhinus amblyrhynchos). We estimated sex‐specific adult abundance, which we assumed to be constant in time. We observed small median biases in these estimates ranging from 1.35% to 2.79% when specifying the correct measurement error standard deviation and growth curve. CI coverage was adequate whenever the growth curve was correctly specified. Introducing error via misspecified growth curves resulted in changes in the magnitude of the estimated adult population, where underestimating age negatively biased the abundance estimates. Over‐ and underestimating the standard deviation of length measurement error did not introduce a bias and had negligible effect on the variance in the estimates. Our findings show that assuming an incorrect standard deviation of length measurement error has little effect on estimation, but having an accurate growth curve is crucial for CKMR whenever ageing is based on length measurements. If ageing could be biased, researchers should be cautious when interpreting CKMR results and consider the potential biases arising from inaccurate age inference.


| INTRODUC TI ON
Close-kin mark-recapture (CKMR) is a method for estimating population size and other key parameters such as fecundity (and population growth and survival rates) using data on the relatedness of individuals sampled from the population (Bravington, Skaug, & Anderson, 2016;Skaug, 2001).The key rationale is that small populations will tend to contain a higher proportion of closely related individuals than large populations.
One of the main advantages of CKMR over capture-recapture (Otis et al., 1978) and its extensions such as spatial capturerecapture (Borchers & Efford, 2008) is that it can be applied in cases when sampling is necessarily lethal, such as fisheries, and when physical recaptures are rare or impossible, where alternative metrics are often relative (e.g., catch-per-unit-effort) and potentially unreliable (Bravington, Grewe, & Davies, 2016;Casas & Saborido-Rey, 2023).This is because CKMR does not require the recapturing of individuals, but rather their genetic markers.Offspring share genetic information with their parents (hence 'kin'), thus they 'mark' their parents when born; through modern genetics we can compare sampled individuals with one another to see if these marks are 'recaptured'.So far, CKMR has been developed for parent-offspring pairs (POPs; e.g., Bravington, Grewe, & Davies, 2016;Ruzzante et al., 2019;Trenkel et al., 2022), half-sibling pairs (HSPs; e.g., Hillary et al., 2018;Bravington et al., 2019;Patterson et al., 2022), and the combination of both (e.g., Bradford et al., 2018).The rise in popularity of the method has become clear from an increase in published studies involving CKMR, although the total number of applications is still small (Delaval et al., 2023).Most of the applications up to this point involved marine or aquatic species.Several salmonids have been studied (Prystupa et al., 2021;Ruzzante et al., 2019;Wacker et al., 2021), as well as large pelagic species such as southern bluefin tuna (Thunnus maccoyii; Bravington, Grewe, & Davies, 2016) and the pelagic bluefin tuna (Thunnus orientalis; Tsukahara et al., 2023), and a variety of elasmobranchs such as white sharks (Carcharodon carcharias; Hillary et al., 2018), lemon sharks (Negaprion brevirostris; Swenson et al., 2024), thornback rays (Raja clavata ;Trenkel et al., 2022), blue skates (Dipturus batis; Delaval et al., 2023), and grey nurse sharks (Carcharias taurus; Bradford et al., 2018).The Christmas Island ying-fox (Pteropus natalis; Lloyd-Jones et al., 2023) and the yellow fever mosquito (Aedes aegypti; Sharma et al., 2022) were the only terrestrial species that we could identify in published CKMR studies to date.
CKMR with POPs can estimate the size of the entire adult population, whereas with HSPs only the breeding adult population is estimated, for example, post-reproductive adults are 'invisible' for the method (Bradford et al., 2018).Here, we focus only on POPs.For any comparison between two individuals, the probability that a potential offspring truly is the offspring of the parent is inversely related to the number of mature animals alive in the birth year of the offspring.
Probabilities of finding a kin pair are expressed as a function of the expected relative reproductive output (ERRO) of the parent in the year that the offspring was conceived.This approach is parent-centric, as it starts from the point that the parent is sampled and then formulates a probability for a PO relationship (an alternative, offspringcentric formulation was proposed by Skaug (2017)).
In the simplest scenario, the probability of any adult being the parent of a juvenile reduces to two over the number of potential parents, assuming a 50:50 sex ratio; in reality, this probability is often more complicated, for example, when reproductive output is related to age, or when there is stock structure or population trend.To use relatedness to estimate adult population abundance with acceptable precision, it is therefore essential to accurately age the studied animals because birth year is derived from their age.
Accurate ageing can be challenging: for example, epigenetic ageing requires calibration using individuals of known age (De Paoli-Iseppi et al., 2017;Polanowski et al., 2014), which is not always possible; ageing via otoliths, which are calcium carbonate structures in the inner ear, can be relatively accurate (Campana, 2001) but requires lethal sampling and is only possible for animals that have otoliths (and sharks are not among those); and ageing by counting the dental or cementum growth layer groups in teeth is not necessarily lethal and commonly used for (marine) mammals (Hohn, 2009, Chapter 9), but cannot be applied to fish species.Sharks can be aged from their vertebrae, but this is a lethal procedure and can be biased in various ways or even unusable depending on the species (Burke et al., 2020).
Alternatively, length can be used to infer age through growth curves, which seems appealing as length is often recorded during sampling.
Accurate estimates for growth curves of the studied species are not always available, however, and age as a function of length (ageat-length) can vary substantially between populations of the same species (e.g., Bradley, Conklin, Papastamatiou, McCauley, Pollock, Kendall, et al., 2017).Moreover, length measurements often involve measurement error.Swenson et al. (2024) studied the effects of ageing error from incorrect length measurement through simulation and found that incorrect ageing can induce substantial bias in CKMR parameter estimates.Various degrees of error were added to the true lengths of individuals, after which these were converted to ages using a von Bertalanffy growth curve.These 'incorrect' ages were then used as inputs for the CKMR model without explicitly modelling the length measurement error.
Simulation is an important tool to assess the robustness of statistical methods to violations of model assumptions (DiRenzo et al., 2023) and their performance more generally (Morris et al., 2019).Through simulation, Conn et al. (2020) studied the effects of unmodelled spatial heterogeneity on CKMR estimation and found that this can induce a negative bias in the abundance estimates; Sévêque et al. (2024) found that fitting overly simplistic CKMR models (that do not account for complex life-history traits or selective sampling) can cause biases in survival and estimates in non-trivial directions; and Waples and Feutry (2022) showed, among other things, that age-specific vital rates can bias abundance estimates from CKMR.We follow an agent-based simulation approach similar to Swenson et al. (2024) to explore the effects of incorrect ageing on the CKMR adult abundance estimator.Unlike Swenson et al. (2024), our model does not assume length (and thus age) to be perfectly known but rather we explicitly account for the measurement error on lengths.
It is often important for demographic modelling to account for the uncertainty in the age estimates, especially when sampling probabilities depend on the age of individuals (i.e., when there is 'selectivity'), which is fundamental to fishing (Vasilakopoulos et al., 2020).Correctly accounting for ageing error is therefore still an active part of fisheries research (e.g., Hulson & Williams, 2024).Fournier and Archibald (1982) showed how ageing error in catchat-age data can be accounted for as long as the ageing error is known.Later, Richards et al. (1992) developed statistical methodology to account for ageing error when the error is unknown, using multiple readings of fish.We are unaware of any CKMR studies in which ageing error is directly modelled and estimated.Bravington et al. (2019) accounted for the uncertainty in ageing by first fitting a known-age CKMR model to the data and then refitting the model ten times, resampling ages from the age-at-length curve each time.In our simulation, ageing error is introduced in two ways: (i) through misspecified growth curves, and (ii) through incorrect length measurements, that is, measurement error.In reality, error could also be (and almost surely is) introduced within a population through natural variation in length-at-age, for example, as a function of genetic and environmental factors.We assume that all individuals follow the growth curve perfectly; however, one could readily interpret the length measurement error as the joint error of length measurement and length-at-age variation, or even solely as length-at-age variation if that is more appropriate for a particular case study.We assume ageing error from incorrect length measurements to be known and explicitly account for it in our model (Bravington, Skaug, & Anderson, 2016, Section 3.1.4).
The research presented in this manuscript is centred around two fictional shark species that are based on a grey reef shark population (Carcharhinus amblyrhynchos) at Palmyra Atoll, in the central Pacific Ocean (Bradley, Conklin, Papastamatiou, McCauley, Pollock, Pollock, et al., 2017;Papastamatiou et al., 2018).This motivating case study consists of genetic samples that were collected from this population in 2013 and 2014.One fictional species is a simplification of the real species (hereafter referred to as the 'simple species') and was included to test the basic performance of the model.The other fictional species has more realistic life history traits (hereafter referred to as the 'complex species') and was included to more closely match a real empirical study.We also compare the results for both species.
It is paramount to first explore the feasibility of CKMR, for example through simulation, before committing the resources and time required for the correct collection and genetic analysis of the samples.
Moreover, the findings will be relevant to other CKMR studies when age is uncertain.

| MATERIAL S AND ME THODS
We first present our setup of the simulations for the two fictional shark species.Simulated time series are 100 years long, with sampling occurring in the final 2 years (mimicking the 2 years of sampling in the Palmyra Atoll case study).Following that, we present the POP-based CKMR models for our two species using these 2 years of data, followed by our estimation method and performance diagnostics.We assume that kinship relationships are known with certainty; in real-life situations, one often needs to account for uncertainty in this process (Bravington, Skaug, & Anderson, 2016).All variables and quantities used in this study are summarised in Table 1.
Code for the simulation and fitting of models was written in R 4.3.2 and C++14, where the latter was linked to R through Rcpp 1.0.12(Eddelbuettel, 2013;R Core Team, 2023).
Two different 'species' were simulated separately, one with simple life history characteristics, and one with a more complex life history.For each simulation, sampling in the last 2 years was random, and mating occurred at random as well, that is, mothers and fathers were matched at random, where all non-gestating mothers mated and mature males could father multiple litters in the same mating cycle.Females of the simple species always produced two offspring, whereas the litter size for the complex species ranged from 3 to 6, with equal probability.Females of the simple species reproduced every year as gestation was negligible; females of the complex species gestated for a year and therefore reproduced every other year.
Newborns had age zero and sex was assigned at random with an expected 50:50 sex ratio.The survival process was Bernoulli where the annual survival probability was the same for all ages and sexes, but different between the two species and empirically set at a level that resulted in the yearly population growth rate equalling approximately one, that is, no growth.Natural mortality was the only source of mortality we considered, and all individuals that reached the maximum age perished at the next survival event, that is, animals could go through at most a max + 1 yearly cycles.The maximum age for sharks of the simple species was 19 years and 63 years for the complex species, where the latter matches the results from Bradley, Conklin, Papastamatiou, McCauley, Pollock, Kendall, et al. (2017).For a given species, all individuals of the same sex matured at the same age: males and females in the simple species matured at 10 years old, whereas in the complex species males matured at 17 years and females matured at 19 years of age.The length of an animal was the same for all individuals of a certain age, irrespective of sex and species.After the initialisation of a population in year zero, the simulation looped through four distinct events: a birthing/mating event, a sampling event (only in the final 2 years of the simulation), a survival event, and an ageing event (Figure 1).
For both species, we ran the simulations for 100 years, to ensure that all animals of the initial populations would have died off.Every simulation started with 8500 individuals to stay close to the population size estimate of 8433 by Bradley, Conklin, Papastamatiou, McCauley, Pollock, Pollock, et al. (2017), with an expected 50:50 sex ratio.At each sampling event, 375 individuals were randomly and non-lethally sampled, where re-captures were possible between sampling events.This resulted in at most 750 unique sampled individuals across the 2 years of sampling, which is of a similar scale as the number of genetic samples available in the motivating case study.All 750 samples were retained for analysis as there was no particular reason to exclude recaptures, unlike, for example, Hillary et al. (2018), where duplicate samples were excluded from the analysis to avoid them aliasing as half-sibling pairs.For every sampled individual, the age, year of capture, and sex were recorded; the

F I G U R E 1
Flowchart representing the different stages of the life-cycle for the simulation.A population is initialised at the start of a simulation.Following that, it loops through stage 1-4 every year the simulation runs.
true length was derived through a von Bertalanffy growth function (VBGF; von Bertalanffy, 1938;Francis, 1988), that was specified as where ∞ = 163 cm is the asymptotic length, a 0 = − 8.27 the theoretical age at length zero, and k = 0.0554 denotes the growth coefficient.
These values match the estimates of the best model in table 2 of Bradley, Conklin, Papastamatiou, McCauley, Pollock, Kendall, et al. (2017).
Gaussian noise was added to reflect (symmetric) length measurement error with variance 2 = 2.89 2 , with over-and underestimates being equally likely, after which this 'observed' length was rounded to the nearest integer.Based on these parameters, we generated 1000 different realisations of a 100-year-long population history for each species, using functions based on those from the fishSim-package (Baylis, 2022).

| POP-based estimator
We developed estimators for both populations based only on POPs.
Any other possible genetic relationship (such as half-sibling or selfcapture) was categorised as 'not a POP'.CKMR models are generally fitted through a likelihood (function), which is constructed from the joint distribution of all pairwise comparisons between the samples, that is, the product of approximately n(n − 1) ∕ 2 Bernoulli trials for a POP, where n is the number of samples.We only consider pairwise comparisons and treat these as independent, whereas they clearly are not: an offspring can only have one parent of each sex.Because we ignore these higher dependencies, our likelihood is not a true likelihood but rather a pseudo-likelihood.Working with such a likelihood should not affect the point estimates but could affect other properties of likelihood-based estimation, such as variance estimation, although this effect is likely minor or even negligible provided that a small proportion of the total population is sampled, that is, n ≪ N (Bravington, Skaug, & Anderson, 2016;Skaug, 2001).Because length is measured with error and age is inferred from length, age is uncertain and hence we cannot assume directionality in the comparison, that is, who is the parent and who is the offspring.Therefore, for any comparison for individual i and j, we test both directions (parent-offspring and offspring-parent), denoted PO/OP.In practice, we tend to optimise the logarithm of the pseudo-likelihood, the so-called 'pseudo-loglikelihood', as this is generally easier to work with and numerically more stable.Our pseudo-log-likelihood is given by where is the parameter vector, x denotes the observed data, K ij is the kinship between i and j, Pr is the probability function, ij is an indicator that is 1 if the kinship between i and j is observed to be PO/OP and 0 otherwise, and z denotes the information recorded about a captured individual, such as length.Age is required to calculate the probability of observing kinship, and therefore we sum over all potential ages for i, j and multiply by the probability density of that age given the measured length, f(a| l * ): We will now specify the two main elements of Equation (3), namely the probability of observing the PO/OP kinship, and the probability density of age given length.

| Probability of kinship
We modelled the female and male adult abundance separately; thus, for every PO/OP comparison between two individuals we had to consider, conditional on the sexes, both combinations of which individual in older and thus the potential parent.We will first present the formulae for the simple species, followed by those for the complex species.
The probability of any comparison between i and j being PO/OP is the same as the sum of testing for PO and OP separately, thus we only present the PO probabilities.For the simple species this became for the females, and for the males.Here, is an indicator function that returns 1 if its argument is true and 0 otherwise, MO and FO refer to mother-offspring and father-offspring, respectively, y denotes the birth year, the age of maturity, N s,t the total adult abundance of sex s in year t, c the year of capture, and i t 1 , t 2 the survival function for individual i from t 1 to t 2 .As survival was assumed constant, i t 1 , t 2 was defined as t 2 −t 1 .
Even though females could only have one litter whereas males could father multiple litters, their ERROs were formulated similarly, that is, the reciprocal of the total mature abundance of their respective sexes.
For the complex species, the probability of an MO pair thus became and the probability of an FO pair became The two key differences between the complex species relative to the simple one were that (1) a potential father only needed to (4) (5) .
have been alive the year before the birth of the offspring, whereas a potential mother needed to have survived until birthing, and (2) the potential parents needed to have matured at least 1 year before the birth year.To illustrate this, imagine that we are comparing two individuals from the complex species, where the parent is female, and we know the individuals' ages.The offspring was caught in year 50 at age 3, and thus born in year 47.The potential parent was female, and caught in year 45 and would have needed to survive for at least 2 years in order to be a potential parent; she was 36 years old at the time of capture, and thus born in year 9.The ERRO for this parent in the year of mating, that is, the year before the birth year of j, is the reciprocal of the number of females alive in that year who also survived 1 year of gestation, which is .Therefore, the probability that i is the mother of j would be: Every comparison, given a i and a j , contains a signal about the adult population in a specific year.We assumed a constant population size, and thus N s,t = N s .We also developed and tested a model that included sex-specific growth parameters.This model was internally inconsistent and therefore not included in the main body of this manuscript for any formal inference.However, we did include the derivations and some results in Appendix C.

| Probability density of age given length
We had an assumed true length-at-age curve l(a) (Equation ( 1)) and we knew that there was measurement error on lengths.Denoting the measured length as l * , we derived the probability density f(a| l * ) using Bayes' rule as follows: Measured length given age l * | a was assumed to follow a discretised Normal distribution, as lengths were rounded to the nearest centimetre.We followed Roy (2003) in defining this distribution as where Φ denotes the standard normal cumulative distribution function, the expectation is given by Equation (1), and l captures the standard deviation of length measurement error.As the sampling probability in the simulation was unrelated to age, the age distribution of sampled individuals was the same as the age distribution in the whole population, and we did not need to distinguish between the two.
We assumed that the population had a stable age distribution with no growth, which meant that the distribution of ages, had we not imposed a maximum age, would have been geometric with shape parameter being equal to the mortality rate, which is 1 − .Acknowledging that there was a maximum age, a max , we needed to condition on the age being at most this age, and thus where the numerator and denominator were the geometric probability mass and cumulative distribution functions, respectively.Note here that we used the definition of a geometrically distributed variable being the number of failures (survival) until a success (death) occurs.Finally, the probability density function on measured length became

| Fitting
The parameters in the CKMR model were estimated from the sam- underestimate, and a 33% and 67% overestimate.We also considered five different growth curves: the correct one, two that were shifted upwards by 5% and 10%, and two that were shifted downwards by 5% and 10%.These growth curve shifts were aimed to represent real variation in growth curves between populations of the same shark species (Bradley, Conklin, Papastamatiou, McCauley, Pollock, Kendall, et al., 2017).This resulted in a total of 25 combinations or 'scenarios'.We labelled these scenarios using the format 'ME±XX:GC ± YY', where ME refers to the measurement error, XX denotes the percentage over-or underestimate, GC stands for growth curve, and YY denotes the percentage of up-or downwards shifting; for example, the scenario with a 33% overestimated standard deviation of length measurement error and a 5% downshifted growth curve had the label ME+33:GC-5.These measurement errors and growth curves are visualised in more detail in Figure 2.
Considering 25 scenarios for every simulation resulted in the fitting of 50,000 models in total.To keep computation time to a minimum, we implemented most of the fitting process in C++. (8)

| Variance and performance
To evaluate the performance of the estimator, we present the following metrics: (i) mean error and mean relative error to evaluate a potential bias; (ii) median error and median relative error to evaluate the median bias, which uses the median instead of the mean, as the median is often more appropriate when distributions are skewed.In addition, the mean absolute error (MAE) and root mean square error (RMSE) are presented in supplemental tables.The definitions of the six metrics are given in Appendix A.1.Furthermore, we derived the 95% log-normal confidence interval (CI) coverage to evaluate the performance of these CIs in the correct growth curve scenarios.
Variance was estimated from the Hessian matrix produced by the maximum likelihood estimation, and averaged over these 1000 estimated standard errors.We can treat the pseudo-likelihood as a true likelihood as long as sampling was sparse (see Section 2.2).It is unclear if this criterion was met in our study, as we took 750 samples from a population with roughly 8500 individuals.If sampling is not sparse, the estimated variance could be negatively biased as the pairwise comparisons are not approximately independent.To explore the extent of this potential bias, we evaluated how well the average estimated standard error estimated the empirical standard deviation of population estimate errors across the 1000 simulations for each species.We include definitions of these in Appendix B.1.

| RE SULTS
The mean number of POPs for all sampling realisations was 48.6 (range: 25-76) for the simple species and 55.6 (range: 31-90) for the complex species.Mean simulated adult abundances in the final year of the simulation were 794 and 793 (range: 630-992 and 600-1019; ♀ and ♂) for the simple species and 514 and 650 (range: 400-683 and 516-824; ♀ and ♂) for the complex species.A small number of recaptures between the 2years of sampling, that is, that some individuals were sampled at more than one sampling event, occurred in every simulation.This ranged from 4 to 25 recaptured individuals for the simple species and 5-30 individuals for the complex species.
The simulated mean annual growth rate was 0.999 for both sexes of the simple species, and 1.001 and 0.998 for the males and females of the complex species, respectively; the mean annual growth for any simulation was always within 0.3 percent point from the mean across all simulations.The fitting algorithm did not always converge when the measurement error standard deviation and/or the growth curve was (very) negatively biased.Whenever this happened, it happened for most of the simulations in that scenario.Therefore, we excluded the scenarios where this happened from the analysis, which led to the exclusion of scenarios ME-67:GC-10, ME-67:GC-5, ME-67:GC + 0, ME-33:GC-10, ME-33:GC-5, and ME+0:GC-10.In the other scenarios, all models converged successfully.
For the simple species, median errors for Ns when using correct measurement error and growth curve specification (ME+0:GC + 0) were 20.83 and 22.52 (relative: 2.57% and 2.79%; ♀ and ♂) individuals (Figure 3; Tables A1 and A2).For the complex species, median errors for Ns when using correct measurement error and growth curve specification were 6.48 and 10.28 (relative: 1.35% and 1.59%; ♀ and ♂) individuals (Figure 4; Tables A3 and A4).For the simple species, median relative errors in abundance estimates were positive but close to zero for all deviations from the true standard deviation of length measurement error provided that the growth curve was correctly specified, although they were slightly larger for the females (Figure 3, also Table A1).For any given measurement error standard The left panel shows the five growth curves that were used in the scenarios tested in this study.The true growth curve is indicated in red; the black dotted lines show the incorrect ones, which were constructed by shifting the growth curve up and down in steps of 5%.These shifts were aimed to represent real variation in growth curves between populations of the same shark species (Bradley, Conklin, Papastamatiou, McCauley, Pollock, Kendall, et al., 2017).The right panel shows five measurement errors used in this study.The true simulated error was 2.89 cm, and the other measurement errors were chosen by deviating from this error in both directions.

F I G U R E 3
Box plots for the error in estimated sex-specific adult abundance relative to the true abundance for the simple species.We only present the results for scenarios in which the optimiser consistently converged; this meant that some scenarios were left blank.Box plots show the interquartile range (IQR) and the median; the mean is indicated by the darker filled circle; the vertical lines cover five times the IQR; and all values outside of that are indicated as outliers.The scenarios were labelled using the format 'ME±XX:GC ± YY', where ME refers to the measurement error, XX denotes the percentage over-or underestimate, GC stands for growth curve, and YY denotes the percentage of up-or downwards shifting; for example, the scenario with a 33% overestimated standard deviation of length measurement error and a 5% downshifted growth curve had label ME+33:GC-5.

F I G U R E 4
Box plots for the error in estimated sex-specific adult abundance relative to the true abundance for the complex species.We only present the results for scenarios in which the optimiser consistently converged; this meant that some scenarios were left blank.Box plots show the interquartile range (IQR) and the median; the mean is indicated by the darker filled circle; the vertical lines cover five times the IQR; and all values outside of that are indicated as outliers.The scenarios were labelled using the format 'ME±XX:GC ± YY', where ME refers to the measurement error, XX denotes the percentage over-or underestimate, GC stands for growth curve, and YY denotes the percentage of up-or downwards shifting; for example, the scenario with a 33% overestimated standard deviation of length measurement error and a 5% downshifted growth curve had label ME+33:GC-5.deviation, we observed a trend from a positive median error to a negative median error as we shifted the growth curve upwards (Figures 3 and 4).When growth curves were shifted down 5%, this resulted in median relative errors of around 30% for the simple species, and between 30% and 60% for the complex species.Shifting growth curves up by 5% resulted in median relative errors of −30% for the simple species, and between −30% and −40% for the complex species.
When the growth curve was correctly specified, the 95% CI coverage (rounded to one decimal place) for the simple species adult abundance estimates ranged from 96.1% to 96.4%, and ranged from 94.4% to 95.9% for the complex species estimates (Table B3).For a given growth curve, no relation between the measurement error standard deviation and CI coverage became apparent.Incorrectly specified growth curves severely lowered the CI coverage for all measurement errors for both species (Table B3).When the growth curve was correctly specified, the empirical standard errors ranged from 180.97 to 182.56 for the male simple species, and 190.85 to 192.56 for the female simple species (Table B1); for the complex species, these errors ranged from 145.77 to 148.42 for the males, and from 107.51 to 109.27 for the females (Table B2).Given a growth curve, increasing the measurement error standard deviation seemed slightly decrease the empirical standard errors, for all species and sexes.Given an assumed standard deviation of measurement error, the empirical standard errors decreased as the growth curve was shifted upwards.When the growth curve was correctly specified, the average estimated standard errors ranged from 181.04 to 183.04 for the male simple species, and 183.54 to 185.51 for the female simple species (Table B1); for the complex species, these errors ranged from 139.79 to 142.41 for the males, and from 102.99 to 105.71 for the females (Table B2).The changes in average estimated standard errors between the scenarios follow a similar pattern to the empirical standard errors.
Whenever the growth curve was correctly specified, average model estimated standard errors were always (slightly) lower than empirical standard errors in all scenarios and both species, except for the male simple species (Tables B1 and B2).The underestimation of the empirical standard error by the average estimated standard error was always < 5%.Deviations from the correct growth curve increased underestimation in all cases (Tables B1 and B2).

| DISCUSS ION
In this study, we explored the effects of incorrect age inference from length measurements on CKMR estimates of adult abundance through misspecifying the length measurement error and the growth curve in various ways.The number of POPs discovered in our simulation was in the vicinity of the 50-100 kin pairs recommended for a CKMR application (Bravington, Skaug, & Anderson, 2016), albeit on the lower end.Overall, an incorrect assumed standard deviation of measurement error mostly impacted the convergence likelihood of the fitting algorithm, whenever this standard deviation was assumed to be smaller in the fitting than was true for the simulation.Whenever the measurement error standard deviation was high enough to allow for convergence, it made little difference whether it was the true value or if a much higher standard deviation was assumed.This would suggest that if researchers are ever unsure about whether their assumed degree of spread in length measurement error is correct, it is safer to overestimate it.A misspecified growth curve, on the other hand, had drastic effects on the estimation of all parameters: a 5% shift away from the true growth curve resulted in biases ranging from −60% to +40%; estimated and empirical standard errors seemed to scale with the abundance estimates, and shifting away from the true growth curve resulted in an increased underestimation of empirical standard errors.
The model performed well under correct specification (scenario ME+0:GC + 0), although the positive median relative error in adult abundance estimates suggests a positive median bias.This error was more extreme for the simple than for the complex species.A bias in the estimates is not uncommon for maximum likelihood methods when the sample size is small, which could be true in our study as the number of sampled POPs never surpassed 76 for the simple species and 90 for the complex species.However, it could also be that this shows a slight positive bias in the method itself, especially as a previous CKMR simulation study by Conn et al. (2020) found small positive biases in the abundance estimates, too.We can express the empirical standard errors for the correct scenarios as percentages of the associated mean simulated abundances in the final year.This gives standard errors relative to the mean (also known as coefficients of variation) of 23.0% and 24.3% for the male and female estimates of the simple species, respectively, and 22.7% and 21.2% for the male and female estimates of the complex species, respectively.These are high but not uncommon for real-life population studies.
The 95% log-normal confidence intervals (CIs) seemed to accurately represent the uncertainty around the estimates whenever the correct growth curve specification was used, as the coverage ranged from 94.4% to 96.4%.Nonetheless, the coverage always exceeded 95% when the entire model was correctly specified, which could indicate that the 95% log-normal CIs were slightly conservative.
In this study, we assumed that all individual sharks followed the specified growth curve perfectly, and any variation in lengths for a given age resulted from measurement error.This is a simplification of reality, and future research could focus on ways to accommodate natural variation in length at a given age, which could be a function of age in itself.As an incorrect standard deviation of length measurement error seemed to have little effect on point estimates, we believe that, when in doubt, it is preferable to assume a higher standard deviation as this improves how likely it is that the fitting algorithm converges.
The effects of deviating from the true growth curve on the adult abundance estimates were substantial.When growth curves were shifted by 5% we often observed median relative errors of over 30%.This strongly highlights the sensitivity of the method to correct age estimation.Empirical standard errors were also increasingly underestimated when growth curves were shifted away from the truth.This effect was stronger when the growth curve was shifted upwards, that is, when ages were being underestimated.An underestimation of uncertainty could be a consequence of the comparisons not being truly independent, that is, a violation of the sparse sampling assumption.It is important here to note that we did not evaluate the standard deviation of abundance estimates but rather of the error in abundance estimates (see Appendix B.1).The true abundance was different for every simulation, so we could not use the standard deviation of the abundance estimates itself, since this would partly capture the stochasticity of the simulation process.To overcome this, we used the standard deviation of the error in abundance estimates, which should be a more robust measure of the true standard error.This should not be a problem as long as estimation is unbiased; however, our results indicate a slight positive bias, which could have impacted the accurateness of the empirical standard error in being a measure of true standard error.CI coverage was most severely impacted by incorrect growth curves; however, this was likely mostly due to the bias in the estimates in those scenarios.
In real-world applications, researchers could potentially check the correctness of their assumed growth curve by assessing the distribution of lengths/ages among the sampled individuals.If many of the observed lengths are either associated with very low ages or are close to asymptote, or in some other way exhibit an unexpected sampled age distribution given the sampling scheme, then this could be an indication that the assumed growth curve is incorrect (or that sampling assumptions are violated).
Even though recaptures should be rare (as long as sampling is sparse), they did occur in our simulations between sampling years.These recaptures did not pose any problems within the analysis, for example, getting mistaken for a different genetic relationship, and thus we retained the recaptures in our data.Alternatively, duplicate samples can be excluded from the analysis when there is reason to do so.We hypothesise that excluding recaptures would likely increase estimates of uncertainty, as fewer observations are used for the analysis.We are unaware of any study that investigated the extent to which including recaptures could potentially affect precision or even bias in CKMR estimation, and we believe that this could be a great topic for future research.Whenever it is known that multiple samples belong to a single individual, there exists the potential for extending CKMR by incorporating some form of capturerecapture into the method (Bravington, Skaug, & Anderson, 2016;Otis et al., 1978).Additionally, it could also allow us to fit the growth curve jointly with the CKMR model, instead of assuming it to be known by extrapolating from other studies (Bravington et al., 2019).This could create a situation where one collects new samples every year to update the model, thereby continuously improving the estimates not only of the abundance and trend, but also of the growth curve: in a Bayesian framework, one could use the initial growth curve as prior information, and then update the posterior every year as more information is collected.
In our model, we did not allow for any growth or decline in the population size over time.Our simulated populations exhibited no systematic growth, but the stochastic nature of the process did lead to some random growth/decline.One could consider estimating a growth rate, or assume a small range of growth rates (see Hillary et al. (2018) for an example with white sharks (Carcharodon carcharias)).The main challenge would be to understand how including a growth rate parameter affects the assumed age distribution f(a).We can imagine three general population growth scenarios.
If a population is stable but growing or in decline, the assumed age distribution will be geometric and depend on a combination of survival and growth rate (Caswell, 2006, Section 4.5.2.1).The second scenario is when a population exhibits changing growth or decline, in which case there is no stable age structure.We believe that this scenario is intractable, and it would make a good subject for a robustness study to see how much it affects estimation.The third scenario would be where there is no expected population growth or decline but there is demographic stochasticity, which in practice could result in deviations from the stable age structure.
For this scenario, an option could be to use the method described by Hillary et al. (2018), where the measured lengths were binned and a multinomial distribution was fit to these binned data to estimate the distribution of sampled ages.Still, this could be a topic for future research to see what other methods exist to find the distribution of (sampled) ages.
CKMR involves many pairwise comparisons, which often involve many identical probabilistic statements.To limit computation time, we evaluated unique probabilistic statements only once.If further computational improvements are required, it is possible to reduce the number of pairwise comparisons that are evaluated by excluding a subset of comparisons from the analysis.For example, the lengthage relationships are often much clearer for younger animals, and therefore one could choose to only consider animals up to a certain size as potential offspring (Trenkel et al., 2022).
In our simulation and model, we assumed some life history traits to be fixed and known, but this is not always required for CKMR.
We estimated sex-specific adult abundance only in our model and assumed quantities such as survival to be known and fixed.In order to relax the assumption of a fixed and known survival parameter , one could estimate it by including half-sibling pairs alongside parent-offspring pairs (Bravington, Skaug, & Anderson, 2016).
Parent-offspring pairs can be used to model fecundity as long as the parameter appears explicitly in the model, which could be the case when fecundity varies with the size or age of animals (Bravington, Skaug, & Anderson, 2016, Section 3.1.4)We are unaware of any attempts to estimate time-varying fecundity or survival, and we believe this to be a potential direction for future research.Moreover, we assumed maturity to be knife-edge as it slightly reduces the complexity of the model.However, if maturity occurs more gradually, then this can be accommodated by adding a fecundity curve to the model (e.g., a logistic curve; Conn et al., 2020).We also imposed a fixed and known maximum age in the simulation, mostly to reduce computation time.In reality, animals do not always have a maximum age; in such cases, one could set the maximum age equal to an age that the animal has practically zero probability of reaching.Further, we made the assumption that sampling was random with respect to age, that is, that there is no selectivity, which does not necessarily need to be true in reality.When accounting for ageing error when there is selectivity, it will be necessary to include some function relating true age to observed age, which would depend on the probability of being sampled at a given true age.Finally, we have not considered fishinginduced mortality, as our case study concerned an area protected from fishing.This and other anthropomorphic sources of mortality should be accounted for whenever they are present, analogously to Bravington, Grewe, and Davies (2016).
When a promising method like CKMR is first presented, one can see the appeal to start studying populations as quickly as possible.Benchmark comparisons could be useful (e.g., Ruzzante et al., 2019) to compare a new method to some 'truth'.However, these comparisons can be ambiguous when it is unclear how accurate the benchmark truly is.Simulation studies, such as this one (and see Conn et al. (2020) for the effects of unmodelled spatial heterogeneity on CKMR), are a key part of understanding when the CKMR method works well and when it does not.We believe the CKMR method has great potential and, in some cases, is an improvement over other methods, but our study confirms that care that needs to be taken when ageing is biased.In such cases epigenetic ageing could be preferable, even though epigenetic ageing can still involve substantial uncertainty (e.g., Larison et 2021;Prado et al., 2021) and relies strongly on the quality of the training data (Mayne et al., 2023).

ACK N OWLED G EM ENTS
We would like to thank Dr Mark Bravington for his time and enthusiasm whenever we discussed the methodology and worked on deriving the correct probabilities over email and in-person.We would also like to thank Dr Paul Conn and an anonymous reviewer for their thorough and thought-provoking feedback.This is contribution #1709 from the Institute of Environment at Florida International University.

CO N FLI C T O F I NTER E S T S TATEM ENT
The authors declare no conflicts of interest.

O PEN R E S E A RCH BA D G E S
This article has earned Open Data and Open Materials badges.

TA B L E A 1
Performance metrics for the estimation of parameter N ♀ , extracted from 1000 simulations of the simple shark population.

TA B L E B 1
Model-based estimates for the standard error of adult abundance estimates, averaged over 1000 simulations ( N s ), the empirical standard errors of adult abundance estimates, derived from 1000 simulations ( N s ), the difference between the two (Δ N s ), and the difference relative to the empirical standard error (Δ N s ∕ N s ; %), for both sexes of the simple species.Note: The 95% CIs were estimated for every simulation using the model-based standard error assuming a normal distribution on the link scale, which results in a log-normal distribution on the real scale.

MODELLING POPULATION GROWTH
In an alternative version of this model we included a growth parameter r, which allowed for the estimation of exponential growth or de- however, this was beyond the scope of our research.We could have assumed that the population had settled into a new equilibrium, in which case the age distribution f(a) would be proportional to the dominant eigenvector from the associated Leslie matrix.However, as our population growth/decline was stochastic rather than systematic, this did not seem appropriate (see the Discussion for more detail).We did run our simulation study with yearly growth parameters r and r assuming that the f a| * as presented in the main text was approximately correct, that is, ignoring population growth in the age distribution formulation.We decided not to include this part of the study in the main body of this manuscript, as we did not believe that the results could be used to accurately assess the effect of incorrect ageing on parameter estimation.Nonetheless, we included these results here for completeness as they could contain some valuable insight and form the basis for future research.As we considered abundance for both sexes separately, we estimated the following four parameters: N ,t 0 , N ,t 0 , r and r, where t 0 is some reference year.The kinship probabilities remained the same as presented in the Equations (4)-( 7).As population size was no longer assumed equal for all years, abundances in different years are linked through a geometric population dynamics model: where r ∈ (0, ∞ ) denotes the yearly growth rate.We set t 0 = 2014 to match Bradley, Conklin, Papastamatiou, McCauley, Pollock, Kendall, et al. (2017) as closely as possible.
Estimated abundance through time.
We fit our 25 scenarios, consisting of all combinations of 5 different measurement errors and 5 different growth curves, to both populations.We modelled the male and female side of the population separately resulting in four figures, each containing 19 population history plots-six plots are blank since the models in these scenarios (C1) N t = N t 0 r t−t 0 , F I G U R E C 1 Plots of the 1000 simple female adult population trends for the simple species for the last 20 years of the simulation, for the 19 scenarios that resulted in successful fits.The median of these 1000 trends is indicated in dark grey, and the truth adult abundance is indicated in red.The scenarios were labelled using the format 'ME±XX:GC ± YY', where ME refers to the measurement error, XX denotes the percentage over-or underestimate, GC stands for growth curve, and YY denotes the percentage of up-or downwards shifting; for example, the scenario with a 33% overestimated standard deviation of length measurement error and a 5% downshifted growth curve had label ME+33:GC-5.
did not (all) fit correctly.In Figures C1-C4 we notice a similar pattern of over-and underestimation related to shifting the growth curves.However, as we also model exponential growth or decline, we also notice effects of shifting the growth curves on the direction and magnitude of this trend.Albeit potentially informative, due to the inconsistency between modelling growth and the assumed age distribution we believe that these results cannot be directly used for inference.
F I G U R E C 2 Plots of the 1000 estimated male adult population trends for the simple species for the last 20 years of the simulation, for the 19 scenarios that resulted in successful fits.The median of these 1000 trends is indicated in dark grey, and the truth adult abundance is indicated in red.The scenarios were labelled using the format 'ME±XX:GC ± YY', where ME refers to the measurement error, XX denotes the percentage over-or underestimate, GC stands for growth curve, and YY denotes the percentage of up-or downwards shifting; for example, the scenario with a 33% overestimated standard deviation of length measurement error and a 5% downshifted growth curve had label ME+33:GC-5.
F I G U R E C 3 Plots of the 1000 estimated female adult population trends for the complex species for the last 20 years of the simulation, for the 19 scenarios that resulted in successful fits.The median of these 1000 trends is indicated in dark grey, and the truth adult abundance is indicated in red.The scenarios were labelled using the format 'ME±XX:GC ± YY', where ME refers to the measurement error, XX denotes the percentage over-or underestimate, GC stands for growth curve, and YY denotes the percentage of up-or downwards shifting; for example, the scenario with a 33% overestimated standard deviation of length measurement error and a 5% downshifted growth curve had label ME+33:GC-5.

F I G U R E C 4
Plots of the 1000 estimated male adult population trends for the complex species for the last 20 years of the simulation, for the 19 scenarios that resulted in successful fits.The median of these 1000 trends is indicated in dark grey, and the truth adult abundance is indicated in red.The scenarios were labelled using the format 'ME±XX:GC ± YY', where ME refers to the measurement error, XX denotes the percentage over-or underestimate, GC stands for growth curve, and YY denotes the percentage of up-or downwards shifting; for example, the scenario with a 33% overestimated standard deviation of length measurement error and a 5% downshifted growth curve had label ME+33:GC-5.
Summary of notation.
collected in the last two years of the simulation by maximising the pseudo-log-likelihood, which can involve prohibitively long computation time.To resolve this, we restricted the number of pairwise comparisons.Many pairwise comparisons resulted in identical probabilistic statements, and thus in practice only needed to be derived once.As we considered adult abundance for both sexes separately, we estimated two parameters: N ♀ and N ♂ .All other parameters, such as , were assumed known and fixed.To each of the 2000 population realisations (1000 for each species) we fitted the appropriate POP model with varying degrees of length measurement error and growth curves, which was achieved by altering some of the fixed parameters.Specifically, we assumed five different standard deviations for length measurement error: the correct one, a 33% and 67% NTR I B UTI O N S Felix T. Petersma: Conceptualization (lead); formal analysis (lead); methodology (lead); visualization (lead); writing -original draft (lead); writing -review and editing (equal).Len Thomas: Conceptualization (supporting); formal analysis (supporting); methodology (supporting); visualization (supporting); writing -original draft (supporting); writing -review and editing (equal).Danielle Harris: Conceptualization (supporting); formal analysis (supporting); methodology (supporting); visualization (supporting); writing -original draft (supporting); writing -review and editing (equal).Darcy Bradley: Conceptualization (supporting); formal analysis (supporting); methodology (supporting); writing -review and editing (equal).Yannis P. Papastamatiou: Conceptualization (supporting); formal analysis (supporting); methodology (supporting); writing -review and editing (equal).
The 95% confidence interval (CI) coverage for the sex-specific adult abundance for the successfully converging scenarios, denoted in percentages (%).
cline in the population size.Including growth in the model meant that the distribution of age given measured length f a| * is no longer de- * by grouping the data and fitting a multinomial distribution;