Can threatened species adapt in a restored habitat? No expected evolutionary response in lay date for the New Zealand hihi

Abstract Many bird species have been observed shifting their laying date to earlier in the year in response to climate change. However, the vast majority of these studies were performed on non‐threatened species, less impacted by reduced genetic diversity (which is expected to limit evolutionary response) as a consequence of genetic bottlenecks, drift and population isolation. Here, we study the relationship between lay date and fitness, as well as its genetic basis, to understand the evolutionary constraints on phenology faced by threatened species using a recently reintroduced population of the endangered New Zealand passerine, the hihi (Notiomystis cincta). A large discrepancy between the optimal laying date and the mode of laying date creates a strong selection differential of −11.24. The impact of this discrepancy on fitness is principally mediated through survival of offspring from hatchling to fledgling. This discrepancy does not seem to arise from a difference in female quality or a trade‐off with lifetime breeding success. We find that start of breeding season depends on female age and average temperature prior to the breeding season. Laying date is not found to be significantly heritable. Overall, our research suggests that this discrepancy is a burden on hihi fitness, which will not be resolved through evolution or phenotypic plasticity. More generally, these results show that threatened species introduced to restored habitats might lack adaptive potential and plasticity to adjust their phenology to their new environment. This constraint is also likely to limit their ability to face future challenges, including climate change.

Because of their strong links to fitness, phenological traits are one of the possible means for species to adapt to climate change and therefore are of considerable importance for conservation biology (Bellard, Bertelsmeier, Leadley, Thuiller, & Courchamp, 2012).
In birds, the laying date of many species has been shown to be getting earlier in recent decades Crick, Dudley, Glue, & Thomson, 1997;Parmesan & Yohe, 2003), most likely in response to environmental variation Visser, Both, & Lambrechts, 2004). Phenological response to climate change is expected to be driven by two mechanisms: (a) phenological traits can be plastic and change at a rapid pace as climatic conditions are changing or (b) they can be under strong selection, due to their influence on fitness in the context of climatic variation, resulting in evolutionary change over time if the trait is heritable. Deciphering the relative importance of both mechanisms is a difficult problem as it requires the genetic basis of phenological traits to be determined (Gienapp, Teplitsky, Alho, Mills, & Merilä, 2008). Although laying date has been found to be heritable and under selection in a few species (Gienapp, Postma, & Visser, 2006;Merilä & Sheldon, 2000;Sheldon, Kruuk, & Merilä, 2003), the expected response to selection may be too small to be detected (Gienapp et al., 2006), or laying date shifts may be constrained by trade-offs with other traits (Brown & Brown, 1999). As a result and given other empirical results showing that laying date seems to largely plastic (Charmantier et al., 2008;Gienapp et al., 2008), the hypothesis of phenotypic plasticity is often favoured.
Phenotypic plasticity is however not always adaptive (Ghalambor, McKay, Carroll, & Reznick, 2007), and the ability of bird species to efficiently adjust their phenology to climatic change through a plastic response will essentially depend on the environmental cue used (Chevin & Lande, 2015;Visser et al., 2004) and how well it predicts the optimal phenological timing. Indeed, depending on the relevance of the signal used to plastically alter the laying date, the mechanism could (a) help better track the optimum of environmental resources, (b) be totally independent of environmental resources or (c) even trigger a response in the wrong direction. For example, change in photoperiod is a possible cue for the beginning of spring but, being independent from climate, it is not expected to trigger adaptive phenotypic plasticity to an earlier spring onset as a result of climate change. One of the most efficient environmental cues related to climate change is temperature, which seems to be the cue used by the most studied passerine species to determine lay date (Caro, Schaper, Hut, Ball, & Visser, 2013;Crick et al., 1997;Phillimore, Leech, Pearce-Higgins, & Hadfield, 2016).
Most of the evolutionary research presented above has however been performed on non-threatened, mostly European, birds (but see Teplitsky, Mills, Yarrall, & Merilä, 2010, for a study on a nonthreatened New Zealand endemic bird). As a consequence, little is known about the potential for an evolutionary or plastic response of threatened birds, or more generally for any endangered species, typically existing in degraded habitat or in small and isolated populations with potentially limited genetic diversity.
The hihi (stitchbird, Notiomystis cincta) is a New Zealand endemic and threatened passerine bird. Once spread over most of the North Island, hihi are now naturally occurring only on Hauturu-o-Toi (Little Barrier Island; 36°12′S, 175°05′E) and in six reintroduced populations spread across their former range. Natural dispersal is not possible for the hihi between these seven populations (due to long distances between them and the inability of the hihi to survive for a long enough period outside pest-free sanctuaries), and no occurrence of natural re-colonization has been observed since the hihi population collapse across the North Island in the late 1800s.
As a result, a response of the species to climate change will either depend on adaptation to these changes (through evolutionary processes or phenotypic plasticity), human intervention such as translocation to more suitable habitats (Chauvenet, Ewen, Armstrong, & Pettorelli, 2013) or, more generally, appropriate changes in management strategies. Additionally, some of the environments within which hihi were reintroduced are currently recovering from heavy degradation (e.g., almost complete land clearing), by contrast with the only native population consisting predominantly of mature forest (Makan, Castro, Robertson, Joy, & Low, 2014). This means that introduced populations are not expected to be at evolutionary equilibrium with these growing, immature forests, possibly resulting in a strong mismatch between their phenology and resource availability (Gienapp et al., 2008), reducing population fitness. If this is the case, the currently increasing additional pressure from climate change might pose a threat to their conservation in these environments.
In this study, we investigate the evolutionary aspects of laying date in the hihi using a particularly extensive data set from a threatened bird species, which includes data on nest lay date, fledgling recruitment, survival and individual fitness. We assess in particular whether the heritability, selection and plasticity typical of laying date described previously in non-threatened bird species apply to this threatened species. We use long-term data on a pedigreed wild population of hihi to explore variation in laying date, its genetic basis and the selective pressure upon this variation, in order to determine whether (a) lay date is under stabilizing selection with an optimum of fitness, and if so, whether the observed lay date matches this fitness optimum, and (b) if a discrepancy with this optimum occurs, whether this trait has enough adaptive potential (i.e., trait heritability) to evolve in response to selection.

| Data sampling
Here, we focus on a restored mammalian pest-free island, Tiritiri Matangi (36°36′S, 174°53′E), in which individuals were released from the island Hauturu-o-Toi in August 1995, August 1996 and March 2010. The population has been closely monitored since 1995 with all birds individually identifiable with leg bands (almost exclusively applied as nestlings) and with most nesting attempts known.
No natural immigration to or emigration from the island has been observed. Hihi feed on a mix of fruits, nectar and small invertebrates (Castro, Minot, & Alley, 1994), but are also provided with supplementary food (20% by mass sugar water). The population grew since its establishment in 1995 to an artificially managed carrying capacity (ca. 150) reached in 2005-2006, due to semi-regular harvests (every ~2-3 years) of fledglings to source individuals for new translocation events . During harvests, at most 20% of fledglings are taken at random from the population, with translocations taking place between March and May, the austral autumn. The majority of juvenile mortality occurs in the first 8 weeks after fledging (Low & Pärt, 2009), so most individuals harvested are likely to be fit enough to recruit and by this point most selection would have already taken place .
Hihi usually reproduce in their first year, during the austral spring and summer (September to February, Castro et al., 1994). Females lay clutches ranging from three to five eggs, at 25-hr intervals and can produce multiple clutches within a season although normally only one or two are successful. Despite males providing around 30% of the care during rearing (Ewen & Armstrong, 2000;Low, Joy, & Makan, 2006), extrapair paternity in this species is widespread. Around 60% of chicks within a brood are sired by extrapair males (Brekke, Cassey, Ariani, & Ewen, 2013). Hihi usually lay eggs in natural tree cavities, but given that the forest on Tiritiri Matangi is immature, such cavities are scarce. Instead, nest boxes are provided for hihi across the island, within which the vast majority of breeding events occur. As nest boxes and food are provided ad libitum by management, and the population is subject to periodic removal of fledglings for translocation to other populations, there is very little evidence of density-dependent survival in the population . During each nesting attempt, we record (a) the identity of the (social) sire and the dam; (b) lay, hatch and fledge dates; and (c) the corresponding numbers of eggs/chicks at each stage. Surviving fledglings are measured, banded and blood-sampled. The intense monitoring (since 1995), combined with a microsatellite genotyping effort started in 2004 (Brekke, Dawson, Horsburgh, & Ewen, 2009), allowed us to reconstruct a long-term pedigree of the Tiritiri Matangi population, while accounting for extrapair paternity (Brekke et al., 2012).
We used temperature data from 1995 to 2010, downloaded from the New Zealand National Climate Database (https://cliflo. niwa.co.nz/), to assess the relationship between laying date and temperature. The years cover a period over which a weather station from New Zealand's National Climate was present on the island. As a proxy for the yearly environmental cue, we used the average maximal temperature for 50 days prior to the grand mean of the start of breeding season (see below) across years, as sliding window approaches often identify this as the most predictive period (McLean, Lawson, Leech, & van de Pol, 2016).

| Phenotypic information
This study is based on the breeding seasons spanning from 1997/1998 to 2013/2014. Lay date was derived directly from the data, except for the first season 1997/1998 where the data were missing. Instead, it was computed as being 17 days before the recorded hatch date (regression of laying date on hatch date, N obs = 1,207, a = −17.07 days, b = 1.0, R 2 = 0.998). Dates require a point of reference, so for the analyses of this article, the laying date was defined as the number of days since the 1st of September of the year corresponding to the breeding season. Attempts at breeding by females were numbered with an increasing clutch number, independently of the success of the clutches. In the following, "start of breeding season" corresponds to the laying date of the first of these successive clutches or of the sole clutch if only one has been recorded. The fitness of the breeding female was computed as the number of offspring recruited as breeders in the following generations (using both the social and genetic pedigrees, see below). The age of females was taken as the number of years between the focal season and the season of the birth of the female. Because hihi have a limited growth after fledging (Low, 2006), we used the tarsus length measured when individuals are banded (as 21-day-old nestlings) as a measure of the individual adult size. The total data set consisted of 1,369 breeding events (855 whole-season events) for 330 females, spanning 16 years.

| Pedigree reconstruction
The social pedigree was reconstructed based on the colour band information of the sire and dam observed at each nest box. Due the high level of extrapair paternity, we used a panel of microsatellite markers (Brekke et al., 2009) and the COLONY software (Wang, 2004(Wang, , 2012Wang & Santure, 2009)  were extracted using the ammonium acetate precipitation method.
All samples were genotyped at 18 microsatellite loci, 15 were species-specific and three were designed for other passerine species (see Brekke et al., 2009; for extraction and genotyping details). Sex was identified using two fluorescently labelled primers (Z002A and Z037B; Dawson et al., 2007) and where possible in combination with adult plumage morphology. For COLONY, in brief, all behavioural maternities of clutches were assumed correct and specified as such.
For clutches where behavioural observation of maternity was not available, only maternal sibship was specified but not maternal ID.
Candidate fathers included any known male alive in the pre-breeding September census and post-breeding February census (not born that season) and also all identified territorial males. The probability of the true parents being in the candidate lists was set at 0.90 for females and 0.80 for males as a number of males do not hold territories (≃30%) and may still gain extrapair copulations (Brekke, Ewen, Clucas, & Santure, 2015). These probabilities reflect a very high probability of recapture of the individuals on the island (Chauvenet et al., 2013). Both sexes were defined as polygamous and allele frequencies and genotyping error rates were provided as input (between 0 and 0.012 depending on the locus, estimated from repeat genotyping of ≃10% of samples). Because blood sampling was only initiated in the 2003/2004 breeding season, information relating to the genetic sire of individuals born previously is missing. For these individuals, we considered the information as missing, rather than using the social sire (hereafter termed the "full pedigree"). The maximal depth of this pedigree was 13 generations with an average of 6.69. We also constructed a pedigree restricted to the years where the genetic information is available (from 2003 to 2014, termed the "subset pedigree"). This pedigree had a maximum depth of 10 with an average of 4.2 generations. We computed individual inbreeding coefficients using the whole pedigree record, but only for individuals with known parents and grandparents. Inbreeding coefficients for individuals with at least one unknown parent or grandparent were considered as missing values.

| Start of breeding season, reclutching and female survival
To study the influence of female quality on variation in the start of the breeding season, probability of reclutching and female survival, we used three different proxies for female quality: age, size and inbreeding.
First, to study the variation in the start of breeding season, we used a mixed modelling approach where age, size and inbreeding were fitted as fixed effects and the identity of the female and year were fitted as random effects, to model random between-individual and between-year variation. Because the relationship with age seemed nonlinear, we evaluated the fit of a linear model, a quadratic model, a broken lines model with a break at age 2, a broken lines model with breaks at ages 2 and 6 and a broken lines model with a different slope for each transition in age. Once the best model was determined, we tested the continuous effect of time (year, standardized, i.e., mean-centred and scaled to a variance of 1) to detect variation in the start of breeding season over the study period. This effect was tested with and without year as a random effect (to remove covariation when year is fitted as both fixed and random and therefore increase power to detect a linear trend).
Second, we studied the probability of reclutching (having more than one clutch) by fitting a binomial model using start of breeding season and female quality proxies (age, size and inbreeding) as fixed effects with female identity and year as random effects. For the effect of age, we tested a linear effect, a difference between 1-year-old and older females and a difference between 1-year-old, middle-aged (from 2 to 6 years) and older females (above 6 years).
Third, we studied whether female survival to the subsequent year depended on when she started breeding, the number of clutches per year and her quality in binomial mixed models with year and female identity as random effects.
Fourth, we studied the existence of phenotypic plasticity with temperature, using the average temperature cue described above.
To do so, we fitted an individual model of the start of breeding season using the best model from the first analysis above and tested whether the inclusion of the temperature cue as a new variable was significant in the model.
To help convergence of the algorithms, continuous variables (except inbreeding) were standardized, that is, mean-centred and scaled to a variance of 1. The models were fitted in R (R Core Team, 2017) with the lme4 package (Bates, Mächler, Bolker, & Walker, 2015) and were compared using Akaike's information criterion (AIC, Akaike, 1981;Burnham & Anderson, 2002). The AIC was computed using maximum likelihood while the estimates provided throughout the article were computed using restricted maximum likelihood. The significance of variables for the model was tested by comparing AIC as follows: Variables were evaluated separately against a null model.
Variables that were not significant on their own were discarded.
Variables that were significant on their own were all included in a full model and compared to models with each variable dropped one by one. The best model was chosen, and variables were tested (dropped) again until a stable state was reached. The best inferential model was considered as the most parsimonious one with a contrast to the best predictive model ΔAIC < 2. Within the best model, the significance of each parameter (departure from zero) was tested using the lmerTest R package.
Because including inbreeding was generating a predictor with a lot of missing values (hence a greatly reduced subset of the data), we tested this variable separately by comparing a null model, a model with inbreeding and the best model with or without inbreeding.

| Heritability of laying date
To estimate the heritability of female laying date, we used the R package MCMCglmm (Hadfield, 2010). We conducted the analysis either using all data available (i.e., full pedigree) or restricting to years where molecular data were available to reconstruct the pedigree (i.e., subset pedigree). Although the distribution of laying date is skewed, it was analysed as a Gaussian trait as using the clutch number within a year as a fixed effect was an efficient way to account for the skewness. In addition to the additive genetic effect, the female identity (permanent environment effect), the social male identity and the year were fitted as random effects. The phenotypic variance was computed as the sum of all random effect variances, the residual variance and the variance arising from fixed effects, following de Villemereuil, Morrissey, Nakagawa, and Schielzeth (2018). The prior for these random effect variances used the parameter extension implemented in MCMCglmm with parameters V = 1, nu = 1, alpha·mu=0 and alpha·V = 1,000.
The prior parameters for the residual variance were set to V = 1 and nu = 0.02. Clutch number was included as a fixed effect, and its significance was tested using the pMCMC value yielded by MCMCglmm. The models were run for 500,000 iterations with a thinning interval of 10, after a burn-in of 3,000. These parameters were chosen to ensure a MCMC effective sample size above 8,000 for all parameters. Convergence for all parameters was checked graphically and by using the Heidelberger and Welch (1981) test as implemented in the coda R package (Plummer, Best, Cowles, & Vines, 2006). The heritability of laying date was computed as the ratio between the additive genetic variance and the sum of all variance components in the model excluding the between-year variance (i.e., phenotypic variance within years).

| Power analysis
To evaluate the capacity of our data to estimate low levels of heritability, we performed a power analysis. We used our exact data structure (pedigree, number of individuals and structure of multiple measurements), but simulated a new phenotypic trait. We simulated breeding values according to our pedigree using the MCMCglmm rbv() function, as well as all the other random effects fitted in the above model (permanent environment, mate and year effects, all with the same variance). Variance components were set so that the total variance was comparable to our laying date data set and the resulting expected heritability would be 0.1, as estimates below this threshold would be typically considered as small. As a comparison, using the meta-analysis data set from Mittell, Nakagawa, and Hadfield (2015), we found that heritabilities reported for passerine laying date typically range from 0.09 to 0.265 with an average of 0.14 (n = 8). We replicated this simulation 100 times and analysed each simulated data set using MCMCglmm to estimate heritability.
We computed the posterior mean, median and 95% credible interval for each replicate.

| Optimum inference
For six fitness-related traits ((a) fitness as defined previously as the number of offspring recruited as breeders in the following generations, (b) the number of eggs laid, (c-e) survival through the three different juvenile stages egg-hatchling, hatchling-fledgling, fledgling-recruit, and (f) survival from egg-recruit), the presence of an optimum was first tested using a generalized linear mixed model (GLMM) including a first-and second-order effect of laying date (as well as various other confounding effects such as the size, inbreeding and age of the female and the clutch number of the season). For each fitness-related trait, we compared the model including the second-order effect with a model without using AIC: If the difference in AIC was larger than 2, indicating support for an optimal value, we proceeded by using the following model to infer the value of the optimum. The main reason for using the model below rather the GLMM is that the optimum, which is the parameter of interest here, is a compound function of the first-and second-order parameters of the GLMM. This makes the computation of uncertainty measures (such as confidence/credible intervals) and the inclusion of a year-to-year variation of the optimum complex. In order to infer the optimum of these fitness-related traits depending on lay date, we considered a model (akin to the model described in equation 1 of Chevin, Visser, & Tufto, 2015) where the latent response Z was evaluated as a Gaussian curve depending on the (mean-centred and scaled to a variance of 1 across all years) laying date x i (here thus a covariate of the model) for each individual breeding record i during year j(i), the optimal date λ j(i) depending on the year j(i), the dispersion coefficient around the optimum σ and a scaling factor A.
Note that this estimates one optimum per year j. The optimal date was modelled as depending on the year as a random effect: where μ is an across-year intercept, u j is a year-dependent random effect with variance 2 U and  [−2,2] is a normal distribution truncated between [−2, 2]. The realized response Y (i.e., the observed data) is then modelled according to either a Poisson (for fitness) or binomial (for survival) error distribution, hereafter denoted as : Prior distributions for the total model were as follow: where  stands for a uniform distribution and Γ stands for the Gamma distribution. The upper bound of the prior for the dispersion parameter σ was set to a high value so that the model was able to yield a flat line, in case the most likely model is one without an optimum. The upper bound of the scale parameter A, here denoted A max , was equal to 5 for fitness and 1 for survival. The relatively narrow distribution for λ j = μ + u j was chosen to ensure that the fitted optimum belongs to the realm of possible laying date with boundaries −2 and 2 roughly corresponding to the 2% and 98% quantiles of the scaled laying date variable.
For each fitness-related trait, the model above was implemented in JAGS (Plummer, 2003). The total model was run in eight chains for 50,000 iterations with a thinning interval of 10 after a burn-in of 3,000. These parameters were chosen to ensure a total effective sample size of the MCMC above 10,000. The convergence was checked by comparing the eight chains using the potential scale reduction factor of Gelman and Rubin (1992) as implemented in the coda R package (Plummer et al., 2006). All parameters had a factor equal to 1, meaning that the same convergent state was reached by the different chains. As lifetime reproductive success tend to be zero-inflated, we performed posterior predictive checks (Gelman, Meng, & Stern, 1996;Rubin, 1984): We simulated new data according to the model and posterior distributions and compared their distribution to the distribution of our data. A model including an individual random effect (based on female ID) was considered but is not included here as it took longer to run with no substantial difference in the results. Finally, we fitted a model using information regarding the temperature cue for each year (termed θ j ), to check whether this (2) would improve the fit of the model and hence the inference of the optima. This model is the same above, with an additional slope B between the optimum and temperature: The prior of the slope was defined as a vague normal distribution with mean 0 and variance 10 6 . The significance of the slope was tested as twice the proportion of iterations of a different sign than the posterior median.
Optima were compared to the mode (computed as the optimum of the density distribution of laying date), rather than the mean, because the distribution of laying date is skewed towards later dates which would influence the mean but not the mode. We do not comment on year-to-year estimated variation in optima, as low sample sizes led to estimates that appeared too unreliable for some years to be able to robustly infer a temporal trend. The selection differential of laying date was computed as the covariance between this trait and relative fitness (Robertson, 1966(Robertson, , 1968, and standardized selection gradients were computed using Lande and Arnold (1983)'s framework. Standard errors were obtained from a non-parametric bootstrap for the selection differential and from the standard errors of the linear model estimates for the gradients.

| Survival analysis and laying date
We studied the survival between different stages of development (egg, hatchling, fledgling, recruit; with recruit being defined as breeding for at least 1 year) according to laying date (as a quadratic effect), clutch number and female quality (age, size and/or inbreeding). To keep the number of models tested low, only one model for age was tested: young (1-year-old)/middle (2-to 6-year-old)/old (>6-year-old).
This was done using a binomial mixed model with year and female identity as random effects. As above, the models were fitted in R with the lme4 package and compared using AIC using the procedure described previously. To improve model fit, laying date and size were standardized (mean-centred and scaled to a variance of 1). When a quadratic effect of laying date was significant, we also used the optimum model described above to estimate the optimum of survival according to laying date.

| Start of breeding season
The start of breeding season depended on both age and size of the female. The best model to predict the influence of the age of female on the start of breeding season was the broken lines model with breaks at age 1 and 6 and size as covariates (see Supporting Information Table S1). The predicted trend in this best model (red line in Figure 1) shows that females in their first year lay eggs later (effect ± SE = 12.5 ± 1.38 days, t 506 = 9.06, p < 10 −15 ). This is also the case for old females (i.e., of age over 6) with a significantly positive slope (slope ± SE = 1.33 ± 0.426 days/year, t 486 = 3.13, p = 0.00184, see also Figure 1). Between the ages of 2 and 6, however, the start of breeding season is earlier and does not significantly depend on age (slope ± SE = 0.228 ± 0.476 days/ year, t 639 = 0.480, p = 0.632, see also Figure 1). The start of breeding season was further negatively dependent on the tarsus size of the female (slope ± SE = −1.70 ± 0.749 days/mm, t 245 = −2.28, p = 0.0237), although it had a relatively small effect in relation to female age. Inbreeding was not significant (see Supporting   Information Table S1).
There was weak evidence for a temporal trend in the start of breeding season. Using the best model above and including years as F I G U R E 1 Violin plot of the start of breeding season according to the age of the female. The grey-filled area depicts the density of start date values conditional on age (i.e., width is not comparable between different ages, to improve readability). The solid red line is the prediction from the best model (i.e., the broken lines model with breaks at ages 2 and 6). The dotted parts are the discontinuities in the model Figure 2), but these results show that should it be changing over time, it would be towards later, rather than earlier dates.
In contrast, the relationship between start of breeding season and our temperature cue was well supported. Adding the temperature variable to our best model of start of breeding season resulted in a significantly better fit (AIC = 708.54, compared to AIC = 724.61 without the temperature cue). Lower temperatures 50 days prior to the average start of breeding season lead to a delayed start of breeding (slope ± SE = −18.28 ± 3.24 days/°C, t 14.1 = −5.65, p = 5.86 × 10 −5 , see Figure 3). No linear trend (e.g., average increase in temperature over time) could be detected for the temperature cue over the years (slope ± SE = −0.036 ± 0.031°C/year, t 12 = 1.41, p = 0.275).

| Probability of reclutch
Older and earlier breeding females tended to reclutch more often than younger and later breeding females. The best model for the probability of reclutch included start of breeding season and firstyear/older females effects (see Supporting Information Table   S2). First-year females tended to reclutch less than older females (effect ± SE = 1.10 ± 0.220, z = 4.97, p = 6.69 × 10 −7 ) and the effect of start of breeding season was negative (slope ± SE = −1.53 ± 0.16 5 day −1 , z = −9.32, p < 10 −15 , see also Figure 4). Size and inbreeding did not significantly influence the probability of laying more than one nest (see Supporting Information Table S2)

| Female survival to the next season
Surival was associated with a larger number of clutches, but not with the start of breeding season. The best model for female survival to the next season included age as a continuous effect and the number of clutches during the breeding season (see Supporting Information Table S3 and Figure S1a). Thus survival was not significantly associated with the start of breeding season (see Supporting Information F I G U R E 2 Violin plot of the start of breeding season according to years for each age class (young: 1-year-old, middle: between ages 2 and 6, old: older than 6). The grey-filled area depicts the density of start date values for each year (note that width is not comparable between different years). The solid red line is the prediction from the best model (i.e., the broken lines model with breaks at ages 2 and 6 with tarsus size as a covariate, not shown here). The blue dashed line is a LOESS estimate (from the ggplot2 R package, Wickham, 2009)

| Heritability of laying date
The estimated heritabilities of laying date were very low, with a lower bound of the credible interval near zero (Table 1), despite statistical support away from the prior, shown by the agreement between the posterior mode and median. This low heritability resulted from a low additive genetic variance, also with a lower bound of the credible interval near zero. The variance estimates for laying date were comparable between the two analyses using all years available (full pedigree) or only years with genotypic information (subset pedigree, see Table 1). The heritability computed from the mother-daughter regression was also very low (Table 1) and not significant (t 241 = 0.053, p = 0.958). The variance of the permanent environment effect (V PE ) was estimated as being larger than the additive genetic variance (V A ), but with a similar order of magnitude. In combination, these two effects lead to a small repeatability (although supported as being away from zero in the model) of the laying date (r 2 in Table 1). Our power analysis, which simulated a heritability of 0.1 (Supporting Information Figure S2) shows that the pedigree had sufficient power to detect a moderate heritability, and further that the probability of obtaining a heritability estimate as low as or lower than ours is minimal, with 75% (94%) of the replicates with posterior mode (median) higher than those calculated from our true data set. Combining this power analysis with our upper credible interval bound, it is likely that the true heritability of laying date of the hihi is lower than 0.1. Clutch number was a significant effect in our models (pMCMC < 10 −5 ).  Figure S3). When using only birds of good "quality" a priori best able to target the optimum using two different criteria: (a) females from ages 2 to 6, (see

| Optimum of initial investment
The significance of an optimum of the initial investment (number of eggs laid) according to laying date was confirmed using a generalized linear mixed model with a Poisson distribution and year as a random effect (AIC laying date = 4,594.6, AIC null = 4,621.4).
Compared to the optimum of fitness, the optimum in initial F I G U R E 3 Average start of breeding season (over all breeding females) each year against the temperature cue (average temperature 50 days prior to the grand mean of start of breeding season across the years). The blue line is only illustrative and is based on a simple linear modelling of the data points without other fixed and random effects, see main text for a more refined slope estimate breeding investment is inferred with more uncertainty, due to the effect of laying date being less strong ( Figure 6). The optimum date of laying is inferred as being October 22nd with a 95% credible interval between September 29th and November 8th. It is thus not significantly different from the mode of laying date (November 2nd in this subset of the data).
F I G U R E 4 Violin plot of "reclutching" (having at least one other clutch after the first clutch) for females against the start of breeding season for young females (first year) and older ones (second years and older). The red solid line is the predicted probability of reclutching

| Egg to hatchling
There was no significant optimum of survival from egg to hatchling according to laying date. The best model for this variable included age and clutch number as fixed effects (Supporting Information   Table S4). When compared to middle-aged females (of ages 2-6), the probability of hatching was significantly lower for older females   and October 24th. It is thus significantly different from the mode of laying date for this subset of the data (November 2nd). The probability of survival from hatchling to fledgling was 0.56 on average.

| Fledgling to recruit
The probability of survival and recruitment into the breeding population after fledgling did not significantly depend on any of the covariates tested in this study (see Supporting Information Table S6). It was low, at 0.22 on average.

| Egg to recruit
The compounded probability of surviving from the egg stage to recruitment had a clear optimum with the best model for this probabil- Total survival to breeding was overall very low with an average at 0.092.

| D ISCUSS I ON
Phenology is a key feature in adaptation to climate change for a broad spectrum of species (Parmesan & Yohe, 2003) and thus also an important aspect of the long-term conservation of threatened species (Rosemartin et al., 2014;Wadgymar, Cumming, & Weis, 2015).
To the best of our knowledge, this is the first study on a threatened species that has explored the relationship between a phenological trait, its genetic basis and fitness. Our results show that (a) there is an apparent discrepancy between the phenology of the hihi and its optimal value; (b) laying date is not significantly heritable, hence there is not sufficient genetic variation for the population to respond to natural selection; and (c), over the 16 years of this study, hihi phenology did not change towards earlier dates, although it is unclear whether it changed in the other direction or not. These results raise several questions and issues for both evolutionary and management perspectives, particularly because the population of Tiritiri Matangi continues to demonstrate strong population growth, despite a particularly large discrepancy between observed and optimal breeding times.

| Discrepancy between laying date and its optimal value
We observed a very large difference between the optimum and mode of laying date (with an optimum almost a month earlier than the mode), with relatively few individuals that are actually breeding during the optimal period (see Figure 5). This gap results in an overall large selection differential of −11.24, which is much stronger than selection differentials estimated in other passerines (Gienapp et al., 2008;Van Noordwijk, McCleery, & Perrins, 1995;Visser, van Noordwijk, Tinbergen, & Lessells, 1998). This indicates a strong maladaptive phenology of the hihi population in Tiritiri Matangi Island, most likely imposing a burden in terms of population fitness, raising concern over the conservation status of this population. We found that phenology depended on a temperature cue (or, rather, a proxy of it) based on the average temperature 50 days prior to average start of breeding season. Yet, we found no significant connection between this cue and the optimum of laying date. This could be explained by a lack of power in our analysis or (not exclusively) by a weak relationship between this temperature cue and the optimum in Tiritiri Matangi. If this is the case, it would suggest that plasticity in response to this temperature cue is not efficient and adaptive enough to place the population close to the optimum. Data on more years is necessary to confirm this result. This is important for the hihi in the context of climate change as a negative relationship between increase in temperature and breeding success has been found, and projected regional climate change scenarios result in an overall decrease in the carrying capacity of the Tiritiri Matangi population (Chauvenet et al., 2013). However, should the temperature increase at Tiritiri Matangi, the observed phenotypic plasticity may become adaptive, as it would trigger earlier lay dates, possibly resolving the discrepancy with the optimum of laying date at the same time. This scenario cannot be tested here as the temperature cue did not increase over time during our study period, a period of time which overlaps with a "hiatus" in global climate change (Pachauri et al., 2014).
Despite the observed discrepancy between observed and optimal breeding times, the population of Tiritiri Matangi continues to demonstrate strong population growth. One hypothesis to explain this result is that recruitment is density-dependent, with low reproductive output-due to the population breeding away from the optimum-compensated by high rates of fledgling recruitment in the absence of competition (Reed, Grøtan, Jenouvrier, Saether, & Visser, 2013). However, no density-dependent compensation has been observed in hihi . Other compensatory mechanisms may be at play to explain the strong population growth, or there may be little cost associated with the discrepancy observed (Dunn & Møller, 2014; for example due to food supply being available ad libitum during all of the breeding season). This suggests that if some of the costs (large or small) associated with the discrepancy in lay date and suboptimal habitat were not in place, population growth would have an even more positive lambda than currently observed .

| Can individual quality explain the observed delay?
The difference between optimal and observed lay date could be explained by the fact that a part of the population is not fit enough to actively track this optimum, hence are obligatory late breeders, resulting in the evolution of an advanced laying date (Price, Kirkpatrick, & Arnold, 1988). This has been repeatedly found to be the case for fledgling success among passerines, where competition for resources or limitations in individual quality may mean that less fit individuals are unable to lay at the optimum (Van Noordwijk et al., 1995;Verhulst & Nilsson, 2008;Verhulst, van Balen, & Tinbergen, 1995). There are a number of lines of evidence working against this hypothesis in the case of the hihi. We found that the main factor related to individual quality explaining the differences in start of breeding was age, that is middle-aged females tended to have the earliest start, as previously observed for many life-history traits (Brekke et al., 2013;Chauvenet et al., 2013;Low, Pärt, & Forslund, 2007). However, estimating the optimum from only middle-aged individuals returned essentially the same results as the inference using the general population, with an optimum still significantly earlier than the mode of laying date. Likewise, using only females surviving to the next year, as a proxy for highquality females, did not impact our results. The inability of even the seemingly fittest females to match the optimum suggests that direct competition between females is not the primary driver for the discrepancy in lay date. Differences in body condition may also contribute to individual differences in lay date. Unfortunately, female hihi body weight has not been consistently recorded over the period of study, with data only available for a small number of females (n = 36, Low, 2004Low, , 2006. Regardless, the low repeatability of laying date suggests that there are very few consistently highquality females in the population across years. Finally, there is clear decline in fitness for females that are too early compared to the optimum ( Figure 5), working against the hypothesis that "earliest is best." Nevertheless, only an experimental approach (generally not possible in threatened species) can definitely disentangle the relative contributions of female quality and the effect of phenology to the observed pattern of fitness (Verhulst & Nilsson, 2008).
Another explanation for the difference between the optimum and mode of laying date would be a trade-off between the reproductive output of a breeding event and survival to, or opportunity for, future breeding events. Again, this is unlikely to be the case in this system. Regarding opportunities for future breeding events within the same year, earliest breeding females were also the ones most likely to have a least one other clutch during the year. As for survival to future breeding events in consecutive years, the probability of female survival to consecutive years was not dependent on the phenology, and females that laid more clutches were more, not less, likely to survive (note that this is also the case when using survival according to surveys performed twice a year on the island, data not shown). Thus, early reproduction does not appear to come at a fitness cost for individual females.

| Life-history optimum and relationship to the ecology of the hihi
The fitness of a female's breeding event, as it was computed in this study, depended on two factors: the initial investment (number of eggs laid by the female) and the success of each individual investment (here separated into the survival at three developmental transitions from egg to hatchling, to fledgling, to recruit). The only significant impact of lay date on fitness that we found was during the transition from hatching to fledgling, with a relatively sharp and early optimum of laying date which increased survival. However, this relationship was strong enough to be a significant shaping factor of the overall probability of survival of young (i.e., we found a significant quadratic effect of laying date from egg to recruit). Before fledging, the parents are heavily dependent on the resources in their environment to provide for the nestlings. It has been shown, for example, that hihi adults drastically increase their consumption of invertebrates during this period (Castro et al., 1994), possibly to feed them to the juveniles (Rasch, 1985). Because these resources F I G U R E 7 Probability of survival between different development stages (egg, hatchling, fledgling, recruit). Circle sizes are proportional to the number of individuals sharing the same realized value (ratio, for a given female, of the number of offspring alive at the later stage to the number of offspring alive at the earlier stage) and laying date. The red curve is the fitted model, vertical red line is the optimum and the light red area depicts the 95% credible interval of the optimum (figures with no red are those with no optimum inferred). The vertical solid blue line is the mode of laying date for the particular subset of the data corresponding to the graph (invertebrates, but also fruit and flowers) will fluctuate throughout the breeding season, a reasonable explanation for our results is that laying eggs around mid-October might coincide with a peak in specific resources for provisioning nestlings, around early November.
However, in the context of this study, it is difficult to pinpoint the exact nature of these resources.

| Lack of adaptive potential and phenotypic plasticity
The discrepancy between the optimum and the mode of laying date raises the question of the real ability of the population to face this Matangi is in many respects more typical of the current state of New Zealand wild forests than Hauturu-o-Toi, especially in terms of forest maturity and complexity, it appears likely that in the majority of translocated hihi populations there is similarly no "exposed" additive genetic variation for selection to act on. Therefore, the majority of hihi populations appear likely to share low effective heritabilities for lay date and will be limited in their evolutionary response to selection. This effect would also be reflected in most endemic and endangered birds in New Zealand, who share a similar history of decline and reintroduction.
The lay date appeared to be responding to the temperature cue we analysed in this study, demonstrating the plasticity of phenology in the hihi. This is also confirmed by considering the Zealandia and climate is predicted to also change, which will likely result in a modification of the cues, the optimum and the mode of laying date in the future. Whether these changes over time will resolve or increase the discrepancy with the optimum is hence unknown.

| Conservation implications and future management
The hihi conservation programme, like many other New Zealand conservation programmes, is limited by sites that can deal with the main threats to its survival-mammalian predators and loss of pristine habitat. Mammalian predator control has been achieved with the use of island sites or large-scale fencing of on-shore sanctuaries.
However, finding large enough forested areas that contain restored habitat (as most pristine, mature forests have been cleared) but could sustain a hihi population with ongoing management remains a challenge. Despite this, the hihi programme has been successful at establishing new, growing populations at a range of sites on offshore and mainland islands alike. But the threat of climate change remains a constant, and populations at these new sites limited to suboptimal, immature habitat are likely to be more susceptible to its effects, as they are already at a disadvantage from displaying a discrepancy with current environmental conditions, potentially limiting their long-term viability. The lack of adaptive potential and adaptive plasticity in this species may be one of the reasons it requires intense management to maintain the reintroduced populations. However, this level of intervention is not unusual in highly threatened species, and the established populations continue to grow and flourish in relatively varied habitats across the North Island. Our findings, along with previous research (Chauvenet et al., 2013), support the emphasis on assisted colonization to areas outside the hihi natural range, as climate changes and these areas potentially become more suitable over the coming decades. However, given that plasticity is not perfectly adaptive and given the challenges associated with identifying the environmental cues, it is difficult to make predictions about the suitability of these new sites to resolve the discrepancy between observed and optimal laying date.
As the number of threatened species increases because of anthropogenic action, so do the number and types of human intervention to prevent extinction. One of the most commonly used tools for the management of threatened species is reintroduction (Ewen, Armstrong, Parker, & Seddon, 2011), currently being used in hundreds of conservation programmes globally, across most taxa (Soorae, 2016). In the process of reintroduction, threatened populations undergo genetic bottlenecks, as a consequence of sampling a small number of individuals to found new populations. The repercussions of bottlenecks are well established: loss of genetic diversity and, consequently, loss of adaptive potential (Willi, Buskirk, & Hoffmann, 2006). But in the face of extinction, this trade-off may be the only one or one of very few alternative/s. Conservation programmes globally are also limited by large-scale habitat loss, leaving feasible sites for reintroduction at best modified but more commonly suboptimal or even nonexistent. Finding appropriate reintroduction sites therefore remains a huge management challenge, particularly as for most species there is no information on optimal habitat (Armstrong, Castro, & Griffiths, 2007). Here, we show that for threatened species with low genetic diversity in the face of climate change, surviving in suboptimal habitat can add another level of complexity. Populations in suitable (for conservation purpose, e.g., with a positive population growth, as is the studied population here), but suboptimal (from an evolutionary perspective), habitat are likely to be more susceptible to the effects of climate change as they are already at a disadvantage from displaying a discrepancy with current environmental conditions, potentially further limiting their long-term viability.
Our findings precisely quantify adaptive potential and plasticity in a trait vital to population fitness in a closed population of a threatened species. Ongoing debate has centred on the relative importance of these two processes to how species may overcome the effects of climate change. We are the first to show empirical evidence towards how an already threatened species may cope (or not), but more evidence is sorely needed on a larger number of traits, populations and species to enable us to assess more accurately its effects and test genetic management alternatives more widely.

ACK N OWLED G EM ENTS
We are indebted to all the volunteers, past students and We also thank Ally Phillimore and other anonymous reviewers whose comments greatly improved the quality of the paper.

CO N FLI C T O F I NTE R E S T
None declared.

AUTH O R S CO NTR I B UTI O N
AWS and PB conceived of the study. JGE and PB supervised and coordinated the collection of the data. PB developed the microsatellite data set, supervised the genotyping and performed the pedigree reconstruction. PdV designed and conducted the analysis of the data, with advice from all other authors. PdV wrote the paper, with input from all other authors.

DATA ACCE SS I B I LIT Y
The data, model and simulation code are available online on the Dryad database (https://doi.org/10.5061/dryad.0h2br0d).