Testing the effect of quantitative genetic inheritance in structured models on projections of population dynamics

Despite this, synchrony between great tits and their caterpillar prey was reduced and population declines occurred. Our approach demonstrates that this model framework provides a promising avenue through which to explore the roles of phenotypic plasticity and evolution in trait changes and population dynamics.

Global climate change is altering the timing of life history events for species living in seasonal environments. These shifts in phenology can lead to the disruption of interspecific relationships with implications for individual fitness. Predicting phenological change and its population level consequences can provide insights into population persistence. Achieving this is challenging for labile traits as current structured population models do not explicitly distinguish between the roles of phenotypic plasticity and micro-evolution, hindering realistic predictions of trait change. In this study we present the first empirical test of a new integral projection model (IPM) framework, which allows phenotypic plasticity and micro-evolution to be teased apart by incorporating a quantitative genetic inheritance function. We parameterise this model for a population of wild great tits Parus major and test its predictive capabilities through K-fold cross validation. We test the predictive accuracy of the quantitative genetic IPM in comparison to the standard IPM. We demonstrate that adding genetic inheritance rules maintains high accuracy of projections of phenological change, relative to the standard IPM. In addition, we find almost identical projections of population dynamics in this population for both IPMs, demonstrating that this model formulation allows researchers to investigate the contributions of phenotypic plasticity and microevolution to trait change, without sacrificing predictive accuracy. Modelling in this way reveals that, under directional environmental change, both micro-evolution and plasticity contribute to an advance of phenology, although the effect of plasticity is an order of magnitude higher than evolution. Despite this, synchrony between great tits and their caterpillar prey was reduced and population declines occurred. Our approach demonstrates that this model framework provides a promising avenue through which to explore the roles of phenotypic plasticity and evolution in trait changes and population dynamics.

Introduction
Rapid climate change is altering biological populations across the globe (IPCC 2007). Average temperatures are increasing and seasonal patterns are shifting, with springs occurring earlier (Parmesan andYohe 2003, Parmesan 2007) and autumns later (Gallinat et al. 2015). One of the most widely observed responses to the current climatic shifts is phenological change; an alteration of the timing of life history events. Phenological change has been observed across almost all taxonomic groups (Menzel et al. 2006, Cleland et al. 2007, Gallinat et al. 2015, Thackeray et al. 2016, and across a multitude of life history events from migration (Lehikoinen and Sparks 2010), reproduction (Crick et al. 1997, Both et al. 2004, Both and Visser 2005, Durant et al. 2007, Charmantier et al. 2008, and emergence from hibernation (Ozgul et al. 2010), to senescence of leaves (Gallinat et al. 2015). Different species have different degrees of environmental sensitivity (adaptive plasticity) and plasticity (Cushing 1969, Durant et al. 2007, Thackeray et al. 2010, consequently rates of spring advance have been highly variable (Parmesan and Yohe 2003, Menzel et al. 2006, Both et al. 2009). These uneven patterns of response have been hypothesised to lead to mismatches between interacting species, which rely on temporal synchrony between interspecific life history events (Cushing 1969, Durant et al. 2007, Singer and Parmesan 2010, Reed et al. 2013.
Whether mismatch will occur depends on both the plastic responses of the interacting species and evolutionary change. Phenotypic plasticity is the primary mechanism by which phenological matching is achieved in inter-annually variable environments (Charmantier et al. 2008, Chevin et al. 2010, Charmantier and Gienapp 2014, Chevin and Lande 2015. However, whether the degree of plasticity that evolved in fluctuating environments will be sufficient under directional change is not known; some evolution may be required to maintain synchrony. Several studies have suggested that the contribution of plasticity to phenology is high and heritability low, therefore limiting the capacity to evolve in response to selection pressure (Vedder et al. 2013, Charmantier and Gienapp 2014. However, the incorporation of these processes in population models of natural populations has been limited. Teasing apart the contribution of ecological and evolutionary processes in phenological responses is a key step to understanding how phenological synchrony changes and identify processes that may disrupt it. Despite a wealth of research on phenology and its importance for individual fitness (Reed et al. 2013, Plard et al. 2014, our understanding of the mechanisms behind phenological change, how this influences phenological synchrony and how this translates to changes in population dynamics, is poorly developed (Miller-Rushing et al. 2010, Bennett et al. 2015, Johansson et al. 2015. In order to address this gap, we need a model that can not only track population dynamics but also evolution and plasticity through time. Modelling frameworks now exist, which can achieve this. Such models were introduced in Childs et al. (2016) and Coulson et al. (2017), which extend the integral projection model (IPM) to include quantitative genetics. Standard IPMs are relatively easy to parameterise using data collected from the field using standard statistical analyses (Ozgul et al. 2010), and offer the potential to address questions of joint trait and population dynamics. Evolutionarily explicitly IPMs, which we implement here, are more complex to parameterise and require detailed pedigree information. However, they also offer the potential for greater insights into evolutionary processes. It is currently unknown whether these models will be useful for producing projections for actual systems or whether they perform any better than the standard IPM in terms of accuracy of projections of trait and population dynamics.
In this study we use a population modelling framework to address questions about the role of ecological and evolutionary processes in phenological change, and the level of model complexity needed to capture them. We present the first empirical use of the extended IPM framework introduced by Coulson et al. (2017). This framework allows us to tease apart the contributions of plasticity and evolution to phenological change. We construct two density-dependent IPMs, a standard IPM and one with a quantitative genetic inheritance function (sensu Coulson et al. 2017). We parameterise these models using long-term data from a population of great tits Parus major in Wytham Woods. The timing of reproduction in this population plays an important role in reproductive success. Individuals hatching chicks around 13 days prior to their prey species' peak abundance (winter moth caterpillar Operophtera brumata) fledge the most chicks (Simmonds et al. 2017). Phenology of both the great tits and the winter moth caterpillars are advancing (Charmantier et al. 2008). The phenology and demography of this population has been recorded in a standardised way since 1960, providing an exemplary dataset to test model performance.
We use our parameterised models to address three specific aims: 1. To test whether adding a quantitative genetic inheritance function to an IPM improves accuracy of projections relative to a standard IPM, as assessed through cross validation. 2. To explore the contribution of plasticity and microevolution to phenological change using the quantitative genetic IPM. 3. To assess whether phenotypic plasticity and microevolution can maintain trophic synchrony under directional environmental change.

Methodology General model description
In this study we build and parameterise two integral projection models (IPMs) (Easterling et al. 2000). One is a standard IPM, the second adds a quantitative genetic inheritance function (Coulson et al. 2017). Both of these IPMs consist of four classes of fundamental demographic functions; survival, development, recruitment and inheritance. These fundamental functions capture the survival of individuals from one year to the next, change in the focal trait for survivors, the number of new individuals that recruit to the population in the next year, and their inherited trait values, respectively. The basis of these models has typically been phenomenological. Coulson et al. (2011) extended the IPM framework to include population genetics. It has been further extended in 2016 by Childs et al. to incorporate quantitative genetics into the inheritance function, and by Coulson et al. (2017) to explicitly distinguish some forms of phenotypic plasticity from micro-evolution. Coulson et al.'s framework (2017), the one which we implement here, has several differences to the standard IPM. The primary development, based on Childs et al. (2016) is the inclusion of a genetic inheritance function based on quantitative genetic assumptions. Our model framework is similar to that introduced by Childs et al. (2016) both in its application and model structure. Childs et al.'s model tracked the bivariate distribution of a breeding value and the phenotypic trait itself. The dynamics of the breeding value assumed the infinitesimal model. The aim of their model was to investigate the role of microevolution and selection on labile traits. They used their model to partition the contributions of evolution on trait change via calculation of terms in the age-and sex-structured Price equation. Our model operates slightly differently in that we track the bivariate distribution of the breeding value and the environmental component of the phenotype. Both models make a simplifying assumption that permanent environment effects are negligible, although both approaches could be extended to capture such effects. Our model has an additional simplifying assumption of identical demography for both sexes such that males and females do not need to be separated into different classes, reducing the dimensionality of the model. Our framework is formulated to make the environmental dependence of the phenotype more explicit. The two components of the phenotype can always be isolated and can change independently. This is the first time an IPM-based model has been parameterised to simultaneously examine the interplay between plasticity and evolution on ecological and evolutionary dynamics.
The standard IPM framework tracks a distribution of traits (Z) through time. In the quantitative genetic IPM framework, we track a bivariate distribution of breeding values (G) and environmental components (E) of a trait. These two components can be combined to create a distribution of phenotypes: we assume Z(G, E) = G + E. The E component of the phenotype changes at each time step based on the fixed effects of environmental conditions at that time step, making the trait labile. An underlying assumption of this approach is that the breeding value (G) remains fixed for life across the environments that an individual experiences, therefore, development cannot alter the conditional distribution of breeding values. The conditional distribution of G can only be altered by the survival and recruitment functions, which operate on the entire phenotype and the inheritance function. Here we implement a single sex, infinitesimal model, with breeding values for the offspring cohort defined by a Gaussian distribution with a mean of the selected mothers and variance defined by the additive genetic variance. In this model, we have assumed the widely used infinitesimal model of inheritance, specifically, we assume that selection does not erode genetic variance and the offspring breeding value is normally distributed in each cohort Walsh 1998, Walsh andLynch 2018). This is one aspect where our model diverges from the one in Childs et al. (2016). The quantitative genetic aspect of our model assumes that males and females have identical demography, the implicit assumption is that selection acts the same way in males and females with respect to the breeding value of this trait, and that assumption is currently very difficult to test. While this is a simplifying assumption and any deviation from this could impact our results, we have no evidence from the study population to suggest that this is not the case. For instance, males and females have been found to show similar survival rates (Culina et al. 2015). A single sex model was chosen as males are typically assumed to have little influence on breeding times in great tits (van Noordwijk et al. 1980). Effects of males have found for lay date but not clutch size (Germain et al. 2016). These effects were an order of magnitude lower than that of females, however, as the male and female effects positively covaried, including males did increase the overall heritability of the trait (Germain et al. 2016). In light of these findings, our results could be interpreted as a potential slight underestimate of the evolutionary potential of phenological traits. But we would expect the contribution of males to be small relative to that of females.
The development and inheritance functions have the effect of redistributing E within each value of G in the bivariate distribution. Through this method of tracking the bivariate distribution of G and E, it is possible to isolate effects of ecological and evolutionary processes involved in quantitative trait and population dynamic change. Such distinction is not possible from the single distribution in a standard IPM.
The equations for the standard IPM and the extended approach are given below (Eq. 1, 2, respectively) where Z is the trait at t, Z′ is the trait at t + 1, N(Z, t) is the distribution of the trait at time t, N(Z′, t + 1) is the distribution of the trait time t + 1. S(Z, θ, t), R(Z, θ, t), D(Z′|Z, θ, t) and H(Z′|Z, θ, t) are the survival, recruitment, development and inheritance conditional on the phenotype and environment at time t(θ). G and E in Eq. 2 are the breeding value and environmental component of the phenotype at time t, respectively. E′ represents the environmental component of the phenotype at time t + 1. The survival and recruitment functions are conditional on both the environment at time t and t − 1 (θ′), due to lagged effects of spring conditions immediately before the census. Evolution occurs in our model as a response to natural selection -a within generation process that alters the distribution of breeding values via viability selection (the survival function) and fertility selection (the recruitment function).

Study system
In this study we used population census and phenological records from approximately 6000 individuals from 1961 to 2010. Details of data used in parameterisation are shown in Table 1. Great tits maximise reproductive success when they hatch their eggs 13 days prior to population-wide caterpillar peak abundance (Simmonds et al. 2017). Typically phenology has been explored by analysing lay date, but hatch date is a better indicator of matching between chick energetic needs and peak caterpillar availability as non-trivial phenology adjustments are possible after laying (Simmonds et al. 2017), so we use hatch date as our phenological trait here. Phenology of great tits has been advancing over recent decades and has generally kept pace with that of their caterpillar prey (Charmantier et al. 2008). This is assumed to be driven by plasticity and it is not known how long it will continue.
Precipitation, temperature and beech mast index were chosen as the key environmental variables for this analysis because they have been previously linked to population dynamics, and phenology in passerine birds (Raven et al. 2005, Sandvig et al. 2017) and specifically in tit species (Paridae) (Van Balen 1980, Perdeck et al. 2000, Payevsky 2006, Grøtan et al. 2009). These cover both spring and winter conditions, therefore investigating the influence of the weather throughout the year.
Long-term data (from 1960 to 2015) were collected through population censuses of the nest box breeding population of great tits in Wytham Woods (~6000 individuals over this period). Censuses have been conducted using a standardised procedure since 1960 (Perrins 1965, Perrins andMcCleery 1989) and provide data on breeding phenology, clutch size, reproductive success, survival and a detailed social pedigree. In addition to the spring breeding census, mist netting is conducted throughout the winter within Wytham Woods. Any new birds caught over winter are fitted with uniquely identifiable metal leg rings and radio-frequency identification tags and age, sex and morphometric data are recorded. All numeric variables, including hatch date, were scaled across years (mean subtracted and divided by the standard deviation) to ensure comparable effect sizes. Census years run from 1 June to 31 May the following calendar year, so from breeding season to breeding season. All results are interpreted and presented on the original scale. Full details of data collection, data cleaning and sources can be found in the Supplementary material Appendix 1 Section A1.

Functional forms and parameterisation of the survival, development and recruitment functions
The two fundamental functions of survival and recruitment have an identical construction and parameterisation for both the standard IPM and the quantitative genetic inheritance IPM. The development function has the same construction but is implemented differently in the two IPMs. Each of these functions is based on a linear predictor of the form in Eq. 3, where β 0 is the intercept of a linear regression taking into account effects of any explanatory factors, β 1 is the slope of the relationship between the phenotypic trait (Z) of interest and the response variable, β j is the slope of the relationship between explanatory variable j and the response variable, and X j is the value of explanatory variable j at this time step.
Each function is parameterised from the coefficients from various forms of linear model (explained in detail below and in Supplementary material Appendix 1 Section A2-A5), which assess the relationships between the demographic rates (survival, development and recruitment) and the explanatory variables of interest. Model selection for all functions was performed using stepwise model reduction with an information theoretic approach, using the AIC (Akaike information criterion) as is commonly employed in the phenology field (Hinks et al. 2015, Roberts et al. 2015, Bailey and De Pol 2016, Thackeray et al. 2016. The exception was the inheritance function, due to the Bayesian model optimisation where only standard errors for fixed effects, were used to determine if a variable should be removed, following Wilson (2010). We performed stepwise model reduction to allow variables to be removed based on their influence on model predictive performance rather than due to a priori decisions about the importance of covariates. The motivation for model selection was to reduce the number of explanatory variables included in our model but retain predictive ability. We were not testing hypotheses related to any of these variables and do not expect that we have found a single best model, simply a balance between simplicity and predictive performance. We systematically removed one variable at a time and assessed the impact on the AIC. Variables to be removed were chosen based on their standard errors, those with estimates less than two times the standard error were candidates for removal. Whether a variable was retained in the model or removed permanently was determined by the ΔAIC. Variables were only retained in the model if removal generated a ΔAIC greater than the model containing the variable.
Some variables were always retained in the statistical models of each function regardless of their standard errors, because these were of biological importance or interest. Candidate variables for model selection that were included in all functions were: spring temperature, spring precipitation, winter temperature, winter precipitation and section of the woodland. Candidate variables included in some functions only were: synchrony and its quadratic, a quadratic effect of hatch date, immigrant status and beech mast index (in survival, development and recruitment), age (in recruitment and development) and clutch size and its quadratic (in recruitment). Each demographic function included the effects of hatch date (the trait value of interest in this study) and population size (to ensure density dependence). These variables were included in every function, regardless of their standard error values because they are of specific biological interest (Coulson et al. 2011). Parameter values are the intercept and beta estimates from the linear model for each demographic rate.

The survival function
The survival function describes the probability of survival to t + 1 as a function of Z at time t. The survival function is assumed to be logistic, details of the equations of the functional form are available in the Supplementary material Appendix 1 Section A2. The survival function alters the distribution of G and E. Natural selection acts through this function, as survival depends on the phenotype.
The parameter values for this function were obtained from a Cormack-Jolly-Seber (CJS) mark recapture model. The CJS model had survival as a binary response variable with logit link function. This not a closed population, however, as individuals primarily migrate during their first year and do not return, immigration is indistinguishable from mortality and therefore they are considered the same in this model.

The development function
For the quantitative genetic IPM, the development function describes the change in mean E from time t to t + 1 and the variance around this change. As G values are assigned for life, this function redistributes E values within each G value. The development function captures the extent to which individual phenology is explained by the previous years' phenology in addition to the effect of environmental covariates on phenology. We might expect individuals that hatched their eggs early relative to the population, in absolute terms, at t might also have early phenology at t + 1 (Thorley and Lord 2015). However, we might expect the effect of previous timing to be considerably lower than the effect of spring temperature at t + 1. The development function can be approximated as a Gaussian probability function (Easterling et al. 2000) of the expected values of E at t + 1 (E′) and its variance, given E at t.

6
Details of the equations of the functional form are available in the Supplementary material Appendix 1 Section A3.
The linear predictor of the expected value was parameterised from a general linear mixed effect model (GLMM) with hatch date at t + 1 as a response variable with a Gaussian distribution with identity link. As the breeding value remains fixed for life and genetic variation explains little of the phenotypic observations, we assume the dynamics of Z (hatch date) and E are sufficiently similar to parameterise this function on Z. The variance of the function, σ 2 , was taken as the variance of the residuals around the fitted relationship from the GLMM. We model the variance as the same at each t, regardless of environmental conditions as it showed no relationship with any explanatory variables in the model. Due to the inclusion of age in this function, we implemented two formulations of the development function in both IPMs, one with an intercept representing juvenile birds (first year breeding) and the second with an intercept for birds older than one year. This makes the models age structured.
For the standard IPM functional form and parameterisation of the development function is the same except that it operates on the phenotype (Z) as a whole not on E. This is because in a standard IPM we track only a single distribution of Z not a bivariate distribution of G and E.

The recruitment function
The recruitment function describes the number of offspring produced by an individual with each Z value, which survive to recruit to the population at t + 1. Details of the equations of the functional form are available in the Supplementary material Appendix 1 Section A4. The recruitment function alters the distribution of G and E. Natural selection acts through this function, as reproductive success depends on the phenotype.
The parameter values for this function were obtained from a Poisson GLMM with number of recruits as response variable (count data) and a log link function.
Both the survival and recruitment functions include lagged environmental effects from spring conditions in t − 1.

The inheritance function of the standard IPM
This inheritance function describes the change in hatch date given the mother's mean hatch date and the environmental conditions at t. The inheritance function, like development, can be approximated as a Gaussian probability function. Details of the equations of the functional form are shown below and in the Supplementary material Appendix 1 Section A5. Equation 3, describes the expected values of Z at t + 1 (Z′) and its variance, given Z at t. V(Z, X) is the linear predictor from Eq. 3, which gives the value of Z′ given the environmental conditions at t + 1. σ 2 is the variance of the probability function. H(Z′|Z, X, θ, t) is the probability distribution of Z at t + 1, given the Z and the environmental conditions at t + 1 (θ).
This function was parameterised using the coefficients from a GLMM with daughter hatch date as the response variable with Gaussian distribution and an identity link.

The inheritance function of the quantitative genetic IPM
The quantitative inheritance function consists of two elements. The first captures genetic inheritance. This is done assuming an infinitesimal model of quantitative genetic inheritance, where selection occurs but additive genetic variance is maintained and normally distributed. At each time step the breeding values of offspring are generated from a random normal distribution. The mean of the breeding values of the selected mothers (those that produced recruits) defines the mean of the distribution of offspring breeding values at t + 1. The variance is defined as additive genetic variance from the animal model (Wilson et al. 2010), therefore variance does not erode during our simulations. Equation 5 represents this first element of inheritance where G O,t+1 is the distribution of breeding values of offspring at t + 1, G R,t+1 is the distribution of breeding values of selected mothers at t + 1 and V A is the additive genetic variance.
The second element represents non-genetic inheritance which are changes in plasticity. This part of the function redistributes E values within each G value. This environmental effect can be interpreted as the plastic component of the phenotype. Hatch date is a trait influenced by genes, the maternal environment and the external environment. Although here we assume that permanent environment effects are negligible. Each female offspring has a genetic predisposition to a certain hatch date but also matches the local environment through E. This element has a functional form that is the same as the standard IPM, but operates on the expected values of E at t + 1 (E′) and its variance, given E at t (Eq. 6). The linear predictor of the expected values was estimated mechanistically using a quantitative genetic approach through the animal model (Wilson et al. 2010). This included the breeding value as a random explanatory variable within the model, allowing estimation of the variance of breeding values in the population (additive genetic variance), as well as maternal effect and permanent environment effect. We included fixed explanatory variables of section of woodland, population size, spring temperature, winter temperature, spring precipitation and winter precipitation. The variance of the Gaussian function, σ 2 , was taken as the total phenotypic variance estimated by the animal model minus the additive genetic variance (the residual environmental variance).
Social pedigree is used to infer relatedness, therefore only individuals that appear in the pedigree could be used here (n = 5855). For females social pedigree should represent accurate genetic relationships, however there are rare instances of females laying eggs in nests other than their own as illustrated by observed mixed species broods (found to be 3% in blue tits and great tits (Barrientos et al. 2015)). For males on the other hand, a social pedigree will have a higher number of incorrectly assigned relationships. Even in socially monogamous bird species extra pair paternity occurs in on average 11.1% of offspring (Griffith et al. 2002). Extra pair paternity was measured at 13% in our study population (Patrick et al. 2012).

Caterpillar peak abundance
The timing of the peak abundance of the winter moth caterpillar is also predicted at each time step in the model simulations using a linear predictor (Eq. 3). This function is parameterised from a linear model of the date of peak caterpillar abundance against environmental explanatory variables of spring temperature, spring precipitation, winter temperature, winter precipitation and beech mast index. Predicting caterpillar timing at each time step allows us to calculate temporal synchrony between great tit hatch timing and caterpillar peak abundance at t + 1.

K-fold cross validation
To test model performance our IPMs were parameterised using a training dataset and then projections were generated for a five-year test dataset (year beginning and ending 20 May) using observed environmental data from the test years. Using this design, we could compare our projected results of the trait value and population size to the observed hatch date and population abundance in the test years (five-fold cross validation). The full dataset from 1961 to 2010 (1960 was removed due to a vast amount of missing data on parent IDs, birth years and previous breeding attempts) was split into 11 fiveyear sections. In turn, each section was used as a test dataset, removed from the parameterisation process, and then predicted from the resulting model. The remaining 10 fiveyear sections were used as the training dataset from which parameters were generated. This allows predictive capacity of the model to be tested whilst controlling for some of the stochasticity in test years. Through this process we obtained 55 years of predicted population size and trait values, with corresponding observed values, from which we could test predictive accuracy.
Population size and mean hatch date were calculated at each time step in each model. These measures were compared to the observed population size and mean hatch date. Predictive accuracy was assessed using mean absolute error (MAE), the mean absolute difference between observed and predicted values.

Model simulation of 50 years
To test model performance over a longer time period both IPMs were parameterised using the whole dataset of 50 years. These models were then used to project the population trait-or bivariate-distribution from the same years used to parameterise . Here the training and test datasets are the same, so not independent. Consequently, this is not a validation of model predictive performance but demonstrates how the models perform over longer simulation runs. Observed values of environment covariates were again used to direct predictions at each simulated time step.

Running the model simulations
At each simulated time step values of explanatory variables need to be generated. In these simulations the values of explanatory variables were taken from observed values or held at their mean or 0 for random effects (Fixed effects: clutch size, section of the woodland. Random effects: mother ID and permanent environment effect). The date of peak caterpillar abundance, and consequently synchrony, was calculated at each time step from the environmental drivers. Only the dynamics of resident birds are simulated, as we did not model immigration. It should be noted that while we only simulate resident birds, the population size measure we use is for the total population size including immigrant birds. This was chosen to improve the estimate of density dependence in the population (Supplementary material Appendix 1 Section A6).
In order to explore the impact directional climate change might have on synchrony between the great tits and their caterpillar prey we ran a second 50-year simulation. In this simulation spring temperature values were no longer taken from observed values but instead sampled randomly from a normal distribution at each time step. We fit a linear model of spring temperature over time to estimate the rate of change in spring temperature from 1960 to 2010. For a scenario of directional change, we assumed the same rate of change would continue. Spring temperature values were therefore drawn from a normal distribution with an increasing mean and a standard deviation equal to that of the observed data. The mean began at one standard deviation above the observed mean and increased by the estimated slope of the relationship between spring temperature and time each year for 50 years.

Results of model fitting
The final model parameterisation of both IPMs, determined by model selection, included the influence of both winter and spring environmental conditions. Winter conditions, including food supply (beech mast index) played a significant role in both survival and recruitment as well as inheritance of hatch date. No other weather-related environmental drivers were retained in the development function following model selection. Full details of the variables included in the final form of each function can be found in the Supplementary material Appendix 1 Section A2-A5.

Can adding quantitative genetic inheritance to an IPM improve projections?
Both the standard IPM and quantitative genetic IPM reproduce the trait, mean hatch date, dynamics well (Fig. 1). The MAE is only three days for each method, when rounded to nearest whole day. There is no difference in accuracy of mean hatch date prediction between the two IPMs. The ability to accurately predict phenological change is equally high for both models. The same is true for population dynamics. While neither IPM fully reproduces the observed population dynamics, they are both equally accurate. There is a systematic accuracy error, with predictions almost always being lower than observed population numbers (Table 2). Figure 2 shows the projected population dynamics and mean hatch date values for a single model simulation of 50 years (parameterised from the whole dataset).
The under-prediction of population dynamics is emphasised when predictions are generated over a 50-year, rather than 5-year, period (Fig. 2). When predictions are generated from a model parameterised on the whole 50-year dataset and predict back observed values in those years (non-independent test and training datasets) the predicted mean hatch dates still match the observations as well as for shorter projection runs.  However, the discrepancy between the predicted population size and observed increases (Table 3), the difference between the two IPMs remains small but increases to two individuals. The capturing of the population dynamics is not improved by using resident population size (Supplementary material Appendix 1 Section A6).

What are the contributions of micro-evolution and plasticity to phenological change?
In the standard IPM, it is primarily plasticity that is driving phenological change. This can be seen in the Supplementary material Appendix 1 Fig. A4, as the contribution from the mother's hatch date to variation in the offspring hatch date is substantially lower than the contribution from environmental effects. Using the quantitative genetic IPM, we can explicitly track the contribution of micro-evolution and plasticity. When we do this, we show that plasticity is again the predominant driver of phenological change in this population ( Fig. 3). Micro-evolution, represented by changes in the mean breeding value, does occur, but is an order of magnitude lower than the contribution of phenotypic plasticity. If we quantify the directional change in breeding value and environmental component of the phenotype by using a linear model of each component against time, the slopes are −0.14 for the environmental component and −0.01 for the genetic component.

Can phenological synchrony be retained under directional environmental change?
Under directional change in spring temperature, we see a gradual loss of optimal synchrony coupled with a decline in population size (Fig. 4). Over the course of the simulation the mean synchrony moves steadily closer to 0 and further from the optimum of −13 days, i.e. great tits hatch closer to the time of peak caterpillar abundance the further into the future we project. This is despite both great tits and caterpillars using the same cue for phenology in this model and experiencing the same environmental change. Directional change in phenology is generated in this simulation by both plasticity and micro-evolution. Both components of the phenotype advance, leading to earlier hatch dates. However, this advance is not as fast as that of the caterpillar prey. Despite steady population declines (Fig. 4) and increased selection for earlier laying, micro-evolution was not sufficient to retain matching between the two trophic levels in our model.

Discussion
The final model parameterisation of both IPMs included the influence of both winter and spring environmental conditions. Winter conditions, including food supply (beech mast index) played a significant role in both survival and recruitment as well as inheritance of hatch date. The role of winter conditions confirms previous work suggesting a key role of winter conditions on overwinter survival for both adults and first year birds (Van Balen 1980). It also highlights the importance of considering cross-season environmental influences on demography rather than focussing only on a single environmental driver (Reed et al. 2013). Our statistical analyses of the drivers of phenological change (here development) also supported previous work indicating that breeding phenology is primarily driven by spring conditions (Perrins 1965, Husby et al. 2010).
Further to the environmental drivers included in our models, we also found a quadratic effect of synchrony with caterpillar prey on reproductive success and survival. This is what we would expect from mismatch theory (Cushing 1969, Johansson et al. 2015 and previous work looking at hatch date synchrony (Simmonds et al. 2017), both breeding too early or too late reduces reproductive success and female survival. We would expect both of these effects due to mismatch reducing the amount of food available to chicks and increasing stress on females. This effect was not incorporated into previous models (Childs et al. 2016), because they modelled synchrony as a focal trait rather than a covariate.
It should be noted that our choice of a single sex model could have an impact on the predicted results. Restricting a model to one sex results in any sex specific differences in offspring recruitment being ignored. If the sex ratio of recruits varies between years based on environmental conditions (as discussed in Oddie 2000) then this could lead to over or underestimations of population size. Based on previous analyses (Oddie 2000), which did not find significant sex differences in recruitment in great tits and the dominating role of females in egg laying, these should not have a large effect in this population. However, potential sex biases should be considered when applying these methods to other species and populations.  The two constructions of IPM we test here were able to accurately reproduce the dynamics of the mean hatch date for the study population. Projections of phenological change had average error of lower than the standard deviation of the data (approx. seven days), even when run across 50 years. However, they also both showed a systematic under prediction of population size. This is could be driven by several, non-mutually exclusive factors. The first is a failure of the statistical analyses that parameterise the fundamental functions to capture the true form of population dynamics. This could be due to a lack of inclusion of secular trends in survival or reproductive rates. We did not include secular trends in these models but a steady increase in population size across the study period. We trialled the final survival and recruitment models with a fixed linear effect of time rather instead of a random effect, to quantify the secular trend in these demographic rates. The estimate for a secular trend in survival was not statistically significant (EST = 0.006, SE = 0.004) but we did a statistically significant the effect of time on recruitment (EST = 0.176, SE = 0.037, estimate for time scaled by subtracting mean year and dividing by the standard deviation). This suggests there is a temporal trend in recruitment that our current environmental drivers do not capture. There could also be some stochastic temporal variation in survival that we did not capture in our models, due to a lack of a temporal random effect. The absence of congruence between model predictions and observed population dynamics could also stem from non-linear density dependence acting in this system that we do not account for in our linear models. A final potential factor is an inability to correctly model immigrants to the population (Plard et al. 2019). In these models we only include immigrants as a fixed addition relative to the projected resident population size. In reality the proportion of immigrants varies annually, this could alter the strength of density dependence in our model differently to reality.
Despite both IPMs having systematic error in projected population size, the accuracy of the two IPMs was virtually identical. The maximum deviation was two individuals. This close similarity is unsurprising as only the inheritance functions differ between the two models. This suggests that incorporating the ability of the focal trait to evolve, through changes in G as a result of survival and recruitment, does not reduce accuracy of projections of population dynamics. In addition, despite a difference in mean projected phenology on an interannual (time step by time step) scale, the MAE in phenology projection between the two models remains the same. On average, across all years predicted, the difference between projected phenology and observed is the same for the standard and quantitative genetic IPM even though interannual projections of mean phenology are not. Both model constructions capture the phenological change with good accuracy even in simulations covering several decades, this suggests that our fundamental functions have captured the drivers of hatch date well, but have not fully captured population dynamics.
Our finding that the projected mean hatch date remains consistent whether you include evolution (quantitative genetic IPM) or not (standard IPM) suggests either that the mode of inheritance included in the model does not make a substantial difference to the capturing of trait dynamics or that trait heritability is low or that selection is weak. Any of these causes could result in the observed patterns. There is a strong influence of phenotypic plasticity on phenological change. Previous work has found that phenotypic plasticity is the primary driver of phenological change for the great tit (Charmantier et al. 2008, Chevin et al. 2010, Charmantier and Gienapp 2014, Chevin and Lande 2015. Evolution has been assumed to play a substantially smaller role in this species (Vedder et al. 2013, Charmantier and Gienapp 2014. This is further supported by the Supplementary material Appendix 1 Fig. A4, which demonstrates that the variation in hatch dates of the offspring contributed by the mother hatch date is considerably lower than that contributed by environmental variables. In addition, Fig. 3 shows that the contribution of micro-evolution, while directional, is an order of magnitude lower than that of plasticity. This indicates that the phenotype is primarily controlled by environmental conditions rather than additive genetic effects. Indeed, a model without any inheritance might still capture the phenotypic change well. From other work on phenology, we would expect micro-evolution to play an important role (Caro et al. 2013) in tandem to a key role of plasticity, with plasticity dominating interannual variation but evolution playing a more important directional role on longer timescales. It seems from the current results that micro-evolution will not be sufficiently rapid in the study population to achieve this.
For our study population from Wytham Woods, including evolution of hatching phenology in IPM led to equal accuracy in population size predictions and retained good accuracy in phenological predictions. When coupled with the ability to explicitly investigate the roles of micro-evolutionary and plastic change, the extended IPM we test here offers a strong potential to give insights into phenological change (and change in other plastic traits). Despite the increased complexity of this model, there does not seem to be a cost in terms of predictive performance. In contrast, by using this new framework it is possible to disentangle the contribution of phenotypic plasticity and micro-evolution to trait change, whilst maintaining the same accuracy of projected population dynamics.
Under directional change in spring temperatures, plasticity alone was not sufficient to keep pace with the environmental change (Fig. 4). Even when combined with some evolutionary change (Fig. 3), great tit hatch dates still did not advance sufficiently to maintain synchrony with their caterpillar prey. The change we induced here was at the same rate as the observed data but did reach into novel conditions. The different rates of response across trophic levels, even when responding to the same cue, have been suggested as inherent ). This is supported by our results, which show caterpillars advancing more rapidly than their great tit predators, creating selection for earlier phenology in great tits. However, selection pressures do not seem to produce an evolutionary response fast enough to match the environmental change. This could indicate that mismatch is inevitable in the face of future environmental change, however, the current results present the effects of a gradual change in spring temperature in isolation. In reality, multiple environmental conditions change at once, which could have different impacts on various demographic rates and therefore buffer populations from the negative impacts of a single driver. Further studies should consider change in multiple drivers preferably directed by predictions of future weather from climate models.
We demonstrated the first empirical example of a new IPM framework, detailing how such frameworks can be applied to real biological populations and tested its predictive ability compared to a standard IPM. The procedures outlined here can be easily applied to other systems. By including environmental drivers from across different seasons and considering their influence on both phenology and demographic rates (survival and recruitment) this kind of population model has potential to dig into the role of phenological change in population dynamics. Considering all key demographic processes together is essential to move from individual fitness to population level consequences, which has been rarely achieved in phenological research. The models analysed here provide an excellent opportunity to achieve this.

Data accessibility
The data and code used for this study is freely available as supplementary files and on at the Github repository (<https://github. com/emilygsimmonds/Evolutionarily_Explicity_IPM>).