Natural selection acts solely on individuals' phenotypes, whereas individuals mainly pass on genotypes to their offspring. Understanding how genotype shapes phenotype is therefore essential to understanding evolution in nature (Ridley 2003). Quantitative genetics is a powerful framework for exploring the complex genetic architecture of phenotypic traits (Kruuk, Slate & Pemberton 2008). In wild populations, how much of the observed phenotypic variation in a trait can be transmitted to the next generation is a frequently asked question because it affects the speed and magnitude of trait evolution. The fraction of the variability in a phenotypic trait that is of transmissible genetic origin is called heritability (Falconer & Mackay 1996; Roff 1997; Lynch & Walsh 1998). Because heritability is a key genetic parameter determining whether natural selection can generate evolution of a trait, it has been the focus of many studies in various species (Mousseau & Roff 1987; Falconer & Mackay 1996; Kruuk 2004; Roff 2007; Hill & Kirkpatrick 2010).
Until recently, two approaches were classically used to estimate heritability in wild populations: the half-sibling design and parent–offspring regression (Falconer & Mackay 1996; Roff 1997). The half-sibling design compares intra- and inter-family variance for half-sibs (the full-sibling design is to be avoided because of its sensitivity to dominance effects; see Lynch & Walsh 1998). In parent–offspring regression, the heritability of a trait is given by the slope of the regression of the mean offspring phenotype on the mid-parent phenotype; when the phenotype of only one parent is used, the slope estimates half of the heritability. The advantages and disadvantages of these two methods are well known and are reviewed in Falconer & Mackay (1996), Roff (1997) and Lynch & Walsh (1998). Parent–offspring regression has been used more frequently than the half-sibling design in the wild because it is easier to set up and requires fewer offspring per individual (Roff 1997) and less information about family structure (e.g. molecularly assigned paternities). In wild populations, however, environmental effects shared by related individuals (Wilson et al. 2010), issues related to data quality (Quinn et al. 2006; Postma & Charmantier 2007), misassigned paternities (Charmantier & Réale 2005) and imperfect detection (Cam 2009; Papaïx et al. 2010) can generate biases or decrease statistical power when estimating heritability. The estimation of heritability in wild populations therefore requires accounting for these specificities, in particular unbalanced sampling designs (Kruuk 2004).
Over the last decade, an increasing number of studies (e.g. Réale, Festa-Bianchet & Jorgenson 1999; Milner et al. 2000; Merilä, Kruuk & Sheldon 2001; Kruuk, Merilä & Sheldon 2001; Kruuk et al. 2002; Sheldon, Kruuk & Merilä 2003; McCleery et al. 2004; Charmantier, Keyser & Promislow 2007; Nilsson, Åkesson & Nilsson 2009; Morales et al. 2010; Lane et al. 2011) have estimated the heritability of traits in the wild using the animal model approach (Kruuk 2004; Postma & Charmantier 2007; Visscher, Hill & Wray 2008). This model was developed in the 1950s (e.g. Henderson 1950, 1976) for animal (and plant) breeding studies, to which it owes its name. The animal model is a (possibly generalized) linear mixed model that uses a pedigree of the population to estimate the additive genetic variance component (and potentially other kinds of genetic effects). The advantages of this approach over parent–offspring regression and the half-sibling design are twofold. First, the animal model is not restricted to specific types of relationships between individuals. It therefore maximizes statistical power (Sorensen & Kennedy 1984; Kruuk 2004) and is more robust to inbreeding and selection (Sorensen & Kennedy 1984; van der Werf & de Boer 1990; Sillanpää 2011). Second, the animal model can explicitly account for many confounding effects such as dominance, common environment and parental identity (Kruuk 2004; Wilson et al. 2010). Because of its flexibility when dealing with unbalanced sampling designs, the animal model approach has been strongly promoted for estimating the heritability of traits in wild populations (Kruuk 2004; Postma & Charmantier 2007; Wilson et al. 2010). However, animal models also suffer from the classical pitfalls of mixed models, which are notoriously computationally demanding and sometimes difficult to handle correctly (Bolker et al. 2009; Zuur et al. 2009).
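To make concrete how a pedigree enters the animal model, the sketch below builds the additive genetic relationship matrix that, in the standard formulation, scales the covariance of additive genetic values among individuals. It uses the classical tabular method. The function name and the pedigree encoding (a list of sire and dam indices, with `None` for unknown parents and parents listed before their offspring) are assumptions made for illustration; this is not the code used in the study.

```python
def relationship_matrix(pedigree):
    """Additive relationship matrix from a pedigree (tabular method).

    pedigree: list of (sire, dam) index pairs, one per individual.
    Unknown parents are None; parents must appear before their offspring.
    """
    n = len(pedigree)
    A = [[0.0] * n for _ in range(n)]
    for i, (sire, dam) in enumerate(pedigree):
        # Relatedness to every earlier individual j is the average of
        # j's relatedness to the two parents of i.
        for j in range(i):
            a_js = A[j][sire] if sire is not None else 0.0
            a_jd = A[j][dam] if dam is not None else 0.0
            A[i][j] = A[j][i] = 0.5 * (a_js + a_jd)
        # Diagonal is 1 + inbreeding coefficient (half the parents' relatedness).
        if sire is not None and dam is not None:
            A[i][i] = 1.0 + 0.5 * A[sire][dam]
        else:
            A[i][i] = 1.0
    return A


# Hypothetical pedigree: two founders (0, 1), two full sibs (2, 3),
# and one inbred offspring (4) of the sibs.
example = [(None, None), (None, None), (0, 1), (0, 1), (2, 3)]
A = relationship_matrix(example)
```

In this toy pedigree, parent–offspring and full-sib pairs both have a relatedness of 0·5, which is why the animal model can draw on every type of relationship at once and is not restricted to parent–offspring pairs.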
As a practical demonstration of the advantages of the animal model over parent–offspring regression, Kruuk (2004) reviewed heritability estimates obtained with both methods in wild populations. She showed that parent–offspring regression estimates were on average 30% higher than those from animal models. Yet, this comparison was criticized by Åkesson et al. (2008), who argued that the data sets were too different for estimates obtained with the two methods in different studies to be comparable. Indeed, when restricting the comparison to the four studies that applied parent–offspring regressions and animal models to the same data sets, yielding 22 different heritability estimates (Réale, Festa-Bianchet & Jorgenson 1999; MacColl & Hatchwell 2003; Åkesson et al. 2008; Hadfield et al. 2006), we no longer found any bias: the heritability estimate was higher for the animal model than for the parent–offspring regression in 13 cases, and lower in eight cases (see Table 1). In a simulation study in which related individuals shared an environmental effect, Kruuk & Hadfield (2007) showed that the parent–offspring regression estimated heritability better than an animal model in which this environmental effect had not been specified (a ‘naive’ animal model), and almost as well as an animal model incorporating this effect (an ‘informed’ animal model). These results may be due to the authors having simulated a ‘non-transgenerational’ environmental effect shared by related individuals (Rossiter 1996). Such non-transgenerational effects are shared by related individuals within the same generation only (e.g. sibs). By contrast, transgenerational effects, that is, effects shared by related individuals across generations (e.g. parents and their offspring), increase the resemblance between parents and offspring and may thus artificially inflate heritability estimates given by parent–offspring regressions. In other words, parent–offspring regressions are expected to give higher estimates than animal models only in the presence of transgenerational environmental effects, and provided that relevant information is supplied to the animal model (i.e. additional random effect(s) in the model). When referring to a transgenerational effect here, we will exclusively consider parents and their offspring, excluding other potential levels of relatedness between individuals (e.g. grand-maternal effects), which have not been commonly investigated in wild populations using animal models so far.
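The inflation of parent–offspring regression estimates by a transgenerational effect can be illustrated with a small simulation. The sketch below is a deliberately simplified assumption, not the simulation design of this study: one offspring per family, random mating, no selection, and a hypothetical environmental effect `c` (variance `vc`) added to the phenotypes of both parents and their offspring. All function and parameter names are illustrative.

```python
import random


def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def midparent_slope(n_families, va, ve, vc=0.0, seed=1):
    """Mid-parent/offspring regression slope for simulated families.

    va: additive genetic variance, ve: residual environmental variance,
    vc: variance of an effect shared by parents AND offspring
        (a transgenerational effect; assumed form, for illustration only).
    """
    rng = random.Random(seed)
    midparents, offspring = [], []
    for _ in range(n_families):
        c = rng.gauss(0.0, vc ** 0.5)
        a_mother = rng.gauss(0.0, va ** 0.5)
        a_father = rng.gauss(0.0, va ** 0.5)
        # Offspring breeding value: parental mean plus Mendelian sampling.
        a_off = 0.5 * (a_mother + a_father) + rng.gauss(0.0, (va / 2) ** 0.5)
        p_mother = a_mother + c + rng.gauss(0.0, ve ** 0.5)
        p_father = a_father + c + rng.gauss(0.0, ve ** 0.5)
        p_off = a_off + c + rng.gauss(0.0, ve ** 0.5)
        midparents.append(0.5 * (p_mother + p_father))
        offspring.append(p_off)
    return slope(midparents, offspring)


# True heritability is 0.3 / (0.3 + 0.7) = 0.3 without the shared effect.
h2_no_shared = midparent_slope(20000, va=0.3, ve=0.7)
# With vc = 0.3 the true heritability drops to 0.3 / 1.3, yet the slope rises.
h2_transgen = midparent_slope(20000, va=0.3, ve=0.7, vc=0.3)
```

Under these assumptions the expected slope is (va/2 + vc) / ((va + ve)/2 + vc), so the regression recovers the heritability when `vc` is zero and overestimates it as soon as parents and offspring share an environmental effect.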
Table 1. Difference between estimates of heritability of continuous traits obtained from parent–offspring regressions (RegPO) and animal models (AM) on the exact same data sets in the four studies published so far that report both estimates (we included only studies based on observed pedigrees). Positive differences indicate higher estimates for the animal model. The significance of the differences could not be tested, because information on sample size is not always available. Heritability estimates marked with an asterisk (*) are treated as zero when computing the differences. Additional random effects (‘Effects’ column) are either maternal effects (M) or brood/litter effects (BL, also called ‘nest effect’ in the cited articles).
|Trait|Effects|RegPO|AM|Difference (AM − RegPO)|Study|
|---|---|---|---|---|---|
|Body mass (lambs, June)|–|0·00|0·31|0·31|1|
|Body mass (lambs, September)|–|0·02|0·29|0·27|1|
|Body mass (yearling, June)|–|0·12|0·43|0·31|1|
|Body mass (yearling, September)|–|0·07|0·24|0·17|1|
|Body mass (2 years old, June)|–|−0·15*|0·03|0·03|1|
|Body mass (2 years old, September)|–|−0·09*|0·00|0·00|1|
|Body mass (3 years old, June)|–|0·28|0·27|−0·01|1|
|Body mass (3 years old, September)|–|0·49|0·51|0·02|1|
|Body mass (4 years old, June)|–|0·26|0·23|−0·03|1|
|Body mass (4 years old, September)|–|0·59|0·34|−0·25|1|
|Body mass (adult, June)|M|0·39|0·28|−0·11|1|
|Body mass (adult, September)|M|0·57|0·81|0·24|1|
|Wing length|M, BL|0·76|0·72|−0·04|4|
|Wing projection|M, BL|0·47|0·48|0·01|4|
|Tail length|M, BL|0·68|0·81|0·13|4|
|Bill depth|M, BL|0·07|0·07|0·00|4|
|Bill width|M, BL|0·39|0·46|0·07|4|
|Bill length|M, BL|0·97|0·84|−0·13|4|
|Skull length|M, BL|0·44|0·32|−0·12|4|
|Tarsus length|M, BL|0·72|0·73|0·01|4|
The previous reasoning is valid for any kind of data distribution. Yet, new difficulties arise with non-Gaussian distributions, especially for all-or-none (i.e. binary) data, as standard methods become inappropriate because of intrinsic normality assumptions (e.g. classical parent–offspring regression) or issues in likelihood computation (e.g. REML). To relax normality assumptions, methods based on threshold models are used (Wright 1934; Dempster & Lerner 1950; Elston, Hill & Smith 1977; Gianola 1982; Lynch & Walsh 1998). These models assume an underlying continuous character and a threshold value above which the all-or-none trait is expressed. The statistical properties of parent–offspring regressions using threshold models are well understood (van Vleck 1972; Elston, Hill & Smith 1977; Roff 1997; Lynch & Walsh 1998). By contrast, the behaviour of some estimation methods based on animal models needs further investigation for binary data (e.g. Charmantier, Keyser & Promislow 2007). For instance, Charmantier et al. (2011) obtained contradictory results on the heritability of natal dispersal behaviour in the wandering albatross, both when comparing a parent–offspring regression with an animal model and when comparing different estimation methods used to fit the animal model. Numerous studies of the heritability of binary traits in the wild have been published using only parent–offspring regression (Hansson, Bensch & Hasselquist 2003; Doligez, Gustafsson & Pärt 2009), only the animal model (Thériault et al. 2007; Wilson et al. 2011; Reid et al. 2011a, b) or both approaches (Charmantier, Keyser & Promislow 2007; Doligez et al. 2012; Charmantier et al. 2011). Yet, despite the growing number of heritability estimates for binary traits using animal models, statistical studies comparing the different estimation methods are still lacking.
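As an illustration of the threshold model, the sketch below converts a heritability expressed on the underlying continuous (liability) scale into the heritability of the observed 0/1 trait, using the classical transformation attributed to Dempster & Lerner (1950): the observed-scale value equals the liability-scale value multiplied by z²/(p(1 − p)), where p is the frequency of the trait and z is the standard normal density at the threshold. The function names are hypothetical, and this is given as background on threshold models in general, not as the estimation procedure compared in this study.

```python
from statistics import NormalDist


def liability_to_observed(h2_liability, prevalence):
    """Heritability of the 0/1 trait implied by a liability-scale heritability.

    prevalence: proportion of individuals expressing the trait (0 < p < 1).
    """
    std_normal = NormalDist()
    # Threshold on a standardised liability such that P(liability > t) = p.
    threshold = std_normal.inv_cdf(1.0 - prevalence)
    z = std_normal.pdf(threshold)
    return h2_liability * z ** 2 / (prevalence * (1.0 - prevalence))


def observed_to_liability(h2_observed, prevalence):
    """Inverse transformation, back to the liability scale."""
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - prevalence)
    z = std_normal.pdf(threshold)
    return h2_observed * prevalence * (1.0 - prevalence) / z ** 2


# A trait expressed by half of the population: the scaling factor is 2 / pi.
h2_obs = liability_to_observed(0.5, 0.5)
```

Because the scaling factor is always below one and shrinks as the trait becomes rarer or more common, heritability on the observed scale is lower than on the liability scale, which is one reason why estimates for binary traits are harder to obtain precisely.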
To address these issues, we conducted a comprehensive simulation study comparing the statistical performance of parent–offspring regressions and animal models in estimating heritability. We assessed the influence of different factors on this comparison. First, we simulated contrasting conditions to assess the influence of environmental effects shared by individuals: (i) no shared environment, (ii) a shared non-transgenerational environmental effect and (iii) a shared transgenerational environmental effect. Second, we investigated heritability estimation for both a continuous and a binary trait, using several popular estimation methods to fit the animal model to Gaussian or binomial data. We also investigated in each case the influence of data quality and quantity (Quinn et al. 2006) on the bias and precision of heritability estimates by simulating (i) a large and a small data set with (ii) a high and a low level of knowledge about the genetic relationships between individuals (i.e. a fully connected pedigree and a sparsely connected pedigree with many missing relationships). Finally, we investigated estimates for low, medium and high true heritability levels, in particular because of possible boundary effects at low heritability, which could lead to shifts in accuracy or precision. Our results are discussed alongside results from other studies, and conclusions about the relevance of each approach and method are drawn in the form of advice to the practitioner.
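The factors listed above can be laid out as a grid of simulation scenarios. The sketch below assumes a fully crossed design, which the text suggests ("in each case") but does not state explicitly, and uses descriptive labels and not numerical settings because the actual parameter values are not given in this section. All names are illustrative.

```python
from itertools import product

# Factor levels as described in the text; labels only, no numerical values.
FACTORS = {
    "shared_environment": ("none", "non-transgenerational", "transgenerational"),
    "trait": ("continuous", "binary"),
    "sample_size": ("small", "large"),
    "pedigree": ("fully connected", "sparsely connected"),
    "true_heritability": ("low", "medium", "high"),
}


def scenario_grid(factors=FACTORS):
    """Return one dict per combination of factor levels (fully crossed)."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


scenarios = scenario_grid()
```

Each scenario would then be simulated repeatedly and analysed with every estimation method, so that bias and precision can be compared method by method within the same conditions.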
Our simulations showed that the animal model was the best approach to estimate heritability, using REML for Gaussian phenotypic traits and MCMC for binary traits. We join previous authors (Quinn et al. 2006; Postma & Charmantier 2007) in pointing out the importance of data quantity and quality: our simulations revealed a high level of imprecision in the estimates given by some approaches and estimation methods for a sample size of 200 individuals, which may already be considered a large sample in studies of wild populations. Importantly, this is especially true for binary traits. To best describe and account for the influence of shared environmental effects on heritability estimates, we advocate the systematic use of (and comparison between) mid-parent–offspring, mother–offspring and father–offspring regressions in addition to animal models to compute heritability estimates. Comparing the different estimates obtained would allow the detection of overlooked non-transgenerational environmental effects, which would generate low heritability estimates for the parent–offspring regression but high estimates for the animal model. However, given the high imprecision of parent–offspring regression, the difference between estimates is unlikely to be statistically significant. This comparison would also allow the detection of sex-dependent transgenerational effects, such as maternal transgenerational effects, which would generate higher estimates for mother–offspring than for father–offspring regressions. We therefore still encourage parent–offspring and animal model estimates to be reported together. Although improvements in estimating heritability are still required, especially regarding binary traits and the implementation of transgenerational parental effects, we hope that our study contributes to better guidance on, and use of, models and methods to estimate the heritability of traits in wild populations.