Estimating heritability using genomic data

Summary

  1. Heritability (h2) represents the potential for short-term response of a quantitative trait to selection. Unfortunately, estimating h2 through traditional crossing experiments is not practical for many species, and even for those in which mating can be manipulated, it may not be possible to assay them in ecologically relevant environments.
  2. We evaluated an approach, GCTA, that uses relatedness estimated from genomic data to estimate the proportion of phenotypic variance due to genotyped SNPs, which can be used to infer h2. Using phenotypic and genotypic data from eight replicates of experimentally grown plants of the annual legume Medicago truncatula, we examined how h2 estimates from GCTA (h2GCTA) related to traditional estimates of heritability (clonal repeatability for these inbred lines). Further, we examined how h2GCTA estimates were affected by SNP number, minor allele frequency, the number of individuals assayed and the exclusion of causative SNPs.
  3. We found that the average h2GCTA estimates for each trait made with the full data set (>5 million SNPs, 226 individuals) were strongly correlated (r = 0·99) with estimates of clonal repeatability. However, this result masks considerable variation among replicate estimates of h2GCTA, even in relatively uniform greenhouse conditions. h2GCTA estimates with 250 000 and 25 000 SNPs were very similar to those obtained with >5 million SNPs, but with 2500 SNPs, h2GCTA estimates were lower and had higher variance than those with ≥25 k SNPs. h2GCTA estimates were slightly lower when only common SNPs were used. Excluding putatively causative SNPs had little effect on the estimates of h2GCTA, suggesting that genotyping putatively causative SNPs is not necessary to obtain accurate estimates of h2. The number of accessions sampled had the greatest effect on h2GCTA estimates, and variance greatly increased as fewer accessions were included. With only 50 accessions sampled, h2GCTA estimates ranged from 0 to 1 for all traits.
  4. These results indicate that the GCTA method may be useful for estimating h2 using data sets of a size that are available from reduced-representation genotyping but that hundreds of individuals may need to be sampled to obtain robust estimates of h2.

Introduction
The vast majority of ecologically important traits are quantitative or complex traits. For these traits, the genetic basis is too complex to be untangled using traditional molecular genetic approaches, although genome-wide association mapping is being used to identify individual genes that contribute to the variation. Given the inability to map phenotypic variation of complex traits to the individual molecular determinants, statistical approaches based on relatedness among individuals have been used to disentangle the relative contributions of genetics and environment to phenotypic variation (Fisher 1918; Falconer & Mackay 1996) and to predict expected evolutionary responses to selection (Lande & Arnold 1983; Shaw et al. 2008). The effectiveness of formal statistical approaches for predicting response to selection without explicitly identifying the molecular mechanisms is demonstrated by the advances achieved by plant and animal breeders in the past 100 years (Moose, Dudley & Rocheford 2004).

A central parameter in estimating responses to selection and summarizing the proportion of variance due to genetics is heritability (h2) (Wright 1920; Falconer & Mackay 1996; Lynch & Walsh 1998; Hill 2010). Traditionally, h2 has been estimated through pedigree analyses or using individuals of known relationships established by experimental crosses. Although reasonable for short-lived species that can be reared in common environments, experimental crosses are considerably more challenging for species that are long lived, very large or are not amenable to controlled mating. In addition, because heritability is dependent upon the environment in which organisms are reared, h2 estimates made under controlled conditions may not be good estimators of h2 in natural conditions (Geber & Griffen 2003), although a survey of estimates from animals suggests that laboratory and field estimates are often quite similar (Weigensberg & Roff 1996).

Recognizing these limitations, Ritland and colleagues (Ritland 1996; Ritland & Ritland 1996; Lynch & Ritland 1999) developed methods for estimating h2 in the field. These methods were based on linear relationships between marker-based estimates of relatedness and phenotypes. However, because of uncertainty in estimates of relatedness and confounding of relatedness with the environment, h2 estimates from these methods have not been accurate (Coltman 2005; Frentiu et al. 2008; Pemberton 2008; Gay, Siol & Ronfort 2013). In recent years, multiple methods have been developed in the animal breeding literature that use large-scale genomic data to predict phenotypes (Meuwissen, Hayes & Goddard 2001; Van Raden 2008; Goddard et al. 2009; Campos et al. 2012) and estimate heritability based on the proportion of phenotypic variance explained by genotyped SNPs. Many of these methods have focused on phenotype prediction (i.e. genomic selection or genomic prediction) with the goal of being able to predict phenotypes in order to speed up breeding cycles and save money required for phenotyping (Goddard & Hayes 2009; Jannink, Lorenz & Iwata 2010). Yang et al. (2010, 2011a) modified these methods in the GCTA software package to estimate the additive genetic variance for a trait using genome-scale single nucleotide polymorphism (SNP) data. This method advances Ritland (1996) by estimating relatedness with many thousands of markers and then using estimated relatedness to infer the additive genetic variance (hereafter referred to as h2GCTA) of a trait. If the assayed SNPs adequately capture the relationships among individuals at causative alleles, h2GCTA is equivalent to narrow-sense heritability (Yang et al. 2010). Yang et al. (2010) evaluated their method by estimating the proportion of variance explained by ~290 k common SNPs in 3925 humans and, after correcting for SNPs that are not genotyped and those at lower minor allele frequency, came quite close to the heritability of height estimated from sib models, ~0·8. GCTA also has been used to estimate heritability for human weight (Yang et al. 2011b), intelligence (Davies et al. 2011), disease susceptibility (Lee et al. 2012) and personality (Verweij et al. 2012). In addition, similar methods have been used to partition genetic variation in wing length (Robinson et al. 2013) and in clutch size and egg mass (Santure et al. 2013) of a wild bird population.

Genomic-based estimates of heritability, such as those implemented in GCTA, together with the ability to collect genome-scale polymorphism data through reduced-representation genotyping such as RAD-tag, multiplexed shotgun genotyping or genotype-by-sequencing (GBS) (Baird et al. 2008; Andolfatto et al. 2011; Elshire et al. 2011), can make precise estimates of heritability practical even for natural populations of long-lived non-model species. Such estimates may be valuable for understanding evolution in natural populations and predicting population responses to environmental perturbations including ongoing climate change (Lavergne et al. 2010; Shaw & Etterson 2012).

In this study, our overall objective was to use full-genome sequence data and empirical estimates of heritability to evaluate the effects of marker density, minor allele frequency, sample size and the exclusion of causative SNPs on the performance of the GCTA method. Our specific objectives were to (i) compare h2 estimates obtained from replicated accessions grown in a common environment with the h2GCTA estimates of heritability, (ii) evaluate how estimates of h2GCTA are affected by sample size, SNP density and minor allele frequency and (iii) examine how excluding putatively causal genomic regions affects h2GCTA estimates. We pursue these objectives by subsampling a data set of ~6 million SNPs identified by genome sequencing (Branca et al. 2011; Stanton-Geddes et al. 2013) and phenotypic data for six traits (flowering time, height, trichome density, total nodules and rhizobia strain occupancy of nodules above and below 5 cm root growth) from 226 accessions of M. truncatula. Our evaluation of the method extends that of Yang et al. (2010) by empirically examining the effect of including uncommon SNPs and by evaluating the performance of GCTA with SNP densities and sample sizes that may be obtained by evolutionary ecologists working in non-model systems.

Materials and methods

Medicago truncatula is highly selfing in nature (>95%; Bonnin et al. 2001; Siol et al. 2008) with a native range that extends around the Mediterranean (Ronfort et al. 2006). The genomic and phenotypic data analysed here, the collection of which is described in full in Stanton-Geddes et al. (2013), were obtained from 226 accessions of M. truncatula sampled from across a wide portion of the species range. In brief, the genome of each accession was sequenced using Illumina technology (90-bp reads, average coverage ~6× per accession) and mapped to the eight chromosomes and two pseudomolecules (SNPs that could not be mapped to a chromosome: T – tentative consensus sequences from the Dana–Farber Cancer Institute M. truncatula gene index v10.0 and U – unanchored BACs) of the M. truncatula v3.5 reference genome (Young et al. 2011; see Data accessibility for the sequence data). For analyses, we used 5·67 million biallelic SNPs that were scored in ≥ 100 accessions.

Phenotype data (previously described in Stanton-Geddes et al. 2013) were obtained from 226 accessions grown in a fully randomized eight-block greenhouse experiment, with each accession replicated once per block. Because M. truncatula is highly selfing in nature and accessions were selfed for two or more generations prior to extracting DNA for sequencing and conducting the experiment, replicates are expected to be nearly genetically identical. Each plant was inoculated with two strains of rhizobia (M249 and KH46c) that differ in nodulation phenotypes (Sugawara et al. 2013). We recorded date of first flower and plant height at 10 weeks after flowering and counted all nodules on each of 1899 surviving plants (6·1–8 plants per accession depending on the trait) at 11 weeks after harvesting. For plants from six blocks, we calculated the proportion of nodules occupied by rhizobia strain M249 in the upper (top 5 cm) and lower roots using a dot-blot assay (Cregan, Keyser & Sadowsky 1989). Trichome density was measured as the number of trichomes visible at 10× magnification along a 2 mm section of the petiole of one fully expanded leaf. For each trait, we calculated clonal repeatability using linear mixed-effects models (LMMs) implemented in the rptR package (Schielzeth & Nakagawa 2011) in R version 2.15 (R Core Team 2013). For height, trichome density and flowering date we used a Gaussian distribution with MCMC sampling (rpt.mcmcLMM), while we used a GLMM with logit link and multiplicative overdispersion for the nodule number (count data: rpt.poisGLMM.multi) and nodule occupancy (proportion data: rpt.binomGLMM.multi) traits. For total nodules and nodule occupancy, repeatability is reported on the transformed (link) scale. Results did not differ if we used an anova or MCMC approach (Stanton-Geddes 2013; Appendix 1). Clonal repeatability measures the proportion of phenotypic variance that is found among accessions and is equivalent to broad-sense heritability, H2 (Lessells & Boag 1987; Nakagawa & Schielzeth 2010). H2 is an upper bound on narrow-sense heritability (h2) as it also includes effects due to dominance and epistasis (Falconer & Mackay 1996; Lynch & Walsh 1998).
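
As an illustration of what clonal repeatability measures, the following is a minimal sketch of the one-way ANOVA estimator described by Lessells & Boag (1987). It is not the rptR implementation used for the estimates reported here; the function name and the input layout (one sequence of replicate measurements per accession) are assumptions made for the example.

```python
import numpy as np

def anova_repeatability(groups):
    """ANOVA-based repeatability (intraclass correlation).

    groups: list of 1-D sequences, one per accession, holding the replicate
    phenotype values for that accession (hypothetical input layout).
    """
    groups = [np.asarray(g, dtype=float) for g in groups]
    k = len(groups)                                   # number of accessions
    sizes = np.array([g.size for g in groups])
    n_total = sizes.sum()
    grand_mean = np.concatenate(groups).mean()

    # Among- and within-accession mean squares
    ss_among = sum(n * (g.mean() - grand_mean) ** 2 for n, g in zip(sizes, groups))
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    ms_among = ss_among / (k - 1)
    ms_within = ss_within / (n_total - k)

    # n0 adjusts for unequal numbers of replicates per accession
    n0 = (n_total - (sizes ** 2).sum() / n_total) / (k - 1)
    var_among = max((ms_among - ms_within) / n0, 0.0)  # truncate negatives at zero
    total = var_among + ms_within
    return var_among / total if total > 0 else 0.0
```

The sketch assumes a Gaussian trait; for count or proportion traits the estimates reported here were made on the link scale with GLMMs, which this function does not attempt to reproduce.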

We used the program GCTA v1.04 (Yang et al. 2011a) to estimate the proportion of phenotypic variance explained by genotyped SNPs. The GCTA analysis consists of two steps. First, all SNPs are used to calculate the genetic relationship matrix (GRM) among accessions. GCTA uses the accessions included in the analysis as the base population for defining relatedness, such that the average relatedness between all ‘unrelated’ pairs of accessions (off-diagonals of GRM) is zero (Powell, Visscher & Goddard 2010; Yang et al. 2010). As we are working with an inbred species, the average relatedness of accessions with themselves (diagonals of GRM) is equal to two, not one as for humans (Yang et al. 2010). Second, the GRM is used as a predictor in a mixed linear model with a trait as the response to estimate h2GCTA. The GCTA method estimates the proportion of additive genetic variance for a trait, and thus narrow-sense heritability, and so should be lower than the H2 estimate from clonal repeatability. Scripts for GCTA analysis are available in Appendix 2.
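
The first step can be sketched as follows. This is a simplified illustration of the SNP-standardized relationship matrix of Yang et al. (2010), not the GCTA code itself: it applies the same formula to diagonal and off-diagonal elements, assumes no missing genotypes, and uses a hypothetical function name and input layout (rows are accessions, columns are SNPs coded as counts of one allele).

```python
import numpy as np

def genetic_relationship_matrix(genotypes):
    """Sketch of a GRM from allele counts (0, 1 or 2 copies per SNP).

    genotypes: array of shape (n_individuals, n_snps) with no missing values
    (an assumption of this sketch).
    """
    G = np.asarray(genotypes, dtype=float)
    p = G.mean(axis=0) / 2.0                  # allele frequency in the sample
    keep = (p > 0.0) & (p < 1.0)              # monomorphic SNPs carry no information
    G, p = G[:, keep], p[keep]
    # Centre by 2p and scale by the expected SD under Hardy-Weinberg proportions
    Z = (G - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    return Z @ Z.T / G.shape[1]
```

Because allele frequencies are taken from the sample itself, every row of the matrix sums to zero, which is why average relatedness among pairs is close to zero. For fully inbred lines coded 0 or 2, the variance of each SNP is twice the Hardy-Weinberg expectation, so the mean of the diagonal is two under this formula.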

In unmanipulated populations, genotypes likely will be represented by only a single individual. Therefore, we calculated h2GCTA separately for plants from each of the experimental blocks, producing eight estimates of heritability for height, trichomes, flowering date and total nodules, and six estimates for the nodule strain occupancy traits. Thus, each accession is included only a single time in each GCTA analysis. For comparison, we also calculated h2GCTA using accession means for each trait that were estimated from a linear model that included block and accession as fixed effects.
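
One way to obtain such accession means is sketched below, assuming an additive model with accession and block as fixed effects fitted by ordinary least squares. The function name and argument layout are hypothetical, and the sketch is not the script from the appendices.

```python
import numpy as np

def accession_ls_means(y, accession, block):
    """Accession means adjusted for block, from y ~ accession + block (OLS).

    y, accession and block are equal-length sequences with one entry per plant
    (hypothetical layout). Returns a dict mapping accession label to its mean.
    """
    y = np.asarray(y, dtype=float)
    acc_levels, acc_idx = np.unique(accession, return_inverse=True)
    blk_levels, blk_idx = np.unique(block, return_inverse=True)
    n, k, b = y.size, acc_levels.size, blk_levels.size

    # One column per accession, plus one per block except the first (reference)
    X = np.zeros((n, k + b - 1))
    X[np.arange(n), acc_idx] = 1.0
    rows = np.flatnonzero(blk_idx > 0)
    X[rows, k + blk_idx[rows] - 1] = 1.0

    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    block_effects = np.concatenate([[0.0], coef[k:]])
    # Average each accession's prediction over all blocks
    means = coef[:k] + block_effects.mean()
    return dict(zip(acc_levels.tolist(), means.tolist()))
```

With a fully balanced design these adjusted means equal the raw accession means; the adjustment matters only when some plants are missing, as with the surviving plants analysed here.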

To evaluate how less complete genotyping data, that is, marker density, influences estimates, we compared h2GCTA estimated from the full data to h2GCTA estimates from 100 data sets of approximately 5 million, 250 000, 25 000 and 2500 randomly sampled SNPs using PLINK (Purcell et al. 2007). With 25 000 SNPs, there is on average ~1 SNP/10 kb of reference genome. Linkage disequilibrium, as measured by r2, is estimated to decay to background levels within 5–10 kb in M. truncatula (Branca et al. 2011). To evaluate how h2GCTA estimates are affected by sample size (i.e. number of accessions), we estimated heritability on 100 data sets comprising 50, 100 or 150 randomly sampled accessions. To evaluate how h2GCTA estimates are affected by sampling only common SNPs, we compared h2GCTA estimated using the 2 178 524 SNPs with MAF >10% to h2GCTA with an approximately equal number of SNPs (2 155 724) sampled with MAF >2%. Finally, to evaluate whether sampling putatively causative loci affects h2GCTA estimates, we estimated the GRM using a pruned data set in which we removed all SNPs in 10 kb windows flanking each of 1000 SNPs that a genome-wide association study conducted with these same data identified as having the strongest statistical association (lowest P-values) with phenotypic variation in each trait (Stanton-Geddes et al. 2013). The top 1000 SNPs are certain to contain many false positives, that is, SNPs that are not actually responsible for phenotypic variation, but given that the GWAS was conducted with the same data, these 1000 SNPs are those that have the strongest association with the phenotypes we analysed. We compared the h2GCTA estimates using the GRM calculated from the causative SNP pruned data to a GRM created by randomly masking 1000 10 kb windows. For estimating the effects of excluding uncommon SNPs, of sparsely sampling SNPs and of not sampling putatively causative SNPs, we compared h2GCTA estimates to the estimates from the full data set using accession means. Scripts for statistical analysis are available in Appendix 3.
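
The subsampling steps described above can be sketched as follows. The analyses reported here used PLINK; the helper names below are hypothetical, and the window definition (every SNP within `flank` bp on either side of a focal SNP) is an assumption, since the text does not state whether 10 kb is the total or the one-sided width.

```python
import numpy as np

def minor_allele_frequency(genotypes):
    """MAF per SNP from an (n_individuals, n_snps) array of allele counts 0/1/2."""
    p = np.asarray(genotypes, dtype=float).mean(axis=0) / 2.0
    return np.minimum(p, 1.0 - p)

def sample_snps(n_snps, n_keep, rng):
    """Indices of n_keep SNPs drawn at random without replacement."""
    return np.sort(rng.choice(n_snps, size=n_keep, replace=False))

def mask_windows(chrom, pos, focal, flank=10_000):
    """Boolean keep-mask that drops SNPs within `flank` bp of any focal SNP.

    chrom, pos: per-SNP chromosome labels and base-pair positions.
    focal: indices of the SNPs to mask around (e.g. top GWAS hits).
    """
    chrom, pos = np.asarray(chrom), np.asarray(pos)
    keep = np.ones(pos.size, dtype=bool)
    for i in focal:
        near = (chrom == chrom[i]) & (np.abs(pos - pos[i]) <= flank)
        keep &= ~near
    return keep
```

A control data set of the kind described above could be built by calling `mask_windows` with randomly chosen focal indices, for example `mask_windows(chrom, pos, sample_snps(len(pos), 1000, rng))` with `rng = np.random.default_rng()`.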

Results
The relatedness of accessions (off-diagonals of GRM) calculated by GCTA for the full data ranged from −0·26 to 2·49, with the 95% confidence interval from −0·26 to 0·29 (Fig. S1). This range of relatedness is an order of magnitude greater than that found for 3925 ‘unrelated’ humans by Yang et al. (2010) (95% CI from −0·027 to 0·027). The distribution of relatedness is bimodal (Fig. S1), consistent with two major groups found in previous analyses of population structure (Ronfort et al. 2006; Paape et al. 2013). Unlike Yang et al. (2010), we did not remove closely related accessions since all accessions were grown in common conditions, and thus, closely related individuals are no more likely to share common environmental conditions than distant relatives.

For the six phenotypes we analysed, clonal repeatabilities (i.e. H2) estimated from replication of the accessions ranged from near zero for nodule occupancy to 0·71 for flowering date (Fig. 1, Table 1). For each of the traits, h2GCTA estimates using all 5·6 million SNPs and phenotypes from each block individually spanned the clonal repeatability estimates (Fig. 1). The mean of the per-block estimates was highly correlated with the clonal repeatability estimates (r = 0·98, P = 0·0008, Fig. 2). However, estimates of h2GCTA from individual blocks showed considerable variance around this mean estimate (Fig. 1). Contrary to expectations that h2GCTA (narrow-sense h2) would be lower than repeatability (broad-sense H2), all estimates of repeatability were within the confidence interval for the mean h2GCTA estimates and the slope of the regression equation did not differ from one (0·93 ± 0·10). The intercept of the equation did not differ from zero (−0·02 ± 0·04), indicating no bias of h2GCTA compared to clonal repeatability (Fig. 2).
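
The comparison of slope and intercept with their expected values can be illustrated with a short sketch of simple linear regression with standard errors. The function name is hypothetical and no data from this study are embedded in it.

```python
import numpy as np

def ols_line(x, y):
    """Slope, intercept and their standard errors for y = intercept + slope * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    sxx = ((x - x.mean()) ** 2).sum()
    slope = ((x - x.mean()) * (y - y.mean())).sum() / sxx
    intercept = y.mean() - slope * x.mean()
    resid = y - (intercept + slope * x)
    s2 = (resid ** 2).sum() / (n - 2)          # residual variance, n - 2 d.f.
    se_slope = np.sqrt(s2 / sxx)
    se_intercept = np.sqrt(s2 * (1.0 / n + x.mean() ** 2 / sxx))
    return slope, intercept, se_slope, se_intercept
```

With the values reported above, the slope lies (0·93 − 1)/0·10 = −0·7 standard errors from one and the intercept lies −0·02/0·04 = −0·5 standard errors from zero, so neither departs detectably from the line of equality.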

Figure 1.

Plot of estimates of repeatability (r; black diamonds) and estimates of h2GCTA from each block (+ signs) using full sequence data. Standard errors are not shown for clarity but are reported in Table 1 (repeatability) and supplemental Data 3 (h2GCTA for each block).

Table 1. Estimates of clonal repeatability (r) and h2GCTA for six traits using ~2 million SNPs with minor allele frequency (MAF) >2% and >10% and with 10 kb windows around the top 1000 SNPs masked (MAF >2%). Standard errors for h2GCTA estimates are in parentheses.

Trait             r              h2GCTA (S.E.)   h2GCTA (S.E.)   h2GCTA (S.E.)
                                 MAF > 2%        MAF > 10%       Candidate SNPs masked
Trichomes         0·35 (0·03)    0·55 (0·22)     0·35 (0·15)     0·54 (0·21)
Flowering date    0·71 (0·02)    0·77 (0·14)     0·72 (0·11)     0·78 (0·14)
Height            0·52 (0·03)    0·71 (0·17)     0·64 (0·13)     0·70 (0·16)
Total nodules     0·31 (0·03)    0·91 (0·16)     0·84 (0·12)     0·92 (0·16)
U. nod occ        0·02 (0·03)    0·04 (0·08)     0·09 (0·09)     0·04 (0·08)
L. nod occ        0·01 (0·02)    0·01 (0·04)     0·02 (0·03)     0·01 (0·04)

Figure 2.

Relationship between the mean h2GCTA from all blocks and repeatability estimated using all data for each trait. The regression equation and r2 from the linear model of h2GCTA fit on repeatability are shown.

Estimates of h2GCTA conducted with 250 000 or 25 000 SNPs were very similar to those obtained with the full data set (Fig. 3). However, when only 2500 SNPs were used, h2GCTA estimates were lower for all traits with non-zero estimates – from 0·22 to 0·54 lower for flowering date and total nodules, respectively (Fig. 3). The range of h2GCTA values in the resampled data sets was also considerably larger when fewer SNPs were used in the estimates (Fig. 3), increasing from 0·07 to 0·11 for flowering date and total nodules, respectively.

Figure 3.

Box-and-whisker plots of h2GCTA estimates from 100 samples each made with 226 accessions and 2500, 25 000, 250 000 or 5 million SNPs for flowering date (Flo Date), height, trichome density, total nodules (Tot nod), upper roots rhizobia strain occupancy (U nod occ) and lower roots rhizobia strain occupancy (L nod occ). The boxes give the first and third quartiles, while the whiskers extend to the highest value within 1·5 times the interquartile range.

Although estimates were relatively robust to the number of SNPs used, they were strongly affected by the number of accessions, with the range of h2GCTA estimates increasing as the number of accessions was reduced (Fig. 4). When only 50 accessions were used to estimate heritability, the values of h2GCTA obtained from the resampled data sets ranged from 0 to 1, the entire range of possible values. Interestingly, the mean h2GCTA estimates did not show a consistent change with smaller samples. For flowering date, height and trichomes, the mean h2GCTA estimates increased with decreasing number of accessions, whereas for the nodulation traits the estimates decreased with fewer accessions.

Figure 4.

Box-and-whisker plots of h2GCTA estimates from 100 samples each made using all SNPs and with 50, 100, 150 or 200 accessions for flowering date (Flo Date), height, trichome density, total nodules (Tot nod), upper roots rhizobia strain occupancy (U nod occ) and lower roots rhizobia strain occupancy (L nod occ). The boxes give the first and third quartiles, while the whiskers extend to the highest value within 1·5 times the interquartile range.

Estimates of h2GCTA from only common SNPs (2 178 524 SNPs with MAF >10%, Table 1) were highly correlated with h2GCTA estimated from a similar number of SNPs that included uncommon SNPs (MAF >2%; r = 0·99, P < 0·0001). The estimates based on common SNPs alone were, however, lower than those obtained from the full data set (average reduction = 0·18 for the four non-zero traits, and all but 6 of the 52 per-block estimates were lower in the MAF >10% estimates). In contrast, removing 10 kb windows of SNPs that surround the 1000 SNPs that a previous GWAS identified as most closely associated with phenotypic variance in the data (Stanton-Geddes et al. 2013) had only minor effects on h2GCTA estimates (Table 1). This result reinforces that the GCTA method does not require causative SNPs to be genotyped for accurate estimates of heritability, as long as SNP density is high enough to accurately capture fine-scale relatedness.

Discussion
Heritability is central to predicting the potential for a trait to respond to selection. Unfortunately, estimating heritability using traditional breeding or pedigree-based approaches is difficult and may not even be possible for many organisms (Platenkamp & Shaw 1995). Using traditional approaches for estimating heritability is even more challenging if organisms are grown or reared in natural settings, which may be important given that heritability is environmentally dependent. Genome-scale sequence data provide an opportunity to estimate relatedness of individuals using molecular data and then to use estimated relatedness to infer heritability from the proportion of phenotypic variance explained by genotyped SNPs (Yang et al. 2010). Yang et al. (2010, 2011) evaluated the performance of GCTA-based estimates of heritability of several human traits using hundreds of thousands of SNPs and thousands of individuals. The sample sizes that Yang et al. (2010) considered are reasonable for researchers analysing human data but are currently unobtainable in many other species. The potential for these genomic-based approaches to estimate heritability of non-model species will depend on the characteristics of the molecular data and the sample sizes needed to obtain reliable estimates of heritability.

The good news for evolutionary ecologists interested in estimating heritability of non-model species growing in natural environments is that estimates of heritability obtained from GCTA were positively correlated with estimates of clonal repeatability (Figs 1, 2) and appear relatively robust to SNP number (Fig. 3) and the exclusion of causative SNPs (Table 1). However, for any single replicate, the h2GCTA could be quite far from the mean (Fig. 1). For the six traits we examined, h2GCTA estimates obtained with 25 000 SNPs (approximately one SNP per 10 kb, a slightly greater distance than that over which LD decays to background levels; Branca et al. 2011) were similar to h2GCTA estimates obtained from our full 5·6 million SNP data set. A simulation study similarly found that only a few thousand markers are adequate to accurately estimate heritability, particularly in selfing species such as M. truncatula (Gay, Siol & Ronfort 2013). This result is encouraging because it means that the number of SNPs needed to obtain robust estimates of h2 can be assayed using reduced-representation approaches such as RAD-tag or GBS. These approaches are both less expensive and faster than full-genome sequencing, especially for non-model species. However, our data suggest that estimating heritability using only a few thousand SNPs, a number that may be assayed using some SNP-genotyping platforms, may produce unreliable estimates.

Our data also indicate that it is not necessary to assay causative SNPs or even SNPs that are in close physical linkage with causative SNPs to obtain reliable estimates of h2. The similarity of estimates obtained with and without putatively causative SNPs indicates that, with high-density genotyping, there is sufficient allelic variation to accurately capture relatedness (Goddard & Hayes 2009). Similarly, Ober et al. (2012) found that only 150 000 SNPs were necessary to capture the same predictive ability as full sequence data in Drosophila, and Yang et al. (2010) were able to capture about half of the heritability for human height using only 294 k SNPs. Simulation studies also have shown that inclusion of causative SNPs has little effect on whole-genome sequence-based phenotype prediction (e.g. genomic prediction); Meuwissen & Goddard (2010) showed that including the causative SNPs yields only about a 3% increase in prediction accuracy.

While GCTA-based estimates of heritability appear robust to removal of causative SNPs and relatively robust to SNP density, h2GCTA estimates are affected by assaying only common SNPs. We found that h2GCTA estimates that rely only on common SNPs were lower than those obtained with the full data set (Table 1). The lower estimates of h2GCTA obtained when only common SNPs are used to estimate relatedness likely reflect phenotypic differences among closely related accessions that are not differentiated when low-frequency SNPs are excluded from the analyses. Lower estimates of h2 when assaying only common SNPs also are consistent with the initial application of the GCTA method to human height (Yang et al. 2010); using ~290 k common SNPs, they explained ~45% of the phenotypic variance, and by making assumptions about the MAF of ungenotyped causal SNPs, they were able to explain the additional 35% to correspond with independent estimates of the heritability. Despite lower point estimates of h2GCTA, estimates made with common SNPs (MAF > 10%) were tightly correlated with h2GCTA estimates made using all SNPs, indicating that simple corrections, such as used by Yang et al. (2010), may be adequate to estimate heritability with only common SNPs. Moreover, if researchers are primarily interested in the relative potential for different traits to respond to selection, rather than absolute responses to selection, then basing relatedness on only common SNPs may be valid. From a practical perspective, many now-commonly used genotyping methods, including RAD-tag and GBS, do not require ascertainment of SNPs and thus are not likely to be strongly biased towards common SNPs.

Although SNP density, MAF, and the inclusion of causative SNPs had relatively small effects on h2 estimates, we found that estimates were highly dependent on the number of accessions sampled. When only 50 accessions were used, h2 estimates ranged from 0 to 1 for all six traits (Fig. 4). Even with 100 accessions, the range of values was large – with 50% quantiles of the h2GCTA estimates spanning more than half of the possible heritability values for trichome density and nodule number. This range of values suggests that hundreds of individuals should be used to obtain reliable estimates of h2. It is worth noting that the need for large sample sizes is not limited to GCTA – Villemereuil et al. (2013) showed with simulated data that even when sampling 200 individuals, confidence intervals around animal model-based estimates of h2 can be quite large with 95% quantiles covering a range of ~0·5 for traits with moderate or high h2 (Villemereuil et al. 2013, Fig. 1).

In addition to the challenge of sampling hundreds of individuals to obtain reliable estimates of h2, genomic-based estimates of heritability in natural environments may be confounded by the environment (Garant & Kruuk 2005). This was evident even with data from plants grown in controlled greenhouse conditions, with estimates of h2GCTA from individual blocks different from the overall mean by up to 0·24 (Fig. 1). This problem will almost certainly be greater in natural settings. In particular, if relatedness and the environment covary, as may be expected with species that have limited dispersal, for which there is a genetic basis to habitat choice (Bazzaz 1991), or for which maternal environmental effects are strong, then h2GCTA estimates are likely to be biased upwards. Yang et al. (2010) suggested that the potential bias of shared environments might be limited by removing closely related individuals that may share a common environment. This approach has been criticized for failing to account for fine-scale population structure or ascertainment bias of samples (Browning & Browning 2011), though Goddard et al. (2011) emphasize this is a general problem for genetic studies when the environment has a large effect on the phenotype. An alternative approach, if important environmental variables can be identified and assayed, would be to directly include the relevant environmental variables as covariates in the model estimating heritability.
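
The covariate approach suggested above can be sketched with a small restricted maximum likelihood (REML) estimator in which measured environmental variables enter as fixed effects alongside the intercept. This is an illustrative grid search over h2 using an eigendecomposition of the GRM; it is not the algorithm implemented in GCTA, and the function name, arguments and grid are assumptions of the example.

```python
import numpy as np

def reml_h2(y, A, covariates=None, grid=None):
    """Grid-search REML estimate of h2 for y = X b + g + e.

    g ~ N(0, A * var_g), e ~ N(0, I * var_e), h2 = var_g / (var_g + var_e).
    y: phenotypes (n,); A: relationship matrix (n, n);
    covariates: optional (n, q) array of environmental covariates.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    X = np.ones((n, 1))                               # intercept
    if covariates is not None:
        X = np.column_stack([X, np.asarray(covariates, dtype=float)])
    p = X.shape[1]

    # Rotate so that the covariance of the rotated data is diagonal
    s, U = np.linalg.eigh(np.asarray(A, dtype=float))
    s = np.clip(s, 0.0, None)                         # guard against tiny negatives
    y_rot, X_rot = U.T @ y, U.T @ X
    if grid is None:
        grid = np.linspace(0.0, 0.999, 1000)          # stop short of 1 to keep d > 0

    def neg2_restricted_loglik(h2):
        d = h2 * s + (1.0 - h2)                       # diagonal of scaled covariance
        Xw = X_rot / d[:, None]
        XtDX = X_rot.T @ Xw
        beta = np.linalg.solve(XtDX, Xw.T @ y_rot)    # generalized least squares
        resid = y_rot - X_rot @ beta
        sigma2 = (resid ** 2 / d).sum() / (n - p)     # profiled total variance
        _, logdet = np.linalg.slogdet(XtDX)
        return (n - p) * np.log(sigma2) + np.log(d).sum() + logdet

    values = np.array([neg2_restricted_loglik(h) for h in grid])
    return float(grid[np.argmin(values)])
```

Passing a matrix of environmental measurements as `covariates` removes their linear effect before the variance is partitioned. This helps only to the extent that the relevant variables have been identified and measured, and the sketch returns a point estimate without a standard error.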

In summary, to date, the estimation of heritability has required a known pedigree in nature, or experimentally generated progeny reared in common environments. We find that the GCTA method, which has been used to investigate heritability of many human traits (Yang et al. 2010, 2011b; Lee et al. 2012), provides estimates of heritability that correspond to independent estimates of clonal repeatability. These results suggest that GBS-type approaches, which sample tens to hundreds of thousands of SNPs (e.g. Baird et al. 2008; Andolfatto et al. 2011; Elshire et al. 2011), will be a valuable resource for estimating heritability in natural populations when many hundreds of individuals can be sampled.

Acknowledgements
We thank Jian Yang for help with running the GCTA program, Ruth Shaw for discussions, Jarrod Hadfield for comments, and Jean-Marie Prosperi, Joëlle Ronfort, Magalie Delalande, Thierry Huguet, Laurent Gentzbittel and Mohammed Badri for maintaining and providing Medicago germplasm. Tim Paape, Roxanne Denny, Brendan Epstein, Stephanie Erlandson, Masayuki Sugawara and Mohamed Yakub assisted with data collection. This work utilized computing resources at the University of Minnesota Supercomputing Institute and was funded by National Science Foundation Grant 0820005.

Data accessibility

Sequence data are available from: Phenotype data are available from Data Dryad doi: 10.5061/dryad.pq143. Scripts used to perform the analysis are available in online Appendices.