Genomic predictors of trainability


C. Bouchard: Human Genomics Laboratory, Pennington Biomedical Research Center, Baton Rouge, LA 70808, USA.


The concept of individual differences in the response to exercise training or trainability was defined three decades ago. In a series of experimental studies with pairs of monozygotic twins, evidence was found in support of a strong genotype dependency of the ability to respond to regular exercise. In the HERITAGE Family Study, it was observed that the heritability of the maximal oxygen uptake response to 20 weeks of standardized exercise training reached 47% after adjustment for age, sex, baseline maximal oxygen uptake and baseline body mass and composition. Candidate gene studies have not yielded as many validated gene targets and variants as originally anticipated. Genome-wide explorations have generated more convincing predictors of maximal oxygen uptake trainability. A genomic predictor score based on the number of favourable alleles carried at 21 single nucleotide polymorphisms appears to be able to identify low and high training response classes that differ by at least threefold. Combining transcriptomic and genomic technologies has also yielded highly promising results concerning the ability to predict trainability among sedentary people.

The concept of human variation in the ability to respond to exercise training was proposed almost 30 years ago based on research performed in my laboratory at Laval University in Quebec City (Bouchard, 1983). In a series of standardized and carefully monitored exercise training experiments conducted with groups of sedentary young men and women, 18–30 years of age, it was shown that there were large interindividual differences in the response to training, i.e. trainability, for all traits that were investigated, including maximal oxygen uptake (V̇O₂max), submaximal exercise capacity, skeletal muscle oxidative potential indicators, and adipose tissue lipid mobilization and storage markers (Lortie et al. 1984; Despres et al. 1984b; Savard et al. 1985; Hamel et al. 1986; Simoneau et al. 1986).

The most comprehensive data on the individual differences in trainability come from the HERITAGE Family Study, in which 742 healthy but sedentary subjects followed a highly standardized, well-controlled, laboratory-based endurance-training programme for 20 weeks. The training programme induced, on average, significant changes in V̇O₂max and other maximal and submaximal indicators of cardiorespiratory fitness and performance. However, these changes were characterized by marked interindividual differences. For example, the average increase in V̇O₂max was about 400 ml O₂ min⁻¹ with a standard deviation of about 200 ml O₂ min⁻¹. The training responses varied from no change to increases of more than 1000 ml O₂ min⁻¹ (Bouchard et al. 1999; Skinner et al. 2000; Bouchard & Rankinen, 2001). The same pattern of variation was evident for several other training response traits (Bouchard & Rankinen, 2001; Wilmore et al. 2001). Similar heterogeneity in responsiveness of V̇O₂max to exercise training has been reported in other populations (Kohrt et al. 1991; Hautala et al. 2003; Karavirta et al. 2011).

Evidence for a genetic component

Several selection experiments have confirmed the concept that there is a substantial genetic component to the trainability of exercise performance traits. For instance, in one study on selection for high and low responses to treadmill training in rats, the mean running distance increase of the founder population was 222 m (Troxell et al. 2003). Pairs of lowest and of highest responders to training were mated, and their offspring were later exposed to the same treadmill training programme. Offspring from the low line did not differ in trainability from the founders, while those from the high line improved their running distance by more than 60% over the low line. These results also revealed that the narrow heritability of running performance trainability reached 43%.

In a series of experiments that we conducted with pairs of monozygotic (MZ) twins, all sedentary young adults, it was established that individual differences in trainability were not randomly distributed (Prud’homme et al. 1984; Despres et al. 1984a; Hamel et al. 1986). Thus, there was consistently more variance in training responses between pairs of MZ twins than was observed between brothers or sisters (within pairs). These MZ twin intervention experiments revealed that there was a strong genotype–training interaction effect contributing to the variation in V̇O₂max trainability. Four twin studies were performed with exercise training programmes that differed in duration, intensity and control over dietary intake (Prud’homme et al. 1984; Boulay et al. 1986; Hamel et al. 1986; Simoneau et al. 1986; Bouchard et al. 1994). Intraclass correlation coefficients computed from the within-pairs variance and the between-pairs variance in V̇O₂max response to training ranged from 0.44 to 0.77, indicating that the aggregate genotype of a person plays a major role in determining the amplitude of V̇O₂max trainability.
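How an intraclass correlation can be derived from within-pairs and between-pairs variance is illustrated by the minimal sketch below. It assumes a one-way analysis of variance with two members per pair; the twin studies cited above may have used a different estimator. The function name `twin_icc` and the response values are hypothetical and are not data from those studies.

```python
from statistics import mean


def twin_icc(pairs):
    """One-way ANOVA intraclass correlation for twin pairs (2 members per pair).

    `pairs` is a list of (twin_a, twin_b) training responses.
    Assumed estimator: ICC = (MSB - MSW) / (MSB + MSW), valid for k = 2.
    """
    n = len(pairs)
    if n < 2:
        raise ValueError("at least two pairs are required")

    grand_mean = mean(x for pair in pairs for x in pair)
    pair_means = [mean(pair) for pair in pairs]

    # Between-pairs mean square: k * sum((pair mean - grand mean)^2) / (n - 1), k = 2
    msb = 2 * sum((m - grand_mean) ** 2 for m in pair_means) / (n - 1)

    # Within-pairs mean square: squared deviations from each pair mean / n
    msw = sum(
        (x - m) ** 2 for pair, m in zip(pairs, pair_means) for x in pair
    ) / n

    return (msb - msw) / (msb + msw)


# Hypothetical V̇O₂max gains (l O₂ per min) for four MZ pairs, for illustration only.
example_pairs = [(0.30, 0.35), (0.10, 0.12), (0.55, 0.50), (0.40, 0.45)]
example_icc = twin_icc(example_pairs)
```

With this estimator, a value near 1 means that co-twins respond almost identically while pairs differ widely from one another, and a value near 0 means that co-twins resemble each other no more than unrelated individuals.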

Similar findings have been reported for other laboratory-based measures of maximal exercise performance in experimental studies conducted with pairs of MZ twins. Thus, significant within-pair resemblance in training response was observed for total power output during a 90 min maximal cycle ergometer test (Boulay et al. 1986; Hamel et al. 1986) and in short-term (10 s power output) and long-term maximal power tests (90 s power output; Simoneau et al. 1986).

The high degree of heterogeneity in responsiveness to a fully standardized exercise programme in HERITAGE was not accounted for by baseline V̇O₂max level, age, sex or ethnic differences. In the case of the 99 families of European descent who were part of HERITAGE, the increase in V̇O₂max showed 2.5 times more variance between families than within families (Bouchard et al. 1999). Thus, the remarkable heterogeneity observed for the gains in V̇O₂max among adults is not random and is characterized by a strong familial aggregation. A model-fitting analytical procedure found that the most parsimonious models yielded a maximal heritability estimate of 47% for V̇O₂max response level (Bouchard et al. 1999).

Among other findings of interest from HERITAGE, maximal heritability estimates for the changes with exercise training ranged from about 25 to 55% for the gains in V̇O₂ and power output at 60 and 80% of maximum (Perusse et al. 2001). Submaximal exercise heart rate, stroke volume and cardiac output at a power output level of 50 W exhibited significant familial aggregation in response to endurance training, with broad heritability estimates of about 35% (An et al. 2000, 2003; Bouchard & Rankinen, 2001).

Candidate genes

Multiple nuclear and mitochondrial DNA markers have been significantly associated with haemodynamic traits and indicators of physical performance (Bray et al. 2009; Rankinen et al. 2011). Unfortunately, the vast majority of the studies substantiating these associations were based on observational data, targeted poorly justified candidates and were grossly statistically underpowered. Moreover, all the positive findings on autosomal markers have been weakened by contradictory negative reports. The same situation applies to genomic markers of trainability.

Few candidate genes have been found to be associated with the trainability of cardiorespiratory fitness traits. The ACE gene encodes a peptidyl dipeptidase, angiotensin-converting enzyme, a component of the renin–angiotensin system. A few reports have dealt with the ACE insertion/deletion (I/D) polymorphism and exercise training-induced left ventricular growth as assessed by echocardiography. Montgomery and coworkers reported that the ACE D allele was associated with greater increases in left ventricular mass and septal and posterior wall thickness of the heart after 10 weeks of physical training in British Army recruits (Montgomery et al. 1997). A few years later, the same group observed that the training-induced increase in left ventricular mass in another cohort of Army recruits was 2.7 times greater in the DD than in the II homozygotes (Myerson et al. 2001). The prevalence of echocardiographically determined left ventricular hypertrophy increased only among the DD homozygotes (Montgomery et al. 1997). Similar observations were made in endurance athletes, with DD homozygotes exhibiting larger left ventricular mass and a higher prevalence of left ventricular hypertrophy (Di Mauro et al. 2010).

Expression of the components of the renin–angiotensin system is increased in response to stimuli leading to cardiac hypertrophy. The ACE D allele is the allele associated with higher ACE activity and left ventricular growth response (Skipworth et al. 2011). If the DD genotype leads to more heart size growth in response to exercise in healthy individuals, then one would expect that the same genotype would result in higher V̇O₂max gains as well. However, several studies on this issue have been reported, and the results do not support the a priori hypothesis. For instance, the II genotype subjects increased V̇O₂max twice as much as the DD subjects in a training study of postmenopausal women (Hagberg et al. 1998). Several exercise training studies have not observed any differences in V̇O₂max trainability among the three ACE I/D genotypes (Sonna et al. 2001; Roltsch et al. 2005; Day et al. 2007). In HERITAGE subjects exposed to 20 weeks of aerobic exercise, the DD individuals increased V̇O₂max by 476 ml O₂ min⁻¹ (SD = 23), while the II subjects gained 417 ml O₂ min⁻¹ (SD = 26; P = 0.042) in the subgroup of adult offspring (n = 303) from Whites of European descent (Rankinen et al. 2000). However, there were no differences among ACE genotypes in V̇O₂max trainability in the parental group of Whites (n = 188) and in the offspring (n = 196) and parent subgroups (n = 75) of African descent.

Other genes have been considered as candidates for the individual differences in trainability of cardiorespiratory fitness traits. The apolipoprotein E (APOE) gene is one of them; however, only two studies have been reported, and their results are discordant (Hagberg et al. 1999; Thompson et al. 2004). Another gene of interest has been the α-actinin 3 (ACTN3) gene (MacArthur & North, 2011). A C/T transition in codon 577 of ACTN3 replaces an arginine residue (R577) with a premature stop codon (X577), resulting in α-actinin 3 deficiency in the XX individuals. The stop codon variant is quite common in humans, with an estimated 1 billion people worldwide being XX homozygotes (MacArthur & North, 2011). There is a dearth of data on the role of ACTN3 in the training response of cardiorespiratory fitness traits, and the two studies reported to date on its role in the trainability of skeletal muscle performance traits are inconclusive (Clarkson et al. 2005; Delmonico et al. 2007).

Genome-wide exploration

Improvements in microarray-based high-throughput technologies have made it possible to assay hundreds of thousands of single-nucleotide polymorphisms (SNPs) in a single reaction or to quantify the expression level of thousands of transcripts simultaneously. As a result of these advances, it became feasible to undertake genome-wide screens focused on DNA sequence variants (mainly SNPs) or gene transcript abundance. In short, under appropriate conditions, objective and largely unbiased hypothesis-free tests became possible. To date, only a handful of genome-wide linkage or genome-wide association studies (GWAS) pertaining to trainability have been reported.

Genome-wide linkage analysis was used in HERITAGE to find genes for the response to exercise training. Quantitative trait loci (QTLs) for training-induced changes in submaximal exercise (50 W) stroke volume and heart rate were found on chromosomes 10p11 and 2q33.3–q34, respectively (Rankinen et al. 2002; Spielmann et al. 2007). The QTL on 10p11 for the gains in stroke volume was narrowed down to a 7 Mb region. Among the linkage-positive families, the strongest associations were found with SNPs in the kinesin family member 5B (KIF5B) gene locus (Argyropoulos et al. 2009). Resequencing of KIF5B revealed several sequence variants. The SNP with strongest association modified the KIF5B promoter activity in cell-based systems. Furthermore, inhibition and overexpression studies in C2C12 cells showed that changes in KIF5B expression level altered mitochondrial localization and biogenesis; KIF5B inhibition led to diminished biogenesis and perinuclear accumulation of mitochondria, while overexpression enhanced mitochondrial biogenesis (Argyropoulos et al. 2009).

The QTL for the changes in exercise heart rate at 50 W (HR50) on chromosome 2q33.3–q34 was localized within a 10 Mb region (Rankinen et al. 2010). The strongest evidence of association was detected with two SNPs located in the 5′-region of the cAMP-responsive element binding protein 1 (CREB1) gene locus (P = 1.6 × 10⁻⁵). The most significant SNP explained almost 5% of the variance in HR50 response, and the common allele homozygotes and heterozygotes had about 57 and 20%, respectively, greater decreases in HR50 than the minor allele homozygotes. Furthermore, one of these SNPs located about 2.6 kb upstream of the first exon of CREB1 was shown to modify promoter activity in vitro. The A-allele, which was associated with a blunted HR50 response, showed significantly greater promoter activity in a C2C12 cell model than the G-allele.

The first trainability studies incorporating dense genome-wide screening technologies were recently published, and both of them targeted V̇O₂max as a response trait (Timmons et al. 2010; Bouchard et al. 2011). In the first report, Timmons and colleagues relied on global skeletal muscle gene expression profiling and DNA markers to identify genes associated with V̇O₂max training response (Timmons et al. 2010). RNA expression profiling of pretraining skeletal muscle samples was performed in subjects of two independent exercise training trials. The first study (n = 24) identified a panel of 29 transcripts that were strongly associated with the gains in V̇O₂max. The predictive value of the 29 transcripts was subsequently confirmed in a second study (n = 17). This was followed by genotyping tagging SNPs for the predictor transcripts in the cohort of Whites of HERITAGE. A multivariate regression analysis using the transcriptome-derived SNPs and a set of SNPs defined from positional cloning studies performed in HERITAGE identified a set of 11 SNPs that explained about 23% of the variance in V̇O₂max training response. Seven of the SNPs were from the RNA predictor gene set, and four were from the HERITAGE QTL projects.

The second report was based on a GWAS undertaken with more than 320,000 SNPs on the sample of Whites in HERITAGE (Bouchard et al. 2011). A total of 39 individual SNPs were associated with V̇O₂max training response at P < 1.5 × 10⁻⁴. The strongest evidence of association (P = 1.3 × 10⁻⁶) was observed with an SNP located in the first intron of the acyl-CoA synthetase long-chain family member 1 (ACSL1) gene. When all 39 SNPs were analysed simultaneously in multivariate regression models, nine SNPs explained at least 2% (range 2.2–7.0%) of the variance (P < 0.0001 for all), while seven markers contributed between 1 and 2% each. Collectively, these 16 SNPs accounted for 45% of the variance in V̇O₂max trainability, a value comparable to the heritability estimate of 47% reported previously in HERITAGE (Bouchard et al. 1999).

A predictor score was constructed using all 21 SNPs that entered in the final regression model. Each SNP was coded based on the number of high V̇O₂max training response alleles; the low-response allele homozygote was assigned 0, the heterozygote received 1, and the homozygote for the high-response allele was assigned 2. While the theoretical range of the score was from 0 (no beneficial alleles) to 42 (two copies of the beneficial alleles at all 21 loci), the observed scores ranged from 7 to 31. The difference in V̇O₂max training response between those with the lowest (9 or less, n = 36, mean = +221 ml O₂ min⁻¹) and the highest scores (19 or more, n = 52, mean = +604 ml O₂ min⁻¹) was 383 ml O₂ min⁻¹, as depicted in Fig. 1.
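The allele-counting scheme described above can be sketched as follows. This is a minimal illustration, not the analysis code of the original study: the function names, the per-SNP input format and the 'intermediate' label are assumptions, while the 0/1/2 coding, the 21 loci and the cut-offs for the lowest (9 or less) and highest (19 or more) score categories are taken from the text.

```python
N_SNPS = 21  # number of SNPs in the final regression model, per the text


def predictor_score(high_allele_counts):
    """Sum the number of high-response alleles (0, 1 or 2) carried at each SNP."""
    if len(high_allele_counts) != N_SNPS:
        raise ValueError(f"expected {N_SNPS} SNP genotypes")
    for count in high_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("each SNP must be coded 0, 1 or 2")
    return sum(high_allele_counts)


def score_class(score):
    """Place a score in the extreme categories reported in the text.

    'low' is 9 or less, 'high' is 19 or more; 'intermediate' is a
    hypothetical label for everything in between.
    """
    if score <= 9:
        return "low"
    if score >= 19:
        return "high"
    return "intermediate"


# Hypothetical genotype vector: 10 heterozygous loci, 11 low-response homozygous loci.
example_score = predictor_score([1] * 10 + [0] * 11)
example_class = score_class(example_score)
```

Because every locus receives equal weight, the score ignores differences in effect size between SNPs. That matches the simple counting approach described here, although a weighted score would be a natural alternative.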

Figure 1.

Age-, sex- and baseline maximal oxygen uptake (V̇O₂max)-adjusted V̇O₂max training responses across nine genome-wide association study predictor single nucleotide polymorphism (SNP) score categories in HERITAGE Whites
A predictor score was constructed using all 21 SNPs that entered in the final regression model. Each SNP was coded based on the number of high V̇O₂max training response alleles. The difference in V̇O₂max training response between those with the lowest (9 or less, n = 36, mean = +221 ml O₂ min⁻¹) and the highest scores (19 or more, n = 52, mean = +604 ml O₂ min⁻¹) was 383 ml O₂ min⁻¹. The number of subjects within each SNP score category is indicated inside each histogram bar. ‘le’ stands for ‘less than or equal to’ and ‘ge’ for ‘greater than or equal to’. From Bouchard et al. (2011), with permission.


Even though these genome-wide results have not yet been comprehensively replicated, they suggest that it may be possible to define panels of expressed transcripts and DNA variants that would predict V̇O₂max responsiveness to regular exercise. However, much more work remains to be done. One key question is whether the response pattern in a given individual is specific to the given exercise mode and regimen. Another issue is that of the duration of the exercise intervention or the training programme. Would the fitness outcomes be different if the exercise programme lasted for years instead of months? The impact of these rather practical issues on the genomic predictors of true cardiorespiratory fitness trainability is unknown.