Association of sequence variants in CKM (creatine kinase, muscle) and COX4I2 (cytochrome c oxidase, subunit 4, isoform 2) genes with racing performance in Thoroughbred horses


  • J. GU,

    1. Animal Genomics Laboratory, UCD School of Agriculture, Food Science and Veterinary Medicine, University Veterinary Hospital, UCD School of Agriculture, Food Science and Veterinary Medicine, University College Dublin, Ireland.
  • D. E. MacHUGH,

    1. Animal Genomics Laboratory, UCD School of Agriculture, Food Science and Veterinary Medicine, University Veterinary Hospital, UCD School of Agriculture, Food Science and Veterinary Medicine, University College Dublin, Ireland.
  • B. A. McGIVNEY,

    1. Animal Genomics Laboratory, UCD School of Agriculture, Food Science and Veterinary Medicine, University Veterinary Hospital, UCD School of Agriculture, Food Science and Veterinary Medicine, University College Dublin, Ireland.
  • S. D. E. PARK,

    1. Animal Genomics Laboratory, UCD School of Agriculture, Food Science and Veterinary Medicine, University Veterinary Hospital, UCD School of Agriculture, Food Science and Veterinary Medicine, University College Dublin, Ireland.
  • L. M. KATZ,

    1. Animal Genomics Laboratory, UCD School of Agriculture, Food Science and Veterinary Medicine, University Veterinary Hospital, UCD School of Agriculture, Food Science and Veterinary Medicine, University College Dublin, Ireland.
  • E. W. HILL

    Corresponding author


Reasons for performing study: The wild progenitors of the domestic horse were subject to natural selection for speed and stamina for millennia. Uniquely, this process has been augmented in Thoroughbreds, which have undergone at least 3 centuries of intense artificial selection for athletic phenotypes. While the phenotypic adaptations to exercise are well described, only a small number of the underlying genetic variants contributing to these phenotypes have been reported.

Objectives: A panel of candidate performance-related genes was examined for DNA sequence variation in Thoroughbreds and the association with racecourse performance investigated.

Materials and methods: Eighteen candidate genes were chosen for their putative roles in exercise. Re-sequencing in Thoroughbred samples was successful for primer sets in 13 of these genes. SNPs identified in this study and from the EquCab2.0 SNP database were genotyped in 2 sets of Thoroughbred samples (n = 150 and 148) and a series of population-based case-control investigations was performed by separating the samples into discrete cohorts on the basis of retrospective racecourse performance.

Results: Twenty novel SNPs were detected in 3 genes: ACTN3, CKM and COX4I2. Genotype frequency distributions for 3 SNPs in CKM and COX4I2 were significantly (P<0.05) different between elite Thoroughbreds and racehorses that had never won a race. These associations were not validated when an additional (n = 130) independent set of samples was genotyped, but when analyses included all samples (n = 278) the significance of association at COX4I2 g.22684390C>T was confirmed (P<0.02).

Conclusions: While molecular genetic information has the potential to become a powerful tool to make improved decisions in horse industries, it is vital that rigour is applied to studies generating these data and that adequate and appropriate sample sets, particularly for independent replication, are used.


For 3 centuries the natural athleticism of the horse has been selected by breeders to produce Thoroughbred racehorses that, aided by intense management, have become athletes with extreme exercise-performance phenotypes. While management of exercise conditioning and nutrition has considerable effects on the development of elite Thoroughbred athletes (approximately 65%), a significant proportion of variation in athletic ability is heritable (Gaffney and Cunningham 1988). Genetic contributions to human athletic performance phenotypes are well documented and more than 220 gene loci have been described (Bray et al. 2009). While it is likely that Thoroughbred racing performance is also influenced by a large number of genes, only 2 performance-associated sequence variants in exercise-relevant genes (MSTN and PDK4) have been reported for the horse (Hill et al. 2010a,b). As well as these genes, it has been established that genomic regions containing an over-representation of genes responsible for insulin signalling, fatty acid metabolism and muscle strength have been selected during the development of the Thoroughbred (Gu et al. 2009). Additionally, relationships between mitochondrial genotypes and athletic performance in the Thoroughbred have been reported (Harrison and Turrion-Gomez 2006).

A single nucleotide polymorphism (SNP) is a single base substitution in a genomic DNA sequence. SNPs are stable, bi-allelic genetic markers that are abundantly distributed throughout the genome. They may be discovered by aligning segments of a genomic region from different individuals following re-sequencing of candidate genes. We hypothesised that genomic sequence variants may be detected in exercise-relevant genes in Thoroughbred horses. Such sequence variants may be developed for future use to distinguish between individuals with greater potential for elite racetrack performance and individuals with lesser prospects for success. The prospective identification of genetic potential may improve selection decisions and reduce operating costs and may provide opportunities to individually design conditioning programmes to reduce injury risk.
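The idea of finding SNPs by aligning the same genomic segment from different individuals can be sketched in a few lines. This is only an illustration, not the procedure used in the study: the function name `find_snps`, the sample names and the sequences are all invented, and each individual is simplified to a single pre-aligned haploid read (heterozygous calls are ignored).

```python
from typing import Dict, List, Tuple


def find_snps(aligned: Dict[str, str]) -> List[Tuple[int, Dict[str, int]]]:
    """Return (1-based position, base counts) for every bi-allelic column."""
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be pre-aligned to equal length")
    snps = []
    for pos, column in enumerate(zip(*aligned.values()), start=1):
        counts: Dict[str, int] = {}
        for base in column:
            if base in "ACGT":  # skip gaps and ambiguous base calls
                counts[base] = counts.get(base, 0) + 1
        if len(counts) == 2:  # exactly two alleles observed at this site
            snps.append((pos, counts))
    return snps


# Hypothetical aligned reads, invented for illustration only
reads = {
    "horse_1": "ACGTTGCA",
    "horse_2": "ACGTTGCA",
    "horse_3": "ACGCTGCA",
    "horse_4": "ACGCTGTA",
}
snps = find_snps(reads)
```

In this toy panel the function reports two variable sites, at positions 4 and 7.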

Therefore, we investigated a panel of candidate athletic performance genes with functions in muscle development and metabolism, many of which were included in the human gene map for performance (Rankinen et al. 2006). The aim of this study was to identify sequence variation in a panel of candidate athletic performance genes and to evaluate SNP association with racing performance phenotypes. As the Thoroughbred population has been subjected to recent and strong selection, we hypothesised that adaptation to exercise performance may have resulted in advantageous sequence variants in genes that contribute to an athletic phenotype and that these variants may be found at higher frequencies in successful subgroups of the population.

Materials and methods


This work has been approved by University College Dublin, Animal Research Ethics Committee.

Horse populations and DNA samples

Thoroughbreds: More than 1400 registered Thoroughbred horse samples (hair or fresh blood) were collected from stud farms, racing yards and sales establishments in Ireland and New Zealand between 1997 and 2009. To minimise confounding effects of racing over obstacles, only horses with performance records in Flat races were considered for inclusion in the study cohorts. The highest standard and most valuable elite Flat races are known as Group races. Horses were categorised based on retrospective racecourse performance records as ‘Elite Thoroughbreds’ (TBE) or ‘Other Thoroughbreds’ (TBO). Elite Thoroughbreds were Flat racehorses that had won at least one Group race (Group 1, Group 2 or Group 3). Other Thoroughbreds had competed in at least one race, but had never won a race and had handicap ratings (Racing Post Rating, RPR) <80. Race records were derived from 3 sources: Europe race records from The Racing Post online database; Australasia and South East Asia race records from Arion Pedigrees; and North America race records from the Pedigree Online Thoroughbred database. In all cases pedigree information was used to control for genetic background by attempting to exclude samples sharing sires. No dams were shared. Also, overrepresentation of popular sire lines (e.g. Northern Dancer) within the pedigrees was avoided where possible.
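The cohort definitions above amount to a simple decision rule. The sketch below encodes it under stated assumptions: the `RaceRecord` fields and the `classify` function are hypothetical names chosen for illustration, and horses meeting neither definition are simply left unassigned.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RaceRecord:
    """Hypothetical summary of one horse's Flat race record."""
    flat_starts: int        # number of Flat races contested
    flat_wins: int          # number of Flat races won (any grade)
    group_wins: int         # number of Group 1, 2 or 3 races won
    rpr: Optional[float]    # Racing Post Rating, if available


def classify(rec: RaceRecord) -> Optional[str]:
    """Return 'TBE', 'TBO' or None following the cohort definitions in the text."""
    if rec.flat_starts < 1:
        return None  # no Flat performance record: not eligible
    if rec.group_wins >= 1:
        return "TBE"  # elite: won at least one Group race
    if rec.flat_wins == 0 and rec.rpr is not None and rec.rpr < 80:
        return "TBO"  # raced, never won, handicap rating below 80
    return None  # intermediate performers fall outside both cohorts


elite = classify(RaceRecord(flat_starts=10, flat_wins=3, group_wins=1, rpr=117))
other = classify(RaceRecord(flat_starts=6, flat_wins=0, group_wins=0, rpr=63))
```

Leaving intermediate horses unassigned reflects the contrast between the two ends of the performance range that the case-control design relies on.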

Sample Set I: Sample Set I (Table S1) comprised 150 elite (TBE, n = 80; mean RPR = 117) and nonelite (TBO, n = 70; mean RPR = 63) performing Thoroughbreds. There was some sharing of sires among the sample cohorts; i.e. there were 63 sires among the n = 80 TBE samples and 57 sires among the n = 70 TBO samples. The elite performer group contained a subset of animals that competed successfully in short distance (≤8 f, ≤1609 m) and long distance (>8 f, >1609 m) races.

Sample Set II: Sample Set II (Table S1) (n = 148) was refined to include individuals with no shared sires or dams within each cohort. Therefore 17 samples were removed from Sample Set I and supplemented with additional samples collected on an on-going basis during the project. Sample Set II contained 86 elite (mean RPR = 115) and 62 nonelite (mean RPR = 59) performing Thoroughbreds. The elite performer group contained a subset of animals that competed successfully in short distance (≤8 f) and long distance (>8 f) races.

Validation Sample Set: A set of 130 (97 TBE and 33 TBO) additional Thoroughbred samples was selected from the repository for validation of the SNP associations, and criteria for inclusion were as for Sample Set II.

NonThoroughbreds: Samples from 3 non-Thoroughbred populations were included as diverse samples in a panel for SNP discovery by re-sequencing: Akhal-Teke (AH; Turkmenistan, Central Asia), Connemara (CON; Ireland, Western Europe) and Tuva (TU; Republic of Tuva, Southern Siberian Steppes).

Genomic DNA was extracted from either fresh whole blood or hair samples using a modified version of a standard phenol/chloroform method (Sambrook and Russell 2001).

Retrieval of equine genomic sequence for SNP discovery

Twenty-three candidate athletic performance genes (Table 1) were selected for SNP discovery on the basis that their key functions were relevant to exercise physiology and on the availability of equine-specific genomic sequence at the time of PCR assay design. An overview of the study genes including gene symbol, gene name, chromosome location and functional ontology is given in Table 1.

Table 1. Candidate athletic performance genes
Gene symbol | Gene name | Chr | KEGG pathway and/or GO biological process | Resequencing
ACE2 | Angiotensin I converting enzyme (peptidyl-dipeptidase A) 2 | X | hsa04614:Renin-angiotensin system | Y
ACOT9 | Acyl-CoA thioesterase 9 | X | GO:0006629∼lipid metabolic process | Y
ACTN3 | Actinin, alpha 3 | 12 | hsa04510:Focal adhesion; GO:0003012∼muscle system process | Y
AGTR2 | Angiotensin II receptor, type 2 | X | GO:0002016∼renin-angiotensin regulation of blood volume | Y
CKM | Creatine kinase, muscle | 10 | GO:0006603∼phosphocreatine metabolic process | Y
COX4I2 | Cytochrome c oxidase, subunit 4, isoform 2 | 22 | hsa00190:Oxidative phosphorylation | Y
CYCS | Cytochrome c, somatic | 4 | GO:0045333∼cellular respiration | Y
FBP1 | Fructose-1,6-bisphosphatase 1 | 23 | hsa00010:Glycolysis/Gluconeogenesis | Y
GYG1 | Glycogenin 1 | 16 | GO:0005977∼glycogen metabolic process | Y
GYG2 | Glycogenin 2 | X | GO:0005977∼glycogen metabolic process | Y
HIF1A | Hypoxia inducible factor 1, alpha subunit (basic helix-loop-helix transcription factor) | 24 | GO:0001666∼response to hypoxia | Y
LDHA | Lactate dehydrogenase A | 7 | hsa00010:Glycolysis/Gluconeogenesis | Y
MYEF2 | Myelin expression factor 2 | 1 | GO:0007518∼myoblast cell fate determination | N
PDHA1 | Pyruvate dehydrogenase (lipoamide) alpha 1 | X | hsa00010:Glycolysis/Gluconeogenesis | Y
PDHA2 | Pyruvate dehydrogenase (lipoamide) alpha 2 | 3 | hsa00010:Glycolysis/Gluconeogenesis | Y
PFKM | Phosphofructokinase, muscle | 6 | hsa00010:Glycolysis/Gluconeogenesis | Y
PGK1 | Phosphoglycerate kinase 1 | X | hsa00010:Glycolysis/Gluconeogenesis | Y
PHKA1 | Phosphorylase kinase, alpha 1 (muscle) | X | hsa04910:Insulin signalling pathway | Y
PPARGC1A | Peroxisome proliferator-activated receptor gamma, coactivator 1 alpha | 3 | hsa04910:Insulin signalling pathway | Y
PRKAA1 | Protein kinase, AMP-activated, alpha 1 catalytic subunit | 21 | hsa04910:Insulin signalling pathway | N
PRKAA2 | Protein kinase, AMP-activated, alpha 2 catalytic subunit | 2 | hsa04910:Insulin signalling pathway | N
SLC2A4 | Solute carrier family 2 (facilitated glucose transporter), member 4 | 11 | hsa04910:Insulin signalling pathway | N
VEGFA | Vascular endothelial growth factor A | 20 | hsa04370:VEGF signalling pathway; hsa04510:Focal adhesion | N

Primers for PCR were selected and designed from 3 different sources based on the availability of equine-specific genomic sequence for each candidate gene. Before the availability of the first assembly (EquCab1.0) of the horse (Equus caballus) genome (February 2007), PCR primers were either: 1) chosen from published papers; or 2) designed from available sequences in GenBank and the Ensembl Trace Archive (Wheeler et al. 2008). In instances where no horse sequence was available, a comparative genomics approach was taken and PCR primers were designed based on the conserved regions between human (Homo sapiens) and cattle (Bos taurus) sequences.

PCR primer design, amplification and purification and DNA re-sequencing

To amplify candidate gene target regions, PCR primers were designed using the online tool Primer3 (Rozen and Skaletsky 2000) and synthesised by Invitrogen1.

Assays that were successfully optimised (Table S2) were used to amplify their target regions in a SNP discovery panel of 8 individuals from 4 diverse horse populations representing modern and ancient breeds (Thoroughbred n = 4; Akhal-Teke n = 2; Connemara n = 1 and Tuva n = 1). As such, the SNP discovery phase of the study was sufficiently powered to detect 75% of SNPs that had a minor allele frequency >0.05 (Kruglyak and Nickerson 2001). PCR products were purified using the ChargeSwitch PCR product purification kit1. Bidirectional DNA sequencing of PCR products was outsourced to Macrogen2 and carried out using AB 3730xl sequencers3. Sequence variants were detected by visual examination of sequences following treatment by Pregap4 and alignment to detect single sequence variations within one amplicon by comparing different sample segments using Gap4 (Bonfield et al. 1995).

DNA re-sequencing in the COX4I2 gene

The re-sequencing of the COX4I2 gene was designed to detect 95% of SNPs with MAF >0.05 in the Thoroughbred population (Kruglyak and Nickerson 2001). There was particular interest in identifying SNPs in this gene as it has been shown to be differentially expressed in Thoroughbred horse skeletal muscle following exercise (Eivers et al. 2010). Eleven pairs of overlapping PCR primers were designed to cover the entire COX4I2 genomic sequence using the PCR Suite extension to the Primer3 web-based primer design tool (Rozen and Skaletsky 2000; van Baren and Heutink 2004) (Table S3). Twenty-four unrelated Thoroughbred DNA samples were included in a re-sequencing panel to identify Thoroughbred-specific sequence variants in the COX4I2 gene. Bidirectional DNA sequencing of PCR products was outsourced to Macrogen and carried out using AB 3730xl sequencers. Sequence variants were detected by visual examination of sequences following alignment using Consed version 19.0 (Gordon et al. 1998).

High-throughput SNP genotyping

A panel of 47 SNPs in 14 genes was genotyped in either Sample Set I (n = 150) or Sample Set II (n = 148) (Tables S4, S5). The assays included novel SNPs that were discovered in this study. The SNP identification process was augmented by the comparison of sequences deposited in the Equus caballus Ensembl Trace Archive repository before January 2007. Furthermore, the SNP panel was supplemented with EquCab2.0 SNPs lying within candidate gene regions that were discovered in the Horse Genome Sequencing Project (Wade et al. 2009).

Genotyping in Sample Set I was outsourced to KBiosciences4 using either competitive allele specific PCR or TaqMan assay3. Genotyping in Sample Set II was carried out using iPlex technology5 at Sequenom's facilities. Genotyping in the Validation Sample Set was performed on a TaqMan StepOnePlus instrument3 according to the manufacturer's instructions.

Statistical analyses

All statistical analyses, including tests of association, were performed using PLINK Version 1.05 (Purcell et al. 2007). Quality control analyses included computation of sample allele frequency and percent missing genotypes. Case-control association tests were performed for all loci. Statistical significance was assessed using the Cochran-Armitage test for trend and an unconditioned genotypic model. The linear regression model was used to evaluate quantitative trait association at loci CKM g.15884567A>G and COX4I2 g.22684390C>T using best race distance (furlongs) as the phenotype.
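For readers unfamiliar with the Cochran-Armitage test for trend, the following is a minimal standard-library sketch of the 1 degree of freedom statistic with additive genotype scores (0, 1, 2). It is not PLINK's implementation; the function name `cochran_armitage_trend` is an assumption, and the input is the COX4I2 g.22684390C>T genotype counts listed in Table 2, used purely as example data. The resulting statistic is illustrative and is not a value reported in the study.

```python
import math


def cochran_armitage_trend(cases, controls, scores=(0, 1, 2)):
    """Return (chi-square, P) for the 1-df Cochran-Armitage trend test.

    cases and controls are genotype counts ordered to match scores.
    """
    n_cases, n_controls = sum(cases), sum(controls)
    n_total = n_cases + n_controls
    totals = [r + s for r, s in zip(cases, controls)]
    # Score-weighted excess of cases over their expectation under no trend
    t_stat = sum(
        x * (r * n_controls - s * n_cases)
        for x, r, s in zip(scores, cases, controls)
    )
    sum_x = sum(x * n for x, n in zip(scores, totals))
    sum_x2 = sum(x * x * n for x, n in zip(scores, totals))
    variance = (n_cases * n_controls / n_total) * (n_total * sum_x2 - sum_x ** 2)
    chi2 = t_stat ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Genotype counts in the order C:C, C:T, T:T (taken from Table 2)
trend_chi2, trend_p = cochran_armitage_trend(
    cases=(4, 44, 32), controls=(10, 30, 15)
)
```

Unlike the genotypic model, which spends 2 degrees of freedom on unordered genotype classes, the trend test assumes a dose-response across allele counts and so uses a single degree of freedom.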


MatInspector (Cartharius et al. 2005) was used to identify putative transcription factor binding sequences in the sequences surrounding the polymorphic sites at the CKM g.15884567A>G and COX4I2 g.22684390C>T loci.

Results

A population-based case-control study was performed to investigate candidate gene single nucleotide polymorphism (SNP) association with retrospective racecourse performance phenotypes in Thoroughbred horses. Twenty-three candidate genes (Table 1) that had molecular functions relevant to physiological processes important for exercise were selected for the study. The genes were interrogated for SNPs using 3 approaches: 1) De novo SNP discovery by selective resequencing; 2) SNP identification from the Ensembl Trace Archive (Wheeler et al. 2008); and 3) SNP selection from the Horse Genome Sequence SNP Database Version 2.0.

The initial SNP discovery phase executed by these combined approaches revealed 16 sequence variants in 4 (ACTN3 n = 4; CKM n = 4; CYCS n = 7; NOS2 n = 1) of 12 candidate genes. Eight SNPs in 2 genes (ACTN3 n = 4; CKM n = 4) were not recorded in the EquCab2.0 SNP Database. Targeted re-sequencing in COX4I2 identified 14 sequence variants, of which 12 were not included in the EquCab2.0 SNP Database (Table S6). Two of the novel COX4I2 sequence variants and 3 EquCab2.0 COX4I2 SNPs were included in the genotyping panel. An additional 26 SNPs in 9 genes (ACATE2, ACE2, HIF1A, MYEF2, PPARGC1A, PRKAA1, PRKAA2, SLC2A4 and VEGF) were selected from the Horse Genome Sequence (EquCab1.0) SNP Database Version 1.0 following localisation of candidate gene regions by Blast analysis of homologous Homo sapiens gene sequences on Horse Genome Sequence Version 1.0. Upon availability of the annotated genome sequence (EquCab2.0), the SNP locations were cross-referenced to confirm that they were contained within candidate gene genomic regions (nomenclature henceforth refers to EquCab2.0).

In total, 47 SNPs in 14 genes were genotyped in genomic DNA from a panel of Thoroughbred horse samples - either Sample Set I or Sample Set II (Tables S4, S5). Six SNPs had minor allele frequencies <0.05 and were excluded from the association analyses. An exact test for deviation from Hardy-Weinberg proportions was applied at each locus (Wigginton et al. 2005). We did not exclude SNPs that deviated significantly (P<0.01) from Hardy-Weinberg proportions from subsequent analyses since the Thoroughbred population does not conform to many of the conditions under which Hardy-Weinberg equilibrium would be expected to hold. In addition, the tests for association remain valid under departure from Hardy-Weinberg proportions, albeit with a potential loss in power if they reflect systematic genotyping errors. Therefore 41 SNPs were included in a series of phenotype-based genetic association tests (Tables S7, S8). All further discussion refers to SNPs in Sample Set II unless otherwise stated.
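The minor allele frequency filter described above is straightforward to express in code. The sketch below is an illustration only: the function names are assumptions, the first set of genotype counts is invented, and the second is the pooled TBE and TBO counts for COX4I2 g.22684390C>T from Table 2. The exact Hardy-Weinberg test of Wigginton et al. (2005) is not reproduced here.

```python
def minor_allele_frequency(n_hom1: int, n_het: int, n_hom2: int) -> float:
    """Minor allele frequency from diploid genotype counts."""
    n_alleles = 2 * (n_hom1 + n_het + n_hom2)
    if n_alleles == 0:
        raise ValueError("no genotypes supplied")
    freq_allele1 = (2 * n_hom1 + n_het) / n_alleles
    return min(freq_allele1, 1.0 - freq_allele1)


def passes_maf_filter(counts, threshold: float = 0.05) -> bool:
    """Keep a SNP unless its minor allele frequency is below the threshold."""
    return minor_allele_frequency(*counts) >= threshold


# Hypothetical rare SNP, invented for illustration: MAF = 5/296
rare_kept = passes_maf_filter((0, 5, 143))

# Pooled Table 2 counts for COX4I2 g.22684390C>T: (4+10, 44+30, 32+15)
cox4i2_maf = minor_allele_frequency(14, 74, 47)
cox4i2_kept = passes_maf_filter((14, 74, 47))
```

With these inputs the invented rare SNP is dropped, while the COX4I2 SNP (minor allele frequency of about 0.38) is retained.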

Sequence variants in the CKM (g.15884567A>G, χ2= 5.355, P = 0.021, OR = 2.45) and COX4I2 (g.22684390C>T, χ2= 4.654, P = 0.031, OR = 1.731 and g.22684676C>T, χ2= 4.384, P = 0.036, OR = 1.680) genes were significantly associated (P<0.05) with elite racing performance (Table 2). Of the COX4I2 SNPs, g.22684390C>T had the strongest association (P = 0.03) with performance. For both genes the significance of the associations became stronger (CKM g.15884567A>G, χ2= 7.724, P = 0.005, OR = 4.401; and COX4I2 g.22684390C>T, χ2= 7.172, P = 0.007, OR = 2.233) when elite sprinters were compared with nonwinners. In order to determine the most parsimonious genetic model for the associations between these SNPs and elite racing performance the analysis was repeated with coding variables for additive, recessive and overdominant models. For CKM, an allelic model in which the A allele is favourable (i.e. A:A or A:G) provided the best explanation for the data (P = 0.021) and for COX4I2 a recessive model was the best fit in which homozygous individuals (i.e. T:T) for the minor allele were superior racehorses (P = 0.014) (Table 2).

Table 2. Genetic model for the CKM (g.15884567A>G) and COX4I2 (g.22684390C>T) SNPs (Sample Set II)
SNP ID | Allele 1 | Allele 2 | Model | TBE | TBO | CHISQ | DF | P value
CKM_15884567 | G | A | GENO | 1/10/70 | 2/14/39 | 5.03 | 2 | 0.081
COX4I2_22684390 | C | T | GENO | 4/44/32 | 10/30/15 | 6.979 | 2 | 0.031
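As a worked check, the genotype counts in Table 2 can be used to recompute the association statistics with nothing more than Pearson's chi-square on contingency tables. This is a standard-library sketch rather than the PLINK code used in the study, and the helper names (`chi_square`, `chi_square_p`, `allele_counts`, `allelic_odds_ratio`) are assumptions. Counts are ordered allele 1 homozygote / heterozygote / allele 2 homozygote, as in the table.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def chi_square_p(stat, df):
    """Upper-tail P value; closed forms exist for 1 and 2 degrees of freedom."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("only df = 1 or 2 supported in this sketch")


def allele_counts(genotypes):
    """(hom1, het, hom2) genotype counts -> (allele 1, allele 2) counts."""
    hom1, het, hom2 = genotypes
    return (2 * hom1 + het, het + 2 * hom2)


def allelic_odds_ratio(cases, controls):
    """Odds of carrying allele 2 in cases relative to controls."""
    case_a1, case_a2 = allele_counts(cases)
    ctrl_a1, ctrl_a2 = allele_counts(controls)
    return (case_a2 * ctrl_a1) / (case_a1 * ctrl_a2)


table2 = {
    "CKM g.15884567A>G": ((1, 10, 70), (2, 14, 39)),       # TBE, TBO
    "COX4I2 g.22684390C>T": ((4, 44, 32), (10, 30, 15)),   # TBE, TBO
}

results = {}
for snp, (tbe, tbo) in table2.items():
    geno = chi_square([list(tbe), list(tbo)])                       # 2 df
    allelic = chi_square([list(allele_counts(tbe)), list(allele_counts(tbo))])  # 1 df
    results[snp] = {
        "geno_chi2": geno,
        "geno_p": chi_square_p(geno, 2),
        "allelic_chi2": allelic,
        "allelic_p": chi_square_p(allelic, 1),
        "odds_ratio": allelic_odds_ratio(tbe, tbo),
    }
```

Run on these counts, the genotypic statistics agree with Table 2 to rounding, and the allelic chi-square values and odds ratios agree with those quoted in the text for both SNPs. This suggests that the statistics reported in the text correspond to the allelic test, whereas Table 2 reports the 2 degrees of freedom genotypic test.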

As the associations became stronger when winners of short distance Group races were compared to nonwinners, the relationship between each SNP and the best race distance (furlongs) for each individual was examined. The best race distance was defined as the distance of the highest grade Group race won by each individual. In cases where multiple races of the same grade were won, the distance of the race in which the most prize money was won was used. A quantitative association test analysis for elite individuals that had won a Group race (n = 86) revealed a significant association with best race distance for 4 of the COX4I2 SNPs (Table 3). Individuals homozygous for the favourable allele at COX4I2 g.22684390C>T on average won their best races over shorter distances (P = 0.025). The average best race distances for each genotype were: T:T = 7.9 f; C:T = 8.9 f and C:C = 10.6 f.
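The quantitative association test regresses the phenotype on allele count. The sketch below shows an ordinary least squares fit of best race distance on the number of T alleles. It is an assumption-laden illustration: the function name `ols_fit` is hypothetical and the nine data points are invented, since individual-level data are not given in the text.

```python
def ols_fit(x, y):
    """Ordinary least squares slope and intercept for y regressed on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("genotype must vary across individuals")
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data, invented for illustration only:
# number of T alleles (0 = C:C, 1 = C:T, 2 = T:T) and best race distance (f)
t_allele_count = [2, 2, 2, 1, 1, 1, 0, 0, 0]
best_distance_f = [7, 8, 8, 8, 9, 10, 10, 10, 12]

slope, intercept = ols_fit(t_allele_count, best_distance_f)
```

A negative slope means each additional T allele is associated with a shorter best race distance, which is the direction of the genotype means reported above; a significance test on the slope would be needed before drawing any conclusion.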

Table 3. Quantitative association test analysis results for elite individuals in Sample Set II. a. Association test using best race distance b. Quantitative trait means for best race distance

To validate the associations we genotyped an additional set of samples (n = 130) for 2 of the SNPs with the most significant associations (COX4I2 g.22684390C>T and CKM g.15884567A>G). While the associations were not identified in this set of samples (P = 0.516 and P = 0.792), when all of the samples that had been genotyped were considered together (n = 278) the association at COX4I2 g.22684390C>T was retained (P = 0.014).

Following correction for multiple testing the association between this SNP and racing performance did not remain significant. However, while correction for multiple testing is essential in order to control for false positives in global nonhypothesis driven experiments, such correction is not always necessary in hypothesis-driven candidate gene studies (Perneger 1998).

Discussion

Cytochrome c oxidase (COX) is a multi-subunit enzyme (Complex IV) that catalyses the electron transfer from reduced cytochrome c to oxygen in mitochondrial respiration. COX is a dimer in which each monomer is made up of 13 subunits, 3 of which are encoded by the mitochondrial genome (COX1, 2 and 3). Nuclear encoded COX4 is responsible for the regulation and assembly of mitochondrially encoded subunits on the inner mitochondrial membrane and has been associated with mitochondrial volume. COX4 comprises 2 isoforms (COX4-1 and COX4-2) encoded by the COX4I1 and COX4I2 genes that are differentially regulated in normoxic and hypoxic environments (Fukuda et al. 2007). In normal oxygen environments the COX4I1 gene is preferentially transcribed. In limited oxygen environments the master regulator of the hypoxic response, HIF-1 (hypoxia inducible factor 1), activates transcription of COX4I2 and the mitochondrial LON gene, which inhibits the expression of COX4I1. It has been proposed that this environmental regulation of COX4-2 may increase the efficiency of cellular respiration (Fukuda et al. 2007).

We have identified a weak, but significant, association between an intronic SNP in the COX4I2 gene and retrospective racing performance. The COX4I2 g.22684390C>T SNP disrupts a putative glucocorticoid response element (GRE) binding site (C/TGTT). The favourable allele (T) retains the site (TGTT), whereas the less favourable allele (C) disrupts the site (CGTT), which may disable GRE binding and repress expression of the gene. Alternatively, the SNP disrupts a putative p53 tumour suppressor binding site (CAC/TG). The favourable allele (T) retains the site (CATG), therefore enabling p53 binding, while the less favourable allele (C) disrupts the site (CACG), disabling p53 binding.

There is growing evidence for a pivotal role for p53 in the regulation of exercise adaptation via mitochondrial biogenesis and apoptosis (Saleem et al. 2009) and regulation of the cytochrome c oxidase complex (Kruse and Gu 2006; Matoba et al. 2006). For instance, p53 has been shown to promote aerobic metabolism and exercise capacity by the regulation of a number of mitochondrial specific genes and in a tissue-specific manner (Park et al. 2009). While this study has not determined the significance of the maintenance or disruption of a putative p53 binding site in COX4I2, we hypothesise that the presence or absence of the binding site may contribute to mitochondrial biogenesis and therefore overall aerobic capacity.

The creatine kinase, muscle gene (CKM) encodes a muscle type isozyme of creatine kinase found exclusively in striated muscle and involved in cellular energetics. During exercise CKM gene knockout mice show a lack of burst activity but maintain normal absolute muscle force (van Deursen et al. 1993). A CKM sequence variant described in man confers a tendency to be more effective in a 90 min performance test and to have less decline in force production during a 60 s force generation test (Bouchard et al. 1989). Human CKM polymorphisms have been shown to be associated with an increase in cardiorespiratory endurance as indexed by maximal oxygen uptake following 20 weeks of training (Echegaray and Rivera 2001). In the Thoroughbred horse skeletal muscle transcriptome, CKM mRNA is the most abundantly expressed transcript representing 6.9% of the annotated transcriptome (McGivney et al. 2010). Studies have indicated that CKM makes up ∼1% of the human skeletal muscle transcriptome (Welle et al. 1999). The very high expression of CKM mRNA in equine compared to human skeletal muscle is indicative of the importance of the CKM gene product in the highly adapted athletic phenotype of the Thoroughbred. In support of this, CKM gene transcripts are significantly increased 4 h following treadmill exercise (Eivers et al. 2010) and following a 10 month period of training (Eivers et al. 2010).

We have identified a preliminary association between a CKM polymorphism and racing performance; however, this relationship must be validated in additional sample sets before any application of the information may be useful. Briefly, the CKM g.15884567A>G SNP is located in intron 4 and disrupts a putative interferon regulatory factor (IRF-1) binding site (GCA/GA). The A allele retains the site (GCAA) while the G allele disrupts the site (GCGA). IRF-1 is an oxygen mediated transcription factor involved in mitochondrial biogenesis and metabolism, and in man has been shown to be significantly activated after a period of endurance exercise (Mahoney et al. 2005).


To date, a range of approaches has been taken to investigate measurable associations with athletic performance phenotypes in Thoroughbred racehorses including assessment of heart size (Young et al. 2005), muscle fibre type (Rivero et al. 1993, 1995; Barrey et al. 1999; Young et al. 2005), musculoskeletal conformation (Fang et al. 2000), post exercise lactate concentration (Evans et al. 1993), speed at maximal heart rate (Gramkow and Evans 2006), haematological (Revington 1983) and other physiological variables (Harkins et al. 1993). The availability of the horse genome sequence (Wade et al. 2009) and the parallel development of molecular genomics tools for the horse have rapidly enabled the identification of sequence variants associated with athletic performance phenotypes in Thoroughbreds. The first genetic test for a known performance associated trait (Hill et al. 2010a) is now commercially available and will assist the bloodstock industry to maximise the genetic potential of each Thoroughbred horse. As it is likely that there is a considerable number of genes that contribute to the heterogeneity observed in phenotypic performance, it will be necessary to further refine and develop genetic selection methodologies to enhance decision-making.

This study contributes to the growing body of knowledge regarding the genetic contributions to elite performance phenotypes in the equine athlete and suggests that sequence variation in 2 genes (COX4I2 and CKM) contributes to racing performance observed on the racetrack. However, it is important to note that a genotype with a significant but weak effect on racing ability is unlikely to have commercial value as many other factors (both genetic and environmental) will also influence performance. Therefore, with regard to the evaluation of specific genomic information for the equine industries, it is imperative that all preliminary associations are validated in adequate and appropriate cohorts of individuals. The application of genomic information in the Thoroughbred industry is a new and emerging field and should adhere to rigorous scientific standards to establish and maintain the integrity of such information among end-users.

Conflicts of interest

None declared.

Manufacturers' addresses

1 Invitrogen, California, USA.

2 Macrogen, Seoul, South Korea.

3 Applied Biosystems, Foster City, California, USA.

4 KBiosciences Ltd., Hoddesdon, Hertfordshire, UK.

5 Sequenom, San Diego, California, USA.