Keywords:

  • Genomic confounding;
  • GWAS;
  • SNP;
  • genetic association

Summary

  1. Top of page
  2. Summary
  3. Introduction
  4. Materials and Methods
  5. Results
  6. Discussion
  7. Funding
  8. References
  9. Supporting Information

Genome-wide association studies have transformed genetic studies of disease susceptibility, identifying many variants that may tag functional polymorphism nearby. Variants are often ascribed to a physically close gene exhibiting plausible functionality for a causal pathway. However, more physically remote genes may be at a lesser linkage or linkage disequilibrium (LD) distance from the tested SNP and could therefore contain the functional variant tagged. This analysis aims to identify instances where research may be misled by misassociation of a variant with a gene and develop tools to analyse genomic confounding. A catalogue of reported associations was systematically analysed for unreported genes which may represent the true functionality ascribed to a reported variant, calculating physical and genetic distances for all genes within 1 cM of the tagging polymorphism. Results revealed 55 SNPs where recombination was lower between the identified SNP and a physically more remote gene than initially reported, and 374 where an alternative gene was genetically and physically closer than the reported gene. Analyses show potential for genomic confounding through false inferences of variant association to a gene. An online visualization tool (http://gcb.genes.org.uk/) was developed to plot genes by physical and genetic distance relative to a variant, along with LD data.


Introduction

Genome-wide association studies (GWAS) have rapidly become the most popular way of searching for genetic associations with many human biological or disease traits. An online catalogue of genetic associations, genome.gov (Hindorff et al., 2009), yielded 555 papers and 2725 SNPs (as of May 2010). These SNPs have been reported to be associated with 326 phenotypic traits and outcomes, such as height, cognitive performance, type 1 and type 2 diabetes, Crohn's disease and prostate cancer. Commonly, these studies test around 500,000 SNPs for differential representation of alleles between cases and controls.

Each reported SNP indicates association between a marker in the genome and the observed trait, suggesting that an underlying causal pathway might be related to the reported SNP, either directly through functionality exhibited by the SNP itself or through linkage disequilibrium (LD) to other SNPs not genotyped, where a causal SNP has an effect on the functionality of a genomic region. Many catalogued associations show reported SNPs to be marking functionality in nearby regions of the genome, commonly for one gene (Reveille et al., 2010; Turnbull et al., 2010), but also for multiple genes (Newton-Cheh et al., 2009; Köttgen et al., 2010) or intergenic regions (Samani et al., 2007; McKay et al., 2008; Gudmundsson et al., 2009) where there is ambiguity. Often these genes are selected either because of a robust association to the reported SNP or because of plausible functionality in some causal pathway, but sometimes these genes are merely selected based on physical proximity to the reported SNP. In some instances there is no gene at all nearby, but a very strong candidate has been observed a long physical distance away (e.g., MC4R (Chambers et al., 2008; Loos et al., 2008), IRS1 (Rung et al., 2009)), indicative that long range effects extending up to at least a megabase may be prevalent in the genome. Whilst causal functionality for a reported SNP is likely to be physically close, there is the potential for the SNP to mark functionality that is physically further away than other possible known functionality that exists nearby. Such links may be made through LD or through genetic recombination, where chromosomal regions with lower recombination rates are more likely to be conserved than those with higher recombination rates. 
Misleading associations may arise in regions where recombination in one direction (upstream or downstream) from the SNP is dramatically higher than in the other, such that genes physically further from the SNP in one direction are at a much smaller genetic distance than physically closer genes in the other direction. Setting aside locus control regions, which could avoid this measurement issue, the assumption that the physically closest genes are the most strongly associated with a SNP of interest may be invalid, leading physically close genes to be misreported as associated with a disease or trait.

This misinterpretation of the causal pathway, highlighted by a study of the angiotensin I-converting enzyme (ACE) gene, has been termed “genomic confounding” (Huang et al., 2007). This paper showed that ACE levels were significantly associated with both an insertion/deletion (I/D) polymorphism in the ACE gene itself and a marker BglII-B in the GH-CSH gene cluster, containing five genes about 370 kb away on chromosome 17. This study demonstrated that a substantial positive association between GH-CSH BglII-B and serum ACE activity was due to LD between the ACE I/D and the GH-CSH BglII-B SNP as the findings showed no association between BglII-B and left ventricular mass (LVM) change. It was concluded that LVM association with ACE I/D did not represent a GH-CSH causal effect, but other phenotypes that are associated with the ACE I/D merit further investigation for causal effects in the GH-CSH cluster to avoid misinterpreting the chain of causality. Our prediction is that there are other GWAS associations where genes are reported to be associated with some phenotype (suggesting a causal pathway intersecting that gene), but potential for an alternative causal pathway exists, perhaps through genes where LD information is not publicly available.

Establishing the molecular mechanism of genetic causality is difficult and has only been achieved for a very small number of complex trait risk genotypes, largely where a protein variant is directly causal (Gloyn & McCarthy, 2010). Even then, there may exist alternative possible mechanisms and genes within the risk LD block (Day et al., 2006). This study highlights reported genes for which a tagging SNP may be more closely associated with other genes through a lower genetic distance (indicating less frequent recombination and higher LD). Using an online catalogue of genome-wide association studies (Hindorff et al., 2009), we systematically calculated physical and genetic distances for all genes up to a threshold genetic (recombination) distance of 1cM away from an associated SNP and reported all current cases where potential genomic confounding exists. In addition we created an online genomic confounding browser (GCB), to compare genetic and physical distances along with available LD data for a SNP of interest. This tool may serve as a useful resource for researchers assessing genetic causality and genomic confounding.

Materials and Methods

Searching GWAS Results

Genetic associations were collected from the online catalogue of genome-wide association studies (http://www.genome.gov/gwastudies). Genetic recombination data were downloaded in bulk from HapMap (Frazer et al., 2007). Gene transcript and SNP data were downloaded through the Ensembl API (Flicek et al., 2008) and gene names were checked using the HUGO Gene Nomenclature Database (Eyre et al., 2006). All data were stored in relational MySQL (software version MySQL 5.0.67–0ubuntu6) tables. Using chromosome positions from dbSNP, each SNP in the GWAS catalogue was searched against gene transcript data from Ensembl. The closest gene transcript position was recorded for all genes overlapping a 1 cM flanking region of the SNP, and recombination scores were calculated by taking the difference between the HapMap recombination value at the SNP and at the gene. The 1 cM window covers a region in which 1/100 meioses would be expected to be recombinant, and thus should contain most common, nonrecent variation that could account for a SNP association. Where no recombination value was available for a given position in HapMap, the value was linearly interpolated from the two flanking scores and their chromosomal positions. For each variant, genes were ranked in ascending order of recombination score until a reported gene was returned; all genes returned before that reported gene were marked as potential confounders, and any lying at a greater physical distance than any one reported gene was marked as a strong candidate.
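The interpolation and ranking steps described above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in the study (which stored its data in MySQL): the function names, the representation of the genetic map as two parallel sorted lists, and the decision to return an empty list when no reported gene falls inside the window are all assumptions made for the example.

```python
from bisect import bisect_left


def interpolate_cm(position, map_positions, map_cm):
    """Cumulative genetic map value (cM) at `position` (bp).

    `map_positions` must be sorted; `map_cm` holds the map value at each
    position. Positions between two map points are linearly interpolated.
    """
    i = bisect_left(map_positions, position)
    if i < len(map_positions) and map_positions[i] == position:
        return map_cm[i]  # exact map point, no interpolation needed
    if i == 0 or i == len(map_positions):
        raise ValueError("position lies outside the genetic map")
    x0, x1 = map_positions[i - 1], map_positions[i]
    y0, y1 = map_cm[i - 1], map_cm[i]
    return y0 + (y1 - y0) * (position - x0) / (x1 - x0)


def genetic_distance(snp_pos, gene_pos, map_positions, map_cm):
    """Recombination score: difference in map value between SNP and gene."""
    return abs(interpolate_cm(gene_pos, map_positions, map_cm)
               - interpolate_cm(snp_pos, map_positions, map_cm))


def potential_confounders(gene_distances, reported, window_cm=1.0):
    """Genes ranked ahead of the first reported gene by genetic distance.

    `gene_distances` maps gene name -> genetic distance (cM) from the SNP.
    Assumption: if no reported gene lies within the window, nothing is flagged.
    """
    ranked = sorted(
        (item for item in gene_distances.items() if item[1] <= window_cm),
        key=lambda item: item[1],
    )
    before_reported = []
    for name, _ in ranked:
        if name in reported:
            return before_reported
        before_reported.append(name)
    return []


# Genetic distances quoted in the Discussion for rs2807580.
print(potential_confounders({"NEK6": 0.426, "LHX2": 0.297}, {"NEK6"}))  # ['LHX2']
```

Flagging a "strong candidate" would additionally compare the physical distance of each flagged gene with that of the reported genes.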

Online Genomic Confounding Browser

A graphical viewer was developed using Perl and JavaScript to enable analysis of genomic confounding for a SNP of interest (for color images see Supporting Information). Upon user input of a dbSNP identification number, Perl scripts search for Ensembl/Vega gene transcripts falling within a 1 cM flanking region of the provided SNP. Measurements are then calculated for recombination between the nearest point of a gene and the SNP (the genetic distance), as well as the distance on the human genome sequence (the physical distance). Physical distance is derived using Ensembl/Vega and dbSNP positions accessed through the Ensembl API (Flicek et al., 2008). Recombination is derived from HapMap recombination scores (Frazer et al., 2007) downloaded in bulk, taking the difference between the scores at the two positions used for physical distance. Genes are plotted in a graphical viewer using a custom JavaScript library, developed using jQuery (http://www.jquery.com/) and Flot (http://code.google.com/p/flot/). Two views are available: Physical Distance (where the x-axis is the chromosomal position in base pairs and the y-axis is a recombination interval in centimorgans) and Genetic Distance (where the x-axis is recombination upstream or downstream of the SNP and the y-axis is physical distance in base pairs). Genes appear in two rows, where the top row is the forward strand and the bottom row is the reverse strand. In the Physical Distance view (see Fig. 1), a red line is plotted for each available recombination change relative to the SNP of interest. In the Genetic Distance view, a red line is plotted for change in physical distance relative to the SNP of interest. Data are also gathered for HapMap-CEU LD scores (Frazer et al., 2007) and plotted for D’ (light green) and r² (dark green), to assist in the analysis of genomic confounding between genes where recombination is similar but LD is variable. The user may select either view by clicking on the “Genetic Distance” or “Physical Distance” buttons, and may zoom in horizontally by dragging from one point anywhere in the graph to another.
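As a rough illustration of the two measurements and the two views, the following Python sketch computes the distance from a SNP to the nearest point of a gene and builds the point series for each plot. It is not the browser's Perl/JavaScript code: the function names, the `cm_at` callable (standing in for a HapMap map lookup) and the use of signed cM values on the Genetic Distance x-axis are assumptions made for the example.

```python
def nearest_gene_point(snp_pos, gene_start, gene_end):
    """Position (bp) on the gene closest to the SNP."""
    if gene_start <= snp_pos <= gene_end:
        return snp_pos  # SNP lies inside the gene
    return gene_start if snp_pos < gene_start else gene_end


def distances_to_gene(snp_pos, gene_start, gene_end, cm_at):
    """Return (physical distance in bp, genetic distance in cM).

    `cm_at` is any callable mapping a bp position to a cumulative
    genetic map value in cM (hypothetical helper).
    """
    point = nearest_gene_point(snp_pos, gene_start, gene_end)
    physical_bp = abs(point - snp_pos)
    genetic_cm = abs(cm_at(point) - cm_at(snp_pos))
    return physical_bp, genetic_cm


def view_series(map_points, snp_pos, snp_cm):
    """Build (x, y) points for both views from (position_bp, cm) map points.

    Physical Distance view: x = chromosomal position (bp),
                            y = recombination relative to the SNP (cM).
    Genetic Distance view:  x = recombination relative to the SNP (cM,
                            negative upstream, positive downstream),
                            y = physical distance from the SNP (bp).
    """
    physical_view = [(pos, abs(cm - snp_cm)) for pos, cm in map_points]
    genetic_view = [(cm - snp_cm, abs(pos - snp_pos)) for pos, cm in map_points]
    return physical_view, genetic_view


# Toy map with a uniform rate of 1 cM per Mb (illustrative values only).
toy_cm_at = lambda pos: pos / 1_000_000
print(distances_to_gene(500_000, 600_000, 650_000, toy_cm_at))
```

In a region of uniform recombination the two views carry the same information; they diverge where the recombination rate differs sharply on either side of the SNP, which is the situation the browser is designed to expose.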

Figure 1. Example output from the genomic confounding browser (GCB). This shows a graphical plot of Physical Distance (bp) against a 0 to 1 scale on the y-axis. The plot is centered on a user-input SNP and scores for relative cumulative recombination (cM) and linkage disequilibrium (r²) are plotted against the y-axis. Genes are shown as shaded regions along the x-axis and split by strand, with the forward strand on the upper half of the plot and the reverse on the lower.

Results

Our scan of 1907 SNPs reported as top hits in genome-wide association studies discovered 589 associated SNPs from 156 studies for which alternative genes to those reported may be tagged (for complete list see Table S1). In 54 of these studies, the alternative gene lies physically further from the associated SNP than any of the reported genes, but closer in genetic distance (Table S2). This set was followed up by literature mining to discover any associations where the alternative gene might explain some functional role in the associated trait. Within our search of catalogued associated SNPs, 11 SNPs were not found in our downloaded SNP list; no genes were found within the 1 cM threshold for 15 SNPs; and for 192 SNP-gene associations, the associated gene (or gene alias) was not found within 1 cM of the SNP. A summary of SNP-gene association sets is shown in Table 1. The distribution of the four result subsets, based on the positions of the closest genes in relation to the reported genes, showed little correlation between genetic and physical distances (plots are available in Figs. S1 and S2).

Table 1. Summary of GWAS results analysis.

                                          Type 1 (a)     Type 2 (b)     Type 3 (c)     Type 4 (d)
SNPs                                      1115534156589
Papers                                    1288591178675
Genes                                     14591471443774
Mean physical distance, bp                47,331.69      93,882.30      346,319.03     219,524.32
  (std dev)                               (137,612.98)   (151,594.71)   (638,417.10)   (344,594.62)
Mean genetic distance, cM                 0.05           0.08           0.18           0.21
  (std dev)                               (0.1)          (0.17)         (0.2)          (0.3)
Mean change in physical distance, Δbp     0              112,475.98     177,418.44     116,721.45
  (std dev)                               (0)            (194,015.87)   (442,456.37)   (250,177.98)
Mean change in genetic distance, ΔcM      0              0.11           0.06           0.09
  (std dev)                               (0)            (0.17)         (0.09)         (0.15)

  (a) SNP-gene pairs for reported genes that are closest by both genetic and physical distance.
  (b) SNP-gene pairs for alternative genes to any reported that are closest by both genetic and physical distance.
  (c) SNP-gene pairs for alternative genes to any reported that are closest by genetic but not physical distance.
  (d) SNP-gene pairs for reported genes that are not closest by genetic distance.
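One possible reading of the four categories in the table footnotes is sketched below in Python. The function name and input layout are hypothetical, ties are broken arbitrarily, and the case of a reported gene that is closest genetically but not physically is left unlabelled because the footnotes do not define it. The absolute physical distances in the example are invented; only the 36,529 bp difference and the two genetic distances come from the Discussion.

```python
def classify_pairs(genes, reported):
    """Label SNP-gene pairs as Type 1-4 following the Table 1 footnotes.

    `genes` maps gene name -> (physical distance in bp, genetic distance in cM)
    for every gene in the window around one SNP; `reported` is the set of
    gene names reported for that SNP. Returns {gene name: type label}.
    """
    closest_genetic = min(genes, key=lambda name: genes[name][1])
    closest_physical = min(genes, key=lambda name: genes[name][0])
    labels = {}
    if closest_genetic in reported:
        # Type 1: a reported gene is closest on both measures.
        if closest_genetic == closest_physical:
            labels[closest_genetic] = "Type 1"
        return labels
    # An alternative gene is genetically closest to the SNP.
    if closest_genetic == closest_physical:
        labels[closest_genetic] = "Type 2"  # closest on both measures
    else:
        labels[closest_genetic] = "Type 3"  # closest genetically only
    # Type 4: reported genes that are not closest by genetic distance.
    for name in sorted(reported):
        if name in genes:
            labels[name] = "Type 4"
    return labels


# rs2807580: LHX2 is 36,529 bp further away than NEK6 but genetically closer.
# The absolute bp values below are placeholders, not figures from the study.
example = {"NEK6": (100_000, 0.426), "LHX2": (136_529, 0.297)}
print(classify_pairs(example, {"NEK6"}))  # {'LHX2': 'Type 3', 'NEK6': 'Type 4'}
```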

Discussion

Our results suggest genomic confounding to be a common issue amongst GWAS results, where 31% of reported genes have alternative genes that are potentially more related to the associated SNP. A further 26% of these SNPs were found to have physical distances greater than the distance to the reported gene, suggesting a potential for confounding. These associated SNPs were found to be associated with phenotypic traits such as AIDS progression, body mass index, blood pressure and height.

One example which highlights this potential for genomic confounding is the association reported between an intergenic SNP (rs2807580) and cognitive performance, reported as marking functionality in NIMA (never in mitosis gene a) related kinase 6 (NEK6) (Cirulli et al., 2010). We found an alternative gene, LIM homeobox 2 (LHX2), some 36,529 base pairs further away than NEK6. LHX2 had a much lower genetic distance (0.297cM) when compared to NEK6 (0.426cM). NEK6 is a human homolog of a gene discovered in Aspergillus nidulans required for entry into mitosis through regulation of chromatin condensation in mammalian cells (Hashimoto et al., 2002). LHX2 is a putative transcription factor (Wu et al., 1996) and determinant of cerebral cortical identity (Mangale et al., 2008) highly expressed in observed cases of chronic myelogenous leukemia (CML) (Wu et al., 1996). Other studies have suggested that LHX2 may be involved in the control of cell differentiation in the development of lymphoid and neural cell types (UniProtKB/Swiss-Prot: LHX2_HUMAN, P50458). This suggested link to neural cell development and evidence of function in cortical identity suggest that this gene might be a plausible candidate for function in cognitive performance that could potentially confound the association with cognitive performance ascribed to NEK6.

A recent GWAS showed dermatopontin (DPT) to be associated with ageing phenotypes and morbidity-free survival (Lunetta et al., 2007). DPT encodes an extracellular matrix protein with potential functions in cell-matrix interactions and assembly and has been shown to enhance TGF-beta activity and inhibit cell proliferation (Catherino et al., 2004; Superti-Furga et al., 1993). In this case, an alternative gene, chemokine (C motif) ligand 1 (XCL1), lies 21,270 base pairs further away from the associated SNP, rs1412337, than DPT. Less recombination occurs between XCL1 and rs1412337 than between DPT and rs1412337, with a difference of 0.015 cM. XCL1 is reported to control chemotactic activity for lymphocytes and may therefore exhibit an effect on overall immunity in the body and ultimately morbidity-free survival (Kelner et al., 1994; Kennedy et al., 1995).

Another association has been reported for the major histocompatibility complex, class I, C (HLA-C) gene with psoriasis and psoriatic arthritis (Liu et al., 2008). This SNP, rs2395029, results in the G2V polymorphism of the class I gene HCP5 (HLA complex P5), an endogenous retroviral element previously associated with HIV-1 control and AIDS progression (Fellay et al., 2009; Limou et al., 2009). Although Liu et al. (2008) only refer to HCP5 in relation to this SNP, the catalogue of genome-wide association studies (Hindorff et al., 2009) shows the reported SNP as being related to HLA-C. Strong LD in the HLA region is a widely known problem illustrated by haemochromatosis (the gene for which is HFE), where recombinational heterogeneity misled researchers to focus on the nearby gene HLA-A (4.6 Mb from HFE), potentially resulting in a decade of searching longer than necessary for the causative gene (Lonjou et al., 1998). In either case, other genes exist at a lower genetic distance and higher LD such that the causal pathway may actually involve other genes. One such gene is lymphotoxin alpha (LTA), a member of the tumour necrosis factor (TNF) family and a cytokine produced by lymphocytes (Lo et al., 2007). LTA mediates a wide variety of inflammatory, immunostimulatory and antiviral responses and has been previously associated with psoriatic arthritis susceptibility through another polymorphism correlating with the presence and progression of joint erosions as well as the lowest mean age of onset of psoriasis (Balding et al., 2003).

A final SNP example, rs1109670, has been reported to be associated with multiple sclerosis (MS) (Baranzini et al., 2009). In this study, the associated gene was reported as ArfGAP with SH3 domain, ankyrin repeat and PH domain 2 (ASAP2), a multidomain protein activating the small GTPases ARF1, ARF5 and ARF6 (Ishikawa et al., 1997; Andreev et al., 1999). This activity mediates vesicle budding when recruited to Golgi membranes. Additionally, ASAP2 functions as a substrate and downstream target for PYK2 and SRC, a pathway potentially involved in regulating vesicular transport. Our results show another gene, kinase D-interacting substrate, 220 kDa (KIDINS220), that lies 175,452 base pairs further away from rs1109670 than ASAP2, but shows a slightly lower genetic distance of 0.312 cM (0.011 cM less). This gene is selectively expressed in brain and neuroendocrine cells, promoting a prolonged MAP-kinase signaling by neurotrophins through activation of a Rap1-dependent mechanism (Iglesias et al., 2000; Liao et al., 2007). KIDINS220 is suggested to affect neurotrophin- and ephrin-mediated neuronal outgrowth and axon guidance during neuronal development and neuronal regeneration (Bracale et al., 2007). Functionality in KIDINS220 may be tagged by rs1109670, facilitating development of MS through differential neuronal development, as opposed to ASAP2 which may affect MS through cell membrane mediation.

Our analyses highlight the potential for misleading inferences and follow-up studies through misreporting of genes tagged by an associated marker, which may hinder the discovery of true causal pathways. Evaluating such parallel genotype-phenotype associations is dependent on LD, but is also relevant to techniques dependent on genetic associations, such as Mendelian Randomization (Fig. 2) (Ebrahim & Davey Smith, 2008) and genotype ratio treatment indexing (Davies et al., 2011). We also present an online tool for the analysis of genomic confounding around a SNP of interest to aid in reducing confounding of future association studies.

Figure 2. Diagram showing genomic confounding. The direction of causality between the intermediate trait and the outcome is inferred by the connection between Gene 1 and the observed outcome, determined by the marker which may tag polymorphism in Gene 1. An alternative intermediate trait associated with Gene 2 may also associate with the outcome, whilst Gene 2 is transitively associated via LD, causing confounding for the original intermediate trait.

Funding

This work was supported by the UK Medical Research Council (MRC Capacity Building Studentship in Bioinformatics to author CR). CR is a member of the MRC funded Bristol Centre for Systems Biomedicine (BCSBmed) doctoral training centre (director INMD).

References

  • Andreev, J., Simon, J. P., Sabatini, D. D., Kam, J., Plowman, G., Randazzo, P. A. & Schlessinger, J. (1999) Identification of a new Pyk2 target protein with Arf-GAP activity. Mol Cell Biol 19, 2338–2350.
  • Balding, J., Kane, D., Livingstone, W., Mynett-Johnson, L., Bresnihan, B., Smith, O. & FitzGerald, O. (2003) Cytokine gene polymorphisms: Association with psoriatic arthritis susceptibility and severity. Arthritis Rheum 48, 1408–1413.
  • Baranzini, S. E., Wang, J., Gibson, R. A., Galwey, N., Naegelin, Y., Barkhof, F., Radue, E., Lindberg, R. L. P., Uitdehaag, B. M. G., Johnson, M. R., Angelakopoulou, A., Hall, L., Richardson, J. C., Prinjha, R. K., Gass, A., Geurts, J. J. G., Kragt, J., Sombekke, M., Vrenken, H., Qualley, P., Lincoln, R. R., Gomez, R., Caillier, S. J., George, M. F., Mousavi, H., Guerrero, R., Okuda, D. T., Cree, B. A. C., Green, A. J., Waubant, E., Goodin, D. S., Pelletier, D., Matthews, P. M., Hauser, S. L., Kappos, L., Polman, C. H. & Oksenberg, J. R. (2009) Genome-wide association analysis of susceptibility and clinical phenotype in multiple sclerosis. Hum Mol Genet 18, 767–778.
  • Bracale, A., Cesca, F., Neubrand, V. E., Newsome, T. P., Way, M. & Schiavo, G. (2007) Kidins220/ARMS is transported by a kinesin-1-based mechanism likely to be involved in neuronal differentiation. Mol Biol Cell 18, 142–152.
  • Catherino, W. H., Leppert, P. C., Stenmark, M. H., Payson, M., Potlog-Nahari, C., Nieman, L. K. & Segars, J. H. (2004) Reduced dermatopontin expression is a molecular link between uterine leiomyomas and keloids. Genes Chromosomes Cancer 40, 204–217.
  • Chambers, J. C., Elliott, P., Zabaneh, D., Zhang, W., Li, Y., Froguel, P., Balding, D., Scott, J. & Kooner, J. S. (2008) Common genetic variation near MC4R is associated with waist circumference and insulin resistance. Nat Genet 40, 716–718.
  • Cirulli, E. T., Kasperavičiūtė, D., Attix, D. K., Need, A. C., Ge, D., Gibson, G. & Goldstein, D. B. (2010) Common genetic variation and performance on standardized cognitive tests. Eur J Hum Genet 18, 815–820. Available at http://www.ncbi.nlm.nih.gov/pubmed/20125193.
  • Davies, N. M., Windmeijer, F., Martin, R. M., Abdollahi, M. R., Smith, G. D., Lawlor, D. A., Ebrahim, S. & Day, I. N. (2011) Use of genotype frequencies in medicated groups to investigate prescribing practice: APOE and statins as a proof of principle. Clin Chem 57, 502–510.
  • Day, I. N. M., Rodriguez, S., Královicová, J., Wood, P. J., Vorechovsky, I. & Gaunt, T. R. (2006) Questioning INS VNTR role in obesity and diabetes: Subclasses tag IGF2-INS-TH haplotypes; and -23HphI as a STEP (splicing and translational efficiency polymorphism). Physiol Genomics 28, 113.
  • Ebrahim, S. & Davey Smith, G. (2008) Mendelian randomization: can genetic epidemiology help redress the failures of observational epidemiology? Hum Genet 123, 15–33.
  • Eyre, T. A., Ducluzeau, F., Sneddon, T. P., Povey, S., Bruford, E. A. & Lush, M. J. (2006) The HUGO Gene Nomenclature Database, 2006 updates. Nucleic Acids Res 34, D319–321.
  • Fellay, J., Ge, D., Shianna, K. V., Colombo, S., Ledergerber, B., Cirulli, E. T., Urban, T. J., Zhang, K., Gumbs, C. E., Smith, J. P., Castagna, A., Cozzi-Lepri, A., De Luca, A., Easterbrook, P., Günthard, H. F., Mallal, S., Mussini, C., Dalmau, J., Martinez-Picado, J., Miro, J. M., Obel, N., Wolinsky, S. M., Martinson, J. J., Detels, R., Margolick, J. B., Jacobson, L. P., Descombes, P., Antonarakis, S. E., Beckmann, J. S., O’Brien, S. J., Letvin, N. L., McMichael, A. J., Haynes, B. F., Carrington, M., Feng, S., Telenti, A. & Goldstein, D. B. (2009) Common genetic variation and the control of HIV-1 in humans. PLoS Genet 5, e1000791.
  • Flicek, P., Aken, B. L., Beal, K., Ballester, B., Caccamo, M., Chen, Y., Clarke, L., Coates, G., Cunningham, F., Cutts, T., Down, T., Dyer, S. C., Eyre, T., Fitzgerald, S., Fernandez-Banet, J., Gräf, S., Haider, S., Hammond, M., Holland, R., Howe, K. L., Howe, K., Johnson, N., Jenkinson, A., Kähäri, A., Keefe, D., Kokocinski, F., Kulesha, E., Lawson, D., Longden, I., Megy, K., Meidl, P., Overduin, B., Parker, A., Pritchard, B., Prlic, A., Rice, S., Rios, D., Schuster, M., Sealy, I., Slater, G., Smedley, D., Spudich, G., Trevanion, S., Vilella, A. J., Vogel, J., White, S., Wood, M., Birney, E., Cox, T., Curwen, V., Durbin, R., Fernandez-Suarez, X. M., Herrero, J., Hubbard, T. J. P., Kasprzyk, A., Proctor, G., Smith, J., Ureta-Vidal, A. & Searle, S. (2008) Ensembl 2008. Nucleic Acids Res 36, D707–714.
  • Frazer, K. A., Ballinger, D. G., Cox, D. R., Hinds, D. A., Stuve, L. L., Gibbs, R. A., Belmont, J. W., Boudreau, A., Hardenbol, P., Leal, S. M., Pasternak, S., Wheeler, D. A., Willis, T. D., Yu, F., Yang, H., Zeng, C., Gao, Y., Hu, H., Hu, W., Li, C., Lin, W., Liu, S., Pan, H., Tang, X., Wang, J., Wang, W., Yu, J., Zhang, B., Zhang, Q., Zhao, H., Zhao, H., Zhou, J., Gabriel, S. B., Barry, R., Blumenstiel, B., Camargo, A., Defelice, M., Faggart, M., Goyette, M., Gupta, S., Moore, J., Nguyen, H., Onofrio, R. C., Parkin, M., Roy, J., Stahl, E., Winchester, E., Ziaugra, L., Altshuler, D., Shen, Y., Yao, Z., Huang, W., Chu, X., He, Y., Jin, L., Liu, Y., Shen, Y., Sun, W., Wang, H., Wang, Y., Wang, Y., Xiong, X., Xu, L., Waye, M. M. Y., Tsui, S. K. W., Xue, H., Wong, J. T., Galver, L. M., Fan, J., Gunderson, K., Murray, S. S., Oliphant, A. R., Chee, M. S., Montpetit, A., Chagnon, F., Ferretti, V., Leboeuf, M., Olivier, J., Phillips, M. S., Roumy, S., Sallée, C., Verner, A., Hudson, T. J., Kwok, P., Cai, D., Koboldt, D. C., Miller, R. D., Pawlikowska, L., Taillon-Miller, P., Xiao, M., Tsui, L., Mak, W., Song, Y. Q., Tam, P. K. H., Nakamura, Y., Kawaguchi, T., Kitamoto, T., Morizono, T., Nagashima, A., Ohnishi, Y., Sekine, A., Tanaka, T., Tsunoda, T., Deloukas, P., Bird, C. P., Delgado, M., Dermitzakis, E. T., Gwilliam, R., Hunt, S., Morrison, J., Powell, D., Stranger, B. E., Whittaker, P., Bentley, D. R., Daly, M. J., de Bakker, P. I. W., Barrett, J., Chretien, Y. R., Maller, J., McCarroll, S., Patterson, N., Pe’er, I., Price, A., Purcell, S., Richter, D. J., Sabeti, P., Saxena, R., Schaffner, S. F., Sham, P. C., Varilly, P., Altshuler, D., Stein, L. D., Krishnan, L., Smith, A. V., Tello-Ruiz, M. K., Thorisson, G. A., Chakravarti, A., Chen, P. E., Cutler, D. J., Kashuk, C. S., Lin, S., Abecasis, G. R., Guan, W., Li, Y., Munro, H. M., Qin, Z. S., Thomas, D. J., McVean, G., Auton, A., Bottolo, L., Cardin, N., Eyheramendy, S., Freeman, C., Marchini, J., Myers, S., Spencer, C., Stephens, M., Donnelly, P., Cardon, L. R., Clarke, G., Evans, D. M., Morris, A. P., Weir, B. S., Tsunoda, T., Mullikin, J. C., Sherry, S. T., Feolo, M., Skol, A., Zhang, H., Zeng, C., Zhao, H., Matsuda, I., Fukushima, Y., Macer, D. R., Suda, E., Rotimi, C. N., Adebamowo, C. A., Ajayi, I., Aniagwu, T., Marshall, P. A., Nkwodimmah, C., Royal, C. D. M., Leppert, M. F., Dixon, M., Peiffer, A., Qiu, R., Kent, A., Kato, K., Niikawa, N., Adewole, I. F., Knoppers, B. M., Foster, M. W., Clayton, E. W., Watkin, J., Gibbs, R. A., Belmont, J. W., Muzny, D., Nazareth, L., Sodergren, E., Weinstock, G. M., Wheeler, D. A., Yakub, I., Gabriel, S. B., Onofrio, R. C., Richter, D. J., Ziaugra, L., Birren, B. W., Daly, M. J., Altshuler, D., Wilson, R. K., Fulton, L. L., Rogers, J., Burton, J., Carter, N. P., Clee, C. M., Griffiths, M., Jones, M. C., McLay, K., Plumb, R. W., Ross, M. T., Sims, S. K., Willey, D. L., Chen, Z., Han, H., Kang, L., Godbout, M., Wallenburg, J. C., L’Archevêque, P., Bellemare, G., Saeki, K., Wang, H., An, D., Fu, H., Li, Q., Wang, Z., Wang, R., Holden, A. L., Brooks, L. D., McEwen, J. E., Guyer, M. S., Wang, V. O., Peterson, J. L., Shi, M., Spiegel, J., Sung, L. M., Zacharia, L. F., Collins, F. S., Kennedy, K., Jamieson, R. & Stewart, J. (2007) A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851–861.
  • Gloyn, A. L. & McCarthy, M. I. (2010) Variation across the allele frequency spectrum. Nat Genet 42, 648–650.
  • Gudmundsson, J., Sulem, P., Gudbjartsson, D. F., Blondal, T., Gylfason, A., Agnarsson, B. A., Benediktsdottir, K. R., Magnusdottir, D. N., Orlygsdottir, G., Jakobsdottir, M., Stacey, S. N., Sigurdsson, A., Wahlfors, T., Tammela, T., Breyer, J. P., McReynolds, K. M., Bradley, K. M., Saez, B., Godino, J., Navarrete, S., Fuertes, F., Murillo, L., Polo, E., Aben, K. K., van Oort, I. M., Suarez, B. K., Helfand, B. T., Kan, D., Zanon, C., Frigge, M. L., Kristjansson, K., Gulcher, J. R., Einarsson, G. V., Jonsson, E., Catalona, W. J., Mayordomo, J. I., Kiemeney, L. A., Smith, J. R., Schleutker, J., Barkardottir, R. B., Kong, A., Thorsteinsdottir, U., Rafnar, T. & Stefansson, K. (2009) Genome-wide association and replication studies identify four variants associated with prostate cancer susceptibility. Nat Genet 41, 1122–1126.
  • Hashimoto, Y., Akita, H., Hibino, M., Kohri, K. & Nakanishi, M. (2002) Identification and characterization of Nek6 protein kinase, a potential human homolog of NIMA histone H3 kinase. Biochem Biophys Res Commun 293, 753–758.
  • Hindorff, L. A., Sethupathy, P., Junkins, H. A., Ramos, E. M., Mehta, J. P., Collins, F. S. & Manolio, T. A. (2009) Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci U S A 106, 9362–9367.
  • Huang, S., Chen, X. H., Payne, J. R., Pennell, D. J., Gohlke, P., Smith, M. J., Day, I. N., Montgomery, H. E. & Gaunt, T. R. (2007) Haplotype of growth hormone and angiotensin I-converting enzyme genes, serum angiotensin I-converting enzyme and ventricular growth: Pathway inference in pharmacogenetics. Pharmacogenet Genomics 4, 291–294.
  • Iglesias, T., Cabrera-Poch, N., Mitchell, M. P., Naven, T. J., Rozengurt, E. & Schiavo, G. (2000) Identification and cloning of Kidins220, a novel neuronal substrate of protein kinase D. J Biol Chem 275, 40048–40056.
  • Ishikawa, K., Nagase, T., Nakajima, D., Seki, N., Ohira, M., Miyajima, N., Tanaka, A., Kotani, H., Nomura, N. & Ohara, O. (1997) Prediction of the coding sequences of unidentified human genes. VIII. 78 new cDNA clones from brain which code for large proteins in vitro. DNA Res 4, 307–313.
  • Kelner, G. S., Kennedy, J., Bacon, K. B., Kleyensteuber, S., Largaespada, D. A., Jenkins, N. A., Copeland, N. G., Bazan, J. F., Moore, K. W. & Schall, T. J. (1994) Lymphotactin: A cytokine that represents a new class of chemokine. Science 266, 1395–1399.
  • Kennedy, J., Kelner, G. S., Kleyensteuber, S., Schall, T. J., Weiss, M. C., Yssel, H., Schneider, P. V., Cocks, B. G., Bacon, K. B. & Zlotnik, A. (1995) Molecular cloning and functional characterization of human lymphotactin. J Immunol 155, 203–209.
  • Köttgen, A., Pattaro, C., Böger, C. A., Fuchsberger, C., Olden, M., Glazer, N. L., Parsa, A., Gao, X., Yang, Q., Smith, A. V., O’Connell, J. R., Li, M., Schmidt, H., Tanaka, T., Isaacs, A., Ketkar, S., Hwang, S., Johnson, A. D., Dehghan, A., Teumer, A., Paré, G., Atkinson, E. J., Zeller, T., Lohman, K., Cornelis, M. C., Probst-Hensch, N. M., Kronenberg, F., Tönjes, A., Hayward, C., Aspelund, T., Eiriksdottir, G., Launer, L. J., Harris, T. B., Rampersaud, E., Mitchell, B. D., Arking, D. E., Boerwinkle, E., Struchalin, M., Cavalieri, M., Singleton, A., Giallauria, F., Metter, J., de Boer, I. H., Haritunians, T., Lumley, T., Siscovick, D., Psaty, B. M., Zillikens, M. C., Oostra, B. A., Feitosa, M., Province, M., de Andrade, M., Turner, S. T., Schillert, A., Ziegler, A., Wild, P. S., Schnabel, R. B., Wilde, S., Munzel, T. F., Leak, T. S., Illig, T., Klopp, N., Meisinger, C., Wichmann, H., Koenig, W., Zgaga, L., Zemunik, T., Kolcic, I., Minelli, C., Hu, F. B., Johansson, A., Igl, W., Zaboli, G., Wild, S. H., Wright, A. F., Campbell, H., Ellinghaus, D., Schreiber, S., Aulchenko, Y. S., Felix, J. F., Rivadeneira, F., Uitterlinden, A. G., Hofman, A., Imboden, M., Nitsch, D., Brandstätter, A., Kollerits, B., Kedenko, L., Mägi, R., Stumvoll, M., Kovacs, P., Boban, M., Campbell, S., Endlich, K., Völzke, H., Kroemer, H. K., Nauck, M., Völker, U., Polasek, O., Vitart, V., Badola, S., Parker, A. N., Ridker, P. M., Kardia, S. L. R., Blankenberg, S., Liu, Y., Curhan, G. C., Franke, A., Rochat, T., Paulweber, B., Prokopenko, I., Wang, W., Gudnason, V., Shuldiner, A. R., Coresh, J., Schmidt, R., Ferrucci, L., Shlipak, M. G., van Duijn, C. M., Borecki, I., Krämer, B. K., Rudan, I., Gyllensten, U., Wilson, J. F., Witteman, J. C., Pramstaller, P. P., Rettig, R., Hastie, N., Chasman, D. I., Kao, W. H., Heid, I. M. & Fox, C. S. (2010) New loci associated with kidney function and chronic kidney disease. Nat Genet 42, 376–384.
  • Liao, Y., Hsu, S. & Huang, P. (2007) ARMS depletion facilitates UV irradiation induced apoptotic cell death in melanoma. Cancer Res 67, 11547–11556.
  • Limou, S., Le Clerc, S., Coulonges, C., Carpentier, W., Dina, C., Delaneau, O., Labib, T., Taing, L., Sladek, R., Deveau, C., Ratsimandresy, R., Montes, M., Spadoni, J., Lelièvre, J., Lévy, Y., Therwath, A., Schächter, F., Matsuda, F., Gut, I., Froguel, P., Delfraissy, J., Hercberg, S. & Zagury, J. (2009) Genomewide association study of an AIDS-nonprogression cohort emphasizes the role played by HLA genes (ANRS Genomewide Association Study 02). J Infect Dis 199, 419–426.
  • Liu, Y., Helms, C., Liao, W., Zaba, L. C., Duan, S., Gardner, J., Wise, C., Miner, A., Malloy, M. J., Pullinger, C. R., Kane, J. P., Saccone, S., Worthington, J., Bruce, I., Kwok, P., Menter, A., Krueger, J., Barton, A., Saccone, N. L. & Bowcock, A. M. (2008) A genome-wide association study of psoriasis and psoriatic arthritis identifies new disease loci. PLoS Genet 4, e1000041.
  • Lo, J. C., Wang, Y., Tumanov, A. V., Bamji, M., Yao, Z., Reardon, C. A., Getz, G. S. & Fu, Y. (2007) Lymphotoxin beta receptor-dependent control of lipid homeostasis. Science 316, 285–288.
  • Lonjou, C., Collins, A., Ajioka, R. S., Jorde, L. B., Kushner, J. P. & Morton, N. E. (1998) Allelic association under map error and recombinational heterogeneity: A tale of two sites. Proc Natl Acad Sci U S A 95, 11366–11370.
  • Loos, R. J. F., Lindgren, C. M., Li, S., Wheeler, E., Zhao, J. H., Prokopenko, I., Inouye, M., Freathy, R. M., Attwood, A. P., Beckmann, J. S., Berndt, S. I., Bergmann, S., Bennett, A. J., Bingham, S. A., Bochud, M., Brown, M., Cauchi, S., Connell, J. M., Cooper, C., Smith, G. D., Day, I., Dina, C., De, S., Dermitzakis, E. T., Doney, A. S. F., Elliott, K. S., Elliott, P., Evans, D. M., Sadaf Farooqi, I., Froguel, P., Ghori, J., Groves, C. J., Gwilliam, R., Hadley, D., Hall, A. S., Hattersley, A. T., Hebebrand, J., Heid, I. M., Herrera, B., Hinney, A., Hunt, S. E., Jarvelin, M., Johnson, T., Jolley, J. D. M., Karpe, F., Keniry, A., Khaw, K., Luben, R. N., Mangino, M., Marchini, J., McArdle, W. L., McGinnis, R., Meyre, D., Munroe, P. B., Morris, A. D., Ness, A. R., Neville, M. J., Nica, A. C., Ong, K. K., O’Rahilly, S., Owen, K. R., Palmer, C. N. A., Papadakis, K., Potter, S., Pouta, A., Qi, L., Randall, J. C., Rayner, N. W., Ring, S. M., Sandhu, M. S., Scherag, A., Sims, M. A., Song, K., Soranzo, N., Speliotes, E. K., Syddall, H. E., Teichmann, S. A., Timpson, N. J., Tobias, J. H., Uda, M., Ganz Vogel, C. I., Wallace, C., Waterworth, D. M., Weedon, M. N., Willer, C. J., Wraight, V. L., Yuan, X., Zeggini, E., Hirschhorn, J. N., Strachan, D. P., Ouwehand, W. H., Caulfield, M. J., Samani, N. J., Frayling, T. M., Vollenweider, P., Waeber, G., Mooser, V., Deloukas, P., McCarthy, M. I., Wareham, N. J. & Barroso, I. (2008) Common variants near MC4R are associated with fat mass, weight and risk of obesity. Nat Genet 40, 768–775.
  • Lunetta, K. L., D’Agostino, R. B., Karasik, D., Benjamin, E. J., Guo, C., Govindaraju, R., Kiel, D. P., Kelly-Hayes, M., Massaro, J. M., Pencina, M. J., Seshadri, S. & Murabito, J. M. (2007) Genetic correlates of longevity and selected age-related phenotypes: A genome-wide association study in the Framingham Study. BMC Med Genet 8(Suppl 1), S13.
  • Mangale, V. S., Hirokawa, K. E., Satyaki, P. R. V., Gokulchandran, N., Chikbire, S., Subramanian, L., Shetty, A. S., Martynoga, B., Paul, J., Mai, M. V., Li, Y., Flanagan, L. A., Tole, S. & Monuki, E. S. (2008) Lhx2 selector activity specifies cortical identity and suppresses hippocampal organizer fate. Science 319, 304–309.
  • McKay, J. D., Hung, R. J., Gaborieau, V., Boffetta, P., Chabrier, A., Byrnes, G., Zaridze, D., Mukeria, A., Szeszenia-Dabrowska, N., Lissowska, J., Rudnai, P., Fabianova, E., Mates, D., Bencko, V., Foretova, L., Janout, V., McLaughlin, J., Shepherd, F., Montpetit, A., Narod, S., Krokan, H. E., Skorpen, F., Elvestad, M. B., Vatten, L., Njølstad, I., Axelsson, T., Chen, C., Goodman, G., Barnett, M., Loomis, M. M., Lubiñski, J., Matyjasik, J., Lener, M., Oszutowska, D., Field, J., Liloglou, T., Xinarianos, G., Cassidy, A., Vineis, P., Clavel-Chapelon, F., Palli, D., Tumino, R., Krogh, V., Panico, S., González, C. A., Ramón Quirós, J., Martínez, C., Navarro, C., Ardanaz, E., Larrañaga, N., Kham, K. T., Key, T., Bueno-de-Mesquita, H. B., Peeters, P. H., Trichopoulou, A., Linseisen, J., Boeing, H., Hallmans, G., Overvad, K., Tjønneland, A., Kumle, M., Riboli, E., Zelenika, D., Boland, A., Delepine, M., Foglio, M., Lechner, D., Matsuda, F., Blanche, H., Gut, I., Heath, S., Lathrop, M. & Brennan, P. (2008) Lung cancer susceptibility locus at 5p15.33. Nat Genet 40, 1404–1406.
  • Newton-Cheh, C., Johnson, T., Gateva, V., Tobin, M. D., Bochud, M., Coin, L., Najjar, S. S., Zhao, J. H., Heath, S. C., Eyheramendy, S., Papadakis, K., Voight, B. F., Scott, L. J., Zhang, F., Farrall, M., Tanaka, T., Wallace, C., Chambers, J. C., Khaw, K., Nilsson, P., van der Harst, P., Polidoro, S., Grobbee, D. E., Onland-Moret, N. C., Bots, M. L., Wain, L. V., Elliott, K. S., Teumer, A., Luan, J., Lucas, G., Kuusisto, J., Burton, P. R., Hadley, D., McArdle, W. L., Brown, M., Dominiczak, A., Newhouse, S. J., Samani, N. J., Webster, J., Zeggini, E., Beckmann, J. S., Bergmann, S., Lim, N., Song, K., Vollenweider, P., Waeber, G., Waterworth, D. M., Yuan, X., Groop, L., Orho-Melander, M., Allione, A., Di Gregorio, A., Guarrera, S., Panico, S., Ricceri, F., Romanazzi, V., Sacerdote, C., Vineis, P., Barroso, I., Sandhu, M. S., Luben, R. N., Crawford, G. J., Jousilahti, P., Perola, M., Boehnke, M., Bonnycastle, L. L., Collins, F. S., Jackson, A. U., Mohlke, K. L., Stringham, H. M., Valle, T. T., Willer, C. J., Bergman, R. N., Morken, M. A., Döring, A., Gieger, C., Illig, T., Meitinger, T., Org, E., Pfeufer, A., Wichmann, H. E., Kathiresan, S., Marrugat, J., O’Donnell, C. J., Schwartz, S. M., Siscovick, D. S., Subirana, I., Freimer, N. B., Hartikainen, A., McCarthy, M. I., O’Reilly, P. F., Peltonen, L., Pouta, A., de Jong, P. E., Snieder, H., van Gilst, W. H., Clarke, R., Goel, A., Hamsten, A., Peden, J. F., Seedorf, U., Syvänen, A., Tognoni, G., Lakatta, E. G., Sanna, S., Scheet, P., Schlessinger, D., Scuteri, A., Dörr, M., Ernst, F., Felix, S. B., Homuth, G., Lorbeer, R., Reffelmann, T., Rettig, R., Völker, U., Galan, P., Gut, I. G., Hercberg, S., Lathrop, G. M., Zelenika, D., Deloukas, P., Soranzo, N., Williams, F. M., Zhai, G., Salomaa, V., Laakso, M., Elosua, R., Forouhi, N. G., Völzke, H., Uiterwaal, C. S., van der Schouw, Y. T., Numans, M. E., Matullo, G., Navis, G., Berglund, G., Bingham, S. A., Kooner, J. S., Connell, J. M., Bandinelli, S., Ferrucci, L., Watkins, H., Spector, T. D., Tuomilehto, J., Altshuler, D., Strachan, D. P., Laan, M., Meneton, P., Wareham, N. J., Uda, M., Jarvelin, M., Mooser, V., Melander, O., Loos, R. J. F., Elliott, P., Abecasis, G. R., Caulfield, M. & Munroe, P. B. (2009) Genome-wide association study identifies eight loci associated with blood pressure. Nat Genet 41, 666–676.
  • Reveille, J. D., Sims, A., Danoy, P., Evans, D. M., Leo, P., Pointon, J. J., Jin, R., Zhou, X., Bradbury, L. A., Appleton, L. H., Davis, J. C., Diekman, L., Doan, T., Dowling, A., Duan, R., Duncan, E. L., Farrar, C., Hadler, J., Harvey, D., Karaderi, T., Mogg, R., Pomeroy, E., Pryce, K., Taylor, J., Savage, L., Deloukas, P., Kumanduri, V., Peltonen, L., Ring, S. M., Whittaker, P., Glazov, E., Thomas, G. P., Maksymowych, W. P., Inman, R. D., Ward, M. M., Stone, M. A., Weisman, M. H., Wordsworth, B. P. & Brown, M. A. (2010) Genome-wide association study of ankylosing spondylitis identifies non-MHC susceptibility loci. Nat Genet 42, 123–127.
  • Rung, J., Cauchi, S., Albrechtsen, A., Shen, L., Rocheleau, G., Cavalcanti-Proenca, C., Bacot, F., Balkau, B., Belisle, A., Borch-Johnsen, K., Charpentier, G., Dina, C., Durand, E., Elliott, P., Hadjadj, S., Jarvelin, M., Laitinen, J., Lauritzen, T., Marre, M., Mazur, A., Meyre, D., Montpetit, A., Pisinger, C., Posner, B., Poulsen, P., Pouta, A., Prentki, M., Ribel-Madsen, R., Ruokonen, A., Sandbaek, A., Serre, D., Tichet, J., Vaxillaire, M., Wojtaszewski, J. F. P., Vaag, A., Hansen, T., Polychronakos, C., Pedersen, O., Froguel, P. & Sladek, R. (2009) Genetic variant near IRS1 is associated with type 2 diabetes, insulin resistance and hyperinsulinemia. Nat Genet 41, 1110–1115.
  • Samani, N. J., Erdmann, J., Hall, A. S., Hengstenberg, C., Mangino, M., Mayer, B., Dixon, R. J., Meitinger, T., Braund, P., Wichmann, H., Barrett, J. H., König, I. R., Stevens, S. E., Szymczak, S., Tregouet, D., Iles, M. M., Pahlke, F., Pollard, H., Lieb, W., Cambien, F., Fischer, M., Ouwehand, W., Blankenberg, S., Balmforth, A. J., Baessler, A., Ball, S. G., Strom, T. M., Braenne, I., Gieger, C., Deloukas, P., Tobin, M. D., Ziegler, A., Thompson, J. R. & Schunkert, H. (2007) Genomewide association analysis of coronary artery disease. N Engl J Med 357, 443–453.
  • Superti-Furga, A., Rocchi, M., Schäfer, B. W. & Gitzelmann, R. (1993) Complementary DNA sequence and chromosomal mapping of a human proteoglycan-binding cell-adhesion protein (dermatopontin). Genomics 17, 463–467.
  • Turnbull, C., Ahmed, S., Morrison, J., Pernet, D., Renwick, A., Maranian, M., Seal, S., Ghoussaini, M., Hines, S., Healey, C. S., Hughes, D., Warren-Perry, M., Tapper, W., Eccles, D., Evans, D. G., Hooning, M., Schutte, M., van den Ouweland, A., Houlston, R., Ross, G., Langford, C., Pharoah, P. D. P., Stratton, M. R., Dunning, A. M., Rahman, N. & Easton, D. F. (2010) Genome-wide association study identifies five new breast cancer susceptibility loci. Nat Genet 42, 504–507.
  • Wu, H. K., Heng, H. H., Siderovski, D. P., Dong, W. F., Okuno, Y., Shi, X. M., Tsui, L. C. & Minden, M. D. (1996) Identification of a human LIM-Hox gene, hLH-2, aberrantly expressed in chronic myelogenous leukaemia and located on 9q33–34.1. Oncogene 12, 1205–1212.

Supporting Information

Table S1 Complete list of variants with alternative candidate genes.

Table S2 Shortlist of variants where an alternative gene lies physically further from the variant than any reported gene, but genetically closer.

Figure S1 Plot of genetic distance against physical distance between a gene and an associated SNP for four sets of data (along with correlation coefficients, r).

Figure S2 Plot of change in genetic distance against change in physical distance between the reported gene and the alternative gene for two sets of data (along with correlation coefficients, r).

Filename                          Format  Size  Description
AHG_677_sm_FigureS1_legend.docx   docx    20K   Supporting info item
AHG_677_sm_FigureS2_legend.docx   docx    20K   Supporting info item
AHG_677_sm_FigureS1.tif           tif     144K  Supporting info item
AHG_677_sm_FigureS2.tif           tif     124K  Supporting info item
AHG_677_sm_TableS1.xls            xls     88K   Supporting info item
AHG_677_sm_TableS2.xls            xls     54K   Supporting info item