An Integrative Genetics Approach to Identify Candidate Genes Regulating BMD: Combining Linkage, Gene Expression, and Association


  • Charles R Farber,

    Corresponding author
    1. Department of Medicine, David Geffen School of Medicine at University of California, Los Angeles, California, USA
    • Address reprint requests to: Charles R Farber, PhD, Department of Medicine, University of Virginia, Charlottesville, VA 22908, USA
  • Atila van Nas,

    1. Department of Human Genetics, David Geffen School of Medicine at University of California, Los Angeles, California, USA
  • Anatole Ghazalpour,

    1. Department of Medicine, David Geffen School of Medicine at University of California, Los Angeles, California, USA
  • Jason E Aten,

    1. Department of Human Genetics, David Geffen School of Medicine at University of California, Los Angeles, California, USA
  • Sudheer Doss,

    1. Department of Medicine, David Geffen School of Medicine at University of California, Los Angeles, California, USA
  • Brandon Sos,

    1. Department of Medicine, David Geffen School of Medicine at University of California, Los Angeles, California, USA
  • Eric E Schadt,

    1. Rosetta Inpharmatics/Merck, Seattle, Washington, USA
    • Dr Schadt is an employee of Rosetta Inpharmatics, a wholly owned subsidiary of Merck and Co. All other authors state that they have no conflicts of interest.

  • Leslie Ingram-Drake,

    1. Department of Human Genetics, David Geffen School of Medicine at University of California, Los Angeles, California, USA
  • Richard C Davis,

    1. Department of Human Genetics, David Geffen School of Medicine at University of California, Los Angeles, California, USA
  • Steve Horvath,

    1. Department of Human Genetics, David Geffen School of Medicine at University of California, Los Angeles, California, USA
    2. Department of Biostatistics, School of Public Health, University of California, Los Angeles, California, USA
  • Desmond J Smith,

    1. Molecular and Medical Pharmacology, David Geffen School of Medicine at University of California, Los Angeles, California, USA
  • Thomas A Drake,

    1. Department of Pathology and Laboratory Medicine, David Geffen School of Medicine at University of California, Los Angeles, California, USA
  • Aldons J Lusis

    1. Department of Medicine, David Geffen School of Medicine at University of California, Los Angeles, California, USA
    2. Department of Human Genetics, David Geffen School of Medicine at University of California, Los Angeles, California, USA
    3. Department of Microbiology, Immunology, and Molecular Genetics, David Geffen School of Medicine at University of California, Los Angeles, California, USA
    4. Molecular Biology Institute, UCLA, Los Angeles, California, USA


Numerous quantitative trait loci (QTLs) affecting bone traits have been identified in the mouse; however, few of the underlying genes have been discovered. To improve the process of transitioning from QTL to gene, we describe an integrative genetics approach, which combines linkage analysis, expression QTL (eQTL) mapping, causality modeling, and genetic association in outbred mice. In C57BL/6J × C3H/HeJ (BXH) F2 mice, nine QTLs regulating femoral BMD were identified. To select candidate genes from within each QTL region, microarray gene expression profiles from individual F2 mice were used to identify 148 genes whose expression was correlated with BMD and regulated by local eQTLs. Many of the genes that were the most highly correlated with BMD have been previously shown to modulate bone mass or skeletal development. Candidates were further prioritized by determining whether their expression was predicted to underlie variation in BMD. Using network edge orienting (NEO), a causality modeling algorithm, 18 of the 148 candidates were predicted to be causally related to differences in BMD. To fine-map QTLs, markers in outbred MF1 mice were tested for association with BMD. Three chromosome 11 SNPs were identified that were associated with BMD within the Bmd11 QTL. Finally, our approach provides strong support for Wnt9a, Rasd1, or both underlying Bmd11. Integration of multiple genetic and genomic data sets can substantially improve the efficiency of QTL fine-mapping and candidate gene identification.


During the early 1990s, the development of genome-wide genetic markers made it possible to identify chromosomal regions (referred to as quantitative trait loci [QTLs]) harboring genetic variation contributing to common diseases such as osteoporosis. In the years since, numerous studies in humans and mice have identified QTLs affecting a wide spectrum of osteoporosis-related traits, including BMD. In the mouse, many strategies exist for QTL fine-mapping, including congenic strain analysis and recombinant progeny testing; however, these are time-consuming, laborious, and typically only intermediate steps in the gene identification process. With the advancement of genomic technologies, tools such as high-throughput genotyping and global gene expression profiling now provide the opportunity to significantly accelerate gene discovery.

A gene's expression represents a quantifiable intermediate phenotype and can provide key links between genetic perturbations and disease. Now, with the advent of DNA microarrays, gene expression can be quantified on a global scale. One exciting application of transcriptomics is its integration with traditional genetic analysis to study the “genetics of gene expression.” In the same way that QTLs are mapped for clinical traits, expression QTLs (eQTLs) can be mapped for transcript levels. eQTLs are classified as either local or distant, and these terms describe the proximity of eQTL and gene. Local eQTLs are located near the transcript they regulate, whereas distant eQTLs are removed (typically on different chromosomes) from their structural locus. We have shown that at least two thirds of local eQTLs represent allele-specific cis effects, and therefore, the genes they regulate are strong positional candidates for overlapping clinical trait QTLs.

The ability to establish causal links between gene expression and clinical traits is an additional advantage of measuring gene expression in a segregating population. In biological systems, the flow of cellular information always begins with DNA. This knowledge can be leveraged to orient the downstream relationships between genes and complex traits. Recently, causality modeling algorithms have been developed and shown effective in establishing causality in mouse crosses using microarray-generated gene expression profiles.

The main disadvantage of using traditional mapping populations, such as backcrosses, intercrosses, and recombinant inbred panels, is their lack of genetic resolution. QTLs identified in these crosses typically have confidence intervals in the range of 20–40 cM (∼40–80 Mbp), which correspond to regions containing hundreds of genes. Recently, several elegant studies have shown that association mapping in outbred and heterogeneous stock (HS) mice can provide substantial increases in mapping resolution. This increase is caused by the accumulation of recombinations over many generations of random breeding. HS mice have recently been used to map 843 QTLs for ∼100 human disease traits with an average 95% CI of 2.8 Mbp, showing the effectiveness of HS mice for high-resolution mapping. Therefore, the use of outbred mice offers substantial improvements in QTL localization, relative to traditional crosses, and makes downstream quantitative trait gene (QTG) identification much more rapid.

In this study, we outline an approach that integrates linkage in an F2 intercross, eQTL analysis, high-density SNP maps, causality modeling, and association in outbred mice to identify candidate genes for BMD. Although we have applied this approach to BMD, it is, in theory, extensible to the analysis of any complex clinical or gene expression trait.


Mapping populations

C57BL/6J × C3H/HeJ (BXH) F2 mice (N = 309, 164 males and 145 females) were generated by intercrossing F1s. Mice were fed a chow diet containing 4% fat (Ralston-Purina, St Louis, MO, USA) until 8 wk of age and were placed on a high-fat “Western” diet containing 42% fat and 0.15% cholesterol (Teklad 88137; Harlan Teklad, Madison, WI, USA) for 12 wk. At 20 wk, mice were killed after a 12-h fast, and adipose tissue was dissected, flash frozen in LN2, and stored at −80°C. Female MF1 mice (N = 97) were purchased from Harlan (Indianapolis, IN, USA) at ∼4–6 wk of age. MF1s were fed the same chow diet until 19 wk of age and the Western diet for 14 wk until they were killed at 35 wk. All mice were maintained on a 12-h light/dark cycle. All mouse protocols were managed according to the guidelines of the American Association for Accreditation of Laboratory Animal Care (AAALAC).


Genomic DNA was isolated from BXH F2 kidneys by phenol-chloroform extraction. Genotyping was conducted by ParAllele (South San Francisco, CA, USA) using the molecular-inversion probe (MIP) multiplex technique. MF1 genomic DNA was isolated from tail clips using the Qiagen DNeasy tissue kit (Qiagen, Valencia, CA, USA). Genotyping was conducted by Affymetrix (Santa Clara, CA, USA) using the Affymetrix GeneChip Mouse Mapping 5K SNP platform. SNPs in both populations were annotated using the NCBI Build 37.1 genome assembly.

BMD determination

All carcasses were stored at −20°C after death and thawed overnight at 4°C before BMD scans. The left and right femurs of BXH F2 mice were removed, partially defleshed, and scanned. For MF1 mice, the entire thawed carcass was scanned. BMD scans were performed using a Lunar PIXImus II Densitometer (GE Healthcare, Piscataway, NJ, USA). Positional scanning effects have been documented for PIXImus Densitometers. To determine whether scanning position affected the DXA measurements in this study, we recorded the x and y coordinates for each scan. Using ANOVA, we found no difference in BMD dependent on x-axis position; however, there was a small increase in BMD from top to bottom on the y-axis (right femur: r2 = 0.01, p = 0.04; left femur: r2 = 0.02, p = 0.006). To determine whether this significantly affected the linkage analysis, BMD (right and left femurs) was adjusted for y-axis position, and the average of the residuals was calculated. We performed the QTL analysis (methods described below) with y-axis adjusted and unadjusted BMD data; however, we observed no difference in the LOD score profiles. Therefore, for all the analyses described below, BMD was calculated as the average of the right and left femurs unadjusted for y-axis position effects. For the MF1 scans, BMD was calculated for the whole body, the average of both femurs, and the L2–L6 lumbar vertebrae. These data were not corrected for the position effect.

BMD linkage analysis

All statistical analyses for the project were performed using the R language and environment for statistical computing. Specifically, the R/qtl package was used for the linkage analysis. Initially, missing genotypes were imputed with the sim.geno function. Marker regression for each SNP (N = 1486; scanone function) was performed using two statistical models. The first was an additive model that included sex, body weight at death (weight), and cross direction (F2 mice were generated using F1s from either maternal C57BL/6J × paternal C3H/HeJ [BXH; N = 161] or maternal C3H/HeJ × paternal C57BL/6J [HXB; N = 148] matings) as additive covariates. A sex interaction model was also used and contained the same additive covariates plus a sex × QTL interaction term. Sex-specific genome scans were also performed using an additive model adjusting for weight and cross direction. We also evaluated a model including a cross × QTL interaction term. However, no significant differences in LOD scores were observed, indicating the lack of cross × QTL interactions for BMD in our cross (data not shown). The 1.5 LOD-drop support intervals (1.5 SIs), which have been shown to approximate 95% CIs for densely genotyped intercrosses (markers every ∼1 cM), were calculated using the lodint function. Significant (p < 0.05) and suggestive (p < 0.63) LOD thresholds were empirically determined for each analysis using 1000 permutations. A multiple QTL model comprising covariates (sex, weight, and cross direction), all significant and suggestive QTLs, and all possible sex × QTL interaction terms was generated for BMD using the fitqtl function. Novel QTLs were named according to Mouse Genome Informatics (MGI) nomenclature guidelines, and previously assigned names were used for loci identified by Beamer et al.
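The idea behind a 1.5 LOD-drop support interval can be illustrated with a short sketch. The Python function below is a simplified illustration and is not the R/qtl lodint implementation, which may treat interval boundaries differently. It assumes a single chromosome with markers sorted by position; the names `lod_support_interval`, `positions`, and `lods` are hypothetical, and the example profile is invented.

```python
def lod_support_interval(positions, lods, drop=1.5):
    """Return (left, peak, right) marker positions for a LOD-drop interval.

    Illustrative assumption: markers are sorted by position on one
    chromosome, and the interval is the contiguous run of markers around
    the peak whose LOD stays within `drop` units of the maximum LOD.
    """
    if not lods or len(positions) != len(lods):
        raise ValueError("positions and lods must be non-empty and equal in length")
    peak = max(range(len(lods)), key=lambda i: lods[i])
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1  # extend toward the start of the chromosome
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1  # extend toward the end of the chromosome
    return positions[left], positions[peak], positions[right]


# Hypothetical LOD profile (positions in Mbp); not data from this study.
example = lod_support_interval([10, 20, 30, 40, 50, 60],
                               [1.0, 4.8, 6.5, 5.2, 4.0, 2.0])
```

With a peak LOD of 6.5 the cutoff is 5.0, so only the markers at 30 and 40 Mbp remain inside the interval in this invented profile.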

DNA microarray profiling

Microarray gene expression profiling was performed as described previously. Briefly, RNA was isolated from the gonadal fat pads (N = 293) of F2 mice using the TRIzol method. 60-mer oligonucleotide chips were used (Agilent Technologies, Santa Clara, CA, USA), and all hybridizations were performed in duplicate with fluor reversal. Each individual sample was hybridized against a pool of 150 F2 samples. The expression of each probe was represented as the mean log ratio (mlratio) of an individual's expression relative to the pool.

Expression QTL analysis

The Agilent arrays contained probe sets for a total of 23,574 transcripts. Probes were annotated using the NCBI Build 37.1 genome assembly. To eliminate nonexpressed probes, we used the 12,932 “active” genes for all downstream analyses. Briefly, “active” genes are those detected as expressed, are correlated with at least one (of ∼25) metabolic trait in BXH F2 mice, and have at least one eQTL with a LOD >4.3. From the “active” list, we selected genes with nominally significant (p < 0.01) Pearson partial correlations with BMD in either the entire cross (after correcting for weight and sex), males (after adjusting for weight), or females (after adjusting for weight). False discovery rates (FDRs) for correlations at p < 0.01 were estimated by creating 1000 permuted BMD vectors and repeating the correlation analysis for each vector. FDR was calculated by dividing the average number of genes, across all 1000 permutations, significant at p < 0.01, by the number of genes correlated at p < 0.01 in the real data. Using the mlratios for each of the correlated (p < 0.01) transcripts, we performed an eQTL analysis after correcting for sex, weight, and cross direction. Only markers on chromosomes harboring BMD QTLs were used in the analysis. The statistical threshold (p < 0.05) was determined using 500 permutations of 500 randomly selected “active” genes. The mean significant LOD score across all 500 genes (LOD = 3.39 ± 0.01) was used as the experiment-wise significance threshold. The cumulative mean threshold was stable by the 500th gene, indicating that permuting additional genes would not significantly alter the results (data not shown). Local eQTLs were defined as those whose peak was located within ±20 Mbp of the structural gene they regulated.
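The permutation-based FDR estimate described above reduces to a simple ratio, sketched here in Python. This is an illustrative sketch rather than the code used in the study; the function name `permutation_fdr` and its arguments are hypothetical, and the counts of significant genes per permutation are assumed to have been computed elsewhere.

```python
from statistics import mean


def permutation_fdr(null_counts, observed_count):
    """Estimate FDR as (mean number of genes significant across permuted
    trait vectors) / (number of genes significant in the real data)."""
    if observed_count <= 0:
        raise ValueError("observed_count must be positive")
    if not null_counts:
        raise ValueError("null_counts must contain at least one permutation")
    return mean(null_counts) / observed_count


# Hypothetical counts from three permutations and one real analysis.
fdr = permutation_fdr([10, 20, 30], 200)
```

As a rough plausibility check under the assumption that about 1% of the 12,932 active genes (roughly 129) pass p < 0.01 by chance, dividing by the 3037 genes correlated in the entire cross gives approximately 4.3%, in line with the whole-cross FDR reported in the Results.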

Causality modeling using network edge orienting

Network edge orienting (NEO) is a recently developed R function designed to orient the relationships between genetic markers, gene expression traits, and clinical traits. Tutorials outlining the use of NEO are available online. NEO uses the fact that all cellular information begins with DNA, and therefore the many possible relationships that can exist between DNA variation, gene expression, and clinical traits can be distilled to three. The three relationships (or models) are (1) causal: the flow of information goes from DNA to gene to BMD (the gene's expression is causing the change in the trait); (2) reactive: the flow of information goes from DNA to BMD to gene (the gene's expression is reacting to the change in the trait); and (3) independent: DNA variation affects both traits independently. NEO uses structural equation modeling to estimate the probabilities for each of the three relationships. The log10 ratio of the causal model probability relative to the next best model probability (of the two remaining) is calculated. This ratio (referred to as the LEO next best or LEO.NB score) quantifies the relative likelihood that a gene's expression is causal for a trait such as BMD. Simulation studies have shown that single marker LEO.NB scores >1.0 are highly suggestive of causal relationships.
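The LEO.NB score as defined in the preceding paragraph can be written as a one-line calculation. The Python sketch below only illustrates that definition; it does not reproduce NEO's structural equation modeling, and the model probabilities are assumed to be supplied by some other fitting step. The name `leo_nb` and the example values are hypothetical.

```python
import math


def leo_nb(p_causal, p_alternatives):
    """log10 ratio of the causal model probability to the best competing
    model probability (here, the reactive and independent models)."""
    if p_causal <= 0:
        raise ValueError("p_causal must be positive")
    if not p_alternatives or max(p_alternatives) <= 0:
        raise ValueError("at least one alternative model needs a positive probability")
    return math.log10(p_causal / max(p_alternatives))


# Hypothetical model probabilities: causal, then (reactive, independent).
score = leo_nb(0.9, [0.09, 0.01])
```

A score of 1.0 corresponds to the causal model being 10 times more probable than the next best alternative, which is why a threshold of 1.0 is a natural cutoff.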

For this experiment, we ran NEO for each gene correlated with BMD and regulated by a local eQTL. Each of the nine BMD peak QTL markers was used as an anchor for the appropriate genes (e.g., the peak marker for Bmd11 on Chr 11 was used as an anchor for each gene correlated with BMD and regulated by a local eQTL on Chr 11). The analysis was performed using data from the entire cross for all nine QTLs. We also performed a sex-specific analysis for those QTLs displaying sex-biased expression. LEO.NB scores >1.0 were considered significant evidence of a causal relationship.

Association analysis in MF1s

For the association analysis with BMD in MF1 mice, only markers (N = 398) located within one of the BMD QTL 1.5 SIs and with a minor allele frequency of >5% were tested. The association was performed using three traits: whole body, femoral, and spinal BMD. Whole body and spinal BMD were not correlated with body weight (p > 0.05), so raw unadjusted traits were used for association. Femoral BMD was highly correlated with weight (r = 0.48, p < 0.001); therefore, weight-corrected residuals were used for the association. The relationship between SNP genotype and BMD was measured using ANOVA. To guard against small sample size biasing the ANOVA p values, any genotype class with fewer than five samples was eliminated from the analysis. Adjusted p values were generated for all SNPs by repeating the ANOVA analysis using 1000 permuted BMD vectors and nonpermuted MF1 genotypes. The 95% CI for the location of an associated region was calculated using 1000 bootstrap samples drawn with replacement. For the Bmd11 association, 11 SNPs surrounding the peak were used for the bootstrap analysis. The marker with the maximum −logP (negative log10 of the association p value) for each bootstrap was recorded, and the markers at the top and bottom 2.5% of the distribution defined the 95% CI.
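The bootstrap procedure for locating an association peak can be sketched as follows. This Python example is a minimal illustration under stated assumptions, not the analysis code used in the study: it ranks markers by the one-way ANOVA F statistic instead of −logP (the two give the same ranking only when every marker has the same degrees of freedom), it does not apply the minimum genotype class size filter, and all function names and data are hypothetical.

```python
import random


def f_statistic(genotypes, trait, sample):
    """One-way ANOVA F statistic of trait by genotype class for the
    individuals whose indices are listed in `sample`."""
    groups = {}
    for i in sample:
        groups.setdefault(genotypes[i], []).append(trait[i])
    n, k = len(sample), len(groups)
    if k < 2 or n <= k:
        return 0.0  # F is undefined without two classes and residual df
    grand = sum(trait[i] for i in sample) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2
                     for g in groups.values())
    ss_within = sum((y - sum(g) / len(g)) ** 2
                    for g in groups.values() for y in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def bootstrap_peak_ci(positions, marker_genotypes, trait, n_boot=1000, seed=0):
    """Resample individuals with replacement, record the position of the
    top-scoring marker each time, and return the 2.5th and 97.5th
    percentiles of the recorded positions."""
    rng = random.Random(seed)
    n = len(trait)
    peaks = []
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]
        best = max(range(len(positions)),
                   key=lambda m: f_statistic(marker_genotypes[m], trait, sample))
        peaks.append(positions[best])
    peaks.sort()
    lower = peaks[int(0.025 * n_boot)]
    upper = peaks[min(n_boot - 1, int(0.975 * n_boot))]
    return lower, upper
```

When a single marker carries nearly all of the signal, the interval collapses onto that marker; when several neighboring markers compete across bootstrap samples, the interval widens to span them.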

Defining regions of high and low SNP density between C57BL/6J and C3H/HeJ

SNPs polymorphic between C57BL/6J and C3H/HeJ were downloaded from the Mouse Phenome Database (MPD). SNPs were annotated using the NCBI Build 37.1 genome assembly. A custom R function was used to calculate and plot the SNP frequency for 25-kbp intervals across the Bmd11 region.
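The binning step can be sketched in a few lines. The Python example below is not the custom R function used in the study; it is a minimal illustration that counts SNPs in fixed-width bins and reports the fraction of sparse bins. The names `snp_density` and `sparse_fraction` and the example positions are hypothetical.

```python
def snp_density(snp_positions_bp, start_bp, end_bp, bin_size_bp=25_000):
    """Count SNPs in consecutive bins covering [start_bp, end_bp)."""
    if end_bp <= start_bp or bin_size_bp <= 0:
        raise ValueError("need end_bp > start_bp and a positive bin size")
    span = end_bp - start_bp
    n_bins = -(-span // bin_size_bp)  # integer ceiling division
    counts = [0] * n_bins
    for pos in snp_positions_bp:
        if start_bp <= pos < end_bp:
            counts[(pos - start_bp) // bin_size_bp] += 1
    return counts


def sparse_fraction(counts, max_snps=1):
    """Fraction of bins holding at most `max_snps` SNPs."""
    return sum(1 for c in counts if c <= max_snps) / len(counts)


# Hypothetical SNP positions (bp) over a 75-kbp region.
counts = snp_density([0, 10, 24_999, 25_000, 60_000], 0, 75_000)
```

Applied to the Bmd11 support interval reported in the Results (27.7 to 64.1 Mbp), 25-kbp bins give 1456 intervals, which matches the number of bins described there.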


Genetic regulation of BMD in C57BL/6J × C3H/HeJ F2 mice

To identify QTLs regulating femoral BMD in C57BL/6J × C3H/HeJ (BXH) F2 mice (N = 309), we performed genome scans using an additive (sex, weight, and cross direction used as additive covariates) and a sex interaction model (same additive covariates plus a sex × QTL interaction term). Using the additive model, we identified eight QTLs surpassing the significant (p < 0.05; LOD = 3.9) or suggestive (p < 0.63; LOD = 2.4) LOD thresholds. The QTLs were located on chromosomes (Chrs) (QTL names) 1 (Bmd5), 3 (Bmd40), 4 (Bmd7), 7 (Bmd41), 11 (Bmd11), 13 (Bmd13), 14 (Bmd14), and 18 (Bmd43) (Fig. 1). We repeated the genome scan using the sex interaction model, and this resulted in the identification of an additional suggestive (p < 0.63; LOD = 3.53) QTL located on Chr 10 (Bmd42) (Fig. 1).

FIG. 1.

Genome scan results for BMD in BXH F2 mice. (Top) LOD score plots for BMD using an additive model adjusting for sex, cross direction, and weight. The dashed horizontal lines represent the permutationally derived significant (p < 0.05) and suggestive (p < 0.63) LOD thresholds. (Bottom) LOD score plots for BMD using a sex interaction model adjusting for sex, cross direction, weight, and sex × QTL. The dashed horizontal lines represent the permutationally derived significant (p < 0.05) and suggestive (p < 0.63) LOD thresholds.

Of the nine QTLs, five (Bmd5, Bmd7, Bmd11, Bmd13, and Bmd14) have been previously reported in a distinct BXH F2 cross. A QTL on Chr 18 (Bmd16) was reported by Beamer et al., but it did not overlap with the Chr 18 QTL (Bmd43) identified in this study (the peaks for Bmd16 and Bmd43 lie on opposite ends of Chr 18 and their 95% CIs do not overlap). Bmd40 and Bmd42 were novel with regard to their effect on femoral BMD, although QTLs in similar locations have been found to regulate tibial BMD. Bmd41 is unique to this study.

The reason for performing the analysis using two models was to determine the effect of sex. If LOD scores improve after adjusting for a sex × QTL interaction, this is evidence, but not proof, that the QTL in question is expressed differently between the sexes. By comparing the two genome scans, modest increases in LOD scores for Bmd40 (Chr 3), Bmd42 (Chr 10), and Bmd11 (Chr 11) were observed (data not shown). To formally address the effect of sex, we generated a multiple regression model comprised of terms for covariates, the nine QTLs, and all possible sex × QTL interaction terms. The final model, which included all QTL terms and significant sex × QTL interactions, was highly significant (p < 0.0001), accounting for 63.3% of the variance in BMD (Table 1). As suggested above, the sex × QTL interaction terms for Bmd40, Bmd42, and Bmd11 were significant (p < 0.05; Table 1).

Table 1. Results of a Multiple QTL Model Generated for BMD

We also performed sex-specific QTL scans and generated sex-specific effect plots for each QTL (Figs. 2 and 3). These analyses supported the results of the regression model and indicated that Bmd40, Bmd42, and Bmd11 were male-biased (effect was more pronounced in males). Although the regression-based sex × QTL interaction p values were not significant, the sex-specific scans and effect plots also suggested that Bmd5, Bmd14, and Bmd43 were female-biased (Figs. 2 and 3). The inability to detect significant interaction terms for these three loci may be because of a lack of statistical power.

FIG. 2.

Sex-specific BMD QTL analysis in BXH F2 mice. LOD score plots for BMD using an additive model (adjusting for cross direction and weight) in males (solid line) and females (dashed line) separately. The dashed horizontal lines represent the permutationally derived significant (p < 0.05) and suggestive (p < 0.63) LOD thresholds for males (male and female thresholds were only slightly different).

FIG. 3.

Effect plots for BMD QTLs in BXH F2 mice. Genotypic contrasts are plotted for each of the nine BMD QTLs identified in Fig. 1. The mean (± SE) BMD residuals (mg/cm2) (after adjusting for body weight and cross direction) are plotted for B6/B6 homozygotes (black line), B6/C3H heterozygotes (wide dashed line), and C3H/C3H homozygotes (narrow dashed line) for each sex and QTL.

Identification of gene expression traits correlated with BMD

To identify genes correlated with BMD, we first selected probes that were identified as “active.” This reduced the number of array probes from 23,574 to 12,932. Correlated genes were identified as those with nominally significant (p < 0.01) partial correlations (adjusted for sex and weight) in the entire cross or each sex separately (adjusted for weight). We chose to consider all three groups of correlations instead of only those from the whole cross, because some genes may be either (1) correlated in one sex but not the entire cross or (2) not correlated in the whole cross because of correlations that go in opposite directions in the sexes. Of the active genes, 3037 in the entire cross, 1335 in males, and 736 in females were correlated with BMD (p < 0.01). At p < 0.01, the FDR was 4.3%, 8.9%, and 16.2% in the entire cross, males, and females, respectively. In total, 3338 genes were correlated (p < 0.01) in one or more of the groups.
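For the sex-specific analyses, where weight is the only covariate, a partial correlation can be computed directly from the three pairwise Pearson correlations. The Python sketch below illustrates that first-order formula only; it does not cover the whole-cross case with two covariates (sex and weight), and the function names and toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of a single
    covariate z (for example expression, BMD, and body weight)."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data in which x and y are related only through the covariate z.
z = [1.0, 1.0, -1.0, -1.0]
x = [1.5, 0.5, -0.5, -1.5]
y = [1.5, 0.5, -1.5, -0.5]
raw_r = pearson(x, y)
adjusted_r = partial_correlation(x, y, z)
```

In this toy example the raw correlation is strong, but it disappears after adjusting for the covariate, which is the situation the weight adjustment is meant to guard against.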

Identifying local eQTLs coincident with BMD loci

Local eQTLs represent genetic variation perturbing the expression of a nearby gene. By definition, local eQTLs coincident with BMD loci (referred to throughout as “coincident local eQTLs”) are strong positional and functional candidates. To identify coincident local eQTLs, we performed an eQTL analysis using microarray expression profiles of adipose tissue from individual BXH F2 mice (N = 293). Local eQTLs were identified for the 3338 correlated gene expression traits using a model adjusting for sex, cross direction, and weight. Only markers on the nine chromosomes harboring BMD QTLs were used in the analysis. Coincident local eQTLs were defined as those (1) with LOD scores surpassing the significant (p < 0.05; LOD = 3.39) LOD threshold and (2) with peaks within one of the defined BMD QTL 1.5 LOD-drop support intervals (1.5 SIs; as determined using physical location [Mbp] for both the 1.5 SIs and QTL peak markers; Table 2). Using these criteria, 150 gene expression traits (148 unique genes) were regulated by a coincident local eQTL (Supplemental Table 1).
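The selection criteria for coincident local eQTLs can be expressed as a simple filter. The Python sketch below is illustrative only: the `EQTL` record layout, the function name, and the example genes are hypothetical, and it assumes the eQTL peak and the gene lie on the same chromosome. The ±20 Mbp window and the LOD threshold of 3.39 are taken from the Methods, and the example interval is the Bmd11 support interval on Chr 11.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EQTL:
    gene: str        # gene symbol (hypothetical in the examples)
    chrom: str       # chromosome of both the gene and the eQTL peak
    gene_mbp: float  # gene location in Mbp
    peak_mbp: float  # eQTL peak location in Mbp
    lod: float       # peak LOD score


def is_coincident_local(eqtl, qtl_intervals, lod_threshold=3.39,
                        local_window_mbp=20.0):
    """True if the eQTL is significant, local to its gene, and peaks inside
    one of the (chrom, start_mbp, end_mbp) BMD QTL support intervals."""
    significant = eqtl.lod > lod_threshold
    local = abs(eqtl.peak_mbp - eqtl.gene_mbp) <= local_window_mbp
    coincident = any(chrom == eqtl.chrom and start <= eqtl.peak_mbp <= end
                     for chrom, start, end in qtl_intervals)
    return significant and local and coincident


bmd_intervals = [("11", 27.7, 64.1)]  # Bmd11 1.5 SI
candidates = [
    EQTL("GeneA", "11", 58.0, 59.0, 5.0),  # passes all three criteria
    EQTL("GeneB", "11", 58.0, 59.0, 3.0),  # LOD below threshold
    EQTL("GeneC", "11", 20.0, 58.0, 6.0),  # peak 38 Mbp from the gene
    EQTL("GeneD", "2", 58.0, 59.0, 6.0),   # no BMD QTL interval on Chr 2
]
selected = [e.gene for e in candidates if is_coincident_local(e, bmd_intervals)]
```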

Table 2. Support Intervals and Peak Markers for the Nine BXH F2 BMD QTLs

We prioritized the 150 gene expression traits by ranking based on strength of correlation. We ranked using the absolute value of the maximum correlation among the three groups. Interestingly, of the top three prioritized genes at each locus, five (18.5%) have been previously shown to influence BMD and/or bone development, including twist homolog 2 (Twist2), ranked second within Bmd5; adrenomedullin (Adm), ranked second within Bmd41; Wingless-type MMTV integration site family, member 9A (Wnt9a), ranked second within Bmd11; matrix metalloproteinase 14 (Mmp14), ranked first within Bmd14; and MAD homolog 4 (Smad4), ranked third within Bmd43 (Table 3). To determine whether this represents a significant enrichment, we identified all transgenic and knockout strains with a bone mineralization defect by searching the “Phenotypes, Alleles & Disease Models” database at MGI using the search string “bone mineralization.” Of 7307 total strains, 220 (3.0%) were annotated as having differences in bone mineralization. This represents a 6-fold enrichment (3.0% versus 18.5%; Fisher's exact, p = 2.3 × 10−3) of known BMD genes at the top of our prioritized gene list.
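The enrichment calculation can be illustrated with a one-sided hypergeometric tail, which is the one-sided form of Fisher's exact test. The Python sketch below makes two assumptions that are not stated in the text: the 27 top-ranked genes (three per locus at nine loci) are treated as a draw from the 7307 annotated strains, and a one-sided test is used. Because the exact contingency table used for the reported p value is not given, this sketch need not reproduce that value. The function name is hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, draws, successes, population):
    """P(X >= k) when `draws` items are sampled without replacement from a
    population containing `successes` items of interest."""
    total = comb(population, draws)
    upper = min(draws, successes)
    tail = sum(comb(successes, i) * comb(population - successes, draws - i)
               for i in range(k, upper + 1))
    return tail / total


# Counts taken from the text: 5 of 27 top-ranked genes versus 220 of 7307 strains.
fold_enrichment = (5 / 27) / (220 / 7307)
p_value = hypergeom_upper_tail(5, 27, 220, 7307)
```

The fold enrichment from these counts is a little over 6, consistent with the 6-fold figure quoted above.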

Table 3. Local eQTL LOD Scores and Correlations With BMD for the Three Genes Most Highly Correlated Within Each BMD QTL

Predicting causal links between genes with local eQTLs and BMD

Genes with coincident local eQTLs are potentially key drivers of changes in BMD; however, eQTL coincidence and correlation alone are not enough to support causality. To more formally identify causal relationships between gene expression traits and BMD, we used NEO, a recently developed R function. NEO uses genetic markers to predict causal interactions between two traits, which in our case were gene expression and BMD.

A NEO single marker analysis was performed for each gene with a local eQTL by anchoring the relationships using peak QTL markers. The analysis was performed using the whole cross and adjusting BMD for sex, cross direction, and weight. The analysis was also run in males for the male-biased QTLs (Bmd40, Bmd42, and Bmd11) and females for the three loci displaying suggestive evidence of being female-biased (Bmd5, Bmd14, and Bmd43) (Figs. 2 and 3). A total of 17 genes in the whole cross were predicted to be causal for BMD with single marker LEO.NB scores exceeding 1.0, indicating that, given the data, the causal model is at least 10 times more likely than any of the alternative models (Table 4). One additional gene, Wnt9a, although not significant in the whole cross, was significant in the male-specific NEO analysis. Of the five known bone genes mentioned above, Twist2, Mmp14, and Wnt9a were predicted as causal. Although not >1.0, Adm and Smad4 had positive single marker LEO.NB scores of 0.771 and 0.128, respectively (Supplemental Table 2).

Table 4. Genes Predicted to be Causal With a NEO Single Marker LEO.NB Score ≥1.0

Integrating linkage, high-density SNP maps, eQTL analysis, causality modeling, and association in MF1 mice identifies Wnt9a and Rasd1 as strong candidate causal genes for Bmd11

In the previous sections, we describe the use of linkage, global gene expression profiling, and causality modeling to identify BMD candidate genes. In this section, we illustrate the power of integrating these data with high-density SNP collections and association in outbred mice to identify candidate QTGs for Bmd11.

Bmd11, located on Chr 11, was the second most significant QTL in BXH F2 mice, explaining 6% of the variance in BMD and mapping with a peak LOD score of 6.5 at 55.2 Mbp (Table 2; Fig. 1). Additionally, there was significant statistical evidence for male-biased expression of Bmd11 (Table 1). The 1.5 SI for Bmd11 extended from 27.7 to 64.1 Mbp.

To potentially fine-map Bmd11 and each of the other BMD QTLs, we performed an association analysis using outbred MF1 mice. First, we identified the 398 markers genotyped in MF1s that were located within the 1.5 SIs for each BMD QTL (Table 2). ANOVA was used to test the relationship between SNP genotype and whole body BMD. Although we report only the results of whole body, spinal and femoral BMD were also analyzed and gave very similar, albeit slightly less significant, associations (data not shown). Of the 398 SNPs, a group of three within Bmd11 (Chr 11) were the most significant, all with a nominal –logP of 3.74 (Fig. 4). Using a stringent permutation-based approach, the adjusted p values for the three Chr 11 SNPs were p = 0.06. The 95% bootstrap CI for the Chr 11 association was the 2.8-Mbp interval between rs3023265 at 57.6 Mbp and rs13481052 at 60.4 Mbp (Fig. 4). This reduction represents a 13-fold increase in resolution relative to the linkage results in F2 mice. Importantly, the C3H-like haplotype across this region increased BMD in MF1 mice. This is concordant with C3H alleles at Bmd11 increasing BMD in F2 mice (Table 1; Fig. 3), suggesting that the same variant(s) may be affecting BMD in both populations.

FIG. 4.

Chromosome 11 (Bmd11) SNPs are associated with whole body BMD in MF1 mice. Each panel contains the association results plotted as the negative log10 of the ANOVA p value (−logP) for MF1 markers overlapping each of the nine BXH BMD QTL regions. Individual SNPs are represented by vertical bars. The solid horizontal lines represent the nominal p < 0.001 level of significance. The empiric p value for the most significant association on Chr 11 (determined by permutation) was p = 0.06. The 95% bootstrap CI for the Bmd11 association (2.8 Mbp) is highlighted by the dark horizontal line at the top of the Bmd11 (Chr 11) panel.

We also studied the genetic variation structure within Bmd11 using recently developed high-density SNP collections. A total of 16,565 known SNPs, polymorphic between C57BL/6J and C3H/HeJ, were identified within the Bmd11 1.5 SI. Across the region, 43% of the 1456 25-kbp bins were nearly devoid of SNPs (zero or one SNP per 25 kbp), suggesting these regions are identical by descent between the two strains. Interestingly, the most SNP dense region of Bmd11 overlapped the location of the most strongly associated SNPs in MF1 mice (Fig. 5). The mean SNP density of the 95% CI for the associated region was nearly 2.5-fold higher (28.1 SNPs per 25 kbp) than the mean of the entire Bmd11 region (11.4 SNPs per 25 kbp). These data suggest that the variation underlying Bmd11 is most likely within the region associated with BMD in MF1 mice.

FIG. 5.

An integrative genetics approach identified Wnt9a and Rasd1 as candidate QTGs for Bmd11. (Top) LOD score profile for Bmd11 in BXH F2 mice. The 1.5 LOD-drop SI extended from 25.7 to 64.1 Mbp (defined by vertical lines). (Middle) Frequency of SNPs polymorphic between C57BL/6J and C3H/HeJ per 25 kbp bin across Bmd11. A total of 15 genes with prioritized local eQTL peaks were located within the Bmd11 SI interval (•) (Supplemental Table 1). (Bottom) Negative log10 of the ANOVA p value (−logP) for each Bmd11 SNP in MF1 mice. The three most associated SNPs had empiric p values of p = 0.06. The 95% bootstrap CI for the Bmd11 association (2.8 Mbp) is highlighted by the dark horizontal line at the top of the Bmd11 (Chr 11) panel. Six of the genes with eQTL were predicted to be causal using NEO (•). Wnt9a and Rasd1 were the only genes located in the region, regulated by local eQTLs, and predicted to be causal. The x-axis for all panels is in base pairs.

In addition, the Bmd11 region contained a total of 15 genes correlated with BMD and regulated by local eQTL (Supplemental Table 1; Fig. 5). Of the 15, the expression data supported a causal relationship for 6 (Table 4). However, of the genes predicted as causal, only Wnt9a and Rasd1 were located within the 95% CI of the MF1 associated region (Fig. 5). Together these data suggest that expression differences in Wnt9a, Rasd1, or both underlie Bmd11.
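The successive filters applied to the Bmd11 candidates amount to intersecting three criteria: a prioritized local eQTL, a causal prediction, and a position inside the association CI. The sketch below is illustrative only; the `Candidate` record and its field names are hypothetical, and gene positions are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one positional candidate gene."""
    name: str
    position_mbp: float
    has_local_eqtl: bool
    predicted_causal: bool


def prioritize(candidates, ci_start_mbp, ci_end_mbp):
    """Return the names of genes that pass all three filters."""
    return [
        c.name
        for c in candidates
        if c.has_local_eqtl
        and c.predicted_causal
        and ci_start_mbp <= c.position_mbp <= ci_end_mbp
    ]
```

For Bmd11, the CI bounds would be 57.6 and 60.4 Mbp, the limits of the MF1 association interval.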


The most difficult challenge in complex trait analysis is not the identification of QTLs but the transition from QTL to gene. Many strategies have been developed for QTL dissection; however, most are only marginally successful when used in isolation. In this study, we identified nine BMD QTLs covering roughly 300 cM of the mouse genome. These regions together contain between 4000 and 6000 “positional” candidates. To reduce this to a manageable number, we integrated linkage, genomic sequence, gene expression, causality modeling, and association to fine-map QTLs and identify a small number of “functional” candidates. Although these require further study, our approach also identified several genes known to influence bone development or mineralization, including Twist2, Mmp14, Wnt9a, Adm, and Smad4.

In our study, three QTLs displayed statistically significant interactions with sex and three others had suggestive sex effects. The only previous linkage analysis for femoral BMD in BXH F2 mice used only females, precluding the analysis of sex differences. However, our results are in line with recent observations in humans and other mouse crosses showing that BMD is controlled in large part by sex-specific genetic variation. These data suggest that the BXH F2 intercross is an excellent model to study gene × sex interactions affecting BMD.

In light of the apparent sex effects, we observed good concordance between our study and that of Beamer et al. Of the 10 QTLs identified by Beamer et al., 5 (Bmd5, Bmd7, Bmd11, Bmd13, and Bmd14) were replicated in our study. Of the four novel QTLs, our discovery of two, Bmd40 and Bmd42, is most likely explained by their male-biased expression. The other two, Bmd41 and Bmd43, are likely explained by differences in the measurement of volumetric versus areal BMD, the effect of a high-fat diet, or other environmental differences.

One potential limitation of our study is the use of adipose gene expression profiles to identify local eQTLs instead of bone tissue or primary osteoblasts/osteoclasts. Although it is likely that key expression differences were missed in the analysis, it is clear that many genes important in bone metabolism, and expressed in bone, are also expressed in adipose. Additionally, we have previously shown that the majority (63–88%) of genes with local eQTLs in one tissue will also vary in other tissues where the gene is expressed. It is also possible that the expression in adipose of some of the putatively causal genes directly affects BMD.

The goal of our study was the identification, in an unbiased manner, of a small number of high-priority candidate BMD genes by integrating many complementary genetic and genomic datasets. This approach led to the identification of three categories of known (genes linked to bone development and/or density) and novel genes: (1) genes with prioritized coincident eQTLs; (2) genes with prioritized coincident eQTLs that were predicted to be causal; and (3) genes located in the MF1 associated/SNP dense region on Chr 11 with prioritized coincident local eQTLs that were predicted to be causal. These classifications will be used to prioritize the validation, using in vitro and in vivo approaches, of each candidate.

The exact origin of MF1 mice is unknown, but it has been shown that MF1 haplotypes are similar to those of inbred strains. Yalcin et al. sequenced a 62-kbp region in 12 MF1 mice and found only four SNPs not shared among eight inbred strains previously sequenced across the same region. Therefore, it is expected that a significant fraction of the same genetic variation, and QTLs, segregating between any two inbred strains are likely to be present in MF1 mice. Here we show that even a relatively small population (N = 97) of MF1s can be used to significantly reduce the genomic region harboring QTLs identified through linkage. Using MF1s, we were able to increase mapping resolution 13-fold for Bmd11. These data suggest that a genome-wide study of BMD using a larger and more densely genotyped outbred population consisting of both sexes would be informative.
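As a worked check on the fold increase for Bmd11, the 1.5 LOD-drop SI boundaries given in the Fig. 5 legend (25.7 to 64.1 Mbp) can be compared with the 2.8-Mbp association CI. The arithmetic below is shown for illustration and is consistent with the roughly 13-fold increase in resolution:

```latex
\[
\frac{64.1\ \text{Mbp} - 25.7\ \text{Mbp}}{2.8\ \text{Mbp}}
  = \frac{38.4}{2.8} \approx 13.7
\]
```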

Predicting causal relationships between a gene's expression and BMD is a potentially powerful filtering tool to distinguish between causal correlation and correlation caused by linkage or downstream reactivity. However, its use, especially to prioritize local eQTLs, could be prone to false predictions. False positives can arise because of strong linkage disequilibrium with the true causal gene, and false negatives can arise because of complex relationships between the gene and trait (e.g., a case where the expression of the causal gene is several steps upstream of BMD). In addition, our analysis examined only genetically regulated transcript level changes. However, there are many other types of DNA perturbations, such as those affecting protein function, translational efficiency, or stability, that likely influence BMD and would be missed by our analysis. Additionally, our approach would miss variation in gene expression important to adult BMD that occurs much earlier in development.

Through the integration of multiple statistical techniques in two different mouse populations, we identified two candidates (Rasd1 and Wnt9a) that were correlated with BMD, predicted to be causal, and located on Chr 11 (Bmd11), in a region both linked and associated with BMD (Fig. 3). RAS, dexamethasone-induced 1 (Rasd1) is a Ras family GTPase protein shown to be critical for proper circadian rhythm. Wnt9a is a member of the canonical WNT signaling pathway and has been shown to be involved in joint formation/maintenance and skeletogenesis. The role of Rasd1 in BMD is unclear and will need to be validated. However, given its known role in skeletogenesis, it is likely that Wnt9a mediates at least a portion of the Bmd11 effect.

In summary, we showed the successful application of a truly “integrative” genetics approach to identify genes underlying changes in BMD. We used this strategy to identify many known and novel genes that are likely key drivers of the genetic differences in BMD between C57BL/6J and C3H/HeJ mice. Our strategy provides the framework for future integrative studies designed to identify factors contributing to complex traits.


This study was funded by NIH Program Project Grant HL28481. CRF was supported by a Ruth L. Kirschstein NIH F32 Fellowship (5F32DK074317).