Genome-wide association analysis of juvenile idiopathic arthritis identifies a new susceptibility locus at chromosomal region 3q13

Abstract

Objective

In a genome-wide association study of Caucasian patients with juvenile idiopathic arthritis (JIA), we have previously described findings limited to autoimmunity loci shared by JIA and other diseases. The present study was undertaken to identify novel JIA-predisposing loci using genome-wide approaches.

Methods

The discovery cohort consisted of Caucasian JIA cases (n = 814) and local controls (n = 658) genotyped on the Affymetrix Genome-Wide SNP 6.0 Array, along with 2,400 out-of-study controls. In a replication study, we genotyped 10 single-nucleotide polymorphisms (SNPs) in 1,744 cases and 7,010 controls from the US and Europe.

Results

Analysis within the discovery cohort provided evidence of associations at 3q13 within C3orf1 and near CD80 (rs4688011) (odds ratio [OR] 1.37, P = 1.88 × 10−6) and at 10q21 near JMJD1C (rs6479891 [OR 1.59, P = 6.1 × 10−8], rs12411988 [OR 1.57, P = 1.16 × 10−7], and rs10995450 [OR 1.31, P = 6.74 × 10−5]). Meta-analysis provided further evidence of association for these 4 SNPs (P = 3.6 × 10−7 for rs4688011, P = 4.33 × 10−5 for rs6479891, P = 2.71 × 10−5 for rs12411988, and P = 5.39 × 10−5 for rs10995450). Gene expression data on 68 JIA cases and 23 local controls showed cis expression quantitative trait locus associations for C3orf1 SNP rs4688011 (P = 0.024 or P = 0.034, depending on the probe set) and JMJD1C SNPs rs6479891 and rs12411988 (P = 0.01 or P = 0.04, depending on the probe set, and P = 0.008, respectively). Using a variance component liability model, it was estimated that common SNP variation accounts for approximately one-third of JIA susceptibility.

Conclusion

Genetic association results and correlated gene expression findings provide evidence of JIA association at 3q13 and suggest novel genes as plausible candidates in disease pathology.

Juvenile idiopathic arthritis (JIA) is a debilitating complex genetic disorder that is characterized by inflammation of the joints and other tissues and that shares histopathologic features with other autoimmune diseases. Clinically, the International League of Associations for Rheumatology (ILAR) classification includes 7 JIA subtypes (1). Overall there are ∼50,000 children with JIA in the US (∼1 per 1,000 births) (2), an incidence similar to that of juvenile diabetes. While JIA is relatively uncommon compared to some adult-onset disorders, it may have a stronger genetic contribution since children with JIA have had less time for environment and behavior to influence disease risk relative to adults. Most subtypes of JIA are more prevalent in females. In addition, family members of patients with JIA are at increased risk for other autoimmune diseases (3). Unlike many other autoimmune diseases, JIA is more common in children of European ancestry, and the distribution of JIA subtypes differs significantly across ethnic groups (4).

Two of the JIA subtypes, oligoarticular disease (which includes both persistent and extended forms) and IgM rheumatoid factor (RF) polyarticular disease, account for the majority of cases of JIA. Oligoarticular JIA is the most common subtype. It occurs particularly in younger children, with an average age at onset of ∼5 years. This subtype is generally associated with antinuclear antibody positivity and is characterized by involvement of ≤4 joints in the first 6 months of disease. The involvement of additional joints (>4 joints in total) over time distinguishes extended oligoarticular from persistent oligoarticular disease. There is no adult disease equivalent for oligoarticular JIA (5). Disease involving >4 affected joints within the first 6 months is referred to as polyarticular (with RF-positive and RF-negative polyarticular disease being classified as separate JIA subtypes). Other subtypes of JIA include systemic JIA, enthesitis-related arthritis, juvenile psoriatic arthritis, and undifferentiated JIA. The present study was focused on oligoarticular and RF-negative polyarticular JIA in order to maximize both homogeneity and sample size in the study cohorts.

There is convincing support for the notion of a strong genetic component to JIA risk, based on evidence inferred from twin and affected sibpair studies, with an estimated sibling recurrence risk (λs) of 15 (6). As in other autoimmune diseases, genetic variation within the HLA region defines the strongest known genetic risk factors for JIA. Both HLA class I and class II haplotypes associated with JIA risk are distinct from those associated with rheumatoid arthritis (7) and differ among JIA subtypes (8). It is estimated that the HLA–DR region accounts for 17% of the sibling recurrence risk for JIA, with other major histocompatibility complex (MHC) regions contributing additional risk (9).

The 2 published JIA genome-wide association studies (GWAS) (10, 11), which identified TRAF1/C5 and VTCN1, respectively, included only modest numbers of cases or markers assayed and therefore were limited in statistical power. Polymorphisms implicated in other autoimmune diseases have been associated with JIA by us and by others (12–14). From our GWAS data set, we have previously reported single-nucleotide polymorphisms (SNPs) representing PTPN22, PTPN2, IL2RA, TNFAIP3, COG6, ADAD1-IL2-IL21, and STAT4 loci (14), but these findings still explain only a portion of JIA susceptibility. To identify additional novel JIA-predisposing loci, we extended our analysis of the largest genome-wide data set compiled to date. We used a 2-stage design: in the first stage, a discovery cohort was genotyped using the Affymetrix Genome-Wide SNP Array 6.0, and in the second stage, 10 SNPs with strong statistical support were genotyped in replication samples. Available gene expression data on a subset of the discovery cohort (15) enabled us to perform an integrated analysis for relevant expression quantitative trait loci (eQTLs) to improve our ability to discover genetic risk factors for complex traits.

PATIENTS AND METHODS

Discovery cohort.

The 814 JIA cases and 658 local controls of self-reported non-Hispanic European American ancestry genotyped for this GWAS have been described previously (14). As noted above, the cases were limited to patients with the 2 most common subtypes of ILAR-defined JIA, RF-negative polyarticular and oligoarticular JIA (both persistent and extended). Of the 814 cases, 113 (14%) were from multiplex pedigrees. In each pedigree, 1 RF-negative polyarticular or oligoarticular JIA case was randomly selected for genotyping. The local control cohort included healthy children without known major health conditions recruited from the geographic area served by Cincinnati Children's Hospital Medical Center (CCHMC). To increase statistical power, genotype data on 2,400 “out-of-study” controls from the Molecular Genetics of Schizophrenia non–Genetic Association Information Network (MGS_nonGAIN) study (available from dbGaP) were combined with data on the local controls. This yielded a case:control ratio of ∼1:3.8.

Replication cohort.

To attempt to replicate associations in the initial cohort, 5 independent JIA case sample collections (persistent oligoarticular, extended oligoarticular, or RF-negative polyarticular JIA) and control sample collections from individuals of self-reported Caucasian ethnicity were genotyped (see Supplementary Table 1, on the Arthritis & Rheumatism web site at http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1529-0131). The Texas, Utah, and Germany samples have been described previously (14). The UK samples were from 755 patients with persistent oligoarticular, extended oligoarticular, or RF-negative polyarticular JIA from 3 sources: 1) The British Society for Paediatric and Adolescent Rheumatology National Repository of JIA, 2) UK Caucasian patients with longstanding JIA (16), and 3) a 5-center prospective inception cohort collected as part of the Childhood Arthritis Prospective Study (17). Control genotype data were extracted from the Wellcome Trust Case Control Consortium 2 (WTCCC2) European Genome-phenome Archive web site (http://www.ebi.ac.uk/ega/page.php). The Delaware samples (24 JIA cases) were collected at the Nemours/Alfred I. duPont Hospital for Children. This study was approved by the Institutional Review Board of CCHMC and the collaborating centers.

Genotyping.

For the initial phase, genotyping was performed at the Affymetrix Service Center, using Genome-Wide Human SNP Array 6.0. To avoid technical artifacts, samples were arranged in batches, with each batch including cases and controls, both sexes, and all disease subtypes. The Birdseed (version 2) calling algorithm (Affymetrix) yielded an overall call rate of 98.97%. The genotypic data were used to test for cryptic relatedness (duplicates and first-degree relatives), autosomal heterozygosity outliers (|Fst| > 0.07), and plate effects. Genotype calling cluster plots were examined by batch for all SNPs reported. SNPs met criteria of having <5% missing genotype calls, no evidence of differential missingness between cases and controls (P > 0.05), no evidence of departure from expectation in Hardy-Weinberg equilibrium (HWE) proportions (P > 1 × 10−6 and P > 0.01 in cases and controls, respectively), and a minor allele frequency (MAF) of >0.05 in cases and controls. Individual SNPs that violated HWE or had an MAF of <0.05 and showed strong evidence of association were examined individually.
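The SNP-level filters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say which HWE test was used, so a 1-df chi-square test stands in for it here; the differential-missingness test is omitted; and the function names (`maf`, `hwe_chisq_p`, `passes_qc`) are hypothetical rather than part of any software named in this study.

```python
import math


def maf(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    freq_b = (2 * n_bb + n_ab) / (2 * n)
    return min(freq_b, 1 - freq_b)


def hwe_chisq_p(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test of Hardy-Weinberg proportions.

    Assumption: the source does not name its HWE test; chi-square is a stand-in.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(case_counts, control_counts, missing_rate):
    """Apply the thresholds quoted in the text to one SNP.

    case_counts / control_counts are (AA, AB, BB) genotype counts;
    missing_rate is the fraction of missing genotype calls.
    """
    return (
        missing_rate < 0.05
        and maf(*case_counts) > 0.05
        and maf(*control_counts) > 0.05
        and hwe_chisq_p(*case_counts) > 1e-6
        and hwe_chisq_p(*control_counts) > 0.01
    )
```

The less stringent HWE threshold in cases than in controls follows the text: a true disease association can itself distort genotype proportions among cases, so only gross departures are filtered there.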

TaqMan SNP genotyping (Applied Biosystems) was performed on the Texas, Delaware, Utah, and Germany replication cohorts, using either predesigned or custom assays. Genotyping was done according to the recommendations of the manufacturer, with 16 ng of genomic DNA as starting material. Amplification was accomplished using a 384-well format PTC-200 (MJ Research) in a total volume of 5 μl. Polymerase chain reaction conditions were as follows: 95°C for 10 minutes, followed by 50 cycles of 95°C for 15 seconds and 60°C for 1 minute. Following amplification, products were analyzed on an Applied Biosystems 7300 Real-Time PCR System. In JIA cases and controls in the UK, SNPs were genotyped using a MassArray iPlex platform according to the instructions of the manufacturer (Sequenom). A 90% sample quality control rate and a 90% SNP genotyping success rate were imposed on the analysis. Due to genotyping technical issues, the UK replication cohort was genotyped for rs10995450 rather than rs10995447 (r2 = 0.55).

Statistical analysis.

Admixture and SNP statistical quality control.

To account for potential population substructure, a principal components analysis was performed using SNPs that passed quality control and were not in genomic regions with long-range linkage disequilibrium (LD) (18). Case–control association analyses were performed with adjustment for 2 principal components that minimized the inflation factor, such that adding additional or other principal components did not further reduce the inflation factor. Replication samples were not genotyped for admixture analysis.
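As a rough illustration of the inflation factor referred to above, the sketch below computes the conventional genomic-control ratio: the median observed 1-df chi-square statistic divided by its expected median under the null hypothesis. The text does not state how the factor was calculated, so this common definition is an assumption, as is the function name `genomic_inflation`.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()


def genomic_inflation(p_values):
    """Median observed 1-df chi-square divided by its null expectation.

    p_values must be two-sided P values strictly between 0 and 1.
    """
    # Convert each two-sided P value back to a 1-df chi-square statistic.
    chi2 = [_STD_NORMAL.inv_cdf(1 - p / 2) ** 2 for p in p_values]
    # Median of a 1-df chi-square distribution (about 0.455).
    expected_median = _STD_NORMAL.inv_cdf(0.75) ** 2
    return median(chi2) / expected_median
```

In a workflow like the one described, this value would be recomputed after adjusting for each candidate set of principal components, and the set giving the value closest to 1 would be retained.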

Association analysis.

Four tests of genotypic association, i.e., dominant, additive, and recessive models and lack-of-fit to an additive model, were performed using the program SNPGWA (www.phs.wfubmc.edu). The additive and recessive models required at least 10 and 30 homozygotes, respectively, for the minor allele. The genetic model and odds ratios were defined relative to the minor allele. The primary inference for this study was based on the additive genetic model unless the lack-of-fit to an additive model was statistically significant (P < 0.05). If the lack-of-fit test was significant, the minimum P value from the dominant, additive, or recessive model was reported.
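The reporting rule above amounts to a small decision procedure, sketched here in Python. The function name `primary_p_value` and its arguments are hypothetical, and two details are assumptions not spelled out in the text: the homozygote minimums are applied to a single count of minor-allele homozygotes, and a model that fails its minimum is simply dropped from consideration.

```python
def primary_p_value(p_additive, p_dominant, p_recessive, p_lack_of_fit,
                    n_minor_homozygotes, alpha=0.05):
    """Return (model_name, p_value) following the rule described in the text."""
    # Assumption: models that do not meet the homozygote minimum are excluded.
    models = {"dominant": p_dominant}
    if n_minor_homozygotes >= 10:
        models["additive"] = p_additive
    if n_minor_homozygotes >= 30:
        models["recessive"] = p_recessive

    # Primary inference: additive model unless lack of fit is significant.
    if p_lack_of_fit >= alpha and "additive" in models:
        return "additive", models["additive"]

    # Otherwise report the smallest P value among the available models.
    best = min(models, key=models.get)
    return best, models[best]
```

For example, `primary_p_value(1e-4, 1e-5, 0.2, 0.01, 40)` selects the dominant model, because the lack-of-fit test is significant and the dominant P value is the smallest.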

Replication study analysis.

The 10 SNPs taken forward in the replication study represent novel regions not previously reported for JIA (14); in the replication study a few SNPs associated with JIA that were in LD with each other were chosen. In all cases, SNP intensity plots showed robust genotype calling, and allele frequency was consistent between out-of-study and local controls. The replication genotypes were tested for association by meta-analysis using the weighted inverse normal method, where the weights were the square root of the sample size. For this analysis the 24 Delaware JIA cases were combined with the large UK collection due to a lack of regional controls. Finally, a meta-analysis of the discovery and replication cohorts was also computed both with and without weighting. For the former, the weighted inverse normal method was used and weighted by the square root of the sample size. Since rs10995450 was genotyped instead of rs10995447 in the UK replication cohort, evidence of association only from the discovery and UK cohorts was used in the meta-analysis of rs10995450.
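The weighted inverse normal method can be sketched as follows. This is an illustration rather than the authors' code: it assumes each cohort contributes a two-sided P value together with the direction of its effect, which is one common way to apply the method but is not specified in the text, and the function name `weighted_inverse_normal` is hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def weighted_inverse_normal(cohorts, weighted=True):
    """Combine cohort results into a meta-analysis Z score and two-sided P value.

    cohorts: iterable of (two_sided_p, effect_sign, sample_size), where
    effect_sign is +1 or -1 (direction of the effect for the minor allele).
    """
    numerator = 0.0
    sum_sq_weights = 0.0
    for p, sign, n in cohorts:
        # Signed Z score corresponding to the two-sided P value.
        z = math.copysign(_STD_NORMAL.inv_cdf(1 - p / 2), sign)
        # Weight by sqrt(sample size), or equally if weighted is False.
        w = math.sqrt(n) if weighted else 1.0
        numerator += w * z
        sum_sq_weights += w * w
    z_meta = numerator / math.sqrt(sum_sq_weights)
    p_meta = 2 * (1 - _STD_NORMAL.cdf(abs(z_meta)))
    return z_meta, p_meta
```

With `weighted=False` every cohort counts equally, which corresponds to the unweighted analysis. When one cohort is much larger than the others, the weighted result is driven mainly by that cohort's Z score.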

Expression QTL analysis.

Gene expression in peripheral blood mononuclear cells (from 68 JIA cases and 23 healthy controls) was determined using an Affymetrix U133 Plus 2.0 GeneChip array as previously reported (15, 19). To test for an association between each SNP in the GWAS and gene expression levels, analysis of covariance was performed using the natural log of expression levels as the outcome and the 2 principal components as covariates. Due to the small sample size, only dominant and additive genetic models were computed. To assess functional impact, in silico examination of all SNPs shown in Table 1 (or proxy SNPs [r2 > 0.5, HapMap CEU, i.e., samples from Utah residents with ancestry from northern and western Europe]) was also completed using gene expression data derived from lymphoblastoid cell lines (LCLs) from 378 children with asthma (www.sph.umich.edu/csg/liang/imputation/) (20). Cis associations (P < 0.05) are reported. Affymetrix probe set annotations were confirmed by comparing consensus sequences in RefSeq databases.

Table 1. Genome-wide association study results in the JIA discovery cohort*

| SNP (gene) | Chromosome/Mb (MA) | n, JIA | n, controls | MAF, JIA | MAF, controls | Genotype frequency AA/AB/BB, JIA | Genotype frequency AA/AB/BB, controls | P | OR (95% CI) |
|---|---|---|---|---|---|---|---|---|---|
| rs6766899 (CDGAP) | 3/120.59 (T) | 806 | 3,043 | 0.26 | 0.22 | 0.54/0.41/0.06 | 0.62/0.34/0.05 | 9.82 × 10−5 | 1.37 (1.17–1.60) |
| rs4688011 (C3orf1) | 3/120.71 (T) | 814 | 3,046 | 0.24 | 0.19 | 0.59/0.35/0.07 | 0.66/0.3/0.03 | 1.88 × 10−6 | 1.37 (1.21–1.57) |
| rs13139573 (IL15) | 4/142.84 (T) | 813 | 3,058 | 0.44 | 0.48 | 0.33/0.47/0.2 | 0.26/0.52/0.22 | 2.44 × 10−4 | 0.73 (0.62–0.86) |
| rs4254850 (IL15) | 4/142.91 (G) | 811 | 3,058 | 0.43 | 0.47 | 0.34/0.46/0.19 | 0.27/0.51/0.22 | 7.84 × 10−5 | 0.72 (0.61–0.85) |
| rs10995447 (NRBF2-EGR2) | 10/64.55 (T) | 813 | 3,058 | 0.22 | 0.18 | 0.6/0.36/0.04 | 0.67/0.29/0.04 | 1.09 × 10−4 | 1.37 (1.17–1.61) |
| rs6479891 (JMJD1C) | 10/64.68 (T) | 814 | 3,049 | 0.19 | 0.13 | 0.66/0.31/0.03 | 0.75/0.23/0.02 | 6.10 × 10−8 | 1.59 (1.34–1.87) |
| rs10761747 (JMJD1C) | 10/64.79 (G) | 812 | 3,056 | 0.26 | 0.21 | 0.55/0.38/0.07 | 0.63/0.32/0.05 | 7.44 × 10−5 | 1.29 (1.14–1.46) |
| rs12411988 (REEP3) | 10/64.99 (C) | 814 | 3,058 | 0.18 | 0.13 | 0.66/0.31/0.02 | 0.76/0.22/0.02 | 1.16 × 10−7 | 1.57 (1.33–1.86) |
| rs12719740 (IGF1R-FAM169B) | 15/96.89 (T) | 812 | 3,058 | 0.23 | 0.17 | 0.6/0.35/0.05 | 0.69/0.28/0.03 | 6.80 × 10−8 | 1.45 (1.27–1.66) |
| rs9302588 (CHD9-TOX3) | 16/51.55 (T) | 814 | 3,058 | 0.23 | 0.18 | 0.59/0.36/0.05 | 0.67/0.29/0.04 | 8.13 × 10−6 | 1.44 (1.23–1.68) |

* The additive model results are presented unless the test for lack of fit to an additive model was significant (P ≤ 0.05) in the discovery cohort. This was the case for rs6766899, rs13139573, rs4254850, rs10995447, rs6479891, rs12411988, and rs9302588, and the dominant model is provided. Positions are from NCBI Build 36 throughout. JIA = juvenile idiopathic arthritis; SNP = single-nucleotide polymorphism; MAF = minor allele frequency; OR = odds ratio; 95% CI = 95% confidence interval.

Estimation of genetic variance.

The cumulative variance explained by common SNP variation was estimated using a variance component model and restricted maximum likelihood estimation as implemented in the GCTA software package (21). The variance component models were adjusted for sex and the 2 principal components for population structure, with separate variance components for each chromosome and one for the extended MHC region extending from HIST1H2AA (telomeric end) to RPL12P1 (centromeric end). Estimates using Yang's correction factor (c = 0 from formula 9) (21) for imperfect LD with causal variants were nearly identical (not shown). The estimates use SNPs within the GWAS that had <1% missing genotypes (555,355 SNPs). In addition, individuals were excluded using a relatedness threshold of 0.025, consistent with Yang et al (21). The stringent relatedness criteria resulted in 82 individuals being dropped from the within-study analysis and 219 individuals being dropped when out-of-study controls were included in the analysis. Of note, results were comparable using GWAS data with and without these additional exclusions.

RESULTS

The discovery cohort included 814 JIA cases, 658 local controls of self-reported non-Hispanic, European American ancestry, and 2,400 out-of-study controls from the MGS_nonGAIN samples available from dbGaP. Demographic and clinical details are provided in Supplementary Table 1 (http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1529-0131). The cases were classified based on the ILAR revised criteria for JIA (1). To control for potential population substructure, 2 principal components were identified and included in the logistic regression model as covariates. The genome-wide inflation factor was 1.04, and there was little evidence of systematic departure from expectation (as shown in the Q-Q plot presented in Supplementary Figure 1, available at http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1529-0131). A total of 561,137 SNPs had an MAF of >0.05, no differential missingness between cases and controls, <5% missing data, and no evidence of departure from HWE (P > 0.01 in controls, P > 0.0001 in cases). P values less than or equal to 5 × 10−8 were considered significant for genome-wide testing. Estimates of statistical power are shown in Supplementary Figure 2, on the Arthritis & Rheumatism web site at http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1529-0131.

Both HLA and non-HLA associations with JIA were identified in the principal component–adjusted analysis (Figure 1A). The strongest associations were within the MHC region and included HLA class I and class II loci (Figure 1B) as well as non-HLA loci independent of LD (Thompson SD, et al: unpublished observations). The strongest SNP associations outside the MHC region included novel loci as well as loci that have been implicated in other autoimmune diseases. We have previously reported the overlap with autoimmune disease susceptibility loci in this data set (14). To extend association findings for JIA, 10 SNPs, representing 5 regions not yet reported in JIA, were selected for replication testing in independent samples. These SNPs and the association results in the discovery cohort are listed in Table 1 and represent chromosomal regions 3q13, 4q31, 10q21, 15q26, and 16q12.

Figure 1.

A, Results of genome-wide association analysis of juvenile idiopathic arthritis, plotted on a genomic scale (Manhattan plot). Horizontal line represents a P value of 5 × 10−8. This conservative threshold for genome-wide significance does not take into account linkage disequilibrium in this Caucasian cohort. Loci shared with other autoimmune diseases are shown in blue. B, Association results for the extended major histocompatibility complex region (chromosome 6 [26–34 Mb]). Single-nucleotide polymorphisms (SNPs) with P values of less than 0.01 are represented and are color-coded by odds ratio (OR) strata. SNPs of interest include 1) rs1035798 (position 32,259,200 bp, gene AGER; OR 2.32, P = 9.5 × 10−42); 2) rs2071286 (position 32,287,874 bp, gene NOTCH4; OR 2.21, P = 5.0 × 10−35); 3) rs10947261 (position 32,481,210 bp, gene BTNL2; OR 2.66, P = 1.1 × 10−32); 4) rs9268858 (position 32,537,736 bp, 17 kb from HLA–DRA; OR 2.70, P = 1.9 × 10−32); 5) rs9268853 (position 32,537,621 bp, 17 kb from HLA–DRA; OR 2.70, P = 2.3 × 10−32); and 6) rs9366752 (position 30,132,656 bp, 2 kb from HLA–H; OR 2.14, P = 1.9 × 10−21). TNF = tumor necrosis factor.

The genome-wide association analysis in the discovery cohort revealed evidence of an association of JIA with the chromosome 3q13 region, which includes the genes CD80, KTELC1, and C3orf1. Visual resolution of the association pattern across this region (Figure 2) indicated that the signal maps to a single LD block. Replication was attempted for rs4688011 and rs6766899 based on the statistical findings in the discovery cohort (odds ratio [OR] 1.37, P = 1.9 × 10−6 and OR 1.37, P = 9.8 × 10−5, respectively). Association with JIA was consistent in the replication cohort for rs4688011 (OR 1.18, P = 0.0033), and rs4688011 was the strongest of all markers tested in meta-analyses (OR 1.23, P = 3.6 × 10−7) (Table 2). Due to the disparate relative sample sizes of the cohorts, the weighted meta-analysis tended to be dominated by the UK cohort. Weighting each cohort equally modestly increased the evidence of association for both rs4688011 and rs6766899 (Table 2).

Figure 2.

Regional plots of juvenile idiopathic arthritis loci. Genotyped and imputed single-nucleotide polymorphisms (SNPs) are plotted with their P values from the discovery cohort as a function of genomic position (assembly hg18) within a 400-kb region surrounding the most significant SNP (index SNP). Recombination rates from the HapMap phase II CEU (samples from Utah residents with ancestry from northern and western Europe) are plotted in blue to reflect the regional linkage disequilibrium (LD) structure. In each region the index SNP is represented by a large purple diamond, and the color of all other SNPs (circles) indicates LD with the index SNP, based on pairwise r2 values from HapMap CEU (red = r2 >0.8, orange = r2 0.6–0.8, green = r2 0.4 to <0.6, light blue = r2 0.2 to <0.4, and dark blue = r2 <0.2). SNPs chosen for replication analysis are represented by triangles. Known human genes in the University of California, Santa Cruz Genome Browser are indicated at the bottom of the plots.

Table 2. Replication of the JIA association findings*

| SNP (gene) | Chromosome/Mb (MA) | Replication: n, JIA | Replication: n, controls | Replication: MAF, JIA | Replication: MAF, controls | Replication: P | Replication: OR (95% CI) | Combined: MAF, JIA | Combined: MAF, controls | Combined: weighted P | Combined: OR (95% CI) | Combined: unweighted P |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| rs6766899 (CDGAP) | 3/120.59 (T) | 1,726 | 6,997 | 0.24 | 0.22 | 0.2260 | 1.08 (0.96–1.21) | 0.24 | 0.22 | 1.55 × 10−3 | 1.15 (1.05–1.26) | 2.60 × 10−4 |
| rs4688011 (C3orf1) | 3/120.71 (T) | 1,722 | 6,982 | 0.21 | 0.19 | 0.0033 | 1.18 (1.06–1.3) | 0.22 | 0.19 | 3.60 × 10−7 | 1.23 (1.13–1.33) | 1.23 × 10−7 |
| rs13139573 (IL15) | 4/142.84 (T) | 1,702 | 6,985 | 0.45 | 0.46 | 0.0222 | 0.87 (0.77–0.99) | 0.45 | 0.47 | 8.18 × 10−5 | 0.83 (0.75–0.92) | 7.11 × 10−5 |
| rs4254850 (IL15) | 4/142.91 (G) | 969 | 1,573 | 0.46 | 0.47 | 0.5001 | 1.39 (0.82–2.34) | 0.44 | 0.46 | 9.93 × 10−2 | 1.16 (0.79–1.69) | 4.37 × 10−54 |
| rs10995447 (NRBF2-EGR2) | 10/64.55 (T) | 969 | 1,607 | 0.18 | 0.18 | 0.6480 | 0.88 (0.58–1.35) | 0.20 | 0.18 | 7.49 × 10−2 | 1.00 (0.73–1.36) | 7.46 × 10−4 |
| rs10995450 (NRBF2-EGR2) | 10/64.56 (T) | 754 | 5,380 | 0.20 | 0.18 | 0.0465 | 1.14 (1.00–1.31) | 0.21 | 0.18 | 5.39 × 10−5 | 1.21 (1.10–1.34) | 6.69 × 10−5 |
| rs6479891 (JMJD1C) | 10/64.68 (T) | 1,728 | 6,982 | 0.15 | 0.14 | 0.1920 | 1.06 (0.93–1.21) | 0.16 | 0.14 | 4.33 × 10−5 | 1.19 (1.07–1.32) | 2.26 × 10−7 |
| rs10761747 (JMJD1C) | 10/64.79 (G) | 1,719 | 6,987 | 0.22 | 0.21 | 0.1430 | 1.05 (0.95–1.16) | 0.23 | 0.21 | 6.35 × 10−4 | 1.11 (1.02–1.20) | 1.32 × 10−4 |
| rs12411988 (REEP3) | 10/64.99 (C) | 1,701 | 6,978 | 0.14 | 0.13 | 0.1320 | 1.06 (0.93–1.21) | 0.15 | 0.13 | 2.71 × 10−5 | 1.18 (1.06–1.31) | 2.91 × 10−7 |
| rs12719740 (IGF1R-FAM169B) | 15/96.89 (T) | 1,725 | 6,974 | 0.19 | 0.19 | 0.5670 | 1.05 (0.95–1.17) | 0.20 | 0.18 | 5.19 × 10−4 | 1.15 (1.06–1.25) | 6.97 × 10−7 |
| rs9302588 (CHD9-TOX3) | 16/51.55 (T) | 1,728 | 6,957 | 0.21 | 0.20 | 0.3700 | 1.03 (0.92–1.16) | 0.21 | 0.19 | 1.27 × 10−3 | 1.13 (1.03–1.25) | 4.13 × 10−5 |

* Replication results are from a meta-analysis of cohorts from Delaware, Germany, Texas, the UK, and Utah. "Replication" columns refer to the meta-analysis of the replication cohorts, and "Combined" columns refer to the meta-analysis of the combined discovery and replication cohorts. Both meta-analyses were computed using the weighted inverse normal method. The meta-analysis of the combined cohorts was also computed with equal weighting of the cohorts (unweighted). See Table 1 for definitions and genetic model information.

For technical reasons, rs10995450 was genotyped as a proxy for rs10995447 in the UK samples (r2 = 0.55, P = 6.74 × 10−5, OR 1.31 [95% CI 1.15–1.50] in the discovery cohort).

Compelling evidence for genetic association was also found at 10q21, a region that has not been reported for any other autoimmune disease. Specifically, rs6479891, located within JMJD1C (the gene for jumonji domain–containing 1C), represented the single strongest association outside the MHC region in the discovery cohort (OR 1.59, P = 6.1 × 10−8). Analysis of the replication cohort for SNPs in the 10q21.3 region revealed odds ratios that were consistent in direction, but of smaller magnitude and less significance, than in the discovery cohort (Table 2). Notably, only 1 of the 5 SNPs in the 10q21.3 region, rs10995450, demonstrated an association that reached statistical significance at the P < 0.05 level in the replication cohort (only the discovery and UK cohorts were genotyped for this SNP). Weighting each cohort equally increased the evidence for association for all of the SNPs but rs10995450, which had comparable evidence under both approaches.

To assess the potential impact of the polymorphisms on expression levels of neighboring genes, eQTL analysis was performed, using 2 resources: 1) a data set comprising 68 JIA cases and 23 healthy controls from the GWAS, and 2) an online data set with GWAS and expression analysis from LCLs from individuals with asthma (20). The SNPs that showed association in 3q13 were significantly associated with expression levels in the eQTL analyses (Table 3). Specifically, in the JIA/control data set, both rs4688011 in C3orf1 and rs6766899 in CDGAP were associated with the TMEM39A probe set 222690_s_at (P = 0.024 and P = 0.0097, respectively). The order of genes on chromosome 3, illustrated in Figure 2, depicts TMEM39A as located between CDGAP and C3orf1. Interestingly, there was a trend toward a correlation of rs6766899 with expression levels of CD80 (probe set 1554519_at) (P = 0.065), as well. When analyzing the publicly available online LCL data set relative to our results regarding chromosome 3, we found that rs4447803, which is in LD with rs4688011 (r2 = 0.54), was strongly associated with expression of KTELC1 (P = 5.7 × 10−6), with differences in expression levels comparable in magnitude, consistent with reports of KTELC1 expression in celiac disease (22). (For celiac disease, the associated SNP is rs1599796, which is in almost complete LD with rs4688011 [r2 = 0.95 in CEU].) Neither C3orf1 nor CD80 expression data were available in the online LCL data set.

Table 3. Significant findings for eQTL analysis considering regions supported by genetic association*

| SNP (locus)/Mb (MA) | Expression probe set | cis eQTL: distance, kb | cis eQTL: gene | cis eQTL: P | Normalized expression, 1 copy of MA§ | Normalized expression, 2 copies of MA§ |
|---|---|---|---|---|---|---|
| Chromosome 3 | | | | | | |
| rs6766899 (CDGAP)/120.59 (T) | 222690_s_at | 42.45 | TMEM39A | 0.0097 | 1.01 | 0.92 |
| | 240232_at | 110.12 | C3orf1 | 0.031 | 1.08 | 0.83 |
| | 1554519_at | 135.89 | CD80 | 0.065 | 1.00 | 0.99 |
| | 238505_at | 191.27 | ADPRH | 0.0364 | 1.17 | 0.97 |
| rs4688011 (C3orf1)/120.71 (T) | 222690_s_at | 44.78 | TMEM39A | 0.024 | 1.04 | 0.98 |
| | 228042_at | 71.28 | ADPRH | 0.0339 | 1.01 | 0.96 |
| rs4447803 (C3orf1)/120.72 (A) | 218587_s_at | 22.42 | KTELC1 | 5.7 × 10−6# | NA | NA |
| Chromosome 10 | | | | | | |
| rs10995447 (NRBF2-EGR2)/64.55 (T) | 230007_at | 47.47 | JMJD1C | 0.0318 | 0.99 | 1.02 |
| rs10995450 (NRBF2-EGR2)/64.56 (T) | 223650_s_at | 4.10 | NRBF2 | 0.0303 | 1.13 | 0.97 |
| | 241391_at | 38.08 | JMJD1C | 0.00058# | NA | NA |
| rs6479891 (JMJD1C)/64.68 (T) | 223650_s_at | 91.67 | NRBF2 | 0.0109 | 1.12 | 1.00 |
| | 230007_at | 0 | JMJD1C | 0.042 | 1.05 | 1.04 |
| rs10761747 (JMJD1C)/64.79 (G) | 223650_s_at | 193.37 | NRBF2 | 0.0306 | 1.12 | 0.99 |
| | 1556622_s_at | 0 | JMJD1C | 0.0442 | 1.03 | 0.90 |
| | 241391_at | 0 | JMJD1C | 1.6 × 10−6# | NA | NA |
| rs12411988 (REEP3)/64.99 (C) | 230007_at | 89.68 | JMJD1C | 0.008 | 1.04 | 1.04 |

* Principal component–adjusted P values were used for the local data set. Lymphoblastoid cell line data available online at www.sph.umich.edu/csg/liang/imputation/ provide additional support for functional relationships between genotype and gene expression levels.

Expression values in the expression quantitative trait locus (eQTL) analysis were log-transformed.

The distance from the single-nucleotide polymorphism (SNP) to the nearest end of the gene is from NCBI Build 36.

§ 1 and 2 refer to the number of copies of the genotyped minor allele (MA). The geometric mean of the normalized expression values was computed for genotypes 0, 1, and 2, and the results for 1 and 2 are presented as the ratio to the 0 category (homozygous major allele). NA = data not available.

rs4447803 is a SNP proxy for rs4688011 (r2 = 0.54).

# From the lymphoblastoid cell line data set.
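The normalization described in the Table 3 footnote (a geometric mean per genotype, expressed as a ratio to the homozygous major allele group) can be sketched as follows. The function name `expression_ratios` and the input layout are hypothetical, chosen only to illustrate the arithmetic.

```python
from statistics import geometric_mean


def expression_ratios(expression_by_genotype):
    """Ratio of geometric-mean expression for 1 and 2 minor-allele copies
    relative to 0 copies (homozygous major allele).

    expression_by_genotype maps the minor-allele count (0, 1, 2) to a list of
    positive normalized expression values for individuals with that genotype.
    """
    baseline = geometric_mean(expression_by_genotype[0])
    return {
        copies: geometric_mean(expression_by_genotype[copies]) / baseline
        for copies in (1, 2)
    }
```

A ratio above 1 indicates higher expression than in homozygotes for the major allele, and a ratio below 1 indicates lower expression.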

Similarly, the SNPs that showed association with JIA in 10q21 were associated with levels of expression of certain genes (Table 3). Specifically, rs12411988 and rs6479891 were associated with expression levels measured by the JMJD1C probe set 230007_at (P = 0.008 and P = 0.042, respectively), and other JIA-associated SNPs in the region were also associated with either JMJD1C or NRBF2 expression. These included SNPs in modest LD with rs6479891, such as rs10761747 and rs10995450 (r2 = 0.48 and r2 = 0.57, respectively). This is notable since in the online LCL data set, both SNPs were associated with differential expression for the JMJD1C probe set 241391_at (P = 1.6 × 10−6 and P = 5.8 × 10−4, respectively) (20). In total, there were 8 SNPs from the JMJD1C region in the online data set that were highly associated with expression of this gene. These SNPs were in high LD with JIA-associated SNPs (r2 > 0.77) and located within or near JMJD1C, which encompasses a region of >150 kb (Figure 2). Furthermore, one of the SNPs that relates to gene expression values (rs10761725) codes for a conservative substitution (Ser to Thr) in the JMJD1C protein, but is not available in the Affymetrix SNP data set.

A third region of interest identified by genome-wide analysis was located at 4q31 and included the gene encoding interleukin-15 (IL-15). One SNP, rs4254850, showed evidence of association (OR 0.72, P = 7.8 × 10−5), but not replication, whereas rs13139573, a nearby SNP in high LD with rs4254850 (r2 = 0.84), which had more modest results in the discovery cohort, showed modest evidence of association in the replication cohort (P = 2.4 × 10−4 in the discovery cohort, P = 0.02 in the replication cohort). Figure 2 highlights the statistical strength of the association in this region and the localization of the signal to the IL15 gene. No correlation with expression levels was found for IL15 loci in the eQTL analysis.

Replication was also attempted for 2 additional loci identified in the genome-wide analysis. In the discovery cohort, an association with JIA was found for rs9302588 in the 16q12 region (OR 1.44, P = 8.1 × 10−6), but evidence of an association with this SNP was not demonstrated in the replication cohort. Similarly, rs12719740 on 15q26 was associated with JIA in the discovery cohort (OR 1.45, P = 6.8 × 10−8) but this was not supported by replication.

While the genetic architecture of JIA susceptibility has been informed by this and other studies, the extent to which common genetic variation, as opposed to rare variants, epigenetics, and gene–gene as well as gene–environment interactions, accounts for the risk of JIA remains unclear. To address this, we used these GWAS data to estimate the fraction of the JIA risk that could be attributed to common SNP variation (MAF >0.05), using a variance component threshold liability model (21) and assuming various rates of disease prevalence ranging from 25 to 140 per 100,000 (2) (Table 4). We found that common SNP variation accounted for an estimated one-third of JIA susceptibility. This result was similar even with exclusion of the out-of-study controls. When this estimate was partitioned into the individual chromosomes and extended MHC region, the extended MHC accounted for ∼8% of the variation in JIA susceptibility, while the remainder of the genome accounted for ∼20%.
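To show why the estimate depends on the assumed prevalence, the sketch below applies the widely used conversion of explained variance from the observed (case–control) scale to the liability scale. Whether the analysis reported here used exactly this form is an assumption, and the function name `observed_to_liability`, the input value of 0.5, and the case fraction are purely illustrative.

```python
from statistics import NormalDist


def observed_to_liability(h2_observed, prevalence, case_fraction):
    """Convert explained variance from the observed 0/1 scale to the liability scale.

    prevalence: assumed population prevalence of disease (K).
    case_fraction: proportion of cases in the analyzed sample (P).
    """
    std_normal = NormalDist()
    # Liability threshold above which an individual is affected.
    threshold = std_normal.inv_cdf(1 - prevalence)
    # Height of the standard normal density at that threshold.
    z = std_normal.pdf(threshold)
    k, p = prevalence, case_fraction
    return h2_observed * (k * (1 - k)) ** 2 / (z ** 2 * p * (1 - p))


# Illustrative only: 814 cases among 3,872 genotyped individuals, before any
# relatedness exclusions, and an arbitrary observed-scale value of 0.5.
case_fraction = 814 / 3872
low = observed_to_liability(0.5, 25 / 100_000, case_fraction)
high = observed_to_liability(0.5, 140 / 100_000, case_fraction)
```

For a fixed observed-scale value, the liability-scale estimate rises as the assumed prevalence rises (`high` exceeds `low`), which is the direction of the trend across the prevalence rows of Table 4.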

Table 4. Estimates of phenotypic variance in juvenile idiopathic arthritis susceptibility explained by genome-wide single-nucleotide polymorphisms among unrelated individuals, obtained using a threshold model and the restricted maximum likelihood method*

                     Combined (within-study and out-of-study controls)             Within-study data only
Prevalence of JIA    Entire genome   Extended MHC region   Non-extended MHC        Entire genome   Extended MHC region   Non-extended MHC
25 per 100,000       0.30 ± 0.04     0.08 ± 0.01           0.18 ± 0.04             0.29 ± 0.03     0.05 ± 0.01           0.19 ± 0.07
40 per 100,000       0.32 ± 0.04     0.08 ± 0.01           0.19 ± 0.04             0.31 ± 0.03     0.06 ± 0.01           0.21 ± 0.07
50 per 100,000       0.33 ± 0.04     0.09 ± 0.01           0.20 ± 0.04             0.32 ± 0.03     0.06 ± 0.01           0.21 ± 0.07
80 per 100,000       0.35 ± 0.04     0.09 ± 0.01           0.21 ± 0.04             0.34 ± 0.03     0.06 ± 0.01           0.23 ± 0.08
140 per 100,000      0.38 ± 0.04     0.10 ± 0.01           0.23 ± 0.05             0.38 ± 0.03     0.07 ± 0.01           0.25 ± 0.08

* Values are the mean ± SEM estimate of explained variance on a liability scale. Single-nucleotide polymorphisms used had <1% missing data, no evidence of differential missingness between cases and controls (P > 0.05), and no evidence of departure from Hardy-Weinberg equilibrium (P > 0.01 in controls; P > 1 × 10−6 in cases). The extended major histocompatibility complex (MHC) is defined as the region extending from HIST1H2AA (telomeric end) to RPL12P1 (centromeric end). JIA = juvenile idiopathic arthritis.

DISCUSSION

This work represents, to our knowledge, the largest GWAS of JIA cases to date and focuses on the 2 most common subtypes, oligoarticular and RF-negative polyarticular JIA. There was no evidence that the associations varied between these 2 subtypes for the SNPs included in the replication studies. Sufficient numbers of samples are not yet available to allow independent consideration of JIA subtypes. Oligoarticular and RF-negative polyarticular JIA have overlapping HLA associations, share a female sex bias, and are distinguished primarily based on the number of joints involved. Thus, a priori they are the most logical subtypes to combine to maximize the power to detect common JIA susceptibility loci and minimize heterogeneity.

We report novel JIA-associated loci and supporting eQTL results that extend the JIA associations beyond those previously reported (PTPN2, PTPN22, IL2RA, ADAD1-IL2-IL21, ANGPT1, COG6, C12orf30, and STAT4). The eQTL analyses were reinforced by 2 data sets: one that includes a subgroup of the same patients and controls used in our GWAS, and a publicly available data set in which gene expression was measured in lymphoblastoid cell lines. The strongest replicated evidence for association with JIA was found at the chromosome 3q13 region, which includes CD80, a costimulatory molecule necessary for T cell activation, and KTELC1, an O-glucosyltransferase that modifies the Notch receptor in Drosophila (23). Notch signaling is important at several stages of T cell development and differentiation and has been proposed as a target for selective therapy for autoimmune disorders. Consistent with the theme of overlapping association findings across autoimmune diseases, genetic association with celiac disease has also been reported in this region and can be related to gene expression findings (22).

Evidence of association within JMJD1C was observed in the discovery cohort, and, although consistent in direction, a less significant effect was detected in the replication cohort. JMJD1C encodes a hormone-dependent transcription factor that regulates expression of a variety of target genes. It includes a jumonji domain that functions by removing methyl marks on histones that are associated with gene regulation (24). JMJD1C expression has been reported in multiple immune cells, including B cells as well as CD4+ and CD8+ T cells (www.biogps.gnf.org). It is noteworthy that a gene expression signature (50 expression probe sets) that identified a subset of JIA patients characterized by a disease course of chronically active arthritis included probe sets specific for both JMJD1C and the previously reported PTPN2 loci (14, 19). This overlap of findings is unlikely to be explained by chance and suggests a functional relationship between genetic differences and gene expression findings in processes related to disease pathogenesis. This region warrants further study not only to validate the genetic association, but also to explore the patterns of histone methylation that may relate to disease.

There are additional SNP–JIA associations that have not yet been considered in replication studies. A listing of the 200 top statistical associations is provided in Supplementary Table 2 (on the Arthritis & Rheumatism web site at http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1529-0131), and the complete results from the association analyses are available at http://research.cchmc.org/ARDec11. Supplementary Table 2 includes additional SNPs in the regions discussed above and in genes for which associations have been reported in other autoimmune diseases but have not yet been investigated in JIA.

For example, there is evidence of association at 2p14, where the most significant SNP in JIA (rs268132) is located within intron 1 of SPRED2. This gene encodes sprouty-related, EVH1 domain–containing protein 2, which has been shown to regulate growth factor–induced activation of the MAPK cascade and has been reported to be a rheumatoid arthritis risk locus (25). For rheumatoid arthritis, the most significant SNP, rs934734, is also in intron 1 and in high LD with the JIA-associated SNP, rs268132 (r2 = 0.78). Similarly, rs423847 in ANTXR2 shows evidence of association with JIA (OR 1.42, P = 2.4 × 10−5). This gene, located at 4q21, encodes capillary morphogenesis protein 2 and has been associated with ankylosing spondylitis (rs4333130) (26). Our findings also include SNPs in PTPRS (receptor protein tyrosine phosphatase σ), a gene that has been associated with ulcerative colitis (27), and NOS2, which encodes inducible nitric oxide synthase and is a susceptibility locus for psoriasis (28).

Finally, SNP rs2548997 near the IRF1 gene within the 5q31.1 region, which also harbors IL-4, IL-5, and IL-13, shows evidence of association (P = 1.82 × 10−4). This region was studied early on for association in JIA with conflicting results (29, 30), perhaps due to population substructure or JIA phenotype heterogeneity. Thus, our findings suggest that further studies in this important region, which includes genes related to polarization of cytokine repertoires, are warranted. This phenomenon of cytokine repertoire polarization has been previously observed in JIA (31).

The present results confirm the role of genetic influences on susceptibility to JIA and provide additional evidence that common genetic variation is important in explaining JIA risk. The estimated genetic variance explained by common variation is higher than estimates for other complex genetic traits such as Crohn's disease, but similar to the estimate for type 1 diabetes, which also has onset in childhood (32). In adult arthritis, it is estimated that the 27 non-HLA loci confirmed to date and reported by Stahl et al in 2010 (25) account for only 10.7% of the genetic susceptibility (33). We estimated that approximately one-third of JIA risk can be attributed to common genetic variation and that the extended MHC region accounts for approximately one-fourth of the heritable risk (0.08/0.32 = 0.25, using the combined-control estimates for a prevalence of 40 per 100,000 in Table 4). Thus, continued mining of JIA GWAS data in expanded cohorts and for individual JIA subtypes is warranted and will likely lead to the discovery of additional susceptibility loci.
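
The one-fourth figure can be checked against each assumed prevalence in Table 4. The short sketch below uses the combined-control point estimates from that table (entire genome and extended MHC region) and ignores their standard errors; the variable and function names are illustrative assumptions.

```python
# Point estimates from Table 4, combined controls:
# prevalence per 100,000 -> (entire genome, extended MHC region)
table4_combined = {
    25: (0.30, 0.08),
    40: (0.32, 0.08),
    50: (0.33, 0.09),
    80: (0.35, 0.09),
    140: (0.38, 0.10),
}


def mhc_share(entire_genome, extended_mhc):
    """Fraction of the SNP-explained liability variance attributed to the extended MHC."""
    return extended_mhc / entire_genome


shares = {prev: mhc_share(*est) for prev, est in table4_combined.items()}
for prev, share in shares.items():
    print(f"{prev} per 100,000: {share:.2f}")
```

Across the assumed prevalences, the share remains close to one-fourth, so that conclusion does not hinge on the particular prevalence chosen.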

As noted above, the disparate relative sample sizes among the replication cohorts tended to cause the UK cohort to dominate the results. Weighting each cohort equally in the meta-analysis, while retaining the UK cohort, yielded increased statistical evidence in all of the regions shown in Table 2 and for all SNPs except rs10995450, for which the evidence was comparable. Because the WTCCC2 is not a North American cohort, further research is needed to determine whether the differing results were due to sampling variation, false associations, or population differences.
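
The effect of the weighting scheme can be sketched with a weighted Z-score (Stouffer) combination. This is an illustration only: the meta-analysis method actually used is not restated in this section, so the weighted Z approach is an assumption, and the cohort Z scores and sample sizes below are hypothetical rather than taken from Table 2.

```python
from math import sqrt
from statistics import NormalDist


def combined_z(z_scores, weights):
    """Weighted Z-score (Stouffer) combination of per-cohort Z scores."""
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    return numerator / denominator


def two_sided_p(z):
    """Two-sided P value for a standard normal Z score."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical cohorts: two smaller cohorts with a clear signal,
# one much larger cohort with a weak signal.
z_scores = [3.0, 2.5, 0.5]
sample_sizes = [500, 400, 6000]

size_weights = [sqrt(n) for n in sample_sizes]  # sample-size weighting
equal_weights = [1.0] * len(z_scores)           # each cohort counts equally

z_size = combined_z(z_scores, size_weights)
z_equal = combined_z(z_scores, equal_weights)
print("size-weighted: ", round(z_size, 2), two_sided_p(z_size))
print("equal-weighted:", round(z_equal, 2), two_sided_p(z_equal))
```

With sample-size weights, the largest cohort pulls the combined statistic toward its own weak result; with equal weights, the smaller cohorts contribute as much as the large one, which is the pattern described above.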

In summary, the novel JIA-associated loci identified here in an agnostic scan merit study in other JIA and autoimmune disease cohorts. Furthermore, integrative analyses of expression and association data will further accelerate the discovery of the underlying genetic architecture of JIA and ultimately lead to molecular profiles that are potentially relevant to diagnosis, outcomes, and treatment response.

AUTHOR CONTRIBUTIONS

All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Thompson had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study conception and design. Thompson, Sudman, Thomson, Hinks, Haas, Bohnsack, Wagner, Langefeld, Glass.

Acquisition of data. Thompson, Sudman, Ryan, Tsoras, Barnes, Thomson, Hinks, Haas, Prahalad, Bohnsack, Wise, Punaro, Rosé, Wagner, Glass.

Analysis and interpretation of data. Thompson, Marion, Sudman, Howard, Barnes, Ramos, Thomson, Hinks, Pajewski, Spigarelli, Keddache, Wagner, Langefeld, Glass.

Acknowledgements

We gratefully acknowledge contributions from physicians at CCHMC, the Medical College of Wisconsin, Schneider Children's Hospital, and Children's Hospital of Philadelphia (Philadelphia, PA) for the collection of patient samples, and the assistance of Sandy Kramer for patient recruitment at CCHMC and coordination of clinical information. We also acknowledge David R. McWilliams, PhD, for generating the Manhattan plots. Computing support was provided by the Wake Forest School of Medicine Center for Public Health Genomics. Data used in this study were generated by the WTCCC2 and funded under Wellcome Trust award 085475. A full list of the investigators who contributed to the generation of the data is available at http://www.wtccc.org.uk. The normal control DNA collection in the discovery cohort including all genotypes was supported and made available by CCHMC.
