Genome-Wide Association Study of Genetic Predictors of Anti–Tumor Necrosis Factor Treatment Efficacy in Rheumatoid Arthritis Identifies Associations with Polymorphisms at Seven Loci

Objective. Anti–tumor necrosis factor (anti-TNF) agents are successful therapies in rheumatoid arthritis (RA); however, inadequate response occurs in 30–40% of patients treated. Knowledge of the genetic factors that influence response may facilitate personalized therapy. The purpose of this study was to identify genetic predictors of response to anti-TNF therapy in RA and to validate our findings in independent cohorts.
Methods. Data from genome-wide association (GWA) studies were available from the Wellcome Trust Case Control Consortium for 566 anti-TNF–treated RA patients. Multivariate linear regression analysis of changes in the Disease Activity Score in 28 joints at 6 months was conducted at each single-nucleotide polymorphism (SNP) using an additive model. Associated markers (P < 10⁻³) were genotyped in 2 independent replication cohorts (n = 379 and n = 341), and a combined analysis was performed.
Results. Of 171 successfully genotyped markers demonstrating association with treatment response in the GWA data, 7 were corroborated in the combined analysis. The strongest effect was at rs17301249, mapping to the EYA4 gene on chromosome 6: the minor allele conferred improved response to treatment (coefficient −0.27, P = 5.67 × 10⁻⁵). The minor allele of rs1532269, mapping to the PDZD2 gene, was associated with a reduced treatment response (coefficient 0.20, P = 7.37 × 10⁻⁴). The remaining associated SNPs mapped to intergenic regions on chromosomes 1, 4, 11, and 12.
Conclusion. Using a genome-wide strategy, we have identified and validated the association of 7 genetic loci with response to anti-TNF treatment in RA. Additional confirmation of these findings in further cohorts will be required.

predictors of response at baseline could be of great clinical and economic benefit by allowing the targeting of these therapies to the patients who are most likely to respond.
Clinical predictors of response, such as concurrent methotrexate or nonsteroidal antiinflammatory drug therapy, functional disability, and smoking habits, account for only a small proportion of the variance in treatment response (4,5). Additional factors, such as genetic and serologic markers, are also likely to influence response.
Most studies of genetic predictors of anti-TNF response performed to date have focused on candidate genes known to play or thought to play a role in susceptibility to RA, such as the HLA-DRB1 shared epitope in the HLA region (6). Other studies have investigated the TNF gene itself (7,8), additional genes in the TNF signaling pathway (9), and other cytokines (10). Despite the many studies that have been conducted, no gene that influences anti-TNF response in RA has been definitively identified and replicated, although evidence for a role of the TNF -308 polymorphism is compelling (5). Small sample sizes and a focus on few variants in a narrow selection of candidate genes are the most likely explanations for this limited success, which highlights the need for a new strategy.
Over the last 2-3 years, genome-wide association (GWA) studies have been highly successful in identifying susceptibility genes in complex diseases. Such studies aim to interrogate thousands of genetic markers covering the majority of the whole genome to assess their relationship to the particular outcome of interest. Thus, GWA studies take an unbiased view of the whole genome and therefore have a higher probability of detecting an association with a genetic marker, providing that the studies are sufficiently powered.
In 2007, the Wellcome Trust Case Control Consortium (WTCCC) in the UK published the results of a large collaborative effort aimed at identifying common susceptibility polymorphisms implicated in 7 complex diseases (11). A case-control GWA study of ~500,000 single-nucleotide polymorphisms (SNPs) was performed in 3,000 control subjects and in 2,000 patients in each disease group, one of which was RA. Of the 2,000 RA cases contributed by our group, 566 were patients receiving anti-TNF treatment and had available data regarding treatment response.
The aims of the current study, therefore, were first, to identify candidate genetic predictors of response to anti-TNF therapy from the available GWA data and second, to validate these findings using independent cohorts of anti-TNF-treated RA patients.

PATIENTS AND METHODS
Study design. This study used a 3-stage design. In the first stage, GWA analysis of change in the Disease Activity Score in 28 joints (DAS28) (12) between baseline and 6 months was undertaken in 566 anti-TNF-treated RA patients who were included as part of the WTCCC study (stage 1 cohort). In the second stage, markers demonstrating evidence of association (P < 10⁻³) in stage 1 were genotyped in an independent cohort of 410 individuals (stage 2 cohort), and a meta-analysis of the 2 datasets was performed. In stage 3, markers for which the association signal was strengthened by meta-analysis, as compared to the association signal observed in the stage 1 cohort alone, were investigated in a third cohort of 364 anti-TNF-treated RA patients (stage 3 cohort), and a second meta-analysis of all 3 datasets was performed. The choice of a 3-stage design was pragmatic, since the study is actively recruiting, and further genotyping/analysis was undertaken as sufficient additional samples became available for inclusion.
Patients. The British Society for Rheumatology Biologics Register (BSRBR) was initiated with the aim of assessing the adverse events associated with treatment with the 3 anti-TNF biologic agents. The BSRBR has detailed clinical and response criteria on 4,000 patients with RA receiving each drug. Collaborations were established with a subset of the larger prescribing centers as part of the Biologics in Rheumatoid Arthritis Genetics and Genomics Study Syndicate (BRAGGSS; see Appendix A for the BRAGGSS investigators and study centers). Blood samples for DNA extraction were obtained from anti-TNF-treated RA patients who met the following 3 inclusion criteria: 1) RA confirmed by a physician, 2) currently receiving or about to begin receiving treatment with 1 of 3 anti-TNF drugs (etanercept, infliximab, or adalimumab), and 3) Caucasian origin, thus reducing the potential for spurious associations arising as a result of population stratification. Patients were ineligible for this study if they had stopped treatment during the first 6 months for reasons other than lack of efficacy.
Marker selection and genotyping. Genotyping of DNA samples included in the original GWA study of RA was described in detail in the online methods for the WTCCC report (11). Briefly, 250 ng of DNA was genotyped using an Affymetrix GeneChip 500K Mapping Array Set, which interrogates 500,000 SNPs. After quality control measures, a total of 459,446 SNPs remained available for analysis.
Markers demonstrating a significant association (P < 10⁻³) with treatment response in the initial GWA stage of the study were selected for replication. Markers with minor allele frequencies of <10% were excluded from further investigation, as were markers with genotype frequencies that deviated from Hardy-Weinberg equilibrium at P < 10⁻³.
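The marker filters described above can be illustrated with a minimal Python sketch. It assumes that genotype counts are available for each SNP and uses a 1-degree-of-freedom chi-square goodness-of-fit test for Hardy-Weinberg equilibrium; the text does not state which Hardy-Weinberg test was applied, so that choice, and all function names, are assumptions made for illustration only.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Frequency of the rarer allele, computed from the three genotype counts."""
    n_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / n_alleles
    return min(freq_a, 1.0 - freq_a)


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test of Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_marker_filters(n_aa, n_ab, n_bb, maf_min=0.10, hwe_p_min=1e-3):
    """Apply the two thresholds quoted in the text (MAF >= 10%, HWE P >= 1e-3)."""
    if minor_allele_frequency(n_aa, n_ab, n_bb) < maf_min:
        return False
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= hwe_p_min
```

For small cohorts or rare alleles an exact Hardy-Weinberg test is often preferred to the chi-square approximation shown here.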

GENETIC INVESTIGATION OF ANTI-TNF TREATMENT EFFICACY IN RA
using Plink statistical software, release v1.07 (online at http://pngu.mgh.harvard.edu/purcell/plink/). DNA samples from the patients evaluated in stages 2 and 3 were genotyped using a Sequenom MassArray iPlex system. In each reaction, 10 ng of DNA was used, and the protocol was followed according to the manufacturer's instructions (www.Sequenom.com).
Quality control measures were used at the genotyping stages of the study. Samples and SNPs with a genotyping success rate of <80% were excluded from analyses.
Statistical analysis. Multivariate linear regression analysis was used to assess the effect of each SNP genotype on response to treatment, using the absolute change in the DAS28 at 6 months of followup (a continuous variable) as the outcome measure. Regression analyses were adjusted for covariates previously identified as being independent predictors of change in the DAS28: baseline DAS28, Health Assessment Questionnaire (HAQ) score (13), and concurrent DMARD therapy. These analyses were performed using Plink statistical software, release v1.07. In the stage 1 cohort, sex and smoking history were not significantly associated with change in the DAS28, and serology data (e.g., anti-cyclic citrullinated peptide antibody and rheumatoid factor) were available for <85% of patients; therefore, these data were not included as covariates.
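As an illustration of the per-SNP model described above, the following Python sketch fits an ordinary least squares regression of the change in DAS28 on minor-allele dosage (coded 0, 1, or 2 under an additive model) plus covariates. It is not the Plink implementation used in the study; the function names are hypothetical, and the covariate vector (for example baseline DAS28, HAQ score, and a 0/1 indicator for concurrent DMARD therapy) is supplied by the caller.

```python
import math


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_additive_snp(delta_das28, dosage, covariates):
    """Return (per-allele coefficient, standard error) for one SNP.

    delta_das28: outcome per patient; dosage: minor-allele count per patient;
    covariates: one list of covariate values per patient.
    """
    # Design matrix columns: intercept, SNP dosage, then the covariates.
    rows = [[1.0, float(g)] + [float(v) for v in cov]
            for g, cov in zip(dosage, covariates)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, delta_das28)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    residuals = [y - sum(b * v for b, v in zip(beta, r))
                 for r, y in zip(rows, delta_das28)]
    sigma2 = sum(e * e for e in residuals) / (len(rows) - k)
    # The SNP's diagonal element of (X'X)^-1 gives its sampling variance.
    unit = [0.0] * k
    unit[1] = 1.0
    var_snp = sigma2 * solve_linear_system(xtx, unit)[1]
    return beta[1], math.sqrt(max(var_snp, 0.0))
```

The returned coefficient is the estimated change in the outcome per copy of the minor allele, which is the quantity reported as "coefficient" in the abstract; how the sign maps to better or worse response depends on how the change in DAS28 is defined.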
To test for a drug type-specific effect at any of the associated loci, data for all samples were combined, and an interaction term was fitted between SNP loci and drug type (i.e., etanercept, infliximab, or adalimumab). These analyses were performed using Stata statistical software, release 9, 2005 (StataCorp; online at www.stata.com). Power calculations were performed using Quanto version 1.2.3 (online at http://hydra.usc.edu/gxe) under an additive model for a range of marker allele frequencies.

RESULTS
For stage 1 of the study, the initial GWA, DNA samples from 566 RA patients receiving anti-TNF treatment were available for study. The initial analysis had >90% power to detect a difference of ≥0.6 units in the absolute change in the DAS28 at the significance threshold (P < 10⁻³) for allele frequencies ≥15%.
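A rough sense of how such a power figure arises can be obtained from a normal approximation, sketched below in Python. This is not the Quanto calculation used in the study. It assumes Hardy-Weinberg proportions (so that the variance of the 0/1/2 dosage is 2p(1 − p)), interprets the 0.6-unit difference as a per-allele effect, and requires a residual standard deviation for the change in DAS28, which the text does not report; any value passed for `residual_sd` is therefore hypothetical.

```python
import math
from statistics import NormalDist


def additive_power(n, maf, beta, residual_sd, alpha):
    """Approximate two-sided power to detect a per-allele effect `beta`.

    n: sample size; maf: minor allele frequency; residual_sd: assumed
    residual SD of the outcome; alpha: significance threshold.
    """
    genotype_var = 2.0 * maf * (1.0 - maf)  # dosage variance under HWE
    se = residual_sd / math.sqrt(n * genotype_var)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_effect = abs(beta) / se
    std_normal = NormalDist()
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)
```

For example, `additive_power(566, 0.15, 0.6, 1.4, 1e-3)` evaluates the stage 1 setting under an assumed residual SD of 1.4 DAS28 units; power falls as the allele frequency or the effect size decreases, which is why effects of less frequent alleles may be missed.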
For investigation of association signals arising from the initial GWA, 410 patients who were also treated with one of the anti-TNF biologic agents were genotyped. Genotyping quality control failed in 7 patients, leaving 403 patients, of whom 379 had complete information for analysis. Finally, 364 patients were genotyped in stage 3 of the study. Genotyping quality control failed in 16 of them, and of the remaining 348 patients, 341 had sufficient data on baseline covariates and on treatment response for analysis. Patient characteristics for the 3 cohorts available for analysis are given in Table 1.
Using a tagging strategy based on linkage disequilibrium between associated SNPs (r² ≥ 0.8), the number of SNPs prioritized for genotyping in the first replication cohort was reduced from 247 to 183 (Supplementary Table 1).
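The tagging step can be sketched as a greedy selection in Python. The text does not describe the algorithm that was actually used, so this is one common approach shown under stated assumptions: r² is approximated by the squared Pearson correlation of genotype dosages, SNPs are visited in a caller-supplied priority order (for example ascending P value), and the function names and SNP identifiers are hypothetical.

```python
def dosage_r2(x, y):
    """Squared Pearson correlation between two dosage vectors (0/1/2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return sxy * sxy / (sxx * syy)


def greedy_tag_snps(genotypes, order, r2_threshold=0.8):
    """Keep a SNP only if no already-kept SNP tags it at r2 >= threshold.

    genotypes: dict mapping SNP id -> list of dosages (same patient order);
    order: SNP ids in priority order.
    """
    tags = []
    for snp in order:
        if all(dosage_r2(genotypes[snp], genotypes[t]) < r2_threshold
               for t in tags):
            tags.append(snp)
    return tags
```

Each SNP that is dropped is represented by a retained SNP with which it is correlated at or above the threshold, so little association information is lost while the genotyping burden is reduced.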
Ten SNP markers (i.e., rs7305646, rs1350948, rs7962316, rs17301249, rs1532269, rs4694890, rs7070180, rs12081765, rs1024125, and rs10739625) demonstrated an improved association signal with the response to anti-TNF treatment in the meta-analysis of the discovery and stage 2 cohorts over and above that observed in the discovery cohort alone (P < 0.001) under an additive model of association (Table 2).
Stage 3, second meta-analysis. To further investigate the observed associations with treatment response, the 10 SNPs were genotyped in a second independent cohort of anti-TNF-treated RA patients. Three SNPs (rs7070180, rs1024125, and rs10739625) failed to genotype, leaving 7 SNPs for analysis. A second meta-analysis of these data along with the discovery and stage 2 cohort data for the 7 SNPs was performed. The association signal for each marker (i.e., rs12081765, rs1350948, rs1532269, rs17301249, rs4694890, rs7305646, and rs7962316) remained the same or diminished in significance in the second meta-analysis, as compared with that observed in the first meta-analysis (Table 3). For the SNP markers rs4694890, rs1350948, and rs7962316, the effect was observed in the opposite direction in the stage 3 cohort, as compared with the initial and stage 2 cohorts, suggesting that these associations may be spurious.
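The combination of per-cohort estimates, and the check on direction of effect, can be sketched as follows in Python. The text does not specify the meta-analysis method, so the inverse-variance fixed-effect model shown here is an assumption, as are the function names; inputs are per-cohort regression coefficients with their standard errors.

```python
import math
from statistics import NormalDist


def fixed_effect_meta(estimates):
    """Inverse-variance fixed-effect pooling.

    estimates: list of (beta, standard_error) pairs, one per cohort.
    Returns (pooled beta, pooled standard error, two-sided P value).
    """
    weights = [1.0 / (se * se) for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = pooled / pooled_se
    p_value = 2.0 * NormalDist().cdf(-abs(z))
    return pooled, pooled_se, p_value


def consistent_direction(estimates):
    """True when all nonzero cohort coefficients share the same sign."""
    signs = {beta > 0 for beta, _ in estimates if beta != 0}
    return len(signs) <= 1
```

Under this model, a cohort whose coefficient has the opposite sign pulls the pooled estimate toward zero, which is one way an association signal can weaken when a third cohort is added.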
Clinical significance. Taking into account the clinical and demographic factors previously shown by our group and others to influence response to anti-TNF treatment in RA patients (DAS28 score at baseline, HAQ score, sex, concurrent DMARD therapy, rheumatoid factor positivity, and smoking habits), the proportion of variance in the absolute change in the DAS28 at 6 months of followup explained in the combined cohort was 15%. Incorporating into the model the 7 genetic loci identified by the current study increased the variance explained to 20%. When the model was restricted to include only those SNPs with between-study continuity in the direction of effect (i.e., rs12081765, rs1532269, rs17301249, and rs7305646), the variance in response to treatment explained was 19%.
PLANT ET AL
[Table footnotes, partially recovered] ... = minor allele. † The genotyping assay for this single-nucleotide polymorphism (SNP) failed in the first replication cohort; therefore, a perfect proxy (rs4522221; r² = 1) was genotyped.
Each treatment response-associated SNP was investigated for drug type-specific effects by fitting an interaction term in the analysis. No statistically significant interaction between the type of therapy used and any SNP marker was observed (data not shown).

DISCUSSION
We performed a multistage comprehensive GWA study of response to anti-TNF treatment in patients with RA. In combining the results of the initial GWA study and 2 independent cohorts, we demonstrated evidence of association at 7 genetic loci not previously implicated in response to these drugs, with the significance of the association increased for markers rs12081765, rs17301249, and rs7305646 in the second meta-analysis, as compared to the initial GWA results (Tables 2 and 3).
Two of the SNP markers map within genes: the PDZ domain-containing protein 2 (PDZD2) and eyes absent homolog 4 (EYA4). The SNP marker rs1532269, which is associated with a reduced response to TNF blockade, is an intronic polymorphism mapping to the PDZD2 gene. The SNP resides in a region of linkage disequilibrium confined to the latter portion of PDZD2 and demonstrates correlation (r² > 0.8) with 5 other intronic SNPs within PDZD2. The PDZD2 gene has been reported to influence the secretion of insulin in an animal model in which pancreatic islet cells of Pdzd2-deficient mice produce more insulin than do normal islets and display increased insulin secretion at low concentrations of glucose (14). Insulin resistance and elevated insulin levels are features of severe disease in early RA, if untreated, and are driven primarily by systemic inflammation (15). In RA, dramatic reductions in serum insulin levels are observed following anti-TNF treatment (16). Therefore, a potentially interesting connection exists between insulin levels and disease severity in patients with RA, and this connection may have a genetic basis.
The EYA4 SNP associated with an improved treatment response in the current study, rs17301249, is an intronic variant tagging an additional intronic EYA4 polymorphism (rs9375955) that was associated with response to treatment at the initial GWA analysis stage. In the HapMap CEU (Utah residents with ancestry from northern and western Europe) data set (release 22), the two SNPs are strongly correlated with other intronic EYA4 SNPs and SNPs upstream of the transcription start site; the region of high linkage disequilibrium does not stretch to any nearby genes. Mapping to chromosome 6q23.2, EYA4 was originally discovered as a cotranscription factor and is observed to stimulate the expression of interferon-β (IFNβ) and CXCL10 in response to undigested DNA of apoptotic cells (17).
Once engulfed by macrophages, the DNA molecule from dead cells is digested by the activity of DNase II. Mice that are deficient in DNase II are prone to developing chronic arthritis as a result of the production of TNF and IFNβ (17). Cross-talk between type I IFN and TNF has recently been investigated in patients with RA, in whom elevated expression of type I IFN response genes has been shown to correlate with a poor clinical response following TNF blockade (18). The observed genetic association in the current study could therefore indicate a link between DNA-induced innate immune responses and the efficacy of TNF blockade.
Three SNPs (i.e., rs4694890, rs1350948, and rs7962316) showed an improved association signal in the first meta-analysis, over and above that seen in the initial GWA data; however, these SNPs failed to associate with treatment response in the stage 3 cohort (n = 341), and the effect was observed in the direction opposite that expected (Tables 2 and 3). The between-cohort differences in association signals for these loci are likely to be due to the small sample sizes, and while these data should be interpreted cautiously, they suggest that these may be false-positive signals.
Interpreting the validated association signals located in intergenic loci on chromosomes 1 (rs12081765), 4 (rs4694890), 11 (rs1350948), and 12 (rs7962316 and rs7305646) is challenging, since they do not map close to obvious candidate genes. Nonetheless, we have found evidence of association with anti-TNF response, and it is possible to speculate that they represent long-range regulatory elements for other genes, to which they may lie closer when the 3-dimensional conformation, rather than the linear DNA sequence, is taken into account. Investigation of chromosomal conformation will be required to explore this further.
Our study has some limitations that require discussion. First, in order to avoid false-negative associations at the initial GWA stage, a relatively lenient statistical threshold of P < 10⁻³ was chosen, thus running the risk of generating a high proportion of false-positive associations. Our strategy was to validate association signals rather than imposing a strict level of significance at the first stage. In the combined analysis of all 3 cohorts by meta-analysis, evidence of association was increased over that obtained in the initial cohort alone for 3 of the 7 markers investigated (rs12081765, rs17301249, and rs7305646). Therefore, it is important that other researchers in the field also aim to provide additional verification of our findings in independent collections of anti-TNF-treated RA patients.
Second, although a powerful means of measuring treatment response, the DAS28 score has the limitation that it is a composite score, relying on information about swollen and tender joint counts, patient-reported general health status, and the erythrocyte sedimentation rate (ESR). Hence, it should not be surprising that genetic effects influencing this complex phenotype are likely to be individually modest, as has been found in most studies of disease susceptibility. Where large effect sizes have been observed in pharmacogenetic studies, there is usually a well-defined phenotype (such as a rare adverse event) or detection of an association with a biologic marker, such as the influence of genetic variation in warfarin therapy, as assessed by measurements of the international normalized ratio. In assessing response to anti-TNF therapies, therefore, stronger effect sizes may be seen with individual components of the DAS28 score. For example, the ESR is a biologic marker of inflammation and is therefore an objective measure that may provide a more accurate means of assessing treatment response. Future studies will examine genetic predictors of response using the ESR as the outcome measure to investigate whether that approach highlights the same genetic loci or whether the proportion of signals from the initial phase that are subsequently validated is higher.
To date, our study, which included 1,285 patients, is the largest investigation of genetic predictors of anti-TNF response in RA patients. The sample size used at the initial phase of the study afforded >90% power to detect a change in the DAS28 of 0.6 units, representing a clinically relevant change in disease activity, at allele frequencies ≥0.15. It is possible that effects conferred by less frequent alleles were overlooked.
Despite the limitations, 7 loci have been identified, which, when added into predictive models, substantially increase the variance accounted for over and above clinical variables alone. The next step will be to further explore the 7 loci using a fine-mapping strategy to identify the polymorphism responsible for the effect on treatment response. This will be particularly challenging at the intergenic loci identified, since the associated SNP is often >100 kb from the nearest known gene, or there are no obvious candidate genes in the region. It is therefore likely that functional variants with long-range effects on gene regulation are responsible for the observed associations at these intergenic loci.
In summary, we have performed a multistage association study of response to anti-TNF treatment in RA patients and have identified 7 genetic loci that influence treatment response in our data. As with all studies of genetic determinants of complex phenotypes, our findings require validation in independent cohorts.