Investigation of the MHC2TA gene, associated with rheumatoid arthritis in a Swedish population, in a UK rheumatoid arthritis cohort




A recent study of rheumatoid arthritis (RA) showed an association with a functional single-nucleotide polymorphism (SNP) mapping to the promoter region of the MHC2TA gene on chromosome 16p13 in a Swedish population. Interestingly, evidence for linkage to this region has been detected previously in a subgroup of UK RA families carrying 2 copies of shared epitope (SE) alleles. Therefore, we undertook this study to investigate the association of the MHC2TA gene promoter with RA in a UK Caucasian population.


Association with 5 SNPs spanning the promoter region of the MHC2TA gene was investigated in 813 UK RA patients and 532 population controls. Association with a functional putative RA-causal polymorphism (−168*G/A [rs3087456]) was tested in a total of 1,401 UK RA patients and 2,475 controls. Genotyping was performed using a Sequenom MassArray platform. Estimated haplotype frequencies were generated using the expectation-maximization algorithm and compared between patients and controls.


All SNPs were in Hardy-Weinberg equilibrium. No evidence for association was found, either with the putative RA-causal polymorphism (−168*G/A) or with the other SNPs tested. Haplotype analysis revealed extensive linkage disequilibrium across the promoter region but no evidence for association. Stratifying the data set by carriage of SE alleles did not alter the conclusions.


A functional polymorphism of the MHC2TA gene locus previously associated with RA in a Swedish population was not associated with RA in a UK population. These findings do not provide support for the notion that this gene plays a major role in the etiology of RA.

Rheumatoid arthritis (RA) is a common inflammatory autoimmune disease in which environmental factors are thought to trigger synovial joint inflammation and, less commonly, extraarticular features in genetically susceptible individuals. The major genetic locus contributing to RA susceptibility is the HLA–DRB1 gene on chromosome 6p21. Carriage of certain HLA–DRB1 alleles, collectively termed shared epitope (SE) alleles, confers a 2–3-fold increased risk of developing RA (1). Recently, substantial progress has been made in identifying non-HLA disease-causal genes. For example, association of RA with a functional polymorphism of the protein tyrosine phosphatase N22 (PTPN22) gene has been widely replicated in populations of Northern European descent (for review, see ref. 2). Interestingly, the same polymorphism exists at much lower frequencies in East Asian populations, and no association with RA has yet been reported in those populations (3). In contrast, a functional haplotype determining messenger RNA stability of the peptidyl arginine deiminase gene encoding an enzyme involved in citrullination, PADI4, has been associated with RA, and this association has been replicated in Japanese and Korean populations, but not in populations of Northern European descent (4–8). It is therefore clear that non-HLA susceptibility genes may show significant population differences, and this highlights the importance of testing genes reported to be associated with RA in different populations.

Recently, association of RA with another gene, the class II major histocompatibility complex (MHC) transactivator (MHC2TA) gene, has been reported in a Swedish population (9). This gene maps to chromosome 16p13, and genetic and functional studies in rats identified it as a gene involved in determining expression levels of class II MHC molecules in response to a variety of stimuli including peripheral nerve trauma. A single-nucleotide polymorphism (SNP) (MHC2TA − 168*G/A [rs3087456]) in the 5′ flanking region of the gene was reported to be associated with RA, multiple sclerosis, and myocardial infarction in humans. Expression studies in humans confirmed that the polymorphism is associated with regulation of class II MHC expression.

In a linkage study involving affected sibling pair families from the Arthritis Research Campaign (ARC) (UK) National Repository of patients and families with RA, we have previously found evidence for linkage to the chromosome 16p13 region (10). The linkage appeared stronger in families sharing 2 copies of SE alleles. The MHC2TA gene is therefore a strong candidate requiring investigation in our UK cohort.

The MHC2TA gene has 4 separate promoters spanning 15 kb 5′ of the transcription start site (for review, see ref. 11). Promoter usage varies according to cell type, and the −168*G/A polymorphism lies in the third promoter, which determines expression in “professional” antigen-presenting cells such as B lymphocytes and activated T cells (11). In the Swedish study, the effect size of the polymorphism in determining susceptibility to RA was small (odds ratio [OR] 1.29, 95% confidence interval [95% CI] 1.07–1.56) (9). We hypothesized that an additional functional variant may exist or that the MHC2TA −168*G/A SNP was acting as a marker for an as-yet-untested polymorphism. From public databases, we identified an additional 5 frequency-validated SNPs mapping to the 5′ region of the gene. The aims of our study were as follows: first, to test for association of RA with the putative functional SNP previously associated with RA in the Swedish population (MHC2TA −168*G/A); second, to test for association of RA with 5 other SNPs, and with haplotypes formed by them, in a cohort of RA patients from the UK; and, finally, to test for association with RA in a subgroup stratified by carriage of SE alleles.


Study design.

A case–control (association) study was undertaken to investigate association with RA of 6 SNPs spanning the MHC2TA gene promoter. The genotyping was performed in 2 phases. In the first phase, all 6 SNPs were tested in a subgroup of the total cohort, while in the second phase, the functional putative RA-causal SNP (MHC2TA − 168*G/A) was tested in the remaining samples and the results from the 2 phases were combined for analysis of this SNP. Unrelated RA patients were compared with healthy controls for genotype at each locus. Pairwise linkage disequilibrium (LD) was measured between SNPs, and haplotype frequencies were estimated using the expectation-maximization (EM) algorithm implemented with HelixTree software (Golden Helix, Bozeman, MT).
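To illustrate the principle of EM-based haplotype estimation, the following is a minimal sketch for the simplest case of 2 biallelic SNPs. It is not the HelixTree implementation, whose internals are not described here; the function name `em_two_snp` and the toy genotype counts are hypothetical. Only double heterozygotes have ambiguous phase, so the E-step divides them between the two possible phases and the M-step recounts haplotypes.

```python
from itertools import product


def em_two_snp(geno_counts, max_iter=200, tol=1e-12):
    """EM estimate of haplotype frequencies for two biallelic SNPs.

    geno_counts[i][j] is the number of subjects with i copies of the minor
    allele at SNP 1 and j copies at SNP 2. Haplotypes are tuples (a, b)
    with 0 = major allele and 1 = minor allele.
    """
    n_chrom = 2 * sum(sum(row) for row in geno_counts)
    haps = list(product((0, 1), repeat=2))

    # Haplotypes that can be counted directly: every genotype except the
    # double heterozygote (1, 1) has only one possible phase.
    known = {h: 0.0 for h in haps}
    for i, j in product(range(3), repeat=2):
        if (i, j) == (1, 1):
            continue
        first = (int(i == 2), int(j == 2))
        second = (int(i >= 1), int(j >= 1))
        known[first] += geno_counts[i][j]
        known[second] += geno_counts[i][j]

    n_double_het = geno_counts[1][1]
    freqs = {h: 0.25 for h in haps}  # uninformative starting point
    for _ in range(max_iter):
        # E-step: probability that a double heterozygote carries the
        # haplotype pair 00/11 rather than 01/10.
        cis = freqs[(0, 0)] * freqs[(1, 1)]
        trans = freqs[(0, 1)] * freqs[(1, 0)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: recount haplotypes with the ambiguous subjects split by w.
        counts = dict(known)
        for h in ((0, 0), (1, 1)):
            counts[h] += n_double_het * w
        for h in ((0, 1), (1, 0)):
            counts[h] += n_double_het * (1 - w)
        updated = {h: counts[h] / n_chrom for h in haps}
        change = max(abs(updated[h] - freqs[h]) for h in haps)
        freqs = updated
        if change < tol:
            break
    return freqs


# Hypothetical toy data: two SNPs in complete LD (not study data).
toy_counts = [[50, 0, 0], [0, 40, 0], [0, 0, 10]]
freqs = em_two_snp(toy_counts)
print({h: round(f, 4) for h, f in freqs.items()})
```

With these toy counts the estimate converges to 2 haplotypes only (frequencies 0.7 and 0.3), as expected when 2 markers are perfectly correlated. Extending the approach to 6 SNPs requires enumerating all phase-compatible haplotype pairs per genotype, which is what dedicated software handles.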

RA patients and controls.

Cases with RA (n = 1,401) were obtained from the ARC National Repository of patients and families (1 proband per family) and from local clinics as described previously (5). Control subjects with no history of inflammatory arthritis were recruited from blood donors and general practitioner registers (n = 532) or from subjects recruited as part of the 1958 birth cohort (12). This control group comprises DNA samples from a randomly selected subset of children born in England, Scotland, and Wales in 1 week in March 1958 (∼17,000 live births) who have been followed up prospectively. The Oversight Committee for the Biomedical Assessment of the British 1958 Birth Cohort Study provided access to DNA collected from 2,064 individuals from the 1958 birth cohort randomly distributed across the UK. Genotype data generated from the 64 nonwhite individuals included in the cohort were not included in the analysis. All patients and controls were of UK Caucasian ethnic origin, were recruited with Ethics Committee approval, and provided informed consent.

Genotyping methods.

Six SNPs spanning the MHC2TA promoter were selected for investigation. The SNPs were chosen to include the SNP thought to be RA disease causal in the Swedish population (MHC2TA −168*G/A) and to include 5 other SNPs for which validated genotype frequency data were available in European populations (MHC2TA −7213*C/G [rs7501204], MHC2TA −6952*G/T [rs6498114], MHC2TA −5473*T/C [rs6416647], MHC2TA −4591*C/T [rs7404672], and MHC2TA −1788*A/T [rs6498116]). All 6 SNPs were genotyped using the Sequenom MassArray platform according to the manufacturer's instructions. Duplicate samples and negative controls were included to ensure accuracy of genotyping.

Statistical analysis.

Single-point analysis.

Association with RA of each of the 6 SNPs was tested using the chi-square test implemented in Stata (StataCorp, College Station, TX). The clinical phenotype of RA is heterogeneous, with variation in the presence of features such as erosions, rheumatoid factor (RF) positivity, and age at onset. In addition, genotypic heterogeneity exists in terms of carriage of HLA–DRB1*04 susceptibility alleles. We have previously found evidence for linkage to chromosome 16p13 in a subset of families sharing 2 copies of SE alleles (10). Stratification analysis was therefore undertaken to investigate whether association was present in specific patient subsets based on sex, disease severity, age at onset, and carriage of SE alleles.
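As a worked illustration of the single-point test, the sketch below applies a Pearson chi-square test to the 2 × 3 genotype table for rs3087456, using the combined counts given in Table 1. It is written in plain Python rather than Stata, and it assumes the reported P values come from a genotype-based test with 2 degrees of freedom; the function names are illustrative.

```python
import math


def chi_square_2xk(cases, controls):
    """Pearson chi-square statistic and degrees of freedom for a 2 x k table."""
    n_cases, n_controls = sum(cases), sum(controls)
    total = n_cases + n_controls
    stat = 0.0
    for a, b in zip(cases, controls):
        column_total = a + b
        for observed, row_total in ((a, n_cases), (b, n_controls)):
            expected = column_total * row_total / total
            stat += (observed - expected) ** 2 / expected
    return stat, len(cases) - 1


def p_value_2df(stat):
    """Upper-tail probability of a chi-square variate with 2 df (closed form)."""
    return math.exp(-stat / 2)


# Genotype counts for rs3087456 in the order G/G, G/A, A/A (Table 1).
patients = [760, 557, 84]
controls = [1391, 922, 163]

stat, df = chi_square_2xk(patients, controls)
p = p_value_2df(stat)
print(f"chi-square = {stat:.2f}, df = {df}, P = {p:.2f}")
```

Under these assumptions the computed P value rounds to 0.28, matching the value shown for this SNP in Table 1.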

Haplotype analysis.

Pairwise LD measures (both D′ and the squared correlation [r²]) between individual SNPs were investigated and haplotypes constructed using the EM algorithm implemented with HelixTree software. Haplotype frequencies were compared between patients and controls using the chi-square test implemented with Stata.
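The 2 LD measures can be computed from haplotype and allele frequencies as sketched below. The function `ld_measures` and the example frequencies are hypothetical and chosen only to show how a rare allele can yield a high D′ together with a low r²; they are not estimates from this study.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return D, D' and r^2 for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    # D is scaled by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical example: a rare allele (1.3%) that always occurs on haplotypes
# carrying a common allele (25%) at the second locus.
d, d_prime, r2 = ld_measures(p_ab=0.013, p_a=0.013, p_b=0.25)
print(f"D = {d:.5f}, D' = {d_prime:.2f}, r2 = {r2:.3f}")
```

In this example D′ is 1 because no recombinant haplotype is observed, whereas r² is below 0.05 because the allele frequencies differ so much that one marker predicts the other poorly.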


Sample sizes were calculated based on published allele frequencies (minor allele frequency 22%) so that the association study of the functional putative RA-causal SNP (MHC2TA − 168*G/A) had 95% power to detect a gene conferring the same OR of 1.29 as detected in the Swedish study at the 5% significance level assuming a dominant model (9).
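A rough check of this power calculation can be made with a normal approximation for the comparison of 2 proportions. The sketch below is not the method used in the study, which is not specified; it assumes a dominant model (carriers of at least 1 minor allele versus noncarriers), Hardy-Weinberg proportions in controls, a 2-sided test, and the final sample sizes of 1,401 patients and 2,475 controls.

```python
from statistics import NormalDist


def carrier_frequency(maf):
    """Frequency of carrying at least one minor allele under Hardy-Weinberg."""
    return 1 - (1 - maf) ** 2


def power_dominant(maf, odds_ratio, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided test comparing carrier frequencies."""
    norm = NormalDist()
    p0 = carrier_frequency(maf)            # carrier frequency in controls
    odds1 = odds_ratio * p0 / (1 - p0)     # carrier odds in cases
    p1 = odds1 / (1 + odds1)               # carrier frequency in cases
    se = (p1 * (1 - p1) / n_cases + p0 * (1 - p0) / n_controls) ** 0.5
    z_crit = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf(abs(p1 - p0) / se - z_crit)


power_129 = power_dominant(0.22, 1.29, 1401, 2475)
power_122 = power_dominant(0.22, 1.22, 1401, 2475)
print(f"OR 1.29: {power_129:.2f}; OR 1.22: {power_122:.2f}")
```

Under these assumptions the approximation gives roughly 0.96 for an OR of 1.29 and roughly 0.83 for an OR of 1.22, which is in line with the stated 95% power for the originally reported effect size.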


DNA was available for a total of 1,401 RA patients and 2,475 population controls, but not all were genotyped for all the SNPs. In the RA patient cohort as a whole, 1,048 of 1,388 patients (75.5%) were female, 833 of 1,066 (78.1%) had erosions, 1,043 of 1,320 (79.0%) were seropositive for RF, and the median age at arthritis onset was 43 years (interquartile range 31–57 years). Carriage of SE alleles in the 1,036 RA patients for whom this information was available was as follows: 214 patients (20.7%) carried 0 copies, 509 (49.1%) carried 1 copy, and 313 (30.2%) carried 2 copies. Sex information was available for 2,395 of the controls, 1,216 (50.8%) of whom were female. Of the 1,485 controls with HLA–DRB1 genotype information available, 782 (52.7%) carried 0 copies of SE alleles, 555 (37.4%) carried 1 copy, and 148 (10.0%) carried 2 copies.

Genotype frequencies for all SNPs tested were in Hardy-Weinberg equilibrium in both patients and controls. The genotype error rate was <0.05% as assessed by concordance of duplicate samples across different plates.
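A Hardy-Weinberg check of this kind can be sketched as a 1-degree-of-freedom chi-square goodness-of-fit test. The example uses the rs3087456 genotype counts from Table 1; the text does not state which test was applied in the study, so the choice of test here is an assumption.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg proportions (1 df)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele a
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# rs3087456 genotype counts in the order G/G, G/A, A/A (Table 1).
stat_controls, p_controls = hwe_chi_square(1391, 922, 163)
stat_patients, p_patients = hwe_chi_square(760, 557, 84)
print(f"controls: chi-square = {stat_controls:.2f}, P = {p_controls:.2f}")
print(f"patients: chi-square = {stat_patients:.2f}, P = {p_patients:.2f}")
```

Both P values exceed 0.05, consistent with the statement that the genotype distributions do not deviate from Hardy-Weinberg equilibrium.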

Results of genotype analysis.

Phase 1: association with 6 SNPs tested in 813 UK RA patients and 532 population controls.

No significant association of RA with any of the 6 SNPs tested was detected in this UK population (Table 1). Stratifying by presence of RF, erosions, sex (data not shown), and carriage of SE alleles revealed no evidence for association in subgroups of patients (Table 1).

Table 1. Comparison of genotype frequencies of 6 MHC2TA gene promoter SNPs in RA patients and controls overall and stratified by carriage of SE alleles*

                          Overall                             No copies of SE allele†             1 or 2 copies of SE allele†
SNP, genotype             Patients      Controls       P      Patients      Controls       P      Patients      Controls       P
rs7501204 (−7213*C/G)
  C/C                     449 (57.9)    273 (59.1)            97 (62.6)     138 (58.7)            347 (56.9)    74 (59.2)
  C/G                     284 (36.6)    169 (36.6)     0.65   51 (32.9)     87 (37.0)      0.72   227 (37.2)    47 (37.6)      0.53
  G/G                     43 (5.5)      20 (4.3)              7 (4.5)       10 (4.3)              36 (5.9)      4 (3.2)
rs6498114 (−6952*G/T)
  G/G                     433 (57.1)    258 (60.7)            94 (60.6)     129 (59.2)            335 (56.5)    69 (64.5)
  G/T                     280 (36.9)    145 (34.1)     0.49   52 (33.6)     78 (35.8)      0.88   222 (37.4)    34 (31.8)      0.29
  T/T                     45 (6.0)      22 (5.2)              9 (5.8)       11 (5.0)              36 (6.1)      4 (3.7)
rs6416647 (−5473*T/C)
  T/T                     404 (52.3)    262 (54.9)            90 (58.8)     129 (53.8)            310 (51.1)    77 (58.7)
  T/C                     300 (38.8)    177 (37.1)     0.64   53 (34.6)     92 (38.3)      0.62   238 (39.2)    45 (34.4)      0.27
  C/C                     69 (8.9)      38 (8.0)              10 (6.6)      19 (7.9)              59 (9.7)      9 (6.9)
rs7404672 (−4591*C/T)
  C/C                     792 (97.4)    500 (97.5)            157 (98.7)    256 (96.6)            620 (97.2)    135 (97.8)
  C/T                     20 (2.5)      13 (2.5)       1.0    2 (1.3)       9 (3.4)        0.15   17 (2.6)      3 (2.2)        1.0
  T/T                     1 (0.1)       0 (0.0)               0 (0.0)       0 (0.0)               1 (0.2)       0 (0.0)
rs6498116 (−1788*A/T)
  A/A                     448 (58.0)    316 (59.4)            97 (63.8)     160 (58.0)            346 (56.7)    96 (63.2)
  A/T                     287 (37.2)    185 (34.8)     0.53   51 (33.6)     101 (36.6)     0.30   232 (38.0)    44 (28.9)      0.08
  T/T                     37 (4.8)      31 (5.8)              4 (2.6)       15 (5.4)              32 (5.3)      12 (7.9)
rs3087456 (−168*G/A)‡
  G/G                     760 (54.2)    1,391 (56.2)          123 (57.5)    442 (56.5)            450 (54.7)    408 (58.0)
  G/A                     557 (39.8)    922 (37.2)     0.28   82 (38.3)     282 (36.1)     0.24   318 (38.7)    255 (36.3)     0.41
  A/A                     84 (6.0)      163 (6.6)             9 (4.2)       58 (7.4)              54 (6.6)      40 (5.7)

  * Values are the number (%) of patients or controls with a given genotype. SNPs = single-nucleotide polymorphisms; SE = shared epitope.
  † HLA–DRB1 data were available for 1,036 patients and 1,485 controls.
  ‡ Associated with rheumatoid arthritis (RA), multiple sclerosis, and ischemic heart disease in a Swedish population.

A plot of pairwise LD measures between the 6 SNPs spanning the MHC2TA promoter showed strong LD across the region, with D′ measurements of >0.8 and pairwise r² values between SNPs of 0.73–0.94, for all SNPs except rs7404672 (Figure 1). Allele frequencies for this polymorphism were very different from those of the others, explaining the low r² values. Haplotype analysis revealed that of the 64 possible haplotypes (2 alleles at each of 6 SNPs), only 5 existed at a frequency of >1% in the UK population. Together, these haplotypes captured >96% of variation. No difference in the frequency of these haplotypes was observed between patients and controls (Table 2).

Figure 1.

Linkage disequilibrium (LD) plot showing D′ and correlation between marker pairs spanning the MHC2TA gene promoter. Each single-nucleotide polymorphism (SNP) is plotted along the x and y axes. The darker the shading, the more LD exists between the SNPs. Since each SNP is in perfect LD with itself, the boxes along the central diagonal are shaded black. Two measures of LD, D′ and R, are plotted. D′ values are high across the whole promoter, indicating that there has been little ancestral recombination and the region is inherited as a block. R represents the correlation between genotypes. The measure of LD, r², is calculated by squaring the R value. Five of the 6 SNPs are highly correlated with each other. For the rs7404672 SNP, allele frequency differences mean that correlation values are lower.

Table 2. Comparison of haplotype frequencies formed by 6 MHC2TA gene promoter polymorphisms in RA patients and controls*

Haplotype        Controls       Patients
C,G,T,C,A,G      379 (71.2)     563 (69.3)
G,T,C,C,T,A      96 (18.1)      154 (18.9)
C,G,C,C,T,A      22 (4.2)       31 (3.8)
G,T,C,C,A,G      18 (3.5)       27 (3.4)
C,G,C,T,A,A      6 (1.2)        10 (1.3)

  * Values are the number (%) of patients or controls with a given haplotype. Haplotype frequencies could be estimated using the expectation-maximization algorithm for the 785 patients and 521 controls for whom there was a genotype result for all 6 SNPs. No difference in the frequency of these haplotypes was observed between patients and controls (χ² = 0.37, 4 degrees of freedom, P = 0.99). See Table 1 for definitions.
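The haplotype comparison reported in the Table 2 footnote can be reproduced from the tabulated counts with a Pearson chi-square test on the 2 × 5 table, as in the sketch below. The first column of counts sums to 521 and the second to 785, so the first is taken here to be controls and the second patients; that assignment is an inference from the totals, and the test statistic does not depend on it.

```python
import math


def chi_square_2xk(group1, group2):
    """Pearson chi-square statistic and degrees of freedom for a 2 x k table."""
    n1, n2 = sum(group1), sum(group2)
    total = n1 + n2
    stat = 0.0
    for a, b in zip(group1, group2):
        column_total = a + b
        for observed, row_total in ((a, n1), (b, n2)):
            expected = column_total * row_total / total
            stat += (observed - expected) ** 2 / expected
    return stat, len(group1) - 1


def chi2_sf_even_df(stat, df):
    """Upper-tail probability of a chi-square variate when df is even."""
    half = stat / 2
    term, series = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        series += term
    return math.exp(-half) * series


# Haplotype counts in the order listed in Table 2.
first_column = [379, 96, 22, 18, 6]      # sums to 521 (taken as controls)
second_column = [563, 154, 31, 27, 10]   # sums to 785 (taken as patients)

stat, df = chi_square_2xk(first_column, second_column)
p = chi2_sf_even_df(stat, df)
print(f"chi-square = {stat:.2f}, df = {df}, P = {p:.2f}")
```

The result agrees with the reported values after rounding (χ² = 0.37 with 4 degrees of freedom; the computed P of about 0.98 is consistent with the reported 0.99).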

Phase 2: association with functional SNP (MHC2TA− 168*G/A) tested in an additional 588 UK RA patients and 1,943 population controls.

It was clear from phase 1 of the genotyping that strong LD existed across the promoter region of the MHC2TA gene. There would be considerable redundancy in genotyping all the SNPs in all the samples; therefore, in phase 2, only the functional putative RA-causal polymorphism (MHC2TA − 168*G/A) was tested in the remaining samples. Of the 2,064 samples available from the 1958 birth cohort, 92% were successfully genotyped for the MHC2TA − 168*G/A SNP. No genotype frequency differences were noted between the control samples used in phase 1 and phase 2 of the study (P = 0.37). Therefore, the patient and control groups from the 2 phases were combined for analysis of this SNP. However, despite the large sample sizes, no evidence for association of this polymorphism with RA as a whole or with subsets stratified by carriage of SE alleles (Table 1) was noted.


Linkage to the chromosome 16p13 locus has been found in whole-genome scans of UK, US, and European families with RA (10, 13, 14). It is, therefore, a strong candidate RA susceptibility locus, and it was of interest that a recent study in a Swedish population showed evidence for association with RA of a functional polymorphism mapping to the promoter region of the MHC2TA gene, which maps under the peak of linkage (9). We investigated the promoter region of the gene in our UK RA population, but we found no evidence for association, either with the functional SNP reported previously or with 5 other promoter polymorphisms.

We investigated the additional polymorphisms mapping to the promoter region in order to address the possibility that the reported putative RA-causal promoter SNP, MHC2TA −168*G/A, did not actually cause disease but instead was part of a haplotype associated with disease or was in LD with a causal SNP. If either scenario were true, haplotype analysis should increase the power to detect association. Therefore, we initially genotyped 6 promoter SNPs for which allele frequency information was available in public databases for populations of Northern European descent.

We focused on the promoter region because no association had been reported with 2 exon 11 SNPs tested in the Swedish study, and because functional and association data implicated the promoter region in the control of levels of expression of class II MHC genes. However, preliminary analysis, performed after genotyping a subgroup of the samples, revealed that strong LD existed across the 6 SNPs. Therefore, in order to avoid further redundancy, only the functional putative RA-causal polymorphism (MHC2TA − 168*G/A) was tested in the remaining samples. Despite having 95% power to detect the effect size for the MHC2TA − 168*G/A promoter SNP reported from the original study, we detected no evidence for association. It is well recognized that effect sizes reported initially are often overestimates of the true effect size found in subsequent studies (the phenomenon of “winner's curse”) (15). However, with the sample sizes used, we had 80% power to detect a smaller effect size of 1.22.

Minor allele frequencies for the MHC2TA −168*G/A SNP were similar in UK controls (25.2%) and Swedish controls (20–24% in different control cohorts) (9). In the Swedish study, association with multiple sclerosis, in particular, was dependent on which control population was used for comparison. It is interesting to note that when looking at marginal effect sizes in large samples, even small differences in allele frequencies may have a profound impact on whether association is detected.
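The UK control allele frequency quoted above follows directly from the genotype counts in Table 1, as the short calculation below shows. The helper name `minor_allele_freq` is illustrative.

```python
def minor_allele_freq(n_major_hom, n_het, n_minor_hom):
    """Minor allele frequency from genotype counts (2 alleles per subject)."""
    n_subjects = n_major_hom + n_het + n_minor_hom
    return (n_het + 2 * n_minor_hom) / (2 * n_subjects)


# rs3087456 genotype counts in the order G/G, G/A, A/A (Table 1).
maf_controls = minor_allele_freq(1391, 922, 163)
maf_patients = minor_allele_freq(760, 557, 84)
print(f"controls: {maf_controls:.1%}, patients: {maf_patients:.1%}")
```

The A allele frequency is 25.2% in controls and 25.9% in patients, a difference of less than 1 percentage point.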

Previous stratification analysis in UK RA families revealed stronger evidence for linkage to 16p13 in the subset of families in which affected siblings carried 2 copies of SE alleles (10). However, no evidence for association was found in UK RA patients stratified by carriage of SE alleles in the current study.

In the original Swedish study, the data obtained from a series of experiments in rat models provided convincing evidence that polymorphism within the MHC2TA gene regulates class II MHC gene expression in rats, but no information was provided as to whether this was associated with disease in rats (9). For example, it is unclear what role, if any, the gene plays in rodent models of inflammatory arthritis. Functional studies using human cell lines confirmed previous reports that the promoter region of the gene plays a role in controlling class II MHC expression in humans (16). Importantly, associations were demonstrated with RA, ischemic heart disease, and multiple sclerosis (9). The effect sizes were small but of the order expected for complex diseases. We have not replicated this finding in our cohort despite having 95% power to detect the same effect size. However, to confidently exclude association of RA with the promoter SNP, it would be necessary to power the study to detect an effect size at the lower limit of the 95% CI reported from the Swedish study (i.e., an OR of 1.07, which would require 14,000 patients and 14,000 controls). Clearly, it is only by pooling the results of many large studies that such sample sizes will be achieved. From our data, however, it does not appear that the functional MHC2TA − 168*G/A SNP has a large role in the etiology of RA.
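The order of magnitude of the sample size quoted for an OR of 1.07 can be checked with a standard 2-proportion formula. The text does not state the power or model behind the figure of 14,000, so the sketch below assumes 80% power, a 2-sided 5% significance level, a dominant model, a minor allele frequency of 22%, and equal numbers of patients and controls; all of these are assumptions made for illustration.

```python
from statistics import NormalDist


def n_per_group(maf, odds_ratio, power=0.80, alpha=0.05):
    """Approximate subjects needed per group to compare carrier frequencies."""
    norm = NormalDist()
    p0 = 1 - (1 - maf) ** 2                # carrier frequency in controls
    odds1 = odds_ratio * p0 / (1 - p0)     # carrier odds in cases
    p1 = odds1 / (1 + odds1)               # carrier frequency in cases
    z = norm.inv_cdf(1 - alpha / 2) + norm.inv_cdf(power)
    variance = p0 * (1 - p0) + p1 * (1 - p1)
    return z * z * variance / (p1 - p0) ** 2


required = n_per_group(0.22, 1.07)
print(f"approximately {required:,.0f} patients and the same number of controls")
```

Under these assumptions the formula gives a little over 14,000 subjects per group, the same order as the figure quoted above; requiring 95% power instead would raise the number considerably.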


We are grateful to Professor J. Todd and Dr. Neil Walker for providing HLA–DRB1 genotype data on this cohort.