Emerging patterns of risk factor make-up enable subclassification of rheumatoid arthritis



The spectrum of human autoimmune diseases is characterized by phenotypic heterogeneity. Whereas some autoimmune diseases primarily involve dysfunction of one organ, such as Graves' disease, type 1 diabetes mellitus, and myasthenia gravis, other autoimmune diseases may affect more than one organ, as observed in systemic lupus erythematosus (SLE) and rheumatoid arthritis (RA). Although differences in clinical expression between these diseases exist, they are likely to have a partly overlapping pathogenesis, because common underlying risk factors have been identified. For example, the HLA–DR3 alleles are associated with type 1 diabetes mellitus (1), Graves' disease (2), myasthenia gravis (3), and SLE (4), whereas HLA–DR2 has been reported to be a risk factor for multiple sclerosis (5) and SLE (4). More recently, certain genetic variants of, for example, PTPN22 and CTLA4 were identified as common risk factors for several autoimmune diseases, such as Graves' disease (2), type 1 diabetes mellitus (1), and RA (6).

Heterogeneity may also be present within one autoimmune disease; individuals with the same disease may differ in their phenotypic expression as well as in their (genetic) risk factor make-up. Heterogeneity might even be found between populations with different ethnic backgrounds, as exemplified by the presence of an association between PADI4 and susceptibility to RA in the Japanese population and the absence of this association in European Caucasian patients with RA (7). RA is a heterogeneous autoimmune disorder of unknown cause with a variable clinical expression. The definition of RA is phenotypic and has been developed by a consensus procedure by clinical experts, in which the clinical characteristics of patients with classic RA were compared with those of patients with longstanding RA. This resulted in the American College of Rheumatology (formerly, the American Rheumatism Association) criteria, according to which the diagnosis of RA can be established by the presence of 4 of 7 phenotypic features (8). In the pathogenesis of RA, genetic factors play an important role and likely account for ∼60% of disease susceptibility and expression.

The most important genetic risk factor for RA is found within the HLA system. In the past few years, considerable advances have been made in understanding the role of HLA in RA as well as the contribution of non-HLA loci. This review article describes these advances and assesses the contribution of these genetic factors to the RA phenotype. We conclude that the phenotype RA can be subclassified on the basis of serologic factors (anti–citrullinated protein antibodies [ACPAs]), and that the genetic factors identified thus far confer risk to a subclass of RA, either ACPA-positive RA or ACPA-negative RA. Identification of subsets of patients with RA based on genetic or serologic markers is relevant, because enhanced understanding of the pathophysiologic differences between these subsets should facilitate both the development and optimal initiation of targeted therapies.


The autoimmune nature of RA is reflected by, among other factors, the presence of autoantibodies. The classic antibody associated with RA is rheumatoid factor (RF), an antibody that is directed to the Fc portion of IgG. This antibody is not uniquely present in patients with RA and can also be found in patients with other autoimmune diseases and infectious diseases, as well as in healthy (elderly) persons. In the past decade, considerable interest was devoted to autoantibodies against citrullinated proteins, proteins in which the amino acid arginine is posttranslationally modified to the nonstandard amino acid citrulline. In mice, it has been shown that infusion of ACPAs into animals with arthritis leads to exacerbation of arthritis, whereas a reduction in ACPA levels is associated with a less severe disease (9). These data indicate that ACPAs are directly involved in the progression of arthritis, but that they are not endowed with the capacity to induce arthritis in healthy hosts. These characteristics resemble the findings in humans, because ACPAs can be observed years before disease onset (10, 11) and are correlated with the severity of RA (12).

Because of these features suggesting that ACPAs represent a factor influencing the severity of RA, combined with the observation that ACPAs display a high specificity (13) and are predictive of the development of RA (14, 15), it has been hypothesized that ACPAs are relevant to the pathophysiology of RA. This hypothesis remains unproven, because these features could also be a result of a bystander effect. By analogy, many of the features of the ACPA response also apply to the autoantibody response to tissue transglutaminase in celiac disease: anti–tissue transglutaminase antibodies are highly specific for celiac disease (90–95%), and infiltration of these antibodies in the small bowel mucosa has been observed before the onset of clinical symptoms (16). Nonetheless, the intestinal lesions in patients with celiac disease are generated by gluten-reactive T cells and not by anti–tissue transglutaminase antibodies. Production of these bystander autoantibodies is likely to be initiated by the ability of tissue transglutaminase to crosslink itself to deamidated gluten peptides and the subsequent uptake and presentation of these complexes by tissue transglutaminase–specific B cells to gluten-specific T cells that provide help for antibody production (17). However, regardless of whether or not ACPAs are pathogenic, studies investigating the ACPA response have revealed important novel insights into the contribution of genetic risk factors to the phenotype of RA.

HLA immune response genes

In classic studies on the genetic determinants that influence antibody production in mice, a region was found, the immune response region, that controls the magnitude and specificity of antibody production (18). Because the magnitude of the antibody response in the first-generation offspring of parents that produce high and low levels of antibodies was comparable with the magnitude of response in the parent producing high levels of antibodies, it was concluded that this region influenced antibody production in a dominant manner (19, 20). Furthermore, it was observed that the ability of an animal to generate a proper antibody response against different model peptides was strictly correlated with the presence of the genetic variant located in the immune response region, denoting that the control of immune response genes was specific for the amino acid composition of the antigen (19). Subsequently, the immune response region was shown to be similar to the H-2 region, the major histocompatibility complex (MHC) region in mice. The human analog of this region is the HLA region.

HLA and RA

Research on the relationship between HLA and susceptibility to RA began in 1969, with the observation that lymphocytes from patients with RA were unreactive in mixed lymphocyte culture reactions against cells from other patients with RA (21). In 1976, Stastny proposed that the low responsiveness in RA was based on the sharing of genes within the HLA region (22). Because it was discerned at approximately the same time that the main factor driving differences in a mixed lymphocyte culture reaction was the HLA–D locus, the HLA–DR products were studied in greater detail using HLA–DR–specific antisera. Two years later, Stastny demonstrated that particularly HLA–DRw4 is more frequently present in RF-positive patients with RA than in controls (23). Later studies showed that several other HLA–DRB1 alleles were also associated with the disease. The products of these alleles appeared to share an amino acid sequence at position 70–74 in the third hypervariable region of the DRβ1 chain of the HLA–DRB1 molecule (QKRAA, QRRAA, or RRRAA). These residues are part of the ridge of the peptide-binding site; therefore, it has been postulated that the shared epitope (SE) motif itself is directly involved in the pathogenesis of RA by allowing the presentation of an arthritogenic peptide to T cells (24). Unfortunately, this so-called SE hypothesis has not yet been confirmed by the identification of specific arthritogenic peptides that bind to the HLA–DR proteins in RA. Nevertheless, it has been suggested that citrullination could allow the binding of certain peptides to the HLA SE molecules (25, 26), and that the association of the SE alleles with RA (although quantitatively varying between alleles and populations) is robust because of its consistent association with disease susceptibility and severity among varied ethnic populations.


As a consequence of the classic findings described above, combined with the observation that ACPAs appeared specific for and predictive of RA, the relationship between ACPAs and the HLA–DRB1 SE alleles was recently studied in greater depth. These studies unraveled important aspects of the correlation between HLA SE alleles and RA, because it was observed that the SE alleles show associations in only RA patients who carry ACPAs and not in patients with ACPA-negative RA (27). This provided the first evidence that the HLA–DRB1 SE alleles do not confer risk of RA as such, but rather confer risk of a particular phenotype of RA. This finding also led to the hypothesis that the SE alleles confer a risk of ACPAs rather than (ACPA-positive) RA. To investigate this assumption, the progression from recent-onset undifferentiated arthritis (UA) to RA was studied in relation to the SE alleles and autoantibodies (14). Among patients who presented with UA, the presence of ACPAs was associated with progression to RA, in both SE-positive and in SE-negative patients. In contrast, in both ACPA-positive and ACPA-negative UA, the association between SE alleles and the development of RA was lost, indicating that the predictive value of the SE alleles is lost once the ACPA response has developed (14). Although these findings were made in a disease population rather than in healthy subjects, they do indicate that the SE alleles confer risk to ACPAs, and that these antibodies explain the association between the SE alleles and RA.
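The stratification argument above can be made concrete with a small worked example. The sketch below uses entirely hypothetical counts of UA patients (they are not data from reference 14) chosen so that SE carriage is associated with progression to RA when ACPA status is ignored, but not within either ACPA stratum, whereas ACPA positivity remains associated with progression within both SE strata. All function and variable names are illustrative assumptions.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a, b = exposed with / without the outcome
    c, d = unexposed with / without the outcome
    """
    return (a * d) / (b * c)


# HYPOTHETICAL counts of UA patients as (progressed to RA, did not progress),
# keyed by (ACPA status, SE status). These are NOT data from reference 14.
COUNTS = {
    ("ACPA+", "SE+"): (60, 20),
    ("ACPA+", "SE-"): (15, 5),
    ("ACPA-", "SE+"): (10, 40),
    ("ACPA-", "SE-"): (20, 80),
}


def se_odds_ratio_within(acpa_status):
    """SE carriage versus progression inside a single ACPA stratum."""
    a, b = COUNTS[(acpa_status, "SE+")]
    c, d = COUNTS[(acpa_status, "SE-")]
    return odds_ratio(a, b, c, d)


def se_odds_ratio_crude():
    """SE carriage versus progression when ACPA status is ignored."""
    strata = ("ACPA+", "ACPA-")
    a = sum(COUNTS[(s, "SE+")][0] for s in strata)
    b = sum(COUNTS[(s, "SE+")][1] for s in strata)
    c = sum(COUNTS[(s, "SE-")][0] for s in strata)
    d = sum(COUNTS[(s, "SE-")][1] for s in strata)
    return odds_ratio(a, b, c, d)


def acpa_odds_ratio_within(se_status):
    """ACPA positivity versus progression inside a single SE stratum."""
    a, b = COUNTS[("ACPA+", se_status)]
    c, d = COUNTS[("ACPA-", se_status)]
    return odds_ratio(a, b, c, d)


print("SE, crude:", round(se_odds_ratio_crude(), 2))
print("SE, within ACPA+:", se_odds_ratio_within("ACPA+"))
print("SE, within ACPA-:", se_odds_ratio_within("ACPA-"))
print("ACPA, within SE+:", acpa_odds_ratio_within("SE+"))
print("ACPA, within SE-:", acpa_odds_ratio_within("SE-"))
```

With these made-up numbers, the crude odds ratio for SE carriage is well above 1, but it equals 1 in each ACPA stratum. This is the pattern expected if the SE alleles act on disease risk through the ACPA response rather than independently of it.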

In analogy with the classic studies performed in mice showing that the MHC molecules dictate both the magnitude and the specificity of humoral immune responses, the influence of the HLA–DRB1 SE alleles on the magnitude and specificity of the ACPA response was determined. It was demonstrated that among ACPA-positive patients with RA, lower levels of ACPAs were present in patients without SE alleles than in patients with 1 or 2 SE alleles (14). However, patients carrying 2 SE alleles did not display higher levels compared with patients carrying 1 SE allele. This is compatible with a dominant effect of the SE alleles on the level of circulating antibodies, as would have been predicted on the basis of the studies on immune response genes in the mouse. Subsequently, the specificity of the ACPA response was studied in patients with RA, by measuring serum antibodies against a citrullinated peptide derived from vimentin and antibodies against a citrullinated fibrinogen peptide. In 2 independent cohorts, the SE alleles predisposed to the development of antibodies against citrullinated vimentin and not the development of antibodies against citrullinated fibrinogen (28, 29). These data indicate that SE alleles act as “classic” immune response genes in the ACPA response, because they influence both the magnitude and the specificity of this RA-specific antibody response.
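The distinction between a dominant effect and a gene-dose (additive) effect of the SE alleles on ACPA levels can be illustrated as follows. This is a minimal sketch with hypothetical ACPA levels in arbitrary units, not data from reference 14. It fits a simple least-squares line to the levels under two codings of SE copy number and compares the residual sums of squares; the coding functions and variable names are assumptions made for illustration only.

```python
from statistics import mean

# HYPOTHETICAL ACPA levels (arbitrary units) in ACPA-positive patients,
# grouped by number of SE alleles carried. Not data from reference 14.
LEVELS_BY_SE_COPIES = {
    0: [120, 150, 135],
    1: [400, 380, 420],
    2: [410, 390, 400],
}


def dominant_code(copies):
    """Dominant model: any SE allele counts the same as two."""
    return 1 if copies >= 1 else 0


def additive_code(copies):
    """Additive (gene-dose) model: each SE allele adds the same increment."""
    return copies


def residual_ss(levels, code):
    """Residual sum of squares of a least-squares fit: level = a + b * code(copies)."""
    xs, ys = [], []
    for copies, values in levels.items():
        for value in values:
            xs.append(code(copies))
            ys.append(value)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


rss_dominant = residual_ss(LEVELS_BY_SE_COPIES, dominant_code)
rss_additive = residual_ss(LEVELS_BY_SE_COPIES, additive_code)
print("residual SS, dominant coding:", round(rss_dominant, 1))
print("residual SS, additive coding:", round(rss_additive, 1))
```

In this constructed example the dominant coding fits far better, because carriers of 1 and 2 SE alleles have similar levels. A formal analysis of real data would require an appropriate statistical test rather than a bare comparison of residuals.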

Risk factors for ACPA-positive RA

The observation that the HLA SE alleles confer risk to ACPA-positive RA only and not to ACPA-negative RA suggests that the etiopathology of ACPA-positive RA is different from that of ACPA-negative RA, and that ACPA-positive and ACPA-negative RA should be considered as separate subsets of the disease. The establishment of these different subsets brings up the question of whether other genetic or environmental risk factors are also associated with either ACPA-positive or ACPA-negative RA.

The most important non–HLA-linked genetic risk factor for RA is the C1858T single-nucleotide polymorphism in the protein tyrosine phosphatase N22 gene PTPN22, because the association between PTPN22 and RA has been demonstrated in several populations (30–32). Intriguingly, the presence of the PTPN22 T allele is associated with ACPA-positive RA and not with ACPA-negative RA (33). Moreover, in 3 independent cohorts of patients with RA, a gene–gene interaction for the HLA SE alleles and PTPN22 was shown for ACPA-positive RA but not for ACPA-negative RA (34), suggesting that HLA–DRB1 SE alleles and PTPN22 take part in the same biologic route leading to ACPA-positive RA. PTPN22 encodes a lymphoid tyrosine phosphatase, which affects the threshold for T cell receptor signaling through binding to a Csk kinase. In vitro experiments have shown that the PTPN22 T allele–encoded protein binds less efficiently to Csk, suggesting that the T cells expressing the T allele are hyperresponsive (30). Knocking out the murine homolog of PTPN22 resulted in lower thresholds of T cell activation and induced an increased expansion and function of the effector/memory T cell pool, which was associated with elevated levels of serum antibodies (35).
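A gene–gene interaction such as the one described above for the SE alleles and PTPN22 is often quantified as a departure from additivity of effects. The text here does not specify which measure was used in reference 34, so the following is only a sketch of one widely used approach: the relative excess risk due to interaction (RERI) and the attributable proportion due to interaction (AP). The relative risks are hypothetical and the function names are illustrative.

```python
def reri(rr_both, rr_first_only, rr_second_only):
    """Relative excess risk due to interaction.

    All relative risks are expressed against the group carrying neither factor.
    A value of 0 means the joint effect is exactly additive.
    """
    return rr_both - rr_first_only - rr_second_only + 1.0


def attributable_proportion(rr_both, rr_first_only, rr_second_only):
    """Share of the risk in doubly exposed individuals attributed to interaction."""
    return reri(rr_both, rr_first_only, rr_second_only) / rr_both


# HYPOTHETICAL relative risks for ACPA-positive RA (not taken from reference 34):
RR_SE_ONLY = 4.0       # SE alleles present, PTPN22 T allele absent
RR_PTPN22_ONLY = 2.0   # PTPN22 T allele present, SE alleles absent
RR_BOTH = 10.0         # both risk factors present

print("RERI:", reri(RR_BOTH, RR_SE_ONLY, RR_PTPN22_ONLY))
print("AP:", attributable_proportion(RR_BOTH, RR_SE_ONLY, RR_PTPN22_ONLY))
```

With these illustrative values, half of the risk in carriers of both factors would be attributed to their joint action, which is the kind of result that is interpreted as the two factors acting in the same biologic pathway.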

Both of the above-mentioned studies indicate that the PTPN22 risk allele is associated with a reduced down-regulation of T cell activation. In contrast, another study indicates that in carriers of the PTPN22 T allele, the intrinsic phosphatase activity is increased, revealing a gain of function (36). These authors suggested that the paradoxical findings might be explained by a less sufficient activity of T regulatory cells in the presence of the PTPN22 T allele (36). Moreover, PTPN22 is expressed not only on T cells but also on B cells, natural killer cells, and macrophages (where its function is unknown), indicating that the action of PTPN22, in addition to its capacity to modulate T cell receptor signaling, might also be routed through manipulation of other aspects of the immune system.

The CTLA-4 protein plays an important role in down-regulation of T cell activation. To be fully activated, the T cell requires the recognition of an antigen bound to HLA and a costimulatory signal between CD80 or CD86 on the antigen-presenting cell and CD28 on the T cell. This costimulatory signal can be inhibited by CTLA-4, which is expressed on T cells, because CTLA-4 binds to CD80/CD86 with higher affinity. It is suggested that CTLA-4 regulates the threshold for signals through competition with CD28 for ligands (37). Additionally, observations from a recent study support a distinct mechanism that is based on limited contact time between antigen-presenting cells and T cells. Schneider et al (38) observed that CTLA-4 signaling results in higher motility of T cells in lymph nodes and thereby leads to diminished ability of T cells to interact with professional antigen-presenting cells, and thus to reduced T cell activation (38). Because CTLA4 polymorphisms have been associated with several immune diseases, CTLA4 seems to be a general susceptibility locus for autoimmunity, most likely through an altered ability to dampen T cell responses. The association between the A49G single-nucleotide polymorphism in CTLA4 and RA has been frequently investigated, but the results have been inconsistent (39). More recently, however, it was suggested that CTLA4 confers risk to only the subset of ACPA-positive RA patients and not to patients with ACPA-negative disease, potentially adding a third genetic risk factor that contributes mainly to ACPA-positive disease (40).

Information is scarce on environmental factors that are important for the development or course of RA. Smoking is the only environmental factor that has reproducibly been linked to an increased risk of RA. Klareskog et al were the first investigators to identify a gene–environment interaction between smoking and the HLA–DRB1 SE alleles (41). Importantly, this gene–environment interaction was demonstrated only in ACPA-positive disease and not in ACPA-negative disease, indicating that the effects of smoking also play a role in the pathway that is associated with ACPA-positive RA (41, 42). Klareskog et al also observed positive immunostaining for citrullinated proteins in bronchoalveolar lavage cells from smokers but not in those from nonsmokers (41), suggesting that smoking increases the level of citrulline-modified proteins. Therefore, it is hypothesized that in genetically predisposed (e.g., PTPN22 T allele–positive, CTLA-4 G–positive, and/or SE-positive) individuals, an HLA class II–restricted T cell response to citrullinated proteins is induced once these proteins are generated in a proinflammatory environment such as that which might frequently be present in the lungs of smokers (43). The help provided by these T cells might allow maturation and switching of B cells, resulting in (further) maturation of the ACPA response. Thus, the paradigm that all of these factors participate in the induction, propagation, and/or maintenance of citrullinated protein–directed immunity is gaining momentum and will be the subject of extensive research in the future.

Risk factors for ACPA-negative RA

In contrast to the risk factors described above, all of which predispose to ACPA-positive disease, other risk factors are associated with ACPA-negative disease, which further substantiates the notion that distinct pathways underlie ACPA-positive and ACPA-negative disease. In particular, HLA–DR3 is more frequently present in patients with ACPA-negative RA than in control subjects (44, 45), but it is not known whether this association is attributable to the HLA–DR3 gene itself or to genes linked to this locus. HLA–DR3 is part of a conserved ancestral haplotype (A1;B8;DRB1*03). The class III MHC region, encoding, among others, tumor necrosis factor and lymphotoxin α, is also part of this ancestral haplotype and has been described to influence the susceptibility to RA. Therefore, it is conceivable that ACPA-negative disease is primarily associated with genetic factors encoding particular mediators of inflammation.

Although ACPA-negative RA is, by definition, not associated with ACPAs, it cannot be excluded that it is associated with another antibody reaction that still needs to be defined. Only recently, a novel autoantibody response was observed in “seronegative” myasthenia gravis. Myasthenia gravis, which is also associated with HLA–DR3, is typically characterized by the presence of anti–acetylcholine receptor antibodies, but in 15% of patients with generalized myasthenia gravis (and in 50% of patients with ocular myasthenia gravis), these autoantibodies are not detectable. Recently, it was demonstrated that a considerable portion of these patients harbor antibodies against muscle-specific kinase, and that these antibodies are not found in patients who were formerly classified as having seropositive myasthenia gravis (46, 47). A similar scenario might apply to ACPA-negative, RF-negative RA, although at present there is no indication of the presence of autoantibodies in patients with RF-negative, ACPA-negative RA. Although challenging, it will be relevant to link the genetic risk factors to the biologic process contributing to ACPA-negative disease, because this will further increase the understanding of RA.

Clinical heterogeneity?

The difference in associations between several genetic factors and either ACPA-positive or ACPA-negative RA (Table 1) gives rise to the hypothesis that both manifestations of RA are a consequence of distinct pathogenetic mechanisms and also brings up the question of whether the clinical expression of ACPA-positive and ACPA-negative RA is different. An extensive comparison of clinical characteristics at the time of disease onset revealed no differences between RA patients with and those without ACPAs (48). However, during followup, patients with ACPA-positive RA had more inflamed joints and a higher level of joint destruction (48). Thus, although different risk factors are associated with ACPA-positive and ACPA-negative RA, the presence or absence of ACPAs is not associated with a distinguishable clinical phenotype at the time of disease presentation in patients with early RA but is correlated with a higher level of inflammation and destruction during the course of RA.

Table 1. Identified risk factors for ACPA-positive and ACPA-negative RA*

ACPA-positive RA
 HLA–DRB1 SE alleles
 Gene–gene interaction (SE alleles–PTPN22)
 Gene–environment interaction (SE alleles–smoking)
ACPA-negative RA
 A1;B8;DRB1*03 haplotype

* ACPA = anti–citrullinated protein antibody; RA = rheumatoid arthritis; SE = shared epitope.


The risk factors described above (HLA, PTPN22, CTLA4) all seem to influence the adaptive immune response, indicating that adaptive immunity, mediated by either T cells or B cells, plays a pivotal role in the process leading to ACPA-positive RA (Figure 1). More recently, another set of risk factors was described in association with other inflammatory diseases, including several rheumatologic disorders. The genes encoding these risk factors are likely to play an important role in the immune response against infection but, when deranged, apparently can lead to self-directed inflammation. In patients with autoinflammatory disorders, local factors at disease-prone sites determine the activation of the innate immune system and induce a self-directed tissue inflammation. In the last decade, knowledge about the innate immune system has been enormously improved by the identification of pattern-recognition receptors. These receptors primarily recognize conserved molecular patterns from microorganisms but can also be activated by several host proteins and danger signals that are released by dying cells. The inadvertent activation of pattern-recognition receptors may be involved in the pathogenesis of autoinflammatory diseases (49).

Figure 1.

Influence of rheumatoid arthritis (RA) risk factors on the adaptive immune response. All of the identified risk factors for anti–citrullinated protein antibody (ACPA)–positive RA are part of the adaptive immune response. HLA class II molecules on antigen-presenting cells (APCs) are able to present antigens (Ag) to T cells; according to the shared epitope (SE) hypothesis, the SE motif allows the presentation of arthritogenic peptides to T cells. To become fully activated, the T cell requires not only the recognition of an antigen bound to HLA but also a costimulatory signal between CD80 or CD86 and CD28. CTLA4 blocks the interaction between these costimulatory molecules, possibly by competition with CD28 for ligands and/or inhibition of the APC–T cell interaction. PTPN22 affects the threshold of T cell receptor (TCR) signaling. Priming of the ACPA-directed immune response could take place at sites where citrullinated proteins are present in an environment that disturbs tolerance to these proteins, such as an inflammatory reaction following infection. Subsequently, activated CD4-positive T cells provide help for antibody production by B cells. Whether ACPAs contribute to the development of arthritis is still unresolved. BCR = B cell receptor; DC = dendritic cell.

Recently, a family of pattern-recognition receptors was identified, the Nod-like receptors (NLRs), which are sensors for intracellular bacteria, viruses, or danger signals (50). NLRs are composed of 3 domains: an N-terminal caspase activation and recruitment domain or pyrin effector domain, a nucleotide-binding NACHT domain, and a C-terminal variable-number leucine-rich repeat domain. Two NLRs, NALP3 (which is also called cryopyrin) and NOD-2, are of particular interest with regard to autoinflammation. NALP proteins have an important role in the activation of proinflammatory caspases through the formation of a complex, called the inflammasome, that produces active interleukin-1β (IL-1β) and IL-18 through the cleavage of pro–IL-1β and pro–IL-18 (50). NALP3 is encoded by the CIAS1 gene. Three CIAS1-associated autoinflammatory diseases have been described, which are, in order of increasing severity, familial cold-induced autoinflammatory syndrome, Muckle-Wells syndrome, and chronic infantile neurologic, cutaneous, articular syndrome. These are monogenic autosomal-dominant disorders that are characterized by recurrent episodes of systemic attacks of inflammation with intermittent symptom-free intervals. Patients with these disorders respond very well to anti–IL-1 therapy.

The attacks of inflammation in the described CIAS1-related diseases are transient. In contrast, diseases that are characterized by chronic self-directed inflammation can also be mediated by deranged pattern-recognition receptors. It was recently demonstrated that Crohn's disease is associated with NOD-2, and that 3 common mutations in the NOD-2 gene account for ∼80% of the variants associated with Crohn's disease (50). The NOD-2 variants are also associated with colitic spondylarthritis (51). NOD-2 detects bacterial peptidoglycans, particularly muramyldipeptide, which is present in both gram-positive and gram-negative bacteria and activates, among others, NF-κB. The 3 Crohn's disease–related mutations impair muramyldipeptide recognition, and defects in NF-κB activation have been shown in cultured macrophages or peripheral blood monocytes from patients who are homozygous for the NOD-2 minor allele (50). Similarly, NOD-2–knockout mice do not produce proinflammatory cytokines or undergo NF-κB activation (50). These data seem to be in contrast with increased levels of proinflammatory cytokines and the presence of activated NF-κB in the lamina propria of patients with Crohn's disease; therefore, the role of NOD-2 in the pathogenesis of Crohn's disease is complex and at present not fully understood. Nonetheless, these data on NOD-2 reveal that genetic variants in NLRs might initiate an abnormal inflammatory response of the innate immune system. Although no association between NOD-2 variants and RA in general was found (52, 53), associations with ACPA-positive or ACPA-negative RA separately were not assessed.

The findings described above clearly show that variance in genes that can directly influence innate inflammatory responses can significantly contribute to autoinflammatory/autoimmune-like diseases. It can be conceptualized that factors such as NLRs also play a role in the pathogenesis of arthritis, and that genetic variance in these factors is involved in the etiopathology of (ACPA-negative) RA and/or contributes to the development of rheumatic disorders that are characterized by symptom-free intervals, such as palindromic rheumatism. However, this field is yet to be explored.


The RA phenotype can be subclassified according to the presence or absence of ACPAs, because both subsets of RA have different risk factors. Genetic risk factors for ACPA-positive RA are the presence of HLA–DRB1 SE alleles and variants in PTPN22 and CTLA4, whereas the A1;B8;DRB1*03 haplotype is a risk factor for ACPA-negative RA. The differences in risk factors suggest that ACPA-positive and ACPA-negative RA have different etiopathologies. Therefore, in future etiopathology research, to avoid the risk of false-negative findings, these 2 subgroups of RA should be studied separately. All of the genetic factors associated with ACPA-positive RA influence the adaptive immune response. Recently, newly identified factors that belong to the innate immune system were described to be associated with several autoinflammatory disorders, including rheumatic diseases. It will be very interesting to study whether particular genetic variants of factors involved in innate immunity that induce a deregulated inflammatory reaction also contribute to the process that underlies the development of (ACPA-negative) RA.
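The recommendation to study the 2 subgroups separately can be illustrated with a simple calculation. In the sketch below, which uses hypothetical carrier counts rather than data from any cited study, a risk allele is associated with only one subset of RA. Pooling both subsets attenuates the observed odds ratio, which in an underpowered study could yield a false-negative result. All names and numbers are illustrative assumptions.

```python
def odds_ratio(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Odds ratio for carriage of a risk allele in cases versus controls."""
    return (case_carriers * control_noncarriers) / (case_noncarriers * control_carriers)


# HYPOTHETICAL counts as (carriers, noncarriers) of a risk allele.
CONTROLS = (40, 60)
ACPA_POSITIVE_CASES = (80, 20)
ACPA_NEGATIVE_CASES = (40, 60)


def or_versus_controls(cases):
    """Odds ratio for one group of cases against the shared control group."""
    return odds_ratio(cases[0], cases[1], CONTROLS[0], CONTROLS[1])


# Pooling ignores ACPA status and treats RA as a single phenotype.
pooled_cases = (
    ACPA_POSITIVE_CASES[0] + ACPA_NEGATIVE_CASES[0],
    ACPA_POSITIVE_CASES[1] + ACPA_NEGATIVE_CASES[1],
)

print("ACPA-positive RA vs controls:", or_versus_controls(ACPA_POSITIVE_CASES))
print("ACPA-negative RA vs controls:", or_versus_controls(ACPA_NEGATIVE_CASES))
print("All RA vs controls:", or_versus_controls(pooled_cases))
```

In this constructed example the pooled estimate lies between the 2 subset-specific estimates, and the larger the unaffected subset, the closer the pooled estimate moves toward 1.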


Dr. van der Helm-van Mil had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Manuscript preparation. van der Helm-van Mil, Huizinga, de Vries, Toes.


We are grateful to our colleague L. A. Trouw for providing help in improving the quality of the artwork in Figure 1.