Quantitative heritability of anti–citrullinated protein antibody–positive and anti–citrullinated protein antibody–negative rheumatoid arthritis

Abstract

Objective

The majority of genetic risk factors for rheumatoid arthritis (RA) are associated with anti–citrullinated protein antibody (ACPA)–positive RA, while far fewer genetic risk factors have been identified for ACPA-negative RA. This study was undertaken to quantify the contribution of genetic risk factors in general, and of the predisposing HLA–DRB1 shared epitope (SE) alleles in particular, to the ACPA-positive and ACPA-negative subsets of RA, by computing their heritability and assessing the contribution of the HLA SE alleles.

Methods

One hundred forty-eight RA twin pairs, in which at least 1 twin of each pair had RA, were tested for ACPAs and typed for HLA–DRB1 genotypes. Heritability was assessed in a logistic regression model including a bivariate, normally distributed random effect, representing the contribution of unobserved genetic factors to RA susceptibility, with the correlation of the random effects fixed according to twin zygosity. The contribution of the HLA SE alleles to genetic variance was assessed using a similar model, except that estimates were based on genotype-specific population prevalences.

Results

The heritability of RA among the twin pairs was 66% (95% confidence interval [95% CI] 44–75%). For ACPA-positive RA, the heritability was 68% (95% CI 55–79%), and for ACPA-negative RA it was 66% (95% CI 21–82%). Presence of the HLA SE alleles explained 18% (95% CI 16–19%) of the genetic variance of ACPA-positive RA but only 2.4% (95% CI 1.6–10%) of the genetic variance of ACPA-negative RA.

Conclusion

The heritability of ACPA-positive RA is comparable with that of ACPA-negative RA. These data indicate that genetic predisposition plays an important role in the pathogenesis of ACPA-negative RA, for which most individual genetic risk factors remain to be identified.

Several genetic risk factors for rheumatoid arthritis (RA) have been newly identified in the past few years. Candidate gene studies and new techniques such as whole-genome scans have led to the discovery of various new genetic variants that predispose individuals to RA, in, for example, the genomic regions encoding PTPN22, STAT4, and TRAF1/C5 (1–3). Interestingly, the majority of the genetic factors implicated in RA have been found to preferentially confer risk of autoantibody-positive disease. The importance of distinguishing autoantibody-positive from autoantibody-negative RA is increasingly being recognized, as a result of the recent findings concerning anti–citrullinated protein antibodies (ACPAs).

ACPAs were first described as anti–perinuclear factor and antikeratin antibodies, both of which bind to citrullinated filaggrin (4–7). Since then, ACPAs have been found to be the factor most predictive of the future development of RA (8–10). Due to their superior specificity and predictive characteristics, as compared with those of IgM rheumatoid factor, ACPAs are increasingly being used for diagnostic purposes (11). Approximately two-thirds of patients with established RA are ACPA positive, and the disease in these patients is characterized by a destructive phenotype with higher rates of joint destruction than have been observed in patients with ACPA-negative RA (12).

ACPA-positive and ACPA-negative RA differ with regard to not only the clinical disease phenotype but also their associations with different genetic and environmental risk factors (13, 14). The HLA–DRB1 shared epitope (SE) alleles, for example, which confer the highest risk among all known genetic risk factors, predispose specifically to the development of an anti–citrullinated protein immune response, rather than to the development of RA (15). PTPN22 gene variants, as well as the TRAF1/C5 locus, have been found to be predominantly associated with ACPA-positive RA, although the possibility of an association with ACPA-negative RA cannot be excluded (2, 16). It is likely that studies in which a larger number of ACPA-negative patients are included will reveal an association between these genetic risk factors and ACPA-negative RA, such as has been shown for STAT4 (17). At this moment, however, the number of genetic risk factors that have been described to be predominantly associated with ACPA-negative RA is more limited and includes HLA–DR3 haplotypes and interferon regulatory factor 5 (IRF-5) (14, 18).

Not only genetic risk factors but also environmental risk factors differ between ACPA-positive and ACPA-negative RA. Recent data have shown an interaction between smoking and the SE alleles in conferring risk of ACPA-positive RA (19, 20). The differences in clinical phenotype and underlying risk factors have led to the concept that ACPA-positive RA and ACPA-negative RA constitute distinct entities with different pathophysiologic mechanisms of disease (21). The extent to which genetic factors contribute to disease development may differ between these 2 subsets, and thus far, more genetic factors have been found to be associated with ACPA-positive RA than with ACPA-negative RA.

Twins are a valuable source of information for investigating the contribution of genetic factors to disease development in humans. Twin studies provide 2 measures of the influence of genetic factors on disease development: disease concordance and heritability. Concordance represents the frequency with which the twin brothers or sisters of diseased individuals are also affected. While concordance is influenced by the prevalence of the disease (or disease subset), heritability is not. Heritability is a quantitative measure of the proportion of variation in disease susceptibility that can be explained by genetic factors. Another advantage of heritability estimates compared with concordance rates is that heritability is based on the combined data from both mono- and dizygotic twins. This makes heritability modeling a more powerful method, especially when using small populations such as RA twin cohorts. For RA, the heritability in twin cohorts has been estimated to be ∼60% (22). However, there are no data available for the heritability of autoantibody-positive RA versus that of autoantibody-negative RA.

In this study, we determined the heritability of ACPA-positive and ACPA-negative RA by applying quantitative genetic methods to data from a large cohort of twin pairs. In addition, we assessed the contribution of the predisposing HLA–DRB1 SE alleles to the genetic variance in both subsets of RA.

PATIENTS AND METHODS

Study population.

A nationwide study of twin pairs, in which 1 or both of the twins in each pair had RA, was performed in the UK in 1989 (23). Twin pairs were recruited with the use of 2 parallel strategies. All UK rheumatologists were contacted and requested to ask all of their patients with RA whether they were a twin. Second, there was a simultaneous multimedia campaign in which patients in whom RA had been diagnosed and who had a living twin were invited to contact a UK study center. Both members of each twin pair were visited at home by trained research nurses who recorded each subject's detailed medical history and demographic characteristics and performed joint examinations. Blood samples were collected from all subjects; the serum was stored, and lymphocytes were separated for HLA typing. High-resolution HLA–DRB1 genotyping was performed to allow identification of the RA-associated HLA SE alleles (24).

RA was diagnosed according to the American College of Rheumatology (formerly, the American Rheumatism Association) 1987 revised criteria (25). Twin pairs were eligible for inclusion only if at least 1 twin satisfied these criteria. Zygosity status was verified by DNA fingerprinting on all same-sex pairs, using nonradioactive Southern blotting with a multilocus probe (26).

Anti–cyclic citrullinated peptide 2 (anti–CCP-2) assays.

Total IgG anti–CCP-2 was measured in serum samples by enzyme-linked immunosorbent assay (Immunoscan RA Mark 2; Eurodiagnostica, Arnhem, The Netherlands). Samples with an anti–CCP-2 assay value higher than 25 units/ml were considered positive for ACPAs, according to the manufacturer's instructions.

Statistical analysis.

Summary statistics were generated to investigate the prevalence of ACPA-positive and ACPA-negative RA and the occurrence of the HLA SE alleles. Disease concordance was calculated by dividing the number of pairs in which both twins were affected by the total number of twin pairs.
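As a quick check of this definition, the following Python sketch recomputes the pairwise concordance of total RA from the pair counts reported in Table 2. The function name is illustrative and is not taken from the study, which performed its computations in R.

```python
def pairwise_concordance(concordant_pairs, discordant_pairs):
    """Concordant pairs divided by all pairs with at least 1 affected twin."""
    total_pairs = concordant_pairs + discordant_pairs
    if total_pairs == 0:
        raise ValueError("at least one twin pair is required")
    return concordant_pairs / total_pairs

# Pair counts for total RA as reported in Table 2.
mz = pairwise_concordance(10, 54)  # monozygotic: 10 concordant, 54 discordant
dz = pairwise_concordance(3, 81)   # dizygotic: 3 concordant, 81 discordant

print(f"Monozygotic concordance: {100 * mz:.1f}%")
print(f"Dizygotic concordance: {100 * dz:.1f}%")
```

The same function applies to the ACPA-positive and ACPA-negative subsets by passing the corresponding rows of Table 2.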

To calculate the heritability of RA and that of ACPA-positive and ACPA-negative RA, we used a logistic regression model with a normally distributed random effect (27–29). In this case, the normally distributed random effect represents the contribution of unobserved genetic factors to RA susceptibility. The correlation between the outcomes due to genetic factors in twins is modeled according to the extent to which a twin pair shares this random effect. Monozygotic twins share all of their genome, and therefore they also share all of the random effect in this heritability model. Dizygotic twins share half of their genome; hence, they share half of the random effect in the model.

The variance of the random effect (the genetic variance) is a measurement of the total contribution of genetic factors to the outcome. Heritability is defined as the genetic variance divided by the total variance of the outcome. The variance of the random effect, with 95% confidence intervals (95% CIs), is estimated using the profile likelihood; i.e., the likelihood is computed from a grid of genetic variances, and the value at which the likelihood curve reaches its maximum is the maximum likelihood estimate of the genetic variance. Since the twin pairs were selected for having at least 1 affected twin, the prevalences of RA, ACPA-positive RA, and ACPA-negative RA were fixed to their population prevalences (see Table 3 for the values used).

Table 3. Population prevalences used for heritability modeling*

| Characteristic | Ref. | Prevalence, % |
|---|---|---|
| Prevalence of RA | 33 | 1 |
| Prevalence of ACPA+ RA | 11 | 0.67 |
| Prevalence of ACPA− RA | 11 | 0.33 |
| Prevalence of ACPAs in healthy controls | 34 | 1 |
| Prevalence of SE in ACPA+ RA patients | 31 | 84 |
| Prevalence of SE in ACPA− RA patients | † | 59 |
| Prevalence of SE in healthy controls | 32 | 44 |

* The prevalence of each genotype was fixed to the specific population prevalence as reported in the UK (31, 32) or, if not available from the UK, in studies from other Caucasian populations (11, 33, 34). SE = shared epitope (see Table 2 for other definitions).

† Worthington J: personal communication (Wellcome Trust Case Control Consortium).

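Table 2. Concordance of total RA, ACPA-positive RA, and ACPA-negative RA*

| Group | Monozygotic twins (64 pairs): concordant pairs | Monozygotic twins: discordant pairs | Monozygotic twins: concordance, % | Dizygotic twins (84 pairs): concordant pairs | Dizygotic twins: discordant pairs | Dizygotic twins: concordance, % |
|---|---|---|---|---|---|---|
| RA | 10 | 54 | 15.6 | 3 | 81 | 3.6 |
| ACPA+ RA | 8 | 38 | 17.4 | 3 | 65 | 4.4 |
| ACPA− RA | 2 | 16 | 11.1 | 0 | 16 | 0 |

* RA = rheumatoid arthritis; ACPA = anti–citrullinated protein antibody.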

Specifically, to determine heritability in each group, we computed the probability (P) of an individual being affected, using the following formula (equation 1):

$$P(y \mid u) = \frac{\exp[y(\alpha + u)]}{1 + \exp(\alpha + u)} \qquad (1)$$

in which y is equal to 1 if a twin is affected or equal to 0 if a twin is healthy, and u corresponds to the random effect (all genetic factors that contribute to the outcome). Given the population prevalence (p) of the outcome and the genetic variance (τ²) (representing the variance of the random effect), the parameter α can be derived from the following formula (equation 2):

$$p = \int \frac{\exp(\alpha + u)}{1 + \exp(\alpha + u)} \, dF(u) \qquad (2)$$

in which F represents the normal distribution of u, with a mean value of zero and variance of τ². Since u is not an observed value, it is integrated over its distribution F. Note that due to the nonlinear relationship between the population prevalence and the genetic variance, the absolute value of α increases with the genetic variance. This attenuation effect is well known from regression models that take into account errors in variables (28).
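Equation 2 has no closed-form solution for α, so it has to be solved numerically. The following Python sketch shows one possible way to do this, assuming an equally spaced quadrature over u and bisection on α. The function names, grid settings, and the use of Python are illustrative assumptions; the study used R, and its numerical details are not described.

```python
import math

def expit(x):
    """Numerically stable logistic function exp(x) / (1 + exp(x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def normal_grid(n=161, width=8.0):
    """Equally spaced standard-normal nodes with weights that sum to 1."""
    step = 2.0 * width / (n - 1)
    nodes = [-width + i * step for i in range(n)]
    weights = [math.exp(-0.5 * z * z) for z in nodes]
    total = sum(weights)
    return nodes, [w / total for w in weights]

Z, W = normal_grid()

def marginal_prevalence(alpha, tau2):
    """Right-hand side of equation 2: P(y = 1 | u) averaged over u ~ N(0, tau2)."""
    sd = math.sqrt(tau2)
    return sum(w * expit(alpha + sd * z) for z, w in zip(Z, W))

def solve_alpha(p, tau2, lo=-60.0, hi=60.0, iters=80):
    """Bisection on alpha until the marginal prevalence equals p."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if marginal_prevalence(mid, tau2) < p:
            lo = mid  # prevalence too low, alpha must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Population prevalence of RA (1%, Table 3) with illustrative variances.
for tau2 in (0.0, 2.0, 6.0):
    print(f"tau2 = {tau2:.1f}: alpha = {solve_alpha(0.01, tau2):.3f}")
```

With a variance of zero the solution reduces to the ordinary logit of the prevalence, and for a prevalence of 1% α becomes more negative as the variance grows, which is the attenuation effect described above.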

Given the values for τ² and α, the log likelihood (l) of the data can then be computed. In this formula, y₁, y₂ and u₁, u₂ represent the pairs of outcomes and pairs of random effects, respectively, in a twin pair. The correlation of the random effects is fixed at 1 if the twins are monozygotic and is fixed at ½ if the twins are dizygotic. Thus, the log likelihood is computed as follows (equation 3):

$$l(\tau^2) = \sum_{\text{twin pairs}} \log \iint P(y_1 \mid u_1)\, P(y_2 \mid u_2)\, dF(u_1, u_2) \qquad (3)$$

with F being the bivariate normal distribution of u₁, u₂. These steps are carried out for a grid of values of τ², and the τ² value at which the likelihood curve reaches its maximum is the maximum likelihood estimate of the genetic variance. The heritability is then estimated by dividing the genetic variance by the total variance of the outcome (in this case, τ² + π²/3, in which π²/3 ≈ 3.29 corresponds to the variance of the standard logistic distribution) (30).
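The grid search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' R code: the bivariate normal random effect is built from a shared and a twin-specific component, the integrals use a simple equally spaced quadrature, the grid of variances is arbitrary, and no ascertainment correction is applied beyond fixing the prevalence at 1%. The estimate it prints therefore need not reproduce Table 4. All function names are hypothetical.

```python
import math

def expit(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def normal_grid(n=81, width=8.0):
    """Equally spaced standard-normal nodes with weights that sum to 1."""
    step = 2.0 * width / (n - 1)
    nodes = [-width + i * step for i in range(n)]
    weights = [math.exp(-0.5 * z * z) for z in nodes]
    total = sum(weights)
    return nodes, [w / total for w in weights]

Z, W = normal_grid()

def solve_alpha(p, tau2, lo=-60.0, hi=60.0, iters=60):
    """Bisection on alpha so that the marginal prevalence equals p."""
    sd = math.sqrt(tau2)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        prevalence = sum(w * expit(mid + sd * z) for z, w in zip(Z, W))
        if prevalence < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pair_probs(alpha, tau2, r):
    """P(both twins affected) and P(exactly 1 affected) for correlation r.

    Each twin's random effect is a shared part (variance r * tau2) plus an
    own part (variance (1 - r) * tau2), which yields correlation r.
    """
    sd_shared = math.sqrt(r * tau2)
    sd_own = math.sqrt((1.0 - r) * tau2)
    p_both = p_one = 0.0
    for za, wa in zip(Z, W):
        # Disease probability given the shared part, averaged over the own part.
        g = sum(wb * expit(alpha + sd_shared * za + sd_own * zb)
                for zb, wb in zip(Z, W))
        p_both += wa * g * g
        p_one += wa * 2.0 * g * (1.0 - g)
    return p_both, p_one

def log_likelihood(tau2, p, data):
    """Sum of log pair probabilities over zygosity groups."""
    alpha = solve_alpha(p, tau2)
    total = 0.0
    for r, n_concordant, n_discordant in data:
        p_both, p_one = pair_probs(alpha, tau2, r)
        total += n_concordant * math.log(p_both) + n_discordant * math.log(p_one)
    return total

# (correlation, concordant pairs, discordant pairs) for total RA, Table 2.
DATA = [(1.0, 10, 54), (0.5, 3, 81)]
GRID = [0.5 * k for k in range(1, 31)]  # candidate genetic variances 0.5 to 15
LOGLIK = [log_likelihood(t, 0.01, DATA) for t in GRID]
tau2_hat = GRID[LOGLIK.index(max(LOGLIK))]
h2_hat = tau2_hat / (tau2_hat + math.pi ** 2 / 3.0)
print(f"tau2 = {tau2_hat:.1f}, heritability = {h2_hat:.2f}")
```

A finer grid around the maximum, together with the likelihood values in `LOGLIK`, could also be used to read off a profile likelihood confidence interval.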

Subsequently, we computed the contribution of the HLA SE alleles to the total genetic variance of RA as well as that of ACPA-positive and ACPA-negative RA. This measure can be obtained by using a method similar to that used for computing heritability. However, instead of using the general population prevalence, the genotype-specific population prevalences were derived using Bayes' theorem (see Table 3 for specific values). The prevalences were subsequently entered into the above-described equation 2 to obtain the genotype-specific α ($\alpha_{SE+}$ and $\alpha_{SE-}$) in the regression model. The variance of the contribution of the SE to the outcome was then computed as $(\alpha_{SE+} - \alpha_{SE-})^2 \, P_{SE}(1 - P_{SE})$, in which $P_{SE}$ is the population prevalence of SE carriage. This value for the variance associated with SE alleles was then included in the logistic regression model along with the parameter for the genetic variance.

In the logistic regression model including the HLA SE alleles, the variance of the random effect (represented as ν²) measures the contribution of all remaining genetic factors besides the HLA SE. Thus, the part of the genetic variance explained by the SE is computed by dividing the variance of the contribution of the SE by the total genetic variance (τ²), as follows:

$$\frac{(\alpha_{SE+} - \alpha_{SE-})^2 \, P_{SE}(1 - P_{SE})}{\tau^2}$$
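The Bayes step and the variance formula can be illustrated with a short Python sketch, here using the Table 3 values for ACPA-positive RA. Several simplifications are assumptions of this sketch rather than part of the study: the SE prevalence in healthy controls is used for everyone without the disease subset, α is taken as the plain logit of each genotype-specific prevalence (the special case of equation 2 with no remaining random effect), and τ² is back-calculated from the reported heritability of 68%. Because of these shortcuts the result is not expected to reproduce Table 5. Function names are hypothetical.

```python
import math

def genotype_prevalences(p_disease, p_se_cases, p_se_controls):
    """Bayes' theorem: disease prevalence among SE carriers and noncarriers."""
    p_se = p_disease * p_se_cases + (1.0 - p_disease) * p_se_controls
    p_carriers = p_disease * p_se_cases / p_se
    p_noncarriers = p_disease * (1.0 - p_se_cases) / (1.0 - p_se)
    return p_se, p_carriers, p_noncarriers

def logit(p):
    return math.log(p / (1.0 - p))

def se_variance(alpha_pos, alpha_neg, p_se):
    """Variance of the SE contribution on the logistic scale."""
    return (alpha_pos - alpha_neg) ** 2 * p_se * (1.0 - p_se)

# Table 3: ACPA+ RA prevalence 0.67%, SE in ACPA+ RA 84%, SE in controls 44%.
p_se, p_carriers, p_noncarriers = genotype_prevalences(0.0067, 0.84, 0.44)

# Simplification: alpha = logit(prevalence), i.e. no remaining random effect.
var_se = se_variance(logit(p_carriers), logit(p_noncarriers), p_se)

# Total genetic variance implied by a heritability of 68% (Table 4).
h2 = 0.68
tau2 = h2 * (math.pi ** 2 / 3.0) / (1.0 - h2)

fraction = var_se / tau2
print(f"P(SE) = {p_se:.3f}, SE variance = {var_se:.3f}, fraction = {fraction:.3f}")
```

In the full model, α for carriers and noncarriers would instead be obtained from equation 2 with the variance of the remaining random effect, which makes the difference between the two values larger than the difference in plain logits.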

All computations were performed in the freely available software for statistical computing R (R core development team).

RESULTS

Concordance.

The study cohort consisted of 64 monozygotic and 84 dizygotic twin pairs, as described previously (22); the characteristics of the RA patients are listed in Table 1. The characteristics of the patients with RA were comparable between the monozygotic and dizygotic twin pairs with regard to demographic factors, autoantibody status, and the radiographic severity of RA.

Table 1. Characteristics of the study patients*

| Characteristic | Monozygotic twins | Dizygotic twins |
|---|---|---|
| Age, mean ± SD years | 41 ± 14 | 37 ± 13 |
| Female sex | 65 (88) | 71 (82) |
| IgM rheumatoid factor positive | 58 (78) | 74 (85) |
| Anti–CCP-2 positive | 54 (73) | 71 (82) |
| Erosive disease | 55 (74) | 69 (79) |

* For all dichotomous variables, values are the number (%) of patients with rheumatoid arthritis (RA) relative to the total number of RA patients for whom information about the characteristic under investigation was available. Among the 64 pairs of monozygotic twins, 74 individuals had RA; among the 84 pairs of dizygotic twins, 87 individuals had RA. Anti–CCP-2 = anti–cyclic citrullinated peptide 2.

The concordance of RA, which represents the frequency with which the twin brothers or sisters of RA patients were also affected, was calculated for the monozygotic and dizygotic twin pairs (Table 2). For monozygotic twins, the concordance of RA was 15.6%, compared with a concordance of 3.6% for dizygotic twins.

Based on the results of the anti–CCP-2 measurements, concordance was also determined separately for the ACPA-positive and ACPA-negative subsets of RA (Table 2). In these cohorts, twins who were concordant for RA all had the same ACPA status, meaning that there was no twin pair consisting of 2 affected individuals in which 1 individual had ACPA-positive disease and 1 individual had ACPA-negative disease. The concordance of ACPA-positive RA was comparable with the concordance of total RA, while the concordance of ACPA-negative RA was lower than that of RA overall and that of ACPA-positive disease.

Heritability.

Heritability values were estimated to quantify the extent to which genetic variation contributed to the development of disease, independent of the disease prevalence. For the initial logistic regression model (as described in Patients and Methods), the population prevalence of the disease or disease subsets was fixed to the specific population prevalences as reported in the UK (31, 32); if UK data were not available, we used the population prevalences obtained in studies from other Caucasian populations (11, 33, 34) (Table 3).

The overall heritability of RA was 66%, with a 95% CI of 44–75%. The same approach was used to model the outcome of ACPA-positive RA and ACPA-negative RA. As shown in Table 4, the heritability estimates for ACPA-positive and ACPA-negative RA were similar to the heritability of total RA. The heritability of ACPA-positive RA was 68% (95% CI 55–79%), and the heritability of ACPA-negative RA was 66% (95% CI 21–82%).

Table 4. Heritability in the study cohorts*

| Group | Twin pairs, no. | Heritability, % (95% CI) |
|---|---|---|
| Total RA | 148 | 66 (44–75) |
| ACPA+ RA | 114 | 68 (55–79) |
| ACPA− RA | 34 | 66 (21–82) |

* 95% CI = 95% confidence interval (see Table 2 for other definitions).

Contribution of HLA SE alleles.

The HLA SE alleles have recently been shown to be associated specifically with ACPA-positive RA (35). Therefore, we set out to investigate whether there is a difference in the extent to which the HLA SE alleles contribute to the genetic variance of ACPA-positive RA versus ACPA-negative RA.

HLA–DRB1 typing data were available for 142 twin pairs (96%). The HLA SE alleles were found to be preferentially associated with ACPA-positive RA. Specifically, 109 of 123 patients with ACPA-positive RA (89%) carried 1 or 2 SE alleles, compared with 15 of 32 patients with ACPA-negative RA (47%).

The contribution of the HLA SE to RA in general, as well as to ACPA-positive and ACPA-negative RA, was estimated by including this genetic risk factor in the logistic regression model. The prevalence of each genotype was fixed to the specific population prevalences as reported in the UK (31, 32), or if UK data were not available, we used the genotype-specific population prevalences obtained in studies from other Caucasian populations (11, 33, 34) (Table 3).

This analysis revealed that the contribution of the HLA SE alleles to the total genetic variance of RA was 11% (95% CI 10–12%) (Table 5). The HLA SE alleles contributed 18% (95% CI 16–19%) to the heritability of ACPA-positive disease, whereas the HLA SE alleles contributed only 2.4% (95% CI 1.6–10%) to the heritability of ACPA-negative RA.

Table 5. Contribution of the HLA shared epitope (SE) alleles to genetic variance in the study cohorts*

| Group | Twin pairs, no. | Contribution of HLA SE alleles, % (95% CI) |
|---|---|---|
| Total RA | 142 | 11 (10–12) |
| ACPA+ RA | 113 | 18 (16–19) |
| ACPA− RA | 29 | 2.4 (1.6–10) |

* 95% CI = 95% confidence interval (see Table 2 for other definitions).

DISCUSSION

The present study investigated the heritability of ACPA-positive and ACPA-negative RA. Data from a large cohort of RA twin pairs revealed heritability of RA of ∼66% for both disease subsets. Furthermore, the contribution of the predisposing HLA SE alleles to ACPA-positive and ACPA-negative RA was assessed. Although the HLA SE alleles contributed 18% to the genetic variance of ACPA-positive RA, they contributed only 2.4% to the genetic variance of ACPA-negative RA.

The heritability of ACPA-positive RA and that of ACPA-negative RA were similar, despite the fact that the concordance of these 2 disease subsets differed. Among both monozygotic twins and dizygotic twins, the concordance of ACPA-negative RA was lower than that of ACPA-positive RA. This can be explained by the fact that the concordance, which indicates how often a twin brother or sister of a patient is also affected, is influenced by the prevalence of the disease in the population, whereas heritability is not. Concordance therefore is a reflection of the prevalence of the disease in the population. Due to the fact that ACPA-negative RA is less prevalent than ACPA-positive RA, the concordance of ACPA-negative disease would be lower, despite similar heritability.
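The dependence of concordance on prevalence at a fixed heritability can be illustrated with a small Python sketch. It assumes the logistic random-effect model from Patients and Methods for monozygotic twins, an illustrative genetic variance of 6.4 (roughly a heritability of 66%), and the subset prevalences of 0.67% and 0.33% from Table 3. The quadrature, function names, and use of Python are assumptions of this sketch, not the study's R implementation.

```python
import math

def expit(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def normal_grid(n=161, width=8.0):
    """Equally spaced standard-normal nodes with weights that sum to 1."""
    step = 2.0 * width / (n - 1)
    nodes = [-width + i * step for i in range(n)]
    weights = [math.exp(-0.5 * z * z) for z in nodes]
    total = sum(weights)
    return nodes, [w / total for w in weights]

Z, W = normal_grid()

def solve_alpha(p, tau2, lo=-60.0, hi=60.0, iters=60):
    """Bisection on alpha so that the marginal prevalence equals p."""
    sd = math.sqrt(tau2)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        prevalence = sum(w * expit(mid + sd * z) for z, w in zip(Z, W))
        if prevalence < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mz_concordance(p, tau2):
    """Model-implied pairwise concordance in monozygotic twins."""
    alpha = solve_alpha(p, tau2)
    sd = math.sqrt(tau2)
    # Monozygotic twins share the whole random effect.
    risks = [expit(alpha + sd * z) for z in Z]
    p_both = sum(w * s * s for s, w in zip(risks, W))
    p_one = sum(w * 2.0 * s * (1.0 - s) for s, w in zip(risks, W))
    return p_both / (p_both + p_one)

TAU2 = 6.4  # illustrative; heritability = 6.4 / (6.4 + pi^2 / 3), about 0.66
conc_acpa_pos = mz_concordance(0.0067, TAU2)
conc_acpa_neg = mz_concordance(0.0033, TAU2)
print(f"ACPA+ RA: {100 * conc_acpa_pos:.1f}%  ACPA- RA: {100 * conc_acpa_neg:.1f}%")
```

Both calls use the same genetic variance, so any difference between the two printed concordances comes from the difference in prevalence alone.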

The fact that the prevalence of the disease affects the concordance but not the heritability also serves to explain the seeming discrepancy between the apparently modest concordance of 16% for RA in monozygotic twins and the heritability of 66%. This concordance of 16% needs to be interpreted in light of the prevalence of RA, which is only 1% (33): the monozygotic co-twins of RA patients are affected far more often than would be expected from the population prevalence. This indicates that genetic factors are important determinants of disease development, which is compatible with the notion of high heritability.

The estimate of 66% (95% CI 44–75%) for the overall heritability of RA is consistent with the results of a previous study, in which an overall heritability of 65% (95% CI 50–77%) was demonstrated (22). The previous study used the same twin cohort but a different modeling approach. Both models quantified the genetic and environmental contribution to a dichotomous variable by assuming that there is a continuous underlying liability to disease, and that a threshold of liability divides subjects into those affected and those unaffected with RA. However, the previous study used a normal distribution for the liability estimates, whereas we used a logistic distribution. The advantage of a logistic distribution is that this model corresponds to the logistic regression model that is most commonly used for binary outcomes in epidemiology.

In twin studies, the variance of continuous traits, such as the underlying liability to disease, is often divided into variance due to environmental effects and variance due to genetic effects. In this study, we estimated only additive genetic variances. Variance components for dominance genetic effects or for shared environmental effects can be added to this model; however, it must be kept in mind that in twins, these 2 effects are statistically confounded and cannot be added simultaneously to the same model. When a dominance genetic effect or a shared environmental effect was added to the model, we found that this determined <1% of the total variance of RA liability and <1% of the total variance for the ACPA-positive and ACPA-negative disease subsets. According to these data, dominance genetic effects or shared environmental factors do not appear to have a substantial effect on the development of RA.

The modeling approach utilized in this study was based on the population prevalences of the genetic risk factors of interest, to estimate their contribution to the genetic variance. To ensure the validity of our estimates, we therefore used values for prevalence of the HLA SE alleles from large studies of RA patients. Although the previously reported prevalence of the HLA SE alleles in ACPA-positive RA (84%) is consistent with the value observed in our twin study population (89%), the prevalence of the HLA SE alleles in the ACPA-negative twins in our study was found to be lower than has been observed in a large study of RA patients in the UK (47% and 59%, respectively). To assess the impact of this discrepancy, the modeling was also performed with other assumed population prevalences, all of which consistently revealed a minor contribution of the HLA SE alleles to ACPA-negative RA. The results provided by the current modeling approach therefore appear to be valid, irrespective of small uncertainties in prevalence values.

The HLA SE alleles are the genetic risk factors that, to our current knowledge, confer the highest risk for disease development (13). In the past, HLA genes have been estimated to contribute 37% to the overall inherited risk of developing RA (36). Our results showing that the contribution of the HLA SE alleles to the genetic variance was only 11% (95% CI 10–12%) for RA and 18% (95% CI 16–19%) for ACPA-positive RA may therefore be considered surprising. This difference may be due to the fact that the methods used to calculate the contribution of the HLA SE alleles may not be completely comparable. Assuming that the different studies measured similar parameters, a possible explanation for the difference could be that the contribution of the SE was overestimated in the past, which may be attributed to the use of selected literature data for the calculations and to the weaknesses of the method of Rotter and Landaw as described in the study by Deighton et al (36). On the other hand, we cannot exclude the possibility that our finding of 11% contribution of HLA SE alleles to RA may represent an underestimate of the true value, due to the fact that the genotype-specific population prevalences, which were required for our modeling, may vary among populations, as discussed above. However, it is likely that the most important explanation for the difference between our results and those previously reported is the fact that a substantial part of the contribution of the HLA region to the genetic variance could be due to protective genetic factors (37) in addition to the predisposing SE alleles. Whereas we deliberately considered only the predisposing SE alleles, Deighton et al used HLA haplotype sharing for their calculation (36) and thereby included the effect of protective HLA alleles, which may have resulted in a higher estimate. It is likely that further characterization of genetic risk factors, in particular the protective HLA–DRB1 alleles, and elucidation of interactions will be necessary to fully understand the role of genetic risk factors in disease development.

Recent genome-wide association studies and candidate gene approaches have identified several new genetic risk factors for RA (2, 3). The majority of these genetic risk factors, such as PTPN22 and TRAF1/C5, have been shown to be predominantly associated with ACPA-positive RA, just like the HLA SE alleles (35, 38). This may be due, in part, to the fact that some of the larger studies have included mainly ACPA-positive patients. One cannot exclude the possibility that newer studies that might incorporate a more extensive investigation of ACPA-negative RA will show that these risk factors also predispose individuals to ACPA-negative RA (17).

At this time, however, the number of risk factors described to be preferentially associated with ACPA-negative RA is smaller and, to our knowledge, restricted to the HLA–DR3 haplotype and IRF-5 (14, 18). The latter risk factors, however, do not confer as high a risk for disease development as the HLA SE alleles. The observation that the heritability of ACPA-negative RA is similar to the heritability of ACPA-positive RA is therefore intriguing. This suggests that most genetic risk factors for ACPA-negative RA are still unknown and remain to be discovered. One reason that the description of genetic risk factors for ACPA-negative RA has not advanced at the same pace as for ACPA-positive RA could be that ACPA-negative RA may be a more heterogeneous disease entity than ACPA-positive RA, with different phenotypes being associated with different genetic risk factors. Another explanation is that many of the cohorts that have been used for large-scale genetics studies have consisted solely of patients with ACPA-positive RA. Because ACPA-negative disease is less prevalent, it is also more challenging to find sufficient numbers of patients to perform a well-powered study.

For the current study, the limited number of ACPA-negative twins was a restriction in the statistical analysis. This resulted in a larger confidence interval for the heritability estimate of ACPA-negative RA as compared with that of ACPA-positive RA. Confirmation of these results would therefore be helpful to reach a definitive conclusion regarding the heritability of ACPA-negative RA.

Our findings thus show that ACPA-positive and ACPA-negative RA have a similar heritability of ∼66%. This means that genetic predisposition also plays an important role in the pathogenesis of ACPA-negative RA, for which most individual genetic risk factors remain to be identified.

AUTHOR CONTRIBUTIONS

Dr. van der Woude had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study design. Huizinga, Worthington, de Vries.

Acquisition of data. Van der Woude, Thomson, Worthington.

Analysis and interpretation of data. Van der Woude, Houwing-Duistermaat, Toes, Huizinga, van der Helm-van Mil, de Vries.

Manuscript preparation. Van der Woude, Houwing-Duistermaat, Toes, Huizinga, Thomson, Worthington, van der Helm-van Mil, de Vries.

Statistical analysis. Van der Woude, Houwing-Duistermaat, van der Helm-van Mil, de Vries.
