Genetic variants in the region of the C1q genes are associated with rheumatoid arthritis


Corresponding author: L. A. Trouw, Department of Rheumatology, C1-R, Leiden University Medical Center, PO Box 9600, 2300 RC Leiden, the Netherlands.



Rodent models for arthritis implicate a role for complement in disease development and progression. In humans, complement deposition has been observed in inflamed synovia of rheumatoid arthritis (RA) patients. In this study we analysed whether genetic variants of complement component C1q predispose to RA. We genotyped single nucleotide polymorphisms (SNPs) in and around the C1q genes, C1qA, C1qB and C1qC, in a Dutch set of 845 RA cases and 1046 controls. Replication was sought in a sample set from North America (868 cases/1193 controls), and a meta-analysis was performed in a combined sample set of 8000 cases and 23 262 controls of European descent. We determined C1q serum levels in relation to C1q genotypes. In the discovery phase, five of the 13 SNPs tested in the C1q genes showed a significant association with RA. Additional analysis of the genomic area around the C1q genes revealed that the strongest associating SNPs were confined to the C1q locus. Within the C1q locus we observed no additional signal independent of the strongest associating SNP, rs292001 [odds ratio (OR) = 0·72 (0·58–0·88), P = 0·0006]. The variants of this SNP were associated with different C1q serum levels in healthy controls (P = 0·006). Interestingly, this SNP was also associated significantly in the genome-wide association study (GWAS) of the North American Rheumatoid Arthritis Consortium, confirming the association with RA [OR = 0·83 (0·69–1·00), P = 0·043]. Combined analysis, including integrated data from six GWAS, provides support for the genetic association. Genetic variants in C1q are correlated with C1q levels and may be a risk factor for the development of RA.


The recognition molecule of the classical pathway (CP) of complement, C1q, is essential in the initiation of the CP following its binding to ligands such as immune complexes [1], apoptotic cells [2] and C-reactive protein [3]. Initially, complement was thought to be involved only in innate immunity against pathogens. However, over the last decades a wealth of insight has been generated, showing that complement is also involved in many other processes, such as coagulation, tissue regeneration, clearance of dead cells and regulation of the adaptive immune system [1].

C1q is part of the C1 complex, which consists of one C1q molecule, two C1r and two C1s serine protease proenzymes [4]. Conformational changes in the C1q molecule induced by binding to one of its ligands result in the release of these enzymes [4]. Next to its traditional ligands such as immunoglobulin (Ig)G and IgM, C1q can also bind to dead cells, DNA and matrix components [5-7]. Structurally the C1q molecule (460 kDa) is composed of 18 polypeptide chains (6A, 6B and 6C). The A, B and C chains each have a short N-terminal region, followed by a collagen region (CLR) and a C-terminal globular region (gC1q domain). Three such structural units form the hexameric C1q molecule, that has a tulip-like structure via strong non-covalent bonds in the fibril-like central portion [4]. In contrast to most other complement factors, C1q is not made by hepatocytes, but by macrophages and immature dendritic cells [8, 9]. Following their maturation, dendritic cells shut down C1q production completely [8, 9], which is suggestive of a role in adaptive immune responses [10]. Indeed, a role for C1q in adaptive immunity can also be concluded from in-vivo studies regarding antigen presentation [11] and cellular activation [12-16].

Complete genetic deficiency of C1q is associated strongly with the development of systemic lupus erythematosus (SLE) [17]. Similarly, several studies, although small, have implicated C1q in the emergence of SLE, as several genetic variants located in the C1q region seem to associate with this disease [18-22]. In addition, two small studies indicated an effect of genetic variants of C1q on the progression of cancer and the efficacy of rituximab treatment for lymphoma [23, 24].

These studies, together with the observation that complement deposits are found in the rheumatoid arthritis (RA) synovium [25] and the described correlation between disease activity and the presence of activated complement fragments bound to C1q in sera of RA patients [26], point to a possible involvement of C1q in RA pathogenesis.

For these reasons, we studied whether single nucleotide polymorphisms (SNPs) present in the C1q region associate with RA, as this could provide further evidence for a role of the complement system in RA pathogenesis.

Materials and methods


SNPs were genotyped in sets of controls and RA patients who met the American College of Rheumatology (ACR) 1987 revised criteria for RA. For the Leiden data set we analysed 845 RA patients who were recruited from hospitals in the western part of the Netherlands. As healthy controls, 1046 subjects were selected randomly by the Immunogenetics and Transplantation Immunology section of the Leiden University Medical Center. These patient and control sets, as well as the patient characteristics, have been described previously [27]. Within these cohorts we obtained a statistical power of 0·7 to detect differences at P < 0·0038 for SNPs with a minor allele frequency of 0·20.

Replication sample sets consisted of (1) the North American Rheumatoid Arthritis Consortium (NARAC) [28], which comprised 868 patients and 1193 controls; (2) 277 RA patients and 387 healthy controls from Crete, Greece, whose initial collection has been described previously [29]; (3) samples from the Genomics Collaborative, Inc. (GCI; Cambridge, MA, USA), comprising 475 rheumatoid factor (RF)-positive RA patients and 475 individually matched controls from the United States, which have been described in detail elsewhere [30]; and (4) data from six genome-wide association studies (GWAS) comprising 5539 patients and 20 169 controls [31]. A summary of the demographic details of the cohorts used is provided in Table 1. All patients and controls gave their informed consent to participate in the study, and the study was approved by the local ethics committees of the participating hospitals.

Table 1. Study subjects.

| Cohort | Cases/controls | Genetic ancestry | RA criteria, autoantibody | Use |
|---|---|---|---|---|
| Leiden EAC | 845/1046 | Caucasian/Dutch | ACR 1987, unselected | Discovery |
| NARAC-I | 868/1193 | Caucasian/USA | ACR 1987, CCP+ | Replication |
| Crete | 277/387 | Caucasian/Cretan | ACR 1987, unselected | Replication |
| GCI | 475/475 | Caucasian/USA | ACR 1987, RF+ | Replication |
| WTCCC | 1525/10 608 | Caucasian/UK | ACR 1987, CCP+/RF+ | Meta-GWAS |
| NARAC-I+III | 1769/5551 | Caucasian/USA | ACR 1987, CCP+ | Meta-GWAS |
| EIRA | 1173/1089 | Caucasian/Swedish | ACR 1987, CCP+ | Meta-GWAS |
| Canada | 589/1472 | Caucasian/Canada | ACR 1987, CCP+ | Meta-GWAS |
| BRASS | 483/1449 | Caucasian/USA | Rheumatologist, CCP+ | Meta-GWAS |

The number of individuals available for analyses is shown for each cohort, as well as the genetic ancestry, the criteria used to define rheumatoid arthritis (RA), the selection of patients on the basis of their autoantibody status and the use of the particular cohorts in this manuscript. Detailed information is given in references [31, 35]. The diagnosis of RA was made by a board-certified rheumatologist. GWAS: genome-wide association studies; BRASS: Brigham and Women's Rheumatoid Arthritis Sequential Study; NARAC: North American Rheumatoid Arthritis Consortium; WTCCC: Wellcome Trust Case Control Consortium; GCI: Genomics Collaborative, Inc.; Leiden EAC: Leiden early arthritis clinic; EIRA: Epidemiologic Investigation of Rheumatoid Arthritis; ACR: American College of Rheumatology; CCP: cyclic citrullinated peptide; RF: rheumatoid factor.

Genotyping methods

For the initial screening we selected tagging SNPs using Tagger [32] from HapMap Release II CEU data within a 54 kb region encompassing C1qA, C1qB and C1qC, with an R2 < 0·8, LOD score [logarithm (base 10) of odds] threshold = 3 and minor allele frequency (MAF) > 10% [33, 34]. Genotyping of SNPs in the Dutch and Greek sample sets was performed using MassArray matrix-assisted laser desorption/ionization time-of-flight mass spectrometry, according to the manufacturer's protocol (Sequenom, San Diego, CA, USA). At least 10% of the genotypes were assessed in duplicate, with an error rate of <1%.

The GWAS were run using different platforms, as described previously [31]. If the SNP of interest was not typed in a particular GWAS, then genotype imputation was performed using IMPUTE version 2 [35] with the proposed default parameters. Data were adjusted for population stratification, as described elsewhere [35].

Detection of circulating C1q levels

To analyse the effect of the different C1q genotypes on circulating levels of C1q we determined C1q levels in serum of 266 healthy controls by enzyme-linked immunosorbent assay (ELISA). We chose to use healthy controls to exclude potential C1q consumption due to disease activity in RA patients. The ELISA was performed essentially as described previously [8]. Briefly, 96-well MaxiSorp plates (Nunc, Roskilde, Denmark) were coated with a C1q-specific rabbit anti-human C1q antibody in coating buffer (100 mM Na2CO3/NaHCO3, pH 9·6) for 2 h at 37°C. A blocking step was performed using 3% bovine serum albumin (BSA) in phosphate-buffered saline (PBS) for 1 h at 37°C. Highly purified serum C1q was used as a standard. After adding samples and incubating for 1 h at 37°C, purified rabbit IgG anti-human C1q labelled with digoxigenin (DIG) (Boehringer Mannheim, Mannheim, Germany) was added for 1 h at 37°C, followed by horseradish peroxidase (HRP)-conjugated Fab anti-DIG (Boehringer Mannheim) for 1 h at 37°C; all these steps were performed in ELISA buffer (PBS, 1% BSA, 0·05% Tween 20). Each step was followed by three washes with PBS/0·05% Tween 20. Enzyme activity was assessed by the addition of 2,2′-azino-bis(3-ethylbenzothiazoline-6-sulphonic acid) (ABTS) (Sigma, St Louis, MO, USA) and H2O2. The absorbance at 415 nm was measured using a microplate biokinetics reader (EL312e; Bio-Tek Instruments, Winooski, VT, USA), as described previously [36].

Statistical analysis

Association of SNPs with RA was performed using a χ2 test with one degree of freedom or a logistic regression. Odds ratios (OR) and 95% confidence intervals (95% CI) were calculated using the Statcalc module of Epi Info Software (Centers for Disease Control and Prevention, Atlanta, GA, USA). P-values less than 0·05 were considered significant and genotype frequencies in controls did not deviate from Hardy–Weinberg equilibrium at a significance level of P < 0·05. Bonferroni correction for multiple testing was applied in the discovery phase.
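The association test described above can be illustrated with a minimal Python sketch. It is not the Epi Info Statcalc procedure itself: it assumes a Pearson χ2 test without continuity correction and a Woolf (logit) confidence interval, and the function name `recessive_association` is hypothetical. The example counts are approximations obtained by multiplying the rounded rs292001 genotype frequencies for the Leiden cohort in Table 3 by the group sizes, so the output only approximates the reported values.

```python
import math

def recessive_association(case_hom, case_n, ctrl_hom, ctrl_n):
    """Chi-square (1 df) test, odds ratio and Woolf 95% CI for a 2x2 table."""
    a, b = case_hom, case_n - case_hom   # cases: with / without the genotype
    c, d = ctrl_hom, ctrl_n - ctrl_hom   # controls: with / without the genotype
    n = a + b + c + d
    # Pearson chi-square without continuity correction (an assumption)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-square distribution with one degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, (lower, upper), chi2, p_value

# Illustrative counts only: 0.31 * 845 patients and 0.39 * 979 controls, rounded
odds_ratio, ci, chi2, p_value = recessive_association(262, 845, 382, 979)
print(f"OR = {odds_ratio:.2f} ({ci[0]:.2f}-{ci[1]:.2f}), chi2 = {chi2:.2f}, P = {p_value:.4f}")
```

Because the inputs are reconstructed from rounded frequencies, small differences from the published odds ratio and P-value are to be expected.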

Combined analysis of the genotypes of all studies was performed using a random-effects meta-analysis on the estimated effect sizes (log OR) and their standard error. The test for heterogeneity was not statistically significant (P = 0·1).
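As an illustration of pooling log OR values with their standard errors, the sketch below uses the DerSimonian-Laird estimator. This estimator is an assumption, as the text states only that a random-effects model was used; the helper names `se_from_ci` and `random_effects_meta` are hypothetical. Standard errors are recovered from the 95% confidence intervals listed for rs292001 in Table 3, with the Leiden interval taken as 0·58–0·88 as given in the abstract. Because the inputs are rounded, the result will not reproduce the published pooled estimate exactly.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(OR) recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

def random_effects_meta(odds_ratios, std_errors):
    """DerSimonian-Laird pooling of log odds ratios (assumed estimator)."""
    y = [math.log(o) for o in odds_ratios]
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(y) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at zero
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se_pooled), math.exp(pooled + 1.96 * se_pooled)),
        "tau2": tau2,
        "q": q,
    }

# (OR, lower, upper) for Leiden, NARAC, GCI and Crete, rounded as published
studies = [(0.72, 0.58, 0.88), (0.83, 0.69, 1.00), (0.93, 0.70, 1.20), (0.89, 0.64, 1.25)]
meta = random_effects_meta([s[0] for s in studies],
                           [se_from_ci(s[1], s[2]) for s in studies])
print(f"pooled OR = {meta['or']:.2f} ({meta['ci'][0]:.2f}-{meta['ci'][1]:.2f})")
```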

Conditional analysis

To distinguish independent effects in the region we performed an adjusted analysis for associations with RA. In order to adjust the effect of rs292001 for the effects of other SNPs in the region, we compared a two-locus model containing rs292001 and each of the other SNPs in turn with the model including rs292001 alone, using a likelihood ratio test. The likelihood ratio tests were based on the comparison of two generalized linear models. We chose a recessive model for rs292001 based on the data of the discovery cohort and an additive model for the other SNPs, as no prior information was available.
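The two ingredients of this conditional analysis, genotype coding and the likelihood ratio test for nested models, can be sketched as follows. This is a simplified illustration, not the authors' code: the function names are hypothetical, the log-likelihoods are assumed to come from logistic models fitted elsewhere, and the test assumes that the two-locus model adds exactly one parameter (the additive term for the second SNP).

```python
import math

def recessive_code(allele_copies):
    """Recessive coding: 1 only when two copies of the modelled allele are carried."""
    return 1 if allele_copies == 2 else 0

def additive_code(allele_copies):
    """Additive coding: the number of copies (0, 1 or 2) enters the model directly."""
    return allele_copies

def likelihood_ratio_test(loglik_reduced, loglik_full):
    """Compare nested models that differ by one parameter (chi-square, 1 df)."""
    statistic = max(0.0, 2.0 * (loglik_full - loglik_reduced))
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical log-likelihoods, chosen only to show the mechanics
statistic, p_value = likelihood_ratio_test(-1250.0, -1246.0)
print(f"LR statistic = {statistic:.2f}, P = {p_value:.4f}")
```

A small P-value here would indicate that the second SNP explains variation in RA status beyond what rs292001 explains alone.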

Sliding window-based haplotype analysis

In order to analyse which haplotypes would confer most risk, the SNPs used in this study were subjected to a sliding window-based haplotype analysis. Adjacent groups of SNPs of sizes 2, 3 and 4 were analysed using a generalized linear model. The R package haplo.stats was used with the simulation option to obtain empirical P-values. Rare haplotypes were grouped together using the default of minimally five expected haplotypes in the sample, as suggested in the package.
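The window construction itself is simple to illustrate. The Python sketch below only enumerates adjacent SNP groups of sizes 2, 3 and 4 along the physical order; it does not perform the haplotype estimation or the generalized linear model fit, which were done in haplo.stats. The function name and the placeholder SNP labels are hypothetical.

```python
def sliding_windows(ordered_snps, sizes=(2, 3, 4)):
    """Return, for each window size, all groups of physically adjacent SNPs."""
    windows = {}
    for size in sizes:
        windows[size] = [
            tuple(ordered_snps[start:start + size])
            for start in range(len(ordered_snps) - size + 1)
        ]
    return windows

# Placeholder labels standing in for SNPs sorted by chromosomal position
snps = ["snp1", "snp2", "snp3", "snp4", "snp5"]
windows = sliding_windows(snps)
for size, groups in windows.items():
    print(size, groups)
```

With m ordered SNPs there are m - k + 1 windows of size k, so the number of tests grows with the number of window sizes examined.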

Global permuted P-values

Global P-values for genetic association with RA were obtained by applying the tail-strength statistic to P-values from individual SNP associations and permuting phenotype status 5 × 10³ times in order to obtain empirical P-values [37]. Permutation was necessary, as P-values were not independent due to linkage disequilibrium between SNPs.
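A minimal sketch of this procedure is given below. It uses the tail-strength definition TS = (1/m) Σ [1 − p(k)(m + 1)/k] over the ordered P-values p(1) ≤ ... ≤ p(m), and assumes, for illustration only, that each SNP is tested with a 2 × 2 χ2 test on a binary genotype indicator. All function names are hypothetical and the data layout (one list of 0/1 flags per SNP) is a simplification.

```python
import math
import random

def tail_strength(p_values):
    """Tail strength of a set of P-values; close to 0 under the global null."""
    m = len(p_values)
    ordered = sorted(p_values)
    return sum(1.0 - p * (m + 1) / k for k, p in enumerate(ordered, start=1)) / m

def chi2_p(flags, status):
    """P-value of a 2x2 chi-square test (1 df) for one SNP."""
    a = sum(1 for g, s in zip(flags, status) if g and s)
    b = sum(1 for g, s in zip(flags, status) if not g and s)
    c = sum(1 for g, s in zip(flags, status) if g and not s)
    d = sum(1 for g, s in zip(flags, status) if not g and not s)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:  # a margin is empty, so no test is possible
        return 1.0
    chi2 = (a + b + c + d) * (a * d - b * c) ** 2 / denom
    return math.erfc(math.sqrt(chi2 / 2.0))

def permuted_global_p(genotype_flags, status, n_perm=5000, seed=0):
    """Empirical global P-value obtained by permuting case/control status."""
    rng = random.Random(seed)
    observed = tail_strength([chi2_p(f, status) for f in genotype_flags])
    shuffled = list(status)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        permuted = tail_strength([chi2_p(f, shuffled) for f in genotype_flags])
        if permuted >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Permuting the phenotype labels while keeping each individual's genotypes together preserves the linkage disequilibrium between SNPs, which is why the empirical distribution remains valid when the individual P-values are correlated.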

Correction for population stratification

Effect sizes and P-values for the NARAC data set were corrected by a stringent inflation factor of 1·3 prior to performing a meta-analysis or computing the tail-strength measure used for global P-values. No other data set showed a significant inflation factor, and therefore none required inflation correction.
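One common way to apply such an inflation factor is the genomic control approach, sketched below under the assumption that the correction was applied in this standard form: each 1-df χ2 statistic is divided by the inflation factor λ, which is equivalent to multiplying the standard error of the log OR by √λ. The function names are hypothetical.

```python
import math

def gc_correct_chi2(chi2, inflation=1.3):
    """Divide a 1-df chi-square statistic by lambda and recompute its P-value."""
    adjusted = chi2 / inflation
    return adjusted, math.erfc(math.sqrt(adjusted / 2.0))

def gc_correct_se(std_error, inflation=1.3):
    """Equivalent correction on the effect-size scale: widen the standard error."""
    return std_error * math.sqrt(inflation)

# Hypothetical uncorrected statistic, for illustration only
raw_chi2 = 6.0
raw_p = math.erfc(math.sqrt(raw_chi2 / 2.0))
adj_chi2, adj_p = gc_correct_chi2(raw_chi2)
print(f"raw P = {raw_p:.4f}, corrected P = {adj_p:.4f}")
```

The point estimate of the odds ratio is unchanged by this correction; only its uncertainty, and hence the P-value, becomes more conservative.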


Genetic variants in the C1q genes predispose to RA

We first analysed whether genetic variants in the genes encoding C1q (C1qA, C1qB and C1qC) are a risk for the development of RA in the Dutch population. For this purpose we genotyped a set of 13 tagging SNPs in a discovery cohort of 845 RA patients and 1046 healthy controls from the Leiden area, the Netherlands. SNPs were selected from a 54 kb haplotype block on chromosome 1, with an R2 < 0·8, LOD score threshold = 3 and MAF > 10%. We captured 94% of the variation in this region. In this discovery set we observed a statistically significant association for five SNPs (Table 2). Four of these five SNPs remained significant after Bonferroni correction for multiple testing (P < 0·0038). We observed the strongest association when analysing the data using a recessive model. We also analysed these data for global significance using a permutation procedure [37]; this analysis revealed a global significance for C1q in the Leiden data set with a P = 0·003.
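The Bonferroni threshold quoted above follows directly from the number of SNPs tested, as this short sketch shows. The variable and function names are hypothetical; the first example P-value is the one reported for rs292001, and the second is an arbitrary value chosen to show a nominally significant result that does not survive correction.

```python
ALPHA = 0.05
N_TESTS = 13  # tagging SNPs genotyped in the discovery cohort

# Bonferroni: divide the nominal significance level by the number of tests
threshold = ALPHA / N_TESTS

def passes_bonferroni(p_value, cutoff=threshold):
    """True when a P-value remains significant after Bonferroni correction."""
    return p_value < cutoff

print(f"threshold = {threshold:.5f}")
print(passes_bonferroni(0.0006))  # reported P-value for rs292001
print(passes_bonferroni(0.02))    # arbitrary nominally significant value
```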

Table 2. Genotypes of 13 single nucleotide polymorphisms (SNPs) across the C1q genes reveal association with rheumatoid arthritis (RA) in the Dutch population.
Column headings: n | 11 | 12 | 22 | n | 11 | 12 | 22 | OR (95% CI) | P

A total of 13 SNPs across the C1q genes were genotyped in 845 RA patients and 1046 Dutch controls from the Leiden area. Shown are genotype frequencies, odds ratios (OR), confidence intervals (CI) and P-values using the recessive model; 11 represents homozygosity for the minor allele, 22 represents homozygosity for the major allele and 12 represents the heterozygous state, with the minor allele defined by the allele frequencies in the controls. The P-values of significantly associating SNPs are shown in bold.

Next, we investigated to what extent the signal we observed was limited to the C1q genes, and therefore we genotyped an additional 40 SNPs covering a region of 400 kb, including the genes EPHA8, EPHB2 and the C1q genes. In this additional analysis we identified two additional SNPs in the C1q genes and four SNPs in the EPHA8 region that are associated significantly with RA (Fig. 1). We used conditional analysis to study whether the observed signals represent one signal or may represent two independent signals. Effect sizes (ORs) for rs292001 (0·72 when analysed alone) varied between 0·77 and 0·66 when effects of other SNPs on RA were taken into account by analysing two-locus models, where one SNP was always rs292001. Four SNPs from EPHA8 were significant, most notably rs606002, with a P-value of 0·005. Models including EPHA8 SNPs together with rs292001 could explain the data set significantly more effectively than the model including rs292001 alone, indicating an independent effect of EPHA8, which is located relatively far away from rs292001 in a region with low linkage disequilibrium. This indicates that the observed effect in the C1q genes is not mediated via other linked genes, but represents a true effect of the C1q genes themselves.

Figure 1.

Association plots across the genomic area containing EPHA8, C1qA, C1qC, C1qB and EPHB2. We calculated D′ and R2 between all the single nucleotide polymorphisms (SNPs) analysed in this study in the Dutch cohort. Solid horizontal lines indicate the location of the three C1q genes as well as EPHA8 and EPHB2. The top part of the figure depicts the P-value of association with rheumatoid arthritis (RA) (circles) and with C1q serum levels in healthy controls (squares) for each SNP as -log10 P-values. Horizontal lines indicate the P-values of 0·05, 0·01 and 0·001. The P-values shown in black are smaller than 0·001; other values are shown in grey.

We also performed sliding window-based haplotype analysis and observed, for a window size of two, nominal P-values < 0·05 for five SNP windows around rs209749 and rs521570 in EPHA8 and 10 SNP windows around rs292001. For a window size of three a similar pattern is observed, and a window size of four leads to significant associations only around rs292001. The strongest associations were observed for windows starting at rs294179, rs6690827 and rs12404537 for window sizes two, three and four, respectively. All these SNPs are, at most, five SNPs away from rs292001 based on physical ordering of SNPs. Collectively, these data suggest that the main signal is driven by the C1q genes, with a small and independent contribution from EPHA8. Within the C1q region we did not observe other signals independent from rs292001 and therefore we focused on this SNP in further analysis.

Genetic variants in C1q associate with C1q serum levels

We next analysed specifically the impact of the genetic variants of rs292001 on the protein levels of C1q. In order to exclude possible effects of disease activity and treatment on C1q production and/or consumption we have analysed the circulating levels of C1q in sera from 266 healthy controls. This analysis revealed a significant correlation between the presence of the ‘protective’ rs292001 G-genotype and higher C1q levels in serum (Fig. 2). These data indicate that genetic variants in the C1q locus have an impact upon circulating levels of C1q, providing a possible explanation as to how genetic variants present in the C1q-region may contribute to RA.

Figure 2.

Genetic variants in the C1q genes affect circulating protein levels of C1q. Circulating levels of C1q were analysed by enzyme-linked immunosorbent assay (ELISA) in 266 healthy, genotyped controls. Data are plotted in relation to their genotype for rs292001 as μg/ml.

Replication of the C1q association

The finding that genetic variants of C1q are associated with C1q protein and mRNA levels provided further support for a contribution of C1q to RA. We next confirmed this genetic association by independent replication. Similar to our first analysis of the Leiden data set, we analysed the 13 SNPs in the NARAC data set for global significance [37]. In this replication data set we observed a global significance (P = 0·036), confirming that in the NARAC data set genetic variants in the C1q genes are also associated with RA. Similarly, an OR of 0·83 (95% CI: 0·69–1·00, P = 0·043) was observed for rs292001.

To further confirm our findings, rs292001 was typed in two additional cohorts, one from Crete and one from North America (GCI). The data from these two cohorts, consisting of 269 patients/369 controls and 469 patients/464 controls, respectively, revealed an effect in the same direction as observed in the Leiden and NARAC data sets, although not significant on their own (Table 3). A meta-analysis on these non-imputed data sets confirmed an association between genetic variation at rs292001 and susceptibility to RA (P = 0·0001, OR = 0·80; 95% CI: 0·72–0·89).

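Table 3. Analysis of association with rheumatoid arthritis (RA) in four cohorts of non-imputed genotypes of C1q single nucleotide polymorphism (SNP) rs292001.

| Cohort | Population | Patients 11 | Patients 12 | Patients 22 | Patients n | Controls 11 | Controls 12 | Controls 22 | Controls n | OR (95% CI) | P |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Leiden | Dutch | 0·17 | 0·52 | 0·31 | 845 | 0·16 | 0·45 | 0·39 | 979 | 0·72 (0·58–0·88) | 0·0006 |
| NARAC I | USA | 0·17 | 0·46 | 0·37 | 868 | 0·13 | 0·46 | 0·41 | 1193 | 0·83 (0·69–1·00) | 0·043 |
| GCI | USA | 0·16 | 0·47 | 0·37 | 469 | 0·15 | 0·46 | 0·39 | 464 | 0·93 (0·70–1·20) | 0·548 |
| Crete | Greek | 0·11 | 0·51 | 0·38 | 269 | 0·14 | 0·46 | 0·41 | 369 | 0·89 (0·64–1·25) | 0·502 |
| Meta | | | | | | | | | | 0·80 (0·72–0·89) | 0·0001 |

Genotype frequencies and numbers of individuals (n) of the non-imputed data for the cohorts from Leiden, North American Rheumatoid Arthritis Consortium (NARAC), Genomics Collaborative, Inc. (GCI) and Crete are depicted, including odds ratios (OR), confidence intervals (CI) and P-values. A meta-analysis was also performed on these non-imputed data sets.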

Supportive evidence from GWAS data

As the GWAS data for the RA studies are now publicly available, we also sought further replication of our findings for rs292001. Unfortunately, most GWAS data sets did not include this SNP and were generated using a variety of genotyping platforms. Therefore, to obtain an impression of the effect of rs292001 genotypes, imputation was performed. The quality of imputation was different for each GWAS, with imputation scores ranging from 0·76 to 0·98 (Table 4; mean maximum posterior probability). All Z-scores, except for the Brigham and Women's Rheumatoid Arthritis Sequential Study (BRASS), revealed an effect in the same direction as observed in the Leiden data set. The data from these six GWAS, combined with the other data sets studied, now excluding NARAC from the non-imputed cohorts, revealed a small but significant (P = 0·025) contribution to the susceptibility to RA (OR = 0·85; 95% CI: 0·73–0·98) when tested using a recessive model, as also used for the individual analyses (Fig. 3).

Figure 3.

Meta-analysis of all available data for C1q single nucleotide polymorphism (SNP) rs292001. Meta-analysis of the genotyped data from Crete, the Genomics Collaborative, Inc. (GCI), Leiden and the combined six genome-wide association studies (GWAS) using a random-effects model. Significance of the meta-analysis was P = 0·025.

Table 4. Meta-analysis of the six genome-wide association studies (GWAS) for rs292001.

SNP rs292001

Imputation score (per GWAS): 0·950, g-1·000, 0·766, 0·904, 0·901, 0·976

| | GG | GA | AA |
|---|---|---|---|
| Cases | 1961·8 | 2683·8 | 890·2 |
| Controls | 7489·6 | 9605·1 | 3066·6 |

Meta OR: 0·95 (0·91–1·00); Meta z: 1·908; Meta two-tailed P: 0·0563

The data that are available for rs292001 from the six GWAS are the composite of six individual GWAS, some of which used imputation to obtain data for this single nucleotide polymorphism (SNP); data on imputation score and Z-score for each study are given as well as the overall genotype counts for cases and controls. The imputation score provides information on the quality of imputation and the Z-score is indicative of the effect size, with a score > 0 indicating a positive and a score < 0 indicating a negative effect. BRASS: Brigham and Women's Rheumatoid Arthritis Sequential Study; EIRA: Epidemiologic Investigation of Rheumatoid Arthritis; NARAC: North American Rheumatoid Arthritis Consortium; WTCCC: Wellcome Trust Case Control Consortium.
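The relation between the meta-analysis Z-score and the two-tailed P-value in Table 4 can be checked with a short calculation. The sketch below assumes a standard normal reference distribution; the function name is hypothetical.

```python
import math

def two_tailed_p(z_score):
    """Two-tailed P-value of a Z-score under the standard normal distribution."""
    return math.erfc(abs(z_score) / math.sqrt(2.0))

meta_z = 1.908  # meta-analysis Z-score reported in Table 4
meta_p = two_tailed_p(meta_z)
print(f"two-tailed P = {meta_p:.4f}")
```

The result agrees with the tabulated two-tailed P-value to within rounding of the Z-score.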


In the current study, we have obtained evidence that genetic variants of C1q associate with the susceptibility for RA. The observation that several of these genetic variants also associate with circulating C1q protein levels suggests that differences in C1q quantity are involved in the onset of RA.

C1q, the initiation molecule of the classical pathway of complement, can trigger complement activation following binding to its ligands, such as immune complexes, matrix molecules and apoptotic cells [1, 6, 38], all conceivable targets in the context of RA. Immune complexes, formed, for example, by anti-citrullinated protein antibodies (ACPA), can trigger the classical pathway [39]. However, this would occur after the induction of such antibodies [40, 41], which would put genetic variants of C1q at the effector-phase of RA rather than at the onset of RA. We have divided the patient group on the basis of their ACPA status and found the effect of C1q genetic variants in both strata, although the P-values were stronger in the larger ACPA-positive group (data not shown). However, next to its role in activation of the complement system, C1q is thought to have a direct effect on adaptive immunity and autoimmunity [11-15] and as recently suggested on Wnt signalling [42].

The observed relationship between SNPs in the C1q genes and C1q serum levels might point to a role for the cells that produce C1q, predominantly macrophages and immature dendritic cells in the onset of RA [8, 9]. These two cell types are well known to be instrumental in shaping the adaptive immune response [43]. It is therefore conceivable that the basal expression or the induction of enhanced C1q production by these cells would modulate differentially the immune response to foreign antigens and possibly also against self-antigens. Dedicated experiments will need to show how C1q, either intracellularly or excreted by such cells, impacts upon the adaptive immune response.

None of the SNPs tested in this study or in high LD with rs292001 (, pilot 1 data, R2 > 0·8) have an amino acid-changing effect. Therefore, it seems likely that the associating SNPs, or SNPs in LD, have an impact upon the basal expression of C1q or on its induction following specific triggers. A genetic locus was also identified in the mouse that had an impact upon the C1q expression, secretion and autoimmune organ damage [44]. We observed that the rs292001 G allele, which provides protection against development of RA, associates with a higher C1q serum concentration. This is in line with earlier observations in SLE that low C1q levels predispose to autoimmunity [44, 45]. Although we observed an association of genetic variants of C1q with circulating levels of C1q, we did not observe an effect on the activity of the CP (data not shown), which is in line with the observation that not C1q, but rather C2, is the rate-limiting factor of the CP [46].

Although complete genetic deficiency of C1q is associated with development of SLE [45], it is not yet clear to what extent smaller genetic differences such as represented by SNPs are also associated with development of autoimmunity [18-21]. In the case of complete genetic C1q deficiency, it is thought that the absence of circulating C1q may have a direct effect on autoimmunity because of defective clearance of apoptotic cells [47]. However, many other processes may (in concert) also play a role, e.g. effects on cytokine production [48] or Wnt signalling [42].

The observed association between genetic variants of C1q and RA does not achieve stringent genome-wide significance. Nevertheless, we believe these data are important, as we confirmed our observations in the NARAC data set and observed a similar trend in other smaller cohorts, as well as in the GWAS. We also believe that the association between genetic variants of C1q and C1q protein levels provides additional support for a true association between genetic variants of C1q and RA. It is possible that deep sequencing of these genes would provide additional insight and may reveal the truly causal, functional variant.


Collectively, our data show that genetic variants in the region of the C1q genes are associated with the susceptibility for RA, and that this may potentially be explained by an effect of these genetic variants on basal or induced C1q production.


We thank Dennis Kremer from the LGTC unit of the LUMC for excellent technical support with the Sequenom analysis. The authors wish to acknowledge the support of the European Union (Sixth Framework Programme integrated project Autocure, Seventh Framework Programme integrated project Masterswitch and IMI JU-funded project BeTheCure, contract no. 115142-2). This study was also supported by national funding from the Netherlands Genomics Initiative (NGI) as part of the Netherlands Proteomics Center (NPC) and the Center for Medical Systems Biology (CMSB). L.T. was supported financially by a VIDI grant from NWO-Zon-MW. R.T. was supported financially by a VICI grant from NWO-Zon-MW. F.K. was supported by the European Community's FP7 Marie Curie International Outgoing Fellowship. L.F. received a Horizon Breakthrough grant from the Netherlands Genomics Initiative (93519031) and a VENI grant from NWO (ZonMW grant 916·10·135). A.Z. received a Rubicon grant from NWO (825·10·002).