Abstract

Objective

Several copy number variations (CNVs) have been found to be associated with systemic lupus erythematosus (SLE) through the target gene approach. However, genome-wide features of CNVs and their role in the risk of SLE remain unknown. The aim of this study was to identify SLE-associated CNVs in Korean women.

Methods

Genome-wide assessments of CNVs were performed in 382 SLE patients and 191 control subjects, using an Illumina HumanHap610 BeadChip genotyping platform. SLE-associated CNV regions that were identified by genome-wide association study (GWAS) were replicated in quantitative polymerase chain reaction (PCR) and deletion-typing PCR analyses in an independent sample set comprising 564 SLE patients and 511 control subjects.

Results

Of 144 common CNV regions, 3 deletion-type CNV regions in 1q25.1, 8q23.3, and 10q21.3 were found to be significantly associated with SLE by GWAS analysis. In the independent replication, the CNV regions in 1q25.1 (RABGAP1L) and 10q21.3 were successfully replicated (odds ratio [OR] 1.30, P = 0.038 and OR 1.90, P = 3.6 × 10−5, respectively), and the associations were confirmed again by deletion-typing PCR. The CNV region in the C4 gene, which showed a potential association in the discovery stage, was included in the replication analysis and was found to be significantly associated with the risk of SLE (OR 1.88, P = 0.01). Through deletion-typing PCR, the exact sizes and breakpoint sequences of the deletions were defined. Individuals with the deletions in all 3 loci (RABGAP1L, 10q21.3, and C4) had a much higher risk of SLE than did those without any deletions in the 3 loci (OR 5.52, P = 3.9 × 10−4).

Conclusion

These CNV regions can be useful to identify the pathogenic mechanisms of SLE, and might be used to more accurately predict the risk of SLE by taking into consideration their synergistic effects on disease susceptibility.

Systemic lupus erythematosus (SLE) is a chronic autoimmune disease that affects multiple organs, primarily in young women. In SLE, abnormal immune responses stimulate the production of pathogenic autoantibodies and immune complex deposits in various organs, resulting in inflammation and subsequent organ damage (1). Although the mechanism of development of SLE is still unclear, genetic components are considered to be important contributors to SLE (1). A genetic contribution to SLE is supported by the higher concordance rate of SLE in monozygotic twins than in dizygotic twins and the high heritability of this disease (2).

Recent genome-wide association studies (GWAS) have identified a number of single-nucleotide polymorphisms (SNPs) associated with the risk of SLE, in genes such as HLA, IRF5, STAT4, TNFAIP3, PTPN22, BLK, BANK1, TNFSF4, ITGAM, KIAA1542, and PXK, in various ethnic groups (2). In addition to SNPs, DNA copy number variations (CNVs) can influence the interindividual differences in the risk of disease via several mechanisms that affect gene expression, such as gene disruption and rearrangements (3–5). For example, CNVs in C4 and FCGR3B were reported to be associated with the risk of SLE in Europeans (6, 7). CNVs in TLR7, CCL3L1, and Fc receptors were also reported to be associated with SLE (8), but very few of these findings have been confirmed in different populations. All of the SLE-associated CNVs mentioned above were identified by candidate gene analysis, but GWAS can be a powerful tool to explore SLE-associated CNVs that remain to be uncovered.

In this study, we aimed to identify SLE-associated CNVs in a Korean population by analyzing the genome-wide CNV profiles of 400 Korean women with SLE and 200 healthy Korean women, using a SNP array. For this purpose, we performed a 2-step approach for GWAS discovery of CNVs, followed by independent replication of the candidate CNVs identified in the first stage in a larger case–control set. Through this strategy, we identified novel CNV loci associated with the risk of SLE in the Korean population.

PATIENTS AND METHODS

Study subjects.

All participants were of Korean ancestry. For the GWAS analysis of CNVs, 400 SLE patients (all women, mean ± SD age 31 ± 10.0 years) were recruited from the BAE lupus cohort of Hanyang University Hospital for Rheumatic Diseases (Seoul, Republic of Korea). These patients had been diagnosed as having SLE according to the American College of Rheumatology revised criteria for SLE (9) as updated in 1997 (10). As a control group, 200 individuals apparently free of SLE (all women, mean ± SD age 33 ± 9.2 years) were recruited from the same hospital. For independent replication, we recruited an additional set of 564 SLE patients and 511 SLE-free control subjects (n = 1,075) from the same hospital cohort. This study was performed with the approval of our Institutional Review Board.

Whole-genome SNP genotyping.

Genome-wide SNP genotyping was conducted using a HumanHap610a-Quad BeadChip platform, which contains 620,901 SNP markers (Illumina). The average number of probes per known CNV is 37.7. Approximately 750 ng of genomic DNA per sample was used for genotyping. We used high-quality samples for reliable CNV calling (mean ± SD SNP call rate 99.89 ± 0.16%). To minimize plate or batch effects of genotyping, we conducted all of the procedures under the same conditions around the same time period.

Quality control and identification of CNVs.

The raw data consisted of signal intensity (log R ratio [LRR]) and allelic intensity (B allele frequency), both of which were obtained using GenomeStudio software (Illumina). Based on the LRR and B allele frequency values, we called CNVs using the PennCNV algorithm under the default setting (11). To ensure the quality of the SNP genotyping data, samples were excluded from CNV calling and subsequent analyses if the successful SNP call rate was <98%, if the standard deviation of the LRR was >0.24, or if the B allele frequency drift was >0.01. Boundaries of each CNV were determined as the distance from the linear location of the first SNP probe to that of the last probe (human reference sequence genome build 18; http://genome.ucsc.edu/cgi-bin/hgGateway?db=hg18). Also, samples containing an extremely small or large number of CNVs were excluded as outliers; a sample was retained only if first quartile − (1.5 × IQR) < N < third quartile + (1.5 × IQR), where N is the number of CNVs detected in the sample and IQR is the interquartile range (third quartile − first quartile) calculated from the set of detected CNV calls from the 600 study subjects. As a result, a total of 573 samples (382 SLE patients and 191 control subjects) were included in the analyses.
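The sample-level filters described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the use of the "inclusive" quartile method, and the example CNV counts are all assumptions, since the text does not state how quartiles were computed.

```python
from statistics import quantiles

def passes_array_qc(call_rate, lrr_sd, baf_drift):
    """Apply the three array-quality thresholds quoted in the text.

    call_rate is a fraction (0.98 = 98%); samples failing any threshold
    are excluded before CNV calling.
    """
    return call_rate >= 0.98 and lrr_sd <= 0.24 and baf_drift <= 0.01

def tukey_fences(cnv_counts):
    """Return (lower, upper) fences: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    # Assumption: quartiles use the "inclusive" method; the paper does not say.
    q1, _, q3 = quantiles(cnv_counts, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def keep_samples(cnv_counts_by_sample):
    """Keep samples whose CNV count N lies strictly inside the fences."""
    low, high = tukey_fences(list(cnv_counts_by_sample.values()))
    return {s for s, n in cnv_counts_by_sample.items() if low < n < high}

# Hypothetical per-sample CNV counts, for illustration only.
example_counts = {"s1": 28, "s2": 30, "s3": 31, "s4": 33, "s5": 285}
retained = keep_samples(example_counts)  # s5 falls outside the upper fence
```

In this toy input the sample with 285 calls is dropped, which mirrors how a sample with an extreme number of CNVs would be treated.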

Defining CNV regions and statistical analyses.

We generated CNV data for each probe and designated each CNV, based on its copy number status, as either diploid (2X), loss (homozygous deletion = 0X; hemizygous deletion = 1X), or gain (≥3X). After CNV calling, we defined CNV regions in the 573 samples included in the association analysis as the union of overlapping CNVs, in accordance with the principle suggested by Redon et al (12). This was determined using CNVRuler software (13). This method is straightforward but can overestimate the size if any of the CNVs constituting the CNV region are extremely large. To avoid potential overestimation, regions of very low density of overlapping CNVs (<10% of the total CNVs constituting each CNV region) were not merged into the CNV region.
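A simplified sketch of the region-building idea is given below. It is not the CNVRuler implementation: the function names are hypothetical, CNVs are assumed to be (start, end) pairs on a single chromosome with start < end, and the 10% rule is interpreted here as trimming the flanks of a region where fewer than 10% of its member CNVs overlap.

```python
def merge_overlapping(cnvs):
    """Group overlapping (start, end) CNVs; each group is one candidate region."""
    groups = []
    current_end = None
    for start, end in sorted(cnvs):
        if groups and start <= current_end:
            groups[-1].append((start, end))
            current_end = max(current_end, end)
        else:
            groups.append([(start, end)])
            current_end = end
    return groups

def trim_low_density(members, min_fraction=0.1):
    """Trim region flanks covered by fewer than min_fraction of member CNVs."""
    points = sorted({p for s, e in members for p in (s, e)})
    total = len(members)
    dense = []
    for left, right in zip(points, points[1:]):
        # Number of member CNVs that span this whole segment.
        depth = sum(1 for s, e in members if s <= left and e >= right)
        if depth / total >= min_fraction:
            dense.append((left, right))
    if not dense:
        # No segment reaches the threshold: fall back to the plain union.
        return points[0], points[-1]
    return dense[0][0], dense[-1][1]

def build_cnv_regions(cnvs, min_fraction=0.1):
    """Union of overlapping CNVs, with sparse flanks removed."""
    return [trim_low_density(g, min_fraction) for g in merge_overlapping(cnvs)]

# Ten CNVs at 100-200 plus one unusually long CNV (150-1000): the long tail
# is covered by only 1 of 11 CNVs (<10%), so it is not merged into the region.
example = [(100, 200)] * 10 + [(150, 1000)]
regions = build_cnv_regions(example)
```

The example shows why the density rule matters: a plain union would report a region of 100 to 1000, while the trimmed region stays at 100 to 200.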

After constructing the CNV regions, logistic regression analysis was conducted to adjust for the confounding effect of population stratification and batches of SNP array experiments. The false discovery rate (FDR) method was used for multiple-comparison correction. CNV regions showing differences at a level of P < 0.05 and an FDR <0.1 were considered to be potentially significant, and these regions were included in further independent replication experiments to validate the CNVs.
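The text does not name the FDR procedure. Assuming the Benjamini-Hochberg step-up method, the adjustment can be sketched as below; the function name is illustrative. As a check, the example input uses the three qPCR replication P values reported in Table 1.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# qPCR replication P values for 1q25.1, 8q23.3, and 10q21.3 (Table 1).
replication_p = [0.038, 0.230, 3.6e-5]
replication_fdr = benjamini_hochberg(replication_p)
```

Under this assumption the adjusted values come out at about 0.057, 0.230, and 1.1 × 10−4, which agrees with the FDR column reported for the qPCR replication stage.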

Validation and replication of CNV regions associated with SLE by genomic quantitative polymerase chain reaction (qPCR).

We designed specific amplification primer sets for genomic qPCR (see Supplementary Data 1, available on the Arthritis & Rheumatism web site at http://onlinelibrary.wiley.com/doi/10.1002/art.37854/abstract) to validate and replicate the candidate CNV regions. Genomic qPCR was performed using the Viia 7 system (Life Technologies) as described elsewhere (14). For qPCR analysis of the C4 gene (C4A and C4B), copy number status was determined by TaqMan-based genomic qPCR using 2 TaqMan assays, Hs07226349_cn and Hs07226350_cn (Life Technologies), which were specifically designed for C4A and C4B (15). For this analysis, 10 μl of reaction mixture, containing 10 ng of genomic DNA, TaqMan Universal PCR Master Mix II (Life Technologies), C4A or C4B TaqMan probe, and RNaseP TaqMan probe, was used. Thermal cycling conditions consisted of 1 cycle of 10 minutes at 95°C, followed by 40 cycles of 15 seconds at 95°C and 1 minute at 60°C.

For qPCR analysis of 1q25.1 and 10q21.3, we set the median value of qPCR ratios in 511 normal control subjects as diploid (2 copies), and all measured qPCR ratio values were adjusted relative to 2 copies. For estimating the C4 copy number, we used HeLa cell DNA as the diploid control, as described elsewhere (15). The copy number of each target was defined as 2^(−ΔΔCt), where ΔCt is the difference in threshold cycles for the sample in question normalized against the reference gene (RNaseP), and ΔΔCt is that value expressed relative to the value obtained with calibrator DNA (individual/calibrator) (14). The adjusted ratio values were then rounded off to the nearest integer, as described elsewhere (7, 16).
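The copy number estimate described above can be illustrated with a short sketch. It assumes the standard comparative-Ct form, in which the target/reference ratio is 2 raised to the negative difference between the sample ΔCt and the calibrator ΔCt, and then scales that ratio by the calibrator's copy number (2 for a diploid calibrator). The function names and Ct values are hypothetical.

```python
import math

def relative_ratio(ct_target, ct_reference, cal_ct_target, cal_ct_reference):
    """Target/reference ratio of a sample relative to the calibrator DNA."""
    delta_ct_sample = ct_target - ct_reference        # sample, normalized to RNaseP
    delta_ct_calibrator = cal_ct_target - cal_ct_reference
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)

def estimated_copy_number(ratio, calibrator_copies=2):
    """Scale the ratio to copies and round half up to the nearest integer."""
    return math.floor(ratio * calibrator_copies + 0.5)

# Hypothetical Ct values: the target amplifies one cycle later than in the
# calibrator, so the ratio is 0.5 and the estimate is 1 copy (hemizygous loss).
ratio = relative_ratio(26.0, 25.0, 25.0, 25.0)
copies = estimated_copy_number(ratio)
```

Rounding half up is used instead of Python's built-in `round`, which rounds exact halves to the nearest even integer.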

To delineate the exact sizes and boundaries of the 2 significant CNVs (1q25.1 and 10q21.3), we designed a deletion-typing PCR strategy. The primer sets for detecting the deletion were designed in the flanking sequences of the expected deletion regions and within the deletion area. Detailed information on the primers used and a full description of the deletion-typing PCR strategy are provided in Supplementary Data 2 (available on the Arthritis & Rheumatism web site at http://onlinelibrary.wiley.com/doi/10.1002/art.37854/abstract). PCR was performed using 20 μl of reaction mixture, containing 20 ng of genomic DNA, 0.4 unit of KOD FX DNA polymerase (Toyobo), 2% DMSO, and 6 pmoles of primers. Thermal cycling conditions consisted of 1 cycle of 2 minutes at 94°C, followed by 32 cycles of 10 seconds at 98°C and 1 minute/kb at 69°C. After the PCR, 10 μl of PCR product was run on a 0.5% agarose gel. Amplicons of deleted alleles were sequenced by PCR direct sequencing.

Statistical analysis.

Statistical analyses were performed using Stata software (version 10.0) and SPSS for Windows (version 11.5). P values less than 0.05 were considered significant.

RESULTS

General characteristics of identified CNVs.

The whole analysis process, from genome-wide CNV discovery to independent replication, is illustrated in Figure 1. A total of 18,266 CNVs were identified from 573 samples. The median number of CNVs per individual genome was 30 (range 1–285) and the median size of the CNVs was 21.0 kb (range 14 bp–1.8 Mb). Loss-type CNVs were 1.4 times more frequent than gain-type CNVs. The general characteristics of the CNVs are summarized in Supplementary Data 3 (available on the Arthritis & Rheumatism web site at http://onlinelibrary.wiley.com/doi/10.1002/art.37854/abstract). We identified 2,544 CNV regions from the 18,266 CNVs. Among them, 144 CNV regions appeared in more than 5% of the subjects (for the full list, see Supplementary Data 4, available on the Arthritis & Rheumatism web site at http://onlinelibrary.wiley.com/doi/10.1002/art.37854/abstract).

Figure 1. The procedure from copy number variation (CNV) discovery to replication. Briefly, 400 systemic lupus erythematosus (SLE) cases and 200 controls were genotyped using an Illumina HumanHap610a-Quad BeadChip platform, and CNVs were defined by the PennCNV algorithm. After filtering out samples that did not satisfy the quality criteria, CNV regions (CNVRs) were determined as defined in Patients and Methods. Using 144 common CNV regions (i.e., those found in >5% of subjects), logistic regression analysis was performed and CNV regions found to be significantly associated with SLE risk were replicated in a larger case–control set (564 SLE patients and 511 control subjects). LRR = log R ratio; GWAS = genome-wide association study.

CNV regions associated with SLE.

Using these 144 relatively frequent CNV regions, we performed logistic regression analysis. In the discovery stage, 3 CNV regions, in 1q25.1, 8q23.3, and 10q21.3, were identified to be significantly associated with the risk of SLE (Table 1). Frequency distributions of these 3 CNV regions are listed in Supplementary Data 5 (available on the Arthritis & Rheumatism web site at http://onlinelibrary.wiley.com/doi/10.1002/art.37854/abstract). The gene for RAB GTPase–activating protein 1-like (RABGAP1L) is located in the 1q25.1 CNV region, while there are no coding genes in the other 2 CNV regions.

Table 1. Copy number variation (CNV) regions associated with the risk of systemic lupus erythematosus in Korean female subjects*

CNV region | Start, kb† | End, kb† | Length, kb | Type | Gene
1q25.1 | 173,064.5 | 173,068.3 | 3.8 | Loss | RABGAP1L
8q23.3 | 115,704.8 | 115,711.7 | 6.9 | Loss | –
10q21.3 | 66,980.6 | 66,983.0 | 2.4 | Loss | –

GWAS discovery cohort (382 cases versus 191 controls)
CNV region | P | FDR‡ | OR (95% CI)
1q25.1 | 4.4 × 10−3 | 0.024 | 2.28 (1.29–4.01)
8q23.3 | 5.2 × 10−4 | 0.044 | 0.37 (0.21–0.65)
10q21.3 | 0.039 | 0.066 | 1.62 (1.02–2.56)

Replication by qPCR (564 cases versus 511 controls)
CNV region | P | FDR§ | OR (95% CI)
1q25.1 | 0.038 | 0.057 | 1.30 (1.02–1.67)
8q23.3 | 0.230 | 0.230 | 0.85 (0.65–1.11)
10q21.3 | 3.6 × 10−5 | 1.1 × 10−4 | 1.90 (1.40–2.58)

Replication by deletion-typing PCR (564 cases versus 495 controls)
CNV region | P | FDR§ | OR (95% CI)
1q25.1 | 0.009 | 0.011 | 1.38 (1.08–1.77)
8q23.3 | ND | ND | ND
10q21.3 | 0.011 | 0.011 | 1.40 (1.08–1.81)

* In the initial genome-wide association study (GWAS) discovery cohort, principal components (PC) analysis was performed to adjust for population stratification. In regression analyses, PC1 and PC2 were used as covariates, and single-nucleotide polymorphism array batch data were also used as covariates. In both the discovery cohort and replication by quantitative polymerase chain reaction (qPCR) cohort, odds ratios (ORs) for CNV regions were estimated relative to individuals with 2n as a reference. In the replication by deletion-typing PCR cohort, ORs were estimated relative to individuals with ≥2n as a reference, because deletion-typing cannot discriminate 2n from copy number gains. 95% CI = 95% confidence interval; ND = not determined.
† hg18.
‡ The false discovery rate (FDR) was calculated with the P values for all 144 CNV regions.
§ The FDR was calculated with the P values for the 3 CNV regions (in the replication by qPCR cohort) or for 2 CNV regions (in the replication by deletion-typing PCR cohort).

To validate these CNV regions defined from SNP array intensity signals, we performed genomic qPCR targeting those 3 regions, and as a result, we observed that 92.8% of the CNV regions identified through the SNP arrays were also detected by genomic qPCR (results not shown). The allelic intensity and qPCR validation results with regard to the RABGAP1L CNV in 1q25.1 are illustrated in Figure 2A. Among the CNV region loci that have been previously suggested to be associated with SLE (6–8), copy number loss of C4 (6p21.32) was more frequent in our SLE group, while the gain-type CNV region of C4 (6p21.32) seemed to have a protective effect, but the association did not reach statistical significance (for loss, odds ratio [OR] 2.35, 95% confidence interval [95% CI] 0.64–8.57, P = 0.197; for gain, OR 0.40, 95% CI 0.16–1.03, P = 0.057) (see Supplementary Data 6, available on the Arthritis & Rheumatism web site at http://onlinelibrary.wiley.com/doi/10.1002/art.37854/abstract).

Figure 2. Examples of the determination and validation of the copy number variation (CNV) regions (CNVRs) associated with systemic lupus erythematosus (SLE). A, Left, Genoplot image of the rs4480415 marker (position 173,066,744 bp at 1q25.1; hg18), obtained using Illumina GenomeStudio software. Copy number status was defined based on 6 distinct clusters of signal intensities: 2X (A/A, A/B, and B/B), 1X (A/− and B/−), and 0X (−/−). Middle, Plot of the ratio of signal intensity around the 1q25.1 locus. Right, Genomic quantitative polymerase chain reaction (qPCR) validation of the estimated copy number of 1q25.1. Categories of 0 copies, 1 copy, or 2 copies represent the copy number status of the samples determined by the single-nucleotide polymorphism (SNP)–based PennCNV algorithm, while open circles show the DNA copy numbers estimated by genomic qPCR. B and C, Odds ratios for the risk of SLE by copy number status (<2n, 2n, or >2n) in the replication analysis using qPCR to assess CNV regions of 1q25.1 (RABGAP1L) (B) and 10q21.3 (C). Values over point estimates show the odds ratio (95% confidence interval) relative to individuals with 2n as the reference.

Replication of the significant CNV regions.

We performed an independent replication analysis for all 3 significant CNV regions by target-specific qPCR in a larger case–control set (564 SLE patients and 511 control subjects). CNV regions encompassing RABGAP1L in 1q25.1 and the 10q21.3 loci were successfully replicated in the independent set, but the CNV region in 8q23.3 was not (Table 1). Frequency distributions of each CNV region in the replication set are listed in Supplementary Data 5 (available on the Arthritis & Rheumatism web site at http://onlinelibrary.wiley.com/doi/10.1002/art.37854/abstract).

When we estimated the risk of SLE relative to individuals with 2n as a reference, individuals with a deletion-type CNV region in 1q25.1 had a significantly higher risk of SLE than did individuals with 2n (OR 1.30, 95% CI 1.02–1.67, P = 0.038), and ORs appeared to decrease as copy number increased (r² = 0.939) (Figure 2B). Although individuals with a deletion-type CNV region in 10q21.3 also had a higher risk of SLE than did those with 2n in the replication analysis (OR 1.90, 95% CI 1.40–2.58, P = 3.6 × 10−5), there was no apparent trend in ORs by copy number status (r² = 0.528) (Figure 2C).

In the discovery stage, the CNV region within the C4 gene showed an association with the risk of SLE, not significantly but in the same direction as in previous reports (6, 15). Therefore, we included this CNV region in the independent replication analysis (308 SLE patients and 307 control subjects). Frequency distributions of total C4, C4A, and C4B copy numbers identified by qPCR are illustrated in Figure 3 and in Supplementary Data 6 (available on the Arthritis & Rheumatism web site at http://onlinelibrary.wiley.com/doi/10.1002/art.37854/abstract).

Figure 3. Copy number distributions in the total C4 gene (A) and in C4A (B) and C4B (C), identified by quantitative polymerase chain reaction in systemic lupus erythematosus cases and healthy controls.

Interestingly, when we estimated the risk of SLE relative to individuals with 2n as a reference, individuals with loss of C4A showed a significantly higher risk of SLE (OR 1.83, 95% CI 1.10–3.04, P = 0.02) and those with gain of C4A showed a significantly lower risk (OR 0.30, 95% CI 0.19–0.49, P = 1.87 × 10−6). In contrast, the CNV region within C4B was not significantly associated with the risk of SLE (results not shown). Individuals with copy number loss of total C4 (C4A + C4B) also showed a significantly higher risk of SLE than did individuals with 4n (OR 1.88, 95% CI 1.15–3.06, P = 0.01).

Confirmation of the deletion-type CNV regions by deletion-typing PCR.

For further confirmative verification of the deletion-type CNV regions, we designed a deletion-typing PCR for each target. This strategy is illustrated in Figure 4A. Briefly, the first primer set for detecting the deletion was designed in the flanking sequences of the expected deletion regions. For the 1q25.1 and 10q21.3 CNV regions, the deletion-typing PCR strategy worked successfully, but not for C4. In the 1q25.1 locus, the estimated size of the deletion-type CNV region by SNP array was ∼3.8 kb. The designed amplicon size of the intact allele was 9.4 kb and the real amplicon size of the deleted allele was ∼4.2 kb, which means that the actual length of the 1q25.1 deletion was ∼5.2 kb, not 3.8 kb. Likewise, in the 10q21.3 locus, the designed amplicon size of the intact allele was 11.6 kb and the real amplicon size of the deleted allele was ∼3.3 kb, which means that the actual length of the 10q21.3 deletion was ∼8.3 kb, not ∼2.4 kb as estimated by SNP array (Figures 4B and C).

Figure 4. The strategy for deletion-typing polymerase chain reaction (PCR) and determination of the boundaries of the deletion-type copy number variation (CNV) regions. A, A strategy for deletion-typing PCR analysis of 1q25.1 (top) and 10q21.3 (bottom). White boxes represent deleted (del) regions. Blue arrows in the flanking region of the expected deletion region represent the primer sets (forward [F] and reverse [R]) for detecting the deletion. Red arrows located in the deleted regions represent the primer sets for discriminating homozygous (HOM) deletions and heterozygous (HET) deletions. B and C, Determination of the sizes and boundaries of 1q25.1 (B) and 10q21.3 (C) CNV regions by deletion-typing PCR. Upper plots are images of the gel electrophoresis of deletion-typing PCR products. P1 and P2 represent primer sets 1 and 2, respectively. The sizes of the bands in P1 and P2 were as follows: in 1q25.1, copy number ≥2 = 9.4 kb and 4.2 kb, HET = 4.2 kb and 2.3 kb, HOM = 2.3 kb and no band (bands marked 1, 2, and 3 were 9.4 kb, 4.2 kb, and 2.3 kb, respectively); in 10q21.3, copy number ≥2 = 11.6 kb and 3.3 kb, HET = 3.3 kb and 3.2 kb, HOM = 3.2 kb and no band (bands marked 1, 2, and 3 were 11.6 kb, 3.3 kb, and 0.3 kb, respectively). Lower plots represent examples of DNA sequences around 1q25.1 and 10q21.3 deletion breakpoints (arrows).

In deletion-typing PCR, a case that shows only a deleted-allele amplicon band is interpreted as a homozygous deletion (HOM), a case that shows both intact-allele and deleted-allele amplicons is interpreted as a heterozygous deletion (HET), and a case that shows only an intact-allele band is defined as 2n or more copies (≥2n). In some HET cases, however, the intact-sized amplicon band was very weak or did not appear at all in the gel electrophoresis. Therefore, to distinguish HOM from HET, we designed another primer within the deleted sequences to verify the existence of an intact allele (Figure 4A). Through this strategy, HOM, HET, and ≥2n were clearly distinguished (Figures 4B and C). By subsequent sequencing of the amplicons of the deleted alleles, we delineated the exact sizes and breakpoints of the deletions (Figures 4B and C).
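The genotype-calling logic and the deletion-size arithmetic reported above can be written out as a small sketch. The function names and the "no call" outcome for a failed reaction are assumptions added for illustration. The amplicon sizes are the ones reported in the text.

```python
def call_genotype(deleted_band, internal_band):
    """Classify a sample from two PCR readouts.

    deleted_band:  flanking primers give the short deleted-allele amplicon.
    internal_band: the primer inside the deleted sequence gives a product,
                   which shows that at least one intact allele exists.
    """
    if deleted_band and internal_band:
        return "HET"      # one deleted and one intact allele
    if deleted_band:
        return "HOM"      # deleted allele only
    if internal_band:
        return ">=2n"     # intact allele(s) only; gains are not resolved
    return "no call"      # assumption: neither product means a failed PCR

def deletion_length_kb(intact_amplicon_kb, deleted_amplicon_kb):
    """Deletion size is the intact amplicon minus the deleted-allele amplicon."""
    return round(intact_amplicon_kb - deleted_amplicon_kb, 1)

# Amplicon sizes reported in the text.
size_1q25 = deletion_length_kb(9.4, 4.2)    # 1q25.1 deletion, in kb
size_10q21 = deletion_length_kb(11.6, 3.3)  # 10q21.3 deletion, in kb
```

Relying on the internal primer rather than on the long intact amplicon is what makes HET calls robust when the intact-sized band from the flanking primers is faint or missing.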

We performed deletion-typing PCR analyses of all of the samples used for the replication analysis (564 SLE patients and 495 control subjects) and found that ∼95% of the deletions defined by qPCR were consistently detected by the deletion-typing PCR. In this analysis, we estimated the risk of SLE relative to individuals with ≥2n as a reference, because deletion-typing cannot distinguish 2n and copy number gain. Individuals with the 1q25.1 deletion (HOM and HET) showed a significantly higher risk of SLE than did those with ≥2n (OR 1.38, 95% CI 1.08–1.77, P = 0.009), and the risk was also significantly higher in individuals with the 10q21.3 deletion (OR 1.40, 95% CI 1.08–1.81, P = 0.011) (Table 1). When the effects of only homozygous deletion were analyzed separately, subjects with either 1q25.1 or 10q21.3 HOM had a significantly higher risk of SLE than did those with ≥2n (OR 1.82, 95% CI 1.03–3.20, P = 0.039 and OR 1.46, 95% CI 1.01–2.10, P = 0.044, respectively). In a meta-analysis of the discovery and deletion-typing PCR replication data, both associations seemed to be more significant (OR 1.51, 95% CI 1.20–1.89, P = 4 × 10−4 for 1q25.1 and OR 1.46, 95% CI 1.17–1.83, P = 8 × 10−4 for 10q21.3).
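The meta-analysis method is not described in the text. As a rough illustration only, the sketch below pools the discovery and deletion-typing estimates from Table 1 with a fixed-effect inverse-variance model, recovering each standard error from the reported 95% CI. Both the choice of model and the function names are assumptions, so small differences from the published pooled values are expected.

```python
import math

def log_or_and_se(odds_ratio, lower, upper, z=1.96):
    """Recover log(OR) and its standard error from a reported 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return math.log(odds_ratio), se

def fixed_effect_pool(estimates, z=1.96):
    """Inverse-variance pooled OR with 95% CI from (OR, lower, upper) tuples."""
    weight_sum = 0.0
    weighted_log_or = 0.0
    for odds_ratio, lower, upper in estimates:
        log_or, se = log_or_and_se(odds_ratio, lower, upper, z)
        weight = 1.0 / se ** 2
        weight_sum += weight
        weighted_log_or += weight * log_or
    pooled = weighted_log_or / weight_sum
    pooled_se = math.sqrt(1.0 / weight_sum)
    return (math.exp(pooled),
            math.exp(pooled - z * pooled_se),
            math.exp(pooled + z * pooled_se))

# Discovery and deletion-typing PCR estimates from Table 1.
pooled_1q25 = fixed_effect_pool([(2.28, 1.29, 4.01), (1.38, 1.08, 1.77)])
pooled_10q21 = fixed_effect_pool([(1.62, 1.02, 2.56), (1.40, 1.08, 1.81)])
```

With these inputs the pooled ORs come out close to 1.5 for both loci, in line with the reported 1.51 and 1.46. An exact match would require the genotype counts and the actual method used.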

Combined effect of simultaneous deletions on the risk of SLE.

We further explored the effect of simultaneous losses of the 3 significant loci on the risk of SLE. For this analysis, we used deletion-typing PCR results for RABGAP1L and 10q21.3 and qPCR results for C4. Subjects with losses in all 3 loci had a risk of SLE that was 5.5 times higher than that in subjects with ≥2n in all 3 loci (OR 5.52, 95% CI 2.14–14.21, P = 3.9 × 10−4). ORs for the risk of SLE by the number of deletions, from losses at 1 locus to losses at 3 loci, showed a dose-dependent increase (r² = 0.965) (Figure 5 and Supplementary Data 7, available on the Arthritis & Rheumatism web site at http://onlinelibrary.wiley.com/doi/10.1002/art.37854/abstract).

Figure 5. Dose-dependent trend of odds ratios for systemic lupus erythematosus (SLE) susceptibility. The risk of SLE by the number of deletions (losses at 1 locus, 2 loci, or 3 loci) was estimated relative to individuals without any deletions in all 3 loci (≥2n at 3 loci) as a reference. Values over point estimates are the odds ratio (95% confidence interval).

DISCUSSION

To identify and characterize DNA structural variations associated with SLE, we performed a GWAS of CNVs using an Illumina HumanHap610 BeadChip platform and identified 3 potential SLE-associated CNV regions, all of which were deletion-type variations. We used the PennCNV algorithm, which is one of the most commonly used programs for CNV identification from SNP array data (16). Although PennCNV is known to have some limitations in detecting small-sized CNVs and its performance was rated intermediate in a study by Dellinger et al (which evaluated the performances of various methods), its relatively low false-positive rates support the reliability of its call rates (17).

To confirm the GWAS results, we performed an independent replication analysis by qPCR for the 3 significant CNV region targets and a CNV region within the C4 gene, using a larger case–control set. In this analysis, the RABGAP1L, 10q21.3, and C4 deletions were successfully replicated, while the 8q23.3 deletion was not. To reconfirm the results, deletion-typing PCR was performed, and the results indicated that the RABGAP1L and 10q21.3 CNV regions were consistently significant. The meta-analysis of the discovery and replication data also confirmed the significance of the associations. In particular, individuals with copy number losses in all 3 loci (8.1% of SLE patients and 2.3% of control subjects) showed a much higher risk of SLE than those with ≥2n in all 3 loci (10.7% of SLE patients and 16.6% of control subjects) (OR 5.52). All of these results support the reliability of our data.
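As a rough consistency check, the crude odds ratio can be recomputed from the percentages quoted above (8.1% versus 10.7% of SLE patients, 2.3% versus 16.6% of control subjects). The sketch below does this and also includes a Woolf (log-based) confidence interval helper for use when actual genotype counts are available. The function names are illustrative, and no counts are assumed here because the text reports only percentages.

```python
import math

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio; works with counts or with percentages."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

def woolf_ci(a, b, c, d, z=1.96):
    """95% CI for the OR from counts.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

# Percentages from the text: losses at all 3 loci versus >=2n at all 3 loci.
crude_or = odds_ratio(8.1, 10.7, 2.3, 16.6)
```

The crude value is about 5.5, close to the reported OR of 5.52. The small gap is consistent with rounding of the published percentages.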

The RABGAP1L gene, which is located on 1q25.1, encodes a RAB GTPase–activating protein. This molecule has been suggested to be a tyrosine kinase with a role in signal transduction and cellular junction formation (18, 19), but the details on its biologic functions or contributions to disease pathogenesis are not yet known. In this study, we detected a 5.2-kb deletion CNV region located in intron 20 of the RABGAP1L gene. This deletion may affect the expression of this gene by causing alternative splicing or posttranslational modification. Further study clarifying the functional implications of the deletion and downstream molecular pathways will be required. Interestingly, one recent GWAS of SNPs suggested that the RABGAP1L gene (rs2285210) was a potential SLE-associated gene in a European population (20). Previous SNP-specific GWAS findings and our CNV data indicate that genetic polymorphisms of RABGAP1L may affect susceptibility to SLE. A recent GWAS provided evidence of a linkage between SNPs and CNVs (21). In our study, however, we could not evaluate the linkage disequilibrium between the CNV and rs2285210 SNP due to the absence of a corresponding SNP probe in the Illumina 610s array.

In the 10q21.3 locus, in which an 8.3-kb deletion CNV region was identified, no protein-coding gene or noncoding RNA is known to be present. However, many GWAS have demonstrated well-validated associations between variants in noncoding regions and diverse phenotypes, which indicates that there is a gap in our knowledge about the biologic roles of noncoding regions. One potentially interesting sequence in this CNV region is a CD34+ cell nuclease accessible site (NAS). NAS behaves as a transcription factor binding site and has been suggested to be involved in hematopoietic cell differentiation (22). It might be worth investigating the effect of the loss of this element on myeloid differentiation of CD34+ cells and the risk of SLE.

In the discovery stage, among the CNVs that were previously suggested to be associated with SLE (6–8), the CNV region of the C4 gene showed an association with the risk of SLE, but not significantly. The complement C4 gene is in the class III major histocompatibility complex region and shows up as part of RP-C4-CYP21-TNX modular duplication (6, 15). The frequencies of copy number loss of C4 (3.7% of SLE patients and 1.6% of control subjects) and gain (2.4% of SLE patients and 6.3% of control subjects) in our study were much lower than in previous reports (6, 15, 23). It has been suggested that SNP arrays have difficulty detecting CNVs, especially those located in the segmental duplication-rich regions, due to the scarcity of probes in that region (6, 24), which might mask the association between C4 and the SLE risk in the discovery stage. Indeed, in the replication study, the loss of total C4 copy was found to be significantly associated with a higher risk of SLE and the gain of C4 was associated with a lower risk. This association is consistent with the previous observations in Europeans (6, 15) and Han Chinese (25). The same association pattern was observed in the C4A gene, but not in the C4B gene.

In addition to C4, there are several genes, such as CCL3L1, FCGR3B, TLR7, and EGR, which have been reportedly associated with SLE, but none of these showed a significant association in our study (results not shown). In fact, in some of these genes, no CNV was called. Apart from the innate limitations of SNP array platforms and CNV calling algorithms, one alternative explanation for this result can be ethnic specificity. For example, a lower copy of FCGR3B was reported to increase the risk of SLE in Europeans (7), but not in Asians (26). Future studies with various platforms and algorithms will be required to clarify this issue.

To overcome the limitations of SNP arrays and to reduce the ambiguity of the qPCR, we adopted deletion-typing PCR. Since there are currently no confirmed diploid reference samples for 1q25.1 and 10q21.3, the median value of qPCR ratios in normal control subjects had to be set as the diploid reference value for copy number estimation by qPCR. Although 95.6% of the deletion-type CNV regions in 1q25.1 and 95.7% in 10q21.3 identified by qPCR were consistently defined by deletion-typing PCR, this was not the case for CNV regions with ≥2n copies. For example, in 1q25.1, 95.5% of the CNV regions estimated to be ≥2n by qPCR were consistently defined by the deletion-typing PCR, but the consistency rate was only 48.8% for 10q21.3 (results not shown). According to the deletion-typing results for 10q21.3, 62.3% of the control subjects actually had either HOM or HET at this locus, so the median qPCR ratio used as the diploid reference probably corresponded to fewer than 2 copies. Therefore, some of the true deletion cases in 10q21.3 may have been estimated as 2n in the qPCR analysis. Because deletion-typing PCR does not quantify copy number relative to a reference sample, it avoids this problem, which increases the validity of our results.

In summary, we identified 3 deletion-type CNV regions associated with SLE (RABGAP1L, 10q21.3, and C4) through GWAS analyses of CNVs and an independent replication study. Each of the CNV regions is significantly associated with the risk of SLE, and individuals with all 3 deletions have a much higher risk than those without any of the deletions. We also determined the exact sizes and breakpoints of the deletions by deletion-typing PCR. To our knowledge, this study is the first GWAS on the association between CNVs and the risk of SLE. Our results can be useful for identifying the pathogenic mechanisms of SLE in Korean women and for predicting the risk of SLE more accurately by taking the synergistic effects of these CNVs into consideration.

AUTHOR CONTRIBUTIONS

All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Drs. S.-C. Bae and Chung had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.

Study conception and design. Bang, S.-C. Bae, Chung.

Acquisition of data. Kim, Jung, J. S. Bae, Lee, Park, Bang, Shin, S.-C. Bae, Chung.

Analysis and interpretation of data. Kim, Jung, J. S. Bae, Yim, Bang, Hu, Chung.

REFERENCES

1. Rhodes B, Vyse TJ. General aspects of the genetics of SLE. Autoimmunity 2007; 40: 550–9.
2. Deng Y, Tsao BP. Genetic susceptibility to systemic lupus erythematosus in the genomic era. Nat Rev Rheumatol 2010; 6: 683–92.
3. Feuk L, Carson AR, Scherer SW. Structural variation in the human genome. Nat Rev Genet 2006; 7: 85–97.
4. McCarroll SA, Altshuler DM. Copy-number variation and association studies of human disease. Nat Genet 2007; 39: S37–42.
5. Yim SH, Kim TM, Hu HJ, Kim JH, Kim BJ, Lee JY, et al. Copy number variations in East-Asian population and their evolutionary and functional implications. Hum Mol Genet 2010; 19: 1001–8.
6. Yang Y, Chung EK, Wu YL, Savelli SL, Nagaraja HN, Zhou B, et al. Gene copy-number variation and associated polymorphisms of complement component C4 in human systemic lupus erythematosus (SLE): low copy number is a risk factor for and high copy number is a protective factor against SLE susceptibility in European Americans. Am J Hum Genet 2007; 80: 1037–54.
7. Fanciulli M, Norsworthy PJ, Petretto E, Dong R, Harper L, Kamesh L, et al. FCGR3B copy number variation is associated with susceptibility to systemic, but not organ-specific, autoimmunity. Nat Genet 2007; 39: 721–3.
8. Ptacek T, Li X, Kelley JM, Edberg JC. Copy number variants in genetic susceptibility and severity of systemic lupus erythematosus. Cytogenet Genome Res 2008; 123: 142–7.
9. Tan EM, Cohen AS, Fries JF, Masi AT, McShane DJ, Rothfield NF, et al. The 1982 revised criteria for the classification of systemic lupus erythematosus. Arthritis Rheum 1982; 25: 1271–7.
10. Hochberg MC, for the Diagnostic and Therapeutic Criteria Committee of the American College of Rheumatology. Updating the American College of Rheumatology revised criteria for the classification of systemic lupus erythematosus [letter]. Arthritis Rheum 1997; 40: 1725.
11. Wang K, Li M, Hadley D, Liu R, Glessner J, Grant SF, et al. PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res 2007; 17: 1665–74.
12. Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, Andrews TD, et al. Global variation in copy number in the human genome. Nature 2006; 444: 444–54.
13. Kim JH, Hu HJ, Yim SH, Bae JS, Kim SY, Chung YJ. CNVRuler: a copy number variation-based case–control association analysis tool. Bioinformatics 2012; 28: 1790–2.
14. Yim SH, Chung YJ, Jin EH, Shim SC, Kim JY, Kim YS, et al. The potential role of VPREB1 gene copy number variation in susceptibility to rheumatoid arthritis. Mol Immunol 2011; 48: 1338–43.
15. Wu YL, Savelli SL, Yang Y, Zhou B, Rovin BH, Birmingham DJ, et al. Sensitive and specific real-time polymerase chain reaction assays to accurately determine copy number variations (CNVs) of human complement C4A, C4B, C4-long, C4-short, and RCCX modules: elucidation of C4 CNVs in 50 consanguineous subjects with defined HLA genotypes. J Immunol 2007; 179: 3012–25.
16. Gonzalez E, Kulkarni H, Bolivar H, Mangano A, Sanchez R, Catano G, et al. The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility. Science 2005; 307: 1434–40.
17. Dellinger AE, Saw SM, Goh LK, Seielstad M, Young TL, Li YJ. Comparative analyses of seven algorithms for copy number variant identification from single nucleotide polymorphism arrays. Nucleic Acids Res 2010; 38: e105.
18. Roberti MC, La Starza R, Surace C, Sirleto P, Pinto RM, Pierini V, et al. RABGAP1L gene rearrangement resulting from a der(Y)t(Y;1)(q12;q25) in acute myeloid leukemia arising in a child with Klinefelter syndrome. Virchows Arch 2009; 454: 311–6.
19. Nakayama M, Kikuno R, Ohara O. Protein-protein interactions between large proteins: two-hybrid screening using a functionally classified library composed of long cDNAs. Genome Res 2002; 12: 1773–84.
20. Ramos PS, Williams AH, Ziegler JT, Comeau ME, Guy RT, Lessard CJ, et al. Genetic analyses of interferon pathway–related genes reveal multiple new loci associated with systemic lupus erythematosus. Arthritis Rheum 2011; 63: 2049–57.
21. McCarroll SA, Huett A, Kuballa P, Chilewski SD, Landry A, Goyette P, et al. Deletion polymorphism upstream of IRGM associated with altered IRGM expression and Crohn's disease. Nat Genet 2008; 40: 1107–12.
22. Gargiulo G, Levy S, Bucci G, Romanenghi M, Fornasari L, Beeson KY, et al. NA-Seq: a discovery tool for the analysis of chromatin structure and dynamics during differentiation. Dev Cell 2009; 16: 466–81.
23. Paakkanen R, Vauhkonen H, Eronen KT, Jarvinen A, Seppanen M, Lokki ML. Copy number analysis of complement C4A, C4B and C4A silencing mutation by real-time quantitative polymerase chain reaction. PLoS One 2012; 7: e38813.
24. Campbell CD, Sampas N, Tsalenko A, Sudmant PH, Kidd JM, Malig M, et al. Population-genetic properties of differentiated human copy-number polymorphisms. Am J Hum Genet 2011; 88: 317–32.
25. Lv Y, He S, Zhang Z, Li Y, Hu D, Zhu K, et al. Confirmation of C4 gene copy number variation and the association with systemic lupus erythematosus in Chinese Han population. Rheumatol Int 2012; 32: 3047–53.
26. Lv J, Yang Y, Zhou X, Yu L, Li R, Hou P, et al. FCGR3B copy number variation is not associated with lupus nephritis in a Chinese population. Lupus 2010; 19: 158–61.

Supporting Information

Filename                Format  Size  Description
ART_37854_SuppMat.pdf   .pdf    147K  Supplementary Data

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.