Abstract

  1. Top of page
  2. Abstract
  3. INTRODUCTION
  4. PATIENTS AND METHODS
  5. RESULTS
  6. DISCUSSION
  7. AUTHOR CONTRIBUTIONS
  8. REFERENCES

Objective

Gene–gene interaction, or epistasis, is considered a ubiquitous component of complex human diseases such as systemic sclerosis (SSc). Epistasis is difficult to model by traditional parametric approaches; therefore, nonparametric computational algorithms, such as multifactor dimensionality reduction (MDR), have been developed.

Methods

A total of 242 consecutive unrelated Italian SSc patients and an equal number of well-matched healthy controls were genotyped for 22 cytokine single-nucleotide polymorphisms (SNPs; 13 cytokine genes). The distribution of the SNPs between controls and SSc patients, controls and limited cutaneous SSc (lcSSc) patients, and controls and diffuse cutaneous SSc (dcSSc) patients was tested by the MDR constructive induction algorithm and by focused interaction testing framework (FITF), a logistic regression–based approach.

Results

None of the studied SNPs had main independent effects on SSc or disease subset susceptibility; therefore, no epistatic interaction was detectable by FITF. The MDR analysis showed a significant epistatic interaction among the interleukin-2 (IL-2) G-330T, IL-6 C-174G, and interferon-γ AUTR5644T SNPs and the IL-1 receptor Cpst1970T, IL-6 Ant565G, and IL-10 C-819T SNPs in lcSSc and dcSSc susceptibility, respectively. The relevance of the single multilocus attributes constructed by the MDR inductive algorithm was then confirmed by the parametric approach (P < 0.001 for both controls versus lcSSc patients and controls versus dcSSc patients).

Conclusion

We provide evidence for gene–gene interaction among cytokine SNPs in the context of SSc. The interaction among cytokine SNPs with a profibrotic or a regulatory function on profibrotic interleukins is relevant to the susceptibility to SSc subsets and it appears to be more important than the contribution of any single cytokine SNP.


INTRODUCTION

Systemic sclerosis (SSc) is a multifactorial disease characterized by a triad of typical features: widespread vasculopathy, abnormalities of the immune system, and fibrosis (1). All of these characteristics are subtly interlaced and influence each other in a pathologic network that leads to the onset of the disease or to the development of specific organ manifestations (2, 3). Specifically, it has been hypothesized that vascular injury and activation of the immune system constitute the early events and background that sustain the development of fibrosis (2) or the onset of other manifestations, such as pulmonary hypertension or scleroderma renal crisis (4).

Cytokines are key mediators of the immune system with a widespread array of functions, ranging from the regulation of inflammation to cell activation, proliferation, or differentiation (5). Cytokines may also promote the deposition of collagen and fibrosis (6) and many studies have focused on the role of these mediators in SSc, depicting alterations in their concentrations (7–9) or in the balance between Th1 and Th2 cytokine levels (10). Because cytokine production is regulated at the genetic level (11–14), it has been hypothesized that single-nucleotide polymorphisms (SNPs) in or near cytokine genes may be relevant to the development of SSc. Nonetheless, studies conducted so far on this matter have often yielded disappointing results (15–17) and, in some cases, the associations described by some authors have not been confirmed in replication studies conducted in other independent populations (16, 18–20). These contradictory results could be ascribed to different factors. First, the studies used small sample sizes and therefore were unable to depict a real association due to Type II errors (21). Second, the studied SNPs might not have a causative role in the pathogenesis of SSc, but rather they might only be relevant in the progression or in the expression of the disease (17). Third, each SNP may not have a discernible main independent effect on disease risk, but its effect may be dependent on other genetic variations (gene–gene interaction or epistasis) (22). The latter aspect is of particular importance when dealing with multifactorial diseases, such as SSc, and particularly in the present context because cytokines are redundant in their activity and because they may influence each other's production and function by acting synergistically or antagonistically (5).

The evaluation of gene–gene interaction in multifactorial diseases is a challenging task, and to date, several approaches have been developed for this purpose, including parametric statistical methods such as linear and logistic regression (23) or nonparametric methods such as genetic programming neural networks (24), multilocus genotype-pedigree disequilibrium test (25), or multifactor dimensionality reduction (MDR) (26). Parametric methods suffer from a general lack of power and flexibility to detect high-order gene–gene interactions and thus nonparametric models are regarded as more appropriate in the context of statistical epistasis as alternative research strategies (27, 28).

MDR is a nonparametric and genetic model-free data mining method developed to detect gene–gene interactions that examines all possible SNP combinations from a set of given SNPs and chooses the combination that best predicts the risk of disease by maximizing the classification accuracy of cases and controls. Among the advantages of the MDR strategy is the ability to detect and characterize high-order gene–gene interactions in case–control studies with moderate sample size data by reducing the genotype predictors from n dimension to 1 dimension and the ability to analyze correlated predictors, thus overcoming the problem of multicollinearity (28). The MDR method has successfully been used in a variety of complex multifactorial human diseases such as rheumatoid arthritis (29), lupus nephritis (30), prostate and sporadic breast cancer (31, 32), and others (26). It has also been shown that the MDR analysis may be integrated with other statistical approaches to better depict gene–gene interactions, and this strategy has been advocated as an optimal approach to elucidate complex epistatic interactions in human diseases (33, 34). Recently, Heidema et al (35) demonstrated that in genetic-association studies, the application of different multilocus methods may reduce the chance of falsely identifying SNPs as important and may be helpful for the selection of a set of important SNPs for further biologic studies.

The present study was conducted to determine the epistatic interactions of multiple cytokine SNPs on SSc susceptibility by using the MDR method, the results of which were also verified and complemented by other analytical strategies to minimize the chance of false-positive findings.

PATIENTS AND METHODS

Study group.

A total of 242 consecutive unrelated Italian patients with SSc referred to our outpatient clinic, who provided written consent to have their DNA collected as well as their clinical data recorded and utilized for medical research, were included. All patients fulfilled the classification criteria proposed by the American College of Rheumatology (formerly the American Rheumatism Association) (36) and were categorized as having limited cutaneous SSc (lcSSc) or diffuse cutaneous SSc (dcSSc) according to LeRoy et al (37). Disease onset was determined by patients' recall of the first non-Raynaud's symptom clearly attributable to scleroderma (38). Patients' autoantibody profiles were also determined by reviewing their medical records. Antinuclear antibodies were determined by indirect immunofluorescence on HEp-2 cells (Kallestad, Chaska, MN) using a standardized technique (39); extractable nuclear antigens were determined by a commercial enzyme-linked immunosorbent assay (Diamedix, Miami, FL). A total of 242 unrelated age- and ethnicity-matched healthy subjects were included as controls.

Genotyping.

Blood samples were collected in citrate and DNA was extracted using the DNA Isolation Kit For Mammalian Blood (Roche Diagnostics, Indianapolis, IN). Genotyping was performed by polymerase chain reaction (PCR) with sequence-specific primers (SSP-PCR) according to the 13th International Histocompatibility Workshop recommendations and by using the cytokine kit provided by the University Clinic of Heidelberg (CTS-PCR-SSP TRAY; Institute of Immunology, University of Heidelberg, Heidelberg, Germany). Briefly, SSP-PCR typing by the Heidelberg kit consists of 48 PCR primer mixes divided into aliquots in 96-well PCR trays (2 typings per tray); the kit allows the determination of 13 cytokine genes (22 SNPs): interleukin-1α (IL-1α), IL-1β, IL-1 receptor (IL-1R), IL-1 receptor antagonist (IL-1Ra), IL-2, IL-4, IL-4Rα, IL-6, IL-10, IL-12, interferon-γ (IFNγ), transforming growth factor β1 (TGFβ1), and tumor necrosis factor α (TNFα). The following SNPs were analyzed: IL-1α C-889T, IL-1β C-511T, IL-1β C+3962T, IL-1R Cpst1970T, IL-1Ra Cmspal11100T, IL-2 G-330T, IL-2 G+160T, IL-4 G-1098T, IL-4 C-590T, IL-4 C-33T, IL-4Rα A+1902G, IL-6 C-174G, IL-6 Ant565G, IL-10 A-1082G, IL-10 C-819T, IL-10 A-590C, IL-12 A-1188C, TGFβ1 T/C codon 10, TGFβ1 G/C codon 25, IFNγ AUTR5644T, TNFα A-308G, and TNFα A-238G. The mixture, which was supplied along with the reagents and consisted of MgCl2, buffer, dNTPs, and glycerol, was mixed with 1.2–3.0 μg DNA and 20 units Taq polymerase and dispensed in the 48 wells. Gel electrophoresis on a 2% agarose gel revealed either a positive or no specific amplification for each well (40). Results were interpreted according to the instruction manual provided with the kit. Results were analyzed by 2 independent investigators (FC and MB); when conflicting results occurred, missing genotypes were imputed using a frequency-based imputation procedure by the MDR data tool (available at http://sourceforge.net/projects/mdr/). 
This procedure has previously been validated in other studies (34) and is considered adequate when missing genotypes have a low frequency and are missing at random across cases and controls (41).
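
The imputation step can be illustrated with a minimal sketch. It assumes that "frequency-based" means replacing each missing call with the most common observed genotype at that SNP; the exact behavior of the MDR data tool is not described here, and the function name, the `MISSING` marker, and the genotype calls below are hypothetical.

```python
from collections import Counter

MISSING = None  # hypothetical marker for an untyped or conflicting genotype call

def impute_most_frequent(genotypes):
    """Replace missing calls with the most common observed genotype (assumed rule)."""
    observed = [g for g in genotypes if g is not MISSING]
    if not observed:
        raise ValueError("no observed genotypes to impute from")
    fill = Counter(observed).most_common(1)[0][0]
    return [fill if g is MISSING else g for g in genotypes]

# Illustrative calls for one SNP, not study data.
calls = ["CC", "CT", None, "CC", "TT", None, "CC"]
imputed = impute_most_frequent(calls)
```

Such a rule only makes sense under the conditions stated in the text: missing calls are rare and occur at random across cases and controls.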

Statistical analysis.

The distribution of genotypes was tested for Hardy-Weinberg equilibrium (HWE) with the goodness-of-fit chi-square test at a significance level of 0.05. Only cytokine SNPs consistent with HWE were further analyzed.
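
As an illustration of the HWE check, the following sketch computes the goodness-of-fit chi-square statistic for one biallelic SNP with 1 degree of freedom. The function name and the genotype counts are hypothetical, and the P value uses the standard identity for a chi-square variable with 1 degree of freedom rather than any particular statistics package.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium (1 df)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p                         # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic SNPs) to avoid division by zero.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Illustrative genotype counts, not study data.
chi2_ok, p_ok = hwe_chi_square(50, 100, 50)     # exactly at equilibrium
chi2_bad, p_bad = hwe_chi_square(70, 60, 70)    # heterozygote deficit
consistent_with_hwe = p_bad >= 0.05
```

With the 0.05 threshold used here, the second SNP in the example would be excluded from further analysis.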

The distribution of cytokine SNP variations between controls and SSc patients or between controls and lcSSc or dcSSc patients was tested by the chi-square test or Fisher's exact test when necessary. Variation in a particular SNP was considered to be associated with the end point at a significance level of 0.05.

MDR.

The evaluation of gene–gene interactions was performed using the 4-step process outlined by Moore et al (26). First we removed noisy SNPs from the pool of possible candidates by using the Tuned ReliefF (TuRF) filter algorithm developed by Moore and White (42). Using this procedure, we analyzed the top 4 SNPs because we only wanted to evaluate the best 2-way, 3-way, or 4-way gene–gene interactions. This decision was made to reduce the chance of overfitting the data, which is possible when considering high-order interactions (e.g., >4) in relatively small data sets (28, 33). We then confirmed that all 4 SNPs selected for each analysis had positive TuRF scores, indicating they each had genotypes that were more similar within case–control groups than between groups. SNPs with negative scores or scores close to zero were eliminated during this filter step and thus were not considered in the MDR analysis. Second, we constructed all possible combinations of 2, 3, and 4 polymorphisms using the MDR constructive induction algorithm (24, 26, 31, 42, 43).

Third, using the MDR kernel, we created a new multilocus attribute for each dimension by pooling multilocus genotypes into 1 variable consisting of 2 risk groups (high and low risk). We then used a naive Bayes classifier in the context of a 10-fold cross-validation to estimate the testing accuracy of each 1-dimensional attribute of the 2-, 3-, and 4-factor models. A single best model was selected that maximized the testing accuracy (TA). This is the model that is most likely to generalize to independent data sets. We also reported the cross-validation consistency (CVC), which measures the number of times out of 10 divisions of the data that the same best model was found. Statistical significance was evaluated using a 1,000-fold permutation test to compare observed testing accuracies with those expected under the null hypothesis of no association. Permutation testing corrects for multiple testing by repeating the entire analysis on 1,000 data sets that are consistent with the null hypothesis. Models were considered significant at the 0.05 level. Finally, as described by Moore et al (26), we used measures of interaction information to provide a statistical interpretation of the gene–gene interaction models. Interaction graphs were used to visualize the nature of the dependencies using the MDR algorithm and the Graphviz library (available at http://graphviz.org). All analyses were implemented in the open-source MDR software package, version 1.1.0, available at www.epistasis.org.
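
The pooling and cross-validation steps described above can be sketched as follows. This is a simplified illustration, not the MDR software itself: it assumes a direct high-risk/low-risk lookup in place of the naive Bayes classifier, balanced accuracy as the accuracy measure, interleaved folds, and low risk as the default for genotype cells not seen in training. All function names and the synthetic data are hypothetical, and CVC and permutation testing are omitted.

```python
from collections import defaultdict
from itertools import combinations

def fit_mdr(rows, labels, snps):
    """Pool multilocus genotypes into high-risk (1) and low-risk (0) cells."""
    counts = defaultdict(lambda: [0, 0])           # cell -> [controls, cases]
    for row, y in zip(rows, labels):
        counts[tuple(row[s] for s in snps)][y] += 1
    n_cases = sum(labels)
    threshold = n_cases / (len(labels) - n_cases)  # T = overall case:control ratio
    # A cell is high risk when cases/controls > T (written without division).
    return {cell: int(ca > threshold * co) for cell, (co, ca) in counts.items()}

def balanced_accuracy(model, rows, labels, snps):
    """Mean of sensitivity and specificity; unseen cells default to low risk."""
    hits, totals = [0, 0], [0, 0]
    for row, y in zip(rows, labels):
        pred = model.get(tuple(row[s] for s in snps), 0)
        totals[y] += 1
        hits[y] += int(pred == y)
    rates = [h / t for h, t in zip(hits, totals) if t]
    return sum(rates) / len(rates)

def cv_testing_accuracy(rows, labels, snps, k=10):
    """Average testing accuracy over k interleaved folds."""
    accuracies = []
    for fold in range(k):
        test = set(range(fold, len(rows), k))
        train_idx = [i for i in range(len(rows)) if i not in test]
        model = fit_mdr([rows[i] for i in train_idx],
                        [labels[i] for i in train_idx], snps)
        test_idx = sorted(test)
        accuracies.append(balanced_accuracy(
            model, [rows[i] for i in test_idx],
            [labels[i] for i in test_idx], snps))
    return sum(accuracies) / k

def best_model(rows, labels, all_snps, order, k=10):
    """Return (testing accuracy, SNP combination) for the best model of one order."""
    return max((cv_testing_accuracy(rows, labels, combo, k), combo)
               for combo in combinations(all_snps, order))

# Synthetic XOR-like data: status depends on A and B jointly; C is noise.
rows, labels = [], []
for _ in range(10):
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                rows.append({"A": a, "B": b, "C": c})
                labels.append(a ^ b)

one_way = best_model(rows, labels, ("A", "B", "C"), 1)
two_way = best_model(rows, labels, ("A", "B", "C"), 2)
```

In this toy data set no single attribute predicts status better than chance, whereas the A and B pair classifies every held-out subject correctly, which is the kind of purely interactive pattern the method is designed to detect.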

Focused interaction testing framework.

To complement the MDR analysis, we ran a focused interaction testing framework (FITF), a logistic regression–based approach recently developed by Millstein et al (44). With FITF, likelihood ratio tests are performed in stages that increase in the order of interaction considered. Joint tests of main effects and interactions are performed conditionally on significant lower-order effects. We used FITF to detect interactions among the same SNPs selected by TuRF that were considered in the MDR analysis and then on the single multilocus attributes constructed by the MDR inductive algorithm. The latter procedure was previously validated by Moore et al (26), who demonstrated that multilocus attributes constructed by the MDR algorithm may be evaluated by any machine learning method amenable to discrete data, and that this approach significantly increases the specificity, accuracy, and precision of the procedure, especially in the absence of main independent effects.

RESULTS

The clinical and demographic characteristics of the patients are reported in Table 1. Genotyping results for the IL-4 G-1098T, IL-4 C-590T, and IL-4 C-33T did not meet the quality requirement for interpretation (e.g., unequal or weak amplification results) and were therefore excluded from analysis; this problem had already been observed in other studies in which the Heidelberg kit was used for genotyping (45). Missing genotypes had low frequencies (IL-1α C-889T: 1.2%; IL-1β C-511T: 0.2%; IL-1β C+3962T: 1%; IL-1R Cpst1970T: 0.2%; IL-2 G-330T: 3.5%; IL-2 G+160T: 3.5%; IL-4Rα A+1902G: 0.4%; IL-6 C-174G: 1.4%; IL-6 Ant565G: 1.4%; IL-10 A-1082G: 7.2%; IL-10 C-819T: 7.2%; IL-10 A-590C: 7.2%; IL-12 A-1188C: 1.4%; TGFβ1 T/C codon 10: 5.2%; TGFβ1 G/C codon 25: 5.2%; IFNγ AUTR5644T: 3.5%; TNFα A-308G: 1.2%; TNFα A-238G: 1.2%) and were randomly distributed across cases and controls; the frequency-based approach we used was therefore deemed appropriate to fill in missing genotypes.

Table 1. Clinical and demographic characteristics of 242 patients with systemic sclerosis*

Variable                          Value
Female sex                        214 (88.1)
lcSSc                             175 (72.3)
Autoantibody
  ANA                             232 (95.9)
  ACA                             104 (43)
  Scl70                           99 (40.9)
Age at onset, mean ± SD years     47.7 ± 14.4
Restrictive lung disease          48 (19.8)
Impaired DLCO                     179 (73.9)
Pulmonary hypertension            56 (23.1)
Esophageal involvement            174 (71.9)
Renal involvement                 10 (4.1)

* Values are the number (percentage) unless otherwise indicated. lcSSc = limited cutaneous systemic sclerosis; ANA = antinuclear antibody; ACA = anticentromere antibody; restrictive lung disease = presence of a forced vital capacity <70% predicted; impaired DLCO = presence of a diffusing capacity for carbon monoxide <70% predicted; pulmonary hypertension = presence of a right ventricular systolic pressure on echocardiogram >45 mm Hg.

Allele and genotype frequencies for the studied SNPs in cases and controls and in the 2 disease subsets are reported in Table 2. All SNPs, with the exception of the IL-12 A-1188C SNP, were consistent with the HWE in both patients and controls. Allele and genotype frequencies for the SNPs in HWE were equally distributed between cases and controls; similarly, no differences were observed between controls and lcSSc patients or between controls and dcSSc patients (Table 2).

Table 2. Allele and genotype frequencies in patients and controls*

Cytokine Position     Allele Controls SSc   lcSSc dcSSc  Genotype Controls SSc   lcSSc dcSSc
IL-1α    −889         C      75.1     74.3  75.8  69.5   CC       56.9     56.5  58.9  49.2
                      T      24.9     25.7  24.2  30.5   CT       36.4     35.6  33.9  40.7
                                                         TT       6.7      7.9   7.2   10.2
IL-1β    −511         C      65.3     66.6  66.3  67.5   CC       44.6     42.7  42.0  45.0
                      T      34.7     33.4  33.7  32.5   CT       41.3     47.7  48.6  45.0
                                                         TT       14.0     9.5   9.4   10.0
         +3962        C      77.1     76.3  77.9  71.7   CC       60.1     58.9  61.3  51.7
                      T      22.9     23.7  22.1  28.3   CT       34.0     34.9  33.1  40.0
                                                         TT       5.9      6.2   5.5   8.3
IL-1R    pst1 1970    C      64.3     67.4  68.5  64.2   CC       40.1     45.2  45.3  45.0
                      T      35.7     32.6  31.5  35.8   CT       48.3     44.4  46.4  38.3
                                                         TT       11.6     10.4  8.3   16.7
IL-1Ra   Mspa1 11100  C      25.2     22.3  22.0  23.3   CC       8.7      3.7   4.4   1.7
                      T      74.8     77.7  78.0  76.7   CT       33.1     37.2  35.2  43.3
                                                         TT       58.3     59.1  60.4  55.0
IL-4Rα   +1902        A      84.1     83.1  83.1  83.3   AA       71.5     70.4  69.4  73.3
                      G      15.9     16.9  16.9  16.7   AG       25.2     25.4  27.2  20.0
                                                         GG       3.3      4.2   3.3   6.7
IL-12    −1188        A      80       74    73    76     AA       69.4     54.1  53.2  56.7
                      C      20       26    27    24     AC       35.1     39.5  39.9  38.3
                                                         CC       4.4      6.4   6.9   5
IFNγ     UTR5644      A      55.8     53.6  54.7  50.0   AA       30.7     30.5  33.0  22.8
                      T      44.2     46.4  45.3  50.0   AT       50.2     46.2  43.6  54.4
                                                         TT       19.0     23.3  23.5  22.8
TGFβ1    Codon 10     C      45.0     47.6  46.5  50.9   CC       20.2     20.3  19.7  22.4
                      T      55.0     52.4  53.5  49.1   CT       49.6     54.5  53.8  56.9
                                                         TT       30.3     25.1  26.6  20.7
         Codon 25     C      7.9      10.0  9.5   11.2   CC       0.4      0.9   1.2   0.0
                      G      92.1     90.0  90.5  88.8   CG       14.9     18.2  16.8  22.4
                                                         GG       84.6     81.0  82.1  77.6
TNFα     −308         A      10.8     10.0  9.7   10.8   AA       0.8      0.8   1.1   0.0
                      G      89.2     90.0  90.3  89.2   AG       19.8     18.3  17.1  21.7
                                                         GG       79.3     80.9  81.8  78.3
         −238         A      5.5      6.6   6.6   6.7    AA       0.8      0.8   0.6   1.7
                      G      94.5     93.4  93.4  93.3   AG       9.2      11.6  12.2  10.0
                                                         GG       89.9     87.6  87.3  88.3
IL-2     −330         G      37.2     35.2  34.8  36.6   GG       10.4     9.3   9.4   8.9
                      T      62.8     64.8  65.2  63.4   GT       53.5     51.9  50.8  55.4
                                                         TT       36.1     38.8  39.8  35.7
         +160         G      73.5     72.6  73.8  68.8   GG       52.6     54.0  56.4  46.4
                      T      26.5     27.4  26.2  31.3   GT       41.7     37.1  34.8  44.6
                                                         TT       5.7      8.9   8.8   8.9
IL-4     −1098        G      10       11    11    11     GG       0        0.7   0.9   0
                      T      90       89    89    89     GT       19.9     20.3  20    21.1
                                                         TT       80.1     79.1  79.1  78.9
         −590         C      87       80    80    80     CC       0        61.4  60.9  63.2
                      T      13       20    20    20     CT       74       37.3  38.3  34.2
                                                         TT       26       1.3   0.9   2.6
         −33          C      84       80    80    79     CC       70.5     60.8  60.9  60.5
                      T      16       20    20    21     CT       27.4     37.3  37.4  36.8
                                                         TT       2.1      2     1.7   2.6
IL-6     −174         C      33.2     31.0  29.9  34.2   CC       10.2     7.4   6.6   10.0
                      G      66.8     69.0  70.1  65.8   CG       46.0     47.1  46.7  48.3
                                                         GG       43.8     45.5  46.7  41.7
         nt565        A      26.0     27.1  25.5  31.7   AA       6.4      5.4   4.4   8.3
                      G      74.0     72.9  74.5  68.3   AG       39.1     43.4  42.3  46.7
                                                         GG       54.5     51.2  53.3  45.0
IL-10    −1082        A      58.6     59.6  60.6  56.5   AA       37.3     34.1  34.9  31.5
                      G      41.4     40.4  39.4  43.5   AG       42.7     51.1  51.4  50.0
                                                         GG       20.0     14.8  13.7  18.5
         −819         C      73.2     78.4  78.3  78.7   CC       55.5     60.7  59.4  64.8
                      T      26.8     21.6  21.7  21.3   CT       35.5     35.4  37.7  27.8
                                                         TT       9.1      3.9   2.9   7.4
         −590         A      26.8     21.6  21.7  21.3   AA       9.1      3.9   2.9   7.4
                      C      73.2     78.4  78.3  78.7   AC       35.5     35.4  37.7  27.8
                                                         CC       55.5     60.7  59.4  64.8

* Values are the percentage. SSc = systemic sclerosis; lcSSc = limited cutaneous SSc; dcSSc = diffuse cutaneous SSc; IL = interleukin; R = receptor; Ra = receptor antagonist; IFNγ = interferon-γ; TGFβ1 = transforming growth factor β1; TNFα = tumor necrosis factor α.

Gene–gene interaction.

The 4 best SNPs selected by the TuRF filter algorithm were IL-1β C-511T, IL-6 Ant565G, IL-2 G-330T, and IL-1Ra Cmspal11100T for controls versus SSc patients; IFNγ AUTR5644T, IL-6 C-174G, IL-6 Ant565G, and IL-2 G-330T for controls versus lcSSc patients; and IL-6 Ant565G, IL-10 C-819T, IL-2 G+160T, and IL-1R Cpst1970T for controls versus dcSSc patients.

The results of the exhaustive MDR analysis that evaluated all possible combinations of these 4 polymorphisms for each comparison are summarized in Table 3. The best model of each order is shown along with its TA, CVC, and significance level as determined by permutation testing. As can be observed, none of the models were significant for the controls versus SSc comparison, whereas MDR selected the 3-way combination as the best model for both the controls versus lcSSc patients comparison and the controls versus dcSSc patients comparison.

Table 3. Multifactor dimensionality reduction (MDR) analysis*

Comparison         Best combination in each dimension                            TA    CVC    P
Controls vs. SSc   IL-1β C-511T                                                  0.49  7/10   0.883
                   IL-1β C-511T, IL-6 Ant565G                                    0.51  4/10   0.706
                   IL-1β C-511T, IL-6 Ant565G, IL-2 G-330T                       0.5   8/10   0.796
                   IL-1β C-511T, IL-6 Ant565G, IL-1Ra Cmspal11100T, IL-2 G-330T  0.5   10/10  0.796
Controls vs. lcSSc IFNγ AUTR5644T                                                0.46  7/10   0.981
                   IFNγ AUTR5644T, IL-6 Ant565G                                  0.52  6/10   0.583
                   IL-2 G-330T, IFNγ AUTR5644T, IL-6 Ant565G                     0.60  10/10  0.004
                   IL-6 C-174G, IL-2 G-330T, IFNγ AUTR5644T, IL-6 Ant565G        0.55  10/10  0.200
Controls vs. dcSSc IL-6 Ant565G                                                  0.47  4/10   0.967
                   IL-1R Cpst1970T, IL-6 Ant565G                                 0.56  10/10  0.082
                   IL-10 C-819T, IL-1R Cpst1970T, IL-6 Ant565G                   0.57  10/10  0.050
                   IL-10 C-819T, IL-1R Cpst1970T, IL-2 G+160T, IL-6 Ant565G      0.53  10/10  0.455

* Selection of the best combination of attributes by the MDR method. TA = testing accuracy; CVC = cross-validation consistency; see Table 2 for additional definitions.

Controls: n = 242; lcSSc: n = 175; dcSSc: n = 67.

The best model, that is, the model with the maximum TA and with a CVC >5 out of 10, for that comparison. lcSSc, limited cutaneous systemic sclerosis (n = 175); dcSSc, diffuse cutaneous systemic sclerosis (n = 67).

To test whether the preprocessing by the TuRF filter was indeed the optimal strategy for the multilocus analysis, we also ran the MDR on the whole data set with 18 SNPs and compared the results. The 3-locus models reported in Table 3 had a lower training accuracy (TrA) but a much higher TA than the 3-locus models obtained running the MDR on the entire data set (TrA = 0.63, TA = 0.60 versus TrA = 0.65, TA = 0.56 for the controls versus lcSSc patients comparison; TrA = 0.68, TA = 0.57 versus TrA = 0.71, TA = 0.46 for the controls versus dcSSc patients comparison). These results indicate that the TuRF analysis actually reduced the amount of overfitting, improving the signal and helping the MDR algorithm to generate models that are more likely to generalize to independent data sets.

The distribution of cases and controls for the controls versus lcSSc patients comparison and the controls versus dcSSc patients comparison is summarized in Figure 1. Note that the pattern of high-risk and low-risk genotype combinations is nonlinear across each multilocus dimension; this is evidence of gene–gene interaction or epistasis.

Figure 1. Interaction of attributes. Summary of 3-locus genotype combinations associated with A, limited cutaneous systemic sclerosis and B, diffuse cutaneous systemic sclerosis. Each multilocus genotype combination is considered high risk when the ratio of cases to controls exceeds a threshold T, equal to the ratio of cases to controls in each population; otherwise, the cell is classified as low risk. High-risk combinations are depicted as darkly shaded cells, low-risk combinations as lightly shaded cells; empty cells are left blank. For each cell, the left bar indicates the cases, the right bar the controls. The pattern of high-risk and low-risk cells differs across each of the multilocus dimensions; this is evidence of gene–gene interaction or epistasis. IL = interleukin; IFNγ = interferon-γ; IL-1R = interleukin-1 receptor.

When we tried to replicate MDR results by the logistic-based approach, the FITF failed to detect any statistically significant effect (P > 0.05) for any of the comparisons. However, when the FITF was run considering the multilocus variables constructed by the MDR inductive algorithm, we found that these variables were highly significant, both for the controls versus lcSSc patients comparison (P < 0.001) and for the controls versus dcSSc patients comparison (P < 0.001). These results suggest that MDR identified a nonadditive interaction that was not identified by FITF because FITF depends on main effects.

The nonlinear relationship among the attributes both in the controls versus lcSSc patients comparison and in the controls versus dcSSc patients comparison is clearly illustrated in Figure 2, which shows an interaction graph highlighting the amount of information gained about case–control status by putting 2 polymorphisms together using the MDR function. A red or orange line connecting 2 polymorphisms suggests a positive information gain that can be interpreted as a synergistic or nonadditive relationship; a yellow line indicates independence or additivity. The interaction information analysis indicates that for both comparisons, interactions are mostly synergistic and that in the controls versus lcSSc patients comparison, the IFNγ AUTR5644T is the key SNP among the 3 SNPs selected in the best MDR model. On the contrary, no clear-cut distinction can be made for the model involving dcSSc patients. The significant interaction effects in the absence of a main effect, typical of nonlinear interactions (XOR model), confirm the epistatic nature of the interrelationship among cytokine SNPs in the present context.

Figure 2. Interaction graphs. The interaction model describes the percentage of the entropy (information gain) that is explained by each factor or 2-way interaction. The percentage in the node expresses the amount of the label's uncertainty eliminated by the node's attribute and the connection between the relative mutual information; a red or orange line suggests a positive information gain, which can be interpreted as a synergistic or nonadditive relationship; a yellow line indicates independence or additivity. Only the most important attributes selected after filtering (42) are reported for A, the limited cutaneous subset or B, the diffuse cutaneous subset. IFN = interferon; IL-2 = interleukin-2; IL1R = IL-1 receptor.
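
The entropy-based quantities behind such an interaction graph can be illustrated with a short sketch. It assumes that the information gain for a pair of attributes is the mutual information between the joint attribute and case–control status minus the mutual information of each attribute taken alone; the function names and the XOR-like toy data are hypothetical and are not study data.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy in bits of a sequence of discrete values."""
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())

def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))

def interaction_gain(a, b, status):
    """Information about status gained by combining a and b beyond each alone."""
    joint = list(zip(a, b))
    return (mutual_information(joint, status)
            - mutual_information(a, status)
            - mutual_information(b, status))

# XOR-like toy example: neither attribute alone is informative, the pair is.
snp_a = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
status = [0, 1, 1, 0]

main_effect_a = mutual_information(snp_a, status)
gain_ab = interaction_gain(snp_a, snp_b, status)
percent_of_entropy = 100 * gain_ab / entropy(status)
```

A positive gain corresponds to the synergistic (red or orange) connections in the graph, while a gain near zero corresponds to independence or additivity.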

DISCUSSION

The current study was undertaken to determine the contribution of a high number of cytokine SNPs to either the susceptibility to SSc or the expression of the disease (e.g., disease subset). For this purpose we used MDR, a novel computational algorithm, to evaluate gene–gene interactions, as it has become clear that to study complex diseases with a polygenic background, traditional logistic regression analysis approaches are not adequate and may underestimate the genetic contribution to disease in the presence of interactions between loci (28). Indeed, traditional methods suffer from a general lack of power and may provide estimates with large standard errors, thus increasing Type I errors, when dealing with multiple variables and/or small sample sizes. On the contrary, MDR has been developed to overcome these concerns and has proved capable of identifying evidence for high-order gene–gene interactions in the absence of any statistically significant independent main effect (29–32).

Our results indicate that cytokine SNPs do not contribute to susceptibility to SSc per se, but rather they may be important in determining which subset of the disease the patient is likely to develop. Two different 3-factor models were found to be relevant in the susceptibility to each disease subset, both of which display the characteristics of epistasis: none of the cytokines showed independent main effects (Figure 2) and their interactions appeared to be nonlinear (Figure 1). The presence of epistatic interactions among cytokine SNPs may explain the apparently contradictory results we observed after the MDR and the FITF analysis. Indeed, although the FITF method in the presence of additive, dominant, and recessive models is more powerful than other analytical approaches, either parametric or nonparametric, it may not detect interactions when they involve genes with little or no marginal effects (44). In contrast, when this approach was used as part of a multistrategy constructive induction algorithm (26) that involved the removal of noisy attributes by the principles of information theory (e.g., TuRF) (42), the selection of interesting attributes by a nonparametric approach targeted at epistasis (e.g., MDR), and the construction of new multilocus attributes, it confirmed the validity of the 3-factor model sorted out by the MDR algorithm.

As outlined by Moore and Williams (46), it is difficult to make inferences about the biologic significance of a statistical model of epistasis; nonetheless, each of our 3-factor models has its own biologic plausibility. MDR analysis selected the IL-2 G-330T, IL-6 C-174G, and IFNγ AUTR5644T SNPs as the best predictors of lcSSc risk. IL-6 was found to be produced at increased levels by cultured peripheral blood mononuclear cells from SSc patients (47), and increased IL-6 levels were observed in sera from SSc patients, where they also correlated with the degree of skin fibrosis (9). In contrast, IFNγ can down-regulate extracellular matrix synthesis by SSc fibroblasts at the transcriptional level (48), and higher levels of IFNγ were found in lcSSc patients compared with dcSSc patients (49), while the administration of recombinant IFNγ proved effective in ameliorating skin fibrosis (50). Thus, IFNγ may modulate the fibrotic responses mediated by IL-6, and indeed lcSSc patients usually have an indolent fibrotic disease (51). Finally, the contribution of IL-2 polymorphisms to the genesis of lcSSc has already been described, as these polymorphisms were shown to be differently distributed between lcSSc patients and controls in the Italian population (16). As far as dcSSc is concerned, the MDR analysis detected a significant 3-way interaction among the IL-1R Cpst1970T, the IL-6 Ant565G, and the IL-10 C-819T SNPs. All of these cytokines have prominent profibrotic features (52, 53) and a role for IL-1R, IL-10, or IL-6 has already been demonstrated in the development of fibrotic responses in patients with SSc (42, 50–53). Thus the synergistic interaction among these cytokines, as clearly outlined in Figure 2, may promote collagen synthesis or deposition and account for the prominent fibrotic features observed in the dcSSc subset of the disease (51).

Altogether, our results indicate that cytokine SNPs with a profibrotic or a regulatory function on profibrotic interleukins may be important in the susceptibility to SSc subsets, that is, in determining the degree of fibrosis the patient is likely to develop (e.g., dcSSc or lcSSc). These findings are not totally unexpected because fibrosis is the ultimate hallmark of SSc (2). Still, although our models have their own biologic plausibility, a functional study would be the ultimate demonstration of their clinical relevance. However, connecting the high-risk and low-risk genotype combinations with what is known about fibrosis pathways is extremely difficult because much of what is known about the function of these pathways is based on experiments that involve 1 gene at a time. A fully elucidative study of multiple gene–gene interactions could be accomplished by either collecting and analyzing the whole genetic, genomic, and proteomic data from complex biologic systems (52) or studying biologic pathways in simple organisms (46). Nevertheless, these strategies are extremely complex and may present overwhelming technical problems; therefore, an alternative strategy would be to generate via mathematical models (e.g., Petri nets) simpler hypotheses about biochemical systems that can eventually be tested in vitro or in vivo (53). So far these strategies have proved effective in relatively simple settings involving 2-locus epistatic models (54), but they could theoretically be extended to more complex biologic hierarchies that account for additional layers of complexity, thus helping to understand the genotype–phenotype relationship in genetic studies of rare diseases, such as SSc.

A major shortcoming of genetic association studies is that a statistically significant finding may still be a false positive; Wacholder et al (55) defined the probability of this occurring as the false-positive report probability (FPRP). Although we did not formally calculate the FPRP, and no strategy has been outlined for such a calculation in the context of epistasis, several considerations suggest that the probability of false-positive findings in our study is relatively low. First, only a limited number of cytokine SNPs have been described so far (56), which considerably increases the prior probability that the association is real. Second, all of the SNPs we analyzed have been reported to be functionally relevant and to be associated with high or low production of the corresponding cytokine (14, 56), thus further increasing the prior probability. Finally, using the approach described by Moore et al (26), we confirmed the relevance of the SNPs selected by the MDR method by showing that the multilocus attributes constructed by the MDR inductive algorithm are highly significant when analyzed by the logistic regression–based approach FITF (44), thus reducing the chance of falsely identifying these SNPs as important (35).
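To illustrate why a higher prior probability lowers the chance of a false-positive report, the FPRP of Wacholder et al (55) can be written as α(1 − π) / [α(1 − π) + (1 − β)π], where π is the prior probability of a true association, α is the significance level, and 1 − β is the statistical power. The following is a minimal sketch of that formula only; the function name and the numeric inputs are illustrative assumptions and are not values calculated in this study.

```python
def fprp(alpha: float, power: float, prior: float) -> float:
    """False-positive report probability for a statistically significant finding.

    alpha -- significance level (probability of a significant result under the null)
    power -- 1 - beta, probability of a significant result if the association is real
    prior -- prior probability that the association is real (0 < prior < 1)
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    false_positive = alpha * (1.0 - prior)   # significant although no true association
    true_positive = power * prior            # significant and truly associated
    return false_positive / (false_positive + true_positive)


# Hypothetical inputs chosen only to show the direction of the effect.
low_prior = fprp(alpha=0.001, power=0.8, prior=0.001)
high_prior = fprp(alpha=0.001, power=0.8, prior=0.1)
print(f"FPRP with prior 0.001: {low_prior:.3f}")
print(f"FPRP with prior 0.1:   {high_prior:.3f}")
```

With the significance level and power held fixed, raising the prior from 0.001 to 0.1 lowers the FPRP, which is the reasoning behind the first two considerations above.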

In summary, we provide evidence for gene–gene interaction among cytokine SNPs in the context of SSc. By applying the MDR algorithm, it was possible to model these interactions even though the SNPs had no main independent effects and no interaction was detectable by parametric statistical approaches alone. This methodology allowed us to identify the most interesting cytokine SNPs from a large number of analyzed mutations (e.g., 18). We think this information will help future researchers target a smaller number of biologic pathways in an effort to better understand the mechanisms that underlie SSc susceptibility and/or disease expression.
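For readers unfamiliar with MDR, the constructive induction step can be sketched as follows: each multilocus genotype combination is pooled into a high-risk or low-risk group according to its ratio of cases to controls, yielding a single binary attribute. This is a simplified sketch under stated assumptions, not the MDR software used in this study: the function names, the threshold of 1.0 (appropriate for equal numbers of cases and controls), and the toy 2-locus data are all hypothetical, and cross-validation and permutation testing are omitted.

```python
from collections import Counter


def mdr_attribute(genotypes, is_case, threshold=1.0):
    """Label each multilocus genotype as high risk (True) or low risk (False).

    genotypes -- list of tuples, one per individual, holding the genotype
                 at each selected SNP (e.g., coded 0, 1, 2)
    is_case   -- list of booleans, True for patients and False for controls
    threshold -- case:control ratio above which a genotype is high risk
    """
    cases = Counter(g for g, c in zip(genotypes, is_case) if c)
    controls = Counter(g for g, c in zip(genotypes, is_case) if not c)
    rule = {}
    for cell in set(cases) | set(controls):
        # Cross-multiplied comparison avoids dividing by zero in empty control cells.
        rule[cell] = cases[cell] > threshold * controls[cell]
    return rule


def balanced_accuracy(genotypes, is_case, rule):
    """Mean of sensitivity and specificity of the high-risk/low-risk attribute."""
    n_case = sum(1 for c in is_case if c)
    n_ctrl = len(is_case) - n_case
    true_pos = sum(1 for g, c in zip(genotypes, is_case) if c and rule.get(g, False))
    true_neg = sum(1 for g, c in zip(genotypes, is_case) if not c and not rule.get(g, False))
    return 0.5 * (true_pos / n_case + true_neg / n_ctrl)


# Hypothetical toy data: risk depends on the combination of two loci,
# while neither locus alone separates cases from controls.
toy_genotypes = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (0, 1), (1, 0), (1, 1)]
toy_status = [False, True, True, False, False, True, True, False]

rule = mdr_attribute(toy_genotypes, toy_status)
print(sorted(cell for cell, high in rule.items() if high))
print(balanced_accuracy(toy_genotypes, toy_status, rule))
```

In the toy data each locus taken alone is uninformative, whereas the pooled attribute separates cases from controls, which mirrors the situation described above in which interactions are detectable in the absence of main independent effects.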

AUTHOR CONTRIBUTIONS

Dr. Beretta had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study design. Beretta, Scorza.

Acquisition of data. Beretta, Cappiello.

Analysis and interpretation of data. Beretta, Cappiello, Moore, Barili.

Manuscript preparation. Beretta, Cappiello, Moore, Greene.

Statistical analysis. Beretta, Moore, Greene.

Collection of funds. Scorza.

REFERENCES

  • 1
    Abraham DL, Varga J. Scleroderma: from cell and molecular mechanisms to disease models. Trends Immunol 2005; 26: 587–95.
  • 2
    Varga J, Abraham D. Systemic sclerosis: a prototypic multisystem fibrotic disorder. J Clin Invest 2007; 117: 557–67.
  • 3
    Atamas SP, White B. Cytokine regulation of pulmonary fibrosis in scleroderma. Cytokine Growth Factor Rev 2003; 14: 537–50.
  • 4
    Kahaleh MB. Raynaud phenomenon and the vascular disease in scleroderma. Curr Opin Rheumatol 2004; 16: 718–22.
  • 5
    Balkwill F. Cytokines and cytokine receptors. In: Roitt I, Brostoff J, Male D, editors. Immunology. 6th ed. London: Mosby International; 2001. p. 119–29.
  • 6
    Wynn TA. Fibrotic disease and the T(H)1/T(H)2 paradigm. Nat Rev Immunol 2004; 4: 583–94.
  • 7
    Granel B, Chevillard C, Allanore Y, Arnaud V, Cabantous S, Marquet S, et al. Evaluation of interleukin 13 polymorphisms in systemic sclerosis. Immunogenetics 2006; 58: 693–9.
  • 8
    Matsushita T, Hasegawa M, Hamaguchi Y, Takehara K, Sato S. Longitudinal analysis of serum cytokine concentrations in systemic sclerosis: association of interleukin 12 elevation with spontaneous regression of skin sclerosis. J Rheumatol 2006; 33: 275–84.
  • 9
    Sato S, Hasegawa M, Takehara K. Serum levels of interleukin-6 and interleukin-10 correlate with total skin thickness score in patients with systemic sclerosis. J Dermatol Sci 2001; 27: 140–6.
  • 10
    Mavalia C, Scaletti C, Romagnani P, Carossino AM, Pignone A, Emmi L, et al. Type 2 helper T-cell predominance and high CD30 expression in systemic sclerosis. Am J Pathol 1997; 151: 1751–8.
  • 11
    Suarez A, Castro P, Alonso R, Mozo L, Gutierrez C. Interindividual variations in constitutive interleukin-10 messenger RNA and protein levels and their association with genetic polymorphisms. Transplantation 2003; 75: 711–7.
  • 12
    Pociot F, Molvig J, Wogensen L, Worsaae H, Nerup J. A TaqI polymorphism in the human interleukin-1β (IL-1β) gene correlates with IL-1β secretion in vitro. Eur J Clin Invest 1992; 22: 396–402.
  • 13
    Fishman D, Faulds G, Jeffery R, Mohamed-Ali V, Yudkin JS, Humphries S, et al. The effect of novel polymorphisms in the interleukin-6 (IL-6) gene on IL-6 transcription and plasma IL-6 levels, and an association with systemic-onset juvenile chronic arthritis. J Clin Invest 1998; 102: 1369–76.
  • 14
    Hoffmann SC, Stanley EM, Darrin Cox E, Craighead N, DiMercurio BS, Koziol DE, et al. Association of cytokine polymorphic inheritance and in vitro cytokine production in anti-CD3/CD28-stimulated peripheral blood lymphocytes. Transplantation 2001; 72: 1444–50.
  • 15
    Beretta L, Santaniello A, Cappiello F, Barili M, Scorza R. No evidence for a role of the proximal IL-6 G/C -174 single nucleotide polymorphism in Italian patients with systemic sclerosis. J Cell Mol Med 2007; 11: 896–8.
  • 16
    Mattuzzi S, Barbi S, Carletto A, Ravagnani V, Moore PS, Bambara LM, et al. Association of polymorphisms in the IL1B and IL2 genes with susceptibility and severity of systemic sclerosis. J Rheumatol 2007; 34: 997–1004.
  • 17
    Beretta L, Bertolotti F, Cappiello F, Barili M, Masciocchi M, Toussoun K, et al. Interleukin-1 gene complex polymorphisms in systemic sclerosis patients with severe restrictive lung physiology. Hum Immunol 2007; 68: 603–9.
  • 18
    Hutyrova B, Lukac J, Bosak V, Buc M, du Bois R, Petrek M. Interleukin 1α single nucleotide polymorphism associated with systemic sclerosis. J Rheumatol 2004; 31: 81–4.
  • 19
    Crilly A, Hamilton J, Clark CJ, Jardine A, Madhok R. Analysis of the 5' flanking region of the interleukin 10 gene in patients with systemic sclerosis. Rheumatology (Oxford) 2003; 42: 1295–8.
  • 20
    Beretta L, Cappiello F, Barili M, Scorza R. Proximal interleukin-10 gene polymorphisms in Italian patients with systemic sclerosis. Tissue Antigens 2007; 69: 305–12.
  • 21
    Ommen ES, Winston JA, Murphy B. Medical risks in living kidney donors: absence of proof is not proof of absence. Clin J Am Soc Nephrol 2006; 1: 885–95.
  • 22
    Moore JH. The ubiquitous nature of epistasis in determining susceptibility to common human diseases. Hum Hered 2003; 56: 73–82.
  • 23
    Cordell HJ. Epistasis: what it means, what it doesn't mean, and statistical methods to detect it in humans. Hum Mol Genet 2002; 11: 2463–8.
  • 24
    Ritchie MD, White BC, Parker JS, Hahn LW, Moore JH. Optimization of neural network architecture using genetic programming improves detection and modeling of gene-gene interactions in studies of human diseases. BMC Bioinformatics 2003; 4: 28–41.
  • 25
    Martin ER, Bass MP, Gilbert JR, Pericak-Vance MA, Hauser ER. Genotype-based association test for general pedigrees: the genotype-PDT. Genet Epidemiol 2003; 25: 203–13.
  • 26
    Moore JH, Gilbert JC, Tsai CT, Chiang FT, Holden T, Barney N, et al. A flexible computational framework for detecting, characterizing, and interpreting statistical patterns of epistasis in genetic studies of human disease susceptibility. J Theor Biol 2006; 241: 252–61.
  • 27
    Thornton-Wells TA, Moore JH, Haines JL. Genetics, statistics and human disease: analytical retooling for complexity. Trends Genet 2004; 20: 640–7.
  • 28
    Heidema AG, Boer JM, Nagelkerke N, Mariman EC, van der A DL, Feskens EJ. The challenge for genetic epidemiologists: how to analyze large numbers of SNPs in relation to complex diseases. BMC Genet 2006; 7: 23–37.
  • 29
    Julia A, Moore J, Miquel L, Alegre C, Barcelo P, Ritchie M, et al. Identification of a two-loci epistatic interaction associated with susceptibility to rheumatoid arthritis through reverse engineering and multifactor dimensionality reduction. Genomics 2007; 90: 6–13.
  • 30
    Gong R, Liu Z, Li L. Epistatic effect of plasminogen activator inhibitor 1 and β-fibrinogen genes on risk of glomerular microthrombosis in lupus nephritis: interaction with environmental/clinical factors. Arthritis Rheum 2007; 56: 1608–17.
  • 31
    Ritchie MD, Hahn LW, Roodi N, Bailey LR, Dupont WD, Parl FF, et al. Multifactor-dimensionality reduction reveals high-order interactions among estrogen-metabolism genes in sporadic breast cancer. Am J Hum Genet 2001; 69: 138–47.
  • 32
    Xu J, Lowey J, Wiklund F, Sun J, Lindmark F, Hsu FC, et al. The interaction of four genes in the inflammation pathway significantly predicts prostate cancer risk. Cancer Epidemiol Biomarkers Prev 2005; 14: 2563–8.
  • 33
    Briollais L, Wang Y, Rajendram I, Onay V, Shi E, Knight J, et al. Methodological issues in detecting gene-gene interactions in breast cancer susceptibility: a population-based study in Ontario. BMC Med 2007; 5: 22–36.
  • 34
    Andrew AS, Karagas MR, Nelson HH, Guarrera S, Polidoro S, Gamberini S, et al. DNA repair polymorphisms modify bladder cancer risk: a multi-factor analytic strategy. Hum Hered 2008; 65: 105–18.
  • 35
    Heidema AG, Feskens EJ, Doevendans PA, Ruven HJ, van Houwelingen HC, Mariman EC, et al. Analysis of multiple SNPs in genetic association studies: comparison of three multi-locus methods to prioritize and select SNPs. Genet Epidemiol 2007; 31: 910–21.
  • 36
    Subcommittee for Scleroderma Criteria of the American Rheumatism Association Diagnostic and Therapeutic Criteria Committee. Preliminary criteria for the classification of systemic sclerosis (scleroderma). Arthritis Rheum 1980; 23: 581–90.
  • 37
    LeRoy EC, Black C, Fleischmajer R, Jablonska S, Krieg T, Medsger TA Jr, et al. Scleroderma (systemic sclerosis): classification, subsets and pathogenesis. J Rheumatol 1988; 15: 202–5.
  • 38
    White B, Bauer EA, Goldsmith LA, Hochberg MC, Katz LM, Korn JH, et al. Guidelines for clinical trials in systemic sclerosis (scleroderma). I. Disease-modifying interventions. Arthritis Rheum 1995; 38: 351–60.
  • 39
    Bayer PM, Bauerfeind S, Bienvenu J, Fabien N, Frei PC, Gilburd B, et al. Multicenter evaluation study on a new HEp2 ANA screening enzyme immune assay. J Autoimmun 1999; 13: 89–93.
  • 40
    Tseng LH, Chen PJ, Lin MT, Singleton K, Martin EG, Yen AH, et al. Simultaneous genotyping of single nucleotide polymorphisms in the IL-1 gene complex by multiplex polymerase chain reaction-restriction fragment length polymorphism. J Immunol Methods 2002; 267: 151–6.
  • 41
    Little RJ, Rubin DB. Statistical analysis with missing data. New York: Wiley; 2002.
  • 42
    Moore JH, White BC. Tuning relief for genome-wide genetic analysis. Lect Notes Comput Sci 2007; 4447: 166–75.
  • 43
    Hahn LW, Ritchie MD, Moore JH. Multifactor dimensionality reduction software for detecting gene-gene and gene-environment interactions. Bioinformatics 2003; 19: 376–82.
  • 44
    Millstein J, Conti DV, Gilliland FD, Gauderman WJ. A testing framework for identifying susceptibility genes in the presence of epistasis. Am J Hum Genet 2006; 78: 15–27.
  • 45
    Kleinrath T, Gassner C, Lackner P, Thurnher M, Ramoner R. Interleukin-4 promoter polymorphisms: a genetic prognostic factor for survival in metastatic renal cell carcinoma. J Clin Oncol 2007; 25: 845–51.
  • 46
    Moore JH, Williams SM. Traversing the conceptual divide between biological and statistical epistasis: systems biology and a more modern synthesis. Bioessays 2005; 27: 637–46.
  • 47
    Hasegawa M, Sato S, Ihn H, Takehara K. Enhanced production of interleukin-6 (IL-6), oncostatin M and soluble IL-6 receptor by cultured peripheral blood mononuclear cells from patients with systemic sclerosis. Rheumatology (Oxford) 1999; 38: 612–7.
  • 48
    Gillery P, Serpier H, Polette M, Bellon G, Clavel C, Wegrowski Y, et al. Gamma-interferon inhibits extracellular matrix synthesis and remodeling in collagen lattice cultures of normal and scleroderma skin fibroblasts. Eur J Cell Biol 1992; 57: 244–53.
  • 49
    Ingegnoli F, Trabattoni D, Saresella M, Fantini F, Clerici M. Distinct immune profiles characterize patients with diffuse or limited systemic sclerosis. Clin Immunol 2003; 108: 21–8.
  • 50
    Grassegger A, Schuler G, Hessenberger G, Walder-Hantich B, Jabkowski J, MacHeiner W, et al. Interferon-gamma in the treatment of systemic sclerosis: a randomized controlled multicentre trial. Br J Dermatol 1998; 139: 639–48.
  • 51
    Seibold JR. Scleroderma and Raynaud's disease. In: Harris ED, Budd RC, Firestein GS, Genovese M, Sergent JS, Ruddy S, et al, editors. Kelley's textbook of rheumatology. 7th ed. Philadelphia: Elsevier Saunders; 2005. p. 1279–306.
  • 52
    Reif DM, White BC, Moore JH. Integrated analysis of genetic, genomic and proteomic data. Expert Rev Proteomics 2004; 1: 67–75.
  • 53
    Moore JH, Boczko EM, Summar ML. Connecting the dots between genes, biochemistry, and disease susceptibility: systems biology modeling in human genetics. Mol Genet Metab 2005; 84: 104–11.
  • 54
    Moore JH, Hahn LW. Petri net modeling of high-order genetic systems using grammatical evolution. Biosystems 2003; 72: 177–86.
  • 55
    Wacholder S, Chanock S, Garcia-Closas M, El Ghormli L, Rothman N. Assessing the probability that a positive report is false: an approach for molecular epidemiology studies. J Natl Cancer Inst 2004; 96: 434–42.
  • 56
    Hollegaard MV, Bidwell JL. Cytokine gene polymorphism in human disease: on-line databases, supplement 3. Genes Immun 2006; 7: 269–76.