Neonatal lupus (NL) is a syndrome with clinical manifestations that include transient cutaneous rash, hematologic and hepatic laboratory abnormalities, and irreversible cardiac damage. The cardiac manifestations of NL (cardiac-NL), which include atrioventricular conduction defects (heart block) and life-threatening cardiomyopathy, are associated with substantial mortality and morbidity (1). While maternal antibodies to components of the SSA/Ro–SSB/La RNP complex are necessary for the development of disease, the rarity of advanced injury (2–5% of cases) suggests that fetal factors contribute substantially to a complex pathogenic cascade that links autoantibodies to cardiac scarring (2). A fetal genetic contribution to the development of cardiac-NL is supported by a high sibling risk ratio (λs = 10–3,000), a recurrence rate of ∼18%, a concordance rate in monozygotic twins of 33% (for review, see ref. 3), and the results of our genome-wide association study (GWAS) of cardiac-NL (4).
While the precise pathogenic cascade is yet to be fully defined, one scenario posits that injury is initiated by increased apoptosis of the cardiocytes (5). This exposes SSA/Ro antigens and associated single-stranded RNA, generating an immune complex that is phagocytosed by macrophages, and sustaining both an inflammatory response and a fibrotic response, resulting in scarring of the heart (5). This hypothesis is supported by the results of our GWAS (4), which identified several genome-wide significant variants in the major histocompatibility complex region, as well as at 21q22.3, near ERG (the gene for erythroblast transformation–specific transcription factor), which is a regulator of embryonic development, cell proliferation, differentiation, angiogenesis, inflammation, and apoptosis.
Despite the significant associations identified in the GWAS (4), we hypothesize that there may be a global enrichment of specific pathways and genetic variation that could potentially be overlooked if only the most significant statistical associations are considered. A judicious and efficient way to identify genetic variation predisposing patients to complex diseases is through the application of a candidate gene framework that incorporates prior biologic knowledge. Unlike an agnostic GWAS, this approach narrows the hypothesis space to provide a more focused and powerful examination of the data. Indeed, several studies have shown that an essentially Bayesian approach to selecting candidate genes serves to increase the reliability and likelihood of finding genes that are truly associated with disease (6, 7).
In accordance with these considerations, we tested for potential enrichment of significant associations in genes with candidate biologic functions by mining the cardiac-NL GWAS. Given their potential contribution to inflammation, fibrosis, and conduction abnormalities in the heart, genes with functions contributing to apoptosis, calcium electrogenesis, immune function, and fibrosis were selected. Figure 1 illustrates this approach to the identification of cardiac-NL risk loci. Several associations within genes in candidate pathways relevant to the pathogenesis of cardiac-NL were identified, supporting the potential of this approach for gene discovery.
Figure 1. Approach for the identification of genes predisposing to cardiac manifestations of neonatal lupus (cardiac-NL). An approach that focuses the hypothesis space based on biologic knowledge provides a more powerful and efficient way to identify disease-causing variation. The identification of an enrichment of associations in genes believed to be relevant to the pathogenesis of cardiac-NL revealed novel associations in genes with such functions.
MATERIALS AND METHODS
Ingenuity Pathways Analysis (www.ingenuity.com) was used to catalog a list of all human genes with known genomic positions and with immune (1,993 genes), fibrosis (327 genes), apoptosis (2,283 genes), T cell (980 genes), cell infiltration (311 genes), innate immune cell (1,381 genes), interferon (102 genes), Toll-like receptor (TLR) (10 genes), and calcium channel (13 genes) functions. Genes with bone functions (n = 568) served as controls. Ingenuity Systems has a comprehensive database that contains biologic findings and annotations, including thorough gene function information, manually curated from the literature and several databases.
All single-nucleotide polymorphisms (SNPs) within a 10-kb interval of each gene identified in our GWAS (4) were considered in this analysis. Briefly, 346,110 SNPs genotyped on the Illumina HumanCNV370 array were analyzed for association with cardiac-NL in the GWAS of 116 Caucasian children with cardiac-NL and 3,351 Caucasian controls. Only SNPs that met the following quality control criteria were considered: 1) no differential missingness between cases and controls (P > 0.05), 2) overall missing genotype data <10%, 3) no significant departures from Hardy-Weinberg equilibrium expectations (P > 0.0001 for cases and P > 0.01 for controls), and 4) minor allele frequency >0.02 in the control samples. Tests for association were adjusted for potential confounding effects of population structure by including principal components derived from the GWAS as covariates in logistic regression models. The primary inference for this study was based on the additive genetic model, unless the lack-of-fit to an additive model was statistically significant (P < 0.05), in which case the minimum P value from the dominant, additive, or recessive models was reported. The observed inflation factor for the original GWAS was 1.026.
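The four quality control criteria above can be expressed as a single filter. The following is a minimal sketch, not the authors' pipeline: the `SnpQc` record and its field names are hypothetical, and only the thresholds are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class SnpQc:
    """Per-SNP quality control summary (hypothetical field names)."""
    snp_id: str
    p_diff_missing: float   # P value for differential missingness, cases vs controls
    missing_rate: float     # overall fraction of missing genotypes
    p_hwe_cases: float      # Hardy-Weinberg P value in cases
    p_hwe_controls: float   # Hardy-Weinberg P value in controls
    maf_controls: float     # minor allele frequency in controls


def passes_qc(s: SnpQc) -> bool:
    """Apply the four criteria stated in the text; thresholds are from the article."""
    return (
        s.p_diff_missing > 0.05        # 1) no differential missingness
        and s.missing_rate < 0.10      # 2) overall missing genotype data <10%
        and s.p_hwe_cases > 0.0001     # 3) Hardy-Weinberg, cases
        and s.p_hwe_controls > 0.01    # 3) Hardy-Weinberg, controls
        and s.maf_controls > 0.02      # 4) minor allele frequency in controls
    )


# Illustrative records only; these are not data from the study.
good = SnpQc("snp_example_1", 0.50, 0.01, 0.30, 0.40, 0.25)
rare = SnpQc("snp_example_2", 0.50, 0.01, 0.30, 0.40, 0.01)
```

Here `passes_qc(good)` is true, whereas `passes_qc(rare)` is false because the control minor allele frequency falls below 0.02.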
The list of SNPs in our GWAS that met statistical quality control criteria and mapped within 10 kb upstream and downstream of each gene was pruned to contain only those SNPs not in linkage disequilibrium (LD) (r2 < 0.4). For each pathway, a Z test for proportions was computed to test for a potential statistically significant difference between the number of observed and expected significant associations (at P < 0.001) using the pruned SNPs that met quality control criteria. To further avoid biases due to LD, the extended HLA region was also excluded.
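The text does not state which pruning algorithm was used. As one plausible illustration, the sketch below applies a simple greedy rule: a SNP is kept only if its r2 with every SNP already kept is below the threshold (0.4 is assumed here). The function name, the pairwise r2 lookup, and the toy values are all hypothetical.

```python
from typing import Callable, Iterable, List


def prune_by_ld(
    snps: Iterable[str],
    r2: Callable[[str, str], float],
    threshold: float = 0.4,
) -> List[str]:
    """Greedy LD pruning (an assumed procedure, not necessarily the authors').

    SNPs are visited in the order given; a SNP is retained only if its
    pairwise r2 with every previously retained SNP is below `threshold`.
    """
    kept: List[str] = []
    for snp in snps:
        if all(r2(snp, other) < threshold for other in kept):
            kept.append(snp)
    return kept


# Toy pairwise r2 values for three made-up SNPs.
toy_r2 = {
    frozenset(("snpA", "snpB")): 0.9,  # snpA and snpB are in strong LD
    frozenset(("snpA", "snpC")): 0.1,
    frozenset(("snpB", "snpC")): 0.2,
}


def lookup_r2(a: str, b: str) -> float:
    """Return the toy r2 for a pair, treating unlisted pairs as unlinked."""
    return toy_r2.get(frozenset((a, b)), 0.0)


pruned = prune_by_ld(["snpA", "snpB", "snpC"], lookup_r2)
```

With these toy values, `snpB` is dropped because it is in strong LD with `snpA`, which leaves `snpA` and `snpC`. The result of a greedy rule depends on the visiting order, which is one reason this should be read as an illustration only.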
Finally, the most significant genes in each of the biologic function categories that showed an enrichment of associations were identified. The top 5 most significant non-HLA SNPs (P < 5 × 10−4) for each function were selected and their regions analyzed using all SNPs that met quality control criteria. Regions in which the most significant association was not corroborated by neighboring SNPs in LD were flagged as likely false-positives.
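The selection step can be sketched as follows, under stated assumptions: the `Assoc` record, the function labels, and the `in_hla` flag are hypothetical, and the HLA record and the record with P = 8 × 10−4 are made up for illustration. The three remaining records use SNP identifiers, P values, and function labels from Table 2.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Assoc(NamedTuple):
    """One SNP association result (hypothetical record layout)."""
    snp_id: str
    p_value: float
    functions: Tuple[str, ...]  # biologic function labels for the gene
    in_hla: bool                # True if the SNP lies in the extended HLA region


def top_hits(
    assocs: Iterable[Assoc], max_p: float = 5e-4, n: int = 5
) -> Dict[str, List[Assoc]]:
    """Return up to `n` most significant non-HLA SNPs with P < `max_p` per function."""
    by_function: Dict[str, List[Assoc]] = defaultdict(list)
    for a in assocs:
        if a.in_hla or a.p_value >= max_p:
            continue
        for f in a.functions:
            by_function[f].append(a)
    return {
        f: sorted(hits, key=lambda a: a.p_value)[:n]
        for f, hits in by_function.items()
    }


records = [
    Assoc("rs1487982", 3.37e-5, ("apop",), False),                  # ST8SIA2
    Assoc("rs2432143", 4.54e-5, ("imm", "apop", "innate"), False),  # ITGA1
    Assoc("rs2272381", 7.34e-5, ("imm", "innate", "apop"), False),  # OPRM1
    Assoc("hla_example_snp", 1e-9, ("imm",), True),                 # made up
    Assoc("weak_example_snp", 8e-4, ("apop",), False),              # made up
]
hits = top_hits(records)
```

In this example, `hits["apop"]` lists rs1487982, rs2432143, and rs2272381 in that order, while the made-up HLA SNP and the made-up SNP above the P value cutoff are excluded.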
RESULTS
The Ingenuity Pathways Analysis database was used to compile a list of all molecules with candidate biologic functions. A total of 1,993 human genes with immune functions and known genomic positions, 327 genes with fibrosis functions, 2,283 genes with apoptosis functions, 980 genes with T cell functions, 311 genes with cell infiltration functions, 1,381 genes with innate immune cell functions, 102 genes with interferon functions, 10 genes with TLR functions, 13 genes with calcium channel functions, and 568 genes with bone functions (to serve as controls) were identified.
A total of 36,434 SNPs that met quality control criteria were identified in 3,121 genes. Of these, 417 SNPs mapped to 45 genes within the extended HLA region. Once this list was pruned to eliminate redundant SNPs in LD, 15,103 SNPs in 3,068 genes remained outside the extended HLA region. Table 1 shows the results of the tests for a potential statistically significant difference between the number of observed and the number of expected significant associations (at P < 0.001) in these genes.
Table 1. P value enrichment in genes with candidate biologic functions*
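|Biologic function||No. of SNPs||Expected no. of SNPs with cardiac-NL associations||Observed no. of SNPs with cardiac-NL associations||Expected probability, %||Observed probability, %||P|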
|Fibrosis||1,556||1.6||9||0.10||0.58||2.27 × 10−9|
|Apoptosis||10,211||10.2||26||0.10||0.25||7.67 × 10−7|
|Innate immunity||5,736||5.7||17||0.10||0.30||2.53 × 10−6|
|T cell||3,804||3.8||11||0.10||0.29||2.23 × 10−4|
|Immune function||8,100||8.1||18||0.10||0.22||5.01 × 10−4|
|Interferon||294||0.3||2||0.10||0.68||1.64 × 10−3|
|Calcium channels||251||0.3||1||0.10||0.40||1.35 × 10−1|
|Control||2,891||2.9||5||0.10||0.17||2.15 × 10−1|
|Cell infiltration||1,321||1.3||2||0.10||0.15||5.54 × 10−1|
|TLRs||26||0.0||0||0.10||0.00||8.72 × 10−1|
As shown in Table 1, 8,100 pruned SNPs that met quality control criteria in genes with immune functions were identified. We would expect 8 of these SNPs to have a cardiac-NL association at P < 0.001. However, we observed 18 SNPs not in LD that exhibited an association at P < 0.001 (more than twice the number of expected SNPs). This 8:18 ratio of expected to observed SNPs was statistically significant (P = 5.01 × 10−4), suggesting an enrichment of associations in genes with immune functions.
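The comparison of expected with observed counts can be illustrated with a short calculation. The article does not spell out the exact form of the Z test, so the sketch below assumes a one-sample, two-sided test of the observed proportion against a null proportion of 0.001, using the normal approximation. The function name is hypothetical.

```python
import math
from typing import Tuple


def enrichment_z_test(
    n_snps: int, n_observed: int, p_expected: float = 0.001
) -> Tuple[float, float, float]:
    """One-sample Z test for a proportion (assumed form, two-sided).

    Returns the expected count, the Z statistic, and the two-sided P value.
    """
    expected = n_snps * p_expected
    # Standard error of the count under the null (binomial variance).
    se = math.sqrt(n_snps * p_expected * (1.0 - p_expected))
    z = (n_observed - expected) / se
    # Two-sided tail probability of the standard normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return expected, z, p_two_sided


# Immune function row of Table 1: 8,100 pruned SNPs, 18 observed associations.
expected_imm, z_imm, p_imm = enrichment_z_test(8100, 18)

# TLR row of Table 1: 26 pruned SNPs, 0 observed associations.
expected_tlr, z_tlr, p_tlr = enrichment_z_test(26, 0)
```

Under this assumed form, the immune function row gives an expected count of 8.1, a Z statistic of about 3.48, and a P value of about 5.0 × 10−4, and the TLR row gives a P value of about 0.87. Both are close to the values in Table 1, which suggests, but does not prove, that a test of this form was used.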
A significant enrichment of P values was also observed for genes with T cell functions (P = 2.23 × 10−4) and interferon functions (P = 1.64 × 10−3). A highly significant enrichment of P values was found for genes with fibrosis (P = 2.27 × 10−9), apoptosis (P = 7.67 × 10−7), and innate immune cell (P = 2.53 × 10−6) functions.
For most of the biologic functions, inclusion of the extended HLA region did not significantly alter these results. The most striking difference was the enrichment of SNP associations within immune function genes (P < 1 × 10−9 with the HLA region), which is expected given the abundance of immune function genes in this region. A stronger enrichment of associations within genes with apoptosis functions (P = 7.35 × 10−9) and innate immune cell functions (P = 3.06 × 10−7) was also observed when the HLA region was included. For the rest of the biologic functions, the enrichment of significant associations remained similar, emphasizing a modest role of HLA variation in these pathways.
Next, the genes driving the enrichment of associations in fibrosis, apoptosis, immune, T cell, innate immune cell, and interferon functions were identified. As expected, the most significant SNPs mapped to the HLA region. The most significant SNP in this region was located in the natural cytotoxicity triggering receptor 3 gene NCR3 (rs2857595; odds ratio 2.37 [95% confidence interval 1.79–3.14], P = 1.96 × 10−9), which was reported in the GWAS (4). Outside of the HLA region, the most significant associations were observed in genes with immune and apoptosis functions (Table 2). In this study, the most significant non-HLA signal was observed in the alpha-2,8-sialyltransferase 8B gene ST8SIA2, which ranked fourteenth in the GWAS when the results were ordered by region.
Table 2. Location of top 5 most significant non-HLA SNPs (P < 5 × 10−4) within each biologic function*
|SNP||Gene||Chr||Position, Mb||Minor allele||Minor allele frequency, cases||Minor allele frequency, controls||P†||OR (95% CI)||Biologic functions|
|rs11260745||EPHA2‡||1||16.339||C||0.11||0.06||2.51 × 10−4||2.27 (1.46–3.53)||Fib, T cell|
|rs7543038||PBX1||1||162.990||T||0.44||0.37||2.52 × 10−4§||2.23 (1.45–3.43)||T cell|
|rs3753473||ADORA1||1||201.398||A||0.13||0.07||2.05 × 10−4¶||2.27 (1.47–3.51)||Fib, apop, innate, IFN|
|rs11811628||ATF3‡||1||210.823||A||0.07||0.03||2.15 × 10−4¶||2.84 (1.63–4.93)||Fib, apop, innate|
|rs2432143||ITGA1||5||52.167||G||0.16||0.09||4.54 × 10−5¶||2.31 (1.54–3.45)||Imm, apop, innate|
|rs1196175||GJA1||6||121.808||A||0.15||0.26||3.08 × 10−4¶||0.47 (0.31–0.71)||T cell|
|rs2272381||OPRM1||6||154.584||G||0.03||0.14||7.34 × 10−5¶||0.19 (0.08–0.43)||Imm, innate, apop|
|rs4392700||PARK2||6||162.644||C||0.24||0.15||1.28 × 10−4||1.84 (1.35–2.51)||Apop|
|rs7002001||CSMD1||8||4.564||T||0.11||0.05||6.33 × 10−5||2.41 (1.57–3.72)||Imm|
|rs1554973||TLR4||9||119.521||G||0.35||0.25||4.46 × 10−4||1.65 (1.25–2.18)||IFN, fib|
|rs2073577||GFI1B||9||134.852||G||0.43||0.36||2.43 × 10−4§||2.26 (1.46–3.48)||T cell|
|rs2244621||PLCB3||11||63.783||T||0.04||0.15||1.24 × 10−4¶||0.22 (0.10–0.48)||T cell, innate|
|rs2472299||CYP1A2||15||72.820||A||0.39||0.27||8.47 × 10−5||1.72 (1.31–2.25)||Fib, imm|
|rs1378942||CSK||15||72.864||G||0.47||0.35||9.95 × 10−5||1.68 (1.29–2.19)||Imm, T cell, innate, apop|
|rs1487982||ST8SIA2||15||90.796||C||0.26||0.17||3.37 × 10−5¶||2.20 (1.52–3.19)||Apop|
|rs611704||GLIS2‡||16||4.335||A||0.08||0.03||9.76 × 10−5¶||2.78 (1.66–4.65)||Fib|
|rs762960||TSPO||22||41.881||T||0.20||0.32||9.60 × 10−5||0.52 (0.38–0.72)||Apop|
As shown in Table 2, immune function genes included the α1 integrin gene ITGA1 (rs2432143; P = 4.54 × 10−5), the protein phosphatase CUB and Sushi multiple domains 1 gene CSMD1 (rs7002001; P = 6.33 × 10−5), and the opioid receptor mu 1 gene OPRM1 (rs2272381; P = 7.34 × 10−5). Apoptosis genes included ST8SIA2 (rs1487982; P = 3.37 × 10−5) and the aforementioned ITGA1 and OPRM1.
The most significant non-HLA associations in genes with T cell functions included the c-Src tyrosine kinase gene CSK (rs1378942; P = 9.95 × 10−5) and the phospholipase C beta 3 gene PLCB3 (rs2244621; P = 1.24 × 10−4) (Table 2). Innate immune cell function genes included the aforementioned ITGA1, CSK, and PLCB3. Among the genes with interferon functions, only 2 regions had SNPs that met quality control criteria and were significant at P < 5 × 10−4: the adenosine A1 receptor gene ADORA1 (rs3753473; P = 2.05 × 10−4) and the TLR-4 gene TLR4 (rs1554973; P = 4.46 × 10−4). Finally, non-HLA fibrosis function genes included the cytochrome P450, family 1, subfamily A, polypeptide 2 gene CYP1A2 (rs2472299; P = 8.47 × 10−5) and the aforementioned ADORA1 and TLR4 (Table 2).
To reduce the potential for false-positive results, the regions of association in the most significant genes with biologic functions were examined (Table 2). With the exception of GLIS2, EPHA2, and ATF3, all of the associations were corroborated by evidence of association from neighboring SNPs that were not included in the original analysis and that were in LD with the associated SNP. In the majority of cases, the signals appeared to be driven by variants within the candidate genes. The exceptions included PLCB3, ADORA1, and the CYP1A2-CSK genes, which were in regions of strong LD (r2 > 0.8). In this situation, the possibility that variation in neighboring genes is responsible for these associations cannot be excluded. For all the other genes, their involvement in physiologic functions relevant to cardiac-NL supports their association with this syndrome.
DISCUSSION
The goal of this study was to identify genetic variation that leads to dysregulation of genes with biologic functions involved in the pathogenesis of cardiac-NL. This candidate gene approach based on prior knowledge was selected because it serves to increase the power, reliability, and likelihood of finding genes that are associated with disease (6, 7). This approach is illustrated in Figure 1.
A highly significant enrichment of associations was observed in fibrosis-related and immune function–related genes, suggesting that genetic variations of these pathways are likely involved in predisposition to cardiac-NL. The specific immune-related functions that appear to have driven the enrichment were apoptosis, innate immune cell, T cell, and interferon functions. Exclusion of genes in the extended HLA region did not alter these results, demonstrating that the results were not driven solely by HLA variation in these pathways. Genes with TLR, cell infiltration, and calcium channel functions did not show an enrichment of associations.
Several genes with potential involvement in cardiac injury were identified. Outside of the HLA region, the strongest signal mapped to ST8SIA2, also known as STX, a member of the glycosyltransferase family 29. This polysialyltransferase catalyzes the transfer of sialic acid to N-linked oligosaccharides and glycoproteins. It is expressed in the neonatal atrium, and its expression was shown to be sufficient to modulate cardiomyocyte excitability (8). Aberrant glycosylation has been shown to modulate cardiac sodium ion channel activity and electrical signaling (8), which supports the idea that this molecule has a role in the etiology of cardiac-NL.
ITGA1, also known as very late activation antigen 1 (VLA-1), encodes the α1 subunit of integrin receptors. It heterodimerizes with the β1 subunit to form a cell surface receptor for collagen and laminin. Integrins, together with multiple cytoskeletal proteins, join the sarcomeric contractile apparatus to the extracellular matrix across the plasma membrane and elicit intracellular signaling pathways promoting the cardiomyocyte hypertrophy program (e.g., MAPK pathway, including ERK-1/2, JNK, and p38 MAPK, which are important for mediating hypertrophic cardiac growth) (9). Indeed, α1β1 integrin is required for down-regulation of profibrotic epidermal growth factor receptor signaling (10), collagen IV synthesis (11), and oxidative stress–mediated damage (12). Down-regulation of α1β1 integrin expression was observed in fibroblasts isolated from patients with scleroderma (13). Furthermore, α1 integrin–null mice develop exacerbated glomerulosclerosis following injury (11) and are more susceptible to fibrosis following injury (12).
Another significantly associated gene, CSMD1, encodes a soluble protein that can block the classical complement activation pathway. CSMD1 is highly expressed in the central nervous system and epithelial tissues, where it may play important roles in controlling complement activation and inflammation (14). It is also expressed in areas of regenerative growth, and it might be developmentally regulated (14). Given that CSMD1 may be a fetal membrane–bound complement regulator, it may provide a mechanism of fetomaternal tolerance during development by protecting the embryo from spontaneous complement activation (15).
This study, which was undertaken to comprehensively analyze all genes with various immune, calcium channel, and fibrosis functions, had several limitations that must be acknowledged. Clearly, associations could be missed due to the genomic coverage of the genotyping array or the a priori selection of specific genes reported in the literature to have the aforementioned functions. Nevertheless, the associations reported herein are supported by the fact that these genes are involved in the proposed pathogenesis of cardiac-NL, which involves dysregulation of both inflammatory and fibrotic responses (5). We also note that, despite the observed enrichment of associations in genes with several candidate biologic functions in the affected child, it is possible that such enrichment reflects an inherited maternal component. Application of this same approach to systemic lupus erythematosus (SLE) cases might facilitate the distinction of fetal effects from maternal effects. However, since many mothers do not have SLE, and likewise many SLE patients do not have anti-SSA/Ro antibodies, the only way to definitively disentangle fetal enrichment from maternal enrichment is through genotyping and performing the same analysis in the mothers of children with cardiac-NL; this is planned for a future study. Followup studies are needed to confirm the results found in this study and to delineate the molecular mechanisms by which these variants predispose to cardiac-NL.
In summary, this is the first pathway-based analysis in children with cardiac-NL. Through this gene function paradigm, a highly significant enrichment of associations in genes with immune and fibrosis functions, as well as novel loci associated with cardiac-NL, was observed. Taken together, these new data expand the list of genes that show association with cardiac-NL and emphasize the potential genetic contribution to dysregulated immune and fibrosis pathways. Identifying the fetal genes that predispose to cardiac scar and understanding how these genetic factors might contribute to pathogenesis should ultimately lead to important opportunities for the delineation of risk profiles and development of therapeutic targets to prevent and possibly reverse cardiac injury in these children.
AUTHOR CONTRIBUTIONS
All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Ramos had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study conception and design. Ramos, Langefeld.
Acquisition of data. Langefeld, Buyon.
Analysis and interpretation of data. Ramos, Marion, Langefeld, Clancy.