Identification of candidate genes and mutations in QTL regions for immune responses in chicken

Summary There are two categories of immune responses – innate and adaptive immunity – both having polygenic backgrounds and a significant environmental component. In our study, adaptive immunity was represented by the specific antibody response toward keyhole limpet hemocyanin (KLH); innate immunity was represented by natural antibodies toward lipopolysaccharide (LPS) and lipoteichoic acid (LTA). Defining genetic bases of immune responses leads from defining quantitative trait loci (QTL) toward a single mutation responsible for variation in the phenotypic trait. The goal of the reported study was to define candidate genes and mutations for the immune traits of interest in chicken by performing an association study of SNPs located in candidate genes defined in QTL regions. Candidate genes and SNPs in QTL regions were selected in silico. SNP association was based on a custom SNP panel, GoldenGate genotyping assay (Illumina) and two statistical models: random mixed model and CAR score. The most significant SNP for immune response toward KLH was located in the JMJD6 gene located on GGA18. Four SNPs in candidate genes FOXJ1 (GGA18), EPHB1 (GGA9), PTGER4 (GGAZ) and PRKCB (GGA14) showed association with natural antibodies for LPS. A single SNP in ITGB4 (GGA18) was associated with natural antibodies for LTA. All associated SNPs mentioned above showed additive effects.


Genetic bases of immune responses
Immune responses fall into the category of complex or quantitative traits and, as such, they are controlled by multiple genes with different magnitudes of phenotypic effects, along with the impact of the environment. Genomic regions related to the complex traits are defined as quantitative trait loci (QTL). Deciphering genetic bases of complex traits lead from defining of the QTL toward pointing at a single mutation responsible for a considerable amount of the genetic trait variation called a QTN (quantitative trait nucleotide). Molecular dissection from QTL to QTN demands several steps: QTL validation in independent populations, QTL fine mapping, in silico selection of positional and biological candidate genes, selection of SNP (single nucleotide polymorphism) markers located within candidate genes, and finally, an association study of SNPs with phenotypes of interest possibly resulting in QTN identification. Availability of genomewide SNP panels accelerated identification of causal mutations associated with economically important traits in livestock (Dekkers 2012).

Immune responses
Immune response is composed of innate and adaptive responses. In our analysis, innate immunity was represented by natural antibodies (NAbs). Natural antibodies are immunoglobulins that need no exogenous stimulation of the immune system to be secreted by B-1 cells in large quantities (Ochsenbein et al. 1999). NAbs are very effective as a first barrier to pathogen invasion, mostly due to their massive presence in the host organism and polyreactivity (Frank 2002). Therefore, NAbs are considered to be a crucial immune barrier at the initial steps of the immune response, before the acquired antibodies are generated (Siwek & Knol 2005). NAbs bind different highly conserved, homologous epitopes (homotopes), for example, lipopolysaccharides (LPS), the molecule found in the outer membrane of Gram-negative bacteria, or lipoteichoic acid (LTA), which is an ingredient of cell walls of Gram-positive bacteria. In our studies, adaptive immunity was represented by specific antibody response toward keyhole limpet hemocyanin (KLH). KLH is a copper-containing, high-molecularweight protein antigen collected from the hemolymph of the sea mollusk, Megathura crenulata. KLH is commonly used as a soluble model protein known to induce a Th-2-like response (Bliss et al. 1996). KLH is never encountered by birds during their lifetime; therefore, it represents a novel antigen, suitable for measuring primary immune responses.

QTL regions
A QTL for a primary antibody response toward KLH (Siwek et al. 2003) and the QTL for NAbs for LPS and LTA (Siwek et al. 2006) were first detected in an experimental chicken population created by crossing two chicken lines divergently selected for a primary antibody response toward sheep red blood cells (Bovenhuis et al. 2002). Linkage analysis harbored four chicken chromosomes: GGA9 (QTL for Nabs/LPS), GGA14 (QTL for KLH and NAbs/LTA), GGA18 (QTL for NAbs/LPS) and GGAZ (QTL for NAbs/LPS). Subsequently, all these QTL were validated in two independent experimental populations: a cross of two chicken lines expressing different feather pecking behavior (Siwek et al. 2003(Siwek et al. , 2006 and in a cross of White leghorn and Greenlegged Partridgelike (Siwek et al. 2010;Slawi nska et al. 2011). Validated QTL on the GGA9, GGA14 and GGA18 chromosomes were fine mapped using two statistical tools: meta QTL analysis and joined QTL analysis (Slawinska & Siwek 2013). The additional statistical analysis allowed for narrowing down of the QTL confidence intervals. These selected regions in the genome (on GGA9, GGA14, GGA18 and GGAZ) are hypothesized to contain causative mutations underlying genetic variation of innate and adaptive immune responses. Therefore, the goal of the reported study was to define candidate genes and mutations for the immune responses in chickens by performing an association study of the SNPs located in candidate genes.

Animals and phenotypic data
Analysis was carried out in an experimental population, created by crossing two breeds of hens: Green-legged Partridgelike and White Leghorn. Animals were kept on a floor system on a farm at the University of Life Sciences in Lublin, Poland. All chickens were vaccinated according to the routine vaccination schedule, which incorporated a vaccine against Salmonella, Gumboro disease, bronchitis, Bourse Fabricius disease and encephalomyelitis. Population details are given in Siwek et al. (2010). The final F 2 generation consisted of 506 individuals obtained in six hatches. Immune responses were defined as specific antibody response to KLH (SAb-KLH) and natural antibodies (NAb) to the environmental antigens LPS and LTA. Phenotypic data were expressed by titers as the log 2 values of the highest dilution giving a positive reaction as described by Siwek et al. (2003) (for KLH) and by Siwek et al. (2006) (for LTA and LPS). Descriptive statistics of the phenotypic data are given in Table 1.

In silico gene/SNP selection
In silico analysis of positional and functional candidate genes covered four QTL regions associated with anti-LPS, anti-LTA natural antibodies and anti-KLH specific antibodies and located on four chromosomes: GGA9, GGA14, GGA18 and GGAZ. The functions of the genes were subsequently determined based on NCBI, KEGG and Gene Ontology databases. Based on the Biomart (Ensembl) and Genecards (Stelzer et al. 2011) databases, SNPs were selected in coding regions of candidate genes. SNP selection was based on chicken genome build 3.1 (March 2012). Selected SNPs were subsequently analyzed according to Illumina's technical note for the Custom Golden Gate Genotyping Assay.

DNA isolation and SNP analysis
Genomic DNA, which was used as a matrix for the genotyping of SNP markers, was isolated from blood cells. DNA isolation was performed using a commercial kit, MasterPure DNA Purification Kit for Blood (Epicentre â ). The DNA extraction procedure recommended by the manufacturer was modified due to the presence of nucleated erythrocytes in the chicken blood.
An Illumina custom 384-plex oligonucleotide pool assay was designed, and the GoldenGate TM Genotyping Assay (Illumina Inc.) was conducted on 50 ng of genomic DNA according to manufacturer's protocol. Interplate replicates were included as quality control measurements of the overall genotyping experiment. Error rate was below 0.1%. Genotypes were assigned and annotated using GENOMESTUDIO (Illumina Inc.) with a default SNP call threshold of 0.25. Samples with a call rate below 0.90 were reproduced or excluded from the study. SNPs with gen train scores <0.4 were zeroed.

SNP effect estimation
The following random mixed model (RMM) was used to estimate the additive effects of the SNPs: where y represents a value of a considered trait; X is a design vector consisting of 1s; l is a general mean; Z is a design matrix for SNP genotypes, which is parameterized as À1, 0 or 1 for a homozygous, heterozygous and an alternative homozygous SNP genotype respectively; q is a vector of random additive SNP effects; and e is a vector of residuals with e $ Nð0; Ir 2 e Þ, where I is an identity matrix. The covariance structure of q was assumed to be q $ N 0;Ir 2 e Nsnp , with I being an identity matrix,r 2 a representing the additive genetic variance of a given trait estimated by a linear mixed model with a random animal polygenic effect and N SNP being the number of SNPs used (here, 211).
The estimation of parameters of the above model was based on solving the mixed model equation (Henderson 1984 with R represented by Ir 2 e and G represented by Nsnp . The iteration on data technique was based on the Gauss-Seidel algorithm with residuals update (Legarra & Misztal 2008). Consequently, the variance of y is then given by ZGA T + R.
For testing the significance (H 0 : q i = 0 vs. H 1 : q i 6 ¼ 0) of the ith SNP effects in q, the Wald test was used. The test statistic follows under H 0 a normal distribution with mean 0 and variance 1. Test statistics have the following form:

SNP association based on CAR score
The CAR score, a highly effective criterion for variable ranking in linear regression based on Mahalanobis decorrelation of the explanatory variables, proposed by Zuber & Strimmer (2011), was selected as one method to identify a subset of significant SNPs. According to Zuber & Strimmer (2011), this approach is very effective computationally. The CAR scores, xi, defined as x = P À1/2 P Xy were considered the SNP selection criterion, where P denotes the empirical correlation matrix among SNPs, and P Xy is the marginal correlation vector between phenotype data and SNPs. Generally, CAR scores can be interpreted as something between marginal correlations and a standardized regression coefficient. In this model, the null distribution of the empirical CAR scores, used for obtaining type I error rates for SNPs, was defined as , where N is the number of SNPs and s ¼ P ix 2 i . The CAR criterion was evaluated using the R package CARE (R Core Team 2012). The disadvantage of this method seems to be the possibility of overemphasis of a set of SNPs which are close to this significance. This might be because different genes will have different number of SNPs, and moreover, the correlation between SNPs that are close to one another is high. If such a situation occurs, it is a rather marginal problem.

Selection of the most significant SNP
SNPs selected as significant by the above two models were subjected to testing in order to select polymorphisms with possible QTN effects. A mixed model: y = l + X 1 snp + X 2 sex + X 2 hatch + Zg + e, where l is the overall mean, snp is a vector of fixed additive effects of SNPs representing polymorphisms selected as significant by the two previously applied methods, sex is a vector of fixed effects of sex, and hatch is a vector of fixed effects of six hatches; X i represents corresponding design matrices, g is a random additive polygenic effect which follows q $ Nð0; Ar 2 a Þ with an additive polygenic covariance matrix between individuals A and the corresponding design matrix Z, and e represents a residual. Statistically, testing for QTN effect of the ith SNP corresponds to testing the following hypotheses H 0 : snp i = 0 against H 1 : snp i 6 ¼ 0, which was performed by comparing the likelihood of the full model defined above (L f ) with the likelihood of a reduced model with one SNP removed (L r ) using the likelihood ratio test: k = À2( ln L r À ln L f ) with an asymptotic distribution following v 2 1df . Testing was carried out separately for each trait. Nominal P-values were subjected to Bonferroni's correction for multiple testing within traits. Furthermore, to test for dominance effects of the SNPs which showed significant additive effects, a likelihood of the above mixed model with additive effect of the SNPs and a likelihood of the above mixed model with additive and dominance effects were compared separately for each significant SNP, using k. Moreover, for traits for which more than one SNP was identified as significant, significance of an effect of pairwise additive-by-additive epistasis was tested. The effects of QTN models were estimated using proc mixed in SAS software (SAS Institute 2002-2004.

Pathway analysis
The relation between immune responses to three antigens and the associated most significant candidate genes was analyzed using BIOGRAPH (VIB Genetic Service Facility, University of Antwerp; Liekens et al. 2011). Based on this tool, and gene function defined in GeneCards (2013), several functional relations between our traits of interest and their candidate genes were proposed.

Dataset
Each individual was genotyped using the custom assay SNP panel, which consists of 384 SNP markers. Forty SNP markers were removed from the set due to genotyping failure with the Golden Gate genotyping assay. In the final analysis, the SNP selection criteria were applied based on minor allele frequency (MAF), with a cutoff of 0.01, and genotyping quality, with a minimum call rate of 95%. After quality control, 211 SNPs were used in the final analysis, 132 SNPs were removed based on MAF criterion and one SNP was removed based on low call rate. The average MAF was 0.17 for all genotyped SNPs and 0.27 for SNPs selected for further analysis. The average call rate obtained for our dataset was high and amounted to 97.52% for all SNPs and 97.80% for selected SNPs.

Candidate genes
The results of the candidate gene association analysis under the RMM and CAR score are presented in Table 2. Altogether, seven candidate genes for an antibody response to KLH, located on two chicken chromosomes, were selected with the mixed model: five genes located on GGA14 (CARD11, IL9R, MAPK8IP3, PDGFA, PRKCB) and two genes on GGA18 (ITGB4, UNC13D). Association with LPS response was shown for two genes on GGA9 (EPHB1, PROCR), two genes on GGA18 (CRFL3, FOXJ1) and one gene on GGAZ (PTGER4). Association with LTA response was shown for eight genes altogether: four genes on GGA14 (IL9R, MAPK8IP3, PRKCB, MAP2K3), three genes on GGA18 (FOXJ1, ITGB4, JMJD6) and one gene on GGAZ (PTGER4). The CAR score indicated a higher number of candidate genes associated with the traits of interest than did the mixed model, showing associations of an additional 16 genes: seven genes associated with immune response to KLH [one on GGA14 (TRAF7) and three each on GGA9 (EPHB1, KLHL6, PROCR) and GGA18 (FOXJ1, JMJD6, MAP2K4)]; five candidate genes associated with innate immune response to LPS [one gene on GGA9 (KLHL6) and four genes on GGA14 (IL9R, MAPK8IP3, PRKCB, SOCS1)]; and four genes associated with innate immune response to LTA [two genes on GGA14 (CARD11, TNFRSF13B) and one gene each on GGA9 (KLHL6) and GGA18 (ITGB4)].

Selection of most significant SNPs within candidate genes
Polymorphisms with the most significant effects, most likely representing QTNs, are summarized in Table 3. No dominance or epistasis was detected. For immune response to LPS, an additive effect of four SNPs within four candidate genes located on different chromosomes [GGA9 (EPHB1), GGA14 (PRKCB), GGA18 (FOXJ1) and GGAZ (PTGER4)] was identified. For immune response to KLH and LTA, there were two single SNPs (JMJD6 and ITGB4) with additive effects, both located on GGA18.

Discussion
Our original linkage analysis harbored five QTL located on four chicken chromosomes: QTL for immune response toward LPS on GGA9, GGA18 and GGAZ and QTL for immune response toward LTA and KLH, both located on GGA14 (Siwek et al. 2010;Slawi nska et al. 2011). The current study showed that the selected SNPs were located within regions originally suggested by those linkage analyses. But, surprisingly, it also pointed to other genomic regions. Candidate genes for KLH association are located on GGA9 (EPHB1, KLHL6, PROCR) and GGA18 (ITGB4, UNC13D, MAP2K4, FOXJ1, JMJD6). Association with immune response toward LPS was directed toward genes located on GGA14 (MAPK8IP3, IL9R, SOCS1, PRKCB). Candidate genes for LTA association are located on GGA9 (KLHL6), GGA18 (FOXJ1, ITGB4, JMJD6) and GGAZ (PTGER4). There are three possible explanations for this phenomenon. The linkage analysis was based on a limited number of microsatellite markers; therefore, the power of that analysis was much lower than for the association study. Selected candidate genes might play a role in both types of immune responses: innate and adaptive. The linkage analysis in the current population was a validation

Differences between models
In the first step of the analysis, significant SNPs were preselected using two different approaches of an underlying inheritance mode: a mixed model, assuming additive effects of SNPs, and the CAR score, a model-free approach without any assumptions on the genetic effect of the SNP. Moreover, in the mixed model, a (normal) distribution is superimposed on the estimates, and consequently, it resulted in a lower number of significant SNPs. On the other hand, the CAR score applies no shrinkage.  Zakharova et al. (2009) indicated that JMJD6 (jumonji domain containing 6 gene) serves as a membrane-associated receptor that regulates phagocytosis in immature macrophages and is expressed in the cytosol and nucleus of mature macrophage-like cells. This metabolic activity is closely related to KLH's mode of action. It has been shown that KLH induces Th2 immune response and production of IL-4, IL-5, IL-10 and IL-13 cytokines, which promote alternative macrophage activation (Bliss et al. 1996;Allen & Wynn 2011).

Function of candidate genes
LPS activates innate immune response by interaction with its specific toll-like receptor 4 (TLR4). TLR4 triggers a Th1 type of adaptive immune response, which activates macrophages and participates in the generation of Tc cells, resulting in a cell-mediated immune response. Products of three of the four most significant candidate genes associated with LPS immune responses (FOXJ1, EPHB1, PTGER4) are involved in T-cell regulation or proliferation: FOXJ1 (forkhead box J1) participates in the regulation of T-cell tolerance, inhibition of the spontaneous autoimmunity and regulating thymic egress (Srivatsan & Peng 2005).
EPHB1 (EPH receptor B1) encodes Eph kinases, which are the largest family of receptor tyrosine kinases. Eph kinases and their ligands (ephrins) are expressed on the surface of T cells, B cells and monocytes/macrophages (Yu et al. 2004). A role of EPHB1 in immune response has been documented experimentally by Luo et al. (2011), who observed reduced thymus and spleen size and cellularity in double null mutated Ephb1 and Ephb2 mice as well as a significant decrease in the doublepositive and single-positive thymocyte subpopulations and mature CD4 and CD8 cells in the periphery in double knockout mice. PTGER4 (prostaglandin E receptor 4) can dramatically modulate immune response given that prostaglandin E2 production is enhanced during inflammation. Generally, cellular immune response regulation is under the control of distinct EP receptors from which EP4 regulates antigen presenting cell functions (Nataraj et al. 2001). Moreover, protein kinase C, beta (PRKCB), another candidate gene indicated as being related to LPS immune response in our study, is involved in B-cell survival and antigenic response. B cells respond to TLR ligands and present antigen (Lund 2008), organize the structure of lymphoid tissues and regulate lymphangiogenesis. An experiment on mice deficient in protein kinase C beta demonstrated an essential role of this gene in BCR-induced glycolysis in B cells (Blair et al. 2012).
ITBG4 is the most significant gene associated with LTA immune responses in our study. LTA initiates immune response through a very particular pattern recognition receptor: toll-like receptor 2 (TLR2). TLRs are known to interact with macrophages or dendritic cells, known also as antigen presentation cells (APC). Airway epithelial cells have been demonstrated to be accessory APCs, capable of activating T cells, whereas silencing of ITGB4 resulted in impaired antigen presentation and suppressed T-cell proliferation (Liu et al. 2012).

Pathways analysis
Results indicate that a relation between an immune response to LPS of bacterial origin and the FOXJ1 gene can go through different pathways such as LPS -IL10 genenegative regulation of T-cell proliferation -FOXJ1, or LPS -IRAK1 gene/CAT genenegative regulation of NF kappab transcription factor activity -FOXJ1 (Fig. 1). The relation between immune response to LPS and the PTGER4 gene is direct and goes through inflammation. Another connection between LPS and the EPHB1 gene goes through IRAK1/ MAP3K11protein amino acid autophosphorylation -EPHB1 (Fig. 2). The IRAK1 gene (interleukin-1 receptorassociated kinase 1) encodes interleukin-1 receptor-associated kinase 1, one of the two putative serine/threonine kinases that become associated with the interleukin-1 receptor (IL1R) upon stimulation. Serine/threonine protein kinase plays a critical role in initiating innate immune response against foreign pathogens. These kinases are involved in TLR and IL-1R signaling pathways. The CAT gene (catalase) encodes a protein which promotes growth of cells, including T cells and B cells.
The immune response relation between LTA and ITGB4 leads through the RIPK2 gene and VIM gene. Protein encoded by the RIPK2 (receptor-interacting serine-threonine kinase 2) gene contains a C-terminal caspase activation and recruitment domain and is a component of signaling complexes in both the innate and adaptive immune pathways. VIM (vimentin) encodes a protein which is involved in the immune response (Fig. 3).
There is no direct information proposed by BioGraph on the relation between KLH and the JMJD6 gene. However, the link between immune response and the JMJD6 gene goes through various pathways: RAG1 -T-cell differentiation in thymus or TLR4/TLR1macrophage activation. RAG1 (recombination activating 1 gene) encodes a protein involved in the activation of immunoglobulin V-D-J recombination (Fig. 4).

Conclusions
Revealing a genetic architecture of immune responses toward KLH, LTA and LPS led to three general conclusions. First, the most significant SNPs for immune responses toward KLH and LTA are located outside the QTL regions originally proposed by linkage analysis, which indicates its relatively poor resolution. Also, in the search for causal mutations in candidate genes, a linkage analysis can be well regarded as a preliminary tool but not as an indicator of final results. Second, innate and adaptive immunity have some genes in common. Third, immunity predominantly follows an additive mode of inheritance.