Non-synonymous coding mutations in a gene change the resulting protein, no matter where it is expressed, but the effects of cis-regulatory mutations could be spatially or temporally limited – a phenomenon termed limited pleiotropy. Here, we report the genome-wide occurrence of limited pleiotropy of cis-regulatory mutations in barley (Hordeum vulgare L.) using Affymetrix analysis of 22 840 genes in a population of 139 doubled haploid lines derived from a cross between the cultivars Steptoe (St) and Morex (Mx). We identified robust cis-acting expression regulators that segregate as major genes in two successive ontogenetic stages: germinating embryo tissues and seedling leaves from the embryonic axis. We show that these polymorphisms may be consistent in both tissues or may cause a dramatic change in transcript abundance in one tissue but not in another. We also show that the parental allele that increases expression can vary with the tissue, suggesting nucleotide polymorphism in enhancer sequences. Because of the limited pleiotropy of cis-regulating mutations, the number of cis expression quantitative trait loci (cis-eQTLs) discovered by ‘genetical genomics’ is strongly affected by the particular tissue or developmental stage studied. Given that limited pleiotropy is a common feature of cis-regulatory mutations in barley, we predict that the phenomenon would be relevant to developmental and/or tissue-specific interactions across wide taxonomic boundaries in both plants and animals.
Variation in gene expression is a heritable trait, and can be mapped in segregating populations using the approaches of genetical genomics (Jansen and Nap, 2001). For those organisms with sequenced genomes, the approach provides an unprecedented opportunity to compare the genetic position of the gene encoding each transcript with the position of its expression quantitative trait locus (eQTL), thereby making it possible to discriminate between the cis- and trans-regulatory control elements of gene expression for thousands of genes across the genome (Brem et al., 2002; Schadt et al., 2003; Hubner et al., 2005; Keurentjes et al., 2007; West et al., 2007). Cis-acting variation has been proposed as a major determinant of quantitative phenotypic traits, and so the detection of cis-eQTLs is a matter of particular significance (Stamatoyannopoulos, 2004). Additionally, co-segregation of a cis-eQTL with the gene itself confirms its authenticity.
Whereas non-synonymous coding mutations change the resulting protein, no matter where the gene is expressed, the effects of cis-regulatory mutations could be spatially or temporally limited, e.g. to larval anatomy, without affecting the adult, or just to a single organ or tissue, even when the gene is much more widely expressed. This phenomenon is known as reduced or limited pleiotropy (Stern, 2000; Wray, 2007). Similar effects can arise from tissue-specific transcription factors. Limited pleiotropy has been observed in many cases in which cis-regulatory mutations have an ecologically significant phenotypic impact (Wray, 2007). Analogously, Li et al. (2006) have demonstrated the existence of plasticity QTLs (pQTLs) in Caenorhabditis elegans, which result in changing levels of expression in different environments (temperature in their case) that were preponderantly trans-acting QTLs, and which could be responsible for responses to fluctuating environments.
Although variability in gene expression at a genome-wide scale across tissues is well known for both sequenced (cf. Novak et al., 2002) and unsequenced organisms (cf. Druka et al., 2006), there are very few studies focusing on the tissue-specific appearance and behaviour of cis-regulatory elements. Recently, tissue specificity of eQTLs was reported for rats (Petretto et al., 2006). The investigation of pleiotropy of cis-regulatory mutations becomes more challenging with unsequenced species, because the lack of precise information about the physical location of genes makes separation of cis-eQTLs from trans-eQTLs very difficult. Rice synteny was recently employed for physically mapping wheat expressed sequence tags (ESTs), in order to compare the physical position of the ESTs with their expression mapping data obtained using wheat doubled haploid lines (Jordan et al., 2007). Another approach employed for barley (Potokina et al., 2007) was based on the numerous published reports indicating that the proportion of gene expression patterns that can be accounted for by cis-acting versus trans-acting components depends heavily on the threshold applied for eQTL detection (Gibson and Weir, 2005; Hubner et al., 2005; Keurentjes et al., 2007; West et al., 2007; Yamashita et al., 2005). As a result, the highly heritable cis-eQTLs can be widely recognized by their extremely high log of odds (LOD) scores relative to the rest of the genes (Potokina et al., 2007), and confirmed by single-nucleotide polymorphisms (SNPs). Although co-segregation technically only confirms local regulation (Rockman and Kruglyak, 2006), we use the term ‘cis-regulation’ to imply all near-regulation that has not been resolved by recombination. Trans-eQTLs may also have high LOD scores, although less frequently (Luo et al., 2007), but are less easy to reject as false positives.
In the present paper we investigate the tissue-specific appearance of cis-eQTLs in barley, one of the most important crop species with an unsequenced genome. To address this issue we analysed the steady-state mRNA transcript abundance (sometimes referred to as ‘expression’) reported by 22 840 probe sets (from here on termed ‘genes’) on Affymetrix Barley 1 GeneChips across 139 double-haploid (DH) lines of the Steptoe (St)/Morex (Mx) barley mapping population in one tissue, and in a subset of 30 DH lines for a second tissue, with the two tissues being derived from different temporal development stages. Working with the restricted mapping population, we focused only on the highly heritable cis-eQTLs that segregated like major genes and that were reproducible with the larger set of 139 DH lines both in expression and SNP genotype. The number of cis-eQTLs detected with an empirically established statistical threshold was found to be strongly dependent on the particular tissue, demonstrating that limited pleiotropy of cis-regulatory mutations occurs widely in barley. This suggests that the question ‘how many genes are regulated in cis?’ is context-dependent, with different outcomes for different tissues, as well as different crosses, under analysis.
The natural nucleotide polymorphism between St and Mx in the regulatory part of genes may affect an interaction of a gene promoter with RNA polymerase and transcription factors. As a consequence, alleles of St and Mx would result in different levels of gene transcript abundance, and this can be detected by eQTL analysis in the mapping population of DH lines of the St/Mx cross. Because of the limited pleiotropy of cis-regulatory mutations, the linkage might be discovered in one tissue, but not in another. To test this hypothesis, we compared the chromosomal locations of highly significant eQTLs detected in two tissues: the embryo-derived tissue of germinating barley grains (subsequently referred to as ‘embryo’) and leaves of 12-day-old seedlings (‘leaf’), and compared them with SNP data for the same genes.
Overall tissue effect on transcript abundance level and tissue-specific cis-regulators
We started by estimating the overall tissue effect on the 22 840 genes, by comparing the relative abundance of their transcripts across 32 genotypes (two parents and 30 DH lines) in embryo and leaf tissues. Initially, we mainly focused on the between-group variation (expression in embryo vs expression in leaves) caused by tissue-specific gene regulation, without regard to genetic polymorphism within the DH lines. A one-way anova detected 19 958 genes (87% of the total gene number on the array) in which the transcript abundance level differed significantly between tissues (FDR < 0.05; Benjamini and Hochberg, 1995) (Table S1). Among these, approximately half (9985) were higher in the embryo, whereas the other half (9973) were higher in the leaf.
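The per-gene test and the multiple-testing correction described above can be sketched as follows. This is a minimal illustration in Python, not the in-house script used in the study; the function names and toy values are hypothetical, and the conversion of each F statistic to a P-value (which requires the F distribution) is deliberately left out.

```python
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups (e.g. embryo vs leaf)."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def benjamini_hochberg(pvalues, fdr=0.05):
    """Benjamini-Hochberg step-up rule: True where a test is declared significant."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    # Find the largest rank r (1-based) with p_(r) <= (r / m) * fdr.
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            cutoff_rank = rank
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant

# Hypothetical expression values for one gene in two tissues.
f_value = f_statistic([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
# Hypothetical per-gene P-values.
calls = benjamini_hochberg([0.001, 0.04, 0.03, 0.5])
```

In a real analysis the rule would be applied once to the full vector of 22 840 P-values, so that the rank-dependent cut-off reflects the whole array.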
We observed that, for many genes, replacing the St allele with the Mx allele significantly affected the tissue-specific regulation. Contig6206_s_at (Figure 1a) is an example of a gene that was completely suppressed in the embryo (all hybridization signals were classified by the mas 5.0 algorithm as ‘Absent’), but was activated in leaves, although only in DH lines that carried the Mx allele. In contrast, Contig19508_at (Figure 1b) was not detectably transcribed in leaves (all hybridization signals were classified as ‘Absent’), and was specifically expressed in embryo tissue, but again only in those offspring that inherited the Mx allele for the gene. The St allele produced no detectable transcript in either tissue for Contig3642_at (the ‘Absent’ call was obtained for all DH lines carrying the St allele; Figure 1c). Contig13784_at (Figure 2a) was detectably expressed in both tissues; however, the St allele decreased the transcript abundance only in the embryo, whereas no allele-specific gene expression was detected in leaves. Remarkably, the alleles of a particular parent (e.g. Mx) may increase the transcription level of a gene in the embryo, but decrease it in leaves (Figure 2b). To determine the biological relevance of this phenomenon it would be informative to estimate the frequency of occurrence for each class of tissue-specific expression pattern. Technically, this can be achieved via eQTL analysis of the gene expression data available for the 30 DH lines in two tissues, comparing the positions of the eQTLs mapped for the same gene in the two different tissues, as described below.
Tissue-specific cis-regulators and the empirical genome-wide threshold of eQTL significance
In a previous paper (Potokina et al., 2007), we presented evidence for 23 738 eQTLs in this population of 139 DH lines for embryo tissue, whereas in an earlier paper (Luo et al., 2007) we identified expression markers using a subset of 30 DH lines based on leaf tissue. In the latter case, only 30 lines were available, but we now wish to compare the cis-eQTLs for the two tissues in these 30 DH lines to explore limited pleiotropy on a genome-wide scale. The use of such a small population required several key steps to minimize false gene discovery, including reference to the full 139-line set and to SNP data.
We thus carried out eQTL analysis with gene expression data available for the 30 DH lines, and compared the positions of the eQTLs mapped in the two tissues. To avoid false eQTLs artificially created by background noise, only those genes expressed at a detectable level were subjected to eQTL analysis. Thus, out of 22 840 genes on the chip, we analysed the transcript abundance variation for the 15 967 (70%) genes for embryo tissue and the 15 247 (67%) genes for leaf tissue that were classified by the mas 5.0 algorithm as ‘Present’ for at least two out of three replications of St or Mx hybridizations. The ‘Present’ flags designate genes in which the hybridization signal differs from the background significantly (P < 0.05). The two sets of mapped expression profiles for different tissues overlapped for 13 940 genes.
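The detection filter can be illustrated with a short sketch. It assumes one reading of the rule, namely that a gene is retained when at least two of the three replicate hybridizations of either parent are called ‘Present’; the gene names and calls below are invented for illustration.

```python
def passes_presence_filter(st_calls, mx_calls, min_present=2):
    """True if either parent has at least `min_present` 'P' (Present) calls."""
    st_present = sum(call == "P" for call in st_calls)
    mx_present = sum(call == "P" for call in mx_calls)
    return st_present >= min_present or mx_present >= min_present

# Hypothetical detection calls (P = Present, M = Marginal, A = Absent),
# given as (St replicates, Mx replicates) for each gene.
detection_calls = {
    "gene_a": (["P", "P", "A"], ["A", "A", "A"]),  # expressed from St only
    "gene_b": (["A", "A", "A"], ["P", "A", "M"]),  # a single Present call
    "gene_c": (["P", "P", "P"], ["P", "P", "P"]),  # expressed from both
}

retained = [gene for gene, (st, mx) in detection_calls.items()
            if passes_presence_filter(st, mx)]
```

Filtering on the parents rather than on the DH lines keeps genes that are expressed from only one parental allele, which is the class of most interest here.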
Altogether, 15 967 (embryo) and 15 247 (leaf) genes were each tested by composite interval mapping (CIM) for linkage between transcript level variation and a marker position in one of the 209 recombination bins of the seven chromosomes in this restricted population. The resulting LOD profiles were examined to determine eQTL peaks, as described previously (Potokina et al., 2007). To establish a threshold for declaring eQTLs to be statistically significant, we used the global permutation threshold (GPT) approach (West et al., 2007) (see also Experimental procedures). Accordingly, each eQTL peak was assigned to the corresponding P-value, reflecting its genome-wide significance obtained by the permutation approach.
Not only do false eQTLs arise from the multiple testing of thousands of expression traits, but they also arise from the very small size of the mapping population (n = 30) for which we had expression data for both tissues. The linkage obtained might be consistent for the particular set of 30 genotypes and, therefore, it would pass the permutation test, but it is at risk of not being reproducible with a larger set of recombinants. The smaller the subset of genotypes, the more likely it is that a linkage reflects a spurious association arising from a biased selection of the 30 genotypes. For this reason, large mapping populations (n > 100) are routinely used in QTL mapping.
To account for these statistical issues, we have established an empirically estimated genome-wide threshold for the eQTL significance. The 30 DH lines analysed for eQTLs in the present study were a subset of the larger mapping population of 139 DHs previously subjected to eQTL analysis using transcript abundance in embryo tissue (Potokina et al., 2007). We evaluated the proportion of the eQTLs detected with the 30-lines set that was also reproducible with the 139-DH population, at different thresholds of the eQTL significance obtained via permutation tests. Table 1 shows that a genome-wide significance threshold of P ≤ 0.0004 allows us to keep the empirical false discovery rate (FDR) at the 5% level for the panel of 30 DH lines. The significance threshold obtained was comparable with the threshold (P < 0.001, FDR < 0.05) detected previously for a panel of 30 rat recombinant inbred strains (Hubner et al., 2005; Petretto et al., 2006). Using the established significance threshold (P ≤ 0.0004, FDR < 0.05), 1527 eQTLs were detected for the embryo tissue, and 1158 eQTLs were detected for the leaf tissue. Multiple eQTLs were identified in each tissue for ∼2% of these genes, and were removed from the analysis. This left 2081 unique eQTLs (one eQTL per gene), of which 1498 and 1134 eQTLs were found in the embryo and leaf tissue, respectively; of these, 551 eQTLs were common to both tissues (Table S2).
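The empirical estimate described in this paragraph can be sketched as follows, assuming that each eQTL detected in the 30-line set has already been flagged as supported or not supported by the 139-line population. The function name and the records are hypothetical and do not reproduce the values in Table 1.

```python
def empirical_fdr(eqtls, p_threshold):
    """Fraction of eQTLs passing `p_threshold` in the 30-line set that were
    NOT reproduced in the 139-line set.

    eqtls: list of (p_value_30_lines, supported_in_139_lines) tuples.
    Returns None when no eQTL passes the threshold.
    """
    selected = [supported for p, supported in eqtls if p <= p_threshold]
    if not selected:
        return None
    return 1 - sum(selected) / len(selected)

# Hypothetical records: (genome-wide P in 30 lines, reproduced in 139 lines?)
records = [
    (0.0001, True), (0.0003, True), (0.0004, True),
    (0.001, False), (0.01, True), (0.04, False),
]

strict = empirical_fdr(records, 0.0004)
relaxed = empirical_fdr(records, 0.05)
```

Scanning a range of thresholds with such a function and choosing the most permissive one that keeps the estimate at or below 5% mirrors the logic used to arrive at P ≤ 0.0004.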
Table 1. Comparisons of the numbers of expression quantitative trait loci (eQTLs) controlling gene expression in the embryo detected at various LOD thresholds in the 30 double-haploid (DH) line population compared with those confirmed in the 139-line population
LOD threshold for eQTLs in 30-DH population | Genome-wide significance after a permutation test with the 30-line population (P-value) | Total number of eQTLs detected with the 30-line population | Number of eQTLs detected with the 30-line population and supported with 139-line population
The selected eQTLs represent the highest LOD scores relative to the rest of the genes: they fall into the 95% and 96% percentile boundaries of the total distribution of all LOD scores detected for embryo and leaf tissue (data not shown), and, therefore, they are most likely to be cis-regulators (Gibson and Weir, 2005; Hubner et al., 2005; West et al., 2007; Yamashita et al., 2005). To obtain experimental support for this assumption, the positions of 98 randomly chosen genes mapped using SNPs were compared with the positions of their eQTLs. The SNPs and corresponding eQTLs co-segregated among the set of 30 DH lines for 93 out of the 98 genes. If the eQTL and SNP genotypes were independent, the probability of all 30 co-segregating for a given gene would be 0.5^30, i.e. ∼1 × 10^−9. Based on this, we estimate that 95% of the selected eQTLs are cis-eQTLs; the remaining 5% map elsewhere and may be strong trans-eQTLs or cis-eQTLs associated with paralogues.
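The two figures quoted in this paragraph follow from simple arithmetic, shown here as a short Python check. The only assumption is the one stated in the text: under independence, the eQTL and SNP genotypes of each DH line match with probability 0.5.

```python
n_lines = 30

# Chance that an unlinked eQTL and SNP agree in every one of the 30 lines.
p_chance = 0.5 ** n_lines

# Proportion of the SNP-mapped genes whose eQTL co-segregated with the SNP.
cis_fraction = 93 / 98

print(f"{p_chance:.2e}")      # 9.31e-10
print(f"{cis_fraction:.1%}")  # 94.9%
```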
Co-appearance of cis-eQTLs when two tissues are compared
Figure 3 shows examples for three cases of co-appearance of those 2081 eQTLs in the two tissues. In addition, the LOD scores of all the 2081 eQTLs that were significant in one tissue (P < 0.0004) were plotted for the other tissue (Figure 4). For the first group of 1083 genes, an eQTL was mapped in the same position for both embryo and leaf tissues (Figures 3a and 4a–c). Among these, 551 eQTLs were highly significant (P < 0.0004) in both tissues (Figure 4a). Another 532 eQTLs were highly significant (P < 0.0004) in one tissue and significant in the second tissue when the threshold was lowered to P < 0.05 (Figure 4b,c). These 532 eQTLs were not considered as tissue specific because at least 20% of the eQTLs mapped in the 30-DH line population with P < 0.05 could be the true positives (Table 1). So, if the eQTL was mapped in the same position in both tissues, with a level of significance in one tissue of P < 0.0004 and in another of P < 0.05, it could not be reliably classified as tissue-specific.
For the second group of 615 genes (Figure 3b), an eQTL was mapped in embryos at P ≤ 0.0004, but it could not be detected even at P < 0.05 in leaf tissue (Figure 4d). The absence of the corresponding eQTLs in leaf tissue resulted from two main reasons: (i) the gene was not detectably expressed in leaf tissue, making any eQTL detection impossible (236 genes); (ii) the gene was detectably expressed in leaf tissue, but there was no genetical variation in expression level; i.e. no detectable eQTL (379 genes). The 379 P-values for the corresponding LODs in leaf tissue were far from significant, even at P < 0.05: for 80% of them, the P-values ranged from 0.98 up to 1 (data not shown). For the third group of 383 genes (Figures 3c and 4e) eQTLs were mapped in leaf tissue but not in embryos, for the same reasons as above (126 and 257 genes, respectively).
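The three groups can be expressed as a small decision rule. The sketch below assumes that the eQTL peaks for a gene have already been matched by map position, and uses `None` for a tissue in which the gene was not detectably expressed; the function name and labels are illustrative only.

```python
HIGH = 0.0004  # threshold for a highly significant eQTL
LOW = 0.05     # relaxed threshold applied to the second tissue

def classify(p_embryo, p_leaf):
    """Assign a gene to one of the co-appearance groups from its two P-values."""
    def level(p):
        if p is None:
            return "absent"   # gene not detectably expressed in this tissue
        if p <= HIGH:
            return "high"
        if p < LOW:
            return "low"
        return "none"

    embryo, leaf = level(p_embryo), level(p_leaf)
    if embryo == "high" and leaf in ("high", "low"):
        return "both tissues"
    if leaf == "high" and embryo in ("high", "low"):
        return "both tissues"
    if embryo == "high":
        return "embryo only"
    if leaf == "high":
        return "leaf only"
    return "not selected"
```

For example, `classify(0.0001, 0.03)` returns `"both tissues"`, reflecting the decision not to call an eQTL tissue-specific when it reaches P < 0.05 at the same position in the second tissue.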
We were able to check the extent to which limited pleiotropy was being misidentified as a result of genuine eQTLs failing to reach significance in the small set of 30 DH lines (false negatives). We took those 383 cases where an eQTL was clearly identified in leaf tissue (P < 0.0004; false-positive rate < 5%), but not in embryos (P > 0.05), both based on 30 DH lines. We then looked for evidence of significant eQTLs for embryo tissue among these 383 genes based on an analysis of all 139 DH lines, and found 73 such genes (19%). This suggests that ∼81% of the cases of limited pleiotropy are correctly identified. We do not have data for the reciprocal set because expression in leaf tissue was not studied on the 139-DH lines set, but we assume that the same confidence applies.
In summary, the eQTL analysis in embryos and leaf tissue yields 1498 and 1134 detected cis-eQTLs, respectively; combining both tissues gives 2081 cis-eQTLs. Thus, by adding another tissue for eQTL analysis, we increased the number of detected cis-eQTLs by 39% compared with those detected in embryos, and by 84% compared with those detected in leaf.
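The combined total and the two percentage gains can be verified from the counts reported above (1498 embryo, 1134 leaf, 551 highly significant in both); the variable names are illustrative.

```python
embryo, leaf, shared = 1498, 1134, 551

# Inclusion-exclusion: genes with a cis-eQTL in at least one tissue.
combined = embryo + leaf - shared

# Relative gain from adding the second tissue to each single-tissue result.
gain_vs_embryo = (combined - embryo) / embryo
gain_vs_leaf = (combined - leaf) / leaf

print(combined)                  # 2081
print(f"{gain_vs_embryo:.0%}")   # 39%
print(f"{gain_vs_leaf:.0%}")     # 84%
```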
Reversed effect of parental alleles in different tissues
The first group (i.e. genes with eQTLs mapping to the same location in both tissues) was further characterized by identifying tissue-dependent cis-eQTLs that undergo a reciprocal change in the parent that contributes the allele with the most abundant transcript. Thirty-four of the 1083 genes were found to invert the direction of the eQTLs between tissues (Table S3). The example of Contig4374_s_at (Figure 2b) shows that alleles of St maintain a consistent level of expression in both leaves and embryos, whereas alleles of Mx increase transcription levels in embryos, but decrease transcription levels in leaves. The results can be explained by assuming mutations exist in the putative enhancer sequence for the gene. The enhancers being targeted for tissue-specific or temporal regulation may recruit either negative or positive regulators of gene expression, depending on the developmental stage (Bilic et al., 2006; Lewin, 1997). For this particular sample, this might be the case for DH lines that inherit the Mx allele. The mutations occurring in St might disable the enhancer function, thereby reducing tissue-specific gene regulation. Of the 34 genes, 19 had alleles generating higher transcript abundance in Mx embryos and St leaves, whereas 15 had the opposite effect (i.e. higher transcript abundance in Mx leaves and St embryos).
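A reversal of this kind can be detected by comparing the sign of the allelic effect in the two tissues. The sketch below assumes the effect is summarized as the difference between the mean expression of lines carrying the Mx allele and those carrying the St allele; the function names and expression values are hypothetical.

```python
from statistics import mean

def allelic_effect(expression, genotypes):
    """Mean expression of Mx-allele lines minus mean of St-allele lines."""
    mx = [x for x, g in zip(expression, genotypes) if g == "Mx"]
    st = [x for x, g in zip(expression, genotypes) if g == "St"]
    return mean(mx) - mean(st)

def reversed_effect(effect_embryo, effect_leaf):
    """True when the higher-expressing parental allele differs between tissues."""
    return effect_embryo * effect_leaf < 0

# Hypothetical normalized expression of one gene in four DH lines.
genotypes = ["Mx", "Mx", "St", "St"]
embryo_expr = [2.0, 2.2, 1.0, 1.2]   # Mx allele higher in embryo
leaf_expr = [0.5, 0.7, 1.0, 1.2]     # Mx allele lower in leaf

effect_e = allelic_effect(embryo_expr, genotypes)
effect_l = allelic_effect(leaf_expr, genotypes)
is_reversed = reversed_effect(effect_e, effect_l)
```

A sign comparison alone is meaningful only for genes whose eQTL is significant in both tissues, which is why the text restricts this analysis to the first group of 1083 genes.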
Gene function prevalence among the genes with tissue-specific activity of cis-regulators
The gene ontology (GO SLIMS) classification provided by GeneSpring 7.2 was employed to predict functions for the cis-regulated genes based on the latest annotation information available. We used the annotations for the genes with detected cis-eQTLs to investigate possible cellular activity associated with revealed tissue-specific cis-regulators. Focusing on the functional category ‘Biological Processes’, 4549 genes can be annotated (20% of all the genes on the Affymetrix Barley 1 GeneChip). We investigated whether the tissue-specific cis-regulators represent a random sample of the total number of annotated barley genes, or if there is over-representation of detected tissue-specific cis-factors from one or more GO classifications related to the particular biological goals. To make this comparison, we considered only those functional categories where both 615 embryo-specific and 383 leaf-specific cis-regulators were represented sufficiently to permit a chi-squared test (i.e. ≥ 5 genes expected per category). The selected functional categories included genes with unknown biological processes (GO:0000004), regulation of gene expression, epigenetic (GO:0040029), electron transport (GO:0006118) and genes involved in high-level processes such as ‘nucleobase, nucleoside, nucleotide and nucleic acid metabolism’ and ‘cell communication’.
Figure S1 shows the frequency distribution of the functional categories among (i) all 4549 annotated genes, (ii) annotated genes with eQTLs observed both in embryo and leaf tissues, (iii) annotated genes with eQTLs detected in embryo tissue only and (iv) annotated genes with eQTLs detected in leaf tissue only. For each of the latter three groups we compared the distribution of functional categories against the distribution of those categories in the total sample of 4549 annotated genes.
The distribution of functional category frequencies among leaf-specific cis-eQTLs alone significantly differs from the distribution of functional categories among the total number of annotated barley genes (P = 0.010), mostly because of over-representation of genes involved in electron transport pathways. This might be expected given the photosynthetic electron transport processes preferentially activated in leaf tissue. On the other hand, 23% of the leaf-specific cis-factors related to electron transport showed homology to the cytochrome P450 gene family. In plants, P450s are known to play important roles in many processes, including the production of hormones, pigments, oils and defensive compounds (Nguyen et al., 2001).
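The comparison of category frequencies can be illustrated with a goodness-of-fit sketch. It assumes that the proportions among all annotated genes are treated as fixed expected proportions and that five categories are compared (four degrees of freedom); the counts are invented and do not reproduce Figure S1. The closed-form tail probability used here is valid only for an even number of degrees of freedom.

```python
from math import exp, factorial

def chi_square_statistic(observed, reference_counts):
    """Pearson chi-squared statistic of `observed` against reference proportions."""
    total_obs = sum(observed)
    total_ref = sum(reference_counts)
    expected = [total_obs * r / total_ref for r in reference_counts]
    if min(expected) < 5:
        raise ValueError("every category needs an expected count of at least 5")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_sf_even_df(x, df):
    """Upper-tail probability of the chi-squared distribution (even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2
    return exp(-half) * sum(half ** i / factorial(i) for i in range(df // 2))

# Hypothetical counts per functional category (same category order in both).
all_annotated = [2000, 400, 300, 800, 500]
leaf_specific = [40, 10, 20, 20, 10]

stat = chi_square_statistic(leaf_specific, all_annotated)
p_value = chi_square_sf_even_df(stat, df=len(all_annotated) - 1)
```

In this toy example the third category is over-represented relative to its expected count, which is the kind of departure the text attributes to electron transport genes in leaf tissue.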
In this paper, we have described the genome-wide occurrence of limited pleiotropy of cis-regulatory mutations in barley. The tissues in this analysis represent two successive ontogenetic stages of the barley plant: embryo-derived tissues and leaves from seedlings developed from the embryonic axis. We showed that polymorphism between St and Mx in cis-regulatory regions may alter gene transcription for only one of the two developmental stages, supporting reports that mutations in regulatory regions may sometimes have few or no pleiotropic consequences (Carroll, 2005; Stern, 2000).
The limited pleiotropy of cis-regulatory mutations was recently suggested as one of the possible reasons why selection could generally operate more efficiently and flexibly on cis-regulatory mutations than on coding mutations (Wray, 2007). In contrast with the extensive pleiotropic effects that may arise from mutations within protein-coding regions, mutations in cis-regulatory regions may affect gene transcription in just one crucial cell type out of several where it is expressed. This optimizes the use of genetic variation while avoiding the extremely deleterious effects on fitness (Carroll, 2005). One of the well-established examples discussed in terms of the evolutionary significance of limited pleiotropy (Wray, 2007) is a cis-regulatory SNP in the DARC locus (Duffy blood group, chemokine receptor). The cis-regulatory SNP abolishes the DARC transcription specifically in red blood cells, whereas in several other tissues and cell types the DARC locus is still expressed normally (Iwamoto et al., 1996; Peiper et al., 1995). Individuals lacking DARC expression in erythrocytes show no adverse health consequences, but become completely resistant to infection with the malarial parasite. Thus, the single cis-regulatory SNP results in a phenotype expected to provide a substantial fitness gain (Wray, 2007).
With the currently available methods of eQTL mapping in a segregating population, limited pleiotropy could be traced by tissue (developmental stage)-specific detection of cis-eQTLs. In our study, we only analysed a fraction of all of the cis-regulatory factors that could possibly be revealed in barley with the St/Mx cross. This highly selected set of 2081 genes showed the highest LOD scores for eQTLs compared with the rest of the genes, providing the opportunity to confidently identify them as cis-regulated loci. For approximately half (1083) of these genes, the cis-regulatory mutations appeared to work in the embryo of germinating grains as efficiently as in the leaves of seedlings. For the remaining 998 genes, the cis-factor activity was tissue-specific. The tissue-specific activity of the cis-factors appeared to be associated with the cellular processes that were preferentially activated in the corresponding tissue. For example, in leaf tissue a higher proportion of cis-regulated genes associated with photosynthetic electron transport processes was detected. In as few as 35% of the 998 genes, cis-eQTL variation could not be identified in one of the two tissues because of the complete suppression of gene expression in that tissue. For the remaining 65%, the genes were detectably expressed in both tissues, but nucleotide polymorphism between St and Mx in cis-regulatory regions caused dramatic regulatory changes (and, consequently, changes in the appearance of eQTLs) in only one tissue. Therefore, the data suggest that the limited pleiotropy of cis-regulatory mutations is widely distributed in barley.
Assuming that limited pleiotropy is a common feature of cis-regulatory mutations not only in barley, but also in other species (e.g. man), one could predict that the phenomenon would be relevant to some age-related disorders and/or tissue-specific syndromes in humans, with implications for ‘personalized medicine’. Indeed, the complex genetic component of many age-related disorders is well reported (Ruse and Parker, 2001). If the inheritance of a particular parental allele in a cis-regulatory region causes dramatic alterations of gene expression, some of the inherited alleles may be associated with a significant deleterious effect on fitness. However, transcription of many genes, particularly in higher eukaryotes, is dependent upon multiple physiological signals (Ptashne and Gann, 1997). Thus, the deleterious effect of the inherited parental allele can be temporarily concealed until the signal is received at a certain developmental stage. If negative action of the alleles appears later in life, the deleterious alleles might be kept in a population, as the effects of natural selection decline with age (Medawar, 1952).
With the restricted examples of two barley genotypes we investigated how the natural nucleotide polymorphism of DNA may modify variation of gene transcription efficiency. The most common level of transcription regulation is interaction of a gene promoter with RNA polymerase and other transcription factors. Consequently, nucleotide substitutions in binding sites upstream of the start-point of transcription, such as promoter, cis-acting motifs (short sequences recognizable by transcription factors) or enhancers (DNA sequences that can activate gene transcription from remote positions) would change the level of transcript abundance. As a consequence, eQTL analysis detects significant linkage between polymorphism at the genetic locus and variation in its transcript abundance. In that situation, the nucleotide polymorphism assigned to one genotype (e.g. Mx) may favour transcription efficiency, whereas alleles of the other (e.g. St) may not. Remarkably, this Morex (+)/Steptoe (−) pattern could be tissue-specific in cases where nucleotide polymorphism occurs in enhancer sequences, which are targets for tissue-specific or temporal regulation (Lewin, 1997). It was previously reported (e.g. Bilic et al., 2006) that enhancers may recruit either negative or positive regulators of gene expression, depending on the developmental stage. In our study, such nucleotide polymorphism in enhancer sequences between Steptoe and Morex genotypes can be hypothesized for the 34 genes in which we detected a reciprocal change in the parent that contributes the allele with the most abundant transcript in the two tissues.
In summary, the number of detected cis-eQTLs is always a function of nucleotide polymorphism content between parents in the particular cross. For example, the same cis-eQTL being mapped in the DH population from barley genotypes in the St/Mx cross may not be detectable in mapping populations of other barley genotypes (e.g. Barke/Mx cross), simply because there is no polymorphism between Barke and Mx for the particular regions affecting the transcript abundance of that gene. Certainly, it would not mean that the gene is not regulated ‘in cis’, because the promoter and cis-motifs have key impacts on the transcription of any gene. In addition, our results showed that because of limited pleiotropy of cis-regulating mutations, the number of cis-eQTLs discovered by the genetical genomics approach is strongly affected by the experimental situation (e.g. particular tissue, fixed developmental stage). In the present study, the addition to the eQTL mapping experiments of a second tissue increased the number of detected cis-eQTLs by 39% (embryo) and 84% (leaf). Thus, instead of the question ‘how many genes are regulated in cis?’ we should ask more accurately ‘how many cis-regulatory mutations can be detected with the particular cross for a given tissue?’.
We used mRNA from the embryo-derived tissue of germinating grains for expression profiling of 30 recombinant lines of a St × Mx DH population (Kleinhofs et al., 1993). The same set of DH lines was investigated previously for the purpose of single feature polymorphisms (SFP) detection (Luo et al., 2007).
Embryo-derived tissues [coleoptile (shoot sheath), plumule (foliage leaves), shoot apical meristem, scutellum, radicle (embryonic root), calyptra (root cap), coleorhiza (root sheath) (see also http://www.seedbiology.de)] from three grains were dissected as a single tissue piece and were flash frozen in liquid nitrogen. Germination and tissue collection were repeated for all lines with complete randomization of the Petri plates on each of three occasions. For each line, tissues from all three occasions were bulked before RNA isolation.
To obtain seedling leaf tissue, 10 sterilized seeds per line were sown in each of three replicate 13-cm² pots. One pot of every member of the ‘trial set’ was randomized in each of three randomized blocks, and each block was placed in a separate Snijders growth cabinet with a 16-h light (400 μE m⁻² sec⁻¹)/8-h dark, 17°C/12°C, cycle. After 12 days, leaves of seven or eight seedlings from each pot were collected, bulked and flash frozen in liquid nitrogen; tissues from all three replicate pots of each line were bulked for RNA isolation.
The RNA isolated from these bulks was used to hybridize to the microarray. A single hybridization from each of 30 DH lines was performed, but three biological replicate hybridizations were used for the parental genotypes Mx and St. RNA was isolated, processed and hybridized to the Barley 1 GeneChip (Affymetrix product #900515 GeneChip® Barley Genome Array; complete description and references can be found at http://www.affymetrix.com/products/arrays/specific/barley.affx) using previously described Trizol procedures (Caldo et al., 2004). The labelling, hybridization and GeneChip data acquisition were conducted at the GeneChip facility at Iowa State University (http://www.biotech.iastate.edu/facilities/genechip/Genechip.htm).
Microarray data handling
To estimate the overall tissue effect on the expression of the 22 840 genes present on the Affymetrix Barley 1 GeneChip, 72 probe result (CEL) files combining 36 files (30 DH lines plus three replications of two parents) for the two tissues under analysis were normalized to each other using the Robust Multi-array Average (RMA) normalization routine from the Bioconductor packages (Irizarry et al., 2003). A one-way anova was used to compare the expression level of each of the 22 840 genes on the chip across the 36 hybridizations (32 genotypes) and two tissues using a Perl script developed in-house. An estimation of FDR was achieved according to the method of Benjamini and Hochberg (1995), following the approach suggested by Benjamini et al. (2001).
In order to compare eQTLs detected with the 30-lines set with those discovered with all 139 lines, we further followed the procedure of data handling described previously (Potokina et al., 2007). CEL files were directly loaded into GeneSpring 7.2 and were submitted to the RMA file pre-processor. This converts the probe-level expression data into gene-level expression data, which are normalized to a certain extent. As an additional normalization step, the procedure ‘Per Gene: Normalize to specific samples’ was applied, where each gene signal was divided by the corresponding mean of three replications of Mx. The normalized data for both parents and each of the 30 DH lines were exported for eQTL mapping. To obtain mas 5.0 presence calls, the ‘mas5calls’ method from the Bioconductor package was used.
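The additional per-gene normalization step amounts to dividing every signal for a gene by the mean of the three Mx replicate signals for that gene. The following sketch illustrates the arithmetic only; it is not the GeneSpring implementation, and the function name and values are hypothetical.

```python
def normalize_to_mx(signals, mx_replicates):
    """Divide each sample's signal by the mean Mx replicate signal for the gene."""
    mx_mean = sum(mx_replicates) / len(mx_replicates)
    return [signal / mx_mean for signal in signals]

# Hypothetical RMA-level signals for one gene.
dh_line_signals = [10.0, 20.0, 5.0]   # three DH lines
mx_signals = [9.0, 10.0, 11.0]        # three Mx replicate hybridizations

normalized = normalize_to_mx(dh_line_signals, mx_signals)
print(normalized)  # [1.0, 2.0, 0.5]
```

After this step a value of 1.0 means that a line expresses the gene at the mean Mx level, so departures from 1.0 can be read directly as fold differences relative to Mx.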
A recently developed transcript derived marker (TDM) map used for eQTL analysis of the St/Mx population of 139 DH lines, and congruent with the SNP map for the St/Mx cross (Rostoks et al., 2006), was taken as a standard to maintain the marker order. The TDMs included SFPs, as described and defined by Borevitz et al. (2003). To develop the genetic map for the 30 DH lines, we used the same markers as for the map based on 139 lines, but the reduced population size (n = 30) led to a smaller number of recombination bins (209 instead of 512). Thus, all of the markers linked to eQTLs on the 30-line map could be checked for significant linkage to eQTL on the 139-line map.
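Why a smaller population yields fewer recombination bins can be shown with a toy example. The sketch assumes that markers are already in map order, that there are no missing genotypes, and that adjacent markers with identical genotype patterns across all lines fall into the same bin; the marker names and genotype strings (A = St allele, B = Mx allele) are invented.

```python
from itertools import groupby

def recombination_bins(ordered_markers):
    """Group adjacent markers that share an identical genotype pattern.

    ordered_markers: list of (marker_name, genotype_string) in map order,
    one character per DH line.
    """
    bins = []
    for _, group in groupby(ordered_markers, key=lambda marker: marker[1]):
        bins.append([name for name, _ in group])
    return bins

def subset_lines(ordered_markers, line_indices):
    """Keep only the genotypes of the selected DH lines."""
    return [(name, "".join(genotype[i] for i in line_indices))
            for name, genotype in ordered_markers]

# Hypothetical map: five markers scored on four DH lines.
markers = [("m1", "AABB"), ("m2", "AABB"), ("m3", "ABBB"),
           ("m4", "ABBB"), ("m5", "ABBA")]

bins_all_lines = recombination_bins(markers)
bins_two_lines = recombination_bins(subset_lines(markers, [0, 1]))
```

With all four lines the markers resolve into three bins; with only the first two lines the crossover separating m4 from m5 is no longer observed, and two bins remain.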
The CIM analysis was implemented using Windows QTL Cartographer 2.5 (http://statgen.ncsu.edu/qtlcart/WQTLCart.htm) with a 2-cM walk speed and a type-I error rate of 5%. Intervals of five background markers with a window width of 10 cM were analysed to control the QTL background effects. To establish a threshold for declaring statistically significant eQTLs, we used the GPT approach (West et al., 2007). A representative null distribution based on 1 000 000 maximum likelihood ratio test (LRT) statistics (1000 permutations × 1000 randomly selected expression traits) was employed for all transcripts detectable for embryo-derived and leaf tissues. The GPT was calculated as the 95% upper bound of the representative null distribution, giving 14.168. A LOD score for each 2-cM interval was compared with the representative null distribution, and was assigned to the corresponding P value (Churchill and Doerge, 1994).
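The threshold and P-value assignment can be sketched as follows, assuming the pooled null distribution is available as a list of maximum LRT statistics. The function names and the toy null distribution are hypothetical; the conversion LOD = LRT / (2 ln 10) is the standard relationship between the two statistics and is included only to relate the LRT threshold to the LOD scale.

```python
from bisect import bisect_left
from math import log

def gpt_threshold(null_stats, alpha=0.05):
    """Smallest null statistic equalled or exceeded by a fraction alpha of the null."""
    ordered = sorted(null_stats)
    n_top = max(1, round(len(ordered) * alpha))
    return ordered[len(ordered) - n_top]

def empirical_p(observed, sorted_null):
    """Fraction of null statistics greater than or equal to `observed`."""
    n_ge = len(sorted_null) - bisect_left(sorted_null, observed)
    return n_ge / len(sorted_null)

def lrt_to_lod(lrt):
    """Convert a likelihood ratio test statistic to a LOD score."""
    return lrt / (2 * log(10))

# Toy null distribution standing in for the 1 000 000 permutation statistics.
null = sorted(float(i) for i in range(1, 101))

threshold = gpt_threshold(null)
p_at_threshold = empirical_p(threshold, null)
```

With a null distribution of 1 000 000 statistics, the smallest non-zero P-value this scheme can assign is 1 × 10^−6, which is sufficient to resolve the P ≤ 0.0004 threshold used in the study.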
This research was supported by a research grant from the UK Biotechnology and Biological Sciences Research Council (BBSRC), and by the Scottish Executive Environment and Rural Affairs Department (SEERAD) of the United Kingdom.