Monoallelic expression from biallelic genes is frequently observed in diploid eukaryotic organisms. Classic examples of this phenomenon include the well-characterized cases of genomic imprinting and X-chromosome inactivation. However, recent studies have shown that monoallelic expression is widespread in autosomal genes. This discovery was met with great interest because it represents another mechanism to generate diversity in gene expression that can affect cell fate and physiology. To date, the molecular mechanisms underlying this phenomenon are largely unknown. In our original study describing the dominant/recessive relationships of pollen-determinant alleles in Brassica self-incompatibility, we found that the recessive allele was specifically methylated and silenced through the action of small RNA derived from the dominant allele. In this review, we focus on recent studies of monoallelic expression in autosomal genes, and discuss the possible mechanisms driving this form of monoallelic gene suppression.
A diploid organism has two copies of each gene in the genome, one inherited from each parent. The expression of the two inherited alleles is sometimes biased by an effect known as monoallelic expression, which makes the organism (or cells within it) functionally hemizygous. Examples of monoallelic gene expression are classified into three categories (Tarutani & Takayama 2011). The first is genomic imprinting, an epigenetic modification of maternally- and paternally-inherited alleles that leads to their differential expression in a parent-of-origin dependent manner (Grossniklaus 2005). The second is random monoallelic expression, exemplified by X-chromosome inactivation in placental mammalian female cells (Lyon 1986). This phenomenon was previously thought to be restricted to genes on the X-chromosome and several autosomal genes with special functions, for example, odorant receptor and T-cell receptor genes. However, recent studies suggest that it occurs widely throughout autosomal genes in diploid organisms (Guo et al. 2004; Pastinen et al. 2004; Gimelbrant et al. 2007). The third category is the dominant/recessive interaction that we described in the pollen-determinant genes of Brassica self-incompatibility. In this case, monoallelic expression is determined by a dominant/recessive relationship between two alleles, and in the heterozygote carrying these two alleles, the recessive allele is always suppressed regardless of its parent of origin (Shiba et al. 2002).
While the mechanisms generating random widespread monoallelic expression are currently poorly understood, recent studies suggested that epigenetic modifications such as DNA methylation and histone modifications may be important (Wang et al. 2007; Milani et al. 2009). Furthermore, we found that the dominance relationship between Brassica self-incompatibility alleles is epigenetically controlled by recessive allele-specific DNA methylation (Shiba et al. 2006), which is induced by a small RNA derived from the dominant allele (Tarutani et al. 2010). In this review, we focus on random monoallelic expressions in various organisms and our work on dominance relationships, and discuss how these types of monoallelic expression are controlled.
Epigenetic regulation of random monoallelic gene expression
X-chromosome inactivation, a well-known example of random monoallelic expression, is involved in dosage compensation that equalizes X-linked gene expression between males and females. Another prominent example of random monoallelic expression is termed allelic exclusion, known to occur in olfactory receptor genes and several classes of immune system genes. This phenomenon is thought to maximize the combinational diversity of protein products, and thus, a cell’s ability to respond to the enormous variety of external stimuli (Serizawa et al. 2004; Cedar & Bergman 2008). Monoallelic expression on autosomes may also explain an unusual heritable form of pigmentary mosaicism (Happle 2009). In addition, recent genome-wide analyses indicate that 1–5% of mouse and human autosomal genes may display stochastic monoallelic expression (Gimelbrant et al. 2007; Wang et al. 2007).
Examples of monoallelic expression of non-imprinted autosomal genes have also been reported in plants. In maize hybrids, many allelic genes showed differences in expression at the RNA level, ranging from unequal expression of the two alleles to unique expression of a single allele. In a demonstration of the functional relevance of monoallelic expression, it was shown that protein products of the two alleles responded differentially to abiotic stress (Guo et al. 2004). Similar examples of monoallelic expression have been reported in rice, poplar, barley, and Arabidopsis, and are thought to contribute to phenotypic diversity (Zhuang & Adams 2007; von Korff et al. 2009; Takamiya et al. 2009; Zhang & Borevitz 2009).
Recent studies suggested that these examples of monoallelic expression are frequently accompanied by differences in the chromatin states of the two alleles. In the case of random inactivation of X-chromosome and autosomal genes (e.g. olfactory receptors), the two homologous alleles appear to exist in distinct chromatin states in embryonic stem (ES) cells, before the establishment of monoallelic expression. The two different states can be detected as singlet and doublet signals on one chromosome by fluorescence in situ hybridization (FISH), and require the Polycomb group protein, Eed (Mlynarczyk-Evans et al. 2006; Alexander et al. 2007). Furthermore, in the case of X-chromosome inactivation, the involvement of complicated mechanisms with multiple layers of epigenetic modifications has been suggested (Augui et al. 2011; Wutz 2011). A non-coding RNA, X-inactivation specific transcript (Xist), transcribed from an X-linked locus, the Xic, was shown to be sufficient to trigger cis-inactivation of the X-chromosome from which it is expressed during an early developmental time window. In addition, extensive studies over the last decade revealed that several trans-acting factors – including pluripotency factors – and several long-range cis-acting elements – including antisense transcription units – are involved in the regulation of monoallelic Xist expression. However, the precise mechanisms governing the inactivation of only one of two X-chromosomes in female cells remain to be elucidated.
Genome-wide studies suggested that allele-specific DNA methylation is prevalent in the human genome and that CpG-SNPs contribute to it (Shoemaker et al. 2010). Allele-specific gene expression patterns in primary leukemic cells were shown to be associated with CpG site methylation (Milani et al. 2009). The differential methylation observed at some monoallelically-expressed genes suggests that DNA methylation may be involved in the locking-in of expression states (Wang et al. 2007; Krueger & Morison 2008).
Contrary to the situation in mammals, there are few studies in flowering plants analyzing the molecular mechanisms of monoallelic expression, with the exception of genomic imprinting. Regulation of imprinted gene expression in the endosperm is established by a maternal-specific activation that is dependent on DNA demethylation (Choi et al. 2002). Recent genomic imprinting studies reported that flowering plants use various components, including DNA methylation/demethylation enzymes, Polycomb complex proteins, and small RNAs, to regulate monoallelic expression (Mosher et al. 2009; Hsieh et al. 2011). Moreover, genome-wide analyses of endosperm have revealed substantial reduction in CG methylation coupled with extensive local non-CG hypermethylation of small interfering RNA-targeted sequences (Gehring et al. 2009; Hsieh et al. 2009).
Analyses of intraspecific hybrids in plants could provide clues to the mechanisms for monoallelic expression (Zhang & Borevitz 2009). Recently, intraspecific hybrids between the Arabidopsis thaliana accessions were used as a model system to investigate DNA methylation and histone modification patterns (Groszmann et al. 2011; Moghaddam et al. 2011). Genome-wide analyses of hybrids showed a decreased level of 24-nucleotide (nt) small RNA relative to the parental strains, and correlated changes in DNA methylation and transcriptional expression levels (Groszmann et al. 2011). ChIP on chip analyses of A. thaliana accessions showed variations in chromatin modification at H3K27me3 (typically a repressive mark) and H3K4me2 (typically an active mark) among accessions, and these modifications were rather stable in response to intra-species hybridization, with mainly additive inheritance in hybrid offspring (Moghaddam et al. 2011). In two rice subspecies and their reciprocal hybrids, differential epigenetic modifications (i.e. DNA methylation and both activating and repressive histone modifications) also correlated with changes in transcript levels among hybrids and parental lines (He et al. 2010). These data raise the possibility that autosomal allele-specific silencing in plants is also accompanied by differences in DNA methylation and histone modification.
Dominance relationships between self-incompatibility alleles in Brassica
Dominance is one of the most basic properties of inheritance investigated by Mendel, in which the phenotypic expression of one of the two alleles at a heterozygous locus is masked. In the simplest case, if a gene exists in two allelic forms, A and a, and the A allele is dominant over the a allele, phenotypic expression of the a allele is masked and Aa heterozygotes exhibit the same phenotype as AA homozygotes. Despite the crucial importance of dominance relationships, which determine organismal phenotypes, the underlying mechanisms have not been intensively pursued at the molecular level. The only well-known mechanism is the general physiological model, in which dominant and recessive alleles encode “functional” and “dysfunctional” proteins, respectively. By contrast, we recently identified a completely different mechanism of dominance relationships, in which the dominant allele is monoallelically expressed in the heterozygotes. Here, we describe the details of our findings and discuss aspects of the mechanism that remain to be clarified.
Many species of hermaphrodite plants have evolved mechanisms to prevent self-fertilization (de Nettancourt 1977; Takayama & Isogai 2005). Self-incompatibility (SI) is one physiological means to avoid self-fertilization through recognizing self-pollen in or on the female pistil. SI in the Brassicaceae is known to be controlled by a large number of haplotypes at the S-locus (S1, S2, S3, ... , Sn). Each S-haplotype consists of male and female S-determinant genes, termed S-locus protein 11 (SP11, also known as SCR) and S-locus receptor kinase (SRK), respectively. SP11 is a small cysteine-rich protein that functions as a ligand for its cognate SRK receptor, and the S-haplotype-specific interaction between SP11 and SRK triggers SI responses in the stigmatic papillar cells, resulting in the rejection of self-pollen (Fig. 1a; Stein et al. 1991; Suzuki et al. 1999; Schopfer et al. 1999; Takasaki et al. 2000; Takayama et al. 2000, 2001; Shiba et al. 2001).
SP11 proteins are predominantly produced in sporophytic anther tapetum cells (2n), and subsequently deposited at the pollen surface during pollen maturation (Takayama et al. 2000; Shiba et al. 2001; Iwano et al. 2003). Therefore, the SI phenotype of pollen is determined by the dominance relationships between the two S-haplotypes carried by the anther cell (Fig. 1b; Thomson & Taylor 1966; Hatakeyama et al. 1998). Extensive analyses of dominance relationships between S-haplotypes in Brassica rapa have revealed that they can be classified into two groups, the pollen dominant S-haplotypes, termed Class-I (e.g. S8, S9, S12, and S52), and the pollen recessive S-haplotypes, termed Class-II (e.g. S29, S40, S44, and S60, see Fig. 1c). The Class-I S-haplotype members are always dominant over those in Class-II. Within each class, the members of Class-I usually exhibit co-dominance relationships, while the members of Class-II exhibit linear dominance relationships. Therefore, a complicated dominance hierarchy can be observed among S-haplotypes, e.g. (S8 = S9 = S12 = S52) > (S44 > S60 > S40 > S29).
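The hierarchy described above can be written out as a small lookup, shown here as a minimal sketch rather than a complete model. It assumes that Class-I haplotypes are always co-dominant with one another (the text says "usually") and covers only the eight example haplotypes named above; the function name `pollen_phenotype` is ours and purely illustrative.

```python
# Example haplotypes taken from the text; the real S-locus has many more.
CLASS_I = {"S8", "S9", "S12", "S52"}
# Linear order within Class-II, most dominant first: S44 > S60 > S40 > S29.
CLASS_II_ORDER = ["S44", "S60", "S40", "S29"]


def pollen_phenotype(hap_a, hap_b):
    """Return the set of S-specificities expected on pollen of a plant
    carrying hap_a and hap_b, under the simplified hierarchy above."""
    for hap in (hap_a, hap_b):
        if hap not in CLASS_I and hap not in CLASS_II_ORDER:
            raise ValueError(f"unknown S-haplotype: {hap}")

    a_is_class_i = hap_a in CLASS_I
    b_is_class_i = hap_b in CLASS_I

    if a_is_class_i and b_is_class_i:
        # Simplifying assumption: Class-I pairs are treated as co-dominant.
        return {hap_a, hap_b}
    if a_is_class_i:
        return {hap_a}  # Class-I is dominant over Class-II
    if b_is_class_i:
        return {hap_b}
    # Both Class-II: the haplotype earlier in the linear order is dominant.
    return {min((hap_a, hap_b), key=CLASS_II_ORDER.index)}


print(pollen_phenotype("S52", "S60"))
print(pollen_phenotype("S60", "S40"))
```

Under these assumptions an S52S60 plant produces pollen with only the S52 specificity, and an S60S40 plant produces pollen with only the S60 specificity.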
We elected to study the molecular mechanisms of these dominance relationships because all the S-haplotypes were shown to be “functional” in self-pollen recognition. Thus, this system likely used a novel mechanism to generate dominant/recessive relationships. The first important clue was obtained by the expression analyses of the male determinant SP11 genes. While SP11 genes were expressed in all S-homozygotes, the SP11 mRNA levels of recessive S-haplotypes were always greatly reduced in S-heterozygotes. The reduction of SP11 mRNA was specific for recessive S-haplotypes; transcription levels of dominant S-haplotypes were not changed in the S-heterozygotes (Shiba et al. 2002; Kakizaki et al. 2003). Similar phenomena have been observed in another self-incompatible species, Arabidopsis lyrata, implying that the dominance relationships among S-haplotypes are determined largely at the level of SP11 mRNA in the family Brassicaceae (Kusaba et al. 2002).
The next question that we addressed was how SP11 mRNA of the recessive allele was specifically reduced. The mechanism was expected to represent a new type of monoallelic expression, because the suppression of the SP11 allele was not random, but always occurred in the same allele in the context of particular S-haplotype combinations, and was not affected by parental origin. We first envisioned a model similar to “enhancer imbalance”, in which the dominant SP11 allele has higher binding affinity for limiting transcription factors, and the sequestration of critical transcription factors by the dominant SP11 enhancer results in the silencing of the recessive SP11. However, this kinetic model cannot readily explain the drastic reduction in levels of recessive SP11 mRNA. For example, the levels of recessive S60-SP11 transcripts in the S52S60-heterozygote were decreased by four orders of magnitude as compared with the levels observed in the S60S60-homozygote. Furthermore, such a kinetic model cannot explain the sequential linear dominance relationships among alleles (e.g. S52 > S44 > S60 > S40 > S29).
An important clue was obtained when we predicted the involvement of epigenetic modifications and analyzed the DNA methylation status of SP11 alleles using a bisulfite sequencing procedure. The first attempt, using DNA prepared from intact anthers, failed to detect any obvious methylation of SP11 alleles. However, when we refined the method to extract DNA preferentially from the anther tapetum, significant increases in DNA methylation were detected in the promoter region of recessive SP11 alleles in S-heterozygotes. Extensive bisulfite sequencing analyses investigating a number of S-haplotype combinations demonstrated significant DNA methylation only in recessive, but not dominant, SP11 alleles, in S-heterozygotes. Such methylation was not observed in S-homozygotes (for example, compare the S52S60 and S60S60 tracks of Fig. 3b; Shiba et al. 2006). This increase in SP11 methylation was detected only in anther tapetum, not in other tissues. In addition, this methylation was shown to initiate at very early stages of anther development prior to the induction of SP11 transcription. These results suggested that the anther tapetum uses specific de novo methylation only within the SP11 promoter region of the recessive alleles in the dominant/recessive S-heterozygotes.
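For readers unfamiliar with how bisulfite sequencing reports methylation, the following sketch illustrates the general principle only; it is not the analysis pipeline used in the studies cited above. Bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines are still read as C. The reference and reads below are invented toy data, and the sketch assumes that reads are already aligned to the top strand and have the same length as the reference.

```python
def methylation_levels(reference, reads):
    """For each cytosine in the reference, return the fraction of reads
    that still show C (methylated) rather than T (converted)."""
    levels = {}
    for pos, base in enumerate(reference):
        if base != "C":
            continue
        # Only C (protected) and T (converted) calls are informative.
        calls = [read[pos] for read in reads if read[pos] in "CT"]
        if calls:
            levels[pos] = calls.count("C") / len(calls)
    return levels


# Toy data (hypothetical, for illustration only).
reference = "ACGTCCA"
reads = ["ACGTTTA", "ATGTTTA", "ACGTCTA", "ACGTTTA"]

for pos, level in methylation_levels(reference, reads).items():
    print(pos, level)
```

Comparing such per-cytosine fractions between clones from S-heterozygotes and S-homozygotes is the kind of contrast described in the paragraph above.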
Next, we asked why the de novo methylation occurred specifically in the recessive SP11 alleles in the target tapetum tissue. Recently, small interfering RNAs (siRNAs) were shown to regulate silencing of target genes by degradation of their transcripts, or by inducing de novo methylation of the homologous genomic region by a process called RNA-directed DNA methylation (RdDM; Pikaard et al. 2008; Chan 2008; Matzke et al. 2009). The methylated 5′SP11 promoter region of recessive Class-II S-haplotypes showed little homology with that of dominant Class-I S-haplotypes, but was relatively conserved within the Class-II S-haplotypes. Furthermore, in the recessive 5′SP11 promoter region, significant increases in cytosine methylation were observed in all three sequence contexts, that is, CpG, CpNpG, and CpHpH sites, which is also the hallmark of RdDM. We therefore speculated that recessive SP11 allele-specific de novo methylation in S-heterozygotes might occur by the action of dominant SP11 allele-specific small RNAs localized within or around the S-locus. Such small RNAs should have high homology to the recessive SP11 alleles to mediate methylation, but not to the dominant SP11 alleles with which they are associated.
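The three sequence contexts mentioned above are defined by the two bases immediately downstream of the cytosine. The sketch below shows one way to assign a context, assuming a top-strand sequence containing only A, C, G and T, with H standing for any base except G; the function name and the test sequence are illustrative and not taken from the SP11 promoter.

```python
def cytosine_context(seq, pos):
    """Classify the cytosine at index pos as 'CpG', 'CpNpG' or 'CpHpH'.

    Returns None when the cytosine is too close to the 3' end of the
    sequence for its context to be determined.
    """
    seq = seq.upper()
    if seq[pos] != "C":
        raise ValueError("position is not a cytosine")
    downstream = seq[pos + 1:pos + 3]
    if downstream[:1] == "G":
        return "CpG"      # symmetric CG site
    if len(downstream) < 2:
        return None       # not enough sequence to decide
    if downstream[1] == "G":
        return "CpNpG"    # symmetric site (CHG once CpG is excluded)
    return "CpHpH"        # asymmetric site


seq = "CGACTGCAAC"  # hypothetical sequence
for i, base in enumerate(seq):
    if base == "C":
        print(i, cytosine_context(seq, i))
```

Methylation in the asymmetric CpHpH context cannot be copied from a symmetric site on the opposite strand after replication, which is why it is generally taken as evidence of ongoing de novo methylation.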
To address this hypothesis, we performed in silico searches for possible small RNAs from the 76- and 75-kb SP11 genomic regions of the dominant S9- and S12-haplotypes, respectively. The searches revealed that sequences with high homology to the target methylated region, termed SP11-methylation-inducer (Smi), were located within these S-locus regions, in which recombination was shown to be suppressed (Fig. 2a; Tarutani et al. 2010). The genomic structure of Smi sequences showed that these putative small RNAs, which were highly similar to the target-methylated regions, could be processed from the predicted imperfect stem-loop precursors. Similar sequences were also detected in other dominant S-haplotypes (S8 and S52) by genomic polymerase chain reaction (PCR), suggesting that Smi sequences are conserved among dominant Class-I S-haplotypes.
Transcriptional analyses of Smi in a dominant S9-haplotype revealed that the region was expressed as a 5′ capped and polyadenylated transcript in the anthers at early stages of development when the tapetum cells were fully intact. Furthermore, a 24-nt small RNA, presumably derived from the transcript, was identified from the screening of an anther small RNA cDNA library. The alignment of Smi sequences showed nearly perfect homology to the target 5′ methylated region of the recessive SP11 alleles (18 out of 19 nucleotides, Fig. 2b), but not that of the dominant SP11 alleles. In situ hybridization analyses showed that Smi accumulated predominantly in anther tapetum cells at the early uninucleate pollen stage, immediately prior to the initiation of SP11 transcription.
Contrary to our expectations, recessive Class-II S-haplotypes also had a homologous sequence to Smi in the flanking genomic region of SP11 (Fig. 2a). In addition, these genomic regions were shown to be transcribed and processed into 24-nt small RNAs (termed recessive Smi) in a manner similar to that of the dominant Smis. However, all recessive Smis had a base substitution at position 10 (T to A), a site known to be important for miRNA-mediated cleavage of target transcripts (Fig. 2b). This substitution also decreased its similarity to the target methylated region (17 out of 19 nucleotides).
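The similarity scores quoted above (18 of 19 for dominant Smi, 17 of 19 for recessive Smi) are simple position-by-position match counts over the aligned region. The sketch below reproduces that arithmetic with placeholder sequences: the three 19-nt strings are invented for illustration and are NOT the real Smi or SP11 promoter sequences. They are constructed only so that the dominant-type placeholder differs from the target at one position, and the recessive-type placeholder carries an additional T-to-A change at position 10.

```python
def count_matches(small_rna, target):
    """Count identical positions between two aligned, equal-length sequences."""
    if len(small_rna) != len(target):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a == b for a, b in zip(small_rna, target))


def mismatch_positions(small_rna, target):
    """Return 1-based positions at which the two sequences differ."""
    return [i + 1 for i, (a, b) in enumerate(zip(small_rna, target)) if a != b]


# Placeholder sequences (hypothetical, 19 nt each).
TARGET = "ACGTTGCATTCAGGCTAAG"
DOMINANT_TYPE = "ACATTGCATTCAGGCTAAG"   # one mismatch, at position 3
RECESSIVE_TYPE = "ACATTGCATACAGGCTAAG"  # additional T-to-A at position 10

for name, seq in (("dominant-type", DOMINANT_TYPE), ("recessive-type", RECESSIVE_TYPE)):
    print(name, count_matches(seq, TARGET), "of", len(TARGET),
          "mismatches at", mismatch_positions(seq, TARGET))
```

The point of the comparison is that a single additional mismatch, located at a position known to matter for small RNA function, separates a sequence that directs silencing from one that does not.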
To confirm that Smi could induce de novo methylation of the target methylated region of recessive SP11 promoters, genomic sequences containing the dominant Smi were introduced into the recessive Class-II S homozygotes (S29S29, S40S40, S44S44, S60S60) of Brassica rapa. The transgenic plants containing the dominant Smi region lost the pollen self-incompatibility phenotype irrespective of the recessive S-haplotypes they carried. This phenotypic change was shown to reflect a reduction in SP11 expression, and an increase in methylation of the SP11 5′ promoter region (Fig. 3). The methylation profile of recessive SP11 5′ promoter regions in the transformants carrying the Smi transgene was nearly identical to that observed in the dominant/recessive S heterozygotes: cytosine methylation was observed in all CpG, CpNpG and CpNpN sequence contexts, spreading from the Smi homologous region (Fig. 3b). By contrast, the transformants carrying the recessive-type Smi transgene with a one-base mismatch at the 10th nucleotide did not lose the self-incompatibility phenotype, even though the expression level of the recessive-type Smi was almost comparable to that of the dominant Smi. Thus, Smi located in the flanking region of a dominant SP11 allele acts in trans to induce de novo methylation and silencing of the recessive SP11 allele in the dominance relationships of Brassica self-incompatibility (Fig. 4).
Possible mechanisms of dominance relationships between alleles mediated by small RNA
Although dominance relationships between alleles are widely observed in mammals and plants, the underlying mechanisms have not been intensively studied. Our work describes a novel mechanism of dominance, in which small RNA derived from a dominant allele acts in trans to induce transcriptional silencing of a recessive allele. The presence of these types of dominance modifiers, that is, genetic elements controlling dominance relationships, has been suggested as a theoretical possibility for more than 80 years, but the modifier elements themselves have remained elusive (Fisher 1928; Billiard & Castric 2011). Our study suggested that small RNAs could act as dominance modifiers, and it will be important to determine the generality of this mechanism in other systems.
In the Brassica self-incompatibility system, the identified Smi was sufficient to explain the dominance relationships between (dominant) Class-I and (recessive) Class-II S-haplotypes. A remaining puzzle is the linear dominance relationships observed among Class-II S-haplotypes. We anticipate that additional types of dominance modifiers will be involved in regulating the Class-II S-haplotype relationships. The precise mechanisms by which Smi induces target methylation and silencing also remain to be solved. For example, we speculate that the RdDM pathway is involved in this process, but many aspects of this hypothesis remain to be demonstrated experimentally. In the typical RdDM pathway, transcripts from transposons and other repetitive elements are produced by Pol IV (Herr et al. 2005; Kanno et al. 2005; Onodera et al. 2005; Pontier et al. 2005), converted into double-stranded RNA by RDR2 (Chan et al. 2004; Xie et al. 2004; Alleman et al. 2006), and processed by DCL3 into 24-nt siRNAs (Li et al. 2006; Pontes et al. 2006). Then, siRNAs are incorporated into proteins in the AGO4 clade, which direct DNA methylation by the de novo DNA methyltransferase DRM2 (Qi et al. 2006; Mi et al. 2008). However, Smi, a 24-nt small RNA, is expected to be processed from imperfect stem-loop precursors, including polyadenylated sequences, in a manner analogous to the miRNA biogenesis pathway. By contrast, typical miRNAs are a class of 21-nt small RNAs usually processed by DCL1, and incorporated into AGO1 clade proteins to regulate target gene expression primarily through the cleavage of mRNA or ta-siRNA precursors (Jones-Rhoades et al. 2006). Generally, with a few exceptions, miRNAs are not known to regulate target gene expression through DNA methylation (Bao et al. 2004; Kidner & Martienssen 2005). This raises the question of how Smi can effect monoallelic gene silencing. 
One mechanism that has been proposed is the ta-siRNA-like pathway, in which dominant Smi acts as an miRNA to cleave a putative non-coding transcript from the antisense strand (relative to the SP11 coding sequence) of the recessive SP11 promoter. Then, the cleaved transcript is converted to double-stranded RNA and processed into siRNAs that direct recessive SP11 promoter methylation in cis (Finnegan et al. 2011). Furthermore, genome-wide analyses using high-throughput sequencing revealed the presence of 24-nt miRNAs (named long miRNAs, lmiRNAs) in addition to typical 21-nt miRNAs in A. thaliana and in Oryza sativa (Dunoyer et al. 2004; Vazquez et al. 2008; Wu et al. 2010). The lmiRNAs in rice are generated by DCL3, and incorporated into AGO4 clade proteins to direct DNA methylation at their own loci in cis as well as in trans at their target genes. In this pathway, biogenesis of lmiRNAs does not require RDR2, and directly effects DNA methylation of target genes. Accordingly, lmiRNAs can govern gene regulation (Wu et al. 2010). Testing these models should reveal the molecular mechanism of monoallelic expression at the Brassica self-incompatibility locus, and also offer insights into other examples of dominance relationships.
We believe that our research will provide novel insights into the widespread phenomena of dominance relationships and monoallelic expression in heterologous organisms. Genome-wide analyses have revealed that around 5% of A. thaliana genes have methylated promoters, and that many genes are misexpressed in triple mutants defective in de novo methyltransferases (drm1, drm2, and cmt3 mutants) that are involved in RdDM (Zhang et al. 2006). This suggests that de novo methylation is important in regulating gene expression on a genome-wide scale, and that differences in the target promoter sequences of allelic genes lead to their monoallelic expression. Detailed and comparative analyses to profile gene transcription, DNA methylation, histone modification, and small RNA production in the intraspecific hybrids and their parents could be an effective strategy to reveal the genome-wide status of monoallelic expression, its biological significance, and the underlying molecular mechanisms.
This work was supported by the Program for Promotion of Basic Research Activities for Innovative Biosciences from the Bio-oriented Technology Research Advancement Institution (BRAIN 03-01), and by Grants-in-Aid for Scientific Research on Innovative Areas (23113001, 23113002), Grant-in-Aid for Scientific Research (21248014), Grant-in-Aid for Scientific Research on Innovative Areas “Genome Sciences”, Grant-in-Aid for Young Scientists, Funding Program for Next Generation World-Leading Researchers from the Ministry of Education, Culture, Sports, Science and Technology (MEXT) of Japan.