Upstream open reading frames: Molecular switches in (patho)physiology

Conserved upstream open reading frames (uORFs) are found within many eukaryotic transcripts and are known to regulate protein translation. Evidence from genetic and bioinformatic studies implicates disturbed uORF-mediated translational control in the etiology of human diseases. A genetic mouse model has recently provided proof-of-principle support for the physiological relevance of uORF-mediated translational control in mammals. The targeted disruption of the uORF initiation codon within the transcription factor CCAAT/enhancer binding protein β (C/EBPβ) gene resulted in deregulated C/EBPβ protein isoform expression, associated with defective liver regeneration and impaired osteoclast differentiation. The high prevalence of uORFs in the human transcriptome suggests that intensified search for mutations within 5′ RNA leader regions may reveal a multitude of alterations affecting uORFs, causing pathogenic deregulation of protein expression.


Introduction
Defective translational control of protein expression is increasingly recognized as an important mechanism in the etiology of human diseases [1]. In eukaryotic mRNA the main protein coding sequence (MCS) is flanked by upstream and downstream regulatory regions of variable length and structure. These regions may contain multiple regulatory cis-acting sequence elements, including 5′-located hairpins, protein binding sites, upstream open reading frames (uORFs), or internal ribosomal entry sites (IRESs) as well as 3′-located microRNA target sites, specific localization elements (zip codes) or polyadenylation signals. Several review articles have summarized how such cis-regulatory translational control elements influence translation of the MCS and how their dysfunction relates to the development of human diseases [2][3][4][5][6][7].
A growing body of evidence obtained from bioinformatic and genetic studies suggests that, in particular, uORF-mediated translational control may serve as a comprehensive mechanism of protein expression control. Recently, the targeted genetic ablation of the translational start site in the uORF of the transcription factor CCAAT/enhancer binding protein β (C/EBPβ) validated the physiological relevance of uORF-mediated translational control in an animal model [8].
This paper aims to provide a brief overview of uORF-mediated translational control in general. Moreover, we show how aberrant uORF regulation may translate into (patho)physiology, as illustrated by data obtained from analyses of C/EBP transcription factors. Finally, we outline how contemporary sequencing technologies may help to unravel the implications of uORF-mediated translational control in a multitude of as-yet-unexplained human diseases.
uORFs: frequency, structure, and function
[...] complex and additional eIFs, engages with the 7-methylguanosine (m7G) mRNA cap structure located at the 5′ end of a transcript [6]. The pre-initiation complex scans the mRNA toward the 3′ end until the Met-tRNAiMet anticodon matches a functional AUG codon. Joining of the 60S ribosomal subunit completes the assembly of a fully functional ribosome and permits initiation of translation. Initially it was assumed that scanning ribosomes would generally initiate translation at the m7G-cap proximal AUG initiation codon [10], but subsequently an increasing number of genes were identified that differed from this "first AUG rule". Predominantly, transcripts with long and presumably structured 5′ regulatory regions were found to frequently contain functional AUG codons upstream of the MCS (uAUGs) [11]. Such uAUGs constitute the initiation codon of uORFs, and interfere with unrestrained ribosomal scanning toward the MCS initiation codon [9].
The yeast transcription factor GCN4 represents the best-studied example of uORF-mediated translational control and illustrates how uORFs can facilitate the paradoxical induction of GCN4 protein expression under conditions of reduced global translation [12,13]. The first of four uORFs within the GCN4 5′ leader is efficiently translated under both good nutritional and starvation conditions and establishes a "reinitiation mode of translation" [14] for all downstream initiation codons [12,13]. In non-stressed cells, rapid reloading of post-termination ribosomes with indispensable initiation cofactors allows immediate reinitiation at the proximal initiation sites of uORFs two to four. These uORFs exhibit specific inhibitory features, rendering the translating ribosomes incapable of reinitiating at the MCS. During amino acid starvation the availability of initiation cofactors decreases, resulting in decelerated reloading of post-termination ribosomes and leaky scanning across the inhibitory uORF start sites. A functional initiation complex can only be reassembled after prolonged progression of post-termination ribosomes, allowing the initiation at the MCS start codon and the induction of GCN4 under stress conditions. Due to their spatial and contextual organization, the four uORFs of the GCN4 transcript serve as a translational switchboard that allows the cell to rapidly respond to nutritional stress. Ultimately, the translational induction of GCN4 and the subsequent activation of GCN4 target genes adjust the cell's molecular repertoire to environmental needs.
Mechanistically, the expression of GCN4 is determined by the combined effects of leaky scanning and reinitiation events, which are sensitive to changing global translational conditions. Data obtained from the GCN4 transcript showed that the length of a uORF, the sequence context adjacent to its termination codon and the distance to the downstream initiation codon modulate the inhibitory effect of the uORF on ribosomal reinitiation [12,13]. As also confirmed for other transcripts, lengthening of the intercistronic space increases reinitiation rates from downstream AUG codons [14], while lengthening of the uORF itself results in decreased reinitiation [15]. Together, these data suggest a dynamic model in which initiation cofactors are stripped from ribosomes during translation of a uORF and need to be reassembled for reinitiation to occur [9].
Bioinformatic surveys have now identified uORFs in 35-49% of human and rodent transcripts [2,16,17] and correlated the prevalence of one or multiple uORF(s) in a transcript with reduced abundance of the respective protein [18]. Despite their high prevalence, uORFs are less frequent than could be expected by chance [16], and tend to be conserved among species [19], suggesting an evolutionary selection of functional uORFs. Recently, ribosomal profiling in yeast provided strong evidence for translation of uORFs in vivo and confirmed changing translation rates of the uORFs and the MCS of GCN4 in response to altered nutritional conditions [20]. uORFs are extremely diverse in both structural features and regulatory functions. As exemplified for humans, uORFs vary in length (average of 48 nt), number per transcript (0-13), position (close to or distant from the mRNA's m7G cap, terminating upstream or downstream of the MCS initiation codon), sequence (no common uORF sequence motif has been identified) and secondary structure. In approximately half of the uORF-bearing transcripts, a single uORF precedes the MCS initiation codon [18]. In the remaining cases, uORF-mediated regulation is complicated by the presence of more than one uORF, and the regulatory effect on MCS translation results from the combined functions of individual uORFs, each acting in a highly context-specific manner. At present, uORF-mediated translational control has been validated experimentally for about 100 eukaryotic transcripts [18]. Besides establishing barrier functions to scanning pre-initiation ribosomes, as exemplified above for GCN4, uORFs can also reduce translation of the MCS by other means. In selected transcripts, uORFs can provoke mRNA instability [17,18] or render transcripts susceptible to nonsense-mediated mRNA decay (NMD) [21].
In other cases, uORF-encoded peptides repress translation of the MCS by interaction with the translational machinery or by reducing mRNA stability in response to trans-acting molecular regulators, such as sucrose [22], arginine [23] or polyamines [24]. Mass spectrometric approaches have identified a number of additional, potentially functional, uORF-encoded peptides, which await experimental examination [25,26]. Despite the overt complexity of uORF-mediated translational control, several variables correlating with strong repression of MCS translation emerge from published data. These include long 5′ cap-to-uORF distance, proximity of the uORF to the MCS initiation site, length of the uORF, multiplicity of uORFs, conservation among species, and initiation sequence context (Fig. 1) [15,18,27,28].
Another intriguing regulatory function of uORFs is observed in transcripts harboring alternative downstream initiation sites within their MCS. In these transcripts, as exemplified by the transcription factors C/EBPα and C/EBPβ, uORFs control the expression ratio of functionally distinct protein isoforms by sensing the translational status of the cell [9,29].
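The structural features listed above (number of uORFs per transcript, uORF length, and termination upstream or downstream of the MCS initiation codon) can be read directly from a transcript sequence. The following is a minimal illustrative sketch, not the method used in the cited surveys. It assumes AUG-only initiation and a known MCS start position; the names `find_uorfs`, `UORF`, and `mcs_start` are hypothetical.

```python
from typing import List, NamedTuple

STOP_CODONS = {"UAA", "UAG", "UGA"}


class UORF(NamedTuple):
    start: int          # 0-based index of the A in the uAUG
    end: int            # index one past the last base of the stop codon
    length_nt: int      # nucleotides from the uAUG through the stop codon
    overlaps_mcs: bool  # True if the uORF terminates downstream of the MCS start


def find_uorfs(mrna: str, mcs_start: int) -> List[UORF]:
    """List AUG-initiated uORFs whose start codon lies upstream of mcs_start.

    Assumptions (illustrative only): initiation at AUG codons only, and
    an upstream AUG that is in frame with the MCS and reads into it
    without a stop is treated as an N-terminal extension, not a uORF.
    """
    seq = mrna.upper().replace("T", "U")  # accept DNA or RNA notation
    uorfs = []
    for i in range(mcs_start):
        if seq[i:i + 3] != "AUG":
            continue
        # Walk codon by codon until the first in-frame stop codon.
        end = None
        for j in range(i, len(seq) - 2, 3):
            if seq[j:j + 3] in STOP_CODONS:
                end = j + 3
                break
        if end is None:
            continue  # no in-frame stop codon found in the sequence
        in_frame_with_mcs = (mcs_start - i) % 3 == 0
        if in_frame_with_mcs and end > mcs_start:
            continue  # N-terminal extension of the main reading frame
        uorfs.append(UORF(i, end, end - i, end > mcs_start))
    return uorfs


# Toy transcript: a 9-nt uORF (AUG GCC UAA) ending 5 nt upstream of the MCS AUG.
toy = "GGAUGGCCUAACCACCAUGGCGUAA"
print(find_uorfs(toy, mcs_start=16))
```

In this toy transcript the intercistronic distance is `mcs_start - end`, one of the variables the text links to reinitiation efficiency.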
K. Wethmar et al. Prospects & Overviews
How uORF regulation translates into (patho)physiology: the C/EBP paradigm
Evolutionarily conserved uORFs have been identified in transcripts of many key regulatory genes [30,31], implying an important physiological role for these uORFs. Among such uORF-bearing transcripts are the transcription factors C/EBPα and β, which regulate the proliferation and differentiation of multiple cell types including granulocytes, macrophages, adipocytes, osteoclasts, osteoblasts, keratinocytes, mammary epithelial cells, and hepatocytes [32][33][34][35][36]. C/EBP transcription factors are implicated in the regulation of various (patho)physiological processes including metabolism, inflammation, and malignant transformation [32][33][34]37]. C/EBPα, β, and four additional members (γ, δ, ε and ζ) of the C/EBP family share highly conserved C-terminal basic regions and leucine zipper domains (bZIP), which are involved in DNA binding and homo- or heterodimerization, respectively [34]. The N-terminal parts of the C/EBPs are more diverse and contain regulatory and trans-activation domains that interact with transcriptional coactivators, corepressors, and the basal transcription machinery [38][39][40] (Fig. 2). C/EBPβ mRNA translates into two long protein isoforms known as liver activating protein (LAP) and LAP*, and the truncated isoform liver inhibitory protein (LIP). Recently, an extended C/EBPα isoform has been described [41], in addition to the known full-length p42 and truncated p30 isoforms. The full-length isoforms of C/EBPα and β both contain N-terminal trans-activating and regulatory domains that can induce differentiation and inhibit proliferation. The truncated isoforms, p30 and LIP, consist of only the C-terminal part of C/EBPα and β, respectively, retaining their DNA binding capacity and the ability to form dimers with other protein isoforms of all C/EBP family members. The absence of the N-terminal domains in p30 and LIP isoforms compromises their trans-activating functions, resulting in trans-dominant repressive effects on C/EBP target genes [42].
In the transcripts of C/EBPα and β, a uORF is located out of frame between the initiation codons of the extended (α-ext and LAP*) and the full-length isoforms (p42 and LAP) [43] (Fig. 2). These uORFs were shown to be critical for the balanced expression of the respective C/EBP isoforms [29,44]. Unique and overlapping biological functions of the different C/EBPα and β protein isoforms were characterized by numerous cell biological studies. The short isoforms p30 and LIP are sufficient to induce lineage commitment of adipocytes [29], hepatocytes [45], and eosinophils [46]. In addition, p30 is sufficient to commit cells to the granulocytic lineage [46], and LIP is sufficient to commit cells to the macrophage [46] and the osteoblast [47] lineages. However, the long isoforms are required to arrest the cell cycle of progenitors and to induce terminal differentiation by trans-activation of cell type-specific target genes (Fig. 3A) [29,32,33,42,45,46][48][49][50][51][52][53]. Due to these differential effects of C/EBP isoforms in a variety of biological processes, uORF regulation was suggested to be important in determining the physiological outcome of C/EBP expression [32,34,37,54].
In most tissues, C/EBPα-p42 and C/EBPβ-LAP are the most abundant protein isoforms, despite the presence of two preceding translational initiation codons and despite a suboptimal initiation codon context (Fig. 4). An optimal initiation sequence that supports initiation of virtually all scanning ribosomes is defined as CCRCCAUGG (Kozak consensus sequence), with a purine base in position −3 and a guanine base in position +4 as most important for initiation [55,56]. Initiation sequence contexts are frequently classified as strong (both critical residues match the consensus sequence), as adequate/intermediate (either residue −3 or +4 matches) or as weak (neither residue matches). Placing the initiation codons of the extended isoforms of C/EBPα (intermediate) and β (weak) in optimal Kozak consensus sequences resulted in loss of translation from downstream initiation codons [29], suggesting that the endogenous sequence context at the α-ext and the LAP* AUG codons allows leaky scanning, and does not support complete initiation of translation. In contrast, optimizing the Kozak context of the C/EBPα uORF start site mildly reduced translation of p42 and enhanced the expression of p30, indicating that a fraction of the post-termination ribosomes that had translated the uORF reinitiated at the proximal p42 initiation codon and another fraction initiated at the downstream p30 start site [29]. The relatively high proportion of ribosomes that reinitiated at the p42 start site after translating the uORF was surprising, as the C/EBPα uORF terminates only seven bases upstream of the p42 AUG codon (Fig. 4) and intercistronic sequences of that size were known to greatly impede reinitiation rates in other transcripts [14].
While strengthening of the uORF initiation sites in C/EBPα or C/EBPβ resulted in an increased p30 over p42 and LIP over LAP expression ratio, respectively, deletion of the uORF initiation codon in either C/EBPα or β enhanced expression of p42 or LAP [44] and almost completely abolished translation of the truncated isoforms [29]. Therefore, the "intermediate" initiation context of p42 and LAP appeared to be sufficiently strong to support initiation of most of the scanning ribosomes in the absence, but not in the presence, of uORF translation. These observations implied that translation of the C/EBPα and β uORFs serves to shift ribosomes across the full-length initiation sites to support truncated isoform expression.
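The three-way classification of initiation contexts described above (strong, adequate/intermediate, weak) depends only on the residues at positions −3 and +4 relative to the A of the AUG codon. A minimal sketch of that rule follows. The function name `classify_kozak` and the example sequences are hypothetical, and the sketch deliberately ignores all other positions of the consensus.

```python
def classify_kozak(seq: str, aug_index: int) -> str:
    """Classify the core Kozak context of the AUG starting at aug_index.

    Positions follow the usual convention: the A of AUG is +1, there is
    no position 0, so -3 is three bases upstream of the A and +4 is the
    base immediately after the G of AUG.
    """
    s = seq.upper().replace("T", "U")  # accept DNA or RNA notation
    if s[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG codon at the given index")
    # Missing flanking bases (AUG too close to either end) count as mismatches.
    minus3 = s[aug_index - 3] if aug_index >= 3 else ""
    plus4 = s[aug_index + 3] if aug_index + 3 < len(s) else ""
    matches = int(minus3 in ("A", "G")) + int(plus4 == "G")
    return {2: "strong", 1: "intermediate", 0: "weak"}[matches]


# Hypothetical example contexts (not taken from the C/EBP transcripts):
print(classify_kozak("CCACCAUGG", 5))  # purine at -3 and G at +4
print(classify_kozak("CCUCCAUGG", 5))  # only +4 matches
print(classify_kozak("CCUCCAUGC", 5))  # neither residue matches
```

Applied to each AUG in a 5′ leader, such a rule gives a first indication of which start sites are likely to permit leaky scanning, although, as the C/EBP data show, context strength alone does not predict initiation rates.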
Several lines of evidence showed that the C/EBPα and β uORFs integrate the signaling status of a cell to modulate the expression ratio of isoforms. One key component in adjusting the activity of the translational machinery to environmental changes is the mammalian target of rapamycin kinase (mTOR). Many nutritional and signaling pathways downstream of growth factor, cytokine, or hormone receptors alter the activity of the mTOR kinase. Activated mTOR signaling is associated with enhanced global translational conditions and increased activity of important eIFs, including eIF4E [57]. Mimicking favorable translational conditions by overexpression of eIF4E induced the expression of truncated C/EBP isoforms p30 and LIP (Fig. 3B) and was associated with increased initiation at the uORF start site [29]. Importantly, mutation of the uORF initiation codon abolished the eIF4E-mediated induction of short C/EBP isoforms, confirming that indeed translation of the uORF was required to shift initiation toward the distal initiation codons [29]. In turn, inhibition of mTOR kinase activity by the macrolide antibiotic rapamycin, protein folding stress or nutrient depletion decreased global translational activity and was associated with the predominant production of p42 and LAP isoforms [29] (Fig. 3B). In response to rapamycin treatment, a shift of expression toward the full-length C/EBPβ protein isoform was also observed for endogenous transcripts and was shown to affect cellular fates, e.g. the differentiation of osteoclasts [35] or the proliferation of malignant cells [58]. Increased LIP over LAP isoform ratios were observed in several malignancies including Hodgkin lymphoma, anaplastic large cell lymphoma [58], and aggressive forms of breast cancer (reviewed in Ref. 37).
Moreover, transgenic expression of LIP in mammary glands resulted in hyperplasia and tumorigenesis in mice, suggesting a pro-proliferative and tumorigenic potential of the LIP isoform [59]. In several model systems the rapamycin-mediated inhibition of mTOR altered the isoform ratio in favor of LAP and resulted in a decrease in tumor cell proliferation [58,60,61].

Figure 3. A: [...] the C/EBP isoform ratio affects cellular differentiation are illustrated, specifically how an increase in the short isoforms p30 and LIP disrupts proper differentiation. For example, p30 and LIP are overexpressed in several human cancers, including AML and breast cancer, respectively. The truncated isoforms are sufficient to induce lineage commitment of proliferative progenitor cells; however, they are not capable of blocking the cell cycle (indicated with the circular arrow) and inducing terminal differentiation and maturation. * In these cases, similar isoform-specific functions have been described for both C/EBPα and β. B: Environmental signals enhance (green) or repress (red) mTOR kinase activity, resulting in changes in global translational conditions. Changes in the translational status have been shown to affect uORF translation, resulting in changes in C/EBP protein isoform balance. In a good translational status, the C/EBPα and β uORFs may be more frequently translated, shifting the isoform expression ratio toward the truncated C/EBPα (p30) and β (LIP) isoforms (green).
Together, these data suggested that the uORF initiation codon may serve as a physiologically important sensor of global translational conditions, shifting the isoform expression ratio toward the truncated isoforms under favorable conditions and to the full-length isoforms under unfavorable conditions. This function may be due to the suboptimal Kozak context that surrounds the uORF initiation codon (Fig. 4), which allows the modulation of initiation rates in response to the translational status. Interestingly, although surrounded by an intermediate Kozak context as well, initiation rates at the full-length initiation codon appear to be not as sensitive to changing translational conditions. Lower variability of full-length initiation may be attributed to its location downstream of the uORF and to the fact that it is efficiently used already under steady-state conditions, but the molecular mechanisms driving the preferential use of the uORF initiation codon under favorable translational conditions remain to be identified. Despite the simplicity of a linear ribosomal scanning/reinitiation model as an explanation for uORF-mediated control of isoform expression, the translational regulation of C/EBP transcription factors might be more complex. Three-dimensional stem loop structures [62], regulatory trans-acting factors including CUGBP1 [63] as well as hnRNP-microRNA interactions [64] were shown to affect C/EBP translation. Nevertheless, translation of the uORF is required to drive expression of the truncated C/EBP isoforms and represents a major determinant in the regulation of isoform expression ratios.
The recent generation of genetically altered mice, carrying a single nucleotide exchange of the ATG uORF initiation codon to TTG in the C/EBPβ gene (C/EBPβ^ΔuORF), now confirmed the concept of uORF-mediated isoform expression in vivo and contributed to a deeper understanding of how changes in the isoform ratio of C/EBPβ affect mammalian physiology [8]. The C/EBPβ^ΔuORF mice were generated using homologous recombination into the endogenous c/ebpb gene locus. The ΔuORF mutation eliminates the uORF initiation codon and thus disrupts its function as a molecular switch to induce the truncated LIP isoform, without alteration of the amino acid sequence of C/EBPβ. Data obtained from homozygous C/EBPβ^ΔuORF mice showed that C/EBPβ isoform production becomes unresponsive to extracellular stimuli, such as lipopolysaccharide treatment, which normally increases the LIP to LAP ratio [8]. Furthermore, ablation of uORF initiation prevented the physiological induction of LIP during liver regeneration. Lack of LIP expression resulted in enhanced acute phase response, prolonged repression of cell cycle genes and impaired cell cycle entry of hepatocytes after partial hepatectomy [8]. In a second recombinant mouse model (C/EBPβ^LIP), the endogenous c/ebpb gene locus was replaced by the coding sequence of the LIP isoform only, resulting in complete loss of expression of LAP* and LAP. The exclusive expression of LIP in these animals rescued both the expression of cell cycle genes and the entry of hepatocytes into S phase [8]. Furthermore, C/EBPβ^LIP mice displayed enhanced differentiation of bone-resorbing osteoclasts, while in turn, the decreased LIP to LAP isoform ratio in C/EBPβ^ΔuORF mice was associated with impaired osteoclast differentiation.
The C/EBPβ LAP isoform was found to induce the expression of the transcription factor MafB in monocytes. MafB inhibits or sequesters other transcription factors that are known to mediate osteoclastic differentiation, including Fos, Nfatc1, and Mitf. In contrast, LIP downregulates MafB expression, resulting in increased availability of those osteoclastic transcription factors [8,35,65]. In summary, the C/EBPβ^ΔuORF mouse constitutes the first genetic animal model that confirms the physiological significance of uORF translation in vivo and its action as a molecular switch driving cell fate decisions by modulating isoform expression ratios. The in vivo data support the idea that the abundance of individual C/EBPβ isoforms is regulated by uORF-mediated integration of cellular signals, resulting in tissue-specific functions that depend on the cellular context.

Figure 4. [...] [79] contains three subsequent hypothetical uORF initiation codons (uORFhyp), followed by in-frame termination codons upstream (homo) or downstream (bos and mus) of the p30 start site. The rat C/EBPε sequence is not displayed, as it is 100% homologous to the mouse sequence shown in this alignment. Initiation codons of protein isoforms are highlighted by green background color, initiation and termination codons of uORFs and uORFshyp are in red bold face, favorable residues of the core Kozak context (residues at −3 or +4) are underlined. * This uORFhyp initiation codon may be nonfunctional, as its presence did not prevent deregulated C/EBPβ isoform expression when the uORF AUG codon (−34) was mutated to a non-functional UUG codon [8,29].
These and other data also challenge the model of LIP being a general transcriptional repressor and of LAP*/LAP acting as general trans-activators. LIP has now been described as a trans-activator in several situations, e.g. on target genes containing C/EBP-responsive promoter elements that can be mutually activated by LIP or cyclin D1 [66], as well as in osteoblasts by interaction with the osteoblastic transcription factor Runx2 [47]. The trans-activation potential of LIP might be explained by LIP out-competing the repressive effects of long C/EBP isoforms, as described for E2F target genes [8,67,68]. Moreover, LIP may enhance differentiation of osteoblasts and osteoclasts [35]. These observations suggest a high versatility and target gene specificity in C/EBPβ isoform functions.
For C/EBPα, data obtained from patients and targeted mouse genetics also argue for critical physiological functions of the C/EBPα uORF in balanced isoform expression. C/EBPα is an inducer of terminal differentiation in granulocytes and couples induction of cell type-specific genes to cell cycle arrest [32,33,69]. The C/EBPα full-length isoform p42, similar to the long isoforms of C/EBPβ, blocks cell cycle progression by repressing E2F target genes, a function that is required in terminal cellular differentiation. In contrast, the truncated p30 isoform is not capable of repressing E2F target genes, and therefore proliferation continues, preventing terminal differentiation [70][71][72]. C/EBPα is mutated in about 10% of patients with acute myeloid leukemia (AML), where the most common mutations result in the loss of p42 expression, while the production of p30 is preserved [73][74][75][76]. A myeloid proliferative phenotype due to loss of p42 expression was also observed in knock-in mice that express p30 only [77]. These mice displayed disturbed granulopoiesis and premature death. Presence of p30 was sufficient for progenitor commitment to the granulocyte-macrophage cell lineage; however, p42 was required to restrain proliferation of these myeloid progenitors, and its absence resulted in a myeloid proliferative disease resembling human AML [77]. Furthermore, pharmacologically induced differentiation of AML cells by the triterpenoid 2-cyano-3,12-dioxooleana-1,9-dien-28-oic acid (CDDO) required an intact uORF [78]. Underlining the critical importance of the C/EBPα uORF, a null mutation of its initiation codon in mice results in early embryonic lethality (A. Bremer and C. F. Calkhoven, personal communication).
Another example of how isoform expression ratios affect cell fate decisions comes from C/EBPε, the third C/EBP family member that is produced as various N-terminally truncated isoforms. The C/EBPε gene differs structurally from C/EBPα and β in that it contains introns. In addition to alternative translational initiation, the expression of four alternative C/EBPε isoforms (p32, p30, p27, and p14) was attributed to differential promoter usage and alternative splicing. Similar to the short C/EBPα and β isoforms, the short C/EBPε isoforms display less trans-activation potential, with the shortest isoform (p14) virtually lacking trans-activating domains [79][80][81]. C/EBPε is expressed in hematopoietic cells of the granulocytic lineage, and is required for the terminal differentiation of granulocytes into eosinophils and neutrophils [79,80,82]. Recent studies showed that the isoforms of C/EBPε differentially affect granulocytic lineage commitment and differentiation pathways [81]. Despite many structural and functional similarities between C/EBPα, β, and ε, it remains to be determined whether uORF-mediated translational control also affects C/EBPε isoform expression. Only the murine and rat transcripts contain an out-of-frame uAUG codon between the p32 and the p30 translational start site (Fig. 4). Nevertheless, as many as three conserved hypothetical C/EBPε uORFs could initiate from alternative out-of-frame initiation codons in the human transcript, such as ACG at −89 and −77 (which corresponds to the mouse and rat uAUG) or GUG at −59 with respect to the adenine base in the p30 initiation codon (Fig. 4). All three hypothetical uORF (uORFhyp) start sites are conserved, but the uORFhyp terminates five bases upstream of the p30 initiation codon only in humans, while in cow and mouse it overlaps the p30-coding sequence by 85 nucleotides.
Given that all three potential uORF start sites are surrounded by intermediate or favorable Kozak consensus sequences, uORF-mediated translational control might be an additional level of C/EBPε expression regulation.

Mutant uORFs accounting for human diseases
In analogy to the experimentally deleted C/EBPβ uORF initiation codon in C/EBPβ^ΔuORF mice, naturally occurring uORF mutations in other genes may cause physiological alterations by deregulating translation of the affected transcript. Such mutations could either change the presence of a uORF by generating or deleting an initiation codon upstream of the MCS start site, or could affect translational control by changing one of the structural features of an existing uORF (Fig. 1). More than 500 single nucleotide polymorphisms (SNPs) have been identified in humans that either create or delete uORFs, highlighting the potential physiological implications of uORF-mediated translational control. This variability in the presence of uORFs may suggest a substantial contribution of uORF-mediated regulation to individual phenotypes and/or the predisposition to distinct diseases [18]. To date, three well-documented and thoroughly analyzed uORF-affecting mutations have been linked to the development of human diseases: (i) hereditary thrombocythemia is caused by a mutation that eliminates a uORF due to the generation of an alternatively spliced mRNA, resulting in increased production of thrombopoietin protein [83]; (ii) reduced production of cyclin-dependent kinase inhibitor 2A, caused by a mutation that introduces a uORF in the 5′ leader sequence of the CDKN2A transcript, results in familial predisposition to melanoma development [84]; and (iii) Marie Unna hereditary hair loss is caused by variable mutations altering a uORF within the hairless homolog (HR) transcript, causing an increased expression of the hairless homolog protein [85]. This list was recently extended by 11 disease-related genes, where uORF-altering mutations were identified by computational analysis of the Human Gene Mutation Database [18]. Diseases with confirmed uORF mutations include the van der Woude syndrome (IRF6), hereditary pancreatitis (SPINK1), and familial hypercholesterolemia (LDLR) [18]. Additionally, the expression of the beta secretase BACE1, related to Alzheimer's disease [86], or the transmembrane receptor tyrosine kinase ERBB2, related to breast cancer [87], is at least partially controlled by uORFs. Whether deregulated uORF-mediated translational control is the crucial pathogenic event in these latter cases remains to be established. Even with only a few unequivocal cases at this time, it is evident that uORF mutations may be involved in a wide variety of diseases including malignancies, metabolic or neurologic disorders, and inherited syndromes. As many important regulatory proteins, including cell surface receptors, tyrosine kinases, and transcription factors act in a dose-dependent fashion, uORF mutations that affect expression levels of these genes might be responsible for a number of as-yet-unexplained pathologies.
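Whether a single-nucleotide variant creates or deletes an upstream AUG, the simplest class of uORF-altering mutation discussed above, can be checked by comparing uAUG positions in the reference and variant 5′ leader. The sketch below is illustrative only and is not the procedure used in the cited database analysis; `uaug_positions`, `snp_effect`, and the toy leader sequences are hypothetical.

```python
from typing import Dict, List, Set


def uaug_positions(leader: str) -> Set[int]:
    """Return the 0-based positions of all AUG codons in a 5' leader."""
    s = leader.upper().replace("T", "U")  # accept DNA or RNA notation
    return {i for i in range(len(s) - 2) if s[i:i + 3] == "AUG"}


def snp_effect(leader: str, pos: int, alt: str) -> Dict[str, List[int]]:
    """Report uAUGs created or deleted by substituting `alt` at index `pos`.

    Only the presence or absence of AUG start codons is considered;
    changes to Kozak context, uORF length, or stop codons are ignored.
    """
    ref = leader.upper().replace("T", "U")
    alt_base = alt.upper().replace("T", "U")
    if len(alt_base) != 1 or not 0 <= pos < len(ref):
        raise ValueError("expected a single-base substitution within the leader")
    variant = ref[:pos] + alt_base + ref[pos + 1:]
    before, after = uaug_positions(ref), uaug_positions(variant)
    return {"created": sorted(after - before), "deleted": sorted(before - after)}


# Toy leader with an ATG-to-TTG exchange, analogous in kind to the
# start-codon mutation described for the C/EBPbeta uORF (sequence is made up).
print(snp_effect("GGATGCC", pos=2, alt="T"))
# The reverse case: a substitution that introduces a new uAUG.
print(snp_effect("GGACGCC", pos=3, alt="T"))
```

A screen of patient-derived 5′ leader sequences could apply such a comparison to each observed variant as a first filter before assessing the other structural features of the affected uORF.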

Conclusions and prospects
The recent validation of the (patho)physiological importance of uORF translation in mice added a new level of significance to this cis-regulatory mechanism of translational control. C/EBPα and β transcription factors represent well-established examples of how translational control by uORFs may affect cell fate decisions. Accumulating evidence suggests that deregulated uORF function might be a widespread mechanism underlying the development of human diseases. The rapid progress in advanced sequencing technologies will permit screening approaches to identify causative uORF mutations in primary material derived from patients. Malignancies of the blood might be among the most suitable types of diseases to start such an analysis, as cell samples are readily accessible. One would, for example, expect to uncover mutations resulting in a "loss of uORF function" in proto-oncogenes, causing their ectopic and transformation-inducing overexpression. In turn, mutations yielding a "gain of uORF function" in tumor suppressor genes may result in malignant transformation due to a decreased production of protective proteins (Fig. 5). Given the high number of human transcripts carrying at least one uORF, the in-depth analysis of 5′ leader sequence mutations has the potential to substantially widen the spectrum of diseases with molecularly resolved etiology. Uncovering disease-related uORF mutations will inspire extensive subsequent research aiming to target the misexpressed proteins for therapeutic intervention.