NLZ1 mRNA Expression in the Adult
Virtually all previously existing data on the expression of NET family genes refer to immature animals, probably because the genes were primarily identified through their roles in development [Cheah et al., 1994; Davis et al., 1997; Andreazzoli et al., 2001; Dorfman et al., 2002; Zhao et al., 2002; Runko and Sagerstrom, 2003, 2004; Chang et al., 2004; Hoyle et al., 2004; Nakamura et al., 2004, 2008; McGlinn et al., 2008; Brown et al., 2009; Ji et al., 2009]. As for the human genes, no expression studies have been reported, except for those dealing with breast cancer [Garcia et al., 2005; Gelsi-Boyer et al., 2005; Yang et al., 2006; Melchor et al., 2007; Bernard-Pierrot et al., 2008; Kwek et al., 2009; Holland et al., 2011; Slorach et al., 2011; Sircoulomb et al., 2011]. We have therefore studied NLZ1 expression in adult human tissues as part of our effort to characterize this important protein. Our results show that NLZ1 is ubiquitously expressed in adult tissues (Fig. 1). This pattern of expression was demonstrated by RT-PCR and RACE analyses of both human and mouse tissues, and is supported by detailed inspection of available EST data, suggesting that it might be a general phenomenon, at least in mammals. The ubiquitous expression of NLZ1 is somewhat surprising, given the highly restricted pattern of expression displayed by NET genes during development in multiple species [Cheah et al., 1994; Davis et al., 1997; Andreazzoli et al., 2001; Dorfman et al., 2002; Zhao et al., 2002; Runko and Sagerstrom, 2003, 2004; Chang et al., 2004; Hoyle et al., 2004; Nakamura et al., 2004, 2008; McGlinn et al., 2008; Brown et al., 2009; Ji et al., 2009]. However, this might be only an apparent contradiction, as the developmentally restricted expression of NET genes could be mirrored, in the adult, by expression in specific cell types in different organs.
On the other hand, the ubiquitous expression of NLZ1 suggests that it participates in cellular processes other than the developmental ones it has been associated with. For example, recent evidence that NLZ1 behaves as an oncogene for the luminal B subtype of human breast cancer [Holland et al., 2011; Sircoulomb et al., 2011] points to its involvement in cell identity specification in the mammary ducts. This is supported by the findings that NLZ1 is a Wnt signaling repressor, downregulates E-cadherin and TGFβ expression [Holland et al., 2011; Slorach et al., 2011] and is implicated in the regulation of the estrogen receptor (ER) and the E2F1 transcription factor [Sircoulomb et al., 2011].
Alternative Polyadenylation Generates Different NLZ1 Isoforms
In this study, we identified three human NLZ1 mRNA species, which we named mRNA1 to mRNA3 according to their size (Fig. 2). Comparison of their sequences shows that they differ only in the lengths of their 3′-UTRs, indicating that they are generated by alternative polyadenylation, a process estimated to occur in half of the human coding genes [Tian et al., 2005; Yan and Marr, 2005]. It has been shown previously that when 3′-UTR sequences contain more than one polyadenylation signal, one of them is used preferentially [Tian et al., 2005; Yan and Marr, 2005]. This seems to be the case with NLZ1, as 53 of the 56 human ESTs containing 3′-ends available in the UniGene database correspond to the longest transcript (mRNA3), whereas only three of them match mRNA1 and none correspond to mRNA2. A bias towards mRNA3 is consistent with the fact that only mRNA3 contains the canonical polyadenylation signal AATAAA (21 nucleotides upstream of the cleavage site), which is a stronger signal, present in 60% of all human mRNA 3′-ends [Beaudoing et al., 2000]. In contrast, mRNA1 seems to be generated through the use of the alternative signal AATATA (located 20 nucleotides upstream of the cleavage site), which occurs in only 2% of human mRNAs [Beaudoing et al., 2000; Tian et al., 2005] (Fig. 2). Though mRNA2 has no recognizable polyadenylation signal within the 10–30 nucleotide interval upstream of the cleavage site, this does not completely exclude its existence in human cells. Indeed, although polyadenylation signals are particularly important as binding sites for the Cleavage and Polyadenylation Specificity Factor (CPSF), it is known that no conventional polyadenylation signal is detectable in 20 to 30% of human 3′-UTRs [Beaudoing et al., 2000; Tian et al., 2005].
Furthermore, the alignment of the human mRNA 3′-ends with the homologous region in 12 representative mammalian species reveals a high degree of sequence conservation, including the polyadenylation signals and cleavage sites (Supplementary Fig. S2). The 3′-end of mRNA2 displays a degree of conservation in mammals comparable to that found surrounding the ends of mRNA3 and mRNA1, suggesting that it might be functionally important (Supplementary Fig. S2).
The existence of multiple 3′-UTRs, potentially interacting with different sets of RNA-binding proteins or miRNAs, might have important implications for NLZ1 mRNA stability, localization and translation efficiency [Mazumder et al., 2003]. Because the different mRNAs share the same coding region, this will not affect protein structure, but it might nevertheless be important for generating differences in NLZ1 mRNA species or protein levels between developmental stages, cell types or tissues [Yan and Marr, 2005]. In the absence of tissue-specific data on protein expression and distribution, differential regulation of NLZ1 protein levels thus remains an interesting possibility. In this regard, it is also interesting that, of the three mRNA1 ESTs detected, two are from pancreas and the third is from an unspecified tissue, raising the possibility of an mRNA1 bias in pancreatic tissue.
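The search for polyadenylation signals discussed in this section, that is, looking for a known hexamer within the 10–30 nucleotide interval upstream of a cleavage site, can be illustrated with a minimal Python sketch. This is not the procedure used in this study: the function name, the toy sequences, and the convention of measuring distance from the first base of the hexamer to the cleavage site are all assumptions made for illustration only.

```python
# Illustrative sketch only: names, toy sequences and the distance convention
# (first base of the hexamer to the cleavage site) are assumptions.

# The two hexamers discussed in the text: canonical and alternative signal.
SIGNALS = ("AATAAA", "AATATA")


def find_polya_signals(utr, cleavage_index, signals=SIGNALS,
                       min_dist=10, max_dist=30):
    """Return (hexamer, distance) pairs for signals that start between
    min_dist and max_dist nucleotides upstream of cleavage_index.

    utr            -- 3'-UTR sequence (DNA or RNA alphabet)
    cleavage_index -- 0-based index of the first base after the cleavage site
    """
    seq = utr.upper().replace("U", "T")  # accept RNA input as well
    hits = []
    for dist in range(min_dist, max_dist + 1):
        start = cleavage_index - dist
        if start < 0:
            break  # ran off the 5' end of the supplied sequence
        hexamer = seq[start:start + 6]
        if hexamer in signals:
            hits.append((hexamer, dist))
    return hits


if __name__ == "__main__":
    # Toy sequence (not NLZ1): canonical signal starting 21 nt upstream
    # of a cleavage site placed at the end of the sequence.
    toy = "G" * 15 + "AATAAA" + "C" * 15
    print(find_polya_signals(toy, len(toy)))

    # Toy sequence with no recognizable signal in the window.
    print(find_polya_signals("GC" * 20, 40))
```

An empty result for a given cleavage site corresponds to the situation described for mRNA2, where no recognizable signal lies within the window; as noted above, this alone does not rule out a genuine polyadenylation site.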
Identification and Function Analysis of the NET-Specific Conserved Domains
A combination of functional studies with in silico evolutionary analyses was used to identify and validate conserved regions of NLZ1 homologues. We found that all NET vertebrate proteins have three additional conserved domains besides the Sp, Btd Box and C2H2 Zinc finger domains previously identified in both NET proteins [Dorfman et al., 2002; Nakamura et al., 2004; Runko and Sagerstrom, 2004] and members of the Sp-family of transcription factors [Athanikar et al., 1997; Suske, 1999; Schaeper et al., 2010]. The novel conserved domains identified here were named LP, PY, and YL, according to the most commonly conserved amino acids in each, and were found to be present only in NET proteins, suggesting that they might be essential elements for their specific cellular functions (Figs. 3 and 5A). Apart from the six generally conserved domains, several low complexity amino acid stretches display a certain degree of conservation within the NLZ1 or NLZ2 orthologues, or even between the two groups, though the domain sizes are not conserved. For instance, in the majority of the NLZ1 orthologues, alanine-rich and serine-rich motifs are present between the Sp and PY domains, whereas in the NLZ2 group a glycine-rich region is detectable between the same domains. In addition, between the zinc finger and the YL domains an alanine-rich motif is evident for NLZ1 and NLZ2 orthologues in the majority of the species.
The three novel NET-specific domains (LP, PY, and YL) display a high degree of conservation (≥80%) in both paralogue groups of NET vertebrate proteins (Fig. 3), suggesting they might be important for these proteins' physiological functions, such as transcription repression [Dorfman et al., 2002; Runko and Sagerstrom, 2003; Nakamura et al., 2008; Brown et al., 2009; Ji et al., 2009; Holland et al., 2011; Slorach et al., 2011]. To explore this possibility, we started by assessing their role in binding to co-repressors of the Groucho family, since previous studies implicated different regions of the NET proteins in mediating these interactions [Runko and Sagerstrom, 2003, 2004]. However, our experiments demonstrate that none of the LP, PY, or YL domains is required for human NLZ1's interaction with the Groucho family member GRG5 (Fig. 4).
Next, we tried to determine if any of the NET-specific domains are important for subcellular targeting of human NLZ1, as the protein was found to have a nuclear localization but no canonical NLS has been identified in any of the NET family members. Although NLZ1 has an essentially homogeneous nuclear distribution (Fig. 5B), a small number of dot-like structures were observed with the NLZ1 full-length construct (this study and [Sircoulomb et al., 2011]). Intriguingly, when the region that included the Sp and the LP domains was deleted (NLZ1280–590), a considerable increase in the density of nuclear dots was observed, leading to a predominantly punctate distribution (Fig. 5). This observation could indicate that the LP or Sp domains are necessary for the normal distribution of the protein inside the nucleus. Several nuclear proteins are located in nuclear bodies, like Promyelocytic leukaemia bodies (PML) or Oncogenic domains (PODs) [Matera, 1999], but this does not seem to be the case for NLZ1, according to a recent study [Sircoulomb et al., 2011]. Even so, the punctate nuclear distribution observed could have implications for the protein's normal functioning, by preventing it from reaching some chromatin areas and affecting its repressor activity. More striking results were obtained with the PY and YL domain-deletion constructs: they clearly demonstrate that both domains are important for proper NLZ1 nuclear localization, because deleting either of them leads to retention of NLZ1 in the cytoplasm (Fig. 5B). However, only the C-terminal YL domain seems essential for the process, as its deletion causes complete nuclear exclusion of NLZ1. These results are consistent with a previous study in zebrafish showing that the 498–566 C-terminal residues of Nlz1 are essential for nuclear localization [Runko and Sagerstrom, 2003].
Finally, we investigated potential functional roles for NET-specific domains in transcription repression, using two different assays. First, we used Gal4-DBD fusions to test the intrinsic ability of different human NLZ1 fragments to repress a promoter to which they were tethered. We confirmed that human NLZ1 is a transcription repressor, and showed that full repression activity requires both the YL domain and a region that includes the LP and Sp domains (Fig. 6). Importantly, the effects of YL domain deletion on repression and nuclear localization are likely to be independent, as all the fusions tested contain Gal4's nuclear localization signal. Second, we assessed the effect of NLZ1 domain deletions on the protein's ability to specifically repress TCF/β-catenin-mediated transcription, which is activated by Wnt signaling. The Wnt pathway plays a crucial role in a variety of developmental processes in all animal species [Logan and Nusse, 2004] and there is evidence linking it to NET proteins in both Drosophila [Weihe et al., 2004] and mouse [Slorach et al., 2011]. Full-length NLZ1 was able to specifically repress transcription from a TCF/β-catenin-responsive promoter and, as in the previous experiment, full activity required the YL domain and especially the region encompassing both the LP and Sp domains. Interestingly, deletion of the YL domain had a smaller impact on repression, even though it prevents nuclear translocation of the protein. These results further suggest a role for NET-specific domains in NLZ1's activity as a transcription repressor, but also that the mechanisms of repression by NET proteins are complex, likely involving direct effects on promoters as well as cytoplasmic functions. Given recent evidence for NLZ1's involvement in breast oncogenesis [Holland et al., 2011; Slorach et al., 2011; Sircoulomb et al., 2011], our results also suggest the NET-specific domains as interesting targets for breast cancer genetic research or therapy.
Our domain-deletion constructs simulate loss-of-function mutations, rather than the “increased function” (through amplification) observed in human tumors, but nevertheless suggest protein domains whose function might be enhanced through specific mutations. Perhaps more importantly, one can envisage that drugs capable of disrupting the function of NET-specific domains (e.g., in NLZ1 nuclear translocation or Wnt signaling) might constitute new and specific weapons against breast cancers.
In conclusion, we present here the first detailed analysis of human NLZ1 under non-pathological conditions. We show that NLZ1 is ubiquitously expressed, that its mRNA is subject to alternative polyadenylation and that NLZ1 and its NET homologues share six conserved domains, three of which are novel and NET-protein specific. Importantly, this is the first report that links nuclear localization of a NET protein to two different regions: the centrally located PY domain and the C-terminal YL domain. The mechanisms involved are unknown, but it is possible that both the PY and YL domains mediate interactions with other proteins that assist in targeting NLZ1 to the nucleus, since NET proteins lack a conventional NLS. Finally, we present evidence that NET-specific domains are also important for NLZ1's function as a transcription repressor. These results imply that further studies of NET proteins should take into account the repercussions that potential mutations may have on the normal functioning of the newly identified domains.
Beyond the longstanding association of its homologues with embryonic development, human NLZ1 has recently gained relevance as a breast oncogene and potential key player in human breast cell differentiation. By furthering our understanding of human NLZ1, and pointing to specific questions to be addressed regarding its function, our results should help shed light on its roles both in normal development and tumourigenesis.