Identification, cloning and expression analysis of the pluripotency promoting Nanog genes in mouse and human



The murine Nanog gene, a member of the homeobox family of DNA binding transcription factors, has been shown recently to maintain pluripotency of embryonic stem cells. We have used a sequence homology and expression screen to identify and clone the mouse and human Nanog genes and characterized their phylogenetic context and expression patterns. We report here the gene structure and expression patterns of the mouse Nanog gene, the human Nanog and Nanog2 genes, and six processed human Nanog pseudogenes. Mouse Nanog expression is high in undifferentiated embryonic stem cells and is down-regulated during embryonic stem cell differentiation, concomitant with loss of pluripotency. Murine embryonic Nanog expression is detected in the inner cell mass of the blastocyst. After implantation, Nanog is detectable at embryonic day (E) 6 in proximal epiblast in the region of the presumptive primitive streak. Expression extends distally as the streak elongates during gastrulation and remains restricted to epiblast. Nanog RNA is down-regulated in cells ingressing through the streak to form mesoderm and definitive endoderm. Nanog expression also marks the pluripotent germ cells of the nascent gonad at E11.5–E12.5 and is highly expressed in germ cell tumour and teratoma-derived cell lines. Reverse transcriptase-polymerase chain reaction analysis detected mouse Nanog expression at low levels in several adult tissues. The human Nanog genes are expressed in embryonic stem cells and down-regulated in all adult tissues and differentiated cell lines examined. High levels of human Nanog expression were detected by Northern analysis in the undifferentiated N-Tera embryonal carcinoma cell line. The conservation in gene sequence, structure, and expression of mouse and human Nanog and Nanog2 genes may reflect a common role in the maintenance of pluripotency in both species. Developmental Dynamics 230:187–198, 2004. © 2004 Wiley-Liss, Inc.


Homeodomain proteins regulate diverse developmental programs by controlling the temporal and spatial expression of other genes. During early embryogenesis, homeobox genes have been demonstrated to be important in patterning and lineage specification. For example, the Otx2 homeobox gene is expressed in anterior visceral endoderm and is required for correct anterior–posterior axis formation and patterning of the gastrulating embryo (Kimura et al., 2001). Loss of function of Otx2 leads to loss of axial mesoderm and complete deletion of the forebrain (Ang et al., 1996). Another homeobox gene of the Paired-like class, Mixl1, is expressed at embryonic day (E) 7.5 in the primitive streak and in the emerging mesoderm (Pearce et al., 1999; Robb et al., 2000). Targeted disruption of the Mixl1 gene causes lethal defects in the patterning and morphogenesis of the axial mesoderm and blocks formation of the foregut and hindgut (Hart et al., 2002). We undertook a genome-wide scan for novel homeobox genes that function during lineage specification in the early embryo by in silico sequence analysis, then assessed their expression patterns in gastrulation stage embryos by reverse transcriptase-polymerase chain reaction (RT-PCR) and whole-mount in situ hybridization (Hart et al., manuscript in preparation). Our approach was to use BLASTN nucleotide–nucleotide similarity comparison and TBLASTN protein-translated nucleotide comparison between selected homeobox gene sequences and the recently completed murine genomic sequences generated by Celera and the GenBank database. Eight previously uncharacterized homeobox genes expressed during gastrulation were identified and cloned by RT-PCR for further analysis. We describe here a novel homeobox transcript, which was restricted to pluripotent lineages in the embryo, including embryonic stem (ES) cells, blastocyst ICM, epiblast before gastrulation, and primordial germ cells. 
This gene was recently named Nanog and has been shown to regulate pluripotency in ES cells (Chambers et al., 2003; Mitsui et al., 2003). We also describe the sequence, gene structure, and expression pattern of two human Nanog-related genes, Nanog and Nanog2.


Nanog Is a Novel Homeobox Gene With Vertebrate Orthologs

The Nanog gene was identified by using sequence homology searching. Homeobox genes share a conserved 60 amino acid DNA binding homeodomain. The homeodomain sequences from several homeobox genes known to be expressed during gastrulation in mouse and other species were used in TBLASTN analysis of the GenBank and Celera mouse genome sequences. Several novel homeobox genes were identified and cloned. A cDNA fragment cloned by RT-PCR from E7.5 RNA shared 100% sequence homology with the recently described ENK/Nanog cDNA sequence (Chambers et al., 2003; Mitsui et al., 2003; Wang et al., 2003). We confirmed the full-length cDNA sequence by using 5′ rapid amplification of cDNA ends (RACE) and found that three alternately spliced forms of Nanog are present in ES cells. The longest Nanog transcript was 2,185 bp in length, encoding a protein of 305 amino acids, identical to the ES cell-derived Riken expressed sequence tag (EST) sequence AK010332. Two shorter transcripts were encoded by the Nanog gene, initiating 110 and 83 base pairs (bp) upstream from the first transcript, incorporating short exons of 27 and 43 bp, respectively, and splicing to nucleotide (nt) 218 of the first transcript. The alternate splice forms Nanog1a (1,994 bp) and Nanog1b (1,997 bp) both encode a predicted protein of 279 amino acids, lacking the first 26 amino acids of the Nanog protein (Fig. 1; GenBank accession nos. AY455282, AY455285).

Figure 1.

The mouse Nanog gene encodes three alternate transcripts. The longest Nanog cDNA transcript is 2,185 bp long and encodes a 305 amino acid protein. The coding region of Nanog is interrupted by three introns (black triangles) and contains two conserved domains: a SMAD4 homology domain (double underline) and the homeodomain (underlined). A five amino acid motif beginning with Tryptophan (circled) is repeated 10 times in the carboxy end of the protein. The 3′ untranslated region contains a SINE-B2 retroposon (shaded). Nanog amino acids are numbered at left, nucleotides at right. Start of Nanog transcript is indicated by the black arrow, the Nanog reading frame start and stop codons by bold type. The first exons of alternate transcripts Nanog1a (dotted line) and Nanog1b (dashed line) encode transcripts initiated at alternate sites (grey arrows) that are alternately spliced (grey triangles) to produce identical 248 amino acid proteins, which begin at M26 (bold) of the longer Nanog transcript.

Orthologs of the mouse Nanog gene are present in the human (EST sequence FLJ12581) and rat genomes (based on genomic sequence NW_043796) and have been described previously (Chambers et al., 2003). In addition, we identified a monkey Nanog ortholog encoded by the EST sequence AB062943 and a second human ortholog, which we have named Nanog2 (GenBank accession no. AY455283). The human Nanog and Nanog2 cDNAs were cloned by RT-PCR from human ES cell-derived RNA (Reubinoff et al., 2000). The human Nanog and Nanog2 genes encode open reading frames of 305 and 233 amino acids, respectively, having approximately 55% amino acid sequence identity with the full length mouse Nanog protein (Fig. 2A). The human Nanog2 transcript was identical to the testis-derived EST sequence AK097770. Homology between human and mouse Nanog proteins was highest (87%) within the 60 amino acid homeodomain. We performed a multiple sequence alignment of the homeodomain of the cloned mouse and human sequences with the predicted rat and monkey Nanog and found that they displayed a similar degree of conservation within the homeodomain: 93% in rat and 85% in monkey (Fig. 2B). Sequence BLAST comparison of the murine Nanog homeodomain with GenBank and Celera protein databases revealed highest sequence homology with the Paired-like homeodomain transcription factors (Galliot et al., 1999). The most closely related murine proteins were BarX1 and NKX2-3, both of which share 50% identity with mouse Nanog within the 60 amino acid homeobox domain. However, Nanog does not share significant homology with NK or Paired-like proteins outside the homeodomain and lacks the TN and NK domains characteristic of NK homeoproteins (Harvey, 1996). We also noted that Nanog did not have any clear orthologs in primitive vertebrates or insects. Thus, the Nanog genes may have diverged from other homeobox genes during vertebrate evolution to form a distinct branch of the homeoprotein phylogenetic tree (Fig. 3).
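The percentage identities quoted above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (for example, by CLUSTALX) and that identity is counted only over columns where neither sequence has a gap; the text does not state which denominator was used, so this definition is an assumption. The function name and the toy fragments are hypothetical and are not real Nanog sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical 10-residue toy fragments, NOT real Nanog sequence.
toy_a = "KQKTRTVFSQ"
toy_b = "KQKTRTVFSS"
print(percent_identity(toy_a, toy_b))
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give slightly different values, so percentages are comparable only when computed the same way.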

Figure 2.

A: Multiple sequence alignment of the murine (Mus musculus) Nanog protein with human (Homo sapiens) Nanog and Nanog2, monkey (Macaca fascicularis) Nanog, and the predicted rat (Rattus rattus) Nanog proteins. Protein sequences were aligned using CLUSTALX, and amino acids identical to mouse Nanog are shaded. Conserved tryptophan repeats are indicated by asterisks. Amino acids are numbered at right. B: Alignment of the mouse (M.m.), human (H.s.), rat (R.r.), and monkey (M.f.) Nanog homeodomains with the paired-like homeodomain sequences of mouse BarX1, Nkx2-3, and MixL1. Amino acids identical to mouse Nanog are shaded black. Amino acid identity (%) within the homeodomain is shown at right. C: Alignment of the mouse (M.m.), rat (R.r.), and human (H.s.) Nanog amino terminal Smad4 homology domain with mouse, rat, and human Smad4 linker region. Amino acids identical to mouse Nanog are shaded black; similar amino acids are shaded grey. Identical amino acids are indicated by asterisks. Similar amino acids are indicated by dashes. Amino acids are numbered at left and right, and percentage sequence similarity is shown at far right.

Figure 3.

Phylogenetic tree of the Nanog proteins and related homeobox sequences. A multiple sequence alignment was generated using the 60 amino acid homeodomain sequences of the Nanog protein and other closely related homeobox proteins using CLUSTALX. The results of 1,000 bootstrap calculations were plotted using NJPLOT, and the bootstrap values are shown at each branch point. The Nanog proteins form a distinct sub group within a larger group of paired-like homeodomain proteins that include NK-related and Bar-related sequences. M.m., Mus musculus; H.s., Homo sapiens; M.f., Macaca fascicularis; R.r., Rattus rattus.

Additional Conserved Domains

Additional PSI-BLAST analysis of the Nanog protein sequences identified a conserved 43 amino acid proline-rich domain close to the amino terminus (amino acids 12-54). This domain shares sequence similarity (33%) with the central region of SMAD4 proteins (Fig. 2C). It contains five absolutely conserved prolines and three residues that are always proline or serine. This region in the mouse SMAD4 protein (202-256) is in the linker region between the amino terminal MH1 (Dwarfin A) DNA binding domain (Grishin, 2001) and the carboxy terminal MH2 (Dwarfin B) protein–protein interacting domain (Kim et al., 1997). Smad4 is an essential component of transforming growth factor-β signal transduction and loss of function of Smad4 leads to a failure of epiblast proliferation and mesoderm induction in the mouse (Yang et al., 1998). The function of the linker region in Smad4 is largely unknown, although it may play a role in ubiquitination (Moren et al., 2003). Of interest, human Nanog2 lacks the conserved SMAD4 homology domain present in human, mouse, and rat Nanog (Fig. 2A).

The Nanog protein contains a novel tryptophan repeat motif located between the homeodomain and the carboxy terminus (amino acids 198-247). A five amino acid motif (W/QXXXX) is repeated 10 times in the mouse and 9 times in human, monkey, and rat proteins. By using the iterative PSI-BLAST search, we found that several otherwise unrelated proteins share similar tryptophan repeats, for example, rabbit alpha-1-antiproteinase, 5 repeats (amino acids [aa] 6-31); human “similar to testes development-related NYD-SP21,” 4 repeats (aa 926-946). The tryptophan repeat motif does not resemble any known functional domain present in either PFAM, ProDom, or PROSITE protein domain databases and, therefore, may represent a novel domain with unknown structural and functional properties.
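As an illustration of how a tandem pentapeptide repeat of this kind might be located, the following Python sketch scans a protein sequence for the longest run of consecutive five-residue units whose first residue is W or Q, following the W/QXXXX pattern described above. This is not the PSI-BLAST procedure used in the study; the function name and the toy sequence are hypothetical, and the toy sequence is not real Nanog sequence.

```python
def longest_w_repeat_run(protein: str, unit: int = 5, first_residues: str = "WQ"):
    """Return (start, end, n_repeats) for the longest run of tandem units.

    Each unit is `unit` residues long and must begin with one of
    `first_residues`. Coordinates are 1-based and inclusive. If no unit
    is found, (0, 0, 0) is returned. Ties keep the first run encountered.
    """
    seq = protein.upper()
    best = (0, 0, 0)
    for i in range(len(seq)):
        n = 0
        j = i
        # Extend the run one full unit at a time while the unit starts with W or Q.
        while j + unit <= len(seq) and seq[j] in first_residues:
            n += 1
            j += unit
        if n > best[2]:
            best = (i + 1, j, n)
    return best


# Hypothetical toy sequence: 3 flanking residues, three W-initiated units, 2 flanking residues.
toy = "MSA" + "WSNQT" + "WNNPT" + "WSSQA" + "GK"
print(longest_w_repeat_run(toy))
```

Because any residue is accepted in the four positions after W or Q, a scan like this only flags candidate repeats; whether a run is biologically meaningful still depends on alignment and conservation across orthologs.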

Repetitive Elements

Analysis of the mouse and human Nanog cDNA 3′ UTR sequence with RepeatMasker revealed several SINE (Short INterspersed Element) retroposon-related sequences, including a complete SINE-B2 element (nucleotides 1178-1364) in the mouse transcript and a closely related AluY element (nucleotides 1360-1660) in the human transcript. Of interest, SINE-B2 elements can be transcribed in embryonic lineages (Ferrigno et al., 2001) and have been shown to contribute to regulation of gene transcription in preimplantation embryos (Bladon and McBurney, 1991). The conservation of these noncoding sequences between mouse and human may reflect a conserved functional mechanism. The presence of retroposon sequences in the 3′ UTR sequence of Nanog may have also contributed to the evolution of recent duplications and pseudogenes in the mouse and human (see below). Interestingly, the SINE/Alu element present in the 3′ UTR of Nanog has been lost from the Nanog2 transcript, and we found that expression of Nanog2 was less abundant than Nanog in human ES cells (not shown).

Nanog Gene Structure and Pseudogenes

The mouse Nanog gene is composed of four exons and spans approximately 7 kb on chromosome 6, cytogenetic band F2 (GenBank locus 71950; Fig. 4A). Alternate splice forms of mouse Nanog present in ES cells, Nanog1a and Nanog1b, contain additional upstream exons (Nanog1a, nt -110 to -83 and Nanog1b, nt -30 to 15) and splice to a cryptic acceptor site (nt 218) in the first exon of Nanog (Fig. 1). Sequence BLAST analysis of the mouse genome using the mouse Nanog cDNA sequence indicated the presence of highly related sequences on chromosomes X and 12. These sequences share homology with the Nanog open reading frame and untranslated region but do not share the same intron–exon structure as Nanog. A high degree of homology between Nanog and these sequences (as high as 97% over 1kb) raises the possibility of errors created during assembly of the draft genome sequence (Cheung et al., 2003). Furthermore, we observed re-arrangement and re-localisation of these sequences after re-assembly of the mouse genome (NCBI build 30). Southern analysis using a mouse Nanog open reading frame probe also indicated the presence of at least two highly related gene sequences (not shown). To determine whether any of the Nanog-related sequences were expressed, we performed RT-PCR and sequenced 25 unique transcripts derived from ES cells and 25 transcripts from pooled adult organs. All transcripts sequenced were derived from the chromosome 6F2 locus. Comparison of mouse and predicted rat Nanog cDNAs with the rat genome sequence revealed that the rat Nanog gene is similar in structure to mouse Nanog.

Figure 4.

Nanog gene structure and pseudogenes. A: The mouse Nanog gene spans approximately 7 kb on chromosome 6F2, consists of four exons, and encodes a 2,185-bp transcript. Alternate transcripts Nanog1a and Nanog1b are produced after alternate splicing within the first exon. B: The human Nanog and Nanog2 genes share a similar structure and map to 12p13. At least six Nanog pseudogenes are predicted in the human genome. Intron size is indicated above (bp); exon size is indicated below (bp); open boxes indicate untranslated regions; black boxes, coding regions; grey boxes, homeodomains; and stippled boxes, retroposon sequences. Chromosome localisation is shown at left and length of transcript (bp) at right. GenBank accession numbers for gene transcripts are at far right. Nucleotide identity with Nanog (%) is shown below exons and pseudogenes.

We cloned transcripts of the human Nanog and Nanog2 genes by RT-PCR from human embryonic stem cell-derived RNA (Reubinoff et al., 2000). The human Nanog and Nanog2 genes were localized to chromosome 12p13 by sequence comparison with the human genome sequences at Celera and GenBank. We found that the human Nanog and Nanog2 genes lie in a head-to-tail configuration on chromosome 12p13 separated by approximately 100 kb. The human Nanog gene, like its mouse ortholog, spans approximately 7 kb and is made up of four exons (Fig. 4B). The Nanog2 gene, which is located upstream from Nanog at 12p13, has a similar structure to Nanog; however, the first intron is much larger and the SINE/Alu element present in the 3′ UTR of Nanog is absent. The human genome also contains several processed Nanog pseudogenes on other chromosomes. These putative pseudogenes lack intron–exon structure, consisting of continuous Nanog-related sequence with a 3′ polyA tract and flanking direct repeats characteristic of processed pseudogenes (Vanin, 1985). By using BLAST analysis, we identified Nanog pseudogenes on human chromosomes 15q13.2 (NanogPseudogene-1, GenBank accession no. AY455284), 7p15.3 (NanogPseudogene-2, GenBank accession no. AY455277), 14q32.12 (NanogPseudogene-3, GenBank accession no. AY455278), 2q36.1 (NanogPseudogene-4, GenBank accession no. AY455279), Xp11.4 (NanogPseudogene-5, GenBank accession no. AY455280), and Xq11.1 (NanogPseudogene-6, GenBank accession no. AY455281). These putative pseudogenes share greater than 90% sequence identity with Nanog. Nanog Pseudogene-1, 3, and 6 encode uninterrupted open reading frames predicted to encode Nanog-related proteins (XP_208115, XP_11301925, XP_11301925). To assess the expression of Nanog and related transcripts in human ES cells, we used PCR primers common to Nanog and Nanog2 and to the predicted pseudogenes in RT-PCR. Sequencing of 48 independent cDNA transcripts derived from RT-PCR of human embryonic stem cell RNA confirmed that Nanog and Nanog2 are the only expressed transcripts present at this stage. Of interest, the Nanog-related sequences in mouse also appear to be the products of retrotransposition, but they have undergone gene rearrangements and deletions. Therefore, human Nanog pseudogenes may have arisen recently in evolutionary terms, after mouse and human divergence, and proliferated as a by-product of Alu/SINE-mediated retroposition.

Nanog Expression in Mouse Pluripotent Cell Lines and Tissues

Murine Nanog expression was studied in embryonic and adult tissues by using RT-PCR. Nanog was expressed in undifferentiated mouse ES cells (Fig. 5A) and P19 embryonal carcinoma cells (not shown). In the preimplantation embryo, expression of Nanog was detected in morula and blastocysts (E3.5), and after implantation at E6.5 and E7.5 (Fig. 5A). The expression of mouse Nanog was down-regulated after E8.5. We examined mouse Nanog expression in adult tissues by Northern blot analysis of mRNA. Expression was undetectable by Northern analysis in adult liver, kidney, brain, skin, stomach, testis, uterus, bone marrow, lung, muscle, thymus, spleen, heart, or gut (not shown). Low levels of Nanog expression were detected in many adult tissues by RT-PCR (Fig. 5B).

Figure 5.

A: Murine Nanog developmental expression profile. Nanog expression assessed by reverse transcriptase-polymerase chain reaction (RT-PCR) shows that embryonic stem cells (ES) and embryoid bodies (EB) after 3 days of differentiation (EB3) express Nanog at relatively high levels. Nanog expression decreases around day 4 of embryoid body differentiation (EB4). In the embryo, Nanog is detected in preimplantation blastocysts (embryonic day [E] 3.5), reaches a peak around E6.5–E7.5, decreases at E8.5, and is absent from developing embryos after E9.5. No Nanog expression is seen in newborn (N.B.) mice. −RT, negative control lacking reverse transcriptase; hprt, hypoxanthine-phosphoribosyl-transferase. B: Nanog is expressed at low levels in adult mouse tissues. RT-PCR analysis of Nanog expression in a range of adult murine tissues. −RT, negative control lacking reverse transcriptase; hprt, hypoxanthine-phosphoribosyl-transferase; stom, stomach; bone m., bone marrow. C: Nanog expression is down-regulated during ES cell differentiation. Embryonic stem (ES) cells cultured in the absence of LIF spontaneously differentiate to form EB. Nanog expression is down-regulated at day 5 (EB5), coincident with loss of pluripotency marked by Oct4 down-regulation. The mesodermal markers MixL1 and Brachyury are expressed on day 3 of EB formation and are down-regulated on day 5. Gapdh, glyceraldehyde-3-phosphate dehydrogenase.

To further investigate the expression of Nanog during differentiation and lineage specification, we differentiated ES cells to form embryoid bodies (Keller, 1995). In this system, the withdrawal of leukemia inhibitory factor (LIF) induces the ES cells to form embryoid bodies that contain progenitors of multiple lineages. Nanog expression was down-regulated after the 4th day of ES cell differentiation into embryoid bodies (Fig. 5C). Differentiation of ES cells and lineage restriction in embryoid bodies occur in the same sequence as in the embryo (Keller et al., 1993). By day 5 of embryoid body differentiation, pluripotent cells of mesoderm, endoderm, and ectodermal lineages have begun to differentiate, forming tissue progenitors with restricted developmental potential. The formation and subsequent commitment of mesoderm is reflected by the restricted expression of Brachyury and Mixl1 in this system (Fig. 5C; Robb et al., 2000). We found that Nanog expression mimicked that of the pluripotency marker Oct4, which is also expressed in undifferentiated ES cells and is down-regulated at day 5 of embryoid body differentiation.

Expression of Nanog in the early mouse embryo was examined by whole-mount RNA in situ hybridization. Nanog transcripts were first detected in morulae (Fig. 6A). In the blastocyst, Nanog expression was confined to the inner cell mass (Fig. 6B). Nanog RNA was detected in the postimplantation murine embryo at the egg cylinder stage in the proximal epiblast at the site of the presumptive primitive streak (Fig. 7A). At the prestreak and early streak stages, Nanog expression was present in proximal epiblast with strongest expression in the region of the streak (Fig. 7B). In the no allantoic-bud to late-bud stage, expression remained confined to epiblast, with very weak expression in anterior epiblast and strong expression in posterior epiblast (Fig. 7C–H). Nanog expression was down-regulated as epiblast cells entered the primitive streak and underwent epithelial to mesenchymal transition, forming mesoderm (Fig. 7I). After the late-bud stage, the expression of Nanog waned and, by E8, was no longer detectable by in situ hybridization. Whole-mount in situ hybridization of genital ridge from E11.5 embryos revealed expression in the developing gonad (not shown). Like Nanog, the homeobox gene Oct4, a molecular marker of pluripotency, is expressed in the ICM of the blastocyst and in epiblast in postimplantation embryos (Scholer et al., 1990). We compared the expression of Oct4 in the midstreak and late streak embryo (Fig. 7J,K). In contrast with Nanog, which showed a striking restriction of expression to epiblast, Oct4 transcripts were detectable throughout epiblast and emerging mesoderm (Fig. 7L). The restricted, asymmetric pattern of Nanog expression in the epiblast of the gastrulating embryo was similar in nature to that of many genes known to function in early embryonic patterning and lineage specification. 
For example, expression of the mesendodermal patterning homeobox gene MixL1 is restricted to the primitive streak and emerging mesendoderm of early and late-bud embryos (Fig. 7M–O).

Figure 6.

Mouse Nanog expression in preimplantation embryos. A,B: Whole-mount in situ hybridization using a Nanog riboprobe revealed expression in morulae (A) and the inner cell mass of the blastocyst (B). C: Negative control, mouse Nanog sense probe.

Figure 7.

Expression of Nanog in postimplantation murine embryos. Whole-mount RNA in situ hybridization analysis of Nanog (A–F), Oct4 (J,K), and MixL1 (M,N) expression and transverse paraffin sections of indicated embryos (G–I,L,O). A: The earliest expression of Nanog in postimplantation embryos was detected at the egg cylinder stage, asymmetrically in the proximal epiblast at the embryonic/abembryonic junction. B: Nanog expression extends distally in the epiblast within cells of the prospective primitive streak. C: Nanog expression is localized to the epiblast of the emerging streak at the early streak stage. D–F: Transcripts are detected anteriorly within the epiblast as the streak forms and elongates along the embryonic anteroposterior axis during early-bud to late-bud stage embryos at embryonic day (E) 7–E7.5. G: A transverse section at E6.5 at the position indicated in C shows that Nanog is not expressed in the visceral endoderm and is asymmetrically expressed in the epiblast. H,I: Transverse sections at E7.0 (H) and E7.5 (I) at the positions indicated in D and F show that there is a gradient of Nanog expression in epiblast from anterior to posterior, with expression being maximal in epiblast cells flanking the primitive streak. Expression is lost as the cells enter the streak and differentiate to form mesoderm. J,K: Widespread Oct4 expression in early and late-bud embryos. L: A transverse section at the position indicated in K shows Oct4 expression is present in epiblast and in mesoderm. M,N: The Mixl1 homeobox gene is expressed in the primitive streak in early and late-bud embryos. O: Transverse section indicated in N, showing Mixl1 expression restricted to the primitive streak and emerging mesendoderm.

Human Nanog and Nanog2 Expression

The human Nanog gene has been detected in the embryonal carcinoma cell line GCT27 (Chambers et al., 2003) and several expressed sequences encoding human Nanog are listed in the TIGR provisional human consensus entry THC1463551. These include Nanog EST sequences derived from teratoma (AU125170), testis tumor (AA301077), bone marrow (BF893620), and pooled germ cell tumors (BX108731). Expressed sequences identical to Nanog2 (THC1413636) have been isolated from adult testis (AK097770) and from epididymal (BF773088) tumors. To further assess human Nanog and Nanog2 expression, we analyzed a panel of 22 human tumor-derived cell lines by Northern analysis of polyA+ RNA. Expression of the expected 2.2-kb Nanog transcript was detected at high levels in the embryonal carcinoma cell line N-tera but was undetectable in tumor cell lines of hematopoietic, colon, breast, liver, or skin origin (Fig. 8). We also analyzed Nanog and Nanog2 expression levels by Northern analysis of mRNA from adult tissues, including testis, ovary, bone marrow, spleen, thymus, colon, brain, placenta, lung, kidney, skeletal muscle, and heart. Nanog expression was not detectable by Northern analysis of adult human tissues (data not shown). The expected 1.7-kb Nanog2 transcript was not detected by Northern analysis of cell lines or tissues, although human ES cell RNA was not available for Northern analysis.

Figure 8.

Northern analysis of human Nanog expression in tumor cell lines. PolyA+ RNA from a range of human tumor cell lines was hybridized with a radiolabeled human Nanog cDNA probe. The blot was subsequently stripped and reprobed with a glyceraldehyde-3-phosphate dehydrogenase (GAPDH) cDNA probe. Expression of Nanog is restricted to the embryonal carcinoma cell line N-Tera. 1, Raji (Burkitt's lymphoma); 2, EB3 (Burkitt's lymphoma); 3, CA46 (Burkitt's lymphoma); 4, U266B (plasmacytoma-myeloma); 5, Hut-78 (lymphoma); 6, HSB-2 (T cell leukemia); 7, SupT1 (lymphoma); 8, TF-1 (erythroleukemia); 9, K562 (chronic myeloid leukemia); 10, HEL (erythroleukemia); 11, MEG-01 (megakaryoblastic leukemia); 12, HL-60 (promyelocytic leukemia); 13, AML-1 (acute myeloid leukemia); 14, U937 (lymphoma); 15, Allen-1 (Ewing's sarcoma); 16, N-Tera (embryonal carcinoma); 17, A375 (melanoma); 18, COLO201 (colorectal adenocarcinoma); 19, COLO205 (colorectal adenocarcinoma); 20, COLO320DM (colorectal adenocarcinoma); 21, MDA-MB231 (breast adenocarcinoma); 22, HepG2 (hepatocellular carcinoma).


Nanog Expression, Function, and Orthologs

We describe the identification, cloning, and expression analysis of the mouse and human Nanog genes and the human Nanog2 homeobox gene. We isolated three alternately spliced forms of the mouse Nanog transcript in a screen for homeobox genes expressed during gastrulation. Comparison with the human genome sequence led us to identify and clone the human Nanog and Nanog2 ortholog transcripts from human ES cell RNA. Together with predicted orthologs in rat and monkey, the Nanog genes form a new conserved subfamily of Paired-like homeodomain proteins. The Nanog orthologs share a highly conserved homeodomain, flanked by a novel amino terminal SMAD4 homology domain and a carboxy tryptophan repeat motif. Recent studies have demonstrated a role for the murine Nanog gene in stem cell self-renewal (Chambers et al., 2003) and maintenance of the inner cell mass (Mitsui et al., 2003). Overexpression of mouse or human Nanog in ES cells can overcome the requirement for LIF to maintain the undifferentiated state and can block differentiation in the presence of retinoic acid or 3-methoxybenzamide. Nanog appears to act independently of the LIF receptor/gp130/stat3 signaling pathway to promote ES cell self-renewal (Chambers et al., 2003). Gene targeting of mouse Nanog revealed that Nanog expression is required for the maintenance of the epiblast and postimplantation development. Nanog-deficient ES cells differentiate in culture to form parietal and visceral endoderm, and Nanog null blastocysts do not give rise to epiblast outgrowths in culture (Mitsui et al., 2003).

The expression pattern of the Nanog genes is consistent with the theory that the activity of this transcription factor is linked to the maintenance of lineage pluripotency. Nanog expression is present in the inner cell mass, epiblast, and germ cells, all pluripotential tissues, and is down-regulated in somatic descendants of the inner cell mass. Our analysis of mouse Nanog expression demonstrates for the first time that Nanog is expressed in epiblast of the postimplantation embryo and is present at low levels in many adult tissues. These findings suggest that, in addition to its role in embryonic stem cell self-renewal (Chambers et al., 2003) and maintenance of the ICM (Mitsui et al., 2003), Nanog may also function during early postimplantation development and in many adult tissues. We found that expression of Nanog is restricted to the proximal epiblast at the site of the presumptive primitive streak at the egg cylinder stage, anticipating streak formation. As the streak forms, Nanog expression remains high in the surrounding epiblast but is down-regulated as the epiblast cells ingress into the streak and differentiate into mesendodermal precursors. The asymmetrical pattern of Nanog expression within the epiblast contrasts with that of the pluripotency-promoting homeobox gene Oct4, which is expressed uniformly in the epiblast and mesoderm of the gastrulating embryo. The restricted pattern of Nanog expression in the postimplantation embryo more closely resembles that of genes involved in early embryonic patterning and lineage determination such as Mixl1 (Hart et al., 2002) or Otx2 (Ang et al., 1996), which are typically confined to specific regions or cell lineages within the early embryo. However, restriction of expression to epiblast alone is unusual and points to a relationship between Nanog expression and tissue pluripotency.

Low levels of mouse Nanog expression were reproducibly detected in many adult tissues; however, we were unable to determine whether expression was restricted to specific cellular compartments or reflected a generalized low level of expression. We have also identified and cloned two human Nanog orthologs, Nanog and Nanog2, expressed in human ES cells and shown that human Nanog and Nanog2 expression is present in pluripotent cell lines and is down-regulated in adult tissues.

Nanog and Self-Renewal

Homeobox genes have been implicated in maintenance of stem cell pluripotency. The POU-domain homeobox transcription factor Oct4 is expressed in oocytes, the inner cell mass of the blastocyst, and in the epiblast/primitive ectoderm at E5.5 (Palmieri et al., 1994). Our analysis shows that, although Oct4 and Nanog expression overlap in ES cells and the pluripotent cells of the blastocyst ICM and epiblast, Nanog expression is rapidly down-regulated as epiblast differentiates into mesendoderm, whereas Oct4 expression is not. Targeted null mutation of Oct4 in the mouse demonstrates that Oct4, like Nanog, is required to maintain pluripotency in the ICM of the blastocyst (Nichols et al., 1998). Of interest, Oct4 is required for Nanog-induced self-renewal, but Oct4 expression is not required for Nanog expression (Chambers et al., 2003). It is, therefore, likely that Nanog functions in parallel to Oct4 in promoting stem cell self-renewal.

Intriguingly, we were able to detect Nanog transcripts in several adult murine tissues by using RT-PCR analysis. We demonstrated that the transcripts detected by RT-PCR in adult tissues originated from spliced Nanog RNA by amplifying a fragment of the open reading frame using primers spanning intron 3 and comparing the sequence of the PCR products to the genomic sequence. We also subjected RT-PCR products to restriction enzyme digestion at a SnaBI site (nt 964-970) unique to the mouse Nanog sequence and absent in related genomic sequences on chromosomes X and 12. In a similar way, we also confirmed that the human Nanog pseudogenes are not expressed in ES cells or adult tissues and that the amplified products were not of genomic origin. It would be interesting to determine whether the expression of Nanog in adult tissues is confined to the stem cell compartment (Weissman, 2000). Recent studies using global gene expression analysis by microarray have not identified Nanog transcripts in either adult hematopoietic or neural stem cells (Ivanova et al., 2002; Ramalho-Santos et al., 2002). However, human Nanog transcripts have been isolated from adult bone marrow (EST, BF893620), a potential source of blood and other multipotential stem cells (Jiang et al., 2002; Verfaillie, 2002).
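The logic of the restriction digest check can be illustrated with a short sketch. It is not the procedure used in this study, only a minimal in silico analogue: it assumes the standard SnaBI recognition sequence TACGTA, cut between TAC and GTA to leave blunt ends, and it uses a made-up toy amplicon (`toy_amplicon`) rather than any real Nanog sequence. A product that carries the site yields two or more fragments, whereas a product amplified from a related sequence lacking the site stays intact.

```python
# Minimal sketch: predict SnaBI digest fragments of a PCR product.
# Assumption: SnaBI recognises TACGTA and cuts after the third base (TAC^GTA).

SNABI_SITE = "TACGTA"
CUT_OFFSET = 3  # blunt cut position within the recognition site


def snabi_fragments(amplicon):
    """Return the fragment lengths expected after complete SnaBI digestion."""
    seq = amplicon.upper()
    cuts = []
    start = seq.find(SNABI_SITE)
    while start != -1:
        cuts.append(start + CUT_OFFSET)
        start = seq.find(SNABI_SITE, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]


# Hypothetical toy sequences, not taken from the Nanog cDNA.
toy_amplicon = "AAAA" + "TACGTA" + "CCCCCC"
toy_without_site = "AAAACCCCGGGG"

print(snabi_fragments(toy_amplicon))      # product with the site is cut
print(snabi_fragments(toy_without_site))  # product without the site stays whole
```

On a gel, the cut product would appear as smaller bands whose sizes sum to the uncut length, which is what distinguishes it from a product lacking the site.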

Nanog and Tumorigenesis

Molecular mechanisms that regulate stem cell self-renewal in the early embryo may be re-activated in the dysregulated proliferation seen in tumorigenesis. For example, Oct4 gene expression is up-regulated in germ cell, breast, pancreas, and colon tumors (Monk and Holding, 2001; Looijenga et al., 2003). Human Nanog and Nanog2 genes are located on chromosome 12, which is subject to frequent duplication in germ cell tumors (Oosterhuis et al., 1997). Furthermore, the most consistent chromosomal abnormality in testicular seminomas, teratomas, and nonseminomas is isochromosome 12p (van Echten et al., 1997). It is likely, therefore, that many germ cell tumors carry multiple copies of the 12p13-located Nanog and Nanog2 genes, potentially leading to increased Nanog expression in these tumors. Considering the ability of Nanog to promote stem cell self-renewal, it is possible that Nanog overexpression in germ cells may contribute to tumorigenesis in the adult. It is also interesting to note that Nanog expression coincides with genome-wide changes in methylation occurring in the pluripotent cells of the embryo and in primordial germ cells (Kafri et al., 1992) and that tumorigenesis is often accompanied by alterations in gene methylation patterns (reviewed in Laird and Jaenisch, 1996).


The mouse Nanog gene was identified by TBLASTN homology searching of Celera (CMGD R13) and GenBank (MGSCV3) mouse genomic sequence databases by using the 60 amino acid homeodomain sequences of Mixl1 and several paired-like homeobox genes as probes. Nanog orthologs were identified in human, rat, and monkey by BLASTN and TBLASTN homology searching of the GenBank and Celera human genome sequence, GenBank rat genome DNA sequence, and GenBank nonredundant (NR) database, respectively. The conserved SMAD4, homeodomain and tryptophan repeat domains were identified by using PSI-BLAST and protein alignments were performed by using CLUSTALX. The Nanog phylogenetic tree was generated by using a CLUSTALX multiple sequence alignment of homeodomains from related proteins subjected to 1,000 bootstrap calculations and plotted using NJPLOT.
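The bootstrap step behind the phylogenetic tree can be sketched as follows. This is an illustrative outline of the general technique (resampling alignment columns with replacement, then recomputing pairwise distances for each replicate), not a reimplementation of CLUSTALX. The three aligned sequences and their names are invented placeholders, and the uncorrected p-distance is used here only for simplicity.

```python
# Minimal sketch of bootstrap resampling over alignment columns.
# Assumption: the toy alignment below is hypothetical, not real homeodomain data.
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


def p_distance(a, b):
    """Fraction of differing positions, ignoring columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


toy_alignment = {
    "seqA": "KQKTRTVFSS",
    "seqB": "KQKMRTVFSQ",
    "seqC": "RRKQRRIFTS",
}

rng = random.Random(0)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(1000)]

# Each replicate would be passed to a tree-building method such as neighbor
# joining; support for a grouping is the fraction of replicate trees containing it.
first = replicates[0]
print(p_distance(first["seqA"], first["seqB"]))
```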

The ES cell line W9.5 (Szabo and Mann, 1994) was maintained in culture and differentiated to form embryoid bodies as described (Keller, 1995). Total RNA was isolated from cell lines and from embryos by using RNeasy mini-spin columns (Qiagen, Valencia, CA). Human ES cell RNA was purchased from ESCell International (PO Box 6492, St Kilda Road, Melbourne, Australia). Oligo dT primed cDNA was reverse transcribed from total RNA and amplified by PCR using Superscript II reverse transcriptase and Platinum Taq DNA polymerase, according to the manufacturer's recommendations (Invitrogen, Carlsbad, CA). The expression of Nanog was assessed by RT-PCR of predicted transcripts from embryo and cell lines using primers spanning intron 3: rk9875-1, 5′-AAAGGATGAAGTGCAAGCGGTGG-3′ and rk9875-4, 5′-CTGGCTTTGCCCTGACTTTAAGC-3′. The hprt (hypoxanthine-phosphoribosyl-transferase) transcripts were amplified by using the primers hprtAH1, 5′-TCCCTGGTTAAGCAGTACAGC-3′; and hprtAH2, 5′-GATGGCCACAGGACTAGAACA-3′. The PCR products were electrophoresed on 2% agarose TAE gels and analyzed by Southern blot hybridization with a Nanog-specific oligonucleotide probe: NanogMu6, 5′-GGAACAACCGACCTGGACCAACC-3′; or an HPRT-specific oligo: hprt probe, 5′-GGATACAGGCCAGACTTTGTTGG-3′.
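As a quick sanity check on oligonucleotides such as those listed above, basic primer properties can be computed with a few lines of code. The sketch below takes the two Nanog primer sequences from the text and reports length, G+C content, and a rough melting temperature. The Wallace rule used here (2 °C per A/T, 4 °C per G/C) is a common rule of thumb that is most reliable for short oligos; it is an assumption of this sketch and not a method reported in this study.

```python
# Minimal sketch: basic properties of the Nanog RT-PCR primers quoted in the text.
# Assumption: the Wallace rule is only a rough Tm estimate, used here for illustration.

PRIMERS = {
    "rk9875-1": "AAAGGATGAAGTGCAAGCGGTGG",
    "rk9875-4": "CTGGCTTTGCCCTGACTTTAAGC",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_count(seq):
    """Number of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq.upper())


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return gc_count(seq) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees C: 2 per A/T plus 4 per G/C."""
    gc = gc_count(seq)
    return 2 * (len(seq) - gc) + 4 * gc


def reverse_complement(seq):
    """Reverse complement, e.g. to locate a reverse primer on the sense strand."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
```

Matching length and G+C content between the forward and reverse primers, as is the case for this pair, is generally desirable because it lets both anneal at a similar temperature.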

The 5′ end of the Nanog cDNA was confirmed by 5′ RACE (Gibco BRL, Rockville, MD). Primers used in 5′-RACE: 987-z4, 5′-GGGACTGGTACAACAATCAGGG-3′; 987-z3a, 5′-GGTCTTCAGAGGAAGGGCGAGG-3′. Embryos were collected from timed matings and staged according to morphological landmarks (Downs and Davies, 1993). Whole-mount in situ hybridization was performed as described (Belo et al., 1997) and histologic analysis was carried out after dehydration, paraffin embedding, and sectioning of the embryos at 10 μm. Two sets of Nanog-specific riboprobes were generated for in situ hybridization, and sense and antisense probes were transcribed from Nanog cDNA fragments: nt 630-1149 and nt 191-1108. Both probes gave similar results. Northern blot analysis was carried out by using 3 μg of PolyA+ RNA extracted from cultured cell lines obtained through ATCC or from mouse tissues as previously described (Hart et al., 1995). Northern blots were probed with cDNA fragments of mouse Nanog (nt 191-1108) or human Nanog (AU125170, nt 216-1134).


We thank Steven Mihajlovic for assistance with histology and Brigitte Mesiti for image preparation. Thanks also to Ruili Li for assistance with RNA preparation. Hans Schöler generously provided the Oct4 riboprobe. L.R. is a Viertel Foundation Senior Research Fellow.