Changes in Gene Expression at the Precursor → Stem Cell Transition in Leech


  • Kristi A. Hohenstein
    Biology Department, Rutgers, The State University of New Jersey, Camden, New Jersey, USA
  • Daniel H. Shain, Ph.D. (corresponding author)
    Biology Department, Rutgers, The State University of New Jersey, Camden, New Jersey, USA
    Correspondence: Biology Department, Rutgers, The State University of New Jersey, 315 Penn Street, Camden, NJ 08102. Telephone: 856-225-6144; Fax: 856-225-6312


The glossiphoniid leech, Theromyzon trizonare, displays particularly large and accessible embryonic precursor/stem cells during its early embryonic cleavages. We dissected populations of both cell types from staged embryos and examined gene expression profiles by differential display polymerase chain reaction methodology. Among the ∼10,000 displayed cDNA fragments, 56 (∼0.5%) were differentially expressed at the precursor → stem cell transition; 29 were turned off (degraded, precursor-specific); and 27 were turned on (transcribed, stem cell-specific). Several putative differentially expressed cDNAs from each category were confirmed by Northern blot analysis on staged embryos. DNA sequencing revealed that 19 of the cDNAs were related to a spectrum of genes including the CCR4 antiproliferation gene, Rad family members, and several transcriptional regulators, while the remainder encoded hypothetical (10) or novel (27) sequences. Collectively, these results identify dynamic changes in gene expression during stem cell formation in leech and provide a platform for examining the molecular aspects of stem cell genesis in a simple invertebrate organism.


Introduction

Fundamental properties shared by all stem cells (e.g., self-renewal, cell type–specific propagation) are likely to be regulated by overlapping molecular pathways. To date, however, relatively few genes have been identified that are associated with stem cell formation or maintenance (e.g., esg1 [1,2], nanog [3,4], oct 4 [5,6], piwi [7–9]). Current studies are limited in part due to technical difficulties associated with purifying stem cells to homogeneity and maintaining pure populations in culture [10–12]. While contemporary research is focused primarily on mammalian stem cells (SCs), we have examined gene expression profiles of stem cells in an invertebrate model, the glossiphoniid leech, Theromyzon trizonare.

At the onset, it is important to establish the similarities between leech and mammalian embryonic stem (ES) cells, and also their differences, lest there be confusion about the semantics of two currently disparate research fields. Embryonic stem cells from both phyla (Annelida and Chordata) appear transiently during the early stages of embryogenesis, both display the potential for self-renewal under appropriate conditions [13,14], and both generate multiple cell types during development. There are clear differences, however, in the “potency” of each cell type. Thus, while mammalian ES cells display pluripotency in the embryo, descendants of leech stem cells are more restricted in cell fate. Five bilateral pairs of stem cells (M, N, O, P, Q, also known as teloblasts) generate chains of segmental founder cells that give rise to mesodermal (M), neuroectodermal (N), and ectodermal (O, P, Q) tissue (Fig. 1A); leech endoderm arises by a stem cell–independent process [15,16]. M, O, and P stem cells produce two cell types, primary daughter cells and micromeres, while N and Q generate three distinct cell types, two different primary daughter cells that arise in alternation and micromeres. An O/P stem cell produces four primary daughter cells before dividing into equivalent O and P cells, whose lineages are specified by cell-cell interactions with other ectodermal progeny [17]. Although bilateral M, N, O, P, and Q cells produce mainly germ-specific cell types, each contributes progeny to multiple germ layers (e.g., M-derived progeny appear in mesoderm and neuroectoderm; N-derived progeny appear in neuroectoderm and ectoderm, etc.) and displays the capacity to change fate [18,19].

Figure 1.

Schematic of stem cell lineages and early development in leech. (A): Stem cells generate chains of daughter cells (bandlets) that coalesce and differentiate into segmental tissue. N and Q produce two distinct daughter cells in alternation (black and white cells); M, O, and P produce only one daughter cell type (white cells). (B): Precursors DM and NOPQ are born during stages 4 and 5, respectively, and give rise to 5 bilateral pairs of stem cells (M, N, O, P, and Q) by stage 7. Blackened cells were dissected from appropriate stages (∼100 of each cell type).

On the basis of cell potency, stem cells in leech are more similar to mammalian adult SCs (e.g., hematopoietic, multi-potent) since their descendant progeny are restricted in cell fate. However, leech stem cells arise during the early stages of embryogenesis and give rise to most adult cell types, including the germ line [20,21], similar to the role of mammalian ES cells. We therefore propose the terminology LES (leech embryonic stem) cells to identify their functional role in developing embryos and to distinguish them from the unique properties that have been designated for mammalian ES cells (e.g., pluripotency).

Embryogenesis in leech (Fig. 1B) begins with an unequal, meridional cleavage that divides the fertilized egg into a smaller AB and larger CD macromere. A second meridional cleavage forms three smaller macromeres (A, B, and C), which later fuse to form the gut [15,16], and a larger D macromere that gives rise to the segmental mesoderm and ectoderm. At around 12 hours postfertilization, the D macromere generates two precursor cells in succession, DM and NOPQ, respectively. Precursors undergo a series of highly unequal and stereotyped cell divisions giving rise to five bilateral pairs of LES cells (M, N, O, P, and Q) that are asymmetrically positioned on the embryo's surface. LES cells divide repeatedly at the rate of about one division every hour, generating chains of segmental founder cells (bandlets; Fig. 1) that coalesce along the longitudinal axis while dividing and differentiating to form the segmental tissue [22].

Theromyzon offers several experimental advantages in comparison with other model systems. Most important, perhaps, is that LES cells and their respective precursors (i.e., founder cells) are among the largest cells in the animal kingdom (50–300 μm in diameter); moreover, their asymmetric position on the surface of developing embryos permits their identification and homogeneous isolation. The availability of these two identifiable cell populations (i.e., precursors and LES cells) permitted us to examine the molecular events leading up to stem cell formation, which previously have not been investigated in mammalian ES cells due largely to technical limitations. We report here dynamic changes in a novel set of genes that are turned on and off, respectively, upon the birth of embryonic SCs in leech.

Materials and Methods

Collection of Leech Embryos and Cells

We collected adult Theromyzon trizonare (formerly known as Theromyzon rude [23]) specimens in the ponds of Golden Gate Park (San Francisco, CA). Leech embryos were maintained at 12°C in 0.3% Instant Ocean Salt (PETsMART, Phoenix, AZ). Embryos were staged by visual inspection under a stereomicroscope. Targeted cells (DM, NOPQ, M, and N) were dissected from appropriately staged embryos with fine pins (No. 10130-05, Fine Science Tools, Foster City, CA). Only one cell was extracted from each respective embryo.

RNA Purification

Total RNA was isolated by the method of Chomczynski and Sacchi [24]. Briefly, ∼100 cells of each type were added consecutively to 1 ml of guanidinium isothiocyanate (GIT) denaturation solution (i.e., each dissected cell was placed into GIT immediately, and the solution was vortexed after each dissection). Samples were precipitated overnight at −20°C, and RNA pellets were resuspended in 20 μl of 80% formamide. RNA was quantitated by agarose gel electrophoresis and stored at −20°C.

Differential Display-PCR

Aliquots of each cell type–specific RNA (∼50 ng) were reverse transcribed with Powerscript (Clontech, Palo Alto, CA) and amplified by the SMART™ cDNA construction method (Clontech). We conducted differential display (DD)-PCR (polymerase chain reaction) according to the manufacturer's specifications (Clontech), incorporating [35S]-αdCTP. Briefly, primer sets comprising permutations of P and T primers were used independently to amplify identical templates; primer sequences were designed to provide a statistical representation of gene expression in mammalian cells (Clontech). We resolved cDNAs on a 6% polyacrylamide gel spotted with radioactive ink for alignment purposes and exposed to Kodak BioMax x-ray film for 1–5 days. Selected bands were excised with a razor blade, transferred to 80 μl 0.5X Tris-HCl-EDTA, and boiled for 5 minutes. Re-amplification was conducted as described (Clontech) and confirmed by agarose gel electrophoresis.

Cloning and DNA Sequencing

PCR products were gel purified by MinElute Gel Extraction (Qiagen, Valencia, CA) and cloned into pGEM-T Easy (Promega, Madison, WI). DNA was sequenced commercially (Northwoods DNA, Inc., Becida, MN) with standard primers.

Northern Blots

Total RNA (∼10 μg) was electrophoresed through a formaldehyde gel and transferred to positively charged nylon membrane (NEN Life Science Products, Inc., Boston, MA), as described by Ausubel and colleagues [25]. We conducted hybridizations according to standard procedures [25], incorporating 1 × 10⁶ counts per minute/ml of [32P]-dCTP PCR-labeled probe.


Results

Homogeneous populations of LES cells (M and N) and their respective precursors (DM and NOPQ) were manually dissected from appropriately staged Theromyzon embryos (∼100 cells of each type; Fig. 1B). These cells were targeted based on their accessibility during development and the degree to which their lineages have been characterized [26,27]. Following total RNA purification and cDNA synthesis, differential display-PCR [28,29] was conducted with around 150 primer combinations to generate a series of gene expression profiles for each cell type (Fig. 2). Each primer set generated around 70 distinct bands, resulting in the screening of more than 10,000 cDNAs, the estimated number of genes in a leech genome [30].
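As a rough check on the scale of the screen, the arithmetic can be sketched as follows. The two inputs are the approximate figures quoted above; the variable names are illustrative and not part of the original protocol.

```python
# Approximate figures quoted in the text; variable names are illustrative.
primer_combinations = 150    # "around 150 primer combinations"
bands_per_primer_set = 70    # "around 70 distinct bands" per primer set

# Estimated number of cDNA fragments displayed across the whole screen.
screened_cdnas = primer_combinations * bands_per_primer_set
print(screened_cdnas)  # 10500, consistent with "more than 10,000 cDNAs"
```

Because both inputs are approximations, the product is an order-of-magnitude estimate rather than an exact count of displayed fragments.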

Figure 2.

Differentially expressed cDNAs in Theromyzon embryonic cells. Representative autoradiograms of precursor-specific (A) and LES-cell-specific (B) cDNAs following differential display-PCR analysis; arrows identify respective bands. Bands appearing in precursor (DM, NOPQ) and LES-cell (M, N) lanes were designated “housekeeping” genes.

Examination of DD profiles revealed that about 98% of cDNA fragments were identical between cell types (i.e., “housekeeping” genes), while 236 (∼2%) were differentially expressed. Among the latter, eight categories were resolved and are presented in Figure 3. DD fragments that were expressed only in precursor cells (DM and NOPQ) were designated as precursor-specific (Fig. 2A), while those present only in M and N cells became LES cell–specific candidates (Fig. 2B). In total, 29 precursor-specific and 27 LES cell–specific cDNAs were identified.
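The designation rules described above can be sketched as a small classifier. This is a minimal illustration, assuming each band is scored simply as present or absent in each of the four lanes (DM, NOPQ, M, N); the function name and the catch-all label for the remaining patterns are hypothetical, and the study's full set of eight categories is not reproduced here.

```python
# Lane names follow the text; everything else is an illustrative assumption.
PRECURSORS = ("DM", "NOPQ")
LES_CELLS = ("M", "N")
ALL_LANES = set(PRECURSORS) | set(LES_CELLS)

def classify_band(present_in):
    """Classify a DD band from the set of lanes in which it was detected."""
    lanes = set(present_in)
    if not lanes:
        raise ValueError("a band must be detected in at least one lane")
    unknown = lanes - ALL_LANES
    if unknown:
        raise ValueError(f"unknown lane(s): {sorted(unknown)}")

    in_precursors = [cell in lanes for cell in PRECURSORS]
    in_les = [cell in lanes for cell in LES_CELLS]

    if all(in_precursors) and all(in_les):
        return "housekeeping"            # present in every lane
    if all(in_precursors) and not any(in_les):
        return "precursor-specific"      # DM and NOPQ only
    if all(in_les) and not any(in_precursors):
        return "LES cell-specific"       # M and N only
    return "other differential pattern"  # remaining categories, not modeled here

print(classify_band({"DM", "NOPQ"}))            # precursor-specific
print(classify_band({"M", "N"}))                # LES cell-specific
print(classify_band({"DM", "NOPQ", "M", "N"}))  # housekeeping
```

Requiring a band in both lanes of a group, rather than in either one, mirrors the stricter definition used here, in which a cDNA had to appear in both DM and NOPQ (or both M and N) to be counted.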

Figure 3.

Categories of differentially displayed cDNAs. Those cDNAs expressed in both DM and NOPQ precursors but not M or N stem cells (29, dark gray) were designated as precursor-specific; cDNAs in both M and N cells but not precursors DM or NOPQ were designated as LES cell-specific (27, light gray). Note that DM, which gives rise to the bilateral M cells, contained only one differentially expressed cDNA while NOPQ, which gives rise to four stem cell types (N, O, P, and Q), contained 39 differentially expressed cDNAs; these latter cDNAs are likely to be a mixture of O, P, and Q determinants.

DD cDNAs were cloned, sequenced, and subjected to GenBank Basic Local Alignment Search Tool (BLAST) searches (Tables 1, 2). Collectively, 19 (34%) of the cDNAs were similar to reported genes, 27 (48%) produced no significant match, and 10 (18%) matched hypothetical sequences (expressed sequence tags or poorly characterized proteins). Among the putative homologues in DM and NOPQ precursors were CCR4-NOT subunit (an antiproliferation gene [31]), beta dynein heavy chain, a G-protein, ubiquitin-related genes, a transcriptional regulator, and an uncharacterized progenitor cell protein (Table 1). LES cell–specific homologues included Rad family members, a transcriptional regulator, a TATA-binding-protein (TBP)-associated factor, and proteins induced by either fibroblast growth factor or retinoic acid (Table 2).
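The reported percentages follow directly from the counts. A short sketch of the arithmetic is given below; the dictionary labels are paraphrases chosen for illustration, while the counts are those stated in the text.

```python
# Counts taken from the text; labels are illustrative paraphrases.
annotation_counts = {
    "similar to reported genes": 19,
    "no significant match (novel)": 27,
    "hypothetical sequences": 10,
}

total = sum(annotation_counts.values())  # 29 precursor-specific + 27 LES cell-specific
percentages = {
    label: round(100 * count / total)
    for label, count in annotation_counts.items()
}

print(total)        # 56
print(percentages)  # 34, 48, and 18 percent, respectively
```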

Table 1. Precursor-specific cDNAs

Sizes of Northern blot bands are shown when available (dashes represent the absence of a detectable band).

| Clone | Annotation (GenBank accession number); function | Northern (∼bp) |
|---|---|---|
| K4 | CCR4-NOT transcription complex subunit (NP_004770.1); antiproliferation | - |
| K34 | Protein 4.1G (CAD62252.1); cytoskeletal-related | - |
| K43 | Period homolog 1 (NP_035195); behavioral rhythm | - |
| K45 | Hypothetical (NP_075700.1) | - |
| K60 | Ubiquitin-conjugating enzyme E2G1 (XP_016044.2); protein degradation | - |
| K74 | Hypothetical (AF084549_1) | - |
| K100 | Cyclin K (XP_085179.1); cell-cycle regulator | - |
| K137 | Fucose-1-P guanyltransferase (AAH32308.1); sugar metabolism | - |
| K143 | 1 beta dynein heavy chain (XP_220603.1); cytoskeletal motor | 2600 |
| K149 | G-protein (XP_146447.1); signaling | 1800 |
| K151 | Transcriptional regulator (NP_642593.1) | - |
| K204 | Ankyrin repeat gene (XP_125622.1) | - |
| K214 | Hypothetical (NP_175919.1) | 2000 |
| K224 | Hypothetical (NP_501519.1) | 2300 |
| K225 | NADH dehydrogenase subunit 2 (AF337791_1); electron transfer | - |
| K234 | Hematopoietic stem/progenitor cell protein (NP_060936.1) | 2400, 5700 |

Table 2. LES cell–specific cDNAs

Sizes of Northern blot bands are shown when available (dashes represent the absence of a detectable band).

| Clone | Annotation (GenBank accession number); function | Northern (∼bp) |
|---|---|---|
| K46 | Rad21 (NP_092768.1); DNA repair | 2000 |
| K81.1 | Transcriptional regulator (NP_595131.1); zinc finger | - |
| K88 | Hypothetical (NP_326419.1) | - |
| K117 | Metal response element-binding transcription factor (NP_038855.1) | 4000 |
| K133 | Hypothetical (XP_148820.1) | - |
| K153 | TBP-associated transcription factor (NP_500378.1) | 1800, 5000 |
| K157 | Rad21 (CAC10381.2); DNA repair | - |
| K159 | Fibroblast-growth-factor-inducible protein (NP_032042.1) | - |
| K167 | Novel | 3600, 6000 |
| K181 | Hypothetical (AAH18007.1) | - |
| K193 | Hypothetical (XP_284249.1) | - |
| K194 | Hypothetical (NP_074988.1) | - |
| K200 | Hypothetical (BAB31075.1) | - |
| K243 | Retinoic-acid-inducible E3 protein (XP_168759.2) | 1500, 2600 |

To verify precursor-specific and LES cell–specific cDNAs, Northern blot analyses were performed using RNA from two distinct embryonic stages: stage 4, which contains precursors DM and DNOPQ, and stage 7, which contains all 10 LES cells (Fig. 1B). Representative Northern blots using cDNAs K224 (precursor-specific) and K243 (LES cell–specific) are shown in Figure 4, and Northern blot data are summarized in Tables 1 and 2. Although we examined all cDNAs reported here by Northern blot analysis, only 14 displayed detectable bands, suggesting that expression levels of the remaining cDNAs were below the sensitivity limits of the assay. Based on comparative analyses of gene expression in mammalian SCs, it has been proposed that stem cell–specific genes may be expressed at particularly low levels [32,33]. Precursor-specific cDNAs that were confirmed by Northern blot analysis included beta dynein heavy chain, a G-protein, and an uncharacterized progenitor cell protein. Northern blots also verified LES cell–specific transcripts Rad21, a metal response transcription factor, TBP-associated factor, and a retinoic acid–inducible protein. Several novel, differentially expressed cDNAs were also corroborated by Northern blots on staged embryos. We observed no erroneous bands in the Northern blot data set (e.g., precursor-specific probe hybridizing with stage 7 RNA).

Figure 4.

Representative Northern blots of precursor- and LES cell–specific cDNAs. (A): Precursor cDNA K224 annealed to a ∼2,300 bp transcript (left arrow) in total RNA from stage 4 embryos (i.e., containing precursors DM and DNOPQ). (B): LES cell–specific cDNA K243 annealed to ∼2,600 and ∼1,500 bp transcripts (right arrows) in total RNA from stage 7 embryos (containing M, N, O, P, and Q). Arrowheads indicate rRNA and demonstrate that approximately equal amounts of RNA were loaded in each lane.


Discussion

The search for stem cell–specific genes has proven somewhat elusive [32–34], but nonetheless several genes are generally linked with stem cell–specific properties (e.g., self-renewal: esg1, nanog, oct 4, piwi). Although no leech homologues of these genes were identified in this study, it is important to note that most (esg1, nanog, oct 4) are also expressed in ES founder cells (e.g., inner cell mass precursor populations [1, 2, 4, 6]) and/or are broadly expressed during development (e.g., piwi homologues [8]). Thus, leech homologues of the aforementioned genes were not expected in this analysis, since gene expression profiles were compared between LES cells and their immediate founder cells (i.e., precursors DM and NOPQ). Rather, leech embryology permits an investigation of a fundamentally different set of genes, namely those associated with stem cell genesis from a non-SC precursor.

Results presented here identify dynamic changes in gene expression at the precursor → stem cell transition in leech. In total, 56 differentially displayed cDNAs (corresponding to gene-specific mRNAs) were either transcribed (SC-specific) or degraded (precursor-specific) upon the birth of LES cells. While many cDNAs displayed sequence similarity to previously described genes (e.g., transcription factors, growth factor–inducible proteins, G-protein, Rad), the majority of differentially expressed cDNAs encoded novel or hypothetical sequences.

The important role of mRNA degradation in cell differentiation and development has been established [35]. The observation that more than 20 cDNA fragments were turned off (e.g., downregulated or degraded) at the precursor → stem cell transition in these studies suggests that the cell lineages leading to the birth of LES cells (i.e., precursor cells) are held in a repressed state and derepressed in the absence of specific genes (i.e., precursor-specific). Consistent with this notion, one precursor-specific fragment (K4) shared strong sequence similarity with a subunit of the mammalian CCR4-NOT complex, a negative regulator of cell growth and an antiproliferation gene [31]. A similar process of mRNA degradation was observed in the leech Helobdella robusta, in which Hro-nos mRNA (Drosophila nanos homologue [36]) was rapidly degraded in the early cell divisions leading to stem cell formation [21,37]. Because zygotic transcription in leech begins just prior to the birth of LES cells (i.e., stage 5 [38]), most precursor-specific cDNAs identified here are probably maternal transcripts that are selectively degraded upon the birth of LES cells.

In light of studies that identify transcription factors as key modulators of cell type–specific differentiation [39,40], differentially expressed transcription factors at the precursor → LES cell transition are particularly worth noting. These include LES cell–specific cDNAs K117 and K153, which share sequence similarity with metal response [41] and TBP-associated transcription factors, respectively; the former was trapped in mouse ES cells and encodes two zinc finger domains [41]. Other putative transcription factors (K81.1, K151) are poorly described [42,43]. Also noteworthy are LES-specific cDNAs sharing sequence similarity with fibroblast-growth-factor (FGF) (K159) and retinoic acid–inducible (K243) proteins, whose inducible factors (e.g., FGF, retinoic acid) are supplemented in mammalian SC cultures [44,45]. Finally, a precursor-specific cDNA (K234) shares sequence similarity with an uncharacterized hematopoietic progenitor protein (R. Strausberg, unpublished data, 2001).

The seemingly high number of differentially expressed cDNAs (56 in total) between cell types separated by a single cell division (i.e., precursor → stem cell) is nonetheless consistent with dramatic changes in gene expression that occur in the early stages of mammalian development [46]. In mammalian studies, high levels of mRNA degradation and new mRNA synthesis were detected between identifiable stages of mouse embryogenesis (e.g., morula, blastula). Investigations aimed at determining gene expression profiles of mammalian SCs independently report in excess of 200 genes that are enriched in embryonic, neural, and hematopoietic SCs [10, 47, 48]. The number of overlapping genes in these studies, however, is unexpectedly low and may reflect several technical limitations currently associated with molecular SC research using vertebrate systems [10–12, 32–34]. For example, mammalian SCs are not readily purified to homogeneity, and subtractive techniques do not always remove all common genes [10, 12, 32, 49]. Also, modern microarrays generally do not encompass an entire genome [10,33]. These problems are largely circumvented by DD analysis with the caveat that only gene fragments are isolated; thus, numbers of differentially displayed genes may be overestimated due to the amplification of nonoverlapping fragments from the same cDNA. Nevertheless, the genetic profiles obtained in this study identify clear changes in gene expression that occur during stem cell genesis, albeit in a simple invertebrate organism (i.e., leech). However, if Drosophila melanogaster and Caenorhabditis elegans serve as precedents [50–52], changes in gene expression at this critical juncture in development are likely to be conserved among all metazoan taxa.


Acknowledgments

This work was supported by the Rutgers Life Science Fellowship to K.A.H. and Busch Biomedical Research Grant 6–49167 to D.H.S.