Prominin-1 (CD133), a pentaspan membrane glycoprotein that constitutes an important cell surface marker of various stem cell populations, both normal and cancerous, is widely used to isolate or characterize such cells in different systems. Occurring throughout metazoan evolution with a remarkably conserved genomic organization, it may be expressed as different splice variants with distinctive characteristics. A rational nomenclature was proposed earlier for their consistent designation across species. Although generally accepted, it seems to have been misunderstood, given the recent report of novel prominin-1 complementary DNAs in rhesus monkey and humans with improper naming. As this may lead to confusion, we have reexamined the genomic organization of prominin-1 in various primates to provide an update that should further clarify the rationale of the nomenclature for prominin-1 gene products. This report comprises (i) the determination of the genomic organization of the prominin-1 gene in two non-human primates commonly used in research, i.e. Macaca mulatta and Pan troglodytes, (ii) the mapping of a new exon that creates an alternative cytoplasmic C-terminal end of prominin-1, (iii) the identification of various potential PDZ-binding domains generated by alternative cytoplasmic C-terminal tails, suggesting that different prominin-1 splice variants might interact with distinct protein partners, and (iv) a summary of the different prominin-1 splice variants.
In the article entitled ‘Isolation, molecular cloning and in vitro expression of rhesus monkey (Macaca mulatta) prominin-1.s1 complementary DNA (cDNA) encoding a potential hematopoietic stem cell antigen’, recently published in Tissue Antigens by Husain et al. (1), the authors refer to a previous article from our group proposing a unifying nomenclature for the designation of the prominin family gene products (2) based on their conserved genomic organization across species (3, 4). The aim of this nomenclature was to attribute the same designation to a given splice variant irrespective of the species, as splice variants described in one organism could be predicted to exist in others, with similar characteristics at least among mammals. This nomenclature seems to be generally accepted in the field (5–9), but its underlying rationale may remain unclear: the rhesus prominin-1 splice variant described by Husain et al. does not correspond to the s1 splice variant (1), and several unpublished human prominin-1 splice variants appear in the NCBI GenBank database (accession numbers AY449690 to AY449693) with a suffix that is incorrect according to this same nomenclature. Given the wide use of prominin-1 (also termed CD133) for the characterization of stem and progenitor cell populations in different normal tissues (10–13) as well as in cancers [(14–16); for review, see (17)], it is important to clarify the rationale of this nomenclature. We believe that it is essential to maintain a consistent designation with regard to potential cross-specificity toward particular epitopes, e.g. the stem-cell-associated AC133 epitope (18). Moreover, when dissecting the differential tissue expression and function of these prominin-1 splice variants, it is important to refer to homologous molecules in the different model organisms. Inconsistent use of the nomenclature can only create confusion in the interpretation of data generated in different models.
The order in which prominin-1 splice variants were identified in one species or another is not necessarily related to the predominance of a given variant in that species and therefore cannot serve as a basis for their designation. For instance, prominin-1.s1 was first described in mouse and then in humans, whereas prominin-1.s2 was first described in humans and then in mouse; similarly, only prominin-1.s1 and s7 have been described in rat (19). To be meaningful, cross-species sequence comparisons should be made between homologous variants. We therefore summarize and update the nomenclature of the different prominin-1 splice variants and the underlying genomic features in humans and different animal models.
The prominin-1 gene is located on chromosome 4 in humans and chromosome 5 in mice and spans more than 150 kb (3, 20, 21). Its genomic structure, i.e. exon/intron boundaries, is strikingly similar across species (3). Since the original description of prominin-1 (22), several splice variants affecting the open reading frame have been identified in mouse (2, 23, 24) and in humans (25, 26) and their expression was characterized (4). Factors regulating the in vivo expression of prominin-1 remain to be determined, but the messenger RNA profile suggests that the mouse prominin-1 splice variants are tissue specific and developmentally regulated (4).
To date, alternative splicing has been found to affect the N-terminal domain, the first and second extracellular loops and, most often, the cytoplasmic C-terminal domain, which might implicate distinct cytoplasmic protein-interacting partners. The splice variants s1 and s3–s8 were first isolated in mouse and defined four alternative C-termini for prominin-1, which arise through (i) intron retention, (ii) exon skipping, or (iii) usage of a cryptic acceptor site (4). This splicing cassette, which lies within a cluster of short exons and introns, is conserved across species, allowing the prediction of similar splice variants (4). Additional splice variants that define two more alternative C-termini have been isolated in humans (AY449689, AY449691, and AY449693). By comparing the prominin-1 cDNA sequence (nucleotides 2463–2487) isolated from the KG-1a cell line (see AY449689) with the human prominin-1 genomic sequence (NC_000004), we have identified a novel facultative exon (SSWVTSVQ), i.e. exon 25, within this splicing cassette (positions 15591022 to 15590998 of the genomic clone) that conforms to the consensus GT-AG rule and encodes residues 822–829 of this particular prominin-1 variant (from here on referred to as s9) (Tables 1 and 2). Importantly, because exon 25 is 25 nucleotides long, its insertion induces a frameshift in the following exon 26b and hence a premature termination after a C-terminal cysteine (residue 830) compared with the s1 and s2 splice variants (Table 3).
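The frameshift argument is simple arithmetic and can be checked with a short sketch. The function names below are illustrative only and do not belong to any published tool; the coordinates are those quoted in the text for exon 25, with the genomic positions running in descending order.

```python
def exon_length(first: int, last: int) -> int:
    """Length in nucleotides of an exon given inclusive coordinates.

    Works for ascending or descending coordinates (either strand).
    """
    return abs(first - last) + 1


def frame_offset(length: int) -> int:
    """Exon length modulo 3; a non-zero value means that inserting
    the exon shifts the reading frame of the downstream sequence."""
    return length % 3


# Coordinates quoted in the text for exon 25.
genomic_len = exon_length(15591022, 15590998)  # positions on NC_000004
cdna_len = exon_length(2463, 2487)             # positions on AY449689

print(genomic_len, cdna_len, frame_offset(genomic_len))  # prints: 25 25 1
```

Because 25 is not a multiple of 3, the exon contributes eight complete codons (residues 822–829) plus one extra nucleotide, which is what puts exon 26b out of frame.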
Table 1. Genomic structure of primate prominin-1 genes
|Exon||Human cDNA (AF027208), 3794 nt||Human cDNA (AY449689), 2493 nt||Human chromosome 4 (NC_000004)||Rhesus cDNA (AY903606), 3927 nt||Rhesus chromosome 5 (NC_007862)||Chimpanzee predicted (XM_517115)||Chimpanzee chromosome 4 (NC_006471)|
|19b|| ||ND|| ||ND|| ||ND|
Table 2. Presence or absence of facultative exons in various prominin-1 splice variants
|Splice variant||Structure (facultative exons included in the coding sequences)||Mus musculus||Rattus norvegicus||Homo sapiens||Macaca mulatta||Pan troglodytes|
|s1||−||+||−||−||−||+||+||+||AF026269 (858)||AF386758 (857)||AF507034 (856)|| || |
|s2||+||+||−||−||−||+||+||+||AF039663 (867)|| ||AF027208 (865)||XM_001100223 (864)||XM_517115 (865)|
|s3||−||+||−||−||+||−||−||−||AF305215 (834)|| || || || |
|s4||−||−||−||−||+||−||−||−||AY223521 (804)|| || || || |
|s5||−||−||−||−||+||−||−||−||AY223522 (809)|| || || || |
|s6||−||+||−||−||−||−||−||−||AY099088 (823)|| || || || |
|s7||−||+||−||−||−||−||−||+||AK029921 (827)||AY262731 (826)||AY449690 (825)|| || |
|s8||+||+||+||−||−||−||−||+||BC028286 (842)|| || || || |
|s9||−||+||−||+||−||+||−||−|| || ||AY449689 (830)|| || |
|s10||−||+||−||−||−||+||−||+|| || ||AY449691 (833)|| || |
|s11||+||+||−||−||−||−||−||+|| || ||AY449692 (834)|| || |
|s12||+||+||−||−||−||+||−||+|| || ||AY449693 (842)||AY903606 (841)|| |
Table 3. Alternative prominin-1 C-termini
Thus, the prominin-1 coding region spans 28 exons, of which 7 are facultative, with exon 26 bearing two mutually exclusive acceptor sites, a and b (Table 2). Remarkably, we have identified a nearly identical genomic structure in two other primates, i.e. rhesus and chimpanzee (on chromosomes 5 and 4, respectively), by similar sequence comparison (Tables 1 and 3), lending further weight to the rationale underlying the nomenclature for prominin-1 gene products. Hence, prominin-1 from chimpanzee is predicted to display the characteristic five transmembrane domains (see updated sequence XM_517115) as in the other species, and the s2 variant would be predicted to contain 865 residues, contrary to what appears in table 1 of Husain et al. (1). Altogether, 12 different prominin-1 splice variants affecting the protein sequence have been described to date in rodents and primates (Table 2). Some of them, e.g. s1, s2, s7 and s12, have been detected in several species. Completing the chart for a given species might thus only be a question of time.
It therefore appears that a prominin-1 splice variant with a cytoplasmic C-terminal tail that is 24 amino acids shorter than that of human prominin-1.s2 (AF027208), like the rhesus prominin-1.s1 splice variant described by Husain and colleagues, does exist in humans. Consequently, this rhesus prominin-1 splice variant aligns better with the 842-amino-acid human splice variant encoded by an unpublished cDNA sequence deposited in GenBank (accession number AY449693; here referred to as s12; Table 2) than with human prominin-1.s2, yielding 95.7% amino acid sequence identity rather than 93.2% (using the stretcher program from the EMBOSS package with an EBLOSUM62 matrix and gap-opening and gap-extension penalties of 12 and 2, respectively). The prominin-1 sequence from Macaca mulatta (AY903606) has therefore been renamed prominin-1.s12 (S. M. Husain, personal communication, St. Jude Children’s Research Hospital, Memphis, TN).
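For readers who wish to reproduce this kind of comparison, percent identity over a pairwise alignment can be computed as sketched here. This is a minimal illustration, not the stretcher implementation: it assumes that the two sequences have already been globally aligned, that '-' marks a gap, and that identity is expressed as identical positions over the full alignment length. The two short sequences are hypothetical and are not prominin-1 sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding the same residue.

    Both strings must come from the same pairwise alignment and so
    have equal length; columns containing a gap never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a, aligned_b)
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment: 8 columns, 6 identical residues.
toy_a = "MKT-LLVA"
toy_b = "MKTALIVA"
print(percent_identity(toy_a, toy_b))  # prints: 75.0
```

With this definition, a variant lacking a stretch of residues present in its partner is penalized for every gap column, which is why comparing homologous splice variants gives the higher figure.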
The importance of using the same suffix to designate a given splice variant irrespective of the species is best illustrated when considering the cytoplasmic C-terminal tail of prominin-1 (Table 3). Although no information is currently available as to potential cytoplasmic proteins interacting with prominin-1, the presence of distinct C-terminal domains suggests that different prominin-1 splice variants might interact with alternative protein partners. Interestingly, we found that the last four C-terminal amino acids of the s3, s4 and s5 variants, i.e. HFTL, exhibit the characteristics of a class II PDZ-binding domain with a hydrophobic residue in positions 0 and −2 (X-φ-X-φ; X, unspecified amino acid and φ, hydrophobic residue), while the C-terminus of the s9 splice variant, i.e. SVQC, would be highly related to a class III PDZ-binding domain (X-X-C) and that of rodent s1, s2, s7, s8 and s11 (PSQR) to a class I PDZ-binding domain (X-S/T-X-φ) (27, 28). In keeping with a potential interaction with PDZ-domain-containing proteins, we have recently found that prominin-2, the prominin-1 paralogue (3), binds to a novel splice variant of the glutamate receptor-interacting protein, a PDZ-domain-containing protein, in a yeast two-hybrid screen (Kathrin Opherk and DC, unpublished data; see GenBank accession number AY255674). Further studies are needed to determine whether prominin-1 molecules interact with various PDZ-domain-containing proteins and to assess the physiological relevance of such interactions.
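The three motif definitions quoted above can be applied mechanically to a C-terminal tetrapeptide, as in the following sketch. The function name and the set of residues treated as hydrophobic (A, V, L, I, M, F, W, Y) are assumptions made for illustration; the text does not specify which residues count as φ, and a different choice would change the outcome for borderline tails.

```python
# Assumption: residues counted as hydrophobic (phi) for this illustration.
HYDROPHOBIC = set("AVLIMFWY")


def pdz_classes(c_terminus: str) -> list[str]:
    """Return the PDZ-binding classes whose motif the last four
    residues match strictly, using the definitions given in the text:
    class I X-S/T-X-phi, class II X-phi-X-phi, class III X-X-C."""
    tail = c_terminus.upper()[-4:]
    if len(tail) < 4:
        raise ValueError("need at least four C-terminal residues")
    pos_minus2, pos_0 = tail[1], tail[3]
    classes = []
    if pos_minus2 in "ST" and pos_0 in HYDROPHOBIC:
        classes.append("I")
    if pos_minus2 in HYDROPHOBIC and pos_0 in HYDROPHOBIC:
        classes.append("II")
    if pos_0 == "C":
        classes.append("III")
    return classes


for tail in ("HFTL", "SVQC", "PSQR"):
    print(tail, pdz_classes(tail))
```

Under these assumptions HFTL matches class II and SVQC matches class III. PSQR carries the serine expected at position −2 of a class I motif but ends in arginine, so it does not satisfy the strict pattern, which is consistent with the text describing it as related to, rather than identical with, a class I PDZ-binding domain.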
In conclusion, we have shown (i) that the genomic organization of the prominin-1 gene in two non-human primates is highly similar to that of its human counterpart and (ii) that a novel conserved facultative exon creates an alternative cytoplasmic C-terminal end of prominin-1. Therefore, we recommend that the designation of the prominin-1 splice variants from Homo sapiens (AY449689 to AY449693) be corrected according to the present Table 2 in order to conform to the nomenclature for prominin-1 gene products to which they appear to refer.