Continued progress in the development of antigen-specific cancer vaccines depends on our capacity to overcome tumor heterogeneity and immune escape. To address these complexities, a large repertoire of immunogenic, tissue-restricted gene products needs to be identified and incorporated as target antigens in polyvalent cancer vaccines. With the exception of transcripts encoded by mutated genes,1 no gene products with absolute cancer-restricted expression have been defined. Therefore, the search for target antigens has focused on identifying molecules expressed in a narrow range of normal adult or fetal tissues and cancers (e.g., melanocyte differentiation antigens), as well as molecules expressed at amplified levels in malignant versus normal tissues (e.g., Her2/neu). Immunologic methods of gene discovery have also led to the identification of a group of genes expressed exclusively in developing germ cells of the testis and fetal ovary, as well as in placental trophoblast, and, most notably, in many human cancers of diverse origins.2 These genes have been referred to collectively as cancer/testis (CT) antigens, a term reflecting their restricted expression pattern and immunogenicity in cancer patients.2, 3
The discovery of MAGE antigens by T-lymphocyte epitope cloning inaugurated current efforts to identify target molecules for immunotherapeutic cancer vaccines4 and led to the discovery of other CT antigens, such as BAGE5 and GAGE.6 Subsequently, by employing autologous antibody recognition, the SEREX (serological analysis of recombinant cDNA expression libraries) method7 uncovered additional CT antigens including HOM-MEL-40/SSX2,7 NY-ESO-1,8 and CT-7.3 More recently, the use of nonimmunologic techniques relying on differential mRNA expression has resulted in the identification of several gene products whose expression patterns resemble CT antigens, such as LAGE,9 CT9,10 CT1011 and SAGE.12 To date, 14 different CT antigen families have been described, most of which map to chromosome X.
Most cancer/testis antigens have no known biologic function. The exceptions to this are two gametogenic proteins, synaptonemal complex protein 1 (SCP-1), a meiotic protein involved in the pairing of homologous chromosomes,13 and acrosin binding protein/OY-TES-1, an acrosin-packaging protein of the sperm head.14 The underlying reasons for reactivation of CT antigen expression in cancer and the relationship between CT gene expression and malignancy are not known. A recent hypothesis relating the induction of gametogenic gene programs to tumorigenesis has been put forth to explain CT antigen expression in cancer.15 This theory draws many parallels between cancer cells and cells undergoing gametogenic/trophoblastic differentiation, including such shared properties as immortalization, invasion, lack of adhesion, angiogenesis, demethylation and downregulated expression of MHC molecules.
The Cancer Genome Anatomy Project, together with the expressed sequence tag (EST) database, has archived over 3.5 million single-pass cDNA sequences, or ESTs, derived from numerous normal and malignant tissues and has provided essential online tools for analyzing or “mining” these data. With respect to the identification of cancer-associated gene products with potential relevance to cancer vaccine targets, EST database mining has resulted in the discovery of a prostate cancer-related gene, PAGE-1/GAGE-B,16 a Ewing's sarcoma-associated gene, XAGE-1,17 and a number of differentially expressed transcripts in breast cancer18 and glioblastoma multiforme.19 In order to identify new CT gene products, the current study mined the Unigene database, a compilation of both EST and Genbank databases, for transcripts expressed exclusively in cancer and normal testis. Subsequent RT-PCR analysis of candidate transcripts identified several gene products with highly restricted mRNA expression patterns, including 3 newly defined CT genes.
MATERIAL AND METHODS
Bioinformatic identification of cancer/testis-associated Unigene clusters
The cDNA X profiler tool of the Cancer Genome Anatomy Project (http://cgap.nci.nih.gov/Tissues/xProfiler) was used to search the Unigene database in the following manner. First, 2 pools of expressed sequence tags (ESTs) were established. Pool A consisted of ESTs derived from 6 normal testis cDNA libraries, and Pool B consisted of ESTs derived from 188 tumor-derived cDNA libraries (all histologic types). The X profiler search engine was directed to identify those Unigene clusters containing ESTs from both Pool A and Pool B and to exclude Unigene clusters containing ESTs from any other normal tissue cDNA library. Tissue expression patterns of the resultant Unigene clusters were also analyzed by in silico serial analysis of gene expression (SAGE) using the SAGE:Gene to tag mapping tool associated with each Unigene cluster entry. Their relation to known gene products was determined by BLAST searches of nucleotide and protein databases (http://www.ncbi.nlm.nih.gov/BLAST/) or by motif analysis of putative protein translations (http://motif.genome.ad.jp). Furthermore, BLAST searches of representative ESTs from all identified Unigene clusters were performed against the human genome sequence database to obtain gene mapping information and to determine intron/exon boundaries used in PCR primer design.
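The pool-based selection described above amounts to a set filter over the library sources of each cluster's ESTs. The following Python sketch is illustrative only: the cluster identifiers, the library labels and the `select_ct_clusters` helper are hypothetical and are not part of the X profiler tool itself.

```python
# Hypothetical sketch of the Pool A / Pool B selection logic.
# This is NOT the X profiler implementation; labels and IDs are invented.

def select_ct_clusters(cluster_sources):
    """Return clusters with ESTs in both pools and in no other normal tissue.

    cluster_sources maps a cluster ID to the set of library categories
    that contributed ESTs, e.g. {"normal_testis", "tumor"}.
    """
    selected = []
    for cluster_id, sources in cluster_sources.items():
        in_pool_a = "normal_testis" in sources   # Pool A: normal testis libraries
        in_pool_b = "tumor" in sources           # Pool B: tumor-derived libraries
        other_normal = any(
            s.startswith("normal_") and s != "normal_testis" for s in sources
        )
        if in_pool_a and in_pool_b and not other_normal:
            selected.append(cluster_id)
    return sorted(selected)


# Made-up example clusters (identifiers are placeholders).
example = {
    "Hs.000001": {"normal_testis", "tumor"},                  # kept
    "Hs.000002": {"normal_testis", "tumor", "normal_liver"},  # other normal tissue
    "Hs.000003": {"tumor"},                                   # absent from Pool A
    "Hs.000004": {"normal_testis"},                           # absent from Pool B
}
ct_candidates = select_ct_clusters(example)
print(ct_candidates)  # ['Hs.000001']
```

Only the first placeholder cluster satisfies all three conditions; the others fail because of an additional normal-tissue source or because one pool is missing.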
Total RNA from 20 different normal human tissues was purchased from Clontech (Palo Alto, CA) and Ambion (Austin, Texas). Tumor tissues were derived from surgical specimens obtained from Memorial Sloan-Kettering Cancer Center, Weill Medical College of Cornell University and Krankenhaus Nordwest (Frankfurt, Germany). Total RNA from tumor tissues was prepared by the guanidinium thiocyanate method.
The cDNA preparations used as templates in the RT-PCR reactions were prepared using the Superscript first strand synthesis kit (Invitrogen Life Technologies, Carlsbad, CA). The cDNA was synthesized by incubating 5 μg of total RNA in 40 μl of 1× reverse transcriptase buffer containing 100 ng random hexamers, 0.5 mM dNTP, 5.0 mM MgCl2, 10 μM DTT, 80 U ribonuclease inhibitor and 100 U Superscript II reverse transcriptase at 42°C for 50 min. Control templates for assessing amplification of genomic DNA were prepared as duplicate samples lacking reverse transcriptase.
Oligonucleotide primers, homologous to ESTs present in selected Unigene clusters, were synthesized commercially (Invitrogen Life Technologies). DNA sequences of relevant primer pairs are provided below. RT-PCR was performed as follows. Twenty-five-microliter PCR reaction mixtures, consisting of 2.0 μl cDNA (or 2.0 μl of genomic DNA amplification controls), 0.2 mM dNTP, 1.5 mM MgCl2, 0.25 μM gene-specific forward and reverse primers and 2.5 U platinum Taq DNA polymerase (Invitrogen Life Technologies), were heated to 94°C for 2 min, followed by 34 thermal cycles of 94°C for 30 sec, 55°C for 30 sec and 72°C for 1 min and a final cycle of 94°C for 30 sec, 55°C for 30 sec and 72°C for 5 min. Thermal cycling was performed using an ABI 7700 Sequence Detector (PE Biosystems). Resultant PCR products were analyzed in 2% agarose/Tris-acetate-EDTA gels, and their identity was verified by DNA sequencing.
The primer pairs were as follows:
CT15 forward: 5′ AGGAATTATGAAACCACTTG
CT15 reverse: 5′ GACAACAGTTGTATCAGACC
CT16 forward: 5′ CAAGGAGAGGTCGTGTCTTCG
CT16 reverse: 5′ GGATCTTGTTACATGCTCACTCATATC
CT16.2 forward: 5′ CCAGATTAAGAATAACGTTC
CT16.2 reverse: 5′ AGAGGAAATGACCAAGAGTC
CT17 forward: 5′ ACAAGACTAGCTTATGTGTGG
CT17 reverse: 5′ TTGAGCAAGAATCTTGACTTC
Real-time quantitative RT-PCR
Total RNA samples from 8 different normal adult tissues and 8 tumor specimens were prepared and reverse transcribed into cDNA as described above. Gene-specific TaqMan probes and PCR primers were designed using Primer Express software (PE Biosystems, Foster City, CA) and synthesized commercially (PE Biosystems). DNA sequences of the relevant TaqMan primer pairs and probes are provided below. Multiplex PCR reactions were prepared using 2.0 μl of cDNA (or 2.0 μl of genomic DNA amplification controls), diluted in TaqMan Universal PCR Master Mix supplemented with 200 nM FAM (6-carboxy-fluorescein)-labeled gene-specific TaqMan probe, 300–900 nM gene-specific forward and reverse primers (predetermined optimum concentration) and VIC-labeled human β-glucuronidase or phosphoglycerate kinase endogenous control probe/primer mixtures (proprietary dye, PE Biosystems). Six 25 μl PCR reactions were prepared for each cDNA sample (3 per endogenous control). PCR consisted of 40 cycles of 95°C denaturation (15 sec) and 60°C annealing/extension (60 sec).
Thermal cycling and fluorescence monitoring were performed using an ABI 7700 Sequence Detector (PE Biosystems). The cycle at which a PCR product is first detected above a fixed threshold, termed the cycle threshold (Ct), was determined for each sample, and the average of triplicate samples was recorded. The copy number of gene-specific transcripts per μg of RNA starting material was determined by comparison with a standard curve of Ct values generated from known concentrations of cDNA encoding the homologous gene product. To normalize the quantity of mRNA present in the total RNA samples, the Ct values obtained from the endogenous control were subtracted from the gene-specific Ct values (ΔCt = Ct(FAM) − Ct(VIC)). Real-time RT-PCR of triplicate samples yielded 2 sets of 3 ΔCt values per RNA sample (1 set per endogenous control), and the mean of the 6 ΔCt values was calculated. The concentration of gene-specific mRNA in normal or tumor-derived tissue, relative to normal testis, was calculated by comparing the normalized Ct values (ΔΔCt = ΔCt of normal or tumor tissue − ΔCt of normal testis) and determining the relative concentration (relative concentration = 2^(−ΔΔCt)). The transcript copy number per μg of normalized total RNA was calculated by multiplying the mean relative concentration for each cDNA sample by the copy number in testicular tissue, which was determined from the standard curve (copy number = 2^(−ΔΔCt) × copy number in testis).
The primer pairs and probes used were as follows:
CT15 TaqMan forward: 5′ GGGAGTATTGACAGTGGCAATTT
CT15 TaqMan reverse: 5′ TGTTCTCAATGTAGCGCCTTTC
CT15 TaqMan probe: 5′ CCACCTGTAGCTATACCAGCCAGACTCCC
CT16 TaqMan forward: 5′ GCAGAGTCCCCTCCCTGAC
CT16 TaqMan reverse: 5′ ACAGGAACTGGCTCTGCTTAAGA
CT16 TaqMan probe: 5′ TCAGGACCATCTCCAGGTGCATCCTC
CT17 TaqMan forward: 5′ CCAGAGTCTCATGTTAAAATCACTTACA
CT17 TaqMan reverse: 5′ GAAACACTTCCTCTCTTTCTTTAAGTACAA
CT17 TaqMan probe: 5′ ACCCAGAAAGACCACCACTTTGCAGGTA
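The ΔCt/ΔΔCt normalization described in this section can be written out as a short worked example. The Python sketch below is a minimal illustration, not the authors' analysis software: all Ct values are hypothetical, the function names are invented for this example, and the testis copy number (149,000 copies/μg RNA) is the value reported for CT16 in the Results, used here only to show the arithmetic.

```python
from statistics import mean


def delta_ct(ct_fam, ct_vic):
    """dCt: gene-specific Ct (FAM) minus endogenous-control Ct (VIC)."""
    return ct_fam - ct_vic


def relative_concentration(dct_tissue, dct_testis):
    """Expression relative to testis: 2^(-ddCt), ddCt = dCt(tissue) - dCt(testis)."""
    ddct = dct_tissue - dct_testis
    return 2.0 ** (-ddct)


def copies_per_ug(rel_conc, testis_copies):
    """Copy number = relative concentration x copy number in testis."""
    return rel_conc * testis_copies


# Hypothetical (Ct FAM, Ct VIC) pairs: 3 replicates for each of the
# 2 endogenous controls, giving 6 dCt values per RNA sample.
testis_pairs = [(24.0, 22.0), (24.5, 22.5), (23.5, 21.5),
                (25.0, 23.0), (25.5, 23.5), (24.0, 22.0)]
tumor_pairs = [(27.0, 22.0), (27.5, 22.5), (26.5, 21.5),
               (28.0, 23.0), (28.5, 23.5), (27.0, 22.0)]

dct_testis = mean(delta_ct(f, v) for f, v in testis_pairs)  # 2.0
dct_tumor = mean(delta_ct(f, v) for f, v in tumor_pairs)    # 5.0

rel = relative_concentration(dct_tumor, dct_testis)  # ddCt = 3 -> 2^-3
copies = copies_per_ug(rel, 149_000)                 # testis value from standard curve

print(rel, copies)  # 0.125 18625.0
```

With these made-up inputs the tumor sample lies 3 cycles above testis after normalization, which corresponds to 12.5% of the testis level.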
In all, 1,325 different Unigene clusters with in silico expression profiles resembling CT antigens were identified by mining the Unigene database for gene clusters containing ESTs derived exclusively from both tumor tissue and normal testis cDNA libraries. These cancer/testis-associated Unigene clusters represented 61 known genes and 1,264 uncharacterized genes. As shown in Table I, the Unigene clusters were placed into 2 categories. Group I consisted of 859 different gene clusters containing ESTs derived exclusively from testis and tumor cDNA libraries, termed cancer/testis (CT)-related Unigene clusters. Group II consisted of 400 different gene clusters containing ESTs derived exclusively from testis and germ cell tumors (but not other types of cancer), termed testis-related Unigene clusters. An additional 66 gene clusters were not pursued further since their respective Unigene database entries were modified during the course of this study, or literature reports of known gene products indicated they were expressed in other normal tissues, in addition to testis. In accordance with the Unigene database, the present study designates specific gene clusters as Homo sapiens.numerical description (e.g., Hs.123456).
Table I. In silico classification of cancer/testis-associated Unigene clusters
Category of Unigene cluster | Number of Unigene clusters in each subcategory
Group I. Cancer/testis-related Unigene clusters: contain ESTs only from testis (or germ cell tumors) and tumor-derived cDNA libraries
  IA. SAGE tags present only in tumor and/or cell line SAGE libraries
  IB. No reliable SAGE tags
  IC. SAGE tags present in normal tissue-derived SAGE libraries
Group II. Testis-related Unigene clusters: contain ESTs only from testis and germ cell tumor cDNA libraries
  IIA. SAGE tags present only in tumor and/or cell line SAGE libraries
  IIB. No reliable SAGE tags
  IIC. SAGE tags present in normal tissue-derived SAGE libraries
The mRNA expression patterns of Group I and Group II Unigene clusters were further analyzed by in silico serial analysis of gene expression (SAGE). As shown in Table I, Group I and II Unigene clusters were subdivided into subgroups A, B and C, based on the presence and tissue distribution of homologous SAGE tags. Subgroups IA and IIA have SAGE tags that are present only in tumor and/or cell line-derived SAGE libraries. Subgroups IB and IIB have no reliable SAGE tags. Subgroups IC and IIC have SAGE tags that are present in normal tissue SAGE libraries. Four known cancer/testis antigens were identified among the 1,325 Unigene clusters, including CT11p/Hs.293266 (Group IA), GAGE 4/Hs.183199 (Group IB), MAGEB1/Hs.73021 (Group IC) and SAGE/Hs.195292 (Group IIA).
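The two-level grouping (Group I or II by EST source, subgroup A, B or C by SAGE tag distribution) can be expressed as a small decision function. The Python sketch below is an illustration under stated assumptions: the category labels ("germ_cell_tumor", "tumor", "cell_line", "normal") and the `classify_cluster` helper are hypothetical, and the input is assumed to describe a cluster that already passed the testis-plus-tumor selection.

```python
def classify_cluster(est_tumor_types, sage_libraries):
    """Assign a cluster to subgroup IA..IIC.

    est_tumor_types: non-empty set of tumor categories contributing ESTs
                     (in addition to normal testis).
    sage_libraries:  set of SAGE library categories containing a reliable
                     tag for the cluster; empty if there is no reliable tag.
    """
    # Group II: tumor ESTs come only from germ cell tumors; otherwise Group I.
    group = "II" if est_tumor_types <= {"germ_cell_tumor"} else "I"

    if not sage_libraries:
        subgroup = "B"   # no reliable SAGE tags
    elif "normal" in sage_libraries:
        subgroup = "C"   # tags present in normal tissue SAGE libraries
    else:
        subgroup = "A"   # tags only in tumor and/or cell line SAGE libraries
    return group + subgroup


# Hypothetical inputs illustrating three of the six subgroups.
print(classify_cluster({"melanoma"}, {"tumor"}))                    # IA
print(classify_cluster({"germ_cell_tumor"}, set()))                 # IIB
print(classify_cluster({"renal", "germ_cell_tumor"}, {"normal"}))   # IC
```

The subgroup test checks for normal-tissue tags before tumor-only tags, so a cluster with tags in both normal and tumor SAGE libraries falls into subgroup C.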
Identification of tissue-restricted mRNA transcripts by RT-PCR
The mRNA expression patterns of 73 of the 1,325 Unigene clusters identified in the current study were analyzed by RT-PCR using a panel of RNA samples derived from 20 normal tissues. Several criteria were used for choosing these particular cancer/testis-associated Unigene clusters for RT-PCR analysis. Since a large proportion of known CT antigens map to chromosome X, all cancer/testis-associated Unigene clusters mapping to chromosome X were tested (19 total). Also, since melanoma and sarcoma express a large number of CT antigens, 10 cancer/testis-associated Unigene clusters having ESTs derived from melanoma or sarcoma libraries were tested. Those cancer/testis-associated Unigene clusters having functional significance in relation to cancer (e.g., transcription factors, adhesion molecules) were also tested (21 total). The remaining 23 Unigene clusters analyzed were chosen at random. In relation to cancer/testis-associated Unigene cluster subgroupings, 59 CT-related Unigene clusters (38 from Group IA, 9 from Group IB and 12 from Group IC) and 14 testis-related Unigene clusters (6 from Group IIA, 7 from Group IIB and 1 from Group IIC) were analyzed by RT-PCR.
As shown in Figure 1, 10 of the 73 Unigene clusters analyzed by RT-PCR were considered differentially expressed, with transcripts detected in a limited number of normal tissues (i.e., mRNA expression detected in fewer than 7 of 20 normal tissues); these are listed in Table II. The remaining 63 gene products were ubiquitously expressed in normal tissues (43 Unigene clusters), yielded ambiguous RT-PCR results owing to amplification of intronless DNA (8 Unigene clusters) or nonspecific amplification (4 Unigene clusters), or could not be amplified (8 Unigene clusters). Of the 10 differentially expressed transcripts identified, 7 were expressed only in testis (0/19 in other normal tissue), and 3 other gene products were detected in a limited number of normal tissues besides testis and ovary (Fig. 1a). Of the 7 testis-restricted transcripts, 2 encode known proteins, Ubiquilin 3 (Ubqln 3, Hs.189184, Group IIA) and disintegrin and metalloproteinase 2 or fertilin β (ADAM2, Hs.177959, Group IC), and 5 encode uncharacterized gene products, Hs.121554 (Group IA), Hs.178062 (Group IA), Hs.245431 (Group IB), Hs.97643 (Group IC) and Hs.195932 (Group IIA).
Table II. Cancer/testis (CT)-associated Unigene clusters identified by database mining having restricted expression profiles in normal tissues as determined by conventional RT-PCR
Category of CT-associated Unigene cluster (Table I)
An uncharacterized gene product termed testis transcript Y 12 (TTY12)
Testis, prostate, ovary, lung, colon, breast
An uncharacterized gene product with homology to GAGE genes
Ubiquilin 3 (Ubqln 3)
With regard to the presence of SAGE tags corresponding to the known gene product ADAM2/Hs.177959 in normal colon tissue (Group IC), our RT-PCR expression data provide no evidence for ADAM2/Hs.177959 expression in normal colon (Fig. 1a). In addition to the 7 testis-restricted transcripts, 3 other Unigene clusters were expressed in a limited number of normal tissues. Transcripts encoding regulatory factor X4 (RFX4, Hs.183009, Group IA) were detected only in testis and brain (0/18 in other normal tissues). Two uncharacterized transcripts, Hs.128836 (Group IC) and Hs.293317 (Group IIA), were expressed in testis, ovary, cervix and lung (0/16 in other normal tissues) and testis, prostate, ovary, lung, breast and colon (0/14 in other normal tissues), respectively.
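The expression categories used in this section follow from the set of normal tissues that scored positive by conventional RT-PCR. The Python sketch below is a hypothetical illustration of that rule: the cutoff of fewer than 7 positive tissues out of 20 is taken from the text, whereas the `categorize` helper and the category labels are assumptions made for this example. Ambiguous results (intronless or nonspecific amplification) are not modeled.

```python
RESTRICTED_CUTOFF = 7  # "fewer than 7 of 20 normal tissues" criterion from the text


def categorize(positive_tissues):
    """Label a transcript from the normal tissues in which it was detected."""
    positives = set(positive_tissues)
    if not positives:
        return "not detected"
    if positives == {"testis"}:
        return "testis-restricted"   # subset of the differentially expressed class
    if len(positives) < RESTRICTED_CUTOFF:
        return "restricted"          # limited number of normal tissues
    return "broad"                   # label assumed here for 7 or more tissues


# Tissue lists below are those reported in the text for these clusters.
print(categorize(["testis"]))                                    # testis-restricted
print(categorize(["testis", "brain"]))                           # restricted (RFX4)
print(categorize(["testis", "ovary", "cervix", "lung"]))         # restricted (Hs.128836)
print(categorize(["testis", "prostate", "ovary", "lung",
                  "breast", "colon"]))                           # restricted (Hs.293317)
```

Hs.293317, detected in 6 of 20 tissues, sits just under the cutoff; a transcript detected in a seventh tissue would fall outside the differentially expressed class under this rule.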
Expression of CT-associated Unigene clusters in cancer
All 7 testis-restricted transcripts can be considered “virtual” CT gene products based on the presence of identical sequences in tumor-derived EST libraries (Group I Unigene clusters) or SAGE libraries (Group IIA Unigene clusters). To confirm this in silico expression profile, the tissue-restricted transcripts defined in the current study were analyzed by RT-PCR using a panel of RNA samples derived from a variety of malignant tissues. As shown in Figure 1b, 3 of the 7 testis-restricted transcripts, ADAM2/Hs.177959, Hs.245431 and Hs.178062, were also expressed in tumor tissue and represent newly defined CT genes. These CT gene products represent 1 known gene product and 2 uncharacterized transcripts. The known protein, ADAM2/Hs.177959, was expressed exclusively in testis and in 2/16 cases of renal cancer (Tables II and III). In accordance with proposed nomenclature for CT antigens,3 ADAM2 was given the CT designation CT15. ADAM2/CT15 is a member of the metalloproteinase-like, disintegrin-like cysteine-rich domain family of sperm surface proteins involved in egg/sperm interactions.20
Table III. Conventional RT-PCR analysis of mRNA expression frequencies of newly defined cancer/testis (CT) genes in normal and malignant tissues
Normal tissue RNA panel included brain, testis, kidney, liver, pancreas, placenta, small intestine, heart, prostate, adrenal gland, spleen, fetal brain, colon, stomach, lung, bladder, ovary, breast, cervix and skeletal muscle.
Melanoma cell lines included SK-MEL-28, -23, -19, -109, -37, -10, -30, and -139.
Additional tumor cell lines tested included SK-LU-14 and SK-LU-17 lung cancer cells, as well as SW1045 and Fuji sarcoma cells.
Another of the CT gene products identified in the current study, designated CT16, is an uncharacterized transcript represented by the Hs.245431 Unigene cluster. CT16/Hs.245431 was expressed in 4/18 melanomas, 7/18 lung cancers, 1/18 breast cancers, 1/9 colon cancers and 7/16 renal cancers (Tables II and III). It was also expressed in several tumor cell lines, including SK-LU-17 lung cancer, SW1045 sarcoma and 4/8 melanoma cell lines (SK-MEL-19, -109, -37 and -10), but not in normal melanocytes. The CT16/Hs.245431 cDNA sequence consists of 763 nucleotides (Genbank accession no. BC009230), containing an open reading frame, which encodes a putative full-length protein of 110 amino acids. The predicted CT16/Hs.245431 amino acid sequence is 30–40% identical to members of the CT antigen family GAGE-A6 and 40–50% identical to the GAGE-B/PAGE-1 family.16 The third newly defined CT gene product, designated CT17, represents the Hs.178062 Unigene cluster, and was expressed in 1/18 breast cancers and 4/16 renal cancers (Tables II and III). The CT17/Hs.178062 cDNA sequence is composed of 877 nucleotides (Genbank accession no. AA470035), encoding a partial protein of 202 amino acids, which is 30% identical to phosphatidylserine-specific phospholipase A1.21 Expression of the remaining 4 testis-restricted gene products, TSPNY/Hs.97643, TTY12/Hs.195932, Ubqln 3/Hs.189184 and Hs.121554 (Table II), was not detected in tumor tissue.
Three gene products defined in the current study as being expressed in a limited number of normal tissues were also expressed in tumor tissue (Table IV). The known gene, regulatory factor X 4 (RFX4, Hs.183009), was expressed exclusively in testis and brain and also in 1/9 colon cancers and 4/8 melanoma cell lines (SK-MEL-19, -37, -10 and -30) but not in normal melanocytes (Tables II and IV). RFX4/Hs.183009 is presented in the Unigene database as a translocation product in breast cancer involving the ubiquitously expressed estrogen receptor 1 gene located on chromosome 6 and a novel, RFX-like gene (RFX-4) on chromosome 12.22
Table IV. Conventional RT-PCR analysis of mRNA expression frequencies of differentially expressed, non-CT genes in normal and malignant tissues
Differentially Expressed Non-CT genes
Normal tissue RNA panel included brain, testis, kidney, liver, pancreas, placenta, small intestine, heart, prostate, adrenal gland, spleen, fetal brain, colon, stomach, lung, bladder, ovary, breast, cervix and skeletal muscle.
Melanoma cell lines included SK-MEL-28, -23, -19, -109, -37, -10, -30 and -139.
Additional tumor cell lines tested included SK-LU-14 and SK-LU-17 lung cancer cells, as well as SW1045 and Fuji sarcoma cells.
A second differentially expressed transcript, represented by the Hs.128836 Unigene cluster, was expressed in normal testis, ovary, cervix and lung and also in 7/18 lung cancers, 2/4 ovarian cancers, 2/18 breast cancers, 1/9 colon cancers and 2/16 renal cancers (Tables II and IV). The cDNA sequence of Hs.128836 is composed of 558 nucleotides encoding a putative partial protein of 164 amino acids with no similarity to characterized proteins or known protein motifs.
A third differentially expressed transcript, represented by the Hs.293317 Unigene cluster, was expressed in normal testis, ovary, lung, breast, prostate and colon and also in 9/18 melanomas, 14/18 lung cancers, 6/18 breast cancers, 14/16 renal cancers, 2/4 ovarian cancers and 3/9 colon cancers (Tables II and IV). Transcripts were also detected in 2 tumor cell lines, SW1045 sarcoma and SK-LU-17 lung cancer, but not in 8 melanoma cell lines, although it was expressed in normal melanocytes. Hs.293317 is a novel cDNA sequence, composed of 549 nucleotides (GenBank accession no. AW002915) encoding a putative full-length protein of 111 amino acids that is 89% identical to the newly defined CT gene, CT16/Hs.245431, described above. Based on the similarity with CT16/Hs.245431, Hs.293317 has been designated CT16.2.
Quantitative analysis of cancer/testis gene expression
To investigate further the mRNA expression profiles of CT15, CT16 and CT17, quantitative real-time RT-PCR was performed using an RNA panel derived from various normal tissues and tumor specimens. For comparison, prototype CT antigens, NY-ESO-18 and MAGE-3,23 were also analyzed in this manner. The normalized level of CT gene expression in normal tissues and cancer, relative to their expression level in testis, is given in Table V. Overall, real-time RT-PCR analyses revealed either no expression or considerably lower levels (3% or less) of CT gene transcripts in normal, non-gametogenic tissues compared with normal testis. In normal tissues, CT15 expression was detected in pancreas at 1% of the level detected in testis. In the case of CT16 mRNA expression in normal tissues, transcripts were detected only in kidney and liver, at 0.1 and 0.4% of the level detected in testis, respectively. Expression of both CT17 and MAGE-3 mRNA was restricted to testis. In the case of NY-ESO-1, the expression levels in normal brain, colon and lung were 3, 2 and 3%, respectively, of the level detected in testis. NY-ESO-1 was also detected in normal ovary at 52% of the level detected in testis. The copy number of CT transcripts per μg of total RNA was also calculated based on these relative expression levels (Table V) and a comparison with a standard curve of homologous cDNA of known copy number. The expression level of CT genes in testis showed wide variation, with CT15 having the highest copy number (445,000 copies/μg RNA), followed by CT16 (149,000 copies/μg RNA), NY-ESO-1 (31,300 copies/μg RNA), CT17 (16,100 copies/μg RNA) and MAGE-3 (15,060 copies/μg RNA)
Table V. Quantitative analysis of mRNA encoding cancer/testis gene products in normal and malignant tissues relative to testis
Expression Level of mRNA transcripts encoding CT gene products in various tissues relative to mRNA in normal testis (%)
Expression of CT15/ADAM2 was analyzed in 3 renal cancer specimens (tumor #1, RCC1; tumor #2, RCC5; tumor #3, RCC6). Expression of CT16 was analyzed in 2 melanoma specimens (tumor #1, Mel-1; tumor #2, Mel-11) and a breast cancer specimen (tumor #3, HBR-297). Expression of CT17 was analyzed in a breast cancer (tumor #1, HBR-297), renal cancer (tumor #2, RCC5) and a melanoma specimen (tumor #3, Mel-1). Expression of NY-ESO-1 was analyzed in 2 lung cancer specimens (tumor #1, LU356; tumor #2, LU339) and a renal cancer specimen (tumor #3, RCC1). Expression of MAGE-3 was analyzed in 2 melanoma specimens (tumor #1, Mel-1; tumor #2, Mel-11) and a lung cancer specimen (tumor #3, LU356).
The expression level of CT genes in tumor tissue was also analyzed by quantitative real-time RT-PCR. In renal cancer specimens, RCC1, RCC5 and RCC6, the expression levels of CT15 were 2, 0.07 and 0.8% of the level detected in testis (Table V), respectively. Both the RCC1 and RCC6 tumors were positive for CT15 expression by conventional RT-PCR, whereas RCC5 was negative (Fig. 1b). As shown in Table V, the level of CT16 expression in 2 melanoma samples was 3.1 and 64 times the level detected in testis, respectively. In a breast cancer specimen, the level of CT16 expression was 3.2 times the level detected in testis. These 2 melanoma samples, and the breast cancer sample, were positive for CT16 expression when analyzed by conventional RT-PCR (Fig. 1b). In a breast cancer specimen (HBR-297) and renal cancer specimen (RCC5), the level of CT17 expression was 2.89 and 0.19 times the level detected in testis (Table V), respectively, which is consistent with conventional RT-PCR results. The levels of NY-ESO-1 expression in 2 lung cancer specimens were 14 and 19% of the level detected in testis (Table V), respectively. In a renal cancer specimen, NY-ESO-1 was expressed at a level that was 6% of the level detected in testis. Finally, MAGE-3 expression in 2 melanoma specimens and a lung cancer specimen was 8, 11 and 30% of the level detected in testis, respectively.
EST databases are repositories of the human transcriptome, containing a wealth of nucleic acid sequence information and mRNA expression data. An extension of the EST database is Unigene, which pools information from public domain sequencing projects, including EST, Genbank, ORESTES and human genome projects, and links this information to a number of relevant databases, e.g., those dedicated to scientific literature, the human genome, the proteome, single nucleotide polymorphisms and gene mutations. In conjunction with the Cancer Genome Anatomy Project, the Unigene database also provides tools for analyzing EST data, including in silico serial analysis of gene expression (SAGE), gene expression profiling and digital differential display. Because CT antigens represent promising target molecules for antigen-specific cancer vaccines, the current study mined the Unigene database for gene clusters containing ESTs derived exclusively from cancer and testis cDNA libraries.
The current bioinformatic analysis identified approximately 1,300 different cancer/testis-associated Unigene clusters. Preliminary evidence in support of the approach used to search the Unigene database was provided by the presence of 4 known CT antigens, CT11p, GAGE 4, MAGEB1 and SAGE, among these 1,300 cancer/testis-associated Unigene clusters identified in the current study. Conversely, this bioinformatic analysis failed to identify members of 9 other previously identified CT gene families cited in the literature. The reason for this is that the database search tool (Xprofiler) used in the current study does not cross-reference more than 2 groups of cDNA libraries. Unigene clusters corresponding to the CT antigens not identified in the present study contain ESTs derived from cDNA libraries outside the 2 cross-referenced pools (normal testis and cancer). For example, NY-ESO-1, SSX-2/HOM-MEL-40 and CT-7 Unigene clusters contain ESTs from placenta; BAGE, SCP-1 and CT-10 Unigene clusters contain ESTs from cell lines; the BRDT/CT9 Unigene cluster contains an EST from a subtracted testis library; and the sp32/OY-TES-1 Unigene cluster contains ESTs from normal retina and fetal heart. Also, the CTAGE-1 Unigene cluster contains only normal testis ESTs, but not tumor-derived ESTs.
The mRNA expression patterns of 73 of these cancer/testis-associated Unigene clusters were examined by RT-PCR using a panel of RNA samples derived from various normal and malignant tissues. Three of the 73 gene products, CT15/Hs.177959, CT16/Hs.245431 and CT17/Hs.178062, were shown by conventional RT-PCR to be expressed exclusively in testis and malignant tissues and therefore have expression profiles analogous to CT antigens. Other similarities exist between the newly defined CT genes and known CT antigens. Two of the identified CT genes, CT16/Hs.245431 and CT17/Hs.178062, represent Unigene clusters that contain ESTs from testis, as well as melanoma and sarcoma cDNA libraries, respectively. These 2 tumor types are known to express a large proportion of the known CT antigens.2 Also, CT16/Hs.245431 maps to chromosome X, the site in the genome where 8 of the 14 known CT antigens map. Furthermore, the frequency of mRNA expression of the newly defined CT genes in cancer is consistent with those of previously defined CT antigens (20–40% of a given tumor type3), ranging from 11 to 44% in the case of CT16 expression in colon cancer and renal cancer, respectively, and 5 to 25% in the case of CT17 expression in breast cancer and renal cancer, respectively. Conversely, the apparent restricted nature of CT15/ADAM2 expression in normal testis and renal cancer is unique among CT genes. However, a relatively small sample size was examined in the current study, and a much broader mRNA expression analysis is required before definitive conclusions regarding their expression frequencies in cancer can be made.
With the exception of a proacrosin binding protein, OY-TES-1,14 and synaptonemal complex protein-1,13 the biologic functions of CT antigens are not known. In the current study, 2 of the identified CT gene products encode proteins with known functions or functional motifs. ADAM2/CT15/Hs.177959 is a member of the metalloproteinase-like, disintegrin-like cysteine-rich domain family of cell surface proteases/adhesion molecules and is believed to be involved in egg/sperm membrane interactions.20 Although ADAM2/CT15 lacks a functional metalloproteinase domain, it does contain a disintegrin domain, which may bind to integrin α6β1 or other similar molecules.24 Another CT gene product, CT17/Hs.178062, has similarity with phospholipases, which play a role in sperm acrosomal exocytosis during fertilization.25
The remaining CT gene, CT16/Hs.245431, and its relative CT16.2/Hs.293317, are 30–50% similar to GAGE proteins.6, 16 Based on the similarities among the GAGE A family (90% or greater amino acid identity), and between the GAGE-A and GAGE-B families (40–50% amino acid identity), it was concluded that Hs.245431/CT16 represents a member of a new GAGE gene family, tentatively termed the CT16 family, which also includes the tissue-restricted gene product CT16.2/Hs.293317. A detailed analysis of these newly defined GAGE-like genes is currently under way and will be reported separately. The biologic functions of GAGE proteins are not known, and few immunologic responses to GAGE proteins have been reported. With the exception of GAGE-A1 and CT16, the majority of GAGE genes, including CT16.2, are expressed in a narrow range of normal adult tissues.26 Given the similarity among members of individual GAGE gene families, it is possible that the lack of an immune response to GAGE proteins reflects tolerance to highly similar and more universally expressed GAGE genes.
Cancer/testis antigens are defined by having conventional RT-PCR expression profiles that are restricted to normal gametogenic tissues and cancer, and also by their immunogenicity in cancer patients. Two questions arise: (i) How many PCR cycles should be used to define tissue negativity in conventional RT-PCR; and (ii) what constitutes biologically significant mRNA expression in relation to immunogenicity? In the current study, putative CT genes were defined by unconditional, testis- and cancer-restricted expression at 35 PCR cycles of conventional RT-PCR and were then analyzed further by real-time quantitative RT-PCR at 40 PCR cycles, using 2 housekeeping genes as endogenous controls for normalization of mRNA content. The results of real-time quantitative RT-PCR of 5 CT genes either confirmed the absence of CT gene expression in non-gametogenic tissues (MAGE-3 and CT17) or indicated low-level CT gene expression in normal tissues (CT15, CT16 and NY-ESO-1), even though conventional RT-PCR failed to provide evidence of their expression. The presence of mRNA transcripts does not necessarily mean they are translated into a biologically significant level of protein. In the case of NY-ESO-1, low-level mRNA expression was detected in brain by real-time RT-PCR (but not by conventional RT-PCR8), whereas an analysis of NY-ESO-1 protein expression by immunohistochemistry detected expression only in testis and developing ovary.27
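The normalization step described above can be illustrated with a minimal sketch. It assumes the widely used comparative Ct (2^-ΔΔCt) calculation, with the Ct values of the 2 housekeeping genes averaged and testis taken as the calibrator tissue; the text does not state which calculation was actually used, so this is an illustration rather than the method of the current study. The function name, the 100% amplification efficiency and all Ct values are hypothetical.

```python
from statistics import mean

def relative_expression(ct_target, ct_refs, ct_target_cal, ct_refs_cal):
    """Relative mRNA level of a target gene versus a calibrator tissue.

    Assumption (not stated in the text): comparative Ct method with
    ~100% amplification efficiency, i.e. one cycle = a 2-fold difference.

    ct_target      Ct of the target gene in the test tissue, or None if no
                   amplification was seen within the cycle limit (e.g. 40)
    ct_refs        Ct values of the housekeeping genes in the test tissue
    ct_target_cal  Ct of the target gene in the calibrator (e.g. testis)
    ct_refs_cal    Ct values of the housekeeping genes in the calibrator
    """
    if ct_target is None:
        # No detectable product: report zero rather than extrapolating.
        return 0.0
    # Averaging Ct values corresponds to the geometric mean of quantities.
    delta_sample = ct_target - mean(ct_refs)
    delta_calibrator = ct_target_cal - mean(ct_refs_cal)
    return 2.0 ** -(delta_sample - delta_calibrator)

# Hypothetical Ct values for illustration only (not data from this study).
level = relative_expression(30.0, [20.0, 22.0], 25.0, [20.0, 22.0])
print(level)  # fraction of the calibrator (testis) level
```

Expressed this way, a tissue scored negative at 35 cycles of conventional RT-PCR can still return a small nonzero fraction of the testis level, which is the situation described for CT15, CT16 and NY-ESO-1.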
In addition to CT genes, the current study also identified 3 highly tissue-restricted gene products, RFX4/Hs.183009, Hs.128836 and Hs.293317, which are also expressed in cancer. RFX4 was expressed only in normal testis and brain, as well as in 1/9 colon cancers and 4/8 melanoma cell lines. RFX4 can therefore be considered a putative member of a group of proteins termed cancer/testis/brain antigens (CTB antigens). Other CTB antigens include CDR,28 Ma1,29 and Ma2,30 which were identified by Posner and colleagues as the target molecules recognized by autoantibodies in patients with paraneoplastic syndromes. RFX4 belongs to a family of DNA binding proteins that regulate transcription of MHC class II genes.31 Defects in genes encoding RFX proteins, such as RFXANK, RFX5 and RFXAP, lead to the development of bare lymphocyte syndrome, a severe autosomal recessive immunodeficiency disease (reviewed in ref. 32). Given the downregulated expression of MHC genes in testis, brain and cancer, expression of RFX genes in these tissues may be of significance. Two uncharacterized transcripts, Hs.128836 and Hs.293317, also had mRNA expression profiles restricted to a limited number of normal tissues and cancer. Because they lack recognizable functional domains, the biologic significance of these gene products remains to be determined.
In addition to the 3 CT genes and 3 tissue-restricted transcripts, 4 other gene products having testis-restricted expression profiles were identified, including Hs.121554, Hs.97643, Hs.195932 and Hs.189184. Unigene clusters corresponding to these 4 testis-restricted gene products also contain ESTs and/or SAGE tags derived from tumor tissue. Continued expression analysis of these gene products, using enlarged panels of RNA derived from a wider variety of malignant tissues, may lead to their detection in tumor tissue and subsequent classification as CT genes. With regard to the remaining 1,200 cancer/testis-associated Unigene clusters that were not examined by RT-PCR, further study will focus on those gene products having in silico expression profiles corresponding to CT-related (Group I) Unigene clusters and testis-related Unigene clusters with SAGE tags derived from tumor tissues (Group IIA). A method described by Loging and colleagues19 for rapid expression screening by real-time RT-PCR should advance these studies.
The use of individual CT gene products as target molecules for generic cancer vaccines may be inadequate because of their relatively low expression frequencies among cancer patient populations, heterogeneous expression within the tumor itself and antigen loss by a given tumor. An alternative is the development of polyvalent cancer vaccines containing epitopes encoded by many different CT genes. Such polyvalent vaccines would be an effective way to increase the number of cancer patients eligible for vaccination and may also overcome some of the obstacles associated with tumor heterogeneity and immune escape. To this end, the current study added CT15, CT16 and CT17 to the repertoire of proteins available for polyvalent CT cancer vaccines. Furthermore, ADAM2/CT15 can be considered a target molecule with dual immunotherapeutic value, since its cell surface localization makes it a potential target for monoclonal antibody-based immunotherapies as well. In conclusion, the Unigene database contains a wealth of information that, when tapped into, can lead to the discovery of new cancer-related genes of therapeutic significance.
We thank Mr. J. Curley of the LICR, New York Branch, for his technical assistance.