Transient myeloproliferative disorder (TMD), previously also referred to as transient leukaemia (TL) or transient abnormal myelopoiesis (TAM), is a self-regressing neoplasia that occurs almost exclusively in babies with Down's syndrome (DS) during the first 4 weeks of life (Arceci, 2002; Gamis & Hilden, 2002; Taub & Ravindranath, 2002). It is estimated that as many as 10% of all DS cases suffer from a TMD episode (Zipursky et al, 1992, 1995; Al-Kasim et al, 2002). The self-regressive nature of the disorder means that it could provide clues to mechanisms that might be exploited to control myeloproliferation (Gamis & Hilden, 2002). Approximately 20–30% of cases with TMD develop acute megakaryoblastic leukaemia (AMKL, or AML-M7) with an onset at 2–4 years of age (Zipursky et al, 1992, 1994; Homans et al, 1993; Lu et al, 1993). DS children have a 500-fold relative risk of childhood AMKL (Zipursky et al, 1994; Gamis & Hilden, 2002), although all other types of acute childhood leukaemia (ALL and non-M7 AML) are also more frequent in DS children than in non-DS children (Zipursky et al, 1992). Recently, Wechsler et al (2002) observed acquired mutations in the erythroid/megakaryocyte lineage-specific transcription factor GATA1 in the genomic DNA samples from six of six examined cases of DS with AMKL, and none in 92 control cases, which included DS with other kinds of myeloid leukaemia, AMKL cases in non-DS individuals, and healthy controls. All mutations were acquired (DNA from remission samples did not show them), and all resulted in a premature translation termination in the GATA1 activation domain (encoded by the second exon). The affected cells produced only the shorter version of the GATA1 protein (GATA1s), which lacks the activation domain. This domain is important for the proper trans-activating potential of the protein (Shimizu et al, 2001).
Following this first report, other groups found that a variety of similar mutations with identical protein consequences are acquired in utero in TMD patients (Groet et al, 2003; Hitzler et al, 2003; Mundschau et al, 2003; Rainis et al, 2003; Xu et al, 2003). The mutations do not predict whether a TMD will progress to a later AMKL (Groet et al, 2003; Rainis et al, 2003; Xu et al, 2003). When individual TMD patients who developed AMKL a few years later were followed up, the proliferating clone had retained the same GATA1 mutation (Hitzler et al, 2003; Rainis et al, 2003). These findings clearly implicate mutation in GATA1 as an early event (Ahmed et al, 2003) with a very important role in the pathogenesis of AMKL in DS. Four questions remain unanswered: the mechanisms that predispose the early, trisomy 21-bearing, myeloid precursor cells to acquire a GATA1 mutation in the first place; the exact pathomechanism by which GATA1 mutation leads to TMD; the mechanism responsible for the spontaneous regression of TMD; and the nature of the additional changes (‘second hits’) required for the development of the progressive AMKL. To investigate these questions, microarray hybridizations were used to obtain global transcriptional profiles of DS-TMD and DS-AMKL and to identify genes whose expression levels differ significantly between these two conditions.
Transient myeloproliferative disorder (TMD) is a unique, spontaneously regressing neoplasia specific to Down's syndrome (DS), affecting up to 10% of DS neonates. In 20–30% of cases, it recurs as progressive acute megakaryoblastic leukaemia (AMKL) at 2–4 years of age. The TMD and AMKL blasts are morphologically and immuno-phenotypically identical, and have the same acquired mutations in GATA1. We performed transcript profiling of nine TMD patients, comparing them with seven AMKL patients, using Affymetrix HG-U133A microarrays. Similar overall transcript profiles were observed between the two conditions, which were only separable by supervised clustering. Taqman analysis on 10 TMD and 10 AMKL RNA samples verified the expression of selected differing genes, with statistical significance (P < 0·05) by Student's t-test. The Taqman differences were also reproduced on TMD and AMKL blasts sorted by a fluorescence-activated cell sorter. Among the significant differences, CDKN2C, the effector of GATA1-mediated cell cycle arrest, was increased in AMKL but not TMD, despite the similar level of GATA1. In contrast, MYCN (neuroblastoma-derived oncogene) was expressed in TMD at a significantly greater level than in AMKL. MYCN has not previously been described in leukaemogenesis. Finally, the tumour antigen PRAME was identified as a specific marker for AMKL blasts, with no expression in TMD. This study provides markers discriminating TMD from AMKL-M7 in DS. These markers have potential as predictive, diagnostic and therapeutic targets. In addition, the study provides further clues to the pathomechanisms distinguishing the self-regressive from the progressive phenotype.
Materials and methods
Patients and samples
The project was approved by the North East London Health Authority's Ethical Committee. Samples were surplus or archived clinical material collected by the tissue bank of the Italian National Down's Syndrome Association (CEPIM; samples kept by the team of Galliera Genetic Bank, http://www.ggb.galliera.it/), the Italian National Association for Paediatric Haematology-Oncology (AIEOP), or the UK-Childhood Cancer Study Group. Written consent was obtained by the tissue banks for all subjects. All patients (detailed in Table I) were diagnosed with DS at birth, and confirmed by cytogenetics to carry a constitutional full trisomy 21. The diagnosis of AML-M7 was made by morphological criteria and confirmed by immunophenotyping. All samples consisted of mononuclear cells isolated on a Ficoll gradient from peripheral blood (PB) or bone marrow (BM), during the course of routine clinical procedures. Non-leukaemic (normal) controls (n = 11) included nine normal donor BM samples and two donated normal PB samples. DS non-leukaemic controls comprised three PB samples from DS children who never had TMD or leukaemia. After being frozen in 10% dimethylsulphoxide and stored in tissue banks, samples were washed in phosphate-buffered saline, and used for the isolation of RNA. In some patients, blast cells used in Taqman (Applied Biosystems, Warrington, UK) reverse transcription polymerase chain reaction (RT–PCR) experiments were sorted using FACSVantage for CD34+ (in patients with CD34+ blast cells) or CD45 and CD33 and/or CD7 in the remaining patients (for details see Table I and supplementary Fig 2 online; all supplementary figures and tables for online access can be found at http://www.smd.qmul.ac.uk/haematology/). As the sorted blast cells were positive for either CD34 or CD33 (or both) in all sorted patient samples, the normal controls for sorted cells were prepared by sorting separately for either CD33+ or CD34+ cells from normal or remission tissues.
The choice of control material was made to match the patient population as closely as possible, limited by the availability of consented samples of sufficient quality and quantity within our tissue bank sources. Normal sorted samples hybridized to microarrays were further limited to only those from which >5 μg of starting RNA could be obtained. The non-leukaemic (normal) sorted samples used for Taqman RT–PCR included magnetic bead-separated CD34+ cells (n = 6) from four normal cord blood samples, one non-leukaemic DS cord blood sample, and one pool of 10 BM-complete remission samples from DS children after various types of leukaemia. The CD34− effluents from those same separations were separated, by a fluorescence-activated cell sorter (FACS), for CD33+ cells in all except two normal cord blood samples. An additional sample of FACS-separated CD33+ cells from one adult male BM brought the total number of CD33+ normal samples to n = 5. This BM sample (NBM33+), along with CD33+ cells from one normal male cord blood (NCB33+) and one non-leukaemic DS male cord blood sample (DSCB33+) were also analysed on microarrays as non-leukaemic myelocyte controls. Although the normal sorted samples comprise a slightly heterogeneous group, they were included mainly for illustration purposes and conclusions in the study were not based on these samples, but strictly on comparisons between the two groups of patient samples. For details of cell purification and examples of purity assessment by FACS, see supplementary Fig 2 online. For all non-leukaemic normal and DS control samples the GATA1 exon 2 was amplified by RT–PCR from the RNA and confirmed by agarose gel electrophoresis, and sequencing using ABI3100 automated sequencer, to have the full-length wild type GATA1 sequence.
|Sample||Age at Dg||Sex||Blasts (%)||Material||RNA utilization||GATA1 exon 2||Therapy||Outcome|
|TMD1||5 d||M||85||Dg||U133A(Stand), Taqman||270–271ins7bp||None||Complete remission (2 years)|
|TMD2||2 d||F||50||Dg||U133A(Stand), Taqman||259dup34bp||None||Complete remission (>7 years)|
|TMD3||5 d||M||60||Dg||U133A(Stand), Taqman||161C>T, STOP||None||Complete remission (>7 years)|
|TMD4||20 d||F||55||Dg||Taqman||160–161del2bp||None||Died later (non-haem. cause)|
|TMD5||Newborn||M||ND||Dg||U133A(Amp), Taqman||Splice Δ-exon 2||None||Complete remission (>8 years)|
|TMD6||<1 month||F||ND||Dg||U133A(Amp), Taqman||Splice Δ-exon 2||None||Complete remission (>7 years)|
|TMD7||24 d||M||20||Dg||Insufficient RNA||Splice Δ-exon 2||None||Complete remission (2·5 years)|
|TMD8||4 d||M||50||Dg||U133A(Stand), Taqman||344–345ins2bp||None||ND|
|TMD9||15 d||M||33||Dg||U133A(Stand), Taqman||263delG||None||Complete remission (>9 years)|
|TMD10||3 d||M||98||Dg||U133A(Stand), Taqman||245–266del22bp||None||Died during TMD|
|TMD11||5 d||M||78||Dg||U133A(Stand), Taqman||263delG||None||Complete remission (>1 years)|
|TMD12blasts||||M||>94||CD33+34+45+||Taqman||303dup10bp||None||Complete remission (>3 months)|
|AMKL1||38 months||F||50||Dg||U133A(Stand), Taqman||270–271ins7bp||Treated||Complete remission (>3 years)|
|AMKL2||28 months||M||70||Dg||U133A(Amp), Taqman||344–345ins2bp||Treated||Complete remission (>9 years)|
|AMKL3||24 months||F||ND||Dg||Taqman||Splice Δ-exon 2||Treated||Complete remission (>9 years)|
|AMKL4||14 months||M||40||Dg||U133A(Stand), Taqman||Splice Δ-exon 2||Treated||Died during therapy|
|AMKL5||26 months||F||95||Dg||U133A(Stand), Taqman||251delT||Treated||Died during therapy|
|AMKL6||22 months||F||32||Dg||U133A(Stand), Taqman||160–161del2bp||Treated||Died during therapy|
|AMKL7||24 months||M||60||Dg||U133A(Stand), Taqman||Splice Δ-exon 2||Treated||Died during therapy|
|AMKL8||24 months||M||>50||Dg||U133A(Amp)||197 G>T, STOP||Treated||ND|
|AMKL9blasts||36 months||M||>94||CD34+||Taqman||301 C>G, STOP||Treated||ND|
|AMKL11||24 months||F||50||Dg||Taqman||262–263 ins7bp||Treated||ND|
|CMK||10 months||M||100||Cell line||U133A(Stand), Taqman||117–119del3bpinsC||Treated**||Died during therapy**|
Preparation and utilization of RNA from clinical samples
We isolated total RNA from mononuclear cells from PB or BM samples taken during the presenting phase of the disease (diagnosis, pretreatment), from 12 DS-AMKL patients and 12 DS-TMD patients. The details of the age, sex, percentage blasts, sorted blast samples, and follow-up information are listed in Table I. More details (such as blast karyotype) can be found in Groet et al (2003) for the majority of the samples. None of the TMD cases were treated, and in all of them the blast proliferation regressed spontaneously. One died from the severe form of TMD (while his blasts were in spontaneous regression), and one died later from non-haematological causes, whereas all others (data availability permitting) are still alive. All AMKL cases underwent therapy using a variety of therapeutic regimens, and the outcomes, where available, are given in Table I. All analysed cases were sequenced for GATA1 in their RNA [majority published (Groet et al, 2003), with six new GATA1 mutations identified since that publication], and all of these data are also shown in Table I. Seven AMKL and nine TMD samples yielded RNA of sufficient quantity and quality for hybridizations to Affymetrix Human Genome-U133A microarrays. All microarray data and related information are Minimum Information about a Microarray Experiment (MIAME) compliant and have been deposited in the European Bioinformatics Institute (EBI) MIAMExpress microarray data repository (http://www.ebi.ac.uk/arrayexpress). Sample TMD7 had insufficient RNA for any experiments besides GATA1 sequencing. The entire sample AMKL8 was used up for the array hybridization, and there was none left for the Taqman experiments. All other samples from Table I were used in Taqman verification experiments (a total of 10 TMD and 10 AMKL unsorted diagnosis samples). Note that additional samples TMD12 and AMKL9 were available as sorted cells only, and were used in Taqman comparisons alongside sorted cells from other patients.
The RNA was extracted from the samples indicated in Table I using Qiagen RNeasy mini-columns (Qiagen, Crawley, UK) and processed according to the protocol recommended by Affymetrix (http://www.affymetrix.com), with samples TMD5, TMD6, AMKL2 and AMKL8 undergoing an additional amplification cycle as described in the Affymetrix small sample protocol, due to the low amount of starting material available in those cases. Labelled cRNA samples were first hybridized to Affymetrix Test3 arrays and scaled to a target intensity of 500 to check sample quality. For the standard labelling protocol, only those samples in which >15% of probe sets were called present, and the 3′/5′GAPDH ratio was <3 (mean 1·8, range: 0·99–2·89) were used. Those samples that passed the selection criteria were hybridized to Affymetrix Human Genome-U133A GeneChips, scanned (Agilent Technologies UK Ltd, Stockport, UK) and analysed using a variety of software packages. RNA samples from three non-leukaemic myelocyte controls (NBM33+, NCB33+, DSCB33+), and from the CMK cell line [purchased from the German Collection of Microorganisms and Cell Cultures (DSMZ) and cultured in Roswell Park Memorial Institute (RPMI) 1640 medium with 10% foetal calf serum and 2 mmol/l glutamine (Sigma, Poole, UK)] were also analysed on Affymetrix Test3 arrays first, passed all criteria as above, and were labelled using the standard (non-amplified) protocol for hybridization and analysis on HG-U133A GeneChips. All RNA samples were processed under identical conditions on site, and were hybridized to arrays ordered from the same batch.
Initial analysis of the scanned images was performed using MicroArray Suite (MAS) v5 (Affymetrix). For absolute analysis, each chip was scaled to a target intensity of 500, and probe sets were assigned a signal intensity and detection call of Present, Marginal or Absent, based on a P-value calculated by a one-sided Wilcoxon's signed rank test on raw cell intensities. The threshold P-values for ‘Present’ and ‘Marginal’ calls were set to 0·05 and 0·065 respectively. The absolute data (signal intensity, detection call and detection P-value) were exported into Genespring v5.1 (Silicon Genetics, Redwood City, CA, USA) software for analysis by a parametric test based on the cross-gene error model (PCGEM), using nine TMD and seven AMKL replicates. In addition to the PCGEM comparison, two other strategies were used to identify differentially expressed genes in DS-TMD and DS-AMKL in this data set: ‘MAS’ analysis and paired Student's t-test. In the ‘MAS’ analysis, each DS-TMD sample was compared with DS-AMKL samples that had similar blast percentages (details available as supplementary Fig 1), and a ‘change P-value’ and ‘change call’ (based on a P-value calculated by a signed rank test; threshold P-value for significant changes set at 0·05) were assigned to each probe set by the Affymetrix MAS software. The absolute data and comparative data (fold change, change call and change P-value) were exported into a relational database program based on Filemaker Pro (Filemaker Inc., Santa Clara, CA, USA) for qualitative comparison based on detection and change calls assigned by the ‘comparison analysis algorithm’ of the MAS software. This approach was successfully used in previously published studies (Mulligan et al, 2002).
The absolute data (signal intensity, detection call and detection P-value) were exported into Microsoft Excel for analysis by paired t-test. See results for further details of the analyses performed.
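The detection-call thresholds described above can be sketched as a simple mapping from detection P-value to call. This is only an illustration and not the MAS software itself: the function name is hypothetical, and the handling of P-values that fall exactly on a threshold (strict inequality) is an assumption not stated in the text.

```python
def detection_call(p_value, present_threshold=0.05, marginal_threshold=0.065):
    """Map a detection P-value to 'Present', 'Marginal' or 'Absent'.

    Thresholds are those given in the text (0.05 and 0.065).
    Treating the boundaries with strict '<' is an assumption.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if p_value < present_threshold:
        return "Present"
    if p_value < marginal_threshold:
        return "Marginal"
    return "Absent"


# Illustrative (made-up) detection P-values for three probe sets.
for p in (0.01, 0.06, 0.50):
    print(p, detection_call(p))
```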
Quantitative real-time RT–PCR
Unprocessed total RNA was reverse transcribed using Superscript II reverse transcriptase (Invitrogen, Paisley, UK). cDNA (1–50 ng; 0·01–100 ng for standard curves) was amplified using Assay-On-Demand Taqman probes, primers and protocols (Applied Biosystems, Warrington, UK; Preferentially Expressed in Melanoma (PRAME), Hs00196132_m1; CDKN2C, Hs00176227_m1; MYCN, Hs00232074_m1; SLC37A1, Hs00375251_m1) in the ABI PRISM 7700 Sequence Detection System. The numbers of samples for the data reported were as follows: normal (unsorted), n = 11; DS non-leukaemic samples, n = 3; DS-TMD Diagnosis (Dg.) samples, n = 10; DS-AMKL Dg. samples, n = 10; DS-AMKL remissions, n = 3; sorted normal (including non-leukaemic DS) CD33+, n = 5, CD34+, n = 6; DS-TMD blasts, n = 6; DS-AMKL blasts, n = 5. Table I details all individual patient samples that were used in Taqman quantitative RT–PCR measurements. K562 cells [from the European Collection of Cell Cultures (ECACC), cultured in RPMI 1640 medium with 10% fetal calf serum and 2 mmol/l glutamine (Sigma)] and CMK cells were included as standards. Analysis was carried out in Sequence Detector v1.7 (Applied Biosystems) and Microsoft Excel. In the cases of PRAME and MYCN, the comparative Ct method was used for analysis, as the standard curve for these genes had the same gradient as the one for GAPDH; the remaining genes were measured relative to a standard curve. All data were normalized to GAPDH levels of the same samples. The sequence of the GAPDH-FAM probe used was CAAGCTTCCCGTTCTCAGCC, the forward primer sequence was GAAGGTGAAGGTCGGAGT and the reverse primer sequence was GAAGATGGTGATGGGATTTC (Van Trappen et al, 2001).
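For readers unfamiliar with the comparative Ct method mentioned above, the calculation can be sketched as follows. This is a generic illustration rather than the Sequence Detector or Excel workflow used in the study: the function name and the Ct values are hypothetical, and the factor of 2 per cycle assumes near-100% amplification efficiency for both the target gene and GAPDH.

```python
def relative_expression(ct_target_sample, ct_gapdh_sample,
                        ct_target_calibrator, ct_gapdh_calibrator):
    """Fold difference of a target gene in a sample versus a calibrator.

    Implements 2 ** (-ddCt), where each dCt is the target Ct minus the
    GAPDH Ct of the same sample (normalization to GAPDH).
    Assumes equal, near-100% amplification efficiency for both assays.
    """
    d_ct_sample = ct_target_sample - ct_gapdh_sample
    d_ct_calibrator = ct_target_calibrator - ct_gapdh_calibrator
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the target crosses threshold 5 cycles earlier
# (relative to GAPDH) in the sample than in the calibrator.
fold = relative_expression(24.0, 18.0, 29.0, 18.0)
print(fold)  # 32.0
```

A lower Ct means more starting template, so a negative ddCt corresponds to higher expression in the sample than in the calibrator.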
Results
Unsupervised clustering attempt to separate DS-TMD from DS-AMKL
The absolute Affymetrix GeneChip data (signal intensities, detection calls and detection P-values) were exported into Genespring v5.1 software (Silicon Genetics). Following normalization (see Fig 1 legend), an unsupervised hierarchical clustering algorithm (based on Spearman correlation) was applied to all probe sets. The experiment tree obtained (Fig 1) showed that this type of analysis was unable to cluster the TMD cases separately from the AMKL cases, indicating that there were not many substantial and consistent differences between transcript profiles of DS-TMD and DS-AMKL. An additional observation was that three of four samples prepared for microarray hybridization using the small sample (amplification) protocol, AMKL2, TMD5 and TMD6, clustered together, indicating that this method of preparation may potentially introduce a slight artificial data skewing.
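As an illustration of the similarity measure named above, the following sketch computes a Spearman correlation between two expression profiles and converts it into a distance of the kind a hierarchical clustering algorithm could use. It is not the Genespring implementation: the function names are hypothetical, the conversion to a distance as 1 minus the correlation is an assumption, and the signal values are made up.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


def spearman_distance(x, y):
    """Assumed clustering distance: 1 - Spearman correlation."""
    return 1.0 - spearman(x, y)


# Made-up signal intensities for four probe sets in two samples.
sample_a = [120.0, 300.0, 45.0, 800.0]
sample_b = [100.0, 350.0, 60.0, 500.0]
print(spearman_distance(sample_a, sample_b))  # 0.0 (identical rank order)
```

Because only rank order enters the calculation, two samples with the same ordering of probe-set intensities have zero distance even when the absolute signals differ.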
Search for significantly differing genes between DS-TMD and DS-AMKL
In order to find significant differences between DS-TMD and DS-AMKL, the data were filtered to remove all probe sets that changed by less than twofold between DS-TMD and DS-AMKL. A PCGEM was applied to the remaining probe sets (a total of 1288), and generated a list of 133 genes (P < 0·05; range: 0·000001–0·0499; listed in supplementary Table I online, and called ‘PCGEM’ list in Fig 2). About 64 probe sets (5% of 1288) would be expected to pass this threshold by chance. Therefore, the differences yielded by this analysis alone could not be considered statistically significant. In an attempt to reduce the rate of false positive detection, the Benjamini and Hochberg multiple testing correction was applied, but did not detect any significant differences. We therefore utilized two additional strategies to find reliable differences in our data set.
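The chance expectation quoted above, and the logic of the Benjamini and Hochberg correction, can be sketched as follows. This is a generic textbook implementation of the step-up procedure, not the Genespring routine used in the study; the function names and the example P-values are hypothetical.

```python
def expected_false_positives(n_tests, alpha):
    """Number of tests expected to pass a P-value cut-off by chance alone."""
    return n_tests * alpha


def benjamini_hochberg(p_values, q=0.05):
    """Return the indices of tests declared significant at FDR level q.

    Step-up procedure: sort the P-values, find the largest rank k with
    p_(k) <= k * q / m, and accept the k smallest P-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# 1288 probe sets tested at P < 0.05, as in the text.
print(expected_false_positives(1288, 0.05))

# Made-up P-values: the first three survive the correction, the last does not.
print(benjamini_hochberg([0.001, 0.02, 0.03, 0.5]))  # [0, 1, 2]
```

With 133 probe sets passing against roughly 64 expected by chance, a large share of the list may be false positives, which is why the correction and the two further strategies were applied.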
The second search strategy took into account the blast percentage of the samples (Table I) in order to minimize the influence of transcript profile components contributed by non-blast RNA. Each TMD sample was compared with an appropriate AMKL sample that had a similar blast percentage using Affymetrix Microarray Suite v5 (MAS) software. This allowed for 14 pairwise comparisons (the exact pairs compared are listed in supplementary Fig 1 online), in which individual samples were reused in different pairs within the similar blast percentage range; nine of the 14 pairs were also matched for the tissue of origin (PB or BM). In order to eliminate the potential skewing of data observed in Fig 1, all four ‘amplified protocol’ samples were excluded from this analysis. Each pairwise comparison was searched to find genes that had significantly changed, i.e. increased to, or decreased from, a detectable (‘Present’) signal in DS-TMD versus DS-AMKL. This analysis yielded a total of 161 probe sets (listed in supplementary Table II online, and called ‘MAS’ list in Fig 2), of which 95 were higher in DS-TMD and 66 were lower in DS-TMD.
The third search strategy compared only TMD versus AMKL samples, which were perfectly matched for method of RNA labelling (‘standard’ or ‘amplified’), tissue of origin (PB or BM) and blast percentage, in order to detect genes which differed between the two disease forms with strict statistical significance. We performed a paired, two-tailed Student's t-test analysis on absolute data from five pairs of TMD versus AMKL samples that passed these criteria (details in Fig 2 legend). This analysis yielded a list of 193 probe sets (shown in supplementary Table III online, and called ‘paired t’ list in Fig 2).
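The paired test described above can be sketched for a single probe set as follows. This is a minimal illustration, not the Excel analysis used in the study: the function name and signal values are hypothetical, and instead of computing an exact P-value the sketch compares the statistic with the standard two-tailed critical value for 4 degrees of freedom (five pairs) at the 0·05 level.

```python
import math

# Two-tailed critical value of Student's t for alpha = 0.05 and df = 4.
T_CRITICAL_DF4 = 2.776


def paired_t_statistic(x, y):
    """Return (t, degrees_of_freedom) for a paired t-test on x versus y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if variance == 0.0:
        raise ValueError("all paired differences are identical")
    standard_error = math.sqrt(variance / n)
    return mean / standard_error, n - 1


# Made-up signals for one probe set in five matched TMD/AMKL pairs.
tmd = [5.0, 7.0, 9.0, 11.0, 13.0]
amkl = [4.0, 5.0, 6.0, 9.0, 8.0]
t, df = paired_t_statistic(tmd, amkl)
print(round(t, 3), df, abs(t) > T_CRITICAL_DF4)
```

Pairing removes between-pair variation (labelling method, tissue and blast percentage), so the test depends only on the within-pair differences.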
Significantly differing genes and supervised clustering
When the lists of differing probe sets found by the three separate methods –‘PCGEM’ (133 probe sets), ‘MAS’ comparison (161 probe sets), and ‘paired t-test’ (193 probe sets) – were cross-compared with each other in a Venn diagram (Fig 2), a total of 26 probe sets were found by at least two of the three search strategies (a total of 2·8 could be predicted by chance alone) and one probe set was found by all three. Moreover, two genes in the overlapping list were represented by a pair of probe sets each, further increasing the credibility of the final list. The 24 genes represented by the 26 probe sets are listed in Table II. As these genes were selected to be highly discriminative between the two conditions, we attempted supervised hierarchical clustering analysis in Genespring with the 26 probe sets. The non-leukaemic myelocyte controls (NBM33+, NCB33+ and DSCB33+), analysed using the standard, non-amplified labelling protocol were also included in this analysis, whereas ‘amplified protocol’ samples were excluded. As the top dendrogram in Fig 3 shows, this analysis was able to separate all DS-TMD cases as one cluster, and all DS-AMKL cases as a separate cluster, with all three non-leukaemic controls (marked in blue letters) grouped together within it. In further support of the accuracy of the clustering, we examined the cell line CMK (Sato et al, 1989), which was derived from a male DS-AMKL patient and grown in culture without being in vitro transformed. This cell line is known (Wechsler et al, 2002), and was re-verified experimentally (Table I), to carry the exon 2-eliminating GATA1 mutation. The supervised clustering has clearly placed its transcript profile within the DS-AMKL group (Fig 3).
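The cross-comparison of the three lists amounts to counting, for each probe set, how many search strategies reported it. A minimal sketch, with hypothetical placeholder identifiers rather than the actual list contents (which are in the supplementary tables), might look like this:

```python
from collections import Counter


def found_by_at_least(lists, k=2):
    """Return the probe sets that appear in at least k of the given lists."""
    counts = Counter()
    for probe_sets in lists:
        # set() ensures a probe set is counted once per list.
        counts.update(set(probe_sets))
    return {probe for probe, n in counts.items() if n >= k}


# Placeholder identifiers; not real list memberships from the study.
pcgem_list = ["probe_A", "probe_B", "probe_C"]
mas_list = ["probe_B", "probe_C", "probe_D"]
paired_t_list = ["probe_C", "probe_E"]

all_lists = [pcgem_list, mas_list, paired_t_list]
print(sorted(found_by_at_least(all_lists, k=2)))  # ['probe_B', 'probe_C']
print(sorted(found_by_at_least(all_lists, k=3)))  # ['probe_C']
```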
|Affymetrix ID||Common name||GenBank||Description|
|209651_at||TGFB1I1||BC001830||Transforming growth factor β1-induced transcript 1|
|211799_x_at||HLA-C||U62824||Major histocompatibility complex, class I, C|
|202748_at||GBP2||NM_004120||Guanylate-binding protein 2, interferon-inducible|
|221875_x_at||HLA-F||AW514210||Major histocompatibility complex, class I, F|
|203255_at||VIT1||NM_018693||Vitiligo-associated protein VIT-1 (VIT1)|
|204198_s_at||RUNX3||AA541630||Runt-related transcription factor 3|
|209138_x_at||||M87790||Antihepatitis A immunoglobulin λ-chain variable, constant and complementarity-determining regions|
|211529_x_at||HLA-G||M90684||HLA-G histocompatibility antigen, class I, G|
|200675_at||CD81||NM_004356||CD81 antigen (target of antiproliferative antibody 1)|
|202688_at||TNFSF10||NM_003810||Tumour necrosis factor (ligand) superfamily, member 10|
|204086_at||PRAME||NM_006115||Preferentially expressed antigen in melanoma|
|220307_at||CD244||NM_016382||Natural killer cell receptor 2B4|
|211792_s_at||CDKN2C||U17074||Cyclin-dependent kinase inhibitor 2C (p18, inhibits CDK4)|
|204321_at||NEO1||NM_002499||Neogenin homologue 1 (chicken)|
|209993_at||ABCB1||AF016535||ATP-binding cassette, subfamily B (MDR/TAP), member 1|
|211862_x_at||CFLAR||AF015451||CASP8- and FADD-like apoptosis regulator|
|208485_x_at||CFLAR||NM_003879||CASP8- and FADD-like apoptosis regulator|
|212224_at||ALDH1A1||NM_000689||Aldehyde dehydrogenase 1 family, member A1|
|216248_s_at||NR4A2||S77154||Nuclear receptor subfamily 4, group A, member 2|
|204621_s_at||NR4A2||AI935096||Nuclear receptor subfamily 4, group A, member 2|
|203574_at||NFIL3||NM_005384||Nuclear factor, interleukin 3 regulated|
|202088_at||LIV-1||AI635449||LIV-1 protein, oestrogen regulated|
Two discrete blocks can be observed in the left dendrogram in Fig 3: the nine genes at the top, which have either a higher expression in TMD or lower in AMKL and a varying one in the normal myelocytes, and a second block of 15 genes, which have a generally lower expression in TMD or higher in AMKL and vary in normal myelocytes. In this second group, PRAME, CD81 and CD244 are all antigens known to be markers of cells other than myeloid. However, the fact that all three were highly expressed in the CMK cell line, which is devoid of any contributions from non-blast cells, indicated that they could represent the true phenotype of the progressive DS-AMKL, distinguishing it from its self-regressive counterpart, DS-TMD.
One of the genes observed in this block was CDKN2C. This was particularly interesting as this gene was found to be the main effector of GATA1-mediated cell cycle arrest (Rylski et al, 2003). It has been shown that concomitant induction of c-MYC expression completely abolishes the GATA1-mediated increase in CDKN2C transcription (Rylski et al, 2003). Since our two groups of samples did not differ in the transcriptional levels of GATA1 (see below), we investigated whether they differed in the level of c-MYC, or other members of the MYC pathway. We noted that two members of the MYC pathway, MYCN and the MYC-interacting protein MAX (two independent probe sets for the latter), were found to be significantly increased in TMD compared with AMKL in our ‘MAS’ comparison method (see supplementary Table II online). Visual inspection (Fig 4A) of the microarray data also clearly indicates that, surprisingly, MYCN rather than c-MYC, seemed to be the only member of the MYC protein family that showed consistently higher expression in TMD compared with AMKL, along with both probe sets for the MYC-interacting protein MAX.
In addition, as multiple regions of human chromosome 21 have been associated with a leukaemia risk beyond that conferred by trisomy 21 alone (Abe et al, 1989; Niikawa et al, 1991; Shen et al, 1995; Kempski et al, 1997; Cavani et al, 1998), we screened all three lists of significantly changed probe sets (MAS, PCGEM and paired t-test) for genes from chromosome 21 (Hattori et al, 2000). Six such genes were identified using Genespring (Fig 4B).
Verification of differences by the quantitative real-time RT–PCR from selected genes
A subset of genes was selected for verification by quantitative real-time RT–PCR (Taqman). We initially chose PRAME (Ikeda et al, 1997), as it was the only gene found by all three of our comparison methods (Fig 2), and as it is an antigen of potential therapeutic importance (Matsushita et al, 2003). In addition, we chose CDKN2C, because of its role as the key mediator of GATA1-regulated cell cycle arrest (Rylski et al, 2003). We also chose MYCN, because of its well-documented ability to take over all functions of c-MYC in its absence (Malynn et al, 2000), which potentially might include the abolition of GATA1-mediated cell cycle arrest via CDKN2C. From the chromosome 21 genes shown in Fig 4B we decided to verify RUNX1 further because of its widely documented association with a variety of haematological malignancies (Speck & Gilliland, 2002), and its interaction with GATA1 in directing differentiation (Elagib et al, 2003). We also chose SLC37A1 as its expression pattern showed the most consistent differences between the two conditions (Fig 4B).
As seen in Fig 5, the level of mRNA for the tumour-specific antigen PRAME found by microarray hybridizations (Fig 5A), and measured using quantitative real-time PCR (Fig 5B), showed a statistically highly significant difference of >20-fold higher expression in DS-AMKL patients, compared with DS-TMD or normal controls. A similarly high difference was seen when DS-AMKL were compared with a limited number of DS-ALL and non-DS-AMKL samples (not shown). This result was reproduced (Fig 5B) when blast cells from patient samples, purified using either CD34 or a combination of CD33 and other surface antigens (details in Table I), were compared between DS-AMKL, DS-TMD and non-leukaemic (normal) controls purified in the same way (for details of blast purification and purity assessment, see ‘Materials and Methods’ and supplementary Fig 2 online). For the patient DS-AMKL2 (Fig 5C), two samples obtained during the acute phase of the disease, both containing 70% blasts (day 1 = a pretreatment sample and day 23 = a sample obtained after a non-responding therapy attempt) showed c. 600- and 3000-fold higher levels of PRAME mRNA respectively, than the two subsequent remission samples obtained after the second (successful) attempt at therapy. This showed that the high transcriptional activity of the PRAME gene is clearly specific to the DS-AMKL blast cells.
In the case of GATA1 (Fig 6A), the normalized microarray signal intensities clearly showed a statistically significant increase in GATA1 (c. sixfold) when either TMD or AMKL leukaemic samples were compared with non-leukaemic (normal) myelocytes. This was an expected result, since the CD33+ cell fraction will have a relatively small proportion of erythroid/megakaryocytic precursor cells, which are the only ones expressing GATA1 (Cantor & Orkin, 2002). This increase in GATA1 paralleled a twofold increase in the level of CDKN2C in AMKL, but not TMD samples, compared with normal (Fig 6A), despite the same increase in GATA1. A similar result was reproduced using Taqman RT–PCR (Fig 6B); DS-AMKL cases showed a statistically significantly higher level of CDKN2C than DS-TMD (2·4-fold), and the same trend was confirmed when sorted blasts from DS-AMKL were compared with DS-TMD blasts (Fig 6B). The opposite result was obtained for the level of MYCN mRNA (Fig 6C–E); it was statistically significantly higher in DS-TMD than in DS-AMKL, both when comparing microarray signal intensities (Fig 6C), and by Taqman RT–PCR (Fig 6D). This result was more difficult to check on blasts separated using CD34, as CD34+ cells from normal controls also showed a comparably high level of MYCN expression (Fig 6E). Immature T-cell precursors (Douglas et al, 2001), and possibly other immature CD34+ non-myeloid haematopoietic cells, could be co-sorted together with those patient blast samples that were selected only for CD34. We therefore restricted the comparison to CD33+ blasts, which were unlikely to contain any immature T-lymphocytic lineage cells (Van Dongen & Adriaansen, 1996).
Too few samples were left for t-test significance (Fig 6E; DS-TMD, n = 4 and DS-AMKL, n = 4), but we did observe an approximately threefold higher level of MYCN in DS-TMD blasts compared with DS-AMKL blasts, with non-overlapping standard error bars, and an even greater (and statistically significant) difference compared with non-leukaemic CD33+ controls.
For the chromosome 21 genes, no significant difference was observed for the RUNX1a or RUNX1c isoforms (Levanon et al, 1996) (analysed separately using Taqman; not shown), but a statistically significant difference was observed for SLC37A1, which showed a c. three- to fivefold higher level in DS-TMD compared with DS-AMKL (Figs 6F,G).
This study shows that the overall transcript profiles of DS-TMD and DS-AMKL are very similar, although statistically significant and consistent differences were noted.
While unsupervised clustering could separate the different French-American-British (FAB) subtypes of AML from each other in published studies (Yagi et al, 2003), the same method was not successful in separating the transient from the acute form of the same FAB subtype (AML-M7) in DS, in our study (Fig 1). This is the first global molecular confirmation of the notion that these two conditions are related. This agrees with the fact that the blast cells in both are morphologically (Zipursky et al, 1995) and immuno-phenotypically (Yumura-Yagi et al, 1992; Karandikar et al, 2001) indistinguishable, and also share identical acquired GATA1 mutations in individual patients (Wechsler et al, 2002; Groet et al, 2003; Hitzler et al, 2003; Mundschau et al, 2003; Rainis et al, 2003; Xu et al, 2003). These results further support the scenario of a pre-existing regressed and dormant TMD clone acquiring unknown ‘second hit(s)’ resulting in leukaemogenesis a few years after the initial TMD episode (Hitzler et al, 2003; Rainis et al, 2003).
The GATA1 protein (Tsai et al, 1989), a transcription factor encoded by an X-linked gene, is required for normal growth and maturation of both erythroid cells and megakaryocytes (Orkin, 2000). Inherited mutations in GATA1 lead to congenital anaemia and thrombocytopenia (Nichols et al, 2000; Freson et al, 2001; Mehaffey et al, 2001), whereas megakaryocytes that lack GATA1 proliferate excessively (Shivdasani et al, 1997). Studies of controlled expression of full-length GATA1 transfected into GATA1-null erythroblasts have shown that it has two major effects (Rylski et al, 2003): differentiation towards the erythroid lineage, and an increase in CDKN2C mRNA production, resulting in cell cycle arrest. Forced overexpression of full-length GATA1 directly represses the transcription of endogenously expressed c-MYC (Chen et al, 2003), a known inhibitor of CDKN2C, and this repression was proposed as a potential mechanism for the GATA1-mediated cell cycle arrest (Chen et al, 2003; Rylski et al, 2003). We report a statistically significant difference in the level of CDKN2C between DS-TMD and DS-AMKL cases, despite both being partially differentiated down the erythroid/megakaryocytic pathway (Yumura-Yagi et al, 1992; Karandikar et al, 2001), with similar and strong expression of GATA1 (Fig 6A,B). The GATA1 that they express is GATA1s, because acquired mutations prevent the expression of full-length GATA1 (Wechsler et al, 2002; Groet et al, 2003; Table I). GATA1s retains an unimpaired ability to interact with FOG-1 (Wechsler et al, 2002), which was recently shown to be essential for the full repression of c-MYC promoter activity (Chen et al, 2003). However, it has not yet been determined whether GATA1s retains the same ability as full-length GATA1 to potentiate CDKN2C transcription and arrest the cell cycle through the repression of c-MYC.
This effect of GATA1, but not its differentiation-driving function, is fully abolished by concomitant transfection and overexpression of c-MYC in the same cells (Rylski et al, 2003). Other members of the MYC family have not been tested in this scenario. The level of MYCN mRNA (supplementary Table II online, Figs 4A and 6C–E) was significantly different between the two conditions, in the direction opposite to the difference in CDKN2C expression. Patients with the transient form of the disease had much higher levels of transcripts for MYCN and its interacting partner MAX (Blackwood & Eisenman, 1991) (supplementary Table II online, Figs 4A and 6C–E), and significantly lower levels of CDKN2C transcripts (Figs 3 and 6A,B), than those with the progressive form. Amplification and/or overexpression of the MYCN proto-oncogene occurs in 25% of neuroblastomas, and is highly correlated with poor prognosis, treatment failure and mortality (Bordow et al, 1998). MYCN has rarely been reported in lymphomas (Finnegan et al, 1995), and our result is the first observation, to our knowledge, of its differential expression in any type of leukaemia. Physiologically, within the haematopoietic system, the expression of MYCN has so far been observed only in very immature thymocytes (Douglas et al, 2001). These cells are unlikely to be the source of the difference observed, as a c. threefold difference in MYCN was seen, with non-overlapping error bars, by Taqman on CD33+ separated blasts (Fig 6E), a fraction unlikely to contain immature thymocytes (Van Dongen & Adriaansen, 1996). It has been well documented that MYCN can fully replace c-MYC functionally in its roles in development, cell growth and differentiation (Malynn et al, 2000). Our data therefore support a model in which aberrant expression of MYCN in the differentiating myeloid cells of the erythroid/megakaryoblastic lineage in DS in utero inhibits the ability of GATA1 to drive CDKN2C expression.
In conjunction with acquired GATA1 mutations, this may result in the initial hyper-proliferation of such cells, causing TMD. The cause of the TMD regression remains less clear. An intriguing parallel is that some neuroblastomas (type IVS) have also been described as a self-regressing neoplasia in neonates (Pritchard & Hickman, 1994; Oue et al, 1996). It has been shown that expression of MYCN can sensitize neuroblastoma cells to rapid apoptosis caused by γ-interferon and a wide variety of other stimuli (Lutz et al, 1998; Fulda et al, 1999). A sudden change of environment for the child (uterine to postnatal) and for its myelocytes (fetal liver to BM) could provide such an apoptotic trigger, to cells already sensitized by an ectopic expression of MYCN. In support of this hypothesis, plasma levels of γ-interferon and tumour necrosis factor-α were recently found to be significantly increased in DS-TMD (Shimada et al, 2004). This may explain an obligatory spontaneous regression of TMD during the first 4 weeks of postnatal life. Alternatively, a cytotoxic mechanism may operate, in which TMD blasts are targeted by T cells or NK cells.
The observed increase in the level of CDKN2C in the DS-AMKL blasts points to an apparent state of non-responsiveness of DS-AMKL blasts to the CDKN2C-mediated cell cycle arrest. Hence, the ‘second hit’ causing progressive leukaemia could target the components of this pathway. Alternatively, the ‘second hit’ might represent a loss of an important mediator of apoptosis, as seen in neuroblastomas (Hogarty, 2003). Finally, discrete differences in levels of cell surface molecule and solute carrier transcripts were found (supplementary tables online, Figs 3 and 6F,G). The level of PRAME showed the greatest difference detected in our study (Fig 5): it appeared to be virtually absent in normal myelocyte controls and TMD leukaemic samples. The expression of PRAME in AMKL blasts was also much higher in DS-AMKL than in DS-ALL or non-DS AML-M7 (not shown). PRAME has been identified as a tumour antigen recognized by cytotoxic T cells, in cultured cells that were originally established from a melanoma patient (Ikeda et al, 1997). It belongs to the cancer/testis tumour-associated antigens (Juretic et al, 2003), is normally expressed only in trophoblasts and testis, but is highly expressed as a tumour-specific antigen in various types of solid tumours (Sarcevic et al, 2003), more than 40% of ALL (Steinbach et al, 2002) and AML (Greiner et al, 2000) cases of various types, and a quarter of all chronic lymphocytic disorders (Proto-Siqueira et al, 2003). PRAME was highly expressed in DS-AMKL cases (including the cell line CMK), but not in TMD cases. This makes it a potentially ideal predictive marker for the discrimination between cases that are likely to self-regress versus those more likely to progress, in DS-AML-M7. We also showed that it was specific to blast cells, and that its level regressed to zero with the remission of the disease. This could make it an interesting monitoring marker for DS-AMKL. 
More importantly, some studies are already exploring this protein as a therapeutic target in haematological malignancies (Matsushita et al, 2003). PRAME could be an interesting molecular target for novel autologous cellular immunotherapy approaches to aid the treatment of DS-AMKL.
Supported by the Leukaemia Research Fund, UK (grants 9838 and 0356), Special Trustees of Barts and The London Hospital (grant RAC405), CNR-Miur, MIUR ex 40%, The Phyllis and Sidney Goldberg Medical Trust, and The Fondation Jerome Lejeune. The Galliera Genetic Bank (GGB) is supported by the Italian Telethon grant C51. The authors thank CEPIM (Genova, Italy) and Fondazione ‘Citta della Speranza’ (Malo, Italy) for help. We also thank Paul Allen for cell lines and advice. All supplementary Figures and Tables for online access can be found at http://www.smd.qmul.ac.uk/haematology/. All microarray data and related information are MIAME compliant and have been deposited in the EBI MIAMExpress microarray data repository (http://www.ebi.ac.uk/arrayexpress).