Gene expression profiling in melanoma is an exercise in prospecting for fundamental molecular biologies useful for formulating hypotheses to explain disease characteristics. Over the last 10 years DNA microarray technologies have been employed in scores of such melanoma studies. As soon as the technology became available, researchers grasped the new tool and began to hammer away at melanoma’s molecular strata. A small army of data miners toiled into the dross, believing that glittering seams of relevance had been struck and from great slag heaps of data they would extract biological gold. What exactly has a decade of ceaseless sappering brought us, what can be profitably refined from what has already been extracted and what remains undiscovered? This review is a critical analysis of the multiple research programs, which have attempted to define and contextualize broad transcriptional changes relevant to melanoma biology.
There was a time when coal miners, working in atmospherically hostile environments and needing some way to monitor air quality, brought down into the darkness small caged birds. The coal miner’s canary, more sensitive to toxic gases than the miner, provided an essential warning if it suddenly fell silent and died. Such a mechanism ensured that the miner’s working environment was both safer and more productive. For the high-throughput data miner, reproducibility (the final arbiter of hypothesis testing) is his miner’s canary. If he does not want to get into trouble he should not go to work without reference to it. This is illustrated by a recent review, which cataloged 69 genes reported by others as being transcriptionally affected by DNA methylation in melanoma (Rothhammer and Bosserhoff, 2007). Three of the cited sources, which together identified 49 of these genes, used DNA microarrays to assess changes effected by treatment of cell lines with 5-aza-2′deoxycytidine (Gallagher et al., 2005; Muthusamy et al., 2006; Van Der Velden et al., 2003). Two additional uncited studies also used gene expression profiling to examine DNA methylation effects in melanoma (Karpf et al., 2004; Mori et al., 2005). It is conspicuous that the results of five studies with identical aims and nearly identical strategies show no consistency. The conclusion that must be drawn is that gene expression profiling of melanoma has failed to reproducibly reveal genes whose expression is affected by DNA methylation – the data miner’s canary has died (Figure 1). That the cited review accepted such different outcomes uncritically shows the canary’s passing went unnoticed. This failure to recognize inconsistency is symptomatic of how our view of DNA microarray data analysis is frequently dimmed by our unfamiliarity with it. 
Gene expression profiling failed to expose the role of DNA methylation in melanoma because neither the researchers nor the reviewers gave sufficient attention to statistical criteria essential to high-throughput investigations. This review is an attempt to critically assess the bulk of gene expression profiling in melanoma research, to differentiate robust findings from weak, to point out why some strategies succeed and others do not, and to show why we must keep an eye on that bird.
DNA microarrays are small platforms of glass or silica hosting tens of thousands of single-stranded DNA sequences. The sequences are fully characterized and deposited onto their platforms such that the identity and location of each is standardized. For gene expression arrays, these sequences correspond to (or complement) the sequences specific to mRNA transcripts. From cell cultures or tissues mRNA is extracted, labeled (or used to derive labeled cDNA) and hybridized to DNA microarrays. Unbound or weakly bound material is washed off and the platform is scanned to detect the label’s signal. Signal intensity gives an approximation of the relative proportion for each labeled sequence and therefore an estimate of gene expression in the sample. With very large numbers of different probes on a platform concomitantly large sets of data are generated with each experiment. Critical to this review is the importance of understanding that, from a statistical viewpoint, each probe sequence on a platform represents a single experiment among perhaps tens of thousands. It follows that with standard significance testing DNA microarrays are thus guaranteed to produce false positives in the data. Mining large datasets for biologically relevant information must be considered with this in mind.
The term ‘data mining’ once had a pejorative connotation among the statistical community, where it was associated with the inappropriate search for statistical significance within large datasets. Conventional statistical approaches begin with a testable hypothesis, collect relevant data and then test the significance of that data in relation to the hypothesis. It is key that the sample used to test (or validate) the hypothesis is not the same as that used to derive it. Data mining, as it was once known, involved the mistaken idea that analysis of a dataset to derive a hypothesis also constitutes the test; the consequence being that chance patterns in the data can come to be held as significant. Although statisticians now call this practice data dredging, much that is today termed data mining can still be referred to in its originally negative sense (Ioannidis, 2005a). When considering what we have gained as a research community from high-throughput analyses of melanoma, and this review will attempt to cover scores of such works, it is important that we can differentiate dredging from mining.
In discriminating good work from bad, it is informative to consider the nature of the experiments conducted, the data obtained and the pitfalls peculiar to high-throughput analysis. Initially, the results of different experiments included few or no replicates and were qualified by arbitrarily chosen fold-change limitations, which is a strategy blind to biological variation and now discredited (Allison et al., 2006). Over time significance calculations became broadly insisted upon and researchers transitioned from reporting only fold-change to increasing use of statistics. It is also now recognized that having sufficient sample numbers or replicates is vital and a minimum of five samples per class has become a general recommendation (Allison et al., 2006; Pavlidis et al., 2003; Pawitan et al., 2005; Tsai et al., 2003). Despite this, an additional and necessary statistical consideration, the control of false discovery rates, is still frequently ignored. As already mentioned, each probe or probe-spot on an array is its own experiment. When an array is employed to interrogate gene expression in two different groups of samples, a t-statistic is calculated for each such experiment and the resulting P-value is compared against an arbitrarily chosen cutoff to assess its significance. The critical issue is that, because DNA microarrays perform tens of thousands of these experiments simultaneously, an estimable number of individual results will pass the P-value limit by chance alone. The number of false positives that occur depends on a combination of the number of probes on the array and the P-value cutoff chosen. For example, if you perform experiments with an array comprising a thousand distinct probes and use a P-value cutoff of 0.05 you can expect approximately 50 probes to erroneously reject the null hypothesis (that there is no difference in expression between sample classes). If your test reveals 55 genes as being significant then it is likely that about 90% of them are false positives.
To take such a list of genes and try to use it to explain the biology of the system you are investigating (e.g. through GO term analyses) would be an essentially pointless exercise. Controlling false discovery rates involves either reducing (by orders of magnitude) the P-value cutoff or adjusting each probe’s calculated P-value according to the number of tests conducted, a strategy commonly referred to as multiple testing correction and widely accepted among high-throughput analysts (Allison et al., 2006; Benjamini and Hochberg, 1995). Perhaps more bedeviling than failure to address the above-mentioned criteria is non-replication of research discoveries, where conclusions are drawn on the basis of single studies (Ioannidis, 2005b). This is inherently risky in high-throughput research because very large datasets virtually ensure chance connections between sample groupings. For example, most of the studies mentioned in this review employ some form of clustering. As it is known that sharply defined clusters can be derived from random data, it is not possible to discern from an individual study whether the clustering observed is characteristic of its sample set or if it is reproducible using another (Michiels et al., 2007; Miller et al., 2004). An answer to this is to perform an identical study on one or more additional test sets to validate the hypothesis generated. Melanoma researchers have produced high-throughput studies of varying rigor across a broad range of melanoma-related topics. That so many works are of dubious value today is partly linked to the fact that the beginnings of high-throughput analysis in melanoma coincided with the earliest stages of DNA microarray development.
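The false-positive arithmetic and the multiple testing correction described above can be illustrated with a minimal Python sketch. It assumes, for the worked example, that no probe truly differs between classes, and it uses the Benjamini and Hochberg (1995) step-up adjustment as one common form of correction. The function names and the list of P-values are hypothetical and are not taken from any study discussed here.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of true-null tests that pass the cutoff by chance."""
    return n_tests * alpha


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values, in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Worked example from the text: 1000 probes, cutoff 0.05, 55 probes called.
false_hits = expected_false_positives(1000, 0.05)
print("expected false positives:", false_hits)
print("approximate false fraction among 55 calls:", false_hits / 55)

# Hypothetical P-values, for illustration only.
raw = [0.0001, 0.004, 0.019, 0.03, 0.04, 0.2, 0.5, 0.9]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"raw P = {p:<7} adjusted = {q:.4f}")
```

Probes would then be called significant on the adjusted values rather than on the raw P-values, so the cutoff tracks the number of tests conducted.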
In the mid-1990s, soon after the advent of DNA microarrays (Schena et al., 1995), melanoma was one of the first cancers recognized as an obvious and amenable target for their practical application. Between 1996 and 2006, at least 129 separate reports were published detailing experiments using DNA microarrays to investigate various aspects of gene expression in melanoma biology (Figure 2). The earliest of these works, which interrogated relatively few genes on custom-made platforms, were as much validations of DNA microarray technology as they were investigations of melanoma (Derisi et al., 1996; Maniotis et al., 1999). When technological development stabilized to the point that arrays became commercially available, array-based research programs were able to focus more on the disease itself. Studies seeking to identify improved clinical markers or understand malignant transformation compared the gene expression patterns of diseased and healthy tissues. Melanoma progression was analyzed by considering transcriptional signatures derived from different clinical stages, looking to explain their physiological dynamism and find improved progression markers. Gene expression patterns related to patient survival were sought in an effort to provide clinicians with more powerful prognostic tools. Healthy and diseased cells were subjected to a battery of different treatments, exploring everything from the effects of UV radiation on gene expression in melanocytes to the response of melanoma cells to various small-molecule challenges. An intensely investigated aspect of melanoma gene expression has been the examination of metastatic potential. The comparison of transcription patterns of cells which are characterized as being tumorigenic, invasive or otherwise aggressive against those of cells which are less so has been a vigorously pursued stratum.
The overall situation can be summarized as follows: an industrious decade of gene expression profiling in melanoma research has generated an enormous quantity of data – what have we gained?
Breaking new ground
In 1996, a paper was published detailing the results of a collaboration between Pat Brown’s DNA microarray group at Stanford and Jeffrey Trent’s well-established cancer genetics team at the NIH. This collaboration employed Brown’s DNA microarrays to hunt for tumorigenicity genes in human melanoma samples supplied by Trent. The experiment focused on the tumorigenic UACC-903 line, in which tumorigenicity is suppressed when transformed with a normal chromosome 6. With this strategy of comparison, it was hoped that a tumor suppressor would be identified. Indeed, on chromosome 6 the gene for Waf1 (CDKN1A), a p53-regulated factor involved in regulating cell cycle progression at G1, was shown to be upregulated with tumor suppression. The approach was able to confirm other genes identified previously through traditional approaches and added further candidates for factors involved in the regulation of tumorigenicity (Derisi et al., 1996). As a technical document outlining the practical application of a new technique the work remains noteworthy. However, from the perspective of more than a decade of additional experimentation and methodological improvement the results themselves are of questionable value. The platform used could interrogate fewer than one thousand different transcripts, representing <3% of all mRNA species. While their specific approach used an element of technical replication, crucial to establish the technique’s utility, no biological replication was employed and calculations of significance could not be made. However, since that report there have been a large number of similarly targeted programs and these document a steady improvement in the way gene expression analysis in melanoma research has been conducted. Platforms have been enlarged to include almost all known transcript species. The numbers of biological replicates and use of statistics have improved. Unfortunately, full observance of minimum criteria has been rare.
Despite this, there are cases where we may yet draw meaningful conclusions.
Transformation from melanocyte to melanoma
Understanding the molecular differences between normal pigmented cells and melanoma is a useful starting point for both identifying practical clinical markers and comprehending the role of transcriptional change in neoplastic transformation. Accordingly, a number of different groups have employed gene expression arrays to document the up- and downregulation of genes in melanoma when compared with healthy cells and tissue. The initial studies used small arrays to identify by fold-change filtering a small number of genes with expression differences between samples (De Wit et al., 2005; Dooley et al., 2003; Mirmohammadsadegh et al., 2004; Seykora et al., 2003; Zuidervaart et al., 2003). Taken together, these small studies show almost no agreement with one another. This is very likely due to their use of small platforms, few replicates and insufficient statistical stringency. There have been three studies using larger array formats to compare melanoma against normal cells or tissues (Haqq et al., 2005; Hoek et al., 2004; Talantov et al., 2005). Each of these identified putative molecular markers but only one also described signaling changes underlying transformation (Hoek et al., 2004). Systematic analysis of their respective gene lists shows that only two genes (CITED1 and CDH3) were identified by all groups as being differentially expressed. For programs interrogating the greater fraction of expressed genes this is a vanishingly small yield. The study which included an assessment of co-regulation patterns within changed gene sets identified the Notch signal pathway as a candidate driver of transformation (Hoek et al., 2004). The problem is that, despite its sophisticated correlative analyses, this work used only two control samples. Therefore, in the absence of comparison to a larger and better controlled analysis, no firm conclusions can be reached.
With the passage of time and a growing enthusiasm for public databases it has only recently become possible to combine different datasets and carry out larger multistudy analyses. Such an analysis performed for this review shows that DNA microarray programs of sufficient scope may return with high reproducibility a large number of genes whose expression in melanoma is significantly changed. The strategy involved combining data from experiments which employed the same platform (in this case the Affymetrix HG-U133A probe set library) and source material (cell lines). Four sources were drawn from to assemble a 28-sample melanocyte control set (Hoek et al., 2004, 2006; Magnoni et al., 2007; Ryu et al., 2007). Five separate melanoma sets, with a minimum of 12 samples per set, were obtained from three different studies (Hoek et al., 2006; Johansson et al., 2007; Packer et al., 2007). Each melanoma set was separately compared with the melanocyte control set to identify significant differences in gene expression (full details in Appendix S1-A). Finally, genes which were not identified as being down- or upregulated in all cases were discarded, and the remaining genes were ranked based on their average performance – this approach, which analyses each institutional dataset in isolation before the final qualitative filter, avoids any institutional bias. Table 1 shows 86 genes which undergo significant and reproducible downregulation in melanoma cells. Table 2 shows the top 150 genes (from a total of 1610 – see Appendix S1) which undergo significant and reproducible upregulation in melanoma cells. The results recapitulate a large number of factors well known to melanoma research. These include the downregulation of epithelial cadherin, dipeptidyl peptidase IV and c-Kit; and the upregulation of known melanoma antigens (MAGEA6, MAGEA3, MAGEA12, and PRAME), neuronal cadherin and osteopontin. Additional genes confirm earlier reports such as the upregulated Notch-2.
Less familiar factors include a downregulated putative tumor suppressor (WFDC1) and upregulated tumor protein D52 (TPD52), neither of which has been specifically associated with melanoma before. These gene lists, selected by employing separate multiple-testing controlled analyses of five large sample sets and rigorous selection of uniform pattern reproducibility, currently represent the strongest candidates for gene expression change in melanoma neogenesis as measured by transcription profiling. A detailed examination of the mechanisms for their expression would likely provide useful additional clues about the process of transformation.
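The final reproducibility filter described above (discard any gene not changed in the same direction in every comparison, then rank the remainder by average performance) can be sketched as follows. This is an illustration only and not the procedure of Appendix S1-A: the signed score stands in for whatever performance measure was actually used, and the gene names and values are hypothetical.

```python
from statistics import mean


def reproducible_genes(comparisons):
    """Keep genes called in the same direction in every comparison.

    comparisons: list of dicts, one per melanoma set, mapping
    gene -> signed score (negative = down, positive = up) for genes that
    passed that comparison's own multiple-testing-controlled filter.
    Returns (gene, average score) pairs, largest absolute average first.
    """
    if not comparisons:
        return []
    # Genes that were significant in every single comparison.
    shared = set(comparisons[0])
    for result in comparisons[1:]:
        shared &= set(result)
    kept = []
    for gene in shared:
        scores = [result[gene] for result in comparisons]
        # Require a uniform direction of change across all comparisons.
        if all(s > 0 for s in scores) or all(s < 0 for s in scores):
            kept.append((gene, mean(scores)))
    return sorted(kept, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical scores from three independent comparisons.
sets = [
    {"GENE_A": 2.0, "GENE_B": -1.0, "GENE_C": 1.0},
    {"GENE_A": 4.0, "GENE_B": -3.0, "GENE_C": -1.0},  # GENE_C flips direction
    {"GENE_A": 1.0, "GENE_B": -2.0},                  # GENE_C not significant
]
for gene, score in reproducible_genes(sets):
    print(gene, round(score, 2))
```

Because each comparison is filtered on its own before the intersection is taken, no single dataset can carry a gene into the final list.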
Table 1. Genes downregulated in melanoma
Probe sets (fragment): 203716_s_at, 211478_s_at, 203717_at; 201787_at, 202994_s_at, 207835_at
Gene titles (fragment): Cadherin 1, type 1, E-cadherin (epithelial); Cadherin 3, type 1, P-cadherin (placental); Oculocutaneous albinism II (pink-eye dilution homolog, mouse); Integrin, alpha V (vitronectin receptor, alpha polypeptide, antigen CD51)
Comparison of these results against the findings of the only previous large study to use cell lines to examine transformation (Hoek et al., 2004) shows down- and upregulated gene lists with significant (P < 10⁻¹⁵ for both) overlap. The high significance level of overlap indicates that this earlier study may be viewed with confidence despite having few control samples. The two other studies addressing transformation (Haqq et al., 2005; Talantov et al., 2005) extracted RNA directly from tissues rather than from cell lines. The overlap between their gene lists and the present study is of low or no significance (Table 3). It is likely that the strategy of sourcing tissues rather than cell lines underlies the observed disagreement. To assess this possibility a qualitative comparison with the results of Tables 1 and 2 was performed using reanalyzed data originally obtained with a different microarray platform. Kaufmann et al. (2007) used Agilent arrays to generate a large dataset including both melanocytes and melanomas to explore the effect of cell cycle checkpoint function in melanoma. While they did not address the question of differences between melanocytes and melanoma per se, they did make their data publicly available. A reanalysis of this dataset comparing melanocytes against melanoma (Appendix S1-B) yields down- and upregulated gene sets with significant (P < 10⁻¹⁸ and P < 10⁻¹⁰ respectively) overlap with the present multiple dataset study. This shows that the gene sets outlined in Tables 1 and 2 are results primarily relevant to cell line-specific analyses.
Table 3. Extent of agreement between the present transformation analyses and others
(a) Genes or probe sets identified as being significantly changed between diseased and healthy cells or tissues.
(b) Genes or probe sets shared with the meta-analysis performed for this review (Appendix S1-A).
(c) Cumulative P-value for the intersection as calculated by hypergeometric distribution.
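A cumulative hypergeometric P-value for the overlap between two gene lists, of the kind reported in Table 3, can be computed along the following lines. This is a minimal sketch using only the Python standard library. The function name and the numbers in the usage example are hypothetical and do not reproduce any value in Table 3.

```python
from math import comb


def overlap_p_value(population, list_a, list_b, shared):
    """Probability of at least `shared` genes in common by chance.

    population: number of genes (or probe sets) both studies could detect
    list_a, list_b: sizes of the two significant gene lists
    shared: size of the observed intersection
    """
    upper = min(list_a, list_b)
    # All possible ways of drawing list_b genes from the population.
    total = comb(population, list_b)
    # Upper tail: draws that share `shared` or more genes with list_a.
    tail = sum(
        comb(list_a, k) * comb(population - list_a, list_b - k)
        for k in range(shared, upper + 1)
    )
    return tail / total


# Hypothetical example: 20000 shared probe sets, lists of 300 and 250 genes,
# 40 of which appear in both lists.
print(overlap_p_value(20000, 300, 250, 40))
```

The choice of population matters: it should contain only genes that both platforms could have measured, otherwise the overlap appears more significant than it is.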
As unequivocal as the results appear, a closer analysis of expression patterns across samples shows that many genes have significantly higher variance than others. For example, two top-ranked genes are the well-known melanoma antigens MAGEA6 and PRAME. These both have large variance across melanoma samples. By comparison MDM1 and SPAST, also both among the top upregulated genes in melanoma, show much smaller variance across the samples (Figure 3). We frequently grant pre-eminence to genes by ranking them according to how much their expression changes between sample groups. This performance metric is often an averaged assessment of a gene’s expression among one set of samples as compared to the average of its expression among the samples of another. However, as shown in Figure 3, we find many genes given prominence in this way show significant variance. If a gene is given preference based only on average performance, how are we to account for the instances where its performance is nil (or nearly so) and yet the condition this performance is linked to – and perhaps inferred to explain – remains? Genes for which variance is small, whose performance is reproducibly maintained among samples, are more tightly linked to the condition in terms of either cause or consequence. On the other hand, genes for which variance is sufficiently large cannot be so tightly linked, but may instead be linked to a variable character of the condition (e.g. stage progression or metastatic potential). Therefore, care must be taken when interpreting the results of gene expression comparisons between groups, particularly when one or more of the groups is heterogeneous.
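The point about ranking by average change alone can be made concrete with a small sketch. The expression values below are invented for illustration and are not the MAGEA6, PRAME, MDM1 or SPAST measurements. Two hypothetical genes show the same average upregulation, yet a variance-aware statistic (here Welch's t, chosen as an assumption) separates them clearly.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic: mean difference scaled by its standard error."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_b) - mean(group_a)) / se


# Invented log2 expression values; the same controls are used for both genes.
melanocytes = [5.0, 5.1, 4.9, 5.0, 5.0]
steady_gene = [7.9, 8.0, 8.1, 8.0, 8.0]       # consistently high in melanoma
variable_gene = [5.0, 11.0, 5.2, 10.8, 8.0]   # high in some samples, flat in others

for name, values in [("steady", steady_gene), ("variable", variable_gene)]:
    diff = mean(values) - mean(melanocytes)
    print(name, "mean change:", round(diff, 2),
          "Welch t:", round(welch_t(melanocytes, values), 1))
```

Both genes would share a rank under a fold-change ordering, although in two of the five melanoma samples the variable gene is essentially unchanged from the controls.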
Progression from primary lesion to distal metastasis
Early in a primary lesion’s development there is frequently a phase in which proliferation is restricted to the epidermis (melanoma in situ). Following this, there is a distinct change to the growth pattern such that cells invade to the dermis and there proliferate. Early intra-epidermal invasion (radial growth phase) is distinguished from proliferation deeper into the dermis (vertical growth phase). The extent of this penetration is an accurate prognostic marker for patient survival (Breslow, 1978). These and other well-defined clinical stages of melanoma progression have long informed molecular studies of the disease. Lesions and cells isolated from different stages have been examined in various ways in order to understand progression. For example, extensive immunohistochemical analyses of lesions have identified more than a dozen ‘progression antigens’ (De Wit et al., 1992; Ferrier et al., 1998; Manten-Horst et al., 1995; Moretti et al., 1997; Niezabitowski et al., 1999; Silye et al., 1998; Vaisanen et al., 1998; Van Belle et al., 1999; Van Kempen et al., 2000).
Subsequently, DNA microarrays were used to look for better progression markers as well as for exploring the underlying biology of stage progression. Leaving aside programs which used few or no replicates (Baldi et al., 2003; Gallagher et al., 2005; Xi et al., 2006), there were several studies which interrogated stages of progression with sufficient samples for some form of statistical assessment. From these, it is clear that hierarchical clustering of data obtained from tissues was an accurate method for distinguishing melanoma stage groups (Haqq et al., 2005; Smith et al., 2005). While the capacity to accurately distinguish between RGP and VGP primaries in uncertain cases may prove useful in the clinic, it also shows that there are specific molecular differences between these stages. Whether or not these differences are due to melanoma cells per se remains debatable because cell lines derived from different stages are not so clearly distinguishable by transcription profiling (Jaeger et al., 2007).
The first study considered skin, nevi, primaries, and metastases (Haqq et al., 2005). Controlling for false positives, the authors identified 2602 genes which could be used to distinguish between stages (Haqq et al., 2005). Another study used class discovery analysis to first cluster samples without reference to their clinical stage. The purpose of this was to see if unbiased data analysis could correctly identify stage groups. Two distinct groups of samples emerged, respectively comprising early stage (skin, nevi, and in situ melanoma) and advanced stage samples (vertical growth phase, lymph node metastases, and distal metastases). This suggested that the most significant change in gene expression during progression takes place between in situ and vertical growth phase melanomas. Multiple-testing-corrected ANOVA analyses were used to pick out genes with significant differences between early and late stage groups. Prominent among these were CITED1, which was highlighted by the previously mentioned transformation studies, and SPP1 (osteopontin), identified by Haqq et al. (2005). Self-organized mapping (SOM) of this data performed the dual role of a fold-change filter and collator of co-regulated genes to identify many factors involved in mitotic cell cycle regulation and cell proliferation (Smith et al., 2005). A third study made superficial use of DNA microarrays to specifically assess, absent of multiple testing controls, only the known pro-apoptotic genes. This showed that many of these are downregulated early in progression, confirming that the transition between thin and thick primaries is the point of greatest change in gene expression patterns (Jensen et al., 2006). To support this concept we can contrast these findings with a similar analysis which compared primary lesions against metastases without distinguishing thick primaries from thin.
This study identified 308 genes discriminating between primary and metastatic forms and highlighted changes in cell cycle regulation, mitosis, cell communication and cell adhesion (Jaeger et al., 2007). A more rigorous analysis of their data using the methods employed for this review (Appendix S1-C) yields 127 genes with significant expression differences between primary and metastatic samples (Appendix S2). An identical analysis of the Smith dataset (considering only the probe sets that were interrogated in the Jaeger dataset), which compared early stage samples against late stage samples, yields 2932 genes. Intriguingly, there are significant overlaps in the down- (P < 10⁻²⁹) and upregulated (P < 10⁻⁶) gene lists. This difference in yield, and not content, is likely because both Smith and Jensen identified that VGP tumors are transcriptionally more similar to metastases than to RGP tumors, whereas the Jaeger dataset does not make the distinction.
What we can therefore say from these studies is that between the radial and vertical growth phases a significant change in gene expression occurs to encourage proliferation and suppress apoptosis. While there is not enough data to confidently assess differences between later stages it may be that these will be less dramatic than the changes which differentiate RGP and VGP lesions.
Closely related to disease progression is its recognized link with patient survival. Removal of primary melanomas before they exceed 1 mm in depth yields a high cure rate, but as depth increases patient outlook worsens in a proportional manner (Breslow, 1978). In melanoma, there are few prognostic tools as powerful as the Breslow index, although its use is restricted to primary lesions and there are no strong prognostic indicators for more advanced stages of the disease. For clinically minded researchers DNA microarray analyses offer an opportunity to extract independent prognostic information for metastatic patients (Kim et al., 2002). Several reports have since been published which attempt to assess the usefulness of such analyses to clinicians.
An early analysis sought to characterize transcriptional differences between metastases which responded to immunotherapy and those which did not. Although the authors were unsuccessful in this, they nevertheless identified a small number of genes linked with immune regulation whose expression changed significantly upon treatment (Wang et al., 2002). A later study looking at survival in uveal melanoma patients was able to identify a DNA double-strand break repair gene (NBS1) with prognostic value, validating this by immunohistochemical analysis of an enlarged sample set (Ehlers et al., 2005).
The group of J. William Harbour has over recent years demonstrated that a key to successfully delineating the clinical relevance of transcription signatures lies in the comparison of isolated datasets. Working with uveal melanoma and performing experiments on fresh tumor samples they found that primary uveal melanomas cluster into two distinct subtypes. From this a gene signature was extracted which both defined subtype membership and correlated with metastatic death (Onken et al., 2004). While this study has previously been criticized for poor cross-validation (Dupuy and Simon, 2007), these workers would later go on to perform sufficient cross-validation by demonstrating the same result using fine needle biopsy specimens. Critically, they extracted their gene signature from a total of three different datasets treated in isolation and further validated their findings by comparison with a fourth dataset generated by another group (Onken et al., 2004, 2006; Tschentscher et al., 2003). Furthermore, their gene signature was later shown to be better at sample classification than monosomy 3 and other clinicopathologic prognostic factors (Worley et al., 2007). Even without formal multiple testing controls these works together demonstrate how the combination of independent datasets can be used to pare away false positives generated by individual experiments.
The most extensive examination of survival and gene expression in primary cutaneous melanoma patients was conducted on behalf of the Melanoma Group of the European Organization for Research and Treatment of Cancer (EORTC). Rigorous class comparison analyses of data obtained from 82 primary tumors identified a panel of 254 genes capable of separating samples on the basis of metastasis-free survival. The discriminatory power of this gene set was validated using a separate population of 17 samples. Among the genes identified are factors involved in DNA replication and cell division processes (Winnepenninckx et al., 2006). A smaller program, which examined 43 stage III and IV metastases, identified 30 genes which could be used to distinguish between patients with longer or shorter survival times. No separate sample set was used for validation. The overlap between the discriminatory gene lists of Winnepenninckx and Mandruzzato amounts to only two genes (P < 0.11). There are two possible explanations for the lack of significant overlap between studies. The first analysis considered only primary melanomas and the second concentrated on later stage metastases, and it has already been shown that these may be quite different at the level of transcription (Haqq et al., 2005; Smith et al., 2005). A second possibility is that the base methodologies used by these groups for identifying survival-associated genes are subtly different. The Winnepenninckx list was identified by clustering using a correlation coefficient which relies on measuring the angular separation of data vectors. The Mandruzzato set was picked out using Euclidean distance, which considers the distances between vector termini. In simpler terms, Euclidean distance takes into account the magnitude of the differences while correlation coefficients, being insensitive to magnitude, take into account trends of change. Because shared trends are considered biologically relevant, magnitude is thought to be less important.
Nevertheless, as long as downstream users of these lists employ appropriate clustering metrics it is probable that they will perform well (Gibbons and Roth, 2002). By contrast, using cell lines (instead of lesion biopsies) has not generated gene lists that correlate with patient prognosis (Bittner et al., 2000; Hoek et al., 2006). This is likely due to the changes brought by cell culturing. This indicates that once cells are removed from their in vivo context they may also lose the information, which grants them prognostic potential. From the biopsy studies the development of clinically useful prognostic tools look promising.
One study that did control the false discovery rate in its analysis examined the effects of telomerase suppression on melanoma cells (Bagheri et al., 2006). Telomerase is a ribonucleoprotein complex that functions to maintain the protective telomeric repeat sequences capping eukaryotic chromosomes (Greider and Blackburn, 1987). In normal human somatic cells this activity is suppressed, but in cancers it is re-activated (Kim et al., 1994). As with most other cancers, melanoma has also been shown to overexpress this complex (Cheng et al., 1997). Bagheri et al. (2006) used ectopic expression of a ribozyme that targets the core RNA moiety of the telomerase complex to suppress its activity. They showed that this treatment significantly reduced metastatic progression of B16 melanoma cells in vivo. Performing transcription profiling on multiple replicates and controlling false positives allowed them to determine that telomerase suppression resulted in the downregulation of 134 genes. Many of these are involved in transcriptional regulation, cell proliferation and glycolysis. Commenting on the known importance of glycolysis to tumor cell growth and invasion the authors speculated that telomerase may stimulate glycolysis via Akt kinase activation (Bagheri et al., 2006). This was recently given support by a study which demonstrated increased glycolytic activity and in vivo proliferation in Akt kinase-transformed melanoma cells (Govindarajan et al., 2007).
Heterogeneity of form, behavior and response
Melanoma displays considerable variation at all levels of its biology. Clinically, there are four common forms of primary melanoma. Lesion structuring and pigmentation are not uniform. Metastases may develop in different organs. Multiple metastases may respond asymmetrically to treatment. Heterogeneity is also evident in vitro, where melanoma cells show variation in motility, proliferation and response to cytokines. There are genetic aberrations that, although not universal to melanomas, are common to subsets of the disease (Curtin et al., 2005). Some of these differences are likely due to transcriptional changes and many researchers have used DNA microarrays to try to find genes with correlating expression patterns.
Mc1r loss of function
The melanocortin 1 receptor (Mc1r), when bound by α-melanocyte stimulating hormone, drives a signal cascade which culminates in upregulated microphthalmia-associated transcription factor (Mitf). As Mitf regulates tyrosinase expression, involved in eumelanin synthesis, Mc1r activity is critical to this aspect of melanogenesis. Loss of function mutations in the MC1R gene, which is expressed in the epidermis only in melanocytes (Roberts et al., 2006), contribute to a heightened risk for melanoma (Flanagan et al., 2000; Matichard et al., 2004). A recent expression profiling study looked at the effects of Mc1r loss of function in mouse skin to further characterize its role in epidermal biology. By employing multiple samples and controlling for false positives, April and Barsh (2006) identified several hundred new Mc1r signaling target genes, including those expressed by non-melanocytic cells indicating that Mc1r may influence paracrine signaling. These authors noted the lack of overlap between their study and an earlier expression profiling experiment (which also looked at melanogenesis) and correctly surmised that at least part of this disagreement was likely due to experimental design. April and Barsh later validated their work and extended it by studying the effects of UV irradiation on Mc1r loss of function. This showed that Kit and Mc1r are independent contributors to the epidermal gene expression response, where Kit-dependent responses to UV irradiation involved genes with roles in antioxidant defenses, and Mc1r-dependent responses involved genes important for regulating cell cycle and oncogenesis (April and Barsh, 2007).
Bloethner et al. (2006) compared melanoma lines for which the CDKN2A locus was deleted against wild-type. Their study was complicated by a desire to account for possible interference from BRAF/NRAS mutations and thus their study sets were relatively small. The authors identified 30 genes with differential expression linked to CDKN2A deletion and independent of BRAF/NRAS mutations. Eight genes were successfully validated using RT-PCR analyses on a second sample set (Bloethner et al., 2006).
The BRAFV600E mutation
In recent years, different groups have deliberately pursued the significance of the relationship between a well-characterized variable and gene expression profiling in melanoma. The serine/threonine kinase BRAF, a member of the MAPK signal pathway, is subject to a specific activating mutation with high frequency in melanoma (Davies et al., 2002). The incidence of this mutation is closely correlated with primary melanomas arising in areas absent of chronic sun-induced damage (Curtin et al., 2005). Because the MAPK pathway is known to influence gene expression it was felt that the activating BRAFV600E mutation would result in a specific gene expression signature. Multiple studies used DNA microarrays to approach the question of what effect BRAFV600E has on transcriptional processes in melanoma. Pavey et al. (2004) investigated a panel of 61 melanoma lines in which 42 (69%) carried the BRAFV600E mutation and another seven (11%) had NRAS mutations. From >18 000 distinct cDNAs they identified, through support vector machine-based analyses, 83 genes to be used in hierarchical clustering for separating BRAF mutants from wild-type lines. Bloethner et al. (2005) used a much smaller number of cell lines, with no more than a few examples each of BRAF mutants, NRAS mutants and wild-type samples, and looked for correlations between these groups among the expression patterns of >22 000 different transcripts. Three separate software packages were employed to do this and the results were combined to identify gene expression patterns found in each case. They found that 61 genes were differentially expressed in BRAFV600E samples compared to BRAF wild-types. Comparing these two works reveals no genes that are shared between their discriminator sets. A third study used 21 melanoma cell lines, 16 with the BRAFV600E mutation and five with wild-type BRAF.
Despite using arrays that interrogated in excess of 14 000 genes, the authors restricted their analysis to only 36 genes coding for members of the RAS/RAF/MEK/mitogen-activated protein kinase (MAPK) signaling pathway (Tsavachidou et al., 2004). Reanalysis of their full dataset (Appendix S1-D) using multiple testing correction shows that 16 genes are differentially expressed between BRAF mutant and BRAF wild-type samples. None of these were detected by the Pavey or Bloethner studies. Analysis of three more datasets using multiple testing correction confirmed that no gene’s expression is consistently linked to the mutation status of BRAFV600E (Hoek et al., 2006). Furthermore, attempting to use the different groups’ gene lists to distinguish BRAFV600E from wild-type samples by clustering in different datasets fails, indicating that each gene list is sample group-specific.
A more recent paper by the authors of the first study acknowledged the absence of a statistically significant connection between the BRAF mutation and the expression of any gene and yet reaffirmed the existence of a BRAFV600E gene expression signature (Johansson et al., 2007). Their strategy was to look for individual BRAFV600E gene expression signatures in a total of four different datasets, using some of the samples from each set to derive a predictor that is tested against the remaining samples. Within each dataset a gene signature was found to correlate with BRAF mutation with 69–84% accuracy. The authors also attempted to validate a predictor generated with one sample set against the others, finding that their success rate for predicting BRAF mutation ranged between 57% and 78% accuracy. This variable result shows that with non-unique molecular signature definitions the success of this type of analysis is dependent upon sample selection (Michiels et al., 2007). This is not to say that there is no relationship between BRAFV600E and gene expression, but rather that the relationship is very probably modulated by factors which have not yet been fully accounted for in experiments attempting to establish the link. Until these are taken into account, the accuracy of predicting BRAF mutation status via expression signatures will likely remain sample group dependent. It is likely that several factors confound a useful connection between BRAF mutation and gene expression. One of these is that BRAF mutation does not guarantee uniform activity downstream (Dhomen and Marais, 2007). Activation of Erk1/2, the canonical target of BRAF signaling, is rarer in melanoma lesions than the BRAF mutation (Jorgensen et al., 2003). In nevi, where BRAF mutation is present in 82% of cases (Pollock et al., 2003), Erk1/2 is not activated at all (Jorgensen et al., 2003). 
These data suggest that the relationship between BRAF and gene expression is complicated by additional factors which are unaccounted for in current high throughput analyses.
No other aspect of melanoma has been subject to as intense investigation as the potential for transformed cells to escape the primary lesion and nucleate life-threatening metastases elsewhere in the body. Metastatic melanoma is the most dangerous stage of the disease and is aggressively pursued by clinical researchers searching for therapies that will increase patient survival rates. The earliest analysis of gene expression regulation of metastatic potential was a small part of a study which examined the relationship between vasculogenic mimicry and invasive behavior. On noting that tissue sections of aggressive metastases showed networks of channel-like structures, Maniotis et al. (1999) found that three-dimensional culturing recapitulated similar networks only in strongly invasive melanoma lines. A relatively small cDNA microarray platform was used to obtain gene expression data for strongly and weakly invasive melanoma lines. Fold-change analysis was the sole method for selecting significantly altered genes. Were this the only such analysis of melanoma metastatic potential available, our current understanding of high throughput study criteria would recommend against serious consideration of the results. However, the analysis of melanoma metastatic potential by expression profiling has been performed many times and comparison with later analyses finds both support and extension for the findings of this initial report.
Bittner et al. (2000) analyzed a library of 31 melanoma lines with the express purpose of identifying cell line classifications. Using a combination of unsupervised hierarchical clustering and non-hierarchical clustering methods they identified a major cluster of 19 samples. In vitro invasion and motility tests suggested that these samples were likely to be less metastatic than others. They extracted from these data a weighted list of genes whose variance correlates with the clusters found. Significantly, some of these genes were previously identified by the original Maniotis et al. (1999) study. This alignment with a study which found that highly invasive and not poorly invasive cell lines formed vascular networks strengthened the link between those genes and a weakly metastatic phenotype. Hoek et al. (2006) also pursued melanoma cell line classification. We used three different datasets to show that most melanoma cell lines belong to one of two distinct subgroups. Multiple testing corrected ANOVA identified, in each dataset, 223 genes which consistently showed differential expression between the subgroups. Many of the genes we identified had previously been linked with changes in metastatic potential. The expression patterns of these genes were such that one sample subgroup strongly corresponds with the weakly metastatic signature identified by Bittner et al. The second subgroup corresponds to an invasive phenotype. From these studies one may conclude that the base taxonomy of gene expression in melanoma cells is one tightly linked to metastatic potential. Critically, this taxonomy suggests that melanoma cells may be roughly subdivided into those which are motile and invasive and those which are not motile but rapidly proliferative. There are cell lines that cannot be so neatly categorized, but they fall between the characterized subtypes as intermediates which may both proliferate and invade.
That these subtypes exist and are defined by the genes identified is supported by many other gene expression studies concerned with either invasive or tumorigenic behaviors in melanoma cells (Hoek et al., 2006). The relevance of these in vitro transcription subtypes to in vivo melanoma biology is not yet clear; however, they suggest an intriguing alternative to currently accepted models for melanoma progression. This new hypothesis describes the in vitro subtypes, frozen by culture, as transcriptional states that may be interchangeable in vivo. According to this model progression is conducted by cells oscillating, in response to changing microenvironmental signals, between proliferative and invasive transcription programs (Carreira et al., 2006; Hoek et al., 2006).
Haqq et al. (2005), working with tissue samples, found that metastases could be divided into two distinct groups. Identification of the genes responsible for this division showed they may also distinguish between RGP and VGP samples. This challenging finding, the correlation between RGP signatures and one of the metastatic classes, led the authors to hypothesize that RGP lesion melanoma cells may yet be responsible for some metastases. Interestingly, one of the metastatic types identified by Haqq, expressing several melanocytic genes, strongly resembles the less invasive subtype signature described by both Bittner and Hoek.
Maniotis et al. (1999) compared cell lines that differed in their capacity to pass through collagen/laminin/gelatin-coated filters to assess gene expression changes associated with invasiveness. The same group followed this up to re-identify TIE1 as being upregulated in invasive lines (Hendrix et al., 2001). Later they performed further DNA microarray experiments to specifically assess expression of extracellular matrix-modifying factors (Seftor et al., 2001). This was followed by a more general assessment of gene expression changes between invasive and noninvasive cell lines (Seftor et al., 2002). None of these studies are statistically sound as they use few or no biological replicates and could not control for false positives, making their results difficult to trust in isolation. However, comparison of the Seftor gene lists against our subtype distinguishing gene lists shows a significant (P < 0.003) overlap. These data also show that Seftor’s invasive and noninvasive cell lines are respectively equivalent to the invasive and proliferative signature samples that we described (Hoek et al., 2006). Thus the gene lists being produced, while very likely contaminated with large numbers of false positives, are nevertheless on the right track. Folberg et al. (2006) also performed transcription profiling experiments on cell lines with different invasive potentials in different culturing environments. The comparison of invasive versus noninvasive lines growing in two-dimensional cultures yielded 5209 genes with differential expression. Comparison with our subtype distinguishing gene lists shows significant overlap with the Folberg list (P < 10^−30). Again, we found that the invasive and noninvasive lines are respectively equivalent to our invasive and proliferative signature samples.
The Folberg and Seftor invasiveness studies combined with ours and Bittner’s classification analyses show that among cell lines the primary drivers of melanoma cell gene expression are tightly linked to metastatic potential.
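The significance of an overlap between two gene lists, as quoted in the comparisons above, is commonly assessed with a hypergeometric test. The text does not state which test produced the P-values reported here, so the following is only a minimal sketch of one standard approach. The function name, the size of the gene universe and the list sizes in the example are hypothetical.

```python
from math import comb


def overlap_p_value(universe: int, list_a: int, list_b: int, shared: int) -> float:
    """Probability of seeing at least `shared` genes in common by chance.

    Models list_b as a random draw from a universe of genes in which
    list_a genes are 'marked' (a hypergeometric upper tail).
    """
    upper = min(list_a, list_b)
    total = comb(universe, list_b)
    tail = sum(
        comb(list_a, k) * comb(universe - list_a, list_b - k)
        for k in range(shared, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 10 000 genes on the array, lists of 250 and 30 genes.
    for shared in (2, 5, 10):
        p = overlap_p_value(10000, 250, 30, shared)
        print(f"{shared} shared genes: P = {p:.3g}")
```

The result depends strongly on the assumed universe, which should contain only the genes that both platforms could have detected. Choosing too large a universe makes any overlap look more significant than it is.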
Analyses of cell lines have revealed significant and reproducible changes between melanoma cells and melanocytes, as well as between melanoma cells with differing characteristics of metastatic potential. Similar analyses of melanoma tissues, both primary and metastatic, have yielded gene lists with prognostic potential. It is appreciated that the association between BRAFV600E and gene expression is, while indirect, not absent – but until the contributing factors are identified and accounted for this link remains to be satisfactorily resolved. The studies of melanoma metastatic potential also revealed significant and reproducible changes in gene expression between melanoma cell types. Compared to these, a large number of studies have been found to be deficient in their design and execution, leaving their questions unanswered by the current standards of high-throughput analysis. That gene expression profiling of melanoma continues to be problematic can be illustrated by this final example. Magnoni et al. (2007) published the startling finding that melanocytes derived from the skin of healthy individuals have a transcription profile different from melanocytes derived from the unaffected skin of melanoma patients. This finding has all the appearance of something that should soon find immediate and spectacular application in the clinic. The report contained the elements of a reasonable analysis of the data, employing sufficient replicates, fold-change filtering and multiple testing corrections. However, the authors made the critical mistake of applying a fold-change filter prior to the statistical analysis. Multiple testing corrections are absolutely sensitive to the number of tests conducted. By first filtering out genes that were not at least two-fold changed, the correction is applied to far fewer tests than were effectively made, and the number of likely false positives detected by subsequent inference testing is automatically increased.
By cherry picking genes for statistical analysis the effectiveness of multiple testing correction is severely compromised. Further, if the analysis is performed with the same rigor as that which revealed gene expression changes between invasive and noninvasive cell lines the results are very different. A reanalysis of Magnoni et al.’s data, in which multiple testing corrected ANOVA is performed first, shows that no genes have significantly different gene expression between melanocytes derived from healthy controls and melanoma patients (regardless of fold-change filtering). While the data miner’s canary was present in the original study, close inspection reveals it to be one that had been stuffed and nailed to its perch.
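The sensitivity of multiple testing correction to the number of tests can be shown with a short sketch. It uses the Benjamini-Hochberg step-up procedure as an assumed example of such a correction (the text above does not name the method used in any study) and entirely synthetic fold-changes and P-values.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return indices of tests declared significant by the BH step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose P-value falls under its BH threshold.
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Synthetic (fold_change, p_value) pairs for 1000 hypothetical genes.
genes = (
    [(2.5, 0.002)] * 10                                       # large change, modest P
    + [(2.2, 0.5)] * 10                                       # large change, no evidence
    + [(1.1, 0.01 + 0.99 * i / 979) for i in range(980)]      # small changes
)

all_p = [p for _, p in genes]
filtered_p = [p for fold, p in genes if fold >= 2.0]  # fold-change filter applied first

if __name__ == "__main__":
    print("significant, correction over all genes:", len(benjamini_hochberg(all_p)))
    print("significant, filter before correction: ", len(benjamini_hochberg(filtered_p)))
```

In this synthetic case the ten genes with P = 0.002 do not survive correction over all 1000 tests, yet all ten are declared significant once the filter has reduced the test count to 20. The P-values are identical in both runs, and only the denominator used by the correction has changed.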
The author would like to acknowledge the computing resources afforded by the Functional Genomics Center Zürich (Zürich, Switzerland). The author is supported by grants from the Swiss National Foundation (grant no. 310040-103671/1), Oncosuisse (OCS-01927-08-2006) and the Gottfried and Julia Bangerter Rhyner Stiftung.