Keywords:

  • adenoma;
  • follicular carcinoma;
  • molecular classifiers;
  • thyroid;
  • ultrasound

Abstract

  1. Top of page
  2. Abstract
  3. MATERIALS AND METHODS
  4. RESULTS
  5. DISCUSSION
  6. Acknowledgements
  7. REFERENCES

BACKGROUND

Nodules of the thyroid gland are observed frequently in patients who undergo ultrasound studies. The majority of these nodules are benign, corresponding to goiters or adenomas, and only a small fraction corresponds to carcinomas. Among thyroid tumors, the diagnosis of follicular carcinoma by preoperative fine-needle aspiration biopsy is a major challenge, because differentiating it from adenoma requires inspection of the entire capsule. Consequently, large numbers of patients undergo unnecessary thyroidectomy.

METHODS

Using data from gene expression analysis, the authors applied Fisher linear discriminant analysis and searched for expression signatures of individual samples of adenomas and follicular carcinomas that could be used as molecular classifiers for the precise classification of malignant and nonmalignant lesions.

RESULTS

Fourteen trios of genes were described that fulfilled the criteria for the correct classification of 100% of samples. The robustness of these trios was verified by using leave-1-out cross-validation and bootstrap analyses. The results demonstrated that, by combining trios, better classifiers could be generated that correctly classified >92% of samples.

CONCLUSIONS

Classifiers based on individual expression signatures were a useful strategy for distinguishing between samples with very similar expression profiles. Cancer 2006. © 2006 American Cancer Society.

Clinically significant thyroid nodules are quite common and are present in 5% to 10% of the population.1, 2 Most of these nodules are benign, corresponding to goiters or adenomas, and a few correspond to carcinoma, usually of the differentiated (papillary or follicular) type.2 Clinical observations indicative of malignancy are not very common in differentiated tumors, and diagnoses based on age, gender, nodular rigidity, and response to hormone therapy are neither sensitive nor specific.3 In fact, the best initial diagnostic test is based on cytologic analysis of the nodule using material obtained by fine-needle aspiration biopsy (FNAB). However, this method is not capable of discriminating between adenomas and follicular carcinomas, which comprise between 10% and 20% of carcinomas.4 These suspicious nodules usually are removed surgically by hemithyroidectomy or complete thyroidectomy; and, if follicular carcinoma is confirmed, then a second operation may be necessary. A precise and reliable method for distinguishing between benign and malignant thyroid nodules may help to reduce the number and extent of surgeries.

Differences in gene expression are a potential diagnostic tool for discriminating subtypes of tumors.5–7 High-throughput methods have been adopted to search for differentially expressed genes. A modified serial analysis of gene expression (SAGE) technique was used to determine that trefoil factor 3 (TFF3) was down-regulated in follicular carcinomas compared with adenomas.8 Another application of SAGE led to the identification of 4 genes (DNA-damage inducible transcript 3, arginase type II, integral membrane protein 1, and C1orf24) with combined expression that discriminated between the 2 tissues with 83% accuracy.9 A novel differential-display method showed that immunoglobulin G Fc-binding protein was overexpressed in adenomas and was underexpressed in follicular and papillary carcinomas compared with normal tissue.10

Microarrays have been employed to search for molecular markers for thyroid adenomas and follicular carcinomas. Using oligonucleotide arrays, Finley et al. identified differentially expressed genes among carcinomas, adenomas, and hyperplastic nodules; and the expression of those genes was used to cluster samples that represented different pathologies.11, 12 A similar technique was employed in the comparison of follicular carcinomas and adenomas and demonstrated different profiles after hierarchical clustering.13

Recently, we described the identification of molecular classifiers that could distinguish between malignant and nonmalignant lesions of the gastric mucosa. In our approach, we used Fisher linear-discriminant analysis to define trios of genes that could separate the 2 classes of samples precisely into 2 groups divided by a plane in a 3-dimensional space.14 Despite the growing availability of studies that used high-throughput techniques to distinguish between adenomas and follicular carcinomas, we still have not found a molecular tool that may be used for diagnostic purposes. Using a complementary DNA (cDNA) microarray with 4800 genes,15 we determined the expression profiles of 24 thyroid samples that represented adenomas and follicular carcinomas. By using Fisher linear-discriminant analysis, we identified a series of trios and quartets that could distinguish precisely between all follicular adenomas and carcinomas.

MATERIALS AND METHODS

Tissue Samples and RNA Extraction

Postsurgical samples of adenomas and follicular carcinomas were obtained from Hospital do Cancer A.C. Camargo (Sao Paulo, Brazil), Hospital Araujo Jorge (Goiania, Brazil), and Instituto de Enfermedades Neoplasicas (Lima, Peru). All patients signed informed consent forms, and the study was approved by the Internal Review Board at each institution. Excess tissue that was not used for histopathology was frozen immediately and stored at −140°C until RNA extraction. Before RNA extraction, the diagnosis was reconfirmed by hematoxylin and eosin staining, and all samples were hand dissected to enrich for the desired tissue type. Only samples with at least 70% of the corresponding alteration and no visible infiltrating inflammatory cells were used for RNA extraction.

RNA extraction was performed using TRIZOL (Life Technologies, Bethesda, MD) and a tissue homogenizer (Polytron, Norcross, GA). Samples with an OD 260/280 ratio of ≥1.7 were evaluated by agarose gel with ethidium bromide staining, and samples with a 28S/18S ratio >1 were considered nondegraded and were used for amplification. For reference RNA, we used a pool of 5 disease-free thyroid tissues from an independent set of 5 patients. The same reference RNA was used for hybridization with all 24 test samples.

The raw data from hybridizations and experimental conditions can be obtained from Gene Expression Omnibus (available at URL: http://www.ncbi.nlm.nih.gov/geo [accessed March 9, 2006]) under accession number GSE2468. For a detailed description of the 4.8-K array, the accession number is GPL1930.

Preparation of cDNA Microarray

Silanized glass slides (Corning, Corning, NY) were used for printing purified polymerase chain reaction (PCR) products of 4800 ORESTES clones.16 The map and gene list of our array are available online (available at URL: http://www.ncbi.nlm.nih.gov/geo; accession number GPL1930 [accessed March 9, 2006]). Immobilized fragments have at least 300 base pairs (bp); yield a single hit when blasted against the human genome; do not have repetitive sequences; and, for any stretch of 100 bp, have <85% homology to any sequence of the human genome. Whenever possible, the fragment is the 3′ end of the corresponding gene but is 5′ from the first polyadenylation site. Inserts were amplified by PCR using M13 reverse and forward primers. Amplicons were purified with G50 resin and were arrayed onto glass slides using a Flexys robot (Genomic Solutions, Ann Arbor, MI). Six positive controls that represented bacterial genes or λ phage genes also were immobilized.

RNA Amplification

Our protocol for RNA amplification was based on a procedure previously described by Wang et al.17 with modifications described by Gomes et al.18 The procedure was optimized for 1.5 μg of total RNA as starting material and was proven not to introduce a bias in mRNA representation.

Labeling of cDNA and Hybridization to Arrays

RNA samples (3-8 μg) were spiked with 1 μL of a mixture of 5 synthetic RNAs corresponding to control fragments and 1 μL of random hexamer (dN6; 2 μg/μL) in a final volume of 6.1 μL, then heated for 10 minutes at 70°C and cooled on ice for 1 minute. Next, we added 4 μL of 5 × buffer (ImpromII; Promega, Madison, WI) and 4.8 μL of 25 mM MgCl2 (ImpromII, Promega), both preheated at 42°C; 0.6 μL of low-C deoxynucleotide triphosphate mix (25 mM for deoxyadenosine triphosphate, deoxythymidine triphosphate, and deoxyguanosine triphosphate and 10 mM for deoxycytidine triphosphate); 0.5 μL of RNasin (ImpromII, Promega); 1 μL of reverse transcriptase enzyme (ImpromII, Promega); and 3 μL of 25 mM indocarbocyanine-labeled or indodicarbocyanine-labeled deoxycytidine triphosphate (Amersham Biosciences, Arlington Heights, IL) in a final volume of 20 μL. The reaction proceeded for 2 hours at 42°C. For every labeling reaction, quality control was determined as described by Carvalho et al.19

Template RNA was degraded by the addition of 1.5 μL of 0.5 M ethylenediamine tetraacetic acid and 1.5 μL of 1 M NaOH at 70°C for 20 minutes followed by neutralization with 1.5 μL of 1 M HCl. Labeled target was purified through an AutoSeq G50 column (Amersham Biosciences). For each sample, duplicate hybridizations were performed with inversion of the fluorescent dye.

Labeled cDNAs from sample tissues and reference RNA were mixed in the presence of 2 μL of poly(A) DNA (2 μg/μL; Amersham Biosciences), 2 μL of Cot1 DNA (1 μg/μL), and 2 μL of bovine serum albumin (10 mg/mL; Sigma Chemical Company, St. Louis, MO). The volume was adjusted to 12.4 μL (Speed-Vac; Labconco, Kansas City, MO) and added to hybridization buffer (formamide, Denhardt solution, and salmon sperm DNA).

Before hybridization, glass slides were incubated in prehybridization solution (5 × standard saline citrate [SSC], 0.2% sodium dodecyl sulfate [SDS], 1% bovine serum albumin, and 5 × Denhardt solution) for at least 6 hours at 42°C; washed with water; and dried by centrifugation. Fluorescent samples were heated at 95°C for 5 minutes and were added to glass slides in a hybridization station (GeneTac Hybridization Station; Genomic Solutions) at 65°C.

After overnight hybridization, slides were washed for 5 minutes in 2 × SSC, washed twice for 10 minutes in 0.1 × SSC-0.1%SDS, and washed twice for 5 minutes in 0.1 × SSC.

Image and Data Acquisition

After drying, the slides were scanned on a confocal laser scanner (Arrayexpress; Perkin-Elmer Life Science, Oak Brook, IL) using identical parameters for all slides. For each spot, signal and background intensities were measured using the histogram method in Scanarray software (Perkin-Elmer Life Science).

Statistical Analysis

After image acquisition and quantification, data analysis was performed with R (available at URL: /www.r-project.org/), which is an extensible, open-source, interpreted computer language for statistical analysis and graphics, and with tools of the associated project, Bioconductor (available at URL: http://www.bioconductor.org/ [accessed March 9, 2006]). Data preprocessing consisted of background subtraction, filtering, and loess normalization.

For clustering samples on the basis of their expression profile, we applied hierarchical clustering based on correlation distance and complete linkage. For clustering samples on the basis of the expression profile of genes pertaining to a specific metabolic pathway, we used the nonsupervised algorithm k-means. Once clusters were obtained, samples and genes were organized hierarchically, based on their correlation distances.
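The sample clustering described above can be sketched in a few lines. This is a minimal illustration in Python with SciPy, not the authors' R/Bioconductor code; the function name `cluster_samples`, the samples-by-genes matrix layout, and the synthetic data are assumptions made for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_samples(expr, n_clusters=2):
    """Cluster samples (rows of expr) by expression profile.

    expr is assumed to be a (n_samples, n_genes) matrix of normalized
    log ratios; this layout is a choice made for the sketch.
    """
    # Correlation distance is 1 minus the Pearson correlation between
    # two sample profiles; complete linkage merges clusters by their
    # most distant pair of members.
    tree = linkage(expr, method="complete", metric="correlation")
    # Cut the dendrogram into a fixed number of flat clusters.
    return fcluster(tree, t=n_clusters, criterion="maxclust")

# Synthetic data: two underlying profiles, three noisy replicates each.
rng = np.random.default_rng(0)
profile_a = rng.normal(size=50)
profile_b = rng.normal(size=50)
expr = np.vstack(
    [profile_a + 0.05 * rng.normal(size=50) for _ in range(3)]
    + [profile_b + 0.05 * rng.normal(size=50) for _ in range(3)]
)
labels = cluster_samples(expr)
```

With real data, replicate (dye-swap) slides of the same sample would be expected to merge at the first level of the tree, which is the reproducibility check reported in the Results.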

For the identification of genes with a greater probability of having differential expression in adenomas and follicular carcinomas, we used parametric (t test) and nonparametric (Mann–Whitney U test) methods and, to assess the significance of those findings, we considered the theoretical P values associated with those methods as well as bootstrap estimators. For finding molecular classifiers, we used Fisher linear-discriminant analysis and searched for the best trios and quartets of genes, such that datapoints representing signal intensity for all 3 or 4 genes for each sample would be separated by a plane in a 3-dimensional space (or a hyperplane in a 4-dimensional space) that could distinguish precisely between adenomas and follicular carcinomas. For a given group of genes, this linear classification method searches for linear combinations of their expression with large ratios of between-groups to within-groups sum of squares.33 This maximal ratio of sum of squares or its square root, which is denoted here as singular value decomposition (SVD), measures how well the 2 groups are separated.
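For the 2-class case, the between-groups to within-groups ratio can be written out explicitly. The notation below is introduced here for illustration and is not taken from the original text: x_i is the expression vector of sample i for the chosen genes, m_A and m_C are the class means for adenomas and follicular carcinomas, n_A and n_C are the class sizes, and S_W is assumed to be invertible.

```latex
\[
J(\mathbf{w}) = \frac{\mathbf{w}^{\top} S_B\, \mathbf{w}}{\mathbf{w}^{\top} S_W\, \mathbf{w}},
\qquad
S_B = \frac{n_A n_C}{n_A + n_C}\,(\mathbf{m}_A - \mathbf{m}_C)(\mathbf{m}_A - \mathbf{m}_C)^{\top},
\qquad
S_W = \sum_{g \in \{A, C\}} \sum_{i \in g} (\mathbf{x}_i - \mathbf{m}_g)(\mathbf{x}_i - \mathbf{m}_g)^{\top}
\]
\[
\mathbf{w}^{*} \propto S_W^{-1}(\mathbf{m}_A - \mathbf{m}_C),
\qquad
\max_{\mathbf{w}} J(\mathbf{w}) = \frac{n_A n_C}{n_A + n_C}\,(\mathbf{m}_A - \mathbf{m}_C)^{\top} S_W^{-1} (\mathbf{m}_A - \mathbf{m}_C)
\]
```

Under this reading, the quantity the text calls SVD corresponds to the maximal value of J (or its square root) for a given group of genes, and the separating plane is orthogonal to w*.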

To identify the classifiers, we implemented the (k,l) algorithm. If l represents the number of genes (l = 4600 in the current situation), then the method starts by selecting the best k discriminating genes among the original l genes (we set k = 100). Denoting this set of 100 genes as G1, we next searched for the k best discriminating pairs of genes, such that at least 1 of the genes belonged to G1. Denoting this set of pairs of genes as G2, we then proceeded to search for the k best discriminating trios (denoted as G3), such that at least 1 of the pairs in each trio belonged to G2. Finally, we applied a similar process to obtain the set G4, the k best discriminating groups of 4 genes. The motivation for introducing this method is that it can be implemented much faster than an exhaustive search and, thus, is amenable to bootstrap procedures. This strategy was the same as that described by Meireles et al.14 We performed resampling experiments based on a sequential search method for a classifier in which we applied the method with k = 100 for 1000 bootstrap samples; and we never found, for any bootstrap search, better results than the results that corresponded to the real data set.
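The sequential search can be sketched as follows. This is an illustrative Python/NumPy version under stated assumptions, not the authors' implementation: `fisher_score` and `sequential_search` are hypothetical names, the score is the maximal between/within sum-of-squares ratio for 2 classes, a small ridge term is added for numerical stability, and the data are synthetic.

```python
import numpy as np

def fisher_score(X, y):
    """Maximal between/within sum-of-squares ratio for the gene columns of X."""
    A, C = X[y == 0], X[y == 1]
    m_a, m_c = A.mean(axis=0), C.mean(axis=0)
    d = m_a - m_c
    # Pooled within-class scatter matrix.
    Sw = (A - m_a).T @ (A - m_a) + (C - m_c).T @ (C - m_c)
    # Tiny ridge for numerical stability (an assumption of this sketch).
    Sw = Sw + 1e-9 * np.eye(X.shape[1])
    weight = len(A) * len(C) / len(y)
    return float(weight * (d @ np.linalg.solve(Sw, d)))

def sequential_search(X, y, k=100, max_size=3):
    """Return [G1, G2, ...]: the k best groups of each size, best first."""
    n_genes = X.shape[1]

    def top_k(groups):
        scored = [(fisher_score(X[:, list(g)], y), g) for g in groups]
        return sorted(scored, reverse=True)[:k]

    # G1: the k best single genes.
    levels = [top_k([(g,) for g in range(n_genes)])]
    for _size in range(2, max_size + 1):
        # Extend every retained group by one more gene, so each candidate
        # contains at least one group from the previous level.
        candidates = {
            tuple(sorted(group + (g,)))
            for _score, group in levels[-1]
            for g in range(n_genes)
            if g not in group
        }
        levels.append(top_k(candidates))
    return levels

# Synthetic data: 12 + 12 samples, 15 genes, genes 0-2 shifted in class 1.
rng = np.random.default_rng(0)
y = np.array([0] * 12 + [1] * 12)
X = rng.normal(size=(24, 15))
X[y == 1, :3] += 4.0
levels = sequential_search(X, y, k=5, max_size=3)
best_score, best_trio = levels[2][0]
```

Each level scores roughly k × l candidate groups rather than all possible combinations, which is why the procedure is fast enough to repeat inside resampling loops.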

The identified trios were validated by leave-1-out cross-validation, and we selected only trios that appeared as sample independent (the same trio appeared again in all searches when the number of samples was n − 1) and with perfect performance for all validations. We also validated the trios by bootstrap analyses by using 100 random sets of 16 samples (8 adenomas and 8 follicular carcinomas) for training and by using the remaining 8 samples for validation.
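A minimal sketch of the classification half of the leave-1-out check is shown below, in Python with NumPy. It assumes a 2-class Fisher discriminant with the decision threshold at the midpoint between the projected class means; the function names and the synthetic data are hypothetical. The second criterion in the text (that the same trio is found again in every n − 1 search) is not covered by this sketch.

```python
import numpy as np

def fit_fisher(X, y):
    """Fit a 2-class Fisher discriminant; returns (direction, threshold)."""
    A, C = X[y == 0], X[y == 1]
    m_a, m_c = A.mean(axis=0), C.mean(axis=0)
    Sw = (A - m_a).T @ (A - m_a) + (C - m_c).T @ (C - m_c)
    # Direction points from the class-1 mean toward the class-0 mean.
    w = np.linalg.solve(Sw + 1e-9 * np.eye(X.shape[1]), m_a - m_c)
    # Midpoint between the projected class means (assumed decision rule).
    threshold = w @ (m_a + m_c) / 2.0
    return w, threshold

def predict(w, threshold, X):
    """Label 0 (adenoma) above the threshold, 1 (carcinoma) otherwise."""
    return np.where(X @ w > threshold, 0, 1)

def leave_one_out_accuracy(X, y):
    """Fraction of samples classified correctly when each is left out in turn."""
    hits = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        w, t = fit_fisher(X[keep], y[keep])
        hits += int(predict(w, t, X[i:i + 1])[0] == y[i])
    return hits / len(y)

# Synthetic trio: 12 + 12 samples, 3 genes, classes far apart.
rng = np.random.default_rng(1)
y = np.array([0] * 12 + [1] * 12)
X = rng.normal(size=(24, 3))
X[y == 1] += 6.0
accuracy = leave_one_out_accuracy(X, y)
```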

RESULTS

Adenomas and Follicular Carcinomas Have a Similar Expression Profile

After collecting expression data for 4596 genes with signals >1.1 × the background signal, we clustered all individual slides hierarchically. Figure 1 shows that replicate slides for every sample grouped at the first level, demonstrating the reproducibility of data obtained by the dye-swap strategy. In addition, it was apparent from the clustering analysis that adenomas and follicular carcinomas could not be distinguished on the basis of the global expression pattern of the tested genes.

Figure 1. Hierarchical clustering of thyroid samples of adenoma (A) and follicular carcinoma (FC) was based on the expression profiles of 4600 genes. Amplified RNA samples derived from A (12 samples) and FC (12 samples) were hybridized against a complementary DNA microarray with 4600 gene fragments. All samples were hybridized in duplicate, with dye-swap (M, main; S, swap). All 48 slides were normalized as described in the text and were clustered using correlation distance and complete linkage.

Next, we identified genes that were expressed differentially in the 2 sample groups on the basis of their fold changes and P values. Again, it was apparent from the volcano plot (Fig. 2) that adenomas and follicular carcinomas had very similar expression patterns, with genes showing only modest fold changes. Only 32 differentially expressed genes showed nominal P values ≤.01 (Table 1). No precise hierarchical clustering of samples could be obtained with these 32 genes (data not shown).

Figure 2. This volcano plot represents differentially expressed genes in adenomas and follicular carcinomas of the thyroid. For each individual gene, the fold change was determined (transformed to log2), and its nominal P value was calculated (transformed to − log10) using the Mann–Whitney test.
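The two coordinates of such a volcano plot can be computed as sketched below, in Python with NumPy and SciPy. This is an illustration rather than the authors' code: the function name `volcano_coordinates` and the synthetic data are hypothetical, the inputs are assumed to be log2 expression ratios, and the fold change is taken as the difference of class medians so that negative values mean higher expression in follicular carcinomas, matching the sign convention of Table 1.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def volcano_coordinates(adenoma, carcinoma):
    """Return (log2 fold change, -log10 nominal P) for every gene.

    Both inputs are assumed to be (n_samples, n_genes) arrays of log2
    expression ratios against the common reference.
    """
    # Difference of medians on the log2 scale; negative values mean
    # higher expression in follicular carcinomas.
    log2_fc = np.median(adenoma, axis=0) - np.median(carcinoma, axis=0)
    # Two-sided Mann-Whitney U test, one gene at a time.
    p_values = np.array([
        mannwhitneyu(adenoma[:, g], carcinoma[:, g],
                     alternative="two-sided").pvalue
        for g in range(adenoma.shape[1])
    ])
    return log2_fc, -np.log10(p_values)

# Synthetic data: 12 + 12 samples, 5 genes, gene 0 much higher in carcinomas.
rng = np.random.default_rng(2)
adenoma = rng.normal(size=(12, 5))
carcinoma = rng.normal(size=(12, 5))
carcinoma[:, 0] += 10.0
log2_fc, neg_log10_p = volcano_coordinates(adenoma, carcinoma)
```

These P values are nominal, that is, not adjusted for the number of genes tested.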

Table 1. Genes with Differential Expression between Adenomas and Follicular Carcinomas with Nominal P Values <.01

Full-Length Identifier  Gene Name                 Fold Median (Log2)*  P Value
AK027855                SLC5A6                    −.1405               .001432610
NM_021913               AXL                       .1515                .001432610
AF348827                PLVAP                     .6608                .001432610
AL389942                EUROIMAGE 2005635         .1272                .001829776
AK093183                MKPX                      −.225                .001829776
AB044088                BHLHB3                    .3286                .003636625
BC002983                GRCC9                     −.4481               .004513053
AF055007                Homo sapiens clone 24707  .1807                .005559590
NM_031850               AGTR1                     .5446                .005559590
AK001963                FLJ11101                  .2032                .006811737
AK056433                FLJ10305                  −.2807               .008293161
AF041210                MID1                      −.3477               .008293161
BC001403                CPSF5                     −.2288               .010044539
U78166                  RIT2                      .1019                .010044539
BC031391                C17                       .2779                .010044539
AK024251                KIAA0918                  .3363                .010044539
NM_014552               LBP-32                    .1427                .012093977
NM_003177               SYK                       −.5432               .012093977
NM_001845               COL4A1                    .7465                .012093977
AK075476                E2IG4                     .6804                .012093977
BC036658                SSA2                      .2988                .012093977
AK055909                FLJ31347                  −.1153               .012093977
NM_024102               MEP50                     .1781                .014492507
NM_003617               RGS5                      .6725                .014492507
NM_001855               COL15A1                   .5302                .014492507
AL117566                UBE1C                     .2052                .014492507
NM_002986               CCL11                     .0622                .014492507
AF123659                LZTS1                     .1771                .017271193
AL832955                FLJ23153                  .8807                .017271193
NM_000317               PTS                       .4885                .017271193
AB018353                UNC84A                    −.5075               .017271193
AK057653                FLJ22726                  .1553                .017271193

* Numbers preceded by a minus sign had higher expression in follicular carcinomas.

To investigate changes in specific metabolic pathways and whether such changes could generate meaningful sample clusters, we grouped all genes that were represented in our slides according to their biologic process, as described by the Kyoto Encyclopedia of Genes and Genomes (available at URL: http://www.genome.jp/kegg/ [accessed March 9, 2006]), and we used a nonsupervised algorithm, K-means, for clustering samples on the basis of the expression profile of genes that pertained to a given biologic process. Again, none of the 17 biologic processes represented in our array could precisely group adenomas and follicular carcinomas. Figure 3 shows that neither genes related to cell adhesion nor genes related to the cell cycle had expression profiles that were sufficiently distinct in the 2 groups of samples to generate a clear separation between the 2 groups.

Figure 3. Nonsupervised clustering of samples representing adenoma (A) and follicular carcinoma (FC) of the thyroid was based on the expression patterns of genes related to adhesion and to the cell cycle. The expression profiles of genes pertaining to functional modules representing (A) cell adhesion or (B) the cell cycle were extracted from the data set and were used for clustering samples that represented A or FC using the nonsupervised algorithm K-means, with k = 2. Once clusters were obtained, samples and genes were ordered hierarchically on the basis of their correlation distance.

Molecular Classification of Adenomas and Follicular Carcinomas on the Basis of Their Signature Expression

Because the expression profiles of the 2 sample groups did not demonstrate a pattern that would be sufficient for clustering samples accordingly, we decided to search for expression signatures of individual samples that could be used as molecular classifiers. The use of expression signatures, as discussed by Meireles et al.,14 has the advantage that it is not based on comparative analyses of several samples; hence, it would be easier to implement in a routine service. By using the strategy of sequentially searching the entire data set, as described earlier, we identified the 100 trios with the highest SVD; and, among them, we selected 14 trios that could separate all samples precisely according to their pathology (Table 2) by using the leave-1-out criteria. Figure 4 represents 2 of the selected trios.

Table 2. Single-Trio Classifiers and Their Genes

Trio  Gene 1     Gene 2     Gene 3
1     FLJ10305*  AXL*       JPH3
2     GRCC9*     BBS2       RBM9
3     DUSP22     SLC5A6*    H41
4     UBE1C*     SMCR8      HERC4
5     AXL*       ZFYVE21    KIAA1244
6     MYH10      GRCC9*     KIAA1463
7     SLC5A6     ZFYVE21    HK3
8     OLFML3     DUSP22     CLDN2
9     LZTS1*     SLC5A6     DUSP22
10    RIT2*      DUSP22     MEST
11    SYCP1      MEP50*     INSIG2
12    SNAP25     SNAI2      NDRG2
13    PPP3CA     SLC5A6     DUSP22
14    FSTL1      COL15A1*   FLJ12572

* Genes in Table 1 that had P values <.01 for their differential expression in adenomas and follicular carcinomas.

Figure 4. These are 3-dimensional representations of single-trio classifiers. Panels A and B are graphic representations of 2 single-trio classifiers based on the expression signature of individual samples. The genes that compose the trios are identified by their names as assigned by the HUGO Gene Nomenclature Committee. Adenomas are represented by green circles and follicular carcinomas are represented by red squares.

Each of the 14 trios could function as a single-trio classifier. However, because cancer frequently is associated with several genetic alterations, it is reasonable to assume that a single-trio classifier could fail to classify a sample correctly in larger sample sets. Hence, it would be desirable to join various trios into a single classifier. In our series, the simplest composite classifier could be obtained by choosing q of the 14 trios following the majority rule. If we assume that q is an odd number, then a sample would be classified as adenoma if the majority of the q single-trio classifiers identify that sample as adenoma. Otherwise, it would be classified as follicular carcinoma. If the number of single-trio classifiers, q, is even, then we could assume an extra possible state for the composite classifier, say, “undecided”, to indicate that equal votes occurred.
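The majority rule, including the extra "undecided" state for ties, can be sketched in plain Python. The function name and the label strings are choices made for this example.

```python
from collections import Counter

ADENOMA = "adenoma"
CARCINOMA = "follicular carcinoma"

def majority_vote(votes):
    """Combine the calls of q single-trio classifiers by simple majority.

    votes is a sequence of labels, one per single-trio classifier.
    A tie, which can only happen when q is even, yields "undecided".
    """
    counts = Counter(votes)
    if counts[ADENOMA] > counts[CARCINOMA]:
        return ADENOMA
    if counts[CARCINOMA] > counts[ADENOMA]:
        return CARCINOMA
    return "undecided"

odd_call = majority_vote([ADENOMA, CARCINOMA, ADENOMA])
tied_call = majority_vote([ADENOMA, CARCINOMA])
```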

However, because trios have a measurement of quality (SVD), a natural approach is to impose different weights to single-trio votes according to their SVD value. More precisely, we assume that each single-trio classifier provides a weighted vote: its SVD divided by the sum of all q SVD values in case it “votes” for adenoma and zero if it “votes” for follicular carcinoma. Thus, a sample would be classified as adenoma if the sum of these q-weighted votes is greater than one-half; otherwise, it is classified as follicular carcinoma.
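The weighted vote can be sketched in the same way. The function name is hypothetical, and the SVD values in the usage lines are invented numbers chosen only to show how a single high-quality trio can outweigh two weaker ones.

```python
ADENOMA = "adenoma"
CARCINOMA = "follicular carcinoma"

def weighted_vote(votes, svd_values):
    """Combine single-trio calls, weighting each by its SVD value.

    Each trio contributes its SVD divided by the sum of all SVD values
    when it votes for adenoma, and zero when it votes for carcinoma.
    """
    total = sum(svd_values)
    adenoma_share = sum(
        svd / total for vote, svd in zip(votes, svd_values) if vote == ADENOMA
    )
    # Adenoma only if the weighted share of adenoma votes exceeds one-half.
    return ADENOMA if adenoma_share > 0.5 else CARCINOMA

# Invented SVD values, for illustration only.
strong_minority = weighted_vote([ADENOMA, CARCINOMA, CARCINOMA], [5.0, 1.0, 1.0])
weak_majority = weighted_vote([ADENOMA, ADENOMA, CARCINOMA], [1.0, 1.0, 5.0])
```

Because the rule uses a strict inequality, a weighted share of exactly one-half is called follicular carcinoma, so no "undecided" state is needed.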

To evaluate this idea, we considered the performance of the 13 composite classifiers, with q = 2, q = 3, and so on, up to q = 14, in which, in each case, the q chosen trios were those with the largest SVD. Because the performance of single trios already was perfect according to the leave-1-out criteria, we also imposed a more stringent performance assessment: First, we chose 8 adenomas and 8 follicular carcinomas among the original 24 samples and used this set of 16 samples to train the composite classifier, and we tested its performance using the remaining 8 samples (4 adenomas and 4 follicular carcinomas) by measuring how many samples of each type were classified correctly. This exercise was repeated 100 times using different random sets of 16 and 8 samples, and we computed the mean number of correctly classified adenomas and follicular carcinomas. Figure 5 shows that the performance of the q-composite classifiers improved as q increased; and, with q ≥11, we correctly classified >90% of adenomas and >80% of follicular carcinomas. The statistical significance of this finding was determined by bootstrap analyses. We repeated the (k,l) sequential search described earlier using k = 10 and l = 100. We started by ordering the genes according to their t-value, then identified the gene with the largest SVD, and searched among the l − 1 remaining genes for the one that yielded the pair of genes with the largest SVD. Excluding these 2 genes, we repeated the search to identify the next best pair and so on, up to the identification of the 10 pairs of genes with the highest SVD. For each pair in this list, we searched among the l genes with the largest t-values for the gene that yielded the best trio. We implemented this approach for k = 10 and l = 100 and found 5 trios with 100% performance according to the leave-1-out criteria. To assess significance by bootstrap analyses, we repeated the whole procedure for N sets of samples with permuted labels. For N = 100, we found a P value of .04.
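The label-permutation step can be sketched generically in Python with NumPy. Here `statistic` stands for any function that runs the whole search on a data set and returns a single number (for instance, the number of perfect trios found); the toy statistic `best_mean_difference`, the function names, and the synthetic data are hypothetical stand-ins. The P value is estimated as the fraction of permuted data sets that do at least as well as the real one, which is an assumption about the estimator rather than a detail given in the text.

```python
import numpy as np

def permutation_p_value(statistic, X, y, n_permutations=100, seed=0):
    """Estimate a P value by recomputing a statistic under shuffled labels."""
    rng = np.random.default_rng(seed)
    observed = statistic(X, y)
    # Count permuted label sets whose statistic reaches the observed value.
    n_as_good = sum(
        statistic(X, rng.permutation(y)) >= observed
        for _ in range(n_permutations)
    )
    return n_as_good / n_permutations

def best_mean_difference(X, y):
    """Toy statistic: largest absolute difference in class means over genes."""
    return np.max(np.abs(X[y == 0].mean(axis=0) - X[y == 1].mean(axis=0)))

# Synthetic data: 12 + 12 samples, 10 genes, gene 0 shifted in class 1.
rng = np.random.default_rng(3)
y = np.array([0] * 12 + [1] * 12)
X = rng.normal(size=(24, 10))
X[y == 1, 0] += 5.0
p_value = permutation_p_value(best_mean_difference, X, y, n_permutations=100)
```

Under this estimator, a P value of .04 with N = 100 would correspond to 4 permuted data sets performing at least as well as the real data.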

Figure 5. This chart illustrates the efficiency of composite classifiers in performing class distinction. Composite classifiers were constructed by combining q single-trio classifiers in which q varied from 2 to 14, and trios were grouped in the order of their singular value decomposition. Each composite classifier then was trained with 100 random sets of 16 samples that represented adenomas and follicular carcinomas (8 samples from each class) and was validated with the remaining 8 samples (4 from each class). The percentages of correct counts are represented in the y-axis. Adenomas are represented by solid squares, and follicular carcinomas are represented by open triangles.
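The resampling scheme behind this figure (train on 8 + 8 samples, validate on the remaining 4 + 4, repeat 100 times) can be sketched as follows in Python with NumPy. For brevity, the sketch evaluates a single Fisher discriminant on one trio rather than a composite classifier; the function names, the midpoint decision threshold, and the synthetic data are assumptions.

```python
import numpy as np

def fit_fisher(X, y):
    """Fit a 2-class Fisher discriminant; returns (direction, threshold)."""
    A, C = X[y == 0], X[y == 1]
    m_a, m_c = A.mean(axis=0), C.mean(axis=0)
    Sw = (A - m_a).T @ (A - m_a) + (C - m_c).T @ (C - m_c)
    w = np.linalg.solve(Sw + 1e-9 * np.eye(X.shape[1]), m_a - m_c)
    return w, w @ (m_a + m_c) / 2.0

def predict(w, threshold, X):
    """Label 0 (adenoma) above the threshold, 1 (carcinoma) otherwise."""
    return np.where(X @ w > threshold, 0, 1)

def repeated_split_accuracy(X, y, n_train_per_class=8, n_repeats=100, seed=0):
    """Mean fraction of held-out samples classified correctly, per class."""
    rng = np.random.default_rng(seed)
    idx_a, idx_c = np.flatnonzero(y == 0), np.flatnonzero(y == 1)
    correct, tested = np.zeros(2), np.zeros(2)
    for _ in range(n_repeats):
        # Shuffle each class separately so every split is balanced.
        perm_a, perm_c = rng.permutation(idx_a), rng.permutation(idx_c)
        train = np.concatenate([perm_a[:n_train_per_class],
                                perm_c[:n_train_per_class]])
        test = np.concatenate([perm_a[n_train_per_class:],
                               perm_c[n_train_per_class:]])
        w, t = fit_fisher(X[train], y[train])
        pred = predict(w, t, X[test])
        for cls in (0, 1):
            mask = y[test] == cls
            correct[cls] += np.sum(pred[mask] == cls)
            tested[cls] += mask.sum()
    return correct / tested

# Synthetic trio: 12 + 12 samples, 3 genes, classes far apart.
rng = np.random.default_rng(4)
y = np.array([0] * 12 + [1] * 12)
X = rng.normal(size=(24, 3))
X[y == 1] += 8.0
acc_adenoma, acc_carcinoma = repeated_split_accuracy(X, y)
```

Reporting the two classes separately mirrors the figure, in which adenomas and follicular carcinomas are plotted as separate series.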

DISCUSSION

With the broader application of ultrasound in modern medicine, the incidence of thyroid nodules has risen dramatically, with nodules detected in up to 40% of women and 20% of men.20, 21 For patients with thyroid nodules, the diagnosis relies primarily on cytology of tissue samples obtained by FNAB, which provides accurate results for inflammatory nodules, goiter, and papillary or medullary carcinomas.20 However, for the differential diagnosis between adenomas and follicular carcinomas, cytology of FNAB-derived samples often provides inconclusive results, because the distinction between the 2 pathologies is based on capsule invasion by follicular carcinoma. Consequently, patients with follicular lesions, whether they are adenomas or follicular carcinomas, undergo thyroidectomy, although only approximately 20% of these patients have malignant disease. Therefore, a new molecular-based diagnosis for the precise distinction between malignant and nonmalignant lesions would have a great and positive impact on the management of these patients.

Using cDNA arrays, we determined the expression profile of a set of 4600 genes in thyroid lesions that represented adenomas and follicular carcinomas, and we studied the data set with the objective of defining differences in the expression profiles that could lead to a precise distinction between these 2 pathologies. Different clustering approaches were used successfully to distinguish between tumor and nontumor samples22 and between morphologically similar samples5, 23 and to determine disease outcome.24, 25

In the current study, we failed to identify a set of genes with differential expression that would be capable of generating clusters that distinguish malignant lesions from nonmalignant lesions. Follicular adenomas and follicular carcinomas could not be grouped precisely by hierarchical clustering either on the basis of global gene expression (Fig. 1) or on the basis of the expression profile of genes selected as a function of their differential expression (data not shown). Such a failure could be explained by the great similarity in overall gene expression profiles observed between the 2 sample groups, as demonstrated by Figures 1 and 2, which may be further indication that the 2 lesions are similar not only phenotypically but also at the level of gene expression. Our data are in disagreement with those published previously by Finley et al. and by Barden et al.,11–13 who identified a set of genes that, by clustering algorithms, could distinguish between adenomas and follicular carcinomas. This discrepancy may be explained by the number and identity of the genes present in our arrays compared with the arrays used by other groups or by differences in mathematical and statistical approaches. Nevertheless, disagreement between data obtained by array technology seems to be a problem that escapes trivial explanations.26

It was described recently by Segal et al. that functional modules could be affected differentially in diseased samples.27, 28 Considering cell cycle and cell adhesion as 2 functional modules that could be altered in follicular carcinomas compared with adenomas, we clustered our samples by using a nonsupervised algorithm and a set of genes that belonged to these 2 functional modules. Once again, the changes in expression profiles of such genes were not sufficient to produce a precise distinction between adenomas and follicular carcinomas (Fig. 3A,B).

Having failed to identify classifiers based on differentially expressed genes, we searched for expression signatures in individual samples that could be used for the construction of molecular classifiers. Although many statistical methods are available for identifying clustering-based classifiers (for review, see Quackenbush29), they often are based on the expression of a large set of genes and, necessarily, require data from the 2 sample groups to identify the set of genes with similarities and differences that can be used to define clusters. To be applied routinely, these requirements may represent potential pitfalls. An alternative would be the implementation of supervised learning procedures in which classification could be based on the expression signatures of individual samples rather than comparative expression profiles, as proposed previously.6, 29, 30 The major advantage would be the possibility of creating a database against which the test sample could be classified, as demonstrated by Ramaswamy et al.6 A known group of samples could be used for training, and the resulting classifier could be used for the prediction of an unknown sample (class prediction), as demonstrated by Golub et al.5 Support Vector Machine, an example of a supervised learning algorithm, has been used successfully for class distinction.31, 32 Several other mathematical methods, such as Nearest Neighbors Classifiers and Classification Trees,33 also could be applied to search for groups of genes with different expression patterns or sample signature. It appears to us that Fisher linear-discriminant analysis34 attains a good compromise between simplicity and performance, making it a good choice for this investigation. Moreover, this approach to identify expression signatures corresponds to the usual approach to identify differentially expressed (single) genes based on t statistics.

Therefore, we decided to apply Fisher linear-discriminant analysis and, by searching sequentially, we were able to identify 14 trios of genes that could classify the 2 sample groups precisely as adenomas or follicular carcinomas (Table 2). It can be observed that only a few of the genes that composed the trios also are present in Table 1, demonstrating that trios are not necessarily made of differentially expressed genes. This observation supports the notion that, especially for biologically similar samples, searching for differences in expression profiles and, therefore, cluster-based classifiers may not be the best strategy.

It is more noteworthy that the identification of differentially expressed genes depends absolutely on the nature of the samples under comparison, which brings up concern regarding the robustness of classifiers based on comparative analysis. The set of genes composing such classifiers likely would vary if the sample collection were different. This problem can be solved in part by the analysis of very large numbers of samples. For this reason, an essential step for the validation of these classifiers is the validation with an independent set of samples.35, 36 Conversely, classifiers based on expression signatures of individual samples would not vary in composition if the samples under analysis changed but, rather, would vary in their ability to classify a given sample correctly that differed in the coordinated expression of the genes (3 in the current series). To overcome this problem, it is possible to have a larger number of classifiers composed of different genes, reducing the likelihood that a sample would escape all tests.

Follicular carcinoma of the thyroid is a rare disease; and, to obtain the 12 samples described in this report, we collected nearly 1000 tissue fragments from thyroidectomy specimens. Consequently, because validation of the trios described herein with large enough independent samples was not feasible, we applied different mathematical tools to determine the significance of our findings.

First, all 14 trios described herein performed with 100% efficiency in the leave-1-out analysis both for finding the same trio when n − 1 samples were used and for correctly classifying the left-out sample. Molecular classifiers based on signature expression of individual samples, as discussed above, have the advantage that their performance is not based on comparative analyses, as in the case of classifiers based on clustering algorithms. However, when they are applied to larger sets of samples, these classifiers likely would have their performance affected by variation in gene expression observed in different tumors. To overcome this problem, we combined single-trio classifiers to build classifiers with ≥2 trios and tested their performance using 100 random sets of training samples and 100 random sets of validating samples (Fig. 5). The efficiency in correctly classifying samples, as expected, was >90% when we combined ≥11 trios.

Finally, we also validated the identification of these trios by using bootstrap analysis, comparing the efficiency of finding perfect classifiers and their performance using the real data set with the performance obtained by using 100 data sets in which sample identity was permuted. This exercise indicated a P value of .04, suggesting that our findings were not due to chance but, indeed, reflect meaningful information about the differences in signature expression.

In the current study, we described 14 trios of genes that, when used in combination, were capable of correctly classifying unknown samples with statistical significance. Considering the scarcity of gene expression data on follicular carcinomas of the thyroid, we expect that other groups also may be able to implement the classifiers described herein and test them in different data sets. Such efforts may establish the value of these classifiers as diagnostic tools in FNAB samples and, ultimately, may help to reduce the number of patients who undergo unnecessary thyroidectomy.

Acknowledgements

The authors thank Waleska Martins, Gustavo Esteves, Chamberlein Neto, Sarah Marques, and Carlos Nascimento for technical assistance and Dr. Fernando Soares for his expertise. They also thank all members of their laboratories for helpful discussions and Dr. Ricardo Brentani for critical reading of the article.

REFERENCES
