The clinical validity and applicability of the World Health Organization (WHO) histopathologic classification of thymomas (‘classification’) have been questioned. Evidence-based pathology promotes the use of systematic reviews and analysis of data with meta-analysis rather than subjective reviews of the literature.
The authors performed a review of the English literature from 1999 to the present to identify ‘best evidence’ regarding the use of the ‘classification.’ The data were analyzed with meta-analysis software.
To the authors' knowledge, only Level-3 or -4 evidence published in retrospective case series is currently available regarding the use of the ‘classification.’ Meta-analysis demonstrated that only 3 WHO categories of thymomas are associated with significant survival differences: A/AB/B1, B2, and B3. It also indicated significant heterogeneity with regard to the results published in different studies. To the authors' knowledge, there is no current evidence to determine whether thymoma types are significant prognostic features for patients previously stratified by stage.
Thymomas are unusual, low-grade malignant tumors of the thymic epithelium that can present with a wide variety of histopathologic features.1, 2 The classification of these tumors into 4 histopathologic variants (lymphocytic-predominant, epithelial-predominant, mixed [lymphoepithelial], and spindle cell thymomas), proposed by Bernatz et al.3, 4 from the Mayo Clinic in 1961, was widely used in the U.S. until the late 1990s and has been replaced in many hospitals by the World Health Organization (WHO) classification scheme. The staging scheme proposed by Masaoka et al.5 is still in use, with minor modifications. Levine and Rosai6 proposed the general concept that encapsulated thymomas, as classified earlier, are benign neoplasms, whereas invasive lesions are malignant. This view has been for the most part discarded after reports of clinical recurrences and/or metastases in patients with Masaoka stage I disease.7 Because the Bernatz classification scheme cannot accurately predict the prognosis or stage of disease of thymoma patients, Marino and Müller-Hermelink8 proposed in 1985 a different system that was based on the histogenesis of the neoplastic epithelial cells as determined by immunohistochemical evidence and morphologic features. Modification of this classification by Kirchner et al.9 includes the following categories of thymomas: medullary, mixed, predominantly cortical, cortical, well-differentiated thymic carcinoma, and high-grade thymic carcinoma. The investigators suggested that this classification correlated well with prognosis, independent of the assessment of capsular invasion. However, the reproducibility and prognostic value of the Müller-Hermelink classification of thymomas have been controversial.1, 10–12 Other classification schemas of thymomas have been proposed by Suster and Moran12 and others.
In the presence of somewhat intense controversy among pathologists interested in thymic pathology regarding the clinical value of different schema, the WHO published in 1999, and later modified in 2004, a system for ‘grouping’ thymomas that uses letters and numbers rather than cell type or other histopathologic descriptors to stratify these neoplasms into A, AB, B1, B2, and B3 types.13 Thymic carcinoma was initially classified in 1999 as Thymoma C, but in the most current version of the schema it is listed as thymic carcinoma.14 The WHO terminology was intended to provide a translation formula to facilitate comparison among preexisting nomenclatures, but it has become the de facto classification system of thymomas, widely used in many countries.
Suster and Moran10, 12, 15 raised questions regarding several practical problems with the applicability of the WHO classification scheme of thymomas in daily pathology practice. For example, the system does not provide specific histopathologic criteria to classify heterogeneous tumors that exhibit different ‘types’ in separate tissue samples from a single neoplasm, and describes only general histopathologic criteria with which to distinguish A from AB lesions, or differentiate among the various B types of thymomas. They suggested that the classification of thymic epithelial neoplasms could be collapsed into 3 classes: thymoma, atypical thymoma, and thymic carcinoma.12 Clinical studies have also raised questions regarding the predictive value of the WHO classification scheme of thymomas in helping to select various therapeutic modalities such as neoadjuvant therapy.7, 14, 16–18 For example, a recent review article by Detterbeck,7 a thoracic surgeon, concluded that the prognosis of patients with thymoma subtypes A, AB, and B1 through B3 is variable and that the issue of how the WHO classification should be used in the clinical setting remains unclear.
Evidence-based pathology (EBP) promotes the use of an analytic approach to the evaluation of the information (‘evidence’) published in the medical literature and its applicability for individual patient care.19, 20 It relies on systematic review of the literature; classification of information into levels of potential accuracy in an attempt at identifying ‘best’ current evidence; and the use of meta-analysis and other quantitative tools for the integration of best evidence into diagnostic rules, guidelines, or other algorithms that can help guide the treatment of individual patients.21–28 We performed a systematic review of the literature to identify current ‘best evidence’ regarding the clinical value of the WHO classification of thymomas and evaluated the data using meta-analysis.
MATERIALS AND METHODS
The English language medical literature was reviewed for the period 1999 through 2007 with the PubMed database (National Library of Medicine) using the following search terms: thymoma, pathology, prognosis, and/or stage. Publications that did not report the use of the 1999 WHO classification system of thymomas, case reports, case series reporting <10 patients, and publications that provided no information regarding tumor stage or those with a clinical follow-up <5 years were excluded. The study design of each publication was identified (retrospective or prospective randomized clinical trials, case-control study, retrospective or prospective cohort, etc). The information provided by each publication was classified into levels of evidence, using previously published criteria.19, 20
Evidence-based pathology advocates the use of the so-called FRAP paradigm (frame questions, research, analyze best evidence, derive patient-based rules).19, 20 The information available in each publication was organized in tables designed to answer specific questions (Table 1).
Table 1. Evidence-based Pathology: Questions to Assess the Clinical Applicability of the World Health Organization Classification of Thymomas
WHO indicates World Health Organization.
1. What levels of “best evidence” are currently available?
2. Is the WHO classification scheme of thymomas reproducible among different pathologists?
3. Are there significant survival differences in thymoma patients, categorized by WHO class?
4. Are there significant survival differences in patients with thymoma stratified by stage and WHO class?
5. Can the WHO scheme be collapsed into a smaller number of classes?
The number of thymoma cases, by WHO histologic type, was tabulated and data were normalized to percentages. The survival proportions, by WHO histologic type, were collected from various sections of each article and tabulated after normalization into percentages. The survival proportions, by WHO histologic type and stage, were searched for. The data were analyzed with meta-analysis using Comprehensive Meta-Analysis software (version 2; Biostat, Englewood, NJ). We compared the survival proportions in each WHO histologic type of thymoma with all others, using pairwise (2 at a time) comparisons, in an attempt to determine which subtypes had statistically significantly better survival. For example, WHO type A thymomas were compared with AB, B1, B2, and B3 thymomas, sequentially. The software estimates the odds (odds = probability/[1 − probability]) for each outcome by WHO histologic subtype for the data from each publication. At the conclusion of the analysis, the overall odds ratios (ORs) for all cases from all publications are calculated and P values estimated. The software weighs the data from various publications according to their cohort size. The data can be visualized graphically as shown in Figures 1–4. Each horizontal line represents the 95% confidence interval (95% CI) for each study. The square size represents the cohort size. The diamond at the bottom of the graph represents the overall OR. The width of the diamond is proportional to the overall 95% CI. The data can be analyzed using a fixed or a random model. In the fixed model, the effect sizes of each study are conserved. The random model uses a more flexible approach in which the effect sizes are normalized toward an overall mean effect size. We analyzed all data using both models. The data were also analyzed for heterogeneity between study results using funnel plots. As shown in Figure 5, these plots demonstrate a vertical line, 2 lateral curves, and multiple small circles. Each circle represents a single study. In instances in which the data from various studies are homogeneous, the curves are close to the vertical line and all circles are clustered near the vertical line in a symmetrical distribution. The software also calculates various statistics, such as the Egger regression intercept test, to estimate the significance of the data.
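The pooling step described above can be illustrated with a short Python sketch. It is not the algorithm of the commercial software, whose internals are not described here; it assumes the standard inverse-variance method for a fixed model and the DerSimonian-Laird estimate of between-study variance for a random model. The function names and the study counts are hypothetical and serve only as an illustration.

```python
import math

def log_odds_ratio(surv_a, total_a, surv_b, total_b, correction=0.5):
    """Return (log OR, variance) for survival in group A versus group B."""
    a, b = surv_a, total_a - surv_a            # group A: survivors, deaths
    c, d = surv_b, total_b - surv_b            # group B: survivors, deaths
    if 0 in (a, b, c, d):                      # continuity correction for empty cells
        a, b, c, d = (x + correction for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d   # Woolf variance of the log OR
    return log_or, variance

def pool(effects, tau2=0.0):
    """Inverse-variance pooled (OR, lower, upper); tau2 > 0 gives a random model."""
    weights = [1 / (v + tau2) for _, v in effects]
    mean = sum(w * y for w, (y, _) in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return math.exp(mean), math.exp(mean - 1.96 * se), math.exp(mean + 1.96 * se)

def dersimonian_laird_tau2(effects):
    """Between-study variance estimated from Cochran's Q (assumed estimator)."""
    w = [1 / v for _, v in effects]
    mean = sum(wi * y for wi, (y, _) in zip(w, effects)) / sum(w)
    q = sum(wi * (y - mean) ** 2 for wi, (y, _) in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    if c <= 0:                                 # a single study carries no between-study information
        return 0.0
    return max(0.0, (q - (len(effects) - 1)) / c)

# Hypothetical studies: (survivors type A, total type A, survivors type B2, total type B2)
studies = [(18, 20, 30, 40), (25, 27, 41, 60), (9, 10, 22, 35)]
effects = [log_odds_ratio(*s) for s in studies]
fixed = pool(effects)
random_ = pool(effects, dersimonian_laird_tau2(effects))
print("fixed  OR %.2f (95%% CI %.2f to %.2f)" % fixed)
print("random OR %.2f (95%% CI %.2f to %.2f)" % random_)
```

In this sketch a larger cohort yields a smaller variance and therefore a larger weight, which corresponds to the larger squares in a forest plot; the pooled estimate and its interval correspond to the diamond.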
RESULTS
Fifteen studies provided information regarding 2192 patients with thymomas classified according to the WHO scheme and followed for >5 years postoperatively.14, 16, 17, 29–40
What Levels of ‘Best Evidence’ Are Currently Available?
No retrospective or prospective randomized clinical trials of thymoma patients have been reported in English in PubMed since 1999. All 15 studies were retrospective reports of cohorts of thymoma patients ranging from 61 to 273 individuals. These studies provide Level 3 or 4 evidence.41
Is the WHO Classification Scheme of Thymomas Reproducible Among Different Pathologists?
Review of the literature yielded a single study that evaluated the interobserver reproducibility of the WHO classification of thymomas for the “B” subtypes (B1, B2, B3) using kappa statistics. Its data showed an interobserver agreement (kappa) of only 0.49 within this group.40 The data from the 15 studies providing current ‘best evidence’ demonstrate considerable variability with regard to the proportions of different subtypes of thymomas in various cohorts. For example, the proportion of type A thymomas varies from 5% to 24%, whereas the proportion of type B3 thymomas varies from 6% to 34% (Table 2). Even studies published from a single country such as Japan demonstrate considerable variability with regard to the proportion of thymoma subtypes among different studies. For example, the proportions of thymoma types A, AB, B1, B2, and B3 reported by Okumura et al.35 and Nakagawa et al.34 are, respectively: 7%, 28%, 20%, 36%, and 10% and 14%, 43%, 12%, 22%, and 9% (Table 2). This variability within a similar ethnic group suggests that different pathologists may be using the diagnostic criteria of the WHO classification scheme of thymomas somewhat variably.
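For readers unfamiliar with kappa statistics, the following Python sketch shows how chance-corrected agreement is computed. It assumes the 2-rater form (Cohen kappa); the cited study may have used a different variant, and the diagnoses listed are hypothetical rather than taken from any of the studies reviewed.

```python
from collections import Counter

def cohen_kappa(rater1, rater2):
    """Chance-corrected agreement between two raters over the same cases."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("both raters must label the same non-empty list of cases")
    n = len(rater1)
    # Observed agreement: fraction of cases given the same label by both raters.
    observed = sum(x == y for x, y in zip(rater1, rater2)) / n
    # Expected agreement by chance, from each rater's marginal label frequencies.
    freq1, freq2 = Counter(rater1), Counter(rater2)
    expected = sum(freq1[k] * freq2[k] for k in freq1) / (n * n)
    if expected == 1.0:            # both raters used one identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical diagnoses by two pathologists on the same 10 type B thymomas
pathologist_1 = ["B1", "B1", "B2", "B2", "B2", "B3", "B3", "B1", "B2", "B3"]
pathologist_2 = ["B1", "B2", "B2", "B2", "B3", "B3", "B3", "B1", "B1", "B3"]
kappa = cohen_kappa(pathologist_1, pathologist_2)
print(round(kappa, 2))
```

In this invented example the two pathologists agree on 7 of 10 cases, yet the kappa value is only approximately 0.55 because roughly one-third of that agreement would be expected by chance alone.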
Table 2. Numbers of Cases and Proportions of Thymomas by WHO Type in Different Studies
Are There Significant Survival Differences in Thymoma Patients by WHO Type?
Table 3 and Figure 1 summarize the prognosis of the 2192 thymoma patients reviewed. The majority of studies provided 10-year follow-up, 2 studies provided 5-year follow-up, and 1 study provided an 8-year follow-up. All but 4 of these studies reported that the WHO classification provided statistically significant information. However, none of the studies reported statistical significance between all the thymoma types when individual comparisons among types A through B3 were analyzed. Analysis of the data with meta-analysis was performed using both fixed and random models; both of these demonstrated similar results. There were no statistically significant differences in survival rates noted between patients with thymoma types A, AB, and B1 (Fig. 2). The statistically significant difference (P < .033) noted between thymoma types AB and B1 is difficult to interpret because there were no differences noted between AB and A thymomas or A and B1 thymomas. In contrast, there were statistically significant differences in survival rates noted between patients with thymoma types A, AB, and B1 and either B2 or B3 thymomas. Figures 2 to 4 show the comparison between patients with thymoma types A, AB, and B1 with all other subtypes, respectively, as provided by the software. The left portion of the figures shows the studies analyzed with their corresponding OR, lower and upper limits, z value, and P value. The software includes for analysis only studies that report events. The right portion of the figures shows a forest plot of the data. The squares symbolize the OR for each study. The larger the square, the more weight it contributes to the results of the meta-analysis. The horizontal lines represent the values within the 95% CI. The central vertical line shows an OR of 1. Results with a 95% CI that includes an OR of 1 are not significant at a P value of .05. The bottom of the forest plots shows a diamond summarizing the integrated OR and 95% CIs for all studies in the analysis.
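The relation between the columns listed in the forest plots (OR, lower and upper limits, z value, and P value) can be sketched in Python as follows. The sketch assumes a normal approximation on the log OR scale with symmetric 95% limits; the function names and numeric values are hypothetical and are not taken from the figures.

```python
import math

def z_and_p_from_ci(odds_ratio, lower, upper):
    """Recover the z value and 2-sided P value from an OR and its 95% CI."""
    # On the log scale the 95% CI spans 2 * 1.96 standard errors.
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))   # 2-sided tail area of the normal curve
    return z, p

def ci_excludes_one(lower, upper):
    """True when the 95% CI does not cross the vertical line at OR = 1."""
    return not (lower <= 1.0 <= upper)

# Hypothetical rows of a forest plot: (OR, lower limit, upper limit)
for row in [(3.0, 1.2, 7.5), (1.5, 0.6, 3.75)]:
    z, p = z_and_p_from_ci(*row)
    print(row, "z = %.2f" % z, "P = %.3f" % p, ci_excludes_one(row[1], row[2]))
```

The first hypothetical row has an interval lying entirely above 1 and a P value below .05, whereas the second interval crosses 1 and its P value exceeds .05, which is the reading rule applied to the figures.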
Table 3. Summary of the Prognosis of the 2192 Thymoma Patients Reviewed
SP indicates surviving patients; TNC, total number of cases; % Surviving, percentage of surviving cases; NA, not available.
All subsets of the data were analyzed for data heterogeneity to investigate for the presence of publication bias using funnel plots and various statistics provided by the software. The data show considerable heterogeneity in several categories. For example, the funnel plot of precision by log OR shown in Figure 6 demonstrates a larger number of studies to the left of the vertical line than to the right of this line. Studies with larger weight are shown higher on the graph. Meta-analysis of homogeneous data usually yields plots in which the number of circles is balanced in height and in number on both sides of the vertical line. The software provides several statistical methods with which to evaluate for heterogeneity. For example, the Egger regression intercept test yielded the P values shown in Table 4 for all comparisons of thymoma types. Meta-analysis of data that are not significantly heterogeneous across studies should show P values >.05. The various metrics indicate that our current knowledge regarding the prognosis of patients with various WHO types of thymomas is based on data that are considerably heterogeneous among studies.
Table 4. Evaluation of Heterogeneity Between Study Results*
*Egger regression intercept test.
A and AB
A and B1
A and B2
A and B3
AB and B1
AB and B2
AB and B3
B1 and B2
B1 and B3
B2 and B3
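The Egger regression intercept test named in Table 4 can be sketched in Python as follows. The sketch assumes the usual formulation, in which each study's standardized effect (log OR divided by its standard error) is regressed on its precision (1 divided by the standard error) and the intercept is examined; this test is generally described as a test of funnel plot asymmetry. The function name and the input values are hypothetical, and the exact computation performed by the software may differ.

```python
import math

def egger_intercept(log_ors, std_errors):
    """Regress log OR / SE on 1 / SE; return (intercept, t statistic, df)."""
    n = len(log_ors)
    if n < 3 or n != len(std_errors):
        raise ValueError("at least 3 studies with matching standard errors are needed")
    x = [1 / se for se in std_errors]                       # precision
    y = [lo / se for lo, se in zip(log_ors, std_errors)]    # standardized effect
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance and standard error of the intercept (ordinary least squares).
    residual = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt(residual / (n - 2) * (1 / n + mean_x ** 2 / sxx))
    t = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, t, n - 2

# Hypothetical log ORs and standard errors for 5 studies
intercept, t_stat, df = egger_intercept(
    [0.40, 0.55, 0.90, 1.10, 1.60], [0.20, 0.25, 0.40, 0.50, 1.00]
)
print("intercept %.2f, t %.2f, df %d" % (intercept, t_stat, df))
```

A P value is obtained by referring the t statistic to a t distribution with the returned degrees of freedom; an intercept close to 0 is consistent with a symmetrical funnel plot.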
Are There Significant Survival Differences in Patients With Thymoma by WHO Class Stratified by Stage?
The 15 studies selected as providing ‘best evidence’ provided incomplete information regarding the survival proportions of patients with thymoma by WHO subtype stratified by stage. Unfortunately, the tables or survival curves in these studies provided survival proportions by either of those variables but not both together.
Could the WHO Scheme Be Collapsed Into a Smaller Number of Classes?
As shown in Figures 2 through 4, meta-analysis showed significant differences in survival proportions between patients with thymoma types A/AB/B1, B2, and B3, suggesting that the WHO scheme could be collapsed into these 3 categories.
DISCUSSION
A systematic review of the English literature indicated that only Level 3 or 4 evidence was available to evaluate the clinical value of the WHO classification of thymomas. Level 3 or 4 evidence, reported in retrospective case series, is subject to a variety of biases, including data accrued from patient population groups that are heterogeneous as a result of various referral practices, variability in the interpretation of diagnostic features by different pathologists, the use of different therapeutic modalities, and other technical issues that affect data analysis with statistical methods.19, 20, 41, 42 The publications that provide ‘best current evidence’ include a relatively small number of patients, multiple stages of thymoma, and somewhat heterogeneous treatment. For example, the number of patients with thymoma type B3 ranged from 5 to 51 in the studies reviewed, the number of patients in various tumor stages varied considerably in the different reports, and only some of the patients with stage II disease were treated with radiotherapy. The majority of patients with thymomas are treated with surgery; the use of radiotherapy in stage II disease remains controversial and patients with more advanced disease usually receive chemotherapy.44–45 More significantly, the data in the 15 studies provided limited prognostic information from patients stratified by stage and then by WHO classification. It is unclear how a particular WHO type of thymoma would help clinicians stratify patients into these various therapeutic modalities, by stage.
It is difficult to derive evidence-based guidelines regarding the use of the WHO classification of thymomas based on current information. Meta-analysis suggests that the classification scheme of thymomas could be simplified into 3 types with significant prognostic value: A/AB/B1, B2, and B3. This schema would not apply to thymic carcinomas, which are neoplasms with a more aggressive malignant potential.12, 46–51 However, until further data are available regarding potential interobserver variability in the use of diagnostic criteria and prognostic/predictive information stratified by tumor stage, it is difficult to recommend a definitive classification of thymomas that provides significant prognostic and predictive value for patients with the disease. We believe the data strongly suggest that the WHO schema needs to be updated to address these various questions.
Review of available information regarding thymomas from an evidence-based pathology standpoint raises interesting questions concerning how to best study a relatively unusual neoplasm with low malignant potential. It is very difficult to organize randomized clinical trials for patients with infrequent neoplasms that require follow-up periods of >10 years. This practical problem underscores the need to provide more detailed information in publications of case series, perhaps with tables that list various details for all patients such as stage of disease, specific treatment, length of follow-up, and other data. Comprehensive data accrued from different studies could then be analyzed with meta-analysis tools such as those used in our review to answer some of the questions we pose. An alternative approach could include the organization of a national or international registry of thymoma patients, perhaps with a data repository and tissue bank to collect information from the large number of patients who are needed to identify clinically valuable prognostic and predictive features and useful therapeutic modalities. As a first step in the direction of an international cooperative study of thymomas with meta-analysis, we hope that the authors of the studies reviewed herein will be willing to share their survival data by WHO thymoma type and stage with our laboratory to derive more useful clinical information in the near future.
We thank the Foundation for Thymic Cancer Research for sponsoring a symposium in New York City in March 2007 to stimulate a multidisciplinary review of thymic neoplasms.