A metaanalysis of 18F-2-deoxy-2-fluoro-D-glucose positron emission tomography in the staging and restaging of patients with lymphoma

Authors

  • Carmen R. Isasi M.D., Ph.D.,

    Corresponding author
    1. Department of Nuclear Medicine, Albert Einstein College of Medicine, Yeshiva University, and the Montefiore Medical Center, Bronx, New York
    2. Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Yeshiva University, Bronx, New York
    • Department of Nuclear Medicine, Montefiore Medical Center, Montefiore Medical Park, 1695 A Eastchester Road, Bronx, NY 10025
    • Fax: (718) 405-8457

  • Ping Lu M.D.,

    1. Department of Nuclear Medicine, Albert Einstein College of Medicine, Yeshiva University, and the Montefiore Medical Center, Bronx, New York
  • M. Donald Blaufox M.D., Ph.D.

    1. Department of Nuclear Medicine, Albert Einstein College of Medicine, Yeshiva University, and the Montefiore Medical Center, Bronx, New York

Abstract

BACKGROUND

In recent years, the use of positron emission tomography (PET) has become widespread for the staging and follow-up of several malignancies. In the current study, the authors conducted a metaanalysis of the published literature to evaluate the diagnostic performance of 18F-2-deoxy-2-fluoro-D-glucose PET (FDG-PET) in the staging of patients with lymphoma.

METHODS

The authors conducted a systematic MEDLINE search of articles published between January 1995 and June 2004. Studies that evaluated FDG-PET with a dedicated camera and that reported sufficient data to permit the calculation of sensitivity and specificity were included in the analysis. Two reviewers independently reviewed the eligibility of the studies and abstracted data (sample population; characteristics of FDG-PET; and the number of true-positive results, true-negative results, false-positive results, and false-negative results). The authors estimated the pooled sensitivity, false-positive rate, and maximum joint sensitivity and specificity.

RESULTS

Twenty studies were eligible for the metaanalysis. Fourteen studies included patient-based data, comprising a sample size of 854 subjects, and 7 studies included lesion-based data, totaling 3658 lesions. Among those studies with patient-based data, the median sensitivity was 90.3% and the median specificity was 91.1%. The pooled sensitivity was 90.9% (95% confidence interval [95% CI], 88.0–93.4) and the pooled false-positive rate was 10.3% (95% CI, 7.4–13.8). The maximum joint sensitivity and specificity was 87.8% (95% CI, 85.0–90.7). The pooled sensitivity and false-positive rate appeared to be higher in patients with Hodgkin disease compared with those with non-Hodgkin lymphoma.

CONCLUSIONS

The results of the current study indicate that FDG-PET is a valuable tool for the staging and restaging of patients with lymphoma, showing high sensitivity and specificity. Clinicians may consider adding FDG-PET to the staging workup of patients with lymphoma. Cancer 2005. © 2005 American Cancer Society.

In the U.S., lymphoma is a common cancer. Non-Hodgkin lymphoma is 1 of the 10 leading cancers diagnosed in the U.S., and in general is reported to have a worse prognosis than Hodgkin disease. The survival rates of Hodgkin disease and non-Hodgkin lymphoma reportedly vary widely by cell type and stage of disease.1 Improvements in staging methods and the monitoring of patients can significantly improve the prognosis of patients with lymphoma. In recent years, the use of positron emission tomography (PET) has become widespread for the staging and follow-up of several malignancies.2, 3 The use of 18F-2-deoxy-2-fluoro-D-glucose PET (FDG-PET) in the staging of lymphoma patients offers advantages over other conventional imaging techniques. FDG-PET provides information regarding the metabolic activity of tumors that can complement the anatomic information provided by other imaging methods and, because FDG-PET can survey the entire body in a single scan, it can be particularly useful for determining the extent of the disease.

Recent studies have indicated that FDG-PET is an accurate method for the staging of patients with lymphoma.4–7 In addition, a study evaluating the impact of FDG-PET on the staging and management of patients from the clinician's perspective showed that PET results led to changes in the clinical stage in 44% of patients8; 21% of the patients were upstaged and 23% were downstaged. Furthermore, changes between treatment modalities (i.e., from surgery to radiation therapy) were reported in 42% of the patients. The purpose of the current study was to conduct a systematic review of the published literature to evaluate the diagnostic accuracy of FDG-PET in the staging of lymphoma, to address whether the diagnostic accuracy is similar for Hodgkin disease and non-Hodgkin lymphoma, and to evaluate the impact of FDG-PET on patient management.

MATERIALS AND METHODS

Data Sources and Eligibility

Published studies of the accuracy of FDG-PET in the staging of lymphoma were identified by systematic searches of MEDLINE, supplemented by a manual search of the references listed in original and review articles. Searches included the following keywords: lymphoma OR Hodgkin disease OR non-Hodgkin's lymphoma, staging, positron emission tomography, sensitivity, specificity, diagnostic accuracy, and test performance. Searches were limited to the period between 1995 and June 2004, and were performed with the assistance of a professional librarian.

Eligibility criteria included the use of FDG, the use of a dedicated PET camera, and sufficient data to allow for the calculation of sensitivity and specificity. Studies were not excluded based on sample size or the language of the publication. Because the validity of the individual studies may affect the interpretation of a metaanalysis, we adapted the criteria for study quality reported by Gould et al.9 and the Society of Nuclear Medicine Guidelines for performing FDG-PET studies.10 The criteria for assessing study quality are listed in Table 1.

Table 1. Criteria for Assessing Study Quality
  1. FDG-PET: 18F-2-deoxy-2-fluoro-D-glucose positron emission tomography; mCi: millicuries.

Technical quality of FDG-PET
 Spatial resolution < 11 mm
 FDG uptake period ≥ 30 min
 FDG dose ≥ 10 mCi
 Acquisition time for emission scan specified
 Attenuation correction performed
 Participants with hyperglycemia excluded
 Participants studied in the fasting state
 Positive test results defined according to specific criteria
Technical quality and application of the reference test or tests
 Description of reference standard
Independence of test interpretation
 FDG-PET readers blinded to the results of the reference test or tests
Clinical characteristics of the study sample described
 Age, gender, and number of patients enrolled, reason for performing PET
Cohort assembly
 Participants enrolled prospectively
 Individual patient used as unit of data analysis

Data Extraction

Information extracted from each study included authors; year of publication; sample size; age of subjects; reference standard; unit of analysis (patients or lesions); technical characteristics of PET; use of attenuation correction; method of image interpretation (qualitative or quantitative); and the number of true-positive results, false-positive results, true-negative results, and false-negative results. Data were extracted by two of the investigators (P.L. and C.I.) and any differences were resolved by consensus. Data abstraction was not blinded to authors, institution, or source of publication.

Statistical Analysis

We calculated the true-positive rate (sensitivity) and the false-positive rate (1 − specificity) for each study. In addition, we estimated the summary (pooled) true-positive rate and false-positive rate. Summary receiver operating characteristic (sROC) curves were computed using random effects methods.10 We calculated the maximum joint sensitivity and specificity, Q*, as a global measure of diagnostic accuracy (the point on the sROC curve at which the sensitivity and specificity are equal).11, 12 The maximum joint sensitivity and specificity has a similar interpretation to the area under the ROC curve, and its values range from 0.5 (no diagnostic value) to 1.0 (perfect test). The presence of heterogeneity between studies was examined using the chi-square test. Several subgroup analyses were performed to explore the presence of heterogeneity: the use of attenuation correction, visual interpretation of scans, whole-body scans, and blinding, among other characteristics. The effect of study characteristics on parameters of diagnostic accuracy was evaluated using regression methods. Furthermore, because studies reporting only lesions as the unit of analysis may bias the estimates of diagnostic accuracy, we conducted separate analyses for studies with patient-based data and those with lesion-based data. The presence of publication bias was evaluated with a funnel plot and the Begg test. All data analyses were performed using Stata software (StataCorp, College Station, TX).13
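As a rough illustration of the quantities defined above, the following Python sketch computes per-study and pooled rates from 2 × 2 counts and estimates Q* with the unweighted Moses-Littenberg regression. This is an assumption made for illustration only: the analysis described here used random effects methods in Stata, and the function names, the 0.5 continuity correction, and the example counts are hypothetical.

```python
import math

def rates(tp, fp, tn, fn):
    """Return (sensitivity, false-positive rate) for one study's 2 x 2 counts."""
    return tp / (tp + fn), fp / (fp + tn)

def pooled_rates(studies):
    """Pool by summing counts across studies (a simple fixed-effect style pooling)."""
    tp = sum(s[0] for s in studies)
    fp = sum(s[1] for s in studies)
    tn = sum(s[2] for s in studies)
    fn = sum(s[3] for s in studies)
    return rates(tp, fp, tn, fn)

def logit(p):
    return math.log(p / (1.0 - p))

def q_star(studies):
    """Unweighted Moses-Littenberg sROC: fit D = a + b*S, then Q* = 1 / (1 + exp(-a/2))."""
    d_vals, s_vals = [], []
    for tp, fp, tn, fn in studies:
        # A 0.5 continuity correction keeps the logits finite when a cell is zero.
        tpr = (tp + 0.5) / (tp + fn + 1.0)
        fpr = (fp + 0.5) / (fp + tn + 1.0)
        d_vals.append(logit(tpr) - logit(fpr))  # log diagnostic odds ratio
        s_vals.append(logit(tpr) + logit(fpr))  # proxy for the positivity threshold
    n = len(studies)
    s_mean = sum(s_vals) / n
    d_mean = sum(d_vals) / n
    sxx = sum((s - s_mean) ** 2 for s in s_vals)
    sxy = sum((s - s_mean) * (d - d_mean) for s, d in zip(s_vals, d_vals))
    slope = sxy / sxx if sxx > 1e-12 else 0.0
    intercept = d_mean - slope * s_mean
    # At Q*, sensitivity equals specificity, so S = 0 and D = intercept.
    return 1.0 / (1.0 + math.exp(-intercept / 2.0))

# Hypothetical (TP, FP, TN, FN) counts, not taken from any included study.
example_studies = [(45, 5, 45, 5), (40, 10, 35, 15), (30, 2, 50, 8)]
pooled_tpr, pooled_fpr = pooled_rates(example_studies)
print(f"pooled TPR = {pooled_tpr:.3f}, pooled FPR = {pooled_fpr:.3f}")
print(f"Q* = {q_star(example_studies):.3f}")
```

Summing counts weights each study by its size and ignores between-study variation, which is why a random effects approach may give somewhat different pooled values.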

RESULTS

Eligibility

The literature search yielded 47 potentially relevant studies, 27 of which were excluded. The reasons for exclusion were insufficient data (n = 15 studies),14–28 use of a coincident gamma camera (n = 5),29–33 and no evaluation of staging performed (n = 7).34–40 Twenty studies were eligible for inclusion in the metaanalysis. Of these, 13 studies reported patient-based data, 6 studies reported lesion-based data, and 1 study reported both. For the metaanalysis, the studies with patients as the unit of analysis comprised a total sample size of 854 subjects. The studies with lesions as the unit of analysis totaled 3658 lesions.

Study Description

The sample size of the 20 studies ranged from 15–93 subjects (median, 50.5 subjects) (Table 2). The age of the subjects ranged from 7–90 years. The percentage of male participants in these studies ranged from 44.6–67.8% (median, 55.6%). Five of the studies included only patients with Hodgkin disease, 3 studies enrolled only patients with non-Hodgkin lymphoma, and 12 studies included patients with both Hodgkin disease and non-Hodgkin lymphoma. Among the studies including both Hodgkin disease and non-Hodgkin lymphoma patients, the percentage of Hodgkin disease patients included in the sample ranged from 6.5–70% (median, 47%). Among the studies including non-Hodgkin lymphoma patients, 13 reported the histological grade: 6 studies included patients with low-grade, intermediate-grade, and high-grade lymphoma; 6 studies included patients with low-grade and high-grade lymphoma; and 1 study included only patients with low-grade lymphoma. Seven studies reported that FDG-PET was performed as part of the staging workup. One study had histologic results as the reference standard, 6 studies used clinical follow-up, and 13 studies used both histologic results and clinical follow-up as the reference standard. Among the studies using clinical follow-up, 10 indicated the follow-up period, which ranged from 3–72 months. Thirteen of the 20 studies compared FDG-PET with other imaging methods (computed tomography [CT] in 4 studies, gallium-67 in 2 studies, 11C-methionine PET in 1 study, PET/CT in 1 study, CT plus ultrasound in 1 study, bone scan in 1 study, and “conventional imaging” in 3 studies).

Table 2. Characteristics of Studies Evaluating FDG-PET for the Staging of Patients with Lymphoma (January 1995–June 2004)

| Reference | Camera model | Attenuation correction | PET interpretation | Reference standard | Mean age (median age) (yrs) [range] |
| --- | --- | --- | --- | --- | --- |
| Patient-based data | | | | | |
| Rodriguez et al., 1997 (ref. 47) | GE 4096 | Yes | NA | Pathology and clinical follow-up | 66.3 [48–81] |
| Bangerter et al., 1998 (ref. 41) | ECAT 931/08/12, ECAT EXACT HR+ | NA | Visual | Clinical follow-up | (29) [10–73] |
| Cremerius et al., 1998 (ref. 58) | ECAT 953/15 | Yes | Visual | Pathology and clinical follow-up | 43 [18–67] |
| Moog et al., 1998 (ref. 59) | ECAT 931/08/12, ECAT EXACT HR+ | Yes | Visual | Pathology | (37.6) [7–73] |
| Stumpe et al., 1998 (ref. 44) | GE Advance | No | Visual | Clinical follow-up | NA [17–88] |
| Bangerter et al., 1999 (ref. 60) | ECAT 931/08/12, ECAT EXACT HR+ | Yes | Visual | Clinical follow-up | (41.6) [14–74] |
| Moog et al., 1999 (ref. 42) | ECAT 931/08/12, ECAT EXACT HR+ | Yes | Visual | Pathology and clinical follow-up | 36 [13–72] |
| Hüeltenschmidt et al., 2001 (ref. 46) | ECAT EXACT 47 | Yes | Visual | Pathology and clinical follow-up | 38.1 [NA] |
| Lang et al., 2001 (ref. 43) | ECAT EXACT 47 | Yes | Visual | Pathology and clinical follow-up | 41.5 |
| Blum et al., 2003 (ref. 48) | GE Quest 300H | Yes | Visual | Clinical follow-up | (55) |
| Filmont et al., 2003 (ref. 6) | ECAT EXACT HR+ | Yes | Visual | Clinical follow-up | (57) |
| Mikosch et al., 2003 (ref. 45) | ECAT ART | Yes | Visual | Pathology and clinical follow-up | NA |
| Freudenberg et al., 2004 (ref. 4) | Siemens dual modality PET/CT | Yes | SUV | Pathology and clinical follow-up | 46 [19–70] |
| Naumann et al., 2004 (ref. 5) | ECAT EXACT HR+ | Yes | SUV | Clinical follow-up | (34) [17–83] |
| Lesion-based data^a | | | | | |
| Bangerter et al., 1998 (ref. 41) | ECAT 931/08/12, ECAT EXACT HR+ | NA | Visual | Clinical follow-up | (29) [10–73] |
| Moog et al., 1998 (ref. 61) | CTI EXACT 931/08/12 | Yes | Visual | Pathology and clinical follow-up | 38.4 [7–72] |
| Sutinen et al., 2000 (ref. 62) | GE Advance | No | Visual | Pathology and clinical follow-up | (55) [26–81] |
| Buchmann et al., 2001 (ref. 63) | ECAT EXACT 931/31, ECAT EXACT HR+ | Yes | Visual | Pathology and clinical follow-up | 41.2 [14–76] |
| Menzel et al., 2002 (ref. 64) | ECAT EXACT 47 | NA | NA | Pathology and clinical follow-up | 39 [17–65] |
| Sasaki et al., 2002 (ref. 65) | ECAT EXACT HR+ | No | Visual | Clinical follow-up | 60.4 [23–90] |
| Hong et al., 2003 (ref. 7) | GE Advance | Yes | Visual | Pathology and clinical follow-up | 48.9 [20–81] |

FDG-PET: 18F-2-deoxy-2-fluoro-D-glucose positron emission tomography; NA: not available; CT: computed tomography; SUV: standardized uptake value.
^a One study (Bangerter et al.41) reported patient-based and lesion-based data, and therefore it is listed twice in the table.

Study Quality

Among the 20 eligible studies, 7 were performed prospectively. Fasting was reported in 19 studies, and the fasting period was reported in 18 of these 19 studies. In 13 of the studies the fasting period was 6 hours or less, and was 4 hours or less in 5 studies. Eight studies reported measuring glucose levels prior to PET, and seven of these studies indicated that hyperglycemic patients were excluded. The dose of FDG and the uptake period were reported in 17 of the 20 studies. The reported uptake period ranged from 15–90 minutes. The acquisition time was reported in 16 of the 20 eligible studies. Fifteen studies reported using attenuation correction, and in 11 of these studies attenuation was performed in the entire sample population. The method of image reconstruction was reported in all studies but 1, with 11 studies reporting the use of an iterative reconstruction method, 5 studies used a filtered-back projection method, and 4 studies reported using both methods.

Readers were blinded to the results of the reference standard in 12 of the 20 studies, and 4 studies did not specify whether readers were blinded. The definition of a positive PET scan was clearly stated in 17 studies. Scans were interpreted visually in 15 of the 20 eligible studies.

Ten studies reported that the PET findings led to changes in the staging of patients. The percentage of patients who were upstaged ranged from 7.7–17.4% (median, 13.2%), and the percentage of patients who were downstaged ranged from 2.3–23.4% (median, 7.5%). Six of the 20 eligible studies reported changes in patient management as a result of PET findings.

Diagnostic Accuracy of FDG-PET

Among the studies with patient-based data, the median sensitivity was 90.3% (range, 70.6–100%) and the median specificity was 91.1% (range, 50–100%) (Table 3). The summary (pooled) true-positive rate (sensitivity) was 90.9% (Table 4) and the summary false-positive rate was 10.3%. The maximum joint sensitivity and specificity, a global measure of diagnostic accuracy, was 87.8%. The test for homogeneity indicated the absence of statistical heterogeneity. One of the studies reported a very low specificity (50%)41 and another study demonstrated low sensitivity (70.6%).42 The study with the lowest specificity included only patients with Hodgkin disease, and found two true-negative results and two false-positive results. The study with low sensitivity evaluated PET in the detection of bone marrow involvement only, and found 12 true-positive results and 5 false-negative results. When these 2 studies were excluded from the analysis, the pooled sensitivity and the maximum joint sensitivity and specificity increased to 91.8% and 89.6%, respectively, whereas the false-positive rate decreased to 9.5%. Figure 1 presents the sROC after the exclusion of these outliers.
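The descriptive figures quoted above can be checked against the patient-based values in Table 3. The short Python sketch below is illustrative only; the variable names are hypothetical, and the tuples are transcribed from Table 3 as (study, number of patients, TPR %, FPR %). It recomputes the total sample size, the median sensitivity, and the sensitivity of the bone marrow study from its reported counts.

```python
from statistics import median

# (study, patients, TPR %, FPR %) transcribed from Table 3 (patient-based data)
table3 = [
    ("Rodriguez 1997", 15, 87.5, 14.3),
    ("Bangerter 1998", 44, 90.0, 50.0),
    ("Cremerius 1998", 27, 100.0, 8.3),
    ("Moog 1998", 78, 78.9, 3.4),
    ("Stumpe 1998", 71, 86.5, 2.9),
    ("Bangerter 1999", 89, 97.9, 0.0),
    ("Moog 1999", 56, 70.6, 12.8),
    ("Hueltenschmidt 2001", 48, 94.7, 10.3),
    ("Lang 2001", 65, 93.3, 14.3),
    ("Blum 2003", 47, 97.6, 0.0),
    ("Filmont 2003", 78, 87.0, 6.3),
    ("Mikosch 2003", 121, 90.6, 19.1),
    ("Freudenberg 2004", 27, 85.7, 0.0),
    ("Naumann 2004", 88, 93.1, 0.0),
]

total_patients = sum(n for _, n, _, _ in table3)
median_tpr = median(tpr for _, _, tpr, _ in table3)

# Sensitivity of the bone marrow study: 12 true positives, 5 false negatives.
bone_marrow_tpr = 100 * 12 / (12 + 5)

print(f"total patients: {total_patients}")
print(f"median sensitivity: {median_tpr:.1f}%")
print(f"bone marrow study sensitivity: {bone_marrow_tpr:.1f}%")
```

With 14 studies, the median is the mean of the seventh and eighth ordered values (90.0% and 90.6%).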

Table 3. True-Positive Rate and False-Positive Rate of FDG-PET in the Staging of Patients with Lymphoma: Patient-Based Data

| Reference | No. of patients | TPR (%) (sensitivity) | FPR (%) (1 − specificity) |
| --- | --- | --- | --- |
| Rodriguez et al., 1997 (ref. 47) | 15 | 87.5 | 14.3 |
| Bangerter et al., 1998 (ref. 41) | 44 | 90.0 | 50.0 |
| Cremerius et al., 1998 (ref. 58) | 27 | 100.0 | 8.3 |
| Moog et al., 1998 (ref. 59) | 78 | 78.9 | 3.4 |
| Stumpe et al., 1998 (ref. 44) | 71 | 86.5 | 2.9 |
| Bangerter et al., 1999 (ref. 60) | 89 | 97.9 | 0 |
| Moog et al., 1999 (ref. 42) | 56 | 70.6 | 12.8 |
| Hüeltenschmidt et al., 2001 (ref. 46) | 48 | 94.7 | 10.3 |
| Lang et al., 2001 (ref. 43) | 65 | 93.3 | 14.3 |
| Blum et al., 2003 (ref. 48) | 47 | 97.6 | 0 |
| Filmont et al., 2003 (ref. 6) | 78 | 87.0 | 6.3 |
| Mikosch et al., 2003 (ref. 45) | 121 | 90.6 | 19.1 |
| Freudenberg et al., 2004 (ref. 4) | 27 | 85.7 | 0 |
| Naumann et al., 2004 (ref. 5) | 88 | 93.1 | 0 |

FDG-PET: 18F-2-deoxy-2-fluoro-D-glucose positron emission tomography; TPR: true-positive rate; FPR: false-positive rate.
Table 4. Summary True-Positive Rate, False-Positive Rate, and Maximum Joint Sensitivity and Specificity of FDG-PET in the Staging of Patients with Lymphoma (January 1995–June 2004)

| | No. of studies | TPR (95% CI) | FPR (95% CI) | Maximum joint sensitivity and specificity (95% CI) |
| --- | --- | --- | --- | --- |
| Patient-based data | | | | |
| All | 14 | 90.9 (88.0–93.4) | 10.3 (7.4–13.8) | 87.8 (85.0–90.7) |
| Excluding studies with lowest sensitivity and lowest specificity | 12 | 91.8 (88.8–94.3) | 9.5 (6.6–13.1) | 89.6 (87.5–91.6) |
| Hodgkin disease | 6 | 92.6 (88.4–95.6) | 13.4 (8.0–20.6) | 89.4 (84.5–94.3) |
| Non-Hodgkin lymphoma | 5 | 89.4 (82.8–94.1) | 11.4 (5.6–19.9) | 85.0 (78.2–82.0) |
| Lesion-based data | | | | |
| All | 7 | 95.6 (93.9–97.0) | 1.0 (0.6–1.3) | 95.6 (93.1–98.1) |
| Excluding study with lowest specificity | 6 | 95.1 (93.0–96.7) | 1.0 (0.5–1.3) | 95.8 (92.0–99.6) |

FDG-PET: 18F-2-deoxy-2-fluoro-D-glucose positron emission tomography; TPR: true-positive rate; 95% CI: 95% confidence interval; FPR: false-positive rate.

Figure 1.

Summary receiver operating characteristic (sROC) curve of 18F-2-deoxy-2-fluoro-D-glucose positron emission tomography (FDG-PET) in the staging of lymphoma patients (n = 12 studies). 1/var: inverse of variance; AUC: area under the curve; Q*: maximum joint sensitivity and specificity, calculated as a global measure of diagnostic accuracy (the point on the summary receiver operating characteristic curve at which the sensitivity and specificity are equal).

Nine studies provided enough information to conduct a separate metaanalysis for Hodgkin disease5, 41, 43, 44, 45, 46 and non-Hodgkin lymphoma patients6, 44, 45, 47, 48 (Table 4). Among patients with Hodgkin disease, the median sensitivity and specificity were 93.2% (range, 85.7–100%) and 87.7% (range, 50–100%), respectively. The pooled sensitivity and false-positive rate were 92.6% and 13.4%, respectively, and the maximum joint sensitivity and specificity was 89.4%. Among patients with non-Hodgkin lymphoma, the median sensitivity and specificity were 87.5% (range, 81.5–97.6%) and 93.8% (range, 80–100%), respectively. The pooled sensitivity and false-positive rate were 89.4% and 11.4%, respectively, and the maximum joint sensitivity and specificity was 85.0%.

Among the studies with lesion-based data, the median sensitivity was 96.6% (range, 92.1–99.3%) and the median specificity was 99.1% (range, 33.3–100%) (Table 5). The pooled estimates of the true-positive rate and the false-positive rate were 95.6% and 1%, respectively. The maximum joint sensitivity and specificity was 95.6% (Table 4). The test of homogeneity indicated the presence of statistical heterogeneity. One of the studies was found to have very low specificity41; when this study was excluded from the analysis, the pooled estimates of diagnostic accuracy did not vary, but the heterogeneity disappeared.

Table 5. True-Positive Rate and False-Positive Rate of FDG-PET in the Staging of Patients with Lymphoma: Lesion-Based Data

| Reference | No. of lesions | TPR (%) (sensitivity) | FPR (%) (1 − specificity) |
| --- | --- | --- | --- |
| Bangerter et al., 1998 (ref. 41) | 139 | 97.8 | 66.7 |
| Moog et al., 1998 (ref. 61) | 58 | 96.7 | 0 |
| Sutinen et al., 2000 (ref. 62) | 173 | 96.6 | 0.09 |
| Buchmann et al., 2001 (ref. 63) | 1,156 | 99.3 | 0.03 |
| Menzel et al., 2002 (ref. 64) | 392 | 94.9 | 4.5 |
| Sasaki et al., 2002 (ref. 65) | 874 | 92.1 | 1.0 |
| Hong et al., 2003 (ref. 7) | 866 | 92.4 | 0 |

FDG-PET: 18F-2-deoxy-2-fluoro-D-glucose positron emission tomography; TPR: true-positive rate; FPR: false-positive rate.

The effect of the study design characteristics on the parameters of diagnostic accuracy was explored in subgroup analyses and by regression methods. The pooled estimates of the true-positive rate and the false-positive rate were found to be higher when the readers were not blinded and when the studies were conducted retrospectively (Table 6). In addition, the false-positive rate was higher when PET was performed as part of the staging workup compared with studies in which patients underwent PET scans because of equivocal findings. Using regression methods, the method of scan interpretation (visual vs. nonvisual), blinding, and the reason for patient referral to PET scanning were found to be significant predictors of the true-positive rate. Blinding and the reason for PET referral were found to be significant predictors of the false-positive rate (Table 7). The funnel plot and the Begg test did not indicate the presence of publication bias.

Table 6. Summary True-Positive Rate and False-Positive Rate by Study Characteristic in Patient-Based Data^a

| Characteristic | Summary TPR (%) (95% CI) | Summary FPR (%) (95% CI) |
| --- | --- | --- |
| Fasting (n^b = 11) | 92.1 (89.0–94.5) | 9.9 (6.9–13.7) |
| Whole body (n = 11) | 91.1 (87.7–93.8) | 9.5 (6.4–13.4) |
| Attenuation correction (n = 11) | 92.4 (89.2–94.8) | 10.2 (7.1–14.2) |
| Visual interpretation (n = 8) | 92.2 (88.1–95.1) | 7.3 (4.4–11.2) |
| Other than visual interpretation (n = 4) | 91.4 (85.9–95.2) | 15.7 (8.9–25.0) |
| Readers blinded (n = 6) | 91.3 (85.8–95.1) | 6.4 (3.4–10.9) |
| Readers not blinded (n = 3) | 93.6 (87.8–97.2) | 16.7 (10.2–25.1) |
| Prospective studies (n = 2) | 90.6 (83.3–95.4) | 3.3 (0.04–11.5) |
| Retrospective studies (n = 10) | 92.3 (88.7–95.0) | 10.8 (7.4–15.1) |
| Clinical follow-up only (n = 5) | 92.7 (88.8–95.5) | 6.1 (2.5–12.2) |
| Pathology and clinical follow-up (n = 6) | 92.1 (86.3–96.0) | 14.0 (9.1–20.3) |
| PET done as part of staging work-up (n = 4) | 92.0 (86.7–95.7) | 13.4 (8.3–20.1) |
| PET done because of equivocal findings or other reasons (n = 8) | 91.8 (87.7–94.8) | 6.7 (3.6–11.1) |

TPR: true-positive rate; 95% CI: 95% confidence interval; FPR: false-positive rate; PET: positron emission tomography.
^a Outliers were excluded.
^b n is the number of studies.
Table 7. Predictors of the True-Positive Rate and the False-Positive Rate in Patient-Based Data

| | B-coefficient | P value |
| --- | --- | --- |
| True-positive rate (sensitivity) | | |
| Study design (prospective vs. retrospective) | −0.27 | 0.007 |
| Interpretation of scans (visual vs. non-visual) | 0.12 | 0.012 |
| Reference standard (clinical follow-up only vs. pathology plus clinical follow-up)^a | 0.002 | NS^b |
| Blinding (readers blinded vs. no blinding) | 0.20 | 0.040 |
| PET referral (PET as part of staging work-up vs. equivocal findings/other reasons) | 0.19 | 0.047 |
| False-positive rate (1 − specificity) | | |
| Study design (prospective vs. retrospective) | 0.03 | NS |
| Interpretation of scans (visual vs. non-visual) | 0.07 | NS |
| Reference standard (clinical follow-up only vs. pathology plus clinical follow-up)^a | −0.08 | NS |
| Blinding (readers blinded vs. no blinding) | 0.44 | 0.026 |
| PET referral (PET as part of staging work-up vs. equivocal findings/other reasons) | 0.45 | 0.026 |

NS: not significant; PET: positron emission tomography.
^a Only one study used pathology alone as the reference standard.
^b A P value of > 0.05 was considered nonsignificant.

DISCUSSION

The results of this metaanalysis indicate that FDG-PET has a high diagnostic accuracy for the staging and restaging of lymphoma patients. The summary true-positive rate (sensitivity) was 91% and the summary false-positive rate was 10% using a patient-based analysis, whereas the joint maximum sensitivity and specificity was 88%. The summary true-positive rate and summary false-positive rate were found to be slightly higher in patients with Hodgkin disease compared with patients with non-Hodgkin lymphoma. Despite the clinical relevance of the number of patients whose FDG-PET results led to changes in the staging of disease and its management, not all the studies addressed this finding. Among the studies reporting changes in staging, 8–17% of the patients were upstaged, whereas 2–23% were downstaged. Changes in management were reported in 30% of the studies. Similar to what has been reported previously in the literature,9 the pooled parameters of diagnostic accuracy were higher in the lesion-based analysis than in the patient-based analysis.

Treatment decisions in patients with lymphoma are made based on the stage of the disease; patients with Stage I or II disease (according to the Ann Arbor/Cotswold classification) would receive short courses of chemotherapy whereas patients with more advanced stage disease would be eligible for more extended courses of chemotherapy and radiotherapy.49, 50 Therefore, the accuracy of staging procedures is critical for such decisions. In this way, FDG-PET represents a valuable addition to the staging procedures already available. To our knowledge few studies have been published to date comparing FDG-PET with other imaging or staging procedures with which to perform a concurrent metaanalysis. However, these studies suggest a relative advantage for FDG-PET over other imaging modalities. Stumpe et al.44 compared FDG-PET to CT for the staging and restaging of patients with Hodgkin disease and non-Hodgkin lymphoma. In the study by Stumpe et al., FDG-PET was found to have a higher sensitivity and specificity than CT in both Hodgkin disease and non-Hodgkin lymphoma patients; the overall accuracy for FDG-PET was 94% in patients with Hodgkin disease and non-Hodgkin lymphoma, whereas the overall accuracy of CT was 60% in patients with Hodgkin disease and 73% in patients with non-Hodgkin lymphoma. Similarly, Freudenberg et al.4 reported a higher sensitivity and specificity for FDG-PET when compared with CT in the restaging of lymphoma patients. The accuracy of FDG-PET was reported to be 95% and the accuracy for CT was 84%. Their study also evaluated the diagnostic accuracy of FDG-PET in combination with CT, and found an improvement in sensitivity and specificity over each imaging modality alone.

There are several potential limitations to conducting a metaanalysis of diagnostic tests. The presence of clinical heterogeneity (heterogeneity arising from the inclusion of patients at different stages of disease and with other differing clinical characteristics) affects the generalizability of the results51, 52 and it is not necessarily ruled out by the lack of statistical heterogeneity.53 It is important to note that the majority of the studies included a mix of patients with Hodgkin disease and non-Hodgkin lymphoma of different cell types. Furthermore, due to the nature of this disease, biopsy results were available in only a few studies; the majority had to rely on clinical follow-up, including a variety of imaging modalities and clinical examinations, not all of which were performed in the same manner in all the studies. The use of an imperfect reference standard, together with variability in the quality of the primary studies, introduces important limitations for the interpretation of this metaanalysis. In addition, the verification bias potentially present in the primary studies cannot be fully addressed in a metaanalysis. Nevertheless, despite these limitations, metaanalytic techniques have been very useful for demonstrating the significant role of FDG-PET imaging in the diagnosis and staging of several malignancies.54–57

The results of this metaanalysis suggest that the diagnostic accuracy of FDG-PET may be higher in patients with Hodgkin disease than in those with non-Hodgkin lymphoma. The summary sensitivity and the joint maximum sensitivity and specificity were found to be higher among patients with Hodgkin disease; however, the false-positive rate also was higher in this group compared with non-Hodgkin lymphoma patients. These findings should be interpreted with caution because they are based on a small number of studies. Conversely, non-Hodgkin lymphoma is a highly heterogeneous disease that includes a large series of different entities; therefore, the diagnostic accuracy of FDG-PET may differ within the group of patients with non-Hodgkin lymphoma. Unfortunately, there was not enough information available in the primary studies to address this issue in the current metaanalysis. In recent years, there has been an increased trend toward the use of PET/CT as a routine procedure, and it has been suggested that this practice will improve the sensitivity and specificity of PET. However, the use of PET/CT was not addressed in the current metaanalysis because of a lack of currently available data.

The results of the current metaanalysis demonstrated that FDG-PET is a very accurate imaging modality for the staging and restaging of patients with lymphoma, with a high sensitivity and high specificity reported. Clinicians should consider adding FDG-PET to the routine staging workup of patients with lymphoma.

Acknowledgements

The authors thank Ms. Karen Sorensen for her assistance with the MEDLINE searches.
