Lymphatic mapping and sentinel lymph node biopsy in early-stage breast carcinoma

A metaanalysis

Abstract

BACKGROUND

Lymphatic mapping with sentinel lymph node biopsy has the potential for reducing the morbidity associated with breast carcinoma staging. It has become a widely used technology despite limited data from controlled clinical trials.

METHODS

A systematic review of the world's literature of sentinel lymph node (SLN) biopsy in patients with early-stage breast carcinoma was undertaken by using electronic and hand searching techniques. Only studies that incorporated full axillary lymph node dissection (ALND), regardless of SLN results, were included. Individual study results along with weighted summary measures were estimated using the Mantel–Haenszel method. The correlations of outcomes with the study size, the proportion of positive lymph nodes, the technique used, and the study quality were evaluated.

RESULTS

Between 1970 and 2003, 69 trials were reported that met eligibility criteria. Of the 8059 patients who were studied, 7765 patients (96%) had successfully mapped SLNs. The proportion of patients who had successfully mapped SLNs ranged from 41% to 100%, with > 50% of studies reporting a rate < 90%. Lymph node involvement was found in 3132 patients (42%) and ranged from 17% to 74% across studies. The false-negative rate (FNR) ranged from 0% to 29%, averaging 7.3% overall. Eleven trials (15.9%) reported an FNR of 0.0, whereas 26 trials (37.7%) reported an FNR > 10%. Significant inverse correlations were observed between the FNR and both the number of patients studied (r = −0.42; P < 0.01) and the proportion of patients who had successfully mapped SLNs (r = −0.32; P = 0.009).

CONCLUSIONS

Lymphatic mapping with SLN biopsy is used widely to reduce the complications associated with ALND in patients with low-risk breast carcinoma. This systematic review revealed a wide variation in test performance. Cancer 2006. © 2005 American Cancer Society.

The presence and extent of axillary lymph node involvement remain the most powerful predictors of recurrence and survival. It has been shown that the presence of regional metastases within the axillary basin decreases a patient's 5-year survival by approximately 28–40%.1, 2 Furthermore, data derived from National Surgical Adjuvant Breast and Bowel Project Protocol B-04 have shown that the likelihood of treatment failure increases as the number of metastatic axillary lymph nodes increases.3 The removal of axillary lymph nodes also improves locoregional control, which may translate into improved overall survival for the patient.4–8 However, lymph node metastases are found in only 40% of patients who undergo axillary lymph node dissection (ALND).9, 10 The remaining patients derive no therapeutic benefit from the procedure, whereas all patients are exposed to the complications of ALND, including lymphedema, pain, stiffness, and shoulder weakness.11, 12 Additional complications include seroma formation and vascular and brachial plexus injuries. The results from more recent studies suggest that ≥ 70–80% of patients with early-stage breast carcinoma have pathologically lymph node-negative disease.10, 13

Over the past several years, sentinel lymph node (SLN) biopsy (SNB) has been studied and applied as an alternative to ALND in patients with negative SLN sampling. The SNB is based on the hypothesis originally proposed by Cabanas.14 The SLN hypothesis states that tumor cells shed from a primary carcinoma migrate through a lymphatic channel to a single lymph node before involving further lymph nodes within that basin. The SLN, therefore, is the first lymph node that receives lymphatic drainage from a tumor, and its identification and analysis for tumor involvement should predict the status of the remaining lymph nodes.

The international acceptance of the SNB over routine ALND is based on several considerations. SNB is a less invasive procedure that may be done under local anesthesia, often on an outpatient basis. It is associated with a lower risk of the common morbidities noted with complete ALND. The SNB also allows the pathologist to study the few SLNs removed in greater detail for tumor burden compared with the examination of a large number of lymph nodes removed by ALND.

To our knowledge, only one prospective, randomized, controlled trial has been reported to date with relatively short-term follow-up in a small number of carefully selected patients.15 Although several additional trials are underway, each has limitations in the questions addressed and the populations studied. The SNB procedure has not been standardized universally; and the methods, materials, and patient selection vary by institution and surgeon. The intraoperative identification of the SLN in patients with breast carcinoma was shown to be successful by Krag et al. using 99m-technetium-labeled sulfur colloid, by Giuliano et al. using blue dye, and by Albertini et al. using a combination of 99m-technetium-labeled sulfur colloid and blue dye, with initial identification rates of 82%, 66%, and 92%, respectively.16–18

In this report, we present a systematic review and metaanalysis of all published studies of SNB in early-stage breast carcinoma. This study provides a thorough assessment of the test performance characteristics of SNB reported in the literature and explores the reasons for the observed heterogeneity in study results.

MATERIALS AND METHODS

Literature Search

A comprehensive, systematic review of the all-language published literature was undertaken using the following data sources: The Cochrane Library, Best Evidence (American College of Physicians Journal Club and Evidence-Based Medicine), DARE (Database of Abstracts of Reviews of Effectiveness), Dissertation Abstracts, Pre-MEDLINE (published studies that have not been completely indexed by the National Library of Medicine), and MEDLINE. When possible, the databases were searched by linking three broad content areas: breast cancer, SLN, and SNB. Search terms included “sentinel lymph node,” “sentinel lymph node biopsy,” and “lymphatic mapping.” An author search also was conducted of the 12 leading investigators cited by Cox et al.19 Furthermore, the references from identified articles were reviewed to look for cited articles that were not found in the MEDLINE search. Finally, an extended search was done to identify relevant studies found in abstracts and data presented as letters to the editor.

Inclusion and Exclusion Criteria for Identified Literature

The following criteria were applied prospectively to the articles that were identified by the literature search. First, the population had to be an original study group. Duplicate articles based on the same group of patients were identified and excluded. For follow-up studies that included a subset of previously reported patients, only the most recent article that contained the most up-to-date results of the study group was included. Second, selected articles had to have at least a defined subgroup of patients who underwent complete ALND after SLN dissection, regardless of the results of SNB.

Data Extraction

A data-extraction form was developed prospectively that included extensive information on the publication details, the patient population studied, details of the technique used, the outcome measures used, the number of patients studied, the number of patients who had successfully mapped SLNs, and measures of test performance, including true-positive results, true-negative results, and false-negative results. In addition, study quality was assessed by two independent observers and included the following study elements: 1) description of patient characteristics, 2) reasons for study withdrawal, 3) measures of test performance, 4) measures of variability, and 5) a description of the SLN technique used (radiocolloid, blue dye, or both).

Reporter Bias

To address the issue of potential bias, data extraction for all included studies was undertaken by two independent investigators. A comparison of the statistical results and quality scoring was done, and any discrepancy between the results led to a reanalysis of the original article. Finally, any unresolved issues were settled by an independent investigator.

Statistical Analysis

Primary outcomes

Primary outcomes consisted of measures of test performance, including the false-negative rate (FNR), the posttest probability-negative (PPN) rate, the diagnostic odds ratio (OR), and the proportion of patients who had successfully mapped SLNs. The FNR was defined as the probability of a negative SLN when the patient has positive axillary lymph nodes (1 − sensitivity). In accordance with common clinical practice, patients with positive lymph node status were defined as those who had a positive SNB, or a positive ALND, or both. Therefore, by definition, the false-positive rate was 0, and the specificity was considered to be 100%, because positive SNB results meant that the axillary lymph nodes were positive despite any negative ALND results. On the basis of these considerations, the FNR was equivalent to the negative likelihood ratio, i.e., (1 − sensitivity)/specificity = FNR/1. The predictive value negative (PVN), by convention, is defined as the probability of achieving a negative dissection when the SLN sampling is negative. Alternatively, the PPN is defined as the probability that ALND results will be positive when the SLN is negative, i.e., 1 − PVN. Although the FNR, like sensitivity and specificity, is considered a pure measure of test performance, the PPN depends on both test performance and the a priori risk of axillary lymph node involvement. The percentage of patients who had successfully mapped SLNs was defined as the percentage of patients studied who had identifiable SLN(s) based on the mapping procedure.
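
As a hedged illustration of these definitions, the following Python sketch computes the FNR, PVN, and PPN from the counts of true-positive, false-negative, and true-negative results. The function name and the example counts are hypothetical and are not taken from any included study; false-positive results are fixed at zero, as in the definition used here.

```python
def sln_test_performance(true_pos, false_neg, true_neg):
    """Return FNR, PVN, and PPN from SLN-versus-ALND counts.

    true_pos:  node-positive patients with a positive SLN
    false_neg: node-positive patients with a negative SLN
    true_neg:  node-negative patients (negative SLN and negative ALND)
    False positives are taken as 0, following the definition in the text.
    """
    node_positive = true_pos + false_neg   # all patients with nodal disease
    sln_negative = false_neg + true_neg    # all patients with a negative SLN
    fnr = false_neg / node_positive        # 1 - sensitivity
    ppn = false_neg / sln_negative         # P(node positive | SLN negative)
    pvn = 1.0 - ppn                        # P(node negative | SLN negative)
    return {"FNR": fnr, "PVN": pvn, "PPN": ppn}


# Hypothetical study: 100 mapped patients, 40 of them node positive.
example = sln_test_performance(true_pos=37, false_neg=3, true_neg=60)
print(example)
```

In this made-up example the same 3 false-negative patients give an FNR of 3/40 but a PPN of 3/63, because the two measures use different denominators.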

Descriptive statistics

The distributions of all measures were evaluated and, when appropriate, cut-off points for continuous variables were selected a priori based on clinically relevant criteria or reporting convention. Summary measures of central tendency included the mean, median, and mode. The measures of variability employed included the range, standard error, and 95% confidence limits. Means of continuous measures with distributions that approached normal were compared based on the Student t test or on analyses of variance for multiple groups. Bivariate correlations between continuous measures were based either on the Pearson coefficient as a measure of linear associations for data close to normal in distribution or the Spearman ρ as a nonparametric measure of correlation. Tests for linearity were derived from the sum of squares, degrees of freedom, and mean square associated with linear and nonlinear components based on analyses of variance. When distributions differed considerably from normal, analyses were based either on transformed values (e.g., logarithmic transformation) or on nonparametric methods. Nonparametric measures used included the Mann–Whitney U test or the Kruskal–Wallis H statistic. Categorical measures were compared using the chi-square test. The correlations between primary outcomes and study size, proportion of patients with positive lymph nodes, technique used, and study quality were evaluated.

Metaanalysis

A metaanalysis of primary outcome measures was undertaken by estimating individual study results along with weighted summary measures of the specified outcomes using the method of Mantel–Haenszel after assessing for study-by-study heterogeneity based on the Q statistic. The hypothesis that the studies are all drawn from a population of studies with the same effect size is rejected if Q exceeds the upper 100(1 − α) percentile of the chi-square distribution with k − 1 degrees of freedom. An inconsistency index (I2) was calculated as an estimate of the proportion of variation in estimates because of heterogeneity rather than sampling error. The I2 was estimated by using the method of Higgins as (H2 − 1)/H2, in which H2 = Q/(k − 1), with k representing the number of studies and k − 1 the degrees of freedom.20 The combined estimates are then calculated as the weighted sum of the individual estimates, in which the weights are the reciprocal of the variance or the interstudy-adjusted variance of the estimates, depending on the model applied.

Fixed-effects models were used to estimate summary measures when no significant heterogeneity was found across studies. Under the fixed-effects model, we assume that all studies came from a common population (i.e., if the sample size in each study were infinite, then the true effect size would be the same in all studies). Therefore, because the only source of variation is random error, the standard error will approach zero as the sample size becomes large, revealing the true effect size.

Alternatively, random-effects models were used to estimate summary measures when significant heterogeneity was observed for outcomes across reporting studies. Under the random-effects model, it is assumed that the samples are from different populations with different true effect sizes. With this conservative approach, the true effect may differ between studies because of differences in patient populations, or treatment variation, or because outcome measures differ from one study to the next. Therefore, two sources of variation are assumed, consisting of random error and variation because of real differences between populations, treatments, or measured outcomes. The standard error of the effect size estimates again will approach zero as the sample sizes within studies or the numbers of studies in the metaanalysis increase. However, the differences in the true effect sizes between studies will persist. Hypothesis testing on summary effect estimates was based on a z statistic with estimates of the standard error and 95% confidence limits (95% CLs) provided for all individual studies as well as the summary overall effect estimate. Results are presented as forest plots with effect estimates and 95% CLs for each individual study and with a summary measure and 95% CLs across all studies.
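
The pooling steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis code used for this report: it uses generic inverse-variance weights rather than Mantel–Haenszel weights, and it assumes the DerSimonian–Laird moment estimator for the interstudy variance (tau2), which the text does not specify. The function names and the two-study input are hypothetical.

```python
import math


def i_squared(q, df):
    """Inconsistency index from Q: (H2 - 1)/H2 with H2 = Q/df, floored at 0."""
    h2 = q / df
    return max(0.0, (h2 - 1.0) / h2)


def pool_estimates(estimates, variances):
    """Inverse-variance fixed-effects and random-effects summaries.

    tau2 is the DerSimonian-Laird estimate (an assumption of this sketch).
    """
    k = len(estimates)
    df = k - 1
    w = [1.0 / v for v in variances]                  # fixed-effects weights
    fixed = sum(wi * y for wi, y in zip(w, estimates)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, estimates))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                     # interstudy variance
    w_re = [1.0 / (v + tau2) for v in variances]      # interstudy-adjusted weights
    random = sum(wi * y for wi, y in zip(w_re, estimates)) / sum(w_re)
    return {
        "Q": q,
        "I2": i_squared(q, df) if q > 0 else 0.0,
        "tau2": tau2,
        "fixed": fixed,
        "se_fixed": math.sqrt(1.0 / sum(w)),
        "random": random,
        "se_random": math.sqrt(1.0 / sum(w_re)),
    }


# Two hypothetical studies with equal variance and different estimates.
demo = pool_estimates([0.0, 0.2], [0.01, 0.01])
print(demo)
```

As a usage note, applying `i_squared` to the Q statistics reported in RESULTS (212.8 for the FNR and 72.87 for the PPN rate) gives approximately 68.5% and 8.1% when 67 degrees of freedom are used. That degrees-of-freedom value is inferred here from the reported I2 values and is not stated in the text.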

Regression analysis

All regression models were based on fixed models from a priori hypotheses and sets of covariates. Linear regression analysis was used to model normal or normal-transformed outcome measures. Coefficients represent adjusted estimates of the rate of change in the dependent variable for each unit change in the independent variable. The standard error and 95% CLs for the coefficients were calculated for each covariate with hypothesis testing for a nonzero coefficient based on a t statistic. Global significance was assessed on the basis of an F statistic, whereas model performance was based on the coefficient of determination (R2) as a measure of the proportion of variation in the outcome measure accounted for by the independent variables in the model. Logistic regression analysis was performed for dichotomous outcome measures with transformation of the regression coefficients used to estimate an adjusted odds ratio for each dichotomous, independent variable. Significance was based on a Wald statistic, and 95% CLs were provided for estimated odds ratios. Global model significance was based on a chi-square test, and overall model performance was based on the R2.
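
The transformation from a logistic regression coefficient to an odds ratio can be sketched as below. This is an illustrative Python sketch, assuming the usual normal-approximation confidence limits (z = 1.96) and a 1-degree-of-freedom chi-square reference for the Wald statistic; the function name is hypothetical. The example input is the coefficient and standard error listed for patient characteristics in Table 3, and small differences from the tabulated values are expected because the inputs are rounded.

```python
import math


def odds_ratio_summary(coef, se, z=1.96):
    """Odds ratio, 95% confidence limits, Wald statistic, and P value
    for a single logistic regression coefficient."""
    wald = (coef / se) ** 2
    return {
        "OR": math.exp(coef),
        "lower": math.exp(coef - z * se),
        "upper": math.exp(coef + z * se),
        "wald": wald,
        # Upper tail of a chi-square distribution with 1 degree of freedom.
        "p_value": math.erfc(math.sqrt(wald / 2.0)),
    }


# Coefficient 1.789 and SE 0.823 (patient characteristics, Table 3).
row = odds_ratio_summary(1.789, 0.823)
print(row)
```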

RESULTS

Literature Search

The comprehensive literature search yielded 715 references between publication dates of 1970 and April 2003. Of these, 71 articles met the strict inclusion criteria described earlier. Of those 71 articles, 2 articles were not included in our final analysis, because they could not be translated completely from their native language. The remainder of this analysis is based on the 69 eligible studies that were available for data extraction.21–89

Study Characteristics

Selected study characteristics from the 69 included articles are presented in Table 1. In total, 10,454 patients were enrolled across all included studies. The average age of all patients across studies was 56.6 years. The intention-to-study population analyzed represented 8059 patients who actually were offered SNB followed by a completion ALND. The number of patients completing each study ranged from 19 to 758 patients and averaged 116.8 patients per report (median, 76 patients).

Table 1. Characteristics of Studies Included in Systematic Review

| Study | Ref. | No. completing study | Percentage of positive lymph nodes | Proportion mapped successfully | Technique (a) | Proportion: total positive lymph nodes | Proportion: total positive SLNs |
|---|---|---|---|---|---|---|---|
| Olson et al., 2000 | 75 | 223 | 43.14 | 0.91 | 2 | 0.43 | 0.41 |
| Cox, 2000 | 20 | 484 | 72.48 | 1.00 | 2 | 0.72 | 0.72 |
| Noguchi et al., 2000 | 70 | 674 | 41.39 | 0.99 | 2 | 0.41 | 0.37 |
| Tafra et al., 2001 | 85 | 48 | 30.04 | | 2 | 0.30 | 0.26 |
| Krag et al., 1998 | 57 | 443 | 28.15 | 0.93 | 2 | 0.28 | 0.25 |
| Veronesi et al., 1999 | 88 | 376 | 48.52 | 0.99 | 2 | 0.49 | 0.45 |
| Winchester et al., 1999 | 89 | 72 | 72.22 | 0.75 | 1 | 0.72 | 0.65 |
| Laurisen et al., 2000 | 59 | 80 | 55.13 | 0.98 | 2 | 0.55 | 0.55 |
| Bobin et al., 2000 | 26 | 243 | 37.78 | 0.93 | 2 | 0.38 | 0.35 |
| de Kanter et al., 2000 | 50 | 199 | 34.39 | 0.92 | 2 | 0.34 | 0.32 |
| Canavese et al., 2001 | 31 | 212 | 37.38 | 0.97 | 2 | 0.37 | 0.35 |
| Giuliano et al., 1997 | 45 | 172 | 36.84 | 0.66 | 0 | 0.37 | 0.32 |
| Ilum et al., 2000 | 48 | 159 | 50.52 | 0.61 | 0 | 0.51 | 0.43 |
| Krag et al., 1998 | 56 | 157 | 50.00 | 0.76 | 2 | 0.50 | 0.48 |
| Bembenek et al., 1999 | 23 | 146 | 46.23 | 0.81 | 1 | 0.46 | 0.42 |
| Doting et al., 2000 | 38 | 136 | 46.83 | 0.93 | 2 | 0.47 | 0.44 |
| Morrow et al., 1999 | 67 | 139 | 20.09 | 0.79 | 2 | 0.29 | 0.25 |
| Fraile et al., 2000 | 42 | 132 | 39.37 | 0.96 | 1 | 0.39 | 0.38 |
| Borgstein et al., 1997 | 27 | 104 | 43.27 | 1.00 | 1 | 0.43 | 0.42 |
| Nos et al., 1999 | 71 | 122 | 30.89 | 0.88 | 0 | 0.31 | 0.28 |
| Cserni et al., 2000 | 36 | 122 | 74.34 | 0.93 | 2 | 0.74 | 0.67 |
| Nwariaku et al., 1998 | 72 | 119 | 28.13 | 0.81 | 2 | 0.28 | 0.27 |
| Kollias et al., 1999 | 54 | 117 | 32.63 | 0.82 | 2 | 0.33 | 0.31 |
| Giuliano et al., 1994 | 18 | 107 | 42.00 | 0.93 | 0 | 0.42 | 0.42 |
| Reynolds et al., 1999 | 78 | 95 | 44.52 | | 0 | 0.45 | 0.41 |
| Molland et al., 2000 | 65 | 86 | 50.00 | 1.00 | 0 | 0.50 | 0.48 |
| Bobin et al., 1999 | 25 | 100 | 46.99 | 0.83 | 0 | 0.47 | 0.45 |
| Koller et al., 1998 | 53 | 98 | 53.13 | 0.98 | 0 | 0.53 | 0.50 |
| Jaderborg et al., 1999 | 49 | 79 | 31.25 | 0.81 | 2 | 0.31 | 0.30 |
| Sandrucci and Mussa, 1998 | 81 | 84 | 43.84 | 0.87 | 1 | 0.44 | 0.41 |
| Roumen et al., 1997 | 80 | 83 | 40.35 | 0.69 | 1 | 0.40 | 0.39 |
| Nason et al., 2000 | 68 | 82 | 46.97 | 0.80 | 2 | 0.47 | 0.39 |
| Snider et al., 1998 | 84 | 80 | 20.00 | 0.88 | 1 | 0.20 | 0.19 |
| Folscher et al., 1997 | 40 | 79 | 53.13 | 0.41 | 0 | 0.53 | 0.38 |
| van der Ent et al., 1999 | 87 | 70 | 38.57 | 1.00 | 2 | 0.39 | 0.37 |
| Mertz et al., 1999 | 63 | 79 | 42.11 | 0.97 | 1 | 0.42 | 0.41 |
| Vaggelli et al., 2000 | 86 | 76 | 49.25 | 0.95 | 1 | 0.49 | 0.49 |
| Rodier et al., 2000 | 79 | 73 | 50.00 | 0.84 | 0 | 0.50 | 0.46 |
| Moffat et al., 1999 | 14, 64 | 70 | 35.48 | 0.89 | 1 | 0.35 | 0.32 |
| Noguchi et al., 1999 | 69 | 72 | 46.03 | 0.86 | 2 | 0.46 | 0.40 |
| Flett et al., 1998 | 39 | 68 | 37.50 | 0.82 | 0 | 0.38 | 0.32 |
| Martin et al., 2000 | 61 | 758 | 32.59 | 0.89 | 2 | 0.33 | 0.31 |
| Gucciardo et al., 2000 | 44 | 50 | 41.86 | 0.86 | 1 | 0.42 | 0.30 |
| Altinoyollar et al., 2000 | 21 | 60 | 44.89 | 0.82 | 0 | 0.45 | 0.39 |
| O'Hea et al., 1998 | 74 | 59 | 41.82 | 0.93 | 2 | 0.42 | 0.36 |
| Chatterjee et al., 1998 | 32 | 60 | 35.59 | 0.97 | 2 | 0.36 | 0.34 |
| Clark et al., 1999 | 33 | 55 | 40.38 | 0.95 | 2 | 0.40 | 0.38 |
| Breslin et al., 2000 | 29 | 51 | 58.14 | 0.84 | 2 | 0.58 | 0.51 |
| Crossin et al., 1998 | 35 | 50 | 17.39 | 0.84 | 1 | 0.17 | 0.15 |
| Mechella et al., 2000 | 62 | 48 | 35.71 | 0.88 | 1 | 0.36 | 0.31 |
| Galli et al., 2000 | 43 | 46 | 34.09 | 0.96 | 1 | 0.34 | 0.27 |
| Morgan et al., 1999 | 66 | 44 | 37.50 | 0.73 | 0 | 0.38 | 0.31 |
| Langer et al., 2000 | 58 | 44 | 53.66 | 0.93 | 2 | 0.54 | 0.51 |
| Barnwell et al., 1998 | 22 | 42 | 39.47 | 0.90 | 2 | 0.39 | 0.39 |
| Liu et al., 2000 | 60 | 41 | 51.28 | 0.93 | 2 | 0.51 | 0.49 |
| Offodile et al., 1998 | 73 | 41 | 45.00 | 0.98 | 1 | 0.45 | 0.45 |
| Hsieh et al., 2000 | 47 | 41 | 39.02 | 1.00 | 1 | 0.39 | 0.34 |
| Ratanawichitrasin et al., 1998 | 77 | 40 | 25.71 | 0.88 | 0 | 0.26 | 0.20 |
| Delaloye et al., 2000 | 37 | 40 | 30.77 | 0.98 | 2 | 0.31 | 0.31 |
| Kern, 1999 | 52 | 40 | 38.46 | 0.98 | 0 | 0.38 | 0.38 |
| Horgan et al., 1998 | 46 | 38 | 50.00 | 0.92 | 0 | 0.50 | 0.42 |
| Kowolik et al., 2000 | 55 | 37 | 27.27 | 0.89 | 2 | 0.27 | 0.24 |
| Berclaz et al., 1998 | 24 | 34 | 33.33 | 0.97 | 2 | | |
| Borgstein et al., 1997 | 27 | 25 | 56.00 | 1.00 | 2 | 0.56 | 0.56 |
| Kapteijn et al., 1998 | 51 | 30 | 38.46 | 0.87 | 0 | 0.38 | 0.38 |
| Schneebaum et al., 1998 | 83 | 30 | 32.14 | 0.93 | 2 | 0.32 | 0.25 |
| Canavese et al., 2000 | 30 | 55 | 35.80 | 1.00 | 0 | 0.36 | 0.30 |
| Forner et al., 2000 | 41 | 21 | 38.10 | 1.00 | 2 | 0.38 | 0.33 |
| Schrenk and Wayand, 2001 | 82 | 19 | 52.63 | 1.00 | 2 | 0.53 | 0.53 |

  • SLNs: sentinel lymph nodes.
  • (a) 0: blue dye alone; 1: radiocolloid alone; 2: both blue dye and radiocolloid.

Patients with Successfully Mapped SLNs

Among the patients who completed the planned studies, 7765 patients (96.35%) had successfully mapped SLNs. Figure 1 shows that the proportion of patients mapped successfully across studies ranged from 41% to 100% (average, 89%; median, 92%). Fewer than 90% of patients reportedly were mapped successfully in 31 studies (45%). There was no significant correlation between study size and the reported proportion of patients who were mapped successfully. The average number of SLNs reported was 1.92 (median, 2 SLNs; mode, 2 SLNs) and ranged from 1.0 to 4.1 SLNs across studies. The average number of lymph nodes identified per trial at the time of completion ALND was 15.9 (range, 11.0–23.0 lymph nodes identified). The average percentage of patients with positive lymph nodes across all studies was 42% (median, 40%; mode, 50%) and ranged from 17% to 74%. Among the patients who had a positive SLN, 53% had additional positive lymph nodes at the completion ALND, and 47% had the SLN as the only positive axillary lymph node identified.

Figure 1.

This graphic shows the distribution (proportion) of patients who were mapped successfully across all studies.

Test Performance Measures

FNR

The summary values for test performance measures, which were estimated on the basis of the clinical definition described earlier, are presented in Table 2. Figure 2 shows that the reported FNR ranged from 0.0% to 29.4% (average, 8.4% across studies; median, 7%). The observed FNR decreased with increasing study population size (P = 0.046). The FNR was significantly lower in the 23 studies that included ≥ 100 patients (6.7%) compared with the 46 studies that included < 100 patients (9.0%; P = 0.007). Twenty-one studies (36.2%) reported an FNR > 10%. Figure 3 shows that there was a significant, inverse correlation between the FNR and the reported proportion of patients who had successfully mapped SLNs (P = 0.001). Individual and overall weighted summary estimates of the FNR were studied using the method of Mantel and Haenszel. Significant heterogeneity was observed for FNR across studies (Q statistic = 212.8; P < 0.0001), providing an inconsistency estimate (I2), which represented the proportion of the variability in point estimates because of heterogeneity rather than sampling error, of 68.5%. Therefore, the current analysis was based entirely on a random-effects model, as shown in the forest plot of Figure 4, which is ordered on the basis of increasing study sample size. The overall summary FNR estimate (± 95% CL) based on a random-effects model was 7.0% (95% CL, 5.2–8.8%; P < 0.0001).

Table 2. Summary Measures of Test Performance

| Measure (%) | No. | Median | Mean | 95% CL | Range |
|---|---|---|---|---|---|
| False-negative rate | 69 | 7.143 | 8.445 | 6.84–10.05 | 0.0–29.4 |
| Posttest probability negative | 69 | 4.650 | 5.684 | 4.42–6.95 | 0.0–25.0 |
| Percentage mapped successfully | 69 | 92.5 | 89.0 | 86.0–92.0 | 41.0–100.0 |
| Percentage lymph node positive | 67 | 40.380 | 41.709 | 39.04–44.38 | 17.39–74.36 |

  • 95% CL: 95% confidence limits.

Figure 2.

This graphic shows the distribution (percentage) of lymph node-positive patients who underwent a false-negative sentinel lymph node biopsy (false-negative rate) across all studies.

Figure 3.

This scatter plot illustrates the false-negative rate by the proportion of patients who were mapped successfully across all studies. A fitted linear-regression equation is shown with 95% confidence limits.

Figure 4.

This forest plot illustrates the sentinel lymph node (SLN) biopsy false-negative (FN) rates ± 95% confidence intervals, which were estimated by the Mantel–Haenszel method for each study, ordered by ascending study population sample size. The weighted, combined FN rate ± 95% confidence interval based on a random-effects model is shown at the bottom. PPN: posttest probability negative rate.

PPN rate

Figure 5 shows that the reported PPN rate ranged from 0% to 25% and averaged 5.7% across studies. Figure 6 shows that the observed PPN rate increased as the percentage of positive lymph nodes identified at ALND increased (P = 0.033). The PPN rate increased along with increasing proportions of patients who had positive lymph nodes, from < 30% positive lymph nodes (PPN rate, 3.7%), to 30–40% positive lymph nodes (PPN rate, 4.4%), to 40–50% positive lymph nodes (PPN rate, 5.5%), to ≥ 50% positive lymph nodes (PPN rate, 8.9%). Figure 7 shows that an inverse correlation was observed again between the PPN rate and an increase in the proportion of patients who were mapped successfully (P < 0.0001). There was no significant correlation between the PPN rate and the number of patients studied. Individual and overall summary estimates of the PPN rate were studied using the method of Mantel and Haenszel. No significant heterogeneity was observed with a Q statistic of 72.87 (P = 0.29). The estimated inconsistency (I2), which represented the proportion of the variation because of heterogeneity rather than sampling error, was 8.1%. Therefore, further analysis was based on a fixed-effects model, as shown in the forest plot of Figure 8, which is ordered on the basis of increasing study population size. The overall summary estimate (± 95% CLs) for the PPN rate based on the fixed-effects model was 4.6% (95% CL, 3.8–5.4%; P < 0.0001).
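
The dependence of the PPN rate on the proportion of patients with positive lymph nodes follows directly from the definitions given in Materials and Methods, and a short Python sketch makes the arithmetic explicit. It assumes a specificity of 100%, as defined in this report, and applies Bayes' rule to a fixed FNR; the function name and the FNR and prevalence values are illustrative choices, not study data.

```python
def ppn_from_prevalence(fnr, prevalence, specificity=1.0):
    """P(node positive | SLN negative) for a given FNR and prior risk."""
    missed = prevalence * fnr                    # node positive, SLN negative
    cleared = (1.0 - prevalence) * specificity   # node negative, SLN negative
    return missed / (missed + cleared)


# Same test performance (FNR = 7%) applied to populations of increasing risk.
for prevalence in (0.30, 0.40, 0.50):
    ppn = ppn_from_prevalence(0.07, prevalence)
    print(f"prevalence {prevalence:.0%}: PPN {ppn:.1%}")
```

With the FNR held constant, the computed PPN rises as the prior risk of nodal disease rises, which is the direction of the trend reported above.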

Figure 5.

This graphic shows the distribution (percentage) of patients who underwent a negative sentinel lymph node biopsy with false-negative results (posttest probability negative rate) across all studies.

Figure 6.

This scatter plot illustrates the percentage of patients who had positive lymph nodes and their posttest probability negative rate. A fitted linear-regression equation is shown with 95% confidence limits (95% CL).

Figure 7.

This scatter plot illustrates the proportion of patients who were mapped successfully and the posttest probability negative rate. A fitted linear-regression equation is shown with 95% confidence limits.

Figure 8.

This forest plot illustrates the sentinel lymph node (SLN) biopsy posttest probability negative (PPN) rate ± 95% confidence interval, as estimated by using the Mantel–Haenszel method for each study, ordered by ascending study population sample size. The weighted, combined PPN rate ± 95% confidence interval based on a fixed-effects model is shown at the bottom. FN: false-negative.

Mapping Technique

There was limited consistency in the reporting of study techniques used across published reports. Studies reported using blue dye alone (n = 18 studies), radiocolloid alone (n = 16 studies), or both combined (n = 34 studies). The reported proportions of patients who had successfully mapped SLNs across these 3 groups of studies were 83.1%, 89.2%, and 91.9%, respectively (P = 0.007). Likewise, the FNRs were 10.9% for blue dye alone, 8.8% for radiocolloid alone, and 7.0% for combination studies (P = 0.047). The FNR in the 51 studies that used radiocolloid was 8% compared with 11% for the 18 studies that did not use this technique (P = 0.034). Although the differences did not achieve statistical significance, similar differences were observed for PPN with respective rates of 7.4%, 5.7%, and 4.8% across study groups (P value for trend = 0.106).

Study Quality Assessment

In addition to the reported proportion of patients who had successfully mapped lymph nodes, study quality was rated separately according to five criteria and in a combined quality score (see Materials and Methods, above). Satisfactory reporting of these criteria included a description of patient characteristics (80% of studies), a report of the reasons for patient withdrawal (29% of studies), a report of test performance measures (46% of studies), the incorporation of measures of variability (19% of studies), and the use of radiocolloid for lymphatic mapping technique (51% of studies). Blinding between the surgeons and pathologists was not incorporated into the score, because only 2 studies reported such blinding (3%), whereas the majority of studies failed to report this parameter. The summary quality score ranged from 0 to 5, with > 50% of studies scoring poorly (summary score, ≤ 2). A significant correlation was observed between study quality and the observed FNR (P value for trend = 0.002), with the poorest quality studies reporting FNRs of 14% (Fig. 9). Likewise, a significant correlation was observed between study quality and the observed PPN rate (P = 0.012), with the poorest quality studies reporting PPN rates of 10%. Finally, the proportion of patients who had successfully mapped SLNs increased with study quality, ranging from 78% in the poorest quality studies to 93% in the highest quality studies (P = 0.017).

Figure 9.

This bar graph displays the false-negative rate ± standard error of the mean (SEM) by summary quality score.

Multivariate Analysis

A series of multivariate analyses was performed for the major outcome measures. Significant independent predictors of the FNR in linear regression analysis, which was adjusted for the number of patients who completed the study, included 1) reporting of measures of test performance (P = 0.009), 2) the proportion of patients mapped successfully (P = 0.011), and 3) the proportion of positive lymph nodes (P = 0.013). Table 3 shows that significant predictors of an FNR < 10% in multivariate logistic regression analysis included 1) a report of patient characteristics (OR = 5.80), 2) a description of reasons for study withdrawal (OR = 6.60), and 3) a proportion successfully mapped > 90% (OR = 3.50). Significant independent predictors of the PPN rate in linear regression analysis, which was adjusted for the number of patients who completed the study, included 1) the proportion of positive lymph nodes (P = 0.003), 2) a report on the measures of test performance (P = 0.010), and 3) the proportion of patients mapped successfully (P = 0.013). Table 4 shows that significant predictors of a PPN rate < 10% in multiple logistic regression analysis included 1) < 50% positive lymph nodes (OR = 17.3), 2) a proportion of patients mapped successfully > 90% (OR = 17.5), and 3) a description of patient characteristics (OR = 9.7).

Table 3. Logistic Regression Analysis: False-Negative Rate < 10%

| Covariate | Coef. (a) | SE | Wald statistic (b) | P value | OR | 95% CL |
|---|---|---|---|---|---|---|
| Patient characteristics | 1.789 | 0.823 | 4.731 | 0.030 | 5.98 | 1.19–29.99 |
| Patients withdrawn | 1.862 | 0.845 | 4.851 | 0.028 | 6.435 | 1.23–33.73 |
| Measures of variability | 1.416 | 0.893 | 2.518 | 0.110 | 4.122 | 0.72–23.71 |
| Proportion mapped successfully > 90% | 1.313 | 0.604 | 4.730 | 0.030 | 3.717 | 1.14–12.13 |

  • Coef.: coefficient; SE: standard error; OR: odds ratio; 95% CL: 95% confidence limit.
  • (a) Correlation coefficient = 0.345.
  • (b) C statistic = 0.775 (0.660–0.890).

Table 4. Logistic Regression Analysis: Posttest Probability Negative < 10%

| Covariate | Coef. (a) | SE | Wald statistic (b) | P value | OR | 95% CL |
|---|---|---|---|---|---|---|
| Patient characteristics | 3.549 | 1.143 | 9.638 | 0.002 | 34.79 | 3.70–327.11 |
| Proportion mapped successfully > 90% | 3.590 | 1.321 | 7.381 | 0.007 | 36.22 | 2.72–482.64 |
| Proportion of positive lymph nodes | −0.054 | 0.03 | 4.205 | 0.040 | 0.947 | 0.90–1.00 |

  • Coef.: coefficient; SE: standard error; OR: odds ratio; 95% CL: 95% confidence limit.
  • (a) Correlation coefficient = 0.747.
  • (b) C statistic = 0.892 (0.793–0.992).

DISCUSSION

The current study represents a systematic review of the published literature on SNB in patients with early-stage breast carcinoma who underwent a completion ALND, permitting an estimation of the test performance characteristics of the procedure. The reports that were included in this review generally reflect the early experience of investigators and institutions with this procedure in women with breast carcinoma. Most have abandoned the use of completion ALND in patients with negative SNB results as experience and confidence have been gained with lymphatic mapping and SNB. Institutions with greater experience (larger studies) and those associated with better quality reports reflect significantly better test performance with this procedure, as demonstrated in the current analysis and as reflected in lower FNR and PPN rates. It is noteworthy that studies that reported higher rates of successful mapping also reported lower FNRs and PPN rates, suggesting that the rate of successful mapping may be an early indicator of the accuracy of SNB results at an institution. In multivariate analysis, studies that reported > 90% successful mapping procedures were significantly more likely to result in an FNR < 10% compared with studies that reported < 90% successful mapping procedures.

Early experience with the procedure appears to be associated with a wide variation in reported rates of successful mapping and test performance. It is important for investigators and practitioners to be aware of these limitations and of the need for a period of learning, careful attention to detail, and confirmation of the initial experience with completion ALND, to be certain of SNB accuracy in the hands of individual teams at each institution. It is also important to distinguish measures primarily of test performance (e.g., FNR) from measures, such as predictive value or PPN, that also are influenced by the risk or prevalence of lymph node disease in the population of patients under study. The studies reported here involved patient populations with a wide range in the proportion of patients with positive lymph nodes, which may be explained in part by differences in individual practice and study eligibility criteria. The current analysis showed that SNB studies in higher risk populations, which had a greater proportion of positive lymph nodes, reported higher PPN rates. These observations support the patient eligibility criteria used in the controlled clinical trials and the practice by most surgeons of limiting the use of SNB in high-risk patients with clinically apparent lymph nodes and larger tumors.
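
The distinction between the FNR and the PPN can be illustrated with a short calculation. The sketch below is an illustration under stated assumptions, not the computation used in this review: it takes the PPN to be the probability of lymph node disease given a negative SNB, and it assumes a specificity of 100% (a positive sentinel lymph node is treated as lymph node disease by definition). The function name `posttest_probability_negative` is hypothetical. The inputs are the overall FNR (7.3%) and the range of lymph node involvement across studies (17% to 74%, 42% overall) reported in this review.

```python
def posttest_probability_negative(fnr, prevalence, specificity=1.0):
    """Probability of lymph node disease given a negative sentinel node biopsy.

    fnr         -- false-negative rate, i.e. 1 - sensitivity
    prevalence  -- proportion of patients with positive lymph nodes
    specificity -- assumed to be 1.0 here (no false-positive biopsies)
    """
    missed_disease = prevalence * fnr                   # diseased, test negative
    true_negative = (1.0 - prevalence) * specificity    # disease-free, test negative
    return missed_disease / (missed_disease + true_negative)


# Hold the FNR fixed at 7.3% and vary only the prevalence of nodal disease.
for prevalence in (0.17, 0.42, 0.74):
    ppn = posttest_probability_negative(0.073, prevalence)
    print(f"prevalence = {prevalence:.0%}: PPN = {ppn:.1%}")
```

With the FNR held constant, the PPN under these assumptions rises from roughly 1.5% at a prevalence of 17% to roughly 17% at a prevalence of 74%, which is why the same test performance yields a less reassuring negative result in higher risk populations.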

It is apparent that considerable diversity exists across these studies in patient selection, in the techniques used, and in the results reported. Frequently reported differences included the use of dye versus radiocolloid, radiocolloid size and type, injection site, timing to surgery, use of lymphoscintigraphy, and intraoperative versus histopathologic analysis. This inherent heterogeneity across institutions and surgeons undoubtedly contributes to the wide range of reported successful mapping procedures and false-negative findings on SNB. The FNR and PPN rates reported decreased as the number of patients who completed the study increased, as discussed earlier. It has been reported that lower FNRs are observed as surgical experience increases. For example, a multicenter trial reported an improvement in the FNR from 5.8% to 4.3% when the study was corrected after the participating surgeons had completed 30 procedures.85 Unfortunately, only 15 of 69 studies fully characterized the correlation between the FNR and other study measurements. It is noteworthy that major multicenter studies demonstrated no statistical correlation between primary tumor size and FNR.56, 85

Studies that used both radiocolloid and blue dye reported a higher percentage of successfully mapped patients than studies that used blue dye alone (92% vs. 83%; P = 0.006). However, the mapping technique was not shown to be an independent predictor of a lower FNR or PPN in multivariate analysis. The literature, to date, provides data that support the use of radiocolloid alone,56, 81 blue dye alone,39, 18, 45, 46, 66 or both.19, 27, 30, 61, 70, 75, 90, 91 In summary, the current meta-analysis provided data that favor the use of both blue dye and radiocolloid to optimize identification of the SLN. This should not preclude the use of blue dye alone by those surgeons who have mastered the technique and have produced reliable, high identification rates and low FNRs. Rather, the use of both tracers may shorten the learning curve needed to achieve acceptable SLN identification rates and FNRs.

Systematic reviews of diagnostic and prognostic studies may contribute to a better understanding of these techniques by summarizing the results of test accuracy from several different investigations, identifying some of the reasons for variation in the results of individual studies, and potentially improving the quality of future studies through a delineation of the methodological inadequacies of previous reports. However, meta-analyses of diagnostic studies, like those of therapeutic trials, are dependent on the quality of the individual studies that are included in the analysis.92–95 The selection process of patients with and without disease as well as the characteristics of the screened population should be described fully. The reference standard must be defined explicitly and must be accepted generally as diagnostic of the condition. The screening process and the definition of positive and negative test results also should be described fully. Ideally, the investigator and interpreter of the test are blinded as to the disease status of the patients, and the method of concealment should be described fully. The same uniform reference and testing procedure should be used for all patients. Studies that recruit disease and control participants separately or that do not blind observers may overestimate test performance.96 Inconsistent reporting on patient characteristics, test characteristics, and statistical analysis of results increases the interstudy variability, leading to increased variability in the results of the meta-analysis.93, 96 An additional important limitation of the studies reported here is the failure in all but two studies to describe any process used to blind the pathology review of the SNB and the ALND results. The relations between study quality and the primary outcomes reported here, including successful mapping, FNR, and PPN rates, are intriguing and consistent.

The American Society of Clinical Oncology has recently developed clinical practice guidelines for sentinel lymph node biopsy in early-stage breast carcinoma based, in part, on the systematic review presented here.97

Acknowledgements

The authors gratefully acknowledge the technical assistance of Olayemi Agboola, M.S., with the data extraction for this systematic review.
