Diagnosing pulmonary embolism: running after the decreasing prevalence of cases among suspected patients


H. Bounameaux, Division of Angiology and Hemostasis, Geneva University Hospitals, CH-1211 Geneva 14, Switzerland. Tel.: +41 22 3729292; fax: +41 22 3729299; e-mail: henri.bounameaux@medecine.unige.ch

In the past 10 to 15 years, the proportion of patients diagnosed with pulmonary embolism (PE) among those suspected of and tested for the disease has steadily decreased, an evolution that has paralleled the emergence of non-invasive diagnostic strategies. In the first trials that demonstrated the efficacy of heparin in venous thromboembolism (VTE), the prevalence of PE as assessed by pulmonary angiography was about 50% [1]. In the PIOPED study (recruitment period 1985–86), it was about 30% [2]; more recently, it has fallen to about 20% in Europe [3], and to less than 10% [4] or even as low as 5% [5] in some studies from North America. This suggests that the wide availability and increasing acceptance of modern non-invasive diagnostic strategies, along with medico-legal concerns, have led clinicians to over-test for PE. As a consequence, the number of patients who must be investigated in order to find one case of PE is increasing, which worsens the cost-effectiveness of the diagnostic work-up. What should and could be done to counteract this embarrassing evolution? Certainly, we should improve the selection of individuals who require testing. In this issue of the Journal of Thrombosis and Haemostasis, two different approaches [6,7] are presented that address this important issue.

Because d-dimer measurement probably played a role in this evolution, Linkins et al. [6] tried to define different d-dimer cut-off points according to the clinical pretest probability of PE. Indeed, this simple biologic test has a very high sensitivity (proportion of positive tests among patients with the disease) for the presence of PE, and thus a high negative predictive value (NPV, proportion of patients without the disease among those with a negative test). Therefore, a negative d-dimer (i.e. a d-dimer level below a certain cut-off value) allows PE to be ruled out [3]. As a consequence, only patients with a positive d-dimer test require further invasive and/or expensive testing. On the other hand, because of the test's poor specificity (proportion of negative tests among patients without the disease), the number of false positive tests (i.e. patients with a positive d-dimer test but without PE) increases with decreasing prevalence of the disease. These additional false positive tests increase both the number of patients who undergo further expensive tests after a positive d-dimer test and the proportion of patients in whom these investigations turn out to be negative.

Linkins et al. hypothesized that using different d-dimer cut-off levels depending on the clinical probability of PE (pretest probability) would reduce the number of false positive results. The pretest probability of venous thromboembolism can be assessed implicitly [8] or with the help of scores or prediction rules [9,10]. According to Bayes' theorem, the probability that a patient has a disease given the result of a diagnostic test (post-test probability of disease) depends not only on the test's characteristics (sensitivity and specificity) but also on the prevalence of the disease in the tested population, or pretest probability. Thus, for the same sensitivity and specificity, the negative predictive value increases with decreasing prevalence of the disease (Fig. 1A), and selecting a higher d-dimer cut-off value in patients with a low pretest probability may yield a similar negative predictive value despite a decrease in sensitivity. In the study by Linkins et al., patients were classified into three categories of clinical probability of VTE according to clinical models [8]. The prevalence of VTE was 5.9, 16.9 and 56.1% in the low, intermediate and high clinical probability groups, respectively. The authors retrospectively selected a d-dimer threshold for each probability category to ensure an acceptably high negative predictive value. The overall specificity of the d-dimer test was around 45% when a single 500 µg/L cut-off was used, and close to 60% when pretest probability-specific cut-offs were used (2100, 500 and 200 µg/L for the low, intermediate and high pretest probability, respectively). In low-probability patients, selecting a higher cut-off (2100 µg/L) resulted in a specificity of 95%, and the NPV was still 98.4%. With this threshold, PE would have been ruled out by a ‘negative’ d-dimer in 80 additional patients.
Despite the overall increase in the efficiency of d-dimer testing, the marked decrease in sensitivity (75%, 95% CI: 47–91%) when a higher cut-off is applied in the low-probability subgroup raises serious concerns. Indeed, among the 12 cases of PE in this group, three (25%) had a d-dimer level below the 2100 µg/L threshold. Moreover, the 95% confidence interval is compatible with a proportion of ‘missed PE’ (patients with PE and a negative d-dimer test with the modified cut-off) of more than 50%.
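
The Bayes relation invoked here can be made concrete with a short calculation. The Python sketch below is an illustration only, not the analysis performed by Linkins et al.: the function name `negative_predictive_value` is hypothetical, and the inputs are the rounded figures quoted in the text for the low-probability subgroup (prevalence 5.9%, sensitivity 75%, specificity 95%).

```python
def negative_predictive_value(sensitivity, specificity, prevalence):
    """NPV from Bayes' theorem: P(no disease | negative test)."""
    true_negatives = specificity * (1.0 - prevalence)
    false_negatives = (1.0 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)


# Rounded figures quoted in the text for the low-probability subgroup.
low_prob_npv = negative_predictive_value(0.75, 0.95, 0.059)
print(f"NPV at 5.9% prevalence: {low_prob_npv:.1%}")

# Same (assumed) test characteristics in a population with 20% prevalence.
same_test_high_prev = negative_predictive_value(0.75, 0.95, 0.20)
print(f"NPV at 20% prevalence: {same_test_high_prev:.1%}")
```

With these rounded inputs the first figure comes out at about 98.4%, in line with the reported NPV, even though one quarter of the PE cases in the subgroup fall below the threshold. The second call shows how the same test characteristics give a lower NPV when the prevalence is higher.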

Figure 1. (A) Negative predictive value (NPV) according to the prevalence of PE for four different sensitivity levels of the d-dimer test (specificity kept constant at 40%). (B) Number of patients needed to be investigated (NNI) after a positive d-dimer test to diagnose one PE, according to the prevalence of PE for three different specificity levels of the d-dimer test (sensitivity kept constant at 99%).

On the other hand, because of its higher specificity, this modified d-dimer test cut-off would be associated with a lower false-positive rate, thereby avoiding further testing in a substantially higher number of patients. To illustrate these changes, in Fig. 1(B) we display the number of patients with a positive d-dimer result who must be investigated to diagnose one case of PE (NNI), according to the prevalence of PE for three different specificity levels. Figure 1(B) shows that a reduction of the prevalence of PE in the suspected population below 10% is accompanied by a dramatic increase in the NNI. Increasing the specificity of the d-dimer test from 40 to 60% (i.e. the gain obtained by Linkins et al. by raising the d-dimer cut-off in low-probability patients) reduces the NNI. However, the influence of prevalence is much greater: increasing the prevalence from 5 to 10% reduces the NNI more than does increasing the specificity at any given prevalence of PE.
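
The relative weight of specificity and prevalence can be checked numerically. The sketch below assumes that the NNI is simply the inverse of the positive predictive value (positive d-dimer results per confirmed PE), which is our reading of the definition given above; the function name is hypothetical and the sensitivity is fixed at 99% as in Fig. 1(B).

```python
def number_needed_to_investigate(sensitivity, specificity, prevalence):
    """Positive d-dimer results per confirmed PE, i.e. 1 / PPV (assumed definition)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return (true_positives + false_positives) / true_positives


SENSITIVITY = 0.99  # held constant, as in Fig. 1(B)

for prevalence in (0.05, 0.10, 0.20):
    for specificity in (0.40, 0.60):
        nni = number_needed_to_investigate(SENSITIVITY, specificity, prevalence)
        print(f"prevalence {prevalence:.0%}, specificity {specificity:.0%}: "
              f"NNI = {nni:.1f}")
```

Under these assumptions, at a prevalence of 5% the NNI falls from about 12.5 to about 8.7 when the specificity rises from 40 to 60%, but to about 6.5 when the prevalence doubles to 10% at an unchanged specificity of 40%.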

In response to those concerns, Kline et al. [7] tried to identify a subgroup of patients who would need neither d-dimer testing nor any other investigation, which would increase the prevalence of PE among patients submitted to objective diagnostic tests. In 3148 emergency department patients suspected of PE, they derived by multivariate logistic regression analysis a clinical prediction rule named ‘the PERC (Pulmonary Embolism Rule-out Criteria) rule’. PE suspicion was defined as ‘enough clinical suspicion for PE that a board-certified emergency physician thought that formal evaluation for PE was necessary’. The overall prevalence of PE was 11% in this derivation set. The final model comprises eight ‘negative’ variables selected among those that were significantly associated with the absence of PE: age < 50, pulse < 100, SaO2 > 94%, no unilateral leg swelling, no hemoptysis, no recent trauma or surgery, no history of VTE, no hormone use. The main validation set was a cohort of 1427 ‘low-risk patients’, defined as being ‘at low enough risk to justify exclusion of PE on the basis of a negative d-dimer test’, in which the overall prevalence of PE was 8%. In 25% of patients (362/1427) of this validation set, all eight negative clinical criteria of the PERC rule were present. Among them, the proportion of PE, i.e. the false-negative rate of the rule, was 1.4% (5/362, 95% CI 0.4–3.2%). The rule was ‘positive’, i.e. at least one of the eight negative criteria was not met, in 96% (109/114, 95% CI 90–99%) of the patients with proven PE. Such an approach is potentially interesting because it influences the prevalence of PE among tested patients. Indeed, excluding 25% of patients before d-dimer testing in this cohort would raise the prevalence from 8 to 10.2% (109/1065), thereby slightly reducing the NNI after a positive d-dimer test (Fig. 1B). However, that increase remains modest because of the very low prevalence of PE in the entire cohort.
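
The structure of such an all-or-nothing rule, and its effect on the prevalence among the patients who go on to testing, can be sketched as follows. This is an illustration only and not a clinical tool: the `Patient` class and its field names are hypothetical, the eight criteria are transcribed from the list above, and the counts are those quoted for the validation cohort.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical field names, chosen for this illustration only.
    age_years: int
    pulse_bpm: int
    sao2_percent: float
    unilateral_leg_swelling: bool
    hemoptysis: bool
    recent_trauma_or_surgery: bool
    prior_vte: bool
    hormone_use: bool


def perc_negative(p: Patient) -> bool:
    """True only if all eight 'negative' criteria listed in the text are met."""
    return (
        p.age_years < 50
        and p.pulse_bpm < 100
        and p.sao2_percent > 94
        and not p.unilateral_leg_swelling
        and not p.hemoptysis
        and not p.recent_trauma_or_surgery
        and not p.prior_vte
        and not p.hormone_use
    )


# Counts quoted in the text for the validation cohort.
cohort, rule_negative, pe_total, pe_missed = 1427, 362, 114, 5
prevalence_before = pe_total / cohort
prevalence_after = (pe_total - pe_missed) / (cohort - rule_negative)
print(f"prevalence among tested patients: "
      f"{prevalence_before:.1%} -> {prevalence_after:.1%}")
```

A single criterion that is not met makes the rule ‘positive’, which is why the rule removes only about a quarter of the cohort and the prevalence among the remaining patients rises only from roughly 8% to roughly 10%.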

Are we ready to advocate the application of Linkins' proposal or the PERC rule in clinical practice? First, their safety should be formally evaluated in prospective outcome studies in which decisions would be based on the modified d-dimer thresholds or the PERC rule. Moreover, the proposals should be externally validated in populations with different PE prevalences. Finally, even if these results were confirmed, getting clinicians to accept the concepts of probability-dependent thresholds and of a ‘negative clinical score’ may be a real challenge.

One concern raised by these two studies is that, because false-negative results (patients who have PE but are left untreated after a negative test) are potentially fatal, it remains crucial that the first test used to rule out PE, whether the d-dimer test with the chosen cut-off or the PERC rule, be highly sensitive to the presence of the disease. Indeed, whatever the prevalence, a 99% sensitive test will leave 1% of cases untreated, a 96% sensitive test 4%, a 90% sensitive test (the lower limit of the confidence interval of the PERC rule sensitivity) 10%, and so on. Moreover, as previously stated, with decreasing prevalence, the false-negative rate expressed as the proportion of PE among patients with a negative test (i.e. 1 − NPV) may become misleading if not carefully interpreted. Indeed, as illustrated in Fig. 1(A), even if the sensitivity of a screening test is clearly insufficient, leaving a substantial proportion of cases of PE undiagnosed, the NPV may approach 100% when the prevalence is very low.
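
The contrast between the fraction of cases missed and the NPV can be seen by counting patients in a notional cohort. In the sketch below the cohort size of 1000, the function name and the choice of a deliberately insufficient sensitivity of 75% are assumptions made for illustration; the specificity of 40% is the value used in Fig. 1(A).

```python
def missed_fraction_and_npv(sensitivity, specificity, prevalence, cohort=1000):
    """Return (fraction of PE cases missed, NPV) for a notional cohort."""
    with_pe = cohort * prevalence
    without_pe = cohort - with_pe
    missed = with_pe * (1.0 - sensitivity)      # false negatives
    true_negatives = without_pe * specificity
    npv = true_negatives / (true_negatives + missed)
    return missed / with_pe, npv


# Assumed test: 75% sensitivity (clearly insufficient), 40% specificity.
for prevalence in (0.20, 0.05, 0.01):
    missed_fraction, npv = missed_fraction_and_npv(0.75, 0.40, prevalence)
    print(f"prevalence {prevalence:.0%}: {missed_fraction:.0%} of PE missed, "
          f"NPV {npv:.1%}")
```

The fraction of cases missed stays at one quarter whatever the prevalence, whereas the NPV climbs above 99% once the prevalence falls to 1%, which is why the NPV alone cannot establish the safety of a rule-out test.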

In summary, as long as the proportion of PE was about 20% among suspected patients, a 99% sensitive and 40% specific d-dimer test allowed PE to be excluded in about one third of suspected outpatients. It was effective for selecting patients who should proceed to further imaging procedures, and about one in three patients with a positive d-dimer had a PE. The progressive decrease in the proportion of confirmed cases among suspected patients, especially in North America, has markedly worsened the cost-effectiveness of diagnostic strategies based on d-dimer testing, as the proportion of patients with a positive d-dimer but without PE increases dramatically. Attempts should be made to resolve this issue, e.g. by increasing the specificity of the test or by better selecting patients who require investigation for PE, under the constraint that the initial test must remain highly sensitive. Indeed, even if PE is a life-threatening disease and wishing to rule it out is a laudable intention, we may question whether implementing a costly diagnostic work-up in a population with a disease prevalence of only 5% is still reasonable. Hence, the question may no longer be how we should investigate PE, but rather in whom.