The Art and Science of Probabilistic Decision-making in Emergency Medicine

Authors

  • Shahriar Zehtabchi MD

  • Jeffrey A. Kline MD


  • A related article appears on page 469.

Ramin, a foreign-born scholar with a robust resume and multiple degrees in social medicine, preventive medicine, and global health from top-notch American universities, was sitting in front of his immigration lawyer. His multiple attempts to obtain a work visa and to continue his stay in the United States had been unfruitful. The lawyer was explaining his last option and how the failure of this application could result in his deportation. Ramin asked about the probability of this path being successful. “About 20%,” said the lawyer, with a tilted head and a grim look on his face. Ramin’s next move baffled the lawyer. He made a sudden “Tom Cruise” move, jumped on the couch and, waving his hands in the air, screamed, “Yes, yes, yes!”

“I think you misunderstood me,” the lawyer said. “I said only 20%.” That did not deter Ramin’s joyful cries. “What is all this about?” the lawyer asked. “Why are you getting so excited?”

Ramin slowly brought himself down from the couch, trying to catch his breath. “Odds of 1 to 4 are the best I have ever had to overcome in my life! Do you have any idea what the odds are of getting into medical school in my country, or of surviving a revolution, a devastating war, and numerous political crackdowns by the government? Let me tell you: they are not 1 to 4! These are the best odds I have ever had! Let’s go for it!”

“Medicine is a science of uncertainty and an art of probability.”—William Osler1

In medical decision-making, the clinical estimate of probability strongly affects the physician’s belief as to whether or not a patient has a disease, and this belief, in turn, determines actions: to rule out, to treat, or to do more tests. Two important points must be considered. First, the impact of probability assessment on medical decision-making varies in proportion to the consequences of the disease. For example, consider the difference in magnitude of consequences of false-negative results from employing a decision rule to assess the need for ordering x-rays for a sprained ankle versus a decision rule to assess the presence or absence of subarachnoid hemorrhage. Second, probabilities must be revised when new information becomes available. Updating of probabilities, also known as conditional probability assessment, has particular importance in emergency medicine, where most decisions must be made within a few hours at the longest.

In this issue of Academic Emergency Medicine, Jeffrey Hom’s evidence-based review2 of the risk of intra-abdominal injuries in hemodynamically stable pediatric patients with blunt abdominal trauma and negative abdominal computerized tomography (CT) scan reminds us of the art of balancing the probabilities in our clinical decision-making. This review reveals that the risk of intra-abdominal injuries in such cases is less than 1%. This raises the question: is <1% an acceptable “minimal risk”? What is the basis for this threshold? How should this number be interpreted, and what factors should be considered before applying this evidence in practice?

Bayes’ Theorem

In the past few decades, the use of probability in medical decision-making has become more scientific and structured. The most prominent strategy for bringing the concept of probabilities to the patient’s bedside is Bayes’ theorem. This approach is a way of understanding how the probability of an event or disease changes as new information arrives.

Bayesian reasoning first requires an estimate of the baseline probability of a disease before any test is ordered. This baseline probability goes by the synonyms prior probability and pretest probability. A test could mean a physical finding; a pertinent medical history; or an electrocardiogram, laboratory assay, or diagnostic imaging. No test is perfect; all produce false-positive and false-negative results. Accordingly, the deciding clinician must modify the baseline probability based upon the magnitude of “skew” introduced by the test’s diagnostic sensitivity (proportion of diseased patients with a positive test) and specificity (proportion of nondiseased patients with a negative test). The numeric tool that summarizes this skew is the likelihood ratio. Two likelihood ratios exist for any given test: the likelihood ratio negative [(1 – sensitivity)/specificity] is used when the test is negative, and the likelihood ratio positive [sensitivity/(1 – specificity)] is applied when the test is positive. The pretest probability is first converted to odds using the formula O = P/(1 – P) and then multiplied by the appropriate likelihood ratio; the resulting posttest odds are converted back to a probability using P = O/(1 + O). The effect is to skew the baseline probability upward (in the case of a positive test) or downward (in the case of a negative test). This revised probability is termed the posttest or posterior probability.3 In Bayesian reasoning, one keeps updating an original probability by gathering new information to the point that one is satisfied with the probability value that can be used to make a final decision. Although they may not use the term “Bayesian reasoning” to describe it, experienced clinicians do this every day: they update and combine independent observations to revise the probability, and thus their beliefs, about a dependent event (usually a disease) in their day-to-day practice.
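The arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function names are ours, and the sensitivity, specificity, and pretest probability are made-up values, not data from any study.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR positive, LR negative) as defined in the text."""
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def posttest_probability(pretest, lr):
    """Update a pretest probability with one likelihood ratio."""
    if not 0 < pretest < 1:
        raise ValueError("pretest probability must lie strictly between 0 and 1")
    pretest_odds = pretest / (1 - pretest)      # O = P / (1 - P)
    posttest_odds = pretest_odds * lr           # skew by the likelihood ratio
    return posttest_odds / (1 + posttest_odds)  # P = O / (1 + O)


# Hypothetical test: sensitivity 90%, specificity 80%, pretest probability 20%.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.90, specificity=0.80)
p_after_positive = posttest_probability(0.20, lr_pos)
p_after_negative = posttest_probability(0.20, lr_neg)

print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}")
print(f"posttest probability after a positive test: {p_after_positive:.3f}")
print(f"posttest probability after a negative test: {p_after_negative:.3f}")
```

With these illustrative numbers, a positive result moves a 20% pretest probability to roughly 53%, and a negative result moves it to roughly 3%.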

The diagram in Figure 1 depicts how probability can be reduced to practice. If a clinician’s pretest probability is low enough to rule out a disease (i.e., probability falls in the rule-out zone), no further testing seems necessary. Similarly, when the pretest probability is high enough to secure the diagnosis (i.e., probability falls in the rule-in zone), treatment should be started, again without further testing. However, when the pretest probability of the disease falls in the uncertainty zone, the physician starts gathering more information and orders tests to shift the posttest probability either across the threshold for the rule-out zone, to safely exclude the diagnosis, or across the threshold for the rule-in zone, to confirm the diagnosis and start treatment.
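The three zones can be expressed as a simple classification. The sketch below is illustrative only: the function name and the two threshold values (2% and 80%) are hypothetical placeholders chosen for the example, not recommended cutoffs for any disease.

```python
def decision_zone(probability, rule_out_threshold, rule_in_threshold):
    """Place a disease probability in one of the three decision-making zones."""
    if not 0 <= probability <= 1:
        raise ValueError("probability must lie between 0 and 1")
    if not 0 <= rule_out_threshold < rule_in_threshold <= 1:
        raise ValueError("thresholds must satisfy 0 <= rule-out < rule-in <= 1")
    if probability < rule_out_threshold:
        return "rule-out zone: no further testing"
    if probability > rule_in_threshold:
        return "rule-in zone: treat without further testing"
    return "uncertainty zone: gather more information"


# Hypothetical thresholds for illustration only.
RULE_OUT = 0.02
RULE_IN = 0.80

for p in (0.01, 0.30, 0.90):
    print(f"probability {p:.2f} -> {decision_zone(p, RULE_OUT, RULE_IN)}")
```

The same function can be applied to a pretest probability and again to each posttest probability, which mirrors the repeated updating described in the text.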

Figure 1. Relationship of the probability thresholds with the decision-making zones.

In addition to the step of formulating the baseline probability, another challenging aspect of Bayesian reasoning is the task of setting the decision thresholds. Clearly, when the disease is not lethal, physicians are more comfortable with taking a higher risk of missing the diagnosis. Going back to our previous example, clinicians’ tolerance for a missed diagnosis differs considerably between ankle fracture and subarachnoid hemorrhage.

Objective methods of setting the test threshold exist, including a simple algebraic formula proposed by Pauker and Kassirer4 or the use of complicated nonlinear multivariate modeling.5 These formulae require quantified estimates of the consequences of wrongly excluding or diagnosing disease, which are often highly speculative and fungible. Regardless of the method of estimation, patients, clinicians, lawyers, and society must accept that a posttest probability of zero is a fallacy. Probability theory proves it. Even if a test repeatedly produces a posttest probability of zero in multiple research settings, eventually, given enough sample size, the test will have a false negative. Clinicians have a duty to know and teach that all diagnostic tests carry some risk of error in both directions, diagnosis and exclusion. How much risk one should take is determined on a case-by-case basis and in consideration of cultural and human factors that vary between patients and locations.6 Human factors include the patient’s psychological state, education, and religious beliefs; cultural factors include the surrounding socioeconomic status, medicolegal climate, and resource availability. No formula can consider all of these. One of the authors (JAK) asked a patient what test threshold he would like for a potentially fatal disease. The patient, somewhat to that author’s surprise, declared himself a born-again Christian, further elaborating that he had no concern over his numeric risk of dying because of his belief that his death would lead him to his reward in heaven. We submit that not all patients may share such affection for dying, nor would all patients think of 1:4 as favorable odds, as did our foreign friend, Ramin. Nonetheless, these examples illustrate that just as no single best probability threshold can be sensibly used for ruling out all diseases, no single best method of setting the test threshold exists for any one disease without considering patient-oriented factors.
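The claim that a posttest probability of zero is a fallacy can be illustrated numerically. The sketch below assumes a simple binomial model with independent patients, which is our assumption and not something stated in the text; the function names and sample sizes are hypothetical. It shows that observing zero false negatives in a study still leaves a nonzero upper bound on the true miss rate, and that even a very small miss rate is nearly certain to surface once enough patients are tested.

```python
def upper_bound_with_zero_misses(n_patients, confidence=0.95):
    """One-sided upper confidence bound on the true miss rate
    when 0 misses were observed among n_patients (binomial model)."""
    if n_patients < 1 or not 0 < confidence < 1:
        raise ValueError("need n_patients >= 1 and 0 < confidence < 1")
    # Solve (1 - p)^n = 1 - confidence for p.
    return 1 - (1 - confidence) ** (1 / n_patients)


def chance_of_at_least_one_miss(true_miss_rate, n_patients):
    """Probability that at least one false negative occurs in n_patients."""
    return 1 - (1 - true_miss_rate) ** n_patients


# Hypothetical study: 300 patients with a negative test, none with missed disease.
bound = upper_bound_with_zero_misses(300)
print(f"0 misses in 300 patients: true miss rate could be as high as {bound:.4f}")

# Hypothetical true miss rate of 1 in 1,000 applied to 5,000 patients.
p_any = chance_of_at_least_one_miss(0.001, 5000)
print(f"chance of at least one miss in 5,000 patients: {p_any:.3f}")
```

Under these assumptions, a study of 300 patients with no misses is still compatible with a true miss rate of about 1%, close to the familiar "rule of three" approximation of 3/n.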

Rules of decision-making based on probabilities

1. Tests Should Only Be Ordered if They Affect the Decision

Indiscriminate ordering of tests can delay diagnosis and treatment and expose the patient to test- and treatment-related risks (e.g., radiation exposure or unnecessary surgery). Emergency medicine flourishes as a specialty at least in part because it teaches the credo to decide quickly, based on limited information. Put another way, part of the daily grind for us, the “pit docs” of emergency medicine, is the ability to rapidly formulate beliefs based on limited and often ambiguous information.

As in any moment of decision, the best thing you can do is the right thing, the next best thing is the wrong thing, and the worst thing you can do is nothing.—Theodore Roosevelt7

2. Threshold Values Should Be Established Before Any Tests Are Ordered

Before any tests are ordered, the rule-in threshold and the target posttest probability should be established or at least considered. In some cases, an exact numeric value may make sense, as in the case of potentially fatal diseases. For medical problems that do not threaten to kill the patient or destroy vital organs, such as a sprained ankle, the test threshold may be stated more qualitatively, as “low” or “nonlow.” Determining the benefit-to-risk ratio can assist the clinician in placing the thresholds: If the benefit of treatment is much higher than the risks associated with treatment (e.g., prescribing antibiotics to a woman with dysuria), the benefit-to-risk ratio becomes large and the treatment threshold becomes low. In this case, treatment may be appropriate even at a relatively low posttest disease probability. However, if the benefits of treatment are very small compared with the risks (e.g., chemotherapy for some types of cancer), the benefit-to-risk ratio is small and the treatment threshold moves closer to 1.
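One commonly used way to turn a benefit-to-risk ratio into a treatment threshold is T = 1/(1 + B/R), where B is the net benefit of treating a patient who has the disease and R is the net harm of treating a patient who does not. This formula is not given in the text and is assumed here for illustration; the function name and the benefit and risk values are hypothetical and expressed in arbitrary common units.

```python
def treatment_threshold(benefit, risk):
    """Treatment threshold T = 1 / (1 + B/R).

    benefit: net benefit of treating a diseased patient (assumed units).
    risk:    net harm of treating a nondiseased patient (same units).
    """
    if benefit <= 0 or risk <= 0:
        raise ValueError("benefit and risk must both be positive")
    return 1 / (1 + benefit / risk)


# Large benefit-to-risk ratio (hypothetical 20:1): threshold is low.
low_threshold = treatment_threshold(benefit=20, risk=1)

# Small benefit-to-risk ratio (hypothetical 1:9): threshold moves toward 1.
high_threshold = treatment_threshold(benefit=1, risk=9)

print(f"benefit:risk 20:1 -> treat above a probability of {low_threshold:.3f}")
print(f"benefit:risk 1:9  -> treat above a probability of {high_threshold:.3f}")
```

The behavior matches the qualitative description: as the benefit-to-risk ratio grows, the threshold falls toward 0, and as the ratio shrinks, the threshold approaches 1.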

3. The Thresholds Should Be Individualized for Each Disease

Probability numbers are relative. They only become meaningful in the context of the disease in question, the patient, and the location.

The drawback

Critics of Bayesian logic consider this approach too “subjective,” especially at the first step of estimating pretest probability.8 They argue that in traditional or frequentist statistics, pretest probabilities are derived objectively from studies with rigorous methodologies. Bayesian statistics uses the same approach, but also leaves room for modification based on experience and knowledge. To use a weather analogy, we can predict from prior records that on any randomly selected day in Seattle, in the absence of any other condition, the probability of rain is about 43%. However, if we add the condition that the randomly drawn day also be a sunny day, this probability drops to about 1%, but does not reach zero. The sky does rarely produce rainfall when the sun is shining, and patients do rarely have subarachnoid hemorrhage without headache.

Thus, this subjectivity alters the posttest probability in a definable and sensible way. If two clinicians start with widely divergent estimates of the pretest probability of a disease, their posttest probabilities will inevitably differ as well. However, in reality, as more evidence accumulates (i.e., “test equivalents,” which can include a new historical fact afforded by the arrival of a family member), the likelihood ratios of these “tests” tend to normalize; differing pretest probabilities converge toward a more central posttest probability estimate.9
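This convergence can be illustrated with a short simulation. The sketch assumes that the findings are conditionally independent so that their likelihood ratios can be multiplied in sequence, and that all findings point the same way (here, against the disease); both are our assumptions. The two pretest probabilities and the three likelihood ratios are hypothetical.

```python
def update(probability, lr):
    """Apply one likelihood ratio to a probability via odds."""
    odds = probability / (1 - probability)
    posttest_odds = odds * lr
    return posttest_odds / (1 + posttest_odds)


def trajectory(pretest, likelihood_ratios):
    """Return the probability after each successive finding."""
    probabilities = [pretest]
    for lr in likelihood_ratios:
        probabilities.append(update(probabilities[-1], lr))
    return probabilities


# Three hypothetical negative findings (likelihood ratios below 1).
findings = [0.2, 0.25, 0.1]

clinician_a = trajectory(0.05, findings)  # starts with low suspicion
clinician_b = trajectory(0.50, findings)  # starts with high suspicion

# Absolute difference between the two clinicians after each finding.
gaps = [b - a for a, b in zip(clinician_a, clinician_b)]

for step, (a, b, gap) in enumerate(zip(clinician_a, clinician_b, gaps)):
    print(f"after {step} findings: A = {a:.4f}, B = {b:.4f}, difference = {gap:.4f}")
```

In this example the two estimates start 45 percentage points apart and end less than 1 percentage point apart. The ratio of the two clinicians' odds never changes; it is the difference in probability that shrinks as both estimates approach the same extreme.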

In summary, Bayes’ theorem allows clinicians to combine the objective results of a test, or a test equivalent, with their prior clinical suspicions to hone the probability of a particular disease. However, employing this strategy requires careful consideration and adherence to certain rules. It is important to understand that probability values and decision thresholds are relative: they differ between diseases and between patients. Probability should be factored into medical decisions and treatment plans with attention to patients’ preferences and local circumstances.
