Context: Current efforts to improve the cost-effectiveness of health care focus on assessing accurately the value of technologically complex, costly medical treatments for individual patients and society. These efforts universally acknowledge that the determination of such value should incorporate information regarding the risks posed by a given treatment for an individual, but they typically overlook the implications for medical decision making that inhere in how notions of risk are understood and used in contemporary medical discourse. To gain perspective on how the hazards of surgery have been defined and redefined in medical thought, we examine changes over time in notions of risk related to operative care.
Methods: We reviewed historical writings on risk assessment and patient selection for surgical procedures published between 1957 and 1997 and conducted informal interviews with experts. To examine changes attributable to advances in research on risk assessment, we focused on the period surrounding the 1977 publication of an influential surgical risk-stratification index.
Findings: Writings before 1977 demonstrate a summative, global approach to patients as “good” or “poor” risks, without quantifying the likelihood of specific postoperative events. Beginning in the early 1980s, assessments of operative risk increasingly emphasized quantitative estimates of the probability of dysfunction of a specific organ system after surgery. This new approach to establishing surgical risk was consistent with concurrent trends in other domains of medicine. In particular, it emphasized a more “scientific,” standardized approach to medical decision making over an earlier focus on individual physicians’ judgment and professional authority.
Conclusions: Recent writings on operative risk reflect a viewpoint that is more specific and, at the same time, more generic and fragmented than earlier approaches. By permitting the separation of multiple component hazards implicit in surgical interventions, such a viewpoint may encourage a distinct, permissive standard for surgical interventions that conflicts with larger policy efforts to promote cost-effective decision making by physicians and patients.
The high and rising costs of medical care in the United States (Bodenheimer 2005; Cutler, Rosen, and Vijan 2006), along with persistent regional variations in health services utilization (Fisher et al. 2003a, b; Wennberg 2010) have thrust to the fore the processes of medical decision making—particularly in regard to costly interventions with equivocal or poorly defined benefits—in efforts to restrain potentially wasteful medical spending (Neuman 2010). At the level of the doctor-patient encounter, efforts to increase the transparency of medical decisions and the extent to which treatment choices align with the preferences of individual patients have featured prominently in proposed policy strategies to control growth in medical spending. Such efforts are exemplified by shared decision-making strategies (Guadagnoli and Ward 1998) and tools such as structured decision aids, which seek to supplement discussions between clinicians and patients with information about the risks and benefits of treatment alternatives (Barry 2002; O’Connor, Llewellyn-Thomas, and Flood 2004; O’Connor et al. 2007).
Calls for the more widespread use of tools such as patient decision aids imply that the mechanics of everyday medical decision making are flawed. Such calls bespeak a desire for more standardization of the processes by which practitioners and patients communicate “information on options, outcomes, probabilities, and scientific uncertainties,” and “the personal value or importance [that patients] place on benefits versus harms” of specific treatments (O’Connor, Llewellyn-Thomas, and Flood 2004, 64). As a result, they frame current patterns of health care use and spending as symptomatic of failed communication between physicians and patients on a large scale and assume that reforming such communication will yield a better, more sustainable pattern of health care utilization.
While frequently acknowledging the uncertainties present in all medical choices, such efforts to improve the mechanics of medical decision making view the inputs to treatment choices, such as risks and benefits, as stably defined “outcomes and probabilities” waiting to be communicated (O’Connor, Llewellyn-Thomas, and Flood 2004). Yet in regard to the risks of medical treatments in particular, alternate viewpoints articulated over the last three decades have framed “risk” itself as a contingent, “collectively constructed” category and entity (Douglas and Wildavsky 1982; Slovic, Fischhoff, and Lichtenstein 1980). From this standpoint, the selection and definition of risks at any given moment in time are subject to specific, largely unseen and unappreciated cultural assumptions and biases (Heyman, Henriksen, and Maughan 1998; Slovic 1999). As a result, the way in which “risk” is defined in current medical practice carries its own set of implications for the personal, professional, and policy discourse that determines which patients are thought to be appropriate candidates for costly medical interventions, how this information is communicated between physicians and patients, and how physicians and third-party payers agree on “routine indications” for a given treatment.
Past work in the sociology of medicine has shown that research and the evidence it produces are culturally shaped (De Vries and Lemmens 2006; De Vries, Lemmens, and Bosk 2008; Mol 2002). Nonetheless, the notion of risk as a subjective phenomenon challenges assumptions common to current medical thought and practice. For example, an excerpt from a recent textbook of surgery characterizes risk assessment as a straightforward, value-free exercise in measurement: “The aim of preoperative evaluation is … to identify and quantify comorbidity that may impact operative outcome” (Neumayer and Vargo 2008, 251–2). This text, like others in surgery, internal medicine, anesthesiology, and other disciplines, offers simple, statistically derived prediction rules to facilitate the process of risk quantification before surgery (Arozullah et al. 2000; Detsky et al. 1986; Goldman et al. 1977; Lee et al. 1999).
This conceptualization of operative risk assessment contrasts sharply with the understanding of risk assessment evident in similar textbooks from just over four decades ago:
The assessment of operative risk should be approached as a statistical problem. … However, the statistical approach demands accurate data pertaining to the effects of many factors such as age, starvation, heart disease, etc. … upon the operative risk. These are practically nonexistent. … Obviously, because of these factors the accurate assessment of the operative risk for an individual case is impossible today. All we can do is guess. (Moyer 1970, 232)
Just how operative risk assessment changed from an enterprise perceived by physicians to be a matter of “guesswork” to one seen as a process of “quantification” represents an overlooked chapter in the history of medical thought. Past scholarship has demonstrated the emergence since the 1950s of the “risk factor” as a concept that has come to define the contemporary study, treatment, and experience of illness across a range of conditions by creating a status of being “at-risk” that coexists with states of “sick” and “well” (Aronowitz 1998, 2009; Rothstein 2003). Over this same period, an analogous terminology of risk factors also emerged in the context of decision making for surgical interventions. Drawing on analytic methods and concepts originating in epidemiologic studies of chronic disease, this new terminology came to be applied to, and in turn altered, the task of characterizing, categorizing, and making sense of the hazards of surgery.
To gain perspective on how surgery's hazards have been defined and redefined in medical thought, we examine in detail here changes over time in notions of risk related to operative care. Choices to undertake surgery all involve, to a greater or lesser degree, an acceptance of implicit procedural hazards, making physicians’ assessments of the dangers of treatment to an individual a central element of decision making surrounding surgical procedures (Bosk 1979). Accordingly, we studied writings from the years surrounding the 1977 publication of the first major statistical “risk factor” system focused on predicting a subset of adverse surgical outcomes, the Cardiac Risk Index (Goldman et al. 1977). We traced how the rapid appearance of this index in academic and clinical surgical writings, along with the development of similar statistical models to predict a range of other postoperative outcomes, offered physicians an increasingly standardized and statistically grounded way to assess the risks of surgery in individual patients. Taking the widespread acceptance of probabilistic statements regarding surgery's specific risks as a development to be explained rather than as a simple step forward for medical science (Berg 1995; Hacking 1990), we follow in this article Douglas and Wildavsky's admonition that “what needs to be explained is how people agree to ignore most of the potential dangers that surround them and interact so as to concentrate only on certain aspects” (Douglas and Wildavsky 1982, 9). We explain how the adoption of a new way of assessing the likelihood of a specific type of negative outcome—postoperative cardiac complications—implied a focus on certain dangers of surgery and the relative neglect of others while simultaneously obscuring the subjective nature of risk assessment itself. 
Further, we look at how such an approach to risk assessment created distinct challenges to cost-effective decision making by physicians and patients that remain beyond the reach of probabilistic rules, statistical guidelines, and applicable decision-making tools.
We reviewed major textbooks of surgery and anesthesia published between 1956 and 1997, supplemented by selected editorials and original research articles published in the medical literature during the same period. The textbooks we reviewed included the sixth (1956) through fifteenth (1997) editions of The Textbook of Surgery, which was the continuation of the first major multiple-authored American textbook of surgery (Anonymous 1942) and today remains the “gold standard” surgical reference (Organ 2001; Purcell 2003); the third (1965), fourth (1970), and fifth (1977) editions of Surgery: Principles and Practice, a highly regarded surgical text (Raffensperger 1966) in print until 1977; the American College of Surgeons’ Manual of Preoperative and Postoperative Care in its 1967, 1971, and 1983 editions; and the first (1957) through ninth (1997) editions of Introduction to Anesthesia: The Principles of Safe Practice, an influential early textbook of anesthesiology (Hedley-White 1979).
Both of us closely read excerpts from chapters devoted to the assessment of operative risk, as well as chapters on principles of patient evaluation before surgery more generally. We also examined sections in these surgical textbooks that discussed the role of statistics and computing technologies in the study of patient outcomes. In addition, we reviewed selected editorials and original research articles published in major academic medical journals, including the New England Journal of Medicine, the Journal of the American Medical Association, Annals of Surgery, and Anesthesiology. We identified these articles through the chapters’ bibliographies and through online databases, including MEDLINE and the ISI Web of Knowledge, which we chose as comprehensive listings of medical journal articles published during this period. Our documentary research was supplemented with informal interviews with experts in preoperative risk assessment.
From a methodological standpoint, it was not our aim to write a history of risk assessment practices in surgery during the last half of the twentieth century in the United States. Rather, we sought, in Ian Hacking's words, to gain insight into “the public life of concepts” (Hacking 1990, 7) related to risk assessment in surgery and, in particular, how one specific notion of operative risk gained authority over time. We recognize that the majority of physicians’ assessments of patients’ operative risks, in both the past and the present, are likely to take place as unrecorded acts. Thus, we consider the historical writings we review here as an opportunity to learn what leading academic clinicians believed to be the best available knowledge at different points in history (Christakis 1997; Rabow et al. 2000). Finally, surgical textbooks are especially valuable for tracking temporal changes in thinking about the basic principles of surgical decision making. Textbooks are updated frequently, and they offer the prevailing guidance to physicians on how to assess risks. Thus, changes from one edition to the next offer an opportunity to understand how prevalent definitions of operative risk change over time.
From the 1950s through the first half of the 1970s, “operative risk” figured as a prominent theme in academic and clinical surgical writings. Indeed, “risk” was often the defining characteristic of an individual patient, who was commonly described as a “good” or a “poor” risk, without clearly specifying the hazards or predisposing factors underlying these categories. As the 1967 American College of Surgeons’ Manual of Preoperative and Postoperative Care states,
An early assessment of risk as one of three kinds should not be difficult. Good risk patients are those in excellent health admitted to the hospital for surgical correction of a lesion of a local nature which has no obvious systemic effects. There is no disease immediately apparent involving other organ systems. A poor risk patient is one whose local lesion is of sufficient severity to produce pronounced systemic effects or who has severe disease of one or another vital organ system. … Many other patients fall into a large intermediate risk category in which for reasons of age, mild systemic disease or early systemic effects of the surgical lesion itself, certain corrective procedures should be instituted and more than the routine preoperative investigation should be carried out. (Ballinger 1967, 7)
Here the Manual distinguishes between “good” and “poor” risk patients as distinct archetypes, for whom “operative risk” is a defining attribute encapsulating a broad medical biography. This concept of risk does not separate preexisting diseases from the surgical lesion itself. In fact, risk is dissociated from any single outcome in particular. Instead, risk here encompasses the vast range of potential adverse outcomes that may occur in individuals with a “severe disease of one or another vital organ system.” Categories of risk occur as attributes of patients whose assignment requires an act of individual judgment by an authoritative physician-observer (Berg 1995). Thus, the separation of “good risk” patients from “poor risk” patients relies on a physician's judgment of what constitutes “excellent health,” a “local lesion,” or “pronounced systemic effects.” Such judgments themselves are elevated in status by the fact that a patient's degree of risk is conceptualized in deterministic, rather than probabilistic, terms. Here, risk occurs as a largely fixed quality of an individual, whereas “corrective procedures” may ameliorate the hazards of surgery for intermediate-risk patients. No consideration is given to how clinical interventions might change surgery's hazards for individuals at the extremes of risk or how an individual patient might move from one risk category to another.
Archetypes of the “good” and “poor risk” patient appear in the surgical literature as early as the 1940s. But the Manual's easy assurance that the assessment of risk “should not be difficult” belies deeper disagreements over whether meaningful assessments of “operative risk” could be achieved, as well as more fundamental uncertainties as to exactly what constitutes “operative risk.” Ten years before the Manual's publication, the first edition of Dripps, Eckenhoff, and Vandam's influential anesthesia text had already characterized the assignment of patients to categories of good or poor risk as an enterprise so uncertain as to be meaningless: “The term [risk] as ordinarily used by surgeon or anesthetist is unsound and should be abandoned.” They go on:
To evaluate a “risk” completely would necessitate foreknowledge of such variables as reliability of suture material to be used, adequacy of sterilization of instruments, availability of drugs, the responsibility of those in charge of postoperative nursing care, and a host of other aspects which cannot be assessed for each patient. (Dripps, Eckenhoff, and Vandam 1957, 5)
For Dripps, Eckenhoff, and Vandam, the large number of unmeasurable factors contributing to an individual's surgical outcome makes futile any efforts to sort patients into categories of good and poor risk. Yet even for those who saw operative risk assessment as a valuable and necessary task, these efforts represented an inherently imprecise undertaking (Moyer and Key 1956). Carl Moyer, chairman of surgery at Washington University in St. Louis, noted in 1970:
The factors ostensibly affecting the operative risk are: the anatomic site, the magnitude of the procedure, the age of the person, the character of the disease, the duration of the illness, the metabolic state of the individual, the technic employed to perform an operation, [and] the quality of ancillary medical care and anesthesia. (Moyer 1970, 232–3)
Unlike Dripps, Eckenhoff, and Vandam, Moyer does not see the multiplicity of factors affecting a patient's operative risk as an argument against the value of risk assessment itself. Nonetheless, for Moyer, as for Dripps, Eckenhoff, and Vandam, risk assessment appears as an enterprise firmly rooted in the individual judgments of clinicians. While disagreeing on the utility of such judgments as a guide to clinical decision making, both sources frame the principal challenges of risk assessment as essentially epistemological ones. As a task demanding the simultaneous consideration of multiple unmeasurable influences on the likelihood of an adverse surgical outcome, operative risk appears as abstract and fundamentally unquantifiable.
Beyond debates as to whether operative risk could be meaningfully assessed, writings on surgery and anesthesia before the mid-1970s also disagree on what was meant by “operative risk” in the first place. To Carl Moyer, “operative risk” equaled the likelihood of death: “The appraisal of the operative risk to be assumed by an individual is a sketchy, intuitive evaluation of the probability of dying during an operation and convalescence” (Moyer and Key 1956, 853). For others, it was “an estimate of prognosis from the standpoint of either mortality or morbidity” (Dripps, Eckenhoff, and Vandam 1957, 5), a measure of the likelihood of a “normal convalescence” (Simeone 1972, 118), or the chance of a recovery “free from complications” (Varco 1968, 175).
Such divergences situate the approach to operative risk assessment common to medical writing through the mid-1970s still more firmly within the realm of physicians’ authority and individual judgment. Indeed, beyond relying on the qualitative assessment of an individual clinician to determine a patient's status as a good or poor risk, determining the very meaning of such categories appears as each individual physician's prerogative. Thus, it was a matter of clinical judgment (Bosk 1979) informed by “local knowledge” (Geertz 1983) not only to determine what risk category a given patient occupied but also to decide whether such risk categories should be defined in terms of “the probability of dying” or simply as the odds of a “normal convalescence.”
By the late 1970s, however, broader trends in medical thought had begun to question the place of individual judgment and professional authority as a foundation for medical decision making. Harry Marks (Marks 1997), Jeanne Daly (Daly 2005), and others (Berg 1995; Timmermans and Berg 2003; Weisz et al. 2007) have identified the last half of the twentieth century as the time when a newly “scientific” and standardized approach to medical care emerged in the United States, and reasoning grounded in clinical experimentation and statistical analysis began to challenge practices accepted on the basis of physicians’ authority and individual judgment (Chalmers, Enkin, and Keirse 1989). Alvan Feinstein, an internist at Yale University, and other early advocates of such an approach (Fletcher and Fletcher 1979; Sackett 1969; Wulff 1976, 1986) argued for the application of “scientific methodology” to “the basic elements of clinical medicine” (Feinstein 1963a, b, 1964a, b, c, d) as a means of evaluating and standardizing the “exercises in deductive and inductive reasoning” implicit in “every act of diagnosis, prognostic estimation, [and] therapeutic decision” by physicians (Feinstein 1963b, 929). For Feinstein, improving the means by which physicians could categorize and classify disease states represented a key dimension through which an increasingly “scientific” approach to medical practice could yield marked improvements in clinical care (Daly 2005; Feinstein 1963b):
Clinicians had often analyzed each disease as though it were a single homogeneous fruit salad, rather than a mixture of heterogeneous fruits. Many of our misunderstandings and confusion about the biology of disease had arisen because different clinicians, seeing different mixtures of patients with the same disease, had been neglecting the clinical distinctions of the patients and referring only to the morphologic and other non-clinical characteristics of the disease. By distinguishing and analyzing the clinical components separately, we should be able to clarify many aspects of biologic behavior in human disease; we should be able to prognosticate more accurately and to evaluate therapy more effectively. (Feinstein 1967, 11)
Feinstein and his contemporaries anticipated that all these advances in classification, prognostication, and evaluation would be enabled by developments in computer technology (Barnett 1968; Bleich 1971). To Feinstein in particular, computers promised to “expand the human horizon of clinical medicine” (Feinstein 1967, 370) and be able to resolve fundamental problems that had previously complicated a range of clinical assessment tasks, including determinations of operative risk. Computers would enable individual clinicians to “manage … data with mathematical and quantitative agility” (Feinstein 1967, 370) and to consider a broader array of clinical variables than previously thought possible. Furthermore, it appeared within the grasp of computing technologies to decrease the number of clinical “aspects which cannot be assessed for each patient” that Dripps, Eckenhoff, and Vandam had previously seen as standing in the way of meaningful risk assessments (Dripps, Eckenhoff, and Vandam 1957, 5). Specifically, computers promised to “complete gaps in [the clinician's] own immediate experience,” potentially making evaluations of patients and clinical decision making more accurate, uniform, and reproducible both within and across physicians (Feinstein 1967, 370).
During the 1970s, themes articulated by proponents of a more “scientific” clinical practice began to permeate surgical textbooks. The 1972 edition of the Textbook of Surgery cites Feinstein's 1967 monograph, Clinical Judgment, as a detailed discussion of “the process of assessing operative risk,” and new chapters on computers and statistical techniques in surgery described the potential for new analytic technologies to “permit the division of the total patient population … into particular subgroups that may have different prognoses” (Siegel 1972, 218).
These writings presaged the publication in October 1977 of a multivariate index to predict cardiac complications of noncardiac surgery by Lee Goldman and his collaborators (Goldman et al. 1977). Goldman, who had designed, conducted, and published the work while still a trainee—first as a senior resident in internal medicine at Massachusetts General Hospital and then as a cardiology fellow at Yale—had not published previously on the topic of operative risk assessment, nor had he completed formal training in advanced statistics (Goldman, personal communication, March 31, 2011). Although he did not meet or work with Feinstein until after his cardiac risk project was completed, Goldman's 1977 publication resonated with Feinstein's earlier emphasis on efforts at standardizing the means of “distinguishing and analyzing … separately” the “clinical components” of phenomena observed in daily practice (Feinstein 1967, 11). Motivated by his own experiences in risk assessment as a consulting physician, Goldman drew on multivariate modeling techniques similar to those used to define coronary heart disease risk factors in the Framingham Heart Study (Aronowitz 1998; Kannel 1992; Rothstein 2003) to develop a simple bedside prediction method for postoperative cardiovascular events. Goldman's method, the Cardiac Risk Index, was the first major “risk factor” index designed to predict surgical outcomes, incorporating nine patient characteristics obtainable from history, physical examination, and laboratory studies to estimate the varying probabilities of specific postoperative cardiac complications (see table 1).
Table 1. The Cardiac Risk Index
|Criterion|Points|
|---|---|
|1. Age over 70 years|5|
|2. Myocardial infarction in previous 6 months|10|
|3. Third heart sound or jugular venous distention|11|
|4. Important aortic stenosis|3|
|5. Rhythm other than sinus or premature atrial contractions|7|
|6. More than 5 premature ventricular contractions per minute|7|
|7. Hypoxemia, hypercarbia, hypokalemia, acidosis, renal dysfunction, liver dysfunction, or bedridden status|3|
|8. Intraperitoneal, intrathoracic, or aortic operation|3|
|9. Emergency operation|4|
|Total possible|53|
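The additive logic of such an index can be illustrated with a minimal sketch. The point values below are transcribed from table 1; the dictionary keys, the function name, and the example patient are hypothetical labels introduced only for illustration, and the sketch does not attempt to reproduce how Goldman and his collaborators translated a point total into a probability of complications.

```python
# Point values transcribed from Table 1. All identifiers are illustrative
# assumptions, not names used by Goldman et al. (1977).
CARDIAC_RISK_INDEX_POINTS = {
    "age_over_70": 5,
    "mi_in_previous_6_months": 10,
    "third_heart_sound_or_jvd": 11,
    "important_aortic_stenosis": 3,
    "rhythm_other_than_sinus_or_pacs": 7,
    "more_than_5_pvcs_per_minute": 7,
    "poor_general_condition": 3,  # item 7 in Table 1
    "intraperitoneal_intrathoracic_or_aortic_operation": 3,
    "emergency_operation": 4,
}


def cardiac_risk_index_score(findings):
    """Sum the Table 1 points for each criterion present in `findings`."""
    present = set(findings)  # count each criterion at most once
    unknown = present - set(CARDIAC_RISK_INDEX_POINTS)
    if unknown:
        raise ValueError(f"Unrecognized criteria: {sorted(unknown)}")
    return sum(CARDIAC_RISK_INDEX_POINTS[name] for name in present)


# The maximum score is the sum of all nine items, matching the table total.
MAX_SCORE = sum(CARDIAC_RISK_INDEX_POINTS.values())

# Hypothetical patient: over 70 years old, undergoing an emergency operation.
example_score = cardiac_risk_index_score(["age_over_70", "emergency_operation"])
print(example_score, "of", MAX_SCORE)
```

The sketch makes concrete what the index asks of its user: a checklist of findings and simple addition, with no judgment required about how the findings interact.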
Goldman's index was quickly absorbed into the medical literature. By 1982, it had been cited by 80 biomedical journal articles, and by 1987 it had been cited 224 times. As early as 1981, surgical textbooks praised Goldman's work for going beyond “initial efforts to quantitate what appeared to be subjective impressions” to advance a “whole line of inquiry toward precise determination of operative risk” (Polk 1981, 123).
Goldman's focus on the prediction of postoperative cardiovascular events rather than a more broadly defined set of postoperative complications emerged as both a key innovation and a limitation of his work. While contemporary surgical researchers had already employed multivariate statistical methods to examine mortality among patients with a particular operative illness (Irvin and Zeppa 1976), Goldman's index predicted the occurrence of any one of several potential negative outcomes, all linked to the dysfunction of a single organ system, across a range of surgical procedures. And while cardiac events had been recognized in Goldman's time to be a principal contributor to surgical morbidity and mortality (Arkins, Smessaert, and Hicks 1964; Tarhan et al. 1972), the “precise determination” promised by Goldman's approach was limited to the extent that it did not predict a range of other key end points, such as noncardiac complications or all-cause mortality, relevant to operative risk assessment (Goldman 2010). In contrast to the apparent “guesswork” implicit in earlier approaches to risk assessment, Goldman's index promised a precise, numerical estimate of risk but did so for only a selected set of complications, described in the 1981 edition of the Textbook of Surgery as “fatal and nonfatal, but life-threatening, complications of cardiac origin” (Polk 1981, 123).
Goldman's notion of a discrete “cardiac risk,” distinct from a more general “operative risk,” quickly became a part of didactic writings on risk assessment in surgery and anesthesia, markedly changing discussions of the relationship between preexisting cardiovascular disease and surgical outcomes. In his 1977 chapter on preoperative evaluation, Hiram Polk, chairman of surgery at the University of Louisville, emphasized the potential for symptomatic heart disease to drastically alter a patient's global operative risk: “The patient with congestive heart failure poses an absolutely prohibitive operative risk and should not undergo operation, except those known to be immediately and unequivocally lifesaving” (Polk 1977, 127). Four years later, Polk's chapter was extensively revised to incorporate Goldman's findings. In the later edition, the section on cardiovascular evaluation is largely silent on the implications of advanced heart disease for overall operative risk. The focus is instead on factors found to predict postoperative cardiovascular events:
[Goldman's] work is a useful advance on prior methods to the same end and is as important for what it did not find as for its positive observations. … Goldman and associates did not confirm the significance of diabetes mellitus, smoking, hypertension, hyperlipidemia, stable angina pectoris, remote myocardial infarcts, ST segment or T wave changes on EKG, bundle branch blocks, mitral valvular disease, or cardiomegaly. These must not be ignored but are apparently less pertinent determinants of cardiac risk than had been previously thought. (Polk 1981, 123)
This change, occurring over a period of only four years, suggests an immediate, marked influence of Goldman's work on discussions of risk assessment in surgery. Here the statistical prediction of “complications of cardiac origin” has emerged as a central task of operative risk assessment, replacing an earlier emphasis on the relevance of cardiovascular disease to physicians seeking to distinguish “good risk” from “poor risk” patients on the basis of professional judgment. Stated differently, the focus shifted away from the determination of “surgical risk in the cardiac patient” (Skinner and Pearce 1964, 57) and toward the assessment of “cardiac risk” in the surgical patient.
Over the next two decades, Goldman's index gained progressively greater influence in anesthesia and surgery textbooks' discussions of cardiac risk assessment before surgery. Moreover, the “risk factor” approach adapted by Goldman to the study of postoperative cardiac events came to be applied to predict a progressively greater range of surgical end points. Hiram Polk's 1991 chapter on preoperative evaluation listed “basic factors affecting operative risk,” as well as separate tables listing “cardiac risk factors” and “risk factors for pulmonary complications” (Polk 1991, 82). Similarly, the chapter on patient evaluation in the 1997 edition of the Introduction to Anesthesia lists “predictors of perioperative cardiac risk” and “preoperative risk factors … associated with postoperative pulmonary complications” (Traber 1997, 16–18).
Decision researcher Paul Slovic has argued that “defining risk is … an exercise in power” and that “whoever controls the definition of risk controls the rational solution to the problem at hand” (Slovic 1999, 689). From this perspective, the changing status of operative risk as a concept in medical thought evades simple characterization as a story of progress, enabled by statistical innovations, from a state of confusion to one of understanding. Rather, it offers an example of the abandonment of an older formulation of operative risk for a newer one, with implications for how problems in decision making related to surgical care are defined and how acceptable solutions to these problems come to be found.
Our work spans a period in which the hazards of surgery changed in important ways, characterized by steep declines in associated mortality (Crawford et al. 1981; Hannan et al. 1995; Katz, Stanley, and Zelenock 1994), the migration of a range of surgical procedures from inpatient to outpatient settings (Cullen, Hall, and Golosinskiy 2009), and the development of minimally invasive surgical technologies (Zetka 2003). Yet as the practice of surgery changed, the ways in which physicians thought and wrote about the hazards of surgery also were transformed. Our work traces this conceptual shift related to operative risk as exemplified by the 1977 publication of Lee Goldman's multivariate predictive index for postoperative cardiovascular complications. Goldman's work resonated with broader, ongoing intellectual trends that emphasized practices based on evidence from randomized trials and systematic reviews (Berg 1995; Daly 2005; Marks 1997) and applied industrial principles of standardization to clinical decision making (Timmermans and Berg 2003; Weisz et al. 2007).
More generally, Goldman's approach also echoed a growth between the 1960s and 1990s in the concept of “risk” itself as an organizing theme, not only in medical thought (Skolbekken 1995), but also in society as a whole as a means of articulating and quantifying threats emerging from modernization itself (Beck 1992). Arising during a period of rapid technological change in surgery, Goldman's index offered a way in which the prediction of adverse outcomes after surgery, once the domain of expert physician judges, could, for a subset of surgical complications, be standardized and made quantifiable with equal facility by senior surgeons and first-year trainees. This approach allowed operative risk assessment and, by extension, operative decision making to begin to be reframed as a matter of scientifically reproducible measurement that could be carried out by a range of practitioners with various levels of experience or skill. Thus, along with the many risk-prediction indices that followed it, Goldman's work can be seen as an early step toward situating surgical care in a larger “risk society” (Beck 1992) by meeting the demand for a consistent, uniform language through which physicians, patients, and payers could conceptualize and articulate the distinct hazards of operative care.
Goldman's work appeared at a time when authorities in surgery and anesthesia voiced dissatisfaction with the available tools for risk assessment yet still saw the ideal “statistical approach” to operative risk assessment as a technical “impossibility.” As a means to move past guesswork toward quantification in risk assessment, Goldman's index was embraced rapidly as a key first step to overcoming this “impossibility.” That it appeared almost immediately in prominent surgical texts contrasts markedly with the slow diffusion of medical innovations noted by other observers (Antman et al. 1992; Berwick 2003) and argues for its status as what Joseph Ben-David characterized as a “revolutionary” innovation. Notably, for Ben-David, such innovations derive their impact in part from their emergence from outside an established field of scientific inquiry (Ben-David 1960).
By virtue of Goldman's professional orientation as an internist, rather than a surgeon or anesthesiologist, his academic status, and his lack of prior research on operative outcomes, his work likewise emerged from outside the “invisible college” of researchers (Crane 1972) then focused on the study of surgical outcomes (Goldman, personal communication, March 31, 2011). Goldman's external perspective drew on his own practical experiences to interrupt and shift prevalent modes of discourse on how one key dimension of operative risk should be defined and measured (Ben-David 1960). As a resident and fellow, he was called on often to provide preoperative risk assessments and, like Carl Moyer, was frustrated that all he could do was guess.
Goldman's alternative to risk assessment based on “guesswork” was rapidly embraced in surgical writing as an authoritative approach to assessing cardiovascular risk before surgery and came to serve as a model for subsequent efforts to develop analogous prediction rules for a range of other operative complications. Such observations attest to the utility of Goldman's approach as an organizing theme in clinical research and practice. At the same time, however, our observation of a shift from an older notion of operative risk to a newer one demands reflection on not only what insights may have been gained in this transition but also what may have been lost. Implicit in the notion of operative risk as a statistical phenomenon, defined in terms of event probabilities for a population of patients, is a separation of surgery's outcomes from the experience of any individual in particular. Whereas earlier, more general notions of operative risk were tightly connected to patients’ unique disease histories, more recent efforts to define sets of risk factors for specific surgical outcomes offer a generic, de-personalized view of the hazards of surgery.
To the extent that risk-factor approaches implicitly or explicitly influence the ways in which physicians interpret surgery's hazards, they carry with them the potential to prioritize certain outcomes over others. By defining operative risk as those end points for which prediction rules exist, physicians and clinical researchers elevate a set of predictable outcomes over alternative end points such as changes in quality of life that, albeit difficult to predict, may nonetheless be important to individual patients. Thus, an approach to operative risk assessment that lends primacy to the prediction of near-term cardiovascular or pulmonary complications could marginalize the assessment of other important hazards by separating the immediate dangers of surgery from downstream risks such as those associated with rehabilitation or convalescence. This—along with shortened lengths of stay and the emergence and growth of medical specialties devoted to managing surgical recovery, such as physiatry and critical care—may enable a separation and revaluing of the multiple components of medical work, permitting those decisions related to surgery itself to be abstracted from the social costs of the postsurgical recovery period.
Still more problematic is the observation that statistical prediction models for discrete complications of surgery, such as cardiac, pulmonary, renal, or infectious events, disarticulate the overall hazards of surgery into several smaller component risks. Moreover, these statistical models themselves offer no guidance as to whether or how predictions regarding multiple discrete risks can be reassembled to yield a summative statement of the danger or safety of surgery for an individual patient. Thus, the task of integrating the predictions of diverse statistical models to formulate a coherent notion of operative risk for the individual continues to rely on qualitative judgments regarding the relative importance of surgical hazards that differ in their nature and timing. For example, by disaggregating the experience of operation from that of convalescence, contemporary statistical approaches to risk assessment make it all the more difficult to integrate information on the diverse hazards faced by an individual surgical patient. Such considerations make Carl Moyer's 1970 dictum—“all we can do is guess”—likely to be as relevant a comment on operative risk assessment today as it was in its own time. Yet where Moyer acknowledged the substantial amount of uncertainty in risk assessment, contemporary discussions appear to overlook the high degree of guesswork implicit in how such assessments are made and used in decision making. Furthermore, by separating complications occurring immediately after surgery from those emerging during rehabilitation and recovery, statistical approaches to risk assessment are likely to contribute to a permissive standard for decisions regarding surgical care by inflating the benefits of a surgical intervention at the same time as they work to deflate its potential costs to individuals, their primary caregivers, and society.
Our findings must be interpreted in the context of important limitations. The academic and clinical writings we have examined here can only approximate how individual physicians have comprehended and assessed risks in practice. Further research is required to confirm these findings and explicate how the hazards of surgery are conceptualized by clinicians in practice, communicated to patients, and incorporated into decision making, particularly in the context of changing clinical evidence surrounding interventions intended to mitigate surgical risk (McFalls et al. 2004). Finally, our study did not look at other factors that also likely influenced the utilization of surgical services over this period, such as changing reimbursement practices, the development of minimally invasive technologies, and the development of safer anesthetic and surgical techniques.
Nonetheless, the changes we describe here regarding notions of operative risk occurred over a period in which operative decision making and patient selection for surgery changed in dramatic ways. Since the 1960s, efforts to determine the “age limit for operations of a certain magnitude” (Wojnar and Moghul 1963) and to define the safety of major surgery among the oldest old (Burnett and McCaffrey 1972; Djokovic and Hedley-Whyte 1979; Kohn et al. 1973; Marshall and Fahey 1964) have given way to concerns that the surgical workforce in the United States will not be sufficient to meet older adults’ growing demands (Etzioni et al. 2003) and that not enough physicians will be available to oversee the advanced medical treatments needed to support their recovery (Kelley et al. 2004).
Such shifts over time in the nature of surgical patients bespeak real changes since Carl Moyer's time in how individuals come to be classified as “good” or “poor” surgical candidates from the standpoint of operative risk. Taken alongside our review of historical medical writings over four decades, they speak to important gaps in our knowledge of how advanced medical and surgical treatments ceased to be exceptional events in a person's life and came instead to be an everyday part of a process of aging. Our discussion of how a new way of categorizing and measuring surgery's hazards emerged in medical thought points to the need to understand better what we talk about when we talk about risk in the context of medical decisions. Such an understanding is necessary for grasping the unintended and unacknowledged ways in which our current language of risk informs how decisions regarding medical interventions are made and how this language helps create and sustain the viewpoint from which the utilization and outcomes of surgical care are now measured.