Innovations in the Assessment of Transplant Center Performance: Implications for Quality Improvement

Authors


* Corresponding author: John D. Kalbfleisch, jdkalbfl@umich.edu

Abstract

Continuous quality improvement efforts have become a central focus of leading health care organizations. The transplant community has been a pioneer in periodic review of clinical outcomes to ensure the optimal use of limited donor organs. Through data collected from the Organ Procurement and Transplantation Network (OPTN) and analyzed by the Scientific Registry of Transplant Recipients (SRTR), transplantation professionals have intermittent access to specific, accurate and clinically relevant data that provide information to improve transplantation. Statistical process control techniques, including cumulative sum (CUSUM) charts, are designed to provide continuous, real-time assessment of clinical outcomes. Through the use of currently collected data, CUSUMs can be constructed that provide risk-adjusted program-specific data to inform quality improvement programs. When retrospectively compared to currently available data reporting, the CUSUM method was found to detect clinically significant changes in center performance more rapidly, which has the potential to inform center leadership and enhance quality improvement efforts.

Introduction

Continuous quality improvement programs have emerged as key components of high-performing health care organizations. The collection and timely analysis of clinically relevant data is crucial in accomplishing quality improvement initiatives and ensuring the highest quality care for patients. Organ transplantation is unique among medical specialties in the quantity and quality of data collected on a national basis. Through data collected from the Organ Procurement and Transplantation Network (OPTN), accurate and clinically relevant data are available that can provide information to improve transplantation outcomes. Currently, however, these data are analyzed on an episodic basis and provided to centers semiannually through the Program-Specific Reports (PSRs). Consequently, recognition of clinically relevant changes in clinical outcomes may be delayed, limiting the success of quality improvement efforts.

Statistical process control charts were originally developed to study industrial processes in the 1920s by W.A. Shewhart and his colleagues at Bell Laboratories (1). These charts measure performance over time and ‘signal’ if there is a deviation from accepted production standards. The CUSUM, or cumulative sum, chart was introduced in 1954 by Page and provided a very sensitive approach to monitoring a process and identifying changes in outcome (2). The purpose of these charts is to give timely and easily interpreted summaries of outcome data. The potential utility of CUSUMs in health care was recognized in the early 1970s when the paper ‘Why don't doctors use CUSUM?’ was published in the Lancet (3). Several years later, the New England Journal of Medicine published a manuscript that highlighted the value of CUSUM techniques in clinical applications (4). Broad acceptance of these techniques, however, was delayed initially by data collection limitations and subsequently by the inability to include meaningful risk adjustment.

Recent high-profile events, including a cluster of heart transplant deaths in Britain (5) and the Institute of Medicine's report, ‘Crossing the Quality Chasm’ (6), have contributed to a heightened awareness of the need to monitor surgical outcomes. Public interest and recent improvements in the analytic methods have led to rapid increases in the utilization of CUSUMs to track surgical outcomes. In a 2007 review, Biau and colleagues identified 31 studies which utilized CUSUMs to track surgical outcomes in cardiac, general and ENT surgery (7). Other work on CUSUM methods and applications in medical studies can be found in various statistical and medical journals (8–10).

Application of CUSUMs to the management of transplant centers offers physician leaders the opportunity to track outcomes in a real-time, risk-adjusted manner. A previous retrospective analysis demonstrated that CUSUMs identified changes in clinical practice sooner and with higher sensitivity than current center monitoring techniques (11). Recent improvements in chart construction are based on survival analysis techniques and allow the incorporation of outcomes as they occur, rather than after passage of a specific time period (e.g. 1-year posttransplant) as had previously been the case (12). Furthermore, these CUSUM charts are risk adjusted using the comprehensive models currently employed by the SRTR to adjust outcomes for patient and donor characteristics, and thus account for each center's patient mix. In general, the CUSUM compares observed outcomes with expected results; it increases in value as graft failures or patient deaths occur and decreases during periods with no failures. If too many failures occur over time (compared to what would be expected), the value of the CUSUM will exceed a predetermined threshold value and ‘signal’ that a process review should be initiated.

This article provides a brief overview of the construction and application of CUSUM charts for transplant professionals. We begin with a brief summary of the construction and interpretation of CUSUM charts, including a number of examples. Following this, we briefly review the methods currently used by the SRTR to assess transplant programs in the PSRs. Next, we present a retrospective comparison of CUSUM monitoring with the techniques employed in the PSRs. Finally, we address the strengths as well as potential difficulties and risks of a broad application of this technique.

Defining CUSUMs

A CUSUM chart for a given program or center presents a simple graphical comparison of observed and expected numbers of events over time. In their initial description of a clinically relevant, risk-adjusted CUSUM, Steiner et al. and Grigg et al. described methods for assessing performance of a clinical system that produced a binary outcome (e.g. death following cardiac surgery) (10,13,14). In these methods, the CUSUM is increased or decreased by a variable degree depending upon the observed and expected outcomes from the process. Axelrod et al. applied this approach to transplant data and developed CUSUM charts to monitor 1-year posttransplant survival for liver transplant recipients and 1-year allograft survival for kidney transplant recipients (11). Based on the method of Steiner et al. (13), a logistic regression model, which included several donor and recipient factors, was utilized for risk adjustment.

Binary outcome CUSUM charts are very useful tools for monitoring situations in which the outcome is binary and rapidly ascertained; for example, in monitoring conversion rates (the percentage of possible donors which actually result in transplantable organs), acceptance rates (the percentage of organ offers which are accepted by a program) or mortality rates over a short, fixed period of time. They do have a disadvantage: in monitoring longer term survival outcomes such as 1- or 2-year mortality rates, the data on any given individual cannot be used until the corresponding period has elapsed.

In 2008, Biswas and Kalbfleisch developed a method to create risk-adjusted CUSUM charts that are based on a continuous time survival analysis approach and are able to incorporate deaths or graft failures as they occur (12). These charts have a substantial advantage in the monitoring of longer term survival endpoints and are more consistent with the Cox model-based risk-adjusted methods used by the SRTR. This method can be utilized to construct two types of CUSUM charts: a one-sided chart, in which the value is restricted to nonnegative sums, and a two-sided or O-E chart. Both are described below; a more complete description of the calculation of the CUSUMs is included in the Appendix.

The one-sided CUSUM:  It is constructed principally to assess for a clinically significant excess of allograft failures or patient deaths. The CUSUM is restricted to nonnegative values and so is bounded below at zero. Thus, a center performing at, or better than, the expected performance (and thus having fewer observed failures than expected) would have a chart that tends to stay relatively close to zero. It would increase with any failure, and then drift back toward zero in an ensuing period without failures. Conversely, if the center's outcomes are much poorer than the national average, the number of failures will lead to a substantial increase in the CUSUM and this will eventually result in a signal. The one-sided CUSUM signals when the plot line crosses a horizontal line, termed the control limit, which defines the signaling threshold. The height of this line (L) reflects the balance between rapid signaling that will very quickly identify centers with poor outcomes (sensitivity) and the desirability of avoiding false positive results, that is, signals that occur when the center's performance is actually consistent with the national average (specificity). The value of L can be adjusted for center volume to ensure that the false positive rate is kept at a suitably low level for all centers. Figures 1–3 (bottom panels) provide examples of the one-sided CUSUM. In our analysis, if a kidney transplant center's volume is 40 transplants/year or more and its true rate is twice the adjusted national average, the chance of a signal over a 3-year period is 90% or more. If the center's rates are the same as the national average, a period of 30 years would be expected before the first false positive signal.

Figure 1.

Panel A1 (top): O-E CUSUM chart for 1-year mortality in the liver program at Center A, which had 200 liver transplants within a 3.5-year period. Panel A2 (bottom): One-sided CUSUM chart for 1-year follow-up in the liver program at Center A.

Figure 2.

Panel B1 (top): O-E CUSUM chart for the liver program at Center B, which had 256 liver transplants within the 3.5-year period. Panel B2 (bottom): One-sided CUSUM chart for the liver program at Center B.

Figure 3.

Panel C1 (top): O-E CUSUM chart for the liver program at Center C, which had 414 liver transplants within the 3.5-year period. Panel C2 (bottom): One-sided CUSUM chart for the liver program at Center C.

The two-sided or O-E CUSUM:  For a given center, this simplest of CUSUM charts plots, as a function of time, the difference between the observed number of deaths and the number of deaths that would be expected based on the risk-adjusted national average. This CUSUM can be viewed as being updated daily by adding to its previous value the observed number of deaths on that day less the expected number. The expected number of deaths is estimated from a survival model based on national data and is adjusted for the particular patient mix at the center. Thus, the O-E chart traces out an approximately horizontal path (slope = 0) if the death rate at the center is close to the risk-adjusted national average. An upward trend of the O-E plot over a specified time interval indicates that the center has worse outcomes than the adjusted national average. Conversely, a downward trend of the plot corresponds to better outcomes than the adjusted national average. Figures 1–3 (top panels) provide examples of O-E plots and are discussed further below. As indicated by the arrows superimposed on the plots, the ‘slope’ of the plot gives an estimate of the approximate relative risk (RR), which is the ratio of the death rate at the center to that for the adjusted national average.
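The daily update described above can be sketched in a few lines of Python. This is only an illustration: the function name and the input lists are hypothetical, and the expected counts would in practice come from the risk-adjustment model rather than being supplied by hand.

```python
from itertools import accumulate


def o_minus_e_cusum(observed, expected):
    """Running sum of (observed - expected) deaths, one value per day.

    `observed` and `expected` are hypothetical daily series of equal length;
    in practice the expected counts come from the risk-adjustment model.
    """
    return list(accumulate(o - e for o, e in zip(observed, expected)))


# Three illustrative days: one death on day 2, 0.25 deaths expected each day.
path = o_minus_e_cusum([0, 1, 0], [0.25, 0.25, 0.25])
print(path)
```

A path that drifts upward over an interval corresponds to more deaths than expected over that interval, and a path that drifts downward corresponds to fewer.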

It is possible to use the two-sided CUSUM to provide a signal when there is statistical evidence that a center's outcomes are different from the national average. The two-sided CUSUM signals if the ‘slope’ of the plot exceeds a predetermined value over an extended period. The signal is obtained by systematically checking the slope of the plot at each successive time using a V-mask as introduced by Barnard (15,16) and discussed in the Appendix. As with the height L in the one-sided chart, the slope that is considered significant in the two-sided chart can be adjusted to balance sensitivity and specificity.

The principal advantage of the O-E plot is that the slope of the plot over a given interval gives an immediate picture of the relative rate of outcomes within the center of interest compared to expected results. In the one-sided CUSUM, the slope of the chart is more difficult to interpret and immediate comparisons to the national average or expected results are more difficult to see. We find that both charts provide useful and complementary information, as the examples in the next subsection illustrate.

Examples of CUSUM charts and interpretation

A sample of CUSUM charts over a 3.5-year period for liver transplant programs, labeled Center A, Center B and Center C, is described here. For each center, the one-sided CUSUM and O-E CUSUM charts are presented for 1-year patient survival. Similar charts could be constructed for graft survival or for other outcomes such as 1-month or 2-year survival.

Center A:  From the O-E chart, the 1-year death rate in Center A is close to the national average for the first year, as is suggested by the nearly horizontal plot line (slope approximately 0) (Figure 1). For the second period, from 1 to 3.5 years, the death rate exceeded the national average, as illustrated by an increase in the slope of the O-E CUSUM. From the one-sided CUSUM chart (Figure 1), we see that these trends would have led to a signal at the end of the second year. Had the chart been in place, this signal would have suggested that a review of center practices was appropriate.

Center B:  The O-E chart in Figure 2 illustrates that Center B experienced death rates very close to the national average (adjusted for patient mix) over the first 2 years of the CUSUM period. During the last 1.5 years, the center had considerably better 1-year outcomes than the adjusted national average. Correspondingly, the one-sided CUSUM chart (Figure 2) did not signal.

Center C:  From the O-E chart (Figure 3), we see that the 1-year mortality at the center was higher than the national average for the first year or so. After that period, the 1-year death rates were considerably lower than the adjusted national rates through the end of the 3.5-year period. From the one-sided CUSUM (Figure 3), we see that the higher death rates observed early on were not sufficient to lead to a signal.

Current Methods of Risk Adjustment and Quality Assessment: SRTR Processes

Although there is no ‘gold standard’ for performance assessment, the methods utilized in the PSRs prepared by the SRTR for the United Network for Organ Sharing (UNOS) Membership and Professional Standards Committee (MPSC) provide an important benchmark for comparison (17). However, it is important to keep in mind a key difference between the CUSUM methods discussed here and the PSRs. The PSRs are a regulatory tool used to help ensure compliance with current performance standards; they are neither intended nor constructed to be used as a quality improvement instrument. The PSRs supplied by the SRTR help the MPSC identify transplant programs or organ procurement organizations (OPOs) that might require site visits or case reviews to look more deeply into potential problems, whereas the CUSUM procedures can be used by the center directors themselves for real-time monitoring and quality improvement efforts.

Broadly speaking, the MPSC seeks to identify programs that experience significant deviations from expected performance measures related to the care of wait-listed and transplanted patients. Centers identified are characterized as needing further review if they meet the following criteria:

  • Clinical outcome failures that exceed predetermined thresholds relative to expected performance. Currently the thresholds include an excess death rate of at least 50% (Observed/Expected > 1.5);
  • A clinically important number of incidences, defined as an absolute number of excess deaths greater than 3;
  • Statistical confidence that the observed difference between observed and expected results is unlikely to have occurred by chance (one-tailed P-value < .05).
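As a rough illustration, the three criteria can be combined into a single check. The function below is a hypothetical sketch, not the SRTR's implementation; it assumes that all three criteria must hold together, that the expected count is positive, and that the one-tailed P-value has already been computed elsewhere and is passed in.

```python
def meets_review_criteria(observed, expected, p_value):
    """Return True when all three review criteria listed above are met.

    Hypothetical helper: `observed` and `expected` are event counts for the
    cohort (expected > 0 is assumed) and `p_value` is a one-tailed P-value
    supplied by the caller.
    """
    excess_ratio = observed / expected > 1.5   # Observed/Expected > 1.5
    excess_count = observed - expected > 3     # more than 3 excess deaths
    significant = p_value < 0.05               # one-tailed P-value < .05
    return excess_ratio and excess_count and significant


# Illustrative numbers only: 10 observed versus 5.0 expected deaths.
print(meets_review_criteria(10, 5.0, 0.01))
```

Requiring an absolute excess as well as a ratio means that a small program with, say, 4 observed and 2 expected deaths is not flagged even though its ratio is 2.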

The SRTR utilizes a robust data set including all observed events during the time a patient is actually followed, either by the center and the OPTN or through other data sources including the Social Security Death Master File. Kidney graft survival data are supplemented by examination of claims data related to return to dialysis from the Centers for Medicare & Medicaid Services (CMS). This methodology essentially eliminates the possibility of uncaptured recipient death or kidney allograft failure events.

All data provided to the MPSC are risk-adjusted using Cox proportional hazards regression models, including donor data, recipient characteristics and transplant variables. For example, for the 1-year deceased donor adult kidney patient survival model, the proportion of the variation explained by the model was 73% for the July 2008 release. It is important to note that the SRTR models used by the MPSC do not adjust for perioperative or posttransplant management practices (e.g. type of induction and immunosuppression). These factors, while clearly influencing outcomes, are within the control of the transplant center and contribute to the center's performance that is examined by the MPSC. The models are designed to control for differences in the underlying recipient severity of illness and donor quality, which are largely determined by the location of the center and the donor characteristics within the OPO.

The program-review criteria were set by the MPSC in order to facilitate identification of programs for which interventions were likely to have a demonstrable clinical impact. The criteria are stringent to reduce the risk of a false positive finding in which outcomes vary from those expected by chance alone. Because smaller centers with poor outcomes may not meet the three thresholds despite poor outcomes, all centers performing nine or fewer transplants in a 2.5-year cohort are evaluated by a separate standard, whose only criterion is at least one adverse event (graft failure or death). The review thresholds for both small and large programs are reviewed periodically by the SRTR and the MPSC.

The MPSC examines quarterly data on a 2.5-year rolling average of outcomes. In order to have complete data on 1-year survival, for example, the PSRs for the MPSC and center directors are somewhat time delayed. Consequently, these analytic techniques are not well suited to day-to-day management of transplant center outcomes by center directors. Previous analyses have demonstrated that significant clusters of surgical failures may be missed utilizing only episodic analysis of average outcomes, particularly for large centers. Given the regulatory and legal implications of MPSC actions, criteria that are highly reliable and very specific are crucial. However, for the purpose of center management, a sensitive, continuous, up-to-date outcomes tracking tool would allow better assessment of center performance. CUSUM techniques may complement the PSRs, and would potentially allow center directors to determine trends in performance in an expedited manner.

Application to Liver and Kidney Outcome Assessment Using the OPTN Database

To assess the potential value of CUSUM monitoring of transplant center performance, we performed retrospective analyses of 1-year patient survival at liver and 1-year graft survival at kidney transplant centers. Two separate charts were constructed to detect declining center performance: a one-sided continuous CUSUM chart and an O-E CUSUM chart with a V-mask (see Appendix). For each center, the value of L and the slope of the V-mask were defined so that they would provide a signal approximately 8% of the time in a 3.5-year period in centers of the same size whose 1-year survival rates are identical to the overall national average. This ‘false positive rate’ is equivalent to the false positive rate that would arise from the methods used to prepare the PSRs.
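One way to see how a control limit relates to a false positive rate is by simulation. The sketch below is not the calibration procedure used in this study. It assumes, purely for illustration, a constant expected number of deaths per day, at most one death per day, a center performing exactly at the national average, and the one-sided update with weight 0.69 given in the Appendix. All names and parameter values are hypothetical.

```python
import random

WEIGHT = 0.69  # per-death increment of the one-sided CUSUM (see Appendix)


def simulate_peak(expected_per_day, days, rng):
    """Highest value reached by a one-sided CUSUM on one simulated null path."""
    u = peak = 0.0
    for _ in range(days):
        # Assumption: at most one death per day, occurring at the expected rate.
        death = 1 if rng.random() < expected_per_day else 0
        u = max(0.0, u + WEIGHT * death - expected_per_day)
        peak = max(peak, u)
    return peak


def false_positive_rate(limit, expected_per_day, days, n_sims=500, seed=0):
    """Fraction of simulated average-performing centers whose chart reaches `limit`."""
    rng = random.Random(seed)
    peaks = [simulate_peak(expected_per_day, days, rng) for _ in range(n_sims)]
    return sum(peak >= limit for peak in peaks) / n_sims


# Roughly 3.5 years of daily follow-up with a hypothetical 0.02 expected deaths per day.
for limit in (1.0, 2.0, 3.0):
    print(limit, false_positive_rate(limit, 0.02, days=1278))
```

Raising the limit lowers the simulated false positive rate, so a limit could in principle be searched for until the rate is close to a chosen target such as 8%.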

Data sources

Data from the cohort of recipients of deceased donor transplants between July 1, 2004, and December 31, 2007, were reviewed for kidney and liver transplant programs in the OPTN database. The data included 11,957 liver transplant recipients transplanted at 67 centers, which ranged in size from 1 to 185 liver transplants per year over the 3.5-year period. We omitted nine centers with fewer than eight transplants per year, for which the CUSUMs would be expected to yield little power. The SRTR models for posttransplant survival used for the PSRs were used to determine the expected number of failures risk-adjusted to correspond to the center's donor and recipient characteristics. The SRTR's 1-year survival model for liver transplants from a deceased donor adjusts for 23 donor and recipient characteristics, whereas the model for living donor liver transplants adjusts for seven donor and recipient characteristics. This study was approved by the Health Resources and Services Administration's (HRSA) SRTR project officer. HRSA has determined that this study satisfies the criteria for the IRB exemption described in the ‘Public Benefit and Service Program’ provisions of 45 CFR 46.101(b)(5) and HRSA Circular 03.

We also considered results from 113 kidney transplant programs that included 31,666 recipients transplanted between July 1, 2004, and December 31, 2007. These programs had average volumes of 1 to 278 deceased and living donor kidney transplants per year. We included the facilities with 8 to 220 transplants per year in our study and grouped them into six categories. This omitted 11 small programs with fewer than 8 transplants per year, and two large centers with 278 and 225 transplants per year. Again, we utilized the SRTR's PSR models to represent national rates for graft failure; the model for deceased (living) donors includes 22 (13) donor and recipient characteristics. Details of both the liver and the kidney models can be found on the SRTR website (http://www.ustransplant.org).

For each program, the control limit L and V-mask were defined based on the center's volume (see Appendix) and it was determined whether the CUSUMs produced a signal (or flag). In addition, a benchmark was based on a review of the PSRs to determine whether or not a flag occurred for each program in the study. The time to flagging for the PSR was taken as 3 or 3.5 years depending on whether the flag occurred in the January 2008 or the July 2008 report.

Results

Results of this comparison are presented for five volume strata for liver transplant programs (Table 1). There is very close agreement between the two CUSUM methods; in fact, they identified exactly the same centers at almost identical times. However, the results differed slightly from those obtained from the current PSR approach. Two centers were flagged by the CUSUMs but not by the PSR approach; both had clusters of surgical failures that were not captured using the PSR methods. For the centers flagged as having below-expected performance, the average time needed to reach the signal point was less than 2 years for the CUSUMs, compared to more than 3 years for the PSR approach. In this case, the CUSUMs signaled slightly more often than the SRTR's PSR approach (12 versus 10).

Table 1.  The number of facilities flagged and average time to flagging in the Program Specific Reports compared to methods based on one-sided and O-E CUSUMs for 1-year survival in 58 liver transplant programs
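| Facility size (transplants/yr) | No. of facilities | PSR: no. flagged | PSR: average time (yrs) | One-sided CUSUM: no. flagged | One-sided CUSUM: average time (yrs) | O-E CUSUM: no. flagged | O-E CUSUM: average time (yrs) |
|---|---|---|---|---|---|---|---|
| 8–20 | 4 | 1 | 3.50 | 1 | 1.76 | 1 | 1.76 |
| 20–50 | 20 | 2 | 3.00 | 2 | 0.68 | 2 | 0.68 |
| 50–100 | 25 | 6 | 3.25 | 8 | 2.06 | 8 | 2.01 |
| 100–140 | 7 | 1 | 3.00 | 1 | 3.27 | 1 | 3.27 |
| 140–185 | 2 | 0 | NA | 0 | NA | 0 | NA |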
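The overall counts and average times quoted in the text can be recovered from the stratum-level figures in Table 1. The short Python sketch below pools the strata, weighting each stratum's average time by its number of flagged centers; the helper name and the tuple layout are our own, and it assumes the stratum averages are simple means over the flagged centers.

```python
# Rows of Table 1: (stratum, PSR n, PSR time, one-sided n, one-sided time, O-E n, O-E time)
TABLE_1 = [
    ("8-20",    1, 3.50, 1, 1.76, 1, 1.76),
    ("20-50",   2, 3.00, 2, 0.68, 2, 0.68),
    ("50-100",  6, 3.25, 8, 2.06, 8, 2.01),
    ("100-140", 1, 3.00, 1, 3.27, 1, 3.27),
    ("140-185", 0, None, 0, None, 0, None),  # no centers flagged, time is NA
]


def pooled(rows, n_col, time_col):
    """Total number flagged and the flag-weighted mean time to flagging."""
    total = sum(row[n_col] for row in rows)
    weighted = sum(row[n_col] * row[time_col] for row in rows if row[n_col] > 0)
    return total, weighted / total


psr_total, psr_mean = pooled(TABLE_1, 1, 2)
one_total, one_mean = pooled(TABLE_1, 3, 4)
oe_total, oe_mean = pooled(TABLE_1, 5, 6)
print(psr_total, round(psr_mean, 2))
print(one_total, round(one_mean, 2))
print(oe_total, round(oe_mean, 2))
```

Pooled this way, the 10 PSR flags take a little over 3 years on average and the 12 CUSUM flags a little under 2 years, in line with the summary in the text.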

Similar results were generated when these methods were applied to the 100 kidney transplant programs using 1-year graft survival as the outcome of interest (Table 2). In this case, both CUSUMs signaled at the same 21 centers, compared to nine centers that were flagged by the PSR. Here again, the average time to signal was much shorter for the two CUSUM approaches. Of the 12 centers that signaled in the CUSUMs but not in the PSR, eight only slightly exceeded the control limit, one signaled after 3 years, when the PSRs are not fully complete, and the other three exceeded the control limit by a substantial amount.

Table 2.  The number of facilities flagged and average time to flagging in the Program Specific Reports compared to methods based on one-sided and O-E CUSUMs for 1-year survival in 100 kidney transplant programs
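| Facility size (transplants/yr) | No. of facilities | PSR: no. flagged | PSR: average time (yrs) | One-sided CUSUM: no. flagged | One-sided CUSUM: average time (yrs) | O-E CUSUM: no. flagged | O-E CUSUM: average time (yrs) |
|---|---|---|---|---|---|---|---|
| 8–20 | 3 | 0 | NA | 1 | 1.00 | 1 | 1.00 |
| 20–50 | 27 | 5 | 3.20 | 8 | 1.51 | 8 | 1.52 |
| 50–100 | 37 | 2 | 3.00 | 5 | 1.92 | 5 | 1.91 |
| 100–140 | 17 | 0 | NA | 3 | 0.75 | 3 | 0.75 |
| 140–180 | 9 | 1 | 3.00 | 3 | 1.03 | 3 | 1.03 |
| 180–220 | 7 | 1 | 3.00 | 1 | 1.22 | 1 | 1.22 |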

These data suggest that there is general agreement between the CUSUMs and the PSRs in determining which centers have not achieved expected levels of performance, though the CUSUMs did signal somewhat more often than the PSRs using the signal limits established here, especially in the kidney outcomes. This is not surprising, perhaps, given that the aim of the CUSUMs is to be an early warning of potential difficulties. In general, the CUSUMs identified centers much more quickly, often 1 to 2 years earlier, than the PSRs and could have provided an opportunity for earlier intervention on the part of center personnel.

Conclusions and Discussion

The medical community is under increasing pressure to develop and implement effective strategies to assess and improve performance. The solid organ transplant community has already established its leadership in this area, by collecting and publicly reporting clinically valid, risk-adjusted center-specific outcome data. Currently, these data are primarily utilized in a regulatory fashion and are not particularly well designed for improving practice in an ongoing fashion. The implementation of a real-time, clinically relevant system of outcomes monitoring with CUSUMs may accelerate efforts to improve both transplant outcomes and organ utilization.

The CUSUMs discussed here depend on appropriate risk adjustment and require substantial analytical development to support them. In the case of posttransplant graft and patient survival, there is substantial experience in risk evaluation and adjustment through the ongoing efforts of the SRTR and the OPTN organ-specific committees. As noted earlier, however, it is important to continually review these models to be sure that important baseline patient characteristics are included. If the center leadership misunderstands the CUSUM results, there is the potential of inappropriate reactions, such as a decision to limit access to high-risk transplants. It should be noted, however, that appropriate adjustment takes these high risks into account in defining the CUSUMs and limiting access to high-risk transplants would not necessarily result in any measured improvement in center performance.

The signaling thresholds in the one-sided and O-E CUSUM are designed to help the clinician understand when the differences are of sufficient magnitude that a review is suggested and help avoid premature or unjustified reaction to apparent trends. As proposed here, the CUSUM would not be a regulatory tool. Rather, as a quality improvement tool, it can be used to assess the clinical impact of changing practices (e.g. accepting higher risk donors). Thus, the CUSUM can help the center avoid coming to the attention of the MPSC and will not increase scrutiny. As noted, there is an excellent agreement between the methods in identifying which centers are at risk; the real difference in the two methods is the rate of ascertainment.

CUSUM charts have been clinically applied in a variety of surgical applications in single-center series to assess outcomes of cardiac surgery (18–21), colorectal surgery (22), breast surgery (23) and other specialties (7). In general, CUSUM charts have been used for two purposes: first, to assess the learning curve inherent in the adoption of new surgical procedures (e.g. laparoscopic nephrectomy) (24), and second, to assess the outcome of a system of care within a single institution or across a group of providers. In the case of the learning curve, the CUSUM chart is followed until the number of observed events is consistently less than expected. The CUSUM chart can be effectively used to monitor processes for either clusters of failures or a steady change in outcomes (either positive or negative), providing that an appropriate one- or two-sided chart is used.

In this paper, we have concentrated on methods associated with monitoring mortality or graft survival outcomes with a CUSUM method based on the Cox regression model. This method is able to utilize all failures as they occur and avoids the waiting time required when survival outcomes are assessed using CUSUMs for binary outcomes, as previously reported (10,11,13–15). We feel that these methods represent a major advance over previously published techniques. Because the risk-adjustment models are consistent with those used in the SRTR's PSRs, center directors can feel confident that the CUSUM will respond to clinically relevant changes in outcomes that may be under the center's control but will not penalize programs that transplant higher risk patients or utilize high-risk organs. Furthermore, the utility of the CUSUM for ongoing management is enhanced by its ability to capture and include all patient deaths and graft failures immediately upon reporting rather than waiting, for example, for a full-year posttransplant.

It should be noted that CUSUM techniques could be used to monitor many other outcomes besides survival. The methods based on a binary outcome, for example, could be used to monitor such outcomes as conversion rates or acceptance rates, in addition to short-term survival. In the industrial context, CUSUMs were initially developed to monitor normally distributed outcomes; normal CUSUMs could be used to monitor other transplant outcomes such as measures of quality of life or creatinine clearance.

Notably, there have been no prospective clinical trials in surgical fields demonstrating that CUSUM charting is effective in a multicenter context. Previous analyses, including our own (11), have all been retrospective. We hope that by increasing knowledge of the techniques and collecting prospective data we can demonstrate the advantages of this methodology and gain support from OPTN members to adopt it. We are currently undertaking a prospective study of these techniques with a limited number of transplant centers and this work may help to fill this void. From the retrospective studies, however, it seems likely that there are substantial gains to be had through the earlier identification of poorer center performance that the CUSUM techniques can help to effect.

There appear to be two major barriers to widespread adoption of CUSUM in the transplant community. First, there is limited familiarity and comfort with using this method of assessing outcomes. Second, some center directors may fear that CUSUM will be used as yet another regulatory tool to identify and censure poor performance. As proposed here, however, CUSUM would be provided confidentially to individual centers as a management tool to help them improve their own performance. One danger with CUSUM charts is that users may react to small trends in the charts that arise by random variation and look to examine and revise the process when such change is unnecessary and perhaps even detrimental. This risk can be minimized through adequate education and appropriate determination of signaling thresholds. The signal from a CUSUM is a useful indicator, but not definitive proof, of a change in center performance. As a tool for quality improvement, the CUSUM chart can both validate the success of practice changes and trigger a comprehensive review and examination, if needed.

Organ transplantation offers an unparalleled opportunity to restore life through the provision of a precious resource. Patients, payers and the press are all interested in systems of increased transparency of outcomes to ensure that the limited supply of organs is optimally used. To obviate any need for additional regulations, the transplant community would be well served to adopt state-of-the-art monitoring systems to improve performance and maximize the benefit of the limited supply of donor organs.

Appendix

Statistical Methods Appendix

Calculation of the continuous CUSUM

Utilizing national data, a Cox proportional hazards model is used to characterize the outcome of interest, such as patient survival following liver transplant. The current SRTR models for posttransplant survival (or graft survival) can provide the basis for this, and by incorporating appropriate clinical and donor characteristics, an estimated probability of each patient's survival for each day following transplant can be determined. The CUSUM is then constructed by examining all the individuals who have been transplanted since the day the chart was initiated. The change in the CUSUM on day t depends on the observed number of patient deaths ($d_t$) and the risk-adjusted expected number of patient deaths ($e_t$) on that day, as derived from the model. The CUSUM is then recalculated daily, incorporating both the longer survival times of existing patients and the experience of new transplants. This method can be used to track outcomes such as mortality over a specified period of time (e.g. 1 year) posttransplant. In this case, patients who are more than 1 year posttransplant would not be included in the CUSUM calculation.

One-sided CUSUM:  This CUSUM is constructed by considering a test of the hypothesis that the actual mortality rate in the institution of interest is the same as that of the population in general (relative risk [RR] = 1) versus an alternative in which the mortality rate is a multiple ($RR_A > 1$) of the overall death rate. This relative risk $RR_A$ is chosen to correspond to an increase in the mortality rate that would be considered clinically important. In what follows, we consider a relative risk of $RR_A = 2$, or a doubling of the risk of death, as the alternative of interest. In calculating the change in the CUSUM on day t, each death on day t increases the CUSUM by an amount (0.69) and this is reduced by subtracting $e_t$, the risk-adjusted number of deaths expected. Thus, the value of the CUSUM ($U_t$) on day t is $U_t = U_{t-1} + 0.69\,d_t - e_t$, where $d_t$ is the observed number of deaths. In the one-sided CUSUM, the sum is restricted so as never to become negative: if it would become negative, its value is replaced by 0, so that in effect $U_t = \max(0,\ U_{t-1} + 0.69\,d_t - e_t)$. For example, suppose that 30 days into the CUSUM, one patient dies ($d_t = 1$) whereas the model predicts that on average $e_t = 0.05$ patients would die on that day. The CUSUM would increase by 0.69 − 0.05 = 0.64. On the other hand, if there were no deaths on that day ($d_t = 0$), the CUSUM would decrease by the value 0.05. If the CUSUM became negative because of this adjustment, it would take the value 0. Once the CUSUM achieves the value 0 it stays there until the next death. Thus, the one-sided CUSUM will increase when failures occur but will decrease only to 0 with a long period of no deaths.
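The update rule above can be sketched in a few lines of Python. This is a minimal illustration, not the SRTR implementation: the function name `one_sided_cusum` and its arguments are hypothetical, and the expected deaths are simply supplied as a list rather than derived from a Cox model.

```python
def one_sided_cusum(deaths, expected, death_weight=0.69):
    """Return the one-sided CUSUM path, one value per day.

    deaths[i]    -- observed deaths on day i (d_t in the text)
    expected[i]  -- risk-adjusted expected deaths on day i (e_t in the text)
    death_weight -- increment per death; 0.69 is the value quoted for RR_A = 2
    """
    path = []
    u = 0.0  # the chart starts at zero
    for d_t, e_t in zip(deaths, expected):
        # Add 0.69 per death, subtract the expected deaths, and never go below 0.
        u = max(0.0, u + death_weight * d_t - e_t)
        path.append(u)
    return path


# Three days without a death, then one death, with e_t = 0.05 on every day.
path = one_sided_cusum([0, 0, 0, 1], [0.05] * 4)
print([round(u, 2) for u in path])  # [0.0, 0.0, 0.0, 0.64]
```

The chart stays at 0 through the quiet days and then rises by 0.69 − 0.05 = 0.64 on the day of the death, matching the worked example in the text.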

The one-sided CUSUM signals that a process review is appropriate when the CUSUM ($U_t$) exceeds some prespecified level L. An appropriate method for choosing L is still the subject of investigation and is largely an empirical process. On the one hand, we wish to choose L so that there will tend to be a long waiting time until a signal occurs if the institution has failure rates that are equal to or better than the national average. On the other hand, if the death rates of the facility are substantially in excess of the national average, we would like to identify the problem as quickly as possible. One approach is to choose L so that, given the facility size (average number of transplants per year), there would be a prespecified probability of achieving a signal over a given period of time if, in fact, the facility death rates were exactly at the national rate. In this work, we adopted this approach and the level was chosen to achieve a ‘false positive rate’ of about 8% over a 3.5-year period. This is a comparable false positive rate to the screening methods currently in use by the SRTR and the MPSC. The average length of time prior to signaling, referred to as the average run length (ARL), is one way of assessing a specified control limit. Ideally, the ARL should be long for centers that are in control (i.e. operating at the national average or better) and short for centers that have high failure rates.
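One way to picture this calibration is a small Monte Carlo sketch: simulate many in-control centers, record the highest value each chart reaches in 3.5 years, and take L as the level that only about 8% of those charts exceed. The code below is a simplified illustration under stated assumptions, not the simulation used in this work: expected deaths are spread evenly over the year rather than taken from a survival model, at most one death occurs per day, and the names `peak_cusum` and `calibrate_limit` are hypothetical. It is not expected to reproduce the published control limits.

```python
import random


def peak_cusum(days, e_daily, rng, death_weight=0.69):
    """Highest one-sided CUSUM value reached over `days` days by an in-control center."""
    u = peak = 0.0
    for _ in range(days):
        # Assumption: at most one death per day, occurring with probability e_daily.
        d_t = 1 if rng.random() < e_daily else 0
        u = max(0.0, u + death_weight * d_t - e_daily)
        peak = max(peak, u)
    return peak


def calibrate_limit(transplants_per_year, annual_death_rate=0.1309,
                    years=3.5, false_positive=0.08, n_sims=2000, seed=0):
    """Choose L so that about `false_positive` of in-control charts exceed it."""
    rng = random.Random(seed)
    # Assumption: expected deaths are spread evenly across the year.
    e_daily = transplants_per_year * annual_death_rate / 365.0
    days = int(years * 365)
    peaks = sorted(peak_cusum(days, e_daily, rng) for _ in range(n_sims))
    # Empirical (1 - false_positive) quantile of the simulated peaks.
    index = min(n_sims - 1, int((1.0 - false_positive) * n_sims))
    return peaks[index]


if __name__ == "__main__":
    for size in (10, 40, 80):
        print(size, round(calibrate_limit(size, n_sims=500), 2))
```

Because the limit is read off as a quantile of simulated peaks, larger programs, which accumulate more events over the same period, tend to receive a higher L for the same false positive rate.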

Two-sided or O-E CUSUM:  This chart plots, as a function of time, the difference between the observed number of deaths $O_t$ up to a given time t and $E_t$, the number of deaths that would be expected up to time t if the death rate at the center were exactly the same as the overall national average. Thus, the O-E chart involves plotting $O_t - E_t$ versus t, where $O_t = d_1 + d_2 + \dots + d_t$ and $E_t = e_1 + e_2 + \dots + e_t$ for t = 1, 2, … .
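The O-E path is just a running difference of two cumulative sums, as the following short Python sketch shows. The function name `o_minus_e` is hypothetical and the daily inputs are made-up numbers chosen for illustration.

```python
from itertools import accumulate


def o_minus_e(deaths, expected):
    """Return the O-E path: cumulative observed minus cumulative expected deaths."""
    observed_cum = accumulate(deaths)    # O_t = d_1 + ... + d_t
    expected_cum = accumulate(expected)  # E_t = e_1 + ... + e_t
    return [o - e for o, e in zip(observed_cum, expected_cum)]


# One death on day 2, with 0.25 deaths expected on each of four days.
print(o_minus_e([0, 1, 0, 0], [0.25] * 4))  # [-0.25, 0.5, 0.25, 0.0]
```

Unlike the one-sided chart, this path is not reset at zero, so it drifts downward when a center does better than expected and upward when it does worse.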

The V-mask, introduced by Barnard, provides a way of determining signals from a two-sided CUSUM (O-E plot) (15,16). This is a somewhat more complicated approach to determining signals than that available in the one-sided CUSUM. The V-mask is applied at each time t and consists of constructing a horizontal V with vertex located a given distance (h/k) to the right of the current plotting position (t, $O_t - E_t$), where the arms of the V are chosen to pass through points h units above and h units below the current plotting position. The V-mask is defined by the choice of h and k, and we have selected these values in order to achieve a false signal rate of approximately 8% over a 3.5-year period, in the same manner as the choice of the control limit, L, in the one-sided CUSUM chart. An O-E CUSUM along with an associated V-mask is illustrated in Figure 4. The V-mask yields a signal at time t if the arms of the V intersect the previous path of the CUSUM. If the upper (lower) arm crosses the path, the rate of events at the corresponding center is significantly less (greater) than the national average. In the plot in Figure 4, the CUSUM has signaled at the indicated time t with an indication that the failure rates are significantly higher than the national average at this center. It is at this time that the V-mask signal would call for a review to determine whether there are correctable causes of the observed high rate of failures. The subsequent path shown in the figure was that observed without any such review.

Figure 4.

An O-E CUSUM and associated V-mask. The V-mask generates a signal at the indicated time t.
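The V-mask check described above can be sketched in Python as follows. This is an illustrative reading of the geometry, not the authors' code: the arms are taken to pass h units above and below the current point and to spread apart with slope k as they extend back in time, time is measured in plotting steps (an assumption about units), and the function name `v_mask_signal` is hypothetical.

```python
def v_mask_signal(path, h, k):
    """Apply a V-mask at the last point of an O-E path.

    Returns "higher" if the lower arm crosses the earlier path (more events
    than expected), "lower" if the upper arm crosses it (fewer events than
    expected), and None if neither arm touches the earlier path.
    """
    t = len(path) - 1
    y_t = path[t]
    for s in range(t):
        # Half-width of the mask at the earlier time s: h at the current
        # point, widening by k for every step back in time.
        spread = h + k * (t - s)
        if path[s] < y_t - spread:
            return "higher"  # the path has climbed steeply since time s
        if path[s] > y_t + spread:
            return "lower"   # the path has fallen steeply since time s
    return None


print(v_mask_signal([0, 0, 0, 4], h=1, k=0.5))        # higher
print(v_mask_signal([4, 4, 4, 0], h=1, k=0.5))        # lower
print(v_mask_signal([0, 0.2, 0.1, 0.3], h=1, k=0.5))  # None
```

In monitoring use, the function would be called again each time a new point is added to the path, so that the mask always sits at the most recent plotting position.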

Determination of an appropriate signaling threshold:  To determine an appropriate level of L and shape of the V-mask, we designed a simulation study modeled on survival outcomes for liver transplant programs (Table 3). The simulation is constructed so that the rate of false alarms is approximately 8% for a 3.5-year period. This choice is similar to the criteria currently used by the SRTR, in which a significance test of the null hypothesis at the 5% level is conducted every 6 months for a moving window length of 2.5 years. The reported control limit is for a 1-year death rate of 13.09%, which corresponds to the national average 1-year mortality rate for liver transplants. The continuous CUSUM charts (both the one-sided and two-sided O-E V-mask charts) are designed to be sensitive to a relative risk of $RR_A = 2$, as compared to the national average. From Table 3, it is clear that the choice of a limit by this criterion is strongly affected by the size of the transplant program. By adjusting L, the potential for a false positive CUSUM signal can be made approximately equal regardless of the program size. The column entitled ‘Power’ specifies the probability that a center whose relative risk is RR = 2 would signal in the 3-year period. The ARLs in the table give the average number of years of follow-up before the first signal would occur in a center whose failure rates are at the national average (RR = 1). In the review of liver transplant programs described in the main article, we utilize the control limits as obtained from this simulation.
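The quantities named in this paragraph (false positive rate, power and ARL) can each be estimated by simulating run lengths for a given control limit. The Python sketch below is a rough illustration under simplifying assumptions and is not the simulation reported here: expected deaths are constant from day to day, at most one death occurs per day, the true death probability is RR times the expected value, and the function names are hypothetical. Its outputs should not be expected to match the published tables.

```python
import random


def run_length(e_daily, rr, limit, rng, max_days, death_weight=0.69):
    """Days until the one-sided CUSUM first exceeds `limit`, or None if it never does."""
    u = 0.0
    for day in range(1, max_days + 1):
        # Assumption: at most one death per day, with probability rr * e_daily.
        d_t = 1 if rng.random() < rr * e_daily else 0
        u = max(0.0, u + death_weight * d_t - e_daily)
        if u > limit:
            return day
    return None


def signal_probability(e_daily, rr, limit, years=3.5, n_sims=500, seed=0):
    """Share of simulated centers that signal within `years` years."""
    rng = random.Random(seed)
    max_days = int(years * 365)
    signals = sum(
        run_length(e_daily, rr, limit, rng, max_days) is not None
        for _ in range(n_sims)
    )
    return signals / n_sims


def average_run_length_years(e_daily, rr, limit, n_sims=200, seed=0, cap_years=200):
    """Mean time to first signal in years; runs are capped at `cap_years`."""
    rng = random.Random(seed)
    cap_days = cap_years * 365
    total_days = 0
    for _ in range(n_sims):
        days = run_length(e_daily, rr, limit, rng, cap_days)
        # A capped run counts as cap_days, which biases the mean downward.
        total_days += cap_days if days is None else days
    return total_days / n_sims / 365.0


if __name__ == "__main__":
    e_daily = 40 * 0.1309 / 365.0  # 40 transplants/yr at the quoted 13.09% rate
    print("false positive rate:", signal_probability(e_daily, 1.0, 3.8))
    print("power at RR = 2:", signal_probability(e_daily, 2.0, 3.8))
```

Setting `rr=1.0` gives the false positive rate, `rr=2.0` gives the power against a doubling of risk, and `average_run_length_years` with `rr=1.0` gives an in-control ARL estimate.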

Table 3.  Simulated power and average run length (ARL) for one-sided and O-E CUSUMs for 1-year survival, assuming a base national average 1-year death rate of 13.1%*. The values of L and h and d (for the V-mask) are chosen to yield signals with probability 8% over 3.5 years for a facility of the given size operating at the national rate

| Facility size (transplants/yr) | One-sided CUSUM: L | One-sided CUSUM: Power | One-sided CUSUM: ARL (yrs) | O-E CUSUM (with V-mask): h | O-E CUSUM (with V-mask): h/d | O-E CUSUM (with V-mask): Power | O-E CUSUM (with V-mask): ARL (yrs) |
|---|---|---|---|---|---|---|---|
| 10 | 2.4 | 0.63 | 24.6 | 3.4 | 0.44 | 0.59 | 27.1 |
| 40 | 3.8 | 0.94 | 33.5 | 5.2 | 0.44 | 0.94 | 30.1 |
| 80 | 4.5 | 0.99 | 35.1 | 6.4 | 0.44 | 0.99 | 36.1 |
| 120 | 4.9 | 1.00 | 30.9 | 7.0 | 0.44 | 1.00 | 33.5 |
| 160 | 5.2 | 1.00 | 30.4 | 7.4 | 0.44 | 1.00 | 30.1 |

*13.1% is the overall national 1-year death rate for liver transplants.

Table 4 gives similar results from a simulation modeled on 1-year graft survival in national kidney transplant programs, in which the rate of 1-year graft failure is 7.24%. Here again the probability of a false signal is set at 8% over the 3.5-year period. The control limits from this simulation were used for the study of kidney transplant programs in the main article. We see from this table that the signal threshold (L) increases with program size to reduce the likelihood of inappropriate signaling. Once the volume is about 40 transplants per year, there is roughly an 80% likelihood that a true signal would be detected (rising to about 95% at 80 transplants per year), and a period of about 30 years is expected between false positive signals (ARL).

Table 4.  Simulated power and average run length (ARL) for one-sided and O-E CUSUMs for 1-year survival, assuming a base national average 1-year death rate of 7.2%*. The values of L and h and d (for the V-mask) are chosen to yield signals with probability 8% over 3.5 years for a facility of the given size operating at the national rate

| Facility size (transplants/yr) | One-sided CUSUM: L | One-sided CUSUM: Power | One-sided CUSUM: ARL (yrs) | O-E CUSUM (with V-mask): h | O-E CUSUM (with V-mask): h/d | O-E CUSUM (with V-mask): Power | O-E CUSUM (with V-mask): ARL (yrs) |
|---|---|---|---|---|---|---|---|
| 10 | 2.1 | 0.44 | 28.2 | 2.9 | 0.44 | 0.45 | 29.2 |
| 40 | 3.3 | 0.80 | 30.6 | 4.6 | 0.44 | 0.79 | 30.6 |
| 80 | 3.9 | 0.95 | 32.0 | 5.5 | 0.44 | 0.94 | 30.2 |
| 120 | 4.4 | 0.99 | 32.9 | 6.2 | 0.44 | 0.99 | 30.6 |
| 160 | 4.6 | 1.00 | 29.4 | 6.7 | 0.44 | 1.00 | 31.1 |
| 200 | 4.9 | 1.00 | 27.8 | 7.2 | 0.44 | 1.00 | 29.8 |

*7.2% is the overall national 1-year death rate for kidney transplants.

Acknowledgment

The Scientific Registry of Transplant Recipients is funded by contract number 234–2005-37009C from the Health Resources and Services Administration (HRSA), US Department of Health and Human Services. The views expressed herein are those of the authors and not necessarily those of the US Government. This is a US Government-sponsored work. There are no restrictions on its use.
