Database Comparison of the Adult-to-Adult Living Donor Liver Transplantation Cohort Study (A2ALL) and the SRTR U.S. Transplant Registry

  • Presented in part at the American Transplant Congress, Toronto, Canada, June, 2008.

  • This is publication number 13 of the Adult-to-Adult Living Donor Liver Transplantation Cohort Study.

Corresponding author: Brenda Gillespie, bgillesp@umich.edu

Abstract

Data submitted by transplant programs to the Organ Procurement and Transplantation Network (OPTN) are used by the Scientific Registry of Transplant Recipients (SRTR) for policy development, performance evaluation and research. This study compared OPTN/SRTR data with data extracted from medical records by research coordinators from the nine-center A2ALL study. A2ALL data were collected independently of OPTN data submission (48 data elements among 785 liver transplant candidates/recipients; 12 data elements among 386 donors). At least 90% agreement occurred between OPTN/SRTR and A2ALL for 11/29 baseline recipient elements, 4/19 recipient transplant or follow-up elements and 6/12 donor elements. For the remaining recipient and donor elements, >10% of values were missing in OPTN/SRTR but present in A2ALL, confirming that missing data were largely avoidable. Other than variables required for allocation, the percentage missing varied widely by center. These findings support an expanded focus on data quality control by OPTN/SRTR for a broader variable set than those used for allocation. Center-specific monitoring of missing values could substantially improve the data.

Introduction

All U.S. transplant programs and organ procurement organizations (OPOs) are required by the Organ Procurement and Transplantation Network (OPTN) Final Rule (1,2) to submit data to OPTN on all individuals who register to receive an organ transplant or who donate an organ for transplantation. Demographic, socioeconomic and clinical data, including vital status and allograft status, are collected on a regular basis for candidate organ recipients and donors. The data are ultimately deposited with the Scientific Registry of Transplant Recipients (SRTR), which is the organization with statutory responsibility for using OPTN data for development of allocation policy, program performance evaluation and research.

OPTN/SRTR data have been the basis of more than 1000 peer-reviewed scientific publications, scores of organ allocation policies, medical practice guidelines and regular program-specific reporting of waitlist and transplant outcomes.

Although submission is federally mandated, the OPTN/SRTR data have never been systematically validated against source documents. The overall accuracy and completeness of the data depend on range and consistency checks performed at the time of data entry as well as additional data cleaning performed by the OPTN contractor prior to transfer of the data to the SRTR. The Adult-to-Adult Living Donor Liver Transplantation Cohort Study (A2ALL) was created in 2002 to investigate the benefits and risks of living donor liver transplantation (LDLT). The A2ALL clinical sites collected retrospective data on all potential liver recipients who had a living donor evaluated between January 1998 and February 2003 at nine participating centers. A2ALL research coordinators extracted data from medical records, imaging and laboratory studies. One of the objectives of A2ALL was to compare the completeness and reproducibility of transplant candidate and donor data elements in OPTN/SRTR with A2ALL.

Methods

Of 819 liver transplant candidates, 34 were excluded: 6 because OPTN listing date was missing in A2ALL, and 28 because no OPTN liver transplant listing was found within 30 days of the date reported in A2ALL. Of 605 A2ALL transplant recipients, 11 were excluded due to no OPTN transplant date within 30 days of the date reported in A2ALL. Of 387 living donors in A2ALL, one was excluded due to no OPTN donation date within two days of the date reported in A2ALL. These cases represent data discrepancies in themselves but are not considered further due to lack of information.
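The date-window linkage rule described above can be sketched as follows. This is a minimal illustration, not the study's actual linkage code: the function name `linked_within` is hypothetical, and picking the closest OPTN date when several fall inside the window is an assumption (the text only states that a date within the window had to exist).

```python
from datetime import date

def linked_within(a2all_date, optn_dates, window_days):
    """Return the OPTN date closest to a2all_date within +/- window_days, else None.

    Hypothetical helper: choosing the closest date is an assumption made
    for illustration; the source only requires a date inside the window.
    """
    if a2all_date is None:
        return None  # e.g. listing date missing in A2ALL, so no linkage is possible
    candidates = [d for d in optn_dates
                  if abs((d - a2all_date).days) <= window_days]
    if not candidates:
        return None
    return min(candidates, key=lambda d: abs((d - a2all_date).days))

# Cohort sizes after the exclusions reported in the text.
candidates_kept = 819 - (6 + 28)   # 785 candidates
recipients_kept = 605 - 11         # 594 recipients
donors_kept = 387 - 1              # 386 donors
```

A 30-day window would apply to listing and transplant dates and a 2-day window to donation dates, as stated in the paragraph above.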

OPTN data were collected for candidate organ recipients from the time of listing until waitlist removal, transplant failure, or death. Organ donors were followed for 2 years from the time of donation. The OPTN data, collected through a web-based data entry system (UNET) since October 1999, were subject to automated range and consistency checks at the time of data entry, as well as further queries based on analytic checks. Site audits were performed to verify data used for organ allocation. Over the years, data elements have been added or dropped. For example, following the 2002 introduction of the Model for End-stage Liver Disease (MELD) (3), the three elements needed for MELD score calculation were required. The SRTR also incorporated extra ascertainment of mortality from the Social Security Death Master File (SSDMF).

In the A2ALL study, each center employed at least one full-time coordinator who was centrally trained for medical chart review. The coordinator collected data—without referring to the OPTN data—using a secure web-based data entry system (BioDBx). The study protocol was approved by the Institutional Review Board and Privacy Board of each participating center and the A2ALL Data Coordinating Center (DCC) prior to beginning data entry. Subject name, gender and date of birth were submitted to the DCC for linkage to corresponding OPTN/SRTR data under an approved data use agreement with the SRTR. As each A2ALL data entry screen was completed, discrepancies with the linked OPTN/SRTR data were flagged and required confirmation or modification of the data entered by the A2ALL coordinator. Subsequent data cleaning, including range and consistency checks, allowed further opportunities for data correction over 5 years of analysis. Additional queries were made as additional discrepancies were found between A2ALL and OPTN/SRTR data during analysis. Thus, a critical aspect of the design of this study was the attempted resolution of discrepancies in the two data sets. There was no transfer of corrections from A2ALL to OPTN or SRTR. For this report, A2ALL data were compared with OPTN/SRTR data as of February 2009, which may have incorporated corrections since the original OPTN/SRTR data presented to coordinators in 2003. The OPTN/SRTR form and variable names for all examined variables are available as supplementary materials.

Comparisons between OPTN/SRTR and A2ALL were made on variables collected in comparable formats in the two databases: 29 baseline elements for potential recipients, 19 transplant-related elements for liver transplant recipients and 12 living liver donor-related elements. For each data element, we present the numbers and percents of values missing in both databases, missing in OPTN/SRTR but present in A2ALL, present in OPTN/SRTR but missing in A2ALL, present in both databases but with inconsistent values, and present and identical in both databases. For continuous variables, e.g. weight, we allowed differences within a narrow window, as noted in table footnotes. Discrepancies between OPTN/SRTR and A2ALL data were investigated using paired t-tests, scatterplots and histograms of the differences between the two values. For dichotomous variables, we tested whether discrepancies were symmetric between OPTN/SRTR and A2ALL using McNemar's test. For variables with substantial missing data in OPTN/SRTR, we compared the A2ALL data for those with missing values in OPTN/SRTR to the complete data in OPTN/SRTR to see if the data were missing at random (4). We used box plots to illustrate the distributions of center-specific percent missing, tested for center differences in the proportion missing using chi-square tests and graphically examined patterns of missing data over calendar time.
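The five-way classification of each data element can be sketched as below. This is an illustrative sketch under stated assumptions: the function and label names are hypothetical, missing values are represented as `None`, and the tolerance argument stands in for the variable-specific windows given in the table footnotes (for example 5 kg for weight).

```python
from collections import Counter

def classify(optn, a2all, tolerance=0.0):
    """Assign one OPTN/SRTR vs. A2ALL value pair to one of five categories.

    Hypothetical helper; None represents a missing value. Numeric values
    count as the same when they differ by no more than `tolerance`.
    """
    if optn is None and a2all is None:
        return "missing_both"
    if optn is None:
        return "missing_optn_only"
    if a2all is None:
        return "missing_a2all_only"
    if isinstance(optn, (int, float)) and isinstance(a2all, (int, float)):
        same = abs(optn - a2all) <= tolerance
    else:
        same = optn == a2all
    return "same" if same else "different"

def tabulate(pairs, tolerance=0.0):
    """Count the categories over an iterable of (optn, a2all) pairs."""
    return Counter(classify(o, a, tolerance) for o, a in pairs)

# Example with made-up weights in kg and a 5 kg window.
counts = tabulate([(70.0, 74.0), (70.0, 76.0), (None, 81.0), (None, None)], tolerance=5)
```

Dividing each count by the number of subjects gives the percentages reported in the validation tables.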

A2ALL used the OPTN disease and cause of death codes, but allowed more codes to be listed (3 instead of 2 diagnostic codes and 3 instead of 1 cause of death codes). Of 66 unique diagnostic codes used, many subjects had close but inexact matches in the two databases. We grouped the codes into six diagnostic categories for matching: acute hepatic necrosis (AHN), noncholestatic cirrhosis (non-hepatitis C), hepatitis C, cholestatic cirrhosis, metabolic disorders and hepatocellular carcinoma (HCC). The 55 unique cause of death codes reported were grouped into 14 categories: liver disease, graft failure, cardiovascular, cerebrovascular, pulmonary insufficiency/respiratory failure, renal failure, multi-organ system failure, hemorrhage, infection, malignancy, operative, suicide, trauma and other. A match was declared if a diagnostic or cause of death category was reported in both OPTN and A2ALL.
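The category-level matching rule can be sketched as follows. The code identifiers in `HYPOTHETICAL_MAP` are invented placeholders, not real OPTN codes, and the function names are assumptions; only the rule itself (a match when the same category appears in both sources) comes from the text.

```python
# Placeholder mapping from individual codes to grouped categories.
# These code labels are hypothetical and do not correspond to OPTN codes.
HYPOTHETICAL_MAP = {
    "dx_01": "hepatitis C",
    "dx_02": "hepatitis C",
    "dx_03": "HCC",
    "dx_04": "cholestatic cirrhosis",
}

def to_categories(codes, code_to_category):
    """Collapse a list of codes into the set of categories they belong to."""
    return {code_to_category[c] for c in codes if c in code_to_category}

def shared_categories(optn_codes, a2all_codes, code_to_category):
    """Categories reported in both sources; a non-empty result is a match."""
    return (to_categories(optn_codes, code_to_category)
            & to_categories(a2all_codes, code_to_category))
```

Grouping before comparing means that two close but inexact codes in the same category (here `dx_01` and `dx_02`) still count as agreement.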

Results

Data for 785 potential liver recipients, 594 transplant recipients and 386 living liver donors who had records in both the A2ALL database and in OPTN/SRTR data were included. Of the transplant recipients, 387 received an LDLT and 207 received a deceased donor liver transplant (DDLT).

Of the 29 data elements for potential recipients, most concerned demographics or medical status at listing (Table 1A). The majority were present and identical in both OPTN/SRTR and A2ALL. Variables with greater than 90% identical values between OPTN/SRTR and A2ALL included most demographics (gender, date of birth, ethnicity, race), ABO blood type, weight, previous liver transplants, medical condition and whether on a ventilator at listing, some diagnoses, dialysis, reason for removal from the waitlist and death date. Among variables with less than 90% matching between OPTN/SRTR and A2ALL, most nonmatches represented data present in A2ALL but missing in OPTN/SRTR, including education, height, diabetes, coronary artery disease and hypertension. The collection of some OPTN/SRTR data elements has changed over time. For example, history of transjugular intrahepatic portosystemic shunt (TIPSS) has only been collected since 1999, and serum creatinine at listing has been required since the introduction of the MELD in 2002. No variable had more than 3% missing in both SRTR and A2ALL except education (8% missing) and INR (6% missing). Education was poorly collected in both A2ALL and SRTR, with more than 25% missing in at least one. Overall, A2ALL centers sent more complete data to A2ALL than they did to the OPTN.

Table 1A. OPTN/SRTR validation of selected data elements for potential recipients (n = 785)
Cell entries are N (%).
Variable | Missing in both OPTN/SRTR and A2ALL | Missing in OPTN/SRTR, present in A2ALL | Present in OPTN/SRTR, missing in A2ALL | Present in both OPTN/SRTR and A2ALL, but different values | Present in both OPTN/SRTR and A2ALL, same values¹
Demographics
 Gender | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 6 (0.8%) | 779 (99.2%)
 Date of birth | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 31 (3.9%) | 754 (96.1%)
 Ethnicity | 0 (0.0%) | 2 (0.3%) | 0 (0.0%) | 25 (3.2%) | 758 (96.6%)
 Race | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 24 (3.1%) | 761 (96.9%)
 Highest education level at enrollment | 66 (8.4%) | 84 (10.7%) | 56 (7.1%) | 48 (6.1%) | 531 (67.6%)
Medical data at listing
 ABO blood type | 0 (0.0%) | 0 (0.0%) | 3 (0.4%) | 4 (0.5%) | 778 (99.1%)
 Height | 2 (0.3%) | 77 (9.8%) | 0 (0.0%) | 33 (4.2%) | 673 (85.7%)
 Weight | 0 (0.0%) | 1 (0.1%) | 1 (0.1%) | 59 (7.5%) | 724 (92.2%)
 Previous liver transplants | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 4 (0.5%) | 781 (99.5%)
 Medical condition | 0 (0.0%) | 6 (0.8%) | 0 (0.0%) | 29 (3.7%) | 750 (95.5%)
 On ventilator | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 5 (0.6%) | 780 (99.4%)
 Functional status | 12 (1.5%) | 99 (12.6%) | 7 (0.9%) | 52 (6.6%) | 615 (78.3%)
Diagnoses
 Acute hepatic necrosis (AHN) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 33 (4.2%) | 752 (95.8%)
 Noncholestatic cirrhosis | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 82 (10.4%) | 703 (89.6%)
 Hepatitis C (HCV) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 35 (4.5%) | 750 (95.5%)
 Cholestatic cirrhosis | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 11 (1.4%) | 774 (98.6%)
 Metabolic disorder | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 19 (2.4%) | 766 (97.6%)
 Hepatoma (HCC) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 47 (6.0%) | 738 (94.0%)
   Pre-MELD² | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 38 (5.6%) | 635 (94.4%)
   Post-MELD | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 9 (8.0%) | 103 (92.0%)
Encephalopathy | 6 (0.8%) | 44 (5.6%) | 10 (1.3%) | 95 (12.1%) | 630 (80.3%)
 Pre-MELD | 6 (0.9%) | 44 (6.5%) | 6 (0.9%) | 76 (11.3%) | 541 (80.4%)
 Post-MELD | 0 (0.0%) | 0 (0.0%) | 4 (3.6%) | 19 (17.0%) | 89 (79.5%)
Ascites | 4 (0.5%) | 46 (6.0%) | 9 (1.1%) | 81 (10.3%) | 645 (82.2%)
 Pre-MELD | 4 (0.6%) | 46 (6.8%) | 6 (0.9%) | 66 (9.8%) | 551 (81.9%)
 Post-MELD | 0 (0.0%) | 0 (0.0%) | 3 (2.7%) | 15 (13.4%) | 94 (83.9%)
Serum creatinine | 16 (2.0%) | 192 (24.5%) | 2 (0.3%) | 60 (7.6%) | 515 (65.6%)
 Pre-MELD | 16 (2.4%) | 192 (28.5%) | 1 (0.1%) | 45 (6.7%) | 419 (62.2%)
 Post-MELD | 0 (0.0%) | 0 (0.0%) | 1 (0.9%) | 15 (13.4%) | 96 (85.7%)
Total serum albumin | 9 (1.1%) | 194 (24.7%) | 5 (0.6%) | 48 (6.1%) | 529 (67.4%)
 Pre-MELD | 9 (1.3%) | 194 (28.8%) | 4 (0.6%) | 40 (5.9%) | 426 (63.3%)
 Post-MELD | 0 (0.0%) | 0 (0.0%) | 1 (0.9%) | 8 (7.1%) | 103 (92.0%)
Total serum bilirubin | 15 (2.0%) | 504 (64.2%) | 6 (0.8%) | 44 (5.6%) | 216 (27.5%)
 Pre-MELD | 15 (2.2%) | 504 (74.9%) | 4 (0.6%) | 26 (3.9%) | 124 (18.4%)
 Post-MELD | 0 (0.0%) | 0 (0.0%) | 2 (1.8%) | 18 (16.1%) | 92 (82.1%)
INR | 45 (5.7%) | 471 (60.0%) | 5 (0.6%) | 38 (4.8%) | 226 (28.8%)
 Pre-MELD | 45 (6.7%) | 471 (70.0%) | 3 (0.4%) | 23 (3.4%) | 131 (19.5%)
 Post-MELD | 0 (0.0%) | 0 (0.0%) | 2 (1.8%) | 15 (13.4%) | 95 (84.8%)
Medical history prior to listing
Variceal bleeding within 2 weeks | 12 (1.5%) | 84 (10.7%) | 6 (0.8%) | 21 (2.7%) | 662 (84.3%)
Previous upper abdominal surgery | 12 (1.5%) | 90 (11.5%) | 2 (0.3%) | 109 (13.9%) | 572 (72.9%)
Spontaneous bacterial peritonitis | 9 (1.1%) | 88 (11.2%) | 8 (1.0%) | 28 (3.6%) | 652 (83.1%)
History of TIPSS (n = 554)³ | 7 (1.3%) | 83 (15.0%) | 0 (0.0%) | 15 (2.7%) | 449 (81.0%)
Diabetes mellitus | 12 (1.5%) | 80 (10.2%) | 3 (0.4%) | 31 (3.9%) | 659 (83.9%)
Dialysis | 0 (0.0%) | 29 (3.7%) | 9 (1.1%) | 1 (0.1%) | 746 (95.0%)
 Pre-MELD | 0 (0.0%) | 29 (4.3%) | 5 (0.7%) | 0 (0.0%) | 639 (94.9%)
 Post-MELD | 0 (0.0%) | 0 (0.0%) | 4 (3.6%) | 1 (0.9%) | 107 (95.5%)
Angina/coronary artery disease | 10 (1.3%) | 86 (11.0%) | 2 (0.3%) | 9 (1.1%) | 678 (86.4%)
Drug treated hypertension | 11 (1.4%) | 93 (11.8%) | 8 (1.0%) | 41 (5.2%) | 632 (80.5%)
Postlisting: reason for removal from waitlist (n = 706)⁴ | 0 (0.0%) | 5 (0.7%) | 1 (0.1%) | 62 (8.8%) | 638 (90.4%)
Mortality follow-up: death date (n = 198)⁵ | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 13 (6.6%) | 185 (93.4%)

  ¹ ‘Same values’ were values with a difference between OPTN/SRTR and A2ALL of at most: 5 cm for height, 5 kg for weight, 0.2 mg/dL for serum creatinine and INR, 0.4 g/dL for albumin, 1.0 mg/dL for serum bilirubin and 7 days for birth and death dates.
  ² MELD score was initiated 2/28/2002. N = 673/112 patients were listed pre/post-MELD.
  ³ Restricted to patients who were listed after 10/25/1999 when OPTN/SRTR started to collect value.
  ⁴ Restricted to patients who were removed from the waitlist according to A2ALL.
  ⁵ Restricted to patients whose death was recorded in both A2ALL and OPTN/SRTR.

Discrepancies of at least 10% were found for seven variables that had values recorded in both databases (diagnosis at listing of noncholestatic cirrhosis, encephalopathy, ascites, previous upper abdominal surgery and post-MELD creatinine, albumin and INR). Further analysis of the discrepancies found potentially nonrandom differences (Table 1B). For example, encephalopathy had unbalanced discrepant values, with 3% ‘Yes’ for A2ALL and 10% ‘Yes’ for OPTN/SRTR (p = 0.0001). Both encephalopathy and ascites were more likely to be recorded in OPTN/SRTR than A2ALL before and after implementation of MELD-based deceased donor liver allocation. For previous upper abdominal surgery, the discrepant values were 11% ‘Yes’ for A2ALL and 5% ‘Yes’ for OPTN/SRTR (p < 0.0001). The higher proportion reporting prior upper abdominal surgery in A2ALL persisted despite thorough review of responses, where many incorrect ‘Yes’ responses (e.g. appendectomy and hysterectomy) were corrected to ‘No’. For date of birth, only six cases differed (by 1 to 10 years), and no systematic bias by database was detected by paired t-test. For height, 10 cases differed by at least 10 cm. For weight, 27 cases differed by at least 10 kg, with A2ALL reporting significantly higher weights (p = 0.0001).

Table 1B.  Comparison between OPTN/SRTR and A2ALL of selected data elements for potential recipients among values present in both sources
Dichotomous variables N Both Yes Both No OPTN/SRTR:Yes, A2ALL:No OPTN/SRTR:No, A2ALL:Yes McNemar's P-value
Diagnoses
 Acute hepatic necrosis (AHN) 785 14 (2%) 738 (94%) 12 (2%) 21 (3%) 0.1172
 Noncholestatic cirrhosis 785 201 (26%) 502 (64%) 25 (3%) 57 (7%) 0.0004
 Hepatitis C (HCV) 785 333 (42%) 417 (53%) 6 (1%) 29 (4%) 0.0001
 Cholestatic cirrhosis 785 139 (18%) 635 (81%) 2 (0.3%) 9 (1%) 0.0348
 Metabolic disorder 785 13 (2%) 753 (96%) 11 (1%) 8 (1%) 0.4913
 Hepatoma (HCC) 785 42 (5%) 695 (89%) 8 (1%) 40 (5%) <0.0001
   Pre-MELD 673 33 (5%) 602 (89%) 5 (1%) 33 (5%) <0.0001
   Post-MELD 112 9 (8%) 94 (84%) 3 (3%) 6 (5%) 0.3173
 Encephalopathy 725 335 (46%) 295 (41%) 70 (10%) 25 (3%) 0.0001
   Pre-MELD 617 284 (46%) 257 (42%) 56 (9%) 20 (3%) <0.0001
   Post-MELD 108 51 (47%) 38 (35%) 14 (13%) 5 (5%) 0.0389
 Ascites 726 445 (60%) 200 (28%) 65 (9%) 16 (2%) <0.0001
   Pre-MELD 617 377 (61%) 174 (28%) 55 (9%) 11 (2%) <0.0001
   Post-MELD 109 68 (62%) 26 (24%) 10 (9%) 5 (5%) 0.1967
 Variceal bleeding within 2 weeks 683 19 (3%) 643 (94%) 9 (1%) 12 (2%) 0.5127
 Previous upper abdominal surgery 681 164 (24%) 408 (60%) 34 (5%) 75 (11%) <0.0001
 Spontaneous bacterial peritonitis 680 38 (6%) 614 (90%) 12 (2%) 16 (2%) 0.4497
 History of TIPSS¹ 464 21 (5%) 428 (92%) 5 (1%) 10 (2%) 0.1967
 Diabetes mellitus 690 102 (15%) 557 (81%) 11 (2%) 20 (3%) 0.1060
 Angina/coronary artery disease 687 6 (1%) 672 (98%) 3 (0.4%) 6 (1%) 0.3173
 Drug treated hypertension 673 53 (8%) 579 (86%) 18 (3%) 23 (3%) 0.4349
Continuous variables: percentiles of the difference distribution (A2ALL minus OPTN/SRTR)
Variable Min 1% 5% 10% 25% 50% 75% 90% 95% 99% Max Paired t-test p-value
Date of birth−3653−30300000 00613040.1540
Height (cm) −30 −8 −1 0 0 0 0 0 2 6 20 0.6002
Weight (kg) −22 −9 −2 0 0 0 0 2 5 22 55 0.0001
Serum creatinine (mg/dL) −7.2 −1 0 0 0 0 0 0 0.2 0.7 1.5 0.1533
Total serum bilirubin (mg/dL) −9.3 −3.6 −1.3 −0.7 0 0 0 1 2.9 13 21.5 0.0629
INR −1.9 −1 −0.2 −0.1 0 0 0 0.2 0.4 1.1 1.6 0.1407
Ordinal variables N Same value 1 level different OPTN/SRTR >1 level higher A2ALL >1 level higher
Highest education level 579 531 (92%) 36 (6%) 2 (0.3%) 10 (1.7%)
Functional status (NYHA) 667 615 (92%) 37 (6%) 10 (1.5%) 5 (0.7%)

  ¹ Restricted to patients who were listed after 10/25/1999 when OPTN/SRTR started to collect value.
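As a worked check on the dichotomous rows of Table 1B, McNemar's test uses only the two discordant counts. The sketch below assumes the uncorrected chi-square form of the test (no continuity correction); that choice is an assumption, made because it reproduces the p-values printed for the acute hepatic necrosis and metabolic disorder rows. The function name is hypothetical.

```python
from math import erfc, sqrt

def mcnemar_p(optn_yes_a2all_no, optn_no_a2all_yes):
    """Two-sided McNemar p-value from the two discordant counts.

    Uses the chi-square statistic (b - c)^2 / (b + c) with 1 degree of
    freedom and no continuity correction (an assumption, see lead-in).
    For 1 df the upper tail of the chi-square is erfc(sqrt(x / 2)).
    """
    b, c = optn_yes_a2all_no, optn_no_a2all_yes
    if b + c == 0:
        return 1.0  # no discordant pairs, nothing to test
    chi2 = (b - c) ** 2 / (b + c)
    return erfc(sqrt(chi2 / 2.0))

# Discordant counts taken from Table 1B.
p_ahn = mcnemar_p(12, 21)        # acute hepatic necrosis, table reports 0.1172
p_metabolic = mcnemar_p(11, 8)   # metabolic disorder, table reports 0.4913
```

With small discordant counts an exact binomial version of the test would be a reasonable alternative.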

Of the variables in Table 1A with more than 10% of values missing in OPTN/SRTR but present in A2ALL, many appeared to be missing completely at random, as indicated by similar values in A2ALL among those missing in OPTN/SRTR compared with those present in OPTN/SRTR. Variables that did not appear to be missing completely at random were angina/coronary artery disease (1% among those present in OPTN/SRTR vs. 7% reported in A2ALL among those missing in OPTN/SRTR, p = 0.0003), and education (higher attainment among those present vs. missing in OPTN/SRTR, p < 0.0001).
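The missing-completely-at-random check described above compares the A2ALL value distribution between subjects whose OPTN/SRTR value is missing and those whose value is present. The text does not state which test produced the reported p-values, so the sketch below uses a two-sided Fisher's exact test as one reasonable choice for a dichotomous variable. The function name and the counts in the example are hypothetical and are not taken from the study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: OPTN/SRTR value missing vs. present.
    Columns: A2ALL records 'Yes' vs. 'No'.
    Sums the probabilities of all tables no more likely than the observed one.
    """
    n1, n2, k = a + b, c + d, a + c
    total = comb(n1 + n2, k)

    def prob(x):
        return comb(n1, x) * comb(n2, k - x) / total

    p_obs = prob(a)
    lo, hi = max(0, k - n2), min(k, n1)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Hypothetical counts: 6 of 86 'Yes' when missing in OPTN/SRTR,
# 7 of 687 'Yes' when present.
p_example = fisher_exact_two_sided(6, 80, 7, 680)
```

A small p-value suggests that missingness is related to the underlying value, so the variable is unlikely to be missing completely at random.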

Of 17 transplant-related and two mortality variables, the majority were present and identical in both OPTN/SRTR and A2ALL (Table 2A). Data missing in both OPTN/SRTR and A2ALL were uncommon, except for cold ischemia time (CIT) among LDLT recipients (20%) and HCV RNA result (32%). Among those with LDLT CIT recorded in A2ALL, it was significantly shorter if CIT was missing in OPTN/SRTR (mean 199 min) than if it was present (mean 334 min, p = 0.002). The same was true for the converse combination (i.e. missing LDLT CIT in A2ALL and present in OPTN/SRTR) for CIT recorded in OPTN/SRTR (256 vs. 339 min, p = 0.097). Variables with at least 10% missing in OPTN/SRTR but present in A2ALL included: HCV RNA (36% among 272 with HCV), cause of death (30% among 118 reported deaths), INR (28%), encephalopathy (28%) and dialysis (29%) in the pre-MELD era, ALT (17%), treated rejection during first year post-transplant (14%), and functional status (11%). Variables present in OPTN/SRTR but missing in A2ALL were less common, with the largest percentages (approximately 6% in each case) for serum albumin in the MELD era, LDLT CIT and HCV RNA. As with pretransplant data, centers sent more complete recipient data to A2ALL than they did to the OPTN.

Table 2A. OPTN/SRTR validation of selected data elements for transplanted recipients
Cell entries are N (%).
Variable | Number of patients | Missing in both OPTN/SRTR and A2ALL | Missing in OPTN/SRTR, present in A2ALL | Present in OPTN/SRTR, missing in A2ALL | Present in both OPTN/SRTR and A2ALL, but different values | Present in both OPTN/SRTR and A2ALL, same values¹
At time of transplant
 Weight | 594 | 2 (0.3%) | 75 (12.6%) | 3 (0.5%) | 66 (11.1%) | 448 (75.4%)
 Functional status | 594 | 10 (1.7%) | 66 (11.1%) | 10 (1.7%) | 80 (13.5%) | 428 (72.1%)
 Ascites | 594 | 2 (0.3%) | 20 (3.4%) | 6 (1.0%) | 63 (10.6%) | 503 (84.7%)
 Pre-MELD | 394 | 2 (0.5%) | 19 (4.8%) | 3 (0.8%) | 30 (7.6%) | 340 (86.3%)
 Post-MELD | 200 | 0 (0.0%) | 1 (0.5%) | 3 (1.5%) | 33 (16.5%) | 163 (81.5%)
 Encephalopathy² | 339 | 0 (0.0%) | 41 (12.1%) | 3 (0.9%) | 52 (15.3%) | 243 (71.7%)
 Pre-MELD | 139 | 0 (0.0%) | 39 (28.1%) | 0 (0.0%) | 18 (12.9%) | 82 (59.0%)
 Post-MELD | 200 | 0 (0.0%) | 2 (1.0%) | 3 (1.5%) | 34 (17.0%) | 161 (80.5%)
 Variceal bleeding | 594 | 5 (0.8%) | 34 (5.7%) | 4 (0.7%) | 11 (1.9%) | 540 (90.9%)
 Spontaneous bacterial peritonitis | 594 | 4 (0.7%) | 29 (4.9%) | 2 (0.3%) | 31 (5.2%) | 528 (88.9%)
 History of TIPSS³ | 519 | 2 (0.4%) | 42 (8.1%) | 0 (0.0%) | 22 (4.2%) | 453 (87.3%)
 Dialysis² | 339 | 0 (0.0%) | 42 (12.4%) | 0 (0.0%) | 3 (0.9%) | 294 (86.7%)
 Pre-MELD | 139 | 0 (0.0%) | 40 (28.8%) | 0 (0.0%) | 2 (1.4%) | 97 (69.8%)
 Post-MELD | 200 | 0 (0.0%) | 2 (1.0%) | 0 (0.0%) | 1 (0.5%) | 197 (98.5%)
 Serum creatinine | 594 | 0 (0.0%) | 26 (4.4%) | 2 (0.3%) | 42 (7.1%) | 524 (88.2%)
 Pre-MELD | 394 | 0 (0.0%) | 26 (6.6%) | 1 (0.3%) | 32 (8.1%) | 335 (85.0%)
 Post-MELD | 200 | 0 (0.0%) | 0 (0.0%) | 1 (0.5%) | 10 (5.0%) | 189 (94.5%)
 Serum albumin | 594 | 4 (0.7%) | 40 (6.7%) | 17 (2.9%) | 54 (9.1%) | 479 (80.6%)
 Pre-MELD | 394 | 4 (1.0%) | 39 (9.9%) | 5 (1.3%) | 32 (8.1%) | 314 (79.7%)
 Post-MELD | 200 | 0 (0.0%) | 1 (0.5%) | 12 (6.0%) | 22 (11.0%) | 165 (82.5%)
 Serum bilirubin | 594 | 0 (0.0%) | 24 (4.0%) | 2 (0.3%) | 67 (11.3%) | 501 (84.3%)
 Pre-MELD | 394 | 0 (0.0%) | 23 (5.8%) | 1 (0.3%) | 43 (10.9%) | 327 (83.0%)
 Post-MELD | 200 | 0 (0.0%) | 1 (0.5%) | 1 (0.5%) | 24 (12.0%) | 174 (87.0%)
 INR⁴ | 530 | 4 (0.8%) | 92 (17.4%) | 2 (0.4%) | 46 (8.7%) | 386 (72.8%)
 Pre-MELD | 330 | 4 (1.2%) | 91 (27.6%) | 1 (0.3%) | 28 (8.5%) | 206 (62.4%)
 Post-MELD | 200 | 0 (0.0%) | 1 (0.5%) | 1 (0.5%) | 18 (9.0%) | 180 (90.0%)
 Cold ischemia time (LDLT recipient) | 387 | 79 (20.4%) | 30 (7.8%) | 25 (6.5%) | 16 (4.1%) | 237 (61.2%)
 Cold ischemia time (DDLT recipient) | 207 | 8 (3.9%) | 10 (4.8%) | 8 (3.9%) | 16 (7.7%) | 165 (79.7%)
 ALT, SGPT | 594 | 2 (0.3%) | 98 (16.5%) | 1 (0.2%) | 46 (7.7%) | 447 (75.3%)
HCV RNA result⁵ | 272 | 86 (31.6%) | 99 (36.4%) | 16 (5.9%) | 6 (2.2%) | 65 (23.9%)
Post-transplant follow-up
 HCC recurrence⁶ | 111 | 3 (2.7%) | 10 (9.0%) | 4 (3.6%) | 9 (8.1%) | 85 (76.6%)
 Treated rejection (1st year) | 594 | 0 (0.0%) | 80 (13.5%) | 1 (0.2%) | 137 (23.1%) | 376 (63.3%)
 Mortality follow-up⁷
 Death date | 118 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 6 (5.1%) | 112 (94.9%)
 Cause of death | 118 | 6 (5.1%) | 35 (29.7%) | 0 (0.0%) | 10 (8.5%) | 67 (56.8%)

  ¹ ‘Same values’ were values with a difference between OPTN/SRTR and A2ALL of at most: 5 kg for weight, 0.2 mg/dL for serum creatinine and INR, 0.4 g/dL for serum albumin, 1 mg/dL for serum bilirubin, 60 min for cold ischemia time, 30 IU/L for ALT, SGPT and 7 days for death date.
  ² Restricted to patients who were transplanted after 5/1/2001 when OPTN/SRTR started to collect the data.
  ³ Restricted to patients who were transplanted after 10/25/1999 when OPTN/SRTR started to collect value.
  ⁴ Restricted to patients who were transplanted after 9/1/1999 when OPTN/SRTR started to collect the data.
  ⁵ Restricted to patients who had HCV at or prior to transplant in A2ALL.
  ⁶ Restricted to patients who had HCC at or prior to transplant in A2ALL.
  ⁷ Restricted to patients who received a transplant and whose death was recorded in both A2ALL and OPTN/SRTR.

Discrepancies of at least 10% were found for seven variables among non-missing recipient data: rejection treated during the first year post-transplant (23% discrepant), encephalopathy (pre-MELD, 13%; post-MELD, 17%), ascites (post-MELD, 17%), functional status (14%), weight (11%), albumin (post-MELD, 11%) and serum bilirubin (pre-MELD, 11%; post-MELD, 12%) (Table 2A). Among cases with values present in both OPTN/SRTR and A2ALL there was significant asymmetry for rejection treated during the first year post-transplant, in that A2ALL more often showed that rejection had occurred: 24% ‘Yes’ for A2ALL/‘No’ for OPTN/SRTR and 3% ‘Yes’ for OPTN/SRTR/‘No’ for A2ALL (p < 0.0001, Table 2B). A2ALL data also showed significantly more encephalopathy than OPTN/SRTR (16% vs. 2%; p < 0.0001) and TIPSS (4% vs. 1%; p = 0.0006). OPTN/SRTR data recorded significantly more spontaneous bacterial peritonitis than A2ALL (4% vs. 1%; p = 0.0002).

Table 2B.  Comparison between OPTN/SRTR and A2ALL of selected data elements for transplant recipients among values present in both sources
Dichotomous variables N Both Yes Both No OPTN/SRTR:Yes, A2ALL:No OPTN/SRTR:No, A2ALL:Yes McNemar's p-Value
Ascites 566 332 (59%) 171 (30%) 37 (7%) 26 (5%) 0.1658
Encephalopathy 295 20 (7%) 223 (76%) 7 (2%) 45 (16%) <0.0001   
Spontaneous bacterial peritonitis 559 20 (4%) 508 (91%) 26 (4%) 5 (1%) 0.0002
History of TIPSS¹ 475 38 (8%) 415 (87%) 3 (1%) 19 (4%) 0.0006
Dialysis 297 9 (3%) 285 (96%) 0 (0%) 3 (1%) n/a
HCV RNA result 71 Positive: 63 (90%) Negative: 2 (3%) 3 (4%) 3 (4%) 1.000   
HCC recurrence 94 14 (15%) 71 (76%) 1 (1%) 8 (7%) 0.0196
Treated rejection during 1st year post-transplant 513 89 (17%) 287 (56%) 17 (3%) 120 (24%) <0.0001   
Continuous variables: percentiles of the difference distribution (A2ALL minus OPTN/SRTR)
Variable Min 1% 5% 10% 25% 50% 75% 90% 95% 99% Max Paired t-test p-value
Date of death−366−1210000001723650.6898
Weight −29.8 −12.5 −6.2 −2.6 0 0 0 2.4 8 22 47.8 0.1891
Serum creatinine −1.5 −0.7 −0.2 0 0 0 0 0 0.1 1.1 8.6 0.4892
Serum albumin −2.9 −1.1 −0.3 0 0 0 0 0.1 0.6 1.9 2.5 0.0257
Serum bilirubin −26.1 −9.3 −1.4 −0.2 0 0 0 0.2 1.2 5.6 55 0.6556
INR −35 −1.1 −0.2 0 0 0 0 0.2 0.3 1 2.1 0.4824
Cold ischemia time (LDLT) (minutes) −2243 −810 −47 −43 −26 −1 0 7 16 130 454 0.0146
Cold ischemia time (DDLT) (minutes) −170 −149 −42 −18 −1.5 0 15.5 52 135 360 376 0.0014
ALT, SGPT−1599−446−57−40000926250890.8411
Ordinal variables N Same value 1 level different OPTN/SRTR >1 level higher A2ALL >1 level higher
Functional status (NYHA) 508 428 (84%) 44 (9%) 21 (4%) 15 (3%)

  ¹ Restricted to patients who were transplanted after 10/25/1999 when OPTN/SRTR started to collect value.

Although each continuous variable in Table 2B had a few instances of extreme discrepancies, CIT exhibited a systematic difference between OPTN/SRTR and A2ALL, with shorter LDLT CIT values reported in A2ALL than OPTN/SRTR (p = 0.0146) and longer DDLT CIT values reported in A2ALL than OPTN/SRTR (p = 0.0014). There were 16 LDLT CIT and 16 DDLT CIT values that were discrepant by >1 hour. Although most albumin values were identical in A2ALL and OPTN/SRTR, those that were different were generally higher in A2ALL than in the OPTN/SRTR data. Among the discrepant MELD components, 8 of 42 serum creatinine values differed by more than 1.0 mg/dL; 8 of 67 bilirubin values differed by more than 1.2 mg/dL; and 12 of 46 INR values differed by more than 1.0. In all cases, the distributions of differences were fairly symmetrical. For functional status, of the 80 with discrepant values, 44 differed by a single level of the five New York Heart Association classes.

Of the recipient transplant variables with more than 10% of values missing in OPTN/SRTR but present in A2ALL, all but rejection appeared to be missing at random, as indicated by similar values in A2ALL among those missing in OPTN/SRTR compared with those present in OPTN/SRTR. For rejection treated during the first year post-transplantation, 21% were reported as having rejection in OPTN/SRTR versus 34% reported as having rejection in A2ALL among those with missing values in OPTN/SRTR (p = 0.009).

Among 12 living liver donor variables, most values were present and identical in both databases (Table 3A). However, of the four donor deaths recorded in A2ALL data, only two were recorded in OPTN/SRTR. Both were identified only by linkage to SSDMF and had not been entered to OPTN by the center. Donor education was missing most commonly (19% in both databases). Several variables had substantial proportions of missing values in OPTN/SRTR that were present in A2ALL: height (28%), weight (22%), CMV IgG (24%) and education (20%). Variables where values were present in OPTN/SRTR but missing in A2ALL were uncommon. Donor elements with values present in both OPTN/SRTR and A2ALL but discrepant included education (8%) and relationship to the recipient (6%) (Table 3B). Donor variables with more than 10% missing in OPTN/SRTR but present in A2ALL appeared to be missing at random, as indicated by similar values in A2ALL data among those missing in OPTN/SRTR compared with those present in OPTN/SRTR.

Table 3A. OPTN/SRTR validation of selected data elements for living donors
Cell entries are N (%).
Variable | Number of patients | Missing in both OPTN/SRTR and A2ALL | Missing in OPTN/SRTR, present in A2ALL | Present in OPTN/SRTR, missing in A2ALL | Present in both OPTN/SRTR and A2ALL, but different values | Present in both OPTN/SRTR and A2ALL, same values¹
Demographics
 Gender | 386 | 1 (0.3%) | 0 (0.0%) | 0 (0.0%) | 3 (0.8%) | 382 (99.0%)
 Ethnicity | 386 | 1 (0.3%) | 0 (0.0%) | 1 (0.3%) | 13 (3.4%) | 371 (96.1%)
 Race | 386 | 1 (0.3%) | 0 (0.0%) | 1 (0.3%) | 7 (1.8%) | 377 (97.7%)
 Highest education level at enrollment | 386 | 74 (19.2%) | 76 (19.7%) | 8 (2.1%) | 30 (7.8%) | 198 (51.3%)
 State of permanent residence at enrollment | 386 | 3 (0.8%) | 6 (1.6%) | 3 (0.8%) | 14 (3.6%) | 360 (93.3%)
Medical data
 ABO blood type | 386 | 1 (0.3%) | 0 (0.0%) | 1 (0.3%) | 1 (0.3%) | 383 (99.2%)
 Height at enrollment | 386 | 7 (1.8%) | 107 (27.7%) | 0 (0.0%) | 7 (1.8%) | 265 (68.7%)
 Weight² | 386 | 1 (0.3%) | 86 (22.3%) | 1 (0.3%) | 13 (3.4%) | 285 (73.8%)
 Relationship to recipient | 386 | 0 (0.0%) | 1 (0.3%) | 1 (0.3%) | 22 (5.7%) | 362 (93.8%)
 CMV IgG at enrollment | 386 | 19 (4.9%) | 93 (24.1%) | 2 (0.5%) | 13 (3.4%) | 259 (67.1%)
Mortality follow-up³
 Date of death | 4 | 0 (0.0%) | 2 (50.0%) | 0 (0.0%) | 0 (0.0%) | 2 (50.0%)
 Primary cause of death | 4 | 0 (0.0%) | 4 (100.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%)

  ¹ ‘Same values’ were values with a difference between OPTN/SRTR and A2ALL of at most 5 cm for height and 5 kg for weight.
  ² Donor weight at enrollment in A2ALL versus donor weight at transplant in OPTN/SRTR.
  ³ Restricted to donors whose death was recorded in A2ALL.

Table 3B. Comparison between OPTN/SRTR and A2ALL of selected data elements for donors among values present in both sources
Ordinal variables | N | Same value | OPTN/SRTR higher | A2ALL higher
Highest education level | 228 | 198 (87%) | 5 (2%) | 25 (11%)
Categorical variables | N | Same value | OPTN/SRTR: biological, A2ALL: nonbiological | OPTN/SRTR: nonbiological, A2ALL: biological | Both biological, but different type | Both nonbiological, but different type
Relationship to recipient | 384 | 362 (94%) | 6 (2%) | 8 (2%) | 6 (2%) | 2 (0.5%)

2009 national data from SRTR were analyzed to show the distributions of percent missing across all U.S. liver transplant programs for eight variables, separately for A2ALL and non-A2ALL centers (Figure 1). These variables are often tested in SRTR inferential models, and many have been included in published SRTR-based research. There was wide variation in the percent missing among programs for each variable. Although well over one-half of programs had 20% or less missing for all eight variables, a few had 30–95% missing values for several of the variables.

Figure 1.

Boxplot distributions of percent missing for OPTN/SRTR data from all U.S. liver transplant centers for several recipient candidate variables tested in SRTR analytic models (2009 data). (Func Status = functional status by New York Heart Association scale; Vari Bld = variceal bleed; Up Ab Surg = upper abdominal surgery; HTN = hypertension.) Boxes show the 25th to 75th percentiles, ‘+’= mean, and bar across box = median. Whiskers extend to the data point closest to the center within 1.5 * IQR (interquartile range) from each box end, and values beyond these points are shown individually.
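The center-specific summaries behind a figure of this kind can be sketched as follows. This is an illustrative sketch, not the SRTR code: the record layout (a dict with a `center` key and `None` for missing values) and the function names are assumptions, quartiles use Python's "inclusive" interpolation, which may differ slightly from the software used for the figure, and whiskers follow the usual Tukey convention of the most extreme data point within 1.5 * IQR of each box end.

```python
from statistics import mean, quantiles

def percent_missing_by_center(records, variable):
    """Percent of records per center where `variable` is None (missing)."""
    totals, missing = {}, {}
    for rec in records:
        center = rec["center"]
        totals[center] = totals.get(center, 0) + 1
        if rec.get(variable) is None:
            missing[center] = missing.get(center, 0) + 1
    return {c: 100.0 * missing.get(c, 0) / totals[c] for c in totals}

def box_summary(values):
    """Box-plot statistics: quartiles, mean, whisker ends and outliers."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1, "median": med, "q3": q3, "mean": mean(values),
        "whisker_low": min(inside), "whisker_high": max(inside),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }

# Made-up percent-missing values for six hypothetical centers.
summary = box_summary([0, 5, 10, 15, 20, 95])
```

Centers that appear in `outliers` are the natural targets for the center-specific monitoring of missing values that the abstract recommends.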

Discussion

This study provided a comprehensive comparison of OPTN data with source data based on systematically collected clinical information from nine major transplant centers. The design of the A2ALL retrospective chart review provided a unique opportunity to evaluate not only the accuracy of nationally submitted transplant registry data, but the extent to which otherwise missing data may be captured from a thorough review of the medical record by trained personnel.

While the results demonstrated that most submitted OPTN/SRTR data were consistent with A2ALL, substantial problems with missing and discrepant data were revealed. Many values missing from OPTN/SRTR were avoidably missing, as shown by the fact that they could be collected in A2ALL. The extent of avoidable missing OPTN/SRTR data was 10–12% for several recipient candidate variables, up to 36% for transplant variables and up to 28% for donor variables. The pattern of less missing data at listing than at transplant or follow-up may reflect a greater incentive for centers to enter complete data before rather than after the receipt of an organ for transplant. An investigation of whether variables were ‘missing completely at random’ in the OPTN/SRTR data using the more complete data from A2ALL revealed several variables that violated this assumption. In particular, the presence of coronary artery disease and the occurrence of treated rejection during the first year post-transplant were both reported with significantly lower frequency in OPTN/SRTR than in A2ALL. When educational attainment was missing in OPTN/SRTR data, it was likely to be less than average when found in A2ALL. Estimation of the associations of such variables with outcomes can be severely biased if missing data are not missing at random.

Unresolved discrepancies between OPTN/SRTR and A2ALL were common for some variables. Among continuous variables, CIT differed by hours in several cases. For categorical variables, discrepancies of 9% to 28% were found for previous upper abdominal surgery, presence of encephalopathy or presence of ascites at listing. Some variables may be difficult to accurately code based on medical chart review. However, variables such as CIT and previous upper abdominal surgery have been found to be significantly predictive of outcomes in SRTR (5,6) and A2ALL (7), based on potentially flawed data.

Because OPTN/SRTR data are used extensively for analyses that inform medical practice and national transplant policy, the consequences of these data issues must be considered. For example, an analysis of treated rejection during the first year after LDLT versus DDLT based on OPTN/SRTR data (8) yielded conclusions that differed substantially from those based on a subsequent analysis of A2ALL data (7,9). It is disconcerting that among A2ALL subjects with rejection data, 23% had discrepant values for rejection in OPTN/SRTR, and an additional 14% were missing information on treated rejection in the OPTN/SRTR data. These findings raise concerns about reports that rely on OPTN/SRTR data on the incidence of acute rejection (10), including publications funded by pharmaceutical corporations that demonstrate putative benefits of particular products (11,12). A second programmatic implication is the effect of missing or discrepant data on OPTN assessment of program performance. Accurate estimation of expected survival requires complex modeling based on multiple donor and recipient variables. Unfortunately, large amounts of missing data at individual centers degrade the model estimates for all centers, with resultant biases of unknown size and direction.

Missing values always have an impact on an analysis, whether observations with missing data are deleted (leading to reduced statistical power and possible bias), included with missing data indicators (leading to biased estimates, possibly severe), or multiply imputed (the preferred method, but still yielding reduced power compared with complete data). Most SRTR publications have used the second method, even with variables missing as much as 31% (education), 30% (variceal bleed) and 35% (functional status) in models based on data from the most recent 3 years (5). This method can cause bias in parameter estimates and standard errors of any variable in a regression model (13,14), with higher proportions missing associated with greater bias, although the impact on model prediction is smaller.
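The bias introduced by missing data indicators can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not an analysis from the study: two correlated covariates each have a true coefficient of 1, the second covariate is deleted completely at random for half of the observations, and all names and parameter values (`simulate`, `rho`, `p_missing`, the seed) are hypothetical. The complete-case fit and the missing-indicator fit are compared on the coefficient of the fully observed covariate.

```python
import random


def ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(xi[i] * xi[j] for xi in X) for j in range(k)]
         + [sum(xi[i] * yi for xi, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def simulate(n=4000, rho=0.8, p_missing=0.5, seed=1):
    """Return the x1 coefficient from complete-case and indicator fits."""
    rng = random.Random(seed)
    x1, x2, y, miss = [], [], [], []
    for _ in range(n):
        a = rng.gauss(0, 1)
        b = rho * a + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        x1.append(a)
        x2.append(b)
        y.append(a + b + rng.gauss(0, 1))    # true coefficients are both 1
        miss.append(rng.random() < p_missing)  # x2 missing completely at random

    # Method 1: delete observations with missing x2.
    cc_rows = [(a, b) for a, b, m in zip(x1, x2, miss) if not m]
    cc_y = [v for v, m in zip(y, miss) if not m]
    complete_case = ols(cc_rows, cc_y)[1]

    # Method 2: set missing x2 to 0 and add a missing-data indicator.
    ind_rows = [(a, 0.0 if m else b, 1.0 if m else 0.0)
                for a, b, m in zip(x1, x2, miss)]
    indicator = ols(ind_rows, y)[1]
    return complete_case, indicator


cc_estimate, indicator_estimate = simulate()
print("complete-case x1 coefficient:", round(cc_estimate, 2))
print("missing-indicator x1 coefficient:", round(indicator_estimate, 2))
```

Under these assumptions the complete-case estimate stays near the true value of 1, whereas the indicator method inflates the coefficient of the fully observed covariate, because that covariate absorbs part of the effect of its correlated, partly missing partner. Multiple imputation is not shown here because it requires more machinery than a short sketch allows.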

We investigated center-level characteristics predictive of missing data for two variables with substantial missing data (CIT and education), with inconsistent results. For example, higher center volume was significantly associated with lower probability of missing CIT, but higher probability of missing education. Significant but inconsistent differences by calendar year were observed, and no significant geographic effects (rural, micro-urban, metro-urban) were seen. The type of medical record system could contribute to ease of data extraction, but was not known for this analysis. The strongest predictive factor by far was the effect of individual center, found in both models. Center variability in percent missing was wide, with a minority of the centers accounting for the bulk of the missing data. For example, 65% of the missing functional status values came from a handful of centers, each with 10–60% missing. Furthermore, centers with substantial missing data on one variable were much more likely to have missing data on other variables, with correlations between center percent missing for pairs of variables ranging from 0.42 to 0.90.
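The center-level summaries described above (percent missing by center, the share of missing values contributed by high-missing centers, and the correlation of percent missing between two variables) can be sketched as follows. The record layout, field names (`center`, `cit`, `education`), threshold and data are hypothetical; a Pearson correlation is assumed because the type of correlation is not stated here.

```python
from collections import Counter
from math import sqrt


def percent_missing_by_center(records, variable):
    """Percent of records with a missing (None) value, per center."""
    totals, missing = Counter(), Counter()
    for rec in records:
        totals[rec["center"]] += 1
        if rec.get(variable) is None:
            missing[rec["center"]] += 1
    return {c: 100.0 * missing[c] / totals[c] for c in totals}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def missing_share_of_flagged(records, variable, threshold_pct):
    """Flag centers above a percent-missing threshold and report the share
    of all missing values that those centers contribute."""
    pct = percent_missing_by_center(records, variable)
    flagged = sorted(c for c, p in pct.items() if p > threshold_pct)
    is_missing = [r for r in records if r.get(variable) is None]
    from_flagged = [r for r in is_missing if r["center"] in flagged]
    share = 100.0 * len(from_flagged) / len(is_missing) if is_missing else 0.0
    return flagged, share


def make_center(name, n, n_missing_cit, n_missing_edu):
    """Build hypothetical records for one center (illustration only)."""
    return [{"center": name,
             "cit": None if i < n_missing_cit else 6.0,
             "education": None if i < n_missing_edu else "college"}
            for i in range(n)]


records = (make_center("A", 10, 0, 1) + make_center("B", 10, 5, 4)
           + make_center("C", 10, 2, 2))
cit_pct = percent_missing_by_center(records, "cit")
edu_pct = percent_missing_by_center(records, "education")
centers = sorted(cit_pct)
print(cit_pct, edu_pct)
print(pearson([cit_pct[c] for c in centers], [edu_pct[c] for c in centers]))
print(missing_share_of_flagged(records, "cit", threshold_pct=30))
```

A report of this kind, run per variable, would show whether a few centers account for most of the missing values and whether the same centers recur across variables.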

A limitation of this study is that data were collected mostly prior to 2004. However, many OPTN/SRTR analyses still incorporate these data, and problems of missing data exist with more current (2009) OPTN/SRTR data (Figure 1). Because of the thorough A2ALL chart reviews, logic and error checks, further scrutiny during analyses for A2ALL manuscripts (15–19), and the benefit of A2ALL corrections after comparison with OPTN/SRTR data, we believe most discrepancies represented errors or omissions in submission of data to OPTN. Nevertheless, OPTN data were submitted closer to the time of listing and transplant and may have benefited from information available at those times that was not documented in the patient charts. Another limitation is that the current results do not necessarily apply to other organ recipients or donors.

Initiatives by OPTN to improve data quality have included a decrease in the number of variables to reduce the burden of data submission, building range and logic checks into UNET and performance of periodic audits of centers to monitor accuracy of data submitted for allocation. The achievement of 100% complete data for variables required for liver organ allocation in the MELD era is notable, and efforts have been made to report levels of missing data. The annual SRTR Report on the State of Transplantation has included papers on national transplant data and analysis issues that examined how to overcome the problem of missing outcomes using additional sources of ascertainment, such as the SSDMF (20,21). However, neither SRTR nor the OPTN has addressed missing or incorrect values for other variables. The use of external databases, such as hospital databases, Medicare records (22,23) or private payer claims data (24) to augment OPTN/SRTR data has been suggested. Given that agreement between databases is far from perfect (22–24), policies to deal with inconsistencies would have to be developed. A future standardized national health record could facilitate the electronic submission of hospital data, and would likely improve the quality of submitted data.

Funding is often insufficient to have complete, correct and timely data in large registry databases. Data collection and monitoring in A2ALL required as much as one full-time equivalent (FTE) coordinator per center and 2–3 FTEs at the Data Coordinating Center. However, the quality of OPTN/SRTR data might be improved by monitoring missing data frequencies for individual variables by center, increasing the number of required variables, adding further range or logic checks in UNET, and auditing sites for the accuracy of variables other than those used for organ allocation. Reporting of the number of subjects with missing data should be standard in all peer-reviewed manuscripts and in data reports from the OPTN and SRTR. Such actions would improve the ability of the OPTN, SRTR and other investigators to address important scientific questions in the field of solid organ transplantation.

Acknowledgments

This study was supported by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) of the National Institutes of Health through cooperative agreements (grant numbers U01-DK62536, U01-DK62444, U01-DK62467, U01-DK62483, U01-DK62484, U01-DK62494, U01-DK62496, U01-DK62498, U01-DK62505, U01-DK62531). Additional support was provided by the Health Resources and Services Administration (HRSA) and the American Society of Transplant Surgeons (ASTS). The following individuals were instrumental in the planning and conduct of this study and/or the care of patients enrolled in it at each of the participating institutions:

Columbia University Health Sciences, New York, NY (DK62483). PI: Jean C. Emond, MD; Co-PI: Robert S. Brown, Jr., MD, MPH; study coordinators: Rudina Odeh-Ramadan, PharmD; Scott Heese, BA; Northwestern University, Chicago, IL (DK62467). PI: Michael M.I. Abecassis, MD, MBA; Co-PI: Laura M. Kulik, MD; study coordinator: Patrice Al-Saden, RN, CCRC; University of Pennsylvania Health System, Philadelphia, PA (DK62494). PI: Abraham Shaked, MD, PhD; Co-PI: Kim M. Olthoff, MD; study coordinators: Brian Conboy, PA, MBA; Mary Shaw, RN, BBA; University of Colorado Health Sciences Center, Denver, CO (DK62536). PI: Gregory T. Everson, MD; Co-PI: Igal Kam, MD; study coordinators: Carlos Garcia, BS; Anastasia Krajec, RN; University of California Los Angeles, Los Angeles, CA (DK62496). PI: Johnny C. Hong, MD; Co-PI: Ronald W. Busuttil, MD, PhD; study coordinator: Janet Mooney, RN, BSN; University of California San Francisco, San Francisco, CA (DK62444). PI: Chris E. Freise, MD, FACS; Co-PI: Norah A. Terrault, MD; study coordinators: Dulce MacLeod, RN; Vivian Tan, MD; University of Michigan Medical Center, Ann Arbor, MI (DK62498). PI: Robert M. Merion, MD; DCC Staff: Anna S.F. Lok, MD; Akinlolu O. Ojo, MD, PhD; Brenda W. Gillespie, PhD; Margaret Hill-Callahan, BS, LSW; Terese Howell, BS; Lan Tong, MS; Tempie H. Shearon, MS; Karen A. Wisniewski, MPH; Monique Lowe, BS; Abby Smith, BA; University of North Carolina, Chapel Hill, NC (DK62505). PI: Paul H. Hayashi, MD, MPH; study coordinator: Tracy Russell, MA; University of Virginia (DK62484). PI: Carl L. Berg, MD; Co-PI: Timothy L. Pruett, MD; study coordinator: Jaye Davis, RN; Medical College of Virginia Hospitals, Virginia Commonwealth University, Richmond, VA (DK62531). PI: Robert A. Fisher, MD, FACS; Co-PI: Mitchell L. Shiffman, MD; study coordinators: Andrea Lassiter; April Ashworth, RN; National Institute of Diabetes and Digestive and Kidney Diseases, Division of Digestive Diseases and Nutrition, Bethesda, MD. James E. Everhart, MD, MPH; Leonard B. Seeff, MD; Patricia R. Robuck, PhD; Jay H. Hoofnagle, MD.

Supplemental data included here have been supplied by the Arbor Research Collaborative for Health as the contractor for the Scientific Registry of Transplant Recipients (SRTR). The interpretation and reporting of these data are the responsibility of the author(s) and in no way should be seen as an official policy of or interpretation by the SRTR or the U.S. Government.
