Objectives: To assess the accuracy of probabilistic linkage for matching de-identified ambulance records to a state trauma registry.
Methods: This was a retrospective cohort analysis. Three thousand nine hundred nineteen true matches between ambulance and state trauma registry data from 1998 to 2003 were identified by deterministic matching on trauma identification number and verified by human review. Two thousand thirty-eight ambulance records from trauma patients not meeting criteria for a true match, and an identical number of trauma registry records randomly selected from the one local county served by a different EMS provider, were included as nonmatches. Seventeen variables were considered for linkage: age, gender, race, county, hospital, date, rural setting, call and arrival times, mechanism, penetrating injury, vital signs, intubation, and intoxication. Probabilistic linkage was used to link the two data sets, using seven different combinations of common variables (maximum, 17; minimum, 4). Sensitivity and specificity for identifying true matches and nonmatches, with 95% confidence intervals (95% CI), were calculated for each combination of variables.
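As an illustration of how probabilistic linkage scores a candidate record pair, the following is a minimal sketch of the classic Fellegi-Sunter agreement weight. The abstract does not state which algorithm or software was used, so this formulation is an assumption, and the field names and m/u probabilities below are hypothetical values chosen only for the example.

```python
import math

# Hypothetical (m, u) probabilities per field, NOT taken from the study:
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are not a match)
FIELDS = {
    "age": (0.95, 0.02),
    "gender": (0.98, 0.50),
    "hospital": (0.97, 0.10),
    "date": (0.99, 0.001),
}


def match_weight(rec_a, rec_b, fields=FIELDS):
    """Sum log2 likelihood ratios over the compared fields."""
    total = 0.0
    for name, (m, u) in fields.items():
        a, b = rec_a.get(name), rec_b.get(name)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if a == b:
            total += math.log2(m / u)              # agreement weight (positive)
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight (negative)
    return total


ambulance = {"age": 34, "gender": "F", "hospital": "A", "date": "2001-05-12"}
registry_same = {"age": 34, "gender": "F", "hospital": "A", "date": "2001-05-12"}
registry_other = {"age": 61, "gender": "M", "hospital": "B", "date": "1999-11-03"}

print(match_weight(ambulance, registry_same))   # high score: likely match
print(match_weight(ambulance, registry_other))  # low score: likely nonmatch
```

In this sketch, fields where chance agreement is rare (small u, such as date) add far more weight on agreement than fields like gender, which is one way to read the "discriminatory power" of a variable. Pairs scoring above a chosen threshold would be declared links.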
Results: Using all 17 available variables, 3,766 of 3,919 true matches were correctly linked (sensitivity, 96.1%; 95% CI = 95.4% to 96.7%), with eight mismatches (specificity, 99.6%; 95% CI = 99.2% to 99.8%). Sensitivity fell below 95% with fewer than 15 variables; however, sensitivity was strongly dependent on the inclusion of variables with high discriminatory power. Specificity remained >98% regardless of the number of variables included.
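The reported proportions can be recomputed from the counts given above. The sketch below makes two assumptions not stated in the abstract: that the specificity denominator is 2,038 nonmatch records (which is consistent with eight mismatches yielding 99.6%), and that a Wilson score interval is an acceptable stand-in for whatever interval method the authors used. Under these assumptions the results agree with the reported intervals to one decimal place.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (assumed method)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def proportion_with_ci(successes, n):
    """Return (point estimate, lower bound, upper bound)."""
    low, high = wilson_interval(successes, n)
    return successes / n, low, high


# Counts from the Results; the 2,038 denominator is an assumption.
sensitivity = proportion_with_ci(3766, 3919)
specificity = proportion_with_ci(2038 - 8, 2038)

for label, (est, low, high) in [("sensitivity", sensitivity),
                                ("specificity", specificity)]:
    print(f"{label}: {est:.1%} (95% CI {low:.1%} to {high:.1%})")
```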
Conclusions: Probabilistic linkage is a valid method for matching ambulance records to a trauma registry without the use of patient identifiers; however, the sensitivity of identifying true matches is critically dependent on the number and type of common variables included in the analysis.