The objective of this study was to compare the classification of hospitals as outcome outliers using a commonly implemented frequentist statistical approach versus an implementation of Bayesian hierarchical statistical models, using 30-day hospital-level mortality rates for a cohort of acute myocardial infarction patients as a test case. For the frequentist approach, a logistic regression model was constructed to predict mortality, and a risk-adjusted mortality rate was computed for each hospital. Hospitals whose 95% confidence interval around the risk-adjusted mortality rate excluded the mean mortality rate were classified as outliers. With the Bayesian hierarchical models, three factors could vary: the profile of the typical patient (low, medium or high risk), the extent to which the mortality rate for the typical patient departed from average, and the probability that the mortality rate did indeed differ by the specified amount. Agreement between the two methods was assessed across different patient profiles, threshold differences from the average, and probabilities. Agreement between the Bayesian and frequentist approaches was generally poor: the kappa statistic was at least 0.40 in only five of the 27 comparisons, and the remaining 22 comparisons showed only marginal agreement. Within the Bayesian framework, hospital classification clearly depended on patient profile, threshold, and probability of exceeding the threshold. These inconsistencies raise questions about the validity of current methods for classifying hospital performance and suggest an urgent need for research into which methods are most meaningful to clinicians, managers and the general public.
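
As an illustration of how agreement between two outlier classifications can be quantified, the following is a minimal sketch of Cohen's kappa for two label vectors. The function name `cohen_kappa` and the ten hospital labels are hypothetical and are not taken from the study's data; the abstract does not state which kappa variant was used, so the standard unweighted form is assumed here.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two equally long label sequences."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and equally long")

    n = len(labels_a)

    # Observed agreement: share of hospitals given the same label by both methods.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: computed from each method's marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        # Both methods assign every hospital to the same single category.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for ten hospitals: 1 = outlier, 0 = not an outlier.
frequentist = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
bayesian = [1, 0, 0, 0, 0, 1, 0, 1, 0, 0]

kappa = cohen_kappa(frequentist, bayesian)
print(round(kappa, 3))  # 0.524
```

In this made-up example the two methods agree on 8 of 10 hospitals, but because agreement by chance alone would be 0.58, kappa is only about 0.52. Repeating such a calculation for each combination of patient profile, threshold and probability would give one kappa value per comparison, which could then be set against a cut-off such as the 0.40 mentioned above.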