Keywords: algorithms; positive predictive value; primary sclerosing cholangitis; sensitivity; validation


Background/Aims: Administrative databases could be useful in studying the epidemiology of primary sclerosing cholangitis (PSC); however, there is no information on the validity of the diagnostic code used for PSC in these databases. The aims of this study were to determine the validity of administrative data for a diagnosis of PSC and to generate algorithms for identifying PSC patients.

Methods: The sensitivity (Se) and positive predictive value (PPV) of a PSC diagnosis based on administrative data from 2000 to 2003 were determined against chart review. Algorithms were developed by considering variables associated with PSC and coding details. A logistic regression model was constructed using covariates associated with PSC. Based on this model, each subject was assigned a probability of having PSC. A probability cutoff was then selected to maximize the Se and specificity (Sp) for predicting PSC cases.
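The cutoff-selection step described above can be illustrated with a minimal sketch. The abstract does not state the exact criterion used to "maximize" Se and Sp jointly, so the sketch assumes Youden's J statistic (Se + Sp - 1), which is one common choice. The function names and the toy probabilities and labels are hypothetical and are not study data.

```python
from typing import List, Tuple


def confusion_at_cutoff(
    probs: List[float], labels: List[int], cutoff: float
) -> Tuple[int, int, int, int]:
    """Count TP, FP, TN, FN when subjects with prob >= cutoff are called PSC."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        predicted = p >= cutoff
        if predicted and y:
            tp += 1
        elif predicted and not y:
            fp += 1
        elif not predicted and y:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def best_cutoff(
    probs: List[float], labels: List[int]
) -> Tuple[float, float, float, float]:
    """Return (cutoff, Se, Sp, J) for the cutoff with the largest Youden's J.

    Assumption: J = Se + Sp - 1 stands in for the unspecified criterion.
    Ties keep the lowest cutoff. Requires at least one case and one non-case.
    """
    best = None
    for cutoff in sorted(set(probs)):
        tp, fp, tn, fn = confusion_at_cutoff(probs, labels, cutoff)
        se = tp / (tp + fn)
        sp = tn / (tn + fp)
        j = se + sp - 1
        if best is None or j > best[3]:
            best = (cutoff, se, sp, j)
    return best


# Hypothetical model-assigned probabilities and chart-review labels (1 = PSC).
toy_probs = [0.02, 0.03, 0.05, 0.08, 0.10, 0.40, 0.60, 0.90]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, se, sp, j = best_cutoff(toy_probs, toy_labels)
print(f"cutoff={cutoff}, Se={se:.2f}, Sp={sp:.2f}, J={j:.2f}")
```

In practice the probabilities would come from the fitted logistic regression model and the labels from chart review. The scan over candidate cutoffs is the same regardless of how the probabilities were produced.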

Results: In the administrative data, the initial Se and PPV were 83.7% and 7.2%, respectively. The optimal algorithm required one PSC code and one inflammatory bowel disease code and had a Se of 56% and a PPV of 59%. Overall, the Se and PPV of the algorithms were too low to identify a cohort of true PSC cases. The predictive model was constructed using six covariates. For this model, the area under the receiver operating characteristic curve was 93.5%. A cutoff of 0.0729 was used, which maximized Se (81.9%) and Sp (90.7%); however, the PPV was 41.0%.
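The gap between a high Se and Sp and a modest PPV follows from the dependence of PPV on the proportion of true cases in the population tested. The sketch below applies the standard relation PPV = Se·p / (Se·p + (1 − Sp)·(1 − p)), where p is that proportion. The abstract does not report p; the value computed here is back-calculated under the assumption that the reported Se, Sp and PPV all refer to the same set of subjects, and the function names are illustrative.

```python
def ppv(se: float, sp: float, prevalence: float) -> float:
    """PPV from sensitivity, specificity and the proportion of true cases."""
    true_pos = se * prevalence
    false_pos = (1 - sp) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def implied_prevalence(se: float, sp: float, ppv_value: float) -> float:
    """Solve the PPV relation for prevalence, given Se, Sp and PPV."""
    fp_rate = 1 - sp
    return ppv_value * fp_rate / (se * (1 - ppv_value) + ppv_value * fp_rate)


# Values reported for the predictive model: Se 81.9%, Sp 90.7%, PPV 41.0%.
p = implied_prevalence(0.819, 0.907, 0.410)
print(f"implied proportion of true PSC cases: {p:.3f}")  # roughly 0.073
print(f"PPV recomputed from that proportion:  {ppv(0.819, 0.907, p):.3f}")
```

Under that assumption, only about 7% of the subjects scored by the model were true PSC cases, so even a 9.3% false-positive rate produces more false positives than true positives.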

Conclusion: It was not possible to develop an algorithm that reliably identifies true PSC cases from administrative data. We recommend that PSC receive an ICD code distinct from that of ascending cholangitis.