• covariate dependence;
• length of stay;
• Markov chain;
• right skewed data;
• statistical modeling


Objectives:  To present a relatively novel method for modeling length-of-stay data and assess the role of covariates, some of which are related to adverse events. To undertake critical comparisons with alternative models based on the gamma and log-normal distributions. To demonstrate the effect of poorly fitting models on decision-making.

Methods:  The model organizes a hospital stay into Markov phases (states) that a patient occupies before discharge, which is represented as an absorbing state. Admission is via state 1; discharge directly from this first state corresponds to a short stay, while transitions to later states correspond to longer stays. The resulting phase-type probability distributions provide a flexible framework for modeling length-of-stay data, which are known to be awkward and difficult to fit with other distributions.
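
As an illustration only, the phase structure described in the Methods can be sketched as a small sub-generator matrix from which survival probabilities and the mean stay follow. This is a minimal sketch and not the authors' implementation. It assumes a sequential (Coxian-type) arrangement in which each phase either moves to the next phase or discharges; the function names, the rate parameters `lam` and `mu`, and the numeric rates in the usage note are hypothetical, and covariates are not included.

```python
import numpy as np
from scipy.linalg import expm


def coxian_generator(lam, mu):
    """Build the sub-generator matrix T over the transient phases.

    lam[i]: assumed rate of moving from phase i+1 to phase i+2 (length n-1).
    mu[i]:  assumed rate of discharge (absorption) from phase i+1 (length n).
    """
    lam = np.asarray(lam, dtype=float)
    mu = np.asarray(mu, dtype=float)
    n = len(mu)
    total_out = mu.copy()
    total_out[:-1] += lam              # the last phase can only discharge
    T = np.diag(-total_out)            # diagonal holds minus the total exit rate
    T[np.arange(n - 1), np.arange(1, n)] = lam   # forward transitions
    return T


def _start_in_phase_one(n):
    # Admission is via state 1, so all initial probability sits there.
    alpha = np.zeros(n)
    alpha[0] = 1.0
    return alpha


def survival(t, T):
    """P(length of stay > t) = alpha exp(T t) 1."""
    n = T.shape[0]
    return float(_start_in_phase_one(n) @ expm(T * t) @ np.ones(n))


def density(t, T):
    """f(t) = alpha exp(T t) q, where q = -T 1 holds the discharge rates."""
    n = T.shape[0]
    discharge_rates = -T @ np.ones(n)
    return float(_start_in_phase_one(n) @ expm(T * t) @ discharge_rates)


def mean_stay(T):
    """E[length of stay] = -alpha T^{-1} 1."""
    n = T.shape[0]
    return float(-_start_in_phase_one(n) @ np.linalg.solve(T, np.ones(n)))
```

For example, with the made-up rates `T = coxian_generator(lam=[1.0], mu=[1.0, 2.0])`, half of the patients are discharged from phase 1 and half continue to phase 2, so `mean_stay(T)` is 0.5 + 0.5 × 0.5 = 0.75 time units. Adding phases lengthens the right tail, which is how a model of this kind can accommodate right-skewed data.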

Results:  The dataset consisted of 1901 patients' lengths of stay and values for a number of covariates. The fitted model comprised six Markov phases and provided a good fit to the data. Alternative gamma and log-normal models did not fit as well and gave different coefficient estimates, and the statistical significance of covariate effects differed between the models.

Conclusions:  Models that fit the data well should generally be preferred over those that do not, as they will produce more statistically reliable coefficient estimates. Poor coefficient estimates may mislead decision-makers by either understating or overstating the cost of an event or the cost savings from preventing it. There is no obvious way of identifying a priori when coefficient estimates from poorly fitting models might be misleading.