Modeling Length of Stay in Hospital and Other Right Skewed Data: Comparison of Phase-Type, Gamma and Log-Normal Distributions
Article first published online: 24 JUL 2008
© 2008, International Society for Pharmacoeconomics and Outcomes Research (ISPOR)
Value in Health
Volume 12, Issue 2, pages 309–314, March/April 2009
How to Cite
Faddy, M., Graves, N. and Pettitt, A. (2009), Modeling Length of Stay in Hospital and Other Right Skewed Data: Comparison of Phase-Type, Gamma and Log-Normal Distributions. Value in Health, 12: 309–314. doi: 10.1111/j.1524-4733.2008.00421.x
- Issue published online: 17 FEB 2009
Keywords
- covariate dependence;
- length of stay;
- Markov chain;
- right skewed data;
- statistical modeling
Objectives: To present a relatively novel method for modeling length-of-stay data and assess the role of covariates, some of which are related to adverse events. To undertake critical comparisons with alternative models based on the gamma and log-normal distributions. To demonstrate the effect of poorly fitting models on decision-making.
Methods: The model organizes a hospital stay into Markov phases (states) that describe time in hospital before discharge, which is represented as an absorbing state. Admission is via state 1; discharge from this first state corresponds to a short stay, and transitions to later states correspond to longer stays. The resulting phase-type probability distributions provide a flexible modeling framework for length-of-stay data, which are known to be awkward and difficult to fit with other distributions.
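To make the phase structure described in the Methods concrete, the following Python sketch assumes a Coxian form of phase-type model: from each phase a patient either is discharged (absorbed) or moves on to the next phase, and discharge is certain from the last phase. The rate values, the three-phase example, and the function names `coxian_mean` and `simulate_stay` are illustrative assumptions, not the fitted six-phase model or the authors' code.

```python
import random


def coxian_mean(lam, mu):
    """Analytic mean length of stay for an assumed Coxian phase-type model.

    lam[i]: rate of moving from phase i to phase i + 1 (one fewer than phases)
    mu[i]:  rate of discharge (absorption) from phase i
    """
    reach, total = 1.0, 0.0  # reach = probability of ever entering the phase
    for i, discharge in enumerate(mu):
        forward = lam[i] if i < len(lam) else 0.0  # no onward move from last phase
        rate = forward + discharge                 # total exit rate of the phase
        total += reach / rate                      # expected time spent in the phase
        reach *= forward / rate                    # probability of moving onward
    return total


def simulate_stay(lam, mu, rng):
    """Draw one length of stay by walking through the phases from state 1."""
    t = 0.0
    for i, discharge in enumerate(mu):
        forward = lam[i] if i < len(lam) else 0.0
        rate = forward + discharge
        t += rng.expovariate(rate)                 # exponential holding time
        if rng.random() >= forward / rate:         # discharged from this phase
            return t
    return t


# Hypothetical rates (per day) for a three-phase example.
lam = [1.0, 0.5]
mu = [1.0, 0.5, 0.25]

rng = random.Random(0)
draws = [simulate_stay(lam, mu, rng) for _ in range(20000)]
print(coxian_mean(lam, mu))      # 2.0 for these illustrative rates
print(sum(draws) / len(draws))   # simulated mean, close to the analytic value
```

With these rates, half of the simulated patients are discharged directly from state 1 after a short stay, while the quarter who reach the slow third phase produce the long right tail typical of length-of-stay data.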
Results: The dataset consisted of 1901 patients' lengths of stay and values for a number of covariates. The fitted model comprised six Markov phases and provided a good fit to the data. Alternative gamma and log-normal models did not fit as well and gave different coefficient estimates, and the statistical significance of covariate effects differed between the models.
Conclusions: Models that fit should generally be preferred over those that do not, as they will produce more statistically reliable coefficient estimates. Poor coefficient estimates may mislead decision-makers by either understating or overstating the cost of some event or the cost savings from preventing that event. There is no obvious way of identifying a priori when coefficient estimates from poorly fitting models might be misleading.