
The utility of the zero-inflated Poisson and zero-inflated negative binomial models: a case study of cross-sectional and longitudinal DMF data examining the effect of socio-economic status


W. M. Thomson, Department of Oral Sciences, School of Dentistry, The University of Otago, PO Box 647, Dunedin, New Zealand
Tel: +64 3 479 7116


Abstract – Objectives: To examine the utility of the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) approaches for modelling four sets of dental caries data from the same cohort study [with particular attention to the influence of childhood socioeconomic status (SES)]: cross-sectional data on the deciduous dentition at age 5 years; cross-sectional data on the permanent dentition at ages 18 and 26 years; and longitudinal data on caries increment between ages 18 and 26 years. Methods: Data on dental caries occurrence at ages 5, 18 and 26 years were obtained from the Dunedin Multidisciplinary Health and Development Study (DMHDS). ZIP and ZINB models were fitted to the cross-sectional (n = 745) and longitudinal (n = 809) data sets using Stata (Intercooled Stata 7.0). The dependent variables for the three cross-sectional analyses were the DMFS indices at ages 5, 18 and 26 years, and net DFS increment (NETDFS) was the dependent variable for the longitudinal analysis. Results: The empty ZIP model was a poor fit for all four data sets, whereas the empty ZINB model showed good fit; consequently, both the cross-sectional and longitudinal analyses were conducted using ZINB modelling. Being in the high-SES group during childhood was associated with a greater probability of being caries-free by age 18 years, over and above that which would be expected from the negative binomial process. Low childhood SES also had the largest coefficient in the modelling of the negative binomial process, but at age 5 years, where the adjusted mean dmfs score in the low-SES group was 6.8 (compared with 4.7 and 2.9 in the medium- and high-SES groups, respectively). The substantial SES differences which existed at age 5 years (in the deciduous dentition) had reduced somewhat by age 18 years, but had widened again by age 26 years. In the longitudinal analysis, ‘baseline’ caries experience (age 18-year DMFS) was a predictor both of being an extra zero and of caries severity. Conclusion: This investigation of the utility of the zero-inflated approach for modelling both cross-sectional and longitudinal caries data has shown that ZIP/ZINB models can provide new insight into disease patterns. It is anticipated that they will become increasingly useful in epidemiological studies that use the DMF index as the outcome measure.
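
The distinction drawn above between an 'extra zero' and a zero arising from the negative binomial process can be illustrated with a small numerical sketch. The Python below is illustrative only and is not the Stata code used in the study. The function names (`nb_pmf`, `zinb_pmf`) and all parameter values are assumptions chosen for the example, and the negative binomial is written in the common mean-dispersion form, in which the variance is mu + alpha * mu^2.

```python
import math


def nb_pmf(k, mu, alpha):
    """Negative binomial probability of count k (assumes mu > 0, alpha > 0).

    Mean-dispersion form: mean mu, variance mu + alpha * mu**2.
    """
    r = 1.0 / alpha          # 'size' parameter
    p = r / (r + mu)         # success probability
    log_pmf = (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(p) + k * math.log1p(-p)
    )
    return math.exp(log_pmf)


def zinb_pmf(k, mu, alpha, pi):
    """Zero-inflated negative binomial probability of count k.

    With probability pi the observation is an 'extra' zero; otherwise it
    is drawn from the negative binomial process (which can itself give 0).
    """
    base = (1.0 - pi) * nb_pmf(k, mu, alpha)
    return pi + base if k == 0 else base


if __name__ == "__main__":
    # Hypothetical values for illustration, not estimates from the study.
    mu, alpha, pi = 5.0, 1.5, 0.2
    print("P(0) under NB only:", nb_pmf(0, mu, alpha))
    print("P(0) under ZINB:   ", zinb_pmf(0, mu, alpha, pi))
    print("ZINB mean:         ", (1.0 - pi) * mu)
```

In this sketch, a covariate such as SES could in principle act on `pi` (the probability of being an extra zero), on `mu` (the severity of caries among those in the count process), or on both, which is the two-part interpretation the abstract describes.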