MEASURING OVERFITTING IN NONLINEAR MODELS: A NEW METHOD AND AN APPLICATION TO HEALTH EXPENDITURES
Article first published online: 9 OCT 2013
Copyright © 2013 John Wiley & Sons, Ltd.
Volume 24, Issue 1, pages 75–85, January 2015
How to Cite
(2015), MEASURING OVERFITTING IN NONLINEAR MODELS: A NEW METHOD AND AN APPLICATION TO HEALTH EXPENDITURES, Health Econ., 24, 75–85, doi: 10.1002/hec.3003
- Issue published online: 8 DEC 2014
- Manuscript Accepted: 4 SEP 2013
- Manuscript Revised: 30 JUL 2013
- Manuscript Received: 18 MAR 2012
- Copas test;
- health expenditure;
When fitting an econometric model, it is well known that we pick up part of the idiosyncratic characteristics of the data along with the systematic relationship between dependent and explanatory variables. This phenomenon is known as overfitting and generally occurs when a model is excessively complex relative to the amount of data available. Overfitting is a major threat to regression analysis in terms of both inference and prediction.
We start by showing that the Copas measure becomes confounded by shrinkage or expansion arising from in-sample bias when applied to the untransformed scale of nonlinear models, which is typically the scale of interest when assessing behaviors or analyzing policies. We then propose a new measure of overfitting that is both expressed on the scale of interest and immune to this problem. We also show how to measure the respective contributions of in-sample bias and overfitting to the overall predictive bias when applying an estimated model to new data.
Finally, we illustrate the properties of our new measure through both a simulation study and a real-data application based on inpatient healthcare expenditure data, which show that the distinctions can be important.
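As a rough illustration of the kind of split-sample shrinkage check that the Copas measure refers to, the sketch below fits a model on one half of a sample, predicts the other half, and regresses the held-out outcomes on those predictions; a slope below 1 is the usual sign of overfitting. It is only a minimal sketch under stated assumptions: it uses a linear OLS model on simulated data for simplicity, it does not reproduce the new measure proposed in the article, and the function names `copas_slope` and `simulate` and all parameter values are hypothetical.

```python
import numpy as np


def copas_slope(X_train, y_train, X_valid, y_valid):
    """Split-sample shrinkage check (illustrative, not the article's measure).

    Fits OLS on the training half, predicts the validation half, then
    regresses the validation outcomes on those predictions and returns
    the slope of that second regression.
    """
    # Design matrices with an intercept column.
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_valid = np.column_stack([np.ones(len(X_valid)), X_valid])

    # Step 1: estimate the model on the training half only.
    beta, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

    # Step 2: out-of-sample predictions for the validation half.
    y_hat = A_valid @ beta

    # Step 3: regress actual validation outcomes on the predictions.
    B = np.column_stack([np.ones(len(y_hat)), y_hat])
    coef, *_ = np.linalg.lstsq(B, y_valid, rcond=None)
    return float(coef[1])


def simulate(n=200, k=40, seed=0):
    """Simulated example: many irrelevant regressors, few observations.

    All values here are assumptions chosen to make overfitting visible.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, k))
    # Only the first regressor has a systematic effect on the outcome.
    y = 1.0 + 0.5 * X[:, 0] + rng.normal(size=n)
    half = n // 2
    return copas_slope(X[:half], y[:half], X[half:], y[half:])


if __name__ == "__main__":
    print("validation slope:", simulate())
```

In this linear setting the slope can be read directly as shrinkage. The article's point is that on the untransformed scale of a nonlinear model the same slope would also absorb in-sample bias, which this sketch does not attempt to separate out.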