A partially linear regression model for data from an outcome-dependent sampling design


Haibo Zhou, Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA.


Summary. The outcome-dependent sampling scheme has been gaining attention in both the statistical literature and applied fields. Epidemiological and environmental researchers have been using it to select observations for more powerful and cost-effective studies. Motivated by a study of the effect of in utero exposure to poly-chlorinated biphenyls on children's intelligence quotient at age 7 years, in which the effect of an important confounding variable is non-linear, we consider a semiparametric regression model for data from an outcome-dependent sampling scheme where the relationship between the response and covariates is only partially parameterized. We propose penalized spline maximum likelihood estimation for inference on both the parametric and the non-parametric components and develop the asymptotic properties of the resulting estimators. Through simulation studies and an analysis of the intelligence study, we compare the proposed estimator with several competing estimators. Practical considerations for implementing these estimators are discussed.