A Semiparametric Empirical Likelihood Method for Data from an Outcome-Dependent Sampling Scheme with a Continuous Outcome




Summary. Outcome-dependent sampling (ODS) schemes can be a cost-effective way to enhance study efficiency. The case–control design has been widely used in epidemiologic studies. However, when the outcome is measured on a continuous scale, dichotomizing the outcome could lead to a loss of efficiency. Recent epidemiologic studies have used ODS schemes in which, in addition to an overall random sample, a number of supplemental samples are collected based on a continuous outcome variable. We consider a semiparametric empirical likelihood inference procedure in which the underlying distribution of covariates is treated as a nuisance parameter and is left unspecified. The proposed estimator is asymptotically normal. The likelihood ratio statistic based on the semiparametric empirical likelihood function has Wilks-type properties: under the null hypothesis, it asymptotically follows a chi-square distribution and is independent of the nuisance parameters. Our simulation results indicate that, for data obtained using an ODS design, the semiparametric empirical likelihood estimator is more efficient than the conditional likelihood and probability-weighted pseudolikelihood estimators, and that ODS designs (along with the proposed estimator) can produce more efficient estimates than simple random sample designs of the same size. We apply the proposed method to a data set from the Collaborative Perinatal Project (CPP), an ongoing environmental epidemiologic study, to assess the relationship between maternal polychlorinated biphenyl (PCB) level and children's IQ test performance.
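
To make the sampling design concrete, the following is a minimal sketch of how an ODS sample with a continuous outcome might be drawn: an overall simple random sample, plus supplemental samples selected according to the value of the outcome. Everything specific here is an assumption made for illustration and is not taken from the study: the function name `ods_sample`, the simulated linear model, the choice of the two tails of the outcome as the supplemental strata, the cutpoints at one standard deviation from the mean, and the sample sizes. The sketch covers only the sampling step, not the semiparametric empirical likelihood estimator.

```python
import random
import statistics


def ods_sample(y, n0, n_low, n_high, low_cut, high_cut, rng):
    """Draw a hypothetical ODS sample from outcome values y.

    Returns three lists of indices: a simple random sample of size n0,
    and supplemental samples of sizes n_low and n_high drawn from the
    remaining units with y below low_cut and above high_cut.
    """
    indices = list(range(len(y)))
    srs = rng.sample(indices, n0)  # overall simple random sample
    in_srs = set(srs)
    # Supplemental pools: units not already sampled, in each tail of y.
    low_pool = [i for i in indices if i not in in_srs and y[i] < low_cut]
    high_pool = [i for i in indices if i not in in_srs and y[i] > high_cut]
    low = rng.sample(low_pool, n_low)
    high = rng.sample(high_pool, n_high)
    return srs, low, high


# Simulated population (assumed model: Y = 1 + 0.5 X + noise).
rng = random.Random(0)
N = 10_000
x = [rng.gauss(0.0, 1.0) for _ in range(N)]
y = [1.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

# Assumed cutpoints: one standard deviation below and above the mean.
mu, sd = statistics.mean(y), statistics.pstdev(y)
low_cut, high_cut = mu - sd, mu + sd

srs, low, high = ods_sample(y, 200, 50, 50, low_cut, high_cut, rng)
```

Because the supplemental units are selected on the outcome, the pooled sample is not representative of the population, which is why an analysis that ignores the design would generally be biased and a design-aware likelihood is needed.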