Semiparametric inference for data with a continuous outcome from a two-phase probability-dependent sampling scheme



Multiphase designs and biased sampling designs are two well-recognized approaches to enhancing study efficiency. We propose a new, cost-effective sampling design for studies with a continuous outcome: the two-phase probability-dependent sampling design. This design enables investigators to use resources efficiently by targeting more informative subjects for sampling. We develop a new semiparametric empirical likelihood inference method that takes advantage of data obtained through a probability-dependent sampling design. Simulation results indicate that the proposed sampling scheme, coupled with the proposed estimator, is more efficient and more powerful than both the existing outcome-dependent sampling design and the simple random sampling design at the same sample size. We illustrate the proposed method with a real data set from an environmental epidemiologic study.