• Cost-effective designs;
• Empirical likelihood;
• Outcome-dependent sampling;
• Partial linear model;
• Polychlorinated biphenyls;
• P-spline

Summary: Outcome-dependent sampling (ODS) has been widely used in biomedical studies because it is a cost-effective way to improve study efficiency. However, for a continuous outcome, the representation of the exposure variable has been limited to linear models because of both theoretical and computational challenges. Partial linear models (PLMs) are a powerful inference tool for nonparametrically modeling the relation between an outcome and an exposure variable. In this article, we consider a PLM for data from an ODS design and propose a semiparametric maximum likelihood method for inference. We develop the asymptotic properties of the proposed estimator and conduct simulation studies showing that it can be more efficient than the estimator from a traditional simple random sampling design with the same sample size. Using this newly developed method, we explore an open question in epidemiology: whether in utero exposure to background levels of polychlorinated biphenyls (PCBs) is associated with children's intellectual impairment. Our model provides further insights into the relation between low-level PCB exposure and children's cognitive function, and the results shed new light on a body of inconsistent epidemiologic findings.