Volume 58, Issue 2

A Semiparametric Empirical Likelihood Method for Data from an Outcome‐Dependent Sampling Scheme with a Continuous Outcome

Haibo Zhou

Corresponding Author

Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A.

*email: zhou@bios.unc.edu
M. A. Weaver

Family Health International, Research Triangle Park, North Carolina 27599, U.S.A.

J. Qin

Department of Epidemiology and Biostatistics, Memorial Sloan‐Kettering Cancer Center, 1275 York Avenue, New York, New York 10021, U.S.A.

M. P. Longnecker

Epidemiology Branch, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709, U.S.A.

M. C. Wang

Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland 21205, U.S.A.

First published: 21 May 2004
Citations: 48

Abstract

Summary. Outcome‐dependent sampling (ODS) schemes can be a cost‐effective way to enhance study efficiency. The case–control design has been widely used in epidemiologic studies. However, when the outcome is measured on a continuous scale, dichotomizing the outcome could lead to a loss of efficiency. Recent epidemiologic studies have used ODS schemes where, in addition to an overall random sample, there are also a number of supplemental samples that are collected based on a continuous outcome variable. We consider a semiparametric empirical likelihood inference procedure in which the underlying distribution of covariates is treated as a nuisance parameter and is left unspecified. The proposed estimator is asymptotically normal. The likelihood ratio statistic based on the semiparametric empirical likelihood function has Wilks‐type properties in that, under the null, it follows a chi‐square distribution asymptotically and is independent of the nuisance parameters. Our simulation results indicate that, for data obtained using an ODS design, the semiparametric empirical likelihood estimator is more efficient than conditional likelihood and probability weighted pseudolikelihood estimators, and that ODS designs (along with the proposed estimator) can produce more efficient estimates than simple random sample designs of the same size. We apply the proposed method to analyze a data set from the Collaborative Perinatal Project (CPP), an ongoing environmental epidemiologic study, to assess the relationship between maternal polychlorinated biphenyl (PCB) level and children's IQ test performance.
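The sampling scheme described in the summary (an overall random sample plus supplemental samples selected on a continuous outcome) can be illustrated with a small simulation. The sketch below is only one plausible reading of such a design, not the authors' procedure: the function name `draw_ods_sample`, the linear outcome model, the 10th and 90th percentile cutpoints, the sample sizes, and the choice to exclude the random sample from the supplemental pools are all assumptions made for illustration.

```python
import random


def draw_ods_sample(y, n_srs, n_low, n_high, low_cut, high_cut, rng):
    """Draw a simple random sample (SRS) plus supplemental samples from the
    lower and upper tails of a continuous outcome y.

    Hypothetical helper: the cutpoints and the exclusion of SRS members from
    the supplemental pools are illustrative assumptions.
    """
    ids = list(range(len(y)))
    srs = rng.sample(ids, n_srs)  # overall random sample, ignores y
    chosen = set(srs)
    # Supplemental pools: subjects not already sampled whose outcome is in a tail.
    low_pool = [i for i in ids if i not in chosen and y[i] < low_cut]
    high_pool = [i for i in ids if i not in chosen and y[i] > high_cut]
    low = rng.sample(low_pool, n_low)
    high = rng.sample(high_pool, n_high)
    return srs, low, high


rng = random.Random(0)
N = 10_000
# Assumed toy population: one covariate x and a continuous outcome y.
x = [rng.gauss(0, 1) for _ in range(N)]
y = [0.5 * xi + rng.gauss(0, 1) for xi in x]

# Assumed cutpoints at the empirical 10th and 90th percentiles of y.
y_sorted = sorted(y)
low_cut, high_cut = y_sorted[N // 10], y_sorted[9 * N // 10]

srs, low, high = draw_ods_sample(y, 200, 50, 50, low_cut, high_cut, rng)
ods_ids = srs + low + high  # indices of all subjects in the ODS sample
```

Because the supplemental subjects are selected on the outcome, a naive regression on the pooled sample would generally be biased; this is the situation the likelihood-based estimators compared in the summary are meant to handle.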

