Volume 29, Issue 5
Research Article

Hybrid pooled–unpooled design for cost‐efficient measurement of biomarkers

Enrique F. Schisterman

Corresponding Author

E-mail address: schistee@mail.nih.gov

Division of Epidemiology, Statistics and Prevention Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, NIH/DHHS, 6100 Executive Boulevard, Rockville, MD 20852, U.S.A.

Albert Vexler

Division of Epidemiology, Statistics and Prevention Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, NIH/DHHS, 6100 Executive Boulevard, Rockville, MD 20852, U.S.A.

Sunni L. Mumford

Division of Epidemiology, Statistics and Prevention Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, NIH/DHHS, 6100 Executive Boulevard, Rockville, MD 20852, U.S.A.

Neil J. Perkins

Division of Epidemiology, Statistics and Prevention Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, NIH/DHHS, 6100 Executive Boulevard, Rockville, MD 20852, U.S.A.

First published: 09 February 2010
Citations: 15

This article is a U.S. Government work and is in the public domain in the U.S.A.

Abstract

Evaluating biomarkers in epidemiological studies can be expensive and time-consuming. Many investigators use techniques such as random sampling or pooling of biospecimens to cut costs and save time. Analyses based on pooled data, however, are commonly restricted by distributional assumptions that are difficult to validate because individual measurements are not observed once biospecimens are pooled. Random sampling provides data that can be easily analyzed, but it is not the most cost‐efficient design for estimating means. We propose and examine a cost‐efficient hybrid design that takes a sample of both pooled and unpooled data in an optimal proportion in order to efficiently estimate the unknown parameters of the biomarker distribution. In addition, we find that this design can be used to estimate and account for different types of measurement and pooling error, without the need to collect validation data or repeated measurements. We show an example in which the hybrid design minimizes a given loss function based on the variances of the estimators of the unknown parameters. Monte Carlo simulation and biomarker data from a study on coronary heart disease are used to demonstrate the proposed methodology. Published in 2010 by John Wiley & Sons, Ltd.
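
To illustrate why mixing pooled and unpooled assays can identify measurement error without validation data, the following is a minimal simulation sketch, not the estimator used in the article. It assumes a normally distributed biomarker, additive normal measurement error on every assay, and pools that measure the arithmetic mean of their members. The estimator shown is a simple method-of-moments construction chosen for brevity; the function names (`simulate_hybrid`, `moment_estimates`) and all parameter values are hypothetical.

```python
import numpy as np


def simulate_hybrid(n_pooled, n_unpooled, pool_size, mu, sigma, sigma_me, rng):
    """Simulate one hybrid data set under an ASSUMED normal model.

    Returns (pooled, unpooled): arrays of assay results.
    """
    # Unpooled assay: one individual's true value plus measurement error.
    unpooled = rng.normal(mu, sigma, n_unpooled) + rng.normal(0.0, sigma_me, n_unpooled)
    # Pooled assay: mean of pool_size individuals plus measurement error.
    individuals = rng.normal(mu, sigma, (n_pooled, pool_size))
    pooled = individuals.mean(axis=1) + rng.normal(0.0, sigma_me, n_pooled)
    return pooled, unpooled


def moment_estimates(pooled, unpooled, pool_size):
    """Method-of-moments estimates (an illustrative assumption, not the paper's method).

    Under the assumed model:
        Var(unpooled assay) = sigma^2 + sigma_me^2
        Var(pooled assay)   = sigma^2 / pool_size + sigma_me^2
    Two equations in two unknowns, so both variances are identifiable.
    """
    var_u = unpooled.var(ddof=1)
    var_p = pooled.var(ddof=1)
    sigma2_hat = (var_u - var_p) / (1.0 - 1.0 / pool_size)
    sigma_me2_hat = var_u - sigma2_hat
    # Inverse-variance weighted combination of the two sample means.
    w_p = len(pooled) / var_p
    w_u = len(unpooled) / var_u
    mu_hat = (w_p * pooled.mean() + w_u * unpooled.mean()) / (w_p + w_u)
    return mu_hat, sigma2_hat, sigma_me2_hat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical values: 500 pooled and 500 unpooled assays, pools of 4.
    pooled, unpooled = simulate_hybrid(500, 500, 4, mu=5.0, sigma=2.0, sigma_me=1.0, rng=rng)
    mu_hat, sigma2_hat, sigma_me2_hat = moment_estimates(pooled, unpooled, 4)
    print(f"mean: {mu_hat:.3f}")
    print(f"biomarker variance: {sigma2_hat:.3f}")
    print(f"measurement-error variance: {sigma_me2_hat:.3f}")
```

In this sketch, a purely pooled or purely unpooled sample supplies only one of the two variance equations, so the biomarker variance and the measurement-error variance cannot be separated; the hybrid sample supplies both. Choosing the proportion of pooled to unpooled assays that minimizes a loss function, as the abstract describes, is not attempted here.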

Number of times cited according to CrossRef: 15

  • Logistic regression with a continuous exposure measured in pools and subject to errors, Statistics in Medicine, 10.1002/sim.7891, 37, 27, (4007-4021), (2018).
  • A pooling strategy to effectively use genotype data in quantitative traits genome‐wide association studies, Statistics in Medicine, 10.1002/sim.7898, 37, 27, (4083-4095), (2018).
  • An efficient design strategy for logistic regression using outcome‐ and covariate‐dependent pooling of biospecimens prior to assay, Biometrics, 10.1111/biom.12489, 72, 3, (965-975), (2016).
  • Estimation of interaction effects using pooled biospecimens in a case‐control study, Statistics in Medicine, 10.1002/sim.6798, 35, 9, (1502-1513), (2015).
  • Positing, fitting, and selecting regression models for pooled biomarker data, Statistics in Medicine, 10.1002/sim.6496, 34, 17, (2544-2558), (2015).
  • A Discriminant Function Approach to Adjust for Processing and Measurement Error When a Biomarker is Assayed in Pooled Samples, International Journal of Environmental Research and Public Health, 10.3390/ijerph121114723, 12, 11, (14723-14740), (2015).
  • Regression for skewed biomarker outcomes subject to pooling, Biometrics, 10.1111/biom.12134, 70, 1, (202-211), (2014).
  • Estimation of gene–environment interaction by pooling biospecimens, Statistics in Medicine, 10.1002/sim.5357, 31, 26, (3241-3252), (2012).
  • The biomarker revolution, Statistics in Medicine, 10.1002/sim.5499, 31, 22, (2513-2515), (2012).
  • Assessment of skewed exposure in case‐control studies with pooling, Statistics in Medicine, 10.1002/sim.5351, 31, 22, (2461-2472), (2012).
  • Likelihood‐based methods for regression analysis with binary exposure status assessed by pooling, Statistics in Medicine, 10.1002/sim.4426, 31, 22, (2485-2497), (2012).
  • Pooling Designs for Outcomes under a Gaussian Random Effects Model, Biometrics, 10.1111/j.1541-0420.2011.01673.x, 68, 1, (45-52), (2011).
  • Estimation and testing based on data subject to measurement errors: from parametric to non‐parametric likelihood methods, Statistics in Medicine, 10.1002/sim.4304, 31, 22, (2498-2512), (2011).
  • Logistic regression analysis of biomarker data subject to pooling and dichotomization, Statistics in Medicine, 10.1002/sim.4367, 31, 22, (2473-2484), (2011).
  • Binary Regression Analysis with Pooled Exposure Measurements: A Regression Calibration Approach, Biometrics, 10.1111/j.1541-0420.2010.01464.x, 67, 2, (636-645), (2010).
