Prevalence estimation subject to misclassification: the mis-substitution bias and some remedies

Authors

  • Zhiwei Zhang,

    Corresponding author
    1. Division of Biostatistics, Office of Surveillance and Biometrics, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, MD, U.S.A.
    • Correspondence to: Zhiwei Zhang, DBS/OSB/CDRH/FDA, 10903 New Hampshire Ave., Silver Spring, MD 20993, U.S.A.

      (E-mail: zhiwei.zhang@fda.hhs.gov)

  • Chunling Liu,

    1. Department of Applied Mathematics, Hong Kong Polytechnic University, Hong Kong, China
  • Sungduk Kim,

    1. Biostatistics and Bioinformatics Branch, Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, U.S.A.
  • Aiyi Liu

    1. Biostatistics and Bioinformatics Branch, Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, U.S.A.

Abstract

We consider the problem of estimating the prevalence of a disease under a group testing framework. Because assays are usually imperfect, misclassification of disease status is a major challenge in prevalence estimation. To account for possible misclassification, it is usually assumed that the sensitivity and specificity of the assay are known and independent of the group size. This assumption is often questionable, and substitution of incorrect values of an assay's sensitivity and specificity can result in a large bias in the prevalence estimate, which we refer to as the mis-substitution bias. In this article, we propose simple designs and methods for prevalence estimation that do not require known values of assay sensitivity and specificity. If a gold standard test is available, it can be applied to a validation subsample to yield information on the imperfect assay's sensitivity and specificity. When a gold standard is unavailable, it is possible to estimate assay sensitivity and specificity, either as unknown constants or as specified functions of the group size, from group testing data with varying group size. We develop methods for estimating parameters and for finding or approximating optimal designs, and perform extensive simulation experiments to evaluate and compare the different designs. An example concerning human immunodeficiency virus infection is used to illustrate the validation subsample design. Copyright © 2014 John Wiley & Sons, Ltd.
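
The mis-substitution bias described above can be illustrated with a minimal sketch. It is not the authors' method. It assumes the standard group testing model with constant sensitivity Se and specificity Sp, under which a group of size k tests positive with probability Se(1 - (1 - p)^k) + (1 - Sp)(1 - p)^k, where p is the prevalence. The function names and all numerical values are hypothetical and chosen only for illustration.

```python
def positive_group_prob(p, k, se, sp):
    """Probability that a pooled group of size k tests positive.

    Assumes independent individuals with prevalence p and an assay whose
    sensitivity (se) and specificity (sp) do not depend on k.
    """
    q = (1.0 - p) ** k  # probability that the whole group is disease-free
    return se * (1.0 - q) + (1.0 - sp) * q


def prevalence_from_group_rate(pos_rate, k, se, sp):
    """Invert the model above to recover p from the group positive rate.

    The values of se and sp are substituted as if known; if they are wrong,
    the result is biased (the mis-substitution bias).
    """
    denom = se + sp - 1.0
    if denom <= 0.0:
        raise ValueError("requires se + sp > 1")
    q = (se - pos_rate) / denom
    q = min(max(q, 0.0), 1.0)  # truncate to a valid probability
    return 1.0 - q ** (1.0 / k)


# Illustrative (hypothetical) values: true prevalence 5%, groups of 5.
true_p, k = 0.05, 5
true_se, true_sp = 0.95, 0.98
rate = positive_group_prob(true_p, k, true_se, true_sp)

# Substituting the correct values recovers the true prevalence.
p_correct = prevalence_from_group_rate(rate, k, true_se, true_sp)

# Substituting an overly optimistic sensitivity biases the estimate downward.
p_biased = prevalence_from_group_rate(rate, k, 0.99, true_sp)

print(f"correct substitution:   {p_correct:.4f}")
print(f"incorrect substitution: {p_biased:.4f}")
```

Here the exact population positive rate is used in place of a sample proportion, so the gap between the two printed values reflects the substitution error alone rather than sampling variability.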
