Volume 35, Issue 14
Research Article

How to regress and predict in a Bland–Altman plot? Review and contribution based on tolerance intervals and correlated‐errors‐in‐variables models

Bernard G. Francq

Corresponding Author

Institut de Statistique, Biostatistique et sciences Actuarielles, Université Catholique de Louvain, Voie du Roman Pays 20, Louvain‐la‐Neuve, 1348 Belgium

Robertson Centre for Biostatistics, University of Glasgow, University Avenue, Level 11 Boyd Orr Building, Glasgow, G12 8QQ Scotland, U.K.

Correspondence to: Bernard G. Francq, Institut de Statistique, Biostatistique et sciences Actuarielles, Université Catholique de Louvain, Voie du Roman Pays 20, 1348 Louvain‐la‐Neuve, Belgium.

E‐mail: bernard.g.francq@uclouvain.be; bernard.francq@glasgow.ac.uk

Bernadette Govaerts

Institut de Statistique, Biostatistique et sciences Actuarielles, Université Catholique de Louvain, Voie du Roman Pays 20, Louvain‐la‐Neuve, 1348 Belgium

First published: 28 January 2016
Citations: 24

Abstract

Two main methodologies for assessing equivalence in method‐comparison studies are presented separately in the literature. The first one is the well‐known and widely applied Bland–Altman approach with its agreement intervals, where two methods are considered interchangeable if their differences are not clinically significant. The second approach is based on errors‐in‐variables regression in a classical (X,Y) plot and focuses on confidence intervals, whereby two methods are considered equivalent when providing similar measures notwithstanding the random measurement errors. This paper reconciles these two methodologies and shows their similarities and differences using both real data and simulations. A new consistent correlated‐errors‐in‐variables regression is introduced as the errors are shown to be correlated in the Bland–Altman plot. Indeed, the coverage probabilities collapse and the biases soar when this correlation is ignored. Novel tolerance intervals are compared with agreement intervals with or without replicated data, and novel predictive intervals are introduced to predict a single measure in an (X,Y) plot or in a Bland–Altman plot with excellent coverage probabilities. We conclude that the (correlated)‐errors‐in‐variables regressions should not be avoided in method‐comparison studies, although the Bland–Altman approach is usually applied to avert their complexity. We argue that tolerance or predictive intervals are better alternatives than agreement intervals, and we provide guidelines for practitioners regarding method‐comparison studies. Copyright © 2016 John Wiley & Sons, Ltd.
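As a rough illustration of two ideas mentioned in the abstract, the sketch below computes the classical Bland–Altman agreement interval (mean difference ± 1.96 standard deviations of the differences) and checks by simulation that the errors carried by the two axes of a Bland–Altman plot, the difference Y − X and the mean (X + Y)/2, are correlated whenever the two methods have unequal error variances. Under independent measurement errors, the covariance of these two errors equals (σ_y² − σ_x²)/2. The function names, sample size, error standard deviations and range of true values are assumptions chosen for illustration only; this is not the correlated‐errors‐in‐variables regression or the tolerance intervals proposed in the paper.

```python
import random
import statistics


def bland_altman_limits(x, y, z=1.96):
    """Classical agreement interval: mean difference +/- z * SD of the differences."""
    diffs = [yi - xi for xi, yi in zip(x, y)]
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)
    return mean_diff - z * sd_diff, mean_diff + z * sd_diff


def sample_covariance(a, b):
    """Unbiased sample covariance of two equal-length sequences."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


def simulate(n=20000, sigma_x=1.0, sigma_y=2.0, seed=0):
    """Simulate two unbiased methods measuring the same true values (assumed setup)."""
    rng = random.Random(seed)
    truth = [rng.uniform(50.0, 150.0) for _ in range(n)]
    err_x = [rng.gauss(0.0, sigma_x) for _ in range(n)]
    err_y = [rng.gauss(0.0, sigma_y) for _ in range(n)]
    x = [t + e for t, e in zip(truth, err_x)]
    y = [t + e for t, e in zip(truth, err_y)]
    # Errors carried by the vertical (difference) and horizontal (mean) axes.
    err_diff = [ey - ex for ex, ey in zip(err_x, err_y)]
    err_mean = [(ex + ey) / 2.0 for ex, ey in zip(err_x, err_y)]
    return x, y, sample_covariance(err_diff, err_mean)


x, y, cov_errors = simulate()
lower, upper = bland_altman_limits(x, y)
theoretical_cov = (2.0 ** 2 - 1.0 ** 2) / 2.0  # (sigma_y^2 - sigma_x^2) / 2

print(f"agreement interval: [{lower:.2f}, {upper:.2f}]")
print(f"error covariance: simulated {cov_errors:.2f}, theoretical {theoretical_cov:.2f}")
```

With equal error variances (`sigma_x == sigma_y`) the theoretical covariance is zero, which is the only case in this simplified setup where the correlation could safely be ignored.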

