How to regress and predict in a Bland–Altman plot? Review and contribution based on tolerance intervals and correlated‐errors‐in‐variables models
Abstract
Two main methodologies for assessing equivalence in method‐comparison studies are presented separately in the literature. The first one is the well‐known and widely applied Bland–Altman approach with its agreement intervals, where two methods are considered interchangeable if their differences are not clinically significant. The second approach is based on errors‐in‐variables regression in a classical (X,Y) plot and focuses on confidence intervals, whereby two methods are considered equivalent when providing similar measures notwithstanding the random measurement errors. This paper reconciles these two methodologies and shows their similarities and differences using both real data and simulations. A new consistent correlated‐errors‐in‐variables regression is introduced as the errors are shown to be correlated in the Bland–Altman plot. Indeed, the coverage probabilities collapse and the biases soar when this correlation is ignored. Novel tolerance intervals are compared with agreement intervals with or without replicated data, and novel predictive intervals are introduced to predict a single measure in an (X,Y) plot or in a Bland–Altman plot with excellent coverage probabilities. We conclude that the (correlated)‐errors‐in‐variables regressions should not be avoided in method comparison studies, although the Bland–Altman approach is usually applied to avert their complexity. We argue that tolerance or predictive intervals are better alternatives than agreement intervals, and we provide guidelines for practitioners regarding method comparison studies. Copyright © 2016 John Wiley & Sons, Ltd.
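The claim that the errors are correlated in the Bland–Altman plot can be illustrated with a small simulation. The sketch below is not the paper's estimator. It assumes two methods that measure the same true value with independent, normally distributed errors, and checks that the error on the difference Y − X and the error on the mean (X + Y)/2 have covariance (σ_Y² − σ_X²)/2, which vanishes only when both methods are equally precise. The function names, the range of true values and the error standard deviations are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_errors(sd_x, sd_y, n=200_000, seed=0):
    """Draw independent measurement errors for two hypothetical methods."""
    rng = np.random.default_rng(seed)
    err_x = rng.normal(0.0, sd_x, size=n)  # error of method X
    err_y = rng.normal(0.0, sd_y, size=n)  # error of method Y
    return err_x, err_y


def bland_altman_error_covariance(err_x, err_y):
    """Empirical covariance between the errors on the two Bland-Altman axes."""
    err_diff = err_y - err_x           # error carried by the difference Y - X
    err_mean = (err_x + err_y) / 2.0   # error carried by the mean (X + Y) / 2
    return float(np.cov(err_diff, err_mean)[0, 1])


def agreement_limits(x, y, z=1.96):
    """Classical agreement interval: mean difference +/- z * SD of differences."""
    d = np.asarray(y) - np.asarray(x)
    centre, spread = d.mean(), d.std(ddof=1)
    return centre - z * spread, centre + z * spread


if __name__ == "__main__":
    sd_x, sd_y = 2.0, 4.0  # assumed error standard deviations
    err_x, err_y = simulate_errors(sd_x, sd_y)

    # Hypothetical true values shared by both methods (no systematic bias).
    truth = np.random.default_rng(1).uniform(50.0, 150.0, size=err_x.size)
    x, y = truth + err_x, truth + err_y

    empirical = bland_altman_error_covariance(err_x, err_y)
    theoretical = (sd_y**2 - sd_x**2) / 2.0
    print(f"empirical covariance:   {empirical:.3f}")
    print(f"theoretical covariance: {theoretical:.3f}")
    print("agreement limits:", agreement_limits(x, y))
```

With unequal error variances the covariance is clearly non-zero, which is why a regression fitted in the (mean, difference) plane should not treat the two errors as independent. Setting `sd_x` equal to `sd_y` makes the covariance shrink towards zero.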




