A rank‐sum test for clustered data when the number of subjects in a group within a cluster is informative
Summary
The Wilcoxon rank‐sum test is a popular nonparametric test for comparing two independent populations (groups). In recent years, there have been renewed attempts to extend the Wilcoxon rank‐sum test to clustered data, one of which (Datta and Satten, 2005, Journal of the American Statistical Association 100, 908–915) addresses the issue of informative cluster size, i.e., when the outcomes and the cluster size are correlated. We consider a situation where the group‐specific marginal distribution in a cluster depends on the number of observations in that group (i.e., the intra‐cluster group size). We develop a novel extension of the rank‐sum test for handling this situation. We compare the performance of our test with the Datta–Satten test, as well as the naive Wilcoxon rank‐sum test. Using a naturally occurring simulation model of informative intra‐cluster group size, we show that only our test maintains the correct size. We also compare our test with a classical signed‐rank test based on averages of the outcome values in each group, paired by cluster membership. While this signed‐rank test maintains the size, it has lower power than our test. Extensions to multiple group comparisons and to the case of clusters that lack samples from some groups are also discussed. We apply our test to determine whether there are differences in attachment loss between the upper and lower teeth and between mesial and buccal sites of periodontal patients.
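For orientation, the classical comparator mentioned in the summary (a signed‐rank test on per‐cluster group averages) can be sketched as follows. This is not the proposed rank‐sum extension, whose details are not given in the summary. The function names, the `(cluster_id, group, value)` record layout, the dropping of zero differences, and the use of a normal approximation with midranks are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cluster_mean_differences(records):
    """Return per-cluster differences (group 1 mean minus group 0 mean).

    `records` is an iterable of (cluster_id, group, value) with group in {0, 1}.
    Clusters lacking observations from either group are skipped (assumption).
    """
    # For each cluster, keep [sum, count] for group 0 and group 1.
    sums = defaultdict(lambda: [[0.0, 0], [0.0, 0]])
    for cluster_id, group, value in records:
        cell = sums[cluster_id][group]
        cell[0] += value
        cell[1] += 1
    diffs = []
    for (s0, n0), (s1, n1) in sums.values():
        if n0 > 0 and n1 > 0:
            diffs.append(s1 / n1 - s0 / n0)
    return diffs


def signed_rank_test(diffs):
    """Wilcoxon signed-rank test via a normal approximation (illustrative).

    Returns (w_plus, z, two_sided_p). Zero differences are dropped and tied
    absolute differences receive midranks.
    """
    nonzero = [d for d in diffs if d != 0]
    n = len(nonzero)
    if n == 0:
        raise ValueError("all differences are zero")
    order = sorted(range(n), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied absolute values starting at sorted position i.
        j = i
        while j + 1 < n and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]]):
            j += 1
        midrank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    mean = sum(ranks) / 2                  # null mean of W+
    var = sum(r * r for r in ranks) / 4    # null variance of W+, valid with ties
    z = (w_plus - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail probability
    return w_plus, z, p
```

Calling `signed_rank_test(cluster_mean_differences(records))` treats each cluster as a single paired observation. That is the reason this comparator is insensitive to intra‐cluster group size, and, as the summary reports, it is also less powerful than the proposed test.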




