Some large-sample distribution-free estimators and tests for multivariate partially incomplete data from two populations


  • John M. Lachin

    1. The Biostatistics Center, Department of Statistics/Computer and Information Systems, The George Washington University, 6110 Executive Blvd. Suite 750, Rockville, MD 20852, U.S.A.


The most common instance of multivariate observations is the case of repeated measures over time. The two most widely used methods for the analysis of K repeated measures for two groups are the K degrees of freedom (d.f.) T² MANOVA F-test and the within-subjects 1 degree of freedom ANOVA F-test. Both require complete samples from normally distributed populations. In this paper, I describe alternative K and 1 d.f. distribution-free procedures which allow for randomly missing observations. These include a large-sample analysis of means, the Wei and Lachin multivariate Wilcoxon test with estimates of the Mann-Whitney parameter, and a multivariate Hodges-Lehmann location shift estimator based on the multivariate U-statistic of Wei and Johnson. Each of these methods provides a distribution-free K-variate estimate of the magnitude of group differences which can be used as the basis for an overall test of group differences. These tests include the K d.f. omnibus T²-like test, 1 d.f. tests of restricted hypotheses, such as the Wei-Lachin multivariate one-sided test of stochastic ordering, and the test of general association based on a minimum variance generalized least squares (GLS) estimate of the average group difference. I then describe covariate stratified-adjusted GLS estimates and tests of group differences. This approach also provides tests of homogeneity (interaction) for within-subjects and between-subjects effects. I illustrate these analyses with an analysis of repeated cholesterol measurements in two groups of patients, stratified by sex. Such analyses provide an overall distribution-free summary estimate and test of the treatment effect obtained by combining the group differences over both time (repeated measures) and strata.
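To make the large-sample analysis of means concrete, the following is a minimal sketch and not the paper's own implementation. It assumes observations are missing completely at random, marks missing values with NaN, and estimates the covariance of the K mean differences from pairwise-complete observations; the function name `mean_difference_analysis` and the data layout are illustrative assumptions. It returns the K-variate vector of mean differences, a K d.f. omnibus T²-like statistic, and a 1 d.f. statistic based on the minimum variance GLS estimate of the average difference.

```python
import numpy as np


def mean_difference_analysis(x, y):
    """Sketch of a large-sample analysis of means with missing data.

    x, y: arrays of shape (n_subjects, K); np.nan marks a missing value.
    Assumes missingness is completely at random (an assumption of this sketch).
    """
    def moments(a):
        obs = ~np.isnan(a)                          # True where a value was observed
        n = obs.sum(axis=0).astype(float)           # n_k: observed count per measure
        mean = np.nansum(a, axis=0) / n             # available-case means
        dev = np.where(obs, a - mean, 0.0)          # deviations, zero where missing
        n_pair = obs.T.astype(float) @ obs.astype(float)   # n_jk: jointly observed
        sigma = (dev.T @ dev) / np.maximum(n_pair, 1.0)    # pairwise covariances
        # Cov(mean_j, mean_k) = sigma_jk * n_jk / (n_j * n_k)
        cov_mean = sigma * n_pair / np.outer(n, n)
        return mean, cov_mean

    mean_x, cov_x = moments(x)
    mean_y, cov_y = moments(y)
    d = mean_x - mean_y                  # K-variate estimate of group differences
    v = cov_x + cov_y                    # groups are independent
    v_inv = np.linalg.inv(v)             # pairwise estimate may be ill-conditioned

    omnibus = float(d @ v_inv @ d)       # K d.f. T²-like statistic (approx. chi-square)

    ones = np.ones(d.shape[0])
    denom = float(ones @ v_inv @ ones)
    weights = (v_inv @ ones) / denom     # minimum variance GLS weights, sum to 1
    gls_estimate = float(weights @ d)    # average group difference
    gls_variance = 1.0 / denom
    z = gls_estimate / np.sqrt(gls_variance)   # 1 d.f. test (approx. standard normal)

    return {"d": d, "cov": v, "omnibus": omnibus, "weights": weights,
            "gls_estimate": gls_estimate, "gls_variance": gls_variance, "z": z}


# Usage with simulated data: 3 repeated measures, some values set to missing.
rng = np.random.default_rng(0)
group_a = rng.normal(0.5, 1.0, size=(40, 3))
group_b = rng.normal(0.0, 1.0, size=(40, 3))
group_a[rng.random(group_a.shape) < 0.1] = np.nan
group_b[rng.random(group_b.shape) < 0.1] = np.nan
result = mean_difference_analysis(group_a, group_b)
print(result["d"], result["omnibus"], result["z"])
```

Under the null hypothesis the omnibus statistic would be referred to a chi-square distribution on K d.f. and `z` to a standard normal distribution. With heavy missingness the pairwise covariance estimate need not be positive definite, so in practice it should be checked before inversion.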