Impact of character correlation and variable groupings on modern human population tree resolution



Relationships among modern human populations are often explored through linear measurements taken on the cranium and expressed in the form of dendrograms. However, craniometric variables are strongly correlated and thereby violate the assumption of independence that most statistical analyses require. This study explores the relationship between differing methods of variable treatment and the statistical robustness of the outcomes they yield, as depicted in interpopulational trees of relatedness among modern humans. Three methods of treating variables are examined. The first leaves them ungrouped, the second groups variables on the basis of the developmental and/or functional complex of the cranium to which they are thought to belong, and the third reduces variables to covarying components by using principal components analysis. The strength of each method is tested with the Continuous Character Maximum Likelihood (CONTML) program in the PHYLIP phylogeny inference package. This program produces output in the form of trees, and the resolution of the branching topology is given as a log-likelihood value, with statistical confidence intervals supporting each branch placement on the tree. The results indicate that leaving variables ungrouped provides misleadingly strong results by failing to account for character correlation. Of the two alternative methods, the covarying components (principal components) method yields the best-resolved tree, with stronger statistical support for its topology than the approach of grouping variables by developmental and/or functional complex, that is, by their location on the cranium. Finally, the implications for interpreting population histories based on such methods are discussed. Am J Phys Anthropol 121:000–000, 2003. © 2003 Wiley-Liss, Inc.