Adjusting for population heterogeneity: A framework for characterizing statistical information and developing efficient test statistics



The focus of this work is the TDT-type and family-based test statistics used for adjusting for potential confounding due to population heterogeneity or misspecified allele frequencies. A variety of heuristics have been used to motivate and derive these statistics, and the statistics have been developed for a variety of analytic goals. There appears to be no general theoretical framework, however, that may be used to evaluate competing approaches. Furthermore, there is no framework to guide the development of efficient TDT-type and family-based methods for analytic goals for which methods have not yet been proposed. The purpose of this paper is to present a theoretical framework that serves both to identify the information which is available to methods that are immune to confounding due to population heterogeneity or misspecified allele frequencies, and to inform the construction of efficient unbiased tests in novel settings. The development relies on the existence of a characterization of the null hypothesis in terms of a completely specified conditional distribution of transmitted genotypes. An important observation is that, with such a characterization, when the conditioning event is unobserved or incomplete, there is statistical information that cannot be exploited by any exact conditional test. The main technical result of this work is an approach to computing test statistics for local alternatives that exploit all of the available statistical information. Genet Epidemiol 24:284–290, 2003. © 2003 Wiley-Liss, Inc.