Sum statistics for the joint detection of multiple disease loci in case-control association studies with SNP markers



In complex traits, multiple disease loci presumably interact to produce the disease. For this reason, even with high-resolution single nucleotide polymorphism (SNP) marker maps, it has been difficult to map susceptibility loci by conventional locus-by-locus methods. Fine mapping strategies are needed that allow for the simultaneous detection of interacting disease loci while handling large numbers of densely spaced markers. For this purpose, sum statistics were recently proposed as a first-stage analysis method for case-control association studies with SNPs. By summing single-marker statistics, information across multiple disease-associated markers is combined and, at a global significance level α, a small set of “interesting” markers is selected for further analysis. Here, the statistical properties of such approaches are examined by computer simulation. It is shown that sum statistics can often be successfully applied when marker-by-marker approaches fail to detect association. Compared with Bonferroni or False Discovery Rate (FDR) procedures, sum statistics have greater power and detect more disease loci. However, in studies with tightly linked markers, simple sum statistics can be suboptimal, since the intermarker correlation is ignored. A method is presented that takes the correlation structure among marker loci into account when marker statistics are combined. Genet Epidemiol 25:350–359, 2003. © 2003 Wiley-Liss, Inc.
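
To make the idea of a sum statistic concrete, the following Python sketch shows one possible first-stage analysis. It is not the authors' implementation. The single-marker statistic (a trend statistic computed as N times the squared genotype-phenotype correlation), the choice of summing the k largest marker statistics, the permutation-based global p-value, and all function and parameter names are assumptions made for illustration. The sketch also treats markers as independent, so it corresponds to a simple sum statistic and does not account for intermarker correlation.

```python
import random


def trend_stat(genotype, labels):
    """Single-marker statistic: N * r^2 between allele counts (0/1/2)
    and case-control status (1/0). Assumed stand-in for any marker test."""
    n = len(genotype)
    mean_g = sum(genotype) / n
    mean_y = sum(labels) / n
    s_gy = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotype, labels))
    s_gg = sum((g - mean_g) ** 2 for g in genotype)
    s_yy = sum((y - mean_y) ** 2 for y in labels)
    if s_gg == 0 or s_yy == 0:
        return 0.0  # monomorphic marker or a single phenotype class
    return n * s_gy * s_gy / (s_gg * s_yy)


def sum_statistic(stats, k):
    """Sum of the k largest single-marker statistics."""
    return sum(sorted(stats, reverse=True)[:k])


def sum_stat_test(genotypes, labels, k, n_perm=200, seed=0):
    """Return (observed sum, permutation p-value, indices of the top-k markers).

    genotypes: list of individuals, each a list of allele counts per marker.
    labels:    list of 1 (case) / 0 (control), one per individual.
    """
    rng = random.Random(seed)
    columns = list(zip(*genotypes))  # one tuple of genotypes per marker
    observed_stats = [trend_stat(col, labels) for col in columns]
    observed = sum_statistic(observed_stats, k)

    # Global significance: shuffle case-control labels and recompute the sum.
    permuted = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(permuted)
        perm_sum = sum_statistic([trend_stat(col, permuted) for col in columns], k)
        if perm_sum >= observed:
            exceed += 1
    p_value = (exceed + 1) / (n_perm + 1)

    # Markers contributing to the sum are the candidates for follow-up.
    top = sorted(range(len(columns)), key=lambda j: observed_stats[j], reverse=True)[:k]
    return observed, p_value, top


if __name__ == "__main__":
    # Toy data: 8 individuals, 3 markers; only marker 0 tracks disease status.
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    genotypes = [[2, 0, 1], [2, 1, 1], [2, 0, 1], [2, 1, 1],
                 [0, 0, 1], [0, 1, 1], [0, 0, 1], [0, 1, 1]]
    print(sum_stat_test(genotypes, labels, k=1))
```

If the global p-value falls below α, the markers returned in `top` would be the "interesting" set passed on to further analysis. In practice the number of summed markers k would itself have to be chosen, which this sketch leaves to the caller.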