Data mining



Group 14 used data-mining strategies to evaluate a number of issues, including appropriate diagnosis, haplotype estimation, genetic linkage and association studies, and type I error. Methods ranged from exploratory analyses, to machine learning strategies (neural networks, supervised learning, and tree-based methods), to false discovery rate control of type I errors. The general motivations were to find the “story” in the data and to summarize information from a multitude of measures. Several methods illustrated strategies for better trait definition, using summarization of related traits. In the few studies that sought to identify genes for alcoholism, there was little agreement among the different strategies, likely reflecting the complexities of the disease. Nevertheless, Group 14 found that these methods offered strategies to gain a better understanding of the complex pathways by which disease develops.

Genet. Epidemiol. 29(Suppl. 1):S103–S109, 2005. © 2005 Wiley-Liss, Inc.