† Authors are listed alphabetically and all were equal contributors.
Original Article
Data mining of RNA expression and DNA genotype data: Presentation Group 5 contributions to Genetic Analysis Workshop 15†
Article first published online: 28 NOV 2007
DOI: 10.1002/gepi.20279
© 2007 Wiley-Liss, Inc.
Issue

Genetic Epidemiology
Supplement: Genetic Analysis Workshop 15: Summaries of the Design and Analysis of Genomic Data
Volume 31, Issue S1, pages S43–S50, 2007
Additional Information
How to Cite
Falk, C. T., Finch, S. J., Kim, W. and Mukhopadhyay, N. D. (2007), Data mining of RNA expression and DNA genotype data: Presentation Group 5 contributions to Genetic Analysis Workshop 15. Genet. Epidemiol., 31: S43–S50. doi: 10.1002/gepi.20279
Publication History
- Issue published online: 28 NOV 2007
Keywords:
- neural networks;
- support vector machines;
- Bayesian networks;
- network construction;
- multistage analysis;
- mixture modeling
Abstract
The complexity of data available in human genetics continues to grow at an explosive rate. With that growth, the challenges of understanding the meaning of the underlying information also grow. A currently popular approach to dissecting such information falls under the broad category of data mining. The term can apply to any approach that tries to extract relevant information from large amounts of data, but it often refers to methods that deal, in a non-linear fashion, with very large numbers of variables that cannot be handled simultaneously by more conventional statistical methods. To explore the usefulness of some of these approaches, 13 groups applied a variety of strategies to the first dataset provided to GAW 15 participants. With the extensive microarray and SNP data provided for 14 CEPH families, these groups explored multistage analyses, machine learning methods, network construction, and other techniques to try to answer questions about gene-gene interaction, functional similarities, co-regulated gene expression and the mapping of gene expression determinants, among others. In general, the methods offered strategies to provide a better understanding of the complex pathways involved in gene expression and function. These are still “works in progress,” often exploratory in nature, but they provide insights into ways in which the data might be interpreted. Despite the still preliminary nature of some of these methods and the diversity of the approaches, some common themes emerged. The collection of papers and methods offers a starting point for further exploration of complex interactions in the human genetic data that are now readily available. Genet. Epidemiol. 31 (Suppl. 1):S43–S50, 2007. © 2007 Wiley-Liss, Inc.
