A Method to Detect Differentially Methylated Loci With Next-Generation Sequencing
Article first published online: 1 APR 2013
© 2013 Wiley Periodicals, Inc.
Volume 37, Issue 4, pages 377–382, May 2013
How to Cite
Xu, H., Podolsky, R. H., Ryu, D., Wang, X., Su, S., Shi, H. and George, V. (2013), A Method to Detect Differentially Methylated Loci With Next-Generation Sequencing. Genet. Epidemiol., 37: 377–382. doi: 10.1002/gepi.21726
- Issue published online: 15 APR 2013
- Manuscript Accepted: 24 FEB 2013
- Manuscript Revised: 9 JAN 2013
- Manuscript Received: 30 OCT 2012
- Georgia Health Sciences University
Keywords
- DNA methylation
- differential methylation test
- next-generation sequencing
Epigenetic changes, especially DNA methylation at CpG loci, have important implications in cancer and other complex diseases. With the development of next-generation sequencing (NGS), it is now feasible to generate data that interrogate differences in methylation status at loci genome-wide using a case-control design. However, a proper and efficient statistical test is lacking, and there are several challenges. First, unlike methylation experiments using microarrays, which yield one measure of methylation per individual at a particular CpG site, NGS yields counts of the methylated and unmethylated alleles for each individual. Second, because of how samples are prepared, the measured methylation reflects the methylation status of a mixture of cells. The underlying distribution of the measured methylation level is therefore unknown, and a robust test is more desirable than a parametric approach. Third, NGS currently measures methylation at over 2 million CpG sites, so any statistical test has to be computationally efficient to be applied to NGS data. Taking these challenges into account, we propose a test for differential methylation based on clustered data analysis that models the methylation counts. Simulations show that the test is robust under several distributions for the measured methylation levels, has good power, and is computationally efficient. Finally, we apply the test to our NGS data on chronic lymphocytic leukemia. The results indicate that it is a promising and practical test.
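To make the data layout concrete, the following is a minimal sketch of one generic way to compare methylation between cases and controls at a single CpG site when each individual contributes a pair of read counts. It treats each individual as a cluster of reads, estimates each group's methylation proportion with a ratio estimator, and uses a between-individual (cluster-robust) variance, so no distribution is assumed for the per-individual methylation level. This is an illustrative stand-in and not necessarily the statistic proposed in the article; the function names, the normal approximation, and the example counts are all assumptions made for illustration.

```python
import math


def group_estimate(counts):
    """Ratio estimate of the methylation proportion for one group.

    counts: list of (methylated_reads, total_reads), one pair per individual.
    Returns (p_hat, var_hat), where var_hat is a cluster-robust variance
    that treats each individual as one cluster of reads.
    """
    k = len(counts)
    if k < 2:
        raise ValueError("need at least two individuals per group")
    total_m = sum(m for m, _ in counts)
    total_n = sum(n for _, n in counts)
    if total_n == 0:
        raise ValueError("group has no reads at this site")
    p_hat = total_m / total_n
    # Residual of each individual's methylated count from its expected value.
    resid_ss = sum((m - n * p_hat) ** 2 for m, n in counts)
    var_hat = k / (k - 1) * resid_ss / total_n ** 2
    return p_hat, var_hat


def differential_methylation_z(cases, controls):
    """Two-sided z-test for a difference in methylation proportion.

    Hypothetical helper: uses a normal approximation, which is crude
    when the number of individuals per group is small.
    """
    p1, v1 = group_estimate(cases)
    p2, v2 = group_estimate(controls)
    if v1 + v2 == 0:
        raise ValueError("zero estimated variance; test is undefined")
    z = (p1 - p2) / math.sqrt(v1 + v2)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Made-up counts for one CpG site: (methylated reads, total reads).
example_cases = [(8, 10), (15, 20), (9, 12), (18, 25)]
example_controls = [(3, 10), (6, 18), (5, 15), (4, 14)]
z_stat, p_val = differential_methylation_z(example_cases, example_controls)
```

Because the computation at each site is a handful of sums over individuals, a statistic of this form can be evaluated independently at each of millions of CpG sites, which is the kind of computational efficiency the abstract calls for.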