Method to detect differentially methylated loci with case-control designs using Illumina arrays


  • Shuang Wang

    Corresponding author
    1. Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, New York
    • Department of Biostatistics, Mailman School of Public Health, Columbia University, 722 West 168th Street, Room 630, New York, NY 10032


It is now understood that many human cancer types result from the accumulation of both genetic and epigenetic changes. DNA methylation is a molecular modification of DNA that is crucial for normal development. Genes that are rich in CpG dinucleotides are usually not methylated in normal tissues, but are frequently hypermethylated in cancer. With the advent of high-throughput platforms, the large-scale structure of genomic methylation patterns is available through genome-wide scans, and a tremendous amount of DNA methylation data has recently been generated. However, sophisticated statistical methods to handle complex DNA methylation data are very limited. Here, we developed a likelihood-based Uniform-Normal-mixture model to select differentially methylated loci between case and control groups using Illumina arrays. The idea is to model the data as three types of methylation loci: unmethylated, completely methylated, and partially methylated. A three-component mixture model with two Uniform distributions and one truncated normal distribution was used to model the three types. The mixture probabilities and the mean of the normal distribution were used to make inference about differentially methylated loci. Through extensive simulation studies, we demonstrated the feasibility and power of the proposed method. An application to a recently published study on ovarian cancer identified several methylation loci that are missed by the existing method. Genet. Epidemiol. 35:686-694, 2011. © 2011 Wiley Periodicals, Inc.
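
To make the three-component structure concrete, the following is a minimal Python sketch of a mixture density with two Uniform components and one truncated normal component, together with its log-likelihood. It is an illustration under stated assumptions, not the implementation used in the study: the cut points `a` and `b` that bound the two Uniform components, the truncation of the normal component to [0, 1], and all function and parameter names are hypothetical choices that the abstract does not specify.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a Normal(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def truncated_normal_pdf(x, mu, sigma, lo=0.0, hi=1.0):
    """Normal density renormalised to [lo, hi]; zero outside that range."""
    if x < lo or x > hi:
        return 0.0
    z = (x - mu) / sigma
    dens = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    mass = normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)
    return dens / mass


def uniform_pdf(x, lo, hi):
    """Uniform density on [lo, hi]."""
    return 1.0 / (hi - lo) if lo <= x <= hi else 0.0


def mixture_pdf(x, p_unmeth, p_meth, mu, sigma, a=0.1, b=0.9):
    """Three-component mixture density for a methylation level x in [0, 1].

    Assumed for illustration only (not taken from the abstract):
    Uniform(0, a) for unmethylated loci, Uniform(b, 1) for completely
    methylated loci, and a normal truncated to [0, 1] for partially
    methylated loci.
    """
    p_partial = 1.0 - p_unmeth - p_meth
    return (p_unmeth * uniform_pdf(x, 0.0, a)
            + p_meth * uniform_pdf(x, b, 1.0)
            + p_partial * truncated_normal_pdf(x, mu, sigma))


def log_likelihood(values, p_unmeth, p_meth, mu, sigma, a=0.1, b=0.9):
    """Log-likelihood of the methylation levels at one locus within one group."""
    return sum(math.log(mixture_pdf(x, p_unmeth, p_meth, mu, sigma, a, b))
               for x in values)
```

In this sketch, a locus would be summarised in each group by its mixture probabilities and the mean of the normal component, and these quantities would be compared between cases and controls. The fitting procedure and the test statistic are not described in the abstract and are deliberately left out.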