
Identification of Functional cis-regulatory Polymorphisms in the Human Genome


  • Additional Supporting Information may be found in the online version of this article.

  • Communicated by William S. Oetting

  • Contract grant sponsors: Italian Association for Cancer Research (AIRC) (IG-9272, IG-9408, CR-4016); Italian Ministry of University and Scientific Research; Human Genetic Foundation; European Union Sixth Framework Program; Wellcome Trust (076113).

Correspondence to: Valeria Poli, University of Turin, Department of Molecular Biotechnology and Life Sciences, Torino, Italy. E-mail: or Paolo Provero, University of Turin, Department of Molecular Biotechnology and Life Sciences, MBC, Via Nizza 52, Torino, 10126, Italy. E-mail:


Polymorphisms in regulatory DNA regions are believed to play an important role in determining phenotype, including disease, and in providing raw material for evolution. We devised a new pipeline for the systematic identification of functional variation in human regulatory sequences. The algorithm is based on the identification of SNPs leading to significant changes in both the affinity of a regulatory region for transcription factors (TFs) and the expression in vivo of the regulated gene. We tested the algorithm by identifying SNPs leading to altered regulation by STAT3 in human promoters and introns, and experimentally validated the top-scoring ones, showing that most of the SNPs identified by the algorithm indeed correspond to differential binding of STAT3 and differential induction of the target gene upon stimulation with IL6. Using the same computational approach, we compiled a database of thousands of predicted functional regulatory SNPs for hundreds of human TFs, which we provide as online Supporting Information. We discuss possible applications to the interpretation of noncoding SNPs associated with human diseases. The method we propose and the database of predicted functional cis-regulatory polymorphisms will be useful in future studies of regulatory variation and in particular to interpret the results of past and future genome-wide association studies.
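The two-criteria selection described in the abstract (a SNP must change both the predicted TF affinity of the regulatory region and the in vivo expression of the regulated gene) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not specify the affinity measure or the statistical tests, so the sketch assumes a best-window log-odds score from a position weight matrix as the affinity proxy and an externally supplied expression p-value. The matrix `TOY_COUNTS`, the function names, and the thresholds `min_delta` and `alpha` are all hypothetical.

```python
import math

# Hypothetical 4-position count matrix (toy values, NOT a real STAT3 motif).
TOY_COUNTS = {
    "A": [8, 1, 1, 6],
    "C": [1, 1, 7, 1],
    "G": [1, 7, 1, 2],
    "T": [2, 3, 3, 3],
}


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert base counts into per-position log2 odds against a uniform background."""
    width = len(counts["A"])
    matrix = []
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        matrix.append({
            b: math.log2(((counts[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return matrix


def best_window_score(seq, matrix):
    """Assumed affinity proxy: the highest log-odds score over all windows of the sequence."""
    width = len(matrix)
    return max(
        sum(matrix[j][seq[i + j]] for j in range(width))
        for i in range(len(seq) - width + 1)
    )


def affinity_change(ref_seq, snp_pos, alt_base, matrix):
    """Score difference (alternate minus reference) for a single-base substitution."""
    alt_seq = ref_seq[:snp_pos] + alt_base + ref_seq[snp_pos + 1:]
    return best_window_score(alt_seq, matrix) - best_window_score(ref_seq, matrix)


def is_candidate(delta_affinity, expression_p, min_delta=0.5, alpha=0.05):
    """Keep a SNP only if BOTH criteria hold (thresholds are illustrative assumptions)."""
    return abs(delta_affinity) >= min_delta and expression_p < alpha


matrix = log_odds_matrix(TOY_COUNTS)
# Toy regulatory sequence containing the consensus AGCA; the SNP replaces G with T.
delta = affinity_change("TTAGCATT", snp_pos=3, alt_base="T", matrix=matrix)
# expression_p would come from a genotype-expression association test (assumed given).
print(is_candidate(delta, expression_p=0.01))
```

Requiring both conditions at once is the point of the sketch: a SNP that alters a predicted binding site but shows no expression association, or the reverse, is not retained.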
