Automated splicing mutation analysis by information theory

Authors

  • Vijay K. Nalla

    1. Laboratory of Human Molecular Genetics, Children's Mercy Hospital and Clinics, University of Missouri-Kansas City, Kansas City, Missouri
    2. School of Computing and Engineering, University of Missouri-Kansas City, Kansas City, Missouri
  • Peter K. Rogan

    Corresponding author
    1. Laboratory of Human Molecular Genetics, Children's Mercy Hospital and Clinics, University of Missouri-Kansas City, Kansas City, Missouri
    2. School of Medicine, University of Missouri-Kansas City, Kansas City, Missouri
    3. School of Computing and Engineering, University of Missouri-Kansas City, Kansas City, Missouri
    • Laboratory of Human Molecular Genetics, Children's Mercy Hospital and Clinics, 2401 Gillham Rd., Kansas City, MO 64108

  • Communicated by A. Jaime Cutticchia

Abstract

Information theory–based software tools have been useful in interpreting noncoding sequence variation within functional sequence elements such as splice sites. Individual information analysis detects activated cryptic splice sites and associated splicing regulatory sites and is capable of distinguishing null from partially functional alleles. We present a server (https://splice.cmh.edu) designed to analyze splicing mutations in binding sites in human genes, genome-mapped mRNAs, user-defined sequences, or dbSNP entries. Standard HUGO-approved gene symbols and HGVS-approved systematic mutation nomenclature (or dbSNP format) are entered via a web portal. After the accuracy of the input variant(s) is verified, the surrounding interval is retrieved from the human genome or a user-supplied reference sequence. The server then computes the information contents (Ri) of all potential constitutive and/or regulatory splice sites in both the reference and variant sequences. Changes in information content are color-coded, tabulated, and visualized as sequence walkers, which display the binding sites with the reference sequence. The software was validated by analyzing ∼1,300 mutations from Human Mutation as well as eight mapped SNPs from dbSNP designated as splice site variants. All of the splicing mutations and variants affected splice site strength or activated cryptic splice sites. The server also detected several missense mutations that were unexpectedly predicted to have concomitant effects on splicing or appeared to activate cryptic splicing. Hum Mutat 25:334–342, 2005. © 2005 Wiley-Liss, Inc.
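
The comparison of information contents in reference and variant sequences described above can be illustrated with a minimal sketch. It is not the server's implementation. It assumes the usual weight-matrix form of individual information, in which each base at each position contributes 2 + log2(frequency) bits and Ri is the sum over the site. The small-sample correction is omitted, and the pseudocount, the function names, and the six-sequence toy alignment are hypothetical choices made only for illustration; a real analysis would use a weight matrix derived from a large set of validated splice sites.

```python
import math

BASES = "ACGT"

# Hypothetical toy alignment of donor-like sites (illustrative only, not real data).
TOY_SITES = [
    "CAGGTAAGT",
    "AAGGTGAGT",
    "CAGGTAAGA",
    "TAGGTAAGT",
    "CAGGTGAGG",
    "AAGGTAAGC",
]


def build_weight_matrix(sites, pseudocount=0.5):
    """Return one dict per position mapping base -> bits (2 + log2 frequency).

    The pseudocount is an assumption of this sketch; it keeps unseen bases
    from producing log2(0). The small-sample correction is not applied.
    """
    n = len(sites)
    matrix = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        weights = {}
        for base in BASES:
            freq = (column.count(base) + pseudocount) / (n + 4 * pseudocount)
            weights[base] = 2.0 + math.log2(freq)
        matrix.append(weights)
    return matrix


def individual_information(matrix, site):
    """Sum the per-position weights to get Ri (in bits) for one candidate site."""
    return sum(matrix[i][base] for i, base in enumerate(site))


def scan(matrix, sequence):
    """Compute Ri for every window of the matrix width along a sequence."""
    width = len(matrix)
    return [
        (start, individual_information(matrix, sequence[start:start + width]))
        for start in range(len(sequence) - width + 1)
    ]


def compare_sites(matrix, reference, variant):
    """List (position, Ri reference, Ri variant, change) where Ri differs.

    Assumes a substitution, so both sequences have the same length.
    """
    changes = []
    for (pos, ri_ref), (_, ri_var) in zip(scan(matrix, reference), scan(matrix, variant)):
        delta = ri_var - ri_ref
        if abs(delta) > 1e-9:
            changes.append((pos, ri_ref, ri_var, delta))
    return changes


if __name__ == "__main__":
    matrix = build_weight_matrix(TOY_SITES)
    reference = "TTCAGGTAAGTCC"
    variant = "TTCAGATAAGTCC"  # G>A substitution at index 5 (toy example)
    for pos, ri_ref, ri_var, delta in compare_sites(matrix, reference, variant):
        print(f"pos {pos}: Ri {ri_ref:.2f} -> {ri_var:.2f} bits (change {delta:+.2f})")
```

In this toy setting, a substitution that lowers Ri at the natural site weakens it, while a window whose Ri rises in the variant would be a candidate cryptic site. Any thresholds for calling a site functional would have to come from the published method rather than from this sketch.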
