
DOCKGROUND system of databases for protein recognition studies: Unbound structures for docking

Authors

  • Ying Gao,

    1. Center for Bioinformatics, The University of Kansas, Lawrence, Kansas
  • Dominique Douguet,

    1. Centre de Biochimie Structurale (CNRS UMR 5048, INSERM UMR U554, UMI), Montpellier, France
  • Andrey Tovchigrechko,

    1. Center for Bioinformatics, The University of Kansas, Lawrence, Kansas
    Current affiliation:
    1. J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850
  • Ilya A. Vakser

    Corresponding author
    1. Center for Bioinformatics, The University of Kansas, Lawrence, Kansas
    2. Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas
    • Center for Bioinformatics, The University of Kansas, 2030 Becker Drive, Lawrence, KS 66047-1620

Abstract

Computational docking approaches are important as a source of protein–protein complex structures and as a means to understand the principles of protein association. A key element in designing better docking approaches, including search procedures, potentials, and scoring functions, is their validation on experimentally determined structures. Thus, the databases of such structures (benchmark sets) are important. The previous, first release of the DOCKGROUND resource (Douguet et al., Bioinformatics 2006; 22:2612–2618) implemented a comprehensive database of cocrystallized (bound) protein–protein complexes in a relational database of annotated structures. The current release adds important features to the set of bound structures, such as regularly updated downloadable datasets: an automatically generated nonredundant set, built according to the most common criteria, and a manually curated set that includes only biological nonobligate complexes along with a number of additional useful characteristics. The main focus of the current release is unbound (experimental and simulated) protein–protein complexes. Complexes from the bound dataset are used to identify crystallized unbound analogs. If such analogs do not exist, the unbound structures are simulated by rotamer library optimization. Thus, the database contains comprehensive sets of complexes suitable for large-scale benchmarking of docking algorithms. Advanced methodologies for simulating unbound conformations are being explored for the next release. Future releases will include datasets of modeled protein–protein complexes and systematic sets of docking decoys obtained by different docking algorithms. The growing DOCKGROUND resource is designed to become a comprehensive public environment for developing and validating new docking methodologies. Proteins 2007. © 2007 Wiley-Liss, Inc.
