A new, structurally nonredundant, diverse data set of protein–protein interfaces and its implications

Authors

  • Ozlem Keskin,

    Corresponding author
    1. Koc University, Center for Computational Biology and Bioinformatics and College of Engineering, Rumelifeneri Yolu, Sariyer, Istanbul 34450, Turkey
    2. Basic Research Program, Science Applications International Corporation (SAIC)-Frederick, Inc., Laboratory of Experimental and Computational Biology, National Cancer Institute (NCI)-Frederick, Frederick, Maryland 21702, USA
    • NCI-Frederick, Building 469, Room 151, Frederick, MD 21702, USA; fax: (301) 846-5598.
  • Chung-Jung Tsai,

    1. Basic Research Program, Science Applications International Corporation (SAIC)-Frederick, Inc., Laboratory of Experimental and Computational Biology, National Cancer Institute (NCI)-Frederick, Frederick, Maryland 21702, USA
  • Haim Wolfson,

    1. School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
  • Ruth Nussinov

    Corresponding author
    1. Basic Research Program, Science Applications International Corporation (SAIC)-Frederick, Inc., Laboratory of Experimental and Computational Biology, National Cancer Institute (NCI)-Frederick, Frederick, Maryland 21702, USA
    2. Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
    • NCI-Frederick, Building 469, Room 151, Frederick, MD 21702, USA; fax: (301) 846-5598.

Abstract

Here, we present a diverse, structurally nonredundant data set of two-chain protein–protein interfaces derived from the PDB. Using a sequence order-independent structural comparison algorithm and hierarchical clustering, we obtain 3799 interface clusters. These yield 103 clusters with at least five nonhomologous members. We divide the clusters into three types. In Type I clusters, the global structures of the chains from which the interfaces are derived are also similar. This cluster type is expected because, in general, related proteins associate in similar ways. In Type II clusters, the interfaces are similar; remarkably, however, the overall structures and functions of the chains are different. The functional spectrum is broad, from enzymes/inhibitors to immunoglobulins and toxins. The fact that structurally different monomers associate in similar ways suggests “good” binding architectures. This observation extends a paradigm in protein science: it is well known that proteins with similar structures may have different functions. Here, we show that the same holds for interfaces. In Type III clusters, only one side of the interface is similar across the cluster. This structurally nonredundant data set provides rich data for studies of protein–protein interactions and recognition, cellular networks, and drug design. In particular, it may be useful in addressing the difficult question of which ways of interacting are favorable for proteins. (The data set is available at http://protein3d.ncifcrf.gov/∼keskino/ and http://home.ku.edu.tr/∼okeskin/INTERFACE/INTERFACES.html.)
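
The three cluster types and the five-member cutoff described in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the `InterfaceCluster` record, its fields, and the `classify` and `well_populated` helpers are hypothetical names introduced here, and the sketch assumes that the structural similarity judgments (for each side of the interface and for the parent chains) have already been made by a separate comparison step.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InterfaceCluster:
    """Hypothetical summary of one interface cluster (illustration only)."""
    name: str
    nonhomologous_members: int      # number of nonhomologous interfaces in the cluster
    side_a_similar: bool            # is one side of the interface similar across members?
    side_b_similar: bool            # is the other side similar across members?
    chains_globally_similar: bool   # are the parent chains' global structures similar?


def classify(cluster: InterfaceCluster) -> str:
    """Assign a cluster type following the definitions given in the abstract."""
    both_sides = cluster.side_a_similar and cluster.side_b_similar
    if both_sides and cluster.chains_globally_similar:
        return "Type I"    # similar interfaces, similar global structures
    if both_sides:
        return "Type II"   # similar interfaces, different global structures
    if cluster.side_a_similar or cluster.side_b_similar:
        return "Type III"  # only one side of the interface is similar
    return "unclassified"


def well_populated(clusters: List[InterfaceCluster],
                   min_members: int = 5) -> List[InterfaceCluster]:
    """Keep clusters with at least `min_members` nonhomologous members."""
    return [c for c in clusters if c.nonhomologous_members >= min_members]


# Made-up example records, not entries from the published data set.
examples = [
    InterfaceCluster("c1", 7, True, True, True),
    InterfaceCluster("c2", 5, True, True, False),
    InterfaceCluster("c3", 6, True, False, False),
    InterfaceCluster("c4", 2, True, True, True),
]

for c in well_populated(examples):
    print(c.name, classify(c))
```

With these made-up records, the cluster `c4` is dropped by the cutoff because it has only two nonhomologous members, and the remaining three clusters fall into Types I, II, and III, respectively.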
