Library of local descriptors models the core of proteins accurately



In this article, we present a novel approach to describing proteins based on multifragment structural motifs called local descriptors. We group structurally similar descriptors to construct a compact library of descriptor groups. To demonstrate the library's feasibility for a wide spectrum of applications, ranging from structure comparison and analysis to structure prediction, it is critical to show that groups from the library can reproduce protein structures accurately. We show that the library describes all local 3D structure patterns occurring in the core of proteins, and we present an algorithm for reconstructing accurate global 3D structures. Moreover, we show that the sequence of motifs used in such a reconstruction correlates significantly with the amino acid sequence of the protein under consideration. Finally, we show how the library can be used to predict protein sequence from structure. Proteins 2007. © 2007 Wiley-Liss, Inc.