Standard Article

Protein Structure Prediction and Databases

  1. Olga V Kalinina,
  2. Thomas Lengauer

Published Online: 15 SEP 2014

DOI: 10.1002/9780470015902.a0006214.pub2

eLS

How to Cite

Kalinina, O. V. and Lengauer, T. 2014. Protein Structure Prediction and Databases. eLS.

Author Information

  1. Max-Planck Institute for Informatics, Saarbrücken, Germany

  1. Based in part on the previous version of this eLS article ‘Protein Structure Prediction and Databases’ (2006) by Francisco S Domingues, Thomas Lengauer and Ingolf Sommer.

Abstract

Three-dimensional (3D) structures of proteins are the key to understanding their molecular function. Protein structures are most reliably determined by experiment. Recent advances in experimental techniques have led to a large increase in the numbers of both protein sequences and 3D structures. Yet the number of experimentally resolved protein 3D structures is three orders of magnitude lower than the number of known sequences. This gap calls for computational prediction of protein structure. Today, several databases complement the comparatively small set of experimentally resolved protein structures with much larger sets of computer-generated protein models.

Key Concepts:

  • Protein structure prediction relies heavily on experimental data on protein structures; the volume of such data is the prime determinant of the quality of protein structure predictions.

  • The three major types of methods for protein structure prediction are homology (or template-based) modelling; fold recognition (or threading); and de novo (or ab initio) prediction.

  • Homology modelling is the most reliable class of methods, but it requires an experimentally determined structure of a homologous (and thus structurally similar) protein, called the template.

  • Sensitive sequence similarity search tools are used for detection of potential templates.

  • The protein structure is modelled step-wise: (1) aligning the target protein to the template, (2) placing the aligned target residues onto their respective template residues, (3) placing the side chains of nonconserved residues, healing backbone breaks and modelling loops that form gaps in the alignment, and (4) refining the model.

  • The two most popular computational tools for homology modelling are MODELLER and SWISS-MODEL; the two protein model databases based on them are ModBase and the SWISS-MODEL Repository, respectively.

  • The Protein Modelling Portal unites data from these and other databases, and provides an independent system for model evaluation called CAMEO.
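
The step-wise modelling procedure listed above can be illustrated with a minimal sketch. It is not the method used by MODELLER or SWISS-MODEL; it only shows, under simplifying assumptions, how a target-template alignment determines which target residues can be placed on template residues, which side chains must be rebuilt, and which residues fall into alignment gaps and must be modelled as loops. The names `ModellingPlan` and `plan_from_alignment`, the gap character `-` and the example sequences are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ModellingPlan:
    # target residue index -> template residue index (backbone can be copied)
    copied: dict = field(default_factory=dict)
    # target indices aligned to a different residue type (side chain to rebuild)
    rebuild_side_chain: list = field(default_factory=list)
    # target indices aligned to a gap in the template (loop to model)
    loops: list = field(default_factory=list)


def plan_from_alignment(target_aln: str, template_aln: str, gap: str = "-") -> ModellingPlan:
    """Derive a modelling plan from two rows of a pairwise alignment."""
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    plan = ModellingPlan()
    t_idx = 0  # position in the ungapped target sequence
    m_idx = 0  # position in the ungapped template sequence
    for t_res, m_res in zip(target_aln, template_aln):
        if t_res != gap and m_res != gap:
            # Step 2: the target residue is placed onto its template residue.
            plan.copied[t_idx] = m_idx
            if t_res != m_res:
                # Step 3: a nonconserved residue needs a new side chain.
                plan.rebuild_side_chain.append(t_idx)
        elif t_res != gap:
            # Step 3: no template coordinates exist; model as a loop.
            plan.loops.append(t_idx)
        # A template residue facing a target gap is skipped; the resulting
        # backbone break would have to be healed in a real pipeline.
        if t_res != gap:
            t_idx += 1
        if m_res != gap:
            m_idx += 1
    return plan


# Hypothetical alignment (step 1 is assumed to have been done already).
plan = plan_from_alignment("MKT-AYLG", "MRTQA--G")
print(plan.copied)
print(plan.rebuild_side_chain)
print(plan.loops)
```

Step 4, model refinement, is deliberately left out: it depends on an energy function or other scoring scheme that this sketch does not attempt to represent.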

Keywords:

  • protein structure;
  • protein structure prediction;
  • protein structure databases;
  • structural genomics;
  • protein structure modelling