Elucidating protein secondary structure with circular dichroism and a neural network

Authors

  • Vincent Hall

    1. Molecular Organisation and Assembly in Cells Doctoral Training Centre, University of Warwick, Coventry, United Kingdom
    2. Department of Chemistry, University of Warwick, Coventry, United Kingdom
    3. School of Engineering, University of Warwick, Coventry, United Kingdom
  • Anthony Nash

    1. Molecular Organisation and Assembly in Cells Doctoral Training Centre, University of Warwick, Coventry, United Kingdom
    2. Centre for Scientific Computing, University of Warwick, Coventry, United Kingdom
  • Evor Hines

    1. School of Engineering, University of Warwick, Coventry, United Kingdom
  • Alison Rodger

    Corresponding author
    1. Department of Chemistry, University of Warwick, Coventry, United Kingdom
    2. Warwick Centre for Analytical Science, University of Warwick, Coventry, United Kingdom

Abstract

Circular dichroism spectroscopy is a quick method for determining the average secondary structures of proteins, probing their interactions with their environment, and aiding drug discovery. This article describes the development of a self-organising map structure-fitting methodology, named secondary structure neural network (SSNN), that aids this process and reduces the level of expertise required. SSNN uses a database of spectra from proteins with known X-ray structures to predict the structures of new proteins. It is designed as three units: SSNN1 takes spectra for known proteins and clusters them into a map; SSNN2 creates a matching structure map; and SSNN3 places unknown spectra on the map and assigns them structure vectors. SSNN3 output illustrates the process and the results obtained. We detail the strengths and weaknesses of SSNN and compare it with widely accepted structure-fitting programs. The current input format is Δε per amino acid residue from 240 to 190 nm in 1 nm steps for the known and unknown proteins, together with a vector summarizing the secondary structure elements of the known proteins. The format is readily modified to accept, for example, extended wavelength ranges or different assignments of secondary structure. SSNN can be used either pretrained with a reference set from the CDPro web site (direct application of SSNN3 with the provided output from SSNN1 and SSNN2), or with all three modules run as required. A trained SSNN3 (using the 48-spectra reference set from this work complemented by five additional spectra) is available at http://www2.warwick.ac.uk/fac/sci/chemistry/research/arodger/arodgergroup/research_intro/instrumentation/ssnn/. © 2013 Wiley Periodicals, Inc.
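The three-stage workflow described in the abstract can be illustrated with a minimal self-organising map sketch in Python with NumPy. This is not the authors' SSNN code. The function names, grid size, learning-rate and neighbourhood schedules, and in particular the way the structure map is filled (an inverse-distance-weighted average over the k nearest reference spectra) are assumptions chosen for illustration. Only the input layout, Δε per residue from 240 to 190 nm in 1 nm steps plus one structure vector per known protein, is taken from the abstract.

```python
import numpy as np

# Input grid from the abstract: 240 -> 190 nm in 1 nm steps (51 points).
WAVELENGTHS_NM = np.arange(240, 189, -1)


def train_spectral_map(spectra, grid=(5, 5), epochs=100, lr0=0.5, sigma0=2.0, seed=0):
    """SSNN1-like step (assumed details): cluster reference spectra onto a 2-D map."""
    spectra = np.asarray(spectra, dtype=float)
    rng = np.random.default_rng(seed)
    rows, cols = grid
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise each node's weight vector from a randomly chosen reference spectrum.
    weights = spectra[rng.integers(0, len(spectra), size=rows * cols)].copy()
    total_steps = epochs * len(spectra)
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(len(spectra)):
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                 # learning rate decays linearly
            sigma = sigma0 + (0.5 - sigma0) * frac  # neighbourhood radius shrinks to 0.5
            # Best-matching unit: the node whose weights are closest to this spectrum.
            bmu = np.argmin(((weights - spectra[i]) ** 2).sum(axis=1))
            grid_dist2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            influence = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            weights += lr * influence[:, None] * (spectra[i] - weights)
            step += 1
    return weights


def build_structure_map(weights, spectra, structures, k=3, eps=1e-9):
    """SSNN2-like step (assumed rule): give every map node a structure vector."""
    spectra = np.asarray(spectra, dtype=float)
    structures = np.asarray(structures, dtype=float)
    # Squared distance from every node to every reference spectrum.
    d2 = ((weights[:, None, :] - spectra[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    w = 1.0 / (np.take_along_axis(d2, nearest, axis=1) + eps)
    w /= w.sum(axis=1, keepdims=True)
    # Weighted average of the structure vectors of the k nearest references.
    return (w[:, :, None] * structures[nearest]).sum(axis=1)


def predict_structure(weights, structure_map, spectrum):
    """SSNN3-like step: place an unknown spectrum on the map and read its structure vector."""
    spectrum = np.asarray(spectrum, dtype=float)
    bmu = np.argmin(((weights - spectrum) ** 2).sum(axis=1))
    return structure_map[bmu]
```

In this sketch, `spectra` is an array with one row per reference protein and 51 columns, and `structures` holds the matching secondary-structure fractions. Extending the wavelength range or changing the structure classes only changes the array widths, which mirrors the flexibility of the input format noted in the abstract.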
