Comparison of and limits of accuracy for statistical analyses of vibrational and electronic circular dichroism spectra in terms of correlations to and predictions of protein secondary structure

Authors

  • Petr Pancoska,

    1. Department of Chemistry, University of Illinois at Chicago, Chicago, Illinois 60607–7061
    2. Department of Chemical Physics, Charles University, Ke Karlovu 3, 121 16 Prague 2, Czech Republic
  • Eduard Bitto,

    1. Department of Chemical Physics, Charles University, Ke Karlovu 3, 121 16 Prague 2, Czech Republic
  • Vit Janota,

    1. Department of Chemical Physics, Charles University, Ke Karlovu 3, 121 16 Prague 2, Czech Republic
  • Marie Urbanova,

    1. Department of Chemistry, University of Illinois at Chicago, Chicago, Illinois 60607–7061
    2. Institute of Chemical Technology, Department of Physics and Measurements, Technicka 5, 166 23 Prague 6, Czech Republic
  • Vijai P. Gupta,

    1. Department of Chemistry, University of Illinois at Chicago, Chicago, Illinois 60607–7061
  • Timothy A. Keiderling

    Corresponding author
    1. Department of Chemistry, University of Illinois at Chicago, Chicago, Illinois 60607–7061
    • Department of Chemistry, m/c 111, University of Illinois at Chicago, 845 W. Taylor Street, Chicago, Illinois 60607-7061

Abstract

This work provides a systematic comparison of vibrational CD (VCD) and electronic CD (ECD) methods for spectral prediction of secondary structure. The VCD and ECD data are simplified to a small set of spectral parameters using the principal component method of factor analysis (PC/FA). Regression fits of these parameters are made to the X-ray-determined fractional components (FC) of secondary structure. Predictive capability is determined by computing structures for proteins sequentially left out of the regression. All possible combinations of PC/FA spectral parameters (coefficients) were used to form a full set of restricted multiple regressions with the FC values: independently for each spectral data set, for the two VCD sets combined, and for all the data grouped together. The complete search over all possible combinations of spectral parameters for different types of spectral data is a new feature of this study, and the focus on prediction is the strength of this approach. The PC/FA method was found to be stable in detail to expansion of the training set. Coupling amide II to amide I' parameters reduced the standard deviations of the VCD regression relationships, and combining VCD and ECD data led to the best fits. Prediction results had a minimum error when dependent on relatively few spectral coefficients. Such a limited dependence on spectral variation is the key finding of this work; it has ramifications for previous studies and suggests future directions for spectral analysis of structure. The best ECD prediction for helix and sheet uses only one parameter, the coefficient of the first subspectrum. With VCD, the best predictions sample coefficients of both the amide I' and II bands, but error is minimized using only a few coefficients. In this respect, ECD is more accurate than VCD for α-helix, and the combined VCD (amide I'+II) predicts the β-sheet component better than does ECD.
Combining VCD and ECD data sets yields exceptionally good predictions by utilizing the strengths of each. However, the residual error, its distribution, and, most importantly, the lack of dependence of the method on many of the significant components derived from the spectra lead to the conclusion that the heterogeneity of protein structure is a fundamental limitation to the use of such spectral analysis methods. The underutilization of these data for prediction of secondary structure suggests that spectral data could predict a more detailed descriptor.
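
The workflow summarized in the abstract (reduction of spectra to a few coefficients, regression against FC values, leave-one-out prediction, and an exhaustive search over coefficient combinations) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `pc_coefficients`, `loo_prediction_error`, and `best_subset` are hypothetical, a plain SVD on mean-centered spectra stands in for the PC/FA step, and the decomposition is computed once on all spectra rather than being repeated for each left-out protein. The abstract does not specify any of these details.

```python
from itertools import combinations

import numpy as np


def pc_coefficients(spectra, n_components):
    """Return per-protein coefficients of the leading principal components.

    spectra: array of shape (n_proteins, n_wavelengths).
    Assumption: SVD of mean-centered spectra approximates the PC/FA step.
    """
    centered = spectra - spectra.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def loo_prediction_error(coeffs, fc):
    """RMS error when each protein is predicted from a fit to all the others.

    coeffs: array of shape (n_proteins, n_selected_coefficients).
    fc: array of shape (n_proteins,) with one fractional component per protein.
    """
    n = len(fc)
    design = np.column_stack([np.ones(n), coeffs])  # intercept + coefficients
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # leave protein i out of the regression
        beta, *_ = np.linalg.lstsq(design[keep], fc[keep], rcond=None)
        errors[i] = design[i] @ beta - fc[i]
    return float(np.sqrt(np.mean(errors ** 2)))


def best_subset(coeffs, fc, max_size=None):
    """Search every combination of coefficients for the lowest LOO error.

    Returns (error, subset), where subset is a tuple of column indices.
    """
    n_coeff = coeffs.shape[1]
    max_size = n_coeff if max_size is None else max_size
    best_error, best_columns = np.inf, ()
    for size in range(1, max_size + 1):
        for subset in combinations(range(n_coeff), size):
            error = loo_prediction_error(coeffs[:, list(subset)], fc)
            if error < best_error:
                best_error, best_columns = error, subset
    return best_error, best_columns
```

Under these assumptions, a call such as `best_subset(pc_coefficients(spectra, 6), helix_fractions)` would return the coefficient combination with the smallest leave-one-out error, where `spectra` and `helix_fractions` are placeholder inputs supplied by the reader. Scoring subsets by prediction error rather than fit quality mirrors the emphasis of the abstract: adding coefficients always improves the fit, but it need not improve prediction for left-out proteins.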
