Keywords:

  • antigen–antibody interaction;
  • descriptor;
  • orthogonal signal correction (OSC);
  • partial least squares (PLS);
  • quantitative sequence–kinetics relationship (QSKR)

Abstract

  1. Top of page
  2. Abstract
  3. Methods
  4. Results and Discussion
  5. Conclusions
  6. Acknowledgments
  7. References
  8. Supporting Information

The interaction between recombinant Fab 57P and the coat protein of tobacco mosaic virus was studied using a quantitative structure–activity relationship (QSAR) method. Quantitative multivariate modeling has been shown to be a promising approach for unraveling protein–protein interactions through designed mutations in the peptide sequence. This approach makes it possible to determine which stereochemical properties of the residues contribute most to the interaction. A set of side-chain descriptors was proposed and applied to the structural characterization of three positions (142, 145 and 146) in the peptide antigen. Quantitative sequence–kinetics relationship (QSKR) models describing the dissociation rates (log kd) were developed using the orthogonal signal correction–partial least squares method. The results showed that peptides have high log kd values when the amino acids in positions 142 and 145 have a high net charge index, when residue 145 has high hydrophobicity, and when residue 146 has low hydrophobicity.

Three-dimensional quantitative structure–activity relationship (3D-QSAR) modeling has become a common technique for elucidating the stereochemical features important for the function of small ligands. Several successful applications have been reported (1–3). The principle of the QSAR approach is to establish mathematical models that relate variations in responses to variations in factors. The factors are structural or physicochemical properties of the compounds and are described using quantitative molecular descriptors. Once a mathematical model has been established, the molecular properties of a compound can in principle be modified to reach a desired biologic activity. The models are also helpful in explaining the molecular activity. In the case of large molecules such as peptides or proteins, these properties can only be partially described, so the choice of positions to modify and of structural descriptors is critical for the success of the method. Some examples have been published (4).

A multivariate quantitative sequence–kinetics relationship (QSKR) approach involving modifications in peptide sequence was recently used to predict the kinetics of peptide–antibody interaction (5). The QSKR analysis was based on the interaction between recombinant Fab 57P (6) and peptide antigens, which was characterized in great detail in the study by Andersson et al. (5). Fab 57P is directed against the coat protein of tobacco mosaic virus and recognizes peptides corresponding to region 134–151 of the protein sequence. In that QSKR study, the variable factors were either the buffer composition or the peptide sequence. Peptide sequence was described using standard ZZ-scales (7) and a helix-forming tendency scale (8). The mathematical model developed by Andersson et al. for relating kd to peptide sequence in standard HBS buffer suggests that log kd values correlate with the helix-forming tendency at position 145 and with the electronic properties at position 146. Subsequently, a QSKR model of this system was also analyzed by Choulier et al. (9). The mathematical model was assessed by comparing predicted and measured kinetic parameters for the interaction of Fab 57P with eight new double or triple variants of the peptide antigen that were based on 18 peptides used in Andersson’s study.

In this study, the data presented in Andersson’s and Choulier’s papers were reanalyzed using a new set of side-chain descriptors proposed by our laboratory. Three descriptors were derived from a matrix of three structural variables of the natural amino acids: the van der Waals volume, net charge index and hydrophobic parameter of the side chains (10). To improve the predictive ability of the QSKR model, the orthogonal signal correction (OSC) method was used to pretreat the variable matrix X, and a QSKR model was then generated using the partial least squares (PLS) method.

Methods

Experimental data

Selection of substitution positions in the peptide antigen was based on the previous characterization of the interaction of Fab 57P with peptides. For practical reasons, residues selected for modification should influence interaction kinetics but not be essential for binding. Single changes at positions 141, 143 and 144 have been shown to drastically decrease binding, while changes at positions 142, 145 and 146 only slightly influenced interaction kinetics (11,12). Positions 142, 145 and 146 were therefore selected to analyze the effect of changed chemical properties. Twenty-five peptides, comprising the 17 used in Andersson’s study and the eight used in Choulier’s study, were used in this analysis. The consistency of the data collected from the two sources was verified in Choulier’s study. Peptides are designated by the amino acids present at the three positions (142, 145 and 146) that were varied in the antigen. For example, the wild-type peptide containing S142, E145 and S146 is called SES. Table 3 shows the experimental values of the dissociation rate constants (log kd) for the 25 peptides.

Table 3.   The amino acid sequences and kd values of peptide antigens

No.  Peptide  Obsd. (M-1S-1)  Calcd. (M-1S-1)    No.  Peptide  Obsd. (M-1S-1)  Calcd. (M-1S-1)
Training set
1    NES      −2.75           −2.72              12   FGR      −1.2            −1.35
2    RVA*     −2.45                              13   SES      −3.4            −2.96
3    DRK      −2.45           −2.50              14   SEA      −3.1            −3.01
4    EES      −2.45           −2.57              15   GRA      −2.3            −2.50
5    VQE      −2.4            −2.20              16   GAK      −1.25           −1.14
6    QDF      −2.3            −1.94              17   GDS      −2.25           −2.32
7    SAS      −2.25           −2.38              18   AEL      −2.55           −2.69
8    RDG      −2.2            −2.22              19   AKS      −3.35           −3.26
9    DYD      −1.9            −1.83              20   ERS      −2.45           −2.79
10   MYT      −1.85           −2.00              21   THS      −2.5            −2.55
11   GSQ      −1.5            −1.49
Test set
22   DSA      −1.95           −2.48              24   EGK      −1.7            −1.38
23   AES      −3.05           −3.05              25   ENS      −2.4            −2.53

  1. ‘*’ means the data for this peptide were removed from the model. M, mol; S, second

Molecular descriptors

Peptide sequence was characterized using a set of descriptors recently proposed by us (10), based on three physicochemical properties (13–15) of the amino acid side chains: van der Waals volume, net charge index and hydrophobic parameter, encoded as V1, V2 and V3, respectively (Table 1). The correlation matrix of the three parameters is shown in Table 2. The three parameters are only weakly correlated; thus, they were used directly to characterize the amino acid residues of the peptide antigen. The quantitative sequence–kinetics relationship was analyzed using orthogonal signal correction combined with partial least squares (OSC-PLS). Three descriptors are needed to characterize the structural properties of each amino acid residue. Therefore, nine descriptors were needed to characterize the structural modification at the three mutated positions of the peptide antigen (positions 142, 145 and 146).

Table 1.   Scales for amino acids

Amino acid   V1 (nm³)   V2          V3
Ala (A)      0.05702    0.007187    0.42
Arg (R)      0.58946    0.043587    −1.37
Asn (N)      0.22972    0.005392    −0.82
Asp (D)      0.21051    −0.02382    −1.05
Cys (C)      0.14907    −0.03661    1.34
Gln (Q)      0.34861    0.049211    −0.30
Glu (E)      0.32837    0.006802    −0.87
Gly (G)      0.00279    0.179052    0.00
His (H)      0.37694    −0.01069    0.18
Ile (I)      0.37671    0.021631    2.46
Leu (L)      0.37876    0.051672    2.32
Lys (K)      0.45363    0.017708    −1.35
Met (M)      0.38872    0.002683    1.68
Phe (F)      0.55298    0.037552    2.44
Pro (P)      0.2279     0.239531    0.98
Ser (S)      0.09204    0.004627    −0.05
Thr (T)      0.19341    0.003352    0.35
Trp (W)      0.79351    0.037977    3.07
Tyr (Y)      0.6115     0.023599    1.31
Val (V)      0.25674    0.057004    1.66

Table 2.   Correlation matrix of parameters

      V1       V2       V3
V1    1.000    −0.130   0.329
V2    −0.130   1.000    0.101
V3    0.329    0.101    1.000
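The encoding described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the Table 1 values as transcribed here; the names `SCALES`, `encode_peptide` and `descriptor_names` are hypothetical helpers introduced for this example and do not come from the original study.

```python
# Side-chain descriptors from Table 1:
# (V1 van der Waals volume in nm^3, V2 net charge index, V3 hydrophobic parameter)
SCALES = {
    "A": (0.05702, 0.007187, 0.42),  "R": (0.58946, 0.043587, -1.37),
    "N": (0.22972, 0.005392, -0.82), "D": (0.21051, -0.02382, -1.05),
    "C": (0.14907, -0.03661, 1.34),  "Q": (0.34861, 0.049211, -0.30),
    "E": (0.32837, 0.006802, -0.87), "G": (0.00279, 0.179052, 0.00),
    "H": (0.37694, -0.01069, 0.18),  "I": (0.37671, 0.021631, 2.46),
    "L": (0.37876, 0.051672, 2.32),  "K": (0.45363, 0.017708, -1.35),
    "M": (0.38872, 0.002683, 1.68),  "F": (0.55298, 0.037552, 2.44),
    "P": (0.2279, 0.239531, 0.98),   "S": (0.09204, 0.004627, -0.05),
    "T": (0.19341, 0.003352, 0.35),  "W": (0.79351, 0.037977, 3.07),
    "Y": (0.6115, 0.023599, 1.31),   "V": (0.25674, 0.057004, 1.66),
}

# The three varied positions of the peptide antigen.
POSITIONS = (142, 145, 146)


def encode_peptide(code):
    """Turn a three-letter peptide code such as 'SES' into nine descriptors."""
    if len(code) != len(POSITIONS):
        raise ValueError("expected one residue per varied position")
    row = []
    for residue in code.upper():
        row.extend(SCALES[residue])  # appends V1, V2, V3 for this position
    return row


def descriptor_names():
    """Names in the Vi-j style used in the text (Vj of the residue at position i)."""
    return [f"V{pos}-{j}" for pos in POSITIONS for j in (1, 2, 3)]


if __name__ == "__main__":
    for name, value in zip(descriptor_names(), encode_peptide("SES")):
        print(name, value)
```

Stacking one such row per peptide gives the descriptor matrix X with nine columns that the later modeling steps operate on.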

Orthogonal signal correction

Orthogonal signal correction was introduced by Wold et al. (7) to remove systematic variation from the response matrix (e.g., absorption) that is unrelated, or orthogonal, to the property matrix (e.g., concentration), so that the information relevant to the analyte is retained. Since then, several groups (16–18) have published OSC algorithms that aim to reduce model complexity by removing orthogonal components from the signal. Recently, the application of OSC to simultaneous spectrophotometric determination by PLS has been reported (19–21). Systematic variation in the descriptor matrix X that is orthogonal to the response fits the description of structured noise in X. Thus, the OSC filter can be used as a preprocessing step prior to latent variable regression modeling, e.g., PLS, to remove this structured noise. PLS and OSC were performed using SIMCA-P 11.0 (22).
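The idea of removing y-orthogonal variation from X can be illustrated with a simplified one-component filter. The sketch below is written in the spirit of OSC and is not the algorithm implemented in SIMCA-P, which the text does not describe in detail. The function name `osc_one_component` and the use of NumPy are assumptions made for this example.

```python
import numpy as np


def osc_one_component(X, y):
    """Remove one y-orthogonal component from X (simplified OSC-style filter).

    Returns the corrected (mean-centered) matrix, the removed scores t and
    the corresponding loadings p.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    Xc = X - X.mean(axis=0)  # column-center the descriptors
    yc = y - y.mean()        # center the response

    # Part of X that carries no linear information about y.
    X_orth = Xc - np.outer(yc, yc @ Xc) / (yc @ yc)

    # Dominant direction of variation within that y-orthogonal part.
    U, S, _ = np.linalg.svd(X_orth, full_matrices=False)
    if S[0] < 1e-12:
        # Nothing orthogonal to remove.
        return Xc, np.zeros(len(yc)), np.zeros(Xc.shape[1])

    t = U[:, 0] * S[0]           # scores, orthogonal to yc by construction
    p = Xc.T @ t / (t @ t)       # loadings of Xc on those scores
    X_corr = Xc - np.outer(t, p) # filtered matrix passed on to PLS
    return X_corr, t, p


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_demo = rng.normal(size=(20, 9))  # synthetic stand-in for 20 peptides x 9 descriptors
    y_demo = rng.normal(size=20)
    X_corr, t, p = osc_one_component(X_demo, y_demo)
    print("score-response inner product:", float((y_demo - y_demo.mean()) @ t))
```

Because the removed scores are orthogonal to the centered response, the covariance between X and y is unchanged by the filter while the total variation in X decreases.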

Results and Discussion

QSKR model

Orthogonal signal correction is a preprocessing technique, based on constrained principal component analysis, that removes information unrelated to the target variables. It has been reported to be a suitable preprocessing method for PLS calibration of mixtures in spectrophotometry, without loss of prediction capacity. In this study, OSC was used to pretreat the variable matrix X before the model was built, to improve its predictive ability. Then, a QSKR model was generated using the PLS method. The following procedures were performed in sequence for the QSKR analysis:

  • The descriptors were used to characterize the peptides. For the three mutated positions of the peptide antigen, nine descriptors were needed to characterize the structural modification.
  • The OSC method was used to pretreat the variable matrix X before the QSKR model was established.
  • The PLS method was used to perform the QSKR analysis.
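The final step of this procedure, PLS regression on a single response, can be sketched with the classical NIPALS-style PLS1 algorithm. This is a generic textbook formulation, assuming NumPy, and not a reproduction of the SIMCA-P implementation used in the study. The function names `pls1_fit` and `pls1_predict` are hypothetical; the default of two components mirrors the NC value listed in Table 4.

```python
import numpy as np


def pls1_fit(X, y, n_components=2):
    """Fit PLS1 and return (coefficients, intercept) on the original X scale."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean  # residual matrices, deflated per component
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f                # direction of maximal covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:           # nothing left to explain
            break
        w = w / norm
        t = E @ w                  # scores
        tt = t @ t
        p = E.T @ t / tt           # X loadings
        c = (f @ t) / tt           # y loading
        E = E - np.outer(t, p)     # deflate X
        f = f - c * t              # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return np.zeros(X.shape[1]), y_mean
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)  # B = W (P'W)^-1 q
    return coef, y_mean - x_mean @ coef


def pls1_predict(X, coef, intercept):
    """Predict the response for new descriptor rows."""
    return np.asarray(X, dtype=float) @ coef + intercept


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X_demo = rng.normal(size=(20, 9))  # synthetic stand-in for the descriptor matrix
    y_demo = X_demo[:, 4] - X_demo[:, 8] + rng.normal(scale=0.1, size=20)
    coef, intercept = pls1_fit(X_demo, y_demo, n_components=2)
    print(np.round(coef, 2))
```

In the workflow described above, the matrix passed to the fitting step would be the OSC-filtered descriptor matrix rather than the raw one.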

In the first step, all kinetic data (5,9) were collected for QSKR analysis (model 1). The statistical parameters of the QSKR model were Q² = 0.665 and R² = 0.725, slightly better than the reference results (Q² = 0.61, R² = 0.66). The plot of the observed versus calculated log kd values (Figure 1) indicates that the experimentally measured value deviates most from the prediction for peptide RCA, which is represented by box number 2. When the data for RCA were removed, the statistical parameters of the model (model 2) improved markedly (Q² = 0.771, R² = 0.847) compared with those obtained previously (5,9). Furthermore, the RMS value dropped from 0.311 to 0.237 when RCA was removed. Figure 2 shows the relationship between the observed and calculated log kd values generated by orthogonal signal correction–partial least squares.
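The text reports R², Q² and RMS without giving their formulas. The sketch below uses the usual definitions as an assumption: R² from the fitted values, Q² from cross-validated predictions, and RMS as the root mean squared residual with a plain 1/n average. Software packages may apply a different degrees-of-freedom correction, so the numbers need not match the reported ones exactly. The function names are hypothetical.

```python
import math


def r_squared(observed, calculated):
    """1 - SS_res / SS_tot for fitted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def q_squared(observed, cv_predicted):
    """Same formula as R^2, but fed with cross-validated predictions (PRESS)."""
    return r_squared(observed, cv_predicted)


def rms_error(observed, calculated):
    """Root mean squared residual (assumed 1/n averaging)."""
    ss_res = sum((o - c) ** 2 for o, c in zip(observed, calculated))
    return math.sqrt(ss_res / len(observed))


if __name__ == "__main__":
    # Test-set peptides DSA, AES, EGK and ENS as listed in Table 3.
    observed = [-1.95, -3.05, -1.70, -2.40]
    calculated = [-2.48, -3.05, -1.38, -2.53]
    print("RMS on the test set:", round(rms_error(observed, calculated), 3))
```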

Figure 1.  Calculated versus observed log kd for peptides in model 1.

Figure 2.  Calculated versus observed log kd for peptides in model 2.

In the second step, all the peptides except RCA were divided into two groups: a training set of 20 peptides used to build the model, and a test set of the remaining four peptides used to verify the model predictions. The training set was used to build model 3. The test set was used to validate the predictive ability of the QSKR model, and a good result was obtained (Q² = 0.823, R² = 0.90 and RMS = 0.198). The predicted log kd values are shown in Table 3. The comparison between the various QSAR models of log kd values is shown in Table 4. The relationship between calculated and observed kinetic data is shown in Figure 3 for the training set and in Figure 4 for the test set. To guard against over-fitting, the whole data set was randomly divided into a training set and a test set, and the procedure described above was repeated 10 times (model 4 to model 12). The results obtained from each model are shown in Appendix S1. The results indicated that this QSKR model not only fitted the data well but also predicted external samples well.
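The repeated random division into training and test sets can be sketched as follows. The peptide codes are those of Table 3 with peptide No. 2 left out; the helper `random_split`, the seeds and the number of repeats shown are illustrative assumptions, since the text does not state how the random partitions were drawn.

```python
import random

# The 24 peptides of Table 3 that remain after removing peptide No. 2.
PEPTIDES = [
    "NES", "DRK", "EES", "VQE", "QDF", "SAS", "RDG", "DYD", "MYT", "GSQ",
    "FGR", "SES", "SEA", "GRA", "GAK", "GDS", "AEL", "AKS", "ERS", "THS",
    "DSA", "AES", "EGK", "ENS",
]


def random_split(peptides, n_test=4, seed=None):
    """Return (training set, test set) from one random partition."""
    rng = random.Random(seed)  # a local generator keeps each split reproducible
    shuffled = list(peptides)
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    for seed in range(3):  # a few repeats, for illustration only
        train, test = random_split(PEPTIDES, n_test=4, seed=seed)
        print(f"split {seed}: {len(train)} training peptides, test set = {test}")
```

Each partition would then be passed through the same descriptor encoding, OSC filtering and PLS fitting steps, with the four held-out peptides used only for prediction.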

Table 4.   Comparison between quantitative structure–activity relationship models

Model  Descriptors     Method    NC   Q²     R²     RMS
1      ZZ-scales (9)   Nd        Nd   0.61   0.66   0.37
2      Model 1         OSC-PLS   2    0.67   0.73   0.31
3      Model 2         OSC-PLS   2    0.77   0.85   0.24
4      Model 3         OSC-PLS   2    0.82   0.90   0.20

  1. ‘Nd’ means no data were reported in the reference. OSC-PLS, orthogonal signal correction combined with partial least squares.

Figure 3.  Calculated versus observed log kd for the training set in model 3.

Figure 4.  Calculated versus observed log kd for the test set in model 3.

Model weight analysis

The relative importance of the descriptors in the QSKR model is shown in Figure 5. The contributions of the V142-2, V145-2, V145-3 and V146-3 descriptors were larger than those of the other descriptors, that of V145-3 in particular. The V142-2 and V145-2 descriptors refer to the net charge index of the side chains at positions 142 and 145, respectively, while the V145-3 and V146-3 descriptors refer to the hydrophobic parameter of the side chains at positions 145 and 146, respectively. This suggests that the net charge index and the hydrophobic parameter are the properties that most influence the log kd values. The net charge index of residues 142 and 145 and the hydrophobicity of residue 145 were positively correlated with the log kd values, whereas the hydrophobicity of residue 146 was negatively correlated with them. As slower dissociation corresponds to tighter binding, a peptide is expected to bind more tightly if the amino acids in positions 142 and 145 have a lower net charge index, residue 145 has low hydrophobicity, and residue 146 has high hydrophobicity.

Figure 5.  Plot of PLS loadings for log kd value (Vi-j represents Vj of amino acid i).

The proposed QSKR model can be used to elucidate the antigen–antibody interaction, to predict the kinetic properties of mutant peptides, to explain which structural properties influence the interaction between antigen and antibody, and to indicate how these properties should be changed to improve binding. The model was used to design eight new peptides in an effort to identify mutants that bind more tightly than those known at present. The peptide modifications were made as follows. Asp (D) and Cys (C) were substituted in position 142 because of their low net charge index; Arg (R) and Lys (K) were substituted in position 145 for their low net charge index and low hydrophobicity; Trp (W) and Ile (I) were substituted in position 146 for their high hydrophobicity. Model 3 was used to predict the kinetic parameters for these new peptides. The peptide sequences and the corresponding predicted log kd values are listed in Table 5. The modified peptides have lower predicted log kd values, which indicates that they are expected to bind the antibody more tightly than the peptides known at present.

Table 5.   Modified amino acid sequences and the corresponding predicted log kd values

Mutant  Calcd.   Mutant  Calcd.   Mutant  Calcd.   Mutant  Calcd.
DRW     −3.83    DKW     −3.84    CRW     −3.40    CKW     −3.40
DRK     −3.84    DKI     −3.84    CRI     −3.40    CKI     −3.41
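The design rule stated in the text, two candidate residues at each of the three varied positions, yields eight combinations. A minimal sketch of that enumeration follows; the names `CANDIDATES` and `design_peptides` are hypothetical and only restate the substitution choices given above.

```python
from itertools import product

# Candidate substitutions per position, as stated in the text.
CANDIDATES = {
    142: "DC",  # Asp, Cys: low net charge index
    145: "RK",  # Arg, Lys: low net charge index and low hydrophobicity
    146: "WI",  # Trp, Ile: high hydrophobicity
}


def design_peptides(candidates):
    """Enumerate every combination of one candidate residue per position."""
    ordered = [candidates[pos] for pos in sorted(candidates)]
    return ["".join(combo) for combo in product(*ordered)]


if __name__ == "__main__":
    designs = design_peptides(CANDIDATES)
    print(len(designs), designs)
```

Each three-letter code produced this way can be encoded with the nine side-chain descriptors and passed to a fitted model to obtain a predicted log kd.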

Conclusions

In this study, OSC was used as a preprocessing technique to pretreat the descriptor matrix and eliminate noise, so that the pretreated matrix could be taken as a new independent variable matrix for the PLS method to build a calibration model.

A set of nine descriptors was derived from a matrix of only three structural parameters of the natural amino acid side chains. A QSKR study was then performed based on OSC-PLS. The proposed QSKR model was validated, and it predicted that peptides will bind more tightly if the amino acids in positions 142 and 145 have a lower net charge index, residue 145 has low hydrophobicity, and residue 146 has high hydrophobicity. On the basis of this guidance, we designed eight new peptides that are predicted to bind the antibody more tightly than those currently known. In this study, we dealt with three positions in the recognized peptide region 134–151. This, however, does not mean that the other positions are unimportant to the antigen–antibody interaction; it only means that the other positions were not varied in the mutation design.

This study demonstrates that molecular descriptors developed for protein sequences can yield consistent QSKR models, which are informative with respect to structure–activity relationships. Together with the limited influence of accessory epitope residues on kinetics, the possible relationship between peptide conformation and kinetics contributes to making the Fab 57P–peptide interaction a difficult example for QSAR studies. Compared with residue size or hydrophobicity, peptide conformation is a property that is more difficult to control and quantify.

Acknowledgments

This study was supported by the National Natural Science Foundation of China (No. 60873103), the Key Project of Natural Science Foundation of China (No. 30830090), the Program for New Century Excellent Talents in University (No. NCET-06-0780), the Foundation for the Author of National Excellent Doctoral Dissertation of P.R. China (200776), and the Program for Innovative Research Team of Chongqing University of Technology.

References

  • 1. Ortiz A.R., Pisabarro M.T., Gago F. (1995) Prediction of drug binding affinities by comparative binding energy analysis. J Med Chem;38:2681–2691.
  • 2. Stanton D.T. (2000) Development of a quantitative structure-property relationship model for estimating normal boiling points of small multifunctional organic molecules. J Chem Inf Comput Sci;40:81–90.
  • 3. Xing L., Welsh W.J., Tong W. (1999) Comparison of estrogen receptor alpha and beta subtypes based on comparative molecular field analysis (CoMFA). SAR QSAR Environ Res;10:215–237.
  • 4. Hellberg S., Eriksson L., Jonsson J. (1991) Minimum analogue peptide sets (MAPS) for quantitative structure-activity relationships. Int J Pept Protein Res;37:414–424.
  • 5. Andersson K., Choulier L., Hamalainen M.D. (2001) Predicting the kinetics of peptide-antibody interactions using a multivariate experimental design of sequence and chemical space. J Mol Recognit;14:62–71.
  • 6. Chatellier J., Rauffer-Bruyere N., Van Regenmortel M.H.V. (1996a) Comparative interaction kinetics of two recombinant Fabs and of the corresponding antibodies directed to the coat protein of tobacco mosaic virus. J Mol Recognit;9:39–51.
  • 7. Sandberg M., Eriksson L., Jonsson J. (1998) New chemical descriptors relevant for the design of biologically active peptides. A multivariate characterization of 87 amino acids. J Med Chem;41:2481–2491.
  • 8. Deleage G., Roux B. (1987) An algorithm for protein secondary structure prediction based on class prediction. Protein Eng;1:289–294.
  • 9. Choulier L., Andersson K., Hamalainen M.D. (2002) QSAR studies applied to the prediction of antigen-antibody interaction kinetics as measured by BIACORE. Protein Eng;15:373–382.
  • 10. Lin Z.H., Long H.X., Zhu B., Wang Y.Q., Wu Y.Z. (2008) New descriptors of amino acids and their application to peptide QSAR study. Peptides;29:1798–1805.
  • 11. Altschuh D., Dubs M.-C., Weiss E. (1992) Determination of kinetic constants for the interaction between a monoclonal antibody and peptides using surface plasmon resonance. Biochemistry;31:6298–6304.
  • 12. Choulier L., Rauffer-Bruyere N., Ben Khalifa M. (1999) Kinetic analysis of the effect on Fab binding of identical substitutions in a peptide and its parent protein. Biochemistry;38:3530–3537.
  • 13. Rose G.D., Geslowitz A.R., Lesser G.J., Lee R.H. (1985) Hydrophobicity of amino acid residues in globular proteins. Science;229:834–838.
  • 14. Raevsky O.A. (1997) Conformation of 18-Crown-5 and Its Influence on Complexation with Alkali and Ammonium Cations: Why 18-Crown-5 Binds More Than 1000 Times Weaker Than 18C6. Phys Org Chem;10:405–410.
  • 15. Zhou P., Tian F.F., Wu S.R. (2006) Genetic algorithm-based virtual screening of combinative mode for peptide/protein. Acta Chim Sinica (in Chinese);64:691–697.
  • 16. Sjöblom A., Meili M., Sundbom M. (2000) The influence of humic substances on the speciation and bioavailability of dissolved mercury and methylmercury, measured as uptake by Chaoborus larvae and loss by volatilization. Sci Total Environ;16:115–124.
  • 17. Lindahl T.L., Lundahl T.H., Andersson C.A. (1999) APC-resistance is a risk factor for postoperative thromboembolism in elective replacement of the hip or knee--a prospective study. Thromb Haemost;81:18–21.
  • 18. Allison M.E., Fearon D.T. (2000) Enhanced immunogenicity of aldehyde-bearing antigens: a possible link between innate and adaptive immunity. Eur J Immunol;30:2881–2887.
  • 19. Ghasemi J., Niazi A., Kubista M. (2005) Thermodynamics study of the dimerization equilibria of rhodamine B and 6G in different ionic strengths by photometric titration and chemometrics method. Spectrochim Acta A Mol Biomol Spectrosc;62:649–656.
  • 20. Niazi A., Sadeghi M. (2006) PARAFAC and PLS applied to spectrophotometric determination of tetracycline in pharmaceutical formulation and biological fluids. Chem Pharm Bull;54:711–713.
  • 21. Niazi A., Ghasemi J., Yazdanipour A. (2007) Spectrophotometric simultaneous determination of nitrophenol isomers by orthogonal signal correction and partial least squares. J Hazard Mater;146:421–427.
  • 22. Popelier P.L.A., Chaudry U.A., Smith P.J. (2002) Quantum topological molecular similarity. Part 5. Further development with an application to the toxicity of polychlorinated dibenzo-p-dioxins (PCDDs). J Chem Soc Perkin Trans;2:1231–1237.

Supporting Information

Appendix S1. Results obtained from each of the random models.

Please note: Wiley-Blackwell is not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.

Filename                       Format   Size   Description
CBDD_1022_sm_appendixS1.doc             313K   Supporting info item
