Modeling and prediction of retention behavior of histidine-containing peptides in immobilized metal-affinity chromatography



Two kinds of structural characterization methods, local descriptors and global descriptors, were used to parameterize peptide structures, and several quantitative structure-retention relationship models were then constructed using partial least squares (PLS), least-squares support vector machine (LS-SVM) and Gaussian process (GP) coupled with genetic algorithm-based variable selection. These models were rigorously validated and systematically investigated using the criteria of Tropsha et al., Monte Carlo cross-validation and one-way analysis of variance. The results show that regression models constructed using nonlinear approaches such as LS-SVM and GP are more robust and predictive than those built with the linear PLS method. By including both linear and nonlinear terms in its covariance function, the GP can handle mixed linear and nonlinear relationships and thus performs better than LS-SVM. Investigation of the optimal GP model revealed that diverse properties contribute to the retention behavior of peptides in immobilized metal-affinity chromatography. In particular, coordination interaction, electrostatic factors, solvation effects and hydrogen bonding correlate significantly with peptide retention.
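The idea of a covariance function that mixes linear and nonlinear terms can be illustrated with a minimal sketch. The abstract does not give the exact form used in the study, so the kernel below (a dot-product term plus a squared-exponential term) is an assumption. The function names, hyperparameter values and synthetic data are likewise hypothetical and are not taken from the paper.

```python
import numpy as np

def mixed_kernel(A, B, w_lin=1.0, w_rbf=1.0, length=1.0):
    """Assumed covariance: linear (dot-product) term + squared-exponential term."""
    linear = w_lin * (A @ B.T)
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * (A @ B.T))
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative round-off
    rbf = w_rbf * np.exp(-0.5 * sq_dist / length**2)
    return linear + rbf

def gp_predict(X_train, y_train, X_test, noise=1e-2, **kernel_args):
    """Posterior mean of a zero-mean GP with fixed (untuned) hyperparameters."""
    K = mixed_kernel(X_train, X_train, **kernel_args)
    K = K + noise * np.eye(len(X_train))      # observation-noise variance
    K_star = mixed_kernel(X_test, X_train, **kernel_args)
    alpha = np.linalg.solve(K, y_train)       # solves K @ alpha = y
    return K_star @ alpha

# Synthetic stand-in for descriptors and retention values (NOT real peptide data):
# one descriptor acts linearly, the other nonlinearly.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(40, 2))
y = 1.5 * X[:, 0] + np.sin(2.0 * X[:, 1])

X_new = rng.uniform(-2.0, 2.0, size=(10, 2))
y_new = 1.5 * X_new[:, 0] + np.sin(2.0 * X_new[:, 1])

pred = gp_predict(X, y, X_new)
rmse = float(np.sqrt(np.mean((pred - y_new) ** 2)))
print(f"RMSE on held-out synthetic points: {rmse:.3f}")
```

In a real analysis the kernel weights, length scale and noise level would be estimated from the data (for example by maximizing the marginal likelihood) rather than fixed as they are here.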