Evaluation of algorithms for protein identification from sequence databases using mass spectrometry data



In this work, the commonly used algorithms for mass spectrometry-based protein identification, Mascot, MS-Fit, ProFound and SEQUEST, were studied with respect to the selectivity and sensitivity of their searches. The influence of various search parameters was also investigated. Approximately 6600 searches were performed using different search engines with several search parameters to establish a statistical basis. The mass spectrometric data set was taken from a current proteome study. This amount of data could only be handled with computational assistance. We present a software solution for fully automated triggering of several peptide mass fingerprinting (PMF) and peptide fragmentation fingerprinting (PFF) algorithms. The development of this high-throughput method made possible an intensive evaluation based on data acquired in a typical proteome project, whereas previous evaluations of PMF and PFF algorithms were mainly based on simulations.