Identifying protein complexes in protein–protein interaction networks by using clique seeds and graph entropy

  • Colour Online: See the article online to view Figs. 1–6 in colour.

Correspondence: Professor Fang-Xiang Wu, Division of Biomedical Engineering, and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, Canada

E-mail: faw341@mail.usask.ca

Fax: +1-306-966-5427

Abstract

The identification of protein complexes plays a key role in understanding major cellular processes and biological functions. Various computational algorithms have been proposed to identify protein complexes from protein–protein interaction (PPI) networks. In this paper, we first introduce a new seed-selection strategy for seed-growth style algorithms, in which cliques rather than individual vertices are employed as initial seeds. We then propose a result-modification approach based on this seed-selection strategy: predictions generated by higher order clique seeds are used to modify results generated by lower order ones. The performance of both the seed-selection strategy and the result-modification approach is tested with the entropy-based algorithm, which is currently the best seed-growth style algorithm for detecting protein complexes from PPI networks. In addition, we investigate four pairs of strategies for this algorithm in order to improve its accuracy. The numerical experiments are conducted on a Saccharomyces cerevisiae PPI network. The best group of predictions consists of 1711 clusters, with an average f-score of 0.68 after all similar and redundant clusters are removed. We conclude that higher order clique seeds generate predictions with higher accuracy and that our improved entropy-based algorithm outputs more reasonable predictions than the original one.
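To make the idea of growing a cluster from a clique seed concrete, the following is a minimal sketch and not the authors' implementation. It assumes a commonly used definition of graph entropy that the abstract itself does not state: for a candidate cluster, each vertex contributes the binary entropy of the fraction of its neighbours lying inside the cluster, and the graph entropy is the sum over all vertices. The growth rule (greedily add a bordering vertex whenever that lowers the entropy), the function names, and the toy network are all hypothetical choices made for illustration. The result-modification step and the four pairs of strategies are not covered.

```python
from itertools import combinations
from math import log2


def build_graph(edges):
    """Build an undirected graph as a dict mapping vertex -> set of neighbours."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def vertex_entropy(graph, cluster, v):
    """Binary entropy of the share of v's neighbours inside the cluster (assumed definition)."""
    nbrs = graph[v]
    if not nbrs:
        return 0.0
    p_in = len(nbrs & cluster) / len(nbrs)
    if p_in in (0.0, 1.0):
        return 0.0  # all neighbours on one side: no uncertainty
    p_out = 1.0 - p_in
    return -p_in * log2(p_in) - p_out * log2(p_out)


def graph_entropy(graph, cluster):
    """Sum of vertex entropies over every vertex in the graph."""
    return sum(vertex_entropy(graph, cluster, v) for v in graph)


def find_cliques(graph, k):
    """Brute-force enumeration of all k-vertex cliques; only suitable for small graphs."""
    return [
        frozenset(c)
        for c in combinations(sorted(graph), k)
        if all(b in graph[a] for a, b in combinations(c, 2))
    ]


def grow(graph, seed):
    """Greedily add bordering vertices while each addition lowers the graph entropy."""
    cluster = set(seed)
    best = graph_entropy(graph, cluster)
    improved = True
    while improved:
        improved = False
        # Vertices outside the cluster that are adjacent to at least one member.
        border = set().union(*(graph[v] for v in cluster)) - cluster
        for v in sorted(border):
            trial = graph_entropy(graph, cluster | {v})
            if trial < best:
                cluster.add(v)
                best = trial
                improved = True
    return cluster


# Hypothetical toy network: two 4-cliques joined by the single edge d-e.
toy_edges = [
    ("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
    ("e", "f"), ("e", "g"), ("e", "h"), ("f", "g"), ("f", "h"), ("g", "h"),
    ("d", "e"),
]
toy_graph = build_graph(toy_edges)
triangle_seeds = find_cliques(toy_graph, 3)
grown = grow(toy_graph, frozenset({"a", "b", "c"}))
```

In this toy network the triangle seed {a, b, c} absorbs d, completing the first 4-clique, and then stops, because adding e would raise the entropy of the vertices in the second clique. Starting from cliques of a chosen order k, as in `find_cliques(toy_graph, k)`, is the part that corresponds to the clique-seed strategy described above.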