Combining functional and topological properties to identify core modules in protein interaction networks
Article first published online: 22 JUN 2006
Copyright © 2006 Wiley-Liss, Inc.
Proteins: Structure, Function, and Bioinformatics
Volume 64, Issue 4, pages 948–959, 1 September 2006
How to Cite
Lubovac, Z., Gamalielsson, J. and Olsson, B. (2006), Combining functional and topological properties to identify core modules in protein interaction networks. Proteins, 64: 948–959. doi: 10.1002/prot.21071
- Issue published online: 2 AUG 2006
- Manuscript Accepted: 27 MAR 2006
- Manuscript Revised: 24 MAR 2006
- Manuscript Received: 25 AUG 2005
Keywords: systems biology; semantic similarity; Gene Ontology
Advances in large-scale proteomics technologies, such as yeast two-hybrid screening and mass spectrometry, have made it possible to generate large protein interaction networks (PINs). Recent methods for identifying dense sub-graphs in such networks have been based solely on graph-theoretic properties, so there is a need for an approach that combines domain-specific knowledge with topological properties to extract functionally relevant sub-graphs from large networks. This article describes two alternative network measures for the analysis of PINs that combine functional information with topological properties of the networks. These measures, called the weighted clustering coefficient and the weighted average nearest-neighbors degree, use weights that represent the strength of the interaction between two proteins. Each weight is calculated from the semantic similarity of the proteins, which is based on their Gene Ontology terms. We perform a global analysis of the yeast PIN by systematically comparing the weighted measures with their topological counterparts. To show the usefulness of the weighted measures, we develop an algorithm for the identification of functional modules, called SWEMODE (Semantic WEights for MODule Elucidation), that identifies dense sub-graphs containing functionally similar proteins. The method ranks nodes (i.e., proteins) by their weighted neighborhood cohesiveness and treats the highest-ranked nodes as seeds for candidate modules. The algorithm then iterates through the neighborhood of each seed protein to identify densely connected proteins with high functional similarity, according to the chosen parameters. Using a yeast two-hybrid data set of experimentally determined protein–protein interactions, we demonstrate that SWEMODE is able to identify dense clusters containing proteins that are functionally similar. Many of the identified modules correspond to known complexes or subunits of these complexes.
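
The abstract names the two weighted measures but does not give their formulas. As an illustration only, the sketch below uses one widely used strength-normalized formulation for weighted networks (the definitions introduced by Barrat and colleagues), which may differ in detail from the definitions in the article. The toy graph, the edge weights standing in for semantic similarity scores, and the helper names (build_graph, weighted_clustering, weighted_avg_nn_degree, rank_seeds) are all hypothetical. The seed ranking uses the weighted clustering coefficient as a stand-in for the weighted neighborhood cohesiveness mentioned above, which is an assumption and not the SWEMODE procedure itself.

```python
from itertools import combinations


def build_graph(edges):
    """Build an undirected weighted graph as a dict of dicts."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def strength(graph, node):
    """Sum of the weights of all edges incident to the node."""
    return sum(graph[node].values())


def weighted_clustering(graph, node):
    """Strength-normalized weighted clustering coefficient (assumed form).

    Each triangle through the node contributes the weights of the two
    edges that join the node to the triangle's other two members.
    """
    nbrs = list(graph[node])
    k = len(nbrs)
    s = strength(graph, node)
    if k < 2 or s == 0:
        return 0.0
    total = 0.0
    for j, h in combinations(nbrs, 2):
        if h in graph[j]:  # j and h interact, so the triangle is closed
            total += graph[node][j] + graph[node][h]
    return total / (s * (k - 1))


def weighted_avg_nn_degree(graph, node):
    """Neighbor degrees averaged with edge weights as the averaging weights."""
    s = strength(graph, node)
    if s == 0:
        return 0.0
    return sum(w * len(graph[j]) for j, w in graph[node].items()) / s


def rank_seeds(graph):
    """Rank nodes from most to least cohesive (stand-in criterion)."""
    return sorted(graph, key=lambda n: weighted_clustering(graph, n), reverse=True)


# Hypothetical toy network: weights play the role of semantic similarity.
g = build_graph([
    ("A", "B", 0.9),
    ("A", "C", 0.8),
    ("B", "C", 0.7),
    ("A", "D", 0.1),
])

for protein in rank_seeds(g):
    print(protein, weighted_clustering(g, protein), weighted_avg_nn_degree(g, protein))
```

In this toy network, protein A has three neighbors but only one closed triangle, so its unweighted clustering coefficient would be 1/3. The weighted value is higher because the triangle runs over A's two strongest edges, which is the kind of difference between weighted and topological measures that the abstract describes comparing.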