Keywords:

  • two-component systems;
  • mutual information;
  • molecular co-evolution;
  • machine learning

Abstract

The two-component system (TCS) is a signal transduction system that involves a histidine kinase (HK) and a response regulator (RR). Although up to hundreds of TCSs may operate in parallel in a bacterial cell, the high fidelity of TCS signaling is well maintained, minimizing irrelevant crosstalk between TCSs. When an HK gene and an RR gene of a given TCS occupy neighboring positions, it is almost certain that their protein products (i.e., HK and RR) are interacting partners. However, large bacterial genomes often have multiple HK genes and/or cognate RR genes that are not in neighboring positions. In many partially assembled genomes, some HK genes and RR genes belong to different contigs. In these cases, it is not clear which HK(s) and RR(s) interact. By combining information-theoretic and graph-theoretic approaches, we developed a computational method that identifies co-evolving residue pairs between HKs and cognate RRs and predicts the interacting HK:RR pairs for each TCS. In addition, we built a webserver, TCSppWWW (http://compath.org/platcom/tcs), that takes query sequences of pairing candidates and predicts their HK:RR pairing using precomputed models. The current release of TCSppWWW provides predictors for 48 TCSs using over 20,000 protein sequences from about 900 bacterial genomes. Three different types of predictors, using Random Forest, RBF Network, and Naïve Bayes, are provided. Once a set of HK and RR candidate sequences is submitted, TCSppWWW aligns the query sequences to the precomputed multiple sequence alignment of HK:RR pairs, extracts co-evolving column positions, and then returns prediction results with prediction margins and additional information. Proteins 2011. © 2010 Wiley-Liss, Inc.
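
As an illustration of the information-theoretic step, the sketch below scores co-evolution between alignment columns with plain Shannon mutual information. The abstract does not state which estimator, gap handling, or correction the authors used, so this is only an assumed, simplified variant. The function names (column_mutual_information, top_coevolving_pairs) and the toy sequences are hypothetical and are not part of TCSppWWW. It assumes that row i of the HK alignment and row i of the RR alignment come from a known interacting pair.

```python
import math
from collections import Counter


def column_mutual_information(col_a, col_b):
    """Shannon mutual information (in bits) between two alignment columns.

    col_a and col_b hold one residue per paired sequence, in the same order.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same paired sequences")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * math.log2(p_ab * n * n / (count_a[a] * count_b[b]))
    return mi


def top_coevolving_pairs(hk_alignment, rr_alignment, k=3):
    """Rank (HK column, RR column) index pairs by mutual information.

    Assumption: row i of hk_alignment pairs with row i of rr_alignment.
    Returns up to k tuples of (score, hk_column, rr_column), best first.
    """
    hk_cols = list(zip(*hk_alignment))
    rr_cols = list(zip(*rr_alignment))
    scored = [
        (column_mutual_information(h, r), i, j)
        for i, h in enumerate(hk_cols)
        for j, r in enumerate(rr_cols)
    ]
    scored.sort(key=lambda t: -t[0])  # stable sort keeps column order on ties
    return scored[:k]


# Toy, made-up aligned sequences (not real HK or RR proteins).
hk_toy = ["AKL", "AKL", "GRL", "GRL"]
rr_toy = ["DE", "DE", "NE", "NE"]
for score, hk_col, rr_col in top_coevolving_pairs(hk_toy, rr_toy, k=2):
    print(f"HK column {hk_col} ~ RR column {rr_col}: {score:.2f} bits")
```

High-scoring column pairs are candidates for co-evolving positions. In a pipeline like the one described, residues at such positions would serve as features for a classifier, but the feature encoding used by the published predictors is not given here.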