A key task in structural biology is to define a meaningful similarity measure for comparing protein structures. Graphs have recently become increasingly important as modeling tools for molecular data. In this context, kernel functions have attracted considerable attention, particularly because they make the rich repertoire of kernel-based machine learning methods applicable. However, most existing graph kernels have been designed for unlabeled and/or unweighted graphs, although proteins are often represented more naturally and more precisely as node-labeled and edge-weighted graphs. Here we analyze kernel-based protein comparison methods and propose extensions of existing graph kernels that exploit node labels and edge weights. Moreover, we propose an instance of the substructure fingerprint kernel suited to the analysis of protein binding sites. By using fuzzy fingerprints, we solve the problem of discontinuity at bin boundaries that arises in the case of labeled graphs.
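The bin-boundary discontinuity mentioned above can be illustrated with a small sketch. It is not the construction used in this work: the triangular membership function, the uniform bin spacing, and the names `hard_bin` and `fuzzy_bin` are assumptions chosen only for illustration. With crisp binning, two nearly identical values (for example, two edge weights) that fall on opposite sides of a bin boundary yield orthogonal fingerprint vectors; with fuzzy binning, each value contributes to its two nearest bins, so the vectors change continuously.

```python
import numpy as np

def hard_bin(value, edges):
    """Crisp binning: all mass goes to the single bin that contains value."""
    vec = np.zeros(len(edges) - 1)
    idx = np.searchsorted(edges, value, side="right") - 1
    idx = min(max(idx, 0), len(vec) - 1)  # clamp out-of-range values to the end bins
    vec[idx] = 1.0
    return vec

def fuzzy_bin(value, centers):
    """Fuzzy binning with triangular memberships (an assumed choice).

    Assumes uniformly spaced bin centers. The mass is shared linearly
    between the two nearest centers, so the output sums to 1.
    """
    centers = np.asarray(centers, dtype=float)
    width = centers[1] - centers[0]
    value = min(max(value, centers[0]), centers[-1])  # clamp to the covered range
    return np.clip(1.0 - np.abs(value - centers) / width, 0.0, None)

# Hypothetical bins over a distance-like quantity: [0, 2), [2, 4), [4, 6]
edges = np.array([0.0, 2.0, 4.0, 6.0])
centers = np.array([1.0, 3.0, 5.0])

a, b = 3.99, 4.01  # nearly identical values on opposite sides of the boundary at 4

hard_sim = float(hard_bin(a, edges) @ hard_bin(b, edges))
fuzzy_sim = float(fuzzy_bin(a, centers) @ fuzzy_bin(b, centers))
print("hard similarity: ", hard_sim)
print("fuzzy similarity:", fuzzy_sim)
```

In this toy setting the crisp fingerprints of `a` and `b` share no bin, whereas the fuzzy fingerprints are almost identical. A linear kernel on the fuzzy vectors therefore varies smoothly as a value crosses a boundary.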