Similarity searching is currently one of the most widely applied approaches to computationally screen large databases for novel active compounds, and molecular fingerprints are among the most popular search tools. Fingerprint searching has recently also been applied in chemical biology to identify compounds that are selective for one target within a group of related targets. In general, fingerprints are bit string representations of molecular structure and properties, but their design, size, and complexity often vary substantially. Like essentially all similarity search tools, fingerprints display a strong compound class dependence in their ability to identify active molecules and distinguish them from other database compounds. In practical applications, this limitation makes it very difficult to select or prioritize the fingerprints that are most suitable for a given search problem. We have previously (i) devised a Bayesian scoring scheme to combine fingerprints and molecular property descriptors for similarity searching and (ii) developed an information-theoretic approach to predict active compound recall rates for fingerprint searching. Herein, we combine these methods and present an approach for the prediction of compound recall in search calculations using Bayesian screening with molecular property descriptors, fingerprints, and their combination. For practical similarity search applications, this analysis is highly relevant because it makes it possible to identify the search methods that are most likely to be successful for a given compound activity class and screening database.

Copyright © 2009 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 2: 123-134, 2009