ConsDock: A new program for the consensus analysis of protein–ligand interactions



Protein-based virtual screening of chemical libraries is a powerful technique for identifying new molecules that may interact with a macromolecular target of interest. Because of docking and scoring limitations, it is more difficult to apply as a lead optimization method, which requires that the docking/scoring tool propose as few solutions as possible, all of them with very good accuracy for both the protein-bound orientation and the conformation of the ligand. Here, we present a consensus docking approach (ConsDock) that takes advantage of three widely used docking tools (Dock, FlexX, and Gold). The consensus analysis of all possible poses generated by several docking tools is performed sequentially in four steps: (i) hierarchical clustering of all poses generated by a docking tool into families, each represented by a leader; (ii) definition of all consensus pairs from leaders generated by different docking programs; (iii) clustering of consensus pairs into classes, each represented by a mean structure; and (iv) ranking of the different means, starting from the most populated class of consensus pairs. When applied to a test set of 100 protein–ligand complexes from the Protein Data Bank, ConsDock significantly outperforms single docking with respect to the docking accuracy of the top-ranked pose. In 60% of the cases investigated here, ConsDock ranked as the top solution a pose within 2 Å RMSD of the X-ray structure. It can be applied as a postprocessing filter to either single- or multiple-docking programs to prioritize three-dimensional guided lead optimization from the most likely docking solution. Proteins 2002;47:521–533. © 2002 Wiley-Liss, Inc.
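
The four consensus steps described above can be illustrated with a minimal Python sketch. It is not the ConsDock implementation, and several details are assumptions made only for illustration: poses are lists of (x, y, z) tuples with identical atom ordering in the receptor frame, RMSD is computed without superposition, the clustering is a simple complete-linkage scheme, the leader of a family is taken to be its best-scored member (poses are assumed to be listed best first), a consensus pair is defined by an RMSD cutoff between leaders of different tools, and all cutoffs default to a hypothetical 2.0 Å. The function names (`rmsd`, `mean_pose`, `cluster`, `consensus_rank`) are likewise hypothetical.

```python
import math
from itertools import combinations


def rmsd(a, b):
    """RMSD between two poses with identical atom ordering (no superposition)."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))


def mean_pose(poses):
    """Atom-wise average of several poses."""
    n = len(poses)
    return [tuple(sum(c) / n for c in zip(*atoms)) for atoms in zip(*poses)]


def cluster(items, dist, cutoff):
    """Complete-linkage agglomerative clustering, stopped at `cutoff`.

    Returns a list of clusters, each a list of indices into `items`.
    """
    clusters = [[i] for i in range(len(items))]

    def link(c1, c2):
        return max(dist(items[i], items[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        d, a, b = min((link(clusters[a], clusters[b]), a, b)
                      for a, b in combinations(range(len(clusters)), 2))
        if d > cutoff:
            break
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    return clusters


def consensus_rank(poses_by_tool, family_cutoff=2.0, pair_cutoff=2.0,
                   class_cutoff=2.0):
    """Return [(class population, mean pose), ...], most populated class first."""
    # (i) Cluster each tool's poses into families; keep one leader per family.
    #     Assumption: the leader is the best-scored (lowest-index) member.
    leaders = []
    for tool, poses in poses_by_tool.items():
        for family in cluster(poses, rmsd, family_cutoff):
            leaders.append((tool, poses[min(family)]))

    # (ii) Consensus pairs: close leaders that come from different tools.
    #      Each pair is stored as the mean of its two leaders.
    pairs = [mean_pose([p, q])
             for (t1, p), (t2, q) in combinations(leaders, 2)
             if t1 != t2 and rmsd(p, q) <= pair_cutoff]

    # (iii) Cluster the consensus pairs into classes.
    classes = cluster(pairs, rmsd, class_cutoff)

    # (iv) Rank the class means, most populated class first.
    ranked = sorted(classes, key=len, reverse=True)
    return [(len(c), mean_pose([pairs[i] for i in c])) for c in ranked]


# Toy example with two-atom "ligands" (coordinates are made up).
poses_by_tool = {
    "dock":  [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
              [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0)]],
    "flexx": [[(0.2, 0.0, 0.0), (1.2, 0.0, 0.0)],
              [(0.0, 10.2, 0.0), (1.0, 10.2, 0.0)]],
    "gold":  [[(0.0, 0.2, 0.0), (1.0, 0.2, 0.0)],
              [(0.0, 10.0, 0.0), (1.0, 10.0, 0.0)]],
}
for population, pose in consensus_rank(poses_by_tool):
    print(population, pose)
```

In this toy input, three tools agree on a pose near the origin and two agree on a second pose, so the first class is ranked ahead of the second. With real docking output the cutoffs and the choice of leader would need to follow the definitions given in the full paper.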