• Cluster analysis
• Descriptor
• Fingerprint
• Molecular diversity analysis
• Molecular similarity
• Similarity coefficient
• Similarity measure
• Similarity searching
• Structural similarity
• Structure representation
• Weighting scheme


Measures of structural similarity play an important role in chemoinformatics, in applications such as similarity searching, database clustering and molecular diversity analysis. A similarity measure comprises three components: a structure representation, a weighting scheme and a similarity coefficient. This paper introduces these components and describes methods for comparing different measures. The use of similarity measures in chemoinformatics research is illustrated by four recent projects in the author’s laboratory: the interactions between a weighting scheme and a similarity coefficient; the design of comparative studies of similarity measures; the use of 2D fingerprints for scaffold-hopping searches; and the registration of orphan drugs for rare diseases.
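
The three components of a similarity measure can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than a method described in this paper: the structure representation is a toy one (overlapping character pairs taken from a line-notation string, standing in for a real 2D fingerprint), the two weighting schemes are simple binary and occurrence-count weights, and the similarity coefficient is the widely used Tanimoto coefficient. The function names `fragments`, `binary_weights`, `count_weights` and `tanimoto` are hypothetical.

```python
from collections import Counter


def fragments(notation: str, n: int = 2) -> Counter:
    """Toy structure representation (an assumption): counts of the
    overlapping character n-grams in a line-notation string."""
    return Counter(notation[i:i + n] for i in range(len(notation) - n + 1))


def binary_weights(counts: Counter) -> dict:
    """Weighting scheme 1: a fragment is weighted 1 if present."""
    return {frag: 1.0 for frag in counts}


def count_weights(counts: Counter) -> dict:
    """Weighting scheme 2: a fragment is weighted by its occurrence count."""
    return {frag: float(c) for frag, c in counts.items()}


def tanimoto(a: dict, b: dict) -> float:
    """Similarity coefficient: Tanimoto in its general vector form,
    sum(a*b) / (sum(a*a) + sum(b*b) - sum(a*b))."""
    dot = sum(w * b.get(frag, 0.0) for frag, w in a.items())
    denom = sum(w * w for w in a.values()) + sum(w * w for w in b.values()) - dot
    return dot / denom if denom else 0.0


# Two illustrative strings (hypothetical inputs, not data from the paper).
x, y = fragments("CCCO"), fragments("CCO")

sim_binary = tanimoto(binary_weights(x), binary_weights(y))
sim_count = tanimoto(count_weights(x), count_weights(y))
print(sim_binary, sim_count)
```

With binary weights the two strings contain exactly the same fragments and so appear identical, whereas count weights distinguish them because one fragment occurs twice in the first string. This is a small instance of how the choice of weighting scheme can change the value returned by the same coefficient.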