• Genetic algorithm;
• Model distance and Q²LOO guided model selection;
• Structural similarity based consensus modeling;
• Average consensus modeling.



A novel strategy, “structural similarity based consensus modeling” (SSCM), built on a submodel set selected by “model distance and Q²LOO guided model selection” (MD-QGMS), is proposed, where Q²LOO is the leave-one-out cross-validated Q². SSCM rests on the hypothesis that similar compounds are most likely to be predicted most accurately by the same submodel within a model population. This hypothesis follows from the observation that models built on different descriptor sets predict compounds with particular structures more accurately. Applied to two different datasets, the SSCM strategy markedly improved the external prediction ability of QSAR models. The strategy may offer a new direction for developing more accurate predictive models.
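
The SSCM hypothesis can be illustrated with a minimal sketch. The abstract does not state the similarity measure, the number of neighbors, or the accuracy criterion, so the Tanimoto similarity on binary fingerprints, the parameter `k`, and the mean absolute error used below are all assumptions, and the function names and toy data are hypothetical. For a query compound, the sketch finds its most similar training compounds, picks the submodel that predicted those neighbors best, and reports that submodel's prediction; plain average consensus is included for comparison.

```python
import numpy as np

def tanimoto(a, b):
    """Tanimoto similarity between two binary (0/1) fingerprint vectors."""
    both = np.sum(a & b)
    either = np.sum(a | b)
    return both / either if either else 0.0

def sscm_predict(query_fp, train_fps, train_y,
                 submodel_train_preds, submodel_query_preds, k=2):
    """Pick the submodel that is most accurate on the k training
    compounds most similar to the query (assumed selection rule)."""
    sims = np.array([tanimoto(query_fp, fp) for fp in train_fps])
    neighbors = np.argsort(-sims, kind="stable")[:k]
    # Mean absolute error of every submodel on the neighbors only.
    errors = np.abs(
        submodel_train_preds[:, neighbors] - train_y[neighbors]
    ).mean(axis=1)
    best = int(np.argmin(errors))
    return float(submodel_query_preds[best]), best

def average_consensus(submodel_query_preds):
    """Average consensus: unweighted mean over all submodels."""
    return float(np.mean(submodel_query_preds))

# Hypothetical toy data: 4 training compounds, 2 submodels.
train_fps = np.array([[1, 1, 0, 0],
                      [1, 1, 1, 0],
                      [0, 0, 1, 1],
                      [0, 1, 1, 1]])
train_y = np.array([1.0, 1.2, 3.0, 3.1])
# Rows are submodels, columns are training compounds.
submodel_train_preds = np.array([[1.0, 1.1, 2.0, 2.2],   # good on compounds 0-1
                                 [2.0, 2.1, 3.0, 3.0]])  # good on compounds 2-3
query_fp = np.array([1, 1, 0, 0])
submodel_query_preds = np.array([1.1, 2.3])

sscm_value, chosen = sscm_predict(query_fp, train_fps, train_y,
                                  submodel_train_preds, submodel_query_preds)
avg_value = average_consensus(submodel_query_preds)
print(chosen, sscm_value, avg_value)
```

In this toy case the query resembles the first two training compounds, so the sketch selects the first submodel rather than averaging in a submodel that performed poorly on structurally similar compounds.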