On further application of $r_m^2$ as a metric for validation of QSAR models

Authors

  • Indrani Mitra,

    1. Drug Theoretics and Cheminformatics Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India
  • Partha Pratim Roy,

    1. Drug Theoretics and Cheminformatics Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India
  • Supratik Kar,

    1. Drug Theoretics and Cheminformatics Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India
  • Probir Kumar Ojha,

    1. Drug Theoretics and Cheminformatics Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India
  • Kunal Roy

    Corresponding author
    1. Drug Theoretics and Cheminformatics Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata 700 032, India

Abstract

Validation is a crucial aspect of quantitative structure–activity relationship (QSAR) model development. External validation is generally considered the most conclusive proof of the predictive capacity of a QSAR model. In the absence of a truly external data set, external validation is usually performed on test set compounds, which are members of the original data set but are not used in model development. With small data sets, QSAR researchers face two problems: the developed models may be less reliable because of the small number of training set compounds, and they may show poor external predictability because they may not have captured all the features required for the particular structure–activity relationship. The present paper attempts to show that the ‘true $r_m^2$(LOO)’ statistic, calculated from the model derived from the undivided data set with the variable selection strategy applied at each cycle of leave-one-out (LOO) validation, may reflect the external validation characteristics of the developed model, thus obviating the need to split the data set into training and test sets. This approach may be helpful for small data sets, as it uses all available data for model development and validation, thus making the resulting model more reliable. Copyright © 2009 John Wiley & Sons, Ltd.
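
The procedure described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the commonly used definition $r_m^2 = r^2\,(1 - \sqrt{|r^2 - r_0^2|})$, where $r^2$ and $r_0^2$ are the squared correlation coefficients between observed and predicted values with and without an intercept; this formula does not appear in the abstract. The descriptor selector (`select_descriptors`, which keeps the `k` descriptors most correlated with the response) and the multiple linear regression fit are hypothetical stand-ins for whatever variable selection strategy and model type a study actually uses. The essential point is that selection is repeated inside every LOO cycle, so the left-out compound never influences which descriptors are chosen.

```python
import numpy as np

def select_descriptors(X, y, k):
    """Hypothetical selector: indices of the k descriptors most correlated with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant descriptors get correlation 0 instead of a division by zero.
    corr = np.abs(Xc.T @ yc) / np.where(denom == 0, np.inf, denom)
    return np.argsort(corr)[::-1][:k]

def fit_mlr(X, y):
    """Ordinary least squares with intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def true_loo_predictions(X, y, k):
    """LOO predictions with variable selection redone in each cycle."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                     # leave compound i out
        cols = select_descriptors(X[keep], y[keep], k)
        coef = fit_mlr(X[keep][:, cols], y[keep])
        preds[i] = coef[0] + X[i, cols] @ coef[1:]   # predict the left-out compound
    return preds

def rm2(y_obs, y_pred):
    """r_m^2 = r^2 * (1 - sqrt(|r^2 - r0^2|)), under the assumed definition."""
    r2 = np.corrcoef(y_obs, y_pred)[0, 1] ** 2
    # r0^2: observed regressed on predicted through the origin.
    slope = (y_obs @ y_pred) / (y_pred @ y_pred)
    ss_res = ((y_obs - slope * y_pred) ** 2).sum()
    ss_tot = ((y_obs - y_obs.mean()) ** 2).sum()
    r02 = 1.0 - ss_res / ss_tot
    return r2 * (1.0 - np.sqrt(abs(r2 - r02)))
```

Calling `rm2(y, true_loo_predictions(X, y, k))` on the undivided data set gives a value that plays the role of the ‘true $r_m^2$(LOO)’ statistic in this sketch. Selecting descriptors once on the full data set and only refitting coefficients in each cycle would be a different and more optimistic calculation.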
