Simplified Molecular Input-Line Entry System and International Chemical Identifier in the QSAR Analysis of Styrylquinoline Derivatives as HIV-1 Integrase Inhibitors

Authors


Corresponding author: Andrey A. Toropov, andrey.toropov@marionegri.it

Abstract

The simplified molecular input-line entry system (SMILES) and the IUPAC International Chemical Identifier (InChI) were examined as representations of molecular structure for quantitative structure–activity relationships (QSAR), which can be used to predict the inhibitory activity of styrylquinoline derivatives against human immunodeficiency virus type 1 (HIV-1). The best model based on optimal SMILES-based descriptors has n = 26, r² = 0.6330, q² = 0.5812, s = 0.502, F = 41 for the training set and n = 10, r² = 0.7493, [inline image] = 0.6235, [inline image] = 0.537, s = 0.541, F = 24 for the validation set. The best model based on optimal InChI-based descriptors has n = 26, r² = 0.8673, q² = 0.8456, s = 0.302, F = 157 for the training set and n = 10, r² = 0.8562, [inline image] = 0.7715, [inline image] = 0.819, s = 0.329, F = 48 for the validation set. Thus, the InChI-based model is preferable. Both the SMILES-based and the InChI-based approaches were checked with five random splits into training and test sets.
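As a quick consistency check on the statistics quoted above, the following sketch recomputes the F ratio from the reported number of compounds and squared correlation coefficient. It assumes that each model is an ordinary least-squares regression on a single optimal descriptor (k = 1); the abstract does not state this explicitly, and the function name `f_statistic` is purely illustrative. Under that assumption, F = (r²/k) / ((1 − r²)/(n − k − 1)).

```python
def f_statistic(r2: float, n: int, k: int = 1) -> float:
    """F ratio of a least-squares model with k descriptors and n compounds.

    k = 1 is an assumption (one optimal descriptor per model); the abstract
    does not say how many descriptors the models use.
    """
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# (label, n, r2, F as reported in the abstract)
reported = [
    ("SMILES, training", 26, 0.6330, 41),
    ("SMILES, validation", 10, 0.7493, 24),
    ("InChI, training", 26, 0.8673, 157),
    ("InChI, validation", 10, 0.8562, 48),
]

for label, n, r2, f_reported in reported:
    f = f_statistic(r2, n)
    print(f"{label}: recomputed F = {f:.1f}, reported F = {f_reported}")
```

All four recomputed values round to the reported F, so the quoted figures are at least compatible with a one-descriptor linear model; this does not by itself confirm how the authors built their models.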
