Global Model for Octanol-Water Partition Coefficients from Proton Nuclear Magnetic Resonance Spectra



The ability to estimate chemical and physical properties from experimental spectra is highly desirable, as it eliminates the need for a priori knowledge of the exact chemical structure and allows property estimation for mixtures. Here we report a proof of principle that a predictive method for the octanol-water partition coefficient (logP) based on 1H NMR spectra in deuterated chloroform (CDCl3) is feasible and can yield accuracy comparable to that of in silico logP models. The Quantitative Spectrometric Data-Activity Relationship (QSDAR) reported here predicts the logP of neutral organic chemicals using descriptors derived only from 1H NMR chemical shifts, integrations and peak widths. Proton NMR spectra of 140 compounds with diverse structures were used to construct a multiple linear regression (MLR) model and a partial least squares (PLS) model for logP. The optimized models were validated internally by k-fold and leave-one-out cross-validation, and externally with a test set of 28 chemicals. The squared correlation coefficients of prediction for the MLR and PLS models were 0.970 and 0.971, respectively, showing that the method allows accurate prediction of logP values exclusively from predicted 1H NMR spectra.
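To make the validation scheme concrete, the following is a minimal sketch of leave-one-out cross-validation for a multiple linear regression model. It is not the authors' code: the descriptor matrix `X`, the coefficients in `true_coef`, and the noise level are synthetic placeholders rather than NMR-derived values, the helper names (`fit_mlr`, `predict`, `loo_q2`) are hypothetical, and the fit uses plain ordinary least squares in pure Python. The cross-validated q² is computed here as 1 - PRESS/SS_total with the overall mean of the responses, which is one common convention among several.

```python
import random

def fit_mlr(X, y):
    """Ordinary least squares with intercept, via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend intercept column
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(coef, x):
    """Predicted response: intercept plus weighted sum of descriptors."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

def loo_q2(X, y):
    """Leave-one-out q^2 = 1 - PRESS / SS_total (overall-mean convention)."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(y)):
        # Refit the model without sample i, then predict sample i
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, X[i])) ** 2
    ss_total = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - press / ss_total

# Synthetic stand-in data (assumption): 40 samples, 3 made-up descriptors.
rng = random.Random(0)
X = [[rng.uniform(0, 10), rng.uniform(0, 5), rng.uniform(0, 2)] for _ in range(40)]
true_coef = [0.5, 0.3, -0.4, 1.2]  # hypothetical intercept and slopes
y = [predict(true_coef, x) + rng.gauss(0, 0.1) for x in X]

coef = fit_mlr(X, y)
q2 = loo_q2(X, y)
print("fitted coefficients:", [round(c, 3) for c in coef])
print("leave-one-out q^2:", round(q2, 3))
```

A k-fold variant would follow the same pattern, holding out a block of samples per iteration instead of a single one. A PLS model would replace `fit_mlr` with a latent-variable fit and is usually taken from an established library rather than written by hand.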