A diverse set of 154 chemicals, including US Food and Drug Administration–regulated compounds, tested for aquatic toxicity in Daphnia magna was modeled by a 3-dimensional quantitative spectral data–activity relationship (3D-QSDAR). Two distinct algorithms, partial least squares (PLS) and Tanimoto similarity-based k-nearest neighbors (KNN), were used to process bin occupancy descriptor matrices obtained after tessellation of the 3D-QSDAR space into regularly sized bins. The performance of models utilizing bins ranging in size from 2 ppm × 2 ppm × 0.5 Å to 20 ppm × 20 ppm × 2.5 Å was explored. Rigorous quality-control criteria were imposed: 1) 100 randomized 20% hold-out test sets were generated and the average R2test of the respective models was used as a measure of their performance, and 2) a Y-scrambling procedure was used to identify chance correlations. A consensus between the best-performing composite PLS model using 14 ppm × 14 ppm × 0.5 Å bins and 10 latent variables (average R2test = 0.770) and the best composite KNN model using 8 ppm × 8 ppm × 0.5 Å bins and 2 neighbors (average R2test = 0.801) offered an improvement of about 7.5% (R2test consensus = 0.845). Projection of the most frequently occurring bins on the standard coordinate space indicated that aromatic systems substituted with a primary or secondary amino group would result in an increased toxic effect in Daphnia. The presence of a second aromatic ring with highly electronegative substituents 5 Å to 7 Å from the first ring would lead to a further increase in toxicity. Environ Toxicol Chem 2014;33:1271–1282. © 2014 SETAC
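As an illustration of the KNN and consensus steps described above, the following is a minimal sketch, not the authors' implementation. It assumes that each compound is represented by a vector of bin occupancy counts, that the count-based form of the Tanimoto coefficient is used, that a KNN prediction is the unweighted mean activity of the k most similar training compounds, and that the consensus is a simple average of the two model predictions. All function names and the toy data are hypothetical.

```python
from typing import List, Sequence


def tanimoto(a: Sequence[float], b: Sequence[float]) -> float:
    """Tanimoto similarity for non-negative vectors: a.b / (a.a + b.b - a.b)."""
    ab = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - ab
    # Two all-zero vectors have no occupied bins; treat them as dissimilar.
    return ab / denom if denom else 0.0


def knn_predict(train_X: List[Sequence[float]], train_y: Sequence[float],
                query: Sequence[float], k: int = 2) -> float:
    """Assumed scheme: unweighted mean activity of the k most similar compounds."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: tanimoto(pair[0], query),
                    reverse=True)
    top = ranked[:k]
    return sum(y for _, y in top) / len(top)


def consensus(pred_a: float, pred_b: float) -> float:
    """Assumed consensus: arithmetic mean of two model predictions."""
    return (pred_a + pred_b) / 2.0


# Hypothetical toy data: rows are compounds, columns are bin occupancy counts.
train_X = [[1, 0, 2, 0], [1, 0, 1, 0], [0, 3, 0, 1]]
train_y = [4.0, 5.0, 1.0]          # made-up activity values
query = [1, 0, 2, 0]

knn_value = knn_predict(train_X, train_y, query, k=2)
pls_value = 5.5                     # stand-in for a PLS model's prediction
print(knn_value, consensus(knn_value, pls_value))
```

With k = 2 the query is matched to the two training compounds that share its occupied bins, and the third compound, which has no bins in common, does not contribute to the prediction.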