Explainable Fragment-based Molecular Property Attribution

Introduction
The "AI & Drug Discovery" paradigm has accelerated drug research and development and made outstanding contributions to safeguarding human health. In general, drug discovery is the process of identifying chemical entities with potential druggability in response to urgent, unmet medical needs [1 (Sausville, 2012)]. However, complex experimental verification and a low success rate make this process extremely time-consuming and costly: it now takes about 10-15 years at an average cost of $2.5 billion [2 (DiMasi et al., 2016)]. With the rapid development of artificial intelligence, especially deep learning, major breakthroughs have occurred across the drug discovery process, including drug design, drug screening, and chemical synthesis, reducing the overall attrition rate and significantly improving discovery efficiency. Recently, AlphaFold2 [3 (Tunyasuvunakool et al., 2021)] was used to predict the 3D structure of proteins directly from their amino acid sequences with atomic-level accuracy, covering almost the entire human proteome (98.5% of human proteins). Deep models [4, 5, 6 (Schwaller et al., 2019; Tetko et al., 2020; Segler et al., 2018)] were also used to improve the prediction accuracy of retrosynthesis pathways. Because property assessment is a critical step in approving a candidate drug, deep models [7, 8, 9, 10, 11 (Shi et al., 2019; Wallach et al., 2015; Ji et al., 2018; Yuan et al., 2019; Zhang et al., 2021)] have assisted in predicting toxicity, physiological activity, and other properties, making molecular screening more efficient.
However, although deep learning greatly accelerates drug discovery, the "black box" nature [12 (Buhrmester et al., 2021)] of deep models seriously hinders further research on and application of existing methods. This "black box" nature is reflected in two aspects: the unexplainable decision route and the unexplainable prediction results of deep models. Developing explainable deep learning methods for drug discovery is therefore important for two reasons. First, it provides transparent property predictions that can be reviewed, which matters because model results can directly affect the life and death of patients. Second, it explains the decision logic of deep models in property prediction, molecular synthesis, and other tasks, helping experts understand the predictions and identify the factors that matter for a given task.
Recently, a few works have attempted to explain the relationship between predicted results and the original molecules. Xu et al. (2017) [13 (Xu et al., 2017)] proposed a framework for acute oral toxicity (AOT) prediction based on convolutional neural networks and attempted to use deep molecular fingerprints to mine vital substructures related to AOT. Wu et al. (2021) [14 (Wu et al., 2021)] proposed a multi-task graph attention (MGA) framework for toxicity prediction and also tried to use the attention mechanism to explore the most critical structural information. However, these two methods offer only a few examples from one or two specific tasks to verify interpretability, which is inevitably subjective. Moreover, their main drawback is that both essentially revolve around relationships between atoms, which is inconsistent with existing scientific findings [15, 16, 17, 18, 19, 20 (Nelson, 1982; Kalgutkar, 2019; Limban et al., 2018; Hemmerich and Ecker, 2020; Al-Hasani and Bruchas, 2011; Smith, 2011)].
In fact, these studies [15, 16, 17, 18, 19, 20 (Nelson, 1982; Kalgutkar, 2019; Limban et al., 2018; Hemmerich and Ecker, 2020; Al-Hasani and Bruchas, 2011; Smith, 2011)] reveal that molecular properties are directly affected by specific fragments. Smith (2011) [20 (Smith, 2011)] described the molecular mechanism of carcinogenesis in terms of electrophilic fragments or fragments that can give rise to electrophilic intermediates. Typical genotoxins, such as aromatic amines, are believed to cause mutations because they are converted into highly electrophilic species that form strong covalent bonds with DNA, producing aromatic amine-DNA adducts that prevent accurate replication. Therefore, common electrophiles should be avoided unless a specific covalent drug-protein interaction is the intended mechanism.
Based on the above fact, we propose a novel fragment-based molecular property attribution method (Fig. 2) to explore the relevance between molecular properties and specific fragments. A total of 365 specific molecular fragments were obtained from 42 property tasks with the devised gradient attribution technique ("Step I" in Fig. 2). Statistical analysis and known scientific mechanisms support the reliability of these attribution results: about 90% of all the attribution fragments obtained by our method show potential relevance to specific properties, and on six classical side-effect tasks the attribution fragments of randomly selected positive molecules closely match the structures reported in the literature. Meanwhile, based on the constructed property-fragment relation map, we also show that the attribution fragments can measure the relationships between property tasks ("Step IV" and "Step V"); quantitatively, these agree with the inherent relationships to a degree of about 80%.

Results
Potential relevance of specific fragments to the molecular property. We verified the reliability of the obtained attribution fragments through statistical analysis and mechanism explanations.
Statistical description. A total of 365 attribution fragments from 42 property tasks in four datasets were obtained with the proposed fragment-based attribution method. As shown in Fig. 3A, 325 (89%) fragments are more sensitive to the positive class of the corresponding property task (the highest difference value reached 0.70); among them, 257 fragments make a positive contribution on their own, and 68 fragments play an important role jointly, in combination with other fragments of the same task. The remaining 40 (11%) fragments are more sensitive to the negative class but have low difference values, and those with the largest magnitude are concentrated in one particular task. The calculation of the "Difference" value is illustrated in Fig. 3B for the "Sulfur-Nitrogen" bonding fragment. On a new dataset split into 121 positive and 23 negative samples, the probabilities that the fragment matches a positive and a negative sample are 0.07 and 0, respectively. The difference between the two probabilities (0.07 - 0 = 0.07) is taken as the statistical contribution of the "Sulfur-Nitrogen" fragment.
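The "Difference" value described above can be illustrated with a minimal sketch. It assumes that fragment occurrence has already been determined for each molecule (for example by a substructure search) and is given as a list of booleans; the function names and the match count of 8 are hypothetical, chosen only so that the result mirrors the "Sulfur-Nitrogen" example.

```python
from typing import Sequence


def match_probability(matches: Sequence[bool]) -> float:
    """Fraction of molecules in which the fragment occurs."""
    if not matches:
        raise ValueError("need at least one molecule")
    return sum(matches) / len(matches)


def difference_value(pos_matches: Sequence[bool], neg_matches: Sequence[bool]) -> float:
    """Match probability on positives minus match probability on negatives.

    The value lies in [-1, 1]; a higher value means the fragment occurs
    proportionally more often in positive molecules.
    """
    return match_probability(pos_matches) - match_probability(neg_matches)


# Hypothetical occurrence flags: 8 of 121 positives match, 0 of 23 negatives.
pos = [True] * 8 + [False] * 113
neg = [False] * 23
diff = difference_value(pos, neg)
print(round(diff, 2))  # 0.07
```

A negative result would correspond to the 11% of fragments that occur more often in negative molecules.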
Mechanism verification. To verify the attribution results from a mechanistic perspective, we first identified the "Ground Truth" mechanism fragments, that is, the specific molecular fragments that directly affect the properties. They can also be regarded as structural alerts for many different properties. Our first strategy was to search related databases. The online database ToxAlerts [23 (Sushko et al., 2012)], which collects structural alerts for several tasks, served as one source of supporting data for the verification. The second strategy was to query the mechanisms reported in the literature and analyze the important fragments they identify. For example, Reddy et al. (2015) [24 (Reddy et al., 2015)] noted that the mechanism of genotoxicity is the attack of electrophilic groups on DNA, so electrophilic structures in a compound are considered toxic structural groups. By combining the database contents with cross-checks against the literature on drug toxicological mechanisms, we compiled most of the identified "Ground Truth" fragments for all the property tasks to be verified.
We randomly selected several high-confidence positive samples from six classic side-effect tasks, obtained their attribution results, and found that the attribution fragments produced by our fragment-based method overlap strongly with the "Ground Truth" fragments given in the literature. As shown in Fig. 3C, the two molecules contain known structural alert fragments [21, 22 (Smith et al., 2003; Usui et al., 2009)] (red highlights in the left column) for the Hepatobiliary side-effect task. For the first molecule, the attribution fragments with top-0, top-1, and top-2 confidence (the three red highlights in the first row of the right column) fully cover the structure reported in the literature [21 (Smith et al., 2003)]. For the other molecule, the attribution fragments also match the "Ground Truth" fragment [22 (Usui et al., 2009)], through the top-0 fragment and the somewhat lower-ranked top-5 fragment. More mechanism verification results are shown in Figures S2-S21 (Supporting Information).
Molecular properties and their specific relevance due to the respective fragments. After obtaining the attribution fragments for each property task, we measured the property relationships (Fig. 4A) by the similarity of the fragment sequences between each pair of tasks. We verified this measurement against transfer learning [30 (Pan and Yang, 2009)] results extended to all 42 property tasks. On the one hand, tasks within the same dataset are more closely related than tasks from different datasets (shown by the dividers in Fig. 4A). On the other hand, we compared the transferability results of the 42 tasks with the fragment similarity results, using cosine similarity to quantify the agreement between the two. The resulting similarity is 0.76, which suggests that the relationships between property tasks derived from attribution fragments are largely reliable. Based on this task relationship measurement, a property relation map was constructed for the 42 tasks (Fig. 4B). For example, the androgen receptor plays a crucial role in AR-dependent prostate cancer and other androgen-related diseases, potentially disrupting normal endocrine function and causing endocrine toxicity [31 (Sakkiah et al., 2018)]. This mechanism implies that the "NR-AR" property task is closely related to the "Endocrine disorders" task, and the same relationship appears in the proposed relation map (orange ball with the number "3" and yellow ball with the number "27"). Therefore, the obtained attribution fragments can reveal the relationships between property tasks and can assist in designing property tasks that lack adequate and diversified samples.
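The comparison above can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify how fragment-sequence similarity is computed, so Jaccard similarity between fragment sets is swapped in here, and the fragment labels and transferability scores are hypothetical toy values.

```python
import math
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Overlap of two fragment sets (assumed stand-in for sequence similarity)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_similarity(task_fragments: dict) -> dict:
    """Similarity for every unordered pair of tasks, keyed by sorted task names."""
    return {
        (s, t): jaccard(task_fragments[s], task_fragments[t])
        for s, t in combinations(sorted(task_fragments), 2)
    }


def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two equally long score vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


# Hypothetical attribution fragments per task (labels are illustrative only).
task_fragments = {
    "NR-AR": {"C=O", "c1ccccc1", "CN"},
    "Endocrine disorders": {"C=O", "c1ccccc1", "CS"},
    "BBBP": {"CO"},
}
frag_sim = pairwise_similarity(task_fragments)

# Hypothetical transferability scores for the same task pairs.
transfer = {
    ("BBBP", "Endocrine disorders"): 0.1,
    ("BBBP", "NR-AR"): 0.2,
    ("Endocrine disorders", "NR-AR"): 0.8,
}
pairs = sorted(frag_sim)
agreement = cosine_similarity(
    [frag_sim[p] for p in pairs], [transfer[p] for p in pairs]
)
print(round(agreement, 2))
```

Flattening both relationship matrices over the same ordered list of task pairs keeps the two vectors aligned, which is what makes a single cosine score meaningful.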
Figure 3: A) The vertical axis "Difference" denotes the difference between the probability of a positive match and that of a negative match, and ranges from -1 to 1. A higher value means the fragment occurs in a higher proportion of positive molecules than negative molecules, which indicates that the fragment is more strongly relevant to the specific property. B) The calculation process of the difference value. The red highlights denote that the attribution fragments occur in the tested samples. C) Mechanism description of attribution fragments. The left column shows the "Ground Truth" fragments with red highlights, and References [21, 22 (Smith et al., 2003; Usui et al., 2009)] give the related literature. The right column displays several obtained attribution fragments with red highlights. "Top-k" denotes that the fragment ranks k-th highest in the overall results for the molecule.
Figure 4: Property relation map with attribution fragments. A) Property task relationships measured with attribution fragments. The left part is the property relationship result, where the divider lines (from top left to bottom right) separate the property tasks of the four datasets (BBBP, ClinTox, Tox21, and Sider), and the numbers at the top left are the indices of the property tasks described in detail in C. The right part gives the scale of the relationship values (from 0 to 1); a higher value denotes a closer relationship between two tasks. B) The property relation map of all forty-two property tasks. Each ball represents one property task, and the number on the ball corresponds to the task index in C. A shorter distance between two balls denotes a closer relationship between the two property tasks, and a longer distance denotes a weaker one. C) The description of all forty-two property tasks with the corresponding indices.
Effectiveness of the fragment-based molecular property attribution. The above two discoveries were based on attribution results of the proposed method. Meanwhile, the statistical description and mechanism verification also implicitly verified the effectiveness of the fragment-based molecular property attribution. In this section, we further demonstrate the effectiveness and advantages of the proposed attribution method from the following aspects: prediction performance, the spatial distribution of positive and negative molecule activations, and the ability to distinguish different positive causes.
Property prediction achieves better performance with fragments than with atoms. We trained two graph convolutional networks (GCNs) [32 (Kipf and Welling, 2016)], one atom-based and one fragment-based, on the property prediction tasks. The essential difference between the two training strategies is that the fragment-based method exchanges feature information with the specific fragments of the molecule as the smallest unit, whereas the atom-based method uses only atoms. We divided the dataset of each task into training, validation, and test subsets following an 8:1:1 ratio, used respectively to train the network, select hyperparameters, and test the prediction performance. All other training settings are the same. As shown in Fig. 5A, the fragment-based method performs better than the atom-based GCN on the new dataset for many property tasks. There is a remarkable improvement in the AUROC score, from 0.700 to 0.815, on the ClinTox dataset with its two property tasks. Meanwhile, slight improvements are observed on other tasks of the Tox21 and Sider datasets.
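For readers unfamiliar with the AUROC metric used in this comparison, the following minimal sketch computes it from its pairwise definition: the probability that a randomly chosen positive molecule receives a higher score than a randomly chosen negative one, with ties counted as one half. The labels and scores are hypothetical, and this is not the evaluation code used in the experiments.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    # Each positive/negative pair scores 1 if ranked correctly, 0.5 on a tie.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for six molecules.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
print(auroc(labels, scores))  # 7 of 9 pairs are ranked correctly
```

The quadratic pairwise loop is chosen for clarity; a rank-based computation gives the same value more efficiently on large test sets.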
The spatial distribution of positive and negative activations can reveal the reason for the success of the method. We randomly selected 150 molecules with positive labels and 150 with negative labels (the dataset itself limits the number of molecules that can be selected) and obtained the gradient activations of each molecule with the devised gradient attribution technique; these activations are used to represent the molecules here. The 300 high-dimensional activations were then mapped into two-dimensional space with the t-SNE method [33 (Van der Maaten and Hinton, 2008)] for visualization. As shown in Fig. 5B and Fig. S1 (Supporting Information), the gradient activations of positive (blue dots) and negative (red dots) samples are separated and form clusters. Although there is a small amount of overlap where the two clusters meet, the boundary between them is still clearly visible. Therefore, the spatial distribution of the gradient activations of positive and negative samples indicates that our gradient attribution method can capture the critical internal factors for specific property tasks. The full results on the 12 tasks of Tox21 are shown in Fig. S1 (Supporting Information).
The high-dimensional representation output by the model can distinguish different positive causes for a property task. We randomly selected 200 positive molecule samples for the Hepatobiliary task. For each molecule, the model outputs a high-dimensional vector, which represents the molecule throughout the prediction process. These high-dimensional representations were then projected with the t-SNE method [33 (Van der Maaten and Hinton, 2008)], and we found that the 200 molecules fall mainly into several clusters. As shown in Fig. 5C, we display some of the molecules in these clusters ((I)-(VI)), for which the mechanisms were verified. The high-response attribution fragments of each molecule are highlighted in the color of its cluster, and the attribution fragments within the same cluster are the same or quite similar. Therefore, the representations of these positive molecules are closely related to the property task, which demonstrates the effectiveness of our method in exploring the positive causes of each property task.
Figure 5: A) For each property task, the left bar denotes the performance of the atom-based method, and the right one denotes the performance of the fragment-based method. The same color denotes that the tasks belong to the same dataset, and the error line represents the variance of prediction performance. B) Spatial distribution of positive and negative molecule activations. The blue points denote activations of positive molecules, and the red ones denote those of negative molecules. C) Representation clusters of the positive molecules in the "Hepatobiliary" task. The grey-green, orange, blue, grey, red, and wine red highlights denote the attribution fragments of the molecule clusters (I), (II), (III), (IV), (V), and (VI) in the "Molecule Representation Space", respectively, and References [21, 22, 25, 26, 27, 28, 29 (Smith et al., 2003; Usui et al., 2009; Remmel et al., 2007; Uetrecht et al., 1995; Geneve et al., 1987; Stepan et al., 2011; Walgren et al., 2005)] give the related mechanism literature.

Discussion
Direct effect of attribution fragments on the corresponding property tasks. The above experimental results demonstrate the reliability of the attribution fragments. On the one hand, from a statistical perspective, we obtained a total of 365 attribution fragments for all forty-two property tasks. For each fragment, the difference between its probabilities of occurrence in the new positive and negative datasets indicates its relevance to the property task. We found that about 90% of the attribution fragments have positive relevance to the corresponding property tasks. On the other hand, six classic side-effect property tasks were verified from the mechanistic perspective. For several positive samples randomly selected in each task, we searched the literature for the corresponding exact substructures and compared them with the obtained attribution fragments. The comparison results (Fig. 3 and Fig. S2-S21 in the Supporting Information) show that the attribution method can give accurate fragments with high confidence, which offers valuable guidance for experts. In general, the most important implication of these attribution findings is that the specific fragments of a particular property task can be generalized to improve many downstream tasks. Examples include generating molecules with specific functions in molecular design and serving as prior knowledge to improve the accuracy of property prediction, thereby reducing the attrition rate in the drug discovery process, shortening the cycle of synthesis and testing, and supporting objective molecular design that reduces human-induced bias.
Meanwhile, we also found that the specific fragments reveal the internal relationships between property tasks, which were verified by the transferability between property tasks. These relationships can be used to make a unified judgment on closely related tasks. In the above section, we discussed the close relationship between the "NR-AR" and "Endocrine disorders" property tasks (orange ball with the number "3" and yellow ball with the number "27" in Fig. 4B), which has been described in the literature [31 (Sakkiah et al., 2018)]. In addition, the cytochrome P450 enzyme system (CYP450) in the liver is the main enzyme system for metabolizing drugs in the body, so the liver is closely related to the metabolic system [34 (Leise et al., 2014)]; this corresponds to the "Hepatobiliary disorders" and "Metabolism and nutrition disorders" property tasks (yellow balls with the numbers "15" and "16"). Another case is that estrogen receptors play an important role in pregnancy and may lead to tumor diseases such as uterine cancer and breast cancer [35 (Lambertini et al., 2018)], and the relationship between "NR-ER" and "Pregnancy, puerperium and perinatal conditions" also appears in the relation map (orange ball with the number "7" and yellow ball with the number "37"). Based on this relation map, we can provide guidance for exploring the relationship between drug molecules and properties more quickly. For example, if a new drug has an important effect on a certain property, more attention should be paid to the properties closely related to it, because the drug molecule has a high probability of having a similar effect on them. Meanwhile, the obtained property relation map can also guide higher-performance model transfer, thereby promoting the development of "AI & property prediction".
Advantages of our fragment-based method compared to other methods. We first compare the atom-based attribution method with our fragment-based attribution method. For six classical task cases, Fig. 3C and Fig. S2-S21 (Supporting Information) show the structures given in the pharmacological literature (that is, the "Ground Truth" fragments that activate the related property), the attribution results obtained by the atom-based attribution method, and the results obtained by the fragment-based attribution method. In general, our method often obtains more accurate results than the atom-based method, in agreement with the basic fact that molecular properties are closely related to specific fragments in the molecules [15, 16, 17, 18, 19, 20 (Nelson, 1982; Kalgutkar, 2019; Limban et al., 2018; Hemmerich and Ecker, 2020; Al-Hasani and Bruchas, 2011; Smith, 2011)]. Admittedly, predicting molecular properties by considering only the relationships between atoms can yield high-dimensional features that accurately represent the molecular graph through information exchange between atoms, and can thereby complete the prediction task. However, after multiple rounds of information exchange in the model, the information of the atoms belonging to one region (i.e., a fragment) is mixed into the surrounding atoms, so it is almost impossible to accurately locate all the atoms that cover the "Ground Truth" fragment in this processing mode. As shown in Fig. S2-S21 (Supporting Information), the high-confidence atoms are widely scattered and therefore cannot give a reasonable explanation for the "Ground Truth" fragment. Our method first divides the whole molecule into fragments by splitting the molecular tree and then uses the fragment as the smallest unit to build a new molecular graph for prediction. This processing has two benefits. First, it considers the role of both the atoms and the bonds, which are combined into a whole fragment to explore the cause of a positive label, in line with established findings. Second, as a guide to critical structures, taking the fragment as the smallest unit of the model lets us directly locate the fragments that significantly affect the properties, instead of manually extracting the region around the most critical atoms or expecting the activated atoms to cluster together. Experimental results (Fig. S2-S21 in the Supporting Information) show that the critical substructures extracted by our method match the "Ground Truth" with high accuracy. Therefore, as a computer-aided localization method, the fragment-based method is more robust than the atom-based method for guiding experts.
Meanwhile, we analyze the differences between the MGA framework [14 (Wu et al., 2021)] and our method in terms of prediction performance and the capability of mining crucial substructures on the above-mentioned six side-effect property tasks. The MGA framework aims to improve the performance of toxicity prediction and also tries to use the attention mechanism to explore the most critical structural information. As shown in Fig. S22 (Supporting Information), our method performs on par with MGA and even outperforms it on some tasks. More importantly, our method demonstrates stronger interpretability on most property tasks when the prediction performances are nearly equal. We present the comparative results of the two methods in mining crucial substructures related to properties (Fig. S23-S28 in the Supporting Information). The localization of property-related substructures by our method is clearly more accurate than the results given by MGA. MGA's localization strategy is to find the atom with the largest attention weight and use it as the center point to delineate the surrounding area as the property-related substructure. However, this positioning strategy is not suitable for practical use, for two reasons. On the one hand, the atom-based strategy of MGA lacks general credibility. MGA focuses only on the atom with the largest attention weight and, by default, regards the substructure formed by the area around this atom as the factor that affects the properties, which lacks a scientific basis. In contrast, our fragment-based strategy is based on many existing scientific findings [15, 16, 17, 18, 19, 20 (Nelson, 1982; Kalgutkar, 2019; Limban et al., 2018; Hemmerich and Ecker, 2020; Al-Hasani and Bruchas, 2011; Smith, 2011)] and considers multifaceted effects in the substructure mining process. On the other hand, the positioning strategy of MGA cannot find all the crucial substructures. A molecule may contain two or more substructures related to the property, but MGA chooses only the most crucial atom as the center to delineate the substructure and therefore cannot provide multiple results. Our method takes into account the overall effect of the Top-k output fragments (k represents the number of selected attribution fragments) and is thus able to output all correct results with high confidence (Fig. S24 in the Supporting Information).
Fragment combination strategy for the molecule decomposition dilemma. Junction Tree [36 (Jin et al., 2018)] is a promising decomposition method for molecular design, and we used this widely recognized method to implement our fragment-based method. However, owing to the limitation of static molecular tree decomposition, only simple rings and diatomic fragments can be extracted. When the "Ground Truth" fragments are complex structures, the method can attribute only to part of the whole structure. Meanwhile, because the "Ground Truth" fragments are specific to each property task, it is difficult to devise a straightforward task-specific splitting scheme, which is an inherent limitation.
Fortunately, there is an effective strategy for coping with this situation, because fragments with different confidence levels can be attributed to a positive molecule. In general, fragments with higher confidence tend to have a more significant impact on the property, so the higher-ranked fragments are combined in our method. As shown in Fig. 3C, the top-0, top-1, and top-2 attribution fragments of the first molecule together form the "Ground Truth" fragment. Other combinations are similar: in the second row of Fig. 3C, the top-0 and top-5 attribution results together form the "Ground Truth". Although the top-5 result is less reliable, the top-0 fragment, which is closely related to the Ground Truth, is assigned greater confidence. Therefore, we call this situation a guiding attribution result, whereas the situation mentioned above, in which the required fragments all sit at the top confidences, is called accurate attribution.
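The distinction between accurate and guiding attribution can be made concrete with a small sketch. It is an interpretation rather than the procedure used in the experiments: fragments are assumed to be sets of atom indices sorted by decreasing confidence, and the function name, the "incomplete" label, and the toy atom indices are hypothetical.

```python
def classify_attribution(ranked_fragments, ground_truth):
    """Combine ranked fragments until they cover the ground-truth atoms.

    ranked_fragments: list of sets of atom indices, index 0 = highest confidence.
    ground_truth: set of atom indices of the "Ground Truth" fragment.
    Returns a label and the ranks of the fragments that were combined.
    """
    remaining = set(ground_truth)
    used = []
    for rank, fragment in enumerate(ranked_fragments):
        if remaining & fragment:
            # This fragment contributes atoms that are still uncovered.
            used.append(rank)
            remaining -= fragment
        if not remaining:
            break
    if remaining:
        return "incomplete", used
    # Accurate: the required fragments occupy ranks 0, 1, ..., m-1 without gaps.
    label = "accurate" if used == list(range(len(used))) else "guiding"
    return label, used


ground_truth = {0, 1, 2, 3, 4}
# Toy case resembling the first molecule: top-0, top-1, top-2 form the target.
first = [{0, 1}, {2, 3}, {4}, {7, 8}]
# Toy case resembling the second molecule: top-0 and top-5 form the target.
second = [{0, 1, 2}, {9}, {10}, {11}, {12}, {3, 4}]
print(classify_attribution(first, ground_truth))
print(classify_attribution(second, ground_truth))
```

Under this reading, a gap in the ranks of the combined fragments is what separates a guiding result from an accurate one.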
Top-ranked fragment selection to effectively avoid attribution bias. The obtained attribution fragments have a certain degree of bias, which means that a small number of structures among all the attribution fragments are not related to the property task. The number of attribution fragments differs between positive test molecules, and each attribution fragment has a degree of confidence. In general, fragments with high confidence contribute more to the positive property. As shown in Tab. 1, all the results are positive numbers, which means that the overall attribution effect favors the positive label. Meanwhile, the metrics on almost all property tasks are the best (shown in bold) when k is 20, and the three exceptions also appear when k is 50. As fewer top fragments are selected, the biased fragments tend to be filtered out, and the remaining attribution fragments have a more significant impact on the property tasks. Therefore, if there is doubt about whether the attribution method has located incorrect fragments, choosing the attribution fragments with higher confidence tends to bring better overall results.
Limitation and future work. As shown in Fig. 5A, the fragment-based method performs considerably worse than the atom-based method on specific property tasks. We believe that the key fragments of these molecular properties are disrupted by the fixed fragment decomposition mechanism [36 (Jin et al., 2018)] mentioned above, which ultimately has the opposite effect. In addition, dihedral angles are used to specify the molecular conformation, which significantly affects the chemical properties of the molecule [37 (Michl, 2003)]. For example, molecules with similar structures may be positive through completely different mechanisms because of minor differences in their conformations [37 (Michl, 2003)]. However, the role of angles is not considered in this work, which directly influences the model performance. This may explain why only about 90% of the obtained attribution fragments were verified to show positive relevance to the property tasks in the above experiments; the remaining 10% error likely stems from the incomplete analysis of molecular structural information. At the same time, some molecules contain structural alerts yet never produce toxic effects, while other compounds are rejected as drugs because of manifested toxic effects without having a structural alert in their molecule [38 (Claesson and Minidis, 2018)].
Fortunately, such cases are few. Therefore, although there are misjudgments in the attribution process, the accuracy of the judgment is generally reliable. Further improvement of our work will focus on using three-dimensional molecular conformations to represent the molecules.

Conclusion
We propose an explainable fragment-based molecular property attribution method for analyzing the relevance between biochemical properties and molecular fragments. Statistical results and mechanism verification are used to demonstrate the reliability of the discovered relevance between molecular properties and fragments. Experiments on forty-two biochemical property tasks show that about 90% of the attribution fragments are positively related to the corresponding property task, and randomly selected attribution results from six classical side-effect property tasks agree closely with the known biochemical mechanisms. The discovered relationship between molecular properties and fragments can be applied to various tasks, such as exploring the relations between different molecular properties and synthesizing molecules with targeted properties from specific fragments. Based on the attribution fragment sequences of the different property tasks, we build a property relation map of all forty-two properties. Transfer learning experiments are used to verify the benefits of the property relation map for rapid and accurate transfer learning. In summary, as a computer-assisted molecular discovery method, our fragment-based attribution method can provide pharmacologists with sufficiently precise guidance, accelerate the analysis of the properties of drug molecules, and improve the efficiency of clinical trials. In future work, we will focus on using more information to represent the characteristics of molecules, such as adding the dihedral angles of the three-dimensional conformation, and on realizing more natural molecular tree decomposition methods to achieve more precise localization.

Methods
Experiment dataset setting. The training and validation data are obtained from four physiology datasets (BBBP, Tox21, Sider, ClinTox) provided by Li et al. [39 (Li et al., 2021)]. These datasets include experimental bioactivity data for 42 different property tasks, such as eye disorders and hepatobiliary disorders. The BBBP dataset contains binary labels of blood-brain barrier penetration (permeability) for 2,039 compounds. The Tox21 dataset contains qualitative toxicity measurements on 12 biological targets, including nuclear receptors and stress response pathways, for 7,831 compounds. The Sider dataset is a database of marketed drugs and adverse drug reactions (ADR), with 27 system organ classes and 1,427 compounds. The ClinTox dataset contains qualitative data for 1,478 compounds, covering drugs approved by the FDA and drugs that failed clinical trials for toxicity reasons. A given property task may not contain all the data in the corresponding dataset.
Before training, each molecule was decomposed into a fragment tree with the junction tree method [36 (Jin et al., 2018)]; the resulting fragments are mainly ring fragments and diatomic fragments. For each atom and bond, the CanonicalFeaturizer interface of DGL-LifeSci [39 (Li et al., 2021)] was used to generate features. The feature of a fragment composed of several atoms and bonds was represented by the weighted concatenation of these atom and bond features. After fragment decomposition, the molecules of every task were split into training, validation, and test subsets following an 8:1:1 ratio. Different splitting strategies are recommended depending on the contents of each dataset: in our setting, the BBBP dataset adopted scaffold splitting, and the other three datasets adopted random splitting.
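As an illustration of the 8:1:1 random splitting used for Tox21, Sider, and ClinTox, the following minimal sketch splits a list of molecule indices. The function name `random_split`, the fixed seed, and the truncating rounding rule are assumptions made for this example; scaffold splitting for BBBP is not shown.

```python
import random

def random_split(items, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle items and cut them into train/validation/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(ratios[0] * len(shuffled))   # rounding rule is an assumption
    n_valid = int(ratios[1] * len(shuffled))
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]        # remainder goes to the test set
    return train, valid, test

# Example: indices of the 1,427 Sider compounds
train, valid, test = random_split(range(1427))
```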
Processing of fragment-based attribution method and validation of attribution fragments. Each property task was processed with the pipeline of the proposed fragment-based attribution method as follows:
Train Stage. For each property task, the fragment-based GCN model was trained with the same setting, and the prediction models were then saved.

Sample Selection Stage. To ensure accurate attribution, the positive molecules used for attribution were screened. The molecules were sorted by prediction loss, where a low prediction loss means high confidence in the positive label. For low-confidence molecules, the prediction process is generally unreliable, and the resulting attributions are unlikely to represent the property task well, so we used only high-confidence molecules. Finally, the top-200 high-confidence samples were chosen for the next attribution stage. The number 200 was chosen as a trade-off given the number of positive samples available for these tasks.
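The selection step can be sketched as follows, assuming the per-molecule prediction losses and labels are already available as plain lists. The function name `select_high_confidence` and the toy values are hypothetical.

```python
def select_high_confidence(losses, labels, top_k=200):
    """Return indices of the top_k positive molecules with the lowest loss."""
    positives = [i for i, y in enumerate(labels) if y == 1]
    positives.sort(key=lambda i: losses[i])    # low loss = high confidence
    return positives[:top_k]

# Toy example with five molecules; molecule 2 is negative and is ignored
losses = [0.9, 0.05, 0.3, 0.01, 0.6]
labels = [1, 1, 0, 1, 1]
selected = select_high_confidence(losses, labels, top_k=2)
```

With `top_k=200`, a task that has fewer than 200 positive molecules simply returns all of them in order of confidence.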
Attribution Stage. In this stage, we obtained the attribution fragments of the above high-confidence samples. Gradients were back-propagated from the prediction output layer of each molecule to the input fragment feature layer. After obtaining the gradient response of each fragment, we sorted the fragments by their response values and discarded those with small values.
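The idea of ranking fragments by gradient response can be illustrated with a deliberately small stand-in model. The sketch below does not use the fragment-based GCN: it assumes a toy predictor p = sigmoid(v · Σ_i tanh(W x_i)) over fragment feature vectors x_i, and it assumes that the response of a fragment is the L2 norm of the gradient with respect to its features. Both choices, and all names, are assumptions for illustration only.

```python
import numpy as np

def fragment_gradient_scores(X, W, v):
    """Score each fragment by the gradient norm of the prediction w.r.t. its features.

    X: (n_fragments, d) fragment features; W: (h, d); v: (h,).
    Toy model (assumption): p = sigmoid(v . sum_i tanh(W x_i)).
    """
    H = np.tanh(X @ W.T)                      # (n, h) per-fragment hidden states
    z = v @ H.sum(axis=0)                     # scalar logit after sum pooling
    p = 1.0 / (1.0 + np.exp(-z))              # predicted probability
    # Chain rule: dp/dx_i = p(1-p) * W^T (v * (1 - h_i^2))
    grads = p * (1.0 - p) * ((v * (1.0 - H ** 2)) @ W)   # (n, d)
    return p, grads, np.linalg.norm(grads, axis=1)

def top_fragments(scores, keep=3):
    """Indices of the `keep` fragments with the largest gradient response."""
    return np.argsort(-scores)[:keep].tolist()

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))                   # 5 fragments, 4 features each
W = rng.normal(size=(6, 4))
v = rng.normal(size=6)
p, grads, scores = fragment_gradient_scores(X, W, v)
ranked = top_fragments(scores, keep=3)        # fragments with small responses are dropped
```

In practice the gradient would come from automatic differentiation through the trained network; the analytic gradient is used here only to keep the example dependency-light.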
Fragment-validation Stage. The reliability of each obtained fragment was validated with a new dataset $M$, which consists of a positive subset $M_{pos}$ and a negative subset $M_{neg}$. For each molecule in $M_{pos}$ and $M_{neg}$, we determined whether the attribution fragment $g$ occurs in the test molecule. To eliminate the effect of the imbalance in the number of test molecules, we took the difference between the two probabilities of occurrence on the new positive and negative datasets as the final validation metric. Fig. 3B shows one case of the calculation process, and the difference $Diff_g$ of attribution fragment $g$ is given by equation (1):

$$Diff_g = \frac{1}{K_{pos}} \sum_{k=1}^{K_{pos}} \mathbb{1}\left(g \in M^{k}_{pos}\right) - \frac{1}{K_{neg}} \sum_{k=1}^{K_{neg}} \mathbb{1}\left(g \in M^{k}_{neg}\right) \tag{1}$$

The value of $Diff_g$ ranges from -1 to 1. $Diff_g > 0$ indicates that $g$ is positively related to the property, and the larger $Diff_g$ is, the stronger the relevance of fragment $g$.
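A minimal sketch of the difference metric follows. It assumes each validation molecule has already been reduced to the set of fragment identifiers it contains, so that occurrence is a set-membership test; in a real pipeline this test would be a substructure match on the molecular graph. The identifiers and the function name are hypothetical.

```python
def fragment_difference(fragment, positive_mols, negative_mols):
    """Diff_g: occurrence rate in positives minus occurrence rate in negatives."""
    pos_rate = sum(fragment in m for m in positive_mols) / len(positive_mols)
    neg_rate = sum(fragment in m for m in negative_mols) / len(negative_mols)
    return pos_rate - neg_rate

# Each molecule is a set of fragment identifiers (hypothetical strings)
positives = [{"c1ccccc1", "C=O"}, {"c1ccccc1"}, {"CN"}, {"c1ccccc1", "CN"}]
negatives = [{"C=O"}, {"c1ccccc1"}]
diff = fragment_difference("c1ccccc1", positives, negatives)
```

Here the fragment occurs in 3 of 4 positive molecules and 1 of 2 negative molecules, so the difference is 0.75 - 0.5 = 0.25. Normalizing by the size of each subset is what makes the metric insensitive to the positive/negative imbalance.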
Construction and verification of property relation map. The property relation map was constructed from the attribution fragments obtained for all forty-two tasks. We assumed that property tasks with similar attribution fragments are generally more related. Therefore, we clustered all attribution fragments in the form of Morgan fingerprints [40 (Morgan, 1965)] and calculated how similar the attribution fragments are between each task pair. The calculation is as follows. For each cluster, we determined whether fragments of task a and task b both appeared in this cluster. If they did, we recorded the number of fragments of each task that appeared in it. After all clusters were traversed, we summed the number of fragments of task a and of task b that appeared in shared clusters, and calculated the ratios of these counts to the total number of fragments of the respective task. The two ratios were averaged to give the attribution-fragment similarity between task a and task b. With this procedure, the similarity between every two tasks was obtained, and the property relation map $F$ was constructed (the shape of $F$ is $42 \times 41$). The relationship between a task and itself is the closest (that is, 1) by default. In the following verification process, we do not consider the relationship between a task and itself.

… 1e-8, beta (0.9, 0.99), amsgrad false. The Binary Cross Entropy loss function is employed to measure model performance in both the training and validation stages. The loss formula (4) is as follows:

$$\mathcal{L} = -\left[\, y \log \hat{y} + (1 - y) \log (1 - \hat{y}) \,\right] \tag{4}$$

where $\hat{y}$ is the probability with which the model predicts that the sample is a positive example, and $y$ is the ground-truth label, which is 1 if the sample is a positive example and 0 otherwise.
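The cluster-based task similarity used for the property relation map can be sketched as follows. The example assumes clustering has already been done, so that each task is given as a list holding the cluster id of each of its attribution fragments; the function name and the toy cluster ids are hypothetical.

```python
from collections import Counter

def task_similarity(clusters_a, clusters_b):
    """Average, over both tasks, of the share of fragments lying in shared clusters.

    clusters_a / clusters_b: cluster id of every attribution fragment of each task.
    """
    count_a, count_b = Counter(clusters_a), Counter(clusters_b)
    shared = set(count_a) & set(count_b)       # clusters containing fragments of both tasks
    ratio_a = sum(count_a[c] for c in shared) / len(clusters_a)
    ratio_b = sum(count_b[c] for c in shared) / len(clusters_b)
    return 0.5 * (ratio_a + ratio_b)

task_a = [0, 0, 1, 2]          # task a has 4 fragments in clusters 0, 0, 1, 2
task_b = [0, 3, 3, 2, 2]       # task b has 5 fragments in clusters 0, 3, 3, 2, 2
sim = task_similarity(task_a, task_b)
```

In this toy case the shared clusters are 0 and 2, giving ratios 3/4 and 3/5 and a similarity of 0.675. Comparing a task with itself gives 1, which matches the default self-relation stated above.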
The Area Under the Receiver Operating Characteristic curve (AUROC) is another well-known metric for evaluating the accuracy of a binary classification task. It is calculated from the true-positive (TP) and false-positive (FP) rates obtained at different decision thresholds. The AUROC value ranges from 0 to 1, with a value of 0.5 indicating a random prediction and a value of 1 denoting perfect predictive accuracy.
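Both metrics can be computed in a few lines of plain Python. The sketch below averages the binary cross-entropy over samples (the averaging and the clipping constant `eps` are assumptions) and computes AUROC through its pairwise-ranking form, which equals the area under the curve traced by the TP and FP rates over all thresholds.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def auroc(y_true, y_score):
    """Probability that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.6]
loss = binary_cross_entropy(y_true, y_prob)
auc = auroc(y_true, y_prob)
```

In this example three of the four positive-negative pairs are ranked correctly, so the AUROC is 0.75. The pairwise form is quadratic in the number of samples, which is acceptable for datasets of the sizes used here.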

Figure 2: Overview of our fragment-based molecular property attribution method. Molecules from different datasets are used for the fragment attribution stage (Step I), which provides each molecule's attribution fragments. The obtained fragments are then summarized for each property task (Step II and Step III), so that each task has its corresponding attribution fragments. Next, all these fragments are clustered by their fingerprints and shown in the "Fragment Space" (Step IV). Finally, the relation map of these property tasks is measured by the distance relationship of the respective fragment fingerprints (Step V).

Figure 3: The reliability validation of attribution fragments obtained with the fragment-based method. A) The vertical axis "Difference" denotes the difference between the probabilities of a positive match and a negative match, and ranges from -1 to 1. A higher value means that the fragment occurs in a higher proportion of positive molecules than of negative molecules, which indicates that the fragment has stronger relevance to the specific property. B) The calculation process of the difference value. The red highlights denote that the attribution fragments occur in the tested samples. C) Mechanism description of attribution fragments. The left column shows the "Ground Truth" fragments with red highlights, and references [21, 22 (Smith et al., 2003; Usui et al., 2009)] give the related literature. The right column displays several obtained attribution fragments with red highlights. "Top-k" denotes that the fragment ranks k-th highest in the overall results for the molecule.

Figure 5: Effectiveness of the fragment-based method. A) Comparison of the prediction performance of the atom-based GCN and the fragment-based GCN. For each property task, the left bar denotes the performance of the atom-based method, and the right bar denotes the performance of the fragment-based method. Tasks shown in the same color belong to the same dataset, and the error bar represents the variance of prediction performance. B) Spatial distribution of positive and negative molecule activations. The blue points denote activations of positive molecules, and the red points denote activations of negative molecules. C) Representation clusters of the positive molecules in the "Hepatobiliary" task. The grey-green, orange, blue, grey, red, and wine-red highlights denote the attribution fragments of the molecule clusters labeled (I), (II), (III), (IV), (V), and (VI) in the "Molecule Representation Space", respectively, and references [21, 22, 25, 26, 27, 28, 29 (Smith et al., 2003; Usui et al., 2009; Remmel et al., 2007; Uetrecht et al., 1995; Geneve et al., 1987; Stepan et al., 2011; Walgren et al., 2005)] give the related mechanism literature.
In equation (1), $M^{k}_{pos}$ and $M^{k}_{neg}$ represent the k-th molecule of the positive molecule set and the negative molecule set used to validate the relevance, $K_{pos}$ and $K_{neg}$ represent the numbers of molecules in these two sets, and $\mathbb{1}(\cdot)$ is the indicator function, which outputs 1 when the condition in parentheses is satisfied and 0 otherwise.

Table 1: Comparison of overall differences with the top-k fragments. The results of "Top-k Fragments" are the average of the differences over the top-k attribution fragments. Higher results indicate stronger relevance to the property task.