Artificial Intelligence and Machine Learning in Computational Nanotoxicology: Unlocking and Empowering Nanomedicine

Advances in nanomedicine, coupled with novel methods of creating advanced materials at the nanoscale, have opened new perspectives for the development of healthcare and medical products. Special attention must be paid to safe design approaches for nanomaterial-based products. Recently, artificial intelligence (AI) and machine learning (ML) have provided computational tools for enhancing and improving the simulation and modeling process in nanotoxicology and nanotherapeutics. In particular, the correlation of in vitro generated pharmacokinetics and pharmacodynamics to in vivo application scenarios is an important step toward the development of safe nanomedicinal products. This review portrays how in vitro and in vivo datasets are used in in silico models to unlock and empower nanomedicine. It focuses on physiologically based pharmacokinetic (PBPK) modeling and absorption, distribution, metabolism, and excretion (ADME)-based in silico methods, along with dosimetry models for nanomedicine. Computational OMICS, colloidal particle determination, algorithms to establish dosimetry for inhalation toxicology, and quantitative structure-activity relationships at the nanoscale (nano-QSAR) are revisited. The challenges and opportunities arising from the blind spots of nanotoxicology in this computationally dominated era are highlighted as the way to accelerate the clinical translation of nanomedicine.


DOI: 10.1002/adhm.201901862

Introduction
Computational toxicology aims to decode the factors that are responsible for toxic interactions using a sophisticated model. The model itself has to take all potential interactions of the substance into account to produce meaningful results. The evolution of artificial intelligence (AI) and machine learning (ML) has led to the development of novel approaches for toxicity testing (Figure 1 and Table 1). Computational modeling in the field of nanomaterials is used to establish a correlation between their biokinetics, the dynamics of the biological response, and their fate once the substance has reached the target organ of interest. Biology-based mathematical models are especially well suited for modeling the fate of nanotherapeutics and engineered nanomaterials (ENMs) within experimental systems. ML has already been applied successfully to simulate the biokinetics and interaction of nanomaterials in a varying environment. [1] The most studied approaches for the assessment of nanomaterial-induced toxicity are structure-based mathematical models (e.g., Bayesian methods and Markov chain Monte Carlo simulation) and, in particular, quantitative structure-activity relationships (QSAR) at the nanoscale (nano-QSAR). [2] The QSAR computational approach endeavors to predict the biological function of a compound from the physicochemical properties and theoretical descriptors of its molecules. QSAR models are the most studied and offer the highest prediction accuracy among contemporary models based on density functional theory (DFT) calculations of the physicochemical properties.
Physicochemical properties of the nanomaterial are a key requisite for QSAR modeling. One approach to deriving a QSAR is based exclusively on a theoretical, first-principles model. For example, chemical reactivity based on the energies of the molecular orbitals of the substance of interest can be used to derive its QSAR. Another modeling approach is entirely statistical, i.e., it essentially uses pattern recognition to correlate potentially useful descriptors with the effect one is trying to predict. [3] The hybrid QSAR model combines mechanistic reasoning with a statistical fit: theoretical considerations are used to identify the descriptors that are likely to be predictive, [4] and statistical models are then effective for calibrating the parameters associated with the molecular descriptors. Several physicochemical properties (e.g., surface charge, corona, aggregation, and solubility) are useful for predicting the biological activities of nanomaterials. Open-source ML tools can provide quantitative structure-property relationship (QSPR) workflows that predict a variety of nanomaterial properties and that are cross-platform, so they can be integrated into existing nanoinformatics workflows. [5] These models can be used in their own right, but they can also provide input parameters to the more complex physiologically based pharmacokinetic (PBPK) models. With the information from QSAR, simulations can subsequently be carried out to predict biological activities and potential interaction partners that may induce a toxic response. A variety of computational approaches have been developed in recent years to improve simulations and produce reliable results.

Models Predicting Colloidal Properties as Determinant to Nanotoxicity
All QSAR models, which aim to predict the toxicity of a nanomaterial, rely on extrinsic material properties, which determine the bio-nanointeractions in the context of reactivity and nanotoxicity. These physicochemical descriptors, which describe the fate of nanomaterials within the body or in the environment, are incorporated into QSAR models.
One of those descriptors is the point of zero zeta potential (PZZP) which is being used to predict the agglomeration/aggregation behavior of nanomaterials in computational models. [24] The PZZPs are overlaid to simulate the potentials of ENMs in fluids with different pH levels to elucidate agglomeration of the ENMs within a particular tissue. [25] Certain ENMs are able to exchange electrons with the cellular environment; a situation which may disrupt the cell redox states. [26] In consequence, this leads to oxidative stress since the cell machinery has only limited capacity to compensate for radical reactions. Enhanced cytotoxic reactions, inflammation, and additional adverse effects in the body may arise.
Figure 1. … [6] for nanomaterial (NM) absorption, distribution, and metabolism in the human body by prediction of the NM properties. During the 20th century, various statistical approaches to discrimination (1901), [7] grouping or clustering (1974, 2000), [8] prediction (1964, 1995), [9] and regression (1988, 2000) [10] of multiple datasets were developed and used intensively in machine learning (ML). To predict, evaluate, and optimize these datasets, neural networks (1973, 1980, 1990) [11] and support vector machines (1990) [12] were introduced in artificial intelligence (AI) and ML, respectively. In parallel, models for the prediction of quantitative structure-activity relationships (QSAR, 1969) [13] and physiologically based pharmacokinetics (PBPK, 1937) [14] of drugs and other substances were introduced. The important input parameters for PBPK models (partition coefficient (1897), [15] biological activity (1970), [16] and bioavailability (1997) [17]) can be predicted with QSAR, based on the Hammett constant (1940). [18] At the beginning of the 21st century, QSAR models were extended with the introduction of method validation, NMs, and the prediction of the solubility of substances. The most important developmental steps of the PBPK models are the human inhalation (2002), [19] oral (2006), [20] and lifespan (2010) [21] PK profiles. Recently developed PBPK models describe the uptake, absorption, and distribution of NMs in rats (after inhalation, 2016) [22] and the transport of NMs from the blood supply into cells (2019). [23]
Molecular modeling based SAR models for 1D ENMs such as nanoneedles or carbon nanotubes (CNTs) were shown to be suitable for predicting their reactivity. [35] The study demonstrated that the length-to-diameter ratio (L/D) of a nanoneedle, shown as a function of carbon atoms, has a vital influence on the reactivity. The frontier molecular orbital theory is another input parameter for the prediction of QSAR.
A narrower gap between the lowest unoccupied molecular orbital (LUMO) energy (L) and the highest occupied molecular orbital (HOMO) energy (H) reduces the stability and enables higher conductivity of the ENM, leading to an enhanced reactivity. [36] 3D nano-QSAR models have also been developed by matching the low-energy conformations, which were docked into ADME models to design novel nanotherapeutics, as shown in Figure 2.
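As a minimal sketch of how the frontier orbital gap can enter a model as a descriptor, the following Python snippet computes the gap as the LUMO energy minus the HOMO energy and ranks a few materials by it. The material names and orbital energies are hypothetical values chosen for illustration only; in practice they would come from quantum chemical calculations.

```python
from typing import Dict, List, Tuple


def homo_lumo_gap(e_homo_ev: float, e_lumo_ev: float) -> float:
    """Return the frontier orbital gap E_LUMO - E_HOMO in eV."""
    if e_lumo_ev < e_homo_ev:
        raise ValueError("LUMO energy must not lie below the HOMO energy")
    return e_lumo_ev - e_homo_ev


def rank_by_gap(orbitals: Dict[str, Tuple[float, float]]) -> List[Tuple[str, float]]:
    """Sort materials by increasing gap; input maps name -> (E_HOMO, E_LUMO)."""
    gaps = [(name, homo_lumo_gap(homo, lumo)) for name, (homo, lumo) in orbitals.items()]
    return sorted(gaps, key=lambda item: item[1])


# Hypothetical orbital energies in eV, for illustration only.
toy_orbitals = {
    "tube_A": (-5.0, -3.5),
    "tube_B": (-5.5, -3.0),
    "tube_C": (-4.8, -4.1),
}
ranking = rank_by_gap(toy_orbitals)
```

The resulting list can be passed on as one column of a descriptor table, alongside geometric descriptors such as L/D.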
Another mathematical model, which predicts the toxicity of nanoparticulates and long-aspect-ratio nanoneedles and CNTs, has only limited success for crystalline materials. [37] The close proximity of atoms in a crystal structure increases the probability of an overlap of orbital energies and a subsequent split. This results in a valence band and a conduction band, separated by an energy gap. Models that are extended with this input parameter were able to estimate redox-active nanomaterials more reliably. Moreover, it can also be predicted whether chemicals in the cellular environment are able to donate electrons to, or receive electrons from, the conduction band. [38] Overlap between the conduction band of an ENM and the redox potentials of its cellular environment predicted the cytotoxicity and other disruptive effects of several metal oxide ENMs. [39] However, the common QSAR methods used for bulk chemicals are not fully adequate for the accurate prediction of toxic interactions of ENMs.
Table 1. Machine learning models commonly used for the prediction of nanotoxicity, with their pros and cons.
Model: …
Ref.: [27]
Model: Decision tree
Application: Classify nanomaterials
Pros: Automatic selection of input variables; removal of insignificant descriptors and identification of the worthwhile ones in small, noisy, and large datasets
Cons: Cannot handle non-numerical data; training time is high compared with other models
Ref.: [28]
Model: Support vector machines (SVM)
Application: Collinear descriptors, nonlinear relations, small and large datasets, and overfitted models
Pros: Gives high precision and speculation; can manage both classification and regression problems; can handle problems such as nonlinear relationships, collinear descriptors, small datasets, and model overfitting
Cons: High sensitivity of model performance to the selection of design parameters (e.g., kernel functions) and the complexity of direct interpretation of SVM decisions
Ref.: [29]
Model: Artificial neural network (ANN)
Application: Nonlinear data relationships and large datasets
Pros: Provides the ability to deal with the nonlinear nature of structure-activity relationships and with large descriptor datasets including unnecessary variables
Cons: Choosing the ideal complexity; trouble of overfitting; high generalization sensitivity to variation in parameters and network topology
Ref.: [30,31]
Model: Partial least squares (PLS)
Application: Reduction of the number of descriptors to make them more suitable for further analysis
Pros: Works well when there are several noise sources and intercorrelated descriptors
Cons: Difficulty in interpreting the loading of independent latent variables; distributional properties of estimates not known
Ref.: [32,33] [34]
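The band-overlap argument for metal oxide ENMs can be illustrated with a short Python sketch that checks whether a conduction band edge falls inside a window of cellular redox potentials. This is only an assumption-laden illustration: the function name, the oxide names, the band energies, and the window limits are hypothetical placeholders, and real values would have to be taken from the literature.

```python
from typing import Dict


def overlaps_cellular_redox_window(
    conduction_band_ev: float, window_low_ev: float, window_high_ev: float
) -> bool:
    """True if the conduction band edge lies inside the cellular redox window."""
    low, high = sorted((window_low_ev, window_high_ev))
    return low <= conduction_band_ev <= high


# Assumed window limits (eV) and made-up conduction band energies, for illustration only.
ASSUMED_WINDOW_EV = (-4.84, -4.12)
toy_conduction_bands_ev: Dict[str, float] = {
    "oxide_A": -4.50,  # inside the assumed window
    "oxide_B": -3.00,  # above the assumed window
    "oxide_C": -5.50,  # below the assumed window
}
flagged = {
    name: overlaps_cellular_redox_window(energy, *ASSUMED_WINDOW_EV)
    for name, energy in toy_conduction_bands_ev.items()
}
```

An oxide flagged as overlapping would, under this simple rule, be treated as a candidate for electron exchange with the cellular environment and examined further.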

Nano-QSAR Cytotoxicity Models
A variety of models are available to overcome the limits of standard QSAR approaches for nanomaterials. The computational hybrid nano-QSAR model for nanocytotoxicity adopts two descriptors: the enthalpy of formation (related to the bandgap energy) and the electronegativity (related to stability). This simple nano-QSAR approach successfully computed the cytocompatibility of metal oxide ENMs for a number of cell lines. [40] To foster the development of computational tools for the assessment of toxicity induced by ENMs, the European Commission has launched several projects under the Registration, Evaluation, Authorization, and Restriction of Chemicals (REACH) regulation. [41] Many of the assessed nano-QSAR models are also able to predict the biopersistence of ENMs, which supports safe regulation. [42] However, the value of these models varies when it comes to grouping nanomaterials by their similarity in terms of physicochemical properties and biological activities. [44] Based on analyses of the nanocluster growth rate and the physicochemical properties of the ENM, a read-across of the toxicological properties of ENMs based on governance, risk, and compliance (GRC) transformation can be performed. [45] Such QSAR-based models can be used in their own right, but they can also provide input parameters to the more complex PBPK models. Sophisticated nano-QSAR models are employed to develop safer ENMs for nanomedicinal approaches, as described below.
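A two-descriptor nano-QSAR of the kind described above can be sketched as an ordinary least-squares fit of an activity value against two descriptors. The Python example below is a minimal illustration, not the published model: the linear form, the function names, and the toy descriptor values are assumptions, and the data are generated from known coefficients so that the fit can be checked.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's lists are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("descriptors are collinear; the fit is not unique")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x


def fit_two_descriptor_qsar(
    descriptors: Sequence[Tuple[float, float]], activity: Sequence[float]
) -> List[float]:
    """Least-squares fit of activity = c0 + c1*d1 + c2*d2 via the normal equations."""
    rows = [(1.0, d1, d2) for d1, d2 in descriptors]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, activity)) for i in range(3)]
    return solve_linear_system(xtx, xty)


def predict(coeffs: Sequence[float], d1: float, d2: float) -> float:
    """Evaluate the fitted linear model for one material."""
    return coeffs[0] + coeffs[1] * d1 + coeffs[2] * d2


# Made-up descriptor pairs (enthalpy-like, electronegativity-like), illustration only.
toy_descriptors = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
toy_activity = [1.0 + 0.5 * d1 - 0.25 * d2 for d1, d2 in toy_descriptors]
coefficients = fit_two_descriptor_qsar(toy_descriptors, toy_activity)
```

With real data the activity would typically be a transformed endpoint, and the fit would be followed by internal and external validation.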

Metal Containing NPs
Fourches et al. [46] considered four experimentally determined descriptors of 44 different nanoparticles (NPs): [47] size, zeta potential (arising from the intensity of charge on the surface), R1 relaxivity, and R2 relaxivity. The last two terms are related to the magnetic properties of the NPs, which determine their ability to influence the relaxation rates of proton spins in the surrounding water molecules. They also analyzed a lipophilicity dataset to evaluate cellular uptake. The authors investigated the effectiveness of support vector machines (SVM) and QSAR methodology in predicting nanotoxicity and in designing safer NPs. [48] Another study used a nano-SAR model [49] to build and integrate a naïve Bayesian classifier on the same dataset. [47] This hybrid model classifies the 44 NPs into bioactive or inactive groups based on the same four descriptors: size, zeta potential, spin-lattice relaxivity, and spin-spin relaxivity. [47] Yet another study investigated the toxicity of 82 metallic, metal-oxide, dendrimer, and polymer NPs against zebrafish embryos using six descriptors: concentration, shell composition, functional groups, purity, core structure, and surface charge. [50] Kleandrova et al. developed a nano-QSAR model to test the ecotoxicity of NPs, taking into account six descriptors: molar volume, polarizability, size, electronegativity, hydrophobicity, and polar surface area. [51] By applying different cut-off values for several endpoints such as CC50, EC50, IC50, TC50, and LC50, the NPs were classified as either toxic or nontoxic. Further, the predicted ecotoxicity of three nickel-based NPs was in good agreement with the experimental evidence. Researchers have also reported an accuracy of more than 90% in their cytotoxicity predictions for NPs. Luan et al. developed a QSAR-perturbation model to predict the cytotoxicity of NPs against mammalian cell lines considering three descriptors (molar volume, size, and polarizability of the NPs) and reported an accuracy of 93%. [52] Concu et al. developed a unified in silico machine learning model based on artificial neural networks to predict general toxicity profiles of NPs. Applying this model to 260 NPs using two families of descriptors, physicochemical and 2D topological, showed an accuracy of more than 97%. [53] Boukhvalov and Yoon developed descriptors based on the results of first-principles calculations for metal NPs. [54] They considered two reactions: ion extraction from the surface of an NP into aqueous media, and water dissociation on crystal planes with different Miller indices, such as (001) and (111), on nanorods, and on two cubic nanoparticles of 0.6-0.3 nm size of different metals (Al, Fe, Cu, Ag, Au, Pt). The dependence of chemical activity on the shape and size of the NPs was investigated. Another study using QSAR reports toxicity tests for three types of cluster NPs, namely monometallic (Au, Pd) clusters, core-shell particles, and bimetallic (Au/Pd) clusters, on Escherichia coli and CHO-K1 cells based on the size and specific surface area of the NPs. [55] The cytotoxicity of the monometallic clusters (Au-TiO2, Pd-TiO2) and the bimetallic clusters (Au/Pd-TiO2) was found to be enhanced compared with that of pure TiO2. The proposed mechanism of cytotoxicity was the release of ions and reactive oxygen species (ROS) from the TiO2 surface.
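The cut-off based division into toxic and nontoxic NPs, and the accuracy figures quoted for such classifiers, can be made concrete with a small Python sketch. It assumes a concentration-type endpoint (such as EC50) for which a lower value means higher toxicity; the cut-off and all numbers below are invented for illustration and do not come from the cited studies.

```python
from typing import List, Sequence


def label_toxic(endpoint_values: Sequence[float], cutoff: float) -> List[bool]:
    """Label a sample toxic (True) when its endpoint concentration is at or below the cut-off."""
    return [value <= cutoff for value in endpoint_values]


def accuracy(predicted: Sequence[bool], observed: Sequence[bool]) -> float:
    """Fraction of samples whose predicted class matches the observed class."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(1 for p, o in zip(predicted, observed) if p == o)
    return matches / len(observed)


# Hypothetical EC50-like values (arbitrary concentration units), illustration only.
observed_ec50 = [5.0, 50.0, 200.0, 8.0, 120.0]
predicted_ec50 = [7.0, 9.0, 150.0, 12.0, 300.0]
ASSUMED_CUTOFF = 10.0

observed_labels = label_toxic(observed_ec50, ASSUMED_CUTOFF)
predicted_labels = label_toxic(predicted_ec50, ASSUMED_CUTOFF)
model_accuracy = accuracy(predicted_labels, observed_labels)
```

Accuracy alone can be misleading when the toxic and nontoxic classes are strongly imbalanced, so sensitivity and specificity are often reported alongside it.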

Multiwall Carbon Nanotubes (MWCNTs)
Fiber shape and the ability to generate ROS are considered indicators of high toxicity of a material. While all CNTs are fiber shaped, some also produce ROS. [56] To study the toxicity of MWCNTs, quasi-QSAR models [57] were developed based on a representation of the experimental conditions (concentration, presence of S9 mix, type of MWCNT (surface area), and use of preincubation) in a quasi-simplified molecular input-line entry system (quasi-SMILES) form, whose descriptor correlation weights were calculated with the Monte Carlo method. In another study, the genotoxicity was modeled as a function of five parameters (particle type (MWCNT or fullerene), illumination, concentration, metabolic activation, and preincubation), producing satisfactory statistical parameters. [58] In these examples, the measured endpoint is the reverse mutation test (TA100). In another study, [59] nano-QSAR models were developed to predict the toxicity of 20 MWCNT types in human lung cells. Quasi-SMILES was used to represent the physicochemical properties and experimental conditions: diameter, length, surface area, in vitro toxicity assay, cell line, exposure time, and dose. [60] The quasi-SMILES-based nano-QSAR model, calculated using the Monte Carlo method, provided sufficient statistical parameters. In this case, the endpoint of the cytotoxicity measurement is the cell viability (%).
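The idea behind quasi-SMILES descriptors can be sketched in a few lines of Python: each experimental condition is encoded as a token, each token carries a correlation weight, and the descriptor of an experiment is the sum of the weights of its tokens. The weights are then adjusted by random trial moves that are kept only when they improve the correlation with the endpoint. This is a deliberately simplified illustration and not the algorithm of any particular software package; the token names, the toy endpoint, and the optimization settings are all assumptions.

```python
import random
from itertools import product
from typing import Dict, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def dcw(tokens: Sequence[str], weights: Dict[str, float]) -> float:
    """Descriptor of correlation weights: the sum of the weights of all tokens."""
    return sum(weights[t] for t in tokens)


def optimize_weights(
    samples: Sequence[Sequence[str]],
    endpoints: Sequence[float],
    n_steps: int = 2000,
    max_step: float = 0.1,
    seed: int = 0,
) -> Tuple[Dict[str, float], float]:
    """Random-move hill climbing on the weights to maximize the correlation."""
    rng = random.Random(seed)
    vocabulary = sorted({t for tokens in samples for t in tokens})
    weights = {t: 1.0 for t in vocabulary}

    def score() -> float:
        return pearson_r([dcw(s, weights) for s in samples], endpoints)

    best = score()
    for _ in range(n_steps):
        token = rng.choice(vocabulary)
        old = weights[token]
        weights[token] = old + rng.uniform(-max_step, max_step)
        trial = score()
        if trial > best:
            best = trial          # keep the improving move
        else:
            weights[token] = old  # undo the move
    return weights, best


# Hypothetical tokens and an additive toy endpoint, for illustration only.
dose_effect = {"[D1]": 1.0, "[D2]": 2.0, "[D3]": 3.0}
s9_effect = {"[S9+]": 0.5, "[S9-]": 0.0}
preinc_effect = {"[P+]": 0.2, "[P-]": 0.0}
toy_samples = [list(combo) for combo in product(dose_effect, s9_effect, preinc_effect)]
toy_endpoints = [dose_effect[d] + s9_effect[s] + preinc_effect[p] for d, s, p in toy_samples]
weights, best_r = optimize_weights(toy_samples, toy_endpoints)
```

In a real workflow the weights would be optimized on a training set only, with a calibration set used to decide when to stop and a validation set held back for the final assessment.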

Fullerenes
Taking the cytotoxicity data of C60 NPs toward Salmonella typhimurium from ref. [61], a mathematical model as a function of dose, S9 mix, and illumination was constructed using quasi-SMILES descriptors obtained with the Monte Carlo method, resulting in the statistical parameters R² = 0.755 and q² = 0.571. [61,62] Continuing the study with two datasets from ref. [61], mathematical models as a function of dose, S9 mix, and illumination were constructed using quasi-SMILES optimal descriptors obtained with the Monte Carlo method. In this study, several splits into training, calibration, and validation datasets were made, as opposed to the single split in the earlier study. In both of these studies, the measured endpoint was the reverse mutation test, either TA100 or WP2 uvrA/pKM101. [62]
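The two statistical parameters quoted for such models can be computed as follows. This Python sketch assumes a one-descriptor linear model and takes q² to be the leave-one-out cross-validated coefficient, with the total sum of squares referred to the mean of the full dataset; other conventions exist, and the toy data are invented.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0.0:
        raise ValueError("descriptor values must not all be identical")
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Determination coefficient of the model fitted on all data."""
    slope, intercept = fit_line(x, y)
    my = sum(y) / len(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


def q_squared_loo(x: Sequence[float], y: Sequence[float]) -> float:
    """Leave-one-out q2: each point is predicted by a model fitted without it."""
    my = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        x_train = list(x[:i]) + list(x[i + 1:])
        y_train = list(y[:i]) + list(y[i + 1:])
        slope, intercept = fit_line(x_train, y_train)
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1.0 - press / ss_tot


# Made-up descriptor and endpoint values, for illustration only.
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toy_y = [1.1, 1.9, 3.2, 3.8, 5.3, 5.9]
r2 = r_squared(toy_x, toy_y)
q2 = q_squared_loo(toy_x, toy_y)
```

For a least-squares model q² never exceeds R², and a large difference between the two is a common warning sign of overfitting.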

Silica Nanomaterials
Taking a similar approach, a predictive model for the cytotoxicity of 20 and 50 nm silica NPs was built as a mathematical function of size, concentration, and exposure time using quasi-SMILES descriptors. The dataset was split randomly into three sets: training, calibration, and validation. The toxicity was estimated by the 3-[4,5-dimethylthiazole-2-yl]-2,5-diphenyltetrazolium bromide (MTT) assay as the cell viability of cultured human embryonic kidney cells exposed to different concentrations of silica NPs. [63] In another similar study, [63] numerical cell viability data for silica NPs were taken from the literature [64] and split randomly into three datasets (sub-training, calibration, and validation); high determination coefficients (0.83-0.89) and q² values (0.71-0.82) are reported for the sub-training set. The cell viability is again expressed as a function of size, concentration, and exposure time. The endpoint of the cytotoxicity measurements in both of these examples is the cell viability. The quasi-QSAR method (using quasi-SMILES optimal descriptors) was compared with the random forest (RF) approach on a set of experimental cytotoxicity data containing 19 data points for silica NPs. While aspect ratio and zeta potential were found to be the most important variables for RF, no such conclusion could be drawn for quasi-QSAR. It was also demonstrated that the RF approach is applicable to modeling the cytotoxicity of silica. [60] Later, better nano-QSAR models based on quasi-SMILES were built using the CORAL software, with high determination coefficients (0.8-0.95). [65]
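The random three-way split used in these studies is straightforward to reproduce. The Python sketch below shuffles the records with a fixed seed and cuts them into training, calibration, and validation sets; the split fractions, the seed, and the function name are assumptions made for illustration, since the source does not state them.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_three_ways(
    items: Sequence[T],
    fractions: Tuple[float, float, float] = (0.6, 0.2, 0.2),
    seed: int = 0,
) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle reproducibly and split into training, calibration, and validation sets."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(fractions[0] * len(shuffled)))
    n_calibration = int(round(fractions[1] * len(shuffled)))
    training = shuffled[:n_train]
    calibration = shuffled[n_train:n_train + n_calibration]
    validation = shuffled[n_train + n_calibration:]  # whatever remains
    return training, calibration, validation


# Twenty hypothetical record identifiers, for illustration only.
record_ids = list(range(20))
training_set, calibration_set, validation_set = split_three_ways(record_ids)
```

Repeating the split with several seeds, as in the multi-split studies mentioned above, gives an impression of how strongly the statistics depend on the particular partition.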

The ADMET In Silico Modeling Accelerating the Safer ENMs
Traditionally, nanotherapeutics were developed by testing them against several biological screens in time-consuming multistep processes. The target compounds that showed good potential were then investigated for their pharmacokinetic properties, metabolism, and potential toxicity, which is the stage at which adverse effects were most commonly discovered. This entire process was extremely expensive, tedious, labor-intensive, and time-consuming, and most often resulted in an unusable drug, so that the process had to be started all over again with a new drug. With the increase in biological screening and chemical synthesis, the demand for useful early information on absorption, distribution, metabolism, excretion, and toxicity (together called ADMET data) has also increased dramatically. [66] This has led to the development and use of various high-throughput in vitro ADMET screens. There is also an increasing need for reliable and robust tools for predicting these properties to serve two key aims: first, prediction at the design stage of new compound libraries so as to reduce the risk of late-stage attrition; and second, optimization of screening and testing by looking at only the most promising compounds. [67] It has been reported that the quest for early, fast, and relevant ADMET data can be tackled in three ways. [68] First, a variety of in vitro assays are automated using robotics and miniaturization as high-throughput (HTP) advances. Second, in silico models are used to assist in the selection of appropriate assays, as well as in the selection of subsets of compounds to go through these screens. Third, predictive models have been developed that might ultimately become sophisticated enough to replace in vitro assays and/or in vivo experiments. The need for ADMET information starts with the design of new compounds. At such an early stage, computational approaches can be used to make predictions and decide on a lead, even though they may not be perfect at this point. This may also help in deciding the synthesis route of the compound. More robust tools can then be used to optimize a lead compound, and to predict its mechanism, on the way to a feasible clinical product. [69] Understanding the relationship between important ADMET parameters and chemical structures helps in developing in silico models for estimating ADMET properties. [70] For example, dose size and dose frequency (Figure 3) can be estimated from oral absorption, bioavailability, tissue penetration, clearance, and the distribution volume. Yet, despite its importance, the lack of published data makes it difficult to predict such properties from chemical structures alone. A good understanding of physicochemical properties, coupled with their measurement and prediction, is crucial for a successful nanomedicine project. [70] Another example of an ADMET property that can be predicted is drug absorption. Simulation programs such as GastroPlus [71] and Idea [72] are available to help in lead optimization and compound selection. Such programs use computer simulation models to predict the rate and extent of absorption from in vitro data inputs. [73]
Figure 3. A schematic diagram showing the integration of ADME modeling with experimental organs-on-chip, bringing two worlds together: computational predictions and real-time experimental verification, which could putatively be adapted to nanotoxicology. Reproduced with permission. [74] Copyright 2018, Elsevier.
There are two different kinds of approaches for predicting potential nanotoxicology issues with in silico models. The first approach uses models derived by extracting and organizing human knowledge and the scientific literature. [75] The second approach uses chemical structure descriptors and performs a systematic analysis of the relationships between these descriptors and the nanotoxicological endpoints. [76] To complement the first approach, automated tools and new methodologies are required to ease metadata extraction, semantic representation, and analysis of published content. Open-source software and freeware offer a platform for automatically extracting, enriching, and characterizing various structural and semantic aspects of published scientific data. Such open platforms help to evaluate published data by relying on the scientific text-mining system built into them. However, most of these systems have not been analyzed for whether they support the fundamental tasks of rhetorical classification and extractive text summarization.
A major drawback of the mathematical models that are meant to mimic in vivo physiology and pharmacokinetics is that they are based on model animals, largely rodents, canines, and rabbits, which differ morphometrically from the target humans. Further, morphological differences in the internal organ systems of different human age groups (adults vs children) are generally not considered when setting physiological fitting parameters (e.g., breathing rate and route). More recently, more advanced dosimetry models have been introduced that allow the model to be chosen according to relevance and biological similarity (e.g., a swine model or the human model itself), which yields more accurate predictions of regional tissue deposition and distribution.
The second approach is hampered by the limited nanotoxicology data available in the literature. It has been reported that current software includes models for irritation, sensitization, immunotoxicology, and neurotoxicity. [70,76,77] While these are important endpoints, there are several others for which in silico models remain to be built, [78] for example, QT prolongation, [79] hepatotoxicity, [80] and phospholipidosis. [81] Such predictive ADMET models are commercially available and can be used in the early stages of drug discovery. [82] However, the models are still at an early stage, as the data they are based on are limited. [70]
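The remark earlier in this section that dose size and dose frequency can be estimated from bioavailability, clearance, and the distribution volume can be illustrated with the textbook steady-state relations of classical pharmacokinetics. The Python sketch below assumes linear, one-compartment kinetics, which may not hold for nanomaterials; the function names and all numbers are hypothetical.

```python
def maintenance_dose_mg(
    target_css_avg_mg_per_l: float,
    clearance_l_per_h: float,
    dosing_interval_h: float,
    bioavailability: float,
) -> float:
    """Dose per interval that gives the target average steady-state concentration.

    Uses Css,avg = F * dose / (CL * tau), rearranged for the dose.
    """
    if not 0.0 < bioavailability <= 1.0:
        raise ValueError("bioavailability must be in (0, 1]")
    return target_css_avg_mg_per_l * clearance_l_per_h * dosing_interval_h / bioavailability


def loading_dose_mg(
    target_conc_mg_per_l: float, distribution_volume_l: float, bioavailability: float
) -> float:
    """Single dose that fills the distribution volume to the target concentration."""
    if not 0.0 < bioavailability <= 1.0:
        raise ValueError("bioavailability must be in (0, 1]")
    return target_conc_mg_per_l * distribution_volume_l / bioavailability


# Hypothetical parameter values, for illustration only.
dose_per_interval = maintenance_dose_mg(2.0, 5.0, 12.0, 0.5)
initial_dose = loading_dose_mg(2.0, 40.0, 0.5)
```

Halving the dosing interval halves the dose per interval for the same average concentration, which is the trade-off between dose size and dose frequency.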

The Mechanistic Dosimetry Models for Inhalation Nanotoxicology
The Paracelsus adage "sola dosis facit venenum" (the dose makes the poison) makes dosimetry a strong force in nanotoxicology. [83] Why do we need a computational tool for establishing the relationship between dosimetry and risk assessment? First, it helps in the assessment of external exposure and biological response. Early dosimetry modeling complements the correlation of in vitro and inhalation studies. [84] When we do an experiment to decipher the toxicity of a particulate matter, all we get are the results; we do not really gain any insight into the mechanistic aspects of toxicology. [85] Being able to model the mechanisms gives us a better understanding of the processes and the product design. Further, modeling can be faster and cheaper, and can fill gaps that could not be addressed through experimental studies. [86] Experimental planning and execution are expensive and take a long time, at least 4-12 weeks for an in vivo test. Therefore, dosimetry modeling can be a great help when handling preliminary data and information. This information can subsequently be extrapolated to predict human and animal responses. Here, we describe ML tools and AI advances applied to the development of key models in inhalation nanotoxicology and nanomedicine.
The multiple-path particle dosimetry (MPPD) model is a quantitative computational tool to predict the deposition of inhaled nanoparticulates in the upper and lower respiratory tract of humans and several other mammalian species. [87] Dosimetry is part of the ADMET model and describes the administered amount of substance in a given uptake pathway, e.g., the uptake of nanomaterials by inhalation. An experimental determination of the dose available for uptake into the body, whether human or animal, is very difficult. A direct determination of the actual number of particles in the different lung areas is not possible, or possible only to a limited extent. Another parameter of interest is the variability of the species under investigation in terms of age and sex, which also matters for PBPK and genome-scale metabolic network (GSMN) models. Mathematical models developed by the National Council on Radiation Protection and Measurements (NCRP) and the International Commission on Radiological Protection (ICRP), or the MPPD model, are able to address these issues without animal or human testing. The real objective of computational dosimetry is to develop mechanistic models based on the physical and physiological parameters that govern transport within the respiratory tract. As a result, the local and regional deposition of particulate matter in the lung can be predicted. [88] With the MPPD model, the local and regional deposition and clearance of particulate matter in the lung are predicted. It is practically impossible to design an entire lung model, since we do not know every detail of all the elements of ventilation physiology and of the aerodynamics needed to assign the boundary conditions. [89] An alternative is to develop a dose-limiting model of the entire respiratory tract by making simplifying assumptions regarding the geometry, lung ventilation, and particle transport.
The MPPD engine models the entire breathing process, covering inhalation and exhalation, deposition, and clearance over respiratory cycles, and is applicable to inhalation toxicology and workplace exposure assessment (Figure 4). [90] If the exposure to an engineered, accidental, or man-made nanomaterial is known and the subsequent inhalation test is to be performed in an animal model, the MPPD user interface makes it easy to run the model quickly, which significantly reduces the time needed for the test. [91,92] The morphometry for different human age groups (adults vs children) and for different experimental animals (guinea pigs, rhesus monkeys, rats, mice, etc.) helps to define the input characteristics precisely. [93] The exposure characteristics (such as mass number), the particle characteristics (such as concentration, diameter, and size distribution), and the physiological fitting parameters (e.g., breathing frequency, tidal volume, and breathing route: nasal vs oral) give key information about regional tissue deposition and distribution. [91] The addition of more advanced features in inhalation methods helps greatly in extrapolating the modeling results to occupational exposure limits (OELs). [84] These computational models also offer a useful platform for interfacing with other physiological and mechanistic models such as PBPK and ADMET.
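How the exposure and breathing parameters combine into a deposited dose can be shown with a simple mass-balance estimate in Python. This is not the MPPD algorithm: the regional deposition fractions, which a mechanistic model would compute from particle size and lung geometry, are supplied here as hypothetical inputs, and all other values are invented for illustration.

```python
from typing import Dict


def inhaled_mass_ug(
    concentration_ug_per_m3: float,
    breathing_frequency_per_min: float,
    tidal_volume_l: float,
    exposure_duration_min: float,
) -> float:
    """Aerosol mass drawn into the respiratory tract during the exposure."""
    minute_ventilation_m3 = breathing_frequency_per_min * tidal_volume_l / 1000.0
    return concentration_ug_per_m3 * minute_ventilation_m3 * exposure_duration_min


def regional_deposited_dose_ug(
    inhaled_ug: float, deposition_fractions: Dict[str, float]
) -> Dict[str, float]:
    """Split the inhaled mass into regional deposited doses."""
    if any(f < 0.0 for f in deposition_fractions.values()):
        raise ValueError("deposition fractions must not be negative")
    if sum(deposition_fractions.values()) > 1.0:
        raise ValueError("deposition fractions must not sum to more than 1")
    return {region: inhaled_ug * f for region, f in deposition_fractions.items()}


# Hypothetical exposure scenario and deposition fractions, for illustration only.
inhaled = inhaled_mass_ug(100.0, 12.0, 0.5, 60.0)
assumed_fractions = {"head": 0.10, "tracheobronchial": 0.05, "pulmonary": 0.10}
deposited = regional_deposited_dose_ug(inhaled, assumed_fractions)
total_deposited = sum(deposited.values())
```

Replacing the breathing frequency and tidal volume with species- or age-specific values shows directly how strongly these physiological inputs scale the estimated dose.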

Decoding Nanotoxicology Related OMICS Profile with PBPK Models
The OMICS (proteomics, genomics, and metabolomics) datasets are often complex and heterogeneous. [94] Extracting meaningful data from this vast amount of information is by far the most important challenge for machine learning experts and bioinformaticians. In this context, there is increasing interest in the potential of deep learning (DL) and artificial neural network methods to create predictive trends and to identify complex patterns in these large datasets. PBPK models are mathematical models that aim to integrate the known information from existing data about the physiological processes of the particular species of interest. They further make it possible to add any other known attributes of a specific target compound in order to predict or simulate formal kinetic properties in vivo. [95] PBPK models are structurally quite similar to classic top-down two-compartment models. However, they use organs as the compartments, and the blood flow rates essentially determine the drug transport from one compartment to another. Because the organs are the compartments, PBPK models can be used to simulate concentration-time profiles in any tissue of interest (Figure 5). PBPK models have great extrapolation abilities from one species to another and from one population to another. [96] For the prediction model, the common parameters can be divided into two main groups. The first group comprises the species-specific physiological parameters, which include the organ volumes, surface areas, tissue compositions, blood flow rates, and protein abundances. The second group comprises the nanotherapeutic-specific parameters, mainly the distribution-related and disposition-related parameters. For extravascular absorption and permeability calculations, one also needs the metabolism, elimination, and absorption rates to predict accurately.
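The compartmental structure described above can be sketched as a small flow-limited PBPK model in Python, with a blood pool exchanging material with organs according to their blood flow rates and one organ clearing the substance. This is a minimal sketch under strong assumptions: perfusion-limited uptake (nanoparticle models often need permeability-limited terms instead), an intravenous bolus, explicit Euler integration, and entirely hypothetical organ parameters.

```python
from typing import Dict, NamedTuple


class Organ(NamedTuple):
    volume_l: float
    blood_flow_l_per_h: float
    partition_coefficient: float  # tissue:blood concentration ratio at equilibrium
    clearance_l_per_h: float = 0.0


def simulate_pbpk(
    organs: Dict[str, Organ],
    blood_volume_l: float,
    iv_dose_mg: float,
    t_end_h: float,
    dt_h: float = 0.001,
) -> Dict[str, float]:
    """Return the amounts (mg) in blood, each organ, and eliminated at t_end_h."""
    amounts = {name: 0.0 for name in organs}
    blood = iv_dose_mg
    eliminated = 0.0
    for _ in range(int(round(t_end_h / dt_h))):
        c_blood = blood / blood_volume_l
        net_uptake_total = 0.0
        for name, organ in organs.items():
            # Concentration in the blood leaving the organ (flow-limited assumption).
            c_venous = amounts[name] / (organ.volume_l * organ.partition_coefficient)
            net_uptake = organ.blood_flow_l_per_h * (c_blood - c_venous) * dt_h
            cleared = organ.clearance_l_per_h * c_venous * dt_h
            amounts[name] += net_uptake - cleared
            eliminated += cleared
            net_uptake_total += net_uptake
        blood -= net_uptake_total
    result = dict(amounts)
    result["blood"] = blood
    result["eliminated"] = eliminated
    return result


# Hypothetical organ parameters and dose, for illustration only.
toy_organs = {
    "liver": Organ(volume_l=1.8, blood_flow_l_per_h=90.0, partition_coefficient=2.0,
                   clearance_l_per_h=10.0),
    "rest_of_body": Organ(volume_l=60.0, blood_flow_l_per_h=200.0, partition_coefficient=1.0),
}
final_amounts = simulate_pbpk(toy_organs, blood_volume_l=5.0, iv_dose_mg=100.0, t_end_h=24.0)
```

Because every transfer is booked on both sides, the amounts in all compartments plus the eliminated amount add up to the dose, which is a useful sanity check when more organs or routes are added.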
PBPK-based toxicity models are a suitable tool to identify the toxicity mechanisms associated with different classes of nanomaterials. However, even if the target genes are located in a toxicity-related network pathway, such a model does not give more information about the causal mechanisms of regulation. [97,98] Therefore, it cannot predict whether a particular gene regulates other genes, nor can it reveal any other underlying mechanism. Hence, other types of ML approaches, including deep learning, random forests, k-nearest neighbors, and support vector machines, need to be applied to identify the relevant effector pathways.
In OMICS screening of nanomaterials, we start with a nanomaterial of known surface physicochemistry and identify which proteins or sites are perturbed or targeted by the specific nanoparticle interaction. [99] The PBPK model makes it possible to screen the transcription factors involved in the signaling pathway that regulate the expression of target genes related to nanotoxicity (e.g., cell death, proliferation, oxidative stress, and inflammation). [97] The versatility of machine learning algorithms helps to predict the absorption, distribution, metabolism, and excretion of different classes of nanomaterials in different organs via PBPK modeling. [100] A variety of modeling approaches aim at identifying nanotoxicity pathways. Table 1 examines the most commonly utilized machine learning models for the prediction of nanotoxicity along with their pros and cons.
Figure 5. Workflow representing the hierarchical generic structure of a whole-body PBPK model in connection with nanotherapeutics circulation to and from a brain tumor in vivo (Q, molecular or particle concentration transported in the blood flow before and after leaving the organs addressed in the PBPK model; CL, clearance of the liver and kidney).
In addition to in vivo biokinetics, PBPK modeling can be used to predict interspecies and population variance and to correlate in vitro data with in vivo data (Figure 6). Different pathway identification approaches, such as linear programming algorithms or other network-based ML approaches, [101] can identify the main pathway that regulates gene expression in a particular nanotoxicology scenario (Table 2).
We can leverage the wealth of omics data created for humans and other organisms, together with an ensemble of interaction networks between proteins, metabolic reactions, or pathways, to create the intermediate layer of the network. [102] Subsequently, we can acquire information on gene expression from genotoxicity assays and predict plausible mechanisms by comparing datasets that contain information about the transcription factors regulating the expression of different genes in such a scenario. [103] For the interactome data, one can use all the information available on protein interactions in different organisms (e.g., via the BLAST sequence alignment algorithm), [104] which also helps to understand the crystal structure information relevant to nanotherapeutics. Another example is the analysis of these data with algorithms such as Monte Carlo and Steiner tree approaches to identify the pathways and protein-protein interactions that connect the toxicity phenotype with the toxicity-associated genes. [105] Chemical biology will further help to identify the drug targets, which can be correlated with human genetic data to predict which proteins are associated with the toxicity phenotype. For instance, toxicological profiling of humans provides information on the expression of proteins in different tissues, from which protein interaction networks specific for liver, kidney, and lung can be obtained, provided that information on organ-specific cell types is available. [106] This core network could be fed into the PBPK model algorithm to further expand the scope of prediction, which is of utmost importance for nanotoxicology response analysis. A curiosity model should also be incorporated, in one way or another, when modeling NP-associated metabolism. Curiosity models or algorithms add a curiosity reward to the agent learning process, where curiosity is framed as the interest of an agent in learning unknown regularities.
The idea is to reward the system if it finds something unexpected. [107] A curiosity reward is added to the agent learning process, proportional to the difference between the expected result from a given input and the actual output. Such approaches have been successfully applied in cancer nanomedicine to address metabolomics [108] and could be very helpful in the perspective we describe herein. During the last decade, much work has been done on the development of genome-scale metabolic network (GSMN) models in the context of nanotoxicology. [97,109] These are used to model human metabolism at different levels, particularly at the cellular level for tissues and organs, depending on the model. GSMN models are used to predict human metabolomics by integrating organ- and tissue-specific genome-scale metabolic networks. Since vast information is present at the genome level, the implementation of GSMN models might deepen our mechanistic understanding of human metabolism. A few of the models are publicly available and are represented by a set of differential equations. [110] The large-scale models involve hundreds of equations to estimate the human metabolic reactions from the nanotoxicology perspective. Many publicly available models also give information about gene expression networks, protein conformation, and co-crystallization in the toxic environment, as implemented in contemporary protein data banks (PDB) (Figure 7).
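The curiosity reward described earlier in this paragraph can be sketched in a few lines: the agent keeps an expectation for each input, receives a bonus proportional to the gap between that expectation and the observed outcome, and then updates the expectation. The class name, the tabular expectation, and the scale and learning-rate parameters are illustrative assumptions rather than the formulation used in the cited works.

```python
class CuriosityModule:
    """Tabular curiosity bonus: reward = scale * |expected - observed|."""

    def __init__(self, scale: float = 1.0, learning_rate: float = 0.5):
        self.scale = scale
        self.learning_rate = learning_rate
        self.expected = {}  # input -> currently expected outcome

    def reward(self, observation, outcome: float) -> float:
        """Return the curiosity reward, then move the expectation toward the outcome."""
        predicted = self.expected.get(observation, 0.0)
        surprise = abs(outcome - predicted)
        self.expected[observation] = predicted + self.learning_rate * (outcome - predicted)
        return self.scale * surprise


def total_reward(extrinsic: float, curiosity: float) -> float:
    """Reward actually used for learning: task reward plus curiosity bonus."""
    return extrinsic + curiosity


if __name__ == "__main__":
    module = CuriosityModule(scale=2.0, learning_rate=0.5)
    # The same input-outcome pair becomes less surprising each time it is seen.
    for _ in range(3):
        print(module.reward("NP_dose_A", 1.0))
```

Because the bonus shrinks as a regularity is learned, the agent is pushed toward inputs whose outcomes it cannot yet predict, which is the behavior the text calls learning unknown regularities.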
PBPK models help in the development of nanocarrier formulations, in predicting dose-effect relationships, and in risk assessments of nanopharmaceuticals. [99,111] A draft of the eco guidance on the grouping of nanomaterials has been adopted for the assessment of these substances under REACH, the chemicals regulation of the European Union. The guideline helps to distinguish between the different kinds of properties that essentially describe the chemical and physical identity of a substance.

Challenges Ahead with ML and AI in Nanotoxicology Modeling
Just as there is a need for discussion to establish ethical guidelines for the operation of robots (roboethics), there must be similar discussions regarding AI and ML, including for nanotoxicology applications. [112] Guidelines must be established to provide indications and suggest principles that, at the opportune moment, will help us to cope with a given situation.
Recent developments in modeling ligand-protein interactions are opening new avenues for superior nanomaterial design, which needs ML approaches to optimize nanomedicine applications. This could be particularly promising for personalized cancer nanomedicine, since cancer cells have recently been shown to secrete proteins/peptides that can make advanced materials of biological origin. [113] One of the biggest challenges in nanomaterial library design is the scarcity of computational tools to predict ligand-metal interactions (number of reducing amphiphilic amino acid ligands and gold ion binding sites), which could help to predict the final shape and size of the nanostructure formed. The machine learning based hierarchical approach termed Iterative Threading ASSEmbly Refinement (i-TASSER) is able to predict the gold ion ligand binding sites in self-assembled short amino acid sequences (15-20 amino acids long). The i-TASSER approach could be adapted to design thin-platelet, nanoflower-like structures with superior plasmonic wave guidance for photothermal applications [114,115] (Figure 8).
Figure 6. Linking PBPK models interfacing relevant physiological organs and tissues including the drug-specific ADME processes with the genome-scale metabolic network. The latter describes the cellular biochemistry in the interstitial (extracellular) and intracellular (cytosol, mitochondria, peroxisome, etc.) space. Reproduced with permission. [97] Copyright 2018, Nature.
Model | Nanomaterial | Algorithm | Type of study | Ref.
… | … | … | In vivo rat model | [25]
… | Nanocarrier | … | In vitro | [118]
… | TiO2 | Monte Carlo simulation | In vivo dietary risk assessment | [119]
… | Carbon nanotube | Particle dosimetry for human alveolar deposition fraction | In silico | [90]
MPPD | Copper-containing aerosol/particles (nano- to microparticles) | Algorithm deciphering variability in mice (Balb/c and B6C3F1), Sprague-Dawley rats, male rhesus monkeys, sheep, and pigs | … | …
ANN | … | … | Unified in silico machine learning model | [119]
Nano-QSAR/QSPR | Functionalized iron oxide NPs | Linear regression | In silico | [121]
Nano-QSAR/QSPR | All categories of NPs | Artificial neural networks (ANN) | In silico | [121]
Nano-QSAR/QSPR | Metal oxide nanoparticles | Logistic linear regression with expectation minimization algorithm | In silico | [122]
Nano-QSAR/QSPR | C60 NPs | SVs | … | [123]
Nano-QSAR/QSPR | Small molecules and NPs | Bayesian | In silico | [124]
Numerical predictions | All categories of NPs, to predict the 24 h postfertilization (hpf) mortality | Numerical prediction Bayes, logistic regression, k-nearest neighbor (k-NN) | In vivo embryonic zebrafish | [50]
ADMET | All categories of NPs | k-Nearest neighbor (k-NN) | In silico vnn web server | [125]
ADMET | All categories of NPs with a single QSPR model | Support vector machine, decision tree, ANN, regression | In silico | [126]
ADMET | All categories of NPs | Combined algorithm | In silico | [126]
ADMET | PEGylated nanotherapeutics | Combined algorithm | … | [127]
i-TASSER | Short peptide-gold nanoflowers | Iterative threading and refinement | In vitro and in silico | [114]
i-TASSER | Anisotropic SnO2 NPs | Iterative threading and refinement | In vitro and in silico | [128]
Quantitative feature-activity relationships (QFAR) | ZnO, CuO, Co3O4, and TiO2 | CORAL | In silico | [129]

Predictive efficacy of any mathematical model depends on simplifying assumptions and on at least a compromise between the physiology and physics of the input labels. In dosimetry modeling for inhalation toxicology, multiscale airway dimensions and complicated airway morphometry (straight tube vs dichotomous branching) of the lower respiratory tract (LRT) make conditions challenging. The LRT is basically a tube-within-tube system, and implementing those features is a herculean task in dosimetry and allied regenerative nanomedicine models. [116] Further, the physics and aerodynamics of physiological lung ventilation are not fully understood, which makes them difficult to implement in dosimetry. For example, a parabolic flow velocity profile assuming a low Reynolds number, and convective versus resistive forces for air transport treated as independent events, are only some of the fundamental assumptions that need more concerted evidence. [117] When estimating nanoparticulate versus microparticulate transport and deposition, averaging cross sections as 1D models may compromise predictive power and needs further improvement. High data requirements and code tuning could be complicated for users, given the variability of applications in nanomedicine. The current algorithms have specific challenges in implementing each kind of modeling (Figure 9). Many correlated parameters have been implemented under different ML toolboxes in each kind of modeling. Common standard operating procedures (SOPs) that can be adopted across all models are needed in order to use the best predictive ML toolbox.
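The low-Reynolds-number, parabolic-flow assumption mentioned above can be checked with two standard relations: the Reynolds number Re = ρvD/μ and the Poiseuille profile u(r) = 2·u_mean·(1 − (r/R)²) for fully developed laminar flow in a straight tube. The sketch below uses approximate room-temperature properties of air. The airway diameter and velocity in the usage example are illustrative assumptions, not morphometric data from the cited models.

```python
AIR_DENSITY_KG_M3 = 1.2      # approximate density of air near room temperature
AIR_VISCOSITY_PA_S = 1.8e-5  # approximate dynamic viscosity of air


def reynolds_number(mean_velocity_m_s: float, diameter_m: float,
                    density: float = AIR_DENSITY_KG_M3,
                    viscosity: float = AIR_VISCOSITY_PA_S) -> float:
    """Re = density * mean velocity * tube diameter / dynamic viscosity."""
    return density * mean_velocity_m_s * diameter_m / viscosity


def parabolic_velocity(r_m: float, radius_m: float, mean_velocity_m_s: float) -> float:
    """Poiseuille profile: velocity at radial position r in a straight tube."""
    if abs(r_m) > radius_m:
        raise ValueError("r must lie inside the tube")
    return 2.0 * mean_velocity_m_s * (1.0 - (r_m / radius_m) ** 2)


if __name__ == "__main__":
    # Hypothetical large airway: 18 mm diameter, 1 m/s mean air velocity.
    re = reynolds_number(1.0, 0.018)
    print(f"Re = {re:.0f}")
    print("centerline velocity:", parabolic_velocity(0.0, 0.009, 1.0), "m/s")
```

Pipe flow is commonly treated as laminar only below a Reynolds number of roughly 2000, so a check of this kind indicates in which airway generations and at which breathing rates the parabolic profile is a defensible simplification.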

Future and Outlook: Green Algorithms, Ethics, and Prospects of Remote Monitoring of Nanomaterials for Public Safety
(Figure caption, beginning truncated) … and OMICS data on the organism level. A PBPK model is developed and validated with human PK data. A human organ-specific genome-scale metabolic network (GSMN) model is reconstructed with the help of OMICS data. Middle panel: A hybrid PBPK-GSMN model is developed to estimate the in vivo organ-specific drug metabolism as time-resolved reaction rates including the absorption, distribution, metabolism, and excretion processes. Lower panel: Prediction of cellular responses with the combined multiscale PBPK-GSMN model to predict organ-specific drug-induced metabolic perturbations, resulting in altered intracellular and extracellular reaction rates. The combined model allows the explicit consideration of specific dosing schemes, patient physiology, and genetic characteristics (abbreviation: dMOMA, dynamic version of the minimization of metabolic adjustment algorithm). Reproduced with permission. [97] Copyright 2018, Nature.
With respect to the application of the 3R principles (replacement, refinement, and reduction of animal experiments) during implementation of the EU REACH regulation, recent algorithms have shown higher accuracy than animal test prediction. They have shown promising potential to reduce animal testing during the development of future nanomedicine. [130] The algorithms in focus can generate a chemical map that contains the putative toxicology ingredients of thousands of compounds from databases with superior predictability. The algorithm can compare and replace an unknown chemical moiety within the map using thousands of databases available in nanochemistry, predicting its potential side effects and toxicity. The algorithm, named after its discoverer Hartung, [131] is an improvement on the read-across methodology in toxicology and has been hailed as software that could replace animal tests. [132] Both the EU and the USA make it mandatory to perform health-based risk assessments prior to the sale of consumer products containing nanomaterials. Unlike most Asian countries and other countries including Norway and Israel, where animal testing of cosmetic and allied products with nanomaterials is feasible, the EU has set a high priority on the development of alternatives to animal tests. The read-across algorithm and recent ML tools such as the Hartung algorithm will greatly support big data analysis from animal alternatives, organs-on-chip, and 3D culture models in vitro. [133] These methods are gaining an ethical voice as insightful alternatives to animal tests, supporting the 3R principle. Europe's REACH regulation, an ambitious project to ensure the safety of chemicals, has largely benefitted from the support of ML tools. [134] Further, because ML/AI helps to accomplish more in less time, it will shorten the duration of experimental tasks and analysis lags in nanomedicine. The functional aspects and biological responses of individuals to particular toxicogenic stimuli differ. AI gives an edge in resolving these differences, since "one size fits all" does not apply when reading the output of modern modeling tools.
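The read-across idea referred to above, predicting the toxicity of an untested material from its most similar tested neighbors, can be sketched generically as a similarity-weighted average. This is not the algorithm of refs. [131,132]; it is a minimal illustration that assumes each material is described by a set of categorical features and compared with the Tanimoto (Jaccard) coefficient. Feature names and toxicity labels are hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def read_across(query: set, reference: list, k: int = 3, min_similarity: float = 0.0):
    """Similarity-weighted mean toxicity of the k most similar reference materials.

    reference is a list of (feature_set, toxicity_score) pairs. Neighbors whose
    similarity does not exceed min_similarity are ignored; None is returned
    when no usable neighbor remains.
    """
    scored = sorted(((tanimoto(query, features), toxicity)
                     for features, toxicity in reference),
                    key=lambda pair: pair[0], reverse=True)[:k]
    scored = [(s, t) for s, t in scored if s > min_similarity]
    if not scored:
        return None
    return sum(s * t for s, t in scored) / sum(s for s, _ in scored)


if __name__ == "__main__":
    # Hypothetical reference set: 1.0 = toxic, 0.0 = non-toxic.
    reference = [
        ({"metal_oxide", "zn", "soluble"}, 1.0),
        ({"metal_oxide", "ti", "insoluble"}, 0.0),
        ({"carbon", "tube"}, 0.0),
    ]
    print(read_across({"metal_oxide", "zn", "soluble"}, reference))
```

In practice the feature sets would be replaced by curated physicochemical descriptors, and the similarity threshold would define the applicability domain outside which no prediction is returned.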
Despite these ML advances, a long road lies ahead before animal tests can be completely replaced by computational intelligence alone, since this will be scientifically a hard pill to swallow. However, riding on advances in ML toolboxes and AI software that are more accurate than animal testing in predicting nanotoxicity, emphasis is now shifting toward a sustainable nanomedicine expansion as "green nanotoxicology." [135] Like telemedicine, we suggest that the remote monitoring of nanomaterial-containing consumer products will further win public confidence and bring transparency to biosafety measures. AI and machine learning will be at the heart of such technology, complemented by advances in nano-biosensors, which may make it possible to institute "cradle-to-grave" monitoring of nano- and other advanced material-enabled products. With a unique universal material ID (UID), and potentially also the incorporation of a trustless, blockchain-based ledger system, future nanomaterials could be tracked from their production through deployment into the consumer or industrial market to eventual end-of-life recycling and safe disposal; all information about the safety and handling requirements of each enhanced material could be easily available through a simple scan with a handheld device. Recent developments in breathing/air quality monitoring sensors, which measure gases and volatile compounds, and in laser sensors, which measure particulates in the surroundings, can be used to scan consumer products (e.g., textiles, toys, food and food contact materials, and educational/recreational kits containing nanomaterials). In complex cases, the information collected by the scan can be sent wirelessly to local regulatory experts for interpretation and feedback (Figure 10). [136] The wireless communication between "fitness monitoring" wearable e-gadgets and smartphones with sophisticated AI/ML-based analysis apps could be emulated for nanomedicine and for monitoring the biosafety of nanomaterials. This will also fill the communication gap between regulatory bodies and the public. [137] As shown in Figure 11, expanding the AI toolbox will help to connect and benefit interdisciplinary research in nanotoxicology. For example, nanotheranostics (nanocarriers and nanosensors) and food nanotechnology are less explored in the context of newly introduced materials and the possible health risks associated with exposure to them, particularly the leaching of nanomaterials from coatings. Machine learning algorithms can be applied to predict the different unknown nodes of material properties and composition (grey and cyan boxes in Figure 11) and to connect them with the behavior in the surrounding media (teal box) in order to predict the toxic response (purple). Rapidly computed empirical and ab initio models (e.g., i-TASSER and ab initio modeling of proteins for molecular replacement (AMPLE) as iterative-and-truncated approaches) could be key to solving the physicochemical property conundrum of the hidden layers of toxicology.
(Figure caption, beginning truncated) … iterative threading assembly refinement (i-TASSER) algorithm for the secondary structure predictions of a short peptide in the presence of gold ions (green dots), docking as ligand into the peptide to produce nanoclusters. Reproduced with permission. [114] Copyright 2018, American Chemical Society.
Figure 9. A futuristic hybrid multimodal system linking in vitro models. The formation of nanomaterials (NM) is based on DFT calculations including I) the ligand-associated material synthesis, II) structural information such as size, shape, and composition, and III) the interaction and adsorption of molecules or proteins at the surface of the created NM. Relevant physical-chemical and biological properties (solubility, surface charge, partition coefficient, etc.) are predicted by nano quantitative structure-activity relationships (nano-QSAR). The synthesized NM is investigated with respect to administration, absorption, distribution, metabolism, excretion, and toxicity (ADMET) in the human body over the material or human lifespan after multiple administrations. Physiologically based pharmacokinetic (PBPK) models are used to predict organ-specific and whole-body absorption and distribution. By combining PBPK models with metabolic networks, the NM metabolism can be investigated for all organs implemented in the human body model. By using optimized models, the excretion of the administered nanomaterials can be calculated if necessary. Additional outputs of the multiscale PBPK-metabolic network modeling route are the toxicity and the estimated input parameters for NM administration. All models investigated with multiple input parameters are supported by machine learning (ML) approaches for faster and more reliable prediction.
Figure 10. A multimodal computational systems toxicology platform for remote monitoring of nanomaterial-containing consumer products.
Figure 11. The machine learning toolbox for artificial intelligence experts to complement and predict the interdisciplinary connections among nanoscale materials modeling. The blank boxes represent the prediction of different structures (i.e., synthesis, composition, interaction with ligands, etc.) and material properties (i.e., density, solubility, zeta potential, etc.), the interaction of "synthesized" nanomaterials with the environment, such as proteins, and the resulting possible toxic effects.