Artificial Intelligence and Machine Learning Empower Advanced Biomedical Material Design to Toxicity Prediction

Materials at the nanoscale exhibit specific physicochemical interactions with their environment. Evaluating their toxic potential is therefore a primary requirement for regulatory purposes and for the safer development of nanomedicines. To aid the understanding of nano–bio interactions from environmental, health, and safety perspectives, this review describes the potential, reality, challenges, and future advances that artificial intelligence (AI) and machine learning (ML) present. We focus on AI and ML algorithms that assist in reporting the minimum information required for biomaterial characterization and that aid in developing and establishing standard operating procedures. We also focus on ML tools and ab initio simulations adopted to improve the reproducibility of data for robust quantitative comparisons and to facilitate in silico modeling and meta‐analyses, which contribute substantially to safe‐by‐design development in nanotoxicology and nanomedicine. In addition, future opportunities and challenges in the application of ML in nanoinformatics, which is particularly well‐suited for the clinical translation of nanotherapeutics, are highlighted. We believe that this comprehensive review will promote an unprecedented involvement of AI research in improving the fields of nanotoxicology and nanomedicine.


Introduction
The rapid development of knowledge in the field of advanced materials and nanomaterials has fueled a discussion on the best means to develop this emerging technology both safely and sustainably, without limiting the incredible potential benefits that these advancements bring about in material design and formulation. [1] One of the first difficulties encountered in this domain pertains to how we organize and utilize the massive volume of information that is being produced, in relation to the performance and environmental and health and safety (EHS) implications of these nanoscale materials. Nanotechnology, machine learning (ML), and artificial intelligence (AI) are a few leading technologies in this domain; although ML and AI have recently surpassed nanotechnology in popularity, they have largely complemented each other. [2] We have been conditioned to expect the development of AI in a wide range of applications such as in flying drones for home delivery, traffic routing, and small-scale robotic assistance in performing daily chores. We are probably interacting with AI more than we realize due to a prominent upsurge in the use of AI in electronic gadgets and digital media, and with AI grabbing the attention of the consumer industry. [3] Technology/gadgets using ML are so common that often we do not realize that computers are outperforming humans in terms of efficacy. However, this also raises an alarm and clarifies our understanding of what AI is capable of and how it can complicate our future. In simple terms, AI is a broad area of computer science that attempts to impart to machines human-like intelligence to learn and perform the given tasks. [4] In 1956, John McCarthy, a Dartmouth professor, first coined the term "Artificial Intelligence" when he observed that machines can solve problems such as understanding language semantics and forming abstractions and concepts, which were thought to be limited to humans. 
[5] McCarthy, along with a group of computer scientists and mathematicians, demonstrated that machines are capable of formal reasoning using trial and error, thus paving the way for a new era of AI over 60 years ago. Since then, AI has mostly remained limited to the Internet, university classrooms, and exclusive labs. The timeline of advances in computer programming indicates that a wealth of applications has been created, along with uncertainties in different areas of life (Figure 1). [6] AI and ML are growing exponentially and may soon become ubiquitous. [7] Over the past few years, two factors have led to the skyrocketing of AI worldwide: data availability and faster processing capacity. [8] The amount of data being generated is growing exponentially, as seen from the fact that 90% of the data globally has been generated over the past 2 years alone. [9] With high processing speeds, computers can process all of this information more quickly and effectively, thus steadily rendering AI more real than artificial, and significantly more intelligent. In this review, we aim to address the developments in ML implemented in theoretical approaches and simulations used in characterizing nanoscale materials over the last decade. With AI incorporated at its core, the use of ML has reached an all-time high. We review ML algorithms, which are continually being applied in new areas based on the widely distributed branches of AI, for classifying the diverse properties of nanomaterials, as well as correlation, validation, and grouping algorithms (Figure 2). [10] In Tables 1 and 2, we list the principal ML tools adopted for both in vivo and in vitro nanotoxicity analyses of the different classes of nanomaterials, based on a meta-analysis and an analysis of collected data. 
Many of these algorithms help with the classification (e.g., logistic regression and support vector machines) and statistical regression analysis (e.g., decision trees and random forests, which are used in both classification and regression) of nanomaterials based on their categorical or continuous numerical characteristics such as pulmonary toxicity, cell-specific targeting, and nanomaterial grouping in the case of nanotoxicology.

Standard Information Reporting in Nanomedicine and Nanotoxicology
Although various ML tools are already being used for nanotoxicity analyses, the comparison or correlation of various studies has only been possible to a limited extent. To date, nanotoxicological research has lacked standardization in characterizing and understanding the interactions between nanomaterials and the surrounding biological media, leading to a high degree of variability in the published literature. [62] It is time for the nanotoxicology community to adopt ML- and AI-oriented "reporting standards" to enhance the quality and reusability of published research. In this section, we briefly review how ML and AI help in interpreting the collected data to predict safer cell-particle interactions, which has been a major bottleneck in the fields of nanotoxicology and nanomedicine.
Figure 1. Timeline of AI and ML in nanomaterial development. Evolution timeline for both the development of nanoparticles (NPs), starting after the first synthesis and quantum effects as observed in 1853 by Faraday, and AI including statistical approaches. In 2010, both timelines merged when AI was applied in tasks such as the identification of NP properties or interaction partners, the grouping of NPs depending on their properties or toxic effects, and the prediction of NP toxicity.
The minimum information reporting in bio-nano experimental literature (MIRIBEL) standard has recently been proposed for published accounts of bio-nano research, covering the detailed reporting of biomaterial characterization and standard operating procedures (SOPs) in the development of experimental protocols (Figure 3). [63] The data to be published are divided into three groups. The first group (Figure 3, left panel) describes the important parameters for the characterization of the NPs to be used, such as the size, zeta potential, composition, material density, aggregation behavior, labeling used, and possible drug loading. All of these parameters can influence the interaction with the biological environment and thus have an impact on the toxicity of the NPs. The second group (Figure 3, middle panel) describes the basic biological characteristics of the selected biological model (cells, tissues, or animals) on which the toxicological study is to be conducted. For cell culture experiments, details of the appearance of the cells, the general cell characteristics, and toxicological studies (e.g., determination of the half-maximal effective concentration, EC50) should be stated. It is also important to characterize the biofluids used in in vitro and in vivo experiments, and to characterize the NPs in these biofluids, because the behavior of the NPs in the biofluids has a decisive effect on their uptake by the biological model and thus on the toxicological effects. The third group (Figure 3, right panel) describes the details of the experiment performed, such as the dimensions of the cell culture, the dose administered, the image and signal details of the cells with and without NPs, and the details of the data analysis. [64] Differences in any of the three groups, or missing information, even if seemingly insignificant, can lead to different results. An evaluation using ML methods is therefore extremely difficult due to the high variability of the information. 
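The three reporting groups described above lend themselves to a machine-checkable record. The following is a minimal sketch, assuming a simple in-house data model: the class names, field names, and units are hypothetical illustrations of the parameters listed in the text and are not part of the MIRIBEL proposal itself.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class MaterialCharacterization:
    """Group 1: NP parameters named in the text (field names are hypothetical)."""
    size_nm: Optional[float] = None
    zeta_potential_mv: Optional[float] = None
    composition: Optional[str] = None
    density_g_cm3: Optional[float] = None
    aggregation_behavior: Optional[str] = None
    labeling: Optional[str] = None
    drug_loading: Optional[str] = None


@dataclass
class BiologicalCharacterization:
    """Group 2: description of the biological model (hypothetical fields)."""
    model: Optional[str] = None  # cells, tissue, or animal
    cell_characteristics: Optional[str] = None
    ec50: Optional[float] = None
    biofluid: Optional[str] = None


@dataclass
class ExperimentalDetails:
    """Group 3: how the experiment was performed (hypothetical fields)."""
    culture_dimensions: Optional[str] = None
    dose: Optional[str] = None
    imaging_details: Optional[str] = None
    data_analysis: Optional[str] = None


def missing_fields(record) -> List[str]:
    """Names of all fields that were left unreported (still None)."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


def completeness(*groups) -> float:
    """Fraction of reported fields across all supplied groups."""
    total = sum(len(fields(g)) for g in groups)
    missing = sum(len(missing_fields(g)) for g in groups)
    return (total - missing) / total


# Illustrative, made-up record: only part of group 1 has been reported.
material = MaterialCharacterization(size_nm=50.0, zeta_potential_mv=-30.0,
                                    composition="SiO2")
biology = BiologicalCharacterization(model="HeLa cells",
                                     cell_characteristics="adherent",
                                     ec50=12.5, biofluid="DMEM + 10% FBS")
experiment = ExperimentalDetails()

print("Unreported material fields:", missing_fields(material))
print("Overall completeness:", round(completeness(material, biology, experiment), 2))
```

A check of this kind could be run before a dataset is admitted to an ML training set, so that records with missing descriptors are flagged rather than silently mixed with complete ones.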
Standardization of the information presented by the research community in journals and corresponding platforms can mitigate this fundamental problem, thereby increasing the reusability, quality, and distinctiveness of the generated data. In addition, the MIRIBEL list can be used to generate quantifiable data from qualitative data, which additionally enhances the meaningfulness and robustness of the data. [64] The aforementioned problems of inhomogeneous or incomplete datasets and the harmonization of resulting datasets for risk assessment are also addressed by the EU US Roadmap Nanoinformatics 2030. [65] Furthermore, although many nanomaterials have been proposed as drug carriers and for targeted therapies, there are no protocols or standard probes for analyzing the toxicity data of the carrier material. Under this scenario, the assessment of the possible toxic effects of the nanomaterials used is extremely important. ML and AI can aid in the safe manufacturing and efficient production of nanomaterials in the future. [66] Based on the existing literature, toxic effects can be detected for different materials using classification or cluster methods (see Table 1). If the materials used in biological models are sufficiently investigated, the interaction of the nanomaterials with the environment can also be studied and predicted accordingly using atomistic or quantitative structure-activity relationship (QSAR) models. Based on the results obtained, the atomistic or QSAR models can also be used to predict the properties of new nanomaterials. Although this procedure is complex, it is necessary to train ML models. There is no existing alternative approach for a reliable prediction of all necessary information in a single model (or an ensemble model) regarding the nanomaterial properties, the interaction with biological media, and finally, the level of toxicity. The first concept of such a model combination can be found in recent studies. 
[67]

Table 1.
Predictive algorithm | Nanomaterial type | Application area in nanotoxicology | Reference
CORAL | Metal oxide NPs | Cell viability assay | [12]
CORAL | ZnO, CuO, Co3O4, and TiO2 | Quantitative feature-activity relationships (QFARs) | [13]
CORAL | SiO2 NPs | Percent cellular viability (CV%) prediction | [14]
Random forest (RF) | CNT pulmonary toxicity | Physical dimensions and impurities affect the toxicity | [15]
Random forest (RF) | CNT impurity toxicity | NP characteristic interaction effects on pulmonary toxicity | [16]
Random forest (RF) | Metal oxide NPs | Multiple toxicity endpoints of nanomaterial effect | [17,18]
Random forest (RF) | Silica, TiO2, Mn2O3, Cu-phthalocyanine blue, and Cu-phthalocyanine green | Nanomaterial grouping | [19]
Artificial neural network (ANN) | Nano-sized metal oxides | Physicochemical effect prediction to cytotoxicity | [20]
Artificial neural network (ANN) | All categories of NPs | QSAR | [21]
Artificial neural network (ANN) | Polymethyl methacrylate (PMMA), silica, and polylactic acid (PLA) | In vitro NP-cell interactions | [22]
Mixed naive Bayes, sequential minimal optimization (SMO), J48, bagging, locally weighted learning (LWL), decision | Q-dots and FeOx NPs | Cellular uptake of cross-linked iron oxide NPs | [25]
Mixed naive Bayes, sequential minimal optimization (SMO), J48, bagging, locally weighted learning (LWL), decision | Target specificity of NPs | Nanoinformatics prediction | [18,26]
Logistic linear regression with an expectation maximization algorithm | NPs in the printing industry | Pulmonary toxicity | [27]
Apriori algorithm | Nanoparticulate aerosol | Systems toxicology meta-analysis | [28]
Apriori algorithm | Conductive metal NPs | SEM analytic tool | [29]
Decision tree | Poly amido amine dendrimers | Cytotoxicity, prediction as cell viability considered as a binary variable, toxic/nontoxic NPs in human colorectal cancer cells | [23]
Decision tree | 21 different NPs | Classify nanomaterials | [30]
Decision tree | Engineered nanomaterials (ENMs) | Ecotoxicity of ENMs | [31]
k-nearest neighbors | Q-dots and FeOx NPs | Cellular uptake of cross-linked iron oxide NPs | [25,32]

Theoretical and Computational Ab Initio Tools to Address Safer Bio-Nano Materials
We know experimentally that physical parameters such as the size, shape, surface functionalization, and/or physicochemical composition influence the properties of nanomaterials. 
[69] These dependencies are also intrinsically linked and must be considered before we make predictions about possible risks. [70,71] Although an understanding of the potential risks and hazards associated with nanomaterials and NPs can certainly benefit from further experiments, theoretical and computational studies can also contribute through the application of relevant calculations in this area. [72] The physical characteristics that give NPs an advantage in many applications are also the ones that increase the possible risk and toxicity. [70,73]

Table 2. Principal ML tools for adopting the rigorous method of selecting, evaluating, and synthesizing all available evidence for a nanotoxicity analysis of different classes of nanomaterials and their example application areas.
Predictive algorithm | Nanomaterial type | Application area in nanotoxicology | Reference
Bayesian network | Cross-linked iron oxide (CLIO) NPs | Cell-specific targeting | [33,34]
Bayesian network | Small molecules and NPs | Nanomaterial-QSAR (Nano-QSAR) | [34,35]
Bayesian network | Organic, inorganic, and carbon-based NPs | 24 h postfertilization (hpf) effect on zebrafish model | [36,37]
CORAL | All categories of nanomaterial cytotoxicity modeling | External leave-one-out cross-validation (LOO) for approach verification | [38]
Random forest (RF) | Soil NPs pH | Nanotoxicology prediction in agro-ecosystems | [39]
Linear regression (LR) | Sizes of the anatase TiO2 NPs on ROS production | ROS correlations between in vitro and in vivo data | [40]
Linear regression (LR) | TiO2 and ZnO | Predictive optimization of experimental conditions | [41]
Linear regression (LR) | Organo-coated silver NPs | Mechanistic ecotoxicity | [41]
Linear regression (LR) | CeO2 NPs induce DNA damage | Genotoxic dosimetry | [42]
Linear regression (LR) | Nano drug mimic | Nano drug-specific protein expression | [43]
Linear regression (LR) | FeOx NPs with 108 different functionalization protocols | To build the nano-QSAR model | [44]
Linear regression (LR) | Estimating ultrafine particle number concentrations (PNCs) | Urban ecotoxicology | [45]
Artificial neural network (ANN) | Chitosan/streptokinase NPs | Cytotoxicity as a function of polymer concentration, pH of a solution, and stirring time | [46]
Artificial neural network (ANN) | Polyethylene glycol (PEG)/PLA NPs doxorubicin release from polymeric micelles | Entrapment efficiency prediction polymers | [47]
Artificial neural network (ANN) | PEG-chitosan NPs | Prediction of cell adhesion | [48]
Artificial neural network (ANN) | Polystyrene fluorescent NP | To predict quantity adhering to the vessel walls as a function of wall shear rate and NP diameter | [49]
Artificial neural network (ANN) | Poly lactic-co-glycolic acid NPs | To predict the release of macromolecules | [50]
IBK, bagging, M5P, and k-star | Multiple NPs | To predict embryonic zebrafish postfertilization toxic effect | [36]
Support vector | C60 NPs | Nano-QSAR/QSPR | [51]
Support vector | Cobalt-ferrite NPs (Co-Fe NPs) | Cytotoxicity as a binary value | [52]
Logistic linear regression with an expectation maximization algorithm | MWCNTs | In vivo modeling in embryonic zebrafish | [53]
Logistic linear regression with an expectation maximization algorithm | Colloidal NPs | Adverse outcome pathways (AOPs) | [54]
Logistic linear regression with an expectation maximization algorithm | CaCO3 NPs | Pulmonary hypofunction | [55]
Logistic linear regression with an expectation maximization algorithm | Metal oxide NPs | Nano-QSAR | [56]
Logistic linear regression with an expectation maximization algorithm | C60 fullerene | Pulmonary toxic effects | [57]
Apriori algorithm | Pristine C60 fullerene | Predicting hemocompatibility | [58]
k-nearest neighbors | PEG/PLA NPs | Microarray gene expression analysis and clinical outcome prediction | [47]
k-nearest neighbors | All categories of NPs, to predict the 24 h postfertilization (hpf) effect | In vivo embryonic zebrafish | [36]
K-means clustering algorithm | Au-NPs | Quantitative resolution of NP size in aqueous matrices at environmentally relevant concentrations | [59]
K-means clustering algorithm | Protein NPs | Local protein sequence motifs representing common structural property | [59]
i-TASSER | Short peptide-gold nanoflowers | In silico modeling for hyperthermia | [60]
i-TASSER | SnO2 NPs | Biomineralization efficiency | [61]

Figure 3. Summary of MIRIBEL components, guiding principles, and potential benefits. The development of MIRIBEL was guided by the principles of reusability, quantification, practicality, and quality. The information to be reported is divided into three groups. I. The important parameters for the characterization of the NPs to be used (e.g., size, zeta potential, composition, material density, aggregation behavior, labeling used, and possible drug loading) (left panel). II. The basic characteristics of the selected biological model (cells, tissues, or animals) on which the toxicological study is to be conducted (middle panel). III. The experimental details (i.e., dimensions of the cell culture, the dose administered, image and signal details of cells with and without NPs, and details of data analysis) (right panel). If combined with journal and community adoption (lower panels), MIRIBEL can lead to improved outcomes in the field, including data exchange and communication, reproducibility, a deeper analysis of the published data, and a systematic comparison between approaches and materials. Reproduced with permission. [63] Copyright 2018, Springer Nature.

In addition, the toxicology of different nanomaterials is being studied alongside the development of new nanotechnology applications. [74] Numerous studies summarizing the existing knowledge and highlighting areas that require focused attention have mainly relied on experimental approaches to solve this problem. [75,76] Computational methods have an advantage because, unlike in experimental methods, each of the critical parameters can be individually controlled such that the underlying mechanisms responsible for variations in the nano-bio interactions can be identified. A methodological approach can be followed to create the required virtual environment for the nanostructures within a short period of time. Such methods and virtual spaces can be used to investigate the interaction of nanostructures in extremely complex environments, which would otherwise not be possible experimentally. A pertinent example from the literature is that of titanium dioxide (TiO2), where the properties of nanomaterials can be optimized using density functional theory (DFT) calculations to predict the crystal structure related to the photocatalytic effects. [78] This mineral, also known as titania, occurs in three forms: rutile, anatase, and brookite. There have been concerns regarding the toxicity of the photocatalytic anatase form because the surface of these NPs can produce a greater amount of reactive oxygen species (ROS), which can lead to oxidative stress in exposed organisms. The (001) surface of the anatase form was reported to be particularly reactive, although the relative fraction of such a surface found in different samples was reported to depend on the temperature and chemical environment. [79] Therefore, studying its toxicity would traditionally require setting up a significant number of tightly controlled experiments. 
By contrast, DFT calculations can be used to predict the number of reactive sites on the surface of polyhedral particles as a function of temperature and NP size, [79,80] thereby saving precious time, money, and effort. Through the knowledge gained from theory and simulations, we can build predictive models that will help in building nanomaterial interactions within the biochemical environment in question that are both effective and nontoxic. A similar concept of the interaction of a particular nanomaterial with biological molecules is shown in Figure 4. As indicated, the simulation starts with the NP properties determined experimentally or through simulations (DFT, QSAR) as input parameter vectors and uses them to train a set of features against the target functions based on the NP-biomolecule interaction properties. As a result, the interaction between NPs and biomolecules can be predicted. With the speed and sophistication of established models and computational resources, algorithms can be developed to assess the toxicity of nanomaterials in a variety of natural environments, which will help to avoid potential toxic hazards in nanomedicine. In Table 3, the freely available theoretical and computational tools available for the analysis are presented along with their corresponding links.
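The workflow described for Figure 4 (NP property vectors as input, a target describing the NP-biomolecule interaction, and a trained model that predicts the interaction for a new NP) can be sketched with a very small k-nearest-neighbors classifier, one of the algorithms listed in Tables 1 and 2. This is an illustrative sketch only: the two features (diameter and zeta potential), the labels, and all numerical values are made up for the example and are not taken from any cited study.

```python
import math
from collections import Counter


def fit_scaler(rows):
    """Return per-feature means and standard deviations of the training rows."""
    columns = list(zip(*rows))
    means = [sum(c) / len(c) for c in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0  # guard against zero spread
        for c, m in zip(columns, means)
    ]
    return means, stds


def scale(row, means, stds):
    """Standardize one feature vector so that no single unit dominates the distance."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to the query."""
    nearest = sorted(zip((math.dist(x, query) for x in train_x), train_y))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical NP descriptors: [diameter in nm, zeta potential in mV].
features = [[20.0, -35.0], [25.0, -30.0], [30.0, -28.0],
            [80.0, 5.0], [90.0, 10.0], [100.0, 12.0]]
# Hypothetical target: strength of protein adsorption observed for each NP.
labels = ["low", "low", "low", "high", "high", "high"]

means, stds = fit_scaler(features)
train = [scale(row, means, stds) for row in features]

new_particle = scale([85.0, 8.0], means, stds)
print("Predicted adsorption class:", knn_predict(train, labels, new_particle))
```

In practice, the feature vectors would come from experiments or from DFT/QSAR calculations, and the model would be validated on held-out data before being used for prediction.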

Nanodescriptors Characterizing the Surface Competitive Adsorption Index of the Nanomaterial and NP-Protein Corona
In a physiological environment, NPs selectively adsorb proteins to form an NP-protein corona, [81,82] as shown in Figure 5. The formation of the protein corona as a monolayer reduces the surface energy of the NPs and simultaneously increases their bioavailability. The intrinsically large surface area of nanomaterials results in the preferential adsorption of chemicals or biomolecules, which reduces their surface energy. [83] The adsorption of proteins depends on the properties of the nanomaterial [84] (e.g., size, shape, composition, and pH), particularly the energetic state of the nanomaterial surface.
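To make the idea of monolayer corona formation concrete, the sketch below treats adsorption and desorption of a single protein type with attachment rate k_on and detachment rate k_off. It assumes simple Langmuir-type kinetics for the fractional surface coverage, which is a modeling assumption introduced here for illustration and is not a model taken from the text; the rate constants and concentration are arbitrary.

```python
def equilibrium_coverage(k_on: float, k_off: float, conc: float) -> float:
    """Steady-state fraction of the NP surface covered (Langmuir assumption)."""
    return k_on * conc / (k_on * conc + k_off)


def simulate_coverage(k_on: float, k_off: float, conc: float,
                      t_end: float, dt: float = 1e-3, theta0: float = 0.0) -> float:
    """Integrate d(theta)/dt = k_on*conc*(1 - theta) - k_off*theta with explicit Euler."""
    theta = theta0
    for _ in range(int(round(t_end / dt))):
        theta += dt * (k_on * conc * (1.0 - theta) - k_off * theta)
    return theta


# Arbitrary illustrative values in consistent units.
k_on, k_off, conc = 1.0, 0.5, 2.0
print("Coverage after t = 0.5:", round(simulate_coverage(k_on, k_off, conc, 0.5), 3))
print("Equilibrium coverage:", equilibrium_coverage(k_on, k_off, conc))
```

Under this assumption, the coverage relaxes toward its equilibrium value with a rate of k_on*conc + k_off, so a higher protein concentration both speeds up corona formation and increases the final coverage.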
To understand the interaction of nanomaterials with biomolecules, the biological surface adsorption index (BSAI) approach has been reported in the literature. [82] This approach characterizes the adsorption properties of the NPs by quantifying the competitive adsorption of a set of small-molecule probes onto the NPs, thereby mimicking the molecular interactions of the NPs with the amino acid residues of proteins. The approach is based on the forces that are dominant at that scale, i.e., the Coulomb force, hydrogen bonds, lone-pair repulsions, and London dispersion forces. [85] By measuring the quantities of the probe compounds adsorbed and their concentrations in the surrounding media, the adsorption coefficient (k) is calculated, as shown in Figure 6. [82] In another study, the BSAI approach was developed to identify and quantify the significant factors that govern the adsorption properties of nanomaterials using a solid-phase microextraction (SPME) and gas chromatography-mass spectrometry (GC-MS) method. [82,86] For NPs, BSAI nanodescriptors are a well-suited tool for predicting the adsorption of small molecules on their surface, which is a critical process for nanomaterials used in biological and environmental systems. The BSAI approach can play an important role in the development of predictive nanomedicine and in the quantitative risk assessment and safety evaluation of nanomaterials. [87] The collection of morphological nanodescriptors of the surface captures significant aspects of the nanomaterial characteristics, such as the size/shape anisotropy, density/number of corners, aspect ratio, bio-persistence, surface area, and curvature. In mathematical calculations, nanodescriptors are crucial in representing the contributions and relative strengths of each molecular interaction for creating pharmacokinetic and nanomaterial safety assessment models. 
[88] Evolutionary algorithms can further optimize the clustering of nanodescriptors and predict the particle properties.
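The role of the five nanodescriptors [r, p, a, b, v] as regression coefficients can be illustrated with a short sketch. It assumes a linear relationship of the form log k = c + r·R + p·π + a·α + b·β + v·V, in which R, π, α, β, and V are descriptors of each probe compound; this functional form, the probe descriptor names, and every numerical value below are assumptions made for illustration and should be checked against the cited BSAI studies before use.

```python
import math

# Pairs of (nanodescriptor, probe descriptor) that are multiplied together.
# The pairing follows the assumed linear model stated in the lead-in.
PAIRS = [("r", "R"), ("p", "pi"), ("a", "alpha"), ("b", "beta"), ("v", "V")]


def adsorption_coefficient(adsorbed_amount: float, equilibrium_conc: float) -> float:
    """k = quantity of probe adsorbed on the NPs / concentration left in the medium."""
    return adsorbed_amount / equilibrium_conc


def predict_log_k(nano: dict, probe: dict, intercept: float = 0.0) -> float:
    """Assumed model: log k = c + r*R + p*pi + a*alpha + b*beta + v*V."""
    return intercept + sum(nano[n] * probe[s] for n, s in PAIRS)


# Hypothetical nanodescriptors of one nanomaterial (regression coefficients).
nano = {"r": 0.5, "p": -1.0, "a": -0.2, "b": -2.0, "v": 3.0}
# Hypothetical descriptors of one small-molecule probe.
probe = {"R": 0.8, "pi": 0.9, "alpha": 0.6, "beta": 0.3, "V": 0.78}

measured_k = adsorption_coefficient(adsorbed_amount=4.0, equilibrium_conc=2.0)
print("Measured log k:", round(math.log10(measured_k), 3))
print("Predicted log k:", round(predict_log_k(nano, probe, intercept=-0.5), 3))
```

In a real analysis, the nanodescriptors would be obtained by regressing the measured log k values of many probes against their known descriptors; the sketch only shows the forward calculation once those coefficients are available.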

A Molecular Dynamics-Based Approach for Characterizing Thermal Transport in Nanoscale Material
In silico modeling of the energy-dependent transport phenomenon in nanoparticulate matter can be of significant assistance in preclinical and human trials of plasmonic nanotherapeutics in cancer nanotechnology, and in infection control. [60,89] A hybrid calibrated fluorescence assay (CF) and ML tool, i.e., the Fluorescence Cell Assay and Simulation Technique (FORECAST), was used to quantify the exposure dose in correlation with the membrane deposition and internalization of quantum dots, as well as polystyrene nanomaterials. [90] The hybrid algorithm provides a multitude of information regarding the NP stability, quantitative biokinetics, and intracellular fates based on the transport phenomena. [91]
Figure 4 (continued). The nanomaterial properties (e.g., size, shape, composition, granularity, and pH) resulting in surface-specific energy contribute toward nanomaterial-protein interactions and can be used as input parameter vectors based on real experimental data or simulations (DFT and QSAR). c) By training the set of features against appropriate target functions, appropriate models can be built and optimized depending on available datasets. d,e) As a result, possible interactions with proteins can be predicted. Reproduced with permission. [68] Copyright 2018, Science and Technology Review Publishing House.
New carbon nanotube (CNT) materials have been proposed in the literature as ideal candidates for potential applications in biomedical, chemical, and industrial processes. [92] An important property that affects the usability of CNT materials is their thermal transport, which depends on their size, chirality, temperature, and defects, among other factors. It is therefore important to understand the influence of several different physical factors on their thermal transport. 
[93] Under such a scenario, molecular dynamics (MD) simulation models would be a good alternative to time-consuming, elaborate, and expensive experiments for studying the transport phenomenon at the nanoscale level. For a more realistic prediction, in the calculation of the molecular transport during the physisorption and chemisorption, monolayers of small molecules were created to passivate the surfaces of the nanostructures, as shown in Figure 4. "Clean" surfaces are experimentally impossible except under ultra-high-vacuum conditions and are conveniently presumed in molecular simulations. However, an MD simulation does not link the input parameters with the generated output. Under such a scenario, AI techniques coupled with modeling such as MD simulations can be introduced to fill this gap. [94] A literature review suggests that there is a need to develop an integrated MD-based AI simulation technique for modeling the material properties of nanoscale materials with a long aspect ratio. [94] The new integrated approach can combine the accuracy and low cost of an MD simulation with the input-output linking of the AI techniques. However, building an AI technique requires training data, which can be obtained from controlled MD simulations. Once trained, the AI will then be able to meaningfully support our ability to generate solutions for complex nanotoxicology problems. CNTs with a high aspect ratio can generate ROS which are considered indicators of the high toxicity of a material. [95] To study the toxicity of multiwalled CNTs (MWCNTs), ML models were developed based on the representation of molecular descriptors using the Monte Carlo approach. This involves weights of descriptor correlation through a preincubation in a quasi-simplified molecular-input line-entry system (quasi-SMILES) using descriptors such as the concentration, presence of an enzyme mix, and surface area of the MWCNTs. 
[96] In another study, [97] the genotoxicity was modeled as a function of five parameters, i.e., particle type (MWCNTs or fullerene), illumination, concentration, metabolic activation, and preincubation, which yielded satisfactory statistical parameters. Quasi-SMILES is used to reflect the physicochemical properties and experimental conditions, such as the diameter, weight, surface area, in vitro toxicity assay, cell line, exposure time, and dose. The calculation of the quasi-SMILES model using the Monte Carlo method provides ample statistical parameters. [98] In this case, the cytotoxicity endpoint is the percentage of viable cells. In another case, i.e., an in vitro ML-based algorithm, as shown in Figure 7, it was shown that single-walled CNTs (SWCNTs) can alter the cellular motility and biological chirality of the cells, an anomaly that can be related to abnormal in vivo fetal development. [101] As shown in this report, in the context of CNT toxicity, the cellular motility is an important parameter that plays a significant role in cell-cell interactions and cellular microenvironment homeostasis. [100] For example, an MD-based AI approach used to investigate the effects of the geometry, chirality, and vacancy defects on the thermal conductivity of CNTs has been reported. [102] A model for calculating the thermal conductivity of CNTs in terms of their diameter, length, and defects is described in the aforementioned study. It has also been reported that the thermal transport in CNTs can be calculated using a reversible nonequilibrium MD simulation, [102] in good agreement with the data reported by Cao and Qu. [103] Although further studies are needed, engineers can use this mathematical formulation to estimate the thermal conductivity of CNTs in relation to their physical properties, which in turn can prove useful for design purposes and nanotoxicology applications.
Figure 5. Visual representation of an NP corona formation process. Single-type proteins attach to an NP surface at rate k_on and leave the NP at rate k_off; on average, "n" proteins can fully cover the NP surface. Reproduced with permission. [77] Copyright 2013, Public Library of Science.
Figure 6. Nanodescriptors of bio-physicochemical prediction. A radar compass plot comparing the five nanodescriptors of 12 different nanomaterials. The nanodescriptors [r, p, a, b, v] are regression coefficients representing the relative molecular interaction strengths of the nanomaterials. The nanodescriptors of NiOx NPs present an irregular pattern due to their unique chemisorption of phenol-derivative probe compounds. Reproduced with permission. [82] Copyright 2010, Springer Nature.
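The quasi-SMILES representation discussed in this section can be illustrated with a short sketch. It assumes the commonly described scheme in which each experimental condition is encoded as a symbol, each symbol carries a correlation weight, the descriptor is the sum of those weights, and the endpoint is a linear function of the descriptor. The symbol codes, weights, and coefficients below are hypothetical, and the Monte Carlo optimization that would normally fit the weights is omitted.

```python
from typing import Dict, List

# Hypothetical symbol codes for experimental conditions (not from the cited studies).
CODES = {
    ("particle", "MWCNT"): "A",
    ("particle", "fullerene"): "B",
    ("illumination", True): "L",
    ("illumination", False): "D",
    ("metabolic_activation", True): "M",
    ("metabolic_activation", False): "N",
}


def to_quasi_smiles(conditions: Dict[str, object]) -> List[str]:
    """Encode each (condition, value) pair as one quasi-SMILES symbol."""
    return [CODES[(name, value)] for name, value in conditions.items()]


def descriptor(symbols: List[str], weights: Dict[str, float]) -> float:
    """Sum of the correlation weights of all symbols (unknown symbols count as 0)."""
    return sum(weights.get(symbol, 0.0) for symbol in symbols)


def predict_endpoint(symbols: List[str], weights: Dict[str, float],
                     c0: float, c1: float) -> float:
    """Linear endpoint model: endpoint = c0 + c1 * descriptor."""
    return c0 + c1 * descriptor(symbols, weights)


# Hypothetical experiment and hypothetical, already-fitted correlation weights.
conditions = {"particle": "MWCNT", "illumination": True, "metabolic_activation": False}
weights = {"A": 1.5, "L": -0.5, "N": 0.25}

symbols = to_quasi_smiles(conditions)
print("Quasi-SMILES symbols:", "".join(symbols))
print("Predicted endpoint:", predict_endpoint(symbols, weights, c0=2.0, c1=4.0))
```

The appeal of this encoding is that experimental conditions without any molecular structure, such as illumination or preincubation, enter the model in the same way as structural features.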

AI-Based Translational Nanoinformatics Catalyzing Clinical Trials in Nanomedicine
It has been reported that nanoinformatics and DNA-based computing will have a major impact on the field of nanomedicine in the future by changing the way we model and process information in biomedicine. [104] With the development of nanotechnology and computational resources, informatics and computing are becoming some of the most important tools for testing toxicity at the nanoscale. At that scale, theoretical and computational ab initio tools are needed to address the nanosafety of biomaterials because almost all physicochemical properties, such as the size, shape, surface area, concentration, and electrostatic properties, can affect their interaction with the surrounding media. [105] This is an advantage, because such properties can be used to achieve a targeted interaction with a specific biological environment, but it also requires more intensive studies to test for nanotoxicity.
To effectively understand these interactions, new informatics tools must be developed and implemented. The data from the literature can be used in new computational methods, which can reveal the relation between the physical properties of an NP and its biological interactions, and eventually its toxicity. [106] Although the unique electrical, optical, and chemical properties that NPs owe to their small size raise concerns about their potential toxicity, [104,107] in some cases it is the dose and mechanism of action that make an NP therapy toxic. [108] To build effective nanoinformatics models, we need a sharable toxicity database. Examples of such available databases are those developed by the National Institute for Occupational Safety and Health and the Oregon Nanoscience and Microtechnologies Institute. [104,109] Such databases can be used to feed nanoinformatics models and simulate the toxicity processes, thereby reducing the time taken to translate drugs and NPs from the testing phase to clinical practice. It has also been reported that data integration at the nanoscale poses some challenges [104,110]; prime among them are the development of central repositories and databases of NP toxicity, [111] standards for information storage and exchange, [112] domain nano-ontologies, [113] and tools for decision support. [114]
Figure 7 (continued). i) The circular histogram of biased angles (based on the deviation of the short orange lines in (b) from the circumferential direction) shows a CW bias in the endothelial cells. j) The percentage of chiral rings decreases with the CNT concentration after a three-day exposure. k) Time-dependent chirality loss of endothelial cells when exposed to 10 μg mL⁻¹ of CNT. Scale bars: 50 μm. a-f) Reproduced with permission. [101] Copyright 2011, Landes Bioscience. g-k) Reproduced with permission. [99] Copyright 2014, American Chemical Society.

Ensemble Classifiers and Regression Trees in Nanosafety Practices for Decision-Directed Nano Research
Several initiatives are underway to provide open data repositories that are linked, open, searchable, accessible, and interoperable through a framework of appropriate semantic protocols and ontologies of tagged metadata descriptors, thereby allowing for machine readability. [115] The findable, accessible, interoperable, reusable (FAIR) data principles are the foundation of the European open science cloud (EOSC), a virtual environment for open and seamless services for storage, management, analysis, and re-use of research data (EOSC pilot report, 2018). In the final report, an action plan from the European Commission expert group on FAIR data laid out a foundational structure of what constitutes a minimum viable research data ecosystem in Europe, its main rules of participation, a governance framework, and possible finance models. The goal of the cloud is to effectively interlink people, data, services and training, publications, projects, and organizations. In addition, the expert group presented an action plan to achieve FAIR research data. The European Commission expects the research data generated by Horizon 2020 projects to follow the FAIR data principles, and the EOSC has recommended that this infrastructure should be founded on the FAIR principles, where data should be as open as possible and as protected as necessary (European Commission expert group, 2018). As these initiatives expand and more data become available and are linked across disciplines, the potential for ML/AI applications will grow, and new relationships in NanoEHS data may become evident.
Random forest, a numerical meta-analysis algorithm that is an ensemble of a large number of regression trees, has recently gained popularity in nanotoxicology predictions (Figure 8). [116] It has been reported that the parameters having the largest impact on the toxicity of Q-dots are the diameter, surface ligand type, surface modifications, and shell composition. [117,118] The relationships between these variables and toxicity are complex but can be retrieved using a random forest. [117] A random forest is an ensemble learning method built from multiple decision trees. [119] Decision trees help us go from an observation to a conclusion. In our case, an observation can be a toxicological exposure, and a conclusion can be the ensuing result of that exposure. In a parametric toxicology study involving three parameters, each taking three different values, a full factorial design yields 3 × 3 × 3 = 27 observations and their conclusions.
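The count of 27 observations can be checked with a minimal Python sketch of a full factorial design. The parameter names, their levels, and the `toy_conclusion` rule are hypothetical placeholders chosen only for illustration; they are not taken from the studies cited above and carry no toxicological meaning.

```python
from itertools import product

# Hypothetical parameters and levels, chosen only for illustration.
levels = {
    "diameter_nm": [5, 20, 50],
    "surface_ligand": ["PEG", "citrate", "amine"],
    "dose_ug_per_mL": [1, 10, 100],
}

# Full factorial design: one observation per combination of levels.
observations = [dict(zip(levels, combo)) for combo in product(*levels.values())]


def toy_conclusion(obs):
    """Made-up rule standing in for a measured outcome (not real toxicology)."""
    if obs["dose_ug_per_mL"] >= 100:
        return "toxic"
    if obs["diameter_nm"] <= 5 and obs["surface_ligand"] == "amine":
        return "toxic"
    return "non-toxic"


# Each observation is paired with its conclusion.
conclusions = [toy_conclusion(obs) for obs in observations]
print(len(observations), "observations")
```

With three parameters at three levels each, `observations` holds 27 entries; a decision tree is then a compact way of mapping each of them to its conclusion.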
Decision trees (called regression trees when the predicted outcome is continuous) comprise nodes and branches. Each node is a decision statement, according to which it may split into further nodes via branches. The node from which a decision tree begins is called the root, and the nodes at the terminal ends of the branches are called leaves. Using decision trees, we can sort observations into subsets by identifying the most critical independent variables, using them as nodes, and creating further splits. [120] Proceeding with the sorting, we reach the leaves, which are nodes that cannot be further branched. Starting from the root, and following the branches representing the decision statements of interest, we can reach an isolated leaf that satisfies all the decision statements. These trees can therefore potentially be used for isolating the mechanisms of toxicity and eventually for designing safer NPs for medical use. Figure 8 shows a qualitative ensemble of the most popular algorithms used in nanotoxicology for predicting safer biomedical material designs, ranging from advanced nano scaffold preparation to modeling the corona and determining the cytotoxicity dosimetry of NPs.
As shown in Figure 9, I-TASSER (iterative threading assembly refinement), a template-based bioinformatics tool, is used in an advanced and novel in silico approach to determine the final nanostructure size/shape from the precursor metal ligand. It predicts the secondary structure of an amino acid sequence via a local meta-threading server for protein structures. Incorporating structure-based protein function predictions enables this tool to determine the ionic ligand binding sites in 3D self-assembled amino acid scaffolds. The final nanostructure of the metal-biomolecule frameworks can be controlled by selecting an optimal amino acid concentration, with experimental verification by scanning electron microscopy (SEM). The algorithm helps in understanding the mechanism involved in anisotropy and the final structures based on a ligand-amino acid interaction, which consequently leads to energy minimization and the assembly of docked gold nanoclusters into nanoflower-like structures. Table 4 and Figure 8 indicate the most commonly used ML models for the prediction of in vivo and in vitro nanotoxicity along with their pros and cons.
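To make the vocabulary of roots, nodes, branches, and leaves concrete, the following is a minimal pure-Python sketch of a regression tree and of bagged regression trees. The latter is a simplified stand-in for a random forest: it omits the random feature subsampling at each split that a full random forest adds. The data set (diameter, dose, and a toxicity score) is synthetic and hypothetical, and all function names are illustrative rather than those of any published tool.

```python
import random
from statistics import mean


def squared_error(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)


def best_split(rows, ys):
    """Return the (feature, threshold) that most reduces squared error, or None."""
    best, best_err = None, squared_error(ys)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, ys) if r[f] <= t]
            right = [y for r, y in zip(rows, ys) if r[f] > t]
            if not left or not right:
                continue
            err = squared_error(left) + squared_error(right)
            if err < best_err - 1e-12:
                best, best_err = (f, t), err
    return best


def build_tree(rows, ys, depth=0, max_depth=3):
    """Grow a tree; each leaf stores the mean outcome of its observations."""
    split = best_split(rows, ys) if depth < max_depth else None
    if split is None:
        return {"leaf": mean(ys)}
    f, t = split
    left = [(r, y) for r, y in zip(rows, ys) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, ys) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }


def predict_tree(node, row):
    """Follow the branches from the root until a leaf is reached."""
    while "leaf" not in node:
        node = node["left"] if row[node["feature"]] <= node["threshold"] else node["right"]
    return node["leaf"]


def build_forest(rows, ys, n_trees=25, seed=0):
    """Bagging: each tree is trained on a bootstrap resample of the data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees in the ensemble."""
    return mean(predict_tree(tree, row) for tree in forest)


# Synthetic observations: (diameter in nm, dose in ug/mL) -> made-up toxicity score.
rows = [(d, c) for d in (5, 20, 50) for c in (1, 10, 100)]
ys = [1.0 if c == 100 else (0.5 if d == 5 else 0.0) for d, c in rows]

tree = build_tree(rows, ys)
forest = build_forest(rows, ys)
print("root splits on feature", tree["feature"], "at threshold", tree["threshold"])
```

On this toy data the root node splits on dose, the variable that explains most of the variation, which mirrors how the most critical independent variables end up nearest the root.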

Challenges in the Implementation of ML and AI Algorithms in Reporting Nano-Bio Interactions
How ML methodologies should actually be used to evaluate material quality, properties, and general toxic effects is still not clear in the context of nanomedicine. The problem lies in whether the objective of implementing ML is decision support or automation. In the design of biomaterials, an elaborate category of operational ML rules provides a standard for minimum information reporting in experimental bio-nano studies. [63] The particular problems posed by ML might need to be addressed during the implementation process itself rather than by focusing on approaches for the adoption of emerging technologies in general, which has been thoroughly discussed in the field of cancer nanomedicine. We emphasize filling what we see as the most important gap in cell-NP characterization. In a recent study, over 100 articles published in multidisciplinary journals in 2018 that have impacted the field of nano-biomedicine were selected, evaluated, and labeled according to two major categories: application-driven and technology-driven approaches. [127] The study then subdivided each major category into scientific fields such as oncology, cardiovascular disease, biomaterials, gene therapy, and theranostics. [128] The study further divided the articles according to the research experience of the corresponding author (the number of years that have passed since the corresponding author's first publication). These stages are defined in the study as early (22 articles), intermediate, and late. Florindo et al. [127] suggested that there can be a correlation between the fulfilment of the MIRIBEL checklist and the field of study. Specific examples indicate that the field of oncology scores higher than cardiovascular-related studies, and the field of theranostics scores higher than the field of gene therapy. [129] Therefore, not all fields are equally ready in terms of data availability for the implementation of AI and ML techniques.
Certain fields are more mature and rigorous, and better meet the information reporting criteria. Florindo et al. [127] also suggested that researchers at the "early" stage are more likely to miss the criteria for minimum information reporting than those at the "intermediate" and "late" stages. Therefore, there is a need to develop a MIRIBEL checklist that all articles must satisfy before publication, both for uniformity across the literature and for the future implementation of AI and ML techniques in their particular fields. The criteria can be modified and enhanced gradually and regularly as new information becomes available.
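A pre-publication checklist of this kind could be screened automatically. The sketch below is a minimal illustration under stated assumptions: the checklist item names are hypothetical placeholders rather than the wording of the actual MIRIBEL checklist, and the three article records are invented solely to exercise the code, not drawn from Florindo et al. [127].

```python
from collections import defaultdict

# Hypothetical checklist items grouped under material, biological, and protocol
# headings; the real checklist is longer and worded differently.
CHECKLIST = {
    "material": ["size", "zeta_potential", "synthesis"],
    "biological": ["cell_details", "dose", "toxicity"],
    "protocol": ["culture_conditions", "data_analysis"],
}
ALL_ITEMS = [item for items in CHECKLIST.values() for item in items]


def compliance(reported):
    """Fraction of checklist items that an article reports (0.0 to 1.0)."""
    return sum(item in reported for item in ALL_ITEMS) / len(ALL_ITEMS)


def mean_by_group(articles, key):
    """Average compliance of articles grouped by a metadata key such as field or stage."""
    groups = defaultdict(list)
    for article in articles:
        groups[article[key]].append(compliance(article["reported"]))
    return {name: sum(vals) / len(vals) for name, vals in groups.items()}


# Invented records for illustration only.
articles = [
    {"field": "oncology", "stage": "late", "reported": set(ALL_ITEMS)},
    {"field": "oncology", "stage": "early",
     "reported": {"size", "dose", "toxicity", "data_analysis"}},
    {"field": "gene therapy", "stage": "early", "reported": {"size", "dose"}},
]

by_field = mean_by_group(articles, "field")
by_stage = mean_by_group(articles, "stage")
print(by_field)
print(by_stage)
```

Grouping by `field` or `stage` reproduces, in miniature, the kind of comparison described above; a journal could apply a threshold on `compliance` before accepting a submission.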

Future Outlook and Opportunities
The influence of AI is not limited to programming computers to drive a car by obeying traffic rules or to park automatically in a mechanical manner; it extends to programming a computer to further mimic human behavior. Similar to telemedicine, the remote monitoring of consumer products containing nanomaterials will further earn the public's confidence and bring about transparency in biosafety measures. AI and ML will be at the heart of such technology, which is complemented by advances in nano-bio sensors, enhancing the ability to institute a "cradle-to-grave" monitoring of nano and other advanced material-enabled products. With a unique universal material ID and the potential incorporation of a trustworthy blockchain-based ledger system, future nanomaterials can be tracked from their production, through deployment in the consumer or industrial market, to eventual end-of-life recycling and safe disposal. All information regarding the safety and handling requirements for each enhanced material can be made easily available through a simple scan on a handheld device. Recent developments in breathing/air quality monitoring sensors that measure gases/volatile compounds in the input samples and use laser sensors to measure particulate matter in the surroundings can be used to scan consumer products (e.g., textiles, toys, food/food contact, and education/recreational kits containing nanomaterials). [130] In the case of complex data, the information collected after the scan can be wirelessly sent to regulatory experts in the local area for interpretation and feedback (Figure 10). The wireless communication between "fitness monitoring" wearable e-gadgets or smartphones and sophisticated analysis apps developed using AI/ML can be adopted for nanomedicine and for monitoring the biosafety of nanomaterials. [131] This will also bridge the communication gap between regulatory bodies and the public domain.
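The idea of tracking a material ID through a tamper-evident ledger can be sketched as a simple hash chain. This is a minimal single-party illustration in Python, not a distributed blockchain: there is no consensus, networking, or signing. The material ID `NM-0001`, the stage names, and the record fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(body):
    """SHA-256 over a canonical JSON encoding of the entry body."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_event(ledger, material_id, stage, details):
    """Append a life-cycle event linked to the hash of the previous entry."""
    prev = ledger[-1]["hash"] if ledger else GENESIS
    body = {"material_id": material_id, "stage": stage,
            "details": details, "prev_hash": prev}
    ledger.append({**body, "hash": entry_hash(body)})


def verify(ledger):
    """Return True only if no entry has been altered, removed, or reordered."""
    prev = GENESIS
    for entry in ledger:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev or entry_hash(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


def history(ledger, material_id):
    """List the recorded life-cycle stages of one material, in order."""
    return [e["stage"] for e in ledger if e["material_id"] == material_id]


# Hypothetical cradle-to-grave record for one material.
ledger = []
append_event(ledger, "NM-0001", "production", {"site": "plant A"})
append_event(ledger, "NM-0001", "deployment", {"product": "textile"})
append_event(ledger, "NM-0001", "disposal", {"method": "incineration"})
print(history(ledger, "NM-0001"), verify(ledger))
```

Because each entry stores the hash of its predecessor, editing any earlier record changes its hash and breaks every later link, which is the property that would make a scanned material history trustworthy.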
Furthermore, as freely available general-purpose high-level programming languages supporting multiple programming paradigms gain popularity, modeling in nanotoxicology and nanomedicine will embrace new horizons. For example, Python, which was once popular for an object-oriented approach, [132] has made inroads into many nanotoxicology and nanomedicine labs. Advances in the areas of computing language development, ontology development, and the employment of appropriate semantics related to the data warehousing and interoperability of nanosafety data are well underway and will begin to enable predictions across a vast array of EHS data (e.g., toxicity, omics, and environmental fate) spanning a range of environmental, industrial, and consumer spaces. In general, the quality of the output of AI/ML applications relies heavily on the amount and quality of data that can be included in the analysis. As the amount of data grows rapidly and with an increase in efforts to make this data FAIR, according to the initiatives currently underway to promote EOSC, the Center for Open Science, and the Open Knowledge Foundation (OKF), among others, the potential of AI/ML techniques to provide new insights and form new relationships between advanced materials and the people who use them will become apparent. This, in turn, will help speed up the development of safe, efficient, cost-effective, and advanced material-based solutions and bring material production methods in line with the 21st century advances in AI/ML technologies.
Figure 10. A multimodal computational system toxicology platform for remote monitoring of consumer products containing nanomaterials envisioned by interlinking the different submodules starting from pollutant generation in the environment, pollutant screening, data production and sharing, communication, and law enforcement and regulatory measures taken to make human ecosystems cleaner and safer.
As shown in Figure 11, an AI and ML toolbox can immensely help in defining the core properties of nanomaterials relevant to biological activity, which will be useful for applications aimed at minimizing toxicity in the context of nanotherapeutics. Because the bio-physicochemical interface of the NP corona determines the biological identity of the NP, unlocking the material properties, size/shape, and surface characteristics using ML algorithms may reveal important insights into NPs and their interaction with the surrounding environment. Furthermore, NP-cell interactions in vitro in viscous biological media with a high salt concentration act as another hidden layer of nanotoxicology. Here as well, AI approaches (e.g., grid search, artificial neural networks, and MD simulations) may help to decode the dissolution behavior, electrostatic agglomeration and accumulation, and the competitive binding of proteins on NPs. These characteristics are generally complex and difficult to predict using conventional experimental methods based on a correlation of data or an identification of the most influential parameters (Figure 11). In particular, a predictive relation based on phase transformations and free energy release can be established to decode the nano-bio interface in nanotoxicology using ML tools.
Figure 11. AI toolbox for understanding bio-physicochemical identity at the nano-bio interface. The model envisions complementing and predicting the core properties of nanomaterials to identify the nano-bio interaction sites. After the selection of appropriate NP properties, real experimental data must be used for ML model development, optimization, and cross-validation before unknown NP properties or the nano-bio interaction can be predicted. The boundary condition indicated by radial outlines (right panel) represents working spaces to apply AI and ML approaches to predict unknown properties for the prediction of toxicity and health effects in nanomedicine.
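The workflow of model development, optimization, and cross-validation, together with the grid search mentioned above, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: a k-nearest-neighbor regressor stands in for whatever model is actually chosen, the dose-response data are synthetic (a sigmoid with seeded noise) rather than experimental, and the grid of candidate `k` values is arbitrary.

```python
import math
import random


def knn_predict(train_x, train_y, x, k):
    """Average the outcomes of the k training points closest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def cv_error(xs, ys, k, n_folds=5):
    """Mean squared error of k-NN under n_folds-fold cross-validation."""
    total = 0.0
    for fold in range(n_folds):
        # Interleaved folds: every n_folds-th point is held out for testing.
        test_idx = set(range(fold, len(xs), n_folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        for i in test_idx:
            total += (knn_predict(train_x, train_y, xs[i], k) - ys[i]) ** 2
    return total / len(xs)


def grid_search(xs, ys, grid):
    """Score every candidate k and return the one with the lowest CV error."""
    scores = {k: cv_error(xs, ys, k) for k in grid}
    return min(scores, key=scores.get), scores


# Synthetic data: x is a log-dose, y a noisy sigmoidal response (illustration only).
rng = random.Random(0)
xs = [i / 10 for i in range(40)]
ys = [1 / (1 + math.exp(-3 * (x - 2))) + rng.gauss(0, 0.05) for x in xs]

grid = [1, 3, 5, 7, 9]
best_k, scores = grid_search(xs, ys, grid)
print("selected k:", best_k)
```

The hyperparameter is selected only on held-out folds, so the reported error is an estimate of how the model would behave on NP properties it has not seen, which is the point of validating before predicting unknown nano-bio interactions.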