Artificial intelligence methods for a Bayesian epistemology-powered evidence evaluation

Rationale, aims and objectives: The diversity of types of evidence (eg, case reports, animal studies and observational studies) makes the assessment of a drug's safety profile a formidable challenge. While frequentist uncertain inference struggles to aggregate these signals, the more flexible Bayesian approaches seem better suited to this task. Artificial Intelligence (AI) offers great promise to these approaches for information retrieval, decision support, and learning probabilities from data. Methods: E-Synthesis is a Bayesian framework for drug safety assessments built on philosophical principles and considerations. It aims to aggregate all the available information in order to provide a Bayesian probability of a drug causing an adverse reaction. AI systems for evidence aggregation in medicine are being developed and are increasingly automated. Results: We find that AI can help E-Synthesis with information retrieval, usability (graphical decision-making aids), learning Bayes factors from historical data, assessing quality of information and determining conditional probabilities for the so-called 'indicators' of causation for E-Synthesis. Vice versa, E-Synthesis offers a solid methodological basis for (semi-)automated evidence aggregation with AI systems. Conclusions: Properly applied, AI can help the transition of philosophical principles and considerations concerning evidence aggregation for drug safety to a tool that can be used in practice.

overview of AI for medical diagnoses). The idea of a 'smart' hospital, with programs and devices coordinated by AI, is no longer just science fiction. 4 AI also has roles to play in identifying drug interactions, interpreting possibly minute details in images, logging and processing health records, and more. However, rigorous research into the performance of AI in many of these areas is still in its infancy. 5,6 AI's use for public health more widely is at a more prospective stage, but its potential is obvious. 7 In this article, we focus on pharmacosurveillance. We explore how AI can contribute to the continuous assessment of putative Adverse Drug Reactions (ADRs). This manuscript is organized as follows: in the Methods section, we briefly present E-Synthesis, a framework for combining different types of evidence in pharmacovigilance, based on Bayesian epistemology, as well as AI methodology for evidence aggregation in medicine. In the Results section, we show how E-Synthesis and AI can be intertwined to the benefit of both. Finally, in the Discussion section, we offer some concluding remarks and provide an outlook on a possible research agenda in drug safety assessment.

| METHODS: E-SYNTHESIS AND AI
The synthesis of evidence from multiple sources providing different kinds of information (randomized studies, observational studies, case reports, in vitro evidence), with the aim of evaluating hypotheses and making decisions, plays a fundamental role in many areas of medicine. In pharmacosurveillance, for instance, relevant evidence only becomes available in an unsystematic and motley way, so that evaluating hypotheses is far from the textbook ideal of interpreting a neat result from a randomized controlled trial (RCT). Thus, there is a need for methods of synthesis that assess the significance of heterogeneous evidence in a systematic, well-grounded, and manageable way.
Since traditional frequentist statistical methods struggle with aggregating different kinds of information, a more flexible approach is required here. We next present a Bayesian approach to drug safety assessment, and then we outline how AI methods can serve evidence aggregation. The interaction between AI and this Bayesian approach will be explored in the Results section.

| E-Synthesis: Bayesian epistemology for evidence aggregation in pharmacovigilance
E-Synthesis is a Bayesian framework for evidence aggregation in pharmacosurveillance to support timely decision making based on all the available 'safety signals'. [8][9][10][11][12] The framework rests on Bayesian epistemology, which, unlike Bayesian statistics, enables the representation of, and reasoning with, uncertainties attaching to arbitrary propositions.
In previous papers, we have presented its philosophical foundations, 8 studied the incorporation of evidence qualities, 11 investigated the aggregation of knowledge concerning biological mechanisms and dose-response, 9,10 and made strides towards applying E-Synthesis in personalized medicine. 12 In this subsection, we give a brief overview of E-Synthesis.

| Motivation and goal
The risk-benefit profile of a drug is assessed and updated throughout the development process: after its formula is proposed, during its synthesis, and in the post-marketing period. There is no point at which its safety is definitively established: its developers and drug regulators must make multiple judgements at different phases of development, such as whether to withdraw the drug, using heterogeneous evidence. Currently, these decisions are made using systematic reviews that combine the wide variety of available evidence (preclinical studies, clinical trials, spontaneous reports, basic research etc.) to justify or undermine hypotheses about the presence or absence of causal relations between the drug and harms. However, it is difficult to combine heterogeneous data from various sources, with various modalities (observational vs experimental) and different degrees of external and internal validity. The ultimate objective of E-Synthesis is to surmount this difficulty by providing a systematic, epistemologically principled, and usable method for combining evidence.
This framework rests on the paradigmatic philosophical account of uncertain inference (Bayesian epistemology) in order to provide a theoretically justified probability of a drug causing a harm on the basis of all the available evidence. It employs a Bayesian network 13 incorporating indicators of causality derived from the Bradford-Hill guidelines 14 as well as evidence qualities and uncertainties attaching to these evidence qualities. Unlike the GRADE approach, which is not straightforwardly applicable to decision problems, 15 the probability produced by E-Synthesis has been designed to be used for making decisions via the maximization of expected utilities.
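As a minimal sketch of how such a probability could feed a decision, consider a hypothetical two-action problem (keep the drug on the market or withdraw it). The action names, the utility values and the function names below are illustrative assumptions and are not part of E-Synthesis.

```python
# Illustrative only: actions and utilities are hypothetical assumptions.

def expected_utility(p_causes_adr, utilities):
    """Expected utility of one action, given P(drug causes the ADR)."""
    return (p_causes_adr * utilities["causes"]
            + (1.0 - p_causes_adr) * utilities["does_not_cause"])


def best_action(p_causes_adr, utility_table):
    """Return the action whose expected utility is maximal."""
    return max(
        utility_table,
        key=lambda action: expected_utility(p_causes_adr, utility_table[action]),
    )


# Made-up utilities for each (action, state of the world) pair.
UTILITIES = {
    "keep":     {"causes": -100.0, "does_not_cause": 10.0},
    "withdraw": {"causes": 0.0,    "does_not_cause": -20.0},
}

if __name__ == "__main__":
    for p in (0.05, 0.5):
        print(p, best_action(p, UTILITIES))
```

With these made-up utilities, a low probability of causation favours keeping the drug and a high one favours withdrawal; where the switch occurs depends entirely on the utilities chosen.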

| Bayesian networks
In order to have an inferential mechanism that can handle heterogeneous types of evidence, E-Synthesis utilizes the tools of Bayesian networks and Bayesian epistemology. We provide a brief introduction to these ideas and the rationale of their implementation in E-Synthesis. It is generally very difficult to calculate conditional probabilities directly or to make a long and complex series of inferences using them. Bayesian networks offer a convenient means for graphically displaying and reasoning with probability functions. 13,16 We can use them to specify and read off conditional independencies from a graph.
Technically, a Bayesian network is defined on a set of pairwise different variables by two components. The first is a directed acyclic graph (which means that the edges are directed such that the graph does not contain a directed cycle, that is, it has no path of directed edges which leads back to its starting point). The second is a probability distribution specifying the conditional probabilities of all variables given their parent variables (all other variables which directly point to this variable). See Figure 1 for an example graph.
Technically, this works as follows. Denoting the parents of a variable $Y$ by $X_1, \dots, X_n$, one specifies the conditional probabilities $P(Y = y \mid X_1 = x_1, \dots, X_n = x_n)$ for all possible values $y, x_1, \dots, x_n$ under the condition that $\sum_{y \in Y} P(Y = y \mid X_1 = x_1, \dots, X_n = x_n) = 1$. This condition ensures that we have defined a probability function that satisfies the standard probability calculus. To calculate conditional and unconditional probabilities of interest, one may use the so-called 'chain rule'.
[…] adverse effect, such that higher dosages lead to more and/or stronger adverse effects. However, note that a causal relationship might lack a dose-response relationship (anaphylaxis) and a dose-response relationship might exist without a causal relationship, due to confounding. The indicators are probabilistic consequences in the sense that their truth is more likely, if the hypothesis is also true, than if the latter is false.
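The chain-rule factorization can be sketched on a toy network. The three-node chain below (a causal hypothesis C, one indicator Ind, one report Rep) and all its numbers are hypothetical assumptions chosen for illustration; they are not the E-Synthesis network of Figure 1. The indicator is made more probable under the hypothesis than under its negation, in line with the notion of a probabilistic consequence.

```python
from itertools import product

# Hypothetical chain: C (causation) -> Ind (indicator) -> Rep (report).
PARENTS = {"C": (), "Ind": ("C",), "Rep": ("Ind",)}

# Each table maps a tuple of parent values to P(variable = True | parents).
# All numbers are made up for illustration.
CPT = {
    "C":   {(): 0.1},
    "Ind": {(True,): 0.8, (False,): 0.3},
    "Rep": {(True,): 0.9, (False,): 0.2},
}


def joint(assignment):
    """Chain rule: multiply P(value | parent values) over all variables."""
    prob = 1.0
    for var, parents in PARENTS.items():
        p_true = CPT[var][tuple(assignment[p] for p in parents)]
        prob *= p_true if assignment[var] else 1.0 - p_true
    return prob


def probability(event, given=None):
    """P(event | given), obtained by summing the joint over full assignments."""
    given = given or {}
    names = list(PARENTS)
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(names)):
        assignment = dict(zip(names, values))
        if all(assignment[k] == v for k, v in given.items()):
            p = joint(assignment)
            denominator += p
            if all(assignment[k] == v for k, v in event.items()):
                numerator += p
    return numerator / denominator


if __name__ == "__main__":
    print(probability({"C": True}))                 # prior
    print(probability({"C": True}, {"Rep": True}))  # after a positive report
```

Brute-force enumeration is only workable for a handful of variables; it is used here because it keeps the link to the chain rule visible.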

| Indicators of causation
Therefore, each relevant experimental study, observational study, case series, case report or basic science finding is associated with a set of causal indicators about which it is informative. 8,11,17 E-Synthesis thus divides the inferential process from the raw data to the hypothesis that a causal link holds between a drug and an ADR into two steps: (a) from data (study reports) to causal indicators and (b) from causal indicators to causality.
A core idea of Bayesian epistemology is that the confirmatory value of evidence with respect to hypotheses is degree-valued. The same holds here with respect to evidence for or against our causal indicators. We use evidential modulators to make this fine-grained and incremental element in Bayesian reasoning explicit, by determining the quality of evidence as a function of various choices in study design and data analysis (blinding, randomization, sample size, study duration, stratification); see Figure 1.

| Evidential modulators
One key feature of E-Synthesis is the possibility of assessing the quality of items of evidence. The assessed quality of evidence then modulates the degree to which the item of evidence (dis-)confirms indicators of causation. This is achieved by first creating a 'report' variable, Rep, for every item of evidence and then creating for every such variable a set of pertinent modulator variables $Q_1, \dots, Q_k$, for example, duration of a study, sample size and blinding. In the Bayesian network, […]
By contrast, the decisions about drug safety that are made by E-Synthesis will ultimately be formalizable in algorithms. It is true that there could be some exogenous elements. One example is that, at the input level, the selection of data for the evidential modulators could be decided by non-transparent neural network machine learning. At the output level, we are not proposing the complete automation of drug safety decisions, but instead just semi-automation, and therefore there will still be human judgements that could be opaque, depending on how the regulators make their choices.
However, E-Synthesis shares a common Bayesian advantage: it forces us to make our probabilistic assumptions explicit, and thus open to criticism. 34, Chapter 11 Therefore, in comparison to some types of AI, using E-Synthesis would improve transparency. Note that this superior transparency holds even if we think that the priors are ultimately 'subjective' in an epistemological sense: users can still raise challenges on criteria such as alignment of the prior probabilities with well-tested physical probabilities, the tendency of priors to help us avoid catastrophic choices, 35 and other desiderata that users might have for priors.
For the capacity of E-Synthesis to improve pharmacological predictions, we can point to some promising precedents in which AI has been used to improve predictive power. 36,37 AI is especially promising for orphan drugs, 38 where the quantity and quality of data cannot compare with those for widely used medications. We think that E-Synthesis may contribute to improving these AI methods through more sophisticated evidence aggregation and evaluation, favouring a better understanding of causal underpinnings in drug safety management. In the following, we pin down three main areas of interaction between E-Synthesis and AI: machine learning, information retrieval and graphical decision aids. We conclude that evidence synthesis for pharmacosurveillance can be enhanced by AI (cf. Section 4.2).

| Machine learning
Machine learning can greatly strengthen E-Synthesis, creating automated systems that make better use of the vast amount of accumulating publications and promoting the uptake of that evidence into a wide range of contexts. Using machine learning, E-Synthesis will be enhanced in identifying, extracting, synthesizing and interpreting relevant information, converting this into knowledge that can answer complex questions about causal associations. We identify two main applications of machine learning for improving E-Synthesis: (a) estimation of conditional probabilities of causal indicators and learning the weighting schemes of the evidential modulators from data and (b) modelling the 'linkage between a direct molecular initiating event […] and an adverse outcome at a biological level of organization relevant to risk assessment'. 40, p. 731 The latter occurs through an adverse outcome pathway (AOP), that is, a conceptual construct, expressed in terms of flow charts, that portrays existing knowledge concerning the linkage between that initiating event at a molecular level and the adverse outcome that can be macroscopically observed.
Such 'mechanisms' play an important inferential role. 41

| Assessing probabilities and predictive powers
As shown above, E-Synthesis delivers a probability of causal association between a drug and an ADR, based on a Bayesian updating of evidence that accrues through causal indicators. Machine learning could help E-Synthesis with the following tasks.

Learning the weighting scheme of the evidential modulators
The task is to determine how likely it is that a study (observational or an RCT) correctly identifies the absence or presence of a causal relationship between a drug and an ADR given the characteristics of the study, for example, duration and sample size. Machine learning can be used to estimate these frequencies from past studies for which we know whether the causal link was present and the values of the modulator variables.
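The simplest version of this estimation task can be sketched as smoothed frequency counting over past studies. The records, the two-field modulator profile (blinded, large sample) and the function name are hypothetical assumptions; a real system would likely use richer profiles and a more capable learner.

```python
from collections import defaultdict

# Hypothetical historical records: (modulator profile, was the verdict correct?).
# A profile is (blinded, large_sample); both fields are illustrative.
PAST_STUDIES = [
    ((True, True), True), ((True, True), True), ((True, True), True),
    ((True, True), False),
    ((False, True), True), ((False, True), False),
    ((False, False), False), ((False, False), False), ((False, False), True),
]


def learn_reliability(records, pseudo_count=1.0):
    """Estimate P(verdict correct | profile) with Laplace smoothing."""
    correct = defaultdict(float)
    total = defaultdict(float)
    for profile, was_correct in records:
        total[profile] += 1
        if was_correct:
            correct[profile] += 1
    return {
        profile: (correct[profile] + pseudo_count)
        / (total[profile] + 2 * pseudo_count)
        for profile in total
    }


if __name__ == "__main__":
    for profile, estimate in sorted(learn_reliability(PAST_STUDIES).items()):
        print(profile, round(estimate, 3))
```

The pseudo-count keeps estimates away from 0 and 1 when a profile has been observed only a few times, which is the typical situation for specific study designs.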
Note that, while machine learning can help us to obtain values for the evidential modulators, we still face 'The Problem of the Reference Class': the challenge of selecting the set of studies from which to infer these frequencies. 42 Which studies should we learn these frequencies from? Do we include all studies of the same/similar drug, similar/same adverse event (reaction), same type of sponsor of study (commercial or institutional), 43 beneficial and/or adverse effects? There does not seem to be an obvious answer. Considering only studies which are similar to the study under consideration leads to a small set of specific studies (little but specific data), while also considering many less similar studies leads to a large set of studies (much but unspecific data).
Ample data is the tool of choice to decrease statistical noise, while specific data helps ensure that the actual phenomenon of interest is studied. In our world of limited specific data, it is impossible to say how to optimally strike a balance between the value of these tools in general. However, a Bayesian framework like E-Synthesis helps us make our answers to the methodological questions (in the form of our Bayesian probabilities for particular events) more rigorously formulated and more open to scrutiny than if the choice among reference classes is left implicit.
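The statistical-noise side of this trade-off can be made concrete with a small sketch that compares a Beta posterior learned from a narrow reference class with one learned from a broad class. The counts and the uniform Beta(1, 1) prior are assumptions for illustration. The sketch quantifies only noise; it says nothing about how far the broad class is off target, which is the other half of the problem.

```python
import math


def beta_posterior(successes, trials, prior_a=1.0, prior_b=1.0):
    """Mean and standard deviation of a Beta posterior for a frequency."""
    a = prior_a + successes
    b = prior_b + (trials - successes)
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, math.sqrt(variance)


# Hypothetical counts of past studies whose verdict turned out to be correct.
narrow = beta_posterior(successes=7, trials=8)      # few, closely similar studies
broad = beta_posterior(successes=140, trials=200)   # many, loosely similar studies

if __name__ == "__main__":
    print("narrow class: mean %.3f, sd %.3f" % narrow)
    print("broad class:  mean %.3f, sd %.3f" % broad)
```

The broad class yields a much tighter posterior, but whether its mean is the right quantity for the study at hand remains a judgement that the numbers alone cannot settle.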

Learning the conditional probabilities of indicators of causation
The goal is to estimate the conditional probability of an indicator variable given © or its negation (and its other parent variables, if there are any). The predictive power of the causal indicators may be inferred from past drugs with a suspected ADR, such that (1) we now know whether each of those drugs causes the ADR and (2)

| Information retrieval
Given the ever-growing number of available publications, the need for advanced information retrieval (IR) systems increases. AI may also help here. At present, most IR systems, such as general search engines (eg, Google and Yahoo) and scientific literature search engines (eg, PubMed and ACM Digital Library), use keywords to query and index documents.
However, this traditional keyword-based IR model provides little semantic context for the understanding of user information needs. For example, a keyword usually has several senses and its meaning is ambiguous without context. In addition, one meaning can be expressed by many keywords. 49 There is a long-running research program that tries to address these problems. 50
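Both problems can be illustrated with a toy example. The mini-corpus, the synonym table and the function names below are hypothetical and do not reflect how any particular search engine works; the sketch only contrasts literal keyword matching with a naive synonym expansion.

```python
# Hypothetical mini-corpus and synonym table, for illustration only.
DOCUMENTS = {
    "d1": "case report of hepatotoxicity after drug exposure",
    "d2": "liver injury observed in a cohort study",
    "d3": "cold storage requirements for vaccines",
}

SYNONYMS = {
    "hepatotoxicity": {"hepatotoxicity", "liver injury"},
    "liver injury": {"hepatotoxicity", "liver injury"},
}


def keyword_search(query, documents):
    """Return ids of documents that contain the literal query string."""
    return sorted(doc_id for doc_id, text in documents.items() if query in text)


def expanded_search(query, documents, synonyms):
    """Return ids of documents containing the query or any listed synonym."""
    terms = synonyms.get(query, {query})
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if any(term in text for term in terms)
    )


if __name__ == "__main__":
    # One meaning, many keywords: literal matching misses d2.
    print(keyword_search("hepatotoxicity", DOCUMENTS))
    print(expanded_search("hepatotoxicity", DOCUMENTS, SYNONYMS))
    # One keyword, many senses: a query about the common cold hits d3.
    print(keyword_search("cold", DOCUMENTS))
```

A hand-written synonym table addresses only the first problem and does not scale; resolving the ambiguity of a term such as 'cold' requires context, which is where semantic approaches come in.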

| AI-powered graphical decision aids
Facing an increasing amount of information puts pressure not only on the way such data must be analysed, 54 but also on the way those data have to be presented for effective decision making. In fact, researchers with limited information processing capability are usually unable to cope with an exponentially increasing amount of information, leading to a phenomenon called 'information overload'. This phenomenon has widely been recognized to have adverse effects on decision quality. 55 The use of graphs as decision aids to reduce the adverse effects of information overload on decision quality has been investigated with positive results both in management 56 and in communicating risks between patients and physicians. 57 AI could aid these goals by making it easier to visualize the confirmatory impact of (hypothetical) evidence and the confirmatory impact of indicators. An interactive graphical representation of strengths of associations may lead to better decisions based on E-Synthesis.
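A very simple text-based version of such a display can be sketched as follows. The prior, the scenario labels and the Bayes factors are made-up values, and the function names are hypothetical; the sketch only shows how the impact of hypothetical evidence on the probability of causation might be put side by side.

```python
def posterior(prior, bayes_factor):
    """Update a prior probability by a Bayes factor (odds form of Bayes' theorem)."""
    odds = prior / (1.0 - prior) * bayes_factor
    return odds / (1.0 + odds)


def impact_chart(prior, scenarios, width=40):
    """Render one text bar per hypothetical evidence scenario."""
    lines = []
    for label, bayes_factor in scenarios.items():
        p = posterior(prior, bayes_factor)
        bar = "#" * round(p * width)
        lines.append(f"{label:<24}{bar:<{width}} {p:.2f}")
    return "\n".join(lines)


# Hypothetical Bayes factors for evidence that might arrive.
SCENARIOS = {
    "no new evidence": 1.0,
    "supportive case series": 3.0,
    "negative RCT": 0.2,
}

if __name__ == "__main__":
    print(impact_chart(0.2, SCENARIOS))
```

An interactive tool would let the user vary the prior and the scenarios and see the bars respond, which is the kind of aid the paragraph above has in mind.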

| DISCUSSION
We have shown how AI may contribute to pharmacovigilance by improving a Bayesian framework for evidence synthesis. We think that such applications will also benefit other approaches to evidence synthesis. The prospects for AI-supported inference in medicine seem bright, yet we stress that AI will not cure all ills.

| Limitations: AI is not a panacea
AI can reduce some of the limitations of E-Synthesis, yet some will remain. For instance, while machine learning can help in making the weighting scheme of evidential modulators, as well as the probabilities of the causal indicators, more objective, it is still a human who chooses the algorithm for these machine learning operations. There will hence continue to be room for subjective choice and disagreement about these choices. Furthermore, while graphical decision aids can improve the usability and explainability of decision processes, good decision making under uncertainty is a complicated task at which we routinely fail to be optimal. 58 One current limitation of E-Synthesis is its concept of causation. It is true that the superior transparency of AI is not guaranteed.
We noted above that machine learning systems are often incomprehensible, in some sense, even for experts. Yet, even in these cases, it is not clear that AI is any less transparent than human reasoning, since the latter might involve intuitive judgements that are also impossible to articulate formally. 70

| Future work
While we can understand causal relations between binary variables by how much (in some sense) the presence of the cause variable causes the probability of the effect variable to increase, there is also a pertinent graded sense of causation between many-valued variables: how strong an ADR does a particular dosage cause? AI holds great promise for extracting such fine-grained information from evidence, which will require continued interaction between stakeholders and scientists from numerous areas. We echo the call for an increase of such interactions to improve pharmacovigilance for the good of us all. 9,71,72