Propensity score‐adjusted three‐component mixture model for drug‐drug interaction data mining in FDA Adverse Event Reporting System

With increasing trend of polypharmacy, drug‐drug interaction (DDI)‐induced adverse drug events (ADEs) are considered as a major challenge for clinical practice. As premarketing clinical trials usually have stringent inclusion/exclusion criteria, limited comedication data capture and often times small sample size have limited values in study DDIs. On the other hand, ADE reports collected by spontaneous reporting system (SRS) become an important source for DDI studies. There are two major challenges in detecting DDI signals from SRS: confounding bias and false positive rate. In this article, we propose a novel approach, propensity score‐adjusted three‐component mixture model (PS‐3CMM). This model can simultaneously adjust for confounding bias and estimate false discovery rate for all drug‐drug‐ADE combinations in FDA Adverse Event Reporting System (FAERS), which is a preeminent SRS database. In simulation studies, PS‐3CMM performs better in detecting true DDIs comparing to the existing approach. It is more sensitive in selecting the DDI signals that have nonpositive individual drug relative ADE risk (NPIRR). The application of PS‐3CMM is illustrated in analyzing the FAERS database. Compared to the existing approaches, PS‐3CMM prioritizes DDI signals differently. PS‐3CMM gives high priorities to DDI signals that have NPIRR. Both simulation studies and FAERS data analysis conclude that our new PS‐3CMM is a new method that is complement to the existing DDI signal detection methods.

of DDIs. The incidence of DDI-induced ADEs is expected to increase due to the increasing trend of polypharmacy. Currently in US, 21.8% of US population receives three or more prescription drugs simultaneously, and 10.7% of the persons even have five or more. 2 Comparatively, these percentages were only 11.0% and 3.6% before 2000. 2 DDI-induced ADEs have become a major challenge to the clinical care setting, as DDIs are suspected to be related with up to 59.1% of ADEs, and they became a common cause for drugs withdraw from the market. 3,4 However, only limited DDIs have been detected in premarketing clinical trials; most of these trials focus on single-drug effect, and comedications due to the comorbidities are usually limited by stringent inclusion/exclusion criteria. Thus, most DDIs are discovered through analyzing the postmarketing pharmacovigilance databases. 5 The spontaneous reporting system (SRS) is an important type of pharmacovigilance database for the postmarketing ADE detection. 5 SRS collects ADE reports from a variety of sources including patients, pharmaceutical companies, and healthcare professionals. As the SRS collects only "cases", ADE incidence rates are usually inflated. Despite the potential bias, it is commonly acceptable to use the SRS and test whether a drug or a drug combination has an increased ADE relative to the other patients who did not have this drug or drug combination. This analysis is also called disproportionality analyzes (DPAs) for the drug-induced ADE signal detection. DPAs are originally developed to detect single-drug-ADE signals by comparing the observed report frequency for a drug-ADE combination (ie, observed frequency) to the baseline frequency under the no association assumption (ie, expected frequency). 6 Well-known DPAs include the Information Component (IC), 7 IC with Dirichlet prior, 8 empirical Bayes geometric mean (EBGM) also known as the multigamma Poisson shrinkage (MGPS), 9 and the three-component mixture model (3CMM). 10 On the other hand, the likelihood ratio test (LRT) and its extension zero-inflated Poisson model-based LRT (ZIP-LRT) are able to test the associations between an ADE and many drugs. 11,12 The DPA method has also been used for investigating the DDI-induced ADEs by comparing the observed report frequency of a drug-drug-ADE combination to its baseline frequency assuming no interactions between two drugs. By considering the drug-drug pair as a "new drug", the observed report frequency can be obtained from the SRS database. Therefore, the single-drug-ADE signals detection methods can be extended to identify the DDI signals by designing a new baseline frequency. For example, by extending the baseline frequency from a two-item set to the three-item set based on a so-called concept of "market basket problem", the EBGM approach was extended to identify the suspected DDIs. 13,14 The IC with Dirichlet prior approach was extended for detecting DDI signals based on a proposed three-order IC function. 8 The three-order IC is defined as the deviation between the IC under the condition of the third event and the original IC, and the baseline frequency account for both the main effects and the two-drug interactions. Despite these extensions of DPA methods, Thakrar et al 15 proposed multiplicative and additive models to detect DDI signals in the FDA Adverse Event Reporting System (FAERS) database. Under the assumption of no interaction, the multiplicative model assumes that the risk associated with one drug multiplies to the background risk, and the relative risk (RR) associated with this drug is not influenced by the other drug; the additive model assumes that the risk associated with one drug is additive to the background risk, and this drug's risk is not influenced by the other drug. Norén et al 16 also proposed a well-known Ω shrinkage method for the suspected DDIs detection in individual case safety reports (ICSRs) data. It determines the baseline model with an addictive model and calculates a shrinkage observed-to-expected ratio of the ADE risk for the drug-drug pair. The shrinkage penalizes the signals of rare drug-drug-ADE combinations with a prespecified parameter.
Detecting DDI signals from SRS has several major challenges. SRS analysis subjects to confounding bias from thousands of comedications. In single-drug-ADE association analysis, Dumouchel et al 9 proposed a stratified expectation by using the Mantel-Haenszel formula. 17 This approach calculates the expected frequency by using the expected frequency within each strata. The stratified approach can be generalized for LRT, IC, and their aforementioned extensions as well. However, stratification is limited by the number of suspected confounders. As the increased number of strata yields reduced sample size within each strata, the power to detect DDI signals is reduced. 18 Besides stratification, confounding bias can be addressed by multiple regression models in which confounding variables (eg, demographic variables and comedications) are treated as covariates. For example, van Puijenbroek et al 19 proposed a multiple logistic regression model for detecting DDI signals. Though multiple regression approach is not necessarily limited by the sample size challenge, techniques for properly selecting confounding variables are still underdeveloped. 20 False positive control is another significant challenge in detecting DDI signals from hundreds of thousands drug pairs. 10 False discovery rate (FDR) is more practical than traditional hypothesis testing techniques (eg, P-value) for large-scale DDI signals detection. For instance, Bayesian FDR and local FDR were proposed to control false positive rate for detecting single-drug signals. 21,22 While these approaches have not been developed for DDI signals detection.
In this article, our aim is to detect suspected DDI signals with a low FDR from the FAERS, which is a prominent SRS database. Specifically, we propose a novel approach to compute the expected report frequency, in which confounding bias is adjusted by using the comedication information. Then, we model the observed and expected frequencies for the drug-drug-ADE combinations under an empirical Bayes model frame work. By defining the null risk (eg, RR = 1), the proposed approach estimated local FDR for each signal. The rest of the article is organized as follow. In Section 2, we review the Ω shrinkage method and describe the proposed method. In Section 3, we conduct simulation studies to evaluate and compare the performance of the proposed approach and the Ω shrinkage method. In Section 4, we illustrate the utilization of the proposed approach by analyzing the FAERS. Last, Section 5 concludes and discusses the proposed approach.

The FAERS database and data processing
The FAERS is a prominent SRS maintained by the U.S. Food and Drug Administration. The FAERS database collects adverse event reports directly from manufacturers, pharmacists, physicians, nurses, and consumers. 23 Each report consists of information including: (1) patients demographics; (2) drugs information which classified as "primary suspect", "secondary suspect", "concomitant", and "interacting"; (3) ADEs coded by using the Medical Dictionary for Regulatory Activities' (MedDRA) Preferred Terms (PTs) 24 ; (4) patient outcomes; (5) report sources; and (6) indications coded by using MedDRA PT code for the reported drugs. For FAERS analysis, the ADEs can be defined by using MedDRA PTs. However, the drug names may involve abbreviations, synonyms, brand names, and spelling mistakes. Thus, normalization of the drug names is necessary for FAERS analysis. The FAERS database has two major advantages: (1) massive sample size (>18 million reports) 25 and (2) the implementation of MedDRA PT allows the ADEs to be directly available. Despite the advantages, the FAERS database also has some limitations. First, the true ADE incidence rates cannot be obtained, as the FAERS database only contains ADE reports. Second, there is a great uncertainty between the drugs and the ADEs on a report, as the report collects all possible drugs and ADEs for a patient. 9 Third, the quality of the FAERS database may be compromised by inaccurate reports and missing data (eg, age and gender). For instance, over 30% of the FAERS reports involve missing data, as reporting of demographics is voluntary. 26 Although the FAERS database has the aforementioned limitations, it is a valuable source for DDI study.
In this article, we selected FAERS reports from 2004Q1 to 2012Q3 as our data source. To avoid the spurious associations caused by the duplicated reports, we only left the reports with the latest primary ID and removed other reports with the same primary IDs. As there was a great uncertainty of the drugs' roles (eg, primary suspect), we used all the drugs on each report including "primary suspect," "secondary suspect", "concomitant", and "interacting". Furthermore, the drug names were normalized into generic names by using the dictionary proposed by DrugBank and Wu et al 27,28 Frequent drug names (report frequency > 999) that failed to be normalized were manually checked and corrected. After data processing, our FAERS dataset contained 4 280 322 reports with 1180 generic drug names and 15 445 MedDRA PT ADE names. Of these 4 280 322 reports used for data analysis, about 38% of the reports involved with missing age information and nearly 10% of the reports involved with missing gender information.
To detect potential DDI signals, we extracted the information for drug-drug combinations and ADEs from our processed FAERS dataset. The total number of drugs in our FAERS dataset was 1180. In order to limit the search space of the drug-drug combinations and maintain sufficient candidate drug-drug combinations, we set criteria to filter the drug-drug combinations by their frequencies. Specifically, all drug-drug combinations appeared in our FAERS dataset with the frequency bigger than 50 were used in our illustrative FAERS analysis. After filtering the report frequencies of drug-drug combinations (eg, >50), we identified 81 312 drug-drug combinations generated by 1061 drugs. Furthermore, we selected four primary ADEs for analysis including delirium, myopathy, neuropathy, and skin pigmentation disorder. 10 Our final dataset contained 256 887 drug-drug-ADE combinations which were formed from four ADEs and 1061 drugs ( Figure 1).

Notations and definitions
For a drug-drug-ADE combination (eg, Drug1 and Drug2), the FAERS reports can be summarized into eight frequencies ( No Yes Abbreviation: ADE, adverse drug event. r 00 = a a+b , r 10 = c c+d , r 01 = e e+f , and r 11 = g g+h ( Table 1). We define N = g as the observed report frequency, E as the expected report frequency, as the RR, and̂= N E as the observed RR for a drug-drug-ADE combination. For different DPAs, E are calculated differently.

A review of the shrinkage method
In this section, we review the well-known DPA to detect DDI signals from SRS by Norén et al 16 This method, also known as the Ω shrinkage method, penalizes the signals of less frequent drug-drug-ADE combinations. Under this approach, based on the addictive risk model: The expected frequency E is calculated as: In Equation (1), (g + h) represents the total number of reports that contain both Drug1 and Drug2 ( Table 1). The maximum functions on denominator are equivalent as r 10 = max(r 10 , r 00 ) and r 01 = max(r 01 , r 00 ). These restrictions guarantee the ADE risk caused by single drug is higher than its background risk. And the Ω shrinkage measure is calculated as In Equation (2), is prespecified (eg, = 0.5). The Ω shrinkage method can also be derived under the Bayesian framework by assuming N ∼ Pois( E) and ∼ Γ( , ). The posterior distribution of is also a gamma distribution: | N, E ∼ Γ(N + , E + ). To detect DDIs, the signal generation can be based on the logarithm of the lower limits of 95% credibility interval for denoted as Ω 025 . According to Noren et al 16 , positive Ω 025 value indicates DDI signals.

Propensity score-adjusted three-component mixture model
Our proposed statistics for DDI signal detection is named as PS-adjusted three-component mixture model (PS-3CMM). We will first introduce the calculation for the expected frequency E and subsequently the empirical Bayes mixture model for adjusted-FDR estimation.

Logistic regression-based expected frequency for drug-drug-ADE combinations
If there are no confounders, logistic regression model (3) can be used to estimate the individual drug and DDI effect.
where in Equation (3), D1 and D2 are the binary drug exposure status, P(ADE = 1) is the risk of ADE, and 0 , 1 , 2 , 3 are the coefficients. The estimation for the background odds of the ADE in the absence of both D1 and D2 is: Similarly, the estimations for the odds of the ADE in the absence of one of D1 and D2 are The estimation for the odds of the ADE in present of both D1 and D2 under the condition of no interaction is From Equations (4) to (7), we get the relation for the reporting rate (8) Therefore, the baseline model can be seen as a multiplicative odds model. From Equations (7) and (8), the expected report frequency of taking both drugs under a non-DDI assumption can be calculated as:

PS-adjusted logistic regression-based expected frequency for drug-drug-ADE combinations
Because the large number of comedication confounders cannot be straightforwardly controlled in the SRS dataset, we propose a PS-adjusted expectation to address confounding bias. Assuming that the unobserved confounders can be characterized by the comedications in our dataset, the logistic regression (10) and the principal components (PCs) derived from comedications are used to compute the probability (ie, propensity score, PS) of taking the drug.
In Equation (10), K denote the number of PCs that we used to calculate the PS. PCs are derived from the comedications and K are selected by a prespecified threshold for the fraction of the total explained variance. Then, for each drug-drug-ADE combination, we used the logistic regression model (11) to estimate the individual drug and DDI effect with adjustment of two PSs.
In Equation (11), D1 and D2 are the binary indicators of two drugs and PS1 and PS2 are their PSs, respectively. The expected report frequency is calculated as: In Equation (12), for report i, is the estimation of the PS-adjusted probability for the ADE occurrence with only the individual drug effects. In another word, E is the expected report frequency of taking both drugs under a non-DDI assumption while adjusted for potential confounders.

FDR estimation for drug-drug-ADE combinations
To model the observed and expected report frequencies (ie, Ns and Es), we adopt the empirical Bayes mixture model framework. As nearly 83% of report frequencies for drug-drug-ADE combinations in our dataset is equal to 0, we choose the three-component empirical Bayes mixture model proposed by Zhang et al 10 to model the RR for the drug-drug-ADE combinations. Moreover, the 3CMM framework also provides a much needed false positive control for the signals. We assume the drug-drug-ADE combinations belong to three groups. They are (i) zero DDI risk: drug-drug-ADE combinations have 0 frequency (eg, neither the drug combination nor the individual drugs causes the ADE); (ii) background noise: the report frequencies for drug-drug-ADE combinations are close to their expected frequencies due to the additive effects from two drugs; and (iii) DDI signals: drug-drug-ADE combinations have much higher RRs than the background additive effect. We assume the unobserved RR ( ) follows the 3CMM (13).
In Equation (13), I( = 0) is an identity distribution that characterizes the drug-drug-ADE combinations with RR = 0 (eg, no ADE risk). Γ( ; l , l ) is the gamma distribution such that Γ( ; l , l ) = l l Γ( l ) l −1 exp(− l ). There are two gamma distributions with different restrictions in the 3CMM. One gamma distribution (l = 2) restricts 2 = 2 , and has its mean equal to 1. Therefore, this gamma distribution characterizes the drug-drug-ADE combinations to have a background ADE risks (ie, mean RR = 1). The other gamma distribution (l = 3) restricts 3 > 3 and has its mean greater than 1. Therefore, this gamma distribution characterizes drug-drug-ADE combinations with increased ADE risks (ie, mean RR > 1), which represents signals of DDI. Furthermore, from Equation (13), we assume N ∼ Pois( × E). Thus, the observed distribution of N is 2 = 2 , 3 > 3 , and P 1 + P 2 + P 3 = 1.
In Equation (14), ] N+ l is the negative binomial distribution function. For the adverse DDIs detection, our concern is drug-drug-ADE combinations with positive report frequencies (N > 0). Therefore, to reduce the excessive amount of drug-drug-ADE combination with N = 0, the conditional distribution (15) can be used for inference, which will not affect local FDR estimation accord to Zhang et al 10 In Equation (15), P(N > 0; 2 , 2 , N) is the probability for N > 0, F(N = k; 2 , 2 , E) is the distribution function for P(N > 0; 2 , 2 , N). The conditional log-likelihood function for Equation (15) is: By considering P 3 P 2 as one parameter, the four parameters in Equations (16) can be obtained by maximizing the observed likelihood (eg, maximum likelihood estimate). In this study, the MLEs are obtained by using the R function nlminb, which is able to conduct both unconstrained and box-constrained optimization by using Newton-like methods. 29 As the mean of the first gamma component in Equation (16) equals 1, it will be served as the null distribution to compute the local FDR accord to Efron et al 22 The local FDR for a drug-drug-ADE combination is defined as: In another word, local FDR represents the posterior probability that a drug-drug-ADE combination has the background RR (null). As we use PSs to adjust confounding bias, we name the statistic in Equation (17) as adjusted-FDR for screening adverse DDIs.

Proportional reporting ratio for DDI
Proportional reporting ratio (PRR) is a well-known method for pharmacovigilance proposed by Evan et al. 30 PRR is developed for detecting signals of single-drug-induced ADE from a SRS database. Specifically, PRR calculates the ratio of ADE reporting rates with and without a drug. Based on the report frequencies of Drug1 and Drug2 (Table 1), PRR D1 and PRR D2 for Drug1 and Drug2 are defined as: For the Drug1-Drug2-ADE combination, we define its PRR by considering the Drug1-Drug2 pair as a "new drug", based on the drug pair's ADE frequency (Table 1), the PRR D1D2 for the Drug1-Drug2-ADE combination is: The lower bound of the 95% confidence interval (CI) for the PRR is defined as: where SD denotes the SD, for Drug1, SD D1 = √ 1 c+g − Comparing to adjusted-FDR and Ω 025 , PRR and PRR 025 are measurements without shrinkage adjustment. Therefore, in this study, they will be served as a reference for the Ω shrinkage and our proposed approach.

SIMULATION STUDY
We conducted extensive simulation studies to evaluate the performance of our proposed method and compare it with the Ω shrinkage method. In each method, the area under the receiver-operating characteristic (ROC) curve (AUC) was calculated by using the simulated true positive DDIs and negative controls. Let PS be the PS, D be the binary drug exposure status, and Y be the binary ADE status. Specifically, the data were simulated as: In order to evaluate and compare the performance of our proposed method to the Ω shrinkage method, we first define two types of DDIs. One type is DDI with nonpositive individual drug relative ADE risk (DDI-NPIRR) (eg, r 01 − r 00 ≤ 0 or r 10 − r 00 ≤ 0) and the other is DDI with positive individual drug relative ADE risk (DDI-PIRR) (eg, both r 01 − r 00 > 0 and r 10 − r 00 > 0).
For all the three situations, the total number of reports was chosen to be 100 000, and PS was generated from beta distribution such that PS ∼ Beta(1, 6). In each simulation, 10 000 positive DDI signals and 10 000 negative controls were simulated from the uniformly distributed parameters shown in Table 2.
The ROC curves and AUC values are shown in Figure 2. In the first simulation, Figure 2A, we display the AUCs in the absence of confounding bias [ 4 = 5 = 0 ] in Equation (21) for both DDI-NPIRR and DDI-PIRR. Although both proposed PS-3CMM and the Ω shrinkage methods have good performance, PS-3CMM performs better than Ω shrinkage, AUC = 0.94 vs AUC = 0.88. In the second simulation with only DDI-NPIRR, PS-3CMM has even better DDI detection performance than the Ω shrinkage method, AUC 0.93 vs 0.78, respectively. In the third simulation, Figure 2C, when confounding bias is present, [ 4 ≠ 0 and 5 ≠ 0 ] in Equation (21), the PS-3CMM's DDI detection performance did not reduce (AUC = 0.94) much from the simulation where there was no confounding. On the other hand, the Ω shrinkage method has obvious decreased performance from 0.88 to 0.84 when confounding bias was presented.

FAERS ANALYSIS
We demonstrate the utilization of the proposed approach in analyzing the FAERS database including 256 887 drug-drug-ADE combinations. In our analysis, 160 PCs, [K = 160 ] in Equation (10), were used to calculate the PS and  = 0.08. These estimates mean that the component 2 (background noise group) had mean RR = 1 with SD = 0.68 and the component 3 (increased risk group) had mean RR = 5.33 with SD = 11.33. For each drug-drug-ADE combination, the adjusted-FDR and Ω 025 were calculated for signal detection and comparison. We also calculate the value of PRR 025 for each drug-drug-ADE combination as a reference for the DDI signals detected by PS-3CMM and Ω shrinkage methods.

Performance comparisons between PS-3CMM and shrinkage
If both drugs cause an ADE, we define it as the DDI-PIRR, which yields additive or synergistic risk for the ADE. In terms of reporting rate, DDI-PIRR has the pattern r 11 > r 10 , r 01 > r 00 (Table 1). Alternatively, if only one drug has ADE risk and the other one does not, we define it as DDI-NPIRR, which increases the ADE risk when two drugs are coprescribed. In terms of reporting rate, DDI-NPIRR has the pattern r 11 > r 10 > r 00 ≥ r 01 or r 11 > r 01 > r 00 ≥ r 10 . We compared the top-100 DDI signals between two methods in Figure 3: PS-3CMM and Ω shrinkage. The x-axis is the difference of the marginal reporting rate of drug 1 and the reporting rate of no drug (eg, r 10 − r 00 ); and the y-axis is the difference of the marginal reporting rates of drug 2 and the reporting rate of no drug (eg, r 01 − r 00 ). Clearly, top-100 DDI signals of PS-3CMM and Ω shrinkage have different patterns. Two lists only have 25 overlapped DDIs (Table 5). Among the Ω shrinkage's top-100 signals, 62% of them have r 10 , r 01 > r 00 . On the other hand, only 15% of the PS-3CMM's top-100 signals are DDI-PIRR. For DDI-NPIRR (r 00 ≥ r 10 or r 00 ≥ r 01 ), the percentages are 32% for Ω shrinkage and 64% for PS-3CMM.

Evaluation of top-100 DDI signals detected by adjusted-FDR and 025
The drug side-effect database (SIDER) and DrugBank database were used to evaluate the top-100 adjusted-FDR DDI signals (Table S1) and the top-100 Ω 025 DDI signals (Table S2). SIDER collects drug-ADE association data from drug labels. 33 DrugBank documented DDIs from a variety of sources. 28 Our evaluation was conducted manually through searching the drug-drug-ADE combinations in SIDER and DrugBank. The results for the adjusted-FDR DDI signals and the Ω 025 DDI signals are summarized in Tables 3 and 4, respectively. As Table 3  Except the overlap 25 DDI signals, both our proposed method and the Ω shrinkage method can detect some DDI signals that cannot detected by each other, therefore, the PS-3CMM is an important complementary to the Ω shrinkage method.

CONCLUSION AND DISCUSSION
In this article, we propose a new approach, PS-3CMM to detect signals of DDI from the FAERS database. This method controls false positive rate and adjusts for confounding bias due to comedications. Specifically, this approach first uses the comedications to derive the PSs. These PSs are then utilized to calculate adjusted expected frequencies for drug-drug-ADE combinations under the no DDI assumption. The empirical Bayes mixture model is used to characterize the RR between the DDI-induced ADE frequency and the expected ADE frequency of drug-drug combination under no DDI effect assumption. This 3CMM estimates adjusted-FDR for each drug-drug-ADE combination. The PSs are derived from PC analyses of comedication variables. The number of PCs can be determined by the percentage of total variation that explained by the selected PCs. In our illustrative FAERS analysis, the selected PCs capture 70% of variations of comedications. Therefore, we believe these comedication PCs contain sufficient variations among the comedications, and they shall capture a great deal of comorbidity variation among the patients. Our model also includes PS as continuous variable. It avoids the potential sparsity caused by the stratification of the data used in the EBGM and IC. 18 For instance, in our FAERS database the drug pairs' frequency has a median of 180. For P binary variables (eg, age, gender, and comedications), the number of strata is 2 P . The sparsity appears even for moderate P (eg, stratify by four age groups and gender yield P = 8). Unlike the Ω shrinkage method in which the penalize parameter is prespecified, the proposed approach is data-driven in which the parameters in the prior distribution are estimated from data under the empirical Bayes framework. Furthermore, the adjusted-FDR provides much needed false positive control for high throughput DDI signals detection. Our simulation studies show the PS-3CMM has better performance comparing to the Ω shrinkage method especially in detecting DDI-NPIRR. In addition, the simulation studies also demonstrate that our PS-3CMM has the advantage of adjusting for confounding bias, which is a common problem in SRS.
The application of PS-3CMM is demonstrated through the FAERS database. In FAERS analysis, we notice that the PS-3CMM and the Ω shrinkage prioritized signals differently. PS-3CMM is more prone to identify DDI-NPIRR signals. DDI changes the expected pharmacologic or clinical response for an individual drug when two drugs are coadministered. 1 The incidence of ADE can be significantly increased by: (1) pharmacokinetic (PK) DDI, in which a drug's plasma concentration is increased by coprescribed drug and exceeded the maximum tolerated dose 34 and (2) PD DDI in which additive or synergistic effects on ADE risk occurs. For DDI-NPIRR signals, a drug's ADE risk is significantly increased by another drug which has either similar or lower ADE risk compared with background. In another word, DDI-NPIRR signals are more likely to have a mechanism similar to PK DDI. For instance, the acetaminophen-methamphetamine-neuropathy combination and acetaminophen-neostigmine-neuropathy combination are highly prioritized by PS-3CMM but not by the Ω shrinkage method. They are ranked in 11th and 17th by PS-3CMM, and 915th and 8126th by the Ω shrinkage method (Table S1). In both combinations, the neuropathy risks of methamphetamine and neostigmine are lower than background risks, which imply both of these DDI signals are DDI-NPIRR signals. Moreover, both drugs are known to have PK drug interaction with acetaminophen according to DrugBank. 35 For instance, the description of drug interactions for acetaminophen-methamphetamine combination is "Acetaminophen may decrease the excretion rate of Metamfetamine which could result in a higher serum level". Thus, these DDI-NPIRR signals are likely to have a mechanism of PK DDI. Alternatively, DDI-PIRR signals are more likely to have a mechanism similar to PD DDI. For DDI-PIRR signals, both individual drugs have higher than background risks, and the drug combinations have much increased risks. For instance, the cetirizine-levodopa-neuropathy combination and the ibuprofen-tinzaparin-neuropathy combination are highly prioritized by Ω shrinkage but not by the PS-3CMM. They are ranked in 32th and 34th by the Ω shrinkage method and 5716th and 19226th by PS-3CMM (Table S2). Both of these two DDI signals are DDI-PIRR signals. These two-drug combinations are known to have increased ADE risk according to DrugBank with no clear evidence of PK drug interaction. 35 In conclusion, DDI-NPIRR signals are more likely to have a mechanism similar to PK DDI and DDI-PIRR signals are more likely to have a mechanism of PD DDI. The difference of the DDI signals pattern is mainly caused by the different baseline risk assumptions, and the Ω shrinkage method also restricts the ADE risk caused by single drug should be higher than its background risk, but the PS-3CMM do not has such restriction. Among the top-100 signals detected by PS-3CMM, 49 drug pairs have documented DDI evidences from DrugBank and 24 drug pairs have at least one drug with the ADE in its drug label. At the meantime, 75% of top 100 PS-3CMM DDI signals are not ranked top 100 by the Ω shrinkage method. These results suggest that PS-3CMM is a different and complementary approach to the Ω shrinkage approach, as well as the PRR approach.
As we described earlier in Section 2.4.1, instead of using PS-adjusted expected DDI-ADE frequency under the no DDI effect assumption, unadjusted expected DDI-ADE frequency can be computed in Equation (9). Although PS-adjusted analysis is clearly supreme to the unadjusted analysis in FAERS data analysis, there are other situations that unadjusted analysis is probably better. For example, if the pharmaco-epidemiological study is designed from a longitudinal cohort, such as the electronic health record (EHR) dataset, the confounding variables, such as demographics, comedications, or comorbidity can be matched and adjusted through nested case-control study design. 36,37 In this case, in analyzing DDI-ADE associations, the unadjusted analysis is a proper choice. Because it does not need to adjust for confounding variables, it shall have better power comparing to the adjusted analysis. Additionally, in real application, either the full model (Equation 13) or the conditional model (Equation 15) can be used for signal detection. However, the conditional model is more efficient when the primary interest is to detect drug-drug-ADE combinations with increased risk.
In this article, we illustrate the utilization of our proposed method by using the FAERS database which is a preeminent SRSs. Here, we would like to point that the proposed method can be utilized to other SRS databases. Nowadays, the importance of pharmacovigilance through SRS is internationally recognized. For instance, all European Union countries were obliged to establish patient/consumer reporting within their SRSs since 2012. 38 China and Japan have well-established SRS as well. 39,40 Additionally, the WHOs SRS (VigiBase) contains over 20 million reports which are coded by MedDRA and WHO Drug Dictionaries. 41 A detailed review of SRS database can be found in Huang et al 42 These SRS databases share similar informatics structure (eg, MedDRA), which facilitates the utilization of the proposed method. For instance, the Ω Shrinkage method has been applied to both VigiBase and the China's SRS database, 16,[43][44][45] in which the frequencies in Table 1 are able to be obtained. Thus, PS-3CMM can be straightforwardly implemented to the other SRS databases, which share similar informatics structure with FAERS. As for illustration, we only choose four ADEs in our FAERS analysis. However, the framework of PS-3CMM has no limitation with respect to the number of ADEs. We would also like to point out that the study can be initiated by either ADE(s) or drug combination(s). The current study is initiated by ADEs, as our motivation is to detect novel drug interaction signals. Alternatively, a study may be initiated with drug combinations with evidence of DDI. Such a study is more powerful to detect ADE signals and it can be used as a validation for pharmacology study.
One limitation of the proposed approach is to handle highly correlated drugs. It is a major challenge for all of the current DDI detection research. These highly correlated drugs (eg, a drug always coadministered with another) are due to increasing practice of polypharmacy. Another limitation is the proposed approach does not incorporate the drugs' duration and dosage information which are important factors for DDI, as those informations are not well captured by most SRS databases. Alternatively, EHRs databases contain detailed temporal information with respect to patients' pharmacy prescriptions, as well as the dosages. The associations between drugs and ADEs in EHR can be investigated under elegant epidemiology designs. For instance, nested case-control design with a 30-day drug exposure can be used to investigate acute ADEs, as well as the case-crossover design. 46 Hennessy et al provides detailed guidance for investigating DDI from EHR databases. 47 We would like to point out that our proposed method has the potential to analyze EHR data with proper epidemiology designs. For instance, instantaneous exposure before the ADE (ie, duration information) can be captured by the drug exposure window, and the analysis can be stratified or adjusted by dosage information. We believe the detection of DDI signals from EHR databases can be an important direction to further expand our model in order to characterize drugs' duration and dosage information.

ACKNOWLEDGMENTS
This work has been supported by several NIH grants, DK102694, GM10448301-A1, R01GM117206, and R01LM011945; and NSF grant, NSF1622526. It has also been supported by National Natural Science Foundation of China (61471139), Natural Science Foundation of Heilongjiang Province (F2016006), HEU Fundamental Research Funds for the Central University (3072019CFG0401, HEUCFP201722).

DATA AVAILABILITY STATEMENT
Data sharing is not applicable to this article as no new data were created or analyzed in this study.