A Bayesian Joint Model for Compositional Mediation Effect Selection in Microbiome Data

Analyzing multivariate count data generated by high-throughput sequencing technology in microbiome research studies is challenging due to the high-dimensional and compositional structure of the data and overdispersion. In practice, researchers are often interested in investigating how the microbiome may mediate the relation between an assigned treatment and an observed phenotypic response. Existing approaches designed for compositional mediation analysis are unable to simultaneously determine the presence of direct effects, relative indirect effects, and overall indirect effects, while quantifying their uncertainty. We propose a formulation of a Bayesian joint model for compositional data that allows for the identification, estimation, and uncertainty quantification of various causal estimands in high-dimensional mediation analysis. We conduct simulation studies and compare our method's mediation effects selection performance with existing methods. Finally, we apply our method to a benchmark data set investigating the sub-therapeutic antibiotic treatment effect on body weight in early-life mice.


Introduction
The human microbiome is the collection of micro-organisms (e.g., bacteria, archaea, viruses, fungi) that live on and inside of our bodies.A major research question in human microbiome studies is the feasibility of designing interventions that modify the composition of the microbiome to promote health and cure disease.Methodological developments designed to address this research question have taken on various forms and are challenged by the compositional structure, high-dimensionality, overdispersion, and zero-inflation characteristic of microbial count data.Examples of recent developments include sparsity-induced univariate and multivariate count regression models to identify exposures that characterize the composition of the microbiome (Chen and Li, 2013;Jiang et al., 2021;Koslovsky, 2023;Koslovsky and Vannucci, 2020;Liu et al., 2021;Wadsworth et al., 2017;Xu et al., 2015;Zhang and Yi, 2020;Zhang et al., 2017), compositional regression models to predict biological, genetic, clinical, or experimental conditions using microbial abundance data (Lin et al., 2014), and joint models for simultaneous inference of these relations (Koslovsky et al., 2020), among others.
Several clinical studies have hypothesized that the microbiome may mediate the relation between an assigned treatment (e.g., diet) and an observed phenotypic response (e.g., body mass index).The total effect of the treatment on the outcome is then comprised of a direct effect (not through the microbiome) and an indirect effect through its relation with the compositional mediators, both of which may be confounded by other covariates.
Hypothesis testing and regularization techniques have been proposed to test and identify mediation effects of the microbiome.For example, Zhang et al. (2018) designed a distancebased approach which incorporates prior structural information of the microbial data, such as evolutionary relations, and uses a robust, permutation-based approach for simultaneous inference on multiple distances.This approach estimates an overall mediation effect for the microbiome but cannot estimate mediation effects for each taxon and does not allow for additional covariates in the model.Sohn and Li (2019) assumed a linear log-contrast model to model the relation between potential mediators and the outcome and applied a debiased regularization procedure for estimation to produce both overall and component-wise mediation effect estimates while allowing for additional covariates.Zhang et al. (2019) took a similar approach as Sohn and Li (2019) but applied isometric log-ratio transformations, often referred to as balances (Egozcue et al., 2003), to model the relation between the microbial taxa and the outcome.Thereafter, inference for relative indirect effects is performed using a joint significance test with a focus on pre-specified taxa.Here, following these authors, we refer to the indirect or mediation effects for each taxon as relative indirect effects, to reflect the relative nature of the information captured in each balance (see section 2.1 for more details).Zhang et al. (2020) extended the work of Zhang et al. (2019) via a closed testing-based selection procedure to identify individual taxa that mediate the relation between the exposure and phenotypic outcome.Wang et al. (2020) proposed a two-stage regularized estimation approach for high-dimensional compositional mediation analysis, which uses a Dirichlet regression model to characterize the relation between treatment and the microbial data while simultaneously investigating potential interaction terms.Similar to Sohn and Li (2019), this model identifies relative and overall mediation effects in addition to accommodating other covariates and interaction terms.Song et al. (2020) and Song et al. (2021) demonstrate the benefits of a Bayesian approach for exploratory high-dimensional mediation analysis using various types of shrinkage priors to identify active mediators while simultaneously quantifying model uncertainty.However, these approaches are designed for a high-dimensional set of continuous mediators which is not suitable for the compositional structure of microbiome data.

Compositional Mediators Covariates
Figure 1: Causal directed acyclic graph of the assumed compositional mediation framework with auxiliary covariates (measured or unmeasured) in both levels of the model.
In this paper we build upon the approach of Koslovsky et al. (2020) and recast their joint model for compositional microbiome data into the causal framework represented in Figure 1, which allows for the identification of mediation effects under the assumption of a randomized treatment.Compared to two-step approaches, which first model the relation between microbial abundances and a set of covariates and then regress a phenotypic outcome on the estimated relative abundances obtained in the first step, Koslovsky et al. (2020) propose jointly modeling phenotypic outcomes and microbial abundances, which directly accommodates uncertainty in the abundance estimates.Notably, their approach is related to the broad class of methods that makes distributional assumptions for covariates to reduce inferential biases (Carroll et al., 2006;Tadesse et al., 2005).In simulation, they show that this results in improved selection, estimation, and predictive performance.To accommodate overdispersion, the microbial abundance data are assumed to follow a Dirichlet-multinomial distribution, given the treatment assignment and a set of observed covariates.A compositional linear regression model relates the relative abundances, which represent the proportion of each microbe in the microbial sample, to the outcome.We show how the use of discrete spike-and-slab priors for regression coefficients, which explicitly place a point mass at zero for excluded terms, provides direct inference on the presence of overall and relative mediation effects, treatment effects, and potential confounders.By using a fully Bayesian approach for inference, our method inherently quantifies uncertainty for each term in the model and functions thereof.As such it provides a more comprehensive approach for compositional mediation analysis compared to existing approaches.We demonstrate our method's performance versus comparative approaches on simulated data, provide recommendations for hyperparameter specifications and estimation of causal estimands, and apply our model to a benchmark study investigating the meditation effects of the gut microbiome on the relation between sub-therapeutic antibiotic treatment and body weight in early-life mice (Schulfer et al., 2019).
In section 2, we first present the Bayesian joint model for compositional mediation analysis and then describe inference on direct and indirect effects following specification of the causal assumptions.In section 3, we demonstrate our method's performance in various simulated settings and compare the results to existing methods.In section 4, we apply our model to the benchmark study.We conclude with final remarks in section 5.

Methods
Let y i denote the observed continuous outcome of subject i = 1, . . ., n and t i ∈ {0, 1} the assigned treatment, with t i = 1 if subject i received the treatment and t i = 0 otherwise.
Furthermore, let z i = (z i1 , ..., z iJ ) indicate a J-dimensional vector of taxa counts and x i = (x i1 , ..., x iP ) a P -dimensional vector of observed covariates.We first recast the joint model for compositional microbiome data of Koslovsky et al. (2020) into a framework for mediation analysis, where the relative abundances are treated as potential mediators, and then describe inference on direct and indirect effects following specification of the causal assumptions.

Bayesian Joint Model for Mediation Effect Selection
We adopt a joint model formulation that comprises a linear regression model for the phenotypic outcome and a Dirichlet-multinomial regression model for the compositional taxa.
The two models are linked via balances, calculated based on estimated relative abundances, that serve as the shared parameters.
Outcome Model: A multiple linear regression model is used to capture the direct effect of the treatment on the outcome, while adjusting for potential mediators and other covariates (including potential confounders of the outcome and mediators), as where the balances B(η j , ψ i ) are a function of the relative abundances ψ i = (ψ i1 , . . ., ψ iJ ) , with J j=1 ψ ij = 1, as described below.The relative abundances represent the proportion of the microbiome sample that is made up of each microbe.Regression coefficients β = (β 1 , . . ., β J−1 ) represent the balances' effects, c 0 the intercept term, and c 1 the direct effect of treatment.Coefficients κ = (κ 1 , . . ., κ P ) capture the effects of the covariates, x i , and i represents the error term.Spike-and-slab priors (Brown et al., 1998;George and McCulloch, 1997;Tadesse and Vannucci, 2021) are imposed on the coefficients β and κ, allowing us to investigate whether the balances and/or covariates are associated with the outcome, respectively.Specifically, (2) where δ 0 (•) represents a Dirac delta function, or point mass, at zero.Here, the latent inclusion indicators ξ j and ν p take on values of 0 or 1, where ξ j = 1 (ν p = 1) indicates that the corresponding balance (covariate) is included in the model, and 0 otherwise.We assume Bernoulli priors on the binary inclusion indicators, with Beta hyperpriors imposed on the inclusion probabilities.This allows the inclusion probabilities to be marginalized out for efficient sampling.We indicate this prior construction as ξ j ∼ Beta-Bernoulli(a j , b j ) and , where a j (a p ) and b j (b p ) control the sparsity of the balances (covariates) in the model.To complete the outcome model's formulation, we assume c 0 , c 1 ∼ Normal(0, h c σ 2 ) and i ∼ Normal(0, σ 2 ), where σ 2 ∼ Inverse-Gamma(a 0 , b 0 ) for some a 0 > 0 and b 0 > 0.
Dirichlet-Multinomial Model: The microbial taxa counts are treated as compositional and assumed to follow a multinomial distribution given the relative abundances ψ i (i.e., z i ∼ Multinomial( żi |ψ i ), where żi = J j=1 z ij ).Conjugate priors for ψ i can be specified as ψ i ∼ Dirichlet(γ i ), where γ i is a J-dimensional vector of concentration parameters.
Note that the distributional assumptions for the taxa counts could take on various forms (Zhang et al., 2017).We chose a Dirichlet-multinomial (DM) model as it accommodates overdispersion and provides a computationally efficient Markov chain Monte Carlo (MCMC) routine that exploits data augmentation (Koslovsky et al., 2020).A log-linear regression framework can be used to relate the relative abundances with the treatment and covariates by introducing λ ij = log(γ ij ) and defining (3) In this formulation, α j is a taxon-specific intercept term, φ j is the taxon-specific regression coefficient for treatment, and θ j = (θ j1 , . . ., θ jP ) are the taxa-specific regression coefficients corresponding to the covariates.Note that in general the potential covariates included in Equation (Eq.) (3) do not have to match those included in Eq. (1).Similar to the outcome model, influential terms can be identified by imposing spike-and-slab priors on each of the regression coefficients, φ j and θ jp , with Gaussian slabs centered at 0 and variance r 2 j .We assume Beta-Bernoulli priors for the latent inclusion indicators, ϕ j ∼ Beta-Bernoulli(a v , b v ) and ζ jp ∼ Beta-Bernoulli(a t , b t ), respectively.The prior specification is completed by assuming α j ∼ Normal(0, σ 2 α ).Construction of Balances: The outcome model and the DM model are linked via balances, calculated based on relative abundances, that serve as shared parameters.Balances are isometric log-ratio transformations, defined proportionally to the difference in the mean of the log-transformed abundances between two groups or partitions, and are scale invariant (Egozcue et al., 2003).For a generic balance k, the relative abundances ψ are divided into two non-overlapping partitions, denoted as ψ k+ and ψ k− , which we represent as a J-dimensional vector η k .The elements of η k take on values of 1, −1, or 0 with indices corresponding to the taxa positions in ψ.Specifically, 1 indicates that the corresponding ψ j belongs to partition ψ k+ , −1 that it belongs to partition ψ k− , and 0 implies it is not in either partition.The balance for a partition is defined as where |•| indicates the dimension of the partition and g(•) the geometric mean.Thus, balances can be seen as a normalized log ratio of the geometric mean of the elements assigned to each partition, and β j in Eq. 1 is interpreted at the expected change in Y for a unit increase in the logarithm of the ratio between the geometric mean of the taxa in ψ j+ and the taxa in ψ j− .
It is important to note that prediction performance of the model does not depend on the order in which the partitions are defined using sequential binary separation (Koslovsky et al., 2020).Additionally, balances cannot handle observed zero counts and require adjustments based on assumptions of their occurrence (Martín-Fernández et al., 2015).To handle zero values for ψ, we use a multiplicative replacement strategy in which zero values are replaced with relatively small pseudovalues, and the corresponding probability vector is scaled to sum to one (Martin-Fernandez et al., 2000).This strategy does not affect the modeling of the relationship between treatment and relative abundances.

Causal Assumptions and Definition of Mediation Effects
We now discuss the assumptions required to identify the direct and indirect causal effects in our modeling approach.We operate under the potential outcomes framework (Rubin, 2005).Within this framework, potential outcomes for each subject exist under any possible treatment value, but the outcome for a subject can only be observed under one treatment value.The potential outcome under the treatment value the subject does not receive (counterfactual treatment) is typically referred to as the counterfactual outcome (Höfler, 2005).
The total effect of the treatment on the outcome on the additive scale is the summation of a direct effect and an indirect effect through its relation with the compositional mediators.
One of the key advantages of our approach is that it provides inference on taxon-specific mediation effects as well as an overall mediation effect for the microbiome, in addition to inherently estimating model uncertainty.Under the typical stable unit treatment value assumption (i.e., consistency and no interference) (Rubin, 1980(Rubin, , 1986)), we assume that: Assumption 3.There is no interaction between T i and ψ i , for all x, and where t, t ∈ {0, 1} are the assigned and counterfactual treatments, respectively, B(η, ψ i (t)) is the corresponding balance set, X i refers to potential confounders of the outcome and mediators, as well as additional covariates in each model, and Y i (t, b) is the potential outcome when treatment T i = t and balance B(η, ψ i (t)) = b.Assumptions 4a and 4b, in particular, imply that there are no unmeasured confounders after controlling for the covariates and treatment (Imai et al., 2010).Note that assumptions 1 and 4a are expected to hold for the simulation and application study by definition of the randomized design.In contrast, assumption 4b is an untestable assumption even with a randomly assigned exposure, but we explore the robustness of the method with respect to violations of this assumption in the simulation study.Furthermore, adjustment for post-treatment variables as a subset of X i that may confound the mediator-outcome relationship can only be made under the additional assumption that they are not induced by the exposure, tantamount to a crossworld independence assumption (Andrews and Didelez, 2020).It should also be noted that the assumption of no interaction between the treatment and mediator is not necessary for the identification of direct and indirect effects, though the simulations and applied example in the current study operate under this assumption.
Under the assumptions above, we can now define the direct effect of the treatment on the phenotypic response for the i th subject, ∆ i , as Note that the direct effect is shared among all subjects as ∆ i = ∆ i , ∀i = i .The subjectspecific overall indirect effect, δ i , is then defined as where and ) is the digamma function following Honkela et al. (2001), given the corresponding inclusion indicators ξ j = ϕ j = 1.Note that the subject-specific overall indirect effects vary across subjects, since the estimated relative abundances are a function of treatment assignment and a set of uniquely observed covariates.When there are no covariates in the DM portion of the model, the population-level overall indirect effect is the same for each individual.

Strategies for Determining Relative Mediation Effects
Unlike the overall indirect effect, identification of the separate indirect effects is typically subject to the additional assumption of independence between individual mediators (here, individual taxa) conditional on T i and X i (Imai and Yamamoto, 2013;VanderWeele and Vansteelandt, 2014).Kim et al. (2019) have recently proposed an alternative decomposition of indirect effects of individual mediators from the joint (or overall) indirect effect of multiple mediators using a Bayesian estimation approach, while allowing for interdependence between mediators.Their approach relies on a joint distributional assumption for the multiple mediators, similar to the proposed method in which we assume a Dirichlet-multinomial distribution for the multiple mediators.With the proposed approach, the calculation and interpretation of the relative indirect effects for each taxon depends on the order of the taxa when constructing the balances via sequential binary separation, though the order makes no assumptions about the nature of the causal relationships between individual taxa.To better understand this concept, consider a balance tree structure that is constructed by creating a (J − 1) * J-dimensional vector η = (η 1 , . . ., η J−1 ) , with . . .
and calculating the (J−1)−dimensional vector of balances This implies that the relative mediation effect of the taxon corresponding to the first element in η 1 , δ i1 , given B(η 1 , ψ i ), can be defined as where ] are evaluated using Eq. ( 4), similar to the overall mediation effect.As such, when ϕ [1] 1 and ξ [1] 1 are both active, or selected, into the model, the taxon corresponding to ψ 1 has a significant mediation effect relative to the rest of the taxa.
The relative mediation effects of the remaining taxa are expressed as . . . where . Note that the relative mediation effects depend on the corresponding latent inclusion indicators ξ k and ϕ k for k = 1, . . ., j − 1.Details of these derivations are provided in the Supplementary Material.
Thus, while the relative mediation effects can be estimated for taxon j = 1 using our method, we only obtain direct inference regarding the identification of a non-null δ i1 (i.e., δ i1 = 0) via its corresponding latent inclusion indicators, ξ 1 and ϕ 1 , for a given ordering of the taxa when constructing the balances.This result is an artifact of the compositional structure of the multivariate count data, and is therefore not unique to our modeling approach.Similar to the overall indirect effect, when there are no covariates in the DM portion of the the model, the population-level relative indirect effects are the same for each individual.
Given the definitions described above, we put forward three different strategies to determine active relative mediation effects for each taxon and later investigate their performance in the simulation study.Direct inference on the presence of a relative mediation effect for a given taxon via its corresponding latent inclusion indicators is only available if the taxon is assigned to the first index in η 1 .Therefore, one strategy is to run the MCMC algorithm J times with the balances constructed using a different compositional element in the first index on each run.For each run, we identify the j th taxon-specific mediation effect as active if the marginal posterior probabilities of the corresponding inclusion indicators for φ 1 and β 1 , respectively) are both greater than or equal to 0.5, where the superscript [j] indicates the j th taxon is the 1 st element in η 1 .In the comparative study performed in section 3.3 below, we refer to this inferential strategy as CMbvs 1 .
In order to avoid running the MCMC algorithm J times, an alternative strategy for determining relative mediation effects is to construct 95% credible intervals for each of the relative mediation effects determined using Eqs.( 5) and ( 6).Effects are then identified as active if their 95% credible intervals for δ ij do not contain zero.While this approach does not perform inference directly on the latent inclusion indicators, model uncertainty is still propagated into the corresponding effects' credible intervals as the MCMC algorithm iterates through different model parameterizations.We refer to this strategy as CMbvs 2 .A third potential strategy for determining relative mediation effects (CMbvs 3 ) involves a combination of strategies 1 and 2. First, the model is run once to determine active treatment effects in the DM portion of the model based on the marginal posterior probabilities of inclusion (MPPIs) for ϕ j .Then the model is re-run with the inactive terms removed.Relative mediation effects are then determined based on the 95% credible intervals for δ ij .Note that for CMbvs 2 and CMbvs 3 , selection of active relative indirect effects is determined for each unique covariate profile across subjects.If there are no covariates in the DM portion of the model, selection is performed at the population level.

Posterior Sampling and Inference
To sample the posterior distribution, we adopt the Metropolis-Hastings (MH) within Gibbs algorithm of Koslovsky et al. (2020) that uses a data augmentation approach to sample the relative abundances ψ.Let k ij represent latent variables, such that , results in closed-form Gibbs updates for u i and k ij .Inclusion indicators of the spike-and-slab priors and corresponding regression coefficients are updated jointly using an Add-Delete MH algorithm (Savitsky et al., 2011).A generic iteration of the MCMC algorithm is described in the Supplementary Material, and more details are provided in Koslovsky et al. (2020).
Given the output of the MCMC algorithm, MPPIs are used to determine the active terms at both levels of the model.The MPPIs for treatment, covariates, and balances are determined by taking the average of their respective inclusion indicators' MCMC samples after burn-in.Generally, a term is selected if its corresponding MPPI ≥ 0.50 (Barbieri et al., 2004).One of the strengths of using discrete spike-and-slab priors for Bayesian variable selection is that non-active, or excluded, covariates' corresponding regression coefficients are set to 0 and effectively removed from the model.As a result, MCMC samples for regression coefficients with corresponding MPPIs < 1.0 will be zero-inflated (e.g., an MPPI of 0.6 for a given covariate would result in selection using a 0.5 threshold, but 40% of the corresponding regression coefficient's MCMC samples would be equal to zero).While we favor this approach over refitting the model with selected covariates fixed into the model since it fully accommodates model uncertainty, this may result in skewed credible intervals and shrink posterior estimates towards zero.

Simulation Study
The assumptions made in section 2.2 imply that there are no unmeasured confounders in the model.In observational studies, unmeasured confounding will result in biased exposure effect estimates (Fewell et al., 2007).In practice, researchers may employ sensitivity analyses assessing the magnitude of these biases (VanderWeele and Arah, 2011).In this simulation study, we evaluate the method's ability to successfully identify and estimate mediation effects in various settings, including unmeasured mediator-outcome confounding (i.e., the presence of covariates associated with the mediators and outcome not accounted for in the model) as well as model misspecification.

Simulated Data
We evaluated the proposed model in various simulated scenarios to demonstrate its relative mediation effect selection and parameter estimation performance.In all scenarios, relative abundances were generated from a Dirichlet distribution and log transformed in the outcome model, instead of transformed via balances.Specifically, we generated the continuous outcome with y i = c 0 + c 1 t i + J j=1 β log,j log(ψ ij ) + i , where β log,j represents the true regression coefficients specified for the log of the j th relative abundance, similar to Zhang et al. (2020).Thus, the data generation process does not match the assumptions of our proposed model.In each scenario, we generated multivariate compositional count data for n = 200 observations with J = 50 compositional elements.We simulated the treatment received by each subject, t i , from a Bernoulli distribution with probability 0.5.Let t i = 1 indicate that subject i received the treatment, and t i = 0 otherwise.We assumed c 0 = 0, c 1 = 1, and i ∼ Normal(0, 1) when generating the continuous outcome.For each subject, the relative abundances, ψ i , were generated similar to Eq. ( 3), (i.e., λ ij = α j + φ j t i ).The taxon-specific intercept terms α j were generated from a Uniform(−2, 0.5), for j = 1, . . ., 50.We simulated data in which the first three taxa have active indirect effects with φ = (1, 1.2, 1.5, 0, . . ., 0) and β log = (3, −1.5, −1.5, 0, . . ., 0) for the corresponding log transformed relative abundances.
In the first scenario, we assumed both levels of the model were correctly specified with respect to the casual assumptions (i.e., no model misspecification or unmeasured mediatoroutcome confounding).In scenarios 2 and 3, we evaluated how misspecification of the causal assumptions (i.e., ignoring influential covariates) in the DM and linear portions of the model, respectively, may affect inference.Scenario 4 introduced unmeasured mediator-outcome confounding due to covariates unaccounted for in both levels of the model.To simulate misspecification of the causal assumptions in scenarios 2, 3, and 4, we included a binary covariate, U 1i , simulated from a Bernoulli distribution with probability 0.5 and a continuous covariate, U 2i , simulated from a standard normal distribution in each layer of the model when generating the data as necessary, but ignored these covariates when fitting the models.For the purposes of this simulation, the same covariates affected both the mediators and outcome in scenario 4. Specifically, for scenarios 2 and 4, we simulated multivariate count data with , where ν 1j and ν 2j are the j th elements of the J-dimensional vectors ν 1 = (0.8, 0, 0, 0, 1.2, 0, . . ., 0) and ν 2 = (0, 1.2, 0, 0.8, 0, . . ., 0) , respectively.In scenarios 3 and 4, we set We further explored the performance of the method in scenario 1 under different settings including a skewed distribution for the treatment (i.e., P (T i = 1) = 0.25) as well as varying sample sizes and numbers of compositional elements (i.e., n = 50 with J = 50 and J = 100).Additionally, we evaluated the models in a setting motivated by the application data with 36 observations, 36 compositional elements, and effect sizes of φ = (0.7, 1, 1.2, 0, . . ., 0) and β log = (1.8,−1, −0.8, 0, . . ., 0).Here, we used an imbalanced treatment assignment with a 2:1 ratio of those assigned to treatment and control, similar to that in the application study.We evaluated the model in the four scenarios outlined above, with U 1i and U 2i included in the data generation as necessary.

Parameter Settings and Performance Measures
Simulation results were obtained by assuming uniform priors on the parameters of the Beta-Bernoulli distributions for the latent inclusion indicators in the linear and DM regression models (i.e., a j = b j = a p = b p = a v = b v = a t = b t = 1).As such, we impose no prior knowledge on whether or not covariates are active in both levels of the model as well as whether or not the balances are associated with the continuous outcome or treatment is associated with the relative abundances.We set the scale parameters h c = h β = h κ = 1, representing weakly informative priors for the slab variances, and assumed a weakly informative prior for σ 2 in Eq. (1) (i.e., a 0 = b 0 = 1).Additionally, we set the prior variances for the intercept terms in the DM portion of the model σ 2 α = 1 and variances for the corresponding regression coefficients for treatment and covariates associated with the relative abundances r 2 j = 10 for all j = 1, . . ., J.This places a 95% prior probability between ±1.96 and ±6.20, respectively.We initiated the MCMC chains at zero for all regression coefficients in both levels of the model.Each MCMC chain was run with 5000 iterations and thinned to every 10 th iteration with a 250 iteration burn-in.Convergence was assessed by examining traceplots for φ and β.
Selection performance was evaluated via sensitivity (SENS), specificity (SPEC), and Matthew's correlation coefficient (MCC), a balanced measure of the quality of binary classification (Powers, 2020).These metrics are defined as were averaged over 50 replicated data sets for each simulated setting described in section 3.1.

Methods Comparison
We compared the results of our model with the methods proposed in Zhang et al. (2019) and Zhang et al. (2020), since these methods have shown superior performance in identifying relative indirect effects when compared to other existing methods.These models take a penalized approach to shrink regression coefficients corresponding to both the direct and indirect effects using a debiased Lasso approach (Zhang and Zhang, 2013).In both models, taxon-specific mediation effects are tested as where the superscript [j] indicates the j th taxon is the 1 st element in η 1 .Note that these approaches require the model to be run J times for inference on each taxon, similar to CMbvs 1 .For the method proposed by Zhang et al. ( 2019), which we denoted as B-H, a joint significance test is used to test the null hypothesis above.Specifically, the p-value for the indirect effect, P joint [j] , is set to max{P φ )) and

Results
Table 1 presents the results of the simulation study across all scenarios.With a correctly specified model (scenario 1), all methods provided excellent performance in terms of sensitivity (SENS > 0.97, with the exception of CMbvs 2 and CMbvs 3 , which obtained SENS = 0.773 and SENS = 0.720, respectively).CMbvs 1 , CT-Lasso, and B-H obtained the best performances overall in scenario 1, with MCC > 0.98.CMbvs 2 obtained the lowest specificity (SPEC = 0.832) among the Bayesian methods, resulting in the worst performance overall in scenario 1 (MCC = 0.357).In scenario 2, all methods were able to maintain high specificity.
However, the proposed methods obtained lower sensitivity in scenario 2 relative to scenario 1.Further investigation revealed that the reduction in sensitivity for the relative mediation effects in the presence of model misspecifiation in the DM portion of the model resulted in poorer selection performance in the linear portion of the model (see Supplementary Table S1).We attribute this downstream reduction in performance to poorer estimation of the relative abundances from the DM portion of the model.In scenario 3, CMbvs 1 demonstrated the best performance in terms of sensitivity, specificity, and MCC.In scenario 4, CMbvs 1 and CT-Lasso performed the best overall (MCC = 0.738 and MCC = 0.791, respectively).
While the other methods maintained relatively high specificity, their overall performance greatly declined in the presence of unmeasured confounding due to a greater reduction in sensitivity.
In Table 2, we report the estimation performance for the direct effect and the overall indirect effect using the proposed model (CMbvs 1 ).In scenario 1 (i.e., no unmeasured confounding), the bias for the direct effect and overall indirect effect estimated by CMbvs 1 was 0.662 and 1.252, respectively, the corresponding MSE was 1.255 and 4.109, respectively, and 94% of replicates recovered the true direct effect, while all replicates recovered the true overall indirect effect in the 95% posterior credible intervals.In scenario 3 (i.e., misspecified LM portion of the model), CMbvs 1 maintained similar performance as in scenario 1.However, the proposed method's estimation performance suffered in scenarios 2 and 4 (i.e., misspecification of the DM portion of the model and unmeasured confounding).These results align with the reduction in selection performance observed in these settings.
To further investigate model performance, we considered alternative data generation settings in scenario 1 (Table 3).With a lower proportion of observations assigned to treatment (i.e., P (T i = 1) = 0.25), the methods demonstrated a reduction in sensitivity.As expected, selection performance for all methods decreased with a smaller sample size and larger number of compositional elements.In the Supplementary Material, we provide additional simulation results, including more details of the selection performance for CMbvs 1 in both levels of the model in various scenarios (Supplementary Tables S1-S3), a comparison of all models under different data generation settings in scenario 4 (Supplementary Table S4), as well as the acceptance probability for α and φ across MCMC iterations (Supplementary Table S5).
Table 4 presents the results of the simulation study with data generated similar to the application study in the 4 scenarios outlined above.In these settings, all methods obtained lower selection performance due to the relatively small sample size, as expected.We observed that CMbvs 1 obtained the best overall selection performance in scenarios 1, 2, and 3 (MCC = 0.625, 0.435, and 0.564, respectively).The two comparison methods performed reasonably well when the LM portion of the model was correctly specified.However, ignoring influential covariates when modeling the multivariate count data resulted in considerably lower sensitivity and MCC for the CT-Lasso and B-H methods compared to the proposed method.
Taken overall, our results indicate CMbvs 1 as the best strategy, as it typically obtained the best overall selection performance due to high specificity across simulations.CMbvs 2 often gave the lowest specificity and sensitivity, resulting in the worst performance overall.
CMbvs 3 , the hybrid approach, has shown to be more robust in small n and larger J settings and in the presence of unmeasured confounding.Since CMbvs 3 does not require J fits, we recommend it in addition to CMbvs 1 in these select settings.

Sensitivity Analysis
To investigate our model's sensitivity to prior specification, we set each of the hyperparameters to the values used in section 3.2 (referred to as the baseline setting) and then evaluated the effect of manipulating each term on the results obtained.Specifically, we investigated sensitivity to the prior probability of inclusion, spike-and-slab variances in the Dirichlet-multinomial model, scale parameters in the linear model, and hyperparameters for the variance of the error terms.For comparison to the baseline setting, we randomly selected a simulated data set from scenario 4 as reference and re-ran our model with all three strategies.Each MCMC algorithm was run for 5000 iterations and thinned to every 10 th iteration, with the first 250 samples as burn-in.Results of the sensitivity analysis are presented in Table 5.We found that with smaller prior probabilities of inclusion (i.e., 1%, ) our proposed model identified fewer active mediators, as expected.We observed that CMbvs 1 was relatively robust to the specification of the spikeand-slab variances in the DM portion of the model, r 2 j , and the scale parameters in the linear model, h c , h β , and h κ .Lastly, we found moderate sensitivity of CMbvs 1 to the hyperparameter specification for the variance of the error terms in the linear model which resulted in the underselection of balances.Compared to CMbvs 1 , CMbvs 2 typically obtained more false positives and CMbvs 3 was more sensitive to the specification of r 2 j .

Application
We applied the proposed joint model to a benchmark data set collected to study the impact of sub-therapeutic antibiotic treatment on gut microbiota and body weight in earlylife mice (n = 57) (Schulfer et al., 2019).DNA and operational taxonomic units (OTUs) were extracted with the 96-well MO BIO PowerSoil DNA Isolation Kit and QIIME2, respectively (Caporaso et al., 2010).OTU counts used in this analysis were extracted on day 26 of the study, and weights of mice measured on day 116.Prior to analysis, we filtered out taxa with >90% zero read counts and used a pseudovalue of 0.5 for zero reads when constructing the balances.Following Zhang et al. (2020), we first analyzed the male mice samples only.The antibiotic treatment group was assigned as exposure (t i = 1 for 23 mice) and compared with the control group (t i = 0 for 13 mice).The weights at sacrifice were treated as the outcome and standardized prior to analysis.
Assumptions of no unmeasured confounding can be broken up based on assumptions 4a and 4b.One (4a) is expected to hold in this application because of the randomized study design; the other (4b) is untestable.In the simulation study, we demonstrate the robustness of the proposed model under circumstances where assumption 4b is violated.In this application, unmeasured common causes of the mediator (microbiome) and outcome (body weight), would lead to violation of this assumption.Examples of such common causes could be genetics, medication use, stress, or injury.The combined strength of the associations between any of these factors and the microbiome and outcome, however, is expected to be weaker compared to the effect of diet, thus potentially limiting the potential for bias (Cohen et al., 2019;Rothschild et al., 2018;Wen and Duffy, 2017).Microbes not considered as part of the analysis could also serve as potential mediator-outcome confounders or even treatmentinduced mediator-outcome confounders, though omitted microbes typically appear in very low frequencies in the study population, which would also limit the potential for bias.
For inference using the proposed Bayesian joint model, the MCMC algorithm was run for 15000 iterations and the chain was thinned to every 10 th iteration, with the first 1000 iterations treated as burn-in.The hyperparameters were set similar to those described in section 3.2.Convergence was determined using trace plots of the regression coefficients.Each run took roughly 9 seconds for an overall computation time of around 5 minutes for CMbvs 1 .
Inclusion in the model was determined using the median model approach (i.e., MPPI > 0.50).The results were additionally compared to the penalized approaches discussed in the simulation study.

Results
Plots of the corresponding MPPIs of φ 1 and β 1 for each of the j = 1, . . ., J taxon in the first position of η 1 , obtained with CMbvs 1 , are shown in Figure 2. A 0.50 threshold on the MPPIs identified Candidatus Arthromitus and Clostridiales as potential mediators.
To demonstrate our approach's flexibility in accommodating and identifying potential confounders, we performed a second analysis, using the full data set, including sex in both levels of the model.We again filtered out taxa that had non-zero reads in less than 10% of the samples, leaving 37 taxa for inference.Similar to the male-only analysis, CMbvs 1 selected Candidatus Arthromitus and Clostridiales as potential mediators.The relative mediation effect for Candidatus Arthromitus was −0.049 (−0.751, 0.000), and that for Clostridiales was 1 , j = 1, . . ., J, terms (2a) and β [j] 1 , j = 1, . . ., J, terms (2b), in the male mice only analysis, obtained from the J runs of the model using the CMbvs 1 strategy.The vertical line at 0.5 represents the inclusion threshold.Blue lines indicate selected terms.and the direct effect of treatment on mice weight was 0.719 (0.460, 0.971).We also identified a significant effect of sex in the outcome model with a posterior mean of 1.265 (1.035, 1.520).CMbvs 2 selected Erysipelotrichaceae, Streptophyta, S24-7, and Clostridiales as potential mediators with estimated relative mediation effects of 2.567 (1.872, 3.241), −0.137 (−0.703, −0.049), 0.425 (0.259, 0.637) and −0.363 (−1.305, −0.214), respectively.
CMbvs 3 also identified S24-7 and Clostridiales as potential mediators with corresponding mediation effects 0.092 (1.194, 2.895) and 0.156 (0.038, 0.271).Applying the comparative models, which adjust for but do not perform selection on potential confounders, to the full data set, we observed that the B-H model identified Candidatus Arthromitus and Clostridiales, together with Streptophyta, Turicibacter and other 11 taxa.Moreover, the CT-Lasso model identified Candidatus Arthromitus in addition to Streptophyta, RF39, Clostridiaceae, Turicibacter, and Streptococcus.With the inclusion of sex in the model, we observed a reduction in the overall indirect effect with the proposed method using CMbvs 1 .In both the male-only and full data set analyses, we identified a "competitive mediation effect" as the direct effect of treatment on mice body weight and the overall mediation effect were in the opposite direction (Zhao et al., 2010).As such the microbiome acts as a suppressing effect, reducing the total effect of treatment on mice weight.

Discussion
In this work, we proposed a formulation of a Bayesian joint model for compositional data that allows for the identification, estimation, and uncertainty quantification of various causal estimands in mediation analysis.The proposed model takes advantage of sparsityinducing priors to facilitate inference in high-dimensional compositional settings.Compared to existing approaches for high-dimensional compositional mediators, the proposed method employs discrete spike-and-slab priors to achieve simultaneous inference regarding the existence of direct effects, relative indirect effects, and overall indirect effects, in addition to potential covariates.Through simulation, we have demonstrated that our method obtains similar selection performance for relative mediation effects compared to existing approaches.
All methods demonstrated a reduction in selection performance in the presence of unmeasured confounding and with misspecification of the linear predictor in the outcome model.
The frequentist methods, CT-Lasso and B-H, were relatively robust to misspecification in the DM portion of the model.We have also applied our method to a benchmark data set investigating the sub-therapeutic antibiotic treatment effect on body weight in early-life mice, in which we observed a negative overall mediation effect and a positive direct effect of treatment.Overall, the proposed method identified fewer relative mediation effects than the alternative approaches, which was expected given the simulation results.As such, our method may favor more sparse models in practice which would result in fewer false positives but potentially more false negatives relative to the competing methods.
Using simulated data, we explored three strategies for posterior inference of relative mediation effects using the proposed method.The first strategy obtained the best selection performance overall but requires refitting the model J times.The second strategy, which only requires fitting the model once, demonstrated the worst selection performance overall.
CMbvs 3 , the hybrid approach, was more robust in small n and larger J settings and in the presence of unmeasured confounding.Based on our investigation, we recommend using the first strategy for moderate to large data sets when more sparsity is desired and additionally using CMbvs 3 , which does not require J fits, in small n and larger J settings and in the presence of potenital unmeasured confounding.
Using the proposed CMbvs 1 approach for inference, researchers may naively cycle through each taxon without using information from the previous fit to inform the selection of the next taxon to investigate.However, in the simulation study we observed that if the j th taxon is associated with the outcome for j = 2, . . ., J, then β j−1 is typically selected, regardless of the inclusion status of other terms in the outcome model.Moreover, if the relative mediation effect exists for a given taxon, the corresponding treatment effect in the DM portion of the model must be active.This information can be used to guide which taxon's relative indirect effect should be explored next, resulting in a dramatic reduction in total computation time in sparse settings.With CMbvs 1 , the estimation of relative indirect effects only depends on β 1 , α 1 , and φ 1 (when there are no covariates in the model).However with CMbsv 2 , relative indirect effects are dependent on a large number of estimated effects (i.e., β k , α k , and φ k for k = 1, . . ., j − 1 for the j th element in ψ).Thus, a major limitation of CMbsv 2 is that there is more opportunity for error to propagate into the estimate for δ j , and estimation performance is highly dependent on the ordering of the taxa.Furthermore, the three strategies were proposed for relative mediation effect selection when the balances are constructed using sequential binary separation.A future extension of this work would be to extend the model space to include the balance structure, in a similar spirit to the work of Huang and Li (2021), who constructed a Bayesian hierarchical model with variable selection to learn the balance structure that mediates the effect of treatment on the outcome.While this would increase computation time, it would provide simultaneous inference on the presence of any relative indirect effect, while fully incorporating model uncertainty.
The proposed method is designed to perform selection on each potential covariate in the model.When the goal of the analysis is to draw inference on the treatment effect, Thomas et al. (2007) and Antonelli and Dominici (2021) suggest forcing treatment into the model (i.e., not performing selection on the treatment effect).Our approach follows this setting.
In simulation results not shown, we found that performing selection on the direct effect of treatment in the linear portion of the model did not affect selection performance for the relative mediation effects.Also, in the formulation of our causal framework we have assumed a randomized treatment.Our method, however, could be extended to observational studies where treatment-outcome and treatment-mediator relationships share (measured) common causes, which can also be accounted for in the joint model.For example, in an observational study of human diet (treatment), microbiome (mediator), and obesity (outcome), we would assume that all common causes of diet and obesity are adjusted for in the model.Under this scenario, both assumptions on unmeasured confounding (4a and 4b) would be untestable.
Similar sensitivity analysis for the presence of unmeasured exposure-outcome confounding can be conducted as was the case for unmeasured mediator-outcome confounding in the simulations of the current study.Lastly, the current formulation assumes that no interaction exists between treatment and mediator, though this is not a necessary assumption for quantification of direct and indirect effects.The proposed approach could be extended to accommodate such interactions.with n = 200 and J = 50 averaged over 50 simulations.Bias -the difference between the posterior mean effect and the true effect.MSE -the mean squared error between the posterior mean effect and the true effect.Coverage -the proportion of replications in which the 95% posterior credible interval of the effect included the true value.
T N − F P × F N (T P + F P )(T P + F N )(T N + F P )(T N + F N ) , where TP, TN, FP, FN represent the true positives, true negatives, false positives and false negatives based on the selection and exclusion of relative indirect effects.To evaluate and compare the estimation performance for the population-level (since there are no covariates in the model) direct and overall indirect effects of the models, we calculate the average bias, mean squared error, and coverage probabilities of the equal-tail credible intervals.Results

1)
), with abs(•) denoting the absolute value and Φ(•) representing the cumulative density function of a normal distribution.For the second comparative model (CT-Lasso), Zhang et al. (2020) addressed potential multiple testing issues associated with the B-H approach by proposing a closed testing-based selection procedure to calculate the p-value for each mediator (see Algorithm 1 in Zhang et al. (2020) for more details).

Figure 2 :
Figure 2: Benchmark Study: Results from the DM portion (2a) and the outcome portion (2b) of the joint model.Marginal posterior probabilities of inclusion (MPPIs) for the corresponding φ [j]

Table 2 :
Simulation results for estimating the direct and overall indirect effects using CMbvs 1

Table 3 :
Simulation results for the proposed method, with three strategies for determining the relative mediation effects, and the comparison methods in scenario 1 with various data structures.Results are averaged across 50 replicate data sets.SENS -sensitivity; SPECspecificity; MCC -Matthew's correlation coefficient.

Table 4 :
Simulation results for the proposed method on data simulated similar in structure to the application data using the three strategies for determining the relative mediation effects and the comparison methods under four scenarios: (1) correctly specified model, (2) misspecification in the DM portion of the model, (3) misspecification in the linear portion of the model, and (4) unmeasured confounding.Results are averaged across 50 replicate data sets.SENS -sensitivity; SPEC -specificity; MCC -Matthew's correlation coefficient.