Exploring causal mechanisms and quantifying direct and indirect effects using a joint modeling approach for recurrent and terminal events

Recurrent events are commonly encountered in biomedical studies. In many situations, there exist terminal events, such as death, which are potentially related to the recurrent events. Joint models of recurrent and terminal events have been proposed to address the correlation between recurrent events and terminal events. However, there is a dearth of suitable methods to rigorously investigate the causal mechanisms between specific exposures, recurrent events, and terminal events. For example, it is of interest to know how much of the total effect of the primary exposure of interest on the terminal event is through the recurrent events, and whether preventing recurrent event occurrences could lead to better overall survival. In this work, we propose a formal causal mediation analysis method to compute the natural direct and indirect effects. A novel joint modeling approach is used to take the recurrent event process as the mediator and the survival endpoint as the outcome. This new joint modeling approach allows us to relax the commonly used “sequential ignorability” assumption. Simulation studies show that our new model has good finite sample performance in estimating both model parameters and mediation effects. We apply our method to an AIDS study to evaluate how much of the comparative effectiveness of the two treatments and the effect of CD4 counts on the overall survival are mediated by recurrent opportunistic infections.


INTRODUCTION
the causal pathway between exposure to time-to-event outcomes, such as death, failure, relapse, or the emergence of a new illness.
As a motivating example, OIs-a form of recurrent events-frequently occur in patients with HIV and can substantially impact their health and survival, particularly for those who have been diagnosed with Acquired Immunodeficiency Syndrome (AIDS). In the Community Programs for Clinical Research on AIDS (CPCRA) study, HIV-infected patients were enrolled and randomly assigned to receive either didanosine (ddI) or zalcitabine (ddC) and followed up for up to 21 months. The study monitored their survival, CD4 counts, and the development of opportunistic diseases. 10,11 Previous research by Liu 12 proposed a joint frailty model of recurrent and terminal events and found a strong positive correlation between recurrent OIs and death, emphasizing the significance of OIs as a risk factor. Furthermore, the occurrence of OIs is also closely associated with the patient's CD4+ T lymphocyte (CD4) count. 13,14 In the current study, we aim to develop causal mediation analysis methods using joint models to evaluate how much of the comparative effectiveness between the two treatments and the effect of baseline CD4 counts on the overall survival are mediated by the occurrences of opportunistic diseases. This will help us assess how different treatments and patients' immune systems impact the HIV disease progression and patients' health outcomes.
Handling recurrent and terminal events simultaneously is challenging since these events are usually not independent. For example, the relapse of leukemia potentially increases the risk of death, while the recurrent events will not happen after death. One approach to this issue is the use of marginal models. Li and Lagakos 5 proposed a model to either treat the terminating event as a censoring event for each recurrent event or treat the minimum of the first recurrent event time and survival time as the failure time. Ghosh and Lin 7 developed semiparametric methods using IPCW (Inverse Probability of Censoring Weighting) and IPSW (Inverse Probability of Survival Weighting). The drawback of these marginal models is that they do not specify the dependence between recurrent and terminal events, while the joint frailty model can address this issue. Nielsen et al 15 introduced a general frailty model specifically designed for recurrent and terminal event data. However, their model assumes that the frailty effects on recurrent and terminal event rates are the same. Liu et al 8 proposed a joint semiparametric model for the hazard functions of both recurrent events and death, incorporating shared gamma frailty that can have varying effects on the two hazards. In the current study, we have utilized a novel joint frailty model of recurrent and survival events by further allowing the direct effect of recurrent events on the hazard of the terminal event. We can thus quantify the natural direct effect (NDE) and the natural indirect effect (NIE) separately while controlling certain unmeasured confounding.
In the causal mediation framework, NDE and NIE are two important quantities, and the total effect (TE) can be decomposed into the sum of these two effects. Thus the ratio of NIE and TE indicates the proportion of effect that goes through the mediator. Various causal mediation analyses with longitudinally measured mediators and a time-to-event endpoint have been developed recently. Zheng and van der Laan 16 proposed a flexible nonparametric estimation of the NDE and NIE for the longitudinal exposure/mediator with the survival outcome under the Sequential Ignorability (SI) assumption. Vansteelandt et al 17 discussed how to handle the time-varying confounding to estimate path-specific effects when there is a longitudinally measured mediator and a survival outcome. The interventional direct effect (IDE) and the interventional indirect effect (IIE) are used to extend the definition of the NDE and NIE. Lin et al 18 proposed a g-formula-based method for estimating the IDE and IIE under the SI assumption. Didelez 19 discussed the problem of defining the causal mediation effect when the longitudinal biomarker is only well defined up to the event time and provided a solution by introducing separable effects. For the time-to-event mediator, several methods for mediation analysis have been developed recently under a semi-competing risk setting. [20][21][22] However, there is still limited research on the causality in the joint analysis of recurrent and terminal event data.
In this paper, we first provide estimators for NDE and NIE when the recurrent event mediator and the terminal event outcome are jointly modeled under the SI assumption. Additionally, We extend previous shared frailty models by relaxing part (ii) of the SI assumption to a weaker assumption that allows baseline confounding in the form of shared random effects between the mediator and the outcome. With this innovation, we provide estimators for NDE and NIE under a weaker assumption. The contributions we made in the current study also include incorporating the number of recurrent events in the survival model and using shared random effects to address unmeasured confounding.
The rest of the paper is organized as follows: In Section 2, we define notations and present the formula for calculating the NDE and NIE followed by a discussion of the numerical computation approach. In Section 3, we present simulation studies to explore the finite sample performance of our proposed method, as well as its sensitivity to model misspecification. In Section 4, we apply our method to the CPCRA study and examine the causal effect of recurrent OIs on the terminal event (death), the extent to which the comparative effectiveness between the two different treatments (ddI/ddC) is mediated by the occurrence of OIs, and how much the effect of baseline CD4 counts on the overall survival is mediated by changing recurrence patterns of OIs. In Section 5, we summarize our findings and discuss potential extensions.

METHOD
We consider a potential outcome framework. The Stable Unit Treatment Value Assumption (SUTVA) is an important assumption in causal inference. It requires that the potential outcome for each patient under each treatment level can only take a single value, regardless of the treatment assignment of other patients. In other words, the treatment assignment for one patient does not affect the potential outcome for another patient. We denote the "treatment" (or primary exposure of interest) of individual i as Z i . Under the SUTVA, we define the potential counting process of recurrent events (as mediators) for individual i at time t as M z i (t) if the individual were given treatment z, where t ∈ [0, ] and is the largest follow-up time. Let n z i denote the number of potential recurrent events happening between [0, ]. If the potential recurrent event times for individual We define T zm i as the potential time to the terminal event with treatment z and recurrent event mediator counting process m = {m(u), u ∈ [0, ]}, and C zm i as the corresponding potential censoring time. In our study, the survival status could affect the recurrent events because the OIs cannot happen for those who have already died. Comparable to the context mentioned in Didelez, this may result in the counterfactual T z ′ M z i i being poorly defined for z ≠ z ′ . Specifically, if the treatment has a direct effect on T zm i such that the survival under z is shorter than that under z ′ , then the process M z i will end earlier and thus become "incomplete." To handle this, we define and model the recurrent event process as the potential mediator process under a certain treatment if the individual is alive, similar to the separable effect definition in Martinussen and Stensrud 23 (ie, we hypothetically think that we can have two separable treatments, the first only affects the terminal event and the second only affects recurrent events; as a result, T z ′ M z i i now represents the event time for which the first treatment is set at z ′ , while the second treatment is set at z for corresponding recurrent event process and it is well defined.). Additionally, we adopt the consistency assumption, which links the potential outcomes to the observed outcomes. Therefore, we have a potentially observable recurrent mediator counting process The consistency assumption requires no measurement error in the mediator process. This assumption is reasonable when the recurrent events are recorded timely and accurately. Due to censoring, we observe the follow-up time T * i = T i ∧ C i and the terminal event indicator Δ i = I(T i ≤ C i ). We denote the observed baseline covariates for individual i as X i ∈ R p . With the survival outcome, TE of Z on T, as well as NDE and NIE can be defined on different functional forms, for example, the survival function, the cumulative hazard, or the hazard function. We discuss the estimation of the effects on survival probability as defined in Equations (1) and (2) under different assumptions: For NDE and NIE among subgroups with covariates X = x, they can be defined as Although the definitions above work for both continuous and discrete survival time T zm i , for the joint model we proposed in the next sections, we assume T zm i to be a continuous variable.

General models and determination of causal mechanism
There are mainly three types of causal mechanisms between the recurrent events and the terminal event after adjusting for all the measured variables: (1) the recurrent events are a "cause" for the outcome without unmeasured confounding; (2) the association between the recurrent events and the terminal event is purely due to unmeasured confounding; or (3) the recurrent events are a "cause" for the terminal event, but the association is partly confounded by unmeasured variables ( Figure 1). Previous studies have differentiated these three mechanisms and examined the relationship between multiple longitudinally measured continuous biomarkers and the time-to-event outcome. 24,25 Zheng et al 26 explored how to quantify the mediation effects when the first or the third mechanism holds. However, further investigation is needed to determine how to handle the intermediate variable of recurrent events. We assume the following joint models (Model I) for the causal mechanism as shown in the bottom panel of Figure 1: is a shared random effect (frailty) that is independent of X, z. Here r z i (t) is the intensity function for the recurrent event process M z i (t) and zm i (t) is the hazard function for the potential time to terminal event T zm i . Different causal mechanisms can be tested using this general framework. Specifically, we could answer whether the association between M(t) and T is (i) purely confounded by the shared random effect ( m = 0); (ii) causal without confounding ( = 0); or (iii) partially causal and confounded by the shared random effect ( m ≠ 0 and ≠ 0).
If we want to further allow the interaction between the random effect and the mediator, we can replace the survival model (6) by the following form (Model II): Different causal mechanisms can be tested as below: (i) purely confounded by the shared random effect ( m = 2 = 0); (ii) causal without confounding ( 1 = 2 = 0); (iii) partially causal and confounded by the shared random effect (( m , 2 ) ≠ (0, 0) and ( 1 , 2 ) ≠ (0, 0)). When the association between M(t) and T is causal with or without confounding, we might be interested to quantify the direct and indirect effect of Z on T mediated by M(t). For a subgroup with specific covariate x, the NDE and NIE under the survival difference as defined in Equations (3) and (4) can be estimated bŷ The specific formula ofŜ(t, z, z ′ , x) depends on certain model specifications and will be discussed in the next two subsections as well as in Appendices A and B. Then we can further integrate over the distribution of X to obtain the general population level NDE and NIE as defined in Equations (1) and (2), which can be estimated bŷ

Sequential ignorability
To identify the NDE and NIE, we usually need to assume SI, which indicates that (i) conditioning on observed covariates, there is no confounding between the exposure and the mediator process; (ii) conditioning on observed covariates and the exposure, there is no confounding between the mediator process and the outcome of interest.
Under this assumption, we can represent the potential outcome distribution S(t, z, z ′ ; x) using the distribution of the observable outcome: is the conditional survival function of the failure time given the exposure, the mediator, and the covariates; dF M|Z,X (m|z, x) = F (dm; z, x) with F (⋅; z, x) being the conditional probability measure for the infinite dimension mediator process M given the exposure and the covariates, which can be viewed as an analog to the probability density function but for the infinite-dimensional random variables M.
Specifically, consider the model without the shared random effect as below: Then we can write where J(m) is the number of jumping points within [0, ]; R 1 (m), … , R J(m) (m) are corresponding jumping time points; F is the cumulative density function of from normal distribution with mean 0 and variance 2 . Both S(t|Z = z ′ , M = m, X = x) and dF M|Z,X (m|z, x) are no longer dependent on potential outcomes and thus are estimable from the data when there is no censoring. When there is censoring, the assumption on the censoring mechanism such as an independent censoring assumption will allow us to obtain a consistent estimation for these quantities, denoted asŜ(t|Z = z ′ , M = m, X = x) and dF M|Z,X (m|z, x) which will lead to the plugin estimator

Confounding via shared random effect
The first part of the SI assumption requires that Z is ignorable given the measured covariates X. This assumption is automatically guaranteed when the exposure of interest is the treatment in a randomized trial. In our example, we have two exposure variables of interest, the treatment, which is automatically guaranteed since the study is a randomized trial, and the baseline CD4 count, which requires us to make the assumption of ignorability of the Z condition on observed covariates X (ie, no unmeasured confounding between exposure and outcome and no unmeasured confounding between exposure and mediator in Equation 8). We believe this is a reasonable assumption given that important covariates, such as hemoglobin (HgB), are included in our analysis. However, the second part of the SI assumption, the ignorability of the M(t) conditioning on observed covariates and the exposure in Equation (9), cannot be guaranteed even in a randomized trial. Therefore, usually assumed SI assumption could be too strong and hard to satisfy.
Here we propose to relax the second part of the traditional SI assumption to a weaker version where we only require the conditional independence to hold after conditioning on all shared random effects. This is a weaker assumption than the usual SI assumption, as it allows for time-independent unmeasured confounding that can be quantified by the shared random effects. Specifically, we replace Equation (9) with the following assumption mathematically: The main idea of this relaxation is to utilize the shared random effect to represent the unmeasured time-independent confounders. This approach has been successful in modeling unmeasured confounders using latent factors in linear models with multiple exposures 27,28 and using shared random effect in survival models with longitudinal measurements. 26 The underlying assumption implied by Equation (11) is that the unmeasured confounders must affect more than a single occurrence of the recurrent event process. If a confounder influences only a single occurrence of the event, it cannot be considered a confounding variable across multiple recurrent events and, therefore, cannot be represented by . This will lead to uncontrolled confounding between recurrent events and the terminal event even after we condition on . Therefore, in real data applications, it is important to keep in mind that potential unmeasured confounding with only instant effects is not taken care of with our proposed method.
Under this relaxed assumption, we can write where S(t|Z = z ′ , M = m, X = x, ) is the conditional survival function of failure time given the exposure, the mediator, the covariates, and the shared random effect; dF M|Z,X, (m|z, x, ) = F (dm; z, x, ) with F (⋅; z, x, ) being the conditional probability measure for the infinite dimension mediator process M given the exposure, the covariates and the shared random effect ; F is the cumulative density function of the random effect distribution. All these terms are no longer dependent on potential outcomes and thus are estimable from the data.
Specifically, under our model (6) without interaction and assuming ∼ N(0, 2 ), we have Under the independent censoring assumption and piecewise constant approximation of baseline hazards as detailed in Appendix A in Data S1, we can obtain a consistent estimation for these quantities, denoted asŜ(t|Z = z ′ , M = m, X = x, ), dF M|Z,X, (m|z, x, ), anddF ( ) which will lead to the plugin estimator

Numerical estimate for NDE and NIE
To estimate the NDE and NIE, we first approximated 0 (t) by piece-wise constant functions and used maximum likelihood estimation to estimate the parameters x , z , x , z , m , 1 , 2 and . The details of joint likelihood and assumptions, including the independent censoring assumption T zm ⟂ ⟂ C zm |X have been specified in Section A1. The joint likelihood can be maximized conveniently by the Gaussian quadrature tools which are built in standard statistical packages, such as Proc NLMIXED in SAS. 12 Since m has infinite dimensions, there is no closed form for the defined integration for the NDEs and NIEs. We can use a Monte-Carlo method to numerically compute this integration to overcome the computational difficulty. We first sample ∼ N(0,̂2). Then we sample the recurrence event process M 1 (s) and M 0 (s), s ∈ [0, t] using the inverse approach. 29 Finally, we can either compute S(t|0, M 0 , X, ), S(t|1, M 0 , X, ), S(t|0, M 1 , X, ), S(t|1, M 1 , X, ) or directly sample the potential survival times. If we are interested in multiple time points t, sampling potential survival time will be better. Otherwise, computing S via numerical integration can be more efficient. Then we can compute NDE and NIE based on the simulated data. We repeat this procedure multiple times and use the average calculated NDE and NIE as the final estimates. The confidence interval can be computed via Bootstrap. Also, the Monte Carlo error can be evaluated by varying the number of nodes used for the integration as well as the number of time points K. When the computational resource is limited, we can sample parameters from the asymptotic variance-covariance matrix of the estimated parameters, and compute the corresponding NIE and NDE distributions. This approach is used in our simulation studies. Details can be found in Appendix B in Data S1.

SIMULATION
The finite sample performance of the parameter estimation for joint modeling of recurrent and terminal events has been studied previously. 8,30,31 In this paper, we are interested in NDE and NIE which are functions of parameters from the modified joint models. In this section, we perform simulation studies to evaluate the performance of our proposed estimator for NDE and NIE. We also evaluate the robustness of our proposed estimators to the misspecification of the model. The details on the simulation setting can be found in Appendix C in Data S1. Table 1 shows the simulation results for NDE and NIE. When the random effect follows normal distribution N(0, 0.49) and the joint model is correctly specified (settings I and II), the fitting for both NIE and NDE perform well with small bias and approximate correct coverage rate. A slightly lower coverage is observed for NIE when there is interaction, which might reflect the insufficient sample size for the asymptotic normality to work well. When the random effect is misspecified (settings III) or interaction is missed in the specified model (setting IV), our estimator is robust to the misspecification of the random effect distribution for early time points but is sensitive to the misspecification for later time points, especially for the NIE estimator. The results also showed that our proposed estimator is sensitive to the misspecification of the fixed effect form. This suggests the necessity of evaluating the goodness of fit in real data analysis to determine the correct fixed effects model and avoid using an oversimplified model when sample size allows.

CPCRA study
We applied our method to the CPCRA study, in which 467 patients who previously received zidovudine and had 300 or fewer CD4 cells per cubic millimeter were evenly randomized into two groups to receive either didanosine (ddI, n = 230) or zalcitabine (ddC, n = 237). The primary outcome of interest was the overall survival, with 100 and 88 deaths occurring in the ddI and ddC groups, respectively, during the follow-up period (median 13 months; range 1-21 months). Additionally, 235 patients developed at least one OI disease as defined in table 1 in Neaton et al 11 or table 3  To apply the models described previously, we first verify whether all required assumptions hold. In this study, we are interested in two exposure variables, treatment and baseline CD4 count. The first part of the SUTVA assumption is that the potential outcome for each patient under each treatment level can only take a single value. In our example, this assumption is reasonable because the treatment and baseline CD4 count are clearly defined, and no subcategories are involved. The second part of the SUTVA assumption is that there is no interference between individuals, which is also likely to hold here because one individual's treatment assignment or baseline CD4 count are unlikely to affect the outcome of another individual, and the outcomes from different subjects are independent of each other. The consistency assumption requires no measurement error in the mediator process, which is reasonable because the recurrent events OIs were recorded accurately and in a timely manner. For part I of the SI assumption, it is automatically guaranteed for treatment as this study is a randomized trial. We also believe this assumption is reasonable for baseline CD4 count given the inclusion of important covariates such as hemoglobin (HgB) in our analysis. For the second part of the SI assumption, we allow the relaxation of this assumption. Thus, by representing the unmeasured time-independent confounders using TA B L E 1 Bias, empirical standard deviation (SD), median estimated standard error (MeSE), and coverage rate for 95% nominal confidence interval (CR) from four simulation settings. the shared random effect, we could assume the ignorability of the mediator OIs conditioning on observed covariates, exposure, and quantified unmeasured confounding.
In this section, we investigate two variables as exposure separately. In Section 4.2, the treatment (ddI/ddC) is our primary exposure of interest, while the other five variables are included in the models as covariates. In Section 4.3, we examine the baseline CD4 count as our primary exposure of interest, while treatment is included as one of the covariates.

4.2
Comparative effectiveness between the two treatment arms Didanosine (ddI) and Zalcitabine (ddC) are both reverse transcriptase inhibitors that are approved as the second and third drugs for AIDS by FDA. Previous studies have suggested that ddI can delay disease progression and death in patients, while ddC in combination with AZT has also shown benefits. 32 We are interested in exploring and comparing the effect of ddI and ddC on patients' survival, and learning how much the comparative effectiveness of different treatments is mediated by OIs. The estimates for the parameters involved in the recurrent event model (5) and the survival model (6) (Model I) or (7) (Model II) are shown in Table 2. We found that after adjusting for prevOI, gender, stratum, baseline HgB, and baseline CD4 count, ddC had a similar effect on the recurrent events as ddI (P = .93), but statistically significantly extended patients' survival compared to ddI (P = .03) through direct effect pathway. These findings suggested that patients receiving ddC treatment had a higher probability of survival during the follow-up period than those receiving ddI treatment. PrevOI had a statistically significant effect on patients' survival (P = .004) and a marginally significant effect on recurrent events (P = .06). Moreover, baseline HgB levels were significantly associated with the recurrence of OIs and patients' survival. A lower baseline level of hemoglobin was associated with a higher frequency of OIs (P = .002), as well as higher mortality (P < .001). Female patients had a lower chance of getting OIs (P = .02), but had similar survival time compared to male patients (P = .53). Stratum, an indicator of AZT intolerance, did not have significant effects on either the recurrence of OIs or patients' survival.
In the current study, we modeled the baseline hazard as a piece-wise constant function with 10 intervals, separated by every 10th percentile of the terminal event time or recurrent event time. Interestingly, the estimate of the hazard rate of the terminal event increased over time, while there was no obvious trend with respect to the occurrence of OIs over time. For the survival model, the estimated HR associated with the number of OIs (mediator, M(t)) was 0.27 (P = .03), indicating a positive association between the number of OIs and mortality. The fitting results showed that Model I was preferred over Model II since there was no evidence that the interaction exist (P = .42). Also, the significant effect of the shared latent effect ( ) in the survival model and the significance of the variance component of the latent effect suggested the potential violation of the SI assumption (9) (ie, the mechanism in Figure 1(3) holds). Thus, the effect of the occurrences of OIs on the survival outcome was confounded by the shared latent variables. Since there was no evidence that the interactions exist, we used Model I to estimate the NIE, NDE, and TE hereafter. The estimated NIE, NDE, and TE (solid line) of Treatment (ddC vs ddI) with the bootstrapped 95% point-wise confidence intervals (dashed lines) from Model I are presented in Figure 2. A Cox regression model was also performed to fit the survival data, and the TE of the treatments on the terminal event estimated from the Cox regression models (green line) was plotted in Figure 2. The TE calculated from the fitting of the Cox model was similar to the TE computed from the sum of NDE and NIE, indicating good consistency between our method and the Cox model. Based on our findings, we observed that the NDE was significantly different from 0 over the follow-up period, indicating that treatment (ddC vs ddI) had a significant direct effect on the survival outcome. However, we did not find a significant indirect effect through the number of OIs during follow-up since the NIE was not significant. Our results suggest that the treatment has no effect on the recurrent events, which explains the lack of an indirect effect.

Effect of baseline CD4 count
The baseline CD4 count indicates the robustness of patients' immune systems. We aim to study its direct and indirect effects (through OIs) on patients' survival. The estimates for the parameters involved in the recurrent event model (5) and the survival model (6) or (7) are shown in Table 2. It can be seen that the baseline CD4 had a statistically significant effect on patients' survival (P < .001) and OIs (P < .001). These findings suggested that patients with a lower level of baseline CD4 count had a higher probability of mortality and a higher frequency of OIs during the follow-up period. To further study the NIE, NDE, and TE of the baseline CD4 count on the overall survival probability, we compared the third quartile (Q3: 109 cells∕mm 3 ) with the first quartile (Q1: 11 cells∕mm 3 ) of the baseline CD4 count. The estimation (solid line) of NIE, NDE, and TE of the baseline CD4 count (Q3 vs Q1) with bootstrapped 95% point-wise confidence intervals (dashed lines) on the overall survival probability using Model I are shown in Figure 3. From the plots, we found that both NIE and NDE were significantly different from 0 over the follow-up period. Therefore there was evidence that the number of OIs mediated the effect of baseline CD4 on the survival outcome. However, NDE had a relatively higher magnitude compared to NIE, indicating that the direct effect of CD4 still played a major role in patients' survival. A Cox regression model was also performed to estimate the TE of the baseline CD4 (Q3 vs Q1) on the overall survival probability, which TA B L E 2 Fitted regression parameters from the data analysis of CPCRA study with (Model 2) and without (Model 1) interaction between shared random effect and the mediator. is plotted (green line) in Figure 3. The TE estimated from the fitting of the Cox model was close to that from the sum of NDE and NIE, indicating a good consistency between our model and the Cox model. This study suggested that there was evidence to support the number of OIs as a valid mediator for the effect of baseline CD4 on patients' survival.

DISCUSSION
In this work, we illustrated how to quantify the causal relationship between recurrent and terminal events through joint modeling in the potential outcome framework, either under the assumption of SI or under a relaxed assumption allowing the confounding between the mediator and the latent shared random effect. This method provides important information for practitioners, as it allows for the evaluation of whether recurrent infections (or other recurrent events such as relapses) mediate the causal effect of an exposure on the outcome (terminal event). If so, treating recurrent infections may help improve the survival probability.
In real applications, the performance of NDE and NIE estimation relies on the selection of a suitable joint model. Hence, we suggest the use of model selection techniques, such as AIC, BIC, or tests for higher-order terms when implementing our proposed method.
Our method for computing NDE and NIE allows the relaxation of the SI assumption by including confounding modeled by the latent shared random effects. This relaxation is clinically important in practical applications. For clinical trials, unlike treatment assignment which is independent of confounders, it is unrealistic to also randomize the recurrent events to satisfy Equation (9). Our method thus provides a feasible solution to tackle this issue.
Although the joint model framework presented in this paper has been established previously, adding the number of recurrent events in the survival model and utilizing the shared random effect to handle unmeasured confounders are novel contributions. When either the recurrent event model or the survival model does not follow the proportional hazards model, our proposed framework of using the shared random effect to model baseline unmeasured confounding and the general integration formula for NIE and NDE (Equations 1 and 2) can still be applied in principle. However, the specific formula for the survival function, the distribution of random effect , and the distribution of F M|Z, ,X need to be modified accordingly.
In this paper, we have performed mediation analysis for ddI/ddC treatment and baseline CD4 as primary exposure variables. Although the treatment was randomized and thus satisfied the ignorable assumption, it is questionable to treat baseline CD4 as ignorable (ie, no unmeasured confounding between baseline CD4 and either the OIs or the death). Additional factor analysis similar to Wang et al 27 and McKennan et al 33 could be considered. For example, we could further identify latent unmeasured confounding variables between baseline CD4 and recurrent events or the terminal event so that the first part of the SI assumption can be relaxed. Another limitation is that we did not differentiate types of recurrent events, although their impacts and contributions on survival outcomes may vary. For example, there are around 20 common OIs caused by different pathogens in HIV-infected individuals. Patients' susceptibility to different OIs and mortality rate after infections can be influenced by many factors including pathogens types, patients' immune system, or any preexisting infections. 34 Consequently, different types of OIs have very different effect scales on survival or different response to exposure (treatment, CD4 counts). Combining them together as a single type might lead to model misspecification and effect attenuation. Multivariate models for different types of recurrent events with shared frailty need to be developed in the future to overcome this limit.