Adjusting for time‐varying confounders in survival analysis using structural nested cumulative survival time models

Abstract Accounting for time‐varying confounding when assessing the causal effects of time‐varying exposures on survival time is challenging. Standard survival methods that incorporate time‐varying confounders as covariates generally yield biased effect estimates. Estimators using weighting by inverse probability of exposure can be unstable when confounders are highly predictive of exposure or the exposure is continuous. Structural nested accelerated failure time models (AFTMs) require artificial recensoring, which can cause estimation difficulties. Here, we introduce the structural nested cumulative survival time model (SNCSTM). This model assumes that intervening to set exposure at time t to zero has an additive effect on the subsequent conditional hazard given exposure and confounder histories when all subsequent exposures have already been set to zero. We show how to fit it using standard software for generalized linear models and describe two more efficient, double robust, closed‐form estimators. All three estimators avoid the artificial recensoring of AFTMs and the instability of estimators that use weighting by the inverse probability of exposure. We examine the performance of our estimators using a simulation study and illustrate their use on data from the UK Cystic Fibrosis Registry. The SNCSTM is compared with a recently proposed structural nested cumulative failure time model, and several advantages of the former are identified.

earlier exposures (eg, pneumonia infection). This induces feedback relationships between exposures and confounders over time that cannot be untangled via traditional survival analysis regression methods that adjust for time-varying covariates, such as a history of exposure and confounders, at each timepoint (Robins et al., 2000). The reason for this is twofold. First, such adjustment procedures eliminate the indirect effects of early exposures on survival that are mediated through those confounders. For example, it would be undesirable to eliminate the effects of hospital-acquired pneumonia on survival that are mediated through disease severity, as scientific interest is primarily in the overall effect of infection. Second, such adjustment procedures are prone to collider-stratification biases that can render exposure and outcome dependent even in the absence of an exposure effect. See Daniel et al. (2013) for a review of these difficulties.
Time-varying confounding has received much attention in the causal inference literature. For survival time outcomes, the two main approaches are based on structural nested accelerated failure time models (AFTMs) (Robins and Tsiatis, 1991;Robins and Greenland, 1994) and marginal structural models (MSMs) (Robins et al., 2000). The latter approach is more popular, because of its greater simplicity and flexibility. In particular, accounting for noninformative censoring in MSMs does not, unlike in structural nested AFTMs, require an "artificial recensoring" procedure in which originally uncensored subjects may become censored. Avoiding this recensoring is advantageous, because recensoring causes information loss, which can result in poor estimators and difficulties solving the estimating equations (Joffe et al., 2012). However, fitting MSMs relies on inverse weighting by the probability of exposure, which has it own drawback: estimators prone to large finite-sample bias and variance when confounders are highly predictive of the exposure, or when the exposure is continuous or discrete with many levels.
More recently, Young et al. (2010) and Picciotto et al. (2012) proposed a new class of discrete-time structural nested cumulative failure time models, which parameterize the effect of the exposure at each time t on the outcome at each later time in terms of the ratio of two (possibly) counterfactual cumulative failure risks at that later time under exposure regimes that differ only at time t. Their procedure has the desirable properties of structural nested AFTMs-viz. by avoiding inverse probability weighting, it handles continuous exposures without estimators being subject to large bias and variance, and it allows modeling of effect modification by time-varying covariates-while avoiding the need for artificial recensoring.
Here, we use developments by Martinussen et al. (2011) and Dukes et al. (2019) (hereafter DMTV). The former showed how to adjust for time-varying confounding when the effects of exposure and confounders are parameterized on the additive hazard scale. They focused on the simple setting where interest is in estimating the direct effect of a binary baseline exposure on a survival outcome, that is, the effect not mediated by a given intermediate variable, and where there are no baseline confounders. DMTV proposed an additive hazards model for the effect of a baseline exposure on survival time conditional on baseline confounders and derived the efficient score when (as assumed by Martinussen et al. 2011) the confounders act additively on the hazard; this additivity assumption is not needed for consistency of their estimators.
Here, we propose a novel class of semiparametric structural nested cumulative survival time models (SNCSTMs), of which the models of Martinussen et al. (2011) and DMTV are special cases, and propose three estimators of its parameters. Our model allows baseline and time-varying confounders, binary or continuous exposure, any number of exposure measurement times and the option of constraining exposure effects to be common at different times; it does not parameterize the effects of confounders on the baseline hazard. It also allows investigation of exposure effect modification by time-varying factors. The SNCSTM is closely related to Picciotto et al.'s model, and our estimators share the forementioned desirable properties of the latter. The SNCSTM generalizes Picciotto et al.'s model to continuous time and parameterizes relative survival risks instead of failure risks. Our approach has several advantages over that of Picciotto et al. One of our estimators (method 1) can be calculated using GLM software. Our other two estimators (methods 2 and 3) are more efficient, double robust and available in closed form. All three estimators automatically handle random censoring. Also, because of being parameterized in continuous time, SNCSTMs can handle irregular measurement times and allow the interpretation of parameters in terms of hazards.
We define notation and state fundamental assumptions in Section 2. A simple version of our SNCSTM is introduced in Section 3. In Section 4, we propose three methods for estimating its parameters. The general SNCSTM is described in Section 5. In Section 6, we discuss random censoring. A simulation study is described in Section 7. Section 8 describes an analysis of data from the UK Cystic Fibrosis (CF) Registry, looking at the effect of treatment with DNase on survival in people with CF. We conclude with a discussion in Section 9.
2 | NOTATION AND ASSUMPTIONS Consider a study in which, for each of n subjects, a timevarying exposure and vector of possibly time-varying confounders are measured at time zero and at up to K follow-up visits. Until Section 5 we assume the follow-up times are regular, that is, the same for all individuals, and (for notational simplicity) are K 1, 2, …, , and that all individuals are administratively censored at time K + 1. Until Section 6 we assume there is no censoring apart from this administrative censoring. If visits are regular but not at times K 1, …, , or if administrative censoring occurs at a time different from K + 1 or not at all, this can easily be accommodated by rescaling the time variable within each interval between consecutive visits.
Let T i denote individual i's failure time, and A ki and L ki denote, respectively, his exposure and vector of be the at-risk indicator. If individual i fails before his kth visit, A ki and L ki are defined as zero. Let x y means the minimum of x and y.
Define T A (¯, 0) i ki as individual i's (possibly) counterfactual failure time that would have applied if his exposures up to the kth visit had been as observed and his exposures from the k ( + 1)th visit onward had been set to zero by an intervention. We make the consistency assumption that We make the following sequential no unmeasured confounders assumption (NUC): (Robins, 1986). That is, among individuals who are still alive (or event-free) at time k, the assigned exposure A k at time k may depend on Lk and Ā k−1 , but given these, has no residual dependence on the remaining lifetime that would apply if all future exposures were set to zero.

CUMULATIVE SURVIVAL TIME MODEL
We first introduce a simple version of the SNCSTM that does not allow for exposure effect modification. The more general SNCSTM is described in Section 5.
For each k K = 0, …, , let k  be the model defined by the restriction . Equation (1) means that among the survivors in the population at the kth visit time, in the stratum defined by any given A L (¯,¯) k k the proportion who survive to a later time t when exposures from visit k + 1 onward (ie, ) have already been set to zero would be multiplied by k is the (controlled) direct effect of A k on the probability of survival to time t given survival to visit k, that is, the effect of A k not mediated through the later exposures A A , …, In Section 5 we extend the SNCSTM to allow the effect to depend on the history.
By taking logs of each side of Equation (1) and differentiating with respect to t, it can be shown that Model k  can also be written as (2) can be interpreted as follows. In a stratum defined by A L (¯,¯) k k and ⩾ T k, the hazard of failure at any time between visits l and l + 1 have already been set equal to zero would be reduced by A ψ k k l ( ) if A k were also set to zero. Note that Model -as a totally unspecified "baseline" hazard, rather than parameterizing its dependence on Ā k−1 and Lk. One advantage of this is that the danger of incompatibility between Models , …, K 0   is avoided. To illustrate this danger, suppose it were assumed that | 3 together with NUC, implies 1  holds. However, it also implies a restriction on the association between A 0 and T, a restriction which might conflict with that of 0  . Such conflict would be the result of there being no coherent overall model.

| ESTIMATION METHODS
In order to estimate ψ k l ( ) , we introduce nuisance . The dispersion parameter ϕ k is assumed not to depend on Ā k−1 or Lk, and g is the canonical link function. The methods described in Sections 4.1 to 4.3 consistently estimate ψ k l ( ) when Models k  and k  (k K = 0, …, ) are correctly specified. Method 1 can be applied using standard GLM software. Methods 2 and 3 improve on method 1 by using more efficient estimators that are closely related to that described by DMTV in the setting of a single baseline exposure. Method 3 gives consistent estimation under slightly weaker conditions than method 2, but is more computationally intensive.
, …, ). Our first estimation method for ψ k l ( ) involves fitting this GLM to estimate α k l ( ) and calculating . We now explain in more detail.
can be obtained as follows. For each of a number (we used 10) of equally spaced values of t between k and k + 1 (including k and k + 1), identify the set of individuals with ⩾ T t and, for each of these individuals, create a copy (a "pseudo-individual") with the same value of A L (¯,¯) K K and with new random variable Q equal to t. Fit to the resulting set of (up to n 10 ) . When ϕ k is unknown, it can be estimated by fitting Model k  to those of the original n individuals with ⩾ T k. In the simulation study of Section 7, we also tried using 50 values of t to construct the set of pseudo-individuals instead of 10, but found this made very little difference to the estimates.
. That is, within the population stratum defined by any given value of A L (¯,¯) k k and by Remembering that the first element of H k equals one for all individuals, it follows that a consistent estimate α k k ( +1) of α k k ( +1) can be obtained by fitting the GLM g to a set of pseudo-individuals constructed as described above but using 10 values of t between k + 1 and k + 2 (rather than k and k + 1) and , which is unknown, and so we replace it by its previously obtained estimate ψ k k to a set of pseudo-individuals constructed using 10 equally spaced values of t between l and l + 1 and using weights Although this estimation procedure involves weights w t ( ) k , these are different from the inverse probability of exposure weights used to fit MSMs and do not suffer the same instability that can plague the latter weights.
The variance of the weights can be reduced by using modified . This may improve efficiency, especially when A j is precisely predicted by depends only on Ā k−1 and Lk, and as model (5) is specified for it and its parameters estimated from the set of individuals still at risk at time j. Note that j k does not need to be correctly specified for ψ k l ( ) to be consistent; need not be compatible with k  .

| Method 2: G-estimation
The principle underlying the following estimator of ψ k l ( ) is that after removing the effects of A k and later exposures from the increment in the counting process ⩾ N t I T t ( ) = ( ), NUC implies that the resulting "blipped down" increment at any time ⩾ t k is independent of A k conditional on Ā k−1 and Lk and being still at risk.
, obtained exactly as in method 1. The next paragraph provides a rationale for Equation (6).
NUC implies that the counting process N t ), the counting process for the observed failure time T . In particular, Equation (2) implies that, for any ∈ t k k [ , + 1) and conditional on can be unbiasedly estimated by the corresponding mean of the observed increments in N t ( ) minus A ψ δ k k k ( ) among the survivors at time t. Hence, the adjusted observed and ⩾ T k. This equation involves inverse weighting by the hazard function; such weighting also features in efficient estimating equations of other additive hazards models. In practice, accurate estimation of the hazard function is difficult and increases the computational complexity of the procedure, and so this weighting is commonly omitted by standard fitting procedures for additive hazards models. Results of DMTV imply (see Web Appendix C) that if this is done with the semiparametric efficient equation for ψ k k To make Equation (6) invariant to additive transfor- . The next paragraph provides a rationale for this estimating equation.
Again we exploit the conditional independence of N t ( ) and A k (NUC) and the relation between and (as noted in Section 4.1) there are w t times as many individuals with . This justifies the above SEAMAN ET AL. in Equation (6). In general, the consistent estimator ψ k l with is available in closed form (see Web Appendix E for formulae when g (.) is the identity or logistic link function).
In Web Appendix F we prove ψ k l

| Method 3: Improved G-estimation
If we use a different estimator e A L t (¯,¯, ) is calculated by fitting a single GLM to a set of pseudoindividuals, with time since lth visit, Q l − , as a covariate.
In method 3, we instead fit a separate GLM at each time since the lth visit. That is, for any ⩾ t k, we calculate e A L t (¯,¯, ) to the set of individuals with ⩾ T t, using weights w t ( ) k . This set changes only at times t at which an individual leaves the risk set, and so the GLM needs to be fitted only at these times. This is the approach taken by DMTV, who denoted the resulting estimator of ψ k k ( ) as "ψ TV PS-DR " and, on the basis of results from a simulation study, recommended it over three alternatives. As in method 2, we can use stabilized weights and replace A ψ j j l ( ) by ψ Δ* j k j l ( ) ( ) . As shown in Web Appendix F, method 3 has the same double robustness property as method 2 but with the parameters γ k l ( ) in Model k l ( )  now allowed to be a function of t l − .

| Constraining exposure effects
In some applications, it may be desirable to impose the constraint that ψ ψ = k k m k k m ( + ) ′( ′+ ) for all k k m , ′, , that is, the effect of exposure measured at one visit k on the hazard m visits later is the same for all k. This reduces the number of parameters and, as we see in Section 7, increases the precision of their estimates.
In Web Appendix G we explain how estimation may be performed under this constraint. See Vansteelandt and Sjolander (2016) for how to impose other constraints.

| THE GENERAL SNCSTM
In this section, we extend the SNCSTM to allow visit times to be irregular, that is, to vary from one individual to another, and effect modification, that is, the effect of exposure on survival to depend on the exposure and confounder histories. . Until now, we have assumed We assume visit times S̄are planned or randomly chosen at baseline using only baseline confounder information L 0 , and we modify NUC to be . Also, let S K i +1, denote an administrative censoring time common to all individuals (until now, we assumed S K = + 1 . To allow effect modification, we define

P T A t A L S T S P T A t A L S T S
A v t Z S ψ { (¯, 0) |¯,¯,¯, } { (¯, 0) |¯,¯,¯, } = exp{− ( , ,¯) }, can also be written as The modifications to methods 1 and 2 needed to fit the general SNCSTM are simple (see Web Appendix D). Modifying method 3 is simple when visit times are regular; it is possible for irregular visit times but is fiddly. In the simulation study reported in Section 7 we found little benefit from method 3 relative to method 2 when visit times were regular, and so did not implement it for irregular times.

| CENSORING
We now allow for censoring before the administrative censoring time. Let C i and T i denote individual i's censoring and failure times, respectively. Redefine T i and N t ( ) is unchanged except that T i has this new meaning. With these changes, methods 1 to 3 remain valid, provided two further conditions hold (Vansteelandt and Sjolander, 2016). First, the censoring hazard does not depend on the exact failure time or future exposures or confounders. That is, the counting process, , for the censoring time satisfies ,¯,¯,¯,˜> ,˜= ( ,¯,¯,¯) where ⌊ ⌋ Ā t and ⌊ ⌋ L t are the exposure and confounder histories up to time t and The second condition, which can be weakened by using censoring weights (see Web Appendix H), is that , so censoring depends only on baseline confounders.

| SIMULATION STUDY
We used a simulation study to investigate the bias and efficiency of the methods. There were K + 1 =4 visits and two time-dependent confounders (ie, dim L ( ) =2 k ). These and the exposure were generated as: is correctly specified with no effect modification (ie, Z = 1 k l ( ) ) and the true exposure effects are ψ = −0.04, . We considered three scenarios: two with regular and one with irregular visit times. For regular visits, S k = ik .

For irregular visits, inter-visit times
were independently uniformly distributed on [0.5, 1.5]. There was administrative censoring at time 4. In one of the regular visit scenarios, there was no random censoring. In the other, and in the irregular visit scenario, there was an exponentially distributed random censoring time with mean 5. For the regular visit scenario without random censoring, the expected percentages of individuals observed to fail between visits 0 and 1, 1 and 2, 2 and 3, and between visit 3 and time 4 were 20%, 14%, 11%, and 9%, respectively. For the regular and irregular visit scenarios with random censoring, these percentages were 18%, 10%, 6%, and 4%, and the corresponding expected percentages of individuals censored were 16%, 11%, 8%, and 5%. For each scenario, we generated 1000 data sets, each with n = 1000 individuals. Estimation was done with and without the constraint, which is true here, that ψ ψ = k k m k k m ( + ) ′( ′+ ) . Tables 1 and 2 show for the regular visit scenario without and with random censoring, respectively, the mean estimates and standard errors (SEs) for methods 1 to 3. Results for the irregular visit scenario are in Web Appendix L. We see that all the estimators are approximately unbiased, though there is some bias for ψ ψ , 0(2) 0(3) , and ψ 1(3) , parameters for which there is relatively little information in the data. Comparing SEs, we see that methods 2 and 3 give very similar results and that these methods are more efficient than method 1. This difference in efficiency is much greater when there is random censoring (it is even greater when visit times are irregular-see Web Appendix L). This may be because method 1, unlike 2 and 3, does not distinguish between failure and censoring (or occurrence of next visit). Although methods 2 and 3 use fitted values from the same GLM that is used in method 1, the estimating equations for methods 2 and 3 involve increments dN t ( ), which equal one only when a failure occurs. For methods 1 and 2, coverage of 95% bootstrap confidence intervals (using 1000 bootstraps) was close to 95% (see Table 3). Coverage was not evaluated for method 3, as it is computationally intensive to bootstrap this method for 1000 simulated data sets. Imposing the constraint that ψ ψ = k k m k k m ( + ) ′( ′+ ) reduced SEs, as expected. In this simulation study, censoring times are independent of exposures and confounders, and so censoring weights (Section 6) are not required for consistent SEAMAN ET AL. | 7 estimation of the ψ k l ( ) 's. However, applying method 1 with censoring weights improved its efficiency (see method 1cw in Tables 1 and 2), probably because chance associations between exposures and censoring events are reduced in the weighted sample. Coverage of bootstrap confidence intervals (Table 3) was close to 95% for most parameters, but there was overcoverage for some parameters. Using censoring weights had no effect on the efficiency of method 2.
Web Appendix L shows results for n = 250 or for a shorter follow-up time with times between visits divided by four and administrative censoring at time 1 (and so fewer failures). These are qualitatively similar to the results in Tables 1 and 2, but with the relative inefficiency of method 1 being even more marked in the scenarios with shorter follow-up time. Web Appendix L also describes a simulation study that demonstrates the double robustness of methods 2 and 3.

FIBROSIS REGISTRY DATA
The UK CF Registry records health data on nearly all people with CF in the UK at designated approximately annual visits (Taylor-Robinson et al., 2018). To illustrate the use of the SNCSTM, we used data on 2386 individuals observed during 2008 to 2016 to investigate the causal effect of the drug DNase on survival. DNase has been found to have a beneficial effect on lung function, including using Registry data (Newsome et al., 2019), but its effect on survival has not been studied. Baseline visit was defined as an individual's first visit during 2008 to 2015, and there were up to K = 8 follow-up visits. The (irregular) visit times were defined as years after baseline visit; the median time between visits was 1.00 years (interquartile range 0.93-1.07). Individuals were defined as "treated" if they had used DNase since the previous visit and "untreated" otherwise. Individuals treated at a visit prior to their baseline visit were excluded, as were visits prior to age 18. Administrative censoring was applied at the end of 2016 and nonadministrative censoring when an individual had a transplant or had not been seen for 18 months. The percentage of treated patients increased from 14% at the baseline visit to 52% at visit 8, and most patients who began using DNase continued to use it. There were 137 deaths during follow-up and 653 nonadministrative censorings (including 36 transplants). Of those who died, 74 (63) were treated (untreated) at the time of death. Total follow-up was 12 380 person-years (py), and death rates while treated and untreated were, respectively, 0.019 (74/3930) and 0.0075 (63/8450) py −1 . The ratio of the probabilities of surviving for one year while treated and untreated is thus ∕ 0.981 0.9925 = 0.989. However, this may be due to confounding: sicker patients being more likely to receive treatment.
We estimated the effect on survival of delaying initiation of treatment by one year. To do this, we (re) defined A k as A = 0 k for those treated at visit k, and A = 1 k for those untreated. Now ψ exp(− ) k k ( ) represents the multiplicative causal effect of intervening to start treatment at visit k rather than at visit k + 1 on the probability of surviving for at least one year after visit k, among patients who survive to, and are untreated at, visit T A B L E 1 Means (×10) and SEs (×10) of parameter estimates when n = 1000, visits are regular and the only censoring is administrative

| DISCUSSION
One advantage of SNCSTMs is that, in contrast to MSMs, they can cope well with situations where the inverse probabilities of exposure are highly variable. Indeed, they can even be used when the so-called experimental treatment assignment assumption is violated, that is, when some individuals are, on the basis of their time-varying covariate information, excluded from receiving particular exposure levels. For these individuals, t Δ ( ) = 0 i , meaning they do not contribute to the estimating functions of methods 1 to 3.
Another advantage of SNCSTMs is that they can be used to investigate time-varying modification of exposure effects on survival time. Although it is, in principle, possible to do this using structural nested AFTMs, estimation difficulties caused by artificial recensoring mean that such models are usually kept simple and interactions are not explored.
The SNCSTM can also be used to estimate the counterfactual exposure-free survivor function, that is, . This is because Equations (4) and (8)  (A) (B) (D) (C) F I G U R E 1 Estimates of the ratio of the survival probabilities when treatment is initiated immediately compared to initiation being delayed by one year. A: from the model with no interaction. B, C and D: from the model with interaction between treatment and FEV % 1 A limitation is that, like other additive hazards models, the SNCSTM does not constrain hazards to be nonnegative, and so does not exclude survival probabilities greater than one. Similarly, Picciotto et al.'s (2012) structural nested cumulative failure time model does not exclude failure probabilities greater than one.
Method 1 appears to be less efficient than methods 2 and 3 but has the appeal that it can be applied using standard GLM software. In our simulation study, the efficiency loss was fairly small when the only censoring was administrative and visit times were regular. This method became much less competitive, however, when there was random censoring, and even more so when visit times were irregular. By not distinguishing between failure and censoring, method 1 may also be more sensitive than methods 2 and 3 to violation of the assumption that . Of the three, method 3 gives consistent estimation under the weakest assumptions. However, it needs more computation than methods 1 and 2, especially when visit times are irregular and the exposure is binary. In our simulation study, methods 2 and 3 performed similarly, and so the theoretical advantage of method 3 may not be worth the extra computation. An R function for implementing our methods, with examples, is described in Web Appendix I.
DMTV discuss the close connection between their model for a point exposure (which is equivalent to the SNCSTM with K = 0) and Picciotto et al.'s (2012) cumulative failure time model. Although the latter is a discrete-time model for the probability of failure, it is easy to finely discretize time so as to approximate continuous time and (as Picciotto et al. note) to reformulate it as a model for the probability of survival. As DMTV explain, a drawback of Picciotto et al.'s method is the difficulty of deriving the efficient estimating equation. This difficulty arises because their class of estimating functions uses correlated survival indicators. By instead using independent increments of a counting process, DMTV were able to derive the efficient estimating function. Methods 2 and 3 are extensions to time-varying exposures of DMTV's recommended method, and are, therefore, expected also to be more efficient than Picciotto et al.'s method. In Web Appendix J we elaborate on DMTV's discussion of Picciotto et al.'s model and reformulate it as a model for the probability of survival. Tables 1 and 2 show mean estimates and SEs for the resulting Picciotto et al. estimator (described in Web Appendix J and denoted "Method P" in tables). The SEs are larger than those of methods 2 and 3, suggesting methods 2 and 3 are indeed more efficient. Methods 2 and 3 also have the advantages of using closedform estimators, handling random censoring automatically (because estimating functions are framed in terms of increments, which are observable up to the time of censoring), and being double robust. Picciotto et al. use an iterative Nelder-Mead algorithm, employ inverse probability of censoring weighting to handle random censoring, even when this censoring is completely at random, and their estimator is not double robust.
In Web Appendix K we outline how the SNCSTM can handle competing risks.