Using generalized linear models to implement g-estimation for survival data with time-varying confounding

Using data from observational studies to estimate the causal effect of a time-varying exposure, repeatedly measured over time, on an outcome of interest requires careful adjustment for confounding. Standard regression adjustment for observed time-varying confounders is unsuitable, as it can eliminate part of the causal effect and induce bias. Inverse probability weighting, g-computation, and g-estimation have been proposed as being more suitable methods. G-estimation has some advantages over the other two methods, but until recently there has been a lack of flexible g-estimation methods for a survival time outcome. The recently proposed Structural Nested Cumulative Survival Time Model (SNCSTM) is such a method. Efficient estimation of the parameters of this model required bespoke software. In this article we show how the SNCSTM can be fitted efficiently via g-estimation using standard software for fitting generalised linear models.The ability to implement g-estimation for a survival outcome using standard statistical software greatly increases the potential uptake of this method. We illustrate the use of this method of fitting the SNCSTM by reanalyzing data from the UK Cystic Fibrosis Registry, and provide example R code to facilitate the use of this approach by other researchers.

that the post-baseline exposure measurement timesS = (S 1 , . . . , S K ) are planned or randomly chosen at baseline using only baseline confounder information, i.e. L 0 .
When exposure measurement times are the same for all individuals, we can omit the termS in all the expressions below.
To allow the (controlled direct) causal effect of A k on the hazard during the interval between times S l and S l+1 (for k ≤ l ≤ K) to be modified by the history (Ā k−1 ,L k ) of exposure and confounders at time S k and/or by the exposure measurement times S, let Z int k(l) be any function ofĀ k−1 ,L k andS (the superscript 'int' stands for 'interaction'), and let Z k(l) be the vector Z k(l) = (1, Z int k(l) ). For example, if the effect of A k during the time interval between S l and S l+1 is modified by L k , then Z int k(l) = L k and Z k(l) is the vector Z k(l) = (1, L k ). If the effect is modified by, 1 for example, both L k and A k−1 , then Z int k(l) is the vector Z int k(l) = (L k , A k−1 ) and Z k(l) = (1, L k , A k−1 ). If there is no effect modification, just let Z k(l) = 1.
Web Appendix B describes how to fit the general SNCSTM.

B Fitting the general SNCSTM B.1 The basic procedure
Here we describe how to estimate the parameter ψ k(l) (0 ≤ k ≤ l ≤ K) of the general SNCSTM described in Web Appendix A.
If the exposure measurement timesS are the same for all individuals, we shall say that the 'measurements are regular'; ifS differs between individuals, we shall say that the 'measurements are irregular'. We shall refer to a GLM with gamma distribution and log link function as a 'gamma GLM'.
Recall that S K+1 denotes the administrative censoring time common to all individuals. We use C to denote the individual's censoring time. So, C ≤ S K+1 . In For each k = 0, . . . , K, specify a GLM for A k givenĀ k−1 ,L k ,S andT ≥ S k with canonical link function. We shall refer to this GLM as 'Model A k '. For example, if A k is continuous, Model A k could be a linear regression model; if A k is binary, Model A k could be a logistic regression model.
For each of the individuals with T ≥ S l and for each value of t = S k , S k + δ, S k + 2δ, S K +3δ, . . . that satisfies S l ≤ t ≤ S l+1 and t ≤ T , create a copy of the individual and set this copy's value of a new time variable Q equal to t. We call these copies of individuals 'pseudo-individuals'.
there is effect modification), to the pseudo-individuals using weights w k (Q), where when S l ≤ Q ≤ S l+1 . Letê k (Q) denote the fitted values thus obtained, and let Seaman et al. (2020) [1] showed that if A k is correctly specified, thenê k (Q) is a consistent estimate of E{A k |Ā k−1 ,L k ,S, T (A k , 0) ≥ Q}. Now fit the gamma GLM with covariate (or covariates, if there is effect modification) −Z k(l)∆k (Q)δ, no intercept, and outcome variable δ to the pseudo-individuals, using weights w k (Q) and excluding the pseudo-individuals that have either Q + δ > S l+1 or both C < T and Q + δ > C. (The former pseudo-individuals are excluded because we are only using data from the period between times S l and S l+1 . The latter are excluded because the failure status at time Q + δ of these pseudo-individuals is unknown.) Letψ k(l) be the resulting estimate of the coefficient of −Z k(l)∆k (Q)δ.
When the measurements are regular and S l − S k and S l+1 − S k are multiples of δ, we estimate ψ k(l) using exactly the procedure that has just been described. The method described in the section 'Estimating the joint effect of two exposures' of our article is the special case where K = 1 and there is no effect modification (and ψ 1(1) is written as ψ 1 ). Note that when measurements are regular, it is always possible to choose the value for δ so that S l − S k and S l+1 − S k are multiples of it.
B.2 Modification of procedure when exposure measurement times differ between individuals When S l − S k and/or S l+1 − S k are not always multiples of δ (in particular, this would be the case if the measurements were irregular), the procedure described in Web Appendix B.1 is still valid (i.e. it yields a consistent estimate of ψ k(l) ), but it may ignore information on some failures that occur shortly after time S l or shortly before S l+1 . This loss of information could be reduced by choosing a very small value for δ, which would ensure that S l − S k and S l+1 − S k were nearly multiples of δ. However, choosing a very small value for δ would lead to a very large number of pseudo-individuals being created, which would make fitting of the models very slow. So, we instead propose that two modifications of the procedure described in Web Appendix B.1 be made when S l − S k or S l+1 − S k are not always multiples of δ. The first modification involves creating up to two extra pseudo-individuals 4 from each of the individuals still at risk at time S k . The second modification is a small change to the covariate(s) used in the gamma GLM. Together, these two modifications ensure that no information on failures that occur shortly after time S l or shortly before S l+1 is ignored.
The modified procedure is as follows.
Create the pseudo-individuals exactly as described above. Call this set of pseudo- we are only using data from the period between times S l and S l+1 , and so for these individuals we end their risk period early, at time S l+1 rather than time Q + δ.) Now, for each of the individuals with T ≥ S l , create up to two extra copies of that individual as follows. First, let Q first equal the smallest of S k , S k + δ, S k + 2δ, S k + 3δ, . . . that is greater than or equal to S l for that individual. This is
The estimation procedure described above is easily modified to impose this con-

C Inverse probability of censoring weighting
In this web appendix, we describe how to use inverse probability of censoring weights to deal with censoring that depends on more than just baseline covariates L 0 and exposure measurement timesS. We do this in the context of the general SNCSTM, which is described in Web Appendix A. We shall describe the required modifications to the fitting procedure described in Web Appendix B.
We shall assume that the conditional hazard of censoring at time t given that the failure time T is at least t and given the actual failure time T and the histories of the exposure and confounders up to time T and the exposure measurement times S depends on at most the histories of the exposure and confounders up to time t andS. Let λ(t |Ā ⌊t⌋ ,L ⌊t⌋ ,S) denote this conditional hazard of censoring at time t. Here,Ā ⌊t⌋ andL ⌊t⌋ denote the histories of the exposure and of the confounders, , and analogously forL ⌊t⌋ .
Usually, λ(s,Ā ⌊s⌋ ,L ⌊s⌋ ,S) (and hence w C k (t)) is unknown and must be estimated from the data. To do this, specify a model for λ(t,Ā ⌊t⌋ ,L ⌊t⌋ ,S). For simplicity, one could use a parametric proportional hazards model with piecewise constant baseline hazard between each pair of subsequent exposure measurement times S k 9 and S k+1 and with time-dependent covariates A k and L k . This would mean that when S l ≤ t < S l+1 .

Now, when fitting Model
. Then, when fitting the gamma GLM, each pseudo-individual should be weighted by w k (Q)× Note that misspecification of the model for λ k (s,Ā k−1 ,L k ,S) does not affect the consistency of the estimator of ψ k(l) .
If λ(t,Ā ⌊t⌋ ,L ⌊t⌋ ,S) = λ(t, L 0 ,S), i.e. the censoring hazard depends only on L 0 and S, then and, when Q ≤ T < Q + D, If δ (and hence D) is small, then the difference between T and Q + D will be small for pseudo-individuals with Q ≤ T < Q + D, and so the weight will be close to one and can be ignored. This confirms the claim in the section 'Censoring' of our paper that no censoring weights are needed when the censoring hazard depends only on L 0 (andS).

D Estimating P {T (0) ≥ t}
When there is no censoring before time t, we can estimate P {T (0) ≥ t} as Here, I(S l ≥ t) equals 1 if S l ≥ t and equals 0 otherwise, and min(t, S l+1 ) equals t if t < S l+1 and equals S l+1 otherwise.
Note that, when K = 1 and Z k(l) = 1, J(t) reduces to which is the formula given in the Section 'Estimating survival probability when both exposures are set to zero' of our paper. With this assumption, P {T (0) ≥ t} will be consistently estimated as If λ(s |Ā ⌊t⌋ ,L ⌊t⌋ ,S) depends neither on the histories of exposure and confounders nor onS, we can estimate w C 0i (t) by using exp{H NA (t)}, where H NA (t) is the Nelson-Aalen estimator of the cumulative hazard, or alternatively by using the Kaplan-Meier estimator. When calculating this Nelson-Aalen or Kaplan-Meier estimator, censoring should be treated as the event of interest and failure should be treated as a censoring event.
More generally, one could estimate λ(t |Ā ⌊t⌋ ,L ⌊t⌋ ,S), and hence w C 0 (t), using a Cox regression model and the Breslow estimator of the baseline cumulative hazard. This is what we did for the analysis of the Cystic Fibrosis data described in the section 'Application to UK Cystic Fibrosis Registry' of our article.

E Proof of consistency ofψ k(l)
Here we provide a proof of consistency of the estimator of the parameter ψ k(l) (0 ≤ k ≤ l ≤ K) in the general model described in Web Appendix A. The method used to calculate this estimator is described in Web Appendix B.
Let R(t) = I(T ≥ t). Let u and t be such that S l ≤ u < t ≤ S l+1 . Write Let B k(l),i denote the set of pairs of values of (Q, D) of copies created from individual i when estimating ψ k(l) . For these, S l ≤ Q ≤ Q + D ≤ S l+1 .
Fitting the gamma GLM described in Web Appendix B involves solving estimating with E{A k |Ā k−1 ,L k ,S, T (Ā k , 0) ≥ u} in the ∆ k (u) term replaced by an estimate that Seaman et al. (2020) [1] showed is consistent, and with ψ k+1(l) , . . . , ψ l(l) replaced by previously obtained estimates. We shall argue by induction, and so we assume that these previously obtained estimates are consistent. Note that M k(l) (Q, Q + D,Ā l ,L l ,S, T ) = 0 for copies that, because they have Q > T , are discarded and do not become pseudo-individuals, and for copies with D = 0.
We now prove that expression (5) has expectation zero givenĀ k−1 ,L k ,S and T ≥ S k when S l ≤ u < t ≤ S l+1 , and hence (6) are unbiased estimating equations for ψ k(l) .
To simplify the notation slightly, we shall condition implicitly (rather than explicitly) onS, and write M k(l) (u, t,Ā l ,L l ,S, T ) and ∆ k (u) just as M and ∆ k . Also, we define, for k ≤ x ≤ l and S l ≤ u < t ≤ S l+1 , To simplify notation, we shall write Π kx and Ω kl as just Π x and Ω l .
We can now write M as Note that line (7) follows from the SNCSTM. Hence, Note that line (9) follows from the SNCSTM. Hence, It then follows that Note that line (12) follows from the SNCSTM, using the same argument that led to line (8). Likewise, line (13) follows from the SNCSTM, using the same argument that led to line (10).
From lines (11) and (14), we see that, by using induction, we can arrive at It then follows that Hence, Note that line (15) follows from the SNCSTM, using the same argument that led to line (10), and line (16) follows from the assumption that T (Ā k−1 , 0)⊥ ⊥A k |Ā k−1 ,L k , T ≥ S k .
So, we can write It now follows that as required.

F Relation betweenψ and 'Method 2'
In this section, we show that the causal effect estimators described in Sections 2, 3 and 5 are closely related to those obtained from Seaman et al.'s (2020) [1] Method 2. For simplicity, we consider the situation of a point exposure A, with no effect modification and only administrative censoring, just as in Section 2. However, the proof below can be generalised to show that the same relation between the estimators exists when there are multiple exposures A 0 , . . . , A K , effect modification and random censoring.
We considered both a scenario in which the exposure measurement times are the same for all individuals ('measurements are regular') and where they differ ('measurements are irregular'). When measurements were regulars, S ik = k. When measurements were irregular, inter-measurement times S k+1,i − S ki were independently uniformly distributed on [0.5, 1.5].
Censoring was imposed by generating a censoring time completely at random from an exponential distribution with rate of 0.2. Individuals who had still not failed or been censored by time 4 were administratively censored at that time.
When n = 1000, the number of individuals observed to fail between time S 0 = 0 and S 1 , between S 1 = 1 and S 2 , between S 2 = 1 and S 3 , and between S 3 and time 4 are approximately 200, 100, 50 and 50, respectively. The corresponding numbers of individuals who are censored were approximately 140, 100, 70 and 50.
We also considered two scenarios (one with regular and one with irregular measurements) in which there is effect modification. In these scenarios, the dataset was generated in the same way as described above, except that the causal effect of A k on the hazard depended on L k1 , the first element of L k . For this data-generating mechanism, when s l ≤ t < s l+1 , or equivalently,

H Further extensions of the general SNCSTM
In this Web Appendix we sketch three extensions of the general SNCSTM described in Web Appendix A. First, we allow the causal effect of A k to vary over time during an interval between exposure measurement times S l and S l+1 (for l ≥ k). Second, we allow the causal effect of a continuous exposure A k to be non-linear in A k .
In particular, we shall sketch the method for a quadratic effect. Third, we have focused thus far on continuous and binary exposures. We shall now consider a categorical exposure with unordered categories.
H.1 Allowing the causal effect to vary with time during an interval between exposure measurement times The SNCSTM of equation (2) assumes that the additive effect, ψ ⊤ k(l) Z k(l) of A k on the hazard between times S l and S l+1 is constant over that time interval. If we instead allow Z k(l) = Z k(l) (t) to be a function of t, then a modified version of equation (2) holds. Let us suppose, for example, that we assume Z k(l) (t) = (L ⊤ k , L ⊤ k t) ⊤ and write ψ k(l) as ψ k(l) = (ψ (L)⊤ k(l) , ψ (L2)⊤ k(l) ) ⊤ . Then when S l ≤ t < S l+1 . Note that the (S j+1 − S j ) 2 and (t − S l ) 2 terms arises because t 0 u du = t 2 , just as the (S j+1 − S j ) and (t − S l ) terms arises because t 0 1 du = t.
Now, when fitting the GLM for A k givenĀ k−1 ,L k and T ≥ t (for S l ≤ t < S l+1 ), we need to include interactions between L k and (t − S l ) and between L k and (t − S l ) 2 .
Aside from the inclusion of these two interaction terms, the fitting of the GLM for A k is the same as described in Web Appendix B. If the measurements are irregular (i.e.S varies from one individual to another), we also need to include other covariates, as described in Web Appendix B.2.
The way in which the gamma GLM is subsequently fitted to estimate ψ k(l) remains the same as was described in Web Appendix B.
H.2 Allowing the causal effect of A k to be a non-linear function of A k

J Estimating equations for SNCSTM with effect modification
At the beginning of Section 5 of our article we described an example of a SNCSTM where the causal effects of A 0 and A 1 are modified by, respectively, L 0 and L 1 , and described how to fit this model. Here we give the estimating equations for each of the three gamma GLMs used to estimate, respectively, (ψ 0 1 , ψ L 1 ), (ψ 0 0(0) , ψ L 0(0) ) and (ψ 0 0(1) , ψ L 0(1) ) in this model.