Stable inverse probability weighting estimation for longitudinal studies

We consider estimation of the average effect of time‐varying dichotomous exposure on outcome using inverse probability weighting (IPW) under the assumption that there is no unmeasured confounding of the exposure–outcome association at each time point. Despite the popularity of IPW, its performance is often poor due to instability of the estimated weights. We develop an estimating equation‐based strategy for the nuisance parameters indexing the weights at each time point, aimed at preventing highly volatile weights and ensuring the stability of IPW estimation. Our proposed approach targets the estimation of the counterfactual mean under a chosen treatment regime and requires fitting a separate propensity score model at each time point. We discuss and examine extensions to enable the fitting of marginal structural models using one propensity score model across all time points. Extensive simulation studies demonstrate adequate performance of our approach compared with the maximum likelihood propensity score estimator and the covariate balancing propensity score estimator.


Abstract
We consider estimation of the average effect of time-varying dichotomous exposure on outcome using inverse probability weighting (IPW) under the assumption that there is no unmeasured confounding of the exposure-outcome association at each time point. Despite the popularity of IPW, its performance is often poor due to instability of the estimated weights. We develop an estimating equation-based strategy for the nuisance parameters indexing the weights at each time point, aimed at preventing highly volatile weights and ensuring the stability of IPW estimation. Our proposed approach targets the estimation of the counterfactual mean under a chosen treatment regime and requires fitting a separate propensity score model at each time point. We discuss and examine extensions to enable the fitting of marginal structural models using one propensity score model across all time points. Extensive simulation studies demonstrate adequate performance of our approach compared with the maximum likelihood propensity score estimator and the covariate balancing propensity score estimator.

INTRODUCTION
Inverse probability weighting (IPW) has become very popular as a method to adjust statistical analyses for bias due to confounding or selection in cross-sectional and longitudinal observational studies. It is widely used in causal inference (Hernán & Robins, 2006), incomplete data problems (Seaman & White, 2013), sample surveys (Kott & Liao, 2012), among others. The reason for the increasing popularity of IPW is that it is relatively simple and can handle complex problems of confounding and selection bias. Moreover, IPW does not require a lot of modeling. In particular, this method demands a propensity score model, which is defined as the conditional probability of a subject being exposed (or assigned to certain treatment) given a set of pretreatment covariates (Rosenbaum & Rubin, 1983). Correction for confounding bias or selection bias works by weighting each subject differently in the analysis, that is, by assigning a weight equal to the inverse of the conditional probability of the observed exposure. This sole reliance on a propensity score model is useful as it helps to ensure robustness against outcome model misspecification, which is difficult to diagnose when the exposed and unexposed subjects have fairly different covariate or exposure history. Unfortunately, despite its widespread popularity, inverse weighting has several important drawbacks. This method can be very sensitive to the choice of estimation method for the inverse probability weights (Kang & Schafer, 2007;Ridgeway & McCaffrey, 2007;Zubizarreta, 2015). These are commonly estimated based on maximum likelihood estimation under standard logistic models. The resulting estimated weights can be highly volatile and unstable, which can substantially increase the finite sample bias and variance of the weighted estimators leading to unstable results (Robins et al., 2007;Seaman & White, 2013). For instance, this could happen when for some treated subjects the estimated probability of receiving treatment is low based on their observed covariates. This can happen because maximum likelihood estimators do not guarantee stability of the estimated weights. Obtaining stable inverse weights can be especially problematic in longitudinal studies, where inverse probability weights are often accumulated as a product of inverse weights across time.
A number of methods have been proposed to attain stable weights in fixed time period studies (see Hainmueller, 2012;Imai & Ratkovic, 2014;Yiu & Su, 2018;Zubizarreta, 2015, among others). Relatively few have been designed for longitudinal studies (e.g., Han, 2016;Kallus & Santacatterina, 2018). In view of this, we introduce a new approach to estimate the inverse probability weights needed for estimation of the effects of a time-varying exposure on an outcome. We propose a system of estimating equations in the fixed time period settings and further extend it to the longitudinal settings. Our proposal relies on so-called calibration equations which endorse covariate balancing conditions between the whole sample and the subsample corresponding to the considered regime. A key feature of the proposed approach, unlike others, is that it adapts to the estimand. Through extensive simulation results, we demonstrate that, under a wide variety of settings, our proposed methodology performs well in terms of several statistical measures compared with the traditional maximum likelihood propensity score estimator (MLE), the covariate balancing propensity score (CBPS) estimator (Imai & Ratkovic, 2014), the stable balancing weights (SBW) estimator (only for the fixed time period setting) (Zubizarreta, 2015) and the adaptation of the calibration approach proposed by Han (2016).
The rest of the article is organized as follows. Section 2 describes the IPW estimation method. In Section 3, we provide our proposed methodology. Section 4 gives a thorough literature review where we discuss the limitations of the existing methods and their relation to our proposed method. In Section 5, we conduct a simulation study to evaluate the performance of our proposal.
In Section 6, we demonstrate the applicability of our proposed approach in the estimation of marginal structural models (MSM). In Section 7, we present an empirical application which aims at estimating the impact of negative advertisement on election outcomes. Section 8 concludes with discussions.

IPW ESTIMATION
We start with the problem of estimating the mean outcome under a certain point treatment regime. For each subject i = 1, … , n we denote R i as the treatment variable, X i as the p-dimensional vector of observed pretreatment covariates, and Y i as the outcome of interest. For simplicity we assume that the treatment variable is binary, that is, R i = 1 if subject i received treatment and R i = 0 if subject i received control. Let Y r denote the potential outcome under treatment r. It is assumed that there is no unmeasured confounding (Rosenbaum & Rubin, 1983), that is, the potential outcome is conditionally independent of the treatment indicator given the covariates (Y r ⟂ ⟂ R | X for r ∈ {0; 1}). No other restrictions are imposed on the outcome and the covariates. Let the propensity score model be a parametric working model for the probability of receiving treatment: P(R = 1 | X) = ( ; X), where we assume that () is known, bounded away from zero and one with probability one (i.e., positivity assumption), and sufficiently smooth in the unknown nuisance parameter . For example, it is common to assume that the propensity score obeys a logistic regression model, that is, where is the nuisance parameter. For a consistent estimator̂, the IPW estimator of = E(Y r ) is defined as follows (Horvitz & Thompson, 1952): (1) For instance, the IPW estimator of E(Y 1 ) is defined as 1 The IPW estimator of is consistent if the propensity score model is correctly specified (Lunceford & Davidian, 2004).
Consider next the problem of estimating the effect of a time-varying treatment on an end-of-study outcome. For each subject i = 1, … , n we denote R t,i as the treatment variable at time t ∈ {0, … , T}, X t,i as the p-dimensional vector of observed covariates at time point t and Y i as the outcome of interest at time T. Again we assume that the treatment variables are binary, that is, R t,i = 1 if subject i received treatment at time t and R t,i = 0 if subject i received control at time t. We denote X t = (X 0 , … , X t ), R t = (R 0 , … , R t ) and Y r as the potential outcome under treatment r = (r 0 , … , r T ). We assume that at each time point t the potential outcome at the end of the study is conditionally independent of the treatment indicator at period t given the history of covariates and exposures (Y r ⟂ ⟂ R t | X t , R t−1 = r t−1 for r t = (r 0 , … , r t ) where r j ∈ {0; 1} for j = 0, … , t), that is, the assumption of sequential randomization holds (Robins, 1986(Robins, , 1999. As in the fixed time period case, we do not impose any other restrictions on the outcome and the covariates. We define parametric propensity score models at each time point: where each t is an unknown nuisance parameter and functions 0 , … , T obey similar restrictions as in the fixed time period setting. For consistent estimatorŝ= (̂0, … ,̂T), the IPW estimator of the counterfactual mean = E(Y r ) is defined aŝ where we denotêt ≡ (̂t, X t ) for t = 0, … , T. In particular, the IPW estimator of E(Y (1, … ,1) ) is defined as 1 The choice of estimation method for the nuisance parameters can drastically affect the performance of̂I PW (̂) and̂I PW (̂). For example, in the fixed time period settings, estimated propensity scores close to zero or one may lead to poor performance of the target estimator. In practice, the nuisance parameter is estimated through standard MLE assuming that the propensity score obeys a logistic regression model. The MLE does not guarantee the stability of the estimated inverse weights since even minor changes in the regression coefficients may drastically change the inverse weights for treated subjects who were unlikely treated. The choice of estimation method for the nuisance parameters is especially important in the context of longitudinal studies, where the inverse weights accumulate as a product of inverse probability weights across time.

3
PROPOSED METHODOLOGY

The fixed time period case
Before proceeding with our proposed methodology for longitudinal studies, we first consider the problem of stable inverse probability weight estimation in point treatment studies. For the sake of simplicity we consider a logistic regression model for the propensity score, that is, ( ; X) = expit( ′ (1, X ′ ) ′ ), for r = 0, 1. We propose to estimate by solving the following system of p + 1 estimating equations: For instance, if one aims at estimating the mean E(Y 1 ), the following system of p + 1 estimating equations will be considered: Otherwise, if interest lies in the counterfactual mean E(Y 0 ), then the following system of p + 1 estimating equations will be considered: The system of estimating equations (3) is popular in the survey sampling literature, where they are known as calibration weighting or calibration equations. In this context, they are designed to provide less volatile inverse weights when estimating the population mean (see Deville & Särndal, 1992;Kott & Liao, 2012;Wu, 2003). This choice of estimating equations has several attractive characteristics: 1. The estimating equations (3) ensure efficiency and robustness of the IPW estimator, when the outcome mean is linear in X i . Under this assumption (i.e., E(Y | R = r, X) = m( r , X) = ′ r (1, X ′ ) ′ , where r is p + 1-dimensional nuisance parameter of the outcome model), the IPW estimator is the efficient estimator of E(Y r ) under the model defined by the restrictions on the propensity score. This estimator can indeed be seen to equal the augmented IPW (AIPW) (Robins et al., 1994) because it can be rewritten aŝ Note that the second term in the right-hand side of the equation above becomes zero when m(X; r ) = ′ r (1, X ′ ) ′ . The AIPW estimator is moreover double robust (DR), in the sense that it is consistent for if either the propensity score model or the linear outcome model m is correctly specified, but not necessarily both (Scharfstein et al., 1999). 2. The estimating equations (4) (and more generally (3)) ensure stability of the inverse weights by imposing the following two constraints. First, they guarantee that the sum of the inverse weights for treated subjects equals the sample size n (e.g., from (4) it follows that ∑ n i=1 R i (̂,X i ) = n). Second, they induce a certain covariate balancing condition, that is, they make the weighted average of the covariates for treated subjects equal to the average of the corresponding covariates in the whole sample (e.g., from (4) it follows that In contrast to the existing approaches for estimating inverse probability weights, the proposed approach adapts to the estimand. For example, depending on the target of interest E(Y 1 ) or E(Y 0 ), the nuisance parameter is obtained by solving the estimating equations (4) or (5), respectively. The use of a different estimator of , depending on whether E(Y 1 ) or E(Y 0 ) is estimated, is attractive. It ensures stability of the considered weights, which differ when estimating

The longitudinal case
In this article, we will extend the previous proposal to longitudinal studies. We consider logistic regression models for the propensity scores, that is, , with variation-independent parameters t , for t = 0, … , T and we focus on the estimation of the mean = E(Y r ). For simplicity, we include all historical covariates in the set of covariates at any time period t = 1, … , T (we will allow for more generality later). To guarantee the stability of the inverse weights in each time period, we propose to estimate the nuisance parameters 0 , … , T using the following system of estimating equations: We will refer to our proposed method as Stable Estimand AdaptiVE IPW estimation (hereafter SEAVE). The proposed estimating equations deliver IPW estimators with similar properties as before. First, the IPW estimator̂I PW (̂) equals the AIPW estimator when̂= (̂0, … ,̂T) is the solution of (6) where t,r t is the nuisance parameter of the outcome model at time t corresponding to the treatment regime r t . Indeed, the AIPW estimator is defined as (Robins & Rotnitzky, 1995;van der Laan & Robins, 2003) where all terms (except̂I PW (̂)) equal 0 by (6). This is attractive because the AIPW estimator is locally efficient (when the outcome model is correctly specified at each time t) and DR. The estimating equations (6) moreover ensure stability of the inverse weights by inducing the following constraints. For the sake of simplicity, we consider the regime r = (1, … , 1). At any time point t = 0, … , T, the estimating equations guarantee that the sum of the inverse weights equals the sample size, that is, which is a desirable property. Next, at any time point t > 0, it makes the following covariate balancing conditions hold for the set of covariates X k at time point k = 0, … , t. By imposing these constraints, the considered estimated inverse probability weights target the estimand.
Remark 1. It is likely that some of the historical covariates cannot be included in the model when the number of time periods is large. In this case, we can define the following propensity score function at time t: where W t is the set of covariates which contains X t and a subset of the covariates in X t−1 . For example, by considering the propensity score * define W 1 = X 1 and omit the baseline covariate X 0 . The function * t obeys similar restrictions as t and * t is the corresponding nuisance parameter. Furthermore, we can reformulate our proposed estimating equations (6) by replacing t and X t with * t and W t , respectively. Note from (8) that by including lagged covariates X t−1 in the set of covariates W t , we obtain additional balancing conditions. For example, if the baseline covariate X 0 is included in all covariate sets X t for t = 1, … , T, then we will obtain the following sequence of balancing conditions for X 0 : Our numerical results confirm that additional balancing conditions lead to better estimation performance when the number of time periods is not large, but not generally otherwise.

RELATION TO THE EXISTING LITERATURE
As mentioned in Section 1, several methods have been proposed to achieve stable inverse weights in fixed time period studies. In the literature related to causal inference and missing data problems, the criterion (3) is closely related to the covariate balancing framework which has been widely studied by several authors (see, e.g., Athey et al., 2018;Chan et al., 2016;Fan et al., 2016;Fong et al., 2018;Imai & Ratkovic, 2014;Zubizarreta, 2015). Most prominently, Imai and Ratkovic (2014) developed an approach known as covariate balancing propensity score (CBPS) aimed at estimating the propensity scores such that resulting covariate balancing is optimized. This method aims to improve covariate balance in the observed data and may potentially provide robust and more accurate IPW estimates of the counterfactual means compared with IPW estimates based on traditional maximum likelihood propensity scores. In general, CBPS balances a particular function of covariates X, that is, where function f () is specified by the researcher. For instance, if f (X) = (1, X ′ ) ′ then the first moment of each covariate will be balanced (known as "exactly identified" CBPS). In this case, CBPS ensures that the weighted average of the covariates for treated and control subjects are equal. Our proposed approach additionally ensures that these averages equal the sample average of the corresponding covariates.
Recently, several authors (see, for instance, Hainmueller, 2012;Yiu & Su, 2018;Zubizarreta, 2015, among others) have focused on estimating the inverse probability weights directly without involving a propensity score model. In particular, Zubizarreta (2015) designed a method based on convex optimization that is referred to as SBW which aims at directly estimating the inverse weights in the context of a missing data problem. The SBW method simultaneously minimizes the variance of the inverse weights and balances the covariates under a tolerance level which is specified by the researcher based on data. In addition, Wang and Zubizarreta (2020) generalized this approach, aimed at minimizing the sum of a convex function applied on the weights and balancing basis function of the covariates. In Appendix A (see the supplementary materials), we conduct a simulation study in fixed time period settings and evaluate a comparison of IPW estimators when the inverse probability weights are estimated using MLE, SBW, CBPS, and our proposed approach (3).
Although considerable research has been devoted to estimating stable inverse probability weights in one time period studies, less attention has been paid to this problem in longitudinal studies. Imai and Ratkovic (2015) generalized the CBPS framework to longitudinal studies for estimating the inverse probability weights for MSM (Robins, 2000;Robins et al., 2000). While our proposed method in particular ensures the balancing conditions of CBPS in fixed time point studies for f (x) = (1, x ′ ) ′ , our numerical simulations demonstrate that, under certain settings CBPS, may fail to provide desirable stability for the estimated inverse weights.
Han (2016) considered the problem of estimating the propensity score in longitudinal studies with dropout using DR estimation. The proposed approach is based on the following calibrated estimating equations: where R t denotes the indicator of observing a subject at period t = 1, … , T, that is, R t = 1 if the subject is observed or is still in the study at period t and R t = 0, otherwise. Here,̂t( t ) is the calibration of the propensity score t = P(R t = 1 | R t−1 , X t ), = ( 1 , … , T ) is a calibration parameter and is estimated sequentially by solving (12). Two calibration techniques considered, multiplicative and additive, are defined aŝ respectively, where the noncalibrated̂t is obtained by maximizing the partial likelihood. Finally, g t−1 is a function of X t−1 which includes 1,̂t and the estimate of E(Y | X t−1 ). Although the proposed system of estimating equations is similar to (6), the main goal of Han (2016) is to achieve intrinsic efficiency of the AIPW estimator and not directly stability of the inverse probability weights. Intrinsic efficiency guarantees that, when the missingness model is correctly specified, the estimator̂attains the minimum asymptotic variance among the class of AIPW estimators regardless of whether the considered model for the outcome is correctly specified. The calibration of the propensity score also requires estimation of the noncalibrated propensity scoreŝt using maximum likelihood. Moreover, the calibrated propensity scores are not guaranteed to lie between 0 and 1. Finally, in contrast to our method, the proposal by Han (2016) requires the estimation of E(Y | X t−1 ). Although the proposed calibration approach focuses on studies with dropouts, we include its adaptation in the numerical study provided in Appendix A. In our study, we use the additive calibration technique since it was recommended by the author. More recently, Kallus and Santacatterina (2018) introduced a method called kernel optimal weighting (KOW) which is based on a quadratic optimization problem. Their method directly estimates the inverse weights by balancing the time-dependent confounders (by using kernels) and simultaneously controlling the precision of the MSM. Specifically, this method optimizes the trade-off between the imbalance (defined as the sum of discrepancies between the weighted observed data and the counterfactual over all treatment regimes) and the precision (defined as the squared distance of the weights from uniform weights).

SIMULATION STUDY
In this section, we perform a simulation study to compare the performance of the proposed estimator SEAVE with that of different estimators of a counterfactual mean . Specifically, in Subsection 5.1, we detail the considered estimators. In Subsection 5.2, we describe the simulation scenarios for the models. We provide additional simulation results in Appendix C (see the supplementary materials).

Considered estimators and study layout
In our numerical study, we consider the following weighted estimators for given nuisance parameter estimateŝ= (̂0, … ,̂T): the Horvitz-Thompson estimator̂I PW (̂) defined in (2) and the sample-bounded IPW (SB-IPW) estimator with normalized weights (Hirano & Imbens, 2001;Hirano et al., 2003): This estimator can be obtained by solving the following score equations: for i = 1, … , n. The considered estimator (13) is sample-bounded in the sense that̂S B−IPW (̂) lies in the range of the outcome Y i . Due to the condition (7) both SB-IPW and IPW estimators are equal if the nuisance parameters are estimated using our proposed approach (6). We examine the numerical performance of these estimators when parameter t is estimated for each period t using the following methods: 1. Maximum likelihood estimator, that is, using standard logistic regression (MLE). 2. Covariate balancing propensity score (CBPS), 3. Our proposed estimator (SEAVE).
In our study, we use the R commands glm and CBPS to obtain MLE and CBPS, respectively. For the CBPS approach we use the argument exact aimed at obtaining "exactly identified" CBPS. We additionally consider the argument over (not reported) which provides "overidentified" CBPS by employing both covariate balancing and score conditions, but this may provide poor finite sample performance (Imai & Ratkovic, 2014), which was confirmed by our simulation results. Finally, we solve the estimating equations of SEAVE by considering the corresponding optimization problems (see Fu, 1998) which are solved using the command nlm. In Appendix E (see the supplementary materials), we provide R-code for a particular simulation setup (e.g., Scenario 1 described below).
Remark 2. If no convergence is attained for solving certain estimating equations of the proposed SEAVE approach using the command nlm, we recommend penalizing the corresponding estimating function using a ridge penalty t , where the penalty parameter has a small positive value (e.g., = 0.001).
To evaluate the performance of a given estimator̂, we consider the following measures: where M is the number of replications,̃= (̂1, … ,̂M) is a 1 × M vector of estimatorsb ased on each Monte Carlo replication. In our numerical evaluation, we also consider the average of the sandwich standard errors (ASSE) and the Monte Carlo coverage of 95% confidence intervals (COV). For the IPW estimator, the standard error̂∕ √ n is calculated aŝ= sd n (U), where sd n is the empirical standard deviation and U is the score function of the IPW estimator. The 95% confidence intervals are calculated as Furthermore, in order to calculate the confidence intervals for the SB-IPW estimator, we consider the estimated standard error̂∕ √ n witĥ where U is the score function of SB-IPW estimator defined earlier and E n is the empirical expected value. We also consider the corrected estimators of the standard errors (accounting for the uncertainty in the estimated propensity scores) by calculating the standard deviations using the influence function (Tsiatis, 2007): where U = U (R T , X T ) is the estimating function of the considered nuisance parameter estimator, as defined in Appendix B (see the supplementary materials) for each estimation method.

Scenario 1
In the first simulation scenario, we consider a simple data-generating mechanism where for each We focus on estimating E(Y (1,1) ) at sample sizes n = 500, 1000, and 2000. The following working models are considered: 0 ( 0 ; X 0 ) = expit( ′

Scenario 3
In the third scenario, we consider a data-generating mechanism where for each i = 1, … , n, we N(0, 1), for k = 1, … , 4, and U 0i = 1, for t = 0, 1, where = (1, −0.5, 0.25, 0.1) ′ and = (27.4, 13.7, 13.7, 13.7) ′ . This mechanism was previously used by Imai and Ratkovic (2015) with three time periods. In our study, we consider only two time periods, because the additional time period leads to fewer observations required for the convergence of the proposed SEAVE approach (considering the presence of the instrumental variables). We focus on estimating E(Y (1,1) ) at sample sizes n = 3000 and 6000. We note that for this simulation setting relatively larger n, since smaller sample sizes lead to unidentified estimates of the proposed method. Following the data generating mechanism, we consider the following working models: . We additionally consider the following working models: , thus, including the baseline covariates in the propensity score model 1 .

Discussion of results
Tables 1-3 report the simulation results based on 1000 replications. We observe that SEAVE outperforms other methods in terms of all statistical measures. More specifically, our proposed methodology provides lower RMSE and MAE than MLE and CBPS. Moreover, we observe that for most of the considered settings, the confidence interval coverage (based on the corrected standard errors) obtained using the proposed SEAVE approach is closer to the nominal level than that calculated using MLE or CBPS methods. We observe that for Scenario 3 (see Table 3) MLE and CBPS methods provide the lowest bias for IPW and SB-IPW estimators, respectively. However, both methods provide high RMSE and MAE as a result of high variability in the estimated inverse weights. Note that in this scenario the covariates X 0 do not affect the treatment variable A 1 . We repeated our numerical study by including X 0 in the estimation procedure of the propensity score 1 (see Table 4). By doing so, we induce an additional balancing condition for the covariate X 0 . Table 4 shows that by including the baseline covariates X 0 , we significantly improve the statistical performance of SEAVE in terms of all measures compared with those in Table 2.
Furthermore, we provide additional results in Appendix C (see the supplementary materials) for the counterfactual mean E (Y (1,0) ) for Scenarios 1 and 3. We observe that the proposed method again provides stable results. Moreover, similar results are obtained for dichotomous outcome, where SEAVE provides desirable results compared with other methods (see Appendix C for more details). (1,1) ). Simulation results based on 1000 replications: Scenario 1  (14) and (15), respectively (see Section 5.1). The corrected standard errors are considered to account for the uncertainty in the estimated propensity scores. Abbreviations: ASSE, average of sandwich standard errors; bias, Monte Carlo bias; CBPS, covariate balancing propensity score; COV, coverage of 95% confidence intervals; IPW, inverse probability weighting; MAE, median of absolute errors; MCSD, Monte Carlo standard deviation; MLE, maximum likelihood propensity score estimator; RMSE, root mean square error; SB-IPW, sample-bounded IPW; SEAVE, Stable Estimand AdaptiVE IPW estimation. (1,1,1) ). Simulation results based on 1000 replications: Scenario 2  (14) and (15), respectively (see Section 5.1). The corrected standard errors are considered to account for the uncertainty in the estimated propensity scores. Abbreviations: ASSE, average of sandwich standard errors; bias, Monte Carlo bias; CBPS, covariate balancing propensity score; COV, coverage of 95% confidence intervals; IPW, inverse probability weighting; MAE, median of absolute errors; MCSD, Monte Carlo standard deviation; MLE, maximum likelihood propensity score estimator; RMSE, root mean square error; SB-IPW, sample-bounded IPW; SEAVE, Stable Estimand AdaptiVE IPW estimation. a No convergence was attained for SEAVE in 24 runs out of 1000. (1,1) ). Simulation results based on 1000 replications: Scenario 3  (14) and (15), respectively (see Section 5.1). The corrected standard errors are considered to account for the uncertainty in the estimated propensity scores.

MARGINAL STRUCTURAL MODELS
In this section, we demonstrate the performance of the proposed SEAVE method for estimating the coefficients of an MSM. This is a model for the causal effect of (time-varying) exposure on the population mean of an outcome, and is defined as, for example,  (14) and (15), respectively (see Section 5.1). The corrected standard errors are considered to account for the uncertainty in the estimated propensity scores. a No convergence was attained for SEAVE in one run out of 1000.
where 1 , … , T+1 are unknown causal parameters. Using the proposed estimating equations (6), the propensity scoreŝr 0 0 (1 −̂0) 1−r 0 , … ,̂r T T (1 −̂T) 1−r T are obtained for each feasible treatment regime r separately. Next, we define the inverse probability weights as for i = 1, … , n. Furthermore, the MSM coefficients = ( 0 , … , T+1 ) are estimated through a weighted least squares regression where W i = w i (R T,i , X T,i ). We demonstrate the performance of the proposed inverse weights for estimating the MSM coefficients and compare with that of MLE and CBPS on a particular numerical study. The weights W i are estimated separably for each possible regime. We use Scenario 1 described in Subsection 5.2.1 and we estimate MSM coefficients at sample sizes n = 500, 1000, and 2000. It can be shown that the MSM is correctly specified under the data-generating mechanism and has the following form: E(Y (r 0 ,r 1 ) ) = 1 + r 0 + 0.5r 1 . Remark 3. Although this article does not focus on estimating the parameters of the MSM, this simple example shows that the stability of the inverse probability weights positively affects the estimated MSM coefficients. This simulation study simply demonstrates the stability of the SEAVE approach which is consistent with the results obtained for the simulation Scenario 1 (see Section 5.2.1).

APPLICATION
In this section, we illustrate the performance of the proposed methodology through an empirical application. We use a data set previously analyzed by Blackwell (2013) who focused on estimating the impact of negative advertisements (when a politician or party focuses on criticizing the opponents rather than highlighting their own qualities) on election outcomes. The data set includes information on 5 weeks leading to the elections from 114 U.S. election races from the period 2000-2006. In our study, we only focus on two time periods: the first week and the last week of the 5-week period. In each week t ∈ {0, 1}, the candidate i may run a negative campaign (R t,i = 1) or remain positive (R t,i = 0). The outcome (Y ) is the "Democratic" percentage of the votes. We estimate the expected percentage of the votes under two regimes: running a negative campaign in the first week and in the last week (i.e., r = (1, 1)), and remaining positive in the first week and in the last week (i.e., r = (0, 0)). The following covariates are considered: campaign length, share of the polls, type of office, baseline share, and baseline proportion of undecided voters (observed at the first week of the considered five week period). We estimate counterfactual mean 0 using MLE, CBPS, and SEAVE methods. Table 6 summarizes the results. We observe that under the regime r = (1, 1) the estimators show roughly the same results. On the other hand, we observe that under the regime r = (0, 0) the proposed SEAVE method shows relatively lower corrected standard error.
The results show that under the regime r = (0, 0) the inverse probability weights estimated using the proposed SEAVE approach are slightly more stable than those estimated using MLE or CBPS methods. Additionally, we repeat the same calculations by considering three time periods: the first, the third, and the last week of the 5-week period leading to the elections. We estimate the expected percentage of the votes under two regimes: running a negative campaign in three periods (i.e., r = (1, 1, 1)), and remaining positive in three periods (i.e., r = (0, 0, 0)). The considered covariates are the same as above. observe that under the regime r = (0, 0, 0) the proposed SEAVE method shows relatively lower corrected standard error. In our study, we do not consider all five time periods since this leads to convergence issues for the considered approaches.
Remark 4. The results show that with three time periods the CBPS method provides propensity score estimates close to one. This leads to extreme target estimates and standard errors.

DISCUSSION
In this article, we have developed a new approach for estimating the nuisance parameters indexing the propensity score model in IPW estimators for longitudinal studies. The proposed method is based on a system of estimating equations and guarantees stability of the inverse weights at each time point. Moreover, it provides balancing conditions on the covariates. Unlike other existing estimation methods, our proposed approach adapts to the estimand, that is, depending on the target of interest, the corresponding nuisance parameter is obtained by solving different estimating equations. The IPW estimator that employs the proposed inverse probability weights equals the efficient and robust AIPW estimator when the outcome model is linear. Through an extensive numerical analysis we have demonstrated that our proposed method provides more stable IPW estimation than MLE and CBPS approaches. In our study, we considered the estimation of counterfactual means. In addition, we have demonstrated the performance of the considered methods on a simple example of fitting MSMs. Despite desirable features of our proposed methodology, an important limitation is that solving (6) requires a separate model in each time period, and that (6) moreover may become difficult to solve at the larger time points where the observed treatment regime becomes sparse. The proposed approach may therefore not be flexible in problems where the number of time periods is large or there are many confounders in the study. In future work, we will remedy this by applying generalized method of moments (Hansen et al., 1996) to solve the estimating equations in (6) simultaneously. This will significantly increase the usability of the proposed method in settings such as the estimation of MSMs.

ACKNOWLEDGMENTS
This work is supported by the FWO research project G016116N. The authors thank the editor, the associate editor, and two anonymous referees for their helpful comments that led to an improvement of this article.