Principal stratification for quantile causal effects under partial compliance

Within the principal stratification framework in causal inference, the majority of the literature has focused on binary compliance with an intervention and modelling means. Yet in some research areas, compliance is partial, and research questions—and hence analyses—are concerned with causal effects on (possibly high) quantiles rather than on shifts in average outcomes. Modelling partial compliance is challenging because it can suffer from lack of identifiability. We develop an approach to estimate quantile causal effects within a principal stratification framework, where principal strata are defined by the bivariate vector of (partial) compliance to the two levels of a binary intervention. We propose a conditional copula approach to impute the missing potential compliance and estimate the principal quantile treatment effect surface at high quantiles, allowing the copula association parameter to vary with the covariates. A bootstrap procedure is used to estimate the parameter to account for inflation due to imputation of missing compliance. Moreover, we describe precise assumptions on which the proposed approach is based, and investigate the finite sample behavior of our method by a simulation study. The proposed approach is used to study the 90th principal quantile treatment effect of executive stay‐at‐home orders on mitigating the risk of COVID‐19 transmission in the United States.

Within the principal stratification framework in causal inference, the majority of the literature has focused on binary compliance with an intervention and modelling means.Yet in some research areas, compliance is partial, and research questions-and hence analyses-are concerned with causal effects on (possibly high) quantiles rather than on shifts in average outcomes.Modelling partial compliance is challenging because it can suffer from lack of identifiability.We develop an approach to estimate quantile causal effects within a principal stratification framework, where principal strata are defined by the bivariate vector of (partial) compliance to the two levels of a binary intervention.We propose a conditional copula approach to impute the missing potential compliance and estimate the principal quantile treatment effect surface at high quantiles, allowing the copula association parameter to vary with the covariates.A bootstrap procedure is used to estimate the parameter to account for inflation due to imputation of missing compliance.Moreover, we describe precise assumptions on which the proposed approach is based, and investigate the finite sample behavior of our method by a simulation study.The proposed approach is used to study the 90th principal quantile treatment effect of executive stay-at-home orders on mitigating the risk of COVID-19 transmission in the United States.stratification framework in causal inference was outlined by, 1 although some previous work (eg, [2][3][4][5] ) can be viewed as implicit applications of the idea of principal stratification.The principal effect with respect to a given stratification is defined as a comparison of potential outcomes under two competing or candidate treatments (eg, standard versus new intervention status) within a principal stratum, or more generally as a comparison of the distribution of potential outcomes within strata (eg, quantile causal effects).Therefore, conditioning on principal strata yields sub-group causal effects.Going forward, we shall use the terms "treatment" and "intervention" interchangeably.In the context of the motivating COVID-19 example, the binary intervention is the issuing (or not) of a stay-at-home order.
A particular implementation of principal stratification analysis focuses on intervention or treatment compliance behavior, such that we assume that each unit of analysis has a latent, pre-intervention propensity to comply with their assigned intervention status.That is, in the simple binary intervention case, how one unit would comply under each of the two possible levels of intervention can be viewed as a bivariate pre-intervention covariate consisting of a pair of potential compliances.According to this bivariate characteristic, we can classify the units into four principal strata: compliers, never-takers, always-takers, and defiers; 3,5 the causal effects in each principal stratum are then studied.Principal stratification typically requires prediction of the units' missing potential compliances so as to determine to which principal stratum each individual belongs, as well as prediction of units' missing potential outcomes.However, both of these quantities are not fully observed in general and thus estimation must involve techniques for incomplete data.In a traditional principal strata analysis, compliance is binary, and estimation often focuses on the complier average causal effect (CACE).In the special case of randomized trials with non-compliance, 3 identified the CACE by considering the randomized treatment assignment as an instrumental variable.Both Bayesian approaches 4,6 and likelihood methods 7 have been proposed for estimating causal effects in the presence of binary compliance. 8and 9 further developed methods for the CACE estimation with both non-compliance and missing data.In the context of instrumental variables, 10 considered multivalued instruments, which are analogous to the (vector) compliance variable in this work, combined in a two-stage least-square procedure with a monotonicity assumption.This monotonicity assumption, introduced by, 11 means that all of those who are affected by the exposure are affected in the same way.In their work, continuous weather-based instruments were turned into three dummy variables, which were used to estimate the relationship between weather conditions and the price of fish.
In some cases, the compliance status should be thought of as continuous.For example, in a clinical trial, a patient may take a portion of a drug or placebo (eg, a fraction of assigned pills taken).In the motivating COVID-19 data analysis, the compliance is taken to be the population stay-at-home rate, which is continuous and partially observed.The principal stratum is given by a pair of potential compliances which indicates a county's compliance status; the sub-group casual effects have a clear scientific meaning that assists in understanding intervention effect heterogeneity.In these cases, the traditional discrete compliance strata (for binary compliance) cannot be applied directly; rather the principal stratification must capture a continuum of compliance behaviors.Since the two potential compliances are never jointly observed, a model for the joint distribution must be assumed, which will typically depend on unknown association parameters.One of the earliest analyses involving continuous partial compliance was carried out by, 12 and relied on the assumption that the potential compliances are linked by an equipercentile equating function that allows one to impute missing compliance status. 13formulated a weaker assumption of the potential compliances, the negative side-effect monotonicity, which assumes that the amount of drug patient i takes under treatment will be at most the amount of placebo he or she takes under control.They proposed a Bayesian approach to estimate the average treatment effect (ATE) based on principal stratification on both compliance to drug and compliance to placebo, assuming strong access monotonicity. 14estimated the same principal causal effect parameter as in, 13 and considered using the Plackett copula, estimating its parameter by a profile likelihood via the EM algorithm., 15 in contrast, treated the copula parameter as known, then varied it in a sensitivity analysis.However, identifiability of their proposed models was not investigated in these previous works.Lack of identifiability of the copula parameters could lead to misspecified joint distributions, and it is known that imputing missing compliances using misspecified models can introduce bias or inconsistent estimation. 16In addition, both 14,15 restricted the association parameter to be constant, even though it is plausible that it depends on covariates.
In addition to focusing primarily on binary interventions, the existing literature on principal stratification has been mainly concerned with the ATE within a principal stratum (eg, the CACE).However, in some research areas, the primary interest is in causal effects in the tails of the outcome distribution.While an intervention that affects the tail of the distribution of an outcome of interest may also result in the changes in the mean, shifts in the tail itself may be of interest.For example, in perinatal epidemiology, we are especially concerned with understanding the determinants of the lower tail of the entire birth weight distribution so as to minimize occurrences of low birth weight.In the motivating analysis for this work, we are interested in the upper quantiles in the number of new cases of the coronavirus disease 2019 (COVID-19), as it is this quantity rather than the mean that is key for resource planning, for example, expanding hospital bed capacity or making arrangements to move patients to other regions, ramping down non-emergency procedures, and so on.
In this article, we thus focus on the conditional quantile treatment effect (C-QTE) 17 where the conditioning is on a principal stratum and some covariates; we call this the principal QTE (P-QTE).We consider continuous potential compliances, and propose a copula-based approach similar to 14 in the sense that association of observed compliances is modeled through a copula.Based on a different data structure where some pairs of compliances are observed, our method differs from theirs also because we allow the copula parameter to depend on covariates.To address identifiability of the copula parameter of the two potential compliances, we shall assume: (i) each unit in the data set is measured both under the state of being intervened and not being intervened upon, though at different times, and (ii) the copula parameter between the two potential compliances is time-invariant.These assumptions ensure that we can fit a bivariate copula model and use it to impute the missing potential compliance.Note that these assumptions are made based on the motivating COVID-19 data structure which we will describe in detail in Section 2; in mimicking the COVID-19 example structure, some generality is lost.We discuss how these assumptions can be relaxed and the proposed method can be applied more broadly in the conclusion section and Section S.1 of the Supporting Information.Thus, while motivated by a particular application in COVID-19, our proposed approach is applicable to a wider set of real-world problems satisfying the same data structure as the COVID-19 example or a more general data structure requiring a relaxed set of assumptions.Herein, we adopt a bootstrap approach introduced by 18 to estimate the standard errors with single imputation.The finite-sample performance of the proposed method is illustrated and investigated through a simulation study.
This article is organized as follows.Section 2 describes the notations, data structure and main assumptions.Section 3 outlines the proposed methodology.Section 4 illustrates and compares the proposed methods through a series of simulations.In Section 5, we apply the proposed method to study the 90th P-QTE of widely implemented statewide executive "stay-at-home" orders in the United States to mitigate the risk of the COVID-19 transmission.We conclude in Section 6.The Supporting Information contains technical details and materials including sampling procedure for Archimedean copulas, additional simulation results and figures.

DATA STRUCTURE
][21] In this section, we frame the data structure based on the motivating COVID-19 example in which the intervention is the issuance of stay-at-home orders.Assuming each unit is measured both under the status of being intervened and not intervened upon at each of two fixed calendar times, t 0 and t 1 , we use a superscript t to represent a fixed calendar time.Let Z t represent the intervention level at time t.In binary intervention settings, Z t takes two values; Z t = 1 if a given unit is intervened upon, and Z t = 0 otherwise.Let Y t denote the observed outcome and Y t (0) and Y t (1) be, respectively, the two potential outcomes under the state of either being not intervened or being intervened upon at time t for t ∈ {t 0 , t 1 }.Let S t represent the observed compliance and S t (0) and S t (1) be the two potential compliances under the two intervention levels at time t for t ∈ {t 0 , t 1 }.In the COVID-19 example, S t (0) is the stay-at-home rate without issuing the order.In general, at a fixed time t, both the potential outcomes and compliance are only partially observed because a given unit is either intervened upon or not.We view ( S t (0), S t (1) ) as a bivariate quantification of compliance, where each of S t (0) and S t (1) lies in the range [0, 1], and we allow for the possibility that ( S t (0), S t (1) ) modifies the effect of the intervention upon the outcome Y t .We assume that the latent compliance varies with a vector of observed pre-intervention covariates X t = (X 1 , X t 2 ) ⊤ , in which X t 2 is time-related and X 1 is time-invariant.X t 2 is the compliance-intervention confounder.The data structure is illustrated in a causal diagram presented in Figure 1.We assume that calendar time does not, itself, causally affect the outcome other than through the covariates including the intervention itself.Note that within a principal stratum, that is, controlling for ( S t (0), S t (1) ) , the back door pathway from intervention status Z t across the calendar time to the outcome Y t is blocked.Herein, we assume that there are no unmeasured confounders influenced by calendar time and influencing the outcome Y t , and there is no direct effect from calendar time to outcome.Given the standard assumptions in Section 3.2 and controlling for X 1 , it is not necessary for the variable calendar time to be included in the model.The arrow from the bivairate compliances to the middle of the arrow between Z t and Y t denotes the interactions between potential compliances and the intervention.The interactions together with the main effect of intervention Z describe the heterogeneous causal effects over the principal strata

TA B L E 1
Observed data for principal stratification analyses under partial compliance.

Unit ID
Calendar time t Z t S t (0) surface.In the COVID-19 example, we assume that the stay-at-home order effects vary in different compliance behavior strata.
We illustrate the data structure of Figure 1 with a toy example in Table 1 where a single observation unit has either been intervened upon, or not, at two fixed calendar times t 1 and t 0 .The question mark ("?") denotes unobserved (missing) data.At each of the two calendar times, the two potential compliances for this unit are partially observed.

Principal quantile treatment effect
The principal stratum S P i to which the observation unit i belongs is given by the value of the bivariate potential compliance measurements (S t i (0), S t i (1)) = (s t (0), s t (1)) for any fixed (s t (0), s t (1)) ∈ [0, 1] 2 at time t.Let F Y t (1)|S t (0),S t (1),X 1 and F Y t (0)|S t (0),S t (1),X 1 , respectively, be the conditional CDFs of the potential outcomes Y t (1) and Y t (0) conditional on the principal stratum (S t (0), S t (1)) and covariates X 1 .We shall make the additional assumption that for almost all s t (0), s t (1), x 1 , the conditional CDFs of potential outcomes, F Y t (z)|S t (0),S t (1),X 1 (⋅|s t (0), s t (1), x 1 ) for z ∈ {0, 1}, are continuous and strictly increasing so that their inverse functions are well defined.Thus, for a fixed  ∈ (0, 1), the th principal quantile treatment effect is That is, the difference between the quantiles of the two conditional distributions 22 conditional on a principal stratum (S t (0), S t (1)) = (s t (0), s t (1)) and covariates X 1 = x 1 .Note that with a data structure as in Figure 1, we shall condition on both (S t (0), S t (1)) and X 1 to block the backdoor pathway from Z across X t 2 , (S t (0), S t (1)), and X 1 to outcome Y .Since S t (0), S t (1) cannot be jointly observed at a fixed time t, Eq. ( 1) suffers from lack of identifiability.We thereby have to model the joint distribution of (S t (0), S t (1)) and impute the missing potential compliance.The assumptions required to identify the P-QTE estimand in Equation ( 1) are discussed in Section 3.2.Clearly, the traditional compliance strata (complier, defier, alwaysand never-taker) do not apply when S t (0) and S t (1) are continuous; rather we have a more complex, two-dimensional surface over which to estimate principal causal effects.
For interpretability of the P-QTEs, the continuous surface (S t (0), S t (1)) can be grouped into discrete categories and the distribution of P-QTEs interpreted on each summarized principal stratum, based on scientific understanding and research questions.For example, one may define a "complier" stratum as S t (1)∕S t (0) > r with a pre-specified r ≥ 1.In the COVID-19 analysis, we display the P-QTE distributions with r = 1 (see Figure 2).We note that the proposed estimator in Eq. ( 1) provides the P-QTE distribution on each summarized principal stratum instead of a single value, therefore our method exploits the information contained within the data as much as possible.

F I G U R E 2
The 90th P-QTE estimates with FDR significance correction under a t Copula family.Each point indicates a potential compliance pair (S t i (0), S t i (1)) for each of t = t 0 and t = t 1 for each county i, where for each point, one of the elements of the pair has been imputed.

Assumptions
Standard assumptions are made to ensure a consistent estimator of a causal effect for each observation (ie, unit ID and calendar time as a unique key): (i) stable unit treatment value assumption (SUTVA): For each observation, the potential outcomes are unrelated to the treatment status of other observations (no interference), and there are no different forms or versions of each treatment level that would lead to different potential outcomes (consistency); (ii) Unconfoundedness: the conditional probability of being assigned to each of the treatment levels Z t = 0, 1 (given confounders X t ) is greater than zero.Additionally, we make the assumption that all treatment effects are short-term.This implies that there is no delayed or lagged effect associated with the treatment, the potential outcomes at different times are conditionally independent, given covariates measured at each time.This implies that the distribution of Y t 1 conditionally on (Z t 0 , X t 0 , S t 0 (0), We propose a copula-based approach to estimate the joint distribution of (S t (0), S t (1)).In order to identify the copula, we assume that the dependence structure of ( S t (0), S t (1) ) is time-invariant.Specifically, we postulate that the copula of ( S t (0), S t (1) ) depends only on the time-invariant covariates, and this merely through its parameter(s), which we term as time-invariant non-stationary dependence (TIND) assumption.The TIND assumption implies the dependence of S t (0) and S t (1) is non-stationary and can vary with time-invariant covariates; however, the dependence structure is constant across time.The margins of S t (0) and S t (1) are left unspecified such that the joint distribution of (S t (0), S t (1)) depends on both of the time-invariant and time-related covariates, avoiding in this way any monotonicity assumptions; the details are illustrated in Section 3.3 below.This dependence structure assumption implies that the observed covariates are sufficient for identifying principal strata, which is also known as principal ignorability in. 23Formally, this assumption can be expressed as: The rationale for the TIND assumption and possible extension in the context of the motivating COVID-19 data is discussed in Section 5.4.We further assume that all units in the data have been measured under the two intervention levels, albeit at different times, as is the case in the COVID-19 example.
The assumptions of repeated measurements on the unit, and the assumption that the copula parameter only depends on time-invariant covariates can be relaxed.In Section S.1 of the Supporting Information, we explain how the assumptions and the model developed here can be applied in the context of matched pairs with a generalized data structure.In the conclusion section, we discuss an extension that incorporates time effects into the copula parameter.

Copula imputation
A copula model allows for separate modeling of the marginal components of a joint distribution and its dependence structure, therefore significantly reducing the challenge of constructing an adequate and appropriate multivariate distribution.Sklar's theorem 24 provides the theoretical foundation for the application of copulas by stating that a continuous multivariate distribution can be uniquely characterized in terms of univariate marginal distributions and a copula, which is a multivariate distribution function having uniform [0, 1] marginals, describing the dependence structure between the variables. 25extended Sklar's theorem to conditional distributions, thereby permitting adjustment for covariates.We shall make use of the latter approach below.The identification of principal causal effects relies on modeling the conditional quantiles of potential outcomes featuring in Eq. (1).Within the data structure illustrated in Section 2, for each unit i at each of the two calendar times t 0 , t 1 , the bivariate qualification of compliance (S t i (0), S t i (1)) is partially observed (as in Figure 1).Let F S t (z)|X t denote the conditional CDF of S t (z) | X t for z = 0, 1, where X t = (X 1 , X t 2 ) ⊤ .Following the notation in Section 2, the latent compliance varies with pre-intervention covariates X t = (X 1 , X t 2 ) ⊤ , where X 1 is time-invariant and may affect the dependence between S t (0) and S t (1).Let H(s t (0), s t (1) | x t ) denote the joint distribution of the compliances conditional on X t . 25showed that for each x t in the support of X t , the joint distribution is uniquely decomposed as follows: for all (s t (0), s t (1)) ∈ [0, 1] 2 , where C(⋅|x t ) is the so-called conditional copula.We consider model (2) with h(⋅|x t ) denoting the conditional density of H, and c(⋅|x t ) denoting the conditional copula density.We assume that C(⋅|x t ) depends only on the time-invariant covariate X 1 and this only through its parameter (X 1 ), that is, C(⋅|x t ) = C(⋅|(x 1 )).We further assume that the conditional margins are in a parametric class each and depend on the covariates only through their parameters  0 (x t ) and  1 (x t ) that characterize respectively the conditional marginal densities.To simplify notation, we thus write , for z ∈ {0, 1}, and analogously for the corresponding distribution functions.Consequently, we have that with u t = F 0 (s t (0)| 0 (x t )) and v t = F 1 (s t (1)| 1 (x t )).Although the dependence structure of (S t (0), S t (1)) only relies on time-invariant covariates, note that their joint distribution nevertheless depends on both of the time-invariant and time-related covariates since the margins include both of X 1 and X t 2 ; this is apparent from (3) above.Because for any given t ∈ {t 0 , t 1 }, (S t (0), S t (1)) are not jointly observed, the copula parameter (x 1 ) in model ( 3) cannot be estimated.However, the TIND assumption in Section 3.2 allows us to estimate (x 1 ) from the observed compliance values.Under TIND, when conditioning on X 1 , X t 0 2 , and X t 1 2 , the conditional copula of (S t 0 (0), S t 1 (1)) is identical to the conditional copula of (S t (0), S t (1)) given X t , that is, that it has density c(⋅|(x 1 )) as in model (3).Specifically, for a simple scenario without time-invariant covariates, the copula parameter is a single value implying the dependence between S t (0) and S t (1) is stationary.We also assume that for z ∈ {0, 1}, 2 ).These assumptions allow us to write where ).Because (S t 0 (0), S t 1 (1)) are observable, the model in (4) can be estimated from the observed compliance values.The resulting estimates of  0 (x t ),  1 (x t ), and (x 1 ) can then be used in model (3) to impute the missing compliances.
There are various ways how the conditional marginals can be modeled; in the context of the data application considered here we assume that the conditional marginal parameters depend on the covariates through a linear predictor and suitable known link function(s).This is detailed in Sections 4 and 5.In this section, we instead focus on the estimation of the conditional copula parameter.Here we assume that (x 1 ) = g −1 ((x 1 )), where g ∶ Θ → R is a known link function which ensures the correct range of the copula parameter  ∈ Θ.For example, for Gaussian and t copulas, we use g −1 (a) = (exp(a) − 1)∕(exp(a) + 1) such that  ∈ (−1, 1).As for the calibration function , we consider a linear predictor (x 1 ) = x ⊤ 1 , where  is a vector of parameters to be estimated.Under the mild assumption that the marginal and copula parameters are distinct, we can estimate  using a two-step approach. 26In this approach, the conditional marginals are fitted first using the observed compliance pairs {s i (1)}, i = 1, 2, ..., n, yielding consistent marginal estimates αz (x t ), z ∈ {0, 1}.In the second step,  is estimated by maximizing the pseudo-log likelihood where and ).If the relationship between  and X 1 is more complex and varies with an arbitrary set of covariates in a parametric, nonparametric and/or semiparametric fashion, one could adopt the local likelihood-based model proposed by 27 or the generalized additive model proposed by 28 and. 29nce the model in (4) has been fitted, we can use it to impute the missing compliances.To this end, set U t = F 0 (S t (0)| 0 (x t )) and V t = F 1 (S t (1)| 1 (x t )) for t ∈ {t 0 , t 1 }, and note from Definition 1 in 25 that the conditional copula of (S t (0), S t (1)) given X t is the conditional distribution of (U t , V t ) given X t .This allows us to impute the missing potential compliance by the inverse probability transform evaluated at the conditional mean of the underlying conditional copula.For instance, when S t 1 (0) is missing, S t 1 (1) is fully observed and we can construct the pseudo-observation ). Subsequently, we can impute Û t 1 by the conditional mean of U t 1 given V t 1 when (U t 1 , V t 1 ) has distribution C(⋅| θ(x 1 )).Finally, we impute the missing S t 1 (0) by In Section 4, different copula families are investigated, and we approximate the conditional mean by a Monte Carlo approach.After imputing the missing compliance, we can estimate the th principal QTE given in Eq. ( 1) using quantile regression.

Bootstrap variance and confidence interval estimation
Single imputation using the mean leads to underestimated standard errors because imputation error is not taken into account. 30However, a bootstrap method proposed by 18 can be used to estimate the standard errors and confidence intervals even with single imputation, in an approach that performs imputation of the bootstrapped data in the same way as in the original data set.The method is valid irrespectively of the imputation method (random or nonrandom, proper or improper), or the type of parameter of interest (smooth or nonsmooth) provided the data set carries identification flags to locate missing values. 18The procedure is detailed in Algorithm 1 of the Supporting Information.In the COVID-19 example, we control the false discovery rate using the 31 procedure to correct the statistical significance of the estimated P-QTEs.

Comonotonic and non-comonotonic relationship
Both 12 and 13 assumed monotonic relationship between drug and placebo compliances, though the functional forms are different.The so-called equipercentile equating of compliances assumption which was first used by 12 and defined by, 13 albeit based on a different data structure, can be transposed into the current setting.This results in what we call a comonotonic relationship; it assumes that for any z ∈ {0, 1} and for any value ) almost surely.The missing compliances can then be easily imputed, as we explain in Section 4.However, as we illustrate therein, inference based on a comonotonic relationship leads to biased results if this assumption is violated.This is problematic since a comonotonic relationship cannot be validated from observed data and may be overly restrictive. 13n contrast, the procedure advocated in this paper is applicable in more general settings in which a comonotonic relationship may or may not hold.Indeed, a comonotonic relationship implies that conditionally on X 1 , X t 0 2 and X t 1 2 , for z ∈ {0, 1}, the variables S t 0 (z) and S t 1 (z) are comonotonic.Because copulas are invariant with respect to increasing transformations, this also means that conditionally on X 1 , X t 0 2 and X t 1 2 , the copula of (S t (0), S t (1)) is the same for t ∈ {t 0 , t 1 }.If we assume the latter to be C with parameter  C (x 1 ), then we also have that the conditional copula of (S t 0 (0), S t 1 (1)) is C with parameter  C (x 1 ), and the procedures proposed here can alternatively be used to estimate  C (x 1 ).As we show in Section 4, proceeding this way is not sensitive to the comonotonicity assumption.Note also that the comonotonicity assumption along with the conditional copula of (S t 0 (0), S t 1 (1)) in fact specifies the conditional copula of the entire vector (S t 0 (0), S t 0 (1), S t 1 (0), S t 1 (1)), while the procedure proposed here specifies the conditional dependence structure only in part.
For example, as we illustrate in Section 4, the assumptions in Section 3.3 hold also under the so-called non-comonotonic relationship.

Simulation scheme
In this section, a series of simulation studies is carried out to illustrate the finite sample performance of the proposed method in Section 3. We consider the Gaussian, t, Clayton and Gumbel copula families.These chosen families cover a wide range of situations: Gaussian and t copulas are symmetric while the t copula exhibits upper and lower tail dependence; Clayton and Gumbel copulas are not radially symmetric and known to have respectively strong and weak lower tail dependence.To investigate the impact of the copula family misspecification, a sensitivity analysis is further conducted with the Gaussian copula as the underlying true family.We compare the results when the copula family is correctly specified to when it is misspecified.Throughout the simulations, the marginal models of S t (0) and S t (1), and the calibration function are correctly specified.We first generate n different unit IDs.Note that each unit is observed both under two intervention states (at, we assume, different points in time), thus the sample size is N = 2 × n.For each unit i, i = 1, 2, ..., n, we generate a vector of time-invariant pre-intervention variables X i,1 = (X i,11 , X i,12 ) ⊤ , simulated from Bernoulli distributions with success probability 0.6 and 0.4, respectively.We then generate the time-related covariate X t 0 i,2 from N(1, 1) (ie, when Z i = 0), and X from N(0, 1) (ie, when Z i = 1); consequently each unit i is observed twice, once at each of two intervention levels 0,1, each corresponding to the two calendar times t 0 and t 1 .By generating the data in this way, the data have the structure illustrated in Figure 1 and Table 1.The simulation scheme is designed to mimic the U.S. county-level COVID-19 data described in Section 5.
Since the accuracy of the estimated calibration function η is not directly comparable across different copula families, we parametrize each copula family considered in this study by Kendall's measure of association  C ; this is possible because the copula parameter is in one-to-one correspondence between  and  C for each of the families considered in this study. 32In other words, we assume henceforth that  =  C .][35] Note that we assume the correlation between the compliances S t (0) and S t (1) only depends on time-invariant covariates X 1 = (X 11 , X 12 ) ⊤ ; we specify the calibration function for the Kendall's tau  C (x 1 ) as (x 1 ) =  0 +  1 x 11 +  2 x 12 .To generate a full data set, we first generate the data (u i , v i |x i,1 ) from a copula model with parameter  C (x i,1 ) = g −1 ((x i,1 )) for i = 1, .., n.The inverse link function is chosen as g −1 (t) = (exp(t) − 1)∕(exp(t) + 1) such that Kendall's tau will be in the correct range.
As noted in Section 3.5, we consider two scenarios.To simulate from a comonotonic relationship, we generate (u i , v i |x i,1 ) from C with parameter  C (x i,1 ) for i = 1, .., n, and set )) for t ∈ {t 0 , t 1 } and i ∈ {1, … , n}.For a non-comonotonic relationship, we generate (u i from C, and similarly for v To compare the approach under different true copula families where analyses correctly specify the true underlying model, we set ( 0 ,  1 ,  2 ) = (1.25,1.5, 1) with true  C varying from 0.55 to 0.95; for the copula family misspecification analysis, ( 0 ,  1 ,  2 ) = (1, 0.3, 0.3) with true  C varying from 0.46 to 0.66.
The conditional marginal distribution of S t (0) and S t (1) given X t = (X 1 , X t 2 ) ⊤ is set to follow a Beta distribution.Specifically, we use the reparameterization where  t z is the mean and  t z is the dispersion parameter; the variance is given by  The outcome is generated by where  is the error term arising from a Pareto distribution with location parameter  m = 1 and shape parameter  = 3; the coefficients are set to  k = 1 for k = 0, 1, … , 8.An additional scenario with a mix of positive and negative coefficients can be found in the Supporting Information.Since the potential compliance pair (S t (0), S t (1)) is partially observed, we remove s t 0 i (1), s t 1 i (0) and obtain simulated observed data.That is, while in simulation we generate complete data including both potential compliance values at each observation time point; for the analysis step, we mimic the real data, which permits us to "observe" only the compliance value consistent with the assigned intervention at each observation time.
According to the discussion in, 36 Eq. ( 7) is a location shift model restricted with a homogeneous principal intervention effect within a principal stratum, that is, there is no interaction between Z t and covariates X 1 .Thus the th P-QTE on a principal stratum (s t (0), s t (1)) is given by  1 +  2 s t (0) +  3 s t (1) for any fixed  ∈ (0, 1).In the imputation procedure, we use copula and comonotonic imputation methods both for the comonotonic and non-comonotonic simulation schemes.More specifically, for the comonotonic imputation method, we assume the comonotonic relationship is correctly specified.Details of the comonotonic imputation method are provided in the Supporting Information.

Results
A correctly specified Beta regression is used to estimate the marginal Beta distribution parameters.We adopt a two-step approach to ensure a consistent copula estimator, where the marginal models are first fitted and the estimated {( ũt 0 i , ṽt 1 i ) ∶ i = 1, 2, ..., n} are formed; in the second step, the standardized component-wise ranks of the latter, {(û n}, are used as pseudo-observations from the copula C. 37 For each of the copula families, we impute the missing values using the empirical conditional means.Note that for the Gumbel copula in particular, sampling from the conditional distribution involves numerical root finding in each component and hence is highly computationally intensive.Details of the sampling procedures for Archimedean copulas are provided in the Supporting Information. Under each copula family, we conduct 1,000 experiments, each with n = 3,000 IDs (ie sample size N = 6,000) and 1000 bootstrap replications, to estimate the quantile regression coefficients  1 ,  2 ,  3 , and the P-QTE Δ S P () at a select arbitrary principal stratum s t (0) = 1, s t (1) = 1 at relatively high quantiles (eg, 80th and 90th), and compare the mean squared error (MSE), bias, variance and confidence interval coverage.The simulation results at the principal stratum s t (0) = 1, s t (1) = 1 are summarized in Tables 2, S.2-S.10,S.13-S.14.In the Supporting Information, we also considered a different value s t (0) = 0.8, s t (1) = 0.3 (Table S.11-S.12) to demonstrate that there was nothing particular to the choice of 1.The comparison results of the two imputation methods (ie, comonotonicity and copula) based on comonotonic and non-comonotonic simulation schemes are listed in Tables S.2-S.7 in the Supporting Information.We see that when data are generated under a comonotonic scheme, the comonotonicity imputation method performs better than the copula imputation method, but differences are not large.However, under the non-comonotonic simulation scheme, the comonotonicity imputation method performs much worse than the copula method, exhibiting larger bias (approximately tenfold) and poor confidence interval coverage; results of MSE and bias at the 90th quantile are even worse than these at the 80th quantile.
Herein, we display the results based on the non-comonotonic simulation scheme.Simulation results for the copula family comparison are listed in Tables 2, S.8, and S.13-S.14 of the Supporting Information with different coefficient settings; results for copula family misspecification are listed in Tables S.9-S.12 of the Supporting Information.From Tables 2, S.8, and S13-14, we see that for all the copula families, estimation using the imputed data performs reasonably well.In the misspecification analysis with Gaussian as the underlying copula, t and Gaussian copula models perform approximately equally well.Since the Clayton copula, having strong lower tail dependence, is less close to the Gaussian copula than is the t copula, it yields worse results in terms of MSE, bias and 95% confidence interval coverage.

HIGH P-QTE OF STAY-AT-HOME ORDERS ON COVID-19 TRANSMISSION
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes COVID-19, is thought to spread mainly through close contact from person to person by respiratory route; community mitigation activities can slow the risk of a new virus transmission by limiting or preventing person-to-person interactions. 38In March 2020, the U.S. states † Note: The Δ S P () is calculated for a select arbitrary stratum (s t (0) = 1, s t (1) = 1).
and territories began widely implementing various community mitigation activities, including closing non-essential business and issuance of orders encouraging (and in some cases, mandating) residents to stay at home.As of May 20, 2020, each state that had imposed a stay-at-home order had begun lifting the restriction of business and public spaces.We therefore consider two intervention levels: being intervened upon when states issued the stay-at-home order, and being not intervened upon when the order was lifted.It is probable that areas with fast-moving pace of infection are more likely to have higher demand for hospital beds and medical care, resulting in healthcare system overload.Therefore, understanding the impact of the extent to which people were complied to those mitigation activities (ie, population stay-at-home rate) on the upper quantiles (eg, 90%) of the distribution of confirmed COVID-19 cases is important for policy-making and health system intervention.We wish to estimate the impact of a stay-at-home order on the propagation of COVID-19 while taking into account heterogeneity in county compliance to such public health orders, recognizing that compliance may be affected by local political leanings and other important covariates such as current weather conditions.

Data description
The data that we used to undertake the analysis are county-level measurements drawn from multiple publicly accessible sources.The USAFacts (https://usafacts.org)COVID-19 time series data are gathered from the Centers for Disease Control and Prevention (CDC), and state-and local-level public health agencies.County-level data were confirmed by referencing state and local agencies directly.The daily percentages of people staying at home were obtained from the Bureau of Transportation Statistics. 39 County-level returns for presidential elections in 2016 were collected and maintained by the MIT Election Data and Science Lab. 40Counties are classified into two groups, Republican-led or Democrat-led, by the votes to candidates.We gathered the county-level monthly average temperature data from the U.S. Climate Divisional Dataset (nClimDiv), which is based on the Global Historical Climatological Network-Daily (GHCN-D) using a 5km grid resolution. 41The geolocation information (ie, latitude and longitude) and county rurality level as of the 2010 Census were retrieved from the U.S. Census Bureau.
The study unit is each county in the United States excluding eight states-Arkansas, Iowa, Nebraska, North Dakota, South Dakota, Oklahoma, Utah (except three counties which issued the order) and Wyoming-that did not issue a statewide stay-at-home order, along with Alaska and Hawaii which do not border any other states.Two intervention levels are considered: being intervened upon when states issued the stay-at-home order, and not intervened upon when the order was lifted.We define the intervened period (t = t 1 ) as of April 6-12, and the not intervened period (t = t 0 ) as of June 15-21; all counties included in the analysis were 'assigned' the two intervention levels, and these are associated with calendar time.Therefore for all the counties, Z t 0 = 0 and Z t 1 = 1.The potential compliance ( S t (0), S t (1) ) , which is partially observed (as shown in Table 1), is measured by the weekly stay-at-home rates under the conditions of being not intervened and intervened upon to a stay-at-home order.The variable S t (0) describes the potential compliance to the stay-at-home order if not intervened upon, and S t (1) denotes the potential compliance to the order if being intervened upon.While we refer to S t (0) as the compliance to being not intervened upon, in fact this represents the stay-at-home rate in the absence of a stay-at-home order.The time-invariant covariates include the county rurality level (urban or rural), the election results (Republican-or Democrat-led), and the county longitude and latitude.The only time-varying continuous covariate is the monthly average temperature.

Analysis
We assume that the copula between S t (0) and S t (1) within a county is time-invariant, i.e. it only depends on the time-invariant covariates, while the marginal distributions F 0 and F 1 depend on both time-invariant and time-related covariates.A two-week incubation period after (non)intervention is considered.To account for the time impact on the progression of the pandemic, for each county, we use the weekly average of new cases in the most recent four weeks as the baseline.The outcome of interest Y t , which we term the "case rate change," is defined as the difference between the weekly new cases after the incubation period and the baseline per 1,000 inhabitants.We assessed the 90th P-QTE of a stay-at-home order on case rate change in different principal strata, controlling for election results and county rurality.
We randomly drew 60% of the data, thereby breaking the strict spatial relationship between the counties in the analytic sample.In total, 1,550 counties were randomly selected, resulting in 3,100 observations.The marginals are fitted by Beta regression models, using BIC and likelihood ratio test to perform model selection.The resulting coefficient estimates α0 , α1 are plugged into the CDF of a Beta distribution to transform the response variables S t (0) and S t (1) to a uniform scale; we use the standardized ranks of the uniformly scaled variables as the pseudo-observations from a copula C which depends on time-invariant covariates.We then impute the missing potential compliance using the proposed method with empirical conditional mean imputation.For comparison, the copula models are fitted under the Gaussian and t families; for each family, we estimate and select the calibration function using BIC.

Results
The scatterplot and histograms of the two observed compliances S t 0 (0), S t Since the results from the two copula families are similar, which was also seen in Section 4, herein we only present the t copula imputation results.Diagnostic plots of copula models and the results from Gaussian copula imputation are provided in the Supporting Information.Figure 2 displays the estimated 90th P-QTE surface, with smaller values indicating stronger principal intervention effects.We correct the statistical significance of the estimated P-QTEs using the 31 procedure by controlling the false discovery rate (FDR) at 0.05 significance level.Among those estimated 90th P-QTEs, fewer than 0.2% are positive and none of these is statistically significant after the FDR correction.The dashed diagonal line divides the counties into two groups: those more likely to comply (ie below the line where S t (1) ≥ S t (0)) and those less likely to comply (above the line where S t (1) < S t (0)).As seen in the boxplot within the scatterplot, the 90th P-QTE estimates are smaller within those counties more likely to comply than those less likely to comply, suggesting stronger stay-at-home order effects in the former group.Figure 3 displays the map of the 90th P-QTE estimates with FDR correction for the selected counties in the analysis; a blank county is either in the excluded states, an unsampled county, or one where the estimated effect is not statistically significant.

F I G U R E 3
The 90th P-QTE estimates which are statistically significant with FDR significance correction under a t copula family.
Estimates displayed correspond to the time when states issued the stay-at-home order.A blank county is either in the eight excluded states, an unsampled county, or one that is not statistically significant.

Discussion
In the boxplot in Figure 2, the median P-QTE value of the 'more likely to comply' group is-0.94;this can be interpreted as, for a specific county at a compliance principal stratum s t (0) = 0.24, s t (1) = 0.26, the stay-at-home order reduced (approximately) 1 new case per 1000 individuals in the population at the 90th quantile.Moreover, the "compliant counties" may be defined as S t (1)∕S t (0) > r with a prespecified r ≥ 1 or even more dummy levels based on scientific understanding, as noted in the discussion in Section 3.1.Our method therefore provides a more informative complier QTE distribution instead of a single complier causal effect value.In this analysis, the assumption that the copula parameter depends only on time-invariant covariates may seem restrictive.To assess its validity, the possibility that the dependence structure of the potential compliances is related to time or time-related covariates was investigated.First, adding X t 0 2 and X t 1 2 (ie monthly temperature at time t 0 and t 1 ) as two independent covariates into the calibration function to impute the missing potential compliances would be illogical and hard to interpret since doing so would imply that the dependence structure of (S t 0 (0), S t 0 (1)) at time t 0 is affected by future values X t 1 2 .However, one can include a time lag variable or the difference of time-related covariates, namely, X 2 , as a covariate into the copula model for (S t 0 (0), S t 1 (1)).When imputing the missing potential compliance at time t ∈ {t 0 , t 1 }, the difference covariate X t,t 2 would then be set to 0, and this would allow for the association between two potential compliances at different times being distinct from that at the same time (eg, allowing for stronger dependence at the same time).In the present data set, including the temperature difference tends not to improve the copula model, however; this is detailed in Section S.5 of the Supporting Information.Another extension, applicable also in a more general setting with repeated measurements at different times t 0,i and t 1,i , would be to include the time lag t 1,0,i = t 1,i − t 0,i as a covariate in the copula model for (S t 0 (0), S t 1 (1)).Similarly to the preceding proposal, t 1,0,i indicates the time lag effects on the association between two potential compliances.Including the time lag is however not meaningful for the COVID-19 data, since the two time points are fixed and identical to all counties.These investigations led us to proceed with the intial assumption that the conditional copula of (S t 0 (0), S t 1 (1)) depends only on the time-invariant covariates.In the application, we assume the stay-at-home rates from the data source are accurately measured with good data quality.However, if there are systematic errors in the data, this could impact the copula model or the outcome model and introduce bias to the principal QTE estimates; this requires further study.
In the COVID-19 analysis, the marginal models are not perfect; the Normal Q-Q plots display heavy tails and not too much inference could be made in the tails, thus interpretation of those estimates in the upper tails in Figure 2 need to be made with caution.We assumed there are no different versions of the stay-at-home order in order to satisfy the SUTVA causal assumption.However, since we only have state-level order issuing and lifting date data, this assumption is relatively strong considering that the strength and types of those orders may have varied at the county-level within states or between states.Furthermore, while we used simple random sampling to detrend any spatio-temporal correlation along with direct covariate adjustment for longitude and latitude, more sophisticated approaches to address residual spatial correlation among compliance level and outcomes could have been employed.In causal inference literature on spatial interference, a key challenge is that the treatment assigned to one location affects the outcome at other locations.Several authors have proposed methods to limit the form and spatial extent of interference, such as partial interference (eg, 42 ) and network approaches (eg, 43 ).In the context of COVID-19 as an infectious disease with potential spillover effects and different types and strengths of stay-at-home orders, spatial dependence was likely to have occured both in the compliance copula model and the outcome model.For example, population mobility increases interactions among people residing in different counties and is likely to cause further virus transmission; population compliance behaviors may be affected by the stringency of orders issued in neighbouring counties.One potential extension of our method could consider a Bayesian hierarchical approach, with parameters varying across counties capturing spatial correlation between compliance and outcomes.This would impose a spatial dependence structure among counties, (possibly) requiring some additional assumptions and data sources.
Finally, in assessing the assumptions needed for this analysis, we observed that the density plot of the propensity scores by exposure groups displayed some lack of overlap at either extreme of the propensity scores, suggesting the possibility of violations of the positivity assumption (see Figure S.10 in the Supporting Information).We therefore undertook a sensitivity analysis by excluding observations with extreme the propensity scores, as explained in Section S.5.2.The findings from the sensitivity analysis were consistent with those from the primary analysis.

CONCLUSION
Building statistical models within a principal stratification framework requires a set of assumptions.When the latent compliance is continuous with partial compliance, assumptions are especially important to identify the association parameter.The approach illustrated in this article relies on some specific data structures such as the availability of measures for each unit under each of the possible intervention values, while it relaxes the assumptions needed to model continuous principal strata and allows the exploitation of the information contained within the data as much as possible.
We propose a conditional copula approach to impute the missing potential partial compliance and estimate the principal quantile treatment effect surface at high quantiles.In order to capture the association structure between two potential compliances, a conditional copula is used to allow the copula parameter to change according to a calibration function of the measured covariates, under mild assumptions.We focus on the linear calibration function for simplicity, but it this can be extended to more complicated dependence structure such as a local likelihood-based model 27 or a generalized additive model. 28,29In practice, the underlying relationship between compliances S t 0 (z), S t 1 (z), z = 0, 1 is unknown.Our simulation results demonstrate that the proposed copula imputation approach performs reasonably well whether the relationship between compliances is comonotonic or non-comonotonic.Our methodology relies on the assumption that the conditional copula of (S t 0 (0), S t 1 (1)) depends only on time-invariant covariates and is the same as the conditional copula of (S t (0), S t (1)).Although this requirement seems to work well for the COVID-19 data, it may be restrictive in other settings.We thus briefly outline possible extensions that allow for a different degree of association between potential compliances at the same time, by including the difference between the time-dependence covariates or a time lag as a covariate in the copula model.Another contribution of this article is the estimation of the P-QTE in the tails rather than in the mean or the center of the data.Considering data sparsity challenges of quantile regression of high quantiles, we study the finite sample behavior of the regression coefficient estimates in relatively high quantile settings via a series of simulations to examine their consistency.Statistical inference for the quantile regression coefficients, including variance estimation and confidence interval construction, is obtained using the bootstrap method.
Our approach is motivated by our analysis of the COVID-19 data, and thus based on each unit being measured under both states of being intervened and not intervened upon, however, our approach is not to be limited to this complete data.For instance, the proposed method is also applicable when a (relatively small) fraction of the units in the data only have access to one of the two interventions.Further, the assumption of repeated measures on each unit can be relaxed and the proposed method can be easily extended to matched pairs; we illustrate how the method may be generalized in Section S.1 of the Supporting Information.Specifically, if there is drop-out or censoring among units, it can be approached as a non-repeated measurement setting and use the matched pairs method.
The method and COVID-19 data analysis performed in this article are based on principal stratification on (S t (0), S t (1)), where both compliances are treated as characteristics of the population in each county capturing some latent trait relating to civil responsibility or willingness to trust and follow government orders.Lack of identification is the key challenge in modeling partial compliance and thereby some assumptions have to be made.Our method yields an estimated joint distribution of the two potential compliances that are never jointly observable.In addition, when their association parameter is close to 1, S t (0) and S t (1) contain almost the same information.In this case, including both S t (0) and S t (1) into a quantile regression model might lead to unstable coefficient estimates; whether it would be sufficient to include only one of these compliance measures is in need of additional research.
For a continuous compliance, we never know the exact point on the compliance surface at which a unit lies.However, the potential compliances characterize the units' compliance status, therefore studying the subgroup causal effects stratified by potential compliances has clear and important scientific meaning in understanding the heterogeneous causal effects across a range of compliance behaviors.For instance, the stay-at-home order impacts how individuals learn, work, play, and shop with potential benefits in terms of transmission reduction that are traded off against downsides such as economic costs and adverse effects on mental health.Understanding the magnitude of the stay-at-home order effects on mitigating COVID-19 transmission for counties with different characteristics (including different compliance status) is essential for policymakers to better gauge the likely benefits of infection containment in a given region.Some of the underlying assumptions (eg, no interference) are likely to be violated for the COVID-19 example as discussed in Section 5.4.However these assumptions may be more plausible in other contexts such as medication adherence in pharmacology.Our method therefore can more confidently be applied to a range of real-world problems, particularly those without the risk of spillover.
TA B L E 2Abbreviation: df, degree of freedom.
1 (1) are given in Figure S.2.The calibration function selection results are reported in Table S.9, and the estimated Kendall's tau values are given in Figure S.4 of the Supporting Information.