#### 2.1 Construction of joint distributions based on copula functions

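If (*U*_{1}, *U*_{2})′ is a bivariate random variable with standard uniform margins on [0,1], a two-dimensional copula function can be defined as

$$C(u_1, u_2; \theta) = P(U_1 \le u_1,\, U_2 \le u_2; \theta), \qquad (u_1, u_2) \in [0,1]^2 \tag{1}$$

[21]. If there exists a convex decreasing function *φ*_{θ} such that *φ*_{θ}: (0,1] → [0,∞) and *φ*_{θ}(1) = 0, and if the copula function can be written as

$$C(u_1, u_2; \theta) = \varphi_\theta^{-1}\{\varphi_\theta(u_1) + \varphi_\theta(u_2)\},$$

then the copula belongs to the *Archimedean family*; the univariate function *φ*_{θ} is called the *generator* for the copula [22]. Suppose (*U*_{i1}, *U*_{i2})′ and (*U*_{j1}, *U*_{j2})′ are two independent random pairs drawn from the joint distribution (1). A common measure of the association between *U*_{1} and *U*_{2} is Kendall's *τ*, defined as

$$\tau_\theta = P\{(U_{i1} - U_{j1})(U_{i2} - U_{j2}) > 0\} - P\{(U_{i1} - U_{j1})(U_{i2} - U_{j2}) < 0\},$$

where we write *τ*_{θ} to make the relation between *θ* and *τ* explicit.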

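For Archimedean copulas with generator *φ*_{θ}, this can be written as

$$\tau_\theta = 1 + 4\int_0^1 \frac{\varphi_\theta(u)}{\varphi_\theta'(u)}\, du.$$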
Copula functions have received considerable attention in the statistical literature in the past few years because they offer a convenient and attractive way of linking two marginal distributions to create a joint survival function [23]. Suppose *T*_{1} and *T*_{2} are a pair of non-negative random variables with respective survivor functions *S*_{1}(*t*_{1}|*z*; *α*_{1}) and *S*_{2}(*t*_{2}|*z*; *α*_{2}) given a covariate *z*. If we let *U*_{1} = *S*_{1}(*T*_{1}|*z*; *α*_{1}) and *U*_{2} = *S*_{2}(*T*_{2}|*z*; *α*_{2}), where *α*_{k} indexes the marginal distribution for *T*_{k} | *z*, then *U*_{k} ∼ UNIF(0,1), *k* = 1,2. We can define the bivariate ‘survival’ distribution function of (*U*_{1},*U*_{2}) through a copula as in (1) and obtain a joint survivor function for (*T*_{1},*T*_{2})′ given *z* as

$$S(t_1, t_2 | z; \Omega) = P(T_1 \ge t_1,\, T_2 \ge t_2 | z; \Omega) = C\big(S_1(t_1|z; \alpha_1),\, S_2(t_2|z; \alpha_2); \theta\big), \tag{2}$$

where Ω = (*α*′, *θ*)′ with *α* = (*α*_{1}′, *α*_{2}′)′. Because Kendall's *τ* is invariant to monotonic increasing or decreasing transformations [21], it can be interpreted as a measure of association of the transformed variables (*T*_{1},*T*_{2})′ given *z*. The use of a copula function to define the joint distribution of (*T*_{1},*T*_{2})|*z* is particularly appealing because one can specify the marginal distributions to have a proportional hazards form; this is not typically possible for joint distributions induced by random effects or intensity-based analyses.

If a composite endpoint analysis is planned, it would be based on modeling the random variable *T* = min(*T*_{1}, *T*_{2}), which has survival, density, and hazard functions conditional on *z* given by

$$S(t|z; \Omega) = P(T \ge t | z; \Omega) = C\big(S_1(t|z; \alpha_1),\, S_2(t|z; \alpha_2); \theta\big), \tag{3}$$

*f*(*t*|*z*; Ω) = −∂*S*(*t*|*z*; Ω)/∂*t*, and *λ*(*t*|*z*; Ω) = *f*(*t*|*z*; Ω) / *S*(*t*|*z*; Ω), respectively. Suppose *Z* is a binary indicator where *Z* = 1 for individuals in a treatment group and *Z* = 0 otherwise. A key point is that the hazard ratio *λ*(*t*|*z* = 1; Ω) / *λ*(*t*|*z* = 0; Ω) is not, in general, independent of time. As a result, even if the marginal distributions feature proportional hazards, the model for the composite endpoint will typically not. We study this point further in the next four subsections, which cover three different Archimedean copulas and the case of independent components.

##### 2.1.1 Composite endpoint analysis based on a Clayton copula

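The Clayton copula [24] is a member of the Archimedean family with generator *φ*_{θ}(*u*) = (*u*^{−θ} − 1) / *θ*, and copula function

$$C(u_1, u_2; \theta) = \big(u_1^{-\theta} + u_2^{-\theta} - 1\big)^{-1/\theta} \tag{4}$$

with *θ* ⩾ −1 (*θ* ≠ 0). Kendall's *τ* is then given by *τ*_{θ} = *θ* / (*θ* + 2), which can be seen to vary over [−1, 1).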
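As a quick numerical check of the closed form *τ*_{θ} = *θ* / (*θ* + 2), the sketch below evaluates the standard Archimedean identity *τ* = 1 + 4∫ *φ*(*u*)/*φ*′(*u*) d*u* over (0,1) for the Clayton generator with a midpoint rule. The function names, the grid size, and the three values of *θ* are illustrative assumptions, not part of the original analysis.

```python
import math


def clayton_generator(u, theta):
    """Clayton generator phi(u) = (u**-theta - 1) / theta."""
    return (u ** (-theta) - 1.0) / theta


def clayton_generator_deriv(u, theta):
    """Derivative of the Clayton generator, phi'(u) = -u**(-theta - 1)."""
    return -(u ** (-theta - 1.0))


def kendall_tau_archimedean(phi, dphi, theta, n=20000):
    """Approximate tau = 1 + 4 * int_0^1 phi(u)/phi'(u) du by the midpoint rule."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h  # midpoints avoid the endpoint u = 0
        total += phi(u, theta) / dphi(u, theta) * h
    return 1.0 + 4.0 * total


for theta in (0.5, 4.0 / 3.0, 3.0):
    numeric = kendall_tau_archimedean(clayton_generator, clayton_generator_deriv, theta)
    closed = theta / (theta + 2.0)
    print(f"theta={theta:.3f}  numeric tau={numeric:.5f}  closed form={closed:.5f}")
```

Inverting the closed form gives *θ* = 2*τ* / (1 − *τ*), so *τ* = 0.40 corresponds to *θ* = 4/3.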

Consider the joint distribution of (*T*_{1}, *T*_{2})|*z* in which the marginal distribution for *T*_{k}|*z*, *k* = 1,2, features proportional hazards, so *λ*_{k}(*t*|*z*) = *λ*_{k0}(*t*) exp(*β*_{k}*z*) with Λ_{k}(*t*|*z*) = Λ_{k0}(*t*) exp(*β*_{k}*z*), where Λ_{k0}(*t*) = ∫_{0}^{t} *λ*_{k0}(*s*) d*s*, *k* = 1,2. If the joint survivor function is determined by the Clayton copula through (2), by (3) the survivor function of the failure time *T* = min(*T*_{1}, *T*_{2}) given *z* is

$$S(t|z; \Omega) = \Big[\exp\{\theta \Lambda_{10}(t) e^{\beta_1 z}\} + \exp\{\theta \Lambda_{20}(t) e^{\beta_2 z}\} - 1\Big]^{-1/\theta}. \tag{5}$$

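Hence, the hazard ratio for the treatment versus control groups for the composite endpoint is

$$\frac{\lambda(t|z=1; \Omega)}{\lambda(t|z=0; \Omega)}, \qquad \lambda(t|z; \Omega) = \frac{\sum_{k=1}^{2} \lambda_k(t|z) \exp\{\theta \Lambda_k(t|z)\}}{\sum_{k=1}^{2} \exp\{\theta \Lambda_k(t|z)\} - 1}, \tag{6}$$

which is not invariant with respect to time in general.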

To gain some insight into this function, suppose the marginal distributions are exponential with common baseline hazards of *λ*_{10}(*t*) = *λ*_{20}(*t*) = *λ* = log 10 so that the probability of a type *k* event occurring before *t* = 1 is 0.90 for a control subject (i.e., *P*(*T*_{k} < 1 | *Z* = 0) = 0.90). Further suppose that a common hazard ratio of 0.50 holds for the two margins (i.e., exp(*β*_{1}) = exp(*β*_{2}) = 0.50). This setting is consistent with the recommendations that the component events occur with comparable frequency, because *P*(*T*_{1} < *T*_{2} | *Z*) = 0.5, and have comparable treatment effects (*β*_{1} = *β*_{2}). Figure 1(a) contains a plot of the hazard ratio (6) over the time interval [0,1] for models with mild (*τ*_{θ} = 0.20), moderate (*τ*_{θ} = 0.40), and strong (*τ*_{θ} = 0.60) association. As can be seen, even when the treatment effects are the same for the two component endpoints, there can be non-negligible variation in the hazard ratio over time, and within this family of models, the nature of this variation depends on the strength of the association between the two failure times.
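The setting just described can be explored numerically. The sketch below is an illustration rather than the code behind Figure 1(a): it evaluates the composite hazard implied by the Clayton survivor function with exponential margins, Λ_{k}(*t*|*z*) = *λ* exp(*β*_{k}*z*) *t*, and forms the treatment-to-control ratio. The function names and the evaluation grid are assumptions.

```python
import math

LAM0 = math.log(10.0)   # common baseline hazard, so P(T_k < 1 | Z = 0) = 0.90
BETA = math.log(0.50)   # common marginal log hazard ratio


def theta_from_tau(tau):
    """Invert tau = theta / (theta + 2) for the Clayton copula."""
    return 2.0 * tau / (1.0 - tau)


def composite_hazard(t, z, theta):
    """Composite hazard under the Clayton copula with two exponential margins."""
    rates = [LAM0 * math.exp(BETA * z)] * 2          # lambda_1(t|z), lambda_2(t|z)
    weights = [math.exp(theta * r * t) for r in rates]  # exp{theta * Lambda_k(t|z)}
    numerator = sum(r * w for r, w in zip(rates, weights))
    return numerator / (sum(weights) - 1.0)


def hazard_ratio(t, tau):
    """Treatment-versus-control hazard ratio for the composite endpoint."""
    theta = theta_from_tau(tau)
    return composite_hazard(t, 1, theta) / composite_hazard(t, 0, theta)


for tau in (0.20, 0.40, 0.60):
    curve = [hazard_ratio(t, tau) for t in (0.0, 0.25, 0.5, 0.75, 1.0)]
    print(tau, [round(v, 3) for v in curve])
```

At *t* = 0 the ratio equals the marginal value exp(*β*) = 0.50. For *t* > 0 it lies above 0.50, and how far above depends on *τ*_{θ}.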

##### 2.1.2 Composite endpoint analysis based on a Frank copula

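The generator for the Frank copula [25] is *φ*_{θ}(*u*) = −log{(e^{−θu} − 1) / (e^{−θ} − 1)}, and the resulting copula function is

$$C(u_1, u_2; \theta) = -\frac{1}{\theta} \log\left\{1 + \frac{(e^{-\theta u_1} - 1)(e^{-\theta u_2} - 1)}{e^{-\theta} - 1}\right\},$$

where *θ* ≠ 0; Kendall's *τ* is then *τ*_{θ} = 1 − 4{1 − *D*_{1}(*θ*)} / *θ*, with *D*_{1}(*θ*) = *θ*^{−1} ∫_{0}^{θ} *t* / (e^{t} − 1) d*t* the first-order Debye function. If we adopt the same marginal distributions as before, the survivor function for the composite endpoint is

$$S(t|z; \Omega) = -\frac{1}{\theta} \log\left\{1 + \frac{\big(e^{-\theta S_1(t|z)} - 1\big)\big(e^{-\theta S_2(t|z)} - 1\big)}{e^{-\theta} - 1}\right\}, \qquad S_k(t|z) = \exp\{-\Lambda_{k0}(t) e^{\beta_k z}\}.$$

The marginal survivor functions enter this expression through nested exponential and logarithmic terms, so the hazard ratio *λ*(*t*|*z* = 1; Ω) / *λ*(*t*|*z* = 0; Ω) has a complicated form. Figure 1(b) contains a plot of this hazard ratio over [0,1], and as in the case of the Clayton copula, there is considerable variation in this ratio over time.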

##### 2.1.3 Composite endpoint analysis based on a Gumbel–Hougaard copula

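The generator for the Gumbel–Hougaard [26] copula is *φ*_{θ}(*u*) = (−log *u*)^{θ}, giving

$$C(u_1, u_2; \theta) = \exp\Big\{-\big[(-\log u_1)^{\theta} + (-\log u_2)^{\theta}\big]^{1/\theta}\Big\}$$

for *θ* ⩾ 1; Kendall's *τ* is given by *τ*_{θ} = (*θ* − 1) / *θ*. The corresponding survivor function for the composite endpoint is

$$S(t|z; \Omega) = \exp\Big\{-\big[(\Lambda_{10}(t) e^{\beta_1 z})^{\theta} + (\Lambda_{20}(t) e^{\beta_2 z})^{\theta}\big]^{1/\theta}\Big\},$$

and if *β*_{1} = *β*_{2} = *β*, the hazard is

$$\lambda(t|z; \Omega) = e^{\beta z} \big[\Lambda_{10}(t)^{\theta} + \Lambda_{20}(t)^{\theta}\big]^{1/\theta - 1} \big[\Lambda_{10}(t)^{\theta - 1} \lambda_{10}(t) + \Lambda_{20}(t)^{\theta - 1} \lambda_{20}(t)\big].$$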
Interestingly, the hazard ratio in this case is exp(*β*), which means that the proportional hazards model for the composite endpoint is compatible with a proportional hazards model for the margins. If the hazard ratio is in fact common for the component endpoints, then a consistent estimator will be obtained for this common effect on the basis of a Cox model for the composite endpoint.
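The proportionality result can be checked numerically. The following sketch assumes two hypothetical baseline cumulative hazards (a Weibull-type and an exponential one) and an illustrative *θ* = 2. It computes the composite hazard as −d log *S*(*t*|*z*)/d*t* by a central finite difference and compares the ratio under a common *β* with the ratio when *β*_{1} ≠ *β*_{2}.

```python
import math

THETA = 2.0             # Gumbel-Hougaard parameter (tau = 0.5); illustrative choice
BETA = math.log(0.50)   # common log hazard ratio; illustrative choice


def cum_base_1(t):
    """Hypothetical Weibull-type baseline cumulative hazard Lambda_10(t)."""
    return t ** 1.5


def cum_base_2(t):
    """Hypothetical exponential baseline cumulative hazard Lambda_20(t)."""
    return 2.0 * t


def log_surv(t, z, beta1=BETA, beta2=BETA):
    """log S(t|z) for the composite endpoint under the Gumbel-Hougaard copula."""
    a = (cum_base_1(t) * math.exp(beta1 * z)) ** THETA
    b = (cum_base_2(t) * math.exp(beta2 * z)) ** THETA
    return -((a + b) ** (1.0 / THETA))


def hazard(t, z, eps=1e-6, **betas):
    """Hazard as -d/dt log S(t|z), approximated by a central difference."""
    return -(log_surv(t + eps, z, **betas) - log_surv(t - eps, z, **betas)) / (2.0 * eps)


def hazard_ratio(t, **betas):
    """Treatment-versus-control hazard ratio for the composite endpoint."""
    return hazard(t, 1, **betas) / hazard(t, 0, **betas)


unequal = {"beta1": math.log(0.5), "beta2": math.log(0.9)}
for t in (0.1, 0.5, 0.9):
    print(t, round(hazard_ratio(t), 6), round(hazard_ratio(t, **unequal), 4))
```

With a common *β* the ratio stays at exp(*β*) across the grid even though the two baseline hazards differ. With unequal coefficients it drifts over time.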

##### 2.1.4 Composite endpoint analysis with independent components

Here, we consider the setting where the component failure times are independent, a special case corresponding to *τ*_{θ} = 0 for the joint models in Sections 2.1.1–2.1.3. In this case, the hazard ratio for the composite endpoint analysis reduces to

$$\frac{\lambda(t|z=1; \Omega)}{\lambda(t|z=0; \Omega)} = \frac{\lambda_{10}(t) \exp(\beta_1) + \lambda_{20}(t) \exp(\beta_2)}{\lambda_{10}(t) + \lambda_{20}(t)}.$$

With nonhomogeneous hazards, it is apparent that the composite endpoint analysis is only compatible with a proportional hazards assumption if either (A.1) *β*_{1} = *β*_{2} = *β* or (A.2) *λ*_{10}(*t*) = *λ*_{20}(*t*). If *β*_{1} = *β*_{2} = *β*, then a consistent estimate of this common effect is obtained in a composite endpoint analysis. If *β*_{1} ≠ *β*_{2} but the hazard functions are identical, the multiplicative effect is (exp(*β*_{1}) + exp(*β*_{2})) / 2. If neither A.1 nor A.2 holds, then the ratio is a complicated time-varying function of the baseline hazards and respective treatment effects.
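The three cases can be illustrated with a few lines of code. The sketch below evaluates the independence hazard ratio, the exp(*β*_{k})-weighted sum of the baseline hazards divided by their unweighted sum, on a small grid. The two baseline hazards and the coefficient values are hypothetical choices made only for illustration.

```python
import math


def independent_hr(t, lam10, lam20, beta1, beta2):
    """Composite-endpoint hazard ratio when T1 and T2 are independent."""
    treated = lam10(t) * math.exp(beta1) + lam20(t) * math.exp(beta2)
    control = lam10(t) + lam20(t)
    return treated / control


def weibull_hazard(t):
    """Hypothetical nonhomogeneous baseline hazard."""
    return 1.5 * math.sqrt(t)


def constant_hazard(t):
    """Hypothetical constant baseline hazard."""
    return 2.0


b_strong, b_weak = math.log(0.5), math.log(0.9)
grid = (0.1, 0.5, 0.9)

# A.1: common coefficient, different baselines -> constant ratio exp(beta)
case_a1 = [independent_hr(t, weibull_hazard, constant_hazard, b_strong, b_strong) for t in grid]
# A.2: identical baselines, different coefficients -> average of exp(beta_k)
case_a2 = [independent_hr(t, weibull_hazard, weibull_hazard, b_strong, b_weak) for t in grid]
# Neither assumption: the ratio changes with t
case_neither = [independent_hr(t, weibull_hazard, constant_hazard, b_strong, b_weak) for t in grid]

print(case_a1, case_a2, case_neither, sep="\n")
```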

#### 2.2 Misspecification of the Cox model with composite endpoints

The previous section demonstrated that the composite endpoint analysis is typically based on a misspecified Cox regression model if the marginal distributions satisfy the proportional hazards assumption. In this section, we investigate the frequency properties of estimators from a composite endpoint analysis when the component endpoints are associated through a copula function.

Let *T*_{i} = min(*T*_{i1},*T*_{i2}) denote the time of the composite endpoint for individual *i* in a sample of size *m*. Let {*N*_{i}(*s*), 0 < *s*} denote the counting process for subject *i*, which indicates the occurrence of the composite endpoint, so that *dN*_{i}(*s*) = 1 if *T*_{i} = *s* and is zero otherwise. Suppose that it is planned to follow all subjects over the interval (0,*C*^{†}] but that subjects may be lost to follow-up or withdraw from the study prematurely. Let *W*_{i} represent the withdrawal time for subject *i* and *C*_{i} = min(*W*_{i},*C*^{†}) denote their right censoring time. Let *Y*_{i}(*s*) = *I*(*s* ⩽ *T*_{i}) indicate whether subject *i* is at risk of the composite endpoint at time *s*, $Y_i^{\dagger}(s) = I(s \le C_i)$ indicate whether they are under observation at time *s*, and $\bar{Y}_i(s) = Y_i(s) Y_i^{\dagger}(s)$ indicate whether they are event free and under observation. The observable counting process for the response is then based on $d\bar{N}_i(s) = \bar{Y}_i(s)\, dN_i(s)$ for subject *i*. The data for a sample of size *m* then consist of $\{(\bar{N}_i(s), \bar{Y}_i(s), Z_i),\ 0 < s \le C^{\dagger},\ i = 1, \ldots, m\}$, which if we let $\bar{N}(s) = (\bar{N}_1(s), \ldots, \bar{N}_m(s))'$, $\bar{Y}(s) = (\bar{Y}_1(s), \ldots, \bar{Y}_m(s))'$, and *Z* = (*Z*_{1}, … ,*Z*_{m})′, we may write more compactly as $\{(\bar{N}(s), \bar{Y}(s), Z),\ 0 < s \le C^{\dagger}\}$.

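The Cox model is widely used in the analysis of composite endpoints [27] to estimate the relative hazard, where we assume the hazard function for *T*_{i}|*z*_{i} to have the form

$$\psi(t|z_i) = \psi_0(t) \exp(\alpha z_i), \tag{7}$$

where *ψ*_{0}(*t*) is a non-negative baseline hazard function corresponding to the control group and *z*_{i} is the treatment covariate for individual *i*, *i* = 1, … ,*m*. The treatment effect *α* can be estimated by maximizing the partial likelihood [28], that is, by solving

$$U(\alpha) = \sum_{i=1}^{m} \int_0^{C^{\dagger}} \left\{ z_i - \frac{S^{(1)}(\alpha, s)}{S^{(0)}(\alpha, s)} \right\} d\bar{N}_i(s) = 0, \tag{8}$$

where $S^{(k)}(\alpha, s) = \sum_{i=1}^{m} \bar{Y}_i(s)\, z_i^{k} \exp(\alpha z_i)$, *k* = 0,1.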

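If the censoring time *C*_{i} is independent of {*N*_{i}(*s*), 0 < *s*} given *Z*_{i} and if (7) is correctly specified, then (8) has expectation zero and the solution $\hat{\alpha}$ is consistent for the true value, *α*. In the independence case, this true value is *β* if the treatment effect is common (i.e., under A.1, *β* = *β*_{1} = *β*_{2}) or *α* = log{(exp(*β*_{1}) + exp(*β*_{2})) / 2} if the baseline hazard functions are the same (i.e., under A.2). More generally, however, $\hat{\alpha}$ is consistent for *α*^{∗}, the solution to the expected score equation

$$\int_0^{C^{\dagger}} \left[ E\{Z_i\, d\bar{N}_i(s)\} - \frac{s^{(1)}(\alpha, s)}{s^{(0)}(\alpha, s)}\, E\{d\bar{N}_i(s)\} \right] = 0, \tag{9}$$

where $s^{(k)}(\alpha, s) = E\{\bar{Y}_i(s)\, Z_i^{k} \exp(\alpha Z_i)\}$, *k* = 0,1, and the expectation *E* is with respect to the true model for $\{(\bar{N}_i(s), \bar{Y}_i(s), Z_i),\ 0 < s\}$ [29-31]. By using the true model based on (3) and assuming independent censoring for the withdrawal time *W*_{i} with survival distribution $G(s) = P(W_i \ge s)$, these expectations can be obtained as follows:

$$E\{Z_i^{k}\, d\bar{N}_i(s)\} = E_Z\{Z^{k}\, G(s)\, f(s|Z; \Omega)\}\, ds, \qquad k = 0, 1.$$

Likewise,

$$s^{(k)}(\alpha, s) = E_Z\{Z^{k} \exp(\alpha Z)\, G(s)\, S(s|Z; \Omega)\}, \qquad k = 0, 1.$$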
To illustrate the bias resulting from a composite endpoint analysis, consider a randomized clinical trial in which subjects are to be followed over the interval (0,*C*^{†}] where *C*^{†} = 1. Let *Z* = 1 for treated subjects and *Z* = 0 for control subjects and suppose *P*(*Z* = 1) = 1 − *P*(*Z* = 0) = 0.5. We set *β*_{1} = *β*_{2} = *β* = log 0.80 to consider the case compatible with the current recommendations on the use of composite endpoints. We set *λ*_{1} and *λ*_{2} so that (i) *P*(*T*_{1} < *T*_{2}|*Z* = 0) = *p*_{1} equals a desired probability that the type 1 event occurs before the type 2 event among control subjects and that (ii) *P*(*C*^{†} < *T*) = *π*_{A} satisfies the administrative censoring rate for the composite endpoint among all subjects, where *π*_{A} = 0.20. Finally, suppose subjects may withdraw from the study early, and let *W* have an exponential distribution with rate *ρ* such that *P*(*C* < *T*) = *π*, where *P*(*C* < *T*) = *E*_{Z}[*P*(*W* < *T* < *C*^{†}|*Z*) + *P*(*C*^{†} < *T*|*Z*)] and *π* is the overall censoring rate set to *π* = 0.20, 0.40, 0.60, and 0.80.

Figure 2 shows the limiting percent relative bias, 100(*α*^{∗} − *β*) / *β*, of the treatment coefficient from a composite endpoint analysis when the data are generated by a Clayton copula with mild (*τ* = 0.20) and moderate (*τ* = 0.40) association. We plotted this relative bias against *P*(*T*_{1} < *T*_{2}|*Z* = 0) = *p*_{1}; interestingly, the bias is greatest when *p*_{1} = 0.50 and decreases as this probability approaches zero or one. In either of the extreme cases (*p*_{1} = 0 or *p*_{1} = 1), the composite endpoint coincides with the occurrence of a single endpoint, and a consistent estimate of the common treatment effect is obtained. Note that the bias (*α*^{∗} − *β*) is positive, and hence, the limiting value of the treatment effect is more conservative than the true common value for each of the components. This means that the estimated value would, on average, under-represent the magnitude of the treatment effect on either component, a conclusion in line with the findings of [10, 15]. Moreover, we note that the common event rate and the common treatment effect are precisely the setting where composite endpoints are recommended for use [10-12, 16]. The plots also reveal the sensitivity of the limiting value to the degree of random censoring; the higher the censoring rate, the smaller the asymptotic bias. This highlights an important point: the limiting value of an estimator from a misspecified failure time model is highly sensitive to the censoring distribution even under independent censoring. By comparing the left and right panels in Figure 2, it is also apparent that the asymptotic bias depends on the degree of association between *T*_{1} and *T*_{2}; the greater the association, the greater the asymptotic bias. This makes sense because, when the event times are independent and the treatment effect is common, assumption A.1 of Section 2.1.4 is satisfied and consistent estimates are obtained.
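A single point on such a bias curve can be approximated with a short script. The sketch below is a simplified illustration, not the computation behind Figure 2. It assumes a Clayton copula with *τ* = 0.40, equal exponential margins (so *p*_{1} = 0.50), *P*(*Z* = 1) = 0.5, and administrative censoring only at *C*^{†} = 1 (that is, *π* = *π*_{A} = 0.20, so the withdrawal survivor function equals one and drops out). It calibrates the common baseline rate by bisection and then solves the expected score equation for a binary covariate by a midpoint rule and bisection. All function and variable names are hypothetical.

```python
import math

BETA = math.log(0.80)            # common marginal log hazard ratio
TAU = 0.40                       # Kendall's tau (moderate association)
THETA = 2.0 * TAU / (1.0 - TAU)  # Clayton: tau = theta / (theta + 2)
P_TREAT = 0.5                    # P(Z = 1)
C_DAGGER = 1.0                   # administrative censoring time
PI_A = 0.20                      # administrative censoring rate P(C_dagger < T)


def surv(t, z, lam):
    """Composite survivor function S(t|z): Clayton copula, equal exponential margins."""
    rate = lam * math.exp(BETA * z)
    return (2.0 * math.exp(THETA * rate * t) - 1.0) ** (-1.0 / THETA)


def hazard(t, z, lam):
    """Composite hazard lambda(t|z) = -d log S(t|z) / dt."""
    rate = lam * math.exp(BETA * z)
    e = math.exp(THETA * rate * t)
    return 2.0 * rate * e / (2.0 * e - 1.0)


def bisect(fn, lo, hi, iters=80):
    """Root of fn on [lo, hi], assuming fn(lo) and fn(hi) have opposite signs."""
    f_lo = fn(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = fn(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def censoring_gap(lam):
    """P(T > C_dagger), averaged over Z, minus the target rate PI_A."""
    avg = P_TREAT * surv(C_DAGGER, 1, lam) + (1.0 - P_TREAT) * surv(C_DAGGER, 0, lam)
    return avg - PI_A


LAM = bisect(censoring_gap, 1e-6, 50.0)   # calibrated common baseline rate


def expected_score(alpha, n=2000):
    """Expected Cox score for binary Z with no random withdrawal, midpoint rule."""
    h = C_DAGGER / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        s1, s0 = surv(s, 1, LAM), surv(s, 0, LAM)
        f1, f0 = hazard(s, 1, LAM) * s1, hazard(s, 0, LAM) * s0
        num = P_TREAT * math.exp(alpha) * s1       # s^(1)(alpha, s)
        den = num + (1.0 - P_TREAT) * s0           # s^(0)(alpha, s)
        total += (P_TREAT * f1 - (num / den) * (P_TREAT * f1 + (1.0 - P_TREAT) * f0)) * h
    return total


ALPHA_STAR = bisect(expected_score, -5.0, 5.0)
print("alpha* =", ALPHA_STAR, " beta =", BETA)
print("percent relative bias =", 100.0 * (ALPHA_STAR - BETA) / BETA)
```

Under these assumptions the limiting value falls strictly between *β* and zero, in line with the attenuation described above. Reproducing the full curves would also require calibrating separate rates for a given *p*_{1} and a withdrawal rate *ρ* for each *π*.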

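Although of secondary interest, one can also show that the estimator $\hat{\Psi}_0(t)$ of the cumulative baseline hazard, 0 < *t* < *C*^{†}, is consistent for

$$\Psi_0^{*}(t) = \int_0^{t} \frac{E_Z\{G(s|Z)\, f(s|Z; \Omega)\}}{E_Z\{\exp(\alpha^{*} Z)\, G(s|Z)\, S(s|Z; \Omega)\}}\, ds,$$

where *G*(*s*|*z*) denotes the survivor function of the withdrawal time given *Z* = *z*, which when *P*(*Z* = 1) = 0.5 and the censoring distribution is the same in the two groups reduces to

$$\Psi_0^{*}(t) = \int_0^{t} \frac{f(s|z=0; \Omega) + f(s|z=1; \Omega)}{S(s|z=0; \Omega) + \exp(\alpha^{*})\, S(s|z=1; \Omega)}\, ds.$$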