3.1. The percentile ratio
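Writing $t_k^{(1)}$ and $t_k^{(0)}$ for the kth percentiles of the time-to-event distribution in the treatment and control groups, respectively, the k-PR qk can be defined as

$$q_k = \frac{t_k^{(1)}}{t_k^{(0)}}, \qquad (1)$$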
where k can take any value in [0, 1]. This quantity is thus relevant to any binary explanatory variable specifying group membership, such as a treatment identifier, and provides a relative measure of the treatment effect at each point on the survival probability axis. When discussing binary explanatory variables subsequently, we will assume that a treatment versus control comparison is of interest. Note, however, that if a continuous explanatory variable is of interest, then qk can be defined as the PR that reflects a unit change in the chosen variable. For k = 0.5, the quantity q0.5 represents the median ratio, possibly the percentile ratio of most general interest. A value of q0.5 > 1 indicates that the median survival time in the treatment group is greater than that in the control group, while a value < 1 indicates the opposite. In some circumstances, of course, another PR may be of more interest.
In the most general setting, qk changes as a function of k, so the PR for a single value of k does not capture the effect of treatment over the entire follow-up period of a trial. For that reason we may need to consider qk over a range of values of k. For illustration, Figure 2 plots the PR for a PH distribution with a log-logistic baseline, with (a) positive and (b) negative treatment effects. We consider percentiles only in the interval [0.05, 0.95], since calculating qk at the two limiting points, 0 and 1, is not informative. However, the limit as k→1 is calculable and can be taken to represent a final PR at the end of the study. At values of k close to zero, on the other hand, qk is unstable, so by convention we take neither treatment to be better at k = 0 and set q0 = 1. We thus write k∈(0, 1) to indicate that the extreme values 0 and 1 are not considered.
Figure 2. Plot of the PR for a proportional hazards distribution with log-logistic baseline, with positive (a) and negative (b) treatment effects.
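To make the dependence of qk on k concrete, the following is a sketch of a closed form under one assumed parameterization. The symbols α (scale), γ (shape) and v (log hazard ratio) are illustrative and are not taken from the text: the baseline survivor function is taken as S0(t) = {1 + (t/α)^γ}^(-1), the treatment group follows S1(t) = S0(t)^exp(v), and the kth percentile in each group solves S(t) = 1 − k.

```latex
% Control group: S_0(t) = 1 - k
t_k^{(0)} = \alpha \left( \frac{k}{1-k} \right)^{1/\gamma}

% Treatment group: S_0(t)^{e^{v}} = 1 - k
t_k^{(1)} = \alpha \left\{ (1-k)^{-e^{-v}} - 1 \right\}^{1/\gamma}

% Percentile ratio (the scale \alpha cancels)
q_k = \frac{t_k^{(1)}}{t_k^{(0)}}
    = \left[ \frac{(1-k)^{-e^{-v}} - 1}{(1-k)^{-1} - 1} \right]^{1/\gamma}
```

Under these assumptions v = 0 gives qk = 1 for every k, v < 0 (a reduced hazard) gives qk > 1, and for v ≠ 0 the ratio varies with k, which is consistent with the k-dependence described above.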
3.2. Likelihood inference
For now we focus on the inference concerning the PR qk for a particular percentile level k. The situation when there is no natural or consensus choice of k, for inference purposes, is discussed later. Suppose we want to model the data from a study using a distribution f(t;v, β) for the time to an event T, where v is a parameter which characterizes the treatment effect, and β is a vector containing all other parameters relevant to the distribution. Irrespective of the choice of distribution f(), we can reparameterize it as f(t;qk, β) by expressing v as a function of qk, and possibly β, say v = gk(qk, β), conveniently written in this form to also highlight its dependence on the choice of k. However, qk is a quantity with a clear interpretation and its scale does not depend on the choice of distribution f() or indeed any other features of the data being analyzed. Therefore, within a parametric meta-analysis, where different distributions are fit to data from different studies, qk presents a measure of treatment effect in each of the separate analyses but remains directly comparable across studies. This means that, as a basis for meta-analysis, there exists a parameter common across distributions with an interpretation that can be easily communicated.
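The reparameterization v = gk(qk, β) can be sketched for one concrete case. The example below assumes a PH model with a log-logistic baseline, S1(t) = S0(t)^exp(v) with S0(t) = 1/(1 + (t/α)^γ); the shape γ plays the role of the "other parameters" β, and the scale α cancels from the ratio. The function names and this parameterization are illustrative assumptions, not the authors' code.

```python
import math


def pr_from_v(k, v, gamma):
    """k-PR implied by log hazard ratio v (assumed PH log-logistic model)."""
    num = (1.0 - k) ** (-math.exp(-v)) - 1.0   # treatment percentile term
    den = (1.0 - k) ** (-1.0) - 1.0            # control percentile term
    return (num / den) ** (1.0 / gamma)


def g_k(k, q, gamma):
    """Inverse map v = g_k(q, gamma): the log hazard ratio giving k-PR q.

    Obtained by solving (1-k)^(-exp(-v)) = 1 + q^gamma * k / (1-k) for v.
    """
    rhs = math.log1p(q ** gamma * k / (1.0 - k))
    return math.log(-math.log(1.0 - k) / rhs)


if __name__ == "__main__":
    gamma = 1.5  # hypothetical shape value
    for k in (0.25, 0.5, 0.75):
        v = g_k(k, 2.0, gamma)
        # Round trip: the recovered PR should equal the target value 2
        print(k, round(v, 4), round(pr_from_v(k, v, gamma), 6))
```

In this assumed model the same target PR maps to a different v at each k, which is why the map is indexed by k.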
Consider now the case where we have N studies to be pooled for a meta-analysis and where we assume that fi(t;bi, uij|xij) is the chosen distributional form to model the data in study i (i = 1, …, N), where j (j = 1, …, ni) denotes the individuals in study i, bi is a scale parameter and uij = µi + vixij is the location parameter represented as a function of explanatory variables xij, denoting treatment and other relevant patient-specific information. Still focussing on a particular percentile level k, we can express the distribution for study i as fi(t;µi, bi, q|xij) using a reparameterization as discussed in the previous paragraph. Here q is the k-PR of study i. The most common assumption in a meta-analysis, of IPD or otherwise, is that the true value of the quantity of interest is the same across studies, while other parameter values can vary. Therefore, we fix q = qk for all studies. Then, the likelihood function can be written as
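
$$L(q_k, \mu_1, \dots, \mu_N, b_1, \dots, b_N) = \prod_{i=1}^{N} \prod_{j=1}^{n_i} f_i(t_{ij}; \mu_i, b_i, q_k \mid x_{ij})^{I_{ij}} \, S_i(t_{ij}; \mu_i, b_i, q_k \mid x_{ij})^{1 - I_{ij}}, \qquad (2)$$

where tij is the observed event or censoring time of individual j in study i, Si is the survivor function corresponding to fi, and Iij is the usual indicator variable for events. We also make the usual assumption that censoring is non-informative in each of the N studies. Based on (2), standard maximum likelihood estimation (MLE) of the common parameter qk is possible.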
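As a minimal sketch of pooling under a common PR (not the authors' implementation), assume every study follows an exponential distribution with study-specific control hazard lam_i and treatment hazard lam_i / q. Under that assumption the PR equals q at every k, each individual contributes I·log(hazard) − hazard·time to the log-likelihood, and each lam_i can be profiled out in closed form. The exponential choice, the function names and the simulated settings are all illustrative assumptions.

```python
import math
import random


def simulate_study(n, lam, q, cens_time, rng):
    """Simulate one two-arm study as a list of (time, event, x) tuples.

    Assumed model: exponential event times with hazard lam in the control
    arm (x = 0) and lam / q in the treatment arm (x = 1), with
    administrative censoring at cens_time.
    """
    data = []
    for j in range(n):
        x = j % 2  # alternate allocation between arms
        t = rng.expovariate(lam / q if x == 1 else lam)
        data.append((min(t, cens_time), 1 if t <= cens_time else 0, x))
    return data


def summarize(data):
    """Events and total follow-up time per arm: (d0, d1, T0, T1)."""
    d0 = sum(e for _, e, x in data if x == 0)
    d1 = sum(e for _, e, x in data if x == 1)
    T0 = sum(t for t, _, x in data if x == 0)
    T1 = sum(t for t, _, x in data if x == 1)
    return d0, d1, T0, T1


def profile_loglik(log_q, summaries):
    """Joint log-likelihood in the common PR with each lam_i profiled out."""
    q = math.exp(log_q)
    total = 0.0
    for d0, d1, T0, T1 in summaries:
        exposure = T0 + T1 / q
        lam = (d0 + d1) / exposure  # closed-form MLE of lam_i for fixed q
        total += (d0 + d1) * math.log(lam) - d1 * log_q - lam * exposure
    return total


def golden_max(f, lo, hi, tol=1e-8):
    """Golden-section search for the maximizer of a unimodal function."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:   # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:         # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def fit_common_pr(studies):
    """MLE of the PR shared by all studies (search on the log scale)."""
    summaries = [summarize(s) for s in studies]
    return math.exp(golden_max(lambda u: profile_loglik(u, summaries), -5.0, 5.0))


def simulate_example(seed=1):
    """Three hypothetical studies with different hazards, common PR of 2."""
    rng = random.Random(seed)
    settings = [(0.5, 4.0), (1.0, 2.0), (2.0, 1.5)]  # (lam_i, censoring time)
    return [simulate_study(1000, lam, 2.0, c, rng) for lam, c in settings]


if __name__ == "__main__":
    print("pooled PR estimate:", round(fit_common_pr(simulate_example()), 3))
```

Profiling keeps the search one-dimensional. With distributions that have no closed-form profile, the same likelihood would instead be maximized jointly over the common PR and all study-specific parameters with a general-purpose optimizer.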
Previously in this section we have focussed on the inference concerning a particular value of the percentile level k. However, it may be more appropriate to consider a range of values of k. In this case we can carry out a separate analysis for each value of k and plot the results against k. Since the reparameterization procedure we used to derive the likelihood (2) may depend on k, for every choice of k there is a different likelihood. The likelihoods can only lead to identical inferences if the qk's can be jointly modeled to be common across studies for every k. This is generally only true if qk = q for all k, or if the dependence of qk on k is modeled to be the same across studies through an assumption of a common distributional shape. The second possibility will only be true under restrictive assumptions about the common features of the time-to-event distributions across trials. The first is less restrictive in regard to distributional shape and is, for example, satisfied if estimation is based on the log-gamma family of AFT distributions.
For illustration, Figure 3 presents a simple case where, for five different studies, we have generated data from PH log-logistic distributions, as in the example in Figure 2. There was no censoring and the data were generated based on different sets of parameters, with the common characteristic q0.5 = 2. Clearly qk for values of k≠0.5 is not the same across studies, especially for small k. Based on the meta-analysis framework introduced in this section, we obtain the pooled estimate of qk, plotted for k∈[0.05, 0.95], which falls in the center of the individual study curves that give the estimated qk values and thus summarizes them in a single curve. For k = 0.5 the pooled estimate is very close to the true value of 2.
Figure 3. Plot of data generated from PH log-logistic distributions for five studies with different parameter values but a common median percentile ratio q0.5 = 2. The pooled estimate has also been plotted.
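The point that a common median ratio does not imply a common qk at other percentiles can be checked with a short calculation. This sketch uses theoretical curves rather than simulated data and assumes a PH model with log-logistic baseline S0(t) = 1/(1 + (t/α)^γ), in which each study has its own shape γ and its treatment effect is chosen so that q0.5 = 2. The five shape values and the function name are hypothetical, and are not the parameters used for Figure 3.

```python
import math


def pr_curve(k, gamma, median_ratio):
    """k-PR for an assumed PH log-logistic study with a fixed median ratio."""
    # exp(-v) is pinned down by requiring q_0.5 = median_ratio
    a = math.log2(1.0 + median_ratio ** gamma)
    num = (1.0 - k) ** (-a) - 1.0   # treatment percentile term
    den = k / (1.0 - k)             # control percentile term
    return (num / den) ** (1.0 / gamma)


if __name__ == "__main__":
    shapes = [0.8, 1.0, 1.5, 2.0, 3.0]   # hypothetical study-specific shapes
    levels = [0.05, 0.25, 0.5, 0.75, 0.95]
    for gamma in shapes:
        row = [round(pr_curve(k, gamma, 2.0), 3) for k in levels]
        print("gamma =", gamma, row)
```

Every row equals 2 at k = 0.5 by construction, while the entries at other levels differ between studies.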