## 1. Introduction

Meta-analysis of data from multiple studies of the same research question has achieved a very high profile in medical research in recent years. Particular attention is currently being given to the potential value of individual patient data (IPD) 1 and to the challenges of meta-analysis of time-to-event outcomes.

Aggregate or summary data, such as hazard ratios and confidence intervals, can be used for time-to-event outcomes and are commonly available in published papers. Methods for synthesizing evidence of this type (see the discussion in 2) are borrowed from the methods used for summary statistics for simpler outcomes. However, caution is needed when extracting summary statistics of interest from papers or reports where they may not be clearly presented 3, and, in addition, summary data leave little opportunity to examine the many characteristics of time-to-event data that may influence the results of standard analyses. To deal with this restriction, Fiocco *et al*. 4 have reconstructed data from the literature and provided a way to examine time-varying hazard models, an important generalization of what is normally possible with summary data. Other aspects of time-to-event data, such as covariate adjustment, may however be less easily handled with this approach. Thus, while IPD are considered the gold standard in meta-analysis in general 1, since all the relevant data are utilized and the approximations needed for aggregate data meta-analyses are avoided, their use is even more strongly preferred for time-to-event outcomes, for which a variety of distributional aspects may be of interest.

Simmonds *et al*. review methods used in the meta-analysis of IPD from randomized trials 5, and Tudur-Smith *et al*. explore heterogeneity in IPD meta-analysis using hierarchical Cox regression models 6. The logarithm of the hazard ratio (logHR) is the most prevalent summary measure used in the meta-analysis of time-to-event endpoints. Although some argue that it is always justified to consider the logHR with time-to-event data, this approach is most natural in the presence of a proportional hazards (PH) structure 7. In a meta-analysis, however, the PH assumption can be particularly restrictive, since it is imposed on multiple studies. Fiocco *et al*. have provided a means to consider time-varying hazard ratios, but there remains scope to consider the potential value of other approaches.

Here, the use of parametric models for meta-analysis of time-to-event IPD is explored as an alternative to the widely used Cox PH model. Greater flexibility in the representation of treatment effects may be one advantage. Depending on the choice of model, various data structures can be naturally incorporated, with the accelerated failure time (AFT) structure being the most obvious alternative to the PH one. In principle, the combination of quite different data structures is possible, since likelihoods of different forms from multiple studies can be combined to provide a basis for inference 8. In addition, the use of a parametric model allows straightforward incorporation of covariates.

If we do not wish to restrict attention to models with a PH structure, the logHR cannot be adopted as the target of inference. As an alternative, we propose the use of a convenient ratio of percentiles, typically of the survival distributions in the two treatment groups being compared, which has the added advantage that it is defined for all distributions. An obvious choice is the median ratio. More generally, the percentile ratio (PR) can be regarded as a continuous function of the percentile. In this case, we can consider the *k*-PR, the ratio of the *k*th percentiles of the two survival distributions, as one of a possible set of measures of the treatment effect.
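The *k*-PR can be illustrated with a minimal numerical sketch. It assumes Weibull survival distributions, chosen here only because their percentiles have a closed form, and takes the ratio as treated over control. The helper names `weibull_percentile` and `percentile_ratio` and all parameter values are hypothetical and are not taken from the analyses in this paper.

```python
import math


def weibull_percentile(k, shape, scale):
    """Time by which k percent of subjects have had the event, for a
    Weibull distribution with survival S(t) = exp(-(t / scale) ** shape)."""
    if not 0 < k < 100:
        raise ValueError("k must lie strictly between 0 and 100")
    return scale * (-math.log(1.0 - k / 100.0)) ** (1.0 / shape)


def percentile_ratio(k, treated, control):
    """k-PR: k-th percentile in the treated group over that in the control group.

    The treated/control direction is an assumption made for this sketch.
    """
    return weibull_percentile(k, *treated) / weibull_percentile(k, *control)


# Hypothetical (shape, scale) pairs, for illustration only.
control = (1.5, 10.0)
treated_same_shape = (1.5, 14.0)   # differs from control in scale only
treated_other_shape = (1.0, 14.0)  # differs in shape as well

for k in (25, 50, 75):
    same = percentile_ratio(k, treated_same_shape, control)
    other = percentile_ratio(k, treated_other_shape, control)
    print(f"k={k}: same shape {same:.3f}, different shape {other:.3f}")
```

When the two groups differ only in scale, the ratio is the same at every percentile; once the shapes differ, it becomes a function of *k*, which is why the PR is treated as a function of the percentile rather than as a single number.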

For illustration, we focus on AFT distributions within the extended log-gamma family, initially presented by Prentice 9. For this family of distributions, the PR does not vary with the percentile chosen: it is equal to the acceleration factor of the AFT model and can also be shown to be equal to the exponential of the treatment effect. These models are also considered in combination with a PH model with a log-logistic baseline, a model which does not have a constant PR.
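The contrast between a constant and a non-constant PR can be checked numerically. The sketch below is illustrative only. It assumes a log-logistic baseline survival function in both cases, an AFT structure written as S1(t) = S0(t / exp(beta)), and a PH structure written as S1(t) = S0(t)^HR. The function names, the parametrization, and the parameter values are assumptions made for this example, not the models fitted later in the paper.

```python
import math


def baseline_survival(t, alpha=10.0, gamma=2.0):
    """Log-logistic baseline S0(t) = 1 / (1 + (t / alpha) ** gamma); values are hypothetical."""
    return 1.0 / (1.0 + (t / alpha) ** gamma)


def percentile(survival, p, lo=1e-9, hi=1e9):
    """Solve survival(t) = 1 - p for t by bisection on the log-time scale."""
    target = 1.0 - p
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # geometric midpoint of the bracket
        if survival(mid) > target:
            lo = mid  # fewer than a fraction p of events so far: percentile is later
        else:
            hi = mid
    return math.sqrt(lo * hi)


def pr_aft(p, beta):
    """PR when the treated group follows S1(t) = S0(t / exp(beta))."""
    treated = percentile(lambda t: baseline_survival(t / math.exp(beta)), p)
    return treated / percentile(baseline_survival, p)


def pr_ph(p, hazard_ratio):
    """PR when the treated group follows S1(t) = S0(t) ** hazard_ratio."""
    treated = percentile(lambda t: baseline_survival(t) ** hazard_ratio, p)
    return treated / percentile(baseline_survival, p)


for p in (0.25, 0.50, 0.75):
    print(f"p={p:.2f}: AFT PR {pr_aft(p, beta=0.3):.4f}, PH PR {pr_ph(p, hazard_ratio=0.5):.4f}")
```

Under the AFT structure the computed ratio matches exp(beta) at every percentile, whereas under the PH structure with the same baseline it changes with the percentile. The percentiles are found numerically rather than in closed form so that the constancy emerges from the model structure instead of being built into the calculation.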

Other families of distributions could be considered (log F 9, log Burr 10), but our aim is simply to allow variation in the representation of PRs and a wide scope in the choice of parametric form. In particular, this allows distributional variation across studies which goes beyond that represented by random effects, or frailty, time-to-event models. Such models may be suitable for some multi-center trials or meta-analyses but typically only allow a random shift in one parameter across centers or studies. Note that while non-parametric estimation of percentiles is also possible, the parametric approach retains considerable flexibility in distributional shapes while also enabling the incorporation of covariates into the meta-analysis in a natural manner.

We begin in Section 2 by introducing a motivating example, a meta-analysis of glioma studies. In Section 3, we consider the PR as a measure of treatment efficacy, together with maximum likelihood inference. Section 4 presents a discussion of AFT models and the details of the extended log-gamma model, while Section 5 gives details of how AFT models can be combined with log-logistic PH models in a meta-analysis framework. Study heterogeneity is discussed in Section 6, followed by the analysis of the glioma data in Section 7. The paper concludes with a discussion in Section 8.