Burden‐of‐illness vaccine efficacy

Summary In recent years, many vaccines have been developed for the prevention of a variety of diseases. Although the primary objective of vaccination is to prevent disease, vaccination can also reduce the severity of disease in those individuals who develop breakthrough disease. Observations of apparent mitigation of breakthrough disease in vaccine recipients have been reported for a number of vaccine‐preventable diseases such as Herpes Zoster, Influenza, Rotavirus, and Pertussis. The burden‐of‐illness (BOI) score was developed to incorporate the incidence of disease as well as the severity and duration of disease. A severity‐of‐illness score S > 0 is assigned to individuals who develop disease and a score of 0 is assigned to uninfected individuals. In this article, we derive the vaccine efficacy statistic (which is the standard statistic for presenting efficacy outcomes in vaccine clinical trials) based on BOI scores, and we extend the method to adjust for baseline covariates. Also, we illustrate it with data from a clinical trial in which the efficacy of a Herpes Zoster vaccine was evaluated.


| INTRODUCTION
One of the main objectives of vaccine efficacy (VE) trials is to compare the proportion of individuals who are infected between vaccinated and unvaccinated groups. However, a vaccine may affect the incidence, duration, and severity of a disease. To account for all three, Chang et al 1 proposed a simple approach. After assigning a score S equal to 0 for uninfected individuals and some postinfection outcome X > 0 for infected individuals, they tested the equality of S between groups using an adapted t test, with a specific variance taking into account the semicontinuous nature of S. This method, called the burden-of-illness (BOI) test, is attractive for two main reasons: first, it is simple; second, it is consistent with the intent-to-treat (ITT) principle in clinical trials.
Other statistical methods have been proposed in the literature. Follmann et al 2 proposed a BOI-like test called Chop-Lump (CH-L), which essentially compares the mean of S by excluding most of the 0's. This approach is more powerful than BOI in case of no VE. Tu et al 3 proposed a parametric approach, modeling the S's with a mixture of a point mass at zero and a log-normal distribution. Lachenbruch 4,5 proposed methods to combine the separate tests for the two endpoints. Mehrotra et al 6 adjusted Fisher's method (FCM) for postrandomization selection bias using the potential outcomes framework for causal inference. 7 They showed that Fisher's combination test performs best overall, even after adjusting the test for selection bias. Callegaro et al 8 proposed a permutation-based Fisher's combination test adjusted for selection bias. Other methods based on causal inference have also been proposed. [9][10][11]
The VE is the standard statistic in vaccine development studies. For this reason, in this article, we derive a VE measure based on the BOI scores ($VE_{BOI}$). $VE_{BOI}$ is defined as the relative reduction in the BOI score in the vaccinated group compared to the unvaccinated or control group and is calculated as 1 minus the relative risk (RR; the BOI score in the vaccinated group divided by the BOI score in the placebo group). $VE_{BOI}$ is a simple, useful, and interpretable statistic in vaccine development and was chosen as a primary endpoint in the Shingles Prevention Study, which evaluated the VE and safety of a live zoster vaccine. 12 Furthermore, $VE_{BOI}$ and a related statistic that is a function of $VE_{BOI}$ have been used to support claims to regulatory agencies. 13
We illustrate our methods using data from a Herpes Zoster (HZ) clinical trial 14 that evaluated the VE of adjuvanted Recombinant Zoster Vaccine (RZV) in the prevention of HZ in autologous hematopoietic stem cell transplantation (HSCT) recipients 18 years of age (YOA) and older.

| STATISTICAL METHODS
Suppose that $N_V$ volunteers are randomized to receive a vaccine and $N_C$ to placebo (or control), and that infections are recorded along with a postinfection outcome, X. Chang et al 1 defined the BOI score S as 0 for uninfected participants and X for infected participants. They proposed to test the null hypothesis that the distribution of S is the same in the two groups using a t-test-like statistic: the difference between the group means $\bar S_V$ and $\bar S_C$ divided by the square root of $\mathrm{Var}_0$, where $\bar S_j = \frac{1}{N_j}\sum_{i=1}^{N_j} S_{i,j}$ for treatment group j = V, C, and $\mathrm{Var}_0$ is the variance of this difference computed under the null hypothesis (score test).
In this article, we consider instead a $VE_{BOI}$ statistic, which is a useful and interpretable statistic in vaccine development. Two different designs are considered: (a) the fixed time design, where the trial is stopped after a fixed duration of follow-up, and (b) the fixed event design, where the trial is stopped when n events are observed. Results for fixed event designs are provided in Appendix A. In the fixed time design, $E(\bar S_j) = p_j \mu_j$, j = V, C, where $p_j$ is the probability of infection and $\mu_j$ is the expectation of X among those infected in treatment group j. The variance is
$$\mathrm{Var}(\bar S_j) = \frac{p_j \sigma_j^2 + p_j (1 - p_j) \mu_j^2}{N_j},$$
where $\sigma_j^2$ is the variance of X among those infected in treatment group j. The variance of the $VE_{BOI}$ estimator then follows from the Delta method.
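The fixed time design estimator can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: RR is taken as the ratio of the group mean scores, the two arms are treated as independent, sample variances are plugged in for $\mathrm{Var}(\bar S_j)$, follow-up time is ignored, and the interval is a normal-approximation interval on the VE scale. The function name `ve_boi` is hypothetical.

```python
import math
from statistics import mean, variance


def ve_boi(scores_v, scores_c, z=1.96):
    """Point estimate and normal-approximation CI for VE_BOI = 1 - RR.

    scores_v, scores_c: BOI scores for ALL randomized participants in the
    vaccine and control arms (0 for uninfected, X > 0 for infected).
    Assumption of this sketch: both arms contain at least one nonzero score.
    """
    n_v, n_c = len(scores_v), len(scores_c)
    mean_v, mean_c = mean(scores_v), mean(scores_c)
    rr = mean_v / mean_c

    # Delta method for a ratio of independent means:
    # Var(RR) ~ RR^2 * [Var(mean_v)/mean_v^2 + Var(mean_c)/mean_c^2],
    # with Var(mean_j) estimated by the sample variance divided by N_j.
    rel_var = (variance(scores_v) / (n_v * mean_v ** 2)
               + variance(scores_c) / (n_c * mean_c ** 2))
    se = rr * math.sqrt(rel_var)

    ve = 1.0 - rr
    return ve, (ve - z * se, ve + z * se)


if __name__ == "__main__":
    vaccine = [0, 0, 0, 2]   # toy data, not trial data
    control = [0, 0, 4, 4]
    print(ve_boi(vaccine, control))
```

Because the zeros of uninfected participants enter both the means and the variances, every randomized participant contributes to the estimate, which is the ITT property emphasized in the text.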

| Other statistical methods
In this section, we describe in more detail some published approaches and compare these methods via a simulation exercise. We denote by (prop) the test that compares the infection rates between the two groups, with statistic $Z_{prop}$, and by (inf) the t test in infected individuals only, with statistic $Z_{inf}$, where $n_C$ and $n_V$ are the numbers of infected individuals in each group and $X_{Gi}$ is the ith measurement in group G. Note that the inf test can be affected by selection bias because it is based on infected individuals only. The two statistics described above can be combined using Fisher's test (FCM), 6
$$\text{P-value}_{FCM} = P\left\{\chi^2_4 > -2 \log\left(\text{P-value}_{inf} \times \text{P-value}_{prop}\right)\right\},$$
whereby $\text{P-value}_{inf}$ and $\text{P-value}_{prop}$ are the one-sided P-values of $Z_{inf}$ and $Z_{prop}$, respectively, and $\chi^2_4$ represents a random variable with a chi-square distribution with 4 degrees of freedom. Since FCM is a function of inf, it does not assess a causal effect of vaccine. Mehrotra et al 6 adjusted FCM for postrandomization selection bias using the potential outcomes framework for causal inference. In a similar spirit, Callegaro et al 8 proposed a permutation-based FCM test adjusted for selection bias.
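Fisher's combination of the two one-sided P-values can be sketched in a few lines of Python. The sketch uses the closed form of the chi-square survival function with 4 degrees of freedom, $P(\chi^2_4 > x) = e^{-x/2}(1 + x/2)$, so no statistics library is needed. The function name `fisher_combined_p` is hypothetical, and the unadjusted combination shown here does not include the selection-bias adjustments of Mehrotra et al or Callegaro et al.

```python
import math


def fisher_combined_p(p_prop, p_inf):
    """Combine two one-sided P-values with Fisher's method.

    Under the null, -2 * (log p_prop + log p_inf) is chi-square with
    4 degrees of freedom when the two P-values are independent uniforms.
    Both inputs must lie in (0, 1].
    """
    stat = -2.0 * (math.log(p_prop) + math.log(p_inf))
    # Survival function of a chi-square variable with 4 degrees of freedom.
    return math.exp(-stat / 2.0) * (1.0 + stat / 2.0)


if __name__ == "__main__":
    print(fisher_combined_p(0.05, 0.05))
```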
Finally, we describe in more detail the Chop-Lump test. 2 To test the equality of the distribution of S between the two treatment groups, all zero observations are removed from the treatment group with fewer zeros, and an equal proportion of zeros is removed from the other treatment group. This leaves one group with no zeros at all. The distribution under the null hypothesis is obtained by permutation. The Chop-Lump statistic is a t test, similar to the BOI test, 1 calculated on the BOI scores $S'$ to the right of the chopping point, with $s_l^2$ the pooled sample variance based on the l largest S's in each group and $l = \max(n_C, n_V)$. We considered the rank version of this test (CH − LW) because it is expected to be more powerful. 2
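The "chop" step alone can be sketched as follows. This is an illustration under assumptions, not the implementation of Follmann et al: the function name `chop` is hypothetical, the number of zeros removed from each arm is obtained by rounding the common proportion times the arm size, and neither the permutation distribution nor the rank version (CH − LW) is implemented.

```python
def chop(scores_v, scores_c):
    """Remove the same proportion of zeros from both arms.

    The common proportion is the smaller of the two zero fractions, so the
    arm with the smaller fraction of zeros ends up with no zeros at all.
    Returns the retained scores of each arm, sorted in increasing order.
    """
    def zero_fraction(scores):
        return sum(1 for s in scores if s == 0) / len(scores)

    frac = min(zero_fraction(scores_v), zero_fraction(scores_c))

    def drop_zeros(scores):
        zeros = sum(1 for s in scores if s == 0)
        # Rounding rule is an assumption of this sketch.
        n_drop = min(zeros, round(frac * len(scores)))
        kept_zeros = [0] * (zeros - n_drop)
        return kept_zeros + sorted(s for s in scores if s != 0)

    return drop_zeros(scores_v), drop_zeros(scores_c)


if __name__ == "__main__":
    # Toy data: 1 infection among 4 vaccinees, 2 among 4 controls.
    print(chop([0, 0, 0, 5], [0, 0, 3, 4]))
```

With equal arm sizes, each arm retains its $l = \max(n_C, n_V)$ largest scores, which matches the description in the text.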

| POWER AND SAMPLE SIZE OF $VE_{BOI}$
The asymptotic power under local alternatives and the required sample size are derived from the asymptotic distribution of the $VE_{BOI}$ estimator.

3.1 | Intention to treat and principal stratum estimator
$VE_{BOI}$ is consistent with the ITT principle because all randomized participants are explicitly included in the analysis. A principal stratum estimator could be considered as well. 10 The advantage of the ITT approach is its simplicity. In fact, it is much more challenging to apply a principal stratum estimator because membership in the principal stratum must be inferred, usually imperfectly, from covariates. On the other hand, sensitivity analysis based on a principal stratum estimator can be useful to better understand these complex data.

| $VE_{BOI}$ ADJUSTED FOR BASELINE COVARIATES
In this section, we consider adjusting the BOI scores for baseline covariates, such as age and sex. In the following, Z represents the matrix of baseline covariates. To adjust BOI scores for covariates, we propose a nonlinear regression model where the expectation (E) of the BOI score for the ith subject is a log-linear function of the vaccination group $G_i$ (1 = vaccinated; 0 = control) and of the covariates. The VE adjusted for covariates is given by one minus the exponentiated coefficient of $G_i$. This model cannot be fitted using a simple linear regression on the log-transformed scores, because of the zeros in the scores. In this article, we propose to use a quasi-Poisson model for the following reasons: (a) the overdispersion parameter allows proper estimation of the variance even in the presence of many zeros; (b) the model can be implemented in standard software (such as R [glm], or SAS [GLIMMIX]) that does not require integer S values. To adjust for the exposure, we included the log of the follow-up time as an offset. SAS code to fit this model is provided in Appendix B.
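To make the role of the offset and of the overdispersion parameter concrete, the special case with no covariates (vaccination group only) can be worked out in closed form. The Python sketch below is an illustration under assumptions: it takes a log link with the log of follow-up time as offset, uses the Pearson estimate of the dispersion, and builds a Wald interval on the log scale. The function name `ve_quasi_poisson` is hypothetical. With additional covariates the model has to be fitted iteratively in standard software.

```python
import math


def ve_quasi_poisson(scores, followup, vaccinated, z=1.96):
    """Unadjusted quasi-Poisson VE with log(follow-up) as offset.

    scores:     BOI score per participant (0 for uninfected)
    followup:   follow-up time per participant (for example, in years)
    vaccinated: 1 for vaccine, 0 for control
    With only a group indicator, the estimating equations give a fitted
    rate per arm equal to (sum of scores) / (sum of follow-up time).
    """
    total_score = {0: 0.0, 1: 0.0}
    total_time = {0: 0.0, 1: 0.0}
    for s, t, g in zip(scores, followup, vaccinated):
        total_score[g] += s
        total_time[g] += t

    rate = {g: total_score[g] / total_time[g] for g in (0, 1)}
    log_rr = math.log(rate[1] / rate[0])  # coefficient of the group indicator

    # Pearson dispersion: sum of (S - fitted)^2 / fitted over n - 2 df.
    pearson = sum((s - t * rate[g]) ** 2 / (t * rate[g])
                  for s, t, g in zip(scores, followup, vaccinated))
    phi = pearson / (len(scores) - 2)

    # Poisson variance of the log rate ratio, inflated by the dispersion.
    se = math.sqrt(phi * (1.0 / total_score[1] + 1.0 / total_score[0]))

    ve = 1.0 - math.exp(log_rr)
    ci = (1.0 - math.exp(log_rr + z * se), 1.0 - math.exp(log_rr - z * se))
    return ve, ci, phi


if __name__ == "__main__":
    scores = [0, 0, 0, 2, 0, 0, 4, 4]        # toy data, not trial data
    followup = [2, 2, 2, 2, 1, 1, 1, 1]
    vaccinated = [1, 1, 1, 1, 0, 0, 0, 0]
    print(ve_quasi_poisson(scores, followup, vaccinated))
```

In this special case the point estimate reduces to one minus the ratio of the follow-up-adjusted mean scores, and the many zeros affect only the dispersion estimate and hence the width of the interval.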

| CASE STUDY
In this section, we illustrate our method using data from the GSK HZ clinical trial (NCT01610414) that evaluated the VE of RZV vaccine in the prevention of HZ in autologous HSCT recipients 18 YOA and older.

| Description of study
The study was a phase III, observer-blind, randomized, placebo-controlled, multicenter, multicountry study with two parallel groups to evaluate the VE of RZV vaccine in the prevention of HZ in autologous HSCT recipients 18 YOA and older. Eligible subjects were randomized to RZV or placebo according to a 1:1 ratio. The primary objective of this study was to demonstrate the VE of RZV to reduce the number of HZ episodes. A secondary objective was to demonstrate a reduction in the severity of pain (including pain triggered by air blowing on the skin, by clothing rubbing against the skin, or by hot or cold temperatures) associated with HZ for those subjects experiencing a HZ episode. Pain and other types of discomfort (eg, allodynia and intense pruritus) can have a substantial adverse impact on the functional status and quality of life of affected individuals; therefore, relief of acute and chronic HZ pain and discomfort is an important goal.

| Zoster Brief Pain Inventory
The zoster brief pain inventory (ZBPI) was used to quantify HZ pain and discomfort, and was adapted from the Brief Pain Inventory to make it a HZ-specific measure of pain severity that captures pain and discomfort caused by HZ. 15 It uses an 11-point Likert scale (0-10) to rate HZ pain and discomfort for four dimensions (worst, least, average during the past 24 hours and now) and HZ pain and discomfort-related interference with seven functional status and ZBPI activities of daily living (ADL) items: general activity, mood, walking ability, work, relations with others, sleep, and enjoyment of life. The seven questions included in the functional status and ADL are summarized into a single score by taking the mean of the seven items.

| ZBPI burden-of-illness/interference
All subjects with a HZ episode were required to complete the ZBPI questionnaire daily from the onset of the suspected HZ episode (identified by the presence of rash or pain; day HZ-0) to day HZ-28, and then weekly until a 4-week pain-free period was documented. If pain reappeared in the same area after a 4-week pain-free period and was not accompanied by a new HZ rash, it was assigned to the previous HZ episode. The HZ BOI scores were calculated from the ZBPI worst pain scores, and the HZ burden-of-interference scores were calculated from the ADL score, over the 182 days from the first day of HZ rash (day HZ-0) using area under the curve (AUC) methods. The scores were defined as 0 for participants who did not develop an evaluable case of HZ during the study.
The calculation of AUC was based on the trapezoidal rule, 16
$$\mathrm{AUC} = \sum_{k=1}^{m-1} \frac{(Y_k + Y_{k+1})(t_{k+1} - t_k)}{2},$$
where m is the number of ZBPI assessments between days 0 and 182, $Y_k$ is the score at timepoint k, and $t_k$ is the number of days relative to day 0 at timepoint k.
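The trapezoidal rule for unevenly spaced assessments (daily at first, then weekly) can be sketched in Python as follows. The function name `auc_trapezoid` is hypothetical, and the sketch assumes the assessment days are already sorted and restricted to the 0 to 182 day window.

```python
def auc_trapezoid(days, scores):
    """Area under the score-versus-time curve by the trapezoidal rule.

    days:   assessment days relative to day 0, in increasing order
    scores: score recorded at each assessment
    """
    if len(days) != len(scores):
        raise ValueError("days and scores must have the same length")
    # Each interval contributes its width times the average of its end scores.
    return sum((scores[k] + scores[k + 1]) / 2.0 * (days[k + 1] - days[k])
               for k in range(len(days) - 1))


if __name__ == "__main__":
    # Toy example: worst-pain scores on days 0, 1, 2 and then day 9.
    print(auc_trapezoid([0, 1, 2, 9], [4, 6, 2, 0]))
```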

| Real data analysis
A total of 1721 subjects were included in the analysis cohort: 870 were vaccinated with RZV and 851 were included in the placebo group. Forty-nine subjects (5.6%) in the vaccinated group and 135 subjects (15.9%) in the control group developed a case of Herpes Zoster. Of the 184 subjects who developed HZ, 5 did not have a ZBPI score (three in the vaccinated group and two in the placebo group) and were thus removed from the analysis. The distribution of worst ZBPI pain scores is summarized as follows, using the notation in section 2, with V = RZV and C = Placebo, $\bar S_j = \sum_{i=1}^{N_j} S_{i,j} / N_j$, and $\alpha_j$ the mean follow-up time (years) in vaccination group j = V, C:
$$\bar S_V = 5.57, \quad \bar S_C = 28.70, \quad \alpha_V = 1.88, \quad \alpha_C = 1.70, \quad \mathrm{Var}\left(\frac{\bar S_V}{\bar S_C}\right) = 0.0021.$$
Hence, the 95% confidence interval (CI) of the VE based on the normal distribution is (0.736, 0.914). The 95% CI for the Burden-of-Interference is calculated similarly. The VE was also calculated separately for the baseline age categories (18-49, ≥50 YOA). The results for BOI are presented in Table 1. The VE results for Burden-of-Interference are presented in Table 2.
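The reported figures can be checked with a few lines of arithmetic. The Python sketch below rests on two assumptions that are not spelled out in the text: that the point estimate is one minus the ratio of the follow-up-adjusted mean scores, $(\bar S_V/\alpha_V)/(\bar S_C/\alpha_C)$, and that 0.0021 is the variance of the estimated ratio, so that the interval is the point estimate plus or minus 1.96 standard errors. Because the inputs are rounded, the result is expected to agree with the published interval only approximately. The function name `case_study_ve` is hypothetical.

```python
import math


def case_study_ve(s_v=5.57, s_c=28.70, a_v=1.88, a_c=1.70,
                  var_ratio=0.0021, z=1.96):
    """Recompute the BOI vaccine efficacy from the rounded summary values."""
    # Assumption: mean scores are divided by mean follow-up time (years).
    rr = (s_v / a_v) / (s_c / a_c)
    ve = 1.0 - rr
    # Assumption: var_ratio is the variance of the estimated ratio.
    half_width = z * math.sqrt(var_ratio)
    return ve, (ve - half_width, ve + half_width)


if __name__ == "__main__":
    ve, (lower, upper) = case_study_ve()
    print(f"VE = {ve:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Under these assumptions the point estimate rounds to the 82.5% quoted in the conclusions, and the interval limits fall within about 0.002 of the reported (0.736, 0.914).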

| BOI Vaccine efficacy adjusting for covariates
The VE for both the BOI and Interference were also calculated adjusting for baseline covariates, using a regression model as described in section 4 (post hoc analysis). Age category at baseline (18-49, ≥50 YOA) was included as a factor and the log of follow up time was included as an offset. Table 3 shows the results of the burden of illness regression. Age category at baseline was not significant.
Using the notation from section 4, the covariate-adjusted VE has a 95% CI of (0.640, 0.909).
Note: $N_j$ = total number in cohort j, $n_j$ = total number with disease in cohort j, $\bar S_j$ = mean burden score, $\alpha_j$ = mean follow-up time (years).

TABLE 2 Vaccine efficacy of burden-of-interference score
Note: $N_j$ = total number in cohort j, $n_j$ = total number with disease in cohort j, $\bar S_j$ = mean burden score, $\alpha_j$ = mean follow-up time (years).

| SIMULATION RESULTS
In order to compare the performance of the proposed method with existing approaches, we simulated zoster vaccine trials similar to the real data described above. The number of infections in the placebo group was generated according to a binomial distribution with $N_C = N_V = 858$ and $p_C = 0.15$. Different values of VE were considered (VE = 0%, 15%, 30%), with $p_V = p_C(1 − VE)$. The log pain scores for infected individuals were generated according to a normal distribution with variance $\sigma^2 = 1.5^2$ and mean $\mu_C$ (4.5 or 2.25) for patients in the placebo group and $\mu_V = \mu_C − \Delta$ for patients in the vaccine group. For each scenario, 10 000 clinical trials were simulated. We compared the proposed approach ($VE_{BOI}$) with the following approaches: the BOI t test 1 (BOI), the test comparing only infection rates (prop), the t test comparing postinfection outcomes in infected individuals (inf), Fisher's test (FCM) combining prop and inf, 6 and the rank version of the Chop-Lump test (CH − LW). 2
Table 5 shows simulation results based on our real zoster data. As expected, the power of $VE_{BOI}$ increases as VE or Δ increases. We focus now on the comparisons between the proposed method and the BOI and Chop-Lump tests. First, the power of $VE_{BOI}$ is similar to the power of BOI (only slightly better). This is not surprising, since the advantage of $VE_{BOI}$ over BOI is the interpretability of the estimated effect. When Δ = 0 (and presumably for very small values), these three tests have similar power, and no method performs very well. At larger Δ values, $VE_{BOI}$ performs better, but generally has lower power than the Chop-Lump test, unless the VE is "large" (VE = 30%). This is due to the ratio $\mu_C/\sigma = 3$. In fact, asymptotic and simulation results of Follmann et al 2 show that the power of BOI (and of $VE_{BOI}$ as well) decreases with larger values of $\mu_C/\sigma$.
The last row of Table 5 shows the hypothetical case when $\mu_C$ is 2.25 instead of 4.5, so that the ratio is much smaller ($\mu_C/\sigma = 1.5$). As expected, 2 in this case $VE_{BOI}$ (and BOI) is more powerful, even more powerful than CH − LW.
Finally, we compare $VE_{BOI}$ with the FCM test. With the exception of small Δ or "large" VE (VE = 30%), FCM is more powerful than $VE_{BOI}$; however, this test is not consistent with the ITT principle because $Z_{inf}$ is restricted to subjects who are selected based on a postrandomization event. Note that $Z_{inf}$ (and consequently FCM) can lead to the conclusion that a harmful vaccine generating more infections with low pain is beneficial.
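The data-generating process of one simulated trial can be sketched in Python as follows. Several choices here are assumptions of the sketch rather than details given in the text: the normally distributed draw is used directly as the postinfection outcome unless `exponentiate=True` is passed, infections are drawn independently per participant (which is equivalent to a binomial count), the default `delta=0.5` is purely illustrative, and the name `simulate_trial` is hypothetical.

```python
import math
import random


def simulate_trial(n_per_arm=858, p_c=0.15, ve=0.30, mu_c=4.5, delta=0.5,
                   sigma=1.5, exponentiate=False, rng=None):
    """Simulate BOI scores for one two-arm trial.

    Returns (scores_vaccine, scores_control); uninfected participants get 0.
    """
    rng = rng or random.Random()
    p_v = p_c * (1.0 - ve)  # infection probability in the vaccine arm

    def simulate_arm(p_infection, mu):
        scores = []
        for _ in range(n_per_arm):
            if rng.random() < p_infection:
                x = rng.gauss(mu, sigma)  # severity among the infected
                scores.append(math.exp(x) if exponentiate else x)
            else:
                scores.append(0.0)
        return scores

    return simulate_arm(p_v, mu_c - delta), simulate_arm(p_c, mu_c)


if __name__ == "__main__":
    vaccine, control = simulate_trial(rng=random.Random(2024))
    print(sum(1 for s in vaccine if s != 0.0),
          sum(1 for s in control if s != 0.0))
```

Repeating this for many trials and applying each test to the simulated scores gives Monte Carlo estimates of power of the kind summarized in Table 5.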

TABLE 5 Power for different tests for HZ pain and infection in a phase III zoster vaccine trial
Note: A total of 1716 subjects were randomized (1:1 randomization ratio) and had a control infection rate of 15%.

| CONCLUSIONS
In vaccine clinical trials, the VE estimate is the standard statistic for presenting efficacy outcomes. In this article, we proposed a VE statistic based on the BOI score. 1 Even if more advanced methods have been proposed in the literature, [2][3][4][5][6][8][9][10][11] we believe that this approach has an important role from a practical point of view. First, the statistic is meaningful and interpretable in vaccine clinical trials: it is one minus the BOI score ratio and represents the proportional reduction in BOI score due to the vaccine. Second, because all randomized participants are explicitly included in the analysis, our method is consistent with the intent-to-treat principle. A principal stratum estimator could be considered as well 10 and can be useful to better understand these complex data, but it is more complex because membership in the principal stratum must be inferred, usually imperfectly, from covariates. Third, it is simple to apply because it is a regression method that can be performed using standard software and can easily be extended to involve more complex analyses. A possible extension of the proposed BOI regression method is a multivariate outcome approach, where multiple severity scores (such as pain, vomiting, …) are modeled using multivariate regression approaches, such as mixed-effects regression models and generalized estimating equation models.
As a motivating example, we analysed HZ data from a vaccine clinical trial. The overall VE for HZ incidence was reported as 68.2%. 17 The overall VE for HZ-BOI was estimated to be 82.5%. The following statistic may be used to support claims to regulatory agencies:
$$VE_{onTOP} = 1 - \frac{1 - VE_{BOI}}{1 - VE_{HZ}}.$$
However, it should be noted that $VE_{onTOP}$ is conditional on developing disease; that is, it can easily be shown that $VE_{onTOP}$ corresponds to $1 − \mu_V/\mu_C$. As such, the statistic is subject to selection bias, which could lead to the conclusion that a harmful vaccine generating more infections with low pain is beneficial.
Consequently, the $VE_{onTOP}$ statistic should (a) be adjusted for selection bias, or (b) always be reported in conjunction with the $VE_{HZ}$ and $VE_{BOI}$ statistics to provide a complete overview of results. Further research is needed on how this statistic can be interpreted and used.
Similarly, the overall VE for reducing the Burden-of-Interference of HZ on activities of daily living was estimated to be 82.8%. The results from this analysis suggest that, in addition to preventing HZ, vaccination with RZV also reduces the BOI and Burden-of-Interference in subjects who develop HZ. A plausible hypothesis is that memory CD4+ T cells are capable of mounting a rapid antiviral response upon reactivation of the varicella zoster virus. In some cases, this anamnestic immune response may not be able to prevent a HZ episode, but in vaccine recipients with breakthrough disease the response may be sufficient to control the reactivated virus more rapidly, leading to reduced severity of disease. 18 Apparent mitigation of breakthrough disease in vaccine recipients has also been reported for a number of other vaccine-preventable diseases such as influenza, rotavirus, and pertussis. [19][20][21]
We proposed a quasi-Poisson regression model on the scores to adjust the BOI VE for baseline covariates. The overdispersion parameter allows proper estimation of the variance even in the presence of many zeros, and the model can be implemented in standard software (such as R [glm], or SAS [GLIMMIX]) that does not require integer S values. It is straightforward to extend the model, for example by including the interaction between treatment and covariates. In this way, it is possible to test whether $VE_{BOI}$ depends on baseline covariates (by testing whether the interaction is significant).
Simulation results based on our real data showed that the power of $VE_{BOI}$ is similar to the power of BOI. As expected, the advantage of $VE_{BOI}$ over BOI is not in terms of power but of the interpretability of the estimated effect. In agreement with simulation results of Follmann et al, 2 the Chop-Lump (rank) test is more powerful than the $VE_{BOI}$ test when there is small VE and a large ratio $\mu_C/\sigma$. Future research could evaluate the proposed BOI regression using postinfection values that have been transformed (f[S]) to reduce the ratio $E\{f(S \mid C)\} / \sqrt{\mathrm{Var}\{f(S \mid C)\}}$.
In summary, the proposed BOI VE is a useful and interpretable measure of efficacy for clinical trials where the vaccine may affect both the incidence and the severity of disease. The adjustment for baseline covariates can further improve the efficiency of the analysis and avoid conditional bias from chance covariate imbalance.

APPENDIX A Fixed number of events design
In the fixed number of events design, $n_C \sim B(n, \pi_C)$, where $\pi_C$ is the proportion of events in the control group ($\pi_C = n_C/n$). It follows that $E(\bar S_j \mid n) = \pi_j \mu_j \times n/N_j$ and $\mathrm{Var}(\bar S_j \mid n) = \pi_j\{\sigma_j^2 + (1 − \pi_j)\mu_j^2\}\, n/N_j^2$. In this case, the two means are not independent: $\mathrm{Cov}(n_C, n_V \mid n) = −\mathrm{Var}(n_C \mid n) = −n\pi_C(1 − \pi_C)$ and $\mathrm{Cov}(\bar S_C, \bar S_V \mid n) = −n\pi_V\pi_C\mu_V\mu_C/(N_V N_C)$. Using the Delta method, the variance of log(RR) follows, and from it the expectation of the statistic $Z = \log(RR)/\sqrt{\mathrm{Var}(\log(RR))}$ and the sample size.
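The Delta-method step can be sketched from the conditional moments of the group means. The derivation below is a reconstruction under stated assumptions: $RR = \bar S_V/\bar S_C$, $\pi_V = 1 − \pi_C$, $E(\bar S_j \mid n) = n\pi_j\mu_j/N_j$, $\mathrm{Var}(\bar S_j \mid n) = n\pi_j\{\sigma_j^2 + (1 − \pi_j)\mu_j^2\}/N_j^2$, and $\mathrm{Cov}(\bar S_C, \bar S_V \mid n) = −n\pi_V\pi_C\mu_V\mu_C/(N_V N_C)$.

```latex
\begin{aligned}
\mathrm{Var}(\log RR \mid n)
  &\approx \sum_{j \in \{V, C\}}
     \frac{\mathrm{Var}(\bar S_j \mid n)}{E(\bar S_j \mid n)^2}
   - \frac{2\,\mathrm{Cov}(\bar S_C, \bar S_V \mid n)}
          {E(\bar S_C \mid n)\, E(\bar S_V \mid n)} \\
  &= \sum_{j \in \{V, C\}}
     \frac{\sigma_j^2 + (1 - \pi_j)\mu_j^2}{n \pi_j \mu_j^2}
   + \frac{2}{n} \\
  &= \frac{1}{n} \sum_{j \in \{V, C\}}
     \frac{\sigma_j^2 + \mu_j^2}{\pi_j \mu_j^2}.
\end{aligned}
```

The last line uses $(1 − \pi_V)/\pi_V + (1 − \pi_C)/\pi_C + 2 = 1/\pi_V + 1/\pi_C$. Under these assumptions the group sizes $N_j$ cancel, so the variance depends on the design only through the total number of events n.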