Keywords:

  • self-controlled case series;
  • time-dependent covariates;
  • fractional polynomials

Abstract

  1. Top of page
  2. Abstract
  3. Introduction
  4. Motivating example—assessing the relationship between rotavirus vaccination and intussusception
  5. The self-controlled case series approach
  6. The rotavirus and intussusception example
  7. Simulation study
  8. Discussion
  9. Conclusion
  10. Acknowledgements
  11. References

The self-controlled case series method is a statistical approach to investigating associations between acute outcomes and transient exposures. The method uses cases only and compares time at risk after the transient exposure with time at risk outside the exposure period within an individual, using conditional Poisson regression. The risk of outcome and exposure often varies over time, for example, with age, and it is important to allow for such time dependence within the analysis. The standard approach for modelling time-varying covariates is to split observation periods into blocks according to categories of the covariate and then to model the relationship using indicators for each category. However, this can be inefficient and can lead to problems with collinearity if the exposure occurs at approximately the same time in all individuals. As an alternative, we propose using fractional polynomials to model the relationship between the time-varying covariate and incidence of the outcome. We present the results from an analysis exploring the association between rotavirus vaccination and intussusception risk as well as a simulation study. We conclude that fractional polynomials provide a useful approach to adjusting for time-varying covariates but that it is important to explore the sensitivity of the results to the number of categories and the method of adjustment. Copyright © 2013 John Wiley & Sons, Ltd.

Introduction

Farrington [1] developed the self-controlled case series (SCCS) analysis to investigate associations between acute outcomes and transient exposures. The method focuses on individuals who have had the event of interest (cases) and compares time at risk following a transient exposure, for example, immediately following a vaccination, with time at risk outside the exposure window within each individual. Because the analysis is based on within-individual comparisons, this method controls for all known and unknown time-independent confounders at the individual level. Such an approach may increase power compared with a case–control or cohort analysis as it removes between-person variability in the estimation of the exposure effect [2]. Because the approach uses cases only, it is extremely useful for rare events such as those seen following vaccination.

Although the SCCS approach controls for time-independent confounders, it is prone to bias from uncontrolled time-varying confounders. The most common time-varying confounder is age, as the risk of the event of interest often varies with age, as does the risk of exposure. Hence, adjustment for age is invariably needed when assessing the relationship between exposure and outcome. Other time-dependent covariates that may also need to be included in the analysis model are season and calendar time. Time-dependent covariates can be accounted for in an SCCS analysis by splitting each individual's observation period into blocks according to categories of the covariate, for example, month of age. Variation in outcome with the covariate can then be modelled in a step-wise fashion using indicators for each time-group category [3]. Whitaker et al. noted that the effect of adjusting for a time-varying covariate on the inference obtained from an SCCS analysis may be sensitive to the number of categories used and suggested that it is important to vary the number of categories as part of a sensitivity analysis [3]. Despite this suggestion, to date, there has been little exploration of the effect that varying the number of categories used to model a time-dependent covariate has on inference from an SCCS analysis, with few authors carrying out such a sensitivity analysis [2].

Adjusting for time-varying confounders by dividing time at risk into categories and modelling each category separately in the analysis has some limitations. Firstly, if a large number of categories are used in order to accurately capture the relationship between the time-varying covariate and the outcome, this means estimating a large number of nuisance parameters, resulting in an inefficient analysis. In addition, if the risk window for the exposure of interest occurs at a similar time for all individuals, for example a vaccination that is given according to an age schedule, then the effect of the time-varying covariate may be highly confounded with the exposure of interest. This confounding makes it difficult to distinguish the effect of the time-varying covariate and the exposure in the time group(s) where the exposure is common, particularly when the risk window for the exposure is narrow.

In this paper, we propose an alternative, potentially more efficient, approach for modelling the effect of time-varying covariates, such as age, in an SCCS analysis. The proposed approach still requires separating time at risk into categories of the time-dependent covariate as suggested by Farrington [1] but then uses a smooth curve defined by a fractional polynomial (FP) function across these categories to model the relationship between the time-varying covariate and outcome. We compare Farrington's approach of modelling categories of time-varying covariates using indicators with the FP approach in the context of estimating the association between a currently licensed vaccine for rotavirus and intussusception, where there is a strongly age-dependent risk of disease. We present results from a case study and a simulation study based on the case study. We also explore the effect of varying the number of categories used in the confounder adjustment when estimating the exposure–outcome relationship. We begin by describing our motivating example in Section 2, before providing an overview of the SCCS analysis and the two methods of age adjustment in Section 3. In Section 4, we present the results of our case study, and in Section 5, we carry out a simulation study to further compare these two approaches. We end with a discussion of the findings in Section 6.

Motivating example—assessing the relationship between rotavirus vaccination and intussusception

Rotavirus is the most common cause of diarrhoea and dehydration in early childhood worldwide [4]. The first rhesus–human reassortant vaccine against rotavirus (Rotashield®, Wyeth) was withdrawn from distribution 9 months after its introduction into the US National Immunisation Program in 1998, as it was demonstrated in post-marketing surveillance to be associated with risk of intussusception, a rare condition in which a part of the intestine folds inward into another section of intestine causing obstruction of the bowel [5, 6]. More recently, two new rotavirus vaccines have been developed and widely licensed, a pentavalent human–bovine reassortant vaccine (RotaTeq®) and a monovalent human rotavirus vaccine (Rotarix®). Neither of these later vaccines demonstrated evidence of an association with intussusception in pre-licensure clinical safety trials [7, 8]. Although reassuring early post-marketing surveillance data were reported from the USA and Latin America [9], there have been reports from Mexico, Brazil and Australia suggesting that there may be an association between these currently licensed vaccines and intussusception [10, 11]. As these new vaccines are rolled out globally, it is important that there is ongoing surveillance with respect to intussusception.

The Australian study of rotavirus vaccines and intussusception collected data on cases of intussusception in four states across Australia and compared the observed number of cases following a vaccination with the expected number of events in the same period using an estimate of the background rate of intussusception [10]. An alternative approach is to analyse the observed cases of intussusception using the SCCS method, as was used to estimate the association between rotavirus vaccination and intussusception in Mexico and Brazil [11]. The risk of intussusception is low in the neonatal period, increasing to a peak at around 5–6 months of age before declining again [12]. Thus, when assessing the relationship between rotavirus vaccination and intussusception, it is important that this age-dependent risk is taken into account.

The self-controlled case series approach

The SCCS method requires capturing data on all individuals with the event of interest (cases) within a prespecified period, usually defined by age or calendar time [3]. Once cases have been identified, their exposure history is obtained, and their time under observation during the period of interest is separated into time within and outside a predefined exposure window or windows. The SCCS analysis is often used to examine the incidence of rare adverse events following vaccination, in which case the exposure period is the period immediately following vaccination. As with other study designs, it is important that ascertainment of cases is independent of exposure history [2].

The analysis involves comparing time under observation within the exposure window with time at risk outside of the exposure window, within each individual, using conditional Poisson regression. In standard Poisson regression, the time at risk for each individual is partitioned into finite intervals with $n_{ik}$ representing the number of events occurring in individual $i$ in risk period $k$. In the SCCS analysis, the risk periods are defined by the exposed and unexposed intervals of time. Letting $e_{ik}$ denote the length of time that individual $i$ spends in risk period $k$, we assume that the number of events for individual $i$ in risk period $k$ follows a Poisson process,

$$ n_{ik} \sim \mathrm{Poisson}(\lambda_{ik} e_{ik}) \qquad (1) $$

where $\lambda_{ik}$, the incidence rate of the event in the interval, is modelled by independent subject and exposure effects,

$$ \lambda_{ik} = \exp(\varphi_i + \beta_k) \qquad (2) $$

with $\varphi_i$ representing the baseline risk for person $i$ and $\beta_k$ the effect of risk period $k$, in our case the (vaccine) exposure. In the SCCS analysis, the underlying Poisson model gives rise to a multinomial likelihood, after conditioning on the number of events for each individual (which is at least 1 because the analysis is restricted to cases only). This can be maximised by fitting an appropriate ‘conditional fixed-effects’ Poisson regression model. The parameter of interest, the coefficient of the exposure in the Poisson regression model ($\beta_k$), is the logarithm of the relative incidence (RI) of the event in the risk period compared with nonrisk periods before and after the exposure. We refer to the detailed tutorial by Whitaker et al. for full details of the SCCS method [3].
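The conditioning step can be illustrated with a minimal Python sketch. It assumes a single exposure effect and no age adjustment, and the function name `sccs_loglik` and the toy data are hypothetical, not taken from the study. Each case contributes the multinomial probability that its events fall where they did, so the individual baseline cancels and never needs to be estimated.

```python
import math

def sccs_loglik(beta, cases):
    """Conditional (multinomial) log-likelihood for one exposure effect.

    cases: list of individuals; each individual is a list of
    (length, exposed, n_events) tuples, one per risk period.
    """
    total = 0.0
    for intervals in cases:
        # Exposure-weighted person-time over the whole observation period.
        # The individual baseline exp(phi_i) cancels from the ratio below.
        denom = sum(length * math.exp(beta * exposed)
                    for length, exposed, _ in intervals)
        for length, exposed, n_events in intervals:
            if n_events:
                weight = length * math.exp(beta * exposed)
                total += n_events * math.log(weight / denom)
    return total

# Toy data (hypothetical): 7 exposed days and 70 unexposed days per case,
# 3 cases with the event in the exposed period and 6 in the unexposed period.
toy_cases = ([[(7.0, 1, 1), (70.0, 0, 0)]] * 3
             + [[(7.0, 1, 0), (70.0, 0, 1)]] * 6)

# Crude grid search for the maximising log relative incidence.
grid = [i / 100 for i in range(0, 301)]
beta_hat = max(grid, key=lambda b: sccs_loglik(b, toy_cases))
print(math.exp(beta_hat))
```

In this balanced toy case the maximiser has a closed form, exp(beta) = (3/6) × (70/7) = 5, which the grid search should approximately recover. Real analyses would use a conditional fixed-effects Poisson routine rather than a grid.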

Indicator method for time-dependent confounder adjustment

In the presence of a time-dependent covariate, observation time is further separated into intervals according to categories of the covariate, for example month of age. The conditional Poisson regression model is then fitted with the inclusion of a set of indicator variables to adjust for the effect of the time-varying covariate. This can be achieved using the model

$$ n_{ijk} \sim \mathrm{Poisson}(\lambda_{ijk} e_{ijk}) \qquad (3) $$

$$ \lambda_{ijk} = \exp(\varphi_i + \alpha_j + \beta_k) \qquad (4) $$

where $n_{ijk}$ and $e_{ijk}$ are as described previously for individual $i$, in period $j$ of the time-varying covariate and risk period $k$, and $\alpha_j$ represents the effect of the time-varying covariate in period $j$.
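As an illustration of the data preparation this implies, the following Python sketch splits one individual's observation period into pieces within which both the age category and the exposure status are constant. The function name, the half-open day convention and the cut points in the usage line are assumptions made for the example, not details from the paper.

```python
from bisect import bisect_right

def split_observation(start, end, age_cuts, vacc_day=None, window=(1, 7)):
    """Split the days [start, end) into pieces constant in age group and exposure.

    age_cuts must be sorted; age group j is the number of cuts at or below
    the first day of the piece. The risk window covers days
    vacc_day + window[0] to vacc_day + window[1] inclusive.
    Returns a list of (first_day, end_day_exclusive, age_group, exposed).
    """
    cuts = {start, end}
    cuts.update(c for c in age_cuts if start < c < end)
    risk_start = risk_end = None
    if vacc_day is not None:
        risk_start = vacc_day + window[0]
        risk_end = vacc_day + window[1] + 1  # exclusive upper bound
        cuts.update(c for c in (risk_start, risk_end) if start < c < end)
    edges = sorted(cuts)
    pieces = []
    for lo, hi in zip(edges, edges[1:]):
        age_group = bisect_right(age_cuts, lo)
        exposed = int(vacc_day is not None and risk_start <= lo < risk_end)
        pieces.append((lo, hi, age_group, exposed))
    return pieces

# Hypothetical infant observed from day 30 to day 364, vaccinated on day 58,
# with illustrative age cut points at days 61 and 91.
for piece in split_observation(30, 365, [61, 91], vacc_day=58):
    print(piece)
```

Each returned piece corresponds to one (j, k) cell for that individual, and its length (end minus start) plays the role of the time at risk in that cell. Finer age categories simply mean more cut points and therefore more pieces per individual.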

Fractional polynomial adjustment for time-dependent confounders

Fractional polynomials are an extension of regular (integer) polynomials that enable a flexible (non-linear) curve to model an exposure–outcome relationship for a continuous exposure [13]. Under the FP approach, a model based on a linear combination of fractional power transformations of the continuous predictor is selected, with the number of terms (often only one or two) determined by a backwards elimination procedure [13, 14]. Such a model can be fitted in STATA using the mfp command [15], which incorporates a procedure for selecting the number of terms to include in the model.

When modelling the effect of a time-dependent covariate on disease risk, we would ideally model the risk of disease across the covariate as a continuous variable. This is not possible in the context of a Poisson regression formulation, in which the time-dependent covariate must be treated as constant within each specified interval of time. Instead, we propose fitting a smooth curve across the discrete categories of the covariate using an FP, rather than letting the covariate effects be estimated separately in each category as in Equation (4). The theory that underpins the SCCS approach holds under this approach as the time-varying covariate is still represented by a piecewise-constant (within categories) function. Such a model has very few parameters (only one per term in the FP equation) and fits a smooth curve to the effect of the covariate. An additional possibility with the FP approach is to split time at risk into the smallest unit of time available, for example day of age, and fit an FP to this variable in which case the covariate would essentially be continuous. Using the time-varying covariate categorised into such small intervals is not feasible when modelling the effect using indicators as this would mean estimating a large number of nuisance parameters. This is illustrated in the following examples.
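To make the FP terms concrete, here is a minimal Python sketch of the conventional two-term fractional polynomial transformation: powers are drawn from a small fixed set, power 0 denotes the natural logarithm, and a repeated power multiplies the second term by log(x). The function names, the powers and the coefficients in the usage lines are arbitrary illustrations; the powers selected in the paper's analysis are not stated in the text, and any scaling or centring applied by fitting software is ignored here.

```python
import math

# Conventional candidate powers for fractional polynomials (0 means log).
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)

def fp_term(x, p):
    """Single FP transformation of a positive value x with power p."""
    if x <= 0:
        raise ValueError("fractional polynomials require x > 0")
    return math.log(x) if p == 0 else x ** p

def fp2_terms(x, p1, p2):
    """Two-term FP basis; a repeated power gives x^p and x^p * log(x)."""
    first = fp_term(x, p1)
    second = first * math.log(x) if p1 == p2 else fp_term(x, p2)
    return first, second

def fp2_log_rate(x, p1, p2, a1, a2):
    """Contribution of the covariate to the log incidence rate."""
    t1, t2 = fp2_terms(x, p1, p2)
    return a1 * t1 + a2 * t2

# Evaluate an arbitrary FP2 curve at the mid-point of each month of age
# from 1 to <12 months; the value is held constant within each category.
midpoints = [m + 0.5 for m in range(1, 12)]
curve = [fp2_log_rate(age, 0.5, 0.5, a1=1.0, a2=-0.4) for age in midpoints]
print([round(v, 3) for v in curve])
```

However many age categories are used, the curve is described by two coefficients, whereas the indicator approach needs one coefficient per category beyond the reference.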

The rotavirus and intussusception example

As an illustrative example, we use a subset of data obtained for a national study of rotavirus vaccination in Australia [10]. For the original study, data on cases of intussusception in infants 1 to < 12 months of age were obtained from two paediatric networks that collect and record notifications of intussusception. Once identified, cases were matched to records in the Australian Childhood Immunisation Register, which captures details of all vaccinations received by infants in Australia, to obtain exposure histories (see [10] for full details). For the purpose of this paper, we restricted our analysis to the subset of data from Victoria, where the RotaTeq vaccine is given routinely at approximately 2, 4 and 6 months of age. As in the published analysis, we focus on exposure windows of 1–7 and 8–21 days following each vaccination. This analysis includes a total of 76 cases, including 19 cases that occurred within 21 days of a vaccination. Figure 1 shows a histogram of the age of the cases at the time of intussusception.

Figure 1. Histogram of age at the time of intussusception in Victoria.

We present the results from an SCCS analysis restricted to first events only, as recommended by Whitaker et al. [3], because the risk of a recurring event is likely to be different to the risk of a first event. Conditional Poisson regression models were fitted in STATA release 12 using the poisson command with a fixed effect for individual [16]. The analysis was carried out using both indicators and an FP to model the age effect, using 2-month, 1-month, 1-week and daily age categories (total of eight analyses). FPs were fitted using the mfp command [15].

Results are presented as the RI (exponential of the estimate of the regression parameter βk) and its 95% confidence interval (CI) for each exposure window following each dose of vaccine. We note that the estimated associations between the rotavirus vaccine and intussusception that we present are for illustrative purposes only. A comprehensive analysis regarding the risk of intussusception following rotavirus vaccination using national data including this dataset will be published elsewhere [17].

Table 1 shows the results of the SCCS analysis of the Victorian intussusception data. When fitting the FP model, two terms were found sufficient to model the effect of age [14, 15]. These results show some variability in the RI between the various approaches to age adjustment (Figure 2 shows the fitted age effect under both approaches when age is grouped by month). The RI estimates from the FP analyses are generally more stable across the choice of age groupings than those from the indicator approach. However, it is difficult to know which set of results should be preferred.

Table 1. Relative incidence of intussusception immediately following a rotavirus vaccine from a self-controlled case series analysis of the Victorian data on cases of intussusception (n = 76 cases, including 19 in the exposure window).

| Dose | Window (days) | Day, Indicators | Day, FP | Week, Indicators | Week, FP | Month, Indicators | Month, FP | 2 months, Indicators | 2 months, FP |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1–7 | 2.32 (0.59, 9.06) | 3.56 (1.14, 11.1) | 2.60 (0.67, 10.0) | 3.57 (1.15, 11.1) | 2.31 (0.64, 8.28) | 3.01 (1.03, 8.85) | 4.11 (1.25, 13.6) | 2.40 (0.85, 6.79) |
| 1 | 8–21 | 2.42 (0.80, 7.30) | 2.51 (1.06, 5.98) | 2.39 (0.80, 7.09) | 2.51 (1.06, 5.99) | 1.87 (0.64, 5.48) | 2.40 (1.03, 5.62) | 3.46 (1.32, 9.12) | 2.10 (0.93, 4.77) |
| 2 | 1–7 | 2.21 (0.40, 12.2) | 1.11 (0.27, 4.66) | 1.94 (0.37, 10.2) | 1.11 (0.26, 4.65) | 2.11 (0.43, 10.3) | 1.13 (0.27, 4.72) | 1.46 (0.33, 6.42) | 1.40 (0.34, 5.85) |
| 2 | 8–21 | 0.98 (0.20, 4.88) | 0.55 (0.13, 2.31) | 1.06 (0.22, 5.23) | 0.55 (0.13, 2.30) | 1.06 (0.22, 5.17) | 0.56 (0.13, 2.35) | 0.69 (0.16, 2.94) | 0.71 (0.17, 2.95) |
| 3 | 1–7 | 0.63 (0.08, 4.95) | 0.69 (0.09, 5.06) | 0.53 (0.07, 4.12) | 0.69 (0.09, 5.05) | 0.48 (0.06, 3.69) | 0.72 (0.10, 5.23) | 0.63 (0.08, 4.71) | 0.89 (0.12, 6.50) |
| 3 | 8–21 | 0.84 (0.23, 3.05) | 1.08 (0.33, 3.52) | 0.87 (0.24, 3.14) | 1.08 (0.33, 3.51) | 0.75 (0.21, 2.62) | 1.10 (0.34, 3.58) | 0.98 (0.29, 3.29) | 1.37 (0.42, 4.41) |

FP, fractional polynomial. Entries are the relative incidence (95% confidence interval); column headings give the age grouping and the method of age adjustment.

Figure 2. Age adjustment in the self-controlled case series analysis of the Victorian data on cases of intussusception (n = 76 cases, including 19 in the exposure window) using the indicator and fractional polynomial approach fitted to age in months. Note that the figure shows the incidence rate relative to the incidence rate in infants 1 to < 2 months of age.

Simulation study

In order to more formally compare the inference from the two approaches to age adjustment in an SCCS analysis, we conducted a simulation study based on the rotavirus vaccine case study.

Generation of data

We simulated 2000 datasets, each with 200 individuals, and generated events for each individual on each day of age from 1 to < 12 months using a Poisson process with the regression model:

$$ \lambda = \exp(\varphi + \alpha_1\,\mathrm{age1} + \alpha_2\,\mathrm{age2} + \beta X) \qquad (5) $$

where age1 and age2 are functions of age in days that were specified to approximate the FP terms selected by the model-fitting procedure applied to the Australian data [17]:

  • display math(6)
  • display math(7)

and X is an indicator for whether the record is in the exposure window (X = 1) or not (X = 0). For simplicity, we considered the scenario of a single dose of vaccine received by 90% of individuals. Age at vaccination was generated using a shifted gamma(2,1) distribution with the origin at 2 months of age, generating ages at vaccination between 2 months and 2 months 11 days (a reasonably narrow but realistic time window for vaccination). The exposure window of interest was defined as 1–7 days following vaccination.

We set α1 = 9 and α2 = 6, as determined by the FP model fitted to the national data, and β = log(4) to represent a fourfold increase in the risk of disease in the exposure window compared with nonexposed time. The value of φ was chosen so that the Poisson process produced predominantly one event in individuals who had an event (as opposed to > 1 event) and approximately 100 cases in each simulated dataset. Following the case study, analysis was restricted to first events only. Simulated datasets in which there were no cases generated within the exposure window were not included in the analysis as these cannot be used to estimate the exposure risk in the conditional SCCS analysis.
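The exposure part of this data-generating process can be sketched in Python as follows. This is an illustration under stated assumptions: 2 months of age is taken to be day 61, the gamma offset is taken to be in days, and the function names are hypothetical. The event-generation step is deliberately omitted because the age functions age1 and age2 are not reproduced in the text.

```python
import math
import random

def simulate_vaccination_day(rng, origin_day=61, coverage=0.9):
    """Age at vaccination in days, or None for an unvaccinated individual.

    origin_day=61 is an assumed day count for 2 months of age.
    """
    if rng.random() >= coverage:
        return None
    # Shifted gamma(2, 1): shape 2 and scale 1 (rate and scale coincide at 1).
    return origin_day + rng.gammavariate(2.0, 1.0)

def exposure_indicator(day, vacc_day, window=(1, 7)):
    """X = 1 if this day of age falls 1-7 days after the day of vaccination."""
    if vacc_day is None:
        return 0
    days_since = day - math.floor(vacc_day)
    return int(window[0] <= days_since <= window[1])

rng = random.Random(2013)  # fixed seed so the sketch is reproducible
vacc_days = [simulate_vaccination_day(rng) for _ in range(200)]
vaccinated = [d for d in vacc_days if d is not None]
print(len(vaccinated), min(vaccinated), max(vaccinated))
```

A gamma(2,1) offset exceeds 11 with probability of roughly 0.0002, which is in line with the stated range of 2 months to 2 months 11 days.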

Additional scenarios

For further investigations, we varied the earlier scenario as follows:

Scenario 2: Using a 1–21-day exposure window, to increase the confounding between age and vaccination.

Scenario 3: Generating datasets with approximately 1000 cases, to increase the amount of information on the age effect.

Scenario 4: Extending the range of ages at vaccination by generating age at vaccination using a shifted gamma(8,2) distribution with origin at 6 weeks, to give a range of ages at vaccination between 6 weeks and 3 months and so reduce confounding between age and vaccination.

Analysis

The simulated datasets were analysed using the SCCS method adjusting for age using the eight approaches described in the previous section. Inference regarding the exposure–outcome relationship from each analysis model applied to each of the 2000 simulated datasets under the four scenarios was assessed by comparing estimated RI for the exposure to the true value of 4 used to generate the data. We report the average (geometric mean) of the RI, as well as the average bias (difference between the average log(RI) across the 2000 simulations and the true value, log(4)), average (estimated) standard error (SE), average standardised bias (calculated as the bias divided by the average SE), mean squared error (MSE) and coverage (defined as the proportion of nominal 95% CIs that include the true value of log(4)) from each analysis approach applied to each scenario. On the basis of the simulation sample size of 2000, the Monte Carlo (MC) error of the estimates can be calculated as inline image, and the estimated coverage should lie in the range 94–96% (with 95% probability), if the true coverage is equal to the nominal 95%. We also report the number of simulated datasets that gave an unrealistically large estimate of the RI, arbitrarily defined as an RI > 10, to give an idea of the extreme tail of the sampling distribution of the estimator from these different approaches. Occasionally, it was not possible to obtain parameter estimates from all analysis models because of a lack of convergence. Results from such datasets were excluded from all summaries for consistency.
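The performance measures described above can be computed along the following lines. This Python sketch is an illustration only: the function and key names are hypothetical, and it takes as input the estimated log(RI) and its standard error from each simulated dataset. The second function reproduces the quoted 94–96% range for the estimated coverage from the binomial standard error with 2000 simulations.

```python
import math

def summarise(log_ri, se, true_log_ri=math.log(4), z=1.96):
    """Summaries of simulation results, computed on the log(RI) scale."""
    n = len(log_ri)
    mean_log = sum(log_ri) / n
    bias = mean_log - true_log_ri
    avg_se = sum(se) / n
    mse = sum((b - true_log_ri) ** 2 for b in log_ri) / n
    # A nominal 95% CI covers the truth if it lies within z standard errors.
    covered = sum(abs(b - true_log_ri) <= z * s for b, s in zip(log_ri, se))
    return {
        "geometric_mean_ri": math.exp(mean_log),
        "bias": bias,
        "average_se": avg_se,
        "standardised_bias": bias / avg_se,
        "mse": mse,
        "coverage": covered / n,
        "n_ri_above_10": sum(b > math.log(10) for b in log_ri),
    }

def coverage_mc_range(n_sims=2000, nominal=0.95, z=1.96):
    """Range in which estimated coverage should fall if true coverage is nominal."""
    half_width = z * math.sqrt(nominal * (1 - nominal) / n_sims)
    return nominal - half_width, nominal + half_width

low, high = coverage_mc_range()
print(round(low, 3), round(high, 3))
```

With 2000 simulations the half-width is 1.96 × sqrt(0.95 × 0.05 / 2000), or about 0.0096, so estimated coverage below 0.94 or above 0.96 is unlikely to be Monte Carlo noise alone.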

Results

Scenario 1: approximately 100 cases, 7-day exposure window

Table 2 shows the results from the SCCS analysis of the first scenario. When age was broken down into daily or weekly age groups, modelling age using indicators grossly overestimated the RI in around half of the simulated datasets. In contrast, there was less bias on average resulting in a drastically reduced MSE, when the age effect was modelled using an FP (Figure 3a). We note that although the standardised bias appears to be smaller when the age effect was modelled using indicators rather than an FP when applied to daily, weekly or monthly age groups, this was artificially reduced by the overestimation of the SE when using indicators. These problems with the indicator approach were reduced when either monthly or bimonthly age categories were used, although even with the monthly age groupings, the MSE from the indicator analysis was inflated compared with that from the FP analysis, primarily because of the inflated variance. The FP analyses had a reasonable bias, MSE and coverage across all analysis models, with fewer unrealistically large estimates of the RI compared with the indicator approach irrespective of the age categorisation. Of note, there was a slight underestimation of the RI (negative bias) from the FP model when fitted to monthly age groups.

Table 2. Results from a self-controlled case series analysis of simulated data with a 7-day exposure window with an average of 97 cases including an average of five cases in the exposure window (Scenario 1, n = 1987 simulated datasets *).

| Measure | Day, Indicators | Day, FP | Week, Indicators | Week, FP | Month, Indicators | Month, FP | 2 months, Indicators | 2 months, FP |
|---|---|---|---|---|---|---|---|---|
| Average RI of exposure + | 76.3 | 3.49 | 8.66 | 3.48 | 3.34 | 2.86 | 4.43 | 3.79 |
| Average bias ⊥ | 2.95 | −0.14 | 0.77 | −0.14 | −0.18 | −0.34 | 0.10 | −0.05 |
| Average standardised bias ⊥ | −0.17 | −0.14 | 0.01 | −0.15 | −0.30 | −0.51 | 0.27 | −0.02 |
| Average standard error ⊥ | 432 | 0.55 | 61.4 | 0.55 | 2.27 | 0.52 | 0.59 | 0.56 |
| Mean squared error ⊥ | 49.6 | 0.39 | 10.5 | 0.39 | 0.82 | 0.42 | 0.38 | 0.46 |
| Coverage of 95% CI ⊥ | 0.97 | 0.96 | 0.96 | 0.96 | 0.97 | 0.96 | 0.96 | 0.93 |
| No. of simulations where RI > 10 | 663 | 53 | 480 | 48 | 90 | 10 | 149 | 136 |

CI, confidence interval; FP, fractional polynomial; RI, relative incidence.
* Excluding seven simulated datasets where there were no cases in the exposure window and six datasets where the analysis adjusting for age using indicators fitted to daily age groupings did not reach convergence, and hence, an estimate of the RI could not be obtained.
+ Presented as a geometric mean.
⊥ Summary on the log scale.

Figure 3. Mean squared errors of the relative incidence using different approaches to model the age effect. Black bars are from models with the age adjustment using indicators, and grey bars are from the age adjustment using fractional polynomials. (a) Scenario 1: approximately 100 cases, 1- to 7-day exposure window. (b) Scenario 2: approximately 100 cases, 1- to 21-day exposure window. (c) Scenario 3: approximately 1000 cases, 1- to 7-day exposure window. (d) Scenario 4: approximately 100 cases, 1- to 7-day exposure window, with an extended range of ages at vaccination.

Scenario 2: Approximately 100 cases, 21-day exposure window

There was a similar pattern of results when we expanded the exposure window to 21 days, although in this scenario, there was a more pronounced improvement in bias, SE and MSE across all of the age classifications for the FP analysis compared with the indicator approach (Table 3, Figure 3b). Of note, there was under-coverage of the 95% CIs for the FP analyses, particularly when fitted to the coarser age groupings.

Table 3. Results from a self-controlled case series analysis of simulated data with a 21-day exposure window with an average of 102 cases including an average of 17 cases in the exposure window (Scenario 2, n = 1999 simulated datasets *).

| Measure | Day, Indicators | Day, FP | Week, Indicators | Week, FP | Month, Indicators | Month, FP | 2 months, Indicators | 2 months, FP |
|---|---|---|---|---|---|---|---|---|
| Average RI of exposure + | 18.0 | 3.71 | 6.06 | 3.71 | 7.66 | 3.45 | 6.29 | 4.95 |
| Average bias ⊥ | 1.50 | −0.07 | 0.42 | −0.08 | 0.65 | −0.15 | 0.45 | 0.21 |
| Average standardised bias ⊥ | −0.11 | −0.25 | −0.01 | −0.25 | −0.23 | −0.47 | 0.80 | 0.22 |
| Average standard error ⊥ | 170 | 0.34 | 24.3 | 0.34 | 41.1 | 0.33 | 1.53 | 0.42 |
| Mean squared error ⊥ | 24.5 | 0.15 | 4.93 | 0.15 | 10.2 | 0.17 | 0.70 | 0.46 |
| Coverage of 95% CI ⊥ | 0.96 | 0.92 | 0.96 | 0.91 | 0.96 | 0.89 | 0.91 | 0.83 |
| No. of simulations where RI > 10 | 497 | 14 | 361 | 14 | 261 | 9 | 320 | 297 |

CI, confidence interval; FP, fractional polynomial; RI, relative incidence.
* Excluding one simulated dataset where the analysis adjusting for age using indicators fitted to weekly age groupings did not reach convergence, and hence, an estimate of the RI could not be obtained.
+ Presented as a geometric mean.
⊥ Summary on the log scale.

Scenario 3: approximately 1000 cases, 7-day exposure window

When we increased the number of observations 10-fold, there were many fewer datasets in which there was overestimation of the RI with the indicator adjustment, with similar bias and MSE from both the indicator and FP models irrespective of the age categorisation (Table 4, Figure 3c). As seen with the previous scenarios, there was generally improved precision (reduced SE) when age was modelled using an FP compared with using indicators. However, in this much larger dataset, there was gross under-coverage of the 95% CI when age was adjusted for using monthly or bimonthly age groupings using both methods of age adjustment.

Table 4. Results from a self-controlled case series analysis of simulated data with a 7-day exposure window with an average of 967 cases including an average of 52 cases in the exposure window (Scenario 3, n = 2000 simulated datasets).

| Measure | Day, Indicators | Day, FP | Week, Indicators | Week, FP | Month, Indicators | Month, FP | 2 months, Indicators | 2 months, FP |
|---|---|---|---|---|---|---|---|---|
| Average RI of exposure + | 4.15 | 3.88 | 4.15 | 3.87 | 3.27 | 3.19 | 4.59 | 4.61 |
| Average bias ⊥ | 0.04 | −0.03 | 0.04 | −0.03 | −0.20 | −0.23 | 0.14 | 0.14 |
| Average standardised bias ⊥ | 0.03 | −0.15 | 0.07 | −0.16 | −1.05 | −1.44 | 0.83 | 0.87 |
| Average standard error ⊥ | 0.39 | 0.16 | 0.30 | 0.16 | 0.19 | 0.15 | 0.17 | 0.17 |
| Mean squared error ⊥ | 0.16 | 0.03 | 0.09 | 0.03 | 0.08 | 0.08 | 0.05 | 0.05 |
| Coverage of 95% CI ⊥ | 0.94 | 0.95 | 0.95 | 0.95 | 0.81 | 0.69 | 0.86 | 0.85 |
| No. of simulations where RI > 10 | 48 | 0 | 15 | 0 | 0 | 0 | 0 | 0 |

CI, confidence interval; FP, fractional polynomial; RI, relative incidence.
+ Presented as a geometric mean.
⊥ Summary on the log scale.

Scenario 4: approximately 100 cases, 7-day exposure window, with an extended range of ages at vaccination

Finally, when we expanded the range of ages at the time of vaccination so that there was less confounding between the age and exposure effects and returned to the context of approximately 100 cases, we saw a slightly different pattern of results (Table 5). In this scenario, there was slightly larger bias in the RI from the FP adjustment than with the indicator approach, although these biases were still reasonably small when standardised [18]. Again, there was improved precision from the FP models compared with the indicator approach for all age categorisations, with a reduced MSE from the FP model when age was adjusted for using daily or weekly age groups but a slightly increased MSE from the FP model when age was adjusted for using bimonthly age groups (Figure 3d).

Table 5. Results from a self-controlled case series analysis of simulated data with a 7-day exposure window and an extended range of ages at vaccination. Simulations have an average of 97 cases per dataset, including an average of five cases in the exposure window (Scenario 4, n = 181 simulated datasets*).

| Measure | Day, indicators | Day, FP | Week, indicators | Week, FP | Month, indicators | Month, FP | 2 months, indicators | 2 months, FP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Average RI of exposure + | 4.27 | 3.43 | 4.05 | 3.41 | 3.72 | 3.16 | 4.10 | 3.51 |
| Average bias ⊥ | 0.07 | −0.15 | 0.01 | −0.16 | −0.07 | −0.23 | 0.03 | −0.13 |
| Average standardised bias ⊥ | 0.02 | −0.15 | 0.00 | −0.16 | −0.02 | −0.28 | 0.16 | −0.14 |
| Average standard error ⊥ | 4.62 | 0.57 | 1.28 | 0.56 | 0.61 | 0.54 | 0.60 | 0.57 |
| Mean squared error ⊥ | 1.32 | 0.42 | 0.74 | 0.42 | 0.41 | 0.41 | 0.39 | 0.49 |
| Coverage of 95% CI ⊥ | 0.97 | 0.96 | 0.97 | 0.96 | 0.97 | 0.96 | 0.96 | 0.93 |

No. of simulations with RI > 1025161220619416119103

  • * Excluding 14 simulated datasets where there were no cases in the exposure window and five datasets where it was not possible to estimate the relative incidence and its 95% confidence interval from all six analysis models because of convergence problems.

  • CI, confidence interval; FP, fractional polynomial; RI, relative incidence.

  • + Presented as a geometric mean.

  • ⊥ Summary on the log scale.
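
The summary rows of Table 5 follow the usual conventions for simulation studies. As a minimal sketch, assuming each simulated dataset yields an estimated log RI and its standard error, and assuming Wald-type 95% confidence intervals on the log scale, most of these measures could be computed as follows. The function name `summarise_simulations`, its arguments, and the true RI passed to it are illustrative assumptions and are not taken from the analysis. The standardised bias is left out because its definition is given in [18] rather than in this section.

```python
import math


def summarise_simulations(log_ri_hats, std_errors, true_ri, z=1.96):
    """Summarise simulated log relative incidence (RI) estimates.

    Assumes Wald-type confidence intervals on the log scale; this is an
    assumption of the sketch, not a detail confirmed by the paper.
    """
    n = len(log_ri_hats)
    true_log_ri = math.log(true_ri)
    errors = [est - true_log_ri for est in log_ri_hats]
    covered = [
        est - z * se <= true_log_ri <= est + z * se
        for est, se in zip(log_ri_hats, std_errors)
    ]
    return {
        # Geometric mean of the RI = exp(mean of the log RI estimates)
        "geometric_mean_ri": math.exp(sum(log_ri_hats) / n),
        # Bias and mean squared error, both on the log scale
        "bias": sum(errors) / n,
        "mse": sum(e * e for e in errors) / n,
        "average_se": sum(std_errors) / n,
        # Proportion of intervals that contain the true log RI
        "coverage": sum(covered) / n,
    }


# Hypothetical estimates from three simulated datasets, with an assumed true RI
example = summarise_simulations(
    log_ri_hats=[math.log(3.0), math.log(4.0), math.log(5.5)],
    std_errors=[0.55, 0.60, 0.58],
    true_ri=4.0,
)
print(example)
```

Working on the log scale means that an estimate of twice the true RI and an estimate of half the true RI contribute equally to the bias and the mean squared error.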

Discussion

In this paper, we have presented FPs as an alternative approach to the standard indicator method proposed by Farrington for modelling the effect of a time-dependent covariate, such as age, in an SCCS analysis. We also explored the sensitivity of these approaches to the number of categories used in the adjustment. The results demonstrate that the FP approach is a useful alternative for modelling the effect of a time-varying covariate but that both approaches are sensitive to the categorisation of the covariate.

The results from this simulation study suggest that when the number of cases is small, indicators can only be used to model the effect of a time-dependent covariate if the covariate is grouped into broad categories. With a fine categorisation of age, such as daily or weekly groupings, the indicator approach produced severely biased estimates of the exposure effect in a number of simulated datasets, presumably because of the large number of parameters being estimated. In contrast, FPs gave reasonably accurate estimates of the exposure effect when applied to daily and weekly categorisations of age, with some evidence of larger bias and under-coverage with the coarser age groupings. In the FP model, the larger number of points (categories) provided by narrow age bands allows the flexibility of the FP to be exploited, hence the reasonably accurate parameter estimates with the fine age categorisation. When there are few categories, the FP is artificially constrained and its performance deteriorates, as seen in the increased bias and under-coverage when age is modelled using monthly or bimonthly age groups. Because FPs can be applied to narrow categories of a time-dependent covariate even when the number of cases is small, this approach can capture the effect of the covariate more accurately than the indicator approach. In particular, we have demonstrated that it is possible to fit an FP to age in days, the finest categorisation possible in this example, where times were recorded in days. Hence, although the FP approach treats age as categorical, age is essentially being modelled as a continuous variable.
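
To make concrete what fitting an FP to age in days involves, the sketch below builds the FP transformations of a single age value using the standard convention of Royston and Altman [13]: powers are drawn from the set {−2, −1, −0.5, 0, 0.5, 1, 2, 3}, power 0 denotes the natural logarithm, and a repeated power is multiplied by a further log term. The function name `fp_terms` and the example age are illustrative assumptions. This is not the authors' code, and in practice the powers would be selected by software such as `mfp`.

```python
import math

# Conventional set of candidate powers for fractional polynomials
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_terms(x, powers):
    """Return the FP basis terms for a positive value x.

    `powers` should be given in non-decreasing order. Power 0 is read as
    log(x); each repeat of the previous power multiplies the previous
    term by log(x).
    """
    if x <= 0:
        raise ValueError("fractional polynomials require x > 0")
    terms = []
    previous_power = None
    for p in powers:
        if p == previous_power:
            # Repeated power: previous term times log(x)
            term = terms[-1] * math.log(x)
        else:
            term = math.log(x) if p == 0 else x ** p
        terms.append(term)
        previous_power = p
    return terms


# Hypothetical example: age of 120 days, second-degree FP with powers (0.5, 0.5)
print(fp_terms(120.0, (0.5, 0.5)))
```

Because the transformations are smooth functions of age, the same one or two coefficients describe the age effect whether age is supplied in days, weeks or months. Only the number of distinct points at which the curve is evaluated changes.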

When the number of cases was increased 10-fold (scenario 3), the most accurate inference for the exposure effect with both methods of age adjustment was obtained when age was categorised into narrow age bands. Interestingly, the inference from both approaches was biased when monthly or bimonthly age categories were used to model the age effect in this larger dataset, presumably because such broad categories model the age effect inaccurately.

Across the majority of analyses presented in this paper, using an FP to model the age effect resulted in a more precise RI for the exposure than using separate indicators. The gain in precision was most prominent when there was a fine categorisation of age. This is most likely due to the smaller number of parameters to be estimated with an FP, generally just one or two, compared with one parameter per age group stratum, less one for the reference stratum, with the indicator approach.
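
The difference in the number of parameters can be written out explicitly. The notation below is assumed for illustration and may differ from that used earlier in the paper: $a$ denotes age, $K$ the number of age groups, and $\lambda(a)$ the age-specific relative incidence. The FP form follows the standard definition in [13].

```latex
% Indicator approach: one parameter per age group, less the reference group
\log \lambda(a) = \sum_{k=2}^{K} \alpha_k \, I(a \in \text{age group } k)
\qquad (K - 1 \text{ parameters})

% Second-degree fractional polynomial: two regression coefficients
\log \lambda(a) = \beta_1 a^{p_1} + \beta_2 a^{p_2},
\qquad p_1, p_2 \in \{-2, -1, -0.5, 0, 0.5, 1, 2, 3\}

% Conventions: a^{0} is read as \log a, and when p_1 = p_2 the second
% term becomes \beta_2 a^{p_1} \log a
```

As a hypothetical illustration, daily age bands over a 300-day observation window would require 299 indicator parameters, whereas the FP still requires only the two coefficients above, with the powers chosen from the fixed candidate set.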

The findings from the current study also suggest that FPs are more robust to confounding between exposure and age than the indicator approach. In scenarios 1 and 2, there were many more simulated datasets in which there was an unidentifiable or unrealistically large estimate of the exposure effect from the indicator analysis compared with the FP approach when applied to days and weeks of age. This was most likely due to the difficulty of distinguishing the exposure and age effects around the time of exposure using the indicator approach. When using an FP, the relationship between the time-dependent covariate and outcome is determined by events across the whole range of the covariate, so it can accommodate areas in the range of the covariate where there is strong confounding between these two effects, that is, around the time of the exposure.

Aside from the caution required when FPs are fitted to few categories, the main disadvantage of using an FP to model the effect of a time-varying covariate is that it is more complex than the indicator approach and requires some understanding of FPs. However, the method is readily accessible in standard software, for example through the mfp command in Stata [15]. We note that in the current paper we demonstrate the use of FPs to model the effect of a time-dependent covariate for a single, but realistic, specification of the age effect. Further examples would be useful to provide a fuller evaluation of this approach.

Alternative methods of adjusting for time-dependent covariates have been suggested in the literature. Farrington and Whitaker presented a semi-parametric approach to adjust for age, in which the age effects are left unspecified [19]. Such an approach offers considerably more flexibility than the indicator or FP approaches presented here, although it is not readily available in standard statistical packages and involves the estimation of a larger number of parameters. In particular, because this approach models a separate age effect for each event, it is not suitable for use in large studies [19]. Another paper used a continuous adjustment for age [20], but it was not clear how the authors had carried out this analysis, and Weldeselassie et al. questioned whether a continuous adjustment had in fact been used at all [2]. In the current paper, we focus on using FPs to adjust for age because this approach is readily available in Stata. However, in principle, the proposed method of fitting a smooth curve across categories of a time-dependent covariate should also work with other smoothing methods, for example splines, which are readily available in other statistical packages such as R [21]. Although splines provide a potentially useful alternative, further research would be needed on the details of their application in this context, for example, on the specification of the knots required for the spline fitting.

When carrying out an SCCS analysis, Whitaker et al. recommended a sensitivity analysis of the number of categories used in the adjustment for time-dependent covariates [3]. However, a recent review of the use of SCCS in the literature noted that only 2 of the 33 studies that adjusted for age reported a sensitivity analysis that varied the number of age categories used in the analysis [2]. One paper, by Hocine et al. [22], exploring the relationship between hepatitis B vaccine and first central nervous system demyelinating events, compared four methods of age adjustment: no adjustment, adjustment for 20 age categories, adjustment for 48 age categories and the semi-parametric model. They concluded that it is essential to adjust and that the semi-parametric approach is the most efficient in terms of providing the narrowest confidence interval for the exposure estimate. The second paper, by Smeeth et al. [23], exploring the risk of myocardial infarction and stroke after acute infection or vaccination, adjusted for age using 5-year age groups in the primary analysis and noted that there was little difference in their results when age was adjusted for using 2-year age groups. The findings in the current paper reiterate the importance of carrying out a sensitivity analysis of the number of categories used in the SCCS analysis but also highlight the importance of considering the sensitivity to the method used to model the time-dependent covariate(s).
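
A sensitivity analysis of this kind starts from the same step for every categorisation: each case's observation period is split into intervals that are homogeneous in age band and exposure status. The sketch below does this for one hypothetical case under several band widths. The function name `split_observation`, the half-open day convention, the placement of the risk window, and all numbers are assumptions made for illustration rather than details of the analyses in this paper.

```python
def split_observation(start, end, vacc_day, window, band_width):
    """Split ages [start, end) in days into (age_band, exposed, n_days) rows.

    The risk window is taken as [vacc_day, vacc_day + window), and age
    bands are [0, w), [w, 2w), ... for band width w. Both conventions
    are assumptions of this sketch.
    """
    # Collect the period limits plus every cut point strictly inside them
    cuts = {start, end}
    first_band_edge = (start // band_width + 1) * band_width
    cuts.update(range(first_band_edge, end, band_width))
    for edge in (vacc_day, vacc_day + window):
        if start < edge < end:
            cuts.add(edge)
    ordered = sorted(cuts)

    intervals = []
    for lo, hi in zip(ordered, ordered[1:]):
        exposed = vacc_day <= lo < vacc_day + window
        intervals.append((lo // band_width, exposed, hi - lo))
    return intervals


# Hypothetical case observed from day 30 to day 270, vaccinated on day 65,
# with a 7-day risk window; band widths of 1, 7, 30 and 60 days
for width in (1, 7, 30, 60):
    rows = split_observation(30, 270, 65, 7, width)
    n_bands = len({band for band, _, _ in rows})
    print(width, n_bands, len(rows))
```

The number of distinct age bands printed for each width is the number of strata the indicator approach would need to model, which is why the choice of width matters far more for that approach than for an FP.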

Conclusion

In summary, this study demonstrates that FPs provide a useful approach to adjusting for time-dependent covariates when carrying out an SCCS analysis. Using this parametric approach can be more efficient than using indicators and can lead to more reliable inference for the exposure–outcome relationship, particularly when the number of cases is small. However, if an FP is used, it is important that the time at risk is adjusted for using a large number of categories. When carrying out an SCCS analysis, it is important to explore the sensitivity of the results to the number and width of categories used in the analysis, and to the method of adjustment, to ensure that the results are reliable.

Acknowledgements

This work was supported by a Centre of Research Excellence grant awarded to JBC and colleagues from the Australian National Health and Medical Research Council. The authors also acknowledge support provided by the Murdoch Childrens Research Institute through the Victorian Government's Operational Infrastructure Support Program.

References

  1. Farrington C. Relative incidence estimation from case series for vaccine safety evaluation. Biometrics 1995; 51:228–235.
  2. Weldeselassie YG, Whitaker HJ, Farrington CP. Use of self-controlled case series method in vaccine safety studies: review and recommendations for best practice. Epidemiology and Infection 2011; 139:1805–1817.
  3. Whitaker HJ, Farrington CP, Spiessens B, Musonda P. Tutorial in biostatistics: the self-controlled case series method. Statistics in Medicine 2006; 25(10):1768–1797.
  4. Dennehy PH. Transmission of rotavirus and other enteric pathogens in the home. Journal of Pediatric Infectious Diseases 2000; 19:S103–S105.
  5. Centers for Disease Control and Prevention. Withdrawal of rotavirus vaccine recommendation. MMWR. Morbidity and Mortality Weekly Report 1999; 48:1007.
  6. Murphy T, Gargiullo P, Massoudi M, Nelson D, Jumaan A, Okoro C, Zanardi L, Setia S, Fair E, LeBaron C, Wharton M, Livengood J, the Rotavirus Intussusception Investigation Team. Intussusception among infants given an oral rotavirus vaccine. New England Journal of Medicine 2001; 344:564–572.
  7. Ruiz-Palacios G, Pérez-Schael I, Velázquez F, Abate H, Breuer T, Clemens S, Cheuvart B, Espinoza F, Gillard P, Innis BL, Cervantes Y, Linhares AC, López P, Macías-Parra M, Ortega-Barría E, Richardson V, Rivera-Medina DM, Rivera L, Salinas B, Pavía-Ruz N, Salmerón J, Rüttimann R, Tinoco JC, Rubio P, Nuñez E, Guerrero ML, Yarzábal JP, Damaso S, Tornieporth N, Sáez-Llorens X, Vergara RF, Vesikari T, Bouckenooghe A, Clemens R, De Vos B, O'Ryan M, the Human Rotavirus Vaccine Study Group. Safety and efficacy of an attenuated vaccine against severe rotavirus gastroenteritis. New England Journal of Medicine 2006; 354:11–22.
  8. Vesikari T, Matson DO, Dennehy P, Van Damme P, Santosham M, Rodriguez Z, Dallas MJ, Heyse JF, Goveia MG, Black SB, Shinefield HR, Christie CD, Ylitalo S, Itzler R, Coia ML, Onorato MT, Adeyi BA, Marshall GS, Gothefors L, Campens D, Karvonen A, Watt JP, O'Brien KL, DiNubile MJ, Clark HF, Boslego JW, Offit PA, Heaton PM, Rotavirus Efficacy and Safety Trial (REST) Study Team. Safety and efficacy of a pentavalent human-bovine (WC3) reassortant rotavirus vaccine. New England Journal of Medicine 2006; 354:23–33.
  9. Haber P, Patel M, Izurieta HS, Baggs J, Gargiullo P, Weintraub E, Cortese M, Braun MM, Belongia EA, Miller E, Ball R, Iskander J, Parashar UD. Postlicensure monitoring of intussusception after RotaTeq vaccination in the United States, February 1, 2006, to September 25, 2007. Pediatrics 2008; 121:1206–1212.
  10. Buttery JP, Danchin MH, Lee KJ, Carlin JB, McIntyre PB, Elliott EJ, Booy R, Bines JE, PAEDS/APSU Study Group. Intussusception following rotavirus vaccine administration: post-marketing surveillance in the National Immunization Program in Australia. Vaccine 2011; 29:3061–3066.
  11. Patel MM, López-Collada VR, Bulhões MM, De Oliveira LH, Márquez AB, Flannery B, Esparza-Aguilar M, Renoiner EIM, Luna-Cruz ME, Sato HK, Hernández-Hernández LDC, Toledo-Cortina G, Cerón-Rodríguez M, Osnaya-Romero N, Martínez-Alcazar M, Aguinaga-Villasenor RG, Plascencia-Hernández A, Fojaco-González F, Hernández-Peredo Rezk G, Gutierrez-Ramírez SF, Dorame-Castillo R, Tinajero-Pizano R, Mercado-Villegas B, Barbosa MR, Maluf EMC, Ferreira LB, de Carvalho FM, dos Santos AR, Cesar ED, de Oliveira MEP, Silva CLO, Cortes MdA, Matus CR, Tate J, Gargiullo P, Parashar UD. Intussusception risk and health benefits of rotavirus vaccination in Mexico and Brazil. The New England Journal of Medicine 2011; 364:2283–2292.
  12. Justice F, Carlin J, Bines J. Changing epidemiology of intussusception in Australia. Journal of Paediatrics and Child Health 2005; 41:475–478.
  13. Royston P, Altman DG. Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling (with discussion). Applied Statistics 1994; 43:429–467.
  14. Sauerbrei W, Royston P. Multivariable Model-Building: A Pragmatic Approach to Regression Analysis Based on Fractional Polynomials for Modelling Continuous Variables. Wiley: Chichester, UK, 2008.
  15. Royston P, Ambler G. Multivariable fractional polynomials. Stata Technical Bulletin 1998; 43:24–32.
  16. StataCorp. Stata Statistical Software: Release 12. StataCorp LP: College Station, TX, 2011.
  17. Carlin JB, Macartney K, Lee KJ, Quinn H, Buttery J, Bines J, McIntyre P. Intussusception risk and disease prevention associated with rotavirus vaccines in Australia's national immunisation program. Clinical Infectious Diseases 2012. In press.
  18. Collins L, Schafer J, Kam C. A comparison of inclusive and restrictive strategies in modern missing data procedures. Psychological Methods 2001; 6:330–351.
  19. Farrington C, Whitaker H. Semiparametric analysis of case series data. Journal of the Royal Statistical Society, Series C 2006; 55:553–594 (with discussion).
  20. Hughes R, Charlton J, Latinovic R, Gulliford M. No association between immunization and Guillain-Barré syndrome in the United Kingdom, 1992 to 2000. Archives of Internal Medicine 2006; 166:1301–1304.
  21. De Boor C. A Practical Guide to Splines (Revised edition). Springer: New York, 2001.
  22. Hocine M, Farrington C, Touze E, Whitaker H, Fourrier A, Moreau T, Tubert-Bitter P. Hepatitis B vaccination and first central nervous system demyelinating events: reanalysis of a case-control study using the self-controlled case series method. Vaccine 2007; 25:5938–5943.
  23. Smeeth L, Thomas SL, Hall AJ, Hubbard R, Farrington P, Vallance P. Risk of myocardial infarction and stroke after acute infection or vaccination. New England Journal of Medicine 2004; 351:2611–2618.