Abstract
 Abstract
 Introduction
 Motivating example—assessing the relationship between rotavirus vaccination and intussusception
 The self-controlled case series approach
 The rotavirus and intussusception example
 Simulation study
 Discussion
 Conclusion
 Acknowledgements
 References
The self-controlled case series method is a statistical approach to investigating associations between acute outcomes and transient exposures. The method uses cases only and compares time at risk after the transient exposure with time at risk outside the exposure period within an individual, using conditional Poisson regression. The risk of outcome and exposure often varies over time, for example, with age, and it is important to allow for such time dependence within the analysis. The standard approach for modelling time-varying covariates is to split observation periods into blocks according to categories of the covariate and then to model the relationship using indicators for each category. However, this can be inefficient and can lead to problems with collinearity if the exposure occurs at approximately the same time in all individuals. As an alternative, we propose using fractional polynomials to model the relationship between the time-varying covariate and incidence of the outcome. We present the results from an analysis exploring the association between rotavirus vaccination and intussusception risk as well as a simulation study. We conclude that fractional polynomials provide a useful approach to adjusting for time-varying covariates but that it is important to explore the sensitivity of the results to the number of categories and the method of adjustment. Copyright © 2013 John Wiley & Sons, Ltd.
Introduction
Farrington [1] developed the self-controlled case series (SCCS) analysis to investigate associations between acute outcomes and transient exposures. The method focuses on individuals who have had the event of interest (cases) and compares time at risk following a transient exposure, for example, immediately following a vaccination, with time at risk outside the exposure window within each individual. Because the analysis is based on within-individual comparisons, this method controls for all known and unknown time-independent confounders at the individual level. Such an approach may increase power compared with a case–control or cohort analysis as it removes between-person variability in the estimation of the exposure effect [2]. Because the approach uses cases only, it is particularly useful for rare events such as those seen following vaccination.
Although the SCCS approach controls for time-independent confounders, it is prone to bias from uncontrolled time-varying confounders. The most common time-varying confounder is age, as the risk of the event of interest often varies with age, as does the risk of exposure. Hence, adjustment for age is invariably needed when assessing the relationship between exposure and outcome. Other time-dependent covariates that may also need to be included in the analysis model are season and calendar time. Time-dependent covariates can be accounted for in an SCCS analysis by splitting each individual's observation period into blocks according to categories of the covariate, for example, month of age. Variation in outcome with the covariate can then be modelled in a stepwise fashion using indicators for each time-group category [3]. Whitaker et al. noted that the effect of adjusting for a time-varying covariate on the inference obtained from an SCCS analysis may be sensitive to the number of categories used and suggested that it is important to vary the number of categories as part of a sensitivity analysis [3]. Despite this suggestion, to date, there has been little exploration of the effect that varying the number of categories used to model a time-dependent covariate has on inference from an SCCS analysis, with few authors carrying out such a sensitivity analysis [2].
Adjusting for time-varying confounders by dividing time at risk into categories and modelling each category separately in the analysis has some limitations. Firstly, if a large number of categories is used in order to capture the relationship between the time-varying covariate and the outcome accurately, a large number of nuisance parameters must be estimated, resulting in an inefficient analysis. In addition, if the risk window for the exposure of interest occurs at a similar time for all individuals, for example a vaccination that is given according to an age schedule, then the effect of the time-varying covariate may be highly confounded with the exposure of interest. This confounding makes it difficult to distinguish the effect of the time-varying covariate from that of the exposure in the time group(s) where the exposure is common, particularly when the risk window for the exposure is narrow.
In this paper, we propose an alternative, potentially more efficient, approach for modelling the effect of time-varying covariates, such as age, in an SCCS analysis. The proposed approach still requires separating time at risk into categories of the time-dependent covariate as suggested by Farrington [1] but then uses a smooth curve defined by a fractional polynomial (FP) function across these categories to model the relationship between the time-varying covariate and outcome. We compare Farrington's approach of modelling categories of time-varying covariates using indicators with the FP approach in the context of estimating the association between a currently licensed vaccine for rotavirus and intussusception, where there is a strongly age-dependent risk of disease. We present results from a case study and a simulation study based on the case study. We also explore the effect of varying the number of categories used in the confounder adjustment when estimating the exposure–outcome relationship. We begin by describing our motivating example in Section 2, before providing an overview of the SCCS analysis and the two methods of age adjustment in Section 3. In Section 4, we present the results of our case study, and in Section 5, we carry out a simulation study to further compare these two approaches. We end with a discussion of the findings in Section 6.
Motivating example—assessing the relationship between rotavirus vaccination and intussusception
Rotavirus is the most common cause of diarrhoea and dehydration in early childhood worldwide [4]. The first rhesus–human reassortant vaccine against rotavirus (Rotashield®, Wyeth) was withdrawn from distribution 9 months after its introduction into the US National Immunisation Program in 1998, as it was demonstrated in postmarketing surveillance to be associated with risk of intussusception, a rare condition in which a part of the intestine folds inward into another section of intestine causing obstruction of the bowel [5, 6]. More recently, two new rotavirus vaccines have been developed and widely licensed, a pentavalent human–bovine reassortant vaccine (RotaTeq®) and a monovalent human rotavirus vaccine (Rotarix®). Neither of these later vaccines demonstrated evidence of an association with intussusception in prelicensure clinical safety trials [7, 8]. Although reassuring early postmarketing surveillance data were reported from the USA and Latin America [9], there have been reports from Mexico, Brazil and Australia suggesting that there may be an association between these currently licensed vaccines and intussusception [10, 11]. As these new vaccines are rolled out globally, it is important that there is ongoing surveillance with respect to intussusception.
The Australian study of rotavirus vaccines and intussusception collected data on cases of intussusception in four states across Australia and compared the observed number of cases following a vaccination with the expected number of events in the same period using an estimate of the background rate of intussusception [10]. An alternative approach is to analyse the observed cases of intussusception using the SCCS method, as was used to estimate the association between rotavirus vaccination and intussusception in Mexico and Brazil [11]. The risk of intussusception is low in the neonatal period, increasing to a peak at around 5–6 months of age before declining again [12]. Thus, when assessing the relationship between rotavirus vaccination and intussusception, it is important that this age-dependent risk is taken into account.
The self-controlled case series approach
The SCCS method requires capturing data on all individuals with the event of interest (cases) within a prespecified period, usually defined by age or calendar time [3]. Once cases have been identified, their exposure history is obtained, and their time under observation during the period of interest is separated into time within and outside a predefined exposure window or windows. The SCCS analysis is often used to examine the incidence of rare adverse events following vaccination, in which case the exposure period is the period immediately following vaccination. As with other study designs, it is important that ascertainment of cases is independent of exposure history [2].
The analysis involves comparing time under observation within the exposure window with time at risk outside of the exposure window, within each individual, using conditional Poisson regression. In standard Poisson regression, the time at risk for each individual is partitioned into finite intervals with n_{ik} representing the number of events occurring in individual i in risk period k. In the SCCS analysis, the risk periods are defined by the exposed and unexposed intervals of time. Letting e_{ik} denote the length of time that individual i spends in risk period k, we assume that the number of events for patient i in time interval k follows a Poisson process,
 n_{ik} ~ Poisson(λ_{ik} e_{ik})    (1)
where λ_{ik}, the incidence rate of the event in the interval, is modelled by independent subject and exposure effects,
 λ_{ik} = exp(φ_{i} + β_{k})    (2)
with φ_{i} representing the baseline risk for person i and β_{k} the effect of risk period k, in our case the (vaccine) exposure. In the SCCS analysis, the underlying Poisson model gives rise to a multinomial likelihood, after conditioning on the number of events for each individual (1 because the analysis is restricted to cases only). This can be maximised by fitting an appropriate 'conditional fixed-effects' Poisson regression model. The parameter of interest, the coefficient of the exposure in the Poisson regression model (β_{k}), represents the relative incidence (RI) of the event in the risk period compared with non-risk periods before and after the exposure. We refer to the detailed tutorial by Whitaker et al. for full details of the SCCS method [3].
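The step from the Poisson model to the multinomial likelihood can be sketched as follows. This is a minimal derivation, assuming the usual log-linear form λ_{ik} = exp(φ_{i} + β_{k}) for the incidence rate; n_{i}, the total number of events in individual i, and K, the number of risk periods, are notation introduced here for illustration only.

```latex
\begin{align*}
P\bigl(n_{i1},\dots,n_{iK} \mid n_i\bigr)
  &= \frac{n_i!}{\prod_{k} n_{ik}!}
     \prod_{k}\left(\frac{\lambda_{ik}\,e_{ik}}{\sum_{m}\lambda_{im}\,e_{im}}\right)^{n_{ik}},
  \qquad n_i = \sum_{k} n_{ik}, \\[4pt]
\frac{\lambda_{ik}\,e_{ik}}{\sum_{m}\lambda_{im}\,e_{im}}
  &= \frac{e_{ik}\exp(\varphi_i + \beta_k)}{\sum_{m} e_{im}\exp(\varphi_i + \beta_m)}
   = \frac{e_{ik}\exp(\beta_k)}{\sum_{m} e_{im}\exp(\beta_m)}.
\end{align*}
```

Because φ_{i} cancels from the ratio, the conditional likelihood depends only on the β_{k} and the interval lengths e_{ik}, which is why characteristics that are fixed within an individual do not need to be modelled.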
Indicator method for time-dependent confounder adjustment
In the presence of a time-dependent covariate, observation time is further separated into intervals according to categories of the covariate, for example month of age. The conditional Poisson regression model is then fitted with the inclusion of a set of indicator variables to adjust for the effect of the time-varying covariate. This can be achieved using the model
 n_{ijk} ~ Poisson(λ_{ijk} e_{ijk})    (3)
 λ_{ijk} = exp(φ_{i} + α_{j} + β_{k})    (4)
where n_{ijk} and e_{ijk} are as described previously for individual i, in period j of the time-varying covariate and risk period k, and α_{j} represents the effect of the time-varying covariate in period j.
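The splitting of observation time into age categories and risk periods can be sketched in Python. This is a minimal illustration rather than the authors' implementation: the function name `person_time`, the day-by-day looping and the example cut points are all assumptions made for this sketch, with ages and risk windows measured in whole days.

```python
from bisect import bisect_right
from collections import Counter


def person_time(start, end, dose_days, windows, age_cuts):
    """Return days at risk e_ijk keyed by (age group j, risk period k).

    start, end : observation period in days of age, covering [start, end)
    dose_days  : age in days at each vaccine dose (hypothetical input)
    windows    : risk windows as (first_day, last_day) after a dose, inclusive
    age_cuts   : sorted ages in days at which a new age group begins
    """
    def risk_period(day):
        # The first matching window wins; None marks unexposed time.
        for dose, dose_day in enumerate(dose_days, start=1):
            for first, last in windows:
                if dose_day + first <= day <= dose_day + last:
                    return (dose, first, last)
        return None

    days = Counter()
    for day in range(start, end):
        age_group = bisect_right(age_cuts, day)  # index of the age category
        days[(age_group, risk_period(day))] += 1
    return dict(days)


# Illustrative case: observed from day 30 up to day 364, one dose at day 61,
# risk windows of 1-7 and 8-21 days, and roughly 2-month age groups.
blocks = person_time(30, 365, [61], [(1, 7), (8, 21)], [61, 122, 183, 243, 304])
```

Each key of `blocks` corresponds to one (j, k) cell of the model, and the value is the interval length that enters the Poisson model as e_{ijk}. Narrower age categories simply mean more cut points in `age_cuts`, and hence more α_{j} parameters under the indicator method.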
Fractional polynomial adjustment for time-dependent confounders
Fractional polynomials are an extension of regular (integer) polynomials that enable a flexible (nonlinear) curve to model an exposure–outcome relationship for a continuous exposure [13]. Under the FP approach, a model based on a linear combination of fractional power transformations of the continuous predictor is selected, with the number of terms (often only one or two) determined by a backwards elimination procedure [13, 14]. Such a model can be fitted in STATA using the mfp command [15], which incorporates a procedure for selecting the number of terms to include in the model.
When modelling the effect of a time-dependent covariate on disease risk, we would ideally model the risk of disease across the covariate as a continuous variable. This is not possible in the context of a Poisson regression formulation, in which the time-dependent covariate must be treated as constant within each specified interval of time. Instead, we propose fitting a smooth curve across the discrete categories of the covariate using an FP, rather than letting the covariate effects be estimated separately in each category as in Equation (4). The theory that underpins the SCCS approach holds under this approach as the time-varying covariate is still represented by a piecewise-constant (within categories) function. Such a model has very few parameters (only one per term in the FP equation) and fits a smooth curve to the effect of the covariate. An additional possibility with the FP approach is to split time at risk into the smallest unit of time available, for example day of age, and fit an FP to this variable in which case the covariate would essentially be continuous. Using the time-varying covariate categorised into such small intervals is not feasible when modelling the effect using indicators as this would mean estimating a large number of nuisance parameters. This is illustrated in the following examples.
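To make the FP terms concrete, the following Python sketch computes the transformed covariate values that would replace the age indicators. It is illustrative only: the function name `fp_terms` is hypothetical, the candidate powers listed are the conventional FP set (general background on the method, not taken from this section), and representing each age category by a single value such as its midpoint is an assumption of the sketch.

```python
import math

# Conventional candidate powers for fractional polynomials; 0 denotes log(x).
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_terms(x, powers):
    """Return the FP basis terms for a positive value x and chosen powers.

    A repeated power p contributes x**p and then x**p * log(x), following
    the usual FP convention for equal powers.
    """
    if x <= 0:
        raise ValueError("fractional polynomials require x > 0")
    terms = []
    previous = None
    for p in sorted(powers):
        if p == previous:
            # Repeated power: multiply the preceding term by log(x).
            terms.append(terms[-1] * math.log(x))
        else:
            terms.append(math.log(x) if p == 0 else x ** p)
        previous = p
    return terms


# Two-term FP evaluated at hypothetical monthly age midpoints (in months).
midpoints = [1.5, 2.5, 3.5, 4.5, 5.5]
design = [fp_terms(age, (0.5, 2)) for age in midpoints]
```

Each row of `design` holds the covariate values for one age category, so the age effect is described by one coefficient per FP term however many categories are used.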
The rotavirus and intussusception example
As an illustrative example, we use a subset of data obtained for a national study of rotavirus vaccination in Australia [10]. For the original study, data on cases of intussusception in infants 1 to < 12 months of age were obtained from two paediatric networks that collect and record notifications of intussusception. Once identified, cases were matched to records in the Australian Childhood Immunisation Register, which captures details of all vaccinations received by infants in Australia, to obtain exposure histories (see [10] for full details). For the purpose of this paper, we restricted our analysis to the subset of data from Victoria, where the RotaTeq vaccine is given routinely at approximately 2, 4 and 6 months of age. As in the published analysis, we focus on exposure windows of 1–7 and 8–21 days following each vaccination. This analysis includes a total of 76 cases, including 19 cases that occurred within 21 days of a vaccination. Figure 1 shows a histogram of the age of the cases at the time of intussusception.
We present the results from an SCCS analysis restricted to first events only, as recommended by Whitaker et al. [3], because the risk of a recurring event is likely to differ from the risk of a first event. Conditional Poisson regression models were fitted in STATA release 12 using the poisson command with a fixed effect for individual [16]. The analysis was carried out using both indicators and an FP to model the age effect, using 2-month, 1-month, 1-week and daily age categories (total of eight analyses). FPs were fitted using the mfp command [15].
Results are presented as the RI (exponential of the estimate of the regression parameter β_{k}) and its 95% confidence interval (CI) for each exposure window following each dose of vaccine. We note that the estimated associations between the rotavirus vaccine and intussusception that we present are for illustrative purposes only. A comprehensive analysis regarding the risk of intussusception following rotavirus vaccination using national data including this dataset will be published elsewhere [17].
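The conversion from a regression coefficient to an RI and its interval can be sketched as below. The function name `relative_incidence` is hypothetical, and the Wald-type interval (estimate ± 1.96 standard errors on the log scale) is an assumption of this sketch; the text does not state how the confidence intervals were computed.

```python
import math


def relative_incidence(beta, se, z=1.96):
    """Return (RI, lower, upper) from a log-scale estimate and its standard error.

    Assumes a Wald-type interval on the log scale; z = 1.96 gives roughly 95%.
    """
    return (math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se))


# Hypothetical inputs: a coefficient of log(2) with standard error 0.5.
ri, lower, upper = relative_incidence(math.log(2.0), 0.5)
```

An interval built this way is symmetric on the log scale, so the RI is the geometric mean of its limits.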
Table 1 shows the results of the SCCS analysis of the Victorian intussusception data. When fitting the FP model, two terms were found sufficient to model the effect of age [14, 15]. These results show some variability in the RI between the various approaches of age adjustment (modelling of the age adjustment is shown for both approaches when age is grouped by month in Figure 2). The RI estimates from the FP analyses are generally more stable across the choice of age groupings than those from the indicator approach. However, it is difficult to know which set of results should be preferred.
Table 1. Relative incidence of intussusception immediately following a rotavirus vaccine from a self-controlled case series analysis of the Victorian data on cases of intussusception (n = 76 cases, including 19 in the exposure window). Entries are RI (95% CI), by width of age category (day, week, month, 2 months) and method of age adjustment (indicators or FP).

| Dose | Window (days) | Day: Indicators | Day: FP | Week: Indicators | Week: FP | Month: Indicators | Month: FP | 2 months: Indicators | 2 months: FP |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1–7 | 2.32 (0.59, 9.06) | 3.56 (1.14, 11.1) | 2.60 (0.67, 10.0) | 3.57 (1.15, 11.1) | 2.31 (0.64, 8.28) | 3.01 (1.03, 8.85) | 4.11 (1.25, 13.6) | 2.40 (0.85, 6.79) |
| 1 | 8–21 | 2.42 (0.80, 7.30) | 2.51 (1.06, 5.98) | 2.39 (0.80, 7.09) | 2.51 (1.06, 5.99) | 1.87 (0.64, 5.48) | 2.40 (1.03, 5.62) | 3.46 (1.32, 9.12) | 2.10 (0.93, 4.77) |
| 2 | 1–7 | 2.21 (0.40, 12.2) | 1.11 (0.27, 4.66) | 1.94 (0.37, 10.2) | 1.11 (0.26, 4.65) | 2.11 (0.43, 10.3) | 1.13 (0.27, 4.72) | 1.46 (0.33, 6.42) | 1.40 (0.34, 5.85) |
| 2 | 8–21 | 0.98 (0.20, 4.88) | 0.55 (0.13, 2.31) | 1.06 (0.22, 5.23) | 0.55 (0.13, 2.30) | 1.06 (0.22, 5.17) | 0.56 (0.13, 2.35) | 0.69 (0.16, 2.94) | 0.71 (0.17, 2.95) |
| 3 | 1–7 | 0.63 (0.08, 4.95) | 0.69 (0.09, 5.06) | 0.53 (0.07, 4.12) | 0.69 (0.09, 5.05) | 0.48 (0.06, 3.69) | 0.72 (0.10, 5.23) | 0.63 (0.08, 4.71) | 0.89 (0.12, 6.50) |
| 3 | 8–21 | 0.84 (0.23, 3.05) | 1.08 (0.33, 3.52) | 0.87 (0.24, 3.14) | 1.08 (0.33, 3.51) | 0.75 (0.21, 2.62) | 1.10 (0.34, 3.58) | 0.98 (0.29, 3.29) | 1.37 (0.42, 4.41) |
Discussion
In this paper, we have presented FPs as an alternative approach to the standard indicator method proposed by Farrington for modelling the effect of a time-dependent covariate, such as age, in an SCCS analysis. We also explored the sensitivity of these approaches to the number of categories used in the adjustment. The results demonstrate that the FP approach is a useful alternative for modelling the effect of a time-varying covariate but that both approaches are sensitive to the categorisation of the covariate.
The results from the simulation study suggest that when the number of cases is small, indicators can only be used to model the effect of a time-dependent covariate if the covariate is divided into broad categories. Using a fine categorisation of age, such as daily or weekly groupings, resulted in severely biased estimates of the exposure effect in a number of simulated datasets when age was modelled using indicators, presumably because of the large number of parameters being estimated. In contrast, FPs gave reasonably accurate estimates of the exposure effect when applied to daily and weekly categorisations of age, with some evidence of larger bias and undercoverage using the coarser age groupings. In the FP model, the larger number of points (categories) from using narrow age bands allows the flexibility of the FP to be exploited, hence the reasonably accurate parameter estimates with the fine age categorisation. In contrast, when there are few categories, the FP is artificially constrained so that the performance deteriorates, as seen by the increased bias and undercoverage when age is modelled using monthly or bimonthly age groups. The fact that FPs can be applied to narrow categories of a time-dependent covariate even when the number of cases is small means that this approach can more accurately capture the effect of the covariate than the indicator approach. In particular, we have demonstrated that it is possible to fit an FP to age represented in days, which is the finest categorisation possible in this example where times were available in days. Hence, although the FP approach treats age as categorical, it is essentially being modelled as a continuous variable.
When the number of cases was increased 10-fold (scenario 3), the most accurate inference for the exposure effect for both methods of age adjustment was obtained when age was categorised into narrow age bands. Interestingly, the inference from both approaches was biased when monthly or bimonthly age categories were used to model the age effect in this larger dataset, presumably due to the inaccuracy of modelling the age effect using such broad categories.
Across the majority of analyses presented in this paper, using an FP to model the age effect resulted in a more precise RI for the exposure than using separate indicators. The gain in precision was most prominent when there was a fine categorisation of age. This is most likely due to the smaller number of parameters to be estimated with an FP, generally just one or two, compared with one parameter per age group stratum, less one for the reference stratum, with the indicator approach.
The findings from the current study also suggest that FPs are more robust to confounding between exposure and age than the indicator approach. In scenarios 1 and 2, there were many more simulated datasets in which there was an unidentifiable or unrealistically large estimate of the exposure effect from the indicator analysis compared with the FP approach when applied to days and weeks of age. This was most likely due to the difficulty of distinguishing the exposure and age effects around the time of exposure using the indicator approach. When using an FP, the relationship between the time-dependent covariate and outcome is determined by events across the whole range of the covariate, so it can accommodate areas in the range of the covariate where there is strong confounding between these two effects, that is, around the time of the exposure.
Aside from the fact that caution is required when using FPs fitted to few categories, the main disadvantage with using an FP to model the effect of a time-varying covariate is that this approach is more complex than the indicator approach and requires some understanding of FPs. However, the method is readily accessible using standard software such as the mfp command in STATA [15]. We note that in the current paper we demonstrate the use of FPs to model the effect of a time-dependent covariate applied to a single, but realistic, specification for the age effect. Further examples would be useful to provide a fuller evaluation of this approach.
Alternative methods of adjusting for time-dependent covariates have been suggested in the literature. Farrington and Whitaker presented a semiparametric approach to adjust for age, in which the age effects are left unspecified [19]. Such an approach offers considerably more flexibility than the indicator or FP approaches presented here, although it is not readily available in standard statistical packages and involves the estimation of a larger number of parameters. In particular, because this approach models a separate age effect for each event, it is not suitable for use in large studies [19]. Another paper used a continuous adjustment for age [20], but it was not clear how the authors had carried out this analysis, and Weldeselassie et al. questioned whether a continuous adjustment had in fact been used at all [2]. In the current paper, we focus on using FPs to adjust for age because this approach is readily available in STATA. However, in principle, the proposed method of fitting a smooth curve across categories of a time-dependent covariate should also work using other smoothing methods, for example splines, which are readily available in other statistical packages such as R [21]. Although splines provide a potentially useful alternative, further research would be needed on the details of their application in this context, for example, on the specification of knots required for the spline fitting.
When carrying out an SCCS analysis, Whitaker et al. recommended carrying out a sensitivity analysis surrounding the number of categories used in the adjustment for time-dependent covariates [3]. However, in a recent review of the use of SCCS in the literature, it was noted that only 2 of the 33 studies that adjusted for age reported a sensitivity analysis that varied the number of age categories used in the analysis [2]. One paper by Hocine et al. [22], exploring the relationship between hepatitis B vaccine and first central nervous system demyelinating events, compared four methods of age adjustment: no adjustment, adjusted for 20 age categories, adjusted for 48 age categories and the semiparametric model. They concluded that it is essential to adjust and that the semiparametric approach is the most efficient in terms of providing the narrowest confidence interval for the exposure estimate. The second paper by Smeeth et al. [23], exploring the risk of myocardial infarction and stroke after acute infection or vaccine, adjusted for age using 5-year age groups in the primary analysis and noted that there was little difference in their results when age was adjusted for using 2-year age groups. The findings in the current paper reiterate the importance of carrying out a sensitivity analysis around the number of categories used in the SCCS analysis but also highlight the importance of considering the sensitivity to the method used to model the time-dependent covariate(s).