Is the sequence ratio an unbiased estimate of the incidence rate ratio? A simulation study

We aimed to evaluate the conditions under which the sequence ratio (SR) obtained from a sequence symmetry analysis is an unbiased estimate of the true incidence rate ratio (IRR).

• A core assumption of the sequence symmetry analysis is that the sequence ratio is an unbiased estimate of the incidence rate ratio that would have arisen from the underlying cohort of medication users
• In this simulation study, the sequence ratio was a biased estimate of the incidence rate ratio if the outcome of interest occurred frequently or the incidence rate ratio was very high
• The sequence ratio is an unbiased estimate of the incidence rate ratio if the outcome of interest is rare

Thomas Delvin, Sofie Egsgaard, Jesper Hallas, Helene Kildegaard, Lars Christian Lund, and Martin Torp Rahbek contributed equally to this work and are listed in alphabetical order.
The sequence symmetry analysis (SSA) is a self-controlled design increasingly used in pharmacoepidemiology for identifying drug safety signals. 1,2 SSA evaluates the order of two events, typically initiation of an exposure drug and an outcome of interest. If no causal relationship exists between the exposure and the outcome, the sequence of exposure to outcome and outcome to exposure are equally likely. The sequence ratio (SR) is the observed number of exposure to outcome sequences divided by the number of outcome to exposure sequences.
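The definition of the SR can be illustrated with a minimal Python sketch. The function name `sequence_ratio`, the `window_days` parameter, and the choice to skip same-day pairs are illustrative assumptions and are not taken from the study.

```python
from datetime import date


def sequence_ratio(pairs, window_days=365):
    """Return the SR for (exposure_date, outcome_date) pairs.

    Assumption: pairs further apart than `window_days`, and pairs where both
    events fall on the same day, carry no ordering information and are skipped.
    """
    exposure_first = 0
    outcome_first = 0
    for exposure_date, outcome_date in pairs:
        gap = (outcome_date - exposure_date).days
        if gap == 0 or abs(gap) > window_days:
            continue
        if gap > 0:
            exposure_first += 1   # exposure-to-outcome sequence
        else:
            outcome_first += 1    # outcome-to-exposure sequence
    return exposure_first / outcome_first


# Made-up example data: three exposure-first and two outcome-first pairs.
example_pairs = [
    (date(2020, 1, 1), date(2020, 3, 1)),
    (date(2020, 1, 1), date(2020, 6, 1)),
    (date(2020, 1, 1), date(2020, 12, 1)),
    (date(2020, 6, 1), date(2020, 2, 1)),
    (date(2020, 6, 1), date(2020, 5, 1)),
    (date(2020, 1, 1), date(2022, 1, 1)),   # outside the window, skipped
    (date(2020, 1, 1), date(2020, 1, 1)),   # same day, skipped
]
print(sequence_ratio(example_pairs))
```

With no causal relationship and equally likely orderings, the ratio would be expected to lie close to 1.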
The method assumes that the SR is an unbiased estimate of the incidence rate ratio that would have arisen from the underlying cohort of medication users. 1 To assess the validity of this assumption, we attempted to identify the conditions under which the sequence ratio is an unbiased estimate of the incidence rate ratio.

| METHODS
We simulated cohorts of 1 million initially unexposed individuals who were followed for 5 years. In the main simulation, we considered everyone alive and under observation until the end of follow-up. For each individual, we drew a date of exposure assuming a fixed rate of 1 initiation per 50 person years (PY), emulating the day of a first prescription fill of an exposure drug. We further simulated multiple outcomes of interest. For each outcome, the event date was simulated under the assumption of a constant rate for that outcome. To generate clinically realistic scenarios, we used approximations of real-world disease incidences and prescribing patterns to represent events that are very frequent, for example, penicillin prescriptions (IR 1 per 10 PY); common events, for example, type 2 diabetes (IR 1 per 200 PY); rare events, for example, hyperthyroidism (IR 1 per 1000 PY); and very rare events, for example, prescription of amiodarone (IR 1 per 10 000 PY). The outcome rate was modified by a true incidence rate ratio, representing the exposure drug's effect on the outcome. We evaluated the bias of two estimators: First, we analyzed data as a cohort study and calculated the estimated incidence rate ratio as the ratio of the incidence rates (incident outcomes per person-year) among exposed and non-exposed individuals. Second, we analyzed data using the SSA method with a time window of 365 days before and after exposure and estimated the sequence ratio as the number of exposure-to-outcome sequences divided by the number of outcome-to-exposure sequences.
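The main simulation can be sketched in Python as follows. This is not the authors' Stata code but a minimal reconstruction under stated assumptions: time is measured in years, event times are exponential, and the outcome time of a person who reaches exposure without an outcome is redrawn at the exposed rate (valid for constant rates because of memorylessness). The name `simulate_cohort` and all parameter names are hypothetical.

```python
import math
import random


def simulate_cohort(n, exposure_rate, outcome_rate, true_irr,
                    follow_up=5.0, window=1.0, seed=1):
    """Simulate one cohort; return (cohort IRR estimate, sequence ratio).

    Raises ZeroDivisionError if a cell is empty, which can happen when n is
    small or the outcome is very rare.
    """
    rng = random.Random(seed)
    events = [0, 0]              # outcomes while [unexposed, exposed]
    person_years = [0.0, 0.0]    # follow-up while [unexposed, exposed]
    exposure_first = 0
    outcome_first = 0

    for _ in range(n):
        t_exp = rng.expovariate(exposure_rate)   # first prescription fill
        t_out = rng.expovariate(outcome_rate)    # outcome at the baseline rate
        if t_out > t_exp:
            # No outcome before exposure: redraw the remaining waiting time
            # at the exposed rate (assumption: constant rates).
            t_out = t_exp + rng.expovariate(outcome_rate * true_irr)

        # Cohort analysis: follow-up ends at the outcome or after 5 years.
        end = min(t_out, follow_up)
        unexposed_time = min(t_exp, end)
        person_years[0] += unexposed_time
        person_years[1] += end - unexposed_time
        if t_out <= follow_up:
            events[1 if t_out > t_exp else 0] += 1

        # SSA: both events observed and at most `window` years apart.
        both_observed = t_exp <= follow_up and t_out <= follow_up
        if both_observed and abs(t_out - t_exp) <= window:
            if t_out > t_exp:
                exposure_first += 1
            else:
                outcome_first += 1

    irr_hat = (events[1] / person_years[1]) / (events[0] / person_years[0])
    sr_hat = exposure_first / outcome_first
    return irr_hat, sr_hat
```

For instance, `simulate_cohort(200_000, 1/50, 1/10, 2.0)` corresponds to a frequent outcome with a true IRR of 2 in a cohort smaller than the 1 million individuals used in the study; the smaller size only keeps the pure-Python loop fast.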
Subsequently, we simulated two additional scenarios to quantify the potential bias resulting from adverse events leading to censoring (e.g., death) and from situations where occurrence of the outcome event influences the subsequent rate of exposure. In the first scenario, we simulated an event of interest that carried a risk of dying with a given probability ranging from 0.01 to 0.90. For each individual experiencing the event of interest, we performed a Bernoulli trial to determine whether the individual survived the event or was censored at the time of the event. In the second scenario, we simulated a situation in which the event of interest reduced the subsequent rate of exposure by a given incidence rate ratio ranging from 0.1 to 0.75. Finally, we calculated sequence ratios and bias using observation windows of 180 and 730 days for the main scenarios.
All simulations were repeated 2500 times. From these, we calculated the mean of the estimated IRR and SR on the log scale and ascertained the bias. Bias was calculated as the difference between the estimate (IRR or SR) and the true parameter value on the log scale.
Finally, we calculated Monte Carlo standard errors to quantify the uncertainty of the simulations. 3 Simulations were performed using Stata version 18. The source code for the analyses is available from https://gitlab.sdu.dk/lclund/ssa-irr-simulation/.
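The bias and Monte Carlo standard error described above can be sketched in Python as follows. The sketch assumes the usual definition of the Monte Carlo standard error of the bias, namely the standard deviation of the log estimates divided by the square root of the number of repetitions; the function name `bias_and_mcse` is hypothetical and the code is not the authors' Stata implementation.

```python
import math
import statistics


def bias_and_mcse(estimates, true_value):
    """Return (bias, Monte Carlo SE of the bias), both on the log scale.

    `estimates` holds one IRR or SR estimate per simulation repetition.
    """
    logs = [math.log(estimate) for estimate in estimates]
    bias = statistics.fmean(logs) - math.log(true_value)
    mcse = statistics.stdev(logs) / math.sqrt(len(logs))
    return bias, mcse


# Toy input: two repetitions whose geometric mean equals the true value of 2.
bias, mcse = bias_and_mcse([1.0, 4.0], true_value=2.0)
print(bias, mcse)
```

Because the bias is defined on the log scale, it maps back to a mean estimate as exp(ln(true value) + bias). A bias of −0.11 at a true IRR of 0.2, for example, corresponds to a mean of about 0.18.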

| RESULTS
In simulated analyses, the cohort estimator yielded consistently unbiased estimates across varying outcome rates, except for the scenario with a very rare outcome (1 event per 10 000 PY) with bias up to −0.11 (true IRR 0.2, mean 0.18), which we mainly attributed to a low number of observed events (Table 1). The SR was essentially unbiased for common and rare events (maximum observed bias of −0.04). For very rare events, the SR was slightly upwards biased (bias 0.08 for IRR of 5, bias 0.36 for IRR of 0.2). For frequent events, the SR was downwards biased when the true IRR was above 1 (bias −0.15 and −0.29).
TABLE 1 Bias and mean estimates for the sequence ratio and incidence rate ratio under varying outcome rates and true incidence rate ratios.
Figure 1 graphically depicts the estimated SR and IRR for each of the 2500 iterations for selected scenarios.
When halving the observation window to 180 days, the magnitude of bias was halved compared to the main analysis when analyzing frequent events. Conversely, when doubling the window length to 730 days, bias was increased two-fold (Supplementary Table 1).
When evaluating events that carry a risk of immediate censoring, the cohort estimator yielded unbiased results regardless of the probability of censoring. In this situation, the SR was upwards biased with increasing probabilities of censoring (bias 0.01, 0.05, 0.10, 0.69, and 2.32 for probabilities of 0.01, 0.05, 0.1, 0.5, and 0.9), independent of the true IRR. When the outcome reduced the probability of future exposure, the SR was likewise biased upwards for all simulated values (bias 0.28, 0.69, 1.38, and 2.31 for reduced exposure probabilities of 0.75, 0.5, 0.25, and 0.1 times). Simulation precision was evaluated using the Monte Carlo standard error which was 0.02 or lower for all scenarios (Supplementary Table 2).
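A back-of-the-envelope check, which is not part of the study, may help in reading these numbers. If only a fraction q of the outcome-to-exposure sequences remains observable while exposure-to-outcome sequences are unaffected, the SR would be inflated by roughly 1/q, giving a bias of about −ln(q) on the log scale. The sketch below assumes q = 1 − p for a censoring probability p and q ≈ r for an exposure rate reduced by a factor r; the function name is hypothetical.

```python
import math


def approximate_log_bias(retained_fraction):
    """Approximate log-scale SR bias when only `retained_fraction` of the
    outcome-to-exposure sequences remains observable (heuristic assumption)."""
    return -math.log(retained_fraction)


# Reported biases from the text, keyed by censoring probability p.
reported_censoring = {0.01: 0.01, 0.05: 0.05, 0.1: 0.10, 0.5: 0.69, 0.9: 2.32}
# Reported biases from the text, keyed by the exposure rate ratio r.
reported_reduced_exposure = {0.75: 0.28, 0.5: 0.69, 0.25: 1.38, 0.1: 2.31}

for p, reported in reported_censoring.items():
    print(f"censoring p={p}: approx {approximate_log_bias(1 - p):.2f}, reported {reported}")
for r, reported in reported_reduced_exposure.items():
    print(f"rate ratio r={r}: approx {approximate_log_bias(r):.2f}, reported {reported}")
```

Under these assumptions, the approximation lies within 0.03 of every reported bias, which suggests that the loss of outcome-to-exposure sequences accounts for most of the bias in both scenarios.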
In a post-hoc analysis, we evaluated a third estimator, the risk ratio comparing the 1-year risk of the event of interest following drug initiation compared to non-initiators (Supplementary Table 3). When calculating the risk ratio, we excluded individuals whose follow-up started during the last year of the study period as these could not achieve complete follow-up. Incomplete follow-up due to changes in exposure status was ignored. This risk ratio was a nearly unbiased estimator of the underlying incidence rate ratio in most scenarios but, like the SR, was downwards biased when the IRR was 5.0 (bias −0.22, −0.05, −0.04, and −0.05 for frequent to very rare events).

| DISCUSSION
In this study, we tested whether the SR is an unbiased estimate of the true underlying IRR given different event rates representing very rare to frequent events. Simulations showed that the SR was unbiased except in scenarios with high IRRs or frequent outcomes, with a tendency towards an increasing magnitude of bias with an increasing frequency of the event of interest. The mechanism is the same, as a high incidence of the outcome, whether among all individuals or only among exposed, introduces a conservative bias in the SR. We further evaluated how violation of the method's core assumptions affects the SR 4 and found that bias introduced by censoring was negligible if the probability of censoring was 0.1 or below. Conversely, any sustained reduced probability of exposure following the event of interest leads to substantial bias.
In a post-hoc analysis, we evaluated the risk ratio (RR) and found better agreement between the SR and the RR if the outcome was frequent or the true IRR was high. This is hardly surprising. In estimating an IR, the timing of outcomes is important, particularly if these are common. Since the follow-up is terminated when an outcome occurs, the denominator in the incidence rate estimate is affected, which in turn is likely to affect the IRR. For an RR or an SR estimate, the timing of outcomes within the follow-up window is disregarded, and both have a uniform follow-up for all subjects. However, if the outcome is rare, RRs and IRRs become increasingly similar 5 as did the SR and the IRR in our simulation.
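The convergence of the RR and the IRR for rare outcomes can be illustrated with a short Python sketch. It assumes constant rates and complete follow-up over a fixed window, so that the risk in each group is 1 − exp(−rate × time). This is a textbook relationship rather than the estimator used in the post-hoc analysis, and the function name is hypothetical.

```python
import math


def theoretical_risk_ratio(baseline_rate, irr, years=1.0):
    """Risk ratio over `years` under constant rates and complete follow-up."""
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    risk_exposed = -math.expm1(-baseline_rate * irr * years)
    risk_unexposed = -math.expm1(-baseline_rate * years)
    return risk_exposed / risk_unexposed


# Outcome rates from frequent to very rare, evaluated at a true IRR of 5.
for rate in (1 / 10, 1 / 200, 1 / 1000, 1 / 10000):
    rr = theoretical_risk_ratio(rate, irr=5.0)
    print(f"rate {rate:.4f} per PY: RR {rr:.3f}, log(RR/IRR) {math.log(rr / 5.0):.3f}")
```

Under these assumptions, the RR falls clearly below the IRR for the frequent outcome and approaches it as the outcome becomes rarer. The values are not expected to reproduce the simulated biases exactly, since the simulated risk ratio also involved exclusions and ignored changes in exposure status.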

| CONCLUSIONS
We conclude that the original assumption that the SR estimates the IRR is not universally true. We find better overall agreement with the risk ratio, which is consistent with how the SRs and the RRs are calculated. For rare outcomes, however, the SR and IRR measures become increasingly similar.

FUNDING INFORMATION
None.
FIGURE 1 Incidence rate ratios and sequence ratios for each of the 2500 iterations for selected scenarios. Columns represent varying incidence rates of the outcome; rows represent the true incidence rate ratio (IRR) in the given scenario. Purple circles show the estimated IRR from the cohort analysis in each of the 2500 iterations of each scenario; yellow circles represent the estimated sequence ratios.