Estimating cumulative incidence functions in competing risks data with dependent left‐truncation

Both delayed study entry (left‐truncation) and competing risks are common phenomena in observational time‐to‐event studies. For example, in studies conducted by Teratology Information Services (TIS) on adverse drug reactions during pregnancy, the natural time scale is gestational age, but women enter the study after time origin and upon contact with the service. Competing risks are present, because an elective termination may be precluded by a spontaneous abortion. If left‐truncation is entirely random, the Aalen‐Johansen estimator is the canonical estimator of the cumulative incidence functions of the competing events. If the assumption of random left‐truncation is in doubt, we propose a new semiparametric estimator of the cumulative incidence function. The dependence between entry time and time‐to‐event is modeled using a cause‐specific Cox proportional hazards model and the marginal (unconditional) estimates are derived via inverse probability weighting arguments. We apply the new estimator to data about coumarin usage during pregnancy. Here, the concern is that the cause‐specific hazard of experiencing an induced abortion may depend on the time when seeking advice by a TIS, which also is the time of left‐truncation or study entry. While the aims of counseling by a TIS are to reduce the rate of elective terminations based on irrational overestimation of drug risks and to lead to better and safer medical treatment of maternal disease, it is conceivable that women considering an induced abortion are more likely to seek counseling. The new estimator is also evaluated in extensive simulation studies and found preferable compared to the Aalen‐Johansen estimator in non–misspecified scenarios and to at least provide for a sensitivity analysis otherwise.

The remainder of this paper is organized as follows: Section 2 briefly revisits the case of survival data to clarify the target (or estimand) of the proposed inferential procedures. Both the competing risks model and our new estimator of the cumulative event probabilities in the presence of dependent left-truncation are in Section 3. Section 4 reports simulation results and a reanalysis of data on drug-exposed pregnancies accounting for possibly dependent left-truncation is in Section 5. A summary concludes this paper in Section 6. Supporting information, in particular reporting additional simulation results, is provided online.

THE ESTIMAND OF MACKENZIE
The basic model of survival data considers i.i.d. replicates of $(T, C)$, where $T$ is the time-to-event and $C$ is a right-censoring time. The common assumption is that $T$ and $C$ are independent, possibly conditional on baseline covariates. Analogously, left-truncated survival data are typically viewed as copies of the pair $(L, T)$, where $L$ now is the left-truncation time. While right-censoring switches observation "off," left-truncation can be viewed as switching observation "on," such that an individual is under study if $L < T$, while individuals with $T \le L$ are not sampled. Again, the common assumption is that $T$ and $L$ are independent, and Kaplan and Meier 15 already suggested using their estimator for estimation of $P(T > t) = \exp\left(-\int_0^t \alpha(u)\,du\right)$ in the presence of left-truncation, where $\alpha(t)$ is the survival hazard. Common to both models is also that $T$, $L$, and $C$ are all viewed as latent variables. A difference between right-censoring and left-truncation is that a possible dependence between $L$ and $T$ can be investigated by including $L$ as a covariate in a regression model for the hazard of $T$, say $\alpha(t \mid L = \ell)$. One option to account for such dependence is to only assume conditional independence of $L$ and $T$ given baseline covariates. In the example of pregnancy data, such covariates are difficult to identify because the reasons for induced abortions are varied, and the characteristics of women that influence induced abortions are hard to detect and interpret. A different approach is inverse probability of left-truncation weighting (IPLW). To this end, Mackenzie 14 suggested considering the conditional survival functions
$$P(T > t \mid L = \ell) = \exp\left(-\int_0^t \alpha(u \mid L = \ell)\,du\right) \quad (1)$$
and subsequently marginalizing w.r.t. the distribution $F_L$ of $L$,
$$P(T > t) = \int P(T > t \mid L = \ell)\,dF_L(\ell). \quad (2)$$
Section 3 extends Mackenzie's suggestion to the situation where the target of estimation is the unconditional cumulative event probability or cumulative incidence function of a competing risk. However, before we proceed, three remarks are in order. Firstly, Equations (1) and (2) demonstrate that the left-truncation time is truly viewed as a latent variable, but not as the time-to-occurrence of an internal time-dependent covariate. 16 An internal time-dependent covariate is conceptually tied to existence or survival of the individual, and hence, for such a covariate, the quantity $\alpha(u \mid L = \ell)$ is without any useful meaning for $u \le \ell$ and $P(T > t \mid L = \ell)$ must be one for $t < \ell$. The consequence is that the present approach does not apply to common examples of so-called prevalent sampling where individuals are included conditional on an initial event such as disease onset, and where this initial event is modeled as a left-truncation event, see the work of Cheng et al 17 for a recent example. In our motivating data example, the interpretation is that a drug-exposed pregnancy comes with a latent time-to-TIS-contact. However, if, say, the pregnancy ends in a spontaneous abortion beforehand, the contact to TIS will not be realized.
Secondly, while our target of estimation will be an unconditional probability, a conditional probability such as $P(T > t \mid L = \ell)$ may very well be preferred for individual counseling or prediction. In fact, there is a relation to recent calls for dynamic prediction. 18 If, as in our motivating data example, $L = \ell$ is the time of TIS contact for seeking advice, individual counseling should be based on $P(T > t \mid T \ge \ell)$ or, in the case of dependence between $T$ and $L$, $P(T > t \mid T \ge \ell, L = \ell)$. Such predictions may very well be obtained by using $L = \ell$ in a hazard regression, see (7) in Section 3. However, as we will also see below, our practical aim will be to provide a sensitivity analysis of the standard Aalen-Johansen estimator that assumes random left-truncation.
Thirdly, the relation $P(T > t) = \int P(T > t \mid L = \ell)\,dF_L(\ell)$ uses the law of total probability. The interpretation is that $P(T > t)$ is the survival probability of an individual picked randomly from the population at hand. Conditional on $L$, however, the survival probabilities differ between individuals. We will exploit this in the simulations of Section 4, where we determine the target probability by using the Aalen-Johansen estimator on complete data sets without truncation. We also note that, because of the law of total probability, the target quantity is not the (cumulative) hazard $\alpha(t \mid L = \ell)$ marginalized w.r.t. $F_L$. In other words, the hazard belonging to $P(T > t)$ is in general not $\int \alpha(t \mid L = \ell)\,dF_L(\ell)$. We will briefly revisit these aspects in Section 4 using simulations. In particular, the simulation algorithm in Section 4.1 will illustrate the points made above about latent times and marginal vs conditional probabilities. A brief simulation for survival data without competing risks in Section 4.2 further illustrates the question of the estimand addressed above.
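The distinction between marginalizing probabilities and marginalizing hazards can be made concrete with a small worked example. It is a hypothetical illustration rather than part of the analysis: assume that $L$ takes only two values with probability 1/2 each, and that the conditional hazards are constants $a_1 \ne a_2$ (symbols introduced here for illustration only).

```latex
\begin{align*}
P(T > t) &= \tfrac12 e^{-a_1 t} + \tfrac12 e^{-a_2 t}, \\
\alpha_{\mathrm{marg}}(t) &= -\frac{d}{dt} \log P(T > t)
  = \frac{a_1 e^{-a_1 t} + a_2 e^{-a_2 t}}{e^{-a_1 t} + e^{-a_2 t}}, \\
\alpha_{\mathrm{marg}}(0) &= \tfrac12 (a_1 + a_2), \qquad
\alpha_{\mathrm{marg}}(t) \to \min(a_1, a_2) \quad \text{as } t \to \infty .
\end{align*}
```

The hazard of the marginal survival function agrees with the averaged conditional hazard $(a_1 + a_2)/2$ only at $t = 0$ and is strictly smaller afterwards, because individuals with the larger hazard leave the risk set earlier.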

COMPETING RISKS MODEL AND ESTIMATION OF CUMULATIVE EVENT PROBABILITIES
In the following, we consider a competing risks process $(X_t)_{t \ge 0}$ denoting the state of an individual for every point in time $t$, $X_t \in \{0, 1, \ldots, k\}$. As in the data example, we consider the case with $k = 3$ absorbing states, but the model can easily be written for two or more than three competing events. Every individual starts in the initial state 0 at study entry, $P(X_0 = 0) = 1$. As long as no event occurs, the individual stays in state 0. If an event of type $j$, $j \in \{1, 2, 3\}$, occurs, the individual moves into the corresponding state. The event time $T = \inf\{t > 0 \mid X_t \ne 0\}$ is defined as the time the competing risks process moves out of the initial state 0. The process $(X_t)_{t \ge 0}$ is assumed to be right-continuous, ie, the competing risks process is in one of the states 1, 2, and 3 at time $T$.
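The cause-specific hazards $\alpha_{0j}(t)$ can be defined via
$$\alpha_{0j}(t)\,dt = P(T \in [t, t + dt), X_T = j \mid T \ge t), \quad j = 1, 2, 3.$$
Also, write $A_{0j}(t) := \int_0^t \alpha_{0j}(u)\,du$, $j = 1, 2, 3$, for the cumulative cause-specific hazards. The sum of the $\alpha_{0j}$'s is the all-cause hazard $\alpha_{0\cdot}$, $\alpha_{0\cdot}(t)\,dt = P(T \in [t, t + dt) \mid T \ge t)$, with cumulative all-cause hazard $A_{0\cdot}(t)$. Hence, all cause-specific hazards enter the survival function
$$P(T > t) = \exp\left(-A_{0\cdot}(t)\right) = \exp\left(-\int_0^t \sum_{j=1}^{3} \alpha_{0j}(u)\,du\right).$$
In a competing risks setting, the interest often is in the cumulative incidence function of state $j = 1, 2, 3$,
$$P(T \le t, X_T = j) = \int_0^t P(T \ge u)\,\alpha_{0j}(u)\,du.$$
Note that $P(T \ge u) = P(T > u)$, but estimation will use an estimator of $P(T \ge u)$ rather than $P(T > u)$.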
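As a sanity check on how the cause-specific hazards enter both the survival function and the cumulative incidence functions, the following minimal Python sketch treats the special case of constant cause-specific hazards, where the cumulative incidence function of cause j has the closed form (a_j / a_total) * (1 - exp(-a_total * t)). The hazard values and function names are hypothetical and chosen only for illustration.

```python
import math

def survival_constant_hazards(t, hazards):
    """P(T > t) when every cause-specific hazard is constant in time."""
    return math.exp(-sum(hazards.values()) * t)

def cif_constant_hazards(t, hazards, j):
    """Closed-form cumulative incidence of cause j at time t."""
    total = sum(hazards.values())  # all-cause hazard
    return hazards[j] / total * (1.0 - math.exp(-total * t))

def cif_numeric(t, hazards, j, steps=20000):
    """Midpoint-rule evaluation of the integral of P(T >= u) * hazard_j over [0, t]."""
    h = t / steps
    return sum(
        survival_constant_hazards((k + 0.5) * h, hazards) * hazards[j] * h
        for k in range(steps)
    )

# Hypothetical constant hazards for three competing events.
example_hazards = {1: 0.02, 2: 0.03, 3: 0.01}
```

In this special case, the three cumulative incidence functions and the survival function add up to one at every time point, which mirrors the fact that an individual is either still in state 0 or has moved to exactly one absorbing state.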
For ease of presentation, we assume $(X_t)_{t \ge 0}$ to be subject only to left-truncation and not to right-censoring; independent right-censoring can be incorporated in the usual way. This corresponds to the typical situation in pregnancy data. Let $L$ denote the left-truncation time, ie, the time of study entry. An individual's event time and event type $(T, X_T)$ can only be observed if $L < T$. Otherwise, the individual experiences an event before study entry and does not enter the study, ie, only $n$ out of $m$ individuals enter the study, $n \le m$.

Random left-truncation
In the standard random left-truncation model, the event time $T$ and the left-truncation time $L$ are assumed to be independent. For estimation, we introduce the following counting process notation. Let $n$ be the number of individuals under study as above, so that we have $n$ i.i.d. replicates $(L_i, T_i, X_{T_i})$, $i = 1, \ldots, n$. 19,20 The aggregated at-risk process denoting the number of individuals observed in state 0 just before time $t$ is given by
$$Y(t) = \sum_{i=1}^{n} 1(L_i < t \le T_i),$$
and
$$N_{0j}(t) = \sum_{i=1}^{n} 1(L_i < T_i \le t, X_{T_i} = j)$$
is the number of observed $0 \to j$ transitions in the time interval $[0, t]$, $j = 1, 2, 3$. We also define the total number of transitions out of state 0 in the interval $[0, t]$ as $N_{0\cdot}(t) = N_{01}(t) + N_{02}(t) + N_{03}(t)$. The increment of the counting process, ie, the number of observed events of type $j$ at $t$, is written as $\Delta N_{0j}(t) = N_{0j}(t) - N_{0j}(t-)$. Moreover, $\Delta N_{0\cdot}(t) = \Delta N_{01}(t) + \Delta N_{02}(t) + \Delta N_{03}(t)$ is the total number of events observed at time $t$. Then, the Aalen-Johansen estimator of the cumulative incidence function is given by
$$\hat P(T \le t, X_T = j) = \sum_{s \le t} \hat P(T > s-)\,\frac{\Delta N_{0j}(s)}{Y(s)},$$
where the sum is over all unique event times $s$ in $[0, t]$. Thereby,
$$\hat P(T > t-) = \prod_{s < t} \left(1 - \frac{\Delta N_{0\cdot}(s)}{Y(s)}\right)$$
is the value of the Kaplan-Meier estimator over all event times in $(0, t)$ but excluding possible events at time $t$. We will also write $\hat P(T \ge t) = \hat P(T > t-)$. Note that $\hat A_{0j}(t) := \sum_{s \le t} \frac{\Delta N_{0j}(s)}{Y(s)}$, $j = 1, 2, 3$, is the Nelson-Aalen estimator of the cumulative cause-specific hazards.
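The counting process description above translates fairly directly into code. The following is a minimal pure-Python sketch of the Aalen-Johansen estimator for left-truncated competing risks data without right-censoring. It is not the implementation of any particular package, and the function name and data layout (three parallel lists) are assumptions made for this illustration.

```python
def aalen_johansen(entry, time, cause, causes=(1, 2, 3)):
    """Aalen-Johansen cumulative incidence under left-truncation.

    entry[i] < time[i] is the study entry time, time[i] the event time and
    cause[i] the event type of individual i. Right-censoring is not handled.
    Returns a list of (event time, {cause: CIF value}) pairs.
    """
    surv_left = 1.0                      # Kaplan-Meier value just before s
    cif = {j: 0.0 for j in causes}
    path = []
    for s in sorted(set(time)):
        # Y(s): individuals who entered before s and have no event before s
        at_risk = sum(1 for l, t in zip(entry, time) if l < s <= t)
        # Number of events of each type observed exactly at s
        events = {
            j: sum(1 for t, c in zip(time, cause) if t == s and c == j)
            for j in causes
        }
        for j in causes:
            cif[j] += surv_left * events[j] / at_risk
        surv_left *= 1.0 - sum(events.values()) / at_risk
        path.append((s, dict(cif)))
    return path
```

Because every individual under study satisfies entry < time, the risk set at an event time always contains at least the individual experiencing the event, so the divisions are well defined. With all entry times equal to zero, the final values reduce to the simple proportions of each event type.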

Dependent left-truncation
Unlike the assumption of random right-censoring, it may be investigated whether or not the information of left-truncation influences the risk of experiencing an event. 2 In a single endpoint survival situation, Keiding and Gill 12 therefore suggested including the left-truncation time as a covariate in a Cox proportional hazards model. The approach can be analogously applied to the competing risks setting by including the left-truncation time in the Cox models for all cause-specific hazards. 2 To model dependent left-truncation, we assume the cause-specific hazards to follow a Cox proportional hazards model. This assumption is only testable on the observable region $T > L$ and remains untestable on the remaining region $T \le L$. 14 Thus,
$$\alpha_{0j}(t \mid L = \ell) = \alpha_{0j;0}(t)\,\exp(\beta_{0j}\,\ell), \quad j = 1, 2, 3, \quad (6)$$
where $\alpha_{0j;0}$ denotes the baseline hazard and $\beta_{0j}$ the regression coefficient of the left-truncation time. Next, the cumulative incidence function conditional on $L = \ell$ is defined as
$$P(T \le t, X_T = j \mid L = \ell) = \int_0^t P(T \ge u \mid L = \ell)\,\alpha_{0j}(u \mid L = \ell)\,du. \quad (7)$$
Note that Equations (6) and (7), which define and use $\alpha_{0j}(u \mid L = \ell)$ for all times $u$, rely on the fact that $L = \ell$ is not viewed as an internal time-dependent event happening at $\ell$, but $L$ is viewed as a latent variable, see Section 2. As also explained in Section 2, the conditional probability (7) would be useful for individual prediction, ideally updated for the past event-free time. However, in what follows, we will be interested in a comparator for the Aalen-Johansen estimator, which can be based on integration of (7) with respect to the left-truncation distribution $F_L$,
$$P(T \le t, X_T = j) = \int P(T \le t, X_T = j \mid L = \ell)\,dF_L(\ell).$$
For estimation of this function, the parameters $\beta_{0j}$ need to be estimated first. As in the standard Cox proportional hazards model, this works by maximizing the log partial likelihood of the model where $L$ is included as a covariate.
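The predicted cumulative cause-specific hazards for individual $i$ with entry time $\ell_i$ are
$$\hat A_{0j}(t \mid \ell_i) = \hat A_{0j;0}(t)\,\exp(\hat\beta_{0j}\,\ell_i),$$
where the cumulative baseline hazard can be estimated using the Breslow estimator
$$\hat A_{0j;0}(t) = \sum_{s \le t} \frac{\Delta N_{0j}(s)}{\sum_{i=1}^{n} 1(L_i < s \le T_i)\,\exp(\hat\beta_{0j}\,\ell_i)}.$$
After that, the predicted cumulative all-cause hazard is given by $\hat A_{0\cdot}(t \mid \ell_i) = \sum_{j=1}^{3} \hat A_{0j}(t \mid \ell_i)$. Furthermore, the predicted conditional survival probability is denoted by
$$\hat P(T > t \mid L = \ell_i) = \prod_{s \le t} \left(1 - \Delta \hat A_{0\cdot}(s \mid \ell_i)\right). \quad (10)$$
With Equation (10), the predicted conditional cumulative incidence functions can be derived as
$$\hat P(T \le t, X_T = j \mid L = \ell_i) = \sum_{s \le t} \hat P(T > s- \mid L = \ell_i)\,\Delta \hat A_{0j}(s \mid \ell_i).$$
In a scenario with dependent left-truncation, we need to account for not observing all events due to left-truncation, ie, only $n$ out of $m$ events can be observed. Thereby, $m$ is unknown and an estimator of it is needed. Inverse probability weighting arguments are used for its derivation. The main idea is that, if the (marginal) inclusion probability is $Q = P(T > L) \approx n/m$, one individual under study represents $1/Q$ individuals of the total population. So, estimating $m$ with
$$\hat m = \sum_{i=1}^{n} \frac{1}{\hat P(T > \ell_i \mid L = \ell_i)}$$
leads to an estimated probability of being included in the study of $\hat Q = n/\hat m$, and the left-truncation distribution $F_L(\ell)$ can be estimated by
$$\hat F_L(\ell) = \frac{1}{\hat m} \sum_{i=1}^{n} \frac{1(\ell_i \le \ell)}{\hat P(T > \ell_i \mid L = \ell_i)}.$$
Finally, an estimator of the marginal cumulative incidence function in a scenario with dependent left-truncation is
$$\hat P(T \le t, X_T = j) = \frac{1}{\hat m} \sum_{i=1}^{n} \frac{\hat P(T \le t, X_T = j \mid L = \ell_i)}{\hat P(T > \ell_i \mid L = \ell_i)}. \quad (13)$$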
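The weighting step described above can be sketched in a few lines of Python once conditional predictions are available. The sketch below deliberately leaves the Cox fit aside: it assumes two user-supplied callables, one returning the predicted probability of surviving past the own entry time and one returning the predicted conditional cumulative incidence. Both callables and the function name are hypothetical placeholders, not part of any package.

```python
def iplw_marginal_cif(entry, cond_surv_at_entry, cond_cif, t):
    """Inverse-probability-of-left-truncation-weighted marginal CIF at time t.

    entry              : observed entry times of the n individuals under study
    cond_surv_at_entry : callable l -> predicted P(T > l | L = l)   (assumed given)
    cond_cif           : callable (t, l) -> predicted P(T <= t, X_T = j | L = l)
    Returns (m_hat, q_hat, marginal CIF estimate).
    """
    # Each observed individual represents 1 / P(T > l_i | L = l_i) individuals.
    weights = [1.0 / cond_surv_at_entry(l) for l in entry]
    m_hat = sum(weights)                 # estimated size of the untruncated sample
    q_hat = len(entry) / m_hat           # estimated inclusion probability
    cif = sum(w * cond_cif(t, l) for w, l in zip(weights, entry)) / m_hat
    return m_hat, q_hat, cif
```

For example, if every individual had a predicted inclusion probability of 0.5, each would receive weight 2, the estimated population size would be 2n, and the marginal estimate would be the plain average of the conditional predictions. Late entries with small predicted survival at entry receive large weights, which is one way to see why the weighted estimator can be considerably more variable than the Aalen-Johansen estimator.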

Setup
Similar to our real data example, we simulated competing risks data with $k = 2$ competing states as the two primary competing events are induced abortion and spontaneous abortion since live births are usually observed much later. A total number of $m = 500$ individuals were simulated. To begin, the left-truncation times $\ell_i$ are simulated following an exponential distribution with parameter chosen such that the marginal probability of being included in the study is $Q = 60\%$. In a second step, the event times are generated conditional on the first step and from an exponential distribution with parameter corresponding to the all-cause hazard
$$\alpha_{0\cdot}(t \mid L = \ell_i) = \alpha_{01}(t \mid L = \ell_i) + \alpha_{02}(t \mid L = \ell_i).$$
The event type is determined in a third step: Given an event occurs at time $t_i$, it is of type 1 with conditional probability
$$\frac{\alpha_{01}(t_i \mid L = \ell_i)}{\alpha_{0\cdot}(t_i \mid L = \ell_i)}.$$
Finally, the left-truncated data set for analysis consists of the $n \le m$ individuals with $L_i < T_i$, $i = 1, \ldots, n$. We note two facts about the distribution of the simulated data: Firstly, the algorithm is run identically for all initial $m$ units. Hence, the $m$ units are marginally i.i.d. This is why the standard Aalen-Johansen estimator for the marginal cumulative event probability (or Kaplan-Meier for the marginal survival function of $T_1$) computed in the untruncated data set with all $m$ units is an (asymptotically) unbiased estimator. However, conditional on $L_i = \ell_i$, the units differ. Secondly, we reiterate that $(L_i, T_i)$ are viewed as latent variables, see Section 2. Phrased in the context of our data example, the interpretation is that a drug-exposed pregnancy $i$ comes with a latent time-to-TIS-contact $L_i$. If, say, the pregnancy ends in a spontaneous abortion at $T_i = t_i < \ell_i = L_i$, the contact to TIS will not be made anymore. This is not unlike the latent times model for censored survival data, $(T_i, C_i)$, where the right-censoring time $C_i$ is not observed if $T_i$ happens first, but $C_i$ is still assumed to conceptually exist. This point of view is put to further use here by modeling the dependence between $L_i$ and $T_i$ via conditional hazards (6).
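The three simulation steps can be sketched as follows. This is a minimal illustration using only the Python standard library. It assumes cause-specific hazards that are constant in time and log-linear in the entry time, which makes the conditional event time exponential. The parameter names are hypothetical, and no attempt is made to tune the entry rate to a particular inclusion probability.

```python
import math
import random

def simulate_left_truncated(m, rate_l, a01, b01, a02, b02, rng):
    """Simulate m latent pairs (L, T) with event type and apply left-truncation.

    rate_l   : rate of the exponential distribution of the entry time L
    a0j, b0j : cause-specific hazard of type j is a0j * exp(b0j * l), constant in t
    Returns (full, observed): all m units and the units with L < T.
    """
    full, observed = [], []
    for _ in range(m):
        l = rng.expovariate(rate_l)                   # step 1: latent entry time
        h1 = a01 * math.exp(b01 * l)
        h2 = a02 * math.exp(b02 * l)
        t = rng.expovariate(h1 + h2)                  # step 2: event time, all-cause hazard
        cause = 1 if rng.random() < h1 / (h1 + h2) else 2   # step 3: event type
        unit = (l, t, cause)
        full.append(unit)
        if l < t:                                     # only these units enter the study
            observed.append(unit)
    return full, observed
```

The list of all m units plays the role of the untruncated data set on which the target probability can be determined, while only the units with L < T would be passed to an estimator under study. Note that the entry time is generated for every unit, including those that never enter, which reflects the latent variable view of L.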

Survival data without competing risks
The aim of this section is to illustrate the estimand discussed in Section 2 for standard survival data. To this end, assume left-truncation times to follow an exponential distribution with parameter 0.17 and survival data to follow a conditional distribution with hazard $\alpha(t \mid L = \ell_i) = 0.06 \cdot \exp(-0.3 \cdot \ell_i)$; we will reuse these specifications in the context of competing risks. The marginal survival distribution then is
$$P(T > t) = \int_0^\infty \exp\left(-0.06\,e^{-0.3\,\ell}\,t\right)\,0.17\,e^{-0.17\,\ell}\,d\ell,$$
where the integral on the right hand side of the previous display may be evaluated numerically.
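One simple way to evaluate the marginal survival function numerically is a composite trapezoidal rule over the entry time. The sketch below uses the specifications stated above (entry rate 0.17, conditional hazard 0.06 · exp(−0.3 · entry time)). The truncation of the integration range at 200 and the number of grid points are choices made for this illustration only.

```python
import math

def marginal_survival(t, rate_l=0.17, base=0.06, beta=-0.3, upper=200.0, steps=20000):
    """Trapezoidal approximation of P(T > t), integrating over the entry time l.

    Conditional survival given L = l is exp(-base * exp(beta * l) * t), and
    L is exponential with rate rate_l. The upper integration limit is an
    assumption: the exponential density is negligible beyond it.
    """
    h = upper / steps

    def integrand(l):
        cond_surv = math.exp(-base * math.exp(beta * l) * t)
        density = rate_l * math.exp(-rate_l * l)
        return cond_surv * density

    total = 0.5 * (integrand(0.0) + integrand(upper))
    total += sum(integrand(k * h) for k in range(1, steps))
    return total * h
```

Since the conditional hazard never exceeds 0.06 under these specifications, the marginal survival function lies above exp(−0.06 t) for every t > 0, which provides a quick plausibility check of the numerical result.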
We simulated 1000 data sets, each with an initial number of $m$ individuals. Figure 1 shows the true curve based on the above formula, and the average across all data sets of Mackenzie's IPLW estimator of the survival function, which is a special case of (one minus) our estimator (13).
In scenario (C), our approach to modeling dependent left-truncation is misspecified, as it is in scenario (D). In contrast to (C), however, in scenario (D), our new estimator is no longer unbiased and its bias is greater than the bias of the Aalen-Johansen estimator. Moreover, the new IPLW estimator has an increased variance compared to the Aalen-Johansen estimator. On the other hand, when considering the mean squared errors of the estimators of both cumulative incidence functions evaluated at time 40 and displayed in Table 1, a bias-variance trade-off can be found for scenarios (A) and (C). Although the variance is increased in these two scenarios, the mean squared error is comparable to that of the Aalen-Johansen estimator as the bias is smaller. As the variance of the new estimator is the smallest in scenario (B), the mean squared error is also smaller than that of the Aalen-Johansen estimator. Since, in scenario (D), the new IPLW estimator is no longer unbiased and has a strongly increased variance, the mean squared error is the highest in this setting.

Competing risks data
To further investigate this, we considered the test of random left-truncation via Cox modeling. Table 2 displays the estimated mean and empirical 95% confidence intervals of the parameters that model the association between the left-truncation time and the hazard of an event in the Cox model.
On average, results were in line with the model specifications for scenarios (A) and (B). In scenario (C), both cause-specific hazards models are misspecified, while in scenario (D), only the cause-specific hazard for event type 2 is misspecified. Hence, the average result for the latter is also in line with the model specifications. On the other hand, the average results reported in Table 2 for scenario (C) and for the cause-specific hazard for event type 2 in scenario (D) reflect model misspecification. However, and this is a notable difference between (C) and (D), results for (C) are of a similar magnitude as compared to (A) and (B), while the magnitude of the effect reported for the type 2 hazard in (D) clearly stands out. In other words, and as measured by Table 2, the degree of misspecification appears to be larger in (D), which helps to explain the results seen in Figure 2.
We also note that, in some simulated data sets (1 in (A), 9 in (B), 8 in (C), and 48 in (D)), parametric extrapolation was so heavy that negative values were produced in (10). These data sets were not included in the averages reported in Figure 2. Furthermore, the problem was, again, most pronounced in scenario (D).
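The remark about negative values can be illustrated with a small sketch. It assumes that the conditional survival prediction is computed in product-limit form from predicted hazard increments, ie, baseline increments multiplied by exp(coefficient × entry time). The baseline increments and the coefficient below are made-up numbers chosen only to show the mechanism.

```python
import math

def predicted_survival_path(baseline_increments, beta, entry_time):
    """Product-limit survival path from scaled baseline hazard increments.

    Each baseline increment is multiplied by exp(beta * entry_time). If a
    scaled increment exceeds one, the running product turns negative.
    """
    scale = math.exp(beta * entry_time)
    surv, path = 1.0, []
    for increment in baseline_increments:
        surv *= 1.0 - scale * increment
        path.append(surv)
    return path

# Hypothetical baseline increments and coefficient.
path_early = predicted_survival_path([0.05, 0.1, 0.2], beta=0.2, entry_time=0.0)
path_late = predicted_survival_path([0.05, 0.1, 0.2], beta=0.2, entry_time=10.0)
```

For the early entry time all values stay between zero and one, whereas for the late entry time the third scaled increment exceeds one and the path drops below zero. This is the kind of extrapolation problem that arises when predictions are requested for entry times with little support in the data.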
In the statistical software R, the Aalen-Johansen estimator can be calculated with the function etm of the etm package. 21 The conditional cumulative incidence function can be calculated with the help of the mstate package. 22

COUMARIN DATA
We reanalyze the abortion data set, which we have made publicly available in the R package etm, 1,2,21 where for the present analysis, we have broken ties by adding a small uniform random noise to facilitate the semiparametric modeling approach from above. The data set is originally from a prospective cohort study of pregnant women exposed to coumarin derivatives. 1 It was collected by the Teratology Information Service (TIS) of Berlin, Germany. The TIS counsels women on the toxicity of medications during pregnancy. Often, the first time contacting the TIS is in the first trimester of pregnancy.
The data consist of 1186 women. Of these, 173 were exposed to coumarin, an anticoagulant, for medical reasons, and 1013 served as controls as they were not exposed to any potential teratogens. There are three competing events in the data: induced abortion, live birth, and spontaneous abortion. Of the control group, 20 women had an induced abortion, 69 a spontaneous abortion, and 924 children were born alive. In the coumarin group, on the other hand, 38 women had an induced abortion, 43 a spontaneous abortion, and 92 experienced a live birth. We note that the pregnancy outcome of every woman is observed, meaning there is no right-censoring in the data. Figure S7 of the Supporting Information shows the time under study and the event type of the women divided by group. The figure also shows that the real data analysis must pay attention to possible extrapolation. Especially in the coumarin group, there are few late entries. Most women entering after week 22 experience a live birth.
As is typical in pregnancy data, the women enter the study several weeks after conception, so the data are left-truncated. Since the data were collected as a consequence of women or clinicians seeking advice from the TIS, there is the concern that women who take an induced abortion into consideration may be more likely to contact the TIS. However, the TIS does not usually advise induced abortions. Furthermore, more women consider an induced abortion for social reasons than for medical ones. 23 Still, there may be the suspicion of dependence between left-truncation and event time. Since live birth happens at a much later time and spontaneous abortions happen unexpectedly, these events are assumed to be independent of the left-truncation time in the sense that the study entry time does not have an effect on the respective cause-specific hazards. Figure 3 displays the competing risks setting with dependent left-truncation in the abortion data.
To begin, we tested for possible dependence between the induced-abortion hazard and the left-truncation time using a Cox model including the study entry time. Table 3 shows the estimated hazard ratio with corresponding 95% confidence interval, where our first interest here is whether or not the value one is contained in the confidence interval. The subtlety of the competing risks setting is illustrated by the fact that no significant effect was found on the all-cause hazard, but when distinguishing the cause-specific hazards, a significant effect of the study entry time for induced abortion was found in the coumarin group. Possible explanations why a later study entry time increases the risk of induced abortion in the coumarin group could be, on the one hand, that women taking coumarin are more likely to be pregnant unplanned since otherwise the coumarin intake may have been stopped before conception. Unplanned pregnancies are detected later and women considering an induced abortion may be in a decision dilemma and take their time to contact the TIS. On the other hand, the time induced abortions are allowed is limited.
To proceed with the modeling approach, we chose to use as a covariate $\ell_i^{\text{new}} = \ell_i \cdot 1(\ell_i \le 15)$, because an elective termination after gestational week 15 is typically only possible for severe medical reasons. Figure 4 displays the marginal cumulative incidence functions with corresponding 95% confidence intervals of the two estimators together with the simple proportions of the pregnancy outcomes. The pointwise confidence intervals of the new estimator were obtained with a bootstrap by drawing $n$ times with replacement from the study population. For comparison, the simple proportions for induced abortion are 38∕173 ≈ 22% in the coumarin group and 20∕1013 ≈ 2% in the control group. However, the potential underestimation of the induced abortion risk by the Aalen-Johansen estimator must be contrasted with two aspects. One is that simple proportions are still commonly used in the field, and compared with these, both the standard Aalen-Johansen estimator and the new estimator avoid underestimation of the absolute induced abortion risk as presented by the crude proportions. The second aspect is that the confidence intervals of the new IPLW estimator are now much wider compared to the Aalen-Johansen estimator as a consequence of the small sample size of only 173 women in the coumarin group and of the inverse probability weighting arguments. In Section 3 of the Supporting Information, we conducted a simple sensitivity analysis w.r.t. sample size. Artificially increasing the size of the coumarin group resulted in a decrease of the width of the confidence interval. Nevertheless, the confidence interval of the Aalen-Johansen estimator was still narrower. This is in line with our simulation results which found comparable MSEs in scenarios (A) to (C). Also reported in the Supporting Information is a simple stratified IPLW analysis with smaller confidence intervals and point estimates closer to the standard Aalen-Johansen estimator.
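Two ingredients of this analysis are easy to sketch in code: the entry-time covariate that is set to zero after week 15, and a bootstrap that redraws n individuals with replacement. The text does not state which type of bootstrap interval was used, so the percentile interval below is an assumption, as are the function names. The estimator is passed in as a generic callable so that any of the estimators discussed could be plugged in.

```python
import random

def truncated_entry_covariate(entry_week, cutoff=15.0):
    """Entry time as covariate, set to zero for entries after the cutoff week."""
    return entry_week if entry_week <= cutoff else 0.0

def bootstrap_percentile_ci(data, estimator, n_boot=1000, level=0.95, rng=None):
    """Percentile bootstrap interval: resample len(data) units with replacement.

    data      : list of individual records
    estimator : callable mapping a list of records to a number
    """
    rng = rng or random.Random(0)
    n = len(data)
    stats = sorted(
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = stats[int((1.0 - level) / 2.0 * n_boot)]
    upper = stats[int((1.0 + level) / 2.0 * n_boot) - 1]
    return lower, upper
```

As a usage example, applying the function to 38 ones and 135 zeros with the sample mean as estimator gives an interval around the crude proportion 38/173 of induced abortions in the coumarin group. For the weighted estimator, each bootstrap replicate would refit the hazard models on the resampled data, which is why the procedure is computationally heavier than for a simple proportion.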

SUMMARY
Estimating pregnancy outcome probabilities must account for both left-truncation and competing risks. In general, left-truncation must be taken into account in time-to-event studies whenever the natural time origin potentially lies before study entry, and competing risks are present whenever there is more than one event type. Because the time of delayed study entry is an element of the past w.r.t. the time scale of analysis, we may test and model associations between the left-truncation time and the time-to-event. In this work, we have extended a recent proposal by Mackenzie 14 for estimating survival functions in the presence of dependent left-truncation. Using inverse probability of left-truncation weights obtained from Cox modeling of the cause-specific hazards, we obtained an estimator of the cumulative event probabilities when the random left-truncation assumption is in doubt. The new estimator was found to be unbiased in simulation studies with correctly specified model for dependent left-truncation. Complementing Mackenzie's investigations, we also studied our new estimator under misspecification of the left-truncation model with negligible bias in the presence of a frailty term, but more pronounced bias when the true truncation model was additive rather than proportional. In all settings, the new IPLW estimator was much more variable than the original Aalen-Johansen estimator, but with comparable MSEs both in the correctly specified models and in the misspecified frailty case. Both the simulation results and the real data analysis illustrated that our new estimator may be used as a sensitivity analysis of a standard Aalen-Johansen analysis, if the assumption of random left-truncation is in doubt. 
In the real data analysis, we found a signal that the Aalen-Johansen estimator potentially underestimated the absolute risk of induced abortion in the coumarin group, but the potential bias was moderate when compared to the still commonly used incidence proportions. The approach is, in principle, extendable to general multistate models, although, arguably, the problem of increased variation will even be more pronounced in such general models.
Accounting for dependent left-truncation relies on modeling the dependency. Some remarks are in place. To begin, an alternative approach for the estimation of the survival function under dependent left-truncation uses copulas to model the joint distribution of the left-truncation and the event time, 24,25 but the extension to competing risks is less straightforward. An alternative popular method of analysis in the context of competing risks is based on the subdistribution hazard, ie, the hazard attached to the cumulative incidence function interpreting the latter as a distribution function. Originally developed in the presence of right-censoring only, left-truncation has been incorporated by several authors; we refer to the work of Zhang et al 26 who also allow for dependence of the left-truncation time on baseline covariates. The subdistribution hazard has been criticized, questioning whether it has any useful interpretation. 27 In this work, we have deliberately chosen to use the cause-specific hazards rather than the subdistribution hazard. The reason was the ability to specifically model an impact on the hazard of induced abortion only.
In this work, as repeatedly emphasized above, we have also relied on a latent times model (L, T) similar to the common latent times model (T, C) for censored data. An important consequence is that our approach does not apply to common examples of prevalent sampling where left-truncation is due to an internal time-dependent event. For such categorical time-dependent covariates, Cortese and Andersen 28 discuss multistate modeling as a joint model for both the intermediate event and competing risks, see also the book by Beyersmann et al. 29 In a progressive multistate model, a possible approach to estimate a "marginal" cumulative incidence function could be based on estimating the sum of state occupation probabilities corresponding to a competing risk, with absorbing states either reached directly from an initial state or after having passed through the intermediate state. Here, the practical problem will be to estimate the transition intensity from the initial to the intermediate state based on left-truncated data, but such a line of research merits future work, possibly under simplifying assumptions on this intensity.
The real data analysis also illustrated that some care is needed when modeling dependent left-truncation. Here, we only modeled an impact of study entry times within the first 15 weeks, because a later elective termination is typically only possible for severe medical reasons. In the Supporting Information online, we also provide a simple stratified IPLW analysis, which resulted in both less variation and estimates closer to the Aalen-Johansen estimator than the IPLW estimator based on Cox regression. These analyses suggest that investigating modeling alternatives and model choice for accounting for dependent left-truncation merits future research. Compared to the Cox modeling approach in this paper and in the work of Mackenzie, 14 possible directions of hazards modeling include fully parametric modeling as in the work of Anzures-Cabrera and Hutton 13 who employ a Weibull proportional hazards model including a stratified left-truncation indicator as covariate and extensions of the Cox model as in the work of Matsuura and Eguchi 30 who express the hazard as product of a hazard conditional on covariates (but not the left-truncation time) and a "nuisance function" that depends on the left-truncation time. Interestingly, Matsuura and Eguchi also use a stratification approach in their real data analysis. IPLW is not addressed in these papers. 13,30 Of future interest could also be investigating the IPLW estimators of Mackenzie and of the present work under an additive hazards model for the truncation hazard, see, eg, Section 4.2.7 in the work of Aalen et al 20 for a discussion of using the additive hazards model in inverse probability weighting. Another aspect is that, in the real data analysis, we have chosen to break ties in order to facilitate the semiparametric modeling approach. Another possibility and topic of future research would be time-discrete modeling. 31
Finally, in terms of subject matter considerations, we note that selection bias may be present in the TIS data collection as only women who consent are followed up. On the one hand, it has been discussed that the risk profile of women who enroll in TIS studies does not substantially differ from women who do not enroll. 32 On the other hand, many TIS have experienced that women with a higher education may be more likely to seek contact. 33 Schaefer et al 33 give a thorough discussion of such biases. These authors also note that a possible selection effect of socio-economic status or education will likely be present in both drug-exposed groups and in control groups and is not expected to be linked to the potentially harmful drug under investigation. They also stress the prospective character of TIS data as a major advantage. One consequence is that recall bias should be negligible, because very recent drug intake leads to consulting TIS.