New‐user and prevalent‐user designs and the definition of study time origin in pharmacoepidemiology: A review of reporting practices

Abstract

Background: Guidance reports for observational comparative effectiveness and drug safety research recommend implementing a new‐user design whenever possible, since it reduces the risk of selection bias in exposure effect estimation compared to a prevalent‐user design. The uptake of this guidance has not been studied extensively.

Methods: We reviewed 89 observational effectiveness and safety cohort studies published in six pharmacoepidemiological journals in 2018 and 2019. We developed an extraction tool to assess how frequently new‐user and prevalent‐user designs were reported to be implemented. For studies that implemented a new‐user design in both treatment arms, we extracted information about the extent to which the moment of meeting eligibility criteria, treatment initiation, and start of follow‐up were reported to be aligned.

Results: Of the 89 studies included, 40% reported implementing a new‐user design for both the study exposure arm and the comparator arm, while 13% reported implementing a prevalent‐user design in both arms. The moment of meeting eligibility criteria, treatment initiation, and start of follow‐up were reported to be aligned in both treatment arms in 53% of studies that reported implementing a new‐user design. We provide examples of studies that minimized the risk of introducing bias due to an unclear definition of the time origin in unexposed participants, immortal time, or a time lag.

Conclusions: Almost half of the included studies reported implementing a new‐user design. Implications of misalignment of the study design origin were difficult to assess because doing so would require explicit reporting of the target estimand in the original studies. We recommend that the choice of a particular study time origin be explicitly motivated to enable assessment of the validity of the study.


| INTRODUCTION
Guidance reports for comparative effectiveness and safety research of pharmacological treatments recommend the new-user design, [1][2][3][4] in which follow-up time generally starts with the first prescription or dispensing of the drug(s) of interest. 5 In contrast, in the prevalent-user design both current (prevalent) and new users of a drug are included.
The new-user design enforces appropriate temporal ordering of measurements of confounders, treatment, and outcome, protecting the researcher against accidental adjustment for variables affected by treatment and against finding associations that are based on reverse causation. [1][2][3][4][5][6][7][8] However, the start of a treatment can be difficult to capture (especially in the case of intermittently used treatments), and exclusion of prevalent users may reduce follow-up time or sample size. 5,[7][8][9][10] It is unclear how often and for which reasons researchers deviate from the guidance to implement a new-user design.
To assess the uptake of new-user design guidance, it is important to go beyond the distinction of including new or prevalent users.
Many time-related biases can be prevented by choosing a study time origin (or study baseline) that establishes alignment of the moment of meeting eligibility criteria, treatment initiation, and start of follow-up. 6,[11][12][13] Previous studies investigated how often pharmacoepidemiological studies deviated from the recommendation to implement a new-user design; [14][15][16] however, the implementation of new-user designs in terms of alignment of eligibility, treatment initiation, and start of follow-up has not yet been studied.
In the current study, we reviewed the literature about contemporary observational effectiveness and safety cohort studies. We assessed how frequently new-user and prevalent-user designs were reported to be implemented in studies published in high-ranked pharmacoepidemiologic journals. For studies implementing a new-user design, we evaluated to what extent eligibility, treatment initiation, and start of follow-up were reported to be aligned.

| METHODS
We systematically assessed the reporting practices in observational studies of treatment effects regarding the definition of the study time origin and inclusion of new versus prevalent users of treatment. A protocol of this study is available on Open Science Framework. 17 Based on recommendations by the editor and reviewers, we deviated from this protocol. Specifically, while we scored the items of the extraction tool for all included articles, we discuss the results on alignment in study design origin for new-user designs only, as will be explained below. This review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, 18 where applicable.

| Journal selection and included type of studies
We aimed to review the reporting of approximately 100 articles published before the 1st of July 2019 in journals publishing pharmacoepidemiologic studies of drug-outcome associations. Six pharmacoepidemiological journals were included: Annals of Pharmaco-

| Extraction of study characteristics and evaluation of reporting quality
Articles were scored on a set of items derived from guideline recommendations about elements that should be reported in protocols 20,21 or articles 4,22 of effectiveness and safety research using large observational databases, as well as methodological articles that discuss the study time origin in observational studies of causal effects. 6,11 The main focus was on the distinction between new-user and prevalent-user designs and the alignment of the moment of meeting eligibility criteria, moment of treatment initiation, and start of follow-up in new-user designs. The rationale for alignment of meeting eligibility, treatment initiation, and start of follow-up, as well as possible consequences of misalignment, is described in the Data S1. The established scoring tool was pilot tested on six randomly chosen included studies by KL and JS and further adjusted (all items can be found in Tables 2 and 3).

Key Points

• Literature about recent pharmacoepidemiologic effectiveness and safety cohort studies of drug-outcome associations was reviewed to assess the reporting of implementation of and rationale for using new-user and prevalent-user designs.

• Almost half of the included studies reported following the recommendation to implement a new-user design. Rationales for implementing a prevalent-user design were scarcely reported.

• The study time origin and allocation of follow-up time influence the extent to which the available data can provide a meaningful estimate of the causal effect of interest. We recommend that the choice of a particular study time origin be explicitly motivated to enable assessment of the validity of the study.
An incident user can more generally be defined as a new user of any treatment decision: initiating a treatment, but also switching to a different treatment or changing dose. This understanding of the new-user design was introduced by Brookhart, 23 expanded to prevalent new-users of treatment by Suissa, 24 and was followed during scoring of articles. For the item that scored whether the comparator exposure arm implemented a new-user or prevalent-user design, we decided to score nonusers of treatment as prevalent users. Although nonuse is not associated with the biases typically associated with prevalent users (eg, adjusting for intermediates, depletion of susceptibles), defining the study time origin in studies with a nonuser comparator arm is complicated: the time origin from which the (cumulative) probability of the event of interest accrues in the specified population may not be straightforward to define for nonusers of treatment. Consequently, it is more challenging to assess whether the study exposure arm and comparator arm can be assumed to be comparable conditional on measured confounders (ie, whether there is conditional exchangeability).
Information was gathered on general characteristics of the included studies: funding source, type of data source, patient domain, sample size, and length of enrollment window. Funding source was defined as "private" when the article stated the study was funded by a pharmaceutical company or when any of the authors was affiliated with a pharmaceutical company, and as "public" otherwise.
Data sources were classified into hospital data, dispensings, prescriptions, or claims. Patient domain was grouped into medical specialties based on the target population that was mentioned in the article objective. When the target population did not match a single medical specialty, information on the type of treatment and study outcome was used to identify the medical specialty.
Articles were reviewed independently by KL and JS; results were discussed between the two reviewers and, in case of disagreement, a third reviewer (RG) was consulted. When multiple effectiveness or safety analyses were described in a single article, only the first-reported analysis was scored. When subgroup analyses were performed in the included studies, only the main analysis was scored.
When methods were discussed in an online protocol or described in a different article, we reviewed the referred material.

| Data synthesis
Rater agreement was computed using the unweighted Cohen's kappa for nominal variables and two coders. 25 Cohen's kappa ranges from −1 (perfect disagreement) to 1 (perfect agreement). Reporting of items was presented as percentages of the total number of included studies, and 95% confidence intervals (CIs) were computed using the normal approximation.

FIGURE 1 The screening and inclusion of eligible articles
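As an illustrative aside, both quantities described above can be computed in a few lines. The sketch below is ours, not from the original study: the function names and the small example ratings are made up, and the CI is the simple Wald (normal-approximation) interval for a proportion.

```python
# Illustrative sketch: unweighted Cohen's kappa for two raters over nominal
# categories, and a normal-approximation 95% CI for a reported proportion.
# Example data below are invented for demonstration purposes only.
from collections import Counter
from math import sqrt

def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa: chance-corrected agreement for two raters."""
    n = len(ratings_a)
    # Observed agreement: proportion of items where the raters match.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected agreement if the two raters scored independently.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    return (p_o - p_e) / (1 - p_e)

def proportion_ci(k, n, z=1.96):
    """Wald (normal-approximation) 95% CI for a proportion k/n."""
    p = k / n
    se = sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

a = ["new", "new", "prevalent", "unclear", "new", "prevalent"]
b = ["new", "prevalent", "prevalent", "unclear", "new", "new"]
kappa = cohens_kappa(a, b)
lo, hi = proportion_ci(36, 89)  # eg, 36/89 studies with a new-user design
```

With 36 of 89 studies, this interval reproduces the 30%-51% CI reported for the new-user design percentage.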

| RESULTS
After screening the full texts of the 100 articles included during abstract and title screening, 89 studies remained based on the eligibility criteria (see Figure 1). The characteristics of the 89 included studies are summarized in Table 1. The most common patient domains considered were cardiology (17%), neurology (11%), and primary care (10%). The median sample size was 7,011 (range 351,674). In 10% of studies (n = 9), a sample size calculation was reported. The length of follow-up ranged from 1 hour in one study to a median of 13.6 years in another. Rater agreement is presented in Figure 2.
Item kappas indicated that agreement between raters was low (range 0.05-0.75), which was mostly due to ambiguous reporting of the extracted information. Despite the low rater agreement of the initial scores, the presented results have a meaningful interpretation since consensus was reached for all scores with initial disagreement.

| New-user and prevalent-user designs
An overview of item scores is given in Table 2. Forty percent of studies (95% CI 30%-51%, n = 36) reported implementing a new-user design for both the study exposure arm and the comparator exposure arm, while 13% (7%-22%, n = 12) reported implementing a prevalent-user design for both treatment arms (Figure 3). In 58% (42%-74%, n = 21) of studies with a new-user design for both treatment arms, a washout for exposure was reported. For 6% of studies (1%-10%, n = 5) it was unclear whether a new-user or a prevalent-user design was implemented. Of the studies that reported implementing a prevalent-user design, three provided a rationale for including prevalent users. The motivation to include prevalent users concerned biological plausibility of a cumulative effect on outcome risk. 26-28
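Operationally, a washout criterion of this kind can be applied directly to dispensing records. The sketch below is hypothetical: the record layout, the dates, the 365-day washout, and the function name are illustrative assumptions of ours, not taken from any reviewed study.

```python
# Hypothetical sketch: selecting new users of a study drug from dispensing
# records with a 365-day washout, ie, no dispensing of the drug in the year
# before the candidate index date. All records and dates are invented.
from datetime import date, timedelta

# (patient_id, dispensing_date) records, including pre-enrollment history.
dispensings = [
    (1, date(2017, 3, 1)), (1, date(2017, 4, 1)),    # new user in 2017
    (2, date(2016, 12, 10)), (2, date(2017, 6, 5)),  # recent prior use: excluded
    (3, date(2017, 8, 20)),                          # new user in 2017
]

def new_users(dispensings, window_start, washout_days=365):
    """Return {patient_id: index_date} for patients whose first dispensing
    on or after `window_start` is preceded by `washout_days` without any
    dispensing of the drug (pre-window history counts toward the washout)."""
    by_patient = {}
    for pid, d in dispensings:
        by_patient.setdefault(pid, []).append(d)
    index_dates = {}
    for pid, dates in by_patient.items():
        in_window = sorted(d for d in dates if d >= window_start)
        if not in_window:
            continue  # never dispensed during the enrollment window
        index = in_window[0]
        washout_start = index - timedelta(days=washout_days)
        # Exclude prevalent users: any dispensing inside the washout window.
        if not any(washout_start <= d < index for d in dates):
            index_dates[pid] = index
    return index_dates

cohort = new_users(dispensings, window_start=date(2017, 1, 1))
```

In this toy dataset, patients 1 and 3 qualify as new users, while patient 2 is excluded because of a dispensing within the washout window before the index date.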

| Alignment in new-user designs
In the 36 studies that reported implementing a new-user design in both treatment arms, the moment of meeting eligibility criteria, treatment initiation, and start of follow-up were reported to be aligned in both treatment arms in 53% of studies (36%-69%, n = 19). These three elements were reported to be misaligned in both treatment arms in 6% of studies (0%-13%, n = 2), and alignment was unclear in 6% of studies (0%-13%, n = 2) (Figure 3). In the remaining studies (n = 13), at least one of the six alignment items was misaligned or unclear (see Table 3 for the alignment items).
Implications of misalignment of eligibility, treatment initiation, and start of follow-up can only be assessed relative to the specified target estimand, and not all included studies provided an explicit definition of the target estimand. In studies that did not explicitly report the target estimand, it was often unclear which treatment strategies were compared and which treatment decision could be informed by evidence from the conducted study.

| Examples of good practice
Using examples from the 89 included studies, the next section illustrates how study designs that deviate from an archetypical pharmacoepidemiological active-comparator new-user design can still provide estimates of the target treatment effect with a meaningful interpretation. We did not find any examples with a meaningfully defined study time origin among studies that contained a prevalent-user active-comparator arm.

| Study design with nonuser comparator arm
Korol and colleagues investigated whether initiation of spironolactone affected the risk of new-onset diabetes in older patients with heart failure compared to not initiating spironolactone. 29 The patient cohort was defined by the day of discharge of the first hospitalization for heart failure. For the study exposure arm, follow-up started at the date of the first dispensed prescription of spironolactone. The start of follow-up for unexposed comparator patients was matched to that of exposed patients on the time-since-hospital-discharge axis, which establishes a meaningful study time origin for nonusers, provided additional assumptions are met, such as measurement of sufficient confounders to invoke the exchangeability assumption (Table 4). Note that when an event-based cohort is established, resetting the start of follow-up at the moment of treatment initiation, or at a comparable duration since diagnosis, is essential to prevent the introduction of immortal time bias. 11
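To make concrete why the allocation of follow-up time matters, the toy calculation below (entirely hypothetical numbers, not drawn from any reviewed study) contrasts a naive analysis that counts the waiting period before treatment initiation as treated person-time with a corrected analysis that starts treated follow-up at initiation:

```python
# Toy illustration of immortal time: in an event-based cohort, the time
# between cohort entry (eg, hospital discharge) and treatment initiation is
# "immortal" for the treated arm, because patients must survive it to be
# classified as treated. All numbers below are invented.
# Each tuple: (days_to_initiation or None, days_to_event_or_censoring, event)
patients = [
    (30, 400, False),    # initiated day 30, censored day 400
    (60, 200, True),     # initiated day 60, event day 200
    (None, 100, True),   # never initiated, event day 100
    (None, 300, False),  # never initiated, censored day 300
]

def rate_biased(patients):
    """Naive rate: treated patients contribute ALL time since cohort entry,
    including the pre-initiation (immortal) period."""
    treated = [(i, t, e) for i, t, e in patients if i is not None]
    events = sum(e for _, _, e in treated)
    person_days = sum(t for _, t, _ in treated)
    return events / person_days

def rate_corrected(patients):
    """Corrected rate: treated follow-up is reset to start at initiation,
    so the immortal period is excluded from the denominator."""
    treated = [(i, t, e) for i, t, e in patients if i is not None]
    events = sum(e for _, _, e in treated)
    person_days = sum(t - i for i, t, _ in treated)
    return events / person_days

# The biased rate is artificially lower, because immortal person-time
# inflates the denominator and makes the treatment look protective.
assert rate_biased(patients) < rate_corrected(patients)
```

Here the naive analysis yields 1 event per 600 person-days versus 1 per 510 person-days after correction, a spurious apparent benefit created purely by the design.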

| Study design that anticipated immortal time
Chaignot and colleagues studied whether initiation of baclofen affected the risk of hospitalization and death compared to initiation of acamprosate in adults with an alcohol use disorder without comorbidities. 30 The patient cohort was defined by initiation of baclofen/acamprosate. To be eligible, patients had to receive at least two reimbursements for the same drug within 60 days after the first reimbursement, meaning that for included individuals, hospitalization or death could not occur before the second reimbursement. The start of follow-up was therefore reset to the second reimbursement to prevent immortal time bias (Table 4).

FIGURE 3 Frequency of reporting of implementation of new-user and prevalent-user design and type of comparator across the 89 included studies. For studies that reported implementing a new-user design, alignment of eligibility, treatment initiation, and follow-up was scored "completely aligned" when all three elements were reported to be aligned in both the active and comparator exposure arm; "completely misaligned" when none of the elements were reported to be aligned in either arm; "unclear" when all three elements were unclear in both arms; and "partial alignment" otherwise.

TABLE 4 Examples of design solutions for study time origin.

Research question: Does initiation of spironolactone affect the risk of new-onset diabetes in older patients with heart failure compared to nonuse of spironolactone? 29
Study time origin: The patient cohort was defined by the day of discharge of the first hospitalization for heart failure. For the study exposure arm, follow-up started at the date of the first out-of-hospital dispensed prescription of spironolactone. The date of start of follow-up for unexposed comparator patients was matched to that of exposed patients on the time-since-hospital-discharge axis to establish a meaningful study time origin for nonusers. The authors did not report whether the nonuser cohort was defined based on current exposure information or on future exposure information, that is, whether nonusers could still start spironolactone after their inherited date of start of follow-up or had to be unexposed during the entire study follow-up. The latter could result in a comparator cohort that is restricted to individuals who never had an indication for the treatment, which does not necessarily match the causal contrast of interest. 38

Research question: Does initiation of baclofen affect the risk of hospitalization and death compared to initiation of acamprosate in adults with an alcohol use disorder without comorbidities? 30
Study time origin: The patient cohort was defined by initiation of baclofen/acamprosate. To be eligible, patients had to have received at least a second reimbursement for the same drug within 60 days after the first reimbursement. The start of follow-up was reset after the second prescription to prevent immortal time bias. The study thus estimates the causal effect of baclofen compared to acamprosate given that everyone filled at least two prescriptions within 60 days and death was prevented in the time until the second prescription was filled.

| DISCUSSION

Based on our observations, it is our view that choosing a meaningful time origin is a more fundamental component of the study design than the distinction between new or prevalent users alone.
Even when a new-user design was implemented, some of the articles we reviewed defined the study time origin ambiguously. Reporting guidelines, such as RECORD-PE, 35 state that study entry criteria and the order in which these criteria were applied to identify the study population should be clearly described. Simply indicating that a new-user design was implemented is insufficient to establish the validity of a study's design and time origin.
Our study had limitations. We focused on study-design approaches to define a meaningful study time origin. Although data analysis approaches can establish correct allocation of follow-up time as well, 24,36 we did not assess them in our review. Misalignment of eligibility, treatment initiation, and start of follow-up may be appropriate when exposures are evaluated in a time-dependent manner. Four of the studies that reported implementing a new-user design studied a time-dependent exposure, thereby possibly adjusting for any misalignment in the study design. In our review, we assessed how frequently new-user and prevalent-user designs were implemented based on the reporting in original articles. It was not always possible to distinguish between lack of reporting and lack of implementation.
Our results should therefore be interpreted as a summary of reporting practices on study time origin in six journals. A final limitation is that our search was restricted to a convenience sample of journals. Arguably, the six selected journals represent the higher-impact, specialist pharmacoepidemiology journals, and our results may therefore overestimate the quality of reporting of pharmacoepidemiologic studies in general.