HARmonized Protocol Template to Enhance Reproducibility of hypothesis evaluating real‐world evidence studies on treatment effects: A good practices report of a joint ISPE/ISPOR task force

Abstract
Problem
Ambiguity in communication of key study parameters limits the utility of real-world evidence (RWE) studies in healthcare decision-making. Clear communication about data provenance, design, analysis, and implementation is needed. This would facilitate reproducibility, replication in independent data, and assessment of potential sources of bias.
What We Did
The International Society for Pharmacoepidemiology (ISPE) and ISPOR-The Professional Society for Health Economics and Outcomes Research (ISPOR) convened a joint task force, including representation from key international stakeholders, to create a harmonized protocol template for RWE studies that evaluate a treatment effect and are intended to inform decision-making. The template builds on existing efforts to improve transparency and incorporates recent insights regarding the level of detail needed to enable RWE study reproducibility. The overarching principle was to reach for sufficient clarity regarding data, design, analysis, and implementation to achieve three main goals: first, to help investigators thoroughly consider, then document, their choices and rationale for key study parameters that define the causal question (e.g., target estimand); second, to facilitate decision-making by enabling reviewers to readily assess potential for biases related to these choices; and third, to facilitate reproducibility.
Strategies to Disseminate and Facilitate Use
Recognizing that the impact of this harmonized template relies on uptake, we have outlined a plan to introduce and pilot the template with key international stakeholders over the next 2 years.
Conclusion
The HARmonized Protocol Template to Enhance Reproducibility (HARPER) helps to create a shared understanding of intended scientific decisions through a common text, tabular and visual structure. The template provides a set of core recommendations for clear and reproducible RWE study protocols and is intended to be used as a backbone throughout the research process from developing a valid study protocol, to registration, through implementation and reporting on those implementation decisions.


| BACKGROUND
Regulatory agencies, health technology assessors, and payers are increasingly interested in studies that make use of real-world data (RWD) to inform regulatory and other policy or clinical decision-making. [1][2][3][4][5] While real-world evidence (RWE) studies using rigorous methods applied to fit-for-purpose RWD can provide critical, timely insights into the safety and effectiveness [6][7][8] of drugs, devices, and vaccines, high-profile cases of studies conducted with biased methods [9][10][11][12] or inadequate reporting on unsuitable data [13][14][15] have raised concerns over the credibility of RWE studies. These concerns have led to increasing calls from the research community and decision-makers for more transparency on the design and conduct of studies using RWD. [16][17][18] Some initiatives are already in place. As an example, the European Medicines Agency (EMA) has, for over a decade, required or recommended registration of a study protocol using a template for observational post-authorization safety studies (PASS) conducted by marketing authorization holders. [19,20] However, a large-scale evaluation of the reproducibility of 150 studies highlighted that there remains a great deal of variability in transparency about critical details of RWE study implementation, [21] and recently, the EMA endorsed a strategy for moving toward greater standardization and structure in protocols. [22] Clear communication within multi-disciplinary study teams and between investigators, decision-makers, and other stakeholders is necessary to increase confidence in RWE study design, conduct, and results.
The rapid development of fragmented recommendations [23] has highlighted the need for an internationally agreed-upon set of core expectations regarding best practices for developing and communicating about study design, analysis, and implementation via transparent, comprehensive, and rigorous RWE study protocols. A joint task force between the International Society for Pharmacoepidemiology (ISPE) and ISPOR-The Professional Society for Health Economics and Outcomes Research (ISPOR) was convened to meet this need by developing a harmonized protocol template for RWE studies that make secondary use of RWD, evaluate a hypothesis, and are intended to inform healthcare decision-making. The task force comprised core committee members from both professional societies and representatives of international stakeholder groups, including regulatory agencies, health technology assessment (HTA) organizations, industry, and academia.
The task force focused primarily on protocols for postmarketing studies that address questions of causal inference using RWD, because of their importance to decision-making and the complexity of design and analysis when addressing causal questions.
Examples of such studies include comparative effectiveness or safety studies associated with clinical interventions, studies of the effect of policy interventions such as benefit designs or healthcare delivery models, and studies of healthcare expenditures or value associated with different treatments. While it is also important to develop protocols for non-causal inference studies using RWD, that was not the focus of the protocol harmonization effort.
The task force met monthly from July 2021 to January 2022 to develop the harmonized template. The development process included evaluation of both external validity (through comparison of existing protocol templates or guidance developed by international multi-stakeholder groups to ensure compatibility with agreed-upon scientific principles) and internal validity (through testing and development of example use cases with different designs and data sources by five sub-teams). The final deliverables were a standard template with embedded instructions, harmonized across existing guidance and templates, and example protocols for a variety of use cases to illustrate how to use the template.

| Identification and comparison of protocol templates
Existing protocol templates for RWE studies were identified based on templates known to the core committee of the joint task force, coupled with a search for relevant protocol templates in PubMed and the EQUATOR network (Enhancing the QUAlity and Transparency Of health Research) (Figure 1, Appendix S1). Additionally, an extended reviewer group composed of volunteers from ISPE and ISPOR was asked to review the list of identified protocol templates and to supplement it with other templates that they were aware of. Protocol templates that were not relevant for RWE studies that make secondary use of healthcare data or were not developed by international multi-stakeholder groups were excluded. This resulted in four eligible protocol Tool for Real World Evidence (STaRT-RWE). [26] Section headings of the identified protocol templates were compared and mapped to each other, using the oldest guideline (EMA-GVP Module VIII-PASS) as the starting point (Table 1). The committee observed that, at a conceptual level, the major elements of study design and analysis were already largely agreed upon and included in each of the templates. However, the templates differed in the depth and detail of guidance within each section as well as in the sequencing of elements within the template. Three of the protocol templates A high-level summary of other differences in format and depth of detail requested by each template is provided in supplemental appendices (Appendix S2).

| Creation of HARmonized Protocol Template to Enhance Reproducibility
In order to create a harmonized template, the core committee of the joint task force discussed each section header in the mapped table of protocol templates. Again, starting with the EMA-GVP Module VIII-PASS template, the committee evaluated the different sections, guidance and/or structure of more recently developed protocol templates under the same section header, jointly deciding how to incorporate these updates into the harmonized protocol template. The committee then categorized the sections as core elements required for any RWE study protocol and non-core elements that may provide important context, administrative and other information (Table 1). Core elements of the protocol were defined as sections that were either considered key for the purposes of reproducibility and validity assessment or were common elements that were found in multiple protocol templates and were important to consider core for administrative or other reasons.
Figure 1. PRISMA diagram.
After populating an initial mock-up template, the core committee discussed and concluded that a combination of free-text and struc-

| Piloting the usability of HARPER
To pilot the usability of the draft harmonized template, the core committee formed five subgroups. These subgroups had the task of populating the draft harmonized protocol template for a variety of use cases that involved different study designs, data sources, and types of data elements. Four of these use cases were based on published effectiveness and safety studies and one was for a study that was in the planning/design phase (Table 2).
Table 1. Comparison of four protocol templates for real-world evidence studies developed by multi-stakeholder, international organizations. Note: Shaded gray area within bold black lines reflects core protocol components.
The members of each subgroup worked together to populate the initial version and relayed any issues to the core committee at large for discussion. The harmonized protocol template was revised to improve usability following this group exercise. These revisions included expanding the set of sections that were considered core, relabeling some structured prompts, and adding more helper text to guide investigators in use of the template. The abbreviated protocols for each use case were transferred onto the final version of the template to provide guidance and examples for future users (Appendix S3).

| Title page
The title page includes a table for administrative details, such as the title of the protocol, brief objectives, a protocol version date, names of investigators and sponsor, study registration, and potential conflicts of interest.

| Abstract
The abstract is a free-text section that includes a description of the background, research question and objectives, study design, and data sources.
Table 10. Primary, secondary, and subgroup analysis specification.
The amendments table documents what is changed, when it is changed, and why. For example, over the process of developing and implementing a protocol, investigators could start with an initial version of inclusion-exclusion criteria to produce an initial set of feasibility counts (looking at outcome counts that are not stratified by exposure). In version 2, they could use revised algorithms to generate a second set of feasibility counts, evaluate whether there is enough power, and assess diagnostics such as propensity score overlap and balance. In version 3, they could use finalized algorithms to create the analytic cohort.
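The three-version example above can be pictured as a small structured log. The following is a minimal sketch only: the field names (`version`, `changed_on`, `what_changed`, `rationale`) and the dates are hypothetical illustrations and are not taken from the HARPER template itself.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Amendment:
    """One row of an amendments log (field names are hypothetical)."""
    version: int
    changed_on: date
    what_changed: str
    rationale: str


def latest_version(log: list[Amendment]) -> int:
    """Return the highest protocol version recorded in the log."""
    return max(entry.version for entry in log)


# Entries mirror the three-version example in the text.
# The dates are placeholders, not taken from any real study.
amendments = [
    Amendment(1, date(2022, 1, 10), "Initial inclusion-exclusion criteria",
              "Feasibility counts not stratified by exposure"),
    Amendment(2, date(2022, 3, 1), "Revised algorithms",
              "Assess power, propensity score overlap, and balance"),
    Amendment(3, date(2022, 5, 15), "Finalized algorithms",
              "Create the analytic cohort"),
]

for entry in sorted(amendments, key=lambda a: a.version):
    print(f"v{entry.version} ({entry.changed_on.isoformat()}): "
          f"{entry.what_changed} | {entry.rationale}")
```

Keeping each change as a separate, dated record is one way to make the "what, when, and why" of a protocol revision auditable alongside the analysis code.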

| Milestones
This section includes a table to outline the anticipated timeline for study milestones.

| Rationale and background
This section includes structured free-text prompts to encourage inclusion of key contextual information. For example, investigators might include a paragraph about what is known about the condition and the exposures being investigated, the knowledge gaps, and the expected contribution of the study described in the protocol.

| Research question and objectives
The prompts ask the user to summarize PICOT information, that is, the population, intervention/exposure, comparator, outcome, and time horizon for the study (when follow-up begins and ends), as well as the main measure of effect. The text prompts closely align with the information needed to compare an RWE study design to a theoretical trial designed to address the same question (e.g., a target trial [27]). This section includes structured free-text prompts to specify the primary and secondary objectives, as well as the hypotheses being tested for each.
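As an illustration of how the PICOT prompts could be captured in a structured, checkable form, here is a minimal sketch. The class name `Picot`, its field names, the helper `missing_fields`, and the example study values are all hypothetical and invented for illustration; they are not part of the template.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class Picot:
    """Structured summary of a research question (hypothetical field names)."""
    population: str
    intervention: str
    comparator: str
    outcome: str
    time_horizon: str    # when follow-up begins and ends
    effect_measure: str  # main measure of effect


def missing_fields(spec: Picot) -> list[str]:
    """Return the names of any fields left blank."""
    return [name for name, value in asdict(spec).items() if not value.strip()]


# Invented example values, for illustration only.
example = Picot(
    population="Adults newly starting treatment A or treatment B",
    intervention="Treatment A",
    comparator="Treatment B",
    outcome="Hospitalization for outcome X",
    time_horizon="Day after treatment start until outcome, disenrollment, or 365 days",
    effect_measure="Hazard ratio",
)
incomplete = replace(example, comparator="  ")

print(missing_fields(example))     # []
print(missing_fields(incomplete))  # ['comparator']
```

A completeness check of this kind only confirms that each element was stated; it says nothing about whether the choices are scientifically appropriate, which remains a matter for investigators and reviewers.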

| Limitation of the methods
This section is free-text and provides space for the investigators to summarize the anticipated limitations of the methods and data described in Section 1.4.4.

| Protection of human subjects
This free-text section is intended for the investigators to describe patient privacy protections and the plan to maintain data confidentiality or prevent re-identification. For example, investigators may report how the data were anonymized or pseudo-anonymized, whether small cell sizes were suppressed (if the data holder requires), and/or whether the study protocol underwent ethics review. For many studies using RWD, the latter may not be applicable. If the study is considered exempt by the relevant ethics board, this should be stated along with the reason it is considered exempt.
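Small-cell suppression, mentioned above, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `suppress_small_cells` is hypothetical, and the default threshold of 11 is only an assumed placeholder, since the actual rule is set by each data holder. The sketch also ignores complementary suppression (masking additional cells so that suppressed values cannot be recovered from totals).

```python
def suppress_small_cells(counts: dict[str, int], threshold: int = 11) -> dict[str, str]:
    """Mask nonzero counts below `threshold` before reporting.

    The threshold is an assumption for illustration; data holders
    define their own minimum reportable cell size.
    """
    masked = {}
    for label, n in counts.items():
        if 0 < n < threshold:
            masked[label] = f"<{threshold}"
        else:
            masked[label] = str(n)
    return masked


# Hypothetical outcome counts by exposure group.
print(suppress_small_cells({"exposed": 3, "comparator": 25, "other": 0}))
```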

| Reporting of adverse events
This free-text section is for investigators to state the plan to report adverse events. This reporting is mandated for certain types of post-authorization studies. [36] If it is not applicable, that can be stated here.

| References
This section is for providing a bibliography for cited work.

| Appendices
The structured, human-readable tables in the harmonized template are intended to be accompanied by appendices that list the clinical code algorithms in a format that can be read directly by programming code to facilitate creation of study variables. An example is provided in supplemental Appendix S3 Example 1. Appendices detailing other matters, such as decisions made when converting source data to a common data model or performing data linkage, may also be relevant, depending on the study. Some appendices (e.g., those specifying clinical code algorithms used for covariates) may not be developed until later versions of the protocol as the study progresses. Likewise, over the course of the study, algorithms included in the appendices may be amended, with the changes documented in the amendments table.
Some investigators may use code algorithms that they consider proprietary. If that is the case, this should be so noted in the protocol, thus allowing the reviewer to weigh the potential impact of not having this information on their ability to evaluate the validity or relevance of the study results.
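To illustrate what a machine-readable code-algorithm appendix might look like in practice, the sketch below reads a small CSV and uses it to flag a study variable. Everything here is an assumption for illustration: the column names (`variable`, `code_system`, `code`), the placeholder values (`CODESYS_A`, `X01`, and so on), and the helper functions are hypothetical and do not reproduce the format used in Appendix S3.

```python
import csv
import io

# Placeholder appendix content; real appendices would list actual clinical codes.
APPENDIX_CSV = """variable,code_system,code
outcome_example,CODESYS_A,X01
outcome_example,CODESYS_A,X02
covariate_example,CODESYS_B,Y10
"""


def load_code_lists(text: str) -> dict[str, set[tuple[str, str]]]:
    """Group (code_system, code) pairs by the study variable they define."""
    code_lists: dict[str, set[tuple[str, str]]] = {}
    for row in csv.DictReader(io.StringIO(text)):
        pair = (row["code_system"].strip(), row["code"].strip())
        code_lists.setdefault(row["variable"].strip(), set()).add(pair)
    return code_lists


def has_variable(patient_codes: list[tuple[str, str]],
                 code_list: set[tuple[str, str]]) -> bool:
    """True if any of the patient's recorded codes appears in the code list."""
    return any(code in code_list for code in patient_codes)


code_lists = load_code_lists(APPENDIX_CSV)
patient = [("CODESYS_A", "X02"), ("CODESYS_B", "Z99")]
print(has_variable(patient, code_lists["outcome_example"]))
print(has_variable(patient, code_lists["covariate_example"]))
```

Matching on the code system and the code together, rather than on the code alone, avoids accidental collisions between coding systems that reuse the same strings.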

| DISCUSSION
A joint task force between ISPE and ISPOR, including representation from key international stakeholders, was formed to create a harmonized protocol template for RWE studies that evaluate a treatment effect and are intended to inform decision-making. The HARmonized Protocol Template to Enhance Reproducibility (HARPER) builds on existing efforts, providing clarity, structure, and a common denominator regarding the level of operational detail, context, and rationale necessary in a protocol to produce a transparent, reproducible study and to support assessment of fitness-for-purpose. The overarching principle was to reach for sufficient clarity in the protocol regarding data, design, analysis, and implementation over the lifecycle of a study to achieve three main goals. The first goal was to help investigators thoroughly consider, then document, their decisions and rationale regarding key study parameters that define the causal question (e.g., target estimand [37]). In this way, the template could help investigators to think more carefully about their choices and be used to help train a future generation on best practices. The second goal was to facilitate decision-making by enabling reviewers to readily assess potential for biases related to the clearly communicated investigator choices and rationale. The third goal was to facilitate reproducibility of results.
While the primary focus was on hypothesis-evaluating RWE studies, HARPER can also be used as the basis of protocols for descriptive, utilization, predictive, or other types of RWE studies. However, there may be some variation regarding which sections are considered core versus optional for different stakeholders (e.g., regulatory, HTA, or academic).

| Parallel workstreams, relationships to checklists/bias assessment tools for RWE
In addition to issues of transparency, many professional associations, regulatory bodies, and health technology assessment agencies have issued best practice guidelines and checklists for the analysis of RWD.