Overview of the European post‐authorisation study register post‐authorization studies performed in Europe from September 2010 to December 2018

Abstract Background The European post‐authorisation study (EU PAS) register is a repository launched in 2010 by the European Medicines Agency (EMA). All EMA‐requested PAS, commonly observational studies, must be recorded in this register. Multi‐database studies (MDS) leveraging secondary data have become an important strategy to conduct PAS in recent years, as reflected by the type of studies registered in the EU PAS register. Objectives To analyse and describe PAS in the EU PAS register, with focus on MDS. Methods Studies in the EU PAS register from inception to 31st December 2018 were described concerning transparency, regulatory obligations, scope, study type (e.g., observational study, clinical trial, survey, systematic review/meta‐analysis), study design, type of data collection and target population. MDS were defined as studies conducted through secondary use of >1 data source not linked at patient‐level. Data extraction was carried out independently by 14 centres with expertise in pharmacoepidemiology, using publicly available information in the EU PAS register, including the study protocol whenever available, and a standardised data collection form. For validation purposes, a second revision of key fields for a 15% random sample of studies was carried out by a different centre. The inter‐rater reliability (IRR) was then calculated. Finally, to identify predictors of primary data collection‐based studies (versus those based on secondary use of healthcare databases) or MDS (vs. non‐MDS), odds ratios (OR) and 95% confidence intervals (CI) were calculated by fitting univariate logistic regression models. Results Overall, 1426 studies were identified. Clinical trials (N = 30; 2%), systematic reviews/meta‐analyses (N = 16; 1%) and miscellaneous study designs (N = 46; 3%) were much less common than observational studies (N = 1227; 86%). The protocol was available for 63% (N = 360) of 572 observational studies requested by a competent authority.
Overall, 36% (N = 446) of observational studies were based fully or partially on primary data collection. Of 757 observational studies based on secondary use of data alone, 282 (37%) were MDS. Drug utilisation was significantly more common as a study scope in MDS compared to non‐MDS studies. The overall percentage agreement among collaborating centres that collected the data concerning study variables was highest for study type (93.5%) and lowest for type of secondary data (67.8%). Conclusions Observational studies were the most common type of studies in the EU PAS register, but 30% used primary data, which is more resource‐intensive. Almost half of observational studies using secondary data were MDS. Data recording in the EU PAS register may be improved further, including more widespread availability of study protocols to improve transparency.

All these previously conducted EU PAS register-based studies relied primarily on the data reported within the register, without validation of the collected information or additional information on methodological aspects based on expert review. Another gap in these previous studies is the lack of focus on multiple-database studies (MDS), which are observational studies conducted using more than one source of routinely collected data (e.g., claims databases, electronic health records [EHRs]). MDS are of particular importance because they allow the accrual of a very large cohort of patients, which is especially relevant to paediatric populations and rare diseases, as well as several other populations of special interest. Indeed, the number of MDS has been increasing over the years. Since the EU PAS register is a platform for the mandatory recording of data on observational studies as per EU legislation, identifying and describing such studies and their impact on the landscape of observational research is of great value, as this has never been done to our knowledge. 7 Given how quickly observational research is growing, the aim of the present descriptive study was to conduct an updated and detailed review of the studies that were registered in the EU PAS register from its inception until the end of 2018, providing additional information on data type (e.g., distinguishing between claims data, EHRs, etc.) and study design (e.g., distinguishing between descriptive studies, cohort studies, case-control studies, etc.). Another aim of this study was to assess the inter-rater reliability of the data collection.

| Data collection
A dataset containing all studies found in the EU PAS register from its inception to 31st December 2018 was provided by the EMA. The EU PAS register is publicly accessible online (http://www.encepp.eu/encepp/studySearch.htm). The data collection was carried out independently by 34 investigators from 13 academic centres or contract research organisations that are part of ENCePP, based on common and detailed instructions and using the same electronic case report form (Figure 1).
The resulting dataset contained data from the EU PAS register concerning different aspects of study transparency (ENCePP Seal, protocol and availability of publication), regulatory obligations, methodology, target population, scope and drug under study (chemically synthesised vs. biological drugs, orphan drugs). EU PAS register data was supplemented by retrieving data from study protocols or publications, if available, on source of funding, whether a study was an MDS (defined as a study using more than one source of already existing data that could not be linked at patient level), study design and use of a reference drug for formal comparison (if any). The full protocol for data collection, including the fields provided by the EMA, is provided in Appendix S1A. Once all the data was collected, it was harmonised based on pre-defined criteria.
To evaluate how consistent data collection was, data retrieved by investigators from one centre was checked by investigators from another centre for a 15% random sample of studies, using the same protocol used for the main data collection; each investigator validated five studies. Any disagreements were resolved through the intervention of a third centre. This was done for the following key variables: setting, study type (new classification), study design, type of secondary data used, whether the study was an MDS, use of a reference drug for formal comparison, drug type and whether the study drug was an orphan drug. After data collection was completed, an automated quality check of data entry was conducted for the following fields: study type, data collection method, type of secondary data (if applicable), whether the study was an MDS and study design. In brief, all investigators were asked to collect de novo data concerning these fields while blinded to the previous assessment done by investigators belonging to a different centre.
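The selection of a 15% validation sample can be illustrated with a minimal sketch. The study identifiers, the fixed seed and the rounding rule are assumptions made for illustration only; the source does not describe how the random sample was actually drawn.

```python
import math
import random


def draw_validation_sample(study_ids, fraction=0.15, seed=2018):
    """Return a reproducible simple random sample of study IDs for re-review."""
    rng = random.Random(seed)  # fixed seed (assumption) so the draw can be repeated
    n = math.ceil(fraction * len(study_ids))  # round up (assumption)
    return sorted(rng.sample(study_ids, n))


# Hypothetical identifiers; 1426 is the number of studies reported in the abstract.
study_ids = [f"STUDY{i:04d}" for i in range(1, 1427)]
validation_ids = draw_validation_sample(study_ids)
print(len(validation_ids))
```

Fixing the seed means the same studies would be selected if the draw were repeated, which makes the validation step auditable.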

| Statistical analysis
The cumulative frequency of study registration in the EU PAS register was plotted. This was stratified by type of study, data collection and by MDS or non-MDS status specifically. An overview of all studies was provided using descriptive statistics. This was done by stratifying at a high level by type of study (classified as clinical trials, observational studies, systematic review/meta-analysis, questionnaire-based surveys and others).
Finally, odds ratios (OR) and 95% confidence intervals (CIs) were calculated by fitting univariate logistic regression models to investigate whether study-related variables (e.g., study type, data source, etc.) were associated with the use of primary data versus the use of secondary data (reference) and whether they were associated with non-MDS versus MDS (reference).
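For a single binary study characteristic, the OR from a univariate logistic regression equals the cross-product ratio of the corresponding 2x2 table, and a Wald 95% CI is obtained as exp(log OR ± 1.96 × SE). The sketch below illustrates this special case only. The counts are hypothetical, and the use of Wald intervals is an assumption, since the source does not state which interval type was computed.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald CI from a 2x2 table.

    a: outcome present, characteristic present
    b: outcome absent,  characteristic present
    c: outcome present, characteristic absent
    d: outcome absent,  characteristic absent
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical counts (not taken from the study), e.g. primary data collection
# (outcome) by presence of a given study characteristic.
or_estimate, ci_lower, ci_upper = odds_ratio_wald(40, 60, 20, 80)
print(f"OR = {or_estimate:.2f} (95% CI {ci_lower:.2f}-{ci_upper:.2f})")
```

For predictors with more than two categories, or for continuous predictors, a fitted logistic regression model would be needed instead of this closed-form shortcut.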
To better understand the data in the EU PAS register, and to add value to it by including further information related to methodology, a large number of investigators were involved in data collection.
Cohen's kappa statistic was calculated to evaluate the inter-rater reliability (IRR). This was done for single variables and for macro-categories consisting of several variables. Kappa estimates and their 95% CIs were obtained by the bootstrap resampling method to account for heterogeneity between academic centres. The number of bootstrap replicates was set to 100 000.

FIGURE 2 Flowchart of study types in the EU PAS register. 'Others' refers to any study design other than clinical trials, observational studies, systematic reviews or questionnaires. 'Unknown' refers to any study design that was not specified. EU PASs, European post-authorization studies.

The overall percentage of agreement in data categorisation among collaborating centres that collected the data concerning study variables was highest for study type (93.5%) and lowest for type of secondary data (67.8%; Table 2). The low level of agreement for secondary data is expected to have an impact on the overall level of agreement. These results were largely in line with total kappa coefficients. The values of Cohen's kappa and the centre variations in Cohen's kappa for key variables, along with their 95% CIs, are shown in Figures S1B1 and S1B2, respectively.
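As an illustration of the reliability measures discussed here, the following minimal sketch computes Cohen's kappa for two raters and a percentile bootstrap 95% CI. The ratings are hypothetical, and the resampling is a simple study-level bootstrap with 2000 replicates; it does not reproduce the 100 000 replicates or the handling of between-centre heterogeneity described in the text.

```python
import math
import random
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters classifying the same items."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("ratings must be non-empty and of equal length")
    observed = sum(x == y for x, y in zip(ratings_a, ratings_b)) / n
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    # Agreement expected by chance, from each rater's marginal frequencies.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1 - expected)


def bootstrap_kappa_ci(ratings_a, ratings_b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for kappa, resampling items with replacement."""
    rng = random.Random(seed)
    n = len(ratings_a)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        k = cohen_kappa([ratings_a[i] for i in idx], [ratings_b[i] for i in idx])
        if not math.isnan(k):  # skip replicates where kappa is undefined
            estimates.append(k)
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[int((1 - alpha / 2) * len(estimates)) - 1]
    return lower, upper


# Hypothetical study-type classifications by two centres for ten studies.
first_centre = ["observational"] * 7 + ["clinical trial"] * 3
second_centre = ["observational"] * 6 + ["clinical trial"] * 4
kappa = cohen_kappa(first_centre, second_centre)
kappa_lower, kappa_upper = bootstrap_kappa_ci(first_centre, second_centre)
print(f"kappa = {kappa:.2f} (95% CI {kappa_lower:.2f} to {kappa_upper:.2f})")
```

In this toy example the raters agree on nine of ten studies (90% agreement), but kappa is lower because part of that agreement is expected by chance, which is why percentage agreement and kappa are reported side by side.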
Compared to studies based on the secondary use of data, those based on primary data collection were less likely to have a protocol deposited, to be funded by public entities and to use a reference drug for formal comparison, while they were more likely to be funded by pharmaceutical companies (Table 3). In general, there was no substantial difference in study design between studies based on primary data collection and those based on secondary use of data, although descriptive studies were slightly more common in the former.
Only a third of all claims- and EHR-based observational studies were requested by a regulator (Table 4). Claims data were more commonly used for risk assessment than EHRs (60.8% vs. 40.7%).

| DISCUSSION
The present study provides a detailed description of all studies registered in the EU PAS register from its inception until the end of 2018, with a focus on various aspects of study design and on multiple-database studies specifically. As expected, as compared to the most recent review of the EU PAS register available, 4

These findings are thought-provoking and suggest that data in the EU PAS register was not always clear and complete. Indeed, the quality of data entry in the EU PAS register is not monitored at all, unlike on other similar platforms such as ClinicalTrials.gov. In the latter, National Library of Medicine (NLM) staff review registered study records for obvious errors or inconsistencies, with important issues being communicated directly to the investigators. 11 However, this does not guarantee complete and accurate records for all studies. 12 In the EU PAS register, it is entirely up to the investigator to conduct data entry correctly and accurately, as no quality control is conducted.
Only two duplicate studies were identified, which may indicate that, in this respect, the register's data is of satisfactory quality.
high transparency. Orphan drugs were not commonly investigated using MDS, although such studies may have considerable potential in the rare disease field, given the small population sizes expected and the increasing need to merge data from different sources to speed up the development of these medicines.
Our analysis has several strengths compared with previous studies, as it provides a detailed overview of the studies registered in the EU PAS register, with systematic data collection being conducted by a network of established centres of excellence in pharmacoepidemiology. It is important to underline that data collection was carried out by researchers independently, based on common and very detailed instructions. These instructions were shared, agreed upon and made available to all reviewers to avoid potential errors in classification. As a further check, blinded quality control of data entry was conducted for some fields to assess overall agreement. In addition, this is the first time that an evaluation of the MDS landscape has been attempted.
However, this study also has some limitations. The most important limitation concerns the accuracy of the data collection, which was sometimes limited by the lack of information recorded in the EU PAS register or the lack of clarity of such information. The lack of data and ambiguities in the classification of some studies may limit the robustness of our analyses. In these instances, the judgement on categorising the characteristics of a study was somewhat subjective. This is a limitation inherent to the way that data is collected in the EU PAS register and to the lack of quality control of such data. Furthermore, since the registration of a study is voluntary, the EU PAS register is not representative of the whole landscape of studies in Europe and some studies may not have been captured. Another limitation is that the data was collected only until 2018. However, we hold that the present study still has the added value of providing an updated overview of the EU PAS register compared to the most recently published reviews, which described studies until 2016. 4

| CONCLUSION
Observational studies were the most common type of studies in the EU PAS register, but almost one third of observational studies used primary data, which is more resource-intensive.