Process trumps potential public good: better vaccine safety through linked cross‐jurisdictional immunisation data in Australia

Objective: To provide insights into complexities of seeking access to state and federal cross‐jurisdictional data for linkage with the Australian Childhood Immunisation Register (ACIR). We provide recommendations for improving access and receipt of linked datasets involving Australian Government‐administered data.

The benefits of vaccines are globally acknowledged. Nevertheless, periodic concerns emerge regarding adverse events following immunisation (AEFI). Newly developed vaccines have incomplete safety profiles at the time of licensure because of limited participant enrolment in clinical trials and short duration of safety surveillance. Trials usually omit the vulnerable populations targeted by government vaccination programmes, including infants, pregnant women and the elderly. Common and acute reactions are readily identified, while rare and delayed AEFI may be missed without further assessment.
Without thorough assessment of all AEFI, appropriate government, regulatory and manufacturer action cannot be taken. Community confidence in immunisation benefits may waver, resulting in reduced vaccine coverage, as became apparent for human papillomavirus (HPV) vaccination in Denmark 1,2 and Japan; 3 and disease resurgence, as occurred with the measles-mumps-rubella vaccine in the United Kingdom. 4,5 To detect AEFI and mitigate the impact of any suspected concerns, the World Health Organization advocates that all countries implement a post-licensure vaccine safety surveillance system. 6 One recommended approach is a passive surveillance system (PSS), relying on reports submitted to regulatory agencies from health professionals, industry and community. One PSS aim is 'signal detection', and this is undertaken by the Therapeutic Goods Administration in Australia. Collated AEFI reports are examined for patterns involving specific vaccines or groups of vaccines that may then need further investigation. However, as reports are generally non-mandatory, considerable under-reporting exists and is coupled with difficulty in determining numbers of administered vaccines. Signal evaluation is compromised, impeding assessment of a causal association between a vaccine and adverse event. Data linkage, or the matching and joining of records from administrative datasets, has been extensively used internationally as a means of safety assessment. In the United States and Scandinavian countries, analysis of linked data has been used to both identify and refute associations between specific vaccines and adverse outcomes. 7-10 A whole-of-population linked dataset enhances the scope, representativeness and population size for epidemiological assessments of vaccine safety.
Australia is well placed to employ data linkage for safety assessment of vaccines. Since 1996, routine childhood immunisations have been captured on the Australian Childhood Immunisation Register (ACIR), which was extended to all ages as the Australian Immunisation Register (AIR) in 2016. Adolescent human papillomavirus (HPV) immunisations have been recorded on the HPV Register since 2007, with data also integrated with AIR since late 2018. These registries have potential to be linked with other administrative data collections, such as jurisdictional hospital datasets and the National Death Index (NDI), to enable signal evaluation. However, application of linked administrative datasets for vaccine safety monitoring in Australia has been limited to a single investigation, the South Australian Vaccine Safety (SAVeS) study. 11
In late 2008, a team of investigators embarked on the Australian Research Council-funded study VALiD. The study objective was to investigate the acceptability and feasibility of linking Australian Government (hereafter AusGov) and jurisdictional data collections to evaluate the safety of vaccines. Two policy developments preceded project commencement. Firstly, in 2006, the National Collaborative Research Infrastructure Strategy (NCRIS) identified population health and data linkage as a key priority for research investment. Enhanced data linkage capability was anticipated to expedite epidemiological research, leading to improvements in clinical practice and delivery of health and social services. 12 Secondly, two new vaccines were included on the National Immunisation Program (NIP) in 2007: a second-generation rotavirus vaccine (RV) protecting against diarrhoeal disease in children, and the HPV vaccine against cervical cancer. The nascent safety profiles of these vaccines warranted post-licensure surveillance, particularly since first-generation RVs were withdrawn in the US following identification of the vaccine's increased risk of intussusception (bowel obstruction). 13-15 Safety data for the HPV vaccine were also limited due to its recent introduction, with Australia the first country to include the vaccine in a funded national schedule.
This paper describes the complexities encountered in accessing cross-jurisdictional data for linkage with the ACIR to establish a national linked dataset for vaccine safety evaluation. We suggest a series of recommendations for improving access and timely delivery of linked datasets to enhance safety surveillance.

Data sources
Eleven data sources from two federal and five jurisdictional agencies were identified for linkage in the VALiD study (Table 1). Linkage occurred in stages due to delays in approval for release of jurisdictional data. Two project datasets were created with the ACIR as the primary data source: 1) a national linkage with death registration records from the NDI (1999-2010); and 2) a cross-jurisdictional linkage with hospital inpatient and emergency department (ED) attendance records (2003-2013) from four of five jurisdictions: South Australia, New South Wales, Victoria and Queensland. Western Australia (WA) withdrew in the final stages of approval (December 2015), citing legislative restrictions preventing release of hospital data to external agencies. Established in 1996, the ACIR maintains records of all immunisations (specified by the National Immunisation Schedule) administered to children up to 7 years of age. This includes up to 21 separate vaccinations, protecting against 14 diseases, which are administered at six intervals during childhood. 16 The ACIR records child name, address, demographic information, aboriginality, immunisation history (including vaccine administered, date of immunisation, dose and batch number) and provider contact information.

Linkage methodology - the model we proposed
Initially, we proposed linking datasets by replicating the methodology implemented for the SAVeS project, 11 18 this policy stated linkage of AusGov datasets could only be undertaken by an approved agency or 'integrating authority'. At the time of application, two integrating authorities had received accreditation for linking AusGov data: the Australian Bureau of Statistics and the Australian Institute of Health and Welfare (AIHW), with the latter subsequently assigned as the linkage agency for the VALiD project.

Linkage methodology - the model prescribed by the Australian Government
The best practice protocol for data linkage stipulates that person identifiers (name, address, demographic) used for matching information between administrative datasets should be kept separate from those relating to an individual's health information (e.g. immunisations, deaths, hospitalisations). 19 This 'separation principle' ensures individuals' confidentiality and privacy are protected, with linkage staff having no access to health information components. Similarly, researchers never receive access to person identifiers, preventing potential matching of health information to specific individuals.
Following matching of identifiers, a national linkage key (NLK) was generated and sent to data custodians to attach health information. The NLK comprised two identification numbers: a new Person ID and an encrypted Local Record ID assigned by the data custodian, with the latter removed following attachment of health data.
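The separation principle and linkage key described above can be illustrated with a minimal Python sketch. It is not the AIHW's or any linkage unit's actual procedure: the function names, the synthetic records, the exact-match rule on name and date of birth, and the use of a keyed hash as a stand-in for the encrypted Local Record ID are all assumptions made for illustration. Real linkage typically relies on probabilistic matching and custodian-specific encryption.

```python
import hashlib
import hmac
from itertools import count


def pseudonymise(local_id: str, secret: bytes) -> str:
    """Stand-in for the custodian's encrypted Local Record ID (assumption: keyed hash)."""
    return hmac.new(secret, local_id.encode(), hashlib.sha256).hexdigest()[:16]


def split_extract(records, secret):
    """Custodian step: separate person identifiers from health content."""
    identifiers, content = [], {}
    for rec in records:
        token = pseudonymise(rec["local_id"], secret)
        identifiers.append({"token": token, "name": rec["name"], "dob": rec["dob"]})
        content[token] = rec["health"]
    return identifiers, content


def build_linkage_keys(identifier_extracts):
    """Linkage unit step: sees identifiers only, never health content.

    Returns, per source, a list of (Person ID, encrypted local ID) pairs.
    Matching here is exact on normalised name and date of birth (a simplification).
    """
    person_ids, next_id, keys = {}, count(1), {}
    for source, rows in identifier_extracts.items():
        keys[source] = []
        for row in rows:
            match_key = (row["name"].strip().lower(), row["dob"])
            if match_key not in person_ids:
                person_ids[match_key] = f"P{next(next_id):06d}"
            keys[source].append((person_ids[match_key], row["token"]))
    return keys


def attach_health(keys, content):
    """Custodian step: attach health data to the Person ID and drop the local ID."""
    return [{"person_id": pid, **content[token]} for pid, token in keys]


# Synthetic records, invented purely for illustration.
acir = [{"local_id": "A1", "name": "Jane Doe", "dob": "2009-03-01",
         "health": {"vaccine": "RV", "dose": 1}}]
hosp = [{"local_id": "H9", "name": "jane doe ", "dob": "2009-03-01",
         "health": {"diagnosis": "intussusception"}},
        {"local_id": "H10", "name": "John Roe", "dob": "2008-07-15",
         "health": {"diagnosis": "asthma"}}]

acir_ids, acir_content = split_extract(acir, b"acir-secret")
hosp_ids, hosp_content = split_extract(hosp, b"hosp-secret")
keys = build_linkage_keys({"acir": acir_ids, "hosp": hosp_ids})

# Researchers receive only these de-identified extracts, joined on person_id.
acir_out = attach_health(keys["acir"], acir_content)
hosp_out = attach_health(keys["hosp"], hosp_content)
```

In this sketch the linkage step never touches the `health` fields, and the researcher-facing extracts carry only the Person ID plus health content, which is the division of access the separation principle is intended to guarantee.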
The AIHW implemented a variation to best practice linking methodology in handling AusGov data, receiving both identifier and health data extracts. Within the AIHW, partitioning of extracts was achieved by having separate data processing domains for receipt of identifier information and health data. Following linkage, integrated data were sent to a secure, remote access computing facility known as 'SURE' (Secure Unified Research Environment), hosted by the Sax Institute in NSW. Figure 1 outlines the process, scope of approvals and agreements required before data linkage could proceed. Extensive negotiation was required between 18 different agencies to obtain 21 separate authorisations and 12 ethics approvals. Approvals were needed from four domains: Australian Government agencies, state-based authorities, ancillary organisations comprising linkage and data curating facilities, and human research ethics committees (HRECs).

Process and scope of approvals for data access
Figure 1: Overview of process and range of approvals/agreements required for securing release of cross-jurisdictional and Commonwealth data for linkage.

The process commenced with developing a study protocol and initial ethics submission to the University HREC. While the submission was under review, approval for data release was first sought from the AusGov Department of Health. Release of ACIR immunisation data for linkage and analysis required three tiers of approval: legislative, policy authority and administrative. Legislative approval involved review of the relevant Commonwealth statute (Health Insurance Act 1973 (Cth)) regulating the function and management of ACIR. While disclosure of immunisation information was permitted in a restricted range of circumstances, including research, only non-identifiable information could be released. Non-identifiable data, however, are impracticable for data linkage activities that require person identifiers for dataset matching. Consequently, approval in the form of a Public Interest Certificate (PIC) signed by the data custodian, or their delegate, was necessary to provide legislative exemption to data release protections.
Before delegate approval could be granted, policy authority was required from the relevant AusGov section with procedural responsibility for the ACIR dataset. During this process, overall project objectives and proposed methodology were reviewed by the Immunisation Branch together with an assessment of the risks of data release weighed against the potential public benefit of data provision. Procedural review for the VALiD project progressed through four levels of governance to achieve policy authorisation. In-principle approval for release of ACIR data for the VALiD project was granted in February 2011, some 2.5 years after project commencement. A further 14 months ensued before the PIC, authorising release of the immunisation register data, was conferred in April 2012. Our ARC competitive grant ended in 2010.
The final AusGov authorisation before data release to AIHW involved approval from the Commonwealth Department of Human Services (DHS). This Department had administrative responsibility for the dataset and was ultimately responsible for preparing and sending the data extract to the assigned integrating authority. This required a further application to the Medicare External Requests Review Committee and, while timely approval was secured, a further year ensued before data were extracted and transferred to the AIHW in May 2013. Additional requirements supplementary to the Commonwealth process included completion of Confidentiality Deed Polls for those investigators accessing data.
In summary, provision of AusGov approval took almost four years from the initial request for data access in August 2008, with a further year before ACIR data were transferred to the AIHW for the first linkage. An additional 14 months was required for linkage and provision of data, due to re-issue of ACIR data following identification of data integrity issues related to same-individual duplications in person identifier number. This process was completed in August 2014, six years after our initial consultation with the AusGov Department of Health.
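The elapsed times in this summary can be checked with a short calculation. The sketch below is illustrative only: the milestone labels are paraphrased from the text, each date is assumed to fall on the first of the stated month because the source gives months rather than days, and `months_between` is a hypothetical helper.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end (day of month ignored)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


# Milestones as reported in the text; day-of-month is an assumption.
milestones = [
    ("Initial request for data access", date(2008, 8, 1)),
    ("In-principle approval", date(2011, 2, 1)),
    ("Public Interest Certificate conferred", date(2012, 4, 1)),
    ("ACIR extract transferred to AIHW", date(2013, 5, 1)),
    ("First linkage completed", date(2014, 8, 1)),
]

# Months elapsed between each pair of consecutive milestones.
intervals = [
    (earlier[0], later[0], months_between(earlier[1], later[1]))
    for earlier, later in zip(milestones, milestones[1:])
]
total_months = months_between(milestones[0][1], milestones[-1][1])

for start_label, end_label, months in intervals:
    print(f"{start_label} -> {end_label}: {months} months")
print(f"Total: {total_months} months ({total_months / 12:.1f} years)")
```

Because only month-level dates are available, the computed intervals are approximate and may differ by a month or so from the durations quoted in the text; the overall span of 72 months matches the six years reported.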

Jurisdictional and ethics approvals
The jurisdictional request process was similarly multi-layered and protracted (Table 2). Broadly, this involved agency consultation, identifying required datasets, selecting relevant variables, completion of a data application (and additional forms where necessary), securing data custodian approval and, finally, ethics application. No two jurisdictions were the same in their approach to sanctioning data release. Time to securing data release ranged from nine months in NSW, to more than four years for Victorian data. In NSW, the application process required receipt of the AusGov PIC documentation before an ethics application could be submitted, delaying completion of state approval processes until the PIC was secured.
As well as the process being non-uniform, the extent of documentation varied between jurisdictions, ranging from a single application form (such as Queensland's Public Health Application) to additional documents such as Privacy Forms (NSW) and technical feasibility assessments (NSW, WA, SA). In SA, QLD and WA, executive approvals were also required for trans-border flow of data to the AIHW and/or release of data from hospital area health services. Due to differences in hospital funding arrangements, two application processes for release of hospital data were undertaken in QLD involving the QLD Health Statistics Unit and Mater Health Services (ED data only from former Mater Children's Hospital) with the latter requiring considerable documentation.
Multiple ethics approvals were also required. The significant delay in acquiring approval for data release from the ACIR delayed submissions for release of state hospital data, effectively separating our research objectives into two projects and two separate linkage processes. Submissions included two Commonwealth, six state, two institutional and two linkage agency applications. This resulted in 12 (rather than nine) separate ethics submissions seeking approval of ACIR linkage with the NDI, and then hospital datasets. Three jurisdictions also required additional submissions to state-based Aboriginal Health Ethics Committees for approval relating to the release of an Indigenous identifier.

Other approvals and agreements
Other approvals and agreements included a (data) Risk Assessment completed by the AIHW and submitted to Commonwealth DHS. Researcher and institutional agreements were also needed for the linkage agency's data storage facility (SURE). Confidentiality agreements with states were also required.

Data linkage methodology
Three variations of the data linkage model were implemented across agencies contributing data. Model 1 involving NSW, QLD Health and SA complied strictly with the best practice protocol. 19

Discussion
This paper describes the complex and convoluted application, approval and operational processes to link cross-jurisdictional and AusGov data to establish a national linked dataset for vaccine safety signal evaluation. The cost (direct and personnel) and time to secure approvals for data linkage in Australia is clearly impractical for routine or even periodic safety monitoring of vaccines. This is concerning, given the importance of the public regard for safe vaccines. [20][21][22][23] In earlier studies, the investigators identified considerable parental support (94%) for linking their child's vaccination and hospital records for vaccine safety surveillance, with high confidence (84%) in identity protections. 23 Parents also emphasised the public benefits of generating knowledge on potential harms, which prevailed over concerns regarding permissions for data access. 24 Our experience demonstrates the disconnect between public attitudes and the reality of what can be achieved with administrative data in Australia. While data protection is important, it needs to be balanced against the significant public health advantages arising from linking data and adding value to an existing and government-funded resource: the Australian immunisation registers.

Key findings
Investment in Australia's data linkage infrastructure has expanded availability of administrative data for health research. 25 However, practical access to these data is limited by considerable administrative burdens and delays for researchers. The experience of the VALiD investigators in establishing a national integrated dataset for safety assessment of vaccines identified three key areas of complexity in securing data for cross-jurisdictional projects involving Commonwealth and state data. These relate to: 1) distributed dataset access; 2) variable and frequently non-transparent application processes; and 3) lengthy approval times involving multiple tiers of authorisation (Table 3). While detailed as separate challenges, these issues are all entwined, and arise from the distributed operational responsibility for healthcare in Australia.
Hospitalisations, for example, are administered by jurisdictions, while federal programs such as the National Immunisation Program and the Pharmaceutical Benefits Scheme are administered by AusGov agencies. Researchers need to negotiate approvals with individual jurisdictional and national agencies to obtain authorisation for data release and/or use. Complex and lengthy application processes ensue, progressing through multiple tiers of authorisation for data release approvals. Further obstacles arise through the non-uniform application process, with no two jurisdictions alike in their requirements, leading to differences in application stages (1-4), application documents (1-16) and time to approval for data release, ranging from nine months to six years. Complicating the negotiations were variations in willingness to release specific health data variables, particularly those deemed sensitive and with potential for reidentification of individuals, e.g. postcode.
Transparency in the application processes also varied. At the Commonwealth level, the application process was opaque and lacked coordination between AusGov agencies, with limited information provided by agencies on sequence and timing of government approvals. The delay in securing AusGov authorisations also delayed approvals from state agencies. Applications could not be commenced or progressed in one jurisdiction without receipt of the PIC approving release of data from ACIR, the primary data source. One further concerning feature was the decision not to release hospital data in one jurisdiction (WA) at the final approval stage involving WA formal data custodian endorsement. This happened despite our investigators completing extensive application documentation and securing six individual (ethics and governance) approvals.
Collectively, these challenges create substantial time and resource commitments for researchers brokering approvals to achieve data release. Furthermore, as data were no longer current by the time of data release, revised submissions became necessary to obtain more recent data. The challenges encountered in seeking to link cross-jurisdictional data are not isolated to the current study, with parallels also seen in a two-state immunisation effectiveness study and a national injury surveillance study. [26][27][28] The difficulties encountered raise significant ethical concerns about wasting public research funds and serious concerns regarding the feasibility of undertaking research of significant public benefit due to uncoordinated and disparate governance structures. The 2017 Productivity Commission Data Availability and Use Inquiry Report 29 ('PC Report') further emphasises that delays in data access impact on researchers' ability to provide real-world evidence on topics of concern.

Recommendations
A simpler, more transparent application and approvals model for release and linkage of Commonwealth and jurisdictional data is urgently needed to ensure researcher efforts and resources are directed to investigating study objectives rather than negotiating the approval pathway. A uniform model of managing approvals would also reduce the burden experienced by data custodians when assessing and approving dataset requests. Four principal recommendations are proposed to simplify data access and reduce time to data approval and release. These are to:
1. Prioritise access to high-utility data collections, e.g. jurisdictional hospital separations/ED presentations, national registries, Pharmaceutical Benefits Scheme (PBS), Medicare Benefits Schedule (MBS).
2. Establish multiple portals (state-specific or institutional organisations) for managing approvals and data release of high-utility data.
3. Simplify data and ethics applications to a single application process supported by a uniform template for assessing risks of data release.
4. Implement a web-based electronic system with facility for monitoring progress of data requests, approvals, data extract preparation, linkage and data release.
Implementation of these recommendations would benefit researchers and assist data custodians in streamlining processes for data access. Situating Australian Government high-value datasets across multiple approved agencies with a uniform application template would: 1) clearly identify a nominated agency(ies) for data release approvals, thereby reducing burden (and waiting times) related to repeated requests for AusGov datasets with Commonwealth Departments; 2) resolve recursive legislative and policy reviews through a standing agreement with AusGov agencies pre-specifying conditions of data release; 3) reduce the cost of data extracts; 4) establish data dictionaries consistent with minimum dataset templates; and thereby 5) provide a uniform format for release of data variables from cross-jurisdictional and national data holdings.

Analytic environments
Allied with our recommendations is an appeal for expanding the options for secure analytic environments used to manage and access integrated data. Researchers are currently constrained by computing and analysis environments pre-determined by data custodial agencies. These environments are too inflexible to meet the evolving analytical requirements of researchers. As research questions become more complex and the scope of datasets expands, computing infrastructure in the existing environment will be insufficient for enhanced processing of multiple extremely large datasets and running complex statistical models. As suggested in one submission 30 to the PC Inquiry, institutions could provide a more flexible analysis facility for trusted users working with linked data. Through an accreditation mechanism, institutions would need to demonstrate that appropriate security access measures and auditing of file access and transfer are in place. Allowing accredited institutions to establish secure computing environments employing cloud-based storage software, rather than requiring physical servers, 31 would also defray some of the significant (and increasing) costs related to accessing linked data.

Reforming data sharing and release
The propositions outlined above align with recommendations detailed in the PC Report 29 and proposed Australian Government reforms 32 responding to the PC Report. Reforms would be legislated through the new Data Sharing and Release Bill. 33 The Bill aims to increase authorised sharing and release of Australian Government-held data while improving data safeguards and risk management tools to create a more transparent environment for data sharing. 33 Planned reforms for improving data sharing and release arrangements include identifying high-value data collections, creating Accredited Data Authorities to facilitate data provision, and implementing a trusted user framework for assessing data requests. 32 High-value data collections with potential for delivering population benefits would be designated as National Interest Datasets and given priority access.
Accredited Data Authorities (ADAs) building on the current Integrating Authorities model would expand the network of agencies with facility for linking, sharing or releasing datasets. These Accredited Authorities would act as intermediaries between data custodians and users to facilitate data availability, 29,32 including national datasets identified as high-value. Feasibly, ADAs could be assigned to existing state-based linkage units but also serve as an opportunity for other non-government agencies to expand their role to data provision. Increasing the number of agencies available to distribute high-value data on behalf of data custodians would considerably reduce the bottleneck associated with obtaining approvals, access and release of data currently experienced with the AIHW.

Table 3: Challenges and corresponding recommendations.
Challenge: Distributed dataset access. Recommendation: Locate high-utility data collections in designated repositories through a memorandum of understanding or standing agreement involving state and Australian Government departments.
Challenge: Variable and sometimes non-transparent application processes across jurisdictions. Recommendation: Single application process situated at repository with one uniform application form.
Challenge: Protracted approval for data release arising from multiple tiers of authorisation with transparency lacking on time to approval and data release. Recommendation: Electronic system integrating data application process and monitoring progress of approvals, data preparation, linkage and release.

© 2019 The Authors
To increase transparency of approvals and streamline application processes, data access would be assessed by ADAs applying the 'trusted user' model, based on the Five Safes Framework. 32 Originally intended for identified data with scope broadened to include deidentified data, the Framework comprises five principles or dimensions of data access to inform a process of safe data release. 34 These are: Safe projects (Is the purpose of use appropriate? What analysis is being done?); Safe people (Can the researchers be trusted?); Safe data (Can the data disclose identity?); Safe settings (Does the access environment prevent unauthorised use?); and Safe outputs (Are the statistical results non-disclosive?). 34,35 The Five Safes provide a clearly articulated approach for assessing data requests; noting, however, that the five principles already underpin specifications outlined in current data and ethics applications. If uniformly applied by data custodians as a singular 'template', the framework would provide a transparent and simplified mechanism for managing data applications and also ethical review of linked data requests. In turn, the template would reduce burden of application review for data custodians and ethics committees as well as duplication of applications submitted by researchers.
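One way to picture the Five Safes applied as a single uniform template is as a shared checklist that every data request is assessed against. The Python sketch below is a hypothetical illustration, not an existing system: the `DataRequest` class, its methods and the example request are invented here, and each dimension is paired with its question as the Five Safes are usually formulated.

```python
from dataclasses import dataclass, field

# The five dimensions and the question each one asks of a data request.
FIVE_SAFES = {
    "projects": "Is the purpose of use appropriate?",
    "people": "Can the researchers be trusted?",
    "data": "Can the data disclose identity?",
    "settings": "Does the access environment prevent unauthorised use?",
    "outputs": "Are the statistical results non-disclosive?",
}


@dataclass
class DataRequest:
    """A data request assessed against one shared template (hypothetical)."""
    title: str
    # Maps a dimension name to True once a reviewer judges it satisfied.
    assessments: dict = field(default_factory=dict)

    def assess(self, dimension: str, satisfied: bool) -> None:
        if dimension not in FIVE_SAFES:
            raise KeyError(f"Unknown dimension: {dimension}")
        self.assessments[dimension] = satisfied

    def outstanding(self) -> list:
        """Dimensions not yet assessed, or assessed as not satisfied."""
        return [d for d in FIVE_SAFES if not self.assessments.get(d, False)]

    def approved(self) -> bool:
        return not self.outstanding()


# Invented example request, for illustration only.
request = DataRequest("Immunisation register and hospital data linkage")
request.assess("projects", True)
request.assess("people", True)
request.assess("data", True)
request.assess("settings", False)
```

Here `request.outstanding()` lists the dimensions still blocking release, so custodians and ethics committees reviewing the same request would see the same outstanding items rather than each applying separate forms.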
The trusted user data model has been adopted by the Australian Bureau of Statistics 36 and the AIHW, with some states (Victoria and SA) also implementing this model for sharing data across government agencies. 35 Trusted users would be aligned with institutions (e.g. government agencies, universities) with existing arrangements for managing improper data use. Arrangements include agreeing to legal undertakings specifying protections for data use and having necessary computing infrastructure for storing data securely. 37

Monitoring data requests, approvals and release
Regarding our recommendation for a web-based monitoring system, some elements such as progress in data requests and approvals are considered in the online application established by the Population Health Research Network (PHRN). 38 The online application aims to harmonise application and approval processes for cross-jurisdictional and multi-jurisdictional research projects.
Useful for complex linkages, the online process provides an opportunity for dataset representatives to collectively raise issues about the application rather than requiring separate discussions with individual linkage agencies. However, these researchers remain concerned that the PHRN adds yet another layer of application to an already convoluted application process. Despite the central application point, there remains a requirement to complete ancillary approval documentation for all jurisdictions (with the exception of SA and TAS), rather than consolidating data access requests into a single uniform application. 38

Programmatic linkages
Further considerations related to data access and data timeliness include establishing enduring (rather than ad hoc) linkages. The now-named Australian Immunisation Register (AIR) offers clear population benefits for evaluating the impact of immunisation. Linked with other datasets, the AIR could be established as a routine and enduring programmatic linkage for monitoring the effectiveness and safety of vaccines. An immunisation programmatic linkage could contribute to a national system of other programmatic linkages involving National Interest Datasets. Programmatic linkages could be managed by ADAs located across and between government, non-government and academic sectors. 29,32 Inter-sectoral programmatic linkages would also remove the redundancy of repeated linkages of datasets. More critically, they would enable rapid and real-time safety surveillance to be conducted for vaccines, particularly seasonal or new vaccines introduced on the NIP.

Implications for public health
The delay and complexity in accessing linked data for vaccine safety surveillance has both local and global public health implications. Firstly, it results in an inability to undertake timely epidemiological reviews to examine associations between reported serious events and a vaccine. This is particularly critical for newly licensed vaccines. Surveillance of AEFI and response to safety concerns provide the community with assurances that vaccines are appropriately regulated by the Australian Government. This, in turn, engenders public confidence and contributes to the success of vaccination programs through ongoing participation. Other work by the project researchers has established community preference for employing linked data for public benefit, including vaccine safety monitoring, over and above preferences for individual consent and privacy concerns relating to the use of identifiable data. [22][23][24] Secondly, countries differ in the selection and timing of vaccine administration. As vaccine trials are usually conducted outside Australia, information on possible safety issues derived from these trials may lack comparability with the Australian population. Routine surveillance following vaccine administration is therefore important for providing local knowledge on AEFI, including their incidence, occurrence and type. Thirdly, the World Health Organization advocates surveillance for AEFI as part of global efforts to enhance safety information across diverse populations, particularly for rare outcomes. Well-resourced countries like Australia, with strict regulatory controls on therapeutic products and a reputation for early adoption of new vaccines, have a role to play in these efforts by providing safety data on routinely administered vaccines.

Conclusion
Integration of immunisation registers with other data collections is achievable in Australia but remains infeasible for routine and rapid identification of vaccine safety concerns. Multiple lengthy authorisation requirements, convoluted application processes and inconsistencies in data supplied all contribute to delayed data availability. Prioritising access to national and jurisdictional datasets of high value, a single application process with transparent assessment of data requests and an electronic system for monitoring progress of approvals and data release would expedite data access. This would lead to a surveillance system that is rapid and responsive in monitoring vaccine safety concerns. Furthermore, data would be available for external parties to provide a measure of accountability for policy decisions, independent of government assessment.