Shortening and validation of the Patient Engagement In Research Scale (PEIRS) for measuring meaningful patient and family caregiver engagement

Abstract Objective To shorten the Patient Engagement In Research Scale (PEIRS) to its most essential items and evaluate its measurement properties for assessing the degree of patients’ and family caregivers’ meaningful engagement as partners in research projects. Methods A prospective cross‐sectional web‐based survey in Canada and the USA, and also paper‐based in Canada. Participants were patients or family caregivers who had engaged in research projects within the last 3 years, were ≥17 years old, and communicated in English. Extensive psychometric analyses were conducted. Results 119 participants: 99 from Canada, 74 female, 51 aged 17‐35 years and 50 aged 36‐65 years, 60 had post‐secondary education, and 74 were Caucasian/white. The original 37‐item PEIRS was shortened to 22 items (PEIRS‐22), mainly because of low inter‐item correlations. PEIRS‐22 had a single dominant construct that accounted for 55% of explained variance. Analysis of PEIRS‐22 scores revealed the following: (1) acceptable floor and ceiling effects (<15%), (2) internal consistency (ordinal alpha = 0.96), (3) structural validity by fit to a Rasch measurement model, (4) construct validity by moderate correlations with the Public and Patient Engagement Evaluation Tool, (5) good test‐retest reliability (ICC2,1 = 0.86) and (6) interpretability demonstrated by significant differences among PEIRS‐22 scores across three levels of global meaningful engagement in research. Conclusions The shortened PEIRS is valid and reliable for assessing the degree of meaningful patient and family caregiver engagement in research. It enables standardized assessment of engagement in research across various contexts. Patient or public contribution A researcher‐initiated collaboration, patient partners contributed from study conception to manuscript write‐up.


| INTRODUCTION
Increasingly, patients, family caregivers and the public actively engage with other stakeholders in health research projects in various contexts globally. [1][2][3][4] This engagement is often dynamic and hands-on: for example, co-developing documents, participating in decision-making and providing advice on activities at any and all stages that shape the research process and outcomes. 5 The extent to which they are actively involved in decision-making creates a spectrum of engagement: at the lowest level they are simply informed about research; at the highest, they lead research activities and have primary decision-making authority. 6 Over the last decade, there has been a substantial increase in support for the unique and impactful contribution patients and family caregivers make to improve the relevance, appropriateness and use of research to serve the interests and needs of patients. 2 It is expected, and even mandatory in some circumstances, to include patients and family caregivers in research teams as stakeholders with a personal interest in health research. 2 The Canadian Institutes of Health Research, for example, outlines in its Strategy for Patient-Oriented Research (SPOR) initiative that patients should be engaged in 'active and meaningful collaboration' as partners in the research process. 7 There are currently numerous frameworks, models, best practices and other guidelines to support this goal [8][9][10][11][12]; but if we want to achieve sustainable improvements in how acceptable, feasible, rigorous and relevant research studies are in terms of patients' realities, we need good quality patient engagement. 3 While it is promoted, practiced and studied, there is little quantitative evidence on how patient engagement in research increases the quality of health research to improve health and health care. 13 This could be due in part to a lack of validated measurement tools to determine the quality of patient engagement.
A 2018 systematic review by Boivin and colleagues found that 27 existing patient and public evaluation tools, capturing both qualitative and quantitative data, needed more scientific rigour and patient engagement in their design and write-ups. 14 A 2020 scoping review on the evaluation of patient partnership in research revealed there were no quantitative assessments: all the identified studies used a qualitative approach. 13 Quantitative assessments provide more objective and efficient ways of measuring the quality of engagement, 15 thus enabling researchers to move from generating to testing hypotheses. By using validated quantitative measures of patient engagement in research, we can evaluate the effectiveness of engagement methods and strategies.
This evaluation is vital to improve the quality of partnerships with patients in research projects and across research networks and research initiatives, and provide more generalizable findings that move beyond lessons learned and reflective narratives.
Shortly after Boivin et al's 2018 systematic review was published, 14 the Patient Engagement In Research Scale (PEIRS) was published as the first tool designed to measure the degree of meaningful patient engagement in research on project teams. 16 The PEIRS is based on an empirical conceptual framework enhanced with a literature review, 5 recognized as a promising and important tool for the evaluation of patient and family caregiver engagement in research. 13,[17][18][19][20] The framework outlines the key components of and defines meaningful patient engagement in research as the planned, supported and valued involvement of patients in the research process, which facilitates their contributions and offers a rewarding experience. 5 While the original 37-item PEIRS has undergone face and content validation, most of its measurement properties have not yet been assessed. 16 Furthermore, informal conversations with researchers revealed the length of PEIRS might hinder its implementation. This study sought, therefore, to (1) reduce the respondent burden of the PEIRS by creating a shortened version containing its most essential elements, and (2) evaluate its measurement properties (internal consistency, structural and construct validity, reliability and interpretability) for assessing the degree of meaningful engagement of patients and family caregivers as partners in research projects. 16

| Study design
This survey used two sampling strategies: a web-based survey and a paper-based survey through collaboration with LDH. Eligible individuals were patients or family caregivers who had engaged as partners in research within the last three years, were 18 years or older (≥17 years for the paper survey) and could communicate in English.

The University of British Columbia Behavioural Research Ethics Board approved this study (REB#H15-00217).

| Web-based survey
The web-based survey recruitment, using the Qualtrics survey tool (https://ubc.qualtrics.com), started in Canada in October 2018, extended to the United States in October 2019 and ended in both countries in March 2020. It entailed a multimodal approach involving internet-mediated and traditional methods with a study recruitment poster. 21 An email invitation containing the recruitment poster was sent first to participants from previous studies by the lead author (CBH) and then to selected patient partners, researchers and relevant organizations. The poster was also posted on websites, in newsletters and on social media platforms (including Twitter and Facebook) via the accounts of the research team members, research organizations and networks, community organizations, and research-affiliated patient groups and organizations in Canada and the United States. In addition, we emailed researchers who engaged with patient partners and had (1) cited either of two frameworks on patient engagement by the lead author (CBH) and team, 5,22 (2) published in a special 2019 edition of a Canadian Medical Journal or (3) presented at certain webinars. Finally, participants were asked to share the opportunity with other potential participants.
KEYWORDS: evaluation instruments, family caregiver, patient and public involvement, patient-oriented research, psychometrics, reliability and validity
Potential participants completed an eligibility screening form.
Eligible individuals were emailed a personal survey link and asked to complete the questionnaires within three weeks. Eligible individuals were sent up to two reminders before the deadline and were contacted within a month of the deadline if they did not complete the survey. The web-based survey comprised an informed consent page, demographic questions, the 37-item PEIRS, 16 an item of global meaningful patient engagement in research and the Public and Patient Engagement Evaluation Tool (PPEET) Participant Questionnaire. 17 At the end of the survey, participants were asked to indicate their willingness to complete a second survey within two to seven days for test-retest reliability. Demographic information was collected on age, gender identity, education level, ethnicity/racial identity, household income, province/state of residence, type of patient partner, phase of research, jurisdictional scope of research team and description of the project being reported on. Participants had a 1 in 20 chance of receiving a CAD$75 (or US$50) gift card for participating.

| Paper-based survey
The paper-based survey involved youths who had lived experience of substance use, had attended one of two 1-day youth summit events on opioid interventions and services geared towards substance misuse for at-risk youths and new users in Ontario, Canada, and met similar criteria to those specified in the web-based survey.
The summits were co-designed and co-facilitated by youth and project allies, supported by a 'developmentally informed' youth engagement strategy and not by the PEIRS's conceptual framework. 23 At the end of the summit, the youths who participated were asked to complete the same questions used in the web-based survey.

| Patient Engagement In Research Scale (PEIRS)
The PEIRS is a self-administered 37-item questionnaire completed by patient partners (including family caregiver partners) to determine their degree of meaningful engagement in research as an indicator of the quality of their engagement in a research project. 16 Each item requires respondents to reflect on their experiences as a research partner in a specific project. The PEIRS captures key elements of eight themes from a conceptual framework for meaningful engagement in research. 5,16 These themes align with the seven sections/subscales of the PEIRS: procedural requirements (PR, 14 items), convenience (CN, 4 items), contributions (CT, 4 items), two themes combined as 'team environment and interaction' (T, 5 items), support (SU, 3 items), feel valued (FV, 3 items) and benefits (BE, 4 items). Each item uses a 5-point Likert scale ('strongly agree' to 'strongly disagree'), which we scored from 4 to 0. The PEIRS achieved good content and face validation. 16
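The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the names `LIKERT_SCORES`, `SUBSCALE_ITEMS`, `score_responses` and `total_score` are hypothetical, and summing item scores into a total is an assumption made here for the example, not a scoring instruction taken from the PEIRS authors.

```python
# Illustrative sketch only; all names are hypothetical.
# Item responses are scored 4 ('strongly agree') down to 0 ('strongly disagree').
LIKERT_SCORES = {
    "strongly agree": 4,
    "agree": 3,
    "neutral": 2,
    "disagree": 1,
    "strongly disagree": 0,
}

# Number of items per subscale of the original 37-item PEIRS.
SUBSCALE_ITEMS = {"PR": 14, "CN": 4, "CT": 4, "T": 5, "SU": 3, "FV": 3, "BE": 4}


def score_responses(responses):
    """Convert a list of response labels into numeric item scores."""
    return [LIKERT_SCORES[label.strip().lower()] for label in responses]


def total_score(responses):
    """Sum the item scores (assumption: a simple unweighted sum)."""
    return sum(score_responses(responses))


if __name__ == "__main__":
    print(sum(SUBSCALE_ITEMS.values()))   # number of items across subscales
    print(total_score(["agree"] * 37))    # total for a respondent who agrees with every item
```

Under this assumed sum score, higher totals indicate more favourable (more meaningful) engagement.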

| Global meaningful patient engagement
A single item that reads 'Overall, how meaningful was your experience being a part of the research project?' was used to capture participants' perception of their global meaningful engagement in a research project. The item used a 5-point response scale (5-extremely, 4-very, 3-moderately, 2-slightly and 1-not meaningful) co-designed by our research team, including patient partners, for this project. Responses were reported with 'not' to 'moderately' meaningful grouped as a single category.
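As a small illustration of the grouping described above, the five response options can be collapsed into three levels. The helper names below are hypothetical; the level labels follow those used elsewhere in this paper ('no to moderate', 'very' and 'extreme').

```python
# Illustrative sketch only; function and variable names are hypothetical.
# The global item is coded 5-extremely, 4-very, 3-moderately, 2-slightly, 1-not meaningful.
GLOBAL_LEVELS = {
    1: "no to moderate",
    2: "no to moderate",
    3: "no to moderate",
    4: "very",
    5: "extreme",
}


def engagement_level(response):
    """Map a 1-5 global item response to one of three reporting levels."""
    if response not in GLOBAL_LEVELS:
        raise ValueError(f"response must be an integer from 1 to 5, got {response!r}")
    return GLOBAL_LEVELS[response]


def count_levels(responses):
    """Count how many respondents fall into each of the three levels."""
    counts = {"no to moderate": 0, "very": 0, "extreme": 0}
    for response in responses:
        counts[engagement_level(response)] += 1
    return counts
```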

| Public and Patient Engagement Evaluation Tool (PPEET)
The PPEET, first published in 2015 and updated in 2018, consists of three questionnaires (participant, project and organization questionnaires), each developed to assess the processes, outputs, and perceived impacts of engagement activities in health system organizations. 17,24 The participant questionnaire has two versions: one designed for one-time engagement and the other for ongoing/long-term engagement activities. Because the PPEET is a seminal questionnaire widely used in Canada for evaluating patient engagement in research, we chose it for convergent validation to assess the construct validity of the shortened PEIRS. We used the one-time engagement version because its phrasing in the past tense, rather than the present tense, aligned better with the PEIRS, which was developed for both one-time and ongoing engagement activities. The PPEET has two demographic items, plus 19 experience items (including six open-ended items), divided into four groups. 17 We used 10 closed-ended items that seemed relevant to engagement in research: PP1 to PP3 (items 3 to 5) for 'communication and support of participation', PP4 to PP7 (items 7 to 10) for 'sharing your views and perspectives', PP8 (item 12) for 'impacts and influence of engagement initiative', and PP9 and PP10 (items 17 and 18) for 'final thoughts'. Item 16 was excluded because it needed tailoring for each respondent. We worded the items to capture participants' views on a research 'engagement initiative'. Each item used a 5-point Likert scale, ranging from 1 for 'strongly disagree' to 5 for 'strongly agree'. 17,24 The PPEET has undergone face and content validation but has not had its measurement properties evaluated. No scoring instructions were published for the PPEET.

| Sample size considerations
Guided by the quality criteria from Terwee et al (2007) for the measurement properties of health status questionnaires, 25 the target sample size was at least 100 participants for satisfactory evaluation of internal consistency and 50 participants for test-retest reliability over two to seven days. Within this period, our research team, which includes patient partners, anticipated that respondents' engagement experiences would not change and that previous responses would be forgotten. 25 We aimed for seven participants per item for exploratory factor analysis. 25

| Patient engagement in the current study
This researcher-initiated study was part of an ongoing three-phase research project spanning more than 3 years of collaboration among researchers and four experienced patient partners as research team members. In the previous two phases, the patient partners co-designed the conceptual framework and the PEIRS.

| Internal consistency
Internal consistency describes how unified the items were in measuring meaningful engagement in research. 27 Because data collected using Likert scales are ordinal-level (or categorical) data, internal consistency was evaluated using a polychoric correlation matrix of items. A resulting average inter-item correlation between 0.20 and 0.40 is ideal; lower values mean items capture different constructs and higher values mean they capture a narrow range of the construct. 28 The corrected item-test correlation used a polyserial correlation coefficient, with a criterion of ≥0.4 for retaining items. 29 The ordinal coefficient alpha (criterion: ≥0.70), which is conceptually equivalent to Cronbach's alpha, was calculated using a polychoric correlation matrix. 30 Cronbach's alpha with the same criterion was calculated as a typically reported coefficient. We inspected the inter-item correlation matrix of the PEIRS and removed items whose correlations were too low (<0.30) or too high (>0.80). 31 The reduction step was conducted through an iterative process for refinement of the PEIRS and was informed by internal consistency analysis, the distributions of item responses and team discussions. The expected outcome was a parsimonious set of items that are internally consistent and have minimal respondent burden.
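To make these quantities concrete, the sketch below computes the average inter-item correlation, an alpha coefficient from a correlation matrix, and the item pairs falling outside the 0.30 to 0.80 range. It assumes a polychoric correlation matrix has already been estimated elsewhere and is supplied as a nested list; the function names are hypothetical. Applying the alpha formula to a polychoric matrix is one common way of obtaining ordinal alpha and may differ in detail from the software used in this study.

```python
from itertools import combinations


def average_inter_item_correlation(corr):
    """Mean of the off-diagonal correlations (upper triangle) of a k x k matrix."""
    k = len(corr)
    pairs = [corr[i][j] for i, j in combinations(range(k), 2)]
    return sum(pairs) / len(pairs)


def alpha_from_correlations(corr):
    """Alpha computed from a correlation matrix: k * r_bar / (1 + (k - 1) * r_bar).

    With a polychoric matrix as input, this corresponds to an ordinal alpha
    (assumption: the matrix was estimated beforehand with suitable software).
    """
    k = len(corr)
    r_bar = average_inter_item_correlation(corr)
    return k * r_bar / (1 + (k - 1) * r_bar)


def flag_item_pairs(corr, low=0.30, high=0.80):
    """Return (i, j, r) for item pairs correlated below `low` or above `high`."""
    flagged = []
    for i, j in combinations(range(len(corr)), 2):
        r = corr[i][j]
        if r < low or r > high:
            flagged.append((i, j, r))
    return flagged
```

Flagged pairs are only candidates for removal; as described above, the final decisions also drew on response distributions and team discussions.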

| Structural validity
Once the PEIRS had been refined for adequate internal consistency, we assessed its underlying construct. 27 When a participant had <15% of missing responses for the PEIRS, the item-level mean rounded to the nearest whole number was imputed. 32 A Kaiser-Meyer-Olkin measure for sampling adequacy of 0.93 confirmed a sufficient sample size and data to proceed with factor analysis. 33 Horn's parallel analysis, a more accurate test than the eigenvalues-greater-than-one rule and the scree plot approach, indicated we could extract one factor from the PEIRS data. 34 Exploratory factor analysis with principal axis factoring was then used to determine the items to retain in the factor and thus the questionnaire. 35 Each item retained met an a priori factor loading criterion of ≥0.32. 35
Rasch analysis uses probability estimates to inform evaluation and refinement of questionnaires. 36 Adequate fit to the Rasch measurement model can lead to obtaining interval-level scoring for questionnaires, which is desired for questionnaires' use in comparative effectiveness research. 37 A Rasch analysis was conducted on the retained items for fit to the partial credit version of the polytomous Rasch measurement model, 37,38 as it allows the distances between item thresholds to vary across items. We used Tennant and Conaghan's criteria for evaluating data fit to the Rasch model based on several fit statistics. 39 Overall fit was investigated with three summary statistics: the item-trait interaction chi-square p-value, mean person-fit residual value and item-fit residual value. A non-significant (p > 0.05) chi-square statistic would indicate a fit between the expected and observed structure of the PEIRS data.
Person-fit and item-fit were achieved if (1) standardized (Z-score) fit residuals were within ±2.5 units, (2) the mean residual values [...] Items should have local independence, generally demonstrated by fit residual correlations below 0.3 between items. 39 A measure has unidimensional properties when <5% of participants (estimated by the lower bound of the binomial 95% CI) have a significant t test between groups of negatively and positively loading items based on their fit residuals.
As part of the Rasch analysis, we calculated the person separation index. The person separation index is interpreted similarly to Cronbach's alpha, and values >0.85 would mean the PEIRS is appropriate for the assessment of individual patient partners. 39 Finally, we assessed for differential item functioning (or item bias) to determine whether scores for any item differed by demographic characteristics (age, gender, education and income) when participants had similar overall PEIRS scores. 39,40
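The missing-data rule stated at the start of this section (imputing the rounded item-level mean when a participant had <15% missing responses) can be sketched as follows. This is a minimal illustration under stated assumptions: the 'item-level mean' is taken to be the mean of that item across respondents who answered it, rounding is half-up, and participants at or above the threshold are left unchanged because the text does not say how they were handled. The function name is hypothetical.

```python
def impute_missing(data, max_missing_fraction=0.15):
    """Impute missing item scores (None) with the rounded item-level mean.

    data: list of participant rows, each a list of item scores or None.
    Only rows with a missing fraction below `max_missing_fraction` are imputed.
    Assumes every item has at least one observed response.
    """
    n_items = len(data[0])

    # Item-level means over observed responses, rounded half up to a whole score.
    item_means = []
    for j in range(n_items):
        observed = [row[j] for row in data if row[j] is not None]
        item_means.append(int(sum(observed) / len(observed) + 0.5))

    imputed = []
    for row in data:
        n_missing = sum(1 for value in row if value is None)
        if 0 < n_missing and n_missing / n_items < max_missing_fraction:
            row = [item_means[j] if value is None else value
                   for j, value in enumerate(row)]
        imputed.append(list(row))
    return imputed
```

Half-up rounding is written out explicitly because Python's built-in `round` rounds ties to the nearest even number.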

| Construct validity
There are no reference standards for measuring meaningful patient engagement in research. Polyserial correlation coefficients were used to assess correlations between the refined PEIRS total scores and scores from each of the 10 items of the PPEET. We hypothesized a moderate correlation of ~0.5 for each pair. 41 We explored, using hypotheses of no significant difference (α = 0.05), the relationship between PEIRS scores and demographic variables (gender, age, educational attainment, household income, ethnicity/racial groups) using a non-parametric equivalent to one-way analysis of variance (ANOVA) to determine whether the degree of meaningful patient engagement differed by groups.

| Reliability and measurement error
We evaluated the extent to which repeated administration of the PEIRS to participants with stable experiences provided similar PEIRS scores. Test-retest reliability was calculated using an intraclass correlation two-way random effects model (ICC2,1) with 95% confidence intervals (CIs). A value between 0.75 and 0.90 was interpreted as good reliability and a value above 0.90 was deemed excellent. 42,43 The 95% limits of agreement (LOA) repeatability coefficient provided the bounds of random differences between PEIRS scores that 95% of participants would expect to have after repeated administration of the PEIRS. 44 We calculated the standard error of measurement (SEM) and the minimal detectable change at a 90% confidence level (MDC90) with 95% CIs. 25,45
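The reliability statistics named above can be illustrated with a short sketch. The ICC(2,1) function follows the standard two-way random effects, single-measurement formula based on mean squares. The SEM and MDC90 helpers use commonly cited formulas (SEM = SD × √(1 − ICC) and MDC90 = 1.645 × √2 × SEM); these are assumptions for illustration, since the exact computations used in this study are not spelled out in the text, and confidence intervals are omitted. All function names are hypothetical.

```python
import math


def icc_2_1(scores):
    """ICC(2,1) for an n x k table (rows: participants, columns: administrations)."""
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for participants (rows), administrations (columns) and error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC) (assumed formula; sd is the score standard deviation)."""
    return sd * math.sqrt(1 - icc)


def minimal_detectable_change_90(sem):
    """MDC90 = 1.645 * sqrt(2) * SEM (assumed formula)."""
    return 1.645 * math.sqrt(2) * sem
```

For a test-retest design, `scores` would hold one row per participant with two columns, the first and second administration of the questionnaire.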

| Interpretability
We explored the extent to which qualitative meaning can be assigned to PEIRS scores. 25 We tested the hypothesis that higher (more favourable) PEIRS scores will be associated with higher (more favourable) self-reported levels of engagement as a research partner. The latter was indicated by the global meaningful engagement measure. When the polyserial correlation coefficient was >0.40 between PEIRS and the global engagement measure scores, we analysed three levels of global meaningful engagement ('no to moderate', 'very', and 'extreme') as this allowed for an adequate number of respondents per level. We had estimated a sample size requirement of 52 participants per level for a moderate effect size. 46 When the assumptions of normality were not met, we used the Kruskal-Wallis test to identify any statistically significant difference in PEIRS scores among the three levels of meaningful engagement. When the results were statistically significant, we performed post hoc pairwise Mann-Whitney tests with p-values controlled for the three comparisons using Bonferroni adjustment. The effect size from the Mann-Whitney test results was calculated as Cohen's r, where r = z/√n, z was the z-score value obtained from the Mann-Whitney test and n was the total sample used per comparison. 47

Most participants had completed some post-secondary education up to a bachelor's degree (50.4%), and most had a household income of between US$24 000 and US$80 000. The majority (91.5%) identified as patient partners, with some of those also identifying as a family/friend/unpaid caregiver partner. The largest portion (43.7%) of participants were involved in local research teams, and participants' research projects were predominantly (71.4%) in the carrying-out phase.

| Descriptive statistics of survey
All 37 items of the PEIRS had response options of 'strongly agree', and 28 included response options of 'strongly disagree' to cover the ends of the response categories (see Table 2). One item (PR1) had 'neutral' as its least favourable response, and eight items (PR3 to PR8, PR11 and BE3) had 'disagree'. The mean for each item was above 'agree', and the median was either 'agree' or 'strongly agree'. While not important for the total scores, the item-level ceiling effect varied be-

| Item reduction
The inter-item correlation matrix of the 37-item PEIRS revealed multiple negative or low (<0. 10

| DISCUSSION
In this study, we designed
Our study had limitations. First, the average sample size of 5.3 participants per item for exploratory factor analysis is less than is widely recommended, although it is acceptable. 25
a Comparison between each pair of groups was statistically significant.
The numbers in parentheses correspond with the numbers in Figure 3.
whether administration modes impact on the PEIRS-22 psycho-

| DATA SHARING/AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.

ACKNOWLEDGEMENTS
We would like to thank the people and organizations who sup-

CONFLICT OF INTEREST
The authors have no conflict of interest to declare.

Patient Engagement In Research Scale - PEIRS-22
Name: Date: Your project's name: INSTRUCTIONS: Thinking about your experience as a patient partner in the project, please respond to the statements by choosing only one box for each statement. If you are unsure about which option to choose for a statement, please give the best response you can. This questionnaire may take you about 3 to 7 minutes to complete.

Procedural Requirements
The following seven (7) statements are about your general experiences throughout the project.

Team Environment and Interaction
The following two (2) statements are about the research environment and interaction throughout the project.

Support
The following two (2) statements are about the support provided throughout the project.

Feel Valued
The following two (2)