Patient‐reported outcome measures (PROMs): A review of generic and condition‐specific measures and a discussion of trends and issues

Abstract

Background: Patient‐reported outcome measures (PROMs) are questionnaires that collect health outcomes directly from the people who experience them. This review critically synthesizes information on generic and selected condition‐specific PROMs to describe trends and contemporary issues regarding their development, validation and application.

Methods: We reviewed academic and grey literature on validated PROMs by searching databases, prominent websites, Google Scholar and Google Search. The identification of condition‐specific PROMs was limited to common conditions and those with a high burden of disease (eg cancers, cardiovascular disorders). Trends and contemporary issues in the development, validation and application of PROMs were critically evaluated.

Results: The search yielded 315 generic and condition‐specific PROMs. The largest numbers of measures were identified for generic PROMs, musculoskeletal conditions and cancers. The earliest published PROMs were in mental health‐related conditions. The number of PROMs grew substantially between the 1980s and the 2000s but has slowed more recently. The number of publications discussing PROMs continues to increase. Issues identified include the use of computer‐adaptive testing and increasing concerns about the appropriateness of using PROMs developed and validated for specific purposes (eg research) for other reasons (eg clinical decision making).

Conclusions: The term PROM is a relatively new designation for a range of measures that have existed since at least the 1960s. Although literature on PROMs continues to expand, challenges remain in selecting reliable and valid tools that are fit‐for‐purpose from the many existing instruments.

Patient or public contribution: Consumers were not directly involved in this review; however, its outcome will be used in programmes that engage and partner with consumers.


| INTRODUCTION
Over the last few decades, health-care systems have increasingly recognized patients' perspectives as fundamental to ensuring that services are of a high quality and delivered in an equitable and safe way. 1 The expanding use of patient-reported outcome measures (PROMs) has been part of this shift. 2,3 PROMs are standardized questionnaires that collect information on health outcomes directly from patients, including symptoms, health-related quality of life and functional status. In addition to standardization, PROMs should ideally undergo psychometric validation to ensure that they accurately reflect the outcomes they purport to measure and that they can reliably assess changes over time. 4 PROMs were originally developed for use in research, particularly clinical trials assessing the effectiveness of treatments. 5 Over time, their applications have broadened to include supporting clinical decision making, prioritizing patients for surgical procedures, comparing outcomes among health-care providers, stimulating quality improvement and evaluating practices and policies. 2,3,[6][7][8] Evidence that the routine use of PROMs, at least in an oncological setting, leads to better outcomes for patients is inconclusive, but they do appear to improve patient-provider communication and patient satisfaction. 6,9 The potential benefits of these measures rely on them being rigorously developed, relevant to patients and well-validated. 10
Broadly, PROMs fall into two main categories: condition-specific and generic. The latter measure health concepts that are relevant to a wide range of patient groups, enabling aggregation and comparisons across varied conditions and settings. An example is the EQ-5D; developed by the EuroQol Group, 11 it includes five questions asking about the patient's health that day in terms of mobility, self-care, usual activities, pain/discomfort and anxiety/depression. 14 ) or populations (eg children and adolescents 15 ).
A review that critically synthesizes information on PROMs and considers trends and issues in this literature has thus far been lacking. This paper aims to address this gap by evaluating trends in PROMs and their publication and by discussing contemporary issues that relate to their development, validation and application in health care.

| METHODS
We used a rapid review methodology to synthesize the evidence on generic and condition-specific PROMs, identifying measures and evaluating trends and issues. Rapid review is a form of evidence synthesis that streamlines traditional systematic review methods in a shortened time frame. 16 The approach was selected due to the fast-moving nature of the field and the breadth of focus of this review. The Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines were used to guide the methodological design. 17 A critical review approach 18 was adopted to evaluate trends and contemporary issues in the use of PROMs. This rapid review was conducted for a lead safety and quality organization in Australia, the Australian Commission on Safety and Quality in Health Care. Although consumers were not directly involved in this review, the Commission engages and partners with consumers to deliver its work programmes. This review is part of its work to support the use of PROMs to 'drive quality improvement in a way that brings patients' voices and outcomes to the fore'. 19

KEYWORDS
patient safety, patient-reported outcome measure, PROM, review

An example of the search strategy used is displayed in

| Eligibility criteria
PROMs were included in the review if they met the following criteria: (1) standardized instrument/survey that is used for measuring health outcomes (eg symptoms, quality-of-life, functional status) reported directly by the patient and using a range of modes of delivery, including computerized-adaptive testing (CAT); (2) validated, that is, there were published statistical analyses establishing reliability (eg Cronbach's alpha 22 ) and validity of the scale(s), including construct validity (eg factor analyses, item-response modelling, convergent and discriminant validity), criterion-related validity (eg concurrent and predictive validity), or analyses of known group differences; (3) validation analyses were conducted on an English language version of the instrument, either in the original validation paper or subsequently; and (4) the measure assesses generic health status OR is a condition-specific PROM from one of the included conditions (see Results, Table 2). To be included, a survey did not need to be described as a 'PROM' in its original development, but it needed to measure patient-reported health outcomes as per criterion (1).
For criterion (4), conditions were selected by examining the Australian Institute of Health and Welfare's (AIHW's) list of high burden diseases, 23 which largely reflected international trends. 24 Reproductive and maternal conditions were also added based on consultation with the project funder. These conditions are grouped by International Classification of Diseases (ICD-10) disease groups. 25
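Criterion (2) names Cronbach's alpha as one example of reliability evidence. As a minimal sketch of what that statistic summarizes, the following computes alpha from a respondent-by-item score matrix using the standard formula, alpha = k/(k - 1) × (1 - sum of item variances / variance of total scores). The function name and the response data are hypothetical and purely illustrative; they are not taken from any instrument in this review.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances throughout: alpha = k/(k-1) * (1 - sum(item var) / total var).
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")

    item_columns = list(zip(*responses))  # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return (n_items / (n_items - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical 5-point responses from six respondents to a four-item scale.
example_responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 5, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
    [4, 4, 4, 5],
]
print(f"alpha = {cronbach_alpha(example_responses):.2f}")
```

When items move together across respondents, the variance of the total score is large relative to the summed item variances and alpha approaches 1; when items are unrelated, alpha falls towards 0.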

| Additional searches: Snowballing, scoping published reviews and grey literature
When reviewing validation papers, researchers noted any additional potential PROMs, adding them to the list for scoping (ie snowballing).

Searches for other PROMs in Scopus, Google Scholar and Google Search were also performed using terms for both specific conditions

Database: MEDLINE
Keywords and MeSH: "patient?reported outcome" OR "PROM" OR MeSH term "patient reported outcome measures" AND "psychometr*" OR "reliability" OR "valid*" AND "questionnaire" OR "tool" OR "scale" OR "survey" OR "measure" OR "instrument" OR "interview"

26 were checked. Researchers also searched for recent systematic and other reviews on the topic, reading these papers to ensure as far as possible no relevant PROMs were missed.

| Screening and data extraction
Citations returned from the database searches were downloaded into EndNote and duplicates were removed. Titles and abstracts were then exported to Microsoft Excel, where they were screened against the inclusion criteria and preliminary data were extracted at the same time. Preliminary data extraction involved documenting the relevant disease group (where applicable, from Table 2) and condition(s) reported in the abstract, plus any PROMs mentioned by name, for all citations meeting the inclusion criteria. In the few cases where there was not sufficient information in the abstract to assess eligibility or extract data, the full text of the paper was examined. In

| RESULTS
A total of 6453 citations were returned from the database search. Following removal of duplicates, 3450 titles/abstracts were exported to Excel for screening and preliminary data extraction (Figure 1). This led to the identification of 255 PROMs, both generic and from conditions in Table 2.
A further 200 PROMs were identified through additional searches, amounting to 455 PROMs evaluated. Of these, 315 met all inclusion criteria and were included in this review (Appendix S1).
PROMs were excluded at this stage for a number of reasons: there was insufficient evidence of validation or no validation in English; the measure was a clinical assessment tool (ie used by clinicians) or was not yet sufficiently validated for use as a PROM; the measure was too generally described or lacked appropriate standardization to be adequately validated (eg general reports of numeric rating scales, visual analogue scales, the Patient Global Assessment 27 ); or the information available online was too limited to assess eligibility.
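The counts reported above can be checked with simple arithmetic. The short sketch below only restates figures given in the text; the derived values rest on the assumption that the drop from 6453 to 3450 records is entirely due to duplicate removal, and the variable names are illustrative.

```python
# Figures reported in the Results text.
citations_returned = 6453
records_after_deduplication = 3450
proms_from_database_search = 255
proms_from_additional_searches = 200
proms_included = 315

# Derived values (assumes only duplicates were removed before screening).
duplicates_removed = citations_returned - records_after_deduplication
proms_evaluated = proms_from_database_search + proms_from_additional_searches
proms_excluded = proms_evaluated - proms_included
share_included = proms_included / proms_evaluated

print(f"Duplicates removed: {duplicates_removed}")  # 3003
print(f"PROMs evaluated:    {proms_evaluated}")     # 455
print(f"PROMs excluded:     {proms_excluded}")      # 140
print(f"Share included:     {share_included:.1%}")  # 69.2%
```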
The number of PROMs included in the review by disease group is shown in Table 2. As can be seen, the highest number of PROMs was identified for musculoskeletal conditions and for generic PROMs,

| DISCUSSION
This review identified both generic and condition-specific PROMs, evaluated trends in measures and publications and considered contemporary issues in the development, validation and application of PROMs in health care. We identified 315 validated PROMs (Appendix S1), covering a range of common conditions with high burden of disease across all major condition groups. 25 Thirty-nine of the included measures were generic PROMs.
In scoping the literature, we also observed utilization of CAT over the last decade. CAT is able to differentiate respondents along the continuum of a trait (eg degree of pain) by extensive collection, content validation and calibration of item banks. 40 Calibrating item banks involves advanced psychometrics using item-response theory to determine the degree of 'difficulty' or level of the underlying trait being measured (eg 'moderate' pain) for each item. 41 Using calibrated item banks in computer-administered PROMs is thought to improve efficiency, making delivery of the questionnaire more dynamic and flexible because new questions are adapted to patients' prior responses. 42 In that sense, CAT-enabled PROMs are more individual-
In many cases, these were drawn from other validated PROMs, but their calibration, coupled with the extensive support available and limited restrictions on use, has contributed to PROMIS' popularity. 48
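The adaptive logic described above, in which a calibrated item bank is used to choose each new question from the patient's prior responses, can be illustrated with a deliberately simplified sketch. It assumes a dichotomous Rasch (one-parameter) model, selection of the most informative remaining item, and expected a posteriori scoring with a standard normal prior. These are common textbook choices made here for illustration only; they are not a description of how PROMIS or any instrument in this review is implemented, and the item names and difficulty values are hypothetical.

```python
import math

# Hypothetical calibrated item bank: item name -> difficulty on the trait scale.
ITEM_BANK = {
    "mild_pain": -1.5,
    "moderate_pain": 0.0,
    "severe_pain": 1.5,
    "extreme_pain": 2.5,
}


def prob_endorse(theta, difficulty):
    """Rasch probability that a respondent at trait level theta endorses the item."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def item_information(theta, difficulty):
    """Fisher information of a Rasch item at theta; largest when difficulty is near theta."""
    p = prob_endorse(theta, difficulty)
    return p * (1.0 - p)


def next_item(theta, bank, asked):
    """Choose the unasked item that is most informative at the current estimate."""
    remaining = [name for name in bank if name not in asked]
    return max(remaining, key=lambda name: item_information(theta, bank[name]))


def estimate_theta(responses, bank):
    """Expected a posteriori trait estimate on a grid, with a standard normal prior."""
    grid = [step / 10.0 for step in range(-40, 41)]
    weights = []
    for theta in grid:
        weight = math.exp(-0.5 * theta * theta)  # unnormalized prior density
        for name, endorsed in responses.items():
            p = prob_endorse(theta, bank[name])
            weight *= p if endorsed else (1.0 - p)
        weights.append(weight)
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)


def run_cat(answer, bank=ITEM_BANK, max_items=3):
    """Administer up to max_items adaptively; answer(item) returns True or False."""
    theta, responses = 0.0, {}
    for _ in range(max_items):
        item = next_item(theta, bank, responses)
        responses[item] = answer(item)
        theta = estimate_theta(responses, bank)
    return theta, list(responses)


# A simulated respondent who endorses every item is routed to harder items.
theta_high, items_high = run_cat(lambda item: True)
print(items_high, round(theta_high, 2))
```

In this sketch every respondent starts with the item closest to the average trait level, and the second question then depends on the first answer, which is the sense in which a CAT tailors the questionnaire to the individual.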

| Issues
Many of the PROMs identified by this review were well-established (some were more than 30 years old), while papers captured from our search of academic databases were more recent. Thus, the term 'patient-reported outcome' appears to be a fairly new designation for measures, questionnaires and inventories that might once have been described as assessing symptom severity or health-related quality-of-life. It was often only in retrospect, by the attribution of other authors, that older instruments were recognized as PROMs. This may make searching for PROMs challenging.

| Strengths and limitations
The method reported here is a multipronged approach to reviewing the literature and identifying PROMs. A key strength was examining both generic and condition-specific PROMs to consider trends and issues in this growing field. Less than two thirds of instruments  64 We excluded a number of measures that are still in the early stages of validation (eg reporting on content validity) but it is likely that over time, through more extensive validation, these would have met criteria for inclusion.
We also excluded PROMs where we could not find evidence of English validation, even though some had translations available. 65 The research on, and use of, PROMs, as this review has shown, is constantly moving.

| CONCLUSION
There are many PROMs and many more studies examining, using or discussing them. In this review, we identified 315 generic and condition-specific PROMs, producing a library available for public use that will be updated over time. 66 We also evaluated trends among measures and publications on PROMs and discussed contemporary issues related to their development, validation and application in health care. A key challenge to using PROMs is selecting a reliable and valid tool that is appropriate for one's purpose from the hundreds of instruments available. This review provides insights to assist with understanding the scope of available generic and selected condition-specific PROMs, including trends in the development, validation and use of these PROMs. It highlights the growing global recognition that incorporating the patient's perspective is integral to the quality and effectiveness of health care, as well as the issues associated with this shift. The review demonstrates that the measurement of patient-reported outcomes is an evolving field.

CONFLICT OF INTERESTS
None declared.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available in the supplementary material Appendix S1 of this article.