Patient-reported outcomes, along with clinician-reported outcomes, laboratory tests, and device measurements, are collected in clinical trials to evaluate the effectiveness of treatments. Instruments within each type of measurement have unique characteristics that make them relevant for use in a given trial. Guidelines have been proposed to assist in the preparation of manuscripts and presentation of data. For PRO measures in particular, guidelines have been proposed by Fayers and Machin, Revicki, Staquet et al. [19,20], and Sloan et al.
In this section, we go beyond previously published guidelines by specifying the detailed information needed to describe adequately the PRO instrumentation used in a study. Complete information about the relationships among each instrument, the study population, and the interventions, as well as about data collection procedures, methods of analysis, and methods of interpretation, is needed for understanding the findings and for generalizing the results beyond any particular study.
By following the detailed reporting of information recommended here, users can compare PRO information across various studies, whether conducted as part of a submission to the Food and Drug Administration (FDA) or for other purposes. Consistent reporting can also lead to the creation of an evidence-based data set that can be used to support the construct validity of PRO instruments in a variety of settings. This information might simplify the FDA submission and approval process as well as contribute to a more informed instrument selection process in the future.
Description of the Instruments Used
A description of each PRO instrument used in any trial is critical for understanding the outcomes. In most publications, this information is minimal, consisting of the name of the instrument with key references to its development and to documentation of its measurement properties. This is typically accompanied by a statement that the instrument has been administered, scored, and analyzed according to the developer's documentation. Providing minimal information about a PRO instrument in the Materials and Methods section is acceptable only in trials that use an instrument without any modification.
A minimal description is satisfactory when the PRO instrument has been well documented by its developer and widely used, such as the St. George's Respiratory Questionnaire (SGRQ) and the SF-36. Minimal information is insufficient, however, for instruments with little or no formal documentation. For these tools, the sufficient information to be described in the Materials and Methods section includes the minimal details (given above) as well as domain names, recall period, and methods of scaling and scoring. All modifications to the original tool and/or conditions for use, e.g., changes in item wording, recall period, and scoring method, should be fully described to clarify interpretation and comparability of findings. If possible, a copy of the instrument should appear in the cited references, be readily available at a permanent website, or be supplied by the authors on request.
Ideally, the Materials and Methods section should include all of the minimal and sufficient information as well as a copy of the instrument, in either the text or an appendix. This need not be in camera-ready format but, rather, may be in reviewable form that includes the item stem, response options, and recall period. With ready access to the tool, the reader can evaluate the instrument's content within the context of the trial and thus interpret the findings meaningfully.
If an instrument has been endorsed by a professional organization, such as the American College of Rheumatology (ACR), this point should be noted along with the instrument's description and taken as an indication of the relevance of the instrument's content. For example, the ACR response criteria incorporate several PROs [23,24]. Endorsement also implies widespread use, with results contributing evidence to support an instrument's construct validity. If an instrument lacks professional endorsement, then including a review version of the instrument in the article will help the reader determine its appropriateness. Also, without an endorsement and/or widespread use, evidence will need to be provided as to an instrument's validity for use in a particular study population.
Although reporting minimal and sufficient information seems basic, it is easy to find articles that cite incorrect instrument names and references and fail to describe changes in wording and/or scoring procedures. Most frequently, however, authors name and reference an instrument but provide no information as to the conditions of its use. For example, in a review of approximately 100 articles reporting the use of either the General Well-Being or the Psychological General Well-Being Index in clinical trials, 55% provided less than minimal information about how the instrument was actually used.
Description of the Intervention
Details of the interventions intended for each group should be stated, including the method, timing, and duration of treatment administration. Authors should describe the treatment given to the control group (more than merely stating that a group received a control regimen or “standard of care”), and they should provide information on the placebo treatment (if a placebo was used). Finally, they should state whether the treatment was masked (e.g., double-blind) or open label, as this may affect evaluation of effectiveness.
In addition, specifying the course of treatment is crucial so that the time points at which each PRO instrument was administered can be examined in the context of treatment impact. From the description of the treatment(s) being evaluated and the timing of the PRO measurements (e.g., at baseline and weeks 2, 4, and 8), the reader should be able to understand clearly the relationship between treatment response and the likely ability of the instrument to detect any change in status. For example, short reference periods (e.g., 4 or 24 hours) are better suited for assessing the impact of migraine treatment than are longer periods (e.g., 1 week or 1 month).
Data Collection Procedures
PRO instruments, when used in clinical trials, can be interviewer-administered or self-administered, whether at the site of care, by postal questionnaire, by telephone through an interactive voice recognition system, or via electronic data capture. Analysts should disclose the conditions of the administration site (e.g., a standardized location free of distractions) and any formatting specific to the study (e.g., large print to accommodate elderly persons with low vision or electronic forms of data capture). If the PRO measure was interviewer-administered, either in person or via the telephone, the extent to which interviewers were trained should be specified. Information on standardized training of study coordinators or patients (e.g., training videos) and use of practice diaries during the placebo run-in period should also be provided. Such detail is important for understanding the nature of missing data and incomplete responses and potential sources of bias.
If the procedures used in the trial differ from those used in the development of the instrument, then the steps used to validate the alternative method of administration should be described or referenced. Although sponsors may use PRO data collected in a Phase III clinical trial both to validate a method of administration and to evaluate treatment efficacy, they run the risk of not being able to use the PRO results for making a claim if the validation is unsuccessful. Thus, sponsors are ill advised to use a pivotal trial for PRO validation purposes. The same advice applies to using PRO data from a Phase III study for the initial validation of any aspect of an instrument that is also to be used as a trial end point. Developers can use clinical trial data as supportive evidence for the psychometric properties of a PRO, but an independent validation study or another published study of the end point would ideally be available if sponsors intend for the PRO to be a primary end point.
Sample Size Determination
If a PRO is the primary end point, then the Materials and Methods section needs to state the methods and information used in determining the sample size; this requirement is similar to that for presenting equivalent information for a non-PRO end point. It is especially important to cite the source of the MID that is used so that its relevance to the current study is clear. With the increasing use of PROs in clinical trials, estimates of MID are more readily available from published sources. The power needs to be stated as well as any effort to oversample to account for anticipated dropouts, for those who prove to be ineligible, for key subgroups, or for missing data.
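As an illustration of the elements that should be reported (the MID, the assumed variability, the power, and any allowance for dropouts), the following minimal sketch computes a per-group sample size for a two-arm comparison of mean PRO scores. It assumes a two-sided test, equal allocation, a common standard deviation, and the usual normal approximation; the MID of 5 points, the standard deviation of 12 points, and the 15% dropout rate are hypothetical values chosen only for the example, and the function names are ours.

```python
import math
from statistics import NormalDist


def n_per_group(mid, sd, alpha=0.05, power=0.80):
    """Evaluable patients per arm for a two-sided, two-sample comparison
    of means (normal approximation, equal allocation, common SD)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_beta) * sd / mid) ** 2)


def inflate_for_dropout(n, dropout_rate):
    """Patients to enroll per arm so that about n remain after dropout."""
    return math.ceil(n / (1 - dropout_rate))


# Hypothetical inputs: MID = 5 points, SD = 12 points, 15% anticipated dropout.
n_evaluable = n_per_group(mid=5.0, sd=12.0)
n_enrolled = inflate_for_dropout(n_evaluable, dropout_rate=0.15)
print(n_evaluable, n_enrolled)
```

Because the required sample size scales with the square of SD/MID, a modest change in the assumed MID changes the sample size substantially, which is one reason the source of the MID needs to be cited.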
Many PRO instruments consist of multiple domains and subdomains, unlike laboratory tests, device measurements, and many clinician-rated assessments. Even if investigators intend to use a summary score (e.g., the Health Assessment Questionnaire Disability Index), the FDA may evaluate the subdomains. To accommodate this regulatory analysis, a larger sample size might be advisable. If the sponsor decides to use this strategy, the investigators should clearly state this point in the methods section.
Presentation and Interpretation of Findings
Clear and thorough presentation is essential because PRO data provide the only statement of the patient's perspective on treatment impact. In addition, publication of the PRO data builds evidence to support the use of an instrument in a variety of populations and settings. That is, published findings help establish an instrument's construct validity.
Clinical trial data generally appear in tables or annotated figures. Tables are essential for presenting descriptive statistics that can be used by others to calculate estimates of effect sizes and perhaps other types of responsiveness statistics. This information should be presented for all PROs used in the clinical trial. Thus, PRO data need to be made publicly available, whether in journals or as part of a website of findings (perhaps as part of http://www.ClinicalTrials.gov).
Figures can be used to show relationships between PRO and clinical data. For example, Teeter et al. used a scatter plot to show the absence of a relationship between FEV1 and total symptoms, thus illustrating the unique contribution of PRO data. Such plots can be more informative than a summary statistic such as a correlation coefficient. Bar charts or line drawings are also useful for showing either scores between groups at a particular time point or change from baseline scores. In these graphs, means, standard deviations, and sample sizes should be given to allow the reader to check for statistical significance or calculate responsiveness statistics. Charts without data are unacceptable unless the data are presented in accompanying tables.
To supplement a table of sample sizes, means, standard deviations, and/or P-values for investigating differences between treatment groups in a PRO end point, the FDA in a recent guidance document suggested a cumulative distribution plot (see page 19 of that guidance). A cumulative distribution plot shows distributional properties of the observed PRO end point data that are not readily extracted from a table of summary statistics. Although this is a useful proposal, we present alternative graphics that may be easier for nonstatisticians to read (Fig. 1). A more detailed comparison of these alternative graphics is forthcoming.
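To show why means, standard deviations, and sample sizes are the minimum a reader needs, the sketch below derives two commonly used statistics from summary data alone: a between-group effect size (Cohen's d with a pooled standard deviation) and a within-group standardized response mean. These are offered as examples of responsiveness statistics, not as the only options, and all numeric values are hypothetical.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Between-group effect size: difference in means over the pooled SD."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


def standardized_response_mean(mean_change, sd_change):
    """Within-group responsiveness: mean change over the SD of change."""
    return mean_change / sd_change


# Hypothetical change-from-baseline summaries as they might appear in a table.
d = cohens_d(mean1=8.0, sd1=10.0, n1=50, mean2=3.0, sd2=10.0, n2=50)
srm = standardized_response_mean(mean_change=8.0, sd_change=10.0)
print(f"Cohen's d = {d:.2f}, SRM (treatment arm) = {srm:.2f}")
```

None of these quantities can be recovered from a chart that omits the underlying numbers.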
Figure 1. Overlaid (by treatment group) histogram (A) and difference between treatment groups in cumulative distribution plot (B) for change-from-baseline PRO data. PRO, patient-reported outcome.
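For readers who wish to construct a cumulative distribution plot from their own data, the following minimal sketch computes the empirical cumulative distribution of change-from-baseline scores for each treatment group and the proportion of patients at or above a chosen threshold. The scores and the threshold of 5 points are hypothetical, higher change scores are assumed to indicate improvement, and the function names are ours; plotting the resulting points as a step function for each group gives the graph.

```python
def empirical_cdf(values):
    """Return (score, cumulative proportion) pairs, one per distinct score."""
    ordered = sorted(values)
    n = len(ordered)
    points = []
    for i, v in enumerate(ordered, start=1):
        if points and points[-1][0] == v:
            points[-1] = (v, i / n)   # tied score: keep the highest proportion
        else:
            points.append((v, i / n))
    return points


def proportion_at_or_above(values, threshold):
    """Share of patients whose change score meets or exceeds the threshold."""
    return sum(v >= threshold for v in values) / len(values)


# Hypothetical change-from-baseline scores (higher = more improvement).
treatment = [-2, 0, 3, 5, 5, 8, 10, 12]
control = [-5, -3, 0, 0, 2, 4, 5, 7]

for name, scores in (("treatment", treatment), ("control", control)):
    print(name, empirical_cdf(scores))
    print(name, "at or above 5 points:", proportion_at_or_above(scores, 5))
```

Reading the two curves at any threshold shows the between-group difference in the proportion of patients reaching that level of change, which a table of means and standard deviations does not reveal.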
Complete disclosure about the amount and type of missing data is imperative to the veracity of study results. Presenting a CONSORT diagram is a concise and complete method that has gained favor in recent years [18,30]. It delineates what happened to every person in the study from initiation to completion and provides sample sizes for the various analytical datasets (efficacy, safety, etc.). Alternatively, or complementarily, a table of the amount of missing data should be included to identify the exact number and proportion of missing data that occurred for each end point. Authors should comment on the amount of data that they consider to be missing at random or missing owing to some systematic influence.
In reporting missing data, we recommend that authors include the following information: number of patients potentially able to provide data, number of patients who died, and number of patients who failed to provide data but were alive. This gives a clear indication of the proportion of patients who truly failed to provide data when they might have and indicates the degree to which missing data could have been avoided or may have influenced results. It also allows for an examination of differential dropout between treatment arms.
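The recommended counts can be assembled into a small table per treatment arm, as in the sketch below. It assumes that "potentially able to provide data" means randomized patients who were alive at the assessment; the arm names and all counts are hypothetical, and the class and property names are ours.

```python
from dataclasses import dataclass


@dataclass
class ArmCounts:
    arm: str
    randomized: int
    died: int            # died before the assessment
    provided_data: int   # completed the PRO at the assessment

    @property
    def able_to_provide(self):
        # Assumption: everyone alive at the assessment could have responded.
        return self.randomized - self.died

    @property
    def alive_but_missing(self):
        return self.able_to_provide - self.provided_data

    @property
    def avoidable_missing_rate(self):
        return self.alive_but_missing / self.able_to_provide


# Hypothetical counts at a single assessment time point.
arms = [
    ArmCounts("treatment", randomized=120, died=6, provided_data=98),
    ArmCounts("control", randomized=120, died=10, provided_data=88),
]

print(f"{'Arm':<10} {'Able':>5} {'Died':>5} {'Missing':>8} {'Rate':>6}")
for a in arms:
    print(f"{a.arm:<10} {a.able_to_provide:>5} {a.died:>5} "
          f"{a.alive_but_missing:>8} {a.avoidable_missing_rate:>6.1%}")
```

Placing the arms side by side makes any differential dropout visible; repeating the table for each assessment time point shows how missing data accumulate over the course of the trial.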
PRO data are usually one of several types of data collected in a clinical trial; other types include clinician-reported outcomes, laboratory tests, and biometric measures. Together, findings from these sources yield a composite picture of treatment efficacy and effectiveness. Relationships between the types of indicators need to be presented in a straightforward way that will be readily understandable.
For transparency, all results should be presented. What happens when there is a mix of positive and negative PRO results for an experimental treatment? For example, say a treatment group reported less fatigue (primary end point) but more diarrhea (secondary end point) than the control group. Such results can be interpreted only on a case-by-case basis. The decision ultimately has to fit with the concept for the label claim. An analogy here is the listing of all potential adverse events that appear during a clinical trial. It is not unreasonable, therefore, to suggest that any PRO domain that demonstrates a beneficial result might be included in a label claim or promotion as a potentially positive effect of a treatment, but that all other results, whether or not they are beneficial, should also be presented to provide a full picture of the effects. The PRO labeling claim should be based on the a priori specified domains and on whether between-group differences were statistically and clinically significant for these specified domains. Nevertheless, unexpected PRO findings, either positive or negative, may be seen in the secondary PRO end points, and these results may be informative to clinicians and patients.