Evaluating community deliberations about health research priorities

Abstract

Context: Engaging underrepresented communities in health research priority setting could make the scientific agenda more equitable and more responsive to their needs.

Objective: Evaluate democratic deliberations engaging minority and underserved communities in setting health research priorities.

Methods: Participants from underrepresented communities throughout Michigan (47 groups, n = 519) engaged in structured deliberations about health research priorities in professionally facilitated groups. We evaluated some aspects of the structure, process, and outcomes of deliberations, including representation, equality of participation, participants’ views of deliberations, and the impact of group deliberations on individual participants’ knowledge, attitudes, and points of view. Follow-up interviews elicited richer descriptions of these and also explored later effects on deliberators.

Results: Deliberators (age 18-88 years) overrepresented minority groups. Participation in discussions was well distributed. Deliberators improved their knowledge about disparities, but not about health research. Participants, on average, supported using their group's decision to inform decision makers and would trust a process like this to inform funding decisions. Views of deliberations were the strongest predictor of these outcomes. Follow-up interviews revealed deliberators were particularly struck by their experience hearing and understanding other points of view, sometimes surprised at the group's ability to reach agreement, and occasionally activated to volunteer or advocate.

Conclusions: Deliberations using a structured group exercise to engage minority and underserved community members in setting health research priorities met some important criteria for a fair, credible process that could inform policy. Deliberations appeared to change some opinions, improved some knowledge, and were judged by participants worth using to inform policymakers.


| INTRODUCTION
A major contributor to health disparities is the relative lack of resources, including the resources of science, allocated to address the health problems of those with disproportionately greater needs. 1,2 While health research priorities are often shaped by scientists, clinicians, advocacy groups and the private sector, the allocation of scarce resources for health research requires explicit attention to both justice and science. 3,4 Engaging and involving underrepresented communities in research priority setting could make the scientific research agenda more equitable, and more responsive to their needs and values. [5][6][7] Academics, funders and governments increasingly strive to engage communities not merely as subjects of research, but as partners in the setting of priorities for health research. 8 The Council of Public Representatives, charged with advising the NIH Director on research priorities, recommended educating and involving the public, "where they live," 3,[9][10][11][12] yet how to engage communities in research priority setting remains a challenge. Those seeking to involve the public in setting priorities for limited resources sometimes use a deliberative approach, which aims for collective, informed problem-solving about a policy problem. 13 Trade-offs between different areas of spending can be difficult policy topics, 10,14 and those that wrestle with core values, or that pit money against health, can be particularly difficult. Research allocation decisions may not seem salient to many non-experts, and discussions about health research priorities can be complex and technical, so members of the public may not feel competent to contribute. Given the challenges of deliberations on complex and value-laden topics, attention to the quality of deliberation is essential.
In this paper, we evaluate the use of a deliberative exercise, CHAT (CHoosing All Together), to facilitate deliberation about health research priorities constrained by limited resources. † CHAT was originally developed as a "serious game" for deliberations about the design of health insurance plans 15 that aims to promote informed, reasoned dialogue about allocation decisions among ordinary persons. 16 It has been used to examine healthcare priorities in a number of different settings in the USA and other countries, engaging a wide range of individuals and communities. 15,[17][18][19][20][21][22][23][24] A number of studies have concluded, in these settings, that CHAT facilitates high-quality deliberation, changes individual preferences and opinions and increases knowledge. 15,18,20 There is some evidence that CHAT leads participants to take a more public-spirited view of resource allocation decisions; for example, a 2004 study found that participants in CHAT were willing to give up some benefit coverage to increase coverage of the uninsured. 20 Setting priorities for health insurance or health care, while complex and value-laden, can be viewed by most people as potentially relevant to their lives, whereas priorities for health research could seem more remote. Whether CHAT can produce high-quality deliberation on this complex topic, further from the day-to-day experience of most members of the public, is unknown. Here we report an evaluation of deliberations about such prioritization decisions using CHAT to facilitate deliberation about the allocation of health research dollars in minority and medically underserved communities.
We evaluate CHAT deliberations using a framework that examines the formal structure of deliberation (how it is organized), the process of deliberation (how it transpires) and the outcomes produced (Table 1). [25][26][27][28][29] While the goal of deliberation could be construed as "better" decisions, or outcomes, much of the normative value of deliberation comes from its promise of offering a fair process of discussion and decision making, independent of the decisions actually reached. Theories of deliberative democracy, despite important differences, share an emphasis on a process in which political actors listen to each other with openness and respect, provide reasons and justifications for their opinions, remain open to changing their points of view and consider the common good. 30,31 Structural elements can include information and choices, materials, tasks, sampling and group composition. 32 Examples of procedural aspects of quality include respectful treatment, civility and reason-giving. Outcomes can include changes in participants' knowledge or opinions, decisions made and participants' views of the group decision, including trust in decision makers. 30,33 These domains may interrelate; for instance, representation (one element of structure) could influence the quality of deliberations (process) and/or changes in the point of view of participants (outcome). Evaluation of the quality of deliberative approaches, despite its importance, is a nascent field of study. 26,27 We evaluated the structure, process and outcomes of deliberations from the perspective of deliberators themselves: how they viewed the process, whether and how knowledge and attitudes changed, and what they thought about using such a process to inform decision makers.

| METHODS
To adapt CHAT to the unique needs and objectives of research priority setting with minority and underserved communities, we utilized a participatory process, led by a Steering Committee comprised of a majority of community leaders and several leaders of research institutions, that engaged community partners in all phases of the project. 34 Adaptation was informed by documents and interviews with funders, research institutions, clinicians and community members.

† We describe the priorities for health research spending selected by participants using this exercise elsewhere. 35,47

KEYWORDS: community-based participatory research, health priorities, research priorities, resource allocation
CHAT content was designed to be credible and comprehensible to a lay audience. Final content (which included definitions and explanations of a number of scientific terms) had a Flesch-Kincaid readability score of 55 and was written at an 8th grade reading level (See Table S1). All content was available in both English and Spanish.
Since participants were laypersons with varying levels of baseline knowledge, sessions began with a brief video about health research goals, methods, costs, funders and uses, and introduced deliberators to their task. Tablet devices displaying the CHAT exercise presented participants with an interactive game board (Figure 1) with spending options depicted as wedges of a circle. Each of the 16 wedges represented a category of health research spending, and each wedge had different levels of spending (including the option of no spending at all); higher levels (towards the centre of the wheel) represented a greater investment in that type of research. Categories and levels are described in Table S1 and previously published work. 35 Participants chose a level of funding for each category by allocating the markers required for that level. However, participants were given a limited number of markers (50 markers for 92 open spaces), so choosing high levels of funding in one category required lower or no funding in another. Participants allocated their markers in four rounds. In the first round, participants set priorities as individuals; in the second round, they set priorities in small groups of 2-4; in the third round, they set priorities with the entire group (up to 15); and in the fourth round, they set priorities again as individuals. After rounds 1 and 2, the group heard and discussed scenarios ("events") that illustrated the consequences of their choices. In round 3, deliberators were asked to articulate reasons for their priorities. In all rounds, trained facilitators asked deliberators to make fair decisions on behalf of fellow community members. Participants learned from other members of the group, the illustrative events and embedded resources.
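The budget constraint that forces trade-offs in the exercise can be illustrated with a short sketch. Only the 50-marker budget and the 16 categories come from the text; the category names and the marker cost of each level are hypothetical placeholders, and the sketch does not attempt to reproduce the actual board in Table S1 or its 92 open spaces.

```python
# Illustrative sketch of the CHAT budget constraint (not the actual game content).
TOTAL_MARKERS = 50  # from the text: each participant has 50 markers

# ASSUMPTION: cumulative marker cost of reaching each level within a category.
# Level 0 means no spending; the real levels and costs are in Table S1.
LEVEL_COST = {0: 0, 1: 2, 2: 4, 3: 6}


def markers_used(allocation):
    """Total markers needed for a {category: level} allocation."""
    return sum(LEVEL_COST[level] for level in allocation.values())


def is_feasible(allocation, budget=TOTAL_MARKERS):
    """An allocation is feasible only if it fits within the marker budget."""
    return markers_used(allocation) <= budget


# Sixteen hypothetical categories, all funded at the top level: 16 * 6 = 96 markers.
everything_high = {f"category_{i}": 3 for i in range(16)}

# Eight categories at the top level plus one at level 1: 8 * 6 + 2 = 50 markers.
trade_off = {f"category_{i}": 3 for i in range(8)}
trade_off["category_8"] = 1

print(is_feasible(everything_high))  # False: exceeds the 50-marker budget
print(is_feasible(trade_off))        # True: uses exactly 50 markers
```

Under these assumed costs, funding every category at its highest level is impossible, so high funding in one area has to be paid for by low or no funding elsewhere.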

| Sampling and recruitment
We aimed to recruit equal numbers of men and women, and to have disproportionate representation of minority and low-income residents, since these perspectives tend to be underrepresented in decisions about health research priorities. [10][11][12] Purposive sampling targeted minority and medically underserved communities throughout the state of Michigan. 36

| Data collection
Given the complexity of public deliberations about health research priorities, we aimed to evaluate multiple aspects of the deliberation structure, process and outcomes (Table 1). Data sources included pre- and post-deliberation surveys, research staff observations of deliberations, priorities selected by individuals before and after group deliberations (previously reported), 35 and follow-up interviews with one participant from each group a year after the deliberations were conducted. Missing data for survey responses ranged from 0% to 7%.

| Structure
We measured representativeness using participants' self-reported demographic characteristics. Given our goal to engage minority and

| Process
We measured multiple elements of the deliberative process. Thirteen items in post-deliberation surveys measured various dimensions of deliberative quality perceived by participants, including respectful treatment, opportunity to contribute their point of view and their views of the kinds of arguments offered in deliberation. 15,24,[26][27][28][29] Mean responses are reported on a 0-4 Likert scale from Strongly Disagree to Strongly Agree, with some items reverse-coded so that higher scores always indicate higher deliberative quality. Post-deliberation surveys also included two items measuring whether deliberators supported using their group's decision to inform decision makers, and their trust in a process like this to inform decision makers; while not direct measures of process quality, we expect deliberators to support or trust processes like this to inform policy only if they view them as credible, legitimate and fair.
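Reverse-coding on a 0-4 scale can be sketched as follows. The scale bounds come from the text; the item names are hypothetical, and skipping missing items when averaging is an assumption made for illustration rather than a description of the scoring actually used.

```python
SCALE_MAX = 4  # from the text: 0 (Strongly Disagree) to 4 (Strongly Agree)


def reverse_code(score, scale_max=SCALE_MAX):
    """Flip a negatively worded item so that higher always means higher quality."""
    return scale_max - score


def scale_mean(responses, reversed_items):
    """Mean item score for one participant after reverse-coding.

    ASSUMPTION: missing items (None) are skipped rather than imputed.
    """
    scored = [
        reverse_code(value) if item in reversed_items else value
        for item, value in responses.items()
        if value is not None
    ]
    return sum(scored) / len(scored) if scored else None


# Hypothetical item names; the actual survey wording is not reproduced here.
responses = {"treated_with_respect": 4, "could_share_views": 3, "felt_pressured": 1}
print(scale_mean(responses, reversed_items={"felt_pressured"}))  # (4 + 3 + 3) / 3
```

After reverse-coding, a low score on a negatively worded item such as the hypothetical "felt_pressured" contributes a high value to the scale, so all items point in the same direction.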
In addition to survey measures, the distribution of contributions by deliberators was measured by members of the research team at 41 of the 47 sessions; at six sessions, staff was insufficient to allow complete recording of participation. Using a diagram of the deliberators, they hand-recorded the number of times each person spoke, a more accurate way to capture this information than transcription.
We used this information to assess and compare equality of participation between groups using a standard metric for market concentration, the Herfindahl-Hirschman Index (HHI), 37 which measures the degree to which one or a few actors dominate any setting. Here, we used the HHI to measure the degree to which discussion was dominated by one or a few people.
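As a minimal sketch of how such an index behaves, the standard HHI is the sum of squared shares, here applied to each deliberator's share of speaking turns. The tallies below are invented for illustration, and the function reflects the textbook definition rather than the exact computation used in this study.

```python
def hhi(contributions):
    """Herfindahl-Hirschman Index: sum of squared shares of speaking turns."""
    total = sum(contributions)
    if total == 0:
        raise ValueError("no contributions recorded")
    return sum((count / total) ** 2 for count in contributions)


# Hypothetical tallies of how many times each of four deliberators spoke.
balanced = [10, 10, 10, 10]
dominated = [37, 1, 1, 1]

print(hhi(balanced))   # 0.25, the minimum possible for four speakers (1/4)
print(hhi(dominated))  # roughly 0.86: one person dominates the discussion
```

Because the minimum value depends on group size (1/N for N speakers), raw values are not directly comparable between groups of different sizes.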

| Outcomes
To measure the impact of participation on deliberators' knowledge about research and health disparities, we compared their responses on pre- and post-deliberation surveys. Knowledge of health research was measured using two new instruments, after a search revealed no validated measures available. One instrument presented three vignettes and asked participants whether or not the vignette was research. The other instrument presented statements about research and research funding, and asked respondents to rate them true or false, for example "Results from research need to be repeated to make sure they are believable," and "The federal government funds a great deal of health research." Both measures of knowledge about research were cognitively pretested. Knowledge of health disparities was tested using a single item based on the standard definition:

Which of the following do you think is the best way to define "health disparities?"
1. Health disparities are differences in the health-care people receive.
2. [correct] Health disparities are particular types of health difference closely linked with social, economic and/or environmental disadvantage.
3. Health disparities are health differences between racial and ethnic groups.
4. I don't know.
Post-deliberation surveys also measured trust in medical researchers, 38 willingness to participate in research, likelihood of future participation in health research and perceived and desired input on setting research priorities.
One year after the final group deliberation, we randomly selected, from those who agreed to be re-contacted (86% of partic-

| Analysis
Descriptive results include means for scale scores (eg views of information) and individual survey items (eg willingness to participate in future research). Proportions describe some demographics and correct responses for knowledge items.
We analysed all questions measuring deliberators' perceptions of the quality of the deliberative process and structure using principal components analysis. As expected, items loaded onto different scales depending on whether they measured the sufficiency of information and choices, or the quality of deliberation itself. Items measuring the quality of information and choices loaded onto two separate scales depending on whether they were phrased positively or negatively; we label these "Sufficient Information and Choices" and "Insufficient Information and Choices." We expected items measuring the quality of deliberation to load onto separate factors for different elements of deliberative quality (eg mutual respect, quality of argumentation). However, the PCA results strongly suggested that these items formed a single factor, which we label "Views of Deliberation." Factor analysis revealed similar domains.
We used multilevel regression to examine relationships between participants' demographic characteristics (age, gender, race, ethnicity, education, income and rural residence), their views of the deliberation and its information and choices, and their overall trust in or support for using this process to inform policy.
Changes from pre- to post-deliberation were assessed adjusting for within-participant responses nested within CHAT group, using multilevel regression models for knowledge of health disparities (percentage correct), and using multilevel logistic regression models for dichotomized responses of perceived and desired input on research priorities (some or a great deal vs a little or none at all), likelihood of participation in research (somewhat likely or very likely vs somewhat unlikely or very unlikely) and willingness to take part in research (somewhat willing or very willing vs somewhat unwilling or very unwilling). When calculating the percentage of correct knowledge responses, if at least one item within the set of knowledge questions was answered, a missing response was considered an incorrect response.
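The scoring rule for missing knowledge responses can be sketched as below. The item names and answer key are hypothetical, and returning no score when every item is missing is our reading of the rule, not something the text states explicitly.

```python
# Hypothetical answer key for a three-item true/false knowledge set.
ANSWER_KEY = {"q1": True, "q2": False, "q3": True}


def percent_correct(answers, answer_key=ANSWER_KEY):
    """Percentage correct for one participant's set of knowledge items.

    `answers` maps item -> response, with None (or an absent key) for a skipped item.
    ASSUMPTION: a participant who skipped every item gets no score (None).
    """
    if all(answers.get(item) is None for item in answer_key):
        return None
    # A missing response never matches the key, so it counts as incorrect.
    correct = sum(answers.get(item) == key for item, key in answer_key.items())
    return 100 * correct / len(answer_key)


print(percent_correct({"q1": True, "q2": None, "q3": True}))  # 2 of 3 correct
print(percent_correct({"q1": None, "q2": None, "q3": None}))  # None: no items answered
```

Counting skipped items as incorrect keeps the denominator fixed at the full item set, so a participant cannot raise their score by answering only the questions they are sure about.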
To examine the distribution of participation in deliberations, we calculated the Herfindahl-Hirschman index (HHI) for each group: where X represents the total number of contributions to the de-

| Structure
Deliberators ranged from 18 to 88 years old, with 20% over 65 (Table 2). About two-thirds were women and about one-third resided in a rural area. About half self-identified as White, 31% Black/African American, 7% Hispanic, 6% Native American and 4% Arab American, Arab or Chaldean. Most participants (63%) had incomes less than $35,000; at least 157 (32.6%) were under the federal poverty level. About half (48.0%) reported very good or excellent health. Compared with the population of Michigan, our sample overrepresented minority and low-income residents.
Mean item and scale scores (Table 3) (Table 4). Those with higher incomes rated the sufficiency of information and choices more highly.

| Process
Views of deliberation were generally favourable (See Table 3 Table 6). Since a normalized HHI (HHI_N) of 0 indicates complete equality of participation, and HHI_N = 1 indicates completely monopolized dialogue, these results are consistent with relatively equal contribution frequency within each group.
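For reference, the usual normalization of the HHI, which we assume is the one meant here, rescales the index by group size. The symbols are chosen for illustration: x_i is the number of contributions by deliberator i, X is the total number of contributions in the group, and N is the number of deliberators.

```latex
\[
\mathrm{HHI} = \sum_{i=1}^{N} \left( \frac{x_i}{X} \right)^{2},
\qquad
\mathrm{HHI}_N = \frac{\mathrm{HHI} - 1/N}{1 - 1/N}, \quad N > 1 .
\]
```

If every deliberator speaks equally often, HHI = 1/N and HHI_N = 0; if one person makes every contribution, HHI = 1 and HHI_N = 1. The normalization therefore removes the dependence of the minimum on group size.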

| Outcomes
Participants were more likely to correctly identify the definition of health disparities after CHAT than before (aOR = 2.2, P < 0.001) (

TABLE 4 Predictors of overall support for or trust in deliberation process
Dependent variables → "I would trust a process like this to inform funding decisions"; "I would support using our group's decision to inform decision makers"

Note: All summary statistics are from hierarchical model, accounting for potential correlation of responses from within-person nested within-deliberation group.
Abbreviation: OR, odds ratio.
a Number of participants who responded both before and after deliberation.
b Adjusted for within-CHAT group clustering using multilevel logistic regression when likelihood ratio test for between-group variance was significant (P < 0.05).
c Based on dichotomized responses to health disparity definition question as correct ("health differences linked to sociodemographic disadvantages") vs. not correct ("health differences between racial and ethnic groups," "health-care people receive," or "I don't know").
d Mean changes are calculated as after deliberation minus before deliberation score; negative values correspond to decrease in attitudes/knowledge/trust after deliberation; adjusted for within-CHAT group clustering using multilevel regression model when likelihood ratio test for between-group variance was significant (P < 0.05).
e Three deliberation groups were not administered with knowledge questions post-deliberation and are excluded from this analysis. If at least one item within the set of questions is answered, then missing response is considered an incorrect response.
f Collected using a 4-point scale ranging from 0 to 3, and the dichotomized response combines "2 = some/willing/likely" or "3 = a great deal/very willing/very likely."
g Mean of 4 items; each 5-point item can range from 0 to 4. Scale reliability coefficient (α) is 0.44.
† P < 0.05; ‡ P < 0.001.

concerns brought to me…whether it's the public or staff.

Views of Deliberation
Fourteen of 33 participants who were asked whether they looked for opportunities to get involved in their communities had not done so since playing CHAT. Just under a third (10 of 33) said they were already involved in their communities, and 9 said they became more involved:

| DISCUSSION
This paper presents an evaluation of a particular deliberative procedure engaging minority and underserved communities in deliberations with the challenging task of setting health research priorities.
Consistent with our aim, we successfully overrepresented minority and low-income residents, 41 people who can be difficult to reach, since these perspectives tend to be underrepresented in decisions about health research priorities.
Deliberators' views of the information and choices available were Overall, participants expressed favourable views of the deliberation. Importantly, views of the discussions, which included their perceptions of respectful treatment, equal opportunity to talk and civility, were the strongest predictor of trust in the process and support for using results to inform decision makers. This suggests that participants viewed this as a fair process for decision making, a finding consistent with similar projects. 15,20,23,24,42 Participation was generally well distributed, even in the smallest groups, as measured by the HHI. Besides providing evidence of well-led discussions in this project, this demonstrates the use of such an index to measure the distribution of participation in deliberations, a key element of deliberative quality. As Himmelroos has articulated, "A fair and inclusive process would subsequently be one where all participants actively take part in the exchange and evaluation of reasoned arguments." 31 While a combination of frequency of contributions and volume of text (or speaking time), as others have done, 27 may permit a deeper assessment of overall contribution to discussion, the HHI provides a metric to compare and assess between groups and even between projects or events. Furthermore, even if some contributions are brief (small text volume), finding well-distributed participation in a deliberating group, and that nearly all participants speak, provides some evidence that deliberators felt comfortable contributing. Interviews revealed the unexpected finding that some participants in CHAT felt "heard" even if they did not speak during group deliberations, and seemed to welcome having other ways to contribute their points of view. Given concerns that silence, in deliberative forums, may represent refusal to engage, or passive (even if attentive) listening, 43

| Limitations
Proportions and associations should be interpreted with caution given sampling did not aim to be statistically representative. When convening face-to-face deliberations, random sampling during recruitment does not predictably lead to proportional representation, since obstacles to the willingness and ability to attend group sessions (eg time, transportation, mobility) are not randomly distributed. Instead, we aimed to oversample groups typically underrepresented in both research and policy decision making, and had excellent representation of minority and medically underserved populations. Participants ranged in educational attainment and age, and about half were <200% of the federal poverty level. Women were overrepresented, as is often true in research engaging minority and underserved populations. 45,46 Finally, exploratory analyses of interviews with deliberators about the impact of participation will need to be validated in future work.
Still, this study incorporated a wide variety of tools for data collection and analysis to measure comprehensively the quality of deliberations about resource allocation. Most measures indicated, from the perspective of deliberators themselves, good quality structures, processes and outcomes.
Our results suggest that structured deliberation using CHAT can produce high-quality deliberation even on complex prioritization decisions, such as health research spending.

ACKNOWLEDGEMENTS
We would like to thank the many Michiganders who participated in Regional Advisory Groups and in CHAT sessions.