Reasoning in the valuation of health‐related quality of life: A qualitative content analysis of deliberations in a pilot study

Abstract

Background: Group deliberation can be a pathway to understanding reasons behind judgement decisions. This pilot study implemented a deliberative process to elicit public values about health‐related quality of life. In this study, participants deliberated scales and weights for a German adaption of the Short‐Form Six‐Dimension (SF‐6D) Version 2 from a public perspective.

Objective: This article examines the reasons participants stated for health state valuations and investigates the feasibility of eliciting public reasons for judgement decisions in a deliberative setting.

Methods: The 1‐day deliberation was guided by MACBETH as a method of multi‐criteria decision analysis and involved qualitative comparisons of SF‐6D health states and dimensions. Participants deliberated in parallel small groups and a subsequent plenary assembly. A qualitative content analysis was conducted to assess the value judgements and reasons behind them.

Results: A total of 34 students participated in the study. Common reasoning was the level of impairment, marginal benefit, possibility of adjustment and expectation satisfaction. While the small groups agreed on scales for the SF‐6D dimensions, the plenary assembly did not reach consensus on one scale and dimension weights. When dimensions were prioritized, these were pain and mental health.

Conclusions: While no consented value set was derived, this pilot study presents a promising approach for eliciting public reasoning behind judgements on health state values. Furthermore, it demonstrates that participants consider diverse motives when valuing health‐related quality of life.


| INTRODUCTION
Preference elicitation to value health-related quality of life (HRQOL) measures is generally conducted in large-scale surveys of the population. These surveys can be administered using a variety of valuation techniques, such as time trade-off, standard gamble and discrete choice experiments, and formats, from solely online-based to face-to-face or group interviews with and without computer aids. 1 Despite continuous development of these methods, 2 traditional approaches to preference elicitation face several points of critique. In many cases, participants are given cognitively challenging tasks with vague and unfamiliar questioning 3 and have little opportunity to reflect and make well-considered judgement decisions. 4 Additional concerns include the occurrence of inconsistent assessments 5,6 and the use of mean values which may not best mirror participants' value judgements. 3 In light of these issues, Hausman suggests that citizens should deliberate health state valuations to derive public values. 3,7 Taking a public perspective in valuation means considering how a health state limits the range of objectives which members of a society can pursue. 7 This approach is in contrast to eliciting private values where citizens consider their individual objectives in health state valuation.
Hausman argues that resource allocation in health care should be neutral to such individual objectives and instead consider the public value of health states. 3 Following the argument for deliberative HRQOL valuation, this pilot study implements a consensus conference to value a German adaption of the Short-Form Six-Dimension (SF-6D) Version 2 using MACBETH. A detailed description of the methodology is presented elsewhere. 8 In short, MACBETH is a method of multi-criteria decision analysis (MCDA) which uses only qualitative judgements to derive numerical valuations. 9 While MACBETH has been used in various settings within and beyond health care, 10,11 MCDA techniques have also been applied successfully in the valuation of HRQOL. 12,13 For the purpose of this study, participants used the MACBETH procedure to elicit scores and weights for the SF-6D dimensions. Discussing the qualitative ratings needed for MACBETH was the basis of both deriving values and participants' reasoning behind their judgement decisions when assessing HRQOL.
This research builds upon studies investigating what participants think when performing valuation tasks for HRQOL. Such studies have applied qualitative approaches such as the think-aloud protocol for various HRQOL measures. 14,15 These analyses, however, focus on data on the individual level of study participants. With the application of deliberation for the valuation of HRQOL, this study goes beyond individuals' reasons for judgement decisions. In brief, deliberation can be defined as non-professional members of the public being educated about a certain topic to then consider it as a group and come to an agreed-upon solution. 16 Deliberation generally entails providing factual information, gathering a representative group of participants and encouraging open and reflective discussions. 17,18 In this article, we argue that deliberative settings are suited to elicit reasoning behind judgement decisions when valuing HRQOL from a public perspective. To evaluate the feasibility of eliciting public reasons on HRQOL valuation through deliberation, the analysis is performed in two steps. First, this article gives a brief overview of the valuation results and examines which reasons participants stated when expressing valuation decisions about SF-6D health states. Second, it investigates whether the deliberative setting in this pilot study was suited to elicit public reasons for judgement decisions on HRQOL valuation. The reasons and their perspective are then discussed in the context of the pilot character of this study.

| Study design
The pilot study was conducted in a 1-day conference and applied a German adaption of the HRQOL measure SF-6D Version 2. The descriptive system used for valuation was derived from the items of the German SF-12 and SF-36 corresponding to the English SF-6D Version 2. To value the SF-6D, deliberation first took place in six small groups which each discussed the scale of one of the SF-6D dimensions: physical functioning, role limitation, social functioning, pain, mental health and vitality. In this so-called scoring procedure, participants gave qualitative judgements on the differences in attractiveness between the levels of their dimension. The matrix completed by the group discussing pain is shown in Figure 1. The deliberative groups filled in the question marks with the differences listed in the column on the right. Participants also received detailed supporting information illustrated in Appendix S1. The small groups' results were then brought into a plenary assembly for validation.

FIGURE 1 Example matrix for the scoring procedure of the dimension pain as presented to the participants. Pain levels shown: no pain, very mild pain, mild pain, moderate pain, severe pain, very severe pain. Note: All fields with question marks refer to the difference in attractiveness between the row and column. These differences were assessed and the corresponding fields filled in by participants.
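To illustrate how qualitative judgements of this kind constrain a numerical scale, the following minimal sketch checks whether a candidate scale preserves the order of the stated differences in attractiveness: a pair judged with a larger category should receive a larger score difference than a pair judged with a smaller category. This is a simplified illustration of the order-preservation idea behind MACBETH and not the M-MACBETH software's algorithm. The function name `is_order_preserving`, the three example levels, the judgements and all scores are hypothetical and are not data from this study.

```python
from itertools import combinations

# MACBETH's qualitative categories, ordered from smallest to largest difference.
CATEGORIES = ["no", "very weak", "weak", "moderate", "strong", "very strong", "extreme"]
RANK = {name: position for position, name in enumerate(CATEGORIES)}


def is_order_preserving(scores, judgements):
    """Return True if the scores respect the qualitative judgements.

    scores:     dict mapping a level name to its numerical score.
    judgements: dict mapping (more_attractive, less_attractive) pairs to a
                category name from CATEGORIES.
    """
    ranked_differences = []
    for (better, worse), category in judgements.items():
        difference = scores[better] - scores[worse]
        if category == "no":
            # "No difference" requires equal scores.
            if difference != 0:
                return False
        elif difference <= 0:
            # Any other category requires the better level to score higher.
            return False
        ranked_differences.append((RANK[category], difference))

    # A larger category must correspond to a strictly larger score difference.
    for (rank_a, diff_a), (rank_b, diff_b) in combinations(ranked_differences, 2):
        if rank_a > rank_b and diff_a <= diff_b:
            return False
        if rank_b > rank_a and diff_b <= diff_a:
            return False
    return True


# Hypothetical judgements for three pain levels (illustration only).
judgements = {
    ("no pain", "moderate pain"): "moderate",
    ("moderate pain", "very severe pain"): "strong",
    ("no pain", "very severe pain"): "extreme",
}

# Hypothetical candidate scales on a 0-100 range (illustration only).
consistent_scores = {"no pain": 100, "moderate pain": 60, "very severe pain": 0}
inconsistent_scores = {"no pain": 100, "moderate pain": 30, "very severe pain": 0}

print(is_order_preserving(consistent_scores, judgements))
print(is_order_preserving(inconsistent_scores, judgements))
```

In the second candidate scale, the "moderate" difference (70 points) exceeds the "strong" difference (30 points), so the check fails. A group facing such a result would need to revisit either the judgements or the scale.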

After eliciting a scale for each of the SF-6D dimensions, the weighting procedure was designed to assign a weight to each dimension. In the plenary assembly, participants were first asked to rate the difference in attractiveness of a change from the worst to the best performance level, one dimension at a time. In a second step, participants were to compare this change between two dimensions on the qualitative MACBETH scale. For all evaluations, participants were asked not to deliberate on their personal preferences but to consider the effects on a self-determined and independent life. This approach was chosen to invoke reasoning from a public perspective following the public value concept suggested by Hausman. 3 After the conference, participants evaluated the conference in a debriefing questionnaire including open questions. Details on the pilot study's evaluation and its methodology are beyond the scope of this paper and can be found in Gansen et al. 8 In addition to the questionnaire given to the study participants, interviews with the facilitators of the deliberations as well as the software operators were conducted after the conference. In these semi-structured interviews, the interviewees were asked to summarize the reasons stated during deliberations and elaborate how they perceived the deliberation.
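How dimension scales and dimension weights can be combined into a single health state value may be illustrated with the additive value model commonly used in MCDA. The sketch below assumes this additive aggregation and is not taken from the study or from the M-MACBETH software. The function name `overall_value`, the restriction to two dimensions, the level labels and all numbers are hypothetical and are not results of this pilot study.

```python
def overall_value(state, scales, weights):
    """Combine dimension scores into one value with an additive model.

    state:   dict mapping a dimension to the level describing the health state.
    scales:  dict mapping a dimension to a dict of level -> score.
    weights: dict mapping a dimension to its weight; weights must sum to 1.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("dimension weights must sum to 1")
    # Weighted sum of the score of each dimension's level.
    return sum(weights[dim] * scales[dim][state[dim]] for dim in weights)


# Hypothetical scales on a 0-100 range for two dimensions (illustration only).
scales = {
    "pain": {"none": 100, "moderate": 60, "very severe": 0},
    "mental health": {"never depressed": 100, "always depressed": 0},
}

# Hypothetical weights (illustration only).
weights = {"pain": 0.6, "mental health": 0.4}

# A hypothetical health state: moderate pain, never depressed.
state = {"pain": "moderate", "mental health": "never depressed"}

print(overall_value(state, scales, weights))
```

Under this model, a dimension's weight determines how much a change from its worst to its best level contributes to the overall value, which is why the weighting procedure asks participants to compare exactly these changes between dimensions.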

| Analysis
The numerical results of participants' valuations were elicited with the software M-MACBETH Version 2.5.0. This software implements the MACBETH procedure and derives numerical scales and weights from the qualitative judgements given. To assess the reasoning behind participants' decisions, a qualitative content analysis (QCA) was performed. The QCA was conducted with transcripts of the plenary sessions and the interviews with the facilitators and software operators. The transcripts were based on audio recordings of both the conference and interviews. These recordings were transcribed verbatim following a uniform standard. 19 The participants' comments in the debriefing questionnaire were not included in the part of the QCA relevant to this article. 8 The QCA was implemented with the software MAXQDA 2018.
The procedure of the QCA followed the approach suggested by

| Implementation
The conference was held on 15 December 2017 from 9.15 am to 5 pm.
Thirty-four students of the University of Bremen participated in the study. Their main characteristics are illustrated in Figure

| Dimension scales
The scales for each of the SF-6D dimensions which were derived by the small groups with the M-MACBETH software are shown in Table 1. In the plenary assembly, these results were introduced and discussed. The plenary assembly agreed on all dimension scales except for that of the dimension pain.
TABLE 1 Numerical scores and weights derived for the adapted German SF-6D Version 2

Regarding the difference in attractiveness which participants assigned to the dimension levels, weak differences were common between the worst two levels of each dimension. Weak and very weak differences were also identified between the highest levels 1 and 2:

In the comparison of "social contacts are rarely limited" and "never limited" we saw a weak difference in attractiveness.

Facilitator of group 'Social Functioning' in scoring session on social functioning
Individual groups chose moderate differences in attractiveness as a compromise when some participants argued for strong and others for weak differences. The assignment 'extreme' was predominantly given to differences in attractiveness across more than one level. In the dimension social functioning, for example, the deliberative group identified an extreme difference between 'limited all of the time' and 'limited a little of the time' or 'none of the time', respectively.

| Dimension weights
In the weighting procedure performed in the plenary assembly,

| Level of impairment and autonomy
One of the most common arguments in participants' judgement decisions was the level of impairment in the health state in question.
This basis of reasoning was summarized in the predominant themes regarding impairment, autonomy and self-determination.
In part, the reasons stated appeared to be specific to the dimen-

It is still possible to lead a self-determined life with "mild" or "very mild pain".

| Adjustment and marginal benefit
Other recurring themes during deliberations were the prospect of tol-

| Expectations and satisfaction
The fulfilment of expectations set both by the individuals living in a health state and those set by society was another approach for explanations. This line of argument included reasoning that certain health constraints, such as having a headache from time to time, were common and socially acceptable. Therefore, these states had little effect on a self-determined and independent life and were classified as less severe. Participants agreed that the assessment of the impact on a self-determined and independent life can depend on both self-expectations and expectations of society. However, in some arguments, it was unclear which of these perspectives participants were referring to. To ensure consistent coding, ambiguous statements such as the following were coded in the subcategory 'Self-expectation':

On the other side this could be about the meaning of life. If you have the feeling that you're not living up to expectations this could also be problematic. But this would also reflect on mental health.

Participant in weighting session
Another theme identified in the participants' arguments was how satisfied individuals are despite their health constraints. One participant argued that always being in severe pain overshadows everything and no longer allows joy of life. A further theme was the argument that some dimensions and their assessment were connected to the fulfilment of basic needs:

We also thought that this fatigue is somehow part of the basic functioning of the body. So is it able to do anything that day in the first place?

Participant in scoring session on vitality

In any case, I would subjectively and intuitively say that qualitatively, there is a very big difference between going from "very little" to "no pain" and going from "very severe pain" to "severe" or vice versa, namely that the gap between "severe" and "very severe pain" should be greater.

| DISCUSSION
To discuss the feasibility of eliciting public reasons for valuation decisions on health states using deliberation, the following section focuses on three key issues: the main themes on reasoning identified in the deliberations, the perspective of these reasons and the suitability of a deliberative setting to elicit public reasons. A detailed discussion of the methodology applied in the pilot study is presented in Gansen et al. 8

| Reasons behind valuations
The results of this pilot study demonstrate that participants engaged bers, 23 social circumstances and effects on family and friends, 24 and ability to achieve goals and anticipated adaptation. 25 Taken together, preceding research has identified several non-health consequences connected to personal and social circumstances that influence participants' health state valuations. 14,15 Some of these rationales, such as enjoyment, can be recognized in the reasons stated in this pilot study. Other non-health consequences, such as ability to take care of oneself or the effect on activities or relationships, form part of the SF-6D descriptive system. This circumstance could explain why some reasons are specific to the dimensions valued. It could also offer an explanation as to why the impact on others was not identified as a separate theme in this study. When comparing our findings to earlier qualitative research, it is important to note that other studies were based on different HRQOL measures. Moreover, participants of the comparative studies were asked for their personal preferences and not, as implemented in this study, for assessments from a public perspective.

| Perspective of reasoning
Despite the fact that it was generally possible to identify underly- These are open questions that should be addressed with further research on the difference in public and private reasons in deliberative settings that task participants to argue from a public perspective.

| Suitability of deliberation
To assess the suitability of deliberation to elicit public reasoning, it is important to evaluate whether the deliberative setting fulfills the requirements of deliberations. Taken as a whole, the pilot study implemented the intended aspects of deliberation. This excludes the aspect of a representative group of participants as this was not a focus of the pilot study. Yet, the study did gather non-professional members of the public who learned about a specific topic, the valuation of HRQOL, with factual information and engaged in group consideration and open, reflective discussions. 16-18 One criterion of deliberations that was not fulfilled was the aspect of coming to an agreed-upon solution. 16 It also reveals areas in which participants lack understanding and provides the support needed to manage complex valuation tasks.
As such, deliberation offers a setting to explore and address concerns that preferences on health states are not fully informed. 26 To better inform participants about the consequences of health states, deliberations could be extended to include experts such as patient representatives and HRQOL specialists. It is important to note, however, that expert opinions could, in turn, bias participants in their judgement decisions.

| Limitations
Overall, the findings of this study indicate that deliberation is a can be based on citizens' juries and use stratification. 27 Another limitation was the use of a translated version of the SF-6D which was not validated externally before its application. As the phrasing of the descriptive system is important to its valuation, the pilot results have to be seen against this restricting background.
Notwithstanding the potential for inaccurate wording, the deliberative setting gave participants the opportunity to discuss and agree on their understanding of the descriptive system.
Deliberative valuations beyond a pilot phase, however, would need to apply a validated HRQOL measure as well as consider issues such as anchoring and duration of health states. Finally, the deliberative procedure was only roughly structured and mainly constructed around the requirements of the MCDA method MACBETH. As such, the effect of deliberation cannot be separated from the methodology chosen for valuation. Future deliberative valuations should therefore attempt to standardize the deliberative procedure and investigate the effect of public deliberation on health state values.

| CONCLUSIONS
This pilot study presents a novel approach for deriving reasoning behind value judgements on HRQOL in a deliberative setting.
While no consented value set was derived, participants deliber-

ACKNOWLEDGEMENTS
First, we would like to thank Wolf Rogowski for supervising the research project and contributing to its design and implementation.
We also thank Lisa Baumann and Madlen von Fintel for transcribing the conference sessions and are grateful to Eugenia Larjow and Madlen von Fintel for supporting the qualitative content analysis.
Finally, we would like to thank the students who participated in the pilot study as moderators, operators and participants for their valuable contributions.

CONFLICT OF INTEREST
The authors declare that they have no competing interests. Julian Klinger is employed by the digital health company Newsenselab GmbH. However, his contribution to this project was largely made while working as a research assistant at the University of Bremen.
Newsenselab was not involved in and did not provide funding for this study.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.