How lay people understand and make sense of personalized disease risk information

Abstract
Background: Disease risk calculators are increasingly web‐based, but previous studies have shown that risk information often poses problems for lay users.
Objective: To examine how lay people understand the result derived from an online cardiometabolic risk calculator.
Design: A qualitative study was performed, using the risk calculator in the Dutch National Prevention Program for cardiometabolic diseases. The study consisted of three parts: (i) attention: completion of the risk calculator while an eye tracker registered eye movements; (ii) recall: completion of a recall task; and (iii) interpretation: participation in a semi‐structured interview.
Setting and participants: We recruited people from the target population through an advertisement in a local newspaper; 16 people participated in the study, which took place in our university laboratory.
Results: Eye‐tracking data showed that participants looked most extensively at numerical risk information. Percentages were recalled well, whereas natural frequencies and verbal labels were remembered less well. Five qualitative themes were derived from the interview data: (i) numerical information does not really sink in; (ii) the verbal categorical label made no real impact on people; (iii) people relied heavily on existing knowledge and beliefs; (iv) people zoomed in on risk factors, especially family history of diseases; and (v) people often compared their situation to that of their peers.
Discussion and conclusion: Although people paid attention to and recalled the risk information to a certain extent, they seemed to have difficulty in properly using this information for interpreting their risk.


| INTRODUCTION
Informing people about their personal risk plays a key role in the prevention of lifestyle-related diseases, such as cardiovascular diseases (CVD), diabetes and chronic kidney disease (CKD). [1][2][3][4] For example, clinical guidelines in different countries stipulate that general practitioners practise cardiovascular risk assessment and communication. 5,6 Online personalized risk calculators are increasingly being used in this context, often as a first step in prevention programmes. 7,8 These risk calculators provide people with personalized information such as the risk factors (eg age, smoking, BMI) that modify their susceptibility, with numerical information about their likelihood of developing the illness within a particular time frame and with advice on how to reduce their risk. People are expected to use this information and obtain insight into their risk, thereby enabling them to make informed health-related decisions, which would ultimately improve population health. 9,10
Precisely how an individual's personalized risk resulting from an online risk calculator should be communicated has become a crucial question. 8,11,12 Although some risk formats (eg natural frequencies and some graphical formats in addition to numerical information) in general seem to evoke better risk understanding than other formats (eg percentages only), [13][14][15] it remains unclear how the provided risk information supports an individual's understanding of their risk. Previous user tests have shown that risks presented as percentages in risk calculators often have unclear or ambiguous meaning for end-users, even when accompanied by graphical information. 8,16,17 Other more general problems revealed by such user tests are that the risk message does not necessarily match the individual's existing beliefs and expectations about risk factors, and that, perhaps partly as a result of this, many end-users with relatively high risks tend to undervalue or normalize their risk. 17,18 Such problems are particularly urgent as many people, not only those with lower educational levels, have poor health literacy and numeracy skills, 19,20 thereby placing them at a higher risk of misinterpreting information and making non-informed decisions. 15 It is therefore important to investigate how end-users of risk calculators make sense of their risk result and to improve risk communication accordingly.
To date, little qualitative work has been performed to investigate exactly how people understand risk information in risk calculators. 4,8,17 Most of this research has employed think-aloud protocols and/or user evaluations, but it can be questioned whether these methods fully capture how people understand risk information. User evaluations typically investigate the user-friendliness of information from the perspective of end-users themselves, which does not give a more "objective" assessment of how people understand information. 14 Think-aloud protocols provide insight into the thought processes of people who use information 21,22 ; although this can be useful in assessing how people understand the provided information, the method does not necessarily capture how people subsequently utilize this information to interpret and make sense of their risk. We therefore adopted a novel qualitative approach that followed different essential phases in the process of understanding risk information provided in a risk calculator. The aim of this study was to examine how lay people understand the result from an online cardiometabolic risk calculator using eye tracking, a recall task and qualitative post-test interview questions. We assumed that in order to understand their personal disease risk in a risk calculator, people have to (i) pay attention to essential information; (ii) be able to recall this essential information; and (iii) use this essential information in their risk interpretations. Previous qualitative studies did not adopt an approach that specifically focused on these phases in the process of individual comprehension and interpretation, but rather focused on general reactions to provided information.

| Study design
Our case study involved the online risk calculator that is part of the Dutch National Prevention Program for CVD, type 2 diabetes and CKD. This risk calculator, which is the first step in the prevention programme, can be used by general practitioners to identify high-risk individuals. General practitioners can invite patients between 45 and 65 years of age to fill out the risk calculator at home for a first risk estimation based on sex, age, smoking status, BMI, waist circumference and family history of type 2 diabetes and CVD. 23 Only individuals whose test results reveal an elevated risk are advised to see their general practitioner for further screening. We performed a qualitative study consisting of three parts, corresponding to the assumed essential phases in the process of understanding risk information.

1. Attention: participants completed a risk calculator while an eye tracker (TOBII) registered their eye movements; the interviewer did not intervene in this phase.
2. Recall: participants were provided with a recall task after they had completed the risk calculator, assessing their recall of different parts of the risk information.
3. Interpretation: semi-structured questions were posed during a 30-minute interview, focusing on participants' subsequent risk interpretations.
We assumed that in order to understand their risk, people should, at a minimum: (i) pay attention to some of the numerical information to get an idea of the size of their risk 24 and also of the verbal categorical label, bar graph or comparative risk information (the risk of someone of the same age without risk factors) that form part of the risk communication to provide intuitive or "gist" meaning of the number 25 ; (ii) recall the size of their risk (ie in numbers) and some of the information aimed to provide intuitive meaning, for example the verbal label; and (iii) use this information as well as information about qualitative dimensions of risk (ie their personal risk factors, the controllability of their risk 26,27 ) to interpret and make sense of their risk result.

| Recruitment and sample characteristics
We recruited people from the target population of the prevention programme (people aged between 45 and 60 years without a medical history of type 2 diabetes, CVD and CKD) through an advertisement placed in a freely distributed local newspaper. This advertisement mentioned that the study would focus on people's opinions about health websites. A total of 21 people responded and were provided with further details. All 21 were initially willing to participate; 16 of them actually participated. The participants' characteristics are presented in Table 1.

| Procedure
Participants were interviewed at the VU University. To make sure the participants would feel comfortable, the interviews were conducted in an attractive laboratory setting that was especially designed to facilitate laboratory research in a realistic environment. The interviewer (NB) informed participants about the online risk calculator and the aim of the interview and then instructed them about the use of the eye tracker and asked permission to audiotape the interviews.
After participants had provided written consent, the interviewer started the online risk calculator on the computer screen. Participants completed the risk calculator while an eye tracker registered their eye movements and fixations (part 1); in this phase, the interviewer sat a few metres behind the participant and did not intervene; she viewed the eye movements on a second screen. After the participant had completed the risk calculator, the interviewer provided the participant with a recall task (part 2, see Section 2.4). Next, the interviewer conducted the semi-structured interview using an interview guide (part 3, see Section 2.4). Finally, participants' socio-demographic characteristics (sex, age, educational level, language spoken at home), subjective numeracy 28 and subjective health literacy 29 were assessed in a short survey. Participants were thanked for their participation and were given a small financial reward (€20).

| Materials
The online risk calculator communicates people's personalized risk in different formats on a single web page (Figure 1): a percentage (eg your risk is 14%), a natural frequency (eg 14 of 100 men/women like you will develop the diseases within 7 years from now), a bar graph, a categorical verbal label (eg your risk is "slightly elevated") and comparative risk information (eg the risk of someone your age without risk factors is 10%). Information about the risk factors contributing to people's personal risk was provided on another web page. In part 2, we used a recall task that assessed participants' recall of the personalized risk information as provided in Figure 1. A blank hard copy page of this web page was provided to participants (Figure 2) and they were asked to fill out their test results, that is the percentage, the natural frequency, the categorical verbal label and the statement on the right side of the bar graph.
Part 3 used an interview guide specifically designed to let participants elaborate on their risk interpretations and to make sense of their test result. We first asked questions about how people perceived their risk after receiving their test result. Examples included: "How do you interpret your risk of getting type 2 diabetes, CVD and CKD?" "How likely do you think you are to develop one of these diseases and why?" In the second part, we again provided people with their personalized risk on the computer screen and compared it to people's answers to the recall task. We asked participants explicitly about perceived difficulties in completing the recall task and then together reviewed the different parts of the risk communication. Examples of interview questions were as follows: "What were your thoughts on seeing your risk percentage?" "How do you feel about having a risk that is elevated/ slightly elevated/not elevated?"
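The numerical formats described in this section all restate a single absolute risk. As a minimal sketch, assuming a whole-number percentage and the 7-year time frame from the example above, the following Python function renders one risk value as a percentage, a natural frequency and a comparative statement. The function name `describe_risk` and its parameters are hypothetical and do not come from the risk calculator itself. The verbal label is left out on purpose, because the cut-off values behind "elevated", "slightly elevated" and "not elevated" are not reported here.

```python
def describe_risk(risk_pct: int, reference_pct: int, years: int = 7) -> dict:
    """Restate one absolute risk in three numerical formats.

    risk_pct      -- the person's own risk as a whole-number percentage
    reference_pct -- risk of someone of the same age without risk factors
    years         -- time frame of the risk estimate (7 in the example text)
    All names are illustrative assumptions, not the calculator's own API.
    """
    for value in (risk_pct, reference_pct):
        if not 0 <= value <= 100:
            raise ValueError("risks must be percentages between 0 and 100")
    return {
        # Percentage format, eg "your risk is 14%"
        "percentage": f"Your risk is {risk_pct}%",
        # Natural frequency: the same number expressed as k out of 100 people
        "natural_frequency": (
            f"{risk_pct} of 100 people like you will develop the diseases "
            f"within {years} years from now"
        ),
        # Comparative risk: a peer of the same age without risk factors
        "comparative": (
            f"The risk of someone your age without risk factors is {reference_pct}%"
        ),
    }


formats = describe_risk(14, 10)
for name, text in formats.items():
    print(f"{name}: {text}")
```

The sketch makes explicit that the percentage and the natural frequency carry identical information, which is relevant to how participants treated the two formats in the recall task.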

| Data analysis
All interviews were audio-recorded and transcribed verbatim. We ana-  32 In making inferences about the eye-tracking patterns in relation to risk understanding, we compared the eye-tracking data to the recall data and the qualitative themes.
Analysis of the interview data in MAXQDA followed the phases of thematic analysis as described by Braun and Clarke. 33 Initially, three researchers (NB, OD and MH) read all transcripts. Observations made by the interviewer (NB) during the interview about how respondents interpreted and used the risk information were discussed. Subsequently, all interviews were coded by two researchers independently (11 were coded by NB and OD, and five were coded by NB and MH).
First, the interviews of two different participants were each coded openly by NB and OD (ie we assigned preliminary codes to text fragments). This open coding meant that we had no pre-existing coding or classification scheme. After discussing these codes, the three researchers further coded the remaining interviews while being able to access the codes of the first two interviews in an Excel sheet. When differences between the codes of the different researchers occurred, these were discussed in consensus meetings in which the two researchers participated; sometimes, the code of one of the researchers was adopted, and sometimes, a new code was created. In all cases, we were able to reach consensus. Next, the data were axially coded (integrating codes in broader related concepts), which resulted in a hierarchical list of codes. Based on this, initial themes were defined by two researchers (NB and OD). These themes were discussed with the other members of the research team (MH and DT) and the eye-tracking and recall data were used to further refine the themes where necessary.

a Based on the eight subjective numeracy items developed by Fagerlin et al. 28 All questions use 6-point Likert-type scales with endpoints as marked (1-6). A higher score indicates a higher subjective rating of numeracy abilities and preferences.
b Based on the three subjective health literacy screening items developed by Chew et al. 29 : (i) "How often do you have someone help you read hospital materials?" (ii) "How confident are you filling out medical forms by yourself?" and (iii) "How often do you have problems learning about your medical condition because of difficulty understanding written information?". Health literacy was classed as inadequate if a participant gave an answer other than "never" on items 1 or 3 and/or an answer other than "extremely" or "quite a bit" on item 2.
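The classification rule in note b can be written as a short decision function. The sketch below is one possible reading of that rule in Python: the function name and argument names are hypothetical, answers are assumed to arrive as plain text labels, and "and/or" is read as a logical OR across the three items.

```python
def has_inadequate_health_literacy(
    help_reading: str, confident_forms: str, problems_learning: str
) -> bool:
    """Apply the screening rule from note b to one participant's answers.

    help_reading      -- answer to item 1 (help reading hospital materials)
    confident_forms   -- answer to item 2 (confidence filling out forms)
    problems_learning -- answer to item 3 (problems learning about condition)
    Argument names and text-label inputs are assumptions for illustration.
    """
    def norm(answer: str) -> str:
        # Make the comparison insensitive to case and stray whitespace
        return answer.strip().lower()

    # Items 1 and 3: any answer other than "never" counts as a flag
    item1_flag = norm(help_reading) != "never"
    item3_flag = norm(problems_learning) != "never"
    # Item 2: any answer other than "extremely" or "quite a bit" is a flag
    item2_flag = norm(confident_forms) not in {"extremely", "quite a bit"}

    # "and/or" in the rule: a single flag is enough
    return item1_flag or item2_flag or item3_flag


print(has_inadequate_health_literacy("never", "quite a bit", "never"))
print(has_inadequate_health_literacy("sometimes", "extremely", "never"))
```

Under this reading, a participant is classed as having adequate health literacy only when all three answers fall in the most favourable categories.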

| RESULTS
We will first describe the attention paid by people to the information, followed by their recall of that information. Next, people's risk interpretations are described by means of five qualitative themes.

| Attention for information
We used the eye-tracking data of 11 participants; for the remaining five participants, the data were of insufficient quality due to failed calibration. Note that this only concerned the eye-tracking data; the recall and interview data were complete and fully analysed for all 16 participants. Participants looked most extensively at the numerical risk information and also, to some extent, at the verbal label. Participants particularly looked at the natural frequency in detail: many participants read this information 2-3 times, as the gaze plots showed, which may suggest that this was hard to understand. They paid less attention to the bar graph, which is an important result because the bar graph was explicitly meant as a graphical aid to provide intuitive meaning to the information. Participants did look at the comparative risk information, which was another attempt to provide such intuitive meaning (bottom of bar graph), although not very extensively. Notably, participants rarely read their own risk off the bar precisely; although most participants did look at the percentage displayed next to the bar, they did not place it on a scale from 0% to 100% (see the examples in supplementary file 2).
Table 2 presents the information recall of all participants. Percentages were overall remembered well, whereas natural frequencies and verbal labels were remembered less well. Only one person adequately recalled the statement at the bar graph ("visit your GP").

| Risk interpretation
We identified five qualitative themes related to how people interpreted and made sense of their personalized disease risk information.
The next section describes these themes and their corresponding subthemes, illustrated by respondents' quotes in Table 3.

| Theme 1: numerical information does not really sink in
We found that many participants struggled to adequately comprehend and recall the numbers provided, including the probability information (subtheme 1a). Furthermore, even if people did focus to some extent on the numerical information, for example as a result of probing questions, many participants tended to undervalue these numbers and seemed to interpret the risk as less severe than medical experts would (subtheme 1b), indicating that the numerical information did not fully sink in. Theme 1 was related to themes 3 and 4 in the sense that people did not rely heavily on the numerical information in making sense of their risk, but rather on other aspects such as information about risk factors (Theme 4) and beliefs about a number of topics (Theme 3). Eye-tracking data and recall data showed that participants did look at and remember numerical information, which suggests that participants did process this information to some extent.
Subtheme 1a: struggling to comprehend and recall numerical information
Many people struggled with comprehending and recalling the provided numerical information correctly. This became clear from post-test interview questions about how people interpreted the numerical information, but also from people's responses to the recall task. Some participants tried to reconcile the percentage and the frequency of the probability information, which often proved hard for them. In addition, probability information was also confused with other numerical information, such as the time frame of the risk (7 years) and people's BMI. It was noticeable, in this respect, that participants did adequately recall the risk percentage.

Subtheme 1b: risk undervaluation
It also became clear that participants perceived relatively high risks (eg ranging from 12% to 23%) as rather low, and saw no reason for worry. This seemed to occur partly because of difficulties in interpreting numerical information (subtheme 1a), but also because people used their own risk factors in risk interpretations more than the size of the risk (subtheme 3a).
People also spontaneously talked about reasons why they felt that their risk was actually different from what was communicated in the test, causing them to downplay the size of their risk. For example, some participants believed that because they felt healthy, the risk that was communicated to them was an overestimation of their actual risk. Another reason why risk undervaluation might have occurred can be ascribed to the way the risk was presented; for example, the bar graph ascended all the way up to 100%. Although a bar graph that goes up to 100% is of course an adequate way to present percentages, risks of 15% or 20%, which would be considered severe risks by experts, can seem minor because they are in the lower part of the bar.
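The scale effect described above is simple arithmetic: on a bar that runs from 0% to 100%, the filled part is proportional to the risk, so a 20% risk fills only one fifth of the bar. The following Python sketch draws such a bar as text. It is an illustration only, with a hypothetical function name and an arbitrary default bar width of 50 characters, and it does not reproduce the actual bar graph in Figure 1.

```python
def risk_bar(risk_pct: float, width: int = 50) -> str:
    """Draw a text bar from 0% to 100% with the risk shown as the filled part.

    risk_pct -- absolute risk as a percentage (0-100)
    width    -- total bar length in characters (an arbitrary assumption)
    """
    if not 0 <= risk_pct <= 100:
        raise ValueError("risk_pct must be between 0 and 100")
    # Filled length is proportional to the risk on the full 0-100% scale
    filled = round(width * risk_pct / 100)
    return "0% |" + "#" * filled + "." * (width - filled) + "| 100%"


# A 20% risk occupies only the lowest fifth of the bar
print(risk_bar(20))
```

Printed this way, most of the bar stays empty even for a risk that experts would regard as severe, which is consistent with the interpretation offered above.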

| Theme 2: the verbal categorical label made no real impact on people
Like the numerical risk information, the verbal categorical risk label (either "elevated," "slightly elevated" or "not elevated") made no real impact on participants. From an expert/epidemiological perspective, these labels are important information because they form the basis for deciding who needs further screening and who does not. However, our participants did not seem to regard this label as an essential element of their test result and recall data showed that many did not recall the label. Several participants with a "slightly elevated risk" who did not recall their verbal label correctly thought their label was "minor" or "small". This might be related to the fact that people tended to undervalue their risk (subtheme 1b).

| Theme 3: people relied heavily on existing knowledge and beliefs
We found that participants primarily relied on their existing knowledge and beliefs to interpret their risk, rather than on the actual risk information provided (see themes 1 and 2). Many participants used their knowledge about risk factors to make sense of their own risk (subtheme 3a). Many participants also used their perceived physical complaints (or lack thereof) to judge their susceptibility, rather than the size of the risk as communicated in the risk calculator (subtheme 3b). Related to subthemes 3a and 3b, a third subtheme was that people's perceived susceptibility also depended on beliefs about, or "images" of, the diseases (subtheme 3c).
Many participants also used their perceived complaints (or lack thereof) to judge their susceptibility. It seemed that many participants believed that as long as they felt healthy and were free from medical/physical complaints, their risk would be rather low. The risk information did not affect these beliefs and perceptions.
Subtheme 3c: reliance on knowledge and beliefs about the diseases
Participants' perceived susceptibility also partly depended on existing beliefs about the diseases in the risk calculator, which were, in turn, related to how familiar these were to people. In general, participants thought that cardiometabolic diseases were not very severe, for example in comparison with cancer. Several participants also indicated, without being prompted, that they had no clear picture of cardiometabolic diseases, especially of type 2 diabetes and CKD.

| Theme 5: people often compared their situation to that of their peers
Participants spontaneously compared their risk to that of peers, to make sense of their own risk result. They had stereotypical beliefs about "which people have an elevated risk" and they compared their own risk to these stereotypes to judge the severity of their own risk.
This tendency seemed to be related to risk undervaluation (subtheme 1b), although not all participants who compared their risk to the risk of (hypothetical) other people undervalued their own risk.
While participants were curious to know the risk of others, our eye-tracking data showed that they hardly looked at the comparative risk information available in the bar graph (supplementary file 2).

| DISCUSSION
This study aimed to examine how lay people understand and make sense of the result from a disease risk calculator. We combined eye tracking with a recall task and qualitative post-test interview questions in a qualitative study using a Dutch cardiometabolic risk calculator. Our findings showed that when making sense of their risk, people did not make extensive use of the risk information provided.
Neither the numerical risk information nor the categorical verbal labels seemed to make a real impact on people's risk perceptions and interpretations, although they did look at and recall this information to a certain extent. Instead, people primarily relied on existing knowledge and beliefs, for example about the presence or absence of risk factors or about the severity of diseases.
The finding that participants relied so heavily on existing knowledge and beliefs and that, as a result, their perceptions and interpretations seemed hardly affected by the information provided was notable. Although research about this topic is scarce (see the review by Sheridan 2 ), previous studies have shown that giving people information about their cardiovascular disease risk can alter their risk perceptions. One explanation for our finding that the risk information hardly affected people's perceptions might be that the risk communication from our case example was suboptimal, and did not effectively guide end-users to essential information about the size and severity of their risk. We did find that people attended to and recalled aspects of the provided risk information, most notably the risk percentage, but this did not always occur without difficulty. Although we do not know for sure why so few participants filled in the natural frequency in our recall task (eg it might be that they believed that they did not have to fill in this frequency because it is essentially the same information as the percentage), it was noticeable that this natural frequency was poorly recalled and discussed by participants. Overall, indeed, people had difficulties with interpreting and providing meaning to the numerical information, a result which has previously been found in the general field of risk communication 34 as well as in studies that specifically focused on online disease risk calculators. 4,17 It was noticeable that people made limited use of the verbal categorical label and the graphical bar chart, which are formats explicitly intended to provide intuitive meaning to the numbers. In addition, the comparative risk information was often neglected, while it obviously has the potential to provide meaning to people's personalized risk. 35 Graphical risk formats do have the potential to attract end-users' attention 35 and to support their understanding. 14,36 We can only speculate about the reasons why our participants discounted this information. It might be that people themselves did not experience any problems in comprehending the percentage, and therefore found it unnecessary to also view the bar graph that again emphasized the percentage.
symptoms in general. New information about the size of cardiometabolic risk is probably interpreted in the light of already existing beliefs or "mental models".
One particularly salient aspect of people's existing beliefs that might be interesting in the light of better risk communication was having a family history of diseases. Although the risk calculator provided reliable information by using single enquiries for family history, our participants were rather sceptical about this "small set" of questions. Furthermore, participants talked a lot about family history in their answers to interview questions about their risk interpretations, which indicated that family history largely influenced their perceived susceptibility. A similar finding has been previously reported in the context of a breast cancer risk calculator. 18 A more detailed family history assessment might lead to better use of the risk information, because it directly increases the perceived relevance of information and, by doing so, probably also increases people's motivation to process the information. 42 A previous study also revealed that a detailed familial risk questionnaire contributed to users' risk acceptance and motivation to adopt healthier lifestyles among people with a positive family history. 43 An important caveat is that putting more emphasis

| Strengths and limitations
Our study might be limited by the fact that participants were all people who were interested in health websites, as we recruited them through an advertisement that mentioned health websites. A further limitation was that the eye-tracking data of several participants (N=5) were of insufficient quality due to failed calibration. We should thus be cautious in drawing firm conclusions about people's attention to risk information. Another limitation is that we did not ask participants about their knowledge or basic perceptions of risk before they completed the risk calculator. Had we done so, we might have gained more insight into how pre-existing beliefs influenced people's risk interpretations. An important strength of this study is that we used a novel approach to test how people interpret and make sense of their disease risk, that is a combination of eye tracking with a recall task and qualitative post-test interview questions. This approach provided us with valuable insights into how people used the risk information from the risk calculator and how they interpreted their risk of cardiometabolic diseases.

| CONCLUSION
Although people pay attention to and recall risk information in an online risk calculator to a certain extent, they do not seem to use it optimally in their risk interpretations. Risk communication in an online disease risk calculator could be improved by building on people's existing knowledge and beliefs (eg about risk factors such as family history of diseases), by providing clear, more elaborate information about the diseases and by using alternative graphical formats of numerical risk.