Measuring recall of medical information in non‐English‐speaking people with cancer: A methodology

Abstract

Background: Many patients who require an interpreter have difficulty remembering information from their medical consultations. Memory aids such as consultation audio‐recordings may be of benefit to these patients. However, there is no established means of measuring patients’ memory of medical information.

Objectives: This study aimed to develop a method for eliciting and coding recall of medical information in non‐English‐speaking patients.

Design: This method, called Patient‐Interpreter‐Clinician coding (PICcode), was developed in the context of a phase II trial conducted in two outpatient oncology clinics in Melbourne, Australia, and was refined iteratively through consultation with an expert panel and piloting. Between‐coder differences in early versions of the coding system were resolved through discussion and consensus resulting in refinements to PICcode.

Results: The final version of PICcode involved transcribing, translating and coding of audio‐recorded consultations and semi‐structured interviews (SSI). The SSIs were designed to elicit patients’ free‐recall of medical information. Every unit of medical information in the consultation was identified and categorized in a coding tree. SSIs were coded to identify the extent to which information was recalled from the consultation.

Discussion: The iterative changes involved in developing PICcode assisted in clarifying precise details of the process and produced a widely applicable coding system. PICcode is the most comprehensively described method of determining the amount of information that patients who use an interpreter recall from their medical consultations. PICcode can be adapted for English‐speaking patients and other healthcare populations.


| INTRODUCTION
Many patients have difficulty remembering information from their medical consultations, which can negatively impact their understanding of their medical condition, adherence to treatment, management of side-effects and subsequent outcomes. [1][2][3][4][5][6] Various interventions have been proposed to help patients remember medical information including written summaries, audio-recordings, question prompt lists and clinician communication training. 7 Patients who require an interpreter face additional communication challenges and experience difficulties communicating with their healthcare team. 8,9 Memory aids may therefore be of particular importance to these patients. Systematic reviews of the literature suggest that providing English-speaking patients with an audio-recording of their consultation improves their memory of information provided in the consultation. [10][11][12][13][14] Two studies have piloted the provision of consultation audio-recordings to non-English-speaking migrant patients. 15,16 These studies focused on the patients' perceptions of receiving the audio-recording and, while both found that patients valued receiving it, neither study measured the impact of the audio-recording on patients' memory of information given in the consultation. While subjective perceptions of value are useful, evidence of the effectiveness of communication interventions is needed before recommendations for more widespread implementation can be made.
There are no widely accepted, established means of measuring patients' memory of medical information, and no detailed methodological account has been published. A variety of studies measuring memory in English-speaking patients have emerged from diverse areas of health care, including genetic counselling, 17 surgery, 3 psychiatry, 5 gastroenterology, 1 health behaviour 18 and oncology. 19 These studies vary considerably in the type of memory assessed and the measurement methods used. Some measure how much patients remember under usual care conditions and others measure the influence of a new intervention or approach. Most measure what is remembered from a consultation, 20,21 while others investigate patients' memory of a therapy, 5 test results 6 or health services. 22 The lack of a consistent measurement tool has hindered efforts in this area. An established measurement process would allow for comparison between interventions, greater consistency between trials and generalizability of results.
Memory retrieval can be broadly categorized into two types: recall and recognition. When evaluating the impact of an intervention, it is important to define the type of memory retrieval being measured as they differ in mechanism and difficulty. Recall involves the retrieval of information from memory without prompting (free recall) or with little prompting (cued recall); while recognition is entirely prompted and involves the identification of a piece of information as either familiar or novel. 23 It is harder to recall something than to recognize it, and information recalled by patients tends to differ from the information they recognize. 4,24 A patient's ability to freely recall information is somewhat harder to quantify than their ability to recognize information. Measuring recall requires the researcher to elicit the remembered information from the patient without prompting. The researcher must then identify each item of information provided and determine which ones were recalled. Although harder to measure, the act of recalling information is arguably a more accurate representation of the lived experience of remembering medical information, as patients will often need to remember information without prompting.
In general, there are three steps involved in measuring how much a patient remembers from their consultation: (i) determine the content of the consultation, (ii) determine what the patient remembers from the consultation, and (iii) compare what the patient remembers with the genuine content of the consultation. Many previous studies have used audio- or video-recordings to determine consultation content, 1,4,17,[19][20][21][25][26][27][28][29][30][31] or a combination of recordings and subjective measures such as clinician report. 32,33 Alternatively, the information delivered has been transcribed verbatim by an observer during the consultation; 3 or a checklist has been provided to the clinician to ensure that the same information is delivered to each patient. 34 An audio- or video-recording of a consultation provides the most objective record of the content of the consultation. This recording must then be analysed to label and count the information discussed.
Dunn et al 19 adapted the Roter Interaction Analysis System (RIAS) 35,36 and coded the audio-recorded content of the consultation into units of information, with each unit defined as "a segment of speech from the doctor expressing a single idea concerning medical issues." Defining units of information using this methodology provides a quantitative summary of consultation content which can then be compared to what the patient remembers from the consultation.
Previous studies have varied in how they measured patients' memory of consultations. Some studies in which the target consultations were highly similar for each patient (eg, chemotherapy education consultations) utilized a standard consultation content questionnaire with multiple choice response options. 27,37,38 Other studies have used a questionnaire template and personalized this for each patient according to the content of their consultation. 1,17,20 The multiple choice sections of these questionnaires measure recognition, while other, more open-ended questions give a measure of cued recall. Both of these approaches limit the consultation types to those where the content is highly similar or consistent between patients. Most studies involving consultations with highly varied content have interviewed the patients post-consultation. 19,21,26,28,33,[39][40][41]
The three steps outlined above also apply to non-English-speaking patients; however, further considerations are necessary with this population. Consultations in which interpreters are involved differ from English-only consultations because all speech needs to be interpreted into another language. Competent, trained interpreters will interpret accurately, but interpretation is an extremely difficult task. There is the risk that nuance of meaning can sometimes be lost, especially when interpreters are not adequately trained. [42][43][44] The consultation as perceived by a non-English-speaking patient is reflective of what is spoken by the interpreter, which may be subtly different from what was spoken by the clinician. However, patients who utilize the services of an interpreter will vary in their ability to understand English and so some may understand parts of what is said in English by the doctor. In such cases, the genuine content of the consultation is less straightforward to determine: Is it reflective of the clinician's input, or the interpreter's interpretation, or a mixture of both? Additionally, determining what the patient remembers requires bilingual research staff, or some other interpretation and translation solution. Any methodology for determining non-English-speaking patients' recall of consultation content must therefore account for these additional variables.
The current study aimed to develop a method for eliciting and coding recall in non-English-speaking patients. This method, called Patient-Interpreter-Clinician coding (PICcode), involved coding the medical information provided in an oncology consultation where an interpreter was utilized, eliciting the patient's recall of that consultation, and coding the extent of their recall of the information provided in the consultation. This study was conducted in the context of a phase II trial that evaluated the feasibility and acceptability of providing a consultation audio-recording to non-English-speaking patients diagnosed with cancer, the findings of which are in preparation for publication.

| METHODS
The PICcode recall elicitation and coding system was developed iteratively (see Figure 1) using several approaches, including review of existing methods in the literature; piloting; and consultation with experts in consultation audio-recording, psychology, oncology, coding system development and interpreting/translating (PS, MJ, TH, SH, PB).

| Context of development of PICcode
PICcode was developed in the context of a phase II randomized controlled trial (RCT), which was conducted in two outpatient oncology clinics [...] vision or speech impaired; self-identified as non-literate; or had a diagnosed cognitive or psychological disorder that would preclude participation. All interpreters were professionally accredited and were employed in-house by the hospital or through a specialist agency. All study information and materials were provided to participants in their own language.

FIGURE 1 Development of PICcode coding system
Mid-treatment review appointments were not included, because a phase I pilot found them to be not as information-dense as other consultations. 16 All other consultations with an oncologist were included.
Participants had one consultation audio-recorded using a Dictaphone. The attending interpreter, clinician and patient's family gave prior consent. Audio-recordings were given to participants on USB or CD. Semi-structured interviews (SSIs) were used to elicit participants' recall of the information given in the consultation. Participants completed the SSI via telephone 2 weeks after their audio-recorded consultation. The SSIs were conducted by bilingual research assistants (RAs). The bilingual RAs were not aware of the content of the audio-recorded consultation.

| Translation of consultations and SSIs
The consultations and SSIs were conducted in English and Arabic, Cantonese, Greek or Mandarin. All data were translated into English prior to analysis so that coding of all languages could be completed by one coder to maintain consistency. Prior to translation, consultation and SSI audio-recordings were transcribed verbatim by bilingual RAs in the languages in which they were spoken, using the alphabet most common to that language (eg, Mandarin transcribed in simplified Chinese characters, and Cantonese in traditional Chinese characters), and in accordance with published recommendations. 45 All transcriptions were reviewed by a second bilingual RA to ensure accuracy before being translated into English by professional, tertiary-trained translators accredited by the National Accreditation Authority for Translators and Interpreters. 46 The coding process for a particular participant did not commence until both their consultation and SSI had been translated.

| Development of PICcode
Relevant literature was reviewed and used to capture existing recall elicitation and coding methods. Searches were not limited to oncology, nor to audio-recording interventions. General consultation content analysis methods not specific to information recall were also reviewed, such as the RIAS and the Medical Interaction Process System (MIPS). 36,47 After review of the literature, PICcode adopted the approach taken by Dunn et al 19 to determine the content of the consultation by coding each unit of information in the recording. This approach was chosen as it can be adapted to many consultation types and be personalized for each participant. Using a standardized format, the information provided by the clinician is dissected into units which can be counted and compared to what the participant remembers post-consultation.

| Development of recall elicitation component of PICcode (semi-structured interview)
The audio-recorded consultations had highly varied content and leading questions needed to be avoided to ensure that recall (not recognition) was measured. The interviewers (bilingual RAs) were trained to adapt, respond to the participant, explore new information, re-focus participants and probe for details to encourage participants to expand on what they recalled. Training included: a session on qualitative interviewing including how to encourage free-recall and avoid leading questions; listening to example audio-recordings of well-conducted and badly conducted SSIs; role-playing SSIs with another interviewer who spoke their language (so each had the opportunity to "be" the patient); and practicing SSIs in English with a mock-patient (RLS or AH).
The initial set of SSI questions was developed in conjunction with an oncologist (MJ) and piloted in a phase I study, giving interviewers additional practice. 16 Patient and staff feedback from the phase I study suggested that the duration of the SSI was taxing and that some questions were potentially leading or not always applicable. In addition, there was concern that participants were not focusing on the consultation that was audio-recorded and were instead recalling information from other consultations.
The language barrier and time required for transcription and translation prevented supervisors giving real-time feedback on interview quality, so peer review was used to monitor interviewers' performance. Each interviewer had at least one of their SSIs reviewed by a peer who assessed their interviewing technique according to criteria developed by supervisors.

| Development of recall-coding component of PICcode
After reviewing the literature, a draft recall-coding system (version 1) was developed, reviewed by the expert panel, and applied to a pilot sample of 3 participants' data (see Figure 1). RLS and AH independently double-coded these participants' data and then compared their coding. Areas of inconsistency or disagreement were discussed, and the coding system was updated to ensure fidelity of future coding and consistency between coders. Changes were reviewed by the expert panel, and their feedback was used to refine the coding system (version 2).
RLS and AH used the version 2 coding system to double-code the next 6 participants' data and updated the coding system where necessary. Following a third panel review, the final coding system (version [...]
An independent coder (AM) then used the PICcode Manual to code the remaining data and to re-code the first 3 participants' data. Introduction of an independent coder ensured that the manual and methods were comprehensive, comprehensible and could be applied effectively by other researchers. Fifteen per cent of all participants' data was then double-coded by RLS to assess the intercoder reliability of PICcode. Coders were blinded to the content of the SSI when coding the consultation.

| Analysis
NVivo (QSR International Pty Ltd., Version 10, 2012) was used to identify and code information in the consultations and SSI transcriptions. Demographic and consultation data were analysed using descriptive statistics. Pearson product-moment correlation coefficient was used to assess relationships between variables. Statistical analysis was completed using R version 3.3.1 (R Core Team, Vienna, Austria).
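As an illustration of the correlation analysis described above, the following minimal Python sketch computes a Pearson product-moment coefficient from paired observations. The study itself used R; the function name `pearson_r` and the example values are hypothetical and are not study data.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical illustration only (not study data):
consultation_minutes = [10, 20, 30, 40]
ssi_minutes = [8, 12, 15, 25]
r = pearson_r(consultation_minutes, ssi_minutes)
```

A coefficient near 1 or -1 indicates a strong linear relationship, and a value near 0 indicates little or none.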
The consultations were classified as one of three different "consultation types" on the assumption that some types of consultations would contain more information than others. If the audio-recorded consultation was the patient's first at the present hospital, it was classified as "first in hospital". If the patient had visited the hospital before, but this consultation was their first with a particular specialist (eg, their previous consultations had been with a surgical oncologist, while this one was with a radiation oncologist), it was classified as "first with specialist". The remaining consultations were following an investigation or procedure of some kind and so were classified as "post-scan, surgery or other work-up".
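The classification rule above can be sketched as a small decision function. This is only an illustration: the two boolean inputs are hypothetical stand-ins for information the researchers would have drawn from the patient's history, and the function name is an assumption.

```python
def classify_consultation(first_at_hospital: bool, first_with_specialist: bool) -> str:
    """Assign one of the three consultation types described in the text.

    Both flags are hypothetical inputs used for illustration only.
    """
    if first_at_hospital:
        # The patient's first consultation at the present hospital.
        return "first in hospital"
    if first_with_specialist:
        # The patient has attended before, but not with this specialist.
        return "first with specialist"
    # Remaining consultations followed an investigation or procedure.
    return "post-scan, surgery or other work-up"
```

Checking the hospital flag first mirrors the order of the definitions in the text, where "first with specialist" applies only to patients who had visited the hospital before.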
Due to variability in the number of units of information that each participant had to remember, the extent of participants' information recall is reported as a percentage of the total amount of information given in the consultation. Most previous studies, including Dunn et al, 19 have compared what the patient remembers to the genuine content of the consultation by calculating the percentage of total information that was recalled by the patient. Percentage of information recalled was calculated for each participant as follows:

\[
\text{Percentage of information recalled} = \frac{\text{number of units of information recalled in the SSI}}{\text{total number of units of information in the consultation}} \times 100
\]

Intercoder reliability was assessed by comparing the number of units of information that each coder identified in the participants' consultations and the percentage of information that each coder calculated as recalled. In addition, the units of information identified in the consultation by coder 2 (RLS) were compared in content to those identified by coder 1 (AM) and were labelled as either "Matching" (unit content identical to coder 1), "Added" (an extra unit identified by coder 2, but not coder 1) or "Omitted" (a unit missed by coder 2, but identified by coder 1).
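The percentage calculation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming each unit of information has already been given a unique identifier during coding. The identifiers and the function name `percent_recalled` are hypothetical, and ignoring recalled items that do not correspond to a consultation unit is an assumption of this sketch.

```python
def percent_recalled(consultation_units, recalled_units):
    """Percentage of consultation units that were coded as recalled in the SSI."""
    consultation = set(consultation_units)
    if not consultation:
        raise ValueError("the consultation must contain at least one unit")
    # Only units that were actually present in the consultation count as recalled.
    recalled = consultation & set(recalled_units)
    return 100.0 * len(recalled) / len(consultation)


# Hypothetical unit identifiers, for illustration only:
units_in_consultation = ["dx-01", "tx-01", "tx-02", "se-01"]
units_recalled_in_ssi = ["tx-01", "se-01", "other-99"]
example = percent_recalled(units_in_consultation, units_recalled_in_ssi)  # 2 of 4 units
```

Expressing recall as a percentage rather than a raw count keeps participants comparable when their consultations differ widely in the number of units discussed.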

| Participant and consultation characteristics
Forty-seven patients consented (9 Arabic, 10 Cantonese, 8 Greek, 20 Mandarin). Six participants withdrew prior to their consultation. One participant's consultation was not audio-recorded due to technical failure, and 1 participant was lost to follow-up, leaving 39 participants with complete data (6 Arabic, 8 Cantonese, 6 Greek, 19 Mandarin).
Participant, consultation and SSI characteristics are presented in Table 1. Consultations varied in: type, duration (3 minutes to 90 minutes, mean 23 minutes), the number of units of information discussed (10 to 189) and number of people present (56% of consultations had more than 2 people present).
To ensure that participants focused on the consultation that was audio-recorded, rather than recalling information from other recent consultations, questions were added to the beginning of the SSI to orient the participant to the audio-recorded consultation (questions 1a-1c in Table 2). Interviewers corrected participants' answers to these questions if necessary.
Participants were made aware upon consent that the purpose of the SSI was to talk about their audio-recorded consultation; but the specific purpose of information recall was only revealed at the time of the SSI so that participants could not actively prepare.
Despite all interviewers receiving equal and comprehensive training, coders' field notes suggested that interviewers differed in their approach and that the quality of the SSI varied as a result. In some SSIs, the interviewer did not probe for further information when the opportunity arose, or did not ask any questions beyond those in Table 2. The duration of the SSI varied (5 to 62 minutes, mean 16 minutes), and was moderately correlated with both the duration of the consultation (r = .50) and the number of units of information in the consultation (r = .47). However, coders observed that shorter SSIs were generally not as well conducted.

TABLE 1 (Continued)
Duration of semi-structured interview in minutes, mean (SD, range): 16 (11.61, 5-62) (a)
(a) n = 39 due to withdrawals (n = 6), lost to follow-up (n = 1) and consultation not audio-recorded (n = 1).

TABLE 2 Semi-Structured Interview to determine patient recall of consultation content

FIGURE 2 Steps involved in analysing data using PICcode

The PICcode coding tree distinguishes between clinician-generated information and patient or family member-generated information (both prompted and un-prompted) so that comparisons can be made between the sources of information and the extent of the participant's recall of that information (see Figure 3).

3. Which language to code: Version 1 of the coding system assumed that only information spoken in the participant's primary language in the consultation would be coded, as any information spoken in English would not necessarily have been understood by the participant. However, participants' English language skill varied (see Table 1). At times, a doctor and patient would converse in basic English, and the interpreter would not interpret this exchange. The PICcode coding tree distinguishes information spoken in the participant's first language from information spoken in English where no attempted interpretation was made (see Figure 3). This allows for comparisons to be made between participants' recall of information spoken in English (that the interpreter had determined was understood by the patient) and information spoken in their primary language.

FIGURE 3 PICcode coding tree

4. What to code as recalled in the SSI: Using version 1 of the coding system, coders occasionally differed in which units of information they considered recalled in the SSI. Coders were therefore instructed to discuss any ambiguous SSI response with a colleague. If a participant recalls a unit of information that implies recall of another, related, unit of information, then both should be coded as recalled.
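The distinctions described in the items above (who generated a unit of information, and in which language it was spoken) lend themselves to a simple tabulation of recall by category. The sketch below is illustrative only: the actual coding tree is shown in Figure 3 and is not reproduced here, so the class `Unit`, its field names and the category labels are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """One coded unit of information (hypothetical structure)."""
    unit_id: str
    source: str    # e.g. "clinician", "patient_family_prompted", "patient_family_unprompted"
    language: str  # e.g. "primary" or "english_uninterpreted"
    recalled: bool


def recall_by(units, attribute):
    """Percentage of units recalled within each category of the given attribute."""
    totals = defaultdict(int)
    recalled = defaultdict(int)
    for unit in units:
        category = getattr(unit, attribute)
        totals[category] += 1
        if unit.recalled:
            recalled[category] += 1
    return {c: 100.0 * recalled[c] / totals[c] for c in totals}


# Hypothetical coded consultation, for illustration only:
units = [
    Unit("u1", "clinician", "primary", True),
    Unit("u2", "clinician", "primary", False),
    Unit("u3", "clinician", "english_uninterpreted", True),
    Unit("u4", "patient_family_unprompted", "primary", False),
]
by_language = recall_by(units, "language")
by_source = recall_by(units, "source")
```

Storing the source and language on each unit, rather than keeping separate lists, means the same coded data can be summarized along either dimension without re-coding.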

| Intercoder reliability
After coding was completed, 15% of participants' data (n = 6; 1 Arabic, 1 Cantonese, 1 Greek, 3 Mandarin) was double-coded to assess intercoder reliability. Coder discrepancy in the per cent of information recalled by each participant ranged from 0% to 8% (see Table 3). The content of the units of information identified by coder 2 was identical to the content of the majority of the units of information identified by coder 1 for all participants. The per cent of units of information that were identical between coders ranged from 63% to 90% (see "Matching" units of information, Table 3). Compared to coder 1, coder 2 identified between 0 and 16 additional units of information for each participant and omitted between 3 and 24 units of information for each participant (see "Added" and "Omitted", Table 3).
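The "Matching", "Added" and "Omitted" counts can be illustrated with set operations, as in the hedged sketch below. It assumes that the two coders' units have already been aligned to shared identifiers, which in practice requires human judgement about whether two units have identical content. The function name is hypothetical, and taking coder 1's units as the denominator for the matching percentage is an assumption, since the source does not state the denominator.

```python
def compare_coders(coder1_units, coder2_units):
    """Count units that match, were added by coder 2, or were omitted by coder 2."""
    c1, c2 = set(coder1_units), set(coder2_units)
    if not c1:
        raise ValueError("coder 1 must have identified at least one unit")
    matching = c1 & c2   # identified by both coders
    added = c2 - c1      # identified by coder 2 only
    omitted = c1 - c2    # identified by coder 1 only
    return {
        "matching": len(matching),
        "added": len(added),
        "omitted": len(omitted),
        # Assumption: percentage expressed relative to coder 1's units.
        "percent_matching": 100.0 * len(matching) / len(c1),
    }


# Hypothetical aligned unit identifiers, for illustration only:
summary = compare_coders(["a", "b", "c", "d", "e"], ["a", "b", "c", "d", "f"])
```

Reporting added and omitted units separately shows whether disagreement comes from one coder splitting information more finely or from units being missed altogether.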

| DISCUSSION
This study developed a comprehensive and rigorous methodology for measuring patients' recall of information from a consultation. PICcode was designed for assessing recall in oncology patients who use the services of an interpreter, but can be adapted for English-speaking patients and other healthcare populations, making the system relevant and versatile.
The strength of PICcode lies largely in its precision and adaptability. The iterative process involved in developing PICcode assisted in clarifying precise details and produced a coding system that allows for broad applicability. Units of information are coded into a nested coding tree (Figure 3) [...] provided by the patient or family members can be expressed as a percentage of the total information in the consultation to give an impression of the amount contributed by the patient or family. In addition, distinction can be made between information that the patient or family has spontaneously contributed and prompted information that they provided in response to a question from the clinician. The ratio between spontaneous and prompted information may give an indication of the patient's or family member's sense of agency. Future studies could use PICcode to investigate differences in patients' memory for medical information delivered by various parties (eg, do patients more easily remember information provided by clinicians than by family?).
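The two summary measures mentioned above, the share of information contributed by the patient or family and the ratio of spontaneous to prompted contributions, could be derived from unit counts as in the following sketch. The function name, argument names and example counts are hypothetical and are not study data.

```python
def contribution_summary(n_clinician, n_spontaneous, n_prompted):
    """Return (patient/family share in %, spontaneous-to-prompted ratio).

    n_spontaneous and n_prompted count units generated by the patient or family.
    The ratio is None when no prompted units were coded.
    """
    total = n_clinician + n_spontaneous + n_prompted
    if total == 0:
        raise ValueError("the consultation must contain at least one unit")
    patient_family = n_spontaneous + n_prompted
    share = 100.0 * patient_family / total
    ratio = n_spontaneous / n_prompted if n_prompted else None
    return share, ratio


# Hypothetical counts, for illustration only:
share, ratio = contribution_summary(n_clinician=60, n_spontaneous=15, n_prompted=25)
```

Returning `None` when there are no prompted units avoids a division by zero and leaves the interpretation of that case to the analyst.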
While all participants in this study required an interpreter, their English skills varied, with the majority (67%) reporting that they had some English skills. PICcode accounts for patients' bilingual or multilingual status and can be used in groups with varying English skills to explore the relationship between language skill and information recall.
Although PICcode was developed for patients who require an interpreter, it draws on similar research with English-speaking patients 19 and can be adapted for use in populations where no interpretation or translation is required. Patients with low-health literacy are also at risk of not understanding or remembering information from their medical consultations. 53 PICcode could be adapted for this population and used to investigate the efficacy of interventions designed to improve health literacy.
The consultations that were audio-recorded and used to develop

| CONCLUSIONS
PICcode, and the methods used to develop it, will introduce more consistency in the measurement of patients' memory of medical information. PICcode can be applied to a wide range of patients, including non-English-speaking and low-health literacy groups, and can be used to measure many factors pertaining to consultation content including information recall and shared decision-making.