Cross-Cultural Differences in Categorical Memory Errors



Cultural differences occur in the use of categories to aid accurate recall of information. This study investigated whether culture also contributed to false (erroneous) memories, and extended cross-cultural memory research to Turkish culture, which is shaped by Eastern and Western influences. Americans and Turks viewed word pairs, half of which were categorically related and half unrelated. Participants then attempted to recall the second word from the pair in response to the first word cue. Responses were coded as correct, as blanks, or as different types of errors. Americans committed more categorical errors than did Turks, and Turks mistakenly recalled more non-categorically related list words than did Americans. These results support the idea that Americans use categories either to organize information in memory or to support retrieval strategies to a greater extent than Turks and suggest that culture shapes not only accurate recall but also erroneous distortions of memory.

1 Introduction

A Chinese proverb asserts, “the palest ink is better than the best memory,” an observation supported by literature highlighting the unreliability and reconstructive nature of memory. False memories, or mistaken memories of the occurrence of events or presentation of stimuli (Roediger & McDermott, 1995), occur because of reconstruction or recombination of ideas or elements (Schacter, 1999) and can be extremely realistic (Dewhurst & Anderson, 1999; Loftus, 2005; Norman & Schacter, 1997; Roediger & McDermott, 1995). In some cases, false memories cannot be discriminated from true ones at greater than chance levels (e.g., Dewhurst & Farrand, 2004). Despite the explosion of interest in erroneous and false memories, there has been little research into the ways that errors systematically differ across people, specifically as an effect of culture.

The cultural lens through which an individual views the world may shape information processing, impacting memory through what is encoded or retrieved (Gutchess & Indeck, 2009). For example, Caucasian Americans tend to recall more memories that emphasize the individual than do Asians, and Asians tend to recall more memories that contain many people and emphasize social interactions compared to Caucasians (Wang & Conway, 2004; Wang & Ross, 2005). In addition, Masuda and Nisbett (2001) showed that Japanese participants attend to context more than Americans, in both their descriptions of animated scenes and their memories of them.

Because information processing capabilities are limited, certain details and aspects of the environment are prioritized over others. Some evidence indicates that these priorities, and the cognitive processing demands to implement particular strategies, can be shaped by the specific cultural lens through which one views the world (e.g., Boduroglu, Shah, & Nisbett, 2009; Gutchess et al., 2006). People's memories, including false memories, may reflect the differing values, beliefs, and customs of their respective cultures (Gutchess & Indeck, 2009; Gutchess, Schwartz, & Boduroglu, 2011).

One way in which cognition may differ across cultures is in the use of categories. When given a set of words such as “seagull,” “squirrel,” and “nut,” Westerners tend to pair “seagull” and “squirrel,” as they both belong to the same category (“animals”), and East Asians tend to pair “squirrel” and “nut,” which are related functionally to each other (Chiu, 1972; Ji, Zhang, & Nisbett, 2004; Unsworth, Sears, & Pexman, 2005b). In the domain of memory, Gutchess et al. (2006) showed that elderly Americans use categories as an organizational strategy to support memory more than elderly Chinese.

Memory research has also focused on how semantic associations, including categories, distort memory. After studying several thematically related words, participants exhibit robust false recall of a non-presented related word (Roediger & McDermott, 1995). Some research indicates that categorical relationships result in one of the highest false memory rates among different types of semantic relationships (Brainerd, Yang, Reyna, Howe, & Mills, 2008; Smith, Ward, Tindell, Sifonis, & Wilkenfeld, 2000).

In this study, we examined the role of culture in shaping memory distortion in Americans and Turks. To investigate the impact of culture on memory, participants were given a cued memory task in which they were presented with word pairs and then attempted to recall the target word presented along with the cue. The task contained categorically related and unrelated word pairs; we predicted that when Americans could not recall the originally paired word, they would be more likely to commit categorical memory errors than Turks. The literature shows that Easterners tend to focus on similarities and functional relationships while Westerners tend to focus on categories and rules (Nisbett, Peng, Choi, & Norenzayan, 2001). Present-day Turkish culture is considered a blend of Eastern and Western influences due to its historical roots and geography. Studies investigating self-construal and social processes clearly show that compared to Europeans and Americans, Turks are more relational (e.g., Kashima & Hardie, 2000) and collectivistic (Imamoglu & Karakitapoglu Aygun, 2007; Kagitçibaşi, 1994), providing evidence of greater Eastern influence in Turkish culture than in European or American cultures. Because Turks are more influenced by Eastern culture than are Americans, we predicted that Turks would focus less on categories and therefore make fewer categorical memory errors than would Americans on our word recall task. Such a pattern would substantiate claims that Westerners' attention to categories can impact memory processes and strategies. Identifying cross-cultural differences in memory errors would indicate that culture not only shapes the content and strategies that support accurate memory but also impacts false memory.

2 Methods

2.1 Participants

The final sample consisted of 40 Americans tested at Brandeis University in the United States and 34 Turks tested at Boğaziçi University in Turkey. Samples were matched on operation span (Unsworth, Heitz, Schrock, & Engle, 2005a) to ensure that participants of similar cognitive ability participated across sites, which necessitated excluding an additional six participants (one American; five Turks) with extreme scores, whose data are not included in any of the measures reported here. See Table 1 (note that operation span scores were missing for three of the Turkish participants). Although the Turkish sample was significantly older than the American sample, both groups were drawn from undergraduate populations. The older age likely reflects the fact that the Turkish students typically complete a year of English preparatory school because the curriculum at Boğaziçi University is taught in English. It is common for students to complete an Introduction to Psychology course in either their first or second year of college, as is the case for the American sample. To be eligible, participants had to be fluent in the testing language (English or Turkish), originate from either the United States or Turkey, and could not have lived outside of the country in which they were being tested for more than 2 years. Participants gave written consent and were reimbursed with either course credit or payment.

Table 1. Means (standard deviations) for test scores and demographic information for Americans and Turks

Measure                N (A = Americans, T = Turks)   Americans      Turks                      p-value(a)
Gender                 40 A, 34 T                     15 M, 25 F     9 M, 24 F, 1 transsexual
Age (years)            40 A, 34 T                     19.15 (1.15)   21.03 (1.70)               <.001
Operation Span Total   40 A, 31 T                     65.95 (5.18)   63.32 (8.27)               .13

(a) The p-values correspond to independent samples t tests comparing Americans and Turks on each measure.

2.2 Materials

For each culture, stimuli consisted of 32 categorically related word pairs selected from culture-specific category norms. American norms were taken from Battig and Montague (1969); corresponding Turkish norms were taken from Peynircioglu (1988). To identify categories with similar structures across samples, we looked at categories that had similar numbers of items generated by at least 10 participants across cultures. From these, we selected a final set of 32 categories across cultures by selecting categories that differed cross-culturally by seven or fewer exemplars that had been generated by at least 10 participants. For these 32 categories, the numbers of exemplars generated by at least 10 individuals in the two samples (American and Turkish) did not differ significantly, p = .61. In addition, for these 32 categories, the number of exemplars generated by at least 100 individuals in the two samples did not differ significantly either, p = .61.

For each culture, we then selected the top two exemplars from each category. The categorically related words constituted the related condition. Word pairs were then scrambled to create new pairs that were categorically and semantically unrelated. Assignment of pairs as related or unrelated was confirmed through piloting. Exceptions were made for four categories in which lower items on the list were used instead of the top two exemplars. These substitutions for potentially problematic pairs (e.g., words sharing a common root, such as “president” and “vice president”) were based on discussions of the teams across cultures. To ensure that the strength of category association of the exemplars did not differ across cultures, we compared the proportions of participants in the norming studies who generated these specific exemplars as examples of members of their respective categories. The means were similar (Americans: M = .81, SD = .14; Turks: M = .78, SD = .13), p = .21.

We divided both the American and the Turkish word pairs into two lists each for counterbalancing across conditions. This created two separate sets of word pairs such that the words appearing in categorical pairs in one list were rearranged into new pairs to serve as the unrelated pairs for the other half of the participants and vice versa. When creating the counterbalancings, we matched across category frequency (how often the exemplar was generated as a member of its category in the published norms), written word frequency (Kucera & Francis, 1967), and number of syllables. Within each counterbalancing list, we scrambled all of the word pairs to create the unrelated pairs, which we were careful to make sure were not semantically related. We were also careful to re-pair the words analogously across cultures. For example, if the first precious stone exemplar was re-paired with the second fruit exemplar in English, then the first precious stone exemplar would also be re-paired with the second fruit exemplar in Turkish.
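The analogous re-pairing described above can be sketched in code. This is only an illustration under stated assumptions: the rotation rule, the function names `rotate_repair` and `to_words`, the four-category subset, and the English exemplars are all hypothetical, and a simple rotation stands in for the scrambling actually used. The key idea is that the re-pairing is defined over (category, exemplar rank) positions rather than over words, so the same plan can be applied to each language's norms.

```python
def rotate_repair(categories, shift=1):
    """Pair exemplar 1 of each category with exemplar 2 of another category.

    A fixed rotation is an assumption made for this sketch; it only
    guarantees that the two words never come from the same category.
    """
    n = len(categories)
    if n < 2 or shift % n == 0:
        raise ValueError("need at least two categories and a non-zero shift")
    return [((categories[i], 1), (categories[(i + shift) % n], 2))
            for i in range(n)]


def to_words(plan, exemplars):
    """Translate a (category, rank) plan into word pairs for one language."""
    return [(exemplars[c1][r1 - 1], exemplars[c2][r2 - 1])
            for (c1, r1), (c2, r2) in plan]


# Hypothetical subset of categories and hypothetical English exemplars.
categories = ["precious stone", "fruit", "bird", "tool"]
english = {
    "precious stone": ["diamond", "ruby"],
    "fruit": ["apple", "orange"],
    "bird": ["robin", "sparrow"],
    "tool": ["hammer", "saw"],
}

plan = rotate_repair(categories)
unrelated_english = to_words(plan, english)
print(unrelated_english[0])  # first precious stone with second fruit
```

Applying `to_words(plan, ...)` to a second dictionary built from the other language's norms would yield the analogous pairs. A rotation does not check for residual semantic relatedness between the re-paired words, so a manual check and piloting, as described in the text, would still be needed.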

2.3 Procedure

Participants encoded 32 word pairs presented on the computer for 4 s each, using E-Prime software (Psychology Software Tools, Pittsburgh, PA, USA). For each pair, the word on the left-hand side of the screen was the highest category associate, and the word on the right-hand side was the second-highest category associate. Half of the word pairs were related and half were unrelated. The pairs were presented in a single study block and in a random order, unique to each participant. Participants were informed that later they would be presented with the first word of the pair and would be asked to report the word that went with it. After encoding, participants had a 30-s filled delay interval before completing a self-paced cued recall task. Participants wrote down the associated word from the encoded pair in response to the cue. Participants were encouraged to report words even when they were not confident in them. There were two counterbalanced versions of the lists such that words appeared in the related and unrelated conditions equally often across participants. Task instructions were translated and back translated by native Turkish speakers who were also fluent in English.

To compare cognitive abilities across samples, participants were characterized on additional measures (Table 1), including demographics and operation span (executive function; Unsworth et al., 2005a). The demographics questionnaire was completed at the beginning of the session, before the encoding task, and the operation span was the final measure completed at the end of the study.

2.4 Scoring

Participants' recall responses were grouped into six categories: correct responses, categorical errors, other semantic errors, other list word errors, other non-semantic errors, and blanks. Errors could be based on the mispairing of another word from encoding or on the generation of a completely new word. A correct response was a response that matched the word the participant was being asked to recall (ignoring spelling errors or mistaken pluralization). A “categorical error” was a response that was incorrect and taxonomically related (based on definitions used in the cross-cultural literature; e.g., Ji et al., 2004; Chiu, 1972; Unsworth et al., 2005b) to the prompt (first word of the word pair) or to the correct response (second word of the word pair; e.g., for the prompt “pear,” a response of “banana” or “fruit”). An “other semantic error” was an incorrect response that was semantically related to the prompt or the correct response in any way other than categorically (e.g., a response of “clock” to the prompt “minute”), including synonyms. An “other list word error” was the recall of a word that was from the encoding list, but that was incorrect and unrelated in meaning to the prompt or the correct response. An “other non-semantic error” was the recall of a word that was incorrect, not from the encoding list, and unrelated in meaning to the prompt or correct response. Responses that were unrecognizable or that showed partial phonetic recall of correct responses were also counted as other non-semantic errors, although such errors were rare. Blanks were items that the participant skipped. Questions about how to score the memory errors were resolved through discussion of members of the research team drawn from each of the two cultures, and to ensure consistency, coders from each site scored the responses of the 10 participants from each culture who committed the most errors (agreement rate of 83%).
The numbers of errors of each type were converted into proportions out of the total number of errors to control for differences in raw numbers of memory errors across cultures.
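
The conversion to proportions can be illustrated with a short sketch. The code labels, the function name `error_proportions`, and the example participant are hypothetical and are not taken from the study's materials; the sketch only shows the arithmetic of dividing each error count by the participant's total number of errors, with correct responses and blanks left out of the denominator.

```python
from collections import Counter

# Hypothetical labels for the four error types described in the scoring scheme.
ERROR_TYPES = (
    "categorical",
    "other_semantic",
    "other_list_word",
    "other_non_semantic",
)


def error_proportions(codes):
    """Return each error type as a proportion of all errors for one participant.

    Correct responses and blanks are ignored. Returns None when the
    participant made no errors, since the proportions are then undefined.
    """
    counts = Counter(code for code in codes if code in ERROR_TYPES)
    total_errors = sum(counts.values())
    if total_errors == 0:
        return None
    return {etype: counts[etype] / total_errors for etype in ERROR_TYPES}


# Hypothetical participant: 32 trials with 20 correct, 6 blank, and 6 errors.
codes = (
    ["correct"] * 20
    + ["blank"] * 6
    + ["categorical"] * 4
    + ["other_semantic"]
    + ["other_list_word"]
)
props = error_proportions(codes)
print(props)
```

Because the four proportions sum to 1 for every participant who made at least one error, participants with no errors cannot contribute to analyses of error type.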

3 Results

To give a sense of the overall distribution of correct responses, memory errors, and omissions (trials left blank), we present the raw numbers in Tables 2 and 3. Subsequent analyses rely on proportions, rather than raw numbers.

Table 2. Means (standard deviations) for raw number of correct responses and blanks for Americans and Turks

Measure             Word pair type   Americans      Turks
Correct responses   Related          11.83 (3.07)   13.09 (2.21)
Correct responses   Unrelated        6.55 (3.78)    7.41 (3.21)
Blanks              Related          2.13 (2.16)    1.62 (1.89)
Blanks              Unrelated        5.10 (3.24)    5.94 (3.02)

Table 3. Means (standard deviations) for raw number of error types for Americans and Turks

Error type                  Word pair type   Americans     Turks
Categorical errors          Related          1.43 (1.65)   0.76 (0.78)
Categorical errors          Unrelated        2.73 (2.26)   1.24 (1.60)
Other semantic errors       Related          0.40 (0.81)   0.29 (0.63)
Other semantic errors       Unrelated        0.95 (1.11)   0.44 (0.66)
Other list words            Related          0.15 (0.48)   0.18 (0.52)
Other list words            Unrelated        0.55 (0.85)   0.85 (1.35)
Other non-semantic errors   Related          0.08 (0.27)   0.06 (0.24)
Other non-semantic errors   Unrelated        0.13 (0.33)   0.12 (0.33)

3.1 Correct responses

To test for differences in the proportions of correct responses, we conducted a 2 × 2 repeated-measures ANOVA, with culture (American/Turkish) as the between-participants variable and word pair type (related/unrelated) as the within-participants variable.

There was a main effect of word pair type (F[1, 72] = 292.05, p < .001, ηp² = 0.80), with more related (M = .78, SD = .17) than unrelated (M = .43, SD = .22) word pairs recalled. There was no main effect of culture (p = .11) or interaction of word pair type × culture (p = .53). See Fig. 1. This suggests that across cultures, overall accuracy of memory was equivalent and that categorically related word pairs were better recalled than unrelated ones.

Figure 1. Proportion of correct responses for related and unrelated word pairs across cultures. Error bars represent the standard error of the mean.

3.2 Blanks

We compared the effects of culture on the proportions of items left blank for related and unrelated trials in a 2 × 2 repeated-measures ANOVA, with culture (American/Turkish) as the between-participants variable and word pair type (related/unrelated) as the within-participants variable. More trials were left blank for unrelated (M = .34, SD = .20) than related (M = .12, SD = .13) word pairs (F[1, 72] = 139.19, p < .001, ηp² = 0.66). Although there was no overall main effect of culture (p = .76), there was a significant interaction between culture and word pair type (F[1, 72] = 4.75, p = .03, ηp² = 0.06). Turks exhibited a larger difference between the proportion of blanks for unrelated pairs (M = .37, SD = .19) and related pairs (M = .10, SD = .12), compared to Americans (unrelated: M = .32, SD = .20; related: M = .13, SD = .14).
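
As a reading aid, the partial eta squared values reported for correct responses and blanks can be recovered from the F ratios and their degrees of freedom using the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below is not the authors' analysis code; the function name and dictionary labels are assumptions introduced here.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared expressed through F and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios and degrees of freedom as reported in the text.
reported = {
    "correct responses, word pair type": (292.05, 1, 72),
    "blanks, word pair type": (139.19, 1, 72),
    "blanks, culture x word pair type": (4.75, 1, 72),
}

for label, (f_value, df_effect, df_error) in reported.items():
    print(f"{label}: {partial_eta_squared(f_value, df_effect, df_error):.2f}")
# prints 0.80, 0.66, and 0.06 for the three effects, in that order
```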

3.3 Memory errors

To test for differences in the proportions of error types, we conducted a 2 × 4 repeated-measures ANOVA. Culture (American/Turkish) was the between-subjects variable and error type (categorical/other semantic/other list word/other non-semantic) was the within-subjects variable. Importantly, error types were compared based on the proportions out of all memory errors; this allows for fair comparisons across groups that could potentially differ in the total number of errors (due to either differences in the number of correct responses or in the willingness to guess vs. tendency to leave items blank). Given the possible contribution of cognitive ability to patterns of errors across cultures, we tested whether operation span should be included as a control variable (covariate) in the model. Because its contribution to the model was non-significant, this variable was removed from the model (consistent with an approach discussed by Kutner, Nachtsheim, & Neter, 2004). One American and four Turks were not included in these analyses because they did not commit any errors (i.e., responses consisted of correct answers and blanks), leaving a sample of 39 Americans and 30 Turks for the analyses of memory errors. Responses were originally divided into those for related pairs and those for unrelated pairs, but because there was no significant error type × word pair type × culture interaction (p = .90), analyses were collapsed across related and unrelated word pairs. This allowed us to increase the number of participants included in the analyses, given that participants needed only to make an error in the related or the unrelated condition, not necessarily both, to be included. To compare related and unrelated errors, the remaining sample would have consisted of only 30 Americans and 23 Turks.

There was a main effect of error type (F[2.35, 157.61] = 44.84, p < .001, ηp² = 0.40; Greenhouse–Geisser corrected). Paired-samples t tests showed that the proportion of categorical errors out of total errors (M = .58, SD = .31) was significantly higher than that for each of the three other types of errors (ps < .001). There were also significantly more other semantic errors (M = .19, SD = .23) and other list word errors (M = .18, SD = .27) than other non-semantic (M = .05, SD = .15) errors (ps < .002). Other semantic errors and other list word errors did not significantly differ (p = .75). It was meaningless to assess the main effect of culture because summing the proportion of errors across all types necessarily equaled 1 for each participant in each culture.

Of particular interest, there was a significant error type × culture interaction (F[2.35, 157.61] = 3.37, p = .03, ηp² = 0.05; Greenhouse–Geisser corrected). Americans (M = .65, SD = .27) committed proportionally more categorical errors than Turks (M = .50, SD = .34), t(67) = 2.04, p < .05, Cohen's d = .49. In addition, Turks (M = .26, SD = .34) committed proportionally more other list word errors than Americans (M = .11, SD = .16), t(38.61) = 2.29, p = .03 (equal variances not assumed), Cohen's d = .56. Americans and Turks did not significantly differ in other semantic errors or other non-semantic errors (p = .72). See Fig. 2.
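
The effect sizes above can be approximated from the reported group means and standard deviations. The sketch below is an illustration rather than the authors' computation: it assumes Cohen's d was standardized by the root mean square of the two group standard deviations (an assumption, chosen because it reproduces the reported values from the rounded summary statistics), and it uses the Welch–Satterthwaite formulas for the unequal-variance t test. The function names are hypothetical.

```python
import math


def cohens_d_avg_sd(m1, sd1, m2, sd2):
    """Cohen's d using the root mean square of the two SDs as the standardizer."""
    return (m1 - m2) / math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)


def welch_t_and_df(m1, sd1, n1, m2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t_value = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_value, df


# Categorical errors: Americans (M = .65, SD = .27) vs. Turks (M = .50, SD = .34).
d_categorical = cohens_d_avg_sd(0.65, 0.27, 0.50, 0.34)

# Other list word errors: Turks (M = .26, SD = .34, n = 30)
# vs. Americans (M = .11, SD = .16, n = 39).
d_list_word = cohens_d_avg_sd(0.26, 0.34, 0.11, 0.16)
t_list_word, df_list_word = welch_t_and_df(0.26, 0.34, 30, 0.11, 0.16, 39)

print(f"d (categorical) = {d_categorical:.2f}")      # 0.49
print(f"d (other list word) = {d_list_word:.2f}")    # 0.56
print(f"Welch t = {t_list_word:.2f}, df = {df_list_word:.2f}")
```

Because the inputs are rounded to two decimals, the Welch t and degrees of freedom computed this way (roughly 2.2 and 39) only approximate the reported t(38.61) = 2.29.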

Figure 2. Proportion of error types across cultures. Error bars represent the standard error of the mean.

4 Discussion

This study revealed two main findings. First, Americans made significantly more categorical errors than did Turks on a word pair recall task. Second, Turks made significantly more other list word errors than did Americans. Together these results indicate that Americans used categories in the memory task to a greater extent than did Turks. The tendency of the Turks to commit proportionally more other list word errors but fewer categorical errors than Americans suggests that Turks used different strategies to support memory than Americans. The data indicate that Americans may use categories to organize memory more than Turks, which would be consistent with previous findings based on the organization of accurate responses during free recall (Gutchess et al., 2006). An alternative interpretation is that cultural differences occur in guessing strategies at retrieval, such that Americans were more likely than Turks to guess a categorically related word when they could not accurately retrieve a word pair. Both of these potential explanations suggest that the salience of categories in this task may not have limited Turks to categorical responses as much as it did Americans. Whereas Americans overwhelmingly committed categorical errors, Turks' errors tended to be categorically related words or other studied words that were unrelated. These findings are consistent with the idea that Western cultures place greater emphasis on categories relative to more Eastern cultures (e.g., Ji et al., 2004; Unsworth et al., 2005b). In addition, this study demonstrates that categories not only shape accurate memory (e.g., Gutchess et al., 2006) but also play a role in shaping the types of memory errors that occur across cultural groups.

The nature of the memory errors sheds further light on cognitive differences across cultures. While Turks used categories less than Americans, it is important to note that Turks still used categorical relationships in the task. Both cultural groups accurately remembered more categorically related than unrelated word pairs, and categorical errors tended to occur more frequently than other semantic or non-semantic errors. In addition, it is important to emphasize that cross-cultural differences in memory errors occurred even though the samples were matched on cognitive ability (with the operation span as a measure of executive function). This suggests that differences across cultures can occur in the qualities of memory, or the strategies used, rather than in the amount of information remembered or the overall cognitive ability of our samples. Future studies could further inform our understanding by contrasting an organizational strategy preferred by Turks (perhaps functional relationships, as in Eastern cultures, e.g., Ji et al., 2004) with the categorical strategy that is salient to Americans. Such an approach would also allow us to address the possible explanation that Americans are simply more likely than Turks to make errors of whatever relationship is made salient, regardless of what the relationship is. For example, Westerners, when viewing a word pair, could use rules to figure out the relationship between the specific words, and this might make them more likely than Easterners to apply those same rules later in a recall task, consistent with Western emphasis on logic and rules (e.g., Nisbett et al., 2001).

Although stimuli were carefully matched across cultures in terms of their relationship with categories, a limitation of this study is that the word lists may have differed in other unintended ways. For example, the exemplars selected for each category or the properties of the words themselves could share phonetic or additional unintended semantic associations in one language but not the other. To address these concerns, two separate word lists were used across participants and we attempted to use the identical items across cultures whenever possible (47% of words would be considered translations). Future work on the effects of culture on memory errors can further substantiate these results, extending research on the influence of culture on memory errors into different tasks and types of stimuli, as well as different domains of memory.

This research offers new insights to the literature on cross-cultural memory differences especially given that Turks have not been studied extensively in the psychological literature. As research draws more attention to the factors that contribute to cultural differences, there is a greater need to study culture more broadly than simply through the traditional East/West divide (Henrich, Heine, & Norenzayan, 2010). As a blended culture shaped by both Western and Eastern influences, Turkish culture is extremely valuable to study. Our finding of cultural differences in the tendency to commit categorical memory errors suggests one way in which cognitive processes may differ between Turks and Americans.

In an increasingly globalized world, acquiring basic knowledge about cross-cultural differences in cognition can be important to maintain successful business, diplomatic, and interpersonal relationships. Particularly as memory has implications for the reliability and even the credibility of a source, it is important to understand potential systematic cross-cultural differences in the patterns of memory distortions.


The authors gratefully acknowledge support from the National Science Foundation (BCS-1147707), Fulbright Scholars (A.H.G.), TUBA-GEBIP (A.B.), and Brandeis University Research Circle on Democracy & Cultural Pluralism Award (A.J.S.). We thank Xiaodong Liu for statistical consultation; Ellen Wright and Pete Millar for feedback; and Nur Bulut, Ege Kurtulus, Aylin Uzun, Liat Zabludovsky, Joseph Polex Wolf, and Albert Lin for experimental assistance.