Cat Got Your Tongue? Using the Tip-of-the-Tongue State to Investigate Fixed Expressions



Despite the fact that they play a prominent role in everyday speech, the representation and processing of fixed expressions during language production is poorly understood. Here, we report a study investigating the processes underlying fixed expression production. “Tip-of-the-tongue” (TOT) states were elicited for well-known idioms (e.g., hit the nail on the head) and participants were asked to report any information they could regarding the content of the phrase. Participants were able to correctly report individual words for idioms that they could not produce. In addition, participants produced both figurative (e.g., pretty for easy on the eye) and literal errors (e.g., hammer for hit the nail on the head) when in a TOT state, suggesting that both figurative and literal meanings are active during production. There was no effect of semantic decomposability on overall TOT incidence; however, participants recalled a greater proportion of words for decomposable rather than non-decomposable idioms. This finding suggests there may be differences in how decomposable and non-decomposable idioms are retrieved during production.

1. Introduction

“Virtually every sentence that a person utters or understands is a brand new combination of words, appearing for the first time in the history of the universe” (Pinker, 1994, p. 22). Such a strict generative theory of language is perhaps one of the reasons why the study of fixed expression production remains underdeveloped. However, our use of fixed expressions such as idioms and proverbs extends far beyond the periphery of linguistics. We cook up a storm before going out to paint the town red, and when our whistle is sufficiently wet, we hit the road and then the sack. In a review of an unscripted telephone conversation corpus, Van Lancker Sidtis (2004) reported that 48% of utterances were familiar expressions. The Oxford Idioms Dictionary for Learners of English (Parkinson & Francis, 2006) lists over 10,000 idioms, while Jackendoff (1995) estimates that there are approximately 40,000 fixed and idiomatic expressions in English; by any measure, fixed expressions are a prevalent feature of our language.

Due to the disparity between the literal meanings of the component words of a phrase and their holistic figurative meaning, it was initially thought that fixed expressions such as idioms were analogous to single words with a single lexical representation (e.g., Bobrow & Bell, 1973; Swinney & Cutler, 1979). This view has since been replaced by theories of idiom comprehension that allow for various degrees of compositionality and syntactic analysability (e.g., Cacciari & Tabossi, 1988; Gibbs & Nayak, 1989; Peterson, Burgess, Dell, & Eberhard, 2001). However, only a few studies have examined the mechanisms underlying idiom production (e.g., Cutting & Bock, 1997; Konopka & Bock, 2009; Sprenger, Levelt, & Kempen, 2006). Here, we report the first use of the “tip-of-the-tongue” (TOT) paradigm to examine the processes underlying lexical access of idiomatic expressions.

1.1. Fixed expression production

Cutting and Bock (1997) tested the hypothesis that the component words of a phrase are individually semantically and syntactically analyzed. They found that people were more likely to erroneously “blend” two idioms if they shared syntactic structure (e.g., shoot the breeze and raise the roof become shoot the roof) than if they did not (e.g., shoot the breeze, nip and tuck). Furthermore, this was as likely to happen for phrases that were similar in literal meaning (e.g., swallow your pride, chew your ego) as for phrases that were similar in figurative meaning (e.g., kick the bucket, meet your maker), supporting the assertion that both figurative and literal meanings are activated (Cacciari & Glucksberg, 1991). Finally, these effects were similar for semantically decomposable idioms (where component words have a relation to the overall meaning, e.g., easy on the eye) and non-decomposable idioms (where component words do not have an obvious relation to the overall meaning, e.g., kick the bucket). Based on these findings, it appears that idioms follow the same production processes as non-idiomatic phrases, a conclusion supported by Konopka and Bock's (2009) finding that idioms are susceptible to syntactic priming.

Sprenger et al. (2006) found that the individual components of an idiom (e.g., road) primed the idiomatic phrase (e.g., hit the road) more strongly than a similar but non-idiomatic phrase (e.g., clean the road). They argued this was driven by spreading activation from a common idiom representation; a “superlemma.” This abstract representation is a semantically and syntactically specified node mediating between the conceptual level and the component word representations (known as “lemmas,” which code for syntactic properties but not phonological properties, phonological content being encoded at a subsequent level of production; see, e.g., Levelt, Roelofs, & Meyer, 1999). Unlike non-idiomatic phrases, all the lemmas of an idiom are connected, and so activating one of them facilitates activation of the rest. That is, the idiomatic meaning activates the superlemma, which in turn activates the simple component words. An important feature of the superlemma is that it “disables” syntactic alternatives for the idiom (e.g., bucket is kicked), and so it can account for the apparent peculiarities of idiom syntax (many idioms are argued to be syntactically inflexible, e.g., Emily kicked the bucket loses its idiomatic meaning when produced as the bucket was kicked by Emily).

This ability to deal with the syntax of idioms (which intuitively seems idiosyncratic) is one of the appeals of the superlemma account. However, Tabossi, Wolf, and Koterle (2009) questioned the assumption that idiom syntax is idiosyncratic and argued that it is instead governed by pragmatic constraints. For example, they found that pragmatically appropriate context could increase the acceptability judgments of idioms that had undergone some form of syntactic operation (for example, in a conversation about death, the bucket was kicked by Emily may be judged more acceptable than in isolation). This finding suggests that idiom syntax is not idiosyncratic in the “all-or-none” sense; rather, Tabossi et al. proposed a model where configurations of nodes at a lexical-conceptual level correspond to idioms. These configurations are similar to the superlemma in that they are associated with the meaning of the idiom at the conceptual level, and with the individual component words at the lexical-conceptual level. However, they do not specify the syntactic behavior of the idiom; instead, syntax is governed by pragmatic constraints. Because it does not rigidly specify syntax, this model is consistent with the finding that idioms show syntactic flexibility when set in the appropriate context.

1.2. Tip-of-the-tongue

Experimentally elicited TOT states have been used to study seriality in lexical access (that is, the theory that semantic, syntactic, and phonological information are accessed in strictly that order), with evidence that participants are able to accurately recall syntactic features such as gender or number for isolated words which they cannot produce (e.g., Caramazza & Miozzo, 1997; Vigliocco, Antonini, & Garrett, 1997; Vigliocco, Vinson, Martin, & Garrett, 1999). Insofar as TOT states provide a window into idiom production processes, they allow us to examine what information is available at the stage of lexicalization, as they provide a situation in which the lexical concept and lemma of a word have been accessed without the phonological form. Although the exact locus of a TOT state could potentially be at several stages in the production process (failure to activate phonological information or failure to activate one or more of the simple lemmas of an idiom), in order for a TOT state to occur for one or more words in an idiomatic expression those words must be available individually. One may therefore predict that a participant in a TOT state for an idiom would be able to partially recall that idiom, rather than it being an “all-or-nothing” process as predicted by earlier holistic theories of idiom representation (e.g., Swinney & Cutler, 1979). The effects of decomposability upon the ability to partially recall an idiom are more difficult to predict. If there are differences in the lexical representation of decomposable and non-decomposable idioms, we would predict that participants should be able to recall more of a decomposable idiom due to the greater availability of its individual parts.

The TOT paradigm also provides an opportunity to investigate literal and figurative semantic activation through error analyses. By modifying the classic TOT design and asking participants to provide whole words rather than partial phonological or grammatical information, it should be possible to determine whether the literal meanings of component words are active. Semantically related speech errors are generally interpreted as the consequence of spreading activation within the conceptual level leading to the erroneous activation of a related concept (e.g., Fromkin, 1971). In the case of an idiom, in order for participants to produce a literal semantically related error (e.g., when cows fly—when pigs fly), the production system must allow for the activation of the individual component lemmas of the idiom (so that they can then spread activation to literal relations). Figurative errors, on the other hand, should occur both in TOT states and when the participant does not know the idiom and is simply guessing, as the figurative meaning is contained within the definition participants are presented with. With regard to the issue of decomposability, if a representational difference exists, with non-decomposable idioms behaving like single lexical items, we would expect to see a difference in either overall TOT rates or information available for the two types of idioms. On the basis of such a difference, we would also predict that participants would be more likely to make a literal error for a decomposable idiom due to the overlap of the idiomatic and literal meaning of the component words.

2. Method

2.1. Participants

One hundred and twenty-six participants (103 female) were recruited through online mailing lists (mean age = 28 years and 3 months, SD = 7.72). All participants were native English speakers and were screened for dyslexia.

2.2. Materials

Forty idioms and their definitions were rated by 20 native English speakers (who did not take part in the TOT study) on familiarity, the applicability of the definition to the phrase, and decomposability (on a three-point scale). Idioms were categorized as decomposable/non-decomposable if over 55% of raters rated them as belonging to these categories. From this pool, 30 experimental items were chosen (see Appendix A for the list of idioms used in each category; full rating information for each idiom is available online). Interrater reliability of decomposability ratings was calculated using Krippendorff's alpha (Hayes & Krippendorff, 2007) to determine consistency among raters. Reliability was found to be at a moderate level (α = 0.41).
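The categorization rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the label names, the function name `categorize`, and the example votes are hypothetical and are not taken from the study's rating data.

```python
from collections import Counter

THRESHOLD = 0.55  # a category needs more than 55% of raters to agree


def categorize(votes):
    """Return the label chosen by over 55% of raters, or None if no label qualifies."""
    counts = Counter(votes)
    label, n = counts.most_common(1)[0]
    return label if n / len(votes) > THRESHOLD else None


# Hypothetical votes from 20 raters for one idiom (illustrative, not real data).
example_votes = ["decomposable"] * 13 + ["non-decomposable"] * 5 + ["unsure"] * 2
print(categorize(example_votes))  # 13 of 20 raters (65%) agree
```

Under this reading of the rule, an idiom on which raters split evenly would receive no category and would simply not enter the experimental pool.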

2.3. Procedure

Participants were instructed that they would be shown a definition and asked to type the corresponding idiom. If participants indicated they were in a TOT state, they were told to type any words they could recall from the phrase. If they did not know the answer, they were instructed to guess any words they thought might be part of the idiom. The trial ended after 45 s and participants were shown the answer. To check participants were in a positive TOT state, they were asked to confirm they had been trying to think of the target phrase. If participants indicated they were thinking of a different phrase, they were given the opportunity to provide an alternative answer.

2.4. Data scoring and design

Responses were recorded as correct if the participant produced the target phrase. Participants were not penalized for omitting initial words from the idiom that did not change the meaning; for example, “fish out of water” was marked as correct for “like a fish out of water.” Non-target or missing responses were classified as incorrect. Participants were scored as being in a positive TOT (pTOT) state if they indicated the target phrase was the one they were trying to retrieve. When participants indicated that the phrase they were trying to recall was not the target, this was labeled a negative TOT (nTOT). Finally, a don't know (DK) state was scored when participants did not know the answer.
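The scoring rules above can be sketched as follows. This is a hedged illustration, assuming whitespace-separated typed responses; the function names, the `optional_initial` parameter, and the boolean flags are hypothetical conveniences rather than the scoring procedure actually used.

```python
def is_correct(response, target, optional_initial=()):
    """Score a response as correct if it matches the target phrase,
    allowing omission of initial words flagged as not changing the meaning."""
    resp = response.lower().split()
    tgt = target.lower().split()
    if resp == tgt:
        return True
    # Count how many leading target words are flagged as optional.
    n = 0
    while n < len(tgt) and tgt[n] in optional_initial:
        n += 1
    # Accept the target with any number (1..n) of those leading words dropped.
    return any(resp == tgt[k:] for k in range(1, n + 1))


def classify_state(reported_tot, confirmed_target):
    """Label a trial without a correct answer as pTOT, nTOT, or DK."""
    if not reported_tot:
        return "DK"
    return "pTOT" if confirmed_target else "nTOT"


print(is_correct("fish out of water", "like a fish out of water", ("like", "a")))
print(classify_state(reported_tot=True, confirmed_target=False))
```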

Errors were categorized as literal (e.g., butter for bring home the bacon), figurative (e.g., pretty for easy on the eye), or unrelated (e.g., false for have egg on your face) by two researchers independently. Unrelated errors were not included in any analyses. In five cases (2.3% of all errors) agreement could not be reached and these errors were classified as unrelated and not analyzed further.

In keeping with other TOT studies, the DK state provided a “baseline” measure with which to compare responses in the pTOT state (e.g., Biedermann, Ruh, Nickels, & Coltheart, 2008). Broadly speaking, if participants are able to recall more correct information in the pTOT state than in the DK state (where they are essentially guessing and should perform at chance), this suggests that they have access to some form of lexical representation regarding the idiom they are trying to produce.

3. Results

From a total of 3,780 responses there were 1,123 (29.71%) target responses, 1,366 (36.14%) non-target responses, 271 (7.17%) TOT states (99 pTOT, 172 nTOT), and 1,020 (26.98%) DK states. To ensure that the items were balanced in terms of difficulty, a Wilcoxon's signed ranks test compared the proportion of responses that were categorized as pTOT for decomposable (M = 0.03, range = 0.00–0.47) and non-decomposable idioms (M = 0.03, range = 0.00–0.67). There was no significant difference. The pattern of significance for by-subjects and by-items analyses was the same unless stated.

3.1. Target retrieval analyses

To test the hypothesis that the simple lemmas of an idiomatic phrase are active, analyses were conducted to determine if participants were significantly more likely to partially recall the idiom in a pTOT than a DK state (see Table 1). Two measures of partial recall were calculated. The first was a blunt measure that scored participants as having partial recall if they correctly recalled at least one of the words in the idiom. The second was the proportion of correctly recalled words (i.e., out of the total possible number of words for the idioms participants reported being in a pTOT or DK state for, what proportion were correctly recalled). As the pattern of results from both measures was identical, only the latter is reported. Wilcoxon's rank sum tests revealed that participants recalled a significantly higher proportion of words in the pTOT compared with the DK state (z = −5.698, p < .001, r = .43). This significance held for decomposable (z = −6.139, p < .001, r = .50) and non-decomposable idioms (z = −2.819, p < .005, r = .23); however, for non-decomposable idioms the by-items analysis was not significant, although this may reflect a lack of power due to only 15 cases present in the analysis. Participants reported a significantly greater proportion of the words for decomposable compared to non-decomposable idioms (z = −2.819, p < .005, r = .23), suggesting a greater availability of the individual components of a phrase for decomposable items when in a TOT state. The by-items analysis was approaching significance (z = 1.851, p = .064, r = .49). To alleviate concerns about the binary grouping of idioms into decomposable and non-decomposable, we conducted Spearman's rho correlations between degree of decomposability and the proportion of words correctly recalled and found a marginally significant correlation, r(30) = .35, p = .059 (that is, the more decomposable an idiom, the greater the proportion of words recalled), supporting the results of the dichotomous analyses (see Fig. 1).
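The two partial-recall measures described above can be sketched as follows. This is an illustrative reading, assuming that every word of the idiom (including function words such as "the") counts toward the total and that each target word can be matched at most once; the function names and example responses are hypothetical.

```python
def recalled_words(reported, target):
    """Count target words found among the reported words (each match used once)."""
    pool = reported.lower().split()
    hits = 0
    for word in target.lower().split():
        if word in pool:
            pool.remove(word)  # consume the match so repeats are not double counted
            hits += 1
    return hits


def has_partial_recall(reported, target):
    """Blunt measure: at least one word of the idiom was correctly recalled."""
    return recalled_words(reported, target) > 0


def proportion_recalled(trials):
    """Proportion measure: recalled words over all possible words across trials."""
    total_words = sum(len(target.split()) for _, target in trials)
    total_hits = sum(recalled_words(reported, target) for reported, target in trials)
    return total_hits / total_words


# Hypothetical (reported, target) pairs for one participant in one state.
trials = [("nail head", "hit the nail on the head"), ("", "the hair of the dog")]
print(has_partial_recall(*trials[0]), proportion_recalled(trials))
```

Pooling words across trials before dividing is one possible choice; averaging per-trial proportions would be another, and the text does not say which was used.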

Table 1. Mean proportion of correctly recalled words by idiom type. Total number of responses is provided in parentheses.

State   All idioms   Range       DC          Range       NDC         Range
pTOT    0.14 (59)    0.00–0.68   0.16 (42)   0.00–0.68   0.07 (17)   0.00–0.38
DK      0.01 (31)    0.00–0.25   0.01 (16)   0.00–0.22   0.01 (15)   0.00–0.25

Note. DC, decomposable idioms; DK, don't know state; NDC, non-decomposable idioms; pTOT, positive tip-of-the-tongue state.

Figure 1. Scatterplot showing correlation between decomposability ratings and the proportion of words recalled while in a pTOT.

3.2. Error analyses

We then investigated whether participants were more likely to make a figurative or a literal error in each state (see Fig. 2). Figurative errors should be common in both states due to participants using the information given in the definition to guess words related to the overall meaning of the idiom. However, if the simple lemmas of the idiom are not activated, then literal errors should occur no more often than chance, that is, when participants are simply guessing in the DK state. Wilcoxon's rank sum tests revealed that participants were significantly more likely to make a literal error in a pTOT state than a DK state across all idioms (z = −4.122, p < .001, r = .56). This significance held for decomposable idioms (z = −3.468, p < .001, r = .59) and non-decomposable idioms (z = −2.869, p < .005, r = .47), although the by-items analysis for non-decomposable idioms was nonsignificant (p = .09). Across all idioms participants were significantly less likely to make a figurative error in a pTOT state than a DK state (z = −3.023, p < .005, r = .41). This significance held for non-decomposable idioms (z = −2.308, p < .05, r = .37); however, there was no significant difference for decomposable idioms. A comparison of decomposable versus non-decomposable idioms revealed no significant difference in the number of literal or figurative errors made when in a pTOT state. Spearman's rho correlations were also performed between degree of decomposability and the number of literal and figurative errors. No significant relationships were found.
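For readers unfamiliar with the statistics reported for these comparisons, the following plain-Python sketch shows one common way a rank sum z score and an effect size are computed. It is an assumption-laden illustration: it uses the normal approximation without a tie correction, and it takes the effect size to be z divided by the square root of the total number of observations, which is a widely used convention but is not stated in the text. The function names and example data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_z(group_a, group_b):
    """Return (z, r) for a rank sum comparison of two independent groups."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    w = sum(ranks[:n1])                              # rank sum of the first group
    mean_w = n1 * (n1 + n2 + 1) / 2                  # expected rank sum under H0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # no tie correction applied
    z = (w - mean_w) / sd_w
    r = z / math.sqrt(n1 + n2)                       # assumed effect size convention
    return z, r


# Hypothetical per-participant error proportions (illustrative, not real data).
z, r = rank_sum_z([0.1, 0.2, 0.3], [0.4, 0.5, 0.6])
print(round(z, 3), round(r, 3))
```

The sign of z depends only on which group is entered first, so a negative value here does not by itself indicate the direction of an effect.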

Figure 2. Proportion of literal and figurative errors produced by participants for pTOT and DK states by idiom type. Raw numbers of errors are given below the bars. Error bars represent standard error. Note. DC, decomposable idioms; NDC, non-decomposable idioms.

4. Discussion

In the current study, people in a TOT state were able to recall component words of an idiom more accurately than when they were guessing. In addition, when they could not correctly report individual words, the words that they reported instead were more likely to be related to the literal meaning of the idiom (e.g., “over” for “water under the bridge”) than the figurative meaning (e.g., “chip in” for “all hands on deck”). This was in contrast to trials where they were merely guessing the idiom, in which case the words were more likely to be figuratively related than literally related. Again, this suggests that people were basing their guesses on some lexical representation of the idiom rather than making an educated guess based on its definition. It also provides converging evidence for the conclusion that both literal and figurative meanings are active during idiom production (e.g., Cutting & Bock, 1997; Sprenger et al., 2006).

There was no effect of decomposability on the overall incidence of TOT states. However, participants recalled a greater proportion of words for decomposable than non-decomposable idioms, a finding that was consistent across the dichotomous comparison of idiom types and the correlation analysis. This stands at odds with previous findings that decomposable and non-decomposable idioms are processed in the same way for production (Cutting & Bock, 1997; cf. Konopka & Bock, 2009). Sprenger et al. (2006) reported some evidence for processing differences arising from decomposability; in a series of post hoc analyses they found that some variance in their data could be accounted for by decomposability (Experiments 2 and 3, see Appendix A of Sprenger et al.). However, they argued that the design of their experiments did not allow them to draw any conclusions as to what this meant, and the pattern was not consistent in nature. We believe that our findings add weight to this tentative evidence that decomposability affects production.

We see at least two ways of accounting for the effect of decomposability. One interpretation is that decomposable and non-decomposable idioms are represented differently at the lexical level; the fact that participants were able to report more information for decomposable than non-decomposable idioms might suggest that access to non-decomposable idioms is more “all-or-nothing” than access to decomposable idioms. This would fit with the view that non-decomposable idioms are particularly “word-like” in their representation. However, the view of idioms as single lexical representations does not fit with the evidence that both literal and figurative meanings are active during production, or with evidence that idioms have some syntactic flexibility given the right pragmatic circumstances.

We prefer to interpret our findings as reflecting the processes of lexical activation during production. We interpret this in terms of the superlemma model but note that an analogous account would be consistent with Tabossi et al.'s (2009) configural node hypothesis. To take the example of a decomposable idiom first, a non-lexical concept is formed that a speaker wishes to convey, for example, “having the advantages of two things” (see Fig. 3A). This abstract concept then activates the superlemma corresponding to the best of both worlds as well as other semantic associates such as good, best, two, both. The superlemma sends activation to the individual lemmas of the phrase; these receive a boost, as they have already received activation from the lexical-conceptual level. In a TOT context, this additional activation increases the likelihood that individual lemmas will be selected (hence the higher proportion of component words reported for decomposable idioms). In contrast, the component lemmas of a non-decomposable idiom (e.g., kick the bucket) will receive activation from the superlemma, but not from the conceptual level; death is not semantically related to kick or bucket (see Fig. 3B). In a TOT state, where the activation of individual components has failed, this means component lemmas are less likely to be selected because they have not had an additional activation boost from the conceptual level.
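The activation account above can be made concrete with a toy calculation. This is only a sketch of the verbal argument, not a fitted or published model: the activation values, the additive combination rule, the word lists, and the function name are all hypothetical choices made for illustration.

```python
def lemma_activations(component_words, concept_associates,
                      superlemma_input=0.4, concept_input=0.3):
    """Toy activation for each component lemma of an idiom (arbitrary units).

    Every component receives input from the superlemma; components that are
    also semantic associates of the intended concept receive an extra boost.
    """
    return {
        word: superlemma_input + (concept_input if word in concept_associates else 0.0)
        for word in component_words
    }


# Decomposable: some components overlap with associates of the concept.
dc = lemma_activations(["best", "both", "worlds"], {"good", "best", "two", "both"})
# Non-decomposable: no component overlaps with associates of the concept.
ndc = lemma_activations(["kick", "bucket"], {"die", "death"})

print(dc)
print(ndc)
```

On these assumed values, "best" and "both" end up more active than either component of kick the bucket, which is the asymmetry the account relies on to explain the higher proportion of words recalled for decomposable idioms.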

Figure 3. Representation of the flow of activation for (A) a decomposable idiom and (B) a non-decomposable idiom.

An anonymous reviewer noted that some of the items included in the present study may be more readily classified as proverbs rather than idioms, for example, Rome wasn't built in a day. We would argue that those items are idioms that can be produced in a proverbial form; however, this does not mean that they are not still idiomatic expressions. An idiom is defined as “A group of words whose meaning is different from the meaning of the individual words” (Hornby, 2000, p. 643). By this definition only those items that are non-decomposable fall into this category, as there are many decomposable idioms for which the meaning can be deduced, for example, “like a fish out of water.” Given that idioms and proverbs can be assumed to be a part of the same spectrum of fixed expressions, there is no reason to believe that idioms would be represented in a different form to proverbs, regardless of whether this form includes a superlemma level.

Finally, it is worth noting that the accounts of idiom production used as the basis for the current work, and consequently the conclusions drawn here, presume a rich view of lexical representation in which such representations contain a great deal of semantic and contextual information. Elman (2009) has proposed a model in which there is no lexicon in the traditional sense but rather a grammar on which words operate. Only certain linguistic information is encoded in the lexical representations, with semantic, pragmatic, and contextual knowledge accessed from other modular sources. Such a view poses problems for the current account of idiom representation due to the connections that we suggest exist between the literal and figurative meanings of an idiomatic expression. It is outside of the scope of the current work to probe these issues in any meaningful way; however, the work of Elman is a potentially interesting avenue for future research into idiom production.

It is important to emphasize that neither idioms nor the wider class of fixed expressions represents a special case in language production; they are an intrinsic part of our language use. As outlined in the Introduction, there has been comparatively little research thus far into the production of fixed expressions. Our findings are consistent with those of previous studies which found that both literal and figurative meanings are activated during production of idiomatic expressions. Interestingly, however, they differ from previous findings in that the production processes underlying decomposable and non-decomposable idioms do appear to differ. Although decomposable and non-decomposable idioms may be represented the same way as they enter the production process (Cutting & Bock, 1997), it appears that they are different at least so far as the pattern of activation for individual component lemmas is concerned.


We wish to thank Lisa DeBruine for coding and hosting the online experiment and Derek Murphy who contributed significantly to the creation of stimuli lists and data scoring.

Appendix A: Idioms by category

A leopard can't change his spots    A piece of cake
Actions speak louder than words     A storm in a teacup
All hands on deck                   Be on cloud nine
All that glitters is not gold       Have an axe to grind
Be ahead of the game                Have egg on your face
Blood is thicker than water         Hit the nail on the head
Bring home the bacon                Like a fish out of water
Don't mince your words              Look what the cat dragged in
Easy on the eye                     Pull your socks up
In the heat of the moment           Rome wasn't built in a day
It takes two to tango               Take the bull by the horns
Money doesn't grow on trees         The cat's out of the bag
Run out of steam                    The hair of the dog
Saved by the bell                   Water under the bridge
The best of both worlds             Wet behind the ears