What Have You Done for Me Lately? Release of Information and Strategic Manipulation of Memories*


* I thank Ben Polak for his invaluable guidance and support throughout all phases of this project. I am also thankful to Stephen Morris, David Pearce, an editor and two anonymous referees for their comments and suggestions. I have also benefited from conversations with Michael Boozer, Ettore Damiano, Nick de Roos, Jason Draho, Ricky Lam and Mario Simon.


How should a rational agent (politician/employee/advertiser) release information in order to manipulate the memory imperfections of his forgetful assessor (electorate/supervisor/consumer)? This article attempts to answer this question using a memory model based on the principles of recency, similarity and repetition. I show that the problem of a rational agent who releases information to a forgetful assessor can be modelled as a standard dynamic optimisation problem and I describe the properties of the optimal profile for releasing information. The theoretical results are applicable in a wide range of social and economic contexts, such as political campaigns, employee performance evaluations and advertising strategies.

How should a rational agent release information in order to manipulate the memory imperfections of his forgetful assessor? This question arises in a wide range of social and economic interactions, where agents are rewarded at some critical date on the basis of an assessment of their past performance. Often in such cases, an objective criterion that summarises past performance is not available and, as a result, the agents that provide the assessment (referred to as the assessors) have to rely on their memories of past events. For example, the election prospects of an incumbent politician depend on the electorate's memories on election day. In a labour market setting, promotion and bonus decisions depend on a supervisor's memories of an employee's past accomplishments. Similarly, the success of a new film on its opening weekend depends on the memorability of the advertising campaign prior to the film's release.

In this article I first borrow from the work of Mullainathan (2002) to model the assessors’ memory technology. I then explore the model's economic implications by addressing the issue of how a rational agent (politician/employee/advertiser) should time a sequence of informative events in order to manipulate the memories of his forgetful assessor (electorate/supervisor/consumer). I show that the agent's problem can be modelled as a standard dynamic optimisation problem, I describe the properties of the optimal profile for releasing information and I discuss the implications in the context of various applications. The main contribution of the article is, therefore, to show how a memory model similar to Mullainathan's (2002) can be fruitfully incorporated into standard decision problems to generate plausible and intuitive results that are relevant to a range of applications.1

To fix ideas, let us suppose that the agent is a politician facing an election at date T. During the period leading up to the election, the electorate observes events, both good and bad, that pertain to the politician. Events occur stochastically, but the politician also has some newsworthy items (referred to as successes) that he can release at his discretion. He can choose, for example, when to announce his commitment to cut taxes and when to schedule a TV appearance that will generate positive publicity. His objective is to maximise the memories that the electorate will have at election time, subject to the electorate's memory technology.

Following Mullainathan (2002) I assume that the electorate's memory is governed by the principles of recency, cue dependence and rehearsal. Recency refers to the obvious fact that time erodes memories. Hence, all else being equal, more recent events are more memorable. Cue dependence refers to the property that current events may trigger memories of similar past events. When the memory of an event is triggered by a similar current event, rehearsal kicks in by increasing the strength of the original (just triggered) memory.

In the context of the political application these principles give rise to two opposing forces. On the one hand, recency implies that the politician should release his successes close to the election date (recency effect). On the other hand, the combination of cue dependence and rehearsal implies that current information releases do not only create new memories for the future, but they can also make similar past memories more memorable. To use the economics jargon, there is a complementarity between current and past information releases, similar to the complementarity between current and past consumption in models of addiction or habit formation. From this point of view, the politician should release his successes, when these successes can most effectively rehearse past successes (rehearsal effect). If, for example, the politician has recently accumulated some favourable publicity due to exogenous stochastic events, the rehearsal effect will be maximised if the politician releases his successes as early as possible, while the existing memories of past successes are still fresh. Optimal decisions are dictated by a trade-off between the recency and the rehearsal effects.

The formal model considers the case where the politician has a fixed budget of newsworthy items to release. In the benchmark model, all of these newsworthy items and all stochastic events are favourable (good news) to the politician. It is shown that the optimal rule for when to release information has a simple form. Loosely speaking, we can think of a streak of events as creating a rehearsal stock. When this stock is above a certain threshold level, it triggers the agent to release more successes. This way new successes reinforce the memories of past successes while these memories are still fresh. Moreover, I show that in the optimal profile successes should be bunched together in consecutive periods. This implies that once the agent starts releasing newsworthy items, he continues doing so until he exhausts his supply of such items.

These results are relevant for political campaigns, advertising strategies and evaluations at the workplace, to mention a few. A politician, for example, should release favourable events immediately following a streak of favourable press releases (threshold result). An advertiser should schedule commercial spots addressing different attributes of the same brand back to back in consecutive time slots (bunching result).

The bunching result has some support in the experimental evidence found in the marketing literature. A series of experiments has addressed the issue of whether memory is superior for bunched rather than for spaced presentations of similar or identical information.2 Janiszewski (2002) shows that, for presentations of similar information (e.g. different commercial spots each addressing a different attribute of the same product), recall is superior when the target stimulus is presented in a bunched format.

I also extend the model to allow for the presence of bad news (failures), i.e. events which are harmful to the agent's image. The earlier results continue to hold true, but some new insights also emerge. Now, there is also a rehearsal stock for failures and this means that successes serve two purposes. They rehearse past successes, as before, but they also guarantee that past failures will not be reinforced by a potential failure. The higher the rehearsal stock for failures, the more eager will the politician be to avoid a failure and, hence, the more eager will he be to deplete his budget and release a success. In the special case where all events are good or bad, i.e. there are no neutral events, I show that the benefit from avoiding a failure is symmetric to the benefit from releasing a success and optimal decisions depend only on the sum of the two (one for successes, one for failures) rehearsal stocks.

1. Related Literature

In this article, I build on two (related) strands of existing economic literature, namely behavioural economics and memory imperfections.

Behavioural economics is the vein of economic research that injects psychological insights on human behaviour and cognition into existing economic models, in an attempt to improve their descriptive power. Rabin (1998) outlines the general framework of this research programme. What, I believe, differentiates this article from the existing literature in behavioural economics is my focus on the strategic considerations of bounded rationality. Such considerations arise when a fully rational agent recognises a cognitive limitation or bias, from which others suffer, and he now faces the problem of how to modify his actions in order to manipulate it for his own benefit. In this article, for example, a politician recognises that the electorate suffers from imperfect memory that obeys some principles and he times the sequence of events in an effort to manipulate the memories that the electorate will have at election time. From this point of view, the spirit of this article is similar to O'Donoghue and Rabin (1999). They consider the problem faced by a principal in a moral hazard setting who, having recognised that the agent suffers from time inconsistency, has to identify the appropriate contract that will prevent the agent from procrastinating. Della Vigna and Malmendier (2004) and Sarafidis (2004) also look at similar problems, in asking how rational firms can exploit (or react to) consumers with time inconsistent preferences.

Imperfect memory has been studied in the context of traditional game theory, giving rise to ‘games of imperfect recall’ as in Piccione and Rubinstein (1997). In the same vein, Bernheim and Thomadsen (2005) look at imperfect memory in conjunction with anticipatory feelings. Other implications of imperfect memory for economic decisions, outside the standard game theoretic framework, have been explored in Dow (1991) and Hirshleifer and Welch (2002). Also, as I have already discussed, this article borrows from the work of Mullainathan (2002), who first developed a memory model grounded in the principles of recency, similarity and repetition. In Mullainathan's work, however, the focus is on an agent's own memory imperfections and on how these can explain often-observed decision-making biases or empirical puzzles (inertia, the curse of knowledge and over/under reaction in financial markets). By contrast, I focus on the decision problem of a rational agent who reacts to someone else's memory imperfection.

On the modelling level, the complementarity between present and past information releases gives rise to a payoff function that is reminiscent of addiction, habit formation and growth with endogenous preferences. See Becker and Murphy (1988), Constantinides (1990) and Ryder and Heal (1973), respectively.

Finally, the effect of spacing on memories has been extensively investigated experimentally in the marketing and the cognitive psychology literatures. Some important contributions in this field are Greene (1989; 1990), Hall (1992), Malaviya and Sternthal (1997) and Janiszewski (2002). I hope that this article provides a theoretical contribution in modelling formally this strand of experimental evidence.

The rest of the article proceeds as follows. Section 2 develops the model and derives its key properties. Section 3 extends the benchmark model to allow for bad news (failures). Section 4 discusses the results in the context of applications and presents some supporting experimental evidence. An Appendix contains the mathematical proofs of the results.

2. The Model

To fix ideas, we cast the model in the context of a politician facing re-election at some future date T. The model can also be interpreted in the context of other situations where people or products go through periodic assessments, such as in employee performance evaluations and advertising campaigns.

Time is discrete and indexed by t = 1, 2,…,T. In each period t an event, et, may be realised, in which case we refer to it as a success and we write et = 1. This is in contrast to the case where the event is not realised, and et = 0. Thus, we have et ∈ {0,1}. In our political application each success should be interpreted as a piece of information that the electorate (assessor) observes and pertains to our politician's (agent's) ability or popularity. These could be, for example, how his negotiation skills helped to avoid an airline strike at Christmas, or a burst of publicity due to his instrumental role in passing a popular piece of legislation. We will assume that all successes are of equal importance and, for the moment, only beneficial to the politician's image (hence the term success). Therefore, re-election is more probable the more successes the forgetful electorate remembers at the election date T.

To capture the fact that recall is imperfect, we assume that the electorate remembers a period-i success at election date only with some probability, referred to as the recall probability of success i. This probability depends on the memorability (to be interpreted as vividness or ease of recollection) of this period-i success at election date.

To model how the memorability of each success is determined, we draw on some stylised facts about memory. These stylised facts are supported by an array of experimental evidence from psychology and neuroscience, as well as casual observation and introspection.

2.1. Some Stylised Facts on Memory

The model is built on three stylised facts, which we will refer to as recency, rehearsal and cue dependence. The credit for bringing these concepts into economics, and building an economically relevant model of memory based on them, should go to Mullainathan (2002). Below, we briefly discuss each of these stylised facts, referring the reader to Mullainathan (2002) for a more thorough justification and a discussion of the experimental evidence in favour of them. The reader is also referred to Schacter (1996) and Parkin (1993).

Recency states that time erodes memories. This idea is indisputable, supported by casual observation, introspection and experimental evidence dating back to the late nineteenth century.

Rehearsal states that accessing the memory of an event facilitates its subsequent recall. We all exploit this idea in our daily lives when we repeat a telephone number in order not to forget it. The effect of rehearsal on subsequent recall can be verified in the following experiment. Subjects are presented with a list of words to be remembered (one word at a time appears on a computer screen for a few seconds). When the position of the word in the list is plotted against the probability of recall, one obtains a U-shaped curve, the so-called ‘Serial Position Curve’. As expected, words at the end of the list have high recall probabilities (recency effect), but quite surprisingly words in the beginning of the list have higher recall probabilities than words in the middle (primacy effect). Psychologists attribute the primacy effect to the fact that words in the beginning of the list are rehearsed more than subsequent words. This hypothesis was examined by Rundus (1971) who required his subjects to rehearse words out loud. When plotting the serial position against the average number of rehearsals he obtained a downward sloping curve, supporting the hypothesis that the primacy effect is indeed due to rehearsal.

Cue dependence states that current events function as cues that can trigger similar past memories. Watching a Brady Bunch episode (or similar old TV programme), for example, can be very effective in stimulating the recall of long buried childhood memories. The idea that memory is a cue-dependent process was first demonstrated experimentally by Tulving and Osler (1968). In a related study, Tulving and Thomson (1973) proposed that the cue-dependent property of memory is due to the encoding specificity principle. This states that the way we approach and encode an event when it occurs determines what type of cue will be effective in facilitating its subsequent recall. When there is a high degree of similarity and overlap between the way we encoded an experience (or an event) and the present environment, then recall will be superior.

Before we proceed, it is also important to recognise the limitations of a memory model built exclusively on the principles of recency, rehearsal and cue dependence. Psychological research with enormous judicial and social repercussions reveals that in some (usually pathological) cases people ‘remember’ events that never occurred; see Loftus (1993). Along the same lines, there is strong evidence to suggest that the way we approach the past may have an effect on what memory we derive from a certain event.3 To explain such findings, some psychologists theorise that memory relies heavily on reconstruction. In other words, memories, instead of being actual snapshots of reality, are only fragments of the past. These fragments interact with previous knowledge and elements of the present, such as our desires, biases and motives, to produce the experience that we call a ‘memory’. These elements are absent from my model.

2.2. The Memory Technology

Let M_t^i denote the memorability of the event e_i = 1 at time t. Since an event can have meaningful memorability only after it has occurred, I let M_t^i = 0, for all t < i, and normalise M_i^i = 1. I assume that the recall probability of success i at time T is proportional to the memorability of success i at time T, M_T^i. If a period-i success is not recalled at election date, the electorate uses the event e_i = 0 as the default memory.

Two comments are in order. First, in modelling the memory technology in this way, I implicitly assume that the electorate can only forget successes that actually happened and does not invent false ‘memories’ of successes that never happened. Second, in using ei = 0 as the default memory when a period i success is forgotten, I interpret the event ei = 0 as a non-event, i.e. if ei = 0 then nothing happened in that period. I later extend the model to the case where the event ei = 0 is interpreted by the electorate as a failure.

I incorporate recency into the model by assuming that the memorabilities, M_t^i, decay exponentially over time at a constant rate of (1−ρ). To incorporate cue dependence I assume that a t-period success triggers the memories of past successes. For example, reading a newspaper article that paints a favourable picture of the politician will trigger other past good memories pertaining to him. As these memories get triggered, rehearsal kicks in by making these (triggered) memories of past successes more memorable. This is built into the model by introducing a parameter κ, such that a success at time t boosts the memorability of a past success from period i < t by κ^(t−i). In other words, the boost to the memorability of a past event depends on the time that has elapsed between the event itself and the later similar event that rehearses it. It is also natural to assume that the increment to the memorability of a t-period success from a success in period t + 1 is less than its current memorability. (Notice that at time t + 1 the memorability of a t-period success is ρ.) Thus, I let κ < ρ. The importance of this assumption will become apparent later.

As long as t > i, all this can be neatly summarised in the following equation for a period-i success (e_i = 1):

M_t^i = ρ·M_{t−1}^i + κ^(t−i)·I{e_t = 1},    (1)

where I is the usual indicator function, i.e. I{e_t = 1} = 1 if e_t = 1 and 0 otherwise.

The number of successes that a forgetful assessor actually remembers at the critical date T will be a random variable, with upper bound the actual number of successes, Σ_{i=1}^T e_i. If the probabilities of recall are proportional4 to the memorabilities at time T, then the expected number of successes that the assessor recalls at time T is proportional to the sum of all memorabilities at time T, given by A_T = Σ_{i=1}^T M_T^i. Using (1) together with the initial condition M_i^i = I{e_i = 1} and summing over i, we can work out the evolution of this sum over time:

A_t = ρ·A_{t−1} + I{e_t = 1}·[1 + Σ_{i<t} κ^(t−i)·I{e_i = 1}].
If we define A_t ≡ Σ_{i=1}^t M_t^i and S_t ≡ Σ_{i=1}^t κ^(t−i)·I{e_i = 1}, then we can alternatively express the preceding equality as:

A_t = ρ·A_{t−1} + I{e_t = 1}·(1 + κ·S_{t−1})    (2)

S_t = κ·S_{t−1} + I{e_t = 1}.    (3)
Casting the model in terms of this new pair of equations allows for an intuitive interpretation. Memory decay, embedded in the parameter ρ, can be thought of as a discount rate on the past rather than on the future. Then, one could think of A_t as a running total stock variable (or stock of goodwill) that measures the number of successes that have occurred up to time t, with past successes being ‘discounted’ at rate (1−ρ). When a new success is realised the stock A_t is increased, but the incremental effect of a new success has two components: a direct effect (the plus 1 term), originating simply from the fact that a new success has been realised, and an indirect effect (the plus κ·S_{t−1} term), coming from the fact that the new success triggers memories of past successes, which now become more memorable.

Our reduced-form equations (2) and (3) give a convenient way to record this indirect effect. One might think that the indirect effect of a new success would depend on the exact sequence of zeros and ones that have occurred up to that point. In fact, however, the variable S_t acts as a summary statistic for the sequence. The fact that at each point in time we can summarise the past in a single variable, rather than having to carry over the whole vector of past realisations, is clearly an attractive feature of the model. Thus, we can think of S_t as a ‘rehearsal stock’. Just as the running total stock A_t decays at rate (1−ρ), the rehearsal stock S_t decays at rate (1−κ). Our earlier assumption that κ < ρ implies that the rehearsal stock S_t decays at a faster rate than the running total stock A_t. Also, for future reference, notice that the rehearsal stock S_t is a sum of powers of κ: the nth power of κ is present in S_t if, and only if, a success occurred in period t − n.

The existence of a rehearsal stock St may rehearse a memory for readers familiar with models of addiction as in Becker and Murphy (1988), growth with endogenous preferences as in Ryder and Heal (1973), or Constantinides’ habit formation (1990). In this line of work, the marginal utility of consumption depends on past levels of consumption. Similarly, in our model the effect of a new success depends on the stock of past memories summarised by St. The more vivid past memories are, i.e. the higher the rehearsal stock St is, the more effective a new success becomes in triggering them.
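To make the recursions concrete, here is a minimal sketch of equations (2) and (3) in Python. The function name and the parameter values are illustrative choices of my own, not from the article:

```python
# A minimal sketch of the memory recursions (2)-(3); parameter values and
# function names are illustrative assumptions, not from the article.

def final_memorability(events, rho=0.9, kappa=0.3, A0=0.0, S0=0.0):
    """Return A_T, the aggregate memorability at the end of the event sequence."""
    A, S = A0, S0
    for e in events:                        # e = 1 (success) or 0 (non-event)
        A = rho * A + e * (1 + kappa * S)   # decay, plus direct and indirect effects
        S = kappa * S + e                   # rehearsal stock: a sum of powers of kappa
    return A

# Two successes bunched at the end beat the same two successes spread apart
# (here recency and rehearsal both favour the bunched sequence):
bunched = [0, 0, 0, 1, 1]   # successes in periods 4 and 5
spread  = [0, 1, 0, 0, 1]   # successes in periods 2 and 5
print(final_memorability(bunched) > final_memorability(spread))  # True
```

Note how the loop carries only the pair (A, S), never the whole history of events: S is the summary statistic discussed above.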

2.3. Optimal Profiles

Suppose now that our politician can exert some influence over the stochastic process that generates the realisations of the events et by making a choice, ct. Let this choice ct be between acting or waiting, so that ct ∈ {Act,Wait}. In case the politician acts, he secures a success in that period, whereas if he waits a success occurs stochastically with probability p < 1.

In addition, suppose that the politician is constrained by Σ_{t=1}^T I{c_t = Act} ≤ n, that is, he can act for at most n periods. Such a constraint could reflect the fact that he is ‘sitting’ on n events which he can release at his discretion. Say, for example, that he can choose when to schedule two appearances on two different evening talk shows that will generate a lot of publicity. Alternatively, the variable n could be interpreted as a stock of effort, a stock of energy, or an advertising budget that is depleted every time the agent acts in order to increase the probability of obtaining a success. What is then the optimal choice profile over time?

For a risk-neutral agent the optimal choice profile is the one that maximises the expectation of A_T, the aggregate memorability of past successes at time T. More formally, the agent solves the following programme:

max_{c_1,…,c_T} E[A_T]   subject to (2), (3) and Σ_{t=1}^T I{c_t = Act} ≤ n.
The optimal profile also depends on the initial condition S0, but not on the initial condition A0. The initial condition S0 may be greater than zero if, at the time the agent first encounters the problem, there already exists a relevant history of events.

It should be stressed that when we let the agent maximise the stock AT, we (and the agent) implicitly assume that the assessor is not just forgetful, but also naive in the sense that he does not try to correct his memory imperfection.5 I believe that this is an innocuous assumption in a model, like mine, where the value of positive memories comes from creating goodwill. In a different model, where memories are interpreted as different data points that the assessor uses to update his prior beliefs, the assumption of naivety may be more debatable, but still a reasonable starting point.

I can cast the agent's problem in terms of a Bellman equation by letting V(S,n,τ) be the expected continuation payoff, assuming that choices are made in an optimal way. There are three state variables: (S) the rehearsal stock which summarises the past history of events, (n) the number of periods that the agent can act before he exhausts his budget, and (τ) the number of periods left until the end. Notice that the running total stock variable A is not a state variable, since the rehearsal stock S carries all past information which is relevant for optimal decisions. Then, at time t = 0 the agent faces:

V(S, n, τ) = max{V^Act(S, n, τ), V^Wait(S, n, τ)}, where

V^Act(S, n, τ) = ρ^(τ−1)·(1 + κS) + V(1 + κS, n − 1, τ − 1),

V^Wait(S, n, τ) = p·[ρ^(τ−1)·(1 + κS) + V(1 + κS, n, τ − 1)] + (1 − p)·V(κS, n, τ − 1),

with the terminal condition V(S, n, 0) = 0.
Notice that if n = 0, acting is not an option and, hence, V^Act(S, n, τ) is defined only as long as n > 0.

When κ < ρ there exists a trade-off between acting in some period t and acting in a subsequent period, say t + k. Acting in period t + k is preferred from a recency point of view, as successes closer to period T are more likely to be remembered. However, acting in period t is preferred from a rehearsal point of view, as early successes are more powerful in reinforcing fresher previous memories. The following example clarifies this observation.

Example 1. Let T = 2, n = 1, p = 0 and S_0 = S̄ > 0, where S̄ denotes the rehearsal stock entering the decision problem. Here, the agent has one available action with two remaining periods. He must choose between acting early, in the penultimate period T − 1, or waiting and acting in the last period, T. If he acts early the resulting profile will be (c_{T−1}, c_T) = (Act, Wait). If he waits the resulting profile will be (c_{T−1}, c_T) = (Wait, Act). The respective continuation payoffs from these profiles are:

V^early = ρ·(1 + κ·S̄)   and   V^late = 1 + κ²·S̄.
Clearly, the recency effect (ρ < 1) favours waiting and acting in the last period. The rehearsal effect, however, favours acting early. An early success adds to the continuation payoff an indirect effect of κ·S̄ in period T − 1, which then decays for one period, contributing a total of ρ·κ·S̄. In contrast, a late success adds only κ²·S̄, because by period T the rehearsal stock has decayed to κ·S̄. If S̄ is above the threshold of (1−ρ)/[κ(ρ−κ)], then acting early is optimal.
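The example's arithmetic can be checked directly. The parameter values below (ρ = 0.95, κ = 0.3) are my own illustrative choices:

```python
# Numeric check of Example 1; rho and kappa values are illustrative assumptions.
rho, kappa = 0.95, 0.3
threshold = (1 - rho) / (kappa * (rho - kappa))   # (1 - rho)/[kappa(rho - kappa)] ~ 0.256
for S_bar in (0.2, 0.3):                          # one value below, one above the threshold
    early = rho * (1 + kappa * S_bar)             # act at T-1: the boost decays one period
    late = 1 + kappa ** 2 * S_bar                 # act at T: the stock has decayed to kappa*S_bar
    print(S_bar, early > late)                    # False at 0.2, True at 0.3
```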

The logic behind this example can be generalised to the multiple period, multiple action, stochastic case (p > 0), yielding a simple rule for choosing optimally. Optimal choices are driven by the trade-off between the recency and the rehearsal effects. When the rehearsal stock is above a certain threshold, the rehearsal effect dominates over the recency effect and acting is preferred to waiting. The critical thresholds that trigger the agent to act depend on τ, the number of periods to go, and n, the number of available actions. Moreover, when there are fewer periods to go, the recency effect is less pronounced and a lower rehearsal stock is needed to induce acting early.

Proposition 1. (Thresholds) With τ periods to go and n actions still available (with n < τ), there exists a threshold rehearsal stock, H(τ, n|ρ, κ), such that it is optimal to act if and only if S ≥ H(τ, n|ρ, κ). Moreover, the thresholds are increasing in τ; that is, the fewer the periods to go, the lower the rehearsal stock need be before it triggers the agent to act.

In terms of the motivating example, this result implies that the politician should act and release more successes immediately following a streak of favourable press releases. This way he will be able to reinforce the memories of these favourable press releases in the minds of the electorate.
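The threshold rule can be illustrated by solving the Bellman equation by backward induction. The sketch below is my reconstruction of the model's recursion (function names and parameter values are my own); it assumes a success with τ periods to go contributes ρ^(τ−1)·(1 + κS) to the final stock A_T:

```python
# Backward-induction sketch of the agent's problem; my reconstruction of the
# Bellman equation, not code from the article.
from functools import lru_cache

def make_model(rho, kappa, p):
    @lru_cache(maxsize=None)
    def V(S, n, tau):
        """Expected continuation payoff with rehearsal stock S, n actions, tau periods."""
        if tau == 0:
            return 0.0
        gain = rho ** (tau - 1) * (1 + kappa * S)      # payoff of a success this period
        wait = p * (gain + V(kappa * S + 1, n, tau - 1)) \
             + (1 - p) * V(kappa * S, n, tau - 1)
        if n == 0:
            return wait
        act = gain + V(kappa * S + 1, n - 1, tau - 1)
        return max(act, wait)

    def act_is_optimal(S, n, tau):
        gain = rho ** (tau - 1) * (1 + kappa * S)
        act = gain + V(kappa * S + 1, n - 1, tau - 1)
        wait = p * (gain + V(kappa * S + 1, n, tau - 1)) \
             + (1 - p) * V(kappa * S, n, tau - 1)
        return act >= wait

    return V, act_is_optimal

# With p = 0 this reproduces Example 1's threshold of (1-rho)/[kappa(rho-kappa)] ~ 0.256:
V, act = make_model(rho=0.95, kappa=0.3, p=0.0)
print(act(0.2, 1, 2), act(0.3, 1, 2))  # False True
```

Raising τ while holding S and n fixed makes waiting relatively more attractive, which is the "thresholds increasing in τ" part of Proposition 1.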

For the subsequent analysis, the following observation will be useful.

Lemma 1. (Acting increases the stock of rehearsal) For any κ < 1, we have 1 + κS > S.

In other words, if the agent acts today, tomorrow's rehearsal stock will necessarily be higher despite the decay in the rehearsal stock. To see this, recall that S is a sum of powers of the parameter κ and, hence, it is strictly less than 1/(1−κ). The desired result follows by rearranging the inequality S < 1/(1−κ).

I can now show that the optimal profile will have the following structure. It will never be optimal to act in period t, then wait for k periods and then act again in period t + k + 1. Instead, in the optimal profile, the periods that the agent chooses to Act are bunched together one after the other. Therefore, the optimal rule has the following simple form: Wait until the rehearsal stock reaches some threshold level (Proposition 1), and then keep on acting until you exhaust the supply of available actions.

Proposition 2. (Bunching) If it is optimal to act in period t, it is also optimal to act in period t + 1. More formally, for any level of rehearsal stock S, number of available actions n and periods to go τ, with 0 < n ≤ τ, we have:

V^Act(S, n + 1, τ + 1) ≥ V^Wait(S, n + 1, τ + 1)  ⟹  V^Act(1 + κS, n, τ) ≥ V^Wait(1 + κS, n, τ).
To see the intuition behind Proposition 2 consider the following argument. Suppose that with a rehearsal stock of S, τ + 1 periods to go and n + 1 available actions the agent found it optimal to act. In the subsequent period the state variables have evolved to 1 + κS, τ and n, respectively. How has the attractiveness of acting changed? Keeping all else constant, the fact that the rehearsal stock has increased has made acting even more attractive (recall that 1 + κS > S). Similarly, the fact that there are fewer periods to go also favours acting (thresholds decrease as τ, the number of periods to go, decreases). If it were also true that thresholds decrease as the number of available actions decreases (i.e. thresholds are increasing in n), then all three forces would make the choice of acting in the second period even more attractive than it was in the first. However, as I will show later, thresholds are not necessarily increasing in the number of available actions. In other words, it may be the case that as the number of available actions decreases the agent becomes more stingy with his actions and more reluctant to act early on. The bunching result of Proposition 2 establishes that the first two forces (higher rehearsal stock and fewer periods to go) will always dominate over the third (fewer available actions), which may work in the opposite direction.

Keeping some real-world applications in mind, the bunching result implies the following tendency: a politician who has his successes bunched together will create a more favourable impression than a politician who has the same number of successes spread out over a period of time. Similarly, a series of brutal murders in New Haven that make national headlines will harm the reputation of the city more if they all happen back to back, in a short period of time.

2.4. Thresholds in Closed Form for the Non-stochastic Case

It is possible to express the thresholds H(τ, n|ρ, κ) in closed form for the non-stochastic case, p = 0, where a success is obtained if and only if the agent acts. To be sure, when p = 0 the agent's problem is trivial and thresholds are of no use in describing the optimal profile. This is because, when stochastic successes are not possible, the agent enters the decision problem without any rehearsal stock and he has no incentive to act early on. He simply waits until there are n periods to go (i.e. there are as many periods to go as available actions) and acts in these last n periods. Nevertheless, seeing the thresholds in closed form is useful for gaining some additional insight and intuition behind our earlier discussion and results. After all, it could be that the agent has wasted some successes in the past and now has to decide when to release his remaining successes, facing a positive rehearsal stock.

Proposition 3. (Thresholds in closed form) When p = 0 the critical thresholds H(τ, n | ρ, κ) are given by:

H(τ, n | ρ, κ) = [(1 − ρ^(τ−n)) / (ρ^(τ−n) − κ^(τ−n))] · [CP(1n, 0) / RE(1n)],

where 1n denotes a profile of n consecutive successes, CP(1n, 0) = Σ_{i=1}^{n} ρ^(n−i)(1 − κ^i)/(1 − κ) and RE(1n) = Σ_{i=1}^{n} ρ^(n−i) κ^i (see the Appendix for the definitions of CP and RE).

The first fraction shows the tension between the recency and the rehearsal effect. The numerator is the benefit from acting in the last n periods (recency), whereas the denominator captures the benefit from getting a higher rehearsal boost by acting in the first n periods (rehearsal). The second fraction does not depend on τ, the number of periods to go, and simply adjusts for n, the number of available actions. It is straightforward to verify that the thresholds are increasing in τ, the number of periods to go, as claimed in Proposition 1.

Of particular interest are the comparative statics with respect to n, the number of available actions, which are ambiguous. Readers may convince themselves by verifying that H(4, 3|0.95, 0.3) < H(4, 2|0.95, 0.3) (decreasing in n), whereas H(3, 2|0.95, 0.3) > H(3, 1|0.95, 0.3) (increasing in n). This observation is important, because describing the optimal profile essentially amounts to identifying how the value function V(S, n, τ) depends on the three state variables S, n and τ. Earlier I argued that as S, the rehearsal stock, increases and as τ, the number of periods to go, decreases, the attractiveness of acting (defined as the difference VAct(S, n, τ)−VWait(S, n, τ)) increases. The expression for the thresholds in Proposition 3 allows us to see that the effect of n, the number of available actions, is ambiguous: the attractiveness of acting may increase or decrease as n, the number of available actions, decreases. This is precisely why the bunching result is more subtle and powerful than it might first appear.
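These comparative statics can be checked numerically. The sketch below is a reconstruction, assuming the closed form implied by the proof of Proposition 3 (the comparison of acting in the first versus the last n periods), with CP(1n, 0) = Σᵢ ρ^(n−i)(1 − κ^i)/(1 − κ) and RE(1n) = Σᵢ ρ^(n−i)κ^i; the function names are mine. It reproduces both inequalities quoted above.

```python
def cp(n, rho, kappa):
    # CP(1_n, 0): payoff of n consecutive successes, zero entering stock
    return sum(rho ** (n - i) * (1 - kappa ** i) / (1 - kappa)
               for i in range(1, n + 1))

def re(n, rho, kappa):
    # RE(1_n): coefficient on the entering rehearsal stock S
    return sum(rho ** (n - i) * kappa ** i for i in range(1, n + 1))

def threshold(tau, n, rho, kappa):
    # H(tau, n | rho, kappa), valid for n < tau and rho > kappa
    recency = (1 - rho ** (tau - n)) / (rho ** (tau - n) - kappa ** (tau - n))
    return recency * cp(n, rho, kappa) / re(n, rho, kappa)

print(threshold(4, 3, 0.95, 0.3) < threshold(4, 2, 0.95, 0.3))  # True: decreasing in n here
print(threshold(3, 2, 0.95, 0.3) > threshold(3, 1, 0.95, 0.3))  # True: increasing in n here
```

Since only the first fraction varies with τ, monotonicity of the thresholds in τ can be read off directly from this expression.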

3. Allowing for Bad News

Up to this point, I have been assuming that all news is good news: the events et were either beneficial (et = 1) or neutral (et = 0) to the agent under assessment. I now extend the benchmark model to allow for the prospect of bad news (failures), which is harmful to the agent and whose effect he seeks to minimise.

The event et can now take values in {−1, 0, 1}. I refer to event realisations with et = −1 as failures. The memory technology that determines the recall probabilities for successes and failures is the same as before, with failures rehearsing only past failures and successes rehearsing only past successes. That is, the memorability of a period-i success evolves according to (1), whereas the memorability of a period-i failure is given by:


Let ĀT denote the sum of the memorabilities of all failures up to time T. Using identical steps as before, it can be shown that this sum evolves according to:

Āt = ρĀt−1 + 1{et = −1}(1 + κS̄t−1),

where S̄t = κS̄t−1 + 1{et = −1} is the rehearsal stock for failures. The respective stocks for successes, referred to as At and St as before, are defined similarly by replacing the argument in the indicator function with et = 1; see (2) and (3).

Now, the agent maximises the net number of successes that the assessor is expected to remember at time T. Once I specify the conditional probabilities of a stochastic success and a stochastic failure, p and p̄ (with p + p̄ ≤ 1), the agent's maximisation problem becomes:

max E(AT − ĀT),

where the maximisation is over the timing of the agent's n actions and ĀT is the failure counterpart of AT.
I will first look into how the rehearsal stocks, S and S̄, affect the agent's optimal actions. As in the model with only good news, the higher the rehearsal stock of successes, S, the greater the incentive for the agent to act early in order to rehearse earlier successes. When S is above some threshold, the benefit from rehearsing past successes outweighs the loss from wasting a success in an early period. It is therefore straightforward to generalise Proposition 1 and establish a critical threshold, call it H+(τ, n, S̄ | ρ, κ), such that the following holds: fix the rehearsal stock for failures, S̄; then, with τ periods to go and n actions still available, it is optimal to act if and only if S > H+(τ, n, S̄ | ρ, κ). Moreover, the thresholds can be shown to be increasing in τ; that is, the more periods to go, the higher the rehearsal stock, S, needs to be before it triggers the agent to act.

The rehearsal stock for failures, S̄, affects optimal actions in a slightly different way. To see this, notice that generating a success in period t serves two functions: first, it reinforces past successes; second, it guarantees that past failures will not be reinforced by a potential period-t failure. As a result, the agent also has an incentive to act early on when the rehearsal stock for failures, S̄, is relatively high. This way he prevents past failures from being reinforced by a potential new failure. This gives rise to the following result.

Proposition 4. (Thresholds with failures) Fix the rehearsal stock for successes, S. Then, with τ periods to go and n available actions (with n < τ), there exists a threshold, H−(τ, n, S | ρ, κ), such that it is optimal to act if and only if S̄ ≥ H−(τ, n, S | ρ, κ). Moreover, the critical thresholds H−(τ, n, S | ρ, κ) are increasing in τ, the number of periods to go.
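The pull exerted by the failure stock can be seen in a minimal two-period example, sketched below under assumptions of mine: failures mirror successes (a failure has memorability 1 + κS̄, where S̄ is the failure stock, and each stock decays by κ in periods without an event of its type), p = 0, and a stochastic failure arrives each period with probability q. The gap between acting in the first rather than the second period is then increasing in S̄.

```python
def act_first(S, Sb, q, rho, kappa):
    # success in period 1 (memorability 1 + kappa*S, discounted once);
    # a failure may still arrive in period 2, rehearsing the decayed Sb
    return rho * (1 + kappa * S) - q * (1 + kappa * (kappa * Sb))

def act_second(S, Sb, q, rho, kappa):
    # a failure may arrive in period 1, rehearsing Sb in full;
    # the success in period 2 finds its own stock decayed once
    return -q * rho * (1 + kappa * Sb) + (1 + kappa * (kappa * S))

rho, kappa, q, S = 0.9, 0.5, 0.5, 1.0
gains = [act_first(S, Sb, q, rho, kappa) - act_second(S, Sb, q, rho, kappa)
         for Sb in (0.0, 1.0, 2.0)]
print(gains == sorted(gains))  # True: the gain from acting early rises with the failure stock
```

Differentiating the gain with respect to S̄ gives qκ(ρ − κ), which is positive whenever ρ > κ: preventing the rehearsal of fresh failures is worth more than preventing that of stale ones.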

A special case of the model with both successes and failures is when p + p̄ = 1. That is, all news is either good or bad and there are no null events. This case is identical to what we would have obtained in the benchmark model (et ∈ {0, 1}), had we interpreted the events et = 0 as the agent failing to deliver a success and subjected these failures to rehearsal from subsequent failures. In this case, the benefit from generating a success is completely symmetric to the benefit from avoiding a failure. As a result, optimal decisions depend on the sum of the two rehearsal stocks, S and S̄. Notice that this is not to say that we subtract the failure rehearsal stock from the success stock; rather, the incentive to act so as to avoid rehearsing failures is added to the incentive to act so as to rehearse successes.

Proposition 5. (Symmetry in the rehearsal stocks) Let p + p̄ = 1. Then, optimal decisions depend only on the sum S + S̄.

In other words, when p + p̄ = 1, with τ periods to go and n available actions, there exists a single threshold level such that the agent acts when the sum S + S̄ is above that threshold. This implies that if we return to our original set-up with et ∈ {0, 1} and reinterpret the events et = 0 as failures subject to rehearsal, optimal decisions will still depend on a single threshold level, as before.
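Proposition 5 lends itself to a direct numerical check. The sketch below is mine and assumes the symmetric dynamics described above (each event type rehearses only its own stock, both stocks decay by κ otherwise, and S̄ denotes the failure stock): when every period brings either a success or a failure, the comparison between any two profiles depends on the stocks only through the sum S + S̄.

```python
def net_payoff(profile, S, Sb, rho, kappa):
    # expected successes minus failures recalled at the assessment date
    T = len(profile)
    total = 0.0
    for t, e in enumerate(profile, start=1):
        if e == 1:                       # success: rehearses the success stock
            m = 1 + kappa * S
            total += rho ** (T - t) * m
            S, Sb = m, kappa * Sb
        elif e == -1:                    # failure: rehearses the failure stock
            m = 1 + kappa * Sb
            total -= rho ** (T - t) * m
            Sb, S = m, kappa * S
        else:                            # null event: both stocks decay
            S, Sb = kappa * S, kappa * Sb
    return total

rho, kappa = 0.9, 0.5
e1, e2 = (1, -1, 1, -1, 1), (-1, 1, 1, 1, -1)   # no null events: p + p_bar = 1
diff_a = net_payoff(e1, 2.0, 3.0, rho, kappa) - net_payoff(e2, 2.0, 3.0, rho, kappa)
diff_b = net_payoff(e1, 4.0, 1.0, rho, kappa) - net_payoff(e2, 4.0, 1.0, rho, kappa)
print(abs(diff_a - diff_b) < 1e-12)  # True: (2, 3) and (4, 1) have the same sum
```

The individual payoffs do change between (S, S̄) = (2, 3) and (4, 1); only the comparison between profiles is invariant, which is exactly what optimal decisions depend on.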

Finally, the bunching result of Proposition 2 goes through, with the intuition being the same as before. For completeness I state this as follows:

Corollary. (Bunching) In the general case where et ∈ {−1, 0, 1}, if it is optimal to act in period t, then it will be optimal to act in period t + 1.

4. Discussion

Throughout the article, I often discussed the model in the context of a political campaign. Nevertheless, the general framework can be applied to a variety of settings where people or products undergo periodic assessment: public relations executives managing the release of corporate news, advertisers timing the airing of commercial spots, marketing executives choosing when to launch new products, evaluation of forecasters based on their past predictions and public opinion formation. Keeping these applications in mind, I discuss the theoretical results.

4.1. Bunch Together your Successes–Spread Apart Your Failures

In models of quasi-Bayesian and quasi-rational decision making it is often the case that the order of informative signals matters. For instance, Rabin and Schrag (1999) argue that this is a consequence of a confirmatory bias. Lam (2000) shows that early signals may lock the decision maker into a scenario-story, which affects the way new information is interpreted. In my model what matters is not only the order (i.e. which event comes first and which comes second) but also the actual spacing (i.e. the amount of time that elapses between informative events), since the spacing determines the effectiveness of rehearsal. In particular, rehearsal is most effective when all events are bunched together in consecutive periods.

This implies that advertisers should schedule commercial spots that advertise different attributes of a brand back to back. Janiszewski (2002) (see below) verifies this recommendation experimentally. Another implication is that a stock market forecaster is more likely to achieve ‘guru’ status in the minds of investors when he makes successful predictions in consecutive months or years than when his successes are separated by longer intervals of time.

The same logic can also be applied in the domain of public opinion on various controversial issues. Take, for example, the gun control debate in the US. Public opinion is influenced by gun-related crimes that make headlines, such as the Columbine high-school shooting. The timing of these events is crucial for what people remember and, consequently, for which side they take. My results predict that public support for gun control will peak when these random incidents are recent, as common sense dictates, but also when they are clustered together in a short period of time, because such a clustered profile is most efficient in reinforcing the memorability of each of these events. By the opposite logic, the cause of the National Rifle Association will be hurt least if these events are separated by long intervals of time.

4.2. Random Events Trigger Information Releases

The benchmark model gave rise to a simple rule for releasing information. We can think of random events as building up a rehearsal stock. When enough random events push the rehearsal stock above some threshold level, the agent finds it optimal to act and release information, in an effort to reinforce good past memories (and to avoid the reinforcement of bad past memories). Indeed, this happens in my model despite the fact that there is a finite number of actions to be used up, which implies that early actions come at the cost of not being able to act later on.
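The rule can be sketched as a simple release policy. The code below is illustrative rather than the article's: it assumes the closed-form thresholds of the non-stochastic case (reconstructed from the proof of Proposition 3) and a policy of acting whenever the stock exceeds the current threshold, or mechanically once only as many periods as actions remain. With p = 0 no random events arrive, the stock never crosses the threshold, and the agent waits until the last n periods, as described in Section 2.4.

```python
def threshold(tau, n, rho, kappa):
    # closed-form H(tau, n | rho, kappa) for p = 0, valid for n < tau, rho > kappa
    cp = sum(rho ** (n - i) * (1 - kappa ** i) / (1 - kappa) for i in range(1, n + 1))
    re = sum(rho ** (n - i) * kappa ** i for i in range(1, n + 1))
    return (1 - rho ** (tau - n)) / (rho ** (tau - n) - kappa ** (tau - n)) * cp / re

def release_schedule(T, n, rho, kappa):
    # p = 0: the stock grows only through the agent's own actions
    stock, acted = 0.0, []
    for t in range(1, T + 1):
        tau = T - t + 1                    # periods to go, current one included
        if n > 0 and (tau == n or stock > threshold(tau, n, rho, kappa)):
            acted.append(t)
            n -= 1
            stock = 1 + kappa * stock      # acting rehearses the stock
        else:
            stock *= kappa                 # otherwise the stock decays
    return acted

print(release_schedule(10, 3, 0.95, 0.3))  # [8, 9, 10]: all actions saved for the end
```

With p > 0 the same policy would fire early whenever a streak of random successes pushed the stock above the threshold, which is the triggering behaviour described above.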

This mechanism can be exploited by an executive managing the release of corporate news. My model implies that he can improve the perception of the company in the eyes of investors by releasing good news pertaining to his firm immediately after a streak of favourable press articles. By the same token, a streak of events that have generated bad publicity for the government will trigger the opposition to release further information that discredits the cabinet. For example, suppose that a train accident and a delayed response to a natural disaster have made the cabinet look incompetent. Now is the time for the opposition to go public with information it has on a corruption scandal involving cabinet officials.

4.3. Experimental Evidence on Spacing Effects for Memory

A basic premise of this article has been that when similar events are spaced in close proximity each one functions as a cue that prompts previous ones, thus reinforcing their memories. The theoretical model predicts that this logic takes an extreme form, implying that subsequent recall for similar events will be maximised when these events are bunched together in consecutive periods, i.e. when there is absolutely no spacing between them.

It turns out that this exact point, i.e. the comparison between bunched and spaced presentation for subsequent memories, has been the subject of numerous experiments in the cognitive psychology and marketing literatures. In Janiszewski (2002) subjects are exposed to advertisements which present information about different attributes (4-wheel drive, V-8 engine, anti-lock brakes etc.) relating to various (car) brands. It is shown that recall of attribute-brand combinations is superior when the advertisements describing different attributes of the same brand are bunched together. In his words, bunching ‘…promotes concurrent activation among attributes for the same brand and allows the mental experience from one brand attribute to prompt the generation of another brand attribute.’ Moreover, he shows that the relative advantage of bunched presentations is enhanced when the attributes pertaining to the same brand are related (all safety attributes: anti-lock brakes, airbags, rear seat-belts), the hypothesis being that when the attributes are related, cued activation of attributes becomes more likely. These results provide at least partial empirical validation of the theoretical bunching result.

4.4. Some Concluding Remarks

To conclude, this article has shown how a memory model similar to that of Mullainathan (2002) can be incorporated into a standard decision problem in which a rational agent releases information in order to manipulate the memories of his forgetful assessor. The agent's problem can be modelled as a standard dynamic optimisation problem, and the optimal profile for releasing information has the spacing properties described above. The results have implications in the context of political campaigns, employee performance evaluations, advertising strategies and so on. A natural direction in which to extend this work is the strategic interaction that arises when multiple agents compete for the memories of a common assessor, as when two or more politicians compete for the same seat or when many brands advertise to the same pool of consumers.


  • 1

     It should also be stressed upfront that throughout the article I assume that the assessors’ bounded rationality takes two forms. Assessors are not just forgetful, but they are also naive. That is, they behave under the premise that what they remember is what actually happened. I believe that this is a natural assumption for the applications that the article addresses, where (as the formal model shows) memory has the interpretation of ‘goodwill’. See also the relevant discussion in Section 2.3.

  • 2

     The technical terms typically used in this literature are ‘grouped vs. dispersed’ or ‘massed vs. spaced’.

  • 3

     Consider, for example, the last time you were at a party. If you see yourself in the picture, you have an observer perspective; if instead you visualise the scene the same way you saw it when it occurred, you have a field perspective. People often adopt a field perspective for emotionally charged memories and an observer perspective for more objective types of episodes. More surprisingly, however, when people are directed to switch from the field to the observer perspective, these same memories become less emotionally charged.

  • 4

     The memorability of a success can never be greater than 1/(1−κ), for any rate of memory decay ρ. Therefore, choosing the constant of proportionality to be (1−κ), or smaller, guarantees that the recall probabilities are well defined.

  • 5

     A sophisticated assessor could, for example, come up with arguments of the form: ‘For the stock AT to be x, it must be that there were successes in periods 1, 2 and 4 which I somehow forgot’. I would like to thank an anonymous referee for pointing this out.

  • 6

     I am grateful to an anonymous referee who suggested decomposing each profile e in this way. This trick simplifies an earlier lengthy inductive proof significantly.


Appendix: Omitted Proofs


  • 1 Suppose that the state vector is (S, n, τ). Let Ω(S, n, τ) be the set of all profiles of zeros and ones after which the first action would be optimal. When Ω(S, n, τ) = {∅}, where ∅ denotes the empty profile, then V(S, n, τ) = VAct(S, n, τ). For example, let n = 1, τ = 3 and let the rehearsal stock S be such that S < H(3, 1 | ρ, κ) and 1 + κS > H(2, 1 | ρ, κ). Then, Ω(S, n, τ) = {(1), (0, 1), (0, 0)}.
  • 2 Suppose that V(S, n, τ) = VWait(S, n, τ). Let ṼWait(S, n, τ ∥ S′, n, τ′) be the continuation payoff starting from the state vector (S, n, τ) when one follows the following set of instructions: wait the first period and then follow the instructions that would have been optimal had you started out with the state vector (S′, n, τ′) instead. Also, let ṼAct(S, n, τ ∥ S′, n, τ′) be the continuation payoff starting from the state vector (S, n, τ) when one follows the following instructions: act the first period and then wait until you observe a profile ω ∈ Ω(S′, n, τ′). From then onwards (as long as n > 0), mimic the instructions had you started following VWait(S′, n, τ′) instead, assuming that in the meantime you had encountered the random sequence ω (rather than (1, ω)).
  • 3 Let CP(e, S) be the continuation payoff from profile e when entering the problem with a rehearsal stock S. For example, let e : S|1, 1, 0, 1, 0. Then, CP(e, S) = ρ⁴(1 + κS) + ρ³(1 + κ + κ²S) + ρ(1 + κ² + κ³ + κ⁴S).
  • 4 Let RE(e) = Σ_{t : et = 1} ρ^(|e|−t) κ^t, where |e| is the order of the profile e. Then CP(e, S) can be decomposed into CP(e, 0) and RE(e) so that CP(e, S) = CP(e, 0) + S·RE(e). Returning to the previous example of e : S|1, 1, 0, 1, 0, we have: RE(e) = ρ⁴κ + ρ³κ² + ρκ⁴ and CP(e, 0) = ρ⁴ + ρ³(1 + κ) + ρ(1 + κ² + κ³). One can verify that CP(e, S) = CP(e, 0) + S·RE(e).
  • 5 Let F(e) = Σ_{t : et = 1} κ^(|e|+1−t). For example, with e = (1, 1, 0, 1, 0), we have F(e) = κ⁵ + κ⁴ + κ².
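These definitions can be verified mechanically. The sketch below is mine: it recomputes the worked example e : S|1, 1, 0, 1, 0 under the assumed stock dynamics (a success sets the stock to 1 + κ·stock; otherwise the stock decays by κ) and checks the decomposition CP(e, S) = CP(e, 0) + S·RE(e).

```python
def cp(e, S, rho, kappa):
    # continuation payoff of profile e, entering with rehearsal stock S
    T, total = len(e), 0.0
    for t, ev in enumerate(e, start=1):
        if ev == 1:
            m = 1 + kappa * S              # memorability of the period-t success
            total += rho ** (T - t) * m
            S = m
        else:
            S *= kappa
    return total

def re_coeff(e, rho, kappa):
    # RE(e): the coefficient multiplying the entering stock S
    T = len(e)
    return sum(rho ** (T - t) * kappa ** t
               for t, ev in enumerate(e, start=1) if ev == 1)

rho, kappa, S = 0.9, 0.5, 2.0
e = (1, 1, 0, 1, 0)
lhs = cp(e, S, rho, kappa)
rhs = cp(e, 0.0, rho, kappa) + S * re_coeff(e, rho, kappa)
print(abs(lhs - rhs) < 1e-12)  # True: CP(e, S) = CP(e, 0) + S * RE(e)
```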

Proof of Proposition 2 (Bunching). I prove the contrapositive statement that:


Assume that VWait(1 + κS, n, τ) > VAct(1 + κS, n, τ). If n = 1 go directly to Step 1.

Step 0 : Show that:


The desired result follows by the assumption that VWait(1 + κS, n, τ) is the optimal set of instructions when the state vector is (S, n, τ).

Step 1: Show that:


Define E(1 + κS, n, τ) as the set of all possible τ -period sequences which could occur if the agent follows the instructions in VWait(1 + κS, n, τ), that is if he behaves optimally. Pick any e ∈ E(1 + κS, n, τ) and define τe as the period in which the nth action is used along the sequence e. Also, define we = (e1, e2, …, eτe−1) as the subsequence before the nth action is used and ze = e/(we, 1) as the subsequence after the nth action is used. Then, e can be decomposed as e = (we, 1, ze).6 Following the instructions in inline image instead would generate the same set of possible sequences and with same probabilities except that each e ∈ E(1 + κS, n, τ) is modified to (1, we, ze). Also, the instructions in inline image give rise to the same sequences (and with the same probabilities) as VWait(1 + κS, n, τ). Similarly, the instructions in inline image give rise to the same sequences as inline image. The desired result in (21) holds if for all sequences e ∈ E(1 + κS, n, τ) we have:


This is true because 1 + κS > S and RE(we, 1, ze) < RE(1, we, ze) (see the definition of RE(·)).

Step 2: Show that:


As we have seen in Step 1, the instructions in inline image and inline image give rise to the same possible τ-period sequences (and with the same probabilities), except that (for each e ∈ E(1 + κS, n, τ)) if inline image generates (we, 1, ze), then inline image generates (1, we, ze). In the first τ periods, the instructions in inline image and inline image also generate the same possible sequences (and with the same probabilities), respectively. Pick a sequence e ∈ E(1 + κS, n, τ). Then, the instructions inline image [inline image] generate (we, 1, ze, 1) or (we, 1, ze, 0) with probabilities p and 1−p [(1, we, ze, 1) or (1, we, ze, 0)]. In the latter case, where the last element is 0, the desired result in (25) follows immediately. In the former case, where the last element is 1, we need to show that:


By the definition of F(·), this is true for all sequences e.

Step 3: Show that:


where inline image is the continuation payoff generated by the following set of instructions:

  • (a) as long as you have two or more available actions mimic the instructions in VWait(1 + κS, n, τ),
  • (b) use your last available action immediately after you use the penultimate one.

Notice that from the second period onward behaviour is identical under VAct(S,n + 1,τ + 1) and VWait(1 + κS, n, τ).

Following the same logic as in the previous steps, notice that the instructions inline image [inline image] will generate some random sequence (we, 1, ?, ze) [(1, we, ?, ze)], whereas the instructions inline image [VAct(S, n + 1, τ + 1)] will generate with the same probability the random sequence (we, 1, 1, ze) [(1, we, 1, ze)]. The term ‘?’ is 1 or 0 with probabilities p and 1−p. In the former case, where the ‘?’ term is 1, the desired result follows immediately. In the latter case, where the ‘?’ term is 0, the result follows once you notice that F(we, 1) > F(1, we), for all sequences e.

Proof of Proposition 1 (Thresholds). I show that if it is optimal to wait with rehearsal stock S, then it will be optimal to wait with any rehearsal stock S′ < S. Let inline image be the continuation payoff if one starts with the state vector (S, n, τ) and uses up all n available actions in the first n periods. Notice that if V(S′, n, τ) = VAct(S′, n, τ), then, by Proposition 2, all n actions are bunched together in the first n periods and, hence, VAct(S′, n, τ) coincides with this continuation payoff (evaluated at S′). It suffices then to show that:


Define E(S, n, τ) as the set of all possible τ-period sequences that could occur if one follows the instructions in VWait(S, n, τ). Each e ∈ E(S, n, τ) can be decomposed as e = (we, 1, 1, …, 1, ze), where we (ze) is the subsequence before (after) the 1st (nth) action is used. (Here, we invoke Proposition 2, from which we know that the agent will act in consecutive periods.) The instructions inline image will give rise to exactly the same set of possible sequences and with the same probabilities. The instructions inline image and inline image will generate the same sequences and with the same probabilities, except that each e ∈ E(S, n, τ) is modified to (1, 1, …, 1, we, ze). As long as S′ < S, the desired result in (31) follows once you notice that RE(1, 1, …, 1, we, ze) > RE(we, 1, 1, …, 1, ze), for all sequences e.

Next, I show that the thresholds are increasing in τ, the number of periods to go. It suffices to show that:


The first two sets of instructions in (8) will generate sequences of the form (we, 1, 1, …, 1, ze) and (1, 1, …, 1, we, ze) respectively. The last two sets of instructions will generate with the same probability the sequences (we, 1, 1, …, 1, ze, ?) and (1, 1, …, 1, we, ze) respectively. The desired result in (8) follows from an argument identical to that in Step 2 of the bunching result.

Proof of Proposition 3 (Thresholds in closed form). We first establish the following lemma.

Lemma 2. (Either/or) In the non-stochastic case, with τ = T periods to go and n available actions (with n < τ = T), it is optimal to act in either the first or the last n periods.

Proof. By Proposition 2, we know that it is optimal to act in consecutive periods. Consider the following three profiles e, e′ and e′′, where the agent acts in consecutive periods starting in periods t, t + 1 and t + 2 respectively. In all three cases the agent enters the decision problem with rehearsal stock S and stochastic successes are not possible (p = 0).


I need to show that if profile e′ is preferred to profile e, then profile e′′ is preferred to profile e′. Assume that profile e′ is preferred to profile e and denote by 1n a sequence of n consecutive successes. Then we have:


Profile e′′ is preferred to profile e′ if and only if:


Since κ < 1 and inline image, we must have inline image. This proves the lemma.

Then, to compute the thresholds I simply need to compare the continuation payoffs from acting in the first or the last n periods. Acting in the first n periods yields ρ^(T−n)[CP(1n, 0) + S·RE(1n)], whereas acting in the last n periods yields CP(1n, 0) + κ^(T−n)S·RE(1n). It is then optimal to act in the first n periods if S·RE(1n)(ρ^(T−n) − κ^(T−n)) ≥ (1 − ρ^(T−n))CP(1n, 0). Substituting for CP(1n, 0) and RE(1n) yields the result in Proposition 3.

Proof of Proposition 5 (Symmetry in the rehearsal stocks). Let the rehearsal stocks be S and S̄. Consider the comparison between the continuation payoffs of two different profiles e and e′, to be observed in the remaining τ periods. Notice that if CP(e, S, S̄), the continuation payoff of profile e, contains the term ρ^x κ^(τ−x) S, then CP(e′, S, S̄), the continuation payoff of profile e′, contains either the same term or the term −ρ^x κ^(τ−x) S̄. When comparing the continuation payoffs CP(e, S, S̄) and CP(e′, S, S̄), in the former case we can cancel out the term. In the latter case the two terms can be combined into the term ρ^x κ^(τ−x)(S + S̄). Hence, in comparing the two continuation payoffs what matters is only the sum of the two stocks, S + S̄.