Assessing the accuracy of interviewed recall for rare, highly seasonal events: the case of wildlife consumption in Madagascar


  • Editor: Nathalie Pettorelli
  • Associate Editor: Ioan Fazey


Researchers and practitioners from a range of fields including conservation biology, sociology, public health and economics rely on information gained from interviews to quantify the frequency and scale of activities or events of interest. These ‘recall’ data often form the basis of wildlife sustainability assessments and, ultimately, policy decisions and management actions, but they are highly vulnerable to bias, particularly when the behavior of interest has strong temporal variation. Here, we investigate bias in recalls of wildlife consumption in rural Madagascar by comparing oral recalls collected monthly and annually from male heads of household with daily diet calendars maintained by female heads of household. Daily diet calendars collected from 28 households were assumed to be the measure of true consumption and were used to validate the recalled information. While we found little interhousehold variation in accuracy of responses, we found a tendency for recalls to overreport rates of wildlife consumption. The annual frequency of rare and/or seasonal events was estimated more accurately by recalls of the prior year than by extrapolation of recalls of the prior month. We conclude that monthly variation in consumption rate leads to predictable errors in estimation of the annual consumption rate. Local consumption of wildlife has large temporal variability, reflecting human preference or the underlying life cycles of animals being consumed. Accurate assessment of consumption rates therefore requires determining an appropriate recall period by taking into account the temporal variability and frequency of the events in question.


The use of data derived from a subject's recall of specific events from memory is a central component of many studies in conservation science, sociology, economics and public health (Coughlin, 1990; Eisenhower, Mathiowetz & Morganstein, 1991; Gavin & Anderson, 2005; Brigham et al., 2008; Ellsberg et al., 2008; Jones et al., 2008). Activities, behaviors and events that are difficult for researchers to directly observe often require the use of recall methodologies to estimate frequencies, durations and/or periodicity of occurrence. Policymakers depend on the reliability of these data to inform their decisions regarding household economics, human health and environmental sustainability.

In many cases, researchers estimate event frequencies by extrapolating from recall information based on a small sample of prior days or weeks (Eisenhower et al., 1991; Lemmens & Knibbe, 1993). Such an approach assumes (1) that a subject can accurately recall the events of interest, (2) that the events quantified during a brief recall period remain roughly constant in frequency throughout the year and (3) that the timing of the recall period will capture or detect events of research interest (Lemmens & Knibbe, 1993). However, these assumptions are often violated such that estimates based on extrapolation from short recall periods are vulnerable to systematic and response errors (Lemmens & Knibbe, 1993; Johansson et al., 2002). This problem can be further compounded when investigators design surveys to be conducted during periods when events of particular interest occur relatively frequently. Assessing the accuracy and precision of these recalled estimates by using mixed-method data collection is needed to determine the validity of researchers' reliance on this form of inference (Rosenberg et al., 1983).

In this study, we use a case of hunting and wildlife consumption in north-eastern Madagascar to assess the accuracy of annual recalls for rare and seasonal events. Prior evidence has shown that hunting in this region is highly seasonal, and for some species, rare (Golden, 2009; Golden et al., 2011). We compared prior month with prior year recalls of wildlife use against daily diet calendars, assuming the calendars to be measures of true consumption because they were not subject to any recall delay. We posed the following hypotheses: (1) recalls will predict the frequency of consumption of rarely consumed taxa more accurately than frequently consumed taxa (Sudman & Schwarz, 1989; Eisenhower et al., 1991; Reis & Judd, 2000); (2) recalls will estimate the consumption of aseasonally consumed taxa more accurately than seasonally consumed taxa (Lemmens & Knibbe, 1993); (3) prior month recalls conducted in a low-hunting season will provide more accurate consumption information than recalls conducted in a high-hunting season; (4) the average of the two prior month recalls (one from high-hunting and one from low-hunting season) will provide more accurate data than individual prior month recalls; and (5) annual recalls will more accurately predict the true long-term rates of consumption than prior month recalls.

Our hypothesis that long-term annual recalls would perform better than prior month recalls might seem surprising given that, in general, more recent events are more easily recollected (Reis & Judd, 2000). However, linear extrapolation requires a constant rate of occurrence through the period of extrapolation (Wentland & Smith, 1993). In the case of many seasonal events, this assumption of rate constancy is wrong. Further, because researchers can often anticipate which seasons are most productive, they sometimes design the timing of their research effort so as to increase the probability of detecting rare events (Wentland & Smith, 1993). This procedure risks producing an inflated estimate of the frequency of such events. These types of systematic bias are critical to understand in the field of sustainability science so that we do not gravely underestimate or overestimate harvest pressure on wildlife.


Sampling and data collection

From February 2008 through February 2009, we enlisted the participation of 28 households in one community in the Makira Natural Park of north-eastern Madagascar. These households were selected by systematic random sampling (see Golden et al., 2011), and all had a male and female head of household. The female head of each household completed daily diet calendars listing the number of individuals of each type of wildlife consumed within their household. Twice during the research period (May 2008 and January 2009), a diet recall was conducted in which the male head of each household was asked to estimate the number of animals of each of 15 wildlife taxa (14 species, plus a single category for all insectivorous bats) consumed in the household over the past month. These were the most frequently consumed taxa in the area. The recall conducted in May occurred during the high-hunting season while the recall in January occurred during the low-hunting season. At the end of the study, we conducted an annual diet recall by asking each male head of household to enumerate the number of animals of each taxon consumed during the past year (see Golden, 2009 for details).

The daily diet calendars were maintained as part of a larger public health study, and the recall data were collected during unplanned visits to the households. A research assistant visited each household weekly throughout the year to ensure that the calendars were properly maintained. We assumed that the two most important potential sources of information recall bias would be memory failure and intentional misrepresentation (Bradburn, Rips & Shevell, 1987). Because our team had been working in this community for 5 years prior to this study and had gained the trust of community members, we believe that deliberate misleading was absent or trivial. Accordingly, we assume that recall bias came primarily from misremembering.

Cultural context

Rice agriculture was the primary form of labor activity for both men and women. Certain tasks were performed jointly such as weeding, and others were highly segregated by sex. Hunting and cooking largely fell into the second category of labor where activities were segregated by sex. It was the man's domain to hunt and the woman's domain to prepare and cook food. This was the reason why the diet calendars were maintained by the female head of household, and the oral recall was performed by the male head of household. There was very little variation in the role of men and women in producing and securing food. Thus, households were reasonable replicates of one another in this sense.

Accurately assessing rates of consumption is important in determining the sustainability of harvest and the future trajectory of wildlife populations. In this case, wildlife consumption can be viewed as accounting for nearly all of the harvest. In this region, wildlife is not sold as a luxury item at markets or restaurants, although it is very occasionally served in urban households. Further, although certain animals are hunted as pests, they are also consumed after death. Thus, the consumption records in this study should offer a clear indication of harvest pressure.

Although Makira Natural Park was gazetted as a protected area in 2005, the laws restricting hunting remain relatively unmonitored and unenforced (Golden, 2009). Our study community, while surrounded by forest, lies 5.5 km from the boundary of the protected area. The community enrolled into this study is thus not part of the buffer area of the park, where active community engagement is taking place under the aegis of the Wildlife Conservation Society and the Ministry of Water and Forests.

Data analysis

In all comparisons of recall types, we considered daily diet calendar data to represent ‘true’ wildlife consumption events. We used two metrics to assess the accuracy of oral recall methods as compared with values obtained from the daily diet calendars: (1) ratios of the geometric means of recall and calendar data across taxa per household (Johansson et al., 2002) and (2) mean squared errors (MSEs) to quantify the difference between the diet calendar report and oral recall. The first method follows Johansson et al. (2002) and transforms the Poisson-distributed count data of observed diet calendar consumption events and estimated oral recall of consumption events using the transformation ln(x + 1). We then calculated the geometric mean of the transformed consumption data across taxa for each household, for both the recall and calendar data. Finally, we calculated the ratio of the geometric mean of the recall data to that of the calendar data. The closer this ratio is to 1, the more accurately the recall data from a given household matched the assumed true rate of annual consumption.
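The first metric can be illustrated with a minimal sketch. It assumes that the geometric mean is obtained by averaging the ln(x + 1) values and exponentiating the result, without subtracting the offset afterwards; the text does not state this detail, and the function names and counts below are hypothetical.

```python
import math


def geometric_mean_log1p(counts):
    """Summarize counts as exp(mean(ln(x + 1))).

    Assumption: the + 1 offset is not subtracted after exponentiating,
    so a household with all-zero counts yields 1.0 rather than 0.
    """
    logs = [math.log1p(x) for x in counts]
    return math.exp(sum(logs) / len(logs))


def recall_to_calendar_ratio(recall_counts, calendar_counts):
    """Ratio of the recall summary to the calendar summary for one household."""
    return geometric_mean_log1p(recall_counts) / geometric_mean_log1p(calendar_counts)


# Hypothetical annual counts for one household across five taxa.
calendar = [0, 2, 0, 7, 1]
recall = [0, 3, 0, 8, 1]

ratio = recall_to_calendar_ratio(recall, calendar)
print(round(ratio, 3))  # a value above 1 indicates overreporting
```

Under this reading, a ratio of exactly 1 means the recall and calendar summaries agree, and keeping the offset avoids division by zero for households that consumed none of the taxa.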

For the second analysis, we used MSE to calculate the difference between the calendar count (i.e. the true count) and the recall count (i.e. the estimator of the truth). The MSE is a quantitative comparison of two values that determines the degree of similarity between the values, usually assuming that one of the values is accurate and the other subject to error (Wang & Bovik, 2009). We used this approach to compare the average recall count with each calendar count, the average calendar count with each recall count, and both the average calendar and recall count for each taxon and household.
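As a minimal sketch of the second metric, the MSE is the average squared difference between paired calendar and recall counts. The function name and the counts below are hypothetical and serve only to show the arithmetic.

```python
def mean_squared_error(true_counts, estimated_counts):
    """Average of squared differences between paired true and estimated counts."""
    if len(true_counts) != len(estimated_counts):
        raise ValueError("both sequences must have the same length")
    squared = [(e - t) ** 2 for t, e in zip(true_counts, estimated_counts)]
    return sum(squared) / len(squared)


# Hypothetical counts of one taxon in four households.
calendar = [0, 2, 5, 0]  # treated as the true counts
recall = [0, 3, 7, 0]    # recalled estimates

mse = mean_squared_error(calendar, recall)  # (0 + 1 + 4 + 0) / 4
print(mse)
```

Because errors are squared, a single large discrepancy dominates the MSE, which is why extrapolated monthly recalls for frequently consumed taxa can produce very large values.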

We repeated the same type of analysis for three other cases: the extrapolated prior month recalls from May 2008, those performed at the end of January 2009 and a combined recall extrapolation where the counts from May and January were added together and then extrapolated to a year. We then compared these extrapolated short-term recall rates with the annual recall rate to determine which recall method had better predictive accuracy of annual rates of consumption – as determined by their respective MSE value. Households in this study had conducted recalls 0–3 times prior to this research.
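The extrapolation step can be sketched as follows. The scaling factors are assumptions: a single prior month recall is multiplied by 12, and the combined recall (two months added together) by 12/2. The consumption pattern is hypothetical and is not drawn from the study data.

```python
MONTHS_PER_YEAR = 12


def extrapolate_single_month(month_count):
    """Scale one prior month recall to a year (assumed factor of 12)."""
    return month_count * MONTHS_PER_YEAR


def extrapolate_combined(high_month_count, low_month_count):
    """Add two prior month recalls and scale to a year (assumed factor of 12 / 2)."""
    return (high_month_count + low_month_count) * MONTHS_PER_YEAR / 2


# Hypothetical seasonal taxon: 3 animals per month during a 4-month
# hunting season and none otherwise, so 12 animals per year in truth.
true_annual = 3 * 4
high_season_recall = 3  # recall conducted during the hunting season
low_season_recall = 0   # recall conducted outside the hunting season

print(true_annual)                                                 # 12
print(extrapolate_single_month(high_season_recall))                # 36
print(extrapolate_single_month(low_season_recall))                 # 0
print(extrapolate_combined(high_season_recall, low_season_recall)) # 18.0
```

In this toy case every extrapolation misses the true annual count because the monthly rate is not constant, which is the assumption that linear extrapolation relies on.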

We used a generalized linear mixed model (GLMM) with an unstructured covariance matrix to determine the association between the MSE value (from the annual recall count and the reported dietary calendar count) and a series of covariates using the household as a random effect:

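\[
Y_{ijk} = \beta_0 + \beta_{0i} + \beta_1\,\text{CalendarCount} + \beta_2\,\text{Seasonality} + \beta_3\,\text{MaleEducation} + \beta_4\,\text{RecallHistory} + \beta_5\,\ln(\text{BodyMass}) + e_{ijk}
\]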

Yijk is the MSE value in the ith household for the jth sample, where a lower MSE denotes greater accuracy of recall; β0 is a constant; β0i is the random effect for the household; β1 is the coefficient for the reported frequency of consumption from the diet calendars; β2 is the coefficient for a binary categorical variable expressing whether hunting practices for a given taxon are seasonal or aseasonal; β3 is the coefficient for a binary categorical variable (‘elementary’ or ‘post-elementary’) characterizing the educational attainment of the male head of household who conducted the recall; β4 is the coefficient for a continuous variable listing the number of years in which the head of household has conducted an annual recall; β5 is the coefficient for the body mass of each given species (log-transformed because of positive skew); and eijk is the error term. Illegal species included only those species that are illegal to harvest throughout the year. These included an endangered carnivore species, the fosa (Cryptoprocta ferox), and all lemur species (see Table 1 for the species list). Aseasonally hunted species included all carnivoran species, insectivorous bat species and the bush pig (Potamochoerus larvatus); seasonally hunted species included all lemurs, frugivorous bat species and members of the family Tenrecidae (C. D. G., unpublished data). Seasonal refers to species that are consumed during a particular hunting season, which typically lasts 3–8 months per year depending on the species. This seasonality can arise for two reasons: estivation by a species, or hunter behavior and preferences. We use the term ‘rarely consumed’ to mean consumed fewer than five times in 1 year by one household.

Table 1. MSE estimates associated with each recall method for each taxon

| Taxon | Annual recall MSE | Prior month recall (high hunting) MSE | Prior month recall (low hunting) MSE | Combined prior month recalls MSE |
| --- | --- | --- | --- | --- |
| Seasonally consumed | | | | |
| Avahi lanigera | 0.18 | 27.88 | 0.15 | 7.36 |
| Cheirogaleus sp.^a | 0.89 | 768.15 | 7.00 | 198.72 |
| Daubentonia madagascariensis^a | 0 | 0.08 | 0.11 | 0.08 |
| Eulemur albifrons^a,b | 2.86 | 434.46 | 28.44 | 92.92 |
| Hapalemur griseus^a | 0.07 | 121.04 | 10.81 | 45.72 |
| Indri indri^a | 0 | 0.08 | 0.12 | 0.08 |
| Microcebus sp.^a | 0.04 | 22.15 | 5.33 | 7.20 |
| Rousettus madagascariensis^b | 5.82 | 2218.50 | 109.67 | 608.04 |
| Setifer setosus^b | 2.71 | 211.19 | 193.44 | 147.96 |
| Tenrec ecaudatus^b | 5.36 | 6104.69 | 71.81 | 1592.52 |
| Aseasonally consumed | | | | |
| Cryptoprocta ferox^a | 0 | 11.15 | 0.07 | 2.96 |
| Galidia elegans | 0 | 3.12 | 0.33 | 0.36 |
| Microchiroptera spp. | 0.04 | 5.54 | 0 | 1.44 |
| Potamochoerus larvatus^b | 0.11 | 5.15 | 0.56 | 1.52 |
| Viverricula indica | 0.04 | 33.88 | 0.67 | 9.32 |
| Unweighted mean MSE | 1.21 | 664.47 | 28.57 | 181.08 |

Mean squared errors (MSEs) for each recall method where the records of the daily diet calendars serve as the measure of true consumption. Low numbers indicate greater accuracy. For 13 of 15 taxa, the annual recall method serves as the most accurate reflection of true consumption.
^a Denotes taxa that are illegal to harvest throughout the year.
^b Denotes taxa that are frequently consumed (i.e. commonly consumed five or more times per year by a given household).
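The summary statistics of Table 1 can be cross-checked from its rows. The sketch below transcribes the per-taxon MSE values, recomputes the unweighted column means and counts the taxa for which the annual recall has the lowest MSE; the variable and function names are illustrative only.

```python
# MSE values transcribed from Table 1, ordered as:
# (annual recall, prior month high hunting, prior month low hunting, combined)
TABLE_1 = {
    "Avahi lanigera": (0.18, 27.88, 0.15, 7.36),
    "Cheirogaleus sp.": (0.89, 768.15, 7.00, 198.72),
    "Daubentonia madagascariensis": (0.0, 0.08, 0.11, 0.08),
    "Eulemur albifrons": (2.86, 434.46, 28.44, 92.92),
    "Hapalemur griseus": (0.07, 121.04, 10.81, 45.72),
    "Indri indri": (0.0, 0.08, 0.12, 0.08),
    "Microcebus sp.": (0.04, 22.15, 5.33, 7.20),
    "Rousettus madagascariensis": (5.82, 2218.50, 109.67, 608.04),
    "Setifer setosus": (2.71, 211.19, 193.44, 147.96),
    "Tenrec ecaudatus": (5.36, 6104.69, 71.81, 1592.52),
    "Cryptoprocta ferox": (0.0, 11.15, 0.07, 2.96),
    "Galidia elegans": (0.0, 3.12, 0.33, 0.36),
    "Microchiroptera spp.": (0.04, 5.54, 0.0, 1.44),
    "Potamochoerus larvatus": (0.11, 5.15, 0.56, 1.52),
    "Viverricula indica": (0.04, 33.88, 0.67, 9.32),
}


def unweighted_means(table):
    """Column means across taxa, rounded to two decimals."""
    n = len(table)
    return tuple(round(sum(row[i] for row in table.values()) / n, 2) for i in range(4))


def taxa_where_annual_is_lowest(table):
    """Taxa whose annual recall MSE is the minimum of the four methods."""
    return [taxon for taxon, row in table.items() if row[0] == min(row)]


print(unweighted_means(TABLE_1))
print(len(taxa_where_annual_is_lowest(TABLE_1)), "of", len(TABLE_1))
```

The recomputed means match the reported unweighted mean row, and the annual recall has the lowest MSE for 13 of the 15 taxa; the two exceptions are Avahi lanigera and Microchiroptera spp., where the low-hunting prior month recall is marginally lower.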


Variation in accuracy based on recall length and timing

There were 420 total paired samples (15 taxa in 28 households) of the taxa under investigation. In comparing the annual recalls to the daily diet calendars, 318 of the 420 paired samples were 0,0 pairs where both the male's recall of consumption and the female's diet calendar from the same household recorded no consumption of that taxon. Excluding the 0,0 pairs, there was a high correlation between the number of individual animals reported to be consumed by men and tallied in women's daily diet calendars (r = 0.85, n = 102, P < 0.0005, Fig. 1).

Figure 1.

Accuracy of long-term recalls in predicting annual consumption. Recall estimates from the male heads of household of the average number of individuals of a given taxon consumed in the prior year were highly correlated with the number of individuals tallied in daily diet calendars maintained by female heads of household (r = 0.85, n = 102, P < 0.0005).

We found that the annual recall performed best overall when examining the MSEs associated with the consumption reported in the daily diet calendar as compared with the annual, prior month (from both high- and low-hunting seasons) and combined month recalls (15 of 16 cases, Table 2; one-sided binomial test P = 0.0003). Consistent with our initial hypothesis, the prior month recall from the low-hunting season performed better than the prior month recall from the high-hunting season in predicting true rates of annual consumption (14 of 16 cases, Table 2; one-sided binomial test P = 0.002). Moreover, the prior month recall from the low-hunting season performed better than the combined prior month recalls from both the high-hunting and low-hunting seasons (12 of 16 cases, Table 2; one-sided binomial test P = 0.04). For example, based only on the extrapolated rates of hunting from prior month recall during the high-hunting season, we calculated that the dwarf lemur (Cheirogaleus sp.) was consumed on average 1.7 times its true consumption rate and potentially as high as 8.6 times the true rate. More strikingly, using the same extrapolation from prior month recall, we calculated that the common tenrec (Tenrec ecaudatus) was consumed on average 5.9 times its true consumption rate and as high as 48 times the true rate.
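The one-sided binomial tests can be illustrated with a short sketch. It assumes a null probability of 0.5 that one method outperforms the other in any given case; the text does not state the null, but this value reproduces the reported P-values when rounded.

```python
from math import comb


def one_sided_binomial_p(successes, trials, p_null=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Cases reported in the text: 15, 14 and 12 of 16.
for wins in (15, 14, 12):
    print(wins, one_sided_binomial_p(wins, 16))
```

With 16 cases, the tail probabilities are 17/65536, 137/65536 and 2517/65536, which round to the reported 0.0003, 0.002 and 0.04.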

Table 2. MSE estimates associated with each recall method for a given level of true consumption (measured by the daily diet calendar)

| Recorded consumption on calendar | Annual recall MSE | Prior month recall (high hunting) MSE | Prior month recall (low hunting) MSE | Combined prior month recalls MSE |
| --- | --- | --- | --- | --- |
| 20 | 45.00 | 49 729.00 | 205.00 | 10 609.00 |
| Unweighted mean MSE | 10.98 | 4755.86 | 248.97 | 1214.77 |

The first column is a list of how many animals of each given taxon were consumed in a particular household over a 1-year period according to the daily diet calendars. The daily diet calendars are assumed to be a true measure of consumption. Each subsequent column lists the mean squared error by each recall method of all the recall estimates associated with a given level of true consumption. Rare consumption is defined as being consumed less than five times per year.
MSE, mean squared error.

Causes of variation in recall discrepancies

The mean ratio of the log-transformed geometric means of the recall count data and the calendar count data for wildlife consumption was 1.056 (median = 1.044; standard error = 0.013), demonstrating that annual recalls predicted the daily diet calendar counts with high accuracy (Fig. 2). There was little variation between households in terms of accuracy of estimating annual consumption through long-term recalls (Fig. 2). Twenty-four of the 28 households slightly overreported their annual consumption rates. Dividing the taxa into those that were legally versus illegally harvested, the mean deviation from a perfect recall for illegally harvested taxa was 0.03 whereas the mean deviation for legally harvested taxa was 0.11.

Figure 2.

Calculating agreement between recall-based estimates of household wildlife consumption and data from daily diet calendars. The 0 line represents a perfect fit between recall and diet calendars. Each household's (n = 28) accuracy of recall is depicted, averaged across the 15 taxa of interest. Negative deviation represents underreporting whereas positive deviation represents overreporting. In general, annual recalls were highly accurate despite a trend toward overreporting of wildlife consumption events for 15 wildlife taxa.

In examining causes of variation across the MSEs, certain factors arose as possible sources of variation. The number of times a taxon was consumed according to the diet calendar was the strongest predictor of MSE, where more frequent consumption increased the MSE (β = 1.08, P = 0.006, Table 3). Taxa that were consumed less frequently were more reliably reported by recalls than taxa that were consumed more frequently (100 vs. 80% of cases, Table 1). If a given taxon was only seasonally harvested (Table 1), the accuracy of recall also decreased (the absolute value of MSE increased) significantly (β = 0.35, P = 0.025, Table 3). If a taxon had a higher body mass, the accuracy of recall increased (β = −0.14, P = 0.097, Table 3). There were no statistically significant associations between the accuracy of recall estimators and either the male head of household's education or the number of times he had conducted a recall for past research (Table 3). This means that rarity, more than large body size, made consumption events memorable to an individual recalling them (Table 3). Additionally, frequency of consumption was more likely to skew the accuracy of recall than the seasonality of consumption (Table 3).

Table 3. The association between household and species' characteristics and the accuracy of oral recall reports (n = 420)

| Covariate | Coefficient^a |
| --- | --- |
| Calendar count^b | 1.08*** |
| Male education^d | −0.26 |
| Recall history^e | −0.77 |
| Species body mass^f | −0.14* |

High species body mass increased the accuracy of recall of wildlife consumption whereas high frequency of consumption and seasonality reduced the accuracy of recall.
^a Significance level for each coefficient: *P < 0.10; **P < 0.05; ***P < 0.005.
^b Calendar count refers to the annual rates of consumption of each taxon according to a household's daily diet calendar.
^c Seasonality is a binary categorical variable expressing whether hunting practices for a given species are aseasonal (0) or seasonal (1).
^d Male education is the maximal educational attainment of the male head of household who conducted the recall (categorized as either elementary or post-elementary).
^e Recall history is a continuous variable listing the number of years in which the head of household has conducted an annual recall.
^f Species body mass is the weight of each given species (log-transformed because of positive skew).


It is critical to validate the accuracy of survey methods that provide numerical responses on which management and policy actions are based. Our results show that annual recalls performed better than prior month recalls from either low- or high-hunting seasons and better than prior month recalls that combined a month from both seasons. The 1-month recall from the low-hunting season performed better than both the prior month recall during the high-hunting season and the combined 1-month recalls from both the high-hunting and low-hunting seasons in predicting annual rates of consumption. This demonstrates the risk of overestimating true rates of consumption by conducting recalls during the high-hunting season. In this case, and in other similar cases where events of interest are rare and/or seasonal, a longer recall period could be the best approach, or extrapolation should be limited to the season of occurrence. The annual recall is more cost-effective than two separate recall periods and produces a more accurate estimate in this case. It should be noted that the two prior month recalls may have affected the accuracy of the annual recall and that many participants had performed recalls for this research in the past. However, other studies have shown that rapid assessments can provide only qualitative, not quantitative, accuracy and should be used only for restricted applications (Gavin & Anderson, 2005).

Contrary to the suggestions of other studies (Casswell, Huckle & Pledger, 2002; Johansson et al., 2002), 24 of 28 households overreported their consumption behaviors. With more than half of the taxa being illegal to harvest throughout the year, one might expect that consumption behaviors would be underreported rather than overreported (e.g. Rist et al., 2010). Our result is similar to the finding that fishermen tended to overreport their catch (Lunn & Dearden, 2006), but dissimilar to Jones et al. (2008) who found that respondents overreported low levels of consumption and underreported high levels of consumption. However, in our study, the ratio of the recall to the calendar shows only a small margin of overreporting. This is likely due to one of two reasons: (1) the general lack of fear of reporting hunting activities in this area as local monitoring and enforcement are minimal or (2) the possibility that men are overreporting wildlife consumption as a point of pride (e.g. the influence of social desirability, DeMaio, 1984). It is possible that both men and women were simultaneously underreporting use of illegal taxa. Our data indicated that there was a high correlation between responses, and that legality and illegality of taxa did not affect the recall accuracy.

Our results underline the danger of extrapolating from short-term recall while assuming that behaviors of interest occur at a constant rate over time. Yet our research does not suggest that an annual recall is always better than a short-term recall period. It does suggest that we must be diligent in extrapolating rare and/or seasonal events. Rare events, or what Reis & Judd (2000) would call salient or distinctive events, are more accurately recalled than very common events. Understanding the temporal variability of behaviors of interest is critical to designing research methods that produce accurate estimates and are least likely to suffer from systematic bias. Thus, we must be sure that the recall is long enough to detect an event of interest but short enough for recall to remain vivid. Further, researchers should not feel obliged to utilize only one recall period in a given study but should adjust this recall period to be meaningful to the specific study subject. We recommend: (1) conducting a pilot study with focus groups to understand the rarity and seasonality of the events in question prior to creating surveys; (2) designing the study so that subjects are not exclusively observed during highs and/or lows; (3) creating recall periods that will allow the researcher to detect the event without risking memory inaccuracies; and (4) only extrapolating responses to the season of occurrence, if this is relevant. These recommendations have significant implications for affecting study design to minimize bias and reduce systematic errors that could inadvertently mislead managers and policymakers.


We would like to thank J. Gagnon-Bartsch for statistical consultation. We also acknowledge financial support from the National Geographic Society Conservation Trust #C135-08, the Margot Marsh Biodiversity Fund #023815, the Mohamed bin Zayed Species Conservation Fund #1025935, the National Science Foundation doctoral dissertation improvement grant #1011714 and the National Science Foundation Coupled Natural Human Systems grant #NSF-GEO1115057. We would also like to thank L. Fernald for her financial generosity in purchasing the kitchen scales for this project.