This paper examines how beliefs about own HIV status affect decisions to engage in risky sexual behavior, as measured by having extramarital sex and/or multiple sex partners. The empirical analysis is based on a panel survey of males from the 2006 and 2008 rounds of the Malawi Diffusion and Ideational Change Project (MDICP). The paper develops a behavioral model of the belief-risky behavior relationship and estimates the causal effect of beliefs on risky behavior using the Arellano and Carrasco (2003) semiparametric panel data estimator, which accommodates both unobserved heterogeneity and belief endogeneity arising from a possible dependence of current beliefs on past risky behavior. Results show that downward revisions in the belief assigned to being HIV positive increase risky behavior and upward revisions decrease it. For example, based on a linear specification, a decrease in the perceived probability of being HIV positive from 10 to 0 percentage points increases the probability of engaging in risky behavior (extramarital affairs) from 8.3 to 14.1 percentage points. We also develop and implement a modified version of the Arellano and Carrasco (2003) estimator to allow for misreporting of risky behavior and find estimates to be robust to a range of plausible misreporting levels. © 2013 The Authors. Journal of Applied Econometrics published by John Wiley & Sons, Ltd.


The AIDS epidemic imposes a large toll on populations in sub-Saharan Africa through high rates of mortality and morbidity. About two thirds of people infected with HIV worldwide reside in the region, and several countries have adult prevalence rates above 20% (UNAIDS, 2008). Heterosexual intercourse is known to be the main mode of transmission in Africa, but relatively little is known about how the disease and people's awareness of their HIV status influence sexual behaviors. Understanding the behavioral link is important to developing effective policy interventions, such as HIV testing programs or informational campaigns.

This paper studies how people's decisions to engage in risky sexual behaviors relate to their beliefs about own HIV status. From a theoretical perspective, the effect of beliefs on risky behavior is ambiguous. People who assign a high likelihood to being HIV-positive may take more risks as they are already infected. On the other hand, the fear of infecting others (via altruism, social norms or sanctions) might deter transmissive behaviors. People who assign a low likelihood to own infection may have a greater incentive to take precautions to avoid infection but may also take more risks because of less concern about infecting others. Reducing risky behavior of HIV-positive persons generally reduces incidence rates, but the relationship between risky behavior of HIV-negative persons and incidence rates is less clear.

To prevent the further spread of HIV, government and nongovernmental organizations have implemented a variety of public health interventions, including increasing access to testing and treatment services, informational campaigns, and condom distribution programs. It is hoped that informing individuals about their own HIV status and about methods of avoiding transmission will reduce incidence rates, although the quantitative evidence on behavioral responses is scarce. A study by Thornton (2008), described in Section 2, finds that individuals in Malawi who received positive HIV test results modestly increased condom purchases but did not alter sexual behavior over a 2-month timeframe following test result dissemination. Oster (2012) also shows little response of sexual behavior to local prevalence rates using Demographic and Health Surveys data for a subset of African countries. Philipson and Posner (1995) report similar findings for the USA.

Two ingredients are necessary for a program intervention to effectively reduce HIV incidence. First, the intervention must alter individuals’ beliefs about their own HIV status, HIV prevalence and/or about the technology for transmission; and, second, these belief changes must induce changes in behavior. In the context of rural Malawi, the link between HIV testing and beliefs has been tenuous. Table 1 shows the 2004 and 2006 test results given to males in the Malawi Diffusion and Ideational Change Project (MDICP) sample used in our analysis and their reported belief of being HIV-positive 2 years later (in 2006 and 2008). One would expect those receiving a positive test result to revise their belief of being positive upward (perhaps to 100%) and those receiving a negative test outcome to revise their belief downward. However, as seen in the table, the majority of individuals who tested positive in 2004 and 2006 report a zero probability of being positive 2 years later. There are also some individuals who test negative in 2004 and 2006 and assign a high probability to being positive 2 years later.

Table 1. HIV test results in 2004 and reported beliefs of own probability of infection 2 years later (a)

                                         HIV test outcome in 2004    HIV test outcome in 2006
Reported belief category 2 years later   Negative    Positive        Negative    Positive
Zero probability                         401         8               232         6
Low probability                          77          6               144         5
Medium probability                       12          2               31          2
High probability                         15          4               8           2

(a) Sample of males who got tested and learned the test result.

The evidence reported in this paper and in Delavande and Kohler (2009) indicates that beliefs are not completely revised in accordance with test results, although the reasons why are not fully understood. HIV-positive individuals are typically asymptomatic for many years and may therefore not believe that they carry the disease, particularly in the earlier years when testing was less prevalent. A high reported belief of being positive in 2006 despite a negative test result in 2004 could also reflect interim risky behavior. Lastly, the testing protocol required a second test whenever a positive result was obtained and a third test whenever the first and second tests were discordant, which induced a very low probability of a false positive. Nonetheless, some MDICP respondents expressed skepticism about the quality of the tests administered in 2004, which was likely exacerbated by an initial delay of one or more months in providing the test results. Tests administered in 2006 and 2008 used more rapid testing technology and did not have this delay.

This paper focuses on the second ingredient mentioned above and analyzes how beliefs about own HIV status influence risky behavior. The effect of participating in HIV testing on risky behavior has been examined in previous studies, but the belief–behavior relationship has received less attention. This relationship is independently of interest, because the effects of many policy interventions, such as HIV testing programs or public awareness programs, are mediated through changes in beliefs. Additionally, beliefs can change over time even in the absence of policy interventions, for example, in response to past risk exposure or to new information about the HIV status of previous sex partners.

Our empirical analysis is based on panel data from the MDICP survey, which contain unique measures of beliefs about own HIV status that vary substantially across people and over time. The sample covers rural populations from three different regions in Malawi, where overall HIV prevalence is approximately 7%. The survey is unusual in that it includes measures of individuals’ reported beliefs about their own and their spouse's HIV status as well as information on whether they engaged in risky behaviors. We use data from the 2004, 2006 and 2008 survey waves. We focus on men, who are more likely than women to report risky behavior. The rate of risky behavior need not be the same for men and women. First, it is more socially acceptable for men to report having multiple partners and extramarital affairs than for women. Second, men engage in some practices that are not common for women, such as having transactional sexual relationships with younger women. Because our analysis focuses on men, the results may not apply to other demographic groups.

Of key concern in any analysis of the relationship between sexual behavior and beliefs is the potential for endogeneity arising from a likely dependence of current beliefs on past behavior. Such a dependence leads to bias for both cross-section and within estimators (in linear models). Other panel data estimators (e.g. conditional logit) are also inappropriate as they do not allow for feedback from lagged behavior on current beliefs (a violation of strict exogeneity). For this reason, we estimate our model using a semiparametric panel data estimator developed by Arellano and Carrasco (2003), which accommodates feedback from lagged behavior on current beliefs and unobservable heterogeneity. We also develop a modified version of the Arellano and Carrasco (2003) estimator that allows for potential under-reporting of risky behaviors.

Section 2 summarizes related literature. Section 3 presents a simple model of risky behavior that illustrates that the net effect of changing beliefs on risk-taking is theoretically ambiguous and which guides the choice of variables in our empirical analysis. Section 4 presents our empirical strategy for estimating the causal effect of beliefs about own HIV status on risk-taking behaviors. Section 5 describes the empirical results based on the Arellano and Carrasco (2003) estimator and on our modified version that allows for misreporting of risky behavior. Section 6 discusses policy implications.


The notion that individuals change their behavior in response to communicable diseases is generally well accepted and there is a theoretical literature that explores the general equilibrium implications of this type of behavioral response. Philipson (2000), for example, surveys alternative theoretical frameworks of how behavior responds to disease prevalence. These include models of assortative matching (HIV-positives matching with HIV-positives and HIV-negatives with HIV-negatives), which are shown to have a dampening effect on the spread of a disease (Dow and Philipson, 1996); models that relate prevalence rates and the demand for vaccination; models for the optimal timing of public health interventions; and models for studying the implications of information acquisition (e.g. testing) for asymptomatic diseases such as HIV. In another theoretical study, Mechoulan (2004) shows that without a sufficient fraction of altruistic individuals testing can increase disease incidence.

Thornton (2008) empirically examines the causal impact of receiving HIV test results on risky behavior. When the 2004 tests were administered, the MDICP project team carried out an experiment that randomized incentives to pick up the test results. Thornton (2008) analyzes data from this experiment along with data from a 2-month follow-up survey that gathered information on condom purchases and risky sexual behavior. Using the randomized incentive as an instrument for picking up the test results, she finds that learning a positive test result modestly increased condom purchases but did not alter sexual behavior. Individuals who tested negative tended to revise their subjective beliefs about being HIV-positive downward and those who tested positive did not significantly revise their beliefs.

Although also based on MDICP survey data, our study differs from Thornton's in a number of ways: (i) a focus on identifying the causal belief–behavior relationship rather than the HIV testing–behavior relationship; (ii) the use of new data gathered in the 2006 and 2008 rounds of the MDICP sample that contain more detailed measures of beliefs than were available in the 2004 round and that are not conditioned on having picked up the test results in 2004; (iii) the use of a different modeling framework and estimation methodology; and (iv) the use of different measures of risky behavior (extramarital sex and multiple sex partners, measured annually).

Boozer and Philipson (2000) analyze the relationship between HIV status, testing and risky behavior using data from the San Francisco Home Health Study (SFHHS). Our identification strategy is similar to theirs in that we also make use of belief information gathered in two time periods, where individuals had the opportunity to get tested in the intervening period. In the SFHHS survey, all individuals who were unaware of their status (around 70%) were tested immediately after the first wave of interviews and learned their status. Boozer and Philipson use those who already knew their status (the remaining 30%) as a control group and find that decreases in the probability assigned to being HIV-positive increase sexual activity. That is, individuals who considered themselves highly likely to be infected and discover they are not increase the number of partners and those who believe themselves to be unlikely to be infected and discover otherwise reduce their number of partners. Our empirical findings are similar, despite the different study population and estimation approach.

Coates et al. (2000) and Gong (2012) analyze data from a Voluntary Counseling and Testing (VCT) Efficacy Study: a randomized trial that took place in Kenya, Tanzania, and Trinidad in the mid 1990s. Study participants were randomly assigned to a treatment group that received VCT or to a control group that received basic health information. Data were gathered on self-reported sexual behavior. Sexually transmitted infections (STIs) were also diagnosed and treated at first follow-up. Coates et al.’s (2000) analysis finds that VCT reduced risky behavior, as measured by self-reported unprotected intercourse. More recently, though, Gong (2012) reanalyzed the data from the African sites, including the STI outcome data, and found that individuals who originally believed themselves to be HIV-negative and were surprised by a positive test result were more likely to contract an STI, while the reverse was true for those who were surprised by a negative test result. He concludes that informing people of a positive test has the unintended consequence of increasing risky behavior. Gong argues that biological STI measures are better indicators of risky behavior than self-report measures, because they are not affected by misreporting. However, misclassification of risky behavior is also possible when biological measures are used, as not all individuals who engage in risky behavior contract an STI. Comparing Gong's study population to ours, notable differences are that his sample is younger and contains a higher proportion of single and urban individuals. His data were also gathered at a time when there were fewer HIV treatment options, which may affect how individuals respond to testing and to changes in beliefs about own HIV status.

As we describe in detail later, the MDICP survey measured beliefs about own HIV status using two different measurement instruments. In the 2004, 2006 and 2008 surveys, individuals were asked to choose one of four categories: no likelihood, low likelihood, medium likelihood and high likelihood. In 2006 and 2008, the categorical measure was supplemented with a numerical measure, which is our main belief measure in this paper. Delavande and Kohler (2009) used the MDICP data to study the accuracy of individuals’ reported numerical beliefs of being HIV-positive and provided detailed documentation of the method used in the surveys to elicit the probabilistic beliefs. They found that the probability assessments on HIV infection gathered in the 2006 round of the survey were remarkably well calibrated to local community prevalence rates. For the 2004 wave of the MDICP data, however, the likelihood of own infection is reported only in broader categories. Anglewicz and Kohler (2009) point out that individuals in the 2004 wave seem to overestimate the risk of being infected; 10% of husbands and 18% of wives estimate a medium or high likelihood of current infection, while actual prevalence in 2004 was 6% for men and 9% for women. In reconciling the 2004 evidence with the well-calibrated probabilistic assessments in the later wave, Delavande and Kohler note problems of interpersonal comparability of the coarse belief categories and that, even if anchoring techniques are used (such as vignettes), complications still remain in translating the coarse categories into more precise assessments. 1 In this paper, we make use of both the coarse belief categories and the finer measurements gathered in 2006 and 2008, as further described in Section 4.

In a follow-up paper, Delavande and Kohler (2012) use the 2004 and 2006 MDICP data to assess the impact of learning HIV status on expectations about own HIV status and on sexual behavior, measured in 2006. Like Thornton (2008), they use the randomized financial incentive as an instrument for learning HIV status (picking up the test result), but their outcome measures are obtained 2 years after the testing rather than 2 months after (as in Thornton's, 2008, study) and their model additionally controls for possible nonrandom attrition. They find that learning an HIV-positive status is not associated with a statistically significant increase in the probability assigned to being HIV-positive 2 years later. Also, people who received an HIV-negative test result in 2004 tend to report a higher probability of being infected in 2006. With regard to sexual behavior, they find that learning a positive HIV status in 2004 leads to having fewer partners and using a condom more often (measured in 2006). HIV-negative individuals who learn their status were more likely to have a condom at home. Couples in which two people learned a negative status were more likely to use condoms (typically, for extramarital relations), but couples in which only one person learned a negative status decreased condom use. We do not analyze condom use in our study, because its use within marriage is rare and the variable was not available in 2008.

Delavande and Kohler (2011) use the 2006 and 2008 MDICP data to study the relationship between subjective beliefs and risky sexual behaviors. They specify a three-equation model of the probability of being HIV-infected prior to the 2006 test, the decision to get tested in 2006, and the decision to have multiple partners. The decision to have multiple partners is assumed to depend on the difference in the subjective expectations of surviving with and without HIV and on own beliefs about being HIV-positive. The authors find that HIV/AIDS-related subjective expectations play an important role in the multiple-partner decision.

Although some of the questions addressed in the Delavande and Kohler (2011) study are similar to those in this paper, the empirical approach is quite different. A key difference is that Delavande and Kohler (2011) account for belief endogeneity by explicitly modeling how beliefs are formed within a parametric framework. Our dynamic panel data model approach does not require specifying how beliefs are formed. Also, they explicitly incorporate belief data about HIV survival probabilities into their model, whereas our model incorporates these beliefs through person-specific unobservables. One similarity is that Delavande and Kohler (2011) use the estimation strategy first proposed in our paper to allow also for the possibility of misreporting error.

There are some other related papers in the public health literature (see, for example, Higgins et al., 1991; Ickovics et al., 1994; Wenger et al., 1991, 1992) that find little or mixed evidence of behavioral response to HIV testing.


As noted in the Introduction, theoretical models are usually ambiguous as to the direction of the relationship between beliefs about one's own HIV status and risk-taking behaviors. Downward revisions in beliefs, as may arise from learning a negative test result, should increase the expected length of life and thereby increase the benefits from risk avoidance. On the other hand, if, as in our sample, individuals tend to overestimate the probability of becoming HIV-infected from one sexual encounter with an infected person, then learning that they are HIV-negative despite a past life of risky behavior may increase their willingness to take risks. 2 This channel is not included in the theoretical model presented here but is allowed to operate in our empirical analysis, as later described. Altruism also plays an important role in HIV transmission; people who are altruistic should curtail risky behaviors after an upward revision in beliefs. Other factors that may also influence transmissive behavior are social or legal sanctions imposed on HIV-positive individuals.

To explore the relationship between beliefs of own HIV status and sexual behavior, we next present a simple two-period model. It assumes that individuals choose their level of risky behavior in the first period and update their beliefs of own HIV status in a Bayesian way. Let $Y \ge 0$ denote an individual's chosen level of risky sexual behavior (which represents activities such as engaging in extramarital sex or having multiple sex partners). The (perceived) probability of infection is an increasing function of risky behavior and we denote it by $g(Y) \in [0, 1]$. In a multi-period context, this belief may also be updated through time but we take it as predetermined when the risky behavior decision is taken. Other factors, such as the prevalence rate in the community, modulate the link between sexual behavior and the likelihood of infection and could also be incorporated into the function $g(\cdot)$. We abstract from such influences here for ease of presentation, but the empirical analysis includes conditioning variables intended to hold constant local prevalence rates.

Let $B_0$ denote the individual's prior belief about his own HIV status. Individuals potentially obtain satisfaction from risky sexual behaviors in the first period. We also allow one's perception on HIV status, $B_0$, to directly affect utility: $U(Y, B_0)$. How beliefs affect the marginal utility of risky behavior can be regarded as a measure of altruism or the degree to which social sanctions on transmissive behavior by HIV-positive individuals affect the utility of sexual intensity. In the second period, individuals receive a ‘lump-sum’ utility flow equal to $\bar{U}$, but this is reduced by $\lambda \bar{U}$ if an individual contracts HIV in the first period. $\lambda$ can be interpreted as the mortality rate for an HIV-positive individual. The discount factor is $\beta$. The belief of being HIV-positive in the second period ($B_1$) depends on previous period beliefs ($B_0$) plus the probability of having contracted the disease in the last period:

$$B_1 = B_0 + (1 - B_0)\, g(Y) \qquad (1)$$

The individual's problem is

$$\max_{Y} \; U(Y, B_0) + \beta\, (1 - \lambda B_1)\, \bar{U}$$

or, equivalently,

$$\max_{Y} \; U(Y, B_0) + \beta\, \left[ 1 - \lambda B_0 - \lambda (1 - B_0)\, g(Y) \right] \bar{U}$$

The first-order condition yields

$$U_1(Y, B_0) - \beta \lambda (1 - B_0)\, g'(Y)\, \bar{U} = 0 \qquad (2)$$

where $U_1(\cdot,\cdot)$ denotes the derivative of $U(\cdot,\cdot)$ with respect to its first argument. This condition implicitly defines $Y$ as a function of the belief variable $B_0$. Furthermore,

$$\frac{dY}{dB_0} = - \frac{U_{12}(Y, B_0) + \beta \lambda\, g'(Y)\, \bar{U}}{U_{11}(Y, B_0) - \beta \lambda (1 - B_0)\, g''(Y)\, \bar{U}}$$

which, given a concave (in $Y$) utility function, is positive if $U_{12}(Y, B_0) + \beta \lambda\, g'(Y)\, \bar{U} > 0$ and $g''(Y) \ge 0$. The latter is reasonable if the probability of infection ($g(Y)$) is low (take, for instance, $g(\cdot)$ to be a logistic or normal cumulative distribution function and consider the low rates of transmission per sexual act). If an individual's marginal utility from (risky) sexual behavior is insensitive to his or her perception on HIV status (that is, not altruistic or amenable to social sanctions if HIV-positive), $U_{12}(Y, B_0) + \beta \lambda\, g'(Y)\, \bar{U} = \beta \lambda\, g'(Y)\, \bar{U}$, which is positive. As long as one's marginal utility does not decrease much (relative to $\beta \lambda\, g'(Y)\, \bar{U}$), higher prior beliefs are associated with riskier behaviors. A person who is not altruistic (i.e. $U_{12}(\cdot) = 0$) would be expected to increase risky behavior upon learning a positive test result and to decrease risky behavior upon learning a negative test result. Intuitively, if one is already infected, sexual behavior poses no further risks but still provides utility.

In a multi-period context, beliefs affect current behavior and respond to past behavior through updating. The prior belief $B_0$ is based at least in part on previous $Y$ choices. As described in the next section, dependence of beliefs on previous behavior poses estimation challenges, because it leads to a potential lack of strict exogeneity in a panel data model. Another potential source of endogeneity arises from unobservable traits that affect both beliefs $B_0$ and behavior $Y$.
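The comparative static for a non-altruistic individual can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes the hypothetical functional forms U(Y, B0) = sqrt(Y) (so that U12 = 0) and g(Y) = C·Y² (convex, so g'' ≥ 0), together with illustrative values for the discount factor, the mortality rate and the lump-sum utility. The names `foc` and `optimal_risk` are likewise hypothetical. Under these assumptions the first-order condition has the closed-form root Y = (4.5(1 − B0))^(−2/3), which the bisection should reproduce.

```python
import math

# Hypothetical parameter values chosen only for illustration
BETA, LAM, U_BAR, C = 0.9, 0.5, 10.0, 0.25

def foc(y, b0):
    """First-order condition with assumed U(Y, B0) = sqrt(Y) and g(Y) = C * Y**2."""
    marginal_utility = 0.5 / math.sqrt(y)   # U_1(Y, B0); U_12 = 0 (no altruism)
    marginal_risk = 2.0 * C * y             # g'(Y); g''(Y) = 2C >= 0
    return marginal_utility - BETA * LAM * U_BAR * (1.0 - b0) * marginal_risk

def optimal_risk(b0, lo=1e-9, hi=2.0, tol=1e-12):
    """Bisection on the FOC, which is strictly decreasing in Y on (0, 2]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid, b0) > 0:
            lo = mid   # marginal utility still exceeds marginal expected cost
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Chosen risk level for several prior beliefs of being HIV-positive
for b0 in (0.0, 0.1, 0.3, 0.5):
    print(f"B0 = {b0:.1f}  ->  Y = {optimal_risk(b0):.4f}")
```

With these assumed forms, the chosen level of risky behavior rises with the prior belief, in line with the sign condition derived from the first-order condition. Adding a sufficiently negative cross-derivative (altruism) to the assumed utility would reverse the ordering.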


As noted, we aim to assess whether and to what extent changes in beliefs about own HIV status affect risk-taking behaviors. The behavioral model developed in the previous section implies a decision rule for risky behavior that depends on beliefs about own HIV status (equation (2)). Our empirical decision rule specification introduces additional covariates to allow for important time-varying determinants of behavior, such as age. (Any linear time trend is also captured by the coefficient on age.) It also controls for time-invariant determinants by incorporating correlated individual random effects (as described below). These time-invariant determinants may include religiosity, education, local prevalence rates (which were roughly constant over the 2006–2008 time period we study), and individual or region-specific costs of risky sexual behavior.

We next describe the nonlinear panel data estimation strategy used to control for endogeneity of beliefs and for (correlated) unobservable heterogeneity. Let $Y^*_{it}$ denote the actual measure of risk-taking behavior of individual i in period t, which in our data is an indicator for whether the individual engaged in extramarital sex or an indicator for having had more than one partner over the previous 12 months.3 Denote by $Y_{it}$ the reported measure of risk-taking behavior of individual i in period t. Later, we allow for misreporting in the variable $Y_{it}$. $B_{it}$ denotes an individual's belief at time t about his own HIV status, measured on a 0–10 scale (with 0 being no likelihood of being HIV-positive and 10 being positive with certainty).

The empirical specification (without misreporting, so that $Y_{it} = Y^*_{it}$) can be written as

$$Y^*_{it} = 1\{\alpha + \theta B_{it} + \gamma' X_{it} + u_{it} \ge 0\} \qquad (3)$$

Following Arellano and Carrasco (2003), assume the error term can be decomposed as

$$u_{it} = f_i + v_{it}$$

where $v_{it}$ is an idiosyncratic shock and $f_i$ is a time-invariant effect that is potentially correlated with the included covariates. Arellano and Carrasco's (2003) approach for modeling the correlated random effect extends an earlier approach proposed by Chamberlain (1984). It is assumed that, conditional on $W_i^t$, $u_{it}$ is logistically distributed with a location parameter equal to $E(f_i \mid W_i^t)$. No restrictions are imposed on the shape of the conditional mean function. $W_i^t$ is a vector that assembles previous and current values of $B_{it}$ and $X_{it}$ and past values of $Y_{it}$. In our case, $W_i^t$ will have a discrete support as our covariates all have discrete supports. Then,

$$\Pr(Y_{it} = 1 \mid W_i^t) = \Lambda\big(\alpha + \theta B_{it} + \gamma' X_{it} + E(f_i \mid W_i^t)\big) \equiv h_t(W_i^t)$$

where $\Lambda$ denotes the logistic distribution function and $h_t(W_i^t)$ can be easily estimated in the data as our covariates have discrete support. Applying an inverse transformation function, the above expression is equivalent to

$$\Lambda^{-1}\big(h_t(W_i^t)\big) = \alpha + \theta B_{it} + \gamma' X_{it} + E(f_i \mid W_i^t)$$

which, first-differenced, yields

$$\Lambda^{-1}\big(h_t(W_i^t)\big) - \Lambda^{-1}\big(h_{t-1}(W_i^{t-1})\big) - \theta \Delta B_{it} - \gamma' \Delta X_{it} = \varepsilon_{it}$$

$$\varepsilon_{it} \equiv E(f_i \mid W_i^t) - E(f_i \mid W_i^{t-1})$$

By the law of iterated expectations:

$$E\big[\Lambda^{-1}\big(h_t(W_i^t)\big) - \Lambda^{-1}\big(h_{t-1}(W_i^{t-1})\big) - \theta \Delta B_{it} - \gamma' \Delta X_{it} \mid W_i^{t-1}\big] = E(\varepsilon_{it} \mid W_i^{t-1}) = 0$$

In the previously described behavioral model, current beliefs about HIV status depend on prior beliefs and last-period behaviors through updating (equation (1)):

$$B_{it} = B_{it-1} + (1 - B_{it-1})\, g(Y^*_{it-1})$$

where $Y^*_{it-1}$ is a function of $f_i$ and $v_{it-1}$ (equation (3)). This updating implies a potential correlation between $B_{it}$ and $Y^*_{it-1}$, and therefore between $B_{it}$, $v_{it-1}$ and $f_i$. This correlation amounts to a violation of the usual assumption that covariates be independent of past and future idiosyncratic shocks (the $v_{it}$'s) invoked in nonlinear panel data settings (i.e. strict exogeneity). An advantage of the Arellano and Carrasco (2003) estimator is that it only requires that covariates be independent from current and future idiosyncratic shocks (the $v_{it}$'s), but not past ones (i.e. assumes weak exogeneity). This allows lagged behavior (which is partly determined by past $v_{it}$'s) to affect current and future beliefs.

The conditional moment restriction can be used to construct a moment-based estimator for the parameters of interest. In the case of covariates with finite support, the conditional moments above are equivalent to the following unconditional moments (see Chamberlain, 1987):

$$E\big[Z_{it}\big(\Lambda^{-1}\big(h_t(W_i^t)\big) - \Lambda^{-1}\big(h_{t-1}(W_i^{t-1})\big) - \theta \Delta B_{it} - \gamma' \Delta X_{it}\big)\big] = 0$$

where $Z_{it}$ is a vector of dummy variables, each corresponding to a cell for $W_i^{t-1}$. Arellano and Carrasco suggested constructing a GMM estimator based on the empirical moments:

$$\frac{1}{N} \sum_{i=1}^{N} Z_{it}\big(\Lambda^{-1}\big(\hat{h}_t(W_i^t)\big) - \Lambda^{-1}\big(\hat{h}_{t-1}(W_i^{t-1})\big) - \theta \Delta B_{it} - \gamma' \Delta X_{it}\big)$$

for t = 2, …, T, where $\hat{h}_t$ denotes the cell-frequency estimate of $h_t$ and N is the number of individuals.
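The mechanics of these moments can be illustrated with a deliberately simplified sketch. It is not the estimator used in the paper: it assumes two periods, no covariates other than a discrete belief variable, conditioning sets W¹ = B₁ and W² = (B₁, Y₁, B₂), cell frequencies strictly between 0 and 1 (so that the plain logit is defined), and cell-average moments weighted by cell size. The function names `cell_means` and `ac_theta` are hypothetical. Because the moments are linear in the belief coefficient, the weighted minimum-distance problem has a closed-form solution here.

```python
import math
from collections import defaultdict

def logit(p):
    return math.log(p / (1.0 - p))

def cell_means(keys, values):
    """Frequency estimate of E[value | cell] for a discrete conditioning set."""
    total, count = defaultdict(float), defaultdict(int)
    for k, v in zip(keys, values):
        total[k] += v
        count[k] += 1
    return {k: total[k] / count[k] for k in total}

def ac_theta(b1, y1, b2, y2):
    """Two-period sketch with W^1 = B_1 and W^2 = (B_1, Y_1, B_2) (assumed setup)."""
    w1 = list(b1)
    w2 = list(zip(b1, y1, b2))
    h1 = cell_means(w1, y1)   # estimate of Pr(Y_1 = 1 | W^1)
    h2 = cell_means(w2, y2)   # estimate of Pr(Y_2 = 1 | W^2)
    # Individual-level differenced logits and belief changes
    dlogit = [logit(h2[k2]) - logit(h1[k1]) for k1, k2 in zip(w1, w2)]
    db = [new - old for old, new in zip(b1, b2)]
    # Instruments are dummies for the W^1 cells, so average within each cell
    a = cell_means(w1, dlogit)
    b = cell_means(w1, db)
    n = defaultdict(int)
    for k in w1:
        n[k] += 1
    # Minimize sum_c n_c * (a_c - theta * b_c)^2: weights proportional to cell size
    num = sum(n[c] * a[c] * b[c] for c in a)
    den = sum(n[c] * b[c] ** 2 for c in a)
    return num / den
```

Calling `ac_theta` requires at least one W¹ cell with a nonzero average belief change; otherwise the denominator is zero and the coefficient is not identified in this sketch. With additional covariates the same moments would be stacked and solved as a weighted least-squares problem in the parameter vector.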

For our weighting matrix we use inline image, a diagonal matrix that gives more weight to the cells that have more individuals. To handle cases in which inline image is 0 or 1, we adopt a slight modification of Cox's (1970) small-sample adjustment to the logit transformation:

display math
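As an illustration of why such an adjustment is needed, the sketch below implements the standard empirical logit, which adds one half to the counts of successes and failures within a cell. This is an assumption made for illustration only: the text describes a slight modification of Cox's (1970) adjustment, and the exact form used in the paper is not reproduced here. The function name `empirical_logit` is hypothetical.

```python
import math

def empirical_logit(successes, n):
    """Standard empirical logit for a cell with `successes` ones out of `n` observations.

    Adding 0.5 to both counts keeps the transform finite when the cell
    frequency is exactly 0 or 1.
    """
    return math.log((successes + 0.5) / (n - successes + 0.5))

# Cells in which nobody, half, or everybody reports risky behavior
for r in (0, 5, 10):
    print(r, empirical_logit(r, 10))
```

The unadjusted logit would be undefined for the first and last cells, which is exactly the case the small-sample correction is meant to handle.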

In our application, Bit is a numerical measure of the likelihood respondents assigned to being infected with HIV. In addition, we have access to cruder belief measures that were reported in categories (‘no likelihood’, ‘low’, ‘medium’ or ‘high likelihood’). To improve efficiency, we include additional moments using this cruder belief variable in the estimation. We add the following empirical moments to our estimator:

display math

The vector $l_{it-1}$ contains dummies for the categorical belief variables in 2006 (no likelihood, low, medium or high likelihood). Following Arellano and Carrasco (2003), we also assume the normalization that $E(f_i) = 0$, which provides two extra moments (one for each year) and allows us to estimate the intercept α. This restriction does not, however, impose that the average fixed effects are zero within geographic regions. They may not be the same, for example, due to prevalence rates that vary across regions.

The resulting GMM estimator is asymptotically normal and its asymptotic variance, taking into account the estimated regressors (the estimated predicted probabilities), can be obtained by conventional methods for multi-stage estimation problems (see for example, Newey and McFadden, 1994).

To facilitate the interpretation of the estimated parameters, we also report later in the paper the effects of belief changes from B′ to B″ on behavior:

display math

These are computed as in Arellano and Carrasco (2003), replacing population expectations and parameters by sample averages and estimates. In particular:

display math

This marginal effect measures the causal impact of beliefs on risky behavior, holding constant the individual effect (fi) (similar considerations are discussed in Chamberlain (1984, pp. 1272–1274)).

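Finally, in our robustness analysis we also consider the possibility that some fraction of individuals who engage in risky behavior report that they do not. To this end, we adapt ideas developed by Hausman et al. (1998) to the Arellano and Carrasco (2003) framework. We assume that individuals always report truthfully when they do not engage in extramarital sex and, with probability $\alpha_1$, deny having had extramarital sex when they did. Thus, letting $Y_{it}$ denote reported behavior and $Y^*_{it}$ denote true behavior:

$$\Pr(Y_{it} = 1 \mid Y^*_{it} = 0) = 0, \qquad \Pr(Y_{it} = 0 \mid Y^*_{it} = 1) = \alpha_1$$

With misreporting, the conditional probability of reporting risky behavior takes the form:

$$h_t(W_i^t) \equiv \Pr(Y_{it} = 1 \mid W_i^t) = (1 - \alpha_1)\, \Lambda\big(\alpha + \theta B_{it} + \gamma' X_{it} + E(f_i \mid W_i^t)\big)$$

where $\Lambda$ is the logistic distribution function and $\theta$ and $\gamma$ are the coefficients on $B_{it}$ and $X_{it}$ in equation (3). By the same steps as in the previous derivation, this leads to the following first-difference expression:

$$\Lambda^{-1}\!\left(\frac{h_t(W_i^t)}{1 - \alpha_1}\right) - \Lambda^{-1}\!\left(\frac{h_{t-1}(W_i^{t-1})}{1 - \alpha_1}\right) - \theta \Delta B_{it} - \gamma' \Delta X_{it}$$

$$= E(f_i \mid W_i^t) - E(f_i \mid W_i^{t-1})$$

Using the law of iterated expectations, we again obtain estimation moments for the parameters of interest. 4 In our robustness analysis, we report coefficient estimates for varying degrees of misclassification.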
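The role of the misreporting probability can be seen in a small numerical sketch. It assumes, as in the setup described above, that only true risky behavior is ever denied and that the denial probability does not depend on the conditioning variables; the function names and the numerical values are hypothetical.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def reported_prob(index, alpha1):
    """Pr(report = 1 | W) when true risky behavior is denied with probability alpha1."""
    return (1.0 - alpha1) * logistic(index)

def corrected_index(h, alpha1):
    """Invert the relation above: rescale the reporting probability, then take the logit."""
    p_true = h / (1.0 - alpha1)
    if not 0.0 < p_true < 1.0:
        raise ValueError("h must lie strictly between 0 and 1 - alpha1")
    return math.log(p_true / (1.0 - p_true))

index = -1.2                       # hypothetical value of the latent index
h = reported_prob(index, 0.25)     # reporting probability with 25% denial
print(corrected_index(h, 0.25))    # recovers the index up to rounding error
print(corrected_index(h, 0.0))     # ignoring misreporting understates the index
```

Repeating the calculation over a grid of assumed denial probabilities mirrors the robustness exercise of reporting estimates for varying degrees of misclassification.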


5.1 Background on the MDICP Survey

The MDICP data were gathered by the Malawi Research Group in rural areas of three districts in the different administrative regions of the country. 5 As described in the supplementary Web Appendix (supporting information), Malawi's three administrative regions (North, Center and South) are significantly different in ways that are potentially relevant to our analysis. The MDICP data include information gathered from five rounds of a longitudinal survey (1998, 2001, 2004, 2006, 2008) that together contain extensive information on socio-economic indicators, household composition, sexual and partnership histories, and risk assessments of more than 2500 men and women. We primarily use the 2006 and 2008 survey rounds that include detailed information on beliefs about own HIV status combined with cruder measures on reported beliefs from the 2004 survey round. Also, for reasons described previously, we analyze data on men.

Recent studies on the quality of this survey have compared the MDICP sample to other survey samples from rural Malawi. Anglewicz et al. (2009) compare the MDICP participants in 2004 to the 2004 rural population in the Malawi Demographic Health Survey (DHS). MDICP subjects tend to be older (see Table 1.1 in that paper), more educated, more likely to be married, more likely to have known individuals with AIDS but somewhat less knowledgeable about the disease. The authors conjecture that the difference might be explained by the fact that the Malawi DHS includes rural townships, whereas the whole MDICP sample resides in villages. The supplementary Web Appendix provides further information about Malawi and the survey (see also Watkins et al., 2003).

The MDICP survey measured beliefs about own HIV status using two different measurement instruments. In the 2004, 2006 and 2008 surveys, individuals were asked to choose one of four categories: no likelihood, low likelihood, medium likelihood and high likelihood. In 2006 and 2008, the categorical measure was supplemented with a probability measure. One might be concerned that low-education populations would have difficulty in reporting probabilities. For this reason, the MDICP survey used a novel bean-counting approach to elicit probabilities, in which these were measured on a 0–10 bean scale, where more beans for a particular event correspond to a higher probability assessment for that event (see the supplementary Web Appendix for details). The measures of subjective beliefs are valuable, because decision making is affected by how individuals perceive their environment, whether their perceptions are correct or not. Although we cannot directly validate whether reported beliefs correspond to actual perceptions, Delavande and Kohler (2009) show that the beliefs correlate as expected with the variables associated with HIV infection likelihood. As with other empirical studies using belief data, our analysis assumes that subjective beliefs are accurately reported. If people do not accurately report subjective beliefs because of fear of stigma, for example, then the estimates could be biased.

Finally, we note that beliefs are measured at the time of the interview (in 2006 and 2008), whereas the risky behavior measure pertains to the 12 months preceding each interview. We therefore assume, in terms of timing, that the beliefs reported at the interview were roughly stable over the previous 12 months. If beliefs were the only regressor, then a violation of this assumption could lead to an upward bias in the estimated coefficient on beliefs (which we estimate to be negative). The direction of the bias is unclear when other covariates are included.

5.2 Descriptive Analysis

Table 2 shows the means and standard deviations for the variables used in our analysis. The total sample size is 587 men for whom data were collected in both the 2006 and 2008 survey rounds. When reporting results for extramarital sex, we restrict the sample to the 485 men who were married in both rounds (possibly to different women). The results for the multiple sex partner outcome include all men, whether married or not. In 2008, the average age of the sample is 46.

Table 2. Descriptive statistics. Sample: males in 2006 and 2008 MDICP samples

| Variable | Mean | Std. dev. |
|---|---|---|
| Age (in 2008) | 45.739 | 11.639 |
| No school | 0.102 | 0.303 |
| Primary education only | 0.702 | 0.458 |
| Secondary education | 0.184 | 0.388 |
| Higher education | 0.012 | 0.109 |
| Reside in Balaka | 0.318 | 0.466 |
| Reside in Rumphi | 0.372 | 0.484 |
| Reside in Mchinji | 0.310 | 0.463 |
| Polygamous (2006) | 0.173 | 0.379 |
| Polygamous (2008) | 0.168 | 0.375 |
| Number of children (2006) | 5.050 | 3.032 |
| Number of children (2008) | 5.538 | 2.802 |
| Number of children not reported (2006) | 0.046 | 0.210 |
| Number of children not reported (2008) | 0.000 | 0.000 |
| Metal roof 2006 | 0.152 | 0.359 |
| Metal roof 2008 | 0.201 | 0.401 |
| Believe that own prob. of HIV is zero in 2006 | 0.792 | 0.406 |
| Believe that own prob. of HIV is low in 2006 | 0.152 | 0.359 |
| Believe that own prob. of HIV is medium in 2006 | 0.029 | 0.168 |
| Believe that own prob. of HIV is high in 2006 | 0.027 | 0.163 |
| Believe that own prob. of HIV is zero in 2008 | 0.551 | 0.498 |
| Believe that own prob. of HIV is low in 2008 | 0.341 | 0.475 |
| Believe that own prob. of HIV is medium in 2008 | 0.081 | 0.272 |
| Believe that own prob. of HIV is high in 2008 | 0.027 | 0.164 |
| Subjective prob. of being HIV positive, bean count measure (2006) | 0.734 | 1.701 |
| Subjective prob. of being HIV positive, bean count measure (2008) | 1.371 | 1.824 |
| Subjective prob. of spouse being HIV positive, bean count measure (2006) | 0.663 | 1.552 |
| Subjective prob. of spouse being HIV positive, bean count measure (2008) | 1.430 | 1.923 |
| Extramarital sex in last 12 months in 2006 (a) | 0.079 | 0.270 |
| Extramarital sex in last 12 months in 2008 (a) | 0.109 | 0.312 |
| Number of partners in 2006 | 1.276 | 1.444 |
| Number of partners in 2008 | 1.342 | 1.821 |
| More than one partner in 2006 | 0.201 | 0.401 |
| More than one partner in 2008 | 0.210 | 0.407 |
| Took HIV test in 2006 | 0.937 | 0.243 |
| Took HIV test in 2008 | 0.816 | 0.388 |
| Number of observations | 587 | |

a. This variable is defined conditional on being married.

As seen in Table 2, the average number of beans representing the belief that one is HIV-positive is 0.73 in 2006, in comparison to 1.37 in 2008 (on a scale of 0–10 beans); the corresponding averages for the belief that one's spouse is HIV-positive are 0.66 and 1.43. Figure 1 further shows the full distribution of reported beliefs in the 2006 and 2008 rounds. Even though individuals were not informed about their spouse's test result for confidentiality reasons (if their spouse got tested), almost all of the men who report that their spouses got tested also report that their spouses shared the test results with them. For example, out of the 580 men in our sample who were married in 2006, 67% report that their spouse has been tested, and, of those, 97% report that the test result was shared. With regard to risky behavior, 7.9% in 2006 and 10.9% in 2008 reported having extramarital sex in the last 12 months. For those married in both rounds, the numbers are 4.3% and 10.5%.⁶

Figure 1. Belief distribution as measured by numbers of beans (in 2006 and 2008).

The average number of sex partners was about 1.28 in 2006 and 1.34 in 2008, with monogamous men reporting on average 1.05 and 1.18, respectively. The average number of partners for younger men (men under the age of 50) is similar to that for the overall sample. The proportion of men reporting more than one partner was 20% in 2006 and 21% in 2008. For monogamous men the proportions fall to around 5% in both years. As previously noted, HIV testing was offered in 2006 and 2008: 93.7% of the sample was tested in 2006, in comparison with 81.6% in 2008.

Table 3 explores the potential determinants of decisions about extramarital sex and having more than one sexual partner, using a standard logit regression applied to 2006 and 2008 data. The bean count measure (reported in columns (1) and (5)) is the regressor used later in our implementation of Arellano and Carrasco (2003). The disaggregated measures (columns (2), (3), (6) and (7)) are also used later in the Arellano–Carrasco implementation, in constructing cells used in estimation. People who assign a higher probability of themselves being HIV positive are more likely to report engaging in extramarital sex and to report having more than one sexual partner. These correlations do not have a causal interpretation though, because they do not account for unobserved heterogeneity or for the potential endogeneity of beliefs. Because the individual effect fi positively affects the likelihood that yi,t − 1 is positive and this, in turn, positively affects beliefs by increasing the probability of infection since the last period, we would expect beliefs and the residual to be positively associated, introducing an upward bias in the estimation.

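Table 3. Logistic estimation of risky sex determinants in 2006 and 2008 (standard errors in parentheses)

Outcome: Extramarital sex

| Variable | Bean count | 0,1,2–10 beans | 0,1,2–4,5–10 beans | Bean count, with covariates | 0,1,2–10 beans, with covariates | 0,1,2–4,5–10 beans, with covariates |
|---|---|---|---|---|---|---|
| Bean count (a) | 0.139** (0.055) | | | 0.138** (0.061) | | |
| One bean (a) | | 0.297 (0.384) | 0.295 (0.384) | | 0.268 (0.392) | 0.264 (0.393) |
| 2–10 beans (a) | | 0.753*** (0.289) | | | 0.616** (0.307) | |
| 2–4 beans (a) | | | 0.625* (0.323) | | | 0.402 (0.348) |
| 5–10 beans (a) | | | 1.042*** (0.401) | | | 1.113*** (0.409) |
| Age in 2006 | | | | −0.146* (0.077) | −0.144* (0.078) | −0.149* (0.078) |
| Age squared in 2006 | | | | 0.001* (0.001) | 0.001 (0.001) | 0.001* (0.001) |
| Moslem | | | | −0.429 (0.443) | −0.433 (0.442) | −0.443 (0.441) |
| No school (b) | | | | 0.793 (0.608) | 0.783 (0.612) | 0.849 (0.610) |
| Primary school (b) | | | | 0.763 (0.478) | 0.743 (0.477) | 0.794* (0.479) |
| Resides in Balaka (b) | | | | −0.021 (0.414) | −0.029 (0.415) | −0.015 (0.414) |
| Resides in Rumphi (b) | | | | −0.760** (0.362) | −0.725** (0.368) | −0.755** (0.371) |
| Polygamous | | | | −0.169 (0.419) | −0.154 (0.421) | −0.180 (0.423) |
| Number of children | | | | 0.074 (0.060) | 0.072 (0.061) | 0.076 (0.061) |
| Number of children not reported | | | | 0.489 (1.106) | 0.547 (1.095) | 0.484 (1.114) |
| Metal roof | | | | −0.041 (0.363) | −0.041 (0.367) | −0.025 (0.369) |
| Pseudo R² | 0.038 | 0.042 | 0.044 | 0.082 | 0.082 | 0.086 |

Outcome: More than one sex partner

| Variable | Bean count | 0,1,2–10 beans | 0,1,2–4,5–10 beans | Bean count, with covariates | 0,1,2–10 beans, with covariates | 0,1,2–4,5–10 beans, with covariates |
|---|---|---|---|---|---|---|
| Bean count (a) | 0.105*** (0.037) | | | 0.197** (0.080) | | |
| One bean (a) | | 0.137 (0.223) | 0.137 (0.223) | | 0.662 (0.414) | 0.665 (0.413) |
| 2–10 beans (a) | | 0.585*** (0.174) | | | 1.285*** (0.369) | |
| 2–4 beans (a) | | | 0.586*** (0.196) | | | 1.345*** (0.365) |
| 5–10 beans (a) | | | 0.584** (0.257) | | | 1.157* (0.596) |
| Age in 2006 | | | | −0.087 (0.099) | −0.087 (0.102) | −0.087 (0.102) |
| Age squared in 2006 | | | | 0.000 (0.001) | 0.000 (0.001) | 0.000 (0.001) |
| Moslem | | | | 0.853* (0.468) | 0.878* (0.464) | 0.886* (0.466) |
| No school (b) | | | | −0.485 (0.605) | −0.551 (0.610) | −0.571 (0.618) |
| Primary school (b) | | | | −0.492 (0.397) | −0.514 (0.398) | −0.528 (0.398) |
| Resides in Balaka (b) | | | | 0.120 (0.439) | 0.118 (0.427) | 0.112 (0.426) |
| Resides in Rumphi (b) | | | | −0.405 (0.444) | −0.314 (0.444) | −0.313 (0.443) |
| Polygamous | | | | 6.686*** (0.441) | 6.809*** (0.453) | 6.816*** (0.450) |
| Number of children | | | | 0.094 (0.075) | 0.092 (0.075) | 0.091 (0.075) |
| Number of children not reported | | | | 1.639** (0.718) | 1.747** (0.722) | 1.766** (0.726) |
| Metal roof | | | | −0.206 (0.389) | −0.172 (0.396) | −0.170 (0.397) |
| Pseudo R² | 0.007 | 0.010 | 0.010 | 0.646 | 0.653 | 0.653 |

Standard errors in parentheses. * p < 0.10; ** p < 0.05; *** p < 0.01.

a. MDICP respondents reported their subjective belief about being HIV-infected using a bean-counting measure on a 0–10 scale, where more beans represented higher likelihood.

b. The omitted categories are: secondary school or some years of higher education; resides in Mchinji; assigned zero beans to the likelihood of being infected.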

Any simple within estimator, whether linear or nonlinear (e.g. fixed effects logit) would also be biased, because the assumptions required to justify those estimators preclude feedback from lagged behavior on current beliefs. Nevertheless, for purposes of comparison, we also report standard fixed-effect logit estimates for the two risky behavior measures (see Table A1 in the supplementary Web Appendix). Most of the estimated coefficients associated with the belief variables are positive but statistically insignificant, possibly because identification comes only from individuals who switch their behavior status from one period to the other, reducing the effective sample size, or because of endogeneity bias.

5.3 Estimated Causal Effects

We next report estimates based on model (3) using the preferred Arellano and Carrasco (2003) methodology, which properly addresses belief endogeneity. The estimation requires that we construct cells based on a conditioning vector that includes lagged belief measures and age. In principle, cells could be constructed separately for all possible values of the discrete covariates; in practice, this procedure would lead to too many small cells. For this reason, we aggregate some of the cell categories and, following the recommendation in Arellano and Carrasco, exclude very small cells (consisting of one or two individuals) from the estimation. A systematic procedure for cell aggregation that would allow us to optimally select moments and handle small cells is not available for this setting and is beyond the scope of this paper. We define the cells by first dividing individuals into age quintiles and also according to aggregated belief categories. To check sensitivity, we consider two alternative aggregations of the bean-counting measure used to represent beliefs: 0,1,2–10 beans and 0,1,2–4,5–10 beans. Although the cells are defined based on aggregate categories, we use the disaggregated age and belief measures (actual bean counts) in forming the differences that enter the estimation moments.
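The cell construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the record fields (`id`, `age`, `lagged_beans`), the nearest-rank rule for quintile boundaries and the `min_size` parameter are hypothetical choices made for the example. Only the two bean aggregations and the exclusion of cells with one or two individuals come from the text.

```python
from collections import defaultdict


def belief_category(beans, scheme):
    """Map a 0-10 bean count to an aggregated belief category."""
    if beans <= 1:
        return str(beans)                     # 0 and 1 beans keep their own category
    if scheme == "0,1,2-10":
        return "2-10"
    if scheme == "0,1,2-4,5-10":
        return "2-4" if beans <= 4 else "5-10"
    raise ValueError(f"unknown scheme: {scheme}")


def quintile_cutoffs(ages):
    """Four quintile boundaries by the nearest-rank rule (an assumed convention)."""
    s = sorted(ages)
    n = len(s)
    return [s[max(0, (k * n + 4) // 5 - 1)] for k in range(1, 5)]


def age_quintile(age, cutoffs):
    """Return 0-4: the number of cutoffs strictly below the given age."""
    return sum(age > c for c in cutoffs)


def build_cells(people, scheme, min_size=3):
    """Group people by (age quintile, lagged belief category).

    Cells with fewer than min_size members (one or two individuals) are dropped.
    """
    cutoffs = quintile_cutoffs([p["age"] for p in people])
    cells = defaultdict(list)
    for p in people:
        key = (age_quintile(p["age"], cutoffs),
               belief_category(p["lagged_beans"], scheme))
        cells[key].append(p["id"])
    return {k: v for k, v in cells.items() if len(v) >= min_size}
```

Switching `scheme` between the two aggregations changes only the belief dimension of the cell key, which is why the number of usable cells differs across the columns of Table 4.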

Panel A of Table 4 reports the estimated coefficients obtained for the extramarital sex outcome under two alternative specifications. Both specifications include linear terms in beliefs and age. The second augments the first to include quadratic terms in age and beliefs. A joint test of the statistical significance of the belief variables in the quadratic specification shows that they are statistically significant at a 5% level. The estimates indicate that the impact of beliefs on risky behavior is statistically significant and that people reporting higher beliefs of being HIV positive are less likely to engage in extramarital sex.

Table 4. GMM estimation of the effects of beliefs on the propensity to engage in risky sex

Panel A: Extramarital sex

| | Linear | Linear | Quadratic | Quadratic |
|---|---|---|---|---|
| Bean count (b) | −1.552*** | −3.168*** | 0.303 | 0.145 |
| Age squared | | | 0.008 | 0.008 |
| Bean count squared (b) | | | −1.361* | −1.461** |
| Bean aggregation used (a) | 0,1,2–10 | 0,1,2–4,5–10 | 0,1,2–10 | 0,1,2–4,5–10 |
| Number of cells used in GMM | 23 | 27 | 23 | 27 |

Panel B: More than one sex partner

| | Linear | Linear | Quadratic | Quadratic |
|---|---|---|---|---|
| Bean count (b) | −0.421*** | −0.767*** | −0.193 | −0.322 |
| Age squared | | | 0.007 | 0.007 |
| Bean count squared (b) | | | −0.311 | −0.358* |
| Bean aggregation used (a) | 0,1,2–10 | 0,1,2–4,5–10 | 0,1,2–10 | 0,1,2–4,5–10 |
| Number of cells used in GMM | 27 | 32 | 27 | 32 |

* p < 10%; ** p < 5%; *** p < 1%.

a. The estimates are reported for the two different bean aggregation schemes used in implementing the GMM procedure. The age categories are aggregated into quintiles.

b. MDICP respondents reported their subjective belief about being HIV-infected using a bean-counting measure on a 0–10 scale, where more beans represented higher likelihood.

Panel B of Table 4 shows analogous results for the models where the outcome variable is having multiple sex partners. For both belief aggregations and for the linear model, the coefficient on beliefs is negative and highly significant: people reporting higher beliefs of being HIV-positive are less likely to have more than one partner. In the quadratic specification, higher beliefs lead to less risky behavior. The coefficients on the linear and quadratic terms are jointly significant at a 5% level.

To aid in interpretation of the coefficient estimates, Table 5 reports the marginal effects of changes in beliefs (indicated in the table) for both the linear and quadratic specifications on the probability of engaging in extramarital sex. The estimates imply that revising beliefs upward decreases risk-taking. For example, an individual who changes beliefs from a measure of 2 beans (corresponding to 20%, the average in 2006 for HIV-positive respondents), to a measure of 10 beans (corresponding to 100%) would decrease the probability of having extramarital sex by 5.2 percentage points in 2006 (according to the linear index specification and the 0,1,2–10 bean aggregation; see Panel A of Table 5). The estimates also indicate that revising beliefs downward increases risk-taking. Someone who decreases their belief from a measure of one bean (10%, the average number reported in 2006 by HIV-negative individuals), to zero increases the probability of extramarital sex by 5.8 percentage points in 2006 (again for the linear specification and 0,1,2–10 aggregation of beans). If the model is estimated on men who are younger than age 50, the estimated marginal effects are slightly larger.

Table 5. Average marginal effects implied by estimated coefficients in Table 4

Columns: Bean change (c); Bean aggregation 0,1,2–10 (a): Extramarital sex, More than 1 partner; Bean aggregation 0,1,2–4,5–10 (a): Extramarital sex, More than 1 partner

Panel A: Extramarital sex
Panel B: More than one sex partner

a. The estimates are reported for the two different bean aggregations used in implementing the GMM procedure. The age categories are always aggregated into quintiles.

b. The marginal effects are obtained for each individual in the 2006 and 2008 samples and are averaged across individuals.

c. MDICP respondents reported their subjective belief about being HIV-infected using a bean-counting measure on a 0–10 scale, where more beans represented higher likelihood.

For comparison, we also estimate a linear regression of changes in the binary outcomes (ΔYit) on changes in beliefs (ΔBit) and other regressors (ΔXit), using the cell indicators Zi as instruments.⁷ The resulting estimates are within the range of marginal effects reported in Table 5, but the linear model constrains the effect to be constant regardless of the magnitude of the change in beliefs. A linear regression of the change in beliefs on the covariates and Zi gives an F-statistic for the exclusion of the instruments (i.e. cells) of 16.51 and an R² of 0.3678 for the first aggregation scheme. The F-statistic and R² for the second aggregation scheme are 15.29 and 0.3478, respectively. We obtain comparable belief coefficients in the equations of interest. For the extramarital affairs indicator, the estimated coefficient is −0.05 for the specification with the 0,1,2–10 bean aggregation and −0.08 for the one using the 0,1,2–4,5–10 bean aggregation. Similarly, for the indicator of more than one partner, we get −0.001 for the specification with the 0,1,2–10 bean aggregation and −0.02 with the 0,1,2–4,5–10 bean aggregation.
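The idea of instrumenting the change in beliefs with cell indicators can be sketched in a stripped-down form. The code below is an illustration only: it uses plain two-stage least squares with a single regressor, omits the other regressors ΔXit and the weighting matrix mentioned in footnote 7, and all function and variable names are hypothetical. With a full set of cell dummies as instruments, the first-stage fitted value for each individual is simply the cell mean of the change in beliefs.

```python
from collections import defaultdict


def cell_means(values, cells):
    """Mean of `values` within each cell label."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, c in zip(values, cells):
        sums[c] += v
        counts[c] += 1
    return {c: sums[c] / counts[c] for c in sums}


def tsls_slope(d_y, d_b, cells):
    """2SLS slope of d_y on d_b (with intercept), using cell dummies as instruments."""
    means = cell_means(d_b, cells)
    fitted = [means[c] for c in cells]        # first stage: cell means of d_b
    n = len(d_y)
    f_bar = sum(fitted) / n
    y_bar = sum(d_y) / n
    # Second stage: OLS of d_y on the first-stage fitted values.
    num = sum((f - f_bar) * (y - y_bar) for f, y in zip(fitted, d_y))
    den = sum((f - f_bar) ** 2 for f in fitted)
    return num / den


if __name__ == "__main__":
    # Toy data: two cells, changes in behavior in {-1, 0, 1}, changes in beans.
    cells = ["A", "A", "B", "B"]
    d_b = [-1, 1, 1, 3]
    d_y = [1, 0, 0, -1]
    print(tsls_slope(d_y, d_b, cells))
```

Because the slope is a single number, this specification cannot let the effect vary with the size of the belief change, which is the restriction noted in the text.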

Many HIV testing programs seek to reduce risk-taking behaviors by providing individuals with information about their own HIV status. Our results show that the behavioral response with regard to risk-taking will depend on how beliefs change after receiving test results. The estimates indicate that individuals who revise their beliefs downward in response to a negative test would increase risk-taking and individuals who revise their beliefs upward in response to a positive test would decrease risk-taking.

5.4 Robustness

5.4.1 Misreporting

Because risky sexual behavior may be considered a sensitive subject, an obvious concern is misreporting. In this subsection, we explore the robustness of the previously estimated specification to allow for misreporting of risky behavior. To investigate the potential problem of misreporting, the MDICP team carried out a small set of qualitative interviews with men who had reported not having extramarital sex during the 1998 round of the survey. These follow-up interviews were very casual (no questionnaire or clipboard, typically no tape recorder) and were later transcribed by the principal investigators in the field. Slightly over 9% of those who had originally denied infidelity admitted otherwise in these informal interviews. Even though the reference period in the 1998 survey was longer and the men may tend to exaggerate in these casual conversations, this provides some evidence of under-reporting by the respondents during the more formal interviews.

To gain intuition into why misreporting leads to an attenuation bias in the estimated coefficients, consider a linear model. Under linearity, inline image and the estimated parameters are attenuated by α1 > 0. In our nonlinear case, inline image and misreporting leads to inline image (see also Hausman et al., 1998).
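The linear case can be worked out in two lines. The notation here is assumed for illustration and need not match the paper's: $\tilde{Y}_{it}$ denotes true behavior, $W_{it}$ the conditioning variables and $\beta$ the coefficients of a linear probability model. The sketch also assumes that, given true behavior, misreporting does not depend on $W_{it}$.

```latex
\begin{align*}
\Pr(Y_{it}=1 \mid \tilde{Y}_{it}=1) &= 1-\alpha_1,
\qquad
\Pr(Y_{it}=1 \mid \tilde{Y}_{it}=0) = 0, \\
E[Y_{it} \mid W_{it}]
  &= (1-\alpha_1)\,E[\tilde{Y}_{it} \mid W_{it}]
   = W_{it}'\,\bigl[(1-\alpha_1)\beta\bigr]
\quad \text{if } E[\tilde{Y}_{it} \mid W_{it}] = W_{it}'\beta .
\end{align*}
```

Under these assumptions a regression on reported behavior recovers $(1-\alpha_1)\beta$, which is smaller in magnitude than $\beta$ whenever $\alpha_1 > 0$; dividing by $1-\alpha_1$ undoes the attenuation.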

To assess the impact of under-reporting on our estimation results, we re-estimated the model for the extramarital sex measure of risky behavior assuming different levels of misreporting. We use the modified version of Arellano and Carrasco's estimator that is described in Section 4. The coefficient estimates for the linear index specification are shown in Table 6 (and for the quadratic specifications in Table A2 in the supplementary Web Appendix), for the two alternative belief aggregation levels and for varying levels of misreporting (α1). The first row displays the estimates presented in our main analysis (i.e. without misreporting) and subsequent rows display the estimates for higher levels of misreporting. We find that higher levels of misreporting lead to higher coefficient magnitudes.

Table 6. Estimated coefficients for effects of beliefs on the propensity to engage in extramarital sex, under varying levels of misreporting (α1). Linear specification

Columns: α1; Bean aggregation (a)

a. The estimates are reported for the two different bean aggregation schemes used in implementing the GMM estimation procedure. Age categories are aggregated into quintiles.


5.4.2 Additional Regressors

We also investigate how the estimates are affected by the inclusion of additional covariates, namely reports on past behavior and perceived local HIV prevalence (see Tables A3 and A4 in the supplementary Web Appendix).⁸ In the theoretical model of Section 3, past behavior only influenced current behavior through belief updating. However, it could conceivably have an independent effect on current behavior, for example, by affecting search costs for finding extramarital partners (which were not incorporated into the theoretical model of Section 3).

Our previous estimations also assumed that perceived risk of HIV infection is held constant by inclusion of individual effects, motivated by the fact that actual local prevalence rates were stable from 2006 to 2008: the p-value for a test of equality across these two years in our sample is 0.4. Overall, the prevalence has been stable and might have even slightly decreased (as measured by the 2004 and 2010 DHS), but it is possible that individuals' beliefs about prevalence varied over time. For this reason, we also estimated a specification that includes past behavior and reported perceived local prevalence. The variable used to measure the perceived local prevalence rate is the respondents' answer to the following question: 'If we took a group of 10 people from this area, just normal people who you found working in the fields or in homes, how many of them do you think would now have HIV/AIDS?' We note that the average perceived prevalence is substantially above the prevalence in our sample, raising some concerns about this variable. The inclusion of this variable complicates the estimation procedure somewhat, because the cells used in the estimation now need to be constructed using these additional covariates. We base the new cells on quartiles of perceived prevalence, but the average number of individuals per cell still drops, for example, from 21 to fewer than 10 in the extramarital sex regressions once prevalence is included. The estimated effect of beliefs on risky behavior is still negative once prevalence is added; the coefficient is highly significant in the linear specification, and the belief terms are jointly significant in the quadratic specification.


This paper examines how beliefs about own HIV status affect decisions to engage in risky behavior, as measured by extramarital sex and having multiple sexual partners. We use a unique panel survey from Malawi that includes detailed longitudinal measures of subjective beliefs and behaviors. The men in our sample were given the opportunity to get tested for HIV in 2004, 2006 and 2008 and most availed themselves of the opportunity, often multiple times. Reported beliefs about the probability of being HIV-positive vary substantially, both geographically and over the time period covered by the data collection. Changes in reported beliefs do not always accord with test results.

Simple cross-sectional correlations suggest that individuals who believe they have a higher likelihood of being HIV-positive engage in riskier behaviors. These correlations do not have a causal interpretation though, because of unobserved heterogeneity and because one would expect beliefs to be updated to reflect additional risk posed by lagged behaviors. In a panel data setting, the correlation between current beliefs and lagged behaviors leads to a violation of conventional assumptions that regressors in all periods be independent of error terms (strict exogeneity). To take into account the endogeneity of the belief variable as well as individual unobserved heterogeneity, we use a semiparametric panel data estimator developed by Arellano and Carrasco (2003). The estimates indicate that downward revisions in beliefs lead to a higher propensity to engage in risky behaviors and that upward revisions in beliefs lead to a lower propensity. We also modified the Arellano and Carrasco (2003) estimator to incorporate reporting error, along the lines of Hausman et al. (1998). Reporting error attenuates the empirical estimates, but the estimates are robust over a range of plausible reporting error levels.

Our findings have important policy implications. They indicate that credibly informing people that they are HIV-negative, for example, through testing campaigns, can increase risky behavior. Also, in contexts where people overestimate their risk of being HIV-positive, providing more accurate information can lead them to take more risks. On the other hand, informing people about their HIV-positive status reduces risky behavior. The results imply that policy interventions that aim to inform people about HIV status will be more effective in reducing risky behavior when selectively targeted at people considered to be at higher risk of infection.

Lastly, our analysis does not examine how beliefs about own HIV status are formed and to what extent different types of policy interventions influence beliefs. The fact that beliefs do not align closely with HIV test results suggests a need for further study of the process underlying belief formation.


We thank Jere Behrman, Martina Kirchberger, Hans-Peter Kohler, Seth Richards, Susan Watkins and Nicholas Wilson for helpful comments. We also thank Philip Anglewicz for assistance in understanding the data and Arun Hendi for research assistance. The authors gratefully acknowledge financial support from a pilot grant funded by the National Institutes of Health—National Institute on Aging, Grant No. P30 AG-012836 (B. J. Soldo, PI), the National Institutes of Health-National Institute of Child Health and Human Development, Grant No. R24 HD-044964 (H. L. Smith, PI), and the Boettner Center for Pensions and Retirement Security at the University of Pennsylvania (O. S. Mitchell, Director). The findings, interpretations and conclusions expressed in this paper are entirely those of the authors, and do not necessarily represent the views of the World Bank, its Executive Directors, or the governments of the countries they represent.

  1. For recent surveys on the use of expectations data in development contexts, see Attanasio (2009) and Delavande et al. (2011).

  2. The probability of infection from a single sexual encounter is thought to be about 0.1% (see Gray et al., 2001).

  3. A possible alternative measure of risky behavior is reported condom use, but it is not available in the 2008 survey. Previous work finds that condom use (though not condom purchase) is relatively inelastic in Malawi. Only 7% of those individuals tested in 2004, for example, reported using condoms.

  4. One important problem in implementation is that inline image may be above one in small samples. To guard against this small-sample problem we use inline image.

  5. The data collection was funded by the National Institute of Child Health and Human Development (NICHD), grants R01-HD044228-01, R01-HD050142, R01-HD37276 and R01-HD/MH-41713-0. The MDICP has also been funded by the Rockefeller Foundation, grant RF-99009#199. Susan Watkins was the PI for the last three grants. Hans-Peter Kohler was the PI for the first two. Detailed information on this survey can be obtained at

  6. A number of individuals engaging in extramarital sex are only married in one of the rounds and thus are not used in the estimation sample for analyzing the extramarital affairs outcome. However, they are included in the analysis of the other risky behavior measure of having multiple partners.

  7. In keeping with the Arellano–Carrasco estimator, we use inline image as weighting matrix, but do not exclude the smallest cells.

  8. Another potentially relevant covariate is the belief about spousal HIV status. This variable is, however, highly correlated with belief about own HIV status: among the respondents in our sample, 91% in 2006 and 78% in 2008 report the same likelihood for own and spouse infection and the average difference is about 0.06 beans on the 0–10 scale used to measure subjective beliefs.