## 1. Unobserved Heterogeneity in Microdata

One of the most striking features of consumer microdata is the great heterogeneity in choice behaviour which is evident, even amongst economic agents which are otherwise similar in observable respects. This presents researchers with a difficult problem – how to model behaviour in a way which accommodates this heterogeneity and yet preserves theoretical consistency and tractability.

One rather robust response is to demand that everything should be explainable by the theory in terms of observables alone. This view is typified by Stigler and Becker (1977):

> Tastes neither change capriciously nor differ importantly between people.

The research agenda which follows from this view is one which tries to explain differences in observed behaviour without recourse to unobserved heterogeneity in tastes, but instead purely in terms of the theory and observable differences in constraints, characteristics of market goods and characteristics of agents. From this point of view, resorting to unobserved preference heterogeneity in order to rationalise behaviour is a cop-out; it is an admission of failure on the part of the theory.

From this perspective, it is therefore a matter of some regret that measures of fit in applied work on microdata are typically very low – that is, the theory performs poorly (Banks *et al.*, 1997; Lewbel and Pendakur, 2009, who report R^{2} as low as 20% in consumer demand microdata). As a result, the belief that unobserved heterogeneity is an inescapable and essential part of the modelling problem has become the dominant view in the profession. This approach was summarised by the joint 2000 Nobel laureates as follows:

> In the 1960s, rapidly increasing availability of survey data on individual behavior … focused attention on the variations in demand across individuals. It became important to explain these variations as part of consumer theory, rather than as *ad hoc* disturbances. (McFadden, 2000, Nobel Lecture)

> Research in microeconometrics demonstrated that it was necessary to be careful in accounting for the sources of manifest differences among apparently similar individuals. … This heterogeneity has profound consequences for economic theory and for econometric practice. (Heckman, 2000, Nobel Lecture)

In applied microeconometrics, the standard approach has been to pool data across agents and to model the behaviour of individuals as a combination of a common component and an idiosyncratic component which reflects unobserved heterogeneity. In its least sophisticated form, this amounts to interpreting additive error terms as unobserved preference heterogeneity parameters. Recently, it has become clear that such an approach typically requires a combination of assumptions on the functional form of the statistical model and the distribution of unobserved heterogeneity. Contributions here include McElroy (1987), Brown and Walker (1989), Lewbel (2001) and Lewbel and Pendakur (2009). Broadly, the current consensus on unobserved heterogeneity is that: it is a fundamental feature of consumer microdata; if neglected it makes econometric estimation and identification difficult; and it is rather hard to deal with convincingly, especially in non-linear models and where heterogeneity is not additively separable.

Whilst the dominant empirical methods, by and large, proceed by *pooling* agents, the approach which we develop here is based on *partitioning*. The spirit of pooling agents is to account for heterogeneity with a small number of extra parameters (e.g. one) per type or characteristic, as in fixed-effects models with many covariates. Here, many parameters (e.g. those relating to covariates) are shared across the agents in the pooled model, and each agent has one or more agent-specific parameters. Some pooling models have a continuum of types (Lewbel and Pendakur, 2009) and some have a small number of discrete types (Heckman and Singer, 1984) but pooling models share the feature that unobserved heterogeneity is an ‘add-on’ to the model shared by all agents.

In contrast, the spirit of partitioning is to allow each type to be arbitrarily different from every other, for example, by giving each type a completely different set of parameters governing the effects of covariates. We use restrictions resulting from the assumption that all agents are utility maximising to partition agents into groups which maximise different utility functions.

We work from the basis of revealed preference (RP) restrictions (Afriat, 1967; Diewert, 1973; Varian, 1982). At heart, RP restrictions are inequality restrictions on observables (prices, budgets and demands), which provide necessary and sufficient conditions for the existence of an unobservable (a well-behaved utility function representing the consumer's preferences which rationalises the data). RP restrictions are usually applied to longitudinal data on individual consumers and are used to check for the existence and *stability* of well-behaved preferences. In this article, we apply this kind of test to cross-sectional data on many different consumers (though, as we describe below, our idea applies to many contexts with optimising agents). In this context, RP restrictions are interpretable as a check for the *commonality* of well-behaved preferences.^{1}
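To fix ideas, the GARP test that these restrictions imply can be sketched in a few lines of code. This is our own illustration, not the implementation used in this article; the function and variable names are invented for exposition. The check computes the matrix of cross-expenditures, takes the transitive closure of the direct revealed-preference relation, and looks for a cycle that contains a strict preference.

```python
def satisfies_garp(prices, quantities, tol=1e-9):
    """Test whether T observed choices satisfy the Generalised Axiom of
    Revealed Preference, i.e. can be rationalised by a single well-behaved
    utility function (in the spirit of Afriat, 1967; Varian, 1982).

    prices, quantities: length-T sequences of K-vectors.
    """
    T = len(prices)
    dot = lambda p, q: sum(a * b for a, b in zip(p, q))
    # e[t][s] = p_t . q_s : cost of bundle s at the prices faced at t
    e = [[dot(prices[t], quantities[s]) for s in range(T)] for t in range(T)]
    # Direct revealed preference: q_t R0 q_s iff p_t.q_t >= p_t.q_s
    R = [[e[t][t] >= e[t][s] - tol for s in range(T)] for t in range(T)]
    # Transitive closure of R0 (Warshall's algorithm)
    for k in range(T):
        for t in range(T):
            for s in range(T):
                R[t][s] = R[t][s] or (R[t][k] and R[k][s])
    # GARP fails if q_t R q_s while q_s is strictly directly revealed
    # preferred to q_t (p_s.q_s > p_s.q_t)
    return not any(R[t][s] and e[s][s] > e[s][t] + tol
                   for t in range(T) for s in range(T))
```

For instance, two observations with prices (1, 2) and (2, 1) and chosen bundles (1, 2) and (2, 1) form a strict revealed-preference cycle and fail the test, whereas swapping the two bundles removes all revealed-preference relations and the test passes.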

Of course, this is a rather simplistic idea. The very notion that such a check might pass and that the choices of all of the consumers in a large microeconomic data set could be explained perfectly by a single common utility function is, as Lewbel (2001) points out, ‘implausibly restrictive’. The real problem is what to do if (or more likely when) the data do not satisfy the RP restrictions. It is important to recognise that there are many reasons that a model which assumes homogeneous preferences might fit the data poorly, including mistakes by the data collector (measurement error), mistakes by the individuals themselves (optimisation error) and mistakes by the theorist (specification error, which is to say applying the wrong model). The truth is doubtless a mixture of all three. This article focuses primarily on the last of these and in particular on the issue of preference heterogeneity: we ask how far we can get by assuming that unobserved preference heterogeneity is the sole cause of poor fit.^{2} Dean and Martin (2010) provide one type of solution along these lines: they show how to find the largest subset of the data that *do* satisfy (some of) the RP restrictions. However, their approach leaves some of the data as unexplained by the optimising model.

The contribution of this article is to provide a different (and complementary) set of strategies for the case where the pooled data violate the RP restrictions. Here, some amount of preference heterogeneity is necessary to model those data – we need more than just one utility function. The question is *how many do we need*? Is it very many (perhaps as many as there are observations), or just a few? This article shows how to find the minimum number of types (utility functions) necessary to fully explain all observed choices in a data set. In seeking the *minimum* number of utility functions necessary to rationalise behaviour, we follow Friedman's (1953) assertion that we do not want the true model, which may be unfathomably complex; rather, we want the simplest model that is not rejected by the data. Occam's Razor applies here: we know that we can fully explain behaviour with a model in which every agent is arbitrarily different from every other but that approach is not useful for modelling or predicting behaviour. Instead, our aim is to group agents into types to the maximum possible degree that is consistent with common preferences. If the minimum number of types (utility functions) is very large relative to the number of observations, then modelling strategies with a continuum of types, or with one type for each agent (such as fixed effects models), might be appropriate. In contrast, if the minimum number of types is small relative to the number of observations, then modelling strategies with a small number of discrete types, such as those found in macro-labour, education choice and empirical marketing models, might be better.
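To make the partitioning idea concrete, the following is a minimal first-fit sketch of our own devising (it is not the algorithm developed in this article): each consumer is assigned in turn to the first existing group whose pooled observations still satisfy GARP, and a new group is opened when none will do. Since finding the exact minimum is a combinatorial problem, a greedy pass like this only yields an upper bound on the number of types; all names here are illustrative.

```python
def _garp(prices, quantities, tol=1e-9):
    """Return True if the pooled observations satisfy GARP."""
    T = len(prices)
    dot = lambda p, q: sum(a * b for a, b in zip(p, q))
    e = [[dot(prices[t], quantities[s]) for s in range(T)] for t in range(T)]
    R = [[e[t][t] >= e[t][s] - tol for s in range(T)] for t in range(T)]
    for k in range(T):               # transitive closure (Warshall)
        for t in range(T):
            for s in range(T):
                R[t][s] = R[t][s] or (R[t][k] and R[k][s])
    return not any(R[t][s] and e[s][s] > e[s][t] + tol
                   for t in range(T) for s in range(T))

def first_fit_types(observations):
    """Greedy upper bound on the minimum number of utility-maximising types.

    observations: list of (price_vector, quantity_vector) pairs, one per
    consumer.  Returns a list of groups (lists of consumer indices) such
    that the pooled data within each group satisfy GARP.
    """
    groups = []
    for i, (p, q) in enumerate(observations):
        for g in groups:
            ps = [observations[j][0] for j in g] + [p]
            qs = [observations[j][1] for j in g] + [q]
            if _garp(ps, qs):        # consumer i is consistent with group g
                g.append(i)
                break
        else:
            groups.append([i])       # no existing group fits: open a new type
    return groups
```

Applied to a small example, two consumers whose pooled choices form a strict revealed-preference cycle land in separate groups, while a third consumer consistent with the first joins that group. The bound can depend on the order in which consumers are processed, so in practice one might try several orderings and keep the smallest partition.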

We argue that our approach offers two main benefits which may complement the standard approaches to unobserved heterogeneity in empirical work. First, it provides a framework for dealing with heterogeneity which is driven by an economic model of interest and it thereby provides a practical method of partitioning data so that the observations in each group are fully theory consistent. This contrasts with approaches wherein only part of the model (the part which excludes the unobserved heterogeneity) satisfies the theory.^{3} Second, it is elementary: our approach does not require statements about the empirical distributions of objects we cannot observe or functional structures about which economic theory is silent. This contrasts with the standard approach of specifying *a priori* both the distribution of unobserved preference heterogeneity parameters and its functional relationship with observed variables.

We implement our strategy using a cross-sectional data set of consumer microdata. These data happen to record milk purchases but, importantly, they contain individual-level price, quantity and product characteristics information, and so are ideal for the application of RP methods. We find that the number of types needed to explain completely all of the observed variation in consumption behaviour is quite small relative to the number of observations in our data. For our main application, with a cross-section data set of 500 observations of quantity vectors, we find that four or five types are enough. Furthermore, it seems that two-thirds of the data are consistent with a single type and two types are sufficient to model 85% of the observations.

This article is organised as follows. We begin with a description of the cross-sectional data on household expenditures and demographics which we use in this study. We then investigate whether these data might be rationalised by partitioning on observed variables which form the standard controls in microeconometric models of spending patterns. We then set out a simple method for partitioning on revealed preferences, and consider whether the results from these partitioning exercises can be a useful input to econometric modelling of the data. We then consider the problem of inferring the number of types in the population from which our sample is drawn. The final Section draws some conclusions.