Cross-classified multilevel models deal with data pertaining to two different non-hierarchical classifications. It is unclear how much interpenetration is needed for a cross-classified multilevel model to work well and to estimate the two higher-level effects reliably. The paper investigates this question and the properties of cross-classified multilevel logistic models under various survey conditions. The effects of different membership allocation schemes, total sample sizes, group sizes, number of groups, overall rates of response and the variance partitioning coefficient on the properties of the estimators and the power of the Wald test are considered. The work is motivated by an application to separate area and interviewer effects on survey non-response which are often confounded. The results indicate that limited interviewer dispersion (around three areas per interviewer) provides sufficient interpenetration for good estimator properties. Further dispersion yields only very small or negligible gains in the properties. Interviewer dispersion also acts as a moderating factor on the effect of the other simulation factors (sample size, the ratio of interviewers to areas, the overall probability and the variance values) on the properties of the estimators and test statistics. The results also indicate that a higher number of interviewers for a set number of areas and a set total sample size improves these properties.

Hospital performance metrics, often in the form of risk-adjusted hospital mortality rates, are increasingly being made available in the public domain to compare hospitals. Despite the proliferation of these metrics, uncertainty remains regarding their validity and reliability given the noise surrounding their underlying measures. The paper considers a quality measure of hospital performance developed by McClellan and Staiger which smooths within hospitals and over time, while remaining computationally straightforward. The McClellan and Staiger method improves on others by incorporating different measures of outcome, eliminating systematic bias arising from the heterogeneous mix of hospital outputs and the noise that is inherent in other measures of quality. The technique also allows the forecasting of future quality. Using English hospital episode statistics for the years 2000–2005 for acute myocardial infarction and hip replacement, we use this technique to return quality measures based on hospital fixed effects estimated from yearly cross-sectional patient level data, and vector auto-regressions estimated over time, which then combine information from different time periods and across conditions to produce robust hospital quality measures. Our results suggest that this method is well suited to measure and predict provider quality of care in the English setting.

We exploit variation across Italian regions in the implementation of region-specific tariffs within a prospective pay system for hospitals based on diagnosis-related groups to assess their effect on health and on the use of healthcare services. We consider survey data for the years 1993–2007 with information on both individuals’ perceived health and their utilization of healthcare services. The results suggest that the introduction of market incentives via a fixed price payment system did not lead to worse health perceptions. Instead, it marked a moderate decrease in hospitalization coupled with a clearer decrease in the utilization of emergency services. We also find mild evidence of reduced satisfaction with healthcare services among hospital patients. These effects were stronger for adoptions occurring when the national government also supported the market-based approach. Results are robust to sensitivity checks.

We propose a cross-classified mixed effects location–scale model for the analysis of interviewer effects in survey data. The model extends the standard two-way cross-classified random-intercept model (respondents nested in interviewers crossed with areas) by specifying the residual variance to be a function of covariates and an additional interviewer random effect. This extension provides a way to study interviewers’ effects on not just the ‘location’ (mean) of respondents’ responses, but additionally on their ‘scale’ (variability). It therefore allows researchers to address new questions such as ‘Do interviewers influence the variability of their respondents’ responses in addition to their average, and if so why?’. In doing so, the model facilitates a more complete and flexible assessment of the factors that are associated with interviewer error. We illustrate this model by using data from wave 3 of the UK Household Longitudinal Survey, which we link to a range of interviewer characteristics measured in an independent survey of interviewers. By identifying both interviewer characteristics in general, but also specific interviewers who are associated with unusually high or low or homogeneous or heterogeneous responses, the model provides a way to inform improvements to survey quality.

The paper concerns the statistical modelling of emergency service response times. We apply advanced methods from spatial survival analysis to deliver inference for data collected by the London Fire Brigade on response times to reported dwelling fires. Existing approaches to the analysis of these data have been mainly descriptive; we describe and demonstrate the advantages of a more sophisticated approach. Our final parametric proportional hazards model includes harmonic regression terms to describe how the response time varies with the time of day and shared spatially correlated frailties on an auxiliary grid for computational efficiency. We investigate the short-term effect of fire station closures in 2014. Although the London Fire Brigade are working hard to keep response times down, our findings suggest that there is a limit to what can be achieved logistically: the paper identifies areas around the now closed Belsize, Bow, Downham, Kingsland, Knightsbridge, Silvertown, Southwark, Westminster and Woolwich fire stations in which there should perhaps be some concern about the provision of fire services.
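
The harmonic regression terms mentioned above can be illustrated with a small sketch. It assumes a 24-hour period and two harmonics; neither choice is stated in the abstract, and the function name `harmonic_terms` is hypothetical. Each harmonic contributes a sine and a cosine column, so the fitted time-of-day effect is periodic by construction.

```python
import math

def harmonic_terms(hour_of_day, n_harmonics=2, period=24.0):
    """Return [sin, cos] pairs for harmonics 1..n_harmonics at the given time.

    Assumptions (not from the paper): a 24-hour period and two harmonics.
    """
    terms = []
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * math.pi * k * hour_of_day / period
        terms.append(math.sin(angle))
        terms.append(math.cos(angle))
    return terms

# Covariates for a call at 03:00 and for one exactly 24 hours later
row_a = harmonic_terms(3.0)
row_b = harmonic_terms(27.0)
```

Because the terms repeat every 24 hours, `row_a` and `row_b` agree up to floating-point error; these columns would enter the linear predictor of the hazard model alongside the other covariates.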

In education studies value added is by and large defined in terms of a test score distribution mean. As a result, every feature of the test score distribution other than this single summary is ignored. Developing a value-added definition that incorporates the entire conditional distribution of students' scores given school effects and control variables would produce a more complete picture of a school's effectiveness and as a result provide more accurate information that could better guide policy decisions. Motivated in part by the current debate surrounding the recent proposal of eliminating co-payment institutions as part of Chile's education reform, we provide a new definition of value added that is based on the quantiles of the conditional test score distribution. Further, we show that the quantile-based value added can be estimated within a quantile mixed model regression framework. We apply the methodology to Chilean standardized test data and explore how information garnered facilitates school effectiveness comparisons between public schools and those that are subsidized with and without co-payments.

We re-explore Abel-Smith and Townsend's landmark study of poverty in early post World War 2 Britain. They found a large increase in poverty between 1953–1954 and 1960, which was a period of relatively strong economic growth. Our re-examination is a first exploitation of the data extracted from the recent digitization of the Ministry of Labour's ‘Enquiry into household expenditure’ in 1953–1954. First we closely replicate their results. We find that Abel-Smith and Townsend's method generated a greater rise in poverty than other reasonable methods. Using contemporary standard poverty lines, we find that the relative poverty rate grew only a little at most, and the absolute poverty rate fell, between 1953–1954 and 1961, as might be expected in a period of rising real incomes and steady inequality. We also extend the poverty rate time series of Goodman and Webb back to 1953–1954.

Modelling relationships between individuals is a classical question in social sciences and clustering individuals according to the observed patterns of interactions allows us to uncover a latent structure in the data. The stochastic block model is a popular approach for grouping individuals with respect to their social comportment. When several relationships of various types can occur jointly between individuals, the data are represented by multiplex networks where more than one edge can exist between the nodes. We extend stochastic block models to multiplex networks to obtain a clustering based on more than one kind of relationship. We propose to estimate the parameters—such as the marginal probabilities of assignment to groups (blocks) and the matrix of probabilities of connections between groups—through a variational expectation–maximization procedure. Consistency of the estimates is studied. The number of groups is chosen by using the integrated completed likelihood criterion, which is a penalized likelihood criterion. Multiplex stochastic block models arise in many situations but our applied example is motivated by a network of French cancer researchers. The two possible links (edges) between researchers are a direct connection or a connection through their laboratories. Our results show strong interactions between these two kinds of connection and the groups that are obtained are discussed to emphasize the common features of researchers grouped together.

We present results of a survey experiment aimed at assessing context effects on reporting life satisfaction, exerted by raising awareness of fundamental life domains before eliciting overall life satisfaction, through questionnaire manipulations. Psychologists refer to similar context effects, generated by providing more details about the object of a subsequent evaluation, as ‘unpacking effects’. The longitudinal structure of our experimental design allows us to assess the effects of the questionnaire manipulation both between and within subject. In our sample of university students, asking subjects to report satisfaction with life domains before reporting overall satisfaction with life generates a robust unpacking effect, as it shifts upwards the subsequent mean overall life satisfaction evaluations. In addition, raising awareness about life domains significantly increases reliability and validity of self-reported life satisfaction, by reducing the dispersion of responses and increasing the association between life satisfaction and life domain evaluations. We also detect heterogeneous effects across subgroups of our sample—such as people with children or in bad health—and discuss implications of these findings for research on life satisfaction.

Ecological momentary assessment is used to measure subjects' mood and behaviour repeatedly over time, leading to intensive longitudinal data. Variability in ecological momentary assessment schedules creates an analytical challenge because predictors are measured more frequently than responses. We consider this problem in a study of the effect of stress on the cognitive function of telephone helpline nurses, where stress is measured for each call and cognitive outcomes are measured at the end of a shift. We propose a flexible structural equation model which can handle multiple levels of clustering, measurement error, time trends and mixed variable types.

In May 2004, Poland and seven other countries from central and eastern Europe joined the European Union. This led to a massive emigration from Poland, especially to the UK. However, relatively little is known about the magnitude of migration flows after the 2004 enlargement of the European Union. In the paper Labour Force Survey data from the sending and receiving countries are utilized in a Bayesian model to estimate migration flows. The estimates are further combined with the output of the ‘Integrated modelling of European migration’ model. The combined results with accompanying measures of uncertainty can be used to validate other reported estimates of migration flows from Poland to the UK.

The paper reviews the growing literature on responsive and adaptive designs for surveys. These designs encompass various methods for managing data collection, including front loading potentially difficult cases, tailoring data collection strategies to different subgroups, prioritizing effort according to estimated response propensities, imposing stop rules for ending data collection, monitoring key survey estimates throughout the field period, using two-phase or multiphase sampling for following up non-respondents and calculating indicators of non-response bias (such as the *R*-indicator) other than response rates to monitor and guide fieldwork. We give particular attention to efforts to evaluate these strategies experimentally or via simulations. Although the field seems to have embraced these new tools, most of the evaluation studies suggest they produce marginal reductions in cost and non-response bias. It is clearly difficult to lower survey costs without reducing some aspect of survey quality. Other issues limiting the effectiveness of these designs include weakly predictive auxiliary variables, ineffective interventions and slippage in the implementation of interventions in the field. These problems are not, however, unique to responsive or adaptive design. We give recommendations for improving such designs and for improving the management of data collection efforts in the current difficult environment for surveys.
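
As a rough illustration of the *R*-indicator mentioned above, the sketch below uses the common form R = 1 − 2·S(ρ), where S(ρ) is the standard deviation of estimated response propensities. It assumes equal design weights and takes the propensities as given; the function name `r_indicator` and the numbers are hypothetical.

```python
import statistics

def r_indicator(propensities):
    """R = 1 - 2 * S(rho), with S the sample standard deviation.

    Assumptions: equal design weights; propensities already estimated
    (for example from a response model on auxiliary variables).
    """
    return 1.0 - 2.0 * statistics.stdev(propensities)

# Made-up propensities: identical values give a fully balanced response
balanced = r_indicator([0.6, 0.6, 0.6, 0.6])
unbalanced = r_indicator([0.3, 0.5, 0.7, 0.9])
```

Identical propensities give R = 1; more variation in propensities pulls the indicator down, signalling less representative response.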

Competing risks consider time to first event and type of first event. This subdiscipline of survival analysis is challenging in that multiple hazards determine the outcome probabilities. The paper demonstrates that Nightingale and Farr were aware of these connections in their co-operative work in hospital epidemiology. At the fourth International Statistical Congress (in London, 1860), they suggested forms for reporting hospital mortality that were conceptually more complete than many reported competing risks analyses today.

Modern systems of official statistics require the timely estimation of area-specific densities of subpopulations. Ideally estimates should be based on precise geocoded information, which is not available because of confidentiality constraints. One approach for ensuring confidentiality is by rounding the geoco-ordinates. We propose multivariate non-parametric kernel density estimation that reverses the rounding process by using a measurement error model. The methodology is applied to the Berlin register of residents for deriving density estimates of ethnic minorities and aged people. Estimates are used for identifying areas with a need for new advisory centres for migrants and infrastructure for older people.

To meet the strategic goals and objectives for the 2020 census, the US Census Bureau must make fundamental changes to the design, implementation and management of the decennial census. The changes must build on the successes and address the challenges of the previous censuses. Of particular interest is to gauge the on-going quality of the census frames. We address this topic by discussing a set of statistical models for the *Master Address File* that will produce estimates of coverage error at levels of geography down to the block level. The distributions of added and deleted housing units in a block are used to characterize the undercoverage and overcoverage respectively. The data used are from the 2010 address canvassing operation. As will be shown, these distributions are highly right skewed with a very large proportion of 0 counts. Hence, we utilize zero-inflated regression modelling to determine the predicted distribution of additions and deletions. In addition to standard statistical measures, we gauge the performance of this model by simulating a 2010 address canvassing operation using a specified coverage level. We also discuss future maintenance and updating of this model.
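
To make the zero-inflation idea concrete, here is a minimal sketch of a zero-inflated count distribution. It assumes a Poisson count component, which the abstract does not specify (a negative binomial component would follow the same pattern); `zip_pmf`, `pi_zero` and `lam` are illustrative names, and the parameter values are made up.

```python
import math

def zip_pmf(k, pi_zero, lam):
    """Probability of count k under a zero-inflated Poisson distribution.

    pi_zero: probability of a structural zero (e.g. a block with no changes)
    lam:     Poisson mean for blocks that are not structural zeros
    Assumption: Poisson count part; the paper's exact specification may differ.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson

# Illustrative values only: 60% structural zeros, mean of 2 otherwise
p_zero = zip_pmf(0, 0.6, 2.0)
total = sum(zip_pmf(k, 0.6, 2.0) for k in range(60))
```

In a regression setting both `pi_zero` and `lam` would depend on block-level covariates through link functions; the sketch fixes them to show only the shape of the distribution.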

The rise in healthcare expenditures has raised doubts about the sustainability of health systems and instigated a discussion on their design. Policy making in this field requires a proper understanding of how healthcare expenditures evolve throughout an individual's lifetime, and of how they vary between individuals. Given the lack of data on healthcare expenditures during an individual's lifetime, we developed a new nearest neighbour resampling approach to construct realistic individual life cycles of healthcare expenditures based on cross-sectional data from the Netherlands. This approach provides insight into lifetime healthcare expenditures. Our main finding is that the inequality in lifetime healthcare expenditures is much smaller than the inequality as derived from cross-sectional healthcare expenditures.

Surveillance data collected on several hundred different infectious organisms over 20 years have revealed striking power relationships between their variance and mean in successive time periods. Such patterns are common in ecology, where they are referred to collectively as Taylor's power law. In the paper, these relationships are investigated in detail, with the aim of exploiting them for the descriptive statistical modelling of infectious disease surveillance data. We confirm the existence of variance-to-mean power relationships, with exponent typically between 1 and 2. We investigate skewness-to-mean relationships, which are found broadly to match those expected of Tweedie distributions, and thus confirm the relevance of the Tweedie convergence theorem in this context. We suggest that variance- and skewness-to-mean power laws, when present, should inform statistical modelling of infectious disease surveillance data, notably in descriptive analysis, model building, simulation and interval and threshold estimation, threshold estimation being particularly relevant to outbreak detection.
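
The variance-to-mean power relationship can be written as variance = a · mean^b, so the exponent b is the slope of a straight line on log-log axes. The sketch below estimates a and b by ordinary least squares on the logged values; this estimation choice, the function name `taylor_exponent` and the synthetic data are assumptions for illustration, not the paper's procedure.

```python
import math

def taylor_exponent(means, variances):
    """Estimate (a, b) in variance = a * mean**b by least squares on logs.

    Assumption: plain OLS on log-log values; requires positive means and variances.
    """
    xs = [math.log(m) for m in means]
    ys = [math.log(v) for v in variances]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_bar - b * x_bar)
    return a, b

# Synthetic, noise-free data (not surveillance counts): variance = 2 * mean**1.5
means = [1.0, 2.0, 5.0, 10.0, 50.0]
variances = [2.0 * m ** 1.5 for m in means]
a, b = taylor_exponent(means, variances)
```

Within the Tweedie family an exponent of 1 corresponds to Poisson-type variance and an exponent of 2 to gamma-type variance, so estimates between 1 and 2 fall in the compound Poisson-gamma range.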

Longitudinal monitoring of biomarkers is often helpful for predicting disease or a poor clinical outcome. We consider the prediction of both large and small for gestational age births by using longitudinal ultrasound measurements, and we attempt to identify subgroups of women for whom prediction is more (or less) accurate, should they exist. We propose a tree-based approach to identifying such subgroups, and a pruning algorithm which explicitly incorporates a desired type I error rate, allowing us to control the risk of false discovery of subgroups. The methods proposed are applied to data from the Scandinavian Fetal Growth Study and are evaluated via simulations.

We apply parametric and non-parametric regression discontinuity methodology within a multinomial choice setting to examine the effect of public healthcare user fee abolition on health facility choice by using data from South Africa. The non-parametric model is found to outperform the parametric model both in and out of sample, while also delivering more plausible estimates of the effect of user fee abolition (i.e. the ‘treatment effect’). In the parametric framework, treatment effects were relatively constant—around 10%—and that increase was drawn equally from home care and private care. In contrast, in the non-parametric framework treatment effects were largest for large (and poor) families located further from health facilities—approximately 5%. More plausibly, the positive treatment effect was drawn primarily from home care, suggesting that the policy favoured children living in poorer conditions, as those children received at least some minimum level of professional healthcare after the policy was implemented.

Sampling hidden populations with standard sampling methods is particularly challenging, mainly because of the lack of a sampling frame. Respondent-driven sampling is an alternative methodology that exploits the social contacts between peers to reach and weight individuals in these hard-to-reach populations. It is a snowball sampling procedure where the weight of the respondents is adjusted for the likelihood of being sampled due to differences in the number of contacts. The structure of the social contacts thus regulates the process by constraining the sampling within subregions of the network. We study the bias induced by network communities, which are groups of individuals more connected between themselves than with individuals in other groups, in the respondent-driven sampling estimator. We simulate different structures and response rates to reproduce real settings. We find that the prevalence of the estimated variable is associated with the size of the network community to which the individual belongs and observe that low degree nodes may be undersampled if the sample and the network are of similar size. We also find that respondent-driven sampling estimators perform well if response rates are relatively large and the community structure is weak, whereas low response rates typically generate strong biases irrespective of the community structure.
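
The weighting idea described above can be sketched with an inverse-degree estimator, in which each respondent is weighted by the reciprocal of the reported number of contacts. This is one common form of respondent-driven sampling estimator and is assumed here for illustration; the abstract does not say which estimator the paper studies, and the function name and data are hypothetical.

```python
def rds_inverse_degree_estimate(has_trait, degrees):
    """Estimate trait prevalence, weighting each respondent by 1 / degree.

    Assumption: inclusion probability is proportional to the number of
    contacts, so well-connected respondents are down-weighted.
    """
    weights = [1.0 / d for d in degrees]
    weighted_with_trait = sum(w for w, t in zip(weights, has_trait) if t)
    return weighted_with_trait / sum(weights)

# Made-up sample: the two respondents with the trait are also the best connected
has_trait = [True, False, True, False]
degrees = [10, 2, 5, 1]
estimate = rds_inverse_degree_estimate(has_trait, degrees)
naive = sum(has_trait) / len(has_trait)
```

In this toy sample the unweighted proportion is 0.5, while the weighted estimate is lower because the respondents with the trait were more likely to be recruited in the first place.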

The paper investigates the dependences between levels of severity of road traffic accidents, accounting at the same time for spatial and temporal correlations. The study analyses road traffic accidents data at ward level in England over the period 2005–2013. We include in our model multivariate spatially structured and unstructured effects to capture the dependences between severities, within a Bayesian hierarchical formulation. We also include a temporal component to capture the time effects and we carry out an extensive model comparison. The results show important associations in both spatially structured and unstructured effects between severities, and a downward temporal trend is observed for low and high levels of severity. Maps of posterior accident rates indicate elevated risk within big cities for accidents of low severity and in suburban areas in the north and on the southern coast of England for accidents of high severity. The posterior probability of extreme rates is used to suggest the presence of hot spots in a public health perspective.

The instability of ethnicity measured in the national census is found to have doubled from the period 1991–2001 to the period 2001–2011, using the Longitudinal Study that links a sample of individuals’ census records across time. From internal evidence and comparison with results from the Census Quality Survey and the Labour Force Survey, estimates are made of instability due to changing question wording, imputation of missing answers, proxy reporting, recording errors and changes in the allocation of write-in answers. Of the remaining instability, durable changes of ethnicity by individuals are thought to be considerably less common than changes due to a person's sense of identity not closely fitting the categories offered in the census question. The instability creates a net change in size of some ethnic groups that is usually small compared with the change in population between censuses from births, deaths and migration. Consequences for analysis of census aggregate and microdata are explored.

We present new auto-regressive logit models for forecasting the probability of a time series of financial asset returns exceeding a threshold. The models can be estimated by maximizing a Bernoulli likelihood. Alternatively, to account for the extent to which an observation does or does not exceed the threshold, we propose that the likelihood is based on the asymmetric Laplace distribution, which has been found to be useful for quantile estimation. We incorporate the exceedance probability forecasts within a new time varying extreme value approach to value at risk and expected shortfall estimation. We provide an empirical illustration using daily stock index data.

The analysis of national mortality trends is critically dependent on the quality of the population, exposures and deaths data that underpin death rates. We develop a framework that allows us to assess data reliability and to identify anomalies, illustrated, by way of example, using England and Wales population data. First, we propose a set of graphical diagnostics that help to pinpoint anomalies. Second, we develop a simple Bayesian model that allows us to quantify objectively the size of any anomalies. Two-dimensional graphical diagnostics and modelling techniques are shown to improve significantly our ability to identify and quantify anomalies. An important conclusion is that significant anomalies in population data can often be linked to uneven patterns of births of people in cohorts born in the distant past. In the case of England and Wales, errors of more than 9% in the estimated size of some birth cohorts can be attributed to an uneven pattern of births. We propose methods that can use births data to improve estimates of the underlying population exposures. Finally, we consider the effect of anomalies on mortality forecasts and annuity values, and we find significant effects for some cohorts. Our methodology has general applicability to other sources of population data, such as the Human Mortality Database.

Age and sex patterns of migration are essential for understanding drivers of population change and heterogeneity of migrant groups. We develop a hierarchical Bayesian model to estimate such patterns for international migration in the European Union and European Free Trade Association from 2002 to 2008, which was a period of time when the number of members expanded from 19 to 31 countries. Our model corrects for the inadequacies and inconsistencies in the available data and estimates the missing patterns. The posterior distributions of the age and sex profiles are then combined with a matrix of origin–destination flows, resulting in a synthetic database with measures of uncertainty for migration flows and other model parameters.

We consider two econometric problems when investigating the effect of family size on labour market outcomes using the popular twin birth instrument. The first is the potential for omitted variable bias caused by the fact that fertility treatments are linked to twin births and are typically unobserved. We present estimates that are corrected for this bias and find that it is comparatively small. Second, we show that the effects of twin-birth-induced variation in family size, as well as characteristics of the compliers, vary substantially with time passed since birth, which has consequences for the interpretation of estimates across samples and time.

We explore the existence of short- and long-term effects of retirement on health. Short-term effects are estimated with a regression discontinuity design which is robust to weak instruments and where the underlying assumptions of continuity of potential outcomes are uncontroversial. To identify the long-term effects we propose a parametric model which, under strong assumptions, can separate normal deterioration of health from the causal effects of retirement. We apply our framework to the British Household Panel Survey and find that retirement has little effect on health. However, our estimates suggest that retirement opens the gate to a sedentary life with an impoverished social component and this is a channel through which retirement could indirectly affect health in the long run.

We conduct a quasi-Monte-Carlo comparison of the recent developments in parametric and semiparametric regression methods for healthcare costs, both against each other and against standard practice. The population of English National Health Service hospital in-patient episodes for the financial year 2007–2008 (summed for each patient) is randomly divided into two equally sized subpopulations to form an estimation set and a validation set. Evaluating out-of-sample using the validation set, a conditional density approximation estimator shows considerable promise in forecasting conditional means, performing best for accuracy of forecasting and among the best four for bias and goodness of fit. The best performing model for bias is linear regression with square-root-transformed dependent variables, whereas a generalized linear model with square-root link function and Poisson distribution performs best in terms of goodness of fit. Commonly used models utilizing a log-link are shown to perform badly relative to other models considered in our comparison.

Some sociologists argue that non-intact family structures during childhood have a negative effect on adult children's civic engagement, since they undermine, and in some cases prevent, the processes and activities through which parents shape their children's political attitudes and orientations. In this paper, we evaluate this hypothesis on the basis of longitudinal data from the German Socio-Economic Panel. In a first step, we construct various measures of family structure during childhood and perform both cross-sectional and sibling difference analyses for various indicators of young adults' civic engagement. Both exercises reveal a significant negative relationship between growing up in a non-intact family and children's political engagement as adults. In a second step, we implement a novel technique—proposed by Oster—for evaluating robustness of results to omitted variable bias. The distinctive feature of this technique is that it accounts for both coefficient movements *and* movements in *R*²-values after the inclusion of controls. Results suggest that our estimates do not suffer from omitted variable bias.

The paper argues that we need more general statistical indices for the analysis of the European labour markets. First, the paper discusses some normative aspects that are implicit in the current definition of the employment rate, which is a fundamental policy target in the new strategy Europe 2020. Second, it proposes a class of generalized indices based on work intensity, as approximated by the total annual hours of work relative to a benchmark value. Third, it derives, in a consistent framework, household level employment indices. These indices provide a more nuanced picture of the European labour markets, which better reflects the diversity in the use of part-time and fixed term jobs as well as other factors affecting the allocation of work between and within households.

Multiple-imputation (MI) methods for imputing missing data in observational health studies with repeated measurements were evaluated with particular focus on incomplete time varying explanatory variables. Standard and random-effects imputation by chained equations, multivariate normal imputation and Bayesian MI were compared regarding bias and efficiency of regression coefficient estimates by using simulation studies. Flexibility of the methods in handling different types of variables (binary, categorical, skewed and normally distributed) and correlations between the repeated measurements of the incomplete variables were also compared. Multivariate normal imputation produced the least bias in most situations, is theoretically well justified and allows flexible correlation for the repeated measurements. It can be recommended for imputing continuous variables. Bayesian MI is efficient and may be preferable in the presence of categorical and non-normally distributed continuous variables. Imputation by chained equations approaches were sensitive to the correlation between the repeated measurements. The moving time window approach may be used for normally distributed continuous variables with auto-regressive correlation.

Using Bayesian Markov chain clustering analysis we investigate career paths of Austrian women after their first birth. This data-driven method allows characterizing long-term career paths of mothers over up to 19 years by transitions between parental leave, non-employment and different forms of employment. We classify women into five cluster groups with very different long-run career trajectories after childbearing. We further model group membership with a multinomial specification within the finite mixture model. This approach gives insights into the determinants of long-run outcomes. In particular, giving birth at an older age appears to be associated with very diverse outcomes: it is related to higher odds of dropping out of the labour force, on the one hand, but also to higher odds of reaching a high wage career track, on the other hand.

Recently, various indicators have been proposed as indirect measures of non-response error in surveys. They employ auxiliary variables, external to the survey, to detect non-representative or unbalanced response. A class of designs known as adaptive survey designs maximizes these indicators by applying different treatments to different subgroups. The natural question is whether the decrease in non-response bias that is caused by adaptive survey designs could also be achieved by non-response adjustment methods. We discuss this question and provide theoretical and empirical considerations, supported by a range of household and business surveys. We find evidence that more balanced response coincides with less non-response bias, even after adjustment.

We show how a moment-based estimation procedure can be used to compute point estimates and standard errors for the two components of the widely used Olley–Pakes decomposition of aggregate (weighted average) productivity. When applied to business level microdata, the procedure allows for autocovariance and heteroscedasticity robust inference and hypothesis testing about, for example, the coevolution of the productivity components in different groups of firms. We provide an application to Finnish firm level data and find that formal statistical inference casts doubt on the conclusions that one might draw on the basis of a visual inspection of the components of the decomposition.
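
For readers unfamiliar with the decomposition, the sketch below computes its two components for a toy set of firms: the unweighted mean productivity and the covariance-type term between activity shares and productivity, which together sum to the share-weighted aggregate. It illustrates the decomposition only, not the moment-based estimation or the standard errors that the paper develops; the function name `olley_pakes` and the example data are hypothetical.

```python
def olley_pakes(productivity, weights):
    """Split share-weighted aggregate productivity into two components.

    productivity: firm-level productivity values
    weights: non-negative firm sizes (e.g. output or employment),
             normalised here to shares that sum to one
    Returns (aggregate, unweighted_mean, covariance_term), where
    aggregate == unweighted_mean + covariance_term up to rounding.
    """
    n = len(productivity)
    if n == 0 or n != len(weights):
        raise ValueError("inputs must be non-empty and of equal length")
    total_weight = sum(weights)
    shares = [w / total_weight for w in weights]
    mean_prod = sum(productivity) / n
    mean_share = 1.0 / n  # shares sum to one, so their mean is 1/n
    cov_term = sum(
        (s - mean_share) * (p - mean_prod)
        for s, p in zip(shares, productivity)
    )
    aggregate = sum(s * p for s, p in zip(shares, productivity))
    return aggregate, mean_prod, cov_term


# Hypothetical example: the most productive firm has the largest share.
aggregate, mean_prod, cov_term = olley_pakes([1.0, 2.0, 3.0], [0.2, 0.3, 0.5])
```

A positive covariance term indicates that more productive firms hold larger shares of activity.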

Macroeconomic indicators about the labour force, published by national statistical institutes, are predominantly based on rotating panels. Sample sizes of most labour force surveys in combination with the design-based or model-assisted mode of inference obstruct the publication of such indicators on a monthly frequency. Previous research proposed a multivariate structural time series model to obtain more precise model-based estimates by taking advantage of sample information observed in previous periods. In the paper this model is extended to use sample information from other domains or strongly correlated auxiliary series. A relatively parsimonious version of these models is currently used by Statistics Netherlands to produce official monthly figures about the labour force.

Sequence analysis is widely used in life course research and more recently has been applied by survey methodologists to summarize complex call record data. However, summary variables derived in this way have proved ineffective for post-survey adjustments, owing to weak correlations with key survey variables. We reflect on the underlying optimal matching algorithm and test the sensitivity of the output to input parameters or ‘costs’, which must be specified by the analyst. The results illustrate the complex relationship between these costs and the output variables which summarize the call record data. Regardless of the choice of costs, there was a low correlation between the summary variables and the key survey variables, limiting the scope for bias reduction. The analysis is applied to call records from the Irish Longitudinal Study on Ageing, which is a nationally representative, face-to-face household survey.
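
To illustrate how the analyst-specified costs enter optimal matching, here is a minimal sketch of the usual dynamic-programming distance between two call sequences, with a single insertion/deletion cost and a single substitution cost. The state codes ('N' for no contact, 'C' for contact) and the cost values are hypothetical, and the study may have used state-dependent substitution costs rather than the constant assumed here.

```python
def om_distance(seq_a, seq_b, indel=1.0, sub=2.0):
    """Optimal matching distance between two state sequences.

    indel: cost of inserting or deleting one state
    sub:   cost of substituting one state for a different one
    Returns the minimum total cost of turning seq_a into seq_b.
    """
    n_b = len(seq_b)
    # prev[j] holds the cost of matching the processed prefix of seq_a
    # with the first j states of seq_b.
    prev = [j * indel for j in range(n_b + 1)]
    for i in range(1, len(seq_a) + 1):
        cur = [i * indel] + [0.0] * n_b
        for j in range(1, n_b + 1):
            change = 0.0 if seq_a[i - 1] == seq_b[j - 1] else sub
            cur[j] = min(
                prev[j] + indel,       # delete a state from seq_a
                cur[j - 1] + indel,    # insert a state from seq_b
                prev[j - 1] + change,  # keep or substitute
            )
        prev = cur
    return prev[n_b]


# Hypothetical call records; the distance changes with the chosen costs.
cheap_sub = om_distance("NNC", "NCC", indel=1.0, sub=1.0)
costly_sub = om_distance("NNC", "NCC", indel=1.0, sub=3.0)
```

With these inputs a substitution cost above twice the indel cost is never used, because a deletion followed by an insertion is cheaper; this is one simple way in which the cost settings shape the resulting distances.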

We present substantial evidence for the existence of a bias in the distribution of births of leading US politicians in favour of those who were the eldest in their cohort at school. This result adds to the research on the long-term effects of relative age among peers at school. We discuss parametric and non-parametric tests to identify this effect, and we show that it is not driven by measurement error, redshirting or a sorting effect of highly educated parents. The magnitude of the effect that we estimate is larger than what other studies on ‘relative age effects’ have found for broader populations but is in general consistent with research that looks at professional sportsmen. We also find that relative age does not seem to correlate with the quality of elected politicians.

Research demonstrates that police reduce crime. We study this relationship by using a natural experiment in which a private university increased the number of police patrols within an arbitrarily defined geographic boundary. Capitalizing on the discontinuity in patrols at the boundary, we estimate that the extra police decreased crime in adjacent city blocks by 43–73%. Our results are consistent with findings from prior work that used other kinds of natural experiment. The paper demonstrates the utility of the geographic regression discontinuity design for estimating the effects of extra public or private services on a variety of outcomes.

The paper uses a symmetric entropy statistic to study income inequality. The index quantifies the information content of a two-way message that transforms the empirical income distribution into an egalitarian reference distribution, and then back to the original. This allows the measure to be interpreted as an average of *n* income-to-mean divergences such that the inequality estimate can be broken down into contributions across population subgroups. Various properties of the index are analysed and an application comparing the USA, Germany and Britain is provided. We focus on the sensitivity of inequality to the tails of the income distribution and show that the extreme right-hand tail accounts for a large and generally increasing proportion of total inequality. This result holds even if incomes are measured at the household level, averaged over a 5-year period and taken after government taxes and transfers.
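
The abstract does not give the formula for the index. As an assumption, the sketch below uses one well-known symmetric entropy measure, the sum of Theil's two indices, which can be written as the average over the *n* incomes of (y/mean − 1)·ln(y/mean). Each term is non-negative, so the total splits additively into subgroup contributions. The function names and the example incomes are hypothetical.

```python
import math


def divergence_terms(incomes):
    """Per-person income-to-mean divergences (y/mu - 1) * ln(y/mu).

    Assumes strictly positive incomes; each term is non-negative.
    """
    if not incomes or min(incomes) <= 0:
        raise ValueError("incomes must be non-empty and strictly positive")
    mu = sum(incomes) / len(incomes)
    return [(y / mu - 1.0) * math.log(y / mu) for y in incomes]


def symmetric_entropy_index(incomes):
    """Average of the n income-to-mean divergences."""
    terms = divergence_terms(incomes)
    return sum(terms) / len(terms)


def subgroup_contributions(incomes, groups):
    """Share of the index attributable to each subgroup label.

    groups: one label per income; contributions sum to the index.
    """
    terms = divergence_terms(incomes)
    n = len(terms)
    out = {}
    for term, label in zip(terms, groups):
        out[label] = out.get(label, 0.0) + term / n
    return out


# Hypothetical example: one high income in the right-hand tail.
incomes = [10.0, 12.0, 15.0, 100.0]
labels = ["rest", "rest", "rest", "top"]
index = symmetric_entropy_index(incomes)
parts = subgroup_contributions(incomes, labels)
```

In this toy example the contribution of the "top" label can be compared directly with the total to see how much of measured inequality comes from the right-hand tail.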