Several studies have shown that conversational interviewing (CI) reduces response bias for complex survey questions relative to standardized interviewing. However, no studies have addressed concerns about whether CI increases intra-interviewer correlations (IICs) in the responses collected, which could negatively impact the overall quality of survey estimates. The paper reports the results of an experimental investigation addressing this question in a national face-to-face survey. We find that CI improves response quality, as in previous studies, without substantially or frequently increasing IICs. Furthermore, any slight increases in the IICs do not offset the reduced bias in survey estimates engendered by CI.
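The intra-interviewer correlation at issue here can be illustrated with the standard survey-statistics identity (not taken from the paper): the IIC is the share of response variance attributable to interviewers, and it inflates the variance of an estimated mean through the usual design-effect formula. A minimal sketch:

```python
# Illustrative sketch (not the paper's estimator): the intra-interviewer
# correlation (IIC) as a variance ratio, and the design effect it implies
# for the variance of a mean when each interviewer completes m cases.

def iic(sigma2_between, sigma2_within):
    """IIC = share of response variance attributable to interviewers."""
    return sigma2_between / (sigma2_between + sigma2_within)

def design_effect(rho, m):
    """Variance inflation for a mean with interviewer workloads of size m."""
    return 1 + (m - 1) * rho

rho = iic(0.05, 0.95)          # 5% of variance between interviewers
print(rho)                     # 0.05
print(design_effect(rho, 21))  # 2.0: standard errors inflated by sqrt(2)
```

Even a small IIC can double the variance of an estimate when interviewer workloads are large, which is why the question the paper poses matters.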

Returns to education are variable both within and between educational groups. If uncertain pay-offs are a concern to individuals when selecting an education, wage variance is relevant. The variation is a combination of unobserved heterogeneity and pure uncertainty or risk. The first element is known to the individual but unknown to the researcher; the second is unknown to both. As a result, the variance of wages observed in the data will overestimate the real magnitude of educational uncertainty and the effect that risk has on educational decisions. We apply a semiparametric estimation technique to tackle the selectivity issues. This method does not rely on distributional assumptions about the errors in the schooling choice and wage equations. Our results suggest that risk is decreasing in schooling. Private information accounts for a share varying between 0% and 13% of the total wage variance observed, depending on the educational level. Finally, we conclude that the estimation results are very sensitive to the functional relation that is imposed on the error structure.

We examine forecasting performance of the recent fractionally cointegrated vector auto-regressive (FCVAR) model. We use daily polling data of political support in the UK for 2010–2015 and compare with popular competing models at several forecast horizons. Our findings show that the four variants of the FCVAR model considered are generally ranked as the top four models in terms of forecast accuracy, and the FCVAR model significantly outperforms both univariate fractional models and the standard cointegrated vector auto-regressive model at all forecast horizons. The relative forecast improvement is higher at longer forecast horizons, where the root-mean-squared forecast error of the FCVAR model is up to 15% lower than that of the univariate fractional models and up to 20% lower than that of the cointegrated vector auto-regressive model. In an empirical application to the 2015 UK general election, the estimated common stochastic trend from the model follows the vote share of the UK Independence Party very closely, and we thus interpret it as a measure of Euroscepticism in public opinion rather than an indicator of the more traditional left–right political spectrum. In terms of prediction of vote shares in the election, forecasts generated by the FCVAR model in the run-up to the election appear to provide a more informative assessment of the state of public opinion on electoral support than the hung Parliament prediction of the opinion polls.
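The fractional models compared here are built on the fractional-difference operator (1 − L)^d. A sketch of its textbook binomial expansion (this is the generic operator, not the authors' estimation code):

```python
import numpy as np

# Illustrative sketch of the fractional-difference operator (1 - L)^d that
# underlies fractional models such as FCVAR; this is the textbook binomial
# expansion, not the authors' estimation code.

def frac_diff_weights(d, n):
    """First n coefficients of (1 - L)^d: w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k."""
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w

def frac_diff(x, d):
    """Apply (1 - L)^d to a series, truncating the expansion at the sample start."""
    w = frac_diff_weights(d, len(x))
    return np.array([w[:t + 1][::-1] @ x[:t + 1] for t in range(len(x))])

w = frac_diff_weights(0.4, 4)
# w = [1, -0.4, -0.12, -0.064]: slowly decaying weights generate long memory
```

For d = 1 the weights reduce to (1, −1, 0, …), i.e. the ordinary first difference; for fractional d the hyperbolically decaying weights are what produce long memory in polling series.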

We contribute to the small, but important, literature exploring the incidence and implications of misreporting in survey data. Specifically, when modelling ‘social bads’, such as illegal drug consumption, researchers are often faced with exceptionally low reported participation rates. We propose a modelling framework where firstly an individual decides whether to participate or not and, secondly, for participants there is a subsequent decision to misreport or not. We explore misreporting in the context of the consumption of a system of drugs and specify a *multivariate inflated probit model*. Compared with observed participation rates of 12.2%, 3.2% and 1.3% (for use of marijuana, speed and cocaine respectively) the true participation rates are estimated to be almost double for marijuana (23%), and more than double for speed (8%) and cocaine (5%). The estimated chance that a user would misreport their participation is a staggering 65% for a hard drug like cocaine, and still about 31% and 17% for the softer drugs of marijuana and speed.
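The accounting logic behind such a two-stage model can be sketched in its simplest marginal form: an observed "yes" requires both true participation and truthful reporting. The numbers below are illustrative; the paper's multivariate inflated probit conditions on covariates, so its headline rates need not obey this marginal identity exactly.

```python
# Minimal sketch of the accounting identity behind a misreporting model:
# an observed "yes" requires both true participation and truthful reporting.
# Numbers are illustrative, not the paper's estimates.

def observed_rate(true_rate, misreport_prob):
    """P(report use) = P(use) * P(report truthfully | use)."""
    return true_rate * (1 - misreport_prob)

def implied_misreporting(observed, true_rate):
    """Back out the misreporting probability from observed and true rates."""
    return 1 - observed / true_rate

print(observed_rate(0.20, 0.40))         # 0.12: heavy under-reporting
print(implied_misreporting(0.12, 0.20))  # 0.4
```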

Recently personalized medicine and dynamic treatment regimes have drawn considerable attention. Dynamic treatment regimes are rules that govern the treatment of subjects depending on their intermediate responses or covariates. Two-stage randomization is a useful set-up to gather data for making inference on such regimes. Meanwhile, the number of clinical trials involving competing risk censoring has risen; in such trials, subjects are exposed to more than one possible failure and the specific event of interest may not be observed because of competing events. We aim to compare several treatment regimes from a two-stage randomized trial on survival outcomes that are subject to competing risk censoring. The cumulative incidence function (CIF) has been widely used to quantify the cumulative probability of occurrence of the target event over time. However, if we use only the data from those subjects who have followed a specific treatment regime to estimate the CIF, the resulting estimator may be biased. Hence, we propose alternative non-parametric estimators for the CIF by using inverse probability weighting, and we provide inference procedures including procedures to compare the CIFs from two treatment regimes. We show the practicality and advantages of the proposed estimators through numerical studies.
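The core idea of an inverse-probability-weighted CIF can be sketched in the simplest case with no censoring: the CIF for event type k is a weighted proportion, with weights 1/p_i up-weighting subjects whose observed treatment path is consistent with the regime of interest. This is a stripped-down illustration, not the authors' estimator.

```python
import numpy as np

# Illustrative sketch (not the authors' estimator): with no censoring, the
# cumulative incidence function for event type k is a weighted proportion,
# where inverse-probability weights up-weight subjects whose observed
# treatment path is consistent with the regime of interest.

def ipw_cif(times, causes, weights, k, t):
    """Weighted P(T <= t, cause = k)."""
    w = np.asarray(weights, dtype=float)
    ind = (np.asarray(times) <= t) & (np.asarray(causes) == k)
    return (w * ind).sum() / w.sum()

times   = np.array([1.0, 2.0, 3.0, 4.0])
causes  = np.array([1,   2,   1,   1])    # 1 = event of interest, 2 = competing
weights = np.array([2.0, 1.0, 1.0, 2.0])  # 1 / P(following the regime)

print(ipw_cif(times, causes, weights, k=1, t=3.0))  # (2 + 1) / 6 = 0.5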

Before adjustment, low and high frequency data sets from national accounts are frequently inconsistent. Benchmarking is the procedure used by economic agencies to make such data sets consistent. It typically involves adjusting the high frequency time series (e.g. quarterly data) so that they become consistent with the lower frequency version (e.g. annual data). Various methods have been developed to approach this problem of inconsistency between data sets. The paper introduces a new statistical procedure, namely wavelet benchmarking. Wavelet properties allow high and low frequency processes to be jointly analysed and we show that benchmarking can be formulated and approached succinctly in the wavelet domain. Furthermore the time and frequency localization properties of wavelets are ideal for handling more complicated benchmarking problems. The versatility of the procedure is demonstrated by using simulation studies where we provide evidence showing that it substantially outperforms currently used methods. Finally, we apply this novel method of wavelet benchmarking to official data from the UK's Office for National Statistics.
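The benchmarking constraint itself can be shown in its crudest form, pro-rata adjustment: scale each year's quarters so that they sum to the annual benchmark. This is far simpler than the paper's wavelet method (which distributes the adjustment smoothly across frequencies), but it illustrates the consistency requirement that any benchmarking procedure must satisfy; the figures are invented.

```python
import numpy as np

# The benchmarking constraint in its simplest form (a pro-rata sketch, far
# cruder than wavelet benchmarking): scale each year's quarters so that
# they sum to the annual benchmark.

def pro_rata_benchmark(quarterly, annual):
    """quarterly: length 4*n array; annual: length n array of benchmarks."""
    q = np.asarray(quarterly, dtype=float).reshape(-1, 4)
    a = np.asarray(annual, dtype=float)
    adjusted = q * (a / q.sum(axis=1))[:, None]
    return adjusted.ravel()

q = [10.0, 12.0, 11.0, 13.0,  9.0, 10.0, 10.0, 11.0]
a = [50.0, 44.0]   # inconsistent with the raw quarterly sums (46 and 40)
bq = pro_rata_benchmark(q, a)
print(bq.reshape(2, 4).sum(axis=1))  # [50. 44.]: consistency restored
```

Pro-rata adjustment introduces artificial steps between years; avoiding such distortions is precisely what motivates more sophisticated procedures like the wavelet approach.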

Attempts to create measures of national wellbeing and progress have a long history. In the UK, they go back at least as far as the 1790s, with Sir John Sinclair's *Statistical Account of Scotland*. More recently, worldwide interest has led to the creation of various indices seeking to go beyond familiar economic measures like gross domestic product. We review the ‘Measuring national well-being’ development programme of the UK's Office for National Statistics and explore some of the challenges which need to be faced to bring wider measures into use. These include the importance of getting the measures adopted as policy drivers, how to challenge the continuing dominance of economic measures, sustainability and environmental issues, international comparability and methodological statistical questions.

The existing literature that estimates the incidence of arrears relies on either household survey data or administrative data derived from lenders' records of their borrowers. However, these different sources yield different estimates of arrears. Moreover, the estimates are not useful for policy analysis or for a bank's lending decision, since they ignore the fact that some households do not borrow. The paper discusses the selection issues that are involved in using either source of data and is the first to bound the estimate of the household's underlying propensity to repay. To demonstrate the methodology, it uses data from the European Union Survey of Income and Living Conditions for 2008 to estimate the factors that affect repayment among Eurozone households.

We analyse the effect of income on mortality in Austria by using administrative social security data. To tackle potential endogeneity concerns arising in this context, we estimate time invariant firm-specific wage components and use them as instruments for actual wages. Although we find quantitatively small yet statistically significant effects in our naive least squares estimations, instrumental variables regressions reveal a robust zero effect of income on 10-year death rates for workers aged 40–60 years, both in terms of coefficient magnitude and narrow width of confidence intervals. These results are robust to various sample specifications and both linear and non-linear estimation methods.

In sample selection models, a treatment can influence the observed outcome in two ways: by affecting the binary selection or participation decision and by affecting the latent outcome. The former is called the ‘extensive margin effect’, and the latter is called the ‘intensive margin effect’. Despite the popularity of these effects, however, the intensive margin effect does not have the traditional causal parameter interpretation because it is conditioned on the selecting or participating decision, which is a post-treatment variable possibly affected by the treatment. The paper presents a causal framework for sample selection models and introduces various subpopulation effects. It is difficult to separate such effects in general; however, in certain popular models (nearly parametric sample selection models, semiparametric ‘independence models’, semiparametric zero-censored models and ‘polynomial approximation’ models) with linear latent equations, they are separately identified and easily estimable with probit and least squares estimators. An empirical analysis is provided to illustrate these causal effects in sample selection models.
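The "probit and least squares" estimation route the abstract mentions can be sketched with the classic Heckman-style two-step logic on simulated data: a probit for the selection equation, then least squares on participants with the inverse Mills ratio as a control. The simulated design and all names below are ours, not the paper's.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Sketch of the probit-plus-least-squares (Heckman-style) two-step logic;
# simulated design, not the paper's data or exact estimators.

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
u, e = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=n).T

s = (0.5 + x + u > 0).astype(float)    # selection / participation indicator
y = 1.0 + 2.0 * x + e                  # outcome, observed only when s == 1

Xs = np.column_stack([np.ones(n), x])

def probit_nll(b):
    p = np.clip(norm.cdf(Xs @ b), 1e-10, 1 - 1e-10)
    return -(s * np.log(p) + (1 - s) * np.log(1 - p)).sum()

# Step 1: probit for selection, then the inverse Mills ratio.
bhat = minimize(probit_nll, np.zeros(2)).x
idx = Xs @ bhat
mills = norm.pdf(idx) / norm.cdf(idx)

# Step 2: least squares on participants with the Mills ratio as a control.
sel = s == 1
Xo = np.column_stack([np.ones(sel.sum()), x[sel], mills[sel]])
beta = np.linalg.lstsq(Xo, y[sel], rcond=None)[0]
print(beta[:2])  # should be close to the true values (1.0, 2.0)
```

Omitting the Mills ratio term here would bias the least squares coefficients, which is the selection problem the causal framework in the paper is designed to address.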

It has previously been shown that, across three British birth cohorts, relative rates of intergenerational social class mobility have remained at an essentially constant level among men and also among women who have worked only full time. We establish the pattern of this prevailing level of social fluidity and its sources and determine whether it also persists over time, and we bring out its implications for inequalities in relative mobility chances. We develop a parsimonious model for the log-odds-ratios which express the associations between individuals’ class origins and destinations. This model is derived from a topological model that comprises three kinds of readily interpretable binary characteristics and eight effects in all, each of which does, or does not, apply to particular cells of the mobility table, i.e. effects of class hierarchy, class inheritance and status affinity. Results show that the pattern as well as the level of social fluidity are essentially unchanged across the cohorts, that gender differences in this prevailing pattern are limited and that marked differences in the degree of inequality in relative mobility chances arise with long-range transitions where inheritance effects are reinforced by hierarchy effects that are not offset by status affinity effects.

The paper investigates the group structure in a terrorist network through the latent class model and a Bayesian model comparison method for the number of latent classes. The analysis of the terrorist network is sensitive to the model specification. Under one model it clearly identifies a group containing the leaders and organizers, and the group structure suggests a hierarchy of leaders, trainers and ‘foot soldiers’ who carry out the attacks.

We investigate a long-debated question, which is how to create predictive models of recidivism that are sufficiently accurate, transparent and interpretable to use for decision making. This question is complicated as these models are used to support different decisions, from sentencing, to determining release on probation, to allocating preventative social services. Each case might have an objective other than classification accuracy, such as a desired true positive rate (TPR) or false positive rate (FPR). Each (TPR, FPR) pair is a point on the receiver operating characteristic (ROC) curve. We use popular machine learning methods to create models along the full ROC curve on a wide range of recidivism prediction problems. We show that many methods (support vector machines, stochastic gradient boosting and ridge regression) produce equally accurate models along the full ROC curve. However, methods that are designed for interpretability (classification and regression trees and C5.0) cannot be tuned to produce models that are both accurate and positioned along the full ROC curve. To handle this shortcoming, we use a recent method called supersparse linear integer models to produce accurate, transparent and interpretable scoring systems along the full ROC curve. These scoring systems can be used for decision making for many different use cases, since they are just as accurate as the most powerful black box machine learning models for many applications, but completely transparent and highly interpretable.
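The (TPR, FPR) trade-off that organizes the whole comparison can be made concrete with a tiny sketch: each threshold on a risk score yields one operating point on the ROC curve, so serving different decision problems means choosing (or, as in the paper, fitting models tuned to) different points. Scores and labels below are made up.

```python
import numpy as np

# Minimal sketch of ROC operating points: each threshold on a risk score
# gives one (TPR, FPR) pair. Scores and labels are illustrative.

def roc_point(scores, labels, threshold):
    pred = scores >= threshold
    tpr = pred[labels == 1].mean()   # true positive rate
    fpr = pred[labels == 0].mean()   # false positive rate
    return tpr, fpr

scores = np.array([0.9, 0.8, 0.7, 0.4, 0.3, 0.1])
labels = np.array([1,   1,   0,   1,   0,   0])

for thr in (0.5, 0.2):
    print(thr, roc_point(scores, labels, thr))
# thr 0.5 -> TPR 2/3, FPR 1/3; thr 0.2 -> TPR 1.0, FPR 2/3
```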

The aim of the paper is to assess how climate change is reflected in the variation of the seasonal patterns of the monthly central England temperature time series between 1772 and 2013. In particular, we model changes in the amplitude and phase of the seasonal cycle. Starting from the seminal work of Thomson, various studies have documented a shift in the phase of the annual cycle, implying an earlier onset of the spring season at various European locations. A significant reduction in the amplitude of the seasonal cycle is also documented. The literature so far has concentrated on the measurement of this phenomenon by various methods, among which complex demodulation and wavelet decompositions are prominent. We offer new insight by considering a model that allows for seasonally varying deterministic and stochastic trends, as well as seasonally varying auto-correlation and residual variances. The model can be summarized as containing a permanent and a transitory component, where global warming is captured in the permanent component, on which the seasons load differentially. The phase of the seasonal cycle, in contrast, seems to follow, in a stable manner, the trend that Thomson identified with Earth's precession. We identify the reported fluctuations as transitory.

The relationship between ageing, health and healthcare expenditures is of central importance to academics and public policy makers. Generally, it is observed that, with advancing age, health deteriorates and healthcare expenditures increase. This seems to imply that increases in life expectancy would strongly increase both the demand for healthcare and the number of years lived in poor health. Previous research has shown that such straightforward conclusions may be flawed. For example, it has been established that not age but ‘time to death’ is the main driver of increased healthcare expenditures at advanced ages. The paper extends this line of research by investigating the relationship between age, time to death and health, the last being longitudinally measured via a health-related quality-of-life questionnaire. We propose an approach for modelling the health-related quality-of-life outcome that accounts for both the non-standard nature of this response variable (e.g. bounded, left skewed or heteroscedastic) and the panel structure of the data. Analyses were performed within a Bayesian framework. We found that health losses are centred in the final phase of life, which indicates that future increases in longevity will not necessarily increase life years spent in poor health. This may alleviate the consequences of population ageing.

In a health context, dependence is defined as a lack of autonomy in performing the basic activities of daily living, requiring caregiving or significant help from another person. However, this contingency, if present, changes over one's lifetime. Empirical evidence shows that, once this situation occurs, it is almost impossible to return to the previous state and that in most cases the condition increases in intensity. In the paper, the evolution of the intensity of this situation is studied for the Spanish population affected by this contingency. Evolution in dependence can be seen as sparsely observed functional data, where we obtain a curve for each individual that is observed at only those points where changes in his or her condition of dependence occur. We use functional data analysis techniques, such as curve registration, functional data depth and distance-based clustering, to analyse this type of data. This approach proves to be useful in this context because it considers the dynamics of the dependence process and provides more meaningful conclusions than simple pointwise or multivariate analysis. We use the sample statistics obtained to predict the future evolution of dependence. The database analysed originates from the ‘Survey on disability, personal autonomy and dependence situations’ in Spain in 2008. The survey is the largest and most complete survey to be made available in Europe for the study of disability. In addition, the Spanish legislation is one of the most recent in Europe and provides a detailed quantitative scale to assess dependence. In the paper, the scale value according to this legislation has been calculated for each individual included in the survey. Differences by sex, age and time of first appearance were considered, and a prediction of the future evolution of dependence is obtained.

The paper employs a recently developed instrumental variable approach for the estimation of dynamic quantile regression models with fixed effects to model the dynamics of health outcomes. Our proposed estimator not only allows us to control for individual-specific heterogeneity via fixed effects in the dynamic quantile regression framework but may also reduce the bias that exists in conventional fixed effects estimation of dynamic quantile regression models with small numbers of time periods. Using data on the children of the US National Longitudinal Survey of Youth 1979 cohort, we examine the extent of true state dependence in youth depression conditional on unobserved individual heterogeneity and family socio-economic status. Our results suggest that true state dependence in youth depression among the survey respondents is very low and the observed positive association between previous and current depression is mainly due to time invariant unobserved individual heterogeneity.
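The building block of any quantile regression, dynamic or not, is minimization of the "check" (pinball) loss. A generic sketch on simulated data (this is plain quantile regression, not the authors' dynamic fixed effects IV estimator):

```python
import numpy as np
from scipy.optimize import minimize

# Sketch of the quantile regression building block: minimize the check
# (pinball) loss rho_tau(u) = u * (tau - 1{u < 0}). Generic illustration,
# not the authors' dynamic fixed effects IV estimator.

def check_loss(u, tau):
    return u * (tau - (u < 0))

def quantreg(X, y, tau):
    obj = lambda b: check_loss(y - X @ b, tau).sum()
    return minimize(obj, np.zeros(X.shape[1]), method="Nelder-Mead").x

rng = np.random.default_rng(1)
x = rng.normal(size=2000)
y = 1.0 + 0.5 * x + rng.normal(size=2000)
X = np.column_stack([np.ones(2000), x])

b_med = quantreg(X, y, tau=0.5)
print(b_med)  # near (1.0, 0.5): the conditional median line
```

Repeating the fit at different values of tau traces out how state dependence can vary across the distribution of the outcome, which is the attraction of the quantile framework for depression dynamics.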

The paper extends the latent promotion time cure rate marker model of Kim, Xi and Chen for right-censored survival data. Instead of modelling the cure rate parameter as a deterministic function of risk factors, they assumed that the cure rate parameter of a targeted population is distributed over a number of ordinal levels according to the probabilities governed by the risk factors. We propose to use a mixture of linear dependent tail-free processes as the prior for the distribution of the cure rate parameter, resulting in a latent promotion time cure rate model. This approach provides an immediate answer to perhaps one of the most pressing questions: ‘what is the probability that a targeted population has high proportions (e.g. greater than 70%) of being cured?’. The approach proposed can accommodate a rich class of distributions for the cure rate parameter, while remaining centred at gamma densities. The algorithms that are developed in this work allow the fitting of latent promotion time cure rate models with several survival models for metastatic tumour cells.

We analyse exchange rate pass-through into import prices for a large group of 33 emerging and developed economies from 1980, quarter 1, to 2010, quarter 4. Our error correction models permit asymmetric pass-through for currency appreciations and depreciations over three horizons of interest: on impact, in the short run and in the long run. We find that depreciations are typically passed through more strongly than appreciations in the long run, suggesting that exporters may exert a degree of long-run pricing power. This asymmetry is stronger in economies which are more import dependent but is moderated by freedom to trade and a positive output gap. Given that this pass-through asymmetry is welfare reducing for consumers in the destination market, a key macroeconomic implication is that import-dependent economies, in particular, can benefit from trade liberalization.
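The way asymmetric pass-through is typically allowed for can be sketched in a stripped-down, static regression: split exchange-rate changes into appreciations and depreciations and give each its own coefficient. The simulated data and coefficient values below are illustrative, not the paper's error correction models or estimates.

```python
import numpy as np

# Sketch of asymmetric pass-through: separate coefficients for positive
# (depreciation) and negative (appreciation) exchange-rate changes.
# Simulated data; a static caricature of the paper's error correction models.

rng = np.random.default_rng(2)
de = rng.normal(size=1000)                       # exchange-rate changes
dp = (0.8 * np.maximum(de, 0) + 0.4 * np.minimum(de, 0)
      + 0.1 * rng.normal(size=1000))             # import-price changes

X = np.column_stack([np.ones(1000), np.maximum(de, 0), np.minimum(de, 0)])
b = np.linalg.lstsq(X, dp, rcond=None)[0]
print(b[1], b[2])  # near 0.8 (depreciation) and 0.4 (appreciation)
```

A gap between the two slopes, as recovered here, is exactly the kind of long-run asymmetry the paper documents across its 33 economies.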

Dropouts and delayed graduations are critical issues in higher education systems worldwide. A key task in this context is to identify risk factors associated with these events, providing potential targets for mitigating policies. For this, we employ a discrete time competing risks survival model, dealing simultaneously with university outcomes and their associated temporal component. We define survival times as the duration of the student's enrolment at university and possible outcomes as graduation or two types of dropout (voluntary and involuntary), exploring the information recorded at admission time (e.g. educational level of the parents) as potential predictors. Although similar strategies have been previously implemented, we extend the previous methods by handling covariate selection within a Bayesian variable selection framework, where model uncertainty is formally addressed through Bayesian model averaging. Our methodology is general; however, here we focus on undergraduate students enrolled in three selected degree programmes of the Pontificia Universidad Católica de Chile during the period 2000–2011. Our analysis reveals interesting insights, highlighting the main covariates that influence students’ risk of dropout and delayed graduation.

Contactless credit cards and stored value cards are touted as a fast and convenient method of payment to replace cash at the point of sale. Cross-sectional approaches find a large effect of these retail payment innovations on cash usage (around 10%). Using a semiparametric panel model that accounts for unobserved heterogeneity and general forms of attrition, we find no significant effect for contactless credit cards and only a 2% reduction in cash usage stemming from single-purpose stored value cards. These results point to the uneven pace of payment innovation diffusion.

A pair of municipalities may consolidate services if they are contiguous. Traditional estimation methods assume that each voting process is independent. Instead we propose a new estimation procedure that allows the probability of consolidation to be influenced by neighbouring decisions. We extend a model of local interaction by allowing the consolidation efforts of neighbours to be either strategic substitutes or strategic complements. We disentangle direct effects arising from a change in one's own characteristics from indirect or spillover effects associated with a change in the other municipalities’ characteristics. Results reveal that the endogenous peer effect coming from neighbours is a primary determinant of willingness to consolidate.

In randomized controlled trials with non-adherence, instrumental variable (IV) methods are frequently used to report the complier average causal effect. With binary outcomes, many of the available IV estimation methods impose distributional assumptions. We develop a randomization-inference-based method of IV estimation for binary outcomes. The method is non-parametric and is based on Fisher's exact test, and estimates can be easily calculated from a set of 2×2 or 2×2×2 tables. Although we retain the standard IV identification assumptions for confidence regions and point estimates, the IV estimand under randomization inference is sample specific and does not assume that the randomized controlled trials participants are a random sample from the target population. We illustrate the method with the ‘IMPROVE’ trial that compares emergency endovascular *versus* open surgical repair for patients with ruptured aortic aneurysms.
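The complier average causal effect estimand that the 2×2 tables feed into can be sketched with the standard Wald ratio: the difference in outcome rates by random assignment divided by the difference in treatment-receipt rates. The numbers below are illustrative, not from the IMPROVE trial.

```python
# Sketch of the complier average causal effect (Wald) estimand from 2x2
# summaries: outcome-rate difference by assignment over receipt-rate
# difference. Illustrative numbers, not IMPROVE trial results.

def wald_iv(y1, y0, d1, d0):
    """y1, y0: outcome rates by arm; d1, d0: treatment-receipt rates by arm."""
    return (y1 - y0) / (d1 - d0)

# Assignment raises receipt from 10% to 80% and lowers the outcome rate by
# 14 points, so the effect among compliers is -0.14 / 0.70:
print(wald_iv(y1=0.30, y0=0.44, d1=0.80, d0=0.10))  # -0.2
```

The randomization-inference approach in the paper inverts exact tests over a grid of such estimands rather than relying on distributional assumptions.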

Does universal preschool constitute an effective policy tool to promote the development and integration of children from minority groups? We address this question for the children of the Roma—the largest and most disadvantaged minority group in Europe. To tackle the issue of non-random selection into preschool, we exploit variation in the individual distance to the nearest preschool facility. Non-parametric instrumental variable estimations reveal significant short-term gains in terms of children's literacy. Preschool attendance also increases the prevalence of vaccinations but has no effect on other observed health outcomes. Overall, preschool does not seem to enhance integration as measured by children's language proficiency or social–emotional development, at least not in the short term.

Cross-classified multilevel models deal with data pertaining to two different non-hierarchical classifications. It is unclear how much interpenetration is needed for a cross-classified multilevel model to work well and to estimate the two higher-level effects reliably. The paper investigates this question and the properties of cross-classified multilevel logistic models under various survey conditions. The effects of different membership allocation schemes, total sample sizes, group sizes, number of groups, overall rates of response and the variance partitioning coefficient on the properties of the estimators and the power of the Wald test are considered. The work is motivated by an application to separate area and interviewer effects on survey non-response which are often confounded. The results indicate that limited interviewer dispersion (around three areas per interviewer) provides sufficient interpenetration for good estimator properties. Further dispersion yields only very small or negligible gains in the properties. Interviewer dispersion also acts as a moderating factor on the effect of the other simulation factors (sample size, the ratio of interviewers to areas, the overall probability and the variance values) on the properties of the estimators and test statistics. The results also indicate that a higher number of interviewers for a set number of areas and a set total sample size improves these properties.
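The variance partitioning coefficient that serves as a simulation factor here has a standard closed form for random-effect logistic models, where the level 1 variance is fixed at π²/3 on the latent logistic scale. A minimal sketch (the two-variance version for a cross-classified interviewer-and-area model; the values are illustrative):

```python
import math

# Sketch of the variance partition coefficient (VPC) for a cross-classified
# random-intercept logistic model: the share of latent-scale variance due to
# one classification, with the level 1 variance fixed at pi^2 / 3.
# Variance values are illustrative.

def logistic_vpc(sigma2_target, sigma2_other):
    return sigma2_target / (sigma2_target + sigma2_other + math.pi ** 2 / 3)

# e.g. interviewer variance 0.5, area variance 0.25:
print(round(logistic_vpc(0.5, 0.25), 3))  # 0.124
```

Varying these variance inputs is one of the simulation factors whose interaction with interviewer dispersion the paper examines.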

Hospital performance metrics, often in the form of risk-adjusted hospital mortality rates, are increasingly being made available in the public domain to compare hospitals. Despite the proliferation of these metrics, uncertainty remains regarding their validity and reliability given the noise surrounding their underlying measures. The paper considers a quality measure of hospital performance developed by McClellan and Staiger which smooths within hospitals and over time, while remaining computationally straightforward. The McClellan and Staiger method improves on others by incorporating different measures of outcome, eliminating systematic bias arising from the heterogeneous mix of hospital outputs and the noise that is inherent in other measures of quality. The technique also allows the forecasting of future quality. Using English hospital episode statistics for the years 2000–2005 for acute myocardial infarction and hip replacement, we use this technique to construct quality measures based on hospital fixed effects estimated from yearly cross-sectional patient level data and on vector auto-regressions estimated over time, which combine information from different time periods and across conditions to produce robust hospital quality measures. Our results suggest that this method is well suited to measure and predict provider quality of care in the English setting.

We propose a cross-classified mixed effects location–scale model for the analysis of interviewer effects in survey data. The model extends the standard two-way cross-classified random-intercept model (respondents nested in interviewers crossed with areas) by specifying the residual variance to be a function of covariates and an additional interviewer random effect. This extension provides a way to study interviewers’ effects on not just the ‘location’ (mean) of respondents’ responses, but additionally on their ‘scale’ (variability). It therefore allows researchers to address new questions such as ‘Do interviewers influence the variability of their respondents’ responses in addition to their average, and if so why?’. In doing so, the model facilitates a more complete and flexible assessment of the factors that are associated with interviewer error. We illustrate this model by using data from wave 3 of the UK Household Longitudinal Survey, which we link to a range of interviewer characteristics measured in an independent survey of interviewers. By identifying both interviewer characteristics in general, but also specific interviewers who are associated with unusually high or low or homogeneous or heterogeneous responses, the model provides a way to inform improvements to survey quality.

The paper concerns the statistical modelling of emergency service response times. We apply advanced methods from spatial survival analysis to deliver inference for data collected by the London Fire Brigade on response times to reported dwelling fires. Existing approaches to the analysis of these data have been mainly descriptive; we describe and demonstrate the advantages of a more sophisticated approach. Our final parametric proportional hazards model includes harmonic regression terms to describe how the response time varies with the time of day and shared spatially correlated frailties on an auxiliary grid for computational efficiency. We investigate the short-term effect of fire station closures in 2014. Although the London Fire Brigade are working hard to keep response times down, our findings suggest that there is a limit to what can be achieved logistically: the paper identifies areas around the now closed Belsize, Bow, Downham, Kingsland, Knightsbridge, Silvertown, Southwark, Westminster and Woolwich fire stations in which there should perhaps be some concern about the provision of fire services.

In education studies, value added is by and large defined in terms of the mean of a test score distribution. As a result, everything except this single summary of the test score distribution is ignored. Developing a value-added definition that incorporates the entire conditional distribution of students' scores given school effects and control variables would produce a more complete picture of a school's effectiveness and, as a result, provide more accurate information that could better guide policy decisions. Motivated in part by the current debate surrounding the recent proposal of eliminating co-payment institutions as part of Chile's education reform, we provide a new definition of value added that is based on the quantiles of the conditional test score distribution. Further, we show that the quantile-based value added can be estimated within a quantile mixed model regression framework. We apply the methodology to Chilean standardized test data and explore how the information garnered facilitates school effectiveness comparisons between public schools and those that are subsidized with and without co-payments.

We re-explore Abel-Smith and Townsend's landmark study of poverty in early post-World War 2 Britain. They found a large increase in poverty between 1953–1954 and 1960, which was a period of relatively strong economic growth. Our re-examination is the first exploitation of the data extracted from the recent digitization of the Ministry of Labour's ‘Enquiry into household expenditure’ in 1953–1954. First, we closely replicate their results. We find that Abel-Smith and Townsend's method generated a greater rise in poverty than other reasonable methods. Using contemporary standard poverty lines, we find that the relative poverty rate grew only a little at most, and the absolute poverty rate fell, between 1953–1954 and 1961, as might be expected in a period of rising real incomes and steady inequality. We also extend the poverty rate time series of Goodman and Webb back to 1953–1954.

Modelling relationships between individuals is a classical question in social sciences and clustering individuals according to the observed patterns of interactions allows us to uncover a latent structure in the data. The stochastic block model is a popular approach for grouping individuals with respect to their social comportment. When several relationships of various types can occur jointly between individuals, the data are represented by multiplex networks where more than one edge can exist between the nodes. We extend stochastic block models to multiplex networks to obtain a clustering based on more than one kind of relationship. We propose to estimate the parameters—such as the marginal probabilities of assignment to groups (blocks) and the matrix of probabilities of connections between groups—through a variational expectation–maximization procedure. Consistency of the estimates is studied. The number of groups is chosen by using the integrated completed likelihood criterion, which is a penalized likelihood criterion. Multiplex stochastic block models arise in many situations but our applied example is motivated by a network of French cancer researchers. The two possible links (edges) between researchers are a direct connection or a connection through their laboratories. Our results show strong interactions between these two kinds of connection and the groups that are obtained are discussed to emphasize the common features of researchers grouped together.
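A minimal simulation sketch of a two-block multiplex stochastic block model follows, with illustrative (not estimated) parameter values and invented layer names; the variational expectation–maximization step is omitted, and a simple method-of-moments check stands in for estimation:

```python
import random

random.seed(0)

# Hypothetical two-block multiplex SBM. alpha: marginal block probabilities;
# pi[layer]: matrix of connection probabilities between blocks for that layer.
alpha = [0.6, 0.4]
pi = {
    "direct":     [[0.30, 0.05], [0.05, 0.25]],
    "laboratory": [[0.40, 0.10], [0.10, 0.35]],
}

n = 400
blocks = [0 if random.random() < alpha[0] else 1 for _ in range(n)]

# Sample each layer independently given the latent block memberships.
edges = {layer: set() for layer in pi}
for i in range(n):
    for j in range(i + 1, n):
        q, r = blocks[i], blocks[j]
        for layer, mat in pi.items():
            if random.random() < mat[q][r]:
                edges[layer].add((i, j))

# Method-of-moments check: recover the within-block-0 connection probability
# of the "direct" layer from the simulated network.
pairs = sum(1 for i in range(n) for j in range(i + 1, n)
            if blocks[i] == 0 and blocks[j] == 0)
hits = sum(1 for (i, j) in edges["direct"]
           if blocks[i] == 0 and blocks[j] == 0)
print(round(hits / pairs, 2))  # close to 0.30
```

In the paper the blocks are latent and both alpha and the pi matrices are estimated jointly by variational EM; the sketch only shows the generative side of the model.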

We present results of a survey experiment aimed at assessing context effects on reporting life satisfaction, exerted by raising awareness of fundamental life domains before eliciting overall life satisfaction, through questionnaire manipulations. Psychologists refer to similar context effects, generated by providing more details about the object of a subsequent evaluation, as ‘unpacking effects’. The longitudinal structure of our experimental design allows us to assess the effects of the questionnaire manipulation both between and within subject. In our sample of university students, asking subjects to report satisfaction with life domains before reporting overall satisfaction with life generates a robust unpacking effect, as it shifts upwards the subsequent mean overall life satisfaction evaluations. In addition, raising awareness about life domains significantly increases reliability and validity of self-reported life satisfaction, by reducing the dispersion of responses and increasing the association between life satisfaction and life domain evaluations. We also detect heterogeneous effects across subgroups of our sample—such as people with children or in bad health—and discuss implications of these findings for research on life satisfaction.

Ecological momentary assessment is used to measure subjects' mood and behaviour repeatedly over time, leading to intensive longitudinal data. Variability in ecological momentary assessment schedules creates an analytical challenge because predictors are measured more frequently than responses. We consider this problem in a study of the effect of stress on the cognitive function of telephone helpline nurses, where stress is measured for each call and cognitive outcomes are measured at the end of a shift. We propose a flexible structural equation model which can handle multiple levels of clustering, measurement error, time trends and mixed variable types.

In May 2004, Poland and seven other countries from central and eastern Europe joined the European Union. This led to a massive emigration from Poland, especially to the UK. However, relatively little is known about the magnitude of migration flows after the 2004 enlargement of the European Union. In the paper Labour Force Survey data from the sending and receiving countries are utilized in a Bayesian model to estimate migration flows. The estimates are further combined with the output of the ‘Integrated modelling of European migration’ model. The combined results with accompanying measures of uncertainty can be used to validate other reported estimates of migration flows from Poland to the UK.

The paper reviews the growing literature on responsive and adaptive designs for surveys. These designs encompass various methods for managing data collection, including front loading potentially difficult cases, tailoring data collection strategies to different subgroups, prioritizing effort according to estimated response propensities, imposing stop rules for ending data collection, monitoring key survey estimates throughout the field period, using two-phase or multiphase sampling for following up non-respondents and calculating indicators of non-response bias (such as the *R*-indicator) other than response rates to monitor and guide fieldwork. We give particular attention to efforts to evaluate these strategies experimentally or via simulations. Although the field seems to have embraced these new tools, most of the evaluation studies suggest they produce marginal reductions in cost and non-response bias. It is clearly difficult to lower survey costs without reducing some aspect of survey quality. Other issues limiting the effectiveness of these designs include weakly predictive auxiliary variables, ineffective interventions and slippage in the implementation of interventions in the field. These problems are not, however, unique to responsive or adaptive design. We give recommendations for improving such designs and for improving the management of data collection efforts in the current difficult environment for surveys.

Competing risks consider time to first event and type of first event. This subdiscipline of survival analysis is challenging in that multiple hazards determine the outcome probabilities. The paper demonstrates that Nightingale and Farr were aware of these connections in their co-operative work in hospital epidemiology. At the fourth International Statistical Congress (in London, 1860), they suggested forms for reporting hospital mortality that were conceptually more complete than many reported competing risks analyses today.

Modern systems of official statistics require the timely estimation of area-specific densities of subpopulations. Ideally such estimates should be based on precise geocoded information, which is not available because of confidentiality constraints. One approach to ensuring confidentiality is to round the geo-co-ordinates. We propose multivariate non-parametric kernel density estimation that reverses the rounding process by using a measurement error model. The methodology is applied to the Berlin register of residents for deriving density estimates of ethnic minorities and aged people. The estimates are used for identifying areas with a need for new advisory centres for migrants and infrastructure for older people.
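A one-dimensional sketch of the idea, on simulated data: coordinates rounded to a grid are either fed to a kernel density estimator directly (ignoring the measurement error) or jittered uniformly over their rounding cell first, a crude Monte-Carlo stand-in for the measurement error correction:

```python
import math
import random

random.seed(1)

# Simulated true locations along one coordinate, then rounding to a unit
# grid as a stand-in for confidentiality masking of geo-coordinates.
true_x = [random.gauss(5.0, 1.0) for _ in range(5000)]
grid = 1.0
rounded = [round(x / grid) * grid for x in true_x]

def kde(data, x, bw):
    """Plain Gaussian kernel density estimate at point x."""
    c = 1.0 / (len(data) * bw * math.sqrt(2 * math.pi))
    return c * sum(math.exp(-0.5 * ((x - d) / bw) ** 2) for d in data)

# Jittering spreads each rounded value uniformly over its rounding cell.
jittered = [r + random.uniform(-grid / 2, grid / 2) for r in rounded]

bw, x0 = 0.3, 5.0
naive = kde(rounded, x0, bw)       # spikes at the grid points distort this
corrected = kde(jittered, x0, bw)
truth = 1.0 / math.sqrt(2 * math.pi)   # N(5, 1) density at its mode
print(round(naive, 3), round(corrected, 3), round(truth, 3))
```

The jittered estimate lands much closer to the true density at the mode than the naive one, which is inflated by the probability mass piled onto the grid point at 5.0.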

To meet the strategic goals and objectives for the 2020 census, the US Census Bureau must make fundamental changes to the design, implementation and management of the decennial census. The changes must build on the successes and address the challenges of previous censuses. Of particular interest is gauging the ongoing quality of the census frames. We address this topic by discussing a set of statistical models for the *Master Address File* that will produce estimates of coverage error at levels of geography down to the block level. The distributions of added and deleted housing units in a block are used to characterize undercoverage and overcoverage respectively. The data used are from the 2010 address canvassing operation. As will be shown, these distributions are highly right skewed, with a very large proportion of zero counts. Hence, we utilize zero-inflated regression modelling to determine the predicted distribution of additions and deletions. In addition to standard statistical measures, we gauge the performance of this model by simulating a 2010 address canvassing operation using a specified coverage level. We also discuss future maintenance and updating of this model.
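A sketch of the zero-inflation idea on simulated counts (hypothetical parameter values, and a crude grid-search maximum likelihood fit rather than the authors' regression model): with probability p a block contributes a structural zero, otherwise the count is Poisson.

```python
import math
import random

random.seed(2)

# Hypothetical zero-inflated Poisson for counts of added housing units.
p_true, lam_true = 0.6, 2.5

def rpois(lam):
    # Knuth's Poisson sampler (fine for small lam)
    limit, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod < limit:
            return k
        k += 1

data = [0 if random.random() < p_true else rpois(lam_true)
        for _ in range(20000)]

# Sufficient statistics for the log-likelihood (constant terms dropped).
n0 = sum(1 for y in data if y == 0)
pos = [y for y in data if y > 0]
npos, s = len(pos), sum(pos)

def loglik(p, lam):
    return (n0 * math.log(p + (1 - p) * math.exp(-lam))
            + npos * math.log(1 - p) - npos * lam + s * math.log(lam))

# Crude grid-search maximum likelihood estimate.
p_hat, lam_hat = max(((p / 100, l / 20) for p in range(1, 100)
                      for l in range(1, 120)),
                     key=lambda t: loglik(*t))
print(p_hat, lam_hat)  # near (0.6, 2.5)
```

Note that the observed zeros mix structural zeros with ordinary Poisson zeros, which is why both components must be fitted jointly rather than simply treating the zero fraction as an estimate of p.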

The rise in healthcare expenditures has raised doubts about the sustainability of health systems and instigated a discussion on their design. Policy making in this field requires a proper understanding of how healthcare expenditures evolve throughout an individual's lifetime, and of how they vary between individuals. Given the lack of data on healthcare expenditures during an individual's lifetime, we developed a new nearest neighbour resampling approach to construct realistic individual life cycles of healthcare expenditures based on cross-sectional data from the Netherlands. This approach provides insight into lifetime healthcare expenditures. Our main finding is that the inequality in lifetime healthcare expenditures is much smaller than the inequality as derived from cross-sectional healthcare expenditures.
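The nearest neighbour resampling idea can be sketched on simulated cross-sections (the expenditure distributions, age range and tuning constant k are all illustrative assumptions, not values from the paper's Dutch data):

```python
import random

random.seed(5)

# Hypothetical cross-sections: for each age, 500 observed annual healthcare
# expenditures (log-normal, drifting upwards with age).
ages = list(range(20, 26))
cross_sections = {a: [random.lognormvariate(5 + 0.02 * a, 0.8)
                      for _ in range(500)]
                  for a in ages}

def life_cycle(k=5):
    """Chain cross-sectional records into one synthetic life cycle: start
    from a random draw at the youngest age, then repeatedly move to one of
    the k records at the next age whose expenditure is closest to the
    current one, so that persistence in spending is preserved."""
    path = [random.choice(cross_sections[ages[0]])]
    for a in ages[1:]:
        pool = sorted(cross_sections[a], key=lambda e: abs(e - path[-1]))[:k]
        path.append(random.choice(pool))
    return path

cycle = life_cycle()
print(len(cycle))  # 6: one expenditure per age
```

Averaging many such simulated cycles gives a distribution of lifetime expenditures even though no individual is observed for more than one period.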

Surveillance data collected on several hundred different infectious organisms over 20 years have revealed striking power relationships between their variance and mean in successive time periods. Such patterns are common in ecology, where they are referred to collectively as Taylor's power law. In the paper, these relationships are investigated in detail, with the aim of exploiting them for the descriptive statistical modelling of infectious disease surveillance data. We confirm the existence of variance-to-mean power relationships, with exponent typically between 1 and 2. We investigate skewness-to-mean relationships, which are found broadly to match those expected of Tweedie distributions, and thus confirm the relevance of the Tweedie convergence theorem in this context. We suggest that variance- and skewness-to-mean power laws, when present, should inform statistical modelling of infectious disease surveillance data, notably in descriptive analysis, model building, simulation and interval and threshold estimation, threshold estimation being particularly relevant to outbreak detection.
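A sketch of how such an exponent might be estimated, using gamma-mixed Poisson (negative binomial) counts as an assumed generating mechanism, for which the variance grows as mean + mean²/shape and the log–log slope falls between 1 and 2:

```python
import math
import random

random.seed(3)

def rpois(lam):
    # Knuth's Poisson sampler (fine for small lam)
    limit, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod < limit:
            return k
        k += 1

# For each "organism", a different mean level; counts are gamma-mixed
# Poisson, so variance = m + m**2 / shape.
shape = 2.0
means, variances = [], []
for m in [0.5 * i for i in range(1, 41)]:
    counts = [rpois(random.gammavariate(shape, m / shape)) for _ in range(400)]
    mu = sum(counts) / len(counts)
    var = sum((c - mu) ** 2 for c in counts) / (len(counts) - 1)
    means.append(mu)
    variances.append(var)

# Estimate the exponent b in var = a * mean**b by least squares on logs.
lx = [math.log(m) for m in means]
ly = [math.log(v) for v in variances]
mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
b = (sum((x - mx) * (y - my) for x, y in zip(lx, ly))
     / sum((x - mx) ** 2 for x in lx))
print(round(b, 2))  # exponent between 1 and 2
```

Regressing log variance on log mean across series is the standard descriptive check for Taylor's power law; the fitted slope is the exponent discussed in the abstract.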

Longitudinal monitoring of biomarkers is often helpful for predicting disease or a poor clinical outcome. We consider the prediction of both large and small for gestational age births by using longitudinal ultrasound measurements, and we attempt to identify subgroups of women for whom prediction is more (or less) accurate, should they exist. We propose a tree-based approach to identifying such subgroups, and a pruning algorithm which explicitly incorporates a desired type I error rate, allowing us to control the risk of false discovery of subgroups. The methods proposed are applied to data from the Scandinavian Fetal Growth Study and are evaluated via simulations.

Sampling hidden populations with standard sampling methods is particularly challenging, mainly because of the lack of a sampling frame. Respondent-driven sampling is an alternative methodology that exploits the social contacts between peers to reach and weight individuals in these hard-to-reach populations. It is a snowball sampling procedure in which the weight of each respondent is adjusted for the likelihood of being sampled, owing to differences in the number of contacts. The structure of the social contacts thus regulates the process by constraining the sampling within subregions of the network. We study the bias induced in the respondent-driven sampling estimator by network communities, which are groups of individuals more connected among themselves than with individuals in other groups. We simulate different structures and response rates to reproduce real settings. We find that the prevalence of the estimated variable is associated with the size of the network community to which the individual belongs, and we observe that low degree nodes may be undersampled if the sample and the network are of similar size. We also find that respondent-driven sampling estimators perform well if response rates are relatively large and the community structure is weak, whereas low response rates typically generate strong biases irrespective of the community structure.
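One common form of the degree adjustment (a Volz–Heckathorn style inverse-degree weighting; the respondents below are made-up data) can be sketched as:

```python
# Hypothetical respondents as (degree, has_trait) pairs. High degree
# individuals are more likely to be recruited, so each respondent is
# reweighted by the inverse of their reported degree.
sample = [(2, 1), (3, 1), (8, 0), (10, 0), (4, 1), (12, 0)]

naive = sum(y for _, y in sample) / len(sample)

num = sum(y / d for d, y in sample)
den = sum(1 / d for d, _ in sample)
weighted = num / den

print(round(naive, 3), round(weighted, 3))  # 0.5 0.778
```

Here the trait is concentrated among low degree respondents, so the unweighted prevalence understates it and the inverse-degree estimator pulls the estimate upwards; community structure biases this estimator precisely when it distorts who gets recruited beyond what degree alone captures.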

The paper investigates the dependences between levels of severity of road traffic accidents, accounting at the same time for spatial and temporal correlations. The study analyses road traffic accident data at ward level in England over the period 2005–2013. We include in our model multivariate spatially structured and unstructured effects to capture the dependences between severities, within a Bayesian hierarchical formulation. We also include a temporal component to capture time effects, and we carry out an extensive model comparison. The results show important associations in both spatially structured and unstructured effects between severities, and a downward temporal trend is observed for both low and high levels of severity. Maps of posterior accident rates indicate elevated risk within big cities for accidents of low severity, and in suburban areas in the north and on the southern coast of England for accidents of high severity. The posterior probability of extreme rates is used to suggest the presence of hot spots from a public health perspective.

We exploit variation across Italian regions in the implementation of region-specific tariffs within a prospective pay system for hospitals based on diagnosis-related groups to assess their effect on health and on the use of healthcare services. We consider survey data for the years 1993–2007 with information on both individuals’ perceived health and their utilization of healthcare services. The results suggest that the introduction of market incentives via a fixed price payment system did not lead to worse health perceptions. Instead, it marked a moderate decrease in hospitalization coupled with a clearer decrease in the utilization of emergency services. We also find mild evidence of reduced satisfaction with healthcare services among hospital patients. These effects were stronger for adoptions occurring when the national government also supported the market-based approach. The results are robust to sensitivity checks.

We study the effect of competition on adverse hospital health outcomes in a context in which information about hospital quality is not publicly available. We use data on patients who were admitted to hospitals in the Lombardy region of Italy. Although risk-adjusted hospital rankings are estimated yearly in this region, such rankings are provided to hospital managers only and are not available to general practitioners or citizens. Hence, patients may choose the hospital in which to be admitted on the basis of other criteria, such as geographical closeness to the hospital, local network information and referrals by general practitioners. We first estimate a model of patients' hospital choices and include among the determinants a variable capturing social interaction, which serves as a proxy for the quality of hospitals as perceived by patients. Using patient-predicted choice probabilities, we then construct a set of competition indices and measure their effect on a composite index of mortality and readmission rates that represents, in our setting, hospital quality in terms of adverse health outcomes. Our results show no association between such adverse events and hospital competition. This finding may be the result of asymmetric information, as well as of the difficulty of building good quality health indicators.

We apply parametric and non-parametric regression discontinuity methodology within a multinomial choice setting to examine the effect of public healthcare user fee abolition on health facility choice, using data from South Africa. The non-parametric model is found to outperform the parametric model both in and out of sample, while also delivering more plausible estimates of the effect of user fee abolition (i.e. the ‘treatment effect’). In the parametric framework, treatment effects were relatively constant, at around 10%, and that increase was drawn equally from home care and private care. In contrast, in the non-parametric framework treatment effects were largest, at approximately 5%, for large (and poor) families located further from health facilities. More plausibly, the positive treatment effect was drawn primarily from home care, suggesting that the policy favoured children living in poorer conditions, as those children received at least some minimum level of professional healthcare after the policy was implemented.

We conduct a quasi-Monte-Carlo comparison of the recent developments in parametric and semiparametric regression methods for healthcare costs, both against each other and against standard practice. The population of English National Health Service hospital in-patient episodes for the financial year 2007–2008 (summed for each patient) is randomly divided into two equally sized subpopulations to form an estimation set and a validation set. Evaluating out-of-sample using the validation set, a conditional density approximation estimator shows considerable promise in forecasting conditional means, performing best for accuracy of forecasting and among the best four for bias and goodness of fit. The best performing model for bias is linear regression with square-root-transformed dependent variables, whereas a generalized linear model with square-root link function and Poisson distribution performs best in terms of goodness of fit. Commonly used models utilizing a log-link are shown to perform badly relative to other models considered in our comparison.
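One reason log-link style models can mislead for cost data is the retransformation problem: by Jensen's inequality, exponentiating a mean taken on the log scale understates the mean on the original scale. A toy sketch with simulated log-normal costs (not the NHS data):

```python
import math
import random

random.seed(4)

# Skewed "cost" data: log-normal, so E[Y] = exp(mu + sigma**2 / 2),
# which exceeds exp(E[log Y]) = exp(mu).
mu, sigma = 2.0, 1.0
y = [math.exp(random.gauss(mu, sigma)) for _ in range(50000)]

true_mean = sum(y) / len(y)

# Naively exponentiating the mean of log(y), as a log-transformed OLS
# "forecast" without a smearing correction would, understates the mean cost.
naive = math.exp(sum(math.log(v) for v in y) / len(y))

print(naive < true_mean)  # True
```

With these parameters the gap is large (roughly exp(2) versus exp(2.5)), which is why transformation and link choices matter so much in cost regression comparisons of this kind.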

The analysis of national mortality trends is critically dependent on the quality of the population, exposures and deaths data that underpin death rates. We develop a framework that allows us to assess data reliability and to identify anomalies, illustrated, by way of example, using England and Wales population data. First, we propose a set of graphical diagnostics that help to pinpoint anomalies. Second, we develop a simple Bayesian model that allows us to quantify objectively the size of any anomalies. Two-dimensional graphical diagnostics and modelling techniques are shown to improve significantly our ability to identify and quantify anomalies. An important conclusion is that significant anomalies in population data can often be linked to uneven patterns of births of people in cohorts born in the distant past. In the case of England and Wales, errors of more than 9% in the estimated size of some birth cohorts can be attributed to an uneven pattern of births. We propose methods that can use births data to improve estimates of the underlying population exposures. Finally, we consider the effect of anomalies on mortality forecasts and annuity values, and we find significant effects for some cohorts. Our methodology has general applicability to other sources of population data, such as the Human Mortality Database.

Age and sex patterns of migration are essential for understanding drivers of population change and heterogeneity of migrant groups. We develop a hierarchical Bayesian model to estimate such patterns for international migration in the European Union and European Free Trade Association from 2002 to 2008, which was a period of time when the number of members expanded from 19 to 31 countries. Our model corrects for the inadequacies and inconsistencies in the available data and estimates the missing patterns. The posterior distributions of the age and sex profiles are then combined with a matrix of origin–destination flows, resulting in a synthetic database with measures of uncertainty for migration flows and other model parameters.

The instability of ethnicity measured in the national census is found to have doubled from the period 1991–2001 to the period 2001–2011, using the Longitudinal Study that links a sample of individuals’ census records across time. From internal evidence and comparison with results from the Census Quality Survey and the Labour Force Survey, estimates are made of instability due to changing question wording, imputation of missing answers, proxy reporting, recording errors and changes in the allocation of write-in answers. Of the remaining instability, durable changes of ethnicity by individuals are thought to be considerably less common than changes due to a person's sense of identity not closely fitting the categories offered in the census question. The instability creates a net change in size of some ethnic groups that is usually small compared with the change in population between censuses from births, deaths and migration. Consequences for analysis of census aggregate and microdata are explored.

We explore the existence of short- and long-term effects of retirement on health. Short-term effects are estimated with a regression discontinuity design which is robust to weak instruments and where the underlying assumptions of continuity of potential outcomes are uncontroversial. To identify the long-term effects we propose a parametric model which, under strong assumptions, can separate normal deterioration of health from the causal effects of retirement. We apply our framework to the British Household Panel Survey and find that retirement has little effect on health. However, our estimates suggest that retirement opens the gate to a sedentary life with an impoverished social component and this is a channel through which retirement could indirectly affect health in the long run.

We present new auto-regressive logit models for forecasting the probability of a time series of financial asset returns exceeding a threshold. The models can be estimated by maximizing a Bernoulli likelihood. Alternatively, to account for the extent to which an observation does or does not exceed the threshold, we propose that the likelihood is based on the asymmetric Laplace distribution, which has been found to be useful for quantile estimation. We incorporate the exceedance probability forecasts within a new time varying extreme value approach to value at risk and expected shortfall estimation. We provide an empirical illustration using daily stock index data.
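A minimal sketch of the asymmetric Laplace ingredients (illustrative data only; the auto-regressive structure of the paper's models is omitted): the log-density is built from the check (pinball) loss, and minimizing the summed check loss over a location recovers a quantile of the data, which is what makes the distribution useful for quantile estimation.

```python
import math

def check_loss(u, p):
    """Check (pinball) loss rho_p(u) = u * (p - 1{u < 0})."""
    return u * (p - (1.0 if u < 0 else 0.0))

def asym_laplace_logpdf(x, m, s, p):
    """Log-density of the asymmetric Laplace:
    f(x) = p * (1 - p) / s * exp(-rho_p((x - m) / s))."""
    return math.log(p * (1 - p) / s) - check_loss((x - m) / s, p)

# Minimizing the summed check loss over candidate locations recovers the
# p-quantile: at p = 0.5 this is the median, robust to the outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
p = 0.5
best_m = min(data, key=lambda m: sum(check_loss(x - m, p) for x in data))
print(best_m)  # 3.0 (the median)
print(round(asym_laplace_logpdf(0.0, 0.0, 1.0, 0.5), 3))  # -1.386
```

Maximizing an asymmetric Laplace likelihood in the location parameter is therefore equivalent to minimizing the check loss, which is how the likelihood accounts for the extent of a threshold exceedance rather than only its occurrence.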

We consider two econometric problems that arise when investigating the effect of family size on labour market outcomes by using the popular twin birth instrument. The first is the potential for omitted variable bias caused by the fact that fertility treatments are linked to twin births and are typically unobserved. We present estimates that are corrected for this bias and find that it is comparatively small. Second, we show that the effects of twin-birth-induced variation in family size, as well as the characteristics of the compliers, vary substantially with the time passed since birth, which has consequences for the interpretation of estimates across samples and time.