Statistics
Journal of the Royal Statistical Society Virtual Issue
FREE - Classic Journal Content
To celebrate the International Year of Statistics, we have created a special Virtual Issue of classic papers from the Journal of the Royal Statistical Society (Series A, B and C). You can read all these articles free during 2013.
Journal of the Royal Statistical Society - Series A: Statistics in Society
Diet revealed?: Semiparametric estimation of nutrient intake–age relationships
From the abstract: Smoothed estimates of the complex relationships between age and intakes of energy, fat, calcium and vitamin C are obtained for males and females from British National Food Survey data covering the period 1974–94. Since the data record household food acquisitions during a survey week, not food consumption by individuals, a model is constructed in which average nutrient intakes by sex at each completed year of age are parameters which are estimated by roughness penalized least squares. Income per head and region of residence are controlled for in a non-linear extension of the model. The sensitivity of the estimates to variations in eating out and the incidence of visitors is examined. Age- and sex-specific estimates of the proportion of energy from fat, a key health indicator, are derived and compared across a 21-year period.
Commissioned analysis of surgical performance by using routine data: lessons from the Bristol inquiry
From the abstract: The public inquiry into paediatric cardiac surgery at the Bristol Royal Infirmary commissioned the authors to design and conduct analyses of routine data sources to compare surgical outcomes between centres. Such analyses are necessarily complex in this context but were further hampered by the inherent inconsistencies and mediocre quality of the various sources of data. Three levels of analysis of increasing sophistication were carried out. The reasonable consistency of the results arising from different sources of data, together with a number of sensitivity analyses, led us to conclude that there had been excess mortality in Bristol in open heart operations on children under 1 year of age. We consider criticisms of our analysis and discuss the role of statisticians in this inquiry and their contribution to the final report of the inquiry. The potential statistical role in future programmes for monitoring clinical performance is highlighted.
How not to measure the efficiency of public services (and how one might)
From the abstract: The single-input case of the ‘technical efficiency’ theory of M. J. Farrell is reformulated geometrically and algebraically. Its linear programming developments as ‘data envelopment analysis’ are critically reviewed, as are the related techniques of ‘stochastic frontier analysis’. The sense and realism of using data envelopment analysis or stochastic frontier analysis techniques, rather than some value-based method, for the assessment of police force efficiency are questioned with reference to the Spottiswoode report and related studies.
Journal of the Royal Statistical Society - Series B: Statistical Methodology
Particle Markov chain Monte Carlo methods
From the abstract: Markov chain Monte Carlo and sequential Monte Carlo methods have emerged as the two main tools to sample from high dimensional probability distributions. Although asymptotic convergence of Markov chain Monte Carlo algorithms is ensured under weak assumptions, the performance of these algorithms is unreliable when the proposal distributions that are used to explore the space are poorly chosen and/or if highly correlated variables are updated independently. We show here how it is possible to build efficient high dimensional proposal distributions by using sequential Monte Carlo methods.
A direct approach to false discovery rates
From the abstract: Multiple-hypothesis testing involves guarding against much more complicated errors than single-hypothesis testing. Whereas we typically control the type I error rate for a single-hypothesis test, a compound error rate is controlled for multiple-hypothesis tests. For example, controlling the false discovery rate (FDR) traditionally involves intricate sequential p-value rejection methods based on the observed data. Whereas a sequential p-value method fixes the error rate and estimates its corresponding rejection region, we propose the opposite approach: we fix the rejection region and then estimate its corresponding error rate.
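The "fix the rejection region, then estimate its error rate" idea can be illustrated with a short sketch. This is not taken from the paper: it assumes independent p-values that are uniform under the null, rejects every p-value at or below a fixed `threshold`, and uses a hypothetical tuning parameter `lam` to estimate the proportion of true nulls from the p-values above it. The function name is illustrative only.

```python
def estimate_fdr(p_values, threshold, lam=0.5):
    """Estimate the FDR of the fixed rejection region [0, threshold].

    Assumption (not from the source): null p-values are Uniform(0, 1), so
    p-values above `lam` are treated as coming mostly from true nulls.
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("p_values must not be empty")
    if not 0.0 < threshold <= 1.0 or not 0.0 <= lam < 1.0:
        raise ValueError("need 0 < threshold <= 1 and 0 <= lam < 1")

    # Estimated proportion of true null hypotheses, capped at 1.
    pi0_hat = min(1.0, sum(p > lam for p in p_values) / (m * (1.0 - lam)))

    # Number of hypotheses rejected by the fixed region.
    rejected = sum(p <= threshold for p in p_values)

    # Expected false rejections divided by observed rejections (at least 1).
    fdr_hat = pi0_hat * m * threshold / max(rejected, 1)
    return min(1.0, fdr_hat)


p_values = [0.001, 0.002, 0.003, 0.004, 0.6, 0.7, 0.8, 0.9]
print(estimate_fdr(p_values, threshold=0.01))
```

Here the region is chosen first (`threshold=0.01`) and the error rate is an output, which reverses the order used by sequential p-value procedures.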
Regularization and variable selection via the elastic net
From the abstract: We propose the elastic net, a new regularization and variable selection method. Real world data and a simulation study show that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation. In addition, the elastic net encourages a grouping effect, where strongly correlated predictors tend to be in or out of the model together. The elastic net is particularly useful when the number of predictors (p) is much bigger than the number of observations (n). By contrast, the lasso is not a very satisfactory variable selection method in the p≫n case. An algorithm called LARS-EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.
Journal of the Royal Statistical Society - Series C: Applied Statistics
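To make the combined penalty concrete, here is a minimal sketch that fits an elastic-net-style criterion by cyclic coordinate descent. Coordinate descent is swapped in for brevity; it is not the LARS-EN algorithm named in the abstract. The objective, the function names and the parameters `l1`, `l2` and `n_sweeps` are assumptions for illustration, and the sketch fits the unrescaled ("naive") criterion with no intercept.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso part of the update)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def naive_elastic_net(X, y, l1, l2, n_sweeps=500):
    """Minimise (1/2n)*||y - Xb||^2 + l1*||b||_1 + (l2/2)*||b||^2.

    X is a list of rows and y a list of responses; no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X*beta, since beta starts at zero
    # Mean square of each column, used in the denominator of the update.
    col_ms = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            denom = col_ms[j] + l2
            if denom == 0.0:
                continue  # all-zero column with no ridge penalty: leave at 0
            # Inner product of column j with the residual that excludes j.
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)
            ) / n
            new_beta = soft_threshold(rho, l1) / denom
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Two identical predictors: the ridge term makes them share the coefficient.
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
y = [2.0, 4.0, 6.0, 8.0]
print(naive_elastic_net(X, y, l1=0.1, l2=0.5))
```

With two identical columns and `l2 > 0`, both coefficients converge to the same value, a small-scale instance of the grouping effect; a large enough `l1` sets every coefficient exactly to zero.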
Bayesian inference for generalized additive mixed models based on Markov random field priors
From the abstract: Most regression problems in practice require flexible semiparametric forms of the predictor for modelling the dependence of responses on covariates. Moreover, it is often necessary to add random effects accounting for overdispersion caused by unobserved heterogeneity or for correlation in longitudinal or spatial data. We present a unified approach for Bayesian inference via Markov chain Monte Carlo simulation in generalized additive and semiparametric mixed models. Different types of covariates, such as the usual covariates with fixed effects, metrical covariates with non-linear effects, unstructured random effects, trend and seasonal components in longitudinal data and spatial covariates, are all treated within the same general framework by assigning appropriate Markov random field priors with different forms and degrees of smoothness.
Generalized additive models for location, scale and shape
From the abstract: A general class of statistical models for a univariate response variable is presented which we call the generalized additive model for location, scale and shape (GAMLSS). The model assumes independent observations of the response variable y given the parameters, the explanatory variables and the values of the random effects. The distribution for the response variable in the GAMLSS can be selected from a very general family of distributions including highly skew or kurtotic continuous and discrete distributions. The systematic part of the model is expanded to allow modelling not only of the mean (or location) but also of the other parameters of the distribution of y, as parametric and/or additive nonparametric (smooth) functions of explanatory variables and/or random-effects terms.
Model-based geostatistics
From the abstract: Conventional geostatistical methodology solves the problem of predicting the realized value of a linear functional of a Gaussian spatial stochastic process S(x) based on observations Y_i = S(x_i) + Z_i at sampling locations x_i, where the Z_i are mutually independent, zero-mean Gaussian random variables. We describe two spatial applications for which Gaussian distributional assumptions are clearly inappropriate.
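The conventional Gaussian setting described in this abstract, a spatial signal S(x) observed with independent zero-mean Gaussian noise, can be simulated in a few lines. The sketch below is illustrative only: the exponential covariance function, the parameter names `sigma2`, `phi` and `tau2`, and the function names are assumptions not given in the source, and the sampling locations must be distinct.

```python
import math
import random


def exponential_covariance(locations, sigma2, phi):
    """Cov(S(a), S(b)) = sigma2 * exp(-distance(a, b) / phi) (assumed form)."""
    return [
        [sigma2 * math.exp(-math.dist(a, b) / phi) for b in locations]
        for a in locations
    ]


def cholesky(matrix):
    """Lower-triangular L with L * L^T = matrix (positive definite input)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def simulate_observations(locations, sigma2=1.0, phi=0.3, tau2=0.1, seed=0):
    """Draw the signal at each location, then add independent noise.

    Returns (signal, observed): the signal is a zero-mean Gaussian process
    and each observation adds zero-mean Gaussian noise with variance tau2.
    """
    rng = random.Random(seed)
    lower = cholesky(exponential_covariance(locations, sigma2, phi))
    normals = [rng.gauss(0.0, 1.0) for _ in locations]
    # Correlated Gaussian draw: signal = L * normals.
    signal = [
        sum(lower[i][k] * normals[k] for k in range(i + 1))
        for i in range(len(locations))
    ]
    observed = [s + rng.gauss(0.0, math.sqrt(tau2)) for s in signal]
    return signal, observed


sites = [(0.0, 0.0), (0.5, 0.0), (1.0, 1.0)]
signal, observed = simulate_observations(sites, seed=1)
print(signal)
print(observed)
```

Both the signal and the noise are Gaussian here, which is exactly the assumption the paper relaxes for the applications it describes.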