## 1 INTRODUCTION

In a presentation on the unsolved problems in mathematics on 8 August 1900 at the International Congress of Mathematicians in Paris, Professor David Hilbert made this remark (Hilbert, 1902):

> The deep significance of certain problems for the advance of mathematical science in general and the important role which they play in the work of the individual investigator are not to be denied. As long as a branch of science offers an abundance of problems, so long is it alive; a lack of problems foreshadows extinction or the cessation of independent development. Just as every human undertaking pursues certain objects, so also mathematical research requires its problems. It is by the solution of problems that the investigator tests the temper of his steel; he finds new methods and new outlooks, and gains a wider and freer horizon.

By this standard, climate science is very much alive and is raising many interesting and unsolved statistical problems. Climate science has made outstanding progress in the past 50 years with the development of ever more complex and more detailed numerical models of the climate system (Randall *et al.*, 2007; Chandler *et al.*, 2010). Climate model simulations are numerical solutions to a large set of nonlinear first-order differential equations subject to boundary conditions or ‘forcings’ (greenhouse gases, solar radiation, aerosols, etc.) that control the subsequent evolution of the system. It is now common practice to sample uncertainty in climate predictions by considering ensembles of simulations from one or more climate models. Ensembles are created either by perturbing a climate model's initial conditions (initial condition ensembles) or physical parameters (perturbed physics ensembles), or by taking model outputs from a set of different climate models (multi-model ensembles; MME). For example, the forthcoming Intergovernmental Panel on Climate Change Fifth Assessment Report intends to present projections based on data from an MME of around 40 climate models produced by climate modelling centres around the world as part of the coordinated CMIP5 project.¹ Typical time series from an MME are shown in Figure 1 together with corresponding observations of UK temperature.
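The two single-model ensemble constructions described above can be illustrated with a toy example. The sketch below is not a climate model: it assumes the three-variable Lorenz (1963) system as a stand-in set of nonlinear first-order differential equations, integrated with a simple forward Euler step, and all function names, parameter values, and perturbation sizes are hypothetical choices made only for illustration.

```python
import math
import random


def lorenz_step(state, dt, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the toy Lorenz-63 system by one forward Euler step."""
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return (x + dt * dx, y + dt * dy, z + dt * dz)


def run(state, n_steps, dt=0.01, **params):
    """Integrate from `state` for `n_steps` steps and return the final state."""
    for _ in range(n_steps):
        state = lorenz_step(state, dt, **params)
    return state


def initial_condition_ensemble(base_state, n_members, noise, n_steps, seed=0):
    """Same model and parameters; each member starts from a perturbed state."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        perturbed = tuple(v + rng.gauss(0.0, noise) for v in base_state)
        members.append(run(perturbed, n_steps))
    return members


def perturbed_physics_ensemble(base_state, rho_values, n_steps):
    """Same initial state; each member uses a different value of one parameter."""
    return [run(base_state, n_steps, rho=rho) for rho in rho_values]


# Hypothetical settings chosen only to make the sketch run quickly.
base = (1.0, 1.0, 1.0)
ic_members = initial_condition_ensemble(base, n_members=5, noise=1e-3, n_steps=1500)
pp_members = perturbed_physics_ensemble(base, rho_values=[26.0, 28.0, 30.0], n_steps=1500)
```

A multi-model ensemble would correspond to replacing `lorenz_step` itself with structurally different models, which is why its members cannot be treated as draws from a single known perturbation distribution.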

Despite their increasing complexity and seductive realism, it is important to remember that climate models are *not* the real world. Climate models are numerical approximations to fluid dynamical equations forced by parameterisations of physical and unresolved sub-grid scale processes. Climate models are inadequate in a rich diversity of ways, but the hope is that these physically motivated models can still inform us about various aspects of future observable climate. A major challenge is how to use climate models to construct credible probabilistic forecasts of future climate. Because climate models do not themselves produce probabilities, an additional level of explicit probabilistic modelling is necessary to achieve this.

Various methods have been used to obtain future climate projections from ensembles of climate model output. The simplest pragmatic approaches calculate either equally weighted means of the ensemble members or unequally weighted means based on descriptive model performance metrics. Alternatively, one can use statistical frameworks such as regression of future changes on past model statistics (e.g. Bracegirdle and Stephenson, 2012), or more complex hierarchical models (Buser *et al.*, 2009; Knutti *et al.*, 2010a; Leith and Chandler, 2010; Collins *et al.*, 2012; and references therein). However, there is little agreement on the most reliable and robust methodology for making probabilistic predictions of real-world climate.
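The two simplest approaches mentioned above, equally weighted and performance-weighted ensemble means, can be sketched as follows. This is a minimal illustration under stated assumptions rather than any of the cited methods: the Gaussian-kernel weighting on a hindcast error, the `scale` parameter, and all numerical values are hypothetical.

```python
import math


def equal_weight_mean(projections):
    """Ensemble mean with every model given the same weight."""
    return sum(projections) / len(projections)


def performance_weights(hindcast_errors, scale):
    """Normalised weights that decay with each model's past error.

    Assumed form: w_i proportional to exp(-(e_i / scale)^2). This is one
    of many possible metric-based weightings, chosen here for illustration.
    """
    raw = [math.exp(-((e / scale) ** 2)) for e in hindcast_errors]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_mean(projections, weights):
    """Ensemble mean using the supplied normalised weights."""
    return sum(w * p for w, p in zip(weights, projections))


# Hypothetical values, not taken from any real ensemble:
# projected change (K) and error against past observations (K) per model.
projected_warming = [2.1, 2.8, 3.4, 4.0]
hindcast_rmse = [0.3, 0.5, 0.9, 1.4]

weights = performance_weights(hindcast_rmse, scale=0.6)
ewm = equal_weight_mean(projected_warming)
pwm = weighted_mean(projected_warming, weights)
```

Both functions return a single point estimate. Neither supplies a probability distribution for the real-world climate, which is the gap the regression and hierarchical frameworks cited above attempt to fill.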

To help address this important and pressing challenge, we proposed probabilistic climate prediction as one of the two major themes for a residential research programme at the Isaac Newton Institute for Mathematical Sciences in Cambridge, entitled ‘Mathematical and Statistical Approaches to Climate Modelling and Prediction’ (Chandler *et al.*, 2010). The programme brought together more than 150 climate scientists and mathematicians/statisticians over the period August–December 2010. As part of the interdisciplinary programme, we organised a workshop on probabilistic climate prediction at the University of Exeter from 20 to 23 September 2010. The main aims of the workshop were to bring climate and statistical experts together to discuss, identify, and formulate the big and potentially solvable problems in the mathematics of probabilistic climate prediction. More details of format, talks, and participants are provided in the Appendix.

The purpose of this article is to summarise the discussions that took place at the workshop in order to open up these issues to wider participation from statisticians. We hope that the article is also useful to climate scientists wishing to know more about the fundamental issues involved in interpreting ensembles and making probabilistic climate predictions, since it is important to be aware of the statistical modelling issues. One of the first stages in solving a problem is to recognise that the problem exists, and so it is our hope that this article will unveil some of the more challenging problems that are still in serious need of solutions. It should be noted that, as a reflection of the diverse ideas that emerged at the Exeter workshop, this article is neither a consensus statement nor a comprehensive account of all the interesting statistical problems that arise in climate science.