## 1. Background

The time series that record various aspects of the Earth's climate system are widely recognized as being non-stationary (Hays *et al.*, 1976; Imbrie *et al.*, 1992; Karl *et al.*, 2000; Tomé and Miranda, 2004; Raymo *et al.*, 2006; Beaulieu *et al.*, 2010; among others). Several methods have been implemented to solve the ‘change point’ problem for shorter climatic time series. For example, Karl *et al.* (2000) fix the number of discontinuities and then use both Haar (square) wavelets and a brute force minimization of the residual squared error to place piecewise continuous line segments. Similar to this second approach, Tomé and Miranda (2004) automate the creation of a matrix of over-determined linear equations and solve this system in turn for every possible combination of change points that satisfies their constraints. To deal with the rapidly increasing number of change point solutions associated with longer time series, dynamic programming change point algorithms have been developed that reduce the computational burden to a more manageable size (Ruggieri *et al.*, 2009). Branch and Bound techniques (Aksoy *et al.*, 2008) also aim to reduce the computational burden by screening and eliminating sub-optimal segmentations.
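To make the cost of exhaustive search concrete, the following is a minimal sketch of a brute force minimization of the residual squared error. It is not the procedure of Karl *et al.* (2000) or Tomé and Miranda (2004): it assumes an independent least-squares line in each segment (no continuity constraint and no wavelets), and the function names and the `min_len` minimum segment length are hypothetical choices made for illustration.

```python
from itertools import combinations

import numpy as np


def segment_sse(y, start, stop):
    """Residual squared error of a least-squares line fitted to y[start:stop]."""
    x = np.arange(start, stop)
    design = np.column_stack([np.ones(stop - start), x])
    coef, *_ = np.linalg.lstsq(design, y[start:stop], rcond=None)
    resid = y[start:stop] - design @ coef
    return float(resid @ resid)


def brute_force_change_points(y, k, min_len=3):
    """Try every placement of k change points; return (best_sse, change_points).

    A change point is the index at which a new segment starts.
    min_len is an assumed minimum segment length, not taken from the text.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    best = (np.inf, ())
    for cps in combinations(range(min_len, n - min_len + 1), k):
        bounds = (0, *cps, n)
        segments = list(zip(bounds, bounds[1:]))
        # Skip placements that leave any segment shorter than min_len.
        if any(b - a < min_len for a, b in segments):
            continue
        sse = sum(segment_sse(y, a, b) for a, b in segments)
        if sse < best[0]:
            best = (sse, cps)
    return best
```

Because the loop visits every combination of candidate positions, its running time grows with the number of combinations, which is what motivates the dynamic programming and Branch and Bound alternatives.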

Alternatively, Seidel and Lanzante (2004) first identify change points by visual inspection and then refine their location so as to: (1) minimize the number of change points; (2) be consistent with previous research; and (3) have support from an iterative non-parametric statistical method (Lanzante, 1996). This iterative approach adds one change point at a time, testing each for statistical significance. In an attempt to minimize the *a priori* assumptions on the number and location of change points, Menne (2006) proposes a semi-hierarchic splitting algorithm to place the change points. Here, the placement of a change point splits the time series, but each splitting step is followed by a merge step to determine whether change points chosen earlier are still significant.

Each of these methods returns a single, ‘optimal’ solution. But if there are *∼N*^{k} possible placements of *k* change points in a time series of length *N*, how confident are we that this one solution is vastly superior to any other, especially one that may only differ by a single data point? A Bayesian approach to the change point problem can give uncertainty estimates not only for the location, but for the number of change points as well.
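The size of this search space is easy to tabulate. The sketch below assumes that a change point may fall in any of the *N* − 1 gaps between adjacent observations, with no minimum segment length, so that the exact count is the binomial coefficient C(*N* − 1, *k*), which grows on the order of *N*^{k} for fixed *k*. The function name is hypothetical.

```python
from math import comb


def count_placements(n, k):
    """Number of ways to put k change points in the n - 1 gaps between n observations."""
    return comb(n - 1, k)


# Growth of the search space for a series of 1000 observations.
for k in (1, 2, 3, 4):
    print(k, count_placements(1000, k))
```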

For computational reasons, Markov chain Monte Carlo (MCMC) (Barry and Hartigan, 1993; Lavielle and Lebarbier, 2001; Zhao and Chu, 2006) and Gibbs Sampling (Stephens, 1994; Khaliq *et al.*, 2007) approaches have dominated Bayesian solutions to the multiple change point problem. However, these techniques only approximate the posterior distribution of change point locations and leave open difficult questions of convergence.

Bayesian change point algorithms that do not rely on MCMC procedures include Hannart and Naveau (2009), who use Bayesian Decision Theory to minimize a cost function for the detection of multiple change points, and Beaulieu *et al.* (2010), who probabilistically locate multiple change points through a splitting algorithm akin to that of Menne (2006), but without the corresponding merge step. However, these approaches are limited to identifying changes in mean. Fearnhead (2006) developed a recursive algorithm similar to the Forward/Backward equations of a Hidden Markov Model, which Seidou and Ouarda (2007) generalized to fit a regression model. With respect to the algorithm presented here, there are two main differences: the nature of the recursion and the prior distributions on the model parameters. Seidou and Ouarda (2007) require two training data sets and a prior distribution on the distance between adjacent change points (an implicit assumption on the number of change points in a time series). Our approach requires neither.

In what follows, we describe an exact Bayesian solution to the multiple change point problem that uses dynamic programming recursions to reduce the computational burden to the point where a time series of any length can be analysed for an arbitrary number of change points. The key to dynamic programming is to break the multiple change point problem down into a set of progressively smaller sub-problems, the smallest of which (the placement of a single change point) can easily be solved. The full solution can then be obtained by efficiently piecing together the solutions to these sub-problems. The Bayesian Change Point algorithm can detect changes in the parameters of any regression model being used to describe a climatic time series, be it changes in the mean, trend, and/or variance of the climate signal. After describing its implementation, the Bayesian Change Point algorithm is used to analyse both the NOAA/NCDC global surface temperature anomalies time series and the δ^{18}O proxy record of the Plio-Pleistocene. For the latter, the goal is to show how the algorithm can be applied to very long time series and search for more than just changes in trend. The ability to provide uncertainty estimates in the number and timing of change points is a key contribution of the Bayesian Change Point algorithm and a significant advantage over a frequentist approach.
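The sub-problem decomposition can be illustrated with a minimal sketch. This is not the Bayesian Change Point algorithm itself: it is a simplified least-squares analogue that assumes a piecewise-constant mean and returns a single best segmentation, shown only to make the structure of a dynamic programming recursion concrete. The function and variable names are hypothetical.

```python
import numpy as np


def dp_segmentation(y, k):
    """Place k change points in y to minimise squared error about segment means.

    Returns (best_sse, change_points); a change point is the index at which
    a new segment starts. Requires len(y) >= k + 1.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Prefix sums give the squared error of any segment y[i:j] in O(1).
    s1 = np.concatenate([[0.0], np.cumsum(y)])
    s2 = np.concatenate([[0.0], np.cumsum(y * y)])

    def sse(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # cost[m][j]: smallest error for y[:j] split by m change points.
    # last[m][j]: start index of the final segment in that best split.
    cost = np.full((k + 1, n + 1), np.inf)
    last = np.zeros((k + 1, n + 1), dtype=int)
    for j in range(1, n + 1):
        cost[0][j] = sse(0, j)
    for m in range(1, k + 1):
        for j in range(m + 1, n + 1):
            for i in range(m, j):
                # Reuse the solved sub-problem (m - 1 change points in y[:i]).
                c = cost[m - 1][i] + sse(i, j)
                if c < cost[m][j]:
                    cost[m][j] = c
                    last[m][j] = i

    # Trace back through the stored split points to recover the change points.
    cps, j = [], n
    for m in range(k, 0, -1):
        j = last[m][j]
        cps.append(int(j))
    return float(cost[k][n]), cps[::-1]
```

In this sketch each entry of the table is built from entries already computed for one fewer change point, so the work is of order *kN*² rather than growing with the number of possible placements.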