## Introduction

In many ecological settings, we wish to combine information from different sites, times or studies to relate the true value of a population parameter to one or more predictor variables. For example, we might wish to estimate the relationship between reproductive rate and an index of climate. Meta-regression, in which we combine information from different studies, is an important special case of this type of analysis (Verdú & Traveset 2005; Knowles, Nakagawa, & Sheldon 2009).

It is important to realise that there are two sources of error variation in such data: process error and measurement error. In our example, the former refers to the error variation we would encounter if we knew the exact reproductive rate each year, while the latter arises as a consequence of having to estimate reproductive rate from field data. These two sources of error variation need to be borne in mind whenever we analyse such data.

Recognition of the need to distinguish process error from measurement error is one of the reasons for the increasing popularity of hierarchical modelling, which arises naturally within the Bayesian approach to data analysis (Gelman & Hill 2007; Royle & Dorazio 2008; Gurevitch & Mengersen 2010). In our example, we would specify a model for the relationship between the true reproductive rate and the climate index (incorporating process error), as well as for the relationship between the true reproductive rate and the observed reproductive rate (incorporating measurement error). Fitting these two models simultaneously provides an effective means of recognising and allowing for the two sources of error variation.
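As a minimal sketch of this two-level structure, suppose the relationship is linear and the errors are independent and normally distributed. The symbols here are introduced purely for illustration: $\theta_i$ is the true reproductive rate in year $i$, $\hat{\theta}_i$ its estimate, $x_i$ the climate index, $\sigma^2$ the process error variance and $s_i^2$ the measurement error variance of the $i$th estimate.

```latex
\begin{aligned}
\theta_i &= \alpha + \beta x_i + \varepsilon_i, & \varepsilon_i &\sim N(0, \sigma^2) && \text{(process model)} \\
\hat{\theta}_i &= \theta_i + \delta_i, & \delta_i &\sim N(0, s_i^2) && \text{(measurement model)} \\
\hat{\theta}_i &\sim N(\alpha + \beta x_i,\; \sigma^2 + s_i^2) &&&& \text{(implied marginal model)}
\end{aligned}
```

Under these assumptions, the variance of each estimate about the regression line is the sum of a common process component and an estimate-specific measurement component.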

Although hierarchical modelling has obvious benefits, we might sometimes wish to adopt a simpler approach, in which the analysis is performed in two stages (Cox 2006; Murtaugh 2007). In our example, this would involve first calculating annual estimates of reproductive rate and then fitting a regression of these against the climate index. To allow for measurement error, it would be common practice to perform a weighted regression at the second stage, with the standard errors of the annual estimates being used to determine the relevant weights (Gurevitch & Hedges 1999; Murtaugh 2007).
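The two stages can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a prescribed implementation: the data are simulated, the names (`climate_index`, `nests_per_year`, `weighted_linear_fit`) are hypothetical, and the weights are taken to be the inverse squared standard errors, which is one common choice and makes no allowance for process error.

```python
import math
import random
import statistics


def weighted_linear_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


rng = random.Random(1)
climate_index = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]  # hypothetical values
nests_per_year = [5, 40, 8, 25, 6, 30, 10, 50]               # hypothetical sample sizes

# Stage 1: estimate the reproductive rate and its standard error for each year.
annual_estimates, standard_errors = [], []
for x, n in zip(climate_index, nests_per_year):
    true_rate = 2.0 + 0.5 * x + rng.gauss(0.0, 0.2)            # includes process error
    offspring = [rng.gauss(true_rate, 1.0) for _ in range(n)]  # simulated field data
    annual_estimates.append(statistics.fmean(offspring))
    standard_errors.append(statistics.stdev(offspring) / math.sqrt(n))

# Stage 2: regress the annual estimates on the climate index,
# weighting each estimate by the inverse of its squared standard error (assumed choice).
weights = [1.0 / se ** 2 for se in standard_errors]
intercept, slope = weighted_linear_fit(climate_index, annual_estimates, weights)
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

Setting all the weights to 1 in the second stage gives the corresponding unweighted regression.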

This two-stage approach may be necessary if the original data are not available, as is common in meta-regression. In addition, in some settings, specialist software might be required to model the measurement process, making a hierarchical modelling approach logistically difficult (see the molecular rates example discussed below). Even if all the data are available, a two-stage approach might be preferred for reasons of simplicity and transparency (Murtaugh 2007).

The purpose of this paper is to consider how such a two-stage approach should be carried out. In particular, we focus on assessing when it is preferable to use standard (unweighted) regression in the second stage, i.e. to ignore the differences in the measurement error of the individual estimates. Our motivation for this is twofold. First, we might wish to use an even simpler analysis, as long as we can still make reliable inferences (Murtaugh 2007). Second, unweighted regression might provide a more reliable analysis, as it does not require separate estimation of the process and measurement error variation. The standard recommendation in the literature is to use weighted regression (e.g. Gurevitch & Hedges 1999), and our aim is to consider the extent to which such a recommendation is merited.

We do not consider all the statistical issues that arise in meta-regression, or in meta-analysis more generally, such as choice of population and parameter, sampling issues (including publication bias), and standardisation of methods across studies (Englund, Sarnelle, & Cooper 1999; Osenberg *et al.* 1999). In addition, for simplicity of presentation, we primarily consider the case where it is reasonable to assume a linear relationship between the population parameter and a single predictor variable, with all the error terms being independent and normally distributed.

In the next section, we provide motivating examples and describe three methods of analysis that might be used in a two-stage approach: unweighted regression, weighted regression and ‘weighted regression ignoring the process error’. We compare these methods, using theory to outline their properties, and simulation to assess their performance in terms of the coverage rate and width of a 95% confidence interval for the slope.
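To make the two performance measures concrete, the following sketch estimates the coverage rate and mean width of a 95% confidence interval for the slope under unweighted regression. It is an illustration only and not the simulation design used in this paper: the sample size of 10, the predictor values, the error standard deviations and the function names are all assumptions chosen for the example.

```python
import math
import random

T_CRIT_DF8 = 2.306  # 97.5% point of Student's t with 8 d.f. (10 estimates, 2 parameters)


def ols_slope_ci(x, y, t_crit):
    """Unweighted least-squares slope: return the (lower, upper) confidence limits."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / s_xx)
    return slope - t_crit * se, slope + t_crit * se


def simulate_coverage(n_sims=2000, true_slope=0.5, process_sd=0.2, seed=1):
    """Estimate coverage rate and mean width of the slope interval by simulation."""
    rng = random.Random(seed)
    x = [i / 9 for i in range(10)]                      # hypothetical predictor values
    measurement_sd = [0.1 + 0.05 * i for i in range(10)]  # unequal measurement error
    covered, total_width = 0, 0.0
    for _ in range(n_sims):
        # Each estimate = linear trend + process error + measurement error.
        y = [
            1.0 + true_slope * xi + rng.gauss(0.0, process_sd) + rng.gauss(0.0, s)
            for xi, s in zip(x, measurement_sd)
        ]
        lower, upper = ols_slope_ci(x, y, T_CRIT_DF8)
        covered += lower <= true_slope <= upper
        total_width += upper - lower
    return covered / n_sims, total_width / n_sims


coverage, mean_width = simulate_coverage()
print(f"coverage = {coverage:.3f}, mean width = {mean_width:.3f}")
```

A coverage rate close to 0.95 together with a narrow interval indicates good performance. Repeating the simulation with a weighted fit in place of `ols_slope_ci` would allow the methods to be compared on the same footing.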