Characterizing temperature and precipitation multi-variate biases in 12 and 2.2 km UK Climate Projections

Many impactful weather and climate events include two or more variables (like temperature or precipitation) having high or low values (e.g., hot dry summers). Understanding biases in the relationship between modelled variables is important for characterizing uncertainties in the risks associated with compound events. We present a framework for evaluating the relationships between different variables (multi-variate bias). We illustrate our approach with UK temperature and precipitation, using HadUK-Grid observations and two model ensembles (12 and 2.2 km horizontal resolution) of the HadGEM3 regional model used in UK Climate Projections both forced with the same driving conditions. There are distinct regional patterns in the biases of both the Pearson correlation coefficients and coefficients of linear regression between temperature and precipitation in both resolutions, for example, large areas of positive biases in the Pearson correlation coefficients across the United Kingdom in winter, and negative biases across most of England in summer. We combine the Pearson correlation coefficients and bias in the coefficient of linear regression into a combined metric and consider regions where either the bias in the coefficient of linear regression or the bias in Pearson correlation coefficient is significantly dominant over the other. By considering only days with similar North Atlantic driving conditions using Met Office Weather Patterns we can identify regions with significant differences between the two model resolutions that are attributable to the difference in model resolution and structural design. The root mean square error (RMSE) of correlation bias across the United Kingdom is reduced in the 2.2 km compared to the 12 km model data in each season except summer where it is broadly similar. For Weather Pattern


1 | INTRODUCTION
Many societally impactful weather or climate events occur when multiple variables have low, high or extreme values, either simultaneously over a short timescale (e.g., high winds and high precipitation leading to storm damage) or over a longer timescale (e.g., seasonally higher than average temperatures and lower than average precipitation leading to drought and crop failures). Compound multi-variate events can result in greater societal impacts than if two separate uni-variate events had occurred separated in time or space. Studies of weather and climate hazards often consider variables separately, rather than developing understanding of the relationships between variables. This is being addressed by the growing body of weather and climate research focussed on compound events including multi-variate hazards (e.g., Zscheischler et al., 2020; Bevacqua et al., 2021). As compound events can affect society through sustained or repeated conditions, as well as extremes, it is important to understand relationships between variables in general, and their biases in model simulations. Understanding these biases helps in selecting appropriate bias correction techniques for impact studies or societal risk assessments of weather and climate hazards (e.g., François et al., 2020).
The relationship between temperature and precipitation in the United States was studied by Madden and Williams (1978), who determined correlations over 60 years, identified regional variations across the United States, and found that the relationships were robust on various timescales. These relationships across the United States were updated by Zhao and Khalil (1993) for distinct months, and Isaac and Stuart (1992) identified Canadian spatial variation in the temperature-precipitation relationship from daily data.
Globally, Trenberth and Shea (2005) identified that negative correlations between temperature and precipitation are far more widespread, especially south of 50° N, as dry conditions tend to be associated with more sunshine and reduced evaporative cooling, whereas wet summers are cooler. At the higher northern latitudes over winter, positive correlations dominate due to extratropical cyclones with their warm moist advection resulting in increased precipitation, while precipitation is limited in cold conditions due to the atmospheric water holding capacity. The same relationships were identified on interannual timescales due to the dominance of El Niño Southern Oscillation in the tropics, and El Niño Southern Oscillation and the Atlantic Oscillation in the Northern Hemisphere by Adler et al. (2008).
Other studies (e.g., Hardwick Jones et al., 2010; Wang et al., 2017) have considered the relationships between extreme precipitation and temperature, which often require advanced statistical techniques and modelling, but in this study we focus on the overall relationships rather than the behaviour of the extremes.
Other research has focussed on the relationships between temperature and precipitation or their extremes at sub-continent, country or region level spatial scales (e.g., Austria: Schroeer and Kirchengast, 2018; India: Mukherjee et al., 2018; Australia: Bao et al., 2017; Barbero et al., 2018; and high altitude snow cover regions in China: Ban et al., 2021). Wasko (2021) considered whether temperature can be used to inform changes to flood extremes projected under climate change; however, despite good agreement between peak-flow temperature sensitivities and historical trends across Australia, globally there is not sufficient analysis of flood projections from temperature sensitivities. These examples of localisation of climate research are an important part of gaining local scientific understanding that underpins local climate projections and provides the scientific basis for decision making around climate mitigation and adaptation at country or state level, while developing techniques that can be transferred and adapted as relevant to different regions around the world.
To understand how well regional climate models represent the interrelation of meteorological or climate variables, their output can be compared to observational products at similar resolution to understand the model biases. Here we consider how we can combine information about two variables to understand the relationship between them and evaluate model bias in multiple variable relationships. When applying this methodology across a wide geographic area instead of a point location, observational gridded products must be of sufficient quality to make this a meaningful exercise, and we therefore show analysis using the variables temperature and precipitation in this study, as we have confidence that there are low uncertainties in multi-decadal gridded observational products across the United Kingdom for these variables.
Understanding the relative size, seasonality, and spatial variation of bias in intervariable relationships in different sources of climate information will better inform decisions over which products are most appropriate for impact applications involving compound hazards. An improved understanding of the biases in multi-variate relationships can also be used to better understand and address the model deficiencies that lead to these biases, and to help model developers understand in which regions bias correction is most important when considering multi-variate events. It is also key for model users to understand the relationships between variables, and whether they are biased, to help them choose appropriate multi-variate bias correction techniques for their weather and climate problem of interest. Multi-variate bias correction is an emerging field of scientific research and choice of appropriate technique is often not straightforward for model data users (e.g., François et al., 2020). This paper addresses the following key questions with a focus on UK climate: does a regional climate model at two resolutions accurately capture the relationships between temperature and precipitation in the current climate, where are biases largest regionally, and how do these biases differ seasonally?

2 | METHODS
In this work we use data from a 12-member Perturbed Parameter Ensemble (PPE) of both a 12 km resolution and 2.2 km resolution (regridded to 5 km for comparison to observations) regional atmospheric model from the UK Climate Projections (Lowe et al., 2018). The 12 km ensemble projections were forced at the boundaries of a European domain by a corresponding Perturbed Parameter Ensemble of the HadGEM3 global (60 km resolution) climate model (Williams et al., 2018). The outputs from the European 12 km simulations were then used as boundary conditions for 2.2 km simulations over the United Kingdom, producing both a 12 km and a 2.2 km resolution dynamical downscaling for each global ensemble member. The 2.2 km model is based on the operational UKV model used for Met Office weather forecasting, and the higher resolution of the 2.2 km model is better able to represent small scale atmospheric processes and has higher resolution coastal, mountain and city features (Kendon et al., 2020), with other key differences in urban area representation (Keat et al., 2021). Convection is explicitly represented in the 2.2 km model, whereas the 12 km model requires a parameterisation scheme. This explicit representation of convection improves the representation of high temporal frequency (e.g., hourly) rainfall (e.g., Kendon et al., 2014). We use model data for the time period 1980-1999 for the 5 km resolution data, and the time periods 1980-1999 and 1980-2019 for the 12 km resolution data (the 5 km resolution data are currently only available between 1980 and 1999 during the historical period).
The observational data used are the variables temperature and precipitation from HadUK-Grid data (Hollis et al., 2019), which are constructed at 1 km resolution and generated at both 5 and 12 km resolution for comparison to the UK Climate Projections model data. To facilitate better comparison to the ensemble median, we use the observational data from 1960 to 2019, with the longer period removing some of the interdecadal variability in the observations which is not the focus of this study. For both observations and model data, we use daily resolution data.
At each grid cell of the model and observational datasets we evaluate the correlation coefficients (using a Pearson correlation coefficient) and the slope coefficient of linear regression (calculated using an ordinary least squares linear regression model) between the temperature and precipitation values for a climatological season. We use the slope coefficient of linear regression to provide a simple first order representation of the magnitude of the relationship between temperature and precipitation, which is a simplification of the joint probability distribution that would better represent the relationship. It was determined that it is important to consider seasons (winter: December, January, February-'DJF', spring: March, April, May-'MAM', summer: June, July, August-'JJA', and autumn: September, October, November-'SON') separately due to the differing relationships between temperature and precipitation over the year. The coefficient of linear regression between two variables represents the strength of the co-variance of the two variables with time (which can be less formally thought of as how rapidly on average one variable changes with another). The correlation coefficient reflects how tightly constrained the co-variance is (how well the pattern through time of one variable matches that of the other). In Figure 1, we use illustrative data to show how the relationship between two variables in models and observations may have either low or high biases in the correlation coefficients and low or high biases in the coefficient of linear regression dependent on the structure of the multi-variate distribution.
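The two per-grid-cell statistics described above can be illustrated with a minimal Python sketch. It assumes that precipitation is regressed on temperature (the text does not state which variable is treated as dependent), and the function name `pearson_and_slope` and the sample values are hypothetical. The two synthetic series share the same regression slope but differ in how tightly precipitation follows temperature, which is the distinction the two coefficients are intended to capture.

```python
import math

def pearson_and_slope(temperature, precipitation):
    """Return (Pearson r, OLS slope) for one grid cell and one season.

    Assumption: precipitation is the dependent variable in the regression.
    """
    n = len(temperature)
    mean_t = sum(temperature) / n
    mean_p = sum(precipitation) / n
    # Sums of squared deviations and of cross-products
    s_tt = sum((t - mean_t) ** 2 for t in temperature)
    s_pp = sum((p - mean_p) ** 2 for p in precipitation)
    s_tp = sum((t - mean_t) * (p - mean_p)
               for t, p in zip(temperature, precipitation))
    r = s_tp / math.sqrt(s_tt * s_pp)   # how tightly the variables co-vary
    slope = s_tp / s_tt                 # how rapidly one changes with the other
    return r, slope

# Hypothetical daily values, not taken from any dataset
temps = [2.0, 4.0, 6.0, 8.0, 10.0]
tight = [1.0, 2.0, 3.0, 4.0, 5.0]   # precipitation follows temperature exactly
loose = [3.0, 0.0, 3.0, 2.0, 7.0]   # same slope, more scatter

r_tight, slope_tight = pearson_and_slope(temps, tight)
r_loose, slope_loose = pearson_and_slope(temps, loose)
print(r_tight, slope_tight)
print(r_loose, slope_loose)
```

Both series give a slope of 0.5, while the correlation drops from 1 to roughly 0.62 for the scattered series, so a model could match observations in one coefficient and not the other.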
The biases in the correlation coefficient and the coefficient of linear regression between the temperature and precipitation variables are (separately) calculated by subtracting the coefficient calculated from the observational data at a grid-cell from the coefficient calculated from the model data at the same grid-cell. The resulting bias can then be plotted on a map to show any regional variations in biases that may exist across a geographic region. We show the bias calculated from the median correlation coefficients and coefficients of linear regression across the ensemble, with stippling to show where the bias is calculated as significantly different to zero when considering each member of the ensemble separately (using a 1-tailed t-test with p value <.01 to indicate significance at the 99% confidence level). Using the median correlation/coefficients of linear regression can give a lower estimate of bias than if calculating bias for each member of the ensemble separately, which we discuss later in this article.
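As a hedged illustration of the bias and stippling calculation at a single grid cell, the sketch below computes the bias of the ensemble median and a one-sample t statistic across the per-member biases. The helper names and the twelve member values are hypothetical, and the comparison of the absolute t statistic against the one-tailed critical value for 11 degrees of freedom is an assumption about how the test is applied, not a description of the authors' code.

```python
import math
import statistics

def median_bias(member_values, observed_value):
    """Bias of the ensemble median coefficient: model minus observation."""
    return statistics.median(member_values) - observed_value

def t_statistic(member_values, observed_value):
    """One-sample t statistic of the per-member biases against zero."""
    biases = [m - observed_value for m in member_values]
    n = len(biases)
    standard_error = statistics.stdev(biases) / math.sqrt(n)
    return statistics.mean(biases) / standard_error

# Student's t critical value for 11 degrees of freedom, one-tailed p = 0.01
T_CRIT_DF11 = 2.718

# Hypothetical correlation coefficients for 12 ensemble members at one grid cell
members = [0.30, 0.32, 0.28, 0.35, 0.31, 0.29,
           0.33, 0.30, 0.34, 0.27, 0.31, 0.32]
observed = 0.25

bias = median_bias(members, observed)
t = t_statistic(members, observed)
# Assumption: the one-tailed test is applied in the direction of the mean bias
stipple = abs(t) > T_CRIT_DF11
print(bias, t, stipple)
```

Here the median bias is about 0.06 and the members agree closely enough that the cell would be stippled.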
To demonstrate regionally where the biases in correlation and the coefficient of linear regression are together relatively large or small, the metrics are combined using the following percentile-based method. We take the absolute values of the correlation bias data (model correlation value minus observed correlation value), C, and the absolute values of the coefficient of linear regression bias data (model minus observed), R, and extract their values at each grid-point (respectively C_G and R_G). The following assignments are then completed against the distribution of C and R from the whole spatial domain, in the following order to ensure that all grid-points end up with the correct metric value; the assignment values are shown in Figure 2a:
• if C_G is less than the 100th percentile of all C AND R_G is less than the 100th percentile of all R, assign metric value 9;
• if C_G is less than the 80th percentile of all C OR R_G is less than the 80th percentile of all R, assign metric value 8;
• if C_G is less than the 80th percentile of all C AND R_G is less than the 80th percentile of all R, assign metric value 7;
• if C_G is less than the 60th percentile of all C OR R_G is less than the 60th percentile of all R, assign metric value 6;
• if C_G is less than the 60th percentile of all C AND R_G is less than the 60th percentile of all R, assign metric value 5;
• if C_G is less than the 40th percentile of all C OR R_G is less than the 40th percentile of all R, assign metric value 4;
• if C_G is less than the 40th percentile of all C AND R_G is less than the 40th percentile of all R, assign metric value 3;
• if C_G is less than the 20th percentile of all C OR R_G is less than the 20th percentile of all R, assign metric value 2;
• if C_G is less than the 20th percentile of all C AND R_G is less than the 20th percentile of all R, assign metric value 1.
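The ordered assignments listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the percentile helper uses linear interpolation between order statistics (an assumption, since the percentile definition is not given), and every cell is initialised to 9 so that the domain maximum is included in the top bracket (also an assumption, since a strict "less than the 100th percentile" test would leave it unassigned). Later rules overwrite earlier ones, as in the ordered list.

```python
def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def combined_metric(corr_bias, regr_bias):
    """Assign the 1-9 combined metric to every grid cell."""
    c = [abs(v) for v in corr_bias]
    r = [abs(v) for v in regr_bias]
    metric = [9] * len(c)      # assumption: top bracket includes the maximum
    value = 8
    for q in (80, 60, 40, 20):
        c_q, r_q = percentile(c, q), percentile(r, q)
        for i in range(len(c)):
            if c[i] < c_q or r[i] < r_q:     # "OR" rule: 8, 6, 4, 2
                metric[i] = value
        for i in range(len(c)):
            if c[i] < c_q and r[i] < r_q:    # "AND" rule: 7, 5, 3, 1
                metric[i] = value - 1
        value -= 2
    return metric

# Hypothetical biases for six grid cells
corr_bias = [0.1, -0.2, 0.3, -0.4, 0.5, 0.6]
regr_bias = [0.4, 0.1, -0.3, 0.5, 0.2, -0.6]
metric = combined_metric(corr_bias, regr_bias)
print(metric)
```

Note that under these rules a cell with one small bias and one large bias can still receive a low metric value, because each "OR" rule is satisfied by either bias alone.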
FIGURE 2 (a) Metric values assigned to a model gridbox based on the magnitude of the bias in the Pearson correlation coefficient and the bias in the coefficient of linear regression relative to the percentiles of biases nationally. (b) Metric values assigned to a model gridbox based on the magnitude of the bias in the Pearson correlation coefficient and the bias in the coefficient of linear regression relative to the percentiles of biases across the study domain to highlight regions where one bias is substantially greater than the other.
A similar gridded framework is then used to assign values to reveal where the bias from the correlation coefficient or the bias from the magnitude of the coefficient of linear regression is from a higher percentile bracket than the other. This can help identify regions where one of the biases is substantially dominant over the other, noting that this is based on how the local bias is ranked compared to the rest of the study area rather than the absolute magnitude of the biases at a specific location. Where both biases fall in the same quintile, the assigned value is 0, and where the biases fall into adjacent quintiles, the value is either 1 or −1 depending on whether the correlation bias or the regression bias dominates (Figure 2b). When the biases fall into non-adjacent quintiles, the metric values increase or decrease depending on how many quintiles they are apart (Figure 2b); for example, when the correlation bias is in the quintile spanning the 60th-80th percentiles, but the regression bias is in the quintile spanning the 0th-20th percentiles, the assigned value is 3. This metric will not capture biases where the correlation and regression biases lie in the same quintile, so biases can be large but not captured by this metric; instead, this metric is designed to highlight the areas where one bias is substantially greater than the other.
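The dominance metric described above can be sketched as the difference between the quintile indices of the two absolute biases. The sign convention (positive where the correlation bias dominates) is inferred from the worked example in the text, and the rank-based quintile assignment and the function names are assumptions made for illustration.

```python
def quintile_index(value, all_values):
    """Quintile (0 to 4) of value within all_values, based on its rank."""
    rank = sum(1 for v in all_values if v < value)   # values strictly below
    return min(4, rank * 5 // len(all_values))

def dominance_metric(corr_bias, regr_bias):
    """Quintile of |correlation bias| minus quintile of |regression bias|."""
    c = [abs(v) for v in corr_bias]
    r = [abs(v) for v in regr_bias]
    return [quintile_index(ci, c) - quintile_index(ri, r)
            for ci, ri in zip(c, r)]

# Hypothetical biases for five grid cells
corr_bias = [0.4, -0.1, 0.3, 0.2, -0.5]
regr_bias = [0.01, 0.02, -0.03, 0.04, 0.05]
dominance = dominance_metric(corr_bias, regr_bias)
print(dominance)
```

The first cell reproduces the example in the text: its correlation bias lies in the 60th-80th percentile quintile and its regression bias in the lowest quintile, giving a value of 3. The last cell has both biases in the top quintile and scores 0, which shows why large but similarly ranked biases are not highlighted by this metric.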
For these two metrics, we use stippling to show where both the correlation and regression biases are calculated as significantly different to zero when considering each member of the ensemble separately (using a 1-tailed t-test with p value <.01 to indicate significance at the 99% confidence level).
We subset the data and repeat the analysis using Met Office 'Weather Patterns', which are a numerical representation of the state of the North Atlantic and British Isles atmospheric circulation. These Weather Patterns are described by Neal et al. (2016) and are based on methodology used operationally by the Met Office to identify likely weather pattern changes in the medium- to extended-range forecast period. These patterns have been assigned to the UK Climate Projections model simulations by Pope et al. (2021) using the 60 km global model. In this work we show results from two patterns from a set of eight, which is derived from a larger set of 30 patterns. The two patterns we show represent days on which the North Atlantic circulation corresponds to North Atlantic Oscillation ('NAO') positive and negative conditions.
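The subsetting step can be sketched as a simple filter applied before the correlation and regression coefficients are recalculated. The sketch assumes each day carries an integer Weather Pattern label; the function name and the sample values are hypothetical.

```python
def subset_by_pattern(temperature, precipitation, pattern_labels, pattern):
    """Keep only the days assigned to the requested Weather Pattern."""
    kept = [(t, p)
            for t, p, label in zip(temperature, precipitation, pattern_labels)
            if label == pattern]
    return [t for t, _ in kept], [p for _, p in kept]

# Hypothetical daily series at one grid cell with a pattern label per day
temps = [5.0, 7.0, 3.0, 9.0]
precip = [1.0, 0.0, 4.0, 2.0]
labels = [1, 2, 1, 2]

temps_wp1, precip_wp1 = subset_by_pattern(temps, precip, labels, 1)
print(temps_wp1, precip_wp1)
```

The filtered series would then be passed through the same per-grid-cell analysis as the full dataset, so that model and observations are compared on days with similar driving conditions.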
For both the main analysis and the repeated analysis using subsets of days with Weather Patterns 1 (representing NAO negative conditions) and 2 (representing NAO positive conditions), we calculate the root mean squared error (RMSE) of the correlation and regression biases across the United Kingdom for both the 12 and 2.2 km model data for each season. This provides a summary metric which can be used to compare the biases across the United Kingdom across model resolutions, particularly when the same driving conditions are experienced (using the Weather Pattern 1 and 2 subset data). We show the biases as calculated from the model ensemble median correlation and regression metrics, and when the RMSE is calculated separately for each ensemble member (as well as the median of these results). These two methods can yield different results, which we discuss later in this article.
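A small sketch may help show why the two ways of summarising the ensemble can differ. It assumes an unweighted RMSE over grid cells (any area weighting is not described in the text) and uses hypothetical coefficients for three members and two grid cells. When member biases have opposite signs they partly cancel in the ensemble median, so the RMSE of the median bias can be smaller than the RMSE of most individual members.

```python
import math
import statistics

def rmse(bias_field):
    """Unweighted root mean square error over all grid cells."""
    return math.sqrt(sum(b * b for b in bias_field) / len(bias_field))

# Hypothetical correlation coefficients: one list of grid cells per member
members = [
    [0.5, 0.1],
    [0.2, 0.2],
    [-0.1, 0.3],
]
observed = [0.2, 0.2]

# Method 1: bias of the ensemble median coefficient at each grid cell
median_field = [statistics.median(cell) for cell in zip(*members)]
rmse_of_median = rmse([m - o for m, o in zip(median_field, observed)])

# Method 2: RMSE for each member separately, then the median of those values
member_rmses = [rmse([m - o for m, o in zip(member, observed)])
                for member in members]
median_of_rmses = statistics.median(member_rmses)
print(rmse_of_median, member_rmses, median_of_rmses)
```

In this constructed case the ensemble median matches the observations exactly, while two of the three members have an RMSE of about 0.22.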
3 | RESULTS
Figure 3a shows the positive relationship between temperature and precipitation in winter, especially over high topography areas. The biases in the Pearson correlation coefficient between temperature and precipitation in the 2.2 km model data show relatively strong positive values through the east of Scotland, the north of England, and central Southern England (Figure 3b). Note that we include a map detailing the names of the regions of the United Kingdom in the Supplementary Material (SF1). In the 12 km model, the positive biases are largest across areas of high topography (Figure 3c,d), which reflects underlying uni-variate model biases of too much winter precipitation in these regions (readers can see the temperature and precipitation uni-variate biases for each season in Figure SF2 in the Supplementary Material). In these areas in both models, the positive correlation between temperature and precipitation is too strong relative to observations in winter (Figure 3b). However, the bias is not positive everywhere, with some areas of the United Kingdom having a correlation coefficient which is too low in the model relative to observations, such as the far north of Scotland in the 2.2 km model, and east of Inverness and the West Midlands in the 12 km model. This could be caused by too much rainfall over topography resulting in too little rain in other areas in the model relative to observations.
In spring there is a weak negative relationship between temperature and precipitation across the United Kingdom (Figure 3e), and there is a positive bias in Pearson correlation coefficient in the 2.2 km model data across much of the United Kingdom and especially northern areas of high topography (Figure 3f). This pattern is broadly similar in spring in the 12 km model (Figure 3g,h), but the magnitude of positive correlation is greater over the western high topography in the 12 km model than in the 2.2 km model.
In the summer (Figure 3i-l), the 2.2 km model reveals a negative bias in Pearson correlation coefficient across England, Northern Ireland and south-west Scotland, with north-west Scotland experiencing positive biases (Figure 3j). The bias in the relationship in the 12 km model (Figure 3k,l) is similar to the 2.2 km model but with a positive bias across some of Northern Ireland and the south-west of Scotland instead, and a weaker negative bias across Wales and South West England. However, the areas where the two models differ most are not stippled in the 12 km plots, indicating that the bias there is not significantly different from zero across the ensemble at the 99% confidence level.
In autumn (Figure 3m-p), the 2.2 km model and the 12 km model over the same period show a similar negative bias over some of England but show differences in the direction of bias in Wales, Northern Ireland, across Cornwall, and coastal Scotland.
We caveat the above results by highlighting that the above patterns are broadly consistent across ensemble members, but there are distinct differences between some ensemble members for most seasons where the stippling is absent, suggesting that too fine an analysis of the regional patterns, especially for any analysis using only one ensemble member, may not be very meaningful. Despite the differences that we do see between the ensemble members, the broad agreement identified between the ensemble medians of the 2.2 km and the 12 km model data over the 1980-1999 period across some regions of the United Kingdom does suggest that the ensemble median output is a useful tool.
FIGURE 3 The Pearson correlation coefficient between temperature and precipitation in HadUK-Grid observational data for the Northern Hemisphere meteorological winter (a), spring (e), summer (i) and autumn (m), and the bias in the Pearson correlation coefficient between the model and the observations (calculated as model ensemble median minus observation at each gridbox), calculated for the Northern Hemisphere meteorological winter (b-d), spring (f-h), summer (j-l) and autumn (n-p), and for 2.2 km model data over time period 1980-1999 (b,f,j,n), 12 km model data over time period 1980-1999 (c,g,k,o), and 12 km model data over time period 1980-2019 (d,h,l,p). Stippling indicates where the biases across the 12 member ensemble are significantly different from zero at the 99% confidence level.
Figure 4a shows that positive slope coefficients of linear regression are found between temperature and precipitation over much of the United Kingdom. The biases in the coefficient of linear regression are positive across most of the United Kingdom over winter in the 2.2 km model except for the north-west of Scotland (Figure 4b).
In the 12 km model, the bias in linear coefficient of regression tends to be positive along the west coast of the United Kingdom across areas of high topography in winter (Figure 4c,d); the coefficients of linear regression are less positive in the observations (Figure 4a) than the model in these regions (not shown).
In spring (Figure 4e-h), positive biases in the coefficient of linear regression are concentrated down the western side of the United Kingdom across areas of high topography in both the 2.2 km model (Figure 4f) and the 12 km model (Figure 4g,h); the coefficients of linear regression between temperature and precipitation are more negative in the observations (Figure 4e) than in the model over these areas (not shown).
FIGURE 4 The slope coefficient of linear regression between temperature and precipitation in HadUK-Grid observational data for the Northern Hemisphere meteorological winter (a), spring (e), summer (i) and autumn (m), and the bias in the slope coefficient of linear regression between the model and the observations (calculated as model ensemble median minus observation at each gridbox), calculated for the Northern Hemisphere meteorological winter (b-d), spring (f-h), summer (j-l) and autumn (n-p), and for 2.2 km model data over time period 1980-1999 (b,f,j,n), 12 km model data over time period 1980-1999 (c,g,k,o), and 12 km model data over time period 1980-2019 (d,h,l,p). Stippling indicates where the biases across the 12 member ensemble are significantly different from zero at the 99% confidence level.
In summer (Figure 4i-l), there are positive biases in the coefficient of linear regression found in west Scotland, and England and Wales experience negative biases in both model resolutions although these are more widely significant at the 99% confidence level in the 2.2 km model.
In autumn (Figure 4m-p), there are positive biases in the coefficient of linear regression in west Scotland, to a lesser extent in the 2.2 km model (Figure 4n) and a greater extent in the 12 km model (Figure 4o,p). Large parts of England and Wales experience negative biases in both model resolutions, reflecting that the coefficient of linear regression between temperature and precipitation is smaller in the model data than in the observational data.
When the biases are combined in Figure 5, the metric calculated from the 2.2 km model data shows its highest values in winter down the central regions of Scotland and England, and in eastern Scotland (Figure 5a). Note that the levels chosen for this metric are inherently subjective, and therefore the results should be interpreted in a comparative sense rather than in quantitative detail. Stippling here is shown for areas where the regression and correlation biases are both significantly different from zero at the 99% confidence level. In contrast to the 2.2 km model data, the metric calculated from the 12 km model data highlights that in winter, the strongest biases in both correlation and regression exist across the western areas of high topography in the United Kingdom, as well as along the southern coast of England and the north-east of Scotland (Figure 5b,c). In spring, the highest values of the metric are found down the west coast of Scotland and England in both the 2.2 km model (Figure 5d) and the 12 km data (Figure 5e,f).
In the summer, the largest values of the metric in the 2.2 km model are found across some of north-west Scotland, central and southern England and parts of Wales (Figure 5g) but in the 12 km model over the same time period the Scottish biases extend further south, and high biases in England are found mostly in central England (Figure 5h,i), but not in Wales and South West England as for the 2.2 km model (Figure 5g).
In the autumn, the highest values of the metric are found across Wales, Northern Ireland and north-east Scotland and England in the 2.2 km model data (Figure 5j), but in northern Scotland and central England in the 12 km model (Figure 5k,l).
Another metric is considered which highlights areas in which either the bias in coefficient of linear regression or the bias in Pearson correlation coefficient is significantly dominant over the other. In winter, the dominance of the correlation coefficient bias in the 2.2 km model is spread through the north-east of England and throughout central England as well as some parts of Scotland (Figure 6a), but in the 12 km model the correlation bias is only consistently dominant in parts of western Scotland (Figure 6b,c). The regional areas where we identify a dominance of the linear regression bias in winter in the 2.2 km data are eastern Wales and South West England as well as some parts of central England and Scotland, with the 12 km model data also showing a dominance of linear regression bias through central Southern England (Figure 6b,c).
FIGURE 5 A metric combining the bias in the Pearson correlation coefficient and the bias in the coefficient of linear regression (as per Figure 2a) calculated for the Northern Hemisphere meteorological winter (a-c), spring (d-f), summer (g-i) and autumn (j-l), for 2.2 km model data over time period 1980-1999 (a,d,g,j), 12 km model data over time period 1980-1999 (b,e,h,k), and 12 km model data over time period 1980-2019 (c,f,i,l). Stippling indicates where the biases across the 12 member ensemble are significantly different from zero at the 99% confidence level for both the Pearson correlation coefficient and the coefficient of linear regression.
In spring, the correlation biases dominate along the east coast of the United Kingdom, while the regression biases dominate in parts of Wales and South West England in the 2.2 km model (Figure 6d) and 12 km model (Figure 6e,f), though in the 12 km model data neither bias is dominant in Central or Eastern England.
In summer, regression biases are dominant in some areas of Scotland in the 2.2 km model (Figure 6g) and some west coastal areas of the United Kingdom in the 12 km model, although these are not statistically significant due to the different behaviour across ensemble members (Figure 6h,i).
In autumn, the 2.2 km (Figure 6j) and 12 km models (Figure 6k,l) do not show large areas where either bias is dominant, though the latter does show some western areas where the regression bias is dominant.
When we confine our analysis to winter and 'Weather Pattern' 1 days, when the Northern Atlantic driving conditions represent a NAO negative type state, we find strong correlation biases across most of the United Kingdom in both the 2.2 km (Figure 7a (Figure 7i,j). Broadly, in both the 2.2 and 12 km models we identify that regression biases dominate in parts of South West England and Wales, but correlation biases dominate over parts of western Scotland and central England (Figure 7m,n).
When we confine our analysis to winter and 'Weather Pattern' 2 days, when the North Atlantic driving conditions represent a NAO positive type state, we find the pattern of correlation biases has some differences between the 2.2 km (Figure 7c) and the 12 km (Figure 7d) models. In the 2.2 km model for Weather Pattern 2 the greatest positive biases exist across most of the mainland United Kingdom with negative biases in the north-west of Scotland and Northern Ireland (Figure 7c). In the 12 km model data (Figure 7d) there are strong positive biases in South West England and Northern Ireland (the latter not statistically significant), which do not agree with the pattern of the 2.2 km model (Figure 7c). These regional differences also exist between the 2.2 km data (Figure 7g) and the 12 km data (Figure 7h) for the coefficient of linear regression for Weather Pattern 2.
FIGURE 6 A metric indicating dominant bias in either the Pearson correlation coefficient or the bias in the coefficient of linear regression (as per Figure 2b) calculated for the Northern Hemisphere meteorological winter (a-c), spring (d-f), summer (g-i) and autumn (j-l), and for 2.2 km model data over time period 1980-1999 (a,d,g,j), 12 km model data over time period 1980-1999 (b,e,h,k), and 12 km model data over time period 1980-2019 (c,f,i,l). Stippling indicates where the biases across the 12 member ensemble are significantly different from zero at the 99% confidence level for both the Pearson correlation coefficient and the coefficient of linear regression.
For the overall bias metric greater values are found in eastern Scotland, Wales and Northern Ireland for the 2.2 km model data (Figure 7k), but for the 12 km model data the greatest values are across western England and Wales for Weather Pattern 2 in winter (Figure 7l). The correlation and regression bias metric shows correlation Stippling is as described in Figures 3-6 both the 2.2 km model data (Figure 8e) and the 12 km model data (Figure 8f) with some areas tending towards a (statistically non-significant) positive bias in the northwest of Scotland in the 2.2 km model (Figure 8e) and the south coast of England in the 12 km model (Figure 8f). The overall bias metric is largest across Wales and western and central England and Scotland in the 2.2 and 12 km model data (Figure 8i,j, respectively), though larger values spread further south in the 2.2 km model compared to the 12 km model. There are no clear significant areas of regression or correlation bias dominating in either model (Figure 8m,n).
In the summer for Weather Pattern 2, there are positive biases across Scotland and negative biases across England for both the 2.2 km (Figure 8c [...] Figures SF3 and SF4, respectively. In addition to the regional distributions of bias already shown, the biases across the United Kingdom can be summarized with the RMSE of the correlation and regression biases, using all data as well as only days with particular Weather Patterns (here we use Weather Patterns 1 and 2). In most seasons the RMSE of the correlation biases across the United Kingdom is greater in the 12 km model data than in the 2.2 km model data; the exception is summer, where the RMSE is approximately the same for both models for all data and for both Weather Patterns 1 and 2 (Figure 9). For Weather Pattern 1 there is little difference between the median RMSE of the two models in winter and spring (Figure 9). However, the maximum RMSE from the ensemble is consistently greater for the 12 km model across all seasons and each Weather Pattern (Figure 9).
For the regression biases, the RMSE for the 2.2 km data is less than or about the same as that for the 12 km model data in all seasons (Figure 10). For both the 12 and 2.2 km model data and in each season, the RMSE for Weather Pattern 2 is greater than when calculated using all data, by up to a factor of 2 (Figure 10).
FIGURE 9 The root mean square error across all UK gridcells for the Pearson correlation coefficient. This is calculated from the median of the 2.2 km (magenta) and 12 km (cyan) perturbed parameter ensembles when compared to HadUK-Grid (larger outline symbols), but also for each ensemble member separately (2.2 km: solid red symbols connected by solid lines; 12 km: solid blue symbols connected by dashed lines), with the median of the results for the separate ensemble members also shown (smaller red/blue outline symbols). The upward and downward triangles denote the results when just the Weather Pattern 1 days are evaluated and the sideways lying triangles represent Weather Pattern 2 days only.
There are compensating biases across the ensemble, so many, and in some cases all, of the ensemble members have RMSE values for correlation and regression that are larger than the value obtained when the biases are calculated from the ensemble median values of correlation and regression. This highlights that using individual ensemble members for impact-based risk assessment may introduce larger biases in the multi-variate relationships, in addition to the different parameter choices inherent in the perturbed parameter scheme resulting in varying regional patterns of multi-variate bias.
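The effect of compensating biases can be illustrated with a minimal sketch. It assumes an unweighted RMSE over gridcells and a per-gridcell ensemble median; the function names (`rmse`, `ensemble_rmse_summary`) and the toy values are hypothetical and are not taken from the analysis code used in this work.

```python
import math
from statistics import median

def rmse(biases):
    """Unweighted root mean square of per-gridcell biases (assumption: no area weighting)."""
    return math.sqrt(sum(b * b for b in biases) / len(biases))

def ensemble_rmse_summary(members, observed):
    """Return (RMSE of the ensemble-median field, list of RMSEs for each member).

    members  : list of ensemble members, each a list of per-gridcell values
               (e.g., correlation coefficients)
    observed : list of per-gridcell observed values on the same grid
    """
    n_cells = len(observed)
    # RMSE of each member's bias field, evaluated separately
    member_rmse = [
        rmse([member[i] - observed[i] for i in range(n_cells)])
        for member in members
    ]
    # Take the ensemble median at every gridcell first, then compute the bias and RMSE
    median_field = [median([member[i] for member in members]) for i in range(n_cells)]
    median_rmse = rmse([median_field[i] - observed[i] for i in range(n_cells)])
    return median_rmse, member_rmse

# Toy example: two gridcells, three members with opposing biases
observed = [0.0, 0.0]
members = [[0.2, -0.2], [-0.2, 0.2], [0.0, 0.0]]
median_rmse, member_rmse = ensemble_rmse_summary(members, observed)
print(median_rmse, member_rmse)
```

In this toy case the ensemble-median field matches the observations exactly, so its RMSE is zero, while two of the three members each have an RMSE of about 0.2.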

| KEY RESULTS OUTLINED BY SEASON
In winter (December, January, February):

| GENERAL DISCUSSION
In this work we first show the differing magnitude of biases in the ensemble median correlation coefficient and slope coefficient of linear regression, to highlight where there are robust multi-variate biases across the ensemble and how these differ between the two resolutions of the high-resolution regional model available as part of the UK Climate Projections perturbed parameter ensemble. The magnitudes of the biases in the Pearson correlation coefficient and the coefficient of linear regression between temperature and precipitation show some differences in the 2.2 km data relative to the 12 km data in every season, highlighting that, for a particular local area, users may want to consider using the resolution at which multi-variate biases are smaller. However, if this is the lower resolution model, there may be competing drawbacks to this decision, for example that short duration rainfall extremes will be less well represented in the 12 km model (e.g., Kendon et al., 2014). In most seasons, the overall RMSE for the United Kingdom is between 20 and 50% smaller for the 2.2 km model data than for the 12 km model data, so when considering the whole of the United Kingdom, the 2.2 km model is less biased overall in terms of these two metrics. Users of climate data in a specific location might wish to consider how model data are likely to be biased in their area in a uni-variate as well as a multi-variate sense, so that they can potentially correct their future projection data and risk assessments using appropriate bias correction methodologies (e.g., François et al., 2020 outline a range of proposed multi-variate bias correction methodologies). Vrac et al. (2022a) investigate how rank correlation can be included within multi-variate bias correction to reduce bias, finding that it improves climate projection data.
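As a minimal sketch of the two metrics at a single gridcell, the bias in each can be written as the model value minus the observed value. The sketch assumes that precipitation is regressed on temperature using ordinary least squares; the direction of the regression, the function names and the toy values are illustrative assumptions rather than a description of the processing used in this work.

```python
import math

def pearson_and_slope(x, y):
    """Return (Pearson r, ordinary least squares slope of y on x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    return r, slope

def multivariate_bias(model_t, model_p, obs_t, obs_p):
    """Return (correlation bias, slope bias), each defined as model minus observations."""
    r_mod, s_mod = pearson_and_slope(model_t, model_p)
    r_obs, s_obs = pearson_and_slope(obs_t, obs_p)
    return r_mod - r_obs, s_mod - s_obs

# Toy daily series at one gridcell (hypothetical values)
obs_t, obs_p = [0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0]
mod_t, mod_p = [0.0, 1.0, 2.0, 3.0], [0.0, 2.0, 4.0, 6.0]
corr_bias, slope_bias = multivariate_bias(mod_t, mod_p, obs_t, obs_p)
print(corr_bias, slope_bias)
```

In this toy case the model and observations are both perfectly correlated, so the correlation bias is zero, but the modelled slope is twice the observed slope. This is one reason for considering the two metrics together: a relationship can be unbiased in strength yet biased in scaling.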
This work provides users with information about biases in the correlation as well as in the slope coefficient of linear regression. Vrac et al. (2022a) also find that when using an ensemble, the robust part of the climate change signal from the average of the ensemble members yielded the best results, emphasizing the importance of using ensembles and of focusing on mean/median metrics over individual ensemble members, as also shown in this work. This supports wider climate modelling understanding (e.g., Gleckler et al., 2008) that the model mean typically outperforms individual model simulations. Vrac et al. (2022a, 2022b) highlight how, in their work across Europe with global climate models, correlation changes over time in reanalysis data (a reconstruction of past observations using model techniques to fill in observational gaps) in a way that is not captured by global climate models. Here we show the biases in a regional model, which does not show large changes between 1980-1999 and 2000-2019, but we deliberately evaluate the model bias against the longer observational period of 1960-2019 to remove the influence of decadal variation, which we expect because the climate simulations are not hindcasts. Instead, they have been spun off a climate model in steady state and forced with historical emissions data up to 2005, and with climate model projection scenarios thereafter (as per the CMIP5 protocol). This means that a direct comparison of a given time period between observations (or reanalysis) and climate model simulations is not necessarily helpful in quantifying bias or its pattern, because the multi-decadal variability inherent in our climate system is not in sync with that in these models.
In this work, we present a novel way of combining the two metrics to consider which regions experience the most bias overall in the 12 and 2.2 km models, finding that western regions are typically more biased in winter, spring, and autumn, especially in the 12 km model. In the summer the biases are greatest across central and eastern England and western Scotland. Knowing where multi-variate biases are greatest shows future model developers how the multi-variate behaviour of temperature and precipitation varies across the United Kingdom, and helps users interpret where biases may be greatest nationally. We also show where either correlation or regression biases dominate, which helps users identify how data are most biased in some regions. If a user were interested in a smaller area of study, or in increased granularity of these metrics, the same methodology could be used with deciles rather than quintiles. Further understanding can also be achieved locally by a full evaluation of the distribution, perhaps by using techniques such as the joint probability distribution and its bias, over a particular area or at a specific location.
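One simple way to examine the joint distribution and its bias at a location is to compare binned joint relative frequencies of temperature and precipitation between model and observations. The sketch below is only an illustration of that idea: the bin edges, function names and toy values are hypothetical, and a full evaluation might instead use kernel density estimates or copulas.

```python
from bisect import bisect_right

def joint_frequencies(temps, precips, t_edges, p_edges):
    """Relative frequency of (temperature, precipitation) pairs in each 2D bin.

    Bins are half-open [lower, upper); pairs falling outside the edges are ignored.
    """
    n_t, n_p = len(t_edges) - 1, len(p_edges) - 1
    counts = [[0] * n_p for _ in range(n_t)]
    total = 0
    for t, p in zip(temps, precips):
        i = bisect_right(t_edges, t) - 1
        j = bisect_right(p_edges, p) - 1
        if 0 <= i < n_t and 0 <= j < n_p:
            counts[i][j] += 1
            total += 1
    if total == 0:
        raise ValueError("no data fall inside the bin edges")
    return [[c / total for c in row] for row in counts]

def joint_frequency_bias(model, obs, t_edges, p_edges):
    """Per-bin difference (model minus observations) in joint relative frequency."""
    f_mod = joint_frequencies(model[0], model[1], t_edges, p_edges)
    f_obs = joint_frequencies(obs[0], obs[1], t_edges, p_edges)
    return [[m - o for m, o in zip(rm, ro)] for rm, ro in zip(f_mod, f_obs)]

# Hypothetical bin edges: temperature (degrees C) and precipitation (mm/day)
t_edges, p_edges = [0.0, 10.0, 20.0], [0.0, 1.0, 100.0]
obs = ([5.0, 5.0, 15.0, 15.0], [0.5, 0.5, 5.0, 5.0])
model = ([5.0, 5.0, 15.0, 15.0], [0.5, 5.0, 5.0, 5.0])
print(joint_frequency_bias(model, obs, t_edges, p_edges))
```

In this toy case the model places too few cool days in the dry bin and too many in the wet bin, while the warm bins are unbiased.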
When considering the current climate, we show that there are regional variations in the biases which may need close consideration for local impact-based risk assessment and careful bias correction. Here we show that over the whole domain of temperature and precipitation relationships, there are some regions and some seasons (e.g., summer) where there may not be a significant benefit from using the 2.2 km model data rather than the 12 km model data for the multi-variate relationships between temperature and precipitation. However, we emphasize that we are considering the bias in the whole distribution; when considering only extreme impacts and the tails of the distribution, the situation may be very different [e.g., the 2.2 km model will be a better choice for examining short duration rainfall extremes across the United Kingdom (Kendon et al., 2020)]. The approach taken in this framework could be extended to higher temporal frequency (e.g., hourly) data from convection-permitting model simulations, as well as to other multi-variate pairings. However, HadUK-Grid currently only exists at daily resolution for temperature and precipitation, so instead of observational data, an extension of this work might have to rely on reanalysis products for bias correction (as used by Vrac et al., 2022b).
By separating our data into two different Weather Patterns we identify how the biases differ across days with similar wider driving conditions, and we show that differing Weather Patterns are key for determining the multi-variate biases that we have identified across the models, especially in some regions, for example, Northern Ireland. The RMSEs of the correlation and regression biases are typically greater for Weather Patterns 1 and 2 than for all conditions, especially for Weather Pattern 2. For correlation biases, the Weather Pattern 2 RMSE is 1.5 to 2 times greater than for all conditions for both model resolutions; for regression biases, the Weather Pattern 2 RMSE is 2 to 4 times greater than for all conditions for the 12 km model, and around 2 times greater for the 2.2 km model. This could be especially important for risk assessment in regions like South West England and Wales in winter, where biases are very strong in the 12 km model under Weather Pattern 2 conditions. Our work supports the findings of Vrac et al. (2022b) in global climate models that wider driving conditions determine the behaviour of multi-variate biases. This may have implications for potential skill in seasonal to interannual predictions or for assessing risk in particular conditions; if a particular driving pattern is forecast to be more likely in a future season, or under climate change, a user could potentially adjust or caveat their risk assessment of multi-variate impacts according to the biases we have identified. More work is required to enable users to do this easily, but this work is a first step in identifying how bias varies across the whole distribution. This result also highlights to climate model users the importance of understanding wider driving weather patterns, as well as local predicted changes to weather and climate variables, for understanding multi-variate behaviour and therefore, ultimately, for robust weather and climate risk assessment.
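The conditioning on Weather Patterns can be sketched as a simple subsetting step before the correlation is calculated. The sketch assumes that each day carries an integer Weather Pattern label; the function names, the minimum sample size and the toy values are hypothetical and serve only to illustrate the idea.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

def correlation_for_pattern(temps, precips, patterns, wanted):
    """Correlation using only the days whose Weather Pattern label equals `wanted`."""
    pairs = [(t, p) for t, p, w in zip(temps, precips, patterns) if w == wanted]
    if len(pairs) < 3:  # arbitrary illustrative minimum sample size
        raise ValueError("too few days with this Weather Pattern")
    ts, ps = zip(*pairs)
    return pearson(ts, ps)

# Toy daily series with a hypothetical Weather Pattern label for each day
temps = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
precips = [1.0, 2.0, 3.0, 3.0, 2.0, 1.0]
patterns = [1, 1, 1, 2, 2, 2]
r_all = pearson(temps, precips)
r_wp1 = correlation_for_pattern(temps, precips, patterns, 1)
r_wp2 = correlation_for_pattern(temps, precips, patterns, 2)
print(r_all, r_wp1, r_wp2)
```

In this toy case the relationship has opposite sign under the two patterns and cancels when all days are pooled, which illustrates how an all-days metric can mask pattern-dependent behaviour. The same subsetting would be applied to both the model and the observations before computing a bias.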

| KEY CONCLUSIONS
• We show the differing magnitude of biases in the ensemble median correlation coefficient and slope coefficient of linear regression and highlight where there are robust multi-variate temperature-precipitation biases regionally, using the UK Climate Projections perturbed parameter ensemble (12 and 2.2 km resolution) and HadUK-Grid observations.
• We present a way of combining the two metrics of correlation coefficient and slope of linear regression for two variables to consider overall which regions experience most bias in the 12 and 2.2 km models, finding that for temperature and precipitation, western regions are typically more biased in winter, spring, and autumn, especially in the 12 km model.
• We identify how these temperature-precipitation biases differ across days with similar wider driving conditions (Weather Patterns), and we show that differing Weather Patterns are key for determining the multi-variate biases that we have identified across the models, especially in some regions, for example, Northern Ireland.