A new system to classify global climate zones based on plant physiology and using high temporal resolution climate data

Climate classification systems (CCSs) can be used to predict how species' distributions might be altered by climate change, and increasing the reliability of these estimates is an important goal in biogeographical research. We produce an objective, global CCS at high temporal resolution based on plant physiology as a robust way to predict how climate change may impact terrestrial biomes.


| INTRODUCTION
Climate classification systems (CCSs) describe variation in environmental conditions by grouping areas of the globe according to climatic similarity. As climate change research focuses increasingly on predicting potential future climate scenarios, CCSs can be a useful tool to visualize changes in environmental conditions both globally (Elguindi, Grundstein, Bernardes, Turuncoglu, & Feddema, 2014; Rubel & Kottek, 2010) and regionally (Engelbrecht & Engelbrecht, 2016; Rubel, Brugger, Haslinger, & Auer, 2017; Zeroual, Assani, Meddi, & Alkama, 2018). Because of the strong links between climate and plant distributions (Woodward, 1987), CCSs are often constructed with the distribution of biomes in mind, or using variables considered to have biological relevance (e.g. Prentice et al., 1992). Indeed, the distributions of biogeographic realms (Olson et al., 2001), biomes (Köppen, 1936) and plant functional types (Prentice et al., 1992) have themselves been used to define climatic zones of the world. When constructed in this way, a CCS provides a simplification of the relationship between plant distributions and climate that enables the study of both component parts. Under the assumption that plants and climate will remain in equilibrium with each other over spatial and temporal extents (Pearson & Dawson, 2003), CCSs have therefore been applied widely to predict possible plant responses to future climate change scenarios (Cramer & Leemans, 1993). When used to model plant distributions, CCSs become predictive, rather than descriptive, tools (Rubel & Kottek, 2011), and the way they are constructed must reflect this purpose to ensure the reliability of results. However, climate zones are often defined by variations in seasonal or annual average conditions, which correlate only indirectly with the physiological processes that ultimately determine where a species can survive and grow.
When applied to predict future plant distributions, these CCSs may therefore suffer limitations well known in the wider species distribution modelling literature, primarily that correlations between present-day occurrences and distal climate variables may be strong in space but not over time (Dormann et al., 2012; Leemans & Van den Born, 1994). This means that CCSs constructed using distal variables will likely provide less reliable predictions of future plant distributions than those that incorporate proximal drivers, for which direct relationships to physiology ensure transferability into new time frames (Austin, 2002; Kearney & Porter, 2009).
Climate classification systems that consider the physiologically limiting aspects of climate for plants in order to define climate zones do exist. Thornthwaite (1948), for example, recognized the importance of water availability for plants and included a water balance measure. Prentice et al. (1992), using mean temperature of the coldest month, growing degree-days above 5°C and evapotranspiration, all of which are known to have physiological relevance to plants, successfully reproduced the global biome model of Olson, Watts, and Allison (1983). However, the Köppen-Geiger (KG; Geiger, 1954) and Köppen-Trewartha (KT; Trewartha & Horn, 1980) systems have remained most popular. Despite their use of distal rather than proximal predictors of plant distributions to define global climatic zones, this is possibly because they are straightforward to calculate using widely available climate datasets (e.g. WorldClim: Hijmans, Cameron, Parra, Jones, & Jarvis, 2005), simple to understand and easy to use to determine how climatic changes alter the global distribution of zones.
Recently, Metzger et al. (2013) produced a global CCS using just four climate variables known to affect plant physiological processes.
Their CCS was constructed at high (~1 km²) spatial resolution using the WorldClim global climate dataset (Hijmans et al., 2005). However, the temporal resolution of physiological climate variables may also be crucial to retain their proximity to biological processes and thus predict reliably how climate conditions influence species distributions. Kearney, Matzelle, and Helmuth (2012), for example, compared the predictive performance of a species distribution model constructed using physiological metrics derived from both monthly and daily data. They found that average monthly data obscured important periods of high and low temperatures and led to biased estimates of the climatic impacts on survival, growth and reproduction.
The authors recommended using fine (at least daily) resolution data wherever possible, as some of the benefits of using physiologically relevant climate variables may otherwise be lost when the temporal resolution of climate data used in their construction is coarse.
Increasing the temporal resolution of climate data may therefore support more accurate predictions of how altered environmental conditions will affect species' physiology and ultimately their distributions (e.g. Montalto, Sarà, Ruti, Dell'Aquila, & Helmuth, 2014).
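The effect of temporal averaging described above can be illustrated with a minimal Python sketch. It compares growing degree-days above a 5°C base (the threshold mentioned for Prentice et al., 1992) computed from daily temperatures with the same quantity estimated from the monthly mean alone. The temperature series is invented for illustration and the function names are hypothetical; this is not the procedure used to derive the variables in this study.

```python
from statistics import mean

BASE_TEMP_C = 5.0  # illustrative base temperature for growing degree-days


def gdd_from_daily(daily_temps_c, base=BASE_TEMP_C):
    """Sum the daily exceedances above the base temperature."""
    return sum(max(t - base, 0.0) for t in daily_temps_c)


def gdd_from_monthly_mean(daily_temps_c, base=BASE_TEMP_C):
    """Estimate the same quantity using only the monthly mean temperature."""
    monthly_mean = mean(daily_temps_c)
    return max(monthly_mean - base, 0.0) * len(daily_temps_c)


# Hypothetical 30-day month alternating between cold (0 °C) and mild (10 °C) days.
daily = [0.0, 10.0] * 15

daily_gdd = gdd_from_daily(daily)           # 15 mild days, each 5 °C above the base
monthly_gdd = gdd_from_monthly_mean(daily)  # the mean equals the base, so no exceedance
```

In this contrived month the daily series accumulates 75 degree-days, whereas the monthly mean (exactly 5°C) yields none, which is the kind of bias that coarse temporal resolution can introduce.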
Furthermore, the aspects of climate that impact directly on species' physiological processes, and therefore their ability to colonize or persist in an area, may require consideration of environmental conditions over reasonably short, sensitive periods of their life cycle. Chuine and Beaubien (2001), for example, showed how drought only affected the survival of tree species during the growing season, so that considering seasonal variation in water availability led to better predictions of their distributions. Similarly, Hatfield and Prueger (2015) reported that higher growing season temperatures increased phenological development in maize, but that the greatest impact on grain yield was determined by increases during the reproductive phase. High temporal resolution climate data can not only capture variation within these periods (Bateman, VanDerWal, & Johnson, 2012), but may be required if physiologically relevant aspects of climate act over short time scales that cannot be represented by longer-term means. Although longer-term climate averages may correlate with physiological predictors (e.g. Reich & Oleksyn, 2004), this relationship may break down over spatial and temporal scales and lead to errors when extrapolating results into novel environments or time frames (Elith & Leathwick, 2009).
In this study, we produce a global CCS at high temporal resolution for the 21st century and beyond. To achieve this, we use climate variables identified in an extensive review (Gardner, Maclean, & Gaston, 2019) as physiologically relevant to plants. These variables include those related to soil water availability and climatic variation within the growing season. We use cluster analysis to group climatically similar areas objectively and independently of any a priori assumptions about the vegetation that should be found within different zones (Metzger et al., 2013; Unal, Kindap, & Karaca, 2003).
We discuss the defining features of each climate zone in our new physiological CCS.
Our approach differs from the Köppen systems, where boundaries of zones were chosen to reflect vegetational boundaries (Prentice et al., 1992). However, by comparison with the Köppen CCSs, we can check that our physiological variables effectively explain recognized divisions in global vegetation patterns (Metzger et al., 2013).
We cluster our physiological variables into five- and six-cluster solutions to reflect the number of zones in the KG and KT CCSs, respectively, and, using kappa statistics, find good to very good agreement with both Köppen schemes. We interpret this as reassurance of the value of our new physiological CCS, but also highlight areas where the physiology and Köppen zones do not match, particularly in temperate areas. We suggest that in these places, the Köppen systems do not reflect physiologically relevant variation in climate well, and we discuss the implications of this when using Köppen CCSs to predict how climate change may impact future plant distributions.
There is currently no global climate database available for some of the required physiological variables at both high temporal and high spatial resolution and so by prioritizing the use of high temporal resolution climate data to construct our CCS it was necessary to compromise on spatial resolution. At coarser spatial resolutions, even 'physiological' variables may become mismatched to the conditions species experience (Potter, Woods, & Pincebourde, 2013).
Although in many cases this problem is overcome through 'mean field approximation', whereby macroclimate is a good predictor of the aggregated responses of many individuals to climate (Bennie, Wilson, Maclean, & Suggitt, 2014), constructing a physiological CCS at both high spatial and high temporal resolution remains a future goal. Our physiological CCS might not, therefore, be the final step towards a global climate classification for plants, but it is the first both to consider physiology and use high-resolution temporal climate data to construct these variables. We hope that as techniques to model climate at high temporal and high spatial resolution improve, our CCS can be updated to further improve predictions of how climate change may impact plant distributions worldwide.

| Physiologically relevant climate variables
The climate variables used were 10 physiologically meaningful variables identified from the peer-reviewed literature on plant physiology (Gardner et al., 2019). These variables were as follows: (a) soil water content during the growing season; (b) mean growing season temperature; (c) growing season precipitation; (d) total summer precipitation; (e) total annual precipitation; (f) growing season length; (g) maximum temperature during the growing season; (h) mean annual temperature; (i) mean summer temperature; and (j) summer soil water content (see Appendix S1 for further information).
To construct the physiology variables, we downloaded the following climate data at 4× daily (6-hourly) temporal resolution and 2.5° spatial resolution from the NOAA website (https://www.esrl.noaa.gov/psd/) for each year in the period 2000-2017: (a) surface skin temperature, (b) air temperature, (c) relative humidity, (d) net short-wave radiation, (e) downward short-wave radiation, (f) net long-wave radiation, (g) wind speed, (h) volumetric soil moisture (0-10 cm below ground level) and (i) water equivalent of snow depth.
These data are combinations of modelled forecasts and hindcasts, calibrated and tested against observed data (Kalnay et al., 1996).
Precipitation data (Xie et al., 2007) were downloaded from NOAA as surface-level daily totals at 0.5° spatial resolution and resampled to 2.5° resolution using a bicubic spline (Forsythe, Malcolm, & Moler, 1977). We derived hourly estimates for all variables except precipitation, which was retained as a daily variable due to the stochasticity of precipitation events. Details of data download and processing can be found in the Supporting Information (Appendix S1).

| Cluster analysis
To determine the best number of clusters to describe the physiology data, we calculated the Euclidean distance between the predicted values for principal components (PCs) 1 and 2 and used the NbClust function in R to identify the clustering solution with the lowest within-cluster variance (specifying Ward's clustering method and a minimum of 2 up to a maximum of 30 clusters).
We mapped the results on a global grid and zone classifications were decided based on variable loadings on the axes of the first two PCs, the average values of variables across each zone and visual similarity of zones to the most recent update of the KG climate classification map (Beck et al., 2018). The global maps produced by Beck et al. (2018) update the Köppen CCS for present-day climate at ~1 km² resolution and apply global Köppen climate classes at the same spatial resolution to a future climate scenario under representative concentration pathway 8.5.
As climate globally is expected to experience yearly variation, we also tested the optimal clustering solution for different time periods to ensure the relationship between climate variables and our zone classifications for the period 2000-2017 was representative of other years. This was found to be true and we report the results in Supporting Information (Appendix S2,

| Comparison with Köppen CCSs
Global maps of the KG (Geiger, 1954;Köppen, 1936) and KT (Trewartha & Horn, 1980) CCSs were made following the definitions of zones described in Belda, Holtanová, Halenka, and Kalvová (2014; see Table 1) and with the precipitation and temperature data used to construct the 10 physiology variables.
To compare the climatic variation described by our physiology variables to the Köppen maps, we first conducted two k-means cluster analyses. k-means clustering requires the number of clusters (k) to be specified by the user and is therefore useful when seeking a particular clustering solution. We calculated the Euclidean distance between the predicted scores for PCs 1 and 2 and performed two k-means cluster analyses on these values (Hartigan & Wong, 1979): (a) specifying a five-cluster solution, and (b) specifying a six-cluster solution. These k values match the number of zones for the basic KG and KT CCSs, respectively. We examined the result of each clustering solution to assign zone classifications.
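The k-means step described above can be sketched in a few lines of Python. Note that this sketch uses the simpler Lloyd iteration with user-supplied starting centroids, not the Hartigan and Wong (1979) algorithm cited in the text, and it uses k = 2 on invented (PC1, PC2) scores purely for brevity; the study itself specified five and six clusters. All names and data here are hypothetical.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: give each point the index of its nearest centroid
        # (Euclidean distance in PC space).
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its member points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid unchanged
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical (PC1, PC2) scores for six grid cells forming two clear groups.
pc_scores = [(-2.0, 0.1), (-1.8, -0.2), (-2.2, 0.0),
             (1.9, 0.3), (2.1, -0.1), (2.0, 0.2)]
labels, centres = kmeans(pc_scores, centroids=[pc_scores[0], pc_scores[3]])
```

Because k is fixed in advance, the cluster labels carry no meaning by themselves; as in the text, each cluster still has to be inspected and assigned a zone name before it can be compared with a Köppen map.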
We then used the kappa statistic to compare zone assignments for the physiology five- and six-cluster solutions with those of the KG and KT systems. Following Monserud and Leemans (1992) and Landis and Koch (1977), we considered values <0.4 poor or very poor, 0.4-0.55 fair, 0.55-0.7 good, 0.7-0.85 very good and >0.85 excellent agreement between zone assignments.
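As an illustration of this comparison, the following Python sketch computes an unweighted Cohen's kappa between two zone maps flattened to lists of grid-cell labels and converts the value into the agreement categories listed above. It assumes that the kappa statistic referred to is Cohen's kappa, and the treatment of values falling exactly on a category boundary is also an assumption, since the text does not specify it. The zone labels are invented.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed proportion of cells assigned to the same zone by both maps.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from the marginal zone frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[zone] * freq_b[zone] for zone in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both maps place every cell in the same single zone
    return (observed - expected) / (1.0 - expected)


def agreement_category(kappa):
    """Map kappa to the categories used in the text (boundary handling assumed)."""
    if kappa < 0.4:
        return "poor or very poor"
    if kappa < 0.55:
        return "fair"
    if kappa < 0.7:
        return "good"
    if kappa <= 0.85:
        return "very good"
    return "excellent"


# Hypothetical zone assignments for ten grid cells; one temperate cell is
# placed in the tropical zone by the second map.
physiology = ["tropical"] * 3 + ["dry"] * 3 + ["temperate"] * 2 + ["polar"] * 2
koppen = ["tropical"] * 3 + ["dry"] * 3 + ["tropical", "temperate"] + ["polar"] * 2

kappa = cohens_kappa(physiology, koppen)
category = agreement_category(kappa)
```

Here nine of ten cells agree (observed agreement 0.9) against a chance agreement of 0.27, giving a kappa of about 0.86.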
Given that the global climate data used were spatially coarse, we repeated the process above using hourly climate data at 0.25° × 0.25° spatial resolution for a case study area (90°S to 90°N latitude and 20°W to 50°E longitude) to test the sensitivity of our results to spatial resolution. Full methods and results for this case study are reported in Supporting Information (Appendices S1 and S2).
All data analyses were carried out in the statistical programme R (R Core Team, 2018).

| Principal component analysis
The first two PCs explained 88% of the variation in climate variables (see Appendix S2, Table S2.1). PC1 explained 64% of the variance

TABLE 1 Zone names and descriptions for the Köppen-Geiger (KG) and Köppen-Trewartha (KT) climate classification systems, reproduced from Belda et al. (2014). For KG, rainfall is concentrated in summer/winter if 70% of the annual precipitation falls in summer/winter, respectively. Summer/winter is April-September/October-March in the Northern Hemisphere and October-March/April-September in the Southern Hemisphere. If rainfall is concentrated in neither summer nor winter, it is classed as evenly distributed. For KT, R is Patton's precipitation threshold, defined as R = 2.3T − 0.64P_w + 41, where T is the mean annual temperature (°C) and P_w is the percentage of annual precipitation (cm) occurring in winter (winter as defined above; Patton, 1962).

Zone name: Criteria

Köppen-Geiger
Tropical: Temperature of the coldest month > 18°C; mean annual rainfall (cm) is above the value given for the Dry zone
Dry: If rainfall is concentrated in summer: mean annual rainfall (cm) < 2 × mean annual temperature (°C) + 28
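The rules stated in the Table 1 caption and the rows shown above can be expressed as a short Python sketch. It covers only what is given here: the 70% seasonal concentration rule, the KG Dry threshold for summer-concentrated rainfall, and Patton's threshold R. Interpreting "70%" as "at least 70%" is an assumption, the function names are hypothetical, and the sketch does not attempt the remaining KG or KT criteria, which are not reproduced in this section.

```python
def rainfall_concentration(monthly_precip, northern_hemisphere=True, share=0.70):
    """Classify rainfall as 'summer', 'winter' or 'even' from 12 monthly totals (Jan-Dec)."""
    if len(monthly_precip) != 12:
        raise ValueError("expected 12 monthly precipitation totals")
    total = sum(monthly_precip)
    if total <= 0:
        return "even"  # no rainfall, so no seasonal concentration
    apr_sep = sum(monthly_precip[3:9])  # April to September
    oct_mar = total - apr_sep           # October to March
    # Summer is April-September in the north and October-March in the south.
    summer, winter = (apr_sep, oct_mar) if northern_hemisphere else (oct_mar, apr_sep)
    if summer / total >= share:  # assumption: 70% means at least 70%
        return "summer"
    if winter / total >= share:
        return "winter"
    return "even"


def kg_dry_threshold_summer(mean_annual_temp_c):
    """KG Dry-zone rainfall threshold (cm) when rainfall is concentrated in summer."""
    return 2.0 * mean_annual_temp_c + 28.0


def patton_threshold(mean_annual_temp_c, winter_precip_percent):
    """Patton's precipitation threshold R = 2.3T - 0.64*P_w + 41, as in the Table 1 caption."""
    return 2.3 * mean_annual_temp_c - 0.64 * winter_precip_percent + 41.0


# Hypothetical site: monthly rainfall (cm) peaking mid-year, mean annual temperature 20 °C.
monthly_cm = [1, 1, 1, 5, 10, 15, 15, 10, 5, 1, 1, 1]
season_north = rainfall_concentration(monthly_cm, northern_hemisphere=True)
season_south = rainfall_concentration(monthly_cm, northern_hemisphere=False)
is_kg_dry = sum(monthly_cm) < kg_dry_threshold_summer(20.0)
```

For this invented site, annual rainfall is 66 cm against a summer-concentrated Dry threshold of 68 cm, so it would fall below the KG Dry threshold if located in the Northern Hemisphere.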

| Optimal clustering solution
The best number of clusters for the data was found to be 10 (see Appendix S2, Table S2.2). Figure 1 shows  Table S2.3). Figure 2 shows the zones of the optimal clustering solution on the first two PC axes. The greatest mismatch between the five-cluster physiology and KG CCSs was in the assignment of the temperate zone. The greatest mismatch between the six-cluster physiology and KT CCSs was in the assignment of the temperate and continental and subtropical zones (Appendix S2, Tables S2.5 and S2.6).

| Comparison to Köppen classification systems
Our five-cluster physiology solution allocated more areas classified as temperate in the KG system to the tropical zone. Our six-cluster physiology solution allocated more areas classified as temperate and continental in the KT system to the boreal zone.
Many areas classified as subtropical in the KT system were assigned to the temperate and continental zone in the six-cluster physiology solution.

| DISCUSSION
The distribution of plants worldwide is strongly influenced by climate (Woodward, 1987). This allows CCSs not only to describe climatic variation but also to be used predictively to assess how changing environmental conditions may alter patterns of vegetation. When applied in this way, an important assumption of the CCS is that plant distributions correlate directly with the climate variables used to delimit zones and that this relationship is conserved over the spatial and temporal extents over which predictions are made (De Castro, Gallardo, Jylha, & Tuomenvirta, 2007). Physiologically relevant variables have direct links to biological mechanisms or processes of the study species and will therefore be causally linked to a species' distributional response to climate both in space and in time (Austin, 2002). Their use in CCSs may support more robust estimates of how species may move in response to future climate

| Importance of using physiological variables
Constructing a CCS using only simple temperature and precipitation indices may neglect important aspects of climate variation for plants. The timing of climatic events during the growing season period, soil water content and the mutual availability of temperature and water for plants, for example, are important physiological variables that should not be overlooked (Gardner et al., 2019).
Our physiological CCS reflects in aggregate the variation in these factors during the periods most important for plant growth. Areas with no growing season or where temperatures rarely exceed zero degrees Celsius, for example, are clearly distinguishable. Tropical zones are also highlighted, which is unsurprising given that variables relating to water availability and temperature loaded strongly in both PC1 and PC2.
It is widely acknowledged that conventional CCSs do not always clearly identify tropical savanna biomes (e.g. Prentice et al., 1992) and it has been suggested that soil water balance (Scholes & Walker, 1993) and rainfall seasonality (Lehmann, Archibald, Hoffmann, & Bond, 2011) are important in explaining the absence of trees. Our CCS, which distinguishes between tropical wet and tropical savanna based on growing season conditions, and particularly growing season precipitation, could reflect this. Given that tropical savannas support high species endemism (Abreu et al., 2017), but have also been considered one of the world's most threatened ecosystems (Hoekstra, Boucher, Ricketts, & Roberts, 2005), it could be important to consider how the impacts of climate change on rainfall might cause ecosystem shifts from tropical savanna towards tropical wet climate zones and increase woody vegetation cover (Tews & Jeltsch, 2004). This could support the use of our physiology CCS to predict how climate change may impact plant distributions.

| The importance of temporal resolution
Although other CCSs based on plant physiology do exist and are available at high spatial resolution (e.g. Metzger et al., 2013) can have stronger impacts on physiological processes than changing mean climate (Reyer et al., 2013), or can advance changes in species composition in response to altered average environmental conditions (Jentsch, Kreyling, & Beierkuhnlein, 2007), are also less likely to be overlooked or underestimated. Overall, this means that the variables maintain a proximal link to plant physiology that might be lost with coarser-resolution climate data. In such cases, even 'physiological' variables could become dissociated from the processes that drive species' responses to climate and this could potentially negate any advantage of using a mechanistic approach (Kearney et al., 2012).
Climate change is expected to extend or alter conditions within the growing season period (Linderholm, 2006) and increase the frequency and intensity of extreme weather (Pachauri et al., 2014). It may therefore become increasingly important to use data with fine temporal resolution in order to quantify climate conditions within these important periods or during anomalous events to predict accurately how climatic suitability for species might be affected (Kearney & Porter, 2009). CCSs constructed at high temporal resolution can account for short-term weather anomalies that may be obscured by monthly mean data (Potter et al., 2013). The use of high temporal resolution data will therefore support better estimations of climatic suitability and positively impact the reliability of species range predictions (Serra-Diaz et al., 2016).

| Correspondence to the Köppen systems
We find that both the five- and six-cluster CCSs constructed with physiologically relevant variables show good overall agreement with the KG and KT schemes, respectively. Individually, all physiology zones show fair or better agreement with their equivalent Köppen zone. This verifies that our CCS can reproduce present-day vegetation patterns but, in areas of discrepancy, highlights where CCSs constructed using distal climate variables (at the same spatial and temporal resolution) may fail to capture physiologically relevant aspects of climate variation.
Areas with climatic extremes, such as the Arctic and Antarctic (very cold temperatures) and the Sahara and Kalahari Deserts in Africa and the deserts of Australia (hot with limited precipitation), showed better agreement between the Köppen and physiological CCSs. On this basis, we conclude that the variables used to construct the KG and KT CCSs effectively capture the extremity and physiologically limiting nature of very cold and very dry environments and may therefore be good proxies for physiology variables in such areas. However, there was mismatch in the assignment of the temperate zone between the physiology and Köppen CCSs and, similar to Wang and Overland (2004), we find that our five- and six-cluster  (Feddema, 2005) and to construct the climate of these regions therefore requires consideration of seasonal variation among a suite of variables; this complexity and variability cannot be captured by considering only extremities in one or more parameters.

| Climate classification in a changing climate
Climate change is expected to alter the global distribution of climate zones (Rubel & Kottek, 2010), but predictions suggest that changes will be especially large in areas currently classified as temperate. Rubel and Kottek (2010) Many areas classified by the KG system as temperate were assigned to the tropical zone in our five-cluster solution. Tropical species can be more sensitive to temperature change as they often live close to their optimal temperature (Deutsch et al., 2008). They may therefore reach their physiological upper thermal tolerance limit sooner and be more vulnerable to climate change than might be predicted using the current KG system. If these species are unable to migrate or adapt, they may be at greater risk of extinction due to global warming than species living in cooler climates. With the addition of the subtropical zone in the KT system the opposite problem is revealed, as many areas in the KT temperate zone were classified as boreal in the six-cluster solution. In these cases, the KT temperate zone is warmer than the physiological temperate zone, so species may have more capacity to resist climate warming than might be predicted using the KT system. If species assigned to a temperate and continental climate using the KT classification system can occupy a cooler climatic niche, this might enable them to tolerate unusual chilling events in an expanded range and facilitate their successful poleward movement (Wen, Qin, Leng, Zhu, & Cao, 2018).
In both cases, a physiologically based CCS constructed at sufficiently high temporal resolution could provide a more reliable estimate of the areas or species most vulnerable to climate change and more accurately predict future range expansions and contractions.
This is because physiology variables correlate directly with biological mechanisms driving species distributions; a relationship which can be extrapolated over space and time.
It is significant and encouraging that polar regions are well-described by the Köppen systems, as the ecosystems in these areas have also been reported as highly vulnerable to climate change (Larsen et al., 2014). Although these areas typically have very low plant diversity (Cavieres et al., 2016), correctly identifying where climate conditions are moving away from a polar classification and towards those characteristic of a warmer climate zone could help to suggest areas where species currently limited by cold temperatures may be able to survive in the future (Bravo et al., 2001;Hinzman et al., 2005). This may also aid better understanding of the ecophysiology of plants specifically adapted and restricted to extremely cold environments and their possible responses to climate warming.

| Using the Köppen systems
The precipitation and temperature data required to construct CCS maps are widely available from existing climate datasets at fine spatial resolution (~1 km²; e.g. Hijmans et al., 2005 By compromising spatial resolution to favour high temporal resolution and to construct more complex climate variables, physiological variables may themselves become 'distal' to physiological processes operating at the organismal scale (Potter et al., 2013). However, the good correlation we observe between the five- and six-cluster physiology maps and the Köppen systems was shown to be consistent down to 0.25° spatial resolution across our case study area (reported fully in Appendices S1 and S2), providing further support that known present-day patterns of vegetation (on which the Köppen systems are based) are well-described using our physiology variables and data with coarse spatial resolution.
Incorrect estimations of the distributions of species or their habitats could have severe consequences where this information is used to inform conservation priorities or land management strategies (e.g. Santini & di Paola, 2015). We therefore suggest caution is applied when using the Köppen systems to predict plant responses to climate in areas we highlight as correlating poorly with the physiology zones. Based on our analyses, we recommend constructing CCSs using climate data with high temporal resolution:
1. in temperate areas;
2. in areas with high temporal variance in climate conditions, such as the northern half of the northern hemisphere (Feddema, 2005);
3. where climatic variability, such as the frequency or intensity of extreme events, is predicted to increase significantly as a result of climate change;
4. where data are available at both high spatial and temporal resolution and computational capacity permits use of these data.
We also encourage ecologists and climate scientists to consider expanding monitoring networks for physiological variables at fine spatial and temporal resolutions or increasing efforts to develop better methods to calculate these variables from existing climate datasets.
To predict plant responses to climate change reliably, a CCS should reflect physiological mechanisms of the study species.
However, consideration for the temporal, as well as spatial resolution of the climate data used to construct these variables may be required. Our physiological CCS shows how it is possible to construct an objective, physiological alternative to the popular Köppen systems that captures the critical timing of climatic events within the growing season period and the mutual availability of temperature and water for plant physiological processes. We encourage the use of our CCS to support more reliable predictions of how altered environmental conditions may impact the global distribution of vegetation zones, but also urge the development of physiologically relevant climate data at finer temporal and spatial resolutions to strengthen further these predictions. This will influence positively the ability to manage ecosystems appropriately and protect global biodiversity for the future.

ACKNOWLEDGEMENTS
This work was funded by the Natural Environment Research Council (NERC; Grant Reference: NE/P01229/1) with support from Cornwall Council.

DATA AVAILABILITY STATEMENT
The R script used to construct the physiology variables has been released as an R package (climvars) on GitHub (https://github.com/ilyamaclean/climvars). The assembled climate datasets used in our analyses are also available via this repository.