Incorporating physiology into species distribution models moderates the projected impact of warming on selected Mediterranean marine species

1090 –––––––––––––––––––––––––––––––––––––––– © 2020 The Authors. Ecography published by John Wiley & Sons Ltd on behalf of Nordic Society Oikos This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. Subject Editor and Editor-in-Chief: Miguel Araújo Accepted 18 February 2020 43: 1090–1106, 2020 doi: 10.1111/ecog.04423 43 1090–1106

Species distribution models (SDMs) correlate species occurrences with environmental predictors, and can be used to forecast distributions under future climates. SDMs have been criticized for not explicitly including the physiological processes underlying the species response to the environment. Recently, new methods have been suggested to combine SDMs with physiological estimates of performance (physiology-SDMs). In this study, we compare SDM and physiology-SDM predictions for select marine species in the Mediterranean Sea, a region subjected to exceptionally rapid climate change. We focused on six species and created physiology-SDMs that incorporate physiological thermal performance curves from experimental data with species occurrence records. We then contrasted projections of SDMs and physiology-SDMs under future climate (year 2100) for the entire Mediterranean Sea, and particularly the 'warm' trailing edge in the Levant region. Across the Mediterranean, we found cross-validation model performance to be similar for regular SDMs and physiology-SDMs. However, we also show that for around half the species the physiology-SDMs substantially outperform regular SDM in the warm Levant. Moreover, for all species the uncertainty associated with the coefficients estimated from the physiology-SDMs were much lower than in the regular SDMs. Under future climate, we find that both SDMs and physiology-SDMs showed similar patterns, with species predicted to shift their distribution north-west in accordance with warming sea temperatures. However, for the physiology-SDMs predicted distributional changes are more moderate than those predicted by regular SDMs. We conclude, that while physiology-SDM predictions generally agree with the regular SDMs, incorporation of the physiological data led to less extreme range shift forecasts. The results suggest that climate-induced range shifts may be less drastic than previously predicted, and thus most species are unlikely to completely disappear with warming climate. Taken together, the findings emphasize that physiological experimental data can provide valuable supplemental information to predict range shifts of marine species.

Introduction
Global climate change is strongly affecting terrestrial and marine biota, evident in species range shifts and changes in phenology (Parmesan 2006, Elith et al. 2010, IPCC 2014, Sunday et al. 2015. As temperatures are expected to continue rising in the oceans, species must either adapt in-situ or respond to the changing climate by continuously shifting their distribution to preserve their climatic niche (Lawler et al. 2006, Coll et al. 2012, Marras et al. 2015, García Molinos et al. 2016. Forecasting how species distributions will shift with climate change is critical for understanding the potential fate of the oceans and for creating decision making support tools for biodiversity management (Lawler et al. 2006, Bates et al. 2013, Guisan et al. 2013. Species distribution models (SDMs) are being increasingly used to predict distributional responses to changing climate (Guisan andThuiller 2005, Kearney andPorter 2009). These models typically correlate species occurrence (sometimes including absences) or abundance with environmental predictors (Elith and Leathwick 2009). The model is developed using the current distribution of the studied species, the realized niche, which can be extrapolated in space or time to forecast the changes in species distributions caused by climatic changes (Morin and Lechowicz 2008, Fitzpatrick and Hargrove 2009, Woodin et al. 2013. SDMs are relatively easy to apply and can be used for any species as long as there are sufficient data on their distribution and the relevant environmental predictors (Elith and Leathwick 2009). However, SDMs have been criticized in that they do not take into account the physiological processes behind species range shifts (Lawler et al. 2006, Kearney et al. 2010, Thuiller et al. 2013. SDMs are most useful when species realized niches are representative of their fundamental niche, shaped by the underlying physiological constraints. However, if the fundamental and realized niches diverge, correlative SDMs may be inaccurate in predicting the future distribution of species (Elith et al. 2010, Parravicini et al. 2015, Rilov et al. 2019).
An alternative strategy for understanding and predicting species distributional response to climate change is based on species physiology (Cooke et al. 2013), i.e. the fundamental niche (Morin and Lechowicz 2008). Physiological models take into account the biological mechanism behind the species response to the changing environment, and thus can be used more confidently for forecasting the response to novel environmental conditions (Kearney et al. 2010, Cabral and Kreft 2012, Cheaib et al. 2012, Cooke et al. 2013, Levy et al. 2016. The simplest form of these models utilizes a physiological threshold, such as temperature, to predict species future distributions (Martínez et al. 2014). Complex physiological models require data on the relationship between the species environmental conditions and its performance (Buckley et al. 2011, Childress andLetcher 2017). However, physiological models, being based on the fundamental niche, tend to over-estimate the distribution of target species, thereby affecting model performance (Pearson and Dawson 2003).
Given the shortcomings of each method alone, it has been recommended that physiological estimates of species performance under varying climatic conditions should be incorporated into correlative SDMs (Woodin et al. 2013, Martínez et al. 2014, Talluto et al. 2016. SDMs that combine occurrences and physiological data can provide more robust forecasts of species distributions when extrapolating to novel climate scenarios (Kearney and Porter 2009). However, most attempts so far to combine SDMs with physiology have been based on overlying two separate models (Martínez et al. 2014). Due to lack of appropriate methods and data requirement constraints, a simple integration of correlative SDMs and species physiology has been difficult to achieve (Kearney and Porter 2009, Cheung et al. 2012, Talluto et al. 2016, Chapman et al. 2017, Chefaoui et al. 2019, Rodríguez et al. 2019. Recently, a new modeling technique has been suggested that uses a Bayesian approach to integrate physiology into correlative SDMs within a single framwork (Talluto et al. 2016). Applications of this method are, however, still scare.
In this study, we combined correlative SDM and physiological data to predict future distributions of indigenous and non-indigenous species in the Mediterranean (Albouy et al. 2013). Satellite data have suggested that the average sea surface temperature in the Levant region (the eastern part of the Mediterranean Sea along the coasts of Turkey, Syria, Lebanon, Israel, Egypt and Cypress) has increased by about 1.1 degrees in the past century, and it is predicted to increase by a further 2.3-2.9°C by the end of this century (Somot et al. 2008, Nykjaer 2009, Rilov 2016, Ozer et al. 2017. The marine biota in the Mediterranean Sea is reacting to this by changes in life cycles, demography and distribution (Ben Rais Lasram et al. 2010, Rivetti et al. 2014, Marbà et al. 2015, Rilov 2016, van Rijn et al. 2017, Shapiro Goldberg et al. 2019, Yeruham et al 2019. In addition to its response to elevated water temperatures, the Levant biota is rapidly changing due to the influx of hundreds of invasive species from the Red Sea through the Suez Canal (Bianchi 2007, Rilov and Galil 2009, Marras et al. 2015. It has been postulated that these mostly thermophilic invaders may be able to better resist the rising temperatures and hence be less affected, or even facilitated, by warming waters (Lejeusne et al. 2010, Bates et al. 2013, Marras et al. 2015. Several studies have used correlative SDMs to forecast the fate of Mediterranean species faced with warming climate (Ben Rais Lasram et al. 2010, Albouy et al. 2013, Azzurro et al. 2013). These models have typically forecast extreme range contractions of indigenous species, with species disappearing from the trailing 'warm' edge of their distribution. Many indigenous species occupying the Levant, which represents the warmest part of their distribution, may live close to their upper thermal limit. These species will be highly sensitive to increased ocean temperatures and could disappear with further warming (Yeruham et al. 2015). Other species may be further away from their thermal limit and will probably endure longer as climate changes (Rilov and Treves 2010). Correlative SDMs cannot differentiate between these scenarios, making predictions at the trailing edge of a species distribution highly uncertain (Morin andLechowicz 2008, Thuiller et al. 2008).
In this study, we use a recently proposed Bayesian approach that combines SDMs with physiological data (Talluto et al. 2016) in order to forecast the effects of ocean warming on six coastal Mediterranean marine species. We expected that combining physiological information on the species with correlative SDMs would enable more accurate predictions, especially for the warm trailing-edge of species distributions.

Study species
We focused on both indigenous and non-indigenous shallow rocky-shore taxa covering a wide taxonomic range, including seagrass, macroalgae and bivalves, in order to allow findings that would be as general as possible. We first screened the literature to compile a list of species for which solid experimental data of physiological thermal response curves are available. This greatly restricted the pool of species we could use. We further excluded species with uncertain taxonomic status in the Levant. For the remaining selected species, we gathered occurrences data from both primary and gray literature and open-access occurrence databases (Supplementary material  Appendix 1 Table A1). Data were visually checked and duplicate points and unreasonable locations were removed. The entire range of the species was used to better represent the conditions in which the species could survive (Parmesan et al. 1999). Our final species list (six species) was restricted to those possessing both sufficient occurrence records (the species with the minimal number of occurrences contained 227 records) and relevant physiological information (Table 1). Most species could be clearly defined as either indigenous or non-indigenous, and we discuss the implications of the origin of the species on the results in the discussion.

Environmental predictors
Environmental predictors used for the SDMs comprised annual maximum and minimum sea surface temperature (maxSST and minSST, respectively). We focused on temperature because we are interested in the species reaction to the warming of the Mediterranean Sea. In addition, temperature is the most commonly assessed predictor in physiological experiments, allowing us to incorporate the physiological data within the SDMs. We tested the relative importance of maxSST and minSST in relation to other environmental predictors such as current velocity and productivity for which we did not have corresponding physiological data. These other predictors had low variable importance relative to maxSST and minSST (Supplementary material Appendix 1 Table A2) and hence excluding them from the models did not likely impact the results. It is possible to add additional predictors into the combined physiological and correlative SDMs, even those with no physiological data (using vague priors, see model structure below). However, adding environmental predictors for which we do not have physiological data could risk diluting the physiological information. Since we are explicitly interested in examining the impact of the addition of physiology to predictions, we chose to retain only maxSST and minSST for which we had physiology-based priors. We chose maxSST and minSST, and not mean SST, in order to assess the species response under extreme temperatures that are likely to limit its distribution. The correlation between maxSST and minSST was 0.63, which is sufficiently low to use both within a single model (see below).
Within large scale occurrence data, uncertainty in sampling location is common. For coastal species this uncertainty in sampling locations can translate into very large difference in depth estimates (e.g. a 1 km can sometimes include locations of both depths of zero and depths > 100 m). Due to this uncertainty in depth estimates we did not use depth as a predictor. However, depth was used as a masking layer with a depth threshold of 300 m as a cutoff.
Environmental predictors were obtained from Global Marine Environment Datasets (GMED) at five arcmin resolution (Basher et al. 2014). In order to predict future distribution we used future minSST and maxSST for the year 2100 using the RCP scenarios (Assis et al. 2018) also at five arcmin resolution. These scenarios were based on the CCSM45 (Community Climate System Model 4), HadGEM2-ES (Hadley Centre Global Environmental Model 2) and MIROC55 (Interdisciplinary Research on Climate 5) climatic models. The results from the RCP4.5 scenario (a moderate scenario where concentration levels stabilize) and the RCP8.5 scenario (increasing emissions over time leading to high greenhouse concentration levels) are described here. The RCP6.0 scenario showed intermediate results and are not presented. All environmental predictors were upscaled to a resolution of 20 arcmin for analyses due to the location uncertainty associated with the occurrence records.

Species distribution model (SDM)
Species occurrences were related to environmental predictors using Bayesian generalized linear models (GLMs). Like with most SDMs, we focused on large scale environmental effects and ignore the potential influence of species interactions and local human anthropogenic impacts. As our occurrence data are based on presence only, we first created pseudo-absence data for each species. The selection of pseudo-absence data may have strong implications for model performance (Barbet-Massin et al. 2012). To overcome this limitation, we employed the three-step method (Senay et al. 2013, Iturbide et al. 2015. In this method pseudo-absences are selected outside the realized niche by taking into account the environmental variables, but in accessible geographic areas that potentially could be reached by dispersal. This provides a balance between using the spatial and environmental space for the selection of pseudo-absence points and has been shown to outperform other methods (Iturbide et al. 2015). The first step consists of defining environmentally unsuitable areas using a presence-only support vector machine algorithm. The second step consists of building SDMs using pseudo-absences generated for different sized buffers around known presence locations but within the unsuitability background zones defined in the first step. The third step consists of selecting the optimum buffer extent from the models generated in step 2. For the final output, we ran 50 models based on different sets of pseudo-absence data.
After the selection of pseudo-absences, the relationship between presence or absence and the predictors was estimated using Bayesian GLMs, in which species occurrences were treated as Bernoulli trials, where p is the probability of finding the species. We chose to focus on this method in order to facilitate direct comparison with the SDM combined with physiological data (see below). We used minSST and maxSST and their quadratic forms (to account for unimodal responses and non-linearities) as predictors. As a prior for the regression coefficients, we used a non-informative normal distribution (µ = 0, sigma = 10 000). The model was run using a burn-in period of 4000 samples. Following this, 4000 additional MCMC samples were used to estimate the posterior distribution of regression coefficients. These 4000 samples of the posterior distribution were taken from all 50 models containing different sets of pseudo-absence data and hence encompass the uncertainty associated with pseudoabsence generation. Using the model, we predicted for each 20 arcmin cell in the Mediterranean (overall 7992 cells) species suitability and the uncertainty associated with this estimate, both under current climate and climate for the year 2100. For cross validation, we used 70% of the data (presence and pseudo-absence occurrences) to train the model, and the remaining 30% was used as testing data. The splitting into training and testing was performed 100 times.

Physiological data
We collected data on the temperature dependence of important physiological responses for use within the combined physiological and correlative SDMs (hereafter physiology-SDMs). These data were obtained from published studies (Padilla-Gamiño and Carpenter 2007, Garval 2015, Georgiou et al. 2016, Guy-Haim et al. 2016, Guy-Haim 2017. For each species we used a response variable that is hypothesized to be directly associated with its fitness, and known to respond to changes in temperature (Angilletta 2009, Cooke et al. 2013, such as photosynthetic rates, growth rates (McGraw and Caswell 1996) and clearance rates which are directly associated with assimilation rates and are often used as a proxy for fitness through their effect on survivorship and fecundity (Angilletta 2009, Sinclair et al. 2016. The response variables (Table 2) were assessed using different proxy parameters, as follows: growth rates were measured in Halophila stipulacea as the change in leaf area (Georgiou et al. 2016); photosynthetic rates were measured as oxygen production per biomass per time (Padilla-Gamiño and Carpenter 2007, Garval 2015, Guy-Haim et al. 2016; and clearance rate was measured in the bivalve Striarca lactea as clearance rate of its microalgal food (Guy-Haim 2017).

Physiological model
We used the framework of Talluto et al. (2016) in order to combine the physiological information with the geographical occurrence data. This was done by first choosing a model to represent the temperature response curve of the physiological data. The posterior distribution, from the physiological submodel, was then used as a prior for each of the SDM coefficients ( Fig. 1). Thus, each coefficient in the SDMs received a prior mean and the standard deviation from the matching coefficient in the physiological sub-model. For example, if maxSST is used as a predictor within the GLM, the prior for this coefficient was based on the coefficient of the relationship between the physiological variable and temperature estimated from the physiological response curve. See Fig. 1 for a schematic overview of the modeling stages.
The physiological data used here, as typically obtained in most experiments, is expressed in units that must be converted to probability (i.e. that vary between zero and one) in order to be used as priors with the physiology-SDM. To do this necessitates two steps. First, we had to decide a threshold above which the physiological parameter indicates positive habitat suitability. For example, it is likely that given herbivory and other processes in natural communities, experimentally estimated photosynthetic rates have to be above a certain threshold to translate into non-zero habitat suitability. This issue is less of a problem in cases, such as those described in Talluto et al. (2016), for which the physiological data consists of population level growth rates estimates. To overcome this limitation, we applied a cutoff above which the physiological parameter values translate into positive habitat suitability. For the main text we show results for a thresholds set at the 10th quantiles of the measured physiological parameter data. However, results were robust to threshold choice (Supplementary material Appendix 1 Fig. A6).
Second, following Talluto et al. (2016) we translated the physiological data into probabilities that vary between zero and one. For this, we assumed that at a specific temperature the experimental physiological data follow a normal distribution. Then, based on the mean and standard deviation of the experimental data, we calculated the probability of the physiological parameter being positive at that temperature (Talluto et al. 2016). For example, if at 25°C growth rate measured in an experiment was 20 mm 2 leaf −1 d −1 but the experimentally estimated standard deviation was 10, the estimated probability of growth rate to be positive at 25°C was lower than in an experiment with a similar growth rate but a standard deviation of 1.
After conducting these two steps, we ran the physiological sub-models to estimate how the physiological data, transformed to relative suitability, changes with temperature. For the physiological sub-model we used Bayesian beta regression, with temperature as a predictor and the relative suitability, based on physiological data, as a response (Talluto et al. 2016). The physiological sub-model coefficients were given Gaussian priors, with the mean taken from the maximum likelihood estimation to improve convergence, and a vague prior was set on the variance (set at 1000).

SDM combined with physiological data (physiology-SDM)
The mean and variance of the posterior distribution of the coefficients estimated from the physiological sub-model were used as priors in the physiology-SDM to produce a SDM informed by the physiological data (Talluto et al. 2016). In cases where we had strong confidence in the physiological data, evident by low variance in the posterior distribution of the physiological sub-model coefficients, this could substantially change the physiology-SDM results in comparison to the SDM results. However, in cases where the posterior distributions of the physiological sub-model coefficients contain a lot of uncertainty, the physiology-SDM and SDM may be very similar.
We fitted the physiology-SDM using Bayesian GLMs, similar to the regular SDMs. Thus, species occurrences were treated as Bernoulli trials, where p is the probability of finding the species. However, the posterior distribution of the coefficients relating relative suitability to temperature, retrieved from the physiological sub-model, were used as informative priors. Here too, we used 4000 MCMC samples for burnin, and estimated the posterior distribution using 4000 additional samples. We visually examined convergence, using a convergence plot for model parameters, and found that the models for all species adequately converged. We applied prior weights of one to the physiological sub-model, so that the physiological data was given equal weight as the occurrence data. However, the results did not qualitatively change when other weights were chosen (Supplementary material Appendix 1 Fig. A3, A4).

Model performance and overlap calculations
For both the SDMs and physiology-SDMs we estimated cross-validation model performance from the training data to the testing data using the Boyce index (Hirzel et al. 2006). The Boyce index ranges from −1 to 1, where 0 means the model does not differ from random and 1 indicates a perfect fit to the data (Hirzel et al. 2006). We also sought to quantify the degree to which species distribution changes between: 1) current and future climate, and 2) regular SDMs and physiology-SDMs. These changes were quantified using the Schoener's D and Hellinger's I matrices of overlap (Warren et al. 2008). The D and I matrices measure the overall match between the species distributions and range from 0, no overlap, to 1, identical distributions (Warren et al. 2010, Broennimann et al. 2012. We were specifically interested in the Levant region, where the Mediterranean Sea is warmest, strongly affected by climate change, and where range contractions are likely to be the largest (Coll et al. 2010). Thus, in addition to the entire Mediterranean, we examined model performance and overlap for the trailing edge of the species distribution in the Levant. This was done by dividing the Mediterranean into the Levant region and the Mediterranean excluding the Levant, and re-calculating the above indices for each.

Physiological response curves
Focusing on the physiological response curves, the relative performance of a species at a certain temperature converted to probabilities, we found that most species demonstrated a unimodal response to temperature (Supplementary material Appendix 1 Fig. A1). Moreover, most species showed a decline in their suitability beyond 30°C (Supplementary material Appendix 1 Fig. A1). The cryptic bivalve, Striarca lactea was an exception as its suitability generally increased towards high temperatures, which may suggest that it is not strongly limited by the highest temperature examined here of 35°C.

SDM and physiology-SDMs coefficient estimates
Using the posterior distribution of the coefficients estimated from the physiological thermal response curves, we constructed the physiology-SDMs and compared them to SDMs based on environmental data only. We found that while for some species the posterior distribution for the parameters examined (minSST and maxSST and their quadratic forms) were similar between the regular SDMs and physiology-SDMs, for other species the estimates diverged, especially for minSST (Fig. 2). However, for all species the uncertainty associated with the coefficients estimated from the physiology-SDMs were much lower than in the regular SDMs (Fig. 2). When comparing the predicted temperature response curves from the SDM and the physiology-SDM for each species, we found a generally similar pattern, especially for minSST (Fig. 3). However, the physiology-SDM for most species showed less uncertainty, especially for maxSST at high temperatures (e.g. Ellisolandia elongata and Padina pavonica in Fig. 3). These results suggest that the physiological data were informative and had a substantial impact on constraining the modeled results.

Current distributions
The global distribution of the examined species predicted using physiology-SDMs under the current climate change be seen in Supplementary material Appendix 1 Fig. A4 and the Mediterranean distribution can be seen in Fig. 4, 5. Overall measures of cross-validated model performance indicated good fit of the SDM (Boyce = 0.77 ± 0.26; Table 3). These values are slightly, but not significantly, lower than those found for the physiology-SDMs (Boyce = 0.82 ± 0.27; Table 3), suggesting a slight advantage for the physiology-SDMs. When looking at model performances in the Levant region -the warm trailing edge of the species distribution -we found variable model performance, at least partially due to the low number of occurrences in this region for some species (Table 1). Regular SDMs generally presented low values which indicate an extremely poor fit (Boyce = −0.40 ± 0.46; Table  3). Overall, physiology-SDMs in the Levant also performed poorly (Boyce = −0.08 ± 0.75; Table 3), but for three species the improvement in model performance for the physiology-SDM compared to regular SDMs was substantial. Thus, for Asparagopsis taxiformis the Boyce index increased from 0.34 to 0.65, for Halophila stipulacea from −0.14 to 0.94, and for Ellisolandia elongata from −0.80 to −0.01 when using physiology-SDMs (Table 3). These findings support our hypothesis that physiology-SDMs can provide more accurate predictions at the edges of the distributions, although they do not unequivocally prove it. The predicted distributions for the SDMs and physiology-SDMs under the current climatic conditions showed a high general overlap (Supplementary material Appendix  1 Table A5, I = 0.98 ± 0.01 (mean ± SD across species), D = 0.87 ± 0.05). Regular SDM predictions suggest that most species have higher suitability in the western Mediterranean (Fig. 4, Supplementary material Appendix 1 Fig. A2). In contrast, the physiology-SDMs (Fig. 4, 5) suggest that the probabilities of occurrence are much more homogeneously distributed throughout the Mediterranean, with high probabilities even in warm regions like the Levant (Fig. 5).

Future distribution
Under future climate (year 2100, scenarios RCP4.5, PCR8.5), we forecast that most species will undergo range contractions and shift their distribution to the north-west (Fig. 4, Supplementary material Appendix 1 Fig. A2). This is true for both SDMs and physiology-SDMs, which showed a high general overlap with current conditions under scenario RCP4.5 (Supplementary material Appendix 1 Table A3, for regular SDMS: I = 0.96 ± 0.02, D = 0.77 ± 0.05) but a more pronounced shift resulting in lower overlap under scenario RCP8.5 (Supplementary material Appendix 1 Table A3, for regular SDMS: I = 0.77 ± 0.85, D = 0.46 ± 0.01). However, the overlap between current and future predictions was higher for the physiology-SDMs than for regular SDMs (Supplementary material Appendix 1 Table A3, difference in mean overlap between models for RCP8.5: ΔI = 0.14 ± 0.07; ΔD = 0.22 ± 0.13). Hence, when using physiology-SDMs species are predicted to display less change in their distribution, as seen in the general blue colors in Fig. 6.

Discussion
Regular SDMs predict that by the year 2100, all four of the indigenous species examined will have disappeared from the Levant, the warmest region of the Mediterranean, and shifted their distribution north-west. This adds to other studies that have predicted widespread loss of indigenous species from the Levant and a transformation of the local marine biota (Thomas et al. 2004, Ben Rais Lasram et al. 2010, Albouy et al. 2013, Rilov 2016, Givan et al. 2018. However, by incorporating physiological data into the SDMs we show that: A) the uncertainty associated with the coefficient estimates decreases, hence increasing confidence in the results (Fig. 2); B) overall cross-validated model performance was slightly higher for the physiology-SDMs (Table 3); and C) despite low number of occurrences, in the warm trailing edge of the species distribution physiology-SDMs substantially outperformed regular SDMs in three out of the six species (Table 3). These results suggest an overall advantage for the physiology-SDMs. The differences between SDMs and physiology-SDMs become more apparent when used to predict future distributions, as the overlap between current and future distributions was higher for the physiology-SDMs Figure 3. The relative suitability of a species to a certain temperature, obtained by fixing the other predictor at the median value, for the regular SDMs and the physiology-SDMs. The left column is for minSST and the right column for maxSST. Non-indigenous species names are marked by an asterisk. than for regular SDMs (Supplementary material Appendix  1 Table A3). This means that when using physiology-SDMs species are predicted to display less change in their distribution (Fig. 6), demonstrating that range shifts may be less dramatic than previously predicted. Thus, many species may be able to survive in the Levant even when water temperatures increase substantially.
The use of physiological data combined with SDMs for predicting species distributions is a new but rapidly developing methodology. The simplest approach is to integrate basic physiology into SDMs, by using the physiological data as a threshold to delineate the range of environmental conditions within which the species can survive. Using this method, Martínez et al. (2014) indicated that one Atlantic algae species would become extinct at its southern range while another would increase its occupancy as climate warms. This approach uses physiology only to delineate range margins and hence does not take advantage of the wealth of available data. More sophisticated hybrid models may directly combine correlative and mechanistic models (Gallien et al. 2010, Thuiller et al. 2013, Evans et al. 2016. For example, Dynamic Bioclimate Envelope Models (DBEM) identify 'environmental preference profiles' for each species and then transform changes in climate into changes in life history, growth and carrying capacity (Cheung et al. 2011(Cheung et al. , 2012. DBEMs are complex and rely on several assumptions that are difficult to verify for most species (Lefevre et al. 2017). Here, we used the framework suggested by Talluto et al. (2016), which enables a straightforward and transparent integration of physiological data into the SDMs. This opens up the possibility of taking advantage of the large number of experimental studies that have produced physiological response curves in order to improve model projections. While this approach is promising, it has not, to our knowledge, been applied and tested for marine species.
Many studies, based largely on SDMs, predict drastic changes in species ranges under future climates both for Mediterranean marine species (Ben Rais Lasram et al. 2010, Albouy et al. 2013 and for other taxa and regions (Beaumont andHughes 2002, Thomas et al. 2004, Present predictions Future predictions RCP4.5    Araujo et al. 2006). Here we have shown, by incorporating physiological data into SDMs, that those forecasts may be too extreme, at least for the Mediterranean. Using regular SDMs we found that the suitability for most species in the warm trailing edge was extremely low (Supplementary material Appendix 1 Fig. A2); whereas the physiology-SDMs usually predicted a higher suitability for both the current and future climate (Fig. 5, 6). Nevertheless, even with physiology-SDMs habitat suitability for most species becomes low under extreme warming scenarios such as RCP8.5. We expected the physiology-SDMs to perform better than SDMs in the Levant region, which is the warm trailing end of the distribution of indigenous species. In general, we obtained poor cross-validated model performances in the Levant regardless of the type of model used (SDMs or physiology-SDMs). There are two possible reasons for this. First, it may indicate that the physiology-SDMs were simply not beneficial in improving the models. We show, however, that the coefficient estimates from the physiology-SDMs are more constrained than the estimates obtained for regular SDMs (Fig. 2). Moreover, for three out of six species cross-validated model performance in the Levant for the physiology-SDMs was much higher than for the regular SDMs (Table 3). This suggests that adding physiology into the SDMs did have clear advantages and has the potential to provide more accurate predictions at the edges of the distribution. Second, for many species there are currently few occurrence points in the Levant (< 10 in three out of the six species), which can affect model performance (Table 1). Thus, although we did not see unequivocal improvement in model performance within the Levant, we believe that with more occurrence data the physiology-SDMs may clearly perform better than regular SDMs.
We found that both SDMs and physiology-SDMs predicted north-west shifts in distribution, with a corresponding decrease in occurrence in the Levant. This pattern was highly pronounced in the indigenous species Padina pavonica, which in the Levant region is predicted to decrease substantially under future climate (Fig. 5, Supplementary material Appendix 1 Fig. A2). Figure 3 and Supplementary material Appendix 1 Fig. A1, indicate that this species cannot physiologically survive high temperatures. An exception to this north-west distributional shift is the non-indigenous Halophila stipulacea, for which we predicted a high suitability throughout the Mediterranean Sea under both current and future climates (Fig. 5, Supplementary material Appendix 1 Fig. A2). This is in line with Georgiou et al. (2016) who suggest that this species has a wide thermal tolerance. Nonetheless, although this species has been found in the Mediterranean for over a century (Georgiou et al. 2016) its distribution remained highly restricted and patchy, suggesting that other factors are limiting its spread beyond the sites where it currently occurs.
The forecasted distributional shifts differed in magnitude between model types. For most species, it can be seen that regular SDMs predicted larger changes than physiology-SDMs (Fig. 6). This may be due to over-fitting by regular SDMs, compared to physiology-SDMs (Fig. 3). Over-fitting may be reduced in physiology-SDMs by the external constraints imposed by the species physiological response. Species poleward distributional shifts due to climate change have been repeatedly described (Parmesan 2006) and predictions may benefit from models that incorporate species physiology.
We a-priori assumed that the tropical non-indigenous species would perform better under high temperatures, as they naturally encounter higher temperatures across their native range (Lejeusne et al. 2010, Bates et al. 2013, Marras et al. 2015. This prediction is complicated by the possibility that these species are not at equilibrium with their environment, and hence not constrained by environmental factors, and the possibility of niche shifts (Parravicini et al. 2015). The two non-indigenous species examined here (Halophila stipulacea and Asparagopsis taxiformis) revealed temperature response curves that are not very different from those of the indigenous species (Supplementary material Appendix 1 Fig. A1). Nevertheless, both the SDMs and physiology-SDMs support our prediction as we found that these non-indigenous species have higher habitat suitability in the Levant compared to indigenous species, under both current and future climate (Fig. 5, Supplementary material Appendix 1 Fig. A2). We further found that the increase in cross-validated model performance attained by using physiology-SDMs was highest for the two non-indigenous species (Table 3). This contradicts our prediction that performance improvement of the physiology-SDMs would be higher for indigenous species, as non-indigenous species may have additional occurrence from their native range that correspond to locations that represent their upper thermal tolerance. Examining the temperature response curves, it is evident that the increase in mode performance for the two non-indigenous species is associated with both higher confidence in the estimates (Fig. 2) and modified response curves (Fig. 3). Nonetheless, the two tested species may not necessarily represent the majority of non-indigenous species, and more species need to be tested to determine if physiology-SDMs are generally more useful when modelling non-indigenous species. While we have demonstrated here the utility of the physiology-SDM, this approach also has several drawbacks. First, we examined species that dwell in different habitats ranging from intertidal pools to shallow subtidal reefs. Some of these species may not directly experience the sea surface temperatures estimated from remote sensing data, which may underestimate true temperatures by 2°C (Rilov 2016). Moreover, species dwelling in intertidal pools may experience very different temperatures to those that dwell in the subtidal. While in regular SDMs these inaccuracies may not be critical as the models are designed to predict relative habitat suitability, for physiology-SDMs this mismatch between remote sensing based climate data and physiological data may compromise model performance. In our case, it means that species may be experiencing higher temperatures than those predicted by the environmental layers for shallow water and tide pool species, but lower temperatures than predicted by the environmental layers for burrowing or deeper subtidal species. Hence, future change in suitability for a species in the Levant may be either larger or smaller than estimated here.
Second, the way the measured physiological data correspond with habitat suitability is not straightforward. Even if parameters such as population growth rate (r) can be measured experimentally in controlled environments, it remains unclear if this can be scaled up to reflect large-scale natural populations. This is even more of a problem for the physiological parameters used in this study such as leaf growth rate, clearance rate and photosynthesis rate. For the physiology-SDM, we assume higher performance in these parameters, e.g. high leaf growth rate, corresponds to higher habitat suitability as estimated from SDMs. While this seems like a reasonable assumption, instances in which this assumption fails can be easily envisioned (Osorio-Olvera et al. 2019). Another issue to consider is the value of the physiological parameters which correspond to positive growth rates and habitat suitability. For the parameters used here, it is clear that negative values will entail death over the long term. However, many organisms may remain alive for long periods even without acquiring nutrition through filtering or photosynthesis. In addition, positive values do not guarantee positive habitat suitability. It may be that an individual with positive leaf growth rate or positive photosynthesis rate will not grow fast enough to produce positive population growth rates within a specific habitat. We tackle this problem by applying different thresholds to the physiological sub-models above which the physiologically estimate values can be translated to habitat suitability greater than zero. We show that the results are robust to different thresholds used (Supplementary material Appendix 1 Fig. A6), but this will have to be assessed on a case by case basis in future studies.
Third, the weight applied to the physiological sub-model relative to the SDM, here set at one, is somewhat arbitrary.
Ideally, the weight should reflect the relative confidence in the physiological data and should increase with better experimental evidence, e.g. if a wide range of temperatures are examined, if sample design is adequate, and if several different populations are tested. Similarly, the cutoff used before applying the physiological sub-model, here set to 10% of the data, is also arbitrary, although it should ideally reflect physiological knowledge of the conditions under which species have positive growth rates. Nonetheless, we show that our results are generally robust to variation in these weights (Supplementary material Appendix 1 Fig. A3, A5). In future studies, it would be desirable to produce guidelines for selecting an objective weighting scheme for the physiological sub-models.
Fourth, the physiological response of species to environmental gradients may not be stationary in time or space. Acclimation, epigenetic responses and local adaptation can all produce variation in the temperature response curve (Thuiller et al. 2013). Here, for lack of data, we do not assess this variation, but argue that for most species it is likely to be small relative to interspecies differences in the temperature response curve. With more information accumulating for each species, this intraspecific variation may be empirically assessed, with limits to the local adaptation being incorporated into models.
Finally, physiological data are typically collected within controlled tank or mesocosm experiments. Experimental properties such as volume, duration and complexity (e.g. inclusion of environmental fluctuations, system openness) can affect the magnitude and direction of the measured response. Species may also show different temperature response curves in the field in the presence of predators, competitors and variation in other environmental factors (Helmuth et al. 2006, Childress andLetcher 2017). Moreover, physiological data are estimated for the adult stage only. However, adults can survive temperatures that larval stages may not (Levy et al. 2016). This stresses that physiological data should be obtained from experiments conducted under conditions as close as possible to those in nature. For example, mesocosms that mimic natural temperature, salinity and irradiance in both mean and daily ranges (Guy-Haim 2017). Such types of experiments will probably produce better physiology-SDMs than the more controlled tank experiments used here.
In conclusion, there is a need to incorporate more realism into SDMs in order to improve predictions of distribution changes, such as species interactions and direct estimated of human anthropogenic impact (Urban et al. 2016, Record et al. 2018. We have shown here that incorporating species physiology data into SDMs changes species distribution predictions. Importantly, we found that while SDMs predicted that indigenous species are likely to decline in the Levant as the climate warms, physiology-SDMs predicted a milder reduction in suitability. We also found some support for physiology-SDMs performing better at the environmental range edges (the trailing end of the species distribution) where regular SDMs often fail. However, this initial finding requires further support from more species with more occurrence data. The framework proposed by Talluto et al. (2016), and employed here, is easy to implement on other species and regions using already available physiological data. Although physiological data for marine species have typically been limited and scattered, emerging technologies posture to modernize methods for obtaining these data (Pimm et al. 2015, Wolfert et al. 2017, while complementary new databases (Bennett et al. 2018, Farley et al. 2018) will catalog and maintain the data for future investigations such as the physiology-SDMs approach presented here. These models can inform biodiversity management programs and help to preserve threatened marine biota.