Dynamic ensemble models to predict distributions and anthropogenic risk exposure for highly mobile species

Advances in ecological and environmental modelling offer new opportunities for estimating dynamic habitat suitability for highly mobile species and supporting management strategies at relevant spatiotemporal scales. We used an ensemble modelling approach to predict daily, year‐round habitat suitability for a migratory species, the blue whale (Balaenoptera musculus), and demonstrate an application for evaluating the spatiotemporal dynamics of their exposure to ship strike risk.


| INTRODUC TI ON
The ocean provides a diverse and extensive suite of ecosystem services, all of which ultimately depend on a functioning ecosystem.
Management strategies that enable human activity while conserving biodiversity remains a key challenge for ocean governance.
Marine spatial planning implicitly addresses such trade-offs and as such has become a valuable approach for regulating multiple ocean activities while achieving conservation targets (Foley et al., 2010;White, Halpern, & Kappel, 2012). However, marine spatial planning is often static, despite recognition that marine habitats tend to be highly dynamic and can shift in space on timescales of days to weeks (Checkley & Barth, 2009;Kavanaugh et al., 2016). Thus, static management strategies do not account for shifting habitats or human activities and, importantly, offer only partial protection for highly mobile species (Dunn, Maxwell, Boustany, & Halpin, 2016;Hazen et al., 2018). Dynamic management has been identified as a potential solution to this problem by allowing management decisions to be updated in space and time in response to changing environmental or socioeconomic conditions . In order to inform the ecological components of dynamic management, there is first a need to accurately describe the spatiotemporal distribution of species and populations (Foley et al., 2010).
Species distribution models (SDMs) are key tools for describing species habitats and distributions across marine and terrestrial systems (Elith & Leathwick, 2009;Robinson, Nelson, Costello, Sutherland, & Lundquist, 2017). Species distribution modelling involves using statistical tools to relate species occurrence or abundance to spatiotemporal patterns of environmental variation (Elith & Leathwick, 2009). Though the designs, uses and applications of SDMs in ecology are diverse, two methodological advancements hold particular promise for dynamic ocean management.
First, while marine SDMs have typically used a single-model type (Robinson et al., 2017), determining an appropriate modelling approach can be challenging given inherent trade-offs in the statistical methods available (Elith et al., 2006;Qiao, Soberón, & Peterson, 2015). Multi-model ensembles can reduce uncertainty by overcoming biases inherent in any one model type and providing a "consensus" approach to predictions (Araújo & New, 2007;Gritti, Duputié, Massol, & Chuine, 2013). Predictions generated from multi-model ensembles can also be more accurate than those of single models (Marmion, Parviainen, Luoto, Heikkinen, & Thuiller, 2009;Oppel et al., 2012;Scales et al., 2015). As a result, ensemble model approaches are increasingly recommended for marine species distribution modelling (Jones & Cheung, 2014;Robinson et al., 2017), motivating further evaluation of their application to dynamic management.
Second, data-assimilative ocean circulation models offer the opportunity to better match species data to the underlying environmental processes at relevant spatiotemporal scales (Becker et al., 2016;Brodie et al., 2018). There has been increasing attention paid in the marine realm to the spatiotemporal scales of environmental observations and their relevance to the scales of animal response (Mannocci et al., 2017;Scales et al., 2016). Frequently, there is a mismatch between the spatiotemporal resolution of species occurrence and the spatiotemporal resolution of environmental observations. Such mismatch can lead to incorrect inferences and increased uncertainty (Scales et al., 2016). Moreover, designing dynamic management strategies for highly mobile species often relies on up-to-date, or "real-time," estimates of species distributions (Laist, Knowlton, & Pendleton, 2014). Data-assimilative ocean circulation models can help solve this mismatch by providing gapless environmental data, often with higher spatial or temporal resolutions than those of processed remotely sensed data (Becker et al., 2016;Brodie et al., 2018). Ocean circulation models can also provide information about the vertical structure of the ocean that remotely sensed variables cannot, which can improve predictions of marine species' distributions .
Blue whales (Baleanoptera musculus) are both a highly migratory species and a species of conservation concern, highlighting the need to understand their habitat use and exposure to potential anthropogenic threats throughout their migrations. Blue whales are listed as Endangered under both the U.S. Endangered Species Act (1973) and the IUCN Red List of Threatened Species due to population depletion from commercial whaling (Reilly et al. 2008). In the Northeast Pacific, blue whales perform latitudinal migrations between tropical wintering/breeding grounds and productive foraging grounds at higher latitudes in the California Current Ecosystem (CCE) (Bailey et al., 2009;Ballance, Pitman & Fiedler, 2006;Irvine et al., 2014;Mate, Lagerquist, & Calambokidis, 1999). While in the CCE, blue whales follow the spring and summertime progression of the availability of krill (Abrahms et al., 2019), their primary prey, and demonstrate temporal synchrony with krill availability (Croll et al. 2005;Fossette et al. 2017). While dynamic distribution data on krill are not available at the requisite spatial and temporal scales of our study, previous studies have shown that blue whale habitat in the CCE can be characterized by a combination of dynamic and static environmental characteristics, such as sea surface temperature, thermocline and seafloor depths and primary productivity (Becker et al., 2016;Hazen et al., 2017).
Though the eastern North Pacific blue whale population is recovering (Monnahan, Branch, & Punt, 2014), mortality from ship strikes in the CCE remains a major management concern (Rockwood, animal movement, anthropogenic risk, dynamic ocean management, ensemble modelling, habitat selection, marine megafauna, migration, satellite telemetry, spatial ecology, species distribution model Calambokidis, & Jahncke, 2017). In particular, the Southern California Bight (SCB; from San Diego to Point Conception, ~33-34.5°N) is a hotspot for ship strikes due to the high spatial and temporal overlap between blue whale summer foraging hotspots and shipping vessels travelling to and from southern California's largest ports Redfern et al., 2013;Rockwood et al., 2017). Currently, voluntary seasonal slowdowns are the only tool used to manage ship strike risk in this region (Rockwood et al., 2017). The existence of multiple shipping lanes for marine traffic offers potential opportunity for redirecting vessels based on strike risk (Redfern et al., 2013), but existing blue whale models for the region are either limited by partial temporal coverage (Becker et al., 2018) or by coarse resolution . In addition to better matching the timescales of species occurrences, finer scale distribution estimates may be better able to inform dynamic management decisions in the SCB given the migratory behaviour of blue whales (Bailey et al., 2009) and the variable biophysical conditions in the CCE (Checkley & Barth, 2009).
We used a multi-year (1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008) satellite tracking dataset on 104 blue whales along with data-assimilative ocean model output to develop a daily, year-round distribution model for blue whales in the CCE. We evaluated the utility of using multi-model ensembles relative to single-model approaches and used the largest compilation of independent blue whale sightings datasets to date to select and validate our final species distribution model. Finally, we quantified the spatial and temporal distribution of exposure to ship strike risk within shipping lanes in the SCB.
All tracks were filtered for errors and smoothed to provide daily position estimates using a Bayesian switching state-space model (Bailey et al., 2009;Jonsen, Flemming, & Myers, 2005), resulting in 10,603 daily locations ( Figure S1). Information on tag types and deployment duration is available in Bailey et al. (2009).
Because presence-absence models outperform presence-only models for species distribution modelling (Elith et al., 2006), use of artificial absence data (i.e., "pseudo-absences") is recommended when true absence data are unavailable (Barbet-Massin, Jiguet, Albert, & Thuiller, 2012). To analyse habitat suitability in a presence-absence framework, we generated pseudo-absences following Hazen et al. (2017) by simulating 200 correlated random walks per whale track using empirical step-length and turn-angle distributions (Kareiva & Shigesada, 1983). This approach enabled matching the error structures of pseudo-absences and empirical movement data Jonsen et al., 2005;Scales et al., 2016;Willis-Norton et al., 2015). A flag value was assigned to each simulated track indicating its similarity to the empirical track based on distance and net angular displacement from the empirical track Willis-Norton et al., 2015). To ensure simulated tracks represented areas accessible to the whales, only simulated tracks in the upper 75th percentile of flag values were used for comparison (see Hazen et al., 2017 for detail). Two simulated tracks per empirical track were selected at random for inclusion in further analyses, resulting in 21,328 pseudo-absence points compared to 10,603 presences (Barbet-Massin et al., 2012; Figure S2).

| Environmental data
Presence and pseudo-absence data were matched to dynamic surface and subsurface environmental variables as well as to static seafloor relief variables. Variables were examined based on hypothesized drivers of habitat and those shown to be significantly associated with blue whale space use (Becker et al., 2016(Becker et al., , 2018Hazen et al., 2017).
Daily environmental data at 0.1° resolution were sourced from historical and near-real-time data-assimilative versions of the Regional In addition, the following static seafloor relief variables were sourced from ETOPO1 (obtained from https ://www.ngdc.noaa.gov/mgg/ globa l/global.html; 0.1-degree resolution): bathymetry (z), standard deviation of bathymetry (z_sd), slope and aspect. Standard deviations of SST, SSH and bathymetry for each location were calculated using a 1° radius centred on that location. Our study area was matched to the ROMS model domain (30 to 48°N and from the coast to 134°W).

| Species distribution modelling
Given potential differences in explanatory power and predictive skill (Derville, Torres, Iovan, & Garrigue, 2018;Fiedler et al., 2018), species distribution models were built using both Generalized Additive Mixed Models (GAMMs; "mgcv" R package) (Wood, 2017) and Boosted Regression Trees (BRTs; "dismo" R package) (Elith, Leathwick, & Hastie, 2008). Because seasonality in migratory behaviour has been shown to influence blue whale environmental preferences , for both GAMMs and BRTs we explored year-round models as well as separate models for each migratory season (summer/fall-July-November; winter/spring-December-June). GAMMs were fitted using the binomial family and a logit link function, with individuals nested as a random effect. A tensor product smooth between latitude and longitude was explored as predictors in GAMMs to account for spatial autocorrelation (Becker et al., 2018). A latitudinal interaction term with SST was also considered in the GAMMs to account for the latitudinal temperature gradient over the study area (Becker et al., 2018). Multiple candidate GAMM models were generated based on published and hypothesized predictor variable combinations (Becker et al., 2016;Hazen et al., 2017) and were initially evaluated using the Area Under the receiver operating Curve (AUC ; Table S1) before selecting the top three GAMMs for further model evaluation. AUC statistics discriminate between truepositive and false-positive rates, and range from 0 to 1, where a score of >0.5 indicates better than random discrimination (Hanley & McNeil, 1982). Generalized variance inflation factors were used to ensure any colinear predictor variables were not included in the same GAMM. As such, SST and BVF were not included together in candidate GAMMs.
A Bernoulli family distribution was used for BRTs, in which all environmental variables were included since BRTs can handle irrelevant predictors and any collinearity effects (Elith et al., 2008). GAMM-BRT ensemble model combinations were explored by ensembling the seasonal BRTs with each of the top three performing seasonal GAMMs.
Each model type was given an equal weighting in the ensembles.
Because of the importance of using multiple metrics for SDM evaluation (Fourcade, Besnard, & Secondi, 2017), predictive performance for the BRTs, top three GAMMs, and ensemble model combinations was further evaluated using AUC and True Skill Statistic (TSS) metrics on three training and testing dataset combinations: (a) k-fold cross-validation with a 75%/25% training/testing data split over each of five folds, (b) "Leave One Out" cross-validation in which a year of data was iteratively left out from training and retained for testing and (c) the full tagging dataset tested on an independent blue whale sightings dataset (N = 3,413 observations; SI Table S3, Figures S3 and S4).
To calculate AUC and TSS metrics for testing against sightings data, pseudo-absences were randomly generated at a 1:3 presence:absence ratio. As independent testing is recommended over cross-validation for evaluating SDM performance (Derville et al., 2018;Gregr, Palacios, Thompson, & Chan, 2018), the final model was chosen based on the sightings data metric averaged across seasons. Finally, we calculated the point biserial correlation between the final model's predictions and independent sightings versus pseudo-absences (Elith et al., 2006).
Daily spatial predictions of blue whale habitat suitability were mapped onto the 0.1° gridded study domain using the "raster" r package.

| Risk exposure in shipping lanes
Based on the output of the top-performing model or ensemble, we compared predicted habitat suitability within shipping lanes inside and outside of the Santa Barbara Channel in the SCB over the course of a year to evaluate spatiotemporal patterns in risk exposure to ship strikes. Vessels travelling into the SCB use either established routes inside the Santa Barbara Channel ("Northern approach") or a Western approach outside of the channel, following the implementation of an "Ocean-Going Vessel Fuel Rule" in 2009 that resulted in increased traffic using the Western approach (Redfern et al., 2013). Spatial layers for the alternate routes were provided by the Channel Islands National All analyses were performed using r statistical computing (R Core Team, 2017).

| Species distribution model
Dynamic ensemble modelling for blue whales revealed that blue whale habitat use in the CCE is strongly influenced by temperature, seafloor topography and subsurface water properties (Table 1, Figures 1 and 2). Seasonal models outperformed year-round models on average for both GAMMs and BRTs (Table S2), and top models showed high predictive performance, with AUC scores using tracking and independent sightings datasets ranging from 0.84 to 0.99, and TSS scores ranging from 0.55 to 0.89 (Tables 1 and S1). The highest performing GAMM included SST, SSH_sd, z, z_sd, ILD and EKE (Table 1, Figure 1). The average explained deviance, a measure of descriptive performance, for the seasonal GAMMs was 43.4%. BRTs did not undergo variable selection and included all environmental TA B L E 1 Top seasonal GAMM, BRT and ensemble candidate models and diagnostic metrics (AUC/TSS) compared to independent sightings data for each season and averaged across seasons. Metrics for additional training/testing methods are provided in Table S1. Variable acronyms refer to sea surface temperature (SST), bathymetry (z), sea surface height (SSH), SSH standard deviation (SSH_sd), bathymetry standard deviation (z_sd), isothermal layer depth (ILD), bulk Brunt Väisälä frequency (BVF), and eddy kinetic energy (EKE). All GAMMs include a random effect for individual. The final model is highlighted in bold  Figure 2, Figures S5 and S6).
The average explained deviance for the seasonal BRTs was 60.1%.
GAMMs and BRTs showed general agreement in habitat preferences.
For both model types, some differences in response curves between seasons were apparent: in winter/spring, habitat suitability was associated with SST >15°C, high SST standard deviations indicating thermal front activity, shallower ILDs (<50 m), low BVFs indicating weak stratification, seafloor depths <3,000 m, and areas of high seafloor ruggedness as measured by standard deviation of bathymetry (Figures 1 and 2, Figure S5).
In summer/fall, habitat suitability was associated with SST between 16 and 20°C, weak stratification, shallower seafloor depths (<2,000 m), high seafloor ruggedness and high wind stress curl (Figures 1 and 2, Figure   S6). In addition, sea surface height standard deviation (SSH_sd), a measure of mesoscale variability, was a significant contributor to the models in summer/fall, but not in winter/spring (Figures 1 and 2). Multi-model ensembles outperformed single models (Table 1)  which were similarly concordant with the sightings data ( Figure 5).

| D ISCUSS I ON
Dynamic management approaches, in which management strategies are adjusted in concert with relevant biological, environmental and socioeconomic processes, are increasingly proposed to balance the tradeoffs between human activities and species conservation Maxwell et al., 2015). Because dynamic management strategies often rely on an understanding of how the spatial distribution of a species or population changes with time (Howell, Kobayashi, Parker, Balazs, & Polovina, 2008;Maxwell et al., 2015), dynamic SDMs are emerging as an important natural resource management tool . Dynamic SDMs can not only elucidate a species' habitat preferences and distribution in relation to shifting environmental conditions, but they can also help identify the spatial and temporal dynamics of species' risk exposure (Zydelis et al., 2011). Our study highlights the utility of dynamic ensemble modelling using high-resolution environmental data to identify time-varying species distributions and guide dynamic management of a highly migratory species.

| Model performance
Based on a suite of testing metrics, including validation against an extensive compilation of independent sightings data, our final seasonal model yielded accurate year-round predictions of daily blue whale habitat suitability in the CCE and realistically reproduced the whales' expected latitudinal migratory behaviour. The top GAMM and BRT models showed strong descriptive and predictive performance, and an ensemble of the two models increased overall performance (Table 1) Figure S1), which may be due in part to nearshore tagging locations (Bailey et al., 2009). Telemetry-based models may therefore underestimate offshore habitat suitability, though model performance remained high when compared to sightings data that were more broadly distributed offshore ( Figure S3). Testing the performance of models developed using a combination of data types such as telemetry, transect survey and acoustic monitoring data would therefore be a valuable exercise for exploring biases based on data types used (Fithian, Elith, Hastie, & Keith, 2014;Yamamoto et al., 2015).
Our model evaluation procedure also highlights the value of using multi-model ensembles. Though top-performing GAMMs and BRTs yielded similarly high diagnostic scores (Table 1), fine-scale differences were evident in the spatial predictions (Figure 3). Different modelling approaches have various strengths and weaknesses, and in particular display trade-offs between the ability to explain fitted data versus predict novel data (i.e., descriptive versus predictive performance, respectively) (Derville et al., 2018). For instance, machine learning techniques like BRTs typically have strong descriptive power but can suffer from overfitting (Derville et al., 2018). Indeed, the unsmoothed response curves from our BRT models ( Figures S5 and S6) show abrupt changes typical of the recursive binary splits of regression trees (Elith et al., 2008). Such abruptness in response curves can indicate overfitting, though the performance of these models tested against novel sightings data suggests this is not the case. In contrast to machine learning models, regression models like GAMMs may have lower descriptive performance but have been shown to have good predictive performance (Derville et al., 2018;Gregr et al., 2018;Qiao et al., 2015). GAMMs have also been proposed as effective tools for predicting into novel conditions (Becker et al., 2018;Derville et al., 2018). Ensemble models therefore provide an approach for balancing these trade-offs and can highlight areas of consensus between models (Araújo & New, 2007;Gritti et al., 2013;Marmion et al., 2009;Scales et al., 2015). In order to be relevant to management applications, our interest here was primarily in predictive rather than descriptive performance, and indeed, we demonstrate that a multi-model ensemble yielded higher predictive performance than either single-model type. Nevertheless, concordance between the highest-ranked GAMMs and BRTs in modelled responses to environmental predictors lends confidence that each model was able to detect general patterns of whale habitat preferences.

| Blue whale habitat use
Previous studies have demonstrated the importance of SST, SSH and seafloor topography in blue whale habitat selection in the CCE (Becker et al., 2016;Hazen et al., 2017). Our study supports these findings and indicates that subsurface ocean dynamics also play an important role in habitat suitability for blue whales in this region.
Isothermal layer depth (ILD) was retained as a significant predictor in the top ranking single models, and likely contributed to increased predictive performance as compared to a previous model using only remotely-sensed surface variables . These results support a recent study demonstrating that subsurface ocean variables can improve descriptive power and predictive performance in SDMs for a number of other marine predator species, and help resolve species' responses to mesoscale activity apparent in the patchiness of prediction fields . Blue whales' apparent preference for areas with shallower ILD values is also consistent with higher predicted blue whale densities in these areas (Becker et al., 2016), an association likely driven by increased prey availability.
Indeed, May-September predictions of blue whale habitat suitability showed strong spatial consistency with known krill hotspots in the SCB, Monterey Bay, and downstream of Cape Mendocino and Cape Blanco (Santora, Sydeman, Schroeder, Wells, & Field, 2011;Santora, Zeno, Dorman, & Sydeman, 2018). Thus, while our model did not include prey data, it was able to realistically identify blue whale foraging hotspots via its combination of physical proxies.

| Implications for dynamic management
Despite the dynamism of human activities and species affected by them, most boundaries used for ocean management, such as shipping lanes, are static. If dynamic spatial boundaries are unfeasible for management, the spatiotemporal patterns of overlap with anthropogenic threats should be assessed to understand where and when greatest risk occurs (Redfern et al., 2013;Rockwood et al., 2017 (Figures 4 and 5). However, the anomalous warming of 2015 (Bond et al., 2015;Di Lorenzo & Mantua, 2016;Jacox et al., 2018Jacox et al., , 2016 resulted in dramatic changes in the timing of overlap, with risk exposure increasing earlier in the year under anomalously warm conditions ( Figure 5). These predictions were mirrored by the date ranges of observed blue whale sightings in those years ( Figure 5, Figure S4). resolutions; we encourage development, dissemination and uptake of output from these ocean models for applications like the one demonstrated here. In the absence of gapless data, analytical techniques such as Boosted Regression Trees have been successfully applied to deal with missing remotely sensed data, for example due to cloud cover, in a dynamic species distribution modelling context Welch et al., 2018). With the increase in ocean modelling or remote sensing technologies and computational power, there is greater opportunity to implement dynamic management approaches that are more responsive to changing environmental conditions, species' movements and patterns of human activity Maxwell et al., 2015). Such efforts thereby exemplify opportunities for cross-disciplinary collaboration. Dynamic, high-resolution species distribution models provide a valuable tool for assessing the spatiotemporal patterns of risk exposure to achieve management objectives.

ACK N OWLED G EM ENTS
We thank the Benioff Ocean Initiative for supporting this study. We thank the UCSC Ocean Modelling group for providing ROMS output.