Spatio‐temporal model and machine learning method reveal patterns and processes of migration under climate change

Despite extensive studies of phenological shifts in migration by climate change and driving factors of migration, a few issues remain unresolved. In particular, little is known about the complex effects of driving factors on migration with interactions and nonlinearity, and partitioning of the effects of factors into spatial, temporal, and spatio‐temporal effects. Here, we aim to elucidate migration pattern as well as its driving factors under climate change.


| INTRODUC TI ON
Climate change has substantially affected organisms worldwide (e.g.Hoegh-Guldberg & Bruno, 2010;Pecl et al., 2017;Walther, 2010;Walther et al., 2002).Changes in species distribution are thought to be an indicator of the effect of climate change on organisms.
However, studies on distributional shifts often have gaps in terms of climate change and ecology.Previous studies in ecology have suggested that species are affected by various external factors with complex interactions and nonlinear effects (e.g.Guisan et al., 2006;Nakayama et al., 2020;Ryo et al., 2018;Sugihara et al., 2012).
Therefore, distributional shifts under climate change should also be related to these complex interactions and nonlinear effects; simple statistical methods such as correlation, linear regression and similarity may lead to erroneous results.
Migration, which is a seasonal movement of organisms from one region to another and a driver of seasonal species distribution, plays a role in population and ecosystem dynamics as well as in ecosystem services such as tourism (e.g.Bauer & Hoye, 2014;Teitelbaum & Mueller, 2019).Recently developed tracking technology has provided insights into the drivers of migration.One major driver of migration is ecological factors such as prey (e.g.La Sorte & Graham, 2021;Mathot et al., 2007;Sherrill-Mix et al., 2008) and temperature (e.g.Kanamori et al., 2019;Marra et al., 2005;Sherrill-Mix et al., 2008).This type of factors leads to migration through changes in physiological conditions such as reproduction.Migration itself also affects physiological conditions, thereby regulating the timing and location of migration.
Despite extensive evidence and predictions that these factors are related to migration, some challenging problems remain unresolved.The first problem involves the complex effects of driving factors on migration because species are generally affected by various factors with interactions (e.g.Guisan et al., 2006;Ryo et al., 2018) and nonlinearity effects (e.g.Nakayama et al., 2020;Sugihara et al., 2012).The complex descriptions of the relationship between species and migration-driving factors make it difficult to interpret the results of analyses; therefore, a method for easily interpreting the results is needed.The second problem involves partitioning the effects of migration-driving factors into spatial, temporal and spatiotemporal effects.For example, we consider a general case in which we use time-series data of a species within a given spatial range and use a generalized linear model to analyse the relationship between species occurrence and a factor such as temperature.Because the data include both temporal and spatial information, the interpretation of the effect of the factor on species occurrence (i.e.regression coefficient) is vague.In other words, does the factor affect the variation in species occurrence spatially, temporally or both?Use of this partitioning approach can lead to further understanding of migration, including whether species actively seek out specific conditions or merely react to changes in conditions (Schlaff et al., 2014).
Elasmobranchs (sharks, skates and rays) are apex predators with long-distance migration in marine systems (Heupel et al., 2015).
Their migrations are thought to play an important role in marine communities and ecosystems by horizontally and temporally connecting habitats and transferring energy via the trophic cascade (e.g.Dulvy et al., 2000;Hammerschlag et al., 2019;Heithaus et al., 2008Heithaus et al., , 2012)).Although their migration also supports their populations through energy intake, they have been threatened with extinction (Dulvy et al., 2014;Worm et al., 2013) due to overfishing (e.g.Dulvy et al., 2021;Pacoureau et al., 2021) and climate change (e.g.Chin et al., 2010;Hammerschlag et al., 2022;Niella et al., 2021).Hence, it is necessary to elucidate the spatio-temporal changes in their migration patterns and driving factors in order to develop effective conservation and management strategies.Many studies have shown that ecological factors, such as temperature and prey as well as other drivers (e.g.salinity, pH and tide), trigger the migration (e.g.Ackerman et al., 2000;Niella et al., 2021;Ortega et al., 2009;Ubeda et al., 2009).In addition, elasmobranchs are able to perceive magnetic field (e.g.Kalmijn, 1982;Meyer et al., 2005;Newton & Kajiura, 2020) and one study reported that a shark utilized its perception of magnetic field for navigation (Keller et al., 2021).
The North Pacific spiny dogfish Squalus suckleyi (hereafter "spiny dogfish") is a demersal and coastal shark, which has received comparatively little attention in the previous studies of shark migration.They are widely distributed in the North Pacific, from the Bering Sea to the coast of Japan and from the Gulf of Alaska to California (Orlov et al., 2012).In the western North Pacific, they migrate southward from October to December for parturition, after which the adults and their offspring migrate northward around April for feeding (Kojima, 1958; Figure 1).It is possible that they migrate by following changes in water temperature (Ebert, 2003;Kojima, 1958).Their preferred water temperature is 6°C to 10°C in the western North Pacific (Yano, Ohshimo, et al., 2017).A previous study suggested that they may be opportunistic feeders with no specific target prey (Mecklenburg et al., 2018), while another study indicated that their distribution may be related to the presence of potential prey, including sardine, walleye pollock and squid (Yano, Hattori, et al., 2017).They are a commercially important target species in Japan, the United States and Canada (Orlov et al., 2012).In Japan, they have been caught in the Sea of Japan and in the western North Pacific.Stock assessment of spiny dogfish is performed using two abundance indices, namely, catch per unit effort from the bottom trawl fisheries in the western North Pacific and from the bottom longline fisheries in the Tsugaru Strait (Figure 1).
Here, to elucidate the pattern of migration and its processes under climate change, we first examined the long-term changes in timing and geographic location of migration by applying a spatiotemporal model to c. 5-decade time series data for the presence/ absence of spiny dogfish in the western North Pacific.Then, to understand the processes of the change in migration, we evaluated the spatial, temporal and spatio-temporal effects of driving factors (fish productivity, sea surface temperature [SST], depth and magnetic field) on the seasonal occurrence of spiny dogfish, using a machine learning model and an interpretable machine learning technique.

| Occurrence data for spiny dogfish
Daily catch statistics for bottom trawl fisheries from 1972 to 2019, except for the fishery closures (July and August), were used.The spatial resolution was 10′ latitude × 10′ longitude horizontal square in the area from 138°E to 142°E and 37°N to 42°N.We defined the area west of 140.5°E as SJ (Sea of Japan) and the area east of 140.5°E as NP (North Pacific; Figure 1).
The daily catch statistics included the date, geographic location (longitude and latitude), fishing effort (number of tows), catch weight in each fish category including spiny dogfish and so on.Although the data included geographic location, the location was not all fishing points in a day but only one location in a day; fishermen recorded the total catch weight in each fish category and geographic location of the most caught fish category every day.Accordingly, there is the possibility that record of catch weight of spiny dogfish was low or high in a point where spiny dogfish inhabits or does not inhabit.
Therefore, to estimate the monthly distribution in each year, we transformed the catch weight of spiny dogfish into a binary variable (presence/absence) taking a value of 1 when catch weight >0 and 0 otherwise.See Appendix S1 for further information and discussion about spiny dogfish.

| Data collection for candidate driving factors
Data were collected for three ecological factors (fish productivity, SST, and depth) and one wayfinding factor (magnetic field).For further information, see Appendix S2.

| Estimation of spatio-temporal variation in occurrence
To estimate spatio-temporal variation in the monthly occurrence of spiny dogfish, we used the Barrier model (Bakka et al., 2019) which is a state-of-the-art spatio-temporal model.This model accounts for spatio-temporal changes in fishing locations that are likely to cause estimation biases (Thorson, Fonner, et al., 2017).This model also accounts for spatial autocorrelation in the physical barriers, which is necessary because the study area contained some islands with complex coastlines (Bi et al., 2020).The occurrence rate p i for each sample i was approximated using a logit-linked linear predictor as follows: where t i is the effect of time t defined as a year-month combination (i.e.January 1972 is t = 1, February 1972 is t = 2, … , December 2020 is t = 480), which was estimated as a first-order random walk due to the lack of data for the fishery closures season (July and August).u s i is the spatial random effect for location s in a Gaussian random field with a Matérn covariance function through stochastic partial differential equations (SPDE).Although the full conditional distribution at the site in two-dimensional space has the expectation using small set of neighbours' under the Matérn model through SPDE, the paths crossing physical barriers (e.g.land and coastal line) are eliminated in the Barrier model by adding the concept of the simultaneous autoregressive model into Matérn model through SPDE.Thus, u(s), which is a spatial random effect in the Barrier model, is a solution to the following SPDE using the finite element method: North Pacific spiny dogfish Squalus suckleyi starts migrating southward from October to December for parturition and then migrates northward around April.In this study, the Sea of Japan (SJ) and the North Pacific (NP) were defined as the areas west and east of 140.5°E, respectively.Stock assessment in Japan has been conducted using bottom longline data from the Tsugaru Strait and bottom trawl data within the dashed area.
Mercator projection and azimuthal equidistant projection were used for world map and study area, respectively.
where u(s), s ∈ Ω ⊆ ℝ 2 is the Gaussian random field, r and u are constants, ∇ = x , y , and W(s) denotes spatial white noise.Ω n is the normal area, Ω b is land, and their disjoint union gives the whole study area Ω.For further details, see Bakka et al. (2019).We did not include any covariates in this model Parameters in the model were estimated by integrated nested Laplace approximation algorithm (Rue, Martino, & Chopin, 2009) using the R-INLA package (Bakka et al., 2018;Rue, Martino, & Lindgren, 2009) in R 3.6.3(R Development Core Team, 2021).
Then, we predicted the occurrence rates p(s, t) in the study areas.
Hereafter, we call the points with predicted occurrence rates as "predicted points."The distances between the predicted points were c. 10 km.

Geographic location
To evaluate long-term changes in the geographic location where spiny dogfish migrated, we used the following two measurements.
First, the mean spatial distribution of spiny dogfish for each month was obtained to determine the average spatial migration patterns over c. 5-decades because it was difficult to ascertain the patterns from the extensive number of maps of predicted occurrence rates p(s, t) (Appendix S3, Figures S3.1-S3.6).For example, the mean spatial distribution in January was obtained as

Model
To model the relationship between the estimated occurrence rate of spiny dogfish and driving factors, we used the gradient boosting method, which is based on machine learning algorithm (e.g.Hastie et al., 2009;McLaren et al., 2018;Shiferaw et al., 2019).The method sequentially generates large numbers of weak learners and finally combines them.This combination of weak learners can describe nonlinearity between a response variable and covariates as well as complex interactions among covariates.
First, we checked the correlations among covariances in each month (Appendix S3, Tables S3.1-S3.4).The variables related to magnetic field (declination, inclination, and total intensity) were consistently correlated with each other.Although this presents an issue when interpreting the model, we did not eliminate any magnetic field variables; instead, we resolved this issue by using the interpretation method described in the Interpretation.Strong correlations among other driving factors were not detected.
Then, after the data for each month were divided into training data (70%) and test data (30%), we tuned seven hyperparameters (nrounds, max_depth, eta, gamma, colsample_bytree, min_child_ weight, and subsample) and trained the models by 10-fold crossvalidation using the training data.This tuning was conducted using the package caret (Kuhn, 2008(Kuhn, , 2020) ) in R. The hyperparameters were obtained by five grid searches and the optimal models (i.e. high-performance models) were chosen based on the root mean squared error (RMSE).Min-max normalization and z-score normalization were performed for all of the driving factors.This analysis , was conducted with both high-resolution and low-resolution fish productivity data.We determined the best model based on the RMSE when using the test data.The model using the fish productivity with low-resolution was selected as the best model in all months; accordingly, the results using fish productivity with lowresolution are presented.
To demonstrate that the gradient boosting model had a higher predictive value than that of a linear model, we also analysed the relationship between the estimated occurrence rate and driving factors for each month by filtering the training data with a linear model that included the terms of all driving factors, pairwise secondorder interactions, squared mean SST and mean depth.We then performed a comparison of the RMSE of gradient boosting and the linear model when using the test data.

Interpretation
To evaluate the spatial, temporal and spatio-temporal effects of driving factors on migration, we applied permutation feature importance (PFI) and grouped permutation feature importance (GPFI, Appendix S3, Figure S3.9).Here, we briefly explain these two concepts.PFI measures the changes in prediction errors between using raw data in the best model and using permuted data for a covariate in the best model (Appendix S3, Figure S3.9a).A greater increase in the prediction error is interpreted as indicating a more important covariate for the prediction (Molnar, 2020).However, the calculated importance is unstable and difficult to interpret when the permuted covariate is correlated with other covariates.GPFI can resolve this problem by permuting correlated covariates as a "single set of covariates" (Appendix S3, Figure S3.9b).
Based on the above, we considered that spatial effects have the greatest importance when permuting within a year because the data were permuted among locations; temporal effects have the greatest importance when permuting within a location because the data were permuted among years; and spatio-temporal effects have the greatest importance when permuting with no restrictions on location and year because the data are permuted among both locations and years (Appendix S3, Figure S3.9c).To estimate the spatial, temporal and spatio-temporal effects for each covariate, we calculated the L 2 norm (i.e. the Euclidean distance) between the RMSE from the best model and that from the permutations.This permutation was conducted 100 times for each covariate, and covariates related to magnetic field (declination, inclination and total intensity) were treated as a single set due to the observed correlations among them (Appendix S3, Table S3.1-S3.4).

| Long-term changes in migration patterns
The spatial distribution of spiny dogfish clearly changed seasonally (Figures 2 and Appendix S3, Figure S3.1-S3.6).Although the mean occurrence rates were low in the whole study area from September to December, spiny dogfish occurred on the north side of the Shimokita Peninsula.In January and February, the mean occurrence rates increased significantly around the Shimokita Peninsula and increased on the north side of SJ and along the coastline of NP.From March to May, the mean occurrence rates in SJ and NP decreased gradually, although the occurrence rates on the north side of the Shimokita Peninsula remained slightly elevated.In June, the mean occurrence rates were somewhat high only on the north side of the Shimokita Peninsula.
COG did not change in any month during the c. 5-decade period (Figure 3).The longitudes of COG in SJ and PO did not change in any

| Spatial, temporal and spatio-temporal effects of driving factors on migration
All gradient boosting models had higher predictive power than the linear models, even though the linear models were somewhat complex due to the inclusion of interactions among variables and square terms for nonlinearity (Appendix S3, Tables S3.5).
The effects of driving factors, particularly SST and fish productivity, changed seasonally (Figure 5).The spatial effects of magnetic field and depth and the spatio-temporal effect of magnetic field were consistently the largest (pink boxplots).Following those factors, the spatial, temporal and spatio-temporal effects of fish productivity, the spatial effects of coefficient of variation (CV) of depth, and the spatial and spatio-temporal effects of SST also tended to be large (the yellow box plots).The spatial and spatio-temporal effects of SST tended to increase in February and then decrease in April, whereas the effects of fish productivity tended to decrease in February and then increase in April.
The factors with weak effects were consistent across months (Figure 5).The temporal effect of magnetic field was consistently weak, although the spatial and spatio-temporal effects of these factors were consistently large.Similarly, the temporal effect of SST was consistently weak, although the spatial and spatio-temporal effects of SST changed seasonally.Effects of the CV of SST were consistently weak.

F I G U R E 3
Long-term changes in the geographic location of migration of spiny dogfish using the center of gravity (COG) from September to June.See Section 2.2.2 for details.This map used azimuthal equidistant projection.Here, a "year" was defined as September to June because spiny dogfish begin to migrate in autumn.
The geographic locations of spiny dogfish migration were quite stable over c. 5-decades from 1972 to 2019 (Figure 3).The spatial and spatio-temporal effects of magnetic field as well as the spatial effect of depth and the CV of depth (i.e.submarine topography) were consistently large (Figure 5).These results provide evidence that spiny dogfish returned to specific locations because of the effect of magnetic field and a strong preference for submarine topography.
The temporal effect of SST was weak every month, although the spatial and spatio-temporal effects of SST tended to increase from January to March (the southward migration season) and then decreased in April (the northward migration season) (Figure 5).These results suggest that the migration of spiny dogfish was not a reaction to temporal changes in SST but the result of tracking locations based on SST.The mean SST in the study area showed increasing trends, with steep increases after 2000 (Appendix S2, Figure S2.1).This means that the waters in the study area reach the temperature preferred by spiny dogfish earlier than that in the past, which leads them to arrive at their destination earlier.We consider this to be why the migration timing of spiny dogfish advanced by a month after 2000 (Figure 4).
Distributional shifts are thought to occur when species track climate velocity (the rate and direction of the isotherms shifts through space) (e.g.Burrows et al., 2011;Pinsky et al., 2013;Sunday et al., 2015), whereas phenological shifts are thought to occur due to earlier warming (e.g.Edward andRechardson 2004, Asch, 2015).
However, we did not detect a distributional shift or an effect of temporal changes in SST despite warming in the present study area; instead, we detected an advance of migration timing due to spiny dogfish seeking their suitable SST location where SST had increased.
These findings provide new evidence that phenological and distributional shifts have occurred due to the combination of the way that previous studies suggested.Therefore, it may be necessary to evaluate phenological and distributional shifts simultaneously and to understand the mechanisms underlying those shifts in order to predict such shifts in the future.
We evaluated the effects of fish productivity using all fish categories other than spiny dogfish in the daily statistics under the assumption that the abundance of the prey available to spiny dogfish can use increases as the total catch weight increases (see details 2.1.2Fish productivity section).We found that the effects of fish productivity were consistently large (Figure 5).This suggests that spatio-temporal dynamics of spiny dogfish can be greatly affected by the spatio-temporal dynamics of other species.Unfortunately, we do not know the details of their feeding habits (i.e. an opportunistic feeder or a specialist for some fish species).However, there are fish species for which the estimated stock abundance has increased (e.g.Sailfin sandfish Iida et al., 2021;bighand thornyhead Kanamori, Morikawa, et al., 2021) and decreased (e.g.Atka mackerel Morita et al., 2021, snow crab Morikawa et al., 2021) in the study area.In addition, many bottom fish exhibited distributional shifts in NP (Y.Kanamori, in prep.).Therefore, to interpret the results of this study further, we need community-and ecosystemlevel studies that investigate the relationship between spiny dogfish and other species.

Utility of the spatio-temporal model and machine learning model in biogeography
Biogeographical patterns and its processes have been reshaped under climate change (e.g.Pinsky et al., 2020).Machine learning methods have the potential to resolve some of the major problems that we face, because the models can handle nonlinearity and complex interactions among covariates and have high predictability (e.g.Hastie et al., 2009;McLaren et al., 2018;Ryo et al., 2018;Shiferaw et al., 2019).However, at least two issues make the application of machine learning models difficult in the field of biogeography.The first is that huge datasets are required.In this study, we showed that this issue can be overcome by spatio-temporal models that are well known and widely used in biogeography, because the models can predict species abundance and occurrence in many locations.
We consider that spatio-temporal models can not only increase the amount of data used for machine learning models but also remove the noise associated with data collection.For example, spatiotemporal models are a well-known method for standardization, which removes various kinds of noise (e.g.spatio-temporal changes in survey points and survey effort), the effects of target fishing, environmental factors, and species misidentification (e.g.Kanamori et al., 2019;Kanamori, Morikawa, et al., 2021;Kanamori, Nishijima, et al., 2021;Thorson, 2019;Thorson & Barnett, 2017;Thorson, Fonner, et al., 2017;Thorson, Ianelli, et al., 2017).Thus, the combination of spatio-temporal models and machine learning algorithms can contribute to finding more robust biogeographic patterns.
The second issue is the difficulty in more deeply understanding how covariates affect species distribution, although interpretable methods have recently been developed (e.g.Molnar, 2020;Ryo et al., 2020).This difficulty is due to the fact that the collected data frequently include spatial and temporal components and we do not know how important these factors are (i.e.do these factors affect the spatial, temporal or spatio-temporal patterns of species distribution?).In this study, we attempted to evaluate the spatial, temporal and spatio-temporal effects of factors on migration of spiny dogfish by developing our version of an interpretable machine learning method.Using this method, we obtained two interesting results: (i) distribution was quite stable due to the spatial and spatial-temporal effects of magnetic field and depth and (ii) spiny dogfish sought suitable SST locations where SST had increased, especially after 2000, which advanced their migration timing after 2000.These detailed interpretations would have been difficult by previous interpretable methods and the coefficients of linear models such as GLM.
Therefore, our partitioning approach can provide deeper insights into biogeographic processes.

| CON CLUS ION
This study revealed that although the geographic location of spiny dogfish migration has not changed over c. 5-decades, the timing of migration has advanced by a month since 2000 in the western North Pacific.The spatial and spatio-temporal effects of magnetic field and depth were consistently large and the spatial and spatio-temporal effects of SST changed the timing of migration, even though the temporal effect of SST was consistently low.These results reveal that the migration area of spiny dogfish has remained stable because it is determined by geographic features, whereas the migration timing has advanced.This advancing of migration timing was not a direct response to temporal changes in SST but a consequence of spiny dogfish tracking suitable locations where SST had increased sharply after 2000.Therefore, not only temperature but also many were greater fluctuations in latitudes of COG in SJ and NP than in longitudes (Appendix S3, Figure S3.7).Although the latitude tended to decrease from 1972 to around 2015, the magnitude of change was quite small.The timing of migration advanced in both SJ and NP (Figure4).Before 2000, migration was observed mainly in February, compared with January after 2000.Consistent results were obtained when using the mode and median of estimated occurrence rates(Appendix S4,).
Long-term changes in the timing of migration of spiny dogfish in the Sea of Japan (SJ) and in the western North Pacific (NP).
other factors influence migration simultaneously under climate change, highlighting the importance of considering both biotic and abiotic factors including temperature and understanding the underlying processes in predicting future impacts of climate change on species distribution.ACK N O WLE D G E M ENTSWe appreciate Ms. Akane Yoshikawa, Fisheries Resources Institute, Japan Fisheries Research and Education Agency for organizing the fisheries statistics in the Sea of Japan.This research was financially supported by the Japan Fisheries Research and Education Agency.No permits were required to carry out the research in this study.
COG Lat.(t) and COG Lon.(t) were estimated as where Latitude(s) and Longitude(s) are the latitude and longitude of location s.As before, the COGs for January were calculated when t = 1, 11, 21, ⋯ , 471.We estimated COGs in both SJ and NP because the study area was separated by land and a single COG was not appropriate.The changes in COGs were small in both SJ and NP; line graphs using the values of COGs are shown in Appendix S3 (FigureS3.7).
(Schwaiger & Holzmann, 2013)efined t as a year-month combination (i.e.January 1972 is t = 1, February 1972 is t = 2, …, December 2020 is t = 480).The unimodality in each predicted point and month was checked by Silverman's test, which was performed using the package silvermantest in R(Schwaiger & Holzmann, 2013).Figures S3.8) as the peak of migration based on a histogram of the temporal migration patterns.Here, the "year" was defined as September of year t to June of t + 1 because spiny dogfish begins to migrate southward in autumn and data were not available for July and August when fisheries were closed.For sensitivity analysis, see Appendix S4.

5
Spatial, temporal and spatio-temporal effects of driving factors from September to June estimated from GPFI.Boxplots show the medians and the 25% and 75% quantiles.Whiskers indicate the data points within ×1.5 of the quartiles and grey circles represent outliers.Large L 2 norm indicates more important factors for predicting the estimated occurrence rates of spiny dogfish.Letters after the slash (/) on the y-axis indicate spatial (s), temporal (t) and spatio-temporal (st) effects.Dep, Mag, Prod and SST are depth, magnetic field, fish productivity and monthly mean SST, respectively.Dep_ cv and SST_cv are the coefficient of variation of depth around the estimated points (i.e.submarine topography) and monthly coefficient of variation of SST (i.e.monthly variability of SST), respectively.A detailed description of the parameters is provided in Appendix S2.