Contrasting ecological information content in whaling archives with modern cetacean surveys for conservation planning and identification of historical distribution changes

Many species are restricted to a marginal or suboptimal fraction of their historical range due to anthropogenic impacts, making it hard to interpret their ecological preferences from modern‐day data alone. However, inferring past ecological states is limited by the availability of robust data and biases in historical archives, posing a challenge for policy makers. To highlight how historical records can be used to understand the ecological requirements of threatened species and inform conservation, we investigated sperm whale (Physeter macrocephalus) distribution in the Western Indian Ocean. We assessed differences in information content and habitat suitability predictions based on whale occurrence data from Yankee whaling logs (1792–1912) and from modern cetacean surveys (1995–2020). We built maximum entropy habitat suitability models containing static (bathymetry‐derived) variables to compare models comprising historical‐only and modern‐only data. Using both historical and modern habitat suitability predictions, we assessed marine protected area (MPA) placement by contrasting suitability in‐ and outside MPAs. The historical model predicted high habitat suitability in shelf and coastal regions near continents and islands, whereas the modern model predicted a less coastal distribution with high habitat suitability more restricted to areas of steep topography. The proportion of high habitat suitability inside versus outside MPAs was higher when applying the historical predictions than the modern predictions, suggesting that different marine spatial planning optimums can be reached from either data source. Moreover, differences in relative habitat suitability predictions between eras were consistent with the historical depletion of sperm whales from coastal regions, which were easily accessed and targeted by whalers, resulting in a modern distribution limited more to steep continental margins and remote oceanic ridges.
The use of historical data can provide important new insights and, through cautious interpretation, inform conservation planning and policy, for example, by identifying refugee species and regions of anticipated population recovery.


INTRODUCTION
Understanding species' distributions, and their ecological requirements, is a fundamental component of evidence-based conservation planning. Areas of habitat are often identified using habitat suitability models, which determine the relationship between species occurrence records and spatially associated environmental variables, and can be used to extrapolate predictions beyond sampled areas (Phillips et al., 2008; Zurell et al., 2020). This approach has been widely used to guide conservation actions, such as protected area (PA) placement and spatial management (Chen et al., 2018; Embling et al., 2010; Mannocci et al., 2017). However, the modern-day distributions of many species represent an incomplete subset of their historical distributions, resulting from population declines and range contractions caused by human activities (Channell & Lomolino, 2000a, 2000b). Furthermore, past declines have often been spatially and ecologically biased due to geographical differences in past human pressures. For example, lowland terrestrial environments have typically been highly affected by anthropogenic processes, such as habitat conversion, which has led to many threatened species now persisting only in upland landscapes that were historically less accessible to humans (Fisher, 2011; Turvey et al., 2015; Zhu et al., 2013). In a conservation context, models derived from modern records alone may therefore reconstruct the parameters of a species' anthropogenically constrained niche only, rather than its historical unconstrained niche. This limitation leads to the risk that PA placement aiming to protect core habitat may instead be biased toward landscapes with marginal habitat in the historical range of such "refugee" species (Kerley et al., 2020).
The widespread and related management problem of ineffective spatial distribution of PAs, whose placements are often residual to human resource use and protect only habitat of low ecological significance (Devillers et al., 2014), has been hypothesized as the reason many species continue to decline despite increased PA coverage (Duarte et al., 2020).
In the absence of long-term monitoring data for most species, historical archives have the potential to help guide conservation management and decision-making by providing unique information on past ecological states (Turvey & Saupe, 2019). Such archives are diverse and cover a range of sources, including the zooarchaeological and fossil records, museum collections, numerous Western and non-Western historical records, and Indigenous knowledge and oral traditions (e.g., Barnosky et al., 2017; Turvey et al., 2015).
The incorporation of such historical archives into modern conservation planning is complicated by the differing methods by which such data were derived. For example, historical data are often anecdotal and rarely stem from systematic survey effort. In contrast with modern scientific records, historical records often lack data on survey effort, so true absences are typically unavailable and the records can be treated only as presence-only rather than presence-absence data (Newbold, 2010). Historical records are also often derived from hunted animals, meaning that the action by which the occurrence of the species was recorded might have affected its distribution, in contrast to typically nonextractive modern scientific surveys (e.g., Bouchet et al., 2018). Care must therefore be taken when deriving species distribution and conservation insights from historical records.
Marine mammal populations have been affected severely by past human exploitation, with many species, especially large whales, now persisting as drastically reduced and geographically and ecologically restricted remnant populations (Christensen, 2006; Rodrigues et al., 2019). Understanding past distribution and environmental requirements of cetaceans and other threatened marine megafauna is an important consideration for informing marine and fisheries management (Escalle et al., 2015) and spatial conservation initiatives, such as the designation of marine protected areas (MPAs) and important marine mammal areas (IMMAs) (Hoyt & di Sciara, 2021). Extensive historical whaling records, for example, from the Yankee open-boat whaler era (Smith et al., 2012), have been used to reconstruct past species distributions (Monsarrat et al., 2016), ecological associations (Rodrigues et al., 2018), seasonal habitat use (Sahri et al., 2020), and historical exploitation dynamics (Rodrigues et al., 2018). Historical catch records have also been analyzed in combination with recent data from scientific surveys to generate time-integrated models of cetacean population change (e.g., Johnson et al., 2016). However, the lack of formal comparison with modern survey data makes it unclear whether whaling records provide novel or differing insights into cetacean species distributions and their environmental determinants compared with recent baselines. Clarifying whether historical archives can establish a better understanding of cetacean habitat preferences and ecological requirements thus has important implications for cetacean conservation and spatial planning.
We addressed this knowledge gap through a comparative investigation of the ecological information content of historical and modern records for sperm whales (Physeter macrocephalus) in the Western Indian Ocean (WIO). Sperm whales are mesopelagic predators that occupy diverse habitats and forage on a wide variety of mesopelagic prey (Best, 1999). Sperm whales were hunted during the open-boat (1712-1920) and industrial whaling eras and are thought to have experienced substantial population declines. Estimates suggest that the global population in 1999 (360,000 individuals) was only 32% of prewhaling levels (Whitehead, 2002), and the species is currently listed as vulnerable by the International Union for Conservation of Nature (IUCN) (Taylor et al., 2019). Yankee whaling records have previously been incorporated in sperm whale habitat suitability models (Johnson et al., 2016) and have been used to provide information on seasonal changes in historical sperm whale habitat use (Sahri et al., 2020). However, Sahri et al. (2020) specifically questioned whether Yankee records might provide different conservation-relevant information on sperm whales in comparison with modern baselines because the modern distribution may not be effectively protected by an MPA placement prioritized on the basis of historical distribution.
We generated habitat suitability models for sperm whales in the WIO based on historical and modern records to compare ecological information content between these different data types. We sought to identify patterns and environmental determinants of habitat suitability, congruence, and differences in model performances and predictions and potential explanations for observed differences. We also assessed whether use of these different data types has practical management implications for inferring habitat suitability by testing for 2 related processes hypothesized to affect modern conservation planning: whether current MPA placement is residual with respect to sperm whale distribution (i.e., MPA placement protects only marginal habitat) and whether sperm whales should be considered refugee species (i.e., their modern-day habitat represents only a subset of their historically unconstrained niche). We used our findings to provide wider recommendations regarding the opportunities and pitfalls of using data from whaling logs and from other historical archives in modern conservation planning.

Study region
In the context of cetacean conservation, the Indian Ocean provides unique challenges and opportunities. This region was designated as a whale sanctuary in 1979 by the International Whaling Commission (Anderson et al., 2012; Leatherwood & Donovan, 1991). Specific subregions have been identified as IMMAs by the IUCN Marine Mammal Protected Area Task Force (https://www.marinemammalhabitat.org/), and several additional regions are highlighted as areas of interest (AOI) but cannot be evaluated further due to data deficiency (Hoyt & di Sciara, 2021). We restricted our analyses to the WIO, defined here as 26–85°E, 40°S–25°N, which covers an area of 47.25 million km² and contains a complex bathymetric seascape including ridges, seamounts, banks, and continental shelves (Figure 1a,b).
The WIO region contains several MPAs, the result of selection criteria that are likely specific to each MPA or polity, differing as a consequence of the sociopolitical context and natural history of the political entity in which they are located. Several of these MPAs may be considered sufficiently large to protect mobile species, such as sperm whales, including the Aldabra Group National Park, the Farquhar Atoll Area of Outstanding Natural Beauty, the Amirantes to Fortune Bank Area of Outstanding Natural Beauty (IUCN VI [https://seylii.org/sc/legislation/si/2018/10-0]), and the fully implemented no-take Chagos Archipelago MPA (IUCN Ib; Hays et al., 2020) in the British Indian Ocean Territory. To our knowledge, the marine spatial planning (MSP) and designation of these MPAs did not rely on sperm whale distribution data (e.g., https://seymsp.com/).

Sperm whale occurrence data
We collated historical and modern sperm whale occurrence data for the WIO and identified environmental correlates of sperm whale habitat by building 2 habitat suitability models based on historical-only and modern-only occurrence data. Historical distribution records for sperm whales dating from 1792 to 1912 in the WIO were obtained from the Census of Marine Life collection of American offshore whaling logbooks, which includes records of 36,909 sperm whales seen by American whalers (Smith et al., 2012). This archive has been used previously to investigate sperm whale distribution in the Eastern Indian Ocean (Johnson et al., 2016) and Indonesia (Sahri et al., 2020).
In the absence of modern standardized cetacean survey effort covering the entire WIO, we conducted a review of sperm whale occurrence records from recent (1995-2020) scientific surveys. We searched online species occurrence databases (www.obis.org, www.marinespecies.org) and conducted internet and Clarivate searches (www.webofknowledge.com) for this 25-year period with a Boolean combination of terms (sperm whale*, Physeter*, Physeter macrocephalus*). We reviewed and retained search results if they included map locations or precise coordinates of sperm whale occurrences or if the author could be contacted for data access and if reports were available in English and published in the primary or gray literature. The latter selection criterion likely introduced a bias because records published in other languages would have been missed. We populated our database by extracting coordinates or digitizing maps of occurrence and survey effort. We plotted occurrences with QGIS 3.10 (QGIS Geographic Information System 2020) to check accuracy of location and excluded records that were obviously out of place.

Habitat suitability analyses
We limited our analysis of environmental conditions to static variables derived from bathymetry and geomorphic features (Bouchet et al., 2015; Harris et al., 2014; Yesson et al., 2021), rather than including dynamic variables, such as sea surface temperature and chlorophyll a, for which only modern-day data are available (Boyce et al., 2010). We built on the literature to identify 5 static variables hypothesized to be of ecological relevance to sperm whale habitat suitability and to marine predator distribution in general: seabed depth, seabed slope, distance to 1000-m isobath, distance to seamount, and distance to spreading ridge (Bouchet et al., 2015; Appendices S1 & S2). All variables were in ASCII raster format and were extracted for historical and modern sperm whale occurrences using the extract function from the raster package (Hijmans, 2021) in R 4.1.0 (R Development Core Team, 2021). We rarefied our occurrence points to match the resolution of our environmental raster layers (1/6° × 1/6°) to ensure a maximum of 1 occurrence per raster cell.

FIGURE 1
Study region in the Western Indian Ocean and sperm whale occurrence records: (a) bathymetry and main underwater features, including basins, seamounts, plateaus, and spreading ridges; (b) territories and exclusive economic zones (EEZs) (some are contested) and marine protected areas (MPAs) (IUCN-UNEP, 2021); (c) historical sperm whale occurrence records from the Census of Marine Life (COML) (Smith et al., 2012); and (d) modern sperm whale occurrence records (OA, Ocean Alliance [de Vos et al., 2012; Wise et al., 2009]; REMMOA, census of marine mammals and other pelagic megafauna by aerial survey [Mannocci et al., 2013, 2015]; NOAA, National Oceanic and Atmospheric Administration [Balance & Pitman, 1996; Ballance et al., 1998, 2001]).
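The rarefaction of occurrence points to one record per raster cell can be illustrated with a minimal sketch. The analysis itself was run in R; the Python below is only an illustration, and the function names (`cell_index`, `rarefy`), the rule of keeping the first record encountered in each cell, and the example coordinates are assumptions rather than details taken from the study.

```python
import math

CELL_DEG = 1.0 / 6.0  # raster resolution stated in the text (1/6 degree)


def cell_index(lon, lat, cell_deg=CELL_DEG):
    """Return the (column, row) index of the grid cell containing a point."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def rarefy(points, cell_deg=CELL_DEG):
    """Keep only the first occurrence encountered in each grid cell."""
    kept = {}
    for lon, lat in points:
        # setdefault stores the point only if the cell has no record yet
        kept.setdefault(cell_index(lon, lat, cell_deg), (lon, lat))
    return list(kept.values())


# Hypothetical occurrences: the first two fall in the same 1/6-degree cell.
occurrences = [(55.01, -4.62), (55.05, -4.60), (72.40, -6.30)]
thinned = rarefy(occurrences)
```

Here `thinned` retains two of the three hypothetical points, because the second point shares a cell with the first.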
We calculated geodetic distance from each cell to the nearest coast and to different geomorphic features of interest (Harris et al., 2014; Appendices S1 & S2) with the st_distance() function from the sf package in R (Pebesma, 2021). We tested for collinearity between variables, treating pairs with an absolute Pearson's correlation >0.7 as collinear; no pair of variables exceeded this threshold. We tested for differences in variable values between historical and modern occurrences with Kruskal-Wallis tests.
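A collinearity screen of this kind can be sketched as follows. This is an illustrative Python version, not the R code used in the study; the function names and the toy values are hypothetical. The toy data deliberately include one collinear pair to show what the screen would flag, whereas no pair exceeded the threshold in the actual analysis.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def collinear_pairs(variables, threshold=0.7):
    """Return variable pairs whose absolute correlation exceeds the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(variables.items(), 2):
        r = pearson_r(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical values at five cells; depth and isobath distance are made
# perfectly collinear on purpose, slope is uncorrelated with both.
toy = {
    "depth_m": [1000, 2000, 3000, 4000, 5000],
    "dist_isobath_km": [0, 50, 100, 150, 200],
    "slope_pct": [2, 1, 3, 1, 2],
}
flagged = collinear_pairs(toy)
```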

FIGURE 2
Background points for (a) historical and (b) modern Maxent models of whale sightings. Historical background points refer to position of whaling vessel days on which no whales were sighted (no encounter) and for sightings other than sperm whales (other sighting) (data from Census of Marine Life [Smith et al., 2012]). Scientific survey effort (c) associated with modern records (OA, Ocean Alliance [de Vos et al., 2012; Wise et al., 2009]; REMMOA, Census of marine mammals and other pelagic megafauna by aerial survey [Mannocci et al., 2013, 2015]; NOAA, National Oceanic and Atmospheric Administration [Balance & Pitman, 1996; Ballance et al., 1998, 2001]). See Appendices S3 and S5 for references.
We used a maximum entropy modeling approach (Maxent; Phillips et al., 2006) in the ENMeval 2.0 package to build our models (Kass et al., 2021; Muscarella et al., 2014). Maxent uses occurrence records to predict geographically continuous species distributions based on maximum entropy; occurrence means are associated with environmental variable means (Phillips et al., 2008). Unlike some other approaches, Maxent can use presence-only data (Elith et al., 2011) to identify suitable habitat outside the surveyed area. Although Maxent is commonly used for data sets with unknown effort, as is often the case for historical archives (Turvey et al., 2020), accounting for sampling bias and uneven effort reduces omission and commission errors (Fiedler et al., 2018; Kramer-Schadt et al., 2013; Phillips et al., 2009). We accounted for uneven survey effort by ensuring that our background points were similarly biased to the survey effort. For the historical model, we randomly selected 10,000 background points from positions of vessel days on which whales were not observed and from positions of sightings of species other than sperm whales (Smith et al., 2012) across the study region (Figure 2a). For the modern model, sampling effort was available for 95% of occurrences (Appendix S3). We therefore generated our background points along survey track lines (Figure 2b,c) by dividing track lines into 10-km segments and generating 10,000 background points equally across segments, following Fiedler et al. (2018). To account for the greater number of historical occurrences, we included a sensitivity analysis by taking a random subsample of historical records equal to the sample size of modern records and then rerunning the models with 10 repetitions.
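The generation of effort-biased background points along survey track lines can be illustrated with a simplified sketch. It is not the procedure of Fiedler et al. (2018) as implemented in the study: the function names, the straight-line interpolation in longitude and latitude, the round-robin allocation of points to segments, and the example track are all assumptions made for illustration.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in km between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def interpolate(p, q, t):
    """Linear interpolation in lon/lat (an approximation over short legs)."""
    return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))


def split_leg(start, end, seg_km=10.0):
    """Split one straight survey leg into equal pieces of at most seg_km."""
    n = max(1, math.ceil(haversine_km(start, end) / seg_km))
    nodes = [interpolate(start, end, i / n) for i in range(n + 1)]
    return list(zip(nodes[:-1], nodes[1:]))


def background_points(segments, n_points, seed=0):
    """Allocate points evenly across segments, at a random position in each."""
    rng = random.Random(seed)
    points = []
    for i in range(n_points):
        a, b = segments[i % len(segments)]  # round-robin over segments
        points.append(interpolate(a, b, rng.random()))
    return points


# Hypothetical 1-degree leg along the equator (about 111 km, so 12 segments).
segments = split_leg((55.0, 0.0), (56.0, 0.0))
bg = background_points(segments, n_points=24)
```

Because points are tied to segments of surveyed track, the background sample inherits the spatial bias of the survey effort, which is the property the approach relies on.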
We used the same parameters in both models to facilitate a direct comparison. Because our objective to describe sperm whale habitats had to be balanced with our aim to achieve modeling structures easily interpreted in terms of ecological processes and to facilitate comparison between models based on different data types, we opted to fit simple Maxent models with linear and quadratic as the feature classes in the maxnet algorithm (Kass et al., 2021).
All environmental variables were aggregated to a 1/6° × 1/6° grid, giving 137,943 cells across the study region for model predictions. We implemented a 4-fold geographic cross-validation following a checkerboard pattern at a resolution of 2° × 2° for model evaluation. To assess congruence between predicted and observed presences, we used the Continuous Boyce Index (CBI) (Boyce et al., 2002), which gives a score on a scale from −1 to 1 (positive values indicate congruence, values close to 0 indicate the model does not differ from a random model, and negative values indicate noncongruence [Hirzel et al., 2006]). We assessed out-of-sample predictive performance with the area under the curve (AUC) of the receiver operating characteristic (ROC) (Liu et al., 2005), where a purely random ranking had an AUC of 0.5 and models with AUC scores >0.6 were considered well fitted (Elith, 2000).
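The idea of a 4-fold checkerboard partition can be sketched as below. This is a simplified illustration only: the hypothetical `checkerboard_fold` function assigns each point to 1 of 4 folds from the parity of its 2° block row and column, which is an assumption and may differ in detail from the partitioning routine used in ENMeval.

```python
import math


def checkerboard_fold(lon, lat, block_deg=2.0):
    """Assign a point to one of 4 folds from the parity of its block indices."""
    col = math.floor(lon / block_deg)
    row = math.floor(lat / block_deg)
    # Python's % always returns a non-negative result, so southern
    # latitudes (negative rows) are handled without special cases.
    return 2 * (row % 2) + (col % 2)


# Four hypothetical points in four neighboring 2-degree blocks.
examples = [(26.5, -39.5), (28.5, -39.5), (26.5, -37.5), (28.5, -37.5)]
folds = [checkerboard_fold(lon, lat) for lon, lat in examples]
```

Neighboring blocks always fall in different folds, so each held-out fold is spatially interleaved with, but separate from, the training blocks.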
We applied the convex hull approach (Mannocci et al., 2018) to evaluate the extent of extrapolation in environmental space in predictions for both historical and modern models. Specifically, we calculated the multidimensional convex hull associated with the 5 predictors with the background points (historical or modern) as the reference set and the environmental rasters as the test set. Areas lying outside the multidimensional convex hulls were extrapolations in environmental space because those combinations of environmental conditions were not sampled by our background points, meaning that the associated predictions were likely unreliable. We used the WhatIf package (Stoll et al., 2020) to perform these calculations. Finally, we assessed congruence between the predictions from the 2 models on a cell-by-cell basis with Pearson's correlation tests.
We identified areas of spatial noncongruence between historical and modern models by mapping the residuals from a linear correlation between the predictions of each model (Turvey et al., 2020). We further contrasted the continuous habitat between the 2 models by producing binary maps showing regions of highly suitable and unsuitable habitat. We set the suitability threshold following the recommendations of Liu et al. (2013) by maximizing the sum of sensitivity and specificity (max SSS).
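Selecting a threshold by maximizing the sum of sensitivity and specificity can be sketched as follows. This is an illustrative Python version under stated assumptions: background points stand in for absences when computing specificity, candidate thresholds are the observed scores themselves, and the function name and toy scores are hypothetical.

```python
def max_sss_threshold(presence_scores, background_scores):
    """Return the threshold maximizing sensitivity + specificity, and that sum."""
    candidates = sorted(set(presence_scores) | set(background_scores))
    best_threshold, best_sum = None, -1.0
    for t in candidates:
        # Sensitivity: share of presences predicted suitable (score >= t)
        sensitivity = sum(s >= t for s in presence_scores) / len(presence_scores)
        # Specificity: share of background points predicted unsuitable (score < t)
        specificity = sum(s < t for s in background_scores) / len(background_scores)
        if sensitivity + specificity > best_sum:
            best_threshold, best_sum = t, sensitivity + specificity
    return best_threshold, best_sum


# Hypothetical suitability scores at presence and background locations.
presence = [0.9, 0.8, 0.7, 0.6, 0.3]
background = [0.1, 0.2, 0.3, 0.4, 0.5]
threshold, sss = max_sss_threshold(presence, background)
```

Cells with predicted suitability at or above `threshold` would then be mapped as highly suitable and the remainder as unsuitable.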

Spatial protection assessment and differences in relative habitat suitability
To assess how different data types can influence spatial management decision-making, we used the habitat predictions from our different models to investigate 2 related mechanisms that have been proposed to limit the recovery potential of threatened species and the conservation impact of MPAs. First, we assessed whether MPA placement in the WIO is residual with respect to sperm whales (i.e., provides ineffective protection because it contains low habitat suitability for target species [Devillers et al., 2014]). For each model, we compared predicted habitat suitability values for each 1/6° × 1/6° raster cell inside MPAs versus outside with Kruskal-Wallis tests and regional MPA coverage obtained from the World Database of Protected Areas (IUCN-UNEP, 2021). We also contrasted the number of high suitability cells, as defined by our binary maps, inside MPAs versus outside, for both models. Because sperm whale distribution did not explicitly feature in the MSP process of the largest MPAs in the region that may be of sufficient size to protect sperm whales, this analysis primarily aimed at highlighting how using historical versus modern data could lead to different conclusions concerning habitat suitability, with ramifications for MSP. A more formal analysis of the implications for MSP could be achieved by conducting separate MSP with the historical and modern data, in addition to biodiversity and socioeconomic data, which are held constant, and seeing how the solutions change. Such an analysis is beyond the scope of this article.
Second, we explored whether sperm whales in the WIO represent a "refugee" population (i.e., have experienced historical contraction in relative habitat suitability that restricts them to marginal or suboptimal present-day habitat [Kerley et al., 2020]). We used the binary maps to identify habitat lost and habitat gained between the historical and modern eras and to determine their area. We further assessed whether habitats gained were farther from the coast than habitats lost. This approach is in keeping with current understanding concerning anthropogenic impacts disproportionately affecting coastal areas (Williams et al., 2021) compared with the open ocean (Jones et al., 2018). Sperm whales would meet the definition of a refugee species in the WIO context if distances to the coast of habitat gained in the modern model are greater than those of habitat lost in the historical model.
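The comparison of habitat lost and gained between binary maps can be sketched as below. This is a minimal illustration, not the study's code: the cell records, field names, and use of the median as the summary of distance to coast are hypothetical, and the two thresholds are the max SSS values reported in the results (0.54 historical, 0.43 modern).

```python
from statistics import median


def classify_change(hist_high, mod_high):
    """Label a cell by comparing its historical and modern binary suitability."""
    if hist_high and not mod_high:
        return "lost"
    if mod_high and not hist_high:
        return "gained"
    return "unchanged"


def summarize_change(cells, hist_threshold, mod_threshold):
    """Count lost and gained cells and summarize their distance to coast."""
    distances = {"lost": [], "gained": []}
    for cell in cells:
        change = classify_change(cell["hist"] >= hist_threshold,
                                 cell["mod"] >= mod_threshold)
        if change in distances:
            distances[change].append(cell["coast_km"])
    return {
        change: {"n_cells": len(d), "median_coast_km": median(d) if d else None}
        for change, d in distances.items()
    }


# Hypothetical cells: predicted suitability per era and distance to coast.
cells = [
    {"hist": 0.70, "mod": 0.30, "coast_km": 50},
    {"hist": 0.60, "mod": 0.40, "coast_km": 150},
    {"hist": 0.20, "mod": 0.50, "coast_km": 900},
    {"hist": 0.30, "mod": 0.45, "coast_km": 700},
    {"hist": 0.80, "mod": 0.60, "coast_km": 100},
]
summary = summarize_change(cells, hist_threshold=0.54, mod_threshold=0.43)
```

Under the refugee-species criterion described above, the quantity of interest is whether the distance to coast of gained cells exceeds that of lost cells, as it does in this toy example.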
All data and scripts to reproduce the statistical analysis and figures are available at https://github.com/LauraMannocci/spermwhale.

Historical and modern sperm whale records
We compiled 543 sperm whale records within the WIO from the Yankee whaling era and 219 records from the modern era, following rarefaction, originating from a variety of survey platforms (oceanographic vessels, patrol vessels, aircraft, small boats) and methods (visual transects, passive acoustics) (Appendix S3). The whaling and modern survey data sets provided broad geographical coverage of the WIO (Figure 1c,d) and associated environmental conditions (Figure 3). Modern sperm whale records that did not meet our selection criteria (e.g., because of the resolution at which the records had been gridded [Escalle et al., 2015]) were limited to a few records from a relatively small geographical range (n = 40) (Appendix S4). Value ranges of environmental variables associated with historical records were greater than those in the modern data set, with the exception of seabed slope (Figure 3b). Values of seabed depth, seabed slope, distance to 1000-m isobath, and distance to seamount associated with sperm whale records differed significantly between the 2 data sets (Kruskal-Wallis tests, p < 0.001); distance to spreading ridge did not.

Maxent model performance and outputs
Both Maxent models performed satisfactorily; cross-validation indicated that habitat predictions were more consistent with observed presence records for the historical model than the modern model (CBI Relationships between environmental variables and habitat suitability were broadly consistent between models (Figure 4a), with some noteworthy differences resulting in modestly consistent model predictions. Both models predicted increased suitability in areas with deeper water, as represented by seabed depth and proximity to 1000-m isobath and seamounts (Figure 4b,d-f). Habitat suitability was weakly and negatively associated with seabed slope in the historical model, in contrast to being strongly and positively associated with slope in the modern model (plateau reached at 20% slope) (Figure 4c). In the historical model, the 1000-m isobath had a higher habitat suitability peak and a sharper drop-off than in the modern model (Figure 4d).
The area of environmental extrapolation was larger in the modern model (Figure 5a,b). The max SSS threshold denoting high habitat suitability was 0.54 for the historical model and 0.43 for the modern model. The historical model predicted high habitat suitability and prominent core habitat in association with continental slopes and islands (Figure 5a,b), whereas the modern model predictions appeared more homogeneous across the study region but had higher suitability that was closely associated with steep slopes. Both models predicted low habitat suitability in deep basins, such as the Arabian basin and Mid-Indian basin (Figure 5c,d).

FIGURE 3
Bathymetric and geomorphic variable values associated with historical and modern occurrences of sperm whales (numbers, median values; box, first and second quartiles; bar ends, range). Significant differences among variables in historical and modern values were tested with Kruskal-Wallis tests (**p < 0.01; ***p < 0.001; ns, not significant).
Habitat suitability predictions between the historical and modern models were relatively spatially congruent (Pearson's correlation = 0.496) (Appendix S7). Positive residuals, reflecting higher predicted suitability in the historical model than in the modern model, were located on continental shelves, along the eastern Arabian Peninsula, east of Madagascar, in the Mascarene basin and plateau, and near steep and remote oceanic areas, such as the southern sections of the southwest Indian Ridge. Negative residuals, reflecting higher predicted suitability in the modern model than in the historical model, were in coastal areas in association with steep slopes, and in open-ocean habitats in deep areas, such as the Arabian and Mid-Indian basins. Strongly negative residuals were particularly prominent near raised bathymetric features, such as remote and midocean sections of the Indian and Carlsberg ridges.

Spatial protection and refugee species
Relative habitat suitability was predicted to be higher in MPAs compared with outside; the greatest relative and significant differences were predicted by the historical model (median values inside vs. outside MPAs: historical, 0.67 and 0.55; Kruskal-Wallis test, p < 0.0001; modern, 0.51 and 0.49, p < 0.0001) (Figure 6a). As a result, more cells classified as of high suitability fell inside MPAs under the historical model (3191 inside vs. 30,754 outside) than under the modern model (2219 inside vs. 33,063 outside) (Figure 6b).
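The cell counts above can be expressed as the share of high-suitability cells that fall inside MPAs under each model. The short calculation below is only a worked illustration of that arithmetic, using the counts reported in the text; the function name is hypothetical.

```python
def share_inside(n_inside, n_outside):
    """Proportion of high-suitability cells that fall inside MPAs."""
    return n_inside / (n_inside + n_outside)


# Counts of high-suitability cells reported in the text (inside, outside).
historical_share = share_inside(3191, 30754)
modern_share = share_inside(2219, 33063)
```

On these counts, about 9.4% of high-suitability cells lie inside MPAs under the historical model, compared with about 6.3% under the modern model.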
We estimated that the approximate area of high relative habitat suitability lost is 2,156,175 km² (about 216 million ha; 6300 cells) and the area of high relative habitat suitability gained is 6,514,729 km² (about 651 million ha; 19,035 cells) (Figure 7a). High relative habitat suitability gained was predominantly far from the coast (>300-1300 km), whereas habitat lost was predominantly near the coast (0-300 km) (Figure 7b).
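The reported cell counts and area values are mutually consistent if each 1/6° cell is treated as an 18.5 km × 18.5 km square, in which case the area values are in square kilometers. The cell size is an assumption inferred from the reported numbers, not a figure stated in the text; the sketch below reproduces the arithmetic.

```python
CELL_SIDE_KM = 18.5  # assumed side length of a 1/6-degree cell
HA_PER_KM2 = 100     # 1 square kilometer = 100 hectares


def cells_to_km2(n_cells, side_km=CELL_SIDE_KM):
    """Total area in square kilometers of n_cells square cells."""
    return n_cells * side_km ** 2


def km2_to_million_ha(area_km2):
    """Convert square kilometers to millions of hectares."""
    return area_km2 * HA_PER_KM2 / 1e6


lost_km2 = cells_to_km2(6300)
gained_km2 = cells_to_km2(19035)
lost_mha = km2_to_million_ha(lost_km2)
gained_mha = km2_to_million_ha(gained_km2)
```

Under this assumption, 6300 cells give 2,156,175 km² (about 216 million ha) and 19,035 cells give about 6,514,729 km² (about 651 million ha).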

DISCUSSION
We used sperm whale occurrence records from across 4 centuries to compare habitat predictions between historical and modern data types. By accounting for how different data sets were collected in selection of model background points, we interpreted our model predictions in terms of ecological information content and historical population change. Our findings will help inform future management and planning decisions for conservation and industry impact assessment. Overall, predicted habitat was modestly consistent between the historical and modern models; the latter had a more geographically homogeneous distribution. Our comparisons revealed a noteworthy inconsistency in the relationship between sperm whale occurrence and environmental variables between eras concerning the influence of seabed slope. Critically, historical and modern records yielded conflicting inferences concerning the relative strength of MPA placements, which has implications for conservation planning. Our results suggest that sperm whales experienced a historical contraction in relative habitat suitability in coastal areas and an expansion toward more remote habitats, providing tentative evidence that sperm whales represent a refugee species.

FIGURE 4
Historical and modern Maxent sperm whale habitat suitability models: (a) environmental variable coefficients (right pane, zoomed-out view to show outlier) and partial plots of (b) seabed depth (m), (c) seabed slope (%), (d) distance to 1000-m isobath (km), (e) distance to seamount (km), and (f) distance to spreading ridge (km).

FIGURE 5
Sperm whale relative habitat suitability predictions for (a) historical and (b) modern models with MPA locations overlaid (IUCN-UNEP, 2021) and locations of high predicted relative habitat suitability (above the threshold set by maximizing the sum of sensitivity and specificity following the recommendations of Liu et al. [2013]) for (c) historical and (d) modern models (gray, areas of extrapolated environmental space where predictions should be considered unreliable).

FIGURE 6
Frequency density distribution of relative habitat suitability predictions (a) outside and inside marine protected areas (MPAs) (vertical lines, median; historical, 0.55 and 0.67; modern, 0.49 and 0.51) and (b) number of cells with high relative habitat suitability (above the threshold set by maximizing the sum of sensitivity and specificity following the recommendations of Liu et al. [2013]) outside MPAs and inside MPAs (dark shading, outside MPAs; light shading, inside MPAs; values from extrapolation zones excluded).

FIGURE 7
High relative habitat suitability lost and gained (in blue) in the modern era, compared with (a) the historical era, and (b) corresponding distance to coast excluding values from extrapolation zones.
Considerable efforts have been directed at understanding the distribution of sperm whales across multiple spatiotemporal scales (Jaquet & Whitehead, 1996; Pirotta et al., 2011). At the scale of coastal habitats in the WIO, modeling the distribution of sperm whales was initiated through the REMMOA aerial surveys (Mannocci et al., 2013, 2015), the records of which were included in our analyses. The analytical approach of Mannocci et al. (2013, 2015) was guided by their objectives to model densities of energetically similar species, rather than occurrence, which necessitated that sperm whales and beaked whales be aggregated into a taxonomic guild because they were hypothesized to be energetically similar. Our analyses identified considerable variation in relative habitat suitability for sperm whales across the WIO, thus representing a step forward in establishing a species-specific conservation baseline on which more nuanced management decisions can be based. Specifically, steep slopes should be considered priority regions for at-sea conservation areas (Scales et al., 2014).
Relationships with environmental variables were consistent between models, with the notable exception of slope. We observed depletion in sperm whale habitat over time in areas of historically core habitat in coastal regions. In contrast, relative habitat suitability became comparatively higher in remote open-ocean locations, such as the Mid-Indian and Carlsberg ridges and the Mid-Indian basins. This relative difference resulted in MPA placements that were less favorable in the modern era. We suggest 4 hypotheses, outlined below, to explain the difference in predicted relative habitat suitability between the 2 models: differences in motivations between commercial whaling operations and modern scientific surveys; spatial differences in relative population depletion; changes in environmental conditions leading to differences in habitat use; and avoidance behavior leading to differences in habitat use.
Experience from the terrestrial realm suggests that predicted species distributions derived from historical records are rarely consistent with distributions derived from more recent systematic surveys (e.g., Turvey et al., 2020). Yankee whaling activities were, by their very nature, economically motivated to maximize catch rates while minimizing overall time spent at sea. For this reason, these whalers fished during all states of the monsoon seasons (Sahri et al., 2020) and primarily targeted large groups of whales (25-50 individuals), which were typically made up of females and their young in tropical waters (Best, 1979; Whitehead, 2002). Interestingly, modern sperm whale occurrences derived from tuna purse seiners, which are also economically motivated, fall predominantly within the area covered by historical whaling records (Escalle et al., 2015). In contrast, scientific surveys are driven by other incentives. For example, surveys interested in collecting data on biodiversity baselines (Graham & McClanahan, 2013) will preferentially target remote and inaccessible regions and are likely to yield a greater number of records there. This expected spatial variation in patterns of data collection is reflected in our modern survey data, which contained a relatively greater portion of effort in the high seas and in remote atolls, such as the Chagos Archipelago and the Maldives, which may explain why marginal (and often remote) habitats are relatively better defined by the modern surveys and why our modern model predicted higher relative suitability for sperm whales in such regions.
If we interpret these differences in predicted relative habitat suitability in the context of anthropogenic sperm whale population declines across recent centuries (Whitehead, 2002), differences between the models may reflect genuine differences in intensity of habitat use by sperm whales following intensive historical hunting. Many marine predators have experienced range contractions and regional extinctions following population depletions (Rodrigues et al., 2018; Worm & Tittensor, 2011). Generally, such contractions are most pronounced where humans are most active, typically along highly settled coastlines or key fishing grounds. This spatially biased pattern of historical depletion has led to a global rarity of intact coastal ecosystems (Williams et al., 2021); most pristine marine ecosystems are restricted to the open ocean (Jones et al., 2018). Open-ocean waters in the WIO were not important sperm whaling grounds during the Yankee whaling era because most catch records were located coastally. In contrast, the spatial footprint of sperm whale catches during the industrial whaling era is less clearly documented (Allison, 2020) (but see Johnson et al. [2016]), although industrial whaling was probably less geographically restricted than earlier Yankee whaling due to technological developments and to the establishment of colonial whaling stations in remote regions (e.g., the Norwegian-French station of Port Jeanne d'Arc in Kerguelen). Sperm whale populations are estimated to have experienced the biggest decline during 1950-1970 (from 66% to 33% of estimated baseline levels) (Whitehead, 2002), during which sperm whales in the WIO were opportunistically caught in the temperate open ocean, en route to Antarctic whaling grounds, by Japanese (Machida, 1975) and Soviet whalers (Clapham & Ivashchenko, 2009). The sharp drop in sperm whaling catches between 1975 and 1985 (Whitehead, 2002) therefore probably coincided with the persecution of these last pockets of open-ocean sperm whales.
Many species' ranges and habitat use are changing as a consequence of changes in abiotic (e.g., temperature, currents) and biotic (e.g., distribution of prey, competitors, and predators) conditions. Sperm whales are considered highly adaptable and are likely sensitive to environmentally mediated shifts in mesopelagic prey distribution (e.g., Proud et al., 2017). Because of uncertainty in how these conditions may have changed during our focal time frame, we opted to exclude nonstatic variables from our analyses, and we are, as a consequence, unable to discount this possibility. However, the sperm whale feeding range is likely one of the greatest on Earth. We further expect this species to be less sensitive to climatic change than other whales with more restrictive habitat preferences. To our knowledge, evidence of sperm whale range shifts related to climatic change is limited to examples where a pod expanded into new habitats made accessible by sea ice melting (Posdaljian et al., 2022).
Many predators express highly adaptive avoidance behavior, and the complexity of cetacean behavior should caution against simplistic interpretation of our models. Avoidance behavior by sperm whales has been observed in response to local stressors, such as seismic surveys (Mate et al., 2011). More recently, it has been proposed that the onset of open-boat whaling in the 18th and 19th centuries triggered a rapid spread of avoidance behavior in sperm whales, such that sightings decreased ahead of population decline as animals learned to avoid boats (Whitehead et al., 2021). It is therefore possible that sperm whales learned to avoid regions of persistent human activity and persecution and migrated to regions that were more inaccessible or more rarely visited by humans (e.g., steep slopes on continental margins or noncoastal habitat), leading to distribution shifts. To our knowledge, such avoidance behavior has never been documented in marine mammals at the scales we considered (i.e., >200 nautical miles). Because sperm whales are highly migratory and adaptable to new habitats (Posdaljian et al., 2022), this is an intriguing possibility, with implications for recolonization potential.
We were unable to discriminate between these 4 competing hypotheses. However, the relative difference in habitat suitability between the open ocean and coastal regions ought to be most credible in locations that had relatively high amounts of effort in the historical as well as in the modern era. These are regions that were important whaling grounds and that have since been visited by multiple scientific surveys in the modern era, such as the Seychelles and Mauritius. That the coastal regions in the Seychelles and Mauritius were not predicted to contain an equally pronounced gradient in high habitat suitability relative to the open ocean in the modern era compared with the historical era is therefore suggestive. We therefore contend that these locations were indeed more important habitats relative to the open ocean prior to the onset of whaling and that our results support change in relative habitat use, caused either by population decline or by avoidance behavior. Furthermore, these findings are consistent with expectations from both the demographic model of population decline (Channell & Lomolino, 2000b), which predicts that core populations should persist until the final stages of declines, and the contagion model (Channell & Lomolino, 2000a), which predicts that remnant populations are restricted to remote and inaccessible refuges far from humans. This further suggests that sperm whales represent a refugee species.
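The between-era comparison discussed above (high relative habitat suitability lost or gained, and its relation to distance to coast) can be illustrated with a minimal sketch. This is not the analysis code used in the study: the function name `classify_change`, the thresholds of 0.5, and all cell values are hypothetical and chosen only to show the logic, and cells flagged as extrapolation zones are simply skipped.

```python
from statistics import median

def classify_change(historical, modern, hist_threshold, mod_threshold, extrapolated):
    """Label each grid cell by how its high-suitability status differs between eras.

    All inputs are hypothetical, equal-length lists (one entry per grid cell).
    Cells flagged as extrapolation zones receive None and are ignored later.
    """
    labels = []
    for h, m, skip in zip(historical, modern, extrapolated):
        if skip:
            labels.append(None)
            continue
        was_high = h >= hist_threshold  # high suitability in the historical era
        is_high = m >= mod_threshold    # high suitability in the modern era
        if was_high and not is_high:
            labels.append("lost")
        elif is_high and not was_high:
            labels.append("gained")
        elif was_high:
            labels.append("stable_high")
        else:
            labels.append("stable_low")
    return labels

# Toy values for six cells (illustrative only, not data from the study).
historical = [0.8, 0.7, 0.3, 0.2, 0.9, 0.6]
modern = [0.3, 0.6, 0.7, 0.1, 0.8, 0.2]
coast_km = [10, 40, 600, 900, 25, 30]  # hypothetical distance to coast
extrapolated = [False, False, False, False, True, False]

labels = classify_change(historical, modern, 0.5, 0.5, extrapolated)

# Compare distance to coast for cells that lost versus gained high suitability.
lost_km = [d for d, lab in zip(coast_km, labels) if lab == "lost"]
gained_km = [d for d, lab in zip(coast_km, labels) if lab == "gained"]
print("median distance to coast, lost cells:", median(lost_km))
print("median distance to coast, gained cells:", median(gained_km))
```

In this toy example the lost cells sit closer to the coast than the gained cell, which is the direction of change the discussion describes; real rasters would need a shared grid and era-specific thresholds.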
Our findings of differences in predicted relative habitat suitability between historical and modern eras have 2 important implications for large-scale conservation planning of marine mammals, including the location of future IMMAs in the region. First, reliance on historical data to inform prioritization could lead to poorer conservation outcomes because the species no longer occurs in the areas identified as core habitat. Here, a spatial zoning plan derived on the basis of the historical habitat suitability would give relatively high conservation scores to coastal protection zones, whereas the conservation benefit of those zones would be less in the modern habitat suitability compared with open-ocean zones. Although it may be tempting to supplement modern cetacean records with historical data to better guide spatial management (e.g., Johnson et al., 2016), this may lead to suboptimal decisions and residual placement if the underlying distribution has changed between eras.
Second, reliance on modern data alone to inform prioritization could lead to poorer conservation outcomes because the areas identified as core habitat are marginal, and core habitat is not protected. For a species of interest, conservation planners should balance the estimated magnitude of historical population decline with the extent of current threats when deciding on whether to prioritize halting ongoing decline or promoting future recovery. For sperm whales, whose populations still survive globally in spite of historical population declines (32% of prewhaling levels [Whitehead, 2002]), designating PAs to promote population recovery may be more beneficial than ending threats in locations that may only represent historically marginal habitats. Furthermore, the risks of poor conservation outcomes (first implication) may be deemed acceptable. In fact, predictions of greater historical habitat suitability inside MPAs mean that some recovery of sperm whales may already be promoted under current MPA placement in the WIO. However, for highly depleted species, such as the North Atlantic right whale (Eubalaena glacialis) (e.g., <6% [Monsarrat et al., 2016; Rodrigues et al., 2019]), halting current threats may be more important, meaning that PAs should target high habitat suitability predicted from modern records. Missing core habitat (second implication) is then less of a concern, given that the immediate priorities lie with the survival of the species.
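The inside-versus-outside MPA comparison that underlies these implications can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the threshold is picked by maximizing the sum of sensitivity and specificity (the criterion recommended by Liu et al. [2013]), here computed against hypothetical background points, and the function names and all scores are invented for the example.

```python
def max_sss_threshold(presence_scores, background_scores):
    """Return the candidate threshold that maximizes sensitivity + specificity.

    presence_scores: predicted suitability at occurrence records.
    background_scores: predicted suitability at background (pseudo-absence)
    points; treating these as absences is an assumption of this sketch.
    """
    candidates = sorted(set(presence_scores) | set(background_scores))
    best_t, best_sum = None, -1.0
    for t in candidates:
        # Sensitivity: share of presences predicted as high suitability.
        sensitivity = sum(s >= t for s in presence_scores) / len(presence_scores)
        # Specificity: share of background points predicted as low suitability.
        specificity = sum(s < t for s in background_scores) / len(background_scores)
        if sensitivity + specificity > best_sum:
            best_t, best_sum = t, sensitivity + specificity
    return best_t

def high_suitability_share(scores, in_mpa, threshold):
    """Proportion of high-suitability cells inside and outside MPAs."""
    inside = [s for s, flag in zip(scores, in_mpa) if flag]
    outside = [s for s, flag in zip(scores, in_mpa) if not flag]
    share_in = sum(s >= threshold for s in inside) / len(inside)
    share_out = sum(s >= threshold for s in outside) / len(outside)
    return share_in, share_out

# Toy scores (illustrative only, not values from the study).
presence = [0.9, 0.8, 0.7, 0.4]
background = [0.1, 0.2, 0.3, 0.6, 0.5]
threshold = max_sss_threshold(presence, background)

cell_scores = [0.9, 0.75, 0.3, 0.2, 0.8, 0.1]
cell_in_mpa = [True, True, True, False, False, False]
inside_share, outside_share = high_suitability_share(cell_scores, cell_in_mpa, threshold)
print("threshold:", threshold)
print("high-suitability share inside MPAs:", inside_share)
print("high-suitability share outside MPAs:", outside_share)
```

Running the same comparison once with historical predictions and once with modern predictions would show how the apparent favorability of a given MPA network depends on which era's model is used.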
Through the incorporation of modern and historical long-term data, we demonstrated that past and present data can help identify consistently important habitats and their environmental predictors for threatened species. More generally, our results illustrated that, for some declining species, only incomplete inferences of current-day distribution can be made from historical archives. However, comparison of patterns between past and present baselines also provides novel insights that could not otherwise be obtained from consideration of modern-day data alone. More broadly, we encourage further evaluation of existing historical archives collated from past human interactions with cetaceans and other marine species (Kittinger et al., 2015) and their inclusion (perhaps analyzed in tandem with data on human impact through time) as important yet underused components of the conservation tool kit. In the WIO region, cetacean conservation planning remains poorly informed. Specifically for sperm whales, our models identified several regions of potential population recovery (inside MPAs) and potential refuges (outside MPAs). These locations, associated with the South West Indian Ridge and in the central Indian Ocean adjacent to the Chagos Archipelago, should be prioritized for further investigation and considered for promotion to IMMA-AOI. To this effect, we are expanding our spatial analysis through novel surveys (Letessier et al., 2022) and by considering other cetacean taxa.

ACKNOWLEDGMENTS
We thank V. Ridoux and the PELAGIS Observatory-UMS 3462 (La Rochelle University-CNRS) and Office Français de la Biodiversité for the access to aerial observations of the REMMOA South West Indian Ocean survey. We thank P. Sabarros and P. Bach (IRD) for facilitating access to fishery data. We are grateful to MRAG Ltd for their data on daily observations of cetaceans in the Chagos Archipelago (BIOT), particularly J. Clark for data curation and T. Franklin, S. Browning, D. Hughes, R. Hartnell, J. White, and Y. Barnes. We thank M. Procknik from the New Bedford Whaling Museum for help with accessing whaling logs and The Science Museum, London, for use of their microfilm reader. We thank N. Casajus for his assistance with coding of the R compendium, R. Chang at the Royal Veterinary College for advice on statistics and methodology, and P. Boersch-Supan and C. Collins for assisting with GIS and extraction of geomorphic features. T.B.L. was funded by the synthesis center CESAB of the French Foundation for Research on Biodiversity (FRB; www.fondationbiodiversite.fr), the Mediterranean Centre for Environment and Biodiversity Laboratory of Excellence (CeMEB LabEx) (https://www.labex-cemeb.org), and the Bertarelli Program in Marine Science.
Open access publishing facilitated by The University of Western Australia, as part of the Wiley - The University of Western Australia agreement via the Council of Australian University Librarians.