Prey resources are equally important as climatic conditions for predicting the distribution of a broad‐ranged apex predator

A current biogeographic paradigm states that climate regulates species distributions at continental scales and that biotic interactions are undetectable at coarse‐grain extents. However, advances in spatial modelling show that incorporating food resource distributions are important for improving model predictions at large distribution scales. This is particularly relevant to understand the factors limiting the distribution of widespread apex predators whose diets are likely to vary across their range.


| INTRODUC TI ON
Within biogeographic theory, climate is hypothesized to be the main driver of species distributions at continental scales (Louthan et al., 2015;Wiens, 2011). This is evidenced through the fossil record (Davis & Shaw, 2001) and recent observed trends (Parmesan & Yohe, 2003;Walther et al., 2002). However, the relationship between distribution and climate may be either indirect (Rich & Currie, 2018), an oversimplification (Dallas et al., 2017), or due to historical biogeography (Heads, 2015). Whether biotic resources are more important determinants of species distributions than climatic conditions is still a central issue in ecology and biogeography (Andrewartha & Birch, 1954;Araújo & Rozenfeld, 2014;Heads, 2015;MacArthur, 1972;Wisz et al., 2013). The current paradigm postulates that biotic resources are most apparent at finer geographical scales (Pearson & Dawson, 2003;Peterson et al., 2011), but this assertion may not apply across all taxa.
Biotic effects may be lost at continental scales due to the coarsegrain extent, commonly termed the Eltonian Noise Hypothesis (Soberón & Nakamura, 2009). The Eltonian Noise Hypothesis postulates that because biotic interactions occur at a fine-scale individual level, modelling approaches will fail to recognize them when working at coarse continental scales. Alternatively, biotic resources may correlate closely with abiotic factors; thus, the biotic signal is lost in abiotic environmental space (Brewer & Gaston, 2003). The effect of biotic resources on species distributions can vary markedly across a given species geographic range (Thompson, 2005). Even so, the overriding assumption is that biotic resources require a fine-scale spatial structure to be noticeable (Soberón & Nakamura, 2009), because by definition biotic interactions occur at the individual level (Anderson, 2017). This assumption is applied to multiple biotic interactions such as the presence or absence of mutualists, competitors and predators.
The relationship between the range limits of animals, such as butterflies and nectivorous birds being driven by the distribution of their food plants, is well established (Kass et al., 2019;Wisz et al., 2013). However, whether the same processes act on the distribution of large vertebrate apex predators with more diverse diets is still unclear (Sih, 2005;. It is well-known that apex predators can limit the distribution of their prey species (Holt & Barfield, 2009). However, an outstanding question for large vertebrates is whether the distribution of food resources limits the distribution of their main consumers (Aragón & Sánchez-Fernández, 2013;Louthan et al., 2015;Schweiger et al., 2015;Sih, 2005). The expectation would be for a high overlap between the abiotic conditions in the predator's distribution and those of its prey. Consumer (i.e. predator) distribution should be nested within their main resource distributions (Holt, 1997), but conversely food resource distributions are not reliant on the distribution of their main consumers.
Food resource distributions can be an important predictor for estimating avian distributions at regional or landscape scales (Aragón et al., 2018;Aragón & Sánchez-Fernández, 2013;de Araújo et al., 2014). However, whether the distribution of food resources can successfully predict the presence of a main consumer across continental extents (2000-10,000 km) has not been tested specifically for a terrestrial apex predator. The harpy eagle (Harpia harpyja) is a large Neotropical raptor with a continental range across Central and South America from southern Mexico to northern Argentina (Sutton et al., 2021(Sutton et al., , 2022a(Sutton et al., , 2022bVargas Gonzalez et al., 2006).
Species distribution models (SDMs) are spatial statistical models that establish the environmental range limits of a given species from environmental conditions and resources at known occurrences (Franklin, 2009;Peterson et al., 2011). SDMs have seen rapid advances over the past 20 years, yet there are still outstanding conceptual and methodical issues that need addressing to improve predictions (Guisan et al., 2017). An important current ecological question is whether including biotic interactions in SDMs can increase their predictive power (Anderson, 2017;Dormann et al., 2018;Stephenson et al., 2022;Wiens, 2011;Wisz et al., 2013).
Incorporating food resource distributions into abiotic SDMs can improve model predictive performance, leading to more realistic predictions at regional scales (Aragón et al., 2018;Atauchi et al., 2018;Palacio & Girini, 2018). Moreover, including biotic resources in SDMs is especially relevant for species ranging over lower tropical latitudes with more benign abiotic conditions (Louthan et al., 2015). Indeed, it has long been hypothesized that species range limits in low-latitude areas are driven more by species interactions than climate (Biotic interactions hypothesis, Dobzhansky, 1950;MacArthur, 1972), with support for similar responses along altitudinal gradients (Dvorský et al., 2017;Normand et al., 2009). However, given that all taxa need suitable resources and conditions to survive, species distributions must be regulated by both conditions and resources regardless of scale (Godsoe et al., 2015). Thus, in this tropical forest predatorprey system, biotic resources and abiotic conditions are expected to exert varying but accountable effects on harpy eagle distribution.

K E Y W O R D S
biotic interactions, geographic range size, Harpia harpyja, harpy eagle, prey base, species distribution models Here, we used a hierarchical modelling approach with four SDMs fitted as functions of abiotic conditions and food resource covariates for the harpy eagle using: (1) climatic and topographical covariates (Abiotic model), (2) solely food resource distribution covariates (Biotic model), (3) including food resource distributions individually, and (4) as predicted prey species richness (both Abiotic-Biotic models). Last, pair-wise niche overlaps in geographical space were calculated and all distributions were correlated to determine commonality in distribution for all species in this predator-prey system. Specifically, we sought to establish whether including food resource distributions were more important for predicting distribution at continental scales and meaningfully improved climate-derived model predictions. Further, we quantified the level of niche overlap between the harpy eagle and its main prey in this lowland tropical forest predator-prey system and predicted areas of highest environmental suitability for the harpy eagle and its main food resources.

| Occurrence data
We sourced harpy eagle occurrences from the Global Raptor Impact Network (GRIN, McClure et al., 2021) a data information system for monitoring populations of all raptor species. For the harpy eagle, GRIN consists of occurrence data from two previous distribution assessments (Miranda et al., 2019;Vargas González & Vargas, 2011), which give precise point localities for nests and sightings. In addition, we downloaded presence-only data from the Global Biodiversity Information Facility (GBIF, 2022) but omitted eBird (Sullivan et al., 2009) data points, due to concerns over location accuracy because of the checklist system used in the eBird dataset.
Food resource occurrence data were compiled from GBIF (2019aGBIF ( , 2019bGBIF ( , 2019cGBIF ( , 2019d, using the five most frequent prey by genus (Miranda, 2015): three and two-toed sloth Bradypus Food resources were combined into their respective genera (1) to obtain a higher number of occurrence records for each model, and (2) as an appropriate broad-scale representation of food resource distribution. Two genera were used for capuchin monkey based on a recent taxonomic assessment, with Sapajus (or robust capuchin) found south of the Amazon river and Cebus (or gracile capuchin) north of the Amazon river (Lynch Alfaro et al., 2012). Combined these five food resource genera comprise 73.8% by frequency and 75.6% of biomass, representing the majority of food resources taken by the harpy eagle across its range. We omitted all other known prey types because including these would create unnecessary noise in our predictions and our focus was on correlating harpy eagle occurrence with its main food resources important for conservation applications.
We cleaned occurrences by removing duplicate records, those with no geo-referenced location and only those occurrences recorded from 1960 onwards, to temporally match the timeframe of the environmental covariates. Finally, a manual check for unlikely outliers was performed in the Quantum Geographic Information System software (v3.2.2., QGIS, 2022). For all species occurrences, a 5-km spatial filter was applied between each occurrence point using the 'geoThin' function in the R package 'enmSdm' (Smith, 2019).
Using a 5-km filter approximately matches the resolution of the environmental raster data (~4.5-km) and reduces the effect of biased sampling (Kramer-Schadt et al., 2013). After data cleaning, a total of 609 geo-referenced records were compiled for the harpy eagle.
Applying the 5-km spatial filter resulted in 488 harpy eagle occurrence records for use in the calibration models. Occurrence records used for the food resource species calibration models are given in Table 1.

| Environmental covariates
We downloaded 37 bioclimatic and topographical abiotic raster layers from the WorldClim (v1.4, Hijmans et al., 2005) and ENVIREM (Title & Bemmels, 2018) databases at a spatial resolution of 2.5 arc-minutes (~4.5-km resolution). WorldClim variables (n = 19) are generated through interpolation of average monthly weather station climate data from 1960 to 1990, with ENVIREM variables derived from the WorldClim v1.4 bioclimatic layers. Raster layers for both the harpy eagle and food resource models were cropped to a polygon representing the accessible area for all species (Barve et al., 2011). We based this on the Neotropical ecoregions (Dinerstein et al., 2017) where harpy eagles are known to occur (Vargas Gonzalez et al., 2006). This polygon was then cropped to filtered occurrences for food resource genera used in the food resource distribution models (GBIF, 2019a(GBIF, , 2019b(GBIF, , 2019c(GBIF, , 2019d (Vargas Gonzalez et al., 2006). This improves model predictive power by only including those ecoregions where harpy eagles would most likely be found and reducing the background area used for testing points used in model evaluation (Barve et al., 2011;Radosavljevic & Anderson, 2014).
Before building each model, all covariates for both the harpy eagle and food resource models were tested for multicollinearity underlying occurrences using variance inflation factor (VIF, Hair et al., 2006) in the R package 'fuzzySim' (Barbosa, 2015(Barbosa, , 2018. VIF is based on the square of multiple correlation coefficients, regressing a single predictor variable against all other covariates. A stepwise elimination of highly correlated variables was used retaining covariates with a VIF threshold of <10 , and Spearman's correlation coefficient of r s ≤|0.7| retained for consideration as covariates. We selected environmental covariates for the food resource distribution models based on species biology and reducing collinearity between environmental covariates underlying the occurrences of each specific food resource genus (Meineri et al., 2012). Using this method, all five food resource models used a different set of environmental covariates (Table S1), resulting in low collinearity between the final food resource model raster covariates (all tests VIF = <10; Table S2).
For the Abiotic-Biotic SDMs, these five covariates defining modelled food resource distributions were included in the harpy eagle model calibration as raster layers as has been done in previous studies (Araújo & Luoto, 2007;Gherghel et al., 2018;Preston et al., 2008;Stephenson et al., 2022). Finally, we combined all individual food resource models into a stacked SDM and the continuous suitability values summed for a continuous estimate of food resource species richness in the range of 0.0-5.0. We standardized all raster values for the food resource distribution predictions prior to summing and stacking the rasters with a mean of one and standard deviation of zero, thus treating all prey species equally in the stacking.

For the Abiotic model (A) two climatic variables, Climatic
Moisture Index (CMI) and minimum temperature warmest month, and one topographic variable, Terrain Roughness Index (TRI), were included as covariates. We selected all three covariates a priori because combined they contributed 96% to model prediction from a previous SDM (Sutton et al., 2021) and because all three presented a low level of spatial autocorrelation with the filtered harpy eagle occurrences when assessed with a Mantel correlogram ( Figure S1).

| Species distribution models
We fitted SDMs using penalized elastic-net logistic regression (Fithian & Hastie, 2013), within a point process model (PPM) framework in the R package 'maxnet' . form is equivalent to an IPP and can be interpreted as a measure of relative occurrence probability proportional to a species potential abundance. We used a tuned penalized logistic regression algorithm because this modelling approach outperforms other SDM methods (Valavi et al., 2021), including ensemble-averaged models (Hao et al., 2020).
We used a random sample of 10,000 background points as pseudo-absences recommended for regression-based modelling We only used linear and quadratic features to produce less complex and more realistic predictions (Guevara et al., 2018;Merow et al., 2013). The checkerboard cross-validation method of partitioning masks the geographical structure of the data according to latitude and longitude lines, dividing all occurrences into four spatially independent bins of equal numbers. By masking the geographical structure of test data, the models are projected onto an evaluation region not included in the calibration process. All occurrence and background test points are assigned to their respective bins dependent on location, thus further reducing spatial autocorrelation between testing and training localities (Radosavljevic & Anderson, 2014).
We used the 'checkerboard2' method because this is an appropriate approach when correlating multiple species distributions in the same analogue environmental space but not transferring models into nonanalogue conditions across space and time (Muscarella et al., 2014). We used response curves, parameter estimates and per cent contribution to model prediction to measure variable performance within the optimal calibration models.

| Model evaluation
We evaluated model performance using both threshold-  (Smith, 2019). We evaluated models against random expectations using partial Receiver Operating Characteristic ratios (pROC), which estimate model performance by giving precedence to omission errors over commission errors (Peterson et al., 2008). Partial ROC ratios range from 0 to 2 with 1 indicating a random model. Function parameters were set with a 10% omission error rate, and 1000 bootstrap replicates on 50% of test data to determine significant ( = 0.05) pROC values >1.0 in the R package 'ENMGadgets' (Barve & Barve, 2013).

| Geographical overlap and correlation
We measured pair-wise geographic overlaps between the harpy eagle and the five prey distributions using Schoener's D (Schoener, 1968;Warren et al., 2008), which ranges from 0 (no overlap) to 1 (identical overlap). To estimate the areas where all six species coincide, the three harpy eagle SDM predictions that used biotic covariates (models B, A + B and A + SR) were first stacked and their respective CBI scores used to calculate a weighted mean ensemble prediction.
Second, the five prey distribution SDMs were also stacked into a single raster. Last, we then predicted a measure of commonality in species distribution by intersecting the harpy eagle ensemble prediction, with the stacked prey distribution rasters with a threshold of 0.5 using the 'stability' function in the R package 'sdStaf' (Atauchi, 2018).

| Food resource distribution models
Optimal model selection (ΔAIC c = 0.0) for the capuchin monkey, howler monkey and three-toed sloth distribution models had feature classes linear and quadratic and a regularization multiplier β = 1. The best-supported two-toed sloth model had a β = 1.5 and the bestsupported tree porcupine model had a β = 3.5, both with linear and quadratic feature class functions. Discrimination ability (OR10) for all models was at or close to the expected threshold of 0.10 ( Table 2).
Final best-fit models were robust to random expectations (range:

| Harpy eagle distribution models
All four best-supported harpy eagle models (ΔAIC c = 0.0) had feature classes linear and quadratic and a regularization multiplier TA B L E 2 Evaluation metrics for prey distribution models used as biotic covariates in the harpy eagle distribution models.

Visually, including prey distributions in both the Biotic (B) and
Abiotic+Biotic (

| Geographic overlap and correlation
In geographic space, pair-wise overlaps between the harpy eagle and its food resource distributions were highest with capuchin

| DISCUSS ION
Recent theoretical and empirical work has demonstrated the importance of including resource distributions in macro-scale species distribution models (SDMs, Araújo & Rozenfeld, 2014;Atauchi et al., 2018;Gherghel et al., 2018;Palacio & Girini, 2018). Our results show that incorporating the distribution of the harpy eagle's five main prey species at a continental scale improved its distribution estimates compared to using solely abiotic covariates. This result further counters the Eltonian Noise Hypothesis (Soberón & Nakamura, 2009), the assumption that biotic interactions are unimportant at broad spatial scales (Pearson & Dawson, 2003). Including food resources as individual prey species distribution rasters improved the predictive performance of the Abiotic model, but we acknowledge that this is also related to a higher number of parameters in the prey resource models. Moreover, using solely biotic covariates or combined as species richness still resulted in high-performing models, but the combination of Abiotic+SpeciesRichness (model A + SR) had highest calibration accuracy. Geographic overlap ranged from moderate to high between the harpy eagle and its main prey species, with highest environmental suitability for all species combined ranging across northern South America and Central America.
The spatial pattern of species' distributions is products of physiological constraints such as climate and topography, and interactions with other co-occurring species (MacArthur, 1972 (Kass et al., 2019;Wisz et al., 2013). Our results support this conclusion by improving an abiotic model prediction with the inclusion of food resource distributions. Three-toed sloth was the most important biotic predictor in both the Biotic and Abiotic+Biotic (A + B) models (Table S3), consistent with this species being the principal prey for the harpy eagle across its range (Aguiar- Silva et al., 2014;Miranda, 2015Miranda, , 2018. However, the importance of three-toed Ecuador (Muñiz-López, 2008). However, the harpy eagle is not so specialized on a diet of three-toed sloths as to be absent from areas where sloths are not present. It seems likely that in the southern and eastern parts of the harpy eagle range primates and porcupines are the key prey species, replacing sloths as the primary food source F I G U R E 3 Response curves for covariates in the Abiotic distribution model for the harpy eagle. The response curves show the contribution to model prediction (y-axis) as a function of each continuous habitat covariate (x-axis). Maximum values in each response curve define the highest predicted relative suitability. The response curves reflect the partial dependence on predicted suitability for each covariate and the dependencies produced by interactions between the selected covariate and all other covariates.

F I G U R E 4
Response curves for covariates in the Biotic distribution model for the harpy eagle. The response curves show the contribution to model prediction (y-axis) as a function of each continuous habitat covariate (x-axis). Maximum values in each response curve define the highest predicted relative suitability. The response curves reflect the partial dependence on predicted suitability for each covariate and the dependencies produced by interactions between the selected covariate and all other covariates. (Miranda, 2015). Thus, our models are able to capture the spatial variation in predator-prey distribution across a continental tropical forest system by using a range of key prey genera and not relying solely on a single biotic predictor.
Using response curves to interpret model outputs is a useful though underused aspect of model evaluation in many SDMs (Guevara et al., 2018;Kass et al., 2019). Here, modelled partial responses for the three-toed and two-toed sloth were strongly positive in both the Biotic and Abiotic+Biotic (A + B) models, peaking at 1.0 as expected (Figures 4, 5). Capuchin and howler mon- We recognize the potential limitations in our modelling approach for selecting a varying number of environmental covariates a priori for each harpy eagle SDM even though this is an established and biologically driven method for variable selection (Fourcade et al., 2017).
The inconsistent model structure and number of parameters could lead to overfitting, with a potential solution to generate randomly permuted variables (Niittynen & Luoto, 2018). However, in this case all four harpy eagle models had 10% omission rates at or near to the recommended OR10 value (0.10), demonstrating low overfitting in all models, including the A + B model with the highest number of parameters. Thus, selecting biologically plausible variables based on current knowledge was effective in this case rather than relying on random generation of covariates. In addition, we acknowledge the limitations of using community science data, which often does not sample over the entire extent of a species range (Beck et al., 2014).
This lack of occurrence coverage has implications for producing accurate and realistic models, reflected in regional variation in prey availability. However, as shown here, the point process model framework using penalized logistic regression produces reliable and useful predictions for species range limits in the absence of detailed distributional information for tropical forest species.
We show how incorporating food resource distributions improves model predictive power and circumscribes the spatial complexity in harpy eagle distribution. Adding food resource distributions revealed the crucial role of predator-prey interactions in harpy eagle distribution. Given the wide variation in food type F I G U R E 7 Predicted distribution correlation for the harpy eagle given the distribution of its five main prey species. Values close to −2 suggest absence, −1 to 0 can be interpreted as colonizable areas, 0 to 1 defines areas of highest suitability (prey availability) and values of 2 (dark red patches) show the most unsuitable (low prey availability) areas. Grey borders represent national borders and state boundaries for Argentina, Brazil and Mexico.
taken by the harpy eagle across its range (Aguiar- Silva et al., 2014;Miranda, 2018), maintaining these prey resources should also be a priority in conservation programmes for the harpy eagle. Conserving habitat for the key arboreal mammal prey populations along with one of their main predators as a complete tropical forest system seems a viable approach given the reliance on harpy eagle presence with their main food resource distributions. We encourage practitioners to incorporate known biotic interactions into SDMs, but modellers should recognize that understanding the complex interactions inherent in natural systems is a challenge (Aragón et al., 2018). Whilst we demonstrate that using resource distributions improves model predictions at macro-scales, this needs further testing across multiple taxa and ecosystems to determine whether this finding is consistent elsewhere.

ACK N O WLE D G E M ENTS
We thank all individuals and organizations who contributed occurrence data to the Global Raptor Impact Network (GRIN) informa- reviewers whose comments and suggestions greatly improved the original manuscript.

CO N FLI C T S O F I NTE R E S T
The authors have no conflict of interest to declare.

DATA AVA I L A B I L I T Y S TAT E M E N T
The data and results that support the findings of this study are openly available on the data repository Dryad DOI: 10.5061/ dryad.76hdr7szs. R code to run the analyses is available on Github: https://github.com/lsutt on74/HAEA-Bioti cSDM