Habitat modeling of Irrawaddy dolphins (Orcaella brevirostris) in the Eastern Gulf of Thailand

Abstract Aim The Irrawaddy dolphin (Orcaella brevirostris) is an endangered cetacean found throughout Southeast Asia. The main threat to this species is human encroachment, led by entanglement in fishing gear. Information on this data‐poor species’ ecology and habitat use is needed to effectively inform spatial management. Location We investigated the habitat of a previously unstudied group of Irrawaddy dolphins in the eastern Gulf of Thailand, between the villages of Laem Klat and Khlong Yai, in Trat Province. This location is important as government groups plan to establish a marine protected area. Methods We carried out boat‐based visual line transect surveys with concurrent oceanographic measurements and used hurdle models to evaluate this species’ patterns of habitat use in this area. Results Depth most strongly predicted dolphin presence, while temperature was a strong predictor of group size. The highest probability of dolphin presence occurred at around 10.0 m with an optimal depth range of 7.50 to 13.05 m. The greatest number of dolphins was predicted at 24.93°C with an optimal range between 24.93 and 25.31°C. Dolphins are most likely to occur in two primary locations, one large region in the center of the study area (11o54′18′′N to 11o59′23′′N) and a smaller region in the south (11o47′28′′N to 11o49′59′′N). Protections for this population will likely have the greatest chance of success in these two areas. Main Conclusions The results of this work can inform management strategies within the immediate study area by highlighting areas of high habitat use that should be considered for marine spatial planning measures, such as the creation of marine protected areas. Species distribution models for this species in Thailand can also assist conservation planning in other parts of the species’ range by expanding our understanding of habitat preferences.

However, successful establishment of an effective MPA requires understanding the relationship between the population to be protected and its habitat as well as human uses and impacts (Cañadas et al., 2005, St. Martin & Hall-Arber, 2008. Statistically based habitat models can provide information on preferred habitats to identify areas critical for protection and future research (Bailey & Thompson, 2009).
Little is known about Irrawaddy dolphin habitat preferences.
It is one of only three cetaceans (with the finless porpoise, Neophocaena phocaenoides, and the tucuxi, Sotalia fluviatilis) able to inhabit both marine and freshwater (Smith & Jefferson, 2002).
This suggests that they prefer shallow nearshore areas with high nutrient input and biological productivity, likely supporting prey resources (Dolar et al., 2002;Minton et al., 2013Minton et al., , 2011. However, a more detailed understanding of Irrawaddy dolphin habitat characteristics is needed to establish effective conservation measures. For this purpose, we use a species distribution modeling (SDM) approach in our study of a group of Irrawaddy dolphins offshore of Trat Province, Thailand.
Here we examine habitat preferences of Irrawaddy dolphins in the eastern Gulf of Thailand, a subpopulation which remains unassessed by the IUCN and for which no formal habitat studies have been conducted. Local government groups are planning spatial protections for this species and need habitat and distribution information to make effective management decisions. We conducted standardized line transect surveys, collecting Irrawaddy dolphin occurrence data concurrently with data on physical and biological habitat characteristics. These data were used to develop F I G U R E 1 Irrawaddy dolphins (Orcaella brevirostris) in the Gulf of Thailand a SDM with three goals: 1) to determine the factors influencing suitable habitat in the study area, 2) to predict dolphin distributions for use in development of conservation measures (e.g., MPA development, boating/fishing restrictions, pollution mitigation efforts, reduced dolphin entanglement risk) in the Gulf of Thailand, and 3) to provide a model for predicting Irrawaddy dolphin presence for prioritizing future sampling efforts in less studied coastal subpopulations.

| Field methods
We carried out research primarily along the coast of the eastern Gulf of Thailand (Figure 2a) between the villages of Laem Klat and Khlong Yai, within Trat Province, Thailand. This subpopulation's abundance is estimated at 423 individuals, one of the largest for this species . In two field seasons, We surveyed from a 12-meter fishing boat, small inflatable motor boat, or 20-meter fishing boat (Table 1). We conducted all surveys, except for the April-May 2012 fieldwork, in the dry season during the northeasterly monsoon (see Hines et al., 2015 for details).
The April-May 2012 environmental data fell within the range of values collected in other years, so we included it in the full dataset rather than modeling it separately. Total area surveyed was 552 km 2 in Trat Province, 2,127 km 2 around the islands, and 815 km 2 in Chanthaburi.
Environmental data were collected at time of sighting or at every 1-km 2 grid cell and included sea surface temperature, depth, salinity, turbidity, pH, and chlorophyll a ( Table 1). All of these variables have been shown to inform marine mammal distribution models (Redfern et al., 2006;Torres, Read, & Halpin, 2008). These factors can limit dolphin distribution due to potential physiological constraints (e.g., temperature, salinity, and pH limits), prey availability (e.g., depth, chlorophyll a), and the influence of turbidity on visual capture ability and water quality. We included a binary variable indicating calf presence during a sighting as an independent variable to test whether calf presence influenced dolphin group size or presence.

| Analytical methods
We first measured distances to the coastline and river mouths for each environmental data point using ArcGIS (ESRI, 2014). We carried out all subsequent analyses in R version 2.13.1 (R Development Core Team, 2009Team, -2013. We identified and removed outliers (points more than three standard deviations from the mean, according to the Zvalue test) in each data category (e.g., sightings, depth, and salinity) (Aggarwal, 2013;Hodge & Austin, 2004). We binned turbidity and chlorophyll a into high, medium, and low categories using a Jenks natural breaks classification because variability was high between years, low within years, and non-normally distributed. We used a pair plot to initially explore the data and identify linear relationships. Next, we ran both Moran's I and Mantel tests to determine if sightings were spatially autocorrelated (Dray, Dufour, & Thioulouse, 2016;Paradis et al., 2015). Neither analysis found significant clustering of sightings (p > 0.05). We tested variables for collinearity using variance inflation factors (VIF) with a cutoff value of 3 (Naimi, 2015;Zuur, Ieno, Walker, Saveliev, & Smith, 2009), resulting in removal of the variable "distance to coastline." Frequency plots showed sightings data were highly zero-inflated and overdispersed (Figure 3a; mean = 0.78, variance = 4.91). We chose a hurdle model, which models data in two components. The zero component models data as binary with a binomial distribution (zeros vs. all nonzero counts) and the truncated count component models just nonzero counts using a Poisson, negative binomial, or geometric distribution (Hu, Pavlicova, & Nunes, 2011;Zeileis, Kleiber, & Jackman, 2008;Zuur et al., 2009).
Hurdle models are an appropriate choice for this study given the nature of our data and their past use in modeling the distribution of marine mammals (e.g., Goetz et al., 2012;Gowan & Ortega-Ortiz, 2014;Ver Hoef & Jansen, 2007). The frequency curve for the sightings data (Figure 3a) closely resembles a negative binomial distribution with mean = 1 and dispersion parameter (k) = 0.1 (Zuur et al., 2009). Therefore, we analyzed our count data using a negative binomial distribution. This model family assumes that separate ecological processes influence presence/absence and number of individuals where the species is present (Zuur et al., 2009).
Hurdle models do not handle missing data well and k-fold cross validation failed on a model including all years and all variables. Due to uneven data collection across years and missing values caused by instrument error, data availability was uneven across years (Table 2). Therefore, we separated the data into five smaller datasets (Table 3).
Four datasets included all data points for each year and thus left out variables with missing values. The fifth dataset held only points for which all variables were measured and was smaller than the prior four. Because there were no sightings around the islands or Chanthaburi, we left these data out of the analysis. We explore differences and potential reasons for this lack of sightings in the Discussion.
We fit the data to a suite of hurdle models (Jackman, Tahk, Zeileis, Maimone, & Fearon, 2015), first using all variables within each framework, then dropping terms sequentially and performing model selection tests and evaluations (detailed below) to determine which configuration of terms resulted in the best model. We used Akaike's information criterion (AIC) and k-fold cross validation (with 10 folds) (Alfons, 2012) to choose the best model within each framework (Johnson & Omland, 2004;Kadane & Lazar, 2003;Kohavi, 1995;Redfern et al., 2006;Refaeilzadeh, Tang, & Liu, 2009;Zuur, Ieno, & Smith, 2007). We supplemented these criteria with a likelihood ratio test (Hothorn et al., 2015) to compare each reduced model to the full model within each framework (nested models), with a threshold of p > 0.05 (Johnson & Omland, 2004).
After choosing models from each framework, we evaluated each model component (presence/absence and count) using the area under the curve (AUC) of the receiver operating characteristic (ROC) plot (Franklin, 2009, Sing, Sander, Beerenwinkel, & Lengauer, 2015. We also calculated McFadden's pseudo-R 2 (ρ 2 ), a goodness-of-fit measure, as a method of model evaluation (McFadden, 1978). From these assessments, we chose a final set of models. To choose the final model, we ran all parameters as linear, except for depth and temperature, which were modeled using a quadratic (Figure 3b,c). After choosing the final model, we used the "predict" function in the PSCL package (version 1.4.9; Jackman et al., 2015) to obtain predicted probability, count, and overall fitted values from the model. We used these values to create an interpolated surface using an ordinary krige with a 3 × 3 smoother in ArcGIS (ESRI, 2014). We further determined optimal values of the significant variables identified by the models: We first set all environmental variables in the final model to their average values, then varied only the significant variables, using the "predict" function to make predictions for the relevant model part (zero or count). These optimal values were used to describe the preferred habitat of this species as well as to compare our results with temperature and depth preferences obtained in other parts of its range. To determine the areas most highly used by Irrawaddy dolphins along the Trat Province coast, we classified overall fitted values by Jenks natural breaks into five classes of dolphin occurrence likelihood (high probability of presence and large group size) in ArcGIS (ESRI, 2014).

| RE SULTS
Dolphins were encountered in every part of the study area except around the islands or off the coast of Chanthaburi. The environment was similar in these regions, with the notable exceptions of depth and distance to river mouth ( Trat January small (1-5 individuals), but a few large groups were observed in Laem Klat and Khlong Yai, and the largest were observed in Mai Rut (5-15 and 15-30 individuals, respectively; Figure 4). Average group size was 3.77 individuals (SD = 3.55, range = 1-30).
Our final chosen model was framework 4 with the quadratic depth term, containing data from 2014 only and including salinity, turbidity, calves, a quadratic depth term, and distance to river mouth in the zero component and temperature, turbidity, chlorophyll a, and pH in the count component (Table S3).
Both the first (linear)-and second (nonlinear)-order depth terms were significant predictors for the zero component

| Dolphin-habitat relationships in the Gulf of Thailand
In the eastern Gulf of Thailand, dry-season dolphin presence was most strongly predicted by depth, while temperature strongly predicted group size. This indicates preference for relatively cool waters of intermediate depth. Geographically, this places Irrawaddy dolphins at 1.5 to 7.5 km from shore ( Figure 6). Protected area planning with protective zoning based on these areas would likely have the greatest chance of protecting this population (Batisse, 1982;Day et al., 2012;Hooker, Whitehead, & Gowans, 1999;Hyrenbach et al., 2000;Kelleher, 1999;Lausche, 2011). Buffer zones could be formed surrounding the two areas, between approximate lati-   (Table S4) (Kreb & Budiono, 2005;Kreb & Rahadi, 2004;Minton et al., 2013;Peter et al., 2016). Average depth of dolphin sightings was deeper than our predicted optimal depth (Kreb & Budiono, 2005;Kreb & Rahadi, 2004), suggesting that a different variable, likely salinity (Minton et al., 2013;Peter et al., 2016), is a stronger driver in this habitat than in the Gulf of Thailand (Table S4).

In the outer Sundarbans Delta of Bangladesh and deltas of East
Kalimantan, Indonesia, dolphin observations occurred in narrow depth (average spread 9.1 m) and temperature ranges (3.6°C), but wide turbidity and salinity ranges (Table S4) (Kreb & Budiono, 2005, Smith et al., 2005. Average depths were shallower than identified for the Gulf of Thailand (average 6.23m). Temperatures at which dolphins were found were generally lower than those optimal in the Gulf of Thailand (average 23.7°C). This may be due to habitat availability (e.g., no temperatures below 24.93°C were recorded in any region of our study area) or the interaction of factors that were not discernible to our model.

| Potential reasons for absence in the islands and Chanthaburi
We did not see dolphins in the islands or Chanthaburi, but we can compare environmental measures between these areas and Trat. As Table 4 shows, average values of environmental variables are mostly similar. For temperature (Table 5), chlorophyll a, and salinity, Trat values fall between the values of the other two areas. The Trat study area is, however, characterized by much lower depths (Table 5), turbidity, and distances to river mouth than the other areas (Table 4) and is much less developed than Chanthaburi, with lower levels of fishing activity and industrial development. River mouths were less prevalent in the islands and Chanthaburi, resulting in sampling locations much farther from river mouths in those areas (average: 26.62 km and 10.58 km, respectively) than in Trat (average: 7.58 km) ( Table 6). In Trat, sightings did not occur more than 14.17 kilometers from a river mouth.
The potential importance of proximity to river mouths is supported by the fact that distance to river mouth is considered a reliable indica-    , 2011;Mörzer Bruyns, 1966;Smith & Hobbs, 2002;Smith et al., 2006;Stacey, 1996;Sutaria, 2009). The Trat data may not have been collected far enough from river mouths to be a significant variable in our model. Thus, we may not have recorded a sufficient range of distance to river mouth to properly test the importance of this variable.

ACK N OWLED G M ENTS
We thank the many Thai Department of Marine and Coastal Resources researchers who assisted with data collection. We are grateful for funding from Ocean Park Conservation Foundation, the Indo-Pacific Cetacean Foundation, the National Geographic Society

DATA AVA I L A B I L I T Y S TAT E M E N T
Data underlying this article will be available on Dryad, with https :// doi.org/10.5061/dryad.r2280 gb94.