Development of environmental niche models for use in underwater vehicle navigation

This paper presents a review of Environmental Niche Modelling and a methodology for isolating the environmental niche of an aquatic species, given that prior information is available for characterising the physical tolerances of that species. To test and demonstrate our methodology, the environmental niche of the kelp bass has been isolated within Big Fisherman's Cove, Santa Catalina Island, CA, at specific confidence intervals. The motivation for this examination is to demonstrate the utility of ecological analysis in robotics, specifically the utilisation of physical water properties to provide relative navigation and localisation for an aquatic robot. The environmental niches act as navigational landmarks in the seemingly featureless underwater environment. As water patches tend to stick together, these niches provide persistent landmarks for use in aquatic navigation problems. We provide the background and development of Environmental Niche Models, and present results from field trials for solving the navigation and localisation problem for underwater vehicles. Specifically, we provide results from a technique developed by the authors that utilises physical water parameters, e.g. temperature, salinity and chlorophyll, to localise an underwater vehicle in a given region. The presented method leverages the concept of Environmental Niche Models to provide accurate position estimation that rivals GPS accuracy.


Introduction
Many applications for autonomous underwater vehicles (AUVs) require the vehicle to remain submerged for long periods of time. From environmental monitoring to covert military operations, accurate underwater navigation is critical to the utility of the information gathered and the overall success of the mission. However, accurate underwater navigation remains a substantial challenge. Current navigation methods rely on satellite positioning from GPS to maintain accuracy. However, GPS signals cannot penetrate water, so AUVs can only use GPS when they are at the surface. Consequently, most underwater vehicle navigation utilises some form of dead reckoning, with varying techniques to bound the error growth using external information. Without implementing significant external infrastructure (LBL or USBL), one may limit dead-reckoning error growth by (i) surfacing more frequently for a GPS fix or (ii) integrating more accurate, energy-intensive sensors, such as Doppler velocity logs. Both of these methods have drawbacks. Continually surfacing for a GPS fix takes away from sampling time and requires more energy to be used for communications, or potentially gives away position information in a covert operational scenario. Surfacing also poses a physical threat to the vehicle, as it might accidentally surface in a hazardous location, e.g. a shipping lane; see [1,2]. Using more powerful sensors consumes the finite energy supply of an AUV faster and significantly reduces the deployment duration. To optimise the time spent collecting data, or remaining covert, with these vehicles, it is desirable to find alternative means of reducing position uncertainty while underwater. Some existing approaches require accurate knowledge of the vehicle state (internal navigation), while others depend on underwater communication (external navigation).
Here, we are interested in utilising any existing structure within the aquatic environment to assist in bounding the error growth for dead-reckoning navigation solutions for underwater vehicles.
Although the ocean environment is naturally stochastic and aperiodic, it does exhibit coherent structures that can be exploited.
Such structure has been exploited in other applications, such as facial recognition [3][4][5], city modelling [6], novel view synthesis for three-dimensional (3D) visualisation [7], and robotic localisation tasks [8,9]. The reasons for the limited adoption in the marine environment are three-fold: (i) significant engineering challenges remain and must be overcome for large-scale field deployments; (ii) the structure exhibited by ocean processes is spatiotemporally dynamic; and (iii) we lack a sufficient understanding to know a priori how best to model or represent an ocean feature (what it should look like). This renders the problem of tracking and sampling oceanic processes by AUVs in non-linear, uncertain environments highly challenging. Hence, it makes sense to leverage the inherent structure of these processes, or the structure of the bathymetry that forces these processes to occur in specific regions given certain conditions, to enhance sampling and further our understanding.
In this paper, we propose a method to develop an environmental niche model, which extracts structure from the aquatic environment that can be used to aid navigation for underwater vehicles. We provide a background of environmental niche models, detail our construction of the environmental niche model in the context of robotic sampling, propose a method to integrate this into the navigation framework for underwater vehicles, and present results from field trials demonstrating the effectiveness of this method.

Background
One of the primary goals of ecology is to map species distributions over geographic ranges and be able to use predictive models to infer where various species are likely to be found at any point in time [10]. The two most common methods for predicting spatial variations of species ranges are species distribution modelling and environmental niche modelling [11]. Species distribution models (SDMs) aim to correlate geospatial distributions of individual species to locations, which vary through time [12]. SDMs are constructed based upon direct empirical observation of species distributions, from which correlations may be made with common environmental parameters to classify the physical attributes of a species' preferred habitats [13]. The benefit of SDMs is that the process is descriptive in highlighting evolutionary trends of entire populations of a species over time. This allows observers to see how species-specific population distributions have shifted over time, and based upon these historical shifts, future predictions may be inferred about population migration. However, SDMs require large time frames for observation and many hours of direct human input.
Environmental niche modelling is a predictive approach targeted at classifying geographic locales as either habitable or uninhabitable for a certain species. By monitoring specific physical parameters of an environment and understanding the physical tolerances of a certain species, it is possible to infer where that species will most likely be present [13,14]. Environmental niche modelling relates the possible presence of species to the physical environmental parameters of that location. Environmental niche models may utilise a wide range of data to generate a map of a locale showing only chemical and physical parameters that have either been measured or interpolated from direct measurements [15]. This type of model is devoid of the representation of actual species distributions; rather, it predicts where a species may reside within a specific environment. Prior to the construction of an environmental niche model, it is imperative to know the extents of physical parameters that the species can tolerate, e.g. the temperature range within which a species can physically survive. Second, a record of the physical parameters of the environment in which the species is actually observed helps to weight parameters as more or less important in modelling the actual location of the species [16]. While environmental niche modelling does have the ability to probabilistically determine if a species could be present in a specific location, it also tends to over-predict the areas that the species may reside within compared to the area the species is actually occupying. Environmental niche models and SDMs are slightly different approaches to the central goal of accurately classifying the distributions of species on a large scale.

Motivation
The investigation into environmental niche models and SDMs is motivated by the fact that water patches tend to stick together; water masses with similar properties generally propagate through a water body without much dissipation. This is seen at the large scale via the ocean conveyor belt [17,18], and at smaller scales with ocean fronts, Lagrangian coherent structures and algal blooms. The motivation and contribution of this paper is to propose a method to generate these models and examine their utility for applications in aquatic robot navigation. Although the niches may move in space and time, there appears to be relative navigation information, in the form of landmarks or hotspots, which can be exploited for prescribed regions of interest. Specifically, we are interested in utilising this concept for navigation in regions where repeated sampling or revisits occur; this enables the model to evolve over time and periodic variability to be extracted to provide a dynamic model of the environment through the application of deep learning and neural networks.
Repeated physical sampling of an environment is precisely what marine robots, like AUVs, were designed to do effectively and efficiently. Such sampling also aids in predicting the likelihood of a location being a suitable habitat for a specific life form [19,20]. For an environmental niche model to be accurate, it is important that the parameters specific to the species in question have been well recorded and understood in relation to the species [21]. These individual parameters may be combined to form an environmental niche model, providing the overall probability that a species is present or not. Oftentimes many parameters are used to generate a holistic niche model; however, the quality of the data and the understanding of its relation to the species are likely more important than the quantity of parameters utilised [22]. For example, it was concluded that for anemone fishes and their sea anemone hosts, the number of parameters was not as important as the quality and proper weighting of the parameters. It was observed through modelling that the predicted locations of the anemone fishes and their sea anemone hosts did not improve greatly after adding minimum temperature and depth of water to the estimation. This highlights the fact that the quality of the data and its relevance to the species location is more important than the quantity of data used [23,24]. Hence, the targeting of a specific niche for a specific species is still a difficult task, but characterising existing structure based on collected data that presents itself as a niche or niches may have utility in aiding navigation for underwater vehicles.
We are further motivated to investigate such an approach based on research presented on tracking fish over long periods of time in Big Fisherman's Cove on Santa Catalina Island, CA, USA [25]. The authors in [25] characterised historical tagged fish movement patterns over multiple months, and developed methods for stochastic, model-based control tracking and feedback-control tracking. That work builds on research showing that many marine animals exhibit a wide range of movement patterns that change with environmental conditions, species interactions and social context. Specifically, water temperature affects the location of species aggregation, and barometric pressure reductions due to oncoming storms can lead to mass migration of animals from shallow water to deeper habitats [26,27]. Among the studies of marine movement patterns, the authors of [25] investigate the periodic migratory paths that are particularly important in understanding both the environment and periodic behaviours of marine animals. It is hypothesised that the periodic behaviours of marine animals are linked to the periodic variability of the marine environment; hence the fish are following favourable niches. Here, we aim to investigate and exploit this perceived structure for use in underwater vehicle navigation.

Construction of an environmental niche model
The proposed method of constructing an environmental niche is tailored towards aquatic environments. To begin, a survey of the region must be completed to provide data for constructing the environmental niche model. The survey should produce spatially correlated data on the water parameters relevant to the species of interest. The data need to be of sufficient density to support sound statistical analysis. Upon completion of the initial survey, the data are filtered for outliers. This filtering is done with a step filter that compares each individual data point to the average value and standard deviation of the five data points on either side of it. If the value of the data point being examined is more than three standard deviations away from the average value, then it is replaced by the average value. Pseudo code for the step filter algorithm is presented in Algorithm 1 (see Fig. 1).
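The step filter described above can be sketched in a few lines of Python. This is a minimal illustration and not a transcription of Algorithm 1: the function names, the truncation of the window at the ends of the record, the use of the population standard deviation, and the choice to compute neighbour statistics from the unfiltered record are all assumptions made here for the sake of a runnable example.

```python
from statistics import mean, pstdev


def step_filter(values, half_window=5, n_sigma=3.0):
    """One pass of a neighbour-based spike filter over a 1D record."""
    values = list(values)
    filtered = list(values)
    for i, x in enumerate(values):
        # Up to `half_window` points on either side, excluding the point
        # itself; the window is truncated at the ends of the record
        # (an assumption, the source does not state the edge handling).
        left = values[max(0, i - half_window):i]
        right = values[i + 1:i + 1 + half_window]
        neighbours = left + right
        if len(neighbours) < 2:
            continue  # not enough context to judge this point
        mu = mean(neighbours)
        sigma = pstdev(neighbours)
        if abs(x - mu) > n_sigma * sigma:
            filtered[i] = mu  # replace the spike with the local average
    return filtered


def repeated_step_filter(values, passes=1, **kwargs):
    """Apply the step filter several times in succession."""
    for _ in range(passes):
        values = step_filter(values, **kwargs)
    return values


# Hypothetical record: a flat signal with a single erroneous spike.
record = [10.0] * 5 + [50.0] + [10.0] * 5
cleaned = step_filter(record)
```

In this hypothetical record only the spike is replaced; its neighbours remain within three standard deviations of their own local averages and are left untouched.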
In addition to data collection, a geographically referenced satellite image, generally in the form of a geotiff (.tiff extension), depicting the region of interest may be utilised to provide ground truth for localisation of the data points (Fig. 2). Although the creation of a geotiff is not required, it is very helpful for visual inspection of the data and can be used to create a base map for the Niche Model. The geotiff's raster image may be obtained from publicly available resources such as Bing or Google Earth. The raster image can be geographically referenced with a geographic information system tool such as QGIS, which outputs a geotiff.
The latitude and longitude scales of this rectangular region are then converted into an array with scales in meters utilising the WGS84 earth ellipsoid, with the lower left corner denoted as the origin. The filtered data are individually passed into a 1D Kalman filter. The Kalman filter is utilised to interpolate between surveyed points to create a regular grid spacing of interpolated measurements representing the distribution of each physical parameter. The output from the Kalman filter is a heat map of interpolated data of each individual parameter referenced to the origin of the rectangular array with distance relations measured in meters, see e.g. Fig. 3. The array of data underlying the heat map is converted back into geodetic coordinates so that the filtered, interpolated data are georeferenced to a physical location on the WGS84 earth ellipsoid.
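As an illustration of the conversion from geodetic coordinates to a metric grid, the following Python sketch maps a latitude and longitude to east and north offsets in meters from a chosen origin. It is a sketch under stated assumptions rather than the conversion used in this work: the function name is hypothetical, and the offsets are computed with a local small-area approximation based on the WGS84 radii of curvature at the origin, which is reasonable over an area the size of a cove but not over long distances.

```python
import math

WGS84_A = 6378137.0                    # semi-major axis (m)
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def geodetic_to_local_m(lat_deg, lon_deg, lat0_deg, lon0_deg):
    """Approximate (east, north) offsets in meters from an origin.

    Assumes the point is close to the origin, so the ellipsoid can be
    treated as locally flat (hypothetical helper, small-area use only).
    """
    lat0 = math.radians(lat0_deg)
    s2 = math.sin(lat0) ** 2
    # Radii of curvature of the ellipsoid at the origin latitude.
    meridional = WGS84_A * (1.0 - WGS84_E2) / (1.0 - WGS84_E2 * s2) ** 1.5
    prime_vertical = WGS84_A / math.sqrt(1.0 - WGS84_E2 * s2)
    north = math.radians(lat_deg - lat0_deg) * meridional
    east = math.radians(lon_deg - lon0_deg) * prime_vertical * math.cos(lat0)
    return east, north
```

With the lower left corner of the survey rectangle passed as the origin, every surveyed point receives non-negative east and north offsets, which can then be binned onto a regular grid.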
Each interpolated measurement is cross-referenced in turn against the individual arrays constructed to associate a physical parameter with a probability of finding the target species at that location. Each individual positional probability (η_i) is combined to produce a total probability (η_T) of finding a species in a specific location according to equation (1). Within the range of physical limits defined for a particular species, a Gaussian distribution is assumed for each limiting parameter, implying that the largest density of the target species will be found at the mean value of these limits. The extents of these physical limits are assumed to be the 3-sigma limits, meaning that 99.7% of the species' population resides within these limits. A 2 × X array is created with the first column representing one of the parameter values anticipated in the survey region and the second column representing the associated probability that the species of interest is found at that specific parameter value. As many of these arrays may be created as there are limiting parameters that define the environment of the target species. Generally, as the number of physical limiting parameters increases, the geographic area that defines the environmental niche of the species becomes more specific. The accuracy of this model is intrinsically limited by the extent and precision with which these tolerances have been studied and classified.
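The construction of one such tolerance array can be sketched as follows. The sketch takes the two limits of a parameter as the mean ± 3σ of a Gaussian, as stated above, and tabulates the Gaussian density at evenly spaced parameter values. The function name, the number of rows and the example limits are hypothetical, and the second column holds a probability density (a relative likelihood) rather than a normalised probability. The combination of the individual values into η_T is deliberately not shown.

```python
import math


def tolerance_table(lower, upper, n_points=7):
    """Rows of (parameter value, Gaussian density) spanning a tolerance range.

    The limits are interpreted as mean - 3 sigma and mean + 3 sigma.
    """
    mean = 0.5 * (lower + upper)
    sigma = (upper - lower) / 6.0
    step = (upper - lower) / (n_points - 1)
    rows = []
    for k in range(n_points):
        value = lower + k * step
        z = (value - mean) / sigma
        density = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
        rows.append((value, density))
    return rows


# Hypothetical tolerance limits of 10 to 16 units for some parameter.
table = tolerance_table(10.0, 16.0)
```

The density peaks at the midpoint of the limits and falls off symmetrically towards them, which is the behaviour the Gaussian assumption is meant to capture.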
As an input to constructing an environmental niche model, specific biologic tolerances/preferences of the species of interest must be known a priori. The inputs serve as limits within which the target species is expected to reside. Here, we choose a proxy fish species to demonstrate the niche determination methodology. In our previous work on augmented terrain-based navigation (ATBN), we relaxed this assumption and utilised all data parameters collected [28,29]. Although we have found utility with the developed ATBN approach, the environmental niche model presented here has an application in bounding a survey region, providing a theoretical foundation for our ATBN approach, and providing landmarks for use in relative localisation.
The final outcome prescribes a likelihood of finding a species (in terms of a z-score) at a specific geographic location in an area of interest according to a priori biologic tolerances/preferences of that species. The utility of creating an environmental niche model utilising the concept of a z-score is that a user-defined input may now be added, such that the user specifies an acceptable likelihood of finding (or not finding) a certain species within an area of interest. The acceptable likelihood is given in terms of a confidence interval as determined from cross-referencing the user input z-score with readily available statistical charts. Defining a confidence interval within which the species may or may not be found reduces the common over-prediction of species' ranges typically associated with an environmental niche model. It should be noted that the biologic tolerances/preferences of a species were assumed to be normally distributed; if this is not the case, other probability distributional forms may be utilised (e.g. log-normal distributions, F-distributions, etc.), but the process of constructing the model would remain the same.

Case study: isolating the environmental niche of the kelp bass
Santa Catalina Island, CA (33°26′40.4′′N 118°29′6.5′′W) is home to a species of marine fish known as the kelp bass. Kelp bass (Paralabrax clathratus) are an important part of California's recreational fishery and populations of kelp bass have been cited as increasing since the late 1970s [30]. Despite this abundance, however, little has been classified about the kelp bass in terms of physical or biologic tolerances. In lieu of concrete data concerning the physical limits of the kelp bass, the biologic tolerances of a species within the same taxonomic family, Serranidae, which have been extensively studied, are used here as a surrogate. We remark that the biologic tolerances are within normal ranges found in Big Fisherman's Cove, and provide the species data necessary to validate our Niche Model development methodology.
The black sea bass (Centropristis striata) is commercially and recreationally fished on the East Coast of the US, but due to recent over-fishing in some areas, populations have suffered [31]. The decline of the black sea bass has led to numerous studies seeking to characterise the physical limits of the species as well as its preferred habitat. By using these data as a surrogate, we additionally examine the potential for the black sea bass to survive along the Pacific coast of the US.
The environmental niche of the black sea bass can most succinctly be characterised by salinity, temperature and organic dissolved oxygen limits. Black sea bass are most likely to reside in water bodies with salinity ranges from 20 to 35 ppt, temperature ranges from 17 to 25 °C and dissolved oxygen ranges from 5 to 9 mg/L [31,32]. These parameters are used as proxy data to characterise the environmental niche for this case study. Broad spectrum physical water data had already been collected from Big Fisherman's Cove, a coastal ocean bay on the northwest side of Santa Catalina Island. The availability of this type of data makes it desirable in attempting to generate an environmental niche model. The methodology presented here for isolating the environmental niche of the kelp bass would be the same regardless of whether the physical parameters characterising the species' niche vary.

Data collection
Santa Catalina Island is located 22 miles off the coast of Los Angeles, California. Big Fisherman's Cove is positioned on the northwest coast of the island and is primarily a shallow, protected, saline ocean bay. Data collection in Big Fisherman's Cove, Santa Catalina Island occurred between 13 July and 14 July 2016. All vehicle deployments were near shore over both sandy and coral substrate that was <40 m deep. The data used in developing this methodology were collected by a YSI EcoMapper at a sampling rate of 2 Hz [33]. Fig. 4 shows the auto-generated boundaries of the surveyed region as well as the actual path of the AUV. All of these data were collected while the EcoMapper was running on the surface to provide accurate position estimation via GPS.

Data analysis and niche model development
The detailed methodology utilised to identify the niche of the kelp bass was outlined in the previous section, and is explicated here for clarity. The raw data were filtered with a step filter ten times (as presented in Algorithm 1 (Fig. 1)) to remove erroneous data spikes. The data from all of the individual surveys were concatenated into column vectors to be input into a linear (1D) Kalman filter utilised for interpolation. The heat maps showing the distributions of salinity, temperature and organic dissolved oxygen concentrations are shown in Figs. 5a-c, respectively.
The spatially referenced, interpolated data were cross-referenced against normal probability distributions of salinity, temperature and organic dissolved oxygen within the cove. The probability density functions (PDFs) were created such that the mean of each PDF was centred at the median value of the respective parameter range characterising the environmental niche of the kelp bass. The mean values for the salinity, temperature and dissolved oxygen ranges are 27.5 ppt, 21 °C and 7 mg/L, respectively. The standard deviations of the parameters were chosen to be 2.5 ppt, 2 °C and 1 mg/L, respectively, and were assigned to demark the 3σ limit of the parameter ranges. Individual data points within the spatially referenced heat maps shown in Figs. 5a-c are successively cross-referenced against the PDFs to associate an individual probability of finding the kelp bass at that specific location according to that individual parameter. The composition of these individual parameter probabilities according to (1) produces a single scalar field of the combined likelihood of finding the species within the cove. Again, the magnitude of these individual probabilities is meaningless; the z-score is applied to compare each individual probability relative to the average of those probabilities. Integrating the probability density function f(z) from the negative to the positive value of the associated z-score (which selects the probability of the species not occurring), and then subtracting the result from 100%, yields the probability of the individual event:

$$P(\text{event}) = 100\% - \int_{-z}^{z} f(z')\,\mathrm{d}z'$$

From the scalar field comprising the combined z-score probabilities of locating the prescribed species within Big Fisherman's Cove, the environmental niche was isolated at confidence intervals of 80, 50 and 20% as presented in Figs. 6a-c, respectively. The z-scores associated with these confidence intervals were 0.2536, 0.6752 and 1.2825, respectively. Figs. 6a-c highlight the fact that the higher the desired certainty of finding the species, the smaller the area becomes for finding that species. Conversely, as the confidence interval decreases for finding the species in a given location, the area grows. This is in accordance with the natural expectation that there is no area in which one can be 100% sure of finding the species and that, if one has no idea (0% confidence) where the species is, it could essentially be anywhere.
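The relation between a z-score and its confidence level can be checked numerically instead of with statistical charts. The Python sketch below assumes a standard normal distribution; the function names are hypothetical. The confidence is computed as one minus the probability mass between −z and +z, and the inverse mapping uses the standard library's normal quantile function.

```python
import math
from statistics import NormalDist


def confidence_from_z(z):
    """Confidence level (0 to 1) associated with a z-score.

    The mass of the standard normal between -z and +z is erf(z / sqrt(2));
    the confidence is what remains after subtracting it from 1.
    """
    central_mass = math.erf(abs(z) / math.sqrt(2.0))
    return 1.0 - central_mass


def z_from_confidence(confidence):
    """z-score whose two-sided tail mass equals the requested confidence."""
    return NormalDist().inv_cdf(1.0 - confidence / 2.0)


# z-scores reported in the text for the 80, 50 and 20% niches.
for z in (0.2536, 0.6752, 1.2825):
    print(f"z = {z:.4f} -> confidence = {100.0 * confidence_from_z(z):.1f}%")
```

The values returned by `z_from_confidence` differ from the tabulated z-scores in the third or fourth decimal place, which is consistent with values read from a chart.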

Utilising the environmental niche of a species to navigate and localise an aquatic robot
The end result of our environmental niche model development is a geo-referenced, binary distribution. Fig. 6 displays red regions where the species is predicted to be, and blue regions where the species is unlikely to reside. Coincidentally, the niches with higher certainty for species residence seem to outline the shallow sandy areas within the bay; areas where operating an underwater vehicle may be problematic. If we consider the niche developed for a 70% confidence (see Fig. 7), it essentially provides a boundary to the edge of the cove and shallow (potentially uninteresting or hazardous) areas. We also have the added benefit of knowing where certain animals may reside, so as to avoid disturbing them with the underwater vehicle. Given this environmental niche, we can supply the vehicle with a simple control law to perform relative navigation and to provide an additional level of safety for operation. The control law is a simple gradient analysis of a combination of physical water parameters that are being gathered by the underwater vehicle as part of its mission anyway. For example, when the combination of salinity, temperature and dissolved oxygen as defined in the environmental niche model development (Section 4) reaches a defined scalar value, the vehicle executes a pre-programmed avoidance procedure to keep it within the deepwater region and away from shore or the area of high probability of marine animals. This pre-programmed avoidance procedure is as simple as a fixed yaw rotation in a set direction. Some a priori knowledge of the survey area and survey details assists in ensuring that this implementation of gradient ascent reacts well to local maxima, and that an intelligent avoidance manoeuvre is implemented.
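The threshold-triggered avoidance behaviour can be written as a very small decision rule. The sketch below is illustrative only: the function name, the 90° yaw step and the example threshold are hypothetical, and the combined niche score is assumed to have been computed elsewhere from the sensed salinity, temperature and dissolved oxygen.

```python
def avoidance_command(niche_score, heading_deg, threshold, yaw_step_deg=90.0):
    """Return the commanded heading given the latest combined niche score.

    If the score reaches the threshold, the vehicle is assumed to be at
    the niche boundary and a fixed yaw rotation in a set direction is
    commanded; otherwise the current heading is held.
    """
    if niche_score >= threshold:
        return (heading_deg + yaw_step_deg) % 360.0
    return heading_deg % 360.0


# Hypothetical values: the score exceeds the threshold, so the vehicle turns.
new_heading = avoidance_command(niche_score=0.9, heading_deg=350.0, threshold=0.7)
```

A fielded implementation would also need hysteresis or a minimum turn duration so that the vehicle does not chatter at the boundary; that logic is omitted here.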

Extending the application of environmental niche modelling to navigation and localisation
The aforementioned compromise between limiting time spent underwater in favour of positional accuracy, or the converse, illustrates the need for a novel methodology of localisation underwater. Previously, work by Reis and colleagues proposed augmenting traditional terrain-based navigation (TBN) in an underwater environment to improve localisation techniques devoid of a GPS signal [29,34,35]. TBN was initially developed for the localisation of long-range missiles and involves taking repeated altimeter measurements and comparing those to altimeter data in a stored array. Altimeter measurements are stored in an increasing vector, compared against a prior map, and after a sufficient amount of time this altitude vector will become unique within the array and the position of the missile becomes known [36].
A detailed survey of research and current challenges in underwater navigation, summarising existing work on TBN for underwater vehicles, is provided in [37]. One clearly identified shortcoming of TBN in the aquatic environment is the lack of accurate, high-resolution maps of the sea floor in many regions. Additionally, sensor limitations, especially the limitations of optical range sensors, substantially restrict TBN underwater. In [37], it is concluded that improved navigation will enable new missions that would previously have been considered infeasible or impractical. Recent work by Lagadec on TBN under ice [38] has demonstrated the feasibility of using a particle filter for long-term glider navigation. Lower relief maps of regions above the Arctic Circle with a resolution of 2 km were sufficient to navigate with reasonable accuracy (∼1 km accuracy, with a mean accuracy of ∼8 km in one simulation). The study suggests that for real deployments, technological advances would be necessary to achieve the required navigation performance. However, higher relief bathymetric maps could facilitate the implementation of a TBN that operates online, in real time. The primary limitation of the technique presented in [38] was the lack of an accurate terrain map, which does not invalidate the methodology used. A number of other studies have utilised particle filters as part of a TBN framework for underwater vehicles [38][39][40][41]. The particle filter is suitable as a solution to the TBN problem because it is probabilistic (and therefore captures environmental uncertainty), and because it naturally incorporates the property that the longer a path is traversed, the more likely a single solution will emerge.
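To make the role of the particle filter concrete, the following Python sketch localises a vehicle along a one-dimensional transect of a stored depth map. It is a toy example under stated assumptions, not the implementation of any of the cited studies: the function name, the synthetic depth profile, the noise levels and the fixed step between measurements are all hypothetical. Particles start spread over the whole map, are moved by the dead-reckoned step with added noise, are weighted by how well the stored depth at their position matches each measurement, and are then resampled.

```python
import math
import random


def tbn_particle_filter(depth_map, measurements, step, n_particles=1000,
                        motion_sigma=1.0, sensor_sigma=0.5, seed=0):
    """Estimate the final cell index on a 1D transect of a stored depth map."""
    rng = random.Random(seed)
    last = len(depth_map) - 1
    # Global localisation: no prior, so particles start spread over the map.
    particles = [rng.uniform(0.0, last) for _ in range(n_particles)]
    for k, z in enumerate(measurements):
        if k > 0:
            # Dead-reckoned motion plus noise, clamped to the map extent.
            moved = (p + step + rng.gauss(0.0, motion_sigma) for p in particles)
            particles = [min(max(p, 0.0), last) for p in moved]
        # Weight each particle by how well the map explains the measurement.
        weights = []
        for p in particles:
            err = (z - depth_map[int(round(p))]) / sensor_sigma
            weights.append(math.exp(-0.5 * err * err) + 1e-300)
        # Resample in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return sum(particles) / n_particles


# Synthetic depth profile (hypothetical): a slope with superimposed ripples,
# so a single depth reading is ambiguous but a sequence of readings is not.
depth_map = [0.5 * i + 3.0 * math.sin(0.7 * i) for i in range(100)]
true_cells = [20 + 2 * k for k in range(15)]
measurements = [depth_map[c] for c in true_cells]
estimate = tbn_particle_filter(depth_map, measurements, step=2)
```

After the first measurement several separated cells remain plausible; it is the accumulation of measurements along the path that collapses the particle cloud onto the true track.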
To further improve TBN, a method of creating an augmented terrain map that combines both bathymetry information and the physical water data collected was proposed and tested, with results presented in [28]. The assumption that including physical water data in the terrain map provides a reliable model comes from the concept of environmental or ecological niche models. Ecological niche modelling is derived from one of the primary goals of ecology, which is to map species distribution over geographic ranges and be able to use predictive models to infer where various species are likely to be found [10][11][12][13]. Environmental niche modelling uses a wide range of data to generate a map of a locale showing only chemical and physical parameters that have either been measured or interpolated from direct measurements [15]. Specifically, niche modelling is a method to classify geographic locales as either habitable or uninhabitable by certain species. By monitoring specific physical parameters of an environment and understanding the tolerances of a certain species, it is possible to model where that species will most likely be present [13][14][15][16][17][18][19][20]. Here, we hypothesise that these niches may also be utilised for underwater vehicle navigation. At this stage, we will assume that the environment is static in both space and time; however, the spatiotemporal dynamics of observed ecological niches suggest that they exhibit periodicity or a predictable stochastic behaviour, see e.g. [25].
Reis et al. [29,35] incorporated an array of sensors to augment altimeter data in traditional TBN to shorten the convergence time for localisation and enable navigation over regions with low bathymetric relief. The principle of augmenting TBN with in situ measurements of physical water data is the novel contribution of this work, and it is where the concept of an environmental niche model enters the application of navigation for underwater robots. This combination allows more measured values to be compared against a priori knowledge of an environment. Measuring parameters beyond altitude alone allows a unique position within an environment to be determined more quickly and more accurately. Reis and colleagues tested this hypothesis in both fresh and saline aquatic environments, and in each case the global correlation of variables decreased when physical data were incorporated into TBN techniques. A decrease in the global correlation of variables in an environment results in an increase in the uniqueness of individual positions in that environment. This increase in uniqueness presents an opportunity for navigation and localisation with reference to unique features within a waterbody. The proposed approach for navigation and localisation of an aquatic robot was to collect data from an area, construct an environmental niche model representation of the region, and then use the unique features of that model as fiducial markers or landmarks for relative navigation. This approach enables an AUV to spend a larger amount of time dedicated to data collection underwater with increased navigational accuracy at no extra energy expenditure.

Some results from the augmented terrain-based navigation approach to localisation
In this section, we present a summary of a few results we have obtained by applying the proposed theory of environmental niche models to the development of ATBN and implementing this onto underwater vehicles during field trials. Complete details of the results presented here can be found in [42].

Data acquisition
The proposed methodology was tested with data from multiple deployments at Big Fisherman's Cove on Santa Catalina Island, CA, USA. Further results from data gathered in Lake Nighthorse, CO, USA can be found in [42]. The missions were executed using a YSI EcoMapper AUV [43]. For these missions, the AUV was operated on the surface to provide ground truth via GPS. The science data collected by the vehicle include water column depth (m), salinity (ppt), temperature (°C), pH, turbidity (NTU), dissolved oxygen (mg/L), chlorophyll concentration (μg/L) and blue-green algae concentration (PC cells/mL).

Data preprocessing
From the original eight sensed quantities in each y, we selected water column depth (m), turbidity (NTU) and dissolved oxygen (mg/L) using the three largest eigenvalues of the sample covariance matrix of our data set (principal component analysis). The data were processed to remove outliers, and each observation y is a triple, that is m = 3, in which each sensed quantity is one feature. The data were also centred such that the mean of each feature is zero.
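The preprocessing step can be illustrated with a minimal sketch. It centres the data, computes the eigen-decomposition of the sample covariance matrix and keeps the three leading components. The rule used to map components back to individual sensed quantities (eigenvalue-weighted squared loadings) is an assumption made for illustration, since the text does not state how the three quantities were chosen from the eigenvalues; the function names and the synthetic data are likewise hypothetical.

```python
import numpy as np

def centre(data):
    """Subtract the per-feature mean so that every column has zero mean."""
    return data - data.mean(axis=0)

def principal_components(data, k=3):
    """Return the k largest eigenvalues of the sample covariance matrix
    and the matching eigenvectors (one per column)."""
    cov = np.cov(centre(data), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:k]       # indices of the k largest
    return eigvals[order], eigvecs[:, order]

def rank_features(data, k=3):
    """ASSUMPTION: score each feature by its eigenvalue-weighted squared
    loadings on the k leading components and keep the k best features."""
    eigvals, eigvecs = principal_components(data, k)
    scores = (eigvecs ** 2) @ eigvals
    return np.argsort(scores)[::-1][:k]

# Synthetic stand-in for eight sensed quantities (not field data):
# columns 0, 4 and 5 are given a much larger variance than the others.
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 8)) * 0.1
data[:, 0] *= 50
data[:, 4] *= 30
data[:, 5] *= 20

selected = np.sort(rank_features(data))   # indices of the retained features
reduced = centre(data)[:, selected]       # centred observations with m = 3
```

In practice the sensed quantities have different units, so one might standardise each column before the decomposition; whether that was done in the field trials is not stated here.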

Global localisation
Localisation is known as the central problem in mobile robotics and can be understood as the task of systematically eliminating uncertainty in the pose of a robot [44]. Given different ATBN maps using the bathymetric parameter and a combination of the aforementioned water parameters, we tested the global localisation accuracy. The global localisation problem asks: given a sequence of measurements, how well can a robot estimate its pose within an environment? Our augmented terrain map is similar to a scalar field of depth (bathymetry map). However, our ATBN map is optimised to have the lowest global autocorrelation, so that a sequence of values has a high probability of being unique. For details on how this is accomplished, see [29,35]. Specifically, suppose an underwater robot is given a map of the ocean environment. The goal of the AUV is to move within that environment, collect water parameter data, e.g. temperature, salinity, dissolved oxygen, etc., combine these parameters using our weighting scheme, see [29,35], and eliminate uncertainty about where it is located in the map.
We developed Algorithm 2 (see Fig. 8) to search for an estimated vehicle path given the following inputs: the terrain map, a list of observations from the path to be localised, an epsilon defining a margin of tolerance within which no penalty is given to errors in localisation, and a range representing how far from the previous observation the next one can be localised. The output is the coordinates in the terrain map where the observations were localised for a certain path. The algorithm starts with an empty set of trees (line 1). For all coordinates (x, y) in the map, possible candidates are found for the first observation Y_0, thereby initialising the root of the trees (lines 2-5). Each tree represents one possible location for the path. For each remaining observation, we iterate over every leaf of each tree, trying to find a neighbour to be added to the tree. A neighbour is within range r and with a margin of tolerance ϵ (lines 6-12). The deepest tree contains the coordinates of the localised path in that map. The execution time increases with the number of observations to be localised and decreases when the low global correlation hypothesis holds. Algorithm 2 (Fig. 8) returns the final tree created with the localised path in the terrain map. We remark here that we neither seed the algorithm with an initial location, nor do we provide a motion model, or even compass heading. The first measurement taken in by Algorithm 2 (Fig. 8) could be located anywhere on the map; the tree is then pruned by iteratively adding successive measurements, which reduces the uncertainty of location within the map. In our trials, we have consistently observed convergence to the correct location with <20 measurements, i.e. <10 s of operation, as data are streaming at 2 Hz.
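The search described above can be sketched as follows. This is a simplified illustration under stated assumptions, not a transcription of Algorithm 2 in Fig. 8: each tree is stored as a single path (one list of cells), the range r is measured in grid cells over a square window, and each leaf is extended with its best-matching neighbour only. The function name `localise_path` and the toy map are hypothetical.

```python
import numpy as np

def localise_path(terrain, observations, eps, r):
    """Return the deepest chain of map cells consistent with the observations.

    terrain      : 2D array of scalar map values (e.g. an ATBN map)
    observations : sequence of scalar measurements along the unknown path
    eps          : tolerance between an observation and a map value
    r            : ASSUMPTION: maximum step between successive cells,
                   in grid cells, measured over a square window
    """
    rows, cols = terrain.shape
    # Roots: every cell whose value matches the first observation within eps.
    paths = [[(i, j)] for i in range(rows) for j in range(cols)
             if abs(terrain[i, j] - observations[0]) <= eps]
    finished = []                      # paths that could not be extended
    for obs in observations[1:]:
        extended = []
        for path in paths:
            ci, cj = path[-1]          # current leaf of this path
            best, best_err = None, None
            for i in range(max(0, ci - r), min(rows, ci + r + 1)):
                for j in range(max(0, cj - r), min(cols, cj + r + 1)):
                    err = abs(terrain[i, j] - obs)
                    if err <= eps and (best is None or err < best_err):
                        best, best_err = (i, j), err
            if best is None:
                finished.append(path)  # pruned: no neighbour fits
            else:
                extended.append(path + [best])
        paths = extended
        if not paths:
            break
    candidates = paths + finished
    return max(candidates, key=len) if candidates else []

# Toy map with unique values; the path is recovered without an initial seed.
terrain = np.arange(25, dtype=float).reshape(5, 5)
true_path = [(0, 0), (0, 1), (1, 1), (1, 2)]
observations = [terrain[cell] for cell in true_path]
estimate = localise_path(terrain, observations, eps=0.25, r=1)
```

The fewer map cells that share similar values, the fewer roots and leaves survive each step, which is why a map with low global autocorrelation shortens the search.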

Global localisation results
The ATBN maps for two days were created using the methodology proposed in [29]. In Fig. 9, the path in yellow represents the ground truth path to localise, and the red path represents the localisation result. The blue path is the path executed by the AUV to collect the data used to build the ATBN map. The average error (RMS) in localisation between the red and yellow paths is 4.0064 m. The path length is ∼100 m.
Next, we took a trajectory from the vehicle that was collected the following day, and tried to localise it within the ATBN map created from the previous day. In Fig. 10, the path in red represents the ground truth path to localise, and the black path represents the localisation result. Here, the average error (RMS) in localisation between the black and red paths is 29.86 m. The path length is ∼100 m. In this instance, we remark that the primary difference is that the water level had changed due to the tidal flux. Since water depth is a parameter in the map, everything is shifted about 30 m horizontally; this is on the order of the tidal fluctuation for this area. Correcting for this error, we found that the average error (RMS) in localisation between the black and red paths was 6.058 m, which is of the same order of magnitude as the previously mentioned result.
Another localisation test was done across multiple days, where a segment of a trajectory performed by the AUV from a later day was extracted to be localised on the map created multiple days prior to that, with correction for the tides included. It is important to highlight that the data collected by the AUV in this localisation problem was not used to create the underlying ATBN maps. The black segment in Fig. 11 represents the original trajectory to be localised and the red segment represents the approximate localised trajectory with an average root mean square error (RMSE) of 0.5445 m.
By using the traditional TBN approach, which only utilises bathymetric information, a 9.5031 m RMSE in localisation was observed. An example of this can be seen in Fig. 12, where the orange segment represents the original trajectory, the green segment represents the approximate trajectory localised by traditional TBN systems, and the red trajectory represents the approximate trajectory localised by our ATBN framework. In this example, the RMSE was 21.3511 m for traditional TBN (depth only) and 8.6631 m for our ATBN approach.
Localisation using our proposed ATBN maps has produced promising results that reduce navigation uncertainty compared with dead reckoning and traditional TBN, and that rival the accuracy of GPS.

Navigation, path tracking and estimation
Localisation and tracking of AUVs can become compromised by the spatiotemporal dynamics of the ocean environment and the limited communication capabilities. Our work focuses on the coastal regions of the ocean due to the higher frequency of occurrence of interesting phenomena, e.g. cyanobacterial blooms. Furthermore, the interesting features are themselves spatiotemporally dynamic, and effective sampling requires a good understanding of vehicle tracking relative to the sampled feature. These interesting phenomena are usually identified by unique features in the ocean, e.g. significant bathymetric relief, an unstratified water column, or significantly different physical water parameter values. Here, we are interested in the utility of these unique features to aid in localisation of underwater vehicles.
A tracking algorithm is presented that uses sensor data collected by an AUV. One issue with this raw sensor data is that, for aquatic environments, it can be highly correlated. Thus, for correlated data, points that are geometrically close might not be statistically close. Our process of tracking uses the distance between newly sensed observations y_new and our historical observations y to determine the closest geographical location x of y_new. The process we employ to decorrelate these parameters is called whitening. We use a whitening linear transformation to disperse the data and, using the Euclidean distance, we can determine good candidates for the vehicle's position once it loses the GPS signal. The effect of the whitening linear transformation is equivalent to using the Mahalanobis distance, which takes into account the correlation between the sensor measurements, treated as random variables. This allows us to track the position of the vehicle within small intervals of time (less than an hour) until the GPS signal returns. Details of this decorrelation can be found in [35,42].
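The stated equivalence between whitening followed by the Euclidean distance and the Mahalanobis distance can be checked numerically. The sketch below is illustrative only: it uses the symmetric inverse square root of the sample covariance matrix as the whitening matrix (one of several valid choices; the one used in [35,42] is not specified here), and the synthetic correlated data and the names `history` and `y_new` are assumptions.

```python
import numpy as np

def whitening_matrix(samples):
    """Symmetric whitening matrix cov^(-1/2) of the sample covariance."""
    cov = np.cov(samples, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T

def mahalanobis(a, b, cov):
    """Mahalanobis distance between a and b for covariance cov."""
    d = a - b
    return float(np.sqrt(d @ np.linalg.solve(cov, d)))

# Synthetic correlated observations with m = 3 features (not field data).
rng = np.random.default_rng(1)
mixing = np.array([[1.0, 0.8, 0.0],
                   [0.0, 1.0, 0.5],
                   [0.0, 0.0, 1.0]])
history = rng.normal(size=(400, 3)) @ mixing   # historical observations y
y_new = np.array([0.3, -0.2, 0.1])             # newly sensed observation

W = whitening_matrix(history)
cov = np.cov(history, rowvar=False)

# Euclidean distance after whitening (W is symmetric, so rows can be
# right-multiplied by W) versus the Mahalanobis distance on the raw data.
white_dist = np.linalg.norm((history - y_new) @ W, axis=1)
maha_dist = np.array([mahalanobis(y, y_new, cov) for y in history])

closest = int(np.argmin(white_dist))   # statistically closest historical point
```

Both distance vectors agree to numerical precision, so either formulation selects the same candidate; whitening the map once is simply cheaper when many new observations have to be compared against it.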
TBN frameworks have been used to localise AUVs, but they require high-resolution bathymetric maps. Even then, navigational error can still occur, especially in regions of little to no vertical relief. Building on the incorporation of physical science data, i.e. water parameters such as temperature, salinity, pH etc., to enhance the topographic map that the vehicle uses to navigate under the traditional TBN framework, as shown in [29,35], our approach uses bathymetric information and water data to improve localisation and tracking.

Model definition
We consider a small oceanic environment as our workspace. This workspace, denoted by W ⊂ ℝ², is modelled as a 2D polygonal environment. An AUV A is modelled as a point robot without considering its orientation. The state space of the vehicle is represented as X = W, which consists of all navigable locations of the environment. Each state of the robot x denotes a geographic coordinate in the form of longitude and latitude. A state trajectory of the vehicle is denoted as x: [0, t] → X for a finite time interval [0, t]. We assume that the vehicle has m sensors for the observation of the environment. Let Y ⊂ ℝ^m be the observation space, which is the set of m sensor output values. An observation history for the robot is defined as ỹ: [0, t] → Y. Suppose a sensor h: X → Y is given and is applied over an interval of time [0, t]. For every t′ ∈ [0, t], some observations ỹ(t′) = h(x(t′)) are obtained. We define the sensor mapping over [0, t] as [45]
$H : \tilde{X} \to \tilde{Y}$,
in which $\tilde{X}$ is the set of all state trajectories and $\tilde{Y}$ is the set of all possible observation histories. In our problem a state x represents GPS data of the underwater vehicle, which is defined as
$x = (x_1, x_2)$,
where $x_1$ and $x_2$ are the longitude and latitude measured by the GPS. We measure m sensor variables that include water column depth, turbidity, salinity, temperature, pH, dissolved oxygen and conductivity. This multivariable observation can be written as
$y = (y_1, y_2, \ldots, y_m)$,
where $y_i$ is the ith coordinate of this observation.

Problem formulation
We have a set of historical data that gives us the state trajectories of the vehicle x and the corresponding sensor observation histories ỹ.
In a new deployment in the same oceanic environment, the vehicle collects sensor observation data, denoted ỹ_new. The vehicle can get lost due to the unavailability of GPS data. Our goal is then to track the state trajectory of the vehicle from the moment the GPS signal is lost, i.e. once GPS data are no longer received. Let the new state trajectory of the vehicle after the deployment be x_new. In this context, we formulate our problem as follows: Problem 1, Tracking the state of an underwater vehicle: given a state trajectory of a vehicle x and its observation history ỹ for a specific time period, attempt to find the state trajectory of the vehicle x_new from the new observation history ỹ_new.
We have a set of historical data collected in bodies of water within a short time range (2 h in the collected data set), so that the analysis of changes along the temporal dimension can be left for later discussion. We will call this historical data set our 'map', and we will assume it looks like a gridded polygon, not necessarily convex, i.e., we have lines of sampled data that go from North to South, and lines from West to East, whose endpoints are on the boundary of this polygon; call them grid lines.
The vehicle is deployed and it transmits data which include GPS plus some sensor data. This data is fed to the map, until the vehicle's GPS signal is lost, and the tracking process starts with the last GPS + sensor data point received which we will call the 'source' s.
We will call the non-GPS data points received after the source our observation points o_i, i > 0. The points in our map are of the form
$x = (x_1, x_2, x_3, \ldots, x_{m+2})$,
where $x_1$ and $x_2$ are the longitude and latitude measured by the GPS, and $x_i$, for i > 2, is one of the non-GPS sensed data parameters such as water column depth (m), turbidity (nephelometric turbidity units) and dissolved oxygen (mg/L). The observations o_i have the same form, with the only difference that they do not contain the first two GPS parameters, longitude and latitude ($x_1$ and $x_2$, respectively).
The goal is to track the motion of this AUV right after the GPS signal is lost. At that point, we receive observation points o i missing the GPS data.
Our approach is to track the vehicle by finding the point in our map that is statistically closest to the first observation o_1 and lies inside an epsilon ball around the source. This epsilon is approximated using one of the non-GPS parameters, which measures the speed of the vehicle, and the time elapsed between the source point and o_1. If no point is found within this epsilon neighbourhood, we expand epsilon until our first candidate c_1 appears in the map. Based on one of our original assumptions, this candidate would appear in the first few expansions from the source. Notice that these candidates are located either on one of the grid lines or at one of the points that were added to the map during the current deployment before the GPS signal was lost. So, we approximate o_1 with c_1, and when we receive the second observation o_2, we repeat the process by localising o_2 inside a new epsilon ball around c_1, where epsilon is determined by the average speed between o_1 and o_2. We bound this epsilon by some fixed quantity such that, if no candidates are within it after it has been expanded a few times, we ignore that observation and wait for the next one to be approximated. The statistical distance mentioned above, calculated for instance between o_1 and all the points in our map, is the Mahalanobis distance, which is equivalent to applying a whitening linear transformation to o_1 and to all points in our map, and then computing the Euclidean distance between them. The advantage of using this distance is that it reverses the effect of the correlation between the parameters used to calculate it, thus revealing the true statistical distance between the sampled points and the observations. As we receive and approximate these observation points with candidates from our map, a curve connecting them is interpolated to describe an approximate path followed by the vehicle. We continue this process for as long as needed or until the GPS signal returns.

Localisation algorithm
Once the GPS signal is lost, we perform the whitening on the map and the new observation point and, having sorted the candidates, we take the one from our map that is statistically closest to the new observation, within a radius equal to the average speed times the elapsed time around the source or last GPS data point available. This average speed is also measured through non-GPS means (part of the AUV sensor data). If no candidates lie within that radius, we ignore that first observation point and, after receiving the second one, we apply the same procedure from the source point with a radius augmented according to the elapsed time and average speed. On the other hand, if the first observation was localised within the first radius, we mark that candidate as part of the new trajectory and draw a new radius around that point to localise the next observation point, and so on. Fig. 13 shows the trajectory performed by the robot during deployment (in blue); the green points are observations taken while the GPS signal was unavailable, and the orange points are the approximate localisations of these observations, shown both without and with whitening of the data.
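The procedure above can be sketched as a short loop. This is a minimal illustration under stated assumptions rather than the implementation used in the field trials: positions are taken to be in a local metric frame (metres) instead of longitude and latitude, the whitening matrix is the symmetric inverse square root of the map's sample covariance, and the function name `track_after_gps_loss`, its parameters and the synthetic map are hypothetical.

```python
import numpy as np

def track_after_gps_loss(map_xy, map_obs, source_xy, observations, speeds, dts):
    """Approximate each GPS-less observation by a point of the map.

    map_xy       : (n, 2) positions of the map points (ASSUMED in metres)
    map_obs      : (n, m) sensor values recorded at those positions
    source_xy    : last position received before the GPS signal was lost
    observations : (k, m) sensor values received without GPS
    speeds, dts  : average speed and elapsed time for each observation
    """
    # Whitening matrix cov^(-1/2), estimated from the map's sensor data.
    cov = np.cov(map_obs, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    W = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    white_map = map_obs @ W

    centre = np.asarray(source_xy, dtype=float)
    radius = 0.0
    trajectory = []
    for obs, speed, dt in zip(observations, speeds, dts):
        radius += speed * dt   # search radius grows with the elapsed time
        inside = np.flatnonzero(
            np.linalg.norm(map_xy - centre, axis=1) <= radius)
        if inside.size == 0:
            continue           # ignore this observation, keep the larger radius
        # Euclidean distance in whitened space = statistical distance.
        stat_dist = np.linalg.norm(white_map[inside] - obs @ W, axis=1)
        best = inside[np.argmin(stat_dist)]
        trajectory.append(map_xy[best])
        centre, radius = map_xy[best], 0.0   # new ball around the candidate
    return np.array(trajectory)

# Synthetic 5 x 5 map with 1 m spacing and m = 3 features (not field data).
map_xy = np.array([(x, y) for x in range(5) for y in range(5)], dtype=float)
rng = np.random.default_rng(2)
map_obs = rng.normal(size=(25, 3))
true_idx = [5, 6, 11]          # cells (1, 0), (1, 1), (2, 1)
estimate = track_after_gps_loss(map_xy, map_obs, (0.0, 0.0),
                                map_obs[true_idx], [1.1] * 3, [1.0] * 3)
```

With noise-free observations the sketch returns the true cells; with real data the chosen candidate is only the statistically closest map point inside the ball, and a fixed upper bound on the radius (as described in the problem formulation) would be a natural addition.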

Spatial and temporal considerations
The concept that incorporating physical water parameter data, e.g. temperature, salinity, pH etc., into standard terrain maps provides a more robust map comes from extending the concept of environmental niche models. However, water bodies are inherently dynamic in nature, and a generated terrain map based solely or partially on these variables will change in both space and time. For this research, we initially assumed that the environment is quasistatic in both space and time, only considering operations across a few days.
Over multiple field trials, analysis of the spatiotemporal dynamics of observed ecological niches suggests that, in certain regions, they exhibit periodicity or a predictable stochastic behaviour, see e.g. [25]. Further research on ATBN includes an examination of how to intelligently update our maps [34], and of ways to predict the spatiotemporal dynamics of the region [42], possibly through the incorporation of predictive ocean models [46]. Given enough deployments in the same region, covering a sufficient spatiotemporal epoch, one could investigate applications of machine or deep learning to incorporate dynamics into the ATBN maps.

Other applications of environmental niche modelling
Many operations, such as mechanical dredging, underwater drilling, and marine construction, occur in places that may be home to threatened or endangered species. While these areas are regulated by legislation to minimise impacts to the local environment, priority is commonly given to expediency rather than environmental stewardship. Considering that many of these processes are a necessity to international travel and trade, predictive modelling of endangered species' habitats has become a growing area of interest. The proposed approach for implementing an environmental niche model could be applied to assess where any of these construction operations could be safely conducted through probabilistic predictions of where a species is likely to be found. Traditional environmental niche models are constructed in a binary sense: either the species could reside in an area or it could not. While the proposed approach is similar to standard methods of producing an environmental niche model for a species, it is novel in the sense that the user can specify the confidence interval at which a species is likely to be found in an area. This allows the user to compare the importance of conducting marine engineering or commercial operations in an area against the likelihood of finding a species in that area. This approach enables the user to weigh the cost versus benefit of a specific project goal against the acceptable risk of possibly disturbing a fragile species. Unlike the traditional binary model, it should be possible with this model to rein in some of the over-prediction that is associated with the traditional predictive environmental niche methodology.
This proposed approach to environmental niche modelling is tested only in aquatic environments that have previously been surveyed. During these surveys the AUV was collecting both water chemistry data and bathymetry data. The physical water parameters that were measured were: salinity, temperature, specific conductance, pH, turbidity, chlorophyll concentration, blue-green algae concentration and organic dissolved oxygen. Measuring broad spectrum water chemistry in this manner allows for the combination of nine parameters that could potentially characterise the environmental niche of a given species.

Conclusion
Based upon the methodology presented here, it has been shown that the environmental niche of a species may be determined at various confidence intervals based upon environmentally surveyed data. At present, data were not available concerning the physical tolerances of the kelp bass. For demonstrative purposes, the kelp bass was assumed to have physical tolerances similar to the well-studied black sea bass from the same taxonomic family. This assumption was used for a proof-of-concept to highlight the fact that an environmental niche can be isolated within a geographic locale if data are available.
The practicality in generating this kind of model can allow for the minimisation of pose estimation error growth when attempting to localise an aquatic robot by utilising nothing more than the equipment that is already on board most vehicles. Using this motivation, we presented some initial results from field trials utilising our developed method to localise underwater vehicles and compared results to GPS for ground truth. We found that the ATBN framework was reliable for localisation tasks, and could provide pose estimation at accuracies similar to GPS. Further research has been conducted, and for a more detailed analysis of the field trials and corresponding results, we encourage the interested reader to consider [28,29,42].
Further extensions of this work could include applying the same principles to oceanic engineering applications, either in areas where fragile species may be intruded upon by such actions, or to open up areas for development that were once prohibited on the grounds of the possible occurrence of such species. It is intended that the demonstration of this concept in an aquatic environment could be extended to terrestrial environments given a different means of surveying the environment.