Spectral reflectance of whale skin above the sea surface: a proposed measurement protocol

Great whales have been detected using very-high-resolution satellite imagery, suggesting this technology could be used to monitor whales in remote areas. However, the application of this method to whale studies is at an early developmental stage and several technical factors need to be addressed, including the capacity for species differentiation and the maximum depth of detection in the water column. Both require knowledge of the spectral reflectance of the various whale species just above the sea surface, because when whales' bodies break the surface of the water to breathe, log or breach, there is, at times, no sea water between the whale's skin and the satellite sensor. Here we tested whether such reflectance could be measured on dead whale tissue. We measured the spectral reflectance of fresh integument collected during the bowhead subsistence harvest, and of thawed integument samples from various species obtained following strandings and stored at −20°C. We show that fresh and thawed samples of whale integument have different spectral properties. The reflectance of fresh samples was higher than the reflectance of thawed samples, as integument appears to darken after death and with time, even under frozen conditions. In this study, we present the first whale reflectance estimates (without the influence of sea water and for dead tissue). These provide a baseline for the additional work needed to advance the use of satellite imagery to monitor whales and facilitate their conservation.


Introduction
Whales can be detected from 600 km above the ocean using very-high-resolution (VHR) satellite imagery, that is, <50 cm (Fretwell et al. 2014; Cubaynes et al. 2019). The use of this improved spatial resolution has enhanced the capacity for sensing large whales, from seeing virtually unresolved objects (Abileah 2002) to more detailed objects with visible whale-defining features, such as flukes (Cubaynes et al. 2019).
With further developments, VHR satellite imagery has the potential to become a complementary and valuable tool to estimate whale abundance, particularly in remote oceans where few or no surveys are conducted (Kaschner et al. 2011, 2012). Accurate trends of whale abundance are crucial for evaluating the efficacy of conservation measures implemented to support whale population recovery (Taylor and Dizon 1999; Stevick et al. 2003; George et al. 2004; Mace et al. 2008; Panigada et al. 2011; Fisheries and Oceans Canada 2014; Pace et al. 2017). A key aspect required to realize this potential is to assess the spectral reflectance of whales above the sea surface, which is particularly necessary to develop tools for differentiating species, and measuring how well and at what depth whales can be detected in satellite imagery. However, the spectral reflectance of live whales just above the sea surface is currently unknown. Radiances of four whale species have previously been estimated (Cubaynes et al. 2019); however, these are for whales slightly below the surface, and so the spectra are attenuated due to the effect of seawater on light. There are not yet enough whales identified on satellite imagery to provide good species-specific spectral estimates using 'pure' pixels of whales, without the influence of seawater.
In remote sensing, spectroradiometers have been successfully used to acquire the spectral reflectance of various natural targets, such as penguin guano and vomit (Schwaller et al. 1984; Rees et al. 2017), corals (Lubin et al. 2001), trees (Lin et al. 2013), lichens (Rees et al. 2004) and minerals (Clark et al. 1990). These are stationary targets, as spectroradiometers usually need to remain still for several minutes while acquiring the reflectance. Hence, this method cannot be directly transferred to free-swimming whales. The acquisition of the reflectance of one target within an individual whale (e.g. a specific area on one whale) is a slow process involving several measurements of the target, interspersed by measurements of a known reference. This method also requires that the spectroradiometer is placed at a specific distance from the target, to control the area being measured and ensure that no other surfaces are measured. A hand-held spectroradiometer would typically need to be 1 m away from the target to measure the spectral reflectance of a sufficiently small area of whale integument, while avoiding measuring any part of the sky and/or the sea. Such close and lengthy approaches to free-swimming whales are not feasible for ethical and practical reasons (Scheidat et al. 2004; Isojunno and Miller 2015; Argüelles et al. 2016).
A potential solution for measuring the reflectance spectra of a whale above the sea surface is to use samples of whale integument in good condition that were collected and frozen after fatal strandings, an approach that enables spectroradiometer tests to be conducted up close and with no time constraints. Here, we investigated whether the spectral reflectance of thawed whale integument collected at fatal strandings could be used to estimate the spectral reflectance of live whales above the sea surface. First, we assessed whether fresh and frozen whale integument have similar reflectance spectra. Then, we verified whether the spectral reflectance of thawed samples was unique to each of the species analysed.

Apparatus set-up
Measurements of spectral reflectance of whale integument above the sea surface were acquired using the set-up shown in Figure 1. All spectra were acquired at high spectral and spatial resolution, using a GREEN-Wave spectroradiometer, model VIS-50 (Stellarnet Inc., Tampa, FL, USA), which covers a wavelength range of 350-1150 nm with a spectral resolution of 1.6 nm and a sampling interval of 0.5 nm. The spectroradiometer was securely fixed to a tripod, with the sensor pointing perpendicularly to the whale integument and positioned at a predetermined distance from the target to ensure a known area of whale integument was measured. The distance between the sensor and the target was twice the radius of the measured surface area of the whale integument, as the sensor has a 30° field of view. The spectroradiometer was connected to a computer running SpectraWiz® software (distributed by Stellarnet Inc.), to allow for visualization and acquisition of the spectral reflectance.
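The relation between sensor distance and measured area can be sketched with simple cone geometry. The following is a minimal illustration, not the procedure used in the study: the function name and the `fov_is_full_angle` switch are hypothetical, and the switch exists because the text does not state whether the 30° value is the full cone angle or the half-angle.

```python
import math

def footprint_radius(distance_m, fov_deg, fov_is_full_angle=True):
    """Radius of the circular area seen by a sensor pointing
    perpendicularly at a flat target (hypothetical helper)."""
    # Assumption: the field of view is a symmetric cone.
    half_angle_deg = fov_deg / 2 if fov_is_full_angle else fov_deg
    return distance_m * math.tan(math.radians(half_angle_deg))

# Footprint radius at 1 m under both readings of the 30 degree value.
r_full = footprint_radius(1.0, 30.0, fov_is_full_angle=True)
r_half = footprint_radius(1.0, 30.0, fov_is_full_angle=False)
print(f"full-angle reading: radius = {r_full:.3f} m")
print(f"half-angle reading: radius = {r_half:.3f} m")
```

Whichever reading applies, the footprint scales linearly with distance, so the sensor height can be chosen once the size of the integument sample is known.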

Sample collection and preparation
Because the set-up had to remain still for approximately 5 minutes to acquire the spectral reflectance of the target, we initially considered measuring the spectral reflectance of live-stranded whales. However, such unfortunate events are unpredictable, particularly for baleen whales (van der Hoop et al. 2013); therefore, we focused on measuring the spectral reflectance of whale integument samples collected during previous strandings and during the 2018 bowhead (Balaena mysticetus) subsistence fall harvest by Iñupiat hunters at Utqiaġvik (Barrow), Alaska. The samples collected during strandings represented seven species: minke (Balaenoptera acutorostrata), fin (B. physalus), sei (B. borealis), Bryde's (B. edeni), humpback (Megaptera novaeangliae), North Atlantic right (Eubalaena glacialis) and sperm whales (Physeter macrocephalus). The subsistence harvest samples are from bowhead whales. In this study, all samples of whale integument consisted of epidermis (skin) through to hypodermis (fat).
A total of 37 samples of whale integument collected during strandings were frozen at −20°C at the International Fund for Animal Welfare (Northeastern US), the University of North Carolina Wilmington (Southeastern US) and the Woods Hole Oceanographic Institution (Northeastern US). All stranded animals were coded 1 to 3 based on the Geraci and Lounsbury (2005) classification at the time of stranding. This coding is used to evaluate the quality of the whale carcass for research, with code 1 being alive at stranding, indicating the freshest and best preserved sample, and 3 being considered of fair quality with internal decomposition having started.
During the bowhead subsistence harvest, the reflectance of seven different portions of whale integument was measured either on the whale (i.e. before flensing) or on samples collected post-flensing. Flensing refers to the removal of the integument from the whale carcass. The Iñupiat community of Utqiaġvik also gave us permission to freeze one of the seven samples at −20°C for 3 days. This sample had its reflectance measured before and after it was frozen and was used to assess the comparability between spectral reflectance measured on thawed versus fresh whale integument.

Spectral reflectance acquisition and preprocessing
All frozen samples were thawed to pliability before the spectral reflectance was measured. The acquisition of the spectral reflectance for each sample included three measurements of the whale integument interspersed with three measurements of a known reference card. We used a JJC GC-1II waterproof grey card of 254 by 202 mm, manufactured by JJC Photography Equipment Co., Ltd. (Shenzhen, China) as a reference. To follow agreed spectrometry protocols (Lubin et al. 2001; Rees et al. 2017), the grey card was calibrated using a 'Spectralon' white panel (reference SRT#034, on loan from NERC Field Spectroscopy Facility). To measure the reflectance of the grey card under the same geometrical and lighting conditions as the whale integument, we placed it immediately on top of the whale integument. Because different light sources were used for different samples, these were directly compared in order to establish any impact on the reflectance of the integuments. Light sources included halogen, fluorescent LED, surgical light (STERIS Amsco SQ240, STERIS, Mentor, OH, USA), sun light bulb (GE Reveal HD+ 45w, GE Lighting, Cleveland, OH, USA) and natural light.
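The reference-card step can be illustrated as a ratio calculation. This is a minimal sketch under stated assumptions rather than the SpectraWiz workflow: it assumes the three target scans and three card scans are each averaged before taking the ratio, and the 18% card reflectance and the scan values are illustrative numbers, not measurements from the study. The function name is hypothetical.

```python
import numpy as np

def calibrated_reflectance(target_scans, card_scans, card_reflectance):
    """Target reflectance from raw scans and a reference card whose
    reflectance is known (e.g. from a white-panel calibration)."""
    # Assumption: replicate scans are averaged per wavelength first.
    target = np.mean(np.asarray(target_scans, dtype=float), axis=0)
    card = np.mean(np.asarray(card_scans, dtype=float), axis=0)
    # Same geometry and lighting for both, so the ratio cancels the
    # illumination and leaves reflectance relative to the card.
    return target / card * np.asarray(card_reflectance, dtype=float)

# Illustrative raw signal at two wavelengths (arbitrary units).
target_scans = [[10.0, 20.0], [12.0, 22.0], [14.0, 24.0]]
card_scans = [[60.0, 110.0], [60.0, 110.0], [60.0, 110.0]]
card_reflectance = [0.18, 0.18]  # hypothetical grey-card value

reflectance = calibrated_reflectance(target_scans, card_scans, card_reflectance)
print(reflectance)
```

Placing the card directly on the integument, as described above, is what justifies treating the illumination term as identical in numerator and denominator.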
All spectra collected at high spectral resolution were smoothed with a 10 nm moving average to remove noise. Prior to smoothing, we checked all spectral reflectance for the presence of narrow features that would be lost in the process of smoothing. No such features were observed. After smoothing, the spectral reflectance measured under fluorescent LED and the surgical light continued to have a high amount of noise at the wavelengths below 416.25 nm and above 802.75 nm. Therefore, we only analysed the smoothed, calibrated reflectance between 416.25 and 802.75 nm for all reflectance spectra. Occasional spectral measurements looked very different from other replicates, likely due to human error. We removed these measurements from subsequent analyses. Another measurement was excluded due to poor lighting conditions, specifically sample 18B13-1, which was measured at night, with an Allmand night light.
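The smoothing and trimming steps can be sketched as follows. This is an illustration only: it assumes a centred boxcar window of 21 samples (10 nm at a 0.5 nm sampling interval) and a synthetic wavelength grid, neither of which is specified in the text, and the function name is hypothetical.

```python
import numpy as np

def smooth_and_trim(wavelengths_nm, reflectance, window_nm=10.0,
                    step_nm=0.5, lo_nm=416.25, hi_nm=802.75):
    """Moving-average smoothing followed by trimming to the range
    retained for analysis."""
    wavelengths_nm = np.asarray(wavelengths_nm, dtype=float)
    reflectance = np.asarray(reflectance, dtype=float)
    # Assumption: window of 10 nm centred on each sample (21 points).
    n_points = int(round(window_nm / step_nm)) + 1
    kernel = np.ones(n_points) / n_points
    # mode="same" zero-pads the ends; those biased edge values are far
    # outside the retained range and are discarded by the trim below.
    smoothed = np.convolve(reflectance, kernel, mode="same")
    keep = (wavelengths_nm >= lo_nm) & (wavelengths_nm <= hi_nm)
    return wavelengths_nm[keep], smoothed[keep]

# Synthetic flat spectrum on an assumed 350-1150 nm grid.
wl = np.arange(350.0, 1150.0 + 0.25, 0.5)
flat = np.full(wl.shape, 0.05)
wl_kept, r_kept = smooth_and_trim(wl, flat)
print(wl_kept.min(), wl_kept.max(), r_kept.mean())
```

A flat input is a useful sanity check: the smoothed values inside the retained range should be unchanged, which confirms that edge effects of the window do not reach the analysed wavelengths.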
All spectral reflectances, covering the whole wavelength range available (350-1150 nm), were also convolved based on the radiometric response curves of the WorldView-3 sensors. The satellite WorldView-3 currently offers the best spatial resolution for detecting whales from space; therefore, we aimed to show what the spectral reflectance of each species would be using atmospherically corrected WorldView-3 imagery. To convolve the data, we used the calibrated, non-smoothed spectral profiles (n = 107) for thawed samples (n = 39), excluding profiles for which there was error in the measurements, or poor lighting conditions. The convolved reflectance $R_s$, for a species $s$, is as follows:
$$R_s = \frac{\sum_{i \in B} r_i \, w_i}{\sum_{i \in B} w_i}$$
where $B$ is the set of wavelengths in nm, after binning at 0.5 nm, over which the convolved reflectance is calculated, $r_i$ is the reflectance of whale integument from species $s$ at a given wavelength and $w_i$ is the response curve for a given WorldView-3 sensor (DigitalGlobe 2016). The WorldView-3 bands investigated here were the panchromatic (450-800 nm), coastal (397-454 nm), blue (445-517 nm), green (507-586 nm), yellow (580-629 nm), red (626-696 nm), red-edge (698-749 nm), near-infrared 1 (765-899 nm) and near-infrared 2 (857-1039 nm).
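Band convolution of this kind is commonly computed as a response-weighted mean of the reflectance. The sketch below assumes that form (sum of reflectance times response, divided by the sum of the response). The boxcar response over 507-586 nm is a hypothetical stand-in for the green band; real sensor response curves are not rectangular and would need to be taken from the sensor documentation.

```python
import numpy as np

def convolve_to_band(reflectance, response):
    """Response-weighted mean reflectance for one sensor band."""
    r = np.asarray(reflectance, dtype=float)
    w = np.asarray(response, dtype=float)
    return float(np.sum(r * w) / np.sum(w))

# Assumed 0.5 nm grid over the instrument range.
wl = np.arange(350.0, 1150.0 + 0.25, 0.5)

# Hypothetical boxcar response using the green band limits from the text.
green_response = ((wl >= 507.0) & (wl <= 586.0)).astype(float)

# Synthetic spectrum rising linearly with wavelength.
ramp = wl / 1000.0
green_value = convolve_to_band(ramp, green_response)
print(green_value)
```

With a boxcar response the result reduces to the plain mean of the reflectance inside the band, which makes the example easy to check by hand.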

Spectral reflectance: influence of the set-up versus animal
We used a bottom-up approach to test whether any element of the set-up or variable intrinsic to the animal influenced the spectral reflectance. First, we created a distance matrix $D$ (with individual elements $d_{j,k}$) of spectral values using the Euclidean distance metric:
$$d_{j,k} = \sqrt{\sum_{i \in B} \left(x_{i,j} - x_{i,k}\right)^2}$$
where $x_{i,j}$ and $x_{i,k}$ are the spectral reflectance for each wavelength (nm) in $B$ = [416.25, 802.75], after binning at 0.5 nm units, for different animals $j$ and $k$. For each value in the distance matrix, we used the spectral reflectance averaged by animal (n = 32), as several measurements were made for the same animal under the same conditions. The only exception was animal 8, which was measured under different types of freshness condition (i.e. on the whale, freshly cut out of the whale and thawed); therefore, we averaged animal 8 under each type of freshness condition. Using the distance matrix and the dendextend R package (Galili 2015), we performed hierarchical clustering to test for specific groupings of the spectral reflectance by species and sampling method. Different agglomeration methods exist to perform hierarchical clustering. All these methods were compared using the Spearman correlation test (see supplemental work, Figure S2), which supported the use of Ward's minimum variance method, specifically the ward.D method argument within the hclust function from the stats R package (R Core Team 2019). To explain the clustering and assess the drivers of variation among spectral reflectance of whale integument, we carried out a permutational multivariate ANOVA (Adonis in vegan 2.5-5 implemented in R; Oksanen et al. 2019). The variables tested were related to either the animal or the experimental set-up and included species, epidermis colour, pigmentation, source of light, measurement type, freshness condition and time spent in the freezer (detailed in Table 1). The null hypothesis was that each variable (Table 1) had no effect on the reflectance of whale integument. Consequently, the method evaluated which variable(s) related to the set-up or animal could explain the clustering structure.
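The distance matrix and clustering steps can be illustrated with SciPy. This is a sketch on synthetic spectra, not a reproduction of the R analysis: SciPy's `method="ward"` operates on Euclidean distances and corresponds to R's `ward.D2` rather than the `ward.D` option named above, so dendrogram heights (and possibly merges) may differ from the published result. The two synthetic groups and all variable names are assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform

# Synthetic per-animal mean spectra: rows are animals, columns are
# wavelengths. Four "brighter" and four "darker" animals.
rng = np.random.default_rng(0)
bright = 0.08 + 0.005 * rng.standard_normal((4, 50))
dark = 0.03 + 0.005 * rng.standard_normal((4, 50))
spectra = np.vstack([bright, dark])

# Pairwise Euclidean distances between animals, summed over wavelengths.
condensed = pdist(spectra, metric="euclidean")
D = squareform(condensed)  # full symmetric matrix, for inspection

# Ward clustering (SciPy variant; see the note on ward.D vs ward.D2).
tree = linkage(condensed, method="ward")

# Cut the tree into two clusters.
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)
```

Cutting the tree at two clusters mirrors the two-cluster summary reported in the results; other cut levels can be explored by changing `t`.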

Fresh versus frozen spectral reflectance
The first objective was to assess whether fresh and frozen whale integuments have similar reflectance spectra. For this objective, we compared the spectral reflectance of the bowhead integument measured on a fresh sample post-flensing, and again after the same sample had been frozen for 3 days at −20°C and then thawed to pliability. Frozen samples are easier to access, making the protocol more easily transferable to other whale species. As we were only able to use one sample for the observational test examining differences between fresh and thawed integument, we also compared the mean spectral reflectance of all the samples that were fresh (i.e. spent no time in a freezer) to those that spent a 'short', 'medium' or 'long' time (defined below) in a freezer at −20°C. The fresh samples refer to those collected during the bowhead subsistence harvest and the frozen samples include those collected during strandings and the one sample of bowhead collected during the subsistence harvest that we were allowed to freeze. Five animals represented the 'fresh' category. The three other categories were determined by ordering the whale integument samples from shortest to longest time spent in a freezer, and subsequently by separating the samples into three categories of equal percentile (i.e. nine animals per category). Spectral reflectances (calibrated, smoothed and averaged per animal) in each category were then averaged. The short frozen-duration category was represented by samples that had spent between 3 and 473 days in a freezer, the medium-duration samples were stored between 481 and 4159 days and the long-duration samples were stored between 4411 and 7689 days. The above-mentioned ANOVA tested whether the variable 'estimated freezer time' (Table 1) significantly explained part of the clustering.
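The equal-percentile split into freezer-time categories can be sketched as a rank-based partition. The example is illustrative: the function name is hypothetical, and the six input values are simply the category limits quoted above, not the per-animal storage times.

```python
import numpy as np

def freezer_categories(days_in_freezer):
    """Label each sample 'short', 'medium' or 'long' by splitting the
    samples, ordered by freezer time, into three equal-sized groups."""
    days = np.asarray(days_in_freezer, dtype=float)
    order = np.argsort(days, kind="stable")  # shortest to longest
    labels = np.empty(days.shape, dtype=object)
    for name, idx in zip(("short", "medium", "long"),
                         np.array_split(order, 3)):
        labels[idx] = name
    return labels

# Illustrative input: the limits of each category reported in the text.
days = [3, 473, 481, 4159, 4411, 7689]
cats = freezer_categories(days)
print(list(cats))
```

Splitting by rank rather than by fixed day thresholds keeps the groups the same size, at the cost of very uneven ranges of storage time within each group.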

Spectral reflectance per species
Different species of whales have different epidermis colouration (Jefferson et al. 2015). As different colours have different reflectance (Rees 2013), we aimed to test whether the spectral reflectance of thawed samples was unique to each whale species (second objective). To address this, we averaged separately the low (convolved) and high spectral resolution spectral reflectances per species, for thawed samples only. The above-mentioned ANOVA, which tested various variables including 'species', was used to assess whether the clustering of the spectral reflectance was driven by the variable 'species' (Table 1).

ANOVA: which factors influenced variation in spectral reflectance?
The permutational multivariate ANOVA performed here showed that the time spent in a freezer was the only variable to significantly (P < 0.05) explain the variation observed among the spectral reflectance averaged per animal (Table 1), and therefore the clustering. However, the time spent in a freezer explained a low proportion of the clustering (R² = 0.2; Table 1). The type of light was slightly above the threshold to be considered significant (P = 0.055; Table 1).
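For readers unfamiliar with how R² and P arise in a permutational multivariate ANOVA, the following single-factor sketch partitions squared distances into among-group and within-group sums of squares and builds the null distribution by permuting group labels. It is a simplified illustration on synthetic data, not the vegan implementation, which handles several terms and other options; all names and values are assumptions.

```python
import numpy as np

def permanova(D, groups, n_perm=999, seed=0):
    """One-factor PERMANOVA from a distance matrix.
    Returns (pseudo-F, R squared, permutation P value)."""
    d2 = np.asarray(D, dtype=float) ** 2
    groups = np.asarray(groups)
    n = d2.shape[0]
    levels = np.unique(groups)
    a = len(levels)
    # Total sum of squares from all pairwise squared distances.
    ss_total = d2[np.triu_indices(n, k=1)].sum() / n

    def pseudo_f(g):
        ss_within = 0.0
        for level in levels:
            idx = np.flatnonzero(g == level)
            sub = d2[np.ix_(idx, idx)]
            ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
        ss_among = ss_total - ss_within
        f = (ss_among / (a - 1)) / (ss_within / (n - a))
        return f, ss_among / ss_total

    f_obs, r2 = pseudo_f(groups)
    rng = np.random.default_rng(seed)
    # Count permutations whose pseudo-F is at least the observed one.
    hits = sum(pseudo_f(rng.permutation(groups))[0] >= f_obs
               for _ in range(n_perm))
    return f_obs, r2, (hits + 1) / (n_perm + 1)

# Synthetic spectra: six "fresh-like" and six "long-frozen-like" animals.
rng = np.random.default_rng(1)
spectra = np.vstack([0.08 + 0.005 * rng.standard_normal((6, 50)),
                     0.03 + 0.005 * rng.standard_normal((6, 50))])
diff = spectra[:, None, :] - spectra[None, :, :]
D = np.sqrt((diff ** 2).sum(axis=-1))
groups = np.array(["fresh"] * 6 + ["frozen"] * 6)

f_obs, r2, p = permanova(D, groups)
print(f"pseudo-F = {f_obs:.1f}, R2 = {r2:.2f}, P = {p:.3f}")
```

In this framing, R² is the share of the total squared distance attributable to the grouping variable, so a significant variable with a low R² explains only a small part of the variation among spectra.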

Do fresh and frozen whale integuments have similar spectral reflectance?
Our controlled experiment, with the bowhead sample that had its reflectance measured when fresh and thawed, showed that freezing the integument darkens it across nearly all visible wavelengths, that is, it becomes less reflective (Fig. 2). Although this represents only one sample, the same observation was made when comparing the average spectral reflectance of samples having spent different times in a freezer (Fig. 3). This effect could be seen when looking at the spectral reflectance estimates averaged over two clusters (Fig. 5), plotted out in Figure 4. The two distinct clusters yielded by hierarchical clustering had an average of 278 ± 305 days (cluster 1) and 2657 ± 2499 days (cluster 2) spent in a freezer (Figs. 4 and 5). This clustering was most strongly explained by time spent in the freezer (Table 1).

Do whale species have unique spectral reflectance?
The clustering analysis (Fig. 4) did not show grouping by species nor by epidermis colour, which was also observed when comparing the average spectral reflectance for each species (Fig. 6). All species had a low, flat reflectance throughout most of the measured wavelength range (approximately 416.25-700 nm), except for a slight increase beyond the red wavelength (Fig. 6). The noise observed on Figure 6 at the lowest and highest wavelengths in the spectrum was due to the type of artificial light used. As mentioned in the methods, fluorescent (with or without UV) and surgical lights had a more constrained wavelength range. Table 2 shows the spectral reflectance averaged per species and convolved using the WorldView-3 satellite radiometric response curves.

Discussion
In this study, we sought to assess whether measuring the spectral reflectance of thawed whale integument could be a useful alternative to measuring the spectral reflectance of live whales (under the assumption that freshly harvested integument is a good proxy for live whale integument). Accurate species-specific reflectance values are necessary to reliably discriminate species when searching for whales on satellite imagery, and they also provide an important first step towards assessing the visibility of whales at different depths underwater. Our results led to two interesting biological outcomes: (1) whale integument darkened the longer it stayed in a freezer, and (2) spectral reflectance of thawed samples showed no difference among species, potentially due to (1) above. Here we discuss the implications of these findings and suggest other approaches targeting live whales, which could help to fill this important data gap in future.

Fresh and frozen whale integuments: different spectral reflectances
The longer a whale integument remains in a freezer, the darker it becomes. Therefore, when measuring the reflectance of live whale integument, fresh integument samples are more appropriate than frozen samples. However, the reflectance of fresh samples might not be comparable to the reflectance of live whales either. Although we did not have live whales to verify this, studies on human integument suggested a smoothing of the spectral reflectance soon after death (Brunsting and Sheard 1929; Angelopoulou 2001). Similar to what we observed in our analysis, these reported a relatively flat spectral reflectance with a slight increase in reflectance in the red region of the visible spectrum (approximately between 620 and 750 nm). The smoothing of the human integument reflectance after death was mostly explained by the loss of oxygen, which detaches from haemoglobin after death (Brunsting and Sheard 1929; Angelopoulou 2001). As whale integument also contains haemoglobin (Tawara 1950; Corda et al. 2003), it is plausible that the same whale integument has a different reflectance before and after death. The darkening of the integument reported in this study might be due to freezing, which causes desiccation and minor changes in the volatile lipids in the epidermis. Freezer burns have been reported for human integument and are revealed by a darkening of the integument (Burge et al. 1986). For whales, these cold burns might also be manifested by a darkening, similar to the effect of prolonged exposure to sunlight. As documented by Martinez-Levasseur et al. (2011), whales can become sunburned when exposed to sun for extended periods, which darkens their epidermis. Stranded whales are particularly prone to sunburn (McLellan et al. 2004), which might also explain the darkness of the spectral signatures observed among samples obtained from strandings. Measuring the spectral reflectance of dead whales to help characterize the spectral reflectance of live whales is therefore not recommended based on the observed darkening of the integument.
Figure 5. Averaged spectral reflectance for whale skins as separated into cluster 1 (grey dashed line) and cluster 2 (black line) using Ward's minimum variance method.

Different whale species: similar spectral reflectance
Our aim in this study was to establish whether different species had different spectral signatures, enlarging on initial results presented by Cubaynes et al. (2019). If different species have different spectral signatures, this can enable better species discrimination on satellite imagery. The capacity to discriminate species, at least to a similar degree as traditional surveys, is necessary if satellite imagery is to become a useful alternative method for surveying whales in remote and poorly studied places. In this study, multiple species had similar integument reflectance, which is the opposite of what was anticipated, based on the knowledge that different species have different epidermis colouration (Jefferson et al. 2015) and that different colours should have different reflectance (Rees 2013). Furthermore, previous studies found differences among species in their spectral analyses of live whales in satellite imagery (Cubaynes et al. 2019) and in aerial imagery (Abileah 2002). However, the absence of differences among species observed in this study could be explained by the observed darkening of the integument after death, and possibly also by sunburn. This further confirms that measuring the spectral reflectance of integument collected from dead whales is not an alternative to measuring the spectral reflectance of live whales.
Although we found no difference among species and integument from dead whales was used, this study presents the first attempt to establish a catalogue of the spectral reflectance values of whales per species. The creation of such a catalogue is necessary to further develop the use of VHR satellite imagery for monitoring whales; therefore, we introduce it here as a baseline for future improvements (i.e. including spectral reflectance of live whales). Additional advances should endeavour to measure the reflectance spectra of live whales above the sea surface, to generate a more accurate catalogue.

Towards a spectral reflectance database for whales
Our study represents a first effort towards reliable measurement of spectral signatures for different whale species.
Here we have established that reflectances collected from whales post-mortem are not likely to be a good proxy for live whale reflectance, perhaps due to changes in the oxygen flow across skin. Continued use of VHR satellite images to gather reflectance of whales above the sea surface (e.g. Cubaynes et al. 2019) is likely to represent an important source of data for characterizing spectral signatures in future. However, full validation of the signature for each species is likely to take a long time to achieve, because of the small data yields in terms of whales identified per image, and the time and cost of image acquisition and processing.
In order to gather such data more rapidly, we also propose two adapted set-ups. The first (set-up A; Fig. 7) consists of mounting a hyperspectral camera on a small aircraft or an unmanned aerial vehicle (UAV), and flying it over whales in known aggregation grounds. Several studies have used imaging equipment mounted on planes or UAVs and flown over marine mammals at sea (Hodgson et al. 2017; Boyd et al. 2019). Hyperspectral cameras do not provide spectral reflectance as detailed as that acquired from a spectroradiometer, in terms of spatial and spectral resolution. However, their data are sufficiently detailed to be transformed into reflectances usable by all current VHR satellites. Hyperspectral cameras fixed on a UAV or aircraft will require more equipment than the set-up tried in this study. For instance, lenses that help to control the field of view will need to be fitted on the hyperspectral cameras, to ensure that only the reflectance of the portion of the whale that is above the sea surface is measured.
Another option to measure the reflectance of whales above the surface would be to acquire the reflectance of live-stranded whales using a spectroradiometer similar to the one used in this study (set-up B in Fig. 7). Marine mammal stranding networks could be trained in how to measure the reflectance of whale integument. However, in live strandings, the welfare of the animal must be the priority, which might make it logistically difficult to collect the spectral reflectance. That said, decisions as to how to manage live-stranded whales are made with careful deliberation, often allowing at least one tidal cycle to elapse to provide sufficient data on the clinical status of the animal, its prognosis and likelihood of refloating before management decisions are concluded. Additionally, live-stranded whales might not be ideal candidates as they are also known to sometimes suffer from sunburn (Kritzler 1952; McLellan et al. 2004), which tends to lead to a darkening of the integument (Martinez-Levasseur et al. 2011). A varying degree of sunburn has also been observed among free-swimming whales (Martinez-Levasseur et al. 2011); hence, there may be an inevitable spread in reflectance among individuals of the same species, as a result of varied degrees of sun exposure. As such, the reflectance of a recently stranded whale is as close as feasible to the reflectance of a mobile whale, when immobility of the animal is required (e.g. when using a hand-held spectroradiometer).
A spectral reflectance database for whales above the surface is particularly necessary to further investigate species differentiation and to estimate the maximum depth at which whales are visible in VHR satellite imagery. Two of the methods that could potentially evaluate this depth require knowledge of the spectral reflectance of whales above the sea surface. One method involves placing large panels at various depths and assessing which is the deepest panel visible, by acquiring a satellite image or by flying a drone or a plane over the area where the panels are installed. These panels will have to be calibrated to the spectral reflectance of whales above the sea surface. The second method makes use of an algorithm developed to estimate bathymetry from VHR satellite imagery, such as that of Stumpf et al. (2003). The algorithm necessitates knowledge of the spectral reflectance of whales above the surface, as well as at various known depths. The reflectance of whale skin below the surface could potentially be modelled using results from the first method mentioned earlier.
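As a rough illustration of the second method, the band-ratio form commonly attributed to Stumpf et al. (2003) estimates depth from the ratio of log-transformed reflectances in two bands. The sketch below assumes that form; the coefficients `m1` and `m0` are tunable values that would have to be fitted against targets at known depths, and every number shown is hypothetical rather than derived from whale data.

```python
import numpy as np

def ratio_depth(r_band_i, r_band_j, m1, m0, n=1000.0):
    """Depth estimate from a log-ratio of two band reflectances.
    n is a fixed constant keeping both logarithms positive."""
    r_i = np.asarray(r_band_i, dtype=float)
    r_j = np.asarray(r_band_j, dtype=float)
    ratio = np.log(n * r_i) / np.log(n * r_j)
    return m1 * ratio - m0

# Hypothetical coefficients and reflectances, for illustration only.
m1, m0 = 50.0, 45.0
depth_equal = ratio_depth(0.04, 0.04, m1, m0)   # ratio of 1
depth_other = ratio_depth(0.05, 0.04, m1, m0)   # higher band-i value
print(depth_equal, depth_other)
```

Fitting `m1` and `m0` is where above-surface and known-depth reflectances of whale integument would enter, which is why the database discussed here is a prerequisite.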

Conclusion
The spectral reflectance of fresh whale integument is different from the reflectance of thawed whale integument (stored at −20°C). The main reason seems to be the observed darkening of the integument as it spends an increasing amount of time in a freezer. This darkening might be initiated soon after death. Due to this observed darkening, all species showed similar reflectance, which was unexpected based on observations made by Abileah (2002) and Cubaynes et al. (2019). Therefore, we do not recommend using dead whale skin as an alternative to measuring the reflectance of live whales. We recommend two adjusted set-ups to collect the reflectance of live whales above the sea surface. One involves installing a hyperspectral camera on board a plane or UAV and flying it over whales. The other is to acquire the reflectance of live-stranded whales, where the stranding response teams could be trained to measure the reflectance using a spectroradiometer. However, we highlight that the primary focus should always remain on the welfare of the animals. Once more accurate reflectance measurements for different live whale species have been collected, they can be used to estimate the maximum depth of detection, which is necessary to calculate the visibility bias and ultimately to produce abundance estimates using VHR satellite imagery, as well as aerial surveys using manned aircraft or UAVs.

Supporting Information
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Using the distance matrix, we also used the non-metric multidimensional scaling function of the vegan 2.5-5 R package to visualize the grouping of the spectral reflectance in a multidimensional space (Figure S1).
Figure S1. Non-metric multidimensional scaling diagram showing each animal, labelled per species, and coloured based on the length of time spent in a freezer (at −20°C), from the shortest length of time (pale blue) to the longest length of time spent in a freezer (dark blue).
Figure S2. Comparison of the different agglomeration methods for hierarchical clustering using Spearman correlation.