A spatiotemporal analysis of acoustic interactions between great reed warblers (Acrocephalus arundinaceus) using microphone arrays and robot audition software HARK

Abstract Acoustic interactions are important for understanding intra- and interspecific communication in songbird communities from the viewpoint of soundscape ecology. It has been suggested that birds may divide up sound space to increase communication efficiency in such a manner that they tend to avoid overlap with other birds when they sing. We are interested in clarifying the dynamics underlying this process as an example of complex systems based on short-term behavioral plasticity. However, it is very problematic to manually collect spatiotemporal patterns of acoustic events in natural habitats using data derived from a standard single-channel recording of several species singing simultaneously. Our purpose here was to investigate fine-scale spatiotemporal acoustic interactions of the great reed warbler. We surveyed spatial and temporal patterns of several vocalizing color-banded great reed warblers (Acrocephalus arundinaceus) using HARK (Honda Research Institute Japan Audition for Robots with Kyoto University), an open-source software for robot audition, and three new 16-channel, stand-alone, water-resistant microphone arrays, named DACHO, placed in the birds' habitat. We first show that our system estimated the locations of two color-banded individuals' song posts with a mean error distance of 5.5 ± 4.5 m from the observed song posts. We then evaluated the temporal localization accuracy of the songs by comparing the durations of localized songs around the song posts with those annotated by human observers, obtaining an average accuracy score of 0.89 for one bird that stayed at one song post. We further found significant temporal overlap avoidance and an asymmetric relationship between the songs of the two singing individuals, using transfer entropy. We believe that our system and analytical approach contribute to a better understanding of fine-scale acoustic interactions in time and space in bird communities.


| INTRODUCTION
Acoustic interactions are important for understanding communication among species and individuals in songbird communities from the viewpoint of soundscape ecology (Gasc, Francomano, Dunning, & Pijanowski, 2016). In particular, the temporal dynamics of vocalizations are of interest because they are known to have ecological and behavioral implications (Catchpole & Slater, 2008). Birds may divide up sound space in such a manner that they tend to avoid overlap with the songs of other bird species or individuals in order to communicate with neighbors efficiently. There have been empirical studies on temporal sound space partitioning or overlap avoidance in the singing behaviors of songbirds across various time scales (Araya-Salas, Wojczulanis-Jakubas, Phillips, Mennill, & Wright, 2017; Brumm, 2006; Cody & Brown, 1969; Ficken, Ficken, & Hailman, 1974; Fleischer, Boarman, & Cody, 1985; Masco, Allesina, Mennill, & Pruett-Jones, 2016; Planqué & Slabbekoorn, 2008; Popp, Ficken, & Reinartz, 1985; Suzuki, Taylor, & Cody, 2012; Yang, Ma, & Slabbekoorn, 2014). We are interested in clarifying the dynamics underlying this process as an example of complex systems based on short-term behavioral plasticity (Tobias, Planqué, Cram, & Seddon, 2014) from both theoretical (Suzuki & Arita, 2014) and empirical standpoints (Suzuki, Hedley, & Cody, 2015). Traditionally, researchers have used a standard single-channel microphone to record bird songs and manually analyzed the recordings to study temporal patterns of the songs. However, it is problematic to manually collect spatiotemporal patterns of acoustic events in natural habitats using data derived from a single-channel recording of several species singing simultaneously.
Using a microphone array is a promising approach to acoustically monitor wildlife that produce sounds (Blumstein et al., 2011) because it can provide directional or spatial information about vocalizations from recordings. There have been several empirical studies that spatially localized bird songs using multiple microphones, in playback experiments (Mennill, Battiston, & Wilson, 2012; Mennill, Burt, Fristrup, & Vehrencamp, 2006) and in the localization of songs of antbirds in 2D (Collier, Kirschel, & Taylor, 2010) and 3D spaces (Harlow, Collier, Burkholder, & Taylor, 2013). A recent study showed that coordinated singing in lekking long-billed hermits depends on the distance between individuals, using six stereo microphones to roughly estimate the distance between birds. Hedley, Huang, and Yao (2017) also successfully demonstrated 3D direction-of-arrival estimation of up to four simulated singing birds, using two stereo field recorders. Our intent is to further investigate the usability of microphone arrays to study fine-grained spatiotemporal interactions of bird songs, such as soundscape partitioning.
In this study, we investigated spatiotemporal patterns of great reed warblers (Acrocephalus arundinaceus, GRWA) by combining microphone arrays and an open-source robot audition system for localization and separation. The singing behavior of this species has been extensively investigated because of its rich and complex song repertoire (Forstmeier, Hasselquist, Bensch, & Leisler, 2006; Forstmeier & Leisler, 2004; Hasselquist, Bensch, & von Schantz, 1996). However, as far as we know, no study has quantitatively discussed the existence of temporal partitioning of the sound space among neighboring individuals of this species.
Here, we describe the use of HARKBird to localize singing birds in a 2D space in the field. A preliminary analysis of spatial localization with three eight-channel microphone arrays showed reasonable accuracy in estimating the locations of the song posts of the GRWAs. In this study, we use the data obtained from three newly developed 16-channel, stand-alone, and water-resistant microphone arrays, named DACHO, to automatically record bird songs in the field for a detailed analysis of intraspecific competition. We also improved the localization algorithm. We first report the performance of our system. We then examine the temporal localization performance by comparing the beginning and end timings of localized songs with manually annotated data. Lastly, we examine the temporal overlap avoidance between the focal individuals based on manually annotated data, using randomization tests (Masco et al., 2016) and transfer entropy (Schreiber, 2000) for analyzing information flows in these complex systems (Bossomaier, Barnett, Harré, & Lizier, 2016).

| Bird observation
Males of the GRWA (Figure 1) declare and defend their territories with loud and persistent songs during the breeding season. Studies have shown that repertoire size may play an important role in attracting females and warning off potential rivals in the neighborhood (Forstmeier et al., 2006; Forstmeier & Leisler, 2004; Hasselquist et al., 1996).
We conducted a bird survey on 18 May 2016, during the early breeding season, on a bank of the Ibi river, Kaminogo district, Mie prefecture, in central Japan (35°34′59″N, 136°06′29″E). Figure 2 shows a map of the study site. The observations were conducted over approximately 6 hr starting at 6:00 a.m. Each of the 17 observation sessions was 20 min long. Using spotting scopes, observers recorded each bird's identity (based on the color bands on both legs), location, activities, and the timing of each activity. Observers determined the location of each bird using landmarks on drone imagery we had taken and marked posts in the field. Positions of landmarks were measured using a GPS (Trimble R10 GNSS; Trimble Inc., California, USA). It is difficult to identify the exact beginning and end of each song, so we instead reported the beginning and end times of a sequential series of songs, including short breaks between songs, typically lasting for a few minutes at each observed location.
Three GRWA males within our study area had been captured using a mist net for the purpose of this research and banded for identification prior to this recording experiment, two of which were visually confirmed during recording sessions (Figure 1). We confirmed the presence of five individuals at one point during our experiments, two of which were banded males. According to the field observations, there were a few additional males in the study area, at least one of which was visually confirmed to be a male without bands. This unbanded male briefly flew in and out of the study area, typical behavior of a young male floater in search of a vacant territory (Mérő & Žuljević, 2017).
Observed data were classified into four categories: the songs of RYB, RGY, other individuals except for RYB and RGY (OTH), and unknown individuals (UNK). RYB and RGY represent the songs of color-banded individuals whose bands are red-yellow-blue and red-green-yellow, respectively. OTH includes individuals of the same species that were not color-banded. UNK includes GRWA individuals for which it was unclear whether they were color-banded; thus, this category may include RYB and RGY. In cases where multiple individuals were singing simultaneously, we relied on localized results to help distinguish individuals.

| Recording with microphone arrays and song post localization
We used 16-channel, stand-alone, and water-resistant microphone arrays, named DACHO, specifically developed for bird observations in the field (WILD-BIRD-SONG-RECORDER; SYSTEM IN FRONTIER Inc., Tokyo, Japan). Each array consists of 16 microphones arranged within an egg-shaped frame, 17 cm in height and 13 cm in width (Figure 3). It records in a 16-channel, 16-bit, 16-kHz format. Recorded raw data are stored on SD cards and can be exported in either a raw or wave format for further analysis using customized software on a PC to which the array is connected via a USB interface. One can schedule a recording by preparing the time settings on a micro-SD card. See Table 1 for the specifications of the microphone array. We placed three microphone arrays in the reed marsh inhabited by the GRWAs (Figure 2) and conducted a scheduled recording for each observation session.
We used HARKBird to estimate the direction of arrival (DOA) of the sound sources acquired from each microphone array. The sound source localization algorithm of HARK is based on the MUltiple SIgnal Classification (MUSIC) method (Schmidt, 1986), applied to multiple spectrograms obtained with the short-time Fourier transform. See Suzuki et al. (2017) for additional details of HARKBird and Nakadai et al. (2010, 2017) for HARK. We adjusted the parameters to localize songs of the GRWAs as much as possible while suppressing other sounds (e.g., noise and songs of other bird species). An example of the DOAs of sound sources estimated by HARKBird (Figure 4) shows that the two GRWA individuals were singing alternately in different directions relative to a microphone array.
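As a rough illustration of the MUSIC principle underlying HARK's localizer (this is not HARK's implementation: HARK operates on multiple STFT bins with measured array transfer functions, whereas this sketch uses a single narrowband snapshot model for a simulated linear array, and all function and variable names are ours):

```python
import numpy as np

def music_spectrum(snapshots, mic_pos, wavelength, n_sources, grid_deg):
    """Narrowband MUSIC pseudospectrum for a linear array.
    snapshots: (n_mics, n_frames) complex observations at one frequency."""
    R = snapshots @ snapshots.conj().T / snapshots.shape[1]  # spatial covariance
    _, eigvecs = np.linalg.eigh(R)                           # ascending eigenvalues
    En = eigvecs[:, : len(mic_pos) - n_sources]              # noise subspace
    P = []
    for theta in np.deg2rad(grid_deg):
        a = np.exp(2j * np.pi * mic_pos * np.sin(theta) / wavelength)  # steering vector
        P.append(1.0 / np.real(a.conj() @ En @ En.conj().T @ a))       # peaks near DOAs
    return np.array(P)

# Simulate two sources at -20 and 40 degrees on an 8-mic, half-wavelength-spaced array
rng = np.random.default_rng(0)
wavelength = 0.1
mics = np.arange(8) * wavelength / 2
true_doas = np.array([-20.0, 40.0])
A = np.exp(2j * np.pi * np.outer(mics, np.sin(np.deg2rad(true_doas))) / wavelength)
S = rng.standard_normal((2, 500)) + 1j * rng.standard_normal((2, 500))
X = A @ S + 0.01 * (rng.standard_normal((8, 500)) + 1j * rng.standard_normal((8, 500)))

grid = np.arange(-90.0, 90.5, 0.5)
P = music_spectrum(X, mics, wavelength, n_sources=2, grid_deg=grid)
is_peak = (P[1:-1] > P[:-2]) & (P[1:-1] > P[2:])        # local maxima of the spectrum
cand = grid[1:-1][is_peak]
est = np.sort(cand[np.argsort(P[1:-1][is_peak])[-2:]])  # two strongest peaks
```

The two strongest peaks of the pseudospectrum land near the simulated DOAs of -20° and 40°.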
We used an improved algorithm for spatial localization based on the one adopted in Matsubayashi et al. (2017) (Figure 5). At each time frame DT (=0.2 s) of the localization process, we assumed that a half line arose from each microphone array toward the DOA of the estimated sound source. We used the center of mass of the three intersections of those three half lines as the estimated location of a 2D localized sound, indicated by a "+" in Figure 5.

FIGURE 1 A color-banded individual of the great reed warbler (RYB)

FIGURE 2 The 2D imagery of the study area. The numbered circles represent the locations of the three 16-channel microphone arrays
Because multiple sound sources can be localized in each time frame, we have to exclude cases in which the microphone arrays estimated the DOAs of different sound sources. For this purpose, we adopted a 2D localization result only when it met two requirements. First, the distances between all three intersections had to be equal to or smaller than R (=15 m) (indicated as a dotted line in Figure 5). Second, the beginning and end timings of all sound sources had to be within ranges of DB (=6.0 s) and DE (=1.0 s), respectively. The reason we used a large DB and a small DE is that a song of the GRWA begins with several short introductory notes that may be difficult to localize, while it ends with loud notes that are easily localized. Thus, the end timing is a good signal for discriminating different sound sources.
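The triangulation and rejection steps described above can be sketched as follows (a simplified illustration with our own function names, not HARKBird's code; the timing conditions DB and DE are omitted and only the spatial condition R is shown):

```python
import numpy as np

def ray_intersection(p1, d1, p2, d2):
    """Intersection of two 2D half lines p + t*d (t >= 0); None if none exists."""
    M = np.column_stack([d1, -d2])
    if abs(np.linalg.det(M)) < 1e-9:          # parallel rays
        return None
    t = np.linalg.solve(M, p2 - p1)
    if t[0] < 0 or t[1] < 0:                  # intersection behind an array
        return None
    return p1 + t[0] * d1

def localize_2d(arrays, doas_deg, R=15.0):
    """Center of mass of the three pairwise ray intersections, or None.
    arrays: three 2D array positions; doas_deg: one DOA per array (deg, CCW from +x)."""
    dirs = [np.array([np.cos(np.deg2rad(a)), np.sin(np.deg2rad(a))]) for a in doas_deg]
    pts = []
    for i in range(3):
        for j in range(i + 1, 3):
            p = ray_intersection(arrays[i], dirs[i], arrays[j], dirs[j])
            if p is None:
                return None
            pts.append(p)
    for i in range(3):                        # reject if intersections spread beyond R
        for j in range(i + 1, 3):
            if np.linalg.norm(pts[i] - pts[j]) > R:
                return None
    return np.mean(pts, axis=0)

# Three arrays and a source whose true position is (20, 10)
arrays = [np.array([0.0, 0.0]), np.array([40.0, 0.0]), np.array([20.0, 30.0])]
source = np.array([20.0, 10.0])
doas = [np.degrees(np.arctan2(*(source - a)[::-1])) for a in arrays]
est = localize_2d(arrays, doas)               # close to (20, 10)
```

Perturbing one DOA so the rays no longer agree makes the function return None, mimicking the rejection of inconsistent DOA combinations.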
Finally, we define the song duration of the spatially localized sound as that of the closest microphone array to the center of mass, where the sound was most clearly recorded. For example, in Figure 5, the song duration localized by the microphone array 1 was used. Note that because HARK with our setting requires a silence of 0.8 s to detect the end of each localized sound, we reduced the estimated duration of songs by 0.6 s.
From these results, we visualized the spatial distributions of the localized sound sources and the observed birds in all 17 sessions (Figure 6). We also calculated the distribution of the observed–localized distances (Figure 7). Specifically, at every time interval of 1 s, we measured the distance between each observed location and the closest localized location and visualized a histogram of this distance for each observed individual.
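The observed–localized distance computation can be sketched as follows (illustrative names and toy coordinates in meters; one value per 1-s frame that contains at least one localized source):

```python
import numpy as np

def nearest_distances(observed_pos, localized_by_frame):
    """For each frame, the distance from the observed location to the
    closest localized source in that frame (frames without sources are skipped)."""
    dists = []
    for frame, points in localized_by_frame.items():
        if points:
            d = np.linalg.norm(np.asarray(points) - observed_pos, axis=1)
            dists.append(d.min())
    return np.array(dists)

obs = np.array([10.0, 20.0])                          # observed song post
loc = {0: [[12.0, 23.0], [60.0, 20.0]], 1: [], 2: [[10.0, 25.0]]}
d = nearest_distances(obs, loc)                       # one value per nonempty frame
```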

| Temporal analysis
To determine whether temporal soundscape partitioning occurred between RYB and RGY, we compared the durations of bouts during which both individuals were observed and singing actively around their song posts in seven different sessions (Figure 8). We adopted these durations to extract more direct and clearer interactions between the target individuals, with minimum interruption from other individuals singing in the neighborhood.
To extract the temporal dynamics of the singing behavior of these two individuals, we presumed that sound sources localized around each bird's song post belonged to that bird. Specifically, we used the sources within a range of 15 m (RYB) and 10 m (RGY) from the locations indicated with "x" in Figure 6, respectively. We adopted the larger range for RYB because this individual tended to move more frequently around his song post, whereas the other remained at one spot. We assigned these sources to a timeline for each individual, assuming that there was no time overlap among sound sources. In cases of multiple overlapping song intervals, we adopted the one that began earlier. Trained researchers visually and auditorily examined the localization performance using Praat (Boersma, 2001). In this annotation process, we manually adjusted the beginnings and ends of songs, added mislocalized sounds, and removed sounds that were not songs of the targets. To examine whether significant temporal overlap avoidance existed between these individuals, we first focused on the solo singing duration, during which only one individual sang, using a Monte Carlo randomization test. We defined X_rand (X ∈ {RYB, RGY}) as the randomized time series of the individual X, in which the durations of nonsinging intervals were randomly shuffled from the original series.
We created 10,000 randomized datasets of RYB_rand and RGY_rand and calculated the solo durations for all of them. We regarded the proportion of these solo durations that were larger than the observed duration as the p-value for the null hypothesis (neither significant overlap nor significant overlap avoidance). There are two possibilities: these individuals may actively avoid overlapping or actively overlap with each other. We regarded the former case as significant when the p-value was smaller than 0.025 and the latter case as significant when the p-value was larger than 0.975 (two-tailed test). However, we expected that they were singing alternately (and thus avoiding overlaps), according to the human observations.
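The gap-shuffling randomization can be implemented along the following lines (a sketch on binary 1-s timelines with toy data; function names are ours, and this is not the actual analysis script):

```python
import numpy as np

def runs(x):
    """Run-length encode a binary array into (value, length) pairs."""
    out, start = [], 0
    for i in range(1, len(x) + 1):
        if i == len(x) or x[i] != x[start]:
            out.append((int(x[start]), i - start))
            start = i
    return out

def shuffle_gaps(x, rng):
    """Randomly shuffle the lengths of nonsinging (0) intervals,
    keeping the singing (1) intervals in their original order."""
    r = runs(x)
    gaps = [length for value, length in r if value == 0]
    rng.shuffle(gaps)
    it = iter(gaps)
    out = []
    for value, length in r:
        out.extend([value] * (length if value == 1 else next(it)))
    return np.array(out)

def solo_duration(x, y):
    """Total time during which exactly one of the two birds is singing."""
    return int(np.sum(x ^ y))

# Toy timelines (1 = singing): the two birds never overlap in the observed data
rng = np.random.default_rng(1)
x = np.zeros(60, dtype=int)
x[0:4] = 1; x[12:16] = 1; x[24:30] = 1; x[40:44] = 1
y = np.zeros(60, dtype=int)
y[5:10] = 1; y[17:22] = 1; y[31:38] = 1; y[46:52] = 1
obs = solo_duration(x, y)
null = [solo_duration(shuffle_gaps(x, rng), shuffle_gaps(y, rng)) for _ in range(1000)]
p = float(np.mean([d >= obs for d in null]))   # small p suggests overlap avoidance
```

Note that shuffling only the gap lengths preserves the number, order, and total duration of each bird's songs, so the null model randomizes only the relative timing of the two timelines.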
We also used SONG (Song Overlap Null model Generator), an R package (Masco et al., 2016), to examine whether there was a significant asymmetric effect from one individual to the other. Using SONG, we classified the overlapped duration of two individuals' (X and Y) songs into two cases: the duration during which a target individual X began to sing while the reference individual Y was singing (i.e., X actively overlapped with Y), and vice versa. We created 10,000 randomized datasets using the time series X_rand and Y, and calculated the durations during which the randomized target individual X_rand actively overlapped with Y for all data. We defined the proportion of those durations that were smaller than the observed value as the p-value for the null hypothesis.

We further measured the information flow from one individual's singing behavior to another using transfer entropy (Schreiber, 2000), which has recently been used to analyze information flows in complex systems (Bossomaier et al., 2016). This measure quantifies the expected amount of directional information flow from one time series to another. The transfer entropy T_{Y→X}(k, l) from a discrete time series Y to another discrete time series X is defined as

T_{Y→X}(k, l) = Σ p(x_{t+1}, x_t^(k), y_t^(l)) log ( p(x_{t+1} | x_t^(k), y_t^(l)) / p(x_{t+1} | x_t^(k)) ),   (1)

where x_t^(k) = (x_t, …, x_{t−k+1}) and y_t^(l) = (y_t, …, y_{t−l+1}) are the k and l most recent states of X and Y at time t, respectively, and the sum runs over all state combinations. For a statistical test, we created 10,000 randomized datasets using the time series X and Y_rand and calculated TE_{Y_rand→X}(k, l) for all data. The p-value for the null hypothesis (i.e., the information flow from Y to X is not significantly larger than in the randomized data) is the proportion of calculated values that are larger than the observed value TE_{Y→X}(k, l).

We expect that if an individual X actively avoids overlapping with Y (in the sense above), the singing behavior of X depends on the behavior of Y, and thus there should be a significant information flow from Y to X. In other words, the reference and target individuals correspond to the source and sink individuals, respectively.

FIGURE 6 The spatial distribution of localized and observed locations in all 17 recording sessions. A red "+" represents the former (i.e., localized) and a colored "o" the latter (i.e., observed). The four categories of observed birds, that is, RYB, RGY, OTH, and UNK, are color-coded yellow, green, orange, and gray, respectively. Blue rectangles represent 5 × 5 m areas in which more than 10 sound sources were spatially localized. A brighter color corresponds to a larger number of localized sounds within the area

FIGURE 7 The distribution of observed–localized distances of songs in all 17 recording sessions. At every time interval of 1 s, we measured the distance between each observed location and the closest localized location and visualized a histogram of this distance for each observed individual
Because transfer entropy is itself an asymmetric measure, and it is not natural to assume that the information flow can become significantly smaller than in the cases with randomized sources, we adopted a one-tailed test here. We also calculated a bootstrap estimate of the difference between the observed (TE) and expected (TE_rand) values of transfer entropy for each individual, using the data from all seven sessions.
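A plug-in estimate of transfer entropy (Schreiber, 2000) for discrete song/no-song sequences can be sketched as follows (an illustrative implementation in our own naming, not the code used in the study; the toy example makes X copy Y with a one-step delay, so the flow from Y to X is about 1 bit while the reverse flow is near zero):

```python
import numpy as np
from collections import Counter, defaultdict

def transfer_entropy(x, y, k=1, l=1):
    """Plug-in estimate (in bits) of T_{Y->X}(k, l) for discrete sequences:
    how much the l most recent states of Y reduce uncertainty about the
    next state of X beyond X's own k most recent states."""
    counts = Counter()
    for t in range(max(k, l), len(x)):
        counts[(x[t], tuple(x[t - k:t]), tuple(y[t - l:t]))] += 1
    total = sum(counts.values())
    c_hist = defaultdict(int)   # counts of (x history, y history)
    c_own = defaultdict(int)    # counts of (x history)
    c_next = defaultdict(int)   # counts of (next x, x history)
    for (xn, xh, yh), c in counts.items():
        c_hist[(xh, yh)] += c
        c_own[xh] += c
        c_next[(xn, xh)] += c
    te = 0.0
    for (xn, xh, yh), c in counts.items():
        p_full = c / c_hist[(xh, yh)]          # p(x_next | x hist, y hist)
        p_own = c_next[(xn, xh)] / c_own[xh]   # p(x_next | x hist)
        te += (c / total) * np.log2(p_full / p_own)
    return te

# Toy example: x copies y with a one-step delay
rng = np.random.default_rng(2)
y = rng.integers(0, 2, 5000)
x = np.roll(y, 1)
x[0] = 0
te_y_to_x = transfer_entropy(x, y)   # strong flow: about 1 bit
te_x_to_y = transfer_entropy(y, x)   # near zero (finite-sample bias only)
```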
In addition, although we could only observe a single duration (100–800 s) in session 17 in which RYB, RGY, and OTH were visually observed and actively singing, we further conducted the same analysis on the possible pairs of these three individuals to examine the interactions among them.

| Spatiotemporal distribution
The locations of the observed birds show that the two color-banded individuals, RYB and RGY, tended to stay and sing around their own song posts (Figure 6). RYB sang in a tree, and RGY sang within the reed bushes at the waterfront. UNK and OTH were sometimes observed around RYB, implying that these individuals might have competed with RYB as potential rivals.
Areas in which sound sources (represented as "+") were frequently localized are located in close proximity to each bird's song post as reported by human observers, showing that the songs of these individuals were successfully localized. We also see several additional areas in which sounds were frequently localized (blue rectangles in Figure 6). The mean observed–localized distance was 5.457 ± 4.468 m (Figure 7). The distances of RYB, OTH, and UNK tended to be distributed more broadly than those of RGY. This implies that they tended to move around their song posts more frequently than RGY, small movements that human observers cannot easily record.

FIGURE 8 The temporal and directional distribution of localized songs and observed locations in all 17 recording sessions. The horizontal axis represents time, and the vertical axis represents the direction from the center of mass of the three microphone arrays. The observed durations of RYB, RGY, OTH, and UNK are indicated by thick yellow, green, orange, and gray lines, respectively. Localized sound sources are shown as blue lines, whose brightness corresponds to the distance from the center of mass of the three arrays. Double-headed arrows represent durations of bouts during which both RYB and RGY were observed and singing actively around their song posts in seven different sessions
The second peak around 50 m corresponds to cases in which there were no sound sources around the song post. However, this does not necessarily mean that the localization was unsuccessful, because each observed song duration includes not only multiple songs but also short breaks in between songs (explained below). We expect that most of the sources localized about 50 m away from one song post are songs localized at the other song post.
In the temporal and directional distribution of localized sound sources and human observations in all 17 sessions (Figure 8), most of the localized sound sources (shown as blue lines), especially for RYB and RGY, aligned well with the human observations (indicated by thick lines). It should be noted that the localization results are more fine-grained than the human observations, in the sense that they include the beginning and end timings of each song. The results also show that RGY tended to stay at a fixed location while singing through all the sessions, without any conspecific rival around his song post. At the same time, RYB tended to move frequently around his song post, with potential rivals such as those coded as OTH and UNK. Note that UNK could actually be RYB when it went out of the observers' sight for a moment. RYB and OTH were likely competing for their song posts.

| Temporal localization
To evaluate the accuracy of the temporal dynamics of the automatically extracted song durations, we examined the durations of bouts during which both individuals were observed to be singing actively around their song posts in seven different sessions. These durations are indicated by double-headed arrows in Figure 8. In these durations, we visually confirmed the presence of the two singing individuals, RYB and RGY. OTH, however, was not visually confirmed, and his vocalizations in the recordings were considerably rarer than those of RYB and RGY.
We believe that these conditions allow us to infer direct competition between the focal individuals. As for the effects of vocalizations of other species on the great reed warblers, we believe that the GRWAs dominated the acoustic space around their song posts because of the much higher rate and loudness of their vocalizations.

The comparison between annotated and localized song bouts showed that there was a large difference in the accuracy, true-positive rate, and false-positive rate of song extraction between the two individuals (Figure 9). The songs of RGY, the individual that sang at one song post, were successfully localized, as indicated by a high true-positive rate and low false-positive rate, which resulted in high accuracy, ranging from 0.83 to 0.95 across all seven sessions. This high accuracy supports relying on the automatically extracted data for further analyses of this individual, although the data still need minor manual corrections (e.g., adjustment of the beginning and end of a song, addition of mislocalized sounds, and removal of sounds that were not songs of the targets).

In contrast, the accuracy of the extracted songs of RYB fell in a wider range, from 0.67 to 0.89, lower than that of RGY. This is because the true-positive rate of his songs was lower than that of RGY, which was attributed to the behavior of RYB. This individual frequently flew around his territory or went behind the tree where his song post was located, which made localization difficult and increased FN. The lower accuracy is also due to a higher false-positive rate. This could occur accidentally when other species (e.g., Japanese bush warbler (Horornis diphone) or coal tit (Periparus ater insularis)) or other GRWA males occasionally sang in a similar direction toward microphone array 1, because the songs of RYB and these individuals could then be localized as a single source. Despite these limitations, the accuracy values are sufficient for the results to serve as initial estimates for subsequent manual annotation.

FIGURE 9 A comparison between annotated and localized song durations. We classified the durations into four categories: true positive (TP) when localized songs were actually observed, false positive (FP) when songs were localized but not observed, true negative (TN) when songs were not localized and not observed, and false negative (FN) when songs were not localized but actually observed. We evaluated the accuracy ((TP+TN)/(TP+FP+TN+FN)) of the results and ROC curves (defined by the true-positive rate (TP/(TP+FN)) and false-positive rate (FP/(FP+TN))) of individuals. The number next to each point represents the session ID of each duration, and the value in parentheses represents the accuracy of the localization result in the corresponding duration
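The frame-wise evaluation described in Figure 9 can be sketched as follows (hypothetical helper names; the observed and localized timelines are binary arrays at some fixed frame resolution):

```python
import numpy as np

def confusion(observed, localized):
    """Per-frame confusion counts between annotated and localized song frames."""
    obs = np.asarray(observed, dtype=bool)
    loc = np.asarray(localized, dtype=bool)
    tp = int(np.sum(loc & obs))    # localized and observed
    fp = int(np.sum(loc & ~obs))   # localized but not observed
    tn = int(np.sum(~loc & ~obs))  # neither localized nor observed
    fn = int(np.sum(~loc & obs))   # observed but not localized
    return tp, fp, tn, fn

def metrics(tp, fp, tn, fn):
    acc = (tp + tn) / (tp + fp + tn + fn)
    tpr = tp / (tp + fn) if tp + fn else 0.0   # true-positive rate
    fpr = fp / (fp + tn) if fp + tn else 0.0   # false-positive rate
    return acc, tpr, fpr

observed = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0]    # toy annotation
localized = [1, 1, 0, 0, 0, 1, 1, 1, 0, 0]   # toy localization result
acc, tpr, fpr = metrics(*confusion(observed, localized))
```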

| Temporal overlap avoidance
Using the annotated data, we investigated the temporal interactions between RYB and RGY based on their annotated song timings (Table 2). These two birds avoided overlapping, as indicated by their solo singing durations being significantly longer than expected by chance in almost all of the seven recording sessions (sessions 7, 12, 13, and 15: p < .001; sessions 10 and 11: p < .01). This soundscape partitioning might be realized by an asymmetric tendency in the overlap avoidance behavior. It should be noted that both the active overlap avoidance and the information flow showed a similar trend. While the information transfer itself does not tell us about its mechanism, the information flow from RYB to RGY is expected to be associated with the active overlap avoidance behavior of RGY, which depends on the preceding song of RYB.
In addition, we conducted the same analysis on the possible pairs of RYB, RGY, and OTH in a single duration (100–800 s) in session 17, in which these individuals were visually observed and actively singing. The results are summarized as follows: there was significant overlap avoidance between all pairs (p < .025). However, RYB was neither actively overlapping with nor actively avoiding overlap with the other two individuals, while both RGY and OTH were actively avoiding overlap with all the others (p < .025). There was significant information flow from RYB to OTH, from OTH to RYB, and from OTH to RGY (p < .025).
These results show that RYB was not actively avoiding overlap with the others, while RGY and OTH were actively avoiding overlap with RYB. This supports our claim that an asymmetric relationship exists among them. In other words, RYB was a driver individual in this soundscape of GRWA songs.

| DISCUSSION
A previous study used multiple synchronized recorders to generate eight-channel data. After manually extracting wildlife vocalizations, the authors estimated the spatial location of each sound source using the cross-correlation method. Their experiments using playback calls showed that the locational accuracy was much higher when the sound source was inside the area enclosed by the recorders than outside the boundary. While localization accuracy depends highly on the ecological situation and the properties of the localized songs, their locational accuracy of 10.22 ± 1.64 m for sounds broadcast outside the arrays is comparable to our localization accuracy. Collier et al. (2010) achieved much higher localization accuracy using eight four-channel microphone arrays and simple sounds within the perimeter of the area. We expect that we could also obtain much higher localization accuracy by either increasing the number of microphone arrays or enclosing focal individuals within the area surrounded by the arrays.
Acoustic monitoring using automatic recording devices has been of interest, but practical comparisons of manual and autonomous methods for bird vocalizations are still limited (Digby, Towsey, Bell, & Teal, 2013). We extracted the temporal pattern of song bouts in which two color-banded individuals were actively singing at their song posts using the localized results.

Temporal sound space partitioning and overlap avoidance (Cody & Brown, 1969; Ficken et al., 1974; Masco et al., 2016; Planqué & Slabbekoorn, 2008; Popp et al., 1985; Suzuki et al., 2012; Yang et al., 2014) have long been a focus of interest in ornithology. Our analysis shows that the two GRWAs were singing alternately in our study area. We also found that such a coordinated singing process was realized by asymmetric effects between individuals.

Measuring coordination behaviors among birds has been complicated, because distinguishing vocal coordination from patterns arising by chance is challenging using conventional statistical approaches (Masco et al., 2016). Recently, the concept of transfer entropy has been used to analyze asymmetric relationships among components of various complex systems (e.g., cellular automata, small-world networks, and swarms; Bossomaier et al., 2016). In our study, the statistical analysis of information transfer from one individual to another showed a tendency similar to that of active overlap avoidance. A stronger information flow from RYB to RGY indicates that the future behavior of RGY can be predicted more accurately by knowing the behavior of RYB. We could extract these song-by-song interactions by discretizing the continuous time data with a short time interval (1 s). Our results imply that transfer entropy is useful for measuring such short-term interactions based on behavioral plasticity in bird vocalizations.
Interaction networks of birds are also a topic of interest (Stowell, Gill, & Clayton, 2016; Tobias et al., 2014). Although we mainly focused on the songs of two focal individuals, a full assessment of the complex system composed of multiple birds would require investigating the songs of the other males in the study area as well.
An additional analysis on the three individuals RYB, RGY, and OTH supports our claim that there exists a directional network of interactions among them. In another study conducted in a forest in Japan, we found statistically significant overlap avoidance among three bird species and a symmetric effect from one species to another. We believe that large-scale experiments using our system will enable us to extract the spatiotemporal behaviors of a greater number of individuals to obtain a network of information flow as well as the spatial relationships among them. In fact, we have conducted larger-scale experiments using a greater number of microphone arrays in the field. According to a preliminary analysis, one problem of the current 2D localization method based on triangulation is a combinatorial explosion of possible sound source locations due to the increased number of DOAs recognized by many arrays. Resolving this problem is future work.
We could successfully localize songs of the great reed warbler for two main technical reasons, both of which were critical to avoid localizing unnecessary sound sources, including songs of other species. First, we could set the lower bound of the localized frequency range relatively high. Second, we could set the threshold value for the MUSIC spectrum relatively high. Thus, conducting sound source localization with different species-specific settings would enable us to obtain vocalizations of other species effectively. The effectiveness of tuning localization parameters was assessed in a test we conducted for forest birds, in which different species were successfully localized depending on the parameter settings.
In this study, we manually identified the songs of neighboring individuals (i.e., RYB, OTH, and UNK) using both the automatic localization results and human observations. Automated sound recognition is a recent development in bioacoustics and bird monitoring (Jahn, Ganchev, Marques, & Schuchmann, 2017) and a benchmark problem in machine learning (Goëau, Vellinga, Planqué, & Joly, 2016). We expect that song classification based on localized and separated sound sources (Kojima, Sugiyama, Suzuki, Nakadai, & Taylor, 2016) will contribute to tracking the behaviors of conspecific individuals in our system. Integrating these techniques into our automatic localization system will enable us to investigate fine-grained acoustic interactions in time and space in bird communities, for a deeper understanding of soundscape ecology (Gasc et al., 2016).

ETHICS STATEMENT
The protocols followed for the survey and to capture individuals of the great reed warbler were approved by the Division of Wildlife, the