OpenSoundscape: An open‐source bioacoustics analysis package for Python

Landscape‐scale bioacoustic projects have become a popular approach to biodiversity monitoring. Combining passive acoustic monitoring recordings and automated detection provides an effective means of monitoring sound‐producing species' occupancy and phenology and can lend insight into unobserved behaviours and patterns. The availability of low‐cost recording hardware has lowered barriers to large‐scale data collection, but technological barriers in data analysis remain a bottleneck for extracting biological insight from bioacoustic datasets. We provide a robust and open‐source Python toolkit for detecting and localizing biological sounds in acoustic data. OpenSoundscape provides access to automated acoustic detection, classification and localization methods through a simple and easy‐to‐use set of tools. Extensive documentation and tutorials provide step‐by‐step instructions and examples of end‐to‐end analysis of bioacoustic data. Here, we describe the functionality of this package and provide concise examples of bioacoustic analyses with OpenSoundscape. By providing an interface for bioacoustic data and methods, we hope this package will lead to increased adoption of bioacoustics methods and ultimately to enhanced insights for ecology and conservation.


| INTRODUCTION
The sounds of the natural world provide us with a unique opportunity to spy on ecological happenings that are otherwise hidden from observation. For centuries, naturalists have relied on keen ears for biological sounds as a way to identify and study sound-producing organisms such as birds and frogs. More recently, technologies for capturing and recognizing natural sounds have transformed the study of bioacoustics from a small-scale endeavour to a large-scale, data-driven discipline powered by remote sensing, much in the way that satellite imagery transformed cartography from an intimately local and experience-based practice to a massive-scale, data-driven one.
Large-scale bioacoustic monitoring projects have become a popular approach to biodiversity monitoring. Effective and affordable automated recording unit (ARU) hardware enables researchers to collect landscape-scale acoustic data. These are complemented by machine learning methods that provide efficient and accurate means of extracting species detections from the resulting audio data (Stowell, 2022). For instance, deep learning image recognition models have been used to recognize species-specific vocalizations of birds (e.g. Knight & Bayne, 2018; Ruff et al., 2020) and cetaceans (e.g. Bermant et al., 2019; Madhusudhana et al., 2021). Combining these two technologies, automated recording hardware and automated sound detection software, provides an effective means of monitoring species across space and time and can provide insight into unobserved behaviours and patterns.
Tools for analysing bioacoustic monitoring data are still catching up with the rapid expansion of data collection that has been enabled by new recording hardware (Ulloa et al., 2021). Various existing packages and software, summarized in the following section, provide interfaces to some aspects of data management, exploration, annotation or automated species detection. However, employing state-of-the-art methods to extract species detections from large-scale acoustic datasets still requires an advanced understanding of machine learning and computer programming, which has limited the adoption of these methods. In practice, many recent acoustic monitoring studies have relied on techniques such as template-based cross-correlation or energy detection, or have alternatively employed generic pre-trained classifiers (e.g. Cole et al., 2022; Toenies & Rich, 2021) even though training or fine-tuning automated classifiers using local data generally improves model performance (Lauha et al., 2022). When research groups have developed domain-specific deep learning classifiers, automated classifier performance can be excellent and can lead to novel biological insights (e.g. Bermant et al., 2019; Nolan et al., 2023; Wightman et al., 2022; Zhong et al., 2021). Making such analysis methods more accessible to researchers could broadly improve the quality of bioacoustic data analyses and enhance insights into ecological processes (Stowell, 2022).
Here, we present OpenSoundscape, an open-source Python package for detecting biological sounds of interest in acoustic monitoring data. OpenSoundscape provides access to powerful acoustic detection, classification, and localization methods including both machine learning and signal processing algorithms (Figure 1). The package represents a synthesis of more than 4 years of development motivated by the authors' direct applications of these tools to ecology and conservation research. Extensive package documentation and tutorials for OpenSoundscape provide step-by-step instructions and examples for manipulating audio and automating the recognition of biological sounds. OpenSoundscape has already been used in a variety of bioacoustics applications, from exploratory data analysis to automated recognition of frogs (Lapp et al., 2021), birds (Lapp, Larkin, et al., 2023; Malamut, 2022) and gunshots (Katsis et al., 2022) through various signal processing and deep learning methods.
The remainder of this manuscript is organized into five sections. First, we review related work and place OpenSoundscape in the context of existing software for automated species detection.
Second, we outline the key functionalities of OpenSoundscape (version 0.9.1). Third, we briefly discuss the development practices and design principles of the package. Fourth, we provide four concise examples of bioacoustic workflows with OpenSoundscape to demonstrate the utility of this package. Finally, we conclude by outlining future directions for the package.

| EXISTING SOFTWARE FOR AUTOMATED SPECIES DETECTION
OpenSoundscape is designed to tackle the steps of bioacoustic monitoring data analyses that concern detecting, measuring and localizing sounds of interest in audio recordings. Automating the detection of species in audio is a key step in bioacoustic monitoring, and is the focus of several existing software projects (e.g. Kaleidoscope Pro, Wildlife Acoustics, 2023; BirdNET). OpenSoundscape complements these tools by (1) providing default workflows that are streamlined and customized for bioacoustic data and (2) providing flexibility to adapt each aspect of this workflow to new projects and new domains.

FIGURE 1 The classes and functions of OpenSoundscape produce detections of biological sounds in time and space. Arrows in the diagram represent the flow of information from inputs to outputs. The classes and methods in the top-level API (beige box) can be called directly to use the core functionality of OpenSoundscape without knowledge of other parts of the API. API, application programming interface; CAM, class activation maps; CNN, convolutional neural network.

| LIBRARY OVERVIEW
OpenSoundscape is a bioacoustics toolkit for Python that provides a set of tools for detecting, classifying and localizing biological sounds in audio data. OpenSoundscape aims to achieve both simple interfaces accessible to non-programmers and powerful flexibility for addressing the complexity of diverse bioacoustic monitoring analysis tasks. It is designed to support scaling analyses across distributed computing systems. The package is publicly available on PyPI (pypi.org/project/opensoundscape) and GitHub (github.com/kitzeslab/opensoundscape) under an MIT licence, which allows unrestricted use and modification.
The primary functionality of OpenSoundscape is the development and application of automated algorithms for locating biological sounds of interest in space and time. In bioacoustics, some automated recognition tasks are well suited to machine learning while others are best solved with signal processing (Lapp et al., 2021), and OpenSoundscape provides functionality for both approaches.
OpenSoundscape interfaces with the popular PyTorch library (Paszke et al., 2019) for training and applying deep learning models. OpenSoundscape also provides intuitive and robust Audio and Spectrogram classes for interacting with audio data. The ability to retain, manipulate and inspect attributes and metadata of audio files alongside the raw sample data increases interpretability and reproducibility during acoustic data analyses, but to our knowledge, other available Python tools lack this functionality. In OpenSoundscape, the Audio and Spectrogram classes provide a streamlined and full-featured API for inspecting and manipulating audio and spectrogram data and their associated metadata.
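The value of carrying metadata through audio manipulations can be illustrated with a small standard-library sketch. The class and field names below are invented for illustration only; they are not OpenSoundscape's actual API:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class ClipInfo:
    """Toy stand-in for audio metadata; field names are illustrative,
    not OpenSoundscape's actual attributes."""
    source_file: str
    sample_rate: int
    recording_start: str
    offset_s: float = 0.0  # position of this clip within the source file

def trim(samples, info, start_s, end_s):
    """Trim raw samples while keeping the metadata consistent, so a clip
    always knows which file and offset it came from."""
    i0 = int(start_s * info.sample_rate)
    i1 = int(end_s * info.sample_rate)
    return samples[i0:i1], replace(info, offset_s=info.offset_s + start_s)

samples = [0.0] * 32000  # two seconds of silence at 16 kHz
info = ClipInfo("site1_20230401_0500.wav", 16000, "2023-04-01 05:00:00")
clip, clip_info = trim(samples, info, 0.5, 1.5)
# clip_info still records the source file and the clip's offset within it
```

Keeping this bookkeeping inside the audio object, rather than in ad hoc variables, is what makes downstream detections traceable back to a time in a specific recording.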
The top-level API of OpenSoundscape (beige box in Figure 1) consists of classes and functions that users can call directly to generate species detections from data files. Using these classes and methods directly provides the core functionality of OpenSoundscape.

Top-level API components:
• CNN class: Train CNNs with custom parameters and flexible architecture; generate predictions on audio data; save and load trained models; monitor training and inference progress and metrics through integration with the Weights and Biases platform (Biewald, 2020).
• signal_processing module: Access a set of signal processing tools, including the find_accel_sequences function used to detect Ruffed Grouse drumming in Lapp, Larkin, et al. (2023).
• localization module: Spatially localize audio events from time-synchronized recordings using the SynchronizedRecorderArray or SpatialEvent classes.
• ribbit function: Detect sounds with periodic amplitude modulation using the method described in Lapp et al. (2021).
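The idea behind detecting periodic amplitude modulation can be sketched in a few lines of NumPy: compute an amplitude envelope, then look for a peak in the envelope's spectrum at the expected pulse rate. This is a simplified illustration of the general approach, not the ribbit implementation itself:

```python
import numpy as np

def envelope_spectrum(signal, sr, frame_len=256):
    """Amplitude envelope (frame-wise RMS) and its magnitude spectrum.
    A peak in this spectrum at the expected pulse rate suggests that the
    target's periodic amplitude modulation is present."""
    n = len(signal) // frame_len
    frames = signal[:n * frame_len].reshape(n, frame_len)
    env = np.sqrt((frames ** 2).mean(axis=1))
    env -= env.mean()  # remove DC so constant loudness doesn't dominate
    freqs = np.fft.rfftfreq(n, d=frame_len / sr)
    return freqs, np.abs(np.fft.rfft(env))

# Synthetic test signal: a 1 kHz tone pulsed on and off 10 times per second
sr = 8000
t = np.arange(0, 2.0, 1 / sr)
pulses = (np.sin(2 * np.pi * 10 * t) > 0).astype(float)
freqs, spec = envelope_spectrum(np.sin(2 * np.pi * 1000 * t) * pulses, sr)
pulse_rate = freqs[1:][np.argmax(spec[1:])]  # skip the DC bin
```

Because the score depends on an interpretable quantity (pulses per second), a user can set it from knowledge of the target species' call rather than from training data.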
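Spatial localization from synchronized recorders rests on estimating time differences of arrival between microphones. A minimal NumPy sketch of delay estimation by cross-correlation, illustrative of the underlying idea rather than the localization module's implementation:

```python
import numpy as np

def tdoa_samples(reference, other):
    """Delay (in samples) of `other` relative to `reference`, estimated
    as the peak of their full cross-correlation."""
    corr = np.correlate(other, reference, mode="full")
    return int(np.argmax(corr)) - (len(reference) - 1)

# The same white-noise "call" arrives 30 samples later at a second recorder
rng = np.random.default_rng(0)
call = rng.standard_normal(1000)
arrival_b = np.concatenate([np.zeros(30), call])[:1000]
delay = tdoa_samples(call, arrival_b)
```

Dividing such a delay by the sample rate and multiplying by the speed of sound gives a distance difference between recorders; combining delays across several recorder pairs constrains the source position.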

Selected intermediate-level API components:
• Audio class: Load, manipulate and save audio files with the Audio class; read and update audio file metadata; retrieve and calculate parameters (e.g. sample rate, duration, and decibels full scale (dBFS)); trim, extend, normalize or loop audio; split audio into clips of equal length; extract audio segments from longer files.
• Spectrogram class: Calculate, plot and save the spectrogram of an audio object with custom parameters.
• BoxedAnnotations class: View and manipulate audio annotations; prepare annotated data for training and evaluation of automated classification algorithms; load and save annotation files; filter, aggregate, manipulate and correct labels across a dataset.
• Preprocessor class: Customize the preprocessing and augmentation of training data for machine learning models.
• CAM class: Produce class activation maps such as GradCAM (Selvaraju et al., 2020) for visualizing what regions of a sample cause a CNN to predict a particular class.
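A core task behind preparing annotated data for training is converting time-boxed annotations into labels for fixed-length clips. A simplified stand-alone sketch of that idea follows; the function name and overlap rule are illustrative, not the BoxedAnnotations API:

```python
def clip_labels(annotations, total_duration, clip_duration, min_overlap=0.25):
    """Convert time-boxed annotations (start_s, end_s, label) into
    per-clip multi-hot labels for fixed-length clips. A clip is positive
    for a class when an annotation overlaps at least `min_overlap`
    of the clip's duration."""
    classes = sorted({label for _, _, label in annotations})
    n_clips = int(total_duration // clip_duration)
    labels = [{c: 0 for c in classes} for _ in range(n_clips)]
    for start, end, label in annotations:
        for i in range(n_clips):
            clip_start, clip_end = i * clip_duration, (i + 1) * clip_duration
            overlap = min(end, clip_end) - max(start, clip_start)
            if overlap >= min_overlap * clip_duration:
                labels[i][label] = 1
    return labels

# One "EATO" (Eastern Towhee) song from 0.5 to 2.5 s in a 4 s file
labels = clip_labels([(0.5, 2.5, "EATO")], total_duration=4.0, clip_duration=1.0)
```

The minimum-overlap rule matters in practice: without it, a song barely clipping the edge of a window would label that window positive, adding noise to the training labels.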

Four Python notebooks demonstrating common workflows in OpenSoundscape are included in the Supporting Information and are hosted in a public GitHub repository (github.com/kitzeslab/demos-for-opso). Each notebook is described briefly below. Detailed documentation and tutorials covering the entirety of the OpenSoundscape package are available at opensoundscape.org.

| Notebook 1: Exploring the acoustic structure of Atelopus varius vocalizations
Because of the great diversity of biological sounds present in soundscapes, gaining a qualitative and quantitative understanding of audio data is an essential first step in any bioacoustic analysis.
This process can identify potential issues or pitfalls and provide insight for optimizing data quality and clarity (for instance, through spectrogram parameter selection and gain adjustments, or the removal of noisy or invalid audio files). Unlike previously published
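Spectrogram parameter selection involves a concrete time-frequency trade-off: longer windows sharpen frequency resolution but blur timing, as the settings compared in Figure 2 show. A small sketch of the arithmetic, assuming an illustrative 22,050 Hz sample rate:

```python
def stft_resolution(sample_rate, window_samples, overlap_fraction):
    """Time and frequency resolution implied by STFT parameters.
    Longer windows sharpen frequency resolution but blur timing."""
    freq_resolution = sample_rate / window_samples           # Hz per bin
    hop = window_samples * (1 - overlap_fraction)            # samples per step
    time_resolution = hop / sample_rate                      # seconds per column
    return freq_resolution, time_resolution

# The three settings compared in Figure 2, at an assumed 22,050 Hz sample rate
for w, o in [(256, 0.9), (512, 0.5), (1024, 0.5)]:
    f_res, t_res = stft_resolution(22050, w, o)
    print(f"window={w}: {f_res:.1f} Hz/bin, {t_res * 1000:.1f} ms/column")
```

Short windows suit rapidly pulsed signals; long windows suit tonal signals whose frequency structure must be resolved.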

| Notebooks 2 and 3: Training a CNN to classify bird songs
Deep learning models provide a powerful means of automating the detection and classification of complex biological sounds. When trained and used appropriately, these methods can provide accurate automation of acoustic detection that scales to analyses of thousands of hours of audio (Stowell, 2022). Although the process of training a machine learning model involves many decisions about preprocessing, augmentation and hyperparameters, OpenSoundscape simplifies this process by providing functionality and default parameter values tailored to bioacoustics. Notebook 2 first prepares training and test data from a public dataset of Raven-annotated audio (Chronister et al., 2021), then trains a CNN to recognize the vocalizations of seven bird species. Notebook 3 evaluates the performance of the CNN and demonstrates that it effectively recognizes bird songs such as that of the Eastern Towhee (Pipilo erythrophthalmus) in spectrograms (Figure 3).

FIGURE 2 Using Spectrogram settings to inspect the acoustic structure of a recording of Atelopus varius (the variable harlequin frog). (a) window_samples = 256 and overlap_fraction = 0.9; (b) default parameters, window_samples = 512 and overlap_fraction = 0.5; (c) window_samples = 1024 and overlap_fraction = 0.5. See Notebook 1 in the Supporting Information (S1) for further details and code.
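Evaluating a trained classifier, as Notebook 3 does, typically amounts to thresholding per-clip scores and computing metrics such as precision and recall against the labels. A minimal, library-free sketch with invented numbers:

```python
def precision_recall(scores, labels, threshold):
    """Binary precision and recall after thresholding classifier scores.
    Illustrative of the kind of evaluation performed in Notebook 3."""
    preds = [s >= threshold for s in scores]
    tp = sum(p and l for p, l in zip(preds, labels))
    fp = sum(p and not l for p, l in zip(preds, labels))
    fn = sum((not p) and l for p, l in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Four clips: scores from a classifier, 0/1 ground-truth labels
p, r = precision_recall([0.9, 0.8, 0.4, 0.2], [1, 0, 1, 0], threshold=0.5)
```

Sweeping the threshold and plotting the resulting precision-recall pairs is the standard way to choose an operating point suited to a study's tolerance for false positives versus missed detections.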

| Notebook 4: Detecting repeated element sounds with signal processing
Although deep learning models are popular and effective approaches to many automated detection problems, they can be difficult to apply when little to no training data is available. In these scenarios, which may be common for bioacoustics tasks, signal processing methods may be preferable. Unlike deep learning approaches, these methods have interpretable parameters that can be tuned to biologically relevant values by the user. They can be especially effective in detecting songs with stereotyped temporal structure, a feature of many anuran vocalizations and invertebrate stridulations (Lapp et al., 2021). This notebook demonstrates the use of two detection methods from the signal_processing module, using each to detect the song of the Northern Flicker (Colaptes auratus).
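The core of template-based detection mentioned above can be sketched as normalized cross-correlation between a template and sliding windows of the signal. This toy waveform-domain version is illustrative only and is not the signal_processing module's implementation; real template detectors typically correlate spectrograms rather than raw waveforms:

```python
import numpy as np

def template_matches(signal, template, threshold=0.8):
    """Offsets where the normalized cross-correlation between the template
    and a sliding window of the signal exceeds the threshold."""
    t = template - template.mean()
    t = t / np.linalg.norm(t)
    hits = []
    for i in range(len(signal) - len(template) + 1):
        w = signal[i:i + len(template)]
        w = w - w.mean()
        norm = np.linalg.norm(w)
        if norm > 0 and np.dot(w, t) / norm >= threshold:
            hits.append(i)
    return hits

# A 5-cycle tone burst embedded in an otherwise silent recording
template = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 200))
signal = np.zeros(2000)
signal[700:900] = template
hits = template_matches(signal, template)
```

The threshold here plays the same interpretable role as the tunable parameters described above: it can be set by inspecting scores on a handful of known positive and negative clips rather than by training on a large labelled dataset.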

| DEVELOPMENT PRACTICES
The OpenSoundscape source code is hosted on GitHub and published under the MIT licence, which allows unrestricted use and modification as well as public contributions to package development. On the GitHub web page, issues track bugs and feature requests for the code base, while discussions provide a public forum for user support and more general conversations. Continuous integration provided by GitHub ensures code quality through testing with PyTest (Krekel et al., 2004). In future development, OpenSoundscape will add integration with an online model repository for loading and using trained machine learning models and will support the import and export of cross-platform model formats.
Standardizing metadata for audio and audio annotations also represents an important area of development (Stowell, 2022).
The capability to store, modify and retrieve metadata associated with an audio file is important for data analysis and reproducibility in bioacoustics workflows, but the community has yet to adopt a standard format. Crowsetta (Nicholson, 2023) provides a much-needed common interface for several audio annotation formats, and support for integration with Crowsetta is a short-term priority for OpenSoundscape development. OpenSoundscape will also add support for additional audio metadata formats as they become adopted by the community.

FIGURE 3
Finally, future development of OpenSoundscape will work towards the release of a static and cohesive application programming interface (API). To achieve this, some modules currently included in the package may be ported to separate packages or repositories. Limiting the scope of the package and providing a static API will enhance the reliability of OpenSoundscape as a tool for reproducible research.

| CONCLUSION
The rapid uptake of acoustic monitoring as a surveying tool demonstrates that ecologists appreciate the potential value of this technology in ecology and conservation research. At the same time, the fields of artificial intelligence and machine learning have generated methods capable of synthesizing insights from unstructured data.
However, applying such methods to bioacoustic analysis currently requires expertise in the development of automated recognition methods and extensive project-specific code for adapting methods to the audio domain. By embedding these methods within an ecosystem of data pipelines and other bioacoustics tools, we hope that OpenSoundscape will allow users to connect their powerful data with powerful methods, ultimately leading to rich biological insights.

ACKNOWLEDGEMENTS
This work was financially supported by the Mascaro Center for Sustainable Innovation. We thank B. Moore, A. Watts and J. Jia for contributions to the OpenSoundscape codebase.

CONFLICT OF INTEREST STATEMENT
The authors declare no conflicts of interest.

PEER REVIEW
The peer review history for this article is available at https:

DATA AVAILABILITY STATEMENT
OpenSoundscape is available at https://github.com/kitzeslab/opensoundscape and on PyPI, and version 0.9.1 is published on Zenodo with the DOI https://doi.org/10.5281/zenodo.8170077. Documentation is hosted at https://opensoundscape.org. The demonstration notebooks are available in the Supporting Information and at https://github.com/kitzeslab/demos-for-opso, and version 1.0.0 is published on Zenodo with the DOI https://doi.org/10.5281/zenodo.8170062.