HumBug – An Acoustic Mosquito Monitoring Tool for use on budget smartphones

1. Mosquito surveys are time-consuming, expensive and can provide a biased spatial sample of occurrence data, with the data often representing the location of the surveys, not the occurrence of the mosquitoes.
2. We present the HumBug project, an acoustic system that can turn any Android smartphone into a mosquito sensor. Our sensor has the potential to significantly increase the quantity of mosquito occurrence data as well as access locations that are more difficult to survey by traditional means.
3. We describe our database of wild-captured mosquito flight tone audio data and outline the mosquito detection algorithms trained on these data. We also present our MozzWear App, designed to work on budget smartphones, which, together


| INTRODUCTION
There are over 100 genera of mosquito in the world containing over 3,500 species and they are found on every continent except Antarctica (Harbach). Only one genus (Anopheles) contains species capable of transmitting the parasites responsible for human malaria. It contains over 475 formally recognised species (Harbach), of which approximately 75 are vectors of human malaria and around 40 are considered truly dangerous (Service & Townson, 2002; Sinka et al., 2012). These 40 species are inadvertently responsible for more human deaths than any other creature (Coetzee, 2004). In 2018, for example, malaria caused around 228 million cases of disease across more than 100 countries resulting in an estimated 405,000 deaths (World Health Organization, 2019). It is imperative therefore to accurately locate and identify the few dangerous mosquito species among the many benign ones to achieve efficient mosquito control (Sinka et al., 2010, 2012). Mosquito surveys are used to establish vector species' composition and abundance, human biting rates and thus the vectorial capacity (potential to transmit a pathogen). Traditional survey methods, such as human landing catches, which collect mosquitoes as they land on the exposed skin of a collector, can be time-consuming and expensive, and are limited in the number of sites they can survey. They can also be subject to collector bias, either due to variability in the skill or experience of the collector or in their inherent attractiveness to local mosquito fauna. These surveys can also expose collectors to disease. Moreover, once the mosquitoes are collected, the specimens still need to undergo post-sampling processing for accurate species identification. Consequently, an affordable automated survey method that detects, identifies and counts mosquitoes could generate unprecedented levels of urgently needed high-quality occurrence and abundance data over extensive spatial and temporal scales.
The HumBug project (Figure 1) utilises this distinctive acoustic characteristic and explores the potential of using a budget smartphone to capture a mosquito's signature flight tone in the field and to provide real-time occurrence data. Here we describe our novel methodology for (a) passively capturing mosquito flight tones on a smartphone by exploiting the natural host-seeking behaviour of blood-feeding females, (b) incorporating these data into a web-based platform, (c) developing a mosquito flight tone database and (d) using this database to train machine learning algorithms to detect and identify wild mosquito species in sub-Saharan Africa. The outcome is a system in which any (Android) smartphone can become a mosquito sensor, potentially massively increasing the available vector occurrence data and providing invaluable information for entomologists, biologists, vector-borne disease modellers and vector control programs.

FIGURE 1
The HumBug project workflow. We deploy our sensor (a smartphone running our MozzWear App) into adapted bednets (HumBug Nets). The flight tone of the host-seeking mosquito is recorded as the mosquito tries to access the person inside the net. The acoustic data are uploaded to the HumBug server for processing, where they are passed through an algorithm pipeline that removes sections of human speech and refines the mosquito detection
The HumBug system has been designed with three user bases in mind. Firstly, as a tool to supplement and enhance ongoing mosquito monitoring programs to inform mosquito intervention policy.
Secondly, for use by the vector research community and disease modellers, to generate long-term comparable multi-site mosquito abundance data, identify peak mosquito activity as well as identify spatial occurrence over many more locations than are feasible with traditional mosquito survey methodologies. Finally, as a system for citizen scientists across the malaria endemic world to gather and upload mosquito data to help fight the vector-borne diseases that continue to impact on their lives.

| Capturing mosquito acoustic data on a smartphone
Mosquitoes are small insects and the physical movement of air caused by their beating wings creates the high-pitched whine of their flight tone. This quiet but distinctive sound can be difficult to detect even within moderate background noise. Thus, to ensure our smartphones record data that are loud and clear enough for reliable mosquito detection and species identification, we needed to complete two steps. First, we developed an App (MozzWear) to record the mosquito's flight tone using the in-built microphone on a smartphone and seamlessly stream the data to a central server. Second, we designed a means to ensure that a mosquito flies close enough to the smartphone microphone for its flight tone to be captured (the HumBug Net).

| The MozzWear App
The App is written in the Java programming language and was developed for the Android platform, given that smartphones running the Android operating system are widely available for as little as £30 per unit.
It has a simple, user-friendly interface (Figure 2) that allows the user to select whether they want to record on detection or manually activate the record function. The 'Record on Detection' option (prototype) uses the phone's hardware to detect mosquito sounds directly, and only these synchronise with the server. The 'Record' function records continuously once started. Here the App shows real-time plots of the mosquito detection output, based on the detection algorithm's predicted probabilities derived from the combination of audio features and a machine learning model (see Section 2.3.1). The App allows the user to adjust the recording length (minimum: 1 s, maximum: 1 hr). Recordings running longer than the set recording length produce multiple files; for example, a two-and-a-half hour recording period, where the recording length is set at 1 hr, produces three files: two 1-hr recordings and one 30-min recording. Limiting a single recording to a maximum of 1 hr ensures that the recording device is able to process and record the data without errors, while the automatic splitting into multiple files allows the device to record for as long as is needed. The App is designed to operate without an active data connection and records to the phone's internal memory. Once a recording session is complete and a data connection is available, the recorded files are uploaded to the HumBug server for further analysis (Section 2.3). For more details about the App functionality, please refer to Li et al. (2017).
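The file-splitting behaviour described above can be illustrated with a short sketch. It is written in Python for brevity and is not the App's Java code; the function name `split_recording` and its parameters are hypothetical.

```python
def split_recording(total_seconds, max_file_seconds=3600):
    """Return the duration in seconds of each file produced by one session.

    Hypothetical helper: a session longer than the configured recording
    length is cut into full-length files plus one shorter final file.
    """
    if total_seconds < 0 or max_file_seconds < 1:
        raise ValueError("durations must be non-negative and the limit at least 1 s")
    full_files, remainder = divmod(total_seconds, max_file_seconds)
    durations = [max_file_seconds] * full_files
    if remainder:
        durations.append(remainder)
    return durations


# A two-and-a-half hour session with the recording length set to 1 hr
print(split_recording(9000))  # [3600, 3600, 1800]
```

The result corresponds to the two 1-hr files and one 30-min file in the example given in the text.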
To transfer the recorded data from the smartphone to the HumBug server, the MozzWear App includes a synchronisation ('sync') function, which, once activated within range of Wi-Fi or a suitable mobile data network, makes an HTTP POST request to the server. This sends both the audio recording plus additional information detailing the recording time and device-specific identification data to a bespoke web application based on a Python web server, where it is received and stored in a MongoDB database. The database is subsequently searchable using queries. The stored data are accessible client-side via a dashboard which shows device recording ID as well as a visual representation of the audio. Future iterations will display mosquito detection outcomes and species classification probabilities.
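As a rough illustration of the sync step, the sketch below builds (but does not send) an HTTP POST request carrying one recording and its metadata, using only the Python standard library. The URL, the metadata field names and the `X-Recording-Metadata` header are assumptions made for this example; the actual HumBug request format is not specified here.

```python
import json
import urllib.request


def build_sync_request(audio_bytes, device_id, recorded_at, url):
    """Build an HTTP POST request for one recording (hypothetical format)."""
    # Assumed metadata fields: the text mentions recording time and
    # device-specific identification data, but not their names.
    metadata = {"device_id": device_id, "recorded_at": recorded_at}
    request = urllib.request.Request(url, data=audio_bytes, method="POST")
    request.add_header("Content-Type", "audio/aac")
    request.add_header("X-Recording-Metadata", json.dumps(metadata))
    return request


req = build_sync_request(
    b"\x00\x01",                      # placeholder audio payload
    "phone-001",                      # placeholder device identifier
    "2020-01-01T22:00:00Z",           # placeholder recording time
    "https://example.org/upload",     # placeholder URL, not the HumBug server
)
print(req.get_method())  # POST
# urllib.request.urlopen(req) would send the request; it is omitted here.
```

On the server side, a handler would read the body and metadata and store them as a database record, as described above.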

| The HumBug Net
Many of the most dangerous malaria mosquito vectors are active during the night, entering people's houses to bite them while they are asleep and vulnerable. Insecticide-treated bednets are widely distributed by vector control programs and provide a barrier between the mosquito and the sleeping human (although widespread insecticide resistance among mosquito populations has reduced the impact of the insecticide). We therefore deploy our sensor (a smartphone running our MozzWear App) into adapted bednets.

| Addressing privacy and security concerns related to using MozzWear in a bednet
One of the key deployment targets for our MozzWear App is overnight recordings in the HumBug Net, capturing the night-active mosquitoes known to be the most effective vectors of malaria. However, these recordings can potentially pick up background sounds including speech. Not only do such background sounds make mosquito detection more difficult but they can also lead to privacy and security concerns for the user. Therefore, voice activity needs to be detected and removed from the archived mosquito recordings. To facilitate this, we implemented a voice activity detection (VAD) pipeline (Ramirez et al., 2007). This pipeline operates as a binary classifier, detecting the presence and absence of a speech signal based on Google's WebRTC project, which is open-source, lightweight and reputed to be reasonably reliable and fast (Karrer, 2021). Sahoo (2018) tested the WebRTC VAD method over 396 hr of data, across multiple recording types. The approach was between 77% and 99.8% accurate. Although the method could perhaps be improved upon in the future, the WebRTC approach is robust and updates are supported by Google. The sections of recording with likely voice activity can be removed from subsequent analyses. This not only helps preserve privacy in our recordings but creates cleaner sections of data for more detailed mosquito analysis.
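The removal step can be sketched as follows, assuming a voice activity detector has already produced one True/False speech flag per fixed-length audio frame. The function names, the 30 ms frame length and the one-frame safety padding are illustrative assumptions, not details of the HumBug pipeline.

```python
def speech_intervals(frame_is_speech, frame_ms=30, padding_frames=1):
    """Merge per-frame speech flags into padded (start_ms, end_ms) intervals."""
    n = len(frame_is_speech)
    padded = [False] * n
    for i, flag in enumerate(frame_is_speech):
        if flag:
            # Also mark neighbouring frames, to avoid clipped speech edges
            for j in range(max(0, i - padding_frames), min(n, i + padding_frames + 1)):
                padded[j] = True
    intervals, start = [], None
    for i, flag in enumerate(padded + [False]):  # sentinel closes a trailing interval
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            intervals.append((start * frame_ms, i * frame_ms))
            start = None
    return intervals


def keep_non_speech(samples, intervals, sample_rate=8000):
    """Return the samples that fall outside every speech interval."""
    kept, cursor = [], 0
    for start_ms, end_ms in intervals:
        start = start_ms * sample_rate // 1000
        end = end_ms * sample_rate // 1000
        kept.extend(samples[cursor:start])
        cursor = end
    kept.extend(samples[cursor:])
    return kept


flags = [False, False, False, True, False, False, False, False]
intervals = speech_intervals(flags)
print(intervals)  # [(60, 150)]
audio = list(range(8 * 240))  # eight 30 ms frames at 8 kHz
print(len(keep_non_speech(audio, intervals)))  # 1200
```

Padding trades a little lost mosquito audio for a lower risk of retaining fragments of speech.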

| Development of a mosquito sound database
There are a number of variables that influence a mosquito's flight tone including the size of the mosquito (Sane, 2003; Unwin & Corbet, 1984; Villarreal et al., 2017), its age (Belton, 1986; Brogdon, 1994; Ogawa & Kanda, 1986) and the air temperature (Unwin & Corbet, 1984). To record the mosquito sounds (for detailed methodology, see: HumBug Project Website), each captured mosquito was placed into a sample cup large enough for free flight and its flight tone was recorded using a high-specification field microphone (Telinga EM-23) as well as using locally available budget smartphones (LAVA, ITEL, Alcatel) running our MozzWear App (Section 2.1.1). Each individual specimen was identified to species using its morphological characteristics. A number of prominent mosquito vector species belong to groups ('species complexes') of closely related siblings that are morphologically identical and can contain highly dangerous vectors alongside non-vector species. We therefore used standard PCR identification techniques (Scott et al., 1993) to fully identify mosquitoes from the An. funestus and An. gambiae species complexes.
Our database also holds flight tone data of multiple species re-

FIGURE 2
The MozzWear App interface is designed with simplicity in mind. Functionality includes detection-triggered recording as well as options for synchronising data records with the server

| Identification of mosquito species by their sounds
There are many studies that distinguish between different mosquito species using the fundamental frequency of their flight tone (Table 1).
The fundamental frequency, measured in Hertz, equates to the number of times the insect flaps its wings per second (Arthur et al., 2014; Belton & Costello, 1979; Williams & Galambos, 1950). However, fundamental frequency alone is not sufficient to differentiate between species (Chen et al., 2014), particularly when using datasets of wild-captured mosquitoes that exhibit natural variability (Figure 4). This led us to consider more data-driven approaches which can learn feature representations to maximally distinguish the various genera or species.
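For readers unfamiliar with the term, the sketch below shows one simple way to estimate a dominant frequency from a short audio clip, by scanning candidate frequencies and keeping the one with the greatest spectral power. It is an illustration only: the 200 to 1,000 Hz search range and the 5 Hz step are assumptions, and the strongest spectral peak is not guaranteed to be the fundamental in a real recording.

```python
import math


def dominant_frequency(samples, sample_rate, f_min=200.0, f_max=1000.0, step=5.0):
    """Return the candidate frequency (Hz) with the largest spectral power."""
    best_freq, best_power = f_min, -1.0
    freq = f_min
    while freq <= f_max:
        omega = 2.0 * math.pi * freq / sample_rate
        # Correlate the signal with a cosine and a sine at this frequency
        re = sum(s * math.cos(omega * n) for n, s in enumerate(samples))
        im = sum(s * math.sin(omega * n) for n, s in enumerate(samples))
        power = re * re + im * im
        if power > best_power:
            best_freq, best_power = freq, power
        freq += step
    return best_freq


# Synthetic test tone: a 600 Hz fundamental plus a weaker 1,200 Hz harmonic
sample_rate = 8000
tone = [
    math.sin(2 * math.pi * 600 * n / sample_rate)
    + 0.3 * math.sin(2 * math.pi * 1200 * n / sample_rate)
    for n in range(800)
]
print(dominant_frequency(tone, sample_rate))  # 600.0
```

A single number of this kind discards the harmonic structure and temporal variation that learned feature representations can exploit.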
Our previous work shows the possibility of distinguishing six species from data recorded in field studies in Thailand (Li et al., 2018). The HumBug workflow (Figure 1), therefore, makes use of both online and offline algorithms. Online, we perform functions critical to the discovery and recording of mosquitoes with our MozzWear smartphone App (see above). Our offline component, described below, is not limited by a phone's computational processing power, allowing more sophisticated modelling to be employed for the purpose of cleaning and preparing our audio data and for the final species identification.

| Mosquito detection
To identify mosquitoes according to their acoustic signature, we used our flight tone database to train a Bayesian convolutional neural network (Krizhevsky et al., 2017) with Monte Carlo dropout, following a similar architecture to our previous convolutional neural network (Kiskin et al., 2020). The audio was transformed to the spectral domain, in the form of 128 log-mel frequency coefficients.
Our choice of features follows the current trends in state-of-the-art algorithms on challenging, realistic datasets in the audio domain (Hershey et al., 2017; Purwins et al., 2019). The Bayesian nature of the algorithm is especially important for making predictions in the field, as we would like to accurately capture any uncertainty our model has about its operating conditions (as indeed our experimental results on the latest field trial data from Section 3.4 confirm). To our knowledge, this is the first application that combines Bayesian and deep learning methods in this field, which gives us realistic uncertainty estimation, as well as strong performance in supervised classification.
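To give a sense of what 128 log-mel frequency coefficients involve, the sketch below lays out 128 mel-spaced bands between 0 Hz and the 4 kHz Nyquist limit of 8 kHz audio, and applies log compression to band energies. The Hz-to-mel formula is one common convention and the helper names are hypothetical; the exact feature extraction settings used in the pipeline are not given here.

```python
import math


def hz_to_mel(freq_hz):
    """Convert Hz to mel using one widely used formula (assumed convention)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_mels=128, sample_rate=8000, f_min=0.0):
    """Return n_mels + 2 edge frequencies (Hz), equally spaced on the mel scale."""
    low, high = hz_to_mel(f_min), hz_to_mel(sample_rate / 2)
    return [mel_to_hz(low + (high - low) * i / (n_mels + 1)) for i in range(n_mels + 2)]


def log_compress(band_energies, eps=1e-10):
    """Take the natural log of each band energy, with a small floor to avoid log(0)."""
    return [math.log(energy + eps) for energy in band_energies]


edges = mel_band_edges()
print(len(edges))  # 130
```

Each band spans three consecutive edges (lower edge, centre, upper edge), so bands are narrower at low frequencies, where mosquito flight tones and their first harmonics lie, than at high frequencies.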
Future versions of the pipeline will identify and log the species as well as allowing data and SMS messages to be sent to the smartphone, conveying information and reminding the user to charge the phone and deploy it into the bednet each night. The native sampling rate of these devices is 8 kHz, which we have shown to be sufficient for the purposes of detection (Kiskin et al., 2020). Audio is compressed to 32 kbps Advanced Audio Coding (AAC) format to facilitate data transfer in rural African areas.

| MozzWear App
Furthermore, AAC is natively supported in Android and does not require additional third-party downloads to run the App. We have tested our algorithms to ensure no performance degradation has occurred due to compression. A thorough comparison is outside the scope of this paper but is part of ongoing work.
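The effect of compression on upload size can be estimated with simple arithmetic. The sketch below compares one hour of audio at 32 kbps with one hour of uncompressed audio at 8 kHz; the uncompressed format (16-bit, single channel) is an assumption made for the comparison, and container overhead is ignored.

```python
def bytes_per_hour(bits_per_second):
    """Size in bytes of one hour of audio at a constant bit rate."""
    return bits_per_second * 3600 // 8


pcm_bits_per_second = 8000 * 16   # 8 kHz sampling, 16-bit mono (assumed)
aac_bits_per_second = 32_000      # 32 kbps, as stated in the text

print(bytes_per_hour(pcm_bits_per_second) / 1e6)  # 57.6 (MB per hour)
print(bytes_per_hour(aac_bits_per_second) / 1e6)  # 14.4 (MB per hour)
```

Under these assumptions a 1-hr recording shrinks by a factor of four, which matters where mobile data is slow or costly.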

| HumBug Net
The HumBug Net has been assessed in a semi-field setting (publication in prep) to ensure optimal acoustic data collection with minimal impact on the comfort of the user. Two community trials are also currently underway. The first compares the efficacy of the HumBug system with accepted mosquito survey methodologies [Centers for Disease Control light traps (CDC-LTs) and human-baited nets]. The second is looking at community acceptance and engagement, examining the willingness of community members to sleep under a HumBug Net, to oversee the charging and placing of the smartphone in the adapted net pocket and to upload the audio data once per week. These data will demonstrate the potential for community members to generate much-needed longitudinal mosquito survey data and how well they engage with the HumBug concept as a whole.
Preliminary data confirm observations from Thailand, that the HumBug Net successfully captures distinct recordings of host-seeking mosquitoes (Figure 5).

TABLE 1
Published, species-specific mosquito wing beat frequencies. Upper rows reproduced/adapted from Clements (1999) with kind permission from CAB International, Wallingford, UK

Over a six-night pre-trial collection, the mean number of mosquitoes captured by aspiration from within the outer canopy of the HumBug Net in a traditional village house (Igumbiro) was 83. Midway through the collection, a CDC-LT, run in the same house, collected 108 mosquitoes. Although crude, this preliminary evaluation of the ongoing study indicates concordance between the two methods. However, comparing mosquito survey methodologies is notoriously complex. Different methods can target different species (e.g. the species abundance and diversity captured in a CDC-LT can be different to that collected in a Human Baited Net; Degefa et al., 2020; Tangena et al., 2015). Acoustic detection will also probably under-represent abundance and, as with all sampling methods, may over-represent certain species, for example those that are predisposed to aggressively seeking out a human host and are more active within the HumBug Net. This may also lead to potential double counting of single mosquitoes. However, the ability to deploy an automated system with minimal supervision and the potential for long-term data collection may outweigh these limitations.

| Database
Our current mosquito flight tone database, which to our knowledge is the largest and most comprehensive in the world, contains over 6,900 recordings of individual wild-captured mosquitoes from six genera

| Algorithms and detection capabilities
To build an offline detection algorithm, we train our model on a subset of data from our existing database, and test on a distinctly different subset, to help the generalisation of the algorithm to field data. Our detection algorithm correctly predicts noise with 97% accuracy, and mosquito with 89% accuracy, on 7.1 hr of database data (Figure 6). Our probabilistic model allows us to both estimate the presence of a mosquito and quantify how certain our model is in its predictions. Here we showcase its effectiveness on field data collected from South East Tanzania. As the certainty threshold is tightened, a greater proportion of predictions are correct, at the expense of an increase in false negative detections. We vary the certainty threshold, the mutual information (Houlsby et al., 2011), from its maximum value of 1.0 through a series of discrete steps as given in Table 4. We calculate the quantity of positives that the model produces for those values and estimate the true positive and negative rates by manually screening the detections for mosquito sound. We note that the dataset is heavily imbalanced, consisting overwhelmingly of noise. This results in the very high true negative rates, which were a strong point of the algorithm on out-of-sample testing, as evidenced by the confusion matrix (Figure 6). Our key result is that the model correctly understands its uncertainty, as the true positive rate increases monotonically with the tightening of the uncertainty threshold. Using a certainty threshold of 0.02 with probability of detection set to 0.7 or greater, Figure 7 shows audio with the 'mosquito' tag that was automatically extracted. The algorithm successfully finds a high-quality, diverse mix of mosquito audio (corresponding to the hand labels which were added upon screening the algorithm output).
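The thresholding described above can be sketched as follows, assuming each audio window has been passed through the network several times with dropout active, giving a list of sampled mosquito probabilities. Mutual information is computed as the entropy of the mean prediction minus the mean entropy of the individual predictions; base-2 logarithms are assumed because they give a maximum of 1.0 for a two-class problem. The function names are hypothetical and this is not the authors' code.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a two-class prediction with mosquito probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def summarise_mc_samples(probs):
    """Return (mean probability, mutual information) over stochastic passes."""
    mean_p = sum(probs) / len(probs)
    expected_entropy = sum(binary_entropy(p) for p in probs) / len(probs)
    return mean_p, binary_entropy(mean_p) - expected_entropy


def is_confident_mosquito(probs, p_min=0.7, mi_max=0.02):
    """Keep a window only if it is probably mosquito and the passes agree."""
    mean_p, mutual_information = summarise_mc_samples(probs)
    return mean_p >= p_min and mutual_information <= mi_max


print(is_confident_mosquito([0.90, 0.92, 0.88, 0.91]))  # True: passes agree
print(is_confident_mosquito([0.99, 0.50, 0.98, 0.60]))  # False: passes disagree
```

Lowering `mi_max` tightens the certainty threshold, so fewer windows are kept but a larger share of them should be genuine detections, which is the trade-off reported in the text.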
Following detection of mosquito audio, we can then further classify it into species, which we have shown to be possible in our earlier work (Li et al., 2018) and which remains an ongoing part of our current research.

ACKNOWLEDGEMENTS
The authors are grateful to the following people, all of whom have

FIGURE 7
Audio with the flag of 'mosquito' automatically extracted and concatenated by the detection algorithm on unlabelled field data (top). The corresponding hand labels added after prediction illustrate the high proportion of correct classifications