Bird Audio Detection challenge
Detecting bird sounds in audio is an important task for automatic wildlife monitoring, as well as in citizen science and audio library management. The current generation of software tools requires manual work from the user: choosing the algorithm, adjusting the settings, and post-processing the results. This is holding bioacoustics back from embracing its “big data” era: let’s make this better!
In collaboration with the IEEE Signal Processing Society we propose a research data challenge for you to create a robust and scalable bird detection algorithm. We offer new datasets collected in real live bioacoustics monitoring projects, and an objective, standardised evaluation framework – and prizes for the strongest submissions.
- NEWS, July 2018: read our paper covering the Bird Audio Detection challenge in full
- NEWS, March 2018: a new edition of the Bird Audio Detection challenge runs in 2018 as part of DCASE 2018
Main links for the original Bird Audio Detection challenge (2016/2017):
- View the results online
- Download the data
- Read the academic papers:
  - A full paper covering all the outcomes from the challenge (published in Methods in Ecology and Evolution, 2018)
  - Conference papers by challenge participants, presented at EUSIPCO 2017:
    - Cakir et al. (joint winner of judges’ award)
    - Thakur et al. (joint winner of judges’ award)
    - Kong et al.
    - Adavanne et al.
    - Grill and Schlüter (strongest-performing system)
    - Abrol et al.
  - Conference survey paper (MLSP 2016) – the paper in which we originally launched the challenge
There was also a discussion forum used during the challenge.
Task description and datasets
The task is to design a system that, given a short audio recording, returns a binary decision for the presence/absence of bird sound (bird sound of any kind). The output can be just “0” or “1”, but we encourage weighted/probability outputs in the continuous range [0,1] for the purposes of evaluation. For the main assessment we will use the well-known “Area Under the ROC Curve” (AUC) measure for classification performance.
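To make the evaluation measure concrete, here is a minimal sketch of AUC computed through the rank-sum (Mann-Whitney) identity, in plain Python. It is an illustration only, not the challenge's official scoring code; the function name `roc_auc` and the toy labels and scores are assumptions made for this example.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0/1 ground-truth values (1 = bird present).
    scores: iterable of detector outputs; higher means "more likely bird".
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks, giving tied scores the average of their ranks.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")

    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Toy example (made-up values): five clips, three of which contain birds.
truth = [0, 0, 1, 1, 1]
probabilities = [0.2, 0.6, 0.55, 0.7, 0.9]
hard_decisions = [1 if p >= 0.5 else 0 for p in probabilities]

print(round(roc_auc(truth, probabilities), 3))   # 0.833
print(round(roc_auc(truth, hard_decisions), 3))  # 0.75
```

AUC depends only on how the clips are ranked. Thresholding to hard 0/1 decisions collapses many clips into ties and discards ranking information, which is one reason continuous outputs are encouraged.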
Dataset 1: Field recordings (freefield1010)
Our first dataset comes from freefield1010 – a collection of over 7,000 excerpts from field recordings around the world, gathered by the FreeSound project, and then standardised for research. This collection is very diverse in location and environment, and for the BAD Challenge we have newly annotated it for the presence/absence of birds.
Dataset 2: Crowdsourced dataset (Warblr)
Our second dataset comes from a UK bird-sound crowdsourcing research spinout called Warblr. From this initiative we have 10,000 ten-second smartphone audio recordings from around the UK. The audio totals around 44 hours duration. The audio will be published by Warblr under a Creative Commons licence. The audio covers a wide distribution of UK locations and environments, and includes weather noise, traffic noise, human speech and even human bird imitations. It is directly representative of the data that is collected from a mobile crowdsourcing initiative.
Dataset 3: Remote monitoring dataset (Chernobyl)
Our third dataset comes from the Natural Environment Research Council (NERC)-funded TREE (Transfer-Exposure-Effects) research project, which is deploying unattended remote monitoring equipment in the Chernobyl Exclusion Zone (CEZ). This academic investigation into the long-term effects of the Chernobyl accident on local ecology is led by Dr Wood. The project has captured approximately 10,000 hours of audio to date (since June 2015). The audio covers a range of birds and includes weather, large mammal and insect noise sampled across various CEZ environments, including abandoned village, grassland and forest areas.
The task specification gives technical details of how participants will submit their analyses.
Please note: we wish to encourage the development of generalisable and “tuning-free” methods by ensuring that testing data is recorded under different conditions than the publicly-available data. This helps ensure that the challenge addresses the needs identified in recent research for methods that can operate in unseen conditions. The training data will come from freefield1010 and Warblr, and the testing data from a mixture of data, predominantly from the Chernobyl dataset.
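One way participants might approximate this train/test mismatch during development is to hold out one source dataset entirely and train on the other. The sketch below is a suggestion, not part of the challenge protocol; the record layout (dictionaries with `dataset`, `clip_id` and `has_bird` keys) and the function name `leave_one_dataset_out` are hypothetical.

```python
def leave_one_dataset_out(items):
    """Yield (held_out_name, train_items, test_items) for each source dataset.

    Each item is assumed to be a dict with a "dataset" key naming its source.
    """
    sources = sorted({item["dataset"] for item in items})
    for held_out in sources:
        train = [item for item in items if item["dataset"] != held_out]
        test = [item for item in items if item["dataset"] == held_out]
        yield held_out, train, test


# Made-up records standing in for annotated development clips.
clips = [
    {"dataset": "freefield1010", "clip_id": "a", "has_bird": 1},
    {"dataset": "freefield1010", "clip_id": "b", "has_bird": 0},
    {"dataset": "warblr", "clip_id": "c", "has_bird": 1},
    {"dataset": "warblr", "clip_id": "d", "has_bird": 0},
]

for held_out, train, test in leave_one_dataset_out(clips):
    print(held_out, len(train), len(test))
```

A score obtained this way only hints at how a method copes with unseen conditions, since the held-out development data still differs from the remote monitoring recordings used for testing.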
Download the development data:
- freefield1010: [data labels], [audio files (5.8 GB zip)] (or [via BitTorrent])
- Warblr: [data labels], [audio files (4.3 GB zip)] (or [via BitTorrent])
and the testing data:
- Test set, from Chernobyl and Warblr: [audio files (6.1 GB zip)] (or [via BitTorrent]), along with a blank submission CSV file
(Thanks to Figshare and archive.org for hosting data downloads.)
How the challenge will work
The interface for submitting results files will be a Kaggle-like interface designed for academic users. The main criterion for evaluating submissions will be the AUC, but we also aim to encourage computationally-efficient methods by recording the computation time taken.
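As a rough illustration of recording computation time alongside predictions, the following sketch wraps a detector with a wall-clock timer from the Python standard library. The names `timed_predictions` and `toy_detector` are hypothetical, and how the challenge platform itself measures time is not specified here.

```python
import time


def timed_predictions(detector, clips):
    """Run `detector` on every clip and return (scores, elapsed_seconds)."""
    start = time.perf_counter()
    scores = [detector(clip) for clip in clips]
    elapsed = time.perf_counter() - start
    return scores, elapsed


def toy_detector(clip):
    """Placeholder detector: mean absolute amplitude, clipped to [0, 1]."""
    if not clip:
        return 0.0
    energy = sum(abs(sample) for sample in clip) / len(clip)
    return min(1.0, energy)


# Made-up "audio" clips represented as short lists of samples.
example_clips = [[0.0, 0.1, -0.1], [0.5, -0.6, 0.7], []]
scores, seconds = timed_predictions(toy_detector, example_clips)
print(scores, f"{seconds:.6f} s")
```

Reporting time per clip (elapsed time divided by the number of clips) makes runs on different amounts of audio easier to compare.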
As well as the public online challenge, we are also organising:
- A conference special session on the topic, with contributions from international experts.
- A journal special issue on the topic.
Prizes: There will be one prize of £500 and one prize of €500, to be awarded to the strongest submissions. One prize will go to the submission with the best overall performance on the testing data (NB: to be eligible, teams must also provide documentation / source code for their method, since the aim is to advance the state of the art). The other prize will be the judges’ award, which will go to the submission that the judges feel has made a notable contribution to the topic. The organisers reserve the right to make the final decision on prize allocations, but please ask if you have any queries.
To stay informed about the challenge, please join our (low-traffic) public discussion email forum:
>> Join the group here. <<
Organisers
- Dr Dan Stowell, Queen Mary University of London, London, UK.
- Prof. Hervé Glotin, Scaled Acoustic BioDiversity Dept, Univ. of Toulon & Inst. Univ. de France.
- Prof. Yannis Stylianou, Computer Science Dept, University of Crete.
- Dr Mike Wood, University of Salford, Greater Manchester, UK.