New data challenge: “Few-shot Bioacoustic Event Detection” at DCASE 2021

We’re pleased to announce a new data challenge: “Few-shot Bioacoustic Event Detection”, a new task within the “DCASE 2021” data challenge event.

We challenge YOU to create a system to detect the calls of birds, hyenas, meerkats and more.

This is a “few shot” task, meaning we only ever have a small number of examples of the sound to be detected. This is a great challenge for machine-learning students and researchers: it is not yet solved, and it is of great practical utility for scientists and conservationists monitoring animals in the wild.
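To give a flavour of what “few shot” means in practice, here is a minimal sketch of one common idea: average the handful of labelled examples into a “prototype” and label new audio frames by whichever prototype is nearest. This is only an illustration, not the challenge baseline or any official method. The function names and the toy 2-D feature values are hypothetical; a real system would work on features derived from audio, such as spectrogram embeddings.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def detect_events(support_pos, support_neg, query_frames):
    """Return True for each query frame that lies closer to the
    positive (target call) prototype than to the background prototype."""
    pos_proto = mean_vector(support_pos)
    neg_proto = mean_vector(support_neg)
    return [
        euclidean(frame, pos_proto) < euclidean(frame, neg_proto)
        for frame in query_frames
    ]


# Hypothetical toy features: three examples of the target call,
# three background frames, and three unlabelled frames to classify.
support_pos = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1]]
support_neg = [[0.0, 0.0], [0.1, -0.1], [-0.2, 0.1]]
query_frames = [[1.1, 0.9], [0.0, 0.1], [0.8, 1.0]]

print(detect_events(support_pos, support_neg, query_frames))
# → [True, False, True]
```

With only three positive examples the prototype is a rough estimate, which is exactly why the task is hard: a practical system needs features in which a few examples are enough to separate the target call from everything else.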

We are able to launch this task thanks to a great collaboration of people who contributed data from their own projects. These newly curated datasets come from projects recorded in Germany, the USA, Kenya and Poland.

The training and validation datasets are available now to download. You can use them to develop new recognition systems. In June, the test sets will be made available, and participants will submit the results from their systems for official scoring.

Much more information is available on the Few-shot Bioacoustic Event Detection DCASE 2021 page.


Posted in dcase

Call for PhD applications at the Machine Listening Lab

The Machine Listening Lab is welcoming PhD applications for September 2021 entry. Applicants from all nationalities can apply across different funding schemes and PhD programmes. Current PhD funding opportunities for September 2021 entry include:

Applicants are encouraged to contact prospective supervisors before submitting their application – please send an email to your selected supervisors with your CV and draft research proposal.

Suggested PhD topics offered by Machine Listening Lab academics as part of the AIM PhD Programme include:

Music interestingness in the brain

Supervisor: Dr Huy Phan
in collaboration with Aarhus University

Measuring the interestingness of a song while someone is listening to it will not only shed light on individual music perception, allowing personalised music recommendation, but also open up the possibility of using songs as a brain stimulus. This project aims to automatically measure the interestingness of a song in the brain using Ear-EEG.

An Ear-EEG device will be used to measure the brain signal (EEG) in the ear canals while a person is listening to a song. Machine learning algorithms (potentially deep neural networks) will then map the recorded EEG signal to an interestingness measure. Data will be collected from a cohort of young and healthy subjects recruited for this purpose. This data will allow different machine learning algorithms and techniques for interestingness modelling to be explored. Personalisation and multi-modal modelling, which combines music information (either raw signals or high-level musical features, e.g. melody, music genre, etc.) with the EEG, will also be investigated. This is a joint project with the Centre for Ear-EEG, Aarhus University, and the candidate is expected to work with academics in both C4DM and the Centre for Ear-EEG.

Meta-learning for music data

Supervisor: Dr Emmanouil Benetos

Meta-learning, or “learning to learn”, is an emerging area in the broader field of machine learning. Contrary to conventional machine learning approaches where a particular task is solved using a fixed learning algorithm, the main aim of meta-learning is to learn and improve the learning algorithm itself, so that it can absorb information from one task and generalise across unseen tasks. Meta-learning has various uses in machine learning applications, for example in cases where large datasets are unavailable or when we would like to rapidly learn something about a new task without training our model from scratch. It is also closely related to other emerging machine learning concepts, such as multi-task learning, transfer learning, few-shot learning, and self-supervised learning amongst others. While meta-learning has seen a dramatic rise in research interest in recent years, its principles have seen limited adoption in the intersection of music and AI research.

This PhD project will investigate methods for meta-learning applied to music data, such as audio recordings or music scores. The successful candidate will investigate, propose and develop novel machine learning methods and software tools for meta-learning, and will apply them to address tasks related to music and audio data analysis. This will result in methods that can rapidly learn from limited music data, or methods that can learn from one task and generalise to other unseen tasks related to music and audio data analysis.

Suggested PhD topics for studentships in Computer Science or Electronic Engineering programmes include:

Scalable audio event detection and localisation for domestic acoustic monitoring

Supervisor: Dr Huy Phan

Audio event detection and localisation, a highly active research topic, combines the “what” and “where” questions about occurring sound events. It would enable a wide range of novel applications, particularly domestic acoustic monitoring for healthcare. In this application, the target acoustic environments often differ from house to house, causing reverberation mismatch, particularly when a system is deployed in a totally new environment. This aspect remains uncharted in the current methods proposed for audio event detection and localisation, hindering scalable deployment and robustness of the system. This project aims to evaluate the robustness of state-of-the-art methods and to propose new machine-learning (potentially deep-learning) and inference methods to address this limitation. Furthermore, apart from being robust against environmental mismatch, such a scalable system should adapt itself to the resources available on a target device (e.g. IoT devices and mobile devices) and be able to detect events of interest as early as possible.


Voice and language analysis for personality disorder detection

Supervisor: Dr Huy Phan

Bipolar Disorder (BD) and Borderline Personality Disorder (BPD) are two major mental health disorders that can seriously affect the life of patients. An early and correct diagnosis of these diseases is of paramount importance for early intervention and treatment. This project aims to develop new methods to recognise these mental health disorders. We have interviewed a significant number of participants (both healthy and with disease status) and collected a voice database; the interviews were also transcribed. This database allows us to explore different machine learning methods, particularly deep learning, to analyse the bimodal data (voice and language) for disease recognition. Another important aspect of this project is that we are interested not only in recognition but also in identifying acoustic and linguistic markers that are relevant to, and hopefully underpin, the diseases.

Multi-task learning for music information retrieval

Supervisor: Dr Emmanouil Benetos

Music signals and music representations incorporate and express several concepts: pitches, onsets/offsets, chords, beats, instrument identities, sound sources, and key, to name but a few. In the field of music information retrieval, methods for automatically extracting information from audio typically focus on isolated concepts and tasks, thus ignoring the interdependencies and connections between musical concepts. Recent advances in machine and deep learning have shown the potential of multi-task learning (MTL), where multiple learning tasks are solved at the same time, while exploiting commonalities and differences across tasks. This research project will investigate methods for multi-task learning for music information retrieval. The successful candidate will investigate, propose and develop novel machine learning methods and software tools for jointly estimating multiple musical concepts from complex audio signals. This will result in improved learning efficiency and prediction accuracy when compared to task-specific models, and will help gain a deeper understanding of the connections between musical concepts.

Sound recognition in everyday environments

Supervisor: Dr Emmanouil Benetos

The emerging field of sound scene analysis refers to the development of software systems for automatically recognising everyday sounds and the environment/context of a recording. Applications of sound scene analysis include smart homes, urban planning, audio-based security/surveillance, indexing of sound archives, and acoustic ecology. This project will focus on recognizing sounds from everyday environments. You will carry out research and develop computational methods suitable for detecting overlapping sound events from noisy and complex audio, recorded in urban environments. In this project you will be based in the Machine Listening Lab in the Centre for Digital Music, developing new methods and software tools based on signal processing and machine learning theory.

Posted in Uncategorized

Welcoming new Lecturer: Bhusan Chettri

We’re pleased to welcome new Lecturer Bhusan Chettri to our group!

Bhusan started a full-time role this month as Lecturer in Data Analytics. He says:

“I completed my Ph.D. degree from QMUL in August 2020 under the supervision of Dr. Emmanouil Benetos and Dr. Bob Sturm (who is now at KTH, Sweden). During my Ph.D. studies, I was also working as a part-time Lecturer of Data Analytics at QMUL.

“My Ph.D. research focused on the design and analysis of secure voice biometrics. I look forward to continuing my work on fake audio/speech detection using generative models, representation learning, and adversarial attacks. Furthermore, I am also keen to explore interpretability for voice biometrics and anti-spoofing systems.”


Posted in Uncategorized

Honorary Lectureship for Helen Bear

We’re pleased to announce that Helen Bear (Yogi), who has been working with us since 2018, has been awarded a QMUL Honorary Lectureship for 3 years.

Yogi says:

“I’m delighted to continue being a part of the team in C4DM. Complementary to my work in industry, at QMUL I am excited to continue my work in applied AI for visual and audio domains. Most recently I have been working in environmental sound scene analysis for multiple tasks, such as audio geotagging. But additionally I have been creating partnerships across QM including clinicians at the St Barts NHS trust to use AI to support healthcare and patients.”

To learn more about Yogi and her work, you can read this recent interview in Wonk Magazine.

Posted in People

Welcoming new Lecturer: Huy Phan

We’re pleased to welcome new Lecturer Huy Phan to our group!

Huy is a Lecturer in AI, and joined the C4DM in April this year. His interests are a great match to the Machine Listening Lab, and we look forward to working together (remotely and in person!). Huy says:

“I am a Lecturer in AI at C4DM. Before joining QMUL, I was a postdoctoral research assistant at the University of Oxford and a lecturer at the University of Kent. I received my PhD degree from the University of Lübeck, Germany. I am interested in applying machine learning to temporal signal analysis and processing (e.g. audio, EEG).

“At C4DM, I hope to join forces with colleagues and students to make contributions to multi-view, multi-task, privacy-preserving, and non-iid generalisation perspectives of machine learning algorithms. I will focus on applications like audio event detection and localisation, audio scene classification, speech enhancement, and healthcare.”

Posted in People

New journal papers from the Machine Listening Lab

The Machine Listening Lab has recently had great success in publishing journal papers related to the lab’s research priorities of developing new machine learning and signal processing methodologies for audio and time series analysis. Our new work ranges from new methods for speaker anti-spoofing, to visibility graphs for large-scale time series analysis, to new evaluation methodologies for music prediction and transcription.

The list of our recently accepted and published journal papers can be found below; many of them are freely available or have links to preprints so you can read them already:

Posted in Publications

MLLab papers at ICASSP 2020

On 4-8 May 2020, several MLLab researchers will participate virtually in the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2020). ICASSP is the flagship conference of the IEEE’s Signal Processing Society and is the leading conference in the field of signal processing. This year, ICASSP is organised as a virtual conference offering free registration to attendees.

As in previous years, the Machine Listening Lab will have a strong presence at the conference, both in terms of numbers and overall impact. The following papers authored/co-authored by MLLab members will be presented:

Posted in Events, Publications

MLLab at WASPAA, DCASE and SANE 2019

In late October, Machine Listening Lab researchers will be participating in a series of back-to-back workshops in the United States focused on audio signal processing and computational scene analysis: the 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA 2019) taking place in New Paltz, NY, the 8th Speech and Audio in the Northeast workshop (SANE 2019) taking place in New York City, and the 4th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE 2019) taking place in New York City.

The following papers from MLLab members will be presented at WASPAA 2019:

The following papers will be presented at DCASE 2019:

Finally, the following posters will be presented at SANE 2019 (click here for poster abstracts):

  • “An extensible cluster-graph taxonomy for open set sound scene analysis”, by Helen L. Bear and Emmanouil Benetos
  • “Adversarial Attacks in Audio Applications”, by Vinod Subramanian, Emmanouil Benetos, and Mark B. Sandler
  • “Neural Machine Transcription for Sound Events”, by Arjun Pankajakshan, Helen L. Bear, and Emmanouil Benetos

See you in New York state and city!

Posted in dcase, Publications

MLLab at Interspeech 2019

The Machine Listening Lab will be participating in this year’s INTERSPEECH conference, taking place on 15-19 September 2019 in Graz, Austria. The following papers will be presented by MLLab members:

Posted in Events, Publications

Research visit: Akash Jaiswal – Machine learning to analyse acoustic bird monitoring data

This week we are welcoming Akash Jaiswal, a PhD student from Jawaharlal Nehru University (Delhi, India). Akash has obtained a Newton-Bhabha Fund PhD placement to spend 3 months in the UK working with us on using machine learning to analyse acoustic bird monitoring data.

The placement description is as follows:

Worldwide increasing urbanization is directly associated with the rapid transformation of landscapes, impacting more and more natural habitats. Biodiversity assessment is essential to improve the management and quality of these habitats for greater biodiversity benefit and better provisioning of ecosystem services, and also to understand the ecological changes shaping urban animal communities. Birds are representative taxa for biodiversity monitoring in terrestrial habitats and have been studied frequently in this context. It has been observed that bird communities in habitats close to urban infrastructures exhibit reduced species richness with a few successful species more dominant as compared to adjacent natural habitats. But the mechanisms creating such community-level changes are poorly understood.

My PhD project is to understand such variations in bird communities across different habitats in a fast-changing urban landscape like Delhi, using birds’ singing activity as a proxy for community composition. Acoustic monitoring of vocalizing animal communities proves to be less time-consuming and more resource-efficient than field surveys of biodiversity. In this context, the aim of this research work is to assess the efficacy of using eco-acoustic indices to measure and characterize avian biodiversity, and to apply them to account for community-level variation in vocalizing birds (the avian soundscape).

Although acoustic monitoring appears to be a promising solution for biodiversity assessment, the analysis of recorded acoustic samples remains challenging, as the manual detection and identification of individual species’ vocalizations from large volumes of field recordings is nearly impossible and also subject to observer bias/error. Automation of such analysis using machine learning techniques can facilitate identification of species and data analysis. Dr Dan Stowell at Queen Mary University of London has been working for years on machine learning techniques to study sound signals including birdsongs, music and environmental sounds. He is also working on automated processes for analyzing large amounts of sound recordings: detecting the bird sounds and their relation to each other.

I am visiting Dr Stowell’s lab and working under his supervision to learn and use machine learning methods and automated processes to facilitate the analysis of the large amount of audio data I am collecting during my field sampling. I also believe that his supervision will improve the quality and impact of my research work, and that it will help me apply this modern technique in my current and future projects.

Posted in Bird, Lab Updates