Call for PhD applications at the Machine Listening Lab – 2024 entry

Please find below outline PhD topics offered by academics in the Machine Listening Lab for a 2024 start. Candidates are invited to contact the relevant academic by email to discuss what the research will entail. In several cases, there is a significant opportunity to jointly shape the direction of the research.

Opportunities include internally and externally funded positions for PhD projects to start in Autumn 2024, with one position starting in Spring 2024. It is also possible to apply as a self-funded student or with funding from another source. Studentship opportunities include:

One PhD studentship at the Centre for Doctoral Training in AI and Music (home students, Spring 2024 start)
S&E Doctoral Research Studentships for Underrepresented Groups (home students, Autumn 2024 start, 4 positions funded across the Faculty of Science & Engineering)
CONACyT Scholarships (Autumn 2024 start)

Suggested PhD topics offered by Machine Listening Lab academics include:

Explainable AI for Sound Scene Analysis

Supervisors: Lin Wang and Emmanouil Benetos

Deep-learning models have revolutionized the state of the art in environmental sound recognition, driven by applications in healthcare, smart homes, and ambient assisted living. However, most of the systems used for these applications are black boxes and cannot be interpreted, i.e. the rationale behind their decisions is unknown. Despite remarkable advances in model performance, research on explainable machine learning in the audio domain is still limited. Applicants are invited to develop ideas to fill this gap by proposing explainable deep-learning models for automatic sound event detection and classification in real-life environments.

Deep Audio Inpainting for Musical Signals

Supervisor: Lin Wang

Real-life audio signals often suffer from local degradation and lost information. Examples include short audio intervals corrupted by impulse noise and clicks, or a clip of audio wiped out due to damaged digital media or packet loss in audio transmission. Audio inpainting is a class of techniques that aims to restore lost information with newly generated samples without introducing audible artifacts. In addition to digital restoration, audio inpainting is widely applied in audio editing (e.g. noise removal in live music recording) and music enhancement (e.g. audio bandwidth extension and super-resolution). Recently, inspired by the tremendous success of image and video inpainting, deep learning-based approaches have started attracting attention in the research community, but they are still in their infancy. Applicants are invited to develop ideas for adapting deep learning frameworks from other domains, including audio synthesis and image inpainting, to audio inpainting.

Resource-Efficient Machine Learning for Music

Supervisor: Emmanouil Benetos

State-of-the-art models for music information research are often very hard to run on small and embedded devices such as mobile phones, single-board computers, and other microprocessors. At the same time, the computational cost, footprint, and environmental impact of building and deploying deep learning models for making sense of music data are constantly increasing. This PhD project will investigate methods for creating resource-efficient models for music understanding, applied to various tasks in music information research that involve music audio data, such as automatic music transcription, audio fingerprinting, or music tagging. Methods to be investigated can include, but are not limited to, sparse training, network pruning, binary neural networks, post-training inference, and knowledge distillation. The successful candidate will investigate, propose, and develop novel machine learning methods and software tools for resource-efficient music understanding, and will apply them to tasks of their choice within the wider field of music information research. This will result in models that can be deployed on small or embedded devices, or in offline models whose learning and inference times and computational resource requirements are drastically reduced.