The Centre for Digital Music at Queen Mary University of London is inviting applications for PhD study starting in Autumn 2026, across various funding schemes. Below are suggested PhD topics offered by academics; interested applicants can apply for a PhD under one of these topics, or can propose their own topic. In all cases, prospective applicants are strongly encouraged to contact academics at C4DM to informally discuss prospective research topics.
Opportunities include internally and externally funded positions for PhD projects to start in Autumn 2026. It is also possible to apply as a self-funded student or with funding from another source. Studentship opportunities include:
- S&E Doctoral Research Studentships for Underrepresented Groups (UK home applicants, Autumn 2026 start, 3 positions funded across the Faculty of Science & Engineering)
- CSC PhD Studentships in Electronic Engineering and Computer Science (Autumn 2026 start, Chinese applicants, up to 4 nominations allocated for the Centre for Digital Music)
- International PhD Funding Schemes (Autumn 2026 start, numerous international funding agencies)
Each funding scheme has a dedicated application process and requirements. S&E Doctoral Research Studentships and CSC applications close on 28 January 2026 at 5pm UK time. Detailed information and application links can be found on the respective funding scheme pages, linked above.
AI Models of Music Understanding
Supervisor: Simon Dixon
Eligible funding schemes: S&E Studentships for Underrepresented Groups, CSC PhD Studentships, International PhD Funding Scheme
Music information retrieval (MIR) applies computing and engineering technologies to musical data to satisfy users' information needs. This topic involves the application of artificial intelligence technologies to the processing of music, either in audio or symbolic (score, MIDI) form. Applications include, for example, software to enhance the listening experience, music education, musical practice, or the scientific study of music. Topics of particular interest include automatic transcription of multi-instrumental music, providing feedback to music learners, incorporation of musical knowledge into data-driven deep learning approaches, and tracing the transmission of musical styles, ideas or influences across time or locations.
This topic description is intentionally general: applicants are expected to choose their own specific project within this broad area of research, according to their interests and experience. The research proposal should define the scope of the project, its relationship to the state of the art, the data and methods the applicant plans to use, and the expected outputs and means of evaluation.
Bridging Musical Intelligence and Machine Learning: Integrating Domain Knowledge into Music and Audio Representation Learning
Supervisor: George Fazekas
Eligible funding schemes: S&E Studentships for Underrepresented Groups, CSC PhD Studentships, International PhD Funding Scheme
Audio and music representation learning seeks to transform raw data into latent representations for downstream tasks such as classification, recommendation, retrieval and generation. While recent advances in deep learning, especially contrastive, self-supervised and diffusion-based approaches, have achieved impressive results, most remain purely data-driven and neglect domain-specific musical structures such as rhythm, melody, harmony, metrical hierarchy or genre-style traits.
This PhD project will explore ways to embed theoretical and structural knowledge into modern representation learning pipelines to enhance interpretability, controllability and performance. By incorporating, for example, symbolic or other structured representations, inductive biases, well-known principles exploited in classic DSP algorithms, or ontological constraints, the research aims to bridge the gap between data-driven models and the structured understanding of music and audio.
Potential directions include hybrid models that combine deep audio and symbolic embeddings, graph-based or relational learning of musical structure, and explainable methods for music analysis, production or generation. The project will also engage with principles of Ethical and Responsible AI: reducing data bias, improving transparency and supporting fair attribution of authorship.
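To make one of these directions concrete, the sketch below shows a CLIP-style contrastive objective tying an audio encoder to a symbolic (piano-roll) encoder, so that paired audio/score excerpts share an embedding space. This is a minimal illustration in PyTorch with random stand-in data and toy MLP encoders; the architectures, input features and loss temperature are placeholder assumptions, not a prescribed design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Toy MLP encoder; a real system would use a CNN or transformer."""
    def __init__(self, in_dim, emb_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, emb_dim))

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)  # unit-norm embeddings

audio_enc = Encoder(in_dim=64)   # stand-in for e.g. mel-spectrogram features
symb_enc = Encoder(in_dim=88)    # stand-in for e.g. a flattened piano roll

def contrastive_loss(za, zs, temperature=0.07):
    # Paired items are positives; every other pairing in the batch is a negative.
    logits = za @ zs.T / temperature
    labels = torch.arange(za.size(0))
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.T, labels)) / 2

audio = torch.randn(32, 64)      # a batch of paired excerpts (random stand-ins)
score = torch.randn(32, 88)
loss = contrastive_loss(audio_enc(audio), symb_enc(score))
loss.backward()
```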
Examples of relevant work include, but are not limited to:
Guinot, Quinton, Fazekas: “Semi-Supervised Contrastive Learning of Musical Representations”, ISMIR-2024
Yu, Fazekas: “Singing voice synthesis using differentiable LPC and glottal-flow-inspired wavetables”, ISMIR-2023
Agarwal, Wang, Richard: “F-StrIPE: Fast Structure-Informed Positional Encoding for Symbolic Music Generation”, ICASSP-2025
Automated machine learning for music understanding
Supervisor: Emmanouil Benetos
Eligible funding schemes: S&E Studentships for Underrepresented Groups, CSC PhD Studentships, International PhD Funding Scheme
The field of music information retrieval (MIR) has been growing for more than 20 years, with recent advances in deep learning having revolutionised the way machines can make sense of music data. At the same time, research in the field is still constrained by laborious tasks involving data preparation, feature extraction, model selection, architecture optimisation, hyperparameter optimisation, and transfer learning, to name but a few. Some of the model and experimental design choices made by MIR researchers also reflect their own biases.
Inspired by recent developments in machine learning and automation, this PhD project will investigate and develop automated machine learning methods which can be applied at any stage of the MIR pipeline so as to build music understanding models ready for deployment across a wide range of tasks. The project will also compare the automated decisions made at each step of the MIR pipeline with the manual model design choices made by researchers. The successful candidate will investigate, propose and develop novel deep learning methods for automating music understanding, resulting in models that can accelerate MIR research and contribute to the democratisation of AI.
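As a minimal illustration of what automating one step of the pipeline can mean, the sketch below uses Optuna to search over the width, depth and regularisation of a toy tagging classifier. The synthetic data and the scikit-learn MLP are placeholder assumptions; in a real MIR pipeline the search would range over feature extraction, architectures and training schedules.

```python
import numpy as np
import optuna
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 40))               # stand-in for e.g. mel-band features
y = (X[:, :5].sum(axis=1) > 0).astype(int)   # stand-in for a binary tag label

def objective(trial):
    # The search space itself is a design choice the project would automate.
    width = trial.suggest_int("width", 16, 256, log=True)
    depth = trial.suggest_int("depth", 1, 3)
    alpha = trial.suggest_float("alpha", 1e-5, 1e-1, log=True)
    clf = MLPClassifier(hidden_layer_sizes=(width,) * depth, alpha=alpha,
                        max_iter=300, random_state=0)
    return cross_val_score(clf, X, y, cv=3).mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=25)
print(study.best_params, study.best_value)   # the "automated" design decision
```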
Sonification techniques for understanding hidden processes of LLMs
Supervisors: Anna Xambó and Charalampos Saitis
Eligible funding schemes: S&E Studentships for Underrepresented Groups, CSC PhD Studentships, International PhD Funding Scheme
Large language models (LLMs) are artificial intelligence programs that can recognise and generate text, trained on huge datasets through a complex network of hidden processes. This PhD topic explores sonification techniques for LLMs as a means of better understanding how they process information. Can we treat LLM engines such as ChatGPT as musical instruments and listen to their internal processes? Can sonification techniques help us hear and see how information is processed? Compared to vinyl records or tape recordings, what is the acoustic signature, and what are the artefacts, that are distinctive of this new medium? This work will contribute to addressing an important challenge in AI: making the inner workings and hidden knowledge of models more interpretable for people.
Keywords: sonification, large language models (LLMs), explainable AI
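A minimal sketch of the idea, assuming the Hugging Face transformers library and GPT-2 as an open stand-in for an LLM: per-layer hidden-state magnitudes are mapped to the amplitudes of a stack of sine tones, so each token becomes a short chord whose timbre reflects internal activity. The frequency mapping and tone length are arbitrary placeholder choices; designing perceptually meaningful mappings is precisely the research question.

```python
import numpy as np
import torch
from scipy.io import wavfile
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True).eval()

text = "Listening to the hidden life of a language model."
with torch.no_grad():
    out = model(**tok(text, return_tensors="pt"))

# One activation-norm value per (layer, token); normalise to [0, 1].
acts = torch.stack(out.hidden_states)         # (layers, 1, tokens, dim)
norms = acts.norm(dim=-1).squeeze(1).numpy()  # (layers, tokens)
norms /= norms.max()

sr, tone_len = 22050, 0.15                    # sample rate, seconds per token
t = np.linspace(0, tone_len, int(sr * tone_len), endpoint=False)
audio = []
for tok_idx in range(norms.shape[1]):
    # Each layer drives one partial: layer l sounds at 110 * 2^(l/4) Hz,
    # with amplitude given by that layer's activation norm for this token.
    chord = sum(a * np.sin(2 * np.pi * (110 * 2 ** (l / 4)) * t)
                for l, a in enumerate(norms[:, tok_idx]))
    audio.append(chord / norms.shape[0])
wavfile.write("llm_sonified.wav", sr, np.concatenate(audio).astype(np.float32))
```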
Audio-visual sensing for machine intelligence
Supervisor: Lin Wang
Eligible funding schemes: S&E Studentships for Underrepresented Groups, International PhD Funding Scheme
The project aims to develop novel audio-visual signal processing and machine learning algorithms that help improve machine intelligence and autonomy in unknown environments, and to understand human behaviours when interacting with robots. The project will investigate the application of AI algorithms to audio-visual scene analysis in real-life environments. One example is to employ multimodal sensors, e.g. microphones and cameras, to analyse the various sources and events present in the acoustic environment. Tasks to be considered include audio-visual source separation, localisation/tracking, audio-visual event detection/recognition, and audio-visual scene understanding.
Interpretable AI for Sound Event Detection and Classification
Supervisors: Lin Wang and Emmanouil Benetos
Eligible funding schemes: S&E Studentships for Underrepresented Groups, CSC PhD Studentships, International PhD Funding Scheme
Deep-learning models have revolutionised state-of-the-art technologies for environmental sound recognition, motivated by applications in healthcare, smart homes and urban planning. However, most systems used for these applications are black boxes that cannot be inspected, so the rationale behind their decisions is obscure. Despite recent advances, there is still a lack of research on interpretable machine learning in the audio domain. Applicants are invited to develop ideas to close this gap by proposing interpretable deep-learning models for automatic sound event detection and classification in real-life environments.
Using machine learning to enhance simulation of sound phenomena
Supervisor: Josh Reiss
Eligible funding schemes: S&E Studentships for Underrepresented Groups, International PhD Funding Scheme
Physical models of sound generating phenomena are widely used in digital musical instruments, noise and vibration modelling, and sound effects. They can be of incredibly high quality, but they also often have a large number of free parameters that may not be determinable solely from an understanding of the phenomenon.
Machine learning from sample libraries could be the key to improving these physical models and speeding up the design process. Not only can optimisation approaches be used to select parameter values such that the output of the model matches recorded samples, but the accuracy of such an approach also gives insight into the limitations of a model. It further provides the opportunity to explore the overall performance of different physical modelling approaches, and to find out whether a model can be generalised to cover a large number of sounds with a relatively small number of exposed parameters.
This work will explore such approaches. It will build on recent high-impact research from the team on the optimisation of sound effect synthesis models. Existing physical models will be used, with parameter optimisation based on gradient descent. Performance will be compared against recent neural synthesis approaches, which often provide high-quality synthesis but lack a physical basis. The project will also seek to measure the extent to which entire sample libraries could be replaced by a small number of physical models with parameters set to match the samples in the library.
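A minimal sketch of the gradient-descent fitting loop, assuming PyTorch: a toy differentiable "physical model" (a single damped sinusoid, standing in for e.g. a modal synthesis model) is fitted to a target sound by descending a spectral-magnitude loss. The model, loss and optimiser settings are illustrative assumptions; real projects would use richer models, multi-scale perceptual losses and careful initialisation.

```python
import torch

sr = 16000
t = torch.arange(sr) / sr
win = torch.hann_window(512)

def modal_model(freq, decay, amp):
    # One decaying sinusoidal mode; a real physical model would sum many
    # modes derived from the geometry and material of the sounding object.
    return amp * torch.exp(-decay * t) * torch.sin(2 * torch.pi * freq * t)

def spec(x):
    return torch.stft(x, 512, window=win, return_complex=True).abs()

# The "recorded sample" to match; synthesised here with known parameters so
# the fit can be checked. In practice it would come from a sample library.
target_spec = spec(modal_model(torch.tensor(440.0), torch.tensor(3.0),
                               torch.tensor(0.8)))

# Free parameters of the model, initialised away from the target.
freq = torch.tensor(300.0, requires_grad=True)
decay = torch.tensor(1.0, requires_grad=True)
amp = torch.tensor(0.5, requires_grad=True)
opt = torch.optim.Adam([freq, decay, amp], lr=0.1)

for step in range(3000):
    opt.zero_grad()
    # Spectral loss: waveform MSE is nearly uninformative for frequency
    # errors. Note spectral losses are non-convex in frequency, which is
    # one of the model-limitation insights such fitting can expose.
    loss = (spec(modal_model(freq, decay, amp)) - target_spec).pow(2).mean()
    loss.backward()
    opt.step()

print(f"freq={freq.item():.1f} Hz  decay={decay.item():.2f}  amp={amp.item():.2f}")
```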
The student will have the opportunity to work closely with research engineers from the start-up company Nemisindo, though will also have the freedom to take the work in promising new directions. Publishing research in premier venues will be encouraged.
The project can be tailored to the skills of the researcher, and has the potential for high impact.
Intelligent audio production for the hearing impaired
Supervisor: Josh Reiss
Eligible funding schemes: S&E Studentships for Underrepresented Groups, International PhD Funding Scheme
This project will explore new approaches to audio production to address hearing loss, a growing concern with an aging population. The overall goal is to investigate, implement and validate original strategies for mixing audio content such that it can be delivered with improved perceptual quality for hearing impaired people.
Music content is typically recorded as multitracks, with different sound sources on different tracks. Similarly, soundtracks for television and radio content typically have dialogue, sound effects and music mixed together with normal-hearing listeners in mind. But a hearing impairment may result in this final mix sounding muddy and cluttered. The research team here have made strong advances in simulating hearing loss, understanding how to mix for hearing loss, and attempting to automatically deliver enhanced mixes for hearing loss. But these initial steps identified many unresolved issues and challenges. Why do hearing loss simulators differ from real-world hearing loss, and how can this be corrected? How should hearing loss simulators be evaluated, and how should they be used in the music production process? What is the best approach to mixing audio content to address hearing loss? These questions will be investigated in this project.
The project can be tailored to the skills of the researcher, and has the potential for high impact.
