C4DM Seminar: Kavya Ranjan Saxena: Meta-learning-based domain adaptation for melody extraction
QMUL, School of Electronic Engineering and Computer Science
Centre for Digital Music Seminar Series
Seminar by: Kavya Ranjan Saxena
Date/time: Wednesday, 25th September 2024, 11am
Location: GC205, Graduate Centre Building, Mile End Campus, QMUL, E1 4NS
Zoom: https://qmul-ac-uk.zoom.us/j/2387202947
Title: Meta-learning-based domain adaptation for melody extraction
Abstract: Extracting the dominant pitch from polyphonic audio is a crucial task in music information retrieval. Training machine learning models to perform this task effectively requires a substantial amount of labelled audio data. In general, models trained on audio from one domain (the source) may not accurately extract pitch from audio in a different domain (the target). To boost performance, the models are adapted using a small amount of labelled data from the target domain, a method known as supervised domain adaptation. We use meta-learning as the supervised domain adaptation method for melody extraction, and propose a novel weighting technique to handle class imbalance when adapting to a few audio recordings in the target domain. Further, this method can be extended into an efficient interactive melody extraction method based on active adaptation. This extension selects the regions of the target audio that require human annotation using a confidence criterion based on normalized true class probability. The model then uses these annotations to adapt itself to the target domain via meta-learning. The meta-learning-based domain adaptation method is model-agnostic and can be applied to other non-adaptive melody extraction models to boost their performance.
Bio: Kavya Ranjan Saxena is a Ph.D. student at the Indian Institute of Technology Kanpur, India. Her research interests are in machine learning for signal processing with a focus on domain adaptation for melody extraction in the field of music information retrieval. Currently, she is working as an Intern – Speech Research Scientist at Krutrim (an Ola company), where her work focuses on Audio LLMs.
Presentation: [PDF Slides]