C4DM at ISMIR 2024

by Emmanouil Benetos — Monday, 28 October 2024

On 10-14 November 2024, several C4DM researchers will participate in the 25th International Society for Music Information Retrieval Conference (ISMIR 2024). ISMIR is the leading conference in the field of music informatics, and is currently the top-cited publication for Music & Musicology (source: Google Scholar). This year ISMIR will take place onsite in San Francisco (CA, USA) and online.

As in previous years, the Centre for Digital Music will have a strong presence at ISMIR 2024.

In the Scientific Programme, the following papers are authored/co-authored by C4DM members:

Augment, Drop & Swap: Improving Diversity in LLM Captions for Efficient Music-Text Representation Learning (Ilaria Manco, Justin Salamon, Oriol Nieto)
Automatic Detection of Moral Values in Music Lyrics (Vjosa Preniqi, Iacopo Ghinassi, Julia Ive, Kyriaki Kalimeri, Charalampos Saitis)
Between the AI and Me: Analysing Listeners' Perspectives on AI- and Human-Composed Progressive Metal Music (Pedro Sarmento, Jackson Loth, Mathieu Barthet)
Can LLMs "Reason" in Music? An Evaluation of LLMs' Capability of Music Understanding and Generation (Ziya Zhou, Yuhang Wu, Zhiyue Wu, Xinyue Zhang, Ruibin Yuan, Yinghao Ma, Lu Wang, Emmanouil Benetos, Wei Xue, Yike Guo)
ComposerX: Multi-Agent Music Generation with LLMs (Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo)
Content-based Controls for Music Large-scale Language Modeling (Liwei Lin, Gus Xia, Junyan Jiang, Yixiao Zhang)
Diff-A-Riff: Musical Accompaniment Co-creation via Latent Diffusion Models (Javier Nistal, Marco Pasini, Cyran Aouameur, Maarten Grachten, Stefan Lattner)
Diff-MST: Differentiable Mixing Style Transfer (Soumya Sai Vanka, Christian J. Steinmetz, Jean-Baptiste Rolland, Joshua D. Reiss, George Fazekas)
From Audio Encoders to Piano Judges: Benchmarking Performance Understanding for Solo Piano (Huan Zhang, Jinhua Liang, Simon Dixon)
GAPS: A Large and Diverse Classical Guitar Dataset and Benchmark Transcription Model (Xavier Riley, Zixun Guo, Drew Edwards, Simon Dixon)
I can listen but cannot read: An evaluation of two-tower multimodal systems for instrument recognition (Yannis Vasilakis, Rachel Bittner, Johan Pauwels)
MIDI-to-Tab: Guitar Tablature Inference via Masked Language Modeling (Drew Edwards, Xavier Riley, Pedro Sarmento, Simon Dixon)
MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models (Benno Weck, Ilaria Manco, Emmanouil Benetos, Elio Quinton, George Fazekas, Dmitry Bogdanov)
Best Paper Nomination
Music2Latent: Consistency Autoencoders for Latent Audio Compression (Marco Pasini, Stefan Lattner, George Fazekas)
Semi-Supervised Contrastive Learning of Musical Representations (Julien Guinot, Elio Quinton, George Fazekas)
SpecMaskGIT: Masked Generative Modelling of Audio Spectrogram for Efficient Audio Synthesis and Beyond (Marco Comunità, Zhi Zhong, Akira Takahashi, Shiqi Yang, Mengjie Zhao, Koichi Saito, Yukara Ikemiya, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji)
ST-ITO: Controlling audio effects for style transfer with inference-time optimization (Christian J. Steinmetz, Shubhr Singh, Marco Comunità, Ilias Ibnyahya, Shanxin Yuan, Emmanouil Benetos, Joshua D. Reiss)
Best Paper Nomination

The following Tutorial will be presented by C4DM PhD student Ilaria Manco:

Connecting Music Audio and Natural Language (Seung Heon Doh, Ilaria Manco, Zachary Novack, Jong Wook Kim and Ke Chen)

The following journal paper published at TISMIR will be presented at the conference:

PiJAMA: Piano Jazz with Automatic MIDI Annotations (Drew Edwards, Simon Dixon, Emmanouil Benetos)

As part of the MIREX public evaluations, C4DM PhD student Yixiao Zhang is task captain for the Music Description & Captioning task.

The following Late-Breaking Demos will be showcased at the conference:

Diff-MST^C: A Mixing Style Transfer Prototype for Cubase (Soumya Sai Vanka, Lennart Hannink, Jean-Baptiste Rolland, George Fazekas)
Enhancement of Speech and Language Models through unsupervised Learning with Music Datasets (Eviatar Bas, Iran R Roman)
Enhanced Automatic Drum Transcription via Drum Stem Source Separation (Xavier Riley, Simon Dixon)
[Exploring Transformer-Based Music Overpainting for Jazz Piano Variations](https://ismir2024program.ismir.net/lbd_479.html) (Eleanor Row, Ivan Shanin, George Fazekas)
Source-level pitch and timbre editing for mixtures of tones using disentangled representations (Yin-Jyun Luo, Kin Wai Cheuk, Woosung Choi, Toshimitsu Uesaka, Keisuke Toyama, Wei-Hsiang Liao, Simon Dixon, Yuki Mitsufuji)
How does the teacher rate? Observations from the NeuroPiano dataset (Huan Zhang, Vincent K.M. Cheung, Hayato Nishioka, Simon Dixon, Shinichi Furuya)
ReVamp: Visualisation and analysis in the digital audio workstation (Chris Cannam, George Fazekas)

Finally, the following C4DM members are organising Satellite Events:

Elona Shatri as General Chair for WoRMS 2024
Ilaria Manco as Organising Committee member for NLP4MUSA 2024

See you at ISMIR!