QMUL School of Electronic Engineering and Computer Science
Centre for Digital Music Seminar Series
Seminar by:
Yin-Jyun Luo (PhD candidate at Singapore University of Technology and Design)
Kin Wai Cheuk (PhD candidate at Singapore University of Technology and Design)
Date/time: Monday 28th of October, 2-3pm
Location: Graduate Centre, Room 222 (number 18 on the campus map: https://www.qmul.ac.uk/media/qmul/docs/about/Mile-End_map-April2019.pdf)
Open to students, staff, alumni, and the public; all welcome. Admission is FREE; no pre-booking required.
1st Speaker: Yin-Jyun Luo
Title: Learning Disentangled Representations of Timbre and Pitch for Musical Instrument Sounds Using Gaussian Mixture Variational Autoencoders
Abstract: In this talk, I will share our empirical results on learning disentangled representations of musical instrument sounds using Gaussian mixture variational autoencoders (GMVAEs). Specifically, we disentangle the timbre and pitch of instrument notes, represented as latent timbre and pitch variables, by learning a separate neural network encoder for each. The distributions of the two latent variables are regularized by distinct Gaussian mixture distributions. A neural network decoder, which takes the concatenation of the timbre and pitch variables as input, synthesizes sounds with the desired timbre and pitch. The performance of the disentanglement network is evaluated both qualitatively and quantitatively, demonstrating the model's applicability to controllable sound synthesis and many-to-many timbre transfer.
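For readers unfamiliar with this kind of architecture, the PyTorch sketch below shows the wiring the abstract describes: two separate encoders, a reparameterized latent variable for each, and a decoder fed their concatenation. All layer sizes and names are illustrative assumptions, and the Gaussian mixture priors on the latents are omitted; this is a minimal sketch, not the authors' implementation.

# Illustrative sketch of the two-encoder/one-decoder wiring described
# in the abstract. Layer sizes, names, and the omitted Gaussian-mixture
# priors are assumptions, not the authors' code.
import torch
import torch.nn as nn

class DisentanglingVAE(nn.Module):
    def __init__(self, input_dim=512, timbre_dim=16, pitch_dim=16):
        super().__init__()
        # Separate encoders for the timbre and pitch latent variables.
        self.timbre_encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        self.pitch_encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        # Each encoder head predicts a mean and log-variance for its latent.
        self.timbre_mu = nn.Linear(256, timbre_dim)
        self.timbre_logvar = nn.Linear(256, timbre_dim)
        self.pitch_mu = nn.Linear(256, pitch_dim)
        self.pitch_logvar = nn.Linear(256, pitch_dim)
        # The decoder takes the concatenated latents and reconstructs the input.
        self.decoder = nn.Sequential(
            nn.Linear(timbre_dim + pitch_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim),
        )

    @staticmethod
    def reparameterize(mu, logvar):
        # Standard VAE reparameterization trick.
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

    def forward(self, x):
        h_t = self.timbre_encoder(x)
        h_p = self.pitch_encoder(x)
        z_timbre = self.reparameterize(self.timbre_mu(h_t), self.timbre_logvar(h_t))
        z_pitch = self.reparameterize(self.pitch_mu(h_p), self.pitch_logvar(h_p))
        # Concatenate the two latents before decoding; in the GMVAE each
        # latent's prior is a Gaussian mixture (omitted here for brevity).
        return self.decoder(torch.cat([z_timbre, z_pitch], dim=-1))

x = torch.randn(8, 512)           # e.g. a batch of spectrogram frames
x_hat = DisentanglingVAE()(x)     # reconstruction with the same shape

In a setup like this, many-to-many timbre transfer would amount to decoding the pitch latent of one note together with the timbre latent of another.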
Bio: Yin-Jyun Luo is a PhD candidate at Singapore University of Technology and Design, and is affiliated with the Institute of High Performance Computing, A*STAR, under the supervision of Professor Dorien Herremans and Professor Kat Agres. Previously, he was a research assistant in the Music and Culture Technology Lab led by Dr. Li Su at the Institute of Information Science, Academia Sinica, Taiwan. He holds a Master of Science in Music Technology from National Chiao Tung University, Taiwan. Yin-Jyun currently works on representation learning of music and audio using deep learning.
2nd Speaker: Kin Wai Cheuk
Title: nnAudio: A GPU Audio Processing Tool and Its Application to Music Transcription
Abstract: I will present nnAudio, a recently released neural-network-based audio processing toolbox. The toolbox leverages 1D convolutional neural networks for real-time spectrogram generation (time-domain to frequency-domain conversion), which lets us generate spectrograms on the fly, without storing any of them on disk, when training neural networks for audio-related tasks. In this talk, I will discuss one possible application of nnAudio: exploring suitable input representations for automatic music transcription (AMT).
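As a rough illustration of the on-the-fly workflow (a sketch following nnAudio's early public API; the argument values are arbitrary, and exact signatures may differ between versions):

# Illustrative use of nnAudio's on-the-fly spectrogram layers; the module
# path and arguments follow nnAudio's early releases and may differ in the
# version presented in the talk.
import torch
from nnAudio import Spectrogram

# The transform is a torch.nn.Module, so it can run on the GPU alongside
# the model, and spectrograms never need to be precomputed and stored on disk.
spec_layer = Spectrogram.STFT(n_fft=2048, hop_length=512, sr=22050)

waveform = torch.randn(4, 22050)   # a batch of 1-second dummy waveforms
spec = spec_layer(waveform)        # spectrograms computed on the fly
print(spec.shape)                  # (batch, freq_bins, time_steps, ...);
                                   # exact layout varies by version/output format

Because nnAudio also provides layers such as Spectrogram.CQT and Spectrogram.MelSpectrogram as drop-in alternatives, comparing input representations for AMT largely reduces to swapping this front-end layer.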
Bio: Kin Wai Cheuk is a PhD candidate at Singapore University of Technology and Design, and is affiliated with the Institute of High Performance Computing, A*STAR, under the supervision of Professor Dorien Herremans and Professor Kat Agres. He received both his Bachelor of Science in Physics (Minor in Music) and his Master of Philosophy in Mechanical Engineering from The University of Hong Kong. His research interests include neural-network-based music transcription and composition.