QMUL School of Electronic Engineering and Computer Science
Centre for Digital Music Seminar Series
**Seminar by Kazuyoshi Yoshii (Kyoto University, Japan)**
**Date/time: 2pm, Monday, 13th May 2019**
**Location: Bancroft Road, room 3.01** Campus map: https://www.qmul.ac.uk/media/qmul/docs/about/Mile-End_map-April2019.pdf
Open to students, staff, alumni, and the public; all welcome. Admission is FREE; no pre-booking required.
Title: Latest Advances in Multi-Way Low-Rank Factorization: Nonnegativity, Positive Semidefiniteness, and Joint Diagonalization
Abstract: Nonnegative matrix factorization (NMF) is the most basic method of nonnegativity-based low-rank factorization of matrix data (e.g., time-frequency data). It has been used intensively for audio source separation and automatic music transcription because, owing to its nonnegativity constraints, it decomposes the power spectrogram of a mixture signal into interpretable "parts" (e.g., musical notes). The naive "multi-way" extension of NMF is nonnegative tensor factorization (NTF), which can deal with tensor data (e.g., time-frequency-channel data). On the other hand, we proposed a more mathematically fundamental "multi-dimensional" extension of NMF called positive semidefinite tensor factorization (PSDTF) [Yoshii+ 2013], which approximates each PSD matrix given as input as a weighted sum of basis PSD matrices, in the same way that NMF approximates each nonnegative vector given as input as a weighted sum of basis nonnegative vectors. We further proposed a "multi-way" extension of PSDTF called correlated tensor factorization (CTF) [Yoshii 2018], which approximates a PSD matrix (the covariance matrix over all elements of an input tensor) as a sum of Kronecker products of multiple sets of basis PSD matrices. To reduce the impractically high computational complexity of CTF, we recently proposed a fast approximation of CTF based on joint diagonalization of the basis PSD matrices, called independent low-rank tensor analysis (ILRTA) [Yoshii+ 2018].
Interestingly, we showed that CTF and its counterpart ILRTA, which are ultimate multi-way covariance modeling frameworks, include as special cases state-of-the-art single- and multi-channel source separation methods developed independently by different researchers, such as Itakura-Saito NMF [Fevotte+ 2009], transform-learning NMF (TL-NMF) [Fagot+ 2018], multi-channel NMF (MNMF) [Sawada+ 2013], independent low-rank matrix analysis (ILRMA) [Kitamura+ 2016], fast full-rank spatial covariance analysis (FastFCA) [Ito+ 2018], and independent positive semidefinite tensor analysis (IPSDTA) [Ikeshita 2018].
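For readers new to these models, the following NumPy sketch illustrates only the model structures named in the abstract: the NMF reconstruction, a PSDTF-style weighted sum of basis PSD matrices, and a CTF-style sum of Kronecker products. It is not the estimation procedure from any of the cited papers. All sizes, variable names, and the random "bases" are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
F, N, K = 4, 5, 3  # hypothetical sizes: frequency bins, time frames, bases


def random_psd(dim, rng):
    """Return a random symmetric positive semidefinite matrix (A @ A.T)."""
    A = rng.standard_normal((dim, dim))
    return A @ A.T


# NMF model: column n of X is approximated by sum_k H[k, n] * W[:, k].
W = rng.random((F, K))  # nonnegative basis vectors
H = rng.random((K, N))  # nonnegative weights (activations)
X_nmf = W @ H           # shape (F, N)

# PSDTF-style model: the n-th PSD matrix is sum_k H[k, n] * V[k].
V = np.stack([random_psd(F, rng) for _ in range(K)])  # shape (K, F, F)
X_psdtf = np.einsum("kn,kfg->nfg", H, V)              # shape (N, F, F)

# With diagonal basis matrices, the diagonals of the PSDTF-style
# reconstruction coincide with the NMF reconstruction.
V_diag = np.stack([np.diag(W[:, k]) for k in range(K)])
X_diag = np.einsum("kn,kfg->nfg", H, V_diag)
diag_matches_nmf = np.allclose(np.einsum("nff->fn", X_diag), X_nmf)

# CTF-style model (two-way case, assumed here): one large covariance
# matrix built as a sum of Kronecker products of basis PSD matrices.
A = [random_psd(F, rng) for _ in range(K)]  # frequency-side bases
B = [random_psd(N, rng) for _ in range(K)]  # time-side bases
Sigma = sum(np.kron(A[k], B[k]) for k in range(K))  # shape (F*N, F*N)
min_eig = np.linalg.eigvalsh(Sigma).min()  # nonnegative up to rounding

print(diag_matches_nmf, X_psdtf.shape, Sigma.shape)
```

The size of `Sigma` grows as the product of all tensor dimensions, which gives a feel for why the abstract describes the computational cost of full covariance modeling as impractical.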
Bio: Kazuyoshi Yoshii is a Senior Lecturer in the Speech and Audio Processing Group, Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Japan, and is concurrently the Team Leader of the Sound Scene Understanding Team at the RIKEN Center for Advanced Intelligence Project (AIP), Japan. He received a PhD in Informatics from Kyoto University in 2008. His research interests include statistical music analysis, microphone array processing, Bayesian modeling, and machine learning.