Audio and Music Signal Processing Mini-Symposium
October 20, 2011
Mitsubishi Electric Research Labs (MERL) is hosting a mini-symposium on audio and music signal processing, with three talks by eminent researchers in the field: Prof. Mark Plumbley, Dr. Cédric Févotte and Prof. Nobutaka Ono.
As space is limited, we would very much appreciate it if you could RSVP to let us know that you plan to attend.
Details
- Date: Thursday, October 20, 2011
- Time: 2:00 PM - 5:00 PM
- Hosts: Jonathan Le Roux, John R. Hershey (MERL Speech and Audio Team)
- Location: Mitsubishi Electric Research Labs (MERL), 201 Broadway, 8th Floor, Cambridge, MA 02139 (MERL is a few minutes' walk from MIT and the Kendall/MIT T station)
- https://www.merl.com/contact/
Schedule
1:45 pm | Doors open |
2:00 pm | Welcome and Introduction to MERL, Dr. Kent Wittenburg (MERL) |
2:20 pm | "Analysing Digital Music", Prof. Mark Plumbley (Queen Mary, London) |
3:00 pm | "Itakura-Saito nonnegative matrix factorization and friends for music signal decomposition", Dr. Cédric Févotte (CNRS - Telecom ParisTech, Paris) |
3:40 pm | "Auxiliary Function Approach to Source Localization and Separation", Prof. Nobutaka Ono (National Institute of Informatics, Tokyo) |
4:20 pm | Refreshments |
Abstracts
2:20 pm - 3:00 pm
Speaker: Prof. Mark Plumbley (Queen Mary, London)
Title: "Analysing Digital Music"
Abstract:
Although music has been "digital" since the introduction of the Compact
Disc over 20 years ago, the term "Digital Music" has only recently come
into widespread use, as computer and internet technologies have begun to
be used to analyze, discover and deliver music and associated
information to listeners. Much of this work is about finding
meaningful, semantic information about the music track, such as the
artist, instruments, genre (rock/pop/jazz), lyrics, key, notes, beats,
and so on. In this talk, I will explore some of the technologies
emerging in this exciting and evolving area. I will also talk about some
of our work in the analysis of musical audio signals, including
automatic music transcription, beat tracking, audio source separation,
and sound visualization.
Speaker Biography:
Prof. Mark Plumbley is Director of the Centre for Digital Music (C4DM)
at Queen Mary University of London. His research interests include the
analysis of audio and music signals, including beat tracking, automatic
music transcription and source separation, using techniques such as
neural networks, information theory, and sparse representations. He is
Principal Investigator on several current EPSRC grants, including
"Information Dynamics of Music" and "Sustainable Software for Digital
Music and Audio Research", and he holds an EPSRC Leadership Fellowship.
He leads the UK Digital Music Research Network, is Chair of the
International Independent Component Analysis (ICA) Steering Committee,
and is a member of the IEEE Audio and Acoustic Signal Processing
Technical Committee.
Prof. Mark Plumbley's Website
3:00 pm - 3:40 pm
Speaker: Dr. Cédric Févotte (CNRS - Telecom ParisTech, Paris)
Title: "Itakura-Saito nonnegative matrix factorization and friends for music signal decomposition"
Abstract:
Over the last 10 years, nonnegative matrix factorization (NMF) has
become a popular unsupervised dictionary learning/adaptive data
decomposition technique with applications in many fields. In particular,
much research about this topic has been driven by applications in audio,
where NMF has been applied with success to automatic music transcription
and single-channel source separation. In this setting the
nonnegative data is formed by the magnitude or power spectrogram of the
sound signal and is decomposed as the product of a dictionary matrix
containing elementary spectra representative of the data times an
activation matrix which contains the expansion coefficients of the data
frames in the dictionary.
After a general overview of NMF and a focus on majorization-minimization
(MM) algorithms for NMF, the presentation will discuss model selection
issues in the audio setting, pertaining to 1) the choice of
time-frequency representation (essentially, magnitude or power
spectrogram), and 2) the measure of fit used for the computation of the
factorization. We will give arguments in support of factorizing the
power spectrogram with the Itakura-Saito (IS) divergence. In particular,
IS-NMF is shown to be connected to maximum likelihood estimation of
variance parameters in a well-defined statistical model of superimposed
Gaussian components and this model is in turn shown to be well suited to
audio.
The presentation will then briefly address variants of IS-NMF, namely IS-NMF
with regularization of the activation coefficients (Markov model, group
sparsity), online IS-NMF, automatic relevance determination for model
order selection and multichannel IS-NMF. Audio source separation demos
will be played.
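As a rough illustration of the factorization described in this abstract (not the speaker's code), the sketch below runs IS-NMF with majorization-minimization multiplicative updates on a small synthetic nonnegative matrix standing in for a power spectrogram. The function names, matrix sizes, rank, iteration count and random seeds are all assumptions made for the example; the square-root update is one commonly cited MM form for the Itakura-Saito divergence.

```python
import numpy as np


def is_divergence(V, V_hat):
    """Itakura-Saito divergence, summed over entries: x/y - log(x/y) - 1."""
    R = V / V_hat
    return float(np.sum(R - np.log(R) - 1.0))


def is_nmf(V, n_components, n_iter=200, seed=0):
    """Approximate V (strictly positive, F x N) as W @ H under the IS divergence.

    W (F x K) plays the role of the dictionary of elementary spectra and
    H (K x N) the activations. Hypothetical helper, for illustration only.
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    # Strictly positive random initialization (an assumption of this sketch).
    W = rng.random((F, n_components)) + 0.1
    H = rng.random((n_components, N)) + 0.1
    history = []
    for _ in range(n_iter):
        # MM update of the activations with the dictionary held fixed.
        V_hat = W @ H
        H *= np.sqrt((W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat)))
        # MM update of the dictionary with the activations held fixed.
        V_hat = W @ H
        W *= np.sqrt(((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T))
        history.append(is_divergence(V, W @ H))
    return W, H, history


# Synthetic stand-in for a power spectrogram: an exactly rank-3 positive matrix.
data_rng = np.random.default_rng(1)
W_true = data_rng.random((20, 3)) + 0.1
H_true = data_rng.random((3, 30)) + 0.1
V = W_true @ H_true

W, H, history = is_nmf(V, n_components=3)
```

Under these assumptions the recorded divergence should not increase from one iteration to the next, which is the property that makes MM updates attractive in this setting.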
Speaker Biography:
Cédric Févotte obtained the State Engineering degree and the MSc degree
in Control and Computer Science from École Centrale de Nantes (France)
in 2000, and then the PhD degree in 2003. As a PhD student he was with
the Signal Processing Group at Institut de Recherche en Communication et
Cybernétique de Nantes (IRCCyN) where he worked on time-frequency
approaches to blind source separation. From 2003 to 2006 he was a
research associate with the Signal Processing Laboratory at University
of Cambridge (Engineering Dept) where he developed Bayesian approaches
to sparse component analysis with applications to audio source
separation. He was then a research engineer with the start-up company
Mist-Technologies (now Audionamix) in Paris, designing mono/stereo to
5.1 surround sound upmix solutions. In Mar. 2007, he joined Telecom
ParisTech, first as a research associate and then as a CNRS tenured
research scientist in Nov. 2007. His research interests generally
concern statistical signal processing and unsupervised machine learning
and in particular applications to blind source separation and music
signal processing. He is the scientific leader of project TANGERINE
(Theory and applications of nonnegative matrix factorization) funded by
the French research funding agency ANR.
Dr. Cédric Févotte's Website
3:40 pm - 4:20 pm
Speaker: Prof. Nobutaka Ono (National Institute of Informatics, Tokyo)
Title: "Auxiliary Function Approach to Source Localization and Separation"
Abstract:
Many kinds of source localization and separation problems can be
formulated as nonlinear optimization problems, and there are generally
no closed-form solutions. For fast and stable calculation, effective
iterative algorithms are desired. The auxiliary function technique, also
known as the majorization-minimization (MM) algorithm, is an attractive
approach to such problems since it can yield simple and
convergence-guaranteed update rules for parameter estimation. In this
talk, as a showcase, auxiliary-function-based algorithms for
source localization and separation, such as TDOA-based source
localization, blind alignment of asynchronously recorded signals,
harmonic/percussive sound separation, and independent component/vector
analysis, will be presented.
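To make the auxiliary function idea concrete, here is a minimal sketch on a toy problem that is not taken from the talk: finding the point that minimizes the sum of Euclidean distances to a set of fixed points. Each distance is majorized by a quadratic that touches it at the current estimate, and minimizing that auxiliary function gives a closed-form weighted-average update. The function names, the example points and the iteration count are assumptions made for illustration.

```python
import numpy as np


def cost(x, points):
    """Objective: sum of Euclidean distances from x to each point."""
    return float(np.linalg.norm(points - x, axis=1).sum())


def mm_sum_of_distances(points, n_iter=100, eps=1e-12):
    """Minimize the sum of distances with an auxiliary-function (MM) iteration.

    For d > 0, ||x - a|| <= ||x - a||^2 / (2 d) + d / 2, with equality when
    d = ||x - a||. Setting d to the current distances gives a quadratic
    auxiliary function whose minimizer is a weighted mean of the points.
    """
    x = points.mean(axis=0)  # starting point (an assumption of this sketch)
    history = [cost(x, points)]
    for _ in range(n_iter):
        # Current distances; eps guards against division by zero.
        d = np.maximum(np.linalg.norm(points - x, axis=1), eps)
        w = 1.0 / d
        # Closed-form minimizer of the quadratic auxiliary function.
        x = (w[:, None] * points).sum(axis=0) / w.sum()
        history.append(cost(x, points))
    return x, history


# Four points in convex position; the minimizer is where the diagonals cross.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0]])
x_hat, history = mm_sum_of_distances(points)
```

Each update needs only a weighted average, and the objective does not increase between iterations, which mirrors the simple, convergence-guaranteed update rules mentioned in the abstract.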
Speaker Biography:
Nobutaka Ono received the Ph.D. degree in mathematical engineering and
information physics from the University of Tokyo (Japan) in 2001. He
worked at the Graduate School of Information Science and Technology,
University of Tokyo, as a Research Associate from 2001 to 2004, and as a
Lecturer from 2005 to 2010. In 2011, he joined the Principles of
Informatics Research Division at the National Institute of Informatics
(NII, Tokyo, Japan) as an Associate Professor. His research interests
include source separation and localization, array signal processing,
acoustic and music signal processing, audio coding, and machine
learning. He was the Secretary of the Technical Committee of
Psychological and Physiological Acoustics in Japan from 2006 to 2009. He
received the Sato Prize Paper Award from the Acoustical Society of Japan (ASJ)
in 2000, the Igarashi Award at the Sensor Symposium on Sensors,
Micromachines, and Applied Systems from the Institute of Electrical
Engineers of Japan (IEEJ) in 2004, the Awaya Prize Young Researcher
Award from ASJ in 2007, and the Best Paper Award at the International
Symposium on Industrial Electronics (ISIE) in 2008.
Prof. Nobutaka Ono's Website
Organizers
- Jonathan Le Roux (MERL)
- John R. Hershey (MERL)