Details of Grant

EPSRC Reference:

EP/D03261X/1

Title:

Probabilistic Modelling of Musical Audio for Machine Listening

Principal Investigator:

Godsill, Professor S

Other Investigators:

Researcher Co-Investigators:

Dr AT Cemgil

Project Partners:

Department:

Engineering

Organisation:

University of Cambridge

Scheme:

Standard Research (Pre-FEC)

Starts:

17 October 2005

Ends:

16 October 2008

Value (£):

304,617

EPSRC Research Topic Classifications:

Music & Acoustic Technology

EPSRC Industrial Sector Classifications:

Creative Industries

Related Grants:

Panel History:

Summary on Grant Application Form

Computer-based music composition and sound synthesis date back to the early days of digital computation and artificial intelligence. However, despite recent technological advances in synthesis, compression, processing and distribution of digital audio, it is not yet possible to construct machines that can match human levels of musical listening ability. Thus, a central problem with current computer-based music systems is that they are not equipped with listening capabilities. For flexible interaction and processing, it is essential that music systems are aware of the actual content of the music, i.e. are able to extract structure and organise high-level information (such as pitch, rhythm, chord and timbre) directly from the (low-level, sample-based) audio signal itself.

In the past, extensive research has been carried out to study and model human listening abilities. Research in this field, known as computational auditory scene analysis (CASA) or machine listening, aims to understand how humans can robustly solve a broad range of listening problems. Such problems include separating the voices of two or more people speaking simultaneously, or identifying environmental sound objects such as alarm, machine or vehicle sounds. Recently, analysis of musical scenes has been drawing increasing attention, primarily because of the need for content-based retrieval in digital audio databases (on the internet, in a personal collection) and interest in interactive music performance systems (computer programs that listen and respond to human musicians).

In this project, we propose to develop a rigorous methodology based on Bayesian data analysis that can be adapted to solve many interrelated problems in musical machine listening. Through a systematic and transparent incorporation of prior knowledge about musical signals, human perception and high-level music structure, we aim to develop new models and methods that may push forward the state of the art in applications such as polyphonic music transcription, source separation, audio restoration, content-based music retrieval and audio coding.

There are two main components to this research. The first is the modelling framework, as discussed above: how can we write down mathematically a representation of musical sound that encapsulates all the salient information about notes, chords and their likely relationships over time? However, such models will be highly complex, with many unknown parameters in non-linear relationships. Hence a second component of the work is the computational aspect: how to infer the musical parameters of this complex system based solely on the recorded audio signal? We plan to develop and adapt existing computational methods to this highly challenging task in novel ways, employing high-speed computers and our established expertise in this area to provide practical solutions.

Key Findings

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Potential use in non-academic contexts

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Impacts

Description	This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised

Sectors submitted by the Researcher

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Project URL:

Further Information:

Organisation Website:

http://www.cam.ac.uk