
Details of Grant 

EPSRC Reference: EP/L000776/1
Title: Unifying audio signal processing and machine learning: a fundamental framework for machine hearing
Principal Investigator: Turner, Dr RE
Other Investigators:
Researcher Co-Investigators:
Project Partners:
Medical Research Council (MRC)
Department: Engineering
Organisation: University of Cambridge
Scheme: First Grant - Revised 2009
Starts: 20 November 2013
Ends: 19 November 2015
Value (£): 97,101
EPSRC Research Topic Classifications:
Artificial Intelligence
Digital Signal Processing
Music & Acoustic Technology
Vision & Senses - ICT appl.
EPSRC Industrial Sector Classifications:
Healthcare
Related Grants:
Panel History:
Panel Date: 11 Apr 2013
Panel Name: EPSRC ICT Responsive Mode - Apr 2013
Outcome: Announced
Summary on Grant Application Form
Modern technology is leading to a flood of audio data. For example, over seventy-two hours of unstructured and unlabelled sound-tracks are uploaded to internet sites every minute. Automatic systems are urgently needed for recognising audio content so that these sound-tracks can be tagged for categorisation and search. Moreover, an increasing proportion of recordings are made on hand-held devices in challenging environments that contain multiple sound sources and noise. Such uncurated and noisy data necessitate automatic systems for cleaning the audio content and separating sources from mixtures. On a related note, devices for the hearing impaired currently perform poorly in noise. In fact, this is a major reason why six million people in the UK who would benefit from a hearing aid do not use one (a market worth £18 billion p.a.). Patients fitted with cochlear implants suffer from similar limitations, and as the population ages more people are affected.

It is clear that audio recognition and enhancement methods are required to stop us drowning in audio-data, for processing in hearing devices, and to support new technological innovations. Current approaches to these problems use a combination of audio signal processing (which places the audio data into a convenient format and reduces the data-rate) and machine learning (which removes noise, separates sources, or classifies the content). It is widely believed that these two fields must become increasingly integrated in the future. However, this union is currently a troubled one, suffering from four problems.

Inefficiency: The methods are too inefficient for vast amounts of data (as is the case for audio-tracks on the web) or for real-time applications (as is necessary in hearing aids).

Impoverished models: The machine learning modules tend to be statistically limited.

Unadapted: The signal processing modules are unadapted despite evidence from other fields, like computer vision, which suggests that automatic tuning leads to significant performance gains.

Distorted mixtures: The signal processing modules introduce non-linear distortions which are not captured by the machine learning modules.

In this project we address these four limitations by introducing a new theoretical framework which unifies signal processing and machine learning. The key step is to view the signal processing module as solving an inference problem. Since the machine-learning modules are often framed in this way, the two modules can be combined into a single coherent approach, allowing technologies from the two fields to be completely integrated. In the project we will then use the new approach to develop efficient, rich, adaptive, and distortion-free approaches to audio denoising, source separation and recognition. We will evaluate the noise reduction and source separation algorithms on the hearing impaired, and the audio recognition algorithms on audio sound-track data.

We believe this new framework will form a foundation of the emerging field of machine hearing. In the future, machine hearing will be deployed in a vast range of applications from music processing tasks to augmented reality systems (in conjunction with technologies from computer vision). We believe that this project will kick-start this proliferation.
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description: This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.cam.ac.uk