EPSRC logo

Details of Grant 

EPSRC Reference: EP/N014111/1
Title: Making Sense of Sounds
Principal Investigator: Plumbley, Professor M
Other Investigators:
Wang, Professor W Cox, Professor TJ Frohlich, Professor D
Mikolajczyk, Professor K Davies, Professor W Jackson, Professor P
Researcher Co-Investigators:
Project Partners:
Audio Analytic Ltd (UK) BBC City, University of London
NYU Steinhardt Pompeu Fabra University Queen Mary University of London
University of Groningen
Department: Vision Speech and Signal Proc CVSSP
Organisation: University of Surrey
Scheme: Standard Research
Starts: 14 March 2016 Ends: 31 October 2019 Value (£): 1,275,401
EPSRC Research Topic Classifications:
Artificial Intelligence Digital Signal Processing
Human-Computer Interactions Music & Acoustic Technology
EPSRC Industrial Sector Classifications:
Creative Industries Information Technologies
Related Grants:
Panel History:
Panel DatePanel NameOutcome
03 Sep 2015 Making Sense From Data Panel - Full Proposals Announced
Summary on Grant Application Form
In this project we will investigate how to make sense from sound data, focussing on how to convert these recordings into understandable and actionable information: specifically how to allow people to search, browse and interact with sounds.

Increasing quantities of sound data are now being gathered in archives such as sound and audiovisual archives, through sound sensors such as city soundscape monitoring and as soundtracks on user-generated content. For example, the British Library (BL) Sound Archive has over a million discs and thousands of tapes; the BBC has some 1 million hours of digitized content; smart cities such as Santander (Spain) and Assen (Netherlands) are beginning to wire themselves up with a large number of distributed sensors; and 100 hours of video (with sound) are uploaded you YouTube every minute.

However, the ability to understand and interact with all this sound data is hampered by a lack of tools allowing people to "make sense of sounds" based on the audio content. For example, in a sound map, users may be able to search for sound clips by geographical location, but not by "similar sounds". In broadcast archives, users must typically know which programme to look for, and listen through to find the section they need. Manually-entered textual metadata may allow text-based searching, but these typically only refer to the entire clip or programme, can often be ambiguous, and are hard to scale to large datasets. In addition, browsing sound data collections is a time-consuming process: without the help of e.g. key frame images available from video clips, each sound clip has to be "auditioned" (listened to) to find what is needed, and where the point of interest can be found. Radio programme producers currently have to train themselves to listen to audio clips at up to double speed to save time in the production process. Clearly better tools are needed.

To do this, we will investigate and develop new signal processing methods to analyse sound and audiovisual files, new interaction methods to search and browse through sets of sound files, and new methods to explore and understand the criteria searchers use when searching, selecting and interacting with sounds. The perceptual aspect will also investigate people's emotional response to sounds and soundscapes, assisting sound designers or producers to find audio samples with the effect they want to create, and informing the development of public policy on urban soundscapes and their impact on people.

There are a wide range of potential beneficiaries for the research and tools that will be produced in this project, including both professional users and the general public. Archivists who are digitizing content into sound and audiovisual archives will benefit from new ways to visualize and tag archive material. Radio or television programme makers will benefit from new ways to search through recorded programme material and databases of sound effects to reuse, and new tools to visualize and repurpose archive material once identified. Sound artists and musicians will benefit from new ways to find interesting sound objects, or collections of sounds, for them to use as part of compositions or installations. Educators will benefit from new ways to find material on particular topics (machines, wildlife) based on their sound properties rather than metadata. Urban planners and policy makers will benefit from new tools to understand the urban sound environment, and people living in those urban environments will benefit through improved city sound policies and better designed soundscapes, making the urban environment more pleasant. For the general public, many people are now building their own archives of recordings, in the form of videos with soundtracks, and may in future include photographs with associated sounds (audiophotographs). This research will help people make sense of the sounds that surround us, and the associations and memories that they bring.

Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.surrey.ac.uk