EPSRC Reference: |
EP/G050821/1 |
Title: |
Probabilistic Auditory Scene Analysis |
Principal Investigator: |
Turner, Dr RE |
Other Investigators: |
|
Researcher Co-Investigators: |
|
Project Partners: |
|
Department: |
Engineering |
Organisation: |
University of Cambridge |
Scheme: |
Postdoc Research Fellowship |
Starts: |
04 January 2010 |
Ends: |
03 January 2013 |
Value (£): |
232,105
|
EPSRC Research Topic Classifications: |
Biomedical neuroscience |
Vision & Senses - ICT appl. |
|
EPSRC Industrial Sector Classifications: |
|
Related Grants: |
|
Panel History: |
|
Summary on Grant Application Form |
Auditory environments are typically very complicated. For example, a cocktail party comprises many sources: the chinking of glasses, the chattering of the many guests, and the sound of background music. Nevertheless, our auditory system can make sense of such a scene; it can work out how many acoustic sources there are and determine the individual contributions to the scene from each. Remarkably, it can do this using the information from a single microphone. A major goal of auditory neuroscience is to understand how the auditory system achieves this feat.
Broadly speaking, it is thought that there are three stages to auditory scene analysis. The first stage, which is well understood physiologically, converts the incoming sound into a time-frequency representation. This reveals the local energy in a frequency band at a particular time. In the second stage, psychophysical evidence suggests that primitive grouping principles are used to group local regions of spectral-temporal energy arising from a common source. By using simple stimuli, like tones and noise, a long list of primitive grouping principles has been elucidated. For example, the principle of good continuation identifies smoothly varying features with a single source and abrupt changes as a signature of separate sources. In the final stage of auditory scene analysis, called schema-based grouping, higher level knowledge, like the structure of music or speech, is used to bind the groups of spectral-temporal energy into streams so that there is one stream for each source.
There are many outstanding questions with this framework. One important open question is the role that auditory cortex plays in auditory scene analysis, which is not well established. Another concerns the generality and completeness of the established list of primitive grouping rules: although the principles successfully characterise perception of simple sounds, it is unclear how successful and relevant the description is for natural sounds.
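As a rough illustration of the first stage described above, the sketch below computes the local energy of a sound in each frequency band at each time. It uses a windowed short-time Fourier transform as a stand-in; this is an assumption made for illustration, not the auditory model used in the project, and the function name `spectrogram`, the frame length, the hop size and the sampling rate are all hypothetical choices.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Local energy per (time frame, frequency bin) via a windowed FFT.

    Illustrative stand-in for a time-frequency representation; the
    frame length and hop size are arbitrary assumptions.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Squared magnitude of the one-sided spectrum gives energy per band.
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

fs = 8000                               # assumed sampling rate in Hz
t = np.arange(fs) / fs                  # one second of samples
tone = np.sin(2 * np.pi * 1000 * t)     # a simple stimulus: 1 kHz tone
energy = spectrogram(tone)

# Frequency bin carrying the most energy, averaged over time.
peak_bin = int(energy.mean(axis=0).argmax())
peak_hz = peak_bin * fs / 256
```

For a pure tone, the energy concentrates in the band containing the tone's frequency, which is the kind of local spectral-temporal energy that the later grouping stages are thought to operate on.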
This project aims to resolve these questions through modelling work, psychophysics experiments and neural recording experiments. The new idea is to view the primitive grouping principles as arising from inference in a latent variable model of auditory scenes. A latent variable model is a description of how an auditory scene, like that encountered at a cocktail party, is composed of latent auditory sources, like the chinking glasses and chattering guests. It also includes a description of the statistics of these sources, like the fact that the chinking glasses tend to be isolated, high frequency events whilst the chattering is rather more constant and lower in frequency. The idea is that the brain is trying to infer these latent sources using prior knowledge of their statistics. New tools of probabilistic inference can make these intuitions concrete.
This new perspective, called probabilistic scene analysis, has two main advantages: one practical and one theoretical. The practical advantage is that a statistical characterisation of sounds can be used to produce stimuli with complicated, but controlled, structure for use in experiments. The theoretical benefit is that the list of primitive grouping rules, and the manner in which they trade off, are now derived from the statistics of sounds; heuristic implementation is no longer required. This enables us to predict the results of the experiments. In particular, the psychophysics experiments are aimed at resolving both how auditory grouping operates in synthetic auditory textures (e.g. rain, wind, water) and whether this is consistent with the probabilistic account. Furthermore, the neural recording experiments will investigate the role of auditory cortex in auditory scene analysis, and the hypothesis that it represents high level statistics of sounds, like slowly varying modulatory components.
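The generative side of a latent variable model of a scene can be sketched in a few lines. The example below is a toy under stated assumptions, not the project's model: each hypothetical source is a fast carrier multiplied by a slowly varying positive envelope, and the observed single-microphone signal is the sum of the sources plus a little noise. The function names, carrier frequencies, envelope rates and noise level are all invented for illustration. Inference, which is the hard part, would run in the opposite direction, from `mixture` back to `sources`, and is not shown.

```python
import numpy as np

def sample_source(rng, n, fs, carrier_hz, envelope_hz):
    """One hypothetical latent source: slow positive envelope times a carrier."""
    t = np.arange(n) / fs
    phase = rng.uniform(0.0, 2.0 * np.pi)   # random envelope phase
    # Slowly varying modulator, kept in [0, 1] so it acts as an amplitude.
    envelope = 0.5 * (1.0 + np.sin(2.0 * np.pi * envelope_hz * t + phase))
    carrier = np.sin(2.0 * np.pi * carrier_hz * t)
    return envelope * carrier

def sample_scene(seed=0, n=8000, fs=8000, noise_std=0.01):
    """Draw two latent sources and the single-channel mixture they produce."""
    rng = np.random.default_rng(seed)
    sources = np.stack([
        sample_source(rng, n, fs, carrier_hz=3000.0, envelope_hz=8.0),  # higher, faster-modulated
        sample_source(rng, n, fs, carrier_hz=300.0, envelope_hz=1.0),   # lower, steadier
    ])
    # The observation is the sum of the latent sources plus sensor noise.
    mixture = sources.sum(axis=0) + noise_std * rng.normal(size=n)
    return sources, mixture

sources, mixture = sample_scene()
```

Sampling from such a model with different source statistics is also one way to produce stimuli with complicated but controlled structure.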
|
Key Findings |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Potential use in non-academic contexts |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Impacts |
Description |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
Summary |
|
Date Materialised |
|
|
Sectors submitted by the Researcher |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Project URL: |
http://www.gatsby.ucl.ac.uk/~turner/index.html |
Further Information: |
|
Organisation Website: |
http://www.cam.ac.uk |