EPSRC Reference: |
EP/M026981/1 |
Title: |
Towards visually-driven speech enhancement for cognitively-inspired multi-modal hearing-aid devices (AV-COGHEAR) |
Principal Investigator: |
Hussain, Professor A |
Other Investigators: |
|
Researcher Co-Investigators: |
|
Project Partners: |
|
Department: |
Computing Science and Mathematics |
Organisation: |
University of Stirling |
Scheme: |
Standard Research - NR1 |
Starts: |
01 October 2015 |
Ends: |
31 March 2019 |
Value (£): |
418,262 |
EPSRC Research Topic Classifications: |
Digital Signal Processing |
Image & Vision Computing |
Vision & Senses - ICT appl. |
|
EPSRC Industrial Sector Classifications: |
|
Related Grants: |
|
Panel History: |
Panel Date | Panel Name | Outcome |
10 Mar 2015 | Hearing Aid Technologies | Announced |
Summary on Grant Application Form |
Current commercial hearing aids use a number of sophisticated enhancement techniques to try to improve the quality of speech signals. However, today's best aids fail to work well in many everyday situations. In particular, they fail in busy social situations where there are many competing speech sources, and they fail when the speaker is too far from the listener and the speech is swamped by noise. We have identified an opportunity to solve this problem by building hearing aids that can 'see'.
This ambitious project aims to develop a new generation of hearing aid technology that extracts speech from noise by using a camera to see what the talker is saying. The wearer of the device will be able to focus their hearing on a target talker, and the device will filter out competing sound. This ability, which is beyond that of current technology, has the potential to improve the quality of life of the millions suffering from hearing loss (over 10 million in the UK alone).
Our approach is consistent with normal hearing. Listeners naturally combine information from both their ears and eyes: we use our eyes to help us hear. When listening to speech, the eyes follow the movements of the face and mouth, and a sophisticated, multi-stage process uses this information to separate the speech from the noise and fill in any gaps. Our hearing aid will act in much the same way. It will exploit visual information from a camera (e.g. using a Google Glass-like system), together with novel algorithms for intelligently combining audio and visual information, in order to improve speech quality and intelligibility in real-world noisy environments.
The project brings together a critical mass of researchers with the complementary expertise necessary to make the audio-visual hearing aid possible. It will combine two contrasting approaches to audio-visual speech enhancement developed by the Cognitive Computing group at Stirling and the Speech and Hearing Group at Sheffield: the Stirling approach uses the visual signal to filter out noise, whereas the Sheffield approach uses the visual signal to fill in 'gaps' in the speech. The vision processing needed to track a speaker's lip and face movements will use a revolutionary 'bar code' representation developed by the Psychology Division at Stirling. The MRC Institute of Hearing Research (IHR) will provide the expertise needed to evaluate the approach with real hearing-loss sufferers. Phonak AG, a leading international hearing aid manufacturer, will provide the advice and guidance necessary to maximise the potential for industrial impact.
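To fix ideas, the two uses of visual evidence can be contrasted in a minimal sketch. The Python fragment below is illustrative only, not the project's actual algorithms: predict_gain_from_lips and predict_speech_from_lips are hypothetical visual models, and the per-frame magnitude-spectrum processing is an assumption.

    import numpy as np

    def enhance_frame(noisy_mag, lip_features,
                      predict_gain_from_lips, predict_speech_from_lips,
                      reliability_floor=0.2):
        """Enhance one noisy magnitude-spectrum frame using visual evidence."""
        # (a) Noise filtering (cf. the Stirling approach): a visual model
        # predicts a per-bin gain in [0, 1] that attenuates noisy bins.
        gain = np.clip(predict_gain_from_lips(lip_features), 0.0, 1.0)
        enhanced = gain * noisy_mag

        # (b) Gap filling (cf. the Sheffield approach): bins judged to be
        # dominated by noise are replaced by a purely visually-predicted
        # speech estimate rather than merely attenuated.
        predicted = predict_speech_from_lips(lip_features)
        unreliable = gain < reliability_floor
        enhanced = np.where(unreliable, predicted, enhanced)
        return enhanced

A real device would combine the two estimates more carefully (one of the project's stated research questions), but the sketch shows why the approaches are complementary: filtering preserves the observed signal where it is reliable, while gap filling supplies information where it is not.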
The project has been designed as a series of four work packages that address the key research challenges posed by each component of the device's design. These challenges have been identified by preliminary work at Sheffield and Stirling. Among them are developing improved techniques for visually-driven audio analysis; designing better metrics for weighting audio and visual evidence; and developing techniques for optimally combining the noise-filtering and gap-filling approaches. A further key challenge is that, for a hearing aid to be effective, the processing cannot delay the signal by more than 10 ms (see the sketch below).
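The 10 ms constraint is tight for frame-based signal processing: one full analysis frame must be buffered before it can be processed, so the frame length alone sets a floor on delay. A back-of-envelope check (illustrative figures assuming a 16 kHz sample rate; these are not project measurements):

    def buffering_delay_ms(frame_len_samples, sample_rate_hz):
        # Minimum algorithmic delay of frame-based processing: a whole
        # analysis frame must arrive before processing can begin.
        return 1000.0 * frame_len_samples / sample_rate_hz

    SAMPLE_RATE = 16_000  # a common speech-processing rate
    for frame_len in (128, 160, 256, 512):
        delay = buffering_delay_ms(frame_len, SAMPLE_RATE)
        print(f"{frame_len:4d} samples -> {delay:5.1f} ms")
    # 128 ->  8.0 ms, 160 -> 10.0 ms, 256 -> 16.0 ms, 512 -> 32.0 ms

At 16 kHz, any frame longer than 160 samples already exceeds the budget before any computation or audio-visual fusion has even started, which is why latency is a genuine design constraint rather than an afterthought.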
In the final year of the project, a fully integrated software prototype will be clinically evaluated using listening tests with hearing-impaired volunteers in a range of noisy, reverberant environments. The evaluation will use a new purpose-built speech corpus designed specifically for testing this new class of multimodal device. The project's clinical research partner, the Scottish Section of MRC IHR, will advise on the experimental design and analysis throughout the trials. Industry leader Phonak AG will provide advice and technical support for benchmarking real-time hearing devices. The final clinically-tested prototype will be made available to the whole hearing community as a testbed for further research, development, evaluation and benchmarking.
|
Key Findings |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Potential use in non-academic contexts |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Impacts |
Description |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
Summary |
|
Date Materialised |
|
Sectors submitted by the Researcher |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Project URL: |
|
Further Information: |
|
Organisation Website: |
http://www.stir.ac.uk |