
Details of Grant 

EPSRC Reference: EP/T024976/1
Title: Unmute: Opening Spoken Language Interaction to the Currently Unheard
Principal Investigator: Bell, Dr PJ
Other Investigators:
Jones, Professor M
Renals, Professor S
Robinson, Professor SNW
Goldwater, Professor S
Pearson, Professor JS
Researcher Co-Investigators:
Project Partners:
Auris Tech Ltd
Translators Without Borders
Department: Centre for Speech Technology Research
Organisation: University of Edinburgh
Scheme: Standard Research
Starts: 01 December 2020
Ends: 31 July 2024
Value (£): 970,668
EPSRC Research Topic Classifications:
Artificial Intelligence
Computational Linguistics
Human-Computer Interactions
EPSRC Industrial Sector Classifications:
No relevance to Underpinning Sectors
Related Grants:
Panel History:
Panel Date: 10 Feb 2020
Panel Name: Responsible NLP for Intelligent Interfaces Panel 2020
Outcome: Announced
Summary on Grant Application Form
Every day, millions of users talk to machines, be they voice assistants on their mobile phones or standalone devices - such as Alexa or Google Home - in their homes and workplaces. By using Natural Language Processing for input and output through the medium of speech in an intelligent user interface, people are able to access a plethora of content and services. These systems rely on several factors: i) the use of automatic speech recognition (ASR) that can convert the speech signal into usable language tokens; ii) the availability of knowledge and resources in the system to address the user's need; and iii) an intelligent user interface interaction design that fits the contexts and capabilities of the end user. While state-of-the-art systems are beginning to address the needs of "conventional" users (i.e. those who speak a widely spoken and written language, and who have relatively high degrees of literacy, exposure to digital interactions and other resources), there are many hundreds of millions of people who are being excluded globally. Paradoxically, these users, who have resource constraints (such as low digital and textual literacy), could be the ones to benefit most from advances in spoken NLP systems, opening up economic, social and educational possibilities that are currently unmet.

This project addresses the limitations of today's approaches to open up intelligent interfaces to the currently digitally 'unheard'. The challenges we will address are threefold. Firstly, and crucially, there is a need to explore highly innovative ASR techniques that can cope with languages that have limited or even no textual resources. Conventionally, ASR systems rely on vast amounts of transcribed speech to develop and train models. Our focus is on languages where there is little of this data and indeed on languages where there is no established written form. For the systems to be useful to the sorts of community we target, they must of course make available relevant content in the language these users use; so the second challenge is to establish infrastructures and user involvement to provide ways to generate such content. In doing so we hope to produce a blueprint and toolkit that can be used by many other low- or zero-resource language communities. Finally, the user communities we will work with to develop these new approaches have a different perspective to "conventional" users, and the third challenge is to surface their needs and values when interacting with an intelligent interface for content and services, so that the underlying algorithms and the interaction devices and styles are appropriate and effective. Prior work by our team has shown that assumptions about what works in speech assistants deployed in conventional settings break down when these systems are exposed to groups in informal settlements in India and townships in South Africa. By taking this approach, we expect to innovate on and disrupt the interface styles and interaction devices currently used in intelligent speech and language interfaces, addressing the needs of not just currently excluded users but offering up new possibilities for the rest of the world too.

The work brings together two world-leading groups in a new collaboration: Edinburgh's CSTR, with its long track record of speech technology innovation, and the FIT Lab at Swansea's Computational Foundry, which has pioneered interaction innovation for and with emergent users for over a decade, developing and advocating responsible innovation with communities in rural, peri-urban and urban developing-world contexts. These groups are joined by both existing and new collaborations with NGO, spin-out and international academic stakeholders who will help shape the work and ensure it has direct and sustainable impact.

Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.ed.ac.uk