EPSRC logo

Details of Grant 

EPSRC Reference: EP/E027741/1
Title: Data-driven articulatory modelling: foundations for a new generation of speech synthesis
Principal Investigator: Renals, Professor S
Other Investigators:
King, Professor S
Researcher Co-Investigators:
Dr K Richmond
Project Partners:
Department: Sch of Informatics
Organisation: University of Edinburgh
Scheme: Standard Research
Starts: 01 November 2006 Ends: 31 October 2009 Value (£): 286,857
EPSRC Research Topic Classifications:
Comput./Corpus Linguistics
EPSRC Industrial Sector Classifications:
No relevance to Underpinning Sectors
Related Grants:
Panel History:  
Summary on Grant Application Form
Technology to automatically generate artificial speech (speech synthesis) has come to sound natural enough within the past five years that its use has widened dramatically. Leaders in industry have integrated text-to-speech (TTS) systems into useful real-world applications, such as automated call-centres and call routing, telephone-based information systems (e.g. telephone banking or news services), readers for the visually impaired, and hands-free interfaces, such as car navigation systems.However, in spite of this success, state-of-the-art TTS systems are still severely limited in terms of control. In short, we can readily control what synthesisers say, but not how they say it. Therefore, although such systems are suitable for giving factual information in speech form, they are completely inadequate where a high level of expressiveness is required. By expressiveness we mean the ability to indicate questions or emphasis on selected words, or to convey emotion. Furthermore, the process of generating new synthetic voices is costly and labour-intensive. It is the aim of this project to develop an alternative to current speech synthesis technology with a comparable level of intelligibility and naturalness, but which affords far greater flexibility and control.Unit selection uses large collections of pre-recorded speech to perform synthesis by merely gluing together appropriate fragments in sequence. There is in effect little or no modelling of speech involved. In contrast, this project aims to develop a new model which is trained on pre-recorded speech and interprets it in a novel way: on the basis of its underlying articulation. The aim of this model is to produce synthetic speech which not only retains the qualities of the original speech used for training, but which also is much more versatile and therefore has the potential to be used in new and exciting ways.
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.ed.ac.uk