
Details of Grant 

EPSRC Reference: GR/J55106/01
Title: INTONATION AND DIALOGUE MODELS AS CONSTRAINTS IN SPEECH RECOGNITION
Principal Investigator: Isard, Mr S
Other Investigators:
Researcher Co-Investigators:
Project Partners:
Department: Theoretical and Applied Linguistics
Organisation: University of Edinburgh
Scheme: Standard Research (Pre-FEC)
Starts: 01 October 1993
Ends: 31 March 1997
Value (£): 227,143
EPSRC Research Topic Classifications:
Human Communication in ICT
EPSRC Industrial Sector Classifications:
Related Grants:
Panel History:  
Summary on Grant Application Form
The objective is to build a speech recognition system that will recognise dialogue speech rather than read utterances. Our particular approach is to make extensive use of intonation and its relation to dialogue structure. The project involves extracting intonational information from utterances, analysing the function that this intonation serves in signalling the dialogue role of the utterance, building a speech recogniser and integrating the three components.

Progress:
The project has several components that will form the basis of a speech recognition system which can perform in a dialogue situation.

Intonational Labelling
A substantial amount of time has been spent on producing a robust method of detecting prosodic phrase boundaries, pitch accents and boundary tones (intonational events) from speech. The current algorithm uses sophisticated feature extraction whereby a number of tracks are produced which correspond to vowelness, obstruentation, energy, pitch and differentiated pitch. These tracks are combined into a 5-dimensional vector which is used as the input to a time-delay neural net. This net produces another track, which represents a plot against time of the likelihood that a section of the utterance contains an intonational event. This track is further processed to give a basic intonational description of the utterance. The technique has an accuracy of about 90%. Current work is concentrated on testing new input feature sets and on more sophisticated post-processing of the neural-net output track.

Dialogue Modelling
We have completed a study of the function of intonation in single-word utterances. It shows that, given a specific discourse context, one can narrow the potential discourse functions of the utterance through a process of elimination based solely upon the utterance's intonation category. Extensions of this work will use the intonation transcription that we are developing in a comparison of longer utterances in dialogue and their functions. The dialogue analysis as it will be used in this project is already complete.

Speech Recognition
A new speech recogniser is being developed on the project. This recogniser uses a hybrid neural-net/hidden Markov model approach. The neural net is used to produce a set of tracks (in a similar way to that described above) representing various broad phonetic classes. The hidden Markov model then uses these tracks as input. The project aims to be able to recognise and process the HCRC Map Task corpus, which consists of spontaneous dialogue speech. Grammar models of the corpus have been built, and current work involves hand labelling of the corpus in order to obtain data for training phonetic models. Full recognition experiments on the corpus will begin soon.
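The intonational labelling steps in the summary (five feature tracks, one 5-dimensional vector per frame, a time-delay net producing a likelihood track, then post-processing) can be illustrated with a minimal sketch. This is not the project's implementation: the function names, the single time-delay unit, the hand-set weights and the 0.5 threshold are all assumptions made for illustration, and a real time-delay neural net would have several trained layers.

```python
import math
from typing import List, Sequence, Tuple


def differentiate(track: Sequence[float]) -> List[float]:
    """First difference of a track, padded with 0.0 so the length is unchanged."""
    return [0.0] + [track[i] - track[i - 1] for i in range(1, len(track))]


def stack_tracks(vowelness, obstruentation, energy, pitch) -> List[List[float]]:
    """Combine the tracks into one 5-dimensional vector per frame.

    The fifth dimension (differentiated pitch) is derived from the pitch track.
    """
    delta_pitch = differentiate(pitch)
    return [list(f) for f in zip(vowelness, obstruentation, energy, pitch, delta_pitch)]


def time_delay_layer(frames, weights, bias) -> List[float]:
    """A single time-delay unit (hypothetical stand-in for the full net).

    weights[d][k] applies to feature k of the frame d steps in the past.
    Frames before the start of the utterance are treated as zeros.
    The sigmoid output is read as an event likelihood per frame.
    """
    out = []
    for t in range(len(frames)):
        total = bias
        for d, w_d in enumerate(weights):
            if t - d >= 0:
                total += sum(w * x for w, x in zip(w_d, frames[t - d]))
        out.append(1.0 / (1.0 + math.exp(-total)))
    return out


def find_events(likelihood, threshold=0.5) -> List[Tuple[int, int]]:
    """Post-processing: return (start, end) frame ranges, end exclusive,
    where the likelihood track stays above the (assumed) threshold."""
    events = []
    start = None
    for t, p in enumerate(likelihood):
        if p > threshold and start is None:
            start = t
        elif p <= threshold and start is not None:
            events.append((start, t))
            start = None
    if start is not None:
        events.append((start, len(likelihood)))
    return events


# Toy usage with made-up tracks and hand-set (untrained) weights.
frames = stack_tracks(
    vowelness=[0.1, 0.9, 0.9, 0.2],
    obstruentation=[0.8, 0.1, 0.1, 0.7],
    energy=[0.2, 0.8, 0.9, 0.3],
    pitch=[100.0, 120.0, 150.0, 110.0],
)
weights = [[1.0, -1.0, 1.0, 0.0, 0.1], [0.5, 0.0, 0.0, 0.0, 0.0]]
likelihood = time_delay_layer(frames, weights, bias=-2.0)
events = find_events(likelihood)
```

Here `events` holds the frame ranges that the toy unit marks as likely intonational events. Classifying each range as a phrase boundary, pitch accent or boundary tone would be a further post-processing step that this sketch does not attempt.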
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description: This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.ed.ac.uk