Details of Grant 

EPSRC Reference: GR/J10204/01
Title: IMPROVING OF PHONETIC DISCRIMINATION OF HIDDEN MARKOV MODEL BASED SPEECH RECOGNISERS
Principal Investigator: Young, Professor SJ
Other Investigators:
Woodland, Professor PC
Researcher Co-Investigators:
Project Partners:
Department: Engineering
Organisation: University of Cambridge
Scheme: Standard Research (Pre-FEC)
Starts: 01 December 1992
Ends: 31 May 1996
Value (£): 174,227
EPSRC Research Topic Classifications:
Human Communication in ICT
EPSRC Industrial Sector Classifications:
Related Grants:
Panel History:  
Summary on Grant Application Form
To improve the phonetic discrimination of HMM-based systems by the use of high-accuracy sub-phone models, input transformations and discriminative training, and to demonstrate their effectiveness in a working laboratory system.
Progress: The first phase of the work in this project addressed the issue of how to build accurate sub-phone models. The approach has been to start from conventional context-dependent 3-state Hidden Markov Models (HMMs) and then to pool the states to form sub-phones. Initially, we investigated data-driven clustering in which states were pooled to form a sub-phone based entirely on acoustic similarity [1]. More recently, we have developed a method based on phonetic decision trees [2]. Both methods work well, but the decision tree approach has the advantage that wider phonetic contexts can be included and models can be built for contexts that have not been seen in the training data.
Since then, the work has focused on the problem of applying transformations and discriminative training to very large vocabulary systems. To facilitate this, we have developed a method of generating lattices from a standard HMM recogniser. These can then be used to run recognition experiments quickly by rescoring rather than repeating a computationally expensive full search. They can also be used to generate alternative state alignments for a discriminative training scheme.
When the project started, our focus task was medium (1000 word) vocabulary recognition as exemplified by the ARPA Resource Management Task. More recently, we have transferred our attention to large vocabulary dictation. Work from this project and EPSRC Project GR/K25380 has contributed to the building of the HTK Large Vocabulary Dictation System. This system returned the best performance of any system in the ARPA 1994 CSR Evaluation [3].
[1] Young S.J., Woodland P.C. State Clustering in HMM-based Continuous Speech Recognition. Computer Speech and Language, Vol 8, pp. 369-384, 1994.
[2] Young S.J., Odell J.J., Woodland P.C. Tree-Based State Tying for High Accuracy Acoustic Modelling. Proc. Human Language Technology Workshop, Morgan Kaufmann Publishers Inc, March 1994.
[3] Woodland P.C., Leggetter C.J., Odell J., Valtchev V., Young S.J. The 1994 HTK Large Vocabulary Speech Recognition System. Proc. ICASSP, Detroit, 1995.
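Illustrative sketch: the data-driven pooling of states by acoustic similarity described in the summary above could look roughly like the following. This is not the project's implementation. The summary does not give the distance measure, the linkage rule or the stopping criterion, so the variance-weighted distance, the furthest-neighbour linkage, the threshold and all function and state names here are assumptions chosen for illustration only.

```python
import math
from itertools import combinations


def state_distance(a, b):
    """Variance-weighted Euclidean distance between two state means (assumed metric)."""
    return math.sqrt(sum(
        (ma - mb) ** 2 / math.sqrt(va * vb)
        for ma, mb, va, vb in zip(a["mean"], b["mean"], a["var"], b["var"])
    ))


def cluster_distance(c1, c2, states):
    """Furthest-neighbour distance: the largest distance between any two members."""
    return max(state_distance(states[i], states[j]) for i in c1 for j in c2)


def cluster_states(states, threshold):
    """Merge the closest pair of clusters until no pair lies within the threshold."""
    clusters = [[name] for name in states]  # start with one cluster per state
    while len(clusters) > 1:
        (i, j), best = min(
            (((i, j), cluster_distance(clusters[i], clusters[j], states))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best > threshold:
            break  # remaining clusters are too dissimilar to pool
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because j > i
    return clusters


# Hypothetical single-Gaussian states; the names and values are made up.
toy_states = {
    "t-ih+n.2": {"mean": [1.0, 2.0], "var": [1.0, 1.0]},
    "d-ih+n.2": {"mean": [1.1, 2.1], "var": [1.0, 1.0]},
    "s-aa+t.2": {"mean": [5.0, -1.0], "var": [1.0, 1.0]},
}
pooled = cluster_states(toy_states, threshold=1.0)
```

With these toy values the two acoustically similar states are pooled into one sub-phone and the third stays separate. Each resulting cluster would share a single output distribution, which is what lets sparse contexts borrow training data from similar ones.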
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.cam.ac.uk