
Details of Grant 

EPSRC Reference: EP/E010857/1
Title: Learning the morphology of complex synthetic languages
Principal Investigator: Flach, Professor P
Other Investigators:
Researcher Co-Investigators:
Project Partners:
Department: Computer Science
Organisation: University of Bristol
Scheme: Standard Research
Starts: 01 October 2006
Ends: 30 September 2010
Value (£): 366,271
EPSRC Research Topic Classifications:
Artificial Intelligence
EPSRC Industrial Sector Classifications:
No relevance to Underpinning Sectors
Related Grants:
Panel History:  
Summary on Grant Application Form
This project aims to apply advanced machine learning techniques to learn the morphology -- the way words are formed from constituents -- of synthetic (i.e., morphologically complex) languages. This will allow improved text-to-speech systems for complex languages such as isiZulu. Morphological analysis is the decomposition of words into their constituents (morphemes), with grammatical features assigned to each constituent. To take a simple example in English, the word unhappier is decomposed into the following components: un (adjectival negative prefix) + happy (adjectival stem) + er (comparative suffix). The decomposition takes into account both the allowed sequence of word constituents and the changes in the orthographic shape of these constituents when they are concatenated (here, the stem-final y of happy is written i before the suffix).
Most morphological phenomena in the majority of European languages can be expressed by finite-state techniques such as regular expressions. This project, however, is concerned with the structurally more complex synthetic languages (mostly non-European). These languages exhibit complex recursive morphological structures that require more powerful mechanisms than finite-state automata.
The main research goal of the project is to automatically decompose a word into its constituents by learning two kinds of rules: rules that represent the permissible sequences of word constituents, and rules that change the orthographic shape of the constituents. This involves tackling a set of open problems in morphological learning that currently prevent the whole set of morphological rules from being learned. We have chosen Inductive Logic Programming (ILP) for training because its logical foundations allow complex formalisms to be represented and then extended with stochastic features. ILP methods can also induce rules directly from unbounded data items such as strings, which relates annotation and training more naturally to the underlying linguistics.
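The unhappier example can be made concrete with a minimal sketch. It is not the project's method: the lexicon, the feature labels, and the single y-to-i spelling rule are hand-written assumptions for illustration, whereas the project aims to learn such rules with ILP. The function names `restore_stem` and `analyse` are hypothetical.

```python
# Toy lexicon; every entry and feature label is an illustrative assumption.
PREFIXES = {"un": "adjectival negative prefix"}
STEMS = {"happy": "adjectival stem"}
SUFFIXES = {"er": "comparative suffix"}


def restore_stem(surface):
    """Yield candidate underlying stems for a surface stem.

    Undoes one hand-written orthographic rule: a stem-final "y" is
    written "i" before a suffix (happy + er -> happier).
    """
    yield surface
    if surface.endswith("i"):
        yield surface[:-1] + "y"


def analyse(word):
    """Return every (morpheme, feature) analysis the toy lexicon licenses."""
    analyses = []
    for prefix in [""] + list(PREFIXES):
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in [""] + list(SUFFIXES):
            if suffix and not rest.endswith(suffix):
                continue
            surface = rest[:len(rest) - len(suffix)]
            for stem in restore_stem(surface):
                if stem not in STEMS:
                    continue
                # The y -> i respelling is only valid before a suffix.
                if stem != surface and not suffix:
                    continue
                parts = []
                if prefix:
                    parts.append((prefix, PREFIXES[prefix]))
                parts.append((stem, STEMS[stem]))
                if suffix:
                    parts.append((suffix, SUFFIXES[suffix]))
                analyses.append(parts)
    return analyses


if __name__ == "__main__":
    for morpheme, feature in analyse("unhappier")[0]:
        print(f"{morpheme} ({feature})")
```

The sketch encodes both kinds of knowledge the summary mentions: the permitted order of constituents (prefix, stem, suffix) and a rule that alters spelling at a morpheme boundary. A fixed prefix-stem-suffix template like this one is finite-state, so it illustrates the simpler case the summary attributes to most European languages, not the recursive structures the project targets.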
The proposed research will substantially benefit the production of Text-to-Speech systems in developing African and Asian countries (practical delivery will be enabled by the partnerships and contacts forged by the Local Language Speech Technology Initiative, see www.llsti.org). The automated morphological analysis tools developed in this project will facilitate the creation of intelligible Text-to-Speech systems that require morphological analysis for: (1) automatic tone assignment, which is essential for most African languages; (2) proper prosody, which includes stress assignment, required for Russian, and phrase prediction, required for most world languages including European ones; (3) proper letter-to-sound rules, required for the Indian languages Hindi and Telugu, for Turkish, and for many others. The research will provide the technology for implementing indigenous and minority language voice services offered by mobile network providers (such as information on healthcare, jobs, agriculture, the environment, etc.). Another important application of this technology will be screen readers for blind people in many Asian and African countries.
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description: This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.bris.ac.uk