
Details of Grant 

EPSRC Reference: EP/M005429/1
Title: DILiGENt: Domain-Independent Language Generation
Principal Investigator: Rieser, Professor V
Other Investigators:
Riedel, Dr S
Lemon, Professor O
Researcher Co-Investigators:
Professor A Vlachos
Project Partners:
Department: School of Mathematical and Computer Sciences
Organisation: Heriot-Watt University
Scheme: Standard Research
Starts: 01 March 2015
Ends: 28 February 2018
Value (£): 453,591
EPSRC Research Topic Classifications:
Artificial Intelligence
Comput./Corpus Linguistics
EPSRC Industrial Sector Classifications:
No relevance to Underpinning Sectors
Related Grants:
Panel History:
Panel Date: 09 Sep 2014
Panel Name: EPSRC ICT Prioritisation Panel - Sept 2014
Outcome: Announced
Summary on Grant Application Form
We propose a two-year project to develop a novel data-driven methodology for rapidly creating high-quality NLG systems for new domains, by combining recent advances in three areas:

(1) advances in statistical models for NLG,

(2) crowdsourcing methods for natural language data collection, which have shown promising initial results in related fields, such as Machine Translation, and

(3) recently developed imitation learning algorithms for structured prediction.

The project team combines the expertise of two leading research groups in these areas:

At Heriot-Watt University, we recently demonstrated the potential of data-driven statistical NLG in limited domains. To make this framework domain-independent, we will leverage recent machine learning models developed by researchers at University College London. These models learn by imitating the actions a human expert would perform to generate natural language utterances, which we collect via a tightly integrated crowdsourcing procedure. The outcome of this work will be a framework that allows the rapid development of NLG systems for new domains, and thus accelerates the impact of NLG technology on the market.

We will showcase this framework on a dataset provided by the BBC, where we address the problem of generating weather reports for over 20,000 individual locations. Currently, the BBC website features only 10 reports written by meteorologists. Each of these reports covers a rather large area of the country (e.g. East of England), and is thus of little interest to users, who are usually interested in the weather in a particular location (e.g. Norwich).

In a second, more ambitious step, we will explore how this framework scales to more complex interactive dialogue settings, where generation has to account for discourse phenomena, such as long-distance discourse relations or syntactic coordination. This will be evaluated in a shared task challenge for generation in interactive systems, hosted by Heriot-Watt University.

In sum, this project will further our understanding of domain-independent language generation, deliver substantial and novel resources to support future research in this area (in the form of code and data), and enable practical implementations of NLG systems in a wide range of domains, from weather reports to natural language interfaces.
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description: This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.hw.ac.uk