EPSRC Reference: |
EP/M005429/1 |
Title: |
DILiGENt: Domain-Independent Language Generation |
Principal Investigator: |
Rieser, Professor V |
Other Investigators: |
|
Researcher Co-Investigators: |
|
Project Partners: |
|
Department: |
School of Mathematical and Computer Sciences |
Organisation: |
Heriot-Watt University |
Scheme: |
Standard Research |
Starts: |
01 March 2015 |
Ends: |
28 February 2018 |
Value (£): |
453,591
|
EPSRC Research Topic Classifications: |
Artificial Intelligence |
Comput./Corpus Linguistics |
|
EPSRC Industrial Sector Classifications: |
No relevance to Underpinning Sectors |
|
|
Related Grants: |
|
Panel History: |
Panel Date | Panel Name | Outcome |
09 Sep 2014 | EPSRC ICT Prioritisation Panel - Sept 2014 | Announced |
|
Summary on Grant Application Form |
We propose a two-year project to develop a novel data-driven methodology for rapidly creating high-quality natural language generation (NLG) systems for new domains, by combining recent advances in three areas:
(1) advances in statistical models for NLG,
(2) crowdsourcing methods for natural language data collection, which have shown promising initial results in related fields such as Machine Translation, and
(3) recently developed imitation learning algorithms for structured prediction.
The project team combines the expertise of two leading research groups in these areas:
At Heriot-Watt University, we recently demonstrated the potential of data-driven statistical NLG in limited domains. To make this framework domain-independent, we will leverage recent machine learning models developed by researchers at University College London. These models learn by imitating the actions a human expert would perform to generate natural language utterances, which we collect via a tightly integrated crowdsourcing procedure. The outcome of this work will be a framework that allows the rapid development of NLG systems for new domains, and thus accelerates the impact of NLG technology on the market.
We will showcase this framework on a dataset provided by the BBC, where we address the problem of generating weather reports for over 20,000 individual locations. Currently, the BBC website features only 10 reports written by meteorologists. Each of these reports covers a rather large area of the country (e.g. East of England) and is thus of little interest to users, who usually want the weather for a particular location (e.g. Norwich).
In a second, more ambitious step, we will explore how this framework scales to more complex interactive dialogue settings, where generation has to account for discourse phenomena, such as long-distance discourse relations or syntactic coordination. This will be evaluated in a shared task challenge for generation in interactive systems, hosted by Heriot-Watt University.
In sum, this project will further our understanding of domain-independent language generation and deliver substantial, novel resources to support future research in this area (in the form of code and data), as well as practical implementations of NLG systems in a wide range of domains, from weather reports to natural language interfaces.
|
Key Findings |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Potential use in non-academic contexts |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Impacts |
Description |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
Summary |
|
Date Materialised |
|
|
Sectors submitted by the Researcher |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Project URL: |
|
Further Information: |
|
Organisation Website: |
http://www.hw.ac.uk |