Details of Grant 

EPSRC Reference: EP/L026775/1
Title: Nonparametric Learning for Situated Data-to-Text Generation: Helping People to Understand Uncertain Data
Principal Investigator: Rieser, Professor V
Other Investigators:
Researcher Co-Investigators:
Project Partners:
Department: School of Mathematical and Computer Sciences
Organisation: Heriot-Watt University
Scheme: First Grant - Revised 2009
Starts: 01 September 2014
Ends: 16 September 2016
Value (£): 98,425
EPSRC Research Topic Classifications:
Artificial Intelligence
Comput./Corpus Linguistics
EPSRC Industrial Sector Classifications:
No relevance to Underpinning Sectors
Related Grants:
Panel History:
Panel Date | Panel Name | Outcome
09 Apr 2014 | EPSRC ICT Responsive Mode - Apr 2014 | Announced
Summary on Grant Application Form
Information overload is a pervasive problem in many environments, particularly those in which human decision making is based on extensive data sets. Data-to-text systems have been shown to address this problem successfully by automatically generating textual descriptions of the underlying data. However, when translating (numerical) data into words, an appropriate level of precision needs to be chosen. The following example is from a system which summarises medical time series data for neonatal care: "At 17:24 T1 is 35.7 and T2 is 34.5C" (Gatt et al., 2009). This summary is clearly targeted at experts, such as doctors or nurses, who need precise information for decision making. However, other users, such as visiting parents, might be happier with a description such as "In the evening your baby had normal temperature."

In this project, we will build a data-to-text system that automatically determines the appropriate level of precision for a given context by using statistical machine learning methods. These methods can learn an optimal generation policy from real data and promise to be more robust to new situations than rules hand-written by human experts.

We will also investigate novel feedback-based non-parametric state estimation methods to reduce the data annotation cost for data-to-text systems. Typically, the first step in creating such systems is to manually interpret and align the raw data sources. However, this step is very costly, as human experts need to be trained for this task. Our new methods promise to allow data-to-text systems to be applied rapidly to new domains.

The domain we will be targeting for this initial project is pedestrian navigation, where the task is to translate uncertain user positions into walking instructions. The underlying data uncertainty here arises from several sources, such as the user's speech signal, the GPS location, estimated viewshed, walking direction and speed. We will test our learnt data-to-text generation strategy by integrating it into an existing system and running an evaluation with real users.

One of the outcomes of this project is a data-driven linguistic view on the question of "how to communicate uncertainty", an active interdisciplinary research area involving researchers from medicine, law, environmental modelling and climate change.

In future work we will also investigate how the proposed framework transfers to new domains, such as natural language generation from medical data, weather forecasts, or output from complex environmental models.

Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Description: This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Date Materialised
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.hw.ac.uk