Details of Grant

EPSRC Reference:

EP/I032916/1

Title:

An integrated model of syntactic and semantic prediction in human language processing

Principal Investigator:

Keller, Professor F

Other Investigators:

Lapata, Professor M

Researcher Co-Investigators:

Project Partners:

Department:

School of Informatics

Organisation:

University of Edinburgh

Scheme:

Standard Research

Starts:

15 September 2011

Ends:

28 February 2015

Value (£):

329,562

EPSRC Research Topic Classifications:

Artificial Intelligence	Cognitive Science Appl. in ICT
Comput./Corpus Linguistics	Human Communication in ICT

EPSRC Industrial Sector Classifications:

No relevance to Underpinning Sectors

Related Grants:

Panel History:

Panel Date	Panel Name	Outcome
15 Mar 2011	EPSRC ICT Responsive Mode - Mar 2011	Announced

Summary on Grant Application Form

When humans process language, they do so incrementally: they compute the meaning of a sentence on a word-by-word basis, rather than waiting until they reach the end of the sentence. As a consequence, readers and listeners have to constantly update their interpretations as new input becomes available. Experimental evidence shows that they also make predictions about upcoming input: for example, when hearing a verb such as eat, the listener predicts that an object such as soup is likely to follow. The prediction process has two components: syntactic prediction, i.e., the structure of the upcoming input is anticipated (after eat, an object is likely, but a subject isn't), and semantic prediction, i.e., the meaning of the upcoming input is anticipated (after eat, a noun referring to edible things is likely, but one referring to abstract things isn't).
Previous research has developed computational models of either syntactic or semantic prediction in human sentence processing. But there are currently no models that capture both processes in a single framework, despite clear experimental evidence that humans rely on both types of information when generating predictions. The aim of this project is to develop a model of human sentence processing that integrates syntactic and semantic prediction; such a model will not only make it possible to investigate an important theoretical question in psycholinguistics, but also has important potential applications in natural language processing.
Our model will bring together two key approaches in sentence processing. On the syntactic side, we will develop an incremental, probabilistic parser that generates syntactic predictions. This parser will be based on an extension of the Tree-adjoining Grammar (TAG) formalism, which in previous work has been shown to capture prediction data. The parser will be combined with a distributional model of semantics, which is the standard way of modeling word meaning in cognitive science; we will extend this model to capture sentential meaning, thus making it amenable to integration with a parser. Three distinct ways of achieving such an integration will be pursued, each corresponding to a theoretical position in psycholinguistics: the autonomous processing view, which holds that syntax and semantics operate independently; the syntax-first view, which holds that semantic processing has access to syntax, but not vice versa; and the interactive processing view, according to which the two components freely exchange information.
By implementing these three approaches, and evaluating the resulting predictions against data from eye-tracking and priming experiments, we will be able to shed light on a key question in psycholinguistics, viz., how syntactic and semantic processing interact.
Apart from this theoretical contribution, the project also has a practical aim: a computational model of human sentence processing can be used to determine which parts of a text are hard to understand. This information can be used to provide feedback to human writers, score essays, or correct the output of automatic language generation systems. In order to assess the potential for such applications, we will focus on one particular problem, viz., text simplification. We will develop a system that takes input text and makes it easier to read, e.g., for language-impaired readers or for language learners. Our integrated model of syntax and semantics will be used to pinpoint the difficult parts of a text, which will then be replaced by simplified passages using a technique called integer linear programming, which has previously been used successfully for text rewriting. The resulting simplified texts will be evaluated for their intelligibility in studies with human readers.

Key Findings

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Potential use in non-academic contexts

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Impacts

Description	This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised

Sectors submitted by the Researcher

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Project URL:

Further Information:

Organisation Website:

http://www.ed.ac.uk