When we read a story, we build up a mental model of what it describes. Such mental models are crucial for reading comprehension: they allow us to relate the story to our earlier experiences, to make inferences that require combining information from different sentences, and to interpret ambiguous sentences correctly. Crucially, mental models capture more information than what is literally mentioned in the story. They are representations of the situations being described, rather than of the text itself, and they are constructed by combining the story text with our commonsense understanding of how the world works.
The field of Natural Language Processing (NLP) has made rapid progress in the last few years, but the focus has largely been on sentence-level representations. Stories, such as news articles, social media posts or medical case reports, are essentially modelled as collections of sentences. As a result, current systems struggle with the ambiguity of language, since the correct interpretation of a word or sentence can often only be inferred by taking its broader story context into account. They are also severely limited in their ability to solve problems where information from different sentences needs to be combined. Finally, current systems struggle to identify correspondences between related stories (e.g. different news articles about the same event), especially when these are written from different perspectives.
To address these fundamental challenges, we need a method to learn story-level representations that can act as an analogue to mental models. Intuitively, learning such story representations involves two steps: first, modelling what is literally mentioned in the story, and then applying some form of commonsense reasoning to fill in the gaps. In practice, however, these two steps are closely interrelated: interpreting what is mentioned in the story requires a model of the story context, but constructing this model in turn requires an interpretation of what is mentioned.
The solution I propose in this fellowship is based on representations called story graphs. These story graphs encode the events that occur, the entities involved, and the relationships that hold between these entities and events. A story can then be viewed as an incomplete specification of a story graph, similar to how a symbolic knowledge base corresponds to an incomplete specification of a possible world. Based on this view, we will rely on (weighted) logical encodings to represent what we know about a given story. In particular, these encodings will serve as a compact representation of a ranking over possible story graphs, i.e. over possible interpretations of the story.
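To make the idea of ranking story graphs concrete, the following sketch shows one possible encoding in Python. It is purely illustrative: the types (Node, Edge, StoryGraph), the example story, the orderer_pays rule and its weight are hypothetical stand-ins, and the scoring scheme (summing the weights of satisfied rules, in the spirit of Markov logic) is just one way a weighted logical encoding could induce a ranking over interpretations.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    name: str
    kind: str  # "event" or "entity"

@dataclass(frozen=True)
class Edge:
    source: Node
    label: str
    target: Node

@dataclass
class StoryGraph:
    nodes: set
    edges: set

def score(graph: StoryGraph, weighted_rules) -> float:
    # Markov-logic-style scoring: a candidate graph earns the weight of
    # every rule it satisfies; higher scores mean more plausible readings.
    return sum(w for rule, w in weighted_rules if rule(graph))

def rank(candidates, weighted_rules):
    """Rank candidate story graphs from most to least plausible."""
    return sorted(candidates, key=lambda g: score(g, weighted_rules), reverse=True)

# Two interpretations of "Mary ordered a pizza. She paid and left."
mary, pizza = Node("Mary", "entity"), Node("pizza", "entity")
order, pay = Node("order", "event"), Node("pay", "event")
someone = Node("someone", "entity")  # an unresolved referent for "she"

g_coref = StoryGraph({mary, pizza, order, pay},
                     {Edge(order, "agent", mary), Edge(order, "theme", pizza),
                      Edge(pay, "agent", mary)})
g_other = StoryGraph({mary, pizza, order, pay, someone},
                     {Edge(order, "agent", mary), Edge(order, "theme", pizza),
                      Edge(pay, "agent", someone)})

def orderer_pays(g: StoryGraph) -> bool:
    # A commonsense rule: whoever orders something typically also pays.
    orderers = {e.target for e in g.edges if e.source == order and e.label == "agent"}
    payers = {e.target for e in g.edges if e.source == pay and e.label == "agent"}
    return bool(orderers & payers)

best = rank([g_coref, g_other], [(orderer_pays, 1.5)])[0]  # the coreferent reading wins
```

In the full framework, such hand-written rules would of course be replaced by learned inference patterns, as discussed next.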
To reason about story graphs, I propose an innovative combination of neural networks with systematic reasoning. The key idea is to use focused inference patterns that are encoded as graph neural networks. The predictions of these neural networks will essentially play the same role as rule applications in symbolic AI frameworks. In this way, our method will tightly integrate the generalisation abilities and flexibility of neural networks with the advantages of a principled and interpretable high-level reasoning process.
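As a rough illustration of what a focused inference pattern might look like, the sketch below (assuming PyTorch; the class name, architecture and threshold are hypothetical, not the proposal's actual design) scores a candidate edge between two story-graph nodes. Predictions above a confidence threshold are added to the graph and the patterns are re-applied, mirroring forward chaining with symbolic rules.

```python
import torch
import torch.nn as nn

class InferencePattern(nn.Module):
    """A focused inference pattern: given the embeddings of two story-graph
    nodes, predict whether an inferred relation should link them. Each
    confident prediction plays the role of a rule application."""

    def __init__(self, dim: int):
        super().__init__()
        self.message = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())
        self.score = nn.Linear(dim, 1)

    def forward(self, head: torch.Tensor, tail: torch.Tensor) -> torch.Tensor:
        # head, tail: (batch, dim) node embeddings for candidate pairs
        m = self.message(torch.cat([head, tail], dim=-1))
        return torch.sigmoid(self.score(m))  # probability that the edge holds

pattern = InferencePattern(dim=64)
head = torch.randn(1, 64)  # e.g. the embedding of an "order" event
tail = torch.randn(1, 64)  # e.g. the embedding of a "pay" event
if pattern(head, tail).item() > 0.9:
    # Add the inferred edge to the story graph and re-apply the patterns,
    # the neural analogue of forward chaining.
    ...
```

Because each pattern fires as a discrete, inspectable step, the overall derivation remains interpretable even though the individual steps are learned.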
The proposed framework will allow us to reason about textual information in a principled way. It will lead to significant improvements in NLP tasks that require a commonsense understanding of the situations being described, or that require combining information from multiple sentences or documents. It will furthermore enable a step change in applications that directly rely on structured text representations, such as situational understanding, information retrieval systems for the legal, medical and news domains, and tools for inferring business insights from news stories and social media feeds.