Details of Grant

EPSRC Reference:

EP/X04162X/1

Title:

A framework for evaluating and explaining the robustness of NLP models

Principal Investigator:

Cocarascu, Dr O

Other Investigators:

Researcher Co-Investigators:

Project Partners:

Amazon.com, Inc. (International)

IBM UK Ltd

Department:

Informatics

Organisation:

King's College London

Scheme:

New Investigator Award

Starts:

01 June 2024

Ends:

30 November 2026

Value (£):

318,213

EPSRC Research Topic Classifications:

Artificial Intelligence

Computational Linguistics

EPSRC Industrial Sector Classifications:

Information Technologies

Related Grants:

Panel History:

Panel Date	Panel Name	Outcome
25 Sep 2023	EPSRC ICT Prioritisation Panel Sept 2023	Announced

Summary on Grant Application Form

The standard practice for evaluating the generalisation of supervised machine learning models in NLP tasks is to use previously unseen (i.e. held-out) data and report the performance on it using various metrics such as accuracy. Whilst metrics reported on held-out data summarise a model's performance, ultimately these results represent aggregate statistics on benchmarks and do not reflect the nuances in model behaviour and robustness when applied in real-world systems.

We propose a robustness evaluation framework for NLP models concerned with arguments and facts, which encompasses explanations for robustness failures to support systematic and efficient evaluation. We will develop novel methods for simulating real-world texts derived from existing datasets, to help evaluate the stability and consistency of models when deployed in the wild. The simulation methods will be used to challenge NLP models through text-based transformations and distribution shifts on datasets, as well as on data sub-sets that capture linguistic patterns, to provide systematic coverage of real-world linguistic phenomena. Furthermore, our framework will provide insights into a model's robustness by generating explanations for robustness failures along the lexical, morphological, and syntactic dimensions, extracted from the various dataset simulations and data sub-sets, thus departing from current approaches that solely provide a metric to quantify robustness. We will focus on two NLP research areas, argument mining and fact verification; however, several simulation methods and the robustness explanations are also applicable to other NLP tasks.
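
As a rough illustration only, the idea of challenging a model with text-based transformations can be sketched in Python as below. This is not the project's framework. The transformations (`uppercase`, `swap_leading_chars`), the keyword-based `toy_model`, the four-sentence dataset, and the `robustness_report` helper are all hypothetical stand-ins chosen to keep the sketch self-contained. It compares accuracy on the original and transformed data and lists the inputs whose prediction changed, which is the raw material a failure explanation might start from.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, int]         # (text, gold label)
Model = Callable[[str], int]      # maps a text to a predicted label
Transform = Callable[[str], str]  # maps a text to a perturbed text


def uppercase(text: str) -> str:
    """Hypothetical casing shift: upper-case the whole text."""
    return text.upper()


def swap_leading_chars(text: str) -> str:
    """Hypothetical typo simulation: swap the first two characters
    of every word longer than three characters."""
    words = []
    for word in text.split():
        if len(word) > 3:
            word = word[1] + word[0] + word[2:]
        words.append(word)
    return " ".join(words)


def accuracy(model: Model, data: List[Example]) -> float:
    """Fraction of examples whose prediction matches the gold label."""
    return sum(model(text) == label for text, label in data) / len(data)


def robustness_report(model: Model, data: List[Example],
                      transforms: Dict[str, Transform]) -> Dict[str, dict]:
    """For each transformation, report accuracy on the transformed data,
    prediction consistency with the original inputs, and the original
    texts whose prediction flipped."""
    report = {"original": {"accuracy": accuracy(model, data)}}
    for name, transform in transforms.items():
        perturbed = [(transform(text), label) for text, label in data]
        flipped = [text for text, _ in data
                   if model(text) != model(transform(text))]
        report[name] = {
            "accuracy": accuracy(model, perturbed),
            "consistency": 1 - len(flipped) / len(data),
            "flipped": flipped,
        }
    return report


def toy_model(text: str) -> int:
    """Toy stand-in for a classifier (an assumption, not a real model):
    predicts 1 only if the exact lower-case token 'true' is present."""
    return 1 if "true" in text.split() else 0


# Invented held-out data for the sketch.
data = [
    ("the claim is true", 1),
    ("the claim is false", 0),
    ("this is true indeed", 1),
    ("not supported at all", 0),
]

report = robustness_report(
    toy_model,
    data,
    {"uppercase": uppercase, "swap_leading_chars": swap_leading_chars},
)
for name, stats in report.items():
    print(name, stats)
```

In this toy setting the model is perfect on the held-out data but loses accuracy under both transformations, and the flipped inputs show that it depends on one exact surface form. An aggregate metric on the original data alone would not reveal this.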

Key Findings

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Potential use in non-academic contexts

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Impacts

Description	This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised

Sectors submitted by the Researcher

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Project URL:

Further Information:

Organisation Website: