
Details of Grant 

EPSRC Reference: EP/X03917X/1
Title: Robust and Efficient Model-based Reinforcement Learning
Principal Investigator: Bogunovic, Dr I
Other Investigators:
Researcher Co-Investigators:
Project Partners:
MediaTek
UK Atomic Energy Authority
Department: Electronic and Electrical Engineering
Organisation: UCL
Scheme: New Investigator Award
Starts: 01 December 2023
Ends: 30 November 2026
Value (£): 398,101
EPSRC Research Topic Classifications:
Artificial Intelligence
EPSRC Industrial Sector Classifications:
Related Grants:
Panel History:
Panel Date: 03 Jul 2023
Panel Name: EPSRC ICT Prioritisation Panel July 2023
Outcome: Announced
Summary on Grant Application Form
Reinforcement learning (RL) is concerned with training data-driven agents to make decisions. In particular, an RL agent interacting with an environment needs to learn an optimal policy, i.e., which actions to take in different states to maximize its rewards. Recently, RL has become one of the most prominent areas of machine learning, since RL methods have tremendous potential for solving complex tasks across various fields (e.g., autonomous driving, nuclear fusion, healthcare, and hardware design). However, a number of challenges still stand in the way of its widespread adoption. Contemporary RL algorithms are often data-intensive and lack robustness guarantees. Established (deep) RL approaches require vast amounts of data, which is readily available in some environments (e.g., video games) but often not in real-world tasks where data acquisition is costly. Another major challenge is to deploy the learned control policies in the real world while ensuring reliable, robust, and safe performance. This research aims to provide practical model-based RL algorithms with rigorous statistical and robustness guarantees. This is significant in safety-critical applications where obtaining data is expensive; in nuclear fusion, for example, policies to control plasmas are learned via expensive simulators. The key novelty will be to incorporate versatile notions of robustness into model-based RL, enabling its broad adoption across different applications and domains.
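As an illustration only (not part of the grant), the core RL objective described above — learning which action to take in each state to maximize reward — can be sketched with value iteration on an invented two-state MDP. All transition probabilities and rewards below are made up for this sketch.

```python
import numpy as np

# Illustration only: value iteration on an invented two-state, two-action MDP.
# State 1 yields reward under action 0; action 1 moves between the states.
P = np.zeros((2, 2, 2))          # P[s, a, s'] = transition probability
P[0, 0] = [1.0, 0.0]             # state 0, action 0: stay
P[0, 1] = [0.0, 1.0]             # state 0, action 1: move to state 1
P[1, 0] = [0.0, 1.0]             # state 1, action 0: stay (collect reward)
P[1, 1] = [1.0, 0.0]             # state 1, action 1: move back to state 0
R = np.array([[0.0, 0.0],        # R[s, a] = expected immediate reward
              [1.0, 0.0]])
gamma = 0.9                      # discount factor

V = np.zeros(2)
for _ in range(200):             # Bellman optimality updates to convergence
    Q = R + gamma * (P @ V)      # Q[s, a] = R[s, a] + gamma * E[V(s')]
    V = Q.max(axis=1)
policy = Q.argmax(axis=1)        # greedy (optimal) action in each state
```

With these invented numbers the optimal policy is to move to state 1 and stay there, and V converges to [9, 10] for gamma = 0.9.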

This project focuses on designing algorithms that use powerful non-linear statistical models to learn about the world and that can tackle the large state spaces present in modern RL tasks. The focus is on obtaining near-optimal policies that are robust to distributional shifts in the environment dynamics and to (adversarial) data corruptions and outliers, and that satisfy application-dependent safety constraints during exploration. A major contribution will be novel, rigorous statistical sample complexity guarantees for the designed algorithms that characterize convergence to optimal robust and safe policies. The obtained guarantees will be efficient in the sense of being independent of the number of states, and hence applicable to complex applications. This will require designing new robust estimators and confidence intervals for popular statistical models. Moreover, the project will produce a complete testbed of distributional shifts and attacking strategies for benchmarking the robustness of standard and novel robust RL algorithms. This project will be among the first contributions to achieve both robustness and efficiency in model-based RL (MBRL) by providing practical algorithms that can be readily applied to emerging, impactful real-world tasks such as robust control of nuclear plasmas (an exciting and promising path toward sustainable energy) and efficient discovery of system-on-chip designs.
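To give a flavor of the robustness-to-distributional-shift goal described above, here is a hedged sketch (not the project's algorithm) of robust value iteration: the transition model is only known to lie in a small ambiguity set, and the agent plans against the worst-case model in that set. All numbers are invented for illustration.

```python
import numpy as np

# Hedged sketch: robust value iteration over an (s, a)-rectangular ambiguity
# set with two candidate transition models, nominal and shifted. The agent
# plans against the per-(s, a) worst case. Numbers are invented.
P_nom = np.zeros((2, 2, 2))      # P[s, a, s'] = transition probability
P_nom[0, 0] = [1.0, 0.0]
P_nom[0, 1] = [0.0, 1.0]
P_nom[1, 0] = [0.0, 1.0]
P_nom[1, 1] = [1.0, 0.0]

P_shift = P_nom.copy()           # shifted dynamics: transitions sometimes fail
P_shift[0, 1] = [0.2, 0.8]
P_shift[1, 0] = [0.1, 0.9]

R = np.array([[0.0, 0.0], [1.0, 0.0]])
gamma = 0.9
models = [P_nom, P_shift]

V = np.zeros(2)
for _ in range(200):
    # Robust Bellman update: worst-case expected value over the model set
    Q = R + gamma * np.min([P @ V for P in models], axis=0)
    V = Q.max(axis=1)
policy = Q.argmax(axis=1)        # robust-optimal action in each state
```

In this toy case the robust policy coincides with the nominal one, but its guaranteed value is lower (V(1) drops from 10 to roughly 9.01), illustrating the price paid for robustness against worst-case dynamics.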

Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: