Details of Grant 

EPSRC Reference: EP/V008331/1
Title: Deep Learning for Time-Inconsistent Dynamic Optimization
Principal Investigator: Zheng, Professor H
Other Investigators:
Researcher Co-Investigators:
Project Partners:
Department: Mathematics
Organisation: Imperial College London
Scheme: Standard Research
Starts: 01 July 2021
Ends: 30 June 2024
Value (£): 464,268
EPSRC Research Topic Classifications:
Mathematical Aspects of OR
EPSRC Industrial Sector Classifications:
Information Technologies
Related Grants:
Panel History:
Panel Date: 31 Aug 2020
Panel Name: EPSRC Mathematical Sciences Prioritisation Panel September 2020
Outcome: Announced
Summary on Grant Application Form
The proposed research aims to solve so-called time-inconsistent (TI) dynamic optimization problems, which address decision making in the presence of inconsistent and often conflicting human behaviour, e.g., the long-term health benefit of stopping smoking versus the instant pleasure of satisfying nicotine cravings. Solving TI dynamic optimization problems can have far-reaching impact, from consumer behaviour to social welfare policy. Decision making under the TI framework is completely different from that in standard optimization and economic theory under the rational behaviour assumption. Results for TI dynamic optimization are few and far between. The main bottleneck is computation, owing to the need to solve systems of high-dimensional nonlinear partial differential equations and forward-backward stochastic differential equations. The project will develop the fundamental theory and novel methodology to solve TI dynamic optimization problems by integrating deep reinforcement learning (DRL) from data science with advanced mathematical theories such as convex analysis and dual stochastic control. A breakthrough in solving TI dynamic optimization can have great impact in applications. One example is asset allocation: many financial institutions use the one-period mean-variance (MV) model, which is simple to use but has many drawbacks. A multi-period or continuous-time model is more realistic for stochastic asset price processes and better fits the dynamic nature of the economy, but is TI and difficult to solve. The findings of the project can help solve continuous-time MV problems, which would improve financial asset-liability management and performance and, in turn, have great impact on societal prosperity and individual well-being.
In short, progress in TI dynamic optimization and DRL computation can greatly help industry and government agencies to improve decision making and design more efficient and powerful computational software for real-world TI problems, based on the DRL solver developed in the project.
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.imperial.ac.uk