EPSRC Reference: 
EP/V04771X/1 
Title: 
Overcoming the curse of dimensionality in dynamic programming by tensor decompositions 
Principal Investigator: 
Dolgov, Dr S 
Other Investigators: 

Researcher CoInvestigators: 

Project Partners: 

Department: 
Mathematical Sciences 
Organisation: 
University of Bath 
Scheme: 
Standard Research  NR1 
Starts: 
10 May 2021 
Ends: 
09 May 2023 
Value (£): 
202,431

EPSRC Research Topic Classifications: 
Nonlinear Systems Mathematics 


EPSRC Industrial Sector Classifications: 
No relevance to Underpinning Sectors 


Related Grants: 

Panel History: 

Summary on Grant Application Form 
Almost 60 years ago Richard Bellman coined the expression ``curse of dimensionality'' when referring to the overwhelming computational complexity associated with the solution of multistage decision processes through dynamic programming (DP), leading to the wellknown Bellman equation.
Nowadays the curse of dimensionality has become an ubiquitous expression shared in different fields such as numerical analysis, compressed sensing and machine learning. However, it is in the computation of optimal feedback laws for the control of dynamical systems where its meaning continues to be most evident.
Consider a simple pendulum, which is characterised by two variables, the position of the mass, and its velocity.
A classical demonstration is the stabilisation of the pendulum in the unstable upwards position by moving the base adaptively.
To synthesize the control signal for the actuator of the base that would be robust to stochastic fluctuations (e.g. from the air),
one needs to compute a feedback map as a twodimensional function of the position and velocity.
This feedback map satisfies the twodimensional partial differential equation (PDE), which has no analytic solution in general.
Solving it numerically requires a discretisation of both the position and velocity with some n points.
However, the total number of unknowns in the feedback map is n^2, corresponding to all combinations of the position and velocity points,
and therefore the computations take longer.
For d variables, the complexity of the straightforward numerical solutions grows exponentially as n^d.
Even a simple quadrocopter model is described by 12 variables, and modern applications in particle physics or opinion dynamics require optimal control of dynamical systems, which are in turn PDEs, discretised with hundreds or thousands of variables.
However, such systems are often very special in the sense that each variable is driven effectively by only neighbouring variables in the dynamics.
In this case, the sought feedback map admits a tensor approximation with (almost) separated variables.
This allows us to reduce the computational complexity dramatically, to a number of operations growing only polynomially with the number of variables, albeit at a price of introducing some error.
This error can be further corrected by using methods from reinforcement learning, similar to those that made the winning artificial Go player.

Key Findings 
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Potential use in nonacademic contexts 
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Impacts 
Description 
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk 
Summary 

Date Materialised 


Sectors submitted by the Researcher 
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Project URL: 

Further Information: 

Organisation Website: 
http://www.bath.ac.uk 