
Details of Grant 

EPSRC Reference: EP/V055380/1
Title: Robust and scalable Markov chain Monte Carlo for heterogeneous models
Principal Investigator: Livingstone, Dr SJ
Other Investigators:
Researcher Co-Investigators:
Project Partners:
Flatiron Institute
Department: Statistical Science
Organisation: UCL
Scheme: New Investigator Award
Starts: 04 July 2022
Ends: 31 October 2024
Value (£): 207,950
EPSRC Research Topic Classifications:
Statistics & Appl. Probability
EPSRC Industrial Sector Classifications:
No relevance to Underpinning Sectors
Related Grants:
Panel History:
Panel Date: 17 May 2021
Panel Name: EPSRC Mathematical Sciences Prioritisation Panel May 2021
Outcome: Announced
Summary on Grant Application Form
A large proportion of statistical inference tasks can be framed as either an optimisation or an integration problem. Markov chain Monte Carlo (MCMC) algorithms can be used to solve both, but are most commonly used for the latter. They have been successful in such important and diverse settings as the observation of gravitational waves, modelling the spread of infectious diseases, and predicting the results of elections from political polling data. MCMC algorithms are also popular outside of statistical inference; in particular, they are widely used for molecular dynamics simulations in statistical physics.

Despite their numerous successes, current MCMC algorithms have some known drawbacks. A prominent example is their performance when model parameters vary over very different scales and exhibit multiple levels of inter-dependence (heterogeneous models). There is an increasingly urgent need to improve this performance as high-volume and highly heterogeneous datasets become more widely available, and as researchers begin to ask progressively more nuanced questions of their data, for which heterogeneous models are needed.

The standard approach to MCMC for heterogeneous models is adaptive pre-conditioning of algorithms. Doing this naively in a high-dimensional setting comes at a significant cost: the required number of operations per algorithm step is often cubic in the number of model parameters, and the number of algorithmic tuning parameters to learn is quadratic. In addition, current state-of-the-art algorithms such as Hamiltonian and Langevin Monte Carlo work particularly poorly in combination with adaptive pre-conditioning, as has recently been shown both theoretically and experimentally by me and others. In this proposal I will attack this problem on two fronts.
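To make the cost of pre-conditioning concrete, here is a minimal Python sketch, not the method proposed in this grant. It uses plain random-walk Metropolis (a simpler stand-in for Hamiltonian or Langevin Monte Carlo) with a fixed dense pre-conditioning matrix. The function names, the three-parameter Gaussian target, and its scales are all assumptions chosen for illustration.

```python
import numpy as np


def tuning_parameter_count(d):
    """Free entries in a dense symmetric d x d pre-conditioning matrix."""
    return d * (d + 1) // 2


def preconditioned_rwm(log_target, x0, cov, n_steps, step_size, rng):
    """Random-walk Metropolis with proposal covariance step_size**2 * cov."""
    # A dense Cholesky factorisation costs O(d^3); an adaptive scheme that
    # re-estimates `cov` would have to repeat it at every adaptation.
    chol = np.linalg.cholesky(cov)
    x = np.asarray(x0, dtype=float)
    log_p = log_target(x)
    samples = np.empty((n_steps, x.size))
    accepted = 0
    for i in range(n_steps):
        proposal = x + step_size * (chol @ rng.standard_normal(x.size))
        log_p_new = log_target(proposal)
        # Metropolis accept/reject step (the proposal is symmetric).
        if np.log(rng.uniform()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
            accepted += 1
        samples[i] = x
    return samples, accepted / n_steps


if __name__ == "__main__":
    # Hypothetical heterogeneous target: independent Gaussians whose
    # standard deviations span four orders of magnitude.
    scales = np.array([0.01, 1.0, 100.0])

    def log_target(x):
        return -0.5 * np.sum((x / scales) ** 2)

    rng = np.random.default_rng(0)
    _, acc_plain = preconditioned_rwm(
        log_target, np.zeros(3), np.eye(3), 2000, 1.0, rng)
    _, acc_precond = preconditioned_rwm(
        log_target, np.zeros(3), np.diag(scales ** 2), 2000, 1.0, rng)
    print("acceptance without pre-conditioning:", acc_plain)
    print("acceptance with pre-conditioning:", acc_precond)
    print("tuning parameters for d = 1000:", tuning_parameter_count(1000))
```

Here the pre-conditioner is set to the known target covariance. In practice it must be learned during sampling, which is where the quadratic number of tuning parameters and the cubic per-step cost arise.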

In the first work package I will develop and study a new suite of MCMC algorithms that are specifically tailored to heterogeneous models. I will do this by designing algorithms based on the recently derived class of Markov processes termed 'locally-balanced', for which there is considerable evidence of improved robustness to model heterogeneity. I will provide a rigorous foundation for this class of Markov processes, establish key theoretical properties on convergence to equilibrium and optimality, and then design new algorithms based on this class of processes, each tailored towards specific application areas of known interest.
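As a rough illustration of the 'locally-balanced' idea, the Python sketch below follows one formulation used in the literature on discrete state spaces. Each neighbouring state y of the current state x is weighted by g(pi(y)/pi(x)), where the balancing function g satisfies g(t) = t g(1/t). Whether the algorithms developed in this work package take this form is an assumption; the function names and the toy target are hypothetical. The sketch builds only the proposal distribution and omits any accept/reject correction.

```python
import numpy as np


def balancing_sqrt(t):
    """Square-root balancing function: sqrt(t) = t * sqrt(1/t)."""
    return np.sqrt(t)


def balancing_barker(t):
    """Barker balancing function: t/(1+t) = t * (1/t)/(1 + 1/t)."""
    return t / (1.0 + t)


def locally_balanced_probs(log_pi, x, neighbours, g):
    """Proposal probabilities over `neighbours` of state `x`.

    Each neighbour y receives weight g(pi(y) / pi(x)); the weights are
    then normalised so that they sum to one.
    """
    log_ratios = np.array([log_pi(y) for y in neighbours]) - log_pi(x)
    weights = g(np.exp(log_ratios))
    return weights / weights.sum()


if __name__ == "__main__":
    # Hypothetical target on the integers, proportional to exp(-k^2 / 2).
    def log_pi(k):
        return -0.5 * k ** 2

    # From state 2, the neighbour nearer the mode (state 1) is favoured.
    for g in (balancing_sqrt, balancing_barker):
        print(g.__name__, locally_balanced_probs(log_pi, 2, [1, 3], g))
```

Both balancing functions direct more proposal mass towards higher-probability neighbours. They differ in how strongly they do so.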

In the second work package I will develop new, theoretically grounded methodology for scalable adaptive pre-conditioning of algorithms. I will do this in part by taking inspiration from the literature on sparse estimation of covariance matrices for high-dimensional datasets. I will design methods that are scalable to high-dimensional settings and for which theoretical guarantees can be established, to provide a clear indication of expected performance gains. This should improve the applicability of existing state-of-the-art methods such as Hamiltonian Monte Carlo to high-dimensional and heterogeneous models.

The proposal places a strong emphasis on integrating the new methods into widely used statistical software. To this end, I have planned collaborations with the founding developers of the 'Stan' statistical programming language, which has over 100,000 users, and have detailed plans to create bespoke open-source packages for R and Python. I also outline plans to work closely with data scientists to apply the new methodology in many prominent application areas.
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description: This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk