EPSRC Reference: |
EP/K01501X/1 |
Title: |
Advanced Stochastic Computation for Inference from Tree, Graph and Network Models |
Principal Investigator: |
De Iorio, Professor M |
Other Investigators: |
|
Researcher Co-Investigators: |
|
Project Partners: |
|
Department: |
Statistical Science |
Organisation: |
UCL |
Scheme: |
Standard Research |
Starts: |
01 October 2013 |
Ends: |
07 October 2017 |
Value (£): |
401,382
|
EPSRC Research Topic Classifications: |
Statistics & Appl. Probability |
|
|
EPSRC Industrial Sector Classifications: |
Pharmaceuticals and Biotechnology |
|
|
Related Grants: |
|
Panel History: |
|
Summary on Grant Application Form |
As a result of recent experimental advances, large amounts of biological data are becoming available for humans and other organisms.
Such data pose inference problems well beyond the capabilities of standard statistical tools. Experimental advances must be accompanied by the development of suitable biostatistical and bioinformatic tools that will make efficient use of the complex data to improve our understanding of the genetic forces shaping the evolution of genome organization.
Molecular evolution and comparative genomics are no longer fields where collecting data is the main obstacle to progress. Many probabilistic models proposed in biology try to capture the evolutionary mechanisms and reflect the data-generating mechanism. Whilst these models still have limitations, there have been substantial improvements in recent years.
We contend that progress is mainly limited by a lack of adequate computational tools for extracting information from existing data and performing inference for complex models. A major bottleneck in the application of probabilistic models to biology is that their calibration is computationally expensive and, in many instances, not possible using modern techniques. Researchers therefore often prefer to use simple summary statistics to characterize the underlying biological process, an approach that is unsatisfactory. For example, topological summary statistics capture basic characteristics of binary interaction networks but are affected by several types of bias, large enough that caution must be taken when drawing conclusions. A major challenge for systems biology is to develop statistical and bioinformatics tools, within the rigorous framework of probabilistic modelling, that will allow a better and more comprehensive understanding of cellular functions.
In the last few decades a wealth of research has been performed on model-based inference for molecular data, accompanied by an explosion of research into computationally efficient methods to facilitate it. Broadly speaking, there are three main approaches to statistical inference in molecular biology: (i) importance sampling (IS) for likelihood evaluation, (ii) Markov chain Monte Carlo (MCMC) methods, and (iii) Approximate Bayesian Computation (ABC). In this proposal we concentrate on a combination of advanced IS (or, more generally, Sequential Monte Carlo (SMC)) and MCMC methods focussed upon Markov models in genetics and bioinformatics. In particular, building upon these techniques, we aim to develop a general framework for approximate inference which is theoretically sound, computationally feasible and still able to accurately reflect the complexity of the underlying stochastic model. The main life science applications on which we will concentrate are genealogical trees, protein networks and phylogenetic trees.
|
Key Findings |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Potential use in non-academic contexts |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Impacts |
Description |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
Summary |
|
Date Materialised |
|
|
Sectors submitted by the Researcher |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Project URL: |
|
Further Information: |
|
Organisation Website: |
|