Details of Grant 

EPSRC Reference: EP/X022595/1
Title: Efficient simulation and inference under approximate models of ancestry
Principal Investigator: Lohse, Dr KR
Other Investigators:
Researcher Co-Investigators:
Dr D Setter
Project Partners:
Department: Sch of Biological Sciences
Organisation: University of Edinburgh
Scheme: Standard Research
Starts: 01 October 2023
Ends: 30 September 2026
Value (£): 342,917
EPSRC Research Topic Classifications:
Statistics & Appl. Probability
EPSRC Industrial Sector Classifications:
No relevance to Underpinning Sectors
Related Grants:
EP/X024881/1
Panel History:
Panel Date: 28 Feb 2023
Panel Name: EPSRC Mathematical Sciences Prioritisation Panel February 2023
Outcome: Announced
Summary on Grant Application Form
While large whole-genome data sets are now being generated routinely for many taxa and populations, analyses of these data remain superficial and largely descriptive. To make sense of the genetic variation present in samples of genomes, we need to relate it mathematically to the evolutionary processes that generated it. This requires mathematical models of genetic ancestry that are tractable, yet realistic, and general enough to capture all fundamental evolutionary forces. At a minimum, a null model of genomes sampled from a population should capture the randomness of meiotic recombination and the fact that most mutations are either neutral or deleterious, and so are likely to be removed from the population as a result of genetic drift and (background) selection. Although the ancestry for a sample of recombining genomes can be described mathematically as a graph, this full backward-in-time description does not scale to large populations and currently does not include background selection. This means that it is currently impossible to efficiently simulate genomic variation even under the simplest biologically plausible null model. Statistical inference from genomic data is even more limited: state-of-the-art statistical approaches for inferring past selection or demography from genomic data are based on crude (and extremely lossy) summaries of genome-wide variation.

This cross-disciplinary project brings together experts in computer science and mathematical biology and builds on recent breakthroughs to develop efficient approximate algorithms that accurately capture the effect of recombination and background selection on genome-wide ancestry and sequence variation. These algorithms will be implemented both as part of standard simulation software and as tools that calculate the fit of sequence data to models of past demography and selection. Such tools are fundamental for interpreting the vast volumes of genome sequence data that are now being generated across the tree of life. While the algorithms and tools to be developed are general, this project will immediately improve our ability to scan genomic data for signals of past positive selection whilst accounting for the randomness of ancestry.

Key Findings, Potential use in non-academic contexts, Impacts, Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Organisation Website: http://www.ed.ac.uk