EPSRC Reference: |
EP/X022595/1 |
Title: |
Efficient simulation and inference under approximate models of ancestry |
Principal Investigator: |
Lohse, Dr KR |
Other Investigators: |
|
Researcher Co-Investigators: |
|
Project Partners: |
|
Department: |
Sch of Biological Sciences |
Organisation: |
University of Edinburgh |
Scheme: |
Standard Research |
Starts: |
01 October 2023 |
Ends: |
30 September 2026 |
Value (£): |
342,917
|
EPSRC Research Topic Classifications: |
Statistics & Appl. Probability |
|
|
EPSRC Industrial Sector Classifications: |
No relevance to Underpinning Sectors |
|
|
Related Grants: |
|
Panel History: |
|
Summary on Grant Application Form |
While large whole-genome data sets are now generated routinely for many taxa and populations, analyses of these data remain superficial and largely descriptive. To make sense of the genetic variation present in samples of genomes, we need to relate it mathematically to the evolutionary processes that generated it. This requires mathematical models of genetic ancestry that are tractable yet realistic, and general enough to capture all fundamental evolutionary forces. At a minimum, a null model of genomes sampled from a population should capture the randomness of meiotic recombination and the fact that most mutations are either neutral or deleterious, and so are likely to be removed from the population by genetic drift and (background) selection. Although the ancestry of a sample of recombining genomes can be described mathematically as a graph, this full backward-in-time description does not scale to large populations and does not currently include background selection. As a result, it is currently impossible to simulate genomic variation efficiently even under the simplest biologically plausible null model. Statistical inference from genomic data is even more limited: state-of-the-art approaches for inferring past selection or demography are based on crude (and extremely lossy) summaries of genome-wide variation.
This cross-disciplinary project brings together experts in computer science and mathematical biology and builds on recent breakthroughs to develop efficient approximate algorithms that accurately capture the effect of recombination and background selection on genome-wide ancestry and sequence variation. These algorithms will be implemented both in standard simulation software and in tools that calculate the fit of sequence data to models of past demography and selection. Such tools are fundamental for interpreting the vast volumes of genome sequence data now being generated across the tree of life. While the algorithms and tools to be developed are general, this project will immediately improve our ability to scan genomic data for signals of past positive selection whilst accounting for the randomness of ancestry.
|
Key Findings |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Project URL: |
|
Further Information: |
|
Organisation Website: |
http://www.ed.ac.uk |