EPSRC Reference: |
EP/J011940/1 |
Title: |
Dynamic pattern matching: Faster Algorithms and New Bounds |
Principal Investigator: |
Clifford, Dr R |
Other Investigators: |
|
Researcher Co-Investigators: |
|
Project Partners: |
|
Department: |
Computer Science |
Organisation: |
University of Bristol |
Scheme: |
Standard Research |
Starts: |
01 January 2012 |
Ends: |
31 December 2014 |
Value (£): |
289,045
|
EPSRC Research Topic Classifications: |
Fundamentals of Computing |
|
|
EPSRC Industrial Sector Classifications: |
No relevance to Underpinning Sectors |
|
|
Related Grants: |
|
Panel History: |
Panel Date | Panel Name | Outcome |
07 Dec 2011
|
EPSRC ICT Responsive Mode - Dec 2011
|
Announced
|
|
Summary on Grant Application Form |
This project aims to provide the tools necessary to address the challenges that arise from processing massive datasets which are themselves dynamic. That is to say the data changes, perhaps very rapidly, over time. Where traditional ideas have found sophisticated methods for indexing and searching very large sets of static information, much of today's applications, from high frequency finance to the indexing of the world wide web, requires us to be able to cope with dynamic change.
Pattern matching algorithms are central to the modern digital world. From word processing and web search to computational genetics and high finance, the ability to find patterns efficiently and accurately is of fundamental importance to modern industry. However, we are now witnessing a fundamental change not only in the ways data are being processed but also in the sheer size of commonly held datasets. The public genome sequencing projects, for example, have produced hundreds of gigabytes of sequence and related meta data and web search tools must answer billions of queries a day, each within a fraction of a second. Telecommunication companies collect terabytes of data every day which is itself only a tiny proportion of the data that travels over their networks. The types of data that are being stored in massive volumes online are also increasing, from audio to still images, streaming video and entire library collections.
As examples of the challenge we face, in telecommunications not only is the total data size massive, it streams at such a rate that we can only afford to store a small sketch of the input at any time and yet must still answer sophisticated queries of the data as quickly as possible. In web search, as well as the arrival of new pages, any part the existing data may change without notice. In all cases, we must be able handle such changes quickly, efficiently and with minimum cost. This proposal will allow us to develop fast algorithms to perform pattern matching under these new conditions. We will consider a range of scenarios, even taking into account multiple simultaneous streams of data or situations where previously successfully processed data can be modified at any point and at any time.
The second and complementary topic for this project is that of showing time and space lower bounds for the problems we have discussed. Where ever faster and more efficient algorithms are developed by the algorithms community, our understanding of the limits of what can be achieved, even in principle, is currently at a much more basic level. Such lower bounds, were they available, would not only be important for the significant theoretical interest and insight they provide, but also for the practical purposes of preventing fruitless search for algorithms which cannot exist. Within computer science in general, lower bounds have historically proven hard to develop but in the context of streaming and dynamic computation new ideas can now be applied. These results will provide a framework within which future algorithmic research can be carried out, therefore saving many hours of potentially fruitless search for methods we have shown cannot exist.
|
Key Findings |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Potential use in non-academic contexts |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Impacts |
Description |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
Summary |
|
Date Materialised |
|
|
Sectors submitted by the Researcher |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Project URL: |
|
Further Information: |
|
Organisation Website: |
http://www.bris.ac.uk |