EPSRC Reference: |
EP/X039447/1 |
Title: |
Computing over Compressed Graph-Structured Data |
Principal Investigator: |
Wild, Dr S |
Other Investigators: |
|
Researcher Co-Investigators: |
|
Project Partners: |
|
Department: |
Computer Science |
Organisation: |
University of Liverpool |
Scheme: |
New Investigator Award |
Starts: |
01 December 2023 |
Ends: |
30 November 2026 |
Value (£): |
415,344
|
EPSRC Research Topic Classifications: |
Fundamentals of Computing |
|
|
EPSRC Industrial Sector Classifications: |
No relevance to Underpinning Sectors |
|
|
Related Grants: |
|
Panel History: |
Panel Date | Panel Name | Outcome |
03 Jul 2023
|
EPSRC ICT Prioritisation Panel July 2023
|
Announced
|
|
Summary on Grant Application Form |
The project aims to bring computation over compressed data to massive graph-structured datasets by extending optimally-compressed tree data structures we developed to certain classes of graphs. Graph-structured datasets such as knowledge graphs or social networks are growing in importance and size; at the same time, computation is increasingly pushed to mobile devices with limited memory capacity. Many applications yield large, but partially repetitive and predictable datasets, which makes them compressible; but on mobile devices, data is only useful when it can be queried directly in a compressed representation that fits into the device memory. Current methods for computing over compressed data do not yet work well for this scenario.
In order to enable queries on compressed graph-structured data we need to answer three research questions.
1. We need to know the intrinsic information content of graph-structured data so that we can decide whether a dataset can be sufficiently compressed to fit into local memory.
2. We need to know how to effectively compress graph-structured data, so that we can economically transmit and store graph-structured data on mobile devices.
3. We need to know how to answer queries on a compressed representation, so that we can make effective use of its compressibility while querying over a graph-structured dataset.
This project will combine methods from information theory, data compression, and succinct data structures, to carry out three work packages.
1. We will propose new notions of random sources and empirical entropy in order to approximate the intrinsic information content of graph-structured data.
2. We will develop new compression methods based on probabilistic context-free grammars (PCFGs) and probabilistic multiple context-free grammars (PMCFGs) in order to effectively compress graph-structured data.
3. We will apply and extend our tools for succinct tree data structures to new types of graphs and RNA structure data in order to enable computing directly over compressed graph-structured data.
We will use the outcomes of the work packages to create a versatile toolbox of space-efficient data structures to ease the development of applications working with massive graph-structured datasets.
|
Key Findings |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Potential use in non-academic contexts |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Impacts |
Description |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
Summary |
|
Date Materialised |
|
|
Sectors submitted by the Researcher |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
|
Project URL: |
|
Further Information: |
|
Organisation Website: |
http://www.liv.ac.uk |