EPSRC logo

Details of Grant 

EPSRC Reference: EP/X039447/1
Title: Computing over Compressed Graph-Structured Data
Principal Investigator: Wild, Dr S
Other Investigators:
Researcher Co-Investigators:
Project Partners:
Department: Computer Science
Organisation: University of Liverpool
Scheme: New Investigator Award
Starts: 29 February 2024 Ends: 28 February 2027 Value (£): 415,344
EPSRC Research Topic Classifications:
Fundamentals of Computing
EPSRC Industrial Sector Classifications:
No relevance to Underpinning Sectors
Related Grants:
Panel History:
Panel DatePanel NameOutcome
03 Jul 2023 EPSRC ICT Prioritisation Panel July 2023 Announced
Summary on Grant Application Form
The project aims to bring computation over compressed data to massive graph-structured datasets by extending optimally-compressed tree data structures we developed to certain classes of graphs. Graph-structured datasets such as knowledge graphs or social networks are growing in importance and size; at the same time, computation is increasingly pushed to mobile devices with limited memory capacity. Many applications yield large, but partially repetitive and predictable datasets, which makes them compressible; but on mobile devices, data is only useful when it can be queried directly in a compressed representation that fits into the device memory. Current methods for computing over compressed data do not yet work well for this scenario.

In order to enable queries on compressed graph-structured data we need to answer three research questions.

1. We need to know the intrinsic information content of graph-structured data so that we can decide whether a dataset can be sufficiently compressed to fit into local memory.

2. We need to know how to effectively compress graph-structured data, so that we can economically transmit and store graph-structured data on mobile devices.

3. We need to know how to answer queries on a compressed representation, so that we can make effective use of its compressibility while querying over a graph-structured dataset.

This project will combine methods from information theory, data compression, and succinct data structures, to carry out three work packages.

1. We will propose new notions of random sources and empirical entropy in order to approximate the intrinsic information content of graph-structured data.

2. We will develop new compression methods based on probabilistic context-free grammars (PCFGs) and probabilistic multiple context-free grammars (PMCFGs) in order to effectively compress graph-structured data.

3. We will apply and extend our tools for succinct tree data structures to new types of graphs and RNA structure data in order to enable computing directly over compressed graph-structured data.

We will use the outcomes of the work packages to create a versatile toolbox of space-efficient data structures to ease the development of applications working with massive graph-structured datasets.
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.liv.ac.uk