
Details of Grant 

EPSRC Reference: EP/V010611/1
Title: StreamDG: Streaming Processing of Massive Dynamic Graphs
Principal Investigator: Konrad, Dr C
Other Investigators:
Researcher Co-Investigators:
Project Partners:
LV=GI
PRECISE Center, University of Pennsylvania
University of Massachusetts Amherst
University of Warwick
Department: Computer Science
Organisation: University of Bristol
Scheme: New Investigator Award
Starts: 01 September 2021
Ends: 31 August 2024
Value (£): 248,776
EPSRC Research Topic Classifications:
Computer Graphics & Visual.
Fundamentals of Computing
EPSRC Industrial Sector Classifications:
Financial Services
Related Grants:
Panel History:
Panel Date: 01 Oct 2020
Panel Name: EPSRC ICT Prioritisation Panel October 2020
Outcome: Announced
Summary on Grant Application Form
Massive data sets play a central role in most areas of modern life. IT giants such as Amazon evaluate large amounts of customer data in order to provide instant buying recommendations, Facebook maintains a social network of billions of monthly active users, and Google collects data from billions of webpages every day in order to answer more than 70,000 search queries every second. Areas such as healthcare, education, and finance are heavily reliant on Big Data technology.

However, extracting useful information from massive data sets is a challenging task. Since such data sets far exceed the RAM (random access memory) of modern computers, algorithms for processing them are never able to "see" the input in its entirety at any one moment. This fundamental restriction is similar to how we humans process the world around us: our senses provide us with an almost unlimited amount of information as we walk through life; however, our brains can store only a small summary of our past, and most information is lost.

Data streaming algorithms are massive data set algorithms that mimic this behaviour: a data streaming algorithm scans a massive input data set only once and maintains a summary in the computer's RAM that is much smaller than the input. The objective is to maintain summaries that are as small as possible but still represent the input data well enough to solve a specific task. This poses many interesting questions that have been the subject of research for multiple decades: Which problems admit small-space streaming algorithms? Are there problems that cannot be solved in small space, and, if so, can they at least be solved approximately?
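
To make this concrete, here is a minimal illustrative sketch in Python (not part of the grant itself) using the classic Misra-Gries frequent-items summary: the summary holds at most k counters regardless of the stream length n, yet it underestimates each item's frequency by at most n/k, so frequent items can be identified approximately in a single pass.

def misra_gries(stream, k):
    """One-pass summary of at most k-1 counters (Misra-Gries)."""
    counters = {}                      # the small in-RAM summary
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # decrement every counter; drop counters that reach zero
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

# Example: summarise a stream of 10,000 items with only 10 counters.
stream = [1, 2, 1, 3, 1, 2, 1, 4, 1, 5] * 1000
print(misra_gries(stream, k=10))

The summary's size depends only on k, not on the length of the stream, which is exactly the space-versus-accuracy trade-off described above.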

This project addresses streaming algorithms for processing massive graphs. Graphs are central objects in computer science and have countless applications. They allow us to describe relations (called edges) between entities (called nodes). For example, a social network graph consists of friendship relations (edges) between pairs of users (nodes). Most research on data streaming algorithms for processing massive graphs assumes that graphs are static objects that never change in structure or size. However, this assumption rarely holds in practice. For example, social network graphs change in both structure and size when users join or leave the network, and when new friendships are established or existing ones end. This observation yields the central questions of this project: Processing dynamic graphs is at least as hard as processing static graphs, but how much harder is it? By how much do summaries have to grow in size?
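
As a simple, hypothetical illustration of the dynamic setting (again, not part of the grant), the input below is a stream of edge insertions and deletions, and the algorithm maintains only node degrees: a summary whose size depends on the number of nodes rather than the potentially far larger number of edges.

from collections import defaultdict

def process_dynamic_graph_stream(stream):
    """Maintain node degrees over a stream of edge insertions/deletions."""
    degree = defaultdict(int)          # the small summary: one counter per node
    for op, u, v in stream:
        if op == '+':                  # edge {u, v} inserted
            degree[u] += 1
            degree[v] += 1
        elif op == '-':                # edge {u, v} deleted
            degree[u] -= 1
            degree[v] -= 1
    return dict(degree)

# Example: a friendship forms, another forms, then the first one ends.
stream = [('+', 'alice', 'bob'), ('+', 'bob', 'carol'), ('-', 'alice', 'bob')]
print(process_dynamic_graph_stream(stream))   # {'alice': 0, 'bob': 1, 'carol': 1}

Degrees are easy to maintain under deletions; the difficulty the project targets is that many richer graph properties appear to require much larger summaries once deletions are allowed.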

The latter question was first addressed in a seminal paper in 2012. To the surprise of many data streaming researchers, it showed that, for many important problems, the summaries required in the dynamic setting need only be marginally larger than those in the static setting, even though a substantial increase in size had been expected. Today, a multitude of problems are known to behave in this way, while only very few are known to require substantially larger summaries in the dynamic setting.

The aim of this project is to shed light on the space requirements of streaming algorithms for processing massive dynamic graphs. While our current knowledge suggests that most problems do not become substantially harder in the dynamic setting, we believe that this picture is somewhat skewed and that a multitude of key problems are in fact much harder to solve in the dynamic case. To confirm our conjecture, we will design new streaming algorithms and prove impossibility results that show this to be the case. Where streaming dynamic graph processing provably requires too much space to be practically feasible, we will provide alternative models that allow for the design of streaming algorithms with small space.
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.bris.ac.uk