EPSRC logo

Details of Grant 

EPSRC Reference: EP/T014628/1
Title: Session Types for Reliable Distributed Systems (STARDUST)
Principal Investigator: Gay, Professor SJ
Other Investigators:
Trinder, Professor P Fowler, Dr S
Researcher Co-Investigators:
Project Partners:
Actyx AG Erlang Solutions Ltd Lightbend Inc
Quviq AB TATA Consultancy Services
Department: School of Computing Science
Organisation: University of Glasgow
Scheme: Standard Research
Starts: 01 October 2020 Ends: 30 September 2024 Value (£): 563,806
EPSRC Research Topic Classifications:
Fundamentals of Computing Software Engineering
EPSRC Industrial Sector Classifications:
Information Technologies
Related Grants:
EP/T014709/2 EP/T014512/1 EP/T014709/1
Panel History:
Panel DatePanel NameOutcome
15 Jan 2020 EPSRC ICT Prioritisation Panel January 2020 Announced
06 Nov 2019 EPSRC ICT Prioritisation Panel November 2019 Deferred
Summary on Grant Application Form
VISION

Distributed software systems are an essential part of the infrastructure of modern society. Such systems typically comprise diverse software components deployed across networks of hosts. Ensuring their reliability is challenging as software components must correctly communicate and synchronise with each other, and any of the hardware or software components may fail. Software failure and service outage are estimated to cost the world economy more than a trillion dollars annually.

Failures can occur at all levels of the system stack: hardware, operating system, network, software, and user. Here we focus on using advanced programming language technologies to enable the software level to better handle failures using a combination of fault prevention and fault tolerance. Specifically, we will combine the communication-structuring mechanism of session types with the scalability and fault-tolerance of actor-based software architectures.

Actor languages are based on independent processes (actors) communicating by asynchronous messages. Reliability is facilitated by actors having isolated state, and hence an actor can fail independently. Two key techniques for achieving reliability in actor languages are timeouts and supervision, and these are the main focus of STARDUST. Timeouts allow failures to be identified during execution, and failures are handled by establishing triggers and alternative behaviours within the code. An actor may supervise other actors, detecting failures and taking remedial action like restarting a failed actor.

Session types provide a way to specify and constrain the communication behaviour (protocol) between nodes in a system. A session type system excludes any non-conforming behaviour, perhaps statically (for fault prevention), dynamically, or a mixture of both (for fault tolerance). Several languages now have session-type support via libraries and tools.

By combining the strengths of actor languages and session types, we will develop a well-founded theory of reliable actor programming with clearly defined communication structures. Key aims are to deliver tools that provide lightweight support for developers, e.g. warn of potential issues, and to allow developers to continue to use established idioms. By doing so we aim to deliver a step change in the engineering of reliable distributed software systems.

Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.gla.ac.uk