EPSRC logo

Details of Grant 

EPSRC Reference: EP/W012308/1
Title: DIADEM: debugging made dependable and measurable
Principal Investigator: Kell, Dr S R
Other Investigators:
Researcher Co-Investigators:
Project Partners:
ARM Ltd Pernosco Ltd
Department: Informatics
Organisation: Kings College London
Scheme: New Investigator Award
Starts: 16 May 2022 Ends: 31 January 2025 Value (£): 324,823
EPSRC Research Topic Classifications:
Fundamentals of Computing
EPSRC Industrial Sector Classifications:
Information Technologies
Related Grants:
Panel History:
Panel DatePanel NameOutcome
16 Nov 2021 EPSRC ICT Prioritisation Panel November 2021 Announced
Summary on Grant Application Form
Software quality is increasingly critical to most of humanity. Bugs in software claim a huge annual toll, financially and even in human life. To eliminate bugs, developers depend crucially on their tools. Tools for interactive debugging are vital. They alone provide a high- (source-) level view of a running (binary) program, enabling programmers to 'see the bug' as it occurs in the program running in front of them. However, debugging infrastructure is notoriously unreliable, as it works only if various metadata is complete and correct. If not, the programmer sees a partial or incorrect view, which may be useless or actively misleading.

These problems occur often in popular languages (e.g. C, C++, Rust, Go), owing to a tension between debuggability and optimisation. Debugging in these languages works by compiler-generated /metadata/ describing how binary (executable) code relates to source (human-written) code. Metadata generation is 'best-effort', and optimisation frequently introduces flaws -- but simply disabling optimisations is seldom an option. Programmers rely on optimisations to relieve them of much hand-tuning. Without them, code may run tens of times slower. Furthermore, some bugs appear only in optimised code, owing to undefined behaviour (underspecification) in the source language.

This problem is extremely challenging. The heart of it is that writing compiler optimisations that preserve debuggability demands extra effort, on what are already intricate code transformations ('passes'). In practice corners are cut, leaving the output metadata approximate. To be acceptable to pass authors, improvements must reshape the effort/reward curve without increasing the task's baseline complexity. Unlike performance, debugging so far lacks quantitative benchmarks, so compiler authors have not prioritised or competed on debuggability.

Existing techniques amount to interaction-based testing of debugging, often with features for narrowing down which passes introduced a flaw. This is haphazard, since exploring all metadata includes the already-hard problem of achieving full-coverage tests (to 'drive' the debugger over all program locations). We propose instead to analyse metadata as an artifact in its own right. This means instead of tests that interact with a single concrete execution through a debugger, we must devise a custom systematic, symbolic method for exploring the compiled code, evaluating the correctness of metadata in a mathematical manner. Unlike haphazard testing, this promises systematic measurement of lost coverage and correctness; the latter can (we hypothesise) be automated using recent advances in formal specification of source languages, namely /executable semantics/, as a replacement for the current manual practices. This idea of parallel source- and binary-level exploration also suggests a radical approach: post-hoc synthesis of metadata, relieving the compiler of generating it at all. The idea here builds on successful work on neighbouring problems (translation validation and decompilation).

The project will proceed by practical methods, experimenting on a real production compiler (LLVM). It will build novel tools embeddable into existing compiler-testing workflows, both to diagnose compiler bugs and to quantify the improvement from fixing them. It will empirically explore abstractions and helpers used internally in compilers, to devise designs making them measurably more debug-preserving. Finally it will build a novel tool exploring the radical idea of synthesising high-quality metadata in post-hoc fashion, outside the compiler. It will develop metrics allowing quantitative comparison against traditional approaches. The beneficiaries are on many levels: compiler authors, software developers at large, and the general public who use or depend on the affected software.
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: