
Details of Grant 

EPSRC Reference: EP/X024539/1
Title: Test FLARE (Test Flakiness Automated Reproduction and Explanation)
Principal Investigator: McMinn, Professor PS
Other Investigators:
Researcher Co-Investigators:
Project Partners: Allegheny College; BJSS Limited; Carnegie Mellon University; Duolingo; Microsoft; SpotQA/Virtuoso; University of Passau; Wandisco
Department: Computer Science
Organisation: University of Sheffield
Scheme: Standard Research
Starts: 02 October 2023
Ends: 01 February 2027
Value (£): 544,227
EPSRC Research Topic Classifications:
Software Engineering
EPSRC Industrial Sector Classifications:
No relevance to Underpinning Sectors
Related Grants:
Panel History:
Panel Date: 24 Apr 2023
Panel Name: EPSRC ICT Prioritisation Panel April 2023
Outcome: Announced
Summary on Grant Application Form
Software failures are a huge burden on the worldwide economy, with a cost estimated at no less than £1.3 trillion in 2017. Consequently, software testing, a vital defence against failures, accounts for a large proportion of software development effort and cost. Flaky tests are a particular strain on the resources allocated to software development, because they intermittently pass and fail without any change to the tests or the project code, often for maddening, non-obvious reasons. Flaky tests fundamentally do not always tell the truth: they can fail when code is working, and pass when it isn't. Because developers can no longer trust the results of their tests, they cannot gain confidence that the software is working correctly, potentially exposing end-users to the consequences of software failures. Flaky tests are common in industry and significantly disrupt software development, even at the companies with the most resources to tackle them, such as Microsoft, Facebook, and Google.

A test can produce different pass/fail (i.e., flaky) outcomes because the execution environment in which it runs interacts in differing, unpredicted ways with the test's behaviour and/or the code that it tests. For instance, a machine may be under a heavy concurrent task load, causing it to execute tests slowly, which sometimes triggers timeouts in the code under test and sometimes does not. Or network access may be erratic on the testing infrastructure, so that the network resources a test relies on are not always available. Or the logic of the program under test may depend on the time and date. These are just a few real examples of the different ways in which tests can be flaky. Under some environmental conditions the test passes, but under others the same test fails.

To remove flaky test behaviour, a developer has to modify the test code, or the code that it tests, to control for aspects of the execution environment, i.e., the potential sources of the intermittent behaviour. But to accurately assess the differences in execution behaviour and locate the places in the code that need to change, the developer must be able to reliably reproduce the differing pass/fail outcomes. This involves not only recreating the environmental conditions that lead to the flaky behaviour, but also working out exactly which conditions caused the flakiness in the first place. Solving these problems and reproducing flaky tests manually can be extremely challenging for developers, since the environmental conditions concerned (a) are intermittent; and (b) may be unrelated to anything the test is actually checking, and/or far removed from the code being tested. Existing research techniques are insufficient for addressing these problems and, despite developers' incentives to remove flakiness, Google, for instance, reports an astonishing one in seven tests as flaky.
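
As a purely illustrative sketch (not part of the project's techniques), the following Python fragment shows what "controlling for" one aspect of the execution environment can look like when the logic under test depends on the time of day. The names `greeting`, `flaky_test`, and `controlled_test` are hypothetical and chosen only for this example.

```python
from datetime import datetime
from typing import Callable


def greeting(clock: Callable[[], datetime] = datetime.now) -> str:
    """Hypothetical code under test: its result depends on the current hour."""
    hour = clock().hour
    return "Good morning" if hour < 12 else "Good afternoon"


def flaky_test() -> bool:
    # Uses the real system clock, so the outcome depends on when the
    # test suite happens to run: it passes before noon and fails after.
    return greeting() == "Good morning"


def controlled_test() -> bool:
    # Pins the clock to a fixed instant, so the outcome no longer
    # depends on the execution environment.
    def fixed_clock() -> datetime:
        return datetime(2023, 10, 2, 9, 0)

    return greeting(clock=fixed_clock) == "Good morning"
```

Here the environmental factor (the clock) is easy to spot and to replace. The difficulty described above is that, in real projects, the responsible factor is usually not known in advance and may be far removed from the test itself.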

What the Test FLARE Project Will Do: The Test FLARE project will develop and empirically evaluate techniques capable of (1) automatically reproducing flaky behaviour that is due to the execution environment, and (2) providing developers with automated, human-readable explanations that help them further understand the reasons for the flaky behaviour.
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Organisation Website: http://www.shef.ac.uk