
Details of Grant 

EPSRC Reference: EP/T030526/1
Title: CITCoM: Causal Inference for Testing of Computational Models
Principal Investigator: Walkinshaw, Dr N
Other Investigators:
Wagg, Professor DJ; Latimer, Dr N; Hierons, Professor R
Researcher Co-Investigators:
Project Partners:
Case Western Reserve University; Chalmers University of Technology; Defence Science & Tech Lab DSTL; STFC Laboratories (Grouped)
Department: Computer Science
Organisation: University of Sheffield
Scheme: Standard Research
Starts: 01 January 2021
Ends: 28 February 2025
Value (£): 670,838
EPSRC Research Topic Classifications:
Software Engineering
EPSRC Industrial Sector Classifications:
Aerospace, Defence and Marine
Related Grants:
Panel History:
Panel Date: 20 May 2020
Panel Name: EPSRC ICT Prioritisation Panel May 2020
Outcome: Announced
Summary on Grant Application Form
Computational models are increasingly used to answer important questions that affect us all. Scientists rely on them to simulate phenomena as diverse as the effects of drugs on physiology, the transmission of diseases through a society, and the flow of blood through an artery. Within the public sector, computational models are fundamental to predicting weather patterns in the short term and the impact of global warming in the longer term. They are also increasingly vital for supporting decisions on infrastructure spending; our project partners in the DAFNI project are developing computational modelling infrastructure to support the investment of £460bn over the coming decade.

Given the high-stakes decisions that are usually involved, mistakes or "bugs" in a model can lead (and have led) to disastrous consequences. It is critical that these systems are rigorously tested to minimise this risk.

Computational models are, however, not amenable to traditional software testing and debugging techniques. They can include large numbers of parameters and configuration options. A single test run can take a very long time (and require a lot of computational resources), which makes it infeasible to run large numbers of tests. The data structures that they operate on can be particularly complex (e.g. 3D models of cities or coronary arteries), which makes them difficult to synthesise and inspect. Finally, if a test run produces an incorrect result, these factors can make it very difficult to identify where the bug lies in the model's source code.

CITCoM is based on the observation that the challenge is in many ways rooted in data analysis. In the presence of large numbers of input variables, there is the challenge of analysing the tested behaviour and ensuring that it is caused by the parameters that are the focus of the test (and not accidentally caused by other incidental parameters). There is the converse challenge of selecting which inputs need to be varied and which need to be controlled to demonstrate that a given combination of inputs causes a particular behaviour, whilst keeping the number of test cases required to a minimum. If a fault occurs, there is the challenge of interrogating the data to locate the fault in the code.

Similar problems arise in a wide range of disciplines, and especially in the field of Epidemiology, where population data are scrutinised to determine the effects of drug treatments or medical interventions. Again, there are many variables at play (lifestyle, cultural background, genetic traits, habits). Collecting data can be expensive and time-consuming. Outcomes can be difficult to measure and complex to scrutinise. For such situations, the last decade has seen the rapid rise of a family of statistical analysis approaches called Causal Inference. This has enabled statisticians to design and reason about epidemiological trials and data in new and powerful ways: to sample data efficiently, to handle missing data attributes, and to use existing data to answer "what-if" questions, even if the data in question have not been collected yet.
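
To make the parallel with model testing concrete, the following is a minimal sketch of one standard Causal Inference idea (adjusting for a confounding variable by stratification), applied to a log of existing test runs. It is an illustration only and not a description of the techniques CITCoM will develop. The model `toy_model`, the parameter names `focus` and `incidental`, and the logged runs are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def toy_model(focus, incidental):
    """Hypothetical stand-in for an expensive computational model.

    By construction, switching `focus` from 0 to 1 raises the output by 2.
    """
    return 2.0 * focus + 10.0 * incidental


# Hypothetical log of existing runs. The incidental parameter happens to be
# switched on mostly in the same runs as the focus parameter (confounding).
RUNS = [(0, 0)] * 3 + [(0, 1)] + [(1, 0)] + [(1, 1)] * 3
LOG = [(f, i, toy_model(f, i)) for f, i in RUNS]


def naive_effect(log):
    """Difference in mean output between focus=1 and focus=0 runs."""
    treated = [y for f, _, y in log if f == 1]
    control = [y for f, _, y in log if f == 0]
    return mean(treated) - mean(control)


def adjusted_effect(log):
    """Stratify by the incidental parameter, then average the within-stratum
    differences, weighted by how often each stratum occurs in the log."""
    strata = defaultdict(list)
    for f, i, y in log:
        strata[i].append((f, y))
    effect = 0.0
    for rows in strata.values():
        treated = [y for f, y in rows if f == 1]
        control = [y for f, y in rows if f == 0]
        effect += (len(rows) / len(log)) * (mean(treated) - mean(control))
    return effect


print("naive estimate:   ", naive_effect(LOG))
print("adjusted estimate:", adjusted_effect(LOG))
```

In this toy log the naive comparison attributes part of the incidental parameter's influence to the focus parameter, whereas the stratified estimate recovers the effect of 2 that was built into the model. The sketch assumes every stratum contains runs with the focus parameter both on and off; real test logs may not satisfy this.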

CITCoM will use these powerful Causal Inference analysis capabilities to address the problems that arise when testing computational models. We will generate Causal Inference-driven automated test-generation techniques, test oracles, and debugging techniques. These will be trialled and honed on a set of large case-study models in collaboration with our partners on the DAFNI project at STFC, at DSTL, and within The University of Sheffield.

Ultimately, CITCoM will enable us to generate, collect, and analyse evidence from computational models to ensure that they do not contain faults, so that any decisions that they feed into are well-founded and trustworthy.
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description: This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.shef.ac.uk