EPSRC logo

Details of Grant 

EPSRC Reference: EP/S000143/1
Title: Automatic repair of natural source code
Principal Investigator: Matragkas, Dr N
Other Investigators:
Researcher Co-Investigators:
Project Partners:
IBM UK Ltd
Department: Computer Science
Organisation: University of York
Scheme: New Investigator Award
Starts: 01 December 2018 Ends: 31 October 2020 Value (£): 131,677
EPSRC Research Topic Classifications:
Artificial Intelligence Computational Linguistics
Software Engineering
EPSRC Industrial Sector Classifications:
Information Technologies
Related Grants:
Panel History:
Panel DatePanel NameOutcome
02 May 2018 EPSRC ICT Prioritisation Panel May 2018 Announced
Summary on Grant Application Form
Fixing software defects is a time-consuming and costly activity. One of the main reasons why software debugging is so expensive is that it still remains mainly a manual activity. Fixing a bug is a complex process, which consists of many different steps including finding and understanding the underlying cause of the bug, identifying a set of changes that address the bug correctly, and finally verifying that those changes are correct. Automating this process (or parts of it) can potentially reduce the time, cost and effort it takes to fix bugs, and therefore the quality of the produced software.

Automatic Program repair (APR) is defined as the activity of removing a software defect during software maintenance without human intervention. The most popular approaches to automatic program repair rely on test suites as proxies of the behavioural specification of the system under repair. Although these approaches have demonstrated promising results, such as the synthesis of a fix for the famous Hearbleed bug, they face serious challenges. The main one is the correctness of the generated patches. Test suites are an imperfect metric of program correctness. Therefore techniques, which rely on test suites to evaluate the correctness of the generated patches, tend to overfit the test suite. Additionally, test-driven APR techniques can have scalability limitations in terms of execution time. The space of possible patches is vast and correct patches occur sparsely. In test-driven APR approaches the search time is mainly dominated by the validation of the patches, i.e. by the execution of the test suite. Existing approaches try to address this issue by employing various strategies such as parallel execution of test suites or the use of just a subset of the test suite (sample of positive test cases and all the negative ones). Although, such strategies improve the execution time, scalability of APR approaches remains a challenge.

The motivation of the project is to address overfitting and scalability limitations of APR techniques. The main hypothesis is the following: Statistical properties of source code can improve the state-of-the-art generate-and-validate techniques for APR in terms of patch correctness and execution time.

Statistical properties of source code (i.e. naturalness of source code) can help address the challenges of APR techniques in the following ways:

1) Heuristics based on the statistical properties of source code can be used to rank the results of fault localisation algorithms, and therefore augment their ability to correctly identify faulty lines of code.

2) Execution time of test-driven APR techniques can be reduced by discarding unnatural patches without executing the test suite.

3) Correctness of candidate patches can be improved by using the naturalness of code as part of the fitness function.

To measure the naturalness of source code, this project will mine existing software repositories such as GitHub and it will develop language models, similar to those used in the Natural Language Processing (NLP) domain. These models will be used to assess the naturalness of proposed patches.
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary
Date Materialised
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.york.ac.uk