
Details of Grant 

EPSRC Reference: EP/Z531327/1
Title: Statistical Foundations for Detecting Anomalous Structure in Stream Settings (DASS)
Principal Investigator: Eckley, Professor IA
Other Investigators: Yao, Professor Q; Cho, Dr HH; Fearnhead, Professor P; Yu, Dr Y
Researcher Co-Investigators:
Project Partners: British Geological Survey; BT; Government Office for Science; INAF; Morgan Stanley & Co.; National Nuclear Laboratory; QinetiQ; Shell
Department: Mathematics and Statistics
Organisation: Lancaster University
Scheme: Standard Research TFS
Starts: 01 August 2024
Ends: 31 July 2029
Value (£): 4,042,771
EPSRC Research Topic Classifications:
Statistics & Appl. Probability
EPSRC Industrial Sector Classifications:
Related Grants:
Panel History:  
Summary on Grant Application Form
With the exponentially increasing prevalence of networked sensors and other devices for collecting data in real time, automated data analysis methods with theoretically justified performance guarantees are in constant demand. Often a key question with such streaming data is whether they show evidence of anomalous behaviour. Such behaviour could, for example, indicate malicious bot activity on a website, give early warning of potential equipment failure, or reveal a methane leak. These and other motivating examples share a common feature which is not accommodated by classical point anomaly models in statistics: the anomaly may be not simply an 'outlying' observation but a distinctive pattern observed over consecutive observations. The strategic vision for this programme grant is to establish the statistical foundations for Detecting Anomalous Structure in Streaming data settings (DASS).

Discussions with a wide range of industrial partners from different sectors have identified important, generic challenges that cut across distinct DASS applications, and are relevant for analysing streaming data more broadly:

Contemporary Constrained Environments: Anomaly detection is often performed under various constraints due, for example, to restrictions on measurement frequency, the volume of data transferable between sensors and a central processor, or battery usage limits. Additionally, certain scenarios may impose privacy restrictions when handling sensitive data. Consequently, it has become imperative to establish the mathematical underpinning for rigorously examining the trade-offs between, e.g., statistical accuracy, communication efficiency, privacy preservation and computational demands.

Handling Data Realities: A substantial portion of research in statistical anomaly detection operates under the assumption of clean data. Nevertheless, real-world data typically exhibit various imperfections, such as missing values, labelling errors in data streams, synchronisation discrepancies, sensor malfunctions and heterogeneous sensor performance. Consequently, there is a pressing need for the development of principled, model-based procedures that can effectively address the features of real data and enhance the resilience of anomaly detection methods.

Identifying, Accounting for and Tracking Dependence: Not only are data streams often interdependent, but also anomalous patterns may be dependent across those streams. Taking into account both types of dependence is crucial in enhancing the statistical efficiency of anomaly detection algorithms, and also in controlling the errors arising from handling a large number of data streams in a principled way. Other challenges include tracking the path of an anomaly across multiple data sources with a view to learning causal indicators allowing for precautionary intervention.

Our ambitious goal of comprehensively addressing these challenges is only achievable via the programme grant scheme. Our philosophy is to tackle the methodological, theoretical and computational aspects of these statistical problems together. This integrated approach is essential to achieving the substantive fundamental advances in statistics envisaged, and to ensuring that our new methods are sufficiently robust and efficient to be widely adopted by academics, industry and society more generally.

Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.lancs.ac.uk