
Details of Grant 

EPSRC Reference: EP/W021986/1
Title: Dependence Modelling with Vine Copulas for the Integration of Unstructured and Structured Data
Principal Investigator: Dalla Valle, Dr L
Other Investigators:
Researcher Co-Investigators:
Project Partners:
Devon and Cornwall Police
Department: Sch of Eng, Comp and Math (SECaM)
Organisation: University of Plymouth
Scheme: Standard Research - NR1
Starts: 11 May 2022
Ends: 10 May 2023
Value (£): 79,537
EPSRC Research Topic Classifications:
Statistics & Appl. Probability
EPSRC Industrial Sector Classifications:
No relevance to Underpinning Sectors
Related Grants:
Panel History:
Panel Date: 08 Dec 2021
Panel Name: EPSRC Mathematical Sciences Small Grants Panel December 2021
Outcome: Announced
Summary on Grant Application Form
The project will develop a statistical data integration methodology, not previously considered, that uses multiple sources of information to provide more accurate predictions than those currently available. We are living in the Big Data era: companies and organizations produce masses of data in traditional formats, while social media generate large quantities of mostly unstructured information every second. However, are we effectively and efficiently exploiting all the information available to us from official and social media sources? The answer is definitely no. Most statistical approaches used to solve real-world problems are based on a single source of information and, although preliminary work attempting to leverage social media data exists, there are currently no comprehensive and functional methodologies able to fully capitalize on unstructured information and its associations with other available structured data. As a consequence, precious information contained in unstructured online data continues to be neglected and lost. While advances in technology and digitalization are shaping the world, statistics is struggling to keep pace and urgently needs to revolutionize its methods and practices. This proposal aims to fill this gap by creating a pioneering and transformative statistical data integration methodology that fully leverages the power of different sources of information, such as traditional and online-generated data. The project will support early-stage research on integrating unstructured and structured data using a new methodology based on vine copulas. This methodology will form the basis of future analyses, which will lead to a radical transformation of current data approaches and propel statistics into the future.
Although this research is early-stage, it will bring immediately usable results. The methodology will be applied to data on crimes committed in the South West region of the UK, integrating official police information, provided by our project partner Devon and Cornwall Police (DCP), with crime data discussed on different social media platforms. Our approach will provide a more thorough and realistic appraisal of the volume and severity of crimes in specific locations of the South West, since it will also account for hidden crimes that are unreported to the police but emerge from social media. DCP will use the results of this project to plan and organize their interventions more effectively and to allocate resources efficiently in targeted areas. By providing deeper and more accurate knowledge of the geographical locations of criminal offences, including unreported crimes, this project will help the police to better support communities in high criminal risk areas with timely interventions, making people feel safer and more protected. This will promote social inclusion and more equitable communities, especially in disadvantaged areas that are most affected by high criminality levels, including crimes that are not reported via traditional channels.

This project, initially targeting the South West of the UK, will lay the foundation for future grant applications that extend the geographical area under assessment to the national level.

In addition, because our methodology has a very wide range of possible applications, this project will be a milestone towards further breakthroughs in any other area of science where multiple data sources are available and accurate predictions are needed.

This project is timely since it addresses the urgent need to fully leverage social media information that is currently available but not exploited. This research will provide a key opportunity for the UK to secure a leading international position at the forefront of advances in knowledge extraction, leading to huge social and economic benefits.

Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Description: This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Date Materialised
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.plym.ac.uk