
Details of Grant 

EPSRC Reference: EP/J020230/1
Title: Imperfect data: accuracy, impacts and extraction of meaningful information
Principal Investigator: Foody, Professor G
Other Investigators:
Researcher Co-Investigators:
Project Partners:
Department: Sch of Geography
Organisation: University of Nottingham
Scheme: Standard Research
Starts: 04 May 2012
Ends: 31 July 2013
Value (£): 69,249
EPSRC Research Topic Classifications:
Information & Knowledge Mgmt
EPSRC Industrial Sector Classifications:
Aerospace, Defence and Marine
Information Technologies
Related Grants:
Panel History:
Panel Date: 09 Feb 2012
Panel Name: Data Intensive Systems (DaISy)
Outcome: Announced
Summary on Grant Application Form
Meaningful information is a fundamental requirement for informed, logical and reasoned activity. Extracting meaningful information from data can, however, be a challenge, especially because data may, amongst other things, be inaccurate, incomplete and possibly contradictory when they arise from a variety of sources of variable quality and trust level.

Data imperfections are a generic problem in information extraction and decision making, and so the work is relevant in many disciplines. Imperfect data are, for example, evident in medical diagnosis (e.g. a patient's test results are typically only an imperfect indicator of a condition), in defining nature reserves for species conservation (e.g. species distribution maps and models are often highly sensitive to 'absence' data: was the species actually present but not observed?) and in security and defence applications (e.g. sub-pixel target detection algorithms applied to surveillance imagery vary in performance and utility between environments). Problems with imperfect data were recently highly apparent in the response to the Haiti earthquake of 2010, especially in damage mapping to inform relief activities. Vast amounts of well-intentioned assistance were provided by numerous professional and amateur bodies, with unprecedented data rates, but the volumes of data and the problems with them were a concern. Key problems were that maps were inaccurate, inconsistent and sometimes contradictory. As such, a major mapping challenge arises in how to work with such data. One key issue is the need for information on the accuracy of data sources and for methods that help in using imperfect data. This project seeks to contribute to this task. It aims to illustrate the impacts of using imperfect data, and to explore methods to characterise the quality of the data and methods to combine data sources to yield an enhanced product of known accuracy.

A range of methods will be used, but the core focus is on latent class modelling. This type of analysis is based on multiple observations or data from a variety of sources. The relationships between the observers/data sources are used to attempt to explain their quality and to suggest how the data could be interpreted to yield information. The approach is a form of statistical modelling and is highly attractive for this research proposal because, if a model can be formed that fits the observed data, then the model's parameters define the accuracy of the data sources and its outputs can be used to form new products of known accuracy. As such, the modelling analysis may add value to data by indicating their quality and combining them usefully for the extraction of information.
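The idea can be illustrated with a minimal sketch. It is not the project's own implementation, and it rests on several assumptions that are not stated in the summary: each source gives a binary label (e.g. damaged / not damaged), there are two latent classes, the sources err independently given the true class, and the model is fitted by expectation-maximisation. The function names fit_latent_class and simulate, and all the accuracy values, are hypothetical and chosen only for illustration.

```python
import random


def simulate(n, prev, sens, spec, seed=0):
    """Generate hypothetical binary labels from several imperfect sources."""
    rng = random.Random(seed)
    truth, obs = [], []
    for _ in range(n):
        z = 1 if rng.random() < prev else 0
        row = []
        for s, c in zip(sens, spec):
            if z:  # true class 1: source reports 1 with probability s
                row.append(1 if rng.random() < s else 0)
            else:  # true class 0: source reports 0 with probability c
                row.append(0 if rng.random() < c else 1)
        truth.append(z)
        obs.append(row)
    return truth, obs


def fit_latent_class(obs, n_iter=100):
    """EM for a two-class latent class model; no reference data are used.

    Returns the estimated prevalence of class 1, each source's sensitivity
    and specificity, and the posterior probability of class 1 per case.
    """
    n, k = len(obs), len(obs[0])
    eps = 1e-6
    clamp = lambda p: min(max(p, eps), 1.0 - eps)
    # Assumption: starting above 0.5 presumes sources are better than chance,
    # which fixes which latent class is called "1".
    prev, sens, spec = 0.5, [0.8] * k, [0.8] * k
    post = [0.5] * n
    for _ in range(n_iter):
        # E-step: posterior probability that each case belongs to class 1.
        for i, row in enumerate(obs):
            p1, p0 = prev, 1.0 - prev
            for j, y in enumerate(row):
                p1 *= sens[j] if y else 1.0 - sens[j]
                p0 *= 1.0 - spec[j] if y else spec[j]
            post[i] = p1 / (p1 + p0)
        # M-step: re-estimate prevalence and per-source accuracy.
        w1 = sum(post)
        w0 = n - w1
        prev = clamp(w1 / n)
        for j in range(k):
            hit1 = sum(r * row[j] for r, row in zip(post, obs))
            hit0 = sum((1.0 - r) * (1 - row[j]) for r, row in zip(post, obs))
            sens[j] = clamp(hit1 / w1)
            spec[j] = clamp(hit0 / w0)
    return prev, sens, spec, post


# Hypothetical scenario: four sources of differing quality, 30% prevalence.
true_prev = 0.3
true_sens = [0.90, 0.85, 0.80, 0.75]
true_spec = [0.95, 0.90, 0.85, 0.80]
truth, obs = simulate(3000, true_prev, true_sens, true_spec, seed=0)

est_prev, est_sens, est_spec, post = fit_latent_class(obs)

# Combined product: label each case by its most probable latent class.
fused = [1 if p >= 0.5 else 0 for p in post]
fused_accuracy = sum(f == t for f, t in zip(fused, truth)) / len(truth)

print("estimated sensitivities:", [round(s, 2) for s in est_sens])
print("estimated specificities:", [round(c, 2) for c in est_spec])
print("accuracy of combined labels:", round(fused_accuracy, 3))
```

The true labels are used here only to check the result on simulated data; the fitting step itself sees nothing but the sources' labels, which is the property that makes the approach attractive when reference data are hard to obtain. With fewer than three sources a model of this form is not identifiable, and the conditional independence assumption may not hold for real data sources.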

As the problems of imperfect data are generic, the proposal has broad potential impacts. For the specific DaISy call there are clear impacts in relation to security and defence. For example, methods that enable rapid and qualified information to be derived from sources of variable accuracy, completeness and trust level will increase effectiveness and the quality of decision making. Additionally, as a model-based approach, it removes or reduces the need to acquire reference data for validation, which could otherwise require deployment of personnel to dangerous locations, and so is of considerable benefit to health and well-being.

Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.nottingham.ac.uk