Details of Grant

EPSRC Reference:

EP/F069421/1

Title:

Adaptive cognition for automated sports video annotation (ACASVA)

Principal Investigator:

Kittler, Professor J

Other Investigators:

Osman, Dr M

Groeger, Professor JA

Researcher Co-Investigators:

Professor D Windridge

Project Partners:

Department:

Vision Speech and Signal Proc CVSSP

Organisation:

University of Surrey

Scheme:

Standard Research

Starts:

05 May 2009

Ends:

30 September 2013

Value (£):

1,415,482

EPSRC Research Topic Classifications:

Cognitive Science Appl. in ICT
Human Communication in ICT
Vision & Senses - ICT appl.

EPSRC Industrial Sector Classifications:

No relevance to Underpinning Sectors

Related Grants:

EP/F069626/1

Panel History:

Panel Date	Panel Name	Outcome
18 Jul 2008	ICT Large Grants Panel (18 July 08)	Announced

Summary on Grant Application Form

The development of a machine that can autonomously understand and interpret patterns of real-world events remains a challenging goal in AI. Humans are able to achieve this by developing sophisticated internal representational structures for objects and events and the grammars that connect them. ACASVA aims to investigate the interaction between visual and linguistic grammars in learning by developing grammars in a scenario where the number of different events is constrained, by a set of rules, to be small: a sport. We will analyse video footage of a game (e.g. tennis) and use computer vision techniques to progressively understand it as a sequence of (possibly overlapping) events, and build a grammar of events. We will carry out a similar audio/linguistic analysis of the commentary on the game. Both of these grammars will be used to build a representational structure for understanding the game. Visual representations are additionally constrained by the inference of game rules, so that object-classification mechanisms are preferentially tuned to game-relevant entities like 'player' rather than game-irrelevant entities like 'crowd-member'. We will also investigate how the two modes, sight and sound, can influence each other in the learning process: interpretation of the video is affected by the linguistic grammar and vice versa. Furthermore, this coupling of modes will lead to improved recognition of both audio and video events when the grammars from the video mode are used to influence the audio recognition, and vice versa.
The psychological component of ACASVA correspondingly attempts to learn how these capabilities are developed in humans: how visual grammars are organized and employed in the learning problem, how these grammars are modified by prior linguistic knowledge of the domain, how visual grammars map onto linguistic grammars, and how game rule-inferences influence lower-level visual learning (determined via gaze-behaviour). These results will feed back into the machine-learning problem and vice versa, as well as providing a performance benchmark for the system.
Potential beneficiaries of ACASVA (in addition to the knowledge beneficiaries within the fields of science and engineering) include the broadcasting and on-line video search industries.

Key Findings

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Potential use in non-academic contexts

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Impacts

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Sectors submitted by the Researcher

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Project URL:

http://cvssp.org/acasva/

Further Information:

Organisation Website:

http://www.surrey.ac.uk