EPSRC Reference: EP/S026347/1
Title: Unparameterised multi-modal data, high order signatures, and the mathematics of data science
Principal Investigator: Lyons, Professor T
Other Investigators: Cass, Dr T; Oberhauser, Professor H; Ni, Dr H
Project Partners: ARM Ltd; Health and Safety Executive; INRIA Rhone-Alpes; nVIDIA; South China University of Technology; Spherical Defence; The Alan Turing Institute; University of Cambridge
Department: Mathematical Institute
Organisation: University of Oxford
Scheme: Programme Grants
Starts: 01 May 2019
Ends: 30 September 2024
Value (£): 4,100,854
Research Topics: Continuum Mechanics; Mathematical Analysis; Statistics & Appl. Probability; Vision & Senses - ICT appl.
Panel Date: 12 Mar 2019
Panel Name: Programme Grant Interviews - 12 March 2019 (Maths)
Outcome: Announced
Summary on Grant Application Form
Our ancestors communicated by scratching on the walls of caves, took navigational decisions by looking at the stars and made medical diagnoses simply by listening to patients. A great deal of information is captured in these simple data streams; our ability to capture, process, and decide actions based on information pervades all aspects of human life.

Today, the challenges are the same, but the information is far more voluminous and the expectations for the outcomes are far higher. When we write with a finger on an iPhone, when our voice is recorded for doctors to assess our mood, when video is analysed for abnormal actions, or when telescopes look deep into the galaxies for black holes, stars, and planets, technically sophisticated systems translate streams of sequential data into processed and recognised patterns that can be acted upon.

Our relatively new ability to offload data analysis onto massive digital systems is transforming our world. However, huge challenges remain. Groundbreaking mathematical innovation is rapidly deepening our understanding in one area. This project aims to build on successful pilot collaborations to create tools that genuinely merge this new mathematics with existing data science, and then to apply them to exemplar challenges to produce a more effective abstraction of the "capture, process, and decide" process. The evidence is now overwhelming that dimension reduction and high order methods can capture sequential data very effectively. The mathematics underpinning this provided the crucial step that extended Newton's calculus beyond Itô's theory to rough paths; its mathematical articulation, the signature of a stream, has significantly enhanced deep learning methods, yielding online handwriting recognition with state-of-the-art accuracy.
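To make "the signature of a stream" concrete, here is a minimal illustrative sketch that is not taken from the project itself. It assumes the stream is a piecewise-linear path through sampled points and truncates the signature at level 2; the function name `signature_depth2` is hypothetical. Level 1 records the total increment in each coordinate, and level 2 records the iterated integrals, which capture the order in which the coordinates moved.

```python
from typing import List, Sequence, Tuple


def signature_depth2(
    path: Sequence[Sequence[float]],
) -> Tuple[List[float], List[List[float]]]:
    """Depth-2 signature of the piecewise-linear path through `path`.

    Illustrative helper (hypothetical name, not a library API).
    level1[i]    : total increment in coordinate i
    level2[i][j] : iterated integral of dx_i followed by dx_j
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for prev, curr in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, curr)]
        # Chen's identity for appending one straight segment:
        # level2 <- level2 + level1 (x) delta + (delta (x) delta) / 2
        for i in range(d):
            for j in range(d):
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        # Update level 1 only after level 2 has used the old value.
        for i in range(d):
            level1[i] += delta[i]
    return level1, level2


# Two streams with the same endpoints but a different order of events.
right_then_up = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
up_then_right = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
# The first stream again, sampled more densely along its first segment.
resampled = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.0, 1.0)]

l1_a, l2_a = signature_depth2(right_then_up)
l1_b, l2_b = signature_depth2(up_then_right)
l1_c, l2_c = signature_depth2(resampled)
```

In this sketch the two differently ordered streams share the same level-1 terms but differ at level 2, while the resampled stream has the same signature as the original. This is the sense in which the signature describes a stream independently of how it was parameterised or sampled.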

This project aims to develop and embed the abstract mathematics around rough paths and complex streamed data into a few of the richest challenges involved in the "capture, process, and decide" task. The investigators and the world-leading project partners are connected by the shared challenge of improving this task with complex datasets of importance in four contexts:

* Health

* Human interfaces

* Human Actions

* Observing the Universe

The specific base challenges we start from are:

1) Use face and speech data, together with self-reported mood data, to better detect when an intervention to support someone with mental illness is or is not working.

2) When a person writes (in Chinese) with their finger on a sat-nav device or mobile phone, transcribe this signal into digital characters more accurately and economically, and recognise who wrote it.

3) By observing evolving images in video data, develop tools that can classify human actions.

4) Develop measurement instruments and nonlinear processing techniques for astronomical data that improve detection sensitivity for transients and enable new observations, e.g. of planets orbiting stars.

The technical challenges are deeply interconnected. This project is a near-unique opportunity to bring them together to produce a validated common methodology and to create substantial cross-fertilization. One recent example of how this can happen is worth highlighting. In 2013, Ben Graham (then University of Warwick, now Facebook) used the signature to quantify strokes from handwritten Chinese characters parsimoniously and efficiently. The capture stage is subtle and has appreciably improved the accuracy of the recognition process; the China-based partners on this project subsequently created an app which has been downloaded millions of times.

While the handwriting context for rough paths is very well defined and successful, understanding the motion of people in videos is at a successful but early stage. The contexts are clearly related: they link through faces with the mental health challenge, and through occlusion with transients in astronomy. It is all joined up!
Organisation Website: http://www.ox.ac.uk