EPSRC Reference: |
EP/K02325X/1 |
Title: |
Accelerated Coordinate Descent Methods for Big Data Problems |
Principal Investigator: |
Richtarik, Dr P |
Other Investigators: |
|
Researcher Co-Investigators: |
|
Project Partners: |
|
Department: |
Sch of Mathematics |
Organisation: |
University of Edinburgh |
Scheme: |
First Grant - Revised 2009 |
Starts: |
01 November 2013 |
Ends: |
31 January 2016 |
Value (£): |
100,679 |
EPSRC Research Topic Classifications: |
Artificial Intelligence |
Mathematical Aspects of OR |
Numerical Analysis |
|
|
EPSRC Industrial Sector Classifications: |
No relevance to Underpinning Sectors |
|
|
Related Grants: |
|
Panel History: |
|
Summary on Grant Application Form |
Much of modern society and the economy, in the United Kingdom and elsewhere, is moving in the direction of digitization and computation. Humankind is now able to collect and store enormous quantities of digital data coming from sources such as health records (e.g., the IBM "Watson" project, MRI/CT scans), government databases (e.g., e-Government, GORS: the Government Operational Research Service), social networks (e.g., Facebook, LinkedIn, Delicious), online news (e.g., the New York Times article database), corporate databases (e.g., bank records, Amazon.com) and the internet. Global society is, as a consequence, facing many unprecedented challenges and opportunities. One of the biggest of these has to do with the ability (or rather, the lack thereof) to distill, understand and utilize in an optimal way the information contained within these gigantic data sources. The main technology for this is to formulate an optimization problem and then solve it using a well-chosen optimization algorithm in a suitable computing environment (e.g., a multicore workstation, a GPU-enabled machine, or the cloud).
In this project we aim to contribute to a breakthrough in our ability to solve optimization problems arising in big data domains by developing, analyzing and implementing new accelerated parallel coordinate descent (CD) methods. Since the data in big data problems is typically highly structured, well-designed CD methods can have very low memory requirements and a very low arithmetic cost per iteration---often much smaller than the dimension of the problem. This is in sharp contrast with standard methods, whose arithmetic complexity per iteration depends on the dimension at least quadratically.
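To make the low per-iteration cost concrete, here is a minimal, hypothetical sketch (not the project's own code) of serial randomized coordinate descent for the least-squares problem min_x 0.5*||Ax - b||^2. By maintaining the residual incrementally, each step touches only one column of A, costing O(m) (or O(nnz of that column) for sparse data) rather than the O(mn) cost of a full gradient step.

```python
import numpy as np

def coordinate_descent_ls(A, b, num_iters=5000, seed=0):
    """Randomized coordinate descent for min_x 0.5*||Ax - b||^2.

    Illustrative sketch: each iteration samples one coordinate
    uniformly, takes an exact line-search step along it, and
    updates the residual r = Ax - b incrementally.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    L = (A ** 2).sum(axis=0)          # coordinate Lipschitz constants ||A_i||^2
    x = np.zeros(n)
    r = -b.astype(float).copy()       # residual r = A x - b, with x = 0
    for _ in range(num_iters):
        i = rng.integers(n)           # uniform coordinate sampling
        g_i = A[:, i] @ r             # partial derivative w.r.t. x_i
        delta = -g_i / L[i]           # exact minimization along coordinate i
        x[i] += delta
        r += delta * A[:, i]          # cheap O(m) residual update
    return x
```

The same loop structure carries over to the parallel setting, where several coordinates are sampled and updated per iteration.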
Our research objectives are:
1. Acceleration Theory. We will analyze the iteration complexity (i.e., give bounds on the number of iterations/steps needed to achieve a prescribed level of accuracy) of new parallel coordinate descent methods accelerated using the following four strategies: a) nonuniformity (of the frequency with which individual coordinates are updated), b) asynchronicity (of updates and computation), c) distribution (of data and computation across the nodes of a cluster) and d) inexactness (of certain operations and computations on which the algorithm depends).
2. Stochastic Gradient Descent. We will analyze theoretically and test numerically the relationship between parallel coordinate descent (CD) methods and parallel stochastic gradient descent (SGD) methods.
3. ACDC Code. We will implement the accelerated algorithms in a code which we will make publicly available.
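As an illustration of strategy a) in objective 1, nonuniform sampling can be realized by drawing coordinate i with probability proportional to its coordinate Lipschitz constant L_i (e.g., the squared column norm ||A_i||^2 in a least-squares problem), so that "steeper" coordinates are updated more often. The sketch below uses that standard choice; it is a generic illustration, not the project's analysis.

```python
import numpy as np

def make_importance_sampler(L, seed=0):
    """Return a sampler drawing coordinate i with probability
    p_i = L_i / sum_j L_j (nonuniform / importance sampling),
    plus the probability vector itself.

    L holds the coordinate Lipschitz constants, e.g. squared
    column norms of the data matrix.
    """
    rng = np.random.default_rng(seed)
    p = np.asarray(L, dtype=float)
    p = p / p.sum()
    return (lambda: rng.choice(len(p), p=p)), p
```

In a coordinate descent loop, this sampler simply replaces the uniform draw of the coordinate index.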
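The relationship in objective 2 can be previewed informally for least squares: an SGD step samples a row of A (one data point) and moves along that term's gradient, while a CD step samples a column (one coordinate) and minimizes the full objective along it. The side-by-side sketch below is purely illustrative (the helper names and step rule are assumptions, not the project's methods).

```python
import numpy as np

def sgd_step(A, b, x, j, step):
    """SGD: sample ROW j (one data point) and take a gradient step
    for the single term 0.5*(A[j] @ x - b[j])**2."""
    return x - step * (A[j] @ x - b[j]) * A[j]

def cd_step(A, b, x, i):
    """CD: sample COLUMN i (one coordinate) and minimize the full
    objective 0.5*||Ax - b||^2 exactly along that coordinate."""
    g_i = A[:, i] @ (A @ x - b)          # partial derivative w.r.t. x_i
    x = x.copy()
    x[i] -= g_i / (A[:, i] @ A[:, i])    # exact coordinate line search
    return x
```

For clarity this cd_step recomputes Ax, which costs O(mn); a practical implementation maintains the residual incrementally as in standard CD codes.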
|
Key Findings |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
|
Potential use in non-academic contexts |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
|
Impacts |
Description |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
Summary |
|
Date Materialised |
|
|
Sectors submitted by the Researcher |
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk |
|
Project URL: |
|
Further Information: |
|
Organisation Website: |
http://www.ed.ac.uk |