Details of Grant

EPSRC Reference:

EP/K032968/1

Title:

NaaS: Network-as-a-Service in the Cloud

Principal Investigator:

Pietzuch, Professor PR

Other Investigators:

Wolf, Professor A

Costa, Dr P

Researcher Co-Investigators:

Project Partners:

Advanced Micro Devices Inc (AMD)

Citrix Systems

NetApp

Netronome

Department:

Computing

Organisation:

Imperial College London

Scheme:

Standard Research

Starts:

01 October 2013

Ends:

31 March 2017

Value (£):

666,149

EPSRC Research Topic Classifications:

Networks & Distributed Systems

EPSRC Industrial Sector Classifications:

Communications

Information Technologies

Related Grants:

EP/K031724/2

EP/K031724/1

EP/K034723/1

Panel History:

Panel Date	Panel Name	Outcome
27 Feb 2013	EPSRC ICT Responsive Mode - Feb 2013	Announced

Summary on Grant Application Form

Cloud computing has significantly changed the IT landscape. Today it is possible for small companies or even single individuals to access virtually unlimited resources in large data centres (DCs) for running computationally demanding tasks. This has triggered the rise of "big data" applications, which operate on large amounts of data. These include traditional batch-oriented applications, such as data mining, data indexing, log collection and analysis, and scientific applications, as well as real-time stream processing, web search and advertising.

To support big data applications, parallel processing systems, such as MapReduce, adopt a partition/aggregate model: a large input data set is distributed over many servers, and each server processes a share of the data. Locally generated intermediate results must then be aggregated to obtain the final result.
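The partition/aggregate model described above can be illustrated with a minimal, single-process Python sketch. It is not taken from the project or from MapReduce itself: the word-count task and the function names (partition, process, aggregate, partition_aggregate) are assumptions chosen for illustration, and the per-share processing that would run in parallel on separate servers is simulated here with an ordinary loop.

```python
from collections import Counter


def partition(records, num_workers):
    """Split the input into num_workers shares (round-robin)."""
    return [records[i::num_workers] for i in range(num_workers)]


def process(share):
    """Per-server step: compute a local, intermediate word count."""
    counts = Counter()
    for line in share:
        counts.update(line.split())
    return counts


def aggregate(partials):
    """Merge the locally generated intermediate results into the final result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def partition_aggregate(records, num_workers):
    shares = partition(records, num_workers)
    # In a real deployment each share would be processed on a different server;
    # the intermediate results would then cross the network to be aggregated.
    partials = [process(share) for share in shares]
    return aggregate(partials)
```

The final result does not depend on how many workers the input is split over; only the amount of intermediate data that must be moved for the aggregation step changes.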

An open challenge of the partition/aggregate model is that it results in high contention for network resources in DCs when a large amount of data traffic is exchanged between servers. Facebook reports that, for 26% of processing tasks, network transfers are responsible for more than 50% of the execution time. This is consistent with other studies showing that the network is often the bottleneck in big data applications.

Improving the performance of such network-bound applications in DCs has attracted much interest from the research community. A class of solutions focuses on reducing bandwidth usage by employing overlay networks to distribute data and to perform partial aggregation. However, this requires applications to reverse-engineer the physical network topology to optimise the layout of overlay networks. Even with perfect knowledge of the physical topology, there are still fundamental inefficiencies: e.g. any logical topology with a server fan-out higher than one cannot be mapped optimally to the physical network if servers have only a single network interface.

Other proposals increase network bandwidth through more complex topologies or higher-capacity networks. New topologies and network over-provisioning, however, increase the DC operational and capital expenditures (up to 5 times according to some estimates), which directly impacts tenant costs. For example, Amazon AWS recently introduced Cluster Compute instances with full-bisection 10 Gbps bandwidth, with an hourly cost of 16 times the default.

In contrast, we argue that the problem can be solved more effectively by providing DC tenants with efficient, easy and safe control of network operations. Instead of over-provisioning, we focus on optimising network traffic by exploiting application-specific knowledge. We term this approach "network-as-a-service" (NaaS) because it allows tenants to customise the service that they receive from the network.

NaaS-enabled tenants can deploy custom routing protocols, including multicast services or anycast/incast protocols, as well as more sophisticated mechanisms, such as content-based routing and content-centric networking.

By modifying the content of packets on-path, they can efficiently implement advanced, application-specific network services, such as in-network data aggregation and smart caching. Parallel processing systems such as MapReduce would greatly benefit because data can be aggregated on-path, thus reducing execution times. Key-value stores (e.g. memcached) can improve their performance by caching popular keys within the network, which decreases latency and bandwidth usage compared to end-host-only deployments.
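As a rough illustration of why on-path aggregation reduces traffic, the following Python sketch compares how much intermediate data leaves the top of a small tree of switches with and without merging at each hop. Everything in it is an assumption made for illustration: the tree representation, the deliver and entries_sent helpers, and the use of the number of key/count entries as a stand-in for bytes on the wire. It does not describe the NaaS implementation.

```python
from collections import Counter


def deliver(node, aggregate_on_path):
    """Return the partial results that leave `node` towards its parent.

    A node is either a Counter (one worker's partial result) or a list of
    child nodes (a hypothetical switch able to process packets on-path).
    """
    if isinstance(node, Counter):
        return [node]
    incoming = [msg for child in node for msg in deliver(child, aggregate_on_path)]
    if aggregate_on_path:
        # The switch merges everything it receives into a single result.
        merged = Counter()
        for msg in incoming:
            merged.update(msg)
        return [merged]
    # Without in-network processing the switch only forwards messages.
    return incoming


def entries_sent(messages):
    """Number of (key, count) entries carried, as a simple proxy for traffic."""
    return sum(len(msg) for msg in messages)


# Two racks with two workers each, under one top-level switch.
workers = [Counter(a=1, b=1), Counter(a=2), Counter(b=1, c=1), Counter(c=3)]
tree = [[workers[0], workers[1]], [workers[2], workers[3]]]

forwarded = deliver(tree, aggregate_on_path=False)
aggregated = deliver(tree, aggregate_on_path=True)
```

In this toy setup the aggregated path carries fewer entries to the final reducer than the forwarded path, while merging either set of messages yields the same final counts. The saving depends on how much the workers' keys overlap.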

The NaaS model has the potential to revolutionise current cloud computing offerings by increasing the performance of tenants' applications through efficient in-network processing, while reducing development complexity. It aims to combine distributed computation and network communication in a single, coherent abstraction, providing a significant step towards the vision of "the DC is the computer".

Key Findings

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Potential use in non-academic contexts

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Impacts

Description:

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Summary:

Date Materialised:

Sectors submitted by the Researcher

This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk

Project URL:

Further Information:

Organisation Website:

http://www.imperial.ac.uk