
Details of Grant 

EPSRC Reference: EP/C009797/1
Title: Dynamic Operating Policies for Commercial Hosting Environments
Principal Investigator: Mitrani, Professor I
Other Investigators:
Thomas, Dr NA
van Moorsel, Professor A
Researcher Co-Investigators:
Project Partners:
Department: Computing Sciences
Organisation: Newcastle University
Scheme: Standard Research (Pre-FEC)
Starts: 01 October 2005
Ends: 31 January 2009
Value (£): 176,886
EPSRC Research Topic Classifications:
Networks & Distributed Systems
Parallel Computing
System on Chip
EPSRC Industrial Sector Classifications:
No relevance to Underpinning Sectors
Related Grants:
EP/C538277/1
Panel History:  
Summary on Grant Application Form
In this project we address the problem of providing reliable, high-performance collections of computing resources for e-Science. The aim is to provide policies and middleware to support workload management and resource allocation.

We assume the following scenario. There is a heterogeneous collection of servers, spread over more than one site and hired by more than one service provider, and there are streams of workload (i.e. jobs to be processed on the servers). The costs of waiting, of service and of switching servers between service providers differ. In general, jobs may require multiple services, possibly needing to use several resources simultaneously. The system may also accept reservation requests for future use. Servers can process more than one job at the same time via time-sharing. Jobs from the same client are not considered independent; this matters when servers cache client data between jobs, because there is an additional overhead if jobs from the same client are not run on the same server. The objective is a trade-off between the quality of service (provided to the server owners, the service providers who hire the servers, the clients who send the jobs and the administrators) and the costs of providing this service.

We will use a number of well-proven approaches to tackle this problem.
- We will develop systems-level models and measurement techniques to reliably predict the service costs of tasks hosted on commercial server environments. We aim to provide supporting tools and services for predicting system behaviour in terms of response time and server throughput, and as such provide software infrastructure that will ensure quality of service.
- We will use stochastic modelling to develop abstract models of the hosting scenario which can be evaluated to determine the system performance under a given policy. Different performance measures can be used in different situations. For example, in normal operation we may wish to know the long-term average performance of the system; however, if the system is experiencing temporary availability problems, a transient measure will be more useful.
- We will engineer workload management solutions that ensure that the requirements of the users are met in the context of the changing demands on the system (including variable workload, availability of servers, dynamic operating policies etc). The approach we propose is holistic and includes the coordination of resources, the dispatching and scheduling of tasks and the recovery from failure (including hard failures such as server outage, and soft failures such as defaulting on compliance).
- We will develop a software architecture, and invent distributed algorithms, so that a system can utilise decision-making algorithms and policies in an efficient and dependable manner. This research is unique in that it focuses on the impact that run-time decision-making algorithms have on the design of open, large-scale systems. The algorithms and design are independent of the application domain (workload management, resource allocation, retry optimisation, etc.) and of the metric used (performance, reliability, etc.), and can therefore form the backbone for future large-scale adaptive systems, possibly through standardisation.
- We will realise the potential of the dynamic operating policies and middleware through direct involvement with the e-Science business-user community.
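The distinction drawn above between a long-term average measure and a transient measure can be illustrated with a minimal sketch. It is not taken from the project: it assumes a single server modelled as a two-state (up/down) continuous-time Markov chain with exponentially distributed times to failure and to repair, and the function names and the rates used below are hypothetical choices for illustration only.

```python
import math


def steady_state_availability(failure_rate: float, repair_rate: float) -> float:
    """Long-run fraction of time the server is up (assumed two-state model)."""
    return repair_rate / (failure_rate + repair_rate)


def transient_availability(
    failure_rate: float,
    repair_rate: float,
    t: float,
    up_at_start: bool = True,
) -> float:
    """Probability that the server is up at time t, given its state at time 0."""
    total = failure_rate + repair_rate
    limit = repair_rate / total      # the steady-state value reached as t grows
    decay = math.exp(-total * t)     # influence of the initial state fades at this rate
    if up_at_start:
        return limit + (failure_rate / total) * decay
    return limit * (1.0 - decay)


if __name__ == "__main__":
    # Hypothetical rates: one failure per 100 time units, mean repair time about 2.
    lam, mu = 0.01, 0.49
    print("steady state:", steady_state_availability(lam, mu))
    for t in (0.0, 1.0, 5.0, 20.0):
        print(t, transient_availability(lam, mu, t, up_at_start=False))
```

With these assumed rates the long-term availability is 0.98, but one time unit after an outage begins the probability that the server is up again is only about 0.39. A policy evaluated during temporary availability problems would therefore see a very different figure from the long-term average, which is the point the summary makes.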
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
Description: This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Summary:
Date Materialised:
Sectors submitted by the Researcher
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: http://www.ncl.ac.uk