
Details of Grant 

EPSRC Reference: GR/J82386/01
Title: STOCHASTIC REINFORCEMENT ALGORITHMS FOR LEARNING CONTINUOUS FUNCTIONS USING PROBABILISTIC RAM NETS
Principal Investigator: Gorse, Dr D
Other Investigators:
Researcher Co-Investigators:
Project Partners:
Department: Computer Science
Organisation: UCL
Scheme: Standard Research (Pre-FEC)
Starts: 18 April 1994
Ends: 17 April 1996
Value (£): 98,312
EPSRC Research Topic Classifications:
EPSRC Industrial Sector Classifications:
Related Grants:
Panel History:  
Summary on Grant Application Form
1. To develop a form of pulse-driven stochastic reinforcement training that can learn real-valued (continuous) functions and is suitable for hardware implementation using probabilistic RAM (pRAM) net technology.
2. To demonstrate the effectiveness of the algorithm in a variety of application areas, including pattern recognition, time series prediction and real-time control.

Progress:
1. (a) We have developed a modified form of error-dependent reward and penalty function which gives faster convergence and lower final error levels.
   (b) We have extended the output transform described in the project proposal to include an adaptable threshold as well as an adaptable gain, giving the learning system greater flexibility.
   (c) We have explored an alternative output transform module (also with adaptable gain and threshold) which has the advantage of a simpler hardware realisation.
2. (a) We have used the system to classify high-dimensional real-world pattern data. This involved exploring pyramidal architectures and multiple sampling of data points in order to ensure adequate generalisation.
   (b) We have applied the system to a benchmark time series prediction problem, the well-known sunspot numbers prediction task. The pRAM system predicted more accurately than a conventional neural network system of comparable complexity.
   (c) We have extended the learning techniques to situations of delayed reinforcement, using traces to record past experiences and actions, and applied the new techniques successfully to the classic pole-balancing (inverted pendulum) control problem.
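As an illustration only, the basic idea of stochastic reward-penalty training on a single pRAM node might be sketched as follows. This is not the project's algorithm: the modified error-dependent reward and penalty function, the output transform and the trace mechanism reported above are not reproduced. The class name `PRAM`, the update rule, the parameters `rho` (learning rate) and `lam` (penalty weighting), and the toy AND task are all assumptions chosen for the sketch.

```python
import random


class PRAM:
    """Hypothetical single pRAM node: one firing probability per input address."""

    def __init__(self, n_inputs, rho=0.1, lam=0.05, seed=0):
        # alpha[u] is the probability of emitting a 1 when address u is selected.
        self.alpha = [0.5] * (2 ** n_inputs)
        self.rho = rho    # learning rate (assumed value)
        self.lam = lam    # relative weight of penalty updates (assumed value)
        self.rng = random.Random(seed)

    def address(self, bits):
        # Pack the binary input vector into a memory address.
        u = 0
        for b in bits:
            u = (u << 1) | b
        return u

    def fire(self, bits):
        # Emit a stochastic output pulse from the addressed location.
        u = self.address(bits)
        a = 1 if self.rng.random() < self.alpha[u] else 0
        return u, a

    def reinforce(self, u, a, r, p):
        # Reward (r) pulls alpha[u] toward the emitted output a;
        # penalty (p) pulls it, more weakly, toward the opposite output.
        al = self.alpha[u]
        self.alpha[u] = al + self.rho * ((a - al) * r + self.lam * ((1 - a) - al) * p)


def train_and(epochs=300):
    """Train one 2-input node on logical AND using a binary reward signal."""
    node = PRAM(2)
    patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
    for _ in range(epochs):
        for bits in patterns:
            u, a = node.fire(bits)
            target = bits[0] & bits[1]
            r = 1 if a == target else 0   # reward when the pulse matches the target
            node.reinforce(u, a, r, 1 - r)
    return node.alpha


if __name__ == "__main__":
    print(train_and())
```

In this sketch the stored probability for address (1, 1) drifts toward 1 and the other three toward 0. A real-valued output, as targeted by the project, would instead be read from the mean firing rate over a window of pulses.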
Key Findings
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Potential use in non-academic contexts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Impacts
This information can now be found on Gateway to Research (GtR) http://gtr.rcuk.ac.uk
Project URL:  
Further Information:  
Organisation Website: