Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.640580
Title: Parallel iterative solution methods for Markov decision processes
Author: Archibald, Thomas Welsh
Awarding Body: University of Edinburgh
Current Institution: University of Edinburgh
Date of Award: 1992
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS.
Abstract:
Markov decision processes form an important class of dynamic programming problems because they are widely applicable. However, solving real applications of Markov decision processes on serial computers is often impractical due to constraints on memory and processing time. Parallel processing has long been considered a potential solution to the computational intractability of these problems on serial machines, but prior to this work no detailed theoretical or practical studies in this area had been carried out. This thesis examines several successful serial iterative solution methods for infinite-horizon, time-invariant, discounted Markov decision processes and develops efficient analogous parallel algorithms. Particular consideration is given to the two classes of iterative solution methods known as value iteration methods and reward revision, but the techniques developed and the conclusions drawn are applicable to other iterative methods for Markov decision processes (for example, policy iteration methods) and also to iterative methods in general. Iterative methods are applied to many other problem areas, including dynamic programming and the solution of linear and differential equations. The main thrust of this thesis is the optimisation of the performance of the parallel algorithms developed. A detailed analysis of the implementation of several parallel iterative solution methods on a distributed-memory, multiple-instruction, multiple-data parallel processor reveals the key issues involved in optimising performance. Timing models are developed for processor communication time, processor calculation time and overall run time. These models guide the choice of the connection topology, the communication protocols and the degree of overlapping of communication and calculation. This leads to the development of a phased pipeline algorithm which yields 60-fold speed-ups when a ring of 121 transputers is used to solve problems with 60,000 states and sparse transition structures.
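For readers unfamiliar with the serial starting point named in the abstract, the following is a minimal sketch of plain value iteration for an infinite-horizon, discounted Markov decision process with a sparse transition structure. It is not the thesis's phased pipeline algorithm or any of its parallel variants. The function name `value_iteration`, the data layout (a list of `(next_state, probability)` pairs per state and action), the stopping tolerance and the two-state example are all assumptions made for illustration.

```python
def value_iteration(transitions, rewards, discount, tol=1e-8, max_sweeps=10_000):
    """Serial value iteration (illustrative sketch, not the thesis's algorithm).

    transitions[s][a] is a sparse list of (next_state, probability) pairs.
    rewards[s][a] is the immediate reward for taking action a in state s.
    Returns the approximate optimal values and a greedy policy.
    """
    n_states = len(transitions)

    def action_value(s, a, values):
        # One-step lookahead: immediate reward plus discounted expected value.
        return rewards[s][a] + discount * sum(p * values[t] for t, p in transitions[s][a])

    values = [0.0] * n_states
    for _ in range(max_sweeps):
        # Each sweep updates every state from the previous sweep's values.
        new_values = [
            max(action_value(s, a, values) for a in range(len(transitions[s])))
            for s in range(n_states)
        ]
        change = max(abs(new - old) for new, old in zip(new_values, values))
        values = new_values
        if change < tol:  # assumed stopping rule: small sup-norm change
            break

    # Greedy policy with respect to the final value estimates.
    policy = [
        max(range(len(transitions[s])), key=lambda a, s=s: action_value(s, a, values))
        for s in range(n_states)
    ]
    return values, policy


# Hypothetical two-state example: action 0 stays put, action 1 switches state.
example_transitions = [
    [[(0, 1.0)], [(1, 1.0)]],  # state 0
    [[(1, 1.0)], [(0, 1.0)]],  # state 1
]
example_rewards = [
    [1.0, 0.0],  # state 0: staying earns 1, switching earns 0
    [2.0, 0.0],  # state 1: staying earns 2, switching earns 0
]
example_values, example_policy = value_iteration(example_transitions, example_rewards, 0.9)
```

Within one sweep the update for each state depends only on the previous sweep's values, so the states can in principle be partitioned across processors that exchange values between sweeps. This is one general reason iterative methods of this kind are candidates for parallel implementation; the sketch makes no claim about how the thesis partitions or communicates the work.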
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.640580
DOI: Not available