Title: Probabilistic fuzzy logic framework in reinforcement learning for decision making
Author: Hinojosa, William
ISNI: 0000 0004 2702 3851
Awarding Body: University of Salford
Current Institution: University of Salford
Date of Award: 2010
This dissertation addresses uncertainty handling during learning for agents that operate in stochastic environments and learn by reinforcement learning. Most previous work in reinforcement learning has proposed algorithms that address learning performance but neglect the uncertainty present in stochastic environments. Reinforcement learning is a valuable learning method when a system must select actions whose consequences emerge over long periods and for which input-output data are not available. Most combinations of fuzzy systems with reinforcement learning treat the environment as deterministic. In many cases, however, the consequence of an action is uncertain or stochastic in nature. This work proposes a novel reinforcement learning approach that combines the universal function approximation capability of fuzzy systems with a probabilistic fuzzy logic framework. Information from the environment is interpreted not deterministically, as in classic approaches, but statistically, by considering a probability distribution over long-term consequences. The generalized probabilistic fuzzy reinforcement learning (GPFRL) method presented in this dissertation is a modified version of the actor-critic learning architecture: learning is enhanced by introducing a probability measure into the learning structure, and an incremental gradient descent weight-updating algorithm provides convergence. Experiments were performed on simulated and real environments based on a travel planning spoken dialogue system. The experimental results support the following claims: first, GPFRL shows robust performance in control optimization tasks; second, its learning speed exceeds that of most other similar methods; third, GPFRL agents are feasible and promising for the design of adaptive behaviour robotics systems.
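The actor-critic idea described in the abstract can be illustrated with a small sketch. This is not the GPFRL algorithm from the dissertation, whose update rules are not given in this record. It is a generic probabilistic fuzzy actor-critic written under stated assumptions: a one-dimensional state, triangular membership functions, two actions, a per-rule action probability given by a sigmoid of an actor weight, and a standard temporal-difference policy-gradient update. All names (`memberships`, `ProbFuzzyActorCritic`, `train_demo`) and all parameter values are hypothetical.

```python
import math
import random


def memberships(x, centers):
    """Normalized triangular firing strengths for evenly spaced rule centers."""
    x = min(max(x, centers[0]), centers[-1])  # clamp so at least one rule fires
    width = centers[1] - centers[0]
    raw = [max(0.0, 1.0 - abs(x - c) / width) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class ProbFuzzyActorCritic:
    """Illustrative agent: each fuzzy rule holds a probability of choosing action 1."""

    def __init__(self, centers, alpha=0.1, beta=0.2, gamma=0.9):
        self.centers = centers
        self.w = [0.0] * len(centers)  # actor weights, rule j picks action 1 with sigmoid(w[j])
        self.v = [0.0] * len(centers)  # critic weights, one value estimate per rule
        self.alpha, self.beta, self.gamma = alpha, beta, gamma

    def action_prob(self, x):
        """P(action = 1 | x): firing-strength-weighted mix of per-rule probabilities."""
        phi = memberships(x, self.centers)
        return sum(f * sigmoid(w) for f, w in zip(phi, self.w))

    def value(self, x):
        phi = memberships(x, self.centers)
        return sum(f * v for f, v in zip(phi, self.v))

    def act(self, x, rng):
        return 1 if rng.random() < self.action_prob(x) else 0

    def update(self, x, a, r, x_next, done):
        """One incremental gradient step driven by the TD error."""
        phi = memberships(x, self.centers)
        target = r if done else r + self.gamma * self.value(x_next)
        delta = target - self.value(x)  # TD error computed before any weight changes
        p1 = self.action_prob(x)
        for j, f in enumerate(phi):
            self.v[j] += self.beta * delta * f  # critic: move value toward target
            pj = sigmoid(self.w[j])
            dp1 = f * pj * (1.0 - pj)  # derivative of p1 with respect to w[j]
            grad_log = dp1 / p1 if a == 1 else -dp1 / (1.0 - p1)
            self.w[j] += self.alpha * delta * grad_log  # actor: policy-gradient step
        return delta


def train_demo(episodes=2000, seed=0):
    """Hypothetical one-step stochastic task: action 1 pays off more often than action 0."""
    rng = random.Random(seed)
    agent = ProbFuzzyActorCritic([0.0, 0.5, 1.0])
    for _ in range(episodes):
        x = rng.random()
        a = agent.act(x, rng)
        reward = 1.0 if rng.random() < (0.8 if a == 1 else 0.2) else 0.0
        agent.update(x, a, reward, x, done=True)
    return agent
```

Calling `train_demo()` returns a trained agent whose `action_prob(x)` can be inspected for any state `x` in [0, 1]. The stochastic reward in the demo stands in for the uncertain action consequences the abstract refers to, while the per-rule probabilities play the role of the probability measure added to the learning structure.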
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:  DOI: Not available