Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.668228
Title: Argumentation accelerated reinforcement learning
Author: Gao, Yang
ISNI:       0000 0004 5365 9781
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2014
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Please try the link below.
Access from Institution:
Abstract:
Reinforcement Learning (RL) is a popular statistical Artificial Intelligence (AI) technique for building autonomous agents, but it suffers from the curse of dimensionality: the computational requirement for obtaining the optimal policies grows exponentially with the size of the state space. Integrating heuristics into RL has proven to be an effective approach to combat this curse, but deriving high-quality heuristics from people's (typically conflicting) domain knowledge is challenging, yet it received little research attention. Argumentation theory is a logic-based AI technique well-known for its conflict resolution capability and intuitive appeal. In this thesis, we investigate the integration of argumentation frameworks into RL algorithms, so as to improve the convergence speed of RL algorithms. In particular, we propose a variant of Value-based Argumentation Framework (VAF) to represent domain knowledge and to derive heuristics from this knowledge. We prove that the heuristics derived from this framework can effectively instruct individual learning agents as well as multiple cooperative learning agents. In addition,we propose the Argumentation Accelerated RL (AARL) framework to integrate these heuristics into different RL algorithms via Potential Based Reward Shaping (PBRS) techniques: we use classical PBRS techniques for flat RL (e.g. SARSA(λ)) based AARL, and propose a novel PBRS technique for MAXQ-0, a hierarchical RL (HRL) algorithm, so as to implement HRL based AARL. We empirically test two AARL implementations - SARSA(λ)-based AARL and MAXQ-based AARL - in multiple application domains, including single-agent and multi-agent learning problems. Empirical results indicate that AARL can improve the convergence speed of RL, and can also be easily used by people that have little background in Argumentation and RL.
Supervisor: Toni, Francesca Sponsor: China Scholarship Council
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.668228  DOI: Not available
Share: