Title: Knowledge-based reward shaping with knowledge revision in reinforcement learning
Author: Efthymiadis, Kyriakos
ISNI: 0000 0004 5357 9159
Awarding Body: University of York
Current Institution: University of York
Date of Award: 2014
Reinforcement learning has proven to be a successful artificial intelligence technique when an agent needs to act and improve in a given environment. The agent receives feedback about its behaviour in terms of rewards through constant interaction with the environment, and in time manages to identify which actions are more beneficial in each situation. Typically, reinforcement learning assumes the agent has no prior knowledge about the environment it is acting in. Nevertheless, in many cases (potentially abstract and heuristic) domain knowledge of the reinforcement learning task is available from domain experts and can be used to improve learning performance. One way of imparting knowledge to an agent is through reward shaping, which guides an agent by providing additional rewards. One common assumption when imparting knowledge to an agent is that the domain knowledge is always correct. Given that the provided knowledge is of a heuristic nature, there are cases when this assumption is not met, and it has been shown that when the provided knowledge is wrong, the agent takes longer to learn the optimal policy. As reinforcement learning methods are shifting more towards informed agents, the assumption that expert domain knowledge is always correct needs to be relaxed in order to scale these methods to more complex, real-life scenarios. To accomplish that, agents need a mechanism to deal with those cases where the provided expert knowledge is not perfect. This thesis investigates and documents the adverse effects erroneous knowledge can have on the learning process of an agent if care is not taken. Moreover, it provides a novel approach to dealing with erroneous knowledge through the use of knowledge revision principles, allowing agents to use their experiences to revise knowledge and thus benefit from more accurate shaping.
Empirical evaluation shows that agents that are able to revise erroneous parts of the provided knowledge can reach better policies faster than agents that do not have knowledge revision capabilities.
Supervisor: Kudenko, Daniel
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available