Use this URL to cite or link to this record in EThOS:
Title: Multi-agent learning for security and sustainability
Author: Klima, R.
ISNI:       0000 0004 8501 6756
Awarding Body: University of Liverpool
Current Institution: University of Liverpool
Date of Award: 2019
Availability of Full Text:
Access from EThOS:
Access from Institution:
This thesis studies the application of multi-agent learning in complex domains where safety and sustainability are crucial. We target some of the main obstacles in the deployment of multi-agent learning techniques in such domains. These obstacles consist of modelling complex environments with multi-agent interaction, designing robust learning processes and modelling adversarial agents. The main goal of using modern multi-agent learning methods is to improve the effectiveness of behaviour in such domains, and hence increase sustainability and security. This thesis investigates three complex real-world domains: space debris removal, critical domains with risky states and spatial security domains such as illegal rhino poaching. We first tackle the challenge of modelling a complex multi-agent environment. The focus is on the space debris removal problem, which poses a major threat to the sustainability of earth orbit. We develop a high-fidelity space debris simulator that allows us to simulate the future evolution of the space debris environment. Using the data from the simulator we propose a surrogate model, which enables fast evaluation of different strategies chosen by the space actors. We then analyse the dynamics of strategic decision making among multiple space actors, comparing different models of agent interaction: static vs. dynamic and centralised vs. decentralised. The outcome of our work can help future decision makers to design debris removal strategies, and consequently mitigate the threat of space debris. Next, we study how we can design a robust learning process in critical domains with risky states, where destabilisation of local components can lead to severe impact on the whole network. We propose a novel robust operator κ which can be combined with reinforcement learning methods, leading to learning safe policies, mitigating the threat of external attack, or failure in the system. Finally, we investigate the challenge of learning an effective behaviour while facing adversarial attackers in spatial security domains such as illegal rhino poaching. We assume that such attackers can be occasionally observed. Our approach consists of combining Bayesian inference with temporal difference learning, in order to build a model of the attacker behaviour. Our method can effectively use the partial observability of the attacker's location and approximate the performance of a full observability case. This thesis therefore presents novel methods and tackles several important obstacles in deploying multi-agent learning algorithms in the real-world, which further narrows the reality gap between theoretical models and real-world applications.
Supervisor: Tuyls, Karl ; Savani, Rahul Sponsor: Not available
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral