Title: Deep domain generalisation : from homogeneous to heterogeneous
Author: Li, Da
Awarding Body: University of Surrey
Current Institution: University of Surrey
Date of Award: 2019
Domain Generalisation (DG) requires a machine learning model trained on one or more source domains to perform well on an unseen target domain. Previous DG work suffers from three main problems. First, existing vision-based DG benchmarks contain conventional photo images only, which undermines the significance of results evaluated on them. Second, DG has been little studied in the context of deep convolutional neural networks (CNNs). Third, the assumption that source and target domains share an identical task space is restrictive. To address these problems, this thesis first proposes a DG dataset that is more challenging than the traditional benchmarks, with more diverse visual domains, to drive future research in this field. It then studies both model-based and model-agnostic deep learning methods for DG. Finally, it generalises standard (homogeneous) DG to a more challenging setting: heterogeneous label-space DG. The existing DG benchmarks consist of photo images at different resolutions or drawn from different photo datasets. A simple baseline (AGG) that aggregates the data from the source domains to train a deep end-to-end model works surprisingly well on these benchmarks. We therefore propose a more challenging benchmark, `PACS', with broader domains: photo, art painting, cartoon and sketch. This new benchmark exhibits a larger domain shift and includes a scenario in which DG is well motivated: training on domains with abundant labelled data (e.g., photos) and testing on domains with scarce data (e.g., sketches). We also present the first deep DG method, a low-rank parameterised CNN model that extracts domain-agnostic parameters, and demonstrate that these parameters achieve state-of-the-art performance on both the traditional VLCS benchmark and our proposed PACS.
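The idea of extracting domain-agnostic parameters via a low-rank parameterisation can be illustrated with a toy sketch. This is our own simplified illustration, not the thesis's exact factorisation: the sizes, variable names and the "average the cores" rule are assumptions. Each source domain's layer weight is composed from factors shared across domains plus a small domain-specific core, and combining the cores yields a domain-agnostic weight usable on an unseen domain.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 3 source domains, a layer mapping 64 -> 10, rank 4.
n_domains, d_in, d_out, rank = 3, 64, 10, 4

# Factors shared by all domains (the domain-agnostic part)...
U = rng.standard_normal((d_in, rank))
V = rng.standard_normal((rank, d_out))

# ...combined with a small domain-specific core per source domain.
cores = rng.standard_normal((n_domains, rank, rank))

def layer_weight(domain: int) -> np.ndarray:
    """Weight for one source domain: low-rank composition of shared factors."""
    return U @ cores[domain] @ V  # shape (d_in, d_out)

def agnostic_weight() -> np.ndarray:
    """Weight for an unseen domain: here, shared factors with the mean core."""
    return U @ cores.mean(axis=0) @ V

x = rng.standard_normal((5, d_in))  # a batch of 5 feature vectors
logits = x @ agnostic_weight()      # forward pass with the agnostic weight
print(logits.shape)                 # (5, 10)
```

The point of the factorisation is parameter sharing: each domain adds only a rank x rank core (16 numbers here) rather than a full d_in x d_out weight, so the bulk of the capacity is forced to be domain-agnostic.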
Next, we propose a model-agnostic meta-learning method for DG that applies to both supervised learning and reinforcement learning. In each mini-batch, we mimic the domain shift of DG by synthesising virtual testing domains. The meta-optimisation objective requires that gradient steps which improve performance on the training domains should also improve performance on the virtual testing domains, thereby training a model that generalises well. As a result, our method achieves state-of-the-art results on PACS and excellent performance on two control problems: cart-pole and mountain car. Furthermore, we propose a lifelong learning framework for improving DG methods. In this framework, the base model encounters a sequence of domains and, at each training step, is optimised to maximise performance on the next domain. The performance on domain n then depends on the previous n-1 learning problems, so backpropagating through the sequence optimises performance not just on the next domain but on all following domains. Training on many such domain sequences gives a base DG learner dramatically more `practice' than existing approaches, improving performance on a real test domain. We incorporate two base methods, MLDG and Undo Bias, into this framework and show noticeable improvements over vanilla MLDG and Undo Bias, yielding state-of-the-art performance on three DG benchmarks. Finally, we propose an episodic training method for DG. The simple AGG baseline works surprisingly well and surpasses many previously published DG methods. We improve upon this strong and fast baseline by training the network with an episodic batch construction strategy, so that each module of the model is exposed to the kind of domain shift that characterises novel domains at runtime. We demonstrate that this episodic training improves AGG to state-of-the-art performance on three DG benchmarks.
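The meta-optimisation idea can be sketched as follows. This is a simplified, first-order illustration on a linear regression model with hypothetical data; the domain split, step sizes and variable names are our assumptions, not the thesis's exact algorithm. One source domain plays the (virtual) training role and another the virtual testing role: a gradient step taken on the meta-train loss must also reduce the meta-test loss, so the update combines both gradients.

```python
import numpy as np

rng = np.random.default_rng(1)

def mse_loss_grad(w, X, y):
    """Squared-error loss and its gradient for a linear model."""
    err = X @ w - y
    return 0.5 * np.mean(err**2), X.T @ err / len(y)

# Two hypothetical source domains: same underlying task, shifted inputs.
Xa = rng.standard_normal((32, 5))
ya = Xa @ np.ones(5)
Xb = rng.standard_normal((32, 5)) + 0.5  # meta-test domain, input shift
yb = Xb @ np.ones(5)

w = np.zeros(5)
alpha, beta, lr = 0.1, 1.0, 0.05  # virtual step size, trade-off, learning rate

for step in range(200):
    f_train, g_train = mse_loss_grad(w, Xa, ya)   # meta-train loss F(w)
    w_virtual = w - alpha * g_train               # virtual gradient step
    f_test, g_test = mse_loss_grad(w_virtual, Xb, yb)  # meta-test loss G(w')
    # Update on F(w) + beta * G(w - alpha * grad F(w)); for simplicity we use
    # the first-order approximation and drop the second-derivative terms.
    w -= lr * (g_train + beta * g_test)

final_loss, _ = mse_loss_grad(w, Xb, yb)
print(round(final_loss, 4))
```

Because the meta-test gradient is evaluated at the virtually updated parameters, the update favours directions that help both domains at once, rather than overfitting the meta-train domain.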
Furthermore, we show how to relax the previously mentioned assumption of a matching label space between source and target domains. This allows us to improve the pervasive workflow of using an ImageNet-trained CNN as a fixed feature extractor for downstream recognition tasks. We propose the first heterogeneous DG benchmark, VD-DG, providing the largest-scale demonstration of DG to date.
Supervisor: Song, Yi-Zhe; Hospedales, Timothy
Sponsor: University of Surrey
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral