Title: Learning transferable representations
Author: Rojas-Carulla, Mateo
ISNI: 0000 0004 7961 969X
Awarding Body: University of Cambridge
Current Institution: University of Cambridge
Date of Award: 2019
Availability of Full Text: Full text unavailable from EThOS.
Abstract: A first contribution of this thesis is to propose causality as a language for problems of distribution shift. First, we consider domain generalisation, where no data from the test distribution are observed during training. What assumptions can be made regarding the relation between train and test distributions for transfer to succeed? We argue that assuming the data in both tasks originate from the same causal graph leads to a natural solution: use only causal features for prediction, as the mechanism mapping causes to effects is invariant to shifts in the probability distributions induced by the causal structure. We provide optimality results when the test task is adversarial, and introduce a method for exploiting all remaining features when data from the test task are observed. We argue that learning such invariant mechanisms, which map features to outputs, yields machine learning modules that are robust to transfer. Second, we consider a classification problem where only a few examples are available for each label. How should an initial large dataset be leveraged to improve performance in this task? We argue that such a dataset should be used to learn powerful features for batch classification using a neural network. We present a framework that transfers between classes by building a probabilistic model on the weights of the network. Our results suggest that practitioners should use the original dataset to build features whose power can be exploited during few-shot learning. Finally, we extend causal discovery to solve problems such as distinguishing a painting from its counterfeit. Given two such static entities, a proxy random variable introduces the randomness necessary to construct two features of the static entities that preserve their causal footprint, which can then be measured by a standard causal discovery procedure. Experiments on vision and language provide evidence that the causal relation between the static entities can often be identified.
Supervisor: Schölkopf, Bernhard ; Turner, Richard
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
Keywords: Machine Learning ; Transfer Learning ; Dataset shift ; Learning to Learn ; Few-shot Learning ; Causality ; Causal Learning