Use this URL to cite or link to this record in EThOS:
Title: Computational models of the human visual cortex : on individual differences and ecologically valid input statistics
Author: Mehrer, Johannes
ISNI:       0000 0004 9354 3257
Awarding Body: University of Cambridge
Current Institution: University of Cambridge
Date of Award: 2020
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Please try the link below.
Access from Institution:
Perception relies on cortical processes in response to sensory stimuli. Visual input entering the eyes ascends a cascade of processing steps from the retina to high-level regions of the cortex. Vision science investigates these transformations that give rise to high-level processing of visual objects, such as object recognition. In this thesis I investigate computational models of the human visual cortex with regard to their ability to predict cortical responses to visual objects. In particular, I describe two factors playing an important role in using deep neural networks (DNNs) to better understand cortical functioning: the initial weight state and ecologically more valid input statistics. In Chapter 1 of this thesis I will introduce relevant literature pertaining to deep neural networks as a modeling framework for the visual cortex. Next, I will lay out the motivation for the research questions investigated in this thesis and described in detail in Chapters 2, 3, and 4. Chapter 2 focuses on the impact of the initial weight state of a model on its ability to predict cortical representations. I describe work in which we demonstrate that two DNN instances identical in every aspect but their initial weights, yield very dissimilar representations. Relying on single network instances to predict cortical activation patterns in response to sensory stimuli poses a problem for computational neuroscience: depending on the initial set of weights the ability to mirror the cortical representations of these stimuli might vary. Thus, results based on single (“off-the-shelf”) model instances - as commonly used in computational neuroscience - may not generalize. In contrast, using multiple DNN instances might alleviate this problem as they allow insights in the variability of a given model architecture to predict cortical representations. These individual differences between model instances suggest that to allow results to generalize more easily the model instances should be treated similar to human experimental participants. In Chapter 3 I focus on ecologically more valid input statistics (in the form of training images) aiming to improve a model’s ability to predict cortical representations. The most successful models of the human visual cortex to date are DNNs trained on object recognition tasks designed with machine learning goals in mind. However, the image sets used for training these DNNs are often not ecologically realistic. For example, training on the most-widely used image set in computational neuroscience (ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012) requires the fine-grained distinction of 120 dog breeds, but does not contain visual object categories encountered frequently in everyday human life (e.g. woman, man, or child). This suggests that taking into account the human visual experience when training models of the human visual cortex on a categorization task might help to predict cortical representations. In this Chapter I describe the creation of a set of images aimed at mimicking the human visual diet: ecoset. Ecoset contains more than 1.5 million images from 565 basic level categories and is the largest image set specifically designed for computational neuroscience to date. Ecoset is freely available to allow the community to test their own hypotheses of models trained with input statistics matched to the human visual environment. In Chapter 4 we build on the results from the previous two Chapters. Using multiple DNN instances I investigate whether a brain-inspired model architecture (vNet) trained on ecologically more valid input statistics (ecoset) might improve its ability to predict cortical representations. I first demonstrate that ecoset might improve an architecture’s ability to mirror cortical representations. Furthermore, ecoset-trained vNet also outperforms state-ofthe- art computer vision and computational neuroscience models in terms of mirroring cortical representations in the human brain. Thus, incorporating biological and ecological aspects, such as brain-inspired architectural features and ecologically more valid input statistics, into computational models may yield better predictions of response patterns in the human visual cortex. Treating DNN instances similar to human experimental participants and considering ecological and biological factors for building these DNNs may be an important step towards better models of the human visual cortex. Such models might allow a better understanding of the cortical processes underlying high-level vision in the human brain.
Supervisor: Kietzmann, Tim C. ; Kriegeskorte, Nikolaus Sponsor: Cambridge Trust ; Cambridge Philosophical Society ; MRC
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
Keywords: Vision ; Deep neural networks ; Representational similarity analysis ; Ecoset