Title: The role of prefrontal cortex and basal ganglia in model-based and model-free reinforcement learning
Author: Silva Miranda, B. A.
ISNI: 0000 0004 7229 8707
Awarding Body: UCL (University College London)
Current Institution: University College London (University of London)
Date of Award: 2016
Availability of Full Text: Full text unavailable from EThOS.
Contemporary reinforcement learning (RL) theory suggests that choices can be evaluated either by the model-free (MF) strategy of learning their past worth or the model-based (MB) strategy of predicting their likely consequences based on learning how decision states eventually transition to outcomes. Statistical and computational considerations argue that these strategies should ideally be combined. This thesis aimed to investigate the neural implementation of these two RL strategies and the mechanisms of their interactions. Two non-human primates performed a two-stage decision task designed to elicit and discriminate the use of both MF and MB-RL, while single-neuron activity was recorded from the prefrontal cortex (frontal pole, FP; anterior cingulate cortex, ACC; dorsolateral prefrontal cortex) and striatum (caudate and putamen). Logistic regression analysis revealed that the structure of the task (of MB importance) and the reward history (of MF and MB importance) significantly influenced choice. A trial-by-trial computational analysis also confirmed that choices were made according to a weighted combination of MF and MB-RL, with the influence of the latter approaching 90%. Furthermore, the valuations of both learning methods also influenced response vigour and pupil response. Neural correlates of key elements for MF and MB learning were observed across all brain areas, but functional segregation was also evident. Neurons in ACC encoded features of both MF and MB-RL, suggesting a possible role in arbitration between the two strategies. Striatal activity was consistent with a role in value updating by encoding reward prediction errors. Finally, novel neurophysiological evidence was found in favour of the role of the FP in counterfactual processing. In conclusion, this thesis provides insight into the neural implementation of MF and MB-RL computations and their various effects on diverse aspects of behaviour. It supports the parallel operation and integration of the two approaches, while revealing unexpected intricacies.
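The "weighted combination of MF and MB-RL" mentioned in the abstract can be illustrated with a minimal sketch. This record does not give the thesis's actual model, so everything below is an assumption made for illustration: the function names, the 0.7/0.3 transition probabilities, the example values, the softmax choice rule and its inverse temperature `beta`. The weight `w = 0.9` only echoes the reported MB influence "approaching 90%" and may not correspond to the fitted parameter.

```python
import math


def hybrid_values(q_mf, q_stage2, transition_probs, w):
    """Blend model-based and model-free first-stage action values.

    q_mf: cached (model-free) value of each first-stage action.
    q_stage2: value of each second-stage state.
    transition_probs[a][s]: assumed probability of reaching state s after action a.
    w: weight on the model-based estimate, between 0 and 1 (hypothetical parameter).
    """
    # Model-based value: expected second-stage value under the transition model.
    q_mb = [sum(p * v for p, v in zip(row, q_stage2)) for row in transition_probs]
    # Weighted combination of the two estimates.
    return [w * mb + (1.0 - w) * mf for mb, mf in zip(q_mb, q_mf)]


def softmax(values, beta):
    """Turn action values into choice probabilities (beta is an assumed inverse temperature)."""
    top = max(values)
    exps = [math.exp(beta * (v - top)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative numbers only; none are taken from the thesis.
q_net = hybrid_values(
    q_mf=[0.6, 0.4],
    q_stage2=[0.2, 0.8],
    transition_probs=[[0.7, 0.3], [0.3, 0.7]],
    w=0.9,
)
choice_probs = softmax(q_net, beta=5.0)
```

In this example the MF values favour the first action while the MB values favour the second. With `w = 0.9` the combined values follow the MB estimate, so the second action receives the higher choice probability.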
Supervisor: Kennerley, S.; Dayan, P.
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available