Title: Machine learning for neural coding of sound envelopes : slithering from sinusoids to speech
Author: Levy, Alban Hugo
ISNI: 0000 0004 7233 8975
Awarding Body: University of Nottingham
Current Institution: University of Nottingham
Date of Award: 2018
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Restricted access.
Specific locations within the brain contain neurons that respond by firing action potentials (spikes) when a sound is played in the ear of a person or animal. The number and timing of these spikes encode information about the sound; this code is the basis for our perceiving and understanding the acoustic world around us. To understand how the brain processes sound, we must understand this code. The difficulty lies in evaluating a neural code that is unknown. This thesis applies Machine Learning (ML) to evaluate auditory coding of dynamic sounds by spike trains, with datasets of varying complexity. In the first part, a battery of ML algorithms is used to evaluate modulation frequency coding from the neural response to amplitude-modulated sinusoids in cat Cochlear Nucleus spike train data. It is found on this recognition task that, whilst absolute performance levels depend on the type of algorithm, the algorithms' performance relative to each other is the same across different types of neurons. Thus a single powerful classification algorithm is sufficient for evaluating neural codes. Similarly, different performance measures are useful in understanding differences between ML algorithms, but they shed little light on different neural coding strategies. In contrast, the features used for classification are crucial; e.g. Vector Strength does not provide an accurate measure of the information contained in spike timing. Overall, different types of neurons do not encode the same amount of amplitude-modulation information. This emphasises the value of applying powerful ML methods to raw spike timing information. In the second part, a more ecological and heterogeneous set of sounds, speech, is used. The application of Hidden Markov Model based Automatic Speech Recognition (ASR) is tested within the constraints of an electrophysiological experiment.
The findings suggest that a continuous digit recognition task is amenable to a physiology experiment: using only 10 minutes of simulated recording to train statistical models of phonemes, an accuracy of 70% could be achieved. This rises to about 85% when 200 minutes' worth of simulated data are used. A digit recognition framework is sufficient to examine how performance is influenced by the size and nature of a neural population and by spike timing. Previous results suggest, however, that this accuracy would be reduced if experimental Inferior Colliculus data were used instead of a guinea-pig cochlear model. On the other hand, a fully-fledged continuous ASR task on a large vocabulary with many speakers may yield a phoneme accuracy (∼40%) too low to support an investigation of auditory coding. Overall this suggests that complex ML algorithms such as ASR can nevertheless be used in practice to assess neural coding of speech, with careful selection of features.
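For readers unfamiliar with the Vector Strength feature named in the abstract, the following is a minimal sketch of its conventional definition: the length of the mean resultant vector of spike phases relative to the modulation period. It is an illustration only and not code from the thesis; the function name `vector_strength` and the two example spike trains are hypothetical.

```python
import math


def vector_strength(spike_times, mod_freq_hz):
    """Vector strength of spike times (in seconds) relative to a
    modulation frequency (in Hz). Hypothetical helper for illustration.

    Returns a value between 0 (no phase locking) and 1 (all spikes
    fall at the same phase of the modulation cycle).
    """
    if not spike_times:
        raise ValueError("need at least one spike")
    # Map each spike time to its phase within the modulation cycle.
    phases = [2.0 * math.pi * mod_freq_hz * t for t in spike_times]
    # Sum unit vectors at those phases and normalise by the spike count.
    c = sum(math.cos(p) for p in phases)
    s = sum(math.sin(p) for p in phases)
    return math.hypot(c, s) / len(spike_times)


# Assumed example data: one spike per cycle of a 100 Hz modulation,
# always at the same phase.
locked = [k / 100.0 for k in range(50)]

# Assumed example data: spikes spread evenly over four phases of the cycle.
spread = [k / 100.0 + (k % 4) * 0.0025 for k in range(48)]

vs_locked = vector_strength(locked, 100.0)
vs_spread = vector_strength(spread, 100.0)
```

Under these assumptions `vs_locked` comes out close to 1 and `vs_spread` close to 0. The measure reduces a whole spike train to a single number per modulation frequency.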
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Q Science (General) ; QP351 Neurophysiology and neuropsychology