Title: Lossy compression of speech using perceptual criteria
Author: O'Donnell, Michael
ISNI:       0000 0001 3452 6121
Awarding Body: University of Central Lancashire
Current Institution: University of Central Lancashire
Date of Award: 1998
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Please try the link below.
Access from Institution:
The research contained in this thesis provides an investigation into a new method of minimising the perceptual differences when encoding digitised speech. An application of the perceptual criteria is described in the context of a codebook encoding methodology. Some of the background studies covered aspects of psychoacoustics, in particular the effects of the human outer, middle and inner ear. Models approximating each region of the ear are utilised and concatenated into a single overall auditory response path model. As the objective of the research is to encode and decode speech waveforms, some study into how speech is produced and the classification of speech sounds is required. From this follows a description of a basic speech production model, which is represented as a digital filter. A review of the main categories of coding schemes currently employed is presented, along with commonly used coding methods. In particular, the codebook coding method is reviewed in sufficient detail to contrast with the new coding method. The development of a new perceptual minimisation criterion, which relies on dual application of the auditory response path model to the original and reconstructed speech waveforms, is described. In this, the ordering of codebook searches, the frequency spectrum used as the search target, and windowing functions with their durations and placement are all analysed to determine the optimum encoder design. Also described are a number of prospective gain algorithms which cover both time and frequency domain implementations. A new encoder is constructed which fully integrates the new perceptual criterion into the minimisation of the difference between the original and reconstructed speech waveforms. In the minimisation no part of the traditional encoder method is used; however, both methods use a similar technique for determining gain factors. Speech derived from both encoders was subjectively assessed by a number of untrained, independent listeners. 
The results presented show that both methods are comparable but there is a slight preference towards the traditional encoder. A measure of the complexity indicated that the new minimisation method is also more complex than the traditional encoder.
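Illustrative note (not part of the thesis record): the abstract describes a codebook search in which an auditory model is applied to both the original and the reconstructed speech before the two are compared. The Python sketch below shows that general idea only. Every name in it is hypothetical, the weighting curve is a placeholder rather than the outer, middle and inner ear models the thesis uses, and the time-domain least-squares gain is an assumed stand-in for the gain algorithms the abstract mentions.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2 of a real frame (naive O(N^2) DFT)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc))
    return spectrum


def placeholder_ear_weights(num_bins):
    """HYPOTHETICAL weighting: a smooth bump centred on the middle bins.

    Stands in for an auditory response path model; it is not taken
    from the thesis.
    """
    centre = (num_bins - 1) / 2
    return [math.exp(-8.0 * ((k - centre) / num_bins) ** 2)
            for k in range(num_bins)]


def perceptual_distance(original, candidate, weights):
    """Squared error between the two spectra after the same weighting
    has been applied to each (the 'dual application' idea)."""
    a = magnitude_spectrum(original)
    b = magnitude_spectrum(candidate)
    return sum((w * x - w * y) ** 2 for w, x, y in zip(weights, a, b))


def least_squares_gain(original, entry):
    """ASSUMED time-domain gain: the scale factor that best fits the
    codebook entry to the original frame in the least-squares sense."""
    energy = sum(e * e for e in entry)
    if energy == 0:
        return 0.0
    return sum(o * e for o, e in zip(original, entry)) / energy


def search_codebook(original, codebook, weights):
    """Return (index, gain, distance) for the entry whose scaled version
    is closest to the original under the weighted spectral distance."""
    best = None
    for index, entry in enumerate(codebook):
        gain = least_squares_gain(original, entry)
        candidate = [gain * e for e in entry]
        dist = perceptual_distance(original, candidate, weights)
        if best is None or dist < best[2]:
            best = (index, gain, dist)
    return best


# Toy example: a codebook of four sinusoids, and a frame that is a
# scaled copy of the third entry.
N = 16
codebook = [[math.sin(2 * math.pi * f * t / N) for t in range(N)]
            for f in (1, 2, 3, 4)]
frame = [0.5 * s for s in codebook[2]]
weights = placeholder_ear_weights(N // 2 + 1)
index, gain, dist = search_codebook(frame, codebook, weights)
```

In this toy case the search is expected to pick the entry the frame was built from. A traditional encoder would typically compare waveforms or spectra without the auditory weighting step; as the abstract reports, the perceptual variant was found to be more complex.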
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Engineering design ; W240 - Industrial/product design