Title: Advanced low bit-rate speech coding below 2.4 kbps
Author: Unver, Emre
ISNI: 0000 0004 2684 4391
Awarding Body: University of Surrey
Current Institution: University of Surrey
Date of Award: 2010
The telecommunications industry has grown rapidly in recent decades. With increasing demand for transmitting speech over bandwidth-limited media, such as mobile or satellite communication links, and for storing spoken information in bit-rate-limited media, such as silicon memory, efficient compression of speech has become an important issue. Although existing speech coding standards produce high-quality speech above 4 kbps, there is still room for improvement at lower bit-rates, particularly at 2.4 kbps and below. Good speech quality and intelligibility at very low bit-rates are especially important for military wireless communications, where some of the bandwidth is required for error correction, and for applications where speech is embedded into other speech or non-speech data.
Parametric coders, such as sinusoidal coders, are used extensively at low bit-rates. This work relaxes the delay, memory and complexity constraints and discusses strategies for lowering the bit-rates of sinusoidal coders while maintaining good speech quality. These strategies include an extension of previous work in the literature on combining several frames within a metaframe, variable bit-allocation schemes, and a new algorithm that estimates voicing from the spectral envelope. The use of phonemes in speech coding is also investigated for further bit reductions. A method for producing highly intelligible speech of modest quality at a very low bit-rate is presented, and the coding of any extra information needed to achieve high quality is discussed.
These strategies have been implemented in the SB-LPC vocoder in order to perform parameter quantisation at several bit-rates. Listening tests indicate that the proposed techniques are effective in lowering the bit-rate from 2.4 kbps to 1.2 kbps, from 1.2 kbps to 0.8 kbps, and from 4.0 kbps to 1.8 kbps while maintaining speech quality. In addition, a coding scheme operating at 309 bps is designed; it produces speech whose intelligibility is similar to that of MELP operating at 600 bps. Finally, the performance of the proposed strategies and possibilities for improvement are discussed.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available