Combined speech and audio coding with bit rate and bandwidth scalability.
The past two decades have witnessed a rapid expansion within the telecommunications
industry. This growth has been primarily motivated by the proliferation of digital
communication systems and services which have become easily available through wired
and wireless systems. Current research trends involve the integration of speech, audio,
video and data channels into true multimedia communications over fixed and mobile
networks. However, while the available bandwidth in wired terrestrial networks is relatively
cheap and expandable, it becomes a limited resource in satellite and cellular-radio
systems. In order to accommodate an ever-growing number of users while maintaining
high quality and low operational costs, it is necessary to maximise spectral efficiency.
This has given rise to the development of high rate compression techniques with the
ability to adapt to a broad class of input signals and to varying network resources.
The research carried out in this thesis has mainly focused on the design of a single
algorithm for compressing speech and audio signals sampled at different rates. The
algorithms are based on the analysis-by-synthesis linear prediction coding (AbS-LPC)
scheme, which has been widely employed in various speech coding standards. However,
this bit rate reduction technique is based on the speech production mechanism and as
such provides a rigid structure which presents a major limitation for audio coding. In
order to improve the audio quality at low rates and to compensate for the errors incurred
by the linear prediction during segments of high transitions, the algorithms employ an
efficient pulse excitation structure which represents the short innovation sequences with
sparse unit magnitude pulses. The scheme proposed for the compression of telephone
bandwidth speech and audio signals at 12kb/s achieves similar quality to the G.728
coder at 16kb/s and higher audio quality than the GSM-EFR standard at 12.2kb/s.
Wideband speech and audio coding schemes have been designed using both the full band
approach at bit rates of 17 and 19kb/s and also the split band technique at a bit rate of
20kb/s. The perceptual quality is comparable to the G.722 coder operating at 48kb/s.
The subband decomposition technique is also adapted to code speech and audio signals
sampled at 32kHz. The quality of the coder at 28kb/s is similar to the quality achieved
by the MP3 coder at 32kb/s. The algorithm also provides bandwidth and bit rate
scalability ranging from 12 to 64kb/s, making it ideal for deployment in rate-adaptive