An investigation into the real-time manipulation and control of three-dimensional sound fields
This thesis describes a system that can be used for the decoding of a three dimensional audio recording over headphones or two, or more, speakers. A literature review of psychoacoustics and a review (both historical and current) of surround sound systems is carried out. The need for a system which is platform independent is discussed, and the proposal for a system based on an amalgamation of Ambisonics, binaural and transaural reproduction schemes is given. In order for this system to function optimally, each of the three systems rely on providing the listener with the relevant psychoacoustic cues. The conversion from a five speaker ITU array to binaural decode is well documented but pair-wise panning algorithms will not produce the correct lateralisation parameters at the ears of a centrally seated listener. Although Ambisonics has been well researched, no one has, as yet, produced a psychoacoustically optimised decoder for the standard irregular five speaker array as specified by the ITU as the original theory, as proposed by Gerzon and Barton (1992) was produced (known as a Vienna decoder), and example solutions given, before the standard had been decided on. In this work, the original work by Gerzon and Barton (1992) is analysed, and shown to be suboptimal, showing a high/low frequency decoder mismatch due to the method of solving the set of non-linear simultaneous equations. A method, based on the Tabu search algorithm, is applied to the Vienna decoder problem and is shown to provide superior results to those shown by Gerzon and Barton (1992) and is capable of producing multiple solutions to the Vienna decoder problem. During the write up of this report Craven (2003) has shown how 4th order circular harmonics (as used in Ambisonics) can be used to create a frequency independent panning law for the five speaker ITU array, and this report also shows how the Tabu search algorithm can be used to optimise these decoders further. A new method is then demonstrated using the Tabu search algorithm coupled with lateralisation parameters extracted from a binaural simulation of the Ambisonic system to be optimised (as these are the parameters that the Vienna system is approximating). This method can then be altered to take into account head rotations directly which have been shown as an important psychoacoustic parameter in the localisation of a sound source (Spikofski et al., 2001) and is also shown to be useful in differentiating between decoders optimised using the Tabu search form of the Vienna optimisations as no objective measure had been suggested. Optimisations for both Binaural and Transaural reproductions are then discussed so as to maximise the performance of generic HRTF data (i.e. not individualised) using inverse filtering methods, and a technique is shown that minimises the amount of frequency dependant regularisation needed when calculating cross-talk cancellation filters.