A large-scale analysis of the acoustic-phonetic markers of speaker sex
This thesis lies within the field of speaker characterisation through the acoustic-phonetic analysis of speech. It consists of two parts: (1) an investigation of the acoustic-phonetic differences between the speech of women and men; and (2) an examination of the practicalities of automating that investigation in order to analyse a large speech database. The acoustic-phonetic markers of speaker sex examined here are the fundamental frequency, the formant frequencies, and the relative amplitude of the first harmonic. The aims of the investigation were, firstly, to establish to what extent these markers differentiate between the sexes, and secondly, to examine the extent of between- and within-speaker deviation from the female and male norms, that is, the average values for each sex. These aims were pursued through an automated acoustic-phonetic analysis of the TIMIT database, involving a data set of almost 16,000 segments of speech. An automated method was developed to enable the signal processing and statistical analysis of a data set of this size. The problems encountered in analysing a highly variable data source (i.e. the acoustic speech waveform) are also addressed.
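The second aim, measuring between- and within-speaker deviation from the female and male norms, can be illustrated with a minimal sketch. It is not the method used in the thesis: the function name `sex_norms_and_deviation`, the input layout of `(speaker_id, sex, value)` tuples, and the choice to define each norm as the mean of the speaker means are all assumptions made for illustration. The value could be any one of the markers named above, for example a fundamental frequency measurement in Hz for one speech segment.

```python
from collections import defaultdict
from statistics import mean, pstdev


def sex_norms_and_deviation(segments):
    """Summarise one acoustic marker per sex (hypothetical helper).

    segments: iterable of (speaker_id, sex, value) tuples, one per
    measured speech segment. Returns, for each sex, the norm (mean of
    the speaker means), the between-speaker deviation (population SD of
    the speaker means) and the within-speaker deviation (mean of the
    per-speaker population SDs).
    """
    values_by_speaker = defaultdict(list)
    sex_of_speaker = {}
    for speaker_id, sex, value in segments:
        values_by_speaker[speaker_id].append(value)
        sex_of_speaker[speaker_id] = sex

    # Collect each speaker's mean and spread under that speaker's sex.
    speaker_means = defaultdict(list)
    speaker_sds = defaultdict(list)
    for speaker_id, values in values_by_speaker.items():
        sex = sex_of_speaker[speaker_id]
        speaker_means[sex].append(mean(values))
        speaker_sds[sex].append(pstdev(values))

    summary = {}
    for sex, means in speaker_means.items():
        summary[sex] = {
            "norm": mean(means),
            "between_speaker_sd": pstdev(means),
            "within_speaker_sd": mean(speaker_sds[sex]),
        }
    return summary
```

Averaging over speaker means rather than over all segments keeps a speaker with many segments from dominating the norm; pooling all segments instead would be an equally defensible choice, and the abstract does not say which was used.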