Title: Visual-only person and word recognition : from lip motion dynamics
Author: Brown, Paul C.
ISNI: 0000 0004 6423 8683
Awarding Body: Queen's University Belfast
Current Institution: Queen's University Belfast
Date of Award: 2017
The thesis presents novel contributions to the use of lip motion dynamics as a standalone modality for robust person identification and word recognition. The contributions target three key areas: visual feature extraction, video temporal dynamics, and training. The feature contribution applies the magnitude spectra of the two-dimensional Fast Fourier Transform (Mag-2D-FFT) as a robust visual feature by virtue of its phase invariance. It outperforms benchmark two-dimensional Discrete Cosine Transform (2D-DCT), two-dimensional Discrete Wavelet Transform (2D-DWT) and multi-channel Gabor image-based techniques. It delivers over 3% person identification improvement on the CMU-PIE, VidTIMIT and XM2VTS audio-visual databases, and up to 22% relative improvement in visual-only word recognition on the GRID Corpus. The temporal dynamics contribution uses the Longest Matching Segment (LMS) method to encode the full video dynamics of a training video. Combined with a dynamic version of the novel feature set, it delivers person identification on a Vector Quantization (VQ) model comparable with full face recognition, and over 7% word recognition accuracy on a Hidden Markov Model (HMM). The training contribution combines Gabor feature compression based on a modulus response with a novel formulation of video-based spatial sub-banding using the Posterior Union Model (VPUM) to tackle weakly constrained face recognition of partially occluded and multi-view images as a prelude to a lip-only application.
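The phase invariance claimed for the Mag-2D-FFT feature can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `mag_2d_fft`, the `keep` parameter, the choice of a small low-frequency block, and the random stand-in for a mouth region are all assumptions made for illustration. The sketch only shows that taking the magnitude of the 2D FFT discards phase, so a circular translation of the region leaves the feature unchanged.

```python
import numpy as np

def mag_2d_fft(roi, keep=4):
    """Hypothetical helper: magnitude of the 2D FFT of a grayscale region,
    truncated to a keep x keep block of low-frequency coefficients."""
    spectrum = np.abs(np.fft.fft2(roi))  # the magnitude drops all phase information
    # In NumPy's unshifted layout the DC term sits at [0, 0], so the
    # top-left block holds the lowest non-negative frequencies.
    return spectrum[:keep, :keep].ravel()

# Random stand-in for a 16 x 32 mouth region (assumption: no real data used).
rng = np.random.default_rng(0)
roi = rng.random((16, 32))

# Circularly translate the region by 2 rows and 3 columns.
shifted = np.roll(roi, shift=(2, 3), axis=(0, 1))

feat = mag_2d_fft(roi)
feat_shifted = mag_2d_fft(shifted)

# A circular shift only multiplies each FFT coefficient by a unit-modulus
# phase factor, so the magnitudes agree up to floating-point error.
shift_invariant = np.allclose(feat, feat_shifted)
```

Real lip-region misalignment is not a circular shift, so in practice the invariance would hold only approximately; the sketch isolates the underlying property rather than reproducing the reported results.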
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available