Title: The acquisition of coarse gaze estimates in visual surveillance
Author: Benfold, Ben
ISNI: 0000 0004 2735 1345
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2011
Availability of Full Text: Full text unavailable from EThOS.
This thesis describes the development of methods for automatically obtaining coarse gaze direction estimates for pedestrians in surveillance video. Gaze direction estimates are useful in surveillance as an indicator of an individual's intentions and of their interest in their surroundings and in other people. The overall task is broken down into two problems. The first is tracking large numbers of pedestrians in low-resolution video, which is required to identify the head regions within video frames. The second is processing the extracted head regions to estimate the direction in which each person is facing, as a coarse estimate of their gaze direction.

The first approach to head tracking combines image measurements from HOG head detections and KLT corner tracking using a Kalman filter, and can track the heads of many pedestrians simultaneously to output head regions with pixel-level accuracy. The second approach uses Markov Chain Monte Carlo Data Association (MCMCDA) within a temporal sliding window to provide similarly accurate head regions, but with improved speed and robustness. The improved system accurately tracks the heads of twenty pedestrians in 1920×1080 video in real time and can track through total occlusions for short time periods.

The approaches to gaze direction estimation all make use of randomised decision tree classifiers. The first develops classifiers for low-resolution head images that are invariant to hair and skin colours, using branch decisions based on abstract labels rather than direct image measurements. The second approach addresses higher-resolution images using HOG descriptors and novel Colour Triplet Comparison (CTC) based branches. The final approach infers custom appearance models for individual scenes using weakly supervised learning over large datasets of approximately 500,000 images. A Conditional Random Field (CRF) models interactions between appearance information and walking directions to estimate gaze directions for head image sequences.
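As a purely illustrative note on the first head-tracking approach, the sketch below shows one common way a Kalman filter can fuse an intermittent position measurement (standing in for a head detection) with a per-frame velocity measurement (standing in for corner-tracking motion). This record does not give the thesis's state vector, noise settings, or measurement models, so the constant-velocity model, the class `AxisKalman`, the function `fuse_track`, and every numeric value here are assumptions rather than the author's method.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class AxisKalman:
    """Constant-velocity Kalman filter for one image axis (hypothetical sketch)."""
    pos: float
    vel: float = 0.0
    # Entries of the symmetric 2x2 covariance [[p_pp, p_pv], [p_pv, p_vv]].
    p_pp: float = 100.0
    p_pv: float = 0.0
    p_vv: float = 100.0

    def predict(self, dt: float = 1.0, q: float = 0.5) -> None:
        """Advance the state by dt; q is an assumed velocity process noise."""
        self.pos += self.vel * dt
        # P <- F P F^T + Q, with F = [[1, dt], [0, 1]] and Q = diag(0, q).
        p_pp = self.p_pp + 2.0 * dt * self.p_pv + dt * dt * self.p_vv
        p_pv = self.p_pv + dt * self.p_vv
        self.p_pp, self.p_pv, self.p_vv = p_pp, p_pv, self.p_vv + q

    def update_position(self, z: float, r: float = 4.0) -> None:
        """Fuse a position measurement (e.g. a detection centre), H = [1, 0]."""
        s = self.p_pp + r                      # innovation variance
        k_p, k_v = self.p_pp / s, self.p_pv / s  # Kalman gain
        y = z - self.pos                       # innovation
        self.pos += k_p * y
        self.vel += k_v * y
        p_pp, p_pv, p_vv = self.p_pp, self.p_pv, self.p_vv
        self.p_pp = (1.0 - k_p) * p_pp
        self.p_pv = (1.0 - k_p) * p_pv
        self.p_vv = p_vv - k_v * p_pv

    def update_velocity(self, z: float, r: float = 0.25) -> None:
        """Fuse a velocity measurement (e.g. mean corner motion), H = [0, 1]."""
        s = self.p_vv + r
        k_p, k_v = self.p_pv / s, self.p_vv / s
        y = z - self.vel
        self.pos += k_p * y
        self.vel += k_v * y
        p_pp, p_pv, p_vv = self.p_pp, self.p_pv, self.p_vv
        self.p_pp = p_pp - k_p * p_pv
        self.p_pv = (1.0 - k_v) * p_pv
        self.p_vv = (1.0 - k_v) * p_vv


def fuse_track(start: float,
               detections: Sequence[Optional[float]],
               velocities: Sequence[float]) -> AxisKalman:
    """Run the filter over a sequence; a detection of None means no detection fired."""
    kf = AxisKalman(pos=start)
    for det, vel in zip(detections, velocities):
        kf.predict()
        if det is not None:
            kf.update_position(det)
        kf.update_velocity(vel)
    return kf
```

For example, a head moving at 2 pixels per frame from position 100 could be tracked with `fuse_track(100.0, [100.0 + 2 * t for t in range(1, 21)], [2.0] * 20)`. A real tracker would run one such filter per axis (or a joint 2D state) per pedestrian and would also need data association, which this sketch omits.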
Supervisor: Reid, Ian D.
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Information engineering ; Image understanding ; Applications and algorithms ; computer vision ; machine vision ; gaze ; head pose ; surveillance ; visual surveillance ; video surveillance ; video analysis ; attention ; randomised tree ; randomized tree ; randomised fern ; randomized fern ; tracking ; head tracking ; thesis ; predicate fern