Local and global models for articulated motion analysis
Vision is likely the most important of the senses humans use to understand their environment, yet computer systems remain sorely lacking in this respect. The range of potential applications for visually capable computer systems is vast; this thesis focuses on the field of motion capture, in particular the problems encountered when analysing the motion of articulated (jointed) targets, such as people. Joint articulation greatly increases the complexity of a target object and raises the incidence of self-occlusion (one body part obscuring another). These problems are compounded in typical outdoor scenes by the clutter and noise generated by other objects.

This thesis presents a model-based approach to the automated extraction of walking people from video data, under both indoor and outdoor capture conditions. Local and global modelling strategies are employed in an iterative process, similar to the Generalised Expectation-Maximisation algorithm. Prior knowledge of human shape, gait motion and self-occlusion is used to guide the extraction process, and the extracted shape and motion information is used to construct a gait signature sufficient for recognition purposes.

Results are presented demonstrating the success of this approach on the Southampton Gait Database, comprising 4820 sequences from 115 subjects. A recognition rate of 98.6% is achieved on clean indoor data, comparing favourably with other published approaches; this falls to 87.1% under the more difficult outdoor capture conditions. Additional analyses examine the discriminative potential of the model features, showing that the majority of discriminative potential lies in body shape features and gait frequency, although motion dynamics also make a significant contribution.