Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.426027
Title: Hierarchical decision making for semantic analysis and summarisation of sports videos
Author: Jaser, Edward
ISNI: 0000 0001 3589 3855
Awarding Body: University of Surrey
Current Institution: University of Surrey
Date of Award: 2005
Abstract:
Video comprises several modalities and, as the richest of all types of media, is a very important tool and a powerful medium of communication. It is used extensively for presenting information and for expressing and communicating ideas. A huge amount of video material is generated every day, covering a wide variety of subjects. Without efficient and flexible tools, the usability of this material is quite restricted. In recent years there has been much research addressing the problem of automatic video analysis and retrieval. In this thesis, the problem of automatic video annotation is considered. We develop a multistage decision-making system tailored to the domain of sports videos. The first stage is concerned with reaching a compact yet efficient representation of raw video material. One popular approach to this problem is a representation in terms of low-level features. A major limitation of that approach is that the stored indexing features are too low-level; they relate directly to the properties of the data. In this stage we opt instead for a representation in terms of cues. Cues are the result of processing that associates the feature measurements with real-world objects or events. An additional advantage of this approach is that the cues from different types of features are presented in a homogeneous way. The second stage of the system is concerned with the classification of video shots. The set of classes considered relates to characteristic views that occur frequently in sports videos. The decision-making mechanism in this stage is a boosted decision tree, which generates hypotheses concerning the semantics of the sports video content given the cue annotation. In contrast to many shot classifiers reported in the literature, the proposed classifier decomposes the global, complex classification problem into a number of simpler tasks. It has the flexibility to choose different subsets of features (cues in our case) to solve those tasks, thus eliminating unnecessary computations.
The final stage of the system is designed to correct misclassifications made in earlier stages by exploiting temporal context. Misclassification can be due to errors in cue extraction or in the shot classifier, or it can be the consequence of a genuine ambiguity, as the same visual content may be attributed to different sport categories depending on the context. The functionality of this stage is realised by a Hidden Markov Model system, which bridges the gap between the semantic content categorisation defined by the user and the actual visual content categories. This stage also addresses the grouping of shots into scenes. Experimental results on a database comprising video material from six different events demonstrate that the proposed system works well.
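The temporal-context idea in the final stage can be illustrated with a minimal sketch of Viterbi decoding over a sequence of per-shot labels. This is not the model from the thesis: the two semantic categories, the three visual shot classes, and every probability below are hypothetical values chosen only to show how a Hidden Markov Model can override an isolated shot-level error.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # Work in log space so long shot sequences do not underflow.
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
               for s in states}]
    back = [{}]
    for obs in observations[1:]:
        prev = scores[-1]
        cur, ptr = {}, {}
        for s in states:
            # Best predecessor for state s at this shot.
            best = max(states, key=lambda p: prev[p] + math.log(trans_p[p][s]))
            cur[s] = prev[best] + math.log(trans_p[best][s]) + math.log(emit_p[s][obs])
            ptr[s] = best
        scores.append(cur)
        back.append(ptr)
    # Trace back from the best final state.
    state = max(states, key=lambda s: scores[-1][s])
    path = [state]
    for ptr in reversed(back[1:]):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical semantic categories (hidden states) and visual shot classes.
states = ["tennis", "athletics"]
start_p = {"tennis": 0.5, "athletics": 0.5}
# Assumed "sticky" transitions: the sport rarely changes between shots.
trans_p = {
    "tennis": {"tennis": 0.9, "athletics": 0.1},
    "athletics": {"tennis": 0.1, "athletics": 0.9},
}
# Assumed emission probabilities; close_up is ambiguous between sports.
emit_p = {
    "tennis": {"court_view": 0.70, "track_view": 0.10, "close_up": 0.20},
    "athletics": {"court_view": 0.05, "track_view": 0.65, "close_up": 0.30},
}

# Shot-classifier output with one isolated, probably wrong, track_view.
shots = ["court_view", "close_up", "track_view", "court_view", "court_view"]
smoothed = viterbi(shots, states, start_p, trans_p, emit_p)
print(smoothed)
```

With these made-up numbers the isolated track_view shot is absorbed into the surrounding tennis segment, and the ambiguous close_up is resolved by its neighbours. Runs of identical decoded states could also serve as a crude grouping of shots into scenes, though the thesis may define scenes differently.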
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.426027
DOI: Not available