Title: Highly scalable 2D model-based video coding
Author: Hu, Mingyou
Awarding Body: University of Surrey
Current Institution: University of Surrey
Date of Award: 2005
With the rapid merging of the computer, communications, and entertainment industries, we can expect growing heterogeneity (in channel bandwidth, receiver capacity, etc.) in future digital video coding applications. Furthermore, new functions are emerging, such as object manipulation, which video coding techniques should support. The traditional video coding approach is constrained and inefficient in dealing with heterogeneity and user interaction. Scalable coding, which allows partial decoding at a variety of resolution, temporal, quality, and object levels from a single compressed codestream, is widely considered a promising technology for efficient signal representation and transmission in a heterogeneous environment. However, although several scalable algorithms have been proposed in the literature and in international standards over the last decade, further research is necessary to improve the compression performance of scalable video coding. This thesis investigates a scalable 2D model-based video coding method that combines efficient video compression with excellent scalability, in order to satisfy these new requirements. It first examines the main model-based video coding techniques and scalable video coding methods. The parametric video models that describe the real world and the image generation process are also briefly described. Next, video segmentation algorithms are investigated to partition each video frame into semantic video objects. In the first frame, texture information and the motion from the first several frames are used to extract the semantic foreground objects. For some sequences, user interaction is required to obtain semantic objects. In later frames, the proposed complexity-scalable contour-tracking algorithm is used to segment each frame. After that, each object is progressively approximated using a three-layer 2D mesh model.
To represent the motion of the human face more precisely, face detection and modelling are also investigated. This technique, in which the human face is modelled separately, is shown to improve the representation of object motion. Scalable model compression is also outlined in this thesis. The object model is represented in two parts, the object shape and the interior object model, which are compressed separately. A scalable contour approximation algorithm is proposed. Both intra and predictive scalable shape-coding algorithms are investigated and proposed to code the object shape progressively. The encoded coarser layers are used to improve the coding efficiency of the current layer. The effectiveness of these algorithms is demonstrated through the results of extensive experiments. We also investigate the scalable texture coding of video objects. An improved shape-adaptive SPECK algorithm is employed in intra-texture coding and is also used for residual texture coding after motion-compensated temporal filtering. During motion-compensated temporal filtering, the scalable mesh object model is used, and scalable motion vector coding is achieved using a CABAC codec. A hierarchically structured, rate-distortion-optimised bitstream is created to facilitate efficient bit truncation and bit allocation among video frames and video objects. The coding system can encode and decode each video object independently and generate a separate bitstream for each object. As our experiments show, this high degree of coding scalability is achieved without the significant cost in compression performance commonly experienced in most scalable coding systems.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available