Title:
|
Context-based video coding
|
Although the mainstream of video coding technology continues to improve and iterate
on previous generations, it seems clear that consumer demands on video content will
continue to outstrip the savings made by better codecs. This is, in part, because
mainstream codecs are rooted in an established paradigm that uses residual coding to
maximise PSNR at a given bit rate.
However, it is well known that PSNR as a metric for visual quality does not correlate
well with viewers' subjective opinions. In recent years, research into residual-less
approaches to video coding has become popular. The aim is to achieve the best possible
perceptual quality, irrespective of the PSNR with respect to the original. This allows
the use of more advanced motion models, tuned to specific content within the video.
This thesis proposes such an approach. Specifically, the motion of rigidly textured,
planar regions is modelled using a perspective model, so that the decoder can interpolate
these regions directly from reference frames. Prior knowledge of the scene is employed
to condition the motion estimation process, in the form of keyframe models marked
up under supervision. The motion estimation algorithm is able to compute planar
motion parameters independently of the motion of foreground objects, and can therefore
facilitate the detection of non-conforming regions. These algorithms are integrated
with a host codec, which codes non-planar regions as normal. A subjective trial shows
that this hybrid codec is able to achieve significant bit rate savings over the host codec,
while maintaining quality.
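
As an illustration of the decoder-side interpolation described above, the following is a minimal sketch that assumes the perspective model is expressed as a 3x3 homography mapping pixel coordinates in the current frame to coordinates in a reference frame. The function names, the nested-list frame layout, and the choice of bilinear sampling are assumptions made for this example only and are not taken from the thesis.

```python
import math


def apply_homography(H, x, y):
    """Map (x, y) through a 3x3 perspective matrix H (nested lists, row-major)."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Perspective divide converts homogeneous coordinates back to the image plane.
    return xh / w, yh / w


def bilinear_sample(frame, x, y):
    """Sample a greyscale frame (list of rows) at a fractional position.

    Coordinates are clamped to the frame borders (an assumed border policy).
    """
    height, width = len(frame), len(frame[0])
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * frame[y0][x0] + fx * frame[y0][x1]
    bottom = (1 - fx) * frame[y1][x0] + fx * frame[y1][x1]
    return (1 - fy) * top + fy * bottom


def predict_region(reference, H, pixels):
    """Predict each pixel of a planar region from the reference frame.

    No residual is added: the warped reference is used directly.
    """
    return {
        (x, y): bilinear_sample(reference, *apply_homography(H, x, y))
        for (x, y) in pixels
    }
```

For example, with `H` set to the identity matrix, `predict_region` copies the region unchanged from the reference frame; a non-trivial bottom row in `H` introduces the perspective foreshortening that a purely translational or affine model cannot represent.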
|