Ashish Singh Chandel, Compression Engineer-Video, Dish TV

Video compression has been the object of intensive research over the last thirty years. This article surveys the techniques available for video compression. H.264/AVC delivers substantially better coding performance than its predecessors, and both VCEG and MPEG are now developing next-generation standards.

ITU-T and ISO/IEC are the two main international organizations that define video compression standards. The ISO/IEC MPEG family includes MPEG-1, MPEG-2, MPEG-4, MPEG-4 Part 10 (AVC), MPEG-7, and MPEG-21. The ITU-T VCEG family is the H.26x series, including H.261, H.263, and H.264. Currently, both VCEG and MPEG are launching their next-generation video coding projects.

Basic Techniques

All video-coding standards based on motion prediction and the discrete cosine transform (DCT) produce blocking artifacts at low data rates. Two main directions have been pursued to reduce these artifacts, both using the discrete wavelet transform (DWT):

  • The first is to code the prediction error of the hybrid scheme using the DWT.
  • The second is to use a full 3-D wavelet decomposition.
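The first direction can be illustrated with a minimal sketch: compute the prediction error of a block and apply one level of a 2-D wavelet decomposition to it. The Haar wavelet is used here only because it is the simplest choice; the article does not say which wavelet the cited work uses, and the function names and sample values are assumptions made for illustration.

```python
def prediction_error(current, predicted):
    """Residual between a block and its motion-compensated prediction."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, predicted)]


def haar_1d(values):
    """One level of an averaging Haar transform on an even-length list.

    Returns the low-pass half (pairwise averages) followed by the
    high-pass half (pairwise half-differences).
    """
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    lows = [(a + b) / 2 for a, b in pairs]
    highs = [(a - b) / 2 for a, b in pairs]
    return lows + highs


def haar_2d(block):
    """One decomposition level: transform every row, then every column."""
    row_pass = [haar_1d(row) for row in block]
    col_pass = [haar_1d(list(col)) for col in zip(*row_pass)]
    return [list(row) for row in zip(*col_pass)]  # transpose back


# Made-up 4x4 block and its prediction, for illustration only.
current = [[52, 54, 60, 61], [53, 55, 61, 62],
           [50, 52, 58, 60], [51, 53, 59, 61]]
predicted = [[50, 52, 58, 59], [51, 53, 59, 60],
             [48, 50, 56, 58], [49, 51, 57, 59]]
subbands = haar_2d(prediction_error(current, predicted))
```

In this example the residual is flat, so all of its energy lands in the low-pass (top-left) quadrant and the three detail quadrants are zero. Because a wavelet decomposition is not tied to a fixed block grid the way a block DCT is, it is one way to avoid block-edge artifacts.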

Data rate savings of between 20 and 50 percent have been reported using the test model of H.263+, with corresponding PSNR gains of between 0.8 and 3 dB. MPEG-4 combines frame-based and segmentation-based approaches and mixes natural and synthetic content, which allows efficient coding as well as content access and manipulation. During the last ten years, the hybrid scheme combining motion-compensated prediction and the DCT has represented the state of the art in video coding. The natural video part of MPEG-4 is also based on motion-compensated prediction followed by the discrete cosine transform; what it adds is the coding of object shapes.
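Since the gains above are quoted in PSNR, a short sketch of how that figure is usually computed may help. It uses the standard definition, 10·log10(peak² / MSE); the function name and the sample frames are illustrative assumptions, not taken from any cited test model.

```python
import math


def psnr(original, decoded, peak=255):
    """PSNR in dB between two equally sized frames given as lists of rows."""
    squared_error = 0
    count = 0
    for orig_row, dec_row in zip(original, decoded):
        for o, d in zip(orig_row, dec_row):
            squared_error += (o - d) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


# Made-up 2x2 frames: two samples are off by one after decoding.
original_frame = [[100, 100], [100, 100]]
decoded_frame = [[101, 99], [100, 100]]
quality_db = psnr(original_frame, decoded_frame)
```

Because the scale is logarithmic, a gain of 3 dB corresponds to roughly halving the mean squared error.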

Implementation Strategies

Video compression can be implemented in a number of ways, which fall broadly into two categories: hardware-based implementation and software-based implementation.

Hardware-based approach. The most common approach is to design a dedicated VLSI circuit for video compression. Such a design can include function-specific hardware for block matching, DCT, variable-length coding (VLC), and the associated inverse operations. Because the circuit exploits the data flow of the algorithm and uses special control, its processing capability can be about ten times that of a conventional microprocessor.
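Block matching is a good example of why these functions suit dedicated hardware: the computation is a regular nest of loops over pixel differences. The following is a minimal software sketch of full-search block matching with the sum of absolute differences (SAD) as the cost, under the assumption of integer-pel motion and a small square search window. The function names and test frames are hypothetical.

```python
def sad(frame_a, ax, ay, frame_b, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def full_search(current, reference, bx, by, size, search_range):
    """Best integer motion vector for the block at (bx, by) in `current`.

    Returns ((mvx, mvy), cost); candidates outside the reference are skipped.
    """
    height, width = len(reference), len(reference[0])
    best_mv, best_cost = (0, 0), None
    for mvy in range(-search_range, search_range + 1):
        for mvx in range(-search_range, search_range + 1):
            rx, ry = bx + mvx, by + mvy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(current, bx, by, reference, rx, ry, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mvx, mvy), cost
    return best_mv, best_cost


# Synthetic frames: `current` is `reference` displaced by 2 pixels in x, 1 in y.
reference = [[x * x + 10 * y * y for x in range(8)] for y in range(8)]
current = [[reference[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)]
           for y in range(8)]
motion_vector, cost = full_search(current, reference, bx=2, by=2,
                                  size=2, search_range=2)
```

Here the search recovers the displacement (2, 1) with zero cost. Every candidate position is independent of the others, which is the kind of data flow a VLSI design can evaluate in parallel.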

Software-based approach. Software-based approaches are becoming more popular because the performance of general-purpose processors has been increasing rapidly. The inherently modular nature of video compression algorithms makes it possible to experiment with, and so improve, individual parts of the encoder independently, including motion estimation (ME), the DCT algorithm, and rate-controlled coding. The major advantage of the software-based approach is that new research ideas and algorithms can be incorporated into the encoding process, either to achieve better picture quality at a given bit rate or to reduce the bit rate for a desired level of picture quality. Real-time performance for high-quality profiles is still quite difficult, but encoding for the simple profiles of various standards can now be done on a single processor.

It is important to understand how image and video coding have developed, which can be expressed in terms of generation-based coding approaches. The table shows this classification. It can be seen from this classification that the coding community has reached third-generation video coding techniques.

Existing New Technology


Advances in video-coding techniques have been contributed by various groups and organizations. To provide a software platform on which to collect and evaluate these new techniques, the Key Technical Area (KTA) platform was developed on the basis of the JM11 reference software, and new coding tools are added to it frequently. The major new coding tools added to the KTA platform can be summarized as follows:

Intra-prediction. H.264 intra-prediction is enhanced with additional bi-directional intra-prediction (BIP) modes, in which BIP combines the prediction blocks from two prediction modes using a weighting matrix. Furthermore, a mode-dependent directional transform (MDDT), using transforms derived from the Karhunen-Loève transform (KLT), is applied to capture the remaining energy in the residual block.
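The idea of combining two prediction blocks through a weighting matrix can be sketched as follows. This is only an illustration of the principle: the weights, the 6-bit fixed-point scale, and the function names are assumptions, not the values or interfaces used in KTA. The two input predictors are the familiar vertical and horizontal intra modes.

```python
def vertical_prediction(top_row, size):
    """Each row repeats the reconstructed pixels above the block."""
    return [list(top_row[:size]) for _ in range(size)]


def horizontal_prediction(left_col, size):
    """Each row repeats the reconstructed pixel to its left."""
    return [[left_col[y]] * size for y in range(size)]


def blend_predictions(pred_a, pred_b, weights, shift=6):
    """Per-pixel weighted blend of two prediction blocks.

    weights[y][x] is the share of pred_a out of 2**shift (hypothetical
    fixed-point scale); pred_b receives the remainder.
    """
    total = 1 << shift
    half = total >> 1  # rounding offset
    return [[(w * a + (total - w) * b + half) >> shift
             for a, b, w in zip(row_a, row_b, row_w)]
            for row_a, row_b, row_w in zip(pred_a, pred_b, weights)]


# Made-up neighbours and a flat 50/50 weighting matrix, for illustration only.
top = [100, 100, 100, 100]
left = [50, 50, 50, 50]
weights = [[32] * 4 for _ in range(4)]
blended = blend_predictions(vertical_prediction(top, 4),
                            horizontal_prediction(left, 4), weights)
```

With equal weights every pixel becomes the average of the two predictors (75 here). A position-dependent matrix would instead let each pixel lean toward whichever predictor's reference samples are closer to it.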

Inter-prediction. To further improve inter-prediction efficiency, finer fractional motion prediction and better motion vector prediction have been proposed. One suggestion is to increase the resolution of the displacement vector from 1/4-pel to 1/8-pel to obtain more efficient motion-compensated prediction. A competing framework for better motion vector coding, which includes SKIP mode, has been proposed; it captures both spatial and temporal redundancies in motion vector fields. Moreover, extending the macroblock size up to 64x64 is suggested, so that the new partition sizes 64x64, 64x32, 32x64, 32x32, 32x16, and 16x32 can be used. Instead of the fixed interpolation filter of H.264/AVC, adaptive interpolation filters (AIF) are proposed, such as 2D AIF, separable AIF, directional AIF, enhanced AIF, and enhanced directional AIF.
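To make the fixed interpolation filter concrete, here is a one-dimensional sketch of H.264/AVC luma interpolation: the 6-tap filter (1, -5, 20, 20, -5, 1)/32 produces half-pel samples, and quarter-pel samples are rounded averages of neighbouring integer and half-pel samples. The `taps` parameter is a hypothetical hook showing where an adaptive filter would substitute its own coefficients (assumed here to sum to 32); it is not an actual KTA interface, and 1/8-pel positions are not covered.

```python
H264_HALF_PEL_TAPS = (1, -5, 20, 20, -5, 1)  # coefficients sum to 32


def clip_pixel(value, max_value=255):
    """Clamp a filtered value to the valid sample range."""
    return max(0, min(max_value, value))


def half_pel(samples, i, taps=H264_HALF_PEL_TAPS):
    """Half-sample value between samples[i] and samples[i + 1]."""
    last = len(samples) - 1
    acc = 0
    for k, tap in enumerate(taps):
        index = min(max(i - 2 + k, 0), last)  # repeat edge samples
        acc += tap * samples[index]
    return clip_pixel((acc + 16) >> 5)  # divide by 32 with rounding


def quarter_pel(samples, i, taps=H264_HALF_PEL_TAPS):
    """Quarter-sample value between samples[i] and the next half-pel position."""
    return (samples[i] + half_pel(samples, i, taps) + 1) >> 1


row = [0, 10, 20, 30, 40, 50]   # made-up luma ramp
h = half_pel(row, 2)            # between 20 and 30
q = quarter_pel(row, 2)         # between 20 and the half-pel sample
```

On this ramp the half-pel sample is 25 and the quarter-pel sample is 23. An adaptive interpolation filter keeps the same structure but estimates the tap values per frame instead of using fixed ones.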

Quantization. To achieve better quantization, optimized quantization decisions at the macroblock level and at different coefficient positions are proposed. Rate-distortion optimized quantization (RDOQ) was added to the JM reference software; it performs optimal quantization on a macroblock and does not require any change to the H.264/AVC decoder syntax. More recently, an improved, more efficient RDOQ implementation has been given. In adaptive quantization matrix selection (AQMS), different coefficient positions can have different quantization steps, and a method for deciding the best quantization matrix index is proposed so that the quantization matrix is optimized at the macroblock level.
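The principle behind RDOQ can be sketched in a few lines: rather than rounding each coefficient to the nearest level, the encoder picks the level that minimizes the Lagrangian cost J = D + λR. The sketch below is a deliberate simplification. It treats coefficients independently and uses a made-up rate model (`toy_rate_bits`), whereas a real encoder estimates rate from its entropy coder. All names are hypothetical.

```python
def toy_rate_bits(level):
    """Hypothetical bit cost of a level magnitude (not a real entropy coder)."""
    return 1 if level == 0 else 2 * level.bit_length() + 1


def quantize_rdo(coeffs, step, lam, rate_bits=toy_rate_bits):
    """Choose, per coefficient, the level minimizing distortion + lam * rate.

    Candidates are zero and the two levels bracketing |coeff| / step.
    Ties go to the smaller level.
    """
    levels = []
    for c in coeffs:
        magnitude = abs(c)
        floor_level = int(magnitude / step)
        candidates = sorted({0, floor_level, floor_level + 1})
        best = min(candidates,
                   key=lambda lvl: (magnitude - lvl * step) ** 2
                   + lam * rate_bits(lvl))
        levels.append(-best if c < 0 else best)
    return levels


coeffs = [37, -4, 12]           # made-up transform coefficients
plain = quantize_rdo(coeffs, step=10, lam=0)    # pure minimum distortion
rdo = quantize_rdo(coeffs, step=10, lam=30)     # rate-aware decision
```

With λ = 0 the first coefficient is quantized to level 4; with λ = 30 it drops to level 3 because the cheaper level outweighs the extra distortion. Only the encoder's choice of levels changes, which is why no change to the decoder syntax is needed.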

Transform. For motion partitions bigger than 16x16, a 16x16 transform is suggested in addition to the 4x4 and 8x8 transforms. Moreover, transform coding is not always necessary: it is proposed that, for each block of the prediction error, the encoder choose adaptively between spatial-domain coding and standardized transform coding.

In-loop filter. In KTA, besides the de-blocking filter, an additional adaptive loop filter (ALF) is added to improve coding efficiency by applying filters to the de-blocked picture. Two ALF techniques have been adopted so far: the quad-tree based adaptive loop filter (QALF) and the block-based adaptive loop filter (BALF).

Internal bit-depth increase. By using 12 bits of internal bit depth for 8-bit sources, so that the internal bit depth is greater than the external bit depth of the video codec, coding efficiency can be further improved.

Many contributions have not yet been added to KTA. For example, three methods have been proposed that use decoder-side motion estimation (DSME) for B-picture motion vector decision, which improves coding efficiency by saving bits on B-picture motion vector coding. Some further new techniques are under investigation and will be presented in the responses to the call for proposals.
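Returning to the internal bit-depth increase described above, the conversion between external and internal precision can be sketched as a left shift on input and a rounded right shift on output. The function names are hypothetical and the sketch assumes the internal depth is strictly greater than the source depth; it shows only the conversion, not the codec stages that run in between.

```python
def to_internal(sample, source_bits=8, internal_bits=12):
    """Scale a source sample up to the internal bit depth."""
    return sample << (internal_bits - source_bits)


def to_output(value, source_bits=8, internal_bits=12):
    """Round an internal value back to the source bit depth and clip it."""
    shift = internal_bits - source_bits
    rounded = (value + (1 << (shift - 1))) >> shift
    return max(0, min((1 << source_bits) - 1, rounded))


internal = to_internal(128)     # 8-bit sample carried at 12-bit precision
restored = to_output(internal)  # back to the 8-bit domain
```

The four extra bits let prediction, interpolation, and filtering stages keep fractional precision that would otherwise be lost to rounding at each step. For example, an internal value of 2056 represents 128.5 in 8-bit units and is rounded only once, at output.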

This article has reviewed the basic techniques available for video compression, together with the latest of them, H.264/AVC. As we have seen, H.264/AVC was developed jointly by ISO/IEC (MPEG) and ITU-T (VCEG). It offers improvements in coding efficiency as well as in flexibility, robustness, and range of application domains. New developments in video compression will no doubt continue as requirements and applications evolve. A review of the video-compression literature suggests that there is still considerable room for improvement.