Context adaptive video preprocessing

This work is only partially presented below, as it is being prepared for publication.

Principle

The growing availability of high-resolution video streams (from HD to 4K and beyond) generates substantial traffic on various networks. Broadcasters rely on modern hybrid coders such as H.264, HEVC or VP9 to efficiently store and transmit HD+ sequences [1, 2]. Two different constraints can justify the need for high compression ratios. First, some applications such as video surveillance generate a tremendous amount of video content, so improving its compression directly reduces storage costs. Second, the video service can be constrained by the transmission channel: the bandwidth available for setting up a video communication is sometimes known in advance. In the case we consider, i.e. radio-supported video communications, the latter point is critical (with transmission capabilities ranging from 9 kb/s to 100 kb/s), which has motivated our interest in the ultra-low bitrate domain.

Sub-sampling based video coding has been shown to provide interesting results with modern hybrid coding schemes, especially for low-bitrate compression of high-resolution sequences [3]. Some approaches have been embedded inside the CODEC (coding/decoding) architecture [4, 5, 6, 7], which requires modifying the underlying coding standard and may increase the encoding complexity [8]. Consequently, decoupling resizing from coding has also been proposed [9, 10, 11] (see figure below). For example, the authors of [11] achieve promising results in the low-bitrate domain with HEVC thanks to a novel, efficient and low-cost up-sampling scheme.

Generic framework of sub-sampling based video coding.
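The decoupled scheme described above (resize, code with an unmodified coder, resize back) can be sketched in a few lines of Python. This is only an illustrative sketch under simplifying assumptions: frames are plain grayscale arrays, down-sampling is a box average, up-sampling is nearest-neighbour, and `toy_codec` is a hypothetical placeholder standing in for a real encode/decode round trip. None of these choices come from the cited works, which rely on more elaborate filters and super-resolution.

```python
from typing import Callable, List

Frame = List[List[float]]  # grayscale frame stored as rows of pixel values


def downsample(frame: Frame, factor: int) -> Frame:
    """Box-average each factor x factor block (dimensions assumed divisible)."""
    h, w = len(frame), len(frame[0])
    out = []
    for y in range(0, h, factor):
        row = []
        for x in range(0, w, factor):
            block = [frame[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def upsample(frame: Frame, factor: int) -> Frame:
    """Nearest-neighbour up-sampling: repeat each pixel factor times per axis."""
    out = []
    for row in frame:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def toy_codec(frame: Frame) -> Frame:
    """Hypothetical stand-in for a lossy coder: quantise to multiples of 8."""
    return [[round(v / 8) * 8 for v in row] for row in frame]


def subsampled_coding(frame: Frame, factor: int,
                      codec: Callable[[Frame], Frame]) -> Frame:
    """Down-sample, run the codec round trip unchanged, then up-sample."""
    small = downsample(frame, factor)
    decoded = codec(small)  # the coder itself is never modified
    return upsample(decoded, factor)
```

The point of the design is that `codec` is an opaque callable: any standard coder could be plugged in without touching its internals, which is what distinguishes this family of methods from the in-loop approaches.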

In this work, we propose a novel CODEC-independent preprocessing step that combines the advantages of re-sampling and background simplification to achieve ROI-based (region of interest) video compression. Our approach jointly uses adaptive re-sampling [12] and a high-performance texture removal filter [13]. In doing so, we adaptively reduce the spatial complexity of the input sequence so as to preserve background intelligibility while keeping maximum quality on salient objects. The best efficiency is obtained with high-resolution sequences, but the proposed solution also exhibits promising results for moderate- to low-resolution streams.
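To make the idea of background simplification concrete, here is a minimal sketch of mask-driven preprocessing. It is not the method of this work: the texture removal filter of [13] is replaced by a plain box blur, adaptive re-sampling [12] is left out entirely, and the names `box_blur` and `simplify_background` are hypothetical. It only illustrates the principle of leaving ROI pixels untouched while reducing the spatial complexity of everything else.

```python
from typing import List

Frame = List[List[float]]  # grayscale frame stored as rows of pixel values
Mask = List[List[bool]]    # True where the pixel belongs to a ROI


def box_blur(frame: Frame, radius: int) -> Frame:
    """Mean filter with a window clamped at the borders.

    Assumption: a crude substitute for a real texture removal filter.
    """
    h, w = len(frame), len(frame[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [frame[j][i] for j in ys for i in xs]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def simplify_background(frame: Frame, roi: Mask, radius: int = 1) -> Frame:
    """Keep ROI pixels as they are; replace background pixels by blurred ones."""
    blurred = box_blur(frame, radius)
    return [
        [frame[y][x] if roi[y][x] else blurred[y][x]
         for x in range(len(frame[0]))]
        for y in range(len(frame))
    ]
```

Because the output is still an ordinary frame, it can be fed to any encoder, which is what makes such a preprocessing step CODEC-independent.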

Example results

Configuration: HEVC (HM 15.0) in Low Delay mode, GOP size of 16. All sequences are at 1 fps.

1. Coastguard – 352×288 – 10 kb/s.

2. Ducks – 1920×1080 – 100 kb/s.
NB: The leftmost duck is not defined as an ROI in the saliency mask used here.

3. Parkrun – 1920×1080 – 200 kb/s.
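To give a sense of how tight these budgets are, the short calculation below converts each target bitrate into an average number of coded bits per pixel. It assumes 1 kb = 1000 bits and uses the 1 fps frame rate stated in the configuration; the helper name `bits_per_pixel` is ours.

```python
def bits_per_pixel(bitrate_kbps: float, width: int, height: int,
                   fps: float = 1.0) -> float:
    """Average coded bits available per pixel, assuming 1 kb = 1000 bits."""
    bits_per_frame = bitrate_kbps * 1000 / fps
    return bits_per_frame / (width * height)


# Settings taken from the three examples listed above (all at 1 fps).
examples = {
    "Coastguard": (10, 352, 288),
    "Ducks": (100, 1920, 1080),
    "Parkrun": (200, 1920, 1080),
}

for name, (kbps, w, h) in examples.items():
    print(f"{name}: {bits_per_pixel(kbps, w, h):.3f} bits per pixel")
```

All three settings come out between roughly 0.05 and 0.1 bit per pixel, against 12 bits per pixel for an uncompressed 8-bit 4:2:0 frame.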

References

[1]: T. Wiegand and G. Sullivan, Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification (ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC), March 2003.
[2]: J. Ohm and G.J. Sullivan, High efficiency video coding: the next frontier in video compression [Standards in a Nutshell], IEEE Signal Processing Magazine, vol. 30, no. 1, pp. 152-158, Jan. 2013.
[3]: A.M. Bruckstein, M. Elad, and R. Kimmel, Down-scaling for better transform compression, IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1132-1144, Sept 2003.
[4]: Z. Hu, H. Li, and W. Li, An adaptive downsampling based video coding with hybrid super-resolution method, 2012 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 504-507, May 2012.
[5]: M. Shen, P. Xue, and C. Wang, Down-sampling based video coding using super-resolution technique, IEEE Transactions on Circuits and Systems for Video Technology, vol. 21, no. 6, pp. 755-765, June 2011.
[6]: V.-A. Nguyen, Y. Peng Tan, and W. Lin, Adaptive downsampling/upsampling for better video compression at low bit rate, IEEE International Symposium on Circuits and Systems, pp. 1624-1627, May 2008.
[7]: R. Molina, A. K. Katsaggelos, L. D. Alvarez, and J. Mateos, Toward a new video compression scheme using super-resolution (invited paper) [6077-25], 2006.
[8]: D. Barreto, L.D. Alvarez, R. Molina, A.K. Katsaggelos, and G.M. Callic, Region-based super-resolution for compression, Multidimensional Systems and Signal Processing, vol. 18, no. 2-3, pp. 59-81, 2007.
[9]: Y. Dar and A. M. Bruckstein, Improving low bitrate video coding using spatio-temporal down-scaling, CoRR, vol. abs/1404.4026, 2014.
[10]: Ren-Jie Wang, Ming-Chen Chien, and Pao-Chi Chang, Adaptive down-sampling video coding, 2010.
[11]: G. Georgis, G. Lentaris, and D. Reisis, Reduced complexity super-resolution for low-bitrate video compression, IEEE Transactions on Circuits and Systems for Video Technology, no. 99, 2015.
[12]: F. Zund, Y. Pritch, A. Sorkine Hornung, S. Mangold, and T. Gross, Content-aware compression using saliency-driven image retargeting, IEEE International Conference on Image Processing (ICIP), pp. 1845-1849, Sept 2013.
[13]: Li Xu, Qiong Yan, Yang Xia, and Jiaya Jia, Structure extraction from texture via relative total variation, ACM Trans. Graph., vol. 31, no. 6, pp. 139:1-139:10, Nov. 2012.
