Reference frame list optimization algorithm in video coding by quality enhancement of the nearest picture

Junyan HUO; Ruipeng QIU; Yanzhuo MA; Fuzheng YANG

doi:10.11959/j.issn.1000-436x.2022185

Research article | Last updated: 2024-06-05

- Title: Reference frame list optimization algorithm in video coding by quality enhancement of the nearest picture (original Chinese title: 基于最邻近帧质量增强的视频编码参考帧列表优化算法)
- Journal on Communications (通信学报), 2022, Vol. 43, No. 11, pp. 136-147
- Affiliation: State Key Laboratory of Integrated Services Networks (ISN), Xidian University, Xi'an 710071, Shaanxi, China
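- About the authors:
  - Junyan HUO (born 1982), female, from Jinzhong, Shanxi; Ph.D.; associate professor at Xidian University. Research interests: multimedia communication, virtual reality, and intelligent information processing.
  - Ruipeng QIU (born 1996), male, from Shangcai, Henan; master's student at Xidian University. Research interest: video compression.
  - Yanzhuo MA (born 1980), female, from Shenzhou, Hebei; Ph.D.; associate professor at Xidian University. Research interests: video coding and video transmission.
  - Fuzheng YANG (born 1977), male, from Dezhou, Shandong; Ph.D.; professor and doctoral supervisor at Xidian University. Research interests: new-generation video compression standards, deep-learning-based video processing, and virtual reality.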
- Funding: National Natural Science Foundation of China (62101409); National Natural Science Foundation of China (62171353)
- DOI: 10.11959/j.issn.1000-436x.2022185
- Chinese Library Classification (CLC) number: TN911.7
- Published online: 2022-11
- Published in print: 2022-11-25
Junyan HUO, Ruipeng QIU, Yanzhuo MA, et al. Reference frame list optimization algorithm in video coding by quality enhancement of the nearest picture[J]. Journal on Communications, 2022, 43(11): 136-147. DOI: 10.11959/j.issn.1000-436x.2022185.

Abstract

Inter-frame prediction is a core module of video coding. It uses the reconstructed samples of reference frames to predict the samples of the current picture, so that complex video content can be represented by transmitting only a small amount of prediction residual data. In lossy video coding, the quality of the reference frames is degraded by quantization distortion, which lowers prediction accuracy and hurts coding performance. Targeting low-latency video services, a reference frame list optimization algorithm based on quality enhancement of the nearest frame is proposed. A deep-learning-based convolutional neural network enhances the quality of the reference frame nearest to the current frame, and the enhanced high-quality frame is integrated into the reference frame list of the current frame, which improves the accuracy of inter-frame prediction. With the H.265/HEVC reference software HM16.22 as the anchor, the proposed algorithm achieves BD-rate savings of 9.06%, 14.92%, and 13.19% for the Y, Cb, and Cr components, respectively.
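The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Frame` type, the `toy_enhance` smoothing filter (a stand-in for the CNN), the `max_refs` limit, and the choice to place the enhanced frame at index 0 of the list are all assumptions made for illustration, since the abstract only states that the enhanced nearest frame is integrated into the reference frame list.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Frame:
    """Hypothetical decoded picture: display order plus flattened samples."""
    poc: int                   # picture order count
    samples: Tuple[int, ...]   # reconstructed 8-bit samples
    enhanced: bool = False     # True if produced by the enhancement step


def toy_enhance(frame: Frame) -> Frame:
    """Stand-in for the CNN (assumption): a [1, 2, 1] / 4 smoothing filter
    with edge clamping and rounding, applied to the flattened samples."""
    s = frame.samples
    last = len(s) - 1
    out = tuple(
        (s[max(i - 1, 0)] + 2 * s[i] + s[min(i + 1, last)] + 2) // 4
        for i in range(len(s))
    )
    return Frame(frame.poc, out, enhanced=True)


def build_reference_list(
    dpb: List[Frame],
    current_poc: int,
    enhance: Callable[[Frame], Frame],
    max_refs: int = 4,
) -> List[Frame]:
    """Build a low-delay reference list that also holds an enhanced copy
    of the nearest previously decoded frame."""
    # Low-delay setting: only frames that precede the current one in
    # display order are candidates, ordered nearest first.
    past = sorted(
        (f for f in dpb if f.poc < current_poc),
        key=lambda f: current_poc - f.poc,
    )
    if not past:
        return []  # e.g. the first (intra) picture has no references
    enhanced_nearest = enhance(past[0])
    # Assumption: the enhanced frame takes index 0 and, if the list would
    # exceed max_refs, the farthest original frame is dropped.
    return ([enhanced_nearest] + past)[:max_refs]


# Usage: five decoded frames (POC 0..4), coding the frame with POC 5.
dpb = [Frame(poc, (10, 20, 30)) for poc in range(5)]
ref_list = build_reference_list(dpb, current_poc=5, enhance=toy_enhance)
```

In this sketch the unenhanced nearest frame stays in the list next to its enhanced copy, so an encoder could still choose between the two per block; whether the paper keeps or replaces the original is not stated in the abstract.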
