Enhanced Context Mining and Filtering for Learned Video Compression

  • Haifeng Guo
  • Sam Kwong*
  • Dongjie Ye
  • Shiqi Wang
  • *Corresponding author for this work
  • City University of Hong Kong
  • Lingnan University

Research output: Contribution to journal, article, peer-reviewed

Abstract

The Deep Contextual Video Compression framework (DCVC) utilizes a conditional coding paradigm, where the context is extracted and employed as a condition for the contextual encoder-decoder and entropy model. In this paper, we propose enhanced context mining and filtering to improve the compression efficiency of DCVC. Firstly, considering the context of DCVC is generated without supervision and redundancy may exist among context channels, an enhanced context mining model is proposed to mitigate redundancy across context channels to obtain superior context features. Then, we introduce a transformer-based enhancement network as a filtering module to capture long-distance dependencies and further enhance compression efficiency. The transformer-based enhancement adopts a full-resolution pipeline and calculates self-attention across channel dimensions. By combining the local modeling ability of the enhanced context mining model and the non-local modeling ability of the transformer-based enhancement network, our model outperforms LDP configurations of Versatile Video Coding (VVC), achieving an average bit savings of 6.7% in terms of MS-SSIM.
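The self-attention "across channel dimensions" mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' network. It assumes that queries, keys, and values are the input features themselves (no learned projections), that each channel is L2-normalised before the dot product, and that `temperature` is a hypothetical scaling parameter. The function name `channel_self_attention` is likewise illustrative. Plain Python lists are used so that the example runs without dependencies.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def l2_normalize(row, eps=1e-12):
    """Scale a vector to (approximately) unit length."""
    norm = math.sqrt(sum(v * v for v in row)) + eps
    return [v / norm for v in row]


def channel_self_attention(features, temperature=1.0):
    """Mix channels with a C x C attention map.

    features: list of C channels, each a flattened list of H*W values.
    Returns (mixed_features, attention_map).
    Assumption: Q = K = normalised features and V = features; a real
    model would derive them with learned projections.
    """
    num_channels = len(features)
    num_pixels = len(features[0])

    q = [l2_normalize(channel) for channel in features]
    k = q  # assumption: shared queries and keys

    # Each logit compares two whole channels, so the map is C x C
    # and its size does not grow with the spatial resolution.
    attn = []
    for qi in q:
        logits = [temperature * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        attn.append(softmax(logits))

    # Output channel i is a weighted sum of all input channels.
    mixed = [
        [sum(attn[i][j] * features[j][p] for j in range(num_channels))
         for p in range(num_pixels)]
        for i in range(num_channels)
    ]
    return mixed, attn


# Three channels over a flattened 2x2 frame; channels 0 and 1 are identical.
demo = [
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
mixed, attn = channel_self_attention(demo)
print(len(attn), len(attn[0]))    # 3 3
print(len(mixed), len(mixed[0]))  # 3 4
```

Because the attention map has one entry per pair of channels rather than per pair of pixels, its size is fixed by the channel count, which is one plausible reason such a design is compatible with a full-resolution pipeline.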

Original language: English
Pages (from-to): 3814-3826
Number of pages: 13
Journal: IEEE Transactions on Multimedia
Volume: 26
DOI
Publication status: Published - 2024
Externally published