Learned Reference Picture Resampling Control: A Data-Centric Approach

  • Riyu Lu
  • Yingwen Zhang
  • Hengyu Man*
  • Meng Wang
  • Long Xu
  • Shiqi Wang
  • Xiaopeng Fan

*Corresponding author for this work

  • Harbin Institute of Technology
  • City University of Hong Kong
  • Lingnan University
  • Ningbo University
  • Suzhou Research Institute

Research output: Contribution to journal › Article › peer-review

Abstract

Learned reference picture resampling control (LRPRC) adaptively adjusts the coding scale for each frame using an offline-trained neural network. It demonstrates promising rate-distortion (R-D) performance improvements over traditional methods, particularly in high-resolution, low-bit-rate video coding scenarios. However, existing LRPRC methods rely exclusively on locally optimal decision labels derived from greedy strategies for network training, leading to suboptimal control performance. To address this limitation, we introduce a novel data-centric solution that substantially improves training label quality, thereby enhancing overall LRPRC performance. Specifically, our key contribution is a parallelized beam search-based coding scale labeling algorithm, which captures decision dependencies across coding steps and produces higher-quality training labels with enhanced R-D performance. By fully exploiting the intra-trellis and inter-trellis parallelism of beam search and hierarchical coding, our proposed labeling algorithm achieves logarithmic-squared time complexity, making it highly suitable for large-scale cluster computing. We validate this simple yet effective data-centric LRPRC approach in the Versatile Video Encoder (VVenC) using 4K video sequences. Experimental results demonstrate that merely upgrading the beam search labels (without any neural architecture re-designs) consistently outperforms the state-of-the-art LRPRC method, achieving BD-rate reductions of 5.09%, 3.98%, and 3.59% under the fast, medium, and slow presets, respectively.
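
The contrast the abstract draws between greedy labels and beam search labels can be illustrated with a minimal sketch. It is not the authors' algorithm: it runs sequentially, ignores the intra-trellis and inter-trellis parallelism and the hierarchical coding structure, and replaces real encoder R-D costs with a made-up function. The names `beam_search_labels` and `toy_cost`, the two candidate scales, and all cost values are assumptions chosen only so that an early locally cheap choice leads to a worse total.

```python
from typing import Callable, Sequence, Tuple

Path = Tuple[float, ...]


def beam_search_labels(
    num_frames: int,
    scales: Sequence[float],
    step_cost: Callable[[Path, float], float],
    beam_width: int,
) -> Tuple[float, Path]:
    """Return (total_cost, scale_sequence) for the cheapest path found.

    beam_width=1 reduces to a greedy, per-step decision.
    """
    beam = [(0.0, ())]  # each entry: (accumulated cost, scales chosen so far)
    for _ in range(num_frames):
        # Expand every surviving path with every candidate scale.
        candidates = [
            (cost + step_cost(path, s), path + (s,))
            for cost, path in beam
            for s in scales
        ]
        # Keep only the beam_width cheapest partial paths.
        candidates.sort(key=lambda c: c[0])
        beam = candidates[:beam_width]
    return beam[0]


def toy_cost(path: Path, scale: float) -> float:
    """Hypothetical per-frame cost; a stand-in for an encoder's R-D cost."""
    if not path:
        # First frame: full resolution looks cheaper in isolation.
        return 1.0 if scale == 1.0 else 2.0
    base = 3.0 if scale == 1.0 else 1.0  # later frames favour the reduced scale
    switch_penalty = 2.5 if path[-1] != scale else 0.0  # changing scale is costly
    return base + switch_penalty


greedy = beam_search_labels(3, (1.0, 0.5), toy_cost, beam_width=1)
beam = beam_search_labels(3, (1.0, 0.5), toy_cost, beam_width=2)
print("greedy:", greedy)
print("beam  :", beam)
```

In this toy setting the greedy pass commits to full resolution on the first frame and stays there, while a beam of width 2 keeps the initially more expensive reduced-scale path alive long enough for it to win on total cost. That is the kind of cross-step decision dependency the abstract says greedy labels miss.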

Original language: English
Pages (from-to): 5828-5838
Number of pages: 11
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 36
Issue number: 5
DOIs
State: Published - 1 May 2026
Externally published: Yes

Keywords

  • Versatile video coding
  • beam search
  • machine learning
  • resampling-based compression
