TY - JOUR
T1 - Learned Reference Picture Resampling Control
T2 - A Data-Centric Approach
AU - Lu, Riyu
AU - Zhang, Yingwen
AU - Man, Hengyu
AU - Wang, Meng
AU - Xu, Long
AU - Wang, Shiqi
AU - Fan, Xiaopeng
N1 - Publisher Copyright:
© 2026 IEEE.
PY - 2026/5/1
Y1 - 2026/5/1
N2 - Learned reference picture resampling control (LRPRC) adaptively adjusts the coding scale for each frame using an offline-trained neural network. It demonstrates promising rate-distortion (R-D) performance improvements over traditional methods, particularly in high-resolution, low-bit-rate video coding scenarios. However, existing LRPRC methods rely exclusively on locally optimal decision labels derived from greedy strategies for network training, leading to suboptimal control performance. To address this limitation, we introduce a novel data-centric solution that substantially improves training label quality, thereby enhancing overall LRPRC performance. Specifically, our key contribution is a parallelized beam search-based coding scale labeling algorithm, which captures decision dependencies across coding steps and produces higher-quality training labels with enhanced R-D performance. By fully exploiting the intra-trellis and inter-trellis parallelism of beam search and hierarchical coding, our proposed labeling algorithm achieves logarithmic-squared time complexity, making it highly suitable for large-scale cluster computing. We validate this simple yet effective data-centric LRPRC approach in the Versatile Video Encoder (VVenC) using 4K video sequences. Experimental results demonstrate that merely upgrading the beam search labels (without any neural architecture re-designs) consistently outperforms the state-of-the-art LRPRC method, achieving BD-rate reductions of 5.09%, 3.98%, and 3.59% under the fast, medium, and slow presets, respectively.
AB - Learned reference picture resampling control (LRPRC) adaptively adjusts the coding scale for each frame using an offline-trained neural network. It demonstrates promising rate-distortion (R-D) performance improvements over traditional methods, particularly in high-resolution, low-bit-rate video coding scenarios. However, existing LRPRC methods rely exclusively on locally optimal decision labels derived from greedy strategies for network training, leading to suboptimal control performance. To address this limitation, we introduce a novel data-centric solution that substantially improves training label quality, thereby enhancing overall LRPRC performance. Specifically, our key contribution is a parallelized beam search-based coding scale labeling algorithm, which captures decision dependencies across coding steps and produces higher-quality training labels with enhanced R-D performance. By fully exploiting the intra-trellis and inter-trellis parallelism of beam search and hierarchical coding, our proposed labeling algorithm achieves logarithmic-squared time complexity, making it highly suitable for large-scale cluster computing. We validate this simple yet effective data-centric LRPRC approach in the Versatile Video Encoder (VVenC) using 4K video sequences. Experimental results demonstrate that merely upgrading the beam search labels (without any neural architecture re-designs) consistently outperforms the state-of-the-art LRPRC method, achieving BD-rate reductions of 5.09%, 3.98%, and 3.59% under the fast, medium, and slow presets, respectively.
KW - Versatile video coding
KW - beam search
KW - machine learning
KW - resampling-based compression
UR - https://www.scopus.com/pages/publications/105027995465
U2 - 10.1109/TCSVT.2026.3653045
DO - 10.1109/TCSVT.2026.3653045
M3 - 文章
AN - SCOPUS:105027995465
SN - 1051-8215
VL - 36
SP - 5828
EP - 5838
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
IS - 5
ER -