Parallelized RDOQ Algorithm and Fully Pipelined Hardware Architecture for AVS3 Video Coding

  • Xiaofeng Huang
  • Ran Tang
  • Rui Pan
  • Haibing Yin*
  • Zhao Wang
  • Shiqi Wang
  • Siwei Ma

  *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Rate-distortion optimized quantization (RDOQ) provides significant coding gain in the third-generation Audio Video coding Standard (AVS3). However, the high computational complexity and strong data dependency of RDOQ impede hardware implementation. To address these issues, we propose a zig-zag scanline-level parallelized RDOQ algorithm and its fully pipelined hardware architecture for AVS3 video coding. For algorithm optimization, we update the run-level context for rate estimation in the inner zig-zag scanline and propose an efficient RD cost calculation form in the optimal coefficient level (OCL) decision step. In the last significant coefficient (LSC) position decision step, a greedy-strategy-based algorithm is proposed to optimize the determination process in parallel. Moreover, the proposed parallelized RDOQ algorithm is accelerated by single instruction multiple data (SIMD) on the Intel x86 platform. For hardware architecture design, a fully pipelined hardware architecture is proposed with nine pipeline stages. This design can process multiple transform units in parallel when their height is less than 32. Experimental results show that the proposed algorithm achieves 31.37%, 28.58%, and 28.53% time savings at the cost of 0.25%, 0.26%, and 0.27% Bjøntegaard delta rate (BD-Rate) increases on average under the all intra (AI), random access (RA), and low delay B (LDB) configurations, respectively. The hardware implementation achieves a throughput of 32 coefficients per cycle, and the area consumption is 1223.2 K logic gates at a working frequency of 471.2 MHz. These results show that the proposed algorithm and hardware architecture design achieve a good trade-off between coding efficiency and hardware throughput.
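As a rough illustration of the optimal coefficient level decision mentioned above, the following is a minimal sketch of a generic RDOQ-style level choice that minimizes J = D + λ·R for each coefficient. It is not the paper's algorithm: the rate model `estimate_rate_bits`, the candidate set, and the function names `choose_level` and `rdoq_scanline` are all assumptions made for illustration, and the AVS3 run-level context modeling is omitted entirely.

```python
import math


def estimate_rate_bits(level: int) -> float:
    """Hypothetical rate model (an assumption, not the AVS3 context model).

    A zero level is cheap; larger magnitudes cost progressively more bits.
    """
    if level == 0:
        return 1.0
    return 2.0 + 2.0 * math.log2(abs(level) + 1)


def choose_level(coeff: float, qstep: float, lam: float) -> int:
    """Return the signed level minimizing J = D + lam * R for one coefficient.

    Candidates are zero, the floor of |coeff| / qstep, and floor + 1.
    """
    sign = -1 if coeff < 0 else 1
    magnitude = abs(coeff)
    floor_level = int(math.floor(magnitude / qstep))
    candidates = sorted({0, floor_level, floor_level + 1})

    best_level = 0
    best_cost = math.inf
    for level in candidates:
        # Distortion is the squared reconstruction error.
        distortion = (magnitude - level * qstep) ** 2
        cost = distortion + lam * estimate_rate_bits(level)
        if cost < best_cost:
            best_level, best_cost = level, cost
    return sign * best_level


def rdoq_scanline(coeffs, qstep: float, lam: float):
    """Decide every coefficient of one scanline independently.

    Because no decision here depends on its neighbours, the loop could be
    mapped onto SIMD lanes or parallel hardware units (an illustrative
    simplification of scanline-level parallelism).
    """
    return [choose_level(c, qstep, lam) for c in coeffs]
```

In this sketch, a larger `lam` pushes small coefficients toward zero, while `lam = 0` reduces the decision to picking the candidate with the lowest distortion.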

Original language: English
Pages (from-to): 6430-6444
Number of pages: 15
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 34
Issue number: 7
DOIs
State: Published - 2024
Externally published: Yes

Keywords

  • AVS3
  • hardware architecture
  • parallelized algorithm
  • RDOQ
  • zig-zag scanline
