
Convolutional Neural Network-Based Synthesized View Quality Enhancement for 3D Video Coding

  • Linwei Zhu
  • Yun Zhang
  • Shiqi Wang
  • Hui Yuan
  • Sam Kwong*
  • Horace H.S. Ip
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

The quality of the synthesized view plays an important role in a 3D video system. In this paper, to further improve coding efficiency, a convolutional neural network (CNN)-based synthesized view quality enhancement method for 3D high efficiency video coding (HEVC) is proposed. First, distortion elimination in the synthesized view is formulated as an image restoration task that aims to reconstruct the latent distortion-free synthesized image. Second, the learned CNN models are incorporated into the 3D HEVC codec to improve view synthesis performance for both view synthesis optimization (VSO) and the final synthesized view, where geometric and compression distortions are considered according to the specific characteristics of the synthesized view. Third, a new Lagrange multiplier in the rate-distortion cost function is derived to adapt the CNN-based VSO process and achieve better 3D video coding performance. Extensive experimental results show that the proposed scheme efficiently eliminates the artifacts in the synthesized image and reduces the bit rate by 25.9% and 11.7% when quality is measured by peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), respectively, significantly outperforming state-of-the-art methods.
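
The role of the Lagrange multiplier mentioned in the abstract can be illustrated with the generic rate-distortion cost J = D + λR that video encoders minimize when choosing a coding mode. The sketch below is a minimal illustration under stated assumptions, not the method of the paper: the `Candidate` class, the mode names, and all numeric values are hypothetical, and the specific multiplier derived by the authors is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical coding option for one block (illustrative only)."""
    name: str          # made-up mode label
    distortion: float  # D: e.g. squared error measured on the synthesized view
    rate: float        # R: bits needed to code the block with this mode


def rd_cost(candidate: Candidate, lam: float) -> float:
    """Generic rate-distortion cost J = D + lambda * R."""
    return candidate.distortion + lam * candidate.rate


def best_mode(candidates: list[Candidate], lam: float) -> Candidate:
    """Pick the candidate with the smallest rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


# Made-up numbers: mode_a is cheap but distorted, mode_b is costly but accurate.
candidates = [
    Candidate("mode_a", distortion=100.0, rate=10.0),
    Candidate("mode_b", distortion=60.0, rate=40.0),
]

low = best_mode(candidates, lam=0.5)   # small lambda favors low distortion
high = best_mode(candidates, lam=2.0)  # large lambda favors low rate
print(low.name, high.name)
```

Because the multiplier sets the exchange rate between bits and distortion, changing how distortion is measured (for example, after a CNN enhances the synthesized view) generally calls for re-deriving λ so that mode decisions stay balanced, which is the motivation the abstract gives for its third contribution.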

Original language: English
Article number: 8416728
Pages (from-to): 5365-5377
Number of pages: 13
Journal: IEEE Transactions on Image Processing
Volume: 27
Issue number: 11
State: Published - Nov 2018
Externally published: Yes

Keywords

  • 3D high efficiency video coding
  • Convolutional neural network
  • depth coding
  • Lagrange multiplier
  • view synthesis
