
Learning to Explore Saliency for Stereoscopic Videos Via Component-Based Interaction

Qiudan Zhang, Xu Wang*, Shiqi Wang, Zhenhao Sun, Sam Kwong, Jianmin Jiang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

In this paper, we devise a saliency prediction model for stereoscopic videos that learns to explore saliency through component-based interactions among spatial, temporal, and depth cues. The model first takes advantage of the specific structure of a 3D residual network (3D-ResNet) to model the saliency driven by spatio-temporal coherence across consecutive frames. Subsequently, the saliency inferred from implicit depth is automatically derived from the displacement correlation between the left and right views by leveraging a deep convolutional network (ConvNet). Finally, a component-wise refinement network is devised to produce the final saliency maps over time by aggregating the saliency distributions obtained from the individual components. To further facilitate research on stereoscopic video saliency, we create a new dataset comprising 175 stereoscopic video sequences with diverse content, together with their dense eye-fixation annotations. Extensive experiments demonstrate that our proposed model achieves superior performance compared to state-of-the-art methods on all publicly available eye-fixation datasets.
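The abstract describes a final stage that aggregates saliency distributions from multiple components (spatio-temporal and implicit-depth branches) into one map. The paper's refinement network is a learned module; as a minimal illustrative sketch only, the aggregation step can be approximated by a softmax-weighted fusion of per-component maps. The function and weight values below are hypothetical assumptions, not the authors' implementation.

```python
import numpy as np

def fuse_components(components, weights):
    """Aggregate component saliency maps with softmax-normalized weights.

    This is an illustrative stand-in for the paper's learned component-wise
    refinement network, not its actual architecture.
    """
    w = np.exp(weights - np.max(weights))  # softmax over component weights
    w = w / w.sum()
    fused = sum(wi * c for wi, c in zip(w, components))
    # Rescale to [0, 1], the conventional range for saliency maps.
    fused = (fused - fused.min()) / (fused.max() - fused.min() + 1e-8)
    return fused

# Toy 4x4 maps standing in for the outputs of the two branches plus a prior.
rng = np.random.default_rng(0)
spatio_temporal = rng.random((4, 4))   # 3D-ResNet branch (placeholder)
implicit_depth = rng.random((4, 4))    # left/right-view ConvNet branch (placeholder)
spatial_prior = rng.random((4, 4))     # hypothetical static prior

saliency = fuse_components(
    [spatio_temporal, implicit_depth, spatial_prior],
    weights=np.array([1.0, 0.5, 0.2]),
)
```

In the actual model these weights would be produced by the refinement network conditioned on the input, rather than fixed constants.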

Original language: English
Article number: 9062560
Pages (from-to): 5722-5736
Number of pages: 15
Journal: IEEE Transactions on Image Processing
Volume: 29
DOI
Publication status: Published - 2020
Externally published: Yes
