TY - GEN
T1 - ScanDMM
T2 - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023
AU - Sui, Xiangjie
AU - Fang, Yuming
AU - Zhu, Hanwei
AU - Wang, Shiqi
AU - Wang, Zhou
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Scanpath prediction for 360° images aims to produce dynamic gaze behaviors based on the human visual perception mechanism. Most existing scanpath prediction methods for 360° images do not give a complete treatment of the time-dependency when predicting human scanpath, resulting in inferior performance and poor generalizability. In this paper, we present a scanpath prediction method for 360° images by designing a novel Deep Markov Model (DMM) architecture, namely ScanDMM. We propose a semantics-guided transition function to learn the nonlinear dynamics of time-dependent attentional landscape. Moreover, a state initialization strategy is proposed by considering the starting point of viewing, enabling the model to learn the dynamics with the correct 'launcher'. We further demonstrate that our model achieves state-of-the-art performance on four 360° image databases, and exhibit its generalizability by presenting two applications of applying scanpath prediction models to other visual tasks - saliency detection and image quality assessment, expecting to provide profound insights into these fields.
AB - Scanpath prediction for 360° images aims to produce dynamic gaze behaviors based on the human visual perception mechanism. Most existing scanpath prediction methods for 360° images do not give a complete treatment of the time-dependency when predicting human scanpath, resulting in inferior performance and poor generalizability. In this paper, we present a scanpath prediction method for 360° images by designing a novel Deep Markov Model (DMM) architecture, namely ScanDMM. We propose a semantics-guided transition function to learn the nonlinear dynamics of time-dependent attentional landscape. Moreover, a state initialization strategy is proposed by considering the starting point of viewing, enabling the model to learn the dynamics with the correct 'launcher'. We further demonstrate that our model achieves state-of-the-art performance on four 360° image databases, and exhibit its generalizability by presenting two applications of applying scanpath prediction models to other visual tasks - saliency detection and image quality assessment, expecting to provide profound insights into these fields.
KW - Scene analysis and understanding
UR - https://www.scopus.com/pages/publications/85162139465
U2 - 10.1109/CVPR52729.2023.00675
DO - 10.1109/CVPR52729.2023.00675
M3 - 会议稿件
AN - SCOPUS:85162139465
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 6989
EP - 6999
BT - Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023
PB - IEEE Computer Society
Y2 - 18 June 2023 through 22 June 2023
ER -