TY - JOUR
T1 - Frame-Wise Detection of Double HEVC Compression by Learning Deep Spatio-Temporal Representations in Compression Domain
AU - He, Peisong
AU - Li, Haoliang
AU - Wang, Hongxia
AU - Wang, Shiqi
AU - Jiang, Xinghao
AU - Zhang, Ruimei
N1 - Publisher Copyright:
© 1999-2012 IEEE.
PY - 2021
Y1 - 2021
AB - Detection of double compression is regarded as a primary step in analyzing the integrity of digital videos, which is of prominent importance in video forensics. However, current methods are vulnerable to the severe lossy quantization in the recompression process, making it challenging to obtain reliable frame-wise detection results, especially for the High Efficiency Video Coding (HEVC) standard. To address these issues, this paper proposes a hybrid neural network that reveals abnormal frames in HEVC videos with double compression by learning robust spatio-temporal representations from coding information in the compression domain. A statistical analysis of Coding Units (CUs) shows that HEVC video streams contain 'rich' coding information that can be leveraged to identify abnormal traces caused by double compression. Two types of coding information maps, the CU Size Map (CSM) and the CU Prediction mode Map (CPM), are exploited. In contrast with the conventional paradigm, which relies on pixel-level representations of decoded frames, the CSMs and CPMs of a short video clip are treated as the input, aiming to achieve high robustness against low-quality recompression. In the hybrid neural network, an attention-based two-stream residual network learns hierarchical representations from the CSM and CPM, which are then jointly optimized by an attention-based fusion module. Finally, the temporal variation is modeled by Long Short-Term Memory (LSTM) to obtain frame-wise detection results. Extensive experiments were conducted across various video content and coding parameters, such as bitrates and Group of Pictures (GOP) sizes. Experimental results show that the approach achieves state-of-the-art performance compared with conventional methods, especially when videos are recompressed in low-bitrate coding scenarios.
KW - Coding information map
KW - double HEVC compression
KW - hybrid neural network
KW - spatio-temporal representation
KW - video forensics
UR - https://www.scopus.com/pages/publications/85090435046
U2 - 10.1109/TMM.2020.3021234
DO - 10.1109/TMM.2020.3021234
M3 - Article
AN - SCOPUS:85090435046
SN - 1520-9210
VL - 23
SP - 3179
EP - 3192
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -