TY - JOUR
T1 - Appearance Matters, So Does Audio
T2 - Revealing the Hidden Face via Cross-Modality Transfer
AU - Kong, Chenqi
AU - Chen, Baoliang
AU - Yang, Wenhan
AU - Li, Haoliang
AU - Chen, Peilin
AU - Wang, Shiqi
N1 - Publisher Copyright:
© 1991-2012 IEEE.
PY - 2022/1/1
Y1 - 2022/1/1
AB - Recently, there has been an exponential increase in the security concerns raised by fake faces (e.g., deepfakes), in which the identity is automatically changed with a specifically learned deep generative model. While numerous approaches have been proposed to identify fake content, much less work has been dedicated to automatically revealing the authentic content that was originally acquired. Here, we propose a new paradigm that seeks to reveal the authentic face hidden behind the fake one by leveraging the joint information of face and audio. More specifically, given the fake face and the audio segment, the cross-modality transfer capability is exploited by learning to generate the features of the authentic face from the underlying clues in the audio and in the fake face appearance. The effectiveness of the proposed scheme is validated through a series of evaluations, and experimental results show that the proposed model achieves promising performance in revealing hidden faces, in terms of reconstruction quality as well as identity and face attribute inference accuracy.
KW - cross modality
KW - Deepfake
KW - face reconstruction
KW - face revealing
KW - fake face
UR - https://www.scopus.com/pages/publications/85101471454
U2 - 10.1109/TCSVT.2021.3057457
DO - 10.1109/TCSVT.2021.3057457
M3 - Article
AN - SCOPUS:85101471454
SN - 1051-8215
VL - 32
SP - 423
EP - 436
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
IS - 1
ER -