TY - GEN
T1 - Diffusion Facial Forgery Detection
AU - Cheng, Harry
AU - Guo, Yangyang
AU - Wang, Tianyi
AU - Nie, Liqiang
AU - Kankanhalli, Mohan
N1 - Publisher Copyright: © 2024 ACM.
PY - 2024/10/28
Y1 - 2024/10/28
N2 - Detecting diffusion-generated images has recently emerged as a research area. Existing diffusion-based datasets predominantly focus on general image generation. However, facial forgeries, which pose severe social risks, have remained less explored thus far. To address this gap, this paper introduces DiFF, a comprehensive dataset dedicated to face-focused diffusion-generated images. DiFF comprises over 500,000 images that are synthesized using thirteen distinct generation methods under four conditions. In particular, this dataset utilizes 30,000 carefully collected textual and visual prompts, ensuring the synthesis of images with both high fidelity and semantic consistency. We conduct extensive experiments on the DiFF dataset via human subject tests and several representative forgery detection methods. The results demonstrate that the binary detection accuracies of both human observers and automated detectors often fall below 30%, offering insights into the challenges of detecting diffusion-generated facial forgeries. Moreover, our experiments demonstrate that DiFF, compared to previous facial forgery datasets, contains a more diverse and realistic range of forgeries, showcasing its potential to aid in the development of more generalized detectors. Finally, we propose an edge graph regularization approach to effectively enhance the generalization capability of existing detectors.
KW - deepfake detection
KW - diffusion-based generation
KW - facial forgery detection
UR - https://www.scopus.com/pages/publications/85209821450
U2 - 10.1145/3664647.3680797
DO - 10.1145/3664647.3680797
M3 - Conference contribution
AN - SCOPUS:85209821450
T3 - MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia
SP - 5939
EP - 5948
BT - MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
T2 - 32nd ACM International Conference on Multimedia, MM 2024
Y2 - 28 October 2024 through 1 November 2024
ER -