TY - GEN
T1 - A Transformer based Approach for Image Manipulation Chain Detection
AU - You, Jiaxiang
AU - Li, Yuanman
AU - Zhou, Jiantao
AU - Hua, Zhongyun
AU - Sun, Weiwei
AU - Li, Xia
N1 - Publisher Copyright: © 2021 ACM.
PY - 2021/10/17
Y1 - 2021/10/17
N2 - Image manipulation chain detection aims to identify the existence of involved operations and also their orders, playing an important role in multimedia forensics and image analysis. However, all the existing algorithms model the manipulation chain detection as a classification problem, and can only detect chains containing up to two operations. Due to the exponentially increased solution space and the complex interactions among operations, how to reveal a long chain from a processed image remains a long-standing problem in the multimedia forensic community. To address this challenge, in this paper, we propose a new direction for manipulation chain detection. Different from previous works, we treat the manipulation chain detection as a machine translation problem rather than a classification one, where we model the chains as the sentences of a target language, and each word serves as one possible image operation. Specifically, we first transform the manipulated image into a deep feature space, and further model the traces left by the manipulation chain as a sentence of a latent source language. Then, we propose to detect the manipulation chain through learning the mapping from the source language to the target one under a machine translation framework. Our method can detect manipulation chains consisting of up to five operations, and we obtain promising results on both the short-chain detection and the long-chain detection.
AB - Image manipulation chain detection aims to identify the existence of involved operations and also their orders, playing an important role in multimedia forensics and image analysis. However, all the existing algorithms model the manipulation chain detection as a classification problem, and can only detect chains containing up to two operations. Due to the exponentially increased solution space and the complex interactions among operations, how to reveal a long chain from a processed image remains a long-standing problem in the multimedia forensic community. To address this challenge, in this paper, we propose a new direction for manipulation chain detection. Different from previous works, we treat the manipulation chain detection as a machine translation problem rather than a classification one, where we model the chains as the sentences of a target language, and each word serves as one possible image operation. Specifically, we first transform the manipulated image into a deep feature space, and further model the traces left by the manipulation chain as a sentence of a latent source language. Then, we propose to detect the manipulation chain through learning the mapping from the source language to the target one under a machine translation framework. Our method can detect manipulation chains consisting of up to five operations, and we obtain promising results on both the short-chain detection and the long-chain detection.
KW - image forensics
KW - machine translation
KW - manipulation chain detection
KW - transformer
UR - https://www.scopus.com/pages/publications/85119345896
U2 - 10.1145/3474085.3475513
DO - 10.1145/3474085.3475513
M3 - Conference contribution
AN - SCOPUS:85119345896
T3 - MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
SP - 3510
EP - 3517
BT - MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
T2 - 29th ACM International Conference on Multimedia, MM 2021
Y2 - 20 October 2021 through 24 October 2021
ER -