
Improving Sequential DeepFake Detection with Local information enhancement

  • Harbin Institute of Technology Weihai

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Existing Deepfake technology often generates images through multi-step forgery. However, few methods are available for detecting such sequential Deepfakes. To address this challenge, we propose a feature cross-fusion model that combines a Vision Transformer (ViT) and a Convolutional Neural Network (CNN), along with a novel data augmentation technique called Channel Random Erasing (CRE). The model first improves its robustness with CRE, which introduces controlled occlusions during training to simulate real-world manipulations. It then captures both the global and local features of images through multi-scale feature fusion. The ViT captures global contextual information via its self-attention mechanism, providing a strong global feature representation, while the CNN extracts local details through convolution operations, effectively capturing edge and texture information. Extensive experiments on the Seq-Deepfake benchmark demonstrate the effectiveness of the model, which outperforms current state-of-the-art methods.
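
The abstract does not specify how CRE is implemented. As a rough illustration only, the sketch below assumes CRE behaves like standard random erasing applied to a single randomly chosen channel rather than to all channels at once. The function name `channel_random_erasing`, its parameters (`p`, `area_range`, `fill`), the aspect-ratio range, and the channel-first `(C, H, W)` layout are all hypothetical choices made for this example, not details taken from the paper.

```python
import numpy as np

def channel_random_erasing(image, rng, p=0.5, area_range=(0.02, 0.2), fill=0.0):
    """Erase one random rectangle in one randomly chosen channel.

    image: float array of shape (C, H, W); a modified copy is returned.
    All hyperparameters are illustrative assumptions, not the paper's values.
    """
    out = image.copy()
    if rng.random() >= p:
        return out  # skip augmentation for this sample

    c, h, w = out.shape
    # Sample the rectangle's area (as a fraction of the image) and aspect ratio.
    area = rng.uniform(*area_range) * h * w
    aspect = rng.uniform(0.3, 3.3)
    erase_h = int(min(h, max(1, round(np.sqrt(area * aspect)))))
    erase_w = int(min(w, max(1, round(np.sqrt(area / aspect)))))

    # Sample the rectangle's position and the single channel to occlude.
    top = int(rng.integers(0, h - erase_h + 1))
    left = int(rng.integers(0, w - erase_w + 1))
    channel = int(rng.integers(0, c))

    out[channel, top:top + erase_h, left:left + erase_w] = fill
    return out

rng = np.random.default_rng(0)
img = np.ones((3, 32, 32), dtype=np.float32)
aug = channel_random_erasing(img, rng, p=1.0)
```

Under this assumption, occluding a single channel leaves the other channels intact at the erased location, so the occlusion is partial rather than total. Whether the paper erases one channel, several channels, or a different region per channel is not stated in the abstract.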

Original language: English
Title of host publication: Proceedings of the 6th ACM International Conference on Multimedia in Asia, MMAsia 2024
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9798400712739
State: Published - 28 Dec 2024
Externally published: Yes
Event: 6th ACM International Conference on Multimedia in Asia, MMAsia 2024 - Auckland, New Zealand
Duration: 3 Dec 2024 to 6 Dec 2024

Publication series

Name: Proceedings of the 6th ACM International Conference on Multimedia in Asia, MMAsia 2024

Conference

Conference: 6th ACM International Conference on Multimedia in Asia, MMAsia 2024
Country/Territory: New Zealand
City: Auckland
Period: 3/12/24 to 6/12/24

Keywords

  • Channel Random Erasing
  • Cross-attention
  • Deepfake detection
  • ViT
