Penny-Wise and Pound-Foolish in AI-Generated Image Detection

  • Yabin Wang
  • Zhiwu Huang
  • Zhou Su
  • Adam Prugel-Bennett*
  • Xiaopeng Hong
  • *Corresponding author for this work
  • Harbin Institute of Technology
  • University of Southampton
  • Xi'an Jiaotong University
  • Pengcheng Laboratory

Research output: Contribution to journal › Article › peer-review

Abstract

The rise of AI-generated images has raised serious concerns about their misuse across many domains and created an urgent need for robust detection methods. Despite recent advances, many current approaches prioritize short-term gains over long-term effectiveness. This paper critiques the overly specialized practice of fine-tuning pre-trained models to maximize performance on a single AI image dataset while disregarding the long-term need for generalization and knowledge retention. To address this trade-off, we propose PoundNet, a novel learning framework built on a pre-trained vision-language model for generalizable AI-generated image detection. PoundNet combines a learnable prompt design with a balanced objective to preserve broad knowledge from upstream tasks (object classification) while improving generalization on the downstream task (AI-generated image detection). Following common practice in the literature, we train PoundNet on a single standard AI image dataset. We then evaluate it on 10 large-scale public AI-generated image detection datasets using 5 main evaluation metrics, which to our knowledge forms the largest benchmark test set for assessing the generalization ability of AI-generated image detection models. The benchmark evaluation shows that PoundNet balances generalization with knowledge retention, achieving a relative improvement of 19% in AI-generated image detection performance over state-of-the-art methods while maintaining a strong performance of 63% on object classification tasks.

Keywords

  • AI-generated image detection
  • balanced objective
  • generalization
  • knowledge preservation
  • learnable prompt design
  • pre-trained vision-language model
