Abstract
Objective The rapid evolution of artificial intelligence-generated content (AIGC), particularly text-to-image models such as Stable Diffusion, Midjourney, and DALL-E, presents a dual-use dilemma. While enabling unprecedented creative applications, the proliferation of hyper-realistic AI-generated images poses severe societal risks, including the spread of disinformation, sophisticated fraud, and intellectual property infringement. These risks underscore the urgent need for robust and reliable detection technologies. However, conventional detection methods, which rely on offline training, are fundamentally ill-equipped for this dynamic environment. The “train-once, deploy-forever” paradigm causes them to fail when faced with the ceaseless emergence of novel generative models, leading to a swift decay in performance. This limitation calls for a more adaptive approach. Continual learning (CL), a paradigm designed to enable models to learn sequentially from a continuous data stream while mitigating “catastrophic forgetting,” offers a promising solution. Yet its application to the diverse and fast-paced domain of modern AIGC detection is hindered by two significant, unaddressed gaps. First, there is no specialized, large-scale benchmark for systematically evaluating continual AIGC detection methods. Second, and more critically, a real-world data constraint creates a novel and formidable learning challenge: new generative models (the source of positive samples) are often released and become accessible, while the corresponding high-quality real training images (negative samples) remain proprietary and unavailable. This condition leads to a novel “mixed dual and single-class” incremental learning problem: initial learning tasks may possess both positive and negative samples, but subsequent tasks are often restricted to positive samples only.
This scenario fundamentally violates the core assumptions of most existing CL algorithms, rendering them ineffective. To address these challenges, this study establishes a foundational methodology for the continual detection of AI-generated images. Our primary objective is to construct a comprehensive benchmark and a robust framework capable of dynamically adapting to an ever-expanding stream of generative models, particularly under these realistic and challenging data constraints.

Method First, we introduce and release a continual AI-generated image detection (CAID) benchmark, the first large-scale dataset specifically tailored for this task. It contains high-quality images from five state-of-the-art generative models (Stable Diffusion v1.5, DALL-E 2, Imagen, Midjourney, and Parti) and corresponding real images, organized into a sequential task stream to simulate the real-world emergence of new AIGC technologies. On this benchmark, we formally define the CAID problem, which requires a model not only to perform binary classification (real vs. fake) but also to achieve multiclass source attribution (i.e., identifying the specific generator model), a task crucial for “copyright identification.” To standardize evaluation, we design three evaluation scenarios of increasing difficulty based on data replay constraints. Scenario 1 (full replay) is a lenient setting in which a small buffer of historical positive (fake) and negative (real) samples can be replayed. Scenario 2 (negative-only replay) is a more practical setting in which only historical negative samples can be replayed, reflecting IP or privacy restrictions. Scenario 3 (no replay) is the most stringent setting: it forbids access to any past samples and forces the model to learn incrementally from single-class data. We then propose tailored solutions for these scenarios. For Scenarios 1 and 2, we adapt existing CL methods using a “negative sample sharing” mechanism.
This solution ensures that a small set of real images from the initial task is consistently available, providing the negative class information needed to stabilize training and prevent the failure of standard loss functions. For the most severe no-replay scenario (Scenario 3), in which our experiments show that existing methods fail catastrophically, we propose a novel universal conversion framework. This framework is engineered to rescue failing methods by systematically addressing the core breakdown points of loss function invalidation, severe classifier output bias, and feature representation drift. It integrates three synergistic components. First, we use knowledge distillation (KD), in which the model from the previous task acts as a “teacher” to guide the current model, preserving knowledge of past classes by matching its output logits. Second, we apply cosine normalization (CN), replacing the standard linear layer with a cosine-normalized classifier that calibrates output logits across all tasks to mitigate bias. Finally, the framework uses prompt tuning (PT), freezing the pretrained vision transformer backbone and training only a small set of learnable “prompt” parameters. This parameter-efficient approach drastically reduces overfitting on new single-class data and preserves the integrity of the learned feature space.

Result Our extensive experiments on the CAID benchmark validate the efficacy of our methodology. In Scenarios 1 and 2, adapted CL methods such as FOSTER and S-Prompts perform well, confirming the value of replaying historical data. The most compelling results emerge from Scenario 3. Standard replay-free methods (LwF, EWC, and S-Prompts) collapse completely: their average accuracy (AA) drops to random-guess levels (~50%), and they suffer from extreme catastrophic forgetting. Applying our universal conversion framework revives these methods.
Their AA rises to 65%, and, critically, catastrophic forgetting is largely eliminated, with average forgetting (AF) scores dropping to near-zero levels. This finding indicates that our framework copes with the extreme challenges of no-replay, single-class incremental learning. An in-depth ablation study confirms that all three components (KD, CN, and PT) are indispensable and work synergistically to achieve this performance recovery. Furthermore, t-SNE visualizations verify that our framework significantly reduces feature drift, while frequency spectrum analysis highlights the inherent difficulty of the CAID dataset in comparison with traditional deepfakes, justifying the need for our approach.

Conclusion This study makes a foundational contribution to the growing field of AIGC detection. We have constructed and released the first large-scale benchmark for CAID, formally defined the problem, and identified the novel and practical “mixed dual and single-class” learning challenge. Our proposed solutions, particularly the universal conversion framework, provide a robust and effective strategy for developing adaptive detection systems, demonstrating how to maintain performance even under the most stringent real-world data constraints. By open-sourcing our dataset and code, we aim to provide a solid foundation and catalyze future research in this area. Ultimately, our findings offer methodological support for building future-proof detection systems capable of keeping pace with the relentless evolution of AI generation technologies.
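
The Method paragraph names cosine normalization (CN) and knowledge distillation (KD) as two of the framework's components but gives no formulas. The following is a minimal, hedged sketch of how such components are commonly written, not the authors' implementation. The function names, the `scale` and `temperature` parameters, and the choice of a temperature-softened KL divergence for "matching output logits" are all assumptions made for illustration.

```python
import math


def cosine_logits(feature, class_weights, scale=10.0):
    """Return scale * cos(feature, w_c) for every class weight vector w_c.

    Because both vectors are length-normalized, every logit lies in
    [-scale, scale], so a newly added class cannot dominate older classes
    simply by having a larger weight norm. `scale` is a hypothetical
    hyperparameter, not a value taken from the abstract.
    """
    f_norm = math.sqrt(sum(x * x for x in feature)) or 1.0
    logits = []
    for w in class_weights:
        w_norm = math.sqrt(sum(x * x for x in w)) or 1.0
        dot = sum(a * b for a, b in zip(feature, w))
        logits.append(scale * dot / (f_norm * w_norm))
    return logits


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student), restricted to the classes the teacher knows.

    Assumption: old classes occupy the first len(teacher_logits) positions
    of the student's output.
    """
    n_old = len(teacher_logits)
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits[:n_old], temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy example: two old classes, then one new class with a large weight norm.
old_weights = [[1.0, 0.0], [0.0, 1.0]]
new_weights = old_weights + [[5.0, 5.0]]
feature = [0.6, 0.8]

teacher = cosine_logits(feature, old_weights)   # previous-task model
student = cosine_logits(feature, new_weights)   # current model
loss = distillation_loss(student, teacher)      # zero: old logits unchanged
```

In this toy setup the student's old-class logits are identical to the teacher's, so the distillation term vanishes; it becomes positive as soon as training on a new task shifts the old-class outputs, which is the drift that KD is meant to penalize.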
| Translated title of the contribution | 面向AI 生成图像持续检测基准数据集与框架研究 |
|---|---|
| Original language | English |
| Pages (from-to) | 3438-3450 |
| Number of pages | 13 |
| Journal | Journal of Image and Graphics |
| Volume | 30 |
| Issue number | 11 |
| State | Published - 2025 |
Keywords
- artificial intelligence-generated content (AIGC) detection
- benchmark dataset
- catastrophic forgetting
- continual learning (CL)
- dataset
- deepfake
- knowledge distillation (KD)
- prompt tuning