Enhancing robust generalization through appropriate adversarial example attack intensity

  • Xiaoguo Ding
  • Liangjian Zhang
  • Qiqi Bao
  • Yaguan Qian*
  • Bin Wang
  • Zhaoquan Gu
  • Yanchun Zhang
  • *Corresponding author for this work
  • Zhejiang University of Science and Technology
  • Zhejiang Key Laboratory of Artificial Intelligence of Things (AIoT) Network and Data Security
  • School of Computer Science and Technology, Harbin Institute of Technology
  • Victoria University

Research output: Contribution to journal › Article › peer-review

Abstract

Deep Neural Networks (DNNs) are notoriously susceptible to adversarial examples. To mitigate the impact of well-designed adversarial attacks on network models, researchers have developed various defense mechanisms, among which adversarial training has emerged as one of the most effective strategies to date. Adversarial training augments the training data with adversarial examples, giving DNNs a degree of robustness against adversarial attacks. However, this robustness comes at the cost of reduced generalization performance, which shows up as lower classification accuracy on clean test data. Researchers have been actively seeking a balance between adversarial robustness and model generalization. We believe that the key to balancing these two aspects lies in identifying appropriate adversarial examples: overly potent examples can lead to a decline in clean accuracy, whereas weaker examples may offer limited robustness. Based on this analysis, we propose a new adversarial example generation algorithm called Denoising Projection Gradient Descent (DPGD). DPGD adds a purification module and a constraint to the generation of adversarial examples. The former limits the influence of overly strong adversarial examples on model training, and the latter ensures the necessary attack intensity. Combining DPGD with the traditional adversarial training framework yields the Diffusion Adversarial Training (DifAT) approach. To verify the effectiveness of the proposed method, we conducted extensive experiments on benchmark datasets, including CIFAR-10, CIFAR-100, and Tiny-ImageNet. The results demonstrate that DifAT improves the robustness of DNNs while maintaining or even improving their generalization performance.
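
For orientation, the sketch below shows the standard L-infinity projected gradient descent (PGD) attack that adversarial training commonly builds on. It is not the DPGD algorithm: the abstract does not specify the purification module or the intensity constraint, so both are omitted. The logistic-regression classifier, the function names `pgd_attack` and `bce`, and the values of `eps`, `alpha`, and `steps` are all illustrative assumptions chosen to keep the example self-contained.

```python
import numpy as np


def bce(x, y, w, b):
    """Mean binary cross-entropy of a logistic-regression model (toy stand-in for a DNN)."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    p = np.clip(p, 1e-12, 1.0 - 1e-12)  # avoid log(0)
    return float(np.mean(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))))


def pgd_attack(x, y, w, b, eps=0.1, alpha=0.02, steps=10, rng=None):
    """Standard L-infinity PGD (assumed baseline, not the paper's DPGD).

    x: (n, d) inputs in [0, 1]; y: (n,) labels in {0, 1}; w: (d,) weights; b: bias.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    # Random start inside the eps-ball around the clean input.
    x_adv = np.clip(x + rng.uniform(-eps, eps, size=x.shape), 0.0, 1.0)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x_adv @ w + b)))  # predicted P(y = 1)
        grad = (p - y)[:, None] * w[None, :]        # d(cross-entropy)/d(x)
        x_adv = x_adv + alpha * np.sign(grad)       # gradient-sign ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)    # project back into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)            # keep a valid input range
    return x_adv


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.uniform(0.0, 1.0, size=(8, 5))
    y = rng.integers(0, 2, size=8).astype(float)
    w, b = rng.normal(size=5), 0.1
    x_adv = pgd_attack(x, y, w, b)
    print("clean loss:", bce(x, y, w, b))
    print("adversarial loss:", bce(x_adv, y, w, b))
```

In this framing, `eps` is the knob for attack intensity: a larger budget gives stronger examples, which is the quantity the abstract argues must be neither too large nor too small.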

Original language: English
Article number: 131599
Journal: Neurocomputing
Volume: 657
State: Published - 7 Dec 2025
Externally published: Yes

Keywords

  • Adversarial examples
  • Adversarial training
  • Diffusion model
  • Generalization
