
Federated Weakly Supervised Video Anomaly Detection with Multimodal Prompt

  • Benfeng Wang
  • Chao Huang*
  • Jie Wen
  • Wei Wang
  • Yabo Liu
  • Yong Xu

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

Abstract

Video anomaly detection (VAD) aims to locate abnormal events in videos. Weakly supervised VAD (WSVAD), which requires only video-level annotations during training, has recently made great progress. In practical applications, different institutions may hold different types of abnormal videos, but these videos cannot be circulated on the internet because of privacy protection. To train a more generalized anomaly detector that can identify various anomalies, it is reasonable to introduce federated learning into WSVAD. In this paper, we propose Global and Local Context-driven Federated Learning, a new paradigm for privacy-protected weakly supervised video anomaly detection. Specifically, we use the vision-language association of CLIP to detect whether a video frame is abnormal. Instead of relying on handcrafted text prompts for CLIP, we propose a text prompt generator whose output is conditioned on both textual and visual information. On the one hand, the text provides global context related to anomalies, which improves the model’s generalization ability. On the other hand, the visual information provides personalized local context, because different clients may have videos with different types of anomalies or scenes. The generated prompt thus preserves global generalization while adapting to the personalized data of each client. Extensive experiments show that the proposed method achieves remarkable performance.
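
The abstract gives no implementation details, so the following is only a toy sketch of how the three ideas it names could fit together: a prompt built from a shared (global) text context plus a client-specific (local) visual context, a CLIP-style frame score computed from similarities to "normal" and "abnormal" prompt embeddings, and server-side averaging of the shared context across clients. Every name here (`generate_prompt`, `abnormal_probability`, `federated_average`), the additive combination, the `alpha` and `temperature` values, and the use of size-weighted averaging are assumptions for illustration, not the authors' method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def generate_prompt(global_ctx, local_visual_ctx, alpha=0.5):
    """Hypothetical prompt generator: shared text context plus a scaled
    client-specific visual context (additive mixing is an assumption)."""
    return [g + alpha * v for g, v in zip(global_ctx, local_visual_ctx)]


def abnormal_probability(frame_feat, normal_prompt, abnormal_prompt,
                         temperature=0.07):
    """Softmax over the frame's similarity to the two prompt embeddings;
    returns the probability assigned to the 'abnormal' prompt.
    The temperature value is an assumed constant."""
    s_normal = cosine(frame_feat, normal_prompt) / temperature
    s_abnormal = cosine(frame_feat, abnormal_prompt) / temperature
    m = max(s_normal, s_abnormal)  # subtract the max for numerical stability
    e_normal = math.exp(s_normal - m)
    e_abnormal = math.exp(s_abnormal - m)
    return e_abnormal / (e_normal + e_abnormal)


def federated_average(client_vectors, client_sizes):
    """Size-weighted average of the clients' shared context vectors.
    Only these vectors leave a client; raw videos stay local."""
    total = sum(client_sizes)
    dim = len(client_vectors[0])
    return [
        sum(vec[i] * n for vec, n in zip(client_vectors, client_sizes)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two clients with different amounts of local data share only their
    # global-context vectors with the server.
    shared_abnormal = federated_average([[1.0, 0.0], [0.0, 1.0]], [30, 10])
    # Each client then personalizes the shared context with its own visuals.
    abnormal_prompt = generate_prompt(shared_abnormal, [0.2, 0.0])
    normal_prompt = generate_prompt([0.0, 1.0], [0.2, 0.0])
    print(abnormal_probability([1.0, 0.1], normal_prompt, abnormal_prompt))
```

In this sketch the raw videos and the local visual context never leave a client, which is the privacy property the abstract motivates; which parameters the actual method shares or keeps local is not stated in the abstract.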

Original language: English
Pages (from-to): 21017-21025
Number of pages: 9
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Volume: 39
Issue number: 20
State: Published - 11 Apr 2025
Externally published: Yes
Event: 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025 - Philadelphia, United States
Duration: 25 Feb 2025 to 4 Mar 2025
