Abstract
Distillation methods excel in multimodal industrial inspection by comparing teacher–student network outputs to detect anomalies. However, existing approaches often rely on single-view depth maps for 3D representation and simplistic multimodal fusion techniques, leading to incomplete 3D structure capture and semantic misalignment. To address these limitations, we introduce asymmetric dual-branch reverse distillation (A2RD) for multimodal anomaly detection. A2RD exploits the complementarity of multimodal data through crossmodal reverse distillation, using point cloud input alongside RGB input to preserve 3D structural information. The framework features a dual-branch encoder-decoder architecture that reconstructs 3D and 2D features in parallel and performs mutual distillation across branches. Within each branch, the student and teacher networks process different modalities, facilitating 3D-to-2D and 2D-to-3D knowledge flow. A modality conversion bottleneck aligns features across modalities. Experiments on the MVTec 3D-AD and Eyecandies benchmarks demonstrate that A2RD outperforms state-of-the-art methods in both point cloud-level AUROC and point-level AUPRO, supporting its effectiveness.
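The teacher–student comparison described in the abstract can be illustrated with a minimal sketch. This is not the A2RD implementation: the cosine-distance score, the summation of the two branch maps, the use of the maximum as a sample-level score, and all names (`cosine_distance`, `anomaly_map`, `fuse`, the toy feature lists) are assumptions chosen for illustration only.

```python
import math


def cosine_distance(u, v, eps=1e-12):
    """Return 1 - cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm + eps)


def anomaly_map(teacher, student):
    """Score each location by the teacher-student feature discrepancy.

    Both arguments are lists of per-location feature vectors.
    """
    return [cosine_distance(t, s) for t, s in zip(teacher, student)]


def fuse(map_a, map_b):
    """Hypothetical fusion rule (an assumption): add the two branch maps."""
    return [a + b for a, b in zip(map_a, map_b)]


# Toy teacher features for four locations in two hypothetical branches.
teacher_2d = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
teacher_3d = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0], [1.0, 3.0]]

# The students reproduce the teacher features on normal locations...
student_2d = [list(f) for f in teacher_2d]
student_3d = [list(f) for f in teacher_3d]
# ...but fail at location 2, which stands in for a defect.
student_2d[2] = [-1.0, -1.0]
student_3d[2] = [0.0, -2.0]

score_map = fuse(
    anomaly_map(teacher_2d, student_2d),
    anomaly_map(teacher_3d, student_3d),
)
# Location with the largest discrepancy, and an assumed sample-level score.
defect_index = max(range(len(score_map)), key=score_map.__getitem__)
sample_score = max(score_map)
```

In this toy setup the students match the teachers everywhere except at one location, so the fused map peaks there. A real system would obtain the features from trained networks rather than hand-written lists.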
| Original language | English |
|---|---|
| Pages (from-to) | 12037-12053 |
| Number of pages | 17 |
| Journal | Visual Computer |
| Volume | 41 |
| Issue number | 14 |
| DOIs | |
| State | Published - Nov 2025 |
Keywords
- Anomaly Detection
- MVTec 3D-AD
- Multimodal
- Point Cloud
- Reverse Distillation
Fingerprint
Dive into the research topics of 'Enhancing Multimodal Anomaly Detection via Asymmetric Dual-Branch Reverse Distillation'. Together they form a unique fingerprint.