Abstract
Video deblurring is a fundamental problem in low-level vision, and many methods have employed designs based on CNNs and transformers. Traditional CNNs often require deeper architectures to achieve a larger receptive field, which may not be optimal for spatially non-uniform and intense motion blurs. While transformers offer a large receptive field, the quadratic complexity of their attention designs typically imposes a significant computational burden. To address these issues, we present an Attentive Large Kernel Network with Mixture of Experts (ALK-MoE). In ALK-MoE, an attentive large kernel backbone network is proposed. On one hand, it inherently extends the network's receptive field through its large kernel design. On the other hand, it avoids the quadratic complexity of attention through a carefully designed attention mechanism, while retaining the ability to capture long-range dependencies. Furthermore, to align inter-frame features more precisely and robustly using optical flow, and thereby better exploit clear frames, a mixture of experts model is proposed, in which optical flow updates from different experts are integrated in a residual manner. Our ablation experiments and experiments on multiple datasets indicate that ALK-MoE achieves comparable or superior performance to transformer-based methods at lower complexity.
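The large-kernel idea in the abstract can be illustrated with a small numerical sketch. This is a hypothetical analogy, not the paper's implementation: it shows, in 1-D with NumPy, that a small dense kernel followed by a small dilated kernel composes into one large effective kernel, enlarging the receptive field with fewer weights and without quadratic-cost attention. All names and sizes below are illustrative assumptions.

```python
import numpy as np

# Hypothetical 1-D sketch of large-kernel decomposition (not the paper's code):
# a 3-tap dense kernel followed by a 3-tap kernel with dilation 3 covers the
# same receptive field as one 9-tap kernel, using 6 weights instead of 9.

rng = np.random.default_rng(0)
x = rng.standard_normal(64)      # 1-D signal standing in for a feature row
small = rng.standard_normal(3)   # dense 3-tap kernel
dil = rng.standard_normal(3)     # 3-tap kernel to be applied with dilation 3

# Dilate by zero insertion: [a, b, c] -> [a, 0, 0, b, 0, 0, c] (length 7)
dilated = np.zeros(7)
dilated[::3] = dil

# Applying the two small convolutions in sequence ...
seq = np.convolve(np.convolve(x, small), dilated)

# ... equals one convolution with the composed kernel (conv is associative)
big = np.convolve(small, dilated)  # composed kernel, length 3 + 7 - 1 = 9
direct = np.convolve(x, big)

assert np.allclose(seq, direct)
print(big.size, small.size + dil.size)  # prints "9 6": receptive field 9, 6 weights
```

The same arithmetic scales to 2-D depthwise convolutions, which is how large-kernel designs keep parameter counts and FLOPs modest while expanding the receptive field.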
| Original language | English |
|---|---|
| Pages (from-to) | 5575-5588 |
| Number of pages | 14 |
| Journal | IEEE Transactions on Circuits and Systems for Video Technology |
| Volume | 35 |
| Issue number | 6 |
| DOIs | |
| State | Published - 2025 |
| Externally published | Yes |
Keywords
- Video deblurring
- large kernel attention
- long-range dependencies
- mixture of experts model