Abstract
Deep learning methods, especially Convolutional Neural Networks (CNNs) and Transformers, have achieved great success in high-resolution remote sensing analysis. However, CNNs struggle to model long-range dependencies because of their fixed receptive fields, and Transformers suffer from computational complexity that grows quadratically with image resolution. The RWKV model achieves breakthroughs in natural language processing (NLP) through its linear-complexity sequence modeling; however, its one-dimensional scanning mechanism makes it anisotropic when applied to vision tasks. To address these challenges, we adapt the RWKV architecture to high-resolution remote sensing and propose the Remote Sensing RWKV (RSRWKV) model, which incorporates a Linear-Complexity 2D Attention Mechanism. Specifically, RSRWKV employs a novel 2D-WKV scanning mechanism that bridges sequential processing with two-dimensional spatial reasoning while maintaining linear computational complexity. This design facilitates the aggregation of isotropic context across multiple spatial directions. In addition, the MVC-Shift module optimizes multiscale receptive field coverage, and the Efficient Channel Attention (ECA) module improves cross-channel feature interaction and semantic saliency modeling. Experimental evaluations on the NWPU RESISC45, VHR-10 v2, SSDD and GLHWater datasets demonstrate that RSRWKV surpasses CNN and Transformer baselines in classification, detection and segmentation tasks, establishing a scalable framework for high-resolution remote sensing analysis. Code is available at https://github.com/Ling-yunchi/RSRWKV
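The general idea behind a linear-complexity, multi-directional scan can be illustrated with a short NumPy sketch. This is not the paper's 2D-WKV operator, whose details are not given in the abstract; it is a simplified, hypothetical stand-in. It assumes a single scalar decay, per-pixel scores `k` and values `v`, and a decayed running weighted average loosely modeled on RWKV-style recurrences. The function names `directional_scan` and `four_way_scan` are illustrative only. Each direction touches every pixel once, so the cost grows linearly with the number of pixels, and averaging four directions reduces the directional bias of any single scan.

```python
import numpy as np


def directional_scan(k, v, decay, axis, reverse):
    """Decayed weighted running average along one image axis.

    k: (H, W) scores, v: (H, W, D) values, decay: positive float.
    Assumes modest score magnitudes (no overflow guard on exp).
    """
    k = np.moveaxis(k, axis, 0)
    v = np.moveaxis(v, axis, 0)
    if reverse:
        k, v = k[::-1], v[::-1]

    num = np.zeros(v.shape[1:])   # running weighted sum of values
    den = np.zeros(k.shape[1:])   # running sum of weights
    out = np.empty(v.shape, dtype=float)
    fade = np.exp(-decay)         # how quickly older positions are forgotten

    for t in range(k.shape[0]):   # one pass: linear in the scan length
        w = np.exp(k[t])
        num = fade * num + w[:, None] * v[t]
        den = fade * den + w
        out[t] = num / den[:, None]

    if reverse:
        out = out[::-1]
    return np.moveaxis(out, 0, axis)


def four_way_scan(k, v, decay):
    """Average of four scans: down, up, right, left."""
    scans = [
        directional_scan(k, v, decay, axis, reverse)
        for axis in (0, 1)
        for reverse in (False, True)
    ]
    return sum(scans) / len(scans)


rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 5))
values = rng.normal(size=(4, 5, 3))
mixed = four_way_scan(scores, values, decay=0.5)
print(mixed.shape)
```

In this toy version each pixel only aggregates context from its own row and column; a full 2D mechanism such as the one the abstract describes would need to propagate information across the whole image.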
| Original language | English |
|---|---|
| Pages (from-to) | 4078-4090 |
| Number of pages | 13 |
| Journal | IEEE Transactions on Circuits and Systems for Video Technology |
| Volume | 36 |
| Issue number | 4 |
| DOIs | |
| State | Published - 2026 |
| Externally published | Yes |
Keywords
- Computer vision
- RWKV
- attention
- machine learning
- remote sensing
Fingerprint
Dive into the research topics of 'RSRWKV: A Linear-Complexity 2D Attention Mechanism for Efficient Remote Sensing Vision Task'. Together they form a unique fingerprint.