Abstract
Satellite video scene classification (SVSC) is a critical task for dynamic Earth observation. However, it remains challenging because of the distinct spatiotemporal characteristics of satellite videos and the scarcity of annotated data, both of which set SVSC apart from general video classification. Although vision transformers (ViTs) have shown strong performance in general video classification, applying them directly to SVSC often leads to suboptimal performance and overfitting. To address these challenges, we propose PESAT, a novel parameter-efficient spatiotemporal adapter tuning framework tailored to SVSC tasks. PESAT adapts pretrained ViTs to SVSC by keeping the backbone model largely frozen and fine-tuning only a small number of strategically inserted adapter modules. Our framework incorporates three key innovations: an efficient temporal attention modeling (TAM) mechanism that reuses pretrained self-attention weights for temporal feature extraction without adding new parameters; a sensitivity-guided adapter insertion strategy that identifies the locations within the ViT where adapters have the greatest impact; and a hybrid gated adapter (HGA) module, which combines depthwise convolution with a dynamic gating mechanism to capture complex spatiotemporal contexts specific to satellite video data. Experimental results demonstrate the superior performance of the proposed method.
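The idea of reusing self-attention weights for temporal modeling can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the single-head attention, the tensor layout `(T, N, D)`, the random stand-in weights, and all function names are assumptions made for illustration only. It shows one plausible reading of the abstract, in which the same query, key, and value projections are applied once over the patches of each frame (spatial) and once over the frames at each patch position (temporal), so the temporal branch introduces no new parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    # Single-head scaled dot-product attention over the second-to-last axis.
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def spatial_attention(video, w_q, w_k, w_v):
    # video has shape (T, N, D): attend over the N patches within each frame.
    return self_attention(video, w_q, w_k, w_v)

def temporal_attention(video, w_q, w_k, w_v):
    # Transpose to (N, T, D) so attention runs over the T frames at each
    # patch position, using the SAME weights as the spatial branch.
    per_patch = np.swapaxes(video, 0, 1)
    out = self_attention(per_patch, w_q, w_k, w_v)
    return np.swapaxes(out, 0, 1)

# Hypothetical sizes: 4 frames, 9 patches per frame, 16-dimensional tokens.
rng = np.random.default_rng(0)
T, N, D = 4, 9, 16
video = rng.standard_normal((T, N, D))

# Random stand-ins for the frozen, pretrained projection matrices.
w_q, w_k, w_v = (rng.standard_normal((D, D)) * 0.1 for _ in range(3))

spatial_out = spatial_attention(video, w_q, w_k, w_v)
temporal_out = temporal_attention(video, w_q, w_k, w_v)
print(spatial_out.shape, temporal_out.shape)
```

In this sketch the only difference between the two branches is which axis is treated as the sequence; how PESAT actually combines the branches, and where its adapters sit, is not specified in the abstract.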
| Field | Value |
|---|---|
| Original language | English |
| Article number | 5644912 |
| Journal | IEEE Transactions on Geoscience and Remote Sensing |
| Volume | 63 |
| DOIs | |
| State | Published - 2025 |
| Externally published | Yes |
Keywords
- Parameter-efficient fine-tuning (PEFT)
- satellite video
- scene classification
- video classification
Title
PESAT: A Parameter-Efficient Spatiotemporal Adapter Tuning Framework for Satellite Video Scene Classification