A Multiscale Attention Network for Remote Sensing Scene Images Classification

  • Guokai Zhang
  • Weizhe Xu
  • Wei Zhao
  • Chenxi Huang
  • Eddie Ng Yk
  • Yongyong Chen*
  • Jian Su*
  • *Corresponding author for this work
  • University of Shanghai for Science and Technology
  • University of Manchester
  • Tongji University
  • Xiamen University
  • Nanyang Technological University
  • School of Computer Science and Technology, Harbin Institute of Technology
  • Nanjing University of Information Science & Technology

Research output: Contribution to journal, Article, peer-review

Abstract

Remote sensing scene image classification is of great value to both civil and military fields. Deep learning models, especially convolutional neural networks (CNNs), have achieved great success in this task. However, they face two challenges. First, the objects in each category usually differ in size, but a conventional CNN extracts features with a fixed convolutional extractor, which can prevent it from learning multiscale features. Second, some image regions may not be useful during feature learning, so guiding the network to select and focus on the most relevant regions is vital for remote sensing scene image classification. To address these two challenges, we propose a multiscale attention network (MSA-Network), which integrates a multiscale (MS) module and a channel and position attention (CPA) module to boost the performance of remote sensing scene classification. The proposed MS module learns multiscale features by applying sliding windows of various sizes to layers of different depths and receptive fields. The CPA module is composed of two parts: a channel attention (CA) module and a position attention (PA) module. The CA module learns global attention features at the channel level, and the PA module extracts local attention features at the pixel level. By fusing these two kinds of attention features, the network tends to focus automatically on the more critical and salient regions. Extensive experiments on the UC Merced, AID, and NWPU-RESISC45 datasets demonstrate that the proposed MSA-Network outperforms several state-of-the-art methods.
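
The abstract does not give the internals of the CA and PA modules, so the following is only a minimal illustrative sketch of the general idea of channel-level and pixel-level gating, not the MSA-Network itself. It makes several assumptions: the gates are computed from plain averages passed through a sigmoid (no learned weights), the two attention outputs are fused by element-wise addition, and the function names `channel_attention`, `position_attention`, and `fuse` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Global, channel-level gate (assumed form).

    fmap is a list of C channels, each an H x W list of lists.
    Each channel is squeezed to its global mean, and the sigmoid of
    that mean rescales every value in the channel.
    """
    gated = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        weight = sigmoid(sum(values) / len(values))  # one scalar per channel
        gated.append([[v * weight for v in row] for row in channel])
    return gated


def position_attention(fmap):
    """Local, pixel-level gate (assumed form).

    The weight at each spatial location is the sigmoid of the mean
    across channels at that location; it rescales every channel there.
    """
    n_channels = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    weights = [
        [
            sigmoid(sum(fmap[c][i][j] for c in range(n_channels)) / n_channels)
            for j in range(width)
        ]
        for i in range(height)
    ]
    return [
        [[fmap[c][i][j] * weights[i][j] for j in range(width)] for i in range(height)]
        for c in range(n_channels)
    ]


def fuse(fmap):
    """Fuse both attention outputs by element-wise sum (assumed fusion rule)."""
    ca = channel_attention(fmap)
    pa = position_attention(fmap)
    return [
        [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(ca, pa)
    ]
```

Calling `fuse` on a small feature map such as `[[[0.0, 0.0], [0.0, 0.0]], [[2.0, 0.0], [0.0, 0.0]]]` returns a map of the same shape in which the single active location is scaled by both its channel gate and its position gate. In a trained network the gates would come from learned layers rather than fixed averages.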

Original language: English
Pages (from-to): 9530-9545
Number of pages: 16
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Volume: 14
State: Published - 2021
Externally published: Yes

Keywords

  • Remote sensing scene
  • attention
  • feature fusion
  • multi-scale
