
POLAFORMER: POLARITY-AWARE LINEAR ATTENTION FOR VISION TRANSFORMERS

  • Weikang Meng
  • Yadan Luo
  • Xin Li
  • Dongmei Jiang
  • Zheng Zhang*
  • *Corresponding author for this work
  • Harbin Institute of Technology Shenzhen
  • Pengcheng Laboratory
  • University of Queensland

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Linear attention has emerged as a promising alternative to softmax-based attention, leveraging kernelized feature maps to reduce complexity from quadratic to linear in sequence length. However, the non-negative constraint on feature maps and the relaxed exponential function used in approximation lead to significant information loss compared to the original query-key dot products, resulting in less discriminative attention maps with higher entropy. To address the missing interactions driven by negative values in query-key pairs, we propose a polarity-aware linear attention mechanism that explicitly models both same-signed and opposite-signed query-key interactions, ensuring comprehensive coverage of relational information. Furthermore, to restore the spiky properties of attention maps, we provide a theoretical analysis proving the existence of a class of element-wise functions (with positive first and second derivatives) that can reduce entropy in the attention distribution. For simplicity, and recognizing the distinct contributions of each dimension, we employ a learnable power function for rescaling, allowing strong and weak attention signals to be effectively separated. Extensive experiments demonstrate that the proposed PolaFormer improves performance by up to 4.6% on various vision tasks, enhancing both expressiveness and efficiency. Code is available at https://github.com/ZacharyMeng/PolaFormer.
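The mechanism summarized in the abstract can be made concrete with a short sketch. The PyTorch module below is a minimal illustration only: the ReLU-based polarity split (q = q+ − q−, k = k+ − k−), the learnable per-dimension exponent used as the power rescaling, and the signed recombination of same- and opposite-signed terms are all assumptions made for exposition, not the authors' implementation (see the linked repository for the reference code).

# Minimal sketch of polarity-aware linear attention, based only on the abstract.
# Shapes, the ReLU polarity split, and the recombination rule are assumptions.
import torch
import torch.nn as nn

class PolarityAwareLinearAttention(nn.Module):
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.eps = eps
        # Learnable per-dimension exponent for the power rescaling that
        # sharpens ("spikes") the attention distribution.
        self.alpha = nn.Parameter(torch.ones(dim))

    def feature_map(self, x):
        # Non-negative kernel feature map with a learnable power rescaling.
        return torch.relu(x) ** self.alpha.clamp(min=1.0)

    def forward(self, q, k, v):
        # q, k, v: (batch, seq_len, dim)
        # Split queries/keys into positive and negative parts so that both
        # same-signed and opposite-signed interactions are represented.
        q_pos, q_neg = self.feature_map(q), self.feature_map(-q)
        k_pos, k_neg = self.feature_map(k), self.feature_map(-k)

        def linear_attn(qf, kf, val):
            # Linear-complexity attention: aggregate k^T v once, then apply q.
            kv = torch.einsum('bnd,bne->bde', kf, val)
            z = 1.0 / (torch.einsum('bnd,bd->bn', qf, kf.sum(dim=1)) + self.eps)
            return torch.einsum('bnd,bde,bn->bne', qf, kv, z)

        # Same-signed terms (q+k+, q-k-) and opposite-signed terms (q+k-, q-k+);
        # the opposite-signed terms are what plain non-negative feature maps drop.
        same = linear_attn(q_pos, k_pos, v) + linear_attn(q_neg, k_neg, v)
        opposite = linear_attn(q_pos, k_neg, v) + linear_attn(q_neg, k_pos, v)
        # Simple signed recombination, mirroring q·k = q+k+ + q-k- - q+k- - q-k+;
        # an assumption here, not the paper's mixing scheme.
        return same - opposite

# Shape-only usage example (hypothetical sizes):
# attn = PolarityAwareLinearAttention(dim=64)
# q = k = v = torch.randn(2, 196, 64)   # 2 images, 196 patch tokens, 64 channels
# out = attn(q, k, v)                   # (2, 196, 64), cost linear in the 196 tokens

Because the exponent is clamped to at least one, the applied power function has positive first and second derivatives, which is the property the abstract relies on to reduce entropy and separate strong from weak attention signals.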

Original language: English
Title of host publication: 13th International Conference on Learning Representations, ICLR 2025
Publisher: International Conference on Learning Representations, ICLR
Pages: 52032-52049
Number of pages: 18
ISBN (Electronic): 9798331320850
State: Published - 2025
Externally published: Yes
Event: 13th International Conference on Learning Representations, ICLR 2025 - Singapore, Singapore
Duration: 24 Apr 2025 - 28 Apr 2025

Publication series

Name: 13th International Conference on Learning Representations, ICLR 2025

Conference

Conference: 13th International Conference on Learning Representations, ICLR 2025
Country/Territory: Singapore
City: Singapore
Period: 24/04/25 - 28/04/25
