Multi-Label Feature Selection under Coverage Imbalance and Feature Redundancy

Luhan Liu, Xinyi He, Hanlin Pan, Yonghao Li*, Wanfu Gao, Jie Wen, Weiping Ding

*Corresponding author for this work

  • College of Computer Science and Technology
  • Ministry of Education of the People's Republic of China
  • Southwestern University of Finance and Economics
  • Harbin Institute of Technology Shenzhen
  • Nantong University
  • City University of Macau

Research output: Contribution to journal › Article › peer-review

Abstract

Multi-label feature selection plays a critical role in data management and analysis by reducing feature dimensionality while preserving discriminative capability. However, real-world multi-label datasets commonly exhibit label coverage imbalance, causing feature evaluation to be dominated by labels with high coverage. Moreover, feature redundancy is typically estimated using averaged dependency measures, which underestimate dominant redundant relationships under heterogeneous information scales. To address these challenges, we propose a multi-label feature selection method, termed Complementary and Redundancy-Aware Feature Selection for Imbalanced Coverage (CIRFS). CIRFS introduces a coverage-aware label weighting strategy that explicitly models label coverage and normalized label frequency to dynamically mitigate well-covered label dominance. In addition, it adopts a maximum redundancy ratio criterion to characterize feature redundancy from a worst-case information perspective, enabling accurate identification of dominant redundant relationships. Furthermore, mutual information (MI) and stabilized conditional mutual information (CMI) are jointly integrated to capture complementary aspects of feature-label information that cannot be fully characterized by either measure alone. Experiments on 14 real-world multi-label datasets demonstrate that CIRFS outperforms nine representative feature selection methods across four evaluation metrics.
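
The abstract gives no formulas, so the following is only a minimal sketch of two ideas it names: weighting labels so that well-covered ones do not dominate, and scoring redundancy by its maximum rather than its average. The function names, the inverse-frequency weighting, and the ratio I(f; s) / H(f) are all illustrative assumptions, not the CIRFS definitions.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (bits) of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_info(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def inverse_frequency_weights(label_columns):
    """Hypothetical weighting: rarer labels get larger weights (sums to 1)."""
    raw = [1.0 / max(sum(col), 1) for col in label_columns]
    total = sum(raw)
    return [r / total for r in raw]


def redundancy_ratios(candidate, selected):
    """Assumed ratio I(candidate; s) / H(candidate) per selected feature s."""
    h = entropy(candidate)
    if h == 0:
        return [0.0] * len(selected)
    return [mutual_info(candidate, s) / h for s in selected]


# Toy data: one selected feature duplicates the candidate, one is independent.
candidate = [0, 1, 0, 1, 0, 1, 0, 1]
selected = [
    [0, 1, 0, 1, 0, 1, 0, 1],  # exact copy of the candidate
    [0, 0, 1, 1, 0, 0, 1, 1],  # statistically independent of the candidate
]
ratios = redundancy_ratios(candidate, selected)
mean_redundancy = sum(ratios) / len(ratios)  # averaged view
max_redundancy = max(ratios)                 # worst-case view

# Toy label matrix stored column-wise: one frequent label, one rare label.
labels = [
    [1, 1, 1, 1, 1, 1, 0, 0],  # 6 positives
    [0, 0, 0, 0, 0, 0, 1, 1],  # 2 positives
]
weights = inverse_frequency_weights(labels)
```

On this toy data the averaged redundancy is half the maximum, because the independent feature dilutes the exact duplicate. That is the kind of underestimation the abstract attributes to averaged dependency measures.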

Original language: English
Journal: IEEE Transactions on Knowledge and Data Engineering
State: Accepted/In press - 2026
Externally published: Yes

Keywords

  • Multi-label learning
  • feature selection
  • information theory
  • label imbalance
