
MCE: Towards a general framework for handling missing modalities under imbalanced missing rates

  • Binyu Zhao*
  • Wei Zhang
  • Zhaonian Zou
  • *Corresponding author for this work
  • School of Computer Science and Technology, Harbin Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Multi-modal learning has made significant advances across diverse pattern recognition applications. However, handling missing modalities, especially under imbalanced missing rates, remains a major challenge. This imbalance triggers a vicious cycle: modalities with higher missing rates receive fewer updates, leading to inconsistent learning progress and representational degradation that further diminishes their contribution. Existing methods typically focus on global dataset-level balancing, often overlooking critical sample-level variations in modality utility and the underlying issue of degraded feature quality. We propose Modality Capability Enhancement (MCE) to tackle these limitations. MCE includes two synergistic components: i) Learning Capability Enhancement (LCE), which introduces multi-level factors to dynamically balance modality-specific learning progress, and ii) Representation Capability Enhancement (RCE), which improves feature semantics and robustness through subset prediction and cross-modal completion tasks. Comprehensive evaluations on four multi-modal benchmarks show that MCE consistently outperforms state-of-the-art methods under various missing configurations. Our code is available at https://github.com/byzhaoAI/MCE.
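The "fewer updates" effect described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the MCE method or the authors' code: the function name `simulate_update_counts`, the modality names, and the missing rates are all hypothetical, and each modality is assumed to be dropped independently per sample.

```python
import random


def simulate_update_counts(missing_rates, num_samples, seed=0):
    """Count how many samples each modality is present in.

    missing_rates maps a (hypothetical) modality name to the probability
    that the modality is absent from any given sample. A modality can only
    contribute a training update on samples where it is present.
    """
    rng = random.Random(seed)
    counts = {name: 0 for name in missing_rates}
    for _ in range(num_samples):
        for name, rate in missing_rates.items():
            # The modality is present when the draw falls at or above its missing rate.
            if rng.random() >= rate:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Illustrative rates only: "audio" is missing far more often than "image".
    counts = simulate_update_counts({"image": 0.2, "audio": 0.8}, num_samples=10_000)
    for name, count in counts.items():
        print(f"{name}: present in {count} of 10000 samples")
```

Under these assumed rates, the modality with the higher missing rate is present in far fewer samples and so receives far fewer updates, which is the imbalance the abstract identifies as the start of the vicious cycle.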

Original language: English
Article number: 112591
Journal: Pattern Recognition
Volume: 172
State: Published - Apr 2026
Externally published: Yes

Keywords

  • Capability enhancement
  • Imbalanced missing rate
  • Incomplete multi-modal learning
