Abstract
Multi-modal learning has made significant advances across diverse pattern recognition applications. However, handling missing modalities, especially under imbalanced missing rates, remains a major challenge. This imbalance triggers a vicious cycle: modalities with higher missing rates receive fewer updates, leading to inconsistent learning progress and representational degradation that further diminishes their contribution. Existing methods typically focus on global dataset-level balancing, often overlooking critical sample-level variations in modality utility and the underlying issue of degraded feature quality. We propose Modality Capability Enhancement (MCE) to tackle these limitations. MCE includes two synergistic components: i) Learning Capability Enhancement (LCE), which introduces multi-level factors to dynamically balance modality-specific learning progress, and ii) Representation Capability Enhancement (RCE), which improves feature semantics and robustness through subset prediction and cross-modal completion tasks. Comprehensive evaluations on four multi-modal benchmarks show that MCE consistently outperforms state-of-the-art methods under various missing configurations. Our code is available at https://github.com/byzhaoAI/MCE.
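The imbalance described in the abstract can be made concrete with a small simulation. The sketch below is not part of MCE and is not taken from the authors' repository. It only counts how many training samples contain each modality when modalities are dropped independently at different rates, which is the starting point of the "fewer updates" cycle. The function name `count_modality_updates`, the modality names, and the rates are hypothetical choices for illustration.

```python
import random


def count_modality_updates(missing_rates, num_samples, seed=0):
    """Count how many samples contain each modality.

    missing_rates maps a modality name to the probability that the
    modality is absent from any given sample. Modalities are dropped
    independently of one another (an assumption of this sketch).
    """
    rng = random.Random(seed)
    counts = {name: 0 for name in missing_rates}
    for _ in range(num_samples):
        for name, rate in missing_rates.items():
            # rng.random() lies in [0.0, 1.0), so a rate of 0.0 never
            # drops the modality and a rate of 1.0 always drops it.
            if rng.random() >= rate:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical two-modality setup with imbalanced missing rates.
    rates = {"image": 0.2, "text": 0.8}
    print(count_modality_updates(rates, num_samples=10_000))
```

With these illustrative rates, the encoder for the rarely missing modality sees roughly four times as many samples as the encoder for the frequently missing one, so its parameters receive correspondingly more gradient updates unless the training procedure compensates.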
| Original language | English |
|---|---|
| Article number | 112591 |
| Journal | Pattern Recognition |
| Volume | 172 |
| DOIs | |
| State | Published - Apr 2026 |
| Externally published | Yes |
Keywords
- Capability enhancement
- Imbalanced missing rate
- Incomplete multi-modal learning
Title
MCE: Towards a general framework for handling missing modalities under imbalanced missing rates