Consistency Conditioned Memory Augmented Dynamic Diagnosis Model for Medical Visual Question Answering

  • Ting Yu*
  • Binhui Ge
  • Shuhui Wang
  • Yan Yang
  • Qingming Huang
  • Jun Yu
  • *Corresponding author for this work
  • Hangzhou Normal University
  • Nanjing University of Science and Technology
  • CAS - Institute of Computing Technology
  • Hangzhou Dianzi University
  • University of Chinese Academy of Sciences
  • Harbin Institute of Technology Shenzhen

Research output: Contribution to journal › Article › peer-review

Abstract

Medical Visual Question Answering (Med-VQA) holds immense promise as an invaluable medical assistance aid, offering timely diagnostic outcomes based on medical images and accompanying questions, thereby supporting medical professionals in making accurate clinical decisions. However, Med-VQA is still in its infancy, with existing solutions falling short in imitating human diagnostic processes and ensuring result consistency. To address these challenges, we propose a Consistency Conditioned Memory augmented Dynamic diagnosis model (CoCoMeD), incorporating two core components: a dynamic memory diagnosis engine and a consistency-conditioned enforcer. The dynamic memory diagnosis engine enables intricate diagnostic interactions by retaining vital visual cues from medical images and iteratively updating pertinent memories. This dynamic reasoning capability mirrors the cognitive processes observed in skilled medical diagnosticians, thus effectively enhancing the model's ability to reason over diverse medical visual facts and patient-specific questions. Moreover, to strengthen diagnostic coherence, the consistency-conditioned enforcer imposes coherence constraints linking interrelated questions with identical medical facts, ensuring the credibility and reliability of its diagnostic outcomes. Additionally, we present C-SLAKE, an extended Med-VQA dataset encompassing diverse medical image types, and categorized diagnostic question-answer pairs for consistent Med-VQA evaluation on rich medical sources. Comprehensive experiments on DME and C-SLAKE showcase CoCoMeD's superior performance and potential to advance trustworthy multi-source medical question answering.
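The abstract describes two mechanisms only at a high level: a memory that is iteratively updated from visual cues, and a constraint that ties together answers to related questions. The sketch below is a minimal illustration of that general idea and is not the CoCoMeD architecture. Every name in it (`update_memory`, `consistency_penalty`, the averaging update, the hinge-style penalty) is a hypothetical stand-in chosen for illustration; the actual model presumably uses learned components that the abstract does not specify.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def update_memory(regions, question, steps=3):
    """Refine a memory vector by repeatedly attending over image regions.

    Assumption: the memory starts as the question vector and is updated by
    plain averaging, a stand-in for whatever learned update the paper uses.
    """
    memory = list(question)
    for _ in range(steps):
        # Score every region feature against the current memory state.
        weights = softmax([dot(region, memory) for region in regions])
        # Attention-weighted sum of region features.
        context = [
            sum(w * region[i] for w, region in zip(weights, regions))
            for i in range(len(memory))
        ]
        # Blend the old memory with the newly attended visual context.
        memory = [0.5 * m + 0.5 * c for m, c in zip(memory, context)]
    return memory


def consistency_penalty(p_main, p_sub):
    """Hypothetical hinge penalty for a pair of related questions.

    p_main and p_sub are the probabilities assigned to the correct answers
    of a main question and a related sub-question. The penalty is positive
    only when the main answer is more confident than the related one.
    """
    return max(0.0, p_main - p_sub)


if __name__ == "__main__":
    regions = [[1.0, 0.0], [0.0, 1.0]]  # two toy region features
    question = [1.0, 0.0]               # toy question embedding
    print(update_memory(regions, question))
    print(consistency_penalty(0.9, 0.4))
```

In this toy setup the memory keeps leaning toward the region that matches the question, and the penalty would be added to the usual answer-classification loss so that related questions about the same image are pushed toward agreeing answers.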

Original language: English
Pages (from-to): 1357-1370
Number of pages: 14
Journal: IEEE Journal of Biomedical and Health Informatics
Volume: 29
Issue number: 2
DOIs
State: Published - 2025
Externally published: Yes

Keywords

  • Clinical decisions
  • consistency
  • dynamic memory diagnosis
  • dynamic reasoning
  • medical assistance
  • medical visual question answering
