
Activating Associative Disease-Aware Vision Token Memory for LLM-Based X-Ray Report Generation

  • Xiao Wang
  • Fuling Wang
  • Haowen Wang*
  • Bo Jiang*
  • Chuanfu Li
  • Yaowei Wang
  • Yonghong Tian
  • Jin Tang
  • *Corresponding author for this work
  • Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University
  • Anhui University of Chinese Medicine
  • Peng Cheng Laboratory
  • Harbin Institute of Technology Shenzhen
  • Peking University

Research output: Contribution to journal › Article › peer-review

Abstract

X-ray image-based medical report generation has achieved significant progress in recent years with the help of large language models. However, these models have not fully exploited the effective information in visual image regions, resulting in reports that are linguistically sound but insufficient in describing key diseases. In this paper, we propose a novel associative memory-enhanced X-ray report generation model that mimics the process by which professional doctors write medical reports. It mines both global and local visual information and associates historical report information to better complete the writing of the current report. Specifically, given an X-ray image, we first use a classification model along with its activation maps to mine the visual regions highly associated with diseases and to learn disease query tokens. Then, we employ a visual Hopfield network to establish memory associations for disease-related tokens, and a report Hopfield network to retrieve report memory information. This process facilitates the generation of high-quality reports with a large language model and achieves state-of-the-art performance on multiple benchmark datasets, including IU X-ray, MIMIC-CXR, and CheXpert Plus.
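
The abstract does not say which Hopfield formulation the visual and report memories use. As a minimal sketch of associative retrieval in general, the following assumes the modern (continuous) Hopfield update, in which a query is replaced by a softmax-weighted combination of stored patterns. The function name `hopfield_retrieve`, the `beta` parameter, and the toy memory are illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hopfield_retrieve(query, memory, beta=1.0):
    """One update step of a modern (continuous) Hopfield network.

    query:  list of d floats (here standing in for a disease-related token).
    memory: list of N stored patterns, each a list of d floats.
    beta:   inverse temperature (assumed hyperparameter); larger values
            make retrieval concentrate on the closest stored pattern.
    Returns the retrieved pattern and the association weights.
    """
    # Similarity of the query to every stored pattern (scaled dot product).
    scores = [beta * sum(q * p for q, p in zip(query, pattern))
              for pattern in memory]
    weights = softmax(scores)
    # Retrieved pattern: convex combination of the stored patterns.
    dim = len(query)
    retrieved = [sum(w * pattern[i] for w, pattern in zip(weights, memory))
                 for i in range(dim)]
    return retrieved, weights


# Toy memory of three orthogonal patterns and a noisy query near the first.
memory = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
query = [0.9, 0.1, 0.0]
retrieved, weights = hopfield_retrieve(query, memory, beta=8.0)
```

With a large `beta`, the weights concentrate on the stored pattern closest to the query, so the noisy query is pulled toward a clean memory entry. A smaller `beta` blends several entries instead.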

Original language: English
Pages (from-to): 583-595
Number of pages: 13
Journal: IEEE Transactions on Medical Imaging
Volume: 45
Issue number: 2
State: Published - 2026
Externally published: Yes

Keywords

  • Medical report generation
  • associative memory network
  • context sample retrieval
  • large language model
