
Local and Global Aware Document Image Enhancement with Residual Denoising Diffusion Model

  • Hongrui Tie
  • Heng Li
  • Xiangping Wu*
  • Qingcai Chen
  • *Corresponding author for this work
  • Harbin Institute of Technology Shenzhen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In document image enhancement, high-resolution inputs make full-image processing computationally expensive, so current methods typically crop the degraded image into fixed-size patches. However, approaches that rely solely on cropped patches, or that use only the enhancement result of the global image as a reference, struggle to fully exploit global image information. This limitation often leads to inconsistent enhancement across different regions of the same image. In this paper, we introduce LGA-Doc, a novel two-stage, local-global information-aware generative framework for document image enhancement. Our approach employs a context-aware image feature fusion module that enables feature interaction between local document patches and the global image, allowing deep integration of multi-granularity information. Experimental results show that our method achieves state-of-the-art performance on both the deblurring dataset and the binarization evaluation dataset. Ablation studies further validate the effectiveness of our local-global information-aware module.
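As a rough illustration of the two ideas the abstract combines, the sketch below crops an image into fixed-size patches and lets local tokens attend over global tokens with plain scaled dot-product cross-attention. It is not the LGA-Doc implementation: the function names `crop_patches` and `cross_attention`, the absence of learned projections, and the additive fusion step are all assumptions made for illustration only.

```python
import math


def crop_patches(image, size):
    """Split a 2D list (H x W) into non-overlapping size x size patches.

    Patches are returned in row-major order. Assumes H and W are
    multiples of `size` (a simplification for this sketch).
    """
    h, w = len(image), len(image[0])
    return [
        [row[x:x + size] for row in image[y:y + size]]
        for y in range(0, h, size)
        for x in range(0, w, size)
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(local_tokens, global_tokens):
    """Each local token (query) attends over all global tokens.

    Global tokens serve as both keys and values. There are no learned
    projections; this is bare scaled dot-product attention.
    """
    d = len(local_tokens[0])
    fused = []
    for q in local_tokens:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            for k in global_tokens
        ]
        weights = softmax(scores)
        context = [
            sum(w * g[i] for w, g in zip(weights, global_tokens))
            for i in range(d)
        ]
        # Additive fusion of local token and global context (an assumption).
        fused.append([a + c for a, c in zip(q, context)])
    return fused


# Hypothetical usage: a 4x4 "image" cropped into four 2x2 patches.
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
patches = crop_patches(image, 2)
local_tokens = [[v for row in p for v in row] for p in patches]  # flatten patches
global_tokens = [row[:] for row in image]  # one token per image row
fused_tokens = cross_attention(local_tokens, global_tokens)
```

In this toy setup every patch token receives a weighted summary of the whole image, which is the general mechanism by which cross-attention can expose global context to patch-level processing.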

Original language: English
Title of host publication: ICMR 2025 - Proceedings of the 2025 International Conference on Multimedia Retrieval
Publisher: Association for Computing Machinery, Inc
Pages: 1293-1302
Number of pages: 10
ISBN (Electronic): 9798400718779
DOIs
State: Published - 30 Jun 2025
Externally published: Yes
Event: 2025 International Conference on Multimedia Retrieval, ICMR 2025 - Chicago, United States
Duration: 30 Jun 2025 → 3 Jul 2025

Publication series

Name: ICMR 2025 - Proceedings of the 2025 International Conference on Multimedia Retrieval

Conference

Conference: 2025 International Conference on Multimedia Retrieval, ICMR 2025
Country/Territory: United States
City: Chicago
Period: 30/06/25 → 3/07/25

Keywords

  • cross-attention
  • deep learning
  • diffusion models
  • document binarization
  • document deblurring
  • document enhancement
