TY - GEN
T1 - Local and Global Aware Document Image Enhancement with Residual Denoising Diffusion Model
AU - Tie, Hongrui
AU - Li, Heng
AU - Wu, Xiangping
AU - Chen, Qingcai
N1 - Publisher Copyright: © 2025 ACM.
PY - 2025/6/30
Y1 - 2025/6/30
AB - In document image enhancement, high-resolution inputs make full-image processing computationally expensive, so current methods typically crop the degraded image into patches of a specified size. However, approaches that rely solely on cropped patches, or that merely use the enhancement result of the global image as a reference, struggle to fully exploit global image information. This limitation often leads to inconsistent enhancement across different regions of the same image. In this paper, we introduce LGA-Doc, a novel two-stage, local-global information-aware generative framework for document image enhancement. Our approach employs a context-aware image feature fusion module that enables feature interaction between local document patches and the global image, allowing deep integration of multi-granularity information. Experimental results show that our method achieves state-of-the-art performance on both the deblurring dataset and the binarization evaluation dataset. Ablation studies further validate the effectiveness of our local-global information-aware module.
KW - cross-attention
KW - deep learning
KW - diffusion models
KW - document binarization
KW - document deblurring
KW - document enhancement
UR - https://www.scopus.com/pages/publications/105011589552
U2 - 10.1145/3731715.3733375
DO - 10.1145/3731715.3733375
M3 - Conference contribution
AN - SCOPUS:105011589552
T3 - ICMR 2025 - Proceedings of the 2025 International Conference on Multimedia Retrieval
SP - 1293
EP - 1302
BT - ICMR 2025 - Proceedings of the 2025 International Conference on Multimedia Retrieval
PB - Association for Computing Machinery, Inc
T2 - 2025 International Conference on Multimedia Retrieval, ICMR 2025
Y2 - 30 June 2025 through 3 July 2025
ER -