Abstract
End-to-end scene text spotting methods have garnered significant research attention due to their promising results. However, most existing approaches are not well suited for real-world applications because of their inherently complex pipelines. In this paper, we propose an end-to-end Character Region Excavation Network (CRENet) to streamline the text spotting pipeline. Our contributions are threefold: (i) Pipeline simplification: For the first time, we eliminate the text region retrieval step, allowing characters to be directly spotted from scene images. (ii) ROA layer: We introduce a novel RoI (Region of Interest) feature sampling layer for multi-oriented character region feature sampling, significantly enhancing the recognizer’s performance. (iii) Progressive learning strategy: We propose a progressive learning strategy to gradually bridge the gap between synthetic data and real-world images, addressing the challenge posed by the high cost of character-level annotations required during training. Extensive experiments demonstrate that our proposed method is robust and effective across horizontal, oriented, and curved text, achieving results comparable to state-of-the-art methods on ICDAR 2013, ICDAR 2015, Total-Text and ReCTS.
| Original language | English |
|---|---|
| Article number | 851 |
| Journal | Electronics (Switzerland) |
| Volume | 14 |
| Issue number | 5 |
| State | Published - Mar 2025 |
| Externally published | Yes |
Keywords
- character spotting
- progressive learning strategy
- scene text spotting
- text region retrieval
Article: Character Can Speak Directly: An End-to-End Character Region Excavation Network for Scene Text Spotting