TY - GEN
T1 - Keyword spotting in degraded document using mixed OCR and word shape coding
AU - Xia, Yong
AU - Quan, Guangri
AU - Xu, Yongdong
AU - Sun, Yushan
PY - 2010
Y1 - 2010
N2 - This paper presents a new method for keyword spotting in degraded document images. Two prevalent word indexing techniques, OCR and word shape coding, are compactly combined based on recognition confidence evaluation. The basic procedure is as follows. First, OCR candidates are used for OCR indexing. Second, a new stroke feature and a convex-concave feature of words are adopted for word shape coding. Furthermore, an intelligent indexing scheme based on recognition confidence is introduced, which adapts to image quality. Finally, inexact matching is used for word spotting. A collection from NLM comprising 1553 scanned document images is used to evaluate the method. The results confirm the validity of the method.
AB - This paper presents a new method for keyword spotting in degraded document images. Two prevalent word indexing techniques, OCR and word shape coding, are compactly combined based on recognition confidence evaluation. The basic procedure is as follows. First, OCR candidates are used for OCR indexing. Second, a new stroke feature and a convex-concave feature of words are adopted for word shape coding. Furthermore, an intelligent indexing scheme based on recognition confidence is introduced, which adapts to image quality. Finally, inexact matching is used for word spotting. A collection from NLM comprising 1553 scanned document images is used to evaluate the method. The results confirm the validity of the method.
KW - Degraded imaged document
KW - Keyword spotting
KW - OCR indexing
KW - Word shape coding
UR - https://www.scopus.com/pages/publications/78651276744
U2 - 10.1109/ICICISYS.2010.5658616
DO - 10.1109/ICICISYS.2010.5658616
M3 - Conference contribution
AN - SCOPUS:78651276744
SN - 9781424465835
T3 - Proceedings - 2010 IEEE International Conference on Intelligent Computing and Intelligent Systems, ICIS 2010
SP - 411
EP - 414
BT - Proceedings - 2010 IEEE International Conference on Intelligent Computing and Intelligent Systems, ICIS 2010
T2 - 2010 IEEE International Conference on Intelligent Computing and Intelligent Systems, ICIS 2010
Y2 - 29 October 2010 through 31 October 2010
ER -