TY - GEN
T1 - Discriminative training of MQDF classifier on synthetic Chinese string samples
AU - Chen, Xia
AU - Su, Tong Hua
AU - Zhang, Tian Wen
AU - Li, Yu
PY - 2010
Y1 - 2010
N2 - Reliable recognition of realistic Chinese handwriting is of great interest yet remains challenging. Among many factors, sufficient training samples and an advanced learning method are critical to identifying the underlying symbols of a string image. This paper presents an embedding training of an MQDF classifier with the help of synthetic string samples within the segmentation-recognition integration framework. First, the input string images are over-segmented into primitive segments. Then a separate MQDF classifier, discriminatively retrained on string samples, is used to measure the confidence of each segmentation hypothesis. The optimal path, comprising both segmentation and recognition results, is finally identified using beam search. When only natural string samples are used, the shortage of string samples is a serious problem. To expand the training data, a perturbation model is used to synthesize string samples. Experiments are conducted on the standard subset of the HIT-MW database. Both the embedding training method and the distortion model yield promising results.
AB - Reliable recognition of realistic Chinese handwriting is of great interest yet remains challenging. Among many factors, sufficient training samples and an advanced learning method are critical to identifying the underlying symbols of a string image. This paper presents an embedding training of an MQDF classifier with the help of synthetic string samples within the segmentation-recognition integration framework. First, the input string images are over-segmented into primitive segments. Then a separate MQDF classifier, discriminatively retrained on string samples, is used to measure the confidence of each segmentation hypothesis. The optimal path, comprising both segmentation and recognition results, is finally identified using beam search. When only natural string samples are used, the shortage of string samples is a serious problem. To expand the training data, a perturbation model is used to synthesize string samples. Experiments are conducted on the standard subset of the HIT-MW database. Both the embedding training method and the distortion model yield promising results.
KW - Chinese handwriting recognition
KW - Discriminative learning
KW - String-level training
KW - Synthetic samples
UR - https://www.scopus.com/pages/publications/78651436152
U2 - 10.1109/CCPR.2010.5659250
DO - 10.1109/CCPR.2010.5659250
M3 - Conference contribution
AN - SCOPUS:78651436152
SN - 9781424472109
T3 - 2010 Chinese Conference on Pattern Recognition, CCPR 2010 - Proceedings
SP - 914
EP - 918
BT - 2010 Chinese Conference on Pattern Recognition, CCPR 2010 - Proceedings
T2 - 2010 Chinese Conference on Pattern Recognition, CCPR 2010
Y2 - 21 October 2010 through 23 October 2010
ER -