Skip to main navigation Skip to search Skip to main content

Chinese parsing exploiting characters

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Characters play an important role in the Chinese language, yet computational processing of Chinese has been dominated by word-based approaches, with leaves in syntax trees being words. We investigate Chinese parsing from the character-level, extending the notion of phrase-structure trees by annotating internal structures of words. We demonstrate the importance of character-level information to Chinese processing by building a joint segmentation, part-of-speech (POS) tagging and phrase-structure parsing system that integrates character-structure features. Our joint system significantly outperforms a state-of-the-art word-based baseline on the standard CTB5 test, and gives the best published results for Chinese parsing.

Original languageEnglish
Title of host publicationLong Papers
PublisherAssociation for Computational Linguistics (ACL)
Pages125-134
Number of pages10
ISBN (Print)9781937284503
StatePublished - 2013
Event51st Annual Meeting of the Association for Computational Linguistics, ACL 2013 - Sofia, Bulgaria
Duration: 4 Aug 20139 Aug 2013

Publication series

NameACL 2013 - 51st Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
Volume1

Conference

Conference51st Annual Meeting of the Association for Computational Linguistics, ACL 2013
Country/TerritoryBulgaria
CitySofia
Period4/08/139/08/13

Fingerprint

Dive into the research topics of 'Chinese parsing exploiting characters'. Together they form a unique fingerprint.

Cite this