Discriminate chinese word segmenter with global and context features

  • Harbin Institute of Technology

Research output: Contribution to journal › Conference article › peer-review

Abstract

Chinese word segmentation is the basis for all subsequent natural language processing applications. Corpus-based statistical methods have become the predominant approach. However, training corpora are often insufficient, especially in certain domains. Therefore, we introduce global features and context features in order to achieve almost the same performance with a much smaller corpus. The experimental results show that our approach significantly outperforms the original feature sets on the same training data. Meanwhile, the time required for model training is also reduced. In addition, these features do not depend on the classifier, so our method can easily be adapted to other models.

Original language: English
Pages (from-to): 267-272
Number of pages: 6
Journal: Applied Mechanics and Materials
Volume: 198-199
State: Published - 2012
Event: 2012 International Applied Mechanics, Mechatronics Automation and System Simulation Meeting, AMMASS 2012 - Hangzhou, China
Duration: 24 Jun 2012 - 26 Jun 2012

Keywords

  • Chinese word segmenter
  • Conditional random fields (CRFs)
  • Context features
  • Global features
