Abstract
A Chinese word segmenter is the basis for all subsequent natural language processing applications. Corpus-based statistical methods have become the predominant approach. However, training corpora are insufficient, especially in certain domains. We therefore introduce global features and context features in order to achieve almost the same performance with a much smaller corpus. Experimental results show that our approach significantly outperforms the original feature set on the same training data. Model training time is also reduced. In addition, these features do not depend on the classifier, so our method can easily be adapted to other models.
| Original language | English |
|---|---|
| Pages (from-to) | 267-272 |
| Number of pages | 6 |
| Journal | Applied Mechanics and Materials |
| Volume | 198-199 |
| State | Published - 2012 |
| Event | 2012 International Applied Mechanics, Mechatronics Automation and System Simulation Meeting, AMMASS 2012 - Hangzhou, China. Duration: 24 Jun 2012 → 26 Jun 2012 |
Keywords
- Chinese word segmenter
- Conditional random fields (CRFs)
- Context features
- Global features