Lexicalized second-order-HMM for ambiguity resolution in Chinese segmentation and POS tagging

Research output: Contribution to journal › Article › peer-review

Abstract

The hidden Markov model (HMM) is a principal approach to resolving ambiguities in Chinese segmentation and part-of-speech (POS) tagging. Most previous work on HMM-based Chinese segmentation and POS tagging consults the POS information in the context but does not use lexical information, which is crucial for resolving certain morphological ambiguities. This paper proposes a method that incorporates lexical information and wider context information into the HMM. Model induction and the related smoothing technique are presented in detail. Experiments indicate that the technique improves segmentation and tagging accuracy by nearly 1%.

Original language: English
Pages (from-to): 346-350
Number of pages: 5
Journal: High Technology Letters
Volume: 11
Issue number: 4
State: Published - Dec 2005
Externally published: Yes

Keywords

  • Chinese segmentation
  • Hidden Markov model
  • Part-of-speech tagging
