
Improving feature extraction in named entity recognition based on maximum entropy model

  • School of Computer Science and Technology, Harbin Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper proposes a new method for improving feature extraction in Named Entity Recognition. First, context features and entity features are extracted by their corresponding algorithms: triggers selected by Mutual Information, Information Gain, Average Mutual Information, etc. are adopted to enhance the context features, and rough set theory is used to extract the entity features. Second, a word cluster method is presented to improve the approach to feature expansion, which makes feature selection easier and effectively overcomes the sparse data problem. Finally, all the features are added into the maximum entropy model. Experiments confirm that the method is effective. The method has been used in our word segmenter, which participated in the International SIGHAN-2005 Evaluation and ranked first in the open test on the MSR corpus.
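As a rough illustration of the trigger-selection idea mentioned in the abstract, the sketch below scores context words by pointwise mutual information with an "entity" label. It is not the authors' algorithm: the sample format, the function name `trigger_scores`, and the toy data are all assumptions made for this example, and the paper's exact scoring formulas are not given in this record.

```python
import math
from collections import Counter


def trigger_scores(samples):
    """Score context words by pointwise mutual information with the entity label.

    `samples` is a hypothetical toy format: a list of
    (context_words, is_entity) pairs, one pair per candidate token.
    """
    n = len(samples)
    word_count = Counter()   # number of samples whose context contains the word
    joint_count = Counter()  # number of entity samples whose context contains the word
    entity_count = 0

    for words, is_entity in samples:
        entity_count += int(is_entity)
        for w in set(words):
            word_count[w] += 1
            if is_entity:
                joint_count[w] += 1

    scores = {}
    for w, joint in joint_count.items():
        # PMI(w, entity) = log2( P(w, entity) / (P(w) * P(entity)) )
        scores[w] = math.log2((joint * n) / (word_count[w] * entity_count))
    return scores


# Toy data, invented purely for illustration.
samples = [
    (["professor", "said"], True),
    (["professor", "visited"], True),
    (["the", "said"], False),
    (["the", "visited"], False),
]
scores = trigger_scores(samples)
```

Words with a high score (here `professor`) would be candidate triggers, while words that occur equally often with and without entities (here `said`) score zero. Words never seen next to an entity are left unscored.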

Original language: English
Title of host publication: Proceedings of the 2006 International Conference on Machine Learning and Cybernetics
Pages: 2630-2635
Number of pages: 6
State: Published - 2006
Externally published: Yes
Event: 2006 International Conference on Machine Learning and Cybernetics - Dalian, China
Duration: 13 Aug 2006 to 16 Aug 2006

Publication series

Name: Proceedings of the 2006 International Conference on Machine Learning and Cybernetics
Volume: 2006

Conference

Conference: 2006 International Conference on Machine Learning and Cybernetics
Country/Territory: China
City: Dalian
Period: 13/08/06 to 16/08/06

Keywords

  • Feature extraction
  • Maximum entropy model
  • Named entity recognition
  • Word cluster
