TY - GEN
T1 - Improving feature extraction in named entity recognition based on maximum entropy model
AU - Jiang, Wei
AU - Guan, Yi
AU - Wang, Xiao Long
PY - 2006
Y1 - 2006
N2 - This paper proposes a new method of improving feature extraction for Named Entity Recognition. First, the context features and the entity features are extracted by their corresponding algorithms: triggers extracted by Mutual Information, Information Gain, Average Mutual Information, etc., are adopted to enhance the context features, and rough set theory is used to extract the entity features. Second, a word clustering method is presented to improve feature expansion, which makes feature selection easier and effectively overcomes the sparse data problem. Finally, all the features are added into the maximum entropy model. Experiments confirm that the method is effective. The method has been used in our word segmenter, which participated in the International SIGHAN-2005 Evaluation and ranked first in the open test on the MSR corpus.
AB - This paper proposes a new method of improving feature extraction for Named Entity Recognition. First, the context features and the entity features are extracted by their corresponding algorithms: triggers extracted by Mutual Information, Information Gain, Average Mutual Information, etc., are adopted to enhance the context features, and rough set theory is used to extract the entity features. Second, a word clustering method is presented to improve feature expansion, which makes feature selection easier and effectively overcomes the sparse data problem. Finally, all the features are added into the maximum entropy model. Experiments confirm that the method is effective. The method has been used in our word segmenter, which participated in the International SIGHAN-2005 Evaluation and ranked first in the open test on the MSR corpus.
KW - Feature extraction
KW - Maximum entropy model
KW - Named entity recognition
KW - Word cluster
UR - https://www.scopus.com/pages/publications/33947262217
U2 - 10.1109/ICMLC.2006.258916
DO - 10.1109/ICMLC.2006.258916
M3 - Conference contribution
AN - SCOPUS:33947262217
SN - 1424400619
SN - 9781424400614
T3 - Proceedings of the 2006 International Conference on Machine Learning and Cybernetics
SP - 2630
EP - 2635
BT - Proceedings of the 2006 International Conference on Machine Learning and Cybernetics
T2 - 2006 International Conference on Machine Learning and Cybernetics
Y2 - 13 August 2006 through 16 August 2006
ER -