TY - GEN
T1 - Minimum normalized Google distance for unsupervised multilingual Chinese-English word sense disambiguation
AU - Liu, Pengyuan
AU - Xue, Yongzeng
AU - Li, Shiqi
AU - Liu, Shui
PY - 2010
Y1 - 2010
N2 - This paper introduces normalized Google distance into the study of word sense disambiguation and presents a novel unsupervised method of word sense disambiguation. The normalized Google distance is a theory of similarity between words and phrases, based on information distance and Kolmogorov complexity, that uses the World Wide Web as its database, with page counts obtained from a search engine such as Google. The unsupervised method treats word sense disambiguation as a search for the minimum normalized Google distance between an n-gram and a translation or synonym of the target word, based on the assumption of one sense per n-gram. Our system is tested on the Multilingual Chinese-English Lexical Sample task of SemEval-2007. Experimental results show that our method outperforms the best competing system. Our experiment on the nouns of this dataset also gives promising results.
AB - This paper introduces normalized Google distance into the study of word sense disambiguation and presents a novel unsupervised method of word sense disambiguation. The normalized Google distance is a theory of similarity between words and phrases, based on information distance and Kolmogorov complexity, that uses the World Wide Web as its database, with page counts obtained from a search engine such as Google. The unsupervised method treats word sense disambiguation as a search for the minimum normalized Google distance between an n-gram and a translation or synonym of the target word, based on the assumption of one sense per n-gram. Our system is tested on the Multilingual Chinese-English Lexical Sample task of SemEval-2007. Experimental results show that our method outperforms the best competing system. Our experiment on the nouns of this dataset also gives promising results.
KW - Normalized Google distance
KW - One sense per n-gram
KW - Unsupervised word sense disambiguation
UR - https://www.scopus.com/pages/publications/79952545889
U2 - 10.1109/ICGEC.2010.69
DO - 10.1109/ICGEC.2010.69
M3 - Conference contribution
AN - SCOPUS:79952545889
SN - 9780769542812
T3 - Proceedings - 4th International Conference on Genetic and Evolutionary Computing, ICGEC 2010
SP - 252
EP - 255
BT - Proceedings - 4th International Conference on Genetic and Evolutionary Computing, ICGEC 2010
T2 - 4th International Conference on Genetic and Evolutionary Computing, ICGEC 2010
Y2 - 13 December 2010 through 15 December 2010
ER -