TY - GEN
T1 - A combined measure for text semantic similarity
AU - Li, Hao Di
AU - Chen, Qing Cai
AU - Wang, Xiao Long
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2013
Y1 - 2013
N2 - With the rapid development of artificial intelligence and natural language processing, text similarity calculation has become a core module of many applications, such as semantic disambiguation, information retrieval, automatic question answering, and data mining. Most existing semantic similarity algorithms are based on statistical methods or on rule-based methods that operate on ontology dictionaries and knowledge bases of some kind. Rule-based methods usually use a dictionary, an ontology tree or graph, or the co-occurrence counts of attributes, while statistical methods may or may not use a knowledge base. Because a statistical method that uses a knowledge base incorporates more comprehensive knowledge and can reduce knowledge noise, it usually performs better. Nevertheless, due to the imbalanced distribution of items in a knowledge base, semantic similarity results for low-frequency words are usually poor.
AB - With the rapid development of artificial intelligence and natural language processing, text similarity calculation has become a core module of many applications, such as semantic disambiguation, information retrieval, automatic question answering, and data mining. Most existing semantic similarity algorithms are based on statistical methods or on rule-based methods that operate on ontology dictionaries and knowledge bases of some kind. Rule-based methods usually use a dictionary, an ontology tree or graph, or the co-occurrence counts of attributes, while statistical methods may or may not use a knowledge base. Because a statistical method that uses a knowledge base incorporates more comprehensive knowledge and can reduce knowledge noise, it usually performs better. Nevertheless, due to the imbalanced distribution of items in a knowledge base, semantic similarity results for low-frequency words are usually poor.
KW - Combination of rule and statistical measure
KW - Semantic similarity
KW - Sentence level semantic similarity
UR - https://www.scopus.com/pages/publications/84907271366
U2 - 10.1109/ICMLC.2013.6890900
DO - 10.1109/ICMLC.2013.6890900
M3 - Conference contribution
AN - SCOPUS:84907271366
T3 - Proceedings - International Conference on Machine Learning and Cybernetics
SP - 1869
EP - 1873
BT - Proceedings - International Conference on Machine Learning and Cybernetics
PB - IEEE Computer Society
T2 - 12th International Conference on Machine Learning and Cybernetics, ICMLC 2013
Y2 - 14 July 2013 through 17 July 2013
ER -