TY - GEN
T1 - Context-based entity description rule for entity resolution
AU - Li, Lingli
AU - Li, Jianzhong
AU - Wang, Hongzhi
AU - Gao, Hong
PY - 2011
Y1 - 2011
N2 - In this paper, we consider the entity resolution (ER) problem, which is to identify objects that refer to the same real-world entity. Prior work on ER involves expensive similarity comparisons and clustering approaches. Additionally, the quality of entity resolution may be low due to insufficient information. To address these problems, we present a novel entity resolution framework, context-based entity description (CED), which uses the context information of data objects to help entity resolution. In our framework, each entity is described by a set of CEDs. During entity resolution, objects are compared only with CEDs to determine their corresponding entities. Additionally, we propose efficient algorithms for CED discovery and CED-based entity resolution. We experimentally evaluated our CED-based ER algorithm on real DBLP datasets, and the results show that our algorithm achieves both high precision and high recall and outperforms existing methods.
AB - In this paper, we consider the entity resolution (ER) problem, which is to identify objects that refer to the same real-world entity. Prior work on ER involves expensive similarity comparisons and clustering approaches. Additionally, the quality of entity resolution may be low due to insufficient information. To address these problems, we present a novel entity resolution framework, context-based entity description (CED), which uses the context information of data objects to help entity resolution. In our framework, each entity is described by a set of CEDs. During entity resolution, objects are compared only with CEDs to determine their corresponding entities. Additionally, we propose efficient algorithms for CED discovery and CED-based entity resolution. We experimentally evaluated our CED-based ER algorithm on real DBLP datasets, and the results show that our algorithm achieves both high precision and high recall and outperforms existing methods.
KW - context-based
KW - data cleaning
KW - entity resolution
UR - https://www.scopus.com/pages/publications/83055161604
U2 - 10.1145/2063576.2063825
DO - 10.1145/2063576.2063825
M3 - Conference contribution
AN - SCOPUS:83055161604
SN - 9781450307178
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 1725
EP - 1730
BT - CIKM'11 - Proceedings of the 2011 ACM International Conference on Information and Knowledge Management
T2 - 20th ACM Conference on Information and Knowledge Management, CIKM'11
Y2 - 24 October 2011 through 28 October 2011
ER -