Euclidean-based entity resolution for evolving data

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

As large companies and corporations take on a growing share of data collection, researchers in recent years have proposed a variety of algorithms and theories to address database problems. Although existing solutions are effective in many cases, many problems remain unsolved in database integration. Entity resolution (ER) is a crucial one. ER is used in many applications when big data sets are loaded and updated, and evolving data needs it most. Evolving data sets are now widely used in biology and computer information science, where they contain microscope observations and biological information. Although researchers have proposed various ER methods, their cost is usually too high to be acceptable. We use Euclidean vectors in a high-dimensional space to model the states of different entities in a big data set. We combine this approach with a parallel improved Top-K algorithm to detect the identity of an entity more effectively. Theoretical analysis and experimental results show that the proposed method can perform entity resolution on evolving data effectively and efficiently.
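The general idea of matching entities by Euclidean distance and keeping only the Top-K closest candidates can be illustrated with a minimal sketch. This is not the authors' method: it is a plain, sequential Top-K selection rather than the parallel improved variant the abstract mentions, and the function name `top_k_candidates`, the record identifiers, and the three-dimensional vectors are all hypothetical values chosen for illustration.

```python
import heapq
import math


def top_k_candidates(query, records, k=2):
    """Return the k (distance, record_id) pairs closest to `query`.

    Hypothetical helper: each entity state is assumed to be a
    fixed-length numeric vector, compared by Euclidean distance.
    """
    scored = ((math.dist(query, vec), rid) for rid, vec in records.items())
    # heapq.nsmallest keeps only the k best pairs instead of sorting everything.
    return heapq.nsmallest(k, scored)


# Illustrative data only: three entity states in a 3-dimensional space.
records = {
    "a": (1.0, 0.0, 0.0),
    "b": (0.0, 3.0, 4.0),
    "c": (1.0, 0.1, 0.0),
}

result = top_k_candidates((1.0, 0.0, 0.0), records, k=2)
for distance, record_id in result:
    print(record_id, round(distance, 3))
```

In a sketch like this, the records with the smallest distances are the candidates most likely to refer to the same entity as the query; how a final match is decided, and how the search is parallelised, is left open here.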

Original language: English
Title of host publication: Proceedings - 5th International Conference on Instrumentation and Measurement, Computer, Communication, and Control, IMCCC 2015
Editors: Jun-Bao Li
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1547-1552
Number of pages: 6
ISBN (Electronic): 9781467377232
State: Published - 11 Feb 2016
Event: 5th International Conference on Instrumentation and Measurement, Computer, Communication, and Control, IMCCC 2015 - Qinhuangdao, China
Duration: 18 Sep 2015 to 20 Sep 2015

Publication series

Name: Proceedings - 5th International Conference on Instrumentation and Measurement, Computer, Communication, and Control, IMCCC 2015

Conference

Conference: 5th International Conference on Instrumentation and Measurement, Computer, Communication, and Control, IMCCC 2015
Country/Territory: China
City: Qinhuangdao
Period: 18/09/15 to 20/09/15

Keywords

  • Entity resolution
  • Euclidean vector
  • Top-K
