
Marrying k-means with evidence accumulation in clustering analysis

  • School of Computer Science and Technology, Harbin Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Text clustering is becoming increasingly important to text mining and to the development of commercial applications. Previous research has mainly focused on applying a single clustering run to documents. Compared with cluster ensembles, partitions obtained from a single clustering run are less convincing in terms of accuracy and consistency. In this paper, we propose an approach based on evidence accumulation clustering (EAC) with k-means for text clustering problems. Our goal is to obtain a consistent, stable, and credible clustering scheme. First, we run the k-means algorithm multiple times while varying the number of clusters within an optimum range. Then, we construct a co-association matrix by integrating all the derived partitions. Finally, we obtain consistent clusters by applying a hierarchical clustering algorithm with the single-link criterion to the co-association matrix. This process is equivalent to finding a minimum spanning tree (MST) of the complete graph defined by the co-association matrix. The algorithm was tested on four text data sets, and the experimental results showed that our method improves the accuracy of the final clustering.
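The three steps described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`kmeans`, `co_association`, `single_link_cut`), the toy 2-D points standing in for document vectors, the range of k, the number of runs, and the 0.5 cut threshold are all hypothetical choices made for the example. Cutting a single-link dendrogram at a fixed level yields the connected components of the graph whose edges have co-association at or above that level, which is why a union-find pass is enough here.

```python
import random


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def co_association(points, k_values, runs_per_k, seed=0):
    """Fraction of k-means runs in which each pair of points shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    total = 0
    for k in k_values:
        for _ in range(runs_per_k):
            labels = kmeans(points, k, rng)
            total += 1
            for i in range(n):
                for j in range(n):
                    if labels[i] == labels[j]:
                        counts[i][j] += 1
    return [[c / total for c in row] for row in counts]


def single_link_cut(coassoc, threshold):
    """Single-link clusters: link i and j whenever coassoc[i][j] >= threshold."""
    n = len(coassoc)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if coassoc[i][j] >= threshold:
                parent[find(i)] = find(j)
    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Toy data (assumed): two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (5.1, 5.1)]
coassoc = co_association(points, k_values=[2, 3], runs_per_k=10)
labels = single_link_cut(coassoc, threshold=0.5)
print(labels)
```

For real text data, the points would be document vectors (for example TF-IDF), and the cut level would be chosen from the dendrogram rather than fixed in advance.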

Original language: English
Title of host publication: 2018 IEEE 4th International Conference on Computer and Communications, ICCC 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2050-2056
Number of pages: 7
ISBN (Electronic): 9781538683392
State: Published - Dec 2018
Externally published: Yes
Event: 4th IEEE International Conference on Computer and Communications, ICCC 2018 - Chengdu, China
Duration: 7 Dec 2018 to 10 Dec 2018

Publication series

Name: 2018 IEEE 4th International Conference on Computer and Communications, ICCC 2018

Conference

Conference: 4th IEEE International Conference on Computer and Communications, ICCC 2018
Country/Territory: China
City: Chengdu
Period: 7/12/18 to 10/12/18

Keywords

  • Evidence accumulation clustering
  • Hierarchical clustering
  • K-means
