
Neighborhood density method for selecting initial cluster centers in K-means clustering

  • Yunming Ye*
  • Joshua Zhexue Huang
  • Xiaojun Chen
  • Shuigeng Zhou
  • Graham Williams
  • Xiaofei Xu
  • *Corresponding author for this work
  • Harbin Institute of Technology Shenzhen
  • The University of Hong Kong
  • Fudan University
  • Australian Taxation Office, Canberra

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

This paper presents a new method for selecting initial cluster centers in k-means clustering. The method first identifies the high-density neighborhoods in the data and then selects the central points of those neighborhoods as initial centers. The recently published Neighborhood-Based Clustering (NBC) algorithm is used to search for high-density neighborhoods. The new clustering algorithm, NK-means, integrates NBC into the k-means clustering process to improve the performance of the k-means algorithm while preserving its efficiency. NBC is enhanced with a new cell-based neighborhood search method to accelerate the search for initial cluster centers. A merging method is employed to filter out insignificant initial centers so that too many clusters are not generated. Experimental results on synthetic data sets show significant improvements in clustering accuracy in comparison with the random k-means and the refinement k-means algorithms.
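
The general idea of density-based initialization can be illustrated with a minimal sketch. This is not the authors' NK-means: it does not implement NBC or the cell-based neighborhood search, and it substitutes a plain k-nearest-neighbor density score for the neighborhood density used in the paper. All function names and parameters (`knn_density`, `density_initial_centers`, `n_neighbors`, `min_separation`) are hypothetical, and the `min_separation` rule is only a crude stand-in for the merging step described in the abstract.

```python
import math
import random


def knn_density(points, n_neighbors):
    """Score each point by the inverse of the mean distance to its nearest neighbors.

    Assumption: a simple k-NN density proxy, not the NBC density measure.
    Requires at least two points.
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:n_neighbors]
        scores.append(1.0 / (sum(nearest) / len(nearest) + 1e-12))
    return scores


def density_initial_centers(points, k, n_neighbors=5, min_separation=1.0):
    """Pick up to k dense points as initial centers.

    Candidates are visited from densest to sparsest; a candidate closer than
    min_separation to an already chosen center is skipped (hypothetical
    stand-in for merging insignificant centers).
    """
    scores = knn_density(points, n_neighbors)
    order = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    centers = []
    for i in order:
        if all(math.dist(points[i], c) >= min_separation for c in centers):
            centers.append(points[i])
        if len(centers) == k:
            break
    return centers


def kmeans(points, centers, max_iter=100):
    """Standard Lloyd iterations starting from the given initial centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its members; keep empty clusters in place.
        new_centers = [
            tuple(sum(x) / len(members) for x in zip(*members)) if members else centers[c]
            for c, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


# Illustrative synthetic data: three well-separated 2-D Gaussian blobs.
rng = random.Random(0)
blob_means = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
data = [(rng.gauss(mx, 0.5), rng.gauss(my, 0.5)) for mx, my in blob_means for _ in range(30)]

initial = density_initial_centers(data, k=3, n_neighbors=5, min_separation=5.0)
final = kmeans(data, initial)
```

On data like this, each initial center tends to fall inside a different blob, so the k-means iterations start near a good solution rather than from random points. The brute-force density computation here is quadratic in the number of points; the abstract's cell-based search is aimed at reducing that cost, which this sketch does not attempt.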

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 10th Pacific-Asia Conference, PAKDD 2006, Proceedings
Publisher: Springer Verlag
Pages: 189-198
Number of pages: 10
ISBN (Print): 3540332065, 9783540332060
State: Published - 2006
Externally published: Yes
Event: 10th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2006 - Singapore, Singapore
Duration: 9 Apr 2006 to 12 Apr 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3918 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 10th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, PAKDD 2006
Country/Territory: Singapore
City: Singapore
Period: 9/04/06 to 12/04/06

Keywords

  • Clustering
  • Initial Cluster Center Selection
  • K-means
  • Neighborhood-Based Clustering
