
G-ANMI: A mutual information based genetic clustering algorithm for categorical data

  • Shengchun Deng
  • Zengyou He*
  • Xiaofei Xu
  • *Corresponding author for this work
  • Harbin Institute of Technology
  • Hong Kong University of Science and Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Identifying meaningful clusters in categorical data is a key problem in data mining. Recently, Average Normalized Mutual Information (ANMI) has been used to define categorical data clustering as an optimization problem. To find a globally optimal or near-optimal partition as determined by ANMI, this paper proposes a genetic clustering algorithm (G-ANMI). Experimental results show that G-ANMI is superior or comparable to existing algorithms for clustering categorical data in terms of clustering accuracy.
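As a rough illustration of the objective the abstract refers to, the sketch below scores a candidate partition by ANMI. It assumes, in line with the cluster-ensemble view listed in the keywords, that each categorical attribute is treated as one partition of the objects and that NMI is mutual information divided by the geometric mean of the two entropies; the paper's exact normalization may differ. The function names (`entropy`, `mutual_information`, `nmi`, `anmi`) are hypothetical, and the genetic search itself (population, crossover, mutation) is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same objects."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (n_ab / n) * math.log(n_ab * n / (count_a[x] * count_b[y]))
        for (x, y), n_ab in joint.items()
    )


def nmi(a, b):
    """NMI with geometric-mean normalization (an assumption of this sketch)."""
    h_a, h_b = entropy(a), entropy(b)
    if h_a == 0.0 or h_b == 0.0:
        # A single-block partition carries no information; define NMI as 0.
        return 0.0
    return mutual_information(a, b) / math.sqrt(h_a * h_b)


def anmi(candidate, rows):
    """Average NMI between a candidate partition and each attribute column.

    `rows` is a list of equal-length tuples of categorical values; each
    column is treated as one partition of the objects.
    """
    columns = list(zip(*rows))
    return sum(nmi(candidate, col) for col in columns) / len(columns)


# Hypothetical toy data: four objects described by two categorical attributes.
rows = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
aligned = anmi([0, 0, 1, 1], rows)    # partition that matches both attributes
unrelated = anmi([0, 1, 0, 1], rows)  # partition independent of both attributes
```

A genetic algorithm in this setting would use a score like `anmi` as the fitness of each candidate partition, so partitions that agree with the attributes, such as `aligned` here, are favored over ones like `unrelated`.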

Original language: English
Pages (from-to): 144-149
Number of pages: 6
Journal: Knowledge-Based Systems
Volume: 23
Issue number: 2
State: Published - Mar 2010

Keywords

  • Categorical data
  • Cluster ensemble
  • Clustering
  • Data mining
  • Genetic algorithm
  • Mutual information
