Clique percolation method for finding naturally cohesive and overlapping document clusters

  • Wei Gao*
  • Kam Fai Wong
  • Yunqing Xia
  • Ruifeng Xu

*Corresponding author for this work

Chinese University of Hong Kong

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Techniques for finding document clusters mostly depend on models that impose strong explicit and/or implicit a priori assumptions. As a consequence, the clustering results tend to be unnatural and stray from the intrinsic grouping nature of a document collection. We apply a novel graph-theoretic technique called the Clique Percolation Method (CPM) to document clustering. In this method, highly cohesive maximal document cliques are enumerated in a random graph, and strongly adjacent cliques are merged to form naturally overlapping clusters. Our clustering results can unveil the inherent structural connections of the underlying data. Experiments show that CPM can outperform some typical algorithms on benchmark data sets, and shed light on its advantages for natural document clustering.
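The clique enumeration and merging steps described in the abstract can be illustrated with a minimal Python sketch. It follows the general clique percolation idea (maximal cliques of at least k nodes are joined when they share at least k - 1 nodes) and is not the authors' implementation. The toy document graph, the document names `d1` to `d6`, the choice k = 3, and the helper names `maximal_cliques` and `clique_percolation` are all assumptions made for illustration; how the paper builds its document graph is not stated in this record.

```python
from itertools import combinations


def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch (pivoting variant).

    adj maps each node to the set of its neighbours.
    """
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        # Pivot on the node with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def clique_percolation(adj, k):
    """Merge maximal cliques (size >= k) that share at least k - 1 nodes."""
    cliques = [c for c in maximal_cliques(adj) if len(c) >= k]
    parent = list(range(len(cliques)))  # union-find over clique indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)

    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(groups.values(), key=sorted)


# Hypothetical toy graph: an edge means two documents are "similar enough".
edges = [("d1", "d2"), ("d1", "d3"), ("d2", "d3"), ("d2", "d4"),
         ("d3", "d4"), ("d4", "d5"), ("d4", "d6"), ("d5", "d6")]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

clusters = clique_percolation(adj, k=3)
for cluster in clusters:
    print(sorted(cluster))
```

In this toy graph the two triangles that share `d2` and `d3` merge into one cluster, while the third triangle stays separate; `d4` belongs to both clusters, which is the kind of overlap the abstract refers to.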

Original language: English
Title of host publication: Computer Processing of Oriental Languages - Beyond the Orient
Subtitle of host publication: The Research Challenges Ahead - 21st International Conference, ICCPOL 2006, Proceedings
Pages: 97-108
Number of pages: 12
State: Published - 2006
Externally published: Yes
Event: 21st International Conference on Computer Processing of Oriental Languages: Beyond the Orient: The Research Challenges Ahead, ICCPOL 2006 - Singapore, Singapore
Duration: 17 Dec 2006 to 19 Dec 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4285 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st International Conference on Computer Processing of Oriental Languages: Beyond the Orient: The Research Challenges Ahead, ICCPOL 2006
Country/Territory: Singapore
City: Singapore
Period: 17/12/06 to 19/12/06