
A topic detection approach through hierarchical clustering on concept graph

  • Harbin Institute of Technology Shenzhen
  • Shenzhen Key Laboratory of Internet Information Collaboration
  • School of Computer Science and Technology, Harbin Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

Topic detection and tracking (TDT) algorithms have long been developed for discovering topics. However, most existing TDT algorithms pay too little attention to (1) the temporal distance between a pair of topics and (2) the mutual effect between highly correlated topic terms. In this paper, we propose a novel topic detection approach that applies hierarchical clustering on a constructed concept graph (HCCG) and addresses both shortcomings simultaneously. In this approach, the concept and the concept behavior curve are first defined. Then, a temporal graph is constructed with concepts as vertices, connected by edges between concepts that share the same topic terms. By performing hierarchical clustering on this concept graph, highly correlated concept behavior curves are grouped together as topics. The proposed approach is evaluated on a number of datasets, and the experimental results show that it outperforms K-means, the agglomerative hierarchical clustering algorithm (AGH), and LDA with respect to precision, recall, and F-measure. Moreover, the proposed concept behavior curves can be used to track topic change trends by monitoring the peak frequency of the curves.
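
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the paper's HCCG algorithm: the abstract does not give the definitions of a concept, its behavior curve, or the clustering criterion. The sketch assumes that a concept is a set of terms plus a per-time-slot frequency curve, that edges join concepts sharing at least one term, that edge weight is the Pearson correlation of the two curves, and that clustering is average-link agglomeration stopped at a hypothetical threshold. All names and the toy data are illustrative.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length curves (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def build_graph(concepts):
    """Assumed graph: an edge joins two concepts that share a term,
    weighted by the correlation of their behavior curves."""
    edges = {}
    for a, b in combinations(sorted(concepts), 2):
        if concepts[a]["terms"] & concepts[b]["terms"]:
            edges[(a, b)] = pearson(concepts[a]["curve"], concepts[b]["curve"])
    return edges


def cluster(concepts, edges, threshold=0.5):
    """Average-link agglomerative clustering over the graph.
    Pairs without an edge contribute weight 0; merging stops once the
    best link falls below the (hypothetical) threshold."""
    clusters = [{name} for name in sorted(concepts)]

    def link(c1, c2):
        total = sum(edges.get((min(a, b), max(a, b)), 0.0)
                    for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, i, j = max((link(clusters[i], clusters[j]), i, j)
                         for i, j in combinations(range(len(clusters)), 2))
        if best < threshold:
            break
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


def peak_slot(curve):
    """Index of the time slot where the curve reaches its maximum."""
    return max(range(len(curve)), key=curve.__getitem__)


# Toy data: term sets and frequencies over five time slots (made up).
concepts = {
    "quake":    {"terms": {"earthquake", "rescue"}, "curve": [1, 8, 5, 1, 0]},
    "relief":   {"terms": {"rescue", "donation"},   "curve": [0, 6, 5, 2, 0]},
    "election": {"terms": {"vote", "candidate"},    "curve": [0, 1, 1, 7, 9]},
    "ballot":   {"terms": {"vote", "recount"},      "curve": [0, 0, 2, 6, 8]},
}

topics = cluster(concepts, build_graph(concepts))
for topic in topics:
    print(sorted(topic), {c: peak_slot(concepts[c]["curve"]) for c in sorted(topic)})
```

On this toy data the earthquake-related concepts and the election-related concepts end up in separate groups, because concepts are only linked when they share a term and their curves rise and fall together. The peak slot of each curve is a crude stand-in for the peak-frequency monitoring the abstract mentions for tracking topic trends.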

Original language: English
Pages (from-to): 2285-2295
Number of pages: 11
Journal: Applied Mathematics and Information Sciences
Volume: 7
Issue number: 6
State: Published - 2013
Externally published: Yes

Keywords

  • Concept graph
  • Hierarchical clustering
  • Text clustering
  • Topic detection
