
GPU-BTM: A topic model for short text using auxiliary information

  • Harbin Institute of Technology
  • Dongguan University of Technology

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

Short texts have recently become very popular in social life. To understand them, researchers develop topic models that extract topic information. However, conventional topic models mainly target long documents and cannot cope with the sparsity of short texts. In this paper, we propose a novel topic model for short text called GPU-BTM, which incorporates the Generalized Pólya Urn (GPU) technique into the Biterm Topic Model (BTM). GPU-BTM uses the similarity information and the co-occurrence patterns of words simultaneously to handle the sparsity problem. Specifically, the GPU module considers the similarity information among words, so that GPU-BTM generates more coherent topics, while the BTM module captures the co-occurrence patterns of words, so that the enriched contexts relieve the data sparsity problem. Experimental results demonstrate that GPU-BTM outperforms four recent comparison models on two real-world short text datasets.
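
The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the `similar_words` lookup table, and the `promotion` weight of 0.3 are all hypothetical choices made for illustration. It shows only how biterms (unordered word pairs from one short text) are extracted, and how a Generalized Pólya Urn style update credits similar words when one word is assigned to a topic.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) from one short text."""
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def gpu_increment(topic_word_counts, word, similar_words, promotion=0.3):
    """Add 1 for `word` and a smaller `promotion` weight for each similar word.

    `similar_words` and `promotion` are illustrative assumptions; in practice
    the similarity information would come from an auxiliary source.
    """
    topic_word_counts[word] += 1.0
    for other in similar_words.get(word, ()):
        topic_word_counts[other] += promotion


# Biterms are pooled over the whole corpus, which enriches sparse contexts.
docs = [["apple", "iphone", "launch"], ["apple", "fruit"]]
biterm_counts = Counter(b for doc in docs for b in extract_biterms(doc))

# A GPU-style update for a single topic's word counts.
similar_words = {"iphone": ["smartphone"]}
counts = Counter()
gpu_increment(counts, "iphone", similar_words)
```

A full sampler would draw a topic for each biterm and apply an update of this kind to both of its words; that inference loop is omitted here because the abstract does not specify it.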

Original language: English
Title of host publication: Proceedings - 2020 IEEE 5th International Conference on Data Science in Cyberspace, DSC 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 198-205
Number of pages: 8
ISBN (Electronic): 9781728195582
State: Published - Jul 2020
Externally published: Yes
Event: 5th IEEE International Conference on Data Science in Cyberspace, DSC 2020 - Hong Kong, China
Duration: 27 Jul 2020 to 29 Jul 2020

Publication series

Name: Proceedings - 2020 IEEE 5th International Conference on Data Science in Cyberspace, DSC 2020

Conference

Conference: 5th IEEE International Conference on Data Science in Cyberspace, DSC 2020
Country/Territory: China
City: Hong Kong
Period: 27/07/20 to 29/07/20

Keywords

  • Auxiliary information
  • Short text
  • Topic model
