TY - GEN
T1 - Topic exploration in spatio-temporal document collections
AU - Zhao, Kaiqi
AU - Chen, Lisi
AU - Cong, Gao
N1 - Publisher Copyright: © 2016 ACM.
PY - 2016/6/26
Y1 - 2016/6/26
AB - Huge amounts of data with both spatial and temporal information (e.g., geo-tagged tweets) are being generated, and are often used to share and spread personal updates, spontaneous ideas, and breaking news. We refer to such data as spatio-temporal documents. It is of great interest to explore topics in a collection of spatio-temporal documents. In this paper, we study the problem of efficiently mining topics from spatio-temporal documents within a user-specified bounded region and timespan, to provide users with insights about events, trends, and public concerns within the specified region and time period. We propose a novel algorithm that is able to efficiently combine two pre-trained topic models learnt from two document sets with a bounded error, based on which we develop an efficient approach to mining topics from a large number of spatio-temporal documents within a region and a timespan. Our experimental results show that our approach is able to improve the runtime by at least an order of magnitude compared with the baselines. Meanwhile, the effectiveness of our proposed method is close to the baselines.
UR - https://www.scopus.com/pages/publications/84979681574
U2 - 10.1145/2882903.2882921
DO - 10.1145/2882903.2882921
M3 - Conference contribution
AN - SCOPUS:84979681574
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 985
EP - 998
BT - SIGMOD 2016 - Proceedings of the 2016 International Conference on Management of Data
PB - Association for Computing Machinery
T2 - 2016 ACM SIGMOD International Conference on Management of Data, SIGMOD 2016
Y2 - 26 June 2016 through 1 July 2016
ER -