Skip to main navigation Skip to search Skip to main content

Efficient top-k Skyline Query Algorithm on Massive Data

  • Xixian Han*
  • , Cui Song
  • , Yunru Ge
  • , Hong Gao
  • , Jianzhong Li
  • *Corresponding author for this work
  • School of Computer Science and Technology, Harbin Institute of Technology

Research output: Contribution to journalArticlepeer-review

Abstract

In many applications, Skyline is an important operation to return a set of interesting tuples which are not dominated by other tuples in a potentially huge data space. The problem of Skyline is that it cannot control the size of Skyline results. This paper deals with a novel top-k Skyline query, which returns k Skyline tuples with the largest domination scores. It is found that the majority of existing algorithms ignore the application of domination scores as a metric of size limitation in Skyline query. This paper proposes a novel table-scan-based algorithm RSTS (ranked Skyline with table scan) to compute top-k Skyline query on massive data efficiently. RSTS first presorts table to arrange the tuples in the order of round-robin retrieval on sorted lists. The execution of RSTS consists of two phases. In phase 1, RSTS scans the presorted table sequentially and maintains the candidate tuples. In phase 2, RSTS calculates the domination scores of the candidates and returns query results. It is proven that RSTS has the characteristic of early termination, along with the theoretical analysis of scan depths. The pruning rule for candidate tuples is devised in this paper. The theoretical pruning effect shows that majority of the Skyline results can be discarded directly. The extensive experimental results show that RSTS can compute top-k Skyline results efficiently.

Original languageEnglish
Pages (from-to)775-787
Number of pages13
JournalJournal of Frontiers of Computer Science and Technology
Volume13
Issue number5
DOIs
StatePublished - 2019
Externally publishedYes

Keywords

  • massive data
  • pruning operation
  • ranked Skyline with table scan (RSTS) algorithm
  • table scan
  • top-k Skyline

Fingerprint

Dive into the research topics of 'Efficient top-k Skyline Query Algorithm on Massive Data'. Together they form a unique fingerprint.

Cite this