Abstract
In many applications, Skyline is an important operation to return a set of interesting tuples which are not dominated by other tuples in a potentially huge data space. The problem of Skyline is that it cannot control the size of Skyline results. This paper deals with a novel top-k Skyline query, which returns k Skyline tuples with the largest domination scores. It is found that the majority of existing algorithms ignore the application of domination scores as a metric of size limitation in Skyline query. This paper proposes a novel table-scan-based algorithm RSTS (ranked Skyline with table scan) to compute top-k Skyline query on massive data efficiently. RSTS first presorts table to arrange the tuples in the order of round-robin retrieval on sorted lists. The execution of RSTS consists of two phases. In phase 1, RSTS scans the presorted table sequentially and maintains the candidate tuples. In phase 2, RSTS calculates the domination scores of the candidates and returns query results. It is proven that RSTS has the characteristic of early termination, along with the theoretical analysis of scan depths. The pruning rule for candidate tuples is devised in this paper. The theoretical pruning effect shows that majority of the Skyline results can be discarded directly. The extensive experimental results show that RSTS can compute top-k Skyline results efficiently.
| Original language | English |
|---|---|
| Pages (from-to) | 775-787 |
| Number of pages | 13 |
| Journal | Journal of Frontiers of Computer Science and Technology |
| Volume | 13 |
| Issue number | 5 |
| DOIs | |
| State | Published - 2019 |
| Externally published | Yes |
Keywords
- massive data
- pruning operation
- ranked Skyline with table scan (RSTS) algorithm
- table scan
- top-k Skyline
Fingerprint
Dive into the research topics of 'Efficient top-k Skyline Query Algorithm on Massive Data'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver