TY - GEN
T1 - A performance study of big spatial data systems
AU - Alam, Md Mahbub
AU - Ray, Suprio
AU - Bhavsar, Virendra C.
N1 - Publisher Copyright: © 2018 Copyright held by the owner/author(s).
PY - 2018/11/6
Y1 - 2018/11/6
N2 - With the accelerated growth in spatial data volume, generated from a wide variety of sources, the need for efficient storage, retrieval, processing, and analysis of spatial data is ever more important. Hence, spatial data processing systems have become an important field of research. In recent times, a number of Big Spatial Data systems have been proposed by researchers around the world. These systems can be roughly categorized into Apache Hadoop-based systems and in-memory systems based on Apache Spark. The features supported by these systems vary widely. However, there has not been any comprehensive evaluation study of these systems in terms of performance, scalability, and functionality. To address this need, we propose a benchmark to evaluate Big Spatial Data systems. Although Spark is a very popular framework, its performance is limited by the overhead associated with distributed resource management and coordination. The Big Spatial Data systems that are based on Spark are also constrained by this overhead. We introduce SpatialIgnite, a Big Spatial Data system that we have developed based on Apache Ignite. We investigate the present status of Big Spatial Data systems by conducting a comprehensive feature analysis and performance evaluation of a few representative systems with our benchmark. Our study shows that SpatialIgnite performs better than the Hadoop- and Spark-based systems that we have evaluated.
AB - With the accelerated growth in spatial data volume, generated from a wide variety of sources, the need for efficient storage, retrieval, processing, and analysis of spatial data is ever more important. Hence, spatial data processing systems have become an important field of research. In recent times, a number of Big Spatial Data systems have been proposed by researchers around the world. These systems can be roughly categorized into Apache Hadoop-based systems and in-memory systems based on Apache Spark. The features supported by these systems vary widely. However, there has not been any comprehensive evaluation study of these systems in terms of performance, scalability, and functionality. To address this need, we propose a benchmark to evaluate Big Spatial Data systems. Although Spark is a very popular framework, its performance is limited by the overhead associated with distributed resource management and coordination. The Big Spatial Data systems that are based on Spark are also constrained by this overhead. We introduce SpatialIgnite, a Big Spatial Data system that we have developed based on Apache Ignite. We investigate the present status of Big Spatial Data systems by conducting a comprehensive feature analysis and performance evaluation of a few representative systems with our benchmark. Our study shows that SpatialIgnite performs better than the Hadoop- and Spark-based systems that we have evaluated.
KW - Benchmark
KW - Big Spatial Data
KW - Hadoop
KW - Ignite
KW - In-Memory
KW - Performance Evaluation
KW - Spark
UR - https://www.scopus.com/pages/publications/85060581395
U2 - 10.1145/3282834.3282841
DO - 10.1145/3282834.3282841
M3 - Conference contribution
AN - SCOPUS:85060581395
T3 - Proceedings of the 7th ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2018
SP - 1
EP - 9
BT - Proceedings of the 7th ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2018
PB - Association for Computing Machinery, Inc
T2 - 7th ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2018
Y2 - 6 November 2018 through 6 November 2018
ER -