Geographica Raster: A Benchmark for Geospatial RDF Stores implementing GeoSPARQL 2.0

Authors

Introduction

Geospatial extensions to SPARQL such as GeoSPARQL are by now well established.
The Geographica Benchmark has been established in the linked data community as a standardized benchmark to test the performance of geospatial RDF stores.
GeoSPARQL 2.0 extends GeoSPARQL by defining the following categories:

We therefore extend the Geographica Benchmark with the functions proposed in the GeoSPARQL 2.0 standard proposal, and we would like to contribute meaningful queries and application cases to the standard. As a first evaluation, we compared GeoSPARQL 2.0 against a PostGIS database, which already includes many of the features added by the GeoSPARQL 2.0 proposal.
We present the results of these evaluations here.
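To illustrate the kind of comparison described above, the following minimal sketch pairs one spatial predicate expressed in GeoSPARQL with a PostGIS counterpart. It uses only the standard GeoSPARQL function `geof:sfIntersects` and the PostGIS function `ST_Intersects`, not any GeoSPARQL 2.0 proposal function. The search area, the `features` table, and its `id` and `geom` columns are hypothetical names chosen for illustration; they are not taken from the benchmark.

```python
# Hypothetical search area; the coordinates are illustrative only.
AREA_WKT = "POLYGON((8.0 49.0, 9.0 49.0, 9.0 50.0, 8.0 50.0, 8.0 49.0))"

# GeoSPARQL side: standard geo:/geof: vocabulary, assumed data layout
# (features linked to geometries via geo:hasGeometry / geo:asWKT).
GEOSPARQL_QUERY = f"""
PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
SELECT ?feature WHERE {{
  ?feature geo:hasGeometry ?geom .
  ?geom geo:asWKT ?wkt .
  FILTER(geof:sfIntersects(?wkt, "{AREA_WKT}"^^geo:wktLiteral))
}}
"""

# PostGIS side: the table "features" and its columns are assumptions.
POSTGIS_QUERY = f"""
SELECT id FROM features
WHERE ST_Intersects(geom, ST_GeomFromText('{AREA_WKT}', 4326));
"""

# One comparable query pair, keyed by system.
QUERY_PAIR = {
    "geosparql": GEOSPARQL_QUERY.strip(),
    "postgis": POSTGIS_QUERY.strip(),
}
```

Keeping both formulations side by side makes it easier to check that the two systems are asked the same question before their response times are compared.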

Datasets

We follow the Geographica benchmark approach and define Micro Benchmark Queries and Macro Benchmark Queries as shown below. Micro Benchmark Queries each test the performance of one GeoSPARQL 2.0 function and evaluate it against its counterparts in other triple stores.
Macro Benchmark Queries test real-world use cases. We have defined some of these, and we would like the community to contribute further ones in order to improve the benchmark.
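A micro benchmark of this kind can be sketched as a small timing harness that runs one query per system several times and reports a response time. The sketch below is an assumption-laden illustration, not the benchmark's actual driver: the function names `time_query` and `compare_systems`, the warm-up run, and the use of the median are all choices made here, and the query runners are stand-in callables rather than real database connections.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_query(run_query: Callable[[], object],
               repetitions: int = 5,
               warmups: int = 1) -> List[float]:
    """Return wall-clock response times (seconds) for repeated runs of one query."""
    if repetitions < 1:
        raise ValueError("repetitions must be at least 1")
    # Warm-up runs are executed but not measured (an assumption of this sketch).
    for _ in range(warmups):
        run_query()
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return timings


def compare_systems(runners: Dict[str, Callable[[], object]],
                    repetitions: int = 5) -> Dict[str, float]:
    """Median response time per system for one micro benchmark query."""
    return {
        name: statistics.median(time_query(run, repetitions))
        for name, run in runners.items()
    }


if __name__ == "__main__":
    # Stand-in callables; a real run would send the query to each store.
    runners = {
        "GeoSPARQL 2.0": lambda: sum(range(10_000)),
        "PostGIS": lambda: sum(range(10_000)),
    }
    for system, median in compare_systems(runners).items():
        print(f"{system}: {median * 1000:.3f} ms")
```

Passing each system as a plain callable keeps the harness independent of any particular client library, so the same loop could in principle wrap a SPARQL endpoint request or a SQL cursor call.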

Datasets

Datasets are divided into those containing vector data and those containing raster data.

Vector Datasets

For the vector datasets we use the Geographica Benchmark test sets, which are linked below:

Raster Datasets

The raster data are stored in the raster dataset which we provide in this repository. It consists of images provided by OSGeo and USGS, most of which relate to risk assessment tasks.

Micro Benchmark Queries

Micro Benchmark Detailed Results

The results of the Micro Benchmark can be seen on the following homepage: Postgis-Jena Benchmark

Macro Benchmark Queries

Macro Benchmark Detailed Results

In the manuscript [1] we report in detail the results of the micro benchmark and of the experiments regarding the synthetic workload. The macro benchmark results are also presented on this homepage:
Response Times per query for the Flood Simulation scenario
Query | GeoSPARQL 2.0 | PostGIS
Q1

The synthetic workload

Geographica includes a test set based on a synthetic workload, using a synthetic dataset that was generated automatically. We repeated this test for the vector data functions newly defined in GeoSPARQL 2.0. Defining a synthetically generated raster dataset remains future work.
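As a rough illustration of what automatic generation of a synthetic vector dataset can look like, the sketch below produces a regular grid of square polygons as WKT strings. This is an assumption made for illustration only: the layout, the function name `grid_polygons`, and the feature identifiers are hypothetical and do not describe the generator actually used by Geographica.

```python
from typing import Iterator, Tuple


def grid_polygons(n: int,
                  cell_size: float = 1.0,
                  origin: Tuple[float, float] = (0.0, 0.0)
                  ) -> Iterator[Tuple[str, str]]:
    """Yield (feature_id, WKT polygon) pairs for an n x n grid of square cells."""
    x0, y0 = origin
    for row in range(n):
        for col in range(n):
            min_x = x0 + col * cell_size
            min_y = y0 + row * cell_size
            max_x = min_x + cell_size
            max_y = min_y + cell_size
            # Closed ring: the first and last coordinate are identical.
            wkt = (
                f"POLYGON(({min_x} {min_y}, {max_x} {min_y}, "
                f"{max_x} {max_y}, {min_x} {max_y}, {min_x} {min_y}))"
            )
            yield f"cell_{row}_{col}", wkt


if __name__ == "__main__":
    for feature_id, wkt in grid_polygons(2):
        print(feature_id, wkt)
```

Because the geometry of every cell is known in advance, the expected result size of a spatial query over such a grid can be computed independently, which is one reason synthetic workloads are convenient for controlled experiments.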

Datasets

Queries

Geographica Raster source code

The Geographica Raster source code is included in the Postgis-Jena project, which was central to the implementation of GeoSPARQL 2.0.
The benchmark can be executed using the following steps:

Technical Details

We have performed experiments using Geographica Raster for the following databases in their respective query languages:

The results of these experiments can be found in the manuscript [1] of this benchmark.
To be published