- Base Fuseki with Jena Commands
- GeoSPARQL
- RDF Delta Fuseki
- RDF Delta Server
The image is available as ghcr.io/kurrawong/fuseki:<version>, where <version> is composed of the Jena version and this container image's build number. For example, ghcr.io/kurrawong/fuseki:5.2.0-0 is built on Jena Fuseki 5.2.0, and the trailing 0 is the build number of this container image. If we release a new build that is still based on Jena 5.2.0, the build number is incremented to 1, giving ghcr.io/kurrawong/fuseki:5.2.0-1.
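The tag scheme described above can be sketched in a few lines of Python. This is only an illustration of the naming convention; the helper names `parse_tag` and `next_build` are hypothetical and not part of this repository.

```python
from typing import NamedTuple


class ImageTag(NamedTuple):
    jena_version: str  # e.g. "5.2.0"
    build: int         # build number of the container image, e.g. 0


def parse_tag(tag: str) -> ImageTag:
    """Split '<jena version>-<build number>' into its two parts."""
    jena_version, sep, build = tag.rpartition("-")
    if not sep or not jena_version or not build.isdigit():
        raise ValueError(f"unexpected tag format: {tag!r}")
    return ImageTag(jena_version, int(build))


def next_build(tag: str) -> str:
    """Return the tag of the next build on the same Jena version."""
    parsed = parse_tag(tag)
    return f"{parsed.jena_version}-{parsed.build + 1}"


print(parse_tag("5.2.0-0"))
print(next_build("5.2.0-0"))
```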
See the tagged images here.
To make loading and managing data easier, we recommend installing the kurra CLI.
uv tool install kurra
task fuseki:build
task fuseki:up
This will enable the Fuseki UI at http://localhost:3030/
A test database is configured in testdata/config-geosparql.ttl. It has all features enabled by default. You can disable them by setting the following properties to false:
# some GeoSPARQL settings. See https://jena.apache.org/documentation/geosparql/geosparql-fuseki.html
geosparql:inference true ; # GeoSPARQL RDFS schema and inferencing (adds additional statements to the dataset)
geosparql:queryRewrite true ; # Simplifies queries, relies on applyDefaultGeometry
geosparql:applyDefaultGeometry true ; # Makes the dataset less dependent on one serialization. Adds additional geo:hasSerialization statements to the dataset
geosparql:indexEnabled true ; # Enable caching of re-usable data to improve query performance
geosparql:validateGeometryLiterals true ; # Logs warnings when adding invalid geometry
With Fuseki up and running, you can create this dataset using the following command:
kurra db create http://localhost:3030 --config ./testdata/config-geosparql.ttl
You'll see a warning in the Docker logs of the fuseki service:
WARN GeoAssembler :: Dataset empty. Spatial Index not constructed. Server will require restarting after adding data and any updates to build Spatial Index.
We can add some data and restart the server:
kurra db upload ./testdata/data-geosparql.ttl http://localhost:3030/test-geosparql
task fuseki:restart
Now you should see that the spatial index was created:
SpatialIndex :: Saving Spatial Index - Completed: /fuseki/databases/test-geosparql/spatial.index
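If you want to check for these two log messages in a script rather than by eye, a minimal sketch could look like the following. It only matches the message fragments quoted above; the function name `spatial_index_status` is hypothetical, and how you obtain the log lines (for example by piping your container runtime's log output into the script) is left open.

```python
def spatial_index_status(log_lines):
    """Return 'built', 'missing' or 'unknown' based on the last relevant log line."""
    status = "unknown"
    for line in log_lines:
        if "Spatial Index not constructed" in line:
            status = "missing"   # dataset was empty at startup
        elif "Saving Spatial Index - Completed" in line:
            status = "built"     # index was written after a restart
    return status


before_restart = [
    "WARN GeoAssembler :: Dataset empty. Spatial Index not constructed. "
    "Server will require restarting after adding data and any updates to build Spatial Index.",
]
after_restart = before_restart + [
    "SpatialIndex :: Saving Spatial Index - Completed: "
    "/fuseki/databases/test-geosparql/spatial.index",
]

print(spatial_index_status(before_restart))
print(spatial_index_status(after_restart))
```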
To verify that the dataset is working, go to http://localhost:3030/#/dataset/test-geosparql/query and try some GeoSPARQL queries.
Useful tools to construct and query WKT geometries: https://www.geometrymapper.com/ and https://wktmap.com/
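Besides the UI, queries can also be sent over HTTP using the SPARQL 1.1 Protocol. The sketch below uses only the Python standard library. The endpoint URL is an assumption (it reuses the dataset URL from the upload command above); depending on your Fuseki configuration the query endpoint may live under a different path. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: the dataset URL also accepts SPARQL queries; adjust to your config.
ENDPOINT = "http://localhost:3030/test-geosparql"


def build_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol 'query via POST directly' request."""
    return urllib.request.Request(
        endpoint,
        data=query.encode("utf-8"),
        headers={
            "Content-Type": "application/sparql-query",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def bindings(results_json: str, variable: str) -> list[str]:
    """Extract the values bound to one variable from a SPARQL JSON result."""
    doc = json.loads(results_json)
    return [row[variable]["value"] for row in doc["results"]["bindings"] if variable in row]


def select(query: str, variable: str, endpoint: str = ENDPOINT) -> list[str]:
    """Run a SELECT query against a running server and return one column."""
    with urllib.request.urlopen(build_request(query, endpoint)) as response:
        return bindings(response.read().decode("utf-8"), variable)
```

With the server running, `select(query, "address")` would return the address IRIs for any of the queries in this section that select `?address`.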
Find all addresses within a certain area:
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
SELECT DISTINCT ?address
WHERE {
BIND("POLYGON ((152.685242 -27.161808, 152.698975 -27.829361, 153.492737 -27.829361, 153.435059 -27.178912, 152.685242 -27.161808))"^^geo:wktLiteral AS ?polygon)
?address geo:hasGeometry / geo:asWKT ?point .
FILTER(geof:sfWithin(?point, ?polygon))
}
# returns
# 1 <https://linked.data.gov.au/dataset/qld-addr/address/65cb1e52-fc1d-5dee-a2d2-ea7882d12c7e>
# 2 <https://linked.data.gov.au/dataset/qld-addr/address/beb30200-2988-5c0a-942b-36cd2138805a>
Note that thanks to the applyDefaultGeometry and inference options, the following also works:
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
SELECT DISTINCT ?address
WHERE {
BIND("POLYGON ((152.685242 -27.161808, 152.698975 -27.829361, 153.492737 -27.829361, 153.435059 -27.178912, 152.685242 -27.161808))"^^geo:wktLiteral AS ?polygon)
?address geo:hasDefaultGeometry / geo:hasSerialization ?point .
FILTER(geof:sfWithin(?point, ?polygon))
}
# returns
# 1 <https://linked.data.gov.au/dataset/qld-addr/address/65cb1e52-fc1d-5dee-a2d2-ea7882d12c7e>
# 2 <https://linked.data.gov.au/dataset/qld-addr/address/beb30200-2988-5c0a-942b-36cd2138805a>
These queries are useful when dealing with dynamic, user-defined polygons. However, much more is possible when polygons are included in the dataset, and thus also in the spatial index.
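As a rough illustration of the dynamic case, the polygon literal can be generated from a list of coordinates before the query is sent. The sketch below assumes points are given as (longitude, latitude) pairs, matching the coordinate order of the example polygon; the function names are hypothetical.

```python
def wkt_polygon(points):
    """Build a WKT POLYGON from (longitude, latitude) pairs, closing the ring if needed."""
    if len(points) < 3:
        raise ValueError("a polygon needs at least three points")
    # float() rejects anything that is not a number, so no query text can sneak in.
    ring = [(float(lon), float(lat)) for lon, lat in points]
    if ring[0] != ring[-1]:
        ring.append(ring[0])
    coords = ", ".join(f"{lon} {lat}" for lon, lat in ring)
    return f"POLYGON (({coords}))"


def addresses_within_query(points):
    """Return the 'addresses within a polygon' query for a user-defined polygon."""
    polygon = wkt_polygon(points)
    return f"""PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
SELECT DISTINCT ?address
WHERE {{
  BIND("{polygon}"^^geo:wktLiteral AS ?polygon)
  ?address geo:hasGeometry / geo:asWKT ?point .
  FILTER(geof:sfWithin(?point, ?polygon))
}}"""


area = [
    (152.685242, -27.161808),
    (152.698975, -27.829361),
    (153.492737, -27.829361),
    (153.435059, -27.178912),
]
print(addresses_within_query(area))
```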
The dataset also contains a broad bounding box of Australia, which then gets included in the spatial index.
Thanks to query rewriting, we can use a much simpler query to list all addresses in Australia:
PREFIX addr: <https://linked.data.gov.au/def/addr/>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
SELECT DISTINCT ?address
WHERE {
?address a addr:Address .
<https://example.org/australia> geo:sfContains ?address .
}
# returns all 4 addresses in the test dataset
Or in reverse, we can look up which country a certain address is located in:
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
SELECT DISTINCT ?country
WHERE {
?country a dbo:Country .
<https://linked.data.gov.au/dataset/qld-addr/address/65cb1e52-fc1d-5dee-a2d2-ea7882d12c7e> geo:sfWithin ?country .
}
# returns <https://example.org/australia>
Note that the spatial property and filter functions in the Jena namespaces (spatial: and spatialF:) are easily confused with those specified in the standard GeoSPARQL namespaces (geo: and geof:).
In particular, none of the Non-topological Query Functions specified in the GeoSPARQL standard seem to work under the standard namespaces. Instead, there are equivalent implementations of these functions in the Jena namespace, sometimes under a different name.
For example, geof:distance does not seem to work with Jena, whereas spatialF:distance does.
PREFIX spatialF: <http://jena.apache.org/function/spatial#>
PREFIX uom: <http://www.opengis.net/def/uom/OGC/1.0/>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?distance
WHERE {
<https://linked.data.gov.au/dataset/qld-addr/address/65cb1e52-fc1d-5dee-a2d2-ea7882d12c7e> geo:hasDefaultGeometry / geo:hasSerialization ?point1 .
<https://linked.data.gov.au/dataset/qld-addr/address/2fd46078-88c0-5f30-b43e-d2908d9445b6> geo:hasDefaultGeometry / geo:hasSerialization ?point2 .
BIND(xsd:decimal(spatialF:distance(?point1, ?point2, uom:kilometre)) AS ?distance) .
}
# returns "129.601686"^^<http://www.w3.org/2001/XMLSchema#decimal>
That means when migrating from other systems that do implement the GeoSPARQL standard as-is, some query rewriting might be required to ensure a seamless transition.
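As a very rough sketch of such a rewrite, the snippet below replaces calls to geof:distance with spatialF:distance and adds the Jena prefix when it is missing. It covers only the one mapping mentioned above, works purely on the query text (so it would also touch matches inside string literals or comments), and assumes the query uses the geof: prefix. The function name is hypothetical.

```python
import re

SPATIALF_PREFIX = "PREFIX spatialF: <http://jena.apache.org/function/spatial#>"


def rewrite_distance(query: str) -> str:
    """Textually replace geof:distance( with spatialF:distance( and add the prefix."""
    rewritten = re.sub(r"\bgeof:distance\s*\(", "spatialF:distance(", query)
    if rewritten != query and "PREFIX spatialF:" not in rewritten:
        rewritten = SPATIALF_PREFIX + "\n" + rewritten
    return rewritten


original = """PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX uom: <http://www.opengis.net/def/uom/OGC/1.0/>
SELECT ?distance
WHERE {
  BIND(geof:distance(?point1, ?point2, uom:kilometre) AS ?distance)
}"""

print(rewrite_distance(original))
```

For anything beyond a handful of queries, a rewrite based on a real SPARQL parser would be more robust than this text substitution.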
Jena supports property & filter functions as specified in the documentation: https://jena.apache.org/documentation/geosparql/index
For example, find addresses within 100 kilometres of a reference point at latitude -27.5 and longitude 152.5:
PREFIX spatial: <http://jena.apache.org/spatial#>
PREFIX uom: <http://www.opengis.net/def/uom/OGC/1.0/>
PREFIX addr: <https://linked.data.gov.au/def/addr/>
SELECT DISTINCT ?address
WHERE {
?address a addr:Address ;
spatial:nearby(-27.5 152.5 100 uom:kilometre)
}
# returns
# <https://linked.data.gov.au/dataset/qld-addr/address/2fd46078-88c0-5f30-b43e-d2908d9445b6>
# <https://linked.data.gov.au/dataset/qld-addr/address/65cb1e52-fc1d-5dee-a2d2-ea7882d12c7e>
Find all addresses north of that same point:
PREFIX spatial: <http://jena.apache.org/spatial#>
PREFIX uom: <http://www.opengis.net/def/uom/OGC/1.0/>
PREFIX addr: <https://linked.data.gov.au/def/addr/>
SELECT DISTINCT ?address
WHERE {
?address a addr:Address ;
spatial:north(-27.5 152.5)
}
# returns <https://linked.data.gov.au/dataset/qld-addr/address/2fd46078-88c0-5f30-b43e-d2908d9445b6>
When configuring a spatial dataset combined with a Lucene index, it's important that the fuseki:dataset of the fuseki:Service points to the dataset with type text:TextDataset, and not to the geosparql:geosparqlDataset. Only then can we combine a spatial index with a full-text index. See testdata/config-geosparql.ttl for an example.
With the Lucene index enabled, the following queries are supported, according to the documentation:
?s text:query 'Queensland' # simplest query
?s text:query ('Queensland' 2) # with limit on results
?s text:query (rdfs:label 'Queensland') # query specific property
?s text:query (rdfs:label 'Queensland' 'lang:en') # restrict search to one language
(?s ?score) text:query 'Queensland' # include the score
(?s ?score ?literal) text:query 'Queensland' # include the original literal value
(?s ?score ?literal ?g) text:query 'Queensland' # include the graph
(?s ?score ?literal) text:query (rdfs:label "(Barbaralla AND Queensland)") # Boolean operators
(?s ?score ?literal) text:query (rdfs:label "(Queensla~)") # Fuzzy search
(?s ?sc ?lit) text:query ( "Queensland" "highlight:" ) # highlighting
(?s ?sc ?lit) text:query ( "Queensland" "highlight:s:<em class='hiLite'> | e:</em>" ) # highlighting with HTML
We can now combine the full-text search with the spatial index to search for text occurrences within a certain geographical area:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX text: <http://jena.apache.org/text#>
SELECT DISTINCT ?address ?literal
WHERE {
BIND("POLYGON ((152.685242 -27.161808, 152.698975 -27.829361, 153.492737 -27.829361, 153.435059 -27.178912, 152.685242 -27.161808))"^^geo:wktLiteral AS ?polygon)
?address geo:hasGeometry / geo:asWKT ?point ;
rdfs:label ?addressLabel .
FILTER(geof:sfWithin(?point, ?polygon))
(?address ?score ?literal) text:query ( "Drive" "highlight:" ) .
}
# returns
# 1 <https://linked.data.gov.au/dataset/qld-addr/address/65cb1e52-fc1d-5dee-a2d2-ea7882d12c7e> "32 Barbaralla ↦Drive↤, Springwood, Queensland, Australia"@en
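The ↦ and ↤ characters in the returned literal mark the start and end of the highlighted match. If a client wants to display the result as HTML, one option is to convert these markers after the fact, as in the minimal sketch below. It assumes the markers shown in the result above; the function name is hypothetical.

```python
import html

# Markers as they appear in the result above.
START, END = "↦", "↤"


def highlight_to_html(literal: str, tag: str = "em") -> str:
    """Escape the literal for HTML, then turn the highlight markers into tags."""
    escaped = html.escape(literal)
    return escaped.replace(START, f"<{tag}>").replace(END, f"</{tag}>")


print(highlight_to_html("32 Barbaralla ↦Drive↤, Springwood, Queensland, Australia"))
```

Escaping before replacing the markers keeps any `<` or `&` in the original label from being interpreted as markup.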
See Taskfile.yml for local development commands.
We can build patches for Jena ourselves by developing on a specific version of the Jena source code, and including patches in /docker/patches.
A simple example of this is the addition of the GeoSPARQL dependency in /docker/patches/enable-geosparql.diff, inspired by the Zazuko Docker image.
To add a patch to our own Jena deployment, check out the current Jena version tag from https://github.com/apache/jena, e.g. using git checkout jena-5.2.0. Then make the necessary changes, run git diff > my-patch.diff, and add my-patch.diff to /docker/patches and to the Dockerfile.