Description
A typical success factor in the world of web crawling is a good URN strategy.
Crawlers ingest a resource and identify it with some URN/URL.
In theory the URN can be http://{ogc-proxy}/?url={ogc-request-as-get}, but it would be beneficial to simplify that structure, because a lot can go wrong with such a URL (special characters, its dynamic nature, etc.).
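To illustrate why that structure is fragile, a quick sketch (the hostnames and the inner WFS request are made up) of what happens when a full OGC GET request is embedded as a single query parameter:

```python
from urllib.parse import quote, urlencode

# Hypothetical proxy host; a placeholder, not a real deployment.
PROXY = "http://ogc-proxy.example.org/"

# A typical WFS GetFeature request expressed as a GET URL.
wfs_request = (
    "https://wfs.example.org/service?"
    + urlencode({
        "service": "WFS",
        "request": "GetFeature",
        "typeNames": "topp:states",
        "count": "10",
    })
)

# Embedding it in the proxy URL requires percent-encoding everything,
# including the '?', '&' and '=' of the inner query string, so the
# resulting identifier is long, noisy, and easy to mangle.
proxied = PROXY + "?url=" + quote(wfs_request, safe="")
print(proxied)
```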
A structure like this would be much better:
http://{wfs-proxy}/wfs/{server-id}/ (to get a list of featuretypes/getcapabilities)
http://{wfs-proxy}/wfs/{server-id}/{featuretype}/{page} (to get a paginated list of features)
http://{wfs-proxy}/wfs/{server-id}/{featuretype}/feature/{recordid} (to get a feature)
Which means the proxy should have some persistence of server IDs, which may require the WFS proxy to advertise some methods to register a WFS server (and/or retrieve it from a coupled catalog).
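As a sketch of that registration side, assuming a simple JSON file as persistence and a hash-derived server-id (a real proxy might instead take its IDs from a coupled catalog):

```python
import hashlib
import json
import os

REGISTRY_FILE = "wfs_servers.json"  # hypothetical persistence location


def register_server(wfs_url, registry_file=REGISTRY_FILE):
    """Register a WFS endpoint and return a short, stable server-id.

    The id is derived from the URL, so re-registering the same server
    is idempotent and always yields the same id.
    """
    servers = {}
    if os.path.exists(registry_file):
        with open(registry_file) as f:
            servers = json.load(f)
    server_id = hashlib.sha1(wfs_url.encode()).hexdigest()[:8]
    servers[server_id] = wfs_url
    with open(registry_file, "w") as f:
        json.dump(servers, f)
    return server_id
```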
Optional parameters on the URL (query string) are:
- outputformat (json, gml, kml, shape, geopackage, rdf/xml, jsonld, turtle), or this can be managed with an Accept header
- projection (relevant for gml/shape/geopackage)
- filter by a geographic area or any of the attribute fields
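A sketch of how these optional query-string parameters could map onto standard WFS GetFeature parameters (`outputFormat`, `srsName`, `bbox`); the short-name-to-MIME-type table is an assumption, and the formats a given server actually accepts vary per implementation:

```python
# Hypothetical mapping from the proxy's short format names to the
# output formats a WFS server might advertise.
FORMATS = {
    "json": "application/json",
    "gml": "application/gml+xml; version=3.2",
    "kml": "application/vnd.google-earth.kml+xml",
}


def extra_wfs_params(outputformat=None, projection=None, bbox=None):
    """Map optional proxy query parameters onto WFS GetFeature parameters."""
    params = {}
    if outputformat:
        # Fall back to passing the value through unchanged if unknown.
        params["outputFormat"] = FORMATS.get(outputformat, outputformat)
    if projection:
        params["srsName"] = projection          # e.g. "EPSG:28992"
    if bbox:
        params["bbox"] = ",".join(map(str, bbox))  # geographic area filter
    return params
```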
URN vs URL: from a search engine point of view the URN should be the same as the URL. In the semantic web a URN does not need to resolve to a resource, but a search engine will never crawl a resource if the URN is not a resolvable URL.
On the other hand it is awkward that we facilitate WFS content to be crawled, but assign it a URL outside the domain where the resource is stored. Maybe this can be resolved over time by having those domains install their own WFS proxy.