Open
Description
Context:
- We have multiple CKAN datasets that are composed of a set of resources.
- In each of these datasets, one of these resources is a list of geographical regions.
- All other resources in these datasets include a column for geographical region.
- We want to use the `foreignKeys` property (https://specs.frictionlessdata.io/table-schema/#foreign-keys) in frictionless to ensure that every value in that column is present in the geographical regions resource.
Good news:
- This is supported upstream by frictionless 🎉
Less good news:
- It requires treating two tables and two schemas as one Data Package
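To make the constraint concrete, here is a minimal sketch of what "two tables and two schemas as one Data Package" could look like. The resource names, field names and rows are made up for illustration; only the `foreignKeys` layout follows the Table Schema spec linked above. The checker is plain Python written to show what the constraint means, not how frictionless implements it, and it assumes single-field keys that reference another named resource.

```python
# Hypothetical Data Package descriptor: all names and rows are illustrative.
package_descriptor = {
    "name": "example-dataset",
    "resources": [
        {
            "name": "regions",
            "data": [["name"], ["North"], ["South"]],
            "schema": {"fields": [{"name": "name", "type": "string"}]},
        },
        {
            "name": "population",
            "data": [["region", "count"], ["North", 10], ["East", 5]],
            "schema": {
                "fields": [
                    {"name": "region", "type": "string"},
                    {"name": "count", "type": "integer"},
                ],
                # Every value of "region" must appear in regions.name.
                "foreignKeys": [
                    {
                        "fields": "region",
                        "reference": {"resource": "regions", "fields": "name"},
                    }
                ],
            },
        },
    ],
}


def column(resource, field):
    """Return all values of one column from an inline-data resource."""
    header, *rows = resource["data"]
    index = header.index(field)
    return [row[index] for row in rows]


def foreign_key_violations(descriptor):
    """List (resource, field, value) triples that break a foreign key.

    Simplification: handles only single-field keys pointing at another
    named resource in the same package.
    """
    by_name = {res["name"]: res for res in descriptor["resources"]}
    violations = []
    for res in descriptor["resources"]:
        for fk in res["schema"].get("foreignKeys", []):
            target = by_name[fk["reference"]["resource"]]
            allowed = set(column(target, fk["reference"]["fields"]))
            for value in column(res, fk["fields"]):
                if value not in allowed:
                    violations.append((res["name"], fk["fields"], value))
    return violations
```

The point of the example is that the `population` table cannot be checked on its own: the validator needs the `regions` table in the same package to know which values are allowed.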
Idea No. 1:
- One way to implement this is for ckanext-validation to check schemas for `foreignKeys` and, if found, bundle both tables and both Table Schemas and send them all to frictionless's `validate` with `type=package`, e.g. we could have a `_validate_data_package` to match https://github.com/frictionlessdata/ckanext-validation/blob/9c23581d34289536139cc81f7b7ddb756dab64f6/ckanext/validation/jobs.py#L139
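A rough sketch of how Idea No. 1 might look, purely as a starting point for discussion. The helper names `referenced_resources` and `build_package_descriptor`, and the shape of the resource dicts (`name`, `path`, `schema`), are assumptions and not existing ckanext-validation code. The call to frictionless uses `validate(..., type="package")` as described above; the exact options accepted may differ between frictionless versions.

```python
def referenced_resources(schema):
    """Names of other resources that this schema's foreignKeys point at."""
    return {
        fk["reference"]["resource"]
        for fk in schema.get("foreignKeys", [])
        # An empty resource name means a self-reference in Table Schema.
        if fk["reference"]["resource"]
    }


def build_package_descriptor(resource, siblings):
    """Bundle a resource with the sibling resources its foreign keys need.

    `resource` and each item of `siblings` are hypothetical dicts with
    "name", "path" and "schema" keys.
    """
    needed = referenced_resources(resource["schema"])
    bundled = [resource] + [s for s in siblings if s["name"] in needed]
    return {
        "resources": [
            {"name": r["name"], "path": r["path"], "schema": r["schema"]}
            for r in bundled
        ]
    }


def _validate_data_package(resource, siblings):
    """Hypothetical counterpart to the existing table validation helper."""
    # Imported lazily so the helpers above work without frictionless installed.
    from frictionless import validate

    descriptor = build_package_descriptor(resource, siblings)
    return validate(descriptor, type="package")
```

With this shape, resources without `foreignKeys` would produce a one-resource package, so the existing single-table path could stay as it is and only be bypassed when `referenced_resources` returns a non-empty set.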
Idea No. 2:
- An alternative is to have an entirely parallel set of package actions, e.g. https://github.com/frictionlessdata/ckanext-validation/blob/9c23581d34289536139cc81f7b7ddb756dab64f6/ckanext/validation/logic.py#L68 becomes `package_validation_run`, and in these actions we treat the whole CKAN dataset (plus schemas) as a frictionless Data Package
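For Idea No. 2, the core step would be turning a CKAN dataset dict into a Data Package descriptor. The sketch below is an assumption about how that mapping might work: the function name `dataset_to_data_package` is hypothetical, it assumes each resource carries its Table Schema in a `schema` field (as a dict or a JSON string), and it simply skips resources without one.

```python
import json


def dataset_to_data_package(dataset):
    """Map a CKAN dataset dict to a Data Package descriptor (sketch only)."""
    resources = []
    for res in dataset.get("resources", []):
        schema = res.get("schema")
        if not schema:
            # Assumption: resources without a schema are left out of the package.
            continue
        if isinstance(schema, str):
            schema = json.loads(schema)
        resources.append(
            {"name": res["name"], "path": res["url"], "schema": schema}
        )
    return {"name": dataset["name"], "resources": resources}
```

One thing to watch: `foreignKeys` refer to the target by resource name, so the CKAN resource names would need to match the names used in the schemas, or the action would need some mapping between the two.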
We have a small amount of time to dedicate to this and would like to make a change that can be merged here, rather than maintaining our own fork. We'd really appreciate thoughts and ideas from existing maintainers and contributors.
All thoughts welcome 😄