New command-line dh-validator.py tool for validating csv, tsv, xls, xlsx data files against a schema.yaml file #450
ddooley
announced in
Announcements
Replies: 0 comments
A new command-line dh-validate.py script simplifies the validation of DataHarmonizer-generated csv, tsv, xls, and xlsx files. We look forward to your feedback on using it in the comments below.
Basically, the linkml-validate command works well for data in .json or .yaml format, but tabular csv, tsv, xls, and xlsx inputs often fail to validate for two main reasons. dh-validator.py resolves both by generating a temporary .yaml version of the tabular input, with the necessary adjustments made according to the given schema, and then sending that file to linkml-validate for processing. The following adjustments are made:
We will be evolving this script to report any mismatched columns/fields, which will make it easier, for example, to validate older tabular data against a newer LinkML schema version.
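For readers who want a feel for the workflow described above, here is a minimal sketch of the general idea. It is not the actual dh-validator.py code. The function names (`read_rows`, `unknown_columns`, `write_temp_yaml`, `build_command`), the "drop empty cells" adjustment, the list-of-rows layout, and the `Sample` class name are all assumptions made for illustration. The only external piece is the linkml-validate command with its `--schema` and `--target-class` options.

```python
import csv
import io
import json
import tempfile
from pathlib import Path


def read_rows(text, delimiter=","):
    """Parse delimited text into a list of dicts, one per data row.

    Illustrative adjustment only: empty cells are dropped so they are not
    passed on as empty strings. The real script may adjust values differently.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [
        {k: v for k, v in row.items() if k is not None and v not in (None, "")}
        for row in reader
    ]


def unknown_columns(rows, schema_slots):
    """Return column names found in the data but absent from the schema."""
    seen = {key for row in rows for key in row}
    return sorted(seen - set(schema_slots))


def write_temp_yaml(rows):
    """Write rows to a temporary .yaml file and return its path.

    JSON is valid YAML, so the standard-library json module is used here to
    keep the sketch free of third-party dependencies.
    """
    tmp = tempfile.NamedTemporaryFile(
        "w", suffix=".yaml", delete=False, encoding="utf-8"
    )
    with tmp:
        json.dump(rows, tmp, indent=2)
    return Path(tmp.name)


def build_command(schema_path, data_path, target_class):
    """Build the linkml-validate command line as an argument list."""
    return [
        "linkml-validate",
        "--schema", str(schema_path),
        "--target-class", target_class,
        str(data_path),
    ]


if __name__ == "__main__":
    rows = read_rows("sample_id,host\nS1,\nS2,human\n")
    print(unknown_columns(rows, ["sample_id"]))
    data_path = write_temp_yaml(rows)
    # Pass this list to subprocess.run(...) to actually invoke the validator.
    print(build_command("schema.yaml", data_path, "Sample"))
```

The command is built as a list rather than a shell string so it can be handed to `subprocess.run` without quoting issues. xls and xlsx inputs would need a spreadsheet reader and are left out of this sketch.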