Description
This issue opens a discussion on pipeline.ini files, with the aim of agreeing on a convention that will work for input validation.
As currently envisaged, input validation would run directly on the ini file before anything else happens.
1. How to handle file paths
There have been two suggestions:
- Require a full path for every file that is used
  - Easy to write a validation script
  - Portable, no cgat file structure required
  - Would break with current practice: e.g. /path/to/fasta would replace the genome_dir and genome variables
- Require a "file(name)" prefix/suffix in parameter name
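To make the second suggestion concrete, here is a minimal sketch of what a validator could look like if file parameters were marked by a naming convention. The `_file` suffix, the function name `find_missing_files`, and the use of ini sections are assumptions for illustration only, not an agreed convention.

```python
import configparser
import os


def find_missing_files(ini_text, marker="_file"):
    """Return (section, option, path) for each marked option whose path is not a file.

    The "_file" suffix is a hypothetical convention, not something already agreed.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    missing = []
    for section in parser.sections():
        for option, value in parser.items(section):
            # Only options whose name ends with the marker are treated as file paths.
            if option.endswith(marker) and not os.path.isfile(value):
                missing.append((section, option, value))
    return missing
```

With a convention like this the validator needs no knowledge of individual pipelines, but every file parameter in every pipeline.ini would have to be renamed to carry the marker.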
2. How to handle common directories
As per Ian's comment in pull request #331, providing the directory once may be desirable for directories with multiple required files in them. Any ideas on handling something like this using input validation would be helpful:
feature_dir=/path/to/dir
feature1=name1.file
feature2=name2.file
And what about this case, where Python later assembles basename into multiple files, basename.file1 and basename.file2?
feature_dir=/path/to/dir
feature=basename
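One possible way to handle both cases is sketched below, assuming the validator is told which options belong to which directory and which suffixes a basename expands to. The helper names `resolve_in_dir` and `expand_basename` and the plain-dict input are hypothetical and used here only for illustration.

```python
import os


def resolve_in_dir(options, dir_key, name_keys):
    """Join each named option onto the shared directory option."""
    base = options[dir_key]
    return {key: os.path.join(base, options[key]) for key in name_keys}


def expand_basename(options, dir_key, name_key, suffixes):
    """Build the paths that a basename option would later be expanded into."""
    stem = os.path.join(options[dir_key], options[name_key])
    return [stem + suffix for suffix in suffixes]


# First case: one directory, several file names.
options = {"feature_dir": "/path/to/dir",
           "feature1": "name1.file",
           "feature2": "name2.file"}
paths = resolve_in_dir(options, "feature_dir", ["feature1", "feature2"])

# Second case: one directory, one basename, suffixes supplied by the caller.
expanded = expand_basename({"feature_dir": "/path/to/dir", "feature": "basename"},
                           "feature_dir", "feature", [".file1", ".file2"])
```

The resulting paths could then be checked with `os.path.isfile`. Note that in the basename case the suffix list has to come from somewhere, so the validator would need information that currently lives only in the pipeline code.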
3. How to deal with defaults
Options
- Keep defaults
- Empty file with suggestions in comments (+/- a filled-in example)
4. How to deal with mandatory input
Ideas that can be parsed by an input validation script
- add "?" for mandatory input and provide user with example input
- add "req" or similar suffix/prefix to the parameter name
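As a rough sketch of the first idea, a validator could report every option that is still set to the "?" placeholder. The function name `find_unset_mandatory` and the sectioned ini layout are assumptions for illustration.

```python
import configparser


def find_unset_mandatory(ini_text, placeholder="?"):
    """Return (section, option) for each option still set to the placeholder."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    unset = []
    for section in parser.sections():
        for option, value in parser.items(section):
            # An untouched mandatory option still holds the bare placeholder.
            if value.strip() == placeholder:
                unset.append((section, option))
    return unset
```

The "req" prefix/suffix idea could be checked the same way by testing the option name instead of the value, for example treating an empty value on a marked option as an error.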
Ultimately, the question is, do we want to do this at all?
Apart from having to change all pipeline.ini files and the pipelines (depending on choices), it would also require reconfiguring all your existing inis.