Scale time resources

**Is your feature request related to a problem? Please describe.**
Each rule currently declares a static run-time requirement, but real running times scale with the input file sizes. On top of that, the existing slurm profile hard-codes queues that are specific to the scicore slurm system, which might cause incompatibilities with other slurm systems.

**Describe the solution you'd like**

- [X] Use a time variable within each rule and scale the estimate with the input sizes.

Based on some existing runs, I can fit input size against runtime by least squares and use the resulting coefficient for scaling. I am not sure this will always work (multiple inputs, multiprocessing optimisation). Still, as we collect more information on successful runs, we can improve the accuracy of these estimates.
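A minimal sketch of such a fit, assuming a single total input size per run and a linear size-to-runtime relationship. The function names and the numbers in the usage note are hypothetical and are not measurements from this workflow.

```python
import math


def fit_runtime(sizes_mb, runtimes_min):
    """Ordinary least-squares fit of runtime (minutes) against input size (MB).

    Returns (slope, intercept) of the line runtime = slope * size + intercept.
    """
    n = len(sizes_mb)
    if n < 2 or n != len(runtimes_min):
        raise ValueError("need at least two (size, runtime) pairs of equal length")
    mean_x = sum(sizes_mb) / n
    mean_y = sum(runtimes_min) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes_mb)
    if sxx == 0:
        raise ValueError("all input sizes are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes_mb, runtimes_min))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_minutes(size_mb, slope, intercept, floor=1):
    """Round the fitted estimate up to whole minutes, never below `floor`."""
    return max(floor, math.ceil(slope * size_mb + intercept))
```

For example, `fit_runtime([100, 200, 300], [27, 52, 77])` on these made-up points yields a slope of 0.25 min/MB and an intercept of 2 min, which `estimate_minutes` can then apply to a new input size. Rules with multiple inputs or multi-threaded steps may need a different model.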

- [X] Check if the attempt keyword can be used to increase the time requirement upon restart.

It turns out that the attempt parameter is not accessible in the params field. The plan is to first calculate the time estimate in resources and then pass it on to the params field, so that it can be fed to the cluster.json.
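The plan above could be sketched roughly as follows. The helper name, the coefficients, the 5-minute floor and the multiply-by-attempt policy are all assumptions for illustration; the rule fragment in the trailing comment only shows where such a helper might be wired in and is not taken from the actual Snakefile.

```python
import math


def scaled_time_minutes(input_size_mb, attempt, slope, intercept, floor=5):
    """Time request in minutes, scaled with input size and retry attempt.

    `slope` and `intercept` are hypothetical per-rule coefficients.
    The request grows linearly with `attempt` (1 on the first try).
    """
    base = max(floor, math.ceil(slope * input_size_mb + intercept))
    return base * attempt


# Hypothetical use inside a rule (Snakefile syntax, shown as a comment):
#
#   resources:
#       time=lambda wildcards, input, attempt:
#           scaled_time_minutes(input.size_mb, attempt, 0.25, 2)
#   params:
#       time=lambda wildcards, resources: resources.time
```

Computing the value in resources keeps the attempt-dependent logic in one place, and params only forwards the result.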

- [X] Check if the time estimate is translated correctly in terms of the queueing system.
For the 6hour queue this works fine. Jobs that require less than 30 minutes still go to the 6hour queue; this behaviour is specific to this slurm instance and should be fine in general.
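If the estimate is kept in minutes, one way to hand it to slurm is to convert it to the `days-hours:minutes:seconds` form that `sbatch --time` accepts. This is only a sketch; whether the profile passes plain minutes or a formatted string is an assumption here, and the helper name is hypothetical.

```python
def slurm_time(minutes):
    """Format a whole number of minutes as D-HH:MM:SS for `sbatch --time`."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    days, rem = divmod(int(minutes), 24 * 60)
    hours, mins = divmod(rem, 60)
    return f"{days}-{hours:02d}:{mins:02d}:00"
```

With this helper, a 360-minute estimate becomes `0-06:00:00`, which matches the 6hour limit mentioned above.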


Implementation tasks:

- [x] Add params and resources time parameters and use standard times first
- [x] Fix the time parameter to be scaled to the input sizes by finding coefficient of scaling for a few rules (ensure the concept works).
- [ ] Fix the time parameter to be scaled to the input sizes by finding coefficient of scaling for each rule
- [x] Remove specific rule specifications in cluster.json files
- [ ] Run standard tests
- [ ] Run some real dataset to ensure the estimates are reasonable
- [ ] Check efficiency score of the workflow to observe if it has increased

**Describe alternatives you've considered**
If none of the above works, focus on eliminating the queue specification from the cluster.json.


