Scale time resources #59

@mkatsanto


Is your feature request related to a problem? Please describe.
Each rule currently requests a static amount of run time, but real running times scale with the size of the input files. In addition, the existing slurm profile specifies queues that are specific to the scicore slurm system, which might cause incompatibilities with other slurm systems.

Describe the solution you'd like

  • Make use of a time variable within each rule, and scale the estimate with the input sizes.

Based on some existing runs, I can fit runtime against input size by least squares and use the resulting coefficient for scaling. I am not sure this will always work (multiple inputs, multiprocessing optimisation). Still, as we gather more information from successful runs, we can improve the accuracy of these estimates.
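
A minimal sketch of the fit described above, assuming a simple linear model (runtime = slope × size + intercept) per rule. The function names, the safety factor and the floor are illustrative assumptions, not part of the workflow; the (size, runtime) pairs would come from logs of successful runs.

```python
import math
from statistics import mean


def fit_runtime(sizes_mb, runtimes_min):
    """Ordinary least squares fit of runtime = slope * size + intercept."""
    if len(sizes_mb) != len(runtimes_min) or len(sizes_mb) < 2:
        raise ValueError("need at least two (size, runtime) pairs")
    mx, my = mean(sizes_mb), mean(runtimes_min)
    sxx = sum((x - mx) ** 2 for x in sizes_mb)
    if sxx == 0:
        raise ValueError("all input sizes are identical; slope is undefined")
    sxy = sum((x - mx) * (y - my) for x, y in zip(sizes_mb, runtimes_min))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def estimate_minutes(size_mb, slope, intercept, safety=1.5, floor=5):
    """Predicted runtime in whole minutes, padded by a safety factor.

    `safety` and `floor` are assumed values; they guard against
    underestimates and against requesting unrealistically short jobs.
    """
    return max(floor, math.ceil(safety * (slope * size_mb + intercept)))
```

A single coefficient per rule will not capture rules with several inputs of different kinds or rules whose runtime depends on the thread count, so the padding factor matters more than the exact fit.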

  • Check if the attempt keyword can be used to increase the time requirement upon restart.

It turns out that the attempt parameter is not accessible in the params field. The plan is to first calculate the time estimate in resources and then pass it on to the params field, so that it can then be fed to the cluster.json.
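
The plan above could look roughly like the following sketch. It assumes that Snakemake passes `wildcards`, `input` and `attempt` to callables in `resources`, and that a `params` callable may request `resources`. The rule name `align_reads`, the coefficient table and the helper names are hypothetical.

```python
import math

# Hypothetical per-rule coefficients: (minutes per MB, fixed overhead in minutes).
COEFFS = {"align_reads": (0.25, 10.0)}


def time_minutes(rule_name, size_mb, attempt, default=(0.0, 30.0)):
    """Time request in minutes, multiplied by the attempt number on restart."""
    slope, intercept = COEFFS.get(rule_name, default)
    base = math.ceil(slope * size_mb + intercept)
    return base * attempt


def make_time_resource(rule_name):
    """Build a callable with the argument names Snakemake inspects."""
    def _time(wildcards, input, attempt):
        # input.size_mb is the total size of the rule's input files in MB.
        return time_minutes(rule_name, input.size_mb, attempt)
    return _time


# In a Snakefile this might be wired up as follows (not executed here):
#
#   rule align_reads:
#       input: ...
#       resources:
#           time=make_time_resource("align_reads")
#       params:
#           # params cannot see `attempt`, but it can read the resource
#           time=lambda wildcards, resources: resources.time
```

With this arrangement the cluster.json only needs to reference the rule's time parameter, and a restarted job requests proportionally more time without any rule-specific entry.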

  • Check whether the time estimate is translated correctly by the queueing system.
    For the 6hour queue this works fine. Jobs that require less than 30 minutes still go to the 6hour queue; this is specific to this slurm instance and should be fine in general.
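
To illustrate the translation step, here is a hedged sketch that formats an estimate in minutes as a `D-HH:MM:SS` string (one of the formats accepted by `sbatch --time`) and maps it to a queue tier. The tier names and limits in `QUEUE_LIMITS` are assumptions chosen for illustration, not the actual configuration of any cluster.

```python
def slurm_time(minutes):
    """Format a duration in minutes as D-HH:MM:SS."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    days, rem = divmod(int(minutes), 24 * 60)
    hours, mins = divmod(rem, 60)
    return f"{days}-{hours:02d}:{mins:02d}:00"


# Hypothetical tiers as (name, limit in minutes), sorted by limit.
QUEUE_LIMITS = [
    ("6hours", 6 * 60),
    ("1day", 24 * 60),
    ("1week", 7 * 24 * 60),
]


def pick_queue(minutes):
    """Return the first tier whose limit covers the requested time."""
    for name, limit in QUEUE_LIMITS:
        if minutes <= limit:
            return name
    raise ValueError(f"no queue allows {minutes} minutes")
```

For example, `slurm_time(390)` gives `0-06:30:00`, and `pick_queue(20)` returns `6hours`, matching the observation that short jobs still land in the 6hour queue. On a different slurm system it may be enough to pass only the time string and let the scheduler choose the queue.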

Implementation tasks:

  • Add time parameters to params and resources, using standard times first
  • Scale the time parameter to the input sizes by finding the scaling coefficient for a few rules (to ensure the concept works)
  • Scale the time parameter to the input sizes by finding the scaling coefficient for each rule
  • Remove specific rule specifications in cluster.json files
  • Run standard tests
  • Run some real dataset to ensure the estimates are reasonable
  • Check efficiency score of the workflow to observe if it has increased

Describe alternatives you've considered
If none of the above works, focus on eliminating the queue specification from the cluster.json.


Labels: future, will not be fixed for NOW
