Kalos cluster under-utilised #1

@HDRah

Hi, I computed the cluster utilisation rate (number of running GPUs / total number of GPUs) from each job's start time, end time, and GPU count. For Kalos, the maximum utilisation rate is only around 70%, and there are many periods where less than 40%, or even less than 20%, of the cluster's GPUs are in use. This is surprising and is not the case for Seren. I also noticed that the Seren data has ~800k job records, while Kalos has only ~60k. Does this mean that not all Kalos jobs are recorded, which would explain the apparent severe under-utilisation?
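
For reference, here is a minimal sketch of the kind of calculation described above. It is not the exact script used. The `(start, end, gpus)` tuple layout, the toy job list, and the `total_gpus` value are all hypothetical placeholders rather than the dataset's actual column names or the real cluster size.

```python
from collections import defaultdict

def gpu_utilisation(jobs, total_gpus):
    """Return (time, utilisation) steps from (start, end, gpus) job tuples.

    Each step gives the fraction of GPUs in use from that time until the
    next step. `total_gpus` is an assumed cluster size supplied by the caller.
    """
    deltas = defaultdict(int)
    for start, end, gpus in jobs:
        deltas[start] += gpus  # GPUs become busy when the job starts
        deltas[end] -= gpus    # and are released when the job ends

    running = 0
    steps = []
    for t in sorted(deltas):
        running += deltas[t]
        steps.append((t, running / total_gpus))
    return steps

# Hypothetical toy trace: (start_time, end_time, gpu_count), times in seconds
jobs = [(0, 10, 4), (5, 15, 2), (12, 20, 8)]
steps = gpu_utilisation(jobs, total_gpus=16)
peak = max(u for _, u in steps)
```

If some jobs are missing from the trace, their GPUs never enter `deltas`, so the computed rate would come out lower than the true one. That is the concern raised above.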

I would sincerely appreciate it if you could help clarify this. Thank you also for sharing this fantastic dataset.
