
Compute time doesn't seem to scale well with increasing number of threads past a certain point #92

@vlandau


I have noticed scaling issues with Omniscape's multithreading: once the number of threads gets high enough, compute time actually starts to increase. The problem I was using is quite large, and I'm running it on an expensive VM, so instead of recording actual compute times, I'm reporting the projected compute time from ProgressMeter.jl after letting the job run long enough for the ETA to stabilize. These benchmarks were run on an Azure VM with 64 logical cores (32 physical cores) and 256GB RAM:

  • 63 threads: ~5hr 15m
  • 32 threads: ~3hr 45m

☝🏻 This mostly makes sense, as only using physical cores could make for more efficient use of the processors.

It gets a bit stranger when switching to a VM with 32 logical cores (16 physical cores) and 128GB RAM. Both VMs use Intel Xeon processors, so there shouldn't be any difference in single-thread processor speed. I would expect using 63 threads to be faster than using 31, and I'd also expect, based on the above, that on a machine with 32 logical cores and 16 physical cores, using 16 threads would similarly outperform using 31 threads. However, that is not the case. Using 16 threads ran about as fast as 31 threads, not meaningfully faster. The 16-thread job is also not much slower than the 32-thread job above.

  • 31 threads: ~4hr 45m
  • 16 threads: ~4hr 35m
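To put the four projections side by side, here is a rough sketch that converts each ETA into total thread-minutes and an efficiency relative to the 16-thread run. The choice of the 16-thread run as the baseline, the function names, and the cross-VM comparison are my own assumptions for illustration (the latter relies on single-thread speed being the same on both VMs), and the inputs are ProgressMeter.jl ETAs rather than measured wall times.

```python
# Projected compute times reported in this issue, converted to minutes.
# Each entry: (VM description, threads, projected minutes).
runs = [
    ("64 logical / 32 physical", 63, 5 * 60 + 15),
    ("64 logical / 32 physical", 32, 3 * 60 + 45),
    ("32 logical / 16 physical", 31, 4 * 60 + 45),
    ("32 logical / 16 physical", 16, 4 * 60 + 35),
]


def thread_minutes(threads, minutes):
    """Total thread-minutes consumed by a run."""
    return threads * minutes


def relative_efficiency(run, baseline):
    """Thread-minutes of the baseline divided by thread-minutes of the run.

    A value of 1.0 means the run scales linearly relative to the baseline;
    smaller values mean the extra threads are being used less effectively.
    """
    _, n_run, t_run = run
    _, n_base, t_base = baseline
    return thread_minutes(n_base, t_base) / thread_minutes(n_run, t_run)


# Assumption: use the 16-thread run as the reference point.
baseline = runs[3]

for run in runs:
    vm, threads, minutes = run
    eff = relative_efficiency(run, baseline)
    print(f"{threads:>2} threads on {vm}: {minutes} min, "
          f"efficiency vs 16 threads = {eff:.2f}")
```

Under linear scaling every row would report an efficiency near 1.0, so values well below that indicate how much of the added thread count is not translating into throughput.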

This Omniscape run used a moving window size of 668, so about 1.4M pixels per Circuitscape solve. This means that Circuitscape solve time far exceeds the overhead from parallel processing.
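For reference, the ~1.4M figure can be reproduced with a quick calculation, assuming the window size of 668 is a radius in pixels and the moving window is circular (both are assumptions in this sketch, and the variable names are illustrative only):

```python
import math

# Assumption: 668 is the moving window radius in pixels, and the window is a circle.
radius_px = 668

# Area of the circular window in pixels, i.e. the approximate size of each solve.
pixels_per_window = math.pi * radius_px ** 2

print(f"~{pixels_per_window / 1e6:.1f}M pixels per moving window")
```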

I'm hoping there may be ways to make Omniscape scale more favorably with increasing number of threads. Things like continental-scale analyses may not be possible at this time given these numbers. The best solution may involve hierarchical parallel processing, but maybe there are some simpler steps that could be taken to improve scaling.

cc @ViralBShah @ranjanan

Labels: performance (Related to compute and memory efficiency)
