Project Meeting 2018.11.16

Ben Stabler edited this page Nov 16, 2018 · 14 revisions

Multiprocessing

  • Lots of improvements for debugging/tracing/logging/etc. while we continue to optimize/understand runtime performance
  • Can now run in separate instances (on separate machines if desired) and coalesce into one pipeline file
  • New stride run option: for example, slice households into 5 samples and run only the first
  • Running a single 1/5th stride with multiprocessing takes 52 minutes and uses 1/5th of the CPUs and 1/5th of the RAM
  • Doesn't scale, though: running two strides at once takes 72 minutes
  • Could still run all 5 strides simultaneously for a complete run in 52 minutes, but that would require writing a distributed management setup
  • Maybe try a different low-level C shared library, such as OpenBLAS instead of MKL
  • Working on running a few big cloud-based runs using our Azure DevOps account
  • Next, try two strides at once on Linux, since it may behave differently
  • Best run is 130 minutes with 20 processors on Azure
  • Working toward a deployment recommendations memo based on our findings
  • Plan to wrap up the task by the end of next week
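
As a rough illustration of the stride idea noted above, the sketch below slices a list of household IDs into 5 interleaved samples and picks the first. It is only a minimal sketch: the function name `stride_sample`, its parameters, and the toy ID list are hypothetical and are not taken from the project's code, which may slice households differently.

```python
def stride_sample(household_ids, num_strides, offset):
    """Return every num_strides-th household ID, starting at position `offset`.

    Hypothetical helper for illustration only; not the project's API.
    """
    if num_strides < 1:
        raise ValueError("num_strides must be at least 1")
    if not 0 <= offset < num_strides:
        raise ValueError("offset must satisfy 0 <= offset < num_strides")
    return household_ids[offset::num_strides]


# Toy data: 20 household IDs standing in for the full household table.
household_ids = list(range(1, 21))

# Slice into 5 strides; each holds roughly 1/5th of the households.
strides = [stride_sample(household_ids, 5, k) for k in range(5)]

# Run only the first stride, as in the single-stride test described above.
first_stride = strides[0]
print(first_stride)
```

Because the strides are disjoint and together cover every household, results from separate stride runs could in principle be combined afterward, which is what running all 5 simultaneously would rely on.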