Add memspace "highest bandwidth" #408

Merged
5 commits merged on May 24, 2024

Conversation

kswiecicki
Contributor

@kswiecicki kswiecicki commented Apr 4, 2024

Description

This memspace contains memory targets ordered by the highest bandwidth value (targets higher in the ordering are prioritised).
There is also an alternative way to assign nodes to this memspace by setting UMF_MEMSPACE_HIGHEST_BANDWIDTH environment variable.

TODO

  • Add tests for "highest bandwidth" memspace from UMF_MEMSPACE_HIGHEST_BANDWIDTH
  • Verify functionality on QEMU runner

Checklist

  • Code compiles without errors locally
  • All tests pass locally
  • CI workflows execute properly
  • CI workflows, not executed per PR (e.g. Nightly), execute properly
  • New tests added, especially if they will fail without my changes
  • Extended the README/documentation
  • All newly added source files have a license
  • All newly added source files are referenced in CMake files
  • Logger (with debug/info/... messages) is used

@kswiecicki kswiecicki force-pushed the memspace-hbw branch 2 times, most recently from ec4a05b to 33dfb12 Compare April 4, 2024 14:36
@kswiecicki
Contributor Author

To use the HWLOC property query API, we need library version >= 2.3.0. Unfortunately, on the Ubuntu 20.04 runners the latest available package is version 2.1.0 (https://packages.ubuntu.com/search?suite=all&searchon=names&keywords=hwloc). @lukaszstolarczuk do you think we could use a script to build and install HWLOC from source instead of relying on the package manager? Or, alternatively, install the package from a newer repo, e.g. jammy (22.04 LTS)?

@lukaszstolarczuk
Contributor

Heh, we are using Ubuntu 22.04. I think the error you have is caused by a typo: you specified to look for >=2.30.0 😉
We have 2.7.0 installed: https://github.com/oneapi-src/unified-memory-framework/pull/408/checks#step:6:189

@kswiecicki
Contributor Author

My bad, I left the version at 2.30.0 while I was testing the FindLIBHWLOC.cmake module.
We are also using Ubuntu 20.04 runners, which have HWLOC version 2.1.0 installed via the apt package manager.
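
To illustrate why the typo made the check fail, here is a minimal sketch of a numeric version comparison. It is not the logic of FindLIBHWLOC.cmake; the helper names `parse_version` and `satisfies_minimum` are hypothetical. Version components compare as integers, so 2.7.0 satisfies ">= 2.3.0" but not ">= 2.30.0", and the 2.1.0 package on Ubuntu 20.04 satisfies neither.

```python
def parse_version(text):
    """Split a dotted version string such as "2.7.0" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(installed, required):
    """Return True when the installed version is at least the required one.

    Tuples compare element by element, so (2, 7, 0) < (2, 30, 0).
    """
    return parse_version(installed) >= parse_version(required)


if __name__ == "__main__":
    # Versions mentioned in this thread: 2.7.0 (Ubuntu 22.04), 2.1.0 (Ubuntu 20.04).
    for installed in ("2.7.0", "2.1.0"):
        for required in ("2.3.0", "2.30.0"):
            ok = satisfies_minimum(installed, required)
            print(f"hwloc {installed} >= {required}: {ok}")
```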

@lukaszstolarczuk
Contributor

Ok, right. For the few jobs using Ubuntu 20.04 we can do what you proposed: install from source. If this ends up being more than a few lines of code, please make it a Python script, e.g. install-hwloc.py
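
A rough sketch of what such a script could look like follows. It is not the script that was added in this PR. The download URL pattern, the default install prefix, and the function names are assumptions, and `make install` may need elevated privileges depending on the prefix.

```python
import subprocess


def hwloc_build_commands(version="2.3.0", prefix="/usr/local", jobs=4):
    """Return (argv, working_directory) pairs for a source build of hwloc.

    The URL layout below is an assumption about the hwloc release server.
    """
    series = ".".join(version.split(".")[:2])  # "2.3.0" -> "2.3"
    tarball = f"hwloc-{version}.tar.gz"
    url = f"https://download.open-mpi.org/release/hwloc/v{series}/{tarball}"
    src_dir = f"hwloc-{version}"
    return [
        (["curl", "-fsSLO", url], None),             # download the tarball
        (["tar", "-xzf", tarball], None),            # unpack into src_dir
        (["./configure", f"--prefix={prefix}"], src_dir),
        (["make", f"-j{jobs}"], src_dir),
        (["make", "install"], src_dir),
    ]


def install_hwloc(**kwargs):
    """Run each build step in order and stop at the first failure."""
    for argv, cwd in hwloc_build_commands(**kwargs):
        subprocess.run(argv, cwd=cwd, check=True)
```

Keeping the command list separate from the code that runs it makes the steps easy to inspect or log in CI before anything is executed.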

@kswiecicki kswiecicki force-pushed the memspace-hbw branch 2 times, most recently from 040868d to ecdb5d0 Compare April 8, 2024 12:40
@kswiecicki
Contributor Author

kswiecicki commented Apr 9, 2024

The original implementation of the highest bandwidth memspace was supposed to contain NUMA nodes (represented as memory targets in UMF) sorted by either the average or maximum (it was not yet decided) bandwidth value. This value was obtained by considering each NUMA node as the initiator for the bandwidth query and selecting all NUMA nodes as targets consecutively.
Since control over allocation placement is restricted to the nodemask provided to the membind function in the OS provider, we were unable to prioritize NUMA nodes with better bandwidth values. In that case, the closest nodes contained in the nodemask would be prioritized. For now, we've settled on the highest bandwidth memspace implementation that contains an aggregated list of NUMA nodes identified as the best targets after selecting each NUMA node as the initiator.
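
The aggregation described above can be sketched as follows. This is an illustration only, not the UMF implementation: the `bandwidth` mapping stands in for the values that would come from the HWLOC bandwidth query, the function name is hypothetical, and keeping every tied target is an assumption.

```python
def highest_bandwidth_nodes(bandwidth):
    """Aggregate the best target node(s) across all initiators.

    bandwidth[initiator][target] is the bandwidth from initiator to target.
    Returns the sorted IDs of nodes that are a best target for at least
    one initiator. Ties are all kept (an assumption of this sketch).
    """
    best = set()
    for initiator, targets in bandwidth.items():
        if not targets:
            continue  # no bandwidth data for this initiator
        top = max(targets.values())
        best.update(node for node, value in targets.items() if value == top)
    return sorted(best)


if __name__ == "__main__":
    # Hypothetical topology: node 2 is high-bandwidth memory reachable by all.
    hbm = {
        0: {0: 100, 1: 50, 2: 200},
        1: {0: 50, 1: 100, 2: 200},
        2: {0: 50, 1: 50, 2: 200},
    }
    print(highest_bandwidth_nodes(hbm))

    # Hypothetical symmetric topology: each node is its own best target.
    symmetric = {0: {0: 100, 1: 50}, 1: {0: 50, 1: 100}}
    print(highest_bandwidth_nodes(symmetric))
```

In the first topology only node 2 ends up in the memspace; in the second, both nodes do, and placement among them is left to the nodemask passed to membind.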

This comment serves as documentation for the development of the memspace API.
@igchor would you like to add something?

@ldorau ldorau mentioned this pull request Apr 10, 2024
3 tasks
@kswiecicki kswiecicki force-pushed the memspace-hbw branch 4 times, most recently from 403c1d7 to 3a59a29 Compare May 10, 2024 09:40
@kswiecicki kswiecicki marked this pull request as ready for review May 10, 2024 11:53
@kswiecicki kswiecicki requested a review from a team as a code owner May 10, 2024 11:54
@kswiecicki kswiecicki force-pushed the memspace-hbw branch 3 times, most recently from 4ea46aa to 22638d3 Compare May 16, 2024 14:19
@kswiecicki
Contributor Author

kswiecicki commented May 16, 2024

@kswiecicki could you add a test on QEMU that uses the highest bandwidth memspace and verifies that, in case of equal bandwidths between nodes, the local node is preferred?

The test should spawn multiple threads, pin every thread to a different NUMA node, and then allocate memory from each thread. The allocated memory should come from the local node.

Done. I've added the test case, but there's no QEMU configuration that satisfies the equal bandwidth requirement. I think @KFilipek is working on adding new topologies for testing.

@kswiecicki kswiecicki force-pushed the memspace-hbw branch 3 times, most recently from a48450c to d7de5c6 Compare May 20, 2024 09:21
@bratpiorka bratpiorka requested review from ldorau and lplewa May 20, 2024 12:21
@kswiecicki kswiecicki force-pushed the memspace-hbw branch 2 times, most recently from d443128 to 2e118cf Compare May 22, 2024 10:05
@kswiecicki kswiecicki mentioned this pull request May 22, 2024
9 tasks
Contributor

@lukaszstolarczuk lukaszstolarczuk left a comment

👍

Contributor

@lplewa lplewa left a comment

Do those functions work on Windows? You did not update the .def file.

@kswiecicki
Contributor Author

Memspace functionality is currently built only on Linux platforms.

This memspace contains an aggregated list of NUMA nodes identified as the best targets after selecting each NUMA node as the initiator. Querying the bandwidth value requires HMAT support on the platform; calling umfMemspaceHighestBandwidthGet() returns NULL if it is not supported.

Those tests are skipped with GTEST_SKIP() when the bandwidth property can't be queried (HMAT is not supported on the platform). This makes it easier to see which tests were actually completed without skipping.
@lplewa lplewa merged commit 9c59759 into oneapi-src:main May 24, 2024
67 checks passed
7 participants