@lijh5 I would guess the issue is memory copy bandwidth. MPI needs a memory copy, at least on the receiver side, whereas the basic ib_send_bw test does not.
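To get a rough feel for whether host memory copy bandwidth is in the same range as the numbers in the question, a minimal sketch like the following can help. It is only a ballpark: it times a large single-threaded buffer copy from Python, which is not the same as the per-message copy done inside MPI/UCX. The function name `copy_bandwidth_mb_s` and the buffer size are assumptions made for illustration.

```python
import time


def copy_bandwidth_mb_s(buf_bytes=64 * 1024 * 1024, repeats=5):
    """Estimate single-thread memory copy bandwidth in MB/s (1 MB = 1e6 bytes).

    Hypothetical helper: copies one large buffer into another several
    times and reports the best (fastest) run.
    """
    src = bytearray(buf_bytes)
    dst = bytearray(buf_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # same-length slice assignment, a plain buffer copy
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return buf_bytes / best / 1e6


if __name__ == "__main__":
    print(f"memory copy: {copy_bandwidth_mb_s():.0f} MB/s")
```

If the result is of the same order as the osu_bw plateau, that would be consistent with the copy being the bottleneck, though it would not prove it.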
Performance issues with UCX UD transmission when packet size is 4K
bytes BW average[MB/sec]
2 11.17
4 23.28
8 46.95
16 93.43
32 187.06
64 374.52
128 750.02
256 1500.93
512 2992.35
1024 5976.25
2048 11844.51
4096 15569.28
Size Bandwidth (MB/s)
2 9.28
4 19.05
8 38.43
16 69.23
32 138.07
64 272.90
128 503.77
256 857.66
512 2063.33
1024 4201.60
2048 8411.68
4096 9389.34
average bandwidth (MB/s)
9046.25
9864.50
9865.95
9863.96
9852.61
average bandwidth (MB/s)
10858.72
10861.54
10852.11
10863.82
As can be seen, osu_bw performance is relatively low, especially at 4K, and testing with ucx_perftest shows no improvement either. Are there any optimization methods you would recommend?
Can ucx_perftest only test one message size at a time? If I want to test message sizes of 2-4K, how should I write the command?
Thank you very much!
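To put a number on the gap, the two per-size tables above can be compared directly. The sketch below copies the values from those tables and prints the ratio of the second table (the osu_bw-style output) to the first. Which tool produced the first table is not stated in the post, so the name `baseline_mb_s` is an assumption, as is the helper `bandwidth_ratio`.

```python
# Values copied from the two tables in the post (message size in bytes -> MB/s).
baseline_mb_s = {
    2: 11.17, 4: 23.28, 8: 46.95, 16: 93.43, 32: 187.06, 64: 374.52,
    128: 750.02, 256: 1500.93, 512: 2992.35, 1024: 5976.25,
    2048: 11844.51, 4096: 15569.28,
}
osu_bw_mb_s = {
    2: 9.28, 4: 19.05, 8: 38.43, 16: 69.23, 32: 138.07, 64: 272.90,
    128: 503.77, 256: 857.66, 512: 2063.33, 1024: 4201.60,
    2048: 8411.68, 4096: 9389.34,
}


def bandwidth_ratio(reference, measured):
    """Return measured/reference for every size present in both tables."""
    return {size: measured[size] / reference[size]
            for size in sorted(reference) if size in measured}


if __name__ == "__main__":
    for size, ratio in bandwidth_ratio(baseline_mb_s, osu_bw_mb_s).items():
        print(f"{size:>5} bytes  {ratio:.2f}")
```

At 4096 bytes the second table reaches roughly 60% of the first, which is the lowest ratio in the sweep apart from the 256-byte row.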
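One possible way to cover several message sizes is to run ucx_perftest once per size from a small script. The sketch below only builds the command lines, using the `-t` (test name) and `-s` (message size) options. The test name `tag_bw`, the host name `server-host`, and the power-of-two range from 2 to 4096 bytes (chosen to match the tables above) are all assumptions. The server side may need to be restarted for each run; `ucx_perftest -h` lists the options actually available in a given build.

```python
import shlex


def message_sizes(lo=2, hi=4096):
    """Yield power-of-two message sizes from lo to hi inclusive."""
    size = lo
    while size <= hi:
        yield size
        size *= 2


def build_commands(server, test="tag_bw", lo=2, hi=4096):
    """Build one ucx_perftest client command per message size.

    `server` is a hypothetical host name; `test` is an assumed test name.
    """
    return [
        ["ucx_perftest", "-t", test, "-s", str(size), server]
        for size in message_sizes(lo, hi)
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for cmd in build_commands("server-host"):
        print(shlex.join(cmd))
```

Each printed line can be run by hand, or passed to `subprocess.run` once the server side is listening.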