Commit 8908970

fix wrong b200 flops number (#1393)
This PR fixes what seems to be a wrong estimate of the peak FLOPS for the B200. With the current code, the peak FLOPS of the B200 is about 4.5x that of the H100, which seems off. It seems that the numbers reported in https://nvdam.widen.net/s/wwnsxrhm2w/blackwell-datasheet-3384703 are with 2:4 sparsity?
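The arithmetic behind this can be sketched as a quick sanity check. This is only an illustration, not code from the repository: it assumes that the `989e12` returned just above the B200 branch is the H100 figure the description compares against, and that 2:4 structured sparsity doubles the advertised peak throughput. The names `H100_PEAK`, `B200_OLD`, `SPARSITY_SPEEDUP`, and `dense_from_sparse` are hypothetical.

```python
# Hypothetical sanity check for the numbers discussed in this commit.
H100_PEAK = 989e12        # assumed: the H100 value returned just before the B200 branch
B200_OLD = 4.5e15         # value returned for B200 before this commit
SPARSITY_SPEEDUP = 2.0    # assumed: 2:4 sparsity doubles the advertised peak


def dense_from_sparse(sparse_flops: float, speedup: float = SPARSITY_SPEEDUP) -> float:
    """Convert a sparsity-inflated peak figure into a dense one."""
    return sparse_flops / speedup


b200_dense = dense_from_sparse(B200_OLD)
ratio_before = B200_OLD / H100_PEAK
ratio_after = b200_dense / H100_PEAK

print(f"B200 dense peak: {b200_dense:.3e}")
print(f"B200/H100 before fix: {ratio_before:.2f}x")
print(f"B200/H100 after fix:  {ratio_after:.2f}x")
```

Under these assumptions, halving the old figure gives the `2.25e15` that this commit switches to.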
1 parent 7f5c3b6 commit 8908970

1 file changed: +1 -1 lines changed

torchtitan/tools/utils.py

Lines changed: 1 addition & 1 deletion
@@ -97,7 +97,7 @@ def get_peak_flops(device_name: str) -> int:
             return 989e12
     elif "B200" in device_name:
         # data from https://nvdam.widen.net/s/wwnsxrhm2w/blackwell-datasheet-3384703
-        return 4.5e15
+        return 2.25e15
     elif "MI300X" in device_name or "MI325X" in device_name:
         # MI300X data from https://www.amd.com/en/products/accelerators/instinct/mi300/mi300x.html
         # MI325X data from https://www.amd.com/en/products/accelerators/instinct/mi300/mi325x.html

0 commit comments
