
Commit 9893b21

Results from GH actions on NVIDIA_RTX4090x1
1 parent 5750ec8 commit 9893b21


49 files changed: +1156 −1057 lines

open/GATEOverflow/measurements/RTX4090x1-nvidia-gpu-TensorRT-default_config/stable-diffusion-xl/offline/README.md

Lines changed: 4 additions & 4 deletions
@@ -17,7 +17,7 @@ pip install -U mlcflow
 
 mlc rm cache -f
 
-mlc pull repo mlcommons@mlperf-automations --checkout=2c455164ff7d00c2a6c1b369471d91f4ba6181b9
+mlc pull repo mlcommons@mlperf-automations --checkout=06b95fa9f0b3e5cedf5295a7b630442b2f9ffac3
 
 
 ```
@@ -38,8 +38,8 @@ Platform: RTX4090x1-nvidia-gpu-TensorRT-default_config
 Model Precision: int8
 
 ### Accuracy Results
-`CLIP_SCORE`: `31.27703`, Required accuracy for closed division `>= 31.68632` and `<= 31.81332`
-`FID_SCORE`: `23.13429`, Required accuracy for closed division `>= 23.01086` and `<= 23.95008`
+`CLIP_SCORE`: `31.27781`, Required accuracy for closed division `>= 31.68632` and `<= 31.81332`
+`FID_SCORE`: `23.12385`, Required accuracy for closed division `>= 23.01086` and `<= 23.95008`
 
 ### Performance Results
-`Samples per second`: `0.698311`
+`Samples per second`: `0.694579`
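
Note that the updated `CLIP_SCORE` (31.27781) is still below the closed-division lower bound of 31.68632, while the `FID_SCORE` is inside its range; that is consistent with this result sitting under the `open/` tree. As an illustrative sketch (not part of the MLPerf tooling; the helper name and the bounds table are taken from the README text above), the bound check amounts to:

```python
# Hypothetical helper: validate reported scores against the closed-division
# bounds quoted in the README diff above.

CLOSED_DIVISION_BOUNDS = {
    # metric: (lower bound, upper bound)
    "CLIP_SCORE": (31.68632, 31.81332),
    "FID_SCORE": (23.01086, 23.95008),
}

def within_closed_bounds(metric: str, value: float) -> bool:
    """Return True if `value` satisfies the closed-division range for `metric`."""
    lo, hi = CLOSED_DIVISION_BOUNDS[metric]
    return lo <= value <= hi

# FID_SCORE (23.12385) is in range; CLIP_SCORE (31.27781) is below the
# lower bound, so this run would not qualify for the closed division.
print(within_closed_bounds("CLIP_SCORE", 31.27781))  # False
print(within_closed_bounds("FID_SCORE", 23.12385))   # True
```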
(second changed file: harness console log; filename hidden in the large-commit view)

@@ -1,30 +1,30 @@
-[2025-04-01 09:20:01,806 main.py:229 INFO] Detected system ID: KnownSystem.Nvidia_9654bf8fd8c2
+[2025-05-01 13:34:04,888 main.py:229 INFO] Detected system ID: KnownSystem.ff13c90c32e9
 /home/mlcuser/.local/lib/python3.8/site-packages/torchvision/datapoints/__init__.py:12: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
 warnings.warn(_BETA_TRANSFORMS_WARNING)
 /home/mlcuser/.local/lib/python3.8/site-packages/torchvision/transforms/v2/__init__.py:54: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
 warnings.warn(_BETA_TRANSFORMS_WARNING)
-[2025-04-01 09:20:02,828 generate_conf_files.py:107 INFO] Generated measurements/ entries for Nvidia_9654bf8fd8c2_TRT/stable-diffusion-xl/Offline
-[2025-04-01 09:20:02,828 __init__.py:46 INFO] Running command: python3 -m code.stable-diffusion-xl.tensorrt.harness --logfile_outdir="/mlc-mount/home/arjun/gh_action_results/valid_results/RTX4090x1-nvidia_original-gpu-tensorrt-vdefault-default_config/stable-diffusion-xl/offline/accuracy" --logfile_prefix="mlperf_log_" --performance_sample_count=5000 --test_mode="AccuracyOnly" --gpu_batch_size=2 --mlperf_conf_path="/home/mlcuser/MLC/repos/local/cache/get-git-repo_08fd7192/inference/mlperf.conf" --tensor_path="build/preprocessed_data/coco2014-tokenized-sdxl/5k_dataset_final/" --use_graphs=true --user_conf_path="/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/224dc7c9e7a64446a0b0faae629b8461.conf" --gpu_inference_streams=1 --gpu_copy_streams=1 --gpu_engines="./build/engines/Nvidia_9654bf8fd8c2/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIP-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan,./build/engines/Nvidia_9654bf8fd8c2/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIPWithProj-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan,./build/engines/Nvidia_9654bf8fd8c2/stable-diffusion-xl/Offline/stable-diffusion-xl-UNetXL-Offline-gpu-b2-int8.custom_k_99_MaxP.plan,./build/engines/Nvidia_9654bf8fd8c2/stable-diffusion-xl/Offline/stable-diffusion-xl-VAE-Offline-gpu-b2-fp32.custom_k_99_MaxP.plan" --scenario Offline --model stable-diffusion-xl
-[2025-04-01 09:20:02,828 __init__.py:53 INFO] Overriding Environment
+[2025-05-01 13:34:05,929 generate_conf_files.py:107 INFO] Generated measurements/ entries for ff13c90c32e9_TRT/stable-diffusion-xl/Offline
+[2025-05-01 13:34:05,929 __init__.py:46 INFO] Running command: python3 -m code.stable-diffusion-xl.tensorrt.harness --logfile_outdir="/mlc-mount/home/arjun/gh_action_results/valid_results/RTX4090x1-nvidia_original-gpu-tensorrt-vdefault-default_config/stable-diffusion-xl/offline/accuracy" --logfile_prefix="mlperf_log_" --performance_sample_count=5000 --test_mode="AccuracyOnly" --gpu_batch_size=2 --mlperf_conf_path="/home/mlcuser/MLC/repos/local/cache/get-git-repo_08fd7192/inference/mlperf.conf" --tensor_path="build/preprocessed_data/coco2014-tokenized-sdxl/5k_dataset_final/" --use_graphs=true --user_conf_path="/home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/a8936f5611ea4b9385db41cbaee6977a.conf" --gpu_inference_streams=1 --gpu_copy_streams=1 --gpu_engines="./build/engines/ff13c90c32e9/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIP-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan,./build/engines/ff13c90c32e9/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIPWithProj-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan,./build/engines/ff13c90c32e9/stable-diffusion-xl/Offline/stable-diffusion-xl-UNetXL-Offline-gpu-b2-int8.custom_k_99_MaxP.plan,./build/engines/ff13c90c32e9/stable-diffusion-xl/Offline/stable-diffusion-xl-VAE-Offline-gpu-b2-fp32.custom_k_99_MaxP.plan" --scenario Offline --model stable-diffusion-xl
+[2025-05-01 13:34:05,929 __init__.py:53 INFO] Overriding Environment
 /home/mlcuser/.local/lib/python3.8/site-packages/torchvision/datapoints/__init__.py:12: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
 warnings.warn(_BETA_TRANSFORMS_WARNING)
 /home/mlcuser/.local/lib/python3.8/site-packages/torchvision/transforms/v2/__init__.py:54: UserWarning: The torchvision.datapoints and torchvision.transforms.v2 namespaces are still Beta. While we do not expect major breaking changes, some APIs may still change according to user feedback. Please submit any feedback you may have in this issue: https://github.com/pytorch/vision/issues/6753, and you can also check out https://github.com/pytorch/vision/issues/7319 to learn more about the APIs that we suspect might involve future changes. You can silence this warning by calling torchvision.disable_beta_transforms_warning().
 warnings.warn(_BETA_TRANSFORMS_WARNING)
-[2025-04-01 09:20:04,096 backend.py:71 INFO] Loading TensorRT engine: ./build/engines/Nvidia_9654bf8fd8c2/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIP-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan.
-[2025-04-01 09:20:04,196 backend.py:71 INFO] Loading TensorRT engine: ./build/engines/Nvidia_9654bf8fd8c2/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIPWithProj-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan.
-[2025-04-01 09:20:04,698 backend.py:71 INFO] Loading TensorRT engine: ./build/engines/Nvidia_9654bf8fd8c2/stable-diffusion-xl/Offline/stable-diffusion-xl-UNetXL-Offline-gpu-b2-int8.custom_k_99_MaxP.plan.
-[2025-04-01 09:20:05,788 backend.py:71 INFO] Loading TensorRT engine: ./build/engines/Nvidia_9654bf8fd8c2/stable-diffusion-xl/Offline/stable-diffusion-xl-VAE-Offline-gpu-b2-fp32.custom_k_99_MaxP.plan.
-[2025-04-01 09:20:06,757 backend.py:96 INFO] Enabling cuda graphs for unet
-[2025-04-01 09:20:06,968 backend.py:154 INFO] captured graph for BS=1
-[2025-04-01 09:20:07,221 backend.py:154 INFO] captured graph for BS=2
-[2025-04-01 09:20:07,222 harness.py:207 INFO] Start Warm Up!
-[2025-04-01 09:20:13,043 harness.py:209 INFO] Warm Up Done!
-[2025-04-01 09:20:13,043 harness.py:211 INFO] Start Test!
-[2025-04-01 11:19:36,281 backend.py:801 INFO] [Server] Received 5000 total samples
-[2025-04-01 11:19:36,282 backend.py:809 INFO] [Device 0] Reported 5000 samples
-[2025-04-01 11:19:36,282 harness.py:214 INFO] Test Done!
-[2025-04-01 11:19:36,282 harness.py:216 INFO] Destroying SUT...
-[2025-04-01 11:19:36,282 harness.py:219 INFO] Destroying QSL...
+[2025-05-01 13:34:07,402 backend.py:71 INFO] Loading TensorRT engine: ./build/engines/ff13c90c32e9/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIP-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan.
+[2025-05-01 13:34:07,505 backend.py:71 INFO] Loading TensorRT engine: ./build/engines/ff13c90c32e9/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIPWithProj-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan.
+[2025-05-01 13:34:08,086 backend.py:71 INFO] Loading TensorRT engine: ./build/engines/ff13c90c32e9/stable-diffusion-xl/Offline/stable-diffusion-xl-UNetXL-Offline-gpu-b2-int8.custom_k_99_MaxP.plan.
+[2025-05-01 13:34:09,134 backend.py:71 INFO] Loading TensorRT engine: ./build/engines/ff13c90c32e9/stable-diffusion-xl/Offline/stable-diffusion-xl-VAE-Offline-gpu-b2-fp32.custom_k_99_MaxP.plan.
+[2025-05-01 13:34:10,271 backend.py:96 INFO] Enabling cuda graphs for unet
+[2025-05-01 13:34:10,494 backend.py:154 INFO] captured graph for BS=1
+[2025-05-01 13:34:10,747 backend.py:154 INFO] captured graph for BS=2
+[2025-05-01 13:34:10,748 harness.py:207 INFO] Start Warm Up!
+[2025-05-01 13:34:16,596 harness.py:209 INFO] Warm Up Done!
+[2025-05-01 13:34:16,597 harness.py:211 INFO] Start Test!
+[2025-05-01 15:34:27,802 backend.py:801 INFO] [Server] Received 5000 total samples
+[2025-05-01 15:34:27,803 backend.py:809 INFO] [Device 0] Reported 5000 samples
+[2025-05-01 15:34:27,803 harness.py:214 INFO] Test Done!
+[2025-05-01 15:34:27,803 harness.py:216 INFO] Destroying SUT...
+[2025-05-01 15:34:27,803 harness.py:219 INFO] Destroying QSL...
 benchmark : Benchmark.SDXL
 buffer_manager_thread_count : 0
 data_dir : /home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-nvidia-scratch-space_f962b684/data
@@ -33,20 +33,20 @@ gpu_copy_streams : 1
 gpu_inference_streams : 1
 input_dtype : int32
 input_format : linear
-log_dir : /home/mlcuser/MLC/repos/local/cache/get-git-repo_32fceb49/repo/closed/NVIDIA/build/logs/2025.04.01-09.20.01
+log_dir : /home/mlcuser/MLC/repos/local/cache/get-git-repo_32fceb49/repo/closed/NVIDIA/build/logs/2025.05.01-13.34.04
 mlperf_conf_path : /home/mlcuser/MLC/repos/local/cache/get-git-repo_08fd7192/inference/mlperf.conf
 model_path : /home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-nvidia-scratch-space_f962b684/models/SDXL/
 offline_expected_qps : 0.0
 precision : int8
 preprocessed_data_dir : /home/mlcuser/MLC/repos/local/cache/get-mlperf-inference-nvidia-scratch-space_f962b684/preprocessed_data
 scenario : Scenario.Offline
-system : SystemConfiguration(host_cpu_conf=CPUConfiguration(layout={CPU(name='13th Gen Intel(R) Core(TM) i9-13900K', architecture=<CPUArchitecture.x86_64: AliasedName(name='x86_64', aliases=(), patterns=())>, core_count=24, threads_per_core=1): 1}), host_mem_conf=MemoryConfiguration(host_memory_capacity=Memory(quantity=131.634496, byte_suffix=<ByteSuffix.GB: (1000, 3)>, _num_bytes=131634496000), comparison_tolerance=0.05), accelerator_conf=AcceleratorConfiguration(layout=defaultdict(<class 'int'>, {GPU(name='NVIDIA GeForce RTX 4090', accelerator_type=<AcceleratorType.Discrete: AliasedName(name='Discrete', aliases=(), patterns=())>, vram=Memory(quantity=23.98828125, byte_suffix=<ByteSuffix.GiB: (1024, 3)>, _num_bytes=25757220864), max_power_limit=450.0, pci_id='0x268410DE', compute_sm=89): 1})), numa_conf=None, system_id='Nvidia_9654bf8fd8c2')
+system : SystemConfiguration(host_cpu_conf=CPUConfiguration(layout={CPU(name='13th Gen Intel(R) Core(TM) i9-13900K', architecture=<CPUArchitecture.x86_64: AliasedName(name='x86_64', aliases=(), patterns=())>, core_count=24, threads_per_core=1): 1}), host_mem_conf=MemoryConfiguration(host_memory_capacity=Memory(quantity=131.634496, byte_suffix=<ByteSuffix.GB: (1000, 3)>, _num_bytes=131634496000), comparison_tolerance=0.05), accelerator_conf=AcceleratorConfiguration(layout=defaultdict(<class 'int'>, {GPU(name='NVIDIA GeForce RTX 4090', accelerator_type=<AcceleratorType.Discrete: AliasedName(name='Discrete', aliases=(), patterns=())>, vram=Memory(quantity=23.98828125, byte_suffix=<ByteSuffix.GiB: (1024, 3)>, _num_bytes=25757220864), max_power_limit=450.0, pci_id='0x268410DE', compute_sm=89): 1})), numa_conf=None, system_id='ff13c90c32e9')
 tensor_path : build/preprocessed_data/coco2014-tokenized-sdxl/5k_dataset_final/
 test_mode : AccuracyOnly
 use_graphs : True
-user_conf_path : /home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/224dc7c9e7a64446a0b0faae629b8461.conf
-system_id : Nvidia_9654bf8fd8c2
-config_name : Nvidia_9654bf8fd8c2_stable-diffusion-xl_Offline
+user_conf_path : /home/mlcuser/MLC/repos/mlcommons@mlperf-automations/script/generate-mlperf-inference-user-conf/tmp/a8936f5611ea4b9385db41cbaee6977a.conf
+system_id : ff13c90c32e9
+config_name : ff13c90c32e9_stable-diffusion-xl_Offline
 workload_setting : WorkloadSetting(HarnessType.Custom, AccuracyTarget.k_99, PowerSetting.MaxP)
 optimization_level : plugin-enabled
 num_profiles : 1
@@ -56,11 +56,11 @@ inference_server : custom
 skip_file_checks : False
 power_limit : None
 cpu_freq : None
-[I] Loading bytes from ./build/engines/Nvidia_9654bf8fd8c2/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIP-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan
-[I] Loading bytes from ./build/engines/Nvidia_9654bf8fd8c2/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIPWithProj-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan
-[I] Loading bytes from ./build/engines/Nvidia_9654bf8fd8c2/stable-diffusion-xl/Offline/stable-diffusion-xl-UNetXL-Offline-gpu-b2-int8.custom_k_99_MaxP.plan
-[I] Loading bytes from ./build/engines/Nvidia_9654bf8fd8c2/stable-diffusion-xl/Offline/stable-diffusion-xl-VAE-Offline-gpu-b2-fp32.custom_k_99_MaxP.plan
-[2025-04-01 11:19:36,573 run_harness.py:166 INFO] Result: Accuracy run detected.
+[I] Loading bytes from ./build/engines/ff13c90c32e9/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIP-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan
+[I] Loading bytes from ./build/engines/ff13c90c32e9/stable-diffusion-xl/Offline/stable-diffusion-xl-CLIPWithProj-Offline-gpu-b2-fp16.custom_k_99_MaxP.plan
+[I] Loading bytes from ./build/engines/ff13c90c32e9/stable-diffusion-xl/Offline/stable-diffusion-xl-UNetXL-Offline-gpu-b2-int8.custom_k_99_MaxP.plan
+[I] Loading bytes from ./build/engines/ff13c90c32e9/stable-diffusion-xl/Offline/stable-diffusion-xl-VAE-Offline-gpu-b2-fp32.custom_k_99_MaxP.plan
+[2025-05-01 15:34:28,106 run_harness.py:166 INFO] Result: Accuracy run detected.
 
 ======================== Result summaries: ========================
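
As a rough sanity check (illustrative only, and using the accuracy run's timestamps, not the separately timed performance run), the wall-clock window in the log above implies a throughput close to the 0.694579 samples/s reported in the README:

```python
from datetime import datetime

# "Start Test!" and "Received 5000 total samples" timestamps from the log above.
FMT = "%Y-%m-%d %H:%M:%S,%f"
start = datetime.strptime("2025-05-01 13:34:16,597", FMT)
end = datetime.strptime("2025-05-01 15:34:27,802", FMT)

elapsed = (end - start).total_seconds()   # ~7211 s, i.e. just over two hours
throughput = 5000 / elapsed               # effective samples per second
print(f"{throughput:.4f} samples/s over {elapsed:.0f} s")  # 0.6934 samples/s over 7211 s
```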
open/GATEOverflow/measurements/RTX4090x1-nvidia-gpu-TensorRT-default_config/stable-diffusion-xl/offline/cpu_info.json

Lines changed: 1 addition & 1 deletion

@@ -22,5 +22,5 @@
 "MLC_HOST_CPU_L2_CACHE_SIZE": "24 MiB",
 "MLC_HOST_CPU_TOTAL_LOGICAL_CORES": "32",
 "MLC_HOST_MEMORY_CAPACITY": "128G",
-"MLC_HOST_DISK_CAPACITY": "9.4T"
+"MLC_HOST_DISK_CAPACITY": "9.1T"
 }
