Code changes for integrating nvidia v5.0 #417


Merged: 4 commits, merged May 18, 2025
Changes from 2 commits
4 changes: 2 additions & 2 deletions automation/utils.py
@@ -339,8 +339,8 @@ def compare_versions(i):
# 3.9.6 vs 3.9
# 3.9 vs 3.9.6

-    i_version1 = [int(v) if v.isdigit() else v for v in l_version1]
-    i_version2 = [int(v) if v.isdigit() else v for v in l_version2]
+    i_version1 = [int(v) for v in l_version1 if v.isdigit()]
+    i_version2 = [int(v) for v in l_version2 if v.isdigit()]

comparison = 0

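The rewritten comprehensions sidestep a Python 3 pitfall: the old form kept non-numeric parts (like a "rc1" suffix), so a later list comparison could compare an int with a str and raise TypeError. A standalone sketch of the difference (function names are illustrative, not from the codebase; note the trade-off that the new form silently drops non-numeric suffixes):

```python
# Old form: keep non-numeric parts verbatim.
def old_parse(parts):
    return [int(v) if v.isdigit() else v for v in parts]

# New form: keep only the numeric parts, so comparisons stay int-vs-int.
def new_parse(parts):
    return [int(v) for v in parts if v.isdigit()]

v1, v2 = "3.9.rc1".split("."), "3.9.6".split(".")

try:
    old_parse(v1) > old_parse(v2)   # compares 'rc1' with 6 -> TypeError
    raised = False
except TypeError:
    raised = True

assert raised                        # old form blows up on suffixed versions
assert new_parse(v1) < new_parse(v2) # [3, 9] < [3, 9, 6]
```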
10 changes: 10 additions & 0 deletions script/add-custom-nvidia-system/meta.yaml
@@ -74,6 +74,11 @@ deps:
# Detect pycuda
- tags: get,generic-python-lib,_pycuda

- tags: get,generic-python-lib,_package.typeguard
enable_if_env:
MLC_MLPERF_INFERENCE_VERSION:
- "5.0"

variations:
nvidia-only:
group: code
@@ -124,3 +129,8 @@ versions:
add_deps_recursive:
nvidia-inference-common-code:
version: r4.0

r5.0:
add_deps_recursive:
nvidia-inference-common-code:
version: r5.0
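The typeguard dependency above is gated on MLC_MLPERF_INFERENCE_VERSION via enable_if_env. A minimal sketch of how such gating can be evaluated (dep_enabled is a hypothetical helper for illustration, not the actual MLC implementation):

```python
def dep_enabled(dep, env):
    """True when every enable_if_env key matches one of its allowed values."""
    cond = dep.get("enable_if_env", {})
    return all(env.get(key) in allowed for key, allowed in cond.items())

typeguard_dep = {
    "tags": "get,generic-python-lib,_package.typeguard",
    "enable_if_env": {"MLC_MLPERF_INFERENCE_VERSION": ["5.0"]},
}

assert dep_enabled(typeguard_dep, {"MLC_MLPERF_INFERENCE_VERSION": "5.0"})
assert not dep_enabled(typeguard_dep, {"MLC_MLPERF_INFERENCE_VERSION": "4.1"})
assert not dep_enabled(typeguard_dep, {})  # unset env var disables the dep
```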
116 changes: 90 additions & 26 deletions script/app-mlperf-inference-nvidia/meta.yaml
@@ -272,7 +272,9 @@ deps:
- run_harness

- tags: get,generic-python-lib,_package.pycuda
-      version: "2022.2.2"
+      names:
+      - pycuda
+      version: "2023.1"

Collaborator comment: Restrict it to v5.0?

- tags: get,generic-python-lib,_package.nvmitten
update_tags_from_env_with_prefix:
@@ -281,11 +283,10 @@
enable_if_env:
MLC_RUN_STATE_DOCKER:
- 'yes'
MLC_ENV_NVMITTEN_DOCKER_WHEEL_PATH:
- 'yes'

- tags: get,nvidia,mitten
skip_if_env:
MLC_RUN_STATE_DOCKER:
- 'yes'
enable_if_env:
MLC_NVIDIA_MITTEN_FROM_SRC:
- 'yes'
@@ -351,6 +352,18 @@ post_deps:
# Variations to customize dependencies
variations:
# MLPerf inference version
v5.0:
group: version
env:
MLC_MLPERF_INFERENCE_CODE_VERSION: "v5.0"
MLC_MLPERF_GPTJ_MODEL_FP8_PATH_SUFFIX: GPTJ-FP8-quantized
MLC_NVIDIA_MITTEN_FROM_SRC: "yes"
MLC_GIT_CHECKOUT: "98bb85df8e936219ec7acd10ce1d702147fb1e21"
adr:
pytorch:
tags: _for-nvidia-mlperf-inference-v5.0
pycuda:
version_min: "2024.1"
v4.1:
group: version
env:
@@ -435,9 +448,20 @@ variations:
- tags: get,generic-python-lib,_numpy
- tags: get,generic-python-lib,_pycocotools
- tags: get,generic-python-lib,_onnx-graphsurgeon
-    - tags: get,generic,sys-util,_cmake
+    - tags: get,generic-python-lib,_package.cmake
- tags: get,generic-python-lib,_package.sympy

retinanet,v5.0:
deps:
- tags: get,generic-python-lib,_package.onnx
version: 1.17.0

retinanet,v4.0:
deps:
- tags: get,generic-python-lib,_package.onnx
version: 1.14.1
- tags: get,generic-python-lib,_package.sympy


sdxl:
new_env_keys:
@@ -481,8 +505,8 @@ variations:
names:
- nvtx
- tags: get,generic-python-lib,_package.cuda-python
-      version_max: 12.6.2
-      version_max_usable: 12.6.2
+      version_max: "12.6.2"
+      version_max_usable: "12.6.2"
names:
- cuda-python
- tags: get,generic-python-lib,_package.ninja
@@ -494,38 +518,78 @@
- tags: get,generic-python-lib,_package.colored
names:
- colored
- tags: get,generic-python-lib,_package.nvidia-ammo
names:
- nvidia-ammo
version: 0.7.4
env:
MLC_GENERIC_PYTHON_PIP_EXTRA_INDEX_URL: "https://pypi.nvidia.com"
MLC_GENERIC_PYTHON_PIP_EXTRA: "--no-cache-dir"
- tags: get,generic-python-lib,_package.optimum
names:
- optimum
- tags: get,generic-python-lib,_package.onnx
names:
- onnx
version: 1.14.0
- tags: get,generic-python-lib,_package.scipy
names:
- scipy
version: 1.10.1
- tags: get,generic-python-lib,_package.numpy
names:
- numpy
version_max: 1.22.99
version_max_usable: "1.22"

sdxl,v4.0:
deps:
- tags: get,generic-python-lib,_package.onnx
names:
- onnx
version: "1.14.0"
- tags: get,generic-python-lib,_package.numpy
names:
- numpy
version_max: "1.22.99"
version_max_usable: "1.22"
- tags: get,generic-python-lib,_package.nvidia-ammo
names:
- nvidia-ammo
version: "0.7.4"
env:
MLC_GENERIC_PYTHON_PIP_EXTRA_INDEX_URL: "https://pypi.nvidia.com"
MLC_GENERIC_PYTHON_PIP_EXTRA: "--no-cache-dir"

sdxl,v4.1:
deps:
- tags: get,generic-python-lib,_package.torchrec
-      version: 0.4.0
+      version: "0.4.0"
- tags: get,generic-python-lib,_package.torchmetrics
-      version: 1.0.3
+      version: "1.0.3"
- tags: get,generic-python-lib,_package.typeguard
- tags: get,generic-python-lib,_package.onnx
names:
- onnx
version: "1.14.0"
- tags: get,generic-python-lib,_package.numpy
names:
- numpy
version_max: "1.22.99"
version_max_usable: "1.22"
- tags: get,generic-python-lib,_package.nvidia-ammo
names:
- nvidia-ammo
version: "0.7.4"
env:
MLC_GENERIC_PYTHON_PIP_EXTRA_INDEX_URL: "https://pypi.nvidia.com"
MLC_GENERIC_PYTHON_PIP_EXTRA: "--no-cache-dir"
- tags: get,generic-python-lib,_package.scipy
names:
- scipy
version: "1.10.1"

sdxl,v5.0:
    # nvidia-ammo is decommissioned; model-opt, which is built with TRT-LLM, is used instead
deps:
- tags: get,generic-python-lib,_package.torchrec
version: "0.6.0"
- tags: get,generic-python-lib,_package.torchmetrics
version: "1.0.3"
- tags: get,generic-python-lib,_package.typeguard
- tags: get,generic-python-lib,_package.onnx
names:
- onnx
version: "1.17.0"
- tags: get,generic-python-lib,_package.numpy
names:
- numpy
version_max: "1.26.99"
version_max_usable: "1.26.4"

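A pattern running through these hunks is the quoting of version values. Unquoted YAML scalars with a single dot parse as floats and can lose digits, while quoted values always stay strings. A quick illustration of the hazard the quoting avoids:

```python
# A bare YAML value like `version: 1.10` resolves to the float 1.1,
# while `version: "1.10"` stays the exact string "1.10". Three-part
# versions such as 0.4.0 happen to stay strings anyway, but quoting
# everything is the safe, uniform choice.
assert str(1.10) == "1.1"      # trailing zero lost once parsed as a float
assert "1.10" != str(1.10)     # the quoted form keeps the exact text
```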
bert_:
deps:
- tags: get,generic-python-lib,_transformers
55 changes: 55 additions & 0 deletions script/app-mlperf-inference/meta.yaml
@@ -382,6 +382,24 @@ variations:
env:
MLC_ENV_NVMITTEN_DOCKER_WHEEL_PATH: '/opt/nvmitten-0.1.3b0-cp310-cp310-linux_aarch64.whl'

nvidia-original,r5.0_default:
env:
MLC_NVIDIA_MITTEN_FROM_SRC: 'yes'
docker:
build_deps:
- tags: detect,os
image_name: mlperf-inference-nvidia-v5.0-common
update_meta_if_env:
- enable_if_env:
MLC_HOST_PLATFORM_FLAVOR:
- x86_64
docker:
base_image: nvcr.io/nvidia/mlperf/mlperf-inference:mlpinf-v5.0-cuda12.8-pytorch25.01-ubuntu24.04-x86_64-release
- skip_if_env:
MLC_HOST_PLATFORM_FLAVOR:
- x86_64
docker:
base_image: nvcr.io/nvidia/mlperf/mlperf-inference:mlpinf-v5.0-cuda12.8-pytorch25.01-ubuntu24.04-aarch64-Grace-release

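The paired enable_if_env/skip_if_env entries above select a per-architecture Docker base image. A hedged sketch of the selection logic (pick_base_image is illustrative, not MLC code):

```python
def pick_base_image(platform_flavor):
    """Mirror the enable_if_env / skip_if_env pair: x86_64 gets the x86
    image, everything else falls through to the aarch64 Grace image."""
    if platform_flavor == "x86_64":
        return ("nvcr.io/nvidia/mlperf/mlperf-inference:"
                "mlpinf-v5.0-cuda12.8-pytorch25.01-ubuntu24.04-x86_64-release")
    return ("nvcr.io/nvidia/mlperf/mlperf-inference:"
            "mlpinf-v5.0-cuda12.8-pytorch25.01-ubuntu24.04-aarch64-Grace-release")

assert pick_base_image("x86_64").endswith("x86_64-release")
assert pick_base_image("aarch64").endswith("aarch64-Grace-release")
```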
nvidia-original,gptj_:
env:
@@ -424,6 +442,14 @@ variations:
update_tags_from_env_with_prefix:
_tp-size.:
- MLC_NVIDIA_TP_SIZE

nvidia-original,r5.0_default,gptj_:
docker:
deps:
- tags: get,ml-model,gptj,_nvidia,_fp8
update_tags_from_env_with_prefix:
_tp-size.:
- MLC_NVIDIA_TP_SIZE


nvidia-original,r4.1-dev_default,llama2-70b_:
@@ -446,6 +472,14 @@
- MLC_NVIDIA_TP_SIZE
env:
BUILD_TRTLLM: 1

nvidia-original,r5.0_default,llama2-70b_:
docker:
deps:
- tags: get,ml-model,llama2-70b,_nvidia,_fp8
update_tags_from_env_with_prefix:
_tp-size.:
- MLC_NVIDIA_TP_SIZE

nvidia-original:
docker:
@@ -1813,6 +1847,27 @@ variations:
MLC_REGENERATE_MEASURE_FILES: 'yes'
env:
MLC_ENV_NVMITTEN_DOCKER_WHEEL_PATH: '/opt/nvmitten-0.1.3b0-cp38-cp38-linux_x86_64.whl'

r5.0_default:
group: reproducibility
add_deps_recursive:
nvidia-inference-common-code:
version: r5.0
tags: _mlcommons
nvidia-inference-server:
version: r5.0
tags: _mlcommons
intel-harness:
tags: _v4.1
inference-src:
version: r5.0
nvidia-scratch-space:
tags: _version.5.0
default_env:
MLC_SKIP_SYS_UTILS: 'yes'
MLC_REGENERATE_MEASURE_FILES: 'yes'
MLC_MLPERF_INFERENCE_VERSION: '5.0'


invalid_variation_combinations:
4 changes: 4 additions & 0 deletions script/build-dockerfile/customize.py
@@ -249,6 +249,10 @@ def preprocess(i):
for cmd in config['RUN_CMDS']:
f.write('RUN ' + cmd + EOL)

if env.get('MLC_MLPERF_IMPLEMENTATION', '') == "nvidia" and env.get(
'MLC_MLPERF_INFERENCE_VERSION', '') == "5.0":
f.write('ENV ' + 'ENV' + "=\"" + 'release' + "\"" + EOL)

f.write(EOL + '# Setup docker user' + EOL)
docker_user = get_value(env, config, 'USER', 'MLC_DOCKER_USER')
docker_group = get_value(env, config, 'GROUP', 'MLC_DOCKER_GROUP')
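The string concatenation added above writes a single Dockerfile instruction. What it emits, checked in isolation:

```python
# EOL here stands in for the EOL constant used by build-dockerfile.
EOL = "\n"
line = 'ENV ' + 'ENV' + "=\"" + 'release' + "\"" + EOL
# i.e. the Dockerfile line `ENV ENV="release"`, which Nvidia's v5.0
# container tooling reads to select the release configuration.
assert line == 'ENV ENV="release"\n'
```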
2 changes: 1 addition & 1 deletion script/build-mlperf-inference-server-nvidia/meta.yaml
@@ -111,7 +111,7 @@ deps:

# Detect pycuda
- tags: get,generic-python-lib,_pycuda
-  version: "2022.2.2"
+  version: "2023.1"
Collaborator comment: Does this work for Nvidia v4.0?

skip_if_env:
MLC_RUN_STATE_DOCKER:
- 'yes'
8 changes: 8 additions & 0 deletions script/build-mlperf-inference-server-nvidia/run.sh
@@ -8,9 +8,17 @@ if [[ ${MLC_MAKE_CLEAN} == "yes" ]]; then
fi

if [[ ${MLC_MLPERF_DEVICE} == "inferentia" ]]; then
echo "inferentia"
make prebuild
fi

# Perform sed replacement only if version is 5.0
if [[ "${MLC_MLPERF_INFERENCE_VERSION}" == "5.0" ]]; then
echo "Replacing /work/ with ${MLC_MLPERF_INFERENCE_NVIDIA_CODE_PATH} in all files..."
find . -type f -exec sed -i "s|/work/|${MLC_MLPERF_INFERENCE_NVIDIA_CODE_PATH}/|g" {} +
fi

echo ${MLC_MAKE_BUILD_COMMAND}
SKIP_DRIVER_CHECK=1 make ${MLC_MAKE_BUILD_COMMAND}

test $? -eq 0 || exit $?
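The find/sed pass above rewrites hard-coded /work/ container paths to the detected Nvidia code path. The substitution it performs on each line, sketched in Python with a stand-in path:

```python
# Stand-in for ${MLC_MLPERF_INFERENCE_NVIDIA_CODE_PATH}; the real value
# comes from the environment at run time.
code_path = "/opt/nvidia-inference-code"
line = "include /work/build/Makefile.build"
rewritten = line.replace("/work/", code_path + "/")
assert rewritten == "include /opt/nvidia-inference-code/build/Makefile.build"
```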
5 changes: 5 additions & 0 deletions script/get-mlperf-inference-nvidia-common-code/meta.yaml
@@ -63,3 +63,8 @@ versions:
mlperf-inference-results:
version: v4.0
tags: _code-only-for-v5.0
r5.0:
add_deps_recursive:
mlperf-inference-results:
version: v5.0
tags: _code-only
4 changes: 4 additions & 0 deletions script/get-mlperf-inference-results/meta.yaml
@@ -86,3 +86,7 @@ versions:
env:
MLC_GIT_URL: https://github.com/<<<GITHUB_REPO_OWNER>>>/inference_results_v4.1.git
MLC_MLPERF_INFERENCE_RESULTS_VERSION_NAME: v4.1
v5.0:
env:
MLC_GIT_URL: https://github.com/<<<GITHUB_REPO_OWNER>>>/inference_results_v5.0.git
MLC_MLPERF_INFERENCE_RESULTS_VERSION_NAME: v5.0
7 changes: 6 additions & 1 deletion script/get-nvidia-mitten/customize.py
@@ -5,8 +5,13 @@
def preprocess(i):

os_info = i['os_info']
env = i['env']
script_path = i['artifact'].path

# TBD
if env.get('MLC_MLPERF_INFERENCE_VERSION', '') == "5.0":
extra_run_cmd = 'patch -p1 < {}'.format(os.path.join(
script_path, 'patch', 'numpy-mitten-v5.0.patch'))
env['EXTRA_RUN_CMD'] = extra_run_cmd

return {'return': 0}

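preprocess() builds a shell command that run.sh later evals to apply the numpy patch. With a stand-in script path, the command it produces looks like:

```python
import os

# Stand-in for i['artifact'].path; the real path is resolved by MLC.
script_path = "/opt/mlc/script/get-nvidia-mitten"
cmd = 'patch -p1 < {}'.format(
    os.path.join(script_path, 'patch', 'numpy-mitten-v5.0.patch'))
assert cmd == 'patch -p1 < /opt/mlc/script/get-nvidia-mitten/patch/numpy-mitten-v5.0.patch'
```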
2 changes: 1 addition & 1 deletion script/get-nvidia-mitten/meta.yaml
@@ -11,7 +11,7 @@ deps:
- python
tags: get,python3
- tags: get,generic-python-lib,_pycuda
-  version: 2022.2.2
+  version: "2023.1"
- env:
MLC_GIT_CHECKOUT_PATH_ENV_NAME: MLC_NVIDIA_MITTEN_SRC
extra_cache_tags: nvidia,mitten,src
13 changes: 13 additions & 0 deletions script/get-nvidia-mitten/patch/numpy-mitten-v5.0.patch
@@ -0,0 +1,13 @@
diff --git a/setup.cfg b/setup.cfg
index 4976354..798175e 100644
--- a/setup.cfg
+++ b/setup.cfg
@@ -21,7 +21,7 @@ install_requires =
graphlib_backport >=1.0.3;python_version<'3.9'
requests >=2.28.1
tqdm >=4.65.0
- numpy >=1.22.0, <1.24.0
+ numpy >=1.26.4
GitPython >=3.1.31
pandas
opencv-python
3 changes: 3 additions & 0 deletions script/get-nvidia-mitten/run.sh
@@ -1,4 +1,7 @@
#!/bin/bash
cd ${MLC_NVIDIA_MITTEN_SRC}
echo "EXTRA_RUN_CMD = ${EXTRA_RUN_CMD}"
eval "${EXTRA_RUN_CMD}"
test $? -eq 0 || exit $?
${MLC_PYTHON_BIN_WITH_PATH} -m pip install .
test $? -eq 0 || exit $?