Code changes for supporting llama3_1-405b reference implementation #111

Merged: 9 commits, Jan 5, 2025
66 changes: 66 additions & 0 deletions script/app-mlperf-inference-mlcommons-python/_cm.yaml
@@ -496,6 +496,24 @@ deps:
      RGAT_CHECKPOINT_PATH:
        - 'on'

  ## LLAMA3_1-405B
  - tags: get,ml-model,llama3
    names:
      - llama3-405b-model
      - llama3_1-405b-model
    enable_if_env:
      CM_MODEL:
        - llama3_1-405b
        - llama3-405b
    skip_if_env:
      CM_ML_MODEL_LLAMA3_CHECKPOINT_PATH:
        - 'on'
Collaborator: Also, this should apply only if we are in the Docker build stage. Otherwise, when the path is given, we should register it in the CM cache.

Contributor Author: Updated in commit bfb7cb6.

      CM_USE_MODEL_FROM_HOST:
        - 'yes'
      CM_RUN_STATE_DOCKER:
        - 'yes'
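
Note: enable_if_env and skip_if_env gate a dependency on the CM environment. A minimal Python sketch of the assumed semantics (illustrative only, not CM's actual implementation; in particular it ORs multiple skip conditions, which is a guess):

    def dep_is_active(env, enable_if_env=None, skip_if_env=None):
        # The dep runs only if every enable_if_env key matches one of its values...
        for key, allowed in (enable_if_env or {}).items():
            if env.get(key, '') not in allowed:
                return False
        # ...and is skipped when a skip_if_env key matches.
        for key, blocked in (skip_if_env or {}).items():
            if env.get(key, '') in blocked:
                return False
        return True

    # The model dep above: enabled for the llama3 models, skipped when the
    # model is supplied from the host.
    env = {'CM_MODEL': 'llama3_1-405b', 'CM_USE_MODEL_FROM_HOST': 'yes'}
    assert not dep_is_active(
        env,
        enable_if_env={'CM_MODEL': ['llama3_1-405b', 'llama3-405b']},
        skip_if_env={'CM_USE_MODEL_FROM_HOST': ['yes']})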

########################################################################
# Install datasets

@@ -635,6 +653,24 @@ deps:
      CM_USE_DATASET_FROM_HOST:
        - 'yes'

  ## llama3_1 dataset
  - tags: get,dataset,mlperf,inference,llama3,_validation
    names:
      - llama3_1-dataset
      - llama3-dataset
    enable_if_env:
      CM_MODEL:
        - llama3_1-405b
        - llama3-405b
    skip_if_env:
      CM_DATASET_LLAMA3_PATH:
        - 'on'
Collaborator: Same here as for the model.

Contributor Author: Updated in commit bfb7cb6.

      CM_USE_DATASET_FROM_HOST:
Collaborator: This variable is not needed, right? Because if the path is passed directly from the host to a container, this won't work. Same for the model.

Contributor Author: That's right, thanks. I also formatted the app-mlperf-inference-mlcommons-python _cm.yaml file with the Prettier extension in VS Code.

        - 'yes'
      CM_RUN_STATE_DOCKER:
        - 'yes'


########################################################################
# Install MLPerf inference dependencies

@@ -1281,6 +1317,36 @@ variations:
      CM_TMP_GENERIC_PYTHON_PIP_EXTRA_FIND_LINKS_URL: "https://data.pyg.org/whl/torch-<<<CM_TORCH_VERSION>>>+cpu.html"
      CM_TMP_GENERIC_PYTHON_PIP_EXTRA_FIND_LINKS_URL_DGL: "https://data.dgl.ai/wheels/torch-<<<CM_TORCH_VERSION_MAJOR_MINOR>>>/repo.html"

  llama3_1-405b:
    group: models
    env:
      CM_MODEL: llama3_1-405b
    adr:
      pytorch:
        version_max: 2.5.1
    deps:
      - tags: get,generic-python-lib,_package.torchvision
      - tags: get,generic-python-lib,_package.torchaudio
      - tags: get,generic-python-lib,_package.torch-geometric
      - tags: get,generic-python-lib,_package.transformers
      - tags: get,generic-python-lib,_package.sentencepiece
      - tags: get,generic-python-lib,_package.accelerate
      - tags: get,generic-python-lib,_package.vllm
        env:
          CM_GENERIC_PYTHON_PIP_EXTRA: "--upgrade"
      - tags: get,generic-python-lib,_package.pybind11
      - tags: get,generic-python-lib,_package.pandas
        version_max: 2.2.1

  llama3_1-405b,cuda:
    env:
      CM_GENERIC_PYTHON_PIP_EXTRA_FIND_LINKS_URL: "https://data.pyg.org/whl/torch-<<<CM_TORCH_VERSION>>>.html"

  llama3_1-405b,cpu:
    env:
      CM_GENERIC_PYTHON_PIP_EXTRA_FIND_LINKS_URL: "https://data.pyg.org/whl/torch-<<<CM_TORCH_VERSION>>>+cpu.html"
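
Note: the <<<VAR>>> placeholders are expanded from the CM environment before the URL is used. A minimal sketch of the assumed substitution:

    import re

    def expand(template, env):
        # Assumed behavior of CM's <<<VAR>>> templating: substitute values from env.
        return re.sub(r'<<<(\w+)>>>', lambda m: env.get(m.group(1), ''), template)

    print(expand("https://data.pyg.org/whl/torch-<<<CM_TORCH_VERSION>>>+cpu.html",
                 {'CM_TORCH_VERSION': '2.5.1'}))
    # -> https://data.pyg.org/whl/torch-2.5.1+cpu.html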

  # Target devices
  cpu:
    group: device
30 changes: 28 additions & 2 deletions script/app-mlperf-inference-mlcommons-python/customize.py
@@ -68,7 +68,7 @@ def preprocess(i):
             str(env['CM_MLPERF_LOADGEN_BATCH_SIZE'])

     if env.get('CM_MLPERF_LOADGEN_QUERY_COUNT', '') != '' and not env.get('CM_TMP_IGNORE_MLPERF_QUERY_COUNT', False) and (
-            env['CM_MLPERF_LOADGEN_MODE'] == 'accuracy' or 'gptj' in env['CM_MODEL'] or 'llama2' in env['CM_MODEL'] or 'mixtral' in env['CM_MODEL']) and env.get('CM_MLPERF_RUN_STYLE', '') != "valid":
+            env['CM_MLPERF_LOADGEN_MODE'] == 'accuracy' or 'gptj' in env['CM_MODEL'] or 'llama2' in env['CM_MODEL'] or 'mixtral' in env['CM_MODEL'] or 'llama3' in env['CM_MODEL']) and env.get('CM_MLPERF_RUN_STYLE', '') != "valid":
         env['CM_MLPERF_LOADGEN_EXTRA_OPTIONS'] += " --count " + \
             env['CM_MLPERF_LOADGEN_QUERY_COUNT']

@@ -127,7 +127,7 @@ def preprocess(i):
     if 'CM_MLPERF_USER_CONF' in env:
         user_conf_path = env['CM_MLPERF_USER_CONF']
         x = "" if os_info['platform'] == 'windows' else "'"
-        if 'llama2-70b' in env['CM_MODEL'] or "mixtral-8x7b" in env["CM_MODEL"]:
+        if 'llama2-70b' in env['CM_MODEL'] or "mixtral-8x7b" in env["CM_MODEL"] or "llama3" in env["CM_MODEL"]:
             scenario_extra_options += " --user-conf " + x + user_conf_path + x
         else:
             scenario_extra_options += " --user_conf " + x + user_conf_path + x
@@ -499,6 +499,32 @@ def get_run_cmd_reference(

    if env.get('CM_ACTIVATE_RGAT_IN_MEMORY', '') == "yes":
        cmd += " --in-memory "

    elif "llama3" in env['CM_MODEL']:
        env['RUN_DIR'] = os.path.join(
            env['CM_MLPERF_INFERENCE_SOURCE'],
            "language",
            "llama3.1-405b")

        if int(env.get('CM_MLPERF_INFERENCE_TP_SIZE', '1')) > 1:
            env['VLLM_WORKER_MULTIPROC_METHOD'] = "spawn"

        cmd = env['CM_PYTHON_BIN_WITH_PATH'] + " main.py " \
            " --scenario " + env['CM_MLPERF_LOADGEN_SCENARIO'] + \
            " --dataset-path " + env['CM_DATASET_LLAMA3_PATH'] + \
            " --output-log-dir " + env['CM_MLPERF_OUTPUT_DIR'] + \
            " --dtype " + env['CM_MLPERF_MODEL_PRECISION'] + \
            " --model-path " + env['CM_ML_MODEL_LLAMA3_CHECKPOINT_PATH'] + \
            " --tensor-parallel-size " + env['CM_MLPERF_INFERENCE_TP_SIZE'] + \
            " --vllm "

        if env.get('CM_MLPERF_INFERENCE_NUM_WORKERS', '') != '':
            cmd += f" --num-workers {env['CM_MLPERF_INFERENCE_NUM_WORKERS']}"

        cmd = cmd.replace("--count", "--total-sample-count")
        cmd = cmd.replace("--max-batchsize", "--batch-size")

    if env.get('CM_NETWORK_LOADGEN', '') in ["lon", "sut"]:
        cmd = cmd + " " + "--network " + env['CM_NETWORK_LOADGEN']
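
Note: the two cmd.replace() calls above translate generic harness options into the flag names expected by the llama3.1-405b main.py. A small illustration with a hypothetical command string:

    cmd = "python3 main.py --scenario Offline --count 10 --max-batchsize 4"
    cmd = cmd.replace("--count", "--total-sample-count")
    cmd = cmd.replace("--max-batchsize", "--batch-size")
    assert cmd == "python3 main.py --scenario Offline --total-sample-count 10 --batch-size 4"
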
42 changes: 42 additions & 0 deletions script/app-mlperf-inference/_cm.yaml
@@ -221,6 +221,8 @@ variations:
        tags: _int32
      cnndm-accuracy-script:
        tags: _int32
      llama3_1-405b-accuracy-script:
        tags: _int32
    env:
      CM_MLPERF_PYTHON: 'yes'
      CM_MLPERF_IMPLEMENTATION: mlcommons_python
@@ -272,6 +274,10 @@ variations:
    default_variations:
      backend: pytorch

  reference,llama3_1-405b:
    default_variations:
      backend: pytorch

  reference,mixtral-8x7b:
    default_variations:
      backend: pytorch
@@ -795,6 +801,40 @@ variations:
      - igbh-original
      - igbh-dataset

  llama3_1-405b:
    group: model
    add_deps_recursive:
      mlperf-inference-implementation:
        tags: _llama3_1-405b
    env:
      CM_MODEL: llama3_1-405b
    posthook_deps:
      - enable_if_env:
          CM_MLPERF_LOADGEN_MODE:
            - accuracy
            - all
          CM_MLPERF_ACCURACY_RESULTS_DIR:
            - 'on'
        skip_if_env:
          CM_MLPERF_IMPLEMENTATION:
            - nvidia
        names:
          - mlperf-accuracy-script
          - llama3_1-405b-accuracy-script
        tags: run,accuracy,mlperf,_dataset_llama3
    docker:
      deps:
        - tags: get,ml-model,llama3
          enable_if_env:
            CM_USE_MODEL_FROM_HOST:
              - 'yes'
          names:
            - llama3_1-405b
            - llama3-405b


  sdxl:
    group: model
@@ -1682,6 +1722,8 @@ variations:
        tags: _mlcommons
      intel-harness:
        tags: _v4.1
      inference-src:
        version: r5.0
    default_env:
      CM_SKIP_SYS_UTILS: 'yes'
      CM_REGENERATE_MEASURE_FILES: 'yes'
56 changes: 56 additions & 0 deletions script/get-dataset-mlperf-inference-llama3/_cm.yaml
@@ -0,0 +1,56 @@
alias: get-dataset-mlperf-inference-llama3
automation_alias: script
automation_uid: 5b4e0237da074764
cache: true
tags:
  - get
  - dataset
  - mlperf
  - llama3
  - inference
uid: c3bc69599cbc4db7
new_env_keys:
  - CM_DATASET_LLAMA3_PATH
input_mapping:
  outdirname: CM_OUTDIRNAME
prehook_deps:
  - env:
      CM_DOWNLOAD_FINAL_ENV_NAME: CM_DATASET_LLAMA3_PATH
      CM_EXTRACT_TO_FOLDER: llama-3-dataset
    extra_cache_tags: dataset,llama3
    force_cache: true
    enable_if_env:
      CM_TMP_REQUIRE_DOWNLOAD:
        - 'yes'
    names:
      - dae
    tags: download-and-extract
    update_tags_from_env_with_prefix:
      _url.:
        - CM_DOWNLOAD_URL
variations:
  validation:
    default: true
    group: dataset-type
    env:
      CM_RCLONE_URL: mlc-inference:mlcommons-inference-wg-public/llama3_405b/mlperf_llama3.1_405b_dataset_8313_processed_fp16_eval.pkl
      CM_DATASET_TYPE: validation
      CM_DATASET_FILE_NAME: mlperf_llama3.1_405b_dataset_8313_processed_fp16_eval.pkl
  calibration:
    group: dataset-type
    env:
      CM_RCLONE_URL: mlc-inference:mlcommons-inference-wg-public/llama3_405b/mlperf_llama3.1_405b_calibration_dataset_512_processed_fp16_eval.pkl
      CM_DATASET_TYPE: calibration
      CM_DATASET_FILE_NAME: mlperf_llama3.1_405b_calibration_dataset_512_processed_fp16_eval.pkl
  rclone:
    add_deps_recursive:
      dae:
        tags: _rclone
    default: true
    env:
      CM_DOWNLOAD_FILENAME: checkpoint
      CM_DOWNLOAD_URL: <<<CM_RCLONE_URL>>>
      CM_RCLONE_CONFIG_NAME: mlc-inference
    group: download-tool
print_env_at_the_end:
  CM_DATASET_LLAMA3_PATH: Path to the dataset
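
Note: outside CM, the _rclone download variation corresponds roughly to an rclone copy of the CM_RCLONE_URL above; a sketch, assuming the mlc-inference remote is already configured as described in the MLPerf inference documentation:

    import subprocess

    # Fetch the validation set the way the _rclone variation would (sketch).
    subprocess.run(
        ["rclone", "copy",
         "mlc-inference:mlcommons-inference-wg-public/llama3_405b/"
         "mlperf_llama3.1_405b_dataset_8313_processed_fp16_eval.pkl",
         "./llama-3-dataset", "-P"],
        check=True)
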
31 changes: 31 additions & 0 deletions script/get-dataset-mlperf-inference-llama3/customize.py
@@ -0,0 +1,31 @@
from cmind import utils
import os


def preprocess(i):

    os_info = i['os_info']

    env = i['env']

    if os_info['platform'] == "windows":
        return {'return': 1, 'error': 'Script not supported in windows yet!'}

    if env.get('CM_DATASET_LLAMA3_PATH', '') == '':
        env['CM_TMP_REQUIRE_DOWNLOAD'] = "yes"

    if env.get('CM_OUTDIRNAME', '') != '':
        env['CM_DOWNLOAD_PATH'] = env['CM_OUTDIRNAME']

    return {'return': 0}


def postprocess(i):

    env = i['env']

    if env.get('CM_TMP_REQUIRE_DOWNLOAD', '') == "yes":
        env['CM_DATASET_LLAMA3_PATH'] = os.path.join(
            env['CM_DATASET_LLAMA3_PATH'], env['CM_DATASET_FILE_NAME'])

    return {'return': 0}
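
Note: a minimal sketch of how these entry points are presumably driven by CM (the shape of the i dict is inferred from the code above; paths are hypothetical):

    env = {'CM_OUTDIRNAME': '/data/llama3'}
    r = preprocess({'os_info': {'platform': 'linux'}, 'env': env})
    assert r['return'] == 0 and env['CM_TMP_REQUIRE_DOWNLOAD'] == 'yes'
    # ...the download-and-extract dep would now set CM_DATASET_LLAMA3_PATH...
    env['CM_DATASET_LLAMA3_PATH'] = '/data/llama3/llama-3-dataset'
    env['CM_DATASET_FILE_NAME'] = 'mlperf_llama3.1_405b_dataset_8313_processed_fp16_eval.pkl'
    postprocess({'env': env})
    # CM_DATASET_LLAMA3_PATH now points at the .pkl file itself.
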
68 changes: 68 additions & 0 deletions script/get-ml-model-llama3/_cm.yaml
@@ -0,0 +1,68 @@
alias: get-ml-model-llama3
automation_alias: script
automation_uid: 5b4e0237da074764
cache: true
category: AI/ML models
input_mapping:
  outdirname: CM_OUTDIRNAME
new_env_keys:
  - CM_ML_MODEL_*
  - LLAMA3_CHECKPOINT_PATH
prehook_deps:
  - enable_if_env:
      CM_TMP_REQUIRE_DOWNLOAD:
        - 'yes'
    env: {}
    extra_cache_tags: llama3,llama-3
    force_env_keys:
      - CM_GIT_CHECKOUT_FOLDER
    names:
      - hf-zoo
    tags: get,ml-model,huggingface,zoo,_clone-repo
print_env_at_the_end:
  LLAMA3_CHECKPOINT_PATH: LLAMA3 checkpoint path
tags:
  - get
  - raw
  - ml-model
  - language-processing
  - llama3
  - llama3-405b
uid: 2f8cef2acc334e80
variations:
  fp16:
    default: true
    env:
      CM_ML_MODEL_INPUT_DATA_TYPES: fp16
      CM_ML_MODEL_PRECISION: fp16
      CM_ML_MODEL_WEIGHT_DATA_TYPES: fp16
    group: precision
  meta-llama/Llama-3.1-405B-Instruct:
    adr:
      hf-zoo:
        tags: _model-stub.meta-llama/Llama-3.1-405B-Instruct
    default: true
    env:
      CM_ML_MODEL_NAME: Llama-3-405b-instruct
      CM_MODEL_ZOO_ENV_KEY: LLAMA3
    group: huggingface-stub
  meta-llama/Llama-3.1-8B-Instruct:
    adr:
      hf-zoo:
        tags: _model-stub.meta-llama/Llama-3.1-8B-Instruct
    env:
      CM_ML_MODEL_NAME: Llama-3-8b-instruct
      CM_MODEL_ZOO_ENV_KEY: LLAMA3
    group: huggingface-stub
  vllm:
    default: true
    env:
      CM_ML_MODEL_FRAMEWORK: vllm
    group: framework
  stub.#:
    adr:
      hf-zoo:
        tags: _model-stub.#
    env:
      CM_MODEL_ZOO_ENV_KEY: LLAMA3
    group: huggingface-stub
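
Note: outside CM, the hf-zoo _clone-repo dependency corresponds roughly to pulling the gated checkpoint from Hugging Face; a sketch using huggingface_hub (requires an access token with access granted to the repo; the local path is hypothetical):

    from huggingface_hub import snapshot_download

    # Roughly what the hf-zoo dependency fetches (gated repo).
    checkpoint_path = snapshot_download(
        repo_id="meta-llama/Llama-3.1-405B-Instruct",
        local_dir="/models/Llama-3.1-405B-Instruct",  # hypothetical target path
    )
    print(checkpoint_path)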