
Code changes for supporting llama3_1-405b reference implementation #111


Merged: 9 commits, Jan 5, 2025
58 changes: 58 additions & 0 deletions script/app-mlperf-inference-mlcommons-python/_cm.yaml
@@ -496,6 +496,20 @@ deps:
    RGAT_CHECKPOINT_PATH:
    - 'on'


## LLAMA3_1-405B
- tags: get,ml-model,llama3
  names:
  - llama3-405b-model
  - llama3-402b-model
  enable_if_env:
    CM_MODEL:
    - llama3_1-405b
    - llama3-402b
  skip_if_env:
    CM_ML_MODEL_LLAMA3_CHECKPOINT_PATH:
    - 'on'
Collaborator:

This should also be skipped only if we are in the docker build stage. Otherwise, when the path is given, we should register it in the CM cache.

Contributor (author):

Updated in commit bfb7cb6
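
For context, one way to express that guard in CM is to require the docker run state in the skip condition, so the dependency is skipped only when a checkpoint path is already set and we are inside the docker build stage. A minimal sketch, assuming CM_RUN_STATE_DOCKER is the env key that marks the docker stage and that skip_if_env only fires when all listed keys match; the actual change in commit bfb7cb6 may differ:

## hypothetical sketch, not the literal commit
- tags: get,ml-model,llama3
  enable_if_env:
    CM_MODEL:
    - llama3_1-405b
  skip_if_env:
    CM_ML_MODEL_LLAMA3_CHECKPOINT_PATH:
    - 'on'
    CM_RUN_STATE_DOCKER:
    - 'yes'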


########################################################################
# Install datasets

@@ -635,6 +649,20 @@ deps:
    CM_USE_DATASET_FROM_HOST:
    - 'yes'

## llama3_1 dataset
- tags: get,dataset,mlperf,inference,llama3,_validation
  names:
  - llama3_1-dataset
  - llama3-dataset
  enable_if_env:
    CM_MODEL:
    - llama3_1-405b
    - llama3-402b
  skip_if_env:
    CM_DATASET_LLAMA3_PATH:
    - "on"
Collaborator:

Same here as for the model.

Contributor (author):

Updated in commit bfb7cb6
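
The dataset dependency takes the same shape of guard; a matching sketch under the same CM_RUN_STATE_DOCKER assumption as above:

## hypothetical sketch, mirroring the model dependency
- tags: get,dataset,mlperf,inference,llama3,_validation
  skip_if_env:
    CM_DATASET_LLAMA3_PATH:
    - "on"
    CM_RUN_STATE_DOCKER:
    - 'yes'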



########################################################################
# Install MLPerf inference dependencies

@@ -1281,6 +1309,36 @@ variations:
      CM_TMP_GENERIC_PYTHON_PIP_EXTRA_FIND_LINKS_URL: "https://data.pyg.org/whl/torch-<<<CM_TORCH_VERSION>>>+cpu.html"
      CM_TMP_GENERIC_PYTHON_PIP_EXTRA_FIND_LINKS_URL_DGL: "https://data.dgl.ai/wheels/torch-<<<CM_TORCH_VERSION_MAJOR_MINOR>>>/repo.html"

  llama3_1-405b:
    group: models
    env:
      CM_MODEL: llama3_1-405b
    adr:
      pytorch:
        version_max: 2.5.1
        CM_MODEL: llama3-402b
    deps:
    - tags: get,generic-python-lib,_package.torchvision
    - tags: get,generic-python-lib,_package.torchaudio
    - tags: get,generic-python-lib,_package.torch-geometric
    - tags: get,generic-python-lib,_package.transformers
    - tags: get,generic-python-lib,_package.sentencepiece
    - tags: get,generic-python-lib,_package.accelerate
    - tags: get,generic-python-lib,_package.vllm
      env:
        CM_GENERIC_PYTHON_PIP_EXTRA: "--upgrade"
    - tags: get,generic-python-lib,_package.pybind11
    - tags: get,generic-python-lib,_package.pandas
      version_max: 2.2.1

  llama3_1-405b,cuda:
    env:
      CM_GENERIC_PYTHON_PIP_EXTRA_FIND_LINKS_URL: "https://data.pyg.org/whl/torch-<<<CM_TORCH_VERSION>>>.html"

  llama3_1-405b,cpu:
    env:
      CM_GENERIC_PYTHON_PIP_EXTRA_FIND_LINKS_URL: "https://data.pyg.org/whl/torch-<<<CM_TORCH_VERSION>>>+cpu.html"

  # Target devices
  cpu:
    group: device
30 changes: 28 additions & 2 deletions script/app-mlperf-inference-mlcommons-python/customize.py
@@ -68,7 +68,7 @@ def preprocess(i):
        str(env['CM_MLPERF_LOADGEN_BATCH_SIZE'])

    if env.get('CM_MLPERF_LOADGEN_QUERY_COUNT', '') != '' and not env.get('CM_TMP_IGNORE_MLPERF_QUERY_COUNT', False) and (
-           env['CM_MLPERF_LOADGEN_MODE'] == 'accuracy' or 'gptj' in env['CM_MODEL'] or 'llama2' in env['CM_MODEL'] or 'mixtral' in env['CM_MODEL']) and env.get('CM_MLPERF_RUN_STYLE', '') != "valid":
+           env['CM_MLPERF_LOADGEN_MODE'] == 'accuracy' or 'gptj' in env['CM_MODEL'] or 'llama2' in env['CM_MODEL'] or 'mixtral' in env['CM_MODEL'] or 'llama3' in env['CM_MODEL']) and env.get('CM_MLPERF_RUN_STYLE', '') != "valid":
        env['CM_MLPERF_LOADGEN_EXTRA_OPTIONS'] += " --count " + \
            env['CM_MLPERF_LOADGEN_QUERY_COUNT']

@@ -127,7 +127,7 @@ def preprocess(i):
    if 'CM_MLPERF_USER_CONF' in env:
        user_conf_path = env['CM_MLPERF_USER_CONF']
        x = "" if os_info['platform'] == 'windows' else "'"
-       if 'llama2-70b' in env['CM_MODEL'] or "mixtral-8x7b" in env["CM_MODEL"]:
+       if 'llama2-70b' in env['CM_MODEL'] or "mixtral-8x7b" in env["CM_MODEL"] or "llama3" in env["CM_MODEL"]:
            scenario_extra_options += " --user-conf " + x + user_conf_path + x
        else:
            scenario_extra_options += " --user_conf " + x + user_conf_path + x
@@ -499,6 +499,32 @@ def get_run_cmd_reference(

        if env.get('CM_ACTIVATE_RGAT_IN_MEMORY', '') == "yes":
            cmd += " --in-memory "

elif "llama3" in env['CM_MODEL']:
env['RUN_DIR'] = os.path.join(
env['CM_MLPERF_INFERENCE_SOURCE'],
"language",
"llama3.1-405b")

if int(env.get('CM_MLPERF_INFERENCE_TP_SIZE', '')) > 1:
env['VLLM_WORKER_MULTIPROC_METHOD'] = "spawn"

cmd = env['CM_PYTHON_BIN_WITH_PATH'] + " main.py " \
" --scenario " + env['CM_MLPERF_LOADGEN_SCENARIO'] + \
" --dataset-path " + env['CM_DATASET_LLAMA3_PATH'] + \
" --output-log-dir " + env['CM_MLPERF_OUTPUT_DIR'] + \
' --dtype ' + env['CM_MLPERF_MODEL_PRECISION'] + \
" --model-path " + env['CM_ML_MODEL_LLAMA3_CHECKPOINT_PATH'] + \
" --tensor-parallel-size " + env['CM_MLPERF_INFERENCE_TP_SIZE'] + \
" --vllm "

if env.get('CM_MLPERF_INFERENCE_NUM_WORKERS', '') != '':
cmd += f" --num-workers {env['CM_MLPERF_INFERENCE_NUM_WORKERS']}"


cmd = cmd.replace("--count", "--total-sample-count")
cmd = cmd.replace("--max-batchsize", "--batch-size")


    if env.get('CM_NETWORK_LOADGEN', '') in ["lon", "sut"]:
        cmd = cmd + " " + "--network " + env['CM_NETWORK_LOADGEN']
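For illustration, here is a hypothetical set of values for the env keys consumed by the llama3 branch above, with the command it would assemble; every path and the TP size are placeholders, not values from this PR:

# hypothetical inputs, illustrative only
CM_MLPERF_LOADGEN_SCENARIO: Offline
CM_DATASET_LLAMA3_PATH: /data/mlperf_llama3.1_405b_dataset_8313_processed_fp16_eval.pkl
CM_MLPERF_OUTPUT_DIR: /results/llama3_1-405b/offline
CM_MLPERF_MODEL_PRECISION: bfloat16
CM_ML_MODEL_LLAMA3_CHECKPOINT_PATH: /models/Llama-3.1-405B-Instruct
CM_MLPERF_INFERENCE_TP_SIZE: '8'
# resulting command:
# python3 main.py --scenario Offline \
#   --dataset-path /data/mlperf_llama3.1_405b_dataset_8313_processed_fp16_eval.pkl \
#   --output-log-dir /results/llama3_1-405b/offline --dtype bfloat16 \
#   --model-path /models/Llama-3.1-405B-Instruct --tensor-parallel-size 8 --vllm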
33 changes: 33 additions & 0 deletions script/app-mlperf-inference/_cm.yaml
@@ -221,6 +221,8 @@ variations:
          tags: _int32
        cnndm-accuracy-script:
          tags: _int32
        llama3_1-405b-accuracy-script:
          tags: _int32
    env:
      CM_MLPERF_PYTHON: 'yes'
      CM_MLPERF_IMPLEMENTATION: mlcommons_python
@@ -272,6 +274,10 @@ variations:
    default_variations:
      backend: pytorch

  reference,llama3_1-405b:
    default_variations:
      backend: pytorch

  reference,mixtral-8x7b:
    default_variations:
      backend: pytorch
@@ -795,6 +801,31 @@ variations:
        - igbh-original
        - igbh-dataset

  llama3_1-405b:
    group: model
    add_deps_recursive:
      mlperf-inference-implementation:
        tags: _llama3_1-405b
    env:
      CM_MODEL: llama3_1-405b
    posthook_deps:
    - enable_if_env:
        CM_MLPERF_LOADGEN_MODE:
        - accuracy
        - all
        CM_MLPERF_ACCURACY_RESULTS_DIR:
        - 'on'
      skip_if_env:
        CM_MLPERF_IMPLEMENTATION:
        - nvidia
      names:
      - mlperf-accuracy-script
      - llama3_1-405b-accuracy-script
      tags: run,accuracy,mlperf,_dataset_llama3


  sdxl:
    group: model
@@ -1682,6 +1713,8 @@ variations:
        tags: _mlcommons
      intel-harness:
        tags: _v4.1
      inference-src:
        version: r5.0
  default_env:
    CM_SKIP_SYS_UTILS: 'yes'
    CM_REGENERATE_MEASURE_FILES: 'yes'
56 changes: 56 additions & 0 deletions script/get-dataset-mlperf-inference-llama3/_cm.yaml
@@ -0,0 +1,56 @@
alias: get-dataset-mlperf-inference-llama3
automation_alias: script
automation_uid: 5b4e0237da074764
cache: true
tags:
- get
- dataset
- mlperf
- llama3
- inference
uid: c3bc69599cbc4db7
new_env_keys:
- CM_DATASET_LLAMA3_PATH
input_mapping:
  path_to_dataset: CM_DATASET_LLAMA3_PATH
prehook_deps:
- env:
    CM_DOWNLOAD_FINAL_ENV_NAME: CM_DATASET_LLAMA3_PATH
    CM_EXTRACT_TO_FOLDER: llama-3-dataset
  extra_cache_tags: dataset,llama3
  force_cache: true
  enable_if_env:
    CM_TMP_REQUIRE_DOWNLOAD:
    - 'yes'
  names:
  - dae
  tags: download-and-extract
  update_tags_from_env_with_prefix:
    _url.:
    - CM_DOWNLOAD_URL
variations:
  validation:
    default: true
    group: dataset-type
    env:
      CM_RCLONE_URL: mlc-inference:mlcommons-inference-wg-public/llama3_405b/mlperf_llama3.1_405b_dataset_8313_processed_fp16_eval.pkl
      CM_DATASET_TYPE: validation
      CM_DATASET_FILE_NAME: mlperf_llama3.1_405b_dataset_8313_processed_fp16_eval.pkl
  calibration:
    group: dataset-type
    env:
      CM_RCLONE_URL: mlc-inference:mlcommons-inference-wg-public/llama3_405b/mlperf_llama3.1_405b_calibration_dataset_512_processed_fp16_eval.pkl
      CM_DATASET_TYPE: calibration
      CM_DATASET_FILE_NAME: mlperf_llama3.1_405b_calibration_dataset_512_processed_fp16_eval.pkl
  rclone:
    add_deps_recursive:
      dae:
        tags: _rclone
    default: true
    env:
      CM_DOWNLOAD_FILENAME: checkpoint
      CM_DOWNLOAD_URL: <<<CM_RCLONE_URL>>>
      CM_RCLONE_CONFIG_NAME: mlc-inference
    group: download-tool
print_env_at_the_end:
  CM_DATASET_LLAMA3_PATH: Path to the dataset
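
For illustration, with the defaults above (_validation plus _rclone), the download-and-extract dependency runs with roughly this resolved env; all values come straight from the variation blocks:

CM_DOWNLOAD_URL: mlc-inference:mlcommons-inference-wg-public/llama3_405b/mlperf_llama3.1_405b_dataset_8313_processed_fp16_eval.pkl
CM_RCLONE_CONFIG_NAME: mlc-inference
CM_DOWNLOAD_FILENAME: checkpoint
CM_EXTRACT_TO_FOLDER: llama-3-dataset
CM_DOWNLOAD_FINAL_ENV_NAME: CM_DATASET_LLAMA3_PATH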
27 changes: 27 additions & 0 deletions script/get-dataset-mlperf-inference-llama3/customize.py
@@ -0,0 +1,27 @@
from cmind import utils
import os


def preprocess(i):

    os_info = i['os_info']

    env = i['env']

    if os_info['platform'] == "windows":
        return {'return': 1, 'error': 'Script not supported in Windows yet!'}

    # Trigger the download-and-extract dependency only when no local dataset path was given
    if env.get('CM_DATASET_LLAMA3_PATH', '') == '':
        env['CM_TMP_REQUIRE_DOWNLOAD'] = "yes"

    return {'return': 0}


def postprocess(i):

    env = i['env']

    # After a download, point the dataset path at the extracted .pkl file
    if env.get('CM_TMP_REQUIRE_DOWNLOAD', '') == "yes":
        env['CM_DATASET_LLAMA3_PATH'] = os.path.join(
            env['CM_DATASET_LLAMA3_PATH'], env['CM_DATASET_FILE_NAME'])

    return {'return': 0}
90 changes: 90 additions & 0 deletions script/get-ml-model-llama3/_cm.yaml
@@ -0,0 +1,90 @@
alias: get-ml-model-llama3
automation_alias: script
automation_uid: 5b4e0237da074764
cache: true
category: AI/ML models
input_mapping:
  path: CM_ML_MODEL_LLAMA3_DOWNLOAD_PATH
new_env_keys:
- CM_ML_MODEL_*
- LLAMA3_CHECKPOINT_PATH
prehook_deps:
- enable_if_env:
    CM_TMP_REQUIRE_DOWNLOAD:
    - 'yes'
  env: {}
  extra_cache_tags: llama3,llama-3
  force_env_keys:
  - CM_GIT_CHECKOUT_FOLDER
  names:
  - hf-zoo
  tags: get,ml-model,huggingface,zoo,_clone-repo
print_env_at_the_end:
  LLAMA3_CHECKPOINT_PATH: LLAMA3 checkpoint path
tags:
- get
- raw
- ml-model
- language-processing
- llama3
- llama3-405b
uid: 2f8cef2acc334e80
variations:
  fp16:
    default: true
    env:
      CM_ML_MODEL_INPUT_DATA_TYPES: fp16
      CM_ML_MODEL_PRECISION: fp16
      CM_ML_MODEL_WEIGHT_DATA_TYPES: fp16
    group: precision
  meta-llama/Llama-3.1-405B-Instruct:
    adr:
      hf-zoo:
        tags: _model-stub.meta-llama/Llama-3.1-405B-Instruct
    default: true
    env:
      CM_ML_MODEL_NAME: Llama-3-405b-instruct
      CM_MODEL_ZOO_ENV_KEY: LLAMA3
    group: huggingface-stub
  meta-llama/Llama-3.1-8B-Instruct:
    adr:
      hf-zoo:
        tags: _model-stub.meta-llama/Llama-3.1-8B-Instruct
    env:
      CM_ML_MODEL_NAME: Llama-3-8b-instruct
      CM_MODEL_ZOO_ENV_KEY: LLAMA3
    group: huggingface-stub
  vllm:
    default: true
    env:
      CM_ML_MODEL_FRAMEWORK: vllm
    group: framework
  stub.#:
    adr:
      hf-zoo:
        tags: _model-stub.#
    env:
      CM_MODEL_ZOO_ENV_KEY: LLAMA3
    group: huggingface-stub
docker:
Collaborator:

Is this docker section needed?

Contributor (author):

Updated in commit a21f2b0

  use_host_group_id: True
  use_host_user_id: True
  real_run: false
  pass_user_group: True  # useful if docker is run by a different user from the one who built it and under the same group
  pre_run_cmds:
  #- cm pull repo && cm run script --tags=get,git,repo,_repo.https://github.com/GATEOverflow/inference_results_v4.0.git --update
  - cm pull repo
  mounts:
  - "${{ CM_OUTDIRNAME }}:${{ CM_OUTDIRNAME }}"
  - "${{ CM_ML_MODEL_LLAMA3_DOWNLOAD_PATH }}:${{ CM_ML_MODEL_LLAMA3_DOWNLOAD_PATH }}"
  skip_run_cmd: 'no'
  shm_size: '32gb'
  interactive: True
  extra_run_args: ' --dns 8.8.8.8 --dns 8.8.4.4 --cap-add SYS_ADMIN --cap-add SYS_TIME --security-opt apparmor=unconfined --security-opt seccomp=unconfined'
  os: ubuntu
  cm_repo: mlcommons@mlperf-automations
  cm_repo_branch: dev
  os_version: '22.04'
docker_input_mapping:
  outdirname: CM_OUTDIRNAME
  path: CM_ML_MODEL_LLAMA3_DOWNLOAD_PATH
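
For illustration, input_mapping and docker_input_mapping above turn script inputs into env keys, and the mounts entries make the same directories visible inside the container. A hypothetical invocation with --path=/scratch/llama3 and --outdirname=/scratch/out (placeholder paths) would export:

# hypothetical resolved env, illustrative only
CM_ML_MODEL_LLAMA3_DOWNLOAD_PATH: /scratch/llama3
CM_OUTDIRNAME: /scratch/out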
29 changes: 29 additions & 0 deletions script/get-ml-model-llama3/customize.py
@@ -0,0 +1,29 @@
from cmind import utils
import os


def preprocess(i):

    os_info = i['os_info']
    env = i['env']

    path = env.get('CM_ML_MODEL_LLAMA3_DOWNLOAD_PATH', '').strip()

    if path != "":
        # Clone the checkpoint into <path>/<model name> instead of the CM cache
        os.makedirs(path, exist_ok=True)
        env['CM_GIT_CHECKOUT_FOLDER'] = os.path.join(
            path, env['CM_ML_MODEL_NAME'])

    env['CM_TMP_REQUIRE_DOWNLOAD'] = 'yes'

    return {'return': 0}


def postprocess(i):

    env = i['env']

    # Expose the checkpoint location under the key the inference scripts expect
    env['CM_ML_MODEL_LLAMA3_CHECKPOINT_PATH'] = env['LLAMA3_CHECKPOINT_PATH']
    env['CM_GET_DEPENDENT_CACHED_PATH'] = env['CM_ML_MODEL_PATH']

    return {'return': 0}