
Merge from dev #105


Merged: 29 commits, Jan 4, 2025

Commits
- 8205ab9 Update format.yml (arjunsuresh, Jan 2, 2025)
- 4d6f842 Added submit-mlperf-results script for auto upload of mlperf results (arjunsuresh, Jan 3, 2025)
- 8d1d8b7 [Automated Commit] Format Codebase (mlcommons-bot, Jan 3, 2025)
- c2d2673 Added submit-mlperf-results script for auto upload of mlperf results (arjunsuresh, Jan 3, 2025)
- 5e34929 [Automated Commit] Format Codebase (mlcommons-bot, Jan 3, 2025)
- 9664886 Merge branch 'mlcommons:dev' into dev (arjunsuresh, Jan 3, 2025)
- 2b785b9 Merge pull request #101 from mlcommons/main (arjunsuresh, Jan 3, 2025)
- 79a8d6a Merge branch 'mlcommons:dev' into dev (arjunsuresh, Jan 3, 2025)
- d64b32b Update format.yml (arjunsuresh, Jan 3, 2025)
- bfe7ffc Update customize.py (arjunsuresh, Jan 3, 2025)
- f1cbe13 [Automated Commit] Format Codebase (arjunsuresh, Jan 3, 2025)
- 73fdc90 Merge pull request #102 from GATEOverflow/dev (arjunsuresh, Jan 3, 2025)
- 8ae5223 Added typing_extensions deps to draw-graph-from-json-data (arjunsuresh, Jan 3, 2025)
- d7e89d7 Merge pull request #103 from GATEOverflow/dev (arjunsuresh, Jan 3, 2025)
- 68c35cf Fixed the output parsing for docker container detect (#104) (arjunsuresh, Jan 3, 2025)
- 16f75ef Update test-mlperf-inference-resnet50.yml | Added PAT for MLC (arjunsuresh, Jan 3, 2025)
- 5bcdfb5 Improve setup.py (#106) (arjunsuresh, Jan 3, 2025)
- 81816f9 Increment version to 0.6.19 (mlcommons-bot, Jan 3, 2025)
- 04c1cd1 Updated git_commit_hash.txt (mlcommons-bot, Jan 3, 2025)
- 1b23221 Update test-mlperf-inference-resnet50.yml (arjunsuresh, Jan 3, 2025)
- 89fe114 Improve retinanet github action (#107) (arjunsuresh, Jan 4, 2025)
- a602f0a Fix retinanet github action (#108), added dev timeline (arjunsuresh, Jan 4, 2025)
- 753b094 Improve gh action (#109) (arjunsuresh, Jan 4, 2025)
- 2354c3f Update test-mlperf-inference-resnet50.yml (arjunsuresh, Jan 4, 2025)
- 320ab05 Support GH_PAT for windows in push-mlperf-inference-results-to-github… (arjunsuresh, Jan 4, 2025)
- 309b88c Update test-mlperf-inference-mixtral.yml (arjunsuresh, Jan 4, 2025)
- 32840f3 Update test-mlperf-inference-llama2.yml (arjunsuresh, Jan 4, 2025)
- 84b85ff Update test-mlperf-inference-gptj.yml (arjunsuresh, Jan 4, 2025)
- 4ac4fc4 Update test-mlperf-inference-dlrm.yml (arjunsuresh, Jan 4, 2025)
6 changes: 3 additions & 3 deletions .github/workflows/format.yml

```diff
@@ -53,10 +53,10 @@ jobs:
       run: |
         HAS_CHANGES=$(git diff --staged --name-only)
         if [ ${#HAS_CHANGES} -gt 0 ]; then
-          git config --global user.name mlcommons-bot
-          git config --global user.email "mlcommons-bot@users.noreply.github.com"
+          # Use the GitHub actor's name and email
+          git config --global user.name "${GITHUB_ACTOR}"
+          git config --global user.email "${GITHUB_ACTOR}@users.noreply.github.com"
           # Commit changes
           git commit -m '[Automated Commit] Format Codebase'
           # Use the PAT to push changes
           git push
         fi
```
2 changes: 1 addition & 1 deletion .github/workflows/test-mlperf-inference-dlrm.yml

```diff
@@ -25,7 +25,7 @@ jobs:
         export CM_REPOS=$HOME/GH_CM
         python3 -m pip install cm4mlops
         cm pull repo
-        cm run script --tags=run-mlperf,inference,_performance-only --pull_changes=yes --pull_inference_changes=yes --submitter="MLCommons" --model=dlrm-v2-99 --implementation=reference --backend=pytorch --category=datacenter --scenario=Offline --execution_mode=test --device=${{ matrix.device }} --docker --quiet --test_query_count=1 --target_qps=1 --docker_it=no --docker_cm_repo=gateoverflow@mlperf-automations --adr.compiler.tags=gcc --hw_name=gh_action --docker_dt=yes --results_dir=$HOME/gh_action_results --clean
+        cm run script --tags=run-mlperf,inference,_performance-only --pull_changes=yes --pull_inference_changes=yes --submitter="MLCommons" --model=dlrm-v2-99 --implementation=reference --backend=pytorch --category=datacenter --scenario=Offline --execution_mode=test --device=${{ matrix.device }} --docker --quiet --test_query_count=1 --target_qps=1 --docker_it=no --adr.compiler.tags=gcc --hw_name=gh_action --docker_dt=yes --results_dir=$HOME/gh_action_results --clean

   build_intel:
     if: github.repository_owner == 'gateoverflow_off'
```
2 changes: 1 addition & 1 deletion .github/workflows/test-mlperf-inference-gptj.yml

```diff
@@ -27,5 +27,5 @@ jobs:
         python3 -m pip install cm4mlops
         cm pull repo
         cm run script --tags=run-mlperf,inference,_submission,_short --submitter="MLCommons" --docker --pull_changes=yes --pull_inference_changes=yes --model=gptj-99 --backend=${{ matrix.backend }} --device=cuda --scenario=Offline --test_query_count=1 --precision=${{ matrix.precision }} --target_qps=1 --quiet --docker_it=no --docker_cm_repo=gateoverflow@mlperf-automations --docker_cm_repo_branch=dev --adr.compiler.tags=gcc --beam_size=1 --hw_name=gh_action --docker_dt=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --get_platform_details=yes --implementation=reference --clean
-        cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/mlcommons/mlperf_inference_test_submissions_v5.0 --repo_branch=dev --commit_message="Results from self hosted Github actions - NVIDIARTX4090" --quiet --submission_dir=$HOME/gh_action_submissions
+        cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/mlcommons/mlperf_inference_test_submissions_v5.0 --repo_branch=auto-update --commit_message="Results from self hosted Github actions - NVIDIARTX4090" --quiet --submission_dir=$HOME/gh_action_submissions
```
2 changes: 1 addition & 1 deletion .github/workflows/test-mlperf-inference-llama2.yml

```diff
@@ -32,4 +32,4 @@ jobs:
         git config --global credential.helper store
         huggingface-cli login --token ${{ secrets.HF_TOKEN }} --add-to-git-credential
         cm run script --tags=run-mlperf,inference,_submission,_short --submitter="MLCommons" --pull_changes=yes --pull_inference_changes=yes --model=llama2-70b-99 --implementation=reference --backend=${{ matrix.backend }} --precision=${{ matrix.precision }} --category=datacenter --scenario=Offline --execution_mode=test --device=${{ matrix.device }} --docker --quiet --test_query_count=1 --target_qps=0.001 --docker_it=no --docker_cm_repo=gateoverflow@mlperf-automations --adr.compiler.tags=gcc --hw_name=gh_action --docker_dt=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --env.CM_MLPERF_MODEL_LLAMA2_70B_DOWNLOAD_TO_HOST=yes --clean
-        cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/mlcommons/mlperf_inference_test_submissions_v5.0 --repo_branch=dev --commit_message="Results from self hosted Github actions" --quiet --submission_dir=$HOME/gh_action_submissions
+        cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/mlcommons/mlperf_inference_test_submissions_v5.0 --repo_branch=auto-update --commit_message="Results from self hosted Github actions" --quiet --submission_dir=$HOME/gh_action_submissions
```
2 changes: 1 addition & 1 deletion .github/workflows/test-mlperf-inference-mixtral.yml

```diff
@@ -32,4 +32,4 @@ jobs:
         huggingface-cli login --token ${{ secrets.HF_TOKEN }} --add-to-git-credential
         cm pull repo
         cm run script --tags=run-mlperf,inference,_submission,_short --adr.inference-src.tags=_branch.dev --submitter="MLCommons" --pull_changes=yes --pull_inference_changes=yes --model=mixtral-8x7b --implementation=reference --batch_size=1 --precision=${{ matrix.precision }} --backend=${{ matrix.backend }} --category=datacenter --scenario=Offline --execution_mode=test --device=${{ matrix.device }} --docker_it=no --docker_cm_repo=gateoverflow@mlperf-automations --adr.compiler.tags=gcc --hw_name=gh_action --docker_dt=yes --results_dir=$HOME/gh_action_results --submission_dir=$HOME/gh_action_submissions --docker --quiet --test_query_count=3 --target_qps=0.001 --clean --env.CM_MLPERF_MODEL_MIXTRAL_8X7B_DOWNLOAD_TO_HOST=yes --env.CM_MLPERF_DATASET_MIXTRAL_8X7B_DOWNLOAD_TO_HOST=yes --adr.openorca-mbxp-gsm8k-combined-preprocessed.tags=_size.1
-        cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/mlcommons/mlperf_inference_test_submissions_v5.0 --repo_branch=dev --commit_message="Results from self hosted Github actions - GO-phoenix" --quiet --submission_dir=$HOME/gh_action_submissions
+        cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/mlcommons/mlperf_inference_test_submissions_v5.0 --repo_branch=auto-update --commit_message="Results from self hosted Github actions - GO-phoenix" --quiet --submission_dir=$HOME/gh_action_submissions
```
43 changes: 27 additions & 16 deletions .github/workflows/test-mlperf-inference-resnet50.yml

```diff
@@ -1,11 +1,10 @@
-# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
-# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions
+# Run MLPerf inference ResNet50

 name: MLPerf inference ResNet50

 on:
   pull_request_target:
-    branches: [ "main", "dev", "mlperf-inference" ]
+    branches: [ "main", "dev" ]
     paths:
       - '.github/workflows/test-mlperf-inference-resnet50.yml'
       - '**'
@@ -39,10 +38,20 @@ jobs:
       if: matrix.os == 'windows-latest'
       run: |
         git config --system core.longpaths true
-    - name: Install dependencies
-      run: |
+
+    - name: Install cm4mlops on Windows
+      if: matrix.os == 'windows-latest'
+      run: |
+        $env:CM_PULL_DEFAULT_MLOPS_REPO = "no"; pip install cm4mlops
+    - name: Install dependencies on Unix Platforms
+      if: matrix.os != 'windows-latest'
+      run: |
+        CM_PULL_DEFAULT_MLOPS_REPO=no pip install cm4mlops
+
+    - name: Pull MLOps repo
+      run: |
         pip install "cmind @ git+https://git@github.com/mlcommons/ck.git@mlperf-inference#subdirectory=cm"
         cm pull repo --url=${{ github.event.pull_request.head.repo.html_url }} --checkout=${{ github.event.pull_request.head.ref }}

     - name: Test MLPerf Inference ResNet50 (Windows)
       if: matrix.os == 'windows-latest'
       run: |
@@ -51,17 +60,19 @@
       if: matrix.os != 'windows-latest'
       run: |
         cm run script --tags=run-mlperf,inference,_submission,_short --submitter="MLCommons" --pull_changes=yes --pull_inference_changes=yes --hw_name=gh_${{ matrix.os }}_x86 --model=resnet50 --implementation=${{ matrix.implementation }} --backend=${{ matrix.backend }} --device=cpu --scenario=Offline --test_query_count=500 --target_qps=1 -v --quiet
+    - name: Retrieve secrets from Keeper
+      id: ksecrets
+      uses: Keeper-Security/ksm-action@master
+      with:
+        keeper-secret-config: ${{ secrets.KSM_CONFIG }}
+        secrets: |-
+          ubwkjh-Ii8UJDpG2EoU6GQ/field/Access Token > env:PAT # Fetch PAT and store in environment variable
+
     - name: Push Results
-      if: github.repository_owner == 'gateoverflow'
+      if: github.repository_owner == 'mlcommons'
       env:
-        USER: "GitHub Action"
-        EMAIL: "admin@gateoverflow.com"
-        GITHUB_TOKEN: ${{ secrets.TEST_RESULTS_GITHUB_TOKEN }}
+        GITHUB_TOKEN: ${{ env.PAT }}
       run: |
-        git config --global user.name "${{ env.USER }}"
-        git config --global user.email "${{ env.EMAIL }}"
-        git config --global credential.https://github.com.helper ""
-        git config --global credential.https://github.com.helper "!gh auth git-credential"
-        git config --global credential.https://gist.github.com.helper ""
-        git config --global credential.https://gist.github.com.helper "!gh auth git-credential"
-        cm run script --tags=push,github,mlperf,inference,submission --repo_url=https://github.com/mlcommons/mlperf_inference_test_submissions_v5.0 --repo_branch=auto-update --commit_message="Results from R50 GH action on ${{ matrix.os }}" --quiet
+        git config --global user.name mlcommons-bot
+        git config --global user.email "mlcommons-bot@users.noreply.github.com"
+        cm run script --tags=push,github,mlperf,inference,submission --env.CM_GITHUB_PAT=${{ env.PAT }} --repo_url=https://github.com/mlcommons/mlperf_inference_test_submissions_v5.0 --repo_branch=auto-update --commit_message="Results from R50 GH action on ${{ matrix.os }}" --quiet
```
17 changes: 12 additions & 5 deletions .github/workflows/test-mlperf-inference-retinanet.yml

```diff
@@ -1,11 +1,10 @@
-# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
-# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions
+# Run MLPerf inference Retinanet

 name: MLPerf inference retinanet

 on:
   pull_request_target:
-    branches: [ "main", "dev", "mlperf-inference" ]
+    branches: [ "main", "dev" ]
     paths:
       - '.github/workflows/test-mlperf-inference-retinanet.yml'
       - '**'
@@ -39,10 +38,18 @@ jobs:
       if: matrix.os == 'windows-latest'
       run: |
         git config --system core.longpaths true
-    - name: Install dependencies
-      run: |
+    - name: Install cm4mlops on Windows
+      if: matrix.os == 'windows-latest'
+      run: |
+        $env:CM_PULL_DEFAULT_MLOPS_REPO = "no"; pip install cm4mlops
+    - name: Install dependencies on Unix Platforms
+      if: matrix.os != 'windows-latest'
+      run: |
+        CM_PULL_DEFAULT_MLOPS_REPO=no pip install cm4mlops
+    - name: Pull MLOps repo
+      run: |
         python3 -m pip install "cmind @ git+https://git@github.com/mlcommons/ck.git@mlperf-inference#subdirectory=cm"
         cm pull repo --url=${{ github.event.pull_request.head.repo.html_url }} --checkout=${{ github.event.pull_request.head.ref }}

     - name: Test MLPerf Inference Retinanet using ${{ matrix.backend }} on ${{ matrix.os }}
       if: matrix.os == 'windows-latest'
       run: |
```
72 changes: 72 additions & 0 deletions HISTORY.md
@@ -0,0 +1,72 @@
## Timeline of CM developments

### **🚀 2022: Foundation and Early Developments**

- **March 2022:** Grigori Fursin began developing **CM (Collective Mind)**, also referred to as **CK2**, as a successor to CK [at OctoML](https://github.com/octoml/ck/commits/master/?since=2022-03-01&until=2022-03-31).
- **April 2022:** **Arjun Suresh** joined OctoML and collaborated with Grigori on developing **CM Automation** tools.
- **May 2022:** The **CM CLI** and **Python interface** were successfully [implemented and stabilized](https://github.com/octoml/ck/commits/master/?since=2022-04-01&until=2022-05-31) by Grigori.

---

### **🛠️ July–September 2022: MLPerf Integration and First Submission**

- Arjun completed the development of the **MLPerf Inference Script** within CM.
- OctoML achieved its **first MLPerf Inference submission (v2.1)** using **CM Automation** ([progress here](https://github.com/octoml/ck/commits/master/?since=2022-06-01&until=2022-09-30)).

---

### **📊 October 2022 – March 2023: End-to-End Automation**

- End-to-end MLPerf inference automations were successfully [completed in CM](https://github.com/octoml/ck/commits/master/?since=2022-10-01&until=2023-03-31).
- **Additional benchmarks** and **Power Measurement support** were integrated into CM.
- **cTuning** achieved a successful MLPerf Inference **v3.0 submission** using CM Automation.

---

### **🔄 April 2023: Transition and New Funding**

- Arjun and Grigori departed OctoML and resumed **CM development** under funding from **cKnowledge.org** and **cTuning**.

---

### **🚀 April–October 2023: Expanded Support and Milestone Submission**

- MLPerf inference automations were [extended](https://github.com/mlcommons/ck/commits/master?since=2023-04-01&until=2023-10-31) to support **NVIDIA implementations**.
- **cTuning** achieved the **largest-ever MLPerf Inference submission (v3.1)** using CM Automation.

---

### **🤝 November 2023: MLCommons Partnership**

- **MLCommons** began funding CM development to enhance support for **NVIDIA MLPerf inference** and introduce support for **Intel** and **Qualcomm MLPerf inference** implementations.

---

### **🌐 October 2023 – March 2024: Multi-Platform Expansion**

- MLPerf inference automations were [expanded](https://github.com/mlcommons/ck/commits/master?since=2023-10-01&until=2024-03-15) to support **NVIDIA, Intel, and Qualcomm implementations**.
- **cTuning** completed the **MLPerf Inference v4.0 submission** using CM Automation.

---

### **📝 April 2024: Documentation Improvements**

- MLCommons contracted **Arjun Suresh** via **GATEOverflow** to improve **MLPerf inference documentation** and enhance CM Automation on various platforms.

---

### **👥 May 2024: Team Expansion**

- **Anandhu Sooraj** joined MLCommons to collaborate with **Arjun Suresh** on CM development.

---

### **📖 June–December 2024: Enhanced Documentation and Automation**

- **Dedicated documentation site** launched for **MLPerf inference**.
- **CM scripts** were developed for **MLPerf Automotive**.
- **CM Docker support** was stabilized.
- **GitHub Actions workflows** were added for **MLPerf inference reference implementations** and **NVIDIA integrations** ([see updates](https://github.com/mlcommons/mlperf-automations/commits/main?since=2024-06-01&until=2024-12-31)).

---

2 changes: 1 addition & 1 deletion VERSION

```diff
@@ -1 +1 @@
-0.6.18
+0.6.19
```
2 changes: 1 addition & 1 deletion git_commit_hash.txt

```diff
@@ -1 +1 @@
-76796b4c3966b04011c3cb6118412516c90ba50b
+81816f94c4a396a012412cb3a1cf4096b4ad103e
```
1 change: 1 addition & 0 deletions script/draw-graph-from-json-data/_cm.yaml

```diff
@@ -19,3 +19,4 @@ deps:
   - python3
   - tags: get,generic-python-lib,_package.networkx
   - tags: get,generic-python-lib,_package.matplotlib
+  - tags: get,generic-python-lib,_package.typing_extensions
```
6 changes: 6 additions & 0 deletions script/push-mlperf-inference-results-to-github/customize.py

```diff
@@ -1,6 +1,7 @@
 from cmind import utils
 import cmind as cm
 import os
+from giturlparse import parse


 def preprocess(i):
@@ -32,6 +33,11 @@ def preprocess(i):
     env['CM_MLPERF_RESULTS_REPO_COMMIT_MESSAGE'] = env.get(
         'CM_MLPERF_RESULTS_REPO_COMMIT_MESSAGE', 'Added new results')

+    p = parse(repo)
+    if env.get('CM_GITHUB_PAT', '') != '':
+        token = env['CM_GITHUB_PAT']
+        env['CM_SET_REMOTE_URL_CMD'] = f"""git remote set-url origin https://git:{token}@{p.host}/{p.owner}/{p.repo}"""
+
     return {'return': 0}
```
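The customize.py change above builds a push URL with an embedded personal access token via giturlparse. A stdlib-only sketch of the same rewrite (the helper name is hypothetical, not part of the repo) might look like:

```python
from urllib.parse import urlparse


def remote_url_with_token(repo_url: str, token: str) -> str:
    # Embed a PAT so `git push` can authenticate non-interactively:
    # https://git:<token>@<host>/<owner>/<repo>
    parts = urlparse(repo_url)
    owner, repo = parts.path.strip("/").split("/")[:2]
    return f"https://git:{token}@{parts.netloc}/{owner}/{repo}"
```

Note the token ends up in the remote URL (and thus in `.git/config`), which is why the script only sets it when `CM_GITHUB_PAT` is provided.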
3 changes: 3 additions & 0 deletions script/push-mlperf-inference-results-to-github/run.bat

```diff
@@ -25,6 +25,9 @@ REM Check if the previous command was successful
 if %errorlevel% neq 0 exit /b %errorlevel%

 git commit -a -m "%CM_MLPERF_RESULTS_REPO_COMMIT_MESSAGE%"
+
+if defined CM_MLPERF_INFERENCE_SUBMISSION_DIR call %CM_SET_REMOTE_URL_CMD%
+
 git push

 REM Check if the previous command was successful
```
5 changes: 5 additions & 0 deletions script/push-mlperf-inference-results-to-github/run.sh

```diff
@@ -16,5 +16,10 @@ fi
 test $? -eq 0 || exit $?

 git commit -a -m "${CM_MLPERF_RESULTS_REPO_COMMIT_MESSAGE}"
+
+if [[ -n ${CM_SET_REMOTE_URL_CMD} ]]; then
+  ${CM_SET_REMOTE_URL_CMD}
+fi
+
 git push
 test $? -eq 0 || exit $?
```
15 changes: 14 additions & 1 deletion script/run-docker-container/customize.py

```diff
@@ -71,7 +71,20 @@ def preprocess(i):

     if len(out) > 0 and str(env.get('CM_DOCKER_REUSE_EXISTING_CONTAINER',
                                     '')).lower() in ["1", "true", "yes"]:  # container exists
-        out_json = json.loads(out)
+        # print(out)
+        out_split = out.splitlines()
+        if len(out_split) > 0:
+            try:
+                out_json = json.loads(out_split[0])
+                # print("JSON successfully loaded:", out_json)
+            except json.JSONDecodeError as e:
+                print(f"Error: First line of 'out' is not valid JSON: {e}")
+                return {
+                    'return': 1, 'error': f"Error: First line of 'out' is not valid JSON: {e}"}
+        else:
+            out_json = []
+
+        if isinstance(out_json, list) and len(out_json) > 0:
             existing_container_id = out_json[0]['Id']
             print(f"Reusing existing container {existing_container_id}")
             env['CM_DOCKER_CONTAINER_ID'] = existing_container_id
```
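The fix above handles the fact that `docker ps` with a JSON format template emits one JSON object per line, so feeding the whole output to `json.loads` fails whenever more than one container matches. A minimal standalone sketch of the per-line parsing idea (hypothetical helper, assuming each line carries an `Id` field as in the script above):

```python
import json


def first_container_id(docker_ps_output: str):
    # `docker ps` JSON output is newline-delimited: one object per line.
    # Parse only the first line instead of the whole blob.
    lines = docker_ps_output.splitlines()
    if not lines:
        return None
    record = json.loads(lines[0])  # raises json.JSONDecodeError if malformed
    return record.get("Id") if isinstance(record, dict) else None
```

With two containers running, the old single-`json.loads` approach would raise on the second line; the line-wise version simply returns the first match.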
20 changes: 12 additions & 8 deletions setup.py

```diff
@@ -146,14 +146,18 @@ def custom_function(self):
                            'force': True,
                            'all': True})
         branch = os.environ.get('CM_MLOPS_REPO_BRANCH', 'dev')
-        r = cmind.access({'action': 'pull',
-                          'automation': 'repo',
-                          'artifact': 'mlcommons@mlperf-automations',
-                          'checkout': commit_hash,
-                          'branch': branch})
-        print(r)
-        if r['return'] > 0:
-            return r['return']
+        pull_default_mlops_repo = os.environ.get(
+            'CM_PULL_DEFAULT_MLOPS_REPO', 'true')
+
+        if str(pull_default_mlops_repo).lower() not in ["no", "0", "false"]:
+            r = cmind.access({'action': 'pull',
+                              'automation': 'repo',
+                              'artifact': 'mlcommons@mlperf-automations',
+                              'checkout': commit_hash,
+                              'branch': branch})
+            print(r)
+            if r['return'] > 0:
+                return r['return']

     def get_sys_platform(self):
         self.system = platform.system()
```
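The setup.py change gates the default repo pull behind the `CM_PULL_DEFAULT_MLOPS_REPO` environment variable, which is what lets the workflows above run `CM_PULL_DEFAULT_MLOPS_REPO=no pip install cm4mlops` and then pull the PR's own fork instead. The gating logic, isolated as a small sketch (hypothetical helper name):

```python
import os


def should_pull_default_repo(env=os.environ) -> bool:
    # Opt out with "no", "0", or "false" (case-insensitive);
    # any other value, or an unset variable, keeps the default pull.
    value = env.get("CM_PULL_DEFAULT_MLOPS_REPO", "true")
    return str(value).lower() not in ("no", "0", "false")
```

This opt-out default preserves the old behavior for normal installs while letting CI substitute the branch under test.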