Commit c76842f

feat: add some utility scripts ✨ (#738)
1 parent bf0838d commit c76842f

File tree

5 files changed: +134 −37 lines

package.sh

Lines changed: 0 additions & 37 deletions
This file was deleted.

scripts/README.md

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@

# Utility Scripts

This folder contains a collection of utility scripts, listed and explained below.

> All scripts must be run from the project root, unless otherwise noted.

## gen_benchmark.py

Generates a benchmark by collecting results from the [configs](../configs) folder. Usage:

```shell
python ./scripts/gen_benchmark.py
```

It generates a markdown file named `benchmark_results.md`.

## package.sh (Deprecated)

Builds a wheel package of `mindcv` and writes a sha256sum for each wheel file. Usage:

```shell
./scripts/package.sh
```

**New!** Simply run the following command to build the wheel:

```shell
python -m build
```

## launch_dist.sh or launch_dist.py

A simple, clean launcher for distributed training on **_Ascend_**.
Following the [instructions](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.1/parallel/startup_method.html) from MindSpore, besides launching distributed training with `mpirun`, we can also use multiprocessing
with the multi-card networking configuration `rank_table.json` to manually start one process per card.
To generate `rank_table.json` on your machine, try the hccl tools from [here](https://gitee.com/mindspore/models/tree/master/utils/hccl_tools).

> After you have the `rank_table.json`, replace the `"/path/to/rank_table.json"` in `launch_dist.sh` with the actual path.

Now you can replace standalone launching with distributed launching:

```diff
- python script.py --arg1=value1 --arg2=value2
+ ./scripts/launch_dist.sh script.py --arg1=value1 --arg2=value2
```

where `--arg*` are the arguments of `script.py`.

For example:

```shell
./scripts/launch_dist.sh train.py --config=configs/resnet/resnet_50_ascend.yaml --data_dir=/my/awesome/dataset
```

> Note: Don't forget to set the argument `--distribute` if you are using `train.py` or `train_with_func.py`!

For anyone who dislikes shell scripts, we offer a Python script `launch_dist.py` as well. Both are used in the same way!
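From the training-script side, the launcher contract is just two environment variables. A minimal sketch of reading them, assuming a hypothetical helper `get_dist_context` (not part of `mindcv`):

```python
import os

def get_dist_context():
    """Read the per-process IDs exported by launch_dist.sh / launch_dist.py.

    Both launchers export RANK_ID (logical id) and DEVICE_ID (physical id)
    before starting each rank; defaulting to "0" covers standalone runs.
    """
    rank_id = int(os.environ.get("RANK_ID", "0"))
    device_id = int(os.environ.get("DEVICE_ID", "0"))
    return rank_id, device_id

# Simulate what the launcher exports for rank 3:
os.environ["RANK_ID"] = "3"
os.environ["DEVICE_ID"] = "3"
print(get_dist_context())  # → (3, 3)
```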

scripts/launch_dist.py

Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
#!/usr/bin/env python
# Usage:
# ./scripts/launch_dist.py script.py --arg1=value1 --arg2=value2
# Example:
# ./scripts/launch_dist.py train.py --config=configs/resnet/resnet_50_ascend.yaml --data_dir=/my/awesome/dataset

import multiprocessing as mp
import os
import sys

BIAS = 0
RANK_SIZE = 8
RANK_TABLE_FILE = "/path/to/rank_table.json"


def worker(rank_id, script, args):
    os.environ["RANK_ID"] = f"{rank_id}"  # logical id
    os.environ["DEVICE_ID"] = f"{rank_id + BIAS}"  # physical id
    os.environ["RANK_TABLE_FILE"] = RANK_TABLE_FILE
    print(f"Launching rank: {os.getenv('RANK_ID')}, device: {os.getenv('DEVICE_ID')}, pid: {os.getpid()}")
    os.system(f"python -u {script} {args}")


if __name__ == "__main__":
    mp.set_start_method("spawn")

    script_, args_ = sys.argv[1], " ".join(sys.argv[2:])
    print(f"Script: {script_}, Args: {args_}")
    processes = [mp.Process(target=worker, args=(i, script_, args_)) for i in range(RANK_SIZE)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
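As a design note, the same fan-out can be written with `subprocess.Popen` and an explicit per-rank environment dict, which avoids mutating `os.environ` inside each worker. A minimal sketch under that assumption; the `launch` helper and the rank size of 2 are illustrative, not part of the script above:

```python
import os
import subprocess
import sys

def launch(script_args, rank_size=2, bias=0):
    # Start one process per rank, passing RANK_ID / DEVICE_ID through an
    # explicit env dict instead of mutating os.environ in a worker
    # (rank-table handling omitted for brevity).
    procs = []
    for rank_id in range(rank_size):
        env = dict(os.environ, RANK_ID=str(rank_id), DEVICE_ID=str(rank_id + bias))
        procs.append(subprocess.Popen([sys.executable, "-u", *script_args], env=env))
    return [p.wait() for p in procs]  # exit codes, in rank order

# Demo with an inline stub standing in for a real training script:
codes = launch(["-c", "import os; print('rank', os.environ['RANK_ID'])"])
print(codes)  # → [0, 0]
```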

scripts/launch_dist.sh

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
#!/bin/bash
# Usage:
# ./scripts/launch_dist.sh script.py --arg1=value1 --arg2=value2
# Example:
# ./scripts/launch_dist.sh train.py --config=configs/resnet/resnet_50_ascend.yaml --data_dir=/my/awesome/dataset

export RANK_SIZE=8
export RANK_TABLE_FILE="/path/to/rank_table.json"


echo "Script: $1, Args: ${@:2}"  # ${parameter:offset:length}

# trap SIGINT to execute kill 0, which will kill all processes
trap 'kill 0' SIGINT
for ((i = 0; i < RANK_SIZE; i++)); do
    export RANK_ID=$i
    export DEVICE_ID=$i
    echo "Launching rank: ${RANK_ID}, device: ${DEVICE_ID}"
    python -u "$@" &
done
# wait for all processes to finish
wait

scripts/package.sh

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
#!/bin/bash

set -e

BASE_PATH=$(cd "$(dirname "$0")/.."; pwd)
OUTPUT_PATH="${BASE_PATH}/output"


if [[ -d "${OUTPUT_PATH}" ]]; then
    rm -rf "${OUTPUT_PATH}"
fi
mkdir -pv "${OUTPUT_PATH}"

python "${BASE_PATH}/setup.py" bdist_wheel

mv "${BASE_PATH}"/dist/*whl "${OUTPUT_PATH}"

cd "${OUTPUT_PATH}" || exit
PACKAGE_LIST=$(ls mindcv-*.whl) || exit
for PACKAGE_NAME in ${PACKAGE_LIST}; do
    echo "writing sha256sum of ${PACKAGE_NAME}"
    sha256sum -b "${PACKAGE_NAME}" > "${PACKAGE_NAME}.sha256"
done
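The `.sha256` sidecar files written above use the `sha256sum -b` line format (`<digest> *<filename>`). A small sketch of checking an artifact against its sidecar from Python; `sha256_of` and `verify_sidecar` are hypothetical helpers, not part of this commit:

```python
import hashlib
from pathlib import Path

def sha256_of(path):
    # Stream the file in chunks; yields the same hex digest `sha256sum` prints.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_sidecar(artifact):
    # The sidecar's first whitespace-separated field is the recorded digest.
    recorded = Path(f"{artifact}.sha256").read_text().split()[0]
    return sha256_of(artifact) == recorded

# Demo on a throwaway file standing in for a wheel:
Path("demo.whl").write_bytes(b"not a real wheel")
Path("demo.whl.sha256").write_text(f"{sha256_of('demo.whl')} *demo.whl\n")
print(verify_sidecar("demo.whl"))  # → True
```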
