Commit 2c62cd9

polish README and scripts
1 parent c525551 commit 2c62cd9

3 files changed: +66 additions, -45 deletions


experiments/README.md

Lines changed: 46 additions & 24 deletions
@@ -4,7 +4,7 @@
 
 Make sure the binary file `simon` has been generated in the `bin` directory (see "🚧 Environment Setup" in [README](../README.md) for details)
 
-### Generate Run Scripts
+### Run Scripts Generation
 
 ```bash
 # pwd: kubernetes-scheduler-simulator/experiments
@@ -25,46 +25,68 @@ $ cat experiments/run_scripts/run_scripts_0511.sh | while read i; do printf "%q\
 # "--max-procs=16" where 16 is the degree of PARALLEL suggested above
 # bash run_scripts_0511.sh will run experiments sequentially
 
-# ..|''|| '|. '|' '||''''| '|| ||' ..|''|| '|. '|' |''||''| '||' '||' '||' | |''||''| '||''''| '||''|.
-# .|' || |'| | || . ||| ||| .|' || |'| | || || || || ||| || || . || ||
-# || || | '|. | ||''| |'|..'|| || || | '|. | || ||''''|| || | || || ||''| ||''|'
-# '|. || | ||| || | '|' || '|. || | ||| || || || || .''''|. || || || |.
-# ''|...|' .|. '| .||.....| .|. | .||. ''|...|' .|. '| .||. .||. .||. .||.....| .|. .||. .||. .||.....| .||. '|'
+# ..|''|| '|. '|' '||''''| '|| ||' ..|''|| '|. '|' |''||''| '||' '||' '||' | |''||''| '||''''| '||''|.
+# .|' || |'| | || . ||| ||| .|' || |'| | || || || || ||| || || . || ||
+# || || | '|. | ||''| |'|..'|| || || | '|. | || ||''''|| || | || || ||''| ||''|'
+# '|. || | ||| || | '|' || '|. || | ||| || || || || .''''|. || || || |.
+# ''|...|' .|. '| .||.....| .|. | .||. ''|...|' .|. '| .||. .||. .||. .||.....| .|. .||. .||. .||.....| .||. '|'
 ```
 
-Roughly, it takes around
-- 10 minutes for 1 experiment on 2 vCPU.
-- 10 hours for 1020 experiments on a 256 vCPU machine with pool size of 128 threads.
+To explain the generated bash script (e.g., `run_scripts_0511.sh`):
+- Each experiment is conducted via [scripts/generate_config_and_run.py](../scripts/generate_config_and_run.py)
+- Firstly, the script generates two configuration yaml files in that folder, which serve as input to `bin/simon apply` (i.e., cluster-config and scheduler-config; see "🔥 Quickstart Example" in the repo [README](../README.md))
+- Then, it executes the `bin/simon apply` command (enabled by passing the `-e` parameter to the script)
+- `bin/simon`, written in Golang, will schedule the tasks and produce a scheduling log file in the corresponding folder
+- Afterwards, [scripts/analysis.py](../scripts/analysis.py) is executed to parse the logs and yield multiple `analysis_*` files in the folder
 
-### Merge
+As a reference, it takes around
+- 10 minutes for 1 experiment on 2 vCPUs, with 9.4MB of disk space for logs
+- 10 hours for 1020 experiments on a 256-vCPU machine with a pool size of 128 threads, with 9.4GB of disk space for logs
+
+### Analysis & Merge
+
+As each experiment has its own folder where the `analysis_*` files are produced, we traverse all these folders and merge the analysis files of the same category into one file under the `analysis/analysis_results` folder.
+
+The top-level experiment folder varies with DATE (e.g., `2023_0511`), while `analysis_merge.sh` is hard-coded to traverse and merge the folders under `data`. Therefore, we need to softlink the folder to be merged as `data` (e.g., `$ ln -s 2023_0511 data`) before executing `$ bash analysis_merge.sh`.
 
 ```bash
 # pwd: kubernetes-scheduler-simulator/experiments
 $ ln -s 2023_0511 data # softlink it to data
 # pwd: kubernetes-scheduler-simulator/experiments/analysis
 $ cd analysis
-$ bash bash_merge.sh
-# The output was generated under "analysis_results"
-# The results of our large-scale experiments are cached in "expected_results" for comparison
+# The output will be placed under the "analysis_results" folder
+$ bash analysis_merge.sh
 ```
 
+Our results of the full 1020 experiments are cached in [analysis/expected_results](./analysis/expected_results/) (9.8MB) for your reference.
+
 ### Plot
 
-Automatically generate figures in the paper based on the analysis results and compare them with the results in the [expected_results](plot/expected_results) directory.
-For example, running `python plot_paib_alloc.py` will generate `paib_alloc.pdf` figure, which corresponds to Fig. 9(a) in the paper.
+To reproduce the figures shown in the paper, we provide the plotting scripts in the [plot](./plot/) folder. As these scripts assume the existence of the merged `analysis_*.csv` files, we need to first copy (or softlink) the files to the [plot](./plot/) folder:
 
 ```bash
 # pwd: kubernetes-scheduler-simulator/experiments/analysis
 $ cd ..
 # pwd: kubernetes-scheduler-simulator/experiments
 $ cp analysis/analysis_results/* plot/ # copy all csv under analysis_results/ to plot/ for analysis
-# cp analysis/expected_results/* plot/ # if skipping the experiments and directly reuse our expected results for plotting
+```
+
+If you have skipped the experiments and/or would like to use our expected analysis results for plotting, replace the last command with:
+```bash
+$ cp analysis/expected_results/* plot/
+```
+
+As the final step, enter the [plot](./plot/) folder and generate the figures (in PDF format) based on the analysis results. For example, running `python plot_openb_alloc.py` will produce `openb_alloc.pdf` in the current directory, which corresponds to Fig. 9(a) in the paper.
+
+```bash
 $ cd plot
-$ python plot_paib_alloc.py # Fig. 9(a)
-$ python plot_paib_frag_amount.py # Fig. 7(a)
-$ python plot_paib_frag_ratio.py # Fig. 7(b)
-$ python plot_paib_gpushare_alloc_bar.py # Fig. 11
-$ python plot_paib_multigpu_alloc_bar.py # Fig. 12
-$ python plot_paib_gpuspec_alloc_bar.py # Fig. 13
-$ python plot_paib_nongpu_alloc_bar.py # Fig. 14
-```
+$ python plot_openb_alloc.py # Fig. 9(a)
+$ python plot_openb_frag_amount.py # Fig. 7(a)
+$ python plot_openb_frag_ratio.py # Fig. 7(b)
+$ python plot_openb_gpushare_alloc_bar.py # Fig. 11
+$ python plot_openb_multigpu_alloc_bar.py # Fig. 12
+$ python plot_openb_gpuspec_alloc_bar.py # Fig. 13
+$ python plot_openb_nongpu_alloc_bar.py # Fig. 14
+```
+
+Our results shown in the paper are cached in [plot/expected_results](plot/expected_results) (164KB) for your reference.
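
The merge step added above is carried out by `analysis_merge.sh`. To make the traversal-and-merge idea concrete, here is a minimal Python sketch of the same pattern; it is an illustration only, not the repository's script, and it assumes each experiment folder under `data/` holds `analysis_*.csv` files that share a header, so that merging a category reduces to concatenation under a single header.

```python
#!/usr/bin/env python3
# Illustrative sketch only -- analysis_merge.sh is the actual implementation.
# Assumption: each experiment folder under data/ contains analysis_<category>.csv
# files that share a header, so merging a category means concatenating rows.
import csv
import glob
import os
from collections import defaultdict

def merge_analysis(data_dir="data", out_dir="analysis/analysis_results"):
    os.makedirs(out_dir, exist_ok=True)
    groups = defaultdict(list)
    # Traverse every experiment folder and group analysis_* files by file name.
    pattern = os.path.join(data_dir, "**", "analysis_*.csv")
    for path in glob.glob(pattern, recursive=True):
        groups[os.path.basename(path)].append(path)
    for name, paths in sorted(groups.items()):
        with open(os.path.join(out_dir, name), "w", newline="") as out:
            writer = csv.writer(out)
            header_written = False
            for p in sorted(paths):
                with open(p, newline="") as f:
                    reader = csv.reader(f)
                    header = next(reader, None)
                    if header is None:
                        continue  # skip empty files
                    if not header_written:
                        writer.writerow(header)  # keep a single header per merged file
                        header_written = True
                    writer.writerows(reader)

if __name__ == "__main__":
    merge_analysis()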

experiments/run_scripts/expected_run_scripts_0511.sh

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-#!/usr/bin/bash
+#!/bin/bash
 # cat run_scripts_0511.sh | while read i; do printf "%q\n" "$i"; done | xargs --max-procs=16 -I CMD bash -c CMD
 
 # 01, Random, random, <none>, <none> @ openb_pod_list_default
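
The comment retained in this script documents the parallel fan-out: every non-comment line of `run_scripts_0511.sh` is a self-contained shell command, and `xargs --max-procs=16` runs up to 16 of them concurrently. The Python sketch below illustrates the same fan-out with a thread pool and `subprocess`; it is an alternative illustration under the same 16-worker assumption, not the tooling the repository uses.

```python
#!/usr/bin/env python3
# Illustrative alternative to the xargs pipeline quoted in the comment above;
# the repository drives parallelism with xargs, not with this script.
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def run_line(cmd: str) -> int:
    # Each non-comment line of run_scripts_0511.sh is a self-contained shell command.
    return subprocess.run(cmd, shell=True).returncode

def main(script: str = "run_scripts_0511.sh", max_procs: int = 16) -> int:
    with open(script) as f:
        cmds = [line.strip() for line in f
                if line.strip() and not line.lstrip().startswith("#")]
    # Threads suffice here: the real work runs in the child processes.
    with ThreadPoolExecutor(max_workers=max_procs) as pool:
        return_codes = list(pool.map(run_line, cmds))
    return 1 if any(return_codes) else 0

if __name__ == "__main__":
    sys.exit(main(*sys.argv[1:2]))
```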

experiments/run_scripts/generate_run_scripts.py

Lines changed: 19 additions & 20 deletions
@@ -1,31 +1,29 @@
-#
-# 2023_0509
+#
 # Usage: python3 generate_run_scripts.py > run_scripts.sh
 
-PARALLEL=128
-NUM_REPEAT=10
 
-Date = "2023_0511"
-Remark = "Artifacts"
-FileList = [
-    #: Fig.7, Fig.9
+DATE = "2023_0511"  # Used as the folder name under experiments/ to hold all log results. To avoid collision of repeated experiments, may change the date or append _v1, _v2, etc.
+REMARK = "Artifacts"
+REPEAT = 10  # Number of repeated experiments.
+FILELIST = [
+    #: Main results in Fig. 7 and 9
     "data/openb_pod_list_default",
-    #: Fig.
+    #: Fig. 14 Various proportion of non-GPU tasks
     "data/openb_pod_list_cpu050",
     "data/openb_pod_list_cpu100",
     "data/openb_pod_list_cpu200",
     "data/openb_pod_list_cpu250",
-    #:
+    #: Fig. 11 Various proportion of GPU-sharing tasks
     "data/openb_pod_list_gpushare100",
     "data/openb_pod_list_gpushare40",
     "data/openb_pod_list_gpushare60",
     "data/openb_pod_list_gpushare80",
-    #:
+    #: Fig. 13 Various proportion of tasks with GPU-type constraints
     "data/openb_pod_list_gpuspec10",
    "data/openb_pod_list_gpuspec20",
     "data/openb_pod_list_gpuspec25",
     "data/openb_pod_list_gpuspec33",
-    #:
+    #: Fig. 12 Various proportion of multi-GPU tasks
     "data/openb_pod_list_multigpu20",
     "data/openb_pod_list_multigpu30",
     "data/openb_pod_list_multigpu40",
@@ -86,25 +84,25 @@ def get_dir_name_from_policy_id_list(id_list):
 ###########################################################
 ###########################################################
 
-def generate_run_scripts(asyncc=True, parallel=PARALLEL):
-    DateAndRemark = Date + "-" + Remark.replace(' ', "_").replace('(',"_").replace(')',"_")
+def generate_run_scripts(asyncc=True, parallel=16):
+    DateAndRemark = DATE + "-" + REMARK.replace(' ', "_").replace('(',"_").replace(')',"_")
     numJobs=0
     if asyncc:
-        print('#!/usr/bin/bash\n# screen -dmS sim-%s bash -c "bash run_scripts_%s.sh"\n' % (DateAndRemark, Date[-4:]))
+        print('#!/bin/bash\n# screen -dmS sim-%s bash -c "bash run_scripts_%s.sh"\n' % (DateAndRemark, DATE[-4:]))
     else:
-        print('#!/usr/bin/bash\n# cat run_scripts_%s.sh | while read i; do printf "%%q\\n" "$i"; done | xargs --max-procs=16 -I CMD bash -c CMD\n' % (Date[-4:]))
+        print('#!/bin/bash\n# cat run_scripts_%s.sh | while read i; do printf "%%q\\n" "$i"; done | xargs --max-procs=16 -I CMD bash -c CMD\n' % (DATE[-4:]))
     for tune_ratio in [1.3]:
-        tune_seed_end = 42 + NUM_REPEAT if NUM_REPEAT >= 1 else 43
+        tune_seed_end = 42 + REPEAT if REPEAT >= 1 else 43
         for tune_seed in range(42, tune_seed_end, 1):
-            for file in FileList:
+            for file in FILELIST:
                 filename = file.split('/')[-1]
                 for id, policy, gsm, dem, nm in MethodList: # GpuSelMethod, DimExtMethod, NormMethod
                     dir_name = get_dir_name_from_method([id, policy, gsm, dem, nm])
                     gsm = policy if gsm == "<self>" else gsm
                     OUTPUT_YAML = False
                     SHUFFLE_POD = True
                     outstr = "# %s, %s, %s, %s, %s @ %s\n" % (id, policy, gsm, dem, nm, filename)
-                    outstr += 'EXPDIR="experiments/%s/%s/%s/%s/%s' % (Date, filename, dir_name, tune_ratio, tune_seed)
+                    outstr += 'EXPDIR="experiments/%s/%s/%s/%s/%s' % (DATE, filename, dir_name, tune_ratio, tune_seed)
                     outstr += '" && mkdir -p ${EXPDIR} && touch "${EXPDIR}/terminal.out" && '
                     outstr += 'python3 scripts/generate_config_and_run.py -d "${EXPDIR}" '
                     outstr += '-e -b '
@@ -132,7 +130,8 @@ def generate_run_scripts(asyncc=True, parallel=PARALLEL):
     print("wait && date")
 
 if __name__=='__main__':
-    # generate_run_scripts()
+    # generate_run_scripts(asyncc=True)
+    #: $ bash run_scripts.txt
     generate_run_scripts(asyncc=False)
     #: $ cat run_scripts.txt | while read i; do printf "%q\n" "$i"; done | xargs --max-procs=16 -I CMD bash -c CMD
 
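
The loops in the second hunk above compose one shell command per combination of trace file, scheduling policy, tuning ratio, and seed. The sketch below condenses that composition using the format strings visible in the diff; `dir_name` and the omitted policy-specific flags are placeholders, since `MethodList` and `get_dir_name_from_method` are outside this diff.

```python
# Condensed sketch of how generate_run_scripts.py builds one line of the run
# script, based on the format strings visible in the diff above. dir_name and
# the trailing policy-specific flags are placeholders: MethodList and
# get_dir_name_from_method are not part of this diff.
DATE = "2023_0511"

def compose_one_command(filename="openb_pod_list_default",
                        dir_name="01_Random", tune_ratio=1.3, tune_seed=42):
    expdir = "experiments/%s/%s/%s/%s/%s" % (DATE, filename, dir_name, tune_ratio, tune_seed)
    cmd = 'EXPDIR="%s"' % expdir
    cmd += ' && mkdir -p ${EXPDIR} && touch "${EXPDIR}/terminal.out" && '
    cmd += 'python3 scripts/generate_config_and_run.py -d "${EXPDIR}" '
    cmd += '-e -b '  # -e triggers the bin/simon apply run (per the README walkthrough)
    # ...policy-specific flags omitted; see the full script for the rest...
    return cmd

if __name__ == "__main__":
    print(compose_one_command())
```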