From 71479889ecb275886df30c6c570c3bcd6c346cb5 Mon Sep 17 00:00:00 2001
From: R
Date: Sat, 9 Nov 2024 01:56:31 +0000
Subject: [PATCH 1/2] Update README.md

Add tl2 to the quant-type optional argument in the setup_env.py instructions

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 4bbfc6aa..43043af3 100644
--- a/README.md
+++ b/README.md
@@ -150,7 +150,7 @@ optional arguments:
                         Directory to save/load the model
   --log-dir LOG_DIR, -ld LOG_DIR
                         Directory to save the logging info
-  --quant-type {i2_s,tl1}, -q {i2_s,tl1}
+  --quant-type {i2_s,tl1,tl2}, -q {i2_s,tl1,tl2}
                         Quantization type
   --quant-embd          Quantize the embeddings to f16
   --use-pretuned, -p    Use the pretuned kernel parameters

From 29109f35fde45da94b7153141aa954b798b74613 Mon Sep 17 00:00:00 2001
From: R
Date: Sat, 9 Nov 2024 18:42:53 +0000
Subject: [PATCH 2/2] Update README.md

Add tl2 to readme
Instruct to use pretuned kernels by default

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 43043af3..621bc9fd 100644
--- a/README.md
+++ b/README.md
@@ -130,14 +130,14 @@ pip install -r requirements.txt
 3. Build the project
 ```bash
 # Download the model from Hugging Face, convert it to quantized gguf format, and build the project
-python setup_env.py --hf-repo HF1BitLLM/Llama3-8B-1.58-100B-tokens -q i2_s
+python setup_env.py --hf-repo HF1BitLLM/Llama3-8B-1.58-100B-tokens -q i2_s -p
 # Or you can manually download the model and run with local path
 huggingface-cli download HF1BitLLM/Llama3-8B-1.58-100B-tokens --local-dir models/Llama3-8B-1.58-100B-tokens
-python setup_env.py -md models/Llama3-8B-1.58-100B-tokens -q i2_s
+python setup_env.py -md models/Llama3-8B-1.58-100B-tokens -q i2_s -p
 ```
-usage: setup_env.py [-h] [--hf-repo {1bitLLM/bitnet_b1_58-large,1bitLLM/bitnet_b1_58-3B,HF1BitLLM/Llama3-8B-1.58-100B-tokens}] [--model-dir MODEL_DIR] [--log-dir LOG_DIR] [--quant-type {i2_s,tl1}] [--quant-embd]
+usage: setup_env.py [-h] [--hf-repo {1bitLLM/bitnet_b1_58-large,1bitLLM/bitnet_b1_58-3B,HF1BitLLM/Llama3-8B-1.58-100B-tokens}] [--model-dir MODEL_DIR] [--log-dir LOG_DIR] [--quant-type {i2_s,tl1,tl2}] [--quant-embd]
                     [--use-pretuned]
 
 Setup the environment for running inference