Commit 6940f6a (parent: 98bb2c3)

minor nits

Signed-off-by: Alex Chi <iskyzh@gmail.com>

2 files changed (+2, -4)

book/src/SUMMARY.md

Lines changed: 0 additions & 1 deletion
@@ -12,7 +12,6 @@
 - [RMSNorm and MLP](./week1-04-rmsnorm-and-mlp.md)
 - [The Qwen2 Model]()
 - [Generating the Response]()
-- [Loading the Model]()
 - [Sampling and Preparing for Week 2]()
 <!--
 - [Attention and Multi-Head Attention](./week1-01-attention.md)

book/src/week1-overview.md

Lines changed: 2 additions & 3 deletions
@@ -10,9 +10,8 @@ In this week, we will start from the basic matrix operations and see how those t
 Qwen2 model parameters into a model that generates text. We will implement the neural network layers used in the Qwen2
 model using mlx's matrix APIs.
 
-We will use the Qwen2-7B-Instruct model for this week. As we need to dequantize the model parameters, the 4GB model needs
-20GB of memory in week 1. If you do not have enough memory, you can consider using the smaller 0.5B model (we do not have
-infra to test it so you need to figure out things on your own unfortunately).
+We will use the Qwen2-7B-Instruct model for this week. As we need to dequantize the model parameters, the model of 4GB
+download size needs 20GB of memory in week 1. If you do not have enough memory, you can consider using the smaller 0.5B model.
 
 The MLX version of the Qwen2-7B-Instruct model we downloaded in the setup is an int4 quantized version of the original bfloat16 model.
 
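A quick check of the memory figures in the revised paragraph above, as a minimal sketch. The parameter count (~7.6B for Qwen2-7B-Instruct), the int4 packing (0.5 byte per parameter on disk), and the bfloat16 target width (2 bytes per parameter) are assumptions inferred from the model name and the commit text, not values stated in the diff:

```python
# Back-of-the-envelope arithmetic behind "4GB download, 20GB in memory".
# Assumed (not stated in the commit): ~7.6B parameters, int4 weights on
# disk, bfloat16 (2 bytes per parameter) after dequantization.
PARAMS = 7.6e9

download_gb = PARAMS * 0.5 / 1e9     # packed 4-bit weights: ~3.8 GB
dequantized_gb = PARAMS * 2.0 / 1e9  # bfloat16 copy: ~15.2 GB

print(f"download size ~= {download_gb:.1f} GB")
print(f"dequantized   ~= {dequantized_gb:.1f} GB")
```

Holding the quantized weights and the dequantized bfloat16 copy at the same time, plus activations, plausibly accounts for the ~20 GB working set the overview warns about.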
0 commit comments
