1 parent b00cd07 commit e2bd9b1
README.md
````diff
@@ -85,9 +85,9 @@ $$
 hidden = rms_norm(residual)
 gate = hidden @ gate_weight.T
 up = hidden @ up_weight.T
-hidden = gate * sigmoid(gate) * up ## silu
-hidden = hidden @ down_weight.T
-residual = hidden + residual
+intermediate = gate * sigmoid(gate) * up ## silu
+output = intermediate @ down_weight.T
+residual = output + residual
 ```
````
If you have implemented the earlier operators correctly, this function should be fairly straightforward. Note that the previous layer's output is stored in the temporary tensor residual; this is the residual-connection concept introduced earlier. The final output of the network must likewise be added to the previous layer's residual and stored back into residual, ready for the next layer's computation. hidden_states holds the intermediate results produced along the way. You can use the test cases in `src/model.rs` to verify that your implementation is correct.
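The steps in the diff above can be sketched end to end in NumPy. This is a minimal illustration, not the project's Rust implementation: the weight names mirror the pseudocode, while the `rms_norm` signature (a learned per-channel scale and an `eps` of `1e-6`) is an assumption about how the project's operator is defined.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rms_norm(x, weight, eps=1e-6):
    # Normalize each row by its root mean square, then apply a learned scale.
    # The eps value is an assumption; match whatever your rms_norm uses.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def mlp_block(residual, norm_weight, gate_weight, up_weight, down_weight):
    hidden = rms_norm(residual, norm_weight)
    gate = hidden @ gate_weight.T
    up = hidden @ up_weight.T
    # SiLU gating: silu(gate) = gate * sigmoid(gate), multiplied elementwise by up
    intermediate = gate * sigmoid(gate) * up
    output = intermediate @ down_weight.T
    # Residual connection: add the block's output back onto its input
    return output + residual
```

A quick sanity check of the residual connection: if `down_weight` is all zeros, the block's output is exactly its input, since only the residual term survives.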