The output WASM file is `target/wasm32-wasi/release/llama-chat.wasm`. Next, use WasmEdge to load the llama-2-7b model and then ask the model questions.
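A minimal run command is sketched below. It assumes the 7b chat model has been downloaded as `llama-2-7b-chat.Q5_K_M.gguf` into the current directory; the GGUF file name and the `default` model alias passed to the program are illustrative and may differ in your setup.

```bash
# Preload the GGUF model under the alias `default`, then run the chat program.
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf \
  llama-chat.wasm default
```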
After executing the command, you may need to wait a moment for the input prompt to appear. You can enter your question once you see the `[USER]:` prompt.
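For example, a session could look like this (the question is purely illustrative; the model's reply is printed before the next `[USER]:` prompt appears):

```
[USER]:
What is the capital of France?
```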
You can make the inference program run faster by AOT compiling the wasm file first.
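A sketch of the AOT step is below, using the `wasmedgec` ahead-of-time compiler (newer WasmEdge releases expose the same feature as `wasmedge compile`); the output file name is arbitrary. Afterwards, pass the AOT-compiled file to `wasmedge` in place of the original wasm file.

```bash
# Ahead-of-time compile the wasm module to native code for faster execution.
wasmedgec llama-chat.wasm llama-chat-aot.wasm
```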
Next, execute the model inference.

```rust
context.compute().expect("Failed to complete inference");
```
After the inference is finished, extract the result from the computation context and convert the output to a string with `String::from_utf8_lossy`, which replaces any invalid UTF-8 sequences instead of failing.
```rust
let mut output_buffer = vec![0u8; *CTX_SIZE.get().unwrap()];
// `get_output` copies the model output into the buffer and returns the number of bytes written.
let output_size = context.get_output(0, &mut output_buffer).expect("Failed to get output");
// Convert to a string; invalid UTF-8 sequences are replaced rather than causing an error.
let output = String::from_utf8_lossy(&output_buffer[..output_size]).to_string();
```