# web-llm-ui

[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/adamelliotfields/web-llm-ui?devcontainer_path=.devcontainer/devcontainer.json&machine=basicLinux32gb)

https://github.com/adamelliotfields/web-llm-ui/assets/7433025/07565763-606b-4de3-aa2d-8d5a26c83941

A React app I made to experiment with [quantized](https://huggingface.co/docs/transformers/en/quantization/overview) models in the browser using [WebGPU](https://webgpu.org). The models are compiled to WebAssembly using [MLC](https://github.com/mlc-ai/mlc-llm), which is like [llama.cpp](https://github.com/ggml-org/llama.cpp) for the web.

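To give a sense of what running one of these models looks like, here's a minimal sketch using the `CreateMLCEngine` API from [`@mlc-ai/web-llm`](https://www.npmjs.com/package/@mlc-ai/web-llm). That API is newer than this app and the model ID is illustrative, so treat it as a sketch rather than what this repo ships:

```js
import { CreateMLCEngine } from '@mlc-ai/web-llm'

// downloads the quantized weights and compiled WebAssembly on first run,
// then serves them from the browser cache afterwards
const engine = await CreateMLCEngine('Llama-3.1-8B-Instruct-q4f32_1-MLC', {
  initProgressCallback: (report) => console.log(report.text)
})

// OpenAI-style chat completion, running entirely in the browser
const reply = await engine.chat.completions.create({
  messages: [{ role: 'user', content: 'Why is the sky blue?' }]
})
console.log(reply.choices[0].message.content)
```
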
I'm not going to update this, but the official app at [chat.webllm.ai](https://chat.webllm.ai) is actively maintained. Use that or one of [Xenova](https://huggingface.co/Xenova)'s WebGPU [spaces](https://huggingface.co/collections/Xenova/transformersjs-demos-64f9c4f49c099d93dbc611df) instead.

## Usage

```sh
bun install
bun start
```

## Known issues

Use the `q4f32` quantized models, as `q4f16` requires a browser flag. Go to [webgpureport.org](https://webgpureport.org) to inspect your system's WebGPU capabilities.

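You can also check for f16 shader support directly in the devtools console using the standard WebGPU API:

```js
// 'shader-f16' is the adapter feature the q4f16 models need
const adapter = await navigator.gpu?.requestAdapter()
console.log(adapter ? adapter.features.has('shader-f16') : 'WebGPU unavailable')
```
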
### Cannot find global function

If you see this message, it's usually a cache issue. You can delete an individual cache with:

```js
await caches.delete('webllm/wasm')
```

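Or wipe everything at once (assuming all of the app's caches are prefixed with `webllm`, like the one above):

```js
// delete every cache whose name starts with 'webllm'
for (const key of await caches.keys()) {
  if (key.startsWith('webllm')) await caches.delete(key)
}
```
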
## VRAM requirements

See [utils/vram_requirements](https://github.com/mlc-ai/web-llm/tree/main/utils/vram_requirements) in the Web LLM repo.