Commit 6328b98

Update GPULlama3_ROADMAP.md
1 parent 8380b28 commit 6328b98

1 file changed

docs/GPULlama3_ROADMAP.md: 8 additions & 8 deletions
```diff
@@ -1,23 +1,23 @@
 ### 🚧 Work-in-progress Features
 
+- [ ] **Additional quantization formats**
+  - [ ] Q8
+  - [ ] Q4
+  - [ ] INT8 native support for GPUs
 - [ ] **Additional architectures and model format**
   - [ ] Mistral/Mixtral models
+  - [ ] Qwen
   - [ ] Gemma/Gemma2 models
-  - [ ] Phi models
-  - [ ] SmolLM
-  - [ ] TinyLlama
+  - [ ] TinyLlamas
   - [ ] SafeTensors format
   - [ ] PyTorch checkpoint loading
   - [ ] Automatic model conversion utilities
-- [ ] **Additional quantization formats**
-  - [ ] INT8
-  - [ ] FP16 support
 - [ ] **Advanced inference capabilities**
   - [ ] Batch inference support
   - [ ] Speculative decoding
 - [ ] **Performance optimizations**
   - [ ] Multi-GPU support
-  - [ ] Memory-efficient attention mechanisms
-  - [ ] Kernel fusion improvements
+  - [X] Memory-efficient attention mechanisms
+  - [ ] More Kernel fusion improvements
 - [ ] **LangChain4j integration**
 - [ ] **GraalVM Native Image**
```
