### 🚧 Work-in-progress Features

- [ ] **Additional quantization formats**
  - [ ] Q8
  - [ ] Q4
  - [ ] INT8 native support for GPUs
- [ ] **Additional architectures and model formats**
  - [ ] Mistral/Mixtral models
  - [ ] Qwen
  - [ ] Gemma/Gemma2 models
  - [ ] TinyLlamas
  - [ ] SafeTensors format
  - [ ] PyTorch checkpoint loading
  - [ ] Automatic model conversion utilities
- [ ] **Advanced inference capabilities**
  - [ ] Batch inference support
  - [ ] Speculative decoding
- [ ] **Performance optimizations**
  - [ ] Multi-GPU support
  - [x] Memory-efficient attention mechanisms
  - [ ] More kernel fusion improvements
- [ ] **LangChain4j integration**
- [ ] **GraalVM Native Image**
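For context on the quantization items above, here is a minimal sketch of symmetric 8-bit (Q8-style) quantization with a single per-block scale. All names (`Q8Sketch`, `quantize`, `dequantize`) are hypothetical illustrations, not APIs from this project:

```java
// Illustrative symmetric Q8 quantization: map floats to int8 with one
// per-block scale, so each value costs 1 byte instead of 4.
public final class Q8Sketch {

    // Quantize a block of floats; the single scale is written to scaleOut[0].
    static byte[] quantize(float[] x, float[] scaleOut) {
        float max = 0f;
        for (float v : x) max = Math.max(max, Math.abs(v));
        float scale = max / 127f;           // map [-max, max] onto [-127, 127]
        scaleOut[0] = scale;
        byte[] q = new byte[x.length];
        for (int i = 0; i < x.length; i++) {
            q[i] = (byte) Math.round(x[i] / (scale == 0f ? 1f : scale));
        }
        return q;
    }

    // Recover approximate floats: error is bounded by half a quantization step.
    static float[] dequantize(byte[] q, float scale) {
        float[] x = new float[q.length];
        for (int i = 0; i < q.length; i++) x[i] = q[i] * scale;
        return x;
    }

    public static void main(String[] args) {
        float[] w = {0.5f, -1.0f, 0.25f, 0.9f};
        float[] s = new float[1];
        byte[] q = quantize(w, s);
        float[] back = dequantize(q, s[0]);
        for (int i = 0; i < w.length; i++) {
            assert Math.abs(w[i] - back[i]) <= s[0] / 2f + 1e-6f;
        }
    }
}
```

Real formats typically quantize in fixed-size blocks (e.g. 32 values per scale) to keep the error local; the per-block scale is what the INT8 GPU kernels would consume directly.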