Releases: turboderp-org/exllamav3
0.0.5
- Add filter interface
- Add Formatron support (JSON/KBNF grammar etc.)
- Support Ernie 4.5 architecture
- Support SmolLM3 architecture
- Support Exaone4 architecture
- Improve Cohere2 support (fixes support for Command-A)
- Fix compatibility with certain Pixtral preprocessors
- Fix excessive virtual memory usage with large generator queues on Windows
- Add facilities for manually mixing/optimizing quants
- Add MMLU eval script
- Various other fixes, improvements and optimizations
Full Changelog: v0.0.4...v0.0.5
0.0.4
- Vision support
- Support MiMoForCausalLM (no MTP)
- Support Gemma3ForConditionalGeneration
- Support Gemma3ForCausalLM
- Support Mistral3ForConditionalGeneration (Mistral 3.1)
- Support Dots1ForCausalLM (dots.llm1)
- Add Transformers integration
- Faster tokenizer initialization
- Various fixes
Full Changelog: v0.0.3...v0.0.4
0.0.3
- Support Mixtral architecture
- Support Gemma3 architecture (text only)
- Fix support for Command-R+
- Return probs and top probs from generator
- Defrag for generator
- Many bugfixes
- Many performance improvements
- A bunch of new eval and test functionality
Full Changelog: v0.0.2...v0.0.3
0.0.2
- Add Qwen3MoE support
- Various bugfixes and optimizations
Full Changelog: v0.0.1...v0.0.2
0.0.1
Actions: Add build action
Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>