Releases: turboderp-org/exllamav3

0.0.5

19 Jul 02:05
  • Add filter interface
  • Add Formatron support (JSON/KBNF grammar etc.)
  • Support Ernie 4.5 architecture
  • Support SmolLM3 architecture
  • Support Exaone4 architecture
  • Improve Cohere2 support (fixes support for Command-A)
  • Fix compatibility with certain Pixtral preprocessors
  • Fix excessive virtual memory usage with large generator queues on Windows
  • Add facilities for manually mixing/optimizing quants
  • Add MMLU eval script
  • Various other fixes, improvements and optimizations

Full Changelog: v0.0.4...v0.0.5

0.0.4

15 Jun 05:05
  • Vision support
  • Support MiMoForCausalLM (no MTP)
  • Support Gemma3ForConditionalGeneration
  • Support Gemma3ForCausalLM
  • Support Mistral3ForConditionalGeneration (Mistral 3.1)
  • Support Dots1ForCausalLM (dots.llm1)
  • Add Transformers integration
  • Faster tokenizer initialization
  • Various fixes

Full Changelog: v0.0.3...v0.0.4

0.0.3

31 May 15:56
  • Support Mixtral architecture
  • Support Gemma3 architecture (text only)
  • Fix support for Command-R+
  • Return probs and top probs from generator
  • Defrag for generator
  • Many bugfixes
  • Many performance improvements
  • A bunch of new eval and test functionality

Full Changelog: v0.0.2...v0.0.3

0.0.2

12 May 16:57
a905cff
  • Add Qwen3MoE support
  • Various bugfixes and optimizations

Full Changelog: v0.0.1...v0.0.2

0.0.1

09 May 18:38
Actions: Add build action

Signed-off-by: kingbri <8082010+kingbri1@users.noreply.github.com>