Open
Labels
bug: Something isn't working
Description
Something is mildly different between our Qwen3 implementation and the vLLM Qwen3 implementation. The difference isn't large enough for models transferred between the two to degenerate completely, but it's enough to make the model significantly dumber.
Plan of Attack
- Set up a full Qwen3 round-trip test
- Remove any source of numeric differences between Levanter and HF Qwen3
- Once the difference is identified, we'll need to submit a PR to vLLM to support whatever our architecture difference is, so that people can run inference on Marin 32B with it.
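
As a rough illustration of the second step, one way to narrow down a numeric difference is to capture intermediate activations from both implementations on the same input and find the first layer where they disagree. The sketch below is standard-library Python only. The layer names, the toy activation values, the tolerance, and the helper `first_divergent_layer` are all hypothetical; nothing here is Levanter, HF, or vLLM API, and real activations would have to be captured and flattened separately.

```python
from typing import Dict, Optional, Sequence, Tuple


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest elementwise absolute difference between two flat activations."""
    if len(a) != len(b):
        raise ValueError(f"shape mismatch: {len(a)} vs {len(b)}")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def first_divergent_layer(
    reference: Dict[str, Sequence[float]],
    candidate: Dict[str, Sequence[float]],
    layer_order: Sequence[str],
    atol: float = 1e-3,
) -> Optional[Tuple[str, float]]:
    """Walk layers in forward order and return the first one whose
    activations differ by more than `atol`, or None if all match."""
    for name in layer_order:
        diff = max_abs_diff(reference[name], candidate[name])
        if diff > atol:
            return name, diff
    return None


# Toy stand-ins for activations captured from two implementations
# (hypothetical layer names and values, for illustration only).
layer_order = ["embed", "layer0.attn", "layer0.mlp"]
reference_acts = {
    "embed": [0.10, -0.20],
    "layer0.attn": [0.50, 0.25],
    "layer0.mlp": [1.00, -1.00],
}
candidate_acts = {
    "embed": [0.10, -0.20],
    "layer0.attn": [0.50, 0.30],  # first mismatch is introduced here
    "layer0.mlp": [1.10, -1.00],  # downstream layers then drift as well
}

result = first_divergent_layer(reference_acts, candidate_acts, layer_order)
print(result)
```

Reporting only the first divergent layer is deliberate: once one layer disagrees, every later layer will usually disagree too, so the earliest mismatch is the most useful place to start looking.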