Add Qwen3 0.6B, 1.7B, and 4B #10539
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10539
Note: Links to docs will display an error until the docs builds have been completed.
⏳ No Failures, 21 Pending as of commit 626a0f0 with merge base 32dffbc.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Is there a copy-paste issue, or is the response to the prompt somehow wrong? It looks like the prompt is about vibe coding, but the response is about a US president.
Please check with @madhu-fb about the qk_norm changes before landing.
Yeah, same question; it seems like you might not have copied over the full response?
Please add a README.md for the export flow. Also, I don't see the config JSONs for 1.7B and 4B.
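For reference, a params JSON for one of these variants might look roughly like the sketch below. This is purely illustrative: the key names follow the style of the existing llama params files in this repo, and every value here is an assumption that should be checked against the upstream Hugging Face config rather than taken from this comment.

```json
{
  "dim": 1024,
  "hidden_dim": 3072,
  "n_layers": 28,
  "n_heads": 16,
  "n_kv_heads": 8,
  "head_dim": 128,
  "norm_eps": 1e-06,
  "rope_theta": 1000000.0,
  "vocab_size": 151936,
  "use_qk_norm": true
}
```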
Force-pushed from 8412c37 to 8aac481
Nice job! Add this to the top-level README: https://github.com/pytorch/executorch/blob/main/README.md?plain=1#L54 The perf benchmarks on desktop are not really representative. As next steps, it would be good to have instructions on running on iOS and Android phones, and to show those benchmarks. We can publicize the actual screencast on mobile phones more.
@jackzhxng has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@@ -243,14 +244,18 @@ def forward(
        k = k.view(bsz, seqlen, self.n_local_kv_heads, self.head_dim)
        v = v.view(bsz, seqlen, self.n_local_kv_heads, self.head_dim)

        if self.use_qk_norm and self.qk_norm_before_rope:
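For context, the norm-before-RoPE ordering this branch selects can be sketched in isolation as below. This is a toy, standalone illustration, not the actual ExecuTorch attention code: the RMS-norm has no learned weights and the rotary embedding is a simplified stand-in.

```python
import math

def rms_norm(vec, eps=1e-6):
    # RMSNorm over the head dimension: x / sqrt(mean(x^2) + eps).
    scale = math.sqrt(sum(x * x for x in vec) / len(vec) + eps)
    return [x / scale for x in vec]

def apply_rope(vec, pos, theta=10000.0):
    # Toy rotary embedding: rotate consecutive pairs by position-dependent angles.
    out = list(vec)
    for i in range(0, len(vec) - 1, 2):
        angle = pos / (theta ** (i / len(vec)))
        c, s = math.cos(angle), math.sin(angle)
        out[i] = vec[i] * c - vec[i + 1] * s
        out[i + 1] = vec[i] * s + vec[i + 1] * c
    return out

def project_qk(q, k, pos, use_qk_norm=True, qk_norm_before_rope=True):
    # Mirrors the branch in the diff: normalize q/k either before or after RoPE.
    if use_qk_norm and qk_norm_before_rope:
        q, k = rms_norm(q), rms_norm(k)              # Qwen3-style: norm first...
    q, k = apply_rope(q, pos), apply_rope(k, pos)    # ...then rotate
    if use_qk_norm and not qk_norm_before_rope:
        q, k = rms_norm(q), rms_norm(k)              # other path: norm after RoPE
    return q, k
```

Note that since the rotation here preserves vector length, both orderings leave q and k at unit RMS in this toy version; in the real model, learned norm weights make the ordering observable.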
It might've been cleaner to make it a qk_norm_mode that can be one of None, BEFORE_ROPE, or AFTER_ROPE. I don't know if we have backward-compatibility constraints preventing that.
Yeah, I agree, this would be cleaner. @jackzhxng let's follow up with this change.
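The qk_norm_mode idea could look something like this. A sketch only, not the actual follow-up change; the enum and helper names are assumptions.

```python
from enum import Enum

class QkNormMode(Enum):
    # Single field replacing the use_qk_norm / qk_norm_before_rope flag pair.
    NONE = "none"
    BEFORE_ROPE = "before_rope"
    AFTER_ROPE = "after_rope"

def qk_norm_mode_from_flags(use_qk_norm: bool, qk_norm_before_rope: bool) -> QkNormMode:
    # Backward-compatible mapping from the existing two boolean flags.
    if not use_qk_norm:
        return QkNormMode.NONE
    return QkNormMode.BEFORE_ROPE if qk_norm_before_rope else QkNormMode.AFTER_ROPE
```

Keeping a mapping like this around would let old configs with the two booleans continue to load while new configs use the single mode field.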
Add ExecuTorch support for Qwen3 0.6B, 1.7B, and 4B

Qwen3 0.6B
- Export with XNNPACK + 8da4w quantization
- Run with pybindings

Qwen3 1.7B
- Export with XNNPACK + 8da4w quantization
- Run with pybindings

Qwen3 4B
- Export with XNNPACK + 8da4w quantization
- Run with pybindings
bypass-github-export-checks