issues/25-04-16-ainews-qwq-32b-claims-to-match-deepseek-r1-671b #96
Replies: 2 comments 1 reply
- The date of this item should be 2025-03-06.
  - This might help...
Alibaba Qwen released QwQ-32B, a 32-billion-parameter reasoning model trained with a novel two-stage reinforcement learning approach: first scaling RL on math and coding tasks using accuracy verifiers and code-execution servers, then applying RL for general capabilities such as instruction following and alignment. The model aims to compete with much larger MoE models like DeepSeek-R1. Meanwhile, OpenAI rolled out GPT-4.5 to Plus users to mixed feedback: "GPT-4.5 is unusable for coding" was a notable user critique, while others praised its reasoning gains from scaled-up pretraining and noted improved inference costs.
https://news.smol.ai/issues/25-04-16-ainews-qwq-32b-claims-to-match-deepseek-r1-671b
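
The two-stage recipe described above lends itself to a compact sketch. The snippet below is an illustrative outline only, under assumed names (`math_verifier`, `code_verifier`, `rl_step`, `reward_model` are all hypothetical); it is not Qwen's actual training code, and the policy update is left as a placeholder for whatever RL algorithm is used in practice.

```python
# Hypothetical sketch of a two-stage RL loop: stage 1 uses outcome verifiers
# (exact-answer checks for math, test execution for code), stage 2 switches
# to a general reward model for instruction following and alignment.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sample:
    prompt: str
    completion: str


def math_verifier(sample: Sample, reference_answer: str) -> float:
    # Reward 1.0 only if the completion ends with the reference answer.
    return 1.0 if sample.completion.strip().endswith(reference_answer) else 0.0


def code_verifier(sample: Sample, tests: Callable[[str], bool]) -> float:
    # Reward 1.0 only if the generated code passes the supplied tests;
    # a real setup would run these in a sandboxed code-execution server.
    return 1.0 if tests(sample.completion) else 0.0


def rl_step(policy, samples: List[Sample], rewards: List[float]) -> None:
    # Placeholder for the policy-gradient update (e.g. PPO/GRPO in practice).
    ...


def train_two_stage(policy, math_data, code_data, general_data, reward_model):
    # Stage 1: RL on verifiable domains only (math and coding).
    for prompt, answer in math_data:
        s = Sample(prompt, policy(prompt))
        rl_step(policy, [s], [math_verifier(s, answer)])
    for prompt, tests in code_data:
        s = Sample(prompt, policy(prompt))
        rl_step(policy, [s], [code_verifier(s, tests)])
    # Stage 2: RL for general capabilities, scored by a reward model.
    for prompt in general_data:
        s = Sample(prompt, policy(prompt))
        rl_step(policy, [s], [reward_model(s)])
```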