Hi, I ran the test and found that many models can only answer 320 questions correctly. Why is that, and why exactly 320? If I enter 330, everything after the initial "context window" fails. #1
I tested gemini2.0-flash, gemini2.5-flash, gemma3-27b, and qwen3_235b-a22b.
Of these, gemini2.0-flash, gemini2.5-flash, and qwen3_235b-a22b all stop at exactly 320.
This is my prompt:
Here are n five-digit additions in the form of
Qn. xn+yn,
You need to answer in the form of
An. {anwser}
, do not group,
example:
`
A1. 79281
A2. 138779
A3. 139180
...
`
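
For anyone who wants to reproduce this, here is a rough sketch of how the questions could be generated and the replies scored. It is not the repository's actual test code: the function names (`make_questions`, `format_questions`, `count_correct`, `first_failure`), the fixed seed, and the assumption that "five-digit" means operands from 10000 to 99999 are all mine.

```python
import random
import re

# Hypothetical helpers for illustration; not taken from the test in this repo.

def make_questions(n, seed=0):
    """Return n pairs of five-digit operands (assumed range 10000..99999)."""
    rng = random.Random(seed)
    return [(rng.randint(10000, 99999), rng.randint(10000, 99999)) for _ in range(n)]

def format_questions(pairs):
    """Render the pairs as 'Qn. x+y' lines, numbered from 1."""
    return "\n".join(f"Q{i}. {x}+{y}" for i, (x, y) in enumerate(pairs, start=1))

# Matches lines such as 'A12. 138779'.
ANSWER_RE = re.compile(r"^A(\d+)\.[ \t]*(\d+)[ \t]*$", re.MULTILINE)

def parse_answers(reply):
    """Map question number -> answer for every well-formed 'An. value' line."""
    return {int(i): int(v) for i, v in ANSWER_RE.findall(reply)}

def count_correct(pairs, reply):
    """Count questions whose parsed answer equals x + y."""
    answers = parse_answers(reply)
    return sum(1 for i, (x, y) in enumerate(pairs, start=1) if answers.get(i) == x + y)

def first_failure(pairs, reply):
    """Return the first question number that is missing or wrong, or None."""
    answers = parse_answers(reply)
    for i, (x, y) in enumerate(pairs, start=1):
        if answers.get(i) != x + y:
            return i
    return None
```

Comparing `first_failure` for n = 320 and n = 330 might help show whether the reply is simply cut off after a certain point (answers missing) or whether the model keeps answering but gets the sums wrong.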