Your work "How Difficulty-Aware Staged Reinforcement Learning Enhances LLMs’ Reasoning Capabilities: A Preliminary Experimental Study" shows an impressive result with 16K training, outperforming DeepScaleR.
Could you share your complete training logs? The paper reports only the AIME24 evaluation score, so the full training logs would be useful for further analysis.