PRIME-RL

5 repositories

TTRL
Public
TTRL: Test-Time Reinforcement Learning
rl reasoning llm
Python • MIT License • 60 • 769 • 9 • 0 • Updated Aug 17, 2025
Entropy-Mechanism-of-RL
Public
The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning.
rl reasoning llm
Python • 9 • 305 • 3 • 0 • Updated Jul 11, 2025
SimpleVLA-RL
Public
Online RL with Simple Reward Enables Training VLA Models with Only One Trajectory
rl vla reasoning
Python • MIT License • 18 • 370 • 14 • 1 • Updated Jun 20, 2025
PRIME
Public
Scalable RL solution for advanced reasoning of language models
rl reasoning llm
Python • Apache License 2.0 • 96 • 1.7k • 7 • 1 • Updated Mar 18, 2025
ImplicitPRM
Public
Repo of paper "Free Process Rewards without Process Labels"
rl prm test-time-scaling
Python • Apache License 2.0 • 11 • 161 • 12 • 0 • Updated Mar 14, 2025