🎯
Learning
NUS PhD student working on RL @sail-sg
- Sea AI Lab @sail-sg
- Singapore
- https://lkevinzc.github.io/
- @zzlccc
Pinned
- sail-sg/oat (public): 🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.
- sail-sg/understand-r1-zero (public): Understanding R1-Zero-Like Training: A Critical Perspective
- sail-sg/oat-zero (public): A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.
- mosecorg/mosec (public): A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine.
- spiral-rl/spiral (public): SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning
- sail-sg/VeriFree (public): Reinforcing General Reasoning without Verifiers