What's Changed
- Added verified answers to the logging by @abaheti95 in #63
- Adding GPU CI back by @dakinggg in #64
- Fix args propagation by @dakinggg in #65
- Fix weight propagation by @bcui-db in #66
- Microbatching fixes by @dakinggg in #71
- Make myself admin by @gupta-abhay in #72
- Update ci-testing to latest version by @dakinggg in #70
- Move generate to be done via `prompt_token_ids` by @bcui-db in #73
- Add GRPO assert that we need more than one generation by @bcui-db in #74
- Adding a Math format verifier by @gupta-abhay in #75
- Pin foundry version and hash to prepare for foundry upgrade by @bowenyang008 in #76
- Bump to torch 2.7 by @bowenyang008 in #77
- Allow DPO reference model to be loaded from LoadCheckpoint callback by @dakinggg in #80
- Set default value as this is only used for local debugging by @gupta-abhay in #84
- Add More Codeowners by @bcui-db in #86
- Fix reward timeouts by @dakinggg in #87
- Remove llama models as defaults by @gupta-abhay in #88
- Skip initial vLLM weight load by @dakinggg in #89
- Fix memory leak by @dakinggg in #90
- Renaming and Organization of RL algorithms in preparation for Development by @jdchang1 in #83
- Causal classifier by @alextrott16 in #8
- vLLM import hotfix by @jdchang1 in #91
- Fixing entropy calculation by @abaheti95 in #85
Full Changelog: v0.5.0...v0.7.0