generated from fastai/nbdev_template
Pull requests: huggingface/trl
Initialize reward_kwargs to prevent UnboundLocalError in GRPOTrainer
#3459
opened May 16, 2025 by
teilomillet
Update grpo.py to fix bugs for cli grpo --reward_funcs my_lib.my_reward
#3454
opened May 16, 2025 by
wa008
4 of 5 tasks
add warning if dataset's input_ids exceed max_length
#3449
opened May 15, 2025 by
HERIUN
1 of 5 tasks
update doc reference "dataset_formats" to "dataset_formats.md"
#3440
opened May 13, 2025 by
TsingZ0
🛠️ quantization support for vllm generation
#3428
opened May 8, 2025 by
shirinyamani
5 tasks
[DPO] Truncation leading to zero'd out samples
#3398
opened May 1, 2025 by
LeonEricsson
2 of 5 tasks
Fix GRPO/DAPO/Dr.GRPO documentation: formula corrections and KL divergence clarification
#3395
opened Apr 30, 2025 by
JenWei0312
1 of 5 tasks
Reintroduce generate method for PPOTrainer
#3374
opened Apr 27, 2025 by
CloseChoice
4 tasks done
add support for reward func using nn.Module in GRPOTrainer
#3372
opened Apr 27, 2025 by
Tavish9
1 of 5 tasks
[Feat] Support SGLang as rollout engine of GRPO trainer
#3370
opened Apr 27, 2025 by
ryang-max
2 of 8 tasks
[GRPO] adds experimental support for the SSR replay buffer
#3325
opened Apr 18, 2025 by
edbeeching
Draft
[vllm] support base_url parameter for vLLM client initialization
#3324
opened Apr 18, 2025 by
re-imagined
Allow for saving the PPOTrainer value model (critic model)
#3308
opened Apr 16, 2025 by
AMindToThink
PPO value_model can't be None, so it shouldn't be Optional
#3300
opened Apr 15, 2025 by
AMindToThink
Modified GRPOTrainer to accumulate gradient within a single training batch
#3288
opened Apr 13, 2025 by
jarrelscy
3 of 5 tasks