Commit 063ca3e

Merge pull request #278 from sergiopaniego/llm-grpo-trl
Added new `Post-training an LLM using GRPO with TRL` recipe 🧑‍🍳️
2 parents 1c90308 + 0e45d15 commit 063ca3e

3 files changed: 895 additions, 3 deletions

notebooks/en/_toctree.yml

Lines changed: 3 additions & 1 deletion
@@ -74,7 +74,9 @@
   title: Phoenix Observability Dashboard on HF Spaces
 - local: search_and_learn
   title: Scaling Test-Time Compute for Longer Thinking in LLMs
-
+- local: fine_tuning_llm_grpo_trl
+  title: Post training an LLM for reasoning with GRPO in TRL
+
 - title: Computer Vision Recipes
   isExpanded: false
   sections:

notebooks/en/fine_tuning_llm_grpo_trl.ipynb

Lines changed: 891 additions & 0 deletions
Large diffs are not rendered by default.

notebooks/en/index.md

Lines changed: 1 addition & 2 deletions
@@ -7,12 +7,11 @@ applications and solving various machine learning tasks using open-source tools
 
 Check out the recently added notebooks:
 
+- [Post training an LLM for reasoning with GRPO in TRL](fine_tuning_llm_grpo_trl)
 - [Evaluating AI Search Engines with `judges` - the open-source library for LLM-as-a-judge evaluators](llm_judge_evaluating_ai_search_engines_with_judges_library)
 - [Structured Generation from Images or Documents Using Vision Language Models](structured_generation_vision_language_models)
 - [Vector Search on Hugging Face with the Hub as Backend](vector_search_with_hub_as_backend)
 - [Multi-Agent Order Management System with MongoDB](mongodb_smolagents_multi_micro_agents)
-- [Scaling Test-Time Compute for Longer Thinking in LLMs](search_and_learn)
-- [Signature-Aware Model Serving from MLflow with Ray Serve](mlflow_ray_serve)
 
 You can also check out the notebooks in the cookbook's [GitHub repo](https://github.com/huggingface/cookbook).
 
