Question on the removal of .step() and .generate() methods in Trainer #3270

Open
jskaf34 opened this issue Apr 9, 2025 · 5 comments · May be fixed by #3410
Labels
❓ question Seeking clarification or more information

Comments

@jskaf34

jskaf34 commented Apr 9, 2025

Hello everyone,

First of all, thank you for the amazing work you're doing—this library is truly a fantastic resource for the community! 😊

I had a more practical question regarding the decision to remove the .step() and .generate() methods from the Trainer classes. Our current implementation relies heavily on these methods, and updating to the new version would require a fair amount of engineering effort.

Before proceeding, I wanted to understand the rationale behind this change. Was there a specific motivation for deprecating these methods, and is there a recommended alternative approach?

Thank you in advance for your help and insights! 🚀

@github-actions github-actions bot added the ❓ question Seeking clarification or more information label Apr 9, 2025
@qgallouedec
Member

Hi, thanks for the feedback.
That sounds more like a question about transformers, doesn't it?
Trainer still exposes a training_step method, and I looked at the codebase from a year ago: Trainer exposed neither a step method nor a generate method. What are you referring to exactly? Or maybe an even older version?

@eryawww

eryawww commented Apr 10, 2025

Hello, thank you for the answer.

I believe I have a similar question. In the earlier version (trl==0.11), there was a PPOTrainer.step(query, response, score) method that was really handy for online/iterative RL scenarios. From what I see in the current implementation, it looks like everything is now wrapped into PPOTrainer.train.

I’m wondering, what is the recommended way to implement an iterative scenario with the new version?
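
For readers unfamiliar with the pattern, here is a minimal sketch of the kind of iterative loop being described: generate responses, score them, then run one update per batch. It assumes a hypothetical `StepwiseTrainer` interface that only mirrors the `generate` / `step(query, response, score)` call shape mentioned above. `EchoTrainer`, `run_online_loop` and `score_fn` are illustrative names, not TRL APIs, and the toy trainer works on strings and does no learning.

```python
from typing import Callable, Iterable, Protocol, Sequence


class StepwiseTrainer(Protocol):
    """Hypothetical interface mirroring the generate/step call shape from this thread."""

    def generate(self, queries: Sequence[str]) -> list: ...

    def step(self, queries: Sequence[str], responses: Sequence[str],
             scores: Sequence[float]) -> dict: ...


def run_online_loop(
    trainer: StepwiseTrainer,
    batches: Iterable[Sequence[str]],
    score_fn: Callable[[str, str], float],
) -> list:
    """Run one generate -> score -> step cycle per batch and collect the stats."""
    history = []
    for queries in batches:
        # 1. Sample responses from the current policy.
        responses = trainer.generate(queries)
        # 2. Score each (query, response) pair with user-supplied logic.
        scores = [score_fn(q, r) for q, r in zip(queries, responses)]
        # 3. Apply one update using exactly this batch of experience.
        history.append(trainer.step(queries, responses, scores))
    return history


class EchoTrainer:
    """Toy stand-in so the sketch runs without TRL; it does not learn anything."""

    def __init__(self) -> None:
        self.updates = 0

    def generate(self, queries):
        # Pretend "generation": upper-case each query.
        return [q.upper() for q in queries]

    def step(self, queries, responses, scores):
        # Pretend "update": count the call and report the mean score.
        self.updates += 1
        return {"update": self.updates, "mean_score": sum(scores) / len(scores)}
```

The point of this shape is that the caller owns the loop, so rewards can come from anything (a human, an environment, another model) between `generate` and `step`.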

@jskaf34
Author

jskaf34 commented Apr 10, 2025

Hey, thank you for your answer.
In trl==0.11, PPOTrainer had generate and step methods that were really convenient for customising our RL loops. In fact, many examples on the internet rely on this old version of TRL. Those methods were removed afterwards. I was wondering whether there was a reason for that removal, in case I have to reimplement them to upgrade to TRL's new version.
Have a great day!

@qgallouedec
Member

Ah ok, you're talking about PPOTrainer. It wasn't clear; we have more than 15 trainers in this repo.
The motivation was that we wanted all our trainers to inherit from transformers.Trainer, which lets us benefit from all its great features and considerably reduces the maintenance effort. You have two options: either pin trl 0.11, or implement these methods in the current PPOTrainer and possibly open a PR.
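
Whichever option you pick, code that depends on the old loop may want to fail fast when the installed trainer class lacks those methods. A minimal sketch, assuming plain attribute checks are enough: `missing_methods` and `require_stepwise_api` are hypothetical helper names, and the two dummy classes stand in for an old-style and a new-style trainer so the example runs without TRL installed.

```python
def missing_methods(trainer_cls: type, required=("generate", "step")) -> list:
    """Return the names in `required` that `trainer_cls` does not expose as callables."""
    return [name for name in required
            if not callable(getattr(trainer_cls, name, None))]


def require_stepwise_api(trainer_cls: type) -> None:
    """Raise early with a readable message if the step-wise methods are absent."""
    missing = missing_methods(trainer_cls)
    if missing:
        raise RuntimeError(
            f"{trainer_cls.__name__} lacks {', '.join(missing)}; "
            "pin an older release or reimplement these methods."
        )


# Dummy classes for illustration only (not TRL classes).
class OldStyleTrainer:
    def generate(self, queries): ...
    def step(self, queries, responses, scores): ...


class NewStyleTrainer:
    def train(self): ...
```

In a real project you would pass the trainer class you actually import (for example `PPOTrainer`) to `require_stepwise_api` at startup.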

@jskaf34
Author

jskaf34 commented Apr 11, 2025

Okay, crystal clear, I'll let you know! Thanks for your answers 😊
