[HACK] Turn off inference mode in RL trainer for now #3440
Draft
andrewor14 wants to merge 1 commit into unslothai:main from