This repository contains the implementation of a Reinforcement Learning from Human Feedback (RLHF) system using custom datasets. The project uses the trlX library to train a preference model that integrates human feedback directly into the optimization of language models.
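The layout of the custom preference data is not specified here. Purely as an illustration, pairwise human-feedback datasets are often stored as JSON Lines records like the one below; the field names `prompt`, `chosen`, and `rejected` are a common convention, not this repo's actual schema:

```json
{"prompt": "Summarize RLHF in one sentence.", "chosen": "RLHF fine-tunes a language model with rewards learned from human preference judgments.", "rejected": "RLHF is a database format."}
```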
To set up your environment to run this project, follow these steps:
- Clone the repository:

  ```bash
  git clone https://github.com/your_username/RLHF_Project.git
  cd RLHF_Project
  ```
- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```
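The `requirements.txt` shipped with the repository is authoritative; a hypothetical minimal set of dependencies for a trlX-based RLHF project (an assumption, not the actual pinned list) might look like:

```text
trlx
torch
transformers
datasets
```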
To run the RLHF training process, execute the `main.py` script:

```bash
python main.py
```
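The repository's actual `main.py` defines the real training logic; the following is only a hedged sketch of what a trlX-based PPO entry point can look like. The placeholder reward function and prompts are assumptions for illustration, not the project's code:

```python
# Minimal sketch of a trlX PPO training loop. The reward function is a
# placeholder; a real RLHF run would score samples with a preference model
# trained on human feedback.
from typing import List

import trlx
from trlx.data.default_configs import default_ppo_config


def reward_fn(samples: List[str], **kwargs) -> List[float]:
    # Placeholder: reward longer completions. Swap in a call to your
    # trained preference model here.
    return [float(len(sample)) for sample in samples]


def main() -> None:
    config = default_ppo_config()  # stock PPO hyperparameters shipped with trlX
    # Hypothetical prompts; the project would load these from its custom dataset.
    prompts = ["Explain reinforcement learning in one sentence:"] * 64
    trlx.train(reward_fn=reward_fn, prompts=prompts, config=config)


if __name__ == "__main__":
    main()
```

During training, trlX alternates between sampling completions from the policy and running PPO updates against the scores returned by `reward_fn`.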