bigai-nlco/RuleReasoner


RuleReasoner: Reinforced Rule-based Reasoning
via Domain-aware Dynamic Sampling

| ⚙️ Code | 🤗 Model | 📚 Data |


📍 TL;DR

Reinforced Rule-based Reasoning (RuleReasoner) is a simple yet effective method that enables small reasoning models (SRMs) to learn rule-based reasoning. Unlike large models that require complex training, RuleReasoner relies on a curated collection of tasks and a domain-aware dynamic sampling approach that adjusts training based on the model's historical performance. With this technique, SRMs outperform frontier Large Reasoning Models (LRMs) by +4.1% on in-distribution tasks and +10.4% on out-of-distribution (OOD) tasks, while also being more computationally efficient.

  • Domain-aware dynamic sampling improves training sampling efficiency and balances performance across domains.
  • Comprehensive data curation provides data curricula for rule-centric applications.
  • RuleReasoner (8B and 4B) achieves comparable performance against a wide range of baselines.
  • RuleReasoner (8B and 4B) also achieves strong OOD performance across three benchmarks (their rule-based reasoning subsets): BBH, ProverQA, and BBEH.
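
The idea of domain-aware dynamic sampling can be illustrated with a small sketch. This README does not specify the actual weighting rule, so everything below is an assumption made for illustration: the function names, the exponential weighting, the `temperature` and `floor` parameters, and the moving-average update are hypothetical and are not taken from the RuleReasoner code. The sketch only shows the general pattern of tracking per-domain rewards and sampling weaker domains more often.

```python
import math
import random


def ema_update(history, batch_rewards, alpha=0.5):
    """Blend the latest per-domain mean rewards into the running history.

    Hypothetical helper: an exponential moving average is one simple way
    to keep a 'historical performance' signal per domain.
    """
    for domain, reward in batch_rewards.items():
        previous = history.get(domain, reward)
        history[domain] = alpha * reward + (1 - alpha) * previous
    return history


def update_domain_weights(avg_rewards, temperature=1.0, floor=0.05):
    """Map historical rewards (assumed to lie in [0, 1]) to sampling probabilities.

    Domains with lower historical reward receive more probability mass.
    The uniform `floor` keeps every domain reachable; it assumes
    floor * number_of_domains <= 1.
    """
    scores = {d: math.exp((1.0 - r) / temperature) for d, r in avg_rewards.items()}
    total = sum(scores.values())
    n = len(scores)
    return {d: (1 - floor * n) * (s / total) + floor for d, s in scores.items()}


def sample_batch(examples_by_domain, weights, batch_size, rng):
    """Draw a batch by first choosing a domain, then an example within it."""
    domains = list(weights)
    picks = rng.choices(domains, weights=[weights[d] for d in domains], k=batch_size)
    return [(d, rng.choice(examples_by_domain[d])) for d in picks]


if __name__ == "__main__":
    # Hypothetical domains and rewards, for illustration only.
    history = ema_update({}, {"logic": 0.9, "arithmetic": 0.2})
    weights = update_domain_weights(history)
    data = {"logic": ["l1", "l2"], "arithmetic": ["a1", "a2"]}
    print(weights)
    print(sample_batch(data, weights, batch_size=4, rng=random.Random(0)))
```

In this sketch the weights are recomputed after each training step from the updated history, so a domain that improves gradually gives up sampling budget to domains that still lag.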

🎯 Quick Start

Prerequisites

Running RuleReasoner requires the dependencies listed in requirements.txt, along with the bundled verl package (see Installation).

Installation

Build RuleReasoner from the source and install dependencies:

  1. Clone the repository:

    ❯ git clone https://github.com/bigai-nlco/RuleReasoner.git
  2. Navigate to the project directory:

    ❯ cd RuleReasoner
  3. Install the dependencies:

    ❯ pip install -r requirements.txt
    ❯ pip install -e ./verl
    ❯ pip install -e .

Training

Run the training with:

./scripts/train/train_mix.sh

Evaluation

Run the evaluation with:

./scripts/eval/eval_model.sh \
    --model "$MODEL_PATH" \
    --datasets "$DATASET_PATH" \
    --output-dir "$OUTPUT_DIR"
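
When evaluation is driven from a Python script, for example to loop over several checkpoints, the same command can be assembled programmatically. The sketch below uses only the script path and flags shown above. The helper name `build_eval_command` and the example paths are hypothetical placeholders, not part of the repository.

```python
import shlex


def build_eval_command(model_path, dataset_path, output_dir,
                       script="./scripts/eval/eval_model.sh"):
    """Return the evaluation command as an argument list.

    Passing a list (rather than one string) to subprocess.run avoids
    shell-quoting problems when paths contain spaces.
    """
    return [
        script,
        "--model", model_path,
        "--datasets", dataset_path,
        "--output-dir", output_dir,
    ]


if __name__ == "__main__":
    # Placeholder paths for illustration; substitute your own.
    cmd = build_eval_command("checkpoints/my-model", "data/my-dataset", "outputs/eval")
    print(shlex.join(cmd))
    # To actually run it from the repository root:
    # import subprocess; subprocess.run(cmd, check=True)
```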

🌳 Project Structure

└── RuleReasoner
    ├── LICENSE
    ├── README.md
    ├── requirements.txt
    ├── scripts
    │   ├── build_dataset.py
    │   ├── data
    │   ├── eval
    │   └── train
    ├── setup.py
    ├── src
    │   ├── __init__.py
    │   ├── data
    │   ├── globals.py
    │   ├── system_prompts.py
    │   └── utils.py
    └── verl
        └── ...

🔧 Contributing

Contributing Guidelines
  1. Fork the Repository: Start by forking the project repository to your own GitHub account.
  2. Clone Locally: Clone your fork to your local machine using a git client (replace <your-username> with your GitHub username).
    git clone https://github.com/<your-username>/RuleReasoner.git
  3. Create a New Branch: Always work on a new branch, giving it a descriptive name.
    git checkout -b new-feature-x
  4. Make Your Changes: Develop and test your changes locally.
  5. Commit Your Changes: Commit with a clear message describing your updates.
    git commit -m 'Implemented new feature x.'
  6. Push to Your Fork: Push the changes to your forked repository.
    git push origin new-feature-x
  7. Submit a Pull Request: Create a PR against the original project repository. Clearly describe the changes and their motivations.
  8. Review: Once your PR is reviewed and approved, it will be merged into the main branch. Congratulations on your contribution!

©️ License

RuleReasoner is released under the MIT License. For more details, please refer to the LICENSE file.

🔖 Citation

@article{liu2025rulereasoner,
      title={RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling}, 
      author={Yang Liu and Jiaqi Li and Zilong Zheng},
      year={2025},
      eprint={2506.08672},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.08672}, 
}
