Reinforced Rule-based Reasoning (RuleReasoner) is a simple yet effective method that enables small reasoning models (SRMs) to learn rule-based reasoning. Instead of a complex training recipe, RuleReasoner relies on a curated collection of tasks and a domain-aware dynamic sampling approach, which adjusts how training data is sampled based on the model's historical performance. With this technique, SRMs outperform frontier Large Reasoning Models (LRMs) by +4.1% on in-distribution tasks and +10.4% on out-of-distribution tasks, while also being more computationally efficient.
- Domain-aware dynamic sampling that improves training sample efficiency and balances performance across domains.

- Comprehensive data curation that builds data curricula for rule-centric applications.

- RuleReasoner (8B and 4B) shows performance comparable to a wide range of baselines.

- RuleReasoner (8B and 4B) also achieves strong OOD performance across three benchmarks (subsets of rule-based reasoning): BBH, ProverQA, and BBEH.
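
To make the idea of domain-aware dynamic sampling concrete, here is a minimal Python sketch. It is not the RuleReasoner implementation: the `DomainSampler` class, the exponential moving average of rewards, the softmax weighting, and the domain names are all assumptions chosen for illustration. The only property it tries to capture is the one described above, namely that domains where the model has historically done worse are sampled more often.

```python
import math
import random


class DomainSampler:
    """Hypothetical sketch: sample training domains more often when the
    model's historical reward on them is low. Not the paper's exact rule."""

    def __init__(self, domains, decay=0.9, temperature=0.5, seed=0):
        self.domains = list(domains)
        self.decay = decay              # smoothing factor for the reward history
        self.temperature = temperature  # lower values focus harder on weak domains
        self.avg_reward = {d: 0.0 for d in self.domains}
        self.rng = random.Random(seed)

    def update(self, domain, batch_rewards):
        """Fold the mean reward of the latest batch into the domain's history."""
        mean = sum(batch_rewards) / len(batch_rewards)
        self.avg_reward[domain] = (
            self.decay * self.avg_reward[domain] + (1 - self.decay) * mean
        )

    def weights(self):
        """Softmax over (1 - historical reward): weaker domains get more mass."""
        scores = [(1.0 - self.avg_reward[d]) / self.temperature for d in self.domains]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        return {d: e / total for d, e in zip(self.domains, exps)}

    def sample(self, batch_size):
        """Draw the domain for each example in the next training batch."""
        w = self.weights()
        return self.rng.choices(
            self.domains, weights=[w[d] for d in self.domains], k=batch_size
        )


# Usage with made-up domain names and rewards in [0, 1].
sampler = DomainSampler(["logic", "arithmetic", "legal"])
sampler.update("logic", [1.0, 1.0, 0.0, 1.0])  # mostly solved
sampler.update("legal", [0.0, 0.0, 1.0, 0.0])  # mostly failed
print(sampler.weights())
print(sampler.sample(8))
```

In this sketch the sampling distribution is recomputed from the reward history whenever a batch is drawn, so the data mix shifts as training progresses rather than following a fixed schedule.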

Running RuleReasoner requires the following dependencies:
Build RuleReasoner from the source and install dependencies:
- Clone the repository:
❯ git clone https://github.com/bigai-nlco/RuleReasoner.git
- Navigate to the project directory:
❯ cd RuleReasoner
- Install the dependencies:
❯ pip install -r requirements.txt
❯ pip install -e ./verl
❯ pip install -e .
Run the training with:
./scripts/train/train_mix.sh
Run the evaluation with:
./scripts/eval/eval_model.sh \
--model $MODEL_PATH \
--datasets $DATASET_PATH \
--output-dir $OUTPUT_DIR
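
If you prefer to launch evaluation from Python, the command above can be assembled as an argument list. This is a small sketch rather than part of the repository: `build_eval_command` is a hypothetical helper, and the example paths are placeholders. Only the script path and the three flags come from the command shown above.

```python
import shlex

# Script path and flag names are taken from the evaluation command above.
EVAL_SCRIPT = "./scripts/eval/eval_model.sh"


def build_eval_command(model_path, dataset_path, output_dir):
    """Hypothetical helper: return the evaluation command as an argv list."""
    return [
        EVAL_SCRIPT,
        "--model", str(model_path),
        "--datasets", str(dataset_path),
        "--output-dir", str(output_dir),
    ]


# Placeholder paths; replace them with your own checkpoint and data locations.
cmd = build_eval_command("checkpoints/my-model", "data/eval", "results/run1")
print(shlex.join(cmd))
# To run it from the repository root:
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.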
└── RuleReasoner
    ├── LICENSE
    ├── README.md
    ├── requirements.txt
    ├── scripts
    │   ├── build_dataset.py
    │   ├── data
    │   ├── eval
    │   └── train
    ├── setup.py
    ├── src
    │   ├── __init__.py
    │   ├── data
    │   ├── globals.py
    │   ├── system_prompts.py
    │   └── utils.py
    └── verl
        └── ...
- 💬 Join the Discussions: Share your insights, provide feedback, or ask questions.
- 🐛 Report Issues: Submit bugs found or log feature requests for the RuleReasoner project.
- 💡 Submit Pull Requests: Review open PRs, and submit your own PRs.
Contributing Guidelines
- Fork the Repository: Start by forking the project repository to your own GitHub account.
- Clone Locally: Clone the forked repository to your local machine using a git client.
git clone https://github.com/<your-username>/RuleReasoner.git
- Create a New Branch: Always work on a new branch, giving it a descriptive name.
git checkout -b new-feature-x
- Make Your Changes: Develop and test your changes locally.
- Commit Your Changes: Commit with a clear message describing your updates.
git commit -m 'Implemented new feature x.'
- Push to Your Fork: Push the changes to your forked repository.
git push origin new-feature-x
- Submit a Pull Request: Create a PR against the original project repository. Clearly describe the changes and their motivations.
- Review: Once your PR is reviewed and approved, it will be merged into the main branch. Congratulations on your contribution!
RuleReasoner is released under the terms stated in the LICENSE file. For more details, please refer to that file.
@article{liu2025rulereasoner,
title={RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling},
author={Yang Liu and Jiaqi Li and Zilong Zheng},
year={2025},
eprint={2506.08672},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2506.08672},
}