- We released E1-AceReason-14B finetuned from AceReason-Nemotron-14B
- We released E1-Math-7B finetuned from Skywork-OR1-Math-7B
- We released E1-Math-1.5B and E1-Code-14B
We propose Elastic Reasoning, a novel framework for scalable chain of thoughts that explicitly separates reasoning into two phases, thinking and solution, with independently allocated budgets. At test time, Elastic Reasoning prioritizes the completeness of the solution segment, significantly improving reliability under tight resource constraints. To train models that are robust to truncated thinking, we introduce a lightweight budget-constrained rollout strategy, integrated into GRPO, which teaches the model to reason adaptively when the thinking process is cut short and generalizes effectively to unseen budget constraints without additional training.
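The test-time mechanism is simple: decode the thinking phase under its own token cap, force the phase switch if the cap is hit, then decode the solution under a separate cap. Below is a minimal sketch of this separate-budget decoding using HuggingFace `transformers`; the budget values and the `</think>` phase delimiter follow the eval commands later in this README, while the decoding details themselves are illustrative, not the repo's implementation (`stop_strings` requires a recent `transformers` version).

```python
# Minimal sketch of separate-budget decoding (illustrative, not the repo's code).
# Assumes the model closes its thinking phase with a literal "</think>" marker.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Salesforce/E1-Math-1.5B"
THINKING_BUDGET, SOLUTION_BUDGET = 1024, 1024  # independently allocated budgets

tok = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")

def elastic_generate(prompt: str) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids
    # Phase 1: thinking, stopped early at "</think>" or hard-capped at the budget.
    think = model.generate(ids, max_new_tokens=THINKING_BUDGET,
                           stop_strings=["</think>"], tokenizer=tok)
    text = tok.decode(think[0], skip_special_tokens=True)
    if "</think>" not in text:
        text += "</think>"  # budget exhausted mid-thought: force the phase switch
    # Phase 2: solution, with its own independent budget, so the final answer
    # stays complete even when thinking was truncated.
    sol_ids = tok(text, return_tensors="pt").input_ids
    out = model.generate(sol_ids, max_new_tokens=SOLUTION_BUDGET)
    return tok.decode(out[0][sol_ids.shape[1]:], skip_special_tokens=True)
```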
Main Takeaways
- ✂️ Thinking + Solution are explicitly separated with independent budgets — boosting reliability under tight compute constraints.
- 🧠 Budget-Constrained Rollout: We train models to handle truncated reasoning using GRPO (see the sketch after this list).
- 📈 Flexible scalability: Robust performance across diverse inference budgets on reasoning benchmarks like AIME and LiveCodeBench.
- ⚙️ Better performance with fewer tokens: Our trained model generates outputs that are 30% shorter while maintaining (or even improving) accuracy.
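For concreteness, here is a hypothetical sketch of one budget-constrained rollout: the thinking phase is hard-truncated at a sampled budget and `</think>` is appended if missing, so the policy always produces a complete solution segment for GRPO to score. The `Policy` type and function names are illustrative, not the repo's API.

```python
from typing import Callable, Optional

# Hypothetical signature: a policy maps (prompt, max_new_tokens, stop_string)
# to generated text. Names are illustrative, not the repo's API.
Policy = Callable[[str, int, Optional[str]], str]

def budget_constrained_rollout(policy: Policy, prompt: str,
                               think_budget: int, sol_budget: int) -> str:
    """One training rollout under a (thinking, solution) budget pair."""
    # Thinking phase: stop at "</think>" or hard-truncate at the budget.
    thinking = policy(prompt, think_budget, "</think>")
    if not thinking.endswith("</think>"):
        thinking += "</think>"  # truncated: force a well-formed phase switch
    # Solution phase: always receives its full, independent budget.
    solution = policy(prompt + thinking, sol_budget, None)
    return thinking + solution
```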
Results (Avg@16)

Each "Tokens / Acc (%)" column pair reports the average number of generated tokens and the accuracy: the first pair is the unconstrained setting, and the remaining four pairs use progressively larger inference budgets (left to right).

| Model | Tokens | Acc (%) | Tokens | Acc (%) | Tokens | Acc (%) | Tokens | Acc (%) | Tokens | Acc (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| DeepScaleR-1.5B | 10050 | 41.0 | 1488 | 5.2 | 1904 | 9.6 | 2809 | 15.8 | 3700 | 22.7 |
| E1-Math-1.5B | 6825 | 35.0 | 1340 | 13.5 | 1799 | 17.5 | 2650 | 24.8 | 3377 | 27.9 |
| Skywork-OR1-Math-7B | 13803 | 68.3 | 1534 | 1.0 | 2047 | 2.1 | 3051 | 7.7 | 4023 | 14.0 |
| E1-Math-7B | 11768 | 69.6 | 1381 | 16.9 | 1841 | 21.3 | 2799 | 26.0 | 3742 | 32.9 |
```bash
# Installing Python 3.10 environment.
conda create -n e1 python=3.10 -y
conda activate e1

# Installing dependencies.
cd Elastic-Reasoning
pip install -e ./verl
pip install -e .
```
Our raw training data is in `rllm/data/[train|test]/[code|math]/`, along with preprocessing scripts in `rllm/data/preprocess/`. To convert the raw data into Parquet files for training, run:
```bash
# Download datasets from GDrive; populates rllm/data/[train|test]/[math|code]/*.json
python scripts/data/download_datasets.py

# Generate Parquet files for DeepCoder/DeepScaleR in data/*.parquet
python scripts/data/[deepcoder|deepscaler]_dataset.py
```
To train E1-Math-1.5B (the `1k_1k` suffix denotes the 1K thinking + 1K solution budget), run:

```bash
export MODEL_PATH="agentica-org/DeepScaleR-1.5B-Preview"
./scripts/e1-math/e1_math_1.5b_1k_1k.sh --model $MODEL_PATH
```
To evaluate a model, run:

```bash
./scripts/eval/eval_model.sh --model [CHECKPOINT_PATH] --datasets [DATASET1] [DATASET2] --output-dir [OUTPUT_DIR] --n [N_PASSES] --tp [TENSOR_PARALLEL_SIZE] --e1-mode [SEPARATE_BUDGETING] --e1-thinking-length [THINKING_LENGTH] --e1-solution-length [SOLUTION_LENGTH]
```
For example:

```bash
# E1-Math-1.5B on math benchmarks (Avg@16, 1K thinking + 1K solution budget).
./scripts/eval/eval_model.sh --model Salesforce/E1-Math-1.5B --datasets aime math amc minerva olympiad_bench --output-dir $HOME/E1-Math-1.5B --tp 1 --n 16 --e1-mode True --e1-thinking-length 1024 --e1-solution-length 1024

# E1-Code-14B on LiveCodeBench.
./scripts/eval/eval_model.sh --model Salesforce/E1-Code-14B --datasets test_livecodebench --output-dir $HOME/E1-Code-14B --tp 4 --e1-mode True --e1-thinking-length 1024 --e1-solution-length 1024

# E1-Code-14B on Codeforces, then compute the Elo rating.
./scripts/eval/eval_model.sh --model Salesforce/E1-Code-14B --datasets test_codeforces --output-dir $HOME/E1-Code-14B --tp 4 --n 8 --e1-mode True --e1-thinking-length 1024 --e1-solution-length 1024
python scripts/deepcoder/benchmark/cf_elo_calc.py --results_path [RESULTS_JSON_PATH] --pass_n 8
```
To evaluate the original model without separate budgeting, set `--e1-mode False` and `--max-length [maximum token length, e.g., 32768]`.
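For example, assembling a command from the flags documented above (the dataset choice and output directory here are illustrative):

```bash
# Unconstrained baseline: no thinking/solution split, single 32K length cap.
./scripts/eval/eval_model.sh --model agentica-org/DeepScaleR-1.5B-Preview --datasets aime --output-dir $HOME/DeepScaleR-1.5B --tp 1 --n 16 --e1-mode False --max-length 32768
```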
We greatly thank rllm and verl for providing the awesome codebases!
```bibtex
@article{xu2025scalable,
  title={Scalable Chain of Thoughts via Elastic Reasoning},
  author={Xu, Yuhui and Dong, Hanze and Wang, Lei and Sahoo, Doyen and Li, Junnan and Xiong, Caiming},
  journal={arXiv preprint arXiv:2505.05315},
  year={2025}
}
```
This release is for research purposes only in support of an academic paper. Our models, datasets, and code are not specifically designed or evaluated for all downstream purposes. We strongly recommend users evaluate and address potential concerns related to accuracy, safety, and fairness before deploying this model. We encourage users to consider the common limitations of AI, comply with applicable laws, and leverage best practices when selecting use cases, particularly for high-risk scenarios where errors or misuse could significantly impact people’s lives, rights, or safety. For further guidance on use cases, refer to our AUP and AI AUP.