This repository contains various resources for the paper *Rethinking Reflection in Pre-Training* by Essential AI:
- Prompts
- Results
- Tasks
- Classifier
- Datasets (on Hugging Face)
Updates:
- 2025-04 - we released our paper and made this announcement.
A language model’s ability to reflect on its own reasoning provides a key advantage for solving complex problems. While most recent research has focused on how this ability develops during reinforcement learning, we show that it actually begins to emerge much earlier, during the model’s pre-training. To study this, we introduce deliberate errors into chains-of-thought and test whether the model can still arrive at the correct answer by recognizing and correcting these mistakes. By tracking performance across different stages of pre-training, we observe that this self-correcting ability appears early and improves steadily over time. For instance, an OLMo-2-7B model pre-trained on 4 trillion tokens displays self-correction on our six self-reflection tasks.
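As a rough illustration of the adversarial setup described above, one can splice a deliberately wrong step into an otherwise correct chain of thought and ask the model to continue. This is only a minimal sketch; the function, question, and error below are hypothetical, and the paper's actual datasets were built by prompting LLMs (see `prompts/`):

```python
# Hypothetical sketch of the adversarial chain-of-thought construction.
# The question, steps, and injected error here are made up for illustration.

def make_adversarial_cot(question: str, correct_steps: list[str],
                         bad_step_idx: int, bad_step: str) -> str:
    """Replace one correct reasoning step with a deliberately wrong one."""
    steps = list(correct_steps)
    steps[bad_step_idx] = bad_step
    cot = "\n".join(f"Step {i + 1}: {s}" for i, s in enumerate(steps))
    return f"Question: {question}\n{cot}\nContinue the reasoning and give the final answer."

prompt = make_adversarial_cot(
    question="What is 12 * 7 + 5?",
    correct_steps=["12 * 7 = 84", "84 + 5 = 89"],
    bad_step_idx=0,
    bad_step="12 * 7 = 74",  # deliberate error the model must notice and correct
)
print(prompt)
```

A model that reflects should flag the incorrect first step and still recover the answer 89.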
As detailed in our paper, we modified existing datasets to create adversarial datasets that can elicit reflection behavior from LLMs. Our modifications involved prompting LLMs, and the `prompts/` directory includes the specific instructions for each dataset.
Our experiments produced a substantial volume of results and sample JSON files, which are available via Essential AI's Hugging Face hub. Additionally, in `results/` we share CSV files that collate our results, along with the plots from our paper and the code used to create them.
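The collated CSVs can be sliced per task to trace accuracy across pre-training checkpoints. The snippet below is only a sketch of that workflow using the standard library; the column names and values are assumptions for illustration and may not match the actual files in `results/`:

```python
import csv
import io

# Hypothetical sample mimicking a collated results CSV; the real files in
# results/ may use different columns and contain the paper's actual numbers.
sample = """task,checkpoint_tokens,accuracy
gsm8k_adv,1000000000000,0.31
gsm8k_adv,4000000000000,0.47
"""

# Group accuracy values by task, preserving checkpoint order.
by_task: dict[str, list[float]] = {}
for row in csv.DictReader(io.StringIO(sample)):
    by_task.setdefault(row["task"], []).append(float(row["accuracy"]))

print(by_task)
```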
Our experiments utilized the EleutherAI Language Model Evaluation Harness. We created custom tasks, which are available in `tasks/`:
- `bbh_adv`
- `cruxeval_i_adv`
- `cruxeval_o_adv`
- `gsm8k_adv`
- `gsm8k-platinum_adv`
- `triviaqa_adv`
Install `lm-eval` as per the LM Evaluation Harness instructions here. Next, `git clone https://github.com/Essential-AI/reflection.git`. Then, when calling `lm-eval`, include some or all of our custom task names, along with the `--include_path` flag and the path to `reflection/tasks/`. For example:
```shell
lm_eval --model hf \
    --tasks bbh_adv,cruxeval_i_adv,cruxeval_o_adv,gsm8k_adv,gsm8k-platinum_adv,triviaqa_adv \
    --include_path reflection/tasks \
    --model_args pretrained=EleutherAI/pythia-160m,revision=step100000,dtype="float" \
    --device cuda:0 \
    --batch_size auto:4
```
See here for more details on how to use the `lm-eval` interface.
For the reflection classifier described in our paper, we created few-shot prompts tailored to each task. We share these in `classifier/`, along with the golden labels obtained from human annotators.
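One natural use of the golden labels is to measure how well classifier predictions agree with the human annotations. The helper below is a minimal sketch, not the paper's evaluation code; the label names and example values are hypothetical:

```python
# Hypothetical sketch: agreement between classifier predictions and the
# human-annotated golden labels. Label strings are made up for illustration.

def accuracy(preds: list[str], golds: list[str]) -> float:
    """Fraction of predictions that match the golden labels."""
    if len(preds) != len(golds):
        raise ValueError("prediction and gold lists must be the same length")
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

preds = ["reflection", "no_reflection", "reflection"]
golds = ["reflection", "reflection", "reflection"]
print(accuracy(preds, golds))  # 2 of 3 match
```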
Our code and results are made available under the MIT license, and our datasets under the CC BY-SA 4.0 license. Please see here for license details of each dataset.
Please cite both our paper and the original work for each dataset that we modified. Please see here for the relevant citation details for each dataset.
```bibtex
@misc{ai2025rethinkingreflectionpretraining,
    title={Rethinking Reflection in Pre-Training},
    author={Essential AI and : and Darsh J Shah and Peter Rushton and Somanshu Singla and Mohit Parmar and Kurt Smith and Yash Vanjani and Ashish Vaswani and Adarsh Chaluvaraju and Andrew Hojel and Andrew Ma and Anil Thomas and Anthony Polloreno and Ashish Tanwer and Burhan Drak Sibai and Divya S Mansingka and Divya Shivaprasad and Ishaan Shah and Karl Stratos and Khoi Nguyen and Michael Callahan and Michael Pust and Mrinal Iyer and Philip Monk and Platon Mazarakis and Ritvik Kapila and Saurabh Srivastava and Tim Romanski},
    year={2025},
    eprint={2504.04022},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2504.04022},
}
```