Reason to rote: rethinking memorization in reasoning

This repository contains the code for our paper "Reason to rote: rethinking memorization in reasoning".

Setup

python3.12 -m venv py312_venv
source py312_venv/bin/activate
pip install -r requirements.txt
pip install -e auto-circuit

Experiments

For both experiments, we provide sample data and checkpoints. You can generate data and train your own model by following the instructions below, or skip the data generation and training steps if you only want to reproduce the main results of our paper.

FDA

cd fda

# Generate data
python gen_data.py --data_dir data/add_noise/40000_10000_05 --num_train 40000 --num_val 10000 --noise_rate 0.05
# Train the model
CUDA_VISIBLE_DEVICES=1 python train.py --data_dir data/add_noise/40000_10000_05 --batch_size 2048 --lr 1e-4 --arch_d 256 --arch_l 4 --arch_h 4 --output_dir training_outputs/noise_05_d_256_l_4_h_4 --run_name noise_05_d_256_l_4_h_4

# Run the experiments
# the co-existence of memorization and generalization
CUDA_VISIBLE_DEVICES=0 python experiments/co_exist.py
# identify the circuits that are responsible for the memorization and generalization using EAP
CUDA_VISIBLE_DEVICES=0 python experiments/circuit_eap.py
# ablating different attention heads
CUDA_VISIBLE_DEVICES=0 python experiments/ablation.py
# identifying and validating outlier heuristics
CUDA_VISIBLE_DEVICES=0 python experiments/outlier_heuristics.py
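To make the `--noise_rate 0.05` setting above concrete, here is a minimal sketch of label-noise injection. It is not taken from `gen_data.py`: the function `add_label_noise`, the toy examples, and the ten-class label pool are hypothetical, and it assumes that the noise rate is the fraction of training examples whose label is replaced by a different, randomly chosen one.

```python
import random


def add_label_noise(examples, noise_rate, label_pool, seed=0):
    """Return a noisy copy of `examples` and the set of corrupted indices.

    Assumption (not confirmed by the repository): `noise_rate` is the
    fraction of examples whose label is swapped for a different one.
    """
    rng = random.Random(seed)
    n_noisy = round(len(examples) * noise_rate)
    noisy_idx = set(rng.sample(range(len(examples)), n_noisy))
    labels = list(label_pool)
    noisy = []
    for i, (x, y) in enumerate(examples):
        if i in noisy_idx:
            # Pick any label except the correct one.
            y = rng.choice([c for c in labels if c != y])
        noisy.append((x, y))
    return noisy, noisy_idx


# Toy stand-in for a 40000-example training set with 10 possible labels.
clean = [(f"input_{i}", i % 10) for i in range(40000)]
noisy, noisy_idx = add_label_noise(clean, noise_rate=0.05, label_pool=range(10))
n_changed = sum(a[1] != b[1] for a, b in zip(clean, noisy))
print(len(noisy_idx), n_changed)
```

Under this assumption, 40000 training examples at a noise rate of 0.05 give 2000 corrupted examples, which can only be fit by memorization; the remaining 38000 stay consistent with the underlying rule.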

THR

cd thr

# Generate data
python gen_data.py --num_entities 20 --num_relations 20 --num_train_templates 5 --save_dir data/e20_r20_t5 --train_ratio 0.8 --noise_rates 0.05 

# Train the model
CUDA_VISIBLE_DEVICES=0 python train.py --data_dir data/e20_r20_t5 --lr 1e-4 --arch_d 256 --arch_l 8 --arch_h 4 --output_dir training_outputs/e20_r20_t5_d256_l8_h4_noise_5_lr0.0001 --run_name e20_r20_t5_d256_l8_h4_noise_5_lr0.0001

# Run the experiments
# the co-existence of memorization and generalization
CUDA_VISIBLE_DEVICES=0 python experiments/co_exist.py
# identify the circuits that are responsible for the memorization and generalization using EAP
CUDA_VISIBLE_DEVICES=0 python experiments/circuit_eap.py
# ablating bridge entities with INLP and different attention heads
CUDA_VISIBLE_DEVICES=0 python experiments/ablation.py
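The THR training command above repeats the same hyperparameters in `--data_dir`, `--output_dir`, and `--run_name`. If you train with different settings, a small helper can keep these names in sync. The sketch below is a convenience, not part of the repository: `thr_run_name` is a hypothetical function that only mirrors the naming pattern visible in the command above, and the experiment scripts may expect specific paths.

```python
def thr_run_name(num_entities, num_relations, num_train_templates,
                 arch_d, arch_l, arch_h, noise_rate, lr):
    """Build a run name following the pattern used in the THR example."""
    noise_pct = round(noise_rate * 100)  # 0.05 -> 5
    # Fixed-point formatting without trailing zeros: 1e-4 -> "0.0001".
    # Note: values below 1e-6 would collapse to "0" with this formatting.
    lr_str = format(lr, "f").rstrip("0").rstrip(".")
    return (
        f"e{num_entities}_r{num_relations}_t{num_train_templates}"
        f"_d{arch_d}_l{arch_l}_h{arch_h}_noise_{noise_pct}_lr{lr_str}"
    )


name = thr_run_name(20, 20, 5, arch_d=256, arch_l=8, arch_h=4,
                    noise_rate=0.05, lr=1e-4)
data_dir = "data/e20_r20_t5"
command = (
    f"python train.py --data_dir {data_dir} --lr 1e-4 "
    f"--arch_d 256 --arch_l 8 --arch_h 4 "
    f"--output_dir training_outputs/{name} --run_name {name}"
)
print(command)
```

For the settings shown in this README, `name` is `e20_r20_t5_d256_l8_h4_noise_5_lr0.0001`, matching the training command above.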


mainlp/memorized_reasonings
