mainlp/memorized_reasonings

Reason to rote: rethinking memorization in reasoning

This repository contains the code for our paper "Reason to rote: rethinking memorization in reasoning".

Setup

python3.12 -m venv py312_venv
source py312_venv/bin/activate
pip install -r requirements.txt
pip install -e auto-circuit

Experiments

For both experiments, we provide sample data and checkpoints. To reproduce the main results of our paper, you can skip the data generation and training steps and run the experiment scripts directly. Alternatively, follow the instructions below to generate data and train your own model first.

FDA

cd fda

# Generate data
python gen_data.py --data_dir data/add_noise/40000_10000_05 --num_train 40000 --num_val 10000 --noise_rate 0.05
# Train the model
CUDA_VISIBLE_DEVICES=1 python train.py --data_dir data/add_noise/40000_10000_05 --batch_size 2048 --lr 1e-4 --arch_d 256 --arch_l 4 --arch_h 4 --output_dir training_outputs/noise_05_d_256_l_4_h_4 --run_name noise_05_d_256_l_4_h_4

# Run the experiments
# Show that memorization and generalization co-exist
CUDA_VISIBLE_DEVICES=0 python experiments/co_exist.py
# Identify the circuits responsible for memorization and generalization using EAP
CUDA_VISIBLE_DEVICES=0 python experiments/circuit_eap.py
# Ablate different attention heads
CUDA_VISIBLE_DEVICES=0 python experiments/ablation.py
# Identify and validate outlier heuristics
CUDA_VISIBLE_DEVICES=0 python experiments/outlier_heuristics.py
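
The `--noise_rate 0.05` flag above presumably sets the fraction of training examples whose labels are corrupted. As a rough illustration only, label noise at a given rate could be sketched as follows. This is not the repository's `gen_data.py`: the function name `inject_label_noise`, the integer-class label space, and the uniform replacement strategy are all assumptions made for the example.

```python
import random


def inject_label_noise(labels, noise_rate, num_classes, seed=0):
    """Hypothetical helper: corrupt a fixed fraction of integer labels.

    Returns the noisy label list and the sorted indices that were changed.
    """
    rng = random.Random(seed)
    n_noisy = round(len(labels) * noise_rate)
    # Choose which examples to corrupt, without replacement.
    noisy_indices = sorted(rng.sample(range(len(labels)), n_noisy))
    noisy = list(labels)
    for i in noisy_indices:
        # Draw a replacement that is guaranteed to differ from the clean label.
        candidates = [c for c in range(num_classes) if c != labels[i]]
        noisy[i] = rng.choice(candidates)
    return noisy, noisy_indices


# Toy stand-in for a 40000-example training set with 10 classes (assumed values).
clean = [i % 10 for i in range(40000)]
noisy, idx = inject_label_noise(clean, noise_rate=0.05, num_classes=10)
print(len(idx))  # 2000
```

Keeping the corrupted indices around makes it possible to evaluate noisy and clean training examples separately, which is one way to tell memorized examples apart from generalized ones.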

THR

cd thr

# Generate data
python gen_data.py --num_entities 20 --num_relations 20 --num_train_templates 5 --save_dir data/e20_r20_t5 --train_ratio 0.8 --noise_rates 0.05 

# Train the model
CUDA_VISIBLE_DEVICES=0 python train.py --data_dir data/e20_r20_t5 --lr 1e-4 --arch_d 256 --arch_l 8 --arch_h 4 --output_dir training_outputs/e20_r20_t5_d256_l8_h4_noise_5_lr0.0001 --run_name e20_r20_t5_d256_l8_h4_noise_5_lr0.0001

# Run the experiments
# Show that memorization and generalization co-exist
CUDA_VISIBLE_DEVICES=0 python experiments/co_exist.py
# Identify the circuits responsible for memorization and generalization using EAP
CUDA_VISIBLE_DEVICES=0 python experiments/circuit_eap.py
# Ablate bridge entities with INLP, and ablate different attention heads
CUDA_VISIBLE_DEVICES=0 python experiments/ablation.py
