- Uses 'import clip' / OpenAI/CLIP
- Modified version, exposing attn -> `attnclip` folder
- Download the dataset from huggingface.co/datasets/SPRIGHT-T2I/spright_coco (I provided the labels as .json here), or insert your own as `train_dataset`, `val_dataset` in all `plop-for-clip` code.
- Run all `plop-for-clip*`
- Run all `compare-clip*`
- Run all `clip-attention`
- Check out the results!
- For more information on what 'register neurons' are and how to find them, see github.com/zer0int/CLIP-test-time-registers
- Register Neuron Intervention vs. Bogus Neuron Intervention:
- Attention L2 for individual heads:
This project provides a simple script to compute alignment metrics for transformer models on various datasets.
Install dependencies:
pip install -r requirements.txt
Run the main script:
python main.py --model <huggingface-model-handle> --dataset <math|code|history|logic> --batchsize <BATCHSIZE> --nbsamples <N> --seqlen <SEQ_LEN> --aggregation <type|layer|None> --output_dir <RESULTS_DIR>
Example:
python main.py --model meta-llama/Llama-3.2-1B-Instruct --dataset math --batchsize 8 --nbsamples 100 --seqlen 256 --aggregation type --output_dir results/
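For readers who want to see how the flags in the command above fit together, here is a minimal `argparse` sketch. It is not taken from `main.py`: the function names `build_parser` and `parse_cli`, the defaults, and the required/optional split are all assumptions made for illustration, and the real script may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the project's actual code)."""
    parser = argparse.ArgumentParser(
        description="Compute alignment metrics for a transformer model."
    )
    parser.add_argument("--model", required=True,
                        help="HuggingFace model handle, e.g. google/gemma-2b")
    parser.add_argument("--dataset", required=True,
                        choices=["math", "code", "history", "logic"],
                        help="Dataset name")
    # Defaults below are assumed values, chosen to match the README example.
    parser.add_argument("--batchsize", type=int, default=8, help="Batch size")
    parser.add_argument("--nbsamples", type=int, default=100,
                        help="Number of samples to use from the dataset")
    parser.add_argument("--seqlen", type=int, default=256,
                        help="Sequence length for tokenization")
    parser.add_argument("--aggregation", default="None",
                        choices=["type", "layer", "None"],
                        help="How to aggregate results")
    parser.add_argument("--output_dir", default="results/",
                        help="Directory to save results")
    return parser


def parse_cli(argv=None) -> argparse.Namespace:
    """Parse argv (or sys.argv when argv is None) and normalize the aggregation flag."""
    args = build_parser().parse_args(argv)
    # The command line can only carry the string "None"; map it to Python's None.
    if args.aggregation == "None":
        args.aggregation = None
    return args
```

Calling `parse_cli()` with no arguments reads `sys.argv`, so the same function works both from the command line and from a test that passes an explicit list of strings.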
- `--model`: HuggingFace model handle (e.g., `google/gemma-2b`)
- `--dataset`: Dataset name (`math`, `code`, `history`, `logic`)
- `--batchsize`: Batch size (not used in this simple version, all samples are processed at once)
- `--nbsamples`: Number of samples to use from the dataset
- `--seqlen`: Sequence length for tokenization
- `--aggregation`: How to aggregate results (`type`, `layer`, or `None`)
- `--output_dir`: Directory to save results
- Raw and aggregated metrics are saved as JSON files in the specified output directory.
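
Since the results are plain JSON, they can be inspected with the standard library alone. The sketch below loads every `.json` file found in the output directory. The helper name `load_results` is hypothetical, and nothing is assumed about the file names or the structure of the metrics inside them.

```python
import json
from pathlib import Path


def load_results(output_dir):
    """Return {file stem: parsed JSON} for every .json file directly inside output_dir."""
    results = {}
    for path in sorted(Path(output_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            results[path.stem] = json.load(handle)
    return results
```

For example, `load_results("results/")` returns a dictionary keyed by file name without the `.json` extension, which is a convenient starting point for comparing runs.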