Transformer Attention Pattern Analyzer

A Streamlit app to inspect real attention tensors from Hugging Face transformer models (the GPT-2 family and GPT-Neo 125M).

Quick start

  1. Create a virtual env (recommended) and install deps:
python -m venv .venv && source .venv/bin/activate  # on Windows: .venv\Scripts\activate
pip install -r requirements.txt

⚠️ PyTorch install note: If pip install torch fails or is slow, follow the official instructions for your OS/CPU/GPU: https://pytorch.org/get-started/locally/

  2. Run the app (replace app.py with your downloaded filename if different):
streamlit run app.py
  3. In the sidebar, pick a model (e.g., distilgpt2), enter text, and click Analyze.

What’s in this build

  • Real attention heatmaps (per layer/head; see the extraction sketch after this list)
  • Stable spectral features via SVD (no bogus eigen stats; see the spectral sketch below)
  • Tokenizer‑aware token typing for semantic flow
  • Export of summary JSON
  • Heuristic text type classification (documented as a heuristic)
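
For orientation, here is a minimal sketch of how per-layer/per-head attention tensors come out of a Hugging Face model. The model name comes from the README; the input text is illustrative, and the app's actual code may differ:

# Minimal sketch: pull attention tensors from a GPT-2-family model.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModel.from_pretrained("distilgpt2")
model.eval()

text = "The quick brown fox jumps over the lazy dog."  # illustrative input
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

# outputs.attentions is a tuple with one tensor per layer,
# each shaped (batch, num_heads, seq_len, seq_len).
layer, head = 0, 0
attn = outputs.attentions[layer][0, head]  # (seq_len, seq_len); rows sum to 1
print(attn.shape)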

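And a hedged sketch of the kind of SVD-based summary the spectral bullet refers to; spectral_summary is a hypothetical helper written for illustration, not the app's actual function:

import torch

def spectral_summary(attn: torch.Tensor) -> dict:
    """Stable spectral stats for one (seq_len, seq_len) attention matrix."""
    s = torch.linalg.svdvals(attn)               # singular values, descending
    p = s / s.sum()                              # normalize to a distribution
    entropy = -(p * torch.log(p + 1e-12)).sum()  # spectral entropy
    return {
        "top_singular_value": s[0].item(),
        "effective_rank": torch.exp(entropy).item(),
    }

Singular values are the reason this stays stable: attention matrices are not symmetric, so their eigenvalues can be complex, while singular values are always real and non-negative.
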
Tips

  • For longer texts, you can raise Max sequence length to 512, but be mindful of memory: stored attention grows quadratically with sequence length (layers × heads × seq_len² values).
  • CPU works; a GPU will speed up model loading and forward passes.
  • If you see a warning about device_map='cpu', make sure accelerate is installed (it is included in requirements.txt).
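
For reference, a sketch of the loading call that warning relates to; device_map is the standard transformers argument and needs accelerate installed (assumption: the app loads models roughly this way):

from transformers import AutoModel

# device_map requires the accelerate package; "cpu" pins all weights to CPU.
model = AutoModel.from_pretrained("distilgpt2", device_map="cpu")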

Known limits

  • “Semantic flows” require multiple token categories in the input; otherwise the tab will be empty.
  • The classifier is a simple heuristic based on attention statistics; it’s not a scientific labeler.

Made for quick research & demos. PRs welcome.
