Learning In-context n-grams with Transformers: Sub-n-grams Are Near-stationary Points

This code is adapted from the implementation by Nichani et al. (2024), available at https://github.com/eshnich/transformers-learn-causal-structure.

figures
.DS_Store
.gitattributes
.gitignore
README.md
models.py
plots.py
problems.py
requirements.txt
train.py
util.py
