Releases: saprmarks/dictionary_learning
v0.1.0 (2025-02-12)
Feature
- feat: pypi packaging and auto-release with semantic release (0ff8888)
Unknown
- Merge pull request #37 from chanind/pypi-package
  feat: pypi packaging and auto-release with semantic release (a711efe)
- simplify matryoshka loss (43421f5)
- Use torch.split() instead of direct indexing for 25% speedup (505a445)
- Fix matryoshka spelling (aa45bf6)
- Fix incorrect auxk logging name (784a62a)
- Add citation (77f2690)
- Make sure to detach reconstruction before calculating aux loss (db2b564)
- Merge pull request #36 from saprmarks/aux_loss_fixes
  Aux loss fixes, standardize decoder normalization (34eefda)
- Standardize and fix topk auxk loss implementation (0af1971)
- Normalize decoder after optimizer step (200ed3b)
- Remove experimental matroyshka temperature (6c2fcfc)
- Make sure x is on the correct dtype for jumprelu when logging (c697d0f)
- Import trainers from correct relative location for submodule use (8363ff7)
- By default, don't normalize Gated activations during inference (52b0c54)
- Also update context manager for matroyshka threshold (65e7af8)
- Disable autocast for threshold tracking (17aa5d5)
- Add torch autocast to training loop (832f4a3)
- Save state dicts to cpu (3c5a5cd)
- Add an option to pass LR to TopK trainers (8316a44)
- Add April Update Standard Trainer (cfb36ff)
- Merge pull request #35 from saprmarks/code_cleanup
  Consolidate LR Schedulers, Sparsity Schedulers, and constrained optimizers (f19db98)
- Consolidate LR Schedulers, Sparsity Schedulers, and constrained optimizers (9751c57)
- Merge pull request #34 from adamkarvonen/matroyshka
  Add Matroyshka, Fix Jump ReLU training, modify initialization (92648d4)
- Add a verbose option during training (0ff687b)
- Prevent wandb cuda multiprocessing errors (370272a)
- Log dead features for batch top k SAEs (936a69c)
- Log number of dead features to wandb (77da794)
- Add trainer number to wandb name (3b03b92)
- Add notes (810dbb8)
- Add option to ignore bos tokens (c2fe5b8)
- Fix jumprelu training (ec961ac)
- Use kaiming initialization if specified in paper, fix batch_top_k aux_k_alpha (8eaa8b2)
- Format with ruff (3e31571)
- Add temperature scaling to matroyshka (ceabbc5)
- norm the correct decoder dimension (5383603)
- Fix loading matroyshkas from_pretrained() (764d4ac)
- Initial matroyshka implementation (8ade55b)
- Make sure we step the learning rate scheduler (1df47d8)
- Merge pull request #33 from saprmarks/lr_scheduling
  Lr scheduling (316dbbe)
- Properly set new parameters in end to end test (e00fd64)
- Standardize learning rate and sparsity schedules (a2d6c43)
- Merge pull request #32 from saprmarks/add_sparsity_warmup
  Add sparsity warmup (a11670f)
- Add sparsity warmup for trainers with a sparsity penalty (911b958)
- Clean up lr decay (e0db40b)
- Track lr decay implementation (f0bb66d)
- Remove leftover variable, update expected results with standard SAE improvements (9687bb9)
- Merge pull request #31 from saprmarks/add_demo
  Add option to normalize dataset, track thresholds for TopK SAEs, Fix Standard SAE (67a7857)
- Also scale topk thresholds when scaling biases (efd76b1)
- Use the correct standard SAE reconstruction loss, initialize W_dec to W_enc.T (8b95ec9)
- Add bias scaling to topk saes (484ca01)
- Fix topk bfloat16 dtype error (488a154)
- Add option to normalize dataset activations (81968f2)
- Remove demo script and graphing notebook (57f451b)
- Track thresholds for topk and batchtopk during training (b5821fd)
- Track threshold for batchtopk, rename for consistency (32d198f)
- Modularize demo script (dcc02f0)
- Begin creation of demo script (712eb98)
- Fix JumpReLU training and loading (552a8c2)
- Ensure activation buffer has the correct dtype (d416eab)
- Merge pull request #30 from adamkarvonen/add_tests
  Add end to end test...