
Releases: saprmarks/dictionary_learning

v0.1.0 (2025-02-12)

Feature

  • feat: pypi packaging and auto-release with semantic release (0ff8888)

Unknown

  • Merge pull request #37 from chanind/pypi-package
    feat: pypi packaging and auto-release with semantic release (a711efe)

  • simplify matryoshka loss (43421f5)

  • Use torch.split() instead of direct indexing for 25% speedup (505a445)
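
    The speedup mentioned above comes from carving the feature and decoder tensors into matryoshka groups with a single torch.split() call instead of repeated index slicing. A minimal sketch of the pattern, with illustrative shapes and group sizes (not the trainer's actual variables):

    ```python
    import torch

    d_model, dict_size = 512, 4096
    group_sizes = [1024, 1024, 2048]               # hypothetical matryoshka group widths

    f = torch.randn(8, dict_size)                  # feature activations (batch, dict_size)
    W_dec = torch.randn(dict_size, d_model)

    # One torch.split() call yields views over the feature and decoder groups,
    # instead of slicing f[:, start:end] repeatedly inside the loop.
    f_groups = torch.split(f, group_sizes, dim=1)
    W_groups = torch.split(W_dec, group_sizes, dim=0)

    x_hat = torch.zeros(8, d_model)
    for f_g, W_g in zip(f_groups, W_groups):
        x_hat = x_hat + f_g @ W_g                  # running prefix reconstruction
    ```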

  • Fix matryoshka spelling (aa45bf6)

  • Fix incorrect auxk logging name (784a62a)

  • Add citation (77f2690)

  • Make sure to detach reconstruction before calculating aux loss (db2b564)
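
    The detach referenced above keeps the auxiliary reconstruction loss from sending gradients back through the main reconstruction. A hedged sketch of the general pattern, with stand-in tensors:

    ```python
    import torch

    x = torch.randn(8, 512)                               # input activations
    x_hat = torch.randn(8, 512, requires_grad=True)       # main reconstruction (stand-in)
    x_hat_aux = torch.randn(8, 512, requires_grad=True)   # reconstruction from auxiliary features

    # The residual is detached, so the aux term only trains the aux path
    # and adds no extra gradients to the main reconstruction pass.
    residual = (x - x_hat).detach()
    aux_loss = (x_hat_aux - residual).pow(2).mean()
    aux_loss.backward()
    ```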

  • Merge pull request #36 from saprmarks/aux_loss_fixes
    Aux loss fixes, standardize decoder normalization (34eefda)

  • Standardize and fix topk auxk loss implementation (0af1971)

  • Normalize decoder after optimizer step (200ed3b)
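
    Decoder normalization here means re-projecting each decoder direction to unit norm immediately after the optimizer step. A minimal sketch, assuming a decoder weight with one row per dictionary feature (the repo's module layout may differ):

    ```python
    import torch

    W_dec = torch.nn.Parameter(torch.randn(4096, 512))   # one row per dictionary feature
    optimizer = torch.optim.Adam([W_dec], lr=3e-4)

    loss = W_dec.pow(2).mean()                            # stand-in for the real loss
    loss.backward()
    optimizer.step()

    # Re-project each decoder direction back onto the unit sphere after the step.
    with torch.no_grad():
        W_dec /= W_dec.norm(dim=1, keepdim=True)
    ```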

  • Remove experimental matroyshka temperature (6c2fcfc)

  • Make sure x is on the correct dtype for jumprelu when logging (c697d0f)

  • Import trainers from correct relative location for submodule use (8363ff7)

  • By default, don't normalize Gated activations during inference (52b0c54)

  • Also update context manager for matroyshka threshold (65e7af8)

  • Disable autocast for threshold tracking (17aa5d5)

  • Add torch autocast to training loop (832f4a3)
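
    The autocast change wraps the forward pass and loss in torch.autocast for mixed-precision training. A short sketch of that recipe; the model, dtype, and optimizer here are illustrative:

    ```python
    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = torch.nn.Linear(512, 4096).to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
    x = torch.randn(8, 512, device=device)

    optimizer.zero_grad()
    # Forward pass and loss under autocast; backward and the optimizer step
    # run outside the context, following the standard autocast recipe.
    with torch.autocast(device_type=device, dtype=torch.bfloat16):
        loss = model(x).pow(2).mean()
    loss.backward()
    optimizer.step()
    ```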

  • Save state dicts to cpu (3c5a5cd)

  • Add an option to pass LR to TopK trainers (8316a44)

  • Add April Update Standard Trainer (cfb36ff)

  • Merge pull request #35 from saprmarks/code_cleanup
    Consolidate LR Schedulers, Sparsity Schedulers, and constrained optimizers (f19db98)

  • Consolidate LR Schedulers, Sparsity Schedulers, and constrained optimizers (9751c57)

  • Merge pull request #34 from adamkarvonen/matroyshka
    Add Matroyshka, Fix Jump ReLU training, modify initialization (92648d4)

  • Add a verbose option during training (0ff687b)

  • Prevent wandb cuda multiprocessing errors (370272a)

  • Log dead features for batch top k SAEs (936a69c)

  • Log number of dead features to wandb (77da794)

  • Add trainer number to wandb name (3b03b92)

  • Add notes (810dbb8)

  • Add option to ignore bos tokens (c2fe5b8)

  • Fix jumprelu training (ec961ac)

  • Use kaiming initialization if specified in paper, fix batch_top_k aux_k_alpha (8eaa8b2)
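
    For context, Kaiming-style encoder initialization looks roughly like the sketch below; the shapes and the tied decoder transpose are assumptions for illustration, not necessarily the trainer's exact code:

    ```python
    import torch

    d_model, dict_size = 512, 4096
    W_enc = torch.empty(d_model, dict_size)
    torch.nn.init.kaiming_uniform_(W_enc)

    # Tie the decoder to the encoder transpose at init (see the
    # "initialize W_dec to W_enc.T" entry further down this list),
    # then normalize each decoder row to unit norm.
    W_dec = W_enc.T.clone()
    W_dec /= W_dec.norm(dim=1, keepdim=True)
    ```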

  • Format with ruff (3e31571)

  • Add temperature scaling to matroyshka (ceabbc5)

  • norm the correct decoder dimension (5383603)

  • Fix loading matroyshkas from_pretrained() (764d4ac)

  • Initial matroyshka implementation (8ade55b)

  • Make sure we step the learning rate scheduler (1df47d8)

  • Merge pull request #33 from saprmarks/lr_scheduling
    Lr scheduling (316dbbe)

  • Properly set new parameters in end to end test (e00fd64)

  • Standardize learning rate and sparsity schedules (a2d6c43)
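
    A standardized learning-rate schedule of this kind is typically a linear warmup, a constant phase, and a decay to zero, implemented with LambdaLR. A hedged sketch with illustrative step counts:

    ```python
    import torch

    model = torch.nn.Linear(512, 4096)
    optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

    warmup_steps, decay_start, total_steps = 1_000, 24_000, 30_000

    def lr_lambda(step: int) -> float:
        if step < warmup_steps:                    # linear warmup from 0
            return step / warmup_steps
        if step >= decay_start:                    # linear decay to 0 at the end
            return max(0.0, (total_steps - step) / (total_steps - decay_start))
        return 1.0                                 # constant in between

    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

    # Inside the training loop: optimizer.step(), then scheduler.step() each iteration.
    ```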

  • Merge pull request #32 from saprmarks/add_sparsity_warmup
    Add sparsity warmup (a11670f)

  • Add sparsity warmup for trainers with a sparsity penalty (911b958)
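
    Sparsity warmup ramps the sparsity penalty from zero to its target value early in training so features are not pushed dead before reconstruction is reasonable. A minimal sketch, assuming a linear ramp with illustrative names:

    ```python
    def sparsity_scale(step: int, warmup_steps: int = 1_000) -> float:
        """Linearly ramp the sparsity penalty multiplier from 0 to 1."""
        if warmup_steps <= 0:
            return 1.0
        return min(1.0, step / warmup_steps)

    # loss = recon_loss + sparsity_scale(step) * l1_coeff * f.abs().sum(dim=-1).mean()
    ```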

  • Clean up lr decay (e0db40b)

  • Track lr decay implementation (f0bb66d)

  • Remove leftover variable, update expected results with standard SAE improvements (9687bb9)

  • Merge pull request #31 from saprmarks/add_demo
    Add option to normalize dataset, track thresholds for TopK SAEs, Fix Standard SAE (67a7857)

  • Also scale topk thresholds when scaling biases (efd76b1)

  • Use the correct standard SAE reconstruction loss, initialize W_dec to W_enc.T (8b95ec9)

  • Add bias scaling to topk saes (484ca01)

  • Fix topk bfloat16 dtype error (488a154)

  • Add option to normalize dataset activations (81968f2)
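
    Dataset normalization of this kind usually rescales activations so their mean norm equals a fixed constant (commonly sqrt(d_model)), which makes sparsity coefficients more comparable across models. A hedged sketch of one common recipe; the constant and the estimation batch are assumptions:

    ```python
    import torch

    def estimate_norm_scale(acts: torch.Tensor) -> float:
        """Scale factor so the mean activation norm becomes sqrt(d_model)."""
        d_model = acts.shape[-1]
        mean_norm = acts.norm(dim=-1).mean().item()
        return (d_model ** 0.5) / mean_norm

    acts = torch.randn(1024, 512) * 7.3     # stand-in for a batch of buffered activations
    scale = estimate_norm_scale(acts)
    normalized = acts * scale               # activations the SAE actually trains on
    ```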

  • Remove demo script and graphing notebook (57f451b)

  • Track thresholds for topk and batchtopk during training (b5821fd)
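
    Threshold tracking keeps a running estimate of the smallest activation that survives the top-k selection, so the SAE can be run at inference with a plain threshold instead of an explicit top-k. A minimal sketch using an exponential moving average; the decay constant is an assumption:

    ```python
    import torch

    def update_threshold(threshold: torch.Tensor, f: torch.Tensor, k: int,
                         decay: float = 0.999) -> torch.Tensor:
        """EMA-track the smallest activation value kept by the top-k selection."""
        min_kept = f.topk(k, dim=-1).values[..., -1].mean()
        return decay * threshold + (1.0 - decay) * min_kept

    f = torch.relu(torch.randn(8, 4096))    # stand-in feature activations
    threshold = torch.tensor(0.0)
    threshold = update_threshold(threshold, f, k=64)
    ```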

  • Track threshold for batchtopk, rename for consistency (32d198f)

  • Modularize demo script (dcc02f0)

  • Begin creation of demo script (712eb98)

  • Fix JumpReLU training and loading (552a8c2)

  • Ensure activation buffer has the correct dtype (d416eab)

  • Merge pull request #30 from adamkarvonen/add_tests
    Add end to end test...
