Releases: saprmarks/dictionary_learning
v0.1.0 (2025-02-12)
Feature
- feat: pypi packaging and auto-release with semantic release (0ff8888)
Unknown
- Merge pull request #37 from chanind/pypi-package
  feat: pypi packaging and auto-release with semantic release (a711efe)
- simplify matryoshka loss (43421f5)
- Use torch.split() instead of direct indexing for 25% speedup (505a445)
- Fix matryoshka spelling (aa45bf6)
- Fix incorrect auxk logging name (784a62a)
- Add citation (77f2690)
- Make sure to detach reconstruction before calculating aux loss (db2b564)
- Merge pull request #36 from saprmarks/aux_loss_fixes
  Aux loss fixes, standardize decoder normalization (34eefda)
- Standardize and fix topk auxk loss implementation (0af1971)
- Normalize decoder after optimizer step (200ed3b)
- Remove experimental matroyshka temperature (6c2fcfc)
- Make sure x is on the correct dtype for jumprelu when logging (c697d0f)
- Import trainers from correct relative location for submodule use (8363ff7)
- By default, don't normalize Gated activations during inference (52b0c54)
- Also update context manager for matroyshka threshold (65e7af8)
- Disable autocast for threshold tracking (17aa5d5)
- Add torch autocast to training loop (832f4a3)
- Save state dicts to cpu (3c5a5cd)
- Add an option to pass LR to TopK trainers (8316a44)
- Add April Update Standard Trainer (cfb36ff)
- Merge pull request #35 from saprmarks/code_cleanup
  Consolidate LR Schedulers, Sparsity Schedulers, and constrained optimizers (f19db98)
- Consolidate LR Schedulers, Sparsity Schedulers, and constrained optimizers (9751c57)
- Merge pull request #34 from adamkarvonen/matroyshka
  Add Matroyshka, Fix Jump ReLU training, modify initialization (92648d4)
- Add a verbose option during training (0ff687b)
- Prevent wandb cuda multiprocessing errors (370272a)
- Log dead features for batch top k SAEs (936a69c)
- Log number of dead features to wandb (77da794)
- Add trainer number to wandb name (3b03b92)
- Add notes (810dbb8)
- Add option to ignore bos tokens (c2fe5b8)
- Fix jumprelu training (ec961ac)
- Use kaiming initialization if specified in paper, fix batch_top_k aux_k_alpha (8eaa8b2)
- Format with ruff (3e31571)
- Add temperature scaling to matroyshka (ceabbc5)
- norm the correct decoder dimension (5383603)
- Fix loading matroyshkas from_pretrained() (764d4ac)
- Initial matroyshka implementation (8ade55b)
- Make sure we step the learning rate scheduler (1df47d8)
- Merge pull request #33 from saprmarks/lr_scheduling
  Lr scheduling (316dbbe)
- Properly set new parameters in end to end test (e00fd64)
- Standardize learning rate and sparsity schedules (a2d6c43)
- Merge pull request #32 from saprmarks/add_sparsity_warmup
  Add sparsity warmup (a11670f)
- Add sparsity warmup for trainers with a sparsity penalty (911b958)
- Clean up lr decay (e0db40b)
- Track lr decay implementation (f0bb66d)
- Remove leftover variable, update expected results with standard SAE improvements (9687bb9)
- Merge pull request #31 from saprmarks/add_demo
  Add option to normalize dataset, track thresholds for TopK SAEs, Fix Standard SAE (67a7857)
- Also scale topk thresholds when scaling biases (efd76b1)
- Use the correct standard SAE reconstruction loss, initialize W_dec to W_enc.T (8b95ec9)
- Add bias scaling to topk saes (484ca01)
- Fix topk bfloat16 dtype error (488a154)
- Add option to normalize dataset activations (81968f2)
- Remove demo script and graphing notebook (57f451b)
- Track thresholds for topk and batchtopk during training (b5821fd)
- Track threshold for batchtopk, rename for consistency (32d198f)
- Modularize demo script (dcc02f0)
- Begin creation of demo script (712eb98)
- Fix JumpReLU training and loading (552a8c2)
- Ensure activation buffer has the correct dtype (d416eab)
- Merge pull request #30 from adamkarvonen/add_tests
  Add end to end test...