## v0.1.0 (2025-02-12)

### Feature

- feat: pypi packaging and auto-release with semantic release (0ff8888)

### Unknown

- Merge pull request #37 from chanind/pypi-package: feat: pypi packaging and auto-release with semantic release (a711efe)
- simplify matryoshka loss (43421f5)
- Use torch.split() instead of direct indexing for 25% speedup (505a445)
- Fix matryoshka spelling (aa45bf6)
- Fix incorrect auxk logging name (784a62a)
- Add citation (77f2690)
- Make sure to detach reconstruction before calculating aux loss (db2b564)
- Merge pull request #36 from saprmarks/aux_loss_fixes: Aux loss fixes, standardize decoder normalization (34eefda)
- Standardize and fix topk auxk loss implementation (0af1971)
- Normalize decoder after optimizer step (200ed3b)
- Remove experimental matryoshka temperature (6c2fcfc)
- Make sure x is on the correct dtype for jumprelu when logging (c697d0f)
- Import trainers from correct relative location for submodule use (8363ff7)
- By default, don't normalize Gated activations during inference (52b0c54)
- Also update context manager for matryoshka threshold (65e7af8)
- Disable autocast for threshold tracking (17aa5d5)
- Add torch autocast to training loop (832f4a3)
- Save state dicts to cpu (3c5a5cd)
- Add an option to pass LR to TopK trainers (8316a44)
- Add April Update Standard Trainer (cfb36ff)
- Merge pull request #35 from saprmarks/code_cleanup: Consolidate LR Schedulers, Sparsity Schedulers, and constrained optimizers (f19db98)
- Consolidate LR Schedulers, Sparsity Schedulers, and constrained optimizers (9751c57)
- Merge pull request #34 from adamkarvonen/matroyshka: Add Matryoshka, Fix Jump ReLU training, modify initialization (92648d4)
- Add a verbose option during training (0ff687b)
- Prevent wandb cuda multiprocessing errors (370272a)
- Log dead features for batch top k SAEs (936a69c)
- Log number of dead features to wandb (77da794)
- Add trainer number to wandb name (3b03b92)
- Add notes (810dbb8)
- Add option to ignore bos tokens (c2fe5b8)
- Fix jumprelu training (ec961ac)
- Use kaiming initialization if specified in paper, fix batch_top_k aux_k_alpha (8eaa8b2)
- Format with ruff (3e31571)
- Add temperature scaling to matryoshka (ceabbc5)
- norm the correct decoder dimension (5383603)
- Fix loading matryoshkas from_pretrained() (764d4ac)
- Initial matryoshka implementation (8ade55b)
- Make sure we step the learning rate scheduler (1df47d8)
- Merge pull request #33 from saprmarks/lr_scheduling: Lr scheduling (316dbbe)
- Properly set new parameters in end to end test (e00fd64)
- Standardize learning rate and sparsity schedules (a2d6c43)
- Merge pull request #32 from saprmarks/add_sparsity_warmup: Add sparsity warmup (a11670f)
- Add sparsity warmup for trainers with a sparsity penalty (911b958)
- Clean up lr decay (e0db40b)
- Track lr decay implementation (f0bb66d)
- Remove leftover variable, update expected results with standard SAE improvements (9687bb9)
- Merge pull request #31 from saprmarks/add_demo: Add option to normalize dataset, track thresholds for TopK SAEs, Fix Standard SAE (67a7857)
- Also scale topk thresholds when scaling biases (efd76b1)
- Use the correct standard SAE reconstruction loss, initialize W_dec to W_enc.T (8b95ec9)
- Add bias scaling to topk saes (484ca01)
- Fix topk bfloat16 dtype error (488a154)
- Add option to normalize dataset activations (81968f2)
- Remove demo script and graphing notebook (57f451b)
- Track thresholds for topk and batchtopk during training (b5821fd)
- Track threshold for batchtopk, rename for consistency (32d198f)
- Modularize demo script (dcc02f0)
- Begin creation of demo script (712eb98)
- Fix JumpReLU training and loading (552a8c2)
- Ensure activation buffer has the correct dtype (d416eab)
- Merge pull request #30 from adamkarvonen/add_tests: Add end to end test, upgrade nnsight to support 0.3.0, fix bugs (c4eed3c)
- Merge pull request #26 from mntss/batchtokp_aux_fix: Fix BatchTopKSAE training (2ec1890)
- Check for is_tuple to support mlp / attn submodules (d350415)
- Change save_steps to a list of ints (f1b9b80)
- Add early stopping in forward pass (05fe179)
- Obtain better test results using multiple batches (067bf7b)
- Fix frac_alive calculation, perform evaluation over multiple batches (dc30720)
- Complete nnsight 0.2 to 0.3 changes (807f6ef)
- Rename input to inputs per nnsight 0.3.0 (9ed4af2)
- Add a simple end to end test (fe54b00)
- Create LICENSE (32fec9c)
- Fix BatchTopKSAE training (4aea538)
- dtype for loading SAEs (932e10a)
- Merge pull request #22 from pleask/jumprelu: Implement jumprelu training (713f638)
- Use separate wandb runs for each SAE being trained (df60f52)
- Merge branch 'main' into jumprelu (3dfc069)
- implement jumprelu training (16bdfd9)
- handle no wandb (8164d32)
- Merge pull request #20 from pleask/batchtopk: Implement BatchTopK (b001fb0)
- separate runs for each sae being trained (7d3b127)
- add batchtopk (f08e00b)
- Move f_gate to encoder's dtype (43bdb3b)
- Ensure that x_hat is in correct dtype (3376f1b)
- Preallocate buffer memory to lower peak VRAM usage when replenishing buffer (90aff63)
- Perform logging outside of training loop to lower peak memory usage (57f8812)
- Remove triton usage (475fece)
- Revert to triton TopK implementation (d94697d)
- Add relative reconstruction bias from GDM Gated SAE paper to evaluate() (8984b01)
- Merge branch 'ElanaPearl-small_bug_fixes' into main (2d586e4)
- simplifying readme (9c46e06)
- simplify readme (5c96003)
- add missing imports (7f689d9)
- fix arg name in trainer_config (9577d26)
- update sae training example code (9374546)
- Merge branch 'main' of https://github.com/saprmarks/dictionary_learning into main (7d405f7)
- GatedSAE: moved feature re-normalization into encode (f628c0e)
- documenting JumpReLU SAE support (322b6c0)
- support for JumpReluAutoEncoders (57df4e7)
- Add submodule_name to PAnnealTrainer (ecdac03)
- host SAEs on huggingface (0ae37fe)
- fixed batch loading in examine_dimension (82485d7)
- Merge pull request #17 from saprmarks/collab: Merge Collab Branch (cdf8222)
- moved experimental trainers to collab-dev (8d1d581)
- Merge branch 'main' into collab (dda38b9)
- Update README.md (4d6c6a6)
- remove a sentence (2d40ed5)
- add a list of trainers to the README (746927a)
- add architecture details to README (60422a8)
- make wandb integration optional (a26c4e5)
- make wandb integration optional (0bdc871)
- Fix tutorial 404 (deb3df7)
- Add missing values to config (9e44ea9)
- changed TrainerTopK class name (c52ff00)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (c04ee3b)
- fixed loss_recovered to incorporate top_k (6be5635)
- fixed TopK loss (spotted by Anish) (a3b71f7)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (40bcdf6)
- naming conventions (5ff7fa1)
- small fix to triton kernel (5d21265)
- small updates for eval (585e820)
- added some housekeeping stuff to top_k (5559c2c)
- add support for Top-k SAEs (2d549d0)
- add transcoder eval (8446f4f)
- add transcoder support (c590a25)
- added wandb finish to trainer (113c042)
- fixed anneal end bug (fbd9ee4)
- added layer and lm_name (d173235)
- adding layer and lm_name to trainer config (6168ee0)
- make tracer_args optional (31b2828)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (87d2b58)
- bug fix evaluating CE loss with NNsight models (f8d81a1)
- Combining P Annealing and Anthropic Update (44318e9)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (43e9ca6)
- removing normalization (7a98d77)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (5f2b598)
- added buffer for NNsight models (not LanguageModel classes) as an extra class. We'll want to combine the three buffers we currently have at some point (f19d284)
- fixed nnsight model tracing issues for chess-gpt (7e8c9f9)
- added W_O projection to HeadBuffer (47bd4cd)
- added support for training SAEs on individual heads (a0e3119)
- added support for training SAEs on individual heads (47351b4)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (7de0bd3)
- default hyperparameter adjustments (a09346b)
- normalization in gated_new (104aba2)
- fixing bug where inputs can get overwritten (93fd46e)
- fixing tuple bug (b05dcaf)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (73b5663)
- multiple steps debugging (de3eef1)
- adding gradient pursuit function (72941f1)
- bugfix (53aabc0)
- bugfix (91691b5)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (9ce7d80)
- logging more things (8498a75)
- changing initialization for AutoEncoderNew (c7ee7ec)
- fixing gated SAE encoder scheme (4084bc3)
- changes to gatedSAE API (9e001d1)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (05b397b)
- changing initialization (ebe0d57)
- finished combining gated and p-annealing (4c08614)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (8e0a6f9)
- gated_anneal first steps (ba8b8fa)
- jump SAE (873b764)
- adapted loss logging in p_anneal (33997c0)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (1eecbda)
- merging gated and Anthropic SAEs (b6a24d0)
- revert trainer naming (c0af6d9)
- restored trainer naming (2ec3c67)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (fe7e93b)
- various changes (32027ae)
- debug panneal (463907d)
- debug panneal (8c00100)
- debug panneal (dc632cd)
- debug panneal (166f6a9)
- debug panneal (bcebaa6)
- debug pannealing (446c568)
- p_annealing loss buffer (e4d4a35)
- implement Ben's p-annealing strategy (06a27f0)
- panneal changes (fe4ff6f)
- logging trainer names to wandb (f9c5e45)
- bugfixes for StandardTrainerNew (70acd85)
- trainer for new anthropic infrastructure (531c285)
- adding r_mag parameter to GSAE (198ddf4)
- gatedSAE trainer (3567d6d)
- cosmetic change (0200976)
- GatedAutoEncoder class (2cfc47b)
- p annealing not affected by resampling (ad8d837)
- integrated trainer update (c7613d3)
- Merge branch 'collab' into p_annealing (933b80c)
- fixed p calculation (9837a6f)
- getting rid of useless seed argument (377c762)
- trainer initializes SAE (7dffb66)
- trainer initialized SAE (6e80590)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (c58d23d)
- changes to lista p_anneal trainers (3cc6642)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (9dfd3db)
- decoupled lr warmup and p warmup in p_anneal trainer (c3c1645)
- Merge pull request #14 from saprmarks/p_annealing: added annealing and trainer_param_callback (61927bc)
- cosmetic changes to interp (4a7966f)
- Merge branch 'collab' of https://github.com/saprmarks/dictionary_learning into collab (c76818e)
- Merge pull request #13 from jannik-brinkmann/collab: add ListaTrainer (d4d2fd9)
- additional evaluation metrics (fa2ec08)
- add GroupSAETrainer (60e6068)
- added annealing and trainer_param_callback (18e3fca)
- Merge remote-tracking branch 'upstream/collab' into collab (4650c2a)
- fixing neuron resampling (a346be9)
- improvements to saving and logging (4a1d7ae)
- can export buffer config (d19d8d9)
- fixing evaluation.py (c91a581)
- fixing bug in neuron resampling (67a03c7)
- add ListaTrainer (880f570)
- fixing neuron resampling in standard trainer (3406262)
- improvements to training and evaluating (b111d40)
- Factoring out SAETrainer class (fabd001)
- updating syntax for buffer (035a0f9)
- updating readme for from_pretrained (70e8c2a)
- from_pretrained (db96abc)
- Change syntax for specifying activation dimensions and batch sizes (bdf1f19)
- Merge branch 'main' of https://github.com/saprmarks/dictionary_learning into main (86c7475)
- activation_dim for IdentityDict is optional (be1b68c)
- update umap requirement (776b53e)
- Merge pull request #10 from adamkarvonen/shell_script_change: Add sae_set_name to local_path for dictionary downloader (33b5a6b)
- Add sae_set_name to local_path for dictionary downloader (d6163be)
- dispatch no longer needed when loading models (69c32ca)
- removed in_and_out option for activation buffer (cf6ad1d)
- updating readme with 10_32768 dictionaries (614883f)
- upgrade to nnsight 0.2 (cbc5f79)
- downloader script (7a305c5)
- fixing device issue in buffer (b1b44f1)
- added pretrained_dictionary_downloader.sh (0028ebe)
- added pretrained_dictionary_downloader.sh (8b63d8d)
- added pretrained_dictionary_downloader.sh (6771aff)
- efficiency improvements (94844d4)
- adding identity dict (76bd32f)
- debugging interp (2f75db3)
- Merge branch 'main' of https://github.com/saprmarks/dictionary_learning into main (86812f5)
- warns user when evaluating without enough data (246c472)
- cleaning up interp (95d7310)
- examine_dimension returns mbottom_tokens and logit stats (40137ff)
- continuing merge (db693a6)
- progress on merge (949b3a7)
- changes to buffer.py (792546b)
- fixing some things in buffer.py (f58688e)
- updating requirements (a54b496)
- updating requirements (a1db591)
- identity dictionary (5e1f35e)
- bug fix for neuron resampling (b281b53)
- UMAP visualizations (81f8e1f)
- better normalization for ghost_loss (fc74af7)
- neuron resampling without replacement (4565e9a)
- simplifications to interp functions (2318666)
- Second nnsight 0.2 pass through (3bcebed)
- Conversion to nnsight 0.2 first pass (cac410a)
- detaching another thing in ghost grads (2f212d6)
- Neuron resampling no longer errors when resampling zero neurons (376dd3b)
- NNsight v0.2 Updates (90bbc76)
- cosmetic improvements to buffer.py (b2bd5f0)
- fix to ghost grads (9531fe5)
- fixing table formatting (0e69c8c)
- Fixing some table formatting (75f927f)
- gpt2-small support (f82146c)
- fixing bug relevant to UnifiedTransformer support (9ec9ce4)
- Getting rid of histograms (31d09d7)
- Fixing tables in readme (5934011)
- Updates to the readme (a5ca51e)
- Fixing ghost grad bugs (633d583)
- Handling ghost grad case with no dead neurons (4f19425)
- adding support for buffer on other devices (f3cf296)
- support for ghost grads (25d2a62)
- add an implementation of ghost gradients (2e09210)
- fixing a bug with warmup, adding utils (47bbde1)
- remove HF arg from buffer. rename search_utils to interp (7276f17)
- typo fix (3f6b922)
- Merge branch 'main' of https://github.com/saprmarks/dictionary_learning into main (278084b)
- added utils for converting hf dataset to generator (82fff19)
- add ablated token effects; restore support for HF datasets (799e2ca)
- merge in function for examining features (986bf96)
- easier submodule/dictionary feature examination (2c8b985)
- Adding lr warmup after every time neurons are resampled (429c582)
- fixing issues with EmptyStream exception (39ff6e1)
- Minor changes due to updates in nnsight (49bbbac)
- Revert "restore support for streaming HF datasets"; this reverts commit b43527b (23ada98)
- restore support for streaming HF datasets (b43527b)
- first version of automatic feature labeling (c6753f6)
- Add feature_effect function to search_utils.py (0ada2c6)
- Merge branch 'main' of https://github.com/saprmarks/dictionary_learning into main (fab70b1)
- adding sqrt to MSE (63b2174)
- Merge pull request #1 from cadentj/main: Update README.md (fd79bb3)
- Update README.md (cf5ec24)
- Update README.md (55f33f2)
- evaluation.py (2edf59e)
- evaluating dictionaries (71e28fb)
- Removing experimental use of sqrt on MSELoss (865bbb5)
- Adding readme, evaluation, cleaning up (ddac948)
- some stuff for saving dicts (d1f0e21)
- removing device from buffer (398f15c)
- Merge branch 'main' of https://github.com/saprmarks/dictionary_learning into main (7f013c2)
- lr schedule + enabling stretched mlp (4eaf7e3)
- add random feature search (e58cc67)
- restore HF support and progress bar (7e2b6c6)
- Merge branch 'main' of https://github.com/saprmarks/dictionary_learning into main (d33ef05)
- more support for saving checkpoints (0ca258a)
- fix unit column bug + add scheduler (5a05c8c)
- fix merge bugs: checkpointing support (9c5bbd8)
- Merge: add HF datasets and checkpointing (ccf6ed1)
- checkpointing, progress bar, HF dataset support (fd8a3ee)
- progress bar for training autoencoders (0a8064d)
- implementing neuron resampling (f9b9d02)
- lotsa stuff (bc09ba4)
- adding init.py file for imports (3d9fd43)
- modifying buffer (ba9441b)
- first commit (ea89e90)
- Initial commit (741f4d6)