Sharded weights type error #2296
Conversation
@laxmareddyp could we add a test for this to prevent future breakage? Here is an example of testing sharded weights:
```python
init_kwargs = {
    "vocabulary_size": 1024,
    "num_layers": 12,
    "num_query_heads": 8,
    "num_key_value_heads": 4,
    "hidden_dim": 32,
    "intermediate_dim": 64,
    "head_dim": 4,
    "sliding_window_size": 5,
    "attention_logit_soft_cap": 50,
    "final_logit_soft_cap": 30,
    "layer_norm_epsilon": 1e-6,
    "query_head_dim_normalize": False,
    "use_post_ffw_norm": True,
    "use_post_attention_norm": True,
    "use_sliding_window_attention": True,
}
backbone = GemmaBackbone(**init_kwargs)  # ~422KB
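```

For completeness, a sketch of how a round-trip test might use that backbone; the `max_shard_size` argument (in GB) and the exact preset save/load API are assumptions here, not confirmed from the PR:

```python
import os

# Save with a tiny max shard size so the ~422KB checkpoint is forced
# to split across multiple shard files (max_shard_size assumed in GB).
preset_dir = os.path.join(self.get_temp_dir(), "sharded_preset")
backbone.save_to_preset(preset_dir, max_shard_size=0.0002)

# Reloading exercises _get_sharded_filenames on the sharded weight_map.
restored = GemmaBackbone.from_preset(preset_dir)
for original, loaded in zip(backbone.weights, restored.weights):
    self.assertAllClose(original, loaded)
```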
Can you move this to `setUp` and use it for all three test cases we have here, so that our test setup is a lot cleaner? A sketch of that refactor is below.
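A minimal sketch of the suggested refactor, assuming keras-hub's internal `TestCase` base class and import paths (both assumptions here):

```python
from keras_hub.src.models.gemma.gemma_backbone import GemmaBackbone
from keras_hub.src.tests.test_case import TestCase


class ShardedWeightsTest(TestCase):
    def setUp(self):
        super().setUp()
        # The tiny Gemma config from the comment above, built once and
        # shared by all three sharded-weights test cases.
        self.init_kwargs = {
            "vocabulary_size": 1024,
            "num_layers": 12,
            "num_query_heads": 8,
            "num_key_value_heads": 4,
            "hidden_dim": 32,
            "intermediate_dim": 64,
            "head_dim": 4,
            "sliding_window_size": 5,
            "attention_logit_soft_cap": 50,
            "final_logit_soft_cap": 30,
            "layer_norm_epsilon": 1e-6,
            "query_head_dim_normalize": False,
            "use_post_ffw_norm": True,
            "use_post_attention_norm": True,
            "use_sliding_window_attention": True,
        }
        self.backbone = GemmaBackbone(**self.init_kwargs)
```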
Updated all three relevant test cases to use the shared setup.
LGTM
Description of the change
Fix: Handle lists in `weight_map` for sharded weights

The `_get_sharded_filenames` method in `preset_utils.py` was raising a `TypeError` when `weight_map.values()` contained lists. This occurred because lists are unhashable and cannot be added directly to a set.
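A minimal sketch of the kind of fix described, assuming `weight_map` maps each variable path to either a single shard filename or a list of filenames (the actual signature in `preset_utils.py` may differ):

```python
def _get_sharded_filenames(weight_map):
    """Collect the unique shard filenames referenced by a weight map."""
    filenames = set()
    for value in weight_map.values():
        # A value may be a single filename or a list of filenames;
        # lists are unhashable, so flatten them into the set instead
        # of adding them directly.
        if isinstance(value, list):
            filenames.update(value)
        else:
            filenames.add(value)
    return sorted(filenames)
```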
Reference
Colab Notebook
Checklist