Description
I encountered a runtime error while using the transformers-interpret library with a fine-tuned Llama-2 model that uses LoRA adapters for sequence classification. The error occurs when invoking the SequenceClassificationExplainer and appears to be caused by a tensor size mismatch when the rotary positional embeddings are applied.
Traceback (most recent call last):
File "/home/input_attr_proj/src/input_attr.py", line 32, in <module>
word_attributions = cls_explainer("Hello")
File "/home/input_attr_env/lib/python3.10/site-packages/transformers_interpret/explainers/text/sequence_classification.py", line 316, in __call__
return self._run(text, index, class_name, embedding_type=embedding_type)
File "/home/input_attr_env/lib/python3.10/site-packages/transformers_interpret/explainers/text/sequence_classification.py", line 270, in _run
self._calculate_attributions(embeddings=embeddings, index=index, class_name=class_name)
File "/home/input_attr_env/lib/python3.10/site-packages/transformers_interpret/explainers/text/sequence_classification.py", line 226, in _calculate_attributions
lig = LIGAttributions(
File "/home/input_attr_env/lib/python3.10/site-packages/transformers_interpret/attributions.py", line 51, in __init__
self._attributions, self.delta = self.lig.attribute(
File "/home/input_attr_env/lib/python3.10/site-packages/captum/log/__init__.py", line 42, in wrapper
return func(*args, **kwargs)
File "/home/input_attr_env/lib/python3.10/site-packages/captum/attr/_core/layer/layer_integrated_gradients.py", line 390, in attribute
baselines_layer = _forward_layer_eval(
File "/home/input_attr_env/lib/python3.10/site-packages/captum/_utils/gradient.py", line 182, in _forward_layer_eval
return _forward_layer_eval_with_neuron_grads(
File "/home/input_attr_env/lib/python3.10/site-packages/captum/_utils/gradient.py", line 445, in _forward_layer_eval_with_neuron_grads
saved_layer = _forward_layer_distributed_eval(
File "/home/input_attr_env/lib/python3.10/site-packages/captum/_utils/gradient.py", line 294, in _forward_layer_distributed_eval
output = _run_forward(
File "/home/input_attr_env/lib/python3.10/site-packages/captum/_utils/common.py", line 531, in _run_forward
output = forward_func(
File "/home/input_attr_env/lib/python3.10/site-packages/transformers_interpret/explainers/text/sequence_classification.py", line 181, in _forward
preds = self._get_preds(input_ids, token_type_ids, position_ids, attention_mask)
File "/home/input_attr_env/lib/python3.10/site-packages/transformers_interpret/explainer.py", line 197, in _get_preds
preds = self.model(
File "/home/input_attr_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/input_attr_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/home/input_attr_env/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/input_attr_env/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1352, in forward
transformer_outputs = self.model(
File "/home/input_attr_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/input_attr_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/home/input_attrlib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 968, in forward
layer_outputs = decoder_layer(
File "/home/input_attr_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/input_attr_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/home/input_attr_env/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 713, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "/home/input_attr_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/input_attr_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/home/input_attr_env/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 624, in forward
query_states, key_states = apply_rotary_pos_emb(query_states, key_states, cos, sin)
File "/home/input_attr_env/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 182, in apply_rotary_pos_emb
q_embed = (q * cos) + (rotate_half(q) * sin)
RuntimeError: The size of tensor a (3) must match the size of tensor b (2) at non-singleton dimension 2
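For context, the failing line multiplies the query tensor by the rotary cos/sin tensors, which must broadcast along the sequence dimension (dim 2 of q). The snippet below is a minimal sketch, not the library code: rotate_half mirrors the helper referenced in the traceback, and the shapes are hypothetical, chosen only to reproduce the same broadcasting failure.

import torch

def rotate_half(x):
    # Same transform as in modeling_llama.py: rotate half of the hidden dims.
    x1 = x[..., : x.shape[-1] // 2]
    x2 = x[..., x.shape[-1] // 2 :]
    return torch.cat((-x2, x1), dim=-1)

# Hypothetical shapes: q is (batch, n_heads, seq_len, head_dim); cos/sin carry
# a different sequence length, so the element-wise product cannot broadcast.
q = torch.randn(1, 32, 3, 128)
cos = torch.randn(1, 1, 2, 128)
sin = torch.randn(1, 1, 2, 128)

q_embed = (q * cos) + (rotate_half(q) * sin)
# RuntimeError: The size of tensor a (3) must match the size of tensor b (2)
# at non-singleton dimension 2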
Code sample:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from transformers_interpret import SequenceClassificationExplainer

id2label = {0: "No", 1: "Yes"}
label2id = {"No": 0, "Yes": 1}

# model_name was left undefined in the original snippet; placeholder base-model id below.
model_name = "meta-llama/Llama-2-7b-hf"
model = AutoModelForSequenceClassification.from_pretrained("outputs/2024-04-21/04-27-20/outputs/checkpoint-2564/", device_map="auto", num_labels=2, id2label=id2label, label2id=label2id)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token_id = tokenizer.eos_token_id
tokenizer.pad_token = tokenizer.eos_token

cls_explainer = SequenceClassificationExplainer(model, tokenizer)
word_attributions = cls_explainer("Hello")
print(word_attributions)
Additional Context:
The error occurs inside the apply_rotary_pos_emb function and points to a tensor size mismatch, which might be caused by how the LoRA adapters are integrated with the Llama-2 model. Any help resolving this issue, or guidance on compatibility between transformers-interpret and Llama-2 with LoRA adapters, would be greatly appreciated.
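As a debugging aid (not part of the original run), the hypothetical hook below prints the hidden-state shapes that reach each attention block, which should make it easier to see where the sequence length diverges from what the rotary embeddings expect. It assumes model and cls_explainer from the code sample above and uses PyTorch's forward pre-hook API (torch >= 2.0 for with_kwargs).

def shape_hook(module, args, kwargs):
    # Print the shape of the hidden states entering each attention module.
    hidden_states = kwargs.get("hidden_states", args[0] if args else None)
    if hidden_states is not None:
        print(module.__class__.__name__, tuple(hidden_states.shape))

# Register on every self-attention module of the already loaded model.
for name, module in model.named_modules():
    if name.endswith("self_attn"):
        module.register_forward_pre_hook(shape_hook, with_kwargs=True)

word_attributions = cls_explainer("Hello")  # re-run to see the printed shapes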