I am trying to convert the DialoGPT model to TorchScript so I can load it into Triton. I tried to use JIT to trace the model as mentioned here, but I am hitting the following warning:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-large")
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/DialoGPT-large", torchscript=True)

step = 0
new_user_input_ids = tokenizer.encode(
    "This is a test!" + tokenizer.eos_token, return_tensors='pt')
# append the new user input to the chat history (nothing to append on the first step)
bot_input_ids = torch.cat(
    [chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids

traced_model = torch.jit.trace(model, bot_input_ids)
torch.jit.save(traced_model, "DialogGPT.pt")
```
```
/opt/conda/lib/python3.8/site-packages/transformers/models/gpt2/modeling_gpt2.py:196: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  attn_weights = attn_weights / (float(value.size(-1)) ** 0.5)
```
How can I trace the `generate` method as well?

```python
model.generate(bot_input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id)
```
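The only workaround I can think of so far is to trace just the forward pass and re-implement greedy decoding around it (a rough sketch, assuming batch size 1 and that the traced model returns the logits as the first element of its output tuple), but I would much prefer to export `generate` itself:

```python
import torch

def greedy_generate(traced_model, input_ids, eos_id, max_length=1000):
    # Greedy decoding loop around the traced forward pass, since torch.jit.trace
    # cannot capture the Python control flow inside model.generate.
    generated = input_ids
    for _ in range(max_length - input_ids.shape[-1]):
        outputs = traced_model(generated)            # traced forward pass
        logits = outputs[0]                          # (batch, seq_len, vocab_size)
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        generated = torch.cat([generated, next_token], dim=-1)
        if next_token.item() == eos_id:              # stop at the EOS token
            break
    return generated

# e.g. greedy_generate(traced_model, bot_input_ids, tokenizer.eos_token_id)
```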
I also tried to convert the model to ONNX format using this instruction, but the exported model takes three inputs: input_ids, attention_mask and token_type_ids. I am able to get input_ids and attention_mask from the tokenizer. How can I get the token_type_ids?

```python
inputs = tokenizer("How are you doing?", return_tensors="np")
```
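Would it be enough to pass an all-zeros array of the same shape as input_ids, given that GPT-2 style models have no segment embeddings? Something like this (my assumption, not taken from the docs):

```python
import numpy as np

# Assumption: token_type_ids can be all zeros for a GPT-2 style model,
# matching the shape of the input_ids produced by the tokenizer.
token_type_ids = np.zeros_like(inputs["input_ids"])

onnx_inputs = {
    "input_ids": inputs["input_ids"],
    "attention_mask": inputs["attention_mask"],
    "token_type_ids": token_type_ids,
}
```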