vllm-users-t-nitinkedia-sarathi-v2 sarathi implementation error #3422
bbietzsche announced in General
Running the following on the vllm-users-t-nitinkedia-sarathi-v2 branch fails with a CUDA device-side assert:

from vllm import LLM, SamplingParams

# Placeholder prompts; the original prompt list is not shown here.
example_prompts = ["Hello, my name is"]

model_name = 'mistralai/Mistral-7B-v0.1'
model = LLM(model=model_name, enforce_eager=True)
params = SamplingParams(max_tokens=256)
response = model.generate(example_prompts, params)
/content/vllm-users-t-nitinkedia-sarathi-v2/vllm/model_executor/layers/sampler.py in _get_logits(self, hidden_states, embedding, embedding_bias)
40 # Get the logits for the next tokens.
41 print(hidden_states.shape, embedding.t().shape)
---> 42 print(hidden_states.detach().cpu().max(), hidden_states.detach().cpu().min())
43 print(embedding.t().detach().cpu().max(), embedding.t().detach().cpu().min())
44 logits = torch.matmul(hidden_states, embedding.t())
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Is there any usage example for the vllm-users-t-nitinkedia-sarathi-v2 branch?
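For reference, a minimal sketch of rerunning the same repro with CUDA_LAUNCH_BLOCKING=1, as the error message suggests, so the stack trace points at the kernel that actually failed. The prompt list is again a placeholder, not the original prompts:

import os

# CUDA_LAUNCH_BLOCKING must be set before the CUDA context is created,
# i.e. before vLLM (and torch) initialise the GPU; alternatively, set
# CUDA_LAUNCH_BLOCKING=1 in the shell when launching the script.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

from vllm import LLM, SamplingParams

example_prompts = ["Hello, my name is"]  # placeholder prompts

model = LLM(model='mistralai/Mistral-7B-v0.1', enforce_eager=True)
params = SamplingParams(max_tokens=256)
response = model.generate(example_prompts, params)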