What's the proper way to run a downloaded model with GPU? #990
Unanswered
dany-nonstop asked this question in Q&A
Replies: 0 comments
I set the

```
GALLERIES=[{"name":"model-gallery","url":"github:go-skynet/model-gallery/index.yaml"},{"url":"github:go-skynet/model-gallery/huggingface.yaml","name":"huggingface"}]
```

environment parameter to enable model galleries, then posted `{"id": "thebloke__wizard-vicuna-13b-ggml__wizard-vicuna-13b.ggmlv3.q4_0.bin"}` to the API's `/models/apply` endpoint to download a model locally. The download was successful and produced these files:

- `thebloke__wizard-vicuna-13b-ggml__wizard-vicuna-13b.ggmlv3.q4_0.bin.yaml`
- `vicuna-chat.tmpl`
- `vicuna-completion.tmpl`
- `wizard-vicuna-13B.ggmlv3.q4_0.bin`
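For reference, the download step above can be sketched as a curl call (the host and port are assumptions based on LocalAI's defaults; adjust to your deployment):

```shell
# Sketch: ask LocalAI to download/apply a model from the configured galleries.
# Assumes the API is listening on localhost:8080 (LocalAI's default).
curl -s http://localhost:8080/models/apply \
  -H "Content-Type: application/json" \
  -d '{"id": "thebloke__wizard-vicuna-13b-ggml__wizard-vicuna-13b.ggmlv3.q4_0.bin"}'
```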
When calling `/v1/completion`, I can only refer to the model by the file name `wizard-vicuna-13B.ggmlv3.q4_0.bin`, not by the model name from the gallery, `thebloke__wizard-vicuna-13b-ggml__wizard-vicuna-13b.ggmlv3.q4_0.bin`.
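A sketch of the completion request being described (the endpoint path is taken from the question as written; host, port, and prompt are assumptions):

```shell
# Sketch: call the completion endpoint, addressing the model by its file name,
# which is the only name that currently works per the question above.
curl -s http://localhost:8080/v1/completion \
  -H "Content-Type: application/json" \
  -d '{
        "model": "wizard-vicuna-13B.ggmlv3.q4_0.bin",
        "prompt": "Hello"
      }'
```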
I monitored CPU load with `top` and GPU load with `nvidia-smi` inside the docker container. I edited `thebloke__wizard-vicuna-13b-ggml__wizard-vicuna-13b.ggmlv3.q4_0.bin.yaml` and added the line `gpu_layers: 1000`, but the file seems to be ignored when I use the actual model file name `wizard-vicuna-13B.ggmlv3.q4_0.bin`.
My questions: what is the proper way to use a model's definition YAML file, and how can I enable GPU for inference?
Many thanks in advance!
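For context, the model definition YAML being edited above generally looks like the sketch below; the friendly `name` value is hypothetical, and exact field support may vary by LocalAI version:

```yaml
# Hypothetical model definition sketch (e.g. saved in the models directory).
name: wizard-vicuna                          # name to pass as "model" in API calls (assumed)
parameters:
  model: wizard-vicuna-13B.ggmlv3.q4_0.bin   # the downloaded weights file
gpu_layers: 1000                             # offload layers to the GPU, as in the question
```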