-
Notifications
You must be signed in to change notification settings - Fork 723
Description
Hi, When I run nexa pull NexaAI/Qwen3-VL-4B-Instruct-GGUF, it stucks during downloading no matter what quantization I choose. When I download from the https://huggingface.co/NexaAI/Qwen3-VL-4B-Instruct-GGUF/tree/main (these files: [Qwen3-VL-4B-Instruct.Q8_0.gguf], [mmproj.Q8_0.gguf],[tokenizer.json], [nexa.manifest]) and then place them in the C:\Users\PC_NAME.cache\nexa.ai\nexa_sdk\models\NexaAI\Qwen3-VL-4B-Instruct-GGUF and run nexa serve and in another terminal I run:
python rag_nexa.py --data ./docs
in the demos\RAG-Qwen3VL and enter the question I always get this error:
python rag_nexa.py --data ./docs
[info] Loading files from: ./docs
[info] Built 56 chunks.
[info] Ready. Using model=NexaAI/Qwen3-VL-4B-Instruct-GGUF endpoint=http://127.0.0.1:18181
Type your question (or just press Enter to quit):
[user] for the user question: "what is the location of the vehicle with the plate number 3688 FHD?" name the sub collecitons that will be used to make the query to answer the user question.
[retrieved]
- sub_collection_descriptions.txt#chunk11: . This sub-collection is the primary source for answering questions about vehicle identity, specifications, physical characteristics, and identification details...
- sub_collection_descriptions.txt#chunk24: . This sub-collection is the primary source for answering questions about vehicle location, current position, historical movement patterns, real-time status, an...
- sub_collection_descriptions.txt#chunk10: The asset_details_base sub-collection stores all essential vehicle identification and specification details for fleet assets. This comprehensive collection incl...
- sub_collection_descriptions.txt#chunk30: ================================================================================ DRIVER COLLECTION =============================================================...
- sub_collection_descriptions.txt#chunk44: ## Driver_Profile The Driver_Profile sub-collection contains the complete comprehensive profile of a driver including all personal and professional information...
[assistant]
[warn] streaming failed, fallback to non-stream. Reason: 500 Server Error: Internal Server Error for url: http://127.0.0.1:18181/v1/chat/completions
[error] Non-stream request also failed: 500 http://127.0.0.1:18181/v1/chat/completions
{"code":-100001,"error":"SDKError(Invalid input parameters or handle)"}
[user]
what is the issue, please help me fix it.