Skip to content

Issue with setting up demos\RAG-Qwen3VL with NexaAI/Qwen3-VL-4B-Instruct-GGUF #794

@mhtracking

Description

@mhtracking

Hi, When I run nexa pull NexaAI/Qwen3-VL-4B-Instruct-GGUF, it stucks during downloading no matter what quantization I choose. When I download from the https://huggingface.co/NexaAI/Qwen3-VL-4B-Instruct-GGUF/tree/main (these files: [Qwen3-VL-4B-Instruct.Q8_0.gguf], [mmproj.Q8_0.gguf],[tokenizer.json], [nexa.manifest]) and then place them in the C:\Users\PC_NAME.cache\nexa.ai\nexa_sdk\models\NexaAI\Qwen3-VL-4B-Instruct-GGUF and run nexa serve and in another terminal I run:
python rag_nexa.py --data ./docs
in the demos\RAG-Qwen3VL and enter the question I always get this error:

python rag_nexa.py --data ./docs
[info] Loading files from: ./docs
[info] Built 56 chunks.
[info] Ready. Using model=NexaAI/Qwen3-VL-4B-Instruct-GGUF endpoint=http://127.0.0.1:18181
Type your question (or just press Enter to quit):
[user] for the user question: "what is the location of the vehicle with the plate number 3688 FHD?" name the sub collecitons that will be used to make the query to answer the user question.

[retrieved]

  1. sub_collection_descriptions.txt#chunk11: . This sub-collection is the primary source for answering questions about vehicle identity, specifications, physical characteristics, and identification details...
  2. sub_collection_descriptions.txt#chunk24: . This sub-collection is the primary source for answering questions about vehicle location, current position, historical movement patterns, real-time status, an...
  3. sub_collection_descriptions.txt#chunk10: The asset_details_base sub-collection stores all essential vehicle identification and specification details for fleet assets. This comprehensive collection incl...
  4. sub_collection_descriptions.txt#chunk30: ================================================================================ DRIVER COLLECTION =============================================================...
  5. sub_collection_descriptions.txt#chunk44: ## Driver_Profile The Driver_Profile sub-collection contains the complete comprehensive profile of a driver including all personal and professional information...

[assistant]
[warn] streaming failed, fallback to non-stream. Reason: 500 Server Error: Internal Server Error for url: http://127.0.0.1:18181/v1/chat/completions
[error] Non-stream request also failed: 500 http://127.0.0.1:18181/v1/chat/completions
{"code":-100001,"error":"SDKError(Invalid input parameters or handle)"}
[user]

what is the issue, please help me fix it.

Metadata

Metadata

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions