Description
What happened?
I have models running on a local Ollama server, and I've noticed that when I try to use agents via LiteLLM, tool calling (function_calling) doesn't work. Even though I've enabled this in the model info settings for both existing and freshly configured models, the /model_groups/info response does not report that tool calling is supported, which results in an error. Calling the Ollama server directly works fine, as the model does support tools.
This does not occur for models from OpenAI/Bedrock/Azure/OpenRouter via LiteLLM, only for Ollama models. I've tested this against both the native Ollama endpoint and the OpenAI-compatible /v1 Ollama endpoint. A rough sketch of the failing request is shown below.
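This is roughly the request I'm sending through the LiteLLM proxy (a sketch; the base URL, virtual key, and tool definition are placeholders for my actual setup):

```python
from openai import OpenAI

# LiteLLM proxy endpoint and virtual key (placeholders)
client = OpenAI(base_url="https://mylitellm.testdomain", api_key="sk-xxxx")

# Minimal example tool definition (placeholder for the agent's real tools)
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# This call fails with the 400 shown below
resp = client.chat.completions.create(
    model="myqwen3-coder:30b",
    messages=[{"role": "user", "content": "What's the weather in Sydney?"}],
    tools=tools,
)
print(resp.choices[0].message)
```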
The error is:
POST "https://mylitellm.testdomain/chat/completions": 400 Bad Request {"message":"litellm.BadRequestError: OpenAIException - registry.ollama.ai/library/qwen3-coder:30b does not support tools. Received Model Group=myqwen3-coder:30b\nAvailable Model Group Fallbacks=None","type":null,"param":null,"code":"400"}
When I call /model/info with the model ID, I get:
{ "data": [ { "model_name": "myqwen3-coder:30b", "litellm_params": { "input_cost_per_token": 0, "output_cost_per_token": 0, "api_base": "http://xxxxxxx.elb.ap-southeast-2.amazonaws.com:11434/v1", "custom_llm_provider": "openai", "use_in_pass_through": false, "use_litellm_proxy": false, "merge_reasoning_content_in_choices": false, "model": "openai/qwen3-coder:30b", "guardrails": [] }, "model_info": { "id": "4dd7e249-b66d-4e3c-be35-d9af36cb6a27", "db_model": true, "key": "openai/qwen3-coder:30b", "mode": "chat", "access_groups": [], "direct_access": true, "supports_vision": true, "litellm_provider": "openai", "supports_reasoning": true, "access_via_team_ids": [ "5921b666-488f-445d-b2ad-80013388be2c" ], "input_cost_per_token": 0, "output_cost_per_token": 0, "supported_openai_params": [ "frequency_penalty", "logit_bias", "logprobs", "top_logprobs", "max_tokens", "max_completion_tokens", "modalities", "prediction", "n", "presence_penalty", "seed", "stop", "stream", "stream_options", "temperature", "top_p", "tools", "tool_choice", "function_call", "functions", "max_retries", "extra_headers", "parallel_tool_calls", "audio", "web_search_options", "response_format", "user" ], "supports_function_calling": true, "max_tokens": null, "max_input_tokens": null, "max_output_tokens": null, "cache_creation_input_token_cost": null, "cache_read_input_token_cost": null, "input_cost_per_character": null, "input_cost_per_token_above_128k_tokens": null, "input_cost_per_token_above_200k_tokens": null, "input_cost_per_query": null, "input_cost_per_second": null, "input_cost_per_audio_token": null, "input_cost_per_token_batches": null, "output_cost_per_token_batches": null, "output_cost_per_audio_token": null, "output_cost_per_character": null, "output_cost_per_reasoning_token": null, "output_cost_per_token_above_128k_tokens": null, "output_cost_per_character_above_128k_tokens": null, "output_cost_per_token_above_200k_tokens": null, "output_cost_per_second": null, "output_cost_per_image": null, "output_vector_size": null, "citation_cost_per_token": null, "supports_system_messages": null, "supports_response_schema": null, "supports_tool_choice": null, "supports_assistant_prefill": null, "supports_prompt_caching": null, "supports_audio_input": null, "supports_audio_output": null, "supports_pdf_input": null, "supports_embedding_image_input": null, "supports_native_streaming": null, "supports_web_search": null, "supports_url_context": null, "supports_computer_use": null, "search_context_cost_per_query": null, "tpm": null, "rpm": null } } ] }
When I call /model_groups/info, none of these settings come through, so I cannot work out how to use function_calling/tools or how to configure it in the proxy; the settings on the model do not seem to be propagating.
Output of /model_groups/info for the same model; notice that supports_function_calling returns false (I suspect this is the cause of the error):
{ "data": [ { "model_group": "myqwen3-coder:30b", "providers": [ "openai" ], "max_input_tokens": null, "max_output_tokens": null, "input_cost_per_token": 0, "output_cost_per_token": 0, "input_cost_per_pixel": null, "mode": null, "tpm": null, "rpm": null, "supports_parallel_function_calling": false, "supports_vision": false, "supports_web_search": false, "supports_url_context": false, "supports_reasoning": false, "supports_function_calling": false, "supported_openai_params": [ "frequency_penalty", "logit_bias", "logprobs", "top_logprobs", "max_tokens", "max_completion_tokens", "modalities", "prediction", "n", "presence_penalty", "seed", "stop", "stream", "stream_options", "temperature", "top_p", "tools", "tool_choice", "function_call", "functions", "max_retries", "extra_headers", "parallel_tool_calls", "audio", "web_search_options", "response_format", "user" ], "configurable_clientside_auth_params": null, "is_public_model_group": false } ] }
Relevant log output
Are you a ML Ops Team?
Yes
What LiteLLM version are you on ?
v1.76.1