When LangChain calls the ChatOllama module, the request is sometimes interrupted, but LangChain keeps waiting for a response from Ollama, causing the program to hang.
#31549
Checked other resources
Commit to Help
Example Code
The loop simply calls invoke; these steps don't involve any complex processing:
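The original snippet did not survive the page export, so the following is only a minimal sketch of the kind of loop described, assuming the langchain-ollama package; the model name and prompt data are placeholders, not the actual values used:

```python
from langchain_ollama import ChatOllama
from langchain_core.messages import HumanMessage

# Placeholder model and data; the real code rebuilds the input on every
# iteration and the loop can run tens of thousands of times.
llm = ChatOllama(model="llama3")
prompts = ["summarize the following record: ..."] * 10_000

for i, prompt in enumerate(prompts):
    # Each call blocks until Ollama returns an HTTP response; the hang
    # described under "Description" happens inside this call.
    result = llm.invoke([HumanMessage(content=prompt)])
    print(i, result.content)
```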
Description
In my code, data is provided to the LLM repeatedly for reasoning, and this process may run thousands or even tens of thousands of times.
Everything works fine when the code first starts, but at some point during the loop LangChain ends up waiting indefinitely for a response from Ollama.
I used py-spy to inspect the running process and found that the stack was blocked waiting for an HTTP response.
This issue only appears after a certain number of iterations, which makes it difficult to debug.
I have also tried upgrading Ollama and switching models, but the problem persists.
I can confirm that the context I provide on each call does not exceed the model's maximum context length.
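The evidence for that last point did not make it into this export. As a rough illustration, a check along these lines can be run before each call; the limit below is an assumed value rather than the model's actual num_ctx, and LangChain's get_num_tokens only gives an approximate count:

```python
# MAX_CONTEXT_TOKENS is an assumed value; set it to the num_ctx configured
# for the Ollama model actually in use.
MAX_CONTEXT_TOKENS = 8192

def within_context_limit(llm, prompt: str) -> bool:
    # get_num_tokens is approximate: LangChain falls back to a GPT-2
    # tokenizer (requires the transformers package), which does not match
    # Ollama models exactly.
    return llm.get_num_tokens(prompt) <= MAX_CONTEXT_TOKENS
```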
System Info
Python 3.11.11
langchain version:
system: macOS 15.1.1 and Ubuntu (kernel 5.15.0-141-generic)