PandasQueryEngine with Ollama #19171

TastyMoocow · 2025-06-22T07:06:11Z

I'm trying to determine whether a pandas DataFrame is correctly generated from PyMuPDF's find_tables() function. The end goal is to automatically detect whether the program needs to call PyMuPDF again with an extended bounding box. When I run the code, I get the following error.

Traceback (most recent call last):
  File "D:\python_projects\FinanceReportRead\.venv\Lib\site-packages\llama_index\experimental\query_engine\pandas\output_parser.py", line 42, in default_output_processor
    tree = ast.parse(output)
           ^^^^^^^^^^^^^^^^^
  File "C:\Users\Daniel\AppData\Local\Programs\Python\Python311\Lib\ast.py", line 50, in parse
    return compile(source, filename, mode, flags,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<unknown>", line 2
    Okay, let's see. The user is working with a pandas DataFrame called df. They provided the output of df.head(), which shows the columns as Assets, Col1, Col2, Col3, Col4, Col5. The question is asking for the column header names.
             ^
SyntaxError: unterminated string literal (detected at line 2)

Here's a cleaned up version of my code.

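`import pymupdf
from llama_index.experimental.query_engine import PandasQueryEngine
from llama_index.llms.ollama import Ollama

doc = pymupdf.open("sample_docs/Ascend_10Q_2025.pdf")
data = doc.load_page(5)

llm = Ollama(
    model="qwen3:8b",
    base_url="http://127.0.0.1:11434",
    request_timeout=180,
    keep_alive=1,
    temperature=0,
)

# Collect every table detected on the page.
tabsFinder = data.find_tables()
tables = []
dfs = []
for tab in tabsFinder.tables:
    tables.append(tab)

# Convert each detected table to a pandas DataFrame.
for tab in tables:
    df = tab.to_pandas()
    dfs.append(df)

# Query the first table only.
query_engine = PandasQueryEngine(df=dfs[0], llm=llm)
response = query_engine.query("What are the column header names?")
`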

Looking at the example in the documentation, it seems all I need to provide is the LLM definition and a pandas DataFrame for the query. From the error message, it seems Ollama was called, since the traceback includes part of its response.

@TastyMoocow · 2025-06-22T07:11:59Z

dosubot[bot]
bot Jun 22, 2025

Hi @TastyMoocow! I'm Dosu and I’m helping the llama_index team.

The error happens because PandasQueryEngine expects the LLM to return Python code (for example, df.columns.tolist()), but your Ollama model is returning a natural-language explanation instead. As the traceback shows, the output parser passes the LLM's response to ast.parse, so any extra text or explanation raises a SyntaxError.

To fix this, you need to make sure the LLM returns only code. The default prompt used by PandasQueryEngine instructs the LLM to output only an executable Python expression, but not all models follow this strictly. You can pass a more explicit prompt through the pandas_prompt argument when you create the query engine, for example:

from llama_index.core import PromptTemplate
from llama_index.experimental.query_engine import PandasQueryEngine

# The engine fills in {df_str} (the output of df.head()), {instruction_str}, and {query_str}.
prompt_template = PromptTemplate(
    "You are working with a pandas DataFrame named `df`.\n"
    "This is the result of `print(df.head())`:\n{df_str}\n\n"
    "Follow these instructions:\n{instruction_str}\n"
    "Return ONLY one executable Python expression (no explanation, no markdown, no comments). "
    "For example, to get the column names, return: df.columns.tolist()\n"
    "Query: {query_str}\n\n"
    "Expression:"
)

# Pass the custom prompt as pandas_prompt.
query_engine = PandasQueryEngine(df=dfs[0], llm=llm, pandas_prompt=prompt_template)

Set temperature=0 for the LLM to reduce randomness and encourage consistent, code-only output.

If the LLM still returns extra text, you can post-process the output to extract just the code, or subclass/wrap the output processor to be more robust.
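As a rough illustration of the post-processing idea, here is a minimal sketch that pulls the code out of a chatty reply before it reaches the parser. The helper name extract_code is hypothetical, and the sketch assumes that reasoning models wrap their deliberation in <think>...</think> tags and that code may arrive inside a markdown fence. How you wire it into the engine's output processor depends on your llama_index version, so that part is left out.

```python
import ast
import re

# Assumption: reasoning models wrap their deliberation in <think>...</think>.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)

# A markdown fence marker, built programmatically so it can be embedded here.
FENCE = "`" * 3
# Matches a fenced block with an optional language tag and captures its body.
FENCED_CODE = re.compile(FENCE + r"(?:python|py)?\s*\n(.*?)" + FENCE, re.DOTALL)


def _is_python(text: str) -> bool:
    """Return True if the text parses as Python source."""
    try:
        ast.parse(text)
    except SyntaxError:
        return False
    return True


def extract_code(raw: str) -> str:
    """Best-effort extraction of Python code from an LLM reply (hypothetical helper)."""
    # Drop any reasoning block, then prefer the body of a fenced block if present.
    text = THINK_BLOCK.sub("", raw).strip()
    fenced = FENCED_CODE.search(text)
    if fenced:
        text = fenced.group(1).strip()
    if _is_python(text):
        return text
    # Fall back to the last single line that parses on its own.
    for line in reversed(text.splitlines()):
        candidate = line.strip().strip("`")
        if candidate and _is_python(candidate):
            return candidate
    raise ValueError("no parseable Python found in model output")
```

The fallback is deliberately crude: a one-word line of prose also parses as Python, so treat the result as a candidate to validate, not as trusted code.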

For your DataFrame validation logic, you can check if the columns or shape match your expectations and, if not, trigger a re-extraction with an extended bounding box.
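A check like that does not need the LLM at all. The sketch below is one possible heuristic, not an established method: it assumes PyMuPDF's to_pandas() names columns that have no header text Col0, Col1, and so on (consistent with the Assets, Col1, ..., Col5 columns in the traceback above), and the function name and thresholds are purely illustrative.

```python
import re

# Assumption: columns without header text are named "Col<N>" by to_pandas().
PLACEHOLDER = re.compile(r"^Col\d+$")


def needs_reextraction(columns, n_rows, min_rows=2, max_placeholder_ratio=0.5):
    """Heuristic: return True if the extracted table looks truncated.

    `columns` is any iterable of column labels and `n_rows` is the row count.
    The thresholds are illustrative defaults, not tuned values.
    """
    names = [str(c) for c in columns]
    # An empty or nearly empty table is treated as a failed extraction.
    if not names or n_rows < min_rows:
        return True
    # Too many placeholder headers suggests the header row was cut off.
    placeholders = sum(1 for name in names if PLACEHOLDER.match(name))
    return placeholders / len(names) > max_placeholder_ratio
```

With a DataFrame this would be called as needs_reextraction(df.columns, len(df)). If it returns True, one option is to call find_tables() again with a larger clip rectangle around the table's bounding box.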

Best practices: always give clear instructions in your prompt about the expected output format, include examples, and use temperature=0 for deterministic results. More on this in the LlamaIndex prompt engineering guidance.

To reply, just mention @dosu.

0 replies

TastyMoocow · 2025-06-23T03:08:43Z

Author

@dosu

Thank you for pointing out the need to use a coder-type model. After switching to qwen2.5-coder, it worked even without a prompt_template. Do you know which Ollama models would work with PandasQueryEngine? I tried the codegemma:7b-code model, but it threw an error instead.

1 reply

dosubot[bot] bot Jun 23, 2025

There isn’t an official or community-maintained list of Ollama models that are guaranteed to work with PandasQueryEngine. Compatibility depends on whether the model reliably outputs only executable Python code (no explanations, markdown, or extra text), since the engine parses the LLM output as code and will error if there’s anything else present.

Coder/code-specialized models like qwen2.5-coder tend to work because they follow code-only instructions more strictly. General-purpose or chat models, and even some code models (like codegemma:7b-code), may still return explanations or formatting that break the parser, leading to errors.

The best way to check compatibility is to run a simple query (like asking for df.columns.tolist()) and inspect the raw output. If the model returns just the code, it should work. If not, you can try tweaking the prompt template to be more explicit, set temperature=0 for deterministic output, or try a different coder model. If a model consistently fails, it’s likely not compatible with the current PandasQueryEngine workflow.
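The compatibility check described above could be scripted roughly as follows. This is a sketch under assumptions: generate is a hypothetical callable that takes a prompt and returns the model's raw text, and the probe prompts are made up for illustration. The check is also stricter than the engine, which (per the traceback) calls ast.parse(output) on the whole reply, whereas this accepts only a single bare expression.

```python
import ast
from typing import Callable, Dict

# Illustrative probe prompts; adjust them to match your own queries.
PROBES = {
    "columns": "Return only a Python expression that lists the column names of df.",
    "shape": "Return only a Python expression that gives the shape of df.",
}


def is_bare_expression(raw: str) -> bool:
    """Return True if the reply is a single Python expression and nothing else."""
    try:
        ast.parse(raw.strip(), mode="eval")
    except SyntaxError:
        return False
    return True


def probe_model(generate: Callable[[str], str]) -> Dict[str, bool]:
    """Send each probe through `generate` and record whether the reply is code-only."""
    return {
        name: is_bare_expression(generate(prompt))
        for name, prompt in PROBES.items()
    }
```

With the Ollama LLM from the original post, generate could be something like lambda prompt: llm.complete(prompt).text. A model that returns False for these trivial probes is unlikely to work with the engine without extra prompting or post-processing.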

If you find a model that works well, sharing that info with the community would be helpful, since this isn’t tracked in the docs or codebase.

