
Python: Bug: Simple Python ONNX connector sample not working because the library has not been updated for onnxruntime_genai API changes over the past year #13001

@cmonto

Description


Describe the bug
('Failed Inference with ONNX', AttributeError("'onnxruntime_genai.onnxruntime_genai.GeneratorParams' object has no attribute 'input_ids'")).
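For context, this AttributeError appears to stem from newer onnxruntime_genai releases removing `GeneratorParams.input_ids` (prompt tokens are now fed to the generator directly). A quick probe along these lines (the helper name is mine, not part of either library) can confirm which API generation an installed build exposes, e.g. by calling it on `og.GeneratorParams` after `import onnxruntime_genai as og`:

```python
def has_legacy_input_ids_api(generator_params_cls) -> bool:
    """Return True if this GeneratorParams class still exposes `input_ids`.

    On current onnxruntime_genai builds this returns False, which is why the
    connector's `params.input_ids = ...` assignment raises AttributeError.
    """
    return hasattr(generator_params_cls, "input_ids")
```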
 
To Reproduce
Steps to reproduce the behavior:
Just run the sample code below after downloading the Phi-4-mini-instruct-onnx model from HF.
 
Platform

  • Language: Python
  • Source: pip package version 1.35.3
  • AI model: Phi-4-mini-instruct-onnx
  • IDE: VS Code
  • OS: Windows
     
Additional context

The code I am trying to run:
import asyncio
import os
 
from semantic_kernel.connectors.ai.onnx import OnnxGenAIChatCompletion, OnnxGenAIPromptExecutionSettings
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.kernel import Kernel

kernel = Kernel()
 
service_id = "phi3"
streaming = True
 
model_path = os.path.join(
        os.path.dirname(__file__),
        "models", "Phi-4-mini-instruct-onnx", "cpu_and_mobile", "cpu-int4-rtn-block-32-acc-level-4"
    )
 
chat_completion = OnnxGenAIChatCompletion(ai_model_path=model_path, ai_model_id=service_id, template="phi3")
chat_completion.enable_multi_modality = False
 
settings = OnnxGenAIPromptExecutionSettings()
 
system_message = """You are a helpful assistant."""
chat_history = ChatHistory(system_message=system_message)
 
async def chat() -> bool:
    try:
        user_input = input("User:> ")
    except KeyboardInterrupt:
        print("\n\nExiting chat...")
        return False
    except EOFError:
        print("\n\nExiting chat...")
        return False
 
    if user_input == "exit":
        print("\n\nExiting chat...")
        return False
    chat_history.add_user_message(user_input)
    if streaming:
        print("Mosscap:> ", end="")
        message = ""
        async for chunk in chat_completion.get_streaming_chat_message_content(
            chat_history=chat_history, settings=settings, kernel=kernel
        ):
            if chunk:
                print(str(chunk), end="")
                message += str(chunk)
        chat_history.add_assistant_message(message)
        print("")
    else:
        answer = await chat_completion.get_chat_message_content(
            chat_history=chat_history, settings=settings, kernel=kernel
        )
        print(f"Mosscap:> {answer}")
        chat_history.add_message(answer)
    return True
 
 
async def main() -> None:
    chatting = True
    while chatting:
        chatting = await chat()
 
 
if __name__ == "__main__":
    asyncio.run(main())
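Until the connector is updated, a compatibility shim along these lines is one way to support both API generations. This is a sketch, not the connector's actual code: `run_generation` is a name I made up, and it assumes (per my reading of the onnxruntime_genai changes) that older builds expose `GeneratorParams.input_ids` while newer ones expose `Generator.append_tokens`.

```python
def run_generation(og, model, input_tokens, search_options):
    """Feed prompt tokens using whichever onnxruntime_genai API is installed.

    `og` is the imported onnxruntime_genai module (passed in so the shim is
    easy to test); `input_tokens` is the already-tokenized prompt.
    """
    params = og.GeneratorParams(model)
    params.set_search_options(**search_options)
    if hasattr(params, "input_ids"):
        # Legacy API: prompt tokens were assigned to GeneratorParams.
        params.input_ids = input_tokens
        generator = og.Generator(model, params)
    else:
        # Current API (the change behind this AttributeError): create the
        # generator first, then append the prompt tokens to it.
        generator = og.Generator(model, params)
        generator.append_tokens(input_tokens)
    return generator
```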

Metadata

Labels

bug (Something isn't working)
python (Pull requests for the Python Semantic Kernel)