.Net: Implement OnnxRuntimeGenAIChatCompletionService on OnnxRuntimeGenAIChatClient #12197
Conversation
The model seems to be loaded during initialization of the client; it should only happen at request time.

```csharp
public OnnxRuntimeGenAIChatClient(string modelPath, OnnxRuntimeGenAIChatClientOptions? options = null)
{
    //...
    _model = new Model(modelPath);
    _tokenizer = new Tokenizer(_model);
}
```
We can, but why would we want to do that? Any configuration failure won't be noticed until use, additional code (not present in the current implementation) is necessary to prevent concurrent usage from loading the likely multi-GB model multiple times, and first use will be delayed by a potentially very long time, likely timing out.
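For reference, a minimal sketch of what that extra deferral code might look like if it lived next to the client; the holder type is hypothetical and the Lazy<T>-based approach is an assumption, not the current implementation:

```csharp
using System;
using System.Threading;
using Microsoft.ML.OnnxRuntimeGenAI;

// Hypothetical sketch of deferred loading: Lazy<T> with
// ExecutionAndPublication ensures the multi-GB model is loaded at most
// once even when several requests race on first use.
public sealed class DeferredModelHolder : IDisposable
{
    private readonly Lazy<(Model Model, Tokenizer Tokenizer)> _runtime;

    public DeferredModelHolder(string modelPath)
    {
        _runtime = new Lazy<(Model, Tokenizer)>(() =>
        {
            var model = new Model(modelPath); // expensive: happens on first use
            return (model, new Tokenizer(model));
        }, LazyThreadSafetyMode.ExecutionAndPublication);
    }

    // First access pays the load cost; later accesses reuse the cached instances.
    public Model Model => _runtime.Value.Model;
    public Tokenizer Tokenizer => _runtime.Value.Tokenizer;

    public void Dispose()
    {
        if (_runtime.IsValueCreated)
        {
            _runtime.Value.Tokenizer.Dispose();
            _runtime.Value.Model.Dispose();
        }
    }
}
```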
Don't want to add behavioral changes to the existing OnnxRuntimeGenAIChatClient.
Currently the unit tests are failing because the model is being loaded. I agree that we should fail fast if the file does not exist, but not by loading the model. For local model usage, what we typically see (for instance with Ollama) is that the model gets loaded at request time, which is ultimately how local-model applications are built. I would also consider for this
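For example, a fail-fast check along these lines would validate the path up front without loading the model; the helper and its name are hypothetical:

```csharp
using System.IO;

// Hypothetical helper: validate the model path in the constructor
// without paying the cost of loading the model itself.
internal static class ModelPathGuard
{
    public static string EnsureExists(string modelPath)
    {
        // OnnxRuntimeGenAI model paths typically point to a directory of
        // model files, but accept a single file as well.
        if (!Directory.Exists(modelPath) && !File.Exists(modelPath))
        {
            throw new DirectoryNotFoundException($"ONNX model path not found: {modelPath}");
        }

        return modelPath;
    }
}
```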
Adding the delay on the Service implementation side, so it doesn't necessarily require a change to the original OnnxRuntimeGenAIChatClient.
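If that route is taken, a rough sketch of the idea (the wrapper name is hypothetical, and it simply applies the same Lazy<T> pattern one level up, so the original OnnxRuntimeGenAIChatClient constructor stays untouched; namespaces are assumptions):

```csharp
using System;
using System.Threading;
using Microsoft.Extensions.AI;
using Microsoft.ML.OnnxRuntimeGenAI;

// Hypothetical sketch: defer construction of the underlying client (and
// therefore the model load) to first use on the Service side, leaving the
// client's own constructor behavior unchanged.
public sealed class DeferredOnnxChatService : IDisposable
{
    private readonly Lazy<OnnxRuntimeGenAIChatClient> _client;

    public DeferredOnnxChatService(string modelPath)
    {
        _client = new Lazy<OnnxRuntimeGenAIChatClient>(
            () => new OnnxRuntimeGenAIChatClient(modelPath),
            LazyThreadSafetyMode.ExecutionAndPublication);
    }

    // The heavy load happens here, on the first request.
    public IChatClient Client => _client.Value;

    public void Dispose()
    {
        if (_client.IsValueCreated)
        {
            _client.Value.Dispose();
        }
    }
}
```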
Force-pushed from c20bb04 to f76bc1b
Updated to 0.8.1
One unrelated integration test failed.
More unrelated integration test failures.