Try out Model Router and Foundry Local with these simple Gradio samples #27
@guygregory I got a great question on Discord (https://aka.ms/azureaifoundry/discord): what is the cost of the model router, given that every request has to go through the router first? Today, you can monitor the costs of your model router deployment in the Azure portal. Thought that was worth sharing here 👍
Introduction
At Microsoft Build 2025, we announced a number of exciting new preview features, including Foundry Local and Model Router.
Foundry Local unlocks instant, on-device AI by allowing users to run generative AI models directly on their local machines, without relying on cloud-based inference. This approach offers enhanced privacy, lower latency, and greater control over AI workloads, empowering developers and organizations to build and experiment with AI locally.
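As a minimal sketch of what local inference can look like: Foundry Local exposes an OpenAI-compatible endpoint, so the standard `openai` client can talk to it. The endpoint URL and model alias below are illustrative placeholders, not values from the samples; check your local Foundry Local service for the actual ones.

```python
# Sketch: call a Foundry Local model through its OpenAI-compatible endpoint.
# LOCAL_ENDPOINT and MODEL_ALIAS are assumptions for illustration only.
LOCAL_ENDPOINT = "http://localhost:5273/v1"  # placeholder local endpoint
MODEL_ALIAS = "phi-3.5-mini"                 # placeholder model alias

def build_chat_request(prompt: str, model: str = MODEL_ALIAS) -> dict:
    """Assemble the request body for the chat completions API."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

if __name__ == "__main__":
    # Local inference needs no real API key, but the client requires a value.
    from openai import OpenAI
    client = OpenAI(base_url=LOCAL_ENDPOINT, api_key="not-needed")
    req = build_chat_request("Why is on-device inference useful?")
    resp = client.chat.completions.create(**req)
    print(resp.choices[0].message.content)
```

Because the request never leaves the machine, the same code keeps working offline, which is the privacy and latency benefit described above.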
The Model Router feature within Azure OpenAI is a deployable model trained to select the best large language model (LLM) to respond to a given prompt in real time. By assessing factors such as prompt complexity, cost, and performance, Model Router dynamically routes requests to the most suitable underlying model.
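From the client's point of view, a model router deployment is called like any other Azure OpenAI deployment; the response's `model` field then reports which underlying model the router selected. The endpoint, API version, and deployment name below are placeholders for illustration.

```python
# Sketch: send a prompt to a hypothetical "model-router" deployment and
# inspect which underlying model it selected. All credentials/names are
# placeholders, not values from the samples.

def routed_model(response: dict) -> str:
    """Return the underlying model chosen for this response.

    With a model router deployment, the completion's `model` field reflects
    the selected LLM rather than the router deployment name.
    """
    return response.get("model", "unknown")

if __name__ == "__main__":
    from openai import AzureOpenAI
    client = AzureOpenAI(
        azure_endpoint="https://<your-resource>.openai.azure.com",  # placeholder
        api_key="<your-api-key>",                                   # placeholder
        api_version="2024-10-21",                                   # placeholder
    )
    resp = client.chat.completions.create(
        model="model-router",  # assumed deployment name for the router
        messages=[{"role": "user", "content": "Summarize quantum computing in one line."}],
    )
    print("Routed to:", routed_model(resp.model_dump()))
```

Logging the routed model per request is also a simple way to see, in practice, how often the router picks a cheaper model for easy prompts.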
To accelerate adoption of these new features, I've included a couple of sample applications, each with a simple Gradio front-end, to make it easier to test the features and build custom demos:
Foundry Local samples
Model Router samples
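In the spirit of those samples, a stripped-down Gradio chat front-end over any OpenAI-compatible endpoint can be sketched as follows. This assumes Gradio's classic pair-format chat history and placeholder endpoint/model values; it is not the samples' exact code.

```python
# Sketch: a minimal Gradio chat UI over an OpenAI-compatible endpoint.
# Assumes history arrives as [user, assistant] pairs (classic Gradio format).

def history_to_messages(history: list, message: str) -> list:
    """Convert pair-format chat history plus the new user message into the
    OpenAI chat `messages` format."""
    messages = []
    for user_turn, assistant_turn in history:
        messages.append({"role": "user", "content": user_turn})
        if assistant_turn is not None:
            messages.append({"role": "assistant", "content": assistant_turn})
    messages.append({"role": "user", "content": message})
    return messages

if __name__ == "__main__":
    import gradio as gr
    from openai import OpenAI

    # Placeholder endpoint/model; swap in your Foundry Local or Azure values.
    client = OpenAI(base_url="http://localhost:5273/v1", api_key="not-needed")

    def chat(message, history):
        resp = client.chat.completions.create(
            model="phi-3.5-mini",  # placeholder alias
            messages=history_to_messages(history, message),
        )
        return resp.choices[0].message.content

    gr.ChatInterface(fn=chat, title="Foundry Local demo").launch()
```

Swapping the `base_url` and `model` between a local endpoint and an Azure model router deployment is all it takes to point the same UI at either feature.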
Features and Screenshots
Screenshots:
Technical Details
Technologies Used ...
Challenges and Solutions
Problems Encountered & Solutions Implemented ...