This guide explains how to run gpt-oss-120b on Azure AI Foundry, and how to run gpt-oss-20b using Foundry Local.

This repo contains Python sample code for interacting with the gpt-oss-120b model deployed on Azure AI Foundry using Chat Completions. The gpt-oss models are OpenAI's open-weight models, which provide transparent access to their reasoning process.
- Python 3.7 or higher
- Azure subscription
- Azure CLI (optional, for deployment via command line)
Follow these steps to deploy and use the gpt-oss-120b model:
Deploy an Azure AI Foundry Project if you don't already have one available.
According to Microsoft Learn Docs, gpt-oss is available in all regions. I personally tested this using a Foundry Project in UK South.
📖 Detailed instructions: Create Azure AI Foundry Projects
Deploy the gpt-oss-120b model using one of the following methods:
- Navigate to your Azure AI Foundry Project
- Go to the Model catalog
- Search for and select "gpt-oss-120b"
- Click "Deploy" and follow the deployment wizard

az cognitiveservices account deployment create \
--resource-group <your-resource-group> \
--name <foundry-resource-name> \
--deployment-name "gpt-oss-120b" \
--model-name gpt-oss-120b \
--model-version 1 \
--model-format "OpenAI-OSS" \
--sku-name GlobalStandard \
--sku-capacity 1
Replace <your-resource-group> and <foundry-resource-name> with your actual values.
- Clone this repository:
  git clone https://github.com/guygregory/gpt-oss.git
  cd gpt-oss
- Install dependencies:
  pip install -r requirements.txt
- Configure environment variables:
  cp .env.sample .env
- Update the .env file with values from your Azure AI Foundry Project:
  - AZURE_OPENAI_API_ENDPOINT: Your foundry endpoint (format: https://<FOUNDRY_RESOURCE_NAME>.openai.azure.com/)
  - AZURE_OPENAI_V1_API_ENDPOINT: Your foundry v1 endpoint (format: https://<FOUNDRY_RESOURCE_NAME>.openai.azure.com/openai/v1/)
  - AZURE_OPENAI_API_KEY: Your API key (found under "Keys and Endpoint" in your project)
  - AZURE_OPENAI_API_MODEL: Your deployment name (default: gpt-oss-120b)
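With the .env file populated, you can sanity-check the configuration with a short script like the sketch below. It mirrors what the basic sample does, but it is illustrative rather than the repository's exact code, and it assumes the openai and python-dotenv packages are installed via requirements.txt.

```python
# sanity check: minimal sketch, not the repository's exact sample code
import os

from dotenv import load_dotenv  # assumes python-dotenv is installed via requirements.txt
from openai import AzureOpenAI  # assumes openai 1.x is installed via requirements.txt

load_dotenv()  # reads the .env file created in the previous step

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_API_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",  # latest GA release at the time of writing
)

response = client.chat.completions.create(
    model=os.environ["AZURE_OPENAI_API_MODEL"],  # your deployment name, e.g. gpt-oss-120b
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```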
Choose one of the Python samples:
python chat-gradio-aoai.py (Gradio web UI)
python chat-basic-aoai.py (basic sample, older API versions)
python chat-basic-aoai-v1.py (basic sample, v1 API preview)
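For orientation, the Gradio sample is conceptually a chat function wrapped in gr.ChatInterface. The sketch below is illustrative only; the actual chat-gradio-aoai.py in this repo may differ.

```python
# rough sketch of a Gradio chat UI over the deployment; illustrative, not the repo's exact code
import os

import gradio as gr  # assumes gradio is installed via requirements.txt
from dotenv import load_dotenv
from openai import AzureOpenAI

load_dotenv()

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_API_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-10-21",
)

def chat(message, history):
    # For brevity this sketch sends only the latest user turn;
    # a full chat app would also replay the conversation history.
    response = client.chat.completions.create(
        model=os.environ["AZURE_OPENAI_API_MODEL"],
        messages=[{"role": "user", "content": message}],
    )
    return response.choices[0].message.content

gr.ChatInterface(chat).launch()
```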
Starting in May 2025, you can opt in to the next generation of v1 Azure OpenAI APIs, which adds support for:
- Ongoing access to the latest features with no need to update api-version each month.
- OpenAI client support with minimal code changes to swap between OpenAI and Azure OpenAI when using key-based authentication.
Code samples have been provided for both the v1 API Preview and the older API versions; the v1 API Preview samples have a -v1.py suffix to distinguish them.
If you want the latest features, I would recommend using the v1 API Preview, with the api-version set to preview.
If you need a stable, GA version and don't need the latest features, you can use the older API. At the time of writing, the latest GA API release is 2024-10-21.
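For illustration, the v1 pattern uses the standard OpenAI Python client pointed at the v1 endpoint. The sketch below is based on the environment variables described earlier and on the api-version=preview recommendation above; the actual chat-basic-aoai-v1.py may differ slightly.

```python
# minimal sketch of the v1 API preview pattern; see chat-basic-aoai-v1.py for the actual sample
import os

from dotenv import load_dotenv
from openai import OpenAI  # the standard OpenAI client, rather than AzureOpenAI

load_dotenv()

client = OpenAI(
    base_url=os.environ["AZURE_OPENAI_V1_API_ENDPOINT"],  # https://<FOUNDRY_RESOURCE_NAME>.openai.azure.com/openai/v1/
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    default_query={"api-version": "preview"},  # opt in to the preview api-version
)

response = client.chat.completions.create(
    model=os.environ["AZURE_OPENAI_API_MODEL"],
    messages=[{"role": "user", "content": "Briefly describe how you approach reasoning."}],
)
print(response.choices[0].message.content)
```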
Azure OpenAI in Azure AI Foundry Models API lifecycle
- Authentication Error: Verify your API key and endpoint URL are correct
- Model Not Found: Ensure the deployment name matches your Azure deployment
- Region Not Supported: Verify you deployed in one of the supported regions
- Rate Limiting: The gpt-oss model has usage quotas; check your deployment capacity (a simple backoff sketch follows this list)
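For the rate-limiting case, a simple client-side backoff often helps. The helper below is a generic sketch using the openai package's RateLimitError; it is not part of the samples in this repo.

```python
# generic retry-with-backoff sketch for rate limiting; not part of the samples in this repo
import time

from openai import RateLimitError

def create_with_retry(client, model, messages, max_attempts=5):
    """Call chat.completions.create, backing off exponentially on 429 responses."""
    for attempt in range(max_attempts):
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, 8s between attempts
```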
- Check the Azure AI Foundry documentation
- Review Azure OpenAI service logs in the Azure portal
- Verify your deployment status in the Azure AI Foundry project
If you want to experiment with the gpt-oss-20b model locally without using Azure, you can use Foundry Local.
- NVIDIA GPU with at least 16 GB VRAM
- Foundry Local version 0.6.87 or above
Check your version with:
winget list --id Microsoft.FoundryLocal
Update to the latest version (if needed):
winget upgrade Microsoft.FoundryLocal
foundry model run gpt-oss-20b
💡 Note: This will download several GBs of model weights if not already cached.
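Foundry Local exposes an OpenAI-compatible endpoint on your machine, so once the model is running you can also call it from Python. The sketch below assumes the foundry-local-sdk and openai packages (neither is in this repo's requirements.txt) and follows the general pattern shown in the Foundry Local documentation; treat it as illustrative rather than a tested sample.

```python
# illustrative sketch: calling gpt-oss-20b on Foundry Local from Python
# assumes: pip install foundry-local-sdk openai  (not part of this repo's requirements.txt)
from foundry_local import FoundryLocalManager
from openai import OpenAI

alias = "gpt-oss-20b"

# Starts the Foundry Local service and loads the model (downloading it first if not cached).
manager = FoundryLocalManager(alias)

# Foundry Local serves an OpenAI-compatible API, so the standard client works against it.
client = OpenAI(base_url=manager.endpoint, api_key=manager.api_key)

response = client.chat.completions.create(
    model=manager.get_model_info(alias).id,
    messages=[{"role": "user", "content": "Why is the sky blue? Answer briefly."}],
)
print(response.choices[0].message.content)
```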
- Documentation for running gpt-oss-20b on Foundry Local can be found here
- Foundry Local is ideal for local development, offline prototyping, or sandbox testing
- If you get the error Exception: Model <gpt-oss-20b> was not found in the catalog or local cache., check that your PC meets the 16 GB NVIDIA GPU requirement and that you are running version 0.6.87 or above. Reboot your PC after upgrading.
- Works well for basic chat and reasoning tasks if your hardware meets the requirements
Here are some useful links related to gpt-oss:
- OpenAI Announcement: OpenAI Open Models
- Microsoft Announcement: OpenAI's Open-Source Model gpt-oss on Azure AI Foundry and Windows AI Foundry
- Azure Region Availability: gpt-oss Model Availability
- Run gpt-oss-20b on Foundry Local: Get Started Guide
- Model Card: gpt-oss Model Card (PDF)
- Code Samples: This Repository
- HuggingFace: gpt-oss Collection
- Ollama: gpt-oss Library
- GitHub: OpenAI gpt-oss Repository
- Playground (OpenAI): gpt-oss.com
This project is licensed under the MIT License - see the LICENSE file for details.