This is a reverse-engineered demo sample of OpenAI.fm for developers to synthesize voices with the new gpt-4o-mini-tts model.

Azure-Samples/azure-openai-tts-demo

Azure OpenAI GPT-4o Mini TTS Demo

A demo project for experimenting with the Azure OpenAI GPT-4o Mini TTS (Text-to-Speech) API. Includes a Gradio-powered soundboard UI and sample scripts for generating speech from text using a variety of voices and vibes.

AOAI TTS Soundboard

Features

  • Interactive Gradio soundboard to try different voices and vibes
  • Sample scripts for streaming and saving TTS audio
  • Easily configurable for your Azure OpenAI resource

Getting Started

Prerequisites

  • Python 3.8+
  • An Azure OpenAI resource created in the eastus2 region, with the gpt-4o-mini-tts model deployed using the Global Standard deployment type

Deploy GPT-4o Mini TTS Model with Azure AI Foundry

Quickly deploy the GPT-4o Mini TTS model:

  1. Sign in to the Azure AI Foundry portal.
  2. Navigate to Catalog, filter by Azure OpenAI, and select gpt-4o-mini-tts.
  3. Click Deploy, choose or create your resource, and enter a deployment name.
  4. Click Deploy again to finalize.

For detailed instructions, see the full deployment guide.

Installation

  1. Clone the repository:
    git clone https://github.com/Azure-Samples/azure-openai-tts-demo.git
    cd azure-openai-tts-demo
  2. Create and activate a virtual environment:
    python -m venv .venv
    source .venv/bin/activate
  3. Install dependencies:
    pip install -r requirements.txt

Configuration

  1. Copy .env.example to .env and fill in your Azure OpenAI endpoint and API key:
    cp .env.example .env
    # Edit .env with your values
    Example:
    AZURE_OPENAI_ENDPOINT="https://<your-resource-name>.openai.azure.com/"
    AZURE_OPENAI_API_KEY="your-azure-openai-api-key"
    AZURE_OPENAI_API_VERSION="2025-03-01-preview"
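Before running anything, it can help to confirm that all three settings are present. The following is a minimal sketch using only the standard library. It assumes the values from .env have already been exported into the process environment (how the repository's scripts load .env is not stated here), and the names load_tts_config and TTSConfig are hypothetical helpers, not part of this repository.

```python
import os
from typing import Mapping, NamedTuple, Optional

# The three variable names come from the .env example above.
REQUIRED_VARS = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_API_VERSION",
)


class TTSConfig(NamedTuple):
    """Hypothetical container for the settings read from the environment."""
    endpoint: str
    api_key: str
    api_version: str


def load_tts_config(env: Optional[Mapping[str, str]] = None) -> TTSConfig:
    """Read the required settings and fail early if any is missing or blank."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return TTSConfig(
        # Normalize the endpoint so it always ends with exactly one slash.
        endpoint=env["AZURE_OPENAI_ENDPOINT"].strip().rstrip("/") + "/",
        api_key=env["AZURE_OPENAI_API_KEY"].strip(),
        api_version=env["AZURE_OPENAI_API_VERSION"].strip(),
    )
```

Calling load_tts_config() with no arguments reads os.environ; passing a plain dict makes the check easy to exercise without touching real credentials.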

Running the Demo

Gradio Soundboard UI

To launch the interactive soundboard:

python soundboard.py
  • Select a voice and vibe, then click Play to generate and listen to speech.

Sample Scripts

  • streaming-tts-to-file-sample.py: Streams TTS audio to a file.
  • async-streaming-tts-sample.py: Streams and plays TTS audio asynchronously.

Run a sample script with:

python streaming-tts-to-file-sample.py
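For orientation, streaming TTS audio to a file might look roughly like the sketch below. It is not the contents of streaming-tts-to-file-sample.py. It assumes the openai Python package (a recent 1.x release that accepts the instructions parameter) is installed, that your deployment is named gpt-4o-mini-tts, and that a "vibe" is passed as the instructions text; the function names, the default voice, and that mapping are all illustrative assumptions.

```python
import os
from pathlib import Path
from typing import Optional


def build_speech_request(
    text: str,
    voice: str = "coral",                  # assumed default voice
    vibe: Optional[str] = None,            # assumed to map to `instructions`
    deployment: str = "gpt-4o-mini-tts",   # assumed deployment name
) -> dict:
    """Assemble keyword arguments for the speech endpoint (hypothetical helper)."""
    if not text.strip():
        raise ValueError("text must not be empty")
    request = {"model": deployment, "voice": voice, "input": text}
    if vibe:
        request["instructions"] = vibe
    return request


def stream_speech_to_file(text: str, out_path: str, **options) -> Path:
    """Stream synthesized audio to out_path as it arrives and return the path."""
    # Imported here so the request helper above works without the SDK installed.
    from openai import AzureOpenAI

    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version=os.environ["AZURE_OPENAI_API_VERSION"],
    )
    target = Path(out_path)
    request = build_speech_request(text, **options)
    # The streaming variant writes chunks to disk instead of buffering all audio.
    with client.audio.speech.with_streaming_response.create(**request) as response:
        response.stream_to_file(target)
    return target
```

With the environment configured, a call such as stream_speech_to_file("Hello from Azure OpenAI.", "hello.mp3", vibe="Speak in a warm, upbeat tone.") would write the audio to hello.mp3. Keeping the request assembly in its own function makes it possible to inspect the parameters without making a network call.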

Resources

Responsible Use and Content Requirements

When using this soundboard or any output generated by it, you must comply with the Microsoft Enterprise AI Services Code of Conduct, including but not limited to:

  • Disclosure: Clearly disclose when audio is AI-generated. Do not mislead others into believing the synthetic voice is a real person or attributable to a specific individual without their consent.
  • Prohibited Uses: Do not use this tool or its output to:
    • Deceive, impersonate, or misinform others.
    • Generate or distribute harmful, illegal, or abusive content (including hate speech, violence, harassment, or sexually explicit material).
    • Attempt to infer or simulate sensitive personal attributes or emotional states.
    • Create chatbots for erotic, romantic, or impersonation purposes.
    • Violate any applicable law or regulation.
  • Human Oversight: Ensure appropriate human oversight and do not use the tool for consequential decisions affecting legal, financial, or human rights.
  • Content Rights: You are responsible for ensuring you have the rights to any content you input and for the responsible use of all output.
  • Feedback and Abuse: Provide a way for users to report abuse or issues with generated content.

For the full list of requirements and restrictions, see the Microsoft Enterprise AI Services Code of Conduct.

License

MIT License. See LICENSE.md for details.


Disclaimer:

This project is for educational and personal use only. It is not affiliated with, endorsed by, or officially supported by OpenAI or Microsoft. It was inspired by OpenAI's openai.fm, an interactive demo for developers to try the new text-to-speech model in the OpenAI API. The sole purpose of this sample is to help developers understand the API and how to use these models within the context of Azure OpenAI Service.
