VapiAI · nitishsurana · Jul 8, 2025 · Jul 8, 2025
diff --git a/fern/apis/api/openapi-overrides.yml b/fern/apis/api/openapi-overrides.yml
@@ -1083,8 +1083,6 @@ components:
       title: TavusVoice
     VapiVoice:
       title: VapiVoice
-    SesameVoice:
-      title: SesameVoice
     AIEdgeCondition:
       title: AIEdgeCondition
     LogicEdgeCondition:

diff --git a/fern/docs.yml b/fern/docs.yml
@@ -470,8 +470,8 @@ navigation:
                     path: providers/voice/rimeai.mdx
                   - page: Deepgram
                     path: providers/voice/deepgram.mdx
-                  - page: Sesame
-                    path: providers/voice/sesame.mdx
+                  - page: Inworld
+                    path: providers/voice/inworld.mdx
                   - section: Video models
                     contents:
                       - page: Tavus

diff --git a/fern/providers/voice/inworld.mdx b/fern/providers/voice/inworld.mdx
@@ -0,0 +1,61 @@
+---
+title: InworldAI
+subtitle: What is Inworld.ai?
+slug: providers/voice/inworld
+---
+
+**What is Inworld.ai?**
+
+Inworld.ai provides developers with tools to create lifelike voice agents. It supports zero-shot voice cloning, enabling the creation of personalized voices from short audio samples. The system is optimized for low-latency streaming, making it suitable for applications requiring immediate audio responses.
+
+**The Evolution of AI Speech Synthesis:**
+
+Advancements in deep learning and neural networks have significantly improved the quality of AI-generated speech. Inworld.ai leverages these developments to deliver natural-sounding, emotionally expressive voices suitable for various applications, including virtual assistants and interactive games.
+
+**Overview of Inworld.ai's Offerings:**
+
+Inworld.ai provides a comprehensive suite of features designed to meet diverse voice synthesis needs:
+
+**Real-Time Speech Synthesis:**
+
+Inworld.ai is engineered for low-latency performance, delivering the first two seconds of audio in approximately 200 milliseconds. This responsiveness is critical for real-time applications such as conversational agents and interactive gaming characters.
+
+**Zero-Shot Voice Cloning:**
+
+The platform offers zero-shot voice cloning, allowing developers to create custom voices from as little as 5 seconds of audio input. This feature facilitates the development of unique voice identities for various applications.
+
+**Multilingual Support:**
+
+Inworld.ai supports 11 languages, including English, Spanish, French, Korean, and Chinese. This multilingual capability enables developers to build applications for diverse global audiences.
+
+**Audio Markup Controls:**
+
+Developers can use audio markup tags such as `[happy]`, `[whispering]`, or `[sigh]` to control the emotional tone and style of the synthesized speech. This feature enhances the expressiveness of voice agents.
+
+**Developer API:**
+
+Inworld.ai provides an API with comprehensive documentation, facilitating integration into various applications. The API supports real-time streaming and offers options for customizing voice parameters to suit specific use cases.
+
+**Use Cases for Inworld.ai:**
+
+Inworld.ai's versatile platform supports a wide range of applications:
+
+**Interactive Applications:**
+
+Developers can create responsive voice agents for customer service, virtual assistants, and interactive gaming characters, enhancing user engagement through natural-sounding speech.
+
+**Content Creation:**
+
+Content creators can use Inworld.ai to generate high-quality voiceovers for videos, podcasts, and other media, streamlining the production process.
+
+**Education and Training:**
+
+Educational platforms can employ Inworld.ai to provide clear and expressive narration for e-learning materials, improving the learning experience for users.
+
+**Integration with Vapi:**
+
+Inworld.ai is integrated with Vapi, allowing developers to access its features through the Vapi platform. This integration simplifies the process of building and deploying voice agents, offering tools for testing and optimizing performance before production.
+
+**Conclusion:**
+
+Inworld.ai offers a combination of expressive voice synthesis, low-latency performance, and multilingual support, making it a valuable tool for developers seeking to enhance their applications with natural-sounding speech.
diff --git a/fern/providers/voice/sesame.mdx b/fern/providers/voice/sesame.mdx