Commit 29e1386

feat: pronunciation dictionaries docs
1 parent 0d211c9 commit 29e1386

2 files changed: 221 additions, 0 deletions

Lines changed: 219 additions & 0 deletions
@@ -0,0 +1,219 @@
1+
---
2+
title: Pronunciation dictionaries
3+
subtitle: Control how your AI assistant pronounces specific words and phrases
4+
slug: assistants/pronunciation-dictionaries
5+
---
6+
7+
## Overview
8+
9+
Pronunciation dictionaries allow you to customize how your AI assistant pronounces specific words, names, acronyms, or technical terms. This feature is particularly useful for ensuring consistent pronunciation of brand names, proper nouns, or industry-specific terminology that might be mispronounced by default.
10+
11+
**Note:** Pronunciation dictionaries are available only with ElevenLabs voices. Phoneme rules additionally require one of the models listed under Model Compatibility; alias rules work with all ElevenLabs models.
12+
13+
## How Pronunciation Dictionaries Work
14+
15+
<Steps>
16+
<Step title="Create Pronunciation Rules">
17+
Define specific words or phrases and how they should be pronounced using either phonetic notation or word substitutions.
18+
</Step>
19+
20+
<Step title="Upload Dictionary to Vapi">
21+
Create a pronunciation dictionary through Vapi's API with your custom rules.
22+
</Step>
23+
24+
<Step title="Configure Your Assistant">
25+
Associate the pronunciation dictionary with your assistant's voice configuration.
26+
</Step>
27+
28+
<Step title="Automatic Application">
29+
When your assistant encounters the specified words during conversation, it will use your custom pronunciations automatically.
30+
</Step>
31+
</Steps>
32+
33+
## Prerequisites
34+
35+
- A Vapi assistant configured with an ElevenLabs voice
36+
- Understanding of phonetic notation (IPA or CMU Arpabet) for phoneme-based rules
37+
- Access to Vapi's API for dictionary creation
38+
39+
## Types of Pronunciation Rules
40+
41+
### Phoneme Rules
42+
43+
Phoneme rules specify exact pronunciation using phonetic alphabets. These provide the most precise control over pronunciation.
44+
45+
**Supported Alphabets:**
46+
- **IPA (International Phonetic Alphabet)**: More universal, uses symbols like `/tə'meɪtoʊ/`
47+
- **CMU Arpabet**: ASCII-based format, uses notation like `T AH M EY T OW`
48+
49+
**Model Compatibility:**
50+
Phoneme rules only work with specific ElevenLabs models:
51+
- `eleven_turbo_v2_5`
52+
- `eleven_turbo_v2`
53+
- `eleven_flash_v2`
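
The compatibility list above can be encoded as a small guard that runs before a dictionary is attached to a voice. This is a minimal sketch: `PHONEME_MODELS`, `supports_phoneme_rules`, and `unsupported_rules` are illustrative names and not part of any Vapi or ElevenLabs SDK, and the model set simply mirrors the three models listed above.

```python
# Models listed above as supporting phoneme rules (copied from this page).
PHONEME_MODELS = frozenset({"eleven_turbo_v2_5", "eleven_turbo_v2", "eleven_flash_v2"})


def supports_phoneme_rules(model: str) -> bool:
    """True if the model appears in the phoneme-compatible list above."""
    return model in PHONEME_MODELS


def unsupported_rules(model: str, rules: list[dict]) -> list[dict]:
    """Return the phoneme rules that the given model is not listed as supporting.

    Alias rules are never returned, since they work with all ElevenLabs models.
    """
    if supports_phoneme_rules(model):
        return []
    return [rule for rule in rules if rule.get("type") == "phoneme"]
```

Calling `unsupported_rules` with a model outside the list returns only the phoneme entries, which makes it easy to warn about rules that would not take effect.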
54+
55+
### Alias Rules
56+
57+
Alias rules replace words with alternative spellings or phrases. These work with all ElevenLabs models and are useful for:
58+
- Converting acronyms to full phrases (e.g., "UN" → "United Nations")
59+
- Providing phonetic spellings for difficult words
60+
- Standardizing pronunciation across different contexts
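
To make the effect of alias rules concrete, the sketch below simulates the substitution locally on a string. It is only an illustration of the behavior described on this page (case-sensitive matching, first matching rule wins); the whole-word matching via `\b` is an assumption of this sketch, and `apply_alias_rules` is a hypothetical helper, not how ElevenLabs or Vapi implement matching internally.

```python
import re


def apply_alias_rules(text: str, rules: list[dict]) -> str:
    """Replace each aliased string in `text` with its alias (illustration only)."""
    aliases: dict[str, str] = {}
    for rule in rules:
        if rule.get("type") == "alias":
            # setdefault keeps the first rule when a string is listed twice.
            aliases.setdefault(rule["stringToReplace"], rule["alias"])
    if not aliases:
        return text
    # Assumption: match whole words only, and match case-sensitively.
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(key) for key in aliases) + r")\b")
    return pattern.sub(lambda match: aliases[match.group(0)], text)


rules = [{"stringToReplace": "UN", "type": "alias", "alias": "United Nations"}]
print(apply_alias_rules("The UN met today.", rules))  # The United Nations met today.
```

Because matching is case-sensitive, the lowercase "un" and the prefix of "UNDER" are left untouched in this sketch.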
61+
62+
## Implementation
63+
64+
<Steps>
65+
<Step title="Create a Pronunciation Dictionary">
66+
Use Vapi's API to create a pronunciation dictionary with your custom rules.
67+
68+
```bash
POST https://api.vapi.ai/provider/11labs/pronunciation-dictionary
Content-Type: application/json
Authorization: Bearer YOUR_API_KEY
```
73+
74+
```json
{
  "name": "My Custom Dictionary",
  "rules": [
    {
      "stringToReplace": "tomato",
      "type": "phoneme",
      "phoneme": "/tə'meɪtoʊ/",
      "alphabet": "ipa"
    },
    {
      "stringToReplace": "Vapi",
      "type": "phoneme",
      "phoneme": "V AE P IY",
      "alphabet": "cmu-arpabet"
    },
    {
      "stringToReplace": "UN",
      "type": "alias",
      "alias": "United Nations"
    }
  ]
}
```
98+
99+
The API will respond with:
100+
```json
{
  "pronunciationDictionaryId": "rjshI10OgN6KxqtJBqO4",
  "versionId": "xJl0ImZzi3cYp61T0UQG",
  "name": "My Custom Dictionary",
  "rules": [...],
  "createdAt": "2024-01-15T10:30:00Z"
}
```
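
For readers who prefer to script this step, the create request above could be assembled in Python roughly as follows. This is a sketch using only the standard library; `phoneme_rule`, `alias_rule`, and `build_create_request` are hypothetical helper names, and the endpoint, headers, and field names are taken from the request shown above. The request object is only built here, not sent.

```python
import json
import urllib.request

# Endpoint as shown in the request above.
API_URL = "https://api.vapi.ai/provider/11labs/pronunciation-dictionary"


def phoneme_rule(string_to_replace: str, phoneme: str, alphabet: str) -> dict:
    """Build a phoneme rule; only the two alphabets named on this page are accepted."""
    if alphabet not in ("ipa", "cmu-arpabet"):
        raise ValueError(f"unsupported alphabet: {alphabet!r}")
    return {
        "stringToReplace": string_to_replace,
        "type": "phoneme",
        "phoneme": phoneme,
        "alphabet": alphabet,
    }


def alias_rule(string_to_replace: str, alias: str) -> dict:
    """Build an alias rule that substitutes `alias` for `string_to_replace`."""
    return {"stringToReplace": string_to_replace, "type": "alias", "alias": alias}


def build_create_request(api_key: str, name: str, rules: list[dict]) -> urllib.request.Request:
    """Return an unsent POST request carrying the dictionary name and rules."""
    body = json.dumps({"name": name, "rules": rules}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


request = build_create_request(
    "YOUR_API_KEY",
    "My Custom Dictionary",
    [
        phoneme_rule("Vapi", "V AE P IY", "cmu-arpabet"),
        alias_rule("UN", "United Nations"),
    ],
)
# To actually send it: urllib.request.urlopen(request), then json.load() the response.
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test before any network call is made.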
109+
</Step>
110+
111+
<Step title="Configure Your Assistant's Voice">
112+
Update your assistant configuration to use the pronunciation dictionary.
113+
114+
```json
{
  "voice": {
    "model": "eleven_turbo_v2_5",
    "voiceId": "sarah",
    "provider": "11labs",
    "stability": 0.5,
    "similarityBoost": 0.75,
    "pronunciationDictionaryLocators": [
      {
        "pronunciationDictionaryId": "rjshI10OgN6KxqtJBqO4",
        "versionId": "xJl0ImZzi3cYp61T0UQG"
      }
    ]
  }
}
```
131+
132+
<Note>
133+
When a pronunciation dictionary is added, SSML parsing will be automatically enabled for your assistant.
134+
</Note>
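
The hand-off from the create response to the voice configuration can be sketched as two small functions. `locator_from_response` and `with_dictionary` are illustrative names, not SDK functions; the field names come from the response and configuration shown in this step.

```python
def locator_from_response(response: dict) -> dict:
    """Pick out the two fields a locator needs from the create response."""
    return {
        "pronunciationDictionaryId": response["pronunciationDictionaryId"],
        "versionId": response["versionId"],
    }


def with_dictionary(voice: dict, locator: dict) -> dict:
    """Return a copy of the voice config with the locator attached (no duplicates)."""
    locators = list(voice.get("pronunciationDictionaryLocators", []))
    if locator not in locators:
        locators.append(locator)
    return {**voice, "pronunciationDictionaryLocators": locators}


response = {
    "pronunciationDictionaryId": "rjshI10OgN6KxqtJBqO4",
    "versionId": "xJl0ImZzi3cYp61T0UQG",
    "name": "My Custom Dictionary",
}
voice = with_dictionary(
    {"provider": "11labs", "voiceId": "sarah", "model": "eleven_turbo_v2_5"},
    locator_from_response(response),
)
```

Returning a new dictionary rather than mutating the input keeps the original voice configuration available for comparison or rollback.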
135+
</Step>
136+
137+
<Step title="Test Your Pronunciation">
138+
Create a test call or use the Vapi playground to verify that your custom pronunciations are working correctly.
139+
</Step>
140+
</Steps>
141+
142+
## Using Your Own ElevenLabs Account (BYOK)
143+
144+
If you're using your own ElevenLabs API key (Bring Your Own Key), you can create pronunciation dictionaries directly in your ElevenLabs account and reference them in Vapi:
145+
146+
1. Create a pronunciation dictionary in your ElevenLabs account
147+
2. Note the `pronunciationDictionaryId` and `versionId` from ElevenLabs
148+
3. Use these IDs in your Vapi assistant configuration:
149+
150+
```json
{
  "voice": {
    "model": "eleven_turbo_v2_5",
    "voiceId": "your-voice-id",
    "provider": "11labs",
    "pronunciationDictionaryLocators": [
      {
        "pronunciationDictionaryId": "your-elevenlabs-dict-id",
        "versionId": "your-elevenlabs-version-id"
      }
    ]
  }
}
```
165+
166+
## Managing Pronunciation Dictionaries
167+
168+
### List Your Dictionaries
169+
170+
```bash
GET https://api.vapi.ai/provider/11labs/pronunciation-dictionary
Authorization: Bearer YOUR_API_KEY
```
174+
175+
### Update Dictionary Rules
176+
177+
```bash
PATCH https://api.vapi.ai/provider/11labs/pronunciation-dictionary/{dictionaryId}
Content-Type: application/json
Authorization: Bearer YOUR_API_KEY
```
182+
183+
```json
{
  "rules": [
    {
      "stringToReplace": "tomato",
      "type": "phoneme",
      "phoneme": "/tə'mɑːtoʊ/",
      "alphabet": "ipa"
    }
  ]
}
```
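
The list and update calls above could be prepared from Python in the same way. This is a sketch under stated assumptions: `list_request` and `update_request` are hypothetical helpers, the URLs and headers are those shown above, and whether a PATCH replaces or merges the existing rules is not stated on this page, so the sketch does not rely on either behavior. The requests are built but not sent.

```python
import json
import urllib.request
from urllib.parse import quote

# Base endpoint as shown in the list and update requests above.
BASE_URL = "https://api.vapi.ai/provider/11labs/pronunciation-dictionary"


def list_request(api_key: str) -> urllib.request.Request:
    """Unsent GET request that lists pronunciation dictionaries."""
    return urllib.request.Request(
        BASE_URL,
        method="GET",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def update_request(api_key: str, dictionary_id: str, rules: list[dict]) -> urllib.request.Request:
    """Unsent PATCH request carrying a new `rules` array for one dictionary."""
    body = json.dumps({"rules": rules}).encode("utf-8")
    return urllib.request.Request(
        # quote() guards against characters that are not URL-safe in the ID.
        f"{BASE_URL}/{quote(dictionary_id, safe='')}",
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
```

Either request can then be passed to `urllib.request.urlopen` when you are ready to call the API.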
195+
196+
## Best Practices
197+
198+
<Note>
199+
- **Case Sensitivity**: Pronunciation dictionary searches are case-sensitive. Create separate entries for different capitalizations if needed.
200+
- **Order Matters**: Rules are applied in the order they appear in the dictionary. The first matching rule is used.
201+
- **Testing**: Always test pronunciation changes with your specific voice and model combination.
202+
- **Phoneme Accuracy**: Ensure proper stress marking for multi-syllable words when using phoneme rules.
203+
- **Model Compatibility**: Remember that phoneme rules only work with specific ElevenLabs models.
204+
</Note>
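
Since matching is case-sensitive, one way to follow the first recommendation is to generate the capitalization variants of a rule programmatically instead of writing them by hand. The helper below is a sketch; `with_case_variants` is a hypothetical name, and the choice of variants (as written, lowercase, capitalized, uppercase) is an assumption about which spellings are likely to occur in your content.

```python
def with_case_variants(rule: dict) -> list[dict]:
    """Return copies of `rule` covering common capitalizations of its target string.

    The original spelling comes first; duplicates are dropped while keeping order.
    """
    original = rule["stringToReplace"]
    spellings = dict.fromkeys(
        [original, original.lower(), original.capitalize(), original.upper()]
    )
    return [{**rule, "stringToReplace": spelling} for spelling in spellings]


rule = {
    "stringToReplace": "tomato",
    "type": "phoneme",
    "phoneme": "T AH M EY T OW",
    "alphabet": "cmu-arpabet",
}
variants = with_case_variants(rule)
print([v["stringToReplace"] for v in variants])  # ['tomato', 'Tomato', 'TOMATO']
```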
205+
206+
## Common Issues
207+
208+
**Pronunciation Not Applied**
209+
- Verify you're using a compatible ElevenLabs model for phoneme rules
210+
- Check that the `stringToReplace` exactly matches the text in your content (case-sensitive)
211+
- Ensure the pronunciation dictionary is properly referenced in your voice configuration
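
The checklist above can be turned into a quick local diagnostic. This is a sketch, not an official tool: `diagnose` is a hypothetical function, the model set is copied from the Model Compatibility list on this page, and the checks cover only the three points listed above.

```python
# Copied from the Model Compatibility list on this page.
PHONEME_MODELS = {"eleven_turbo_v2_5", "eleven_turbo_v2", "eleven_flash_v2"}


def diagnose(voice: dict, rules: list[dict], text: str) -> list[str]:
    """Return human-readable reasons why a rule might not be applied to `text`."""
    issues = []
    if voice.get("provider") != "11labs":
        issues.append("voice provider is not 11labs")
    if not voice.get("pronunciationDictionaryLocators"):
        issues.append("no pronunciationDictionaryLocators in the voice configuration")
    model = voice.get("model")
    for rule in rules:
        target = rule["stringToReplace"]
        if rule.get("type") == "phoneme" and model not in PHONEME_MODELS:
            issues.append(f"phoneme rule for {target!r} needs a compatible model, got {model!r}")
        # Case-sensitive miss that would match if capitalization were ignored.
        if target not in text and target.lower() in text.lower():
            issues.append(f"{target!r} appears in the text only with different capitalization")
    return issues
```

An empty list means none of the listed causes were detected; it does not guarantee that the pronunciation is applied, so a test call remains the final check.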
212+
213+
**SSML Conflicts**
214+
- When pronunciation dictionaries are enabled, SSML parsing is automatically activated
215+
- Ensure any existing SSML tags in your content are properly formatted
216+
217+
**Performance Impact**
218+
- Large dictionaries may slightly increase processing time
219+
- Consider organizing rules by frequency of use for optimal performance

fern/docs.yml

Lines changed: 2 additions & 0 deletions
@@ -146,6 +146,8 @@ navigation:
   path: assistants/assistant-hooks.mdx
 - page: Background speech denoising
   path: assistants/background-speech-denoising.mdx
+- page: Pronunciation dictionaries
+  path: assistants/pronunciation-dictionaries.mdx
 - section: Model configurations
   icon: fa-light fa-waveform-lines
   contents:
