Commit ea3b985

Add audio related endpoint docs for AI server.
1 parent 8b9cded commit ea3b985

1 file changed

+46
-0
lines changed

@@ -0,0 +1,46 @@
---
title: "Transcribing Audio"
---

# Transcribing Audio

AI Server can transcribe audio files to text using the Speech-to-Text provider, which is powered by the Whisper model and runs on your own ComfyUI Agent.

## Using Speech-to-Text

To transcribe an audio file to text, send a `SpeechToText` request with the audio file attached:

```csharp
var request = new SpeechToText();

// Attach the audio as a multipart upload: UploadFile(fileName, stream, fieldName)
using var audioStream = File.OpenRead("audio.wav");
var response = client.PostFilesWithRequest<GenerationResponse>(
    request,
    [new UploadFile("audio.wav", audioStream, "audio")]
);

// Two text outputs are returned.
// The first is JSON containing the text with `start` and `end` timestamps.
var textWithTimestamps = response.TextOutputs[0].Text;
// The second is the plain-text transcript.
var textOnly = response.TextOutputs[1].Text;
```
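
As a rough sketch, the timestamped output could be read with `System.Text.Json`. The documentation above only states that the JSON carries `start` and `end` timestamps, so the array layout, the `text` field, and the sample values below are assumptions for illustration and may not match the real payload.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.Json;

// Hypothetical sample standing in for response.TextOutputs[0].Text.
// Only `start` and `end` are named in the docs; the array layout and `text` field are assumed.
var textWithTimestamps =
    @"[{""start"":0.0,""end"":1.5,""text"":""Hello,""},
       {""start"":1.5,""end"":3.0,""text"":""how are you?""}]";

var lines = new List<string>();
using (var doc = JsonDocument.Parse(textWithTimestamps))
{
    // Walk each assumed segment object and format it as "[start - end] text".
    foreach (var segment in doc.RootElement.EnumerateArray())
    {
        var start = segment.GetProperty("start").GetDouble();
        var end = segment.GetProperty("end").GetDouble();
        var text = segment.GetProperty("text").GetString();
        lines.Add(string.Format(CultureInfo.InvariantCulture,
            "[{0:0.00}s - {1:0.00}s] {2}", start, end, text));
    }
}

foreach (var line in lines)
    Console.WriteLine(line);
```

If the actual JSON uses different property names or nesting, only the `GetProperty` calls need to change.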

## Text-to-Speech

AI Server also provides a Text-to-Speech endpoint that works with the ComfyUI Agent to generate audio files from text.

```csharp
var request = new TextToSpeech
{
    Text = "Hello, how are you?",
    Sync = true
};

var response = await client.PostAsync(request);
// Download the generated audio from the first output's URL
response.Outputs[0].Url.DownloadFileTo("hello.mp3");
```

The ComfyUI Agent uses PiperTTS to generate the audio files. To download the necessary models, set `DEFAULT_MODELS` in your ComfyUI Agent's `.env` file to include `text-to-speech`.
If you have included an `OPENAI_API_KEY` in your `.env` file, you can also use the OpenAI API to generate audio files from text. By default, this uses their `tts-1:alloy` model, while PiperTTS via the ComfyUI Agent uses the preconfigured `high:en_US-lessac` model.

See the [`/lib/data/ai-models.json`](/lib/data/ai-models.json) file for more information on the available models.
