This CLI application allows you to request speech-to-text transcription in SRT subtitle format from an API. It leverages the Speech-to-Text API Client library to communicate with the Speech-to-Text API.
- Transcription: Generate SRT subtitles from audio files.
- Translation: Generate translated SRT subtitles from audio files.
- Local Processing: All processing is done locally on the user's machine, ensuring that no files are sent over the internet.
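For readers unfamiliar with the SRT format, the following is a minimal sketch of how a single subtitle cue is laid out: a sequence number, a time range written as HH:MM:SS,mmm --> HH:MM:SS,mmm, and the subtitle text. It illustrates the general SRT convention only. The helper names `format_timestamp` and `format_cue` are hypothetical and are not taken from this project, and the CLI's actual output may differ in detail.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def format_cue(index: int, start: float, end: float, text: str) -> str:
    """Build one SRT cue: index line, time range line, then the text."""
    return f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"


# Print a single illustrative cue covering the first 2.5 seconds.
print(format_cue(1, 0.0, 2.5, "Hello, world."))
```

In an SRT file, consecutive cues are separated by a blank line.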
The CLI provides the following commands:
- generate-srt or gs: Generate SRT subtitles from an audio file.
  - Options:
    - --file or -f: The audio file to transcribe. (Required)
    - --source-language or -sl: The source language of the audio in the pattern xx_XX. (Can be configured in appSettings.json)
- generate-translated-srt or gts: Generate translated SRT subtitles from an audio file.
  - Options:
    - --file or -f: The audio file to transcribe. (Required)
    - --source-language or -sl: The source language of the audio in the pattern xx_XX. (Can be configured in appSettings.json)
    - --target-language or -tl: The target language for the translation in the pattern xx_XX. (Can be configured in appSettings.json)
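The language options use the pattern xx_XX. As a rough sketch, a wrapper script could check a value against that pattern before invoking the CLI. The regular expression below assumes the pattern means two lowercase letters, an underscore, and two uppercase letters (as in en_US). That reading is an assumption, and the function name `is_language_code` is hypothetical. The CLI and the API may accept or reject other forms.

```python
import re

# Assumption: "xx_XX" means two lowercase letters, "_", two uppercase letters.
LANGUAGE_CODE = re.compile(r"[a-z]{2}_[A-Z]{2}")


def is_language_code(value: str) -> bool:
    """Return True if the whole string matches the assumed xx_XX pattern."""
    return LANGUAGE_CODE.fullmatch(value) is not None


for candidate in ("en_US", "pl_PL", "en-US", "english"):
    print(candidate, is_language_code(candidate))
```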
- The CLI application leverages the Speech-to-Text API Client library to handle API requests and responses.
- It communicates with the Speech-to-Text API, which performs transcription using OpenAI's Whisper model and translation using Seamless or mBART models. You can also use the home-lab project, which configures the ports in Docker and integrates all my AI-based APIs.
- The application is built as a self-contained executable in .NET 9 and does not require the .NET runtime to be installed. You can download the executable from the releases page.
- The CLI is also available as an MSIX package, which integrates the CLI with the Windows context menu, adding two options: "Generate SRT" and "Generate Translated SRT".
- Important: The CLI requires the Speech-to-Text API to be running locally. Without the API, the CLI cannot function. You can run the API locally or use the home-lab project to set it up with Docker.
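There are two ways to use the Speech-to-Text CLI: from the command line, or from the Windows context menu after installing the MSIX package. From the command line: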
SpeechToTextCli.exe generate-srt --file "path/to/audio/file.wav" --source-language "en_US"
SpeechToTextCli.exe generate-translated-srt --file "path/to/audio/file.wav" --source-language "en_US" --target-language "es_ES"
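If you want to drive the CLI from a script, the two invocations above can be assembled programmatically. The sketch below only builds the argument list from the documented command names and options. The function name `build_command` is hypothetical, and the executable is assumed to be reachable under the name SpeechToTextCli.exe.

```python
from typing import List, Optional


def build_command(
    audio_file: str,
    source_language: str,
    target_language: Optional[str] = None,
    executable: str = "SpeechToTextCli.exe",  # assumed to be on PATH
) -> List[str]:
    """Build the argument list for generate-srt or generate-translated-srt."""
    # A target language selects the translating command.
    command = "generate-translated-srt" if target_language else "generate-srt"
    args = [executable, command, "--file", audio_file,
            "--source-language", source_language]
    if target_language:
        args += ["--target-language", target_language]
    return args


print(build_command("path/to/audio/file.wav", "en_US"))
print(build_command("path/to/audio/file.wav", "en_US", "es_ES"))
```

The resulting list can be passed to `subprocess.run`. Passing a list rather than a single string avoids quoting problems with paths that contain spaces.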
You can install the CLI as an MSIX package, which integrates the CLI with the Windows context menu. This adds two options to the context menu: "Generate SRT" and "Generate Translated SRT".
To use this option, right-click on an audio file and select the desired option from the context menu.
In both cases, the CLI requires the Speech-to-Text API to be running locally. Without the API, the CLI cannot function. You can run the API locally or use the home-lab project to set it up with Docker and ensure proper communication between the CLI and the API.
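Because the CLI cannot work without the API, it can be useful to confirm that something is listening at the API address before starting a long transcription. The sketch below only tests whether a TCP connection can be opened. It does not call the API's health check route, whose path is not described here. The function name `is_listening` is hypothetical, and the 10-second default timeout is an arbitrary choice.

```python
import socket
from urllib.parse import urlsplit


def is_listening(base_address: str, timeout: float = 10.0) -> bool:
    """Return True if a TCP connection to the address's host and port succeeds."""
    parts = urlsplit(base_address)
    if parts.hostname is None:
        raise ValueError(f"Not an absolute URL: {base_address!r}")
    # Fall back to the scheme's default port when none is given.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


# Example call (assumes the API is expected at this address):
# is_listening("http://localhost:8000")
```

An open port does not prove the API is healthy, so treat a True result as a necessary condition only.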
The CLI application uses the .NET configuration system. You can configure the base address of the Speech-to-Text API, as well as the default source and target languages, in your appSettings.json file:
{
  "SpeechToText": {
    "BaseAddress": "http://localhost:8000",
    "SourceLanguage": "en_US",
    "TargetLanguage": "pl_PL",
    "TranscribeRouteTimeout": 3600,
    "HealthCheckRouteTimeout": 10
  },
  "Logging": {
    "LogLevel": {
      "Default": "Trace",
      "System": "Warning",
      "Microsoft": "Warning",
      "Microsoft.Hosting.Lifetime": "Information"
    },
    "Console": {
      "FormatterName": "simple",
      "TimestampFormat": "yyyy-MM-dd HH:mm:ss fff ",
      "SingleLine": true,
      "IncludeScopes": true,
      "UseUtcTimestamp": true,
      "ColorBehavior": "Enabled"
    }
  }
}
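To illustrate how the configured languages can serve as defaults, here is a small sketch that reads the SpeechToText section and falls back to it when no value is given on the command line. The precedence shown (command-line value first, then appSettings.json) is an assumption about the CLI's behavior, not something confirmed here, and `resolve_language` is a hypothetical helper. The settings are inlined as a string so the example runs without a file.

```python
import json
from typing import Optional

# Trimmed copy of the SpeechToText section, inlined for the example.
EXAMPLE_SETTINGS = """
{
  "SpeechToText": {
    "BaseAddress": "http://localhost:8000",
    "SourceLanguage": "en_US",
    "TargetLanguage": "pl_PL"
  }
}
"""


def resolve_language(cli_value: Optional[str], settings: dict, key: str) -> str:
    """Prefer the command-line value; otherwise use SpeechToText.<key>."""
    if cli_value:
        return cli_value
    configured = settings.get("SpeechToText", {}).get(key)
    if not configured:
        raise ValueError(f"No value on the command line or in SpeechToText.{key}")
    return configured


settings = json.loads(EXAMPLE_SETTINGS)
print(resolve_language(None, settings, "SourceLanguage"))     # from the settings
print(resolve_language("de_DE", settings, "TargetLanguage"))  # from the CLI value
```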
This project is licensed under the MIT License - see the LICENSE file for details.