Effortless speech synthesis with Kokoro, in Python.
- Features
- Requirements
- Installation
- Examples
- Usage
- Command Line Interface (CLI)
- Example Output Files
- Build from Source
- API
- License
- Simple interface for generating speech audio and subtitles
- Supports all Kokoro voices
- Outputs valid SRT subtitles
- Automatic model management
- Python 3.10+
- torch
- kokoro
- soundfile
All dependencies except Python are installed automatically.
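Because the package needs Python 3.10 or newer, a script can check the interpreter version before importing it. This is a minimal sketch; the helper name `meets_minimum_python` is hypothetical and not part of Simpler-Kokoro.

```python
import sys

MINIMUM_PYTHON = (3, 10)  # taken from the requirements list


def meets_minimum_python(version=None, minimum=MINIMUM_PYTHON):
    """Return True if `version` (default: the running interpreter) is new enough."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); tuple comparison orders (3, 9) before (3, 10).
    return tuple(version[:2]) >= tuple(minimum)
```

A script could call `meets_minimum_python()` at startup and exit with a clear message when it returns `False`.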
From PyPI:
pip install Simpler-Kokoro
Or clone the repo and install locally:
git clone https://github.com/WilleIshere/SimplerKokoro.git
cd SimplerKokoro
pip install .
You can find runnable example scripts in the examples/ folder:
- basic_example.py: Basic usage, generate speech from text.
- subtitles_example.py: Generate speech with SRT subtitles.
- custom_speed_example.py: Generate speech with custom speed.
- custom_models_dir_example.py: Specify a custom directory for model downloads.
Basic Example
from Simpler_Kokoro import SimplerKokoro
# Create an instance
sk = SimplerKokoro()
# List available voices
voices = sk.list_voices()
print("Available voices:", [v['name'] for v in voices])
# Generate speech
sk.generate(
    text="Hello, this is a test of the Simpler Kokoro voice synthesis.",
    voice=voices[0]['name'],  # Make sure to specify the name of the voice
    output_path="output.wav"
)
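Indexing into the voice list by position is fragile if the order changes. The sketch below looks a voice up by name instead. It assumes only what the example above shows, namely that each entry is a dict with a 'name' key. The helper `find_voice` and the sample names are hypothetical.

```python
def find_voice(voices, name):
    """Return the voice entry whose 'name' equals `name` (case-insensitive)."""
    wanted = name.casefold()
    for voice in voices:
        if voice["name"].casefold() == wanted:
            return voice
    available = ", ".join(v["name"] for v in voices)
    raise ValueError(f"Unknown voice {name!r}. Available: {available}")


# Stand-in data shaped like the entries used above. Real entries come from
# sk.list_voices() and may carry more metadata than 'name'.
sample_voices = [{"name": "voice_a"}, {"name": "voice_b"}]
print(find_voice(sample_voices, "VOICE_B")["name"])  # prints: voice_b
```

With a real instance, `find_voice(sk.list_voices(), "...")["name"]` could be passed as the `voice` argument.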
Generate Speech with Subtitles
sk.generate(
    text="Hello, this is a test. This is another sentence.",
    voice=voices[0]['name'],
    output_path="output.wav",
    write_subtitles=True,
    subtitles_path="output.srt",
    subtititles_word_level=True
)
Generate Speech with Custom Speed
sk.generate(
    text="This is spoken faster than normal.",
    voice=voices[1]['name'],
    output_path="fast_output.wav",
    speed=1.5
)
Specify a Path to Download Models
sk.generate(
    models_dir="Folder-to-put-models-in",
    text="That's a cool model directory.",
    voice=voices[1]['name'],
    output_path="fast_output.wav",
)
You can also use the library from the command line.
Usage:
python -m Simpler_Kokoro <command> [options]
Command | Description | Options |
---|---|---|
list-voices | List available Kokoro voices | --repo, --models_dir, --log_level |
generate | Generate speech audio from text | --text (required), --voice (required), --output (required), --speed, --write_subtitles, --subtitles_path, --subtitles_word_level, --repo, --models_dir, --log_level |
Global options:
Option | Description | Default |
---|---|---|
--repo | HuggingFace repo to use for models | hexgrad/Kokoro-82M |
--models_dir | Directory to store model files | models |
--log_level | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) | INFO |
Generate command options:
Option | Description | Default |
---|---|---|
--text | Text to synthesize (required) | |
--voice | Voice name to use (required) | |
--output | Output WAV file path (required) | |
--speed | Speech speed multiplier | 1.0 |
--write_subtitles | Write SRT subtitles | False |
--subtitles_path | Path to save subtitles | subtitles.srt |
--subtitles_word_level | Word-level subtitles | False |
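When the CLI is driven from another Python program, building the argument list in one place avoids shell-quoting problems. The sketch below uses only option names from the tables above. The function `build_generate_command` is hypothetical, and it assumes that `--write_subtitles` is a bare flag taking no value, which the tables do not confirm.

```python
import sys


def build_generate_command(text, voice, output, speed=None,
                           write_subtitles=False, subtitles_path=None):
    """Build an argv list for `python -m Simpler_Kokoro generate`."""
    cmd = [sys.executable, "-m", "Simpler_Kokoro", "generate",
           "--text", text, "--voice", voice, "--output", output]
    if speed is not None:
        cmd += ["--speed", str(speed)]
    if write_subtitles:
        # Assumption: boolean options are bare flags with no value.
        cmd.append("--write_subtitles")
    if subtitles_path is not None:
        cmd += ["--subtitles_path", subtitles_path]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, which runs the command without going through a shell.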
- output.wav: The synthesized speech audio file.
- output.srt: Subtitles in SRT format (if write_subtitles=True).
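To sanity-check a generated audio file, its duration can be read with the standard-library `wave` module. This is a sketch that assumes the output is an uncompressed PCM WAV file, which this document does not state; `wave` cannot read other encodings such as floating-point WAV. The helper name `wav_duration_seconds` is hypothetical.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration is the number of sample frames divided by frames per second.
        return wav.getnframes() / wav.getframerate()
```

For example, `wav_duration_seconds("output.wav")` could be compared against the last timestamp in the subtitle file.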
Sample SRT output
1
00:00:00,000 --> 00:00:01,200
Hello,

2
00:00:01,200 --> 00:00:02,500
this is a test.

3
00:00:02,500 --> 00:00:04,000
This is another sentence.
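SRT is a plain-text format: each cue has an index line, a timing line, and one or more text lines, and cues are separated by blank lines. The sketch below parses such a file into tuples for further processing. It is an illustrative parser, not part of Simpler-Kokoro, and the names `to_seconds` and `parse_srt` are hypothetical.

```python
import re

TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def to_seconds(stamp):
    """Convert 'HH:MM:SS,mmm' to seconds as a float."""
    match = TIMESTAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"Bad SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(content):
    """Parse SRT text into a list of (index, start, end, text) tuples."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue  # skip blocks without index, timing, and text lines
        start, end = lines[1].split("-->")
        cues.append((int(lines[0]), to_seconds(start), to_seconds(end),
                     "\n".join(lines[2:])))
    return cues


sample = """1
00:00:00,000 --> 00:00:01,200
Hello,

2
00:00:01,200 --> 00:00:02,500
this is a test.
"""
for cue in parse_srt(sample):
    print(cue)
```

Start and end times are returned in seconds, so they can be compared directly with audio durations.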
To build the package from source:
git clone https://github.com/WilleIshere/SimplerKokoro.git
cd SimplerKokoro
pip install build
python -m build
This will create distribution files in the dist/ directory:
- .whl (wheel) file for pip installation
- .tar.gz source archive
To install the built wheel locally:
pip install dist/Simpler_Kokoro-*.whl
You can now use the package as described in the usage section.
- list_voices(): Returns a list of available voices with metadata.
- generate(text, voice, output_path, speed=1.0, write_subtitles=False, subtitles_path='subtitles.srt', subtititles_word_level=False): Generates speech audio and optional subtitles.
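Since `generate` writes one file per call, synthesizing several lines means calling it in a loop. The sketch below wraps that loop and relies only on the `text`, `voice`, and `output_path` parameters listed above. The helper `generate_batch` and the `line_001.wav` naming scheme are hypothetical choices, not part of the library.

```python
from pathlib import Path


def generate_batch(sk, texts, voice, out_dir, **generate_kwargs):
    """Call sk.generate once per text, writing line_001.wav, line_002.wav, ..."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, text in enumerate(texts, start=1):
        path = out_dir / f"line_{number:03d}.wav"
        # Extra keyword arguments (for example speed=1.5) are passed through unchanged.
        sk.generate(text=text, voice=voice, output_path=str(path),
                    **generate_kwargs)
        paths.append(path)
    return paths
```

Here `sk` is expected to be a `SimplerKokoro` instance, though any object with a compatible `generate` method works, which makes the helper easy to test without downloading models.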
This project is licensed under the GPL-3.0 license.