speek

Text-to-speech CLI using Google's Gemini API with configurable voices and styles.

Features

Multiple Voices: Choose from 5 different Gemini TTS voices
Configurable Speech Styles: Control tone and style with custom prompts
Temperature Control: Adjust creativity/consistency (0.0-2.0)
Persistent Config: Settings saved in ~/.speek/config.json
Interactive Setup: Guided configuration on first use
Multiple Input Methods: Direct text argument or piped stdin (including file contents)

Installation

Using pnpm dlx (Recommended)

pnpm dlx speek "Hello world"

Global Installation

npm install -g speek
# or
pnpm install -g speek

Prerequisites

SoX (for audio playback)

# macOS
brew install sox

# Ubuntu/Debian
sudo apt-get install sox

Gemini API Key
- Get one at: https://aistudio.google.com/apikey
- Set environment variable: export GEMINI_API_KEY="your_key_here"

Usage

Basic Usage

# Direct text
speek "Hello world"

# From stdin
echo "Hello world" | speek

# From file
cat file.txt | speek
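
The input handling above could be sketched roughly as follows. This is only an illustration, not speek's actual source: the helper names resolveText and readPipedStdin are hypothetical, and the rule that a text argument takes priority over piped input is an assumption.

```typescript
import { Buffer } from "node:buffer";
import { stdin } from "node:process";

// Hypothetical helper: prefer text given as CLI arguments, fall back to piped text.
// Returns null when neither source provides any non-blank text.
export function resolveText(args: string[], piped: string | null): string | null {
  const fromArgs = args.join(" ").trim();
  if (fromArgs.length > 0) return fromArgs;
  const fromPipe = (piped ?? "").trim();
  return fromPipe.length > 0 ? fromPipe : null;
}

// Hypothetical helper: read all of stdin when it is piped (not an interactive TTY).
export async function readPipedStdin(): Promise<string | null> {
  if (stdin.isTTY) return null;
  const chunks: Buffer[] = [];
  for await (const chunk of stdin) chunks.push(Buffer.from(chunk));
  return Buffer.concat(chunks).toString("utf8");
}
```

A caller would pass process.argv.slice(2) and the result of readPipedStdin() to resolveText, then show usage help when it returns null.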

Configuration

# Initial setup (runs automatically when no API key is configured)
speek --setup

# Show current config
speek --config

# Help
speek --help

Voice Options

Aoede - Calm, warm female voice (default)
Charon - Deep, authoritative male voice
Fenrir - Energetic, youthful male voice
Kore - Clear, professional female voice
Puck - Playful, animated voice

Speech Style Examples

"Speak naturally and clearly" (default)
"Speak with enthusiasm and energy"
"Speak in a calm, soothing tone"
"Speak like a professional news anchor"
"Speak with a friendly, conversational tone"
Any custom style prompt is also supported

Configuration File

Settings are automatically saved to ~/.speek/config.json:

{
  "voiceName": "Aoede",
  "speechStyle": "You are a helpful assistant. Speak naturally and clearly.",
  "temperature": 1
}
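
As a rough sketch of how a config file with this shape might be read, the snippet below merges a possibly incomplete file over the documented defaults. It is not taken from speek's source: normalizeConfig and loadConfig are hypothetical names, and falling back to defaults for unknown voices or an out-of-range temperature is an assumption.

```typescript
import { readFileSync } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

interface SpeekConfig {
  voiceName: string;
  speechStyle: string;
  temperature: number;
}

// Defaults as shown in the example config above.
const DEFAULTS: SpeekConfig = {
  voiceName: "Aoede",
  speechStyle: "You are a helpful assistant. Speak naturally and clearly.",
  temperature: 1,
};

// The five voices listed under Voice Options.
const VOICES = ["Aoede", "Charon", "Fenrir", "Kore", "Puck"];

// Hypothetical helper: fill gaps with defaults and clamp temperature to 0.0-2.0.
export function normalizeConfig(raw: Partial<SpeekConfig>): SpeekConfig {
  const voiceName =
    typeof raw.voiceName === "string" && VOICES.includes(raw.voiceName)
      ? raw.voiceName
      : DEFAULTS.voiceName;
  const speechStyle =
    typeof raw.speechStyle === "string" && raw.speechStyle.trim().length > 0
      ? raw.speechStyle
      : DEFAULTS.speechStyle;
  const temperature =
    typeof raw.temperature === "number" && Number.isFinite(raw.temperature)
      ? Math.min(2, Math.max(0, raw.temperature))
      : DEFAULTS.temperature;
  return { voiceName, speechStyle, temperature };
}

// Hypothetical helper: a missing or unparsable file yields the defaults.
export function loadConfig(path: string = join(homedir(), ".speek", "config.json")): SpeekConfig {
  try {
    return normalizeConfig(JSON.parse(readFileSync(path, "utf8")));
  } catch {
    return { ...DEFAULTS };
  }
}
```

Keeping the validation in a pure function separate from the file read makes the defaulting rules easy to test without touching the home directory.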

Development

# Clone and install
git clone <repo>
cd speek
pnpm install

# Development
pnpm dev "Hello world"

# Build
pnpm build

# Test the build
pnpm start "Hello world"

API Reference

The tool uses Google's Gemini 2.5 Flash TTS model via the REST API with:

24kHz, 16-bit, mono raw audio output
Configurable voice selection
System prompt injection for style control
Temperature-based creativity control
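
Because the audio is raw PCM with no header, a player has to be told the format explicitly, or the samples need a container around them. The sketch below illustrates both options under the format listed above (24kHz, 16-bit, mono). The function names pcmToWav and soxPlayArgs are hypothetical, little-endian signed samples are an assumption, and whether speek itself pipes raw audio to SoX or writes a WAV file is not stated here.

```typescript
import { Buffer } from "node:buffer";

// Wrap raw PCM samples in a standard 44-byte RIFF/WAVE header.
export function pcmToWav(
  pcm: Buffer,
  sampleRate = 24000,
  channels = 1,
  bitsPerSample = 16,
): Buffer {
  const blockAlign = channels * (bitsPerSample / 8); // bytes per sample frame
  const byteRate = sampleRate * blockAlign;          // bytes per second
  const header = Buffer.alloc(44);
  header.write("RIFF", 0, "ascii");
  header.writeUInt32LE(36 + pcm.length, 4);  // file size minus the first 8 bytes
  header.write("WAVE", 8, "ascii");
  header.write("fmt ", 12, "ascii");
  header.writeUInt32LE(16, 16);              // size of the fmt chunk
  header.writeUInt16LE(1, 20);               // audio format 1 = uncompressed PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write("data", 36, "ascii");
  header.writeUInt32LE(pcm.length, 40);      // size of the sample data
  return Buffer.concat([header, pcm]);
}

// Arguments for SoX's `play` that describe headerless PCM read from stdin ("-").
export function soxPlayArgs(sampleRate = 24000, channels = 1, bits = 16): string[] {
  return [
    "-t", "raw",
    "-r", String(sampleRate),
    "-e", "signed-integer",
    "-b", String(bits),
    "-c", String(channels),
    "-",
  ];
}
```

One second of audio in this format is 48,000 bytes (24,000 samples at 2 bytes each), so the resulting WAV file would be 48,044 bytes.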

Troubleshooting

Common Issues

"play command not found"
- Install SoX: brew install sox (macOS) or sudo apt-get install sox (Ubuntu/Debian)

"GEMINI_API_KEY not set"
- Get an API key: https://aistudio.google.com/apikey
- Set the GEMINI_API_KEY environment variable or run the interactive setup (speek --setup)

Audio playback fails
- Ensure SoX is properly installed
- Check that the audio output device is working

Debug Mode

# Check config
speek --config

# Verify dependencies (checked automatically on every run)
speek "test"

License

MIT
