Yandex SpeechKit STT Plugin for LiveKit Agents

⚠️ IMPORTANT DISCLAIMER: This is an independent, community-developed plugin and is NOT officially affiliated with or endorsed by LiveKit or Yandex.

This project is NOT part of either the official LiveKit or Yandex ecosystems.
For issues, support, or contributions related to this plugin, please use this project's repository directly.
Do not use LiveKit's or Yandex's official support channels for plugin-specific matters.

This plugin provides Yandex SpeechKit Speech-to-Text (STT) integration for LiveKit Agents, enabling real-time Russian and English speech recognition.

Features

Real-time streaming STT using Yandex SpeechKit v3 API
Multi-language support with a primary focus on Russian and English
Automatic language detection capabilities
Interim results for responsive user experience
Profanity filtering and text normalization options
Seamless LiveKit integration following established plugin patterns

🧪 End-to-End Testing

In addition to the STT functionality, this plugin includes an end-to-end (E2E) test suite that runs against real LiveKit Cloud rooms.

The suite validates the complete pipeline, from audio input to LiveKit room management, so you can confirm that the components work together.

What the E2E suite covers:

✨ Real LiveKit Cloud rooms - Tests run against actual cloud infrastructure, not mocks or simulations
✨ Complete agent lifecycle testing - Room creation, participant management, and cleanup
✨ Production-like validation - Tests exercise the same pipeline your users will experience
✨ Debugging aids - Descriptive room naming and dashboard monitoring
✨ Minimal setup - Provide your LiveKit credentials and run the tests

Room Naming Convention

Tests use descriptive room names that indicate the test type and expected participant counts for easy debugging:

test-simple-expect-0p-room-XXXXXXXX - Simple infrastructure tests with 0 participants expected
test-simple-expect-1p-room-XXXXXXXX - Simple connection tests with 1 participant expected
test-agent-expect-2p-room-XXXXXXXX - Agent tests with 2 participants expected (agent and participant)
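The naming convention above can be sketched as a pair of helpers. This is an illustration only: `make_room_name` and `parse_room_name` are hypothetical names, not part of the plugin, and the sketch assumes the `XXXXXXXX` suffix is eight random hexadecimal characters.

```python
import re
import uuid
from typing import Tuple

# Assumption: the 8-character suffix is lowercase hex; the real tests may differ.
ROOM_NAME_RE = re.compile(r"^test-(simple|agent)-expect-(\d+)p-room-([0-9a-f]{8})$")


def make_room_name(kind: str, expected_participants: int) -> str:
    """Build a room name such as test-agent-expect-2p-room-1a2b3c4d."""
    if kind not in ("simple", "agent"):
        raise ValueError(f"unknown test kind: {kind!r}")
    suffix = uuid.uuid4().hex[:8]
    return f"test-{kind}-expect-{expected_participants}p-room-{suffix}"


def parse_room_name(name: str) -> Tuple[str, int]:
    """Recover the test kind and expected participant count from a room name."""
    match = ROOM_NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a test room name: {name!r}")
    return match.group(1), int(match.group(2))
```

Parsing the name back is handy when scanning the LiveKit Cloud dashboard: a room whose actual participant count differs from the encoded expectation points at the failing test.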

⚠️ Important: Account Balance Requirements

Before running E2E tests, ensure your Yandex Cloud account has a positive balance. Tests will fail with authentication errors if your account balance is negative or insufficient.

See DEVELOPMENT.md for detailed testing information and LiveKit Cloud dashboard monitoring.

Installation

Prerequisites

Ensure your system meets the following requirements:

Git: Download and install Git if it is not already installed.
Python 3.9+: Ensure Python is installed and configured correctly (this matches the LiveKit Agents requirement).
Hatch: Install Hatch, a modern, extensible Python project manager.

Official Yandex Cloud SDK Integration

This plugin uses the official Yandex Cloud SDK, which provides:

Official protobuf definitions for Yandex SpeechKit API v3
Proper gRPC stubs maintained by Yandex Cloud team
Full API compatibility with the latest Yandex Cloud features
Automatic updates when new API versions are released

No custom stub generation required - the plugin automatically uses the official API definitions.

Install the Plugin

Clone this repository:

```
git clone git@github.com:sergerdn/livekit-plugins-yandex.git
cd livekit-plugins-yandex
```

Install using Hatch:

The recommended approach is to build a wheel with Hatch and then install it with pip. This is the standard way to distribute and install Hatch-managed Python projects.

```
# Ensure Hatch is installed (pipx install hatch)
hatch build
# The wheel will be in the dist/ folder
# You can then install it using pip (globally or in a virtual environment)
# For example:
pip install dist/livekit_plugins_yandex-*.whl
```

For development, you would typically set up a Hatch environment:

```
# This creates or updates a virtual environment managed by Hatch
# and installs dependencies, including the plugin in editable mode.
hatch env create
# To run commands within this environment:
hatch shell
# Or prefix commands with `hatch run`:
# hatch run python your_script.py
```

Refer to DEVELOPMENT.md for more detailed development setup instructions.

Configuration

Environment Variables Setup

The plugin requires Yandex Cloud credentials to function. You can configure these using environment variables.

Method 1: Using .env File (Recommended)

Copy the environment template:
```
cp .env.example .env
```

Edit the .env file with your credentials:

```
# Required - Yandex Cloud credentials
YANDEX_API_KEY=your_service_account_api_key_here
YANDEX_FOLDER_ID=your_folder_id_here

# Optional - Yandex STT configuration (uncomment to customize)
# YANDEX_STT_ENDPOINT=stt.api.cloud.yandex.net:443
# YANDEX_STT_LANGUAGE=ru-RU
# YANDEX_STT_MODEL=general
# YANDEX_STT_DEBUG=false

# Required for E2E agent testing only
# LIVEKIT_WS_URL=wss://your-project.livekit.cloud
# LIVEKIT_API_KEY=your_livekit_api_key
# LIVEKIT_API_SECRET=your_livekit_api_secret
```

Method 2: Direct Environment Variables

You can also set environment variables directly in your shell:

```
# Required - Yandex Cloud credentials
export YANDEX_API_KEY="your_service_account_api_key_here"
export YANDEX_FOLDER_ID="your_folder_id_here"

# Optional - STT configuration
export YANDEX_STT_ENDPOINT="stt.api.cloud.yandex.net:443"
export YANDEX_STT_LANGUAGE="ru-RU"
export YANDEX_STT_MODEL="general"
export YANDEX_STT_DEBUG="false"

# For E2E agent testing only
export LIVEKIT_WS_URL="wss://your-project.livekit.cloud"
export LIVEKIT_API_KEY="your_livekit_api_key"
export LIVEKIT_API_SECRET="your_livekit_api_secret"
```

Environment Variables Reference

Yandex Cloud Configuration:

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| `YANDEX_API_KEY` | ✅ Yes | None | Service account API key (not IAM token) |
| `YANDEX_FOLDER_ID` | ✅ Yes | None | Yandex Cloud folder ID |
| `YANDEX_STT_ENDPOINT` | ❌ No | `stt.api.cloud.yandex.net:443` | gRPC endpoint for SpeechKit API |
| `YANDEX_STT_LANGUAGE` | ❌ No | `ru-RU` | Default language (ru-RU, en-US, tr-TR, etc.) |
| `YANDEX_STT_MODEL` | ❌ No | `general` | Recognition model (general, premium) |
| `YANDEX_STT_DEBUG` | ❌ No | `false` | Enable debug logging for STT operations |

LiveKit Cloud Configuration (for E2E testing):

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| `LIVEKIT_WS_URL` | 🧪 E2E | None | LiveKit WebSocket URL (wss://your-project.livekit.cloud) |
| `LIVEKIT_API_KEY` | 🧪 E2E | None | LiveKit Cloud API key for agent testing |
| `LIVEKIT_API_SECRET` | 🧪 E2E | None | LiveKit Cloud API secret for agent testing |

Development Configuration:

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| `LOG_LEVEL` | ❌ No | `INFO` | Logging level (DEBUG, INFO, WARNING, ERROR) |

Legend:

✅ Required: Must be set for basic plugin functionality
🧪 E2E: Required only for end-to-end agent testing (make test_e2e_agent)
❌ Optional: Has sensible defaults, can be customized if needed
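The variable reference above can be read as the following loader. It is a minimal sketch, not the plugin's own configuration code: `YandexSTTConfig` and `load_config` are hypothetical names, and treating `1`, `true`, and `yes` as truthy values for `YANDEX_STT_DEBUG` is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED_VARS = ("YANDEX_API_KEY", "YANDEX_FOLDER_ID")


@dataclass(frozen=True)
class YandexSTTConfig:
    """Hypothetical container mirroring the documented variables and defaults."""
    api_key: str
    folder_id: str
    endpoint: str = "stt.api.cloud.yandex.net:443"
    language: str = "ru-RU"
    model: str = "general"
    debug: bool = False


def load_config(env: Optional[Mapping[str, str]] = None) -> YandexSTTConfig:
    """Read settings from a mapping (os.environ by default), failing fast on gaps."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    # Assumption: these strings count as "true" for the debug flag.
    debug_raw = env.get("YANDEX_STT_DEBUG", "false").strip().lower()
    return YandexSTTConfig(
        api_key=env["YANDEX_API_KEY"],
        folder_id=env["YANDEX_FOLDER_ID"],
        endpoint=env.get("YANDEX_STT_ENDPOINT", "stt.api.cloud.yandex.net:443"),
        language=env.get("YANDEX_STT_LANGUAGE", "ru-RU"),
        model=env.get("YANDEX_STT_MODEL", "general"),
        debug=debug_raw in ("1", "true", "yes"),
    )
```

Checking the required variables up front gives a clear error at startup instead of an authentication failure on the first recognition request.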

Authentication Notes

✅ Supported: API Key authentication only
❌ Not Supported: IAM token authentication
Security: Never commit .env files to version control

Yandex Cloud Setup

Before using this plugin, you need to set up a Yandex Cloud account and obtain API credentials:

Prerequisites

Create a Yandex Cloud account at yandex.cloud
Create a folder in your Yandex Cloud console
Set up API authentication (API key method only - IAM authentication is not supported)

Step-by-Step Setup

📖 Quick Start Guide: Follow the official Yandex Cloud SpeechKit Quick Start for complete setup instructions, including API key creation.

Required Steps:

1. Create Yandex Cloud Account
   - Go to yandex.cloud
   - Sign up or log in to your account
2. Create a Cloud Folder
   - In the Yandex Cloud console, create a new folder
   - Note the folder ID (you'll need this for configuration)
3. Create Service Account
   - Create a service account with SpeechKit permissions
   - Assign the speechkit.stt role to the service account
4. Generate API Key
   - Create an API key for your service account
   - Important: Only API key authentication is supported (IAM tokens are not supported)
   - Save the API key securely

Authentication Method

✅ Supported: API Key authentication
❌ Not Supported: IAM token authentication

Make sure to use the API key method when following Yandex Cloud documentation.

Security Note

⚠️ Important: Never commit your .env file to version control. The .env file is already included in .gitignore to prevent accidental commits of sensitive credentials.

Usage

Basic Usage

```python
from livekit.agents import AgentSession
from livekit.plugins import yandex

# Create an STT instance optimized for real-time streaming
stt = yandex.STT(
    language="ru-RU",  # Russian
    interim_results=True,  # Enable real-time partial results (recommended)
)

# Use in LiveKit Agent for real-time streaming
agent = AgentSession(
    stt=stt,
    # ... other configuration
)
```

Manual Credentials

```python
from livekit.plugins import yandex

# Explicitly provide credentials
stt = yandex.STT(
    language="ru-RU",
    api_key="your_api_key",
    folder_id="your_folder_id",
)
```

Advanced Configuration

```python
from livekit.plugins import yandex

# Optimized for real-time streaming with language detection
stt = yandex.STT(
    detect_language=True,
    interim_results=True,  # Essential for real-time UX
    profanity_filter=True,
    model="general",
)

# English-only real-time recognition
stt = yandex.STT(
    language="en-US",
    interim_results=True,  # Always enable for streaming
    model="general",
    sample_rate=16000,
)

# High-performance streaming configuration
stt = yandex.STT(
    language="ru-RU",
    interim_results=True,  # Real-time partial results
    profanity_filter=False,  # Disable for lower latency
    sample_rate=16000,  # Standard for real-time audio
)
```

Real-Time Streaming Recognition

Primary Use Case: This plugin is designed for real-time streaming audio processing, not batch file processing.

Working Examples

For complete, working examples of proper streaming implementation, see:

example_plugin_usage.py - Comprehensive demonstration script showing:
- ✅ Real-time streaming with push_frame() method
- ✅ Emulated streaming from audio files (for testing)
- ✅ Simulated live audio processing (like microphone input)
- ❌ Batch processing comparison (shows why it's discouraged)
tests/e2e/plugin_e2e/test_real_audio_processing.py - Plugin-level E2E tests with real Yandex Cloud API calls
tests/e2e/agent_e2e/ - Agent-level E2E tests with LiveKit integration

Key Streaming Patterns

The examples demonstrate the correct patterns for:

Creating streaming sessions with stt.stream()
Processing audio frames with stream.push_frame(frame)
Handling interim and final results asynchronously
Proper session management and cleanup
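The four patterns above can be sketched without network access by using a stand-in stream. Everything here is illustrative: `FakeStream`, `FakeEvent`, and `transcribe` are hypothetical, and the `end_input()`, `aclose()`, and `is_final` names are assumptions about the stream interface. With the real plugin, the stream would come from `stt.stream()`, and the event fields may differ.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Iterable, List


@dataclass
class FakeEvent:
    """Stand-in for a recognition event; is_final separates interim from final."""
    text: str
    is_final: bool


class FakeStream:
    """Hypothetical stand-in for the object returned by stt.stream()."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        self._seen = 0
        self.closed = False

    def push_frame(self, frame: bytes) -> None:
        # Emit one interim result per frame so the consumer has something to read.
        self._seen += 1
        self._queue.put_nowait(FakeEvent(f"partial after {self._seen} frame(s)", False))

    def end_input(self) -> None:
        # End of audio: emit one final result, then a sentinel that stops iteration.
        self._queue.put_nowait(FakeEvent(f"final after {self._seen} frame(s)", True))
        self._queue.put_nowait(None)

    async def aclose(self) -> None:
        self.closed = True

    def __aiter__(self) -> AsyncIterator[FakeEvent]:
        return self._events()

    async def _events(self) -> AsyncIterator[FakeEvent]:
        while (event := await self._queue.get()) is not None:
            yield event


async def transcribe(stream, frames: Iterable[bytes]) -> List[str]:
    """Push frames in a background task while collecting final transcripts."""

    async def feed() -> None:
        for frame in frames:
            stream.push_frame(frame)
            await asyncio.sleep(0)  # yield so results are consumed as they arrive
        stream.end_input()

    feeder = asyncio.create_task(feed())
    finals: List[str] = []
    try:
        async for event in stream:
            if event.is_final:
                finals.append(event.text)
    finally:
        await feeder
        await stream.aclose()  # always release the session, even on errors
    return finals


async def demo() -> List[str]:
    # Three 20 ms frames of 16 kHz mono 16-bit silence (640 bytes each).
    return await transcribe(FakeStream(), [b"\x00" * 640] * 3)
```

Running `asyncio.run(demo())` exercises the producer and consumer concurrently. Feeding and reading in separate tasks is the point of the pattern: interim results can be handled while audio is still arriving.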

Streaming vs. Batch Processing

✅ Recommended: Real-Time Streaming

Process audio frames as they arrive using push_frame()
Get immediate interim results for responsive UX
Handle long audio streams efficiently

❌ Discouraged: Batch File Processing

Loading entire audio files defeats real-time benefits
No interim results = poor user experience
Higher memory usage and latency
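To emulate streaming from a file, the audio can be cut into short frames and pushed one at a time instead of being submitted whole. The sketch below assumes raw 16-bit mono PCM and a 20 ms frame length; `iter_pcm_frames` is a hypothetical helper, not a plugin function.

```python
from typing import Iterator


def iter_pcm_frames(
    pcm: bytes,
    sample_rate: int = 16000,
    frame_ms: int = 20,
    sample_width: int = 2,  # bytes per sample (16-bit PCM)
    channels: int = 1,
) -> Iterator[bytes]:
    """Yield fixed-duration chunks of raw PCM; the last chunk may be shorter."""
    samples_per_frame = sample_rate * frame_ms // 1000
    frame_bytes = samples_per_frame * sample_width * channels
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


# One second of silence at 16 kHz: 16000 samples * 2 bytes = 32000 bytes.
one_second = b"\x00" * 32000
frames = list(iter_pcm_frames(one_second))
```

At 16 kHz, a 20 ms frame holds 320 samples, or 640 bytes, so one second of audio becomes 50 frames. Each frame would then be wrapped in whatever frame type the stream expects before being passed to `push_frame()`.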

Run the examples:

```
# See all streaming patterns in action
python example_plugin_usage.py

# Run plugin-level E2E tests (STT functionality)
make test_e2e_plugin

# Run agent-level E2E tests (LiveKit integration)
make test_e2e_agent

# Run all E2E tests
make test_e2e
```

Supported Languages

Primary languages with full support:

Russian (ru-RU)
English (en-US)

Additional supported languages:

Turkish (tr-TR)
Kazakh (kk-KK)
Uzbek (uz-UZ)
Armenian (hy-AM)
Hebrew (he-IL)
Arabic (ar)
And many more...

Configuration Options

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `model` | `str` | `"general"` | Recognition model (general, premium) |
| `language` | `str` | `"ru-RU"` | Language code (ru-RU, en-US, etc.) |
| `detect_language` | `bool` | `False` | Auto language detection |
| `interim_results` | `bool` | `True` | Enable interim results |
| `profanity_filter` | `bool` | `False` | Filter profanity |
| `sample_rate` | `int` | `16000` | Audio sample rate (8000, 16000, 48000) |
| `api_key` | `str` | `None` | Yandex Cloud API key |
| `folder_id` | `str` | `None` | Yandex Cloud folder ID |
| `grpc_endpoint` | `str` | `"stt.api.cloud.yandex.net:443"` | gRPC endpoint for SpeechKit API |

Error Handling

The plugin includes comprehensive error handling for:

Authentication failures (invalid API keys)
Network connectivity issues (timeouts, connection drops)
Rate limiting (quota exceeded)
Audio format validation (unsupported formats)
gRPC communication errors (service unavailable)
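On the caller's side, the error categories above split into failures worth retrying (connection drops, timeouts, temporary unavailability) and failures that are not (invalid credentials, unsupported audio). The sketch below shows one way an application might wrap a call with exponential backoff. It is not the plugin's internal logic: `retry_transient` is a hypothetical helper, and the built-in `ConnectionError` and `TimeoutError` stand in for whatever exception types the plugin actually raises.

```python
import asyncio
import random
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def retry_transient(
    operation: Callable[[], Awaitable[T]],
    *,
    retryable: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Retry an async operation on transient errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except retryable:
            if attempt == attempts:
                raise  # out of attempts: surface the last transient error
            delay = base_delay * 2 ** (attempt - 1)
            # Jitter spreads out retries from clients that failed together.
            await asyncio.sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

Errors outside `retryable`, such as an authentication failure, propagate on the first attempt, since repeating the request with the same credentials cannot succeed.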

Development

For Contributors

See DEVELOPMENT.md for detailed development setup instructions and further development information.

Quick Start:

```
# Install development dependencies
make install

# Run tests
make test_unit

# Generate test fixtures
make fixtures

# Check code quality
make lint
```

Development Status

This plugin is ready for development and testing.

Completed Features:

Complete project structure and configuration
Authentication system with API key support
Full STT interface implementation
gRPC streaming implementation using official Yandex Cloud SDK
Comprehensive test suite (unit, integration, functional, e2e)
Audio fixture generation tools
Code quality tools and linting
Development documentation and workflows
Cross-platform support (Windows, macOS, Linux)

Ready for:

Real audio processing and testing
Integration with LiveKit applications
Production deployment (with proper credentials)

Development Tools:

Unified fixture generator for test audio
Comprehensive test suite with proper isolation
Code quality enforcement (linting, formatting, type checking)
Cross-platform development support

Contributing

Contributions are welcome! Please follow these guidelines:

Fork this repository
Create a feature branch
Make your changes with tests
Submit a pull request

For development setup, see DEVELOPMENT.md.

License

This plugin is licensed under the Apache License 2.0. See LICENSE for details.

Support

Issues: GitHub Issues - Report bugs or request features for this plugin
LiveKit Documentation: LiveKit Docs - For general LiveKit Agents documentation
LiveKit Community: LiveKit Discord - For general LiveKit support (not plugin-specific issues)
