Commit 66c3211

Merge remote-tracking branch 'origin/main' into orpheus
2 parents a95b173 + 7d3ee3c commit 66c3211

19 files changed, with 3215 additions and 202 deletions

.github/copilot-instructions.md

Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
+# GitHub Copilot Instructions
+
+## Project Context
+
+Wyoming OpenAI is a proxy middleware that bridges the Wyoming protocol with OpenAI-compatible endpoints for ASR (Automatic Speech Recognition) and TTS (Text-to-Speech) services. It enables Wyoming clients like Home Assistant to use various OpenAI-compatible STT/TTS services.
+
+## Code Style and Conventions
+
+- Use async/await patterns for all I/O operations
+- Follow Python type hints for function signatures
+- Maintain consistency with existing error handling patterns
+- Use logging for debugging and error messages
+- Keep functions focused and modular
+
+## Architecture Overview
+
+### Core Components
+
+- **`handler.py`**: Contains `OpenAIEventHandler` - the main Wyoming protocol event handler that processes ASR and TTS requests
+- **`compatibility.py`**: Provides `CustomAsyncOpenAI` class with backend detection and OpenAI API compatibility layer
+- **`__main__.py`**: Entry point with argument parsing and server initialization
+- **`utilities.py`**: Helper functions for audio processing and data handling
+- **`const.py`**: Version constants and configuration
+
+### Key Patterns
+
+1. **Async Event Handling**: Uses Wyoming's `AsyncEventHandler` to process incoming protocol events
+2. **Backend Abstraction**: `CustomAsyncOpenAI` wraps different backends (OpenAI, Speaches, LocalAI, etc.) with a unified interface
+3. **Stream Processing**: Handles both streaming and non-streaming transcription modes
+4. **Audio Buffer Management**: Collects audio chunks into complete files for processing
+
+### Wyoming Protocol Events
+
+The handler processes these Wyoming events:
+- `AudioStart/AudioChunk/AudioStop` → STT transcription
+- `Transcribe` → Initiate transcription request
+- `Synthesize` → TTS audio generation
+
+### Supported Backends
+
+The `OpenAIBackend` enum defines supported backends:
+- `OPENAI`: Official OpenAI API
+- `SPEACHES`: Local Speaches service
+- `LOCALAI`: LocalAI service
+- `KOKORO_FASTAPI`: Kokoro TTS service
+
+## Testing Guidelines
+
+When writing tests:
+- Use pytest fixtures for common setup
+- Mock external API calls
+- Test both success and error scenarios
+- Include integration tests for end-to-end flows
+- Aim for high code coverage
+
+Test files are organized by module:
+- `test_handler.py`: Event handler logic
+- `test_compatibility.py`: Backend compatibility
+- `test_utilities.py`: Helper functions
+- `test_integration.py`: End-to-end scenarios
+
+## Common Development Tasks
+
+### Running Tests
+```bash
+pytest # Run all tests
+pytest --cov=wyoming_openai # With coverage
+pytest tests/test_handler.py # Specific test file
+```
+
+### Code Quality
+```bash
+ruff check . # Run linting
+ruff check . --fix # Auto-fix issues
+```
+
+### Local Development
+```bash
+pip install -e ".[dev]" # Install dev dependencies
+python -m wyoming_openai --uri tcp://0.0.0.0:10300 --stt-models whisper-1 --tts-models tts-1
+```
+
+### Docker Development
+```bash
+docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d --build
+```
+
+## Configuration
+
+The server accepts both command-line arguments and environment variables. When suggesting configuration changes, consider:
+- STT/TTS API keys and URLs
+- Model lists for STT and TTS
+- Voice configurations
+- Backend-specific settings (temperature, speed, etc.)
+
+## When Making Changes
+
+- Ensure backward compatibility with existing Wyoming clients
+- Update tests to reflect new functionality
+- Add appropriate logging for debugging
+- Document new configuration options
+- Consider impact on all supported backends
+- Validate audio format conversions maintain quality
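
Note on the "Audio Buffer Management" pattern described in this file: the idea of collecting audio chunks into one complete file can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual code; the function name `chunks_to_wav` and the default sample rate, sample width, and channel count are assumptions chosen for the example.

```python
import io
import wave


def chunks_to_wav(chunks, rate=16000, width=2, channels=1):
    """Join raw PCM chunks into a single in-memory WAV file.

    Hypothetical helper: the defaults (16 kHz, 16-bit, mono) are
    illustrative assumptions, not values taken from the commit.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(width)
        wav_file.setframerate(rate)
        for chunk in chunks:
            # Each chunk is appended; the header is finalized on close.
            wav_file.writeframes(chunk)
    return buffer.getvalue()


# Two chunks of 160 frames each (2 bytes per frame, mono).
wav_bytes = chunks_to_wav([b"\x00\x01" * 160, b"\x02\x03" * 160])
```

The resulting bytes form a complete WAV container, which is the kind of payload a file-upload style transcription endpoint typically expects.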

.github/workflows/docker-image-pr.yml

Lines changed: 20 additions & 5 deletions
@@ -2,30 +2,45 @@ name: Docker Image PR Build
 
 on:
   pull_request:
+    types: [opened, synchronize, reopened]
     branches: [ "main" ]
 
 jobs:
-  build:
+  build-and-push:
+    # Only run for PRs from the same repository (security measure)
+    if: github.event.pull_request.head.repo.full_name == github.repository
     runs-on: ubuntu-latest
     permissions:
       contents: read
-      packages: read
+      packages: write
 
     steps:
       - uses: actions/checkout@v4
 
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Log in to GitHub Container Registry
+        uses: docker/login-action@v3
+        with:
+          registry: ghcr.io
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
       - name: Extract metadata (tags, labels) for Docker
         id: meta
         uses: docker/metadata-action@v5
         with:
           images: ghcr.io/${{ github.repository }}
           tags: |
-            type=sha
+            type=raw,value=pr-${{ github.event.number }}
 
-      - name: Build Docker image
+      - name: Build and push Docker image
         uses: docker/build-push-action@v5
         with:
           context: .
-          push: false
+          push: true
           tags: ${{ steps.meta.outputs.tags }}
           labels: ${{ steps.meta.outputs.labels }}
+          cache-from: type=gha
+          cache-to: type=gha,mode=max
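
Note on the two decisions this workflow change encodes: the job runs only when the PR head repository equals the base repository, and the image is tagged `pr-<number>`. A rough Python sketch of that logic follows. The function names and the sample repository name are hypothetical, and the lowercasing step is an assumption added because container registries require lowercase repository names.

```python
def is_same_repo_pr(head_repo_full_name: str, repository: str) -> bool:
    """Mirror the workflow's `if:` guard: skip PRs opened from forks."""
    return head_repo_full_name == repository


def pr_image_ref(repository: str, pr_number: int, registry: str = "ghcr.io") -> str:
    """Build the image reference for a PR build.

    Lowercasing is an assumption for this sketch; registry repository
    names must be lowercase.
    """
    return f"{registry}/{repository.lower()}:pr-{pr_number}"


# Hypothetical repository name, used only for illustration.
image_ref = pr_image_ref("Example-Owner/Example-Repo", 42)
```

Using a tag derived from the PR number rather than the commit SHA means each push to the PR overwrites the same tag, so a cleanup job only has one tag per PR to remove.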

.github/workflows/pr-cleanup.yml

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
+name: PR Docker Cleanup
+
+on:
+  pull_request:
+    types: [closed]
+    branches: [ "main" ]
+
+jobs:
+  cleanup:
+    # Only run for PRs from the same repository (security measure)
+    if: github.event.pull_request.head.repo.full_name == github.repository
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      packages: write
+
+    steps:
+      - name: Log in to GitHub Container Registry
+        uses: docker/login-action@v3
+        with:
+          registry: ghcr.io
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Delete PR Docker image
+        continue-on-error: true
+        run: |
+          # Convert repository name to lowercase for Docker registry
+          REPO_LOWER=$(echo "${{ github.repository }}" | \
+            tr '[:upper:]' '[:lower:]')
+          PACKAGE_NAME=$(basename "${REPO_LOWER}")
+          TAG_NAME="pr-${{ github.event.number }}"
+
+          echo "Attempting to delete tag: ${TAG_NAME} for package: ${PACKAGE_NAME}"
+
+          # Determine the correct API base path based on repository owner type
+          OWNER_TYPE="${{ github.event.repository.owner.type }}"
+          OWNER="${{ github.repository_owner }}"
+          if [ "$OWNER_TYPE" = "Organization" ]; then
+            API_BASE="orgs/${OWNER}"
+          else
+            API_BASE="users/${OWNER}"
+          fi
+
+          echo "Using API base path: ${API_BASE}"
+
+          # Get all versions of the package with error handling
+          API_URL="https://api.github.com/${API_BASE}/packages/container/${PACKAGE_NAME}/versions"
+          RESPONSE=$(curl -sSf \
+            -H "Authorization: Bearer ${{ secrets.GITHUB_TOKEN }}" \
+            -H "Accept: application/vnd.github+json" \
+            "${API_URL}" 2>&1) && CURL_EXIT_CODE=0 ||
+          CURL_EXIT_CODE=$?
+          if [ $CURL_EXIT_CODE -ne 0 ]; then
+            echo "Error: Failed to fetch package versions from GitHub API. Response:"
+            echo "$RESPONSE"
+            exit $CURL_EXIT_CODE
+          fi
+          VERSIONS=$(echo "$RESPONSE" | \
+            jq -r '.[] | select(.metadata.container.tags[]? == "'${TAG_NAME}'") | .id')
+
+          if [ -n "$VERSIONS" ]; then
+            for VERSION_ID in $VERSIONS; do
+              echo "Deleting version ID: $VERSION_ID with tag: ${TAG_NAME}"
+              DELETE_URL="${API_URL}/${VERSION_ID}"
+              DELETE_RESPONSE=$(curl -sSf -X DELETE \
+                -H "Authorization: Bearer ${{ secrets.GITHUB_TOKEN }}" \
+                -H "Accept: application/vnd.github+json" \
+                "${DELETE_URL}" 2>&1) && DELETE_EXIT_CODE=0 ||
+              DELETE_EXIT_CODE=$?
+              if [ $DELETE_EXIT_CODE -eq 0 ]; then
+                echo "Successfully deleted Docker image version: ${VERSION_ID}"
+              else
+                echo "Warning: Failed to delete version ID: $VERSION_ID. Response:"
+                echo "$DELETE_RESPONSE"
+              fi
+            done
+          else
+            echo "No Docker image found for tag: ${TAG_NAME}, nothing to clean up"
+          fi
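
Note on the selection logic in this cleanup script: the shell step picks an API base path from the owner type, then filters the package versions down to those carrying the PR tag. The same two steps can be sketched in Python for readers who find the `jq` filter dense. The function names and the sample payload are hypothetical; only the `metadata.container.tags` and `id` fields and the `orgs/` versus `users/` split are taken from the script.

```python
def api_base_path(owner: str, owner_type: str) -> str:
    """Choose the packages API prefix, as the script's if/else does."""
    prefix = "orgs" if owner_type == "Organization" else "users"
    return f"{prefix}/{owner}"


def version_ids_for_tag(versions, tag):
    """Return ids of package versions whose container tags include `tag`.

    Missing `metadata`, `container`, or `tags` keys are treated as
    "no tags", similar to the `[]?` guard in the jq expression.
    """
    selected = []
    for version in versions:
        tags = version.get("metadata", {}).get("container", {}).get("tags", [])
        if tag in tags:
            selected.append(version["id"])
    return selected


# Invented sample payload, shaped like the fields the script reads.
sample_versions = [
    {"id": 101, "metadata": {"container": {"tags": ["pr-7"]}}},
    {"id": 102, "metadata": {"container": {"tags": ["latest", "pr-8"]}}},
    {"id": 103, "metadata": {"container": {"tags": []}}},
    {"id": 104},
]
matching_ids = version_ids_for_tag(sample_versions, "pr-8")
```

Matching on exact tag equality matters here: a substring match on `pr-1` would also select `pr-10` through `pr-19`.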

.gitignore

Lines changed: 4 additions & 1 deletion
@@ -171,4 +171,7 @@ cython_debug/
 .pypirc
 
 # VSCode settings
-.vscode/
+.vscode/
+
+# AI
+.claude/

CLAUDE.md

Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+Wyoming OpenAI is a proxy middleware that bridges the Wyoming protocol with OpenAI-compatible endpoints for ASR (Automatic Speech Recognition) and TTS (Text-to-Speech) services. It enables Wyoming clients like Home Assistant to use various OpenAI-compatible STT/TTS services.
+
+## Development Commands
+
+### Testing
+```bash
+# Install development dependencies
+pip install -e ".[dev]"
+
+# Run all tests
+pytest
+
+# Run tests with coverage
+pytest --cov=wyoming_openai
+
+# Run specific test file
+pytest tests/test_handler.py
+```
+
+### Code Quality
+```bash
+# Run linting with Ruff
+ruff check .
+
+# Auto-fix linting issues
+ruff check . --fix
+```
+
+### Local Development Setup
+```bash
+# Install in editable mode
+pip install -e .
+
+# Run the server locally
+python -m wyoming_openai --uri tcp://0.0.0.0:10300 --stt-models whisper-1 --tts-models tts-1
+```
+
+### Docker Development
+```bash
+# Build and run development container
+docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d --build
+
+# With local services (e.g., Speaches)
+docker compose -f docker-compose.speaches.yml -f docker-compose.dev.yml up -d --build
+```
+
+## Architecture
+
+### Core Components
+
+- **`handler.py`**: Contains `OpenAIEventHandler` - the main Wyoming protocol event handler that processes ASR and TTS requests
+- **`compatibility.py`**: Provides `CustomAsyncOpenAI` class with backend detection and OpenAI API compatibility layer
+- **`__main__.py`**: Entry point with argument parsing and server initialization
+- **`utilities.py`**: Helper functions for audio processing and data handling
+- **`const.py`**: Version constants and configuration
+
+### Key Architecture Patterns
+
+1. **Async Event Handling**: Uses Wyoming's `AsyncEventHandler` to process incoming protocol events
+2. **Backend Abstraction**: `CustomAsyncOpenAI` wraps different backends (OpenAI, Speaches, LocalAI, etc.) with a unified interface
+3. **Stream Processing**: Handles both streaming and non-streaming transcription modes
+4. **Audio Buffer Management**: Collects audio chunks into complete files for processing
+
+### Wyoming Protocol Flow
+
+The handler processes these Wyoming events:
+- `AudioStart/AudioChunk/AudioStop` → STT transcription
+- `Transcribe` → Initiate transcription request
+- `Synthesize` → TTS audio generation
+
+### Backend Support
+
+The `OpenAIBackend` enum defines supported backends:
+- `OPENAI`: Official OpenAI API
+- `SPEACHES`: Local Speaches service
+- `LOCALAI`: LocalAI service
+- `KOKORO_FASTAPI`: Kokoro TTS service
+
+## Configuration
+
+The server accepts both command-line arguments and environment variables. Key configuration includes:
+- STT/TTS API keys and URLs
+- Model lists for STT and TTS
+- Voice configurations
+- Backend-specific settings (temperature, speed, etc.)
+
+## Testing Strategy
+
+Tests are organized by module:
+- `test_handler.py`: Event handler logic
+- `test_compatibility.py`: Backend compatibility
+- `test_utilities.py`: Helper functions
+- `test_integration.py`: End-to-end scenarios
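
Note on the Configuration section of this file: it states that the server accepts both command-line arguments and environment variables. One common way to get that behavior with `argparse` is to use environment values as defaults, so an explicit flag always wins. The sketch below is an assumption about how such a parser might look, not the project's implementation. The flag names `--uri`, `--stt-models`, and `--tts-models` come from the commands shown in the file; the `EXAMPLE_*` variable names, the `nargs="+"` choice, and the default URI are hypothetical.

```python
import argparse
import os


def build_parser(env=None):
    """Build a parser whose defaults fall back to environment variables."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(prog="wyoming_openai_sketch")
    # EXAMPLE_* names are hypothetical placeholders for this sketch.
    parser.add_argument(
        "--uri", default=env.get("EXAMPLE_URI", "tcp://0.0.0.0:10300")
    )
    parser.add_argument(
        "--stt-models",
        nargs="+",
        default=env.get("EXAMPLE_STT_MODELS", "").split(),
    )
    parser.add_argument(
        "--tts-models",
        nargs="+",
        default=env.get("EXAMPLE_TTS_MODELS", "").split(),
    )
    return parser


# STT models come from the (fake) environment, TTS models from the flag.
args = build_parser({"EXAMPLE_STT_MODELS": "whisper-1"}).parse_args(
    ["--tts-models", "tts-1"]
)
print(args.uri, args.stt_models, args.tts_models)
```

Passing the environment as a plain dictionary keeps the parser easy to test without touching the real process environment.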

Dockerfile

Lines changed: 7 additions & 7 deletions
@@ -5,13 +5,13 @@ FROM python:3.12-slim
 ENV PYTHONDONTWRITEBYTECODE=1
 ENV PYTHONUNBUFFERED=1
 
-# Install system dependencies (if any)
-# build-essential and libssl-dev might be needed for some dependencies
-RUN apt-get update && \
-    apt-get install -y --no-install-recommends \
-    build-essential \
-    libssl-dev \
-    && rm -rf /var/lib/apt/lists/*
+# No system dependencies needed - all Python packages have pre-compiled wheels
+# Uncomment the following lines if you need to install system dependencies
+# RUN apt-get update && \
+#     apt-get install -y --no-install-recommends \
+#     build-essential \
+#     libssl-dev \
+#     && rm -rf /var/lib/apt/lists/*
 
 # Set the working directory in the container
 WORKDIR /app
