Streaming hotfix and Chatterbox TTS release (0.3.8)
Incremental TTS streaming is restored and now actively prefetches sentence-level audio while playback is underway, bringing back low-latency responses and adding resilience when upstream synthesis errors out or returns empty audio.
New Features
- Replaced the single-lock approach with a concurrent, sentence-aware pipeline that streams the active synthesis task immediately and buffers the remaining sentences for gap-free playback (see the sketch after this list).
- Introduced a semaphore-based limiter (default 3 concurrent requests) to balance responsiveness against upstream API load.
- Chatterbox TTS Deployment: Added a ready-to-run docker-compose.chatterbox.yml plus README guidance for self-hosted neural voices, including voice cloning.
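
A minimal sketch of the idea, assuming an asyncio-based flow: each incoming sentence is scheduled for synthesis right away, a semaphore caps in-flight requests at three, and audio is consumed in sentence order. The `synthesize` and `play` callables are hypothetical stand-ins for the integration's real OpenAI/Wyoming calls, not its actual API.

```python
import asyncio
from collections.abc import AsyncIterator, Awaitable, Callable


async def stream_tts(
    sentences: AsyncIterator[str],
    synthesize: Callable[[str], Awaitable[bytes]],  # placeholder for the upstream TTS call
    play: Callable[[bytes], Awaitable[None]],       # placeholder for the audio output path
    max_concurrency: int = 3,
) -> None:
    """Play the first sentence as soon as it is ready while later ones prefetch."""
    limiter = asyncio.Semaphore(max_concurrency)    # default of 3 in-flight requests
    queue: asyncio.Queue = asyncio.Queue()

    async def synth(sentence: str) -> bytes:
        async with limiter:
            return await synthesize(sentence)

    async def producer() -> None:
        async for sentence in sentences:
            # Schedule synthesis immediately; audio is still consumed in sentence order.
            await queue.put(asyncio.create_task(synth(sentence)))
        await queue.put(None)                       # signal end of stream

    producer_task = asyncio.create_task(producer())
    while (task := await queue.get()) is not None:
        await play(await task)                      # the next sentences keep prefetching
    await producer_task
```

Because tasks are created as sentences arrive but awaited in order, the first sentence starts playing immediately while the rest synthesize in the background.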
Fixes
- Issue #32 – TTS Streaming Regression: Streaming no longer waits for every sentence to be buffered before playback starts; Kokoro/Speaches setups regain real-time delivery.
- Failure Recovery: A new TtsStreamError and an _abort_synthesis path reset state, emit synthesize-stopped, and prevent duplicate audio after OpenAI errors or empty responses (see the recovery sketch below).
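
A hedged sketch of this recovery flow: TtsStreamError and _abort_synthesis are the names introduced in this release, but the session attributes, the send_event callable, and the dict-style "synthesize-stopped" payload are illustrative placeholders for the integration's real Wyoming event objects.

```python
import asyncio
import logging

_LOGGER = logging.getLogger(__name__)


class TtsStreamError(Exception):
    """Raised when upstream synthesis fails or returns no audio."""


class StreamingSession:
    """Illustrative session object; attribute and event names are hypothetical."""

    def __init__(self, send_event) -> None:
        self._send_event = send_event       # async callable that emits protocol events
        self._tasks: list[asyncio.Task] = []
        self._playing = False

    async def _abort_synthesis(self, reason: str) -> None:
        """Cancel pending work, reset state, and tell the client playback stopped."""
        for task in self._tasks:
            task.cancel()
        self._tasks.clear()
        self._playing = False
        _LOGGER.warning("Aborting TTS stream: %s", reason)
        # Emitting a stop event keeps the client from waiting on, or replaying,
        # audio that will never arrive.
        await self._send_event({"type": "synthesize-stopped"})

    async def handle_audio(self, audio: bytes | None) -> None:
        if not audio:
            # Empty upstream response: abort cleanly instead of emitting duplicates.
            await self._abort_synthesis("empty response from upstream")
            raise TtsStreamError("upstream returned no audio")
        self._playing = True
        # ...forward audio chunks to the client here...
```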
Documentation
- README refresh highlights the concurrent streaming workflow, Chatterbox deployment, and revised feature list.
- .github/copilot-instructions.md captures project conventions for contributors and AI assistants.
Dependencies
- Bumped openai to 2.3.0 and wyoming to 1.8.0 to align with the latest API capabilities.
Full Changelog: v0.3.7...v0.3.8