Transform your AI conversations with natural, hands-free voice interaction.
Say, Pi is a browser extension that brings multilingual voice capabilities to AI chatbots. Built with enterprise-grade architecture and modern web standards, it delivers accurate speech recognition and natural text-to-speech across Pi.ai, Claude.ai, and ChatGPT.
Watch Demo • Install for Chrome • Install for Firefox • Documentation
Say, Pi is a three-tier system that enhances AI chat experiences with sophisticated voice interaction:
┌─────────────────────────────────────────────────────────────┐
│ │
│ Browser Extension (saypi-userscript) │
│ ├─ Content Script: Voice UI, chatbot integration │
│ ├─ Service Worker: Auth, messaging, lifecycle management │
│ ├─ Offscreen Documents: Audio processing under strict CSP │
│ └─ Modern Stack: TypeScript, XState, Webpack, WXT │
│ │
└─────────────────────────────────────────────────────────────┘
↕
┌─────────────────────────────────────────────────────────────┐
│ │
│ API Server (saypi-api) │
│ ├─ FastAPI/Python: Speech-to-text, text-to-speech │
│ ├─ OpenAI Whisper: Accurate transcription │
│ ├─ Multi-voice TTS: Natural speech synthesis │
│ └─ Redis: Caching and quota management │
│ │
└─────────────────────────────────────────────────────────────┘
↕
┌─────────────────────────────────────────────────────────────┐
│ │
│ Web Application (saypi-saas) │
│ ├─ Nuxt.js/Vue.js: Subscription management, user dashboard │
│ ├─ Multi-provider billing: Stripe, LemonSqueezy, Chargebee │
│ ├─ PostgreSQL/Prisma: User accounts, entitlements │
│ └─ i18n: Multi-language support │
│ │
└─────────────────────────────────────────────────────────────┘
This repository (saypi-userscript) contains the browser extension—the client-side component that users install to enable voice chat with AI assistants.
- Real-time speech detection using Silero VAD (ONNX models)
- Intelligent segmentation for natural conversation flow
- Dual-mode operation: Offscreen for Chrome/Edge, onscreen fallback for Firefox/Safari
- Works across all supported browsers and chatbot platforms
- Natural voice synthesis with multiple voice options
- CSP-compliant audio playback via Chrome Offscreen Documents API
- Chatbot-specific integration: Native support for Pi.ai, Claude.ai (desktop Chrome/Edge), ChatGPT (all browsers)
- See Browser Compatibility Matrix for platform-specific details
- OpenAI Whisper integration for high-quality transcription
- Multi-language support with automatic language detection
- Real-time feedback during voice input
- Chatbot abstraction layer with pluggable implementations
- Progressive DOM observation with backoff for SPA compatibility
- Event-driven architecture for loose coupling between components
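The chatbot abstraction layer mentioned above can be pictured roughly as follows. This is a minimal sketch, not the extension's actual code: the class names `PiChatbot` and `ClaudeChatbot`, the `matches` and `promptSelector` methods, and the selector strings are all hypothetical placeholders chosen for illustration.

```typescript
// Hypothetical sketch of a pluggable chatbot layer.
// Method names and selectors are illustrative assumptions.
abstract class Chatbot {
  abstract readonly name: string;
  /** True when this implementation handles the given hostname. */
  abstract matches(hostname: string): boolean;
  /** CSS selector for the prompt input (placeholder values below). */
  abstract promptSelector(): string;
}

class PiChatbot extends Chatbot {
  readonly name = "Pi";
  matches(hostname: string): boolean {
    return hostname === "pi.ai" || hostname.endsWith(".pi.ai");
  }
  promptSelector(): string {
    return "textarea"; // placeholder, not the real selector
  }
}

class ClaudeChatbot extends Chatbot {
  readonly name = "Claude";
  matches(hostname: string): boolean {
    return hostname === "claude.ai" || hostname.endsWith(".claude.ai");
  }
  promptSelector(): string {
    return "[contenteditable]"; // placeholder, not the real selector
  }
}

// New chatbots plug in by adding an instance to this list.
const chatbotRegistry: Chatbot[] = [new PiChatbot(), new ClaudeChatbot()];

/** Factory-style detection: the first implementation that matches wins. */
function detectChatbot(hostname: string): Chatbot | undefined {
  return chatbotRegistry.find((bot) => bot.matches(hostname));
}
```

Under these assumptions, `detectChatbot("claude.ai")` yields the Claude implementation, and an unsupported hostname yields `undefined`, so the caller can skip decoration on pages it does not recognize.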
This extension embodies enterprise development standards applied to browser extensions:
- Separation of Concerns: Modular architecture with clear boundaries (audio, VAD, TTS, DOM, auth)
- Progressive Enhancement: Graceful degradation across browsers and CSP environments
- Type Safety: Full TypeScript coverage with strict compilation
- State Management: XState finite state machines for predictable audio/UI flows
- Event-Driven Design: EventBus for decoupled module communication
- Testability: Dual test framework (Jest for JavaScript, Vitest for TypeScript) with 80%+ coverage
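To illustrate why finite state machines make audio flows predictable, here is a hand-rolled transition table in plain TypeScript. It deliberately does not use XState's API, and the state and event names (`idle`, `listening`, `transcribing`, `START`, and so on) are assumptions made for the example rather than the names used in the extension's machines.

```typescript
// Hand-rolled finite state machine sketch (not XState).
// State and event names are hypothetical.
type AudioState = "idle" | "listening" | "transcribing";
type AudioEvent = "START" | "SPEECH_END" | "TRANSCRIBED" | "STOP";

// Every legal transition is listed explicitly; anything else is ignored.
const audioTransitions: Record<AudioState, Partial<Record<AudioEvent, AudioState>>> = {
  idle: { START: "listening" },
  listening: { SPEECH_END: "transcribing", STOP: "idle" },
  transcribing: { TRANSCRIBED: "listening", STOP: "idle" },
};

/** Returns the next state, or the current state if the event is not allowed. */
function nextAudioState(state: AudioState, event: AudioEvent): AudioState {
  return audioTransitions[state][event] ?? state;
}

/** Folds a sequence of events over an initial state. */
function runAudioEvents(initial: AudioState, events: AudioEvent[]): AudioState {
  return events.reduce(nextAudioState, initial);
}
```

Because illegal events leave the state unchanged, a stray `SPEECH_END` while idle cannot push the flow into transcription, which is the property that makes this style easy to reason about and test.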
| Layer | Technologies |
|---|---|
| Build System | Webpack 5, WXT (Web Extension Tooling), TypeScript 5.2+ |
| State Management | XState v4 (FSM), RxJS (reactive streams) |
| Audio Processing | Web Audio API, AudioWorklet, ONNX Runtime Web |
| Machine Learning | Silero VAD models (ONNX), WASM acceleration |
| Browser APIs | Offscreen Documents (MV3), Service Workers, chrome.* APIs |
| Testing | Jest (JavaScript), Vitest (TypeScript/ESM), JSDOM |
| Code Quality | ESLint, Prettier, TypeScript strict mode |
src/
├── chatbots/ # Chatbot-specific implementations
│ ├── ChatbotService.ts # Factory for chatbot detection
│ ├── Chatbot.ts # Abstract base class
│ ├── Claude.ts # Claude.ai DOM integration
│ ├── Pi.ts # Pi.ai DOM integration
│ └── bootstrap.ts # DOMObserver with progressive search
│
├── audio/ # Audio pipeline
│ ├── AudioModule.js # Main coordinator
│ ├── OffscreenAudioBridge.js # Content ↔ offscreen bridge
│ ├── AudioInputMachine.ts # XState: mic → VAD → STT
│ └── AudioOutputMachine.ts # XState: TTS → playback
│
├── vad/ # Voice Activity Detection
│ ├── OffscreenVADClient.ts # Chrome/Edge (offscreen)
│ ├── OnscreenVADClient.ts # Firefox/Safari fallback
│ └── custom-model-fetcher.js # Firefox ArrayBuffer realm fix
│
├── tts/ # Text-to-Speech
│ ├── TextToSpeechService.ts # Core TTS management
│ ├── ChatHistoryManager.ts # TTS queue for chat messages
│ └── VoiceMenuUIManager.ts # Voice selection UI
│
├── dom/ # DOM Management
│ ├── Observation.ts # Found/decorated element tracking
│ └── bootstrap.ts # Progressive search with backoff
│
├── offscreen/ # Offscreen documents (CSP bypass)
│ ├── audio_handler.ts # Audio playback under CSP
│ ├── vad_handler.ts # VAD processing
│ └── media_coordinator.ts # Message routing
│
├── compat/ # Browser compatibility
│ ├── BrowserCompatibilityModule.ts # Detection logic
│ └── CompatibilityNotificationUI.ts # User-facing notices
│
└── svc/ # Background services
├── background.ts # Service worker (auth, lifecycle)
└── JwtManager.ts # Token management with refresh
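The "progressive search with backoff" noted for `bootstrap.ts` could work along the lines below. This is a sketch under stated assumptions: the function names, the exponential schedule, and the parameters are hypothetical, and the real module may combine retries with a `MutationObserver` or use a different schedule.

```typescript
/**
 * Builds a capped exponential delay schedule.
 * All parameters are illustrative assumptions.
 */
function backoffDelays(baseMs: number, factor: number, maxMs: number, attempts: number): number[] {
  const delays: number[] = [];
  let delay = baseMs;
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(delay, maxMs));
    delay *= factor;
  }
  return delays;
}

/**
 * Retries a lookup (for example, a DOM query) until it returns a value
 * or the delay schedule is exhausted.
 */
async function findWithBackoff<T>(
  lookup: () => T | null,
  delays: number[],
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T | null> {
  const immediate = lookup();
  if (immediate !== null) return immediate;
  for (const delay of delays) {
    await sleep(delay);
    const found = lookup();
    if (found !== null) return found;
  }
  return null; // caller decides how to degrade
}
```

In a content script, `lookup` might wrap `document.querySelector` for a chatbot's prompt element. Widening the gaps between attempts keeps the cost low on single-page apps that render their UI late.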
The build system produces an optimized bundle under 2MB (excluding binary assets):
- Core bundles: 8 files, ~1.2MB total (main: 450KB, vendors: 700KB)
- Dynamic chunks: Lazy-loaded for chatbot-specific features
- Binary assets: 4 WASM files (37MB), 3 ONNX models (5MB)—excluded from Firefox AMO size limits
- Code splitting: Vendor libraries separated for optimal caching
See CLAUDE.md for detailed bundle analysis.
The extension implements platform-specific adaptations for Chrome, Edge, Firefox, and Safari:
- Chrome/Edge (Desktop): Full feature support via Offscreen Documents API
- Firefox (Desktop/Mobile): VAD + STT on all sites; TTS on Pi.ai and ChatGPT
- Safari/Mobile Chromium: Partial support with graceful degradation
See Browser Compatibility Matrix for complete platform/feature grid.
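The platform rules above can be encoded as small pure functions. The sketch below is illustrative only: the type names and function names are hypothetical, Chrome and Edge are assumed to be desktop builds, and Safari is reported as `"unknown"` because this README describes it only as partial support.

```typescript
// Hypothetical encoding of the platform rules stated in this README.
type BrowserName = "chrome" | "edge" | "firefox" | "safari";
type ChatSite = "pi.ai" | "claude.ai" | "chatgpt";
type Support = "yes" | "no" | "unknown";

/** Offscreen documents on Chrome/Edge, onscreen fallback elsewhere. */
function vadMode(browser: BrowserName): "offscreen" | "onscreen" {
  return browser === "chrome" || browser === "edge" ? "offscreen" : "onscreen";
}

/** TTS availability per browser and site (desktop Chrome/Edge assumed). */
function ttsSupport(browser: BrowserName, site: ChatSite): Support {
  if (browser === "chrome" || browser === "edge") {
    return "yes"; // full feature support via the Offscreen Documents API
  }
  if (browser === "firefox") {
    // Firefox: TTS on Pi.ai and ChatGPT, but not on Claude.ai
    return site === "claude.ai" ? "no" : "yes";
  }
  return "unknown"; // Safari: partial support, details not given here
}
```

Keeping these decisions in one place means UI code can ask a single question, such as `ttsSupport("firefox", "claude.ai")`, instead of scattering browser checks across modules.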
Multi-layer testing ensures reliability across browsers and chatbot platforms:
npm test # Run all tests (Jest + Vitest)
npm run test:jest # JavaScript integration tests
npm run test:vitest # TypeScript unit tests
- Unit tests: Individual module behavior (Vitest for TS, Jest for JS)
- Integration tests: Audio pipeline, VAD/STT flows, DOM observation
- Mock implementations: Chrome extension APIs, chatbot DOM structures
- JSDOM environment: DOM manipulation testing without browser
Install from your browser's extension store:
- Chrome/Edge: Chrome Web Store
- Firefox: Firefox Add-ons
Prerequisites:
- Node.js ≥22.0.0, npm ≥10.0.0
- Python 3.11.12 (for ONNX pruning during builds)
- jq for Firefox builds (brew install jq / apt install jq)
Quick Start:
# Clone and install dependencies
git clone <repository>
cd saypi-userscript
npm install
# Set up environment (copy and edit)
cp .env.example .env
cp .env.production.example .env.production
# Start development server with hot reload
npm run dev # Chrome (default)
npm run dev:firefox # Firefox MV2
# Build for production
npm run build # Chrome/Edge
npm run build:firefox # Firefox (with packaging)
Environment Management:
# Quick switch between local and remote API servers
npm run switch # Toggle between local/remote
npm run switch local # localhost:3000, 127.0.0.1:5001
npm run switch remote # saypi.ai, api.saypi.ai
npm run switch status # Check current config
Python Tooling (for ONNX model pruning):
# One-time setup per machine (creates .venv/)
npm run setup:python
# Runs automatically during builds, or manually:
npm run copy-onnx # Copy + prune ONNX models
See CLAUDE.md for complete development workflow and build commands.
- CLAUDE.md - Architecture overview, build commands, testing, module organization
- CONTRIBUTING.md - Project status, contribution policy, feedback channels
- Browser Compatibility Matrix - Platform/feature support grid with technical explanations
Build Verification:
This extension uses Webpack for bundling. To verify the build matches the submitted code:
# Install dependencies
npm install
# Configure environment (copy templates, edit as needed)
cp .env.example .env
cp .env.production.example .env.production
# Build extension
npm run build
# Package for Firefox
./package-extension.sh firefox
Build Environment:
- Node.js v22 LTS (compatible with Mozilla review environment)
- npm v10
- See README:94-136 for detailed instructions
The dist/ output matches the submitted XPI exactly. All bundling is deterministic.
| Browser | Pi.ai | Claude.ai | ChatGPT | Notes |
|---|---|---|---|---|
| Chrome/Edge Desktop | ✅ Full | ✅ Full | ✅ Full | All features via Offscreen API |
| Firefox Desktop | ✅ Full | Partial (VAD + STT only) | ✅ Full | No TTS on Claude (CSP + no offscreen) |
| Firefox Mobile | ✅ Full | Partial (VAD + STT only) | ✅ Full | Same as desktop |
| Safari/Kiwi | ❌ Limited | | | Platform-specific limitations |
Legend: ✅ Full support • ❌ Limited support
See Browser Compatibility Matrix for technical details and workarounds.
# Development
npm run dev # Chrome dev server with hot reload
npm run dev:firefox # Firefox dev (opens temp profile)
# Building
npm run build # Production build (validates + copies ONNX)
npm run build:firefox # Firefox build + packaging
# Testing
npm test # All tests (Jest + Vitest)
npm run test:vitest:watch # Watch mode for rapid iteration
# Environment
npm run validate:env # Check .env files (auto-runs before dev/build)
npm run switch # Toggle local/remote API servers
Save VAD segments to disk (for debugging voice detection):
# In .env (not .env.production):
KEEP_SEGMENTS=true
# Start dev server
npm run dev
# Segments saved to: ~/Downloads/SayPiSegments/
# Format: saypi-segment_<start>_to_<end>_<duration>ms.wav
Note: The downloads permission is added only in dev builds; production builds ignore this setting.
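For scripts that post-process saved segments, the filename pattern above can be reproduced with a small helper. This is a sketch built on assumptions: it treats `<start>` and `<end>` as millisecond timestamps and `<duration>` as their difference, which this README does not state explicitly, and the function name is hypothetical.

```typescript
/**
 * Builds a segment filename following the documented pattern.
 * Assumes start/end are millisecond timestamps (not confirmed by the README).
 */
function segmentFilename(startMs: number, endMs: number): string {
  if (endMs < startMs) {
    throw new RangeError("endMs must not precede startMs");
  }
  const durationMs = endMs - startMs;
  return `saypi-segment_${startMs}_to_${endMs}_${durationMs}ms.wav`;
}
```

If the assumption holds, a segment that starts at 1000 and ends at 2500 would be named `saypi-segment_1000_to_2500_1500ms.wav`.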
The extension supports 30+ languages with a flexible translation workflow:
# Translate all content (UI strings + store descriptions)
npm run translate
# Verify setup without running translations
npm run translate:check
# Skip confirmation prompts
npm run translate -- --yes
When you only need to translate specific keys (e.g., after updating marketing copy):
# Translate specific keys from messages.json
python3 tools/i18n/i18n-translate-keys.py appName appDescription
# Translate multiple keys at once
python3 tools/i18n/i18n-translate-keys.py -y callInProgress callNotStarted voiceSettings
# Translate only description.txt (store listings)
python3 tools/i18n/i18n-translate-release-text.py -y
When to use which tool:
- npm run translate - Full translation before major releases (slow but comprehensive)
- i18n-translate-keys.py - Update specific UI strings (fast, targeted)
- i18n-translate-release-text.py - Update store descriptions only
What gets translated:
- _locales/{locale}/messages.json - UI strings (via translate-cli for full sync, OpenAI for targeted)
- _locales/{locale}/description.txt - Browser store listings (via OpenAI API)
Requirements:
- translate-cli (Go): go install github.com/quailyquaily/translate-cli@latest
- OpenAI API key: export OPENAI_API_KEY="..."
- Python 3.6+ with requests library: pip install requests
This is proprietary software. The source code is publicly visible for transparency and security review, but is not licensed for modification, distribution, or use beyond the terms in LICENSE.
We are not accepting external code contributions. However, we value feedback:
- Bug reports: Open a GitHub issue with reproduction steps
- Feature suggestions: Discuss via issues or contact channels
- Security concerns: Email info@saypi.ai
See CONTRIBUTING.md for details.
Questions? Feedback? We'd love to hear from you:
- Email: info@saypi.ai
- Twitter/X: @saypi_ai
- Facebook: Say, Pi
- Discord: Find us on the Pi Party server 🥧
Say, Pi is an unofficial enhancement for Pi.ai, Claude.ai, and ChatGPT. It is not affiliated with Inflection AI, Anthropic, or OpenAI. Use responsibly and respect each platform's terms of service. Compatibility with future platform updates is not guaranteed.
Built with ❤️ by the Say, Pi team • www.saypi.ai
