-
Notifications
You must be signed in to change notification settings - Fork 46
soniox fallback with deepgram #350
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: release
Are you sure you want to change the base?
Conversation
WalkthroughIntroduces an STT fallback and auto-restart system: detection of STT provider (Soniox) errors, scheduling of fallback restarts, per-session STT override, session-parameter persistence, and orchestration to restart voice sessions using a configured fallback provider. Changes
Sequence Diagram(s)sequenceDiagram
participant Agent as Main Process
participant Pool as Process Pool
participant Sub as Voice Subprocess
participant Pipeline as Pipeline (subprocess)
participant STT as STT Provider
participant FallbackMgr as FallbackSessionManager
rect rgb(220,240,255)
Note over Agent,Pool: Start session (pool or direct)
Agent->>Pool: start_voice_session_internal(room_url, token, session_id, params)
Pool->>Sub: allocate/start subprocess (store session params)
end
rect rgb(255,230,230)
Note over Sub,Pipeline: Run pipeline with primary STT
Sub->>Pipeline: create_pipeline_internal(primary_stt)
Pipeline->>STT: open connection (Soniox)
STT-->>Pipeline: ErrorFrame (connection/error)
Pipeline->>FallbackMgr: on_pipeline_error(error_frame)
FallbackMgr->>FallbackMgr: PipelineRestartManager.should_enable_fallback(...)
alt enable fallback
FallbackMgr->>Sub: emit FALLBACK_SESSION_END (stdout)
end
end
rect rgb(240,255,240)
Note over Pool,Agent: Auto-restart with fallback STT
Pool->>Pool: parse FALLBACK_SESSION_END, retrieve stored params
Pool->>Agent: _restart_fallback_session_with_params(...) -> start_voice_session_internal(..., fallback_stt)
Agent->>Pool: start subprocess with fallback provider
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (2)
app/agents/voice/automatic/tools/juspay/analytics.py (1)
726-729: Consider more specific exception handling.The code uses bare
except Exception:handlers in three locations (lines 726, 743, 755). While the current approach works, consider these optional improvements:
- Line 726: Could catch specific exceptions like
ValueError,TypeError,KeyErrorfor payload construction- Lines 743, 755: Current bare catch is acceptable for logging fallback scenarios
Based on learnings: Static analysis tools flagged these as potential improvements.
Also applies to: 743-744, 755-756
app/agents/voice/automatic/stt/__init__.py (1)
122-128: Normalize fallback provider casingPlease normalize
fallback_stt_providerbefore the provider switch. Without this, a mixed-case value (e.g."Deepgram") will miss every branch and silently drop to the Google default.- effective_stt_provider = ( - fallback_stt_provider if fallback_stt_provider else config.STT_PROVIDER - ) + effective_stt_provider = ( + fallback_stt_provider.lower() + if fallback_stt_provider + else config.STT_PROVIDER + )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
app/agents/voice/automatic/__init__.py(8 hunks)app/agents/voice/automatic/services/fallback/pipeline_restart_manager.py(1 hunks)app/agents/voice/automatic/services/fallback/session_manager.py(1 hunks)app/agents/voice/automatic/stt/__init__.py(5 hunks)app/agents/voice/automatic/tools/juspay/analytics.py(4 hunks)app/api/routers/automatic.py(2 hunks)app/core/config.py(1 hunks)app/helpers/automatic/process_pool.py(7 hunks)app/main.py(2 hunks)docs/STT_FALLBACK_IMPLEMENTATION.md(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (6)
app/api/routers/automatic.py (1)
app/helpers/automatic/process_pool.py (4)
get_voice_agent_pool(899-908)store_session_parameters(863-871)get_process(588-638)return_process(716-760)
app/agents/voice/automatic/__init__.py (9)
app/agents/voice/automatic/services/fallback/pipeline_restart_manager.py (2)
PipelineRestartManager(10-83)should_enable_fallback(50-83)app/agents/voice/automatic/services/fallback/session_manager.py (3)
FallbackSessionContext(13-24)get_fallback_session_manager(154-168)register_fallback_session(35-47)app/agents/voice/automatic/prompts/__init__.py (1)
get_system_prompt(26-59)app/agents/voice/automatic/types/decoders.py (1)
decode_tts_provider(4-11)app/agents/voice/automatic/features/llm_wrapper.py (1)
create_summarizing_context(125-139)app/agents/voice/automatic/processors/ptt_vad_filter.py (1)
PTTVADFilter(28-105)app/agents/voice/automatic/processors/llm_spy.py (1)
LLMSpyProcessor(126-334)app/agents/voice/automatic/services/mem0/memory.py (1)
ImprovedMem0MemoryService(28-530)app/agents/voice/automatic/stt/__init__.py (1)
get_stt_service(93-243)
app/agents/voice/automatic/stt/__init__.py (1)
app/agents/voice/breeze_buddy/workflows/order_confirmation/utils.py (1)
get_stt_service(11-63)
app/helpers/automatic/process_pool.py (2)
app/api/routers/automatic.py (1)
start_voice_session_internal(137-274)app/agents/voice/automatic/services/fallback/session_manager.py (2)
get_fallback_session_manager(154-168)create_restart_args(117-135)
app/agents/voice/automatic/tools/juspay/analytics.py (1)
app/core/transport/http_client.py (1)
create_http_client(32-51)
app/main.py (1)
app/helpers/automatic/process_pool.py (1)
store_session_parameters(863-871)
🪛 LanguageTool
docs/STT_FALLBACK_IMPLEMENTATION.md
[uncategorized] ~5-~5: Possible missing article found.
Context: ...lementation ## 🎯 Objective Implement automatic fallback mechanism for Speech-to-Text (...
(AI_HYDRA_LEO_MISSING_AN)
[grammar] ~322-~322: The singular proper name ‘Singleton’ must be used with a third-person or a past tense verb.
Context: ...❌ Alternative Approaches**: - Singleton pattern across processes (memory isolation issu...
(HE_VERB_AGR)
[typographical] ~595-~595: If specifying a range, consider using an en dash instead of a hyphen.
Context: ...back conditions - Session Recovery: 2-3 seconds for complete auto-restart - **S...
(HYPHEN_TO_EN)
🪛 markdownlint-cli2 (0.18.1)
docs/STT_FALLBACK_IMPLEMENTATION.md
29-29: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
63-63: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
212-212: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
499-499: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
535-535: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
545-545: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
552-552: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
559-559: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
🪛 Ruff (0.14.3)
app/api/routers/automatic.py
191-191: Consider moving this statement to an else block
(TRY300)
202-202: Do not catch blind exception: Exception
(BLE001)
248-248: Consider [arg_name, *value] instead of concatenation
Replace with [arg_name, *value]
(RUF005)
256-256: subprocess call: check for execution of untrusted input
(S603)
272-272: Do not catch blind exception: Exception
(BLE001)
app/agents/voice/automatic/services/fallback/session_manager.py
109-109: Do not catch blind exception: Exception
(BLE001)
app/agents/voice/automatic/__init__.py
264-264: Unused function argument: voice_name
(ARG001)
264-264: Unused function argument: mode
(ARG001)
357-357: Do not catch blind exception: Exception
(BLE001)
663-663: Unpacked variable pipeline is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
663-663: Unpacked variable context_aggregator is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
942-942: Use explicit conversion flag
Replace with conversion flag
(RUF010)
952-952: Consider moving this statement to an else block
(TRY300)
954-954: Do not catch blind exception: Exception
(BLE001)
app/helpers/automatic/process_pool.py
363-363: Do not catch blind exception: Exception
(BLE001)
537-537: Do not catch blind exception: Exception
(BLE001)
583-583: Do not catch blind exception: Exception
(BLE001)
app/agents/voice/automatic/tools/juspay/analytics.py
726-726: Do not catch blind exception: Exception
(BLE001)
743-743: Do not catch blind exception: Exception
(BLE001)
755-755: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (2)
app/agents/voice/automatic/tools/juspay/analytics.py (2)
598-757: Verify relevance to PR objectives.The changes in this file are purely formatting improvements to logging statements, which appear unrelated to the PR's stated objective of adding "soniox fallback with deepgram" (STT provider fallback functionality). This file handles Juspay payment analytics and offer management.
Can you clarify how these logging format changes relate to the STT fallback feature? If these are incidental improvements, consider moving them to a separate PR focused on code quality/formatting to maintain clear PR scope.
598-600: LGTM: Formatting improvements enhance readability.The multi-line formatting of logger statements and the headers dictionary improves code readability without altering behavior. The changes are consistent with good Python style practices.
Also applies to: 630-632, 720-723, 727-729, 740-742, 748-750, 757-757
4ce33c7 to
37ab7cc
Compare
37ab7cc to
aeb9aec
Compare
|
Tip For best results, initiate chat on the files or code changes. @Devansh-1218, hello! How can I assist you with this PR? I can help with:
What would you like me to do? |
|
@coderabbitai |
|
@Devansh-1218, I'll perform a comprehensive review of this PR implementing the Soniox-to-Deepgram STT fallback functionality. I'll analyze the architecture, implementation, error handling, and integration points across all the modified files. ✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (4)
app/helpers/automatic/process_pool.py (2)
321-364: Consider a more robust IPC delimiter for FALLBACK_SESSION_END.The colon-delimited format works (error_reason is the last field so embedded colons don't break parsing), but it's fragile if the protocol evolves or if field ordering changes. Consider using a more robust approach like JSON encoding or a less common delimiter sequence.
Example with JSON:
- elif "FALLBACK_SESSION_END:" in line: + elif line.startswith("FALLBACK_SESSION_END:"): logger.info( f"Process {voice_process.process_id[:8]} fallback session ended" ) - # Parse fallback session information try: - parts = line.split(":", 4) # Split into max 5 parts - if len(parts) >= 5: - session_id = parts[1] - original_stt = parts[2] - fallback_stt = parts[3] - error_reason = parts[4] + payload = line[len("FALLBACK_SESSION_END:") :] + data = json.loads(payload) + session_id = data["session_id"] + original_stt = data["original_stt"] + fallback_stt = data["fallback_stt"] + error_reason = data["error_reason"]This would require updating the corresponding
app/agents/voice/automatic/__init__.pyat line 943-946.
558-603: Remove unused_restart_fallback_sessionmethod (lines 558-603).This method is never invoked anywhere in the codebase. The parallel method
_restart_fallback_session_with_params(invoked at line 347) handles session restarts. Remove the duplicate to reduce maintenance burden.app/agents/voice/automatic/__init__.py (2)
263-410: Remove unused function parameters.The
voice_nameandmodeparameters are not used within the function body. Consider removing them or documenting why they're reserved for future use.async def create_pipeline_internal( - args, transport, stt, tts, llm, tools, rtvi, voice_name, mode + args, transport, stt, tts, llm, tools, rtvi ):You'll also need to update the call site at line 663:
- pipeline, task, ptt_vad_filter, context_aggregator = await create_pipeline_internal( - args, transport, stt, tts, llm, tools, rtvi, voice_name, mode - ) + pipeline, task, ptt_vad_filter, context_aggregator = await create_pipeline_internal( + args, transport, stt, tts, llm, tools, rtvi + )
662-665: Prefix unused unpacked variables with underscores.The unpacked variables
pipelineandcontext_aggregatorare not used after line 663. Follow the Python convention of prefixing unused variables with underscores.- # Create pipeline using the extracted function - pipeline, task, ptt_vad_filter, context_aggregator = await create_pipeline_internal( + _pipeline, task, ptt_vad_filter, _context_aggregator = await create_pipeline_internal( args, transport, stt, tts, llm, tools, rtvi, voice_name, mode )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
app/agents/voice/automatic/__init__.py(8 hunks)app/agents/voice/automatic/services/fallback/pipeline_restart_manager.py(1 hunks)app/agents/voice/automatic/services/fallback/session_manager.py(1 hunks)app/agents/voice/automatic/stt/__init__.py(5 hunks)app/agents/voice/automatic/tools/juspay/analytics.py(4 hunks)app/api/routers/automatic.py(2 hunks)app/core/config.py(1 hunks)app/helpers/automatic/process_pool.py(7 hunks)app/main.py(2 hunks)docs/STT_FALLBACK_IMPLEMENTATION.md(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- app/core/config.py
- app/agents/voice/automatic/services/fallback/pipeline_restart_manager.py
🧰 Additional context used
🪛 LanguageTool
docs/STT_FALLBACK_IMPLEMENTATION.md
[uncategorized] ~5-~5: Possible missing article found.
Context: ...lementation ## 🎯 Objective Implement automatic fallback mechanism for Speech-to-Text (...
(AI_HYDRA_LEO_MISSING_AN)
[grammar] ~395-~395: The singular proper name ‘Singleton’ must be used with a third-person or a past tense verb.
Context: ...❌ Alternative Approaches**: - Singleton pattern across processes (memory isolation issu...
(HE_VERB_AGR)
[typographical] ~702-~702: If specifying a range, consider using an en dash instead of a hyphen.
Context: ...back conditions - Session Recovery: 2-3 seconds for complete auto-restart - **S...
(HYPHEN_TO_EN)
🪛 markdownlint-cli2 (0.18.1)
docs/STT_FALLBACK_IMPLEMENTATION.md
29-29: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
63-63: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
225-225: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
606-606: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
642-642: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
652-652: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
659-659: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
666-666: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
🪛 Ruff (0.14.3)
app/agents/voice/automatic/services/fallback/session_manager.py
109-109: Do not catch blind exception: Exception
(BLE001)
app/api/routers/automatic.py
191-191: Consider moving this statement to an else block
(TRY300)
202-202: Do not catch blind exception: Exception
(BLE001)
248-248: Consider [arg_name, *value] instead of concatenation
Replace with [arg_name, *value]
(RUF005)
256-256: subprocess call: check for execution of untrusted input
(S603)
271-271: Do not catch blind exception: Exception
(BLE001)
app/helpers/automatic/process_pool.py
363-363: Do not catch blind exception: Exception
(BLE001)
553-553: Do not catch blind exception: Exception
(BLE001)
599-599: Do not catch blind exception: Exception
(BLE001)
app/agents/voice/automatic/__init__.py
264-264: Unused function argument: voice_name
(ARG001)
264-264: Unused function argument: mode
(ARG001)
357-357: Do not catch blind exception: Exception
(BLE001)
663-663: Unpacked variable pipeline is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
663-663: Unpacked variable context_aggregator is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
944-944: Use explicit conversion flag
Replace with conversion flag
(RUF010)
954-954: Consider moving this statement to an else block
(TRY300)
956-956: Do not catch blind exception: Exception
(BLE001)
app/agents/voice/automatic/tools/juspay/analytics.py
726-726: Do not catch blind exception: Exception
(BLE001)
743-743: Do not catch blind exception: Exception
(BLE001)
755-755: Do not catch blind exception: Exception
(BLE001)
| pool.store_session_parameters( | ||
| session_id, {"room_url": room_url, "token": token, **session_params} | ||
| ) | ||
|
|
||
| # Try to get process from pool first | ||
| try: | ||
| voice_process = await pool.get_process(session_id) | ||
|
|
||
| try: | ||
| # Configure the pre-warmed process | ||
| session_config = { | ||
| "room_url": room_url, | ||
| "token": token, | ||
| "session_id": session_id, | ||
| **session_params, | ||
| } | ||
|
|
||
| config_json = json.dumps(session_config) + "\n" | ||
| voice_process.process.stdin.write(config_json.encode("utf-8")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix session identifier override during fallback restarts
When we hit a fallback path, session_params already contains the original session metadata (including session_id, room_url, and token) that we previously stored. Because the dictionary unpacking places session_params last, any new identifiers passed in via the function arguments get overwritten. That means a restart meant to run under a fresh session_id (or refreshed Daily tokens) still boots the child process with the old identifiers, which breaks tracking and can collide with the process pool’s bookkeeping.
Reversing the merge order so that the freshly supplied values win solves this and keeps both pool and direct-launch flows consistent.
Please apply something along these lines:
- pool.store_session_parameters(
- session_id, {"room_url": room_url, "token": token, **session_params}
- )
+ merged_params = {
+ **session_params,
+ "room_url": room_url,
+ "token": token,
+ "session_id": session_id,
+ }
+ pool.store_session_parameters(session_id, merged_params)
...
- session_config = {
- "room_url": room_url,
- "token": token,
- "session_id": session_id,
- **session_params,
- }
+ session_config = {
+ **session_params,
+ "room_url": room_url,
+ "token": token,
+ "session_id": session_id,
+ }
...
- direct_session_config = {
- "room_url": room_url,
- "token": bot_token,
- "session_id": session_id,
- "client_sid": client_sid,
- **session_params,
- }
+ direct_session_config = {
+ **session_params,
+ "room_url": room_url,
+ "token": bot_token,
+ "session_id": session_id,
+ "client_sid": client_sid,
+ }This guarantees the restart honors whichever session ID/token the orchestrator wants to use.
Also applies to: 206-315
🤖 Prompt for AI Agents
In app/api/routers/automatic.py around lines 158 to 176 (and similarly for the
fallback restart block between 206 and 315), the current dict merge puts
session_params last so stored identifiers in session_params override the fresh
function args; change the merge order so the explicit fresh values win by
unpacking session_params first and then the keys { "room_url": room_url,
"token": token, "session_id": session_id } after it (i.e., session_params,
**overrides) when building the session dict for pool.store_session_parameters
and when constructing session_config before writing to the child process stdin;
ensure both occurrences are updated to prevent using stale session_id/token on
restarts.
Summary by CodeRabbit
New Features
Documentation
Reliability