feat: use vad #867

Open · wants to merge 2 commits into `main`
Conversation

@quake (Contributor) commented May 22, 2025

try to resolve #857

Since the voice_activity_detector crate already provides a nice wrapper around the Silero-based VAD, I thought we could simply use it, so I wrote this PR. It changes the chunker crate and related components, replacing the old Predictor-based approach with a new VAD implementation for processing audio streams. The unit test test_chunker generates a WAV file that sounds good.
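The core idea described above — drop frames the VAD labels as silence and keep only speech — can be sketched std-only. `mock_is_speech` is a hypothetical stand-in for the Silero model from the `voice_activity_detector` crate, which returns per-frame speech probabilities rather than this energy heuristic.

```rust
// Sketch of VAD-gated chunking with a mocked detector (std-only).
const FRAME: usize = 512; // samples per VAD frame, matching the PR's chunk size

// Hypothetical stand-in: treat a frame as speech when its mean energy
// clears a threshold. The real crate runs a Silero model instead.
fn mock_is_speech(frame: &[f32]) -> bool {
    let energy: f32 = frame.iter().map(|s| s * s).sum::<f32>() / frame.len() as f32;
    energy > 0.01
}

// Keep only frames labeled as speech, concatenated in order.
fn keep_speech(samples: &[f32]) -> Vec<f32> {
    samples
        .chunks(FRAME)
        .filter(|f| mock_is_speech(f))
        .flatten()
        .copied()
        .collect()
}

fn main() {
    let silence = vec![0.0f32; FRAME];
    let speech = vec![0.5f32; FRAME];
    let mixed: Vec<f32> = silence.iter().chain(&speech).chain(&silence).copied().collect();
    let kept = keep_speech(&mixed);
    assert_eq!(kept.len(), FRAME); // only the speech frame survives
}
```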

Summary by CodeRabbit

  • New Features

    • Integrated voice activity detection (VAD) for audio chunking, replacing the previous RMS-based method.
    • Improved speech detection accuracy in local speech-to-text processing.
  • Refactor

    • Simplified the audio chunking interface to use a dedicated voice activity detector.
    • Removed legacy predictor-based logic and related error handling.
  • Chores

    • Updated dependencies to include the new voice activity detector.


coderabbitai bot commented May 22, 2025

Walkthrough

The changes replace the previous RMS-based and pluggable predictor-based audio chunking logic with a new approach that uses the voice_activity_detector crate for voice activity detection (VAD). All related error handling and predictor abstractions are removed, and the chunking logic is refactored to use VAD for silence detection and chunk segmentation.

Changes

| File(s) | Change Summary |
| --- | --- |
| `Cargo.toml`, `plugins/local-stt/Cargo.toml` | Added `voice_activity_detector` dependency to the workspace and local-stt plugin. |
| `crates/chunker/Cargo.toml` | Removed `hypr-vad` dependency; added `voice_activity_detector` as dependency. |
| `crates/chunker/src/error.rs`, `crates/chunker/src/predictor.rs` | Deleted error handling and predictor trait/implementations related to previous chunking logic. |
| `crates/chunker/src/lib.rs` | Removed generic predictor interface; updated chunking to use `VoiceActivityDetector`. Updated tests accordingly. |
| `crates/chunker/src/stream.rs` | Refactored `ChunkStream` to use VAD for silence detection and chunking; removed predictor-based logic and related generics. Added VAD-based filtering helper. |
| `plugins/local-stt/src/server.rs` | Replaced RMS-based chunker with VAD-based chunker in the WebSocket audio stream handler. |

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Server
    participant Chunker
    participant VAD

    Client->>Server: Send audio stream
    Server->>Chunker: Pass audio stream
    Chunker->>VAD: Analyze audio samples for speech
    VAD-->>Chunker: Speech/silence labels
    Chunker->>Server: Return speech-only chunks
    Server-->>Client: Stream transcription results

Assessment against linked issues

Objectives from the linked issues:

  • Use Silero-based VAD for chunking, removing RMS/pluggable predictor logic (#857)
  • Ensure the chunker splits on silence and strips silence as much as possible (#857, #662)
  • Maintain the max chunk length constraint (e.g., 30 sec, ideally ~12 sec) (#857)
  • Integrate with dataset/tests for chunker validation (#857)

Poem

A rabbit hopped into the code,
RMS and predictors off he strode.
With VAD in paw, he chunked with glee,
Silence trimmed, as clean as can be!
Now speech flows smooth, no empty sound—
In every hop, great chunks are found!
🐇🎶



@coderabbitai bot left a comment


Actionable comments posted: 1

🔭 Outside diff range comments (1)
crates/chunker/src/stream.rs (1)

48-50: 🛠️ Refactor suggestion

as_stream() is rebuilt on every poll – this can replay the source

let stream = this.source.as_stream(); followed by pin!(stream) runs on each poll_next call, which
creates a brand-new stream every time.
If as_stream() yields a fresh iterator starting from the beginning (which is typical), the chunker will
loop forever on the first samples or duplicate audio.

Store the pinned stream once inside the struct instead:

-pub struct ChunkStream<S: AsyncSource + Unpin> {
+pub struct ChunkStream<S: AsyncSource + Unpin, St> {
     source: S,
-    vad: VoiceActivityDetector,
+    stream: Pin<Box<St>>,
     vad: VoiceActivityDetector,
@@
-        Self {
-            source,
-            vad,
-            buffer: Vec::new(),
-            max_duration,
-        }
+        let stream = Box::pin(source.as_stream());
+        Self { source, stream, vad, buffer: Vec::new(), max_duration }
🧹 Nitpick comments (5)
plugins/local-stt/src/server.rs (1)

149-157: VAD configuration looks good but could be more robust.

The implementation correctly configures the VoiceActivityDetector with appropriate parameters for speech recognition. The sample rate of 16kHz and chunk size of 512 samples are reasonable values for voice detection.

However, using expect() for error handling could be improved for production code.

Consider handling potential errors more gracefully:

- let vad = VoiceActivityDetector::builder()
-     .sample_rate(sample_rate)
-     .chunk_size(512usize)
-     .build()
-     .expect("vad config is valid");
+ let vad = match VoiceActivityDetector::builder()
+     .sample_rate(sample_rate)
+     .chunk_size(512usize)
+     .build() {
+         Ok(vad) => vad,
+         Err(e) => {
+             tracing::error!("Failed to build VAD: {}", e);
+             // Either return an error or fall back to default parameters
+             return;
+         }
+     };

Additionally, consider making these parameters configurable rather than hardcoded, which would make the system more adaptable to different audio conditions.
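One way to act on that suggestion is to gather the currently hardcoded values in a config type with sensible defaults. `VadConfig` and its field names are illustrative only, not part of the crate's API:

```rust
// Hypothetical configuration type for the values hardcoded in server.rs.
#[derive(Debug, Clone, PartialEq)]
struct VadConfig {
    sample_rate: u32,      // audio sample rate the VAD expects
    chunk_size: usize,     // samples per VAD frame
    speech_threshold: f32, // probability above which a frame counts as speech
    padding_frames: usize, // frames of context kept around detected speech
}

impl Default for VadConfig {
    fn default() -> Self {
        Self {
            sample_rate: 16_000,
            chunk_size: 512,
            speech_threshold: 0.75,
            padding_frames: 3,
        }
    }
}

fn main() {
    // Callers override only what they need via struct-update syntax.
    let cfg = VadConfig { speech_threshold: 0.6, ..VadConfig::default() };
    assert_eq!(cfg.sample_rate, 16_000);
    assert_eq!(cfg.speech_threshold, 0.6);
}
```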

crates/chunker/src/lib.rs (2)

9-15: Return ChunkStream without consuming the VAD instance

The chunks method currently takes vad by value, which moves the detector into the ChunkStream.
That prevents re-using the same configured detector for issuing multiple independent chunk streams from the
same audio source (e.g. different max-duration settings), and also means the caller loses access to the VAD
for any other purpose.

If the underlying type implements Clone (it does) and is Send + Sync safe, consider taking the argument
by &VoiceActivityDetector and cloning internally, or alternatively by impl Into<VoiceActivityDetector>
to give the caller more flexibility.


48-56: Avoid panicking when the temp directory does not yet exist

std::fs::remove_dir_all returns ErrorKind::NotFound when the directory is missing.
Unwrapping on that error makes the test unnecessarily brittle.

-        std::fs::remove_dir_all("tmp/english_1").unwrap();
+        let _ = std::fs::remove_dir_all("tmp/english_1");

(or check for the error explicitly).
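The explicit-check variant can be written as a small helper that treats "already absent" as success, so the test passes on a clean checkout:

```rust
use std::fs;
use std::io::ErrorKind;

// Remove a directory tree, treating NotFound as success.
fn remove_dir_all_idempotent(path: &str) -> std::io::Result<()> {
    match fs::remove_dir_all(path) {
        Ok(()) => Ok(()),
        Err(e) if e.kind() == ErrorKind::NotFound => Ok(()),
        Err(e) => Err(e),
    }
}

fn main() {
    let dir = std::env::temp_dir().join("chunker_example_tmp");
    fs::create_dir_all(&dir).unwrap();
    let dir = dir.to_str().unwrap().to_owned();
    assert!(remove_dir_all_idempotent(&dir).is_ok()); // removes it
    assert!(remove_dir_all_idempotent(&dir).is_ok()); // already gone: still Ok
}
```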

crates/chunker/src/stream.rs (2)

79-85: Duplicated logic – extract helper for final chunk creation

The three identical blocks that convert speech into a SamplesBuffer could be factored into a small helper
to reduce repetition and future maintenance overhead.


90-105: Expose threshold & padding as parameters

filter_speech_chunks hard-codes a 0.75 probability threshold and 3-frame padding.
Different environments may require tuning. Consider propagating these values from the public API or from the
builder of ChunkStream.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1635590 and 017b07c.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • Cargo.toml (1 hunks)
  • crates/chunker/Cargo.toml (1 hunks)
  • crates/chunker/src/error.rs (0 hunks)
  • crates/chunker/src/lib.rs (3 hunks)
  • crates/chunker/src/predictor.rs (0 hunks)
  • crates/chunker/src/stream.rs (3 hunks)
  • plugins/local-stt/Cargo.toml (1 hunks)
  • plugins/local-stt/src/server.rs (2 hunks)
💤 Files with no reviewable changes (2)
  • crates/chunker/src/error.rs
  • crates/chunker/src/predictor.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
crates/chunker/src/lib.rs (1)
crates/chunker/src/stream.rs (1)
  • new (20-27)
🔇 Additional comments (5)
Cargo.toml (1)

196-196: Added voice_activity_detector dependency looks good.

The addition of the voice_activity_detector crate as a workspace dependency aligns perfectly with the PR objective to replace the previous VAD implementation with the Silero VAD model.

crates/chunker/Cargo.toml (1)

19-19: Appropriate dependency replacement.

The change to use the voice_activity_detector crate instead of hypr-vad is consistent with the PR objective to integrate the Silero VAD model for improved audio chunking.

plugins/local-stt/Cargo.toml (1)

53-53: Consistent dependency addition.

Adding voice_activity_detector as a dependency to the local-stt plugin ensures it can use the new VAD implementation directly, which aligns with the changes made in the server.rs file.

plugins/local-stt/src/server.rs (1)

24-24: Appropriate import for the new VAD implementation.

The import of VoiceActivityDetector aligns with the PR objective to use the Silero VAD model.

crates/chunker/src/lib.rs (1)

40-45: Handle possible sample-rate mismatch between VAD and audio

The VAD is hard-coded to sample_rate(16_000), while the audio_source’s real sample-rate is whatever is
encoded in english_1::AUDIO_PATH.
If these differ, the detector’s result quality (and therefore chunking accuracy) will be severely degraded.

-        let vad = VoiceActivityDetector::builder()
-            .sample_rate(16000)
+        let vad = VoiceActivityDetector::builder()
+            .sample_rate(audio_source.sample_rate())

Either assert that the source really is 16 kHz, down-sample the input, or build the VAD with the run-time
value.
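The "assert that the source really is 16 kHz" option can be sketched as a runtime guard. `SourceInfo` is a hypothetical stand-in for whatever type exposes the decoded audio's sample rate:

```rust
// Guard against a mismatch between the VAD's build-time sample rate
// and the audio source's actual rate.
#[derive(Debug)]
struct SourceInfo {
    sample_rate: u32,
}

const VAD_SAMPLE_RATE: u32 = 16_000;

fn check_rate(src: &SourceInfo) -> Result<u32, String> {
    if src.sample_rate == VAD_SAMPLE_RATE {
        Ok(src.sample_rate)
    } else {
        Err(format!(
            "source is {} Hz but VAD expects {} Hz; resample first",
            src.sample_rate, VAD_SAMPLE_RATE
        ))
    }
}

fn main() {
    assert!(check_rate(&SourceInfo { sample_rate: 16_000 }).is_ok());
    assert!(check_rate(&SourceInfo { sample_rate: 44_100 }).is_err());
}
```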


@coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
crates/chunker/src/lib.rs (1)

48-49: Error handling in tests could be improved

The call to remove_dir_all will panic if the directory doesn't exist, which could cause test failures in clean environments.

Consider handling the case where the directory doesn't exist:

-std::fs::remove_dir_all("tmp/english_1").unwrap();
+let _ = std::fs::remove_dir_all("tmp/english_1"); // Ignore errors if directory doesn't exist
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 017b07c and 83d8ca8.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • Cargo.toml (1 hunks)
  • crates/chunker/Cargo.toml (1 hunks)
  • crates/chunker/src/error.rs (0 hunks)
  • crates/chunker/src/lib.rs (3 hunks)
  • crates/chunker/src/predictor.rs (0 hunks)
  • crates/chunker/src/stream.rs (3 hunks)
  • plugins/local-stt/Cargo.toml (1 hunks)
  • plugins/local-stt/src/server.rs (2 hunks)
💤 Files with no reviewable changes (2)
  • crates/chunker/src/error.rs
  • crates/chunker/src/predictor.rs
🚧 Files skipped from review as they are similar to previous changes (4)
  • Cargo.toml
  • plugins/local-stt/Cargo.toml
  • crates/chunker/Cargo.toml
  • plugins/local-stt/src/server.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
crates/chunker/src/lib.rs (1)
crates/chunker/src/stream.rs (1)
  • new (20-27)
🔇 Additional comments (6)
crates/chunker/src/lib.rs (3)

7-7: Architectural change: New import for VoiceActivityDetector

This import reflects the transition from the previous generic Predictor-based approach to a concrete VoiceActivityDetector implementation, which aligns with the PR objective of using the voice_activity_detector crate for audio chunking.


10-15: API change: From generic predictor to concrete VoiceActivityDetector

The ChunkerExt trait's chunks method has been modified to accept a concrete VoiceActivityDetector instead of a generic Predictor type parameter. This change:

  • Simplifies the API by making the chunking method more specific
  • Reduces flexibility (no longer supports pluggable predictors)
  • Aligns with the PR objective of using VAD for chunking

This is a breaking change that clients of this API will need to update.


40-45: Proper VAD configuration for test

The VAD configuration uses appropriate parameters for the test:

  • Sample rate (16000) matches the test WAV spec
  • Chunk size (512) is a reasonable value for audio processing

The configuration and initialization looks good.

crates/chunker/src/stream.rs (3)

10-10: New VAD imports aligned with architectural change

The import of IteratorExt provides extension methods for iterators that enable voice activity detection functionality, aligning with the PR's goal to use the voice_activity_detector crate.


12-17: Struct refactoring: Removed generic Predictor type parameter

The ChunkStream struct has been simplified by removing the generic Predictor type parameter and replacing it with a concrete VoiceActivityDetector field. This change is consistent with the architectural shift in lib.rs.


56-59: Previous issue fixed: Stream no longer terminates on empty speech

The previous implementation had an issue where empty speech would terminate the stream prematurely by returning Poll::Ready(None). The new implementation correctly returns Some(buffer) even if the speech is empty, allowing the stream to continue processing. This addresses the issue mentioned in the previous review.

Also applies to: 62-65, 71-73

Comment on lines +77 to +103

// helper function to filter speech chunks
fn filter_speech_chunks<D: IntoIterator<Item = f32>>(
    vad: &mut VoiceActivityDetector,
    data: D,
) -> Vec<f32> {
    data.into_iter()
        .label(vad, 0.75, 3)
        .filter_map(|label| {
            if label.is_speech() {
                Some(label.into_iter())
            } else {
                None
            }
        })
        .flatten()
        .collect()
}


🛠️ Refactor suggestion

Improve the speech filtering implementation

The new helper function has several areas for improvement:

  1. Hardcoded parameters (0.75 threshold, 3 window size) should be configurable
  2. No error handling for potential VAD failures
  3. The empty speech case could return None to allow callers to skip empty chunks

Consider refactoring to:

// helper function to filter speech chunks
fn filter_speech_chunks<D: IntoIterator<Item = f32>>(
    vad: &mut VoiceActivityDetector,
    data: D,
+   threshold: f32,
+   window_size: usize,
) -> Vec<f32> {
    let speech = data.into_iter()
-       .label(vad, 0.75, 3)
+       .label(vad, threshold, window_size)
        .filter_map(|label| {
            if label.is_speech() {
                Some(label.into_iter())
            } else {
                None
            }
        })
        .flatten()
        .collect();
+   speech
}

Also, consider updating the calling code to handle empty speech vectors appropriately.
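The "return None for empty speech" idea from point 3 can be sketched std-only. The `Label` enum below is a mocked stand-in for the crate's labeled-audio type, not its real API:

```rust
// Mocked label type standing in for the crate's labeled-audio output.
enum Label {
    Speech(Vec<f32>),
    NonSpeech,
}

// Concatenate speech-labeled samples; None lets callers skip empty chunks.
fn filter_speech(labels: Vec<Label>) -> Option<Vec<f32>> {
    let speech: Vec<f32> = labels
        .into_iter()
        .filter_map(|l| match l {
            Label::Speech(samples) => Some(samples),
            Label::NonSpeech => None,
        })
        .flatten()
        .collect();
    if speech.is_empty() { None } else { Some(speech) }
}

fn main() {
    let labels = vec![
        Label::NonSpeech,
        Label::Speech(vec![0.1, 0.2]),
        Label::NonSpeech,
    ];
    assert_eq!(filter_speech(labels), Some(vec![0.1, 0.2]));
    assert_eq!(filter_speech(vec![Label::NonSpeech]), None);
}
```

Callers can then write `if let Some(speech) = filter_speech(...)` and fall through to the next poll otherwise.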

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In crates/chunker/src/stream.rs between lines 77 and 93, the
filter_speech_chunks function uses hardcoded parameters for the VAD threshold
and window size, lacks error handling for VAD failures, and always returns a Vec
even if empty. Refactor the function to accept threshold and window size as
parameters to make it configurable, add error handling to manage potential VAD
failures gracefully, and change the return type to Option<Vec<f32>> so it
returns None when no speech chunks are detected. Also, update any calling code
to handle the Option return type properly, skipping processing when None is
returned.


@coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
crates/chunker/src/stream.rs (2)

63-71: Consider skipping instead of terminating when buffer is empty

When the source stream ends with leftover data but no speech is detected, the current implementation returns Poll::Ready(None), which signals the end of the stream. This might prematurely terminate processing if there's more data to come.

Replace the early-return with a different approach to avoid terminating the stream prematurely:

Poll::Ready(None) if !this.buffer.is_empty() => {
    let data = std::mem::take(&mut this.buffer);
    let speech = filter_speech_chunks(&mut this.vad, data);
-   if speech.is_empty() {
-       return Poll::Ready(None);
-   } else {
+   if !speech.is_empty() {
        return Poll::Ready(Some(SamplesBuffer::new(1, sample_rate, speech)));
    }
+   // No speech detected in the final buffer, just signal end of stream
+   return Poll::Ready(None);
}
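The `std::mem::take` final-flush pattern in the diff above can be isolated in a std-only sketch; `ChunkState` is a simplified stand-in for the real `ChunkStream`:

```rust
// Simplified stand-in for ChunkStream's end-of-stream flush.
struct ChunkState {
    buffer: Vec<f32>,
}

impl ChunkState {
    // Returns the leftover samples exactly once; mem::take leaves an
    // empty buffer behind, so later calls see end-of-stream.
    fn flush(&mut self) -> Option<Vec<f32>> {
        if self.buffer.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.buffer))
        }
    }
}

fn main() {
    let mut state = ChunkState { buffer: vec![0.1, 0.2, 0.3] };
    assert_eq!(state.flush(), Some(vec![0.1, 0.2, 0.3]));
    assert_eq!(state.flush(), None); // buffer was emptied by the take
}
```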

87-103: 🛠️ Refactor suggestion

Make VAD parameters configurable

The filter_speech_chunks function uses hardcoded values for the threshold (0.75) and window size (3) in the VAD processing. These should be configurable parameters to allow tuning for different audio conditions and use cases.

// helper function to filter speech chunks
fn filter_speech_chunks<D: IntoIterator<Item = f32>>(
    vad: &mut VoiceActivityDetector,
    data: D,
+   threshold: f32,
+   window_size: usize,
) -> Vec<f32> {
    data.into_iter()
-       .label(vad, 0.75, 3)
+       .label(vad, threshold, window_size)
        .filter_map(|label| {
            if label.is_speech() {
                Some(label.into_iter())
            } else {
                None
            }
        })
        .flatten()
        .collect()
}

Then update the call sites to provide appropriate values:

- let speech = filter_speech_chunks(&mut this.vad, data);
+ let speech = filter_speech_chunks(&mut this.vad, data, 0.75, 3);
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 20da009 and 6c5442d.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • Cargo.toml (1 hunks)
  • crates/chunker/Cargo.toml (1 hunks)
  • crates/chunker/src/error.rs (0 hunks)
  • crates/chunker/src/lib.rs (3 hunks)
  • crates/chunker/src/predictor.rs (0 hunks)
  • crates/chunker/src/stream.rs (3 hunks)
  • plugins/local-stt/Cargo.toml (1 hunks)
  • plugins/local-stt/src/server.rs (2 hunks)
💤 Files with no reviewable changes (2)
  • crates/chunker/src/error.rs
  • crates/chunker/src/predictor.rs
🚧 Files skipped from review as they are similar to previous changes (5)
  • Cargo.toml
  • crates/chunker/Cargo.toml
  • plugins/local-stt/Cargo.toml
  • plugins/local-stt/src/server.rs
  • crates/chunker/src/lib.rs
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: ci (windows, windows-latest)
  • GitHub Check: ci
  • GitHub Check: ci (macos, macos-latest)
🔇 Additional comments (2)
crates/chunker/src/stream.rs (2)

10-10: LGTM: Well-implemented VAD integration

The changes to replace the generic Predictor with a concrete VoiceActivityDetector are implemented cleanly. The structure modifications and import changes are done consistently across the file.

Also applies to: 12-14, 19-23


55-61: Properly handling empty speech chunks

The implementation now correctly handles empty speech chunks by checking if !speech.is_empty() before returning a buffer, which addresses the issue mentioned in previous reviews.

Comment on lines +77 to +83

let data = this.buffer.drain(0..max_samples);
let speech = filter_speech_chunks(&mut this.vad, data);
if speech.is_empty() {
    Poll::Pending
} else {
    Poll::Ready(Some(SamplesBuffer::new(1, sample_rate, speech)))
}


🛠️ Refactor suggestion

Improve handling of empty speech during max duration flush

When draining the buffer due to max duration with no speech detected, the current implementation returns Poll::Pending. This might cause the stream to stall if there's more audio to process.

Consider continuing to poll rather than returning Poll::Pending when no speech is detected:

let data = this.buffer.drain(0..max_samples);
let speech = filter_speech_chunks(&mut this.vad, data);
if speech.is_empty() {
-   Poll::Pending
+   // Continue polling instead of waiting
+   cx.waker().wake_by_ref();
+   Poll::Pending
} else {
    Poll::Ready(Some(SamplesBuffer::new(1, sample_rate, speech)))
}

Alternatively, consider collecting more data before returning to the caller:

let data = this.buffer.drain(0..max_samples);
let speech = filter_speech_chunks(&mut this.vad, data);
if speech.is_empty() {
-   Poll::Pending
+   // Continue with the next loop iteration to collect more data
+   continue;
} else {
    Poll::Ready(Some(SamplesBuffer::new(1, sample_rate, speech)))
}
🤖 Prompt for AI Agents
In crates/chunker/src/stream.rs around lines 77 to 83, the code returns
Poll::Pending when no speech is detected after draining the buffer, which can
cause the stream to stall. Instead of returning Poll::Pending, modify the logic
to continue polling for more data or handle the empty speech case by either
skipping the current chunk and polling again or buffering more data before
returning. This ensures the stream does not stall and continues processing
incoming audio.
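The `wake_by_ref` fix suggested above can be exercised without any async runtime, using a hand-rolled waker from `std::task`. The `poll_step` function below is a minimal stand-in for the relevant branch of `poll_next`:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

static WAKE_COUNT: AtomicUsize = AtomicUsize::new(0);

// Minimal waker that just counts wake requests.
fn counting_raw_waker() -> RawWaker {
    fn clone(_: *const ()) -> RawWaker { counting_raw_waker() }
    fn wake(_: *const ()) { WAKE_COUNT.fetch_add(1, Ordering::SeqCst); }
    fn wake_by_ref(_: *const ()) { WAKE_COUNT.fetch_add(1, Ordering::SeqCst); }
    fn drop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, wake, wake_by_ref, drop);
    RawWaker::new(std::ptr::null(), &VTABLE)
}

// Stand-in for the no-speech branch: instead of returning Pending
// silently (which would stall the task), request an immediate re-poll.
fn poll_step(cx: &mut Context<'_>, speech: &[f32]) -> Poll<Option<Vec<f32>>> {
    if speech.is_empty() {
        cx.waker().wake_by_ref(); // schedule another poll right away
        Poll::Pending
    } else {
        Poll::Ready(Some(speech.to_vec()))
    }
}

fn main() {
    let waker = unsafe { Waker::from_raw(counting_raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    assert!(poll_step(&mut cx, &[]).is_pending());
    assert!(WAKE_COUNT.load(Ordering::SeqCst) >= 1); // a re-poll was requested
    assert_eq!(poll_step(&mut cx, &[0.1]), Poll::Ready(Some(vec![0.1])));
}
```

Returning `Pending` without arranging a wake-up is the contract violation the review points at: the executor has no reason to poll again, so the stream stalls.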

Successfully merging this pull request may close these issues: Silero based audio chunker