Hi, I'm working on a voice translator app and trying streamlit-webrtc with the faster-whisper model.
I noticed this example: https://github.com/whitphx/streamlit-webrtc/blob/main/pages/10_sendonly_audio.py
and found that it uses a while loop to handle the audio data, like below:
```python
while True:
    if webrtc_ctx.audio_receiver:
        try:
            audio_frames = webrtc_ctx.audio_receiver.get_frames(timeout=1)
        except queue.Empty:
            logger.warning("Queue is empty. Abort.")
            break

        sound_chunk = pydub.AudioSegment.empty()
        for audio_frame in audio_frames:
            sound = pydub.AudioSegment(
                data=audio_frame.to_ndarray().tobytes(),
                sample_width=audio_frame.format.bytes,
                frame_rate=audio_frame.sample_rate,
                channels=len(audio_frame.layout.channels),
            )
            ...
```
Doesn't this block the UI thread, since the while loop runs on the script (UI) thread? That's how it looks to me, and once I use the code below, the page just hangs.
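To check my understanding of the blocking part, here is a minimal sketch (separate from my app) of what I think happens; `webrtc_ctx.state.playing` is the attribute documented in the streamlit-webrtc README, the rest is my assumption:

```python
import queue

import streamlit as st
from streamlit_webrtc import WebRtcMode, webrtc_streamer

webrtc_ctx = webrtc_streamer(
    key="sketch",
    mode=WebRtcMode.SENDONLY,
    media_stream_constraints={"video": False, "audio": True},
)

status = st.empty()

# As long as the stream is playing, this loop keeps the current script run
# alive, so nothing rendered after this point can be updated until it exits.
while webrtc_ctx.state.playing:
    try:
        frames = webrtc_ctx.audio_receiver.get_frames(timeout=1)
    except queue.Empty:
        continue
    # Any heavy work here (e.g. transcription) delays the next get_frames()
    # call, and frames pile up in the receiver queue in the meantime.
    status.write(f"Got {len(frames)} frames")
```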
Below is my code; please help me figure out how to fix it:
```python
import queue
import time

import numpy as np
import pydub
import streamlit as st
from faster_whisper import WhisperModel
from streamlit_webrtc import WebRtcMode, webrtc_streamer

webrtc_ctx = webrtc_streamer(
    key="speech-to-text",
    mode=WebRtcMode.SENDONLY,
    audio_receiver_size=10240,
    rtc_configuration={"iceServers": [{"urls": ["stun:stun.l.google.com:19302"]}]},
    media_stream_constraints={"video": False, "audio": True},
)

status_indicator = st.empty()
text_output = st.empty()
stream = None

while True:
    if webrtc_ctx.audio_receiver:
        sound_chunk = pydub.AudioSegment.empty()
        try:
            audio_frames = webrtc_ctx.audio_receiver.get_frames(timeout=1)
        except queue.Empty:
            time.sleep(0.1)
            status_indicator.write("No frame arrived.")
            continue

        status_indicator.write("Running. Say something!")

        for audio_frame in audio_frames:
            sound = pydub.AudioSegment(
                data=audio_frame.to_ndarray().tobytes(),
                sample_width=audio_frame.format.bytes,
                frame_rate=audio_frame.sample_rate,
                channels=len(audio_frame.layout.channels),
            )
            sound_chunk += sound
        print(sound_chunk)

        if len(sound_chunk) > 0:
            sound_chunk = sound_chunk.set_channels(1).set_frame_rate(44100)
            buffer = np.array(sound_chunk.get_array_of_samples())
            # model_size, supports_gpu and compute_type are defined earlier in my script
            segments, _ = WhisperModel(
                model_size,
                device="cuda" if supports_gpu else "cpu",
                compute_type=compute_type,
            ).transcribe(buffer)
            transcript = " ".join(segment.text for segment in segments)
            print(transcript)
            text_output.markdown(f"**Text:** {transcript}")
    else:
        status_indicator.write("AudioReceiver is not set. Abort.")
        break
```