I posted the original question on the Riva forum (https://forums.developer.nvidia.com/t/riva-en-us-when-using-lm-interim-results-with-stability-change-drop-already-predicted-but-less-stable-words/234888) last November, since I know the NeMo GitHub is not the right place for Riva-related questions. But as I still haven't received any useful information, I'm posting it as a Q&A discussion here as well, in case anyone can shed some light on this.
In Riva, at least, the flashlight decoder returns intermediate results split by stability (in my case 0.1 and 0.9); I understand that low stability indicates the transcript can still change a lot. What bothers me is that when observing the entire intermediate result ([text with stability 0.9] [text with stability 0.1]), one would expect words from the start of the 0.1-stability portion to be removed from it and appended to the end of the 0.9-stability portion.
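For concreteness, this is roughly how I assemble the displayed interim transcript from the streaming responses (a minimal sketch, assuming the standard `stability`, `is_final`, and `alternatives` fields on Riva's `StreamingRecognitionResult`, and that the partial results within one response arrive ordered from most to least stable):

```python
def assemble_interim(response):
    """Join the partial results of one StreamingRecognizeResponse into the
    text shown to the user: [stability ~0.9 portion] [stability ~0.1 portion]."""
    parts = []
    for result in response.results:
        if result.is_final or not result.alternatives:
            continue  # final segments are committed separately
        parts.append(result.alternatives[0].transcript.strip())
    # The low-stability tail is simply appended, so any change in it shows up
    # immediately as a "jump" in the displayed transcript.
    return " ".join(parts)
```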
What happens in reality is that the word in question (even when it was already correct) often first disappears or changes completely before reappearing in its correct form. This creates a very unpleasant discontinuity in the intermediate transcript, especially since the 0.1-stability portion is not short (approx. 2 s). During longer speech there is a missing/changing word approximately 2 s into the speech.
Any suggestions on how to resolve this issue? I assume this has something to do with the decoder parameters.
Would a "Cache-aware Streaming Conformer" model help? Or another model? And how should these parameters be set for that type of model?
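In case it matters, this is how I would first sanity-check a cache-aware streaming model in NeMo (a hedged sketch, not something I have verified; the pretrained model name is one of NVIDIA's published cache-aware FastConformer checkpoints and may need to be adjusted, and real low-latency streaming would go through NeMo's cache-aware streaming inference example rather than offline `transcribe()`):

```python
import nemo.collections.asr as nemo_asr

# Sketch only: load one of the published cache-aware streaming checkpoints
# (name assumed here; check NGC / Hugging Face for the exact identifier).
asr_model = nemo_asr.models.ASRModel.from_pretrained(
    model_name="stt_en_fastconformer_hybrid_large_streaming_multi"
)

# Offline sanity check on a single file; for actual streaming with limited
# lookahead, see NeMo's cache-aware streaming example script
# (speech_to_text_cache_aware_streaming_infer.py).
print(asr_model.transcribe(["sample.wav"]))
```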