Replies: 1 comment
-
AI-generated solution, please verify: The difference between "simulating" cache-aware streaming and actually performing it in NeMo's Conformer models is primarily about the execution environment, not the underlying algorithm or its optimizations. When the documentation says that functions such as conformer_stream_step() "simulate" streaming, it means the code is running on pre-recorded audio in a controlled environment rather than on live audio input. The underlying streaming implementation, with its caching optimizations, is fully functional and production-ready: each call processes only the new chunk plus a fixed-size cache carried over from the previous call.
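The cache mechanism the answer describes can be sketched with a toy model, to show why chunk-by-chunk processing with a small carried-over cache produces the same output as processing the whole recording at once. Everything below (the CONTEXT size, the toy "layer") is illustrative only and is not NeMo's actual implementation or API.

```python
# Illustrative sketch (not NeMo code): a "layer" whose output at each frame
# depends only on a limited left context. The streaming version carries a
# fixed-size cache between chunks instead of re-feeding the full history.
from collections import deque

CONTEXT = 4  # frames of left context the toy layer may see (assumption)

def layer(frames):
    # Toy stand-in for a limited-context attention/convolution layer:
    # each output frame sums itself and up to CONTEXT previous frames.
    return [sum(frames[max(0, i - CONTEXT):i + 1]) for i in range(len(frames))]

def offline(signal):
    # "Offline" decoding: the layer sees the entire recording at once.
    return layer(signal)

def streaming(signal, chunk=3):
    # Cache-aware streaming: only the current chunk plus a bounded cache
    # is processed per step, analogous to the caches conformer_stream_step()
    # carries between calls.
    cache = deque(maxlen=CONTEXT)  # fixed-size left-context cache
    out = []
    for start in range(0, len(signal), chunk):
        chunk_frames = list(signal[start:start + chunk])
        ctx = list(cache) + chunk_frames
        # Emit only the outputs belonging to this chunk; cached frames were
        # already emitted by earlier steps.
        out.extend(layer(ctx)[len(cache):])
        cache.extend(chunk_frames)
    return out
```

Because the cache holds exactly the left context the layer can use, streaming() and offline() return identical outputs, while streaming() touches only CONTEXT + chunk frames per step rather than the whole signal. That equivalence, with bounded per-step cost, is the point of "cache-aware" streaming.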
-
See these sources:
https://github.com/NVIDIA/NeMo/blob/cda2a637e9c1fefaa419e7b31ab2203d72d9819f/docs/source/asr/models.rst?plain=1#L236
https://github.com/NVIDIA/NeMo/blob/cda2a637e9c1fefaa419e7b31ab2203d72d9819f/nemo/collections/asr/parts/mixins/mixins.py#L590
https://github.com/NVIDIA/NeMo/blob/cda2a637e9c1fefaa419e7b31ab2203d72d9819f/nemo/collections/asr/parts/mixins/mixins.py#L714
Does it "simulate" cache-aware streaming, or does it actually perform it? Models trained natively with cache-aware streaming are available, e.g. here. Does running functions such as conformer_stream_step() repeatedly, as done in the notebook here, actually perform the streaming step with the appropriate optimizations? Or does it merely produce the same output as cache-aware streaming but unoptimized, e.g. still feeding large batches of context into the model and then discarding most of it to match the output of optimized cache-aware streaming?