Replies: 1 comment
-
Hi @hbredin, apologies for the tag, as this question is more about product knowledge than technical know-how. In the supplementary information, the model summary suggests that the model ingests 2 s chunks directly. If so, could I check at what stage the millisecond-level segmentation starts? Many thanks.
-
May I check: in the VAD tutorial provided in the repo, 2 s chunks were used for training. Is the resolution 16 ms?
I am trying to visualize the effect of the SincNet layer and am attempting to extract the SincNet ParamSincFB layer. However, the VAD inference pipeline breaks when used for this purpose, so I am trying to replicate the Pipeline method with torchaudio.load instead. Should I feed the layer 16 ms or 2 s of audio?
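To make the two numbers in the question concrete, here is a minimal sketch of how 2 s chunks and 16 ms frames would relate if both figures hold. The 16 kHz sample rate, the 16 ms frame step, and the helper names `chunk_bounds` and `frames_per_chunk` are assumptions for illustration only; none of them are confirmed by the tutorial or taken from the library.

```python
# Assumed values (not confirmed by the tutorial): 16 kHz audio, 2 s chunks, 16 ms frames.

def chunk_bounds(num_samples, sample_rate=16000, chunk_ms=2000):
    """Return (start, end) sample indices of consecutive, non-overlapping full chunks.

    A trailing remainder shorter than one chunk is dropped.
    """
    chunk_len = sample_rate * chunk_ms // 1000
    return [
        (start, start + chunk_len)
        for start in range(0, num_samples - chunk_len + 1, chunk_len)
    ]


def frames_per_chunk(chunk_ms=2000, frame_ms=16):
    """Number of whole frames that fit in one chunk at the assumed frame step."""
    return chunk_ms // frame_ms


# 5 s of audio at the assumed 16 kHz sample rate
print(chunk_bounds(num_samples=5 * 16000))
print(frames_per_chunk())
```

With a waveform returned by torchaudio.load, which has shape (channels, time), each pair could be used to slice one chunk as `waveform[:, start:end]`. Under these assumptions the chunk would be the unit of input and the frame the unit of output, so the two figures would not be alternatives to each other.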