In case anyone else runs into this error: the issue was that the input files can only have 1 audio channel. The file I was using had 2 audio channels, which caused nemo/collections/asr/parts/utils/audio_utils.py to return a list of lists because of how soundfile and/or librosa handle multiple audio channels. Because it was a list of lists, the line samples = np.pad(samples, (0, int(delay * model_stride_in_secs * self.asr_model._cfg.sample_rate))) in nemo/collections/asr/parts/utils/decoder_timestamps_utils.py would blow up: the call there expects a single flat list, not a nested one, and the nested input inflates the amount of memory needed.
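For anyone who wants to see why the padding misbehaves, here is a minimal sketch using only NumPy, not NeMo itself. It assumes the stereo samples arrive as a 2-D array shaped (num_frames, num_channels), which is the layout soundfile uses; librosa with mono=False puts channels first instead. The to_mono helper and the array sizes are hypothetical, made up for illustration. With a pad width of (0, n), np.pad pads every axis, so a 2-D input grows in both dimensions instead of just getting n samples appended.

```python
import numpy as np


def to_mono(samples: np.ndarray) -> np.ndarray:
    """Average the channels of a (num_frames, num_channels) array.

    Hypothetical helper, not part of NeMo. 1-D input is returned unchanged.
    """
    samples = np.asarray(samples, dtype=np.float32)
    if samples.ndim == 1:
        return samples
    return samples.mean(axis=1)


# Stand-in for a short stereo clip: 1000 frames, 2 channels (assumed layout).
stereo = np.zeros((1000, 2), dtype=np.float32)
pad = 50  # stand-in for int(delay * model_stride_in_secs * sample_rate)

# Same pad_width form as the NeMo line: (0, pad) applies to EVERY axis.
padded_stereo = np.pad(stereo, (0, pad))
print(padded_stereo.shape)  # (1050, 52): the channel axis grew by `pad` too

# After down-mixing to one channel, the pad only appends samples.
padded_mono = np.pad(to_mono(stereo), (0, pad))
print(padded_mono.shape)  # (1050,)
```

With a real pad of tens of thousands of samples, the second axis grows by that amount as well, which would explain the memory blow-up. Averaging channels is only one way to get mono; converting the file to a single channel before passing it in avoids the problem entirely.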

You can use something like ffprobe -i TestVideo.mp4 -show_entries strea…

Answer selected by Okohedeki